This is to certify that the

dissertation entitled

CORRECTING FOR SELF-SELECTION BIAS
IN CONTINGENT VALUATION

presented by

LIH-CHYUN SUN

has been accepted towards fulfillment
of the requirements for

Ph.D. degree in Agricultural Economics

Date gl/é [/9 3

MSU is an Affirmative Action/Equal Opportunity Institution

Major professor

John P. Hoehn

CORRECTING FOR SELF-SELECTION BIAS
IN CONTINGENT VALUATION

By

Lih-Chyun Sun

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Agricultural Economics

1993

 

 

ABSTRACT

CORRECTING FOR SELF-SELECTION BIAS
IN CONTINGENT VALUATION

By

Lih-Chyun Sun

In contingent valuation (CV) studies, data can only be collected from those
who are willing to participate in the studies. Applying a single-equation
approach to this truncated sample may lead to inconsistent parameter estimates
(self-selection bias). A self-selection model that contains a self-selection
equation and a demand equation may be specified in order to detect and to
correct for self-selection bias.

Based on a truncated sample, Bloom and Killingsworth (1985) proposed a
maximum likelihood (ML) estimator which leads to theoretically consistent
parameter estimates. However, using Monte Carlo experiments, Muthén and
Jöreskog (1983) showed that the estimates for parameters in the self-selection
equation are not reliable even in large samples.

A self-selection model with measurement errors is proposed in this study.
In the model, a CV truncated sample is transformed into a censored sample by
combining individual survey data with census data, which provide information on
non-respondents' neighborhoods (e.g. census blocks). Based on the censored
sample, two ML estimators are derived in which census data are treated as if they
are the true values plus errors, i.e. non-respondents' characteristics are assumed to
be distributed as $N(\mu_i^*, \Sigma_i^*)$.

To apply the self-selection model with measurement errors, $\mu_i^*$ and
$\Sigma_i^*$ are replaced by their consistent estimates: $\mu_i$, the average values
calculated from each census block, and either $\Sigma_i$, the corresponding
variance-covariance matrix calculated from each census block, or $\Sigma$, the
corresponding variance-covariance matrix calculated from a sample drawn from the
population.

Results from Monte Carlo experiments suggest that the self-selection
model with measurement errors performs well, especially when $\mu_i$ and
$\Sigma_i$ are adopted. The results also indicate that if the self-selection model
is correctly specified, adoption of a self-selection model with measurement errors
will not contaminate the original truncated sample.

The application of a self-selection model with measurement errors is not
restricted to CV studies. The model can be applied to other studies that use
survey data and regression analyses.

 

 

To my parents

Dr. Chen Sun and Mrs. Feng-Chiao Lee Sun


 

 

ACKNOWLEDGEMENTS

It has been my pleasure and honor to work with my committee members,
Dr. John Hoehn, Dr. Eileen van Ravenswaay, and Dr. Ching-Fan Chung. I am
especially grateful to Dr. Hoehn. As my major professor, Dr. Hoehn led me into
the area of resources/environmental economics and empirical work. It was
through him that I ﬁrst experienced the joy of conducting research. Having
worked for Dr. van Ravenswaay for the past four years, I owe her much gratitude
for her guidance and tolerance. Dr. Chung enhanced my knowledge in
econometrics both inside and outside the classroom. In addition to thanking him
for his friendship, I thank him for introducing me to GAUSS which greatly
strengthened my ability in understanding and in practicing econometrics.

I would like to express my appreciation to the Agricultural Economics
Department for offering an excellent environment for studying. I am indebted to
my colleague Miss Tiffany D. Phagan who spent much of her precious time
editing my writing and making this dissertation readable.

Special appreciation goes to Drs. Anthony and Delia Koo. For the past
seven years, Drs. Koo have been very supportive. I could never have ﬁnished my
studies here at Michigan State University without their encouragement and help.

I owe a great deal to my mother-in-law, Mrs. Yu-Chueng Wu, who stayed
in Lansing for a long period of time to help take care of my daughter so both my
wife and I could go to school. Of course, this would never have happened if my
father-in-law, Mr. I-Ming Song, were not a great gentleman.

With all my heart, I thank my parents, Dr. Chen Sun and Mrs. Feng-Chiao
Lee Sun, for their endless love and support. Although I did not inherit their
wisdom and other wonderful characteristics, I learned from them how to confront
and to conquer challenges. Should I make any contribution to society, they are
the persons who deserve the credit. Thanks also go to my younger brother
Chih-Chyun Sun, who kept my parents away from loneliness while I was abroad. I am
also indebted to my aunt Diana Lee, who encouraged me constantly throughout
the years.

I am very fortunate to have a wonderful wife and a lovely daughter; they
have sacrificed and suffered a lot to help me finish my studies. For the past
years, I could have been a better father than I was. I apologize to my daughter
Yihua Sun, and thank her for bringing extra happiness to the family. Lastly, with
lots of love, I thank my wife Wei-Ling Song. This dissertation would never have
been finished without her love, encouragement, support, and toleration.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER 1
INTRODUCTION
    1.1 Non-response in surveys
    1.2 Self-selection and sample non-response biases
        1.2.1 Self-selection and sample non-response biases: a regression analysis
        1.2.2 Self-selection and sample non-response biases: a graphical analysis
    1.3 Self-selection in contingent valuation
    1.4 Plan of work

CHAPTER 2
LITERATURE REVIEW
    2.1 Introduction
    2.2 A self-selection model
    2.3 Estimators
        2.3.1 Heckman's two-stage estimator
        2.3.2 Self-selection with a censored sample
        2.3.3 Self-selection with a truncated sample
    2.4 Summary

CHAPTER 3
SELF-SELECTION MODELS WITH MEASUREMENT ERRORS
    3.1 Self-selection based on a random utility model under a CV framework
    3.2 A probit model with measurement errors
        3.2.1 Derivation of the probit model with measurement errors
        3.2.2 Parameter identification in the probit model with measurement errors
    3.3 A self-selection model with measurement errors and a linear demand equation
    3.4 Generalization for closed-ended questionnaires
        3.4.1 A self-selection model with measurement errors and a Tobit demand equation
        3.4.2 A self-selection model with measurement errors and a probit demand equation
    3.5 Summary

CHAPTER 4
MONTE CARLO EXPERIMENTS AND RESULTS
    4.1 Data generation
        4.1.1 Population generation
        4.1.2 Sample generation
        4.1.3 Monte Carlo experiments
    4.2 A linear demand equation with self-selection
        4.2.1 Monte Carlo experiment results from a self-selection model with measurement errors and a linear demand equation
    4.3 A Tobit demand equation with self-selection
        4.3.1 Monte Carlo experiment results from a self-selection model with measurement errors and a Tobit demand equation
    4.4 A probit demand equation with self-selection
        4.4.1 Monte Carlo experiment results from a self-selection model with measurement errors and a probit demand equation
    4.5 General results from the Monte Carlo experiments
    4.6 Summary

CHAPTER 5
CONCLUDING REMARKS
    5.1 Summary
    5.2 Need for future research
    5.3 Conclusion

APPENDIX A
RESULTS FROM MUTHÉN AND JÖRESKOG'S STUDY

APPENDIX B
NOTATION USED IN REPORTING MONTE CARLO RESULTS

APPENDIX C
MONTE CARLO EXPERIMENT RESULTS
    C.1.1 Estimates from a self-selection model with measurement errors and a linear demand equation ($\rho = 0.25$)
    C.1.2 Estimates from a self-selection model with measurement errors and a linear demand equation ($\rho = 0.5$)
    C.1.3 Estimates from a self-selection model with measurement errors and a linear demand equation ($\rho = 0.75$)
    C.2.1 Estimates from a self-selection model with measurement errors and a Tobit demand equation ($\rho = 0.25$)
    C.2.2 Estimates from a self-selection model with measurement errors and a Tobit demand equation ($\rho = 0.5$)
    C.2.3 Estimates from a self-selection model with measurement errors and a Tobit demand equation ($\rho = 0.75$)
    C.3.1 Estimates from a self-selection model with measurement errors and a probit demand equation ($\rho = 0.25$)
    C.3.2 Estimates from a self-selection model with measurement errors and a probit demand equation ($\rho = 0.5$)
    C.3.3 Estimates from a self-selection model with measurement errors and a probit demand equation ($\rho = 0.75$)

APPENDIX D
A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS:
SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A
LINEAR DEMAND EQUATION

APPENDIX E
A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS:
SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A
TOBIT DEMAND EQUATION

APPENDIX F
A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS:
SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A
PROBIT DEMAND EQUATION

BIBLIOGRAPHY


LIST OF TABLES

Table A1 Parameter estimates for data simulated according to model 1, Nt = 496, N = 1000
Table A2 Parameter estimates for data simulated according to model 1, Nt = 1963, N = 4000

Table C.1.1.A Linear demand, OLS estimates without correcting for self-selection, $\rho = 0.25$
Table C.1.1.B Linear demand, correcting for self-selection bias using censored sample, $\rho = 0.25$
Table C.1.1.C Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.25$
Table C.1.1.D Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.25$

Table C.1.2.A Linear demand, OLS estimates without correcting for self-selection bias, $\rho = 0.5$
Table C.1.2.B Linear demand, correcting for self-selection bias using censored sample, $\rho = 0.5$
Table C.1.2.C Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.5$
Table C.1.2.D Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.5$

Table C.1.3.A Linear demand, OLS estimates without correcting for self-selection bias, $\rho = 0.75$
Table C.1.3.B Linear demand, correcting for self-selection bias using censored sample, $\rho = 0.75$
Table C.1.3.C Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.75$
Table C.1.3.D Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.75$
Table C.1.3.E Linear demand, correcting for self-selection bias using truncated sample, $\rho = 0.75$

Table C.2.1.A Tobit estimates without correcting for self-selection bias, $\rho = 0.25$
Table C.2.1.B Tobit demand, correcting for self-selection bias using censored sample, $\rho = 0.25$
Table C.2.1.C Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.25$
Table C.2.1.D Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.25$

Table C.2.2.A Tobit estimates without correcting for self-selection bias, $\rho = 0.5$
Table C.2.2.B Tobit demand, correcting for self-selection bias using censored sample, $\rho = 0.5$
Table C.2.2.C Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.5$
Table C.2.2.D Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.5$

Table C.2.3.A Tobit estimates without correcting for self-selection bias, $\rho = 0.75$
Table C.2.3.B Tobit demand, correcting for self-selection bias using censored sample, $\rho = 0.75$
Table C.2.3.C Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.75$
Table C.2.3.D Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.75$

Table C.3.1.A Probit estimates without correcting for self-selection bias, $\rho = 0.25$
Table C.3.1.B Probit demand, correcting for self-selection bias using censored sample, $\rho = 0.25$
Table C.3.1.C Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.25$
Table C.3.1.D Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.25$

Table C.3.2.A Probit estimates without correcting for self-selection bias, $\rho = 0.5$
Table C.3.2.B Probit demand, correcting for self-selection bias using censored sample, $\rho = 0.5$
Table C.3.2.C Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.5$
Table C.3.2.D Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.5$

Table C.3.3.A Probit estimates without correcting for self-selection bias, $\rho = 0.75$
Table C.3.3.B Probit demand, correcting for self-selection bias using censored sample, $\rho = 0.75$
Table C.3.3.C Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.75$
Table C.3.3.D Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.75$


LIST OF FIGURES

Figure 1.1 Presence of sample non-response bias, absence of self-selection bias
Figure 1.2 Presence of both self-selection and sample non-response biases
Figure 1.3 Presence of self-selection bias, absence of sample non-response bias
Figure 1.4 Self-selection bias affects only the constant term


CHAPTER 1
INTRODUCTION

1.1 Non-response in surveys

Contingent valuation (CV) is one of the methods used by researchers to
elicit values of non-market goods. Depending on the CV survey design, the
elicited values can be either a Hicksian value (i.e. compensating or equivalent
variation) that is derived from a Hicksian demand function, or a consumer surplus
that is derived from a Marshallian (ordinary) demand function. In many CV
studies, data are collected using mail surveys. As with other survey methods, non-
response is a common problem in mail surveys. The problem created by non-
response is that data values intended to be observed by survey design are in fact
missing. These missing values not only lead to less efﬁcient estimates because of
the reduced size of the data base, but may also lead to biased estimates due to
the fact that respondents are often systematically different from non-respondents
(Rubin, 1987).

In analyzing survey data, two types of possible biases can be created by
non-response. The first is known as sample non-response bias, and the second is
known as self-selection (or sample selection) bias (Mitchell and Carson, 1989).
Sample non-response bias occurs when the sample distribution of some
socio-economic or demographic characteristics is significantly different from the
population distribution. For example, if only low-income individuals respond to
the CV surveys, the sample mean of income is then lower than the population mean
of income. Sample non-response bias can be detected by comparing the sample
distribution of certain socio-economic or demographic characteristics with the
population distribution.

Self-selection bias occurs when the non-response is non-random, which
means that the reasons for non-response are endogenous to the survey study. For
example, only those who have a higher marginal propensity to consume the
non-market good respond to the CV survey. Unlike sample non-response bias, it is
difficult to find a simple indicator for detecting the existence of self-selection bias.

Non-response can usually be divided into two categories, namely, item non-
response and unit non-response. In CV mail surveys, item non-response means
that a respondent returns the survey but fails to answer some of the questions;
unit non-response indicates that a member of the sample fails to return the
survey. Both item and unit non-response can cause either sample non-response or
self-selection bias, or both.

One way to compensate for item non-response is to replace those missing
values with imputed values (Little and Rubin, 1987; Rubin, 1987). An alternative
is to use a generalized Heckman two-stage method to correct for the possible
biases that are caused by item non-response (Ong et al., 1988). However,
item non-response is not the concern of this study, and statistical methods that
are related to item non-response will not be discussed here.

The purposes of this study are first to distinguish between sample
non-response and self-selection biases and then to develop parametric
analyses to detect and to correct for the possible self-selection bias that is
caused by unit non-response.


1.2 Self-selection and sample non-response biases

In order to derive values of non-market goods in CV studies, a demand (or
inverse demand) function is estimated by regression analyses.¹ In this section,
self-selection and sample non-response biases are first examined under a
regression framework. Next, graphs based on simplified models are provided to
demonstrate intuitively the relationships between self-selection and sample
non-response biases.

1.2.1 Self-selection and sample non-response biases: a regression analysis

Suppose that individual i's demand for a non-market good, Y, is described
by a linear structural equation

$$y_i = x_i'\beta + u_i, \quad i = 1, 2, \ldots, N,$$

where $x_i$ is a column vector of stochastic variables, $u_i$ is an error term, and
$E(u_i \mid x_i) = 0$ (i.e. $E(y_i \mid x_i) = x_i'\beta$). Suppose now that the
resulting OLS regression using only the data from respondents is

$$y_i = x_i'\theta + e_i, \quad i = 1, 2, \ldots, M, \text{ and } M < N.$$

 

¹ A total value function that determines an individual's willingness to pay
(WTP) for a perceived environmental change is usually estimated in a CV study
(Randall, 198 , p. 260). The relationship between WTPs and perceived
environmental changes can be thought of as a demand (inverse demand) function.
In some studies, researchers use CV to estimate the demand for a market good
with non-market attributes (van Ravenswaay and Hoehn, 1991a, 1991b). For
convenience, the following analyses concentrate only on a demand function.


By definition, self-selection bias occurs as a result of

$$E(e_i \mid x_i, \text{return the survey}) \neq 0,$$

and results of self-selection bias are

$$E(\theta \mid x_i) \neq \beta, \text{ and}$$

$$x_i'\theta \neq E(y_i \mid x_i) = x_i'\beta.$$
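
The step from a non-zero conditional error mean to a biased regression can be written out explicitly. The following is only a sketch in the notation of the structural equation above; the indicator $R_i$ (equal to 1 when individual $i$ returns the survey) is introduced here for illustration and is an assumption, not part of the original notation.

```latex
\begin{aligned}
E(y_i \mid x_i, R_i = 1)
  &= E(x_i'\beta + u_i \mid x_i, R_i = 1) \\
  &= x_i'\beta + E(u_i \mid x_i, R_i = 1).
\end{aligned}
```

When the last term is zero, a regression on respondents alone still targets $x_i'\beta$. When it is a non-zero function of $x_i$, the regression attributes part of it to $x_i$, so its coefficients differ from $\beta$.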

Sample non-response bias was previously defined as the sample distribution
of $x_i$ differing significantly from the population distribution. For example, if
the sample mean of $x$ ($= [x_1\ x_2\ \ldots\ x_M]'$) is significantly different
from the population mean $x^*$, it is then suspected that the sample mean of $y$
($= [y_1\ y_2\ \ldots\ y_M]'$) is different from the population mean $y^*$. Based on
neoclassical regression analyses (Goldberger, 1991, Chapter 25), however,
$E(\theta \mid x_i) = \beta$ is always true given that there is no self-selection
bias. Given the population mean $x^*$, $y^*$ is simply the conditional expectation
$E(y \mid x = x^*)$, and can be calculated consistently by $x^{*\prime}\theta$.
Apparently, the distribution of $x_i$ does not play a role in regression models.

In CV studies, for a demand function derived by regression analyses,
$E(y_i \mid x_i) = x_i'\beta$ is always true provided that there is no
self-selection bias, i.e. $E(e_i \mid x_i, \text{return the survey}) = 0$. This
holds regardless of any difference between the sample distribution of $x_i$ and
the population distribution. In other words, in regression analyses, sample
non-response bias does not affect the consistency of the parameter estimates. As
long as the parameter estimates are consistent, the conditional expectation of
$y_i$ given $x_i$ can always be calculated consistently. Rather than worrying
about sample non-response bias, researchers should instead focus their attention
on self-selection bias.

In analyzing CV survey data, self-selection and sample non-response bias
are two very different issues; there exists no special relationship between them.
Self-selection causes biased (inconsistent) parameter estimates due to the
non-zero conditional expectation of the error term given the independent variables.
Sample non-response bias does not even play a role in regression analyses.

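
The distinction drawn in this section can be illustrated with a small simulation. The sketch below is not taken from this study: the population values (`c`, `beta`, `rho`), the income distribution, and the three response rules are all hypothetical choices made only to show the mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: y_i = c + beta * x_i + u_i (all values illustrative).
N = 200_000
c, beta, rho = 2.0, 0.5, 0.75
x = rng.normal(10.0, 2.0, N)                # "income"
u = rng.normal(0.0, 1.0, N)                 # demand error
w = rng.normal(0.0, 1.0, N)
v = rho * u + np.sqrt(1.0 - rho**2) * w     # response error, corr(u, v) = rho
y = c + beta * x + u


def ols(xs, ys):
    """Return (intercept, slope) from an OLS fit of ys on a constant and xs."""
    X = np.column_stack([np.ones_like(xs), xs])
    coef, *_ = np.linalg.lstsq(X, ys, rcond=None)
    return coef[0], coef[1]


# Three assumed response rules.
rules = {
    "a": x < 10.0,                       # depends on income only
    "b": v > 0.0,                        # depends on the correlated error only
    "c": 0.5 * (x - 10.0) + v > 0.0,     # depends on income and the error
}

results = {}
for label, responded in rules.items():
    results[label] = ols(x[responded], y[responded])
    d, theta = results[label]
    print(f"rule {label}: d = {d:.3f}, theta = {theta:.3f}, "
          f"sample mean income = {x[responded].mean():.2f}")
```

Under rule (a) the respondents' mean income is far from the population mean, yet the slope stays close to `beta`, because response does not depend on the error. Under rule (b) only the constant term shifts, and under rule (c) the slope itself moves away from `beta`.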
1.2.2 Self-selection and sample non-response biases: a graphical analysis

It is helpful to demonstrate graphically the possible relationships between
self-selection and sample non-response biases in the following examples.
Suppose that the relationship between an individual's demand for a non-market
good, Y, and his/her income is

$$y_i = c + x_i\beta + u_i,$$

where $c$ is a constant term and $x_i$ is assumed to be individual i's income, and
the OLS regression using data from only the respondents is

$$y_i = d + x_i\theta + e_i.$$

Further, assume that the population (sample) means for $y$ and for income are
$y^*$ ($\bar{y}$) and $x^*$ ($\bar{x}$), respectively.

In the following graphs, a solid line represents the sample regression line, a
broken line represents the population regression line, and the marginal propensity
to consume the non-market good is defined as $dy_i/dx_i$.

Example 1. Presence of sample non-response bias, absence of self-selection
bias (Figure 1.1).

Figure 1.1 Presence of sample non-response bias, absence of self-selection bias

Only low-income individuals return the surveys, and both low- and
high-income individuals have the same marginal propensity to consume the
non-market good, i.e.

$$y^* > \bar{y}, \quad x^* > \bar{x}; \quad d = c, \text{ and } \theta = \beta.$$

Example 2. Presence of both self-selection and sample non-response biases
(Figure 1.2).

Figure 1.2 Presence of both self-selection and sample non-response biases

Only some of the low-income individuals return the surveys, and those
low-income individuals have a lower marginal propensity to consume the non-market
good than do other individuals, i.e.

$$y^* > \bar{y}, \quad x^* > \bar{x}; \quad d \neq c, \text{ and } \theta \neq \beta.$$

Example 3. Presence of self-selection bias, absence of sample non-response
bias (Figure 1.3).

Figure 1.3 Presence of self-selection bias, absence of sample non-response bias

Only those who have a lower marginal propensity to consume the non-market
good return the surveys, i.e.

$$y^* = \bar{y}, \quad x^* = \bar{x}, \text{ but } d \neq c, \text{ and } \theta \neq \beta.$$


Example 4. One special case is when self-selection bias affects only the
estimate of the constant term (Figure 1.4).

Figure 1.4 Self-selection bias affects only the constant term

Both low- and high-income individuals have the same marginal propensity
to consume the non-market good, but the average consumptions are different, i.e.

$$d \neq c, \text{ but } \theta = \beta.$$

Given the above examples, no special relationship between self-selection
and sample non-response biases can be observed. Thus, sample
non-response bias is not an appropriate indicator for the presence of self-selection
bias. In CV studies, if regression analyses are adopted, examining the presence of
sample non-response bias will not help researchers to detect the presence of
self-selection bias. Furthermore, the consistency of parameter estimates has nothing
to do with the sample non-response bias.

1.3 Self-selection in contingent valuation

Although self-selection bias has been considered a potential problem in
CV studies (Mitchell and Carson, 1989; Edwards and Anderson, 1987; Loomis,
1987), none of the existing studies has presented empirical evidence
regarding self-selection bias in CV studies. However, some studies indicate that
sample non-response bias may be a potential problem when surveys are used for
collecting data. For example, several studies have investigated the factors which
affect individuals' decisions to answer surveys (Green, 1991; Green and Kvidahl,
1989; Green and Stager, 1986; Goyder, 1982; Brown et al., 1981; Kanuk and
Berenson, 1975). These studies point out that respondents tend to be older and to
have higher income and more education. In some cases, respondents and
non-respondents have different characteristics with respect to occupation,
residence location, and gender. In some studies, it has been found that
respondents are more interested than non-respondents in the topic of the survey
studies. For example, Whitehead (1991) showed that members of environmental
interest groups responding to CV surveys that value environmental goods have a
particular interest in the topic of the survey. Brown et al. (1989) found that mail
response rates were higher among members of environmental interest groups.
Walsh et al. (1984) and Bowker and Stoll (1988) have suggested that members of
environmental interest groups hold larger environmental values than non-members
when measured in CV markets.²

As previously stated, CV is used to derive values of non-market goods
through an estimated demand function. In economic modeling, certain socio-
economic and demographic characteristics are frequently used as the explanatory
variables in estimating a demand function. These explanatory variables may
include income, age, education, gender, and location. These same variables have
also been considered as factors affecting an individual’s response decision.

Three observations can be drawn from the studies cited above. First,
certain common variables explain both individuals’ decisions to answer the survey
and their demand for non-market goods. Second, respondents and non-
respondents may possess different characteristics. Finally, respondents may be
more interested in the survey topics than non-respondents.

Although these observations do not offer direct evidence for the presence
of self-selection bias, they do call attention to the potential existence of self-
selection bias. For instance, examples 2, 3, and 4 (Figures 1.2, 1.3, and 1.4,
Section 1.2.2) raised in the previous section gave several possible outcomes that
indicated the co-existence of sample non-response and self-selection biases.

Since sample non-response bias alone is not a sufficient indicator of the
presence of self-selection bias, and since none of the existing studies has provided
satisfactory work to detect and to correct for self-selection bias empirically,
self-selection bias remains an empirical hypothesis that should be tested.

² In estimating recreation demand, several authors have noticed the problem
of self-selection bias when data are collected from on-site and user groups (Shaw,
1988; Smith, 1988; Bockstael et al., 1990). However, this type of self-selection
bias is caused by the sampling method.

1.4 Plan of work

This study concentrates on parametric analyses in testing and in correcting
for potential self-selection bias in CV studies. Respondents’ behavior is modeled
as a two-step decision making process. The ﬁrst step is concerned with a
respondent’s decision whether to return the survey or not. Under a random utility
framework, it is assumed that an individual gains utility from answering the
survey. The cost of answering the survey is the opportunity cost of the time that
is required for answering the CV survey. Based on utility maximization subject to
both budget and time constraints, an individual will answer and return the survey
only if the net utility gain is positive.³ The net utility gain can be summarized by
an equation referred to here as a self-selection equation. If the individual decides
to return the survey, the second step is to determine the respondent’s demand for
the non-market good through a demand function. The complete self-selection
model is described by estimating the self-selection and demand equations
simultaneously using maximum likelihood (ML) estimators.
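
The two-step decision described above has the structure of a standard sample-selection model. As a hedged sketch only: the symbols $s_i^*$, $z_i$, $\gamma$, and $v_i$ below are illustrative notation assumed here, and need not match the notation used in the later chapters.

```latex
\begin{aligned}
s_i^* &= z_i'\gamma + v_i
  && \text{(net utility gain; the survey is returned only if } s_i^* > 0\text{)} \\
y_i   &= x_i'\beta + u_i
  && \text{(demand; observed only when } s_i^* > 0\text{)}
\end{aligned}
```

If $u_i$ and $v_i$ are correlated, the demand error has a non-zero mean among respondents, which is why the two equations are estimated jointly rather than one at a time.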

Three types of ML estimators are considered in this study. The ﬁrst ML
estimator is used when the sample is truncated. In a truncated sample, neither
non-respondents’ characteristics nor their demand for the non-market goods can
be observed by researchers. The second ML estimator is used when the sample is
censored. With a censored sample, non-respondents’ characteristics are
observable, but demand for the non-market goods is not observed.

³ In this study, respondents who returned the surveys with incomplete answers
are treated as non-respondents.

The third ML estimator transforms a truncated sample into a censored
sample by adopting information from both the truncated sample and census data.
For example, in mail surveys, although researchers do not observe anything from
non-respondents, mailing addresses are usually available, and average
characteristics of non-respondents' neighborhoods can be acquired from census
data. If a non-respondent's characteristics can be treated as the average
characteristics of his/her neighborhood plus an error, this can provide researchers
with additional information that can be used in regression analyses.

Two types of questionnaires are frequently used in CV studies. The ﬁrst
type is the open-ended questionnaire and the second type is the closed-ended
questionnaire. For an open-ended questionnaire, demand is observed as a
continuous variable (or is sometimes censored at certain values). For a closed-
ended questionnaire, demand is not directly observable. Given a referendum
price, only a YES/NO answer is observed. In this study, data from both open-
and closed-ended questionnaires are discussed along with the three different ML
estimators.

The remaining chapters of this study are organized as follows.
Chapter 2 describes the statistical nature of self-selection bias and the
rationale of testing and correcting for self-selection bias. Studies concerned with
self-selection bias under both truncated and censored samples are also reviewed.
Chapter 3 derives a self-selection model under the random utility framework. In
addition, ML estimators that transform a truncated survey sample into a censored
sample by adopting information from both the truncated sample and census data
are also developed. The resulting estimates from the ML estimators are
examined by Monte Carlo experiments, and the results are summarized in
Chapter 4. Finally, concluding remarks are given in Chapter 5.

 

CHAPTER 2
LITERATURE REVIEW

2.1 Introduction

Self-selection has been considered by economists, particularly so by labor
economists, for some time. In most of the studies that include this issue, self-
selection is used in modeling the earnings among different sectors. For example,
let $y_1$ and $y_2$ represent the potential earnings for two sectors. An individual will
work in sector 1 only if $y_1 > y_2$; then $E(y_1 \mid y_1 > y_2) \neq E(y_1)$. In a conventional
self-selection model, two separate equations are used to model $y_1$ and $y_2$. Based
on income maximization, a latent variable is modeled by a third equation which
describes $I^*$ as a function of $(y_1 - y_2)$. The individual will work in sector 1 if $I^* > 0$
($y_1 > y_2$), and in sector 2 otherwise. An econometric model that is designed for
this type of self-selection is often called a switching regression model.
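
The inequality between the conditional and unconditional expectations can be illustrated with a small simulation. This is only a sketch and is not taken from the studies cited here: the independent normal earnings distributions, their parameters, and all variable names are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)

# Hypothetical potential earnings in two sectors: independent N(10, 1) draws.
# The distributions and parameters are illustrative assumptions.
draws = [(random.gauss(10, 1), random.gauss(10, 1)) for _ in range(100_000)]

# Population mean of sector-1 earnings, E(y1).
mean_y1_all = statistics.fmean(y1 for y1, _ in draws)

# Mean of sector-1 earnings among those who self-select into sector 1,
# i.e. E(y1 | y1 > y2).
mean_y1_selected = statistics.fmean(y1 for y1, y2 in draws if y1 > y2)

print(f"E(y1)           ~ {mean_y1_all:.3f}")
print(f"E(y1 | y1 > y2) ~ {mean_y1_selected:.3f}")
```

Under these assumed distributions the self-selected mean lies above the population mean, which is the sense in which earnings observed in one sector do not represent the population.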

Willis and Rosen (1979) modeled the demand for college attendance based
on the comparative advantage in expected lifetime earnings. Considering
simultaneously the demand for and supply of labor, Heckman and Sedlacek
(1985) presented a model of the sectoral allocation of workers from different
demographic types. What made their study unique is their use of aggregate data
to predict earnings for the different sectors, combining these predicted earnings
and micro data to estimate the labor supply in the different sectors. In his study,
Borjas (1987) modeled the earnings for immigrants based on the difference in

wages earned in the U.S. and potential wages in their native countries. Recently,
Heckman and Sedlacek (1990) modeled self-selection based on utility
maximization, instead of self-selection based on earnings.

In a CV study, a self-selection hypothesis would assert that only those who
have enough interest in the topic of study will return the surveys, and that
respondents have different demand behavior than non-respondents. Under this
hypothesis, self-selection in a CV study differs from self-selection in a
conventional switching regression model in two ways. Borrowing the above
two-sector earnings model, assume $y_{i1}$ and $y_{i2}$ are individual i’s demand for goods 1
and 2 respectively. First, instead of modeling demand for both goods 1 and 2, a
CV study usually models only the demand for one good (say, $y_{i1}$). Second, the
self-selection criterion in a CV study is the net utility gain from answering the
survey, while the criterion in a conventional switching regression model is the
potential difference in demand ($y_{i1} - y_{i2}$).

Since only the demand for one good ($y_{i1}$) is modeled in a CV study, the
self-selection model is less complicated than a conventional switching regression
model. With this simplification, the nature of switching regression models is left
intact but the statistical process for estimating the model is simplified.

2.2 A self-selection model
Under a CV framework, consider the following self-selection model
regarding individual i’s demand for a good Q ($q_i$). Individual i’s self-selection
behavior is governed by the self-selection equation

\[
\begin{aligned}
& I_i^* = z_i'\gamma + u_i, \qquad i = 1, 2, \ldots, N, \\
& I_i = 1, \quad \text{iff } I_i^* > 0, \\
& I_i = 0, \quad \text{otherwise.}
\end{aligned}
\]

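In the self-selection equation, $I_i^*$ is unobservable, but $I_i$ is observed. If $I_i = 1$, the
demand equation,

\[
q_i = x_i'\beta + \varepsilon_i,
\]

is then observed. In the self-selection and demand equations, $z_i$ ($x_i$) is a $k \times 1$
($m \times 1$) vector of exogenous variables, $\gamma$ ($\beta$) is a $k \times 1$ ($m \times 1$) vector of parameters to
be estimated, and $u_i$ ($\varepsilon_i$) is a random error. Assume that

\[
\begin{aligned}
u_i &\sim \text{i.i.d. } N(0, 1), \\
\varepsilon_i &\sim \text{i.i.d. } N(0, \sigma^2), \\
(\varepsilon_i, u_i) &\sim \text{i.i.d. } BN(0, 0, \Omega),
\end{aligned}
\]

where $N(\cdot,\cdot)$ and $BN(\cdot,\cdot,\cdot)$ are a univariate and a bivariate normal distribution
respectively, and

\[
\Omega = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{bmatrix}
\]

is the variance-covariance matrix.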

Suppose that researchers are interested in estimating the demand equation.
Since $q_i$ is observed only if $I_i = 1$, the distribution of $q_i$ is truncated, and the
demand equation alone does not correctly specify the demand for the good ($q_i$).
To specify the demand equation correctly, the endogeneity that is caused by the
self-selection behavior must be taken into account. Thus, a correct model
specification is described jointly by the self-selection and the demand equations,
and the objective is to obtain consistent estimates for $\beta$, $\gamma$, $\rho$, and $\sigma^2$.

Under a CV framework, $I_i^*$ in the self-selection equation can be thought of
as individual i’s net utility gain from answering the survey. Self-selection implies
that an individual’s decision to answer and to return the survey is correlated with
the topic of study (i.e. $q_i$, demand for the good Q). In other words, the decision
to answer and to return the survey is endogenous to the study.

2.3 Estimators

In this section, three estimators for correcting self-selection bias are
reviewed. Heckman’s two-stage estimator is first examined,¹ followed by two ML
estimators that are based on either a censored or on a truncated sample.

2.3.1 Heckman’s two-stage estimator

Based on the moments of a truncated bivariate normal distribution,
according to the self-selection model described by the self-selection and demand
equations, Heckman (1979, 1976) demonstrated that

\[
\begin{aligned}
E(q_i \mid x_i, I_i = 1)
&= x_i'\beta + E(\varepsilon_i \mid I_i^* > 0) \\
&= x_i'\beta + E(\varepsilon_i \mid u_i > -z_i'\gamma) \\
&= x_i'\beta + \rho\sigma\,\frac{\phi(-z_i'\gamma)}{1 - \Phi(-z_i'\gamma)} \\
&= x_i'\beta + \alpha\,\frac{\phi(z_i'\gamma)}{\Phi(z_i'\gamma)},
\end{aligned}
\]

where $\alpha = \rho\sigma$ (i.e. the covariance between $\varepsilon$ and $u$), $\phi$ and $\Phi$ are the standard
normal density and distribution functions respectively, and $\phi(\cdot)/\Phi(\cdot)$ is the
inverse Mills ratio, or in some contexts, the hazard rate. The equation
$E(q_i \mid x_i, I_i = 1)$ clearly indicates that OLS regression of $q$ on $x$ leads to inconsistent
estimates for $\beta$.²

1 Although Heckman’s two-stage estimator is well known, it provides a clear
and straightforward explanation of the nature of self-selection bias.

Based on the equation $E(q_i \mid x_i, I_i = 1)$, Heckman proposed a two-stage
method for estimating $\beta$, $\gamma$, and $\alpha$. In the first stage, according to the
self-selection equation, a probit model is used to obtain $\hat{\gamma}$, a consistent estimate of $\gamma$.
The predicted inverse Mills ratio is then calculated as $\phi(z_i'\hat{\gamma})/\Phi(z_i'\hat{\gamma})$. In the
second stage, using only the returned surveys, consistent estimates of $\beta$ and $\alpha$ can
be obtained by regressing $q$ on $x$ and the predicted inverse Mills ratio using
standard OLS procedures.³ Self-selection bias can then be detected by testing the
hypothesis that $\alpha = 0$ (i.e. $\rho = 0$).
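
The two stages can be sketched on simulated data. The code below is an illustration under stated assumptions, not the procedure used in this study: the data-generating values (`RHO`, `SIGMA`, `GAMMA`, `BETA`), the single regressor in each equation, and the helper names `solve`, `wls`, and `probit` are all hypothetical, and the probit is fitted by Fisher scoring written out with the standard library only. It assumes a censored sample, since stage 1 uses z for non-respondents.

```python
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def wls(rows, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy."""
    k = len(rows[0])
    xtx = [[sum(wi * r[a] * r[c] for wi, r in zip(w, rows)) for c in range(k)]
           for a in range(k)]
    xty = [sum(wi * r[a] * yi for wi, r, yi in zip(w, rows, y)) for a in range(k)]
    return solve(xtx, xty)


def probit(rows, ind, iters=12):
    """Probit ML by Fisher scoring (iteratively reweighted least squares)."""
    g = [0.0] * len(rows[0])
    for _ in range(iters):
        eta = [sum(gj * rj for gj, rj in zip(g, r)) for r in rows]
        p = [min(max(STD.cdf(e), 1e-10), 1 - 1e-10) for e in eta]
        d = [STD.pdf(e) for e in eta]
        w = [di * di / (pi * (1 - pi)) for di, pi in zip(d, p)]
        resid = [(yi - pi) / di for yi, pi, di in zip(ind, p, d)]
        g = [gj + sj for gj, sj in zip(g, wls(rows, resid, w))]
    return g


random.seed(1)
N, RHO, SIGMA = 10_000, 0.8, 2.0           # illustrative values
GAMMA, BETA = [0.0, 1.0], [1.0, 2.0]       # true (intercept, slope) pairs

z_rows, x_rows, q, ret = [], [], [], []
for _ in range(N):
    z = random.gauss(0, 1)
    x = 0.5 * z + random.gauss(0, 1)       # x and z are correlated
    u = random.gauss(0, 1)
    eps = SIGMA * (RHO * u + (1 - RHO**2) ** 0.5 * random.gauss(0, 1))
    z_rows.append([1.0, z])
    x_rows.append([1.0, x])
    ret.append(1 if GAMMA[0] + GAMMA[1] * z + u > 0 else 0)
    q.append(BETA[0] + BETA[1] * x + eps)

# Stage 1: probit on the full sample (requires z for non-respondents).
gamma_hat = probit(z_rows, ret)

# Stage 2: OLS of q on x and the predicted inverse Mills ratio, respondents only.
resp = [i for i in range(N) if ret[i] == 1]


def imr(i):
    eta = sum(gj * zj for gj, zj in zip(gamma_hat, z_rows[i]))
    return STD.pdf(eta) / STD.cdf(eta)


ones = [1.0] * len(resp)
q_resp = [q[i] for i in resp]
naive = wls([x_rows[i] for i in resp], q_resp, ones)
two_stage = wls([x_rows[i] + [imr(i)] for i in resp], q_resp, ones)
print("naive OLS (b0, b1):       ", [round(b, 2) for b in naive])
print("two-stage (b0, b1, alpha):", [round(b, 2) for b in two_stage])
```

With these assumed values, the coefficient on the inverse Mills ratio estimates $\rho\sigma = 1.6$, and the naive slope is pulled away from its true value because x is correlated with the omitted ratio through z.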

 

2 Unless $\phi(\cdot)/\Phi(\cdot)$ is orthogonal to $x$, there is omitted variable bias.

3 Due to heteroscedasticity, feasible generalized least squares (FGLS) can
also be used (Greene, 1990, pp. 739 - 74 ). FGLS may result in more efficient
estimates.


Based on the equation $E(q_i \mid x_i, I_i = 1)$, self-selection bias can be viewed
as an omitted variable bias, where the omitted variable is the inverse Mills ratio.

Several observations concerning self-selection bias can be drawn from the
equation $E(q_i \mid x_i, I_i = 1)$. First, there is no self-selection bias if $\varepsilon_i$ and $u_i$ are
uncorrelated ($\rho = 0$). Second, since $|\rho| \in [0, 1]$, if $\sigma$ is small and $\Phi(\cdot)$ is close to 1,
then $\rho\sigma[\phi(\cdot)/\Phi(\cdot)]$ can be very close to zero. If this is the case, there will be
no self-selection bias. This can happen when the response rate is high (i.e. for
each individual, $\phi(\cdot)/\Phi(\cdot)$ is close to 0). Third, as pointed out by Heckman
(1976), if $\rho\sigma[\phi(\cdot)/\Phi(\cdot)]$ is a constant, then all of the slope coefficients
estimated using OLS (where only the returned surveys are used) are consistent
except for the constant term⁴ (this can also be seen from Figure 1.4, Chapter 1,
Section 1.2.2). Fourth, $E(q_i \mid x_i, z_i)$, the conditional expectation given $x_i$ and $z_i$
(unconditional on whether the survey is returned or not), can be derived as

 

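\[
\begin{aligned}
E(q_i \mid x_i, z_i)
&= \left[ x_i'\beta + \alpha\,\frac{\phi(z_i'\gamma)}{\Phi(z_i'\gamma)} \right] \cdot \Phi(z_i'\gamma)
 + \left[ x_i'\beta - \alpha\,\frac{\phi(z_i'\gamma)}{\Phi(-z_i'\gamma)} \right] \cdot \Phi(-z_i'\gamma) \\
&= x_i'\beta \cdot \left[ \Phi(z_i'\gamma) + \Phi(-z_i'\gamma) \right] \\
&= x_i'\beta.
\end{aligned}
\]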
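
The cancellation in this derivation can be checked numerically. The sketch below weights the two conditional means by the probabilities of returning and not returning the survey; the function name `mean_demand` and the input values are hypothetical and serve only as an illustration.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def mean_demand(xb, zg, alpha):
    """E(q | x, z) as a probability-weighted sum of the two conditional means.

    xb = x'beta, zg = z'gamma, alpha = rho * sigma.
    """
    p_return = STD.cdf(zg)                                       # Prob(I = 1)
    mean_returned = xb + alpha * STD.pdf(zg) / STD.cdf(zg)       # E(q | x, I = 1)
    mean_not_returned = xb - alpha * STD.pdf(zg) / STD.cdf(-zg)  # E(q | x, I = 0)
    return mean_returned * p_return + mean_not_returned * (1 - p_return)


# Illustrative (assumed) values of x'beta, z'gamma, and alpha.
print(mean_demand(xb=3.0, zg=0.4, alpha=1.6))
```

Whatever value of alpha is supplied, the two selection terms offset each other and the result equals x'beta up to floating-point error.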

Although Heckman’s two-stage method offers a convenient way to estimate
and to test for self-selection bias, it suffers from three major disadvantages. First,
in order to estimate $\gamma$ and to calculate the inverse Mills ratio, the $z$ matrix for

 

4 This could happen if $\gamma = 0$. Recall that $E(q_i \mid x_i, I_i = 1) = x_i'\beta + E(\varepsilon_i \mid u_i > -z_i'\gamma)$;
if $\gamma = 0$, then $E(\varepsilon_i \mid u_i > 0) = \rho\sigma\,\phi(0)/\Phi(0)$, which is a constant.

both respondents and non-respondents must be known. In a truncated sample, $z$
is not known for non-respondents. Hence, Heckman’s two-stage method
cannot be applied to data from a truncated sample. Second, since only the
returned surveys are used in the second stage, it is less efficient than if the full
sample is used.⁵ Third, the conventional formula used in OLS to calculate the
variance-covariance matrix does not provide the correct variance-covariance
matrix for the second stage OLS estimation.

Instead of using Heckman’s two-stage method, one alternative for deriving
consistent estimates is to estimate the self-selection and the demand equations
jointly by ML estimators. Depending on the nature of the sample, there are
essentially two types of likelihood functions that can be specified. The first type
of likelihood function is specified for a censored sample and the second type for a
truncated sample.

2.3.2 Self-selection with a censored sample

According to the self-selection and the demand equations, a censored
sample indicates that for those whose $I_i = 0$ (individual i did not return the
survey), $x_i$ and $z_i$ can still be observed. In a censored sample, for those who did
not return the survey, $z_i$ is still available and can be used to explain individual i’s
self-selection behavior. In addition, the explanatory variables, $x_i$, in the demand
equation can also be observed. For a censored sample, consistent and efficient
estimates for $\beta$, $\gamma$, $\rho$, and $\sigma^2$ can all be obtained by maximizing the likelihood
function⁶

5 It is also possible to use the full sample (Maddala, 1983, p. 159).

6 In practice, it is the log-likelihood function that is maximized. However, the
likelihood function simplifies interpretation.

 


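\[
L_C = \prod_{I_i = 1} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du
\;\cdot\; \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-z_i'\gamma} g(\varepsilon,\, u,\, \Omega)\, du\, d\varepsilon,
\]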
where $g(\cdot,\cdot,\cdot)$ is a bivariate normal density function. In the likelihood function, $L_C$,
the first term is the likelihood for those who returned the survey and is the
product of the conditional density of $q_i$ given that individual i returned the survey.
The second term is the likelihood for those who did not return the survey and is
the product of the joint distribution function.⁷
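
For computation, the single integral in the first product has a closed form under the bivariate normal assumption: the marginal density of the demand error multiplied by the conditional probability that the survey is returned. That closed form is a standard property of the bivariate normal distribution rather than a formula stated in this chapter, so the sketch below checks it against direct numerical integration. All function names and input values are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def g(e, u, sigma, rho):
    """Bivariate normal density with Var(e) = sigma^2, Var(u) = 1, Cov(e, u) = rho*sigma."""
    t = e / sigma
    quad = (t * t - 2 * rho * t * u + u * u) / (1 - rho**2)
    return math.exp(-0.5 * quad) / (2 * math.pi * sigma * math.sqrt(1 - rho**2))


def respondent_term_numeric(e, zg, sigma, rho, width=12.0, steps=6000):
    """Trapezoid approximation of the integral of g over u from -z'gamma upward."""
    lo = -zg
    h = width / steps
    total = 0.5 * (g(e, lo, sigma, rho) + g(e, lo + width, sigma, rho))
    total += sum(g(e, lo + k * h, sigma, rho) for k in range(1, steps))
    return total * h


def respondent_term_closed(e, zg, sigma, rho):
    """Same integral: density of e times Prob(u > -z'gamma | e)."""
    cond = (zg + rho * e / sigma) / math.sqrt(1 - rho**2)
    return STD.pdf(e / sigma) / sigma * STD.cdf(cond)


def nonrespondent_term(zg):
    """Double integral in the second product: Prob(u <= -z'gamma)."""
    return STD.cdf(-zg)


# Illustrative (assumed) values: e = q - x'beta, zg = z'gamma.
print(respondent_term_numeric(1.0, 0.3, 2.0, 0.5))
print(respondent_term_closed(1.0, 0.3, 2.0, 0.5))
```

The log-likelihood is then the sum of the logs of these terms over respondents and non-respondents.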

2.3.3 Self-selection with a truncated sample

By definition, a censored sample implies that all non-respondents’ x’s and
z’s can still be observed. In practice, this does not seem to be the case for most
of the CV studies. Very often, data used in CV studies are truncated; namely,
neither x’s nor z’s can be observed from non-respondents.

According to the self-selection and the demand equations, a truncated
sample indicates that when $I_i = 0$, none of $q_i$, $x_i$, and $z_i$ is observed. In other
words, we know nothing of those who did not return the survey. In the case of a
truncated sample, the self-selection and demand equations can be estimated
jointly by maximizing the likelihood function


. 7 For further discussion of self-selection models under a censored sample, see
Iantfle (1(9895‘i)1.ee (1984), Goldberger (1981), Greene ( 1981), Olsen (1980), and
eson .

\[
L_t = \prod_{I_i = 1} \frac{\displaystyle\int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du}{\Phi(z_i'\gamma)},
\]
and the estimated $\beta$, $\gamma$, $\sigma^2$, and $\rho$ are consistent (Bloom and Killingsworth,
1985).⁸ The likelihood function, $L_t$, is based on a truncated normal distribution.
The numerator of i’s likelihood is the conditional density of $q_i$ given that
individual i returned the survey, and the denominator is the probability that
individual i returns the survey. In a self-selection model with a truncated sample,
only the information from returned surveys is available for use in estimation.⁹
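
Dividing by the probability of returning the survey is what makes each respondent's contribution a proper density over the observed demand. The sketch below checks this numerically for one hypothetical respondent. The closed form used for the numerator is a standard bivariate normal result, not a formula from this chapter, and the function names and parameter values are assumptions.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def truncated_density(q, xb, zg, sigma, rho):
    """Likelihood contribution of one respondent in a truncated sample."""
    e = q - xb
    cond = (zg + rho * e / sigma) / math.sqrt(1 - rho**2)
    numerator = STD.pdf(e / sigma) / sigma * STD.cdf(cond)  # integral of g over u > -z'gamma
    return numerator / STD.cdf(zg)                          # divide by Prob(I = 1)


def integrate_over_q(xb, zg, sigma, rho, width=10.0, steps=4000):
    """Trapezoid integral of the truncated density over q in xb +/- width*sigma."""
    lo, hi = xb - width * sigma, xb + width * sigma
    h = (hi - lo) / steps
    total = 0.5 * (truncated_density(lo, xb, zg, sigma, rho)
                   + truncated_density(hi, xb, zg, sigma, rho))
    total += sum(truncated_density(lo + k * h, xb, zg, sigma, rho)
                 for k in range(1, steps))
    return total * h


# Illustrative (assumed) values of x'beta, z'gamma, sigma, and rho.
print(integrate_over_q(xb=1.0, zg=0.3, sigma=2.0, rho=0.5))
```

The integral is one up to numerical error, so the truncated likelihood uses only respondents and carries no separate term for non-respondents.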

In most CV studies, data used in econometric analyses are obtained from
truncated samples. According to Bloom and Killingsworth (1985), self-selection
models with a truncated sample should not create problems in econometric
analyses when the ML estimator is applied. However, bearing in mind that $x_i$ and
$z_i$ are likely to have variables in common, or at least to be highly correlated, it
seems unlikely that one would be able to obtain good estimates of parameters
other than $\beta$ (Pudney, 1989, p. 83). A study by Muthén and Jöreskog (1983)
using Monte Carlo experiments tends to confirm this suspicion and shows that the
estimate for $\gamma$ is not reliable even in large samples,¹⁰ although it is possible to
correct for self-selection bias in the $\beta$ coefficients. This is a major disadvantage of
using data from a truncated sample.

8 The same likelihood function is also presented by Maddala (1983, pp. 150
an .

9 Unlike the studies conducted by Hausman and Wise (1981, 1977), where
the data are acquired from a sample that is truncated by an exogenous variable,
a truncated sample in this study is a sample that is truncated by an endogenous
variable.

10 Part of Muthén and Jöreskog’s (1983) results are reported in Appendix A.

2.4 Summary

The self-selection model considered in this study consists of two
components. The ﬁrst is the self-selection equation which is a probit-type
equation, and the second is a demand equation. Under a CV framework, the
analyses began with an examination of self-selection models with either a
censored or a truncated sample. Econometric analyses with a censored sample
were found to have preferred properties (i.e. consistency and efﬁciency).
However, censored samples are generally not available for most CV studies. The
majority of CV studies use surveys to collect data. Since data is collected only
from respondents, the sample is truncated. According to existing econometric
methods, in order to test and to correct for self-selection bias in CV studies, the
ML estimator that is based on a truncated sample (Bloom and Killingsworth,
1985) must be adopted. In theory, ML estimators lead to consistent and efﬁcient
estimates provided that the likelihood function is correctly speciﬁed. However,

Monte Carlo experiments have shown that in truncated samples, parameters in
the self-selection equation could not be estimated reliably even with large samples
(Muthén and Jöreskog, 1983).

This disadvantage of the ML estimator that is based on a truncated
sample motivates the derivation of ML estimators in the next chapter, which
combine individual survey data with census data and transform a truncated sample
into a censored sample.

CHAPTER 3
SELF-SELECTION MODELS WITH MEASUREMENT ERRORS

A self-selection model consists of two correlated components. The ﬁrst
component is a self-selection equation which is essentially a probit model. The
second component is a demand equation. Under self-selection, an individual’s
demand is observed only if the corresponding latent variable which is generated
by the self-selection equation has a value greater than zero. In this study, the
self-selection models differ from conventional models. For those individuals with
unobserved demand, all of the independent variables in the self-selection equation
are observed but with measurement errors.

This chapter begins by describing a self-selection equation. The self-
selection equation is developed using the random utility model under a CV
framework. A probit model with measurement errors is derived where the r.h.s.
variables are measured with errors whenever the l.h.s. latent variable has a value
less than or equal to zero. A self-selection model with measurement errors is
developed based on the probit model with measurement errors described above
and a linear demand equation. Finally, the model is generalized to allow for
qualitative and limited dependent variables in the demand equation.

3.1 Self-selection based on a random utility model under a CV framework

Assume a CV study uses mail surveys to elicit the demand for a good Q.
In addition, the questionnaires used in the survey are open-ended. Individual i’s
demand for the good is $q_i$.¹ However, $q_i$ is observed only if individual i returned
the survey and gave valid answers.²

Suppose that individual i’s decision to return the survey is based on his net
utility gain from answering the survey. Individual i will return the survey only if
the net utility gain from answering the survey is positive. However, individual i’s
net utility gain cannot be observed directly; only the realization of the net utility
gain (i.e. to return or not to return the survey) is observed.

To model an individual’s self-selection behavior, assume that an individual
maximizes utility subject to both a budget and a time constraint:

\[
\begin{aligned}
\max \;& U(C, L, t \cdot I \mid s) \\
\text{s.t.}\;& w \cdot T^* - w \cdot L - w \cdot t \cdot I = P \cdot C, \\
& T^* = T + L + t \cdot I,
\end{aligned}
\tag{1}
\]

where C is a composite good that the individual consumes at price P, L is leisure
time spent, t is the time devoted to answering the survey, I is an indicator which
equals 1 if he answers the survey and 0 otherwise, w is the wage rate, s is a vector
of socio-economic and demographic variables other than the wage rate, $T^*$ is the
total time available (which is fixed), T is the time devoted to market work and is
also assumed to be fixed. At maximum utility,

\[
U_L = \frac{U_I}{t}, \tag{2}
\]

where $U_L$ and $U_I$ are the marginal utilities of leisure and of answering the survey
respectively. At utility maximization, equation (2) states that the marginal utility
per unit of time for answering the survey equals the marginal utility of leisure (i.e.
the marginal utility of answering the survey is equal to the marginal utility of
leisure multiplied by the time used in answering the survey).

1 At this stage, $q_i$ is assumed to be continuous and $-\infty < q_i < \infty$. This
assumption will be relaxed later in this chapter.

2 All of the returned surveys are assumed to have valid answers. This
assumption excludes the case of item non-response.

The individual’s indirect utility function can be written as

\[
U^*(w, P, T, t \mid s, I). \tag{3}
\]

Let C be the numeraire, and set P = 1. Furthermore, assume that t is constant
across individuals. Since T is treated as fixed, the indirect utility function
becomes

\[
U^*(Y \mid s, I), \tag{4}
\]

where $Y$ ($= w \cdot T$) is the individual’s income. The condition for an individual to
answer and to return the survey is

\[
\begin{aligned}
U^*(Y \mid s, I = 1) - U^*(Y \mid s, I = 0) &= V(Y, s) \\
&= V(z) > 0,
\end{aligned}
\tag{5}
\]

where z = (Y, s) is a vector of socio-economic and demographic variables
(including income).

Assume V(z) is a linear function of all elements in z, and u is a random
error drawn from a standard normal distribution. Individual i’s self-selection
equation can be expressed by a standard probit model:

\[
\begin{aligned}
& I_i^* = z_i'\gamma + u_i, \qquad u_i \sim \text{i.i.d. } N(0, 1), \\
& I_i = 1, \quad \text{iff } I_i^* > 0, \\
& I_i = 0, \quad \text{otherwise.}
\end{aligned}
\tag{6}
\]

Equation (6) is the self-selection equation that models individual i’s decision
behavior. In equation (6), $I_i^*$ is the net (indirect) utility gain and cannot be
observed, $z_i$ is a column vector consisting of exogenous variables (including
income) that explain individual i’s net (indirect) utility gain, $\gamma$ is a column vector
of parameters to be estimated, and N(0, 1) represents a standard normal
distribution. Although $I_i^*$ cannot be observed, researchers can observe $I_i$.

3.2 A probit model with measurement errors

In this study, measurement errors occur when proxy variables are used to
approximate the true values of the exogenous variables in the self-selection
equation for non-respondents.

As stated earlier, the self-selection equation is essentially a probit model.
Before the self-selection model with measurement errors can be studied, a probit
model with measurement errors must be discussed.

3.2.1 Derivation of the probit model with measurement errors

Following the notation used in the previous sections, the derivation of a
probit model with measurement errors begins with

\[
I_i^* = z_i'\gamma + u_i, \qquad u_i \sim \text{i.i.d. } N(0, 1). \tag{7}
\]

As before, $I_i^*$ is a latent variable, $z_i$ is a $k \times 1$ vector of independent variables, $\gamma$ is a
$k \times 1$ vector of parameters, and $u_i$ is an error term drawn from a standard normal
distribution. $I_i^*$ cannot be observed; however, $I_i$ can be observed. In addition, $I_i$
equals 1 if $I_i^* = z_i'\gamma + u_i > 0$, and 0 otherwise.
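For a respondent, $I_i = 1$, $z_i$ can be observed, and the likelihood for the
respondent is derived as:

\[
\begin{aligned}
& z_i'\gamma + u_i > 0 \\
\Rightarrow\;& u_i > -z_i'\gamma \\
\Rightarrow\;& \text{Prob}(u_i > -z_i'\gamma) = 1 - \Phi(-z_i'\gamma) = \Phi(z_i'\gamma),
\end{aligned}
\tag{8}
\]

where $\Phi(\cdot)$ is a standard normal distribution function.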

For a non-respondent, $I_i = 0$, $z_i$ cannot be observed. However, $\mu_i$, which is
the average value of $z_i$, is estimated using a random sample drawn from individual
i’s neighborhood (e.g. a census block³).⁴ Let $n_i$ be the size of the random sample
and $z_i \sim N(\mu_i^*, \Sigma_i^*)$. Obviously,

\[
\mu_i \sim N\!\left(\mu_i^*,\; \frac{\Sigma_i^*}{n_i}\right). \tag{9}
\]

 

3 This can be a census block, a county, a state, or even a region. For
convenience, a census block is used in the following analyses.

4 For example, this can be done by matching the mailing list with the census
block to obtain the average value of each $z_i$ available in the census data.

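Define measurement errors as $v_i = z_i - \mu_i$, then

\[
\begin{aligned}
& v_i \sim N\!\left(0,\; \Sigma_i^* - \frac{1}{n_i}\Sigma_i^*\right) \\
\Rightarrow\;& v_i \sim N\!\left(0,\; \frac{n_i - 1}{n_i}\,\Sigma_i^*\right).
\end{aligned}
\tag{10}
\]

In general, $(n_i - 1)/n_i \approx 1$, so the distribution of $v_i$ can be approximated by⁵

\[
v_i \sim N(0, \Sigma_i^*). \tag{11}
\]
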
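To derive the likelihood for a non-respondent, the self-selection equation
for a non-respondent can be written as

\[
\begin{aligned}
& z_i'\gamma + u_i \le 0 \\
\Rightarrow\;& (\mu_i + v_i)'\gamma + u_i \le 0 \\
\Rightarrow\;& \mu_i'\gamma + w_i \le 0, \quad \text{and } w_i = u_i + v_i'\gamma.
\end{aligned}
\tag{12}
\]

Further, assume that $u_i$ and $v_i$ are independent. Then

\[
w_i \sim N(0, \sigma_i^2), \quad \text{and} \quad \sigma_i^2 = 1 + \gamma'\Sigma_i^*\gamma. \tag{13}
\]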
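
The variance in equation (13) can be checked by simulation. The sketch below draws u and v independently and compares the sample variance of w with $1 + \gamma'\Sigma\gamma$. The two-variable setup, the values in `GAMMA` and `SIGMA`, and the name `draw_w` are hypothetical choices for illustration only.

```python
import random
import statistics

random.seed(2)

GAMMA = [0.5, -1.0]                  # hypothetical selection coefficients
CHOL = [[1.0, 0.0], [0.6, 0.8]]      # Cholesky factor of SIGMA below
SIGMA = [[1.0, 0.6], [0.6, 1.0]]     # assumed covariance of the measurement errors


def draw_w():
    """One draw of w = u + v'gamma with u ~ N(0, 1), v ~ N(0, SIGMA), independent."""
    n1, n2 = random.gauss(0, 1), random.gauss(0, 1)
    v = [CHOL[0][0] * n1, CHOL[1][0] * n1 + CHOL[1][1] * n2]
    u = random.gauss(0, 1)
    return u + v[0] * GAMMA[0] + v[1] * GAMMA[1]


simulated_var = statistics.pvariance([draw_w() for _ in range(100_000)])
theoretical_var = 1 + sum(GAMMA[a] * SIGMA[a][b] * GAMMA[b]
                          for a in range(2) for b in range(2))
print(simulated_var, theoretical_var)
```

The independence of u and v is what allows the two variances to add; if they were correlated, a covariance term would enter the variance of w.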

5 Alternatively, the unobserved $z_i$ can be decomposed into the sum of a
deterministic component, $\mu_i^*$, and a random component, $v_i$, with $v_i \sim N(0, \Sigma_i^*)$.
Now replace $\mu_i^*$ with its consistent estimate, $\mu_i$. We have $z_i = \mu_i + v_i$, and
$v_i \sim N(0, \Sigma_i^*)$.

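The likelihood for a non-respondent can then be derived as⁶

\[
\begin{aligned}
& \mu_i'\gamma + w_i \le 0, \qquad w_i \sim N(0, \sigma_i^2) \\
\Rightarrow\;& w_i \le -\mu_i'\gamma \\
\Rightarrow\;& \text{Prob}(w_i \le -\mu_i'\gamma) = \Phi\!\left(\frac{-\mu_i'\gamma}{\sigma_i}\right).
\end{aligned}
\tag{14}
\]

Based on equations (8) through (14), the likelihood function for the probit
model with measurement errors is

\[
L = \prod_{I_i = 1} \Phi(z_i'\gamma) \cdot \prod_{I_i = 0} \Phi\!\left(\frac{-\mu_i'\gamma}{\sigma_i}\right). \tag{15}
\]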

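6 As with a regular probit model, $\gamma$ can only be identified up to a scalar
multiple. Let k be a scalar and k > 0; according to equations (12) and (13),

\[
\begin{aligned}
& z_i'\gamma + u_i \le 0 \;\Rightarrow\; k z_i'\gamma + k u_i \le 0 \\
\Rightarrow\;& k\mu_i'\gamma + (k u_i + k v_i'\gamma) \le 0, \quad \text{and } (k u_i + k v_i'\gamma) \sim N(0,\; k^2 + k^2\gamma'\Sigma_i^*\gamma) \\
\Rightarrow\;& \text{Prob}(k u_i + k v_i'\gamma \le -k\mu_i'\gamma) = \Phi\!\left(\frac{-k\mu_i'\gamma}{k\sqrt{1 + \gamma'\Sigma_i^*\gamma}}\right).
\end{aligned}
\]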


Comparing the likelihood function for the probit model with measurement
errors to the likelihood function for a regular probit model,⁷ the difference
between these two likelihood functions is found in the second term, representing
the likelihood for non-respondents. When average characteristics from
non-respondents’ neighborhoods, $\mu_i$, replace the true value of non-respondents’
characteristics, $z_i$, variance is increased from 1 to $\sigma_i^2$ ($= 1 + \gamma'\Sigma_i^*\gamma$). By
combining the individual survey data with the census data, the original truncated
sample becomes a censored sample. However, due to measurement errors,
members in the new censored sample are independently but not identically
distributed. For respondents, $u_i \sim$ i.i.d. $N(0, 1)$, but for non-respondents,
$w_i \sim N(0, \sigma_i^2)$.
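
A log-likelihood of the form in equation (15) could be coded as below. This is a minimal sketch under stated assumptions rather than the estimation code used in this study: the function name `me_probit_loglik`, the nested-list data layout, and the example values are all hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.cdf is Phi


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def me_probit_loglik(gamma, z_resp, mu_nonresp, cov_nonresp):
    """Log of likelihood (15).

    z_resp      : observed z vectors for respondents
    mu_nonresp  : neighborhood mean vectors for non-respondents
    cov_nonresp : matching variance-covariance matrices (nested lists)
    """
    k = len(gamma)
    loglik = sum(math.log(STD.cdf(dot(z, gamma))) for z in z_resp)
    for mu, cov in zip(mu_nonresp, cov_nonresp):
        quad = sum(gamma[a] * cov[a][b] * gamma[b]
                   for a in range(k) for b in range(k))
        sigma_i = math.sqrt(1 + quad)          # standard deviation of w_i
        loglik += math.log(STD.cdf(-dot(mu, gamma) / sigma_i))
    return loglik


# Hypothetical data: an intercept plus one characteristic.
print(me_probit_loglik(
    gamma=[0.2, 1.0],
    z_resp=[[1.0, 0.5], [1.0, 1.2]],
    mu_nonresp=[[1.0, -0.4]],
    cov_nonresp=[[[0.0, 0.0], [0.0, 0.5]]],
))
```

When every covariance matrix is zero, $\sigma_i = 1$ and the function reduces to the regular probit log-likelihood, which mirrors the comparison made in the text.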

3.2.2 Parameter identification in the probit model with measurement errors

It is well known that a measurement errors model suffers from problems of
parameter identification (Fuller, 1987). In practice, the probit model with
measurement errors derived above suffers the same problems, namely ($\gamma$, $\Sigma_i^*$)
cannot be identified simultaneously.⁸ To apply the probit model with
measurement errors without further complicating the model, one alternative is to
replace $\Sigma_i^*$ by its consistent estimates.

7 The likelihood function for a regular probit model is

\[
L = \prod_{I_i = 1} \Phi(z_i'\gamma) \cdot \prod_{I_i = 0} \Phi(-z_i'\gamma).
\]

8 Since the number of parameters (elements in $\Sigma_i^*$) increases with the sample
size, there is an incidental parameters problem.


From census data, there are two candidates that can be chosen to replace
$\Sigma_i^*$. The first candidate, $\Sigma_i$, is a variance-covariance matrix estimated from a
sample drawn from the census block for non-respondent i. Typically, $\Sigma_i \neq \Sigma_j$
unless non-respondents i and j live in the same census block.

In contrast to $\Sigma_i$, whose values vary across non-respondents, the second
candidate, $\Sigma$, is a constant variance-covariance matrix estimated from a sample
drawn from the population. This same constant variance-covariance matrix, $\Sigma$, is
applied to all the non-respondents.

In practice, $\Sigma$ and $\Sigma_i$ can be calculated using the "Public-Use Microdata
Samples."⁹ Researchers can purchase a 5-percent "Public-Use Microdata
Sample" and use this sample to calculate $\Sigma$; or the 5-percent sample can be
broken down into census blocks¹⁰ and $\Sigma_i$ can be calculated from each census
block.
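
The two candidates differ only in which records enter the covariance calculation. The sketch below computes a block-level matrix for each census block and a pooled matrix from all records. The record layout (block id, income, age), the data values, and the helper name `cov_matrix` are hypothetical and do not describe the actual microdata files.

```python
from collections import defaultdict


def cov_matrix(rows):
    """Sample variance-covariance matrix (divisor n - 1) of equal-length rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(k)] for a in range(k)]


# Hypothetical microdata records: (census block id, income, age).
records = [
    ("A", 1.0, 2.0), ("A", 3.0, 6.0), ("A", 5.0, 10.0),
    ("B", 2.0, 1.0), ("B", 4.0, 1.0), ("B", 6.0, 4.0),
]

by_block = defaultdict(list)
for block, *values in records:
    by_block[block].append(values)

# First candidate: one matrix per census block.
sigma_by_block = {block: cov_matrix(rows) for block, rows in by_block.items()}
# Second candidate: a single matrix from the pooled sample.
sigma_pooled = cov_matrix([values for _, *values in records])

print(sigma_by_block["A"])
print(sigma_by_block["B"])
print(sigma_pooled)
```

Each non-respondent would then be assigned either the matrix of the block matched to the mailing address or the single pooled matrix.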

In terms of empirical results, since both $\Sigma$ and $\Sigma_i$ lead to consistent
parameter estimates, it is difficult to determine whether $\Sigma$ or $\Sigma_i$ will do better.
Consequences of using $\Sigma$ and $\Sigma_i$ will be examined by Monte Carlo experiments in
the next chapter.¹¹

9 The "Public-Use Microdata Sample" can be purchased from the U. S.
Department of Commerce, Bureau of Census, ph: (301) 763-2005.

10 An alternative is to purchase a 5-percent "Public-Use Microdata Sample"
for each census block.

11 To simplify notation in the following sections of this chapter, $\Sigma_i$ is used to
represent either $\Sigma$ or $\Sigma_i$.

3.3 A self-selection model with measurement errors and a linear demand equation

A self-selection model with measurement errors is derived in this section
by replacing the self-selection equation (a probit model) in a self-selection
model with the probit model with measurement errors.

Recall that the self-selection equation with measurement errors is defined
as:

(1) For a respondent, $I_i = 1$,

\[
z_i'\gamma + u_i > 0, \qquad u_i \sim \text{i.i.d. } N(0, 1). \tag{16}
\]

(2) For a non-respondent, $I_i = 0$,

\[
\begin{aligned}
& z_i'\gamma + u_i \le 0, \qquad u_i \sim \text{i.i.d. } N(0, 1) \\
\Rightarrow\;& \mu_i'\gamma + w_i \le 0, \quad w_i = u_i + v_i'\gamma, \\
& w_i \sim N(0, \sigma_i^2), \quad \text{and} \quad \sigma_i^2 = 1 + \gamma'\Sigma_i\gamma.
\end{aligned}
\tag{17}
\]

To derive the self-selection model with measurement errors, assume that
individual i’s demand for a good (Q) is

\[
q_i = x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \sim \text{i.i.d. } N(0, \sigma^2), \tag{18}
\]

where $x_i$ and $\beta$ are both $m \times 1$ vectors. Further, assume that ($\varepsilon_i$, $u_i$) are distributed
jointly as a bivariate normal distribution with a density function

\[
g(0, 0, \Omega), \tag{19}
\]

where

\[
\Omega = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{bmatrix} \tag{20}
\]

is the variance-covariance matrix, and $\rho$ is the correlation coefficient.

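For a respondent, the likelihood is

\[
\int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du. \tag{21}
\]

For non-respondents, assume that the demand is uncorrelated with the
measurement errors (i.e. $\text{Cov}(\varepsilon_i, v_i) = 0$), then ($\varepsilon_i$, $w_i$) are distributed jointly as a
bivariate normal distribution with a density function

\[
g(0, 0, \Gamma_i), \tag{22}
\]

where

\[
\Gamma_i = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & \sigma_i^2 \end{bmatrix} \tag{23}
\]

is the variance-covariance matrix.¹² The likelihood for a non-respondent can
then be written as

\[
\int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(\varepsilon,\, w,\, \Gamma_i)\, dw\, d\varepsilon. \tag{24}
\]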

 

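12
\[
\begin{aligned}
E(w_i \varepsilon_i) &= E[(u_i + v_i'\gamma)\varepsilon_i] = E[(u_i + (z_i - \mu_i)'\gamma)\varepsilon_i] \\
&= E(u_i\varepsilon_i + z_i'\gamma\,\varepsilon_i - \mu_i'\gamma\,\varepsilon_i) = E(u_i \varepsilon_i) \\
&= \rho\sigma.
\end{aligned}
\]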

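Based on equations (16) through (24), the likelihood function for the
self-selection model with measurement errors and a linear demand equation is

\[
L_L = \prod_{I_i = 1} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du
\;\cdot\; \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(\varepsilon,\, w,\, \Gamma_i)\, dw\, d\varepsilon. \tag{25}
\]

Consistent estimates for ($\gamma$, $\rho$, $\beta$, $\sigma^2$) can be obtained by maximizing $\ln(L_L)$.
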
Comparing the likelihood function for the self-selection model with
measurement errors to the likelihood function for the self-selection model with a
censored sample (Chapter 2, Section 2.3.2), the difference between these two
likelihood functions is found in the second term, representing the likelihood for
non-respondents. When average characteristics from non-respondents’
neighborhoods, $\mu_i$, replace the true value of non-respondents’ characteristics, $z_i$,
variance in the self-selection equation is changed from 1 to $\sigma_i^2$ ($= 1 + \gamma'\Sigma_i\gamma$). In
the self-selection model with measurement errors, members in the sample are
independently, but no longer identically, distributed.

In addition, compared to the ML estimates from a self-selection model
with a truncated sample (Chapter 2, Section 2.3.3), the ML estimates from a
self-selection model with measurement errors are more efficient due to the newly
introduced information $\mu_i$ (the average characteristics from non-respondents’
neighborhoods) and $\Sigma_i$ (the corresponding variance-covariance matrix).
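
A log-likelihood with the structure of equation (25) could be evaluated as sketched below. Two simplifications are assumptions of this illustration rather than statements from the text: the respondent integral is written in its closed bivariate normal form, and the non-respondent double integral is replaced by the marginal probability $\Phi(-\mu_i'\gamma/\sigma_i)$, which is what remains after integrating over all values of the demand error. The function name and the precomputed-index data layout are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def loglik_linear(resp, nonresp, sigma, rho):
    """Log of likelihood (25), with linear indices precomputed by the caller.

    resp    : iterable of (q, xb, zg) for respondents, xb = x'beta, zg = z'gamma
    nonresp : iterable of (mug, quad) for non-respondents,
              mug = mu'gamma and quad = gamma' Sigma_i gamma
    """
    total = 0.0
    for q, xb, zg in resp:
        e = q - xb
        cond = (zg + rho * e / sigma) / math.sqrt(1 - rho**2)
        # Density of the demand error times Prob(u > -z'gamma | e).
        total += math.log(STD.pdf(e / sigma) / sigma) + math.log(STD.cdf(cond))
    for mug, quad in nonresp:
        sigma_i = math.sqrt(1 + quad)
        # Integrating g(e, w, Gamma_i) over all e leaves Prob(w <= -mu'gamma).
        total += math.log(STD.cdf(-mug / sigma_i))
    return total


# Hypothetical data: one respondent and one non-respondent.
print(loglik_linear(resp=[(2.0, 1.0, 0.5)], nonresp=[(0.5, 3.0)], sigma=2.0, rho=0.4))
```

In an estimation routine this function would be maximized over the parameters, with the indices recomputed from the data at each trial value.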

3.4 Generalization for closed-ended questionnaires

The above discussion focuses on the case where the dependent variable in
the demand function ($q_i$) is continuous. However, in many CV studies, the
demand responses are not continuous. For example, in many open-ended
questionnaires the demand responses are censored (e.g. a Tobit model). On the
other hand, surveys using referendum-type (closed-ended) questionnaires produce
dichotomized responses. In the following discussion, the demand equation in the
self-selection model with measurement errors is modified to allow for qualitative
and limited dependent variables. The following models present the case where
the demand equation is either a Tobit or a probit-related model.

3.4.1 A self-selection model with measurement errors and a Tobit demand equation

A Tobit demand equation is defined as:

\[
\begin{aligned}
& q_i^* = x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \sim \text{i.i.d. } N(0, \sigma^2), \\
& q_i = q_i^*, \quad \text{if } x_i'\beta + \varepsilon_i > 0, \\
& q_i = 0, \quad \text{otherwise.}
\end{aligned}
\tag{26}
\]

The observed demand is now $q_i$, which is left censored at 0. A self-selection
model with measurement errors and a Tobit demand equation is described by
equations (16), (17), (26), (19), (20), (22), and (23).

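For a respondent, if the observed demand equals 0, the likelihood is

\[
\int_{-\infty}^{-x_i'\beta} \int_{-z_i'\gamma}^{\infty} g(\varepsilon,\, u,\, \Omega)\, du\, d\varepsilon. \tag{27}
\]

If the observed demand for a respondent is $q_i > 0$, the likelihood is

\[
\int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du. \tag{28}
\]

For a non-respondent, the likelihood is

\[
\int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(\varepsilon,\, w,\, \Gamma_i)\, dw\, d\varepsilon. \tag{29}
\]

Based on equations (27), (28), and (29), the likelihood function for the
self-selection model with measurement errors and a Tobit demand equation is

\[
\begin{aligned}
L_T ={}& \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(\varepsilon,\, w,\, \Gamma_i)\, dw\, d\varepsilon \\
&\cdot \prod_{I_i = 1,\, q_i = 0} \int_{-\infty}^{-x_i'\beta} \int_{-z_i'\gamma}^{\infty} g(\varepsilon,\, u,\, \Omega)\, du\, d\varepsilon \\
&\cdot \prod_{I_i = 1,\, q_i > 0} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du.
\end{aligned}
\tag{30}
\]

In the likelihood function, $L_T$, the first term is the likelihood for non-respondents.
The second term is the likelihood function for those respondents whose $q_i = 0$.
The third term is the likelihood function for those respondents whose $q_i > 0$.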
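
The term that is new relative to the linear case is the double integral in equation (27), a bivariate normal rectangle probability. One way to evaluate it, sketched below, is to condition on the standardized demand error and integrate numerically over it. The function name, the integration settings, and the input values are assumptions made for this illustration.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def prob_zero_and_returned(xb, zg, sigma, rho, lower=-10.0, steps=4000):
    """Prob(eps <= -x'beta, u > -z'gamma), the double integral in (27).

    With t = eps/sigma, Prob(u > -z'gamma | t) = Phi((z'gamma + rho*t)/sqrt(1 - rho^2)),
    so the double integral reduces to a single integral over t, evaluated
    here by the trapezoid rule.
    """
    upper = -xb / sigma
    if upper <= lower:
        return 0.0
    s = math.sqrt(1 - rho**2)

    def f(t):
        return STD.pdf(t) * STD.cdf((zg + rho * t) / s)

    h = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper)) + sum(f(lower + k * h) for k in range(1, steps))
    return total * h


# Illustrative (assumed) values of x'beta, z'gamma, sigma, and rho.
print(prob_zero_and_returned(xb=1.0, zg=0.5, sigma=2.0, rho=0.0))
print(prob_zero_and_returned(xb=1.0, zg=0.5, sigma=2.0, rho=0.8))
```

With $\rho = 0$ the probability factors into $\Phi(-x_i'\beta/\sigma)\,\Phi(z_i'\gamma)$, which gives a simple check on the numerical routine.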

3.4.2 A self-selection model with measurement errors and a probit demand equation

In a referendum-type (closed-ended) questionnaire, a respondent is usually
asked to answer YES or NO with respect to a given referendum index.¹³ The
demand equation takes the form

 

13 For example, a respondent may face a question such as "To maintain the
current water quality in your neighborhood, you will have to pay extra $100 per
year. Are you willing to pay for it or not?" The $100 here is the referendum
index (price).

38

q‘ = x’B + e,, e, ~ i.i.d. N(O, 1),

q=l,ifx,'B+q>0, (31)

q = 0, otherwise,

where one of the x, elements is the referendum index. For respondents, the

probit demand equation is related to the self-selection equation (equation (16)) by

the assumption that (e,, u,) are distributed jointly as a bivariate normal

distribution with a density function

8(0. 0. 6). (32)
where
1 p
9 = (33)
p 1

is the variance-covariance matrix.
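For a respondent who answers YES with respect to the referendum index, $I_i = 1$ and $q_i = 1$, the likelihood is

$$\int_{-x_i'\beta}^{\infty} \int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de. \tag{34}$$

For a respondent who answers NO with respect to the referendum index, $I_i = 1$ and $q_i = 0$, the likelihood is

$$\int_{-\infty}^{-x_i'\beta} \int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de. \tag{35}$$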

For non-respondents, the relationship between the probit demand equation and the self-selection equation with measurement errors (equation (17)) can be derived where $(e_i, w_i)$ are distributed jointly as a bivariate normal distribution with density function

$$g(0, 0, \Lambda_i), \tag{36}$$

where

$$\Lambda_i = \begin{bmatrix} 1 & \rho \\ \rho & \sigma_{w_i}^2 \end{bmatrix} \tag{37}$$

and $\sigma_{w_i}^2 = 1 + \gamma'\Sigma_i\gamma$. The likelihood for a non-respondent is

$$\int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Lambda_i)\, dw\, de. \tag{38}$$
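Because the demand error is integrated over its whole range in the non-respondent likelihood, the double integral collapses to a univariate normal probability. The identity below is a standard property of the bivariate normal distribution rather than a result stated in this chapter; $\Phi$ denotes the standard normal distribution function, and $\mu_i$ and $\Sigma_i$ are the block mean vector and variance-covariance matrix used for non-respondent $i$.

$$
\int_{-\infty}^{\infty}\int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Lambda_i)\, dw\, de
= \Pr(w_i \le -\mu_i'\gamma)
= \Phi\!\left(\frac{-\mu_i'\gamma}{\sqrt{1 + \gamma'\Sigma_i\gamma}}\right).
$$

Under this reading, the non-respondent term carries no information about $\rho$ or the demand parameters; those are identified from the respondent terms only.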

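Based on equations (34), (35), and (38), the likelihood function for the self-selection model with measurement errors and a probit demand equation is

$$
\begin{aligned}
L_P ={}& \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Lambda_i)\, dw\, de \\
&\times \prod_{I_i=1,\; q_i=0} \int_{-\infty}^{-x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de \\
&\times \prod_{I_i=1,\; q_i=1} \int_{-x_i'\beta}^{\infty}\int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de.
\end{aligned}
\tag{39}
$$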

In the likelihood function, $L_P$, the first term is the likelihood for non-respondents. The second term is the likelihood function for those respondents who answered NO with respect to the referendum index. The third term is the likelihood function for those respondents who answered YES with respect to the referendum index.

 

An alternative to a probit demand equation is a censored probit inverse demand equation (Cameron and James, 1987; Cameron, 1988).14 Instead of modeling the probability of answering YES or NO with respect to a referendum index, a censored probit model treats the answer YES (NO) as if $q_i^*$ is greater than or equal to (less than) the referendum index. Thus, the true $q_i^*$ is censored at the referendum index. However, in terms of econometric estimation, a censored probit demand equation produces results comparable to those of a probit demand equation (McConnell, 1990).
For a self-selection model with measurement errors and a censored probit inverse demand equation, the censored probit inverse demand equation is defined as:

$$
\begin{aligned}
q_i^* &= x_i'\beta + e_i, \qquad e_i \sim \text{i.i.d. } N(0, \sigma^2),\\
q_i &= 1 \quad \text{if } x_i'\beta + e_i > p_i,\\
q_i &= 0 \quad \text{otherwise,}
\end{aligned}
\tag{40}
$$

where $p_i$ is the referendum index and $x_i$ no longer contains the referendum index. The self-selection model with measurement errors and a censored probit inverse demand equation is defined by equations (16), (17), (40), (19), (20), (22), and (23). It can be easily shown that the likelihood function for the self-selection model with measurement errors and a censored probit inverse demand equation is15

$$
\begin{aligned}
L_C ={}& \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Gamma_i)\, dw\, de \\
&\times \prod_{I_i=1,\; q_i=0} \int_{-\infty}^{p_i - x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(e, u, \Omega)\, du\, de \\
&\times \prod_{I_i=1,\; q_i=1} \int_{p_i - x_i'\beta}^{\infty}\int_{-z_i'\gamma}^{\infty} g(e, u, \Omega)\, du\, de.
\end{aligned}
\tag{41}
$$

14 If the demand equation is estimated by a probit model, the censored probit model estimates the inverse demand equation.

15 Unlike the case of a probit demand equation, in a censored probit (logit) inverse demand equation, $\sigma^2$ is identifiable.
In the likelihood function, $L_C$, the first term is the likelihood for non-respondents. The second term is the likelihood function for those respondents who answered NO with respect to the referendum index, $p_i$. The third term is the likelihood function for those respondents who answered YES with respect to the referendum index, $p_i$.16

3.5 Summary

Models derived in this chapter take the average characteristics from the non-respondents' neighborhoods and treat them as the non-respondents' characteristics, measured with error. Based on the measurement errors approach, the probit self-selection equation is modified and becomes a probit model with measurement errors. A self-selection model with measurement errors is constructed using the probit model with measurement errors and a linear demand equation.
16 A double-bounded censored logistic regression developed by Hoehn and Loomis (1993) can also be applied. Derivation of the likelihood function is straightforward.

CV studies use either open-ended or closed-ended questionnaires to collect data. For open-ended questionnaires, responses to demand are sometimes censored. For example, given a specific price, demand for a good may be left censored at zero. For closed-ended questionnaires, the responses are dichotomized (YES or NO). To account for these situations, the self-selection model with measurement errors is generalized to allow for a Tobit demand equation, a probit demand equation, or a censored probit inverse demand equation.
Based on the measurement errors approach, models derived in this chapter transform a truncated sample into a censored sample. By applying these models, it is expected that the disadvantages of estimation under a truncated sample are removed and the advantages of estimation under a censored sample are obtained; namely, reliable estimates of the parameters in both the self-selection and the demand equations. Furthermore, some gain in efficiency is expected.

 

CHAPTER 4
MONTE CARLO EXPERIMENTS AND RESULTS

In the previous chapter, self-selection models with measurement errors were developed with 1) a linear demand equation; 2) a Tobit demand equation; 3) a probit demand equation; and 4) a censored probit inverse demand equation.

Deviating from conventional measurement errors models, the variance-covariance matrix of the measurement errors was replaced by its consistent estimates. Two candidates were considered as replacements for the variance-covariance matrix. One candidate, $\Sigma$, was the variance-covariance matrix estimated from a sample drawn from the population, and was not available for each census block. The other candidate, $\Sigma_i$, was the variance-covariance matrix estimated from samples drawn from each non-respondent's census block.1

The purpose of this chapter is to use Monte Carlo experiments to examine and compare the resulting estimates from 1) a truncated sample without correcting for self-selection bias; 2) a self-selection model with a censored sample;2 3) a self-selection model with measurement errors that adopts $\mu_i$, the mean vector, and $\Sigma$; and 4) a self-selection model with measurement errors that adopts $\mu_i$ and $\Sigma_i$.3 Monte Carlo experiments are conducted for each type of demand equation except the censored probit inverse demand equation.4

1 A third type of variance-covariance matrix, diag($\Sigma_i$), which assumes zero covariance, was also tried. Although diag($\Sigma_i$) is very easy to obtain, it was abandoned for two reasons. First, the zero covariance assumption is not plausible. Second, under the model specified below, the ML estimator based on diag($\Sigma_i$) never converged during the optimization procedure.

2 Although it is nearly impossible to acquire a censored sample in reality, estimates from a censored sample give the best possible results and can be compared with the results from the measurement errors models proposed in this study.

This chapter begins with the data generation process. Steps for the Monte Carlo experiments are described and the resulting estimates are then reported. A comparison of the results is presented, followed by concluding remarks.
4.1 Data generation

Due to the properties of the proposed self-selection models with measurement errors, the data generation process is not straightforward. In each replication, in order to acquire useful information, data used in Monte Carlo experiments are generated in two steps. In the first step, a "population" is generated and certain required statistics are calculated. In the second step, a "sample" is drawn from the "population," and models are estimated based on the "sample."

3 Monte Carlo experiments for a self-selection model with a truncated sample are conducted only for a linear demand equation with $\rho = 0.75$.

4 Due to the similarity between a probit and a censored probit model, estimates from a censored probit model are omitted.

4.1.1 Population generation

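For each replication, a 10,000 x 5 matrix, $[x_1 \; x_2 \; x_3 \; u \; e]$, is first generated, where $[x_{i1} \; x_{i2} \; x_{i3} \; u_i \; e_i]$ is distributed as an i.i.d. multivariate normal distribution with a mean vector $[3 \;\; 1.5 \;\; 4 \;\; 0 \;\; 0]$ and a variance-covariance matrix5

$$
\mathrm{Cov}(x_{i1}, x_{i2}, x_{i3}, u_i, e_i) =
\begin{bmatrix}
1.44 & 0.24 & 0.096 & 0 & 0 \\
0.24 & 1 & 0.24 & 0 & 0 \\
0.096 & 0.24 & 0.64 & 0 & 0 \\
0 & 0 & 0 & 1 & \rho \\
0 & 0 & 0 & \rho & 1
\end{bmatrix},
$$

where $\rho$ = 0.25, 0.5, or 0.75.6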

Since one of the demand specifications and the self-selection equation are both probit equations, setting Var(u) = Var(e) = 1 simplifies comparison of parameter estimates.7

5 Corr($x_1$, $x_2$) = 0.2, Corr($x_1$, $x_3$) = 0.1, Corr($x_2$, $x_3$) = 0.3, Corr(u, e) = $\rho$, and $\rho$ = 0.25, 0.5, or 0.75.

6 Based on $\rho$ = 0.25, 0.5, and 0.75, three sequences of simulations are conducted for each of the models.

7 Recall that in a probit model, $\beta$ and $\sigma$ are not separately identifiable. The coefficients estimated are $\beta/\sigma$.

Dependent variables for both the self-selection equation ($I_i^*$) and the demand equation ($q_i^*$) are generated by

$$
I_i^* = 1.5 + 1\,x_{i1} - 3\,x_{i2} + u_i, \quad \text{and} \quad
q_i^* = 6 + 4\,x_{i2} - 3\,x_{i3} + e_i.
$$

In order for the model to be identifiable when both the demand and self-selection equations are of probit type, the two equations cannot have exactly the same independent variables.8
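The population step described in this section can be sketched as follows. This is a minimal illustration using only the Python standard library, not the GAUSS program from the appendices; the function names, the dictionary layout, and the seed handling are assumptions. The last two rows of the covariance matrix follow from Var(u) = Var(e) = 1 and Corr(u, e) = rho.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def generate_population(rho, n=10_000, seed=0):
    """Draw n rows of (x1, x2, x3, u, e) and build the latent I* and q*."""
    rng = random.Random(seed)
    mean = [3.0, 1.5, 4.0, 0.0, 0.0]
    cov = [
        [1.44, 0.24, 0.096, 0.0, 0.0],
        [0.24, 1.00, 0.24, 0.0, 0.0],
        [0.096, 0.24, 0.64, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0, rho],
        [0.0, 0.0, 0.0, rho, 1.0],
    ]
    low = cholesky(cov)
    population = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(5)]
        # Correlated draw: mean + L z, where L L' equals the covariance matrix.
        x1, x2, x3, u, e = (
            mean[i] + sum(low[i][k] * z[k] for k in range(i + 1))
            for i in range(5)
        )
        population.append({
            "I_star": 1.5 + 1.0 * x1 - 3.0 * x2 + u,   # self-selection index
            "q_star": 6.0 + 4.0 * x2 - 3.0 * x3 + e,   # latent demand
            "x1": x1, "x2": x2, "x3": x3, "u": u, "e": e,
        })
    return population
```

Calling `generate_population(0.75)` would give one replication's population for the highest correlation setting.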

Based on the process described above, a sample $[I^* \; q^* \; x_1 \; x_2 \; x_3 \; u \; e]$, which contains 10,000 observations and 7 variables, is generated and treated as the "population" in a replication.

To apply the self-selection models with measurement errors, certain statistics related to the distribution of $x_1$ and $x_2$ are required (i.e. the mean vector and variance-covariance matrix). To obtain the necessary statistics, a random sample containing 200 observations is drawn from the "population" and the variance-covariance matrix ($\Sigma$) of $x_1$ and $x_2$ is calculated. The next step is to randomly group the "population" into 250 "blocks" with 40 observations in each block. For each block, the mean vector ($\mu_i$) and the variance-covariance matrix ($\Sigma_i$) of $x_1$ and $x_2$ are calculated.
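A minimal sketch of this step is given below, assuming the population is held as a list of (x1, x2) pairs. The function names and the use of the unbiased (n - 1) covariance divisor are assumptions; the text does not say which divisor was used.

```python
import random


def mean_and_cov(rows):
    """Mean vector and unbiased sample covariance matrix of (x1, x2) pairs."""
    n = len(rows)
    m = [sum(r[j] for r in rows) / n for j in range(2)]
    c = [
        [sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows) / (n - 1)
         for b in range(2)]
        for a in range(2)
    ]
    return m, c


def block_statistics(xs, n_blocks=250, block_size=40, pilot_size=200, seed=0):
    """Return the pilot-sample covariance and per-block mean/covariance."""
    rng = random.Random(seed)
    # Population-level candidate: covariance from a random sample of 200.
    _, sigma = mean_and_cov(rng.sample(xs, pilot_size))
    # Block-level candidate: random partition into 250 blocks of 40.
    order = list(range(len(xs)))
    rng.shuffle(order)
    blocks = []
    for b in range(n_blocks):
        members = order[b * block_size:(b + 1) * block_size]
        mu_b, sigma_b = mean_and_cov([xs[k] for k in members])
        blocks.append({"members": members, "mu": mu_b, "sigma": sigma_b})
    return sigma, blocks
```

Keeping the member indices with each block makes it possible to look up the block mean and covariance for any non-respondent drawn later.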

4.1.2 Sample generation

In each replication, a random sample consisting of 1,000 observations is drawn from the population. In the random sample, observations with $I_i^* > 0$ ($I_i^* \le 0$) are treated as respondents (non-respondents). Since the mean of $I_i^*$ is zero, a response rate roughly equaling 50% (500 respondents) is expected.
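The zero mean can be checked directly from the design values given earlier in this section (intercept 1.5, coefficients 1 and -3, and means 3 and 1.5 for the two self-selection regressors). The short calculation below is an added illustration, and the binomial approximation in the second line ignores the fact that the sample is drawn without replacement from a finite population.

$$
\begin{aligned}
E[I_i^*] &= 1.5 + 1(3) - 3(1.5) + E[u_i] = 0, \qquad \Pr(I_i^* > 0) = 0.5 \text{ by symmetry of the normal distribution},\\
E[\text{respondents}] &\approx 1000 \times 0.5 = 500, \qquad \text{s.d.} \approx \sqrt{1000 \times 0.5 \times 0.5} \approx 15.8 .
\end{aligned}
$$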

 

8 An alternative is to have Corr(u, e) = 0. However, if this is the case, self-selection does not exist.

In each replication, four models are estimated: 1) without correcting for self-selection bias, a demand equation is estimated based on the truncated sample, i.e. the number of observations is about 500; 2) both a demand and a self-selection equation are estimated based on a censored sample with 1,000 observations, i.e. for non-respondents, $x_1$ and $x_2$ are observable; 3) both a demand and a self-selection equation are estimated using a self-selection model with measurement errors, and for non-respondents, because $x_1$ and $x_2$ are unobserved, $\mu_i$ and $\Sigma$ are used (i.e. the number of observations is 1,000); and 4) both a demand and a self-selection equation are estimated using a self-selection model with measurement errors, and for non-respondents, because $x_1$ and $x_2$ are unobserved, $\mu_i$ and $\Sigma_i$ are used (i.e. the number of observations is 1,000).

As previously mentioned, three types of demand equations are used in the analysis. For a linear demand equation, $q_i^*$ is used as the dependent variable. If the demand equation is a Tobit equation, $q_i^*$ is left censored at 0 ($q_i = q_i^*$ if $q_i^* > 0$; 0, otherwise). Finally, for a probit demand equation, $q_i^*$ is dichotomized ($q_i = 1$ if $q_i^* > 0$; 0, otherwise).
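The three mappings from the latent demand to the observed dependent variable can be written as one small function. The name `observed_demand` and the string labels are hypothetical and used only for this illustration.

```python
def observed_demand(q_star, kind):
    """Map latent demand q* to the dependent variable used in estimation."""
    if kind == "linear":
        return q_star                         # latent demand observed directly
    if kind == "tobit":
        return q_star if q_star > 0 else 0.0  # left censored at zero
    if kind == "probit":
        return 1 if q_star > 0 else 0         # dichotomized
    raise ValueError(f"unknown demand specification: {kind!r}")
```

For respondents this function is applied to each latent demand value; for non-respondents the dependent variable is not observed at all.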
4.1.3 Monte Carlo experiments

Based on different demand specifications, three types of simulations related to a linear, a Tobit, and a probit demand equation are conducted. For each type of demand specification, three sequences of simulations are conducted based on different values of the correlation between self-selection and demand ($\rho$ = 0.25, 0.5, and 0.75). At each replication, four models are estimated, and the number of replications is 500.
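The results reported later in this chapter summarize the 500 replications with statistics such as %BIAS, RMSE, and ASE, whose exact definitions are given in Appendix B. The sketch below assumes the usual Monte Carlo definitions (percentage bias relative to the true value, root mean squared error around the true value, and the average of the estimated standard errors); if Appendix B differs, the formulas should be adjusted accordingly. The function name and the returned keys are hypothetical.

```python
import math


def summarize(estimates, std_errors, true_value):
    """Summarize one parameter across Monte Carlo replications.

    Assumed definitions (not taken from Appendix B):
      pct_bias = 100 * |mean(estimates) - true| / |true|
      rmse     = sqrt(mean((estimate - true)^2))
      ase      = mean of the estimated standard errors
    """
    r = len(estimates)
    bias = sum(estimates) / r - true_value
    return {
        "bias": bias,
        "pct_bias": 100.0 * abs(bias) / abs(true_value),
        "rmse": math.sqrt(sum((b - true_value) ** 2 for b in estimates) / r),
        "ase": sum(std_errors) / r,
    }
```

Under these definitions, an RMSE far from the ASE suggests that the estimated variance-covariance matrix does not reflect the actual sampling variability of the estimator.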


4.2 A linear demand equation with self-selection

A self-selection model with measurement errors and a linear demand equation is derived in Chapter 3 (Section 3.3). Based on the different correlation measures between self-selection and demand ($\rho$ = 0.25, 0.5, and 0.75), the following section begins with OLS estimates from a truncated sample without correcting for self-selection, and results are presented in Appendix C9 (Tables C.1.1.A ($\rho$ = 0.25), C.1.2.A ($\rho$ = 0.5), and C.1.3.A ($\rho$ = 0.75)).
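Using a censored sample, estimates for a linear demand equation with self-selection are obtained by the ML estimator based on the likelihood function

$$
L_{L1} = \prod_{I_i=1} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta, u, \Omega)\, du
\times \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-z_i'\gamma} g(e, u, \Omega)\, du\, de,
$$

where $g(\cdot,\cdot,\cdot)$ represents a bivariate normal density function and

$$
\Omega = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{bmatrix}
$$

is the variance-covariance matrix. Results of this model are listed in Tables C.1.1.B ($\rho$ = 0.25), C.1.2.B ($\rho$ = 0.5), and C.1.3.B ($\rho$ = 0.75).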

For a truncated sample and $\rho$ = 0.75, estimates for a linear demand equation with self-selection are obtained by the ML estimator based on the likelihood function

$$
L_{L} = \prod_{I_i=1} \frac{\int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta, u, \Omega)\, du}{\Phi(z_i'\gamma)},
$$

and results are listed in Table C.1.3.B.

9 Notation used in Appendix C is defined in Appendix B.

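Estimates from a self-selection model with measurement errors and a linear demand equation are obtained by the ML estimator based on ($\mu_i$, $\Sigma$) and the likelihood function

$$
L_{L2} = \prod_{I_i=1} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta, u, \Omega)\, du
\times \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Gamma)\, dw\, de,
$$

where

$$
\Gamma = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & (1 + \gamma'\Sigma\gamma) \end{bmatrix},
$$

and results are presented in Tables C.1.1.C ($\rho$ = 0.25), C.1.2.C ($\rho$ = 0.5), and C.1.3.C ($\rho$ = 0.75).

Finally, if $\Sigma$ ($\Gamma$) is replaced by $\Sigma_i$ ($\Gamma_i$), i.e.

$$
\Gamma_i = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & (1 + \gamma'\Sigma_i\gamma) \end{bmatrix},
$$

the resulting estimates are shown in Tables C.1.1.D ($\rho$ = 0.25), C.1.2.D ($\rho$ = 0.5), and C.1.3.D ($\rho$ = 0.75).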
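To give a sense of magnitude, the variance of the non-respondent error can be computed at the true design values of Section 4.1. The sketch below is illustrative; the function name is hypothetical, and it evaluates the expression at the true slope coefficients (1 and -3) and the population covariance of the two self-selection regressors rather than at estimates. The intercept is excluded because it is not measured with error.

```python
def selection_error_variance(gamma, sigma):
    """Return 1 + gamma' Sigma gamma for slope vector gamma and covariance sigma."""
    k = len(gamma)
    quad = sum(gamma[a] * sigma[a][b] * gamma[b]
               for a in range(k) for b in range(k))
    return 1.0 + quad


# True design values from Section 4.1 (slopes on x1 and x2 only).
GAMMA_TRUE = [1.0, -3.0]
SIGMA_TRUE = [[1.44, 0.24], [0.24, 1.00]]
VAR_W = selection_error_variance(GAMMA_TRUE, SIGMA_TRUE)
```

At these values the quadratic form is 1.44 - 1.44 + 9 = 9, so the non-respondent error variance is about ten times the unit variance of the respondents' self-selection error.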

4.2.1 Monte Carlo experiment results from a self-selection model with measurement errors and a linear demand equation10

Based on Tables C.1.1.A, C.1.2.A, and C.1.3.A, when the demand equation is estimated by applying OLS to a single equation without correcting for self-selection bias, both %BIAS and $D_{(\beta,\sigma^2);S}$ increase as $\rho$ increases. This implies that the higher the $\rho$, the farther the OLS results deviate from the true parameter values. For example, as $\rho$ increased from 0.25 to 0.75, the %BIAS of $\beta_1$ ($\sigma^2$) increased from 2.19% (1.32%) to 6.45% (7.73%). In addition, RMSE and ASE are very different for $\beta_1$, indicating incorrect estimates of the variance-covariance matrix.
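The dependence of the single-equation bias on $\rho$ is what the standard bivariate normal selection result predicts. The expression below is that textbook result, added here as an illustration rather than taken from this chapter; $\phi$ and $\Phi$ denote the standard normal density and distribution functions, and $z_i'\gamma$ is the self-selection index.

$$
E[q_i \mid x_i, z_i, I_i = 1] = x_i'\beta + \rho\,\sigma\,\frac{\phi(z_i'\gamma)}{\Phi(z_i'\gamma)} .
$$

A single-equation estimator omits the second term. Its size is proportional to $\rho\sigma$, and it is correlated with the demand regressors whenever a variable appears in both equations, which is consistent with %BIAS rising as $\rho$ moves from 0.25 to 0.75.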

When a censored sample is available and the self-selection model is correctly specified, $\sigma^2$, $\rho$, self-selection, and demand parameters are well-estimated by the ML estimator. As can be seen from Tables C.1.1.B, C.1.2.B, and C.1.3.B, the %BIAS among demand (self-selection) parameters ranged from 0.01% (0.08%) to 0.28% (1.20%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.32% (0.28%) to 0.62% (1.08%). $D_{(\beta,\sigma^2);CEN}$ was always smaller than $D_{(\beta,\sigma^2);S}$ and all the $D_{(\cdot);CEN}$'s remained very close to zero.

For a truncated sample, the ML estimator produces results different from those of Muthén and Jöreskog (1983). Table C.1.3.B shows that bias is not a major problem for $\sigma^2$, $\rho$, or the self-selection and demand parameters, even with $\rho$ = 0.75. The real problem appears to be the difference between RMSE and ASE, which indicates that the variance-covariance matrix produced by the ML estimator is incorrect and cannot be used to test hypotheses. Failure to conduct hypothesis testing may result in model misspecification and lead to inconsistent parameter estimates.

10 A GAUSS program for conducting the Monte Carlo experiments is provided in Appendix D.

When the measurement errors model based on $\mu_i$ and $\Sigma$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.02% (1.53%) to 0.29% (3.38%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.28% (0.53%) to 0.80% (0.88%) as shown in Tables C.1.1.C, C.1.2.C, and C.1.3.C. $D_{(\beta,\sigma^2);ME1}$ was always smaller than $D_{(\beta,\sigma^2);S}$ and all the $D_{(\cdot);ME1}$'s remained very close to zero.

When the measurement errors model based on $\mu_i$ and $\Sigma_i$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.01% (0.15%) to 0.28% (2.43%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.22% (0.40%) to 0.68% (0.82%) as shown in Tables C.1.1.D, C.1.2.D, and C.1.3.D. $D_{(\beta,\sigma^2);ME2}$ was always smaller than $D_{(\beta,\sigma^2);S}$ and all the $D_{(\cdot);ME2}$'s remained very close to zero.

Comparing results from the two measurement errors models, the only difference is that the self-selection parameters always have smaller %BIAS when $\Sigma_i$ is used. Apart from this, the two models are difficult to tell apart.

Comparing results from the two measurement errors models with results from the censored sample, all three models give similar estimates for the demand parameters according to $D_{(\beta,\sigma^2);\cdot}$. However, according to $D_{(\gamma,\rho);\cdot}$, self-selection parameters estimated by the two measurement errors models are less efficient than the estimates from the censored sample.
4.3 A Tobit demand equation with self-selection

A self-selection model with measurement errors and a Tobit demand equation is derived in Chapter 3 (Section 3.4.1). Based on the different correlation measures between self-selection and demand ($\rho$ = 0.25, 0.5, and 0.75), the following section begins with Tobit ML estimates from a truncated sample without correcting for self-selection, and results are presented in Appendix C (Tables C.2.1.A ($\rho$ = 0.25), C.2.2.A ($\rho$ = 0.5), and C.2.3.A ($\rho$ = 0.75)).
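Using a censored sample, estimates for a Tobit demand equation with self-selection are obtained by the ML estimator based on the likelihood function

$$
\begin{aligned}
L_{T1} ={}& \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-z_i'\gamma} g(e, u, \Omega)\, du\, de \\
&\times \prod_{I_i=1,\; q_i=0} \int_{-\infty}^{-x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(e, u, \Omega)\, du\, de \\
&\times \prod_{I_i=1,\; q_i>0} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta, u, \Omega)\, du,
\end{aligned}
$$

where $g(\cdot,\cdot,\cdot)$ is a bivariate normal density function and

$$
\Omega = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{bmatrix}
$$

is the variance-covariance matrix. Results of this model are listed in Tables C.2.1.B ($\rho$ = 0.25), C.2.2.B ($\rho$ = 0.5), and C.2.3.B ($\rho$ = 0.75).11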

Estimates from a self-selection model with measurement errors and a Tobit demand equation are obtained by the ML estimator based on ($\mu_i$, $\Sigma$) and the likelihood function

$$
\begin{aligned}
L_{T2} ={}& \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Gamma)\, dw\, de \\
&\times \prod_{I_i=1,\; q_i=0} \int_{-\infty}^{-x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(e, u, \Omega)\, du\, de \\
&\times \prod_{I_i=1,\; q_i>0} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta, u, \Omega)\, du,
\end{aligned}
$$

where

$$
\Gamma = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & (1 + \gamma'\Sigma\gamma) \end{bmatrix},
$$

and results are presented in Tables C.2.1.C ($\rho$ = 0.25), C.2.2.C ($\rho$ = 0.5), and C.2.3.C ($\rho$ = 0.75).

Finally, the results of replacing $\Sigma$ ($\Gamma$) by $\Sigma_i$ ($\Gamma_i$) are shown in Tables C.2.1.D ($\rho$ = 0.25), C.2.2.D ($\rho$ = 0.5), and C.2.3.D ($\rho$ = 0.75).

11 A Tobit self-selection model based on a truncated sample is dropped from the Monte Carlo experiments due to the difficulty in obtaining the starting values. The ML estimator for a Tobit self-selection model based on a truncated sample is very sensitive to the starting values. Very often, the optimization procedure cannot converge even with the true parameter values as the starting values.

4.3.1 Monte Carlo experiment results from a self-selection model with measurement errors and a Tobit demand equation12

When a single equation Tobit model is applied to estimate the demand equation without correcting for self-selection bias, as in the case of a linear demand equation, both %BIAS and $D_{(\beta,\sigma^2);S}$ increase with $\rho$ as shown in Tables C.2.1.A, C.2.2.A, and C.2.3.A. This again implies that the higher the $\rho$, the farther the estimates from a single equation Tobit model deviate from the true parameter values. For example, as $\rho$ increased from 0.25 to 0.75, the %BIAS of $\beta_1$ ($\sigma^2$) increased from 3.84% (2.26%) to 11.09% (12.33%). In addition, RMSE and ASE are very different for $\beta_1$, indicating incorrect estimates of the variance-covariance matrix.

12 A GAUSS program for conducting the Monte Carlo experiments is provided in Appendix E.

When a censored sample is available and the self-selection model is correctly specified, $\sigma^2$, $\rho$, self-selection, and demand parameters are well-estimated by the ML estimator. As can be seen from Tables C.2.1.B, C.2.2.B, and C.2.3.B, the %BIAS among demand (self-selection) parameters ranged from 0.09% (0.07%) to 0.30% (2.19%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.12% (1.39%) to 1.46% (3.28%). $D_{(\beta,\sigma^2);CEN}$ was always smaller than $D_{(\beta,\sigma^2);S}$ and all the $D_{(\cdot);CEN}$'s remained very close to zero.

When the measurement errors model based on $\mu_i$ and $\Sigma$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.08% (1.75%) to 0.35% (4.68%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.18% (1.32%) to 1.42% (2.92%) as shown in Tables C.2.1.C, C.2.2.C, and C.2.3.C. $D_{(\beta,\sigma^2);ME1}$ was always smaller than $D_{(\beta,\sigma^2);S}$ and all the $D_{(\cdot);ME1}$'s remained very close to zero.

When the measurement errors model based on $\mu_i$ and $\Sigma_i$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.07% (1.23%) to 0.36% (3.71%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.16% (1.37%) to 1.38% (3.06%) as can be seen in Tables C.2.1.D, C.2.2.D, and C.2.3.D. $D_{(\beta,\sigma^2);ME2}$ was always smaller than $D_{(\beta,\sigma^2);S}$ and all the $D_{(\cdot);ME2}$'s remained very close to zero.

Comparing results from the two measurement errors models, the only difference is that the self-selection parameters always have smaller %BIAS when $\Sigma_i$ is used. Apart from this, the two models are difficult to tell apart.

Comparing results from the two measurement errors models with results from the censored sample, all three models give similar estimates for the demand parameters according to $D_{(\beta,\sigma^2);\cdot}$. However, according to $D_{(\gamma,\rho);\cdot}$, self-selection parameters estimated by the two measurement errors models are less efficient than the estimates from the censored sample.
4.4 A probit demand equation with self-selection

A self-selection model with measurement errors and a probit demand equation is derived in Chapter 3 (Section 3.4.2). Based on the different correlation measures between self-selection and demand ($\rho$ = 0.25, 0.5, and 0.75), the following section begins with probit ML estimates from a truncated sample without correcting for self-selection, and results are presented in Appendix C (Tables C.3.1.A ($\rho$ = 0.25), C.3.2.A ($\rho$ = 0.5), and C.3.3.A ($\rho$ = 0.75)).

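Using a censored sample, estimates for a probit demand equation with self-selection are obtained by the ML estimator based on the likelihood function

$$
\begin{aligned}
L_{P1} ={}& \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-z_i'\gamma} g(e, u, \Theta)\, du\, de \\
&\times \prod_{I_i=1,\; q_i=0} \int_{-\infty}^{-x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de \\
&\times \prod_{I_i=1,\; q_i=1} \int_{-x_i'\beta}^{\infty}\int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de,
\end{aligned}
$$

where

$$
\Theta = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}
$$

is the variance-covariance matrix. Results of this model are listed in Tables C.3.1.B ($\rho$ = 0.25), C.3.2.B ($\rho$ = 0.5), and C.3.3.B ($\rho$ = 0.75).13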
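Estimates from a self-selection model with measurement errors and a probit demand equation are obtained by the ML estimator based on ($\mu_i$, $\Sigma$) and the likelihood function

$$
\begin{aligned}
L_{P2} ={}& \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Lambda)\, dw\, de \\
&\times \prod_{I_i=1,\; q_i=0} \int_{-\infty}^{-x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de \\
&\times \prod_{I_i=1,\; q_i=1} \int_{-x_i'\beta}^{\infty}\int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de,
\end{aligned}
$$

where

$$
\Lambda = \begin{bmatrix} 1 & \rho \\ \rho & (1 + \gamma'\Sigma\gamma) \end{bmatrix},
$$

and results are presented in Tables C.3.1.C ($\rho$ = 0.25), C.3.2.C ($\rho$ = 0.5), and C.3.3.C ($\rho$ = 0.75).

Finally, if $\Sigma$ ($\Lambda$) is replaced by $\Sigma_i$ ($\Lambda_i$), i.e.

$$
\Lambda_i = \begin{bmatrix} 1 & \rho \\ \rho & (1 + \gamma'\Sigma_i\gamma) \end{bmatrix},
$$

the resulting estimates are shown in Tables C.3.1.D ($\rho$ = 0.25), C.3.2.D ($\rho$ = 0.5), and C.3.3.D ($\rho$ = 0.75).

13 A probit self-selection model based on a truncated sample is dropped from the Monte Carlo experiments due to the difficulty in obtaining the starting values. The ML estimator for a probit self-selection model based on a truncated sample is very sensitive to the starting values. Very often, the optimization procedure cannot converge even with the true parameter values as the starting values.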

4.4.1 Monte Carlo experiment results from a self-selection model with measurement errors and a probit demand equation14

When a single equation probit model is applied to estimate the demand equation without correcting for self-selection bias, as in the case of a linear demand equation, both %BIAS and $D_{\beta;S}$ increase with $\rho$ as presented in Tables C.3.1.A, C.3.2.A, and C.3.3.A. This again implies that the higher the $\rho$, the farther the estimates from a single equation probit model deviate from the true parameter values. For example, as $\rho$ increased from 0.25 to 0.75, the %BIAS of $\beta_1$ increased from 7.43% to 18.35%. In addition, RMSE and ASE are very different for $\beta_1$, indicating incorrect estimates of the variance-covariance matrix.

When a censored sample is available and the self-selection model is correctly specified, $\rho$, self-selection, and demand parameters are well-estimated by the ML estimator. As can be seen from Tables C.3.1.B, C.3.2.B, and C.3.3.B, the %BIAS among demand (self-selection) parameters ranged from 2.63% (0.45%) to 4.63% (2.19%); for $\sigma^2$ ($\rho$), %BIAS ranged from 2.97% (1.08%) to 3.60% (10.28%). $D_{\beta;CEN}$ was smaller than $D_{\beta;S}$ when $\rho$ = 0.5 and 0.75, and all the $D_{(\cdot);CEN}$'s remained very close to zero.
When the measurement errors model based on $\mu_i$ and $\Sigma$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.04% (3.46%) to 4.84% (7.43%); for $\sigma^2$ ($\rho$), %BIAS ranged from 3.04% (1.36%) to 3.74% (9.71%) as shown in Tables C.3.1.C, C.3.2.C, and C.3.3.C. $D_{\beta;ME1}$ was smaller than $D_{\beta;S}$ when $\rho$ = 0.5 and 0.75, and all the $D_{(\cdot);ME1}$'s remained very close to zero.

14 A GAUSS program for conducting the Monte Carlo experiments is provided in Appendix F.

When the measurement errors model based on $\mu_i$ and $\Sigma_i$ was applied, the %BIAS among demand (self-selection) parameters ranged from 2.56% (2.26%) to 4.69% (4.97%); for $\sigma^2$ ($\rho$), %BIAS ranged from 3.00% (1.16%) to 3.65% (9.99%) as presented in Tables C.3.1.D, C.3.2.D, and C.3.3.D. $D_{\beta;ME2}$ was smaller than $D_{\beta;S}$ when $\rho$ = 0.5 and 0.75, and all the $D_{(\cdot);ME2}$'s remained very close to zero.

Comparing results from the two measurement errors models, the only difference is that the self-selection parameters always have smaller %BIAS when $\Sigma_i$ is used. Apart from this, the two models are difficult to tell apart.

Comparing results from the two measurement errors models with results from the censored sample, all three models give similar estimates for the demand parameters according to $D_{\beta;\cdot}$. However, according to $D_{(\gamma,\rho);\cdot}$, self-selection parameters estimated by the two measurement errors models are less efficient than the estimates from the censored sample.

One important issue is the estimate of $\rho$. For all three self-selection models, as the true value of $\rho$ increases, the %BIAS for the estimate increases rapidly. However, the BIAS of the estimates of $\rho$ is never statistically different from zero.
4.5 General results from the Monte Carlo experiments

Results from the single equation simulation show that in the presence of self-selection ($\rho \ne 0$), %BIAS increases as $\rho$ increases when a single equation is used to estimate the demand equation. This indicates bias caused by the self-selection behavior.

For a truncated sample, the ML estimator produces results different from those of Muthén and Jöreskog (1983). Instead of bias, the real problem appears to be that the variance-covariance matrix produced by the ML estimator is incorrect and cannot be used to test hypotheses. Failure to conduct hypothesis testing may result in model misspecification and may lead to inconsistent parameter estimates.

When a censored sample is available and the self-selection model is correctly specified, $\sigma^2$, self-selection, and demand parameters are well-estimated by the ML estimator. For the parameter $\rho$, the ML estimator leads to acceptable results; however, the estimates are not as accurate as other parameter estimates, especially in the case of a probit demand equation with self-selection.

In terms of efficiency among different estimators, $D_{(\beta,\sigma^2);CEN}$ being very close to $D_{(\beta,\sigma^2);ME1}$ and $D_{(\beta,\sigma^2);ME2}$15 indicates that the model which uses a censored sample and the two measurement errors models all lead to very similar estimates of demand parameters and $\sigma^2$. For the self-selection parameters and $\rho$, $D_{(\gamma,\rho);ME1} > D_{(\gamma,\rho);ME2} > D_{(\gamma,\rho);CEN}$ indicates that the model which uses a censored sample performs the best and the measurement errors model that uses $\mu_i$ and $\Sigma_i$ performs somewhat better than the model that uses $\mu_i$ and $\Sigma$. In the overall performance, it is no surprise that the model which uses a censored sample has the smallest value of $D_{(\gamma,\beta,\sigma^2,\rho);CEN}$ and performs the best. Even though the value of $D_{(\gamma,\beta,\sigma^2,\rho);ME1}$ is slightly greater than that of $D_{(\gamma,\beta,\sigma^2,\rho);ME2}$, the two measurement errors models are not very different from each other.

There is a problem common to the case of the self-selection model with a probit demand equation. The model that uses a censored sample or either measurement errors model results in an estimate of $\rho$ that is not as accurate as other parameters, especially when the true value of $\rho$ is high. However, the estimate remains statistically acceptable.

15 For the probit demand equation case, they are $D_{\beta;CEN}$, $D_{\beta;ME1}$, and $D_{\beta;ME2}$, respectively. In the following discussion, ($\beta$, $\sigma^2$) is used to represent ($\beta$) in the probit demand case as well as ($\beta$, $\sigma^2$) in other cases.

4.6 Summary

In this chapter, Monte Carlo experiments are conducted to examine and to compare the resulting estimates from 1) a truncated sample without correcting for self-selection bias; 2) a self-selection model with a censored sample; 3) a self-selection model with measurement errors that adopts $\mu_i$, the mean vector, and $\Sigma$, the corresponding variance-covariance matrix estimated from a sample drawn from the population; and 4) a self-selection model with measurement errors that adopts $\mu_i$ and $\Sigma_i$, the corresponding variance-covariance matrix estimated from samples drawn from each non-respondent's census block. Three sequences of Monte Carlo experiments are conducted based on a linear, a Tobit, and a probit demand equation. For each sequence of Monte Carlo experiments, based on $\rho$ = 0.25, 0.5, and 0.75, three 500-replication simulations are executed.

Results from the Monte Carlo experiments show that the ML estimator from the model which uses a censored sample performs the best. Of the two measurement errors models, the model that uses $\mu_i$ and $\Sigma_i$ estimates the self-selection parameters more accurately than the model that uses $\mu_i$ and $\Sigma$.

In reality, a censored sample is almost impossible to obtain. However, using the measurement errors models derived in this study, a truncated sample can be transformed into a censored sample, and self-selection models with measurement errors can then be estimated by ML estimators. Results from Monte Carlo experiments show that the estimates from the self-selection models with measurement errors perform very well. According to the Monte Carlo experiment results, when a correctly specified self-selection model with measurement errors is adopted, estimates of the demand parameters are as accurate as the estimates from a model that uses a censored sample, and the estimates of the self-selection parameters are very close to the true parameter values.

The results carry an encouraging message: adoption of a self-selection model with measurement errors will not contaminate the original truncated sample. Compared to the estimates from a model with a truncated sample, the self-selection models with measurement errors not only improve the efficiency of the estimates but also lead to reliable estimates of the self-selection parameters.

CHAPTER 5
CONCLUDING REMARKS

In CV studies, when surveys are used for collecting data, non-response will usually create problems. In analyzing survey data, two types of possible biases can be created by non-response. The first is sample non-response bias, which occurs when the sample distribution of some socio-economic or demographic characteristics is significantly different from that of the population. The second is self-selection bias, which occurs when the non-response is non-random, i.e. the reasons for non-response are endogenous to the survey study.

In CV studies, although self-selection is usually ignored in empirical work,
it is recognized by researchers as an important issue. In this study, methods that
combine individual survey data with census data to correct for self-selection bias
are proposed, and Monte Carlo experiments provide promising results.

5.1 Summary

In Chapter 1, the consequences of self-selection are reported and
self-selection bias is distinguished from sample non-response bias.
When regression is used to analyze survey data, it is shown that
self-selection causes inconsistent parameter estimates and that sample non-response
bias does not even play a role. It is also shown that there is no direct relationship
between sample non-response bias and self-selection bias. Instead of ignoring
self-selection bias in empirical work, it is suggested that CV researchers treat
self-selection as a serious issue.

Following an example in labor economics, the concept of self-selection in
CV is introduced in Chapter 2. It is shown that a complete self-selection
model consists of two equations. The first is a self-selection equation, which is
essentially a probit equation, and the second is a demand equation. Several
estimators that simultaneously estimate the self-selection equation and the
demand equation are reviewed. However, because CV survey data form a truncated
sample, evidence shows that the self-selection equation parameters cannot be
estimated reliably by existing estimators. It is this deficiency of existing
estimators that motivates this study.

In Chapter 3, a self-selection model under a CV framework is derived and
new ML estimators are proposed. According to a random utility model, a self-
selection equation can be expressed by a probit model with income as one of the
important explanatory variables. A self-selection model is completely described
by a self-selection probit equation and a demand equation which is correlated
with the self-selection equation. Since a CV data set is usually a truncated sample
where the only information available for a non-respondent is the address, a self-
selection model with measurement errors is derived by combining the CV
truncated sample with census data which provides information for non-
respondents’ neighborhoods (e.g. census blocks). Based on the self-selection
model with measurement errors, two ML estimators are then proposed. Finally,
the self-selection model with measurement errors is extended to allow for a
demand equation with qualitative or limited dependent variables.

It is found in Chapter 4 that, for a truncated sample, the ML estimator
produces results different from those of Muthén and Jöreskog (1983). Bias
is not a major problem even with $\rho$ = 0.75. The real problem appears to be the
difference between RMSE and ASE, which indicates that the variance-covariance
matrix produced by the ML estimator is incorrect and cannot be used in testing
hypotheses. If the self-selection equation cannot be correctly specified, all of the
$\sigma^2$, $\rho$, self-selection, and demand parameters may be estimated inconsistently.

The main purpose of Chapter 4 is to use Monte Carlo experiments to
compare the parameter estimates from the two ML estimators for the
self-selection models with measurement errors proposed in Chapter 3. For the
first estimator, a sample drawn from the population is used to calculate the
variance-covariance matrix ($\Sigma$) for non-respondents’ explanatory variables in the
self-selection equation; for the second estimator, the variance-covariance matrix
for non-respondents’ explanatory variables is calculated using samples drawn from
each non-respondent’s census block ($\Sigma_i$). Monte Carlo results show that both
ML estimators give very accurate estimates for all of the self-selection and
demand parameters. However, in terms of efficiency, the estimator using $\Sigma_i$
performs somewhat better than the estimator using $\Sigma$.

Although the ML estimator using $\Sigma_i$ performs only slightly better than the
alternative estimator using $\Sigma$, it is the estimator that is recommended. Consider a
case where census blocks are heterogeneous ($\Sigma_i \neq \Sigma_j$, $i \neq j$). In this case, $\Sigma$ is no
longer a consistent estimate of $\Sigma_i$, and the resulting estimator using $\Sigma$ does not
lead to consistent parameter estimates. Although $\Sigma$ is easier to obtain and
performs similarly to $\Sigma_i$, a stronger assumption is needed to assure the consistency
of parameter estimates.

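
The heterogeneity argument can be illustrated with a small numerical sketch. It is not taken from the study: the two census blocks, their names, and the income values are hypothetical, and a single variable stands in for the full vector of explanatory variables. The sketch only shows that one pooled variance (the role played by $\Sigma$) can differ sharply from the block-specific variances (the role played by $\Sigma_i$) when blocks are heterogeneous.

```python
import statistics

# Hypothetical household incomes (in $1,000) sampled from two census blocks.
# Block "A" is homogeneous; block "B" is much more dispersed.
blocks = {
    "A": [48.0, 50.0, 52.0, 49.0, 51.0],
    "B": [20.0, 80.0, 35.0, 95.0, 20.0],
}


def block_variances(samples_by_block):
    """Sample variance computed separately within each block (the Sigma_i idea)."""
    return {name: statistics.variance(xs) for name, xs in samples_by_block.items()}


def pooled_variance(samples_by_block):
    """One sample variance from all observations pooled together (the Sigma idea)."""
    pooled_sample = [x for xs in samples_by_block.values() for x in xs]
    return statistics.variance(pooled_sample)


per_block = block_variances(blocks)
pooled = pooled_variance(blocks)

for name, var in per_block.items():
    print(f"block {name}: variance = {var:.2f}")
print(f"pooled:  variance = {pooled:.2f}")
```

In this made-up example the pooled variance overstates the spread in block A and understates it in block B, which is the situation in which the estimator based on $\Sigma$ needs the stronger homogeneity assumption.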
5.2 Need for future research

It is indicated in Chapter 4 (Section 4.2.1) that Monte Carlo experiment
results from the ML estimator based on a truncated sample are different from
those of Muthén and Jöreskog (1983). The reasons behind this difference remain
to be explored by future studies that concentrate on the issue of model
specification. It is important to determine the degree to which self-selection
models with measurement errors are sensitive to model misspecification.

Another area that remains to be explored is the large %BIAS and
incorrect variance for $\beta_1$ that result from the use of a single equation approach
without correcting for self-selection bias. This problem may be approached by
varying the variance-covariance matrix structure for $[x_{1i}, x_{2i}, x_{3i}, u_i, \varepsilon_i]$ and examining
how it affects the estimates from a single equation method such as OLS.

In a comparison of the two self-selection models with measurement errors,
Monte Carlo experiment results suggest that adoption of $\mu_i$ and $\Sigma_i$ produces better
results. Recall that both $\mu_i$ and $\Sigma_i$ are estimated from non-respondent i’s
neighborhood, and the neighborhood is loosely defined as a census block, a
county, a state, or even a region. The definition of the neighborhood remains an
empirical problem and should be studied further.

5.3 Conclusion

In CV studies, data can only be collected from those who are willing to
participate in the studies. Results from the application of a single equation
approach to this truncated sample may lead to inconsistent parameter estimates
(self-selection bias). Unfortunately, there is no simple method to detect the
existence of self-selection bias in CV studies. A self-selection model which
contains a self-selection equation and a demand equation must be specified in order to
detect and to correct for self-selection bias. The ML estimator that is based on
the self-selection model with a truncated sample provides theoretically consistent
parameter estimates. However, it is shown that, unless the data form a censored
sample, the parameters and the variance-covariance matrix in the self-selection
equation cannot be estimated reliably.

A method that converts a truncated sample into a censored sample by
combining individual survey data with census data is proposed and is called a
self-selection model with measurement errors. Two ML estimators are derived based
on the self-selection model with measurement errors, in which census data are
treated as if they were the true values plus errors.

Results from the Monte Carlo experiments show that the ML estimator
based on the model which uses a censored sample has the best performance. ML
estimators based on the self-selection models with measurement errors perform
very well, especially in estimating demand parameters. According to the Monte
Carlo experiment results, when a correctly-specified self-selection model with
measurement errors is adopted, estimates of the demand parameters are as
accurate and efficient as the estimates from a model that uses a censored sample,
and the estimates of the self-selection parameters are very close to the true
parameter values. The results carry an encouraging message: adoption of a
self-selection model with measurement errors will not contaminate the original
truncated sample.

Of the two ML estimators from the self-selection model with
measurement errors, the model that uses $\mu_i$ and $\Sigma_i$ estimates the self-selection
parameters more accurately and efficiently than the model that uses $\mu_i$ and $\Sigma$.
Although $\Sigma$ is easier to obtain and the ML estimator based on $\mu_i$ and $\Sigma$ produces
acceptable results, stronger assumptions are needed to justify its results than
for the estimator that uses $\mu_i$ and $\Sigma_i$.

Although the self-selection models with measurement errors developed in this
study started from a CV study using mail surveys with different demand
specifications, they can be easily generalized in several ways. First, since
derivation of the model requires no specific restriction on the surveys, the
model can be applied to any type of cross-section survey. Second, although a
demand function is used in the model, the important issue is the correlation
between the demand function and the self-selection equation. The model is still
valid even if the demand function is replaced with a supply function, given that
it is correctly specified. In general, models developed in this study are broad
enough to be applied to studies that adopt survey data and regression analyses.

APPENDIX A
RESULTS FROM MUTHÉN AND JÖRESKOG’S STUDY

In model 1 of Muthén and Jöreskog’s study (1983, Section 5), the
selection relation is specified as:

$$y_i = 0.0 + 1.0\,x_i + \varepsilon_i,$$
$$\eta_i = 0.0 - 1.0\,x_i + \delta_i,$$

where $[x_i \;\; \varepsilon_i \;\; \delta_i]$ is distributed as an i.i.d. trivariate normal distribution with a mean
vector $[0 \;\; 0 \;\; 0]$ and a variance-covariance matrix

$$\mathrm{Cov}(x_i, \varepsilon_i, \delta_i) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -0.5 \\ 0 & -0.5 & 1 \end{bmatrix}.$$

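
To make the consequence of this specification concrete, the following sketch simulates data of this form and compares OLS on all observations with OLS on respondents only. It is an illustration, not a reproduction of Muthén and Jöreskog’s experiments: the function names are hypothetical, $x$ is assumed to have unit variance, and $y$ is assumed to be observed when $\eta > 0$.

```python
import random


def simulate_model1(n, seed=0):
    """Draw (x, y, observed) triples from the two-equation selection model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)      # assumption: x has unit variance
        eps = rng.gauss(0.0, 1.0)
        # delta has unit variance and covariance -0.5 with eps:
        # delta = -0.5 * eps + sqrt(0.75) * z, with z an independent N(0, 1) draw.
        delta = -0.5 * eps + (0.75 ** 0.5) * rng.gauss(0.0, 1.0)
        y = 0.0 + 1.0 * x + eps
        eta = 0.0 - 1.0 * x + delta
        rows.append((x, y, eta > 0.0))  # assumption: y is observed when eta > 0
    return rows


def ols(xs, ys):
    """Return (intercept, slope) of a simple least-squares regression of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rows = simulate_model1(20000, seed=1)
full = ols([r[0] for r in rows], [r[1] for r in rows])
trunc = ols([r[0] for r in rows if r[2]], [r[1] for r in rows if r[2]])

print("all observations:  intercept = %.3f, slope = %.3f" % full)
print("respondents only:  intercept = %.3f, slope = %.3f" % trunc)
```

Under these assumptions the regression on all observations recovers an intercept near 0 and a slope near 1, while the regression restricted to respondents is pulled away from both, in the same direction as the OLS columns reported in the tables of this appendix.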
Based on different sample sizes (i.e. N = 1,000 and N = 4,000), Monte
Carlo experiment results are presented in the following tables:1


1 Estimates from a probit and a Heckman’s two-stage estimator that were
reported by Muthén and Jöreskog are omitted here. The notation used to report
results is defined in Appendix B.

Table A.1 Parameter estimates for data simulated according to model 1, $N_t$ =
496,* N = 1000

Parameters                                OLS Estimates     ML Estimates      BIAS    %BIAS

Truncated Sample
$\beta_0$ = 0.0                           -.373 (.054)**    -.209 (.119)      0.209   _
$\beta_1$ = 1.0                            .788 (.052)       .931 (.095)      0.069   6.9%
$\sigma_{\varepsilon\varepsilon}$ = 1.0    .985 (.065)       .982 (.076)      0.018   1.8%
$\gamma_0$ = 0.0                                             .991 (1.599)     0.991   _
$\gamma_1$ = -1.0                                          -3.448 (4.542)     2.448   244.8%
$\rho$ = -0.5                                               -.248 (.413)      0.252   50.4%

Censored Sample
$\beta_0$ = 0.0                           -.373 (.054)       .074 (.179)      0.074   _
$\beta_1$ = 1.0                            .788 (.052)      1.033 (.114)      0.033   3.3%
$\sigma_{\varepsilon\varepsilon}$ = 1.0    .985 (.065)      1.126 (.131)      0.126   12.6%
$\gamma_0$ = 0.0                                             .013 (.046)      0.013   _
$\gamma_1$ = -1.0                                          -1.040 (.068)      0.040   4.0%
$\rho$ = -0.5                                               -.522 (.164)      0.022   4.4%

* Truncated sample size.
** Standard errors in parentheses.

 

 

Table A.2 Parameter estimates for data simulated according to model 1, $N_t$ =
1963,* N = 4000

Parameters                                OLS Estimates     ML Estimates      BIAS    %BIAS

Truncated Sample
$\beta_0$ = 0.0                           -.435 (.027)**    -.223 (.137)      0.223   _
$\beta_1$ = 1.0                            .807 (.027)       .965 (.084)      0.035   3.5%
$\sigma_{\varepsilon\varepsilon}$ = 1.0    .916 (.029)       .978 (.056)      0.022   2.2%
$\gamma_0$ = 0.0                                             .851 (.723)      0.851   _
$\gamma_1$ = -1.0                                          -1.277 (.346)      0.277   27.7%
$\rho$ = -0.5                                               -.521 (.122)      0.021   4.2%

Censored Sample
$\beta_0$ = 0.0                           -.435 (.027)       .013 (.083)      0.013   _
$\beta_1$ = 1.0                            .807 (.027)      1.065 (.054)      0.065   6.5%
$\sigma_{\varepsilon\varepsilon}$ = 1.0    .916 (.029)      1.054 (.062)      0.054   5.4%
$\gamma_0$ = 0.0                                             .021 (.023)      0.021   _
$\gamma_1$ = -1.0                                          -1.043 (.032)      0.043   4.3%
$\rho$ = -0.5                                               -.538 (.078)      0.038   7.6%

* Truncated sample size.
** Standard errors in parentheses.


There are two problems with Muthén and Jöreskog’s results. First, it
cannot be determined whether the standard errors reported in the study are
RMSEs or ASEs. Second, since misspecified models are used in the Monte Carlo
experiments, it is difficult to distinguish whether the biased results are caused by
the truncated sample or by the misspecification.2

Comparable Monte Carlo experiment results from a model specified in this
study (Chapter 4, Sections 4.1 and 4.2) are presented in Appendix C, Tables
C.1.3.A, C.1.3.B, and C.1.3.E.3

2 It has been shown that a probit model is sensitive to model specification
(Yatchew and Griliches, 1985).

3 Results are interpreted in Chapter 4, Section 4.2.1.

APPENDIX B
NOTATION USED IN REPORTING MONTE CARLO RESULTS1

To summarize results from Monte Carlo experiments, let $\hat{\theta}_i$, a $k \times 1$ vector,
be the estimate of the parameter vector obtained from the ith replication, and let
$\hat{\Omega}_i$ be the corresponding $k \times k$ variance-covariance matrix calculated as the inverse of
the negative of the second-derivative matrix of the log-likelihood function at the
maximum likelihood estimates.2

First, the mean estimate of the parameter vector, MEAN, is defined as

$$\mathrm{MEAN} = \bar{\theta} = \frac{1}{N} \sum_{i=1}^{N} \hat{\theta}_i,$$

where N is the total number of replications (N = 500 in this study). A measure
of the bias, BIAS, can be defined as

$$\mathrm{BIAS} = \bar{\theta} - \theta^0,$$

where $\theta^0$ is the true value of the parameter vector. In addition, define %BIAS
by

$$\%\mathrm{BIAS} = \left| \frac{\mathrm{BIAS}}{\theta^0} \right| \times 100\%.$$

Further, define average standard error, ASE, as

$$\mathrm{ASE} = \frac{1}{N} \sum_{i=1}^{N} \sqrt{\mathrm{diag}[\hat{\Omega}_i]},$$

the covariance matrix about the true parameter value, $\mathrm{COV}(\hat{\theta})$, as

$$\mathrm{COV}(\hat{\theta}) = \frac{1}{N} \sum_{i=1}^{N} (\hat{\theta}_i - \theta^0)(\hat{\theta}_i - \theta^0)',$$

and root mean square errors, RMSE, as

$$\mathrm{RMSE} = \sqrt{\mathrm{diag}[\mathrm{COV}(\hat{\theta})]}.$$

1 Adapted from Dhrymes (1970), Section 8.6, pp. 372-380.

2 In the GAUSS optimization procedure, instead of using an analytical second
derivative, a numerical second derivative that is based on an analytical first
derivative is used.

To examine Monte Carlo experiment results, it is important to check both
the RMSE and the ASE. Under ideal conditions, the RMSE and the ASE should be very
close to each other. The RMSE is very different from the ASE if 1) the model is
misspecified; 2) the estimator does not lead to reliable parameter estimates; or 3)
the variance-covariance matrix cannot be calculated using the regular formula.
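
The summary statistics defined above can be sketched in a few lines for a single scalar parameter. This is an illustration under stated assumptions, not the GAUSS code used in the study: the function names are hypothetical, and the "estimator" is simply a sample mean with its usual standard error, chosen because its RMSE and ASE should agree closely.

```python
import math
import random
import statistics


def summarize(estimates, std_errors, true_value):
    """Monte Carlo summary statistics for one scalar parameter."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - true_value
    pct_bias = abs(bias / true_value) * 100.0
    # RMSE is taken about the true value, not about the Monte Carlo mean.
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    ase = sum(std_errors) / n
    return {"MEAN": mean, "BIAS": bias, "%BIAS": pct_bias, "RMSE": rmse, "ASE": ase}


def run_replications(n_reps=500, n_obs=100, mu=4.0, sigma=1.0, seed=0):
    """Estimate mu by the sample mean in each replication and summarize."""
    rng = random.Random(seed)
    estimates, std_errors = [], []
    for _ in range(n_reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n_obs)]
        estimates.append(statistics.fmean(sample))
        std_errors.append(statistics.stdev(sample) / math.sqrt(n_obs))
    return summarize(estimates, std_errors, mu)


stats = run_replications()
for name, value in stats.items():
    print(f"{name:6s} {value: .4f}")
```

Because the sample mean is correctly specified here, RMSE and ASE come out close to each other; a large gap between the two in such a table is the warning sign described in the paragraph above.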

For the purpose of comparing efficiency among estimators, define

$$D_{b;j} = \det[\mathrm{COV}(\hat{\theta})_{b;j}],$$

where b specifies a sub-vector of the parameter vector, j indicates the jth type of
estimator, and $\mathrm{COV}(\hat{\theta})_{b;j}$ is the corresponding covariance matrix about the true
parameter value. For different estimators, if the $D_{b;j}$’s are defined over the same
sample, their (relative) magnitudes can be treated as an indicator of "efficiency."
For example, $D_{b;i} > D_{b;j}$ indicates that the jth estimator is more efficient than the
ith estimator. In this study, although different estimators are based on different
data sets,3 the (relative) magnitude of $D_{b;j}$ can still be treated as an indicator of
efficiency.
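
As a rough sketch of how such a determinant-based comparison could be computed, the code below builds the covariance matrix about the true value for a two-parameter sub-vector and takes its determinant. The replication estimates and the estimator labels "tight" and "loose" are made up for illustration, and the helper names are hypothetical.

```python
def cov_about_truth(estimates, truth):
    """(1/N) * sum of outer products (theta_i - theta0)(theta_i - theta0)'."""
    n, k = len(estimates), len(truth)
    cov = [[0.0] * k for _ in range(k)]
    for est in estimates:
        dev = [e - t for e, t in zip(est, truth)]
        for r in range(k):
            for c in range(k):
                cov[r][c] += dev[r] * dev[c] / n
    return cov


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    k, sign, result = len(a), 1.0, 1.0
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k):
                a[r][c] -= factor * a[col][c]
    return sign * result


truth = [4.0, -3.0]
# Made-up replication estimates from two hypothetical estimators.
tight = [[4.01, -3.02], [3.98, -2.99], [4.02, -3.01], [3.99, -2.98]]
loose = [[4.10, -3.20], [3.80, -2.90], [4.20, -3.10], [3.90, -2.80]]

d_tight = det(cov_about_truth(tight, truth))
d_loose = det(cov_about_truth(loose, truth))
print("D (tight) =", d_tight)
print("D (loose) =", d_loose)
```

The estimator whose replications cluster more closely around the true values yields the smaller determinant, which is the sense in which a smaller $D$ is read as higher efficiency.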

In Appendix C, the MEAN, BIAS, %BIAS, RMSE, ASE, and $D_{b;j}$ values, as well as
the average log-likelihood value and its standard error (SE),4 summarize the
results of the Monte Carlo experiments.

3 In this study, although different estimators are based on different data sets,
all the data sets are developed from the same population and contain an identical
proportion of respondents to non-respondents. For non-respondents, different
data sets contain either the real observations or some statistics estimated from the
same population.

4 These are simply the mean and standard error of the maximum log-
likelihood values from the N (= 500) replications.

APPENDIX C
MONTE CARLO EXPERIMENT RESULTS


This appendix presents the results from Monte Carlo experiments for
self-selection models with measurement errors with a linear demand, a Tobit demand,
and a probit demand equation. Based on different correlations between
the demand and self-selection equations, three sequences of simulations are
conducted for each model ($\rho$ = 0.25, 0.5, and 0.75), and each sequence of
simulations has 500 replications.

C.1.1 Estimates from a self-selection model with measurement errors and a
linear demand equation ($\rho$ = 0.25)

Results presented below are based on $\rho$ = 0.25.1 Estimates are obtained
after 500 replications, and the average number of respondents is 499.0820 (SE =
21.4049) out of 1,000.

1 Statistics reported in the tables are defined in Appendix B.

Table C.1.1.A Linear demand, OLS estimates without correcting for self-selection
bias, $\rho$ = 0.25

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\beta_0$ = 6           6.0065    0.0065     0.11%    0.2328    0.2234
$\beta_1$ = 4           4.0876    0.0876     2.19%    0.1096    0.0640
$\beta_2$ = -3         -3.0028   -0.0028     0.09%    0.0611    0.0586
$\sigma^2$ = 1          0.9868   -0.0132     1.32%    0.0637

$D_{(\beta,\sigma^2);S}$ = 6.4531e-10

Table C.1.1.B Linear demand, correcting for self-selection bias using censored
sample, $\rho$ = 0.25

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5026    0.0026     0.17%    0.2070    0.1954
$\gamma_1$ = 1          1.0120    0.0120     1.20%    0.0794    0.0786
$\gamma_2$ = -3        -3.0259   -0.0259     0.86%    0.1864    0.1799
$\beta_0$ = 6           6.0036    0.0036     0.06%    0.2335    0.2068
$\beta_1$ = 4           4.0014    0.0014     0.04%    0.0831    0.0744
$\beta_2$ = -3         -3.0004   -0.0004     0.01%    0.0612    0.0544
$\sigma^2$ = 1          0.9968   -0.0032     0.32%    0.0642    0.0607
$\rho$ = 0.25           0.2490   -0.0010     0.40%    0.1342    0.1221

Average log-likelihood = -928.1916 (SE = 40.6244)
$D_{(\gamma,\rho);CEN}$ = 8.8139e-09
$D_{(\beta,\sigma^2);CEN}$ = 3.0124e-10
$D_{(\gamma,\beta,\sigma^2,\rho);CEN}$ = 1.1842e-18


Table C.1.1.C Linear demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma$, $\rho$ = 0.25

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5476    0.0476     3.17%    0.3664    0.3208
$\gamma_1$ = 1          1.0338    0.0338     3.38%    0.1420    0.1259
$\gamma_2$ = -3        -3.0861   -0.0861     2.87%    0.3651    0.3116
$\beta_0$ = 6           6.0038    0.0038     0.06%    0.2336    0.2090
$\beta_1$ = 4           4.0013    0.0013     0.03%    0.0843    0.0741
$\beta_2$ = -3         -3.0005   -0.0005     0.02%    0.0613    0.0549
$\sigma^2$ = 1          0.9972   -0.0028     0.28%    0.0647    0.0602
$\rho$ = 0.25           0.2522    0.0022     0.88%    0.1372    0.1232

Average log-likelihood = -1157.1148 (SE = 41.9968)
$D_{(\gamma,\rho);ME1}$ = 1.3607e-07
$D_{(\beta,\sigma^2);ME1}$ = 3.1814e-10
$D_{(\gamma,\beta,\sigma^2,\rho);ME1}$ = 1.8437e-17

Table C.1.1.D Linear demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma_i$, $\rho$ = 0.25

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5130    0.0130     0.87%    0.3520    0.3180
$\gamma_1$ = 1          1.0228    0.0228     2.28%    0.1353    0.1241
$\gamma_2$ = -3        -3.0538   -0.0538     1.79%    0.3467    0.3079
$\beta_0$ = 6           6.0033    0.0033     0.06%    0.2335    0.2094
$\beta_1$ = 4           4.0003    0.0003     0.01%    0.0844    0.0744
$\beta_2$ = -3         -3.0005   -0.0005     0.02%    0.0613    0.0550
$\sigma^2$ = 1          0.9974   -0.0026     0.26%    0.0647    0.0603
$\rho$ = 0.25           0.2516    0.0016     0.64%    0.1368    0.1229

Average log-likelihood = -1157.1736 (SE = 42.0587)
$D_{(\gamma,\rho);ME2}$ = 1.0158e-07
$D_{(\beta,\sigma^2);ME2}$ = 3.1997e-10
$D_{(\gamma,\beta,\sigma^2,\rho);ME2}$ = 1.3851e-17

C.1.2 Estimates from a self-selection model with measurement errors and a
linear demand equation ($\rho$ = 0.5)

Results presented below are based on $\rho$ = 0.5. Estimates are obtained
after 500 replications, and the average number of respondents is 498.0860 (SE =
22.1867) out of 1,000.

Table C.1.2.A Linear demand, OLS estimates without correcting for self-selection
bias, $\rho$ = 0.5

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\beta_0$ = 6           5.9969   -0.0031     0.05%    0.2079    0.2209
$\beta_1$ = 4           4.1714    0.1714     4.29%    0.1828    0.0632
$\beta_2$ = -3         -3.0025   -0.0025     0.08%    0.0539    0.0579
$\sigma^2$ = 1          0.9649   -0.0351     3.51%    0.0704

$D_{(\beta,\sigma^2);S}$ = 1.4168e-09

Table C.1.2.B Linear demand, correcting for self-selection bias using censored
sample, $\rho$ = 0.5

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.4988   -0.0012     0.08%    0.2059    0.1955
$\gamma_1$ = 1          1.0116    0.0116     1.16%    0.0790    0.0793
$\gamma_2$ = -3        -3.0231   -0.0231     0.77%    0.1861    0.1818
$\beta_0$ = 6           5.9945   -0.0055     0.09%    0.2060    0.2074
$\beta_1$ = 4           4.0023    0.0023     0.06%    0.0780    0.0732
$\beta_2$ = -3         -2.9989    0.0011     0.04%    0.0532    0.0542
$\sigma^2$ = 1          0.9967   -0.0033     0.33%    0.0659    0.0617
$\rho$ = 0.5            0.4946   -0.0054     1.08%    0.1112    0.1083

Average log-likelihood = -914.5157 (SE = 42.3432)
$D_{(\gamma,\rho);CEN}$ = 5.4306e-09
$D_{(\beta,\sigma^2);CEN}$ = 1.6633e-10
$D_{(\gamma,\beta,\sigma^2,\rho);CEN}$ = 4.0948e-19


Table C.1.2.C Linear demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma$, $\rho$ = 0.5

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5230    0.0230     1.53%    0.3428    0.3092
$\gamma_1$ = 1          1.0221    0.0221     2.21%    0.1359    0.1222
$\gamma_2$ = -3        -3.0557   -0.0557     1.86%    0.3439    0.3003
$\beta_0$ = 6           5.9940   -0.0060     0.10%    0.2066    0.2092
$\beta_1$ = 4           4.0021    0.0021     0.05%    0.0794    0.0739
$\beta_2$ = -3         -2.9988    0.0012     0.04%    0.0532    0.0546
$\sigma^2$ = 1          0.9972   -0.0028     0.28%    0.0673    0.0629
$\rho$ = 0.5            0.4969   -0.0031     0.62%    0.1127    0.1091

Average log-likelihood = -1144.3670 (SE = 43.1100)
$D_{(\gamma,\rho);ME1}$ = 6.8753e-08
$D_{(\beta,\sigma^2);ME1}$ = 1.8757e-10
$D_{(\gamma,\beta,\sigma^2,\rho);ME1}$ = 5.4879e-18

Table C.1.2.D Linear demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma_i$, $\rho$ = 0.5

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.4978   -0.0022     0.15%    0.3246    0.3061
$\gamma_1$ = 1          1.0137    0.0137     1.37%    0.1272    0.1213
$\gamma_2$ = -3        -3.0318   -0.0318     1.06%    0.3195    0.2966
$\beta_0$ = 6           5.9935   -0.0065     0.11%    0.2066    0.2094
$\beta_1$ = 4           4.0007    0.0007     0.02%    0.0794    0.0740
$\beta_2$ = -3         -2.9988    0.0012     0.04%    0.0532    0.0547
$\sigma^2$ = 1          0.9978   -0.0022     0.22%    0.0673    0.0630
$\rho$ = 0.5            0.4959   -0.0041     0.82%    0.1124    0.1089

Average log-likelihood = -1144.4764 (SE = 43.2787)
$D_{(\gamma,\rho);ME2}$ = 4.8506e-08
$D_{(\beta,\sigma^2);ME2}$ = 1.8872e-10
$D_{(\gamma,\beta,\sigma^2,\rho);ME2}$ = 3.8592e-18

C.1.3 Estimates from a self-selection model with measurement errors and a
linear demand equation ($\rho$ = 0.75)

Results presented below are based on $\rho$ = 0.75. Estimates are obtained
after 500 replications, and the average number of respondents is 498.9100 (SE =
20.9709) out of 1,000.

Table C.1.3.A Linear demand, OLS estimates without correcting for self-selection
bias, $\rho$ = 0.75

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\beta_0$ = 6           6.0257    0.0257     0.43%    0.2187    0.2157
$\beta_1$ = 4           4.2579    0.2579     6.45%    0.2654    0.0619
$\beta_2$ = -3         -3.0109   -0.0109     0.36%    0.0573    0.0565
$\sigma^2$ = 1          0.9227   -0.0773     7.73%    0.0975

$D_{(\beta,\sigma^2);S}$ = 3.2229e-09

Table C.1.3.B Linear demand, correcting for self-selection bias using censored
sample, $\rho$ = 0.75

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5097    0.0097     0.65%    0.1986    0.1824
$\gamma_1$ = 1          1.0097    0.0097     0.97%    0.0794    0.0733
$\gamma_2$ = -3        -3.0271   -0.0271     0.90%    0.1855    0.1734
$\beta_0$ = 6           6.0165    0.0165     0.28%    0.2078    0.1931
$\beta_1$ = 4           4.0013    0.0013     0.03%    0.0691    0.0642
$\beta_2$ = -3         -3.0038   -0.0038     0.13%    0.0534    0.0504
$\sigma^2$ = 1          0.9938   -0.0062     0.62%    0.0673    0.0632
$\rho$ = 0.75           0.7521    0.0021     0.28%    0.0657    0.0616

Average log-likelihood = -891.7293 (SE = 38.4693)
$D_{(\gamma,\rho);CEN}$ = 1.8304e-09
$D_{(\beta,\sigma^2);CEN}$ = 1.1114e-10
$D_{(\gamma,\beta,\sigma^2,\rho);CEN}$ = 1.0369e-19

Table C.1.3.C Linear demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma$, $\rho$ = 0.75

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5483    0.0483     3.22%    0.3079    0.2718
$\gamma_1$ = 1          1.0305    0.0305     3.05%    0.1266    0.1103
$\gamma_2$ = -3        -3.0846   -0.0846     2.82%    0.3114    0.2723
$\beta_0$ = 6           6.0174    0.0174     0.29%    0.2080    0.1950
$\beta_1$ = 4           4.0035    0.0035     0.09%    0.0710    0.0655
$\beta_2$ = -3         -3.0039   -0.0039     0.13%    0.0533    0.0507
$\sigma^2$ = 1          0.9920   -0.0080     0.80%    0.0714    0.0647
$\rho$ = 0.75           0.7540    0.0040     0.53%    0.0675    0.0619

Average log-likelihood = -1120.3568 (SE = 38.5626)
$D_{(\gamma,\rho);ME1}$ = 1.8001e-08
$D_{(\beta,\sigma^2);ME1}$ = 1.4288e-10
$D_{(\gamma,\beta,\sigma^2,\rho);ME1}$ = 1.0489e-18

Table C.1.3.D Linear demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma_i$, $\rho$ = 0.75

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5290    0.0290     1.93%    0.2935    0.2689
$\gamma_1$ = 1          1.0243    0.0243     2.43%    0.1226    0.1092
$\gamma_2$ = -3        -3.0677   -0.0677     2.26%    0.2982    0.2693
$\beta_0$ = 6           6.0166    0.0166     0.28%    0.2077    0.1953
$\beta_1$ = 4           4.0014    0.0014     0.04%    0.0698    0.0657
$\beta_2$ = -3         -3.0038   -0.0038     0.13%    0.0533    0.0508
$\sigma^2$ = 1          0.9932   -0.0068     0.68%    0.0714    0.0648
$\rho$ = 0.75           0.7530    0.0030     0.40%    0.0672    0.0620

Average log-likelihood = -1120.4394 (SE = 38.6073)
$D_{(\gamma,\rho);ME2}$ = 1.3920e-08
$D_{(\beta,\sigma^2);ME2}$ = 1.3765e-10
$D_{(\gamma,\beta,\sigma^2,\rho);ME2}$ = 8.0515e-19


Table C.1.3.E Linear demand, correcting for self-selection bias using truncated
sample, $\rho$ = 0.75

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5442    0.0442     2.95%    0.6446    0.5437
$\gamma_1$ = 1          1.0538    0.0538     5.38%    0.3605    0.2746
$\gamma_2$ = -3        -3.1487   -0.1487     4.96%    0.7926    0.6842
$\beta_0$ = 6           6.0123    0.0123     0.21%    0.2099    0.1960
$\beta_1$ = 4           3.9913   -0.0087     0.22%    0.1055    0.0966
$\beta_2$ = -3         -3.0041   -0.0041     0.14%    0.0535    0.0503
$\sigma^2$ = 1          0.9967   -0.0033     0.33%    0.0877    0.0808
$\rho$ = 0.75           0.7600    0.0100     1.33%    0.1100    0.0989

Average log-likelihood = -665.0761 (SE = 32.2393)
$D_{(\gamma,\rho);TRU}$ = 1.9254e-05
$D_{(\beta,\sigma^2);TRU}$ = 9.4964e-10
$D_{(\gamma,\beta,\sigma^2,\rho);TRU}$ = 3.4627e-15

C.2.1 Estimates from a self-selection model with measurement errors and a Tobit
demand equation ($\rho$ = 0.25)

Results presented below are based on $\rho$ = 0.25. Estimates are obtained
after 500 replications, and the average number of respondents is 499.9500 (SE =
22.6791) out of 1,000. In the demand equation, the average number of censored
$q_i$’s ($q_i$ = 0) is 370.4960 (SE = 19.7898), and the average number of uncensored
$q_i$’s ($q_i$ > 0) is 129.4540 (SE = 11.1605).

Table C.2.1.A Tobit estimates without correcting for self-selection bias, $\rho$ = 0.25

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\beta_0$ = 6           5.9277   -0.0723     1.21%    0.4288    0.3878
$\beta_1$ = 4           4.1536    0.1536     3.84%    0.2491    0.1955
$\beta_2$ = -3         -3.0048   -0.0048     0.16%    0.1523    0.1461
$\sigma^2$ = 1          0.9774   -0.0226     2.26%    0.1257    0.1221

Average log-likelihood = -226.2562 (SE = 19.4390)
$D_{(\beta,\sigma^2);S}$ = 1.7151e-07

Table C.2.1.B Tobit demand, correcting for self-selection bias using censored
sample, $\rho$ = 0.25

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5215    0.0215     1.43%    0.2130    0.2140
$\gamma_1$ = 1          1.0013    0.0013     0.13%    0.0827    0.0836
$\gamma_2$ = -3        -3.0185   -0.0185     0.62%    0.1978    0.1953
$\beta_0$ = 6           5.9834   -0.0166     0.28%    0.4005    0.3910
$\beta_1$ = 4           4.0039    0.0039     0.10%    0.2130    0.2221
$\beta_2$ = -3         -2.9974    0.0026     0.09%    0.1472    0.1462
$\sigma^2$ = 1          0.9988   -0.0012     0.12%    0.1329    0.1289
$\rho$ = 0.25           0.2426   -0.0074     2.96%    0.1873    0.1815

Average log-likelihood = -451.5493 (SE = 28.6113)
$D_{(\gamma,\rho);CEN}$ = 2.0024e-08
$D_{(\beta,\sigma^2);CEN}$ = 1.0154e-07
$D_{(\gamma,\beta,\sigma^2,\rho);CEN}$ = 7.4238e-16

Table C.2.1.C Tobit demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma$, $\rho$ = 0.25

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5702    0.0702     4.68%    0.3603    0.3481
$\gamma_1$ = 1          1.0175    0.0175     1.75%    0.1316    0.1343
$\gamma_2$ = -3        -3.0711   -0.0711     2.37%    0.3338    0.3328
$\beta_0$ = 6           5.9792   -0.0208     0.35%    0.4093    0.3985
$\beta_1$ = 4           4.0033    0.0033     0.08%    0.2140    0.2255
$\beta_2$ = -3         -2.9954    0.0046     0.15%    0.1499    0.1492
$\sigma^2$ = 1          0.9982   -0.0018     0.18%    0.1341    0.1295
$\rho$ = 0.25           0.2427   -0.0073     2.92%    0.1911    0.1837

Average log-likelihood = -679.4912 (SE = 30.5418)
$D_{(\gamma,\rho);ME1}$ = 2.4916e-07
$D_{(\beta,\sigma^2);ME1}$ = 1.0763e-07
$D_{(\gamma,\beta,\sigma^2,\rho);ME1}$ = 9.7833e-15

Table C.2.1.D Tobit demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma_i$, $\rho$ = 0.25

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5557    0.0557     3.71%    0.3530    0.3414
$\gamma_1$ = 1          1.0123    0.0123     1.23%    0.1300    0.1319
$\gamma_2$ = -3        -3.0597   -0.0597     1.99%    0.3265    0.3256
$\beta_0$ = 6           5.9786   -0.0214     0.36%    0.4104    0.3960
$\beta_1$ = 4           4.0029    0.0029     0.07%    0.2137    0.2258
$\beta_2$ = -3         -2.9957    0.0043     0.14%    0.1503    0.1478
$\sigma^2$ = 1          0.9984   -0.0016     0.16%    0.1344    0.1295
$\rho$ = 0.25           0.2424   -0.0076     3.04%    0.1909    0.1855

Average log-likelihood = -679.5947 (SE = 30.5940)
$D_{(\gamma,\rho);ME2}$ = 1.9532e-07
$D_{(\beta,\sigma^2);ME2}$ = 1.0759e-07
$D_{(\gamma,\beta,\sigma^2,\rho);ME2}$ = 7.5680e-15

C.2.2 Estimates from a self-selection model with measurement errors and a Tobit
demand equation ($\rho$ = 0.5)

Results presented below are based on $\rho$ = 0.5. Estimates are obtained
after 500 replications, and the average number of respondents is 499.5380 (SE =
21.1736) out of 1,000. In the demand equation, the average number of censored
$q_i$’s ($q_i$ = 0) is 364.3780 (SE = 18.4454), and the average number of uncensored
$q_i$’s ($q_i$ > 0) is 135.1600 (SE = 11.7430).

Table C.2.2.A Tobit estimates without correcting for self-selection bias, $\rho$ = 0.5

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\beta_0$ = 6           5.8824   -0.1176     1.96%    0.4033    0.3732
$\beta_1$ = 4           4.3040    0.3040     7.60%    0.3626    0.1895
$\beta_2$ = -3         -3.0145   -0.0145     0.48%    0.1471    0.1391
$\sigma^2$ = 1          0.9345   -0.0655     6.55%    0.1348    0.1139

Average log-likelihood = -229.8972 (SE = 20.0318)
$D_{(\beta,\sigma^2);S}$ = 4.6211e-07

Table C.2.2.B Tobit demand, correcting for self-selection bias using censored
sample, $\rho$ = 0.5

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5329    0.0329     2.19%    0.2120    0.2144
$\gamma_1$ = 1          1.0111    0.0111     1.11%    0.0879    0.0837
$\gamma_2$ = -3        -3.0414   -0.0414     1.38%    0.2041    0.1962
$\beta_0$ = 6           5.9847   -0.0153     0.26%    0.3724    0.3715
$\beta_1$ = 4           4.0087    0.0087     0.22%    0.2043    0.2039
$\beta_2$ = -3         -2.9957    0.0043     0.14%    0.1380    0.1365
$\sigma^2$ = 1          0.9854   -0.0146     1.46%    0.1353    0.1272
$\rho$ = 0.5            0.4836   -0.0164     3.28%    0.1594    0.1512

Average log-likelihood = -450.5322 (SE = 29.5257)
$D_{(\gamma,\rho);CEN}$ = 1.4824e-08
$D_{(\beta,\sigma^2);CEN}$ = 6.4816e-08
$D_{(\gamma,\beta,\sigma^2,\rho);CEN}$ = 3.3847e-16

Table C.2.2.C Tobit demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma$, $\rho$ = 0.5

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5511    0.0511     3.41%    0.3635    0.3454
$\gamma_1$ = 1          1.0282    0.0282     2.82%    0.1353    0.1381
$\gamma_2$ = -3        -3.0788   -0.0788     2.63%    0.3515    0.3407
$\beta_0$ = 6           5.9824   -0.0176     0.29%    0.3758    0.3702
$\beta_1$ = 4           4.0106    0.0106     0.27%    0.2066    0.2053
$\beta_2$ = -3         -2.9958    0.0042     0.14%    0.1385    0.1355
$\sigma^2$ = 1          0.9859   -0.0141     1.41%    0.1354    0.1291
$\rho$ = 0.5            0.4859   -0.0141     2.82%    0.1599    0.1534

Average log-likelihood = -680.7637 (SE = 30.1853)
$D_{(\gamma,\rho);ME1}$ = 1.7301e-07
$D_{(\beta,\sigma^2);ME1}$ = 7.1621e-08
$D_{(\gamma,\beta,\sigma^2,\rho);ME1}$ = 4.3018e-15

Table C.2.2.D Tobit demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma_i$, $\rho$ = 0.5

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5252    0.0252     1.68%    0.3447    0.3381
$\gamma_1$ = 1          1.0205    0.0205     2.05%    0.1285    0.1314
$\gamma_2$ = -3        -3.0565   -0.0565     1.88%    0.3282    0.3241
$\beta_0$ = 6           5.9809   -0.0191     0.32%    0.3779    0.3765
$\beta_1$ = 4           4.0100    0.0100     0.25%    0.2077    0.2081
$\beta_2$ = -3         -2.9956    0.0044     0.15%    0.1396    0.1387
$\sigma^2$ = 1          0.9862   -0.0138     1.38%    0.1361    0.1289
$\rho$ = 0.5            0.4847   -0.0153     3.06%    0.1607    0.1534

Average log-likelihood = -680.8440 (SE = 30.2236)
$D_{(\gamma,\rho);ME2}$ = 1.2593e-07
$D_{(\beta,\sigma^2);ME2}$ = 7.1884e-08
$D_{(\gamma,\beta,\sigma^2,\rho);ME2}$ = 3.1359e-15

C.2.3 Estimates from a self-selection model with measurement errors and a Tobit
demand equation ($\rho$ = 0.75)

Results presented below are based on $\rho$ = 0.75. Estimates are obtained
after 500 replications, and the average number of respondents is 499.9580 (SE =
21.8165) out of 1,000. In the demand equation, the average number of censored
$q_i$’s ($q_i$ = 0) is 361.5160 (SE = 18.9821), and the average number of uncensored
$q_i$’s ($q_i$ > 0) is 138.4420 (SE = 11.1505).

Table C.2.3.A Tobit estimates without correcting for self-selection bias, $\rho$ = 0.75

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\beta_0$ = 6           5.8553   -0.1447     2.41%    0.3971    0.3561
$\beta_1$ = 4           4.4434    0.4434    11.09%    0.4808    0.1848
$\beta_2$ = -3         -3.0273   -0.0273     0.91%    0.1383    0.1328
$\sigma^2$ = 1          0.8767   -0.1233    12.33%    0.1633    0.1054

Average log-likelihood = -228.8807 (SE = 18.4061)
$D_{(\beta,\sigma^2);S}$ = 6.5670e-07

Table C.2.3.B Tobit demand, correcting for self-selection bias using censored
sample, $\rho$ = 0.75

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.4990   -0.0010     0.07%    0.1959    0.2091
$\gamma_1$ = 1          1.0177    0.0177     1.77%    0.0835    0.0821
$\gamma_2$ = -3        -3.0368   -0.0368     1.23%    0.1862    0.1943
$\beta_0$ = 6           6.0087    0.0087     0.15%    0.3409    0.3408
$\beta_1$ = 4           4.0121    0.0121     0.30%    0.1780    0.1804
$\beta_2$ = -3         -3.0057   -0.0057     0.19%    0.1254    0.1232
$\sigma^2$ = 1          0.9882   -0.0118     1.18%    0.1302    0.1280
$\rho$ = 0.75           0.7396   -0.0104     1.39%    0.1028    0.0973

Average log-likelihood = -443.3619 (SE = 26.9542)
$D_{(\gamma,\rho);CEN}$ = 4.8510e-09
$D_{(\beta,\sigma^2);CEN}$ = 2.8632e-08
$D_{(\gamma,\beta,\sigma^2,\rho);CEN}$ = 5.7769e-17


Table C.2.3.C Tobit demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma$, $\rho$ = 0.75

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5645    0.0645     4.30%    0.3311    0.3295
$\gamma_1$ = 1          1.0362    0.0362     3.62%    0.1391    0.1279
$\gamma_2$ = -3        -3.1039   -0.1039     3.46%    0.3510    0.3210
$\beta_0$ = 6           6.0113    0.0113     0.19%    0.3453    0.3421
$\beta_1$ = 4           4.0101    0.0101     0.25%    0.1813    0.1839
$\beta_2$ = -3         -3.0044   -0.0044     0.15%    0.1260    0.1229
$\sigma^2$ = 1          0.9858   -0.0142     1.42%    0.1325    0.1305
$\rho$ = 0.75           0.7401   -0.0099     1.32%    0.1058    0.0979

Average log-likelihood = -673.0108 (SE = 29.1919)
$D_{(\gamma,\rho);ME1}$ = 5.6626e-08
$D_{(\beta,\sigma^2);ME1}$ = 3.3398e-08
$D_{(\gamma,\beta,\sigma^2,\rho);ME1}$ = 6.8714e-16

Table C.2.3.D Tobit demand, correcting for self-selection using measurement
errors model with $\mu_i$ and $\Sigma_i$, $\rho$ = 0.75

Parameter                 MEAN      BIAS     %BIAS      RMSE       ASE
$\gamma_0$ = 1.5        1.5348    0.0348     2.32%    0.3169    0.3248
$\gamma_1$ = 1          1.0260    0.0260     2.60%    0.1330    0.1262
$\gamma_2$ = -3        -3.0763   -0.0763     2.54%    0.3360    0.3173
$\beta_0$ = 6           6.0116    0.0116     0.19%    0.3454    0.3425
$\beta_1$ = 4           4.0068    0.0068     0.17%    0.1806    0.1841
$\beta_2$ = -3         -3.0050   -0.0050     0.17%    0.1261    0.1233
$\sigma^2$ = 1          0.9891   -0.0109     1.09%    0.1346    0.1324
$\rho$ = 0.75           0.7397   -0.0103     1.37%    0.1060    0.0981

Average log-likelihood = -673.0461 (SE = 29.1821)
$D_{(\gamma,\rho);ME2}$ = 4.1771e-08
$D_{(\beta,\sigma^2);ME2}$ = 3.3442e-08
$D_{(\gamma,\beta,\sigma^2,\rho);ME2}$ = 5.1214e-16

C.3.1 Estimates from a self-selection model with measurement errors and a
probit demand equation (ρ = 0.25)

Results presented below are based on ρ = 0.25. Estimates are obtained
after 500 replications, and the average number of respondents is 502.3760 (SE =
20.6115) out of 1,000. In the demand equation, the average number of left
censored q_i's (q_i = 0) is 372.0040 (SE = 18.5927), and the average number of
right censored q_i's (q_i = 1) is 130.3720 (SE = 11.2154).
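
The column entries in the tables of this appendix can be reproduced from the raw replication output along the following lines. This is a minimal Python sketch, not part of the original GAUSS programs. It assumes that BIAS is the mean estimate minus the true value and that %BIAS is the absolute bias relative to the true value (both agree with the tabulated numbers), that RMSE is taken around the true value, and that ASE is the average of the estimated standard errors. The function name `summarize` and the sample numbers are hypothetical.

```python
import math


def summarize(estimates, std_errors, true_value):
    """Summarise Monte Carlo replications for a single parameter.

    estimates  -- point estimates, one per replication
    std_errors -- estimated standard errors, one per replication
    true_value -- value used to generate the data (must be non-zero)
    """
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - true_value
    pct_bias = abs(bias / true_value) * 100.0
    # Root mean squared error around the true value (assumed definition).
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    # Average of the estimated standard errors (assumed definition of ASE).
    ase = sum(std_errors) / n
    return {"MEAN": mean, "BIAS": bias, "%BIAS": pct_bias,
            "RMSE": rmse, "ASE": ase}


# Four made-up replications of an estimate whose true value is 4.
stats = summarize([4.2, 3.9, 4.5, 4.0], [0.5, 0.4, 0.6, 0.5], true_value=4.0)
```

Comparing RMSE with ASE, as the tables do, indicates whether the estimated standard errors track the actual sampling variability of the estimator.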

Table C.3.1.A Probit estimates without correcting for self-selection bias, ρ = 0.25

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
β0 = 6        6.2406    0.2406   4.01%  0.9353  0.8337
β1 = 4        4.2972    0.2972   7.43%  0.5777  0.4633
β2 = -3      -3.1363   -0.1363   4.54%  0.3961  0.3442

Average log-likelihood = -85.3713 (SE = 10.3680)
D(β);S = 1.9852e-04

Table C.3.1.B Probit demand, correcting for self-selection bias using censored
sample, ρ = 0.25

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
γ0 = 1.5      1.5329    0.0329   2.19%  0.2293  0.2187
γ1 = 1        1.0144    0.0144   1.44%  0.0849  0.0852
γ2 = -3      -3.0501   -0.0501   1.67%  0.2122  0.2005
β0 = 6        6.1967    0.1967   3.28%  0.9208  0.8351
β1 = 4        4.1095    0.1095   2.74%  0.5539  0.5104
β2 = -3      -3.0892   -0.0892   2.97%  0.3863  0.3471
ρ = 0.25      0.2527    0.0027   1.08%  0.3060  0.2918

Average log-likelihood = -308.9247 (SE = 20.4965)
D(γ,ρ);CEN = 6.4142e-08
D(β);CEN = 2.1815e-04
D(γ,β,ρ);CEN = 5.7177e-12


Table C.3.1.C Probit demand, correcting for self-selection using measurement
errors model with μ_j and Σ, ρ = 0.25

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
γ0 = 1.5      1.6115    0.1115   7.43%  0.3756  0.3605
γ1 = 1        1.0449    0.0449   4.49%  0.1473  0.1384
γ2 = -3      -3.1454   -0.1454   4.85%  0.3698  0.3477
β0 = 6        6.2021    0.2021   3.37%  0.9239  0.8415
β1 = 4        4.1132    0.1132   2.83%  0.5560  0.5124
β2 = -3      -3.0912   -0.0912   3.04%  0.3877  0.3490
ρ = 0.25      0.2534    0.0034   1.36%  0.3092  0.2964

Average log-likelihood = -537.6684 (SE = 22.0822)
D(γ,ρ);ME1 = 8.7102e-07
D(β);ME1 = 2.2009e-04
D(γ,β,ρ);ME1 = 7.8470e-11

Table C.3.1.D Probit demand, correcting for self-selection using measurement
errors model with μ_j and Σ_j, ρ = 0.25

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
γ0 = 1.5      1.5746    0.0746   4.97%  0.3539  0.3588
γ1 = 1        1.0335    0.0335   3.35%  0.1382  0.1381
γ2 = -3      -3.1112   -0.1112   3.71%  0.3438  0.3468
β0 = 6        6.2002    0.2002   3.34%  0.9230  0.8498
β1 = 4        4.1111    0.1111   2.78%  0.5554  0.5160
β2 = -3      -3.0904   -0.0904   3.01%  0.3873  0.3539
ρ = 0.25      0.2529    0.0029   1.16%  0.3077  0.2922

Average log-likelihood = -537.7948 (SE = 22.0741)
D(γ,ρ);ME2 = 5.7620e-07
D(β);ME2 = 2.2082e-04
D(γ,β,ρ);ME2 = 5.2055e-11

C.3.2 Estimates from a self-selection model with measurement errors and a
probit demand equation (ρ = 0.5)

Results presented below are based on ρ = 0.5. Estimates are obtained
after 500 replications, and the average number of respondents is 501.0820 (SE =
23.3509) out of 1,000. In the demand equation, the average number of left
censored q_i's (q_i = 0) is 366.9400 (SE = 20.4881), and the average number of
right censored q_i's (q_i = 1) is 134.1420 (SE = 11.8349).


Table C.3.2.A Probit estimates without correcting for self-selection bias, ρ = 0.5

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
β0 = 6        6.2315    0.2315   3.86%  0.8671  0.8428
β1 = 4        4.5082    0.5082  12.71%  0.7114  0.4876
β2 = -3      -3.1767   -0.1767   5.89%  0.3926  0.3519

Average log-likelihood = -83.3600 (SE = 10.2106)
D(β);S = 4.7665e-04

Table C.3.2.B Probit demand, correcting for self-selection bias using censored
sample, ρ = 0.5

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
γ0 = 1.5      1.5067    0.0067   0.45%  0.2121  0.2194
γ1 = 1        1.0173    0.0173   1.73%  0.0847  0.0864
γ2 = -3      -3.0376   -0.0376   1.25%  0.1977  0.2034
β0 = 6        6.1576    0.1576   2.63%  0.8458  0.8762
β1 = 4        4.1689    0.1689   4.22%  0.5658  0.5607
β2 = -3      -3.0930   -0.0930   3.10%  0.3672  0.3718
ρ = 0.5       0.4529   -0.0471   9.42%  0.2570  0.2719

Average log-likelihood = -306.9455 (SE = 19.9936)
D(γ,ρ);CEN = 3.3612e-08
D(β);CEN = 1.8584e-04
D(γ,β,ρ);CEN = 2.6645e-12


Table C.3.2.C Probit demand, correcting for self-selection using measurement
errors model with μ_j and Σ, ρ = 0.5

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
γ0 = 1.5      1.5762    0.0762   5.08%  0.3800  0.3657
γ1 = 1        1.0346    0.0346   3.46%  0.1470  0.1400
γ2 = -3      -3.1046   -0.1046   3.49%  0.3794  0.3516
β0 = 6        6.1572    0.1572   2.62%  0.8451  0.8649
β1 = 4        4.1653    0.1653   4.13%  0.5639  0.5706
β2 = -3      -3.0917   -0.0917   3.06%  0.3661  0.3706
ρ = 0.5       0.4594   -0.0406   8.12%  0.2606  0.2810

Average log-likelihood = -536.5940 (SE = 22.4278)
D(γ,ρ);ME1 = 5.4912e-07
D(β);ME1 = 1.8759e-04
D(γ,β,ρ);ME1 = 4.3700e-11

Table C.3.2.D Probit demand, correcting for self-selection using measurement
errors model with μ_j and Σ_j, ρ = 0.5

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
γ0 = 1.5      1.5398    0.0398   2.65%  0.3624  0.3494
γ1 = 1        1.0226    0.0226   2.26%  0.1368  0.1378
γ2 = -3      -3.0695   -0.0695   2.32%  0.3529  0.3432
β0 = 6        6.1533    0.1533   2.56%  0.8458  0.8823
β1 = 4        4.1607    0.1607   4.02%  0.5634  0.5731
β2 = -3      -3.0900   -0.0900   3.00%  0.3662  0.3764
ρ = 0.5       0.4586   -0.0414   8.28%  0.2591  0.2795

Average log-likelihood = -536.7139 (SE = 22.4111)
D(γ,ρ);ME2 = 3.6906e-07
D(β);ME2 = 1.8893e-04
D(γ,β,ρ);ME2 = 2.9529e-11

C.3.3 Estimates from a self-selection model with measurement errors and a
probit demand equation (ρ = 0.75)

Results presented below are based on ρ = 0.75. Estimates are obtained
after 500 replications, and the average number of respondents is 501.5800 (SE =
22.6946) out of 1,000. In the demand equation, the average number of left
censored q_i's (q_i = 0) is 362.0720 (SE = 19.2150), and the average number of
right censored q_i's (q_i = 1) is 139.5080 (SE = 11.6935).


Table C.3.3.A Probit estimates without correcting for self-selection bias, ρ = 0.75

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
β0 = 6        6.3941    0.3941   6.57%  0.9873  0.8639
β1 = 4        4.7338    0.7338  18.35%  0.9147  0.5130
β2 = -3      -3.2647   -0.2647   8.82%  0.4696  0.3633

Average log-likelihood = -81.9309 (SE = 10.5347)
D(β);S = 9.4296e-04

Table C.3.3.B Probit demand, correcting for self-selection bias using censored
sample, ρ = 0.75

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
γ0 = 1.5      1.5155    0.0155   1.03%  0.2221  0.2164
γ1 = 1        1.0151    0.0151   1.51%  0.0842  0.0858
γ2 = -3      -3.0394   -0.0394   1.31%  0.2093  0.2023
β0 = 6        6.2153    0.2153   3.59%  0.9112  0.8915
β1 = 4        4.1852    0.1852   4.63%  0.6221  0.5865
β2 = -3      -3.1079   -0.1079   3.60%  0.4056  0.3839
ρ = 0.75      0.6729   -0.0771  10.28%  0.2064  0.2232

Average log-likelihood = -304.5480 (SE = 19.9643)
D(γ,ρ);CEN = 4.2098e-08
D(β);CEN = 1.3845e-04
D(γ,β,ρ);CEN = 4.7759e-12


Table C.3.3.C Probit demand, correcting for self-selection using measurement
errors model with μ_j and Σ, ρ = 0.75

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
γ0 = 1.5      1.5747    0.0747   4.98%  0.3503  0.3566
γ1 = 1        1.0448    0.0448   4.48%  0.1481  0.1385
γ2 = -3      -3.1198   -0.1198   3.99%  0.3702  0.3475
β0 = 6        6.2255    0.2255   3.76%  0.9136  0.8503
β1 = 4        4.1936    0.1936   4.84%  0.6260  0.5624
β2 = -3      -3.1121   -0.1121   3.74%  0.4071  0.3655
ρ = 0.75      0.6772   -0.0728   9.71%  0.2047  0.2203

Average log-likelihood = -532.9000 (SE = 22.0023)
D(γ,ρ);ME1 = 5.3192e-07
D(β);ME1 = 1.4048e-04
D(γ,β,ρ);ME1 = 5.8446e-11

Table C.3.3.D Probit demand, correcting for self-selection using measurement
errors model with μ_j and Σ_j, ρ = 0.75

Parameter       MEAN      BIAS   %BIAS    RMSE     ASE
γ0 = 1.5      1.5379    0.0379   2.53%  0.3256  0.3475
γ1 = 1        1.0333    0.0333   3.30%  0.1392  0.1355
γ2 = -3      -3.0855   -0.0855   2.85%  0.3414  0.3384
β0 = 6        6.2195    0.2195   3.66%  0.9117  0.8446
β1 = 4        4.1877    0.1877   4.69%  0.6232  0.5606
β2 = -3      -3.1095   -0.1095   3.65%  0.4060  0.3628
ρ = 0.75      0.6751   -0.0749   9.99%  0.2063  0.2202

Average log-likelihood = -532.9650 (SE = 21.9812)
D(γ,ρ);ME2 = 3.6802e-07
D(β);ME2 = 1.3780e-04
D(γ,β,ρ);ME2 = 4.1849e-11

APPENDIX D

A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS:
SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A LINEAR
DEMAND EQUATION

This GAUSS program is used to conduct Monte Carlo experiments for the
self-selection model with measurement errors and a linear demand equation
(ρ = 0.25).
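
The data-generating step of the program can be sketched in Python as follows. This is an illustration only, not the GAUSS code itself: it draws (x1, x2, x3, e1, e2) from a multivariate normal distribution with the mean vector and covariance matrix that appear in the listing, and forms the latent selection and demand variables S* = 1.5 + x1 - 3 x2 + e1 and D* = 6 + 4 x2 - 3 x3 + e2. The helper names `cholesky_lower`, `draw` and `latent`, the seed, and the sample size of 1,000 are assumptions made for the sketch. GAUSS `chol` returns the upper-triangular factor; the lower factor used here yields the same distribution.

```python
import math
import random

# Mean vector and covariance matrix of (x1, x2, x3, e1, e2) from the listing.
MEANS = [3.0, 1.5, 4.0, 0.0, 0.0]
COV = [
    [1.44, 0.24, 0.096, 0.0, 0.0],
    [0.24, 1.0, 0.24, 0.0, 0.0],
    [0.096, 0.24, 0.64, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.25],
    [0.0, 0.0, 0.0, 0.25, 1.0],
]


def cholesky_lower(a):
    """Return the lower-triangular L with L * L' = a (a positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - acc)
            else:
                low[i][j] = (a[i][j] - acc) / low[j][j]
    return low


def draw(rng, low):
    """Draw one correlated row (x1, x2, x3, e1, e2)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(len(MEANS))]
    return [MEANS[i] + sum(low[i][k] * z[k] for k in range(i + 1))
            for i in range(len(MEANS))]


def latent(row):
    """Latent selection (S*) and demand (D*) variables for one row."""
    x1, x2, x3, e1, e2 = row
    s_star = 1.5 + 1.0 * x1 - 3.0 * x2 + e1
    d_star = 6.0 + 4.0 * x2 - 3.0 * x3 + e2
    return s_star, d_star


rng = random.Random(0)  # arbitrary seed, for repeatability only
low = cholesky_lower(COV)
sample = [draw(rng, low) for _ in range(1000)]
share_responding = sum(1 for r in sample if latent(r)[0] > 0) / len(sample)
```

Because the mean of S* is 1.5 + 3 - 4.5 = 0, about half of the sampled individuals respond, which agrees with the average of roughly 500 respondents out of 1,000 reported in Appendix C.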

new;
use optmum;
output file = tl.out;

iter = 1;
do while iter le 500;

recal:

@--
  npop: population size
  nx:   number of variables (x's and e's)
  nb:   number of blocks (nb)
  bobs: number of obs in a block
  ss:   number of obs used to calculate variance-covariance
  smp:  random sub-sample / population (0 <= smp <= 1)
--@
npop = 10000;
nx = 5;
nb = 250;
bobs = npop / nb;
ss = 200;
smp = 0.1;

@-- draw random sample as population --@
x = rndn(npop,nx);
let v[5,5] = 1.44  0.24 0.096 0    0
             0.24  1    0.24  0    0
             0.096 0.24 0.64  0    0
             0     0    0     1    0.25
             0     0    0     0.25 1;
sqrtv = chol(v);
x = x*sqrtv;
let a[1,5] = 3 1.5 4 0 0;
x = x + a;

evar = vcx(x[1:ss,1 2]);
clear v, sqrtv, a;

@-- divide population into blocks --@
si = 2000;
s = npop / si;
sg = nb / s;
w = eye(sg) .*. ones(bobs,1);

sx = x[1:si,1 2];
msx = (sx'w/bobs)' .*. ones(bobs,1);
vsx = ( ( ((sx-msx)^2)
          ~ ((sx[.,1]-msx[.,1]).*(sx[.,2]-msx[.,2])) )'
        w / (bobs-1) )' .*. ones(bobs,1);
xx = msx~vsx;

i = 2;
do while i <= s;
  l = si*(i-1) + 1;
  h = si * i;
  sx = x[l:h,1 2];
  msx = (sx'w/bobs)' .*. ones(bobs,1);
  vsx = ( ( ((sx-msx)^2)
            ~ ((sx[.,1]-msx[.,1]).*(sx[.,2]-msx[.,2])) )'
          w / (bobs-1) )' .*. ones(bobs,1);
  sx = msx~vsx;
  xx = xx|sx;
  i = i + 1;
endo;

clear i, l, h, msx, vsx, sx;
@-- xx = [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21] --@
x = x~xx;
clear xx;

@--
  extract a random sub-sample from population xx,
  and variables in the random sub-sample are
  [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21].
--@
i = rndu(npop,1);
x = i~x;
data = selif(x,x[.,1] .le smp);
clear x;
data = data[.,2:(cols(data))];
obs = rows(data);

@--
  generate dependent variables S* and D*
  S* = 1.5 + 1*x1 - 3*x2 + e1
  D* = 6 + 4*x2 - 3*x3 + e2
--@
let bs[3,1] = 1.5 1 -3;
let bd[3,1] = 6 4 -3;
s = (ones(obs,1)~data[.,1 2]) * bs + data[.,4];
d = (ones(obs,1)~data[.,2 3]) * bd + data[.,5];

@--
  rearrange data according to
  s > 0:  xyes = [d x1 x2 x3], and
  s <= 0: xno = [m1 m2 v11 v22 v21]
  also data = [s d x1 x2 x3 m1 m2 v11 v22 v21].
--@
data = s~d~data[.,1:3 6:10];
clear s, d;

yes = selif(data, data[.,1] .gt 0);
no = selif(data, data[.,1] .le 0);
clear data;

nyes = rows(yes);
nno = rows(no);

sxyes = ones(nyes,1)~yes[.,3 4];
dxyes = ones(nyes,1)~yes[.,4 5];
d = yes[.,2];
clear yes;

sxno = ones(nno,1)~no[.,3 4];
sxnom = ones(nno,1)~no[.,6 7];
sxnov = no[.,8 9 10];
clear no;

@--
  starting values for
  [s0 s1 s2 d0 d1 d2 sigma^2 rho]
--@
b0 = bs|bd|1|0.25;


@- gradient tolerance (default = 1e-5) --@
@-- _opgtol = 1e-10; --@

@-- OLS of d on dxyes --@
beta = invpd(dxyes'dxyes) * dxyes'd;
s = sumc((d - dxyes * beta)^2) / nyes;
cov = s * invpd(dxyes'dxyes);
stder = sqrt(diag(cov));
output on;
iter ~ nyes ~ beta' ~ stder' ~ s;
output off;

0
’

@-- call and print optmum --@

optset;
_opgdprc = &foc1;
{beta1, f1, g, retcode} = optmum(&fn1, b0);
if retcode ne 0;
  goto recal;
endif;
cov1 = _opfhess;
stder1 = sqrt(diag(cov1));
output on;
(-f1)~beta1'~stder1';
output off;

optset;
_opgdprc = &foc3;
{beta3, f3, g, retcode} = optmum(&fn3, b0);
if retcode ne 0;
  goto recal;
endif;
cov3 = _opfhess;
stder3 = sqrt(diag(cov3));
output on;
(-f3)~beta3'~stder3';
output off;

optset;
_opgdprc = &foc4;
{beta4, f4, g, retcode} = optmum(&fn4, b0);
if retcode ne 0;
  goto recal;
endif;
cov4 = _opfhess;
stder4 = sqrt(diag(cov4));
output on;
(-f4)~beta4'~stder4';
print "";
output off;

@-- procedures --@

@--
  -(log-likelihood) function for a self-selection model
  with measurement errors and a linear demand function
  using censored data (non-respondents' independent
  variables are observed)
--@
proc fn1(para);
  local bbs, bbd, sigma, rho, dyes, kyes, kno, k;

  bbs = para[1 2 3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];

  dyes = d - dxyes * bbd;
  kyes = (1 / sigma) * pdfn(dyes / sigma)
         .* cdfnc( (- sxyes * bbs - rho * dyes / sigma)
                   / sqrt(1 - rho^2) );
  kno = cdfn(- sxno * bbs);
  k = ln(kyes|kno);

  retp(-sumc(k));
endp;

proc foc1(para);
  local bbs, bbd, sigma, rho, f, cyes,
        gp, gc, cno, hp, hc, fbbd, fss,
        gbbd, gss, gbbs, grho, hbbs, k;

  bbs = para[1 2 3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];

  f = (1 / sigma) * pdfn( (d - dxyes * bbd) / sigma);
  cyes = ( - sxyes * bbs
           - rho * (d - dxyes * bbd) / sigma )
         / sqrt(1 - rho^2);
  gp = (1 / sqrt(1 - rho^2)) * pdfn(cyes);
  gc = cdfnc(cyes);
  hp = pdfn(- sxno * bbs);
  hc = cdfn(- sxno * bbs);

  fbbd = sumc(
         ((d - dxyes * bbd) / sigma^2) .* dxyes );
  fss = sumc(
         ((d - dxyes * bbd)^2 - sigma^2)
         / (2 * sigma^4) );
  gbbd = sumc(
         (- gp ./ gc) * (rho / sigma) .* dxyes );
  gss = sumc(
         (- gp ./ gc) .* (rho * (d - dxyes * bbd))
         / (2 * sigma^3) );
  gbbs = sumc( (gp ./ gc) .* sxyes );
  grho = sumc( (- gp ./ gc)
         .* ( -((d - dxyes * bbd) / sigma)
              + (rho * (- sxyes * bbs - rho * (d - dxyes * bbd)
                 / sigma) / (1 - rho^2)) ) );
  hbbs = sumc( (hp ./ hc) .* (- sxno) );

  k = (gbbs'+hbbs') ~ (fbbd + gbbd)'
      ~ (fss + gss) ~ grho;

  retp(-k);
endp;

@--
  -(log-likelihood) function for a self-selection model
  with measurement errors and a linear demand function
  using empirical block mean and estimated variance-
  covariance (200 obs. from the population)
  for measurement error
--@
proc fn3(para);
  local bbs, bbd, sigma, rho, dyes, kyes,
        bbsn, delta, kno, k;

  bbs = para[1 2 3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];

  dyes = d - dxyes * bbd;
  kyes = (1 / sigma) * pdfn(dyes / sigma)
         .* cdfnc( (- sxyes * bbs - rho * dyes / sigma)
                   / sqrt(1 - rho^2) );
  bbsn = bbs[2 3,.];
  delta = sqrt(1 + bbsn'evar*bbsn);
  kno = cdfn(- sxnom * bbs / delta);
  k = ln(kyes|kno);

  retp(-sumc(k));
endp;


proc foc3(para);
  local bbs, b1, b2, b3, bbd, sigma, rho, f, cyes,
        gp, gc, cno, hp, hc, h, fbbd, fss,
        gbbd, gss, gbbs, grho, hb1, hb2, hb3, k;

  bbs = para[1 2 3,.];
  b1 = bbs[1,.];
  b2 = bbs[2,.];
  b3 = bbs[3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];

  f = (1 / sigma) * pdfn( (d - dxyes * bbd) / sigma);
  cyes = ( - sxyes * bbs
           - rho * (d - dxyes * bbd) / sigma )
         / sqrt(1 - rho^2);
  gp = (1 / sqrt(1 - rho^2)) * pdfn(cyes);
  gc = cdfnc(cyes);

  cno = (- sxnom * bbs) ./
        sqrt(1 + b2^2 * evar[1,1] + b3^2 * evar[2,2]
               + 2 * b2 * b3 * evar[1,2]);
  hp = (1 / sqrt(1 + b2^2 * evar[1,1]
                   + b3^2 * evar[2,2]
                   + 2 * b2 * b3 * evar[1,2]))
       .* pdfn(cno);
  hc = cdfn(cno);
  h = 1 + b2^2 * evar[1,1]
        + b3^2 * evar[2,2] + 2 * b2 * b3 * evar[1,2];

  fbbd = sumc(
         ((d - dxyes * bbd) / sigma^2) .* dxyes );
  fss = sumc(
         ((d - dxyes * bbd)^2 - sigma^2)
         / (2 * sigma^4) );
  gbbd = sumc(
         (- gp ./ gc) * (rho / sigma) .* dxyes );
  gss = sumc(
         (- gp ./ gc) .* (rho * (d - dxyes * bbd))
         / (2 * sigma^3) );
  gbbs = sumc( (gp ./ gc) .* sxyes );
  grho = sumc( (- gp ./ gc)
         .* ( -((d - dxyes * bbd) / sigma)
              + (rho * (- sxyes * bbs - rho * (d - dxyes * bbd)
                 / sigma) / (1 - rho^2)) ) );

  hb1 = sumc( - hp ./ hc );
  hb2 = sumc( (hp ./ hc) .* ( - sxnom[.,2]
              + sxnom * bbs .* (2 * b2 * evar[1,1]
              + 2 * b3 * evar[1,2]) ./ (2 * h) ) );
  hb3 = sumc( (hp ./ hc) .* ( - sxnom[.,3]
              + sxnom * bbs .* (2 * b3 * evar[2,2]
              + 2 * b2 * evar[1,2]) ./ (2 * h) ) );

  k = (gbbs'+(hb1~hb2~hb3)) ~ (fbbd + gbbd)'
      ~ (fss + gss) ~ grho;

  retp(-k);
endp;

@--
  -(log-likelihood) function for a self-selection model
  with measurement errors and a linear demand function
  using empirical block mean and empirical block
  variance-covariance for measurement error
--@
proc fn4(para);
  local bbs, bbd, sigma, rho, dyes, kyes,
        bbs1, bbs2, delta, kno, k;

  bbs = para[1 2 3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];

  dyes = d - dxyes * bbd;
  kyes = (1 / sigma) * pdfn(dyes / sigma)
         .* cdfnc( (- sxyes * bbs - rho * dyes / sigma)
                   / sqrt(1 - rho^2) );

  bbs1 = bbs[2,.];
  bbs2 = bbs[3,.];
  delta = sqrt(1 + bbs1^2 * sxnov[.,1]
                 + bbs2^2 * sxnov[.,2]
                 + 2 * bbs1 * bbs2 * sxnov[.,3]);

  kno = cdfn(- sxnom * bbs ./ delta);
  k = ln(kyes|kno);

  retp(-sumc(k));
endp;

proc foc4(para);
  local bbs, b1, b2, b3, bbd, sigma, rho, f, cyes,
        gp, gc, cno, hp, hc, h, fbbd, fss,
        gbbd, gss, gbbs, grho, hb1, hb2, hb3, k;

  bbs = para[1 2 3,.];
  b1 = bbs[1,.];
  b2 = bbs[2,.];
  b3 = bbs[3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];

  f = (1 / sigma) * pdfn( (d - dxyes * bbd) / sigma);
  cyes = ( - sxyes * bbs
           - rho * (d - dxyes * bbd) / sigma )
         / sqrt(1 - rho^2);
  gp = (1 / sqrt(1 - rho^2)) * pdfn(cyes);
  gc = cdfnc(cyes);
  cno = (- sxnom * bbs) ./
        sqrt(1 + b2^2 * sxnov[.,1] + b3^2 * sxnov[.,2]
               + 2 * b2 * b3 * sxnov[.,3]);
  hp = (1 / sqrt(1 + b2^2 * sxnov[.,1]
                   + b3^2 * sxnov[.,2]
                   + 2 * b2 * b3 * sxnov[.,3]))
       .* pdfn(cno);
  hc = cdfn(cno);
  h = 1 + b2^2 * sxnov[.,1]
        + b3^2 * sxnov[.,2] + 2 * b2 * b3 * sxnov[.,3];

  fbbd = sumc(
         ((d - dxyes * bbd) / sigma^2) .* dxyes );
  fss = sumc(
         ((d - dxyes * bbd)^2 - sigma^2)
         / (2 * sigma^4) );
  gbbd = sumc(
         (- gp ./ gc) * (rho / sigma) .* dxyes );
  gss = sumc(
         (- gp ./ gc) .* (rho * (d - dxyes * bbd))
         / (2 * sigma^3) );
  gbbs = sumc( (gp ./ gc) .* sxyes );
  grho = sumc( (- gp ./ gc)
         .* ( -((d - dxyes * bbd) / sigma)
              + rho * (- sxyes * bbs - rho * (d - dxyes * bbd)
                / sigma) / (1 - rho^2) ) );
  hb1 = sumc( - hp ./ hc );
  hb2 = sumc( (hp ./ hc) .* ( - sxnom[.,2]
              + sxnom * bbs .* (2 * b2 * sxnov[.,1]
              + 2 * b3 * sxnov[.,3]) ./ (2 * h) ) );
  hb3 = sumc( (hp ./ hc) .* ( - sxnom[.,3]
              + sxnom * bbs .* (2 * b3 * sxnov[.,2]
              + 2 * b2 * sxnov[.,3]) ./ (2 * h) ) );

  k = (gbbs'+(hb1~hb2~hb3)) ~ (fbbd + gbbd)'
      ~ (fss + gss) ~ grho;

  retp(-k);
endp;

iter = iter + 1;
endo;

system;
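
The likelihood evaluated by the censored-sample procedure for the linear demand equation can be written out for a single observation as in the Python sketch below. It is a reading aid rather than a translation of the GAUSS procedure: a respondent contributes the normal density of the demand residual times the conditional probability of responding given that residual, and a non-respondent contributes the probability of not responding. The function names and argument order are assumptions made for the sketch, and the expression `norm_cdf(arg)` uses the identity 1 - Φ(c) = Φ(-c) in place of GAUSS `cdfnc`.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def dot(coefs, xs):
    return sum(c * x for c, x in zip(coefs, xs))


def loglik_respondent(d, xd, xs, beta, gamma, sigma2, rho):
    """Log-likelihood contribution of a respondent with observed demand d.

    xd, xs -- regressors (including the constant) of the demand and
              self-selection equations
    sigma2 -- variance of the demand error; rho -- error correlation
    """
    sigma = math.sqrt(sigma2)
    resid = d - dot(beta, xd)
    arg = (dot(gamma, xs) + rho * resid / sigma) / math.sqrt(1.0 - rho * rho)
    return math.log(norm_pdf(resid / sigma) / sigma) + math.log(norm_cdf(arg))


def loglik_nonrespondent(xs, gamma):
    """Log-likelihood contribution of a non-respondent (S* <= 0)."""
    return math.log(norm_cdf(-dot(gamma, xs)))


# With rho = 0 the two equations separate into a density and a probit term.
base = loglik_respondent(0.0, [1, 0, 0], [1, 0, 0],
                         [0, 0, 0], [0, 0, 0], 1.0, 0.0)
```

The sample log-likelihood is the sum of these contributions over all respondents and non-respondents; the GAUSS procedures return its negative because the optimizer minimizes.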

APPENDIX E

A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS:
SELF -SELECTION MODEL WITH MEASUREMENT ERRORS AND A TOBIT
DEMAND EQUATION

This GAUSS program is used to conduct Monte Carlo experiments for the
self-selection model with measurement errors and a Tobit demand equation
(ρ = 0.25).
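
With a Tobit demand equation each sampled individual falls into one of three regimes, which the program stores as `no`, `yes0` and `yes1`. The Python sketch below illustrates that classification and the observed demand under each regime. The function names are hypothetical, and the sketch assumes, as the listing does, that demand is censored at zero and is unobserved for non-respondents.

```python
def classify(s_star, d_star):
    """Return the regime of one individual.

    'no'   -- non-respondent (S* <= 0): demand is not observed
    'yes0' -- respondent with censored demand (D* <= 0)
    'yes1' -- respondent with positive demand (D* > 0)
    """
    if s_star <= 0:
        return "no"
    return "yes1" if d_star > 0 else "yes0"


def observed_demand(s_star, d_star):
    """Observed demand: None for non-respondents, else max(D*, 0)."""
    if s_star <= 0:
        return None
    return max(d_star, 0.0)


# Three made-up individuals, one per regime.
latent_pairs = [(-0.4, 2.0), (1.2, -0.7), (0.3, 1.5)]
regimes = [classify(s, d) for s, d in latent_pairs]
demands = [observed_demand(s, d) for s, d in latent_pairs]
```

Each regime enters the likelihood differently: `yes1` through a density, `yes0` through a bivariate normal probability, and `no` through a univariate normal probability.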

new;
use optmum;
output file = tt.out;

@-- declare global variables --@
declare matrix g_bbd, g_bbs, g_sigma, g_rho;
external matrix g_bbd, g_bbs, g_sigma, g_rho;

iter = 1;
do while iter le 500;

recal:

@--
  npop: population size
  nx:   number of variables (x's and e's)
  nb:   number of blocks (nb)
  bobs: number of obs in a block
  ss:   number of obs used to calculate variance-covariance
  smp:  random sub-sample / population (0 <= smp <= 1)
--@
npop = 10000;
nx = 5;
nb = 250;
bobs = npop / nb;
ss = 200;
smp = 0.1;

@-- draw random sample as population --@
x = rndn(npop,nx);
let v[5,5] = 1.44  0.24 0.096 0    0
             0.24  1    0.24  0    0
             0.096 0.24 0.64  0    0
             0     0    0     1    0.25
             0     0    0     0.25 1;
sqrtv = chol(v);
x = x*sqrtv;
let a[1,5] = 3 1.5 4 0 0;
x = x + a;

evar = vcx(x[1:ss,1 2]);
clear v, sqrtv, a;

@-- divide population into blocks --@
si = 2000;
s = npop / si;
sg = nb / s;
w = eye(sg) .*. ones(bobs,1);

sx = x[1:si,1 2];
msx = (sx'w/bobs)' .*. ones(bobs,1);
vsx = ( ( ((sx-msx)^2)
          ~ ((sx[.,1]-msx[.,1]).*(sx[.,2]-msx[.,2])) )'
        w / (bobs-1) )' .*. ones(bobs,1);
xx = msx~vsx;

i = 2;
do while i <= s;
  l = si*(i-1) + 1;
  h = si * i;
  sx = x[l:h,1 2];
  msx = (sx'w/bobs)' .*. ones(bobs,1);
  vsx = ( ( ((sx-msx)^2)
            ~ ((sx[.,1]-msx[.,1]).*(sx[.,2]-msx[.,2])) )'
          w / (bobs-1) )' .*. ones(bobs,1);
  sx = msx~vsx;
  xx = xx|sx;
  i = i + 1;
endo;

clear i, l, h, msx, vsx, sx;

@-- xx = [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21] --@
x = x~xx;
clear xx;

@--
  extract a random sub-sample from population xx,
  and variables in the random sub-sample are
  [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21].
--@
i = rndu(npop,1);
x = i~x;
data = selif(x,x[.,1] .le smp);
clear x;
data = data[.,2:(cols(data))];
obs = rows(data);

@--
  generate dependent variables S* and D*
  S* = 1.5 + 1*x1 - 3*x2 + e1
  D* = 6 + 4*x2 - 3*x3 + e2
--@
let bs[3,1] = 1.5 1 -3;
let bd[3,1] = 6 4 -3;
s = (ones(obs,1)~data[.,1 2]) * bs + data[.,4];
d = (ones(obs,1)~data[.,2 3]) * bd + data[.,5];

@--
  rearrange data according to
  s > 0:  xyes = [d x1 x2 x3], and
  s <= 0: xno = [m1 m2 v11 v22 v21]
  also data = [s d x1 x2 x3 m1 m2 v11 v22 v21].
--@
data = s~d~data[.,1:3 6:10];
clear s, d;

s1 = (data[.,1] .gt 0) .and (data[.,2] .gt 0);
s0 = (data[.,1] .gt 0) .and (data[.,2] .le 0);
yes1 = selif(data,s1);
yes0 = selif(data,s0);
no = selif(data, data[.,1] .le 0);
clear s1,s0,data;

nno = rows(no);
nyes1 = rows(yes1);
nyes0 = rows(yes0);
nyes = nyes1 + nyes0;

sxyes1 = ones(nyes1,1)~yes1[.,3 4];
dxyes1 = ones(nyes1,1)~yes1[.,4 5];
d = yes1[.,2];

sxyes0 = ones(nyes0,1)~yes0[.,3 4];
dxyes0 = ones(nyes0,1)~yes0[.,4 5];
clear yes1, yes0;

sxno = ones(nno,1)~no[.,3 4];
sxnom = ones(nno,1)~no[.,6 7];
sxnov = no[.,8 9 10];
clear no;

@--
  starting values for
  [s0 s1 s2 d0 d1 d2 sigma^2 rho]
--@
bt = bd|1;
b0 = bs|bd|1|0.25;

@-- call and print optmum --@

optset;
_opgdprc = &foc1;
{beta1, f1, g, retcode} = optmum(&fn1, b0);
if retcode ne 0;
  goto recal;
endif;
cov1 = invpd(hessp(&fn1,beta1));
stder1 = sqrt(diag(cov1));
pout = (-f1)~beta1'~stder1';

optset;
_opgdprc = &foc3;
{beta3, f3, g, retcode} = optmum(&fn3, beta1);
if retcode ne 0;
  goto recal;
endif;
cov3 = invpd(hessp(&fn3,beta3));
stder3 = sqrt(diag(cov3));
pout = pout~(-f3)~beta3'~stder3';

optset;
_opgdprc = &foc4;
{beta4, f4, g, retcode} = optmum(&fn4, beta3);
if retcode ne 0;
  goto recal;
endif;
cov4 = invpd(hessp(&fn4,beta4));
stder4 = sqrt(diag(cov4));
pout = pout~(-f4)~beta4'~stder4';

optset;
_opgdprc = &foc0;
{beta0, f0, g, retcode} = optmum(&fn0, bt);
if retcode ne 0;
  goto recal;
endif;
cov0 = invpd(hessp(&fn0,beta0));
stder0 = sqrt(diag(cov0));
output on;
iter~nyes0~nyes1~nyes~nno~(-f0)~beta0'~stder0'~pout;
print "";
output off;

@-- procedures --@

@--
  -(log-likelihood) function for a Tobit demand function
  using data from only the respondents
--@
proc fn0(para);
  local bbd, sigma, l1, l2;
  bbd = para[1 2 3,.];
  sigma = sqrt(para[4,.]);
  l1 = (1/sigma) * pdfn((d - dxyes1 * bbd) / sigma);
  l2 = cdfn( (- dxyes0 * bbd) / sigma );
  retp( - sumc( ln(l1|l2) ) );
endp;

proc foc0(para);
  local bbd, s, fb, fs2, f;
  bbd = para[1 2 3,.];
  s = sqrt(para[4,.]);
fb = sumc( ( dfn( (-dxyesO‘bbd) s)
./ cdlgx ggxyesO‘bbd) / s )
." es s ’
- (£71,302) ' d-dxyesl'bbd )’ dxyesl;
st = sumc( 0.5 ' ( dfn (%yesO‘bbd) / s )
./ cdfn( ( es "' s) )
.' ( (~dxyesO‘bbd) s 3) ) )’
+( (n (821 /4£)2.)8A2()d dxy 1 bbd) (d dxy 1 bbd)
-1 ’s" '- es' ’- es“ ;
f = fb~fs2;
  retp(-f);
endp;

@--
  -(log-likelihood) function for a self-selection model
  with a Tobit demand function using censored data
  (non-respondents' independent variables are observed)
--@
proc fn1(para);
  local bbs, bbd, sigma, rho, no, yes0, yes1;
  bbs = para[1 2 3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];
  no = cdfn( - sxno * bbs );
  yes0 = cdfn( (- dxyes0 * bbd) / sigma ) -
         cdfbvn( ((- dxyes0 * bbd) / sigma),
                 (- sxyes0 * bbs), rho);
  yes1 = (1 / sigma) * pdfn((d - dxyes1 * bbd) / sigma)
         .* cdfnc( (- sxyes1 * bbs -
                    (rho / sigma) * (d - dxyes1 * bbd) )
                   / sqrt( 1 - rho^2) );
  retp(-sumc( ln(no|yes0|yes1) ));
endp;


proc foc1(para);
  local bbs, bbd, sigma, rho, f, cyes1,
        gp, gc, cno, hp, hc, fbbd, fss,
        gbbd, gss, gbbs, grho, hbbs, k,
        ip, ic, ibc, iyes0a, iyes0b,
        ibbd, iss, ibbs, bvd, irho;

  bbs = para[1 2 3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];
  f = (1 / sigma) * pdfn( (d - dxyes1 * bbd) / sigma);
  cyes1 = ( - sxyes1 * bbs
            - rho * (d - dxyes1 * bbd) / sigma )
          / sqrt(1 - rho^2);
  gp = (1 / sqrt(1 - rho^2)) * pdfn(cyes1);
  gc = cdfnc(cyes1);
  hp = pdfn( - sxno * bbs);
  hc = cdfn( - sxno * bbs);

  fbbd = sumc(
         ((d - dxyes1 * bbd) / sigma^2) .* dxyes1 );
  fss = sumc(
         ((d - dxyes1 * bbd)^2 - sigma^2)
         / (2 * sigma^4) );
  gbbd = sumc(
         (- gp ./ gc) * (rho / sigma) .* dxyes1 );
  gss = sumc(
         (- gp ./ gc) .* (rho * (d - dxyes1 * bbd))
         / (2 * sigma^3) );
  gbbs = sumc( (gp ./ gc) .* sxyes1 );
  grho = sumc( (- gp ./ gc)
         .* ( -((d - dxyes1 * bbd) / sigma)
              + (rho * (- sxyes1 * bbs - rho * (d - dxyes1 * bbd)
                 / sigma) / (1 - rho^2)) ) );
  hbbs = sumc( (hp ./ hc) .* (- sxno) );

  g_bbs = bbs;
  g_bbd = bbd;
  g_sigma = sigma;
  g_rho = rho;
  ip = pdfn(- dxyes0 * bbd / sigma);
  ic = cdfn(- dxyes0 * bbd / sigma);
  ibc = ic -
        cdfbvn( ((- dxyes0 * bbd) / sigma),
                (- sxyes0 * bbs), rho);
  iyes0a = ( - sxyes0 * bbs
             - rho * (- dxyes0 * bbd) / sigma)
           / sqrt(1 - rho^2);
  iyes0b = ( - dxyes0 * bbd
             - rho * sigma * (- sxyes0 * bbs) )
           / sqrt(1 - rho^2);
  ibbd = sumc( ( ip .* (- dxyes0 / sigma)
                 - (1 / sigma) * ip .* cdfn(iyes0a)
                 .* (- dxyes0 / sigma) ) ./ ibc );
  iss = sumc( ( ip * .5 .* (dxyes0 * bbd / sigma^(3/2))
                - (1 / sigma) * ip .* cdfn(iyes0a) * (-.5)
                .* (dxyes0 * bbd / sigma^(3/2)) )
              ./ ibc );
  ibbs = sumc( - pdfn(- sxyes0 * bbs) .* cdfn(iyes0b)
               .* (- sxyes0) ./ ibc );
  bvd = gradp(&bvn, g_rho);
  irho = sumc( - bvd ./ ibc);

  k = (gbbs' + hbbs' + ibbs') ~ (fbbd + gbbd + ibbd)'
      ~ (fss + gss + iss) ~ (grho + irho);

  retp(-k);
endp;

@--
  -(log-likelihood) function for a self-selection model
  with measurement errors and a Tobit demand function
  using empirical block mean and estimated variance-
  covariance (200 obs. from the population)
  for measurement error
--@
proc fn3(para);
  local bbs, bbd, sigma, rho, bbsn, delta,
        no, yes0, yes1;
  bbs = para[1 2 3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];
  bbsn = bbs[2 3,.];
  delta = sqrt(1 + bbsn'evar*bbsn);
  no = cdfn( - sxnom * bbs / delta );
  yes0 = cdfn( (- dxyes0 * bbd) / sigma ) -
         cdfbvn( ((- dxyes0 * bbd) / sigma),
                 (- sxyes0 * bbs), rho);
  yes1 = (1 / sigma) * pdfn((d - dxyes1 * bbd) / sigma)
         .* cdfnc( (- sxyes1 * bbs -
                    (rho / sigma) * (d - dxyes1 * bbd) )
                   / sqrt( 1 - rho^2) );
  retp(-sumc( ln(no|yes0|yes1) ));
endp;

proc foc3(para);
  local bbs, b1, b2, b3, bbd, sigma, rho, f, cyes1,
        gp, gc, cno, hp, hc, h, fbbd, fss,
        gbbd, gss, gbbs, grho, hb1, hb2, hb3, k,
        ip, ic, ibc, iyes0a, iyes0b,
        ibbd, iss, ibbs, bvd, irho;

  bbs = para[1 2 3,.];
  b1 = bbs[1,.];
  b2 = bbs[2,.];
  b3 = bbs[3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];
  f = (1 / sigma) * pdfn( (d - dxyes1 * bbd) / sigma);
  cyes1 = ( - sxyes1 * bbs
            - rho * (d - dxyes1 * bbd) / sigma)
          / sqrt(1 - rho^2);
  gp = (1 / sqrt(1 - rho^2)) * pdfn(cyes1);
  gc = cdfnc(cyes1);
  cno = (- sxnom * bbs) ./
        sqrt(1 + b2^2 * evar[1,1] + b3^2 * evar[2,2]
               + 2 * b2 * b3 * evar[1,2]);
  hp = (1 / sqrt(1 + b2^2 * evar[1,1]
                   + b3^2 * evar[2,2]
                   + 2 * b2 * b3 * evar[1,2]))
       .* pdfn(cno);
  hc = cdfn(cno);
  h = 1 + b2^2 * evar[1,1]
        + b3^2 * evar[2,2] + 2 * b2 * b3 * evar[1,2];

  fbbd = sumc(
         ((d - dxyes1 * bbd) / sigma^2) .* dxyes1 );
  fss = sumc(
         ((d - dxyes1 * bbd)^2 - sigma^2)
         / (2 * sigma^4) );
  gbbd = sumc(
         (- gp ./ gc) * (rho / sigma) .* dxyes1 );
  gss = sumc(
         (- gp ./ gc) .* (rho * (d - dxyes1 * bbd))
         / (2 * sigma^3) );
  gbbs = sumc( (gp ./ gc) .* sxyes1 );
  grho = sumc( (- gp ./ gc)
         .* ( -((d - dxyes1 * bbd) / sigma)
              + (rho * (- sxyes1 * bbs - rho * (d - dxyes1 * bbd)
                 / sigma) / (1 - rho^2)) ) );
  hb1 = sumc( - hp ./ hc );
  hb2 = sumc( (hp ./ hc) .* ( - sxnom[.,2]
              + sxnom * bbs .* (2 * b2 * evar[1,1]
              + 2 * b3 * evar[1,2]) ./ (2 * h) ) );
  hb3 = sumc( (hp ./ hc) .* ( - sxnom[.,3]
              + sxnom * bbs .* (2 * b3 * evar[2,2]
              + 2 * b2 * evar[1,2]) ./ (2 * h) ) );

  g_bbs = bbs;
  g_bbd = bbd;
  g_sigma = sigma;
  g_rho = rho;
  ip = pdfn(- dxyes0 * bbd / sigma);
  ic = cdfn(- dxyes0 * bbd / sigma);
  ibc = ic -
        cdfbvn( ((- dxyes0 * bbd) / sigma),
                (- sxyes0 * bbs), rho);
  iyes0a = ( - sxyes0 * bbs
             - rho * (- dxyes0 * bbd) / sigma)
           / sqrt(1 - rho^2);
  iyes0b = ( - dxyes0 * bbd
             - rho * sigma * (- sxyes0 * bbs) )
           / sqrt(1 - rho^2);
  ibbd = sumc( ( ip .* (- dxyes0 / sigma)
                 - (1 / sigma) * ip .* cdfn(iyes0a)
                 .* (- dxyes0 / sigma) ) ./ ibc );
  iss = sumc( ( ip * .5 .* (dxyes0 * bbd / sigma^(3/2))
                - (1 / sigma) * ip .* cdfn(iyes0a) * (-.5)
                .* (dxyes0 * bbd / sigma^(3/2)) )
              ./ ibc );
  ibbs = sumc( - pdfn(- sxyes0 * bbs) .* cdfn(iyes0b)
               .* (- sxyes0) ./ ibc );
  bvd = gradp(&bvn, g_rho);
  irho = sumc( - bvd ./ ibc);

  k = (gbbs'+(hb1~hb2~hb3)+ibbs') ~ (fbbd + gbbd + ibbd)'
      ~ (fss + gss + iss) ~ (grho + irho);

  retp(-k);
endp;

@--
  -(log-likelihood) function for a self-selection model
  with measurement errors and a Tobit demand function
  using empirical block mean and empirical block
  variance-covariance for measurement error
--@
proc fn4(para);
  local bbs, bbd, sigma, rho, bbs1, bbs2,
        delta, no, yes0, yes1;
  bbs = para[1 2 3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];
  bbs1 = bbs[2,.];
  bbs2 = bbs[3,.];
  delta = sqrt(1 + bbs1^2 * sxnov[.,1]
                 + bbs2^2 * sxnov[.,2]
                 + 2 * bbs1 * bbs2 * sxnov[.,3]);
  no = cdfn( - sxnom * bbs ./ delta );
  yes0 = cdfn( (- dxyes0 * bbd) / sigma ) -
         cdfbvn( ((- dxyes0 * bbd) / sigma),
                 (- sxyes0 * bbs), rho);
  yes1 = (1 / sigma) * pdfn((d - dxyes1 * bbd) / sigma)
         .* cdfnc( (- sxyes1 * bbs -
                    (rho / sigma) * (d - dxyes1 * bbd) )
                   / sqrt( 1 - rho^2) );
  retp(-sumc( ln(no|yes0|yes1) ));
endp;

proc foc4(para);
  local bbs, b1, b2, b3, bbd, sigma, rho, f, cyes1,
        gp, gc, cno, hp, hc, h, fbbd, fss,
        gbbd, gss, gbbs, grho, hb1, hb2, hb3, k,
        ip, ic, ibc, iyes0a, iyes0b,
        ibbd, iss, ibbs, bvd, irho;

  bbs = para[1 2 3,.];
  b1 = bbs[1,.];
  b2 = bbs[2,.];
  b3 = bbs[3,.];
  bbd = para[4 5 6,.];
  sigma = sqrt(para[7,.]);
  rho = para[8,.];
  f = (1 / sigma) * pdfn( (d - dxyes1 * bbd) / sigma);
  cyes1 = ( - sxyes1 * bbs
            - rho * (d - dxyes1 * bbd) / sigma)
          / sqrt(1 - rho^2);
  gp = (1 / sqrt(1 - rho^2)) * pdfn(cyes1);
  gc = cdfnc(cyes1);
  cno = (- sxnom * bbs) ./
        sqrt(1 + b2^2 * sxnov[.,1] + b3^2 * sxnov[.,2]
               + 2 * b2 * b3 * sxnov[.,3]);
  hp = (1 / sqrt(1 + b2^2 * sxnov[.,1]
                   + b3^2 * sxnov[.,2]
                   + 2 * b2 * b3 * sxnov[.,3]))
       .* pdfn(cno);
  hc = cdfn(cno);
  h = 1 + b2^2 * sxnov[.,1]
        + b3^2 * sxnov[.,2] + 2 * b2 * b3 * sxnov[.,3];

  fbbd = sumc(
         ((d - dxyes1 * bbd) / sigma^2) .* dxyes1 );
  fss = sumc(
         ((d - dxyes1 * bbd)^2 - sigma^2)
         / (2 * sigma^4) );
  gbbd = sumc(
         (- gp ./ gc) * (rho / sigma) .* dxyes1 );
  gss = sumc(
         (- gp ./ gc) .* (rho * (d - dxyes1 * bbd))
         / (2 * sigma^3) );
  gbbs = sumc( (gp ./ gc) .* sxyes1 );
  grho = sumc( (- gp ./ gc)
         .* ( -((d - dxyes1 * bbd) / sigma)
              + (rho * (- sxyes1 * bbs - rho * (d - dxyes1 * bbd)
                 / sigma) / (1 - rho^2)) ) );
  hb1 = sumc( - hp ./ hc );
  hb2 = sumc( (hp ./ hc) .* ( - sxnom[.,2]
              + sxnom * bbs .* (2 * b2 * sxnov[.,1]
              + 2 * b3 * sxnov[.,3]) ./ (2 * h) ) );
  hb3 = sumc( (hp ./ hc) .* ( - sxnom[.,3]
              + sxnom * bbs .* (2 * b3 * sxnov[.,2]
              + 2 * b2 * sxnov[.,3]) ./ (2 * h) ) );

  g_bbs = bbs;
  g_bbd = bbd;
  g_sigma = sigma;
  g_rho = rho;
  ip = pdfn(- dxyes0 * bbd / sigma);
  ic = cdfn(- dxyes0 * bbd / sigma);
  ibc = ic -
        cdfbvn( ((- dxyes0 * bbd) / sigma),
                (- sxyes0 * bbs), rho);
  iyes0a = ( - sxyes0 * bbs
             - rho * (- dxyes0 * bbd) / sigma)
           / sqrt(1 - rho^2);
  iyes0b = ( - dxyes0 * bbd
             - rho * sigma * (- sxyes0 * bbs) )
           / sqrt(1 - rho^2);
  ibbd = sumc( ( ip .* (- dxyes0 / sigma)
                 - (1 / sigma) * ip .* cdfn(iyes0a)
                 .* (- dxyes0 / sigma) ) ./ ibc );
  iss = sumc( ( ip * .5 .* (dxyes0 * bbd / sigma^(3/2))
                - (1 / sigma) * ip .* cdfn(iyes0a) * (-.5)
                .* (dxyes0 * bbd / sigma^(3/2)) )
              ./ ibc );
  ibbs = sumc( - pdfn(- sxyes0 * bbs) .* cdfn(iyes0b)
               .* (- sxyes0) ./ ibc );
  bvd = gradp(&bvn, g_rho);
  irho = sumc( - bvd ./ ibc);

  k = (gbbs'+(hb1~hb2~hb3)+ibbs') ~ (fbbd + gbbd + ibbd)'
      ~ (fss + gss + iss) ~ (grho + irho);

  retp(-k);
endp;

proc bvn(r);
  local r0;
  r0 = r;
  retp( cdfbvn( ((- dxyes0 * g_bbd) / g_sigma),
                (- sxyes0 * g_bbs), r0) );
endp;

iter = iter + 1;
endo;

system;
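
The measurement-error procedures in the program above differ from the censored-sample procedure only in the term for non-respondents: the individual regressors are replaced by block means, and the probit index is divided by delta = sqrt(1 + b'Σb), where b holds the slope coefficients of the self-selection equation and Σ is the variance-covariance matrix of the measurement errors. The Python sketch below illustrates this adjustment. The function names are hypothetical, and the numerical values are taken from the data-generating design purely for illustration.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def me_scale(slopes, cov):
    """Return delta = sqrt(1 + b' * cov * b) for slope vector b."""
    n = len(slopes)
    quad = sum(slopes[i] * cov[i][j] * slopes[j]
               for i in range(n) for j in range(n))
    return math.sqrt(1.0 + quad)


def prob_nonresponse(gamma, block_means, cov):
    """P(S* <= 0) when only block means of the regressors are observed.

    gamma       -- [constant, slope_1, slope_2, ...] of the selection equation
    block_means -- block means standing in for the unobserved regressors
    cov         -- variance-covariance matrix of the measurement errors
    """
    index = gamma[0] + sum(g * m for g, m in zip(gamma[1:], block_means))
    return norm_cdf(-index / me_scale(gamma[1:], cov))


# Illustration with the true selection coefficients and the population
# covariance of (x1, x2) used in the data-generating step.
gamma_true = [1.5, 1.0, -3.0]
cov_x = [[1.44, 0.24], [0.24, 1.0]]
delta = me_scale(gamma_true[1:], cov_x)
p_no = prob_nonresponse(gamma_true, [3.0, 1.5], cov_x)
```

In this illustration b'Σb = 1.44 - 1.44 + 9 = 9, so delta = sqrt(10): the index for non-respondents is shrunk toward zero because block means carry less information than individual values.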

APPENDIX F

A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS:
SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A PROBIT
DEMAND EQUATION

This GAUSS program is used to conduct Monte Carlo experiments for the
self-selection model with measurement errors and a probit demand equation
(ρ = 0.25).
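
For the probit demand equation only the sign of the latent demand is observed. The uncorrected benchmark, estimated from respondents only, maximizes an ordinary probit log-likelihood, which can be sketched in Python as follows. The function name `probit_negloglik` and the toy data are assumptions of the sketch; like the GAUSS procedures, the function returns the negative log-likelihood because the optimizer minimizes.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def probit_negloglik(beta, rows):
    """Negative probit log-likelihood.

    beta -- coefficients, constant first
    rows -- iterable of (q, x): q is 1 if latent demand is positive and
            0 otherwise; x is the regressor list including the constant
    """
    total = 0.0
    for q, x in rows:
        index = sum(b * v for b, v in zip(beta, x))
        # P(q = 1) = Phi(index); P(q = 0) = Phi(-index).
        total += math.log(norm_cdf(index if q == 1 else -index))
    return -total


# Two made-up respondents evaluated at beta = 0, where each outcome
# has probability one half.
toy_rows = [(1, [1.0, 1.5, 4.0]), (0, [1.0, 1.2, 4.4])]
value_at_zero = probit_negloglik([0.0, 0.0, 0.0], toy_rows)
```

Ignoring the self-selection equation in this way is what the tables labelled "without correcting for self-selection bias" evaluate.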

new;
use optmum;
output ﬁle = tp.out;

@-- declare global variables -@
declare matnx g_bbd, g_bbs, g_rho;
external matrix g_bbd, g_bbs, g_rho;

iter = 1;
do while iter le 500;

recal:

@--
npop: population size
nx: number of variables (x’s and e’s)
nb: number of blocks (nb)
bobs: number of obs in a block . .
ss: number of obs used to calculate. vanance-covanance
sag: random sub-sample / populatlon (0 < = smp = 1)
npop = 10000;
1111 = 5;
Ebb: 250‘ / b
o s = npop n ;
$5 = 200;
smp = 0.1;

123

124

@-- draw random sample as population --@
x = rndn(npop, nx);
1etv[5,5]=1.44 0.24 0.096 0 O
0. 24 l O. 24 0 O
0. 096 O. 24 O. 64 0 O
0 O 0 1 0.25
O 0 0 0.25 1;
sqrtv = chol(v);

x=x‘ sqrtv;
leta[1,5]=31.5400;
x=x+a;

evar = vcx(x[1:ss,1 2]);
clear v, sqrtv, a;

@-- divide population into blocks --@
si = 2000;
s = npop / si;
sg = nb s;
= eye sg). ‘ .ones(bobs, 1);
= x[1.51,12];
msx = sx’w/bobs)2).'. .,ones(bobs 1);
= ( ((Sx-msx )
sx[ 1]-msx[.,1]).‘(SXﬁ&.Og-s’1),msxl 21»)
w (bo obs-1) )’. " .oncs
riot-=2msx~vsx;
do while 1 <— - s;
1=si"(i-1)+1;

~

h = si ' i;
= x[l: h, 1 2;

msx= (sx’w/ obsyz’.‘ .,ones(bobs 1);
(«1711- [)11) :(sx 21- msxt 21)»
5x -msx

$70) (bobs-1)). ' .ones b5 ,;1)

sx= msx~vsx;

xx= xxlsx;

1 = i + 1,

endo;

clear 1, l, h, msx, vsx, sx;

@--xx =[x1 x2 x3 e1 e2 m1 m2 v11 v22 v21] --@
x = x~xx;
clear xx;

@--
extract a random sub-sample from population xx,
and variables 1n the random sub-sample are

[x1 x2 x3 e1 e2 m1 m2 v11v22 v21].

125

i = mdU(nP0P.l);

x = i~x;

data = selif(x,x[.,1] .le smp);
clear x;

data = dataJ.,2:(cols(data))];
obs = rows data);

@--
generate dependent variables S* and D*
S* = 1.5 + 1*x1 - 3*x2 + e1
D* = 6 + 4*x2 - 3*x3 + e2
--@
let bs[3,1] = 1.5 1 -3;
let bd[3,1] = 6 4 -3;

s = (ones(obs,1)~data[.,1 2]) * bs + data[.,4];
d = (ones(obs,1)~data[.,2 3]) * bd + data[.,5];

@--
rearrange data according to
s > 0: xyes = [d x1 x2 x3], and
s <= 0: xno = [m1 m2 v11 v22 v21]
also data = [s d x1 x2 x3 m1 m2 v11 v22 v21].
--@
data = s~d~data[.,1:3 6:10];

clear s, d;

s1 = (data[.,1] .gt 0) .and (data[.,2] .gt 0);
s0 = (data[.,1] .gt 0) .and (data[.,2] .le 0);
yes1 = selif(data,s1);
yes0 = selif(data,s0);
no = selif(data, data[.,1] .le 0);
clear s1, s0, data;

nno = rows(no);
nyes1 = rows(yes1);
nyes0 = rows(yes0);
nyes = nyes1 + nyes0;

sxyes1 = ones(nyes1,1)~yes1[.,3 4];
dxyes1 = ones(nyes1,1)~yes1[.,4 5];
sxyes0 = ones(nyes0,1)~yes0[.,3 4];
dxyes0 = ones(nyes0,1)~yes0[.,4 5];
clear yes1, yes0;

sxno = ones(nno,1)~no[.,3 4];
sxnom = ones(nno,1)~no[.,6 7];
sxnov = no[.,8 9 10];

clear no;

@--
starting values for
[s0 s1 s2 d0 d1 d2 sigma^2 rho]
--@
bp = bd;
b0 = bs|bd|0.25;

@-- call and print optmum --@

optset;

_opgdprc = &foc1;
{beta1, f1, g, retcode} = optmum(&fn1, b0);
if retcode ne 0;
goto recal;
endif;
cov1 = invpd(hessp(&fn1, beta1));
stder1 = sqrt(diag(cov1));
pout = (-f1)~beta1'~stder1';

 

optset;
_opgdprc = &foc3;
{beta3, f3, g, retcode} = optmum(&fn3, beta1);
if retcode ne 0;
goto recal;
endif;
cov3 = invpd(hessp(&fn3, beta3));
stder3 = sqrt(diag(cov3));
pout = pout~(-f3)~beta3'~stder3';

optset;

_opgdprc = &foc4;
{beta4, f4, g, retcode} = optmum(&fn4, beta3);
if retcode ne 0;
goto recal;
endif;
cov4 = invpd(hessp(&fn4, beta4));
stder4 = sqrt(diag(cov4));
pout = pout~(-f4)~beta4'~stder4';
optset;
_opgdprc = &foc0;

{beta0, f0, g, retcode} = optmum(&fn0, bp);
if retcode ne 0;
goto recal;
endif;

cov0 = invpd(hessp(&fn0, beta0));
stder0 = sqrt(diag(cov0));
output on;
iter ~ nyes0 ~ nyes1 ~ nyes ~ nno ~ (-f0) ~ beta0' ~ stder0' ~ pout;
print " ";
output off;

@-- procedures --@

@--
-(log-likelihood) function for a probit demand function
using data from only the respondents
--@
proc fn0(para);
local bb, l1, l2;
bb = para;
l1 = cdfn( dxyes1 * bb);
l2 = cdfn( - dxyes0 * bb);
retp( - sumc( ln(l1|l2) ) );
endp;

proc foc0(para);
local bb, y, x, ff;
bb = para;
y = ones(nyes1,1)|zeros(nyes0,1);
x = dxyes1|dxyes0;
ff = sumc( ((y - cdfn(x * bb)) .* pdfn(x * bb) .* x)
     ./ (cdfn(x * bb) .* cdfnc(x * bb)) );
retp(-ff');
endp;
@--
-(log-likelihood) function for a self-selection model
with a probit demand function using censored data
(non-respondents' independent variables are observed)
--@
proc fn1(para);
local bbs, bbd, rho, no, yes0, yes1, ll;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
no = cdfn( - sxno * bbs );
yes0 = cdfn( - dxyes0 * bbd) -
       cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho);
yes1 = cdfnc( - sxyes1 * bbs ) -
       ( cdfn( - dxyes1 * bbd) -
         cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho) );
trap 1;
ll = ln(no|yes0|yes1);
if scalerr(ll);
ll = "NAN";
endif;
retp(-sumc(ll));
endp;

proc foc1(para);
local bbs, bbd, rho, fbbs,
      den0, gbbd, gbbs,
      den1, hbbd, hbbs,
      bvr0, bvr1, grho, hrho, k;

bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
rho = para[7,.];

fbbs = sumc( (pdfn(- sxno * bbs)
       ./ cdfn(- sxno * bbs)) .* (-sxno) );

den0 = cdfn( - dxyes0 * bbd) -
       cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho);

gbbd = sumc( (1/den0) .*
       cdfnc( ( -sxyes0*bbs - rho * (-dxyes0*bbd))
              / sqrt(1 - rho^2) ) .*
       pdfn( -dxyes0*bbd ) .* (-dxyes0) );

gbbs = sumc( (1/den0) .* (- pdfn( -sxyes0*bbs ) .*
       cdfn( ( -dxyes0*bbd - (rho * (-sxyes0*bbs)))
             / sqrt(1 - rho^2) ) ) .*
       (-sxyes0) );

den1 = cdfnc( - sxyes1 * bbs) - ( cdfn( - dxyes1 * bbd) -
       cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho) );
hbbd = sumc( (-1/den1) .*
       cdfnc( ( -sxyes1*bbs - rho * (-dxyes1*bbd))
              / sqrt(1 - rho^2) ) .*
       pdfn( -dxyes1*bbd ) .* (-dxyes1) );
hbbs = sumc( (1/den1) .*
       ( - pdfn( -sxyes1*bbs ) + pdfn( -sxyes1*bbs ) .*
         cdfn( ( -dxyes1*bbd - (rho * (-sxyes1*bbs)))
               / sqrt(1 - rho^2) ) ) .* (-sxyes1) );

g_bbs = bbs;
g_bbd = bbd;
g_rho = rho;

bvr0 = gradp(&bvn0, g_rho);
bvr1 = gradp(&bvn1, g_rho);
grho = sumc( (1/den0) .* (- bvr0) );
hrho = sumc( (1/den1) .* bvr1 );

k = (fbbs' + gbbs' + hbbs') ~ (gbbd + hbbd)'
    ~ (grho + hrho);

retp(-k);

endp;

@--
-(log-likelihood) function for a self-selection model
with measurement errors and a probit demand function
using empirical block mean and estimated variance-
covariance (200 obs. from the population)
for measurement error
--@
proc fn3(para);
local bbs, bbd, rho, bbsn, delta,
      no, yes0, yes1, ll;

bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
bbsn = bbs[2 3,.];
delta = sqrt(1 + bbsn'evar*bbsn);
no = cdfn( - sxnom * bbs / delta );
yes0 = cdfn( - dxyes0 * bbd) -
       cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho);
yes1 = cdfnc( - sxyes1 * bbs ) -
       ( cdfn( - dxyes1 * bbd) -
         cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho) );
ll = ln(no|yes0|yes1);
retp(-sumc( ll ));
endp;

proc foc3(para);
local bbs, b1, b2, b3, bbd, rho,
      delta, z,
      fb1, fb2, fb3,
      den0, gbbd, gbbs,
      den1, hbbd, hbbs,
      bvr0, bvr1, grho, hrho, k;

bbs = para[1 2 3,.];
b1 = bbs[1,.];
b2 = bbs[2,.];
b3 = bbs[3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
delta = sqrt(1 + b2^2 * evar[1,1] + b3^2 * evar[2,2]
        + 2 * b2 * b3 * evar[1,2]);
z = -sxnom*bbs / delta;

fb1 = sumc( (- pdfn(z) ./ cdfn(z) ) / delta );
fb2 = sumc( ( pdfn(z) ./ cdfn(z) ) .* ( - sxnom[.,2] / delta
      + sxnom * bbs .* (2 * b2 * evar[1,1]
      + 2 * b3 * evar[1,2]) / (2 * delta^3 ) ) );
fb3 = sumc( ( pdfn(z) ./ cdfn(z) ) .* ( - sxnom[.,3] / delta
      + sxnom * bbs .* (2 * b3 * evar[2,2]
      + 2 * b2 * evar[1,2]) / (2 * delta^3 ) ) );

den0 = cdfn( - dxyes0 * bbd) -
       cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho);
gbbd = sumc( (1/den0) .*
       cdfnc( ( -sxyes0*bbs - rho * (-dxyes0*bbd))
              / sqrt(1 - rho^2) ) .*
       pdfn( -dxyes0*bbd ) .* (-dxyes0) );
gbbs = sumc( (1/den0) .* (- pdfn( -sxyes0*bbs ) .*
       cdfn( ( -dxyes0*bbd - (rho * (-sxyes0*bbs)))
             / sqrt(1 - rho^2) ) ) .*
       (-sxyes0) );

den1 = cdfnc( - sxyes1 * bbs) - ( cdfn( - dxyes1 * bbd) -
       cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho) );
hbbd = sumc( (-1/den1) .*
       cdfnc( ( -sxyes1*bbs - rho * (-dxyes1*bbd))
              / sqrt(1 - rho^2) ) .*
       pdfn( -dxyes1*bbd ) .* (-dxyes1) );
hbbs = sumc( (1/den1) .*
       ( - pdfn( -sxyes1*bbs ) + pdfn( -sxyes1*bbs ) .*
         cdfn( ( -dxyes1*bbd - (rho * (-sxyes1*bbs)))
               / sqrt(1 - rho^2) ) ) .* (-sxyes1) );

g_bbs = bbs;
g_bbd = bbd;
g_rho = rho;

bvr0 = gradp(&bvn0, g_rho);
bvr1 = gradp(&bvn1, g_rho);
grho = sumc( (1/den0) .* (- bvr0) );
hrho = sumc( (1/den1) .* bvr1 );

k = ((fb1~fb2~fb3) + gbbs' + hbbs')
    ~ (gbbd + hbbd)' ~ (grho + hrho);

retp(-k);

endp;

@--
-(log-likelihood) function for a self-selection model
with measurement errors and a probit demand function
using empirical block mean and empirical block
variance-covariance for measurement error
--@
proc fn4(para);
local bbs, bbd, rho, bbs1, bbs2,
      delta, no, yes0, yes1, ll;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
bbs1 = bbs[2,.];
bbs2 = bbs[3,.];
delta = sqrt(1 + bbs1^2 * sxnov[.,1]
        + bbs2^2 * sxnov[.,2]
        + 2 * bbs1 * bbs2 * sxnov[.,3]);
no = cdfn( - sxnom * bbs ./ delta );
yes0 = cdfn( - dxyes0 * bbd) -
       cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho);
yes1 = cdfnc( - sxyes1 * bbs ) -
       ( cdfn( - dxyes1 * bbd) -
         cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho) );
ll = ln(no|yes0|yes1);
retp(-sumc( ll ));

endp;

proc foc4(para);
local bbs, b1, b2, b3, bbd, rho,
      delta, z,
      fb1, fb2, fb3,
      den0, gbbd, gbbs,
      den1, hbbd, hbbs,
      bvr0, bvr1, grho, hrho, k;

bbs = para[1 2 3,.];
b1 = bbs[1,.];
b2 = bbs[2,.];
b3 = bbs[3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
delta = sqrt(1 + b2^2 * sxnov[.,1] + b3^2 * sxnov[.,2]
        + 2 * b2 * b3 * sxnov[.,3]);
z = -sxnom*bbs ./ delta;
fb1 = sumc( (- pdfn(z) ./ cdfn(z) ) ./ delta );
fb2 = sumc( ( pdfn(z) ./ cdfn(z) ) .* ( - sxnom[.,2] ./ delta
      + sxnom * bbs .* (2 * b2 * sxnov[.,1]
      + 2 * b3 * sxnov[.,3]) ./ (2 * delta^3 ) ) );
fb3 = sumc( ( pdfn(z) ./ cdfn(z) ) .* ( - sxnom[.,3] ./ delta
      + sxnom * bbs .* (2 * b3 * sxnov[.,2]
      + 2 * b2 * sxnov[.,3]) ./ (2 * delta^3 ) ) );
den0 = cdfn( - dxyes0 * bbd) -
       cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho);
gbbd = sumc( (1/den0) .*
       cdfnc( ( -sxyes0*bbs - rho * (-dxyes0*bbd))
              / sqrt(1 - rho^2) ) .*
       pdfn( -dxyes0*bbd ) .* (-dxyes0) );
gbbs = sumc( (1/den0) .* (- pdfn( -sxyes0*bbs ) .*
       cdfn( ( -dxyes0*bbd - (rho * (-sxyes0*bbs)))
             / sqrt(1 - rho^2) ) ) .*
       (-sxyes0) );
den1 = cdfnc( - sxyes1 * bbs) - ( cdfn( - dxyes1 * bbd) -
       cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho) );
hbbd = sumc( (-1/den1) .*
       cdfnc( ( -sxyes1*bbs - rho * (-dxyes1*bbd))
              / sqrt(1 - rho^2) ) .*
       pdfn( -dxyes1*bbd ) .* (-dxyes1) );
hbbs = sumc( (1/den1) .*
       ( - pdfn( -sxyes1*bbs ) + pdfn( -sxyes1*bbs ) .*
         cdfn( ( -dxyes1*bbd - (rho * (-sxyes1*bbs)))
               / sqrt(1 - rho^2) ) ) .* (-sxyes1) );

g_bbs = bbs;
g_bbd = bbd;
g_rho = rho;

bvr0 = gradp(&bvn0, g_rho);
bvr1 = gradp(&bvn1, g_rho);
grho = sumc( (1/den0) .* (- bvr0) );
hrho = sumc( (1/den1) .* bvr1 );

k = ((fb1~fb2~fb3) + gbbs' + hbbs')
    ~ (gbbd + hbbd)' ~ (grho + hrho);

retp(-k);

endp;

proc bvn0(r);
local r0;
r0 = r;
retp( cdfbvn( (- dxyes0 * g_bbd),
              (- sxyes0 * g_bbs), r0));
endp;

proc bvn1(r);
local r1;
r1 = r;
retp( cdfbvn( (- dxyes1 * g_bbd),
              (- sxyes1 * g_bbs), r1));
endp;

iter = iter + 1;
endo;

system;
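
The respondent-only probit pair fn0/foc0 supplies OPTMUM with a negative log-likelihood and its analytic gradient. A minimal Python sketch of the same pairing, with a finite-difference check of the gradient, is given here. It is an illustration under stated assumptions rather than a translation of the program: the data are made up, all function names (`neg_loglik`, `neg_score`, `numeric_grad`) are hypothetical, and only the standard library is used.

```python
import math


def norm_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def norm_pdf(t):
    """Standard normal density function."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def neg_loglik(beta, X, y):
    """Negative probit log-likelihood: -sum ln Phi(x'b) or ln Phi(-x'b)."""
    total = 0.0
    for row, yi in zip(X, y):
        t = sum(b * v for b, v in zip(beta, row))
        p = norm_cdf(t) if yi == 1 else norm_cdf(-t)
        total += math.log(p)
    return -total


def neg_score(beta, X, y):
    """Analytic gradient of neg_loglik with respect to beta."""
    grad = [0.0] * len(beta)
    for row, yi in zip(X, y):
        t = sum(b * v for b, v in zip(beta, row))
        cdf = norm_cdf(t)
        w = (yi - cdf) * norm_pdf(t) / (cdf * (1.0 - cdf))
        for j, v in enumerate(row):
            grad[j] -= w * v
    return grad


def numeric_grad(f, beta, h=1e-6):
    """Central finite-difference gradient of f at beta."""
    out = []
    for j in range(len(beta)):
        up = list(beta)
        dn = list(beta)
        up[j] += h
        dn[j] -= h
        out.append((f(up) - f(dn)) / (2.0 * h))
    return out


# Made-up data: a constant and two regressors, with a 0/1 response.
X = [[1.0, 0.2, -1.0], [1.0, 1.5, 0.3], [1.0, -0.7, 0.8],
     [1.0, 0.9, 1.1], [1.0, -1.2, -0.4]]
y = [1, 0, 1, 1, 0]
beta = [0.3, 0.5, -0.2]

analytic = neg_score(beta, X, y)
numeric = numeric_grad(lambda b: neg_loglik(b, X, y), beta)
print(analytic)
print(numeric)
```

Comparing an analytic gradient with a numerical one in this way is a simple safeguard before the gradient is handed to an optimizer. The same check could be applied to the gradients of the self-selection likelihoods.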

BIBLIOGRAPHY


Bloom, D. E., and M. R. Killingsworth (1985), "Correcting for Truncation Bias
Caused by a Latent Truncation Variable," Journal of Econometrics, 27(1):
131-135.

Bockstael, N. E., Strand, I. E., McConnell, K. E., and F. Arsanjani (1990),
"Sample Selection Bias in the Estimation of Recreation Demand Function:
An Application to Sportfishing," Land Economics, 66(1): 40-49.

Borjas, G. J. (1987), "Self-Selection and the Earnings of Immigrants," American
Economic Review, 77(4): 531-553.

Bowker, J. M., and J. R. Stoll (1988), "Use of Dichotomous Choice Nonmarket
Methods to Value the Whooping Crane Resource," American Journal of
Agricultural Economics, 70(2): 372-381.

Brown, T. L., Dawson, C. P., Hustin, D. L., and D. J. Decker (1981), "Comments
on the Importance of Late Respondent and Nonrespondent Data from
Mail Surveys," Journal of Leisure Research, 13(1): 76-79.

Brown, T. L., Decker, D. J., and N. A. Connell (1989), "Response to Mail
Surveys on Resource-based Recreation Topics: A Behavioral Model and an
Empirical Analysis," Leisure Sciences, 11(2): 99-110.

Cameron, T. A. (1988), "A New Paradigm for Valuing Non-market Goods Using
Referendum Data: Maximum Likelihood Estimation by Censored Logistic
Regression," Journal of Environmental Economics and Management, 15(3):
355-379.

Cameron, T. A., and M. D. James (1987), "Efficient Estimation Methods For
"Closed-ended" Contingent Valuation Surveys," The Review of Economics
and Statistics, 69(2): 269-276.

Dhrymes, P. J. (1970), Econometrics, New York, NY: Harper & Row.

Edwards, S. F., and G. D. Anderson (1987), "Overlooked Biases in Contingent
Valuation Surveys: Some Considerations," Land Economics, 63(2): 168-178.

Fuller, W. A. (1987), Measurement Error Models, New York, NY: John Wiley &
Sons.

Goldberger, A. S. (1981), "Linear Regression After Selection," Journal of
Econometrics, 15(3): 357-366.

Goldberger, A. S. (1991), A Course in Econometrics, Cambridge, MA: Harvard
University Press.

Goyder, C. J. (1982), "Further Evidence on Factors Affecting Response Rates to
Mailed Questionnaires," American Sociological Review, 47(4): 550-553.

Green, E. K. (1991), "Reluctant Respondents: Differences Between Early, Late,
and Nonresponders to a Mail Survey," Journal of Experimental Education,
59(3): 268-276.

Green, E. K., and R. F. Kvidahl (1989), "Personalization and Offers of Results:
Effects on Response Rates," Journal of Experimental Education, 57(3):
263-270.

Green, E. K., and S. F. Stager (1986), "The Effects of Personalization, Sex,
Locale, and Level Taught on Educators' Responses to a Mail Survey,"
Journal of Experimental Education, 54(4): 203-206.

 

Greene, W. H. (1981), "Sample Selection Bias as a Specification Error:
Comment," Econometrica, 49(3): 795-798.

Greene, W. H. (1990), Econometric Analysis, New York, NY: Macmillan
Publishing Company.

Hausman, J. A., and D. A. Wise (1977), "Social Experimentation, Truncated
Distributions, and Efficient Estimation," Econometrica, 45(4): 919-938.

Hausman, J. A., and D. A. Wise (1981), "Stratification on Endogenous Variables
and Estimation: The Gary Income Maintenance Experiment," in Manski,
C. F. and D. McFadden (eds.), Structural Analysis of Discrete Data with
Econometric Applications, 51-111, Cambridge, MA: MIT Press.

Heckman, J. J. (1976), "The Common Structure of Statistical Models of
Truncation, Sample Selection and Limited Dependent Variables and a
Simple Estimator for Such Models," Annals of Economic and Social
Measurement, 5(4): 475-492.

Heckman, J. J. (1979), "Sample Selection Bias as a Specification Error,"
Econometrica, 47(1): 153-161.

Heckman, J. J., and G. L. Sedlacek (1985), "Heterogeneity, Aggregation, and
Market Wage Functions: An Empirical Model of Self-Selection in the
Labor Market," Journal of Political Economy, 93(6): 1077-1125.

Heckman, J. J., and G. L. Sedlacek (1990), "Self-Selection and the Distribution of
Hourly Wages," Journal of Labor Economics, 8(1): S329-S363.

Hoehn, J. P. and J. B. Loomis (1993), "Substitution Effects in the Valuation of
Multiple Environmental Programs," Journal of Environmental Economics
and Management, 25(1): 56-75.

Kanuk, L., and C. Berenson (1975), "Mail Surveys and Response Rate: A
Literature Review," Journal of Marketing Research, 12(4): 440-453.

 


Lee, L. F. (1984), "Tests for Bivariate Normal Distribution in Econometric Models
with Selectivity," Econometrica, 52(4): 843-863.

Little, R. J. A. (1985), "A Note About Models for Selectivity Bias," Econometrica,
53(6): 1469-1474.

Little, R. J. A., and D. B. Rubin (1987), Statistical Analysis with Missing Data,
New York, NY: John Wiley & Sons.

Loomis, J. B. (1987), "Expanding Contingent Value Sample Estimates to
Aggregate Benefit Estimates: Current Practices and Proposed Solutions,"
Land Economics, 63(4): 396-402.

Maddala, G. S. (1983), Limited-Dependent and Qualitative Variables in
Econometrics, New York, NY: Cambridge University Press.

McConnell, K. E. (1990), "Models for Referendum Data: The Structure of
Discrete Choice Models for Contingent Valuation," Journal of
Environmental Economics and Management, 18(1): 19-34.

Mitchell, R. C., and R. T. Carson (1989), Using Surveys to Value Public Goods:
The Contingent Valuation Method, Washington, D.C.: Resources for the
Future.

Muthén, B., and K. G. Jöreskog (1983), "Selectivity Problems in Quasi-
experimental Studies," Evaluation Review, 7(2): 139-174.

Nelson, F. D. (1977), "Censored Regression Models with Unobserved Stochastic
Censoring Thresholds," Journal of Econometrics, 6(3): 309-327.

Olsen, R. J. (1980), "A Least Squares Correction For Selectivity Bias,"
Econometrica, 48(7): 1815-1820.

Ong, P. M., Holt, S., Skumatz, L. A., and R. S. Barnes (1988), "Nonresponse in
Residential Energy Surveys: Systematic Patterns and Implications for End-
Use Models," Energy Journal, 9(2): 137-151.

Pudney, S. (1989), Modelling Individual Choice: The Econometrics of Corners,
Kinks and Holes, Cambridge, MA: Basil Blackwell.

Randall, A. (1987), Resource Economics, 2nd Ed., New York, NY: John Wiley &
Sons.

Rubin, D. B. (1987), Multiple Imputation for Nonresponse in Surveys, New York,
NY: John Wiley & Sons.

Shaw, D. (1988), "On-site Samples' Regression: Problems of Non-negative
Integers, Truncation, and Endogenous Stratification," Journal of
Econometrics, 37(2): 211-223.

Smith, V. K. (1988), "Selection and Recreation Demand," American Journal of
Agricultural Economics, 70(1): 29-36.

 


van Ravenswaay, E. O. and J. P. Hoehn (1991a), "Contingent Valuation and Food
Safety: The Case of Pesticide Residues in Food," Staff Paper No. 91-13,
Department of Agricultural Economics, Michigan State University.

van Ravenswaay, E. O. and J. P. Hoehn (1991b), "Consumer Willingness to Pay
for Reducing Pesticide Residues in Food: Results of a Nationwide Survey,"
Staff Paper No. 91-18, Department of Agricultural Economics, Michigan
State University.

Walsh, R. G., Loomis, J. B., and R. A. Gillman (1984), "Valuing Option,
Existence, and Bequest Demands for Wilderness," Land Economics, 60(1):
14-29.

Whitehead, J. C. (1991), "Environmental Interest Group Behavior and Self-
Selection Bias in Contingent Valuation Mail Surveys," Growth and Change,
22(1): 10-21.

Willis, R. J., and S. Rosen (1979), "Education and Self-Selection," Journal of
Political Economy, 87(5): S7-S36.

Yatchew, A., and Z. Griliches (1985), "Specification Error in Probit Models," The
Review of Economics and Statistics, 67(1): 134-139.