This is to certify that the dissertation entitled

CORRECTING FOR SELF-SELECTION BIAS IN CONTINGENT VALUATION

presented by

LIH-CHYUN SUN

has been accepted towards fulfillment of the requirements for the Ph.D. degree in Agricultural Economics.

Major professor: John P. Hoehn

MSU is an Affirmative Action/Equal Opportunity Institution

CORRECTING FOR SELF-SELECTION BIAS IN CONTINGENT VALUATION

By

Lih-Chyun Sun

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Agricultural Economics

1993

ABSTRACT

CORRECTING FOR SELF-SELECTION BIAS IN CONTINGENT VALUATION

By

Lih-Chyun Sun

In contingent valuation (CV) studies, data can only be collected from those who are willing to participate in the studies. Applying a single-equation approach to this truncated sample may lead to inconsistent parameter estimates (self-selection bias). A self-selection model which contains a self-selection equation and a demand equation may be specified in order to detect and to correct for self-selection bias. Based on a truncated sample, Bloom and Killingsworth (1985) proposed a maximum likelihood (ML) estimator which leads to theoretically consistent parameter estimates. However, using Monte Carlo experiments, Muthén and Jöreskog (1983) showed that the estimates for parameters in the self-selection equation are not reliable even in large samples. A self-selection model with measurement errors is proposed in this study.
In the model, a CV truncated sample is transformed into a censored sample by combining individual survey data with census data, which provide information on non-respondents' neighborhoods (e.g. census blocks). Based on the censored sample, two ML estimators are derived in which census data are treated as if they are the true values plus errors, i.e. non-respondents' characteristics are assumed to be distributed as $N(\mu_i^*, \Sigma_i^*)$. To apply the self-selection model with measurement errors, $\mu_i^*$ and $\Sigma_i^*$ are replaced by their consistent estimates: $\mu_i$, the average values calculated from each census block, and either $\Sigma_i$, the corresponding variance-covariance matrix calculated from each census block, or $\Sigma$, the corresponding variance-covariance matrix calculated from a sample drawn from the population.

Results from Monte Carlo experiments suggest that the self-selection model with measurement errors performs well, especially when $\mu_i$ and $\Sigma_i$ are adopted. The results also indicate that if the self-selection model is correctly specified, adoption of a self-selection model with measurement errors will not contaminate the original truncated sample.

The application of a self-selection model with measurement errors is not restricted to CV studies. The model can be applied to studies that adopt survey data and regression analyses.

To my parents
Dr. Chen Sun and Mrs. Feng-Chiao Lee Sun

ACKNOWLEDGEMENTS

It has been my pleasure and honor to work with my committee members, Dr. John Hoehn, Dr. Eileen van Ravenswaay, and Dr. Ching-Fan Chung. I am especially grateful to Dr. Hoehn. As my major professor, Dr. Hoehn led me into the area of resource/environmental economics and empirical work. It was through him that I first experienced the joy of conducting research. Having worked for Dr. van Ravenswaay for the past four years, I owe her much gratitude for her guidance and tolerance. Dr. Chung enhanced my knowledge of econometrics both inside and outside the classroom.
In addition to thanking him for his friendship, I thank him for introducing me to GAUSS, which greatly strengthened my ability in understanding and in practicing econometrics.

I would like to express my appreciation to the Agricultural Economics Department for offering an excellent environment for studying. I am indebted to my colleague Miss Tiffany D. Phagan, who spent much of her precious time editing my writing and making this dissertation readable.

Special appreciation goes to Drs. Anthony and Delia Koo. For the past seven years, Drs. Koo have been very supportive. I could never have finished my studies here at Michigan State University without their encouragement and help.

I owe a great deal to my mother-in-law, Mrs. Yu-Chueng Wu, who stayed in Lansing for a long period of time to help take care of my daughter so both my wife and I could go to school. Of course, this would never have happened if my father-in-law, Mr. I-Ming Song, were not a great gentleman.

With all my heart, I thank my parents Dr. Chen Sun and Mrs. Feng-Chiao Lee Sun for their endless love and support. Although I did not inherit their wisdom and other wonderful characteristics, I learned from them how to confront and to conquer challenges. Should I make any contribution to society, they are the persons who deserve the credit. Thanks also go to my younger brother Chih-Chyun Sun, who kept my parents away from loneliness while I was abroad. I am also indebted to my aunt Diana Lee, who encouraged me constantly throughout the years.

I am very fortunate to have a wonderful wife and a lovely daughter; they have sacrificed and suffered a lot to help me finish my studies. For the past years, I could have been a better father than I was. I apologize to my daughter Yihua Sun, and thank her for bringing extra happiness to the family. Lastly, with lots of love, I thank my wife Wei-Ling Song. This dissertation would never have been finished without her love, encouragement, support, and tolerance.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER 1 INTRODUCTION
  1.1 Non-response in surveys
  1.2 Self-selection and sample non-response biases
    1.2.1 Self-selection and sample non-response biases: a regression analysis
    1.2.2 Self-selection and sample non-response biases: a graphical analysis
  1.3 Self-selection in contingent valuation
  1.4 Plan of work

CHAPTER 2 LITERATURE REVIEW
  2.1 Introduction
  2.2 A self-selection model
  2.3 Estimators
    2.3.1 Heckman's two-stage estimator
    2.3.2 Self-selection with a censored sample
    2.3.3 Self-selection with a truncated sample
  2.4 Summary

CHAPTER 3 SELF-SELECTION MODELS WITH MEASUREMENT ERRORS
  3.1 Self-selection based on a random utility model under a CV framework
  3.2 A probit model with measurement errors
    3.2.1 Derivation of the probit model with measurement errors
    3.2.2 Parameter identification in the probit model with measurement errors
  3.3 A self-selection model with measurement errors and a linear demand equation
  3.4 Generalization for closed-ended questionnaires
    3.4.1 A self-selection model with measurement errors and a Tobit demand equation
    3.4.2 A self-selection model with measurement errors and a probit demand equation
  3.5 Summary

CHAPTER 4 MONTE CARLO EXPERIMENTS AND RESULTS
  4.1 Data generation
    4.1.1 Population generation
    4.1.2 Sample generation
    4.1.3 Monte Carlo experiments
  4.2 A linear demand equation with self-selection
    4.2.1 Monte Carlo experiment results from a self-selection model with measurement errors and a linear demand equation
  4.3 A Tobit demand equation with self-selection
    4.3.1 Monte Carlo experiment results from a self-selection model with measurement errors and a Tobit demand equation
  4.4 A probit demand equation with self-selection
    4.4.1 Monte Carlo experiment results from a self-selection model with measurement errors and a probit demand equation
  4.5 General results from the Monte Carlo experiments
  4.6 Summary

CHAPTER 5 CONCLUDING REMARKS
  5.1 Summary
  5.2 Need for future research
  5.3 Conclusion

APPENDIX A RESULTS FROM MUTHÉN AND JÖRESKOG'S STUDY

APPENDIX B NOTATION USED IN REPORTING MONTE CARLO RESULTS

APPENDIX C MONTE CARLO EXPERIMENT RESULTS
  C.1.1 Estimates from a self-selection model with measurement errors and a linear demand equation ($\rho = 0.25$)
  C.1.2 Estimates from a self-selection model with measurement errors and a linear demand equation ($\rho = 0.5$)
  C.1.3 Estimates from a self-selection model with measurement errors and a linear demand equation ($\rho = 0.75$)
  C.2.1 Estimates from a self-selection model with measurement errors and a Tobit demand equation ($\rho = 0.25$)
  C.2.2 Estimates from a self-selection model with measurement errors and a Tobit demand equation ($\rho = 0.5$)
  C.2.3 Estimates from a self-selection model with measurement errors and a Tobit demand equation ($\rho = 0.75$)
  C.3.1 Estimates from a self-selection model with measurement errors and a probit demand equation ($\rho = 0.25$)
  C.3.2 Estimates from a self-selection model with measurement errors and a probit demand equation ($\rho = 0.5$)
  C.3.3 Estimates from a self-selection model with measurement errors and a probit demand equation ($\rho = 0.75$)

APPENDIX D A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS: SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A LINEAR DEMAND EQUATION

APPENDIX E A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS: SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A TOBIT DEMAND EQUATION

APPENDIX F A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS: SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A PROBIT DEMAND EQUATION

BIBLIOGRAPHY

LIST OF TABLES

Table A1 Parameter estimates for data simulated according to model 1, $N_t = 496$, $N = 1000$
Table A2 Parameter estimates for data simulated according to model 1, $N_t = 1963$, $N = 4000$
Table C.1.1.A Linear demand, OLS estimates without correcting for self-selection, $\rho = 0.25$
Table C.1.1.B Linear demand, correcting for self-selection bias using censored sample, $\rho = 0.25$
Table C.1.1.C Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.25$
Table C.1.1.D Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.25$
Table C.1.2.A Linear demand, OLS estimates without correcting for self-selection bias, $\rho = 0.5$
Table C.1.2.B Linear demand, correcting for self-selection bias using censored sample, $\rho = 0.5$
Table C.1.2.C Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.5$
Table C.1.2.D Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.5$
Table C.1.3.A Linear demand, OLS estimates without correcting for self-selection bias, $\rho = 0.75$
Table C.1.3.B Linear demand, correcting for self-selection bias using censored sample, $\rho = 0.75$
Table C.1.3.C Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.75$
Table C.1.3.D Linear demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.75$
Table C.1.3.E Linear demand, correcting for self-selection bias using truncated sample, $\rho = 0.75$
Table C.2.1.A Tobit estimates without correcting for self-selection bias, $\rho = 0.25$
Table C.2.1.B Tobit demand, correcting for self-selection bias using censored sample, $\rho = 0.25$
Table C.2.1.C Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.25$
Table C.2.1.D Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.25$
Table C.2.2.A Tobit estimates without correcting for self-selection bias, $\rho = 0.5$
Table C.2.2.B Tobit demand, correcting for self-selection bias using censored sample, $\rho = 0.5$
Table C.2.2.C Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.5$
Table C.2.2.D Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.5$
Table C.2.3.A Tobit estimates without correcting for self-selection bias, $\rho = 0.75$
Table C.2.3.B Tobit demand, correcting for self-selection bias using censored sample, $\rho = 0.75$
Table C.2.3.C Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.75$
Table C.2.3.D Tobit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.75$
Table C.3.1.A Probit estimates without correcting for self-selection bias, $\rho = 0.25$
Table C.3.1.B Probit demand, correcting for self-selection bias using censored sample, $\rho = 0.25$
Table C.3.1.C Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.25$
Table C.3.1.D Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.25$
Table C.3.2.A Probit estimates without correcting for self-selection bias, $\rho = 0.5$
Table C.3.2.B Probit demand, correcting for self-selection bias using censored sample, $\rho = 0.5$
Table C.3.2.C Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.5$
Table C.3.2.D Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.5$
Table C.3.3.A Probit estimates without correcting for self-selection bias, $\rho = 0.75$
Table C.3.3.B Probit demand, correcting for self-selection bias using censored sample, $\rho = 0.75$
Table C.3.3.C Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma$, $\rho = 0.75$
Table C.3.3.D Probit demand, correcting for self-selection using measurement errors model with $\mu_i$ and $\Sigma_i$, $\rho = 0.75$

LIST OF FIGURES

Figure 1.1 Presence of sample non-response bias, absence of self-selection bias
Figure 1.2 Presence of both self-selection and sample non-response biases
Figure 1.3 Presence of self-selection bias, absence of sample non-response bias
Figure 1.4 Self-selection bias affects only the constant term

CHAPTER 1
INTRODUCTION

1.1 Non-response in surveys

Contingent valuation (CV) is one of the methods used by researchers to elicit values of non-market goods. Depending on the CV survey design, the elicited values can be either a Hicksian value (i.e. compensating or equivalent variation) that is derived from a Hicksian demand function, or a consumer surplus that is derived from a Marshallian (ordinary) demand function. In many CV studies, data are collected using mail surveys.
As with other survey methods, non-response is a common problem in mail surveys. The problem created by non-response is that data values intended to be observed by survey design are in fact missing. These missing values not only lead to less efficient estimates because of the reduced size of the data base, but may also lead to biased estimates because respondents are often systematically different from non-respondents (Rubin, 1987).

In analyzing survey data, two types of possible biases can be created by non-response. The first is known as sample non-response bias, and the second is known as self-selection (or sample selection) bias (Mitchell and Carson, 1989). Sample non-response bias occurs when the sample distribution of some socio-economic or demographic characteristics is significantly different from the population distribution. For example, if only low-income individuals respond to the CV surveys, the sample mean of income is then lower than the population mean of income. Sample non-response bias can be detected by comparing the sample distribution of certain socio-economic or demographic characteristics with the population distribution.

Self-selection bias occurs when the non-response is non-random, which means that the reasons for non-response are endogenous to the survey study. For example, it may be that only those who have a higher marginal propensity to consume the non-market good respond to the CV survey. Unlike sample non-response bias, it is difficult to find a simple indicator for detecting the existence of self-selection bias.

Non-response can usually be divided into two categories, namely, item non-response and unit non-response. In CV mail surveys, item non-response means that a respondent returns the survey but fails to answer some of the questions; unit non-response indicates that a member of the sample fails to return the survey. Both item and unit non-response can cause either sample non-response or self-selection bias, or both.
One way to compensate for item non-response is to replace the missing values with imputed values (Little and Rubin, 1987; Rubin, 1987). An alternative is to use a generalized Heckman two-stage method to correct for the possible biases caused by item non-response (Ong et al., 1988). However, item non-response is not the concern of this study, and statistical methods related to item non-response will not be discussed here.

The purposes of this study are first to distinguish between sample non-response and self-selection biases, and then to develop parametric analyses to detect and to correct for the possible self-selection bias caused by unit non-response.

1.2 Self-selection and sample non-response biases

In order to derive values of non-market goods in CV studies, a demand (or inverse demand) function is estimated by regression analyses.¹ In this section, self-selection and sample non-response biases are first examined under a regression framework. Next, graphs based on simplified models are provided to demonstrate intuitively the relationships between self-selection and sample non-response biases.

1.2.1 Self-selection and sample non-response biases: a regression analysis

Suppose that individual i's demand for a non-market good, Y, is described by a linear structural equation

$$y_i = x_i'\beta + u_i, \quad i = 1, 2, \ldots, N,$$

where $x_i$ is a column vector of stochastic variables, $u_i$ is an error term, and $E(u_i \mid x_i) = 0$ (i.e. $E(y_i \mid x_i) = x_i'\beta$). Suppose now that the resulting OLS regression using only the data from respondents is

$$y_i = x_i'\theta + e_i, \quad i = 1, 2, \ldots, M,$$

and $M$ [...]

[...] $\bar{y}^* > \bar{y}$, $\bar{x}^* > \bar{x}$; $d = c$, and $\theta = \beta$.

Example 2. Presence of both self-selection and sample non-response biases (Figure 1.2).

Figure 1.2 Presence of both self-selection and sample non-response biases

Only some of the low-income individuals return the surveys, and those low-income individuals have a lower marginal propensity to consume the non-market good than do other individuals, i.e.
$\bar{y}^* > \bar{y}$, $\bar{x}^* > \bar{x}$, $d \neq c$, and $\theta \neq \beta$.

Example 3. Presence of self-selection bias, absence of sample non-response bias (Figure 1.3).

Figure 1.3 Presence of self-selection bias, absence of sample non-response bias

Only those who have a lower marginal propensity to consume the non-market good return the surveys, i.e. $\bar{y}^* = \bar{y}$, $\bar{x}^* = \bar{x}$, but $d \neq c$ and $\theta \neq \beta$.

Example 4. One special case is when self-selection bias affects only the estimate of the constant term (Figure 1.4).

[...] $y_2$, then $E(y_1 \mid y_1 > y_2) \neq E(y_1)$. In a conventional self-selection model, two separate equations are used to model $y_1$ and $y_2$. Based on income maximization, a latent variable is modeled by a third equation which describes $I^*$ as a function of $(y_1 - y_2)$. The individual will work in sector 1 if $I^* > 0$ ($y_1 > y_2$), and in sector 2 otherwise. An econometric model that is designed for this type of self-selection is often called a switching regression model.

Willis and Rosen (1979) modeled the demand for college attendance based on the comparative advantage in expected lifetime earnings. Considering simultaneously the demand for and supply of labor, Heckman and Sedlacek (1985) presented a model of the sectoral allocation of workers from different demographic types. What made their study unique is their use of aggregate data to predict earnings for the different sectors, combining these predicted earnings and micro data to estimate the labor supply in the different sectors. In his study, Borjas (1987) modeled the earnings of immigrants based on the difference between wages earned in the U.S. and potential wages in their native countries. Recently, Heckman and Sedlacek (1990) modeled self-selection based on utility maximization, instead of self-selection based on earnings.

In a CV study, a self-selection hypothesis would assert that only those who have enough interest in the topic of study will return the surveys, and that respondents have different demand behavior than non-respondents.
Under this hypothesis, self-selection in a CV study differs from self-selection in a conventional switching regression model in two ways. Borrowing the above two-sector earnings model, assume $y_{1i}$ and $y_{2i}$ are individual i's demand for goods 1 and 2 respectively. First, instead of modeling demand for both goods 1 and 2, a CV study usually models only the demand for one good (say, $y_{1i}$). Second, the self-selection criterion in a CV study is the net utility gain from answering the survey, while the criterion in a conventional switching regression model is the potential difference in demand $(y_{1i} - y_{2i})$. Since only the demand for one good ($y_{1i}$) is modeled in a CV study, the self-selection model is less complicated than a conventional switching regression model. With this simplification, the nature of switching regression models is left intact but the statistical process for estimating the model is simplified.

2.2 A self-selection model

Under a CV framework, consider the following self-selection model regarding individual i's demand for a good Q ($q_i$). Individual i's self-selection behavior is governed by the self-selection equation

$$I_i^* = z_i'\gamma + u_i, \quad i = 1, 2, \ldots, N,$$

$$I_i = 1 \ \text{if } I_i^* > 0, \qquad I_i = 0 \ \text{otherwise}.$$

In the self-selection equation, $I_i^*$ is unobservable, but $I_i$ is observed. If $I_i = 1$, the demand equation,

$$q_i = x_i'\beta + e_i,$$

is then observed. In the self-selection and demand equations, $z_i$ ($x_i$) is a $k \times 1$ ($m \times 1$) vector of exogenous variables, $\gamma$ ($\beta$) is a $k \times 1$ ($m \times 1$) vector of parameters to be estimated, and $u_i$ ($e_i$) is a random error. Assume that

$$u_i \sim \text{i.i.d. } N(0, 1), \qquad e_i \sim \text{i.i.d. } N(0, \sigma^2), \qquad (e_i, u_i) \sim \text{i.i.d. } BN(0, 0, \Omega),$$

where $N(\cdot,\cdot)$ and $BN(\cdot,\cdot,\cdot)$ are a univariate and a bivariate normal distribution respectively, and $\Omega$ is the variance-covariance matrix.

Suppose that researchers are interested in estimating the demand equation. Since $q_i$ is observed only if $I_i = 1$, the distribution of $q_i$ is truncated, and the demand equation alone does not correctly specify the demand for the good ($q_i$).
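A small simulation can make the truncation problem concrete. The sketch below is not part of the original study (whose programs are written in GAUSS); it uses only Python's standard library, and the function name `simulate_selected_errors`, the constant selection index, the sample size, and the seeds are illustrative assumptions. It draws the two errors with correlation $\rho$, keeps the demand error only when the latent selection variable is positive, and compares the mean demand error among respondents with the mean over everyone.

```python
import math
import random


def simulate_selected_errors(rho, sigma=1.0, index=0.0, n=100_000, seed=0):
    """Simulate the two-equation model with a constant selection index.

    Hypothetical helper for illustration only. Returns a tuple:
    (mean of e over all individuals,
     mean of e over respondents only,
     share of individuals who respond).
    """
    rng = random.Random(seed)
    total_all = 0.0
    total_resp = 0.0
    n_resp = 0
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # selection error, Var(u) = 1
        v = rng.gauss(0.0, 1.0)  # independent standard normal draw
        # Construct e so that Var(e) = sigma^2 and Corr(e, u) = rho.
        e = sigma * (rho * u + math.sqrt(1.0 - rho * rho) * v)
        total_all += e
        if index + u > 0.0:  # latent variable positive: survey is returned
            total_resp += e
            n_resp += 1
    return total_all / n, total_resp / n_resp, n_resp / n


if __name__ == "__main__":
    for rho in (0.0, 0.5):
        mean_all, mean_resp, rate = simulate_selected_errors(rho, seed=1)
        print(f"rho={rho}: mean e (all)={mean_all:.4f}, "
              f"mean e (respondents)={mean_resp:.4f}, response rate={rate:.3f}")
```

With $\rho = 0$ the two means should be close to each other. With $\rho \neq 0$ the respondents' mean error should move away from zero, which is why estimating the demand equation alone on returned surveys can give inconsistent estimates.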
To specify the demand equation correctly, the endogeneity that is caused by the self-selection behavior must be taken into account. Thus, a correct model specification is described jointly by the self-selection and the demand equations, and the objective is to obtain consistent estimates for $\beta$, $\gamma$, $\rho$, and $\sigma^2$.

Under a CV framework, $I_i^*$ in the self-selection equation can be thought of as individual i's net utility gain from answering the survey. Self-selection implies that an individual's decision to answer and to return the survey is correlated with the topic of study (i.e. $q_i$, demand for the good Q). In other words, the decision to answer and to return the survey is endogenous to the study.

2.3 Estimators

In this section, three estimators for correcting self-selection bias are reviewed. Heckman's two-stage estimator is first examined,¹ followed by two ML estimators that are based on either a censored or a truncated sample.

¹ Although Heckman's two-stage estimator is well known, it provides a clear and straightforward explanation of the nature of self-selection bias.

2.3.1 Heckman's two-stage estimator

Based on the moments of a truncated bivariate normal distribution, according to the self-selection model described by the self-selection and demand equations, Heckman (1979, 1976) demonstrated that

$$E(q_i \mid x_i, I_i = 1) = x_i'\beta + E(e_i \mid I_i = 1)$$
$$= x_i'\beta + E(e_i \mid u_i > -z_i'\gamma)$$
$$= x_i'\beta + \alpha\,\frac{\phi(-z_i'\gamma)}{1 - \Phi(-z_i'\gamma)}$$
$$= x_i'\beta + \alpha\,\frac{\phi(z_i'\gamma)}{\Phi(z_i'\gamma)},$$

where $\alpha = \rho\sigma$ (i.e. the covariance between e and u), $\phi$ and $\Phi$ are the standard normal density and distribution functions respectively, and $\phi(\cdot)/\Phi(\cdot)$ [...]

[...] $-z_i'\gamma)$, if $\gamma = 0$, then $E(e_i \mid u_i > 0) = \rho\sigma\,\phi(0)/\Phi(0)$, which is a constant.

[...] both respondents and non-respondents must be known. In a truncated sample, $z_i$ is not known for non-respondents. Hence, Heckman's two-stage method cannot be applied to data from a truncated sample.
Second, since only the returned surveys are used in the second stage, it is less efficient than if the full sample is used.⁵ Third, the conventional formula used in OLS to calculate the variance-covariance matrix does not provide the correct variance-covariance matrix for the second-stage OLS estimation.

⁵ It is also possible to use the full sample (Maddala, 1983, p. 159).

Instead of using Heckman's two-stage method, one alternative for deriving consistent estimates is to estimate the self-selection and the demand equations jointly by ML estimators. Depending on the nature of the sample, there are essentially two types of likelihood functions that can be specified. The first type of likelihood function is specified for a censored sample and the second type for a truncated sample.

2.3.2 Self-selection with a censored sample

According to the self-selection and the demand equations, a censored sample indicates that for those whose $I_i = 0$ (individual i did not return the survey), $x_i$ and $z_i$ can still be observed. In a censored sample, for those who did not return the survey, $z_i$ is still available and can be used to explain individual i's self-selection behavior. In addition, the explanatory variables, $x_i$, in the demand equation can also be observed. For a censored sample, consistent and efficient estimates for $\beta$, $\gamma$, $\rho$, and $\sigma^2$ can all be obtained by maximizing the likelihood function⁶

⁶ In practice, it is the log-likelihood function that is maximized. However, the likelihood function simplifies interpretation.

$$L_c = \prod_{I_i = 1} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du \;\times\; \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-z_i'\gamma} g(e,\, u,\, \Omega)\, du\, de,$$

where $g(\cdot,\cdot,\cdot)$ is a bivariate normal density function. In the likelihood function, $L_c$, the first term is the likelihood for those who returned the survey and is the product of the conditional density of $q_i$ given that individual i returned the survey.
The second term is the likelihood for those who did not return the survey and is the product of the joint distribution function.⁷

⁷ For further discussion of self-selection models under a censored sample, see [...], Lee (1984), Goldberger (1981), Greene (1981), Olsen (1980), and [...].

2.3.3 Self-selection with a truncated sample

By definition, a censored sample implies that all non-respondents' x's and z's can still be observed. In practice, this does not seem to be the case for most CV studies. Very often, data used in CV studies are truncated; namely, neither x's nor z's can be observed for non-respondents. According to the self-selection and the demand equations, a truncated sample indicates that when $I_i = 0$, none of $q_i$, $x_i$, and $z_i$ is observed. In other words, we know nothing of those who did not return the survey.

In the case of a truncated sample, the self-selection and demand equations can be estimated jointly by maximizing the likelihood function

$$L_t = \prod_{I_i = 1} \frac{\int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du}{\Phi(z_i'\gamma)},$$

and the estimated $\beta$, $\gamma$, $\sigma^2$, and $\rho$ are consistent (Bloom and Killingsworth, 1985).⁸ The likelihood function, $L_t$, is based on a truncated normal distribution. The numerator of i's likelihood is the conditional density of $q_i$ given that individual i returned the survey, and the denominator is the probability that individual i returns the survey. In a self-selection model with a truncated sample, only the information from returned surveys is available for use in estimation.⁹

In most CV studies, data used in econometric analyses are obtained from truncated samples. According to Bloom and Killingsworth (1985), self-selection models with a truncated sample should not create problems in econometric analyses when the ML estimator is applied.
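The truncated-sample likelihood function described above can be evaluated without numerical integration, because integrating a bivariate normal density over $u > -z_i'\gamma$ has a standard closed form: with $e = q_i - x_i'\beta$, the integral equals $\frac{1}{\sigma}\phi(e/\sigma)\,\Phi\!\big((z_i'\gamma + \rho e/\sigma)/\sqrt{1-\rho^2}\big)$. The following Python sketch illustrates that evaluation under this result. It is not the author's GAUSS code; the function names are hypothetical, and the arguments `xb` and `zg` are assumed to hold the already computed indices $x_i'\beta$ and $z_i'\gamma$ for each respondent.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def respondent_term(q, xb, zg, sigma, rho):
    """Integral over u > -zg of the bivariate normal density at e = q - xb.

    Uses u | e ~ N(rho * e / sigma, 1 - rho^2), so the integral is the
    marginal density of e times P(u > -zg | e).
    """
    e = q - xb
    arg = (zg + rho * e / sigma) / math.sqrt(1.0 - rho * rho)
    return (norm_pdf(e / sigma) / sigma) * norm_cdf(arg)


def truncated_loglik(q, xb, zg, sigma, rho):
    """Log-likelihood for a truncated sample (respondents only).

    Each contribution is respondent_term divided by the response
    probability Phi(zg), taken in logs and summed over respondents.
    """
    return sum(
        math.log(respondent_term(qi, xbi, zgi, sigma, rho))
        - math.log(norm_cdf(zgi))
        for qi, xbi, zgi in zip(q, xb, zg)
    )


if __name__ == "__main__":
    # Three made-up respondents, purely to show the calling convention.
    q = [2.0, 0.5, 1.2]
    xb = [1.0, 0.8, 1.0]
    zg = [0.3, -0.2, 0.1]
    print(truncated_loglik(q, xb, zg, sigma=1.5, rho=0.6))
```

In an estimation routine, a function like this would be passed (with its sign reversed) to a numerical optimizer over $\beta$, $\gamma$, $\sigma$, and $\rho$; the sketch stops at evaluating the function for given parameter values.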
However, bearing in mind that $x_i$ and $z_i$ are likely to have variables in common, or at least to be highly correlated, it seems unlikely that one would be able to obtain good estimates of parameters other than $\beta$ (Pudney, 1989, p. 83). A study by Muthén and Jöreskog (1983) using Monte Carlo experiments tends to confirm this suspicion and shows that the estimate for $\gamma$ is not reliable even in large samples,¹⁰ although it is possible to correct for self-selection bias in the $\beta$ coefficients. This is a major disadvantage of using data from a truncated sample.

⁸ The same likelihood function is also presented by Maddala (1983, pp. 150 and [...]).
⁹ Unlike in the studies conducted by Hausman and Wise (1981, 1977), where the data are acquired from a sample that is truncated by an exogenous variable, a truncated sample in this study is a sample that is truncated by an endogenous variable.
¹⁰ Part of Muthén and Jöreskog's (1983) results are reported in Appendix A.

2.4 Summary

The self-selection model considered in this study consists of two components. The first is the self-selection equation, which is a probit-type equation, and the second is a demand equation. Under a CV framework, the analyses began with an examination of self-selection models with either a censored or a truncated sample. Econometric analyses with a censored sample were found to have preferred properties (i.e. consistency and efficiency). However, censored samples are generally not available for most CV studies.

The majority of CV studies use surveys to collect data. Since data are collected only from respondents, the sample is truncated. According to existing econometric methods, in order to test and to correct for self-selection bias in CV studies, the ML estimator that is based on a truncated sample (Bloom and Killingsworth, 1985) must be adopted. In theory, ML estimators lead to consistent and efficient estimates provided that the likelihood function is correctly specified.
However, Monte Carlo experiments have shown that in truncated samples, parameters in the self-selection equation could not be estimated reliably even with large samples (Muthén and Jöreskog, 1983). This disadvantage of the ML estimator based on a truncated sample motivates the derivation of ML estimators in the next chapter, which combine individual survey data with census data and convert a truncated sample into a censored sample.

CHAPTER 3

SELF-SELECTION MODELS WITH MEASUREMENT ERRORS

A self-selection model consists of two correlated components. The first component is a self-selection equation, which is essentially a probit model. The second component is a demand equation. Under self-selection, an individual's demand is observed only if the corresponding latent variable generated by the self-selection equation has a value greater than zero. In this study, the self-selection models differ from conventional models: for those individuals with unobserved demand, all of the independent variables in the self-selection equation are observed, but with measurement errors.

This chapter begins by describing a self-selection equation. The self-selection equation is developed using the random utility model under a CV framework. A probit model with measurement errors is derived in which the r.h.s. variables are measured with errors whenever the l.h.s. latent variable has a value less than or equal to zero. A self-selection model with measurement errors is then developed based on this probit model with measurement errors and a linear demand equation. Finally, the model is generalized to allow for qualitative and limited dependent variables in the demand equation.

3.1 Self-selection based on a random utility model under a CV framework

Assume a CV study uses mail surveys to elicit the demand for a good $Q$. In addition, the questionnaires used in the survey are open-ended.
Individual $i$'s demand for the good is $q_i$.^1 However, $q_i$ is observed only if individual $i$ returned the survey and gave valid answers.^2 Suppose that individual $i$'s decision to return the survey is based on his net utility gain from answering the survey. Individual $i$ will return the survey only if the net utility gain from answering the survey is positive. However, individual $i$'s net utility gain cannot be observed directly; only the realization of the net utility gain (i.e. to return or not to return the survey) is observed.

^1 At this stage, $q_i$ is assumed to be continuous and $-\infty < q_i < \infty$. This assumption will be relaxed later in this chapter.

^2 All of the returned surveys are assumed to have valid answers. This assumption excludes the case of item non-response.

To model an individual's self-selection behavior, assume that an individual maximizes utility subject to both a budget and a time constraint:

$$\begin{aligned} \max\ & U(C, L, t \cdot I \mid s) \\ \text{s.t.}\ & w \cdot T^* - w \cdot L - w \cdot t \cdot I = P \cdot C, \\ & T^* = T + L + t \cdot I, \end{aligned} \tag{1}$$

where $C$ is a composite good that the individual consumes at price $P$, $L$ is leisure time, $t$ is the time devoted to answering the survey, $I$ is an indicator which equals 1 if he answers the survey and 0 otherwise, $w$ is the wage rate, $s$ is a vector of socio-economic and demographic variables other than the wage rate, $T^*$ is the total time available (which is fixed), and $T$ is the time devoted to market work, which is also assumed to be fixed. At maximum utility,

$$U_L = \frac{U_I}{t}, \tag{2}$$

where $U_L$ and $U_I$ are the marginal utilities of leisure and of answering the survey, respectively. Equation (2) states that, at the utility maximum, the marginal utility per unit of time for answering the survey equals the marginal utility of leisure (i.e. the marginal utility of answering the survey is equal to the marginal utility of leisure multiplied by the time used in answering the survey). The individual's indirect utility function can be written as

$$U^*(w, T, P, t \mid s, I). \tag{3}$$

Let $C$ be the numeraire, and set $P = 1$.
Furthermore, assume that $t$ is constant across individuals. Since $T$ is treated as fixed, the indirect utility function becomes

$$U^*(Y \mid s, I), \tag{4}$$

where $Y$ ($= w \cdot T$) is the individual's income. The condition for an individual to answer and to return the survey is

$$U^*(Y \mid s, I = 1) - U^*(Y \mid s, I = 0) = V(Y, s) = V(z) > 0, \tag{5}$$

where $z = (Y, s)$ is a vector of socio-economic and demographic variables (including income).

Assume $V(z)$ is a linear function of all elements in $z$, and $u$ is a random error drawn from a standard normal distribution. Individual $i$'s self-selection equation can be expressed by a standard probit model:

$$\begin{aligned} & I_i^* = z_i'\gamma + u_i, \qquad u_i \sim \text{i.i.d. } N(0, 1), \\ & I_i = 1 \ \text{ if } I_i^* > 0, \\ & I_i = 0 \ \text{ otherwise.} \end{aligned} \tag{6}$$

Equation (6) is the self-selection equation that models individual $i$'s decision behavior. In equation (6), $I_i^*$ is the net (indirect) utility gain and cannot be observed, $z_i$ is a column vector of exogenous variables (including income) that explain individual $i$'s net (indirect) utility gain, $\gamma$ is a column vector of parameters to be estimated, and $N(0, 1)$ represents a standard normal distribution. Although $I_i^*$ cannot be observed, researchers can observe $I_i$.

3.2 A probit model with measurement errors

In this study, measurement errors occur when proxy variables are used to approximate the true values of the exogenous variables in the self-selection equation for non-respondents. As stated earlier, the self-selection equation is essentially a probit model. Before the self-selection model with measurement errors can be studied, a probit model with measurement errors must be discussed.

3.2.1 Derivation of the probit model with measurement errors

Following the notation used in the previous sections, the derivation of a probit model with measurement errors begins with

$$I_i^* = z_i'\gamma + u_i, \qquad u_i \sim \text{i.i.d. } N(0, 1). \tag{7}$$

As before, $I_i^*$ is a latent variable, $z_i$ is a $k \times 1$ vector of independent variables, $\gamma$ is a $k \times 1$ vector of parameters, and $u_i$ is an error term drawn from a standard normal distribution. $I_i^*$ cannot be observed; however, $I_i$ can be observed, where $I_i$ equals 1 if $I_i^* = z_i'\gamma + u_i > 0$ and 0 otherwise.

For a respondent, $I_i = 1$, $z_i$ can be observed, and the likelihood for the respondent is derived as:

$$\begin{aligned} & z_i'\gamma + u_i > 0 \\ \Rightarrow\ & u_i > -z_i'\gamma \\ \Rightarrow\ & \text{Prob}(u_i > -z_i'\gamma) = 1 - \Phi(-z_i'\gamma) = \Phi(z_i'\gamma), \end{aligned} \tag{8}$$

where $\Phi(\cdot)$ is the standard normal distribution function.

For a non-respondent, $I_i = 0$, $z_i$ cannot be observed. However, $\mu_i$, the average value of $z_i$, is estimated using a random sample drawn from individual $i$'s neighborhood (e.g. a census block^3).^4 Let $n_i$ be the size of the random sample and $z_i \sim N(\mu_i^*, \Sigma_i^*)$. Then

$$\mu_i \sim N\!\left(\mu_i^*,\ \frac{\Sigma_i^*}{n_i}\right). \tag{9}$$

^3 This can be a census block, a county, a state, or even a region. For convenience, a census block is used in the following analyses.

^4 For example, this can be done by matching the mailing list with the census block to obtain the average value of each $z_i$ available in the census data.

Define measurement errors as $v_i = z_i - \mu_i$; then

$$\begin{aligned} & v_i \sim N\!\left(0,\ \Sigma_i^* - \frac{\Sigma_i^*}{n_i}\right) \\ \Rightarrow\ & v_i \sim N\!\left(0,\ \frac{n_i - 1}{n_i}\,\Sigma_i^*\right). \end{aligned} \tag{10}$$

In general, $(n_i - 1)/n_i \approx 1$, so the distribution of $v_i$ can be approximated by^5

$$v_i \sim N(0, \Sigma_i^*). \tag{11}$$

To derive the likelihood for a non-respondent, the self-selection equation for a non-respondent can be written as

$$\begin{aligned} & z_i'\gamma + u_i \le 0 \\ \Rightarrow\ & (\mu_i + v_i)'\gamma + u_i \le 0 \\ \Rightarrow\ & \mu_i'\gamma + w_i \le 0, \ \text{ and } \ w_i = u_i + v_i'\gamma. \end{aligned} \tag{12}$$

Further, assume that $u_i$ and $v_i$ are independent. Then

$$w_i \sim N(0, \omega_i^2), \ \text{ and } \ \omega_i^2 = 1 + \gamma'\Sigma_i^*\gamma. \tag{13}$$

^5 Alternatively, the unobserved $z_i$ can be decomposed into the sum of a deterministic component, $\mu_i^*$, and a random component, $v_i$, with $v_i \sim N(0, \Sigma_i^*)$. Now replace $\mu_i^*$ with its consistent estimate, $\mu_i$. We have $z_i = \mu_i + v_i$, and $v_i \sim N(0, \Sigma_i^*)$.

The likelihood for a non-respondent can then be derived as^6

$$\begin{aligned} & \mu_i'\gamma + w_i \le 0, \qquad w_i \sim N(0, \omega_i^2) \\ \Rightarrow\ & w_i \le -\mu_i'\gamma \\ \Rightarrow\ & \text{Prob}(w_i \le -\mu_i'\gamma) = \Phi\!\left(\frac{-\mu_i'\gamma}{\omega_i}\right). \end{aligned} \tag{14}$$
Based on equations (8) through (14), the likelihood function for the probit model with measurement errors is

$$L_s = \prod_{I_i = 1} \Phi(z_i'\gamma) \cdot \prod_{I_i = 0} \Phi\!\left(\frac{-\mu_i'\gamma}{\omega_i}\right). \tag{15}$$

^6 As with a regular probit model, $\gamma$ can only be identified up to a scalar multiple. Let $k$ be a scalar with $k > 0$. According to equations (12) and (13),

$$\begin{aligned} & z_i'\gamma + u_i \le 0 \ \Rightarrow\ k z_i'\gamma + k u_i \le 0 \\ \Rightarrow\ & k\mu_i'\gamma + (k u_i + k v_i'\gamma) \le 0, \ \text{ and } \ (k u_i + k v_i'\gamma) \sim N\!\left(0,\ k^2 (1 + \gamma'\Sigma_i^*\gamma)\right) \\ \Rightarrow\ & \text{Prob}(k u_i + k v_i'\gamma \le -k\mu_i'\gamma) = \Phi\!\left(\frac{-k\mu_i'\gamma}{k\sqrt{1 + \gamma'\Sigma_i^*\gamma}}\right) = \Phi\!\left(\frac{-\mu_i'\gamma}{\omega_i}\right). \end{aligned}$$

Comparing the likelihood function for the probit model with measurement errors to the likelihood function for a regular probit model,^7 the difference between the two is found in the second term, which represents the likelihood for non-respondents. When average characteristics from non-respondents' neighborhoods, $\mu_i$, replace the true values of non-respondents' characteristics, $z_i$, the variance is increased from 1 to $\omega_i^2$ ($= 1 + \gamma'\Sigma_i^*\gamma$).

By combining the individual survey data with the census data, the original truncated sample becomes a censored sample. However, due to measurement errors, members of the new censored sample are independently but not identically distributed. For respondents, $u_i \sim$ i.i.d. $N(0, 1)$, but for non-respondents, $w_i \sim N(0, \omega_i^2)$.

^7 The likelihood function for a regular probit model is $L = \prod_{I_i = 1} \Phi(z_i'\gamma) \cdot \prod_{I_i = 0} \Phi(-z_i'\gamma)$.

3.2.2 Parameter identification in the probit model with measurement errors

It is well known that a measurement errors model suffers from problems of parameter identification (Fuller, 1987). In practice, the probit model with measurement errors derived above suffers from the same problem; namely, $(\gamma, \Sigma_i^*)$ cannot be identified simultaneously.^8 To apply the probit model with measurement errors without further complicating the model, one alternative is to replace $\Sigma_i^*$ by its consistent estimate.

^8 Since the number of parameters (elements in $\Sigma_i^*$) increases with the sample size, there is an incidental parameters problem.
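As a rough illustration of equation (15), the log-likelihood of the probit model with measurement errors can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the estimation code used in this study: the function names (`norm_cdf`, `dot`, `quad_form`, `probit_me_loglik`) and the plain-list data layout are hypothetical, and the covariance matrix attached to each non-respondent is taken as given.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def quad_form(gamma, cov):
    """Quadratic form gamma' * cov * gamma for a square list-of-lists cov."""
    k = len(gamma)
    return sum(gamma[r] * cov[r][c] * gamma[c] for r in range(k) for c in range(k))


def probit_me_loglik(gamma, z_resp, mu_nonresp, cov_nonresp):
    """Natural log of the likelihood in equation (15).

    z_resp      : observed z_i for respondents (I_i = 1)
    mu_nonresp  : neighborhood means mu_i for non-respondents (I_i = 0)
    cov_nonresp : one variance-covariance matrix per non-respondent
    (all three argument names are illustrative assumptions)
    """
    loglik = 0.0
    for z in z_resp:
        # Respondent: Prob(u_i > -z_i'gamma) = Phi(z_i'gamma), equation (8)
        loglik += math.log(norm_cdf(dot(z, gamma)))
    for mu, cov in zip(mu_nonresp, cov_nonresp):
        # Non-respondent: w_i ~ N(0, omega_i^2) with omega_i^2 = 1 + gamma' cov gamma
        omega = math.sqrt(1.0 + quad_form(gamma, cov))
        loglik += math.log(norm_cdf(-dot(mu, gamma) / omega))
    return loglik
```

Setting every covariance matrix to zero makes $\omega_i = 1$, so the sketch collapses to the regular probit log-likelihood with $\mu_i$ in place of $z_i$; this is a convenient sanity check.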
From census data, there are two candidates that can be chosen to replace $\Sigma_i^*$. The first candidate, $\Sigma_i$, is a variance-covariance matrix estimated from a sample drawn from the census block for non-respondent $i$. Typically, $\Sigma_i \ne \Sigma_j$ unless non-respondents $i$ and $j$ live in the same census block. In contrast to $\Sigma_i$, whose values vary across non-respondents, the second candidate, $\Sigma$, is a constant variance-covariance matrix estimated from a sample drawn from the population. This same constant variance-covariance matrix, $\Sigma$, is applied to all non-respondents.

In practice, $\Sigma$ and $\Sigma_i$ can be calculated using the "Public-Use Microdata Samples."^9 Researchers can purchase a 5-percent "Public-Use Microdata Sample" and use this sample to calculate $\Sigma$; or the 5-percent sample can be broken down into census blocks^10 and $\Sigma_i$ can be calculated for each census block.

In terms of empirical results, since both $\Sigma$ and $\Sigma_i$ lead to consistent parameter estimates, it is difficult to determine a priori whether $\Sigma$ or $\Sigma_i$ will do better. The consequences of using $\Sigma$ and $\Sigma_i$ are examined by Monte Carlo experiments in the next chapter.^11

^9 The "Public-Use Microdata Sample" can be purchased from the U.S. Department of Commerce, Bureau of Census, ph: (301) 763-2005.

^10 An alternative is to purchase a 5-percent "Public-Use Microdata Sample" for each census block.

^11 To simplify notation in the following sections of this chapter, $\Sigma_i$ is used to represent either $\Sigma$ or $\Sigma_i$.

3.3 A self-selection model with measurement errors and a linear demand equation

A self-selection model with measurement errors is derived in this section by replacing the self-selection equation (a probit model) in a self-selection model with the probit model with measurement errors. Recall that the self-selection equation with measurement errors is defined as:

(1) For a respondent, $I_i = 1$,

$$z_i'\gamma + u_i > 0, \qquad u_i \sim \text{i.i.d. } N(0, 1). \tag{16}$$

(2) For a non-respondent, $I_i = 0$,

$$\begin{aligned} & z_i'\gamma + u_i \le 0, \qquad u_i \sim \text{i.i.d. } N(0, 1) \\ \Rightarrow\ & \mu_i'\gamma + w_i \le 0, \qquad w_i = u_i + v_i'\gamma, \\ & w_i \sim N(0, \omega_i^2), \ \text{ and } \ \omega_i^2 = 1 + \gamma'\Sigma_i\gamma. \end{aligned} \tag{17}$$

To derive the self-selection model with measurement errors, assume that individual $i$'s demand for a good ($Q$) is

$$q_i = x_i'\beta + e_i, \qquad e_i \sim \text{i.i.d. } N(0, \sigma^2), \tag{18}$$

where $x_i$ and $\beta$ are both $m \times 1$ vectors. Further, assume that $(e_i, u_i)$ are distributed jointly as a bivariate normal distribution with a density function

$$g(0, 0, \Omega), \tag{19}$$

where

$$\Omega = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{bmatrix} \tag{20}$$

is the variance-covariance matrix, and $\rho$ is the correlation coefficient. For a respondent, the likelihood is

$$\int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du. \tag{21}$$

For non-respondents, assume that the demand is uncorrelated with the measurement errors (i.e. $\text{Cov}(e_i, v_i) = 0$); then $(e_i, w_i)$ are distributed jointly as a bivariate normal distribution with a density function

$$g(0, 0, \Gamma_i), \tag{22}$$

where

$$\Gamma_i = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & \omega_i^2 \end{bmatrix} \tag{23}$$

is the variance-covariance matrix.^12 The likelihood for a non-respondent can then be written as

$$\int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Gamma_i)\, dw\, de. \tag{24}$$

^12 $E(w_i e_i) = E[(u_i + v_i'\gamma) e_i] = E[(u_i + (z_i - \mu_i)'\gamma) e_i] = E(u_i e_i + z_i'\gamma e_i - \mu_i'\gamma e_i) = E(u_i e_i) = \rho\sigma$.

Based on equations (16) through (24), the likelihood function for the self-selection model with measurement errors and a linear demand equation is

$$L_L = \prod_{I_i = 1} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du \cdot \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Gamma_i)\, dw\, de. \tag{25}$$

Consistent estimates for $(\gamma, \rho, \beta, \sigma^2)$ can be obtained by maximizing $\ln(L_L)$.

Comparing the likelihood function for the self-selection model with measurement errors to the likelihood function for the self-selection model with a censored sample (Chapter 2, Section 2.3.2), the difference between the two is found in the second term, which represents the likelihood for non-respondents. When average characteristics from non-respondents' neighborhoods, $\mu_i$, replace the true values of non-respondents' characteristics, $z_i$, the variance in the self-selection equation is changed from 1 to $\omega_i^2$ ($= 1 + \gamma'\Sigma_i\gamma$).
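Because $g$ is a bivariate normal density, both terms of equation (25) reduce to univariate normal expressions. Writing $\varepsilon_i = q_i - x_i'\beta$, integrating $u$ out of the respondent term gives $\sigma^{-1}\phi(\varepsilon_i/\sigma)\,\Phi\big((z_i'\gamma + \rho\varepsilon_i/\sigma)/\sqrt{1-\rho^2}\big)$, and integrating $e$ out of the non-respondent term leaves $\Phi(-\mu_i'\gamma/\omega_i)$. The Python sketch below evaluates $\ln(L_L)$ on that basis. It is an illustration only, assuming $|\rho| < 1$; the function name `linear_me_loglik`, its argument names, and the list-based data layout are hypothetical and are not taken from the estimation program used in this study.

```python
import math


def norm_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def linear_me_loglik(beta, gamma, sigma_e, rho, respondents, nonrespondents):
    """ln(L_L) from equation (25), using the closed forms described above.

    respondents    : list of (q_i, x_i, z_i) tuples
    nonrespondents : list of (mu_i, cov_i) tuples, cov_i a list-of-lists matrix
    sigma_e        : standard deviation of the demand error e_i
    """
    scale = math.sqrt(1.0 - rho * rho)  # sd of u_i conditional on e_i
    k = len(gamma)
    loglik = 0.0
    for q, x, z in respondents:
        eps = (q - dot(x, beta)) / sigma_e  # standardized demand residual
        density = norm_pdf(eps) / sigma_e   # marginal density of e_i
        prob_return = norm_cdf((dot(z, gamma) + rho * eps) / scale)
        loglik += math.log(density) + math.log(prob_return)
    for mu, cov in nonrespondents:
        # omega_i^2 = 1 + gamma' cov_i gamma, as in equation (17)
        omega2 = 1.0 + sum(
            gamma[r] * cov[r][c] * gamma[c] for r in range(k) for c in range(k)
        )
        loglik += math.log(norm_cdf(-dot(mu, gamma) / math.sqrt(omega2)))
    return loglik
```

In this form the non-respondent term depends only on $\gamma$ (through $\mu_i'\gamma$ and $\omega_i$), so the information that non-respondents add bears directly on the self-selection parameters.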
In the self-selection model with measurement errors, members of the sample are independently, but no longer identically, distributed. In addition, compared to the ML estimates from a self-selection model with a truncated sample (Chapter 2, Section 2.3.3), the ML estimates from a self-selection model with measurement errors are more efficient due to the newly introduced information $\mu_i$ (the average characteristics from non-respondents' neighborhoods) and $\Sigma_i$ (the corresponding variance-covariance matrix).

3.4 Generalization for closed-ended questionnaires

The above discussion focuses on the case where the dependent variable in the demand function ($q_i$) is continuous. However, in many CV studies, the demand responses are not continuous. For example, in many open-ended questionnaires the demand responses are censored (e.g. a Tobit model). On the other hand, surveys using referendum-type (closed-ended) questionnaires produce dichotomized responses. In the following discussion, the demand equation in the self-selection model with measurement errors is modified to allow for qualitative and limited dependent variables. The following models present the cases where the demand equation is either a Tobit or a probit-related model.

3.4.1 A self-selection model with measurement errors and a Tobit demand equation

A Tobit demand equation is defined as:

$$\begin{aligned} & q_i^* = x_i'\beta + e_i, \qquad e_i \sim \text{i.i.d. } N(0, \sigma^2), \\ & q_i = q_i^* \ \text{ if } x_i'\beta + e_i > 0, \\ & q_i = 0 \ \text{ otherwise.} \end{aligned} \tag{26}$$

The observed demand is now $q_i$, which is left censored at 0. A self-selection model with measurement errors and a Tobit demand equation is described by equations (16), (17), (26), (19), (20), (22), and (23).

For a respondent, if the observed demand equals 0, the likelihood is

$$\int_{-\infty}^{-x_i'\beta} \int_{-z_i'\gamma}^{\infty} g(e, u, \Omega)\, du\, de. \tag{27}$$

If the observed demand for a respondent is $q_i > 0$, the likelihood is

$$\int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du. \tag{28}$$

For a non-respondent, the likelihood is

$$\int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Gamma_i)\, dw\, de. \tag{29}$$
Based on equations (27), (28), and (29), the likelihood function for the self-selection model with measurement errors and a Tobit demand equation is

$$\begin{aligned} L_T = {} & \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Gamma_i)\, dw\, de \\ & \cdot \prod_{I_i = 1,\, q_i = 0} \int_{-\infty}^{-x_i'\beta} \int_{-z_i'\gamma}^{\infty} g(e, u, \Omega)\, du\, de \\ & \cdot \prod_{I_i = 1,\, q_i > 0} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du. \end{aligned} \tag{30}$$

In the likelihood function, $L_T$, the first term is the likelihood for non-respondents. The second term is the likelihood for those respondents whose $q_i = 0$. The third term is the likelihood for those respondents whose $q_i > 0$.

3.4.2 A self-selection model with measurement errors and a probit demand equation

In a referendum-type (closed-ended) questionnaire, a respondent is usually asked to answer YES or NO with respect to a given referendum index.^13 The demand equation takes the form

$$\begin{aligned} & q_i^* = x_i'\beta + e_i, \qquad e_i \sim \text{i.i.d. } N(0, 1), \\ & q_i = 1 \ \text{ if } x_i'\beta + e_i > 0, \\ & q_i = 0 \ \text{ otherwise,} \end{aligned} \tag{31}$$

where one of the elements of $x_i$ is the referendum index.

^13 For example, a respondent may face a question such as "To maintain the current water quality in your neighborhood, you will have to pay an extra $100 per year. Are you willing to pay for it or not?" The $100 here is the referendum index (price).

For respondents, the probit demand equation is related to the self-selection equation (equation (16)) by the assumption that $(e_i, u_i)$ are distributed jointly as a bivariate normal distribution with a density function

$$g(0, 0, \Theta), \tag{32}$$

where

$$\Theta = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \tag{33}$$

is the variance-covariance matrix. For a respondent who answers YES with respect to the referendum index, $I_i = 1$ and $q_i = 1$, the likelihood is

$$\int_{-x_i'\beta}^{\infty} \int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de. \tag{34}$$

For a respondent who answers NO with respect to the referendum index, $I_i = 1$ and $q_i = 0$, the likelihood is

$$\int_{-\infty}^{-x_i'\beta} \int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de. \tag{35}$$

For non-respondents, the relationship between the probit demand equation and the self-selection equation with measurement errors (equation (17)) can be derived where $(e_i, w_i)$ are distributed jointly as a bivariate normal distribution with density function

$$g(0, 0, \Lambda_i), \tag{36}$$

where

$$\Lambda_i = \begin{bmatrix} 1 & \rho \\ \rho & \omega_i^2 \end{bmatrix} \tag{37}$$

and $\omega_i^2 = 1 + \gamma'\Sigma_i\gamma$. The likelihood for a non-respondent is

$$\int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Lambda_i)\, dw\, de. \tag{38}$$

Based on equations (34), (35), and (38), the likelihood function for the self-selection model with measurement errors and a probit demand equation is

$$\begin{aligned} L_P = {} & \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Lambda_i)\, dw\, de \\ & \cdot \prod_{I_i = 1,\, q_i = 0} \int_{-\infty}^{-x_i'\beta} \int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de \\ & \cdot \prod_{I_i = 1,\, q_i = 1} \int_{-x_i'\beta}^{\infty} \int_{-z_i'\gamma}^{\infty} g(e, u, \Theta)\, du\, de. \end{aligned} \tag{39}$$

In the likelihood function, $L_P$, the first term is the likelihood for non-respondents. The second term is the likelihood for those respondents who answered NO with respect to the referendum index. The third term is the likelihood for those respondents who answered YES with respect to the referendum index.

An alternative to a probit demand equation is a censored probit inverse demand equation (Cameron and James, 1987; Cameron, 1988).^14 Instead of modeling the probability of answering YES or NO with respect to a referendum index, a censored probit model treats the answer YES (NO) as if $q_i^*$ is greater than or equal to (less than) the referendum index. Thus, the true $q_i^*$ is censored at the referendum index. However, in terms of econometric estimation, a censored probit demand equation produces results comparable to those of a probit demand equation (McConnell, 1990).

For a self-selection model with measurement errors and a censored probit inverse demand equation, the censored probit inverse demand equation is defined as:

$$\begin{aligned} & q_i^* = x_i'\beta + e_i, \qquad e_i \sim \text{i.i.d. } N(0, \sigma^2), \\ & q_i = 1 \ \text{ if } x_i'\beta + e_i > p_i, \\ & q_i = 0 \ \text{ otherwise,} \end{aligned} \tag{40}$$

where $p_i$ is the referendum index and $x_i$ no longer contains the referendum index. The self-selection model with measurement errors and a censored probit inverse demand equation is defined by equations (16), (17), (40), (19), (20), (22), and (23).
It can be easily shown that the likelihood function for the self-selection model with measurement errors and a censored probit inverse demand equation is^15

$$\begin{aligned} L_{CP} = {} & \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Gamma_i)\, dw\, de \\ & \cdot \prod_{I_i = 1,\, q_i = 0} \int_{-\infty}^{p_i - x_i'\beta} \int_{-z_i'\gamma}^{\infty} g(e, u, \Omega)\, du\, de \\ & \cdot \prod_{I_i = 1,\, q_i = 1} \int_{p_i - x_i'\beta}^{\infty} \int_{-z_i'\gamma}^{\infty} g(e, u, \Omega)\, du\, de. \end{aligned} \tag{41}$$

In the likelihood function, $L_{CP}$, the first term is the likelihood for non-respondents. The second term is the likelihood for those respondents who answered NO with respect to the referendum index, $p_i$. The third term is the likelihood for those respondents who answered YES with respect to the referendum index, $p_i$.^16

^14 If the demand equation is estimated by a probit model, the censored probit model estimates the inverse demand equation.

^15 Unlike the case of a probit demand equation, in a censored probit (logit) inverse demand equation, $\sigma^2$ is identifiable.

^16 A double-bounded censored logistic regression developed by Hoehn and Loomis (1993) can also be applied. Derivation of the likelihood function is straightforward.

3.5 Summary

Models derived in this chapter take the average characteristics from the non-respondents' neighborhoods and treat them as the non-respondents' characteristics, measured with error. Based on the measurement errors approach, the probit self-selection equation is modified and becomes a probit model with measurement errors. A self-selection model with measurement errors is constructed using the probit model with measurement errors and a linear demand equation.

CV studies use either open-ended or closed-ended questionnaires to collect data. For open-ended questionnaires, responses to demand are sometimes censored. For example, given a specific price, demand for a good may be left censored at zero. For closed-ended questionnaires, the responses are dichotomized (YES or NO).
To account for these situations, the self-selection model with measurement errors is generalized to allow for a Tobit demand equation, a probit demand equation, or a censored probit inverse demand equation.

Based on the measurement errors approach, models derived in this chapter convert a truncated sample into a censored sample. By applying these models, it is expected that the disadvantages of estimates under a truncated sample are removed and the advantages of estimates under a censored sample are obtained; namely, reliable estimates of the parameters in both the self-selection and the demand equations. Furthermore, some gain in efficiency is expected.

CHAPTER 4

MONTE CARLO EXPERIMENTS AND RESULTS

In the previous chapter, self-selection models with measurement errors were developed with 1) a linear demand equation; 2) a Tobit demand equation; 3) a probit demand equation; and 4) a censored probit inverse demand equation. Deviating from conventional measurement errors models, the variance-covariance matrix of the measurement errors was replaced by its consistent estimates. Two candidates were considered as replacements for the variance-covariance matrix. One candidate, $\Sigma$, was the variance-covariance matrix estimated from a sample drawn from the population, and was not available for each census block. The other candidate, $\Sigma_i$, was the variance-covariance matrix estimated from samples drawn from each non-respondent's census block.^1

The purpose of this chapter is to use Monte Carlo experiments to examine and compare the resulting estimates from 1) a truncated sample without correcting for self-selection bias; 2) a self-selection model with a censored sample;^2 3) a self-selection model with measurement errors that adopts $\mu_i$, the mean vector, and $\Sigma$; and 4) a self-selection model with measurement errors that adopts $\mu_i$ and $\Sigma_i$.^3 Monte Carlo experiments are conducted for each type of demand equation except the censored probit inverse demand equation.^4

^1 A third type of variance-covariance matrix, diag($\Sigma_i$), which assumes zero covariance, was also tried. Although diag($\Sigma_i$) is very easy to obtain, it was abandoned for two reasons. First, the zero covariance assumption is not plausible. Second, under the model specified below, the ML estimator based on diag($\Sigma_i$) never converged during the optimization procedure.

^2 Although it is nearly impossible to acquire a censored sample in reality, estimates from a censored sample give the best possible results and can be used as a benchmark for the results from the measurement errors models proposed in this study.

^3 Monte Carlo experiments for a self-selection model with a truncated sample are conducted only for a linear demand equation with $\rho = 0.75$.

^4 Due to the similarity between a probit and a censored probit model, estimates from a censored probit model are omitted.

This chapter begins with the data generation process. Steps for the Monte Carlo experiments are described and the resulting estimates are then reported. Comparisons of the results are presented, followed by concluding remarks.

4.1 Data generation

Due to the properties of the proposed self-selection models with measurement errors, the data generation process is not straightforward. In each replication, in order to acquire useful information, data used in the Monte Carlo experiments are generated in two steps. In the first step, a "population" is generated and certain required statistics are calculated. In the second step, a "sample" is drawn from the "population," and models are estimated based on the "sample."

4.1.1 Population generation

For each replication, a 10,000 x 5 matrix, $[x_1\ x_2\ x_3\ u\ e]$, is first generated, where $[x_{i1}\ x_{i2}\ x_{i3}\ u_i\ e_i]$ is distributed as an i.i.d.
multivariate normal distribution with a mean vector $[3\ \ 1.5\ \ 4\ \ 0\ \ 0]$ and a variance-covariance matrix^5

$$\text{Cov}(x_{i1}, x_{i2}, x_{i3}, u_i, e_i) = \begin{bmatrix} 1.44 & 0.24 & 0.096 & 0 & 0 \\ 0.24 & 1 & 0.24 & 0 & 0 \\ 0.096 & 0.24 & 0.64 & 0 & 0 \\ 0 & 0 & 0 & 1 & \rho \\ 0 & 0 & 0 & \rho & 1 \end{bmatrix},$$

where $\rho$ = 0.25, 0.5, or 0.75.^6 Since one of the demand specifications and the self-selection equation are both probit equations, setting Var($u$) = Var($e$) = 1 simplifies comparison of parameter estimates.^7 Dependent variables for both the self-selection equation ($I_i^*$) and the demand equation ($q_i^*$) are generated by

$$\begin{aligned} I_i^* &= 1.5 + 1\, x_{i1} - 3\, x_{i2} + u_i, \ \text{ and} \\ q_i^* &= 6 + 4\, x_{i2} - 3\, x_{i3} + e_i. \end{aligned}$$

^5 Corr($x_1$, $x_2$) = 0.2, Corr($x_1$, $x_3$) = 0.1, Corr($x_2$, $x_3$) = 0.3, Corr($u$, $e$) = $\rho$, and $\rho$ = 0.25, 0.5, or 0.75.

^6 Based on $\rho$ = 0.25, 0.5, and 0.75, three sequences of simulations are conducted for each of the models.

^7 Recall that in a probit model, $\beta$ and $\sigma$ are not separately identifiable. The coefficients estimated are $\beta / \sigma$.

In order for the model to be identifiable when both the demand and self-selection equations are of probit type, the two equations cannot have exactly the same independent variables.^8 Based on the process described above, a sample $[I^*\ q^*\ x_1\ x_2\ x_3\ u\ e]$, which contains 10,000 observations and 7 variables, is generated and treated as the "population" in a replication.

To apply the self-selection models with measurement errors, certain statistics related to the distribution of $x_1$ and $x_2$ are required (i.e. the mean vector and variance-covariance matrix). To obtain the necessary statistics, a random sample containing 200 observations is drawn from the "population" and the variance-covariance matrix ($\Sigma$) of $x_1$ and $x_2$ is calculated. The next step is to randomly group the "population" into 250 "blocks" with 40 observations in each block. For each block, the mean vector ($\mu_i$) and the variance-covariance matrix ($\Sigma_i$) of $x_1$ and $x_2$ are calculated.

4.1.2 Sample generation

In each replication, a random sample consisting of 1,000 observations is drawn from the population.
In the random sample, observations with $I_i^* > 0$ ($I_i^* \le 0$) are treated as respondents (non-respondents). Since the mean of $I_i^*$ is zero, a response rate of roughly 50% (500 respondents) is expected.

^8 An alternative is to have Corr($u$, $e$) = 0. However, if this is the case, self-selection does not exist.

In each replication, four models are estimated: 1) without correcting for self-selection bias, a demand equation is estimated based on the truncated sample, i.e. the number of observations is about 500; 2) both a demand and a self-selection equation are estimated based on a censored sample with 1,000 observations, i.e. for non-respondents, $x_1$ and $x_2$ are observable; 3) both a demand and a self-selection equation are estimated using a self-selection model with measurement errors, where for non-respondents, because $x_1$ and $x_2$ are unobserved, $\mu_i$ and $\Sigma$ are used (i.e. the number of observations is 1,000); and 4) both a demand and a self-selection equation are estimated using a self-selection model with measurement errors, where for non-respondents, because $x_1$ and $x_2$ are unobserved, $\mu_i$ and $\Sigma_i$ are used (i.e. the number of observations is 1,000).

As previously mentioned, three types of demand equations are used in the analysis. For a linear demand equation, $q_i^*$ is used as the dependent variable. If the demand equation is a Tobit equation, $q_i^*$ is left censored at 0 ($q_i = q_i^*$ if $q_i^* > 0$; 0 otherwise). Finally, for a probit demand equation, $q_i^*$ is dichotomized ($q_i = 1$ if $q_i^* > 0$; 0 otherwise).

4.1.3 Monte Carlo experiments

Based on different demand specifications, three types of simulations, related to a linear, a Tobit, and a probit demand equation, are conducted. For each type of demand specification, three sequences of simulations are conducted based on different values of the correlation between self-selection and demand ($\rho$ = 0.25, 0.5, and 0.75). At each replication, four models are estimated, and the number of replications is 500.
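The data generation steps of Section 4.1 can be sketched in Python as below. This is an illustrative sketch using only the standard library, not the GAUSS program used in this study: the function names, the dictionary layout, the seed, and the hand-written Cholesky factorization are all assumptions made for the example. The block-level and population-level variance-covariance matrices ($\Sigma_i$ and $\Sigma$) are omitted for brevity; only the block means $\mu_i$ of $x_1$ and $x_2$ are computed.

```python
import math
import random

MEAN = [3.0, 1.5, 4.0, 0.0, 0.0]  # means of x1, x2, x3, u, e


def covariance(rho):
    """Variance-covariance matrix of (x1, x2, x3, u, e) with Corr(u, e) = rho."""
    return [
        [1.44, 0.24, 0.096, 0.0, 0.0],
        [0.24, 1.0, 0.24, 0.0, 0.0],
        [0.096, 0.24, 0.64, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0, rho],
        [0.0, 0.0, 0.0, rho, 1.0],
    ]


def cholesky(a):
    """Lower-triangular factor low with low * low' = a (a symmetric, positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = sum(low[r][k] * low[c][k] for k in range(c))
            if r == c:
                low[r][c] = math.sqrt(a[r][r] - s)
            else:
                low[r][c] = (a[r][c] - s) / low[c][c]
    return low


def generate_population(rho, rng, size=10000):
    """Draw the 'population' and build the latent variables I* and q*."""
    low = cholesky(covariance(rho))
    population = []
    for _ in range(size):
        draw = [rng.gauss(0.0, 1.0) for _ in range(5)]
        x1, x2, x3, u, e = [
            MEAN[r] + sum(low[r][k] * draw[k] for k in range(r + 1)) for r in range(5)
        ]
        population.append({
            "x1": x1, "x2": x2, "x3": x3,
            "I_star": 1.5 + 1.0 * x1 - 3.0 * x2 + u,
            "q_star": 6.0 + 4.0 * x2 - 3.0 * x3 + e,
        })
    return population


def assign_blocks(population, rng, n_blocks=250):
    """Randomly group the population into blocks; return each block's (x1, x2) mean."""
    order = list(range(len(population)))
    rng.shuffle(order)
    size = len(population) // n_blocks
    means = []
    for b in range(n_blocks):
        members = order[b * size:(b + 1) * size]
        for j in members:
            population[j]["block"] = b  # lets a non-respondent be matched to mu_i
        means.append((
            sum(population[j]["x1"] for j in members) / size,
            sum(population[j]["x2"] for j in members) / size,
        ))
    return means


def draw_sample(population, rng, n=1000):
    """Draw the 'sample' and split it by the sign of I*."""
    sample = rng.sample(population, n)
    respondents = [obs for obs in sample if obs["I_star"] > 0]
    nonrespondents = [obs for obs in sample if obs["I_star"] <= 0]
    return respondents, nonrespondents


rng = random.Random(1993)  # arbitrary seed, chosen only for reproducibility
population = generate_population(0.5, rng)
block_means = assign_blocks(population, rng)
respondents, nonrespondents = draw_sample(population, rng)
```

In the measurement errors models, each non-respondent's unobserved $(x_1, x_2)$ would be replaced by `block_means[obs["block"]]`, while the censored-sample benchmark would use the true values stored in the observation.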
4.2 A linear demand equation with self-selection

A self-selection model with measurement errors and a linear demand equation is derived in Chapter 3 (Section 3.3). Based on the different correlation measures between self-selection and demand ($\rho$ = 0.25, 0.5, and 0.75), the following section begins with OLS estimates from a truncated sample without correcting for self-selection; results are presented in Appendix C^9 (Tables C.1.1.A ($\rho$ = 0.25), C.1.2.A ($\rho$ = 0.5), and C.1.3.A ($\rho$ = 0.75)).

^9 Notation used in Appendix C is defined in Appendix B.

Using a censored sample, estimates for a linear demand equation with self-selection are obtained by the ML estimator based on the likelihood function

$$L_{L1} = \prod_{I_i = 1} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du \cdot \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-z_i'\gamma} g(e, u, \Omega)\, du\, de,$$

where $g(\cdot, \cdot, \cdot)$ represents a bivariate normal density function and

$$\Omega = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{bmatrix}$$

is the variance-covariance matrix. Results of this model are listed in Tables C.1.1.B ($\rho$ = 0.25), C.1.2.B ($\rho$ = 0.5), and C.1.3.B ($\rho$ = 0.75).

For a truncated sample and $\rho$ = 0.75, estimates for a linear demand equation with self-selection are obtained by the ML estimator based on the likelihood function

$$L_t = \prod_{I_i = 1} \frac{\int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du}{\Phi(z_i'\gamma)},$$

and results are listed in Table C.1.3.B.

Estimates from a self-selection model with measurement errors and a linear demand equation are obtained by the ML estimator based on ($\mu_i$, $\Sigma$) and the likelihood function

$$L_{L2} = \prod_{I_i = 1} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du \cdot \prod_{I_i = 0} \int_{-\infty}^{\infty} \int_{-\infty}^{-\mu_i'\gamma} g(e, w, \Gamma)\, dw\, de,$$

where

$$\Gamma = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 + \gamma'\Sigma\gamma \end{bmatrix},$$

and results are presented in Tables C.1.1.C ($\rho$ = 0.25), C.1.2.C ($\rho$ = 0.5), and C.1.3.C ($\rho$ = 0.75). Finally, if $\Sigma$ ($\Gamma$) is replaced by $\Sigma_i$ ($\Gamma_i$), i.e.

$$\Gamma_i = \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 + \gamma'\Sigma_i\gamma \end{bmatrix},$$

the resulting estimates are shown in Tables C.1.1.D ($\rho$ = 0.25), C.1.2.D ($\rho$ = 0.5), and C.1.3.D ($\rho$ = 0.75).
4.2.1 Monte Carlo experiment results from a self-selection model with measurement errors and a linear demand equation^10

^10 A GAUSS program for conducting the Monte Carlo experiments is provided in Appendix D.

Based on Tables C.1.1.A, C.1.2.A, and C.1.3.A, when the demand equation is estimated by applying OLS to a single equation without correcting for self-selection bias, both %BIAS and $D_{(\beta,\sigma^2);S}$ increase as $\rho$ increases. This implies that the higher the $\rho$, the farther the OLS results deviate from the true parameter values. For example, as $\rho$ increased from 0.25 to 0.75, the %BIAS of $\beta_1$ ($\sigma^2$) increased from 2.19% (1.32%) to 6.45% (7.73%). In addition, RMSE and ASE are very different for $\beta$ [subscript illegible], indicating incorrect estimates of the variance-covariance matrix.

When a censored sample is available and the self-selection model is correctly specified, $\sigma^2$, $\rho$, and the self-selection and demand parameters are well estimated by the ML estimator. As can be seen from Tables C.1.1.B, C.1.2.B, and C.1.3.B, the %BIAS among demand (self-selection) parameters ranged from 0.01% (0.08%) to 0.28% (1.20%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.32% (0.28%) to 0.62% (1.08%). $D_{(\beta,\sigma^2);CEN}$ was always smaller than $D_{(\beta,\sigma^2);S}$, and all the $D_{(\cdot);CEN}$'s remained very close to zero.

For a truncated sample, the ML estimator produces results different from those of Muthén and Jöreskog (1983). Table C.1.3.B shows that bias is not a major problem for any of $\sigma^2$, $\rho$, or the self-selection and demand parameters, even with $\rho$ = 0.75. The real problem appears to be the difference between RMSE and ASE. This difference indicates that the variance-covariance matrix produced by the ML estimator is incorrect and cannot be used to test hypotheses. Failure to conduct hypothesis testing may result in model misspecification and lead to inconsistent parameter estimates.
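The summary measures compared above can be computed from the replications along the following lines. The exact definitions are given in Appendix B and are not reproduced here, so this Python sketch assumes the conventional ones: %BIAS as the absolute percentage deviation of the mean estimate from the true value, RMSE as the root mean squared error around the true value, and ASE as the average of the estimated standard errors across replications. The function name `summarize_replications` and its arguments are hypothetical, and the $D$ measures are not covered.

```python
import math


def summarize_replications(estimates, std_errors, true_value):
    """Summarize one parameter across Monte Carlo replications.

    estimates  : point estimates, one per replication
    std_errors : estimated standard errors, one per replication
    true_value : parameter value used to generate the data (must be nonzero)
    The three definitions below are assumed conventions, not quoted from Appendix B.
    """
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    pct_bias = 100.0 * abs(mean_estimate - true_value) / abs(true_value)
    rmse = math.sqrt(sum((b - true_value) ** 2 for b in estimates) / n)
    ase = sum(std_errors) / n
    return {"%BIAS": pct_bias, "RMSE": rmse, "ASE": ase}


# Toy numbers, for illustration only (not results from this study).
summary = summarize_replications([3.9, 4.1, 4.2, 4.2], [0.1, 0.2, 0.1, 0.2], 4.0)
```

Under these assumed definitions, an RMSE well above the ASE means that the reported standard errors understate the sampling variability of the estimator, which is the pattern noted above for the truncated-sample ML estimator.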
When the measurement errors model based on $\mu_i$ and $\Sigma$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.02% (1.53%) to 0.29% (3.38%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.28% (0.53%) to 0.80% (0.88%), as shown in Tables C.1.1.C, C.1.2.C, and C.1.3.C. $D_{(\beta,\sigma^2);ME1}$ was always smaller than $D_{(\beta,\sigma^2);S}$, and all the $D_{(\cdot);ME1}$'s remained very close to zero.

When the measurement errors model based on $\mu_i$ and $\Sigma_i$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.01% (0.15%) to 0.28% (2.43%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.22% (0.40%) to 0.68% (0.82%), as shown in Tables C.1.1.D, C.1.2.D, and C.1.3.D. $D_{(\beta,\sigma^2);ME2}$ was always smaller than $D_{(\beta,\sigma^2);S}$, and all the $D_{(\cdot);ME2}$'s remained very close to zero.

Comparing results from the two measurement errors models, the only difference is that the self-selection parameters always have smaller %BIAS when $\Sigma_i$ is used. Apart from this, it is difficult to distinguish between the two models. Comparing results from the two measurement errors models with results from the censored sample, all three models give similar estimates for the demand parameters according to $D_{(\beta,\sigma^2)}$. However, according to $D_{(\gamma,\rho)}$, self-selection parameters estimated by the two measurement errors models are less efficient than the estimates from the censored sample.

4.3 A Tobit demand equation with self-selection

A self-selection model with measurement errors and a Tobit demand equation is derived in Chapter 3 (Section 3.4.1). For each correlation between self-selection and demand ($\rho$ = 0.25, 0.5, and 0.75), this section begins with Tobit ML estimates from a truncated sample without correcting for self-selection; results are presented in Appendix C (Tables C.2.1.A ($\rho$ = 0.25), C.2.2.A ($\rho$ = 0.5), and C.2.3.A ($\rho$ = 0.75)).

Using a censored sample, estimates for a Tobit demand equation with self-selection are obtained by the ML estimator based on the likelihood function

\[
L_1 = \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-z_i'\gamma} g(\varepsilon,\, u,\, \Omega)\, du\, d\varepsilon
\;\prod_{I_i=1,\, q_i=0} \int_{-\infty}^{-x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(\varepsilon,\, u,\, \Omega)\, du\, d\varepsilon
\;\prod_{I_i=1,\, q_i>0} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du,
\]

where $g(\cdot,\cdot,\cdot)$ is a bivariate normal density function and

\[
\Omega = \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix}
\]

is the variance-covariance matrix. Results of this model are listed in Tables C.2.1.B ($\rho$ = 0.25), C.2.2.B ($\rho$ = 0.5), and C.2.3.B ($\rho$ = 0.75).¹¹

¹¹ A Tobit self-selection model based on a truncated sample is dropped from the Monte Carlo experiments due to the difficulty in obtaining the starting values. The ML estimator for a Tobit self-selection model based on a truncated sample is very sensitive to the starting values. Very often, the optimization procedure cannot converge even with the true parameter values as the starting values.

Estimates from a self-selection model with measurement errors and a Tobit demand equation are obtained by the ML estimator based on ($\mu_i$, $\Sigma$) and the likelihood function

\[
L_2 = \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-\mu_i'\gamma} g(\varepsilon,\, w,\, \Gamma)\, dw\, d\varepsilon
\;\prod_{I_i=1,\, q_i=0} \int_{-\infty}^{-x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(\varepsilon,\, u,\, \Omega)\, du\, d\varepsilon
\;\prod_{I_i=1,\, q_i>0} \int_{-z_i'\gamma}^{\infty} g(q_i - x_i'\beta,\, u,\, \Omega)\, du,
\]

where

\[
\Gamma = \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 + \gamma'\Sigma\gamma \end{pmatrix},
\]

and results are presented in Tables C.2.1.C ($\rho$ = 0.25), C.2.2.C ($\rho$ = 0.5), and C.2.3.C ($\rho$ = 0.75). Finally, the results of replacing $\Sigma$ ($\Gamma$) by $\Sigma_i$ ($\Gamma_i$) are shown in Tables C.2.1.D ($\rho$ = 0.25), C.2.2.D ($\rho$ = 0.5), and C.2.3.D ($\rho$ = 0.75).

4.3.1 Monte Carlo experiment results from a self-selection model with measurement errors and a Tobit demand equation¹²

¹² A GAUSS program for conducting the Monte Carlo experiments is provided in Appendix E.

When a single equation Tobit model is applied to estimate the demand equation without correcting for self-selection bias, as in the case of a linear demand equation, both %BIAS and $D_{(\beta,\sigma^2);S}$ increase with $\rho$, as shown in Tables C.2.1.A, C.2.2.A, and C.2.3.A. This again implies that the higher the $\rho$, the farther the estimates from a single equation Tobit model deviate from the true parameter values. For example, as $\rho$ increased from 0.25 to 0.75, the %BIAS of
$\beta_1$ ($\sigma^2$) increased from 3.84% (2.26%) to 11.09% (12.33%). In addition, RMSE and ASE are very different for $\beta_1$, indicating incorrect estimates of the variance-covariance matrix.

When a censored sample is available and the self-selection model is correctly specified, $\sigma^2$, $\rho$, self-selection, and demand parameters are well-estimated by the ML estimator. As can be seen from Tables C.2.1.B, C.2.2.B, and C.2.3.B, the %BIAS among demand (self-selection) parameters ranged from 0.09% (0.07%) to 0.30% (2.19%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.12% (1.39%) to 1.46% (3.28%). $D_{(\beta,\sigma^2);CEN}$ was always smaller than $D_{(\beta,\sigma^2);S}$, and all the $D_{(\cdot);CEN}$'s remained very close to zero.

When the measurement errors model based on $\mu_i$ and $\Sigma$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.08% (1.75%) to 0.35% (4.68%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.18% (1.32%) to 1.42% (2.92%), as shown in Tables C.2.1.C, C.2.2.C, and C.2.3.C. $D_{(\beta,\sigma^2);ME1}$ was always smaller than $D_{(\beta,\sigma^2);S}$, and all the $D_{(\cdot);ME1}$'s remained very close to zero.

When the measurement errors model based on $\mu_i$ and $\Sigma_i$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.07% (1.23%) to 0.36% (3.71%); for $\sigma^2$ ($\rho$), %BIAS ranged from 0.16% (1.37%) to 1.38% (3.06%), as can be seen in Tables C.2.1.D, C.2.2.D, and C.2.3.D. $D_{(\beta,\sigma^2);ME2}$ was always smaller than $D_{(\beta,\sigma^2);S}$, and all the $D_{(\cdot);ME2}$'s remained very close to zero.

Comparing results from the two measurement errors models, the only difference is that the self-selection parameters always have smaller %BIAS when $\Sigma_i$ is used. Apart from this, it is difficult to distinguish between the two models. Comparing results from the two measurement errors models with results from the censored sample, all three models give similar estimates for the demand parameters according to $D_{(\beta,\sigma^2)}$. However, according to $D_{(\gamma,\rho)}$, self-selection parameters estimated by the two measurement errors models are less efficient than the estimates from the censored sample.

4.4 A probit demand equation with self-selection

A self-selection model with measurement errors and a probit demand equation is derived in Chapter 3 (Section 3.4.2). For each correlation between self-selection and demand ($\rho$ = 0.25, 0.5, and 0.75), this section begins with probit ML estimates from a truncated sample without correcting for self-selection; results are presented in Appendix C (Tables C.3.1.A ($\rho$ = 0.25), C.3.2.A ($\rho$ = 0.5), and C.3.3.A ($\rho$ = 0.75)).

Using a censored sample, estimates for a probit demand equation with self-selection are obtained by the ML estimator based on the likelihood function

\[
L_1 = \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-z_i'\gamma} g(\varepsilon,\, u,\, \Theta)\, du\, d\varepsilon
\;\prod_{I_i=1,\, q_i=0} \int_{-\infty}^{-x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(\varepsilon,\, u,\, \Theta)\, du\, d\varepsilon
\;\prod_{I_i=1,\, q_i=1} \int_{-x_i'\beta}^{\infty}\int_{-z_i'\gamma}^{\infty} g(\varepsilon,\, u,\, \Theta)\, du\, d\varepsilon,
\]

where

\[
\Theta = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
\]

is the variance-covariance matrix. Results of this model are listed in Tables C.3.1.B ($\rho$ = 0.25), C.3.2.B ($\rho$ = 0.5), and C.3.3.B ($\rho$ = 0.75).¹³

¹³ A probit self-selection model based on a truncated sample is dropped from the Monte Carlo experiments due to the difficulty in obtaining the starting values. The ML estimator for a probit self-selection model based on a truncated sample is very sensitive to the starting values. Very often, the optimization procedure cannot converge even with the true parameter values as the starting values.

Estimates from a self-selection model with measurement errors and a probit demand equation are obtained by the ML estimator based on ($\mu_i$, $\Sigma$) and the likelihood function

\[
L_2 = \prod_{I_i=0} \int_{-\infty}^{\infty}\int_{-\infty}^{-\mu_i'\gamma} g(\varepsilon,\, w,\, \Lambda)\, dw\, d\varepsilon
\;\prod_{I_i=1,\, q_i=0} \int_{-\infty}^{-x_i'\beta}\int_{-z_i'\gamma}^{\infty} g(\varepsilon,\, u,\, \Theta)\, du\, d\varepsilon
\;\prod_{I_i=1,\, q_i=1} \int_{-x_i'\beta}^{\infty}\int_{-z_i'\gamma}^{\infty} g(\varepsilon,\, u,\, \Theta)\, du\, d\varepsilon,
\]

where

\[
\Lambda = \begin{pmatrix} 1 & \rho \\ \rho & 1 + \gamma'\Sigma\gamma \end{pmatrix},
\]

and results are presented in Tables C.3.1.C ($\rho$ = 0.25), C.3.2.C ($\rho$ = 0.5), and C.3.3.C ($\rho$ = 0.75). Finally, if $\Sigma$ ($\Lambda$) is replaced by $\Sigma_i$ ($\Lambda_i$), i.e.

\[
\Lambda_i = \begin{pmatrix} 1 & \rho \\ \rho & 1 + \gamma'\Sigma_i\gamma \end{pmatrix},
\]
the resulting estimates are shown in Tables C.3.1.D ($\rho$ = 0.25), C.3.2.D ($\rho$ = 0.5), and C.3.3.D ($\rho$ = 0.75).

4.4.1 Monte Carlo experiment results from a self-selection model with measurement errors and a probit demand equation¹⁴

¹⁴ A GAUSS program for conducting the Monte Carlo experiments is provided in Appendix F.

When a single equation probit model is applied to estimate the demand equation without correcting for self-selection bias, as in the case of a linear demand equation, both %BIAS and $D_{\beta;S}$ increase with $\rho$, as presented in Tables C.3.1.A, C.3.2.A, and C.3.3.A. This again implies that the higher the $\rho$, the farther the estimates from a single equation probit model deviate from the true parameter values. For example, as $\rho$ increased from 0.25 to 0.75, the %BIAS of $\beta_1$ increased from 7.43% to 18.35%. In addition, RMSE and ASE are very different for $\beta_1$, indicating incorrect estimates of the variance-covariance matrix.

When a censored sample is available and the self-selection model is correctly specified, $\rho$, self-selection, and demand parameters are well-estimated by the ML estimator. As can be seen from Tables C.3.1.B, C.3.2.B, and C.3.3.B, the %BIAS among demand (self-selection) parameters ranged from 2.63% (0.45%) to 4.63% (2.19%); for $\sigma^2$ ($\rho$), %BIAS ranged from 2.97% (1.08%) to 3.60% (10.28%). $D_{\beta;CEN}$ was smaller than $D_{\beta;S}$ when $\rho$ = 0.5 and 0.75, and all the $D_{(\cdot);CEN}$'s remained very close to zero.

When the measurement errors model based on $\mu_i$ and $\Sigma$ was applied, the %BIAS among demand (self-selection) parameters ranged from 0.04% (3.46%) to 4.84% (7.43%); for $\sigma^2$ ($\rho$), %BIAS ranged from 3.04% (1.36%) to 3.74% (9.71%), as shown in Tables C.3.1.C, C.3.2.C, and C.3.3.C. $D_{\beta;ME1}$ was smaller than $D_{\beta;S}$ when $\rho$ = 0.5 and 0.75, and all the $D_{(\cdot);ME1}$'s remained very close to zero.
When the measurement errors model based on $\mu_i$ and $\Sigma_i$ was applied, the %BIAS among demand (self-selection) parameters ranged from 2.56% (2.26%) to 4.69% (4.97%); for $\sigma^2$ ($\rho$), %BIAS ranged from 3.00% (1.16%) to 3.65% (9.99%), as presented in Tables C.3.1.D, C.3.2.D, and C.3.3.D. $D_{\beta;ME2}$ was smaller than $D_{\beta;S}$ when $\rho$ = 0.5 and 0.75, and all the $D_{(\cdot);ME2}$'s remained very close to zero.

Comparing results from the two measurement errors models, the only difference is that the self-selection parameters always have smaller %BIAS when $\Sigma_i$ is used. Apart from this, it is difficult to distinguish between the two models. Comparing results from the two measurement errors models with results from the censored sample, all three models give similar estimates for the demand parameters according to $D_{\beta}$. However, according to $D_{(\gamma,\rho)}$, self-selection parameters estimated by the two measurement errors models are less efficient than the estimates from the censored sample.

One important issue is the estimate of $\rho$. For all three self-selection models, as the true value of $\rho$ increases, the %BIAS for the estimate increases rapidly. However, the BIAS for the estimates of $\rho$ is always statistically equal to zero.

4.5 General results from the Monte Carlo experiments

Results from the single equation simulation show that in the presence of self-selection ($\rho \neq 0$), %BIAS increases as $\rho$ increases when a single equation is used to estimate the demand equation. This indicates biasedness caused by the self-selection behavior.

For a truncated sample, the ML estimator produces results different from those of Muthén and Jöreskog (1983). Instead of biasedness, the real problem appears to be that the variance-covariance matrix produced by the ML estimator is incorrect and cannot be used to test hypotheses. Failure to conduct hypothesis testing may result in model misspecification and may lead to inconsistent parameter estimates.
When a censored sample is available and the self-selection model is correctly specified, $\sigma^2$, self-selection, and demand parameters are well-estimated by the ML estimator. For the parameter $\rho$, the ML estimator leads to acceptable results; however, the estimates are not as accurate as other parameter estimates, especially in the case of a probit demand equation with self-selection.

In terms of efficiency among different estimators, $D_{(\beta,\sigma^2);CEN}$ being very close to $D_{(\beta,\sigma^2);ME1}$ and $D_{(\beta,\sigma^2);ME2}$¹⁵ indicates that the model which uses a censored sample and the two measurement errors models all lead to very similar estimates of the demand parameters and $\sigma^2$. For the self-selection parameters and $\rho$, $D_{(\gamma,\rho);ME1} > D_{(\gamma,\rho);ME2} > D_{(\gamma,\rho);CEN}$ indicates that the model which uses a censored sample performs the best and that the measurement errors model that uses $\mu_i$ and $\Sigma_i$ performs somewhat better than the model that uses $\mu_i$ and $\Sigma$. In the overall performance, it is no surprise that the model which uses a censored sample has the smallest value, $D_{(\gamma,\beta,\sigma^2,\rho);CEN}$, and performs the best. Even though the value of $D_{(\gamma,\beta,\sigma^2,\rho);ME1}$ is slightly greater than that of $D_{(\gamma,\beta,\sigma^2,\rho);ME2}$, the two measurement errors models are not very different from each other.

¹⁵ For the probit demand equation case, they are $D_{\beta;CEN}$, $D_{\beta;ME1}$, and $D_{\beta;ME2}$ respectively. In the following discussion, ($\beta$, $\sigma^2$) is used to represent ($\beta$) in the probit demand case as well as ($\beta$, $\sigma^2$) in other cases.

There is a problem common to the case of the self-selection model with a probit demand equation. The model that uses a censored sample or either measurement errors model results in an estimate of $\rho$ that is not as accurate as other parameters, especially when the true value of $\rho$ is high. However, the estimate remains statistically acceptable.
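The first general result, that the bias of a single equation estimator grows with $\rho$, can be reproduced with a small simulation. The sketch below is not the experimental design of this chapter. It assumes a hypothetical model with one regressor that enters both equations, a self-selection rule $x + u > 0$, unit error variances, and illustrative demand parameters; the function names are likewise introduced only for this example.

```python
import random
from math import sqrt


def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs with an intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def truncated_ols_slope(rho, n=20000, beta0=6.0, beta1=4.0, seed=1993):
    """OLS slope estimated from respondents only, for a given correlation rho."""
    rng = random.Random(seed)
    xs, qs = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)                  # self-selection error
        v = rng.gauss(0.0, 1.0)
        e = rho * u + sqrt(1.0 - rho ** 2) * v   # demand error with corr(e, u) = rho
        if x + u > 0.0:                          # respondent: kept in the truncated sample
            xs.append(x)
            qs.append(beta0 + beta1 * x + e)
    return ols_slope(xs, qs)


slope_low = truncated_ols_slope(0.25)
slope_high = truncated_ols_slope(0.75)
```

Under these assumptions the slope estimated from respondents alone moves farther from the true value of 4 when $\rho$ is raised from 0.25 to 0.75. The direction and size of the bias depend on the assumed design, so the sketch illustrates only the qualitative pattern.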
4.6 Summary

In this chapter, Monte Carlo experiments are conducted to examine and to compare the resulting estimates from 1) a truncated sample without correcting for self-selection bias; 2) a self-selection model with a censored sample; 3) a self-selection model with measurement errors that adopts $\mu_i$, the mean vector, and $\Sigma$, the corresponding variance-covariance matrix estimated from a sample drawn from the population; and 4) a self-selection model with measurement errors that adopts $\mu_i$ and $\Sigma_i$, the corresponding variance-covariance matrix estimated from samples drawn from each non-respondent's census block. Three sequences of Monte Carlo experiments are conducted based on a linear, a Tobit, and a probit demand equation. For each sequence of Monte Carlo experiments, three 500-replication simulations are executed, based on $\rho$ = 0.25, 0.5, and 0.75.

Results from the Monte Carlo experiments show that the ML estimator from the model which uses a censored sample performs the best. Of the two measurement errors models, the model that uses $\mu_i$ and $\Sigma_i$ estimates the self-selection parameters more accurately than the model that uses $\mu_i$ and $\Sigma$.

In reality, a censored sample is almost impossible to obtain. However, using the measurement errors models derived in this study, a truncated sample can be transferred into a censored sample, and self-selection models with measurement errors can then be estimated by ML estimators. Results from Monte Carlo experiments show that the estimates from the self-selection models with measurement errors perform very well. According to the Monte Carlo experiment results, when a correctly-specified self-selection model with measurement errors is adopted, estimates of the demand parameters are as accurate as the estimates from a model that uses a censored sample, and the estimates of the self-selection parameters are very close to the true parameter values.
The results indicate an impressive message: adoption of a self-selection model with measurement errors will not contaminate the original truncated sample. Compared to the estimates from a model with a truncated sample, the self-selection models with measurement errors not only improve the efficiency of the estimates but also lead to reliable estimates of the self-selection parameters.

CHAPTER 5
CONCLUDING REMARKS

In CV studies, when surveys are used for collecting data, non-response will usually create problems. In analyzing survey data, two types of possible biases can be created by non-response. The first is sample non-response bias, which occurs when the sample distribution of some socio-economic or demographic characteristics is significantly different from that of the population. The second is self-selection bias, which occurs when the non-response is non-random, i.e. the reasons for non-response are endogenous to the survey study. In CV studies, although self-selection is usually ignored in empirical work, it is recognized by researchers as an important issue. In this study, methods that combine survey individual data with census data to correct for self-selection bias are proposed, and promising results are provided by Monte Carlo experiments.

5.1 Summary

In Chapter 1, consequences of self-selection are reported and the differences between self-selection bias and sample non-response bias are distinguished. When regression is used to analyze survey data, it is shown that self-selection causes inconsistent parameter estimates and sample non-response bias does not even play a role. It is also shown that there is no direct relationship between sample non-response bias and self-selection bias. Instead of ignoring self-selection bias in empirical work, it is suggested that CV researchers treat self-selection as a serious issue.

Following an example in labor economics, the concept of self-selection in CV is introduced in Chapter 2.
It is identified that a complete self-selection model consists of two equations. The first is a self-selection equation, which is essentially a probit equation, and the second is a demand equation. To estimate a self-selection model, several estimators that simultaneously estimate the self-selection equation and the demand equation have been reviewed. However, because the CV survey data is a truncated sample, evidence shows that the self-selection equation parameters cannot be estimated reliably by existing estimators. It is the deficiency of existing estimators that motivates this study.

In Chapter 3, a self-selection model under a CV framework is derived and new ML estimators are proposed. According to a random utility model, a self-selection equation can be expressed by a probit model with income as one of the important explanatory variables. A self-selection model is completely described by a self-selection probit equation and a demand equation which is correlated with the self-selection equation. Since a CV data set is usually a truncated sample where the only information available for a non-respondent is the address, a self-selection model with measurement errors is derived by combining the CV truncated sample with census data which provides information for non-respondents' neighborhoods (e.g. census blocks). Based on the self-selection model with measurement errors, two ML estimators are then proposed. Finally, the self-selection model with measurement errors is extended to allow for a demand equation with qualitative or limited dependent variables.

It is found in Chapter 4 that for a truncated sample, the ML estimator produces results different from those of Muthén and Jöreskog (1983). Biasedness is not a major problem even with $\rho$ = 0.75. The real problem appears to be the difference between RMSE and ASE, which indicates that the variance-covariance matrix produced by the ML estimator is incorrect and cannot be used in testing hypotheses.
If the self-selection equation cannot be correctly specified, all of the $\sigma^2$, $\rho$, self-selection, and demand parameters may be estimated inconsistently.

The main purpose of Chapter 4 is to use Monte Carlo experiments to compare the resulting parameter estimates from the two ML estimators for the self-selection models with measurement errors proposed in Chapter 3. For the first estimator, a sample drawn from the population is used to calculate the variance-covariance matrix ($\Sigma$) for non-respondents' explanatory variables in the self-selection equation. For the second estimator, the variance-covariance matrix for non-respondents' explanatory variables is calculated using samples drawn from each non-respondent's census block ($\Sigma_i$). Monte Carlo results show that both of the ML estimators give very accurate estimates for all of the self-selection and demand parameters. However, in terms of efficiency, the estimator using $\Sigma_i$ performs somewhat better than the estimator using $\Sigma$.

Although the ML estimator using $\Sigma_i$ performs only slightly better than the alternative estimator using $\Sigma$, it is the estimator that is recommended. Consider a case where census blocks are heterogeneous ($\Sigma_i^* \neq \Sigma_j^*$, $i \neq j$). In this case, $\Sigma$ is no longer a consistent estimate for $\Sigma_i^*$, and the resulting estimator using $\Sigma$ does not lead to consistent parameter estimates. Although $\Sigma$ is easier to obtain and performs similarly to $\Sigma_i$, a stronger assumption is needed to assure the consistency of parameter estimates.

5.2 Need for future research

It is indicated in Chapter 4 (Section 4.2.1) that Monte Carlo experiment results from the ML estimator based on a truncated sample are different from those of Muthén and Jöreskog (1983). The reasons behind this difference remain to be explored by future studies that concentrate on the issue of model specification. It is important to determine the degree to which self-selection models with measurement errors are sensitive to model misspecification.
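The distinction drawn in Section 5.1 between a pooled matrix $\Sigma$ and block-specific matrices $\Sigma_i$ can be illustrated numerically. The sketch below computes a mean vector and a sample variance-covariance matrix for each block, and then the scale $\sqrt{1 + \gamma'\Sigma\gamma}$ that enters the non-respondent term of the likelihood. The block names, household rows, and $\gamma$ values are hypothetical and chosen only to make the heterogeneity visible.

```python
from math import sqrt


def mean_vector(rows):
    """Column means of a list of equal-length observation rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows):
    """Sample variance-covariance matrix (divisor n - 1) of the observation rows."""
    n = len(rows)
    mu = mean_vector(rows)
    k = len(mu)
    return [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (n - 1)
             for b in range(k)] for a in range(k)]


def selection_scale(gamma, cov):
    """sqrt(1 + gamma' cov gamma): standard deviation of the composite error w."""
    k = len(gamma)
    return sqrt(1.0 + sum(gamma[a] * cov[a][b] * gamma[b]
                          for a in range(k) for b in range(k)))


# Hypothetical census extracts: each row holds two characteristics of one household.
blocks = {
    "block_A": [(1.0, 2.0), (1.2, 2.1), (0.8, 1.9)],   # homogeneous block
    "block_B": [(3.0, 1.0), (5.0, 4.0), (1.0, 2.5)],   # heterogeneous block
}
gamma = [1.0, -3.0]  # slope coefficients only; illustrative values

# Block-specific scales (the Sigma_i approach).
block_scale = {name: selection_scale(gamma, covariance_matrix(rows))
               for name, rows in blocks.items()}

# One pooled scale applied to every non-respondent (the Sigma approach).
pooled_rows = [row for rows in blocks.values() for row in rows]
pooled_scale = selection_scale(gamma, covariance_matrix(pooled_rows))
```

When the blocks differ as much as these two do, a single pooled scale misstates the dispersion in both of them, which is the situation in which the estimator based on $\Sigma$ requires the stronger assumption noted above.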
Another area that remains to be explored is the large %BIAS and incorrect variance for $\beta_1$ that result from the use of a single equation approach without correcting for self-selection bias. This problem may be approached by varying the variance-covariance matrix structure for $[x_1, x_2, x_3, u, \varepsilon]$ and examining how it affects the estimates from a single equation method such as OLS.

In a comparison of the two self-selection models with measurement errors, Monte Carlo experiment results suggest that adoption of $\mu_i$ and $\Sigma_i$ produces better results. Recall that both $\mu_i$ and $\Sigma_i$ are estimated from non-respondent $i$'s neighborhood, and the neighborhood is loosely defined as a census block, a county, a state, or even a region. Definition of the neighborhood remains an empirical problem and should be studied further.

5.3 Conclusion

In CV studies, data can only be collected from those who are willing to participate in the studies. Results from the application of a single equation approach to this truncated sample may lead to inconsistent parameter estimates (self-selection bias). Unfortunately, there is no simple method to detect the existence of self-selection bias in CV studies. A self-selection model which contains a self-selection and a demand equation must be specified in order to detect and to correct for self-selection bias. The ML estimator that is based on the self-selection model with a truncated sample provides theoretically consistent parameter estimates. However, unless the data is a censored sample, it is shown that the parameters and the variance-covariance matrix in the self-selection equation cannot be estimated reliably.

A method that transfers a truncated sample to a censored sample by combining survey individual data and census data is proposed and is called a self-selection model with measurement errors.
Two ML estimators are derived based on the self-selection model with measurement errors, where data from the census are treated as if they are the true values plus errors.

Results from the Monte Carlo experiments show that the ML estimator based on the model which uses a censored sample has the best performance. ML estimators based on the self-selection models with measurement errors perform very well, especially in estimating demand parameters. According to the Monte Carlo experiment results, when a correctly-specified self-selection model with measurement errors is adopted, estimates of the demand parameters are as accurate and efficient as the estimates from a model that uses a censored sample, and the estimates of the self-selection parameters are very close to the true parameter values. The results indicate an impressive message: adoption of a self-selection model with measurement errors will not contaminate the original truncated sample.

Of the two ML estimators from the self-selection model with measurement errors, the model that uses $\mu_i$ and $\Sigma_i$ estimates the self-selection parameters more accurately and efficiently than the model that uses $\mu_i$ and $\Sigma$. Although $\Sigma$ is easier to obtain and the ML estimator based on $\mu_i$ and $\Sigma$ produces acceptable results, stronger assumptions are needed to justify its results than those of the estimator that uses $\mu_i$ and $\Sigma_i$.

Although the self-selection models with measurement errors developed in this study started from a CV study using mail surveys with different demand specifications, they can be easily generalized in several ways. First, since derivation of the model requires no specific restriction for the surveys, the model can be applied to any type of cross-section survey. Second, although a demand function is used in the model, the important issue is the correlation between the demand function and the self-selection equation.
The model is still valid even if the demand function is replaced with a supply function, given that it is correctly specified. In general, models developed in this study are broad enough to be applied to studies that adopt survey data and regression analyses.

APPENDIX A
RESULTS FROM MUTHÉN AND JÖRESKOG'S STUDY

In model 1 of Muthén and Jöreskog's study (1983, Section 5), the selection relation is specified as:

\[
y_i = 0.0 + 1.0\,x_i + \varepsilon_i, \qquad \eta_i = 0.0 - 1.0\,x_i + \delta_i,
\]

where $[x_i\ \varepsilon_i\ \delta_i]$ is distributed as an i.i.d. trivariate normal distribution with a mean vector $[0\ 0\ 0]$ and a variance-covariance matrix

\[
\mathrm{Cov}(x_i, \varepsilon_i, \delta_i) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -0.5 \\ 0 & -0.5 & 1 \end{pmatrix}.
\]

Based on different sample sizes (i.e. N = 1,000 and N = 4,000), Monte Carlo experiment results are presented in the following tables:¹

¹ Estimates from a probit and a Heckman's two-stage estimator that were reported by Muthén and Jöreskog are omitted here. Notation used to report results is defined in Appendix B.

Table A.1 Parameter estimates for data simulated according to model 1, $N_t$* = 496, N = 1000

Parameter                       OLS estimate     ML estimate       BIAS     %BIAS
Truncated sample
$\beta_0$ = 0.0                 -.373 (.054)**   -.209 (.119)      0.209    -
$\beta_1$ = 1.0                  .788 (.052)      .931 (.095)      0.069    6.9%
$\sigma_{\varepsilon\varepsilon}$ = 1.0    .985 (.065)      .982 (.076)      0.018    1.8%
$\gamma_0$ = 0.0                                  .991 (1.599)     0.991    -
$\gamma_1$ = -1.0                               -3.448 (4.542)     2.448    244.8%
$\rho$ = -0.5                                    -.248 (.413)      0.252    50.4%
Censored sample
$\beta_0$ = 0.0                 -.373 (.054)      .074 (.179)      0.074    -
$\beta_1$ = 1.0                  .788 (.052)     1.033 (.114)      0.033    3.3%
$\sigma_{\varepsilon\varepsilon}$ = 1.0    .985 (.065)     1.126 (.131)      0.126    12.6%
$\gamma_0$ = 0.0                                  .013 (.046)      0.013    -
$\gamma_1$ = -1.0                               -1.040 (.068)      0.040    4.0%
$\rho$ = -0.5                                    -.522 (.164)      0.022    4.4%

* Truncated sample size.
** Standard errors in parentheses.

Table A.2 Parameter estimates for data simulated according to model 1, $N_t$* = 1963, N = 4000

Parameter                       OLS estimate     ML estimate       BIAS     %BIAS
Truncated sample
$\beta_0$ = 0.0                 -.435 (.027)**   -.223 (.137)      0.223    -
$\beta_1$ = 1.0                  .807 (.027)      .965 (.084)      0.035    3.5%
$\sigma_{\varepsilon\varepsilon}$ = 1.0    .916 (.029)      .978 (.056)      0.022    2.2%
$\gamma_0$ = 0.0                                  .851 (.723)      0.851    -
$\gamma_1$ = -1.0                               -1.277 (.346)      0.277    27.7%
$\rho$ = -0.5                                    -.521 (.122)      0.021    4.2%
Censored sample
$\beta_0$ = 0.0                 -.435 (.027)      .013 (.083)      0.013    -
$\beta_1$ = 1.0                  .807 (.027)     1.065 (.054)      0.065    6.5%
$\sigma_{\varepsilon\varepsilon}$ = 1.0    .916 (.029)     1.054 (.062)      0.054    5.4%
$\gamma_0$ = 0.0                                  .021 (.023)      0.021    -
$\gamma_1$ = -1.0                               -1.043 (.032)      0.043    4.3%
$\rho$ = -0.5                                    -.538 (.078)      0.038    7.6%

* Truncated sample size.
** Standard errors in parentheses.

There are two problems with Muthén and Jöreskog's results. First, it cannot be determined whether the standard errors reported in the study are RMSEs or ASEs. Second, since misspecified models are used in the Monte Carlo experiments, it is difficult to distinguish whether the biased results are caused by the truncated sample or by the misspecification.² Comparable Monte Carlo experiment results from a model specified in this study (Chapter 4, Sections 4.1 and 4.2) are presented in Appendix C, Tables C.1.3.A, C.1.3.B, and C.1.3.E.³

² It has been shown that a probit model is sensitive to model specification (Yatchew and Griliches, 1985).
³ Results are interpreted in Chapter 4, Section 4.2.1.

APPENDIX B
NOTATION USED IN REPORTING MONTE CARLO RESULTS¹

To summarize results from Monte Carlo experiments, let $\hat{\theta}_i$, a $k \times 1$ vector, be the estimate of the parameter vector obtained from the $i$th replication, and let $\hat{\Psi}_i$ be the corresponding $k \times k$ variance-covariance matrix, calculated as the inverse of the negative of the second derivatives matrix of the log-likelihood function at the maximum likelihood estimates.² First, the mean estimate of the parameter vector, MEAN, is defined as

\[
\mathrm{MEAN} = \bar{\theta} = \frac{1}{N}\sum_{i=1}^{N} \hat{\theta}_i,
\]

where N is the total number of replications (N = 500 in this study). A measure of the bias, BIAS, can be defined as

\[
\mathrm{BIAS} = \bar{\theta} - \theta^*,
\]

where $\theta^*$ is the true value of the parameter vector. In addition, define %BIAS by

\[
\%\mathrm{BIAS} = \left| \frac{\mathrm{BIAS}}{\theta^*} \right| \cdot 100\%.
\]

¹ Adapted from Dhrymes (1970), Section 8.6, pp. 372-380.
² In the GAUSS optimization procedure, instead of using an analytical second derivative, a numerical second derivative that is based on an analytical first derivative is used.

Further, define the average standard error, ASE, as

\[
\mathrm{ASE} = \frac{1}{N}\sum_{i=1}^{N} \sqrt{\mathrm{diag}[\hat{\Psi}_i]},
\]

the covariance matrix about the true parameter value, $\widehat{\mathrm{COV}}(\hat{\theta})$, as

\[
\widehat{\mathrm{COV}}(\hat{\theta}) = \frac{1}{N}\sum_{i=1}^{N} (\hat{\theta}_i - \theta^*)(\hat{\theta}_i - \theta^*)',
\]

and the root mean square error, RMSE, as

\[
\mathrm{RMSE} = \sqrt{\mathrm{diag}[\widehat{\mathrm{COV}}(\hat{\theta})]}.
\]

To examine Monte Carlo experiment results, it is important to check both the RMSE and the ASE. Under ideal conditions, RMSE and ASE should be very close to each other. The RMSE is very different from the ASE if 1) the model is misspecified; 2) the estimator does not lead to reliable parameter estimates; or 3) the variance-covariance matrix cannot be calculated using the regular formula.

For the purpose of comparing efficiency among estimators, define

\[
D_{b;j} = \det[\widehat{\mathrm{COV}}(\hat{\theta})_{b;j}],
\]

where $b$ specifies a sub-vector of the parameter vector, $j$ indicates the $j$th type of estimator, and $\widehat{\mathrm{COV}}(\hat{\theta})_{b;j}$ is the corresponding covariance matrix about the true parameter value. For different estimators, if the $D_{b;j}$'s are defined over the same sample, their (relative) magnitudes can be treated as an indicator of "efficiency." For example, $D_{b;i} > D_{b;j}$ indicates that the $j$th estimator is more efficient than the $i$th estimator. In this study, although different estimators are based on different data sets,³ the (relative) magnitude of $D_{b;j}$ can still be treated as an indicator of efficiency.

In Appendix C, the MEANs, BIASes, %BIASes, RMSEs, ASEs, and $D_{b;j}$'s, as well as the average log-likelihood value and its standard error (SE),⁴ summarize the results of the Monte Carlo experiments.

³ In this study, although different estimators are based on different data sets, all the data sets are developed from the same population and contain an identical proportion of respondents to non-respondents.
For non-respondents, different data sets contain either the real observations or some statistics estimated from the same population.
⁴ These are simply the mean and standard error of the maximum log-likelihood values from the N (= 500) replications.

APPENDIX C
MONTE CARLO EXPERIMENT RESULTS

This appendix presents the results from Monte Carlo experiments for self-selection models with measurement errors with a linear demand, a Tobit demand, and a probit demand equation. Based on different correlation measures between the demand and self-selection equations, three sequences of simulation are conducted for each model ($\rho$ = 0.25, 0.5, and 0.75), and each sequence of simulation has 500 replications.

C.1.1 Estimates from a self-selection model with measurement errors and a linear demand equation ($\rho$ = 0.25)

Results presented below are based on $\rho$ = 0.25.¹ Estimates are obtained after 500 replications, and the average number of respondents is 499.0820 (SE = 21.4049) out of 1,000.

¹ Statistics reported in the tables are defined in Appendix B.
Table C.1.1.A Linear demand, OLS estimates without correcting for self-selection bias, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
β0 = 6      6.0065   0.0065   0.11%   0.2328   0.2234
β1 = 4      4.0876   0.0876   2.19%   0.1096   0.0640
β2 = -3    -3.0028  -0.0028   0.09%   0.0611   0.0586
σ² = 1      0.9868  -0.0132   1.32%   0.0637

D(β,σ²);S = 6.4531e-10

Table C.1.1.B Linear demand, correcting for self-selection bias using censored sample, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5026   0.0026   0.17%   0.2070   0.1954
γ1 = 1      1.0120   0.0120   1.20%   0.0794   0.0786
γ2 = -3    -3.0259  -0.0259   0.86%   0.1864   0.1799
β0 = 6      6.0036   0.0036   0.06%   0.2335   0.2068
β1 = 4      4.0014   0.0014   0.04%   0.0831   0.0744
β2 = -3    -3.0004  -0.0004   0.01%   0.0612   0.0544
σ² = 1      0.9968  -0.0032   0.32%   0.0642   0.0607
ρ = 0.25    0.2490  -0.0010   0.40%   0.1342   0.1221

Average log-likelihood = -928.1916 (SE = 40.6244)
D(γ,ρ);CEN = 8.8139e-09
D(β,σ²);CEN = 3.0124e-10
D(γ,β,σ²,ρ);CEN = 1.1842e-18

Table C.1.1.C Linear demand, correcting for self-selection using measurement errors model with μ_i and Σ, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5476   0.0476   3.17%   0.3664   0.3208
γ1 = 1      1.0338   0.0338   3.38%   0.1420   0.1259
γ2 = -3    -3.0861  -0.0861   2.87%   0.3651   0.3116
β0 = 6      6.0038   0.0038   0.06%   0.2336   0.2090
β1 = 4      4.0013   0.0013   0.03%   0.0843   0.0741
β2 = -3    -3.0005  -0.0005   0.02%   0.0613   0.0549
σ² = 1      0.9972  -0.0028   0.28%   0.0647   0.0602
ρ = 0.25    0.2522   0.0022   0.88%   0.1372   0.1232

Average log-likelihood = -1157.1148 (SE = 41.9968)
D(γ,ρ);ME1 = 1.3607e-07
D(β,σ²);ME1 = 3.1814e-10
D(γ,β,σ²,ρ);ME1 = 1.8437e-17

Table C.1.1.D Linear demand, correcting for self-selection using measurement errors model with μ_i and Σ_i, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5130   0.0130   0.87%   0.3520   0.3180
γ1 = 1      1.0228   0.0228   2.28%   0.1353   0.1241
γ2 = -3    -3.0538  -0.0538   1.79%   0.3467   0.3079
β0 = 6      6.0033   0.0033   0.06%   0.2335   0.2094
β1 = 4      4.0003   0.0003   0.01%   0.0844   0.0744
β2 = -3    -3.0005  -0.0005   0.02%   0.0613   0.0550
σ² = 1      0.9974  -0.0026   0.26%   0.0647   0.0603
ρ = 0.25    0.2516   0.0016   0.64%   0.1368   0.1229

Average log-likelihood = -1157.1736 (SE = 42.0587)
D(γ,ρ);ME2 = 1.0158e-07
D(β,σ²);ME2 = 3.1997e-10
D(γ,β,σ²,ρ);ME2 = 1.3851e-17

C.1.2 Estimates from a self-selection model with measurement errors and a linear demand equation (ρ = 0.5)

Results presented below are based on ρ = 0.5. Estimates are obtained after 500 replications, and the average number of respondents is 498.0860 (SE = 22.1867) out of 1,000.
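The design behind these experiments can be sketched in Python as follows. This is a minimal illustration, not the GAUSS program of Appendix D. It assumes regressor means of (3, 1.5, 4), which is one reading of the mean vector in that program, takes the variance-covariance matrix and the equations S* = 1.5 + x1 - 3·x2 + e1 and D* = 6 + 4·x2 - 3·x3 + e2 from the same program, and draws one large sample instead of 500 replications of about 1,000 observations. The function name `simulate_ols_bias` is hypothetical.

```python
import numpy as np


def simulate_ols_bias(rho, n=200_000, seed=0):
    """Draw (x1, x2, x3, e1, e2), keep respondents (S* > 0), run OLS of D* on x2, x3."""
    rng = np.random.default_rng(seed)
    mean = np.array([3.0, 1.5, 4.0, 0.0, 0.0])  # assumed reading of the mean vector
    cov = np.array([
        [1.44,  0.24, 0.096, 0.0, 0.0],
        [0.24,  1.0,  0.24,  0.0, 0.0],
        [0.096, 0.24, 0.64,  0.0, 0.0],
        [0.0,   0.0,  0.0,   1.0, rho],
        [0.0,   0.0,  0.0,   rho, 1.0],
    ])
    x1, x2, x3, e1, e2 = rng.multivariate_normal(mean, cov, size=n).T
    s_star = 1.5 + 1.0 * x1 - 3.0 * x2 + e1   # self-selection equation
    d_star = 6.0 + 4.0 * x2 - 3.0 * x3 + e2   # linear demand equation
    respond = s_star > 0                      # only respondents are observed
    X = np.column_stack([np.ones(respond.sum()), x2[respond], x3[respond]])
    beta_hat, *_ = np.linalg.lstsq(X, d_star[respond], rcond=None)
    return respond.mean(), beta_hat


share, beta_hat = simulate_ols_bias(rho=0.5)
```

Under these assumptions about half of the draws are respondents, and OLS on respondents alone overstates the coefficient on x2, the regressor shared with the self-selection equation, while the coefficient on x3 stays close to -3.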
Table C.1.2.A Linear demand, OLS estimates without correcting for self-selection bias, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
β0 = 6      5.9969  -0.0031   0.05%   0.2079   0.2209
β1 = 4      4.1714   0.1714   4.29%   0.1828   0.0632
β2 = -3    -3.0025  -0.0025   0.08%   0.0539   0.0579
σ² = 1      0.9649  -0.0351   3.51%   0.0704

D(β,σ²);S = 1.4168e-09

Table C.1.2.B Linear demand, correcting for self-selection bias using censored sample, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.4988  -0.0012   0.08%   0.2059   0.1955
γ1 = 1      1.0116   0.0116   1.16%   0.0790   0.0793
γ2 = -3    -3.0231  -0.0231   0.77%   0.1861   0.1818
β0 = 6      5.9945  -0.0055   0.09%   0.2060   0.2074
β1 = 4      4.0023   0.0023   0.06%   0.0780   0.0732
β2 = -3    -2.9989   0.0011   0.04%   0.0532   0.0542
σ² = 1      0.9967  -0.0033   0.33%   0.0659   0.0617
ρ = 0.5     0.4946  -0.0054   1.08%   0.1112   0.1083

Average log-likelihood = -914.5157 (SE = 42.3432)
D(γ,ρ);CEN = 5.4306e-09
D(β,σ²);CEN = 1.6633e-10
D(γ,β,σ²,ρ);CEN = 4.0948e-19

Table C.1.2.C Linear demand, correcting for self-selection using measurement errors model with μ_i and Σ, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5230   0.0230   1.53%   0.3428   0.3092
γ1 = 1      1.0221   0.0221   2.21%   0.1359   0.1222
γ2 = -3    -3.0557  -0.0557   1.86%   0.3439   0.3003
β0 = 6      5.9940  -0.0060   0.10%   0.2066   0.2092
β1 = 4      4.0021   0.0021   0.05%   0.0794   0.0739
β2 = -3    -2.9988   0.0012   0.04%   0.0532   0.0546
σ² = 1      0.9972  -0.0028   0.28%   0.0673   0.0629
ρ = 0.5     0.4969  -0.0031   0.62%   0.1127   0.1091

Average log-likelihood = -1144.3670 (SE = 43.1100)
D(γ,ρ);ME1 = 6.8753e-08
D(β,σ²);ME1 = 1.8757e-10
D(γ,β,σ²,ρ);ME1 = 5.4879e-18

Table C.1.2.D Linear demand, correcting for self-selection using measurement errors model with μ_i and Σ_i, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.4978  -0.0022   0.15%   0.3246   0.3061
γ1 = 1      1.0137   0.0137   1.37%   0.1272   0.1213
γ2 = -3    -3.0318  -0.0318   1.06%   0.3195   0.2966
β0 = 6      5.9935  -0.0065   0.11%   0.2066   0.2094
β1 = 4      4.0007   0.0007   0.02%   0.0794   0.0740
β2 = -3    -2.9988   0.0012   0.04%   0.0532   0.0547
σ² = 1      0.9978  -0.0022   0.22%   0.0673   0.0630
ρ = 0.5     0.4959  -0.0041   0.82%   0.1124   0.1089

Average log-likelihood = -1144.4764 (SE = 43.2787)
D(γ,ρ);ME2 = 4.8506e-08
D(β,σ²);ME2 = 1.8872e-10
D(γ,β,σ²,ρ);ME2 = 3.8592e-18

C.1.3 Estimates from a self-selection model with measurement errors and a linear demand equation (ρ = 0.75)

Results presented below are based on ρ = 0.75. Estimates are obtained after 500 replications, and the average number of respondents is 498.9100 (SE = 20.9709) out of 1,000.

Table C.1.3.A Linear demand, OLS estimates without correcting for self-selection bias, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
β0 = 6      6.0257   0.0257   0.43%   0.2187   0.2157
β1 = 4      4.2579   0.2579   6.45%   0.2654   0.0619
β2 = -3    -3.0109  -0.0109   0.36%   0.0573   0.0565
σ² = 1      0.9227  -0.0773   7.73%   0.0975

D(β,σ²);S = 3.2229e-09

Table C.1.3.B Linear demand, correcting for self-selection bias using censored sample, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5097   0.0097   0.65%   0.1986   0.1824
γ1 = 1      1.0097   0.0097   0.97%   0.0794   0.0733
γ2 = -3    -3.0271  -0.0271   0.90%   0.1855   0.1734
β0 = 6      6.0165   0.0165   0.28%   0.2078   0.1931
β1 = 4      4.0013   0.0013   0.03%   0.0691   0.0642
β2 = -3    -3.0038  -0.0038   0.13%   0.0534   0.0504
σ² = 1      0.9938  -0.0062   0.62%   0.0673   0.0632
ρ = 0.75    0.7521   0.0021   0.28%   0.0657   0.0616

Average log-likelihood = -891.7293 (SE = 38.4693)
D(γ,ρ);CEN = 1.8304e-09
D(β,σ²);CEN = 1.1114e-10
D(γ,β,σ²,ρ);CEN = 1.0369e-19

Table C.1.3.C Linear demand, correcting for self-selection using measurement errors model with μ_i and Σ, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5483   0.0483   3.22%   0.3079   0.2718
γ1 = 1      1.0305   0.0305   3.05%   0.1266   0.1103
γ2 = -3    -3.0846  -0.0846   2.82%   0.3114   0.2723
β0 = 6      6.0174   0.0174   0.29%   0.2080   0.1950
β1 = 4      4.0035   0.0035   0.09%   0.0710   0.0655
β2 = -3    -3.0039  -0.0039   0.13%   0.0533   0.0507
σ² = 1      0.9920  -0.0080   0.80%   0.0714   0.0647
ρ = 0.75    0.7540   0.0040   0.53%   0.0675   0.0619

Average log-likelihood = -1120.3568 (SE = 38.5626)
D(γ,ρ);ME1 = 1.8001e-08
D(β,σ²);ME1 = 1.4288e-10
D(γ,β,σ²,ρ);ME1 = 1.0489e-18

Table C.1.3.D Linear demand, correcting for self-selection using measurement errors model with μ_i and Σ_i, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5290   0.0290   1.93%   0.2935   0.2689
γ1 = 1      1.0243   0.0243   2.43%   0.1226   0.1092
γ2 = -3    -3.0677  -0.0677   2.26%   0.2982   0.2693
β0 = 6      6.0166   0.0166   0.28%   0.2077   0.1953
β1 = 4      4.0014   0.0014   0.04%   0.0698   0.0657
β2 = -3    -3.0038  -0.0038   0.13%   0.0533   0.0508
σ² = 1      0.9932  -0.0068   0.68%   0.0714   0.0648
ρ = 0.75    0.7530   0.0030   0.40%   0.0672   0.0620

Average log-likelihood = -1120.4394 (SE = 38.6073)
D(γ,ρ);ME2 = 1.3920e-08
D(β,σ²);ME2 = 1.3765e-10
D(γ,β,σ²,ρ);ME2 = 8.0515e-19

Table C.1.3.E Linear demand, correcting for self-selection bias using truncated sample, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5442   0.0442   2.95%   0.6446   0.5437
γ1 = 1      1.0538   0.0538   5.38%   0.3605   0.2746
γ2 = -3    -3.1487  -0.1487   4.96%   0.7926   0.6842
β0 = 6      6.0123   0.0123   0.21%   0.2099   0.1960
β1 = 4      3.9913  -0.0087   0.22%   0.1055   0.0966
β2 = -3    -3.0041  -0.0041   0.14%   0.0535   0.0503
σ² = 1      0.9967  -0.0033   0.33%   0.0877   0.0808
ρ = 0.75    0.7600   0.0100   1.33%   0.1100   0.0989

Average log-likelihood = -665.0761 (SE = 32.2393)
D(γ,ρ);TRU = 1.9254e-05
D(β,σ²);TRU = 9.4964e-10
D(γ,β,σ²,ρ);TRU = 3.4627e-15

C.2.1 Estimates from a self-selection model with measurement errors and a Tobit demand equation (ρ = 0.25)

Results presented below are based on ρ = 0.25. Estimates are obtained after 500 replications, and the average number of respondents is 499.9500 (SE = 22.6791) out of 1,000. In the demand equation, the average number of censored q_i's (q_i = 0) is 370.4960 (SE = 19.7898), and the average number of uncensored q_i's (q_i > 0) is 129.4540 (SE = 11.1605).
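The split between censored and uncensored respondents can be checked against the simulation design with a short calculation, sketched here in Python. It assumes regressor means of (3, 1.5, 4), one reading of the mean vector in the GAUSS programs, under which both S* = 1.5 + x1 - 3·x2 + e1 and D* = 6 + 4·x2 - 3·x3 + e2 have mean zero; the variances and covariances are those of the programs' matrix v. For a zero-mean bivariate normal pair with correlation r, P(S* > 0, D* > 0) = 1/4 + arcsin(r)/(2π). The function name `uncensored_share` is hypothetical.

```python
import math


def uncensored_share(rho):
    """P(D* > 0 | S* > 0) when (S*, D*) is zero-mean bivariate normal."""
    # Variances and covariances of x1, x2, x3 (matrix v in the GAUSS programs).
    v11, v22, v33 = 1.44, 1.0, 0.64
    v12, v13, v23 = 0.24, 0.096, 0.24
    # S* = 1.5 + x1 - 3*x2 + e1 and D* = 6 + 4*x2 - 3*x3 + e2, with corr(e1, e2) = rho.
    var_s = v11 + 9.0 * v22 - 6.0 * v12 + 1.0
    var_d = 16.0 * v22 + 9.0 * v33 - 24.0 * v23 + 1.0
    cov_sd = 4.0 * v12 - 3.0 * v13 - 12.0 * v22 + 9.0 * v23 + rho
    r = cov_sd / math.sqrt(var_s * var_d)
    # Orthant probability divided by P(S* > 0) = 1/2.
    return 0.5 + math.asin(r) / math.pi


shares = {rho: uncensored_share(rho) for rho in (0.25, 0.5, 0.75)}
```

Under these assumptions the uncensored share among respondents is a little over one quarter and rises slowly with ρ, which agrees with the reported averages of roughly 129 to 138 uncensored observations out of about 500 respondents.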
Table C.2.1.A Tobit estimates without correcting for self-selection bias, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
β0 = 6      5.9277  -0.0723   1.21%   0.4288   0.3878
β1 = 4      4.1536   0.1536   3.84%   0.2491   0.1955
β2 = -3    -3.0048  -0.0048   0.16%   0.1523   0.1461
σ² = 1      0.9774  -0.0226   2.26%   0.1257   0.1221

Average log-likelihood = -226.2562 (SE = 19.4390)
D(β,σ²);S = 1.7151e-07

Table C.2.1.B Tobit demand, correcting for self-selection bias using censored sample, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5215   0.0215   1.43%   0.2130   0.2140
γ1 = 1      1.0013   0.0013   0.13%   0.0827   0.0836
γ2 = -3    -3.0185  -0.0185   0.62%   0.1978   0.1953
β0 = 6      5.9834  -0.0166   0.28%   0.4005   0.3910
β1 = 4      4.0039   0.0039   0.10%   0.2130   0.2221
β2 = -3    -2.9974   0.0026   0.09%   0.1472   0.1462
σ² = 1      0.9988  -0.0012   0.12%   0.1329   0.1289
ρ = 0.25    0.2426  -0.0074   2.96%   0.1873   0.1815

Average log-likelihood = -451.5493 (SE = 28.6113)
D(γ,ρ);CEN = 2.0024e-08
D(β,σ²);CEN = 1.0154e-07
D(γ,β,σ²,ρ);CEN = 7.4238e-16

Table C.2.1.C Tobit demand, correcting for self-selection using measurement errors model with μ_i and Σ, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5702   0.0702   4.68%   0.3603   0.3481
γ1 = 1      1.0175   0.0175   1.75%   0.1316   0.1343
γ2 = -3    -3.0711  -0.0711   2.37%   0.3338   0.3328
β0 = 6      5.9792  -0.0208   0.35%   0.4093   0.3985
β1 = 4      4.0033   0.0033   0.08%   0.2140   0.2255
β2 = -3    -2.9954   0.0046   0.15%   0.1499   0.1492
σ² = 1      0.9982  -0.0018   0.18%   0.1341   0.1295
ρ = 0.25    0.2427  -0.0073   2.92%   0.1911   0.1837

Average log-likelihood = -679.4912 (SE = 30.5418)
D(γ,ρ);ME1 = 2.4916e-07
D(β,σ²);ME1 = 1.0763e-07
D(γ,β,σ²,ρ);ME1 = 9.7833e-15

Table C.2.1.D Tobit demand, correcting for self-selection using measurement errors model with μ_i and Σ_i, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5557   0.0557   3.71%   0.3530   0.3414
γ1 = 1      1.0123   0.0123   1.23%   0.1300   0.1319
γ2 = -3    -3.0597  -0.0597   1.99%   0.3265   0.3256
β0 = 6      5.9786  -0.0214   0.36%   0.4104   0.3960
β1 = 4      4.0029   0.0029   0.07%   0.2137   0.2258
β2 = -3    -2.9957   0.0043   0.14%   0.1503   0.1478
σ² = 1      0.9984  -0.0016   0.16%   0.1344   0.1295
ρ = 0.25    0.2424  -0.0076   3.04%   0.1909   0.1855

Average log-likelihood = -679.5947 (SE = 30.5940)
D(γ,ρ);ME2 = 1.9532e-07
D(β,σ²);ME2 = 1.0759e-07
D(γ,β,σ²,ρ);ME2 = 7.5680e-15

C.2.2 Estimates from a self-selection model with measurement errors and a Tobit demand equation (ρ = 0.5)

Results presented below are based on ρ = 0.5. Estimates are obtained after 500 replications, and the average number of respondents is 499.5380 (SE = 21.1736) out of 1,000. In the demand equation, the average number of censored q_i's (q_i = 0) is 364.3780 (SE = 18.4454), and the average number of uncensored q_i's (q_i > 0) is 135.1600 (SE = 11.7430).

Table C.2.2.A Tobit estimates without correcting for self-selection bias, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
β0 = 6      5.8824  -0.1176   1.96%   0.4033   0.3732
β1 = 4      4.3040   0.3040   7.60%   0.3626   0.1895
β2 = -3    -3.0145  -0.0145   0.48%   0.1471   0.1391
σ² = 1      0.9345  -0.0655   6.55%   0.1348   0.1139

Average log-likelihood = -229.8972 (SE = 20.0318)
D(β,σ²);S = 4.6211e-07

Table C.2.2.B Tobit demand, correcting for self-selection bias using censored sample, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5329   0.0329   2.19%   0.2120   0.2144
γ1 = 1      1.0111   0.0111   1.11%   0.0879   0.0837
γ2 = -3    -3.0414  -0.0414   1.38%   0.2041   0.1962
β0 = 6      5.9847  -0.0153   0.26%   0.3724   0.3715
β1 = 4      4.0087   0.0087   0.22%   0.2043   0.2039
β2 = -3    -2.9957   0.0043   0.14%   0.1380   0.1365
σ² = 1      0.9854  -0.0146   1.46%   0.1353   0.1272
ρ = 0.5     0.4836  -0.0164   3.28%   0.1594   0.1512

Average log-likelihood = -450.5322 (SE = 29.5257)
D(γ,ρ);CEN = 1.4824e-08
D(β,σ²);CEN = 6.4816e-08
D(γ,β,σ²,ρ);CEN = 3.3847e-16

Table C.2.2.C Tobit demand, correcting for self-selection using measurement errors model with μ_i and Σ, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5511   0.0511   3.41%   0.3635   0.3454
γ1 = 1      1.0282   0.0282   2.82%   0.1353   0.1381
γ2 = -3    -3.0788  -0.0788   2.63%   0.3515   0.3407
β0 = 6      5.9824  -0.0176   0.29%   0.3758   0.3702
β1 = 4      4.0106   0.0106   0.27%   0.2066   0.2053
β2 = -3    -2.9958   0.0042   0.14%   0.1385   0.1355
σ² = 1      0.9859  -0.0141   1.41%   0.1354   0.1291
ρ = 0.5     0.4859  -0.0141   2.82%   0.1599   0.1534

Average log-likelihood = -680.7637 (SE = 30.1853)
D(γ,ρ);ME1 = 1.7301e-07
D(β,σ²);ME1 = 7.1621e-08
D(γ,β,σ²,ρ);ME1 = 4.3018e-15

Table C.2.2.D Tobit demand, correcting for self-selection using measurement errors model with μ_i and Σ_i, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5252   0.0252   1.68%   0.3447   0.3381
γ1 = 1      1.0205   0.0205   2.05%   0.1285   0.1314
γ2 = -3    -3.0565  -0.0565   1.88%   0.3282   0.3241
β0 = 6      5.9809  -0.0191   0.32%   0.3779   0.3765
β1 = 4      4.0100   0.0100   0.25%   0.2077   0.2081
β2 = -3    -2.9956   0.0044   0.15%   0.1396   0.1387
σ² = 1      0.9862  -0.0138   1.38%   0.1361   0.1289
ρ = 0.5     0.4847  -0.0153   3.06%   0.1607   0.1534

Average log-likelihood = -680.8440 (SE = 30.2236)
D(γ,ρ);ME2 = 1.2593e-07
D(β,σ²);ME2 = 7.1884e-08
D(γ,β,σ²,ρ);ME2 = 3.1359e-15

C.2.3 Estimates from a self-selection model with measurement errors and a Tobit demand equation (ρ = 0.75)

Results presented below are based on ρ = 0.75. Estimates are obtained after 500 replications, and the average number of respondents is 499.9580 (SE = 21.8165) out of 1,000. In the demand equation, the average number of censored q_i's (q_i = 0) is 361.5160 (SE = 18.9821), and the average number of uncensored q_i's (q_i > 0) is 138.4420 (SE = 11.1505).
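For orientation, the non-respondent term that distinguishes the measurement-errors variants can be sketched in Python. It follows the likelihood procedures in the Appendix D program (fn3 and fn4), where a non-respondent's regressors are replaced by the census-block means and the probit index is divided by sqrt(1 + γ'Σγ); the variant with Σ uses one variance-covariance matrix for everyone, while the variant with Σ_i uses each block's own matrix. The function names and the numbers in the example are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_nonrespondent(gamma, block_mean, block_cov):
    """P(S* <= 0) when S* = g0 + g1*x1 + g2*x2 + e1, e1 ~ N(0, 1),
    and (x1, x2) ~ N(block_mean, block_cov) stands in for the unobserved values.

    block_cov is given as (v11, v22, v12)."""
    g0, g1, g2 = gamma
    m1, m2 = block_mean
    v11, v22, v12 = block_cov
    index = g0 + g1 * m1 + g2 * m2
    # Variance of S* around the index: 1 (from e1) plus gamma' Sigma gamma.
    delta = math.sqrt(1.0 + g1 ** 2 * v11 + g2 ** 2 * v22 + 2.0 * g1 * g2 * v12)
    return norm_cdf(-index / delta)


gamma = (1.5, 1.0, -3.0)
p_exact = prob_nonrespondent(gamma, (3.0, 1.0), (0.0, 0.0, 0.0))    # regressors known exactly
p_noisy = prob_nonrespondent(gamma, (3.0, 1.0), (1.44, 1.0, 0.24))  # block-level information only
```

With a zero variance-covariance matrix the expression reduces to the ordinary probit probability used for the censored sample; a larger Σ pulls the probability toward one half, reflecting how little a block average says about any one non-respondent.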
Table C.2.3.A Tobit estimates without correcting for self-selection bias, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
β0 = 6      5.8553  -0.1447   2.41%   0.3971   0.3561
β1 = 4      4.4434   0.4434  11.09%   0.4808   0.1848
β2 = -3    -3.0273  -0.0273   0.91%   0.1383   0.1328
σ² = 1      0.8767  -0.1233  12.33%   0.1633   0.1054

Average log-likelihood = -228.8807 (SE = 18.4061)
D(β,σ²);S = 6.5670e-07

Table C.2.3.B Tobit demand, correcting for self-selection bias using censored sample, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.4990  -0.0010   0.07%   0.1959   0.2091
γ1 = 1      1.0177   0.0177   1.77%   0.0835   0.0821
γ2 = -3    -3.0368  -0.0368   1.23%   0.1862   0.1943
β0 = 6      6.0087   0.0087   0.15%   0.3409   0.3408
β1 = 4      4.0121   0.0121   0.30%   0.1780   0.1804
β2 = -3    -3.0057  -0.0057   0.19%   0.1254   0.1232
σ² = 1      0.9882  -0.0118   1.18%   0.1302   0.1280
ρ = 0.75    0.7396  -0.0104   1.39%   0.1028   0.0973

Average log-likelihood = -443.3619 (SE = 26.9542)
D(γ,ρ);CEN = 4.8510e-09
D(β,σ²);CEN = 2.8632e-08
D(γ,β,σ²,ρ);CEN = 5.7769e-17

Table C.2.3.C Tobit demand, correcting for self-selection using measurement errors model with μ_i and Σ, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5645   0.0645   4.30%   0.3311   0.3295
γ1 = 1      1.0362   0.0362   3.62%   0.1391   0.1279
γ2 = -3    -3.1039  -0.1039   3.46%   0.3510   0.3210
β0 = 6      6.0113   0.0113   0.19%   0.3453   0.3421
β1 = 4      4.0101   0.0101   0.25%   0.1813   0.1839
β2 = -3    -3.0044  -0.0044   0.15%   0.1260   0.1229
σ² = 1      0.9858  -0.0142   1.42%   0.1325   0.1305
ρ = 0.75    0.7401  -0.0099   1.32%   0.1058   0.0979

Average log-likelihood = -673.0108 (SE = 29.1919)
D(γ,ρ);ME1 = 5.6626e-08
D(β,σ²);ME1 = 3.3398e-08
D(γ,β,σ²,ρ);ME1 = 6.8714e-16

Table C.2.3.D Tobit demand, correcting for self-selection using measurement errors model with μ_i and Σ_i, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5348   0.0348   2.32%   0.3169   0.3248
γ1 = 1      1.0260   0.0260   2.60%   0.1330   0.1262
γ2 = -3    -3.0763  -0.0763   2.54%   0.3360   0.3173
β0 = 6      6.0116   0.0116   0.19%   0.3454   0.3425
β1 = 4      4.0068   0.0068   0.17%   0.1806   0.1841
β2 = -3    -3.0050  -0.0050   0.17%   0.1261   0.1233
σ² = 1      0.9891  -0.0109   1.09%   0.1346   0.1324
ρ = 0.75    0.7397  -0.0103   1.37%   0.1060   0.0981

Average log-likelihood = -673.0461 (SE = 29.1821)
D(γ,ρ);ME2 = 4.1771e-08
D(β,σ²);ME2 = 3.3442e-08
D(γ,β,σ²,ρ);ME2 = 5.1214e-16

C.3.1 Estimates from a self-selection model with measurement errors and a probit demand equation (ρ = 0.25)

Results presented below are based on ρ = 0.25. Estimates are obtained after 500 replications, and the average number of respondents is 502.3760 (SE = 20.6115) out of 1,000. In the demand equation, the average number of left censored q_i's (q_i = 0) is 372.0040 (SE = 18.5927), and the average number of right censored q_i's (q_i = 1) is 130.3720 (SE = 11.2154).

Table C.3.1.A Probit estimates without correcting for self-selection bias, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
β0 = 6      6.2406   0.2406   4.01%   0.9353   0.8337
β1 = 4      4.2972   0.2972   7.43%   0.5777   0.4633
β2 = -3    -3.1363  -0.1363   4.54%   0.3961   0.3442

Average log-likelihood = -85.3713 (SE = 10.3680)
D(β);S = 1.9852e-04

Table C.3.1.B Probit demand, correcting for self-selection bias using censored sample, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5329   0.0329   2.19%   0.2293   0.2187
γ1 = 1      1.0144   0.0144   1.44%   0.0849   0.0852
γ2 = -3    -3.0501  -0.0501   1.67%   0.2122   0.2005
β0 = 6      6.1967   0.1967   3.28%   0.9208   0.8351
β1 = 4      4.1095   0.1095   2.74%   0.5539   0.5104
β2 = -3    -3.0892  -0.0892   2.97%   0.3863   0.3471
ρ = 0.25    0.2527   0.0027   1.08%   0.3060   0.2918

Average log-likelihood = -308.9247 (SE = 20.4965)
D(γ,ρ);CEN = 6.4142e-08
D(β);CEN = 2.1815e-04
D(γ,β,ρ);CEN = 5.7177e-12

Table C.3.1.C Probit demand, correcting for self-selection using measurement errors model with μ_i and Σ, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.6115   0.1115   7.43%   0.3756   0.3605
γ1 = 1      1.0449   0.0449   4.49%   0.1473   0.1384
γ2 = -3    -3.1454  -0.1454   4.85%   0.3698   0.3477
β0 = 6      6.2021   0.2021   3.37%   0.9239   0.8415
β1 = 4      4.1132   0.1132   2.83%   0.5560   0.5124
β2 = -3    -3.0912  -0.0912   3.04%   0.3877   0.3490
ρ = 0.25    0.2534   0.0034   1.36%   0.3092   0.2964

Average log-likelihood = -537.6684 (SE = 22.0822)
D(γ,ρ);ME1 = 8.7102e-07
D(β);ME1 = 2.2009e-04
D(γ,β,ρ);ME1 = 7.8470e-11

Table C.3.1.D Probit demand, correcting for self-selection using measurement errors model with μ_i and Σ_i, ρ = 0.25

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5746   0.0746   4.97%   0.3539   0.3588
γ1 = 1      1.0335   0.0335   3.35%   0.1382   0.1381
γ2 = -3    -3.1112  -0.1112   3.71%   0.3438   0.3468
β0 = 6      6.2002   0.2002   3.34%   0.9230   0.8498
β1 = 4      4.1111   0.1111   2.78%   0.5554   0.5160
β2 = -3    -3.0904  -0.0904   3.01%   0.3873   0.3539
ρ = 0.25    0.2529   0.0029   1.16%   0.3077   0.2922

Average log-likelihood = -537.7948 (SE = 22.0741)
D(γ,ρ);ME2 = 5.7620e-07
D(β);ME2 = 2.2082e-04
D(γ,β,ρ);ME2 = 5.2055e-11

C.3.2 Estimates from a self-selection model with measurement errors and a probit demand equation (ρ = 0.5)

Results presented below are based on ρ = 0.5. Estimates are obtained after 500 replications, and the average number of respondents is 501.0820 (SE = 23.3509) out of 1,000. In the demand equation, the average number of left censored q_i's (q_i = 0) is 366.9400 (SE = 20.4881), and the average number of right censored q_i's (q_i = 1) is 134.1420 (SE = 11.8349).

Table C.3.2.A Probit estimates without correcting for self-selection bias, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
β0 = 6      6.2315   0.2315   3.86%   0.8671   0.8428
β1 = 4      4.5082   0.5082  12.71%   0.7114   0.4876
β2 = -3    -3.1767  -0.1767   5.89%   0.3926   0.3519

Average log-likelihood = -83.3600 (SE = 10.2106)
D(β);S = 4.7665e-04

Table C.3.2.B Probit demand, correcting for self-selection bias using censored sample, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5067   0.0067   0.45%   0.2121   0.2194
γ1 = 1      1.0173   0.0173   1.73%   0.0847   0.0864
γ2 = -3    -3.0376  -0.0376   1.25%   0.1977   0.2034
β0 = 6      6.1576   0.1576   2.63%   0.8458   0.8762
β1 = 4      4.1689   0.1689   4.22%   0.5658   0.5607
β2 = -3    -3.0930  -0.0930   3.10%   0.3672   0.3718
ρ = 0.5     0.4529  -0.0471   9.42%   0.2570   0.2719

Average log-likelihood = -306.9455 (SE = 19.9936)
D(γ,ρ);CEN = 3.3612e-08
D(β);CEN = 1.8584e-04
D(γ,β,ρ);CEN = 2.6645e-12

Table C.3.2.C Probit demand, correcting for self-selection using measurement errors model with μ_i and Σ, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5762   0.0762   5.08%   0.3800   0.3657
γ1 = 1      1.0346   0.0346   3.46%   0.1470   0.1400
γ2 = -3    -3.1046  -0.1046   3.49%   0.3794   0.3516
β0 = 6      6.1572   0.1572   2.62%   0.8451   0.8649
β1 = 4      4.1653   0.1653   4.13%   0.5639   0.5706
β2 = -3    -3.0917  -0.0917   3.06%   0.3661   0.3706
ρ = 0.5     0.4594  -0.0406   8.12%   0.2606   0.2810

Average log-likelihood = -536.5940 (SE = 22.4278)
D(γ,ρ);ME1 = 5.4912e-07
D(β);ME1 = 1.8759e-04
D(γ,β,ρ);ME1 = 4.3700e-11

Table C.3.2.D Probit demand, correcting for self-selection using measurement errors model with μ_i and Σ_i, ρ = 0.5

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5398   0.0398   2.65%   0.3624   0.3494
γ1 = 1      1.0226   0.0226   2.26%   0.1368   0.1378
γ2 = -3    -3.0695  -0.0695   2.32%   0.3529   0.3432
β0 = 6      6.1533   0.1533   2.56%   0.8458   0.8823
β1 = 4      4.1607   0.1607   4.02%   0.5634   0.5731
β2 = -3    -3.0900  -0.0900   3.00%   0.3662   0.3764
ρ = 0.5     0.4586  -0.0414   8.28%   0.2591   0.2795

Average log-likelihood = -536.7139 (SE = 22.4111)
D(γ,ρ);ME2 = 3.6906e-07
D(β);ME2 = 1.8893e-04
D(γ,β,ρ);ME2 = 2.9529e-11

C.3.3 Estimates from a self-selection model with measurement errors and a probit demand equation (ρ = 0.75)

Results presented below are based on ρ = 0.75.
Estimates are obtained after 500 replications, and the average number of respondents is 501.5800 (SE = 22.6946) out of 1,000. In the demand equation, the average number of left censored q_i's (q_i = 0) is 362.0720 (SE = 19.2150), and the average number of right censored q_i's (q_i = 1) is 139.5080 (SE = 11.6935).

Table C.3.3.A Probit estimates without correcting for self-selection bias, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
β0 = 6      6.3941   0.3941   6.57%   0.9873   0.8639
β1 = 4      4.7338   0.7338  18.35%   0.9147   0.5130
β2 = -3    -3.2647  -0.2647   8.82%   0.4696   0.3633

Average log-likelihood = -81.9309 (SE = 10.5347)
D(β);S = 9.4296e-04

Table C.3.3.B Probit demand, correcting for self-selection bias using censored sample, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5155   0.0155   1.03%   0.2221   0.2164
γ1 = 1      1.0151   0.0151   1.51%   0.0842   0.0858
γ2 = -3    -3.0394  -0.0394   1.31%   0.2093   0.2023
β0 = 6      6.2153   0.2153   3.59%   0.9112   0.8915
β1 = 4      4.1852   0.1852   4.63%   0.6221   0.5865
β2 = -3    -3.1079  -0.1079   3.60%   0.4056   0.3839
ρ = 0.75    0.6729  -0.0771  10.28%   0.2064   0.2232

Average log-likelihood = -304.5480 (SE = 19.9643)
D(γ,ρ);CEN = 4.2098e-08
D(β);CEN = 1.3845e-04
D(γ,β,ρ);CEN = 4.7759e-12

Table C.3.3.C Probit demand, correcting for self-selection using measurement errors model with μ_i and Σ, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5747   0.0747   4.98%   0.3503   0.3566
γ1 = 1      1.0448   0.0448   4.48%   0.1481   0.1385
γ2 = -3    -3.1198  -0.1198   3.99%   0.3702   0.3475
β0 = 6      6.2255   0.2255   3.76%   0.9136   0.8503
β1 = 4      4.1936   0.1936   4.84%   0.6260   0.5624
β2 = -3    -3.1121  -0.1121   3.74%   0.4071   0.3655
ρ = 0.75    0.6772  -0.0728   9.71%   0.2047   0.2203

Average log-likelihood = -532.9000 (SE = 22.0023)
D(γ,ρ);ME1 = 5.3192e-07
D(β);ME1 = 1.4048e-04
D(γ,β,ρ);ME1 = 5.8446e-11

Table C.3.3.D Probit demand, correcting for self-selection using measurement errors model with μ_i and Σ_i, ρ = 0.75

Parameter     MEAN     BIAS   %BIAS     RMSE      ASE
γ0 = 1.5    1.5379   0.0379   2.53%   0.3256   0.3475
γ1 = 1      1.0333   0.0333   3.30%   0.1392   0.1355
γ2 = -3    -3.0855  -0.0855   2.85%   0.3414   0.3384
β0 = 6      6.2195   0.2195   3.66%   0.9117   0.8446
β1 = 4      4.1877   0.1877   4.69%   0.6232   0.5606
β2 = -3    -3.1095  -0.1095   3.65%   0.4060   0.3628
ρ = 0.75    0.6751  -0.0749   9.99%   0.2063   0.2202

Average log-likelihood = -532.9650 (SE = 21.9812)
D(γ,ρ);ME2 = 3.6802e-07
D(β);ME2 = 1.3780e-04
D(γ,β,ρ);ME2 = 4.1849e-11

APPENDIX D

A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS: SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A LINEAR DEMAND EQUATION

This GAUSS program is used to conduct Monte Carlo experiments for the self-selection model with measurement errors and a linear demand equation (ρ = 0.25).

new;
use optmum;
output file = tl.out;
iter = 1;
do while iter le 500;
recal:
@-- npop: population size
    nx:   number of variables (x's and e's)
    nb:   number of blocks (nb)
    bobs: number of obs in a block
    ss:   number of obs used to calculate variance-covariance
    smp:  random sub-sample / population (0 <= smp <= 1) --@
npop = 10000;
nx = 5;
nb = 250;
bobs = npop / nb;
ss = 200;
smp = 0.1;
@-- draw random sample as population --@
x = rndn(npop,nx);
let v[5,5] = 1.44  0.24 0.096 0    0
             0.24  1    0.24  0    0
             0.096 0.24 0.64  0    0
             0     0    0     1    0.25
             0     0    0     0.25 1;
sqrtv = chol(v);
x = x*sqrtv;
let a[1,5] = 3 1.5 4 0 0;
x = x + a;
evar = vcx(x[1:ss,1 2]);
clear v, sqrtv, a;
@-- divide population into blocks --@
si = 2000;
s = npop / si;
sg = nb / s;
w = eye(sg) .*. ones(bobs,1);
sx = x[1:si,1 2];
msx = (sx'w/bobs)' .*. ones(bobs,1);
vsx = ( ( ((sx-msx)^2) ~ ((sx[.,1]-msx[.,1]).*(sx[.,2]-msx[.,2])) )'w / (bobs-1) )' .*. ones(bobs,1);
xx = msx~vsx;
i = 2;
do while i <= s;
    l = si*(i-1) + 1;
    h = si * i;
    sx = x[l:h,1 2];
    msx = (sx'w/bobs)' .*. ones(bobs,1);
    vsx = ( ( ((sx-msx)^2) ~ ((sx[.,1]-msx[.,1]).*(sx[.,2]-msx[.,2])) )'w / (bobs-1) )' .*.
        ones(bobs,1);
    sx = msx~vsx;
    xx = xx|sx;
    i = i + 1;
endo;
clear i, l, h, msx, vsx, sx;
@-- xx = [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21] --@
x = x~xx;
clear xx;
@-- extract a random sub-sample from population xx,
    and variables in the random sub-sample are
    [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21]. --@
i = rndu(npop,1);
x = i~x;
data = selif(x,x[.,1] .le smp);
clear x;
data = data[.,2:(cols(data))];
obs = rows(data);
@-- generate dependent variables S* and D*
    S* = 1.5 + 1*x1 - 3*x2 + e1
    D* = 6 + 4*x2 - 3*x3 + e2 --@
let bs[3,1] = 1.5 1 -3;
let bd[3,1] = 6 4 -3;
s = (ones(obs,1)~data[.,1 2]) * bs + data[.,4];
d = (ones(obs,1)~data[.,2 3]) * bd + data[.,5];
@-- rearrange data according to
    s > 0:  xyes = [d x1 x2 x3], and
    s <= 0: xno = [m1 m2 v11 v22 v21]
    also data = [s d x1 x2 x3 m1 m2 v11 v22 v21]. --@
data = s~d~data[.,1:3 6:10];
clear s, d;
yes = selif(data, data[.,1] .gt 0);
no = selif(data, data[.,1] .le 0);
clear data;
nyes = rows(yes);
nno = rows(no);
sxyes = ones(nyes,1)~yes[.,3 4];
dxyes = ones(nyes,1)~yes[.,4 5];
d = yes[.,2];
clear yes;
sxno = ones(nno,1)~no[.,3 4];
sxnom = ones(nno,1)~no[.,6 7];
sxnov = no[.,8 9 10];
clear no;
@-- starting values for [s0 s1 s2 d0 d1 d2 sigma^2 rho] --@
b0 = bs|bd|1|0.25;
@-- gradient tolerance (default = 1e-5) --@
@-- _opgtol = 1e-10; --@
013 “(a dxy ) d eta = 1n es’ es ‘ es’ ; s = sumc(VI()d - es * betag’g2‘y) / nyes; cov = s " in d( es’dxyes stder = sqrt(‘rfiag(cov)); output on; iter ~ nyes ~ beta’ ~ stder’ ~ s; output off; 0 ’ @-- call and print optrnum --@ optset; opgd rc = &focl; {beta , f1, g, retcode} = optmum(&fnl, b0); if retcode ne 0; goto recal; endif; covl = _opfliess; stderl = sqrt(diag(cov1)); ou ut on; (-f1 ~beta1’~stder1’; output off; optset; opgd rc = &foc3; {be , 13, g, retcode} = optmum(&fn3, b0); if retcode ne 0; goto recal; endif; cov3 = _opfliess; stder3 = sqrt(diag(cov3)); ou ut on; (-B ~beta3’~stder3’; - output off; optset; opgd rc = &foc4; Tbeta4, f4, g, retcode} = optmum(&fn4, Do); if retcode ne 0; oto recal; endiI; cov4 = opfhess; stder4 =_ sqrt(diag(cov4)); ou ut on; (~f4 ~beta4’~stder4’; print I! N; output off; ’ 107 @- procedures --@ @- -(log-1ikelihood) function for a self-selection model with measurement errors and a linear demand function using censored data (non-respondents’ independent variables are observed) -@ proc fnl( (amp) local bs, bbd, sigma, rho, dyes, kyes, kno, k, bbs = para[{l 2 3, .;] bbd= para 4 5 6. ,.;? sigma= sqrt(para 7, ].), rho = para[8,. ]; dyes = d - dxyes ‘ bbd; 9“." iinéixy 6381:, P9399 3?. 31,82) / sqsrt(1? )rho"2) ); kno = cdfn(- sxno ‘ bbs); = ln(kyes|kno); retP(-sum6(k)); endp; proc focl ara) , local h s, bbd, sigma, rho, f, cyses’s, SE C. cno, p. be. fbbd. g gss, gbbs, grho, hbbs, k; bbs= para? 2 3,. , bbd= =para 4 5 6,. sigma= sqrt(para7 ,.;]) rho = para[8,.; f = (1 / sigmas :58?“ (d- dxyes " bbd) / sigma), cyes= - rho‘ e(d - es ‘bbd) / sigma ) srt(1- rho‘2; gp = (18 sqrt(l- rho"2))‘ pdfn(cyes); c = cdfnc(cyes); p = pdfn(-sxno’bbs); hc = cdfn(- sxno ’ bbs); fbbd - ((cIumdges ’ bbd) / sigma‘2). ‘ dxyes ); fss =((sdlmd§ryes " bbd)"2- sigma"2) / (2 ' sigma"4) ); 108 gbbd = sumc( ' (- 8P -/ 89) “ (rho / sigma) -’ dxyes ); gss = sumc( I. g‘./sigC) .‘ (rho ' (d - dxyes ‘ bbd)) a‘3) ; gbbs = sumc gp .fgc) .’ sxyes ); grho = sumc - gp .4 gc) ." 
( -((d - dxyes " bd) / si a) )1- rho ') (/ ~(slxyesi1 '15? 3' o ‘ (d - dxyes ' bbd) s1 a - r o ; hbbs = sgilinmc( (hp ./ hc) ." (- sxno) ); k = (gbbs’+hbbs’) ~ (fbbd + gbbd)’ ~ (fss + gss) ~ grho; rctP(-l<); endp; @-- -(log-likelihood) function for a self-selection model with measurement errors and a linear demand function using empirical block mean and estimated variance- covariance (200 obs. from the population) for measurement error proc fi13 ara); local bs, bbd, sigma, rho, dyes, kyes, bbsn, delta, 0, k; bbs = para[1 2 3,.]; bbd = para[4 5 6,.]; sigma = sqrt(para[7,.]); rho = para[8,.]; £368 = d VdXngaes 1') Windy / ' 1 es = 1 s1 ' es $1 a .' crifnc( - sxyes Pbbs - rho ‘ (file‘s / sigma) / sqrt(l - rho"2) ); bbsn = bbs[2 3,.]; delta = sqrt(l + bbsn’evar‘bbsn); kno = cdfn(- sxnom ‘ bbs / delta); k = ln(kyes|kno); retP(-sum9(k)); ndP; 109 proc foc3(ar ara;) local h ,b1, b2, b3, bbd, sigma, rho, f, cyes, gp,g gc, cno, hp, hc, h, fbbd, fss, g bd, gss, gbbs, grho, hb1,hb2, hb3, k; bbs= ara[12 3, .;] b1=bsl,.; b2=bbs2 ; b3=bbs3:: bbd= para[4 5 6,. ,3; sigma= sqrt(para 7, ].); rho = para[8,.; f = (1 / sigma ‘ pdfn( (d- dxyes " bbd) / sigma); cyes- = (- sxyes - rho‘ (d - es ‘bbd) / sigma ) (1 rho"2; = 17(sqrt(1 - rho"2)) pdfn(cyes); gc- = cdfn c (cyes); cno = (- sxnom bbs) ./ sqrt(l + b2"2 " evar[1,1] + b3"2 " evar[2,2] + 2 ‘ b2 b3 evar[1, 2]); hp = (1 /sqrt(1 + b2"2 ‘ evar[1, 1] + b3"2 ‘ evar[2, 2] + 2 b2‘ b3” evar[1,2])) -’;Pdfn(cn0) hc = cdfn(cno); h=1 + b2"2 " evar[1,] + b3"2 ‘ evar[2, 2] + ‘ b2 "‘ b3 ' evar[1,2]; fbbd- = sumc( ((d- dxyes " bbd) / sigma"2). ‘ dxyes ); fss 731111115156 own "2) es "‘ sigma / (2 " sigma 4) ); gbbd = sumc( s-( SSP c/ogC)"(rhO/~°>i811191) ' dxyes ): 8-( 11111./(.gc). " rho (d- dxyes “ bbd)) /2 6‘2". "3) gbb = sumc gc). 
"‘ sxyes ); gm = “13°- gp 6110/ > ' es " si 0a +(r h(<() ‘ (d-xysxyes ‘-bbs r "-(d dxyes ‘ bbd) / sigma) / (1- 11107)) );0 hbl = sumc - hp ./ hc ); hb2- = sumc (h ./hc '( - sxnom[ 2] +sxnom‘ bs.‘c.2'b2‘evar1,1] +2‘b3’ evar[1,].)/(2'h) ); hb3 = sumc( (h ./ hc .-’( sxnom 3] +sxnom bs. * 2‘b3' evar2,2] + 2‘ b2 ’ evar[1,2j) ./(2 ‘h) ); = (gbbs +(hb1~hb2~hb3)) ~ (fbbd + gbbd)’ 110 ~ (fss + 855) ~ grho: retP(-k); endp; -(log-likelihood) function for a self-selection model with measurement errors and a linear demand function using empirical block mean and empirical block vaéiance-covariance for measurement error proc fn4 ara); local bs, bbd,1si a, rho, dyes, kyes, bbs 1, bbs2, elta, kno,k bbs = para[1 2 3,. , bbd= para[4 5 6. sigma= sqrt(para 7, .;]) rho = para[8,. ]; dyes=d1/-dxyes‘bbd; . kyes fincd/ St a) pdfn(dyes / $1 a) fnc-( -ssxyes bbs - rho ‘ es / sqsrt(1 gm-r)rho"2) ); bbsl = bbs[2, ,.]; bbs2= bbs 3, delta=sqrt(1 + bbsl"2‘ Simov[. 1,] + bbs2"2’ sxnov[. ,2] + 2 bbsl ‘ bsz‘ s1mov[.,3]); kno = cdfn(- Simom ‘ bbs ./ delta); k = ln(kyes|kno); retp(-sumc(k)); I endp; proc foc4(para); local bbs, b1, b2, b3, hbbd, sigma, rho, f, cyes, c, cno, h , c,h, fbbd, fss, Egbigi, gss, gblgflc grho, hbl, hb2, hb3, k; bbs = para[1g 2 3, .]; b1=b b2=bb52,. b3 =bbs3,. b_bd= para[4 5 6,. ”7.3 ]) sigma = sqrt(para7 rho = para[8, .]; 111 f = (1 / sigma) ‘ Bdfn( (d - dxyes ‘ bbd) / sigma); eyes = (-sxyes' bs - rho ‘ (d - es 'bbd) / sigma ) / rt(l - rho"2 ; 8? 
= ( sqrt(l - rh0‘2)) ' pdfn(cye8); gc = cdfnc(cyes); cno = (- sxnom ‘ bbs) ./ sqrt(l + b2"2 " sxnov[.,l] + b3‘2 " sxnov[.,2] + 2 “ b2 ‘ b3 ‘ sxnov[.,3]); hp = (1 / sqrt(l + b2"2 “ sxnov[.,l] + b3‘2 " sxnov[.,2] + 2 ‘ b2 ' b3 "' sxnov[.,3])) -‘ Pdfn(cn0); hc = cdfn(cno); h = 1 + b2"2 ‘ sxnov[.,l + b3"2 ‘ sxnov[.,2] + ‘ b2 ‘ b3 ‘ sxnov[.,3]; fbbd = sumc( ((d - dxyes " bbd) / sigrna‘2) .‘ dxyes ); fss =((sdumccl§yes ‘ bbd)"2 sigma"2) / (2 ‘ sigma"4) ); gbbd = sumc( (- 3p -/ gc) ‘ (rho / sigma) -" dxyes ); gss = sumc( $53-41“) .: (rho ' (d - dxyes " bbd)) 3 ; gbbs = sumc g1; fgc) .“ sxyes ); grho = sumc -gp .6 gc) ." ( -((d - dxyes " bdg / si 3) + rho'(-sxyes‘b s-r o’(d-dxyes‘bbd) / sigma) / (1- rh0‘2)) ); hbl = sumc - hp ./ hc ); hb2 = sumc (hip ./ hc ." ( - sxnom[.,2] + sxnom ' bs .' 2 " b2 ' s1mov[.,1] + 2 ‘ b3 ‘ sxnov[.,3]) ./ (2 " h )) ); hb3 = sumc( (h ../ hc .' ( - sxnom[.,3] + smom ‘ bs .‘ 2 ‘ b3 ‘ smov[.,2] + 2 "‘ b2 “ s:mov[.,3]) ./ (2 "' h) ) ); k = (gbbs’+(hb1~hb2~hb3)) ~ (fbbd + gbbd)’ ~ (fss + 388) ~ grho: retP(-k); endp iter = iter + 1; endo; system; APPENDIX E A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS: SELF -SELECTION MODEL WITH MEASUREMENT ERRORS AND A TOBIT DEMAND EQUATION APPENDIX E A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS: SELF -SELECI'ION MODEL WITH MEASUREMENT ERRORS AND A TOBIT DEMAND EQUATION This GAUSS program is used to conduct Monte Carlo experiments for the self-selection model with measurement errors and a Tobit demand equation (p = 0.25). new; use optmum; output file = tt.out; @-- declare global variables «@. declare matnx g_bbd, g_bbs, g_51gma, g_rho; external matrix g_bbd, g_bbs, g_s1gma, g_rho; iter = 1; do while iter le 500; recal: @-- npop: population size nx: number of variables (11’s and e’s) nb: number of blocks (nb) bobs: number of obs in a block . . ss: number of obs used to calculate. 
          variance-covariance
    smp:  random sub-sample / population (0 <= smp <= 1) --@
npop = 10000;
nx = 5;
nb = 250;
bobs = npop / nb;
ss = 200;
smp = 0.1;
@-- draw random sample as population --@
x = rndn(npop,nx);
let v[5,5] = 1.44  0.24 0.096 0    0
             0.24  1    0.24  0    0
             0.096 0.24 0.64  0    0
             0     0    0     1    0.25
             0     0    0     0.25 1;
sqrtv = chol(v);
x = x*sqrtv;
let a[1,5] = 3 1.5 4 0 0;
x = x + a;
evar = vcx(x[1:ss,1 2]);
clear v, sqrtv, a;
@-- divide population into blocks --@
si = 2000;
s = npop / si;
sg = nb / s;
w = eye(sg) .*. ones(bobs,1);
sx = x[1:si,1 2];
msx = (sx'w/bobs)' .*. ones(bobs,1);
vsx = ( ( ((sx-msx)^2) ~ ((sx[.,1]-msx[.,1]).*(sx[.,2]-msx[.,2])) )'w / (bobs-1) )' .*. ones(bobs,1);
xx = msx~vsx;
i = 2;
do while i <= s;
    l = si*(i-1) + 1;
    h = si * i;
    sx = x[l:h,1 2];
    msx = (sx'w/bobs)' .*. ones(bobs,1);
    vsx = ( ( ((sx-msx)^2) ~ ((sx[.,1]-msx[.,1]).*(sx[.,2]-msx[.,2])) )'w / (bobs-1) )' .*. ones(bobs,1);
    sx = msx~vsx;
    xx = xx|sx;
    i = i + 1;
endo;
clear i, l, h, msx, vsx, sx;
@-- xx = [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21] --@
x = x~xx;
clear xx;
@-- extract a random sub-sample from population xx,
    and variables in the random sub-sample are
    [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21]. --@
i = rndu(npop,1);
x = i~x;
data = selif(x,x[.,1] .le smp);
clear x;
data = data[.,2:(cols(data))];
obs = rows(data);
@-- generate dependent variables S* and D*
    S* = 1.5 + 1*x1 - 3*x2 + e1
    D* = 6 + 4*x2 - 3*x3 + e2 --@
let bs[3,1] = 1.5 1 -3;
let bd[3,1] = 6 4 -3;
s = (ones(obs,1)~data[.,1 2]) * bs + data[.,4];
d = (ones(obs,1)~data[.,2 3]) * bd + data[.,5];
@-- rearrange data according to
    s > 0:  xyes = [d x1 x2 x3], and
    s <= 0: xno = [m1 m2 v11 v22 v21]
    also data = [s d x1 x2 x3 m1 m2 v11 v22 v21]. --@
data = s~d~data[.,1:3 6:10];
clear s, d;
s1 = (data[.,1] .gt 0) .and (data[.,2] .gt 0);
s0 = (data[.,1] .gt 0) .and (data[.,2] .le 0);
yes1 = selif(data,s1);
yes0 = selif(data,s0);
no = selif(data, data[.,1] .le 0);
clear s1, s0, data;
nno = rows(no);
nyes1 = rows(yes1);
nyes0 = rows(yes0);
nyes = nyes1 + nyes0;
sxyes1 = ones(nyes1,1)~yes1[.,3 4];
dxyes1 = ones(nyes1,1)~yes1[.,4 5];
d = yes1[.,2];
sxyes0 = ones(nyes0,1)~yes0[.,3 4];
dxyes0 = ones(nyes0,1)~yes0[.,4 5];
clear yes1, yes0;
sxno = ones(nno,1)~no[.,3 4];
sxnom = ones(nno,1)~no[.,6 7];
sxnov = no[.,8 9 10];
clear no;

@-- starting values for [s0 s1 s2 d0 d1 d2 sigma^2 rho] --@
bt = bd|1;
b0 = bs|bd|1|0.25;

@-- call and print optmum --@
optset;
_opgdprc = &foc1;
{beta1, f1, g, retcode} = optmum(&fn1, b0);
if retcode ne 0; goto recal; endif;
cov1 = invpd(hessp(&fn1,beta1));
stder1 = sqrt(diag(cov1));
pout = (-f1)~beta1'~stder1';

optset;
_opgdprc = &foc3;
{beta3, f3, g, retcode} = optmum(&fn3, beta1);
if retcode ne 0; goto recal; endif;
cov3 = invpd(hessp(&fn3,beta3));
stder3 = sqrt(diag(cov3));
pout = pout~(-f3)~beta3'~stder3';

optset;
_opgdprc = &foc4;
{beta4, f4, g, retcode} = optmum(&fn4, beta3);
if retcode ne 0; goto recal; endif;
cov4 = invpd(hessp(&fn4,beta4));
stder4 = sqrt(diag(cov4));
pout = pout~(-f4)~beta4'~stder4';

optset;
_opgdprc = &foc0;
{beta0, f0, g, retcode} = optmum(&fn0, bt);
if retcode ne 0; goto recal; endif;
cov0 = invpd(hessp(&fn0,beta0));
stder0 = sqrt(diag(cov0));

output on;
iter~nyes0~nyes1~nyes~nno~(-f0)~beta0'~stder0'~pout;
print "" "";
output off;

@-- procedures --@

@-- -(log-likelihood) function for a Tobit demand function
    using data from only the respondents --@
proc fn0(para);
local bbd, sigma, l1, l2;
bbd = para[1 2 3,.];
sigma = sqrt(para[4,.])
;
l1 = (1 / sigma) * pdfn((d - dxyes1 * bbd) / sigma);
l2 = cdfn( (- dxyes0 * bbd) / sigma );
retp( - sumc( ln(l1|l2) ) );
endp;

proc foc0(para);
local bbd, s, fb, fs2, f;
bbd = para[1 2 3,.];
s = sqrt(para[4,.]);
fb = sumc( ( pdfn( (- dxyes0 * bbd) / s ) ./ cdfn( (- dxyes0 * bbd) / s ) )
     .* (dxyes0 / s) )'
     - (1 / s^2) * (d - dxyes1 * bbd)'dxyes1;
fs2 = sumc( 0.5 * ( pdfn( (- dxyes0 * bbd) / s ) ./ cdfn( (- dxyes0 * bbd) / s ) )
      .* ( (- dxyes0 * bbd) / s^3 ) )'
      + ( (nyes1 / (2 * s^2))
      - (1 / (2 * s^4)) * (d - dxyes1 * bbd)'(d - dxyes1 * bbd) );
f = fb~fs2;
retp(f);
endp;

@-- -(log-likelihood) function for a self-selection model
    with a Tobit demand function using censored data
    (non-respondents' independent variables are observed) --@
proc fn1(para);
local bbs, bbd, sigma, rho, no, yes0, yes1;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
sigma = sqrt(para[7,.]);
rho = para[8,.];
no = cdfn( - sxno * bbs );
yes0 = cdfn( (- dxyes0 * bbd) / sigma )
       - cdfbvn( ((- dxyes0 * bbd) / sigma), (- sxyes0 * bbs), rho );
yes1 = (1 / sigma) * pdfn((d - dxyes1 * bbd) / sigma)
       .* cdfnc( (- sxyes1 * bbs - (rho / sigma) * (d - dxyes1 * bbd))
       / sqrt(1 - rho^2) );
retp( - sumc( ln(no|yes0|yes1) ) );
endp;

proc foc1(para);
local bbs, bbd, sigma, rho, f, cyes1, gp, gc, cno, hp, hc,
      fbbd, fss, gbbd, gss, gbbs, grho, hbbs, k,
      ip, ic, ibc, iyes0a, iyes0b, ibbd, iss, ibbs, bvd, irho;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
sigma = sqrt(para[7,.]);
rho = para[8,.];
f = (1 / sigma) * pdfn( (d - dxyes1 * bbd) / sigma );
cyes1 = ( - sxyes1 * bbs - rho * (d - dxyes1 * bbd) / sigma ) / sqrt(1 - rho^2);
gp = (1 / sqrt(1 - rho^2)) * pdfn(cyes1);
gc = cdfnc(cyes1);
hp = pdfn( - sxno * bbs );
hc = cdfn( - sxno * bbs );
fbbd = sumc( ((d - dxyes1 * bbd) / sigma^2) .* dxyes1 );
fss = sumc( ((d - dxyes1 * bbd)^2 - sigma^2) / (2 * sigma^4) );
gbbd = sumc( (- gp ./ gc) * (rho / sigma) .* dxyes1 );
gss = sumc( (- gp ./ gc) .* (rho * (d - dxyes1 * bbd)) / (2 * sigma^3) );
gbbs = sumc( (gp ./ gc) .* sxyes1 );
grho = sumc( (- gp ./ gc) .* ( -((d - dxyes1 * bbd) / sigma)
       + rho * (- sxyes1 * bbs - rho * (d - dxyes1 * bbd) / sigma) / (1 - rho^2) ) );
hbbs = sumc( (hp ./ hc) .*
(- sxno) );
g_bbs = bbs;
g_bbd = bbd;
g_sigma = sigma;
g_rho = rho;
ip = pdfn(- dxyes0 * bbd / sigma);
ic = cdfn(- dxyes0 * bbd / sigma);
ibc = ic - cdfbvn( ((- dxyes0 * bbd) / sigma), (- sxyes0 * bbs), rho );
iyes0a = ( - sxyes0 * bbs - rho * (- dxyes0 * bbd) / sigma ) / sqrt(1 - rho^2);
iyes0b = ( - dxyes0 * bbd - rho * sigma * (- sxyes0 * bbs) ) / sqrt(1 - rho^2);
ibbd = sumc( ( ip .* (- dxyes0 / sigma)
       - (1 / sigma) * ip .* cdfn(iyes0a) .* (- dxyes0 / sigma) ) ./ ibc );
iss = sumc( ( ip * .5 .* (dxyes0 * bbd / sigma^(3/2))
      - (1 / sigma) * ip .* cdfn(iyes0a) * (-.5)
      .* (dxyes0 * bbd / sigma^(3/2)) ) ./ ibc );
ibbs = sumc( - pdfn(- sxyes0 * bbs) .* cdfn(iyes0b) .* (- sxyes0) ./ ibc );
bvd = gradp(&bvn, g_rho);
irho = sumc( - bvd ./ ibc );
k = (gbbs' + hbbs' + ibbs') ~ (fbbd + gbbd + ibbd)' ~ (fss + gss + iss)
    ~ (grho + irho);
retp(-k);
endp;

@-- -(log-likelihood) function for a self-selection model
    with measurement errors and a Tobit demand function
    using empirical block mean and estimated variance-
    covariance (200 obs. from the population) for
    measurement error --@
proc fn3(para);
local bbs, bbd, sigma, rho, bbsn, delta, no, yes0, yes1;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
sigma = sqrt(para[7,.]);
rho = para[8,.];
bbsn = bbs[2 3,.];
delta = sqrt(1 + bbsn'evar*bbsn);
no = cdfn( - sxnom * bbs / delta );
yes0 = cdfn( (- dxyes0 * bbd) / sigma )
       - cdfbvn( ((- dxyes0 * bbd) / sigma), (- sxyes0 * bbs), rho );
yes1 = (1 / sigma) * pdfn((d - dxyes1 * bbd) / sigma)
       .* cdfnc( (- sxyes1 * bbs - (rho / sigma) * (d - dxyes1 * bbd))
       / sqrt(1 - rho^2) );
retp( - sumc( ln(no|yes0|yes1) ) );
endp;

proc foc3(para);
local bbs, b1, b2, b3, bbd, sigma, rho, f, cyes1, gp, gc, cno, hp, hc, h,
      fbbd, fss, gbbd, gss, gbbs, grho, hb1, hb2, hb3, k,
      ip, ic, ibc, iyes0a, iyes0b, ibbd, iss, ibbs, bvd, irho;
bbs = para[1 2 3,.];
b1 = bbs[1,.];
b2 = bbs[2,.];
b3 = bbs[3,.];
bbd = para[4 5 6,.];
sigma = sqrt(para[7,.]);
rho = para[8,.]
;
f = (1 / sigma) * pdfn( (d - dxyes1 * bbd) / sigma );
cyes1 = ( - sxyes1 * bbs - rho * (d - dxyes1 * bbd) / sigma ) / sqrt(1 - rho^2);
gp = (1 / sqrt(1 - rho^2)) * pdfn(cyes1);
gc = cdfnc(cyes1);
cno = (- sxnom * bbs) ./ sqrt(1 + b2^2 * evar[1,1] + b3^2 * evar[2,2]
      + 2 * b2 * b3 * evar[1,2]);
hp = (1 / sqrt(1 + b2^2 * evar[1,1] + b3^2 * evar[2,2]
     + 2 * b2 * b3 * evar[1,2])) .* pdfn(cno);
hc = cdfn(cno);
h = 1 + b2^2 * evar[1,1] + b3^2 * evar[2,2] + 2 * b2 * b3 * evar[1,2];
fbbd = sumc( ((d - dxyes1 * bbd) / sigma^2) .* dxyes1 );
fss = sumc( ((d - dxyes1 * bbd)^2 - sigma^2) / (2 * sigma^4) );
gbbd = sumc( (- gp ./ gc) * (rho / sigma) .* dxyes1 );
gss = sumc( (- gp ./ gc) .* (rho * (d - dxyes1 * bbd)) / (2 * sigma^3) );
gbbs = sumc( (gp ./ gc) .* sxyes1 );
grho = sumc( (- gp ./ gc) .* ( -((d - dxyes1 * bbd) / sigma)
       + rho * (- sxyes1 * bbs - rho * (d - dxyes1 * bbd) / sigma) / (1 - rho^2) ) );
hb1 = sumc( - hp ./ hc );
hb2 = sumc( (hp ./ hc) .* ( - sxnom[.,2] + sxnom * bbs
      .* (2 * b2 * evar[1,1] + 2 * b3 * evar[1,2]) ./ (2 * h) ) );
hb3 = sumc( (hp ./ hc) .* ( - sxnom[.,3] + sxnom * bbs
      .* (2 * b3 * evar[2,2] + 2 * b2 * evar[1,2]) ./ (2 * h) ) );
g_bbs = bbs;
g_bbd = bbd;
g_sigma = sigma;
g_rho = rho;
ip = pdfn(- dxyes0 * bbd / sigma);
ic = cdfn(- dxyes0 * bbd / sigma);
ibc = ic - cdfbvn( ((- dxyes0 * bbd) / sigma), (- sxyes0 * bbs), rho );
iyes0a = ( - sxyes0 * bbs - rho * (- dxyes0 * bbd) / sigma ) / sqrt(1 - rho^2);
iyes0b = ( - dxyes0 * bbd - rho * sigma * (- sxyes0 * bbs) ) / sqrt(1 - rho^2);
ibbd = sumc( ( ip .* (- dxyes0 / sigma)
       - (1 / sigma) * ip .* cdfn(iyes0a) .* (- dxyes0 / sigma) ) ./ ibc );
iss = sumc( ( ip * .5 .* (dxyes0 * bbd / sigma^(3/2))
      - (1 / sigma) * ip .* cdfn(iyes0a) * (-.5)
      .* (dxyes0 * bbd / sigma^(3/2)) ) ./
ibc );
ibbs = sumc( - pdfn(- sxyes0 * bbs) .* cdfn(iyes0b) .* (- sxyes0) ./ ibc );
bvd = gradp(&bvn, g_rho);
irho = sumc( - bvd ./ ibc );
k = (gbbs' + (hb1~hb2~hb3) + ibbs') ~ (fbbd + gbbd + ibbd)'
    ~ (fss + gss + iss) ~ (grho + irho);
retp(-k);
endp;

@-- -(log-likelihood) function for a self-selection model
    with measurement errors and a Tobit demand function
    using empirical block mean and empirical block
    variance-covariance for measurement error --@
proc fn4(para);
local bbs, bbd, sigma, rho, bbs1, bbs2, delta, no, yes0, yes1;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
sigma = sqrt(para[7,.]);
rho = para[8,.];
bbs1 = bbs[2,.];
bbs2 = bbs[3,.];
delta = sqrt(1 + bbs1^2 * sxnov[.,1] + bbs2^2 * sxnov[.,2]
        + 2 * bbs1 * bbs2 * sxnov[.,3]);
no = cdfn( - sxnom * bbs ./ delta );
yes0 = cdfn( (- dxyes0 * bbd) / sigma )
       - cdfbvn( ((- dxyes0 * bbd) / sigma), (- sxyes0 * bbs), rho );
yes1 = (1 / sigma) * pdfn((d - dxyes1 * bbd) / sigma)
       .* cdfnc( (- sxyes1 * bbs - (rho / sigma) * (d - dxyes1 * bbd))
       / sqrt(1 - rho^2) );
retp( - sumc( ln(no|yes0|yes1) ) );
endp;

proc foc4(para);
local bbs, b1, b2, b3, bbd, sigma, rho, f, cyes1, gp, gc, cno, hp, hc, h,
      fbbd, fss, gbbd, gss, gbbs, grho, hb1, hb2, hb3, k,
      ip, ic, ibc, iyes0a, iyes0b, ibbd, iss, ibbs, bvd, irho;
bbs = para[1 2 3,.];
b1 = bbs[1,.];
b2 = bbs[2,.];
b3 = bbs[3,.];
bbd = para[4 5 6,.];
sigma = sqrt(para[7,.]);
rho = para[8,.];
f = (1 / sigma) * pdfn( (d - dxyes1 * bbd) / sigma );
cyes1 = ( - sxyes1 * bbs - rho * (d - dxyes1 * bbd) / sigma ) / sqrt(1 - rho^2);
gp = (1 / sqrt(1 - rho^2)) * pdfn(cyes1);
gc = cdfnc(cyes1);
cno = (- sxnom * bbs) ./ sqrt(1 + b2^2 * sxnov[.,1] + b3^2 * sxnov[.,2]
      + 2 * b2 * b3 * sxnov[.,3]);
hp = (1 / sqrt(1 + b2^2 * sxnov[.,1] + b3^2 * sxnov[.,2]
     + 2 * b2 * b3 * sxnov[.,3])) .* pdfn(cno);
hc = cdfn(cno);
h = 1 + b2^2 * sxnov[.,1] + b3^2 * sxnov[.,2] + 2 * b2 * b3 * sxnov[.,3];
fbbd = sumc( ((d - dxyes1 * bbd) / sigma^2) .*
dxyes1 );
fss = sumc( ((d - dxyes1 * bbd)^2 - sigma^2) / (2 * sigma^4) );
gbbd = sumc( (- gp ./ gc) * (rho / sigma) .* dxyes1 );
gss = sumc( (- gp ./ gc) .* (rho * (d - dxyes1 * bbd)) / (2 * sigma^3) );
gbbs = sumc( (gp ./ gc) .* sxyes1 );
grho = sumc( (- gp ./ gc) .* ( -((d - dxyes1 * bbd) / sigma)
       + rho * (- sxyes1 * bbs - rho * (d - dxyes1 * bbd) / sigma) / (1 - rho^2) ) );
hb1 = sumc( - hp ./ hc );
hb2 = sumc( (hp ./ hc) .* ( - sxnom[.,2] + sxnom * bbs
      .* (2 * b2 * sxnov[.,1] + 2 * b3 * sxnov[.,3]) ./ (2 * h) ) );
hb3 = sumc( (hp ./ hc) .* ( - sxnom[.,3] + sxnom * bbs
      .* (2 * b3 * sxnov[.,2] + 2 * b2 * sxnov[.,3]) ./ (2 * h) ) );
g_bbs = bbs;
g_bbd = bbd;
g_sigma = sigma;
g_rho = rho;
ip = pdfn(- dxyes0 * bbd / sigma);
ic = cdfn(- dxyes0 * bbd / sigma);
ibc = ic - cdfbvn( ((- dxyes0 * bbd) / sigma), (- sxyes0 * bbs), rho );
iyes0a = ( - sxyes0 * bbs - rho * (- dxyes0 * bbd) / sigma ) / sqrt(1 - rho^2);
iyes0b = ( - dxyes0 * bbd - rho * sigma * (- sxyes0 * bbs) ) / sqrt(1 - rho^2);
ibbd = sumc( ( ip .* (- dxyes0 / sigma)
       - (1 / sigma) * ip .* cdfn(iyes0a) .* (- dxyes0 / sigma) ) ./ ibc );
iss = sumc( ( ip * .5 .* (dxyes0 * bbd / sigma^(3/2))
      - (1 / sigma) * ip .* cdfn(iyes0a) * (-.5)
      .* (dxyes0 * bbd / sigma^(3/2)) ) ./ ibc );
ibbs = sumc( - pdfn(- sxyes0 * bbs) .* cdfn(iyes0b) .* (- sxyes0) ./ ibc );
bvd = gradp(&bvn, g_rho);
irho = sumc( - bvd ./ ibc );
k = (gbbs' + (hb1~hb2~hb3) + ibbs') ~ (fbbd + gbbd + ibbd)'
    ~ (fss + gss + iss) ~ (grho + irho);
retp(-k);
endp;

proc bvn(r);
local r0;
r0 = r;
retp( cdfbvn( ((- dxyes0 * g_bbd) / g_sigma), (- sxyes0 * g_bbs), r0 ) );
endp;

iter = iter + 1;
endo;
system;

APPENDIX F

A GAUSS PROGRAM FOR MONTE CARLO EXPERIMENTS: SELF-SELECTION MODEL WITH MEASUREMENT ERRORS AND A PROBIT DEMAND EQUATION

This GAUSS program is used to conduct Monte Carlo experiments for the self-selection model with measurement errors and a probit demand equation (ρ = 0.25).
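As a reading aid for the measurement-error procedures in these programs, the non-respondent term can be sketched outside GAUSS. The Python below is only an illustration and not part of the original programs: the names `norm_cdf` and `nonresponse_prob` are hypothetical, and the sketch assumes what the `fn4` procedures appear to compute, namely that a non-respondent's unobserved x1 and x2 are replaced by the block means (m1, m2), and that the block variances and covariance (v11, v22, v21) inflate the variance of the self-selection index.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with the stdlib error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def nonresponse_prob(bs, m, v):
    """P(S* <= 0) for one non-respondent under measurement error.

    bs: (s0, s1, s2) self-selection coefficients
    m:  (m1, m2) block means standing in for the unobserved x1, x2
    v:  (v11, v22, v21) block variances and covariance of x1, x2

    Assumed to mirror the GAUSS statements
        delta = sqrt(1 + bbs1^2*v11 + bbs2^2*v22 + 2*bbs1*bbs2*v21);
        no    = cdfn(-sxnom*bbs ./ delta);
    """
    s0, s1, s2 = bs
    m1, m2 = m
    v11, v22, v21 = v
    # Variance of the index: unit error variance plus the part due to
    # not observing x1 and x2 exactly.
    delta = math.sqrt(1.0 + s1 * s1 * v11 + s2 * s2 * v22 + 2.0 * s1 * s2 * v21)
    return norm_cdf(-(s0 + s1 * m1 + s2 * m2) / delta)


if __name__ == "__main__":
    bs = (1.5, 1.0, -3.0)  # self-selection coefficients used in the experiments
    # Same block means, without and with measurement-error variance.
    print(nonresponse_prob(bs, (4.0, 1.5), (0.0, 0.0, 0.0)))
    print(nonresponse_prob(bs, (4.0, 1.5), (1.44, 1.0, 0.24)))
```

With zero variances the expression reduces to the ordinary probit term used when non-respondents' characteristics are observed. A positive variance pulls the probability toward 0.5, which is how the uncertainty about a non-respondent's own characteristics enters the likelihood.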
new;
use optmum;
output file = tp.out;

@-- declare global variables --@
declare matrix g_bbd, g_bbs, g_rho;
external matrix g_bbd, g_bbs, g_rho;

iter = 1;
do while iter le 500;
recal:

@-- npop: population size
    nx:   number of variables (x's and e's)
    nb:   number of blocks
    bobs: number of obs in a block
    ss:   number of obs used to calculate
          variance-covariance
    smp:  random sub-sample / population (0 <= smp <= 1) --@
npop = 10000;
nx = 5;
nb = 250;
bobs = npop / nb;
ss = 200;
smp = 0.1;

@-- draw random sample as population --@
x = rndn(npop,nx);
let v[5,5] = 1.44  0.24 0.096 0    0
             0.24  1    0.24  0    0
             0.096 0.24 0.64  0    0
             0     0    0     1    0.25
             0     0    0     0.25 1;
sqrtv = chol(v);
x = x * sqrtv;
let a[1,5] = 3 1.5 4 0 0;
x = x + a;
evar = vcx(x[1:ss,1 2]);
clear v, sqrtv, a;

@-- divide population into blocks --@
si = 2000;
s = npop / si;
sg = nb / s;
w = eye(sg) .*. ones(bobs,1);
sx = x[1:si,1 2];
msx = (sx'w / bobs)' .*. ones(bobs,1);
vsx = ( ( ((sx - msx)^2) ~ ((sx[.,1] - msx[.,1]) .* (sx[.,2] - msx[.,2])) )'w
      / (bobs - 1) )' .*. ones(bobs,1);
xx = msx~vsx;
i = 2;
do while i <= s;
    l = si * (i - 1) + 1;
    h = si * i;
    sx = x[l:h,1 2];
    msx = (sx'w / bobs)' .*. ones(bobs,1);
    vsx = ( ( ((sx - msx)^2) ~ ((sx[.,1] - msx[.,1]) .* (sx[.,2] - msx[.,2])) )'w
          / (bobs - 1) )' .*. ones(bobs,1);
    sx = msx~vsx;
    xx = xx|sx;
    i = i + 1;
endo;
clear i, l, h, msx, vsx, sx;

@-- xx = [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21] --@
x = x~xx;
clear xx;

@-- extract a random sub-sample from population xx;
    variables in the random sub-sample are
    [x1 x2 x3 e1 e2 m1 m2 v11 v22 v21] --@
i = rndu(npop,1);
x = i~x;
data = selif(x, x[.,1] .le smp);
clear x;
data = data[.,2:(cols(data))];
obs = rows(data);

@-- generate dependent variables S* and D*
    S* = 1.5 + 1*x1 - 3*x2 + e1
    D* = 6 + 4*x2 - 3*x3 + e2 --@
let bs[3,1] = 1.5 1 -3;
let bd[3,1] = 6 4 -3;
s = (ones(obs,1)~data[.,1 2]) * bs + data[.,4];
d = (ones(obs,1)~data[.,2 3]) * bd + data[.,5];

@-- rearrange data according to
    s > 0:  xyes = [d x1 x2 x3], and
    s <= 0: xno = [m1 m2 v11 v22 v21];
    also data = [s d x1 x2 x3 m1 m2 v11 v22 v21] --@
data = s~d~data[.,1:3 6:10];
clear s, d;
s1 = (data[.,1] .gt 0) .and (data[.,2] .gt 0);
s0 = (data[.,1] .gt 0) .and (data[.,2] .le 0);
yes1 = selif(data,s1);
yes0 = selif(data,s0);
no = selif(data, data[.,1] .le 0);
clear s1, s0, data;
nno = rows(no);
nyes1 = rows(yes1);
nyes0 = rows(yes0);
nyes = nyes1 + nyes0;
sxyes1 = ones(nyes1,1)~yes1[.,3 4];
dxyes1 = ones(nyes1,1)~yes1[.,4 5];
sxyes0 = ones(nyes0,1)~yes0[.,3 4];
dxyes0 = ones(nyes0,1)~yes0[.,4 5];
clear yes1, yes0;
sxno = ones(nno,1)~no[.,3 4];
sxnom = ones(nno,1)~no[.,6 7];
sxnov = no[.,8 9 10];
clear no;

@-- starting values for [s0 s1 s2 d0 d1 d2 rho] --@
bp = bd;
b0 = bs|bd|0.25;

@-- call and print optmum --@
optset;
_opgdprc = &foc1;
{beta1, f1, g, retcode} = optmum(&fn1, b0);
if retcode ne 0; goto recal; endif;
cov1 = invpd(hessp(&fn1,beta1));
stder1 = sqrt(diag(cov1));
pout = (-f1)~beta1'~stder1';

optset;
_opgdprc = &foc3;
{beta3, f3, g, retcode} = optmum(&fn3, beta1);
if retcode ne 0; goto recal; endif;
cov3 = invpd(hessp(&fn3,beta3));
stder3 = sqrt(diag(cov3));
pout = pout~(-f3)~beta3'~stder3';

optset;
_opgdprc = &foc4;
{beta4, f4, g, retcode} = optmum(&fn4, beta3);
if retcode ne 0; goto recal; endif;
cov4 = invpd(hessp(&fn4,beta4));
stder4 = sqrt(diag(cov4));
pout = pout~(-f4)~beta4'~stder4';

optset;
_opgdprc = &foc0;
{beta0, f0, g, retcode} = optmum(&fn0, bp);
if retcode ne 0; goto recal; endif;
cov0 = invpd(hessp(&fn0,beta0));
stder0 = sqrt(diag(cov0));

output on;
iter~nyes0~nyes1~nyes~nno~(-f0)~beta0'~stder0'~pout;
print "" "";
output off;

@-- procedures --@

@-- -(log-likelihood) function for a probit demand function
    using data from only the respondents --@
proc fn0(para);
local bb, l1, l2;
bb = para;
l1 = cdfn( dxyes1 * bb );
l2 = cdfn( - dxyes0 * bb );
retp( - sumc( ln(l1|l2) ) );
endp;

proc foc0(para);
local bb, y, x, ff;
bb = para;
y = ones(nyes1,1)|zeros(nyes0,1);
x = dxyes1|dxyes0;
ff = sumc( ( (y - cdfn(x * bb)) .* pdfn(x * bb) .* x )
     ./ ( cdfn(x * bb) .* cdfnc(x * bb) ) );
retp(-ff');
endp;

@--
    -(log-likelihood) function for a self-selection model
    with a probit demand function using censored data
    (non-respondents' independent variables are observed) --@
proc fn1(para);
local bbs, bbd, rho, no, yes0, yes1, ll;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
no = cdfn( - sxno * bbs );
yes0 = cdfn( - dxyes0 * bbd )
       - cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho );
yes1 = cdfnc( - sxyes1 * bbs )
       - ( cdfn( - dxyes1 * bbd )
       - cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho ) );
trap 1;
ll = ln(no|yes0|yes1);
if scalerr(ll);
    ll = "NAN";
endif;
retp( - sumc(ll) );
endp;

proc foc1(para);
local bbs, bbd, rho, fbbs, den0, gbbd, gbbs,
      den1, hbbd, hbbs, bvr0, bvr1, grho, hrho, k;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
fbbs = sumc( (pdfn(- sxno * bbs) ./ cdfn(- sxno * bbs)) .* (- sxno) );
den0 = cdfn( - dxyes0 * bbd )
       - cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho );
gbbd = sumc( (1 / den0)
       .* cdfnc( ( - sxyes0 * bbs - (rho * (- dxyes0 * bbd)) ) / sqrt(1 - rho^2) )
       .* pdfn( - dxyes0 * bbd ) .* (- dxyes0) );
gbbs = sumc( (1 / den0)
       .* ( - pdfn( - sxyes0 * bbs )
       .* cdfn( ( - dxyes0 * bbd - (rho * (- sxyes0 * bbs)) ) / sqrt(1 - rho^2) ) )
       .* (- sxyes0) );
den1 = cdfnc( - sxyes1 * bbs )
       - ( cdfn( - dxyes1 * bbd )
       - cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho ) );
hbbd = sumc( (-1 / den1)
       .* cdfnc( ( - sxyes1 * bbs - (rho * (- dxyes1 * bbd)) ) / sqrt(1 - rho^2) )
       .* pdfn( - dxyes1 * bbd ) .* (- dxyes1) );
hbbs = sumc( (1 / den1)
       .* ( - pdfn( - sxyes1 * bbs ) + pdfn( - sxyes1 * bbs )
       .* cdfn( ( - dxyes1 * bbd - (rho * (- sxyes1 * bbs)) ) / sqrt(1 - rho^2) ) )
       .* (- sxyes1) );
g_bbs = bbs;
g_bbd = bbd;
g_rho = rho;
bvr0 = gradp(&bvn0, g_rho);
bvr1 = gradp(&bvn1, g_rho);
grho = sumc( (1 / den0) .* (- bvr0) );
hrho = sumc( (1 / den1) .* bvr1 );
k = (fbbs' + gbbs' + hbbs') ~ (gbbd + hbbd)' ~ (grho + hrho);
retp(-k);
endp;

@-- -(log-likelihood) function for a self-selection model
    with measurement errors and a probit demand function
    using empirical block mean and estimated variance-
    covariance (200 obs.
    from the population) for measurement error --@
proc fn3(para);
local bbs, bbd, rho, bbsn, delta, no, yes0, yes1, ll;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
bbsn = bbs[2 3,.];
delta = sqrt(1 + bbsn'evar*bbsn);
no = cdfn( - sxnom * bbs / delta );
yes0 = cdfn( - dxyes0 * bbd )
       - cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho );
yes1 = cdfnc( - sxyes1 * bbs )
       - ( cdfn( - dxyes1 * bbd )
       - cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho ) );
ll = ln(no|yes0|yes1);
retp( - sumc(ll) );
endp;

proc foc3(para);
local bbs, b1, b2, b3, bbd, rho, delta, z,
      fb1, fb2, fb3, den0, gbbd, gbbs, den1, hbbd, hbbs,
      bvr0, bvr1, grho, hrho, k;
bbs = para[1 2 3,.];
b1 = bbs[1,.];
b2 = bbs[2,.];
b3 = bbs[3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
delta = sqrt(1 + b2^2 * evar[1,1] + b3^2 * evar[2,2] + 2 * b2 * b3 * evar[1,2]);
z = - sxnom * bbs / delta;
fb1 = sumc( (- pdfn(z) ./ cdfn(z)) / delta );
fb2 = sumc( (pdfn(z) ./ cdfn(z)) .* ( - sxnom[.,2] / delta + sxnom * bbs
      .* (2 * b2 * evar[1,1] + 2 * b3 * evar[1,2]) / (2 * delta^3) ) );
fb3 = sumc( (pdfn(z) ./ cdfn(z)) .* ( - sxnom[.,3] / delta + sxnom * bbs
      .* (2 * b3 * evar[2,2] + 2 * b2 * evar[1,2]) / (2 * delta^3) ) );
den0 = cdfn( - dxyes0 * bbd )
       - cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho );
gbbd = sumc( (1 / den0)
       .* cdfnc( ( - sxyes0 * bbs - (rho * (- dxyes0 * bbd)) ) / sqrt(1 - rho^2) )
       .* pdfn( - dxyes0 * bbd ) .* (- dxyes0) );
gbbs = sumc( (1 / den0)
       .* ( - pdfn( - sxyes0 * bbs )
       .* cdfn( ( - dxyes0 * bbd - (rho * (- sxyes0 * bbs)) ) / sqrt(1 - rho^2) ) )
       .* (- sxyes0) );
den1 = cdfnc( - sxyes1 * bbs )
       - ( cdfn( - dxyes1 * bbd )
       - cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho ) );
hbbd = sumc( (-1 / den1)
       .* cdfnc( ( - sxyes1 * bbs - (rho * (- dxyes1 * bbd)) ) / sqrt(1 - rho^2) )
       .* pdfn( - dxyes1 * bbd ) .* (- dxyes1) );
hbbs = sumc( (1 / den1)
       .* ( - pdfn( - sxyes1 * bbs ) + pdfn( - sxyes1 * bbs )
       .* cdfn( ( - dxyes1 * bbd - (rho * (- sxyes1 * bbs)) ) / sqrt(1 - rho^2) ) )
       .* (- sxyes1) );
g_bbs = bbs;
g_bbd = bbd;
g_rho = rho;
bvr0 = gradp(&bvn0, g_rho);
bvr1 = gradp(&bvn1, g_rho);
grho = sumc( (1 / den0) .* (- bvr0) );
hrho = sumc( (1 / den1) .*
bvr1 );
k = ((fb1~fb2~fb3) + gbbs' + hbbs') ~ (gbbd + hbbd)' ~ (grho + hrho);
retp(-k);
endp;

@-- -(log-likelihood) function for a self-selection model
    with measurement errors and a probit demand function
    using empirical block mean and empirical block
    variance-covariance for measurement error --@
proc fn4(para);
local bbs, bbd, rho, bbs1, bbs2, delta, no, yes0, yes1, ll;
bbs = para[1 2 3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
bbs1 = bbs[2,.];
bbs2 = bbs[3,.];
delta = sqrt(1 + bbs1^2 * sxnov[.,1] + bbs2^2 * sxnov[.,2]
        + 2 * bbs1 * bbs2 * sxnov[.,3]);
no = cdfn( - sxnom * bbs ./ delta );
yes0 = cdfn( - dxyes0 * bbd )
       - cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho );
yes1 = cdfnc( - sxyes1 * bbs )
       - ( cdfn( - dxyes1 * bbd )
       - cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho ) );
ll = ln(no|yes0|yes1);
retp( - sumc(ll) );
endp;

proc foc4(para);
local bbs, b1, b2, b3, bbd, rho, delta, z,
      fb1, fb2, fb3, den0, gbbd, gbbs, den1, hbbd, hbbs,
      bvr0, bvr1, grho, hrho, k;
bbs = para[1 2 3,.];
b1 = bbs[1,.];
b2 = bbs[2,.];
b3 = bbs[3,.];
bbd = para[4 5 6,.];
rho = para[7,.];
delta = sqrt(1 + b2^2 * sxnov[.,1] + b3^2 * sxnov[.,2]
        + 2 * b2 * b3 * sxnov[.,3]);
z = - sxnom * bbs ./ delta;
fb1 = sumc( (- pdfn(z) ./ cdfn(z)) ./ delta );
fb2 = sumc( (pdfn(z) ./ cdfn(z)) .* ( - sxnom[.,2] ./ delta + sxnom * bbs
      .* (2 * b2 * sxnov[.,1] + 2 * b3 * sxnov[.,3]) ./ (2 * delta^3) ) );
fb3 = sumc( (pdfn(z) ./ cdfn(z)) .* ( - sxnom[.,3] ./ delta + sxnom * bbs
      .* (2 * b3 * sxnov[.,2] + 2 * b2 * sxnov[.,3]) ./ (2 * delta^3) ) );
den0 = cdfn( - dxyes0 * bbd )
       - cdfbvn( (- dxyes0 * bbd), (- sxyes0 * bbs), rho );
gbbd = sumc( (1 / den0)
       .* cdfnc( ( - sxyes0 * bbs - (rho * (- dxyes0 * bbd)) ) / sqrt(1 - rho^2) )
       .* pdfn( - dxyes0 * bbd ) .* (- dxyes0) );
gbbs = sumc( (1 / den0)
       .* ( - pdfn( - sxyes0 * bbs )
       .* cdfn( ( - dxyes0 * bbd - (rho * (- sxyes0 * bbs)) ) / sqrt(1 - rho^2) ) )
       .* (- sxyes0) );
den1 = cdfnc( - sxyes1 * bbs )
       - ( cdfn( - dxyes1 * bbd )
       - cdfbvn( (- dxyes1 * bbd), (- sxyes1 * bbs), rho ) );
hbbd = sumc( (-1 / den1) .*
cdfnc( ( - sxyes1 * bbs - (rho * (- dxyes1 * bbd)) ) / sqrt(1 - rho^2) )
       .* pdfn( - dxyes1 * bbd ) .* (- dxyes1) );
hbbs = sumc( (1 / den1)
       .* ( - pdfn( - sxyes1 * bbs ) + pdfn( - sxyes1 * bbs )
       .* cdfn( ( - dxyes1 * bbd - (rho * (- sxyes1 * bbs)) ) / sqrt(1 - rho^2) ) )
       .* (- sxyes1) );
g_bbs = bbs;
g_bbd = bbd;
g_rho = rho;
bvr0 = gradp(&bvn0, g_rho);
bvr1 = gradp(&bvn1, g_rho);
grho = sumc( (1 / den0) .* (- bvr0) );
hrho = sumc( (1 / den1) .* bvr1 );
k = ((fb1~fb2~fb3) + gbbs' + hbbs') ~ (gbbd + hbbd)' ~ (grho + hrho);
retp(-k);
endp;

proc bvn0(r);
local r0;
r0 = r;
retp( cdfbvn( (- dxyes0 * g_bbd), (- sxyes0 * g_bbs), r0 ) );
endp;

proc bvn1(r);
local r1;
r1 = r;
retp( cdfbvn( (- dxyes1 * g_bbd), (- sxyes1 * g_bbs), r1 ) );
endp;

iter = iter + 1;
endo;
system;

BIBLIOGRAPHY

Bloom, D. E., and M. R. Killingsworth (1985), "Correcting for Truncation Bias Caused by a Latent Truncation Variable," Journal of Econometrics, 27(1): 131-135.

Bockstael, N. E., Strand, I. E., McConnell, K. E., and F. Arsanjani (1990), "Sample Selection Bias in the Estimation of Recreation Demand Functions: An Application to Sportfishing," Land Economics, 66(1): 40-49.

Borjas, G. J. (1987), "Self-Selection and the Earnings of Immigrants," American Economic Review, 77(4): 531-553.

Bowker, J. M., and J. R. Stoll (1988), "Use of Dichotomous Choice Nonmarket Methods to Value the Whooping Crane Resource," American Journal of Agricultural Economics, 70(2): 372-381.

Brown, T. L., Dawson, C. P., Hustin, D. L., and D. J. Decker (1981), "Comments on the Importance of Late Respondent and Nonrespondent Data from Mail Surveys," Journal of Leisure Research, 13(1): 76-79.

Brown, T. L., Decker, D. J., and N. A. Connell (1989), "Response to Mail Surveys on Resource-based Recreation Topics: A Behavioral Model and an Empirical Analysis," Leisure Sciences, 11: 99-110.

Cameron, T. A.
(1988), "A New Paradigm for Valuing Non-market Goods Using Referendum Data: Maximum Likelihood Estimation by Censored Logistic Regression," Journal of Environmental Economics and Management, 15(3): 355-379.

Cameron, T. A., and M. D. James (1987), "Efficient Estimation Methods for 'Closed-ended' Contingent Valuation Surveys," The Review of Economics and Statistics, 69(2): 269-276.

Dhrymes, P. J. (1970), Econometrics, New York, NY: Harper & Row.

Edwards, S. F., and G. D. Anderson (1987), "Overlooked Biases in Contingent Valuation Surveys: Some Considerations," Land Economics, 63(2): 168-178.

Fuller, W. A. (1987), Measurement Error Models, New York, NY: John Wiley & Sons.

Goldberger, A. S. (1981), "Linear Regression After Selection," Journal of Econometrics, 15(3): 357-366.

Goldberger, A. S. (1991), A Course in Econometrics, Cambridge, MA: Harvard University Press.

Goyder, C. J. (1982), "Further Evidence on Factors Affecting Response Rates to Mailed Questionnaires," American Sociological Review, 47(4): 550-553.

Green, E. K. (1991), "Reluctant Respondents: Differences Between Early, Late, and Nonresponders to a Mail Survey," Journal of Experimental Education, 59(3): 268-276.

Green, E. K., and R. F. Kvidahl (1989), "Personalization and Offers of Results: Effects on Response Rates," Journal of Experimental Education, 57(3): 263-270.

Green, E. K., and S. F. Stager (1986), "The Effects of Personalization, Sex, Locale, and Level Taught on Educators' Responses to a Mail Survey," Journal of Experimental Education, 54(4): 203-206.

Greene, W. H. (1981), "Sample Selection Bias as a Specification Error: Comment," Econometrica, 49(3): 795-798.

Greene, W. H. (1990), Econometric Analysis, New York, NY: Macmillan Publishing Company.

Hausman, J. A., and D. A. Wise (1977), "Social Experimentation, Truncated Distributions, and Efficient Estimation," Econometrica, 45(4): 919-938.

Hausman, J. A., and D. A. Wise (1981), "Stratification on Endogenous Variables and Estimation: The Gary Income Maintenance Experiment," in Manski, C. F., and D. McFadden (eds.),
Structural Analysis of Discrete Data with Econometric Applications, 51-111, Cambridge, MA: MIT Press.

Heckman, J. J. (1976), "The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models," Annals of Economic and Social Measurement, 5(4): 475-492.

Heckman, J. J. (1979), "Sample Selection Bias as a Specification Error," Econometrica, 47(1): 153-161.

Heckman, J. J., and G. L. Sedlacek (1985), "Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market," Journal of Political Economy, 93(6): 1077-1125.

Heckman, J. J., and G. L. Sedlacek (1990), "Self-Selection and the Distribution of Hourly Wages," Journal of Labor Economics, 8(1): S329-S363.

Hoehn, J. P., and J. B. Loomis (1993), "Substitution Effects in the Valuation of Multiple Environmental Programs," Journal of Environmental Economics and Management, 25(1): 56-75.

Kanuk, L., and C. Berenson (1975), "Mail Surveys and Response Rate: A Literature Review," Journal of Marketing Research, 12(4): 440-453.

Lee, L. F. (1984), "Tests for Bivariate Normal Distribution in Econometric Models with Selectivity," Econometrica, 52(4): 843-863.

Little, R. J. A. (1985), "A Note About Models for Selectivity Bias," Econometrica, 53(6): 1469-1474.

Little, R. J. A., and D. B. Rubin (1987), Statistical Analysis with Missing Data, New York, NY: John Wiley & Sons.

Loomis, J. B. (1987), "Expanding Contingent Value Sample Estimates to Aggregate Benefit Estimates: Current Practices and Proposed Solutions," Land Economics, 63(4): 396-402.

Maddala, G. S. (1983), Limited-Dependent and Qualitative Variables in Econometrics, New York, NY: Cambridge University Press.

McConnell, K. E. (1990), "Models for Referendum Data: The Structure of Discrete Choice Models for Contingent Valuation," Journal of Environmental Economics and Management, 18(1): 19-34.

Mitchell, R. C., and R. T. Carson (1989), Using Surveys to Value Public Goods: The Contingent Valuation Method, Washington, D.C.: Resources for the Future.

Muthén, B., and K. G. Jöreskog (1983), "Selectivity Problems in Quasi-experimental Studies,"
Evaluation Review, 7(2): 139-174.

Nelson, F. D. (1977), "Censored Regression Models with Unobserved Stochastic Censoring Thresholds," Journal of Econometrics, 6(3): 309-327.

Olsen, R. J. (1980), "A Least Squares Correction for Selectivity Bias," Econometrica, 48(7): 1815-1820.

Ong, P. M., Holt, S., Skumatz, L. A., and R. S. Barnes (1988), "Nonresponse in Residential Energy Surveys: Systematic Patterns and Implications for End-Use Models," Energy Journal, 9(2): 137-151.

Pudney, S. (1989), Modelling Individual Choice: The Econometrics of Corners, Kinks and Holes, Cambridge, MA: Basil Blackwell.

Randall, A. (1987), Resource Economics: An Economic Approach to Natural Resource and Environmental Policy, 2nd Ed., New York, NY: John Wiley & Sons.

Rubin, D. B. (1987), Multiple Imputation for Nonresponse in Surveys, New York, NY: John Wiley & Sons.

Shaw, D. (1988), "On-site Samples' Regression: Problems of Non-negative Integers, Truncation, and Endogenous Stratification," Journal of Econometrics, 37(2): 211-223.

Smith, V. K. (1988), "Selection and Recreation Demand," American Journal of Agricultural Economics, 70(1): 29-36.

van Ravenswaay, E. O., and J. P. Hoehn (1991a), "Contingent Valuation and Food Safety: The Case of Pesticide Residues in Food," Staff Paper No. 91-13, Department of Agricultural Economics, Michigan State University.

van Ravenswaay, E. O., and J. P. Hoehn (1991b), "Consumer Willingness to Pay for Reducing Pesticide Residues in Food: Results of a Nationwide Survey," Staff Paper No. 91-18, Department of Agricultural Economics, Michigan State University.

Walsh, R. G., Loomis, J. B., and R. A. Gillman (1984), "Valuing Option, Existence, and Bequest Demands for Wilderness," Land Economics, 60(1): 14-29.

Whitehead, J. C. (1991), "Environmental Interest Group Behavior and Self-Selection Bias in Contingent Valuation Mail Surveys," Growth and Change, 22(1): 10-21.

Willis, R. J., and S. Rosen (1979), "Education and Self-Selection," Journal of Political Economy, 87(5): S7-S36.

Yatchew, A., and Z. Griliches (1985), "Specification Error in Probit Models," The Review of Economics and Statistics, 67(1): 134-139.