This is to certify that the dissertation entitled PANEL DATA WITH CROSS-SECTION VARIATION IN THE SLOPES AS WELL AS THE INTERCEPT: THE EFFECTS OF UNIONS ON WAGES, presented by CHRISTOPHER MARK CORNWELL, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Economics. Date: November 13, 1985.

PANEL DATA WITH CROSS-SECTION VARIATION IN THE SLOPES AS WELL AS THE INTERCEPT: THE EFFECTS OF UNIONS ON WAGES

by Christopher Mark Cornwell

A DISSERTATION submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY, Department of Economics, 1985

ABSTRACT

Combining time-series and cross-section data is useful in controlling for omitted or unobservable individual specific attributes which may be correlated with the explanatory variables in a regression. A regression function that does not condition on the individual specific effects will not identify the parameters of the model. Econometric models that assume the availability of panel data are usually of the constant slopes and variable intercept form. This study considers a panel data model with cross-sectional variation in some of the slopes as well as the intercept. An established literature exists on the estimation of the simple model. The choice of estimation procedure depends on the assumptions about the individual effects.
We distinguish three sets of assumptions: (1) fixed effects, (2) random effects uncorrelated with the regressors, and (3) random effects correlated with the regressors. The fixed effects model is estimated by analysis of covariance, or within. Generalized least squares is the standard procedure when the effects are random and uncorrelated with the regressors. When the effects are random and correlated with the regressors, the instrumental variables estimator introduced by Hausman and Taylor is appropriate. Each of these estimators is asymptotically well-behaved in the case of many individuals and few time periods. For the general model we derive the analogous within, GLS, and Hausman-Taylor instrumental variables estimators. Furthermore, we prove that these estimators possess the same properties in the general model that they have in the simple model.

Then, we apply some of our theoretical results to an attempt to measure the impact of unions on wages. Conventional wisdom suggests that cross-section estimates are upwardly biased due to the positive correlation of unobserved individual specific attributes, or "ability", with union status. Most often this bias is addressed through a fixed effects specification of the simple model. However, this approach is criticized for ignoring the sectoral dependence of the individual effects. We consider a special case of our model in an attempt to deal with this criticism. We conclude that the conventional wisdom is confirmed in our empirical investigation.

ACKNOWLEDGMENTS

The completion of this dissertation leaves me indebted to many people. To a few people the debt I owe is particularly large. My dissertation chairman, Peter Schmidt, has consistently provided careful guidance and patient instruction. From him I have learned most of what I know about econometrics. The debt I owe to him is certainly beyond my capacity to repay. Dan Hamermesh has similarly provided often needed guidance and direction.
My interest in labor economics is due, in large part, to him. I also owe much to Harry Holzer. I have benefitted greatly from his comments and advice. Perhaps the greatest debt is that which I owe to my wife. Without her love, encouragement, and support this research would still be incomplete. In addition, I am very grateful for the constant support of my parents and the excellent typing provided by Terie Snyder.

TABLE OF CONTENTS

CHAPTER ONE: INTRODUCTION
CHAPTER TWO: FIXED EFFECTS
CHAPTER THREE: RANDOM EFFECTS UNCORRELATED WITH THE REGRESSORS
    Appendix
CHAPTER FOUR: RANDOM EFFECTS CORRELATED WITH THE REGRESSORS
    Appendix A
    Appendix B
CHAPTER FIVE: ESTIMATION OF UNION WAGE DIFFERENTIALS
CHAPTER SIX: SUMMARY AND CONCLUSIONS
FOOTNOTES
    Chapter Two
    Chapter Three
    Chapter Four
    Chapter Five
REFERENCES

LIST OF TABLES

TABLE 1  Means (and Standard Deviations)
TABLE 2  Cross-Section Estimates. Dependent Variable: Log Wage
TABLE 3  Panel Estimates: Intercept Varying. Dependent Variable: Log Wage
TABLE 4  Panel Estimates: Slopes and Intercept Varying. Dependent Variable: Log Wage
TABLE 5  Percent Union Wage Effects
CHAPTER ONE

INTRODUCTION

Panel or longitudinal data are simply time-series observations on a cross-section. The applied economist may have several years of data on individuals, households, or firms. Typically the number of time observations (T) is small and the number of cross-sectional units (N) is large. Combining time-series and cross-section data is particularly useful in controlling for omitted or unobservable attributes specific to the cross-sectional unit (henceforth taken to be an individual) which are correlated with the explanatory variables. A regression function which does not condition on these individual specific effects will not identify the parameters of the model.

Econometric models that assume the availability of panel data most often take the form

(1.1)  $Y_{it} = X_{it}'\beta + \alpha_i + \epsilon_{it}, \quad i = 1,\dots,N;\ t = 1,\dots,T,$

where $X_{it}$ is a vector of explanatory variables and $\epsilon_{it}$ is an iid error. The $\alpha_i$ are individual specific parameters, or effects. So, in (1.1) each individual has a unique intercept.

How we estimate (1.1) depends on our assumptions about the $\alpha_i$. From the literature we can identify three distinct cases: (1) fixed effects, (2) random effects uncorrelated with the regressors, and (3) random effects correlated with the regressors. Case (1) is the weakest set of assumptions. Here the individual effects are taken to be constant over time (no other assumptions about them are necessary). In (2) the $\alpha_i$ are assumed to be iid random variables that are uncorrelated with all of X. This specification is sometimes referred to as the error components model. The last case drops the independence assumption and allows the individual effects to be correlated with some of X.

This study has as its focus the estimation of an obvious generalization of (1.1): a panel data model which allows cross-sectional variation in some of the slopes as well as the intercept.
Such a model can be written as

(1.2)  $Y_{it} = X_{it}'\beta + W_{it}'\delta_i + \epsilon_{it}, \quad i = 1,\dots,N;\ t = 1,\dots,T,$

where $W_{it}$ is a vector of explanatory variables associated with coefficients that depend on i. (Alternatively, we could partition X and $\beta$ and write $Y_{it} = X_{1it}'\beta_1 + X_{2it}'\beta_{2i} + \epsilon_{it}$.) Clearly, if $W_{it}$ is a constant then (1.2) reduces to the simple, intercept varying model.

Our investigation proceeds as follows. The next three chapters present theoretical results. Chapter Two considers the fixed effects case; Chapter Three takes up the case of random effects uncorrelated with the regressors; then, the case of random effects correlated with the regressors is covered in Chapter Four. For each set of assumptions, we first review the results on estimation established for the simple model. Then, we extend these results to the general model. In each case we are interested in estimators which have good asymptotic properties as $N \to \infty$ while T is fixed.

In Chapter Two this means deriving an analog to the within (or analysis of covariance) estimator of the simple model. We also show that under normality, the within estimator for the general model is the conditional MLE. The error components model is traditionally estimated by generalized least squares. So, in Chapter Three, we derive the GLS estimator for our model and prove that the properties of GLS in the simple model carry over to the general model. The groundwork for Chapter Four is laid by Hausman and Taylor (1981) (hereafter referred to as H-T). They develop an instrumental variables procedure for the simple model in which the individual effects are random and correlated with some of the regressors. We derive a similar instrumental variables estimator for our model, and following H-T, detail conditions under which it differs from the fixed effects estimator.

In Chapter Five, we apply some of our theoretical results in an empirical exercise where we attempt to measure the impact of unions on earnings.
Using data from the years 1978-1981 of the Michigan Panel Study of Income Dynamics (PSID), we estimate: (1) the simple cross-sectional earnings equations for the four years of our sample; (2) the usual panel data model in which only the intercept varies across individuals; and (3) a special case of our general panel data model. Conventional wisdom states that the cross-section estimates are upwardly biased due to the positive correlation of unobserved individual specific attributes, collectively referred to as "ability", with union status. Most often this bias is addressed through a fixed effects specification of the simple model. We note some criticisms of this approach and examine a special case of the fixed effects version of our model that attempts to deal with the criticisms. For the sake of comparison, we also estimate (2) and (3) under both sets of random effects assumptions. In general, we are able to confirm the conventional wisdom that cross-section estimates of the union wage effect are upwardly biased.

In Chapter Six, we present a summary of our results and offer some final remarks on panel data models in which some slopes as well as the intercept vary cross-sectionally.

CHAPTER TWO

FIXED EFFECTS

2.1 Introduction

In this chapter, we consider the estimation of (1.1) and (1.2) under the weakest set of assumptions; i.e., fixed effects. Since the number of individual specific parameters increases with sample size, we focus our analysis on the estimation of $\beta$. In particular, we seek estimators of $\beta$ that are consistent in the common panel case of large N and small T.

First, we review the estimation of the simple model. We derive the "within" estimator of covariance analysis, which possesses the above consistency property. This is equivalent to maximum likelihood. Chamberlain (1980) demonstrates that the incidental parameters problem can also be circumvented through a conditional likelihood approach.
He derives the conditional MLE of $\beta$ in (1.1), which is also equivalent to the within estimator.

Secondly, we extend the results of the standard model to the more general model which allows cross-sectional variation in some of the slopes as well as the intercept. There exists a substantial literature on the case in which all coefficients vary across i (see, for example, Judge et al. (1985, section 13.5)). In this case, the model can be considered as N seemingly unrelated regressions; and if not all the coefficients vary across i, then cross-equation restrictions are implied. Since we are concerned with an asymptotic theory in which $N \to \infty$ and T is fixed, this treatment is unsatisfactory. Mundlak (1978) has investigated this case, and notes that (given standard assumptions about the errors) the cross-sectionally constant regression coefficients can be estimated by a version of least squares. So, we follow Chamberlain, and derive the conditional MLE of $\beta$ in (1.2). This is shown to be equivalent to the obvious least squares estimator, which is a comforting result.

Our conclusions are summarized in section four.

2.2 The Standard Model

Recall the usual representation of a linear regression model with panel data. This is described in (1.1) as

$Y_{it} = X_{it}'\beta + \alpha_i + \epsilon_{it}, \quad i = 1,\dots,N;\ t = 1,\dots,T,$

where $\epsilon_{it}$ is assumed to be iid $N(0,\sigma^2)$. The $\alpha_i$ are incidental parameters and $\beta$ is a K-dimensional vector of cross-sectionally constant coefficients.

As outlined in the previous section, we seek an estimator of $\beta$ which is consistent for the usual panel case of large N and small T. It is well known that the "within" estimator of covariance analysis possesses this consistency property. Let us review this estimation procedure.

As a matter of notation, define

(2.2.1)  $Y_i = (Y_{i1}, Y_{i2}, \dots, Y_{iT})'$,

with $X_i$ and $\epsilon_i$ defined analogously. This allows us to write

(2.2.2)  $Y_i = X_i\beta + e_T\alpha_i + \epsilon_i$.

Then, we may consider all NT observations as

(2.2.3)  $Y = X\beta + D\alpha^* + \epsilon$,

where

(2.2.4)  $Y = (Y_1', Y_2', \dots, Y_N')', \quad X = (X_1', X_2', \dots, X_N')'$

and

(2.2.5)  $D = I_N \otimes e_T, \quad \epsilon = (\epsilon_1', \dots, \epsilon_N')', \quad \alpha^* = (\alpha_1, \dots, \alpha_N)'$,

where $e_T$ is a T-dimensional vector of ones. Notice D is simply a matrix of individual indicator (dummy) variables.

The within estimator of $\beta$ is derived as follows. Let

(2.2.6)  $M_D = I_{NT} - D(D'D)^{-1}D'$.

Then transform the data by premultiplying (2.2.3) by $M_D$, thereby obtaining

$M_D Y = M_D X\beta + M_D D\alpha^* + M_D\epsilon$,

which reduces to

(2.2.7)  $M_D Y = M_D X\beta + M_D\epsilon$,

since $M_D D = 0$. This transformation changes a vector of observations into deviations from individual means.¹ Least squares applied to (2.2.7) yields the within estimator of $\beta$, defined as

(2.2.8)  $\hat\beta_W = (X'M_D X)^{-1}X'M_D Y$.

Under the condition that $X_{it}$ varies over time, $\hat\beta_W$ is consistent as $N \to \infty$ for fixed T.

It should be clear that within estimation does not depend on normality of the errors. When we invoke normality, we see that maximum likelihood is, in fact, analysis of covariance. Chamberlain (1980) derives the conditional MLE of $\beta$, which he shows is also equivalent to $\hat\beta_W$. The conditional likelihood approach employs a set of sufficient statistics for the $\alpha_i$, removing any incidental parameters problem. The consistency of $\hat\beta_W$ is confirmed by the coincidence of the conditional and joint MLE's.

2.3 A Generalization

As described in (1.2), a straightforward generalization of the standard panel data model is

$Y_{it} = X_{it}'\beta + W_{it}'\delta_i + \epsilon_{it}, \quad i = 1,\dots,N;\ t = 1,\dots,T.$

The $W_{it}$ and $\delta_i$ are L-dimensional vectors of explanatory variables and coefficients, respectively. The remaining variables and parameters are defined as in the simple model. The distinguishing feature of this model is that we allow for cross-sectional variation in some of the slopes as well as the intercept. Obviously, if $W_{it}$ is a constant, (1.2) reduces to the simple model. Again, we seek a consistent (as $N \to \infty$ for fixed T) estimator of $\beta$.

Let $W_i = (W_{i1}, W_{i2}, \dots, W_{iT})'$. This allows us to write

(2.3.1)  $Y_i = X_i\beta + W_i\delta_i + \epsilon_i$.
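The within transformation of Section 2.2 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names, the simulated data, and the parameter values are hypothetical and are not taken from the dissertation. It computes (2.2.8) once by subtracting individual means and once with the explicit projection $M_D$ from (2.2.6), and the two routes should agree.

```python
import numpy as np

def within_estimator(y, X, N, T):
    """Within estimator (2.2.8) via deviations from individual means.

    y has shape (N*T,) and X has shape (N*T, K); rows are ordered individual
    by individual, so individual i occupies rows i*T .. (i+1)*T - 1.
    """
    K = X.shape[1]
    y2 = y.reshape(N, T)
    y_dev = (y2 - y2.mean(axis=1, keepdims=True)).ravel()      # M_D y
    X3 = X.reshape(N, T, K)
    X_dev = (X3 - X3.mean(axis=1, keepdims=True)).reshape(N * T, K)  # M_D X
    beta, *_ = np.linalg.lstsq(X_dev, y_dev, rcond=None)
    return beta

def projection_estimator(y, X, N, T):
    """Same estimator using M_D = I - D(D'D)^{-1}D' with D = I_N (x) e_T."""
    D = np.kron(np.eye(N), np.ones((T, 1)))
    M_D = np.eye(N * T) - D @ np.linalg.solve(D.T @ D, D.T)
    return np.linalg.solve(X.T @ M_D @ X, X.T @ M_D @ y)

# Hypothetical simulated panel: regressors are correlated with alpha_i.
rng = np.random.default_rng(0)
N, T, K = 200, 4, 2
alpha = rng.normal(size=N)
X = rng.normal(size=(N, T, K)) + alpha[:, None, None]
beta_true = np.array([1.0, -0.5])
y = X @ beta_true + alpha[:, None] + 0.1 * rng.normal(size=(N, T))

X_flat, y_flat = X.reshape(N * T, K), y.ravel()
b_within = within_estimator(y_flat, X_flat, N, T)
b_projection = projection_estimator(y_flat, X_flat, N, T)
```

Demeaning avoids forming the NT x NT matrix $M_D$, which is why it is the usual computational route when N is large.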
Then, considering all NT observations, we obtain

(2.3.2)  $Y = X\beta + Q\delta^* + \epsilon$,

where

(2.3.3)  $Q = \mathrm{diag}(W_1, W_2, \dots, W_N), \quad \delta^* = (\delta_1', \delta_2', \dots, \delta_N')'$.

This general model can be estimated by least squares. By analogy to the within transformation, we premultiply (2.3.2) by the idempotent matrix $M_Q$, which is defined as

(2.3.4)  $M_Q = I_{NT} - Q(Q'Q)^{-1}Q' = \mathrm{diag}(M_1, M_2, \dots, M_N)$

with

(2.3.5)  $M_i = I_T - W_i(W_i'W_i)^{-1}W_i'$.

Then, we may apply least squares to the transformed model,

(2.3.6)  $M_Q Y = M_Q X\beta + M_Q\epsilon$,

which yields the following estimator of $\beta$:

(2.3.7)  $\tilde\beta_W = (X'M_Q X)^{-1}X'M_Q Y = \left(\sum_i X_i'M_i X_i\right)^{-1}\sum_i X_i'M_i Y_i$.

The estimator $\tilde\beta_W$ is consistent for fixed T if $(X'M_Q X)^{-1} \to 0$ as $N \to \infty$. Essentially, this is a condition which requires sufficient temporal variation in $X_i$ not explained by $W_i$. When $W_i$ is only a constant term, we have the familiar condition that $X_{it}$ must vary over time.

As in the standard model, the above is straightforward and does not depend on normality. We now invoke normality to prove that $\tilde\beta_W$ is also the conditional MLE of $\beta$. Following Chamberlain, consider the $\delta_i$ as incidental parameters for which we need to find sufficient statistics. The likelihood of Y, conditional on the sufficient statistics, will not depend on the $\delta_i$. Maximizing this conditional likelihood should provide a consistent estimator of $\beta$.

To prove this, we first show that $W_i'Y_i$ is sufficient for $\delta_i$. Consider the $(T+L) \times 1$ vector

(2.3.8)  $y = (Y_i', (W_i'Y_i)')'$.

The vector y has a (singular) multivariate normal distribution with mean $\mu$ and covariance matrix $\Sigma$, which with the above partitioning gives

(2.3.9)  $\mu_1 = X_i\beta + W_i\delta_i, \quad \mu_2 = W_i'X_i\beta + W_i'W_i\delta_i,$
         $\Sigma_{11} = \sigma^2 I_T, \quad \Sigma_{22} = \sigma^2 W_i'W_i, \quad \Sigma_{12} = \sigma^2 W_i, \quad \Sigma_{21} = \sigma^2 W_i'.$

Standard results on normal distributions imply that the distribution of $Y_i$ conditional on $W_i'Y_i$ is (singular) normal with mean [...]

[...] $\propto \exp\left[-\frac{1}{2\sigma^2}(Y_i - X_i\beta)'M_i(Y_i - X_i\beta)\right]$.
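The conditional moments implied by (2.3.9) follow from the standard partitioned-normal formulas. The lines below are a reconstruction from the definitions in (2.3.8) and (2.3.9), offered as a sketch rather than as a transcription of the original pages; $P_i$ is shorthand introduced here for $W_i(W_i'W_i)^{-1}W_i'$, so that $M_i = I_T - P_i$.

```latex
\begin{align*}
E(Y_i \mid W_i'Y_i)
  &= \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}\,(W_i'Y_i - \mu_2) \\
  &= X_i\beta + W_i\delta_i
     + W_i(W_i'W_i)^{-1}\bigl(W_i'Y_i - W_i'X_i\beta - W_i'W_i\delta_i\bigr) \\
  &= X_i\beta + P_i\,(Y_i - X_i\beta), \\[4pt]
\operatorname{cov}(Y_i \mid W_i'Y_i)
  &= \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}
   = \sigma^2 (I_T - P_i) = \sigma^2 M_i .
\end{align*}
```

Neither moment involves $\delta_i$. Moreover $Y_i - E(Y_i \mid W_i'Y_i) = M_i(Y_i - X_i\beta)$, and because $M_i$ is idempotent with rank $T-L$, the exponent of the conditional density is $-\frac{1}{2\sigma^2}(Y_i - X_i\beta)'M_i(Y_i - X_i\beta)$, with normalizing constant $(2\pi)^{-(T-L)/2}\sigma^{-(T-L)}$ for each individual.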
Since observations are assumed independent across i, we may multiply over i to obtain the conditional likelihood function,

(2.3.17)  $\mathcal{L} = (2\pi)^{-N(T-L)/2}\,\sigma^{-N(T-L)}\exp\left[-\frac{1}{2\sigma^2}\sum_i (Y_i - X_i\beta)'M_i(Y_i - X_i\beta)\right]$.

This likelihood function is indeed maximized by $\tilde\beta_W$ as given in (2.3.7).

2.4 Conclusion

We have considered an extension of the usual fixed effects panel data model in which some of the explanatory variables may have coefficients which vary across i. The results we obtain are essentially the same as those obtained in the standard, simpler case. Our model may be estimated by least squares, and the resulting estimates (of the cross-sectionally constant coefficients) are consistent given a reasonable condition on the variability of the regressors. Under normality this estimator is in fact the conditional MLE. However, we cannot claim asymptotic efficiency. Unlike the direct MLE, the conditional MLE will not, in general, be efficient in the sense that its asymptotic variance equals the Cramer-Rao lower bound (see Andersen (1970)). On the other hand, we do not know of any estimators with superior asymptotic behavior.

CHAPTER THREE

RANDOM EFFECTS UNCORRELATED WITH THE REGRESSORS

3.1 Introduction

Having demonstrated the consistency of the within estimator in the usual (fixed T) panel case, we must now acknowledge two drawbacks of the fixed effects specification. First, for small T, the within estimator is not fully efficient since it ignores variation between individuals. Secondly, time-invariant explanatory variables are annihilated by the within transformation and therefore cannot be incorporated into a fixed effects model. This is potentially a serious problem, since in many applications, attention is focused on the coefficients of such variables (e.g., on the coefficients of race or education in an earnings equation).

As a remedy to the problems of fixed effects models, a random effects specification is sometimes proposed.
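Both the least squares estimator summarized in Section 2.4 and the second drawback noted above can be seen in a small simulation. This is a minimal sketch under stated assumptions: `general_within`, the simulated design, and all parameter values are hypothetical, and the data are artificial rather than drawn from the PSID. The code accumulates the sums in (2.3.7) one individual at a time, then applies $M_i$ to a time-invariant regressor.

```python
import numpy as np

def general_within(Y, X, W):
    """Estimator (2.3.7): (sum_i X_i'M_i X_i)^{-1} sum_i X_i'M_i Y_i.

    Y: (N, T), X: (N, T, K), W: (N, T, L), with T > L so that M_i is nonzero.
    """
    N, T, K = X.shape
    A = np.zeros((K, K))
    b = np.zeros(K)
    for i in range(N):
        Wi = W[i]
        # M_i = I_T - W_i (W_i'W_i)^{-1} W_i', as in (2.3.5)
        M_i = np.eye(T) - Wi @ np.linalg.solve(Wi.T @ Wi, Wi.T)
        A += X[i].T @ M_i @ X[i]
        b += X[i].T @ M_i @ Y[i]
    return np.linalg.solve(A, b)

# Hypothetical design: W_it holds a constant and one regressor whose slope varies with i.
rng = np.random.default_rng(1)
N, T, K = 300, 5, 2
W = np.stack([np.ones((N, T)), rng.normal(size=(N, T))], axis=2)
delta = rng.normal(size=(N, 2)) + np.array([0.0, 2.0])   # individual coefficients delta_i
X = rng.normal(size=(N, T, K)) + delta[:, None, :1]      # X correlated with the intercepts
beta_true = np.array([0.8, -1.2])
Y = X @ beta_true + np.einsum("ntl,nl->nt", W, delta) + 0.1 * rng.normal(size=(N, T))

beta_hat = general_within(Y, X, W)

# A time-invariant regressor is wiped out by M_i whenever W_i contains a constant.
z_i = np.full(T, 3.0)
M_0 = np.eye(T) - W[0] @ np.linalg.solve(W[0].T @ W[0], W[0].T)
annihilated = M_0 @ z_i
```

Working block by block keeps every matrix at size T x T, which suits the large N, small T setting the text has in mind.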
Random effects models take the individual effects to be iid random variables independent of the explanatory variables and the disturbance. Estimation of the simple model, also referred to as the error components model, is well documented. The basic results are reviewed in the next section.

Then, in section three, we extend the results of the error components model to our more general model where some of the slope coefficients are allowed to vary cross-sectionally. Our model, under the assumption of random effects, is essentially a Swamy random coefficient model where some of the coefficients do not vary across i. Section four summarizes our results.

3.2 Error Components

Assume the $\alpha_i$ in (1.1) are iid $N(0,\sigma_\alpha^2)$ variables. Furthermore, let $\alpha_i$ be uncorrelated with the columns of $(X,\epsilon)$. Under these assumptions, the usual panel data model has an error components structure, where

(3.2.1)  $v_{it} = \alpha_i + \epsilon_{it}$,

and therefore

(3.2.2)  $Y_{it} = X_{it}'\beta + v_{it}$.

Then, considering all NT observations,

(3.2.3)  $Y = X\beta + v$,

with $v = D\alpha^* + \epsilon$. Traditionally, models like (3.2.3) are estimated by generalized least squares. The GLS estimator of $\beta$ is defined as

(3.2.4)  $\hat\beta_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y$,

where

(3.2.5)  $\Omega = \mathrm{cov}(v) = \sigma^2 I_{NT} + \sigma_\alpha^2 DD'$

and

$\Omega^{-1} = \frac{1}{\sigma^2}\left[I_{NT} - \frac{\sigma_\alpha^2}{\sigma^2 + T\sigma_\alpha^2}DD'\right]$.

Now, like $\hat\beta_W$, $\hat\beta_{GLS}$ is consistent as $N \to \infty$ for fixed T. But $\hat\beta_{GLS}$ is asymptotically more efficient than $\hat\beta_W$. This efficiency gain is a result of the exploitation by GLS of both within and "between" (across individuals) variation. Within estimation only uses variation within i. However, this efficiency gain disappears and $\hat\beta_{GLS} \to \hat\beta_W$ as $T \to \infty$.¹

An additional advantage of the random effects specification is that time-invariant explanatory variables can be incorporated into the model. (Recall that the within transformation annihilates time-invariant regressors.) So, in the classical panel case (large N, small T), the error components model may be preferred.

One caveat is in order.
The consistency of $\hat\beta_{GLS}$ depends crucially on the assumption that the $\alpha_i$ are uncorrelated with the columns of X. This is often an unreasonable assumption; and, if violated, $\hat\beta_{GLS}$ is no longer consistent.² Ironically, it is inconsistency in the presence of such correlation that led to the original fixed effects model.

3.3 A Generalization

Generalizing the error components model to include cross-sectionally varying slope coefficients is straightforward. Let

(3.3.1)  $\delta_i = \delta_0 + u_i$.

Assume the $u_i$ are iid $N(0,\Delta)$ random variables, with $\Delta \equiv \mathrm{cov}(u_i)$. As in the previous case, take the $u_i$ to be uncorrelated with the explanatory variables and $\epsilon$. Given (3.3.1), the full model may be defined as

(3.3.2)  $Y_{it} = X_{it}'\beta + W_{it}'\delta_0 + v_{it}$,

where

(3.3.3)  $v_{it} = W_{it}'u_i + \epsilon_{it}$.

More conveniently, we have

(3.3.4)  $Y = X\beta + W\delta_0 + v$

and

(3.3.5)  $v = Qu^* + \epsilon$,

where

$W = (W_1', W_2', \dots, W_N')'$ and $u^* = (u_1', u_2', \dots, u_N')'$.

Applying GLS to (3.3.4) we obtain the following estimator of $\beta$:

(3.3.6)  $\tilde\beta_{GLS} = (X'\Omega^{-1/2}M_{W^*}\Omega^{-1/2}X)^{-1}X'\Omega^{-1/2}M_{W^*}\Omega^{-1/2}Y$,

where

(3.3.7)  $M_{W^*} = I - W^*(W^{*\prime}W^*)^{-1}W^{*\prime} = I - \Omega^{-1/2}W[(\Omega^{-1/2}W)'\Omega^{-1/2}W]^{-1}W'\Omega^{-1/2}$,

with $W^* = \Omega^{-1/2}W$, and

(3.3.8)  $\Omega = \mathrm{cov}(v) = \sigma^2 I_{NT} + Q\Lambda Q'$,

with $\Lambda \equiv I_N \otimes \Delta$. Note that

(3.3.9)  $\Omega^{-1} = \frac{1}{\sigma^2}M_Q + Q(Q'Q)^{-1}\Gamma^{-1}(Q'Q)^{-1}Q'$,

where $\Gamma = \sigma^2(Q'Q)^{-1} + \Lambda$. (For the derivation of $\Omega^{-1}$, see the appendix which concludes this chapter.)

Now, as in the usual error components model, $\tilde\beta_{GLS}$ is consistent as $N \to \infty$ for fixed T. Then, do the efficiency results of GLS carry through to this more general model? Indeed, we would be surprised if they did not. It is straightforward to prove that $\tilde\beta_{GLS}$ is efficient relative to $\tilde\beta_W$ for fixed T and that $\tilde\beta_{GLS} \to \tilde\beta_W$ as $T \to \infty$.

If $\tilde\beta_{GLS}$ is efficient relative to $\tilde\beta_W$, then $[\mathrm{cov}(\tilde\beta_W) - \mathrm{cov}(\tilde\beta_{GLS})]$ must be a positive semidefinite (PSD) matrix. It is well known that for any two positive definite matrices A and B, $(A - B)$ is PSD if and only if $(B^{-1} - A^{-1})$ is PSD. Therefore, our problem is to show that

(3.3.10)  $X'\Omega^{-1/2}M_{W^*}\Omega^{-1/2}X - \frac{1}{\sigma^2}X'M_Q X$

is PSD. Rewriting (3.3.10), we obtain

(3.3.11)  $X'\Omega^{-1/2}\left[I - \Omega^{-1/2}W(W'\Omega^{-1}W)^{-1}W'\Omega^{-1/2}\right]\Omega^{-1/2}X - \frac{1}{\sigma^2}X'M_Q X$
          $= X'\Omega^{-1}X - X'\Omega^{-1}W(W'\Omega^{-1}W)^{-1}W'\Omega^{-1}X - \frac{1}{\sigma^2}X'M_Q X$.

Now, given (3.3.9),

(3.3.12)  $X'\Omega^{-1}X = \frac{1}{\sigma^2}X'M_Q X + X'Q(Q'Q)^{-1}\Gamma^{-1}(Q'Q)^{-1}Q'X$

and

(3.3.13)  $X'\Omega^{-1}W(W'\Omega^{-1}W)^{-1}W'\Omega^{-1}X = X'Q(Q'Q)^{-1}\Gamma^{-1}E_{NL}(E_{NL}'\Gamma^{-1}E_{NL})^{-1}E_{NL}'\Gamma^{-1}$ [...]

[...] $[\sigma^2(Q'Q)^{-1} + \Lambda](Q'Q)^{-1}Q'$ [...]

(A.2)  $\Omega^{-1} = \frac{1}{\sigma^2}M_Q + Q(Q'Q)^{-1}\Gamma^{-1}(Q'Q)^{-1}Q'$,

where $\Gamma = \sigma^2(Q'Q)^{-1} + \Lambda$.

CHAPTER FOUR

RANDOM EFFECTS CORRELATED WITH THE REGRESSORS

The conventional random effects specification allows us to include time-invariant explanatory variables which cannot be incorporated into a fixed effects model. In the usual fixed T case, GLS estimation of this specification is more efficient than within estimation of the fixed effects model. However, these improvements usually come at the expense of an untenable assumption: that the individual effects are uncorrelated with all the regressors.

In this chapter we drop this assumption. We investigate random effects panel data models in which the individual effects are assumed correlated with some of the explanatory variables. As in the previous two chapters, we focus on the fixed T case and begin by discussing the results, established by Hausman and Taylor (1981), for a model with only an intercept that varies across i. Briefly, they use prior information to construct exogeneity restrictions that are then employed to derive a consistent and asymptotically efficient instrumental variables estimator. They also derive the conditions under which it differs from the fixed effects estimator.

Next, we generalize the H-T analysis to include slopes that vary across i. Although the derivation of our estimator is more complicated, we show that the results of H-T carry through to the general model. In the last section we offer a brief summary.
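The expression for $\Omega^{-1}$ in (3.3.9), repeated as (A.2), can be checked numerically from the definition of $\Omega$ in (3.3.8). The sketch below assumes arbitrary small dimensions and randomly generated $W_i$ and $\Delta$; every name and value in it is hypothetical and serves only to exercise the algebra.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, L = 4, 5, 2
sigma2 = 0.7

# Q = diag(W_1, ..., W_N): block diagonal with T x L blocks, as in (2.3.3).
Q = np.zeros((N * T, N * L))
for i in range(N):
    Q[i * T:(i + 1) * T, i * L:(i + 1) * L] = rng.normal(size=(T, L))

# Delta = cov(u_i), built to be positive definite; Lambda = I_N (x) Delta.
R = rng.normal(size=(L, L))
Delta = R @ R.T + np.eye(L)
Lam = np.kron(np.eye(N), Delta)

# Omega from (3.3.8).
Omega = sigma2 * np.eye(N * T) + Q @ Lam @ Q.T

# Candidate inverse from (3.3.9), with Gamma = sigma^2 (Q'Q)^{-1} + Lambda.
QtQ_inv = np.linalg.inv(Q.T @ Q)
M_Q = np.eye(N * T) - Q @ QtQ_inv @ Q.T
Gamma = sigma2 * QtQ_inv + Lam
Omega_inv = M_Q / sigma2 + Q @ QtQ_inv @ np.linalg.inv(Gamma) @ QtQ_inv @ Q.T

# Largest deviation of Omega @ Omega_inv from the identity matrix.
max_err = np.abs(Omega @ Omega_inv - np.eye(N * T)).max()
```

The check works because $\Omega = \sigma^2 M_Q + Q\Gamma Q'$ and $M_Q Q = 0$, so the two pieces of the inverse act on orthogonal subspaces.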
4.2 The Hausman-Taylor Analysis

In addressing the problems associated with both the fixed effects and error components specifications, H-T consider the following model:

(4.2.1)  $Y_{it} = X_{it}'\beta + Z_i'\gamma + \alpha_i + \epsilon_{it}$,

where $Z_i$ is a J-dimensional vector of time-invariant explanatory variables (notice it is not indexed by t) and $\gamma$ is a conformably dimensioned parameter vector. The $\alpha_i$, as in the error components model, are assumed to be iid $N(0,\sigma_\alpha^2)$ random variables. However, unlike the usual random effects specification, H-T take the $\alpha_i$ to be correlated with some of the columns of X and Z.

According to H-T, consistent and asymptotically efficient estimation of all of the parameters in (4.2.1) hinges on our ability to distinguish columns of X and Z which are not correlated with the $\alpha_i$. To examine this, let us adopt a more convenient form of the model. Let

(4.2.2)  $Z = Z^* \otimes e_T, \quad Z^* = (Z_1, Z_2, \dots, Z_N)'$.

So, we may write

(4.2.3)  $Y = X\beta + Z\gamma + v$,

where $v = D\alpha^* + \epsilon$ (as before). Then, suppose we have prior information on which of the columns of X and Z are correlated with the $\alpha_i$. Let

(4.2.4)  $X = [X_1, X_2], \quad Z = [Z_1, Z_2]$,

where $X_1$ is $NT \times k_1$, $X_2$ is $NT \times k_2$, $Z_1$ is $NT \times j_1$, and $Z_2$ is $NT \times j_2$ (and $k_1 + k_2 = K$, $j_1 + j_2 = J$). For fixed T, assume

(4.2.5)  $\mathrm{plim}\ \frac{1}{N}X_1'D\alpha^* = 0, \quad \mathrm{plim}\ \frac{1}{N}Z_1'D\alpha^* = 0,$
         $\mathrm{plim}\ \frac{1}{N}X_2'D\alpha^* = h_x \ne 0, \quad \mathrm{plim}\ \frac{1}{N}Z_2'D\alpha^* = h_z \ne 0.$

Now, it should be noted that although the condition $E(\alpha_i \mid X_{it}, Z_i) = 0$ fails, consistent, though inefficient, estimates of $\beta$ and $\gamma$ may still be obtained from the within regression.¹ First, we estimate $\beta$ by within, obtaining $\hat\beta_W$ defined in (2.2.8). Secondly, we compute the within residuals, $(Y - X\hat\beta_W)$. From the within residuals we estimate the individual means, defined as

(4.2.6)  $\hat d = P_D(Y - X\hat\beta_W) = Z\gamma + D\alpha^* + P_D(\epsilon + \text{error})$,

where $P_D = D(D'D)^{-1}D'$ and "error" denotes estimation error from the within regression. Treating $P_D(\epsilon + \text{error})$ as an unobservable zero mean disturbance, we attempt to estimate $\gamma$ from (4.2.6). We know OLS and GLS are inconsistent for $\gamma$ since the $\alpha_i$ are not independent of $Z_2$.
However, if the columns of $X_1$ (which are uncorrelated with $D\alpha^*$) provide sufficient instruments for the columns of $Z_2$ (which are correlated with $D\alpha^*$), consistent estimation of $\gamma$ from (4.2.6) is possible. A necessary condition for this is that the model must include at least as many time-varying exogenous variables as time-invariant endogenous variables; i.e., it must be that $k_1 \ge j_2$. If this condition is fulfilled, instrumental variables applied to (4.2.6), using as instruments

(4.2.7)  $B = [X_1, Z_1]$,

yields the following estimator for $\gamma$ (denoted $\hat\gamma_W$):

(4.2.8)  $\hat\gamma_W = (Z'P_B Z)^{-1}Z'P_B\hat d$,

where $P_B = B(B'B)^{-1}B'$, the projection onto the column space of B. This estimator is consistent for fixed T, but not fully efficient since it is calculated from the within residuals. (Recall $\hat\beta_W$ is not fully efficient since it ignores between variation.)

Now, consistent and asymptotically efficient estimates of $\beta$ and $\gamma$ can be derived if these parameters can be identified using prior information like that given in (4.2.5). Even without (4.2.5) all of the elements of $\beta$ are identifiable, as is clear from the within regression (i.e., $X'M_D X$ is nonsingular). However, without this information, no elements of $\gamma$ are identifiable. But, given (4.2.5) we have the set of instruments²

(4.2.9)  $A = [M_D, X_1, Z_1]$

and the corresponding projection $P_A$, suggesting the following proposition:

Proposition 1 (H-T): A necessary and sufficient condition for the identification of $(\beta,\gamma)$ in (4.2.3) is that $[X, Z]'P_A[X, Z]$ be nonsingular.

And, associated with this rank condition is the order condition to which we referred earlier:

Proposition 2 (H-T): A necessary condition for the identification of $(\beta,\gamma)$ in (4.2.3) is that $k_1 \ge j_2$.

Suppose the parameters of (4.2.3) are identified by the information in (4.2.5).³ Let

(4.2.10)  $\Omega^{-1/2} = M_D + \theta P_D = I_{NT} - (1-\theta)P_D$,

where $\theta = [\sigma^2/(\sigma^2 + T\sigma_\alpha^2)]^{1/2}$.⁴ Then, perform instrumental variables on the transformed equation,

(4.2.11)  $\Omega^{-1/2}Y = \Omega^{-1/2}X\beta + \Omega^{-1/2}Z\gamma + \Omega^{-1/2}v$,

using the set of instruments given by A in (4.2.9). This procedure yields consistent and asymptotically efficient estimates of $\beta$ and $\gamma$. Equivalently, and more conveniently for computation,⁵ we may apply OLS to

(4.2.12)  $P_A\Omega^{-1/2}Y = P_A\Omega^{-1/2}X\beta + P_A\Omega^{-1/2}Z\gamma + P_A\Omega^{-1/2}v$,

where $P_A$ is the projection onto the column space of A. We denote the estimators of $(\beta,\gamma)$ obtained from (4.2.11) or (4.2.12) as $(\hat\beta^*, \hat\gamma^*)$.

Now, in evaluating the information given in (4.2.5), three cases are possible: under-identification, exact-identification, and over-identification. First, in the under-identified case ($k_1 < j_2$) [...]

[...] $j_2$,⁹ we perform instrumental variables on (4.3.5), using the set of instruments¹⁰

(4.3.8)  $B^* = \Omega^{-1/2}B = \Omega^{-1/2}(X_1, Z_1, W)$.

This yields

(4.3.9)  $(\tilde\gamma_W', \tilde\delta_{0W}')' = [(Z,W)'\Omega^{-1/2}P_{B^*}\Omega^{-1/2}(Z,W)]^{-1}(Z,W)'\Omega^{-1/2}P_{B^*}\Omega^{-1/2}(Y - X\tilde\beta_W)$,

with $P_{B^*} = B^*(B^{*\prime}B^*)^{-1}B^{*\prime}$.

To understand this, recall the definition of the H-T $\Omega^{-1/2}$ in (4.2.10). Suppose we substitute (4.2.10) for $P_D$ in the calculation of $\hat d$. Then, instrumental variables using $B^* = \Omega^{-1/2}(X_1, Z_1)$ yields

(4.3.10)  $\hat\gamma_W = (Z'\Omega^{-1/2}P_{B^*}\Omega^{-1/2}Z)^{-1}Z'\Omega^{-1/2}P_{B^*}\Omega^{-1/2}(Y - X\hat\beta_W)$,

which is generally different from $\hat\gamma_W$ in (4.2.8).¹¹ However, when $k_1 = j_2$ (the exact identification case), both (4.2.8) and (4.3.10) are equivalent to¹²

(4.3.11)  $\hat\gamma_W = (B'Z)^{-1}B'(Y - X\hat\beta_W)$.

So, in the case in which this procedure is appropriate (in the sense that it is equivalent to "efficient" estimation when $k_1 = j_2$), it makes no difference in the H-T model whether we use $P_D(Y - X\hat\beta_W)$ or $\Omega^{-1/2}(Y - X\hat\beta_W)$ for $\hat d$, or whether we employ $(X_1, Z_1)$ or $\Omega^{-1/2}(X_1, Z_1)$ as instruments. In our model, it is never the case that substitution of $P_D$ for $\Omega^{-1/2}$ and B for $B^*$ results in estimates equivalent to the correct $\tilde\gamma_W$ and $\tilde\delta_{0W}$ given in (4.3.9).

In sum, consistent but inefficient estimation, based on the fixed effects regression, of all the parameters in (4.3.2) is possible if the columns of $X_1$ provide sufficient instruments for the columns of $Z_2$.
As before, the inefficiency is grounded in the use of the fixed effects residuals.

Now we turn to efficient estimation of the model. Specifically, we seek identifying restrictions from which instruments may be formed to estimate our model consistently and asymptotically efficiently. Following H-T, consider the set of instruments

(4.3.12)  $A = [M_Q, X_1, Z_1, W]$,

and the projection onto the column space of A, $P_A$. Then, given Propositions 1 and 2, rank and order conditions for identification are easily derived. The order condition, which is mentioned above, is the same as in H-T; namely, that $k_1 \ge j_2$ is a necessary but not sufficient condition for the identification of $\beta$, $\gamma$, and $\delta_0$ in (4.3.2). The rank condition is almost the same as in H-T: a necessary and sufficient condition for the identification of all the parameters in (4.3.2) is that $G'P_A G$ be nonsingular, where $G = (X, Z, W)$.

Suppose the rank condition is fulfilled by the information in (4.3.4).¹³ Then, similarly to (4.2.11), we transform (4.3.2) by our $\Omega^{-1/2}$ and perform instrumental variables using the set of instruments

(4.3.13)  $A^* = \Omega^{-1/2}(M_Q, X_1, Z_1, W)$.

Equivalently, we may apply OLS to

(4.3.14)  $P_{A^*}\Omega^{-1/2}Y = P_{A^*}\Omega^{-1/2}X\beta + P_{A^*}\Omega^{-1/2}Z\gamma + P_{A^*}\Omega^{-1/2}W\delta_0 + P_{A^*}\Omega^{-1/2}v$,

where $P_{A^*}$ is the projection onto the column space of $A^*$. This yields

(4.3.15)  $(\tilde\beta^{*\prime}, \tilde\gamma^{*\prime}, \tilde\delta_0^{*\prime})' = (G'\Omega^{-1/2}P_{A^*}\Omega^{-1/2}G)^{-1}G'\Omega^{-1/2}P_{A^*}\Omega^{-1/2}Y$.

These estimates are consistent and asymptotically efficient.

Returning to the information given in (4.3.4), we (again) distinguish three separate cases. Appendix A formally derives the characteristics of $\tilde\beta^*$, $\tilde\gamma^*$, and $\tilde\delta_0^*$ when the model is under-identified, exactly-identified, or over-identified. Although the derivations differ,¹⁴ the characteristics of these estimators when $k_1 \lesseqgtr j_2$ are essentially the same as in the H-T model. To summarize,

if $k_1 < j_2$, $\tilde\beta^* = \tilde\beta_W$ and $(\tilde\gamma^*, \tilde\delta_0^*)$ do not exist;
if $k_1 = j_2$, $(\tilde\beta^*, \tilde\gamma^*, \tilde\delta_0^*) = (\tilde\beta_W, \tilde\gamma_W, \tilde\delta_{0W})$;
if $k_1 > j_2$, $(\tilde\beta^*, \tilde\gamma^*, \tilde\delta_0^*) \ne (\tilde\beta_W, \tilde\gamma_W, \tilde\delta_{0W})$, with the former being more efficient than the latter.

Finally, there is the following computational note.
While it is possible to calculate $\Omega^{-1/2}$ (or $F$) and estimate the transformed model (see Appendix B), it is not necessary. Instead, we may directly calculate $(\tilde\gamma_W,\tilde\delta_{0W})$ and $(\tilde\beta^*,\tilde\gamma^*,\tilde\delta_0^*)$ using $\Omega^{-1}$ (or $F^2$). Recall that $\Omega^{-1}$ is defined in (3.3.9) as $\left[\frac{1}{\sigma^2}M_Q + Q(Q'Q)^{-1}F^{-1}(Q'Q)^{-1}Q'\right]$. However, this may be expressed as

(4.3.16)  $\Omega^{-1} = \frac{1}{\sigma^2}M_Q + F^2$ .

So, in the consistent but inefficient procedure,

(4.3.17)  $\Omega^{-1/2}P_{B^*}\Omega^{-1/2} = \Omega^{-1}B(B'\Omega^{-1}B)^{-1}B'\Omega^{-1}$ ,

simplifying (4.3.9) to

(4.3.18)  $\begin{pmatrix}\tilde\gamma_W\\ \tilde\delta_{0W}\end{pmatrix} = [(Z,W)'F^2B(B'\Omega^{-1}B)^{-1}B'F^2(Z,W)]^{-1}(Z,W)'F^2B(B'\Omega^{-1}B)^{-1}B'F^2(Y - X\tilde\beta_W)$ .

And since $P_{A^*} - M_Q$ is a projection onto $P_Q\Omega^{-1/2}B$, the efficient estimator in (4.3.15) can be computed as

(4.3.19)  $\begin{pmatrix}\tilde\beta^*\\ \tilde\gamma^*\\ \tilde\delta_0^*\end{pmatrix} = \left\{G'\left[\frac{1}{\sigma^2}M_Q + F^2B(B'F^2B)^{-1}B'F^2\right]G\right\}^{-1}G'\left[\frac{1}{\sigma^2}M_Q + F^2B(B'F^2B)^{-1}B'F^2\right]Y$ .

4.4 Summary

We have considered a random effects specification of our model in which the unreasonable independence assumption of Chapter Three is dropped and time-invariant explanatory variables are added. Following Hausman and Taylor (1981), we derived a consistent and asymptotically efficient estimator of our model using identifying restrictions constructed from prior information about which explanatory variables are uncorrelated with the individual effects. We also derived conditions under which this estimator differs from the within estimator discussed in section three of Chapter Two. This represents a significant improvement over the fixed effects specification, since we now can estimate coefficients of time-invariant variables and gain efficiency without requiring that all the regressors be exogenous.

CHAPTER FOUR APPENDICES

CHAPTER FOUR APPENDIX A

To derive the characteristics of $\tilde\beta^*$, $\tilde\gamma^*$, and $\tilde\delta_0^*$ when $k_1 \lesseqgtr j_2$, we will make use of the following lemma, due to Trevor Breusch (personal communication):

Lemma: Let $H$ and $C$ be $n \times m$ and $n \times p$ matrices respectively, such that $H$ and $H'C$ both have rank $m$. Then $H$ and $H(H'H)^{-1}H'C$ have the same column space.
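The lemma can be illustrated numerically. The sketch below, in Python with NumPy, draws random matrices (an assumption made purely for illustration; the dimensions `n`, `m`, `p` are arbitrary) and compares column spaces through matrix ranks: two matrices span the same space when each has the same rank as their horizontal concatenation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 12, 3, 5

H = rng.normal(size=(n, m))
C = rng.normal(size=(n, p))


def projected(H, C):
    """Return H (H'H)^{-1} H'C."""
    return H @ np.linalg.solve(H.T @ H, H.T @ C)


def same_column_space(U, V):
    """True when U and V span the same column space."""
    rank = np.linalg.matrix_rank
    return rank(U) == rank(V) == rank(np.hstack([U, V]))


# Rank conditions of the lemma hold: rank(H) = rank(H'C) = m.
lemma_holds = same_column_space(H, projected(H, C))

# With only two columns in C, H'C has rank 2 < m and the conclusion fails.
C_small = C[:, :2]
fails_without_rank = same_column_space(H, projected(H, C_small))

print(lemma_holds, fails_without_rank)
```

The second case shows why the rank condition on $H'C$ matters: when $C$ has too few columns, the projected matrix spans only a proper subspace of the column space of $H$.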
Case I (under-identification): If $k_1 < j_2$, $\tilde\beta^* = \tilde\beta_W$ and $(\tilde\gamma^*,\tilde\delta_0^*)$ does not exist.

First, consider $\beta$. The "efficient" estimator of $\beta$, $\tilde\beta^*$, is obtained (separately) by OLS of $P_{A^*}\Omega^{-1/2}Y$ on the part of $P_{A^*}\Omega^{-1/2}X$ orthogonal to $P_{A^*}\Omega^{-1/2}(Z,W)$. Now, since $A^* = \Omega^{-1/2}(M_Q,X_1,Z_1,W) = (M_Q,B^*)$, $P_{A^*} = M_Q$ + projection onto the part of $B^*$ orthogonal to $M_Q$; i.e.,

(A.1)  $P_{A^*} = M_Q + P_QB^*(B^{*\prime}P_QB^*)^{-1}B^{*\prime}P_Q = M_Q + P_Q\Omega^{-1/2}B(B'\Omega^{-1/2}P_Q\Omega^{-1/2}B)^{-1}B'\Omega^{-1/2}P_Q$ .

But,

(A.2)  $P_{A^*} = M_Q + FB(B'F^2B)^{-1}B'F$

and

(A.3)  $P_{A^*}\Omega^{-1/2} = \frac{1}{\sigma}M_Q + FB(B'F^2B)^{-1}B'F^2$ .

Therefore,

(A.4)  $P_{A^*}\Omega^{-1/2}X = \frac{1}{\sigma}M_QX + FB(B'F^2B)^{-1}B'F^2X$

(A.5)  $P_{A^*}\Omega^{-1/2}(Z,W) = FB(B'F^2B)^{-1}B'F^2(Z,W)$

(A.6)  $P_{A^*}\Omega^{-1/2}Y = \frac{1}{\sigma}M_QY + FB(B'F^2B)^{-1}B'F^2Y$ .

Given the above lemma (with $H = FB$ and $C = F(Z,W)$), when $k_1 < j_2$, the rank of $B$ determines the rank of $P_{A^*}\Omega^{-1/2}(Z,W)$. Thus, both $P_{A^*}\Omega^{-1/2}(Z,W)$ and $FB$ share the same column space and null space. Hence, the part of $P_{A^*}\Omega^{-1/2}X$ orthogonal to $P_{A^*}\Omega^{-1/2}(Z,W)$ must also be orthogonal to $FB$. This part of $P_{A^*}\Omega^{-1/2}X$ is $\frac{1}{\sigma}M_QX$. So, when $k_1 < j_2$:

(A.7)  $\tilde\beta^* = \left[\left(\tfrac{1}{\sigma}M_QX\right)'\left(\tfrac{1}{\sigma}M_QX\right)\right]^{-1}\left(\tfrac{1}{\sigma}M_QX\right)'\left[\tfrac{1}{\sigma}M_Q + FB(B'F^2B)^{-1}B'F^2\right]Y = (X'M_QX)^{-1}X'M_QY = \tilde\beta_W$ .

Now, consider $\gamma$ and $\delta_0$. Since the column space of $P_{A^*}\Omega^{-1/2}(Z,W)$ equals the column space of $FB$, when $k_1 <$ [...]

[...] $> j_2$, $(\tilde\gamma^*,\tilde\delta_0^*) \ne (\tilde\gamma_W,\tilde\delta_{0W})$ and the former are more efficient.

If $k_1 > j_2$, rank$(FB)$ > rank $F(W,Z)$. Then, the column space of $P_{A^*}\Omega^{-1/2}(Z,W)$ is a proper subspace of the column space of $FB$. Intuitively, this means that there are parts of $P_{A^*}\Omega^{-1/2}X$ orthogonal to $P_{A^*}\Omega^{-1/2}(Z,W)$ even though they are not orthogonal to $FB$. Hence $\tilde\beta^* \ne \tilde\beta_W$.

Since $\tilde\beta^* \ne \tilde\beta_W$ in this case, $(Y - X\tilde\beta^*) \ne d = (Y - X\tilde\beta_W)$. Additionally, there is the general nonequivalence of $B'\Omega^{-1}B$ and $B'F^2B$ (which is not mentioned by H-T). So, for two reasons, $(\tilde\gamma^*,\tilde\delta_0^*) \ne (\tilde\gamma_W,\tilde\delta_{0W})$. And, because $(\tilde\gamma^*,\tilde\delta_0^*)$ is asymptotically efficient, $(\tilde\gamma_W,\tilde\delta_{0W})$ is not in this case.
CHAPTER FOUR APPENDIX B

We now consider the computation of the consistent and asymptotically efficient estimates defined in (4.3.15). First, we need a consistent $\Omega^{-1/2}$. More precisely, we need consistent (as $N \to \infty$) estimation of $\sigma^2$ and $\Delta$, since

(B.1)  $\Omega^{-1/2} = \frac{1}{\sigma}M_Q + F = \frac{1}{\sigma}M_Q + Q$ [...]

[...] $> 0$. In other words, there exists an omitted variables problem, since in the simple cross-section we are unable to control for those individual wage determining attributes, collectively referred to as "ability", which are correlated with union status. Consequently, $\varepsilon_i$ is positively correlated with $U_i$ and the least squares estimate of $\delta$ is biased upward.

With the availability of panel data, these individual attributes, or effects, can be taken into account. Typically this has been done within a fixed-effects framework,2 where

(5.1.3)  [...] ,  $i = 1,\ldots,N$,  $t = 1,\ldots,T$ ,

and $\alpha_i$ is the individual effect, assumed constant over time. The inclusion of the $\alpha_i$'s can be viewed as a form of differencing. Indeed, this is the essence of the within transformation: to difference away the correlation between the error and union status. Then, applying least squares to the within-transformed equation yields an unbiased and consistent estimate of the union wage differential. The notion of omitted variable bias is supported by those studies in which models like (5.1.3) are estimated. Invariably these analyses report lower union wage effects than do cross-section studies.3

Now, models like (5.1.3) are also subject to criticism. First, there is the issue of measurement error with respect to the union status variable. This problem is not addressed in this paper, but Freeman (1984) and Chowdhury and Nickell (1985) correctly point out that errors in the reporting of union status are accentuated in longitudinal studies, since the estimation of panel data models usually depends on a small number of union status changes.

A second criticism, which we do consider, is offered by Stewart (1983).
He notes that the standard fixed effects model (varying intercept only) ignores the possibility that the individual effects may be sector dependent. In recognizing that the processes which determine wages are different in the union and nonunion sectors, Stewart constructs a model which allows the union wage differential to vary across people. As we will see below, his model is just a special case of the fixed effects version of our model presented in section 3 of Chapter Two. We may express Stewart's model as

(5.1.4)  [...]

where

(5.1.5)  $\phi_{it} =$ [...]

Clearly, if $\lambda = 1$, (5.1.4) reduces to the standard fixed effects model. If $\lambda \ne 1$, then the individual effects are sector dependent. Stewart suggests $\lambda <$ [...]

[...] $>$ 50,000), city size (SMSA) is given the value 1. Similarly, we use a region (REG) dummy, set equal to 1 if an individual lives in the South. Marital status (MARR) is recorded with a value of 1 if married. The gender variable (SEX) is equal to 1 if an individual is male. An individual's race (RACE) is defined as 1 if he/she is white (the nonwhite category only consists of blacks). Union coverage (i.e., whether a person's job is covered by a union contract), rather than union membership, is chosen as the union status variable. It (CB) is set to 1 if a person's job is unionized. Finally, we include a series of one-digit occupation dummies to control for the effect of occupation on wages.

TABLE 1
Means (and Standard Deviations)

Variable   1978           1979           1980           1981           Pooled
LW         6.3706         6.4779         6.5773         6.6616         6.5218
           (.4843)        (.4825)        (.4707)        (.4688)        (.4902)
ED         12.1219        12.1237        12.1237        12.1237        12.1232
           (2.8715)       (2.8687)       (2.8687)       (2.8687)       (2.8687)
X          13.9461        14.9461        15.9461        16.9461        15.4461
           (11.0018)      (11.0018)      (11.0018)      (11.0018)      (11.0561)
X2         315.4619       344.3540       375.2462       408.1383       360.8001
           (412.2647)     (433.4538)     (454.7202)     (476.0537)     (446.0018)
TEN        70.0305        77.7907        88.4877        119.2339       88.8882
           (77.6660)      (81.0625)      (90.0519)      (103.6219)     (90.6037)
TEN2       10932.7456     12618.6829     15936.4297     24947.4209     16108.9448
           (23348.9671)   (25332.7136)   (30691.7007)   (40331.7051)   (31108.8640)
SMSA       .6940          .6858          .6899          .6858          .6889
           (.4610)        (.4643)        (.4627)        (.4643)        (.4630)
REG        .4302          .4314          .4308          .4326          .4313
           (.4953)        (.4954)        (.4853)        (.4956)        (.4953)
MARR       .7145          .7216          .7315          .7292          .7242
           (.4518)        (.4484)        (.4433)        (.4445)        (.4469)
SEX        .8224          .8224          .8224          .8224          .8224
           (.3823)        (.3823)        (.3823)        (.3823)        (.3822)
RACE       .6676          .6676          .6676          .6676          .6676
           (.4712)        (.4712)        (.4712)        (.4712)        (.4712)
CB         .3458          .3646          .3581          .3628          .3579
           (.4893)        (.4825)        (.4707)        (.4688)        (.4902)

N = 1706

In the next section we combine the data with the structural models described in 5.1 to examine the impact of unions on wages.

5.3 Estimation and Results

Here we present the results of estimation. Our primary concern is to gain a better understanding of the union wage effect. Of related interest is a comparison of the results obtained from the simple and general panel data models (in particular, the within and H-T estimators). We proceed by considering, in turn, the models described by (5.1.1), (5.1.3), and (5.1.6).

As mentioned in section one of this Chapter, the first attempts to measure the union wage differential were simple cross-section studies where ordinary least squares was applied to an equation like (5.1.1). We replicate this procedure on each of the four years of our sample. These cross-section results are given in Table 2. In general, our estimates are very similar to those obtained in other cross-section studies. Two exceptions are the coefficients of tenure and tenure-squared.
In no year are these coefficients estimated with much precision. This is due, at least in part, to the fact that our tenure variable is defined as "tenure on the job" rather than the preferred "tenure with the employer." The coefficient on union status ($\delta$) ranges from .157 to .191, which implies a union wage differential of 17 to 21 percent.6 Like our parameter estimates in general, this range of estimated union wage differentials is in agreement with earlier cross-section results. Perhaps also not surprising is the decline and rise in the union wage effect over the four year period. This may be explained by the incompleteness of union cost-of-living adjustments (COLA's) during the inflationary period of the late 1970's, and the countercyclical nature of the union wage differential (demonstrated by the effects of the 1980 recession).

TABLE 2
Cross-Section Estimates
Dependent Variable: Log Wage

Explanatory
Variables    1978           1979           1980           1981
ED           .0472          .0481          .0505          .0547
             (.0039)        (.0038)        (.0037)        (.0037)
X            .0238          .0235          .0214          .0167
             (.0028)        (.0029)        (.0029)        (.0032)
X2           -.00044        -.00047        -.00038        -.00032
             (.00007)       (.00007)       (.00007)       (.00007)
TEN          [illegible]    [illegible]    [illegible]    [illegible]
             (.00360)       (.00336)       (.00300)       (.00288)
TEN2         -.00004        .0000006       .00014         -.00012
             (.00013)       (.0001253)     (.00010)       (.00009)
SMSA         .1173          .1164          .1170          .1193
             (.0187)        (.0183)        (.0177)        (.0179)
REG          -.0481         -.0547         -.0520         -.0509
             (.0185)        (.0183)        (.0177)        (.0181)
MARR         .0700          .0827          .1102          .1325
             (.0263)        (.0264)        (.0264)        (.0272)
SEX          .2585          .2299          .2035          .1865
             (.0334)        (.0331)        (.0327)        (.0331)
RACE         .1071          .1389          .1204          .1305
             (.0202)        (.0201)        (.0196)        (.0199)
CB           .1758          .1573          .1915          .1809
             (.0188)        (.0182)        (.0179)        (.0182)
R̄²           .550           .551           .556           .535

N = 1706. Standard errors in parentheses.

Since we strongly suspect the selectivity of union workers causes an upward bias in the cross-section estimates of the union wage effect, we turn to the panel data model of (5.1.3), where each individual has a unique intercept ($\alpha_i$, the intercept effect).
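The step from a log-wage coefficient to a percentage differential can be made explicit. The sketch below assumes the usual conversion $100\,(e^{\delta} - 1)$ for a dummy variable in a log-linear equation; that formula is an assumption here, although it does reproduce the 17 to 21 percent range quoted in the text. The coefficients are the yearly cross-section estimates on CB, and the function name is illustrative.

```python
import math


def percent_effect(log_coef):
    """Percentage wage differential implied by a dummy coefficient in a log-wage equation."""
    return 100.0 * (math.exp(log_coef) - 1.0)


# Cross-section coefficients on union status (CB), by year.
cb_coefficients = {1978: 0.1758, 1979: 0.1573, 1980: 0.1915, 1981: 0.1809}

effects = {year: round(percent_effect(b), 2) for year, b in cb_coefficients.items()}
print(effects)  # {1978: 19.22, 1979: 17.03, 1980: 21.11, 1981: 19.83}
```

Because $e^{\delta} - 1 > \delta$ for $\delta > 0$, the percentage differential is always somewhat larger than the raw coefficient multiplied by 100.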
For the sake of comparison, we first estimate this model by OLS. These results are presented in the first column of Table 3. Notice they are very much like those obtained by the simple cross-section regressions. This is because OLS ignores the longitudinal nature of the data. In other words, OLS assumes no correlation between the explanatory variables and the individual effects. So, the $\delta$ calculated by OLS is upwardly biased for the same reason as the union status coefficients given in Table 2.

TABLE 3
Panel Estimates: Intercept Varying
Dependent Variable: Log Wage

Explanatory
Variables    OLS            GLS            Within         HT
ED           .0514          .0728                         .1573
             (.0020)        (.0152)                       (.0677)
X            .0233          .0960          .1164          .1155
             (.00151)       (.0017)        (.0027)        (.0027)
X2           [illegible]    [illegible]    [illegible]    [illegible]
             (.00004)       (.0001)        (.00007)       (.00007)
TEN          .00648         .00360         .00096         .00180
             (.00156)       (.00003)       (.00003)       (.00003)
TEN2         -.0000003      -.0000011      -.0000003      -.0000005
             (.0000004)     (.0000002)     (.0000002)     (.0000002)
SMSA         [missing]      [missing]      [missing]      [missing]
             (.0094)        (.0156)        (.0157)        (.0157)
REG          -.0519         -.0427         -.0416         -.0308
             (.0094)        (.0285)        (.0298)        (.0291)
MARR         [illegible]    [illegible]    [illegible]    [illegible]
             (.0138)        (.0132)        (.0131)        (.0131)
SEX          .2298          .3393                         .2912
             (.0171)        (.1065)                       (.1154)
RACE         [missing]      [missing]                     [missing]
             (.0104)        (.0917)                       (.1761)
CB           .1803          .0975          .0945          .0941
             (.0095)        (.0096)        (.0096)        (.0096)
R̄²           .536           .397           .405           .403

N = 1706. Standard errors in parentheses.

The claim of this bias in the cross-section and OLS/panel estimates of $\delta$ is supported by the fixed effects results given in the third column of Table 3. Performing OLS on a within transformation of the data (i.e., deviations from individual means) yields an estimate of the union status coefficient of .094, a 65 percent decrease from the cross-section estimate.7 With the exceptions of experience and experience-squared, the impacts of the other explanatory variables are reduced by allowing for the individual effects. The within estimates are unbiased and consistent even if the effects are correlated with the regressors.
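The contrast between pooled OLS and within estimation can be illustrated on simulated data. The sketch below is not a replication of Table 3: the data-generating process, the true differential `delta = 0.10`, and all variable names are assumptions chosen only to show the mechanism, namely that an unobserved "ability" effect correlated with union status biases OLS upward while deviations from individual means remove it.

```python
import numpy as np

rng = np.random.default_rng(42)
N, T, delta = 2000, 4, 0.10

ability = rng.normal(size=N)  # unobserved individual effect alpha_i

# Union status is more likely for high-ability workers (selectivity).
union = (0.8 * ability[:, None] + rng.normal(size=(N, T)) > 0.5).astype(float)
log_wage = delta * union + ability[:, None] + 0.2 * rng.normal(size=(N, T))


def slope(x, y):
    """Least squares slope of y on x (with an intercept)."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x * y).sum() / (x * x).sum())


# Pooled OLS ignores the individual effects.
pooled_ols = slope(union.ravel(), log_wage.ravel())

# Within: deviations from individual means sweep out alpha_i.
union_dev = union - union.mean(axis=1, keepdims=True)
wage_dev = log_wage - log_wage.mean(axis=1, keepdims=True)
within = slope(union_dev.ravel(), wage_dev.ravel())

print(f"pooled OLS: {pooled_ols:.3f}, within: {within:.3f}, true: {delta}")
```

In this design only individuals whose union status changes over the four periods contribute to the within estimate, which is the feature behind the measurement error concern raised by Freeman (1984) and Chowdhury and Nickell (1985).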
However, they are not fully efficient and the within transformation removes all time-invariant variables (ED, SEX, RACE) from the model. As we stated at the outset of Chapter Three, this is a potentially serious problem if one is interested in, for example, the return to schooling.

An alternative specification of (5.1.3) is the error components model, which was reviewed in section two of Chapter Three. In this case, the $\alpha_i$ are taken to be iid random variables uncorrelated with the regressors. If this assumption is true, generalized least squares based on consistent estimates of the variance components will produce consistent and asymptotically efficient estimates of all the parameters, including the coefficients of the time-invariant variables; and GLS is simply computed. It is equivalent to OLS on a $(1-\theta)$ differencing of the data, where $\theta = [\sigma^2/(\sigma^2 + T\sigma_\alpha^2)]^{1/2}$ (see Note 4 of Chapter Four). Consistent estimates of $\sigma^2$ and $\sigma_\alpha^2$ can be derived from the within residuals (see H-T, p. 1384). In our case,

(5.3.1)  $\hat\sigma^2 = .025$,  $\hat\sigma_\alpha^2 = 3.418$,  $\hat\theta = .043$ .

The results of GLS estimation are listed in the second column of Table 3.

Now we have argued that there exists an omitted variables problem in the cross-section. Since GLS also assumes no correlation between the effects and the explanatory variables, the error-components specification is vulnerable to the same criticism leveled at the cross-section estimates. The GLS estimate of $\delta$ (.098) is higher than the within estimate (.094). However, it is surprisingly low given the argument of omitted variables bias. In any case, if the effects are correlated with union status, the GLS estimate is biased and inconsistent. The coefficient on education, which we also expect to be correlated with the effects, is closer to the OLS estimate.
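The value of $\theta$ in (5.3.1) follows directly from the reported variance components with $T = 4$. The short Python sketch below recomputes it and shows the $(1-\theta)$ differencing for a balanced panel stored as an array of shape (N, T); the function names and the array layout are assumptions made for illustration.

```python
import math

import numpy as np


def gls_theta(sigma2, sigma2_alpha, T):
    """theta = [sigma^2 / (sigma^2 + T * sigma_alpha^2)]^(1/2)."""
    return math.sqrt(sigma2 / (sigma2 + T * sigma2_alpha))


def quasi_difference(v, theta):
    """Subtract (1 - theta) times the individual mean from each row of v."""
    return v - (1.0 - theta) * v.mean(axis=1, keepdims=True)


theta_hat = gls_theta(0.025, 3.418, 4)
print(round(theta_hat, 3))  # 0.043

# The two limiting cases of the transformation on a small example panel.
v = np.arange(12, dtype=float).reshape(3, 4)
as_ols = quasi_difference(v, 1.0)      # theta = 1: data unchanged (OLS)
as_within = quasi_difference(v, 0.0)   # theta = 0: deviations from means (within)
```

A value of $\theta$ as small as .043 means the transformed data are very close to deviations from individual means, which is consistent with the GLS and within estimates of the union coefficient in Table 3 being so similar.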
In sum, unless we are prepared to reject the story of selectivity of union workers, the fixed effects specification should be preferred over the error components model for measuring the union wage differential.

However, we need not rely solely on the fixed effects version of (5.1.3). Instead, we may let the $\alpha_i$ be random variables correlated with the regressors. Then, following the H-T efficient procedure described in Chapter Four, we are able to include time-invariant explanatory variables and obtain consistent and asymptotically efficient estimates, thereby meeting the objections to the OLS, within, and GLS estimators.

Implementation of this procedure requires that we be able to distinguish those regressors that are correlated with the effects from those that are not. Since the effects presumably control for ability, obvious choices as endogenous variables are union status and education. In addition, applications of this technique by H-T and Chowdhury and Nickell lead us to include experience and tenure (and their quadratic terms) as endogenous variables.8 Recalling the H-T order condition for identification of the model, we know that we need at least as many time-varying explanatory variables (SMSA, REG, MARR) uncorrelated with the effects as we have time-invariant explanatory variables correlated with the effects (ED). Clearly this condition is fulfilled here.9 In fact, the model is over-identified. Hence, the H-T estimator should yield an efficiency improvement over the within estimator for this model.10

Beyond determining which explanatory variables are endogenous, the only major difficulty is computational. However, the computational difficulty can be reduced. Following H-T (Appendix B), the fitted values of the endogenous variables (both time-varying and time-invariant) can be calculated from their reduced forms in a manner that reduces the size of the estimation problem from sample size NT regressions to sample size N regressions.
The predicted values of the time-invariant endogenous variables are obtained from a regression of these variables on the time-invariant exogenous variables and the individual means of the time-varying exogenous variables. The calculation of the predicted values of the time-varying endogenous variables is almost as simple. They are derived from a regression of the individual means of the time-varying endogenous variables on the time-invariant exogenous variables and the individual means of the time-varying exogenous variables. To these predicted individual means we must add the true deviations from means, since the correct prediction of the time-varying endogenous variables is calculated with the set of instruments that includes the projection used to transform the data into deviations from means. Then, the fitted values are combined with the variables that are uncorrelated with the effects to obtain consistent and asymptotically efficient estimates of the model parameters.

The results from the H-T efficient estimation of (5.1.3) are presented in the last column of Table 3. First, note that the estimated coefficient on union status is essentially the same as that calculated by within. This should not be surprising, since correlation between the effects and union status is assumed by this specification. Education is also taken as endogenous, and the effect of this is a marked increase in its coefficient, to .157, over both the OLS (.051) and GLS (.073) estimates. This rise in the returns to schooling is not in accordance with a story of positive correlation between education and ability leading to an upwardly biased OLS (or GLS) estimate. But H-T point out that when the amount of education is endogenous, there may be a negative correlation between ability and the amount of education chosen.11 Their application of this procedure to an investigation of the returns to schooling reveals a similar rise over OLS and GLS estimates.
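The two reduced-form steps described above can be sketched in a few lines. The Python function below is an illustrative sketch under stated assumptions, not the code used for Table 3: the array layout (balanced panel, time-varying variables stored with shape (N, T, k)), the inclusion of a constant, the function name `ht_fitted_values`, and the simulated inputs are all hypothetical, and the sketch omits the $\theta$ weighting applied in the efficient procedure.

```python
import numpy as np


def ht_fitted_values(X1, X2, Z1, Z2):
    """Fitted values of the endogenous variables from N-row reduced forms.

    X1, X2: (N, T, k1) and (N, T, k2) time-varying exogenous / endogenous.
    Z1, Z2: (N, j1) and (N, j2) time-invariant exogenous / endogenous.
    """
    N = X1.shape[0]
    X1_bar = X1.mean(axis=1)  # individual means, shape (N, k1)
    X2_bar = X2.mean(axis=1)  # individual means, shape (N, k2)

    # Reduced-form regressors: constant, Z1, and the means of X1.
    R = np.column_stack([np.ones(N), Z1, X1_bar])

    def fit(Y):
        coef, *_ = np.linalg.lstsq(R, Y, rcond=None)
        return R @ coef

    # Time-invariant endogenous variables: fitted values from R.
    Z2_hat = fit(Z2)
    # Time-varying endogenous variables: fitted means plus true deviations.
    X2_hat = fit(X2_bar)[:, None, :] + (X2 - X2_bar[:, None, :])
    return X2_hat, Z2_hat


# Simulated example with hypothetical dimensions.
rng = np.random.default_rng(3)
N, T = 500, 4
X1 = rng.normal(size=(N, T, 3))
Z1 = rng.normal(size=(N, 2))
X2 = rng.normal(size=(N, T, 2)) + X1[:, :, :2]
Z2 = Z1[:, :1] + X1.mean(axis=1)[:, :1] + rng.normal(size=(N, 1))

X2_hat, Z2_hat = ht_fitted_values(X1, X2, Z1, Z2)
```

Both regressions are run on N observations rather than NT, which is the computational saving noted in the text; the within deviations of the time-varying endogenous variables pass through unchanged.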
The coefficients of the other time-varying explanatory variables are reasonably close to the within estimates. The sex and race parameters, which cannot be estimated by within, are both considerably different from either the OLS or GLS estimates. In particular, notice the effect of race has essentially vanished.

Now, the H-T procedure produces parameter estimates which support the omitted-variables bias argument, and are consistent and fully efficient. However, within the framework of (5.1.3) we are still vulnerable to the criticism of Stewart. Thus, we next consider the unrestricted version of (5.1.6), where the union wage differential is allowed to vary cross-sectionally. We examine (as we did for the simple model) the fixed-effects and two random-effects specifications of this special case of our general panel data model.

First, we take $\delta_i$ and $\alpha_i$ in (5.1.6) to be fixed over time. The fixed effects version of the general model is estimated by performing OLS on the within (deviations from means) transformation of the simple model (see Chapter Two). These results are presented in the second column of Table 4. Since $T = 4$, the individual $\delta_i$'s cannot be estimated consistently. However, their mean, $\frac{1}{N}\sum_i \hat\delta_i$, equals .093, which is only slightly less than the usual fixed effects estimate. In general, the coefficients of all the explanatory variables are not too different from those calculated by within on the simple model, even though the effects are now composed of both $\delta_i$ and $\alpha_i$. Finally, notice the time-invariant variables are (again) eliminated by the transformation.

We have consistently argued that any estimation method which assumes no correlation between the regressors and the effects is inappropriate for measuring the union wage effect. However, for the sake of comparison, let $\delta_i$ and $\alpha_i$ be iid random variables uncorrelated with the explanatory variables.
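The fixed effects treatment of the general model can be sketched by sweeping out, individual by individual, both the intercept and the regressor whose slope varies. The Python sketch below uses simulated data and is only illustrative: it assumes a continuous regressor `q` with an individual-specific slope (in the application that role is played by union status, which is binary and does not vary for every individual), and the parameter values and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
N, T, beta = 3000, 4, 0.5

q = rng.normal(size=(N, T))                              # regressor with slope delta_i
alpha = rng.normal(size=N)                               # intercept effect
delta_i = 0.1 + 0.3 * alpha + 0.2 * rng.normal(size=N)   # slope effect, mean 0.1
x = 0.5 * alpha[:, None] + 0.5 * q + rng.normal(size=(N, T))
y = beta * x + delta_i[:, None] * q + alpha[:, None] + 0.1 * rng.normal(size=(N, T))


def sweep(v_i, Q_i):
    """Residual of v_i after projecting on the columns of Q_i (M_Q applied to v_i)."""
    coef, *_ = np.linalg.lstsq(Q_i, v_i, rcond=None)
    return v_i - Q_i @ coef


x_t = np.empty_like(x)
y_t = np.empty_like(y)
for i in range(N):
    Q_i = np.column_stack([np.ones(T), q[i]])
    x_t[i] = sweep(x[i], Q_i)
    y_t[i] = sweep(y[i], Q_i)

# OLS on the transformed data gives the common slope.
beta_hat = float((x_t * y_t).sum() / (x_t * x_t).sum())

# Individual slopes are noisy with T = 4, but their average is informative.
delta_hat = np.empty(N)
for i in range(N):
    Q_i = np.column_stack([np.ones(T), q[i]])
    coef, *_ = np.linalg.lstsq(Q_i, y[i] - beta_hat * x[i], rcond=None)
    delta_hat[i] = coef[1]

print(round(beta_hat, 3), round(float(delta_hat.mean()), 3))
```

With only four periods each individual slope is estimated from two residual degrees of freedom, so the individual estimates are imprecise; it is their mean across the sample that is reported in the text.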
With this specification we estimate by GLS, but we first need consistent estimates of the variance of $\varepsilon_{it}$ and the covariance matrix of $(\alpha_i,\delta_i)$. Using the estimates defined in Appendix B to Chapter Four, we obtain

(5.3.2)  $\hat\sigma^2 = .032$,  $\hat\Delta = \begin{pmatrix} .774 & .003 \\ .003 & .043 \end{pmatrix}$ .

Given (5.3.2) we can calculate a consistent estimator of $\Omega^{-1/2}$ (again, see Chapter Four, Appendix B). Then, GLS is obtained by performing an $\Omega^{-1/2}$ transformation on the data and running OLS.

The resulting estimates, which are only consistent if the uncorrelatedness assumption is true, are given in the first column of Table 4. As in the simple model, the GLS estimate of the union wage effect (.095) remains higher than, though close to, the fixed effects estimate. The estimates of the other parameters, with the exception of the coefficients of the experience and tenure variables, are very different from those calculated by GLS on the simple model. One reason for this must be the inclusion of the $\delta_i$ as part of the random effects, which are assumed to be uncorrelated with the regressors. These results should not be too upsetting, however, since they are based on an unrealistic assumption and are therefore biased and inconsistent.

Finally, we address the drawbacks of within and GLS. Now we take the $\alpha_i$ and $\delta_i$'s to be random variables correlated with the explanatory variables. To this version of (5.1.6) we apply our extension of the H-T analysis. The variables taken to be endogenous in the simple model are also assumed to be correlated with the effects here. The only exception is union status, which is now exogenous.12 Like the simple model, our model is over-identified, and therefore the H-T procedure should yield consistent and asymptotically efficient estimates of the general model. These estimates are presented in the last column of Table 4.

After distinguishing those variables that are uncorrelated with the effects, as in the simple model, the main difficulty is computational.
In an analogous fashion to the procedure outlined for the simple model, we calculate the fitted values of the endogenous variables from their reduced forms. For the general model, however, efficient estimation requires that the reduced forms be transformed by $\Omega^{-1/2}$ (constructed from $\hat\sigma^2$ and $\hat\Delta$ given in (5.3.2)). (General reduced form expressions for those variables correlated with the effects are defined in Chapter Four, Appendix B.)

Direct application of OLS to the ($\Omega^{-1/2}$ transformed) reduced forms is cumbersome, since the set of instruments includes not only the exogenous variables, but also the projection used to transform the fixed-effects model. This projection can be "parsed out" in both reduced forms before performing OLS. However, to obtain the correct predicted values of the time-varying endogenous variables, the "within" transformed exogenous variables must be added back to the fitted values calculated from the least-squares regression with the "within" projection "parsed out". (This is formally described in Appendix B to Chapter Four.) The predicted values of the endogenous variables are then combined with the ($\Omega^{-1/2}$ transformed) exogenous variables in an OLS regression to obtain consistent and asymptotically efficient estimates of the parameters in the general model.

TABLE 4
Panel Estimates: Slopes and Intercept Varying
Dependent Variable: Log Wage

Explanatory
Variables    GLS            Within         HT
ED           .3364                         .3743
             (.0045)                       (.0227)
X            .1236          .1141          .1677
             (.0031)        (.0026)        (.0123)
X2           -.00142        -.00061        -.00095
             (.00008)       (.00007)       (.00033)
TEN          [illegible]    [illegible]    [illegible]
             (.00120)       (.00096)       (.00480)
TEN2         -.00004        -.00003        -.00001
             (.00004)       (.00003)       (.00016)
SMSA         [missing]      [missing]      [missing]
             (.0175)        (.0153)        (.0328)
REG          .1272          -.0220         [illegible]
             (.0297)        (.0290)        (.0556)
MARR         -.0212         -.0091         -.0208
             (.0151)        (.0126)        (.0237)
SEX          1.1048                        .3498
             (.0573)                       (.1658)
RACE         -.0610                        -.4510
             (.0522)                       (.1135)
CB           .0955          .0930          .0854
             (.0074)                       (.0128)
R̄²           .911           .403           .791

N = 1706. Standard errors in parentheses.
The union status coefficient is estimated to be .085, implying a union wage differential of 8.9 percent. This is very close to the mean of the $\delta_i$'s derived from the within regression. As in the simple model, the H-T estimate of the return to schooling is much higher than that calculated by the original OLS regression. This lends further support to the story offered by H-T of a negative correlation between ability and education when the amount of schooling is made endogenous. The coefficients on the other time-varying explanatory variables are closer to the within estimates of the general model than those obtained by GLS. However, one particularly peculiar result is the estimate of the race parameter. In the simple model the race effect essentially disappears. In their return to schooling exercise, H-T report a similarly reduced effect of race on earnings from their efficient procedure; but a negative race effect of such magnitude is completely unexpected. Why the inclusion of $\delta_i$ as part of the individual effect would lead to this result is unclear.

5.4 Conclusions

Table 5 contains a tabular summary of the union wage differentials obtained from the different models and estimation techniques we have considered. Because of the selectivity of union workers it is likely that ability is correlated with union status. If union status is observed without error, then we conclude that there is no real justification for measuring the union wage differential from the simple cross-section, or from a panel estimation method which does not allow for correlation between the regressors and the individual effects.

The simplest means of estimating the union wage effect is within estimation of the usual fixed effects model. These estimates are unbiased and consistent. Similar, and likewise consistent, results may be obtained by estimating a fixed effects version of our general model (e.g., Stewart's specification).
In this way we can let the union wage differential vary cross-sectionally (or allow for sectoral dependence of the effects). Finally, if we are also concerned about the coefficients of the time-invariant explanatory variables, we can apply H-T to the simple (intercept varying) model and obtain consistent and asymptotically efficient estimates of all the parameters in the model (provided the parameters are identified). In the general model, the H-T analysis provides somewhat peculiar results for the coefficients of the time-invariant variables. This may be due to misspecification, since union status is not endogenous in the general model. In sum, all of the estimation methods which allow for correlation between the individual effects and the explanatory variables yield union wage differentials that are much smaller than those obtained in simple cross-section or OLS/panel estimation.

TABLE 5
Percent Union Wage Effects

Cross-Section       Panel                            OLS      Within   GLS      HT
1978: 19.22         Intercept Varying:               19.76    9.91     10.24    9.87
1979: 17.03         Slopes and Intercept Varying:             9.75     10.02    8.92
1980: 21.11
1981: 19.83

CHAPTER SIX

SUMMARY AND CONCLUSIONS

A regression function which does not control for omitted or unobservable variables that are correlated with the explanatory variables will not identify the parameters of the model. Conventional estimation of such a regression function will produce biased and inconsistent results. However, the availability of panel data allows us to control for these omitted or unobserved characteristics through the inclusion of individual specific parameters or effects.

The focus of this study is on the estimation of panel data models in which there is cross-sectional variation in some of the slopes as well as the intercept. A well established literature exists on the estimation of the simpler case in which only the intercept varies cross-sectionally. The results for the simple model are a function of the assumptions about the individual effects.
We identify three different cases: (1) fixed effects, (2) random effects uncorrelated with the regressors, and (3) random effects correlated with the regressors. For each set of assumptions we review the appropriate method of estimation for the simple model and then extend these results to our more general model. In both the review and the extension, our primary interest is in estimation techniques that behave well when we have a small number of time observations on a large cross-section, a typical case with longitudinal data.

First, we consider the weakest set of assumptions: fixed effects. In this case, the individual effects, while differing across people, are assumed to be constant for each person. Traditionally, the simple model is estimated by OLS on the within transformed data. The within estimator is consistent for fixed T. We develop the analogous transformation for the general model and show that OLS on our transformed model is also consistent (given a reasonable assumption about the variability of the regressors) in the case of fixed T. In addition, we prove that under normality, the within estimator of the general model is also the conditional MLE.

Two drawbacks of the fixed effects specification are noted. The first, and perhaps less serious, is that (for both models) the within estimator is not fully efficient when T is small. Secondly, time-invariant explanatory variables are orthogonal to, and therefore eliminated by, the within transformation.

One solution to these two problems is to adopt a random-effects specification where one assumes the individual effects are iid random variables uncorrelated with the regressors. In the model with only a variable intercept, estimation is by GLS. The GLS estimator is consistent and efficient relative to within when T is small. We derive the GLS estimator for our model and show that the results from the simple model carry over to the general model.
However, the consistency of GLS in both models hinges on the assumption that the effects be uncorrelated with the explanatory variables. This assumption is not justified in most empirical applications.

Finally, we consider the case of random effects which are correlated with the regressors. Under this set of assumptions, Hausman and Taylor (1981) derive an instrumental variables procedure for the simple model that allows the inclusion of time-invariant variables, and yields consistent and asymptotically efficient estimates. Their IV estimator is unique in that it uses the included exogenous time-varying variables as instruments for the endogenous time-invariant variables. Specifically, their procedure requires that we have at least as many time-varying explanatory variables uncorrelated with the effects as we have time-invariant explanatory variables correlated with the effects. This is essentially an order condition for the identification of the parameters in the model.

We apply the H-T analysis to the case in which slopes and intercepts are allowed to vary. An analogous order condition for the identification of our model is obtained, and a consistent and asymptotically efficient IV estimator is derived. Then, following H-T, we detail conditions under which the efficient IV estimator of our model differs from within.

After our theoretical examination of panel data models in which slopes and intercepts are allowed to vary, as an empirical exercise we consider the estimation of unions' impact on earnings. The issue of the union wage effect offers an appropriate empirical question for the application of a special case of our model.

The first attempts to measure the union wage effect were conducted within the framework of a simple cross-sectional earnings equation containing a union status dummy. These studies ignored the selectivity of union workers.
Given high union wages, firms tend to select more able workers, producing a positive correlation between union status and the disturbance. Put differently, there is an omitted variables problem since "ability" is ignored in the cross-section regression. The result is a measure of the union wage differential which is upwardly biased.

With the availability of panel data, the typical response to the biased cross-section results has been through the standard fixed effects panel data model. Here individual specific intercepts (effects) are included to control for ability. This can be viewed as a form of differencing, which is the essence of the within transformation. Estimation by within yields an unbiased and consistent estimate of the union wage differential.

However, the usual fixed effects model is not without its critics. Stewart (1983), noting that the processes which determine wages are different in the union and nonunion sectors, constructs a model that allows for the individual effects to be sector dependent. This is equivalent to letting the union wage differential vary across the individuals in the sample. Thus Stewart's model is just a special case of the fixed effects version of our general panel-data model. In his model, the union wage differential is constrained to be a linear function of the individual effects. Estimation of his model is by nonlinear least squares. We consider an unrestricted version of Stewart's model which is estimated by within. In either case, the individual union wage effects cannot be estimated consistently as long as T is fixed. However, we can calculate the average union wage differential for the sample.

Using data from the years 1978-1981 of the PSID we estimate: (1) the simple cross-sectional earnings equation, (2) the usual fixed effects model, and (3) the unconstrained Stewart model.
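To make the within estimation of a model with individual-specific union effects concrete, the following is a minimal sketch on simulated data. It is not the PSID estimation reported in this study. The data-generating process, the variable names, and the assumption that every individual is observed in both sectors (so that each individual's union coefficient can be computed) are hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, beta = 2000, 4, 0.5                       # hypothetical sample size and slope

alpha = rng.normal(size=N)                      # individual intercepts ("ability")
delta = 0.10 + 0.05 * alpha                     # individual union effects, tied to alpha
# Assumed design: everyone spends two of the four periods in a union job.
u = np.array([rng.permutation([0.0, 0.0, 1.0, 1.0]) for _ in range(N)])
x = 0.5 * alpha[:, None] + rng.normal(size=(N, T))   # regressor correlated with alpha
y = (beta * x + alpha[:, None] + delta[:, None] * u
     + 0.3 * rng.normal(size=(N, T)))

# Step 1: generalized within transformation. For each individual, project the
# data off that individual's own intercept and union dummy, then pool.
sxx = 0.0
sxy = 0.0
for i in range(N):
    W = np.column_stack([np.ones(T), u[i]])     # regressors with individual coefficients
    M = np.eye(T) - W @ np.linalg.solve(W.T @ W, W.T)
    sxx += x[i] @ M @ x[i]
    sxy += x[i] @ M @ y[i]
beta_hat = sxy / sxx                            # common slope

# Step 2: recover each individual's intercept and union effect from the
# residual y_i - x_i * beta_hat, then average the union effects.
delta_hat = np.empty(N)
for i in range(N):
    W = np.column_stack([np.ones(T), u[i]])
    coef = np.linalg.lstsq(W, y[i] - beta_hat * x[i], rcond=None)[0]
    delta_hat[i] = coef[1]
avg_union_effect = delta_hat.mean()
```

Each `delta_hat[i]` rests on only T observations and is therefore noisy, in line with the remark that the individual effects cannot be estimated consistently for fixed T. The sample average `avg_union_effect` pools that noise across the N individuals, so in this design it should lie close to `delta.mean()`.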
Since within estimation of (2) and (3) eliminates time-invariant variables (e.g., education), and because the resulting estimates are not fully efficient in the case of small T (here, T=4), we also estimate the random-effects (correlated and uncorrelated with regressors) specifications of each.

From (1) we obtain estimates of the union wage differential in the range of 17 to 21 percent for the years considered. These results are in agreement with other cross-section studies. Estimation of the standard fixed-effects model yields a much smaller union wage differential of 9.9 percent. Averaging the individual union wage effects obtained from within estimation of (3) produces a measure of the union wage differential of about 9.7 percent. Similar results are obtained when we take the effects to be random and correlated with the regressors. In general, we find that in every case where we allow correlation between the regressors and the individual effects, the story of upward bias in the cross-section is confirmed with substantially reduced estimates of the impact of unions on wages.

We conclude from this empirical exercise that unless there is interest in the coefficients of the time-invariant explanatory variables, the fixed effects specification of either the simple or general model provides a satisfactory framework for measuring the union wage effect. And, since within estimates of each are always consistent, they can serve as a basis of comparison for results from more restrictive models.

Our final remarks concern what remains to be done. Aside from other applications of the theoretical results presented here, there are further extensions of our model to be considered. One is to allow some of the variables associated with the cross-sectionally varying coefficients to be correlated with the individual effects.
Another extension might explore the limits of conditional likelihood analysis in a system of simultaneous equations with panel data, where some of the slope coefficients vary across individuals. Such considerations are topics for future research.

FOOTNOTES

CHAPTER TWO NOTES

1. To see this, notice $M_D$ may also be expressed as

$$M_D = I_{NT} - \left[ I_N \otimes \tfrac{1}{T} e_T e_T' \right].$$

So, premultiplication of the data by $M_D$ yields

$$M_D Y = \left( Y_{11} - \bar{Y}_1, \; Y_{12} - \bar{Y}_1, \; \ldots, \; Y_{1T} - \bar{Y}_1, \; Y_{21} - \bar{Y}_2, \; Y_{22} - \bar{Y}_2, \; \ldots, \; Y_{2T} - \bar{Y}_2, \; \ldots, \; Y_{N1} - \bar{Y}_N, \; Y_{N2} - \bar{Y}_N, \; \ldots, \; Y_{NT} - \bar{Y}_N \right)', \qquad \bar{Y}_i = \frac{1}{T} \sum_{t=1}^{T} Y_{it},$$

and $M_D X$ similarly.

CHAPTER THREE NOTES

1. Recall the within estimator,

$$\hat{\beta}_W = (X' M_D X)^{-1} X' M_D Y,$$

where $M_D = I - D(D'D)^{-1}D'$. Let $P_D = D(D'D)^{-1}D'$. The "between" estimator is defined as

$$\hat{\beta}_B = (X' P_D X)^{-1} X' P_D Y.$$

The projection $P_D$ transforms the data into individual means. So, $\hat{\beta}_B$ is derived from a regression of $(\bar{Y}_1, \bar{Y}_2, \ldots, \bar{Y}_N)'$ on $(\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_N)'$ and therefore uses variation across individuals. Now, $\hat{\beta}_{GLS}$ may be expressed as a matrix weighted average of $\hat{\beta}_W$ and $\hat{\beta}_B$:

$$\hat{\beta}_{GLS} = (X' M_D X + \theta^2 X' P_D X)^{-1} (X' M_D + \theta^2 X' P_D) Y,$$

where

$$\theta^2 = \frac{\sigma^2}{\sigma^2 + T \sigma_\alpha^2}.$$

For fixed T, the use of between variation by GLS results in an efficiency gain over within. But $\theta^2 \to 0$ as $T \to \infty$, implying $\hat{\beta}_{GLS} = \hat{\beta}_W$ for large T.

2. Except where the individual effects are correlated with all of the columns of X. In this case GLS = within (see Mundlak (1978)).

CHAPTER FOUR NOTES

1. Specification tests are outlined on pp. 1382-3 of H-T.

2. Any vector orthogonal to a time-invariant vector can be used as an instrument. Since the time-invariant $\alpha_i$ are the only components of the disturbance which are correlated with an explanatory variable, $M_D$ may be included as an instrument. As H-T note, the time-invariance of the $\alpha_i$ provides N(T-1) linearly dependent instruments for (4.2.3). (See H-T.)

3. These identifying restrictions can be tested. See H-T, pp. 1388-9 for details.

4. The matrix $\Omega^{-1/2}$ transforms $\Omega$ into a scalar matrix; i.e., $\Omega^{-1/2} \Omega \Omega^{-1/2} = \sigma^2 I_{NT}$. When used to transform (4.2.3), $\Omega^{-1/2}$ yields a simple $(1-\theta)$ differencing of the data,

$$Y_{it} - (1-\theta)\bar{Y}_i = [X_{it} - (1-\theta)\bar{X}_i]\beta + \theta Z_i \gamma + \theta \alpha_i + [\varepsilon_{it} - (1-\theta)\bar{\varepsilon}_i].$$

OLS applied to this transformation is GLS (see note 2). This is computationally convenient.

5. This procedure combines the computational convenience of the $\Omega^{-1/2}$ transformation with the simplifications provided by $P_A$. First, $P_A$ applied to the exogenous variables yields the variables themselves. Secondly, the projection of the endogenous variables onto the column space of A can be derived by using only individual means (see H-T Appendix B).

6. The proofs of these results are given in Appendix A of H-T.

7. Formally we assume $M_Q Z = 0$ (i.e., $M_i Z_i = 0$, $\forall i$).

8. Derivation of $\Omega^{-1/2}$ is a straightforward application of Wansbeek and Kapteyn (1982).

9. Or, equivalently if $k_1 + j_1 + L > J + L$.

10. Since $(\Omega^{-1/2} X_1, \Omega^{-1/2} Z_1, \Omega^{-1/2} W)$ is more highly correlated than $(X_1, Z_1, W)$ with $\Omega^{-1/2}(X, Z, W)$.

11. Since $\Omega^{-1/2} Z = \theta Z$ and $\Omega^{-1/2}(Y - X\hat{\beta}_W) = \theta (Y - X\hat{\beta}_W)$,

$$\hat{\gamma}^* = [Z'B(B'\Omega^{-1}B)^{-1}B'Z]^{-1} Z'B(B'\Omega^{-1}B)^{-1}B'(Y - X\hat{\beta}_W).$$

This is generally different from $\hat{\gamma}_W$ in (4.2.8); i.e., (4.2.8) uses $B'B$ while (4.3.10) uses

$$B'\Omega^{-1}B = \begin{bmatrix} X_1'(M_D + \theta^2 P_D)X_1 & \theta^2 X_1' Z_1 \\ \theta^2 Z_1' X_1 & \theta^2 Z_1' Z_1 \end{bmatrix}.$$

Incidentally, $B'B \neq B'\Omega^{-1}B$ is another reason (one H-T do not mention) for $\hat{\gamma}_W \neq \hat{\gamma}^*$ in the over-identified case.

12. Since $B'Z$ is nonsingular when $k_1 = j_2$.

13. Again, the identifying restrictions may be tested. See note 6.

14. Compare with H-T Appendix A.

CHAPTER FIVE NOTES

1. An extensive literature exists on the estimation of union wage differentials. Two excellent surveys, discussing methodological issues as well as empirical findings, are Freeman and Medoff (1981) and Lewis (1983).

2. A random effects/GLS approach to this problem would not make sense for obvious reasons. However, at least one study, Chowdhury and Nickell (1985), has applied the H-T analysis to the question of unions' effect on wages.

3. See Lewis (1983) for a critique of these studies.
4. See Freeman (1980).

5. The restricted version of Stewart's model can be estimated consistently by nonlinear least squares (i.e., searching over the values of 1).

6. Percentage differentials are calculated by $(\exp(\delta) - 1) \cdot 100$.

7. Percentage changes are calculated by differences in natural logarithms.

8. The occupation dummies are also treated as endogenous.

9. The rank condition is also fulfilled.

10. Recall the H-T efficient estimator is equivalent to within in the exactly identified case.

11. See also Griliches (1977) and Griliches, Hall, and Hausman (1978).

12. Union status is now a part of W in our general model (see (4.3.4)). While this may be intuitively unsatisfying, allowing parts of W to be correlated with the effects makes estimation of the model overly complicated.

REFERENCES

Chamberlain, G. (1980), "Analysis of Covariance with Qualitative Data," Review of Economic Studies, 47, 225-238.

Chowdhury, G. and Nickell, S. (1985), "Hourly Earnings in the United States: Another Look at Unionization, Schooling, Sickness, and Unemployment Using PSID Data," Journal of Labor Economics, 3, 38-69.

Freeman, R. (1984), "Longitudinal Analysis of the Effects of Trade Unions," Journal of Labor Economics, 2, 1-26.

Griliches, Z., Hall, B., and Hausman, J. (1978), "Missing Data and Self-Selection in Large Panels," Annales de l'INSEE, 30-31, 137-176.

Griliches, Z. (1977), "Estimating the Returns to Schooling: Some Econometric Problems," Econometrica, 45, 1-22.

Hausman, J.A. and Taylor, W. (1981), "Panel Data and Unobservable Individual Effects," Econometrica, 49, 1377-1399.

Judge, G.G. (1985), The Theory and Practice of Econometrics, New York: Wiley.

Lewis, H.G. (1983), Union Relative Wage Effects: A Survey, Chapters 5 and 7, presented at the NBER Conference on the Economics of Trade Unions.

Mundlak, Y. (1978), "Models with Variable Coefficients: Integration and Extension," Annales de l'INSEE, 30-31, 483-504.

Mundlak, Y. (1978), "On the Pooling of Time-Series and Cross-Section Data," Econometrica, 46, 69-86.

Rao, C.R. (1973), Linear Statistical Inference and Its Applications, New York: Wiley.

Stewart, M.B. (1983), "The Estimation of Union-Wage Differentials from Panel Data: The Problem of Not-So-Fixed Effects," Mimeo, Princeton University Industrial Relations Section.

Wansbeek, T. and Kapteyn, A. (1982), "A Class of Decompositions of the Variance-Covariance Matrix of a General Error Components Model," Econometrica, 50, 713-724.