Michigan State University

This is to certify that the dissertation entitled

Panel Data Models with Multiplicative Individual and Time Effects: Applications to Compensation and Frontier Production Functions

presented by

Young Hoon Lee

has been accepted towards fulfillment of the requirements for the Ph.D. degree in Economics.
PANEL DATA MODELS WITH MULTIPLICATIVE INDIVIDUAL AND TIME EFFECTS: APPLICATIONS TO COMPENSATION AND FRONTIER PRODUCTION FUNCTIONS

BY

YOUNG HOON LEE

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Economics

1991

ABSTRACT

PANEL DATA MODELS WITH MULTIPLICATIVE INDIVIDUAL AND TIME EFFECTS: APPLICATIONS TO COMPENSATION AND FRONTIER PRODUCTION FUNCTIONS

BY

YOUNG HOON LEE

The increasing availability of panel data (pooling cross section and time series data) enables econometricians to extract information both from variation between individuals and from variation between time periods. Most of the panel data literature assumes that slopes are common for all cross sections, but that intercepts vary over individuals. The role of individual-variant intercepts is to control for unobservable individual specific effects. The unobservables which are represented by the individual effect should have influences on the dependent variable that are constant over time but varying over individuals. For example, according to the conventional panel data model, unmeasurable ability or ambition should have the same effect on wage over time.

The primary focus of this study is on the construction of a regression model that allows time-varying effects of individual specific components on the dependent variable. We discuss fixed effects and random effects and derive the estimators that are analogous to the within and GLS estimators of the standard panel data model. We derive the asymptotic properties of the generalized within and GLS estimators. Furthermore, we construct test statistics for the hypothesis that the individual effect has a constant coefficient over time.

We apply the model in two different settings. The first application deals with the compensation of a sample of economics faculty members from six U.S. universities.
There are two separate time periods, and the effect of unobserved ability on compensation is found to be different in the two periods. Second, we apply our model to the frontier production function (efficiency measurement) problem. Previously, frontier models estimated from panel data could estimate the technical inefficiency of each firm only by assuming it to be time-invariant or by allowing it to vary over time in a specific restrictive way (such as a quadratic function of time). The application of our general panel data model to frontier production functions allows technical inefficiency to change over time in a relatively unrestricted way. Our results for a sample of Indonesian rice farms show that technical efficiency levels vary significantly over farms, and indicate interesting time trends in efficiency levels.

ACKNOWLEDGMENTS

Without the support of so many people, this dissertation would never have been completed. I am especially indebted to the chairman of my thesis committee, Dr. Peter Schmidt, for his patience and generosity in giving time and discussing every aspect of the study. His delightful teaching and careful advice motivated me to study econometrics and led me to finish this dissertation. I am also indebted to the other members of the committee, Professors Daniel Hamermesh and Ching-Fan Chung, for their time spent reading this research and their suggestions for improving it.

Perhaps the greatest debt is that which I owe to my wife, Jeong Won. I would like to thank her for her love, encouragement, and understanding. In addition, I would like to dedicate this thesis to my parents for their constant support throughout my graduate studies.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER
I. INTRODUCTION
II. FIXED EFFECTS
    2.1. The Simple Model
    2.2. The General Model
    2.3. The G-Component Model
    2.4. Summary
III. RANDOM EFFECTS
    3.1. The Simple Model
    3.2. The General Model
        3.2.1. Ordinary Least Squares Estimation
        3.2.2. Generalized Least Squares Estimation
    3.3. The G-Component Model
    3.4. Summary
IV. TEST STATISTICS
V. ESTIMATION OF COMPENSATION
    5.1. Data
    5.2. Estimation
VI. FRONTIER PRODUCTION FUNCTIONS
    6.1. Review
    6.2. Presentation of The Model
    6.3. Data
    6.4. Estimation
    6.5. Summary
VII. CONCLUSIONS

FOOTNOTES
APPENDICES
REFERENCES

LIST OF TABLES

5.1. Means and Standard Deviations
5.2. The Simple Estimation
5.3. The General Estimation
5.4. Test-Statistics for $H_0$ (hypothesis illegible in source)
5.5. Estimation with a Time-Dummy
5.6. Test-Statistics (Estimation includes a time-dummy)
5.7. The Within Estimation of $\alpha$
6.1. Estimation of the Simple Panel Data Model
6.2. Estimation of the General Panel Data Model
6.3. Test-Statistics
6.4. Technical Efficiency (from the simple estimation)
6.5. Technical Efficiency (from the general estimation)
6.6. Efficiency Levels (%)

LIST OF FIGURES

6.1. Technical Efficiency of The Median Family
6.2. Technical Efficiency of The Selected Families

CHAPTER ONE

INTRODUCTION

Panel data are data that have both a cross-sectional and a time-series dimension. For example, we might have observations on each of 1000 individuals for each of five years. Letting $N$ denote the cross-sectional sample size and $T$ denote the time-series sample size, we have a total of $NT$ observations; in the example just given, $N=1000$ and $T=5$, so there are $NT=5000$ observations in all.

Panel data are potentially useful for several reasons. At the most basic level, observing each individual repeatedly is a way of increasing the total number of observations. Also, some parameters may be estimated more readily from cross-sectional information and others from time-series information. For example, in budget studies it is often argued that prices display little cross-sectional variation, so that precise estimation of price elasticities requires time-series information, while real incomes display little temporal variation, so that precise estimation of income elasticities requires cross-sectional information. Panel data contain both types of information and therefore may be very useful.

However, in this study we will be concerned specifically with techniques that are useful when $N$ is large and $T$ is small. Such cases are common in labor economics, since many longitudinal data sets contain thousands of individuals but only a few time periods of data per individual. In such cases the usual motivation for the use of panel data is to control for possible biases due to unobservable individual characteristics.
For example, Mundlak (1961) considered a Cobb-Douglas production function for farms, and was concerned about possible biases due to differences across farms in soil quality, an unobserved variable that affects output and may be correlated with the inputs. More recently, many labor economists have estimated wage equations and have been concerned with possible biases due to differences across individuals in unobserved ability.

The existing panel data literature has dealt extensively with the problem of avoiding biases due to unobservables like soil quality or ability, by assuming the unobservables to be time invariant. The standard model that is used is the regression model with individual effects:

$$Y_{it} = X_{it}\beta + \alpha_i + \epsilon_{it}, \qquad i=1,\ldots,N,\quad t=1,\ldots,T. \tag{1.1}$$

Here $Y_{it}$ is the dependent variable; $X_{it}$ is a $1\times K$ vector of explanatory variables; $\beta$ is a $K\times 1$ vector of parameters (regression coefficients); $\alpha_i$ is the unobserved individual effect, which is time invariant (does not depend on $t$); and $\epsilon_{it}$ is the random error. The errors $\epsilon_{it}$ are assumed to be independently and identically distributed (i.i.d.) with $E(\epsilon_{it})=0$ and $\mathrm{Var}(\epsilon_{it})=\sigma^2$.

In this model the unobserved individual characteristics represented by the individual effect $\alpha_i$ are assumed to have the same effect on the dependent variable $Y$ in all time periods. The motivation for this study is that this assumption is unnecessarily strong, and we will relax it. Specifically, we will allow the effect of $\alpha_i$ on $Y$ to vary over time, though we will require that the temporal pattern of the effect of $\alpha_i$ on $Y$ must be the same for all individuals. That is, we will consider the model

$$Y_{it} = X_{it}\beta + \theta_t\alpha_i + \epsilon_{it}, \qquad i=1,\ldots,N,\quad t=1,\ldots,T. \tag{1.2}$$

This model requires a normalization, and we set $\theta_1=1$. Compared to the model (1.1), the new model introduces the $(T-1)$ new parameters $\theta_2, \theta_3, \ldots, \theta_T$ to represent the effect of $\alpha_i$ on $Y_{it}$ for $t=2,3,\ldots,T$ relative to the effect of $\alpha_i$ on $Y_{i1}$.
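As an illustration of what model (1.2) implies for the data, the following is a minimal simulation sketch. It is not taken from the study: the function name `simulate_panel`, the array layout (`Y` of shape `(N, T)`, `X` of shape `(N, T, K)`), the normal distributions, and the way the regressors are made to depend on the individual effect are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_panel(N, T, beta, theta, sigma=1.0, seed=0):
    """Draw one panel from Y_it = X_it beta + theta_t alpha_i + eps_it (model (1.2)).

    Hypothetical helper: distributions and the X-alpha dependence are illustrative.
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # Normalization used in the text: theta_1 = 1.
    if theta.shape != (T,) or theta[0] != 1.0:
        raise ValueError("theta must have length T with theta[0] == 1")
    alpha = rng.normal(size=N)                      # unobserved individual effects
    # Regressors are allowed to depend on alpha (an illustrative choice).
    X = rng.normal(size=(N, T, beta.size)) + 0.5 * alpha[:, None, None]
    eps = sigma * rng.normal(size=(N, T))           # i.i.d. errors
    # theta_t scales the effect of alpha_i differently in each period.
    Y = X @ beta + alpha[:, None] * theta[None, :] + eps
    return Y, X, alpha
```

Setting `theta` to a vector of ones reproduces the standard model (1.1), which makes the function convenient for comparing the two specifications on simulated data.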
As a matter of notation, let $Y_i=(Y_{i1}, Y_{i2}, \ldots, Y_{iT})'$, $\epsilon_i=(\epsilon_{i1}, \epsilon_{i2}, \ldots, \epsilon_{iT})'$ and $X_i=(X_{i1}', X_{i2}', \ldots, X_{iT}')'$, each representing the $T$ observations for person $i$. Then we can write equation (1.2) as

$$Y_i = X_i\beta + \xi\alpha_i + \epsilon_i, \qquad i=1,\ldots,N, \tag{1.3}$$

where

$$\xi = \begin{bmatrix} 1 \\ \theta \end{bmatrix}, \qquad \theta = (\theta_2, \ldots, \theta_T)'. \tag{1.4}$$

The usual model (1.1) thus corresponds to the case that $\theta_2=\theta_3=\cdots=\theta_T=1$, or equivalently that $\theta$ (or $\xi$) is a vector of ones. As we shall see, this is a testable proposition in our model.

The model we consider can also be compared to the two-way analysis of covariance model that includes both individual and time effects. That model can be written as

$$Y_{it} = X_{it}\beta + \alpha_i + \theta_t + \epsilon_{it}, \qquad i=1,\ldots,N,\quad t=1,\ldots,T. \tag{1.5}$$

The number of parameters in (1.5) is exactly the same as in our model (1.3), but the models are different. Our interpretation of (1.5) is that it is suitable in cases in which there are relevant unobservable variables that vary over time but not over individuals; it does not handle the case that our model is designed for, in which the effects of unobservable individual characteristics vary over time.

Compared to the two-way analysis of covariance model (1.5), our model (1.3) is more difficult to estimate, because it is nonlinear. However, unlike the analysis of covariance model, our model allows for inclusion of observables that are time invariant or invariant over individuals, a considerable advantage in some applications.

The plan of this study is as follows. Chapter 2 discusses the fixed effects model in which the parameters $\alpha_i$ are treated as fixed. We derive a generalized within estimator, and we show its consistency and asymptotic distribution. We also discuss the case of several possible interactions between time-invariant and individually invariant parameters, as in the model

$$Y_i = X_i\beta + \sum_{g=1}^{G}\xi_g\alpha_{gi} + \epsilon_i, \qquad i=1,\ldots,N. \tag{1.6}$$

Chapter 3 discusses the random effects model in which the individual effects $\alpha_i$ are treated as random. We derive the appropriate GLS estimator and prove that it is more efficient than the within estimator.
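Chapter 4 considers tests of the hypothesis that $\xi$ is a vector of ones, so that our model reduces to the simple panel data model. We present Lagrange Multiplier (LM), likelihood ratio (LR) and Wald statistics for this hypothesis.

Chapter 5 presents an application of our model to the compensation of academics, previously considered by Hamermesh (1989) using the standard panel data model. Chapter 6 presents an application to the measurement of the technical efficiency of a sample of Indonesian rice farms, previously considered by Erwidodo (1990) using the standard model. Finally, Chapter 7 gives our concluding remarks.

CHAPTER TWO

FIXED EFFECTS

2.1 The Simple Model

We may rewrite (1.1) as

$$Y_i = X_i\beta + e_T\alpha_i + \epsilon_i, \qquad i=1,\ldots,N, \tag{2.1.1}$$

where

$$Y_i = \begin{bmatrix} Y_{i1}\\ Y_{i2}\\ \vdots\\ Y_{iT}\end{bmatrix},\qquad X_i = \begin{bmatrix} X_{i1}\\ X_{i2}\\ \vdots\\ X_{iT}\end{bmatrix},\qquad \epsilon_i = \begin{bmatrix} \epsilon_{i1}\\ \epsilon_{i2}\\ \vdots\\ \epsilon_{iT}\end{bmatrix},$$

and $e_T$ is a $T\times 1$ vector of ones. This is also identical to

$$Y = X\beta + G\alpha + \epsilon, \tag{2.1.2}$$

where

$$Y = \begin{bmatrix} Y_1\\ Y_2\\ \vdots\\ Y_N\end{bmatrix},\quad X = \begin{bmatrix} X_1\\ X_2\\ \vdots\\ X_N\end{bmatrix},\quad \epsilon = \begin{bmatrix} \epsilon_1\\ \epsilon_2\\ \vdots\\ \epsilon_N\end{bmatrix},\quad \alpha = \begin{bmatrix} \alpha_1\\ \alpha_2\\ \vdots\\ \alpha_N\end{bmatrix},\quad G = I_N\otimes e_T = \begin{bmatrix} e_T & 0 & \cdots & 0\\ 0 & e_T & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & e_T\end{bmatrix}.$$

As a matter of notation, define

$$P_G = G(G'G)^{-1}G' \quad\text{and}\quad M_G = I_{NT} - P_G. \tag{2.1.3}$$

Note that $P_GG\alpha = G\alpha$ and $M_GG\alpha = 0$.

The fixed effects model treats $\alpha_i$ as fixed. That is, each $\alpha_i$ is a parameter to be estimated. The within transformation,¹ which eliminates the effects by transforming the data into deviations from individual means, corresponds to multiplication by $M_G$:

$$M_GY = M_GX\beta + M_G\epsilon. \tag{2.1.4}$$

Avoiding matrix algebra, this amounts to

$$Y_{it}-\bar Y_i = (X_{it}-\bar X_i)\beta + (\epsilon_{it}-\bar\epsilon_i), \qquad i=1,\ldots,N,\quad t=1,\ldots,T, \tag{2.1.5}$$

where

$$\bar Y_i = \frac1T\sum_{t=1}^{T}Y_{it},\qquad \bar X_i = \frac1T\sum_{t=1}^{T}X_{it},\qquad \bar\epsilon_i = \frac1T\sum_{t=1}^{T}\epsilon_{it}.$$

The OLS estimator of the transformed model is the within estimator of $\beta$:

$$\hat\beta_W = (X'M_GX)^{-1}X'M_GY = \left[\sum_{i=1}^{N}\sum_{t=1}^{T}(X_{it}-\bar X_i)'(X_{it}-\bar X_i)\right]^{-1}\left[\sum_{i=1}^{N}\sum_{t=1}^{T}(X_{it}-\bar X_i)'(Y_{it}-\bar Y_i)\right]. \tag{2.1.6}$$

The within estimator is the best linear unbiased estimator (BLUE) and is consistent as $N$ goes to infinity with $T$ fixed.

2.2 The General Model

We rewrite (1.3) as

$$Y_i = X_i\beta + \xi\alpha_i + \epsilon_i, \qquad i=1,\ldots,N,$$

where $\epsilon_{it}$ is i.i.d. $N(0,\sigma^2)$.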
This reduces to the simple model of section 2.1 if $\xi=e_T$. We may consider all $NT$ observations as

$$Y = X\beta + (I_N\otimes\xi)\alpha + \epsilon. \tag{2.2.1}$$

If (2.2.1) is the true relationship and $\xi\neq e_T$, the estimates of $\beta$ from the simple model are not unbiased, since

$$E(\hat\beta_W) = E\left\{[X'(I_N\otimes M_{e_T})X]^{-1}X'(I_N\otimes M_{e_T})Y\right\} = \beta + [X'(I_N\otimes M_{e_T})X]^{-1}X'(I_N\otimes M_{e_T})(I_N\otimes\xi)\alpha$$

$$= \beta + \left[\sum_{i=1}^{N}\sum_{t=1}^{T}(X_{it}-\bar X_i)'(X_{it}-\bar X_i)\right]^{-1}\sum_{i=1}^{N}\sum_{t=1}^{T}(X_{it}-\bar X_i)'(\theta_t-\bar\theta)\alpha_i,$$

where $M_{e_T} = I_T - e_Te_T'/T$ and $\bar\theta=(1/T)\sum_t\theta_t$. Thus we expect the simple within estimates to be biased for the coefficients of those variables whose temporal variation is correlated with the temporal variation in the effect of $\alpha$ on $Y$.

The generalization of the within transformation is to premultiply (2.2.1) by the idempotent matrix $(I_N\otimes M_\xi)$, where

$$M_\xi = I_T - P_\xi, \qquad P_\xi = \xi(\xi'\xi)^{-1}\xi'. \tag{2.2.2}$$

That is, the transformed regression model is expressed by

$$(I_N\otimes M_\xi)Y = (I_N\otimes M_\xi)X\beta + (I_N\otimes M_\xi)\epsilon \tag{2.2.3}$$

or

$$M_\xi Y_i = M_\xi X_i\beta + M_\xi\epsilon_i, \qquad i=1,\ldots,N, \tag{2.2.4}$$

since $M_\xi\xi=0$. The individual effects are deleted by taking deviations from individual weighted means ($P_\xi Y_i$ and $P_\xi X_i$) instead of taking differences from individual means as in the simple model. We may not apply OLS to (2.2.3) since $M_\xi Y_i$ and $M_\xi X_i$ are not observables; they include the parameter vector $\xi$. Instead, we construct an objective function which will be minimized with respect to $\beta$ and $\theta$. This objective function is simply the error sum of squares of (2.2.3):

$$CSSE = \sum_{i=1}^{N}(Y_i-X_i\beta)'M_\xi(Y_i-X_i\beta) = (Y-X\beta)'(I_N\otimes M_\xi)(Y-X\beta). \tag{2.2.5}$$

The reason that we denote this objective function CSSE is that it is the same as the (concentrated) error sum of squares of (2.2.1). By taking derivatives of (2.2.5) with respect to $\beta$ and $\theta$, the first order conditions are obtained as

$$\frac{\partial CSSE}{\partial\beta} = -2\sum_{i=1}^{N}X_i'M_\xi(Y_i-X_i\beta) = 0, \tag{2.2.6}$$

$$\frac{\partial CSSE}{\partial\theta} = -\frac{2}{\xi'\xi}\sum_{i=1}^{N}\left[(\tilde Y_i-\tilde X_i\beta)(Y_i-X_i\beta)'\xi - (Y_i-X_i\beta)'P_\xi(Y_i-X_i\beta)\,\theta\right] = 0, \tag{2.2.7}$$

where $\tilde Y_i = (Y_{i2},\ldots,Y_{iT})'$ and $\tilde X_i = (X_{i2}',\ldots,X_{iT}')'$ omit the first time period.

The solutions of the first order conditions are the following:

$$\hat\beta_W = \left(X'(I_N\otimes M_{\hat\xi_W})X\right)^{-1}X'(I_N\otimes M_{\hat\xi_W})Y = \left(\sum_{i=1}^{N}X_i'M_{\hat\xi_W}X_i\right)^{-1}\sum_{i=1}^{N}X_i'M_{\hat\xi_W}Y_i, \tag{2.2.8}$$

$$\hat\xi_W = (1,\hat\theta_W')' \text{ is an eigenvector of } \sum_{i=1}^{N}(Y_i-X_i\hat\beta_W)(Y_i-X_i\hat\beta_W)'. \tag{2.2.9}$$

The derivation of (2.2.9) is given in Appendix 2.1.

NOTE 1
For a matrix $A$, suppose that $\lambda$ is an eigenvalue and $x$ is the corresponding eigenvector. Then $Ax=\lambda x$ and $x'Ax/x'x=\lambda$.

LEMMA 1
$\hat\xi_W$ is the eigenvector corresponding to the largest eigenvalue.

Proof:

$$CSSE = \sum_{i=1}^{N}(Y_i-X_i\hat\beta_W)'M_{\hat\xi_W}(Y_i-X_i\hat\beta_W) = \sum_{i=1}^{N}(Y_i-X_i\hat\beta_W)'(Y_i-X_i\hat\beta_W) - \frac{1}{\hat\xi_W'\hat\xi_W}\,\hat\xi_W'\left[\sum_{i=1}^{N}(Y_i-X_i\hat\beta_W)(Y_i-X_i\hat\beta_W)'\right]\hat\xi_W. \tag{2.2.10}$$

By NOTE 1, (2.2.10) can be rewritten as

$$CSSE = \sum_{i=1}^{N}(Y_i-X_i\hat\beta_W)'(Y_i-X_i\hat\beta_W) - \hat\lambda_W, \tag{2.2.11}$$

where $\hat\lambda_W$ is the estimated eigenvalue. We pick the largest eigenvalue since we wish to minimize CSSE. Q.E.D.

The solutions for $\hat\beta_W$ and $\hat\theta_W$ are not closed forms of the data, since the solution for $\hat\beta_W$ depends on $\hat\xi_W$ and vice versa. However, these can be calculated by iteration starting with any initial value of $\beta$. The estimate $\hat\beta_W$ from the simple model is a good candidate for the initial value.

For the proof of consistency and asymptotic normality of $\hat\beta_W$ and $\hat\theta_W$, we need two theorems provided by Amemiya (1985).²

THEOREM 1
Make the assumptions:
(A) The parameter space $\Theta$ is a compact subset of the Euclidean $K$-space ($R^K$), and the true value $\theta_0$ is in $\Theta$.
(B) $Q_N(y,\theta)$ is continuous in $\theta\in\Theta$ for all $y$ and is a measurable function of $y$ for all $\theta\in\Theta$.
(C) $N^{-1}Q_N(\theta)$ converges to a nonstochastic function $Q(\theta)$ in probability uniformly in $\theta\in\Theta$ as $N$ goes to $\infty$, and $Q(\theta)$ attains a unique global maximum at $\theta_0$.
Define $\hat\theta_N$ as a value that satisfies

$$Q_N(\hat\theta_N) = \max_{\theta\in\Theta}Q_N(\theta).$$

Then $\hat\theta_N$ converges to $\theta_0$ in probability.

THEOREM 2
Assume:
(A1) $\lim (1/N)\sum_{i=1}^{N}X_i'X_i$ exists and is finite and nonsingular.
(A2) $\lim (1/N)\sum_{i=1}^{N}\alpha_i^2$ exists and is finite and nonzero.
Then $\hat\beta_W$ and $\hat\xi_W$, which satisfy

$$CSSE(\hat\beta_W,\hat\xi_W) = \min_{\beta,\xi}\sum_{i=1}^{N}(Y_i-X_i\beta)'M_\xi(Y_i-X_i\beta),$$

are consistent.

Proof: The proof that the assumptions (A) and (B) in THEOREM 1 hold in this model is omitted since it is trivial. With (A1) and (A2), it is shown in Appendix 2.2
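that

$$\operatorname{plim}\frac1N CSSE(\beta,\xi) = (\beta_0-\beta)'Q_{XMX}(\beta_0-\beta) + Q_\alpha\,\xi_0'M_\xi\xi_0 + 2(\beta_0-\beta)'Q_{XM\alpha}\,\xi_0 + (T-1)\sigma^2, \tag{2.2.12}$$

where

$$Q_{XMX} = \lim_{N\to\infty}\frac1N\sum_{i=1}^{N}X_i'M_\xi X_i,\qquad Q_\alpha = \lim_{N\to\infty}\frac1N\sum_{i=1}^{N}\alpha_i^2,\qquad Q_{XM\alpha} = \lim_{N\to\infty}\frac1N\sum_{i=1}^{N}X_i'M_\xi\alpha_i.$$

Define a compact parameter space $\Theta$ by $\beta'\beta\le c_1$ and $\xi'\xi\le c_2$, where $c_1$ and $c_2$ are large positive constants, and assume $(\beta_0',\xi_0')'$ is an interior point of $\Theta$. Then $N^{-1}CSSE(\beta,\xi)$ converges to (2.2.12) uniformly in probability and $\operatorname{plim}(1/N)CSSE(\beta,\xi)$ attains a unique global minimum at $(\beta_0,\xi_0)$: $\operatorname{plim}(1/N)CSSE(\beta_0,\xi_0)=(T-1)\sigma^2$. Thus assumption (C) holds. Using THEOREM 1, $\hat\beta_W$ and $\hat\xi_W$, which minimize the objective function, converge to the true $\beta_0$ and $\xi_0$ in probability as $N$ goes to infinity. Q.E.D.

Using the following theorem by Amemiya (1985), we may derive the asymptotic normality.

THEOREM 3
Make the following assumptions in addition to the assumptions of THEOREM 1.
(AA) $\partial^2Q_N/\partial\theta\,\partial\theta'$ exists and is continuous in an open, convex neighborhood of $\theta_0$.
(BB) $N^{-1}(\partial^2Q_N/\partial\theta\,\partial\theta')_{\theta_N^*}$ converges to a finite nonsingular matrix $A(\theta_0)=\lim E\,N^{-1}(\partial^2Q_N/\partial\theta\,\partial\theta')_{\theta_0}$ in probability for any sequence $\theta_N^*$ such that $\operatorname{plim}\theta_N^*=\theta_0$.
(CC) $N^{-1/2}(\partial Q_N/\partial\theta)_{\theta_0}\to N(0,B(\theta_0))$, where $B(\theta_0)=\lim E\,N^{-1}(\partial Q_N/\partial\theta)_{\theta_0}(\partial Q_N/\partial\theta')_{\theta_0}$.
Let $\{\hat\theta_N\}$ be a sequence obtained by choosing one element from $\Theta_N$ defined in THEOREM 1 such that $\operatorname{plim}\hat\theta_N=\theta_0$. (We call $\hat\theta_N$ a consistent root.) Then

$$\sqrt N(\hat\theta_N-\theta_0)\to N\left[0,\,A(\theta_0)^{-1}B(\theta_0)A(\theta_0)^{-1}\right].$$

Applying THEOREM 3, and writing $\tilde X_i=(X_{i2}',\ldots,X_{iT}')'$ for $X_i$ with its first row deleted, we have

$$A(\lambda_0) = \lim E\,\frac1N\left(\frac{\partial^2CSSE}{\partial\lambda\,\partial\lambda'}\right)_{\lambda_0} = 2\lim\frac1N\begin{bmatrix}\sum_i X_i'M_{\xi_0}X_i & \sum_i\alpha_i\left(\tilde X_i'-\dfrac{X_i'\xi_0\theta_0'}{\xi_0'\xi_0}\right)\\[2ex] \sum_i\alpha_i\left(\tilde X_i'-\dfrac{X_i'\xi_0\theta_0'}{\xi_0'\xi_0}\right)' & \sum_i\alpha_i^2\left(I_{T-1}-\dfrac{\theta_0\theta_0'}{\xi_0'\xi_0}\right)\end{bmatrix},$$

a finite nonsingular matrix, and

$$B(\lambda_0) = \lim E\,\frac1N\left(\frac{\partial CSSE}{\partial\lambda}\right)_{\lambda_0}\left(\frac{\partial CSSE}{\partial\lambda'}\right)_{\lambda_0} = 4\sigma^2\lim\frac1N\begin{bmatrix}\sum_i X_i'M_{\xi_0}X_i & \sum_i\alpha_i\left(\tilde X_i'-\dfrac{X_i'\xi_0\theta_0'}{\xi_0'\xi_0}\right)\\[2ex] \sum_i\alpha_i\left(\tilde X_i'-\dfrac{X_i'\xi_0\theta_0'}{\xi_0'\xi_0}\right)' & \left(\sum_i\alpha_i^2+\dfrac{N\sigma^2}{\xi_0'\xi_0}\right)\left(I_{T-1}-\dfrac{\theta_0\theta_0'}{\xi_0'\xi_0}\right)\end{bmatrix},$$

where

$$\lambda = \begin{bmatrix}\beta\\ \theta\end{bmatrix}.$$

(See Appendix 2.3 for the derivation of $A(\lambda_0)$ and $B(\lambda_0)$.) Therefore,

$$\sqrt N\begin{bmatrix}\hat\beta_W-\beta_0\\ \hat\theta_W-\theta_0\end{bmatrix}\to N\left[0,\,A(\lambda_0)^{-1}B(\lambda_0)A(\lambda_0)^{-1}\right].$$

An advantage of the general model over the simple model (1.1) is the ability to include time-invariant explanatory variables.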
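Returning briefly to the computation of the estimator itself, the iteration between (2.2.8) and (2.2.9) described earlier in this section can be sketched as follows. This is an illustrative sketch rather than the procedure actually coded for the study: the function name `generalized_within`, the array layout (`Y` of shape `(N, T)`, `X` of shape `(N, T, K)`), the stopping rule and the tolerance are assumptions. It also assumes the first element of the leading eigenvector is not close to zero, so that the normalization $\xi_1=1$ is numerically harmless.

```python
import numpy as np

def generalized_within(Y, X, tol=1e-10, max_iter=500):
    """Alternate between (2.2.8) and (2.2.9). Returns (beta, xi) with xi[0] == 1."""
    N, T, K = X.shape
    # Starting value: the simple within estimator (deviations from individual means).
    Xd = X - X.mean(axis=1, keepdims=True)
    Yd = Y - Y.mean(axis=1, keepdims=True)
    beta = np.linalg.lstsq(Xd.reshape(N * T, K), Yd.reshape(N * T), rcond=None)[0]
    xi = np.ones(T)
    for _ in range(max_iter):
        # (2.2.9): xi is the eigenvector of sum_i r_i r_i' with the largest eigenvalue.
        R = Y - X @ beta                        # (N, T) residuals r_i'
        _, eigvecs = np.linalg.eigh(R.T @ R)    # eigh returns eigenvalues in ascending order
        xi = eigvecs[:, -1]
        xi = xi / xi[0]                         # normalization: first element equals one
        # (2.2.8): least squares after projecting off xi.
        M = np.eye(T) - np.outer(xi, xi) / (xi @ xi)
        XM = np.einsum('ts,isk->itk', M, X)     # M_xi X_i for every i
        A = np.einsum('itk,itl->kl', XM, XM)    # sum_i X_i' M_xi X_i (M_xi is idempotent)
        b = np.einsum('itk,it->k', XM, Y)       # sum_i X_i' M_xi Y_i
        beta_new = np.linalg.solve(A, b)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, xi
```

Each pass cannot increase CSSE, because the eigenvector step minimizes over $\xi$ for the current $\beta$ and the least squares step minimizes over $\beta$ for the current $\xi$; the sketch makes no claim about the number of passes needed in any particular data set.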
To see this advantage, consider first the simple model with time-invariant regressors $Z_i$ added:

$$Y_{it} = X_{it}\beta + Z_i\gamma + \alpha_i + \epsilon_{it} \tag{2.2.13}$$

or

$$Y = X\beta + (Z\otimes e_T)\gamma + G\alpha + \epsilon, \tag{2.2.14}$$

where $Z = (Z_1',\ldots,Z_N')'$. Premultiplying by $M_G$, the transformed regression model is

$$M_GY = \begin{bmatrix}M_GX & M_G(Z\otimes e_T)\end{bmatrix}\begin{bmatrix}\beta\\ \gamma\end{bmatrix} + M_G\epsilon = \begin{bmatrix}M_GX & 0\end{bmatrix}\begin{bmatrix}\beta\\ \gamma\end{bmatrix} + M_G\epsilon = M_GX\beta + M_G\epsilon, \tag{2.2.15}$$

since $M_G(Z\otimes e_T)=0$.³ This is the reason why we cannot incorporate time-invariant explanatory variables into a fixed effects model.

This problem does not arise in our general model. The equation for the general model corresponding to (2.2.13) is

$$Y_{it} = X_{it}\beta + Z_i\gamma + \theta_t\alpha_i + \epsilon_{it} \tag{2.2.16}$$

or

$$Y = X\beta + (Z\otimes e_T)\gamma + (I_N\otimes\xi)\alpha + \epsilon = \begin{bmatrix}X & (Z\otimes e_T)\end{bmatrix}\begin{bmatrix}\beta\\ \gamma\end{bmatrix} + (I_N\otimes\xi)\alpha + \epsilon. \tag{2.2.17}$$

The within transformation leads (2.2.17) to

$$(I_N\otimes M_\xi)Y = \begin{bmatrix}(I_N\otimes M_\xi)X & (Z\otimes M_\xi e_T)\end{bmatrix}\begin{bmatrix}\beta\\ \gamma\end{bmatrix} + (I_N\otimes M_\xi)\epsilon. \tag{2.2.18}$$

But $(I_N\otimes M_\xi)(Z\otimes e_T)=Z\otimes(M_\xi e_T)$ is generally not equal to zero unless $\xi=e_T$. Therefore, the inclusion of time-invariant regressors⁴ is allowed, and their coefficients can be estimated consistently. This is an advantage of the general model since time-invariant explanatory variables are often important in applications. For instance, in a wage equation, years of schooling, race, union status or sex could be important determinants of the wage. Notice that the overall intercept is also identified in the general model while it is not in the simple model.

In the simple model with fixed effects, assuming the normality of the $\epsilon_{it}$, the conditional maximum likelihood estimator (CMLE) is equal to the within estimator. Furthermore, the MLE is the same as the CMLE (or within estimator). Thus the incidental parameters problem is not relevant in the simple model.

The above results do not hold in the general model. The same type of derivation as in the simple model for the CMLE cannot be applied in the general model. The individual mean $\bar Y_i$ is a sufficient statistic for $\alpha_i$ in the simple model, and the likelihood conditional on $\bar Y_i$ does not depend upon the incidental parameter $\alpha_i$.
However, $P_\xi Y_i$ in the general model corresponds to $\bar Y_i$ in the simple model, and it is not a sufficient statistic since it is not a function of only the data: the parameter $\xi$ is included in $P_\xi Y_i$.

Because the incidental parameters problem is relevant in the general model, the asymptotic theory developed in this section does not agree with (naive) normal likelihood theory. According to normal likelihood theory, the covariance matrix of $(\beta,\theta)$ derived from the likelihood function $L(\beta,\theta,\alpha)$, which is a submatrix of the inverse of the information matrix, should equal the covariance matrix of $(\beta,\theta)$ derived from the concentrated likelihood function. Furthermore, this covariance matrix should be (asymptotically) the covariance matrix of the estimates $(\hat\beta,\hat\theta)$. However, in the present case none of these statements is true. (See Appendix 2.4.)

In summary, the conventional way to derive the CMLE does not work, and the asymptotic theory of the generalized within estimator is different from that indicated by likelihood theory.

2.3 The G-component Model

It is impossible to have many different individual effects in the simple model since they would not be identified. However, we may include a number of individual specific components in the general regression model: specifically, we may assume

$$Y_i = X_i\beta + \xi_1\alpha_{1i} + \cdots + \xi_G\alpha_{Gi} + \epsilon_i, \qquad i=1,\ldots,N, \tag{2.3.1}$$

where

$$\xi_g = \begin{bmatrix}1\\ \theta_g\end{bmatrix},\qquad \theta_g = (\theta_{g2},\ldots,\theta_{gT})'.$$

For identification we make the orthogonality assumption $\xi_g'\xi_f=0$, $g\neq f$. The within transformed version of model (2.3.1) is

$$M_\Xi Y_i = M_\Xi X_i\beta + M_\Xi\epsilon_i, \qquad i=1,\ldots,N, \tag{2.3.2}$$

where

$$P_\Xi = P_{\xi_1}+P_{\xi_2}+\cdots+P_{\xi_G},\qquad M_\Xi = I_T-P_\Xi.$$

(Note that the projection onto $[\xi_1,\xi_2,\ldots,\xi_G]$ equals $P_\Xi$ because $\xi_g'\xi_f=0$, $g\neq f$.) The objective function is constructed in the same way as in section 2.2:

$$\min\ CSSE = (Y-X\beta)'(I_N\otimes M_\Xi)(Y-X\beta), \tag{2.3.3}$$

where $\Xi=[\xi_1,\ldots,\xi_G]$. We can obtain solutions for $\beta$ in terms of $\theta$ and $\theta$ in terms of $\beta$ from the first order conditions.
$$\hat\beta_W = \left(X'(I_N\otimes M_{\hat\Xi})X\right)^{-1}X'(I_N\otimes M_{\hat\Xi})Y, \tag{2.3.4}$$

$$\hat\xi_{gW} = (1,\hat\theta_{gW}')' \text{ is an eigenvector of } \sum_{i=1}^{N}(Y_i-X_i\hat\beta_W)(Y_i-X_i\hat\beta_W)'. \tag{2.3.5}$$

LEMMA 2
$\hat\xi_{gW}$ is the eigenvector corresponding to the $g$th largest eigenvalue of $\sum_i(Y_i-X_i\hat\beta_W)(Y_i-X_i\hat\beta_W)'$.

Proof:

$$CSSE = \sum_{i=1}^{N}(Y_i-X_i\hat\beta_W)'(I_T-P_{\xi_1}-\cdots-P_{\xi_G})(Y_i-X_i\hat\beta_W) = \sum_{i=1}^{N}(Y_i-X_i\hat\beta_W)'(Y_i-X_i\hat\beta_W) - \hat\lambda_1 - \cdots - \hat\lambda_G. \tag{2.3.6}$$

We have to choose the largest $G$ eigenvalues to minimize CSSE. Q.E.D.

The same asymptotic theory as in Section 2.2 is applied to show that the estimators are consistent and to derive their asymptotic covariance matrix. (See Appendix 2.5 for this covariance matrix.)

2.4 Summary

We have discussed a generalization of the conventional fixed effects model that allows different time-effects of individual specific components on the dependent variable. We derived a consistent estimator of the regression coefficients ($\beta$) and of the coefficients of the individual effects ($\xi$) using a generalization of the conventional within transformation. We noted that the coefficients of time-invariant explanatory variables, which cannot be estimated in the simple model, may be estimated consistently and asymptotically efficiently. The inclusion of several individual specific components in the regression model was also introduced, and the results are similar to those with one individual specific component.

Unlike the simple model, the asymptotic theory of this model does not agree with normal likelihood theory. The sufficient statistic for the individual effects depends on other parameters, and so the CMLE cannot be obtained by the usual method (see Chamberlain (1980)) of conditioning on a sufficient statistic. The MLE is consistent, but this must be proved directly, and the usual formula for its asymptotic covariance matrix (the inverse of the information matrix) does not apply.

CHAPTER THREE

RANDOM EFFECTS

3.1 The Simple Model

An alternative approach in panel data models is to assume that the individual components are random.
That is to say, random effects models consider the individual effects to be independently and identically distributed and to be independent of the disturbance and the explanatory variables. Hsiao (1985)⁵ mentions the difference between fixed effects models and random effects models. The fixed effects model is regarded as providing inference conditional on the effects in the sample, whereas the random effects model is regarded as providing unconditional inference with respect to the population of effects.

The within estimator does not consider variation between individuals. The GLS estimator used in the random effects model considers both variation between individuals and variation over time within each individual. Therefore, the GLS estimator can be expressed as a combination of the within and the between estimators. The GLS estimator is more efficient than the within estimator because of the utilization of the variation between individuals. However, we need a distributional assumption about the effects, which reduces $N$ parameters to a single parameter (the variance of the effects), and we also need to assume that the effects are uncorrelated with the regressors.

The regression equation (1.1) can be written as

$$Y_{it} = X_{it}\beta + v_{it}, \qquad i=1,\ldots,N,\quad t=1,\ldots,T, \tag{3.1.1}$$

where

$$v_{it} = \alpha_i + \epsilon_{it}.$$

We assume that the effects $\alpha_i$ are i.i.d., with $E(\alpha_i)=0$ and $\mathrm{Var}(\alpha_i)=\sigma_\alpha^2$, and that $\alpha_i$ is independent of $X$ and $\epsilon$. Combining all $NT$ observations, we have

$$Y = X\beta + v, \tag{3.1.2}$$

where

$$v = G\alpha + \epsilon$$

and where $G=I_N\otimes e_T$ as in section 2.1. The knowledge of the covariance matrix of $v$ is necessary to derive the GLS estimator of $\beta$:

$$\mathrm{Cov}(v) = \Omega = E(G\alpha+\epsilon)(G\alpha+\epsilon)' = \sigma^2I_{NT} + T\sigma_\alpha^2P_G, \tag{3.1.3}$$

$$\Omega^{-1} = \frac{1}{\sigma^2}\left(I_{NT}-(1-k^2)P_G\right),\qquad k^2 = \frac{\sigma^2}{\sigma^2+T\sigma_\alpha^2}, \tag{3.1.4}$$

and

$$\Omega^{-1/2} = \frac{1}{\sigma}\left(I_{NT}-(1-k)P_G\right). \tag{3.1.5}$$

Treating $k$ as known, the GLS estimate can be calculated by the regression of $\Omega^{-1/2}Y$ on $\Omega^{-1/2}X$. Equivalently, the GLS estimator of $\beta$ is given by

$$\hat\beta_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y. \tag{3.1.6}$$

$\hat\beta_{GLS}$ is consistent and asymptotically efficient.
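The equivalence between the transformed regression based on (3.1.5) and the direct formula (3.1.6) can be checked numerically. The sketch below is only an illustration under stated assumptions: the variances $\sigma^2$ and $\sigma_\alpha^2$ are treated as known, the function names `re_gls_quasi_demeaned` and `re_gls_direct` are hypothetical, and the data layout (`Y` of shape `(N, T)`, `X` of shape `(N, T, K)`) is a choice made here, not one taken from the study.

```python
import numpy as np

def re_gls_quasi_demeaned(Y, X, sigma2, sigma2_alpha):
    """GLS via (3.1.5): regress y_it - (1-k) ybar_i on x_it - (1-k) xbar_i."""
    N, T, K = X.shape
    k = np.sqrt(sigma2 / (sigma2 + T * sigma2_alpha))
    # Multiplying by I - (1-k) P_G subtracts (1-k) times the individual mean.
    # The scalar 1/sigma in (3.1.5) does not affect the least squares coefficients.
    Ys = Y - (1.0 - k) * Y.mean(axis=1, keepdims=True)
    Xs = X - (1.0 - k) * X.mean(axis=1, keepdims=True)
    beta, *_ = np.linalg.lstsq(Xs.reshape(N * T, K), Ys.reshape(N * T), rcond=None)
    return beta

def re_gls_direct(Y, X, sigma2, sigma2_alpha):
    """GLS via (3.1.6), using the T x T block of Omega for one individual."""
    N, T, K = X.shape
    omega_block = sigma2 * np.eye(T) + sigma2_alpha * np.ones((T, T))
    omega_inv = np.linalg.inv(omega_block)
    A = np.einsum('itk,ts,isl->kl', X, omega_inv, X)   # X' Omega^{-1} X
    b = np.einsum('itk,ts,is->k', X, omega_inv, Y)     # X' Omega^{-1} Y
    return np.linalg.solve(A, b)
```

Because $\Omega$ is block diagonal with identical $T\times T$ blocks, the direct version only ever inverts a $T\times T$ matrix, while the quasi-demeaned version avoids matrix inversion of $\Omega$ altogether.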
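The GLS estimator is more efficient than $\hat\beta_W$, but the efficiency difference disappears as $T$ goes to infinity. A consistent estimate of $\Omega$ can be obtained from the estimated variances in the within and the between regressions, and the feasible GLS estimator is asymptotically equivalent to the GLS estimator.

3.2 The General Model

3.2.1 Ordinary Least Squares Estimation

The regression equation (1.2) is considered as

$$Y_{it} = X_{it}\beta + v_{it}, \qquad i=1,\ldots,N,\quad t=1,\ldots,T, \tag{3.2.1}$$

where

$$v_{it} = \theta_t\alpha_i + \epsilon_{it}.$$

We let $E(\alpha_i)=\mu$ and assume that $\alpha_i^*=\alpha_i-\mu$ is i.i.d. with $\mathrm{Var}(\alpha_i^*)=\sigma_\alpha^2$. Estimation of (3.2.1) is identical to estimation of

$$Y_i = X_i\beta + \xi\mu + v_i^*, \tag{3.2.2}$$

where

$$v_i^* = \xi\alpha_i^* + \epsilon_i,$$

and

$$Y = X\beta + (e_N\otimes\xi)\mu + v^*, \tag{3.2.3}$$

where

$$v^* = (I_N\otimes\xi)\alpha^* + \epsilon$$

and $e_N$ is the $N$-dimensional vector of ones.

The OLS estimation procedure ignores the fact that the covariance matrix of the error term is not the identity matrix. Its objective function is

$$SSE = (Y-X\beta-(e_N\otimes\xi)\mu)'(Y-X\beta-(e_N\otimes\xi)\mu). \tag{3.2.4}$$

The derivative of SSE with respect to $\mu$ is

$$\frac{\partial SSE}{\partial\mu} = -2(e_N\otimes\xi)'(Y-X\beta-(e_N\otimes\xi)\mu) = 0, \tag{3.2.5}$$

and this yields the solution for $\mu$ in terms of $\beta$ and $\xi$:

$$\hat\mu_{OLS} = \frac{1}{N\xi'\xi}(e_N\otimes\xi)'(Y-X\beta). \tag{3.2.6}$$

The concentrated objective function is obtained by substituting (3.2.6) into (3.2.4):

$$CSSE = (Y-X\beta)'[I_{NT}-(P_{e_N}\otimes P_\xi)](Y-X\beta), \tag{3.2.7}$$

where

$$P_{e_N} = e_N(e_N'e_N)^{-1}e_N' = e_Ne_N'/N.$$

Minimizing CSSE with respect to $\beta$ and $\theta$, the first order conditions and the OLS estimators of $\beta$ and $\theta$ are obtained as

$$\frac{\partial CSSE}{\partial\beta} = -2X'[I_{NT}-(P_{e_N}\otimes P_\xi)](Y-X\beta) = 0, \tag{3.2.8}$$

$$\frac{\partial CSSE}{\partial\theta} = -\frac{2N}{\xi'\xi}\left(\tilde e\,\bar e'\xi - \frac{1}{\xi'\xi}\,\xi'\bar e\bar e'\xi\;\theta\right) = 0, \tag{3.2.9}$$

and

$$\hat\beta_{OLS} = \left(X'[I_{NT}-(P_{e_N}\otimes P_{\hat\xi})]X\right)^{-1}X'[I_{NT}-(P_{e_N}\otimes P_{\hat\xi})]Y, \tag{3.2.10}$$

$$\hat\xi_{OLS} \text{ is an eigenvector of } \bar e\bar e', \tag{3.2.11}$$

where

$$\bar e = \frac1N\sum_{i=1}^{N}(Y_i-X_i\hat\beta_{OLS}) = (\bar e_1,\ldots,\bar e_T)',\qquad \tilde e = (\bar e_2,\ldots,\bar e_T)'.$$

LEMMA 3
$\hat\xi_{OLS}$ is equal to $\bar e/\bar e_1$.

Proof: After some algebra using the first order condition (3.2.9), we can derive

$$\bar e\bar e'\hat\xi_{OLS} = \frac{\hat\xi_{OLS}'\bar e\bar e'\hat\xi_{OLS}}{\hat\xi_{OLS}'\hat\xi_{OLS}}\,\hat\xi_{OLS}. \tag{3.2.12}$$

Note that $\bar e\bar e'$ is a $T\times T$ matrix whose rank is one. Therefore, $T-1$ eigenvalues are zero and one is positive. It is clear that $\bar e$ is the eigenvector corresponding to the positive eigenvalue $\lambda=\bar e'\bar e$, since $(\bar e\bar e')\bar e=(\bar e'\bar e)\bar e=\lambda\bar e$. Therefore $\hat\xi_{OLS}$ is proportional to $\bar e$.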
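We can check that this satisfies (3.2.12):

$$\bar e\bar e'\bar e - \frac{1}{\bar e'\bar e}\,\bar e'\bar e\bar e'\bar e\;\bar e = (\bar e'\bar e)\bar e - (\bar e'\bar e)\bar e = 0. \tag{3.2.13}$$

The division of $\bar e$ by $\bar e_1$ is required to satisfy the normalization condition that the first element of $\xi$ is one. Therefore, $\hat\xi_{OLS}=\bar e/\bar e_1$. Q.E.D.

Finally, the solutions for $\beta$, $\theta$ and $\mu$ using LEMMA 3 can be written in closed form:

$$\hat\beta_{OLS} = \left[\sum_{i=1}^{N}(X_i-\bar X)'(X_i-\bar X)\right]^{-1}\sum_{i=1}^{N}(X_i-\bar X)'(Y_i-\bar Y), \tag{3.2.14}$$

$$\hat\theta_{OLS} = \frac{1}{\bar e_1}(\bar e_2,\ldots,\bar e_T)', \tag{3.2.15}$$

$$\hat\mu_{OLS} = \bar e_1, \tag{3.2.16}$$

where $\bar X=(1/N)\sum_iX_i$, $\bar Y=(1/N)\sum_iY_i$ and $\bar e=\bar Y-\bar X\hat\beta_{OLS}$. Note that $\hat\beta_{OLS}$ is indeed the usual OLS estimator, and that $\hat\theta$ and $\hat\mu$ are then calculated from the OLS residuals.

3.2.2 Generalized Least Squares Estimation

Unlike the case with OLS estimation, the covariance matrix of the error term is taken into account in GLS estimation. The covariance structure of $v^*$ is as follows:

$$\mathrm{Cov}(v^*) = \Sigma = E\left((I_N\otimes\xi)\alpha^*+\epsilon\right)\left((I_N\otimes\xi)\alpha^*+\epsilon\right)' = \sigma^2I_{NT} + \xi'\xi\,\sigma_\alpha^2(I_N\otimes P_\xi), \tag{3.2.17}$$

$$\Sigma^{-1} = \frac{1}{\sigma^2}\left[I_{NT}-(1-q^2)(I_N\otimes P_\xi)\right], \tag{3.2.18}$$

where

$$q^2 = \frac{\sigma^2}{\sigma^2+\xi'\xi\,\sigma_\alpha^2},$$

and

$$\Sigma^{-1/2} = \frac{1}{\sigma}\left(I_{NT}-(1-q)(I_N\otimes P_\xi)\right). \tag{3.2.19}$$

GLS can be obtained by OLS applied to the transformed regression model

$$\Sigma^{-1/2}Y = \Sigma^{-1/2}X\beta + \Sigma^{-1/2}(e_N\otimes\xi)\mu + \Sigma^{-1/2}v^*, \tag{3.2.20}$$

where

$$\mathrm{Cov}(\Sigma^{-1/2}v^*) = E(\Sigma^{-1/2}v^*v^{*\prime}\Sigma^{-1/2}) = I_{NT}.$$

This transformation is a combination of the within and the between transformations. For example, up to the scalar factor $1/\sigma$, $\Sigma^{-1/2}Y=(I_N\otimes M_\xi)Y+q(I_N\otimes P_\xi)Y$. Since $\Sigma^{-1/2}$ includes the parameter vector $\theta$, we cannot simply apply OLS to (3.2.20). We will derive the GLS estimators of $\beta$ and $\theta$ which minimize the objective function, equal (up to the factor $1/\sigma^2$) to the error sum of squares of the transformed equation (3.2.20). That is, we wish to minimize

$$SSE = [Y-X\beta-(e_N\otimes\xi)\mu]'\left[I_{NT}-(1-q^2)(I_N\otimes P_\xi)\right][Y-X\beta-(e_N\otimes\xi)\mu]. \tag{3.2.21}$$

The derivative of SSE with respect to $\mu$ is

$$\frac{\partial SSE}{\partial\mu} = -2(e_N\otimes\xi)'\left[I_{NT}-(1-q^2)(I_N\otimes P_\xi)\right][Y-X\beta-(e_N\otimes\xi)\mu] = 0, \tag{3.2.22}$$

and this implies

$$\hat\mu_{GLS} = \frac{1}{N\xi'\xi}(e_N\otimes\xi)'(Y-X\beta). \tag{3.2.23}$$

Substituting (3.2.23) into (3.2.21), we obtain the concentrated SSE

$$CSSE = (Y-X\beta)'\left[(I_N\otimes M_\xi)+q^2(M_{e_N}\otimes P_\xi)\right](Y-X\beta), \tag{3.2.24}$$

where

$$M_{e_N} = I_N - e_N(e_N'e_N)^{-1}e_N'.$$

The values $\hat\beta_{GLS}$ and $\hat\xi_{GLS}$ which minimize CSSE are derived by taking derivatives of CSSE with respect to $\beta$ and $\theta$ and setting them to zero.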
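This gives

(3.2.25)  $\hat\beta_{GLS} = (X'[(I_N\otimes M_\xi) + q^2(M_{e_N}\otimes P_\xi)]X)^{-1}X'[(I_N\otimes M_\xi) + q^2(M_{e_N}\otimes P_\xi)]Y$

(3.2.26)  $\hat\xi_{GLS}$ is an eigenvector of $\sum_{i=1}^N\left[\sqrt{1-q^2}\,e_i + (1 - \sqrt{1-q^2})\,\bar e\right]\left[\sqrt{1-q^2}\,e_i + (1 - \sqrt{1-q^2})\,\bar e\right]'$

where $e_i = Y_i - X_i\hat\beta_{GLS}$ and $\bar e = \frac{1}{N}\sum_{i=1}^N (Y_i - X_i\hat\beta_{GLS})$.

The proof that $\hat\xi_{GLS}$ is the eigenvector corresponding to the largest eigenvalue is essentially the same as the proof of LEMMA 1, 2 or 3. Similarly, the asymptotic properties of $\hat\beta_{GLS}$ and $\hat\theta_{GLS}$ are derived using Theorems 1 and 3 as before, and we obtain

(3.2.27)  $\sqrt{N}\begin{pmatrix}\hat\beta_{GLS} - \beta_0\\ \hat\theta_{GLS} - \theta_0\end{pmatrix} \xrightarrow{d} N[0,\ A^{-1}BA^{-1}]$

The matrix A comes from the second derivatives of CSSE while B is derived from the cross-products of the first derivatives. These are (K+T-1)×(K+T-1) matrices given by:

$B = 4\sigma^2\begin{pmatrix} Q_{XX} & \cdots\\ \cdots & (\mu^2 + (1-q^2)\sigma_\alpha^2)\left(I_{T-1} - \dfrac{\theta_0\theta_0'}{\xi_0'\xi_0}\right)\end{pmatrix} = 2\sigma^2 A$

where $Q_{XX} = \lim_{N\to\infty}\frac{1}{N}X'[(I_N\otimes M_\xi) + q^2(M_{e_N}\otimes P_\xi)]X$.

Therefore, the asymptotic covariance matrix of $\hat\beta_{GLS}$ and $\hat\theta_{GLS}$, which is given by $(1/N)A^{-1}BA^{-1}$, simplifies because $B = 2\sigma^2A$ implies $A^{-1}BA^{-1} = 4\sigma^4B^{-1}$.

The efficiency gain of the GLS estimator compared to the within estimator is shown by the difference of the asymptotic covariance matrices. Write $\lambda = (\beta', \theta')'$. If $Cov(\hat\beta_W, \hat\theta_W) - Cov(\hat\beta_{GLS}, \hat\theta_{GLS})$ is positive semidefinite (PSD), $\hat\beta_{GLS}$ and $\hat\theta_{GLS}$ are more efficient than $\hat\beta_W$ and $\hat\theta_W$. Thus we ask

(3.2.28)  Is $Cov(\hat\lambda_W) - Cov(\hat\lambda_{GLS})$ PSD?

This is identical to the question:

(3.2.29)  Is $[Cov(\hat\lambda_{GLS})]^{-1} - [Cov(\hat\lambda_W)]^{-1}$ PSD?

Assuming $E(\alpha_i) = \mu = 0$ for simplicity (see Appendix 3.1), we focus on the submatrices in (3.2.29) that correspond to $\beta$ and $\theta$, respectively. Note that the covariance matrices of the within and the GLS estimators are block-diagonal when $\mu = 0$. This gives

(3.2.30)  $[Var(\hat\beta_{GLS})]^{-1} - [Var(\hat\beta_W)]^{-1} = \frac{1}{\sigma^2}X'[(I_N\otimes M_\xi) + q^2(I_N\otimes P_\xi)]X - \frac{1}{\sigma^2}X'(I_N\otimes M_\xi)X = \frac{q^2}{\sigma^2}X'(I_N\otimes P_\xi)X$, which is PSD.

(3.2.31)  $[Var(\hat\theta_{GLS})]^{-1} - [Var(\hat\theta_W)]^{-1} = \frac{N(1-q^2)\sigma_\alpha^2}{\sigma^2}\left(I_{T-1} - \frac{\theta\theta'}{\xi'\xi}\right) - \frac{N(1-q^2)\sigma_\alpha^2}{\sigma^2}\left(I_{T-1} - \frac{\theta\theta'}{\xi'\xi}\right) = 0$

Thus $\hat\beta_{GLS}$ is more efficient than $\hat\beta_W$, while $\hat\theta_{GLS}$ and $\hat\theta_W$ are equally efficient.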
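One way to see where the matrix in (3.2.26) comes from is to note that CSSE in (3.2.24) can be written as $\sum_i e_i'e_i - \xi'[(1-q^2)\sum_i e_ie_i' + q^2N\bar e\bar e']\xi/(\xi'\xi)$, so minimizing over $\xi$ amounts to maximizing a Rayleigh quotient. The sketch below checks numerically that the matrix in square brackets coincides with the sum of outer products in (3.2.26). The residuals and the value of $q^2$ are made up for illustration; they are assumptions and not values from this study.

```python
import math

def outer(a, b):
    """Outer product of two vectors as a list of rows."""
    return [[x * y for y in b] for x in a]

def add(A, B):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    """Multiply every element of a matrix by the scalar c."""
    return [[c * x for x in row] for row in A]

T = 3
q2 = 0.4                      # hypothetical value of q^2
residuals = [                 # hypothetical e_i, one row per individual
    [0.5, -0.2, 0.1],
    [-0.3, 0.4, 0.2],
    [0.1, 0.1, -0.6],
    [0.7, -0.5, 0.3],
]
N = len(residuals)
e_bar = [sum(e[t] for e in residuals) / N for t in range(T)]
a = math.sqrt(1.0 - q2)

# Left side: sum_i [a e_i + (1 - a) e_bar][a e_i + (1 - a) e_bar]'
lhs = [[0.0] * T for _ in range(T)]
for e in residuals:
    w = [a * e[t] + (1.0 - a) * e_bar[t] for t in range(T)]
    lhs = add(lhs, outer(w, w))

# Right side: (1 - q^2) sum_i e_i e_i' + q^2 N e_bar e_bar'
rhs = scale(q2 * N, outer(e_bar, e_bar))
for e in residuals:
    rhs = add(rhs, scale(1.0 - q2, outer(e, e)))

max_gap = max(abs(lhs[r][c] - rhs[r][c]) for r in range(T) for c in range(T))
print(max_gap)  # numerically zero up to rounding error
```

The two sides agree because $(1-a)(1+a) = 1 - a^2 = q^2$ when $a = \sqrt{1-q^2}$.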
The efficiency gain of GLS over within disappears as T goes to infinity since $\xi'\xi \to \infty$ as $T \to \infty$. Therefore

$q^2 = \frac{\sigma^2}{\sigma^2 + \xi'\xi\,\sigma_\alpha^2} \to 0$

which implies that in (3.2.30)

$\frac{q^2}{\sigma^2}X'(I_N\otimes P_\xi)X \to 0$ as $T \to \infty$.

Because $q^2$ is unknown, we need a feasible GLS estimator using a consistent estimator of $q^2$. We can estimate $q^2$ from the results of the within and the between regressions or, for that matter, from the within and between sums of squares evaluated at any consistent estimates. Specifically

(3.2.32)  $\underset{N\to\infty}{\mathrm{plim}}\ \hat\sigma^2 = \underset{N\to\infty}{\mathrm{plim}}\ \frac{SSE_W}{N(T-1)-K} = \sigma^2$

(3.2.33)  $\underset{N\to\infty}{\mathrm{plim}}\ (\hat\sigma^2 + \hat\xi'\hat\xi\,\hat\sigma_\alpha^2) = \underset{N\to\infty}{\mathrm{plim}}\ \frac{SSE_B}{N-K-1} = \sigma^2 + \xi'\xi\,\sigma_\alpha^2$

(3.2.34)  $\underset{N\to\infty}{\mathrm{plim}}\ \hat q^2 = \underset{N\to\infty}{\mathrm{plim}}\ \frac{SSE_W/(N(T-1)-K)}{SSE_B/(N-K-1)} = q^2$

Since $q^2$ is consistently estimated, the feasible GLS estimator is asymptotically equivalent to the GLS estimator.

We have noted in Chapter 1 that our general model is different from the simple model that includes both individual and time effects. We now note that the regression model (3.2.2) with random effects and $\mu \neq 0$ is identical to the model with zero $\mu$ and including fixed time effects. That is to say,

(3.2.35)  $Y_{it} = c_1 + X_{it}\beta + (\theta_t\alpha_i + \epsilon_{it})$

or

(3.2.36)  $Y_{it} = c_1 + X_{it}\beta + \theta_t\mu + (\theta_t\alpha_i^* + \epsilon_{it})$

is identical to

(3.2.37)  $Y_{it} = c_2 + X_{it}\beta + \delta_t + (\theta_t\alpha_i^* + \epsilon_{it}), \quad c_2 + \delta_t = c_1 + \theta_t\mu$

This general model effectively includes not only time-variant coefficients of individual specific components but also "simple" time effects (or a time trend).

3.3 The G-component Model

As in the fixed effects model, we can include a finite number of individual specific components in the random effects model. The regression equation is then

(3.3.1)  $Y_i = X_i\beta + v_i, \quad i = 1,\dots,N$

where $v_i = \xi_1\alpha_{1i} + \xi_2\alpha_{2i} + \dots + \xi_G\alpha_{Gi} + \epsilon_i$.

The assumptions in this regression model are as follows:

(A.3.1) $E(\alpha_{gi}) = \mu_g$ and $Var(\alpha_{gi}) = \sigma_{\alpha g}^2$. $\alpha_{gi}$ is independent of $\alpha_{fj}$ for all g, f, i, and j except g=f and i=j. It is independent of X and $\epsilon$, and we denote $\alpha_{gi}^* = \alpha_{gi} - \mu_g$.

(A.3.2) The orthogonality conditions hold: $\xi_g'\xi_f = 0$, $g \neq f$.
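Then, (3.3.1) is the same as

(3.3.2)  $Y_i = X_i\beta + \xi_1\mu_1 + \dots + \xi_G\mu_G + v_i^*$

where $v_i^* = \xi_1\alpha_{1i}^* + \dots + \xi_G\alpha_{Gi}^* + \epsilon_i$ and $\alpha_{gi}^* = \alpha_{gi} - \mu_g$, and

(3.3.3)  $Y = X\beta + (e_N\otimes\xi_1)\mu_1 + \dots + (e_N\otimes\xi_G)\mu_G + v^*$

where $v^* = (I_N\otimes\xi_1)\alpha_1^* + \dots + (I_N\otimes\xi_G)\alpha_G^* + \epsilon$.

The covariance matrix of $v^*$ is calculated as

(3.3.4)  $Cov(v^*) = \Sigma = \sigma^2I_{NT} + \xi_1'\xi_1\sigma_{\alpha 1}^2(I_N\otimes P_{\xi_1}) + \dots + \xi_G'\xi_G\sigma_{\alpha G}^2(I_N\otimes P_{\xi_G})$

(3.3.5)  $\Sigma^{-1} = \frac{1}{\sigma^2}[I_{NT} - (1-q_1^2)(I_N\otimes P_{\xi_1}) - \dots - (1-q_G^2)(I_N\otimes P_{\xi_G})]$

where

$q_g^2 = \frac{\sigma^2}{\sigma^2 + \xi_g'\xi_g\sigma_{\alpha g}^2}, \quad g = 1,\dots,G$

(3.3.6)  $\Sigma^{-1/2} = \frac{1}{\sigma}[I_{NT} - (1-q_1)(I_N\otimes P_{\xi_1}) - \dots - (1-q_G)(I_N\otimes P_{\xi_G})]$

Therefore, the objective function (SSE after transformation by $\Sigma^{-1/2}$) is

(3.3.7)  $SSE = \left(Y - X\beta - \sum_{g=1}^G (e_N\otimes\xi_g)\mu_g\right)'\left[I_{NT} - \sum_{g=1}^G (1-q_g^2)(I_N\otimes P_{\xi_g})\right]\left(Y - X\beta - \sum_{g=1}^G (e_N\otimes\xi_g)\mu_g\right)$

The derivative of SSE with respect to $\mu_g$ is

(3.3.8)  $\frac{\partial SSE}{\partial\mu_g} = -2(e_N\otimes\xi_g)'[I_{NT} - (1-q_g^2)(I_N\otimes P_{\xi_g})][Y - X\beta - (e_N\otimes\xi_g)\mu_g] = 0$

The solution for $\mu_g$ obtained from (3.3.8) is

(3.3.9)  $\hat\mu_g = \frac{1}{N\xi_g'\xi_g}(e_N\otimes\xi_g)'(Y - X\beta)$

Then, the concentrated objective function obtained by substituting (3.3.9) into (3.3.7) is

(3.3.10)  $CSSE = (Y - X\beta)'[I_{NT} - (P_{e_N}\otimes P_\Xi)]'\left[I_{NT} - \sum_{g=1}^G (1-q_g^2)(I_N\otimes P_{\xi_g})\right][I_{NT} - (P_{e_N}\otimes P_\Xi)](Y - X\beta)$

NOTE 2.

$[I_{NT} - (P_{e_N}\otimes P_\Xi)]'\left[I_{NT} - \sum_{g=1}^G (1-q_g^2)(I_N\otimes P_{\xi_g})\right][I_{NT} - (P_{e_N}\otimes P_\Xi)]$
$\quad = I_{NT} - \sum_{g=1}^G q_g^2(P_{e_N}\otimes P_{\xi_g}) - \sum_{g=1}^G (1-q_g^2)(I_N\otimes P_{\xi_g})$
$\quad = (I_N\otimes M_\Xi) + \sum_{g=1}^G q_g^2(M_{e_N}\otimes P_{\xi_g})$

Using NOTE 2, we can rewrite CSSE as follows:

(3.3.11)  $CSSE = (Y - X\beta)'\left[(I_N\otimes M_\Xi) + \sum_{g=1}^G q_g^2(M_{e_N}\otimes P_{\xi_g})\right](Y - X\beta)$

We can derive the solutions for $\beta$ and each $\theta_g$ by minimizing CSSE with respect to $\beta$ and $\theta_g$. This yields

(3.3.12)  $\hat\beta_{GLS} = \left(X'\left[(I_N\otimes M_\Xi) + \sum_{g=1}^G q_g^2(M_{e_N}\otimes P_{\xi_g})\right]X\right)^{-1}X'\left[(I_N\otimes M_\Xi) + \sum_{g=1}^G q_g^2(M_{e_N}\otimes P_{\xi_g})\right]Y$

(3.3.13)  $\hat\xi_{g,GLS} = (1, \hat\theta_{g,GLS}')'$ is the eigenvector corresponding to the largest eigenvalue of $\sum_{i=1}^N\left[\sqrt{1-q_g^2}\,e_i + (1-\sqrt{1-q_g^2})\,\bar e\right]\left[\sqrt{1-q_g^2}\,e_i + (1-\sqrt{1-q_g^2})\,\bar e\right]'$.

The estimates in (3.3.12) and (3.3.13) are consistent and asymptotically efficient by the same reasoning as in Section 3.2. As in the one-component case, we can get a consistent estimator of $q_g^2$ using the results of the within and the between regressions.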
Specifically

(3.3.14)  $\mathrm{plim}\ \hat\sigma^2 = \mathrm{plim}\ \frac{SSE_W}{N(T-G)} = \sigma^2$

(3.3.15)  $\mathrm{plim}\ (\hat\sigma^2 + \hat\xi_g'\hat\xi_g\hat\sigma_{\alpha g}^2) = \mathrm{plim}\ \frac{SSE_{Bg}}{N} = \sigma^2 + \xi_g'\xi_g\sigma_{\alpha g}^2$

(3.3.16)  $\mathrm{plim}\ \hat q_g^2 = \mathrm{plim}\ \frac{SSE_W/(N(T-G))}{SSE_{Bg}/N}$

where $SSE_{Bg} = (Y - X\beta)'(M_{e_N}\otimes P_{\xi_g})(Y - X\beta)$.

The feasible GLS estimator using a consistent estimator of $q_g^2$ is asymptotically equivalent to the GLS estimator.

3.4 Summary

We have discussed a generalization of the conventional random effects model that assumes $\alpha_i$ to be i.i.d. and independent of the disturbance and the explanatory variables. We derived the OLS estimator and showed that it is consistent. We also derived the GLS estimator, showed that it is consistent, and derived its asymptotic distribution. The GLS estimator is more efficient than the within estimator, but the efficiency gain disappears as $T \to \infty$.

CHAPTER FOUR

TEST STATISTICS

It is meaningful to test the hypothesis that $\xi$ is a vector of ones. This is the restriction that reduces our general model to the usual simple panel data model. The within estimator and the GLS estimator of the simple model are not consistent if $\xi \neq e_T$. In the case of the within estimator,

(4.1.1)  $\mathrm{plim}\ \hat\beta_W = \mathrm{plim}\ (X'M_GX)^{-1}X'M_GY = \beta + \mathrm{plim}\ (X'M_GX)^{-1}X'M_G(I_N\otimes\xi)\alpha \neq \beta$

since $M_G(I_N\otimes\xi) \neq 0$. This means that the conventional panel data model produces inconsistent estimators (has a specification problem) if $\xi \neq e_T$.

We may develop test statistics for the hypothesis $\theta = e_{T-1}$ based on the work of Ronald Gallant (1985). Gallant considers estimators derived by minimizing an objective function $S_n(\lambda)$, where n = sample size and $\lambda$ = parameters. Our estimators minimize objective functions and therefore fit his framework. For example, for GLS we have

(4.1.2)  $S_N(\beta,\theta) = \frac{1}{N}\sum_{i=1}^N s_i(\beta,\theta) = \frac{1}{2N\tilde\sigma^2}(Y - X\beta)'[(I_N\otimes M_\xi) + q^2(M_{e_N}\otimes P_\xi)](Y - X\beta) = \frac{CSSE}{2N\tilde\sigma^2}$

where the preliminary estimator $\tilde\sigma^2$ is the $\hat\sigma^2$ derived from the within estimator (3.2.32).

The null hypothesis is considered as

(4.1.3)  $H_0: \theta = e_{T-1}$

or

(4.1.4)  $H_0: h(\lambda) = H\lambda - e_{T-1} = 0$

where $H = [0 : I_{T-1}]$ is a (T-1) by (K+T-1) matrix.
Then, the LM statistic given by Gallant (p. 219) is

(4.1.5)  $LM = N\left(\frac{\partial S_N(\tilde\lambda)}{\partial\lambda}\right)'\tilde{\mathcal J}^{-1}H'(H\tilde VH')^{-1}H\tilde{\mathcal J}^{-1}\left(\frac{\partial S_N(\tilde\lambda)}{\partial\lambda}\right)$

where

$\tilde\lambda$ = restricted estimate of $\begin{pmatrix}\beta\\ \theta\end{pmatrix} = \begin{pmatrix}\tilde\beta_{GLS}\\ e_{T-1}\end{pmatrix}$,

$\tilde\beta_{GLS}$ = GLS estimator with $\theta = e_{T-1}$ imposed,

$\tilde{\mathcal J} = \frac{\partial^2 S_N(\tilde\lambda)}{\partial\lambda\,\partial\lambda'}$, $\quad\tilde{\mathcal I} = \frac{1}{N}\sum_{i=1}^N\left(\frac{\partial s_i(\tilde\lambda)}{\partial\lambda}\right)\left(\frac{\partial s_i(\tilde\lambda)}{\partial\lambda}\right)'$, $\quad\tilde V = \tilde{\mathcal J}^{-1}\tilde{\mathcal I}\tilde{\mathcal J}^{-1}$,

and $s_i$ denotes the contribution of individual i to $S_N$.

The LM statistic in (4.1.5) has asymptotically a chi-square distribution with (T-1) degrees of freedom. Gallant (p. 220) also provides test statistics analogous to the usual likelihood ratio and Wald statistics:

(4.1.6)  $LR = 2N[S_N(\tilde\lambda) - S_N(\hat\lambda)] = \frac{1}{\tilde\sigma^2}[CSSE(\tilde\lambda) - CSSE(\hat\lambda)]$

(4.1.7)  $W = N\cdot h(\hat\beta_{GLS}, \hat\theta_{GLS})'(H\hat VH')^{-1}h(\hat\beta_{GLS}, \hat\theta_{GLS})$

where

$\hat\lambda = \begin{pmatrix}\hat\beta_{GLS}\\ \hat\theta_{GLS}\end{pmatrix}$ = unrestricted GLS estimate of $\begin{pmatrix}\beta\\ \theta\end{pmatrix}$,

$\hat V$ = unrestricted GLS estimate of V.

Under general conditions we have also that LR and W are asymptotically chi-square with T-1 degrees of freedom. These three test statistics will be used in Chapter 5 when we apply this general model to the compensation regression of faculty members, and test the hypothesis that the effects of individual specific components on the dependent variable (compensation) are equal over time.

CHAPTER FIVE

ESTIMATION OF COMPENSATION

5.1 Data

Our data consist of 100 full professors of economics in six large public universities: Michigan State University, the University of Michigan, the University of Wisconsin-Madison, the University of Illinois-Urbana, the University of Minnesota and the University of Maryland. These 100 individuals are observed in 1979-80 and 1985-86, so that this data set is a panel data set with N=100 and T=2.

The data set includes the log of nominal compensation (LCR), an administrative experience dummy (AD), a theorist dummy (TH), citations (CITS), and experience (EX). Nominal compensation is transformed from salaries. AD is a dummy variable equal to one for those with current or prior administrative service at or above the level of department chair. TH is a dummy variable equal to one for those who are theorists or theoretical econometricians; it
is a time-invariant variable. CITS is the average annual number of citations by others in the previous 5 years. EX is the number of years since the individual obtained the Ph.D. Thus every individual has 6 more years of experience in the second period (1985-86) than in the first period (1979-80). The means and standard deviations of the variables are shown in Table 5.1. For a further discussion of the data, see Hamermesh (1989).

TABLE 5.1. Means and Standard Deviations

Variables       1979-80             1985-86             Pooled
                Mean      S.D.      Mean      S.D.      Mean      S.D.
LCR              3.763     0.17      4.227     0.19      3.995     0.29
EX              19.460     6.74     25.460     6.74     22.460     7.36
CITS            20.556    24.58     28.422    40.40     24.489    33.59

N=100

5.2 Estimation

The compensation equation is described as

(5.2.1)  $LCR_i = \beta_0 + AD_i\beta_1 + TH_i\beta_2 + EX_i\beta_3 + CITS_i\beta_4 + \xi\alpha_i + \epsilon_i$

The only difference between the simple and the general panel data models developed in Chapters 2 and 3 is whether or not $\xi$ is assumed to be a vector of ones. That is, when we consider that $\alpha_i$ is unobserved, time-invariant and has an effect on compensation, the general regression model allows the effect on compensation to be time-variant whereas the simple model assumes the effects are equal over time.

The meaning of $\xi$ needs to be discussed in the present setting before we go on. Suppose $\xi_1 = 1$ and $\xi_2 = 1.3$. Then, it is not true that the compensation of individual i is 30 percent higher at t=2 than at t=1 because of the individual effect $\alpha_i$. This would be true only when $\alpha_i = 1$. It is more accurate to say that the effect of $\alpha_i$ on nominal compensation at t=2 is 30% higher than at t=1. If $\alpha_i = 0.1$, compensation is 3% higher 6 years later because of $\alpha_i$, holding all other regressors constant. We can also say that the i'th individual has $100\,\xi_t$ percent higher compensation at time t than the j'th individual when $\alpha_i$ is one unit greater than $\alpha_j$ and the two individuals have the same values for all other regressors.

Table 5.2 reports the OLS, simple within and simple GLS estimates.
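The interpretation of $\xi$ described above can be restated as a small calculation. The sketch below is only an illustration: the function name is hypothetical, and it treats log points as percentages, which is the same approximation the text uses.

```python
def pct_change_from_effect(xi_1, xi_2, alpha_i):
    """Approximate percent change in compensation between period 1 and
    period 2 attributable to alpha_i, holding other regressors constant.
    Log points are read as percentages (an approximation)."""
    return 100.0 * (xi_2 - xi_1) * alpha_i

# The two cases discussed in the text, with xi_1 = 1 and xi_2 = 1.3
case_alpha_one = pct_change_from_effect(1.0, 1.3, 1.0)    # alpha_i = 1
case_alpha_small = pct_change_from_effect(1.0, 1.3, 0.1)  # alpha_i = 0.1
print(case_alpha_one, case_alpha_small)
```

The first case is the only one in which the 30 percent figure applies to compensation itself; for $\alpha_i = 0.1$ the same $\xi_2$ yields roughly 3 percent.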
Table 5.3 presents the estimates of the general within and general GLS models. The first noticeable change from the simple to the general model is that the general within estimation can include time-invariant regressors, such as a constant term and TH, which are excluded from the simple within estimation.

Table 5.3 shows the value of $\xi_2$. The within estimate and the GLS estimate of $\xi_2$ are 1.3424 and 1.5627, respectively. According to the within-estimated $\xi$, the unobservable parameter $\alpha_i$ has a 34.24 percent higher effect on compensation in 1985-86 than in 1979-80. As we discussed, $\hat\xi_2 = 1.3424$ does not mean that compensation increases 34.24% because of $\alpha_i$. We will have an opportunity to look at the percentage change in compensation caused by $\alpha_i$ later in this section, after we estimate $\alpha_i$.

TABLE 5.2. The Simple Estimation

Independent       Estimation Method
Variables         OLS          Within       GLS
Constant          3.4301                    2.9218
                  (63.44)                   (44.67)
AD                0.1679       0.1249       0.2080
                  (4.15)       (4.41)       (4.07)
TH                0.1199                    0.1468
                  (3.05)                    (2.42)
EX                0.0188       0.0728       0.0400
                  (8.71)       (51.60)      (15.31)
CITS              0.0034       0.0022       0.0040
                  (6.90)       (6.42)       (6.66)
Adjusted R2       0.415        0.965        0.725

t-values in parentheses.

TABLE 5.3. The General Estimation

Independent       Estimation Method
Variables         Within       GLS
Constant          2.3448       2.7495
                  (11.53)      (13.78)
AD                0.0823       0.1057
                  (2.89)       (1.47)
TH                0.0086       0.0819
                  (0.11)       (0.71)
EX                0.0363       0.0174
                  (4.75)       (2.34)
CITS              0.0012       0.0023
                  (2.48)       (2.04)
ξ2                1.3424       1.5627
                  (3.58)       (4.27)
Adjusted R2       0.977        0.967

t-values in parentheses. The t-value of ξ2 is for ξ2=1.

TABLE 5.4. Test-Statistics for H0: ξ1=ξ2=1

Test-statistics   Within       GLS
t-statistic       3.58         4.27
LM-statistic      24.86        7.76
LR-statistic      34.32        83.62

Unlike the general estimation, the simple regression model assumes $\xi_2 = \xi_1 = 1$. Therefore, we need to test the hypothesis $\xi_2 = 1$ using the test statistics developed in Chapter 4. The results of the hypothesis tests are given in Table 5.4. The LM and LR test statistics all show that we can reject the assumption that $\xi$ is a vector of ones.
In other words, the $\alpha_i$ have different effects on compensation over time.

We noted in the introductory chapter that our general model is similar in some ways to a panel data model with additive individual and time effects, but that it is not the same model. Now, we may run the same regression as (5.2.1) but including a time dummy in order to verify the above statement empirically. Thus consider the model

(5.2.2)  $LCR_i = \beta_0 + AD_i\beta_1 + TH_i\beta_2 + EX_i\beta_3 + CITS_i\beta_4 + D2_i\beta_5 + \xi\alpha_i + \epsilon_i$

where D2 is a time-dummy variable, equal to one in the first period (1979-80) and zero in the second period (1985-86).

One might expect that $\xi_2$ should be close to one when a time-dummy is in the regression, since the time-dummy absorbs the factor $\xi$. Table 5.5 shows that this is not so. The coefficients of the time-dummy are not much different between the estimation with and without $\xi$. All three estimates in Table 5.5 imply that nominal compensation in 1979-80 is about thirty-four percent lower than in 1985-86 if the rest of the explanatory variables are equal. Moreover, the value of $\xi_2$ with the time-dummy included is really no different from its value without the time-dummy (Table 5.5 vs. Table 5.3). Both of the GLS estimates of $\xi_2$ (in Table 5.3 and Table 5.5) imply that nominal compensation of individual i is 56.3 times $\alpha_i$ percent higher 6 years later because of $\alpha_i$, holding all other variables constant. This is some empirical evidence that $\xi$ is not simply another expression of a time effect.

TABLE 5.5. Estimation with a Time-Dummy

Independent       Estimation Method
Variables         OLS          The Simple GLS    The General GLS
Constant          3.8997       3.9037            3.6738
                  (92.55)      (74.22)           (81.71)
AD                0.1224       0.1228            0.1057
                  (4.91)       (4.69)            (4.68)
TH                0.1113       0.1134            0.0819
                  (4.63)       (3.68)            (2.42)
EX                0.0079       0.0079            0.0174
                  (5.46)       (4.18)            (10.88)
CITS              0.0025       0.0024            0.0023
                  (8.23)       (7.54)            (6.41)
D2                -0.3871      -0.3882           -0.3328
                  (-18.03)     (-24.05)          (-22.87)
ξ2                                               1.5627
                                                 (4.21)
Adjusted R2       0.777        0.929             0.967

t-values in parentheses. The t-value of ξ2 is for ξ2=1.
Actually, the regression without a time-dummy, such as the general random effects model in Table 5.3, is identical to the regression with a time-dummy and $E(\alpha_i) = 0$, which is the general random effects (GLS) model in Table 5.5. Notice that the GLS slope estimates and the GLS estimate of $\xi_2$ in Tables 5.3 and 5.5 are equal. Therefore, we do not have to include time effects when we use this general estimation method. Refer to Table 5.6 for the hypothesis test for including a time effect in this general regression model.

TABLE 5.6. Test-Statistics (Estimation includes a time-dummy.)

Test-statistics   GLS
t-statistic       4.21
LM-statistic      10.69
LR-statistic      50.17

The t-statistic, LM and LR test statistics show that we can reject the hypothesis that $\alpha_i$ has a time-invariant effect on compensation.

In order to calculate by how many percent compensation rose in 6 years because of the individual effect $\alpha_i$, we need to estimate $\alpha_i$. The $\alpha_i$ can be estimated using the first order condition from the minimization of SSE:

(5.2.3)  $\hat\alpha_{iW} = \frac{\hat\xi_W'}{\hat\xi_W'\hat\xi_W}(Y_i - X_i\hat\beta_W)$

The estimate $\hat\alpha_{iW}$ is a consistent estimate of $\alpha_i$ as $T \to \infty$ but not as $N \to \infty$ with T fixed. (See Appendix 5.1 for the list of $\hat\alpha_i$.)

Table 5.7 shows the maximum, the average and the minimum estimated $\alpha_i$ in the within regression. $\hat\alpha_{max} = 1.114$ along with $\hat\xi_2 = 1.3424$ in Table 5.3 implies that the faculty member whose $\alpha_i$ is the maximum among the 100 individuals has 38.1 (0.381 = 1.114 × 0.3424) percent higher nominal compensation in 1985-86 than in 1979-80 because of the individual effect $\alpha_i$. On the other hand, the individual whose $\alpha_i$ is the minimum has 7.2 (0.072 = 0.210 × 0.3424) percent higher nominal compensation in 1985-86 than in 1979-80 because of the individual effect $\alpha_i$. We can also say that on average the individual effect on nominal compensation 6 years later is 23.0 (0.230 = 0.673 × 0.3424) percent higher.

TABLE 5.7. The Within Estimate of αi

                  Maximum      Minimum      Average
α̂                 1.114        0.210        0.673
Faculty Number    26           39

It is worth discussing what $\alpha$ can represent in the compensation equation. The $\alpha_i$
should not only be unobserved but also have significant effects on the change in nominal compensation. These effects should be different over time to qualify as a legitimate candidate for $\alpha$. I think that work habits are a good candidate for $\alpha_i$. The fact that $\hat\xi_{2,W} = 1.34$ and $\hat\xi_{2,GLS} = 1.56$ implies that the same work habits have 1.34 and 1.56 times as much impact on the nominal compensation of faculty members in 1985-86 as in 1979-80, according to the within and the GLS estimates, respectively. This is reasonable in the sense that work habits do not have a large impact on compensation at the early stage of the job, but the impact will grow gradually as time passes. Suppose an individual has just become a faculty member. At this early stage, TH, EX, AD, and CITS determine the compensation, but work habits have little effect even if they are very good. Good work habits will be an increasingly important factor in determining compensation as time goes on, since good working attitudes will be appreciated by colleagues, supervisors and so on (even though we cannot measure them). The work habits of each individual affect compensation greatly and are stable over time, as well as unobserved in the data. Therefore, work habits satisfy all the requirements to be considered as what $\alpha_i$ represents.

CHAPTER SIX

FRONTIER PRODUCTION FUNCTIONS

6.1 Review

A standard production function represents the maximum possible amount of output obtained from a given amount of inputs. However, the output data we observe are not necessarily equal to the maximum possible output. The difference between maximal output and observed output is a measure of technical inefficiency. The desire to measure technical inefficiency motivates the use of so-called "frontier production functions" to model maximal possible output, given inputs.

A stochastic frontier model assumes output to be bounded by a stochastic frontier, whereas a deterministic frontier model regards the production frontier as deterministic.
That is to say, in the stochastic frontier model the production frontier can vary randomly over time or across firms. Aigner, Lovell and Schmidt (1977) and Meeusen and van den Broeck (1977) introduce a stochastic frontier model as follows:

(6.1.1)  $Y_i = \beta_0 + X_i\beta + \epsilon_i, \quad i = 1,\dots,N.$

Here Y represents output and X represents inputs; for example, in the Cobb-Douglas case, Y and X are measured in logarithms. The stochastic frontier model decomposes the error $\epsilon_i$ as $\epsilon_i = v_i - u_i$ so that (6.1.1) can be written as

(6.1.2)  $Y_i = \beta_0 + X_i\beta + v_i - u_i$

The error term $(v_i - u_i)$ has two parts. The component $v_i$ is statistical noise, and represents the variation in output due to luck, weather, and other factors outside the control of the firm. It is assumed to be i.i.d. as $N(0, \sigma_v^2)$. The second component $u_i$ represents technical inefficiency, and so $u_i \geq 0$. It is assumed to be i.i.d. with a specific (one-sided) density. The original papers considered half-normal and exponential distributions for u. Other choices include truncated normal (Stevenson (1980)) and gamma (Greene (1990)). In any case, the model is called a stochastic frontier because the upper bound (frontier) for $Y_i$ is $(\beta_0 + X_i\beta + v_i)$, which is stochastic. The model may be estimated by maximum likelihood or by a corrected least squares procedure.

There is, however, a problem in estimating the technical inefficiency $u_i$ for each observation. After the frontier function is estimated, the residuals are easily obtained, but they are estimates of $\epsilon_i = v_i - u_i$, not of $u_i$. (The average level of technical inefficiency can be estimated from the average of the $\hat\epsilon_i$.) Jondrow, Lovell, Materov and Schmidt (1982) use the fact that $\epsilon_i$ can be estimated and has information on $u_i$. They suggest estimating $u_i$ by $E(u_i \mid v_i - u_i)$, and they give an explicit formula for the half-normal case. However, this estimate contains noise (due to $v_i$) even asymptotically.
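The remark that average inefficiency can be recovered from the average composed error can be illustrated by simulation. The sketch below assumes the half-normal case, for which $E(u_i) = \sigma_u\sqrt{2/\pi}$, and draws the true errors $\epsilon_i = v_i - u_i$ directly rather than estimating a frontier. The values of $\sigma_v$ and $\sigma_u$, the sample size and the seed are arbitrary assumptions chosen for illustration.

```python
import math
import random

random.seed(0)
sigma_v, sigma_u = 0.2, 0.3     # hypothetical standard deviations
n = 100_000

eps = []
for _ in range(n):
    v = random.gauss(0.0, sigma_v)          # two-sided statistical noise
    u = abs(random.gauss(0.0, sigma_u))     # half-normal inefficiency, u >= 0
    eps.append(v - u)                       # composed error of (6.1.2)

mean_eps = sum(eps) / n
mean_u_theory = sigma_u * math.sqrt(2.0 / math.pi)   # E(u) for a half-normal

# The mean of the composed error should be close to -E(u)
print(mean_eps, -mean_u_theory)
```

The average recovers only the mean of $u_i$; each individual $\epsilon_i$ still mixes noise and inefficiency, which is the difficulty the conditional expectation $E(u_i \mid v_i - u_i)$ is meant to address.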
The estimate of technical inefficiency depends upon the distributional assumptions made, such as normality for v and half-normal, exponential, gamma, etc. for u. Schmidt (1986) says, "In my opinion the only serious intrinsic problem with stochastic frontiers is that the separation of noise and inefficiency ultimately hinges on strong (and arbitrary) distributional assumptions."

Schmidt and Sickles (1984) present a stochastic production frontier model with panel data which does not require strong distributional assumptions about technical inefficiency. Their model is the following:

(6.1.3)  $Y_{it} = \beta_0 + X_{it}\beta + v_{it} - u_i, \quad i = 1,\dots,N;\ t = 1,\dots,T.$

The term $u_i$ represents technical inefficiency and is assumed to be constant over time. By defining $\alpha_i = \beta_0 - u_i$, (6.1.3) becomes

(6.1.4)  $Y_{it} = X_{it}\beta + \alpha_i + v_{it}$

a panel data model with an individual effect. Schmidt and Sickles define

(6.1.5)  $\alpha = \max_i \alpha_i$

(6.1.6)  $u_i = \alpha - \alpha_i$

so that we may decompose the effects $\alpha_i$ into the overall intercept $\alpha$ and technical inefficiency $u_i$.

This model can be estimated without any assumptions about $u_i$, other than $u_i \geq 0$, by treating the effects as fixed. In this case the usual fixed effects (within) estimator applies. Alternatively, if we assume the $u_i$ to be i.i.d., but do not make a specific distributional assumption, we have the usual random effects model and a GLS estimator applies. Finally, if we are willing to make a specific distributional assumption, the model may be estimated by MLE, as suggested by Pitt and Lee (1981). In this case the technical inefficiency $u_i$ is estimated by $E(u_i \mid v_{i1} - u_i, \dots, v_{iT} - u_i)$.

6.2 Presentation of The Model

The Schmidt and Sickles model relaxes strong distributional assumptions about technical efficiency, but at the cost of imposing another strong assumption, that technical efficiency is constant over time.
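The decomposition in (6.1.5) and (6.1.6) is simple enough to sketch directly. The firm labels and effect values below are hypothetical. Reporting efficiency as $\exp(-u_i)$ is an assumption made here for illustration (a common convention when output is measured in logarithms), not a formula stated in the text above.

```python
import math

# Hypothetical estimated individual effects, in log-output units
alpha_hat = {"firm_a": 5.10, "firm_b": 4.85, "firm_c": 4.60}

# (6.1.5): the frontier intercept is the largest estimated effect
alpha_max = max(alpha_hat.values())

# (6.1.6): inefficiency is the shortfall from the frontier intercept
u_hat = {firm: alpha_max - a for firm, a in alpha_hat.items()}

# Assumption: with output in logs, efficiency is reported as exp(-u)
efficiency = {firm: math.exp(-u) for firm, u in u_hat.items()}

for firm in alpha_hat:
    print(firm, round(u_hat[firm], 3), round(100 * efficiency[firm], 1))
```

By construction the firm with the largest effect has $u_i = 0$ and is treated as fully efficient, so all other efficiencies are measured relative to the best firm in the sample.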
Schmidt (1986) says, "An important line of future research, in my opinion, is to allow inefficiency to change over time, but in such a way that it can still be separated from statistical noise without making very strong distributional assumptions. I believe strongly in the usefulness of panel data in estimating frontiers and measuring inefficiency." We need to weaken the assumption of time-invariant inefficiency but should not lose the advantages of the Schmidt and Sickles model.

Kumbhakar has generalized the Schmidt and Sickles model by assuming that the technical inefficiency for firm i at time t, $u_{it}$, can be written as $u_{it} = g(t,\theta)\alpha_i$, where $\alpha_i$ is an individual effect and $g(t,\theta)$ is a specified function that depends on t and some parameters $\theta$. He considers the specific function

$g(t,\theta) = (1 + \exp(bt + ct^2))^{-1}$

In this model $\alpha_i$ is fixed over time, but its effect on output changes over time as $g(t,\theta)$ changes with t. The empirical problem is choosing the function g appropriately.

Our panel data model of Chapters 2 and 3 can be regarded as a generalization of Kumbhakar's model, in which the parametric function $g(t,\theta)$ is replaced by a set of dummy variables representing time. As such we do not require assumptions about $g(t,\theta)$. Specifically, our model applied to the frontier production function setting can be written as

(6.2.1)  $Y_{it} = \beta_0 + X_{it}\beta + v_{it} - u_{it}$

When we define $u_{it} = -\theta_t\alpha_i$, (6.2.1) becomes

(6.2.2)  $Y_{it} = \beta_0 + X_{it}\beta + \theta_t\alpha_i + v_{it}$

or

(6.2.3)  $Y_i = \beta_0 + X_i\beta + \xi\alpha_i + v_i$

This is the general panel data model with individual effects whose coefficients change over time. Similarly to Schmidt and Sickles, we define

(6.2.4)  $\alpha_t = \max_i\,[\beta_0 + \theta_t\alpha_i]$

(6.2.5)  $u_{it} = \alpha_t - (\beta_0 + \theta_t\alpha_i)$

Notice that (6.2.4) and (6.2.5) are equal to (6.1.5) and (6.1.6) when every $\theta_t$ is equal to one.
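The definitions (6.2.4) and (6.2.5) can be sketched as follows, together with a check that they collapse to the time-invariant measure when every $\theta_t$ equals one. The function name, the intercept, the individual effects and the $\theta_t$ values are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def inefficiency(beta0, theta, alpha):
    """Return u[i][t] = alpha_t - (beta0 + theta_t * alpha_i), where
    alpha_t = max_i (beta0 + theta_t * alpha_i), as in (6.2.4)-(6.2.5)."""
    frontier = [max(beta0 + th * a for a in alpha) for th in theta]
    return [
        [frontier[t] - (beta0 + theta[t] * a) for t in range(len(theta))]
        for a in alpha
    ]

beta0 = 1.0                       # hypothetical intercept
alpha = [0.9, 0.6, 0.2]           # hypothetical individual effects
theta_general = [1.0, 1.2, 0.5]   # hypothetical time-varying coefficients
theta_simple = [1.0, 1.0, 1.0]    # the time-invariant special case

u_general = inefficiency(beta0, theta_general, alpha)
u_simple = inefficiency(beta0, theta_simple, alpha)

for i, row in enumerate(u_general):
    print(i, [round(u, 3) for u in row])
```

With `theta_simple` every row of `u_simple` is constant over time and equals $\max_j\alpha_j - \alpha_i$, while with `theta_general` the same individual effects produce inefficiency levels that differ across periods.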
If all the $\theta_t$ have the same sign, the definitions in (6.2.4) and (6.2.5) are equivalent to

(6.2.6)  $\alpha = \max_i \alpha_i$

(6.2.7)  $u_{it} = \theta_t(\alpha - \alpha_i)$

Our regression equations will also include some variables representing influences on output that are not inputs under control of the firm. (Examples are dummy variables representing wet versus dry season and village location.) These are properly regarded as part of the intercept and should be included in the above calculations so that they do not appear to be inefficiency. Letting $D_{it}\gamma$ represent the effects of these variables, (6.2.4) and (6.2.5) become

(6.2.8)  $\alpha_t = \max_i\,[\beta_0 + D_{it}\gamma + \theta_t\alpha_i]$

(6.2.9)  $u_{it} = \alpha_t - (\beta_0 + D_{it}\gamma + \theta_t\alpha_i)$

The estimates in (6.2.5), (6.2.7) or (6.2.9) are consistent as N and $T \to \infty$, since the estimate of $\alpha_i$ is consistent as $T \to \infty$ and the most efficient firm in the sample will indeed be perfectly efficient as $N \to \infty$.

6.3 Data

We will reanalyze a data set previously analyzed by Erwidodo (1990). The data set was collected by the Agro Economic Survey, as part of the Rural Dynamic Study in the rice production area of the Cimanuk River Basin, West Java, and obtained from the Center for Agro Economic Research, Ministry of Agriculture, Indonesia. The data are for 171 rice farming families and extend over six time periods. Each time period is a growing season; there are two growing seasons per year. Three of the six time periods are dry seasons and the other three are wet seasons. Data are collected from six different villages that contain 19, 24, 37, 33, 22 and 36 farm families, respectively.

The data set includes information on seed, urea, TSP (Triple Super Phosphate), labor, and land. It also includes some dummy variables. DP is a dummy variable equal to one if pesticides are used, and zero otherwise.
DV1 equals one if HYV (High Yield Variety) rice is planted, while DV2 equals one if mixed varieties are planted; the omitted category represents traditional varieties. DSS equals one in the wet season and zero in the dry season. DR1,...,DR5 are dummy variables that represent the six different villages, and are intended to control for differences in soil quality across villages. For a further discussion of the data, see Erwidodo (1990).

6.4 Estimation

In this section, we estimate a production function for Indonesian rice farms. The Cobb-Douglas production function to be estimated is specified as follows:

(6.4.1)  $\ln Y_{it} = \beta_0 + \sum_{j=1}^5 \beta_j\ln X_{jit} + \beta_6DP_{it} + \beta_7DV1_{it} + \beta_8DV2_{it} + \beta_9DSS_t + \beta_{10}DR1_i + \beta_{11}DR2_i + \beta_{12}DR3_i + \beta_{13}DR4_i + \beta_{14}DR5_i + \theta_t\alpha_i + v_{it}$

where

Y: total production of rough rice in kilograms
X1: the amount of seed (Kg)
X2: the amount of urea (Kg)
X3: the amount of TSP (Kg)
X4: the amount of labor (hours)
X5: the area planted with rice (Ha)

and all dummies are defined in Section 6.3.

Our main concern is a comparison of the results obtained from the simple and general panel data models. Estimation results are given in Tables 6.1 and 6.2. Table 6.1 displays the estimates of the simple panel data model by OLS, within and GLS. (The second column of Table 6.1 shows the within estimates; it cannot include the village dummies (DR) because they are time-invariant.) Table 6.2 reports the estimates of the general panel data model by within and GLS. The within estimates in Table 6.2 include the coefficients of the village dummies, and they seem to be significant.

TABLE 6.1. Estimation of the Simple Panel Data Model

Independent       Estimation Methods
Variables         OLS          Within       GLS
Constant          5.0811                    5.0636
                  (26.73)                   (26.32)
Seed              0.1358       0.1208       0.1327
                  (5.06)       (4.46)       (4.93)
Urea              0.1200       0.0918       0.1132
                  (6.91)       (4.79)       (6.38)
TSP               0.0718       0.0892       0.0761
                  (6.31)       (7.71)       (6.66)
Labor             0.2167       0.2431       0.2230
                  (7.60)       (8.25)       (7.75)
Land              0.4819       0.4521       0.4770
                  (15.90)      (14.03)      (15.57)
DP                0.0077       0.0338       0.0141
                  (0.27)       (1.15)       (0.49)
DV1               0.1755       0.1788       0.1772
                  (4.60)       (4.75)       (4.66)
DV2               0.1356       0.1754       0.1446
                  (2.60)       (3.40)       (2.78)
DSS               0.0489       0.0533       0.0492
                  (2.26)       (2.73)       (2.35)
DR1               -0.0500                   -0.0511
                  (-1.16)                   (-1.03)
DR2               -0.0393                   -0.0442
                  (-0.73)                   (-0.75)
DR3               -0.0623                   -0.0724
                  (-1.09)                   (-1.17)
DR4               0.0248                    0.0117
                  (0.47)                    (0.20)
DR5               0.0818                    0.0750
                  (1.48)                    (1.25)
Adjusted R2       0.882        0.989        0.890

t-values in parentheses.

TABLE 6.2. Estimation of the General Panel Data Model

Independent       Estimation Methods
Variables         Within       GLS
Constant          4.2605       4.7453
                  (10.99)      (16.88)
Seed              0.1241       0.1286
                  (3.86)       (3.89)
Urea              0.1069       0.1045
                  (5.07)       (4.63)
TSP               0.0303       0.0421
                  (2.27)       (3.16)
Labor             0.2303       0.2188
                  (7.92)       (6.98)
Land              0.4579       0.4739
                  (10.63)      (10.74)
DP                0.0080       0.0272
                  (0.29)       (0.97)
DV1               0.0805       0.1040
                  (2.28)       (2.95)
DV2               0.1226       0.1370
                  (2.43)       (2.89)
DSS               0.1580       0.1684
                  (3.21)       (2.67)
DR1               0.0487       0.0124
                  (0.35)       (0.15)
DR2               0.6292       0.1621
                  (2.40)       (1.79)
DR3               0.4853       0.0904
                  (2.13)       (0.96)
DR4               0.2316       0.0625
                  (1.27)       (0.60)
DR5               0.6342       0.2581
                  (2.98)       (2.78)
ξ2                1.1713       1.4410
                  (1.48)       (2.06)
ξ3                0.4912       0.3229
                  (-3.05)      (-4.71)
ξ4                0.6800       0.4157
                  (-2.47)      (-3.38)
ξ5                1.2203       1.1993
                  (2.51)       (1.75)
ξ6                1.3854       1.6848
                  (2.48)       (2.56)
Adjusted R2       0.935        0.928

t-values in parentheses. The t-values of ξ are for ξt=1.

TABLE 6.3. Test-Statistics

Test-statistics   Within       GLS
F-statistic       2.29         14.69
LM-statistic      69.53        9.78
LR-statistic      238.17       288.73
In this application, the villages have significantly different soil conditions from each other and village dummies should be included in the regression. One of the advantages of the general model is the ability to include time-invariant regressors, such as a constant and village dummies, in performing within estimation.

The primary focus must be on the value of $\xi$, since there is no real difference between the simple and general panel data models if $\xi$ is close to a vector of ones. In addition, $\xi$ allows technical efficiency to change over time. Both $\hat\xi_W = (1\ \ 1.171\ \ 0.491\ \ 0.680\ \ 1.220\ \ 1.385)'$ and $\hat\xi_{GLS} = (1\ \ 1.441\ \ 0.323\ \ 0.416\ \ 1.199\ \ 1.685)'$ seem to be far different from a vector of ones. Table 6.3 provides us with the test statistics for the hypothesis that technical efficiency is constant over time. The asymptotic distribution under the null hypothesis is $\chi^2_5$, for which the 5% critical value is 11.07. For all of the statistics except one we can reject the null hypothesis that technical inefficiency is time invariant.

There is no obvious temporal pattern to our estimates of $\xi$. In particular there is no clear trend, nor is there a seasonal pattern.

Despite the fact that $\hat\xi_W$ and $\hat\xi_{GLS}$ are significantly different from a vector of ones, the estimated regression coefficients for the simple model (Table 6.1) and the general model (Table 6.2) are not very different. The biggest changes are in the coefficients of TSP (.089 to .030), DV1 (.179 to .081) and DSS (.053 to .158).

Tables 6.4 and 6.5 show some summary measures of technical efficiency of individual rice farms. (See Appendix 6.1 for a complete list of the technical efficiency of each farm.) The results in Table 6.4 are calculated according to (6.1.5) and (6.1.6), while we construct Table 6.5 according to (6.2.8) and (6.2.9). The dummy variables which are included in the intercept are the seasonal dummy (DSS) and the village dummies (DR).
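Returning to the tests in Table 6.3, the comparison with the critical value can be written out explicitly. The sketch below covers only the LM and LR statistics, which Chapter 4 describes as asymptotically chi-square; leaving the F-statistic out of the comparison is a choice made here for illustration, since its reference distribution is not spelled out in this passage.

```python
CRITICAL_5PCT = 11.07   # 5% critical value of chi-square(5), as quoted in the text

stats = {               # values copied from Table 6.3
    ("LM", "Within"): 69.53,
    ("LM", "GLS"): 9.78,
    ("LR", "Within"): 238.17,
    ("LR", "GLS"): 288.73,
}

# A statistic above the critical value rejects time-invariant inefficiency
reject = {key: value > CRITICAL_5PCT for key, value in stats.items()}

for (test, estimator), rejected in reject.items():
    verdict = "reject H0" if rejected else "do not reject H0"
    print(test, estimator, verdict)
```

Among these four statistics only the GLS-based LM statistic falls below the critical value.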
The pesticide use dummy (DP) and the variety dummies (DV) are in the production function of rice, since DP is considered an input and DV represents different outputs. The average level of technical efficiency in Table 6.4 from the Schmidt and Sickles model is fairly close to the overall average efficiency in Table 6.5 from the new model. That is to say, the simple within estimator tells us that these six Indonesian rice farm villages have an average level of technical efficiency of 56.69%. The general within estimator also implies that average technical efficiency is 56.79%.

Erwidodo (1990) estimated the production function of the rice farms from the simple panel data model and used the Battese and Coelli (1988) method to measure technical inefficiency by assuming a half-normal distribution of technical inefficiency (u_i).¹³ His measure of the average level of technical efficiency is approximately 94.20%. The high efficiency measures are expected from the half-normal density assumption, since it implies that the mode is at u_i = 0, which means there are many perfectly efficient farms. On the other hand, the average efficiency measure of 56.79% from the general panel data model seems to be too low. However, this measure seems to be more legitimate than the measure of Erwidodo when we consider that these six Indonesian rice farm villages are relatively less developed (for example, poor drainage, water control, transportation, etc.) and that they harvest twice a year.

The advantage of the general model is that it gives us different efficiency levels in each season. The average technical efficiency of the first season is 56.52%, and it is more or less the same in the second season. The third season brings a large improvement of efficiency to 67.27%, and efficiency declines slightly to 62.87% in the next season. The last two seasons have lower technical efficiency levels of 52.85% and 47.59%, respectively.

TABLE 6.4. Technical Efficiency (from the simple estimation)

The Simple Within Estimation
                    Farm Number     Efficiency(%)
Maximum             164             100.0
Minimum             45               36.55
Median              15               55.40
Average (Mean)                       56.69

TABLE 6.5. Technical Efficiency (from the general estimation)

                          Efficiency(%) from The General Within
           Farm Number    T=1      T=2      T=3      T=4      T=5      T=6
Maximum    164            100.0    100.0    100.0    100.0    100.0    94.39
Median     80             55.40    50.11    74.63    66.82    48.70    41.73
Minimum    45             33.63    27.93    58.40    47.59    26.48    20.90
Average                   56.52    53.62    67.27    62.87    52.85    47.59

Overall Average = 56.79
Maximum, Minimum and Median Farms are defined in the first season.

We can also calculate the technical efficiency level of each rice farm family over the six different time periods in this sample. Table 6.5 shows the efficiency levels of selected families. Family number 164 is the most efficient in all seasons except the last season. (It is the third most efficient for T=6.) The simple within estimator says that family number 164 is the most efficient for all six seasons, because it assumes that the efficiency level does not change over time. The most inefficient rice farm family for T=1 is family number 45, and it is 33.63% efficient according to Table 6.5. This family becomes the 133rd most efficient family for T=3, with an efficiency level of 58.40%, but it goes back to being the least efficient rice farm family for T=5 and T=6. The results of the Schmidt and Sickles model say that the technical efficiency level of family number 45 is 36.55%, and it is the lowest among the 171 families.

Figure 6.1 shows the trend of technical efficiency of the median family at T=1 (family number 80). Its efficiency level is 55.40% for T=1. Its efficiency level then falls at T=2, rises rapidly at T=3, and then declines continuously. On the other hand, the efficiency level of the same family is 55.18% over all six time periods according to the simple within estimator.
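The period-by-period efficiency levels discussed above can be illustrated with a minimal sketch. It rests on a simplifying assumption that is not the full calculation in (6.2.8) and (6.2.9): the farm-specific term in period t is taken to be only the product of a time coefficient (written xi_t here) and an individual effect (written delta_i here), and efficiency is measured relative to the best farm in each period as exp(xi_t * delta_i - max_j xi_t * delta_j). The seasonal and village dummies that the text includes in the intercept are ignored. The time coefficients are the within estimates from Table 6.2; the individual effects and the function name are hypothetical.

```python
import math

# Within estimates of the time coefficients from Table 6.2 (first element normalized to 1).
XI_WITHIN = [1.0, 1.1713, 0.4912, 0.6800, 1.2203, 1.3854]


def efficiency_by_period(delta, xi=XI_WITHIN):
    """Return eff[i][t] = exp(xi_t * delta_i - max_j xi_t * delta_j).

    Simplified illustration: dummy variables in the intercept are ignored.
    """
    table = [[0.0] * len(xi) for _ in delta]
    for t, xi_t in enumerate(xi):
        effects = [xi_t * d for d in delta]
        frontier = max(effects)  # best-practice farm in period t
        for i, effect in enumerate(effects):
            table[i][t] = math.exp(effect - frontier)
    return table


# Hypothetical individual effects for three farms (illustrative values only).
delta = [0.0, -0.59, -1.09]
eff = efficiency_by_period(delta)
for row in eff:
    print([round(100 * e, 1) for e in row])
```

In this simplified version a smaller time coefficient compresses every farm's gap to the frontier, which is in line with the higher efficiency levels reported for T=3 and T=4. Because all the time coefficients are positive, the ranking of farms is the same in every period, so the sketch cannot reproduce rank changes such as the one reported for family number 164 at T=6.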
Table 6.6 records the percentage of rice farm families (among the 171 families) which fall in each decile of technical efficiency. The implication of Table 6.6 is not very different from that of Tables 6.4 and 6.5. As expected, rice farms are relatively more efficient at T=3 and 4, since the within-estimated ξ shows that ξ3 and ξ4 are smaller than the other elements of ξ. The value ξ6 = 1.3854 is the largest element of ξ, and so the rice farms are on average the least efficient at T=6.

[FIGURE 6.1. Technical efficiency (%) of the 80th family, T = 1 to 6: simple within and general within]

It is not true that every family has the highest efficiency level at T=3 and the lowest at T=6 just because the maximum, the median or the average efficiency in Table 6.5 has that kind of trend. Some other farms have different trends for technical efficiency. Figure 6.2 shows three different types of efficiency trends. Family number 5 has its lowest efficiency level (67.78%) at T=3 and its highest efficiency (100%) at T=6. Family number 25 has a relatively constant efficiency level over time: 58.60%, 59.11%, 57.09%, 57.65%, 59.26%, and 56.41%. Finally, the trend of the technical efficiency level of family number 160 is opposite to that of family number 5.

TABLE 6.6. Efficiency Levels(%)

Efficiency    The Simple              The General Within
Intervals     Within %     T=1      T=2      T=3      T=4      T=5      T=6
100-90         1.75        1.75     2.34     2.92     1.17     2.34     2.34
89-80          1.17        2.34     1.17     5.85     5.85     5.85     0.00
79-70          6.43        6.43     6.43    29.23     8.77     5.85     4.09
69-60         20.47       16.96    13.45    33.33    42.11    13.45    10.53
59-50         43.86       46.78    33.33    27.49    38.01    29.24    17.54
49-40         23.98       22.81    33.92     1.17     4.09    37.43    34.50
39-30          2.34        2.92     9.36     0.00     0.00    10.53    25.15
29-20          0.00        0.00     0.58     0.00     0.00     0.58     5.85

[Figure: technical efficiency (%) of Family 5, Family 160 and Family 25]
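The kind of tabulation reported in Table 6.6 can be sketched as follows. This is a minimal illustration, assuming ten-point bands with the top band including 100%; how the original table treats values on a band boundary is not stated, so that choice is an assumption. The function name is hypothetical, and the sample values are a handful of efficiency levels quoted in Table 6.5, not the full sample of 171 farms.

```python
def interval_shares(efficiencies, lowers=(90, 80, 70, 60, 50, 40, 30, 20)):
    """Percentage of farms whose efficiency (%) falls in each ten-point band.

    A band with lower bound L covers [L, L + 10); the 90 band also includes 100.
    Values below the smallest lower bound are not counted in any band.
    """
    counts = {lower: 0 for lower in lowers}
    for value in efficiencies:
        lower = min(int(value // 10) * 10, max(lowers))  # 100 goes into the 90 band
        if lower in counts:
            counts[lower] += 1
    n = len(efficiencies)
    return {lower: 100.0 * count / n for lower, count in counts.items()}


# Illustrative values only (a few entries quoted in Table 6.5).
sample = [100.0, 94.39, 74.63, 66.82, 55.40, 50.11, 48.70, 41.73, 33.63, 27.93]
shares = interval_shares(sample)
for lower in sorted(shares, reverse=True):
    print(f"{lower}+: {shares[lower]:.2f}%")
```

With 171 farms, each farm contributes 100/171, or about 0.58 percentage points, which is the smallest nonzero entry in Table 6.6.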
6.5 Summary

The usual motivation for the use of panel data in the measurement of technical efficiency is the desire to avoid strong distributional assumptions for technical inefficiency. The simple panel data model was first used for this purpose by Schmidt and Sickles (1984), and can be viewed as replacing distributional assumptions by the assumption that technical inefficiency is time invariant. More recent work by Cornwell, Schmidt and Sickles (1990) and Kumbhakar (1990) has relaxed the assumption that technical inefficiency is time invariant, by allowing it to vary in specific ways. These papers impose a certain amount of smoothness in the pattern of technical inefficiency over time.

Our general panel data model is more flexible than previous models. It assumes only that the temporal pattern of technical inefficiency is the same over time for all firms. That is, the pattern of change must be the same for every firm, though the direction can be reversed for some firms and the extent of the intertemporal variation differs across firms. Our model nests the model of Kumbhakar, and our three-component model nests the model of Cornwell, Schmidt and Sickles. An interesting topic for future research is to use our model to test the restrictions imposed by their models on the smoothness of the intertemporal change in efficiency levels. Our empirical results for Indonesian rice farms may show more change over time in efficiency levels than is plausible, so that some smoothing may be desirable. If so, our general results are at the very least useful as a reference against which to test the models that impose such smoothness.

CHAPTER SEVEN

CONCLUSIONS

The usual motivation for the use of panel data in labor economics and in related areas is the desire to avoid potential bias caused by the omission of unmeasured individual characteristics from the regression equation.
For example, in a wage equation, individual "ability" (or "ambition") is usually unobservable, and may have an effect on wage. If so, the omission of ability from the regression will cause a bias in the estimation of the coefficients of those variables that are correlated with ability. The usual solution to this problem is to assume that ability (or more properly the effect of ability on wage) is time invariant and can therefore be captured by a time-invariant individual-specific effect. This leads to the so-called within estimator.

However, the assumption that the individual effects are time-invariant is very strong. In this thesis we have considered a model that weakens this assumption. In particular, we assume an unobservable time-invariant individual variable (such as ability), but we do not assume that its effect on the dependent variable is time invariant. Rather, we need only to assume that the effect of this variable on the dependent variable has the same temporal pattern for all individuals. Thus, for example, the effect of ability on wage may differ across the business cycle, or may display a trend, so long as it does so for all individuals. We estimate this temporal pattern along with the other parameters of the model.

We develop fixed-effects and random-effects treatments of our model. Fixed-effects treatments are relevant when the motivation for the use of panel data is bias reduction, as discussed above, while random-effects treatments are relevant when the motivation is efficiency of estimation. Our model is nonlinear, so estimation is more complicated than in the usual simple model. In both the fixed and random effects cases we propose a method of estimation, and we prove the consistency and asymptotic normality of the estimates. This is non-trivial since standard likelihood theory does not apply, due to the so-called incidental parameters problem (the number of unobservable effects increases with sample size).
We also propose asymptotically valid tests of the restrictions that reduce our model to the usual panel data model.

After the theoretical consideration of our model, we consider two applications. The first application is a model of the compensation of faculty members. Hamermesh (1990) estimated this model using the simple within estimator, while we use estimators based on our more general panel data model. We find that the individual effect does not have a constant effect on compensation: rather, its effect is larger in the second time period (1985-1986) than in the first (1979-1980). This seems plausible since the effect of ability or work habits on compensation may be larger when one has been in the job longer. The choice of model makes a modest difference to the results for the variables of main interest in the regression.

Our second application is to the frontier production function (efficiency measurement) problem. We analyze data on Indonesian rice farms previously analyzed by Erwidodo (1990) using the simple panel data model. Our model allows a much richer pattern of technical inefficiency over time than the simple model, which assumes constant technical inefficiency, and it therefore yields substantially different results than Erwidodo found.

A promising line of future research is to consider models that are intermediate between the simple model, in which individual effects have a time-invariant effect on the dependent variable, and our model, in which the temporal pattern of these effects is completely unrestricted. Kumbhakar (1990) has proposed one such model in the frontier production function setting, and our model can be used to test the specification of his model or of other similar models. It is obviously an empirical question how much flexibility of specification the data will typically support, and we hope to address this question in further applications.

FOOTNOTES

A distributional assumption for ε_it is necessary since the fourth moment of ε_it appears in the calculation of the covariance matrix of the estimator.

See Amemiya (1985), p. 105-114.

MG(Z ⊗ e_T) = [I - (I ⊗ e_T e_T')/T](Z ⊗ e_T) = (Z ⊗ e_T) - [Z ⊗ e_T(e_T'e_T)/T] = (Z ⊗ e_T) - (Z ⊗ e_T) = 0.

The simple between estimator has to exclude individual-invariant explanatory variables for the same reason that the simple within estimation cannot include time-invariant regressors. However, those regressors can be included in the general between estimator.

See Hsiao (1985), p. 131.

[equation illegible in the scan]

See Gallant (1985), p. 217-220.

Gallant originally defines [expression illegible in the scan] with e_it as the least squares residuals obtained from each univariate model when he discusses multivariate nonlinear least squares. See Gallant (1985), p. 149-150.

The number of degrees of freedom is one, so that the critical value is χ² = 6.63 at the 99% confidence level.