This is to certify that the dissertation entitled THREE PAPERS IN LABOR ECONOMICS presented by DAVID LARSON WETZELL has been accepted towards fulfillment of the requirements for the Ph.D. degree in Economics.

MSU is an Affirmative Action/Equal Opportunity Institution

THREE PAPERS IN LABOR ECONOMICS

By

David Larson Wetzell

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirement for the degree of DOCTOR OF PHILOSOPHY

Department of Economics

2002

ABSTRACT

THREE PAPERS IN LABOR ECONOMICS

By

David Larson Wetzell

The first paper demonstrates that the time-allocation generalization of labor supply can generate interesting and surprising predictions regarding labor supply behavior. These predictions follow from the assumptions made about input combinations available for final consumption. When goods cannot be produced strictly from time or money, either input could curtail the substitution effect.

The second paper shows that Korenman and Blackburn's (1994) finding of a ten-percentage-point drop in the Male Marriage Earnings Differential (MMED) over the years 1967-1988 is biased upwards. It presents evidence that the bias may be due to the imputation procedures used before 1976.
Then, it uses a residual-based trimmed estimator to remove the bias and construct a more consistent MMED series. The new series supports the conclusion that both changes in selection and married female labor force participation affect the MMED. The paper concludes that, while the MMED does not appear to have changed significantly as of the late 1980s, a larger share of the MMED has become attributable to selection.

The third paper surveys investigations into the causality of the MMED. Gray's (1997) comparison of the returns to years of marriage with the NLS and NLSY and Stratton's (2002) estimation of the return to years of marriage and cohabitation using the NSFH are reexamined with the assistance of a residual-based trim and then compared with other longitudinal studies' findings. Then, a qualitative comparison is made of recent MMED studies' findings on the prevalent direction of causality for the MMED. There appears to be evidence that being married had an effect on earnings before 1970 and during 1970-1979. However, more recent studies either support the selection hypothesis or fail to find evidence that the household division of labor enhances the husband's market productivity. Finally, a meta-analysis of cross-sectional estimates of the MMED across time and age groups is made. The meta-analysis finds a four-percentage-point decline in the MMED in recent years, after controlling for how the MMED rises with age.

Copyright by
DAVID L. WETZELL
2002

To My Grandparents, My Father and Mother and My Three Sisters: Chris, Cathy and Cassandra.

ACKNOWLEDGEMENTS

To acknowledge and thank everyone who has made it possible for me to complete my dissertation would take more space than the rest of the dissertation. I owe many thanks to my major professor, Dr. David Neumark. Thank you for your time, for being a role model of what it means to be a professional, and for much constructive criticism. My dissertation is far better than it would have been without your guidance.
I must thank Dr. Jeff Biddle for his time, feedback and support and for, on some occasions, taking the time to boost my spirits by treating me like a colleague and arguing with me. I also must thank Dr. John Strauss for his support and useful, clarifying comments on my papers and for a thoroughly enjoyable course on applied econometrics. I am thankful to Dr. Jeffrey Wooldridge for joining my committee and for bringing his expertise in econometrics to bear on my papers. I am thankful to my friend, Dr. Warren Samuels, for his encouragement and sound advice over these past five years. I admire and aspire after your love of knowledge.

I owe much to the faculty of the Michigan State University Economics Department. In particular, I am indebted to Drs. Jack Meyer, Steve Matusz, Stephen Woodbury, Rowena Pecchenino, Charles Ballard and many others. I am also thankful to Dr. Daniel Hamermesh for his suggestion to send my first paper for publication to Economics Letters.

I am thankful for the encouragement, advice and feedback from my friendships with David Smith, Jess Reaser and Cheng-Peng Cheng, who got their PhDs from MSU a year before and a year after my arrival. I am thankful for the comradeship of my more contemporaneous fellow PhD students: Daiji Kawaguchi, Scott Adams, Ali Berker, Vinit Jagdish, Chien-Ho Wang, Mao-Sheng Chen, Linda Bailey, Elda Pema, Paul Corrigan, Iva Petrova, Alina Luca, Jongbyung Jun and many others. I am also thankful for the encouragement from so many friends from all over whom I met while living at Owen Hall and being a student at MSU. I am thankful for Stacy Vatne, Oktay Unlu, Beyhan Asma, Emilio Ungerfeld and Paola Pastore and many others with whom I can look forward to staying in touch for the rest of my life.

I also must acknowledge my indebtedness to my friends and family who helped to lay the foundation for who I am and have supported me during my studies.
I am thankful for all of my teachers who encouraged me to think outside the box and ask insightful questions. I am thankful for the mentorship in what it means to be both a scholar and a humanitarian that I received while I was in college. I am thankful for the fellowship and acceptance of my home church. It is only through the unconditional love I received while growing up that I was able to persevere through personal adversities and complete my dissertation as I have. In the face of all these debts, I can only hope to be able to go out and do likewise along whatever route my life takes me.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Paper 1: On Some Unappreciated Implications of Becker's Time Allocation Model of Labor Supply
I. Introduction
II. The Model
III. A Generalized Proof of the Distinctive Features of the Becker Model
IV. An Investigation of the Properties of Labor Supply Curves with "Nice", or Constant Elasticity of Substitution, Preferences under the Becker Model
V. An Investigation of the Comparative Statics for Labor Supply
VI. Relevance of the Model
VII. A Reinterpretation of Classical Models of Labor Supply
VIII. Additional Predictions of the Model
IX.
How Would One Test this Theory against the Standard Theory
X. Conclusion
XI. Bibliography
APPENDIX A
APPENDIX B
APPENDIX C
APPENDIX D

Paper 2: On the Measurement and Explanation of Recent Changes in the Male Marriage Earnings Differential
I. Introduction
II. Review and Replication of Korenman and Blackburn's Findings
III. Explaining the Shift: The Misclassification Problem
IV. An Empirical Examination of the Impact of Misclassification with Inconsistent Information
- The Difficulty in Directly Examining the Impact of Misclassification
- Predicted Log-Earnings Levels by Marital Status
- Predicted Skew of Earnings Distribution Residuals by Marital Status
- Probability of Classification as Full-Time/Full-Year by Marital Status
- A Simulation of the Impact of Different Imputation Procedures
- A Reconciliation of the Different Tests for Misclassification
V. Calibrating the Percentages Misclassified and the Extent of the Bias
- The Results of the Calibration
VI. An Alternative Robust Approach to Estimating the MMED
VII. Econometric Analysis of Trimmed MMED Series
- Comparison of Findings for the Controlled and the Raw MMED Series
- An Investigation into the Nature of the Changes in Marital Selection
VIII. Conclusion
IX. Bibliography
APPENDIX A: Description of Replication of Korenman and Blackburn's MMED Series
APPENDIX B: Description of Additional Changes Made to Form the Alternative MMED Series
APPENDIX C: Description of Why the 1975 Trimmed MMED Is Considered an Outlier
APPENDIX D: Comparison of Accuracy of Prediction between the Median and Mean MMED 1967-1988 Time Series Regressions

Paper 3: Has the Male Marriage Earnings Differential's Causality Changed? A Historical Overview of the Literature
I. Introduction
II. Reinvestigation into Gray (1997)
III. Reinvestigation into Stratton (2002)
IV. A Comparison of the Findings of Longitudinal Studies
V. Earliest Studies
VI. Studies based on Data Collected during the Seventies
VII. Studies based on Data Collected during the Eighties/Nineties
VIII. Summary of Meta-Analysis of Cross-Sectional MMEDs
IX. Conclusion
X. Bibliography

LIST OF TABLES

Paper 2: ON THE MEASUREMENT AND EXPLANATION OF RECENT CHANGES IN THE MALE MARRIAGE EARNINGS DIFFERENTIAL
Table 1A: Descriptive Statistics for Replication of Korenman and Blackburn, Years 1967-1996
Table 1B: Summary of Comparison of Replication with Korenman and Blackburn's Reported Time Series Results
Table 1C: Results from the Regression of Korenman and Blackburn's Marriage Earnings Differential Series on the Replication Marriage Earnings Differential Series
Table 2A: Summary of Sample Statistics for Simulation of Early Imputation Procedures using Data from 1988's March CPS
Table 2B: Summary of Alternate Hot-Decking Procedure's Impact on the MMED using Data from 1988's March CPS
Table 3: Statistics Used for Calibration of Misclassification of Work-History
Table 4: Summary of Alternative MMED Series Descriptive Statistics
Table 5: Descriptive Regressions
Table 6: Summary of Time Series Regressions of 2.58 Cut-Off, Trimmed MMEDs on Descriptive Statistics
Table 7: Summary of Time Series Regressions of Trimmed Raw (Age Controls Only) MMEDs on Descriptive Statistics
Table 8A: Summary of Importance of Earnings for Probability of Being Married or Divorced/Separated for the Years 1967 and 1988
Table 8B: A Table Summary of Changes in Means and Dispersion of Probabilities of Being Married or Divorced/Separated for 1967 and 1988
Table D-1: Summary of Median Regression Assessments of Accuracy of Predicted Trimmed 2.58 Controlled MMEDs for the Years 1989-1994

Paper 3: HAS THE MALE MARRIAGE EARNINGS DIFFERENTIAL'S CAUSALITY CHANGED? A HISTORICAL OVERVIEW OF THE LITERATURE
Table 1: Summary of Trimmed Versions of Gray (1997)
Table 2A: Comparison of Descriptive Statistics Stratton (2002)
Table 2B: Summary of Reexamination of Stratton (2002)
Table 3: Summary of the Impact of Cohabitation
Table 4: Summary of Returns to Years Married from Longitudinal MMED Studies with Controls for Years Married
Table 5: Summary of Earliest Studies based on Data Collected before 1970
Table 6: Summary of MMED Studies based on Data Collected 1970-1980
Table 7: Summary of MMED Studies based on Data Collected 1980-1990
Table 8: Summary of MMED Studies based on Data Collected 1984-1998
Table 9A: List of Values Used for Meta-Analysis of Cross-Sectional MMED across Age Groups and Over Time
Table 9B: Summary of Meta-Analysis of Changes in Cross-Sectional MMED

LIST OF FIGURES

Paper 1: ON SOME UNAPPRECIATED IMPLICATIONS OF BECKER'S TIME-ALLOCATION MODEL OF LABOR SUPPLY
Figure 1: Possible Labor Supply Curve Shapes with Time Costs of Consumption and Pecuniary Costs of Leisure

Paper 2: ON THE MEASUREMENT AND EXPLANATION OF RECENT CHANGES IN THE MALE MARRIAGE EARNINGS DIFFERENTIAL
Figure 1: Comparison of Smoothed and Non-Smoothed Original and Replication of Korenman and Blackburn's Male Marriage Earnings Differential Series
Figure 2: Alternative Series for White Male Marriage Earnings Differential
Figure 3: Trends in Education Coefficients over the Years 1969-1981
Figure 4: Trend in White-Collar Coefficient over Time
Figure 5: Comparison of Predicted Mean Earnings in 1987 Dollars for Full-Time/Full-Year Workers by Marital Status
Figure 6: Probability Density Functions for Normal and Mixed Normal Distributions with parameters
Figure 7: Comparison of Average Skew of Residuals from Median Regression by Marital Status
Figure 8: Comparison of Probability Employed Full-Time/Full-Year by Marital Status for Employed Males
Figure 9: Comparison of Proportions of Sample Trimmed because Residual Was Too High or Too Low by Marital Status
Figure 10: Comparison of Trimmed with Non-Trimmed OLS Male Marriage Earnings Differential Series
Figure 11: Comparison of Trimmed OLS Male Marriage Earnings Differential Series
Figure 12: Independent Regressors Used in the MMED Time Series
Figure 13: Comparison of Observed and Predicted Male Marriage Earnings Differentials
Figure 14: Comparison of Observed and Predicted Male Marriage Earnings Differentials (Age Only Controls)
Figure 15: Predicted Probabilities of Being Married by Age and Log-Earnings Year 1967
Figure 16: Predicted Probabilities of Being Married by Age and Log-Earnings Year 1988
Figure 17: Probability of Being Divorced/Separated Conditional on Having Been Married Year 1967
Figure 18: Probability of Being Divorced/Separated Conditional on Having Been Married Year 1988
Figure C-1: Comparison of Average Heteroskedastic Standard Deviations by Marital Status
Figure C-2: Trends in the Proportion of Observations by Marital Status with Imputed Earnings Last Year

Paper 3: HAS THE MALE MARRIAGE EARNINGS DIFFERENTIAL'S CAUSALITY CHANGED? A HISTORICAL OVERVIEW OF THE LITERATURE
Figure 1: Comparison of Predicted and Observed Trimmed MMED
Figure 2: Comparison of the Observed MMED and DSED Series for Years of Survey 1991-96

PAPER 1

On Some Unappreciated Implications of Becker's Time Allocation Model of Labor Supply

Introduction

Economists have largely overlooked some of the implications of Gary Becker's generalization of labor supply theory. Becker's (1965) paper considers time allocation across many goods and primarily analyzes the comparative statics of changes in the efficiency of work-place production and household consumption or production. In his paper, Becker does suggest that an additional implication of his theory would be that an increase in the wage could induce a negative substitution effect.
However, Atkinson and Stern (1979) show the impossibility of a negative substitution effect. Following this proof, the consensus in economics, as articulated in Mark Killingsworth (1983), has been that the time-allocation model's inclusion of time costs of consumption and a pecuniary price of leisure is useful as a household production model in which the comparative statics of changes in household production technology on market and non-market work can be examined. The model was not, however, perceived as adding anything fundamentally new to our understanding of labor supply.

This consensus does not take into consideration the possibility that Becker's assumption that all final goods require inputs of time and intermediary inputs purchased with money may permit one of the input constraints to dominate the maximization problem. This can only take place if time and money (or intermediary inputs) are not completely substitutable for each other in the production of goods for final consumption. Otherwise, Killingsworth is accurate in his assessment that the inclusion of time costs for consumption and pecuniary prices for leisure does not affect the basic predictions for labor supply. Money and time are completely substitutable for each other in production when, at any feasible ratio of time and money used, it is possible to substitute more of one input for the other, as is assumed in household production models with Cobb-Douglas household production functions.

The assumption of complete substitutability of inputs in the household production function may be a nontrivial assumption for labor supply, particularly in situations of poverty. It assumes there are no limitations on how time and money can be combined to produce utility. This assumption about technology is not easily amenable to direct testing, since we only observe a subset of the feasible inputs at any point in time and what is observed is subject to what is rational.
Conversely, excluding the theoretical possibilities of pure consumption and leisure does not assist in ascertaining what is the case empirically, especially since what is feasible is subject to change over time. However, the difficulty in pinpointing what is possible at any point in time does not affect the qualitative implications for labor supply of theoretical restrictions as to what is possible.

It should be acknowledged that the observation of individuals willing to supply a significant positive number of hours of labor to the market at lower wages is also consistent with the implications of fixed-costs of work models. Fixed costs of work are costs that must be borne if and only if one decides to supply a positive number of hours to the labor force. A good overview of fixed-costs of work models is given in Killingsworth (1983) on pages 23-28. Both models set out situations where cost constraints make working lower levels of hours infeasible or undesirable. The major difference is that, while a fixed-costs of work model leaves the option to choose not to work¹ as a theoretical possibility, the time allocation model potentially removes it from the locus of rational labor supply responses to wage offers.

The only other paper that appears to have dealt with the implications of Gary Becker's time allocation model for labor supply is Stern (1986). In his comprehensive review of properties of different functional forms for labor supply, Stern graphically illustrates that, with a CES (Constant Elasticity of Substitution) utility function and a negative unearned income, leisure can "become a normal good for individuals with high hours of work" (pp. 162-163). Stern also proves for LES (Linear Expenditure System) utility with positive time costs of consumption that the hours worked will go to zero as the wage goes to infinity. There also may be no positive wage where the hours worked is zero if the time to money endowment ratio is too high (pp. 186-187).
This is consistent with the results found here. However, Stern does not explore this possibility any further. He does not point out how this property coincides with his earlier finding in Atkinson et al. (1981), where a direct incorporation of Becker's theory led to the finding of a similar nonlinearity in labor supply. Atkinson et al. (1981) find that, while the estimated labor supply curve was largely negatively sloped for British workers, it becomes positively sloped for higher wages. Stern also does not prove his results under a generalized set of assumptions about utility and the production process.

¹ Work is defined, here, as all activities engaged in primarily for pecuniary gain.

The Model

This paper distinguishes its model from existing versions of Becker's theory of time allocation by allowing the set of feasible time-money ratios for the production of goods for final consumption to be any subset of the positive real numbers. Attention is given to the cases where the extreme values of the set of feasible input ratios are not positive infinity and zero since, in these cases, any additional consumption will always require additional inputs of both time and money. To simplify the presentation of the potential complexity introduced by expanding the production functions considered, and to emphasize that this is a labor supply model, the household production (or individual time-allocation) problem is rewritten as

\[ \max_{L,\,C,\,H} \; U(L, C) \]

subject to

\[ t_L(w)\,L + t_C(w)\,C + H = 1 \]
\[ p_L(w)\,L + p_C(w)\,C = w\,H + A \]

with the usual non-negativity constraints for $L$, $C$ and $H$. Here, an individual's time and income for a period are decomposed into the amounts used for Leisure ($L$), Consumption ($C$) and Work ($H$). Rationality in production makes the amounts of time and money spent producing a unit of Consumption and Leisure, or $t_C(w)$, $t_L(w)$, $p_C(w)$ and $p_L(w)$, functions of the wage offer.
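As a sketch of how the two constraints interact, the time constraint can be used to eliminate $H$ from the budget constraint. The rearrangement below is offered only as an illustration: it suppresses the dependence of the cost terms on $w$, and the second line additionally assumes that those terms stay bounded as the wage changes.

```latex
% Substitute H = 1 - t_L L - t_C C (time constraint) into the budget constraint:
\[
  (p_L + w\,t_L)\,L + (p_C + w\,t_C)\,C = w + A .
\]
% Divide through by w > 0:
\[
  \Bigl(t_L + \frac{p_L}{w}\Bigr) L + \Bigl(t_C + \frac{p_C}{w}\Bigr) C = 1 + \frac{A}{w} .
\]
```

Read this way, each good carries an effective price made up of its money cost plus its time cost valued at the wage. For large $w$ the second line approaches $t_L L + t_C C = 1$, which is the time constraint with $H = 0$; for small $w$ the first line approaches $p_L L + p_C C = A$, which is the budget constraint on unearned income alone.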
The requirement of time and money inputs for all final goods guarantees that the above time and money costs are positive.² The positive time and money costs for both Consumption and Leisure permit one constraint to dominate if the agent is unable to substitute between the inputs in the production of any good. Hence, in the proofs below, the distinctive prediction that technology will dominate preferences in determining the slope of labor supply curves can only be made when the wage offer approaches zero or infinity.

² It is shown under general assumptions in Appendix A that, when pure leisure and pure consumption are excluded as possibilities for time allocation, time inputs and money inputs for consumption and leisure will always be positive. In other words, the limits as $w \to \infty$ or as $w \to 0$ of $t_C(w)$, $t_L(w)$, $p_C(w)$ and $p_L(w)$ will be positive.

A Generalized Proof of the Distinctive Features of the Becker Model

Backward Bend: A proof that labor supply curves will be negatively sloped for higher wage levels consists of making the simple observation that, as the wage gets large, the budget constraint ceases to bind. The maximization problem becomes subject only to the time constraint. Thus, if $\lim_{w \to \infty} t_C(w),\, t_L(w) > 0$, then the hours of work supplied to the labor force will be zero asymptotically. This guarantees that the labor supply curve will bend backwards at some point if, at any wage level, it was upward sloping. If $\lim_{w \to \infty} t_C(w) = 0$, as is the case when the inputs into household production are assumed to be completely substitutable, then, as in the standard case, we can no longer sign the slope of the labor supply curve when the wage offer gets large. It depends upon preferences.

Forward Bend: A generalized proof can also be made for when the labor supply curve will be negatively sloped as the wage gets small.
It requires that $U(L, C)$ is concave with both $L$ and $C$ normal, while $U_L(0, C) = \infty$ and $U_C(L, 0) = \infty$, so as to ensure that the first order conditions will hold. To simplify the algebra, the above average cost equations can be replaced with constants representing their limits: $\lim_{w \to 0} p_L(w) = p_L^0$, $\lim_{w \to 0} t_L(w) = t_L^0$, $\lim_{w \to 0} p_C(w) = p_C^0$ and $\lim_{w \to 0} t_C(w) = t_C^0$. As the wage rate approaches zero, either the budget or the time constraint will dominate the maximization problem. So utility is maximized with respect to $p_L^0 L + p_C^0 C = A$ or $t_L^0 L + t_C^0 C + H = 1$. Here we assume that $A > 0$. Since workers are indifferent to hours worked, ceteris paribus, when the budget constraint dominates the maximization problem the hours worked will become a slack variable in the time constraint. This guarantees that $t_L^0 L + t_C^0 C + H = 1$ as the wage approaches zero. If the reservation wage is equal to zero, as is the case when unearned income is insufficient to consume the entirety of the time endowment in the most time-intensive activity ($p_L^0 > A$), then it follows that a positive level of hours will be supplied as the wage approaches zero. This indicates that the budget constraint will dominate as the wage approaches zero. Hence, an increase in the wage, by increasing income, will increase both Consumption and Leisure since they are normal goods. However, since the time constraint is necessarily binding, it also follows that $\lim_{w \to 0} \partial H / \partial w < 0$. Otherwise, if the reservation wage, $w_R$, is positive, then $\lim_{w \to w_R} H = 0$ and the labor supply curve must necessarily be upward sloping as it approaches the reservation wage. If $A < 0$, then the proof is similar to the above proof that the labor supply curve is negatively sloped for higher wages.

An Investigation of the Properties of Labor Supply Curves with "Nice", or Constant Elasticity of Substitution, Preferences under the Becker Model

It is well established that, under homothetic preferences, labor supply curves in the standard model can be monotonically upward-sloping or backward-bending. The set of possible labor supply curve shapes, maintaining the assumption of nice preferences, changes when all consumption requires positive time and money costs. In this framework, the utility maximization problem becomes

\[ \max \; \Bigl[ \bigl(\beta \cdot L^{\varepsilon - 1}\bigr)^{1/\varepsilon} + \bigl((1 - \beta) \cdot C^{\varepsilon - 1}\bigr)^{1/\varepsilon} \Bigr]^{\varepsilon/(\varepsilon - 1)} \]

subject to

\[ L + t_C\,C + H = 1 \]
\[ p_L\,L + C = w\,H + A \]

with hours supplied

\[ H(\beta, A, w, p_L, t_C) = \frac{(1 - \beta)(w + p_L)^{\varepsilon} + \beta\,(p_L - A)(1 + w\,t_C)^{\varepsilon}}{(1 - \beta)(w + p_L)^{\varepsilon}(1 + w\,t_C) + \beta\,(w + p_L)(1 + w\,t_C)^{\varepsilon}} \]

The parameters $t_C$ and $p_L$ represent the average time cost of consumption and the average money cost, or pecuniary price, of leisure, respectively. In the standard labor supply model, these parameters are set equal to zero. The features that determine possible shapes when there are positive time costs to all consumption are the degree of substitutability between Consumption and Leisure, $\varepsilon$, and whether unearned income, $A$, is sufficient to provide for consumption of leisure for the entirety of the time endowment, $p_L$. The impact of the two factors on the possible shapes of the labor supply curves is examined below. The thinner curves reflect the case when $\varepsilon = 7$, the thicker curves, $\varepsilon = 5$. The smooth and discontinuous curves illustrate the respective cases where $A = .6 > p_L$ and A=.20.

By combining these partial derivatives with the above Euler Conditions, it is trivial to show that the average time cost of consumption, $t_C(w)$, will always be positive. Similarly, if TL = g [.1 and fi L > O, the production function ($L$ in this case) will be locally characterized with partial derivatives, LPG and L.=O. As before, there will always be a positive pecuniary cost of leisure. This implies that there will remain positive pecuniary costs to leisure and consumption, $p_L(w), p_C(w) > 0$, as the wage approaches zero, assuming some production is still possible or $A \ge 0$. It also implies that there will remain time costs to any consumption, $t_C(w), t_L(w) > 0$, as the wage approaches infinity.
Appendix B

Labor Supply Curves under Constant Elasticity of Substitution Preferences with fixed time costs for Consumption and money costs for Leisure will have at most 2 local optima or bends.

To establish that the taxonomy of shapes for labor supply curves is exhaustively examined in this paper, we look at how many bends can exist for a labor supply curve. To study these bends, we solve for the equilibrium hours supplied...

$$H(\beta, A, w, p_L, t_C) = \frac{(1-\beta)(w+p_L)^{\varepsilon} + \beta (p_L - A)(1 + w t_C)^{\varepsilon}}{(1-\beta)(w+p_L)^{\varepsilon}(1 + w t_C) + \beta (w+p_L)(1 + w t_C)^{\varepsilon}} \qquad (1)$$

By differentiating equation 1 with respect to the wage, setting the slope equal to zero, and simplifying by removing terms such as the denominator of equation 1 and multiplying both sides by -1, it can be established that

$$\frac{\beta}{1-\beta}(p_L - A)\left(\frac{1 + w t_C}{w + p_L}\right)^{\varepsilon} + \frac{1-\beta}{\beta}\, t_C \left(\frac{w + p_L}{1 + w t_C}\right)^{\varepsilon} - (1 - p_L t_C)\left(\frac{\varepsilon w}{w + p_L} - \frac{\varepsilon w t_C}{1 + w t_C} + \frac{\varepsilon A}{w + p_L} - \frac{1 + (p_L - A) t_C}{1 - p_L t_C}\right) = 0 \qquad (2)$$

is the algebraic expression for the existence of “bends” in the labor supply curve. To simplify the above expression even further, we substitute $u = \frac{1 + w t_C}{w + p_L}$ into the expression to get...

$$\frac{\beta}{1-\beta}(p_L - A)\,u^{\varepsilon} + \frac{1-\beta}{\beta}\, t_C\, u^{-\varepsilon} + \varepsilon\left(\frac{t_C}{u} - (A - p_L)\,u\right) - (\varepsilon - 1)\left(1 - (A - p_L)\,t_C\right) = 0 \qquad (3)$$

It is important to recognize in the above expression that $1/p_L > u > t_C$ for all $\infty > w > 0$, and that $\frac{\partial u}{\partial w} = -\frac{(u - t_C)^2}{1 - p_L t_C} < 0$. The left-hand side of equation 3 allows us to detect whether the labor supply curve has local extrema, that is, whether the curve bends or changes its direction for a change in the wage offer. Because equation 3 was obtained by multiplying the numerator of the slope by -1 and dividing by positive terms, the slope of the labor supply curve has the opposite sign of the left-hand side of equation 3. If the left-hand side of equation 3 is always positive, then the labor supply curve will be monotonically negatively sloped. The left-hand side can be guaranteed to be positive, since $u > 0$, if unearned income is insufficient to cover the cost of consuming leisure all day long, or $A - p_L < 0$, and consumption and leisure are not strong substitutes, or $\varepsilon - 1 < 0$.
Alternatively, the potential number of bends in the labor supply curve can be limited to one or two if the second or third derivative of equation 1 is signable for all values of $w$. This can also be proven with the first and second derivatives, with respect to $u$, of the left-hand side of equation 3, since the terms removed were all positive, $u$ is monotonic in $w$, and our interest is only in the signs of the derivatives for the labor supply curves. If $A - p_L > 0$ then the first derivative of the expression on the left-hand side of equation 3,

$$\varepsilon\left(\frac{\beta}{1-\beta}(p_L - A)\,u^{\varepsilon-1} - \frac{1-\beta}{\beta}\, t_C\, u^{-\varepsilon-1} - \left(\frac{t_C}{u^2} + A - p_L\right)\right) \qquad (4)$$

is always negative. This signifies that there can be at most one bend in the labor supply curve if there is sufficient unearned income to consume leisure for the entirety of the time endowment.

If $A - p_L < 0$ and $\varepsilon > 1$ (the case when $\varepsilon < 1$ was handled earlier), then the first derivative of equation 4,

$$\varepsilon\left((\varepsilon-1)\frac{\beta}{1-\beta}(p_L - A)\,u^{\varepsilon-2} + (\varepsilon+1)\frac{1-\beta}{\beta}\, t_C\, u^{-\varepsilon-2} + \frac{2 t_C}{u^3}\right) \qquad (5)$$

is always positive. This implies a maximum of two bends in the labor supply curve when there is not sufficient unearned income to consume leisure for the entirety of the time endowment and consumption and leisure are substitutable. This rules out the possibility of more than two bends existing in a labor supply curve. The existence of positive time costs to all consumption guarantees that a labor supply curve will be negatively sloped as the wage offer gets large. This, in conjunction with the maximum of two bends, determines the basic shapes a labor supply curve can take.

Appendix C

An Increase in Unearned Income will eventually make Labor Supply Curves Positively-Sloped for Low Wages.

The main proof showed that if unearned income is insufficient to provide for spending the entirety of the time endowment in consuming the most time-intensive bundle of goods available, then labor supply will be negatively sloped as the wage approaches zero. The converse of this is not true.
If consumption and leisure are not substitutable, or $\varepsilon - 1 < 0$, then the labor supply curve can still be monotonically decreasing even when unearned income, $A$, is greater than the cost of leisure, $p_L$. However, it can still be shown that increasing unearned income does eventually make the labor supply curve positively sloped for some lower wage offers. If we look at the numerator of the first derivative of the labor supply equation,

$$-\beta^2 (p_L - A)(1 + w t_C)^{2\varepsilon} - (1-\beta)^2\, t_C\, (w + p_L)^{2\varepsilon} + \beta(1-\beta)(w+p_L)^{\varepsilon}(1 + w t_C)^{\varepsilon}(1 - p_L t_C)\left(\frac{\varepsilon w}{w+p_L} - \frac{\varepsilon w t_C}{1 + w t_C} + \frac{\varepsilon A}{w + p_L} - \frac{1 + (p_L - A)\,t_C}{1 - p_L t_C}\right) \qquad (1)$$

then, since the denominator is always positive and independent of unearned income, we can prove the above by showing that as $w \to 0$ there exists an $A$ where the above will be positive, or that there exists an $A$ where...

$$-\beta^2(p_L - A) - (1-\beta)^2\, t_C\, p_L^{2\varepsilon} + \beta(1-\beta)\,p_L^{\varepsilon}(1 - p_L t_C)\left(\frac{\varepsilon A}{p_L} - 1 + \frac{(A - 2p_L)\,t_C}{1 - p_L t_C}\right) > 0 \qquad (2)$$

In the above, the two terms that are negative do not include $A$. The remaining terms are increasing linearly in $A$. Since the negative terms are finite, there will exist an $A > p_L$ so that the labor supply curve will be positively sloped for very low wages.

Appendix D

The Comparative Statics of how the Exogenous Parameters affect the Shape of Labor Supply Curves.

Let $w_b$ be the wage at which the labor-supply curve bends backwards, or shifts from positively sloped to negatively sloped as the wage increases, and $w_f$ be the wage where the curve bends forward, or similarly shifts from negatively sloped to positively sloped. Based on the previous proof, we know that both bends can exist only when $\varepsilon - 1 > 0$ and $p_L - A > 0$. Thus, we need to be able to differentiate between the two bends. This can be done by examining the second order conditions or by totally differentiating the above first order condition with respect to wage. Doing this gives us...
13 5—1 51 1‘5 - -151 (l) £‘(1_fl(A-pL)'" $+TICH 8 54—3-1" 1C 611 (—2-uaw+(A- pflg’w- —-)) _ 2 , or after substituting in for El: -£:’——tC—)—, 0w l—thC 2 (l—flwg Imam-t )2“..- --pL).6+(1 m-Ct u-l 8) U ' I (1- mfla— PL C) Since w, is a local maximum, the second derivative should be positive and (A — pL )fl +(1- fl) - 1C 41—1—8 > 0. Whereas, since w is a local minimum, (A - p L )6 +0 — ,3) - 1C 41-1—8 < 0. These inequalities permit us to distinguish between the two bends in the following proofs of the comparative statics effects of changes in exogenous parameters. 24 Here we totally differentiate the first-order condition with respect to A to find 8’“i; 19.”; 77A" 6A ' (1) TLBTuE—E-ing-(PL -z‘1)11E lg—’-%"j+87g1CN—E_ g’W—L—g-WZ —O +8(u—C2311L6A+(A_ pL)S’:——’- 571'— 11)+(l—£)IC Ol' aw:fl(ll£fl+(1-fl)£'ll-(1-fl)(g_1)tC)/_al__l (u-ufl +1, .6) 5((A- p1)" 1p+(1- 3) 1C 11 ) The numerator will be negative since the second term in the parenthesis, (l-B) u, is greater than the third term (l-B)( -1) re and Slim. Then, since u — ufl + 118/3 >0, we can sign the derivative by signing (3) (A—pL)u-lfl+(l—fi)-IC .u‘z’g. Since equation 3 will be positive for wb and negative for Wf by the second order f conditions, it follows that —Q >0 and <0. 6A 0A Qualitatively, this implies that the length of the portion of the labor-supply curve that is upward-sloping is positively related to the level of unearned income. Even if to, p1, are set equal to zero this still holds true for the backward bend. That is labor supply curves will still bend backwards at higher wage levels when an individual's unearned income is increased. This indicates that the elasticity of labor supply can vary with exogenous changes in wealth. 
Now we totally differentiate the first-order condition with respect to p1, to find 9:1, ”I, 6"’L L _.6 £__fl__ _ 618116111 1_—_,3_ —£-1_@_u__iw_ (1) Wu 81-33(‘01- A)" 5W5117+£ fl (CH aWapL -0 t _ a: 6W 81 61 +e(-1l§2-73-$a7L-+(A—pL)5—w’—ap:——u)+(1—e)zC 01’ (2) aw = —,6(u£,6+(l—fl)c.u-(l—fl)(c—1)t 11C)/5'—’ 51% (u-u,6+u£fl)-£((A—pL)u-1,6+(1—fl)-1C-u 2"8) The above expression is the opposite of the partial derivative with respect to unearned income. This gives the implication that changes in the pecuniary price of leisure has a similar, but opposite, effect to changes in unearned income. Hence, anything that affects increases the price of leisure, or the minimum cost of living, would have a similar impact for labor supply as a drop in unearned income. 26 6w Now we totally differentiate the first-order condition with respect to tc to find a: b , C W aif O C ,6 8-1 011 aw 1‘fl -8 1‘16 ‘6—1 611 5W — — A -— — — + t —— (1) 5173001 1" 5w 3? 13 " 57‘ c" aw 61C _0 t _ _ _g(%%igV—+(A-pL)%§L-N ')+(1—e)0 which is more likely to be the case for the forward bend then the numerator can be signed since [3 u"> (l-B) u — g and the numerator would be lessthan fl-u" -,B-e-u" +fl(£— l)(A — pL)<0 if >1. Then, since g—L <0, we have four negatives which makes a positive. Hence a decrease in the time cost of consumption, when consumption and leisure are substitutable, will make the wage level at which labor supply becomes positively sloped become lower. The rest of the cases do not show any predictions that are independent of tastes. There does no appear to be a clear impact on the point at which a labor supply curve bends backwards by a change in the time cost of consumption. 
Now we totally differentiate the first-order condition with respect to $\beta$ to find $\partial w_b / \partial \beta$ and $\partial w_f / \partial \beta$: (1-13)2(PL A)ug ‘57(pL—A)ug 1%%+1—5%u 8+ ' (I) :0 51-361 u 81%Efi-E(§%%_WL_A)%QP 01' 2 2 2 2 a: (2) QW—z 11 ((A-PL)" £,fl +(l-fl) 'tC)/€wl— fl -fl)/3'

If we assume that $u > 1$, $\beta > 1/2$, and $\varepsilon > 1$, this represents the forward bend occurring for the lower range of wages among individuals with a preference for leisure but a fair amount of substitutability between leisure and consumption. Based on these assumptions we can show that the denominator is positive since $u - u\beta - u^{\varepsilon}\beta < 0$. However, for the forward bend $(1-\beta)\,t_C < u^{\varepsilon+1}\beta\,(p_L - A)$, and so it follows that the term in the numerator, $(A - p_L)\,u^{2\varepsilon}\beta^2 + (1-\beta)^2\, t_C$, is less than $(A - p_L)\,u^{2\varepsilon}\beta^2 + (1-\beta)\,u^{\varepsilon+1}\beta\,(p_L - A)$, or $\beta\,(p_L - A)\,u^{\varepsilon+1}\left((1-\beta) - u^{\varepsilon-1}\beta\right)$. Given the assumptions made earlier, the above upper bound can be shown to be negative.

Thus, the numerator will also be positive and, for this particular case, the wage at which the labor supply curve bends forward will decline with a shift in preferences toward less time-intensive consumption. Qualitatively, this implies that leisure-loving, low-wage individuals who behave consistently with locally negatively-sloped labor supply curves may be induced to change their behavior if their preferences are altered to favor more money-intensive consumption. Otherwise, there does not appear to be any obvious impact of a change in preferences on the shape of the labor supply curve.

PAPER 2

On the Measurement and Explanation of Recent Changes in the Male Marriage Earnings Differential

I. Introduction

Korenman and Blackburn (1994) claim that the Male Marriage Earnings Differential (MMED)[7] declined by ten percentage points between the years 1967 and 1988[8]. This goes against Goldin's (1990, p. 102) claim that the MMED has remained "virtually stable" over time.
However, inasmuch as the causal explanations for the MMED include increased productivity from being married, selection into marriage based on earnings ability[9], and employer discrimination, one would expect changing patterns in the division of labor in the household and marital selection to impact the MMED. The ten percentage point decline in the MMED, according to Korenman and Blackburn, should allow us to test between the specialization and selection explanations for the existence of the MMED. But their time series analysis is unfortunately flawed since there is a significant discontinuity in their MMED series. One very likely explanation for the discontinuity is that the pre-1976 imputation (or "hot decking") procedures bias the earlier estimates of the MMED upwards. This data problem leaves the question of whether the MMED fell unanswered.

[7] The Male Marriage Earnings Differential (MMED) is the adjusted average difference in earnings between married and never-married males. Korenman and Neumark (1991) describe the MMED as being in the 10-40 percent range and argue that it accounts for about one-third of estimated gender-based wage discrimination in the United States (e.g. Neumark 1988).
[8] Korenman and Blackburn’s claim is based on their successive, cross-sectional estimates of the MMED for white males between 25-54 from the March Current Population Survey (CPS).
[9] This also includes personal characteristics valued both in the labor and marriage market.

Fortunately, a flaw in the earlier imputation procedures is identifiable and correctable. It causes a number of males with unusually low earnings levels to appear among never married, full-time/full-year workers, thereby generating outliers. To remove the outliers’ influence, the paper proposes an innovative form of a trimmed mean regression designed especially for multivariate regressions. The process of trimming the mean provides an alternative MMED series for the years 1967-1996.
The trimmed MMED series permits a reexamination of Korenman and Blackburn’s time series regression. One can tell whether changing patterns in marital status demographics[10] and married female labor force participation (MFLFPR) are correlated with the MMED. The paper finds a significant correlation between both marital selection and MFLFPR statistics and the trimmed MMED for the years 1967-1988. Consistent with the specialization hypothesis, there is a negative correlation between the married female labor force participation rate (MFLFPR) and the MMED. However, contrary to the expectations of Korenman and Blackburn (1994), a decline in the percentage of males ever married in the sample raises the MMED. Similarly, the rise in the percentage of divorced/separated males in the sample is correlated with an increase in the MMED. The joint movements of the MFLFPR and the marriage demographic statistics tend to cancel out each other’s influence on the MMED. Thus, while the MMED may not be immutable to the changing division of labor in the household[11], the countervailing influence of changing patterns in marital selection helps to make the MMED appear “virtually stable.”

[10] Percent ever married and divorced/separated among the males included in the trimmed MMED sample.
[11] The regression predicts that the MMED in 1988 and beyond is due mostly to selection.

II. Review and Replication of the Korenman and Blackburn Findings.

Korenman and Blackburn investigate the MMED using two data sources, the March Current Population Surveys (CPS) and the 1970 and 1980 Censuses. The ten-percentage point decline in the MMED is based on a comparison of the cross-sectional MMED from the 1968 and 1989 March CPS data sets. The samples used to construct the earnings differentials are all twenty-two March CPS data sets collected from the years 1967-1988. The samples consist of white, full-time/full-year workers between the ages 25 and 54.
Korenman and Blackburn’s dependent variable is the log of earnings from last year divided by 2080. To establish the decline over time, Korenman and Blackburn present a series of figures with the smoothed estimates of the MMEDs and DSEDs[12] for black and white males[13]. The smoothed series in Figure 1 typifies the smoothed series Korenman and Blackburn (1994) show. The smoothed series all show a downward trend.

Figure 1 shows the closeness of the replication to the original MMED series[14]. Table 1 confirms the closeness of the fit. The reported mean and standard deviation of the original series are .216 and .034. The corresponding statistics for the replication are .213 and .032. At the bottom of Table 1, a regression of Korenman and Blackburn’s MMED series on the replication series shows that the replication differentials accurately predict Korenman and Blackburn’s differentials with an R2 very close to one.

[12] DSED is Divorced/Separated Earnings Differential.
[13] Their series include both raw and controlled MMED and DSEDs. Raw earnings differentials include only age dummies along with marital status controls. They smooth with a running three-year average.
[14] See Appendix 1 for notes on sample construction of the replication series.

Korenman and Blackburn try to construct a consistent measure of the MMED for each year of data[15] that would accurately reflect the variation over time. But, since changes in marital selection and the married female labor force participation occur gradually[16], one would expect the same to be true for the MMED series. Hence, a way to judge whether a series is consistent is by its continuity. A discontinuity in a coefficient series, or an outlier, could indicate that some important differences in the data collection procedures still remain.

Figure 2 shows the importance of looking for discontinuities in any coefficient series. The alternative smoothed series[17] assumes a discontinuity between the 1974 and 1975 years of the survey[18].
The assumption of a discontinuity appears accurate given the semi-smoothed and non-smoothed series’ agreement for these two years. The MMED declines by seven percentage points between the years 1974 and 1975. The discontinuity in the series suggests that some differences remain between the pre-1976 and the post-1976 March CPS data sets[19].

[15] To do this, Korenman and Blackburn restrict the sample for each year to males employed full-time/full-year during the preceding year. This is necessary since, for the 1968-1975 March CPS data sets, the hours and weeks worked last year are not recorded. Thus, the standard CPS wage cannot be used as a dependent variable.
[16] See figure 7 for a depiction of the changes in the MFLFPR and the percentages ever married and divorced/separated in the sample. A significant exception to this is in the year 1981 during a short, yet severe recession.
[17] See appendix 2 for details of how the alternative series is constructed.
[18] As reported in the 1975 and 1976 March CPS data sets.
[19] One way to check whether the discontinuity is likely to be due to differences across the data sets is by checking whether the discontinuity is robust across additional series of coefficients from the same set of regressions. In figures 3-4, similar discontinuities between 1974 and 1975 appear to be present in both the higher education dummy-variable coefficient series and the white-collar occupation dummy-variable series. Curiously, the higher education coefficients seem to rise. This seems to suggest that the outlier low observations biasing the MMED upwards are concentrated among never married males with higher levels of education. These could be “over-educated” males who are unable to find full-time/full-year employment.

III. Explaining the Shift: The Misclassification Problem

As it so happens, the March CPS imputation procedures underwent some important changes between 1975 and 1976. According to Ed Welniak (1990), before the 1976 March CPS, the CPS imputation procedures did not use education, marital status, current labor force status of self and spouse, number of children, region, and type of residence to impute missing earnings, work experience, and longest job information. Instead, the early CPS imputation procedures look only at a person’s relationship to the head of the household, gender, race and age. Also, early CPS imputation procedures impute only those characteristics respondents do not report. A respondent who reports his earnings from last year but not how many hours or weeks he worked last year would keep his original earnings while being assigned a work history. Keeping the original information allows for inconsistencies in a respondent’s work history. From the 1976 March CPS on, more attention was given to ensuring consistency between earnings and work history[20].

If the changes in the imputation procedures were behind the discontinuity in the MMED series, with the later imputation procedures being more accurate, then the earlier imputation procedure would have biased the MMED upwards. An upward bias is not consistent with the omission of marital status in earnings imputation in the early imputation procedures, since the omission of marital status from the imputation of earnings would bias the MMED downwards[21].

[20] Based on the description of the early imputation procedure by Spiers and Knott (1969), it seems like Earnings-Work Consistency Edits were designed to ensure that non-workers with no income were not classified as employed. Also, a post earnings imputation consistency edit was only run for persons who had at least one type of earnings imputed. This means that part-time or part-year workers who reported all of their income but not their work-status could be assigned a full-time, full-year work-status but keep their reported earnings.
The omission of marital status is compensated somewhat by the inclusion of headship status in the imputation procedures, since, for males, marital status and household headship are strongly correlated[22]. Among the data collection changes, the possibility of inconsistency of information between work-status and earnings (from here on referred to as misclassification with inconsistent information) is more likely to contribute to the upward bias in the MMED.

The impact of misclassification with inconsistent information on the MMED can easily be modeled with a simple equation. To simplify the analysis, marital status and employment status can be reduced to two groups each[23]. Let $M$ represent whether an individual is married. Let $I_p$ represent those misclassified as full-time with earnings that reflect their true work-status. Similarly, the MMED can be assumed to not differ by employment status[24]. With these assumptions, the log-earnings equation, conditional upon being classified as a full-time worker, becomes

$$Y_i = \beta_{0f} + \beta_{1f} M_i + \alpha_0 I_{p,i} + \varepsilon_i, \qquad \alpha_0 < 0,\ \beta_{0f} > 0,\ \beta_{1f} > 0.$$

[21] This sort of bias reflects the findings of much recent work on imputation, such as Lillard et al. (1986) and Hirsch and Schumacher (2000). These papers focus on how earnings imputation can bias the coefficients on characteristics that were excluded from the “Hot-Decking” procedures.
[22] One can test the correlation with simple summaries for each year of the 1970-1975 March CPS data sets of the percentages classified as head of household by marital status and whether or not full-time/full-year employed. The summary shows that, for never married males, from three to five percent of part-year or part-time workers in the sample are heads and eight to ten percent of full-time/full-year workers are heads. For married males in the sample, ninety-seven percent of part-time or part-year males are heads and ninety-nine percent of full-time/full-year males are heads. Thus, regardless of work-status, marital status is highly correlated with headship of household.
[23] The divorced/separated category is excluded here, and part-time or part-year workers are grouped as “part-time” workers with full-time/full-year workers referred to as “full-time” workers. These simplifications do not affect the analysis of the direction of the bias to the MMED.
[24] This removes the necessity of looking at $M_i \cdot I_{p,i}$.

The source of the bias to the MMED here is an omitted variables problem. Because one cannot distinguish whether an individual’s work-history is imputed in the earlier March CPS data sets, since no record was kept of this information, a restriction of the sample to individuals classified as full-time introduces outliers into the regression. The bias to the estimate of the MMED, $\hat{\beta}_{1f}$[25], from the omission of $I_p$ is $\alpha_0 \frac{\mathrm{Cov}(I_p, M)}{\mathrm{Var}(M)}$. The bias is positive since individuals whose true status is part-time tend to have lower incomes, or $\alpha_0 < 0$, and the omitted variable, $I_p$, is negatively correlated with marital status, $M$[26].

IV. An Empirical Examination of the Impact of Misclassification with Inconsistent Information.

The difficulty in directly examining the impact of misclassification.

Ideally, one would want to examine those individuals whose work-status had been imputed as full-time/full-year and verify the consistency of their reported earnings. However, this is made impossible by the fact that, prior to the 1988 March CPS, records were not kept of whether or not hours or weeks last year were reported. As such, one cannot attempt to solve the upward bias by omitting all observations whose work-status was imputed. Starting with the 1976 March CPS, earnings are imputed when work-status is not reported. The Census Bureau did not retain records of the original reported earnings until the 1988 March CPS. However, inasmuch as there may be changes over time in patterns of non-reporting between 1974 and 1987, and it is difficult to get records of reported incomes of non-reporters of work-status, it is difficult to replicate the impact of the earlier imputation procedure. Because of this, one must consider the circumstantial evidence for misclassification with inconsistent information.

[25] Estimates of population parameters are distinguished from the population parameters with a caret (hat) in this paper.
[26] The higher incidence of misclassified never married individuals with inconsistent information is due to the significant negative correlation between marital status and an individual’s likelihood to be part-time/part-year. As shown in table 3, the odds-ratio between married and never married male workers being classified as full-time/full-year is five to four for the years 1976-1979.

Predicted Log-Earnings Levels by Marital Status

There are other implications of the misclassification of work-history besides imparting an upward bias to the MMED. These implications are potentially verifiable empirically. The predicted earnings for both married and never married workers classified as full-time/full-year[27] should be lower than their true values, or $\hat{\beta}_{0f} < \beta_{0f}$ and $\hat{\beta}_{0f} + \hat{\beta}_{1f} < \beta_{0f} + \beta_{1f}$. However, the bias to never-married workers’ earnings should be more negative, or $\hat{\beta}_{0f} - \beta_{0f} < (\hat{\beta}_{0f} + \hat{\beta}_{1f}) - (\beta_{0f} + \beta_{1f})$.

However, complications do exist in verifying whether earnings levels are depressed in a given set of years. A marital status group’s earnings change from year to year along with other individual characteristics that help to determine earnings levels. Changes in personal characteristics can be controlled for, but one cannot say with certainty whether variation in an unstable mean is natural or unnatural. The extent of the bias must be sizeable relative to the normal changes in the means.
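The omitted-variable argument of Section III, and its implication that the measured MMED is biased upwards, can be illustrated with a small simulation. The Python sketch below is a minimal illustration under assumed values: the function name `simulate_mmed_bias`, the share married, the misclassification rates, and the coefficients are all hypothetical and are not estimates from the March CPS. It generates log-earnings from the equation $Y_i = \beta_{0f} + \beta_{1f} M_i + \alpha_0 I_{p,i} + \varepsilon_i$, regresses $Y$ on $M$ alone, and compares the resulting coefficient with the bias formula $\alpha_0 \mathrm{Cov}(I_p, M) / \mathrm{Var}(M)$.

```python
import numpy as np


def simulate_mmed_bias(n=200_000, b0=2.3, b1=0.2, a0=-0.5,
                       p_mis_never=0.10, p_mis_married=0.02, seed=0):
    """Return (short-regression MMED estimate, bias predicted by the formula).

    All default values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    married = rng.random(n) < 0.8                      # M_i
    # Never married males are assumed more likely to be misclassified.
    p_mis = np.where(married, p_mis_married, p_mis_never)
    misclassified = rng.random(n) < p_mis              # I_{p,i}
    m = married.astype(float)
    ip = misclassified.astype(float)
    y = b0 + b1 * m + a0 * ip + rng.normal(0.0, 0.4, n)

    # Slope from regressing y on M only (I_p omitted).
    cov_ym = np.cov(y, m)
    b1_short = cov_ym[0, 1] / cov_ym[1, 1]

    # Bias implied by the omitted-variable formula.
    cov_im = np.cov(ip, m)
    predicted_bias = a0 * cov_im[0, 1] / cov_im[1, 1]
    return b1_short, predicted_bias
```

With these assumed rates, the predicted bias is positive because $\alpha_0 < 0$ and misclassification is more common among the never married, so the short regression overstates the true differential `b1`.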
Predicting annual log-earnings levels with the mean values of personal characteristics for a particular year[28] controls for year-to-year variation in personal characteristics. Figure 5 shows the predicted mean log-earnings levels for married and never married individuals for the years 1969-1978. A comparison between 1974 and 1975 shows how the predicted log-earnings for never married males rose from 2.30 to 2.32, and for married males, fell from 2.62 to 2.59. One can also compare the predicted means for 1969 and 1970[29]. Here, the predicted mean for never-married males fell from 2.36 to 2.32 while for married males it is constant at 2.36.

[27] The predicted earnings for never married and married males, in this simplified example, are respectively the intercept and the intercept plus the MMED.
[28] Here the year of survey 1975 is used.

To measure the significance of these changes, the predicted mean log-earnings are first differenced for both series for the years 1969-1978. The median statistics for the change in predicted log-earnings for never married and married males are .006 (.010) and .007 (.0164), respectively[30]. Based on these statistics, the two and four percentage point declines in the predicted log-earnings for never married males appear to differ significantly from the normal variation in the means. However, the changes for married males are not even close to being significantly negative. From this, the upward bias to the MMED appears to be due to a disproportionate reduction in the mean log-earnings of never married males.

[29] The significant difference between the MMED in 1969 and surrounding years, as shown in the first column of Table 2-A, suggests that there may be significantly lower levels of misclassification with inconsistent information in this year. This is confirmed by the absence of a significant difference between the trimmed MMED for 1969 and the surrounding years, shown in figure 10. It is also shown by the less negative skew of the residual distribution for this year relative to the surrounding years, as shown in figure 7. The U.S. Census Bureau reports that, “The presence of the Census outreach programs have a definite positive impact on the response rates to CPS and the March supplement. One surmises that this is because people think that they are answering the decennial census when it is the CPS instead.” Table 2-C shows that the Basic and March supplement non-response rates were significantly lower around the time of the decennial censuses. The improvement in the response rate to both Basic CPS questions and March Supplements has a strong impact on the March CPS since the March Supplement is only collected from households interviewed with the basic survey. Hence, one can expect the data sets from the March-CPS decennial surveys to have a higher overall response rate and to contain more accurate information than otherwise would have been imputed.
[30] These are estimated using a median regression that includes only a constant.

Predicted Skew of Earnings Distribution Residuals by Marital Status

However, the misclassification of work-status with inconsistent information should affect more than the means of the married and never-married earnings distributions. An outlier’s impact on a distribution is picked up better by the skew statistic. The skew is a statistic[31] commonly used by engineers and statisticians to help identify the presence of outliers. A skew of zero occurs when the distribution is symmetric. A negative skew occurs when a distribution contains a large number of unusually low observations. Averaging over the cube of the t-statistic makes the skew statistic more sensitive to outliers. The sensitivity to outliers makes skew statistics more useful than the predicted means in identifying the influence of misclassification in a distribution. The bias to the skew statistic should be more dramatic than the bias to the predicted means[32].

[31] The skew statistic for a sample is defined as $\frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \hat{\mu}}{\hat{\sigma}}\right)^3$. A distribution’s skew is $E\left[\left(\frac{x - \mu}{\sigma}\right)^3\right]$.
[32] A simple example illustrates the skew’s appropriateness for analyzing the impact of misclassification with inconsistent information. Assume the true distribution of log-earnings conditional on employment and marital status is normal. Then, when full-time status is wrongly imputed, the log-earnings distribution conditional on being classified as full-time/full-year would become a mixed normal distribution. The qualitative implications of misclassification for the means and skews of a distribution, conditional on marital status, are robust to what the initial, true distribution was. The assumption of normality is made here for expositional purposes. Figure 6 depicts the distributional consequences of mixing two normal distributions. The more part-time workers misclassified as full-time, the more negative, or less symmetric, the distribution’s skew becomes.

The skew statistic for the residuals of a median regression has the most potential for reflecting the influence of outliers[33]. The skew of the residual distribution can be estimated with the average value of the cube of the residual test-statistic[34]. The changes in the skew of the married and never-married distributions between 1967 and 1985 are shown in Figure 7. Consistent with the misclassification story, the skew for never married males is far more negative prior to 1975 than after 1975. There is no readily apparent discontinuity in the skew for married males. One discrepancy with expectations is that the skew for never-married males in 1975 is also unusually negative. However, in that the MMED for 1975 is comparable to the following years, it is likely that something else in 1975 is affecting the skew[35].

A way to statistically verify that there is a discontinuity in the skew series is by first differencing both the married and never married series[36].
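The behavior of the skew statistic under misclassification can be sketched numerically. The Python example below is a minimal illustration, not the paper's estimator: it standardizes with the ordinary sample mean and standard deviation rather than with median-regression residuals and a robust standard error, and the function names `skew` and `mixed_sample`, the two normal distributions, and the misclassification shares are all assumptions made for the example. It mixes a small share of lower-earning "part-time" draws into a "full-time" normal sample, as in the mixed-normal example in the footnote above.

```python
import numpy as np


def skew(x):
    """Sample skew: the average cubed standardized observation."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3))


def mixed_sample(share_misclassified, n=100_000, seed=0):
    """Log-earnings of workers classified as full-time (hypothetical values).

    A share of the sample is drawn from a lower-mean 'part-time' normal
    distribution to mimic misclassified part-time workers.
    """
    rng = np.random.default_rng(seed)
    is_part_time = rng.random(n) < share_misclassified
    full_time = rng.normal(2.5, 0.4, n)
    part_time = rng.normal(1.5, 0.4, n)
    return np.where(is_part_time, part_time, full_time)
```

Comparing `skew(mixed_sample(0.0))`, `skew(mixed_sample(0.05))`, and `skew(mixed_sample(0.10))` under these assumed values shows the skew moving from roughly zero toward more negative values as the misclassified share rises over this range.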
The median statistics for the changes in the skew series are -.24 (.26) and .28 (.81) for married and never married, respectively, as measured by a median regression. The changes in skew between 1974 and 1976 are -.70 and 4.74 for the married and never married series. A test-statistic for whether the first difference for never-married males is significant is (4.74 - .28)/.81 = 5.5. The same statistic for married males is (-.70 - (-.24))/.26 = -1.76. As such, while the reduction in the negative skew for never married males is certainly significant, the change in the skew for married males is in the wrong direction. The skew series is consistent with some never married full-time/full-year workers with unusually low earnings being behind the lower mean predicted earnings levels for never married males before 1975³⁷.

³³ A median regression minimizes the sum of the absolute values of the differences between the observations of the dependent variable and its values as predicted by the independent variables. While the mean regression is more efficient so long as the data are well behaved, the median regression will be more robust to the presence of “outliers”. Outliers, here, are taken to be observations whose values differ severely from expected values because of measurement error.

³⁴ Dividing the residual by a robust measure of its standard error forms the residual test-statistic.

³⁵ Additional evidence on the source of this discrepancy and why the trimmed MMED for 1975 is considered an outlier is given in appendix 3.

³⁶ Here, the years 1969 and 1975 are omitted.

Probability of Classification as Full-Time/Full-Year by Marital Status

A direct check for the impact of imputation procedures on work-status misclassification is to compare the predicted probability of being classified full-time/full-year by marital status across time.
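The test statistics reported for the skew series follow one recipe, and a similar ratio is applied to the classification-probability series: take the change across the break, subtract the median first difference of the series, and divide by the standard error of that median. A short Python sketch of the arithmetic, using the figures reported in the text (the function name is a hypothetical helper, not something from the dissertation):

```python
def discontinuity_statistic(change, median_change, std_error):
    """Distance of one change from the series' median change, in standard errors."""
    return (change - median_change) / std_error


# Figures reported in the text for the 1974-1976 change in skew.
never_married_stat = discontinuity_statistic(4.74, 0.28, 0.81)
married_stat = discontinuity_statistic(-0.70, -0.24, 0.26)

print(f"never married: {never_married_stat:.2f}")
print(f"married: {married_stat:.2f}")
```

With the rounded inputs the married statistic evaluates to roughly -1.77, which agrees with the reported -1.76 up to rounding.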
One can predict the probabilities of being classified full-time/full-year by marital status for each year from 1969-1978 with a linear probability model and a constant set of means of observable characteristics³⁸. One would expect to find a discontinuity in the probability series between the years 1974 and 1975. Figure 8 confirms the existence of the expected discontinuity. The predicted probability of a never married male worker being classified full-time/full-year drops by 5.5 percentage points. The predicted probability of a married male worker being classified full-time/full-year drops by 2.9 percentage points. When the predicted probability of being classified full-time is first differenced, the median statistics for the first differences are .001 (.018) and -.002 (.0116) for the never married and married series. The test-statistics for the decline in the probability of being classified full-time/full-year between 1974 and 1975 are then -5.5/1.8 = -3.1 and -2.7/1.16 = -2.3. By conventional standards, the first differences for both married and never married males are statistically significant. However, never married workers appear to be more likely to be misclassified by the earlier imputation procedures.

³⁷ A series of predicted mean log-earnings has qualitatively similar implications.

³⁸ This is the same set of year-of-survey 1975 means used to make the predictions about the mean log-earnings.

A Simulation of the Impact of Different Imputation Procedures

Another simple way to check whether the different imputation procedures would bias the MMED upwards is with a simulation. Using the 1988B March CPS, ten percent³⁹ of observed non full-time/full-year male workers can be selected at random as not reporting their true work-status⁴⁰. Then, the missing work-status can be imputed with different hot-decking procedures. First, earnings are not replaced in a simplified version of the earlier imputation procedure⁴¹.
Second, earnings are replaced with the same smaller set of characteristics used for the imputation procedures. Third, both work-status and earnings are replaced with a larger set of characteristics. The second set of characteristics used in the imputation procedures is the same group of independent variables used to generate the MMED⁴².

³⁹ The ten percent of part-time or part-year workers misclassified is based roughly on the findings of the first attempt at a calibration as presented later.

⁴⁰ The ten percent is defined over workers who would be in the sample if they were employed full-time/full-year. The subsample is selected using the sample command from STATA and the seed 1776.

⁴¹ Instead of imputing whether someone was full-time or part-time and full-year or part-year separately, four work-status categories were formed and jointly imputed. In the earlier hot-decking procedures, most of the categories used for hot-decking of work-status, such as color, sex and class of worker, are no longer relevant because of the restrictions made to the sample. The remaining characteristics, according to Spiers and Knott (1969), are headship status, amount of earnings and age. For the hot-decking simulation, three categories are formed for ages 25-34, 35-44 and 45-54. Spiers and Knott (1969) do not report the earnings levels used to form the earnings categories. Because of this, some assumptions are made here. The cut-off values for earnings differ across headship status. Since there are far more heads of household in the sample, it is assumed that more categories were used to impute their earnings. However, only the lowest earnings categories are considered here for the imputation of work-status since the observations with inconsistent information appear to have unusually low earnings, as the calibration shows later. The cut-off value for non-household heads is the median value for all non-head workers. The cut-off value for household heads is the 12.5 percentile value for all head workers. All observations assigned as not reporting work-status are classified as falling in the lower earnings category. Another set of imputations was done where the median value was used for heads, and the qualitative impact on the MMEDs was not very large. The number of heads of households misclassified as full-time/full-year is higher and the upward bias to the MMED from misclassification is reduced some.

⁴² The same seed, 1000, is used in all three imputation procedures.

It is important to allow for the possibility that the decision to not report work-status is endogenous. Hence, earnings of males who do not report work-status may be systematically different from those of most part-time or part-year workers. This is important when simulating the impact of the early imputation procedures since the reported earnings level affects the magnitude of the bias to the MMED. Since the reported earnings were unavailable, an alternative approach is used to examine this possibility. In the simulation, all non-reporters are assigned the same earnings level according to whether they are a head or non-head. Two sets of earnings are assigned for the same simulation of the early imputation procedure. The initial value assigned is the median earnings, by headship status, for part-time or part-year workers⁴³. This represents the case where non-reporters' earnings do not systematically differ from those of part-time or part-year workers. The second set of earnings assigned to non-reporters is lower and allows for the possibility that non-reporters' earnings differ from those of the average part-time/part-year worker⁴⁴.

The results of the simulation are summarized in Tables 2A and 2B. Overall, the results of the simulation confirm the inherent difficulty of direct observation of whether misclassification caused an upward bias to the MMED. The first two rows of Table 2A show that the early imputation procedure raises
the proportion of the sample full-time/full-year by 2.6 percentage points for never married males, 1.7 percentage points for Divorced/Separated and .7 percentage points for the Currently Married. Similarly, the proportions of full-time/full-year males misclassified are 3.5 and .8, respectively, for never married and married males. As shown earlier, it is the combination of the difference in the incidence of misclassification by marital status and inconsistent information that biases the MMED upwards, as shown in columns 2, 3 and 4 of row 3 of Table 2B. However, the extent of the bias is dependent on the earnings levels of non-reporters and the percent of non-reporters in the sample. Column 3 shows that when the average earnings levels for non-reporters are the same as the median value for part-time/part-year workers, the misclassification biases the MMED upwards by only one percentage point. This bias is considerably lower than the observed seven-percentage point drop in the MMED series, even with a sizeable fraction of non-reporters in the sample. Hence, the misclassification with inconsistent information hypothesis may only explain the bias to the MMED if non-reporters' earnings were unusually low relative to the typical part-time or part-year worker's earnings. Alternatively, when non-reporters' earnings are assumed to be substantially lower there is an upward bias of three and a half percentage points, as shown in column 4⁴⁵.

⁴³ The assignment of this value represents the case where non-reporters' earnings do not systematically differ from those of part-time or part-year workers.

⁴⁴ The rule used here is derived from the later calibration. The mean earnings for full-time/full-year workers, disaggregated by headship status, are scaled downwards by exp(-2).

⁴⁵ Further comparison with earlier findings is made by looking at how different imputation procedures impact the skew of the residual distribution. Rows 4, 5 and 6 show the different skews by marital status before and after each imputation procedure. The comparison between the skews in columns 3 and 4 is especially interesting since in column 3 the skews do not differ substantially from the non-imputed sample, while in column 4 the skews are considerably larger across all three marital status groups. As shown by figure 7, there is no evidence that the skew for married males is biased downwards in the earlier years. This could be because some peculiarity of the early imputation procedures (such as a larger number of earnings categories being used for heads of households) may have prevented misclassification of work-status for heads of households.

The MMEDs in the fifth and sixth columns of Table 2B show that the MMED falls some, but does not change significantly, when earnings are imputed along with work-status. This is true regardless of whether one controls for headship and age or all of the characteristics used in the MMED regression.

A Reconciliation of the Different Tests for Misclassification

The different tests for misclassification with inconsistent information pose a slight puzzle. While the earlier imputation procedure seems to be associated with a higher probability of both married and never-married males being classified full-time/full-year, there is no evidence in the series of residual skews and predicted means for significant misclassification with inconsistent information of married males during the earlier years. A possible reconciliation of this puzzle is that work-status misclassification with inconsistent information is less likely to occur for married than never-married males. Since there are more male heads of household than non-heads, the number of earnings categories used for the early imputation procedures may differ by headship status. If additional earnings categories used for heads of households resulted in fewer non-reporters with unusually low, inconsistent information then this would help to reconcile the evidence from the tests for misclassification with inconsistent information.

V. Calibrating the Percentages Misclassified and the Extent of the Bias

Reasonable levels of misclassification should be able to explain the entirety of the upward bias to the MMED. A concrete sense of what levels of misclassification with inconsistent information would explain the bias is possible with a calibration of the equation(s)

$\mu_{O,i} = (1 - p_{mc,i})\,\mu_{T,i} + p_{mc,i}\,\mu_{mc,i}$

In this equation, the observed overall mean, $\mu_{O,i}$, for married and never married groups, indexed by $i = m, nm$⁴⁶, is the weighted average between the “true” overall mean⁴⁷, $\mu_{T,i}$, and the mean value of observations misclassified with inconsistent information, $\mu_{mc,i}$. A calibration of these equations determines the predicted true MMED, $\mu_{T,m} - \mu_{T,nm}$, and the extent of the upward bias, $(\mu_{O,m} - \mu_{O,nm}) - (\mu_{T,m} - \mu_{T,nm})$.

⁴⁶ A consistent, though complicated, notation is used throughout this section. The Greek symbol, $\mu$, is used to represent means. The letter $p$ is used to represent proportions. The subscript $O$ consistently refers to observed values. The absence of an $O$ in the subscript for a mean or proportion implies the “true” value of what the mean or proportion would be without the misclassification problem. The subscript $Tr$ stands for the sub-sample of observations that would belong in the lower portion of a trim. The subscript $mc$ stands for the misclassified observations. The subscripts $FT$ and $PY$ represent observations whose work-status is classified as full-time/full-year and part-time or part-year.

However, to predict the overall means requires a way to predict the percentage of observations misclassified with inconsistent information, $p_{mc,i}$, and the value of $\mu_{mc,i}$. To better determine both of these values, one must narrow down the portion of the sample likely to contain the misclassified
observations with inconsistent information. Observations identified as belonging in the lower portion of a residual-based trim⁴⁸ are judged as likely to include the entirety of the misclassified observations. Relegating all of the outliers to a sub-sample lets one express the observed means for the sub-sample observations, $\mu_{TrO,i}$, as weighted averages of the mean value of the misclassified observations and what the true mean value of the sub-sample would be in the absence of misclassification, or⁴⁹

$\mu_{TrO,i} = (1 - \alpha_i)\,\mu_{Tr,i} + \alpha_i\,\mu_{mc,i}$

To complete the calibration, one must predict the values of $\alpha_i = p_{mc,i} / p_{TrO,i}$ and $\mu_{mc,i}$ for both married and never married groups. Since there are two equations with six unknowns, a solution will only require four additional assumptions. Two different approaches will be used here to calibrate the bias to the MMED.

⁴⁷ Or, the mean in the absence of the misclassification of work-status problem.

⁴⁸ After estimating a median regression on the entire sample, one squares the residuals and uses them as a dependent variable in another median regression to estimate the heteroskedastic variance of all the observations. Observations whose residuals are more than 2.58 standard deviations below zero are likely candidates for misclassification. This is done for the years of survey 1970-1974 and 1976-1978.

⁴⁹ $\alpha_i$ here is the proportion in the sub-sample that are misclassified with inconsistent information.

The first two assumptions are predictions about the “true” means of the sub-samples, or $\mu_{Tr,i}$. To predict these values, the log-earnings of observations for the years 1970-74 and 1976-1978 are pooled together into two groups. Then, the differences in log-earnings stemming from differences in ages between the two periods are removed⁵⁰. There are two groups for both married and never married full-time/full-year workers: one for before and one for after the change in the imputation procedures.
Out of each group, four statistics are observed⁵¹: the mean log-earnings for all observations in the marital status group, $\mu_{O,i,t}$⁵²; the mean log-earnings for the lower-trimmed sub-sample, $\mu_{TrO,i,t}$; the percentage of the overall sample of full-time/full-year workers in the lower-trimmed sub-sample, $p_{TrO,i,t}$; and the proportion of workers in the sample classified as full-time/full-year, $p_{FTO,i,t}$. A prediction for the value of $\mu_{Tr,i,1}$ is $\mu_{TrO,i,2}$, the observed mean for the trimmed sample during the later period. This assumes that $\mu_{TrO,i,2} = \mu_{Tr,i,1}$⁵³.

It is inappropriate to assume that the proportions in the trimmed sub-sample, in the absence of the misclassification problem, would be the same across the two periods, or $p_{Tr,i,1} = p_{TrO,i,2}$. Figure 9 shows that the percentage in the lower trimmed sub-sample rises over time. An alternative assumption is that the proportion of observations in the lower trim would be equal across marital status in the absence of misclassification, $p_{Tr,m} = p_{Tr,nm}$, or that

$(1 - \alpha_m)\,p_{TrO,m} = (1 - \alpha_{nm})\,p_{TrO,nm}$

⁵⁰ The log-earnings for the later group are regressed on a full set of age dummies. Then, the mean proportions for each age group in the years 1970-1974 are used to predict what the log-earnings for the later period 1976-1978 would have been in the absence of age differences.

⁵¹ Although sampling weights are used for comparison of the regressions across time, sampling weights are not used for the calibration.

⁵² The subscript $i$ denotes marital status ($m$ or $nm$) and the subscript $t$ denotes period 1 or 2. The absence of a time subscript either denotes period 1 or that the time subscript is irrelevant.

⁵³ Since there is no longer a misclassification problem in the years after 1975, one can estimate the “true” mean log-earnings for the earlier sub-sample with the later mean log-earnings value.

For the final assumption, two approaches are taken here.
The first approach is to assume that the proportions of part-time or part-year males misclassified as full-time/full-year, $p_{PYmc,i}$, are equal across marital status⁵⁴. For this assumption some additional notation is needed. Let the percentages of married and never married males misclassified as full-time be $p_{mc,m}$ and $p_{mc,nm}$, respectively. One can decompose the proportions of workers classified as full-time with the equation $p_{FTO,i} = p_{FT,i} + p_{mc,i}$. This allows us to define the proportion of part-year or part-time workers misclassified, $p_{PYmc}$, as equal to $p_{mc,i} / (1 - p_{FT,i})$. The assumption then is that

$\dfrac{p_{mc,m}}{1 - p_{FTO,m} - p_{mc,m}} = \dfrac{p_{mc,nm}}{1 - p_{FTO,nm} - p_{mc,nm}}$

Then, by substituting in $\alpha_i\,p_{TrO,i}$ for $p_{mc,i}$, one gets the system of four equations,

(1) $\alpha_m\,\mu_{mc,m} + (1 - \alpha_m)\,\mu_{Tr,m} = \mu_{TrO,m}$

(2) $\alpha_{nm}\,\mu_{mc,nm} + (1 - \alpha_{nm})\,\mu_{Tr,nm} = \mu_{TrO,nm}$

(3) $(1 - \alpha_m)\,p_{TrO,m} = (1 - \alpha_{nm})\,p_{TrO,nm}$

(4) $\dfrac{\alpha_m\,p_{TrO,m}}{1 - p_{FTO,m} - \alpha_m\,p_{TrO,m}} = \dfrac{\alpha_{nm}\,p_{TrO,nm}}{1 - p_{FTO,nm} - \alpha_{nm}\,p_{TrO,nm}}$

With the solutions to the above equations⁵⁵, one can solve for the true overall mean log-earnings for each group, $\mu_{T,i}$, the proportion of full-time workers misclassified, $p_{mc,i}$, and the proportion of the part-year or part-time sample misclassified, $p_{PYmc}$⁵⁶.

⁵⁴ This assumes that greater numbers of never-married males are misclassified because a larger percentage of never married than married males are part-year or part-time. There is no reason to believe a differential propensity to be misclassified exists across marital status.

⁵⁵ $\mu_{mc,i} = \dfrac{\left[(1 - p_{FTO,m})\,p_{TrO,nm} - (1 - p_{FTO,nm})\,p_{TrO,m}\right]\mu_{Tr,i} + p_{TrO,i}\,(p_{FTO,m} - p_{FTO,nm})\,\mu_{TrO,i}}{(1 - p_{FTO,i})\,(p_{TrO,nm} - p_{TrO,m})}$ and $\alpha_i = \dfrac{(1 - p_{FTO,i})\,(p_{TrO,nm} - p_{TrO,m})}{(p_{FTO,m} - p_{FTO,nm})\,p_{TrO,i}}$, where $i = nm, m$.

⁵⁶ To do this requires the formula $\mu_{O,i} = p_{mc,i}\,\mu_{mc,i} + (1 - p_{mc,i})\,\mu_{T,i}$. One can then manipulate the decomposition formula to solve for the true overall mean, $\mu_{T,i} = (\mu_{O,i} - \mu_{mc,i}\,p_{mc,i}) / (1 - p_{mc,i})$. This gives the solutions to the three variables listed above as

$p_{mc,i} = \dfrac{(1 - p_{FTO,i})\,(p_{TrO,nm} - p_{TrO,m})}{p_{FTO,m} - p_{FTO,nm}}$

$p_{PYmc} = \dfrac{p_{TrO,nm} - p_{TrO,m}}{p_{FTO,m} - p_{FTO,nm} - p_{TrO,nm} + p_{TrO,m}}$

$\mu_{T,i} = \dfrac{(\mu_{O,i} - p_{TrO,i}\,\mu_{TrO,i})\,(p_{FTO,m} - p_{FTO,nm}) + \mu_{Tr,i}\left[(1 - p_{FTO,nm})\,p_{TrO,m} - (1 - p_{FTO,m})\,p_{TrO,nm}\right]}{(p_{FTO,m} - p_{FTO,nm}) - (1 - p_{FTO,i})\,(p_{TrO,nm} - p_{TrO,m})}$

The second approach to completing the calibration is to assume that no married males were misimputed with inconsistent information. This assumption is supported strongly by the evidence from the predicted means and skew series. This assumption implies that $p_{mc,nm} = p_{TrO,nm} - p_{TrO,m}$. Based on this alternative assumption, another prediction for the bias to the MMED can be found.

The Results of the Calibration

Table 3 summarizes the statistics and predictions described above for both approaches at calibration. Since the observed means are 2.638 and 2.329, the observed raw MMED for the years of survey 1970-74 is .309. The observed raw MMED for the years of survey 1976-78 is .180. The MMED appears to fall by 12.9 percentage points. However, when the percentage of part-time or part-year workers misclassified with inconsistent information is assumed to be equal across marital status groups, the true MMED is predicted to be .262, with a 4.7 percentage point bias to the MMED. This calibration also predicts that two and four percent of the married and never-married full-time/full-year worker samples are misclassified. However, since the percentage of observations in the lower trim for married males is lower in 1970-1974 than in 1976-1978, it is not likely that two percent of married male full-time/full-year workers are misclassified with inconsistent information. Thus, the likelihood of misclassification with inconsistent information is not equal across marital status.
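The closed-form solutions in the footnotes can be checked numerically. The following Python sketch is a hedged illustration rather than the dissertation's own code: it solves equations (1)-(4) for each marital-status group and then recovers the "true" overall means and the implied bias. The observed overall means 2.638 and 2.329 are taken from the text; every other input value is hypothetical, as are the function and variable names.

```python
def calibrate(mu_O, mu_TrO, mu_Tr, p_TrO, p_FTO):
    """Solve the four-equation calibration system for each group.

    Every argument is a dict keyed by "m" (married) and "nm" (never married).
    """
    gap_trim = p_TrO["nm"] - p_TrO["m"]   # p_TrO,nm - p_TrO,m
    gap_ft = p_FTO["m"] - p_FTO["nm"]     # p_FTO,m - p_FTO,nm
    out = {}
    for i in ("m", "nm"):
        p_mc = (1 - p_FTO[i]) * gap_trim / gap_ft      # share misclassified
        alpha = p_mc / p_TrO[i]                        # share of the trim misclassified
        mu_mc = (mu_TrO[i] - (1 - alpha) * mu_Tr[i]) / alpha
        mu_T = (mu_O[i] - p_mc * mu_mc) / (1 - p_mc)   # "true" overall mean
        p_PYmc = p_mc / (1 - p_FTO[i] - p_mc)          # denominator as in equation (4)
        out[i] = {"p_mc": p_mc, "alpha": alpha, "mu_mc": mu_mc,
                  "mu_T": mu_T, "p_PYmc": p_PYmc}
    return out


inputs = {
    "mu_O": {"m": 2.638, "nm": 2.329},   # observed means reported in the text
    "mu_TrO": {"m": 1.0, "nm": 0.8},     # hypothetical observed trimmed-sample means
    "mu_Tr": {"m": 1.2, "nm": 1.1},      # hypothetical "true" trimmed-sample means
    "p_TrO": {"m": 0.05, "nm": 0.07},    # hypothetical shares in the lower trim
    "p_FTO": {"m": 0.85, "nm": 0.65},    # hypothetical full-time/full-year shares
}
result = calibrate(**inputs)

observed_mmed = inputs["mu_O"]["m"] - inputs["mu_O"]["nm"]
true_mmed = result["m"]["mu_T"] - result["nm"]["mu_T"]
bias = observed_mmed - true_mmed
print(f"observed MMED {observed_mmed:.3f}, predicted true MMED {true_mmed:.3f}")
```

Because the hypothetical inputs are not the Table 3 values, the printed numbers are not expected to reproduce the .262 reported in the text; the sketch only shows that the closed forms satisfy the system.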
When the entirety of the misclassification problem is assumed to be concentrated among never married males, the predicted bias to the raw MMED is (.309 - .247) = .062. Almost half of the 12.9 percentage point decline in the raw MMED between these two periods is due to misclassification⁵⁷. This calibration estimates that only 2.29 percent of never married workers are misclassified as full-time/full-year with inconsistent information. The rate of misclassification of part-time or part-year workers is then only 6.84 percent. These estimates appear reasonable given that to report one piece of information and not report another piece of information seems likely to be a rare, but important, event. The calibration also shows that while there is a real decline in the MMED between the two periods, it is not from a decrease in the earnings of married males. Instead, the earnings of never married males employed full-time/full-year rise between the two periods.

⁵⁷ When the predicted true value of the trimmed sub-sample for never married males is adjusted downwards some to allow for the fact that the mean log-earnings rose some between the two periods, the bias to the MMED falls by .001 or .002 points. However, if most, but not all, of the biased observations are captured in the trim then the estimate of the bias here will be biased downward. This is unlikely to be too severe, as the comparison of the descriptive regressions for the 1.96 trim and 2.58 trim in Table 5 shows that precious little is gained in reduction of the bias due to misclassification by tightening the trim.

VI. An Alternative Robust Approach to Estimating the MMED

The previous section shows that the existence of low outliers among never-married males could bias the MMED upwards for the years prior to 1975. These observations cannot be directly removed since the CPS does not keep track of whose work-status was imputed during this time. Hence, to investigate
whether the MMED changes over the years 1967-1996⁵⁸, one needs an alternative estimate of the MMED that is robust to the bias caused by outlier observations.

⁵⁸ The alternative specification detailed in appendix 2 is used here for the trimmed means.

A simple approach to remove the influence of unusually low log-earnings values is with a residual-based, trimmed-OLS regression. The idea for a residual-based, trimmed-OLS regression is very simple. Ordinarily, trimmed estimators exclude a certain percentage of the extreme values of the dependent variable that are judged as likely to be due to mis-measurement. In a multiple regression framework, exclusions based on extreme values of the dependent variable are difficult to justify since what would constitute an extreme value varies across individuals. Even if a worker's reported log-earnings, or log-wage, were unusually low or high for their characteristics, it may still be within some broad range of acceptable log-earnings. Thus, a trim based on the values of the dependent variable may not remove all outliers, and it is necessary to base a trim on residuals, not values of the dependent variable. As shown by Bollinger and Chandra (2001), a trim based on arbitrary cutoff values of the dependent variable will likely bias the estimators. Instead, which observations are trimmed should be chosen based on the quantiles or standard deviations of a residual distribution.

However, if there are outliers in the data then precautions need to be taken to ensure that the outliers do not influence the trim. First, one should use a median regression to generate the residuals. This is because a median regression is more robust to the presence of outliers and its residuals are more likely to reflect the extent to which the outlier differs from the norm. Second, one should allow for the possibility of heteroskedasticity when estimating the standard errors or quantile values for the residual distribution.
This is because what is an unusually high or low residual may vary with observable characteristics. Without controls for heteroskedasticity, an unusual proportion of individuals with a higher variance in earnings⁵⁹ will be excluded from the sample. This is particularly relevant for the estimation of the MMED since there is heteroskedasticity in earnings across married and never married males. Third, if the trim is based on the standard deviations, then the estimates of the heteroskedastic standard deviations need to be made from a median regression with the log of the square of the residuals as the dependent variable⁶⁰. The standard deviations of the residual distribution are then based on a transformation⁶¹ of the predictions made from the median regression. The residual trim based on standard deviations excludes an observation from the ordinary least squares regression if the observation's residual is more than x standard deviations away from 0.

A trim based on quantiles requires the estimation of two quantile regressions with the residuals from the median regression as dependent variables⁶². To trim x percent of the sample, the first quantile regression is estimated at the 100-x/2 level and the second quantile regression is estimated at the x/2 level. Observations are trimmed if their residual is either less than the lower quantile or greater than the upper quantile. However, a trim based on quantiles does not remove as much of the discontinuity in the MMED. This could be because the inclusion of additional observations as full-time/full-year workers with low earnings levels affects the measurement of the quantiles. Because of this, a trim based on standard deviations, rather than quantiles, is used here.

In figure 10, a comparison is made between the untrimmed MMED series and a trimmed MMED series where the trim excludes from the sample all observations 2.58 standard deviations away from zero⁶³. Because of selection into marital status, married and never married males' residual distributions are likely to be asymmetric. As such, one can expect the trimmed OLS estimates of the MMEDs to legitimately differ some from the untrimmed MMED estimates⁶⁴. However, the trimmed MMEDs will be more robust to the bias considered earlier. As long as the outlier observations are more than 2.58 standard deviations from their predicted medians, they will no longer influence the MMED.

⁵⁹ For example, when a heteroskedastic variance equation was regressed without controls for marital status, the trimmed MMED values dropped by fifty percent.

⁶⁰ For the heteroskedastic variance regression, the same specification as the final regression is used. The log of the square of the residuals is taken to ensure that the dependent variable is not winsorized at zero.

⁶¹ It is acknowledged here that the exponent of one half of the expected value of the log of a residual squared is not a consistent estimator of the standard deviation. However, it should not be that far off and is suitable for the purpose of trimming.

⁶² With, of course, the same full set of regressors used in the quantile regression as would be used in the estimation of the robust standard deviations.

⁶³ A trimmed series at 1.96 standard deviations is also examined. The 2.58 trim and the 1.96 trim consistently remove approximately ten and twenty percent of the sample, respectively. A comparison of the Post-76 Intercept in the second and sixth columns of Table 5 shows that the 1.96 cutoff trim does not significantly reduce the bias. Also, Table 4 shows that the year-to-year variation in the series with a 1.96 cut-off is greater, not smaller, than the other trimmed series. This appears to be counter-intuitive. It turns out that the nature of the trim makes the tighter trim less reliable, as is shown later. Bollinger and Chandra (2001) quote Stephen Stigler as having concluded from his studies of the benefits of trimming in the natural sciences that the ten-percentage point trimmed mean is the most reliable estimator. As such, the cutoff value of 2.58 appears to be reliable.

⁶⁴ A trimmed mean only shares the same asymptotic mean with the regular mean when the error distribution is symmetric. If the distribution of ability were symmetric but there is a cutoff point for whether an individual becomes married then the distribution for married males would have a positive skew and the distribution for never-married males would have a negative skew. This would make the trimmed MMED estimate “biased.” However, if the nature of the difference between the trimmed and non-trimmed MMED is stable, then the trimmed series should still allow for a valid examination of the changes in the MMED over time.

Table 5 summarizes a descriptive econometric comparison of the untrimmed MMED series and the trimmed MMED series. The untrimmed MMED regression's Post-1975 intercept shift shows a significant seven-percentage point decline. The trimmed MMED regression's equivalent statistic shows a three-percentage point decline. The trim takes off an average of 4.2 percentage points from the pre-1975 MMEDs. Figure 10 confirms a reduction in the discontinuity in the series⁶⁵. Also, the MMED for 1969 is no longer unusually lower than its surrounding years in the trimmed MMED series. But, when the Post-1975 intercept is removed from the regression, the fit does fall and, as shown in columns four and five of Table 5, shifts in intercept in 1975 and 1989 still explain a good portion of the MMED series variation over the years 1967-1996⁶⁶ ⁶⁷.

Figure 11 shows a “raw” version of the trimmed MMED that includes only age controls along with the alternative MMED. The comparison between the two series is done as a test for robustness. The two series appear to mimic each other pretty strongly.
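The residual-based trim described in this section can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the specification used for the MMED series: it assumes a single binary regressor (married versus never married), in which case the median regression reduces to the group medians, the median regression of the log squared residuals reduces to a group median, and the final OLS step reduces to a difference in group means. The simulated data, parameter values, seed, and function name are all hypothetical.

```python
import math
import random
import statistics


def trimmed_differential(married, never_married, cutoff=2.58):
    """Residual-based trimmed difference in mean log-earnings between two groups."""
    kept_means = []
    for group in (married, never_married):
        center = statistics.median(group)              # median fit on a group dummy
        residuals = [y - center for y in group]
        # Median fit of the log squared residuals, transformed into a scale estimate.
        log_sq = [2.0 * math.log(abs(r)) for r in residuals if r != 0.0]
        scale = math.exp(0.5 * statistics.median(log_sq))
        kept = [y for y, r in zip(group, residuals) if abs(r) <= cutoff * scale]
        kept_means.append(statistics.fmean(kept))      # OLS on a dummy = group mean
    return kept_means[0] - kept_means[1]


rng = random.Random(0)
TRUE_GAP = 0.19  # hypothetical true differential: 2.64 - 2.45

married = [rng.gauss(2.64, 0.5) for _ in range(20000)]
# Hypothetical contamination: 4 percent of never married observations have
# log-earnings lowered by 2, mimicking misclassified part-time workers.
never_married = [
    rng.gauss(0.45, 0.6) if rng.random() < 0.04 else rng.gauss(2.45, 0.6)
    for _ in range(20000)
]

untrimmed = statistics.fmean(married) - statistics.fmean(never_married)
trimmed = trimmed_differential(married, never_married)
print(round(untrimmed, 3), round(trimmed, 3))
```

In this simulation the low outliers inflate the untrimmed differential, while the trimmed estimate lands much closer to the assumed true gap. With a full set of regressors the group medians would be replaced by quantile-regression fits, which this sketch does not attempt.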
The Raw MMED series varies more and is on average lower than the Controlled MMED series.

[65] For the calculation of the smoothed series, the 1975 trimmed marriage earnings differential is not used. These observations were shown to be outliers in Table 5. Smoothed backward extrapolations are made based upon the contiguous years.
[66] One possible reason for the decline from 1967-1974 to 1975-1988 is that the later hot-decking procedures routinely impute the earnings of never-married males with fewer personal characteristics. As Lillard et al. point out, for smaller groups, like never-married males, it is more difficult to find a match based on all of the criteria used for imputation. As such, the quality of the match is not as good. Since never-married males are more likely to have their earnings from last year imputed, they are more likely to be assigned the higher earnings of a married male, or the higher earnings associated with some other characteristics correlated with marital status. An artificial increase in the mean earnings of never-married males would bias the MMED downward.
[67] 1989 is when the imputation procedures changed for the second time. This could have led to a reduction in a downward bias due to a disproportionate number of never-married males' earnings being imputed with a smaller set of characteristics. This bias is controlled for in the regression analysis with intercept shifts between the periods.

VII. Econometric Analysis of Trimmed MMED Series

The same strategy as Korenman and Blackburn (1994) can be used to test whether the trimmed MMED series' variation over the years 1967-88 is explainable by the changes in selection and specialization[68], with the necessary caveats about the difficulty of identifying causality from a time series regression and an acknowledgement that the regressors may be somewhat endogenous. The influence of changes in specialization can be identified by the Married Female Labor Force Participation Rate for females age 25-34 of all races (MFLFPR)[69]. The influence of changes in marital selection can be identified with the percentage of males ever married or currently divorced/separated in the sample[70]. A potential change in the relationship between the MMED and the MFLFPR and marital status selection statistics in 1981 is controlled for with a spline-fit[71]. Then, any remaining systematic differences in the MMED between the 1967-1974 and 1975-88 periods can be controlled for with an additional shift parameter[72].

[68] The regression MMED values are weighted here by the inverse of their standard errors squared.
[69] The MFLFPR statistics are from Labor Force Statistics Derived from the CPS, 1948-1987 (BLS Bulletin 2307). The statistics for the remaining years were not made public and are from the Bureau of Labor Statistics.
[70] Korenman and Blackburn appear to have used the proportion of never-married males classified as Full-Year, Full-Time Workers in their samples to identify the impact of selection. This is not a good measure for the same reasons that led to the bias in the marriage earnings differentials. The percentage of males in the marital status groups in the sample reflects the changing composition of full-time/full-year workers in the sample and the population. If an increase in the MMED is consistently due to a reduced percentage of married males with lower earnings in the sample, and these individuals are not getting married, then the coefficient on the percentage ever married in the time series regression should be negative. Similarly, if the mean of the distribution of married males declines because several higher-ability males are consistently selecting out of marriage, the coefficient on the percentage divorced/separated would be negative. By controlling for both the percentage ever married and divorced/separated, one allows for the decision to not marry or to select out of or back into marriage to have different influences on the MMED.
[71] As noted by Korenman and Blackburn (1994), there is widely believed to be a strong change in marital patterns in the US in 1981. The change may have been due to a change permitting "no-fault" divorces. No-fault divorce laws make it easier to get a divorce when one of the parties in a marriage objects to the divorce.
[72] The inclusion of this control does not affect the qualitative implications of the regression analysis.

Figure 12 shows the changes in the independent regressors over the time period. Table 6 shows the results of the regressions[73]. After finding the best fit for the mean regressions, a median regression is estimated. Out-of-sample predictions are made for both the mean and median regressions. The median regression is preferred based on its superior out-of-sample predictions, as shown in Table D-1 in Appendix D. Figure 13 shows that the median time-series regression predicts the variation in the trimmed MMED fairly well for the years 1967-1994.

The regression result provides evidence that changes in selection and specialization exert countervailing influences on the observed MMED for this period. An increase in the proportion of younger married females working outside the home is negatively correlated with the observed MMED. Likewise, a decrease in the percentage of ever-married males and an increase in the percentage of currently divorced/separated males in the sample are correlated with an increase in the MMED. From the regression results, the larger magnitude of the change in the MFLFPR appears to be responsible for the earlier decline in the trimmed MMED. Overall, the MMED does not change that much in level[74]. However, the appearance of stability in the MMED over time does not imply the absence of a change in the composition of the MMED. The extent to which there is a change can be checked by a comparison between the years 1967 and 1988. The MFLFPR nearly doubled, rising by 68.6 - 35 = 33.6 percentage points between the two years. Since the coefficient on the MFLFPR is -.45, the regression predicts that the MMED would have fallen by fifteen percentage points between the two years. It would have gone from twenty-four percentage points in 1967 to nine percentage points in 1988, ceteris paribus. Since the trimmed MMED in 1988 is measured at 18.3, with an upward correction of 2.8 percentage points[75], it appears that an additional twelve percentage points of the MMED in 1988 is due to changes in marital selection.

[73] The time series regression with the trimmed MMED estimates as the dependent variable is weighted by the inverse of the square of the standard errors of the MMED coefficients.
[74] This is especially the case after controlling for remaining differences between 1967-1974 and 1975-1988, as shown in the fourth and fifth columns of Table 3. Over thirty percent of the variation in the thirty years can be controlled for by intercept changes between the three periods.

Comparison of Findings for the Controlled and the Raw MMED Series

The direction of the influence on the MMED from changes in the MFLFPR and marital status selection statistics appears to be robust for the Raw MMED regressions, as shown in Table 7. However, while the spline-fit dramatically improved the fit for the Controlled MMED[76], the spline-fit is less important for the Raw MMED. But, as shown in Figure 14, the basic predictions still hold. The predicted MMED, when the MFLFPR is extrapolated beyond 1988, is too low, and the predictions made with a fixed MFLFPR fit the observed MMEDs better. Also, when the MFLFPR is held fixed at its value in 1988, the predicted MMEDs for the earliest years in the sample are far lower than the observed MMEDs.
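The 1967-to-1988 comparison made earlier in this section is a short chain of arithmetic, and it can be retraced with a few lines of Python. The sketch below only reuses the figures quoted in the text (an MFLFPR of 35 and 68.6, a coefficient of -.45, a 1967 MMED of twenty-four points, and a 1988 trimmed MMED of 18.3 with a 2.8-point correction). The function name `decompose_mmed_change` and its argument names are hypothetical, and the calculation is an illustration, not the regression itself.

```python
def decompose_mmed_change(mflfpr_start, mflfpr_end, coef_mflfpr,
                          mmed_start, mmed_end_trimmed, correction):
    """Split the end-year MMED into the level predicted from the MFLFPR
    change alone and the remainder attributed to marital selection."""
    # Change in married female labor force participation (percentage points).
    mflfpr_change = mflfpr_end - mflfpr_start
    # Predicted change in the MMED holding everything else fixed.
    predicted_change = coef_mflfpr * mflfpr_change
    # MMED level the regression would predict for the end year.
    predicted_end = mmed_start + predicted_change
    # Observed end-year MMED after the upward correction.
    corrected_end = mmed_end_trimmed + correction
    # Remainder attributed to changes in marital selection.
    selection_part = corrected_end - predicted_end
    return predicted_change, predicted_end, selection_part


change, predicted_1988, selection = decompose_mmed_change(
    mflfpr_start=35.0, mflfpr_end=68.6, coef_mflfpr=-0.45,
    mmed_start=24.0, mmed_end_trimmed=18.3, correction=2.8)
print(round(change, 1), round(predicted_1988, 1), round(selection, 1))
```

The unrounded values are roughly -15.1, 8.9 and 12.2 percentage points, which the text reports as fifteen, nine and twelve.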
The time series regression's prediction that the causality of the MMED has changed is observed for the raw MMED as well as the controlled MMED.

[75] The upward correction is based on there being a downward bias to the MMED estimated after 1974. This bias is picked up by an intercept shift in the time series regression.
[76] In 1981, a change in the percentage of males divorced/separated becomes associated with an even stronger change in the Controlled MMED.

An Investigation into the Nature of the Changes in Marital Selection

Given that the predictions for the impact of changes in the selection into and out of marriage on the MMED are indeterminate, it would be enlightening to check whether or not changes in the distributions of earnings ability or age of marriage contributed to the observed changes in the percent ever married or divorced/separated. To do this, a comparison is made for the years 1967 and 1988 of the probabilities of being married conditional on not currently being divorced/separated, and of the probabilities of being currently divorced/separated conditional on having been married. Figures 15-18 and Table 8 show the results of estimated probits with log-earnings, its square and cube, and age dummy variables as the independent variables for each year[77].

Figures 15-16 show the predicted probabilities of being married for 1967 and 1988. Immediately apparent is that the probability of marriage for younger males is significantly lower in 1988. For the older age ranges, though, the probabilities of being married are comparable across the two years, as shown in Table 8's summary of the predicted probabilities of being married by age group. Table 8 reports the mean predicted probability of being married and the standard deviation conditional on age. The variation in the predicted probabilities conditional on age reflects the importance of differences in earnings ability. This is because only age and earnings are included as controls in the probits.
Table 8 shows that the variance in the probability of being married across the age groups has not changed significantly between the two years. This indicates that the primary reason for the decline in the percentage of males ever married in the sample is the postponement of marriage. Since earnings rise with age, an increase in the postponement of marriage would increase the positive correlation between marital status and earnings. Hence, the postponement of marriage could explain why the decline in the percentage of ever-married males in the sample appears to be correlated with an increase in the MMED.

Figures 17-18 show the predicted probabilities of being divorced/separated for 1967 and 1988. In both years, it appears that the probability of being currently divorced/separated initially rises with age and then falls back to its original level. The overall probability of currently being divorced/separated has risen across all age levels, but more so for the midrange of ages. Table 8 shows that the amount of variation in the predicted probabilities of being currently divorced/separated conditional on age has grown considerably. Hence, an increase in the variation in earnings ability has contributed strongly to the probability of being currently divorced/separated. Thus, endogenous selection out of marriage appears to be at least in part responsible for the positive correlation between the growing percentage of currently divorced/separated workers in the sample and the MMED.

[77] The samples for the probits were based on the relevant subset of the samples used for the estimation of the 2.58 Cutoff, Trimmed MMED.

VIII. Conclusion

It appears that the MMED did not decline by ten percentage points. A replication of Korenman and Blackburn's MMED series from 1967 to 1988 demonstrates the existence and significance of a non-trivial data problem in the 1968-1975 March CPS data sets.
Once the nature of the data problem is identified as owing to a concentration of low outliers among never-married males, one can construct a consistent MMED series by removing the influence of the outliers. To do this, the paper proposes and executes a residual-based trim. The trimming technique improves the estimation of the MMED by removing the noisiest portion of the sample.

Recent papers on "hot-decking," such as Lillard et al. (1986), criticize the way the CPS imputation procedures beginning with the 1976 March CPS data sets replace all of the original valid work-history information. The main problem with the imputation procedures was their failure to keep a record of the original reported values of earnings, work-status last year and information about the most recent job. This paper demonstrates that, while pertinent original information should never be discarded, it is possible that an inconsistent official work-history may also be a significant source of bias.

By identifying the likely source of the bias in the original MMED series and constructing an alternative MMED estimator, the original objective for constructing a series of coefficients (to test the relative importance of the selection and specialization hypotheses for the changes in the MMED) becomes possible. The regressions for the years 1967-1988 show that variations in both the percentage of males ever married and the younger married female labor force participation rate appear to influence the MMED series[78]. A dramatic increase in the proportion of younger married females in the labor force is correlated with a decline in the MMED. However, the decline of the observed MMED associated with the increasing MFLFPR is checked by changes in the composition of the samples by marital status. The decision of younger males to postpone marriage and the selection of predominantly lower-ability males out of marriage raise the observed MMED.
The regression results imply that the MMED has come to signify more the impact of marital selection and less the increased earnings associated with being married.

[78] This is true for both the controlled and raw MMED series.

Bibliography

Blackburn, McKinley and Korenman, Sanders. 1994. "The Declining MMED." Journal of Population Economics 7(3): 249-70.
Bollinger, Christopher and Chandra, Amitabh. 2001. "Iatrogenic Specification Error: Cleaning Data Can Exacerbate Measurement Bias." First Draft, July 9, 2001, as presented at the 2001 Joint Statistical Meetings.
Cornwell, Christopher and Rupert, Peter. 1997. "Unobservable Individual Effects, Marriage and the Earnings of Young Men." Economic Inquiry 35(2): 285-294.
Goldin, Claudia. 1990. Understanding the Gender Gap: An Economic History of American Women. New York: Oxford University Press.
Gray, Jeffrey S. 1997. "The Fall in Men's Return to Marriage: Declining Productivity or Effects of Changing Selection?" Journal of Human Resources 32(3): 481-504.
Hirsch, Barry T. and Schumacher, Edward J. 2000. "Earnings Imputation and Bias in Wage Gap Estimates." Unpublished, February.
Korenman, Sanders and Neumark, David. 1991. "Does Marriage Really Make Men More Productive?" Journal of Human Resources 26(2): 282-307.
Lillard, Lee; Smith, James P. and Welch, Finis. 1986. "What Does One Really Know About Wages? The Importance of Nonreporting and Census Imputation." Journal of Political Economy 94(3): 489-506.
Nakosteen, R. A. and Zimmer, M. A. 1987. "Marital Status and Earnings of Young Men: A Model with Endogenous Selection." Journal of Human Resources 22(2): 248-68.
Spiers, Emmett F. and Knott, Joseph J. 1969. "Computer Methods to Process Missing Income and Work Experience Information." ASA Proceedings of Social Statistics Section, 289-297.
Welniak, Ed. 1990. "Effects of the March Current Population Survey's New Processing System on Estimates of Income and Poverty." Unpublished paper from the 1990 ASA convention.
Figure 1
Comparison of Smoothed and Non-Smoothed Original and Replication of Korenman and Blackburn's Male Marriage Earnings Differential Series
[Plot not reproduced. Series: Korenman and Blackburn's MMED; Replication of Original MMED. Horizontal axis: Year of Survey, 67-87.]

Figure 2*
Alternative Series for White Male Marriage Earnings Differential
[Plot not reproduced. Series: Alternative MMED series; Alternative MMED series, Smoothed. Horizontal axis: Year of Survey, 67-95.]

* A discontinuity between 1974 and 1975 is assumed for the smoothed differential by making 1974 and 1975 effectively endpoints. The three-year moving averages here and in the rest of the paper are weighted according to their standard errors. A linear time trend is assumed for the estimates and the errors in measurement are assumed to be independent. Based on these assumptions, for a series of coefficients X, Y and Z that has standard errors x, y and z:
the smoothed estimate of Y is ((x^2+z^2)*Y + 2*y^2*(X+Z)) / (x^2+z^2+4*y^2);
the smoothed estimate of X, a left endpoint, is ((4*y^2+z^2)*X + 2*x^2*Y - x^2*Z) / (x^2+z^2+4*y^2);
the smoothed estimate of Z, a right endpoint, is ((4*y^2+x^2)*Z + 2*z^2*Y - z^2*X) / (x^2+z^2+4*y^2).
Standard errors of smoothed estimates are calculated by assuming the independence of the consecutive years' standard errors. These equations are all calculated by picking the best linear unbiased estimator based on the assumptions listed above.

Figure 3
Trends in Education Coefficients Over the Years 1969-1981
[Plot not reproduced. Series: Years of Education 12 & 16; Years of Education 9-11 & 13-15 & 17-18. Horizontal axis: Year of Survey, 69-81.]
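The three smoothing formulas in the note to Figure 2 can be written out as a short Python function. This is a minimal sketch that transcribes those formulas directly, with X, Y, Z the three consecutive coefficients and x, y, z their standard errors; the function name `smooth_three` is hypothetical, and the sketch does not reproduce the standard-error calculation for the smoothed values.

```python
def smooth_three(X, Y, Z, x, y, z):
    """Weighted three-point smoothing under a linear trend.

    X, Y, Z are consecutive-year coefficients and x, y, z their standard
    errors. Returns the smoothed left endpoint, middle value and right
    endpoint, following the formulas in the note to Figure 2.
    """
    d = x**2 + z**2 + 4 * y**2  # common denominator
    left = ((4 * y**2 + z**2) * X + 2 * x**2 * Y - x**2 * Z) / d
    middle = ((x**2 + z**2) * Y + 2 * y**2 * (X + Z)) / d
    right = ((4 * y**2 + x**2) * Z + 2 * z**2 * Y - z**2 * X) / d
    return left, middle, right


# With equal standard errors the middle value is the simple three-year mean,
# and the endpoints lie on the least-squares line through the three points.
print(smooth_three(1.0, 2.0, 6.0, 1.0, 1.0, 1.0))

# Points that already lie on a line are left unchanged, whatever the weights.
print(smooth_three(1.0, 2.0, 3.0, 0.5, 2.0, 1.5))
```

A noisier middle observation (larger y) pulls the smoothed middle value toward the average of its neighbors, while a precise one is left close to its original value.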
Figure 4
Trend in White-Collar Coefficient over Time
[Plot not reproduced. Vertical axis: White Collar Earnings Differentials. Horizontal axis: Year of Survey, 69-81.]

Figure 5
Comparison of Predicted Mean Earnings in 1987 Dollars for Full-Time/Full-Year Workers by Marital Status
[Plot not reproduced. Series: Mean Log 'Wage', Never-Married; Mean Log 'Wage' - .25, Married. Vertical axis: Average Log-Wage. Horizontal axis: Year of Survey, 70-78.]

Figure 6
Probability Density Functions for Normal and Mixed Normal Distributions with parameters {Mean 1, Variance 1, Mean 2, Variation 2, Proportion of Distribution 2}
[Plots not reproduced. Panels: {5, 1, 2, 2, 0}; {5, 1, 2, 2, 0.25}; {5, 1, 2, 2, 0.5}.]

Figure 7
Comparison of Average Skew of Residuals from Median Regression by Marital Status
[Plot not reproduced. Series: Never-Married Skew, Smoothed; Married Earnings Skew, Smoothed; Never-Married Earnings Skew; Married Earnings Skew. Horizontal axis: Year of Survey, 67-85.]

Figure 8
Comparison of Probability Employed Full-Time/Full-Year by Marital Status for Earnings Last Year Employed Males
[Plot not reproduced. Series: Prop. Full-Time Never-Married; Proportion Full-Time Married - .1. Horizontal axis: Year of Survey, 69-78.]

Figure 9
Comparison of Proportions of Sample Trimmed because Residual was too High or too Low by Marital Status
[Plot not reproduced. Series: Never-Married, Lower Trim; Never-Married, Upper Trim; Currently Married, Lower Trim; Currently Married, Upper Trim. Vertical axis: Percentage of Sample Trimmed 2.58. Horizontal axis: Year of Survey, 67-95.]

Figure 10
Comparison of Trimmed with Non-Trimmed OLS Male Marriage Earnings Differential Series
[Plot not reproduced. Series: Alternative MMED series; Trimmed MMED series 2.58; Alternative MMED series, Smoothed; Trimmed MMED 2.58, Smoothed. Vertical axis: Male Marriage Earnings Differentials. Horizontal axis: Year of Survey, 67-95.]

Figure 11
Comparison of Trimmed OLS Male Marriage Earnings Differential Series
[Plot not reproduced. Series: Raw Trimmed MMED series; Trimmed MMED series 2.58; Raw Trimmed MMED, Smoothed; Trimmed MMED 2.58, Smoothed. Vertical axis: Male Marriage Earnings Differentials. Horizontal axis: Year of Survey, 67-95.]

Figure 12
Independent Regressors Used in the MMED Time Series: Married Female Labor Force Participation Rate (MFLFPR) for Women Ages 25-34, all Races; Percentage of Ever-Married Males in the 2.58 Trimmed Sample of Full-Time/Full-Year Workers; Percentage of Divorced/Separated Males in the 2.58 Trimmed Sample of Full-Time/Full-Year Workers.
[Plot not reproduced. Series: MFLFPR; % Ever-Married in Sample - 20; % Divorced/Separated + 30.]

Figure 13
Comparison of Observed and Predicted Male Marriage Earnings Differentials
[Plot not reproduced. Series: Trimmed MMED series 2.58; Trimmed MMED 2.58, Smoothed; Predicted Trimmed MMED; Predicted MMED MFLFPR Co (label truncated in source). Vertical axis: Male Marriage Earnings Differential. Horizontal axis: Year of Survey, 67-95.]

Figure 14
Comparison of Observed and Predicted Male Marriage Earnings Differentials (Age Only Controls)
[Plot not reproduced. Series: Raw Trimmed MMED series; Raw Trimmed MMED, Smoothed; Predicted Trimmed Raw MMED; Predicted Raw MMED, FLFPR (label truncated in source). Vertical axis: Raw Male Marriage Earnings Differential. Horizontal axis: Year of Survey, 67-95.]

Figure 15
Predicted Probabilities of Being Married by Age and Log-Earnings, Year 1967[79]
[Plot not reproduced. Vertical axis: Predicted Probability of being Married Conditional On Not Being Divorced/Separated. Horizontal axis: Age, 25-55.]

[79] Figures 15-18 are constructed with the predictions from the OLS regression of Currently Married, or Currently Divorced/Separated, on the log-wage, its square and cube and age dummies. The sample used for Figures 15-16 is the same as that used for the estimation of the 2.58 Cutoff, Trimmed MMED for the years 1967 and 1988. The sample used for Figures 17-18 is the same as that used for the estimation of the 2.58 Cutoff, Trimmed MMED for the years 1967 and 1988, excluding never-married observations.

Figure 16
Predicted Probabilities of Being Married by Age and Log-Earnings, Year 1988
[Plot not reproduced. Vertical axis: Predicted Probability of being Married Conditional On Not Being Divorced/Separated. Horizontal axis: Age, 25-55.]
Figure 17
Probability of Being Divorced/Separated Conditional On Having Been Married, Year 1967
[Plot not reproduced. Horizontal axis: Age.]

Figure 18
Probability of Being Divorced/Separated Conditional On Having Been Married, Year 1988
[Plot not reproduced. Horizontal axis: Age.]

Table 1A
Descriptive Statistics for Replication of Korenman and Blackburn. Years 1967-1996.

                         Original      Replication   Replication
                         1967-1988     1967-1988     1967-1996
Marriage Coefficients    .216          .213          .211
                         (.034)        (.032)        (.028)

Table 1B
Summary of Comparison of Replication with Korenman and Blackburn's Reported Time Series Results.

Dependent Variable         Original Korenman and Blackburn Series   Replication of Korenman and Blackburn Series
Sample Years               67-88    67-88    67-88    67-88         67-88    67-88    67-88    67-88
Trend*                     -.42     -.62     -.293    -.169         -.39     -.58     -.312    -.164
                           (.07)    (.12)    (.23)    (.173)        (.07)    (.10)    (.21)    (.104)
(Trend-14) x I(Year>80)    --       .59      .222     --            --       .57      .264     --
                                    (.28)    (.36)                           (.26)    (.325)
Post-76 Intercept          --       --       -3.17    -3.96         --       --       -2.62    -3.57
                                             (1.98)   (1.49)                          (1.80)   (1.37)
Adjusted R2                .61      .67      .70      .71           .62      .69      .70      .71

* The trend variable is equal to 1 during Year of Survey 1967.

Table 1C
Results from the Regression of Korenman and Blackburn's Marriage Earnings Differential Series on the Replication Marriage Earnings Differential Series

Independent Variables                     Coefficients/Statistics
Replication of Korenman and Blackburn     1.0671
                                          (.0412)
Constant                                  -1.203
                                          (339)
Adjusted R2                               .970
Table 2A
Summary of Sample Statistics for Simulation of Early Imputation Procedures using Data from 1988B March CPS

Percentages / Groups                                       Never      Divorced or   Currently
                                                           Married    Separated     Married
% of Sample Full-Time/Full-Year Before Imputation          .712       .748          .847
% of Sample Full-Time/Full-Year After Imputation[80]       .738       .765          .854
% of Overall Sample Assigned Non-Reporter Work-Status      .040       .030          .015
% of Full-Time/Full-Year Workers Misclassified[80]         .035       .023          .008

Table 2B
Summary of Alternate Hot-Decking Procedure's Impact on the MMED using Data from 1988B March CPS

[row label illegible]          None      Work-Status   Work-Status   Work-Status    Work-Status
                                         Only          Only          and Earnings   and Earnings
Characteristics Used           None      Head-Ship,    Head-Ship,    Head-Ship,     Same
for Hot-Decking                          Age and       Age and       [illegible]    [illegible][81]
                                         Earnings      Earnings
MMED                           .232      .244          .268[82]      .227           .228
                               (.010)    (.010)        (.011)        (.010)         (.010)
Residual Skew,                 -6.67     -5.99         -11.67        -5.20          -6.65
Never Married
Residual Skew,                 -2.93     -3.04         -8.19         -2.55          -2.98
Divorced/Separated
Residual Skew,                 -3.72     -3.71         -6.23         -3.67          -3.65
Married

[80] These percentages reflect the values from the simulation of the early imputation procedures.
[81] The same controls used in the main regression for estimating the MMED are used as hot-decking characteristics. The only exception is that ages are grouped into three categories: ages 25-34, 35-44 and 45-54.
[82] The difference between the MMEDs shown in the third and fourth columns is in the earnings value assigned to non-reporters. In column 3, the earnings are the mean earnings for heads and non-heads who are part-year or part-time workers. In column 4, the earnings are exp(-2) times the mean earnings for heads and non-heads who are full-time, full-year workers. The second set of earnings values is considerably lower than the first and is based on the calibration results.
Table 3
Statistics Used for Calibration of Misclassification of Work-History

                                                   Years 1970-74              Years 1976-79
                                                   Married   Never-Married    Married   Never-Married
Overall Means                                      2.638     2.329            2.648     2.468
Observed Raw MMED                                  .309                       .180
Sub-Sample Means                                   1.798     1.162            1.778     1.423
Proportion of Sample in Sub-Sample                 .0555     .0784            .0628     .0673
Proportion of Workers Classified as Full-Time      .823      .688             .792      .640
Proportion* Misclassified, Approach I              .0207     .0436
Percentage of Part-Year Workers Misclassified      .088
Predicted True Overall Means                       2.663     2.401            2.623     2.443
Predicted Raw MMED                                 .262                       .180
Bias to MMED                                       .047
Proportion* Misclassified, Approach II             .0229
Percentage Part-Year Misclassified                 .0684
Predicted True Overall Means                       2.638     2.391            2.623     2.443
Predicted MMED                                     .247                       .18
Bias                                               .062

* Predicted Proportion of Full-Time/Full-Year Workers.

Table 4
Summary of Alternative MMED Series Descriptive Statistics

                                                   Years 67-88      Years 67-96
                                                   Excluding 75     Excluding 75
Untrimmed OLS MMED                                 24.81            24.31
                                                   (3.59)           (3.21)
Untrimmed OLS MMED Standard Errors                 [illegible]      [illegible]
Trimmed OLS MMED, 2.58 Cutoff                      [illegible]      [illegible]
Trimmed OLS MMED, 2.58 Cutoff, Standard Errors     .854             .840
                                                   (.053)           (.052)
Trimmed OLS MMED, 1.96 Cutoff                      [illegible]      [illegible]
Trimmed OLS MMED, 1.96 Cutoff, Standard Errors     .755             .744
                                                   (.044)           (.045)
Trimmed Raw OLS MMED                               18.05            18.54
                                                   (2.91)           (2.75)
Trimmed Raw MMED Standard Errors                   [illegible]      [illegible]

Table 5
Descriptive Regressions

100*                     Nontrimmed    Trimmed Alternative MMED Series        Trimmed Alternative MMED Series
Indep. Variable          Alternative   Cutoff 2.58                            Cutoff 1.96
                         MMED Series
Sample Years             67-96         67-96    67-96    67-96    67-96       67-96    67-96    67-96    67-96
Year/10                  0.0892        -1.22    .464     .91      --          -1.08    .433     1.35     --
                         (0.274)       (3.41)   (1.87)   (2.99)               (3.69)   (2.02)   (3.14)
Year^2/100               -0.176        0.5      12.5     -.90     --          0.40     1.2      -1.2     --
                         (0.688)       (0.9)    (.57)    (.90)                (0.90)   (0.6)    (.89)
Post-76 Intercept        -7.21         -2.6     --       -2.14    -3.05       -2.46    --       -1.93    -3.00
                         (1.77)        (2.18)            (1.86)   (.84)       (2.36)            (1.96)   (.91)
Post-89 Intercept        --            --       --       4.83     1.95        --       --       5.59     2.08
                                                         (1.53)   (.79)                         (1.61)   (.85)
Year=1969 outlier        -5.41         -1.94    -1.95    -1.85    --          -1.44    -1.45    -1.34    --
intercept                (1.78)        (2.28)   (2.30)   (1.95)               (2.48)   (2.48)   (2.05)
Year=1975 outlier        1.1           -1.23    -2.35    -1.89    -.821       -3.34    .44      -4.1     -2.87
intercept                (2.11)        (2.51)   (2.35)   (2.15)   (2.09)      (2.72)   (2.53)   (2.26)   (2.24)
Adjusted R2              .712          .132     .117     .369     .304        .128     .125     .402     .316

Table 6
Summary of Time Series Regressions of 2.58 Cut-Off, Trimmed MMEDs on Descriptive Statistics
2.58 Cut-Off, Trimmed MMEDs, Years 1967-1988

Independent Variables         Means     (1)       (2)       (3)       (4)       (5)       (6)       (7)+
MFLFPR[83]                    52.7      -0.398    -0.662    .327      .0525     .0571     .055      -0.45
                              (11.38)   (0.260)   (0.349)   (.31)     (0.305)   (.328)    (.254)    (.059)
MFLFPR x (Year>1981)          23.5      --        --        --        -.061     .181      --        --
                              (31.91)                                 (0.377)   (.627)
Percentage of Employed        88.25     -0.214    0.122     -.098     -.74      -.954     -0.697    -.741
Ever Married                  (3.60)    (.40)     (0.498)   (.485)    (0.544)   (0.709)   (0.458)   (.086)
Percentage of Employed        30.60     --        --        --        --        .57       --        --
Ever Married x (Year>1981)    (41.4)                                            (1.16)
Percentage of Employed        7.07      0.87      1.06      .847      1.57      1.56      1.70      1.53
Divorced/Separated            (2.62)    (0.87)    (0.88)    (8)6)     (1.16)    (1.1))    (.77)     (0.19)
Percentage of Employed        3.60      --        --        --        2.27      2.21      2.08      2.42
Divorced/Separated            (4.88)                                  (1.63)    (1.68)    (1.09)    (0.20)
x (Year>1981)
(Year>1981)                   .364      --        --        --        -22.1     -85.6     -24.0     -28.82
                              (.49)                                   (16.1)    (130)     (10.2)    (1.77)
(Year>1974)                   .636      --        --        -.816     -1.89     -1.92     -1.83     -2.77
                              (.492)                        (1.81)    (1.44)    (1.48)    (1.32)    (.30)
(Year=1975)                   .045      -2.38     -2.57     .200      -1.76     -1.71     -1.79     -1.46
                              (.213)    (1.63)    (1.63)    (1.87)    (1.36)    (1.4)     (1.30)    (.16)
Trend                         11.5      --        0.571     --        --        --        --        --
                              (6.5)               (0.508)
D.W. Statistics               -         1.71      1.81      1.74      2.62      2.61      2.62      2.24
Adjusted R2 / Pseudo R2       -         0.514     0.522     .490      0.745     0.729     0.763     0.614

+ Median Regression Specification.
[83] Married Female Labor Force Participation Rate.

Table 7
Summary of Time Series Regressions of Trimmed Raw (Age Controls Only) MMEDs on Descriptive Statistics
2.58 Cut-Off, Trimmed Raw MMEDs, Years 1967-1988
Independent Variables         Means     (1)       (2)       (3)       (4)       (5)       (6)       (7)+
MFLFPR[84]                    52.7      .0372     -0.686    -.166     -0.633    -0.651    -0.498    -0.466
                              (11.38)   (0.253)   (0.331)   (.287)    (0.306)   (.332)    (.261)    (.788)
MFLFPR x (Year>1981)          23.5      --        --        --        .327      .417      --        --
                              (31.91)                                 (0.374)   (.629)
Percentage of Employed        88.25     .157      .564      .491      -.088     -.168     -0.316    .460
Ever Married                  (3.60)    (.389)    (.475)    (.448)    (0.541)   (0.713)   (0.47)    (1.24)
Percentage of Employed        30.60     --        --        --        --        .212      --        --
Ever Married x (Year>1981)    (41.4)                                            (1.16)
Percentage of Employed        7.07      .881      1.10      .815      2.73      2.73      1.99      1.97
Divorced/Separated            (2.62)    (.851)    (0.84)    (.83)     (1.16)    (1.2)     (.79)     (2.56)
Percentage of Employed        3.60      --        --        --        -.355     -.381     .668      1.86
Divorced/Separated            (4.88)                                  (1.62)    (1.69)    (1.11)    (3.69)
x (Year>1981)
(Year>1981)                   .364      --        --        --        -21.1     -44.7     -10.6     -3.24
                              (.49)                                   (15.9)    (130.2)   (10.36)   (35.0)
(Year>1974)                   .636      --        --        .232      .321      -3.22     -3.58     .479
                              (.492)                        (1.67)    (1.42)    (1.47)    (1.34)    (3.36)
(Year=1975)                   .045      -1.81     -2.03     -.741     -.482     -.464     -.323     -.071
                              (.213)    (1.63)    (1.59)    (1.76)    (1.37)    (1.423)   (1.342)   (1.804)
Trend                         11.5      --        -.636     --        --        --        --        --
                              (6.5)               (0.484)
D.W. Statistics               -         1.76      2.00      1.94      2.70      2.67      2.58      1.96
Adjusted R2 / Pseudo R2       -         0.752     0.766     .765      0.865     0.854     0.867     0.711

+ Median Regression Specification.
[84] Married Female Labor Force Participation Rate.

Table 8A
Summary of Importance of Earnings for Probability of Being Married or Divorced/Separated for the Years 1967 and 1988

Regression               Probability of Being Married*    Probability of Being Divorced/Separated+
Year                     1967        1988                 1967        1988
Log-Wage                 -.011       .34                  -.53        -.24
                         (.026)      (.11)                (.34)       (.13)
Log-Wage^2 / 100         1.2         -1.0                 1.5         .69
                         (1.1)       (.47)                (1.4)       (.52)
Log-Wage^3 / 1000        -.20        .12                  -.17        -.08
                         (.154)      (.064)               (.20)       (.07)
Pseudo R2                .106        .159                 .049        .027
Sample Size              13279       12395                12813       13351

Table 8B
A Table Summary of Changes in Means and Dispersion of Probabilities of Being Married or Divorced/Separated for the Years 1967 and 1988
+ Divorced/Separated Period Year 1967 Year 1988 Year 1967 Year 1988 Group Nev? Married Nev.“ Married Married Lollicger Married Loliiger Mamed Married Married Married Age .77 .81 .45 .48 .02 .015 .081 .096 25 (.09) (.07) (.08) (.07) (.013) (.007) (.036) (.059) Age .89 .93 .84 .85 .023 .024 .114 .137 3 5 (.045) (.046) (.048) (.046) (.017) (.005) (049) (.058) Age .92 .95 .91 .93 .042 .054 .120 .140 45 (.06) (.03) (.05) (.03) (.022) (.043) (.044) (.047) Age .90 .945 .92 .965 .020 .020 .080 .097 54 (.074) (.038) (.056) (.037) (.016) (.010) (.041) (.052) ' Conditional on not being Divorced/Separated and being part of the sample used to determine the 2.58 Cutofl', Trimmed MMED. + Conditional on having been Married and being part of the sample used to determine the 2.58 Cutoff, Trimmed MMED. 91 Appendix A Description of Replication of Korenman and Blackburn’s MMED Series Korenman and Blackburn construct their time series of cross-sectional MMEDs for fiill-year/full-time, white workers using the 1967-1988 March CPS samples. Their dependent variable is the log of an individual’s earnings in 1987 dollars divided by 2000. The sample is restricted to workers whose dependent variable is greater than one and who report no personal self-employment income85 and are not employed in agriculture or fisheries and wildlife, private household service, or welfare and religious service“. A constant-real-value top code equal to the real value of the lowest top-code is applied to the earnings data. MMEDS "Mrs” :- are then estimated, without sampling weights”, with three categories of marital status: never married, currently married88 and divorced or separated”. Other controls include 29 age dummies, 18 education dummies, 10 industry dummies, one white-collar dummy, 8 region dummies and dummies for whether the respondent's residence is outside an SMSA or in a central citygo. 
The industry and white-collar dummy variables were more difficult to replicate since the two-digit classification of major industry⁹¹ and occupations⁹² last year changed twice over the 1967-1996 period⁹³.

⁸⁵ Korenman and Blackburn remove individuals with positive values of personal income from self-employment. In the later, alternative specification only individuals who report zero personal income last year from self-employment are in the sample.
⁸⁶ The two-digit codes for industry last year were used to form the industry dummy variables.
⁸⁷ The alternative MMED series uses sampling weights to ensure that differences in targeted sampling do not affect the MMED series.
⁸⁸ Currently married includes the category married, spouse absent.
⁸⁹ Widowers are excluded from the sample.
⁹⁰ The age, education, and region dummies were easy to define based on standard reported characteristics including the nine census regional divisions. For the place of residence variables, there is an additional category in later years for respondents whose MSA status is not identifiable. Since Korenman and Blackburn do not report this category, individuals in this category were included with those living inside the SMSA.
⁹¹ The ten industry dummy variables were condensed from the many two-digit industry last year codes.
⁹² Occupations chosen to be included in the white-collar status were those that were listed first among the two-digit occupation codes. They include professional, medical and salaried managerial occupations and clerical and sales occupations.
⁹³ It changed between the 1975 and 1976 and between the 1982 and 1983 March CPS Surveys.

Appendix B
Description of Additional Changes Made to Form the Alternative MMED Series

For an alternative series spanning the years 1967-1996, the specification needs to be adjusted to ensure consistency in the MMED series and to account for additional changes made in sample construction.
One of the biggest changes in the data sets that affects the estimation of the MMED is how years of education are reported. From 1992 on, instead of asking how many years of education someone has, the CPS surveys ask about degree completion. To make a consistent series, six education categories are formed for each year⁹⁴. Besides the change in the education dummy variables, the restriction of the sample to exclude individuals with negative values of income from self-employment, and the use of sampling weights, the additional change to the regression specification is the formation of fourteen industry categories⁹⁵.

The changes made in the formation of the data set are as follows. Sampling weights are now used to control for year-to-year differences in targeted sampling. Also, one no longer needs to drop individuals whose wage is less than one dollar, since the trim will take care of these values. Finally, instead of imposing a

⁹⁴ The first category consists of individuals with less than nine years of education. The second category consists of individuals with some high school education but not a high school diploma, or nine to eleven years of education. The third category consists of individuals with a high school diploma and with or without some college education, but no college diploma, or twelve to thirteen years of education. The fourth category consists of individuals with an associate degree or who went to a technical college, together with individuals who report fourteen to fifteen years of education. The fifth category consists of individuals with a bachelor degree, or sixteen to seventeen years of education. The sixth category consists of individuals with a post-bachelor’s degree or who have eighteen years of education.
⁹⁵ The industry categories are: Construction, Mining, Manufacturing, Transportation and Communication and Utilities and Postal Workers, Wholesale, Retail, Banking/Finance and Insurance and Real Estate, Business and Repair Services,
Personal and Entertainment/Recreation Services except Private Household, Medical and Health Services (excluding Hospitals), Hospitals, Educational Services, Other Professional Services, Public Administration.

consistent low top-code across the years, the log-earnings are consistently imputed for top-coded respondents.

These changes are made to take into consideration the findings of Bollinger and Chandra (2001). Bollinger and Chandra find that using arbitrary values of the dependent variable to trim or censor observations likely introduces more bias than it removes. Because the underlying distribution does not stay fixed from year to year, imposing a fixed censoring value may dampen the true year-to-year variation in the MMED.

An alternative approach is to impute the values of top-coded observations⁹⁶. To do this a simple assumption is made: the upper half of the log-earnings distribution is normally distributed. The median log-earnings value is found. A variable is created where all observations whose log-earnings are less than the median log-earnings value are assigned the median log-earnings value. Then, a Tobit is estimated with the median value as the lower censoring point and the top-coded value as the upper censoring point. After the Tobit regression, one calculates the expected value of the log-earnings for top-coded observations under the assumption that the Tobit distribution is accurate and that the true values are above the assigned values. The top-coded observations are then assigned the predicted true log-earnings. This approach maintains consistency across the years by simultaneously correcting for the changing real values of the top-code and of the underlying log-earnings distribution.

⁹⁶ The current dollar levels of top-codes vary a fair amount across the data sets, from 50,000 to 75,000 to 99,999, to the last couple of years where there is no fixed top-code but a number of values above 100,000.
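The last step of the imputation described above, assigning each top-coded observation its expected log-earnings given that the true value lies above the top-code, can be sketched as follows. This is a minimal illustration and not the code used for the dissertation: it assumes the Tobit fit has already supplied a conditional mean `mu` and a standard deviation `sigma` for the normal upper half of the log-earnings distribution, and the function name and all numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def expected_log_earnings_above(top_code_log, mu, sigma):
    """Return E[y | y > top_code_log] when y ~ Normal(mu, sigma**2).

    mu and sigma are assumed to come from a previously estimated Tobit;
    they are inputs here, not estimated by this sketch.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    a = (top_code_log - mu) / sigma  # standardized top-code
    # Inverse Mills ratio: density at the cutoff over the upper-tail mass.
    mills = std_normal.pdf(a) / (1.0 - std_normal.cdf(a))
    return mu + sigma * mills


# Hypothetical values: a fitted mean of 10.0 log-dollars, a standard
# deviation of 0.5, and a top-code of 99,999 current dollars.
top_code = math.log(99999)
imputed = expected_log_earnings_above(top_code, mu=10.0, sigma=0.5)
```

By construction the imputed value lies above the top-code, and it moves with `mu` and `sigma`, which is how the approach adjusts for a changing underlying distribution.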
In the imputation procedures, for all of the data sets, any observation whose current dollar value is above 99,999 is reset to 99,999 before imputation. This allows for consistency across the years.

Another advantage is that one can test the accuracy of the normal distribution assumption for the top half of the earnings distribution⁹⁷.

⁹⁷ This is quite easy to do. Isolate the observations at or above the median so that all the observations can be expressed as μ_MED + ω, where ω ≥ 0. For all the observations where ω > 0, additional observations can be created with the values μ_MED − ω. This guarantees a symmetric distribution and that, if the upper half of the log-earnings distribution is normally distributed, then the constructed distribution will be normally distributed. A Box-Cox test for normality can then be estimated on the constructed distribution. The Box-Cox tests on the transformed upper half of the log-earnings distribution for full-time/full-year white males ages 25-54 for the years 1967-1996 fail to reject, at the five percent significance level, the null hypothesis that the constructed log-earnings distribution is normally distributed. The full log-earnings distribution of full-time/full-year males for every year did differ very significantly from the normal distribution.

Appendix C
Description of Why the 1975 Trimmed MMED Is Considered an Outlier

The year 1975 appears to be an outlier as it deviates strongly from the overall downward trend between 1974 and 1976 in figure 10. In figure 9, the percentage of never-married males excluded in the lower portion of the trim in 1975 is comparable with the earlier years. However, the figure below shows that the unusually low trimmed MMED for 1975 is because of how another form of measurement error affects the trimming process for 1975.
The average heteroskedastic variance for both married and never-married males is unusually low for the year 1975. When the heteroskedastic variance is biased towards zero and the distributions have a negative skew, the result is that a disproportionate number of lower observations, particularly in the never-married sample, are excluded in 1975.

What would bias the variance downward? The 1976 March CPS data set is the first in which precise hours and weeks worked last year are asked of respondents. If a larger number of respondents refuse to give their true work history for this year⁹⁸, as is shown to be the case in figure 2 of this appendix, then one would impute both their work history and their income from last year. If the imputed incomes were, on average, closer to average incomes, then the variance of log-earnings would be biased downwards by an unusually large number of imputations.

⁹⁸ Perhaps because of the way the question was phrased.

Figure C-1
Comparison of Average Heteroskedastic Standard Deviations by Marital Status
(Vertical axis: Mean Heteroskedastic Variance. Horizontal axis: Year of Survey, 67-81. Series: Never-Married Earnings Variance; Never-Married Variance, Smoothed; Married Earnings Variance; Married Earnings Variance, Smoothed.)

Figure C-2
Trends in the Proportion of Observations by Marital Status with Imputed Earnings Last Year
(Vertical axis: Percentage of Sample with Imputed Income from Last Year. Horizontal axis: Year of Survey. Series: % in Never-Married Sample; % in Married Sample.)

Appendix D
Comparison of Accuracy of Prediction between the Median and Mean MMED 1967-1988 Time Series Regressions

As shown in tables 6-7, the preferred, final specification is estimated with both a mean and a median regression. To distinguish which regression is better, a prediction is made for the years 1989-1994 for both regressions.
The regression results show that the 2.58, trimmed MMED regression predicts better out of sample than the 1.96, trimmed MMED regression. The correlation between the predicted and actual MMEDs is negative for the latter trimmed series. A likely explanation for the inaccuracy of prediction using the 1.96, trimmed series is the measurement error in the estimated standard errors. When the trim is tighter, the measurement error in the standard errors is more important since the density of the residual distribution is higher. As such, variation in the standard errors may introduce noise to the trimmed MMED⁹⁹.

Table D-1
Summary of Median Regression Assessments of Accuracy of Predicted Trimmed 2.58 Controlled MMEDs for the Years 1989-1994

Dependent Variable             Trimmed 2.58    Trimmed 1.96    Trimmed 2.58
                               MMEDs           MMEDs           Raw MMEDs
Predicted MMED Series,         .816            -.163           -.318
  Mean Regression              (.795)          (.677)          (2.68)
Pseudo R²                      .023            .154            .003
Predicted MMED Series,         .707            -.235           1.085
  Median Regression            (.681)          (.436)          (1.01)
Pseudo R²                      .085            .141            .382

Has the Male Marriage Earnings Differential’s Causality Changed? A Historical Overview of the Literature

I. Introduction.

The sizeable and significant nature of the Male Marriage Earnings Differential (MMED) was first set out by Hill (1979) over twenty years ago. Since then, the bulk of research on the MMED has centered on inferring the direction of the causality between marriage and higher earnings rather than its magnitude¹⁰⁰ or its variation across time or countries¹⁰¹. Papers have tried to show that marriage causes earnings to rise and form the MMED or that selection into marriage based on earnings ability is responsible for the MMED¹⁰². Yet these stories are not mutually exclusive and together may explain the MMED. Moreover, the relative importance of the different potential causes of the MMED may differ across time or location with different patterns in sex roles and marriage.
To determine whether the relative magnitudes of the various effects causing the MMED vary across time, the return to years of marriage after controlling for fixed effects from Gray (1997) and Stratton (2002) are reexamined.

⁹⁹ The noise in the standard error estimation is more significant when the underlying residual distribution is not symmetric. In this case, any variation in the level at which the mean is trimmed leads to variation in the asymptotic mean. This seems likely to be the case with the MMED.
¹⁰⁰ Korenman and Neumark (1991)’s 10-40 percent range no longer appears to be an accurate description of reality. The only MMED reported in this paper that is near 40 percent is in Nakosteen and Zimmer (1987). The size of their MMED could be specious due to the inclusion of males ages 18-22 in their sample. The earnings of this group of young, mostly unmarried males will be lower than the earnings of older married males. Moreover, among younger males in more recent samples, Gray (1997) finds MMEDs statistically significantly less than 10 percent. As such, a better range for the MMED would be 5-25.
¹⁰¹ Goldin (1990) declares that the MMED has been virtually stable. Similarly, Schoeni (1990) fails to find significant systematic differences in the MMED across developed countries.
¹⁰² The best way to determine the direction of causality of the MMED is by simultaneously controlling for differences in unobserved ability with fixed effects and for how many years someone has been married. The fixed effects pick up part of the effect from selection, and the years married show whether earnings grow with how long someone has been married. The longitudinal MMED

Gray (1997) compares the longitudinal estimates of the MMED and the return to years of marriage with the NLS and NLSY. Stratton (2002) similarly looks at longitudinal estimates of the MMED and the return to years of marriage/cohabitation using data from the National Survey of Families and Households (NSFH).
The return to years of marriage, in Gray (1997), is significantly greater than zero in the earlier but not the later cohort. However, the cohorts in Gray (1997) have returns to years of marriage that do not differ statistically significantly from each other. To confirm a change in the return to years of marriage, a residual-based trim is used to reexamine Gray (1997) and Stratton (2002)¹⁰³. Then, the trimmed longitudinal returns to years of marriage are compared to each other and to other papers’ estimates to verify whether the return to years of marriage is declining.

After looking at the longitudinal evidence, other studies that investigate the prevalent direction of causality behind the MMED with different methods¹⁰⁴ are compared by when their data sets are collected. A meta-analysis of the MMED across age groups and by periods when the data was collected is then estimated.

is examined along with controls for years of marriage and its square and, in most studies, years divorced.
¹⁰³ The residual-based trim is estimated using Gray (1997) and Stratton (2002)’s original data sets. The trim is based on the residuals from a median regression. This is because one can tell whether an observation’s reported log-wage is aberrant by its residual. The residuals from a median regression are used since “outliers” due to measurement error are less likely to affect the estimates of a median regression. Stratton’s sample is altered to allow for better comparison with other studies. Instead of using all males between the ages 18 and 65, the sample is restricted to observations between the ages 25 and 54.
¹⁰⁴ There are other techniques used that either help to distinguish between the competing causal explanations or provide evidence for or against one of the two hypotheses.
However, because the number of papers is small with a fair amount of variation in techniques and sample selection, one can only directly compare the longitudinal MMEDs with controls for years of marriage across the studies. However, the qualitative implications of the different studies as to the causality behind the MMED can be considered.

In general, the qualitative comparisons of the remaining papers are consistent with the findings of the longitudinal studies. In more recent years, there is less evidence for the causal effect from being married to having higher earnings and more evidence for higher earnings increasing the likelihood of being married.

To find a change in the predominant direction of causality with no evidence of a change in the cross-sectional estimates of the MMED illustrates the reduced-form nature of observed cross-sectional MMEDs. These naive estimates are not useful for determining underlying structural causality, or for tracing out the effects of a change in structural causality. Additional information is needed to infer whether the direction of causality has changed. But the evidence accumulated in this paper does indicate that when social structures such as sex roles, marital behavior, divorce law, etc. change, economic relations will also change.

II. Reinvestigation into Gray (1997)

Gray (1997) uses young men aged 24-31 in 1976 or 1989 from the National Longitudinal Study of Young Men (NLS) and National Longitudinal Study of Youth (NLSY) to test whether there has been a change in the causality of the MMED over time. Table 1 compares the results of the non-trimmed and trimmed versions of an alternate specification of Gray (1997)¹⁰⁵. In the second row, we see that 15 or 15.7 percent of the sample is removed by a residual-based trim¹⁰⁶.
In the third row, we see that there is a low adjusted R-squared for the linear probability regression of whether an observation is trimmed or not on observables¹⁰⁷. The linear probability regression serves as a check that observations are not being systematically excluded from the sample by the trim.

The impact of the trim is apparent from a comparison of the standard errors for the trimmed and non-trimmed samples. For the longitudinal estimates of the MMED without controls for years of marriage, the standard error goes from .30 to .21 and .25 to .15. The standard error is around a third lower than its previous

¹⁰⁵ Gray (1997) includes the existence of dependents in the regression, as do Cornwell and Rupert (1997), Akerlof (1997) and Korenman and Neumark (1991). However, as couples are likely to wait to have children until they can afford them and the variable is partially collinear with marriage, it is not included in the specifications run here. Generally, the coefficient on the variable goes very close to zero when one controls for fixed effects and years of marriage.
¹⁰⁶ The specification used is a slight variation on Gray’s original cross-sectional specification. Changes include how five education category dummies are included in the regression. The categories include less than nine years of education, less than twelve but greater than eight years of education, twelve years of education, less than sixteen but greater than twelve years of education, sixteen years of education and greater than sixteen years of education. Also, whether or not dependents are present in the household is not included in the specification. This variable is omitted because of its endogeneity and correlation with marriage. The specification described above is first used for a median regression. Only the residuals are considered from the median regression.
Then, a mean, individual, fixed-effects regression is estimated with the median regression residuals as the dependent variable. The residuals from the fixed-effects regression are then used for the trim. The two steps are needed because of the limit on the number of variables in STATA and because fixed effects have not been programmed for median regressions. Then, an estimate is made of the conditional median log-variance of the residual distribution by regressing two times the log of the absolute value of the fixed-effects regression’s residuals on the same independent variables used in the original median regression. The specification used in the median log-variance regression is the same as that used in the original median regression. Predicted log-variances are made for all observations. The predicted values are exponentiated and then the square root is taken to arrive at a set of robust estimates of the conditional standard deviations of the residual distributions. Robust is here used to refer to how the influence of outliers on the standard errors is reduced relative to what would be the case if a mean log-variance regression were used to estimate standard deviations. Then, all observations whose residuals from the residual fixed-effects regression are more than 2.58 standard deviations away from zero are removed from the sample.
¹⁰⁷ The same variables as the Gray specification without the dependents variable are included in the median regression. The linear probability regression for being trimmed excludes year dummies. The purpose of this linear probability regression is to make sure that the trim does not disproportionately remove observations from the sample based on any observable characteristics. This is a safeguard against the trim causing additional bias to the observed coefficients through an endogenous attrition of the sample.
It makes sure that the controls for heteroskedasticity in the trim are working to ensure that the cut-off values for the trim are consistent across observable characteristics. This is particularly relevant for estimation if there are asymmetries in the earnings distribution, since to trim too deeply because of the underestimation of the standard deviation for a particular group will exacerbate the inevitable bias caused from trimming an asymmetric distribution.

value, although the gains in efficiency seem to be higher for the NLSY than the NLS¹⁰⁸.

There is now a statistically significant change in the return to years of marriage between the two cohorts¹⁰⁹. However, with the NLS, 2.3 percentage points of the original 12.0 percent non-trimmed cross-sectional MMED are attributable to years of marriage¹¹⁰. The remaining 9.7 percentage points are attributable to selection for the earlier period. In the NLSY, none of the 9.8 percent non-trimmed cross-sectional MMED can be directly attributed to years of marriage. This indicates that the cross-sectional MMED may have fallen primarily because there is no longer a return to years of marriage.

III. Reinvestigation into Stratton (2002)

Stratton (2002) looks at the MMED and the difference in earnings between cohabiting males and single, non-cohabiting males. It uses data from the 1987-88 and 1992-94 waves of the NSFH. Her original sample is restricted to white, non-Hispanic men under the age of 65 and at least 18. Tables 2 and 3¹¹¹ summarize the findings from looking at Stratton’s original sample and additional samples. Stratton, in her original sample, finds a significant cross-sectional MMED and cohabiting earnings differential. Stratton also finds a

¹⁰⁸ This is shown by how a larger portion of the sample is trimmed and how the adjusted R-squared rises more because of the trim.
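The final step of the residual-based trim described in the notes to Section II, turning predicted log-variances into standard deviations and dropping observations beyond the 2.58 cutoff, can be sketched as follows. This is a minimal sketch and not the original STATA code: it assumes the residuals from the fixed-effects step and the predicted log-variances from the median log-variance regression are already in hand, and the function names and example values are hypothetical.

```python
import math


def log_variance_target(residual):
    """Dependent variable of the log-variance regression: 2 * ln|residual|.

    The residual must be nonzero, since ln(0) is undefined.
    """
    return 2.0 * math.log(abs(residual))


def trim_mask(residuals, predicted_log_variances, cutoff=2.58):
    """Return one boolean per observation: True if it survives the trim."""
    keep = []
    for resid, log_var in zip(residuals, predicted_log_variances):
        # Exponentiate the predicted log-variance, then take the square
        # root to obtain the conditional standard deviation.
        sd = math.sqrt(math.exp(log_var))
        # Drop residuals more than `cutoff` standard deviations from zero.
        keep.append(abs(resid) <= cutoff * sd)
    return keep


# Hypothetical residuals, each with a predicted variance of 0.25 (sd = 0.5).
residuals = [0.1, -0.9, 2.0, -0.3]
log_vars = [math.log(0.25)] * len(residuals)
kept = trim_mask(residuals, log_vars)
```

Because the standard deviation is predicted separately for each observation, the cutoff adapts to groups with more or less dispersed earnings, which is what the linear probability check is meant to confirm.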
¹⁰⁹ In a regression where the two cohorts were estimated simultaneously, the restriction that the return to years of marriage and years of divorce had not changed between the two cohorts was rejected at the 5% significance level.
¹¹⁰ Trimming the mean does reduce the MMED. However, if this reduction is because of asymmetries in the log-earnings distributions conditional on marital status and the asymmetries are due to endogenous selection, then the trim reduces the influence of selection on the MMED.
¹¹¹ Only observations found in both waves are included in the cross-sectional regression, unlike in Stratton (2002). This tends to raise the MMED but the overall qualitative implications are not changed.

significant return to years of marriage¹¹² and evidence that the return to years of cohabitation is only positive for males who are long-term cohabiters¹¹³.

However, Stratton’s sample includes very young males ages 18-24. This group is very young with correspondingly low earnings. They also have a very low likelihood of being married. The correlation between experience and marital status is made more positive by the inclusion of this group in the sample. The strengthening of the correlation between experience and marital status increases the magnitude of bias transferred from the experience coefficients¹¹⁴ to the marital-status coefficients. The inclusion of younger, unmarried males will tend to bias the MMED upwards. Because of this, the sample is restricted to males ages 25-54¹¹⁵.

Some of Stratton’s findings are sensitive to the change in the sample and specification¹¹⁶. There no longer is a significant cohabitation earnings differential and there is no longer any significant return to years of marriage.

¹¹² Stratton (2002) only lists the results from regressing the log of years of marriage plus one. This paper regresses years of marriage and its square to maintain continuity with earlier papers.
In the original Stratton sample, neither years of marriage nor its square is individually statistically significant, but together they are statistically significant.
¹¹³ The years cohabited is negative but its square is significantly positive so that after two years of cohabitation there begins to be a positive return to years cohabited.
¹¹⁴ The experience coefficients are biased in part by endogenous selection out of employment.
¹¹⁵ Older males are unlikely to be never married and also are excluded. There also is a selection effect from endogenous retirement for older males that can be avoided by not including the oldest males.
¹¹⁶ Stratton’s original specification included education, experience, experience squared, tenure, tenure squared, tenure and tenure squared interacted with the second wave, years not employed, residence in SMSA, a second wave interaction with SMSA, South, Active in Union with wave interaction, children present, wave dummy, eleven industry dummies, and seven occupation dummies. Children present and years not employed are too likely to be endogenous. The new specification omits children present, years not employed, and all the wave interactions used above. Most of the wave interactions are not significant and no a priori reason for why they should be included is given. Instead of education, education cohorts are used for less than nine, less than twelve but more than eight, twelve, less than sixteen but more than twelve, sixteen, and more than sixteen years of education. These cohorts are interacted with wave dummies. This is because of the rising return to education. Only the dummy variable for sixteen years of education is significantly greater in the second wave.

When a residual-based trim¹¹⁷ is run on the modified Stratton sample, the cross-sectional and longitudinal MMED both become statistically significantly different from zero.
It also becomes more apparent that controls for years of marriage and years cohabited do not explain any portion of the longitudinal MMED. This confirms the findings of Gray (1997) that there does not appear to be a return to years of marriage in the late eighties.

IV. A Comparison of the Findings of Longitudinal Studies

Table 4 summarizes the findings of longitudinal studies of the MMED¹¹⁸. The findings are summarized in the form of the cross-sectional MMED, longitudinal MMED and the longitudinal MMED after controlling for years of marriage and divorce. Unfortunately, the first paper, Cornwell and Rupert (1997), is not comparable to the other studies since it includes tenure at current job and its square in its final longitudinal regression that measures the return to years of marriage. Otherwise, Korenman and Neumark (1991) and Trimmed Gray (1997) A¹¹⁹ are very similar in their findings of a significant return to years of marriage after controlling for fixed effects¹²⁰.

Both Akerlof (1997) and Gray (1997) B find a lower MMED and a lower return to years of marriage in the National Longitudinal Survey of Youth (NLSY). Akerlof (1997) uses data drawn from an earlier set of years than Gray (1997) B. There still is a positive return to years of

¹¹⁷ The same residual-based trim used to analyze Gray (1997) earlier in the paper.
¹¹⁸ The coefficients from the MMED and dependents variable are added together for Cornwell and Rupert (1997), Korenman and Neumark (1991), and Akerlof (1997). The standard errors presented are estimated by taking the square root of the sum of the squares of the reported standard errors for the two coefficients. This is a conservative estimate of what would be the MMED and its standard error if the dependents variable was not included in the specification.
¹¹⁹ There are two time periods where Gray (1997) estimates the MMED. The first is from 1976-1980 and the second is from 1989-1993.
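The combination rule used for the Table 4 entries, adding the marriage and dependents coefficients and taking the square root of the sum of their squared standard errors, can be illustrated with a short sketch. The function name and the numbers below are hypothetical and are not taken from any of the cited studies. The rule treats the two estimates as uncorrelated, since the covariance between them is not reported in the source papers.

```python
import math


def combined_effect(b_married, se_married, b_dependents, se_dependents):
    """Sum two coefficients and combine their standard errors.

    The standard error is the square root of the sum of squared standard
    errors, which leaves out any covariance between the two estimates.
    """
    estimate = b_married + b_dependents
    std_error = math.sqrt(se_married ** 2 + se_dependents ** 2)
    return estimate, std_error


# Hypothetical coefficients (log points) and standard errors.
estimate, std_error = combined_effect(0.06, 0.03, 0.02, 0.04)
```

With these illustrative inputs the combined standard error is larger than either reported standard error.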
¹²⁰ Both papers use the National Longitudinal Survey of Young Men for their papers.

marriage in Akerlof (1997) but it is lower than in Korenman and Neumark (1991). Then in Gray (1997) B, there does not appear to be any return to years of marriage. The MMEDs from Stratton (2002)¹²¹ are somewhat higher than in the other papers. However, this could be because of endogenous sample attrition¹²². In Stratton (2002), the signs on years of marriage and its square are in the wrong direction and the MMED rises, instead of falling, when controls for years of marriage are added. Altogether, there appears to be a downward trend in whether the MMED rises with years of marriage.

To further investigate whether the causality of the MMED has changed, the rest of the paper compares the findings of investigations into the causality of the MMED by the time periods when the data is collected. Then a meta-analysis is done to test whether the MMED has fallen after controlling for differences in the age group used to estimate the MMED.

IV. Earliest Studies.

The earliest study by Hill (1979) consists of verifying whether the MMED is still significantly positive after one adds additional controls to the earnings regression. Hill (1979) finds that the MMED remains significantly positive and even rises with the number of controls. Hill, after demonstrating the existence of a MMED that is unexplainable by observable characteristics, poses the question as to its source¹²³.

¹²¹ These MMEDs are from the modified sample and specification.
¹²² Later, when a meta-analysis of cross-sectional MMEDs is measured based only on observations from the first wave, the MMED estimated from the NSFH is comparable to the other cross-sectional estimates, after controlling for age of sample and a shift after 1980. When both waves are used to estimate a cross-sectional MMED, it is considerably higher than the other cross-sectional MMEDs.
However, one cannot infer the cause(s) of the MMED from a solitary cross-sectional estimate. More information is needed. Kenny (1982) first incorporates multiple observations from the same individuals to obtain more information about the effect of marriage on earnings. As summarized in Table 5, Kenny (1982) tests whether earnings rise at a different rate before or after a man is married. He finds that earnings tend to rise at a higher rate after marriage than before marriage. Bartlett and Callahan (1984)’s later finding of greater income growth among continuously married males provides support for Kenny’s finding. While Bartlett and Callahan’s finding by itself is not convincing because of the obvious selection story, it is consistent with Kenny’s finding that earnings rise faster during marriage.

Kenny (1982) interprets the increased rate at which earnings rise after marriage as evidence for Becker’s theory that married males are able to accumulate more market-oriented human capital¹²⁴. However, as Korenman and Neumark (1991) observe, Kenny’s finding is also consistent with the employer favoritism model. Both papers argue that there is evidence in favor of marriage increasing productivity versus employer favoritism. The evidence in favor of increased productivity from marriage is based on the supposition that employer favoritism would increase the level of earnings of married males and not increase their earnings growth rate. The counterargument is that if employers are favoring married males primarily through an increased number of promotions, then the impact of favoritism would be a higher growth rate of earnings.

¹²³ After posing the question, Hill gives her econometrically unsubstantiated assessment that the MMED is due to employer favoritism toward married males.

Korenman and Neumark (1991) go further to show that married workers are more likely to be more productive.
Married workers receive higher performance ratings from their supervisors. Their increased chance of promotion thus corresponds to their higher performance ratings.[125] Regardless, finding a higher earnings growth rate during married years does suggest that the MMED is not completely due to the selection of males with higher earnings into marriage.

[124] The increased productivity would be based on the household division of labor.

[125] Implicit here seems to be an argument against discrimination based on the theory of discrimination set out in Becker (1957). Becker argues that if a firm had discriminatory tastes, or if supervisors' performance ratings reflected more than the actual productivity of workers, then in a competitive market the firm would not be profit maximizing. Competition from non-discriminatory firms would likely put such firms out of business or result in a change in ownership.

V. Studies Based on Data Collected during the Seventies.

Table 6 summarizes the major findings of studies using data collected during the seventies. The paper using the earliest data set, Nakosteen and Zimmer (1987), runs the first test of the endogeneity of marital status. Nakosteen and Zimmer (1987) are unable to reject the exogeneity of marital status. Their instrumental variables estimate of the MMED is marginally higher than the OLS estimate. As such, their paper does not provide convincing evidence for the claim that the observed MMED results from selection into marriage based on earnings ability.[126]

[126] Nakosteen and Zimmer (1987) use the statistical insignificance of the IV MMED estimate to claim that there is no evidence for the household specialization/employer favoritism hypotheses.

The remaining three papers, briefly discussed earlier, examine the NLS data set with fixed effects and measures of married/divorced years. They include Korenman and Neumark (1991), Gray (1997) and Cornwell and Rupert (1997).
Both Korenman and Neumark (1991) and Gray (1997) find a significant return to years of marriage during the seventies for white males in America. However, the similar findings come from the same data set for the same years, so the two studies cannot be said to be independent of each other. Korenman and Neumark (1991) go further and examine an additional data set drawn from a company's personnel records. The analysis of that data set indicates that there is a significant MMED but that the differences are mostly due to a concentration of married males in higher job grades. Married men receive higher performance ratings and as such are more likely to qualify for promotions. As with Kenny (1983), the increasing levels of income during years of marriage support the household specialization/employer favoritism hypotheses.

Cornwell and Rupert (1997) use the same NLS data set to reexamine the findings of Korenman and Neumark (1991). They lengthen the panel so that it includes observations from 1971, which adds further changes in marital status. Contrary to Korenman and Neumark (1991), Cornwell and Rupert (1997) do not find a significant return to years of marriage after controlling for fixed effects. However, their paper neglects to consider additional implications of their extension of the panel. Their data set differs from that of Korenman and Neumark (1991) in ways that could provide alternative explanations for the different findings.[127]

[127] These include a smaller sample size and the inclusion of a years-of-tenure variable in the specification. Korenman and Neumark (1991) show that, since marital status affects the likelihood of turnover, the variable is endogenous. The inclusion of tenure at current job in a regression would bias downward the return to years married.

The biggest change involves the effect of including non-spouse dependents in their regression.
Both Korenman and Neumark (1991) and Cornwell and Rupert (1997) include the dummy variable in their specification. Cornwell and Rupert find a significant coefficient of .052 (.019) in their fixed effect specification. On the other hand, Korenman and Neumark do not find a significant relationship, .02 (.02), between having dependents and earnings. That marital status is significant where the presence of dependents is not, and vice versa, across the two papers could be related to the collinearity between the two variables. Cornwell and Rupert (1997) show in their Table 1 that there is strong collinearity between the two variables for the later years in the panel.[128] However, in 1971, the year that Cornwell and Rupert add to the panel, only 73 percent of married males have dependents. Herein lies an explanation for the differences between the two papers. Since married males who have non-spouse dependents are likely to be older, and since the decision to have children can depend on their affordability, married males with dependents are likely to have higher earnings than married males without dependents. This makes the non-spouse dependent dummy variable endogenous and biased upwards in the earnings equation. The positive correlation between the non-spouse dependent dummy and the married dummy leads to a transfer of bias to the MMED. The MMED becomes biased downwards.

One can check whether the inclusion of the dependents variable is affecting the MMED by adding the coefficients on the married and non-spouse dependent dummies for both papers and comparing the point estimates. For Cornwell and Rupert, the sums of the two coefficients are .135 and .108 for the cross-sectional and fixed-effects estimates. For Korenman and Neumark, the sums of the same two coefficients are .15 and .08. The differences in the sums of the two coefficients are well within their likely standard errors.[129] For this reason, Cornwell and Rupert (1997) does not contradict Korenman and Neumark's (1991) finding of a return to years of marriage. Thus, it seems that the MMED, in all likelihood, did rise with years of marriage during the seventies.

[128] They report that 85, 89 and 92 percent of married males have non-spouse dependents in the years 1976, 1978 and 1980. Only 3, 6, 11 and 9 percent of never married males report having dependents in the relevant four years, 1971, 1976, 1978, 1980.

[129] The standard error of the differences between the sums is greater than .02 for the cross-sectional estimates and greater than .03 for the longitudinal estimates. This is based on a comparison of the standard errors for the individual coefficients. One must also remember that there are differences between the two data sets.

VI. Studies based on Data Collected during the Eighties/Nineties.

However, an examination of Tables 7 and 8 shows that, after the seventies, there is less evidence in favor of the specialization/favoritism hypotheses. Nakosteen and Zimmer (1997, 2001) both show with the PSID that selection into marriage based on earnings ability was important during the eighties. Similar to Gray (1997), Jacobsen and Rayack (1996) show that both the MMED and the negative correlation between hours worked by the spouse and a male's earnings are dampened when one controls for fixed effects.[130]

[130] Unfortunately, like Gray (1997), Jacobsen and Rayack (1996) do not report the relevant information about their IV estimation technique. As such, one cannot assume it is reliable. Gray (1997) does report that a significant negative correlation with the hours the spouse works accounts for the decline in the MMED. However, this is not robust to changes in specification, the instrumental variable of a child less than six does not pass the test for a valid instrument, and the remaining correlation between the instruments and the hours worked by the spouse is not strong enough for the IV results to be reliable.

Loh (1996) directly tests a number of the predictions of the theory that marriage enhances productivity. These include tests of whether higher hours of work by wives lead to a decline in the earnings of males, whether cohabitation prior to marriage leads to higher earnings, ceteris paribus,[131] and whether self-employed males who are married have higher earnings than unmarried self-employed males.[132] In all cases, no evidence is found for the productivity-enhancing hypothesis.

[131] When a couple chooses to cohabit, this allows for the same marriage-specific specialization in human capital investment as may occur during marriage. The human capital model predicts that, after controlling for years of actual marriage, the additional time spent in cohabitation should increase the MMED.

Hersch and Stratton (2000) use information from the National Survey of Families and Households to test whether variation in hours spent in housework explains part of the MMED. They find that, while there is a negative correlation between a male's earnings and hours spent in food preparation, controls for the hours spent in housework do not affect the MMED. Hersch and Stratton (2000) do not find direct evidence that the extent of specialization in household production has a significant causal effect on market earnings.

Stratton (2002) finds that when younger males ages 18-24 are included in the sample there is a significantly positive cross-sectional correlation between cohabiting and earnings. The earnings of males in long-term cohabiting relationships also seem to rise with time. However, these findings are not robust to restricting the sample to males ages 25-54. The change in the sample shows that there is no significant difference in earnings
between older cohabiters and never-married non-cohabiters and no return to years cohabited. This corroborates the finding of Loh (1996) that there are no higher earnings associated with cohabitation.

[132] If married men have higher earnings, then both salaried and self-employed married men should have similarly higher earnings. Otherwise, one might expect from an employer-discrimination model that married self-employed men may not be better off than never married self-employed men. Loh shows that, conditional on being self-employed, married men make less than never married men. The finding of a marriage penalty for self-employed men is robust to the inclusion of a selection correction for being self-employed. Loh does not look into whether the marriage penalty is robust to differences in specification. Self-employment is not interacted with any term besides marriage. Other terms included in the regression that are not interacted with being self-employed are education, tenure, tenure squared and age. With tenure, we would expect that a self-employed worker would not need a Lazearean implicit contract with himself over time. If married males were more likely to be salaried workers with this type of long-term contract, then there would be a negative bias to the interaction of marriage and self-employment. As such, the comparison across groups of workers is not convincing.

Akerlof (1998) exploits the details of the NLSY to establish that large differences in behavioral patterns exist across males by marital status. The differences in traits, such as the likelihood of getting intoxicated or incarcerated, may contribute to the selection hypothesis to the extent they are valued in both the marriage and labor markets.

Chun and Lee (2001) is the only recent paper that claims to find evidence against the importance of selection in the later years.[133] Chun and Lee (2001) estimate a switching regression between the married and never-married states. They find that controls for endogenous selection do not change the MMED. This implies that the characteristics used to identify the decision to become married do not also lead to higher earnings. Chun and Lee (2001) also find evidence of a negative correlation between unobservables in the marriage and earnings equations. If unobserved differences in earnings ability were also important in the selection into marriage, one would expect a positive correlation between the error terms in the two equations. Hence, a negative correlation between the error terms does complicate the selection story. Yet, it does not disprove the selection story.[134] This is because there is more than one possible selection story. It also may be possible that multiple selection stories for marriage coexist.

[133] Their data set is collected in 1998.

An additional selection story could be one where higher ability males, once freed from previous social constraints, choose not to marry. One can easily explain why this could be the case. As Becker (1973) notes in his treatise on the family, “Nothing distinguishes married households more from singles households or from those with several members of the same sex than the presence, even indirectly, of children. Sexual gratification, cleaning, feeding, and other services can be purchased, but not own children: both the man and woman are required to produce their own children and perhaps to raise them” (p. 818). Given the centrality of the production of children as an economic rationale for marriage, the time-intensive nature of child rearing and the indivisibility of children, one can easily posit that some high-income males will prefer not to have children. As such, the incentive for marriage may not exist for them.[135]
There could be a non-linearity in the relationship between earnings ability and the propensity to become married.

Tentative evidence that the average earnings of never-married males have risen in recent years is found using the same data set used in the second paper of this dissertation. Figure 1 shows that in the survey years 1995 and 1996, the MMED[136] begins to decline relative to its predicted levels from the years 1967-1988. Figure 2 shows that the decline of the MMED is paralleled by a similar decline in the Divorced/Separated earnings differential (DSED). The parallel between the two series' declines is likely due to an increase in average earnings among never married males. If this trend continued to exist in 1999, then it likely affects Chun and Lee's data. One can postulate that the inclusion of a fraction of higher ability males among the never married would increase the group's average earnings. This would then reduce the MMED and potentially make the correlation between the error terms of the marital status and earnings equations negative.

[134] Moreover, if it did refute the selection hypotheses, then how would one account for the longitudinal estimates of the MMED being lower than the cross-sectional estimates of the MMED?

[135] This is particularly the case if the employer favoritism toward married males of the past no longer exists.

[136] This actually is the trimmed MMED. A similar form of trim is employed in Wetzell (2002) as is used to reexamine Gray (1997) and Stratton (2002).

The key point is that Chun and Lee (2001) do not disprove the selection story. Their findings can be reconciled with the earlier evidence, in papers like Nakosteen and Zimmer (1997, 2001), in support of selection as one of the factors causing the MMED. To reconcile this finding, an additional, theory-based selection story is given.
Then, with the help of this story, the deviation of the last two MMEDs in the series constructed by Wetzell (2002) from the pattern that prevails in the previous twenty-eight years is explained and extrapolated as likely to still apply in more recent years.

VII. Summary of Meta-Analysis of Cross-Sectional MMEDs

The main, or average, cross-sectional MMEDs for eleven of the reported studies are reported in Table 9. These MMEDs are paired with the mid-range of the sample's age, evaluated at the mean year in which the data were collected, and with a dummy variable for whether the data were collected after 1980. A median regression with the cross-sectional MMED as the dependent variable is then estimated, with the age mid-range and the period dummy as independent variables. The regression shows that the cross-sectional MMED does rise with the age of the group and that there is a significant four-percentage point decline in the cross-sectional MMEDs estimated on data sets collected after 1980. The median regression establishes that Nakosteen and Zimmer's estimate of the MMED is an outlier. When a mean regression is estimated on the rest of the sample, the same results are found.

VIII. Conclusion

The above comparison of the results of studies across time confirms that the MMED has changed. The relative importance of the sources of the MMED is not fixed. The findings of a significant return to years of marriage in Kenny (1983), Korenman and Neumark (1991), Gray (1997) A and Akerlof (1998) show that there was a significant causal effect from marriage to higher earnings. But the contribution of the causal effect of marriage on earnings to the MMED appears to have declined, as illustrated by Gray (1997) B and the modified version of Stratton (2002). The decline in the importance of marriage for earnings is supported by the meta-analysis, which shows a decline in the cross-sectional MMEDs after 1980.
In addition, a survey of several more recent studies that look at the predictions of the specialization hypothesis fails to find strong support for its contribution to the MMED. Lastly, several more recent papers do find strong support for the view that selection into and out of marriage in more recent years is correlated with earnings ability. Thus, it seems that the MMED, in more recent years, is due less to marriage increasing earnings productivity than to higher earnings productivity increasing the likelihood of being married.

Bibliography

Akerlof, George A. 1998. “Men Without Children.” The Economic Journal 108, pg. 287-309.

Bartlett and Callahan. 1984. “Wage Determination and Marital Status: Another Look.” Industrial Relations 23 (1) pg. 90-96.

Becker, Gary. 1957. “The Economics of Discrimination.” Chicago.

Becker, Gary. 1973. “A theory of marriage: part 1.” Journal of Political Economy, 81, 4, July/August, pg. 813-46.

Blackburn, M. and Korenman, S. 1994. “The declining marital-status earnings differential.” Journal of Population Economics, 7, pg. 247-70.

Chun, Hyunbae, and Lee, Injae. 2001. “Why Do Married Men Earn More: Productivity or Marriage Selection?” Economic Inquiry 39 (Spring) pg. 307-319.

Cornwell, C. and Rupert, P. 1997. “Unobserved Individual Effects, Marriage, and the Earnings of Young Men.” Economic Inquiry 35 (Spring) pg. 285-94.

Gray, J. S. 1997. “The fall in men’s return to marriage: declining productivity effects or changing selection?” Journal of Human Resources, 32 (Summer) pg. 481-504.

Hersch and Stratton. 2000. “Household specialization and the male marriage wage premium.” Industrial & Labor Relations Review 54 (1) pg. 78-94.

Hill, Martha S. 1979. “The Wage Effects of Marital Status and Children.” Journal of Human Resources, 14 (Fall) pg. 579-94.

Kenny, L. W. 1983. “The accumulation of human capital during marriage by males.” Economic Inquiry, vol. 21 (April), pg. 223-31.

Korenman, S. and Neumark, D. 1991.
“Does marriage really make men more productive?” Journal of Human Resources, vol. 26 (Spring), pg. 282-307.

Jacobsen and Rayack. 1996. “Do Men Whose Wives Work Really Earn Less?” American Economic Review 86 (2) pg. 268-273.

Loh, E. S. 1996. “Productivity differences and the marriage wage premium for white males.” Journal of Human Resources, 31 (Summer), pg. 566-89.

Nakosteen, R. and Zimmer, M. 1987. “Marital status and earnings of young men.” Journal of Human Resources, 22 (Spring) pg. 248-68.

Nakosteen, R. and Zimmer, M. 1997. “Men, Money and Marriage: Are Higher Earners More Prone than Low Earners to Marry?” Social Science Quarterly, 78, March, pg. 66-82.

Nakosteen, R. and Zimmer, M. 2001. “Spouse Selection and Earnings: Evidence of Marital Sorting.” Economic Inquiry 39 (Spring) pg. 201-213.

Schoeni, Robert. 1990. “The earnings effects of marital status: results for twelve countries.” Population Studies Center Research Report, University of Michigan, March pg. 90-172.

Stratton, Leslie. 2002. “Examining the Wage Differential for Married and Cohabiting Men.” Economic Inquiry 40 (Spring) pg. 199-212.

Wetzell, David. 2002. “On the Measurement and Explanation of Recent Changes in the Male Marriage Earnings Differential.” Paper two in Dissertation, Michigan State University.

Table 1
Summary of Trimmed Versions of Gray (1997)

Sample                                              NLS                     NLSY
Trim statistics
  % of original sample trimmed                      .150                    .157
  Adj. R2 of probability of being trimmed           .0041                   .0023

Type of sample                              Non-Trimmed  Trimmed    Non-Trimmed  Trimmed

Cross-Sectional
  Adj. R2                                     .4200       .5068       .3525       .4651
  MMED                                        .120        .096        .098        .096
                                             (.017)      (.015)      (.014)      (.012)
Longitudinal
  Adj. R2                                     .7757       .9052       .6989       .8948
  MMED                                        .091        .066        .022        .056
                                             (.030)      (.021)      (.025)      (.015)
Longitudinal with Controls for Years of Marriage
  Adj. R2                                     .7770       .9061       .6986       .8949
  MMED                                        .063        .043        .007        .042
                                             (.032)      (.022)      (.026)      (.016)
  Years of Marriage                           .027        .026        .008        .004
                                             (.009)      (.006)      (.009)      (.005)
  Years of Marriage Squared/100              -.142       -.112       -.069       -.069
                                             (.043)      (.029)      (.051)      (.031)

Table 2A
Comparison of Descriptive Statistics, Stratton (2002)[137]

Sample                             Stratton's            Wetzell's Sub-Sample, Males Ages 25-54
                                   Original Sample
No. of Individuals                 1358                  1209                  1017
Type of Sample                     Non-Trimmed           Non-Trimmed           Trimmed

Real Wage, 1992 $                  15.85   16.63         16.52   17.64         16.55   17.29
Log Wage                            2.62    2.62          2.67    2.71          2.69    2.73
Married                            73.2    77.5          78.3    80.9          79.4    81.9
Divorced, Widowed, Separated        9.0    11.7           8.0    11.9           8.7    11.0
Cohabiting                          3.3     5.2           3.9     4.2           3.5     4.1
Has Cohabited                      25.6    32.2          28.8    33.9          28.9    33.3
Years of Marriage                  11.7    16.0          12.2    17.0          12.3    17.1
Age                                36.28   42.0          37.3    43.[illegible] 37.4   43.3
Education                          13.88   13.92         14.08   14.13         14.13   14.17

Table 2B[138]
Summary of Reexamination of Stratton (2002)

Sample                             Stratton's            Wetzell's Sub-Sample, Males Ages 25-54
                                   Original Sample
Type of Sample                     Non-Trimmed           Non-Trimmed           Trimmed[139]

Cross-Sectional
  R2                                .3905                 .3451                 .4790
  MMED                              .198                  .218                  .201
                                   (.049)                (.055)                (.042)
Longitudinal
  R2                                .845                  .833                  .938
  MMED                              .085                  .062                  .069
                                   (.044)                (.057)                (.036)
Longitudinal with Years of Marriage
  R2                                .846                  .835                  .938
  MMED                              .085                  .081                  .080
                                   (.044)                (.060)                (.036)
  Years of Marriage                 .004                 -.001                 -.006
                                   (.008)                (.010)                (.005)
  Years of Marriage Squared/100     .010                  .027                  .026
                                   (.022)                (.024)                (.014)

[137] The means are weighted by sample weights. Cross-sectional MMEDs are estimated with sampling weights, but sampling weights were not permitted by STATA for XTREG, fixed effects.

[138] There is an upwards bias to the cross-sectional MMED caused by the sample attrition between the two periods for the NSFH. When the MMED is estimated from the first wave alone it is .143 (.033).

[139] The adjusted R2 from the linear probability regression of whether or not an observation is trimmed is 0.0179. This is a little higher than usual and indicates that males with less than a high-school education were more likely to be trimmed. Disproportionately trimming a portion of the sample causes bias if the underlying distribution is asymmetric. For example, if controls are not made for marital status, a disproportionate percentage of never married males are trimmed from the sample. This is because there naturally is a wider variance in earnings for never married males. But, because the earnings distribution for never married males is negatively skewed, trimming too many never-married males reduces the MMED significantly. As it is, never married and divorced/separated males have a four percentage point greater likelihood of being trimmed. This may not be enough of a difference to alter the MMED significantly.

Table 3
Summary of the Impact of Cohabitation

Sample                             Stratton's            Wetzell's Subsample,
                                   Original Sample       Males Ages 25-54
Type of Sample                     Non-Trimmed           Trimmed

Cross-Sectional
  R2                                .3930                 .4797
  MMED                              .230                  .203
                                   (.049)                (.044)
  Cohabiting                        .113                 -.014
                                   (.068)                (.054)
  Previously Cohabited             -.024                 -.030
                                   (.026)                (.025)
Longitudinal
  R2                                .845                  .938
  MMED                              .080                  .062
                                   (.047)                (.037)
  Cohabiting                       -.020                 -.048
                                   (.050)                (.039)
  Previously Cohabited             -.001                 -.010
                                   (.047)                (.034)
Longitudinal with Years Married and Years Cohabited
  R2                                .848                  .938
  MMED                              .080                  .059
                                   (.048)                (.037)
  Cohabiting                        .041                 -.048
                                   (.054)                (.041)
  Previously Cohabited              .047                 -.012
                                   (.048)                (.035)
  Years Married                     .0080 (.0084)        [one value in source; column not recoverable]
  Years Married Squared/100         .012 (.022)          [one value in source; column not recoverable]
  Years Cohabited                  -.022                  .0025
                                   (.018)                (.0142)
  Years Cohabited Squared/100       .012                  .086
                                   (.022)                (.127)

Table 4
Summary of Returns to Years Married from Longitudinal MMED Studies With Controls for Years Married

                                              Cross-     Longi-     Longitudinal with Years Married[140]
                                              Section    tudinal
Study, time span, age range, data source      Married    Married    Married    Years      Years Squared

Cornwell and Rupert (1997)[141]                .135       .099       .033      -.005      -.03
  1971, 1976, 1978, 1980;                     (.028)     (.032)     (.028)     (.006)     (.03)
  ages 19-29 in 1971; NLSYM

Korenman and Neumark (1991)                    .15        .08        .03        .022      -.096
  1976, 1978, 1980;                           (.03)      (.036)     (.03)      (.009)     (.034)
  ages 24-34 in 1976; NLSYM

Trimmed Gray (1997) A                          .096       .066       .043       .026      -.112
  1976, 1978, 1980;                           (.015)     (.021)     (.022)     (.006)     (.029)
  ages 24-31 in 1976; NLSYM

Akerlof (1998)                                 .109       .04        .018       .016      -.15
  1984-1991;                                  (.026)     (.024)     (.020)     (.007)     (.05)
  done with education by 1985; NLSY

Trimmed Modified Stratton (2002)               .201       .069       .080      -.006       .026
  1987-1988, 1992-1994;                       (.042)     (.036)     (.036)     (.005)     (.014)
  ages 25-54; NSFH

Trimmed Gray (1997) B                          .096       .056       .042       .004      -.069
  1989, 1991, 1993;                           (.012)     (.015)     (.016)     (.005)     (.031)
  ages 24-31 in 1989; NLSY

[140] Years Married and its square.

[141] Unlike the other papers, Cornwell and Rupert include tenure at current job and its square in their specification. Also see the text about the consequences of the inclusion of the dependents dummy. Only longitudinal results are reported with years of marriage as a dependent variable.

Table 5
Summary of Earliest Studies Based on Data Collected before 1970.

Lawrence Kenny (1983)
  Time range of earnings information: Before 1969
  Age range of data set: Ages 30-40
  Source of data set: Retrospective monthly data
  Estimated MMED: 17-20
  Principal finding: Monthly earnings rise more quickly when a male is married.

Bartlett and Callahan (1984)
  Time range of earnings information: 1966-1977
  Age range of data set: Ages 55-64
  Source of data set: National Longitudinal Survey of Older Men
  Estimated MMED: 20-32
  Principal finding: Continuously married men have higher earnings growth.

Table 6
Summary of MMED Studies Based on Data Collected 1970-1980.

Cornwell and Rupert (1997)
  Time range of earnings information: 1971, 1976, 1978 and 1980
  Age range of data set: Ages 19-29 in 1971
  Source of data set: National Longitudinal Survey of Young Men
  Cross-sectional MMED: 13.5[142]
  Principal finding: No positive return to years married after one controls for tenure.

Nakosteen and Zimmer (1987)
  Time range of earnings information: 1977
  Age range of data set: Ages 18-24
  Source of data set: Panel Study of Income Dynamics
  Cross-sectional MMED: 37-41[143]
  Principal finding: Correcting for the endogeneity of marital status fails to reduce the MMED from its OLS level. The test for endogeneity of marital status is also inconclusive.

Korenman and Neumark (1991)
  Time range of earnings information: 1976, 1978 and 1980
  Age range of data set: Ages 24-34 in 1976
  Source of data set: National Longitudinal Survey of Young Men
  Cross-sectional MMED: 15
  Principal finding: There exists a positive return to years married after controlling for fixed effects.

Jeffrey Gray (1997)
  Time range of earnings information: 1976, 1978 and 1980
  Age range of data set: Ages 24-31 in 1976
  Source of data set: National Longitudinal Survey of Young Men
  Cross-sectional MMED: 12.0
  Principal finding: Replicates Korenman and Neumark's (1991) finding of a significant rise in the MMED with years of marriage.

Korenman and Neumark (1991)
  Time range of earnings information: 1976
  Age range of data set: Not given
  Source of data set: White male managers and professionals from a single firm
  Cross-sectional MMED: 11.9[144]
  Principal finding: MMED associated with higher performance ratings and greater/lower likelihood of promotion/turnover.

[142] The MMED is found here by adding the coefficient on marital status to the coefficient on the dummy variable for non-spouse dependents, when the non-spouse dependent variable is included in the regression. This is relevant for Cornwell and Rupert (1997) and Korenman and Neumark (1991). The lower MMED is the one associated with the longitudinal analysis; the higher MMED is from the cross-sectional analysis.

[143] The lower estimate is the OLS estimate and the higher estimate is the 2SLS estimate.

[144] This includes precompany experience and its square, company service and its square, and region and education dummies.

Table 7
Summary of MMED Studies Based on Data Collected 1980-1989.

Nakosteen and Zimmer (1997)
  Time range of earnings information: 1984, 1986, 1988
  Age range of data set: Younger males
  Source of data set: Panel Study of Income Dynamics
  Estimated MMED: --
  Principal finding: Shows that traits that predict future earnings predict well the transitions into and out of marriage.

Nakosteen and Zimmer (2001)
  Time range of earnings information: 1983, 1985, 1987, 1989
  Age range of data set: Singles who become married in the next period
  Source of data set: Panel Study of Income Dynamics
  Estimated MMED: --
  Principal finding: Shows that there exists a positive correlation between the wage regression unobservables of singles and their future spouses.

Jacobsen and Rayack (1996)
  Time range of earnings information: 1984-1989 waves
  Age range of data set: Younger males
  Source of data set: Panel Study of Income Dynamics
  Estimated MMED: -23, 5, 11[145]
  Principal finding: Shows that instrumental variables or fixed effects removes or dampens the MMED and the negative correlation between wife's hours of work and earnings. Taken as evidence against specialization.

[145] MMED for not self-employed. Lowest value is the instrumental variables estimate, middle value is the fixed effects estimate, and highest value is the OLS estimate.

Table 8
Summary of MMED Studies Based on Data Collected 1984-1998.

George Akerlof (1998)
  Time range of earnings information: 1984-1991
  Age range of data set: Completed education by 1985; age 24-31 in 1989
  Source of data set: National Longitudinal Survey of Youth
  Estimated MMED: 10.9
  Principal finding: Never married males are more likely to engage in non-socially productive behavior.

Eng Seng Loh (1996)
  Time range of earnings information: 1990
  Age range of data set: 25-33
  Source of data set: National Longitudinal Survey of Youth
  Estimated MMED: 9.1
  Principal finding: Finds that a series of tests do not support marriage as productivity-enhancing.[146]

Hersch and Stratton (2000) and Hersch (2002)
  Time range of earnings information: 1987-1988, 1992-1994
  Age range of data set: 18-59; 18-65
  Source of data set: National Survey of Families and Households
  Estimated MMED: 9.4-11.0; 17.0
  Principal finding: Shows that the hours of work in the household, while negatively correlated with earnings, do not appear to affect the magnitude of the MMED.[147] The cohabitation earnings differential is only significant for long-term cohabiters.

Jeffrey Gray (1997)
  Time range of earnings information: 1989, 1991, 1993 surveys
  Age range of data set: 24-31 in 1989
  Source of data set: National Longitudinal Survey of Youth
  Estimated MMED: 9.6
  Principal finding: Shows that the MMED no longer rises with years of marriage, as was the case in the seventies.

Hyunbae Chun and Injae Lee (2001)
  Time range of earnings information: 1998
  Age range of data set: Ages 18-40
  Source of data set: March CPS Supplement[*]
  Estimated MMED: 11.7-12
  Principal finding: Estimating a switching regression[148] does not alter the MMED significantly. There is a slight negative correlation between unobservables in the marriage and earnings equations.

[146] Finds that the extent of the wife's labor force participation does not affect the MMED. Finds that self-employed, married men earn less than self-employed, never-married men and that men who cohabited with their wives before marriage do not have higher MMEDs than men who did not cohabit with their wives before marriage.

[147] Married males whose wives do not work do about two hours less housework than never married males and married males with spouses who work. Formerly married males work on average about three hours more than married males. However, overall, the hours of housework do not differ that greatly by marital status. The negative correlation between hours worked and earnings is primarily driven by food preparation time. This correlation may be due to males with lower market earnings being more likely to invest in food preparation human capital. The direction of causation may run the opposite way.

[*] The 1999 March CPS Supplement was used. Unlike earlier studies, all races are pooled in the same sample here. A dummy variable for whether someone is black is included in the specification. The sample is also restricted to never married and married males.

[148] The switching is between the married and never-married statuses.

Table 9A
List of Values Used for Meta-Analysis of Cross-Sectional MMED across Age Groups and Over Time

Study                                      MidRange of Age    MMED      Years Data
                                           over Years                   Collected>1980
Kenny (1983)                               35                 .185      0
Bartlett and Callahan (1984)               59.5               .?6       0
Cornwell and Rupert (1997)                 ?9.5               .135      0
Nakosteen and Zimmer (1987)                [illegible]        [illegible] 0
Korenman and Neumark (1991)                ?1                 .15       0
Non-Trimmed Gray (1997) A                  ?9.5               .120      0
Akerlof (1998)                             26                 .109      1
Non-Trimmed Gray (1997) B                  ?9.5               .098      1
Eng Seng Loh (1996)                        29                 .091      1
Modified Hersch and Stratton (2002)[*]     39.5               .143      1
Hersch and Stratton (1997)                 41.5               .11       1
Chun and Lee (2001)                        29                 .1185     1

[*] This includes all observations ages 25-54 in the regression in the first wave. Lower educated, never married males experienced a disproportionate rate of attrition in the second wave. The upward bias from the endogenous attrition appears to be six percentage points.

Table 9B
Summary of Meta-Analysis of Changes in Cross-Sectional MMED

MMED dependent variable /          Median Regression     Mean Regression Excluding
Independent Variables                                    Nakosteen and Zimmer (1987)
100*Age                             .386                  .425
                                   (.081)                (.068)
100*Period (Year>1970)             -3.98                 -4.25
                                   (2.30)                (1.24)
100*Constant                        3.04                  3.89
                                   (3.84)                (2.65)

Figure 1
Comparison of Predicted and Observed Trimmed MMED
[Figure not reproduced. Series: Trimmed MMED series; Smoothed, Trimmed MMED; Predicted Trimmed MMED. Vertical axis: Male Marriage Earnings Differential. Horizontal axis: Year of Survey.]

Figure 2
Comparison of the Trimmed MMED and DSED[149] Series for Years of Survey 1991-96
[Figure not reproduced. Series: Trimmed MMED series; Trimmed DSED series. Horizontal axis: Year of Survey.]

[149] DSED stands for the Divorced/Separated Earnings Differential.
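
As a rough illustration of how the coefficients reported in Table 9B can be read, the sketch below computes the fitted cross-sectional MMED for a given age mid-range and period. It is not the estimation itself: it only plugs the printed coefficients into a linear prediction. It assumes that the rows labeled "100*" are measured in percentage points of the MMED, that age enters as the mid-range in years, and that the period dummy is coded as in the last column of Table 9A. The names TABLE_9B and fitted_mmed are hypothetical and are introduced only for this example.

```python
# Coefficients copied from Table 9B. Assumption: the "100*" rows are in
# percentage points of the MMED (dependent variable = 100 * MMED).
TABLE_9B = {
    "median": {"constant": 3.04, "age": 0.386, "period": -3.98},
    "mean_excl_outlier": {"constant": 3.89, "age": 0.425, "period": -4.25},
}


def fitted_mmed(age_midrange, later_period, spec="median"):
    """Fitted cross-sectional MMED, in percentage points.

    age_midrange: mid-range of the sample's age in years (as in Table 9A).
    later_period: 1 if the data fall in the later period, else 0
                  (assumed to match the dummy column of Table 9A).
    spec: which column of Table 9B to use (hypothetical key names).
    """
    coef = TABLE_9B[spec]
    return (coef["constant"]
            + coef["age"] * age_midrange
            + coef["period"] * int(later_period))


if __name__ == "__main__":
    for spec in TABLE_9B:
        earlier = fitted_mmed(29, 0, spec)
        later = fitted_mmed(29, 1, spec)
        print(f"{spec}: age 29, earlier={earlier:.2f}, later={later:.2f}, "
              f"change={later - earlier:.2f}")
```

Under these assumptions, the median regression fit for an age mid-range of 29 in the later period is roughly 10.3 percentage points, which lies between the values of .091 and .1185 listed in Table 9A for the two studies with that age mid-range.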