THREE ESSAYS IN LABOR ECONOMICS

By

Kelly Noud Vosters

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Economics - Doctor of Philosophy

2016

ABSTRACT

The first chapter tests a recently proposed hypothesis regarding rates of social mobility. Recent work by Gregory Clark and coauthors uses a new surnames approach to examine intergenerational mobility, finding much higher persistence rates than traditionally estimated. Clark proposes a model of social mobility to explain the diverging estimates, including the crucial but untested assumption that traditional estimates of intergenerational persistence are biased downward because they use only one measure (e.g., earnings) of underlying status. I test for evidence of this using an approach from Lubotsky and Wittenberg (2006), incorporating information from multiple measures into an estimate of intergenerational persistence with the least attenuation bias. Contrary to Clark's prediction, I do not find evidence of substantial bias in prior estimates.

The second chapter, coauthored with Martin Nybom, further examines this hypothesis using rich administrative data for Sweden. We exploit detailed proxy measures to test the proposition regarding attenuation bias in prior estimates for Sweden, and also conduct a Sweden-U.S. comparison. We find no evidence of substantial bias in prior estimates, or that the Sweden-U.S. difference in persistence is smaller than found in previous research. We further explore the concept of family status by incorporating mothers, thereby also contributing to the literature on intergenerational transmission for women. We find that while mothers' income is a poor proxy for status, incorporating information on mothers' occupation improves the ability to capture transmission from mothers to both sons and daughters.
The third chapter, coauthored with Cassandra Guarino and Jeffrey Wooldridge, examines the SAS® EVAAS® models for estimating teacher effectiveness, which are used by several states and districts in teacher evaluation programs despite little attention in the evaluation literature. The EVAAS approach involves using one of two distinct models, the Multivariate Response Model (MRM) or the Univariate Response Model (URM). The MRM jointly models scores from multiple subjects, grades, and cohorts in a 5-year period; it is generally limited to within-district purposes due to the large computational burden and is sometimes not feasible if data requirements cannot be met. Hence, the URM was developed for these situations. The URM models a single subject, and thus is less intensive computationally and more flexible with respect to data requirements. The method involves the computation of a composite score on several lagged scores in multiple subjects, and then using this composite score as the only regressor in empirical Bayes' estimation of the teacher effects. In this paper, we discuss and illustrate advantages and disadvantages of the EVAAS approach relative to the other widely used and studied value-added methods. We perform simulations to evaluate their ability to uncover true teacher effects under various teacher assignment scenarios. We also use administrative data to illustrate the extent of agreement between the URM and other common value-added approaches. Although the differences are small in our administrative data, we show with theory and simulations that standard linear regression using OLS performs at least as well as, and sometimes better than, the more complicated EVAAS URM.

ACKNOWLEDGEMENTS

Many thanks to Gary Solon for his extensive guidance and support. I am also grateful to Jeff Wooldridge and Leslie Papke for providing helpful advice and encouragement.
I have been very fortunate to have such a wonderful committee and have learned a tremendous amount from each of them. I would like to acknowledge Martin Nybom, with whom the second chapter is coauthored. The third chapter is coauthored with Cassie Guarino and Jeff Wooldridge, to whom I am grateful for helpful guidance and collaboration. I am also thankful for financial support from the Institute of Education Sciences Grants R305B090011 and R305D10002 to Michigan State University. Thanks to my classmate Margaret Brehm, whose daily conversations have helped improve my research and also made my time at Michigan State far more enjoyable. I am extremely grateful to my family for their endless love and support throughout my graduate studies as well as my endeavors that led me here. I am also grateful to my husband, Brian, whose unwavering love, patience, support, and sense of humor have given me strength and motivation through the ups and downs of graduate school.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter 1 Is the Simple Law of Mobility Really a Law? Testing Clark's Hypothesis
1.1 Introduction
1.2 Data
1.3 Empirical Approach
1.4 Results
1.4.1 Main Results
1.4.2 Robustness Checks
1.5 Conclusions
APPENDIX
REFERENCES

Chapter 2 Intergenerational Persistence in Latent Socioeconomic Status: Evidence from Sweden
2.1 Introduction
2.2 Data
2.2.1 Sources and Sample Selection
2.2.2 Construction of Status Measures
2.2.3 Alternative Measures for U.S. Comparison
2.3 Empirical Approach
2.4 Empirical Results
2.4.1 Main Results
2.4.2 A Comparison of Sweden and the United States
2.4.3 Robustness of Main Results
2.4.4 Extension to Mothers and Daughters
2.5 Conclusions
APPENDIX
REFERENCES

Chapter 3 Understanding and Evaluating SAS® EVAAS® Models for Measuring Teacher Effectiveness
3.1 Introduction
3.2 Value-Added Models
3.2.1 Common Methods for Estimating Teacher Effects
3.2.2 EVAAS Methods
3.2.2.1 EVAAS Univariate Response Model (URM)
3.2.2.1.1 Relating the EVAAS URM to Other Approaches
3.2.2.2 EVAAS Multivariate Response Model (MRM)
3.3 Prior Literature Evaluating EVAAS Methods
3.4 Simulation
3.4.1 Simulation Design
3.4.2 Simulation Results
3.4.3 Sensitivity of Simulation Results
3.5 Empirical Analysis
3.5.1 Administrative Data
3.5.2 Empirical Results
3.6 Summary and Conclusions
APPENDIX
REFERENCES

LIST OF TABLES

Table A1: Summary Statistics for Analysis Sample
Table A2: Fathers' Average Earnings and Education by Occupation Category
Table A3: OLS, IV, and LW Results
Table A4: Robustness of LW Results
Table B1: Summary Statistics for Full Sample and U.S. Comparison Sample
Table B2: OLS, IV, and LW Estimates for Full Sample (Fathers and Sons)
Table B3: Comparison of LW Estimates - Sweden and the U.S.
Table B4: Robustness of LW Estimates to Construction of Status Measures
Table B5: OLS Estimates from Extensions with Mothers' Measures of Status
Table B6: LW Estimates from Extensions with Mothers' Measures of Status
Table B7: Summary Statistics for Mothers & Fathers (Balanced Samples)
Table B8: OLS Estimates from Extensions with Mothers' Measures of Status, for All Parent-Child Samples
Table B9: LW Estimates from Extensions with Mothers' Measures of Status, for All Parent-Child Samples
Table C1: Correlations Between Estimated and True Teacher Effects (1 Cohort of Students)
Table C2: Correlations Between Estimated and True Teacher Effects (3 Cohorts of Students)
Table C3: Correlations - URM vs. Other Estimators (Small Teacher Effects)
Table C4: Correlations - Estimated vs. True Teacher Effects (Large Teacher Effects)
Table C5: Correlations - URM vs. Other Estimators (Large Teacher Effects)
Table C6: Descriptive Statistics for Students in Sample, by Grade
Table C7: Spearman Rank Correlations, Comparing EVAAS URM to Other Estimators
Table C8: Disagreement with the URM in Classification of Teachers Above the 10th Percentile

LIST OF FIGURES

Figure A1: LW Results
Chapter 1

Is the Simple Law of Mobility Really a Law? Testing Clark's Hypothesis

1.1 Introduction

There has been long-standing interest in the persistence of outcomes across generations, from earlier theoretical work by Becker and Tomes (1976, 1979), to the development of intergenerational datasets enabling expansions of empirical work. These studies aim to describe, for instance, the extent to which inequalities are passed on from one generation to the next, or the extent to which opportunities or outcomes have been equalized for children from various family backgrounds. The typical approach to studying intergenerational mobility begins with a basic model relating children's outcomes to parents' outcomes:

$$y_{it+1} = \beta y_{it} + \varepsilon_i \tag{1}$$

where $i$ indexes family, $t$ indicates parent's generation and $t+1$ indicates the child's generation.[1] Generally, $y_{it+1}$ and $y_{it}$ represent a measure such as income, wealth, or education. The regression coefficient, $\beta$, then provides a measure of persistence, or immobility, in the outcome from the parent's generation to the child's generation. Hence, the quantity $1-\beta$ can be interpreted as a measure of mobility.

For the U.S., the persistence parameter relating a child's log income to parent's log income (hence, an income elasticity) is estimated to be about 0.4 to 0.6 (Solon, 1999; Mazumder, 2005; Lee & Solon, 2009; Black & Devereux, 2011), while for Nordic countries the estimate is lower at 0.1 to 0.3 (Black & Devereux, 2011).[2] These estimates are taken to be summary statistics, describing the extent to which income differences persist from one generation to the next in a country or society. Among the explanations for the lower persistence observed in Nordic countries relative to the U.S. is one that highlights why so much attention is given to such differences in mobility: higher mobility may reflect policy differences, such as more redistributive tax structures and generous social welfare programs.

[1] In equation (1), along with the remaining equations in the paper, the intercept is suppressed by considering the variables in deviation-from-mean form.

[2] This paper uses intergenerational income regressions as a point of departure, thus extending the income mobility literature, but there is a broader literature that looks at other outcomes. For example, Hertz et al. (2007) is an oft cited recent example providing intergenerational correlation and regression coefficients in educational attainment for 42 countries; Björklund and Salvanes (2011) also provide a succinct review of related literature. Additionally, another subset of the literature is concerned with intergenerational persistence in occupation or occupational prestige. Hodge (1966) is an early example studying intergenerational occupational mobility in the U.S., while Long & Ferrie (2007, 2013) are more recent examples; see also Black & Devereux (2011) for a brief discussion of related studies.

In a recently published book, though, Gregory Clark makes the provocative claim that these estimates are substantially biased downward, and that "true" persistence in social status is much higher (approximately 0.75) and is uniform across all countries and over time (Clark, 2014). The latter part of Clark's assertion, regarding lower mobility, draws on a body of work by Clark and his coauthors, including an article in this journal, that uses innovative methods and a variety of creative names data sources covering many societies over several centuries.[3] The methods exploit the information content of rare surnames in these societies to explore social mobility, without having actual intergenerational family links.[4] The basic idea is that if inheritance matters, then rare surnames contain information on economic status, and they also indicate some family lineage given naming conventions and the inheritance of paternal surnames (or in some countries both maternal and paternal surnames).[5]

The first part of Clark's controversial claim, regarding bias in prior estimates, is based on a model proposed to explain the discrepancies between mobility estimates. Clark (2014) postulates that the higher persistence rate (0.75) governs a law of social mobility, and summarizes the general intuition underlying the hypothesized downward bias in traditional estimates as:

"Families turn out to have a general social competence or ability that underlies partial measures of status such as income, education, and occupation. These partial measures are linked to this underlying, not directly observed, social competence only with substantial random components. The randomness with which underlying status produces particular observed aspects of status creates the illusion of rapid social mobility using conventional measures." (Clark, 2014, p. 8)

[3] See Clark (2014) for a comprehensive list of these studies, as well as the more recent papers Clark & Cummins (2015) and Clark et al. (2015).

[4] For the data sources containing explicit socioeconomic measures, such as probated wealth at death, equation (1) is estimated using the group averages of wealth for rare surnames. For data without such measures, the approach instead looks at persistence in the representation of the rare surname in an "elite" group relative to representation in the population as a whole.

[5] Güell et al. (2014) show that rare surnames do contain such information, and propose a method using the joint distribution of surnames and economic status to explore intergenerational transmission of status in Spain.

More formally, Clark & Cummins (2015) and Clark (2014) present a simple model for mobility:

$$x^*_{it+1} = b x^*_{it} + e_{it} \tag{2}$$

where $x^*$ represents underlying social status, and $b$ the "true" persistence rate. The hypothesized attenuation bias in prior estimates is thought to arise from the focus on a single "noisy" measure, $y_{it}$ (e.g., income, wealth, or education), of the underlying social status, $x^*_{it}$, where this relationship is assumed to be of the form:

$$y_{it} = x^*_{it} + u_{it} \tag{3}$$

where $u_{it}$ is idiosyncratic error.[6] Additionally, Clark claims to be able to measure the "true" persistence rate by using surname group averages in equation (1), or $\bar{y}_{zt+1} = b\bar{y}_{zt} + \bar{u}_{zt}$, where $z$ indexes surname (instead of $i$ indexing family). The argument relies on classical measurement error assumptions so that $\bar{y}_{zt} \approx \bar{x}^*_{zt}$ because $\bar{u}_{zt} \approx 0$ when the surname samples are sufficiently large.[7]

In a recent article in this journal, Clark & Cummins (2015) present both traditional and surname estimates of social mobility in England using wealth measures to illustrate the discrepancies in mobility estimates, and also test implications of one dimension of the proposed model: the AR(1) form of the law of motion for social mobility in equation (2). However, they do not test the proposed explanation for the discrepancies:

"... if we were to measure the social status of families as an aggregate of earnings, wealth, education, occupation, and health, then observed social mobility even in parent child studies would decline. For such an aggregation would reduce the variance of the error component in measured status. Thus the measured rate of persistence, even in one generation, will be much closer to that of the underlying latent variable." (Clark & Cummins, 2015)

[6] Specifically, the assumption is that traditional estimates are biased downward by the usual classical measurement error attenuation factor $\frac{\sigma^2_x}{\sigma^2_x + \sigma^2_u}$, where $\sigma^2_x$ is $\mathrm{var}(x^*)$ and $\sigma^2_u$ is $\mathrm{var}(u)$.

[7] Clark (2014) notes that any group averaging over individuals would similarly reduce measurement error and reveal true status, thus resulting in much higher estimates of persistence (Clark, 2014, p. 110). As noted by Solon (2015) however, many of the intergenerational mobility studies that use group averages do not actually find such results. For example, Chetty et al. (2014) show in Appendix D that using surname group averages from administrative U.S. income tax data results in estimates similar to the individual-level regressions.

I fill this gap by exploring this hypothesis that when information from multiple measures is aggregated and then used to obtain traditional estimates, the lower mobility rates will be revealed.[8] This paper empirically tests the proposed existence, magnitude, and nature of a downward bias in traditional estimates. Conveniently, the theoretical setup for the law of social mobility laid out in equations (2) and (3) translates nicely into a latent variables framework, and the attenuation bias portion of this intriguing theory can be easily tested using publicly available data. Considering $x^*$ the latent status, equation (2) can be interpreted as the structural equation. For each of the particular measures mentioned in the first quotation above (i.e., income, education, and occupation) we can write a separate measurement equation of the form presented in equation (3). Under the strong classical measurement error assumptions maintained in Clark's theory, instrumental variables (IV) using one noisy measure to instrument for another noisy measure produces a consistent estimate of the intergenerational coefficient (IGC), $b$. If the classical assumptions are relaxed to allow for slope coefficients in the measurement equations as well as unrestricted correlations among the measurement errors, IV estimation is inconsistent. The magnitude and direction of the inconsistency is potentially unknown, depending on the assumptions and measures used. However, an approach proposed by Lubotsky & Wittenberg (2006) is particularly well suited for addressing the case of multiple noisy measures, and under less stringent assumptions. While not identifying $b$, the method allows one to obtain an estimate with the least attenuation bias (so in this case a greatest lower bound on $b$) by incorporating information from all of the suggested measures (i.e., income, education, occupation) into a single estimate of $b$.

In this paper, I employ these approaches using a sample of fathers and sons from the Panel Study of Income Dynamics to test the attenuation bias assumption underlying the law of social mobility. I find little evidence supporting the hypothesized downward bias in prior estimates, and show that incorporating additional measures such as education and occupation has no meaningful impact on the estimated persistence rates obtained from traditional models focused on single measures. Considering intergenerational persistence in this more comprehensive sense does not reveal higher persistence estimates, but rather confirms the picture of mobility obtained from prior studies that focused on a single measure of socioeconomic status.

[8] Other recent papers have been testing other hypotheses put forth in Clark's work. For example, in footnote 7, I mentioned the estimates from Chetty et al. (2014) that do not support Clark's assertion that mobility estimates based on any group averages over individuals will result in higher estimates. Clark (2014) also advocates that his results explain why findings from multigenerational regressions indicate a positive grandparent coefficient. In fact, as Solon (2015) points out, the papers do not all find positive coefficients. For instance, Lucas and Kerr (2013) find little evidence of non-zero grandparent coefficients in multigenerational regressions using administrative income data for Finland. Similarly, Braun & Stuhler (2015) use survey data on education and occupation in Germany and find that after controlling for parents' outcomes, they cannot reject a zero coefficient for the grandparents' outcome.

The paper is organized as follows. In the next section I describe the data and sample. Then I outline the empirical approach, and next present the results. In the last section I summarize the results and conclude.

1.2 Data

I use data from the Panel Study of Income Dynamics (PSID), as this data is ideally suited for my study. The data contains the requisite intergenerational links and also includes information on multiple measures of socioeconomic status, which is crucial for testing the attenuation bias claim.[9] Further, I am able to select a sample of individuals very similar to prior PSID studies about which the attenuation bias claims are made, thereby facilitating an appropriate comparison.[10] The PSID is a longitudinal study that began in 1968 with a sample of approximately 5,000 families in the U.S., with interviews conducted annually through 1997, and biennially since then. Children from these original families are followed when they start their own households, and one can observe family links and follow multiple generations, which is key for traditional intergenerational mobility studies. This paper focuses on the Survey Research Center (SRC) part of the sample,[11] in particular during the 1968-1972 surveys for fathers and 1992 survey for sons.[12] While more recent years are available, this time period allows for more direct comparability to prior estimates targeted by the proposed bias, lessens concerns about deterioration of data quality in later years, and still allows sons' ages to be appropriate for measuring earnings outcomes.

[9] Although administrative datasets such as the income tax records used by Chetty et al. (2014) have much larger samples, the data would not suffice for the tests conducted in this paper because information on other status measures such as educational attainment or occupation is not available.

[10] For example, Solon (1992) and Chadwick & Solon (2002) use similar father-son samples. Their sample selections differ in that son's earnings is observed starting at age 25. I restrict my sample to sons for whom I observe earnings starting at age 30 (up to age 40), to minimize life-cycle bias, as discussed below.

[11] The SRC sample was designed to be nationally representative in 1968, while the other component, the Survey of Economic Opportunity (SEO) sample, oversampled low income households.

[12] Focusing on father-son persistence in status rather than parent-child (or mother-daughter, etc.) is more straightforward given female labour force participation patterns, and the resulting issues with defining and measuring earnings and occupation outcomes. The surnames work also focused primarily on patrilineal lines of inheritance, given naming conventions (Clark, 2014, p. 15), but still posited this general law. Hence, the proposed law of mobility should be just as evident using only fathers and sons as would be the case if mothers or daughters were included.

My analysis sample is comprised of sons who were members of the original 1968 sample and are male heads of their household in the 1992 survey, restricted to those who were born in 1951-1961. The lower bound on birth year ensures that the sons were 17 years of age or younger in 1968, avoiding selecting older children still living at home. Further, the sons' birth year restrictions minimize life-cycle bias in annual earnings by ensuring that sons are 30 to 40 years old for the 1991 earnings measure (reported during the 1992 survey).[13] Fathers are identified as the male heads of the household in which the son lived in 1968. The earnings outcome for both fathers and sons is measured as log annual earnings, so the sample excludes any observations with non-positive earnings or earnings which were imputed by major assignment (for sons, this refers to earnings in 1991, and for fathers, earnings in each of the years 1967-71). Fathers missing data on educational attainment are also excluded. The earnings exclusions apply to 24 sons and 28 fathers, with 11 additional fathers excluded due to missing education, amounting to excluding a total of 46 father-son pairs, and leaving a final sample of 415 sons from 293 fathers.[14]

[13] Haider & Solon (2006) show that the measurement error in men's current earnings as an indicator of lifetime earnings is non-classical at younger and older ages, causing intergenerational persistence estimates to be biased downward (as also illustrated in Lee and Solon (2009)). They find that observing men's earnings from the early thirties through the early forties best avoids this life-cycle bias, as this is when the measurement error is approximately classical. Findings presented by Nybom & Stuhler (forthcoming) show similar results using Swedish earnings data.

[14] It is possible to construct larger PSID samples, but I choose a sample similar to those in prior intergenerational studies since these were used to produce the U.S. estimates which Clark purports are biased downward, and are thus germane to the explorations in this paper. Further, Nybom and Vosters (2015) use Swedish administrative data to conduct similar tests as well as supplementary analyses examining the robustness of the results in this paper, showing that the results are not unique to this sample or the measures used.

Table A1 provides summary statistics describing this sample. The sample is predominately white, with only five percent black. Given the age exclusions for sons (and lack thereof for fathers), the fathers are observed, on average, at an older age than sons, with fathers' average age just over 40 in 1967 and sons' average age approximately 35 in 1991. Average annual earnings are slightly lower for sons than fathers, and are also more variable for sons, consistent with the well-documented life-cycle profile in earnings.[15] Approximately 25 percent of the fathers have at least a four-year college degree.

[15] All earnings variables are expressed in 1991 dollars (adjusted for inflation using the CPI-U) for illustration purposes, but this transformation does not affect IGC estimates since the log of earnings is being used.

For the empirical analysis, I define the education measure of father's latent status as father's educational attainment as of the 1968 survey, coded as 1-16 for years of schooling up to a 4 year college degree, with a value of 18 indicating any graduate school completed. The occupation measure refers to the main job discussed in the 1969 survey, and is incorporated in the form of occupational category indicators. As listed in Table A1, there are seven categories: 1) professional, technical; 2) managers, businessmen, self-employed; 3) clerical, sales; 4) craftsman, foreman; 5) operatives; 6) labourers, service workers, farmers, and farm managers; 7) miscellaneous (includes armed services members, protective services workers, those not currently employed, and those missing an occupation category). To further illustrate the composition of the occupation categories for fathers, Table A2 provides average education and earnings by category. Average earnings and education are generally monotonically decreasing from occupation categories 1 to 6. The final category, 7 (miscellaneous), is similar to categories 3 and 4, though with few observations and substantial variability in earnings. Hence, I take a flexible approach in the analysis, incorporating the occupation measure as a vector of indicators for each of the first six occupation categories (with category 7 the omitted reference group), taking no stance on the relative social status of the categories, but rather assuming that each contains some information on the underlying latent status.

1.3 Empirical Approach

To test the hypothesis that traditional estimates of intergenerational persistence suffer from attenuation bias, I begin by providing a baseline traditional estimate from this PSID sample. I use the five-year average of log earnings from 1967-71 as the measure of father's status in equation (1), similar to previous studies (e.g., Solon, 1992; Zimmerman, 1992; Chetty et al., 2014).[16] Given that the proposed attenuation bias is thought to come from the focus on a single noisy measure of an underlying latent social status, and that incorporating additional measures such as education and occupation should reveal greater persistence in status, I extend the model by adding these other measures of father's status. I then estimate these intergenerational regressions using the typical ordinary least squares (OLS) approach, an instrumental variables (IV) approach, and the approach proposed by Lubotsky & Wittenberg (2006), to look for evidence of attenuation bias. In all estimations, I control for a quadratic in father's age and a quadratic in son's age to account for the life-cycle profile in earnings.[17]

[16] With classical noise in annual earnings measures, estimating equation (1) using OLS results in an IGC estimate that is biased downward by the well-known attenuation factor of $\frac{\sigma^2_x}{\sigma^2_x + \sigma^2_u}$, where $\sigma^2_x$ is $\mathrm{var}(x^*)$ and $\sigma^2_u$ is $\mathrm{var}(u)$. Taking the five-year average of earnings mitigates the attenuation bias, reducing the attenuation factor to $\frac{\sigma^2_x}{\sigma^2_x + (\sigma^2_u/5)}$. The attenuation factor becomes more complicated when one incorporates serial correlation in earnings from one year to the next.

[17] Including quadratics in both father's and son's age as controls arises from taking models of current earnings of the form $y_{it} = y_i + a_{i0} + a_{i1}Age_{it} + a_{i2}Age^2_{it} + v_{it}$, for $i$ = father or son and $t$ = time (e.g., year), then solving for the long run component of earnings $y_i$, and substituting each into equation (1). Taking the five-year average of log earnings implies using the five-year average of age. See Solon (1992) for explicit derivations.

To more clearly illustrate Clark's theory discussed above in the context of my empirical approach, I present a more formal latent variables framework, with the intergenerational equation (1) now represented by the so-called structural equation:

$$y_{it+1} = \beta x^*_{it} + \varepsilon_i \tag{4}$$

where $y_{it+1}$ is son's log earnings, and $x^*_{it}$ is father's underlying social status. Then we can consider equation (3) expanded to comprise the system of measurement equations:

$$y_{1it} = \rho_1 x^*_{it} + u_{1it} \tag{5}$$
$$y_{2it} = \rho_2 x^*_{it} + u_{2it} \tag{6}$$
$$\vdots$$
$$y_{jit} = \rho_j x^*_{it} + u_{jit} \tag{7}$$

In these measurement equations, $y_{1it}$ represents the average of father's log annual earnings in 1967-71, $y_{2it}$ is father's education, and $y_{3it}$ is father's occupation (specifically, a vector of occupation category indicators).[18] Further, this framework allows for slope coefficients in the measurement equations, relaxing the theory presented earlier, which took these $\rho_j$ to be equal to 1.[19]

[18] The occupation indicators are generally referred to as one measure (occupation) even though occupation is flexibly accounted for by including an indicator for each occupation category. This implementation is similar to the drinking water proxy for wealth used in one of the examples presented in Lubotsky and Wittenberg (2006).

[19] Intercepts are omitted because the outcome, measures, and latent variable should all be considered to be demeaned, which is consistent with the implementation of the Lubotsky and Wittenberg (2006) approach discussed below.

This notation reflects the fact that I do not directly address the latent status for sons. If we were to take literally the simple law's assumption of classical measurement error on the left-hand side, there would be no concern of this limitation inducing bias. More generally, with any status measure on the left-hand side, we should still see growth in the intergenerational coefficient towards 0.75 as we add measures for fathers on the right-hand side if the proposed attenuation bias argument holds. Further, addressing status for sons is tricky, as there is no basis for obtaining optimal weights (discussed below) for son's measures on the left-hand side. Even so, I perform a robustness check by applying the weights determined for fathers' measures to those for sons to obtain a more comprehensive status measure for sons, and get very similar results.[20]

[20] As discussed below with the results on robustness checks, the intergenerational coefficient obtained from this regression based on using income, education, and occupation for sons and fathers is 0.433, which is not significantly different from the estimate of 0.473 based on only income for sons.

Also under the perhaps unwise assumptions of classical measurement error, one method for consistently estimating $\beta$ is instrumental variables (IV). One can use any $y_j$ to instrument for another measure $y_k$ and consistently estimate $\beta$, provided that $\sigma_{jk} \equiv \mathrm{cov}(u_j, u_k) = 0$ and $\rho_k = 1$ (otherwise, the estimate converges to $\beta/\rho_k$). Hence, this IV approach is slightly robust to failure of classical assumptions, allowing some $\rho_j \neq 1$. In the case where $\sigma_{jk} = 0$ fails, the IV estimator is no longer consistent for $\beta$, but the direction of bias may be intuitively inferred based on belief about the sign of $\mathrm{cov}(u_j, u_k)$.[21] Although Clark's simple law assumes the measurement errors are uncorrelated (i.e., $\mathrm{cov}(u_j, u_k) = 0$) there are obvious reasons to believe this assumption is violated in the setting considered here.[22] Thus, I next turn to my preferred approach which allows for this correlation.

[21] When $\rho = 1$, $\beta_{IV}$ converges to $\beta\frac{\sigma^2_x}{\sigma^2_x + \sigma_{jk}}$, where $\sigma^2_x$ is $\mathrm{var}(x^*)$, implying upward bias if $\sigma_{jk} < 0$ or downward bias if $\sigma_{jk} > 0$. When $\rho \neq 1$, $\beta_{IV}$ converges to $\beta\frac{\rho_j\sigma^2_x}{\rho_k\rho_j\sigma^2_x + \sigma_{jk}}$, with a more complicated inconsistency factor.

[22] For example, it is plausible that an idiosyncratic shock may affect father's income and occupation, inducing correlation among the measurement errors. Allowing for unrestricted correlation among the measurement errors thus permits error structures that contain a common factor, so that shocks may affect all observable measures for an individual.

The approach proposed by Lubotsky & Wittenberg (2006) (henceforth LW), not only produces a single estimate of $\beta$ while incorporating multiple measures, but does so in an optimal way such that the estimate asymptotically provides the greatest lower bound on $\beta$. The approach results in the least attenuation bias by extracting the strongest combined signal out of all of the measures.[23] Hence, I can directly test the attenuation bias argument by observing whether estimates are converging to the hypothesized persistence rate of 0.75 as I incorporate additional noisy measures of father's status. Not only does the method allow for incorporating all available measures, it also relaxes the strong assumptions that $\mathrm{cov}(u_j, u_k) = 0$ for all $j \neq k$ and $\rho_j = 1$ for all $j$, allowing these to be mostly unrestricted (subject to a normalization on $\rho$).[24] The normalization on $\rho$ is needed to identify this vector of slope coefficients in the measurement equations. I normalize $\rho_1$ to equal 1, which simply sets the scale of the latent $x^*$ to that of $y_1$ (earnings). Clearly latent status has no scale, but given that I am positioning this paper using intergenerational income regressions as the point of departure, the natural normalization to adopt is to the scale of father's income. With this normalization, the equation for the remaining $\rho_j$ can be shown to be:

$$\rho_j = \frac{\mathrm{cov}(y_{it+1}, y_{jit})}{\mathrm{cov}(y_{it+1}, y_{1it})} \tag{8}$$

This ratio can be estimated directly using IV estimation, instrumenting for $y_{1it}$ (father's income) using $y_{it+1}$ (son's income), with $y_{jit}$ (the measure we are estimating $\rho_j$ for) as the dependent variable. LW show that an auxiliary ordinary least squares regression of $y_{it+1}$ on the measures $y_{1it}, y_{2it}, \ldots, y_{jit}$, produces the vector of coefficient estimates, $\hat{\phi}$, which provides information on the noisiness of the measures and on the conditional covariance of each measure with $y_{it+1}$ (conditional on the other measures). Then, these coefficient estimates, $\hat{\phi}$, combined with the estimates, $\hat{\rho}$, form an optimal linear combination of the information from the $j$ measures.[25] This optimal linear combination provides a greatest lower bound on $\beta$. Explicitly, the LW estimator is:[26]

$$\beta_{LW} = \hat{\rho}_1\hat{\phi}_1 + \hat{\rho}_2\hat{\phi}_2 + \cdots + \hat{\rho}_j\hat{\phi}_j \tag{9}$$

To control for other covariates (namely the quadratics in father's and son's age), these covariates are included in the auxiliary regression of son's earnings, $y_{it+1}$, on father's status measures, $y_{1it}, y_{2it}, \ldots, y_{jit}$, (to obtain $\phi$) as well as in the IV estimations of the $\rho_j$.[27] Standard errors for the $\beta_{LW}$ estimates are bootstrapped with 1,000 repetitions, using a block/panel bootstrap to account for clustering within family.

[23] Specifically, the LW estimate achieves a greatest lower bound among a class of estimators, but other estimates can simply be mapped into this class for comparing magnitudes.

[24] The approach assumes that $\mathrm{cov}(u_j, \varepsilon) = 0$, although very small deviations from this will not substantially alter the results.

[25] Linearity is adopted throughout this discussion and is relied upon for the LW approach, but this is a reasonable approximation for the measures considered here and the hypothesis being examined.

[26] Note that each element of $\beta_{LW}$ (i.e., $\rho_j\phi_j$) can be considered as a product of ratios $\frac{\mathrm{cov}(y_{it+1}, y_{jit})}{\mathrm{cov}(y_{it+1}, y_{1it})} \cdot \frac{\mathrm{cov}(\tilde{y}_{it+1}, \tilde{y}_{jit})}{\mathrm{var}(\tilde{y}_{jit})}$, where $\frac{\mathrm{cov}(\tilde{y}_{it+1}, \tilde{y}_{jit})}{\mathrm{var}(\tilde{y}_{jit})}$ is conditional on the other measures in $y_t$, so the estimated $\hat{\beta}_{LW}$ will be monotonically increasing in magnitude as measures are added only in the case where the conditional covariance has the same sign as the unconditional covariance.

[27] This implementation strategy is theoretically (and numerically) equivalent to that suggested in Lubotsky & Wittenberg (2006): to first regress each measure and the dependent variable on the other covariates and use these residualized variables for estimation of $\rho$ and $\phi$.

1.4 Results

1.4.1 Main Results

First, I establish a baseline estimate using the traditional approach with this PSID sample of 415 father-son pairs. Next, I explore the sensitivity of this estimate to including other measures of status in the regression. Panel A of Table A3 provides the results from this series of ordinary least squares (OLS) regressions. The baseline estimate of the intergenerational coefficient (IGC) is 0.439, using the traditional approach of regressing son's log earnings on the five-year average of father's log earnings (hence this can also be interpreted as an income elasticity). As expected, this is in the range (0.4-0.6) of traditional IGC estimates for the U.S. (Solon, 1999; Black & Devereux, 2011).

Moving along columns 2-4 of Table A3, I present the OLS results from the augmented models. Adding education to the model, the coefficient on father's earnings falls slightly to 0.398, but the coefficient on education is essentially zero. Similarly, when I add occupation categories instead of education, the coefficients on these occupation category indicators are not jointly significant (F = 0.54, p-value = 0.779); in this case, however, the coefficient on earnings rises slightly to 0.480. When education and occupation are both incorporated, the coefficient on earnings is similar to the baseline estimate. Again, neither the coefficient on education nor the coefficients on the occupational category indicators (F = 0.66, p-value = 0.682) are significant.[28]

The next panel in Table A3 shows the results from an IV approach, which is commonly used to address classical measurement error. With two "noisy" measures of status (earnings and education) I use education to instrument for earnings.[29] The estimated IGC is 0.497, which still falls in the range of traditional estimates for the U.S. and does not indicate substantial attenuation bias in the baseline estimate.

[28] Similar to my results, other studies also find that when the variable used for the parent is the same as that used for the offspring in the dependent variable, then additional variables for the parents do not have practically or statistically significant coefficients. Sewell & Hauser (1975, p. 86) find this result in analysis based on the Wisconsin Longitudinal Study. With son's earnings as the dependent variable, they note that the coefficients on father's education and occupation are not statistically significant after conditioning on father's income. Using the PSID, Corcoran et al. (1992) also use son's earnings as the dependent variable and similarly find that after accounting for parental income, the coefficients for several other family or community background characteristics are not practically or statistically significant. Duncan et al. (2005) find similar results for intergenerational associations for 17 outcome measures (traits and behaviors) in the National Longitudinal Survey of Youth (NLSY). After accounting for the same measure for parents, the coefficients on the other trait or behavioral measures are not statistically significant in 84 percent of the 272 cases. Further, two very recent studies find this result using large administrative datasets: Boserup, Kopczuk, & Kreiner (2014) estimate the wealth elasticity in Denmark, and upon adding parental and child income find that these coefficients are not practically significant; Nybom & Vosters (2015) perform analyses analogous to those in this paper using Swedish administrative data and show that, with son's income as the dependent variable, after conditioning on father's income the coefficient on father's education is not practically or statistically significant.

Finally, in Panel C, I present the estimates of the intergenerational persistence coefficient obtained using the LW approach to minimize the attenuation bias from using multiple noisy measures of status. All of the IGC estimates themselves are statistically significant, so I focus the discussion on changes in the estimates across specifications. The first estimate is simply the OLS estimate (0.43
9),asthisisaspecialcaseoftheLWapproachwhenoneusesasinglemeasure.Addingfa-therÕseducationasanadditionalmeasureofstatusproducesonlyaslightincreaseintheestimatedIGCto0.445.Whenoccupationinformationisaddedinsteadofeducation,theIGCestimateislarger,at0.465.And,whenbotheducationandoccupationmeasuresaresimultaneouslyincluded,theIGCestimateincreasesslightlyto0.473,butagainthereisnotasubstantialincreaseintheestimatedpersistence.30NotethattheOLScoe!cientestimatespresentedinPanelAareidenticaltotheauxiliarycoe!cientestimates,ö!j,usedintheLWapproach.GiventhelackofpracticalorstatisticalsigniÞcanceoftheseestimatesdiscussedabove,itisunsurprisingthatwedonotseelargechangesintheLWestimatesoftheintergenerationalcorrelation.Attemptingtoincorporateaddi-tionalinformationonsocialstatuscausestheIGCtoßuctuatesome,butallestimatesremainintherangeofpriorestimatesfortheU.S.FigureA1showsthatevenwhenconsideringtheprecisionoftheestimatesandlookingatthe95percentconÞdenceintervals(thebars)aroundtheestimates(thedots),neitherindicateIGCestimatesincreasingtothehypothesizedunderlyingpersistencerateof0.75.TheplotsshowtheestimatesandconÞdenceintervalsforeachspeciÞcationlistedinTableA3,beginningwiththebaselineestimate,thenaddingeducation,occupation,andboth.TheupperboundsontheconÞdenceintervalsare,respectively,0.585,0.5850.622,and0.629,stillfallingshortofthehypothesizedpersistencerate.Theprecisionoftheseestimatesishamperedbythe29Asnotedabove,instrumentinginthisfashionproducesanIVestimatethatconvergesto!/"1where"1isthecoe!cientintheearningsmeasurementequation(andassumingthemeasurementerrorsareuncorrelated),thusenablingthecomparabilitytoourLWestimatebasedonlatentstatussettothescaleoffatherÕsincome.30WhenamoreßexibleapproachistakenusingtheÞveannualearningsyearsasseparatevariablesaswellasseparateeducationcategoryvariables(highschoolgraduate,somecollege,four-yeardegree,atleastsomegraduateschool),theLWestimateofintergenerationalpersistenceisstillquitesimilarat0.485butlessprecisewithastandarderrorof0.095.12PSIDsampl
esize,butNybomandVosters(2015)ÞndstrikinglysimilarÑandmorepreciseÑresultsforSweden,withsimilarlysmallincreasesinpersistenceestimates,evenafterusingmoredetailedmeasuresandincorporatinganalogousmeasuresformothers.1.4.2RobustnessChecksInthemainanalysis,Ifocusonaddingmeasuresofstatusforfathers,butdonotdirectlyaddresssonÕslatentstatus.Asdiscussedintheempiricalapproachsection,thisshouldnotsubstantiallyaltertheresults.However,IstillperformasensitivitycheckinwhichsonÕslatentstatusisexplicitlyaddressed.IapplytheweightsdeterminedbytheLWapproachforfatherÕslatentstatustothemeasuresforbothfathersandsons,creatingindexmeasuresofstatusforeachgeneration.ThenIregressthecompositemeasureforsonsonthecompositemeasureforfathers.ThisresultsinanIGCestimateof0.433witha(bootstrapped)standarderrorof0.071,whichissimilarto,albeitslightlysmallerthan,themainLWestimatesreportedinTableA3.MyLWresultsarealsorobusttoadjustingseveralofthesamplerestrictions,asshowninTableA4.TheÞrstrowofresults,withtheestimatesinboldandstandarderrorsinitalicsunderneath,simplyprovidesthemainresultsfromPanelCofTableA3forcomparison.Allowingsonswhoare25-29yearsoldattheir1991earningsmeasuretoalsobeincludedinthesample,thesamplegrowsto582father-sonpairs(withsonsaged25-40yearsold).TheIGCestimatesof0.402Ð0.464areslightlysmallerrelativetothemainresults,consistentwiththelife-cyclee!ectsliterature(Haider&Solon,2006;Nybom&Stuhler,forthcoming),butthepatternofminimalincreasesremainsunchangedasadditionalmeasuresofstatusareincluded.ThesamepatternisrevealedwheninsteadofadjustingtherestrictionsonsonÕsage,IdosoforfatherÕsage,limitingthefatherstothoseaged30-50in1968andobtainingIGCestimatesranging0.457Ð0.494.Incorporatingbothofthesesampleadjustmentsatthesametimealsoproducesthesamepattern,asexpected,whichisshowninthenextrowofresultswithestimatesranging0.420Ð0.452.Returningtotheoriginalsamplerestrictions,exceptnowincludingmother-sonpairsfromfemale-headed1968households(sosinglemothers)inthesample,theIGCestimatesaresmallerinmagnitude(0.360Ð0.410)b
utstillfollowthesamepatternasmoremeasuresofstatusareadded.Finally,thelastrowofestimatesinTableA4presentsresultsfromchangingthefunctionalformofthefatherÕsearningsmeasurefromtheaverageoflogearnings13for1967Ð71tothelogofaverageearnings.Theseresultsyetagainexhibitthesamepatternasthemainresults,withIGCestimatesranging0.463Ð0.495.1.5ConclusionsSeveralrecentstudiesbyGregoryClarkandcoauthorshaveexaminedintergenerationalmobilityusinganewmethodbasedonsurnamesandnewlydevelopeddatasets,Þndinghigherpersistencerates(i.e.,lowermobility)thanpreviouslyestimated(e.g.,Clark,2014;Clark&Cummins,2015).Inthesestudies,thehypothesespresentedtoexplainthediscrepancyuseasimplemeasurementerrorargumentthatisconsistentwiththeproclaimedhigherpersistencerateofapproximately0.75fromsurnamemethodsandthesmallerestimatesfromtraditionalstudies.IamtheÞrsttoempiricallytestthepropositionthatpriorestimatesareattenuatedfromfocusingonasinglemeasuresuchasincomeandshouldrisewhenadditionalinformationisincorporated.IuseLubotsky&WittenbergÕs(2006)approachdesignedforscenariossuchasthis,wheremultiplemeasuresofalatentvariable(i.e.,status)areavailable,butthemeasurementerrorsarelikelycorrelated.Themethodcombinestheinformationfromavailablemeasuresofthelatentvariableinawaythatproducesasinglepersistenceestimatewiththeleastattenuationbias.Iaggregateinformationfromincome,education,andoccupationÑthreerecommendedmeasuresoffatherÕssocialstatusÑusingtheLWmethod,yetIseenoindicationofthepersistenceratesapproaching0.75astheadditionalmeasuresareadded.Therearesmallincreasesinthepersistenceestimatesasadditionalmeasuresareincorporated,butthesechangesarenotmeaningfulinastatisticallysigniÞcantorpracticalsense.Infact,alloftheestimatespresentedinthemainresults,aswellasinrobustnesschecks,rangefrom0.360to0.491,quitesimilartothepriorestimatesfortheU.S.ThepatternofsmallincreaseswithadditionalmeasuresisrobusttoadjustingsamplerestrictionsaswellasmeasuredeÞnitions.And,althoughmysamplesizeisnotconducivetoassessingthestatisticalsigniÞcanceofthesesm
allchangesinthepointestimates,thesampleIusefacilitatesrelevantcomparisonstopriorliterature.Iamabletoobtainabaselineestimateanalogoustothepriorstudiesaboutwhichtheattenuationbiasclaimsaremade,whichisanappropriatestartingpointforthenincorporatinginformationfromothermeasures.IÞndnoevidencethataddinginformationfromotherstatusmeasuresproducesestimatesthatareconverging14toasubstantiallygreaterlevelofintergenerationalpersistence.MyÞndingsrejectClarkÕsmeasurementerrorinterpretationofhisresultsrelativetothosefrompriorliterature,buttheydonotshedlightonwhyhisestimatesbasedonsurnamesarehigherthantraditionalestimates.Averagingoversurnamesdoesnotalwaysproducehigherpersistenceestimates,asshownwithU.S.incometaxdatainAppendixDofChettyetal.(2014).FurtherworkisneededtogainamorenuancedunderstandingofdiscrepanciesbetweenClarkÕsestimatesusingthesurname-averagemethodandtraditionalmethods,andwhateachmethodmightbeidentifying.AsnotedbySolon(2015)andChettyetal.(2014),thetraditionalapproachmaybecorrectlyidentifyingindividual-levelmobility,whilethesurnamesmethodmaybeidentifyinggroup-levelmobilityfortheseparticulargroupsofsurnames.ThisisfurtherdevelopedinarecentexpositionbyTorche&Corvalan(2015),whichshowsthatestimatingsurname-levelregressionscapturesbetween-grouppersistenceinaverageoutcomesfortheparticularlyÒeliteÓorÒunderclassÓsurnameschosen,ratherthanClarkÕsinterpretationofusinggroupaveragestoeradicatemeasurementerrorandrevealindividual-levelmobility.15!16 APPENDIX !17 Table A1: Summary Statistics for Analysis Sample Mean Std. Dev. 
Min Max Race - black 0.05 0.22 0 1 Sons' age in 1991 34.92 3.14 30 40 Sons' 1991 individual earnings 35,695 26,251 300 335,000 Fathers' age in 1967 40.47 6.81 27 67 Fathers' Individual earnings Annual earnings 1967 39,684 24,409 1,101 244,671 Log annual earnings 1967 10.43 0.60 7.00 12.41 5-year-avg of log earnings, 1967-71 10.46 0.59 7.79 12.65 Fathers' Educational attainment Less than HS graduate 0.33 0.47 0 1 High school graduate 0.32 0.47 0 1 Some college 0.11 0.31 0 1 Bachelor's degree 0.14 0.34 0 1 At least some graduate school 0.11 0.31 0 1 Fathers' 1969 Occupation categories 1 - Professional, technical 0.23 0.42 0 1 2 - Manager/businessmen 0.14 0.35 0 1 3 - Clerical, sales 0.09 0.29 0 1 4 - Craftsman, foreman 0.23 0.42 0 1 5 - Operatives 0.17 0.38 0 1 6 - Laborers, service, farmers 0.12 0.32 0 1 7 - Not currently employed/missing 0.02 0.15 0 1 Notes. The sample includes 415 sons and 293 fathers. All earnings are expressed in 1991 dollars. !!!!18 Table A2: FathersÕ Average Earnings and Education by Occupation Category Earnings in 1969 Educational attainment Occupation Category Mean Std. Dev. Mean Std. Dev. N 1 - Professional, technical 61,382 40,129 15.67 1.76 66 2 - Manager/businessmen 54,983 41,871 12.83 2.73 42 3 - Clerical, sales 38,379 10,920 12.96 1.89 27 4 - Craftsman, foreman 38,212 15,203 10.79 2.62 67 5 - Operatives 30,044 11,108 9.76 2.53 50 6 - Laborers, service, farmers 20,614 10,468 9.97 2.55 34 7 - Not employed or missing 38,379 31,544 10.00 3.61 7 Overall 42,419 30,209 12.09 3.27 293 Notes. The sample includes 293 fathers. All earnings are expressed in 1991 dollars. 

Table A3: OLS, IV, and LW Results

Fathers' noisy measures of status, by column:
[1] Earnings
[2] Earnings, education
[3] Earnings, occupation
[4] Earnings, education, occupation

                                              [1]       [2]       [3]       [4]
Panel A: OLS results
Five-year average of log earnings, 1967-71    0.439     0.398     0.480     0.438
                                             (0.075)   (0.098)   (0.100)   (0.120)
Educational attainment                                  0.010               0.016
                                                       (0.013)             (0.016)
Occupation categories
  1 - Professional, technical                                     0.002    -0.077
                                                                 (0.228)   (0.236)
  2 - Manager/businessmen                                        -0.029    -0.064
                                                                 (0.233)   (0.222)
  3 - Clerical, sales                                             0.001    -0.051
                                                                 (0.256)   (0.258)
  4 - Craftsman, foreman                                          0.066     0.052
                                                                 (0.233)   (0.218)
  5 - Operatives                                                 -0.027    -0.032
                                                                 (0.229)   (0.211)
  6 - Laborers, service, farmers                                  0.181     0.152
                                                                 (0.254)   (0.244)

Panel B: IV results (education to IV for 5-yr-avg earnings)
  First stage: 0.105 (0.006)
  Second stage: 0.497 (0.090)

Panel C: LW estimates of IGC                  0.439     0.445     0.465     0.473
                                             (0.075)   (0.072)   (0.080)   (0.080)
N                                             415       415       415       415

Notes. All specifications use log of son's 1991 earnings as the dependent variable and include as controls a quadratic in son's age and a quadratic in father's age (the average age during the five years of earnings observations). The omitted occupation category is "7 - Not employed or missing". The sample size for all estimations is 415 father-son pairs, from 293 families. OLS and IV standard errors are clustered by family. LW standard errors are computed using a block bootstrap to account for within family correlation (1,000 repetitions).
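
To make the relationship between the three panels of Table A3 concrete, the following is a minimal sketch on simulated data; it is not the code used for the estimates reported here. It assumes two noisy measures of a latent status variable with uncorrelated measurement errors, with the first measure fixing the scale ($\rho_1 = 1$). The parameter values, the family structure, and the function names (`cov`, `estimates`, `simulate`, `block_bootstrap_se`) are all hypothetical. The sketch computes the single-measure OLS estimate, the IV estimate that uses the second measure as an instrument, the LW estimate $\sum_j \hat{\rho}_j \hat{\phi}_j$, and a family-level block bootstrap standard error for the LW estimate.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (n - 1)


def estimates(rows):
    """Return (OLS, IV, LW) estimates from rows of (y, x1, x2)."""
    y = [r[0] for r in rows]
    x1 = [r[1] for r in rows]
    x2 = [r[2] for r in rows]
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    ols = s1y / s11                 # y regressed on x1 alone
    iv = s2y / s12                  # x2 used as an instrument for x1
    # Auxiliary regression of y on (x1, x2), solved from the 2x2 normal equations.
    det = s11 * s22 - s12 ** 2
    phi1 = (s22 * s1y - s12 * s2y) / det
    phi2 = (s11 * s2y - s12 * s1y) / det
    rho2 = s2y / s1y                # rho1 is normalised to 1
    lw = phi1 + rho2 * phi2
    return ols, iv, lw


def simulate(n_families=1500, beta=0.5, seed=1):
    """Hypothetical data: one father per family, one or two sons."""
    rng = random.Random(seed)
    families = []
    for _ in range(n_families):
        status = rng.gauss(0, 1)               # father's latent status
        x1 = status + rng.gauss(0, 1)          # noisy measure on the scale of status
        x2 = 0.8 * status + rng.gauss(0, 1)    # second noisy measure
        shock = rng.gauss(0, 0.5)              # component shared by brothers
        sons = [(beta * status + shock + rng.gauss(0, 1), x1, x2)
                for _ in range(rng.choice([1, 2]))]
        families.append(sons)
    return families


def block_bootstrap_se(families, reps=100, seed=2):
    """Resample whole families with replacement; return the s.e. of the LW estimate."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        sample = [row for _ in families for row in rng.choice(families)]
        draws.append(estimates(sample)[2])
    mean = sum(draws) / reps
    return (sum((d - mean) ** 2 for d in draws) / (reps - 1)) ** 0.5


families = simulate()
rows = [row for sons in families for row in sons]
ols, iv, lw = estimates(rows)
se_lw = block_bootstrap_se(families)
print(f"OLS {ols:.3f}  IV {iv:.3f}  LW {lw:.3f} (bootstrap s.e. {se_lw:.3f})")
```

In this simulated setting the single-measure OLS estimate is attenuated, and the LW estimate is at least as large as OLS while remaining below the true coefficient. The IV estimate is consistent here only because the simulated measurement errors are uncorrelated, which is the assumption the LW approach is designed to relax.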

Table A4: Robustness of LW Results

                                                        N     [1]       [2]       [3]       [4]
Main results                                            415   0.439     0.445     0.465     0.473
                                                             (0.075)   (0.072)   (0.080)   (0.080)
Adjusting sample exclusions:
  Son's age 25-40                                       582   0.402     0.422     0.446     0.464
                                                             (0.065)   (0.061)   (0.075)   (0.077)
  Father's age 30-50                                    380   0.457     0.463     0.484     0.494
                                                             (0.083)   (0.083)   (0.090)   (0.089)
  Son's age 25-40 and father's age 30-50                483   0.420     0.426     0.444     0.452
                                                             (0.076)   (0.073)   (0.083)   (0.082)
  Include 1968 female-headed households                 444   0.360     0.375     0.392     0.410
                                                             (0.072)   (0.064)   (0.072)   (0.069)
Adjusting earnings measure:
  Log of father's 5-yr avg. of annual earnings 1967-71  415   0.463     0.466     0.490     0.495
                                                             (0.075)   (0.073)   (0.081)   (0.081)
Father's status measures
  Earnings (5-yr-avg)                                         x         x         x         x
  Educational attainment                                                x                   x
  Occupational categories                                                         x         x

Notes. The dependent variable is log of son's 1991 earnings, and the measure of father's earnings is the 5-year average of log earnings from 1967-71. All specifications include as controls a quadratic in son's age and a quadratic in father's age (the average age during the five years of earnings observations). The omitted occupation category is "7 - Not employed or missing". Standard errors are computed using a block bootstrap to account for within family correlation (1,000 repetitions).

Figure A1: LW Results

Notes. The sample includes 415 sons and 293 fathers. Standard errors are computed using a block bootstrap to account for within family correlation (1,000 repetitions).
[Figure A1 plot: LW estimates with 95% confidence intervals. Vertical axis: IGC (0 to 1). Horizontal axis: Specification ([1] Earnings, [2] Earn & Educ, [3] Earn & Occup, [4] All). Reference line labeled "Simple Law: IGC ~ 0.75".]

REFERENCES

Becker, G. & Tomes, N. (1976). Child endowments, and the quantity and quality of children. Journal of Political Economy, 84(4, part 2), S143-S162.
Becker, G. & Tomes, N. (1979). An equilibrium theory of the distribution of income and intergenerational mobility. Journal of Political Economy, 87, 1153-1189.
Björklund, A., & Salvanes, K. G. (2011). Education and family background: Mechanisms and policies, in E. Hanushek, S. Machin, and L. Woessmann (eds.), Handbook of the Economics of Education, 3(3), 201-247.
Black, S. E. & Devereux, P. J. (2011). Recent developments in intergenerational mobility, in O. Ashenfelter and D. Card (eds.), Handbook of Labor Economics, 4, 1487-1541, Amsterdam: Elsevier.
Braun, S., & Stuhler, J. (2015). The transmission of inequality across multiple generations: Testing recent theories with evidence from Germany. Mimeo.
Boserup, S. H., Kopczuk, W., & Kreiner, C. T. (2014). Intergenerational wealth mobility: Evidence from Danish wealth records of three generations. Working paper, October 2014.
Chetty, R., Hendren, N., Kline, P. & Saez, E. (2014). Where is the Land of Opportunity? The geography of intergenerational mobility in the United States, Quarterly Journal of Economics, 129(4), 1553-1623.
Clark, G. (2014). The son also rises: surnames and the history of social mobility. Princeton University Press.
Clark, G. & Cummins, N. (2015). Intergenerational wealth mobility in England, 1858-2012. Surnames and social mobility. Economic Journal, 125, 61-85.
Clark, G., Cummins, N., Hao, Y. & Vidal, D. D. (2015). Surnames: a new source for the history of social mobility. Explorations in Economic History, 55, 3-24.
Corcoran, M., Gordon, R., Laren, D., & Solon, G. (1992). The association between men's economic status and their family and community origins. Journal of Human Resources, 27(4), 575-601.
Duncan, G., Kalil, A., Mayer, S. E., Tepper, R., & Payne, M. R. (2005). The apple does not fall far from the tree, in Samuel Bowles, Herbert Gintis and Melissa Osborne Groves (eds.), Unequal Chances: Family Background and Economic Success, pp. 23-79, Princeton University Press.
Güell, M., Rodríguez Mora, J. V. & Telmer, C. (2014). The informational content of surnames, the evolution of intergenerational mobility and assortative mating. Review of Economic Studies, Advance Access published online December 10, 2014, doi:10.1093/restud/rdu041.
Haider, S. J. & Solon, G. (2006). Life-cycle variation in the association between current and lifetime earnings. American Economic Review, 96(4), 1308-1320.
Hertz, T., Jayasundera, T., Piraino, P., Selcuk, S., Smith, N., & Verashchagina, A. (2007). The inheritance of educational inequality: International comparisons and fifty-year trends. The BE Journal of Economic Analysis and Policy, 7(2).
Lee, C. I. & Solon, G. (2009). Trends in intergenerational income mobility. The Review of Economics and Statistics, 91(4), 766-772.
Long, J., & Ferrie, J. (2007). The path to convergence: Intergenerational occupational mobility in Britain and the U.S. in three eras. The Economic Journal, 117(519), C61-C71.
Long, J., & Ferrie, J. (2013). Intergenerational occupational mobility in Great Britain and the United States since 1850. The American Economic Review, 103(4), 1109-1137.
Lubotsky, D. & Wittenberg, M. (2006). Interpretation of regressions with multiple proxies. The Review of Economics and Statistics, 88(3), 549-562.
Lucas, R. E., & Kerr, S. P. (2013). Intergenerational income immobility in Finland: Contrasting roles for parental earnings and family income. Journal of Population Economics, 26(3), 1057-1094.
Mazumder, B. (2005). Fortunate sons: New estimates of intergenerational mobility in the United States using social security earnings data. The Review of Economics and Statistics, 87(2), 235-255.
Nybom, M. & Stuhler, J. (forthcoming). Heterogeneous income profiles and life-cycle bias in intergenerational mobility estimation. Journal of Human Resources.
Nybom, M. & Vosters, K. (2015). Intergenerational persistence in latent socioeconomic status: Evidence from Sweden. SOFI Working Paper 3/2015.
Panel Study of Income Dynamics, public use dataset. Produced and distributed by the Institute for Social Research, University of Michigan, Ann Arbor, MI (accessed Dec 2013).
Sewell, W. H., & Hauser, R. M. (1975). Education, occupation, and earnings. Achievement in the early career. New York: Academic Press.
Solon, G. (1992). Intergenerational income mobility in the United States. The American Economic Review, 82(3), 393-408.
Solon, G. (1999). Intergenerational mobility in the labor market, in O. Ashenfelter and D. Card (eds.), Handbook of Labor Economics, 3A, pp. 1761-1800, Amsterdam: North-Holland.
Solon, G. (2015). What do we know so far about multigenerational mobility? NBER Working Paper No. w21053.
Torche, F. & Corvalan, A. (2015). Estimating intergenerational mobility with grouped data: a critique of Clark's The Son Also Rises. Working Paper 15-22, NYU Population Center.
Zimmerman, D. J. (1992). Regression toward mediocrity in economic stature. The American Economic Review, 82(3), 409-429.

Chapter 2

Intergenerational Persistence in Latent Socioeconomic Status: Evidence from Sweden1

2.1 Introduction

Researchers and policymakers have long shown a great deal of interest in understanding the degree of socioeconomic mobility within and across societies, resulting in a large body of economic research examining the extent to which income differences are passed on from parents to their children. One of this literature's most notable results is that intergenerational mobility in the Nordic countries is substantially higher than in countries such as the United States. However, recent work by Gregory Clark and coauthors has led to surprisingly contrary conclusions, suggesting that the "true" rate of mobility is generally very low and also steady across time and countries with vastly different social and economic contexts, including Sweden and the United States (Clark 2014, p. 107).

The descriptive literature on intergenerational income mobility generally estimates an equation resembling a basic AR(1) process:

$y_{it+1} = \beta y_{it} + \varepsilon_i$, (1)

where $y_{it+1}$ is offspring log income of family $i$, $y_{it}$ is parental (typically fathers') log income, and $\varepsilon_i$ an idiosyncratic error; $\beta$
is then interpreted as the intergenerational elasticity.2 This process is not necessarily taken literally, nor is the estimate believed to be causal, but instead the goal is to obtain a summary statistic describing how differences in economic status persist from one generation to the next. For Sweden, the estimated persistence in income is around 0.2-0.3, compared to 0.4-0.6 in the U.S. (see Solon, 1999; Björklund & Jäntti, 2009; Black & Devereux, 2011). The greater mobility in the Nordic countries is often attributed to policy differences, such as more redistributive tax structures, which facilitate public human-capital investments in terms of subsidized pre-school and college education.3 Others point out that characteristics of the labor market also matter, such as differences in the returns to skills and the intergenerational transmission of employers (Björklund et al., 2012; Corak & Piraino, 2011). However, Clark (2014, p. 5) follows the former argument, boldly interpreting the low and constant rates of mobility as evidence of large policy failure.

1 This chapter is coauthored with Martin Nybom from the Institute for Labor Market and Education Policy Evaluation (IFAU) and the Swedish Institute for Social Research (SOFI), Stockholm University.
2 This parameter thus measures persistence, whereas $1-\beta$ is a measure of mobility. For this equation and all those that follow, variables are considered in deviation-from-mean form, allowing intercepts to be suppressed.
3 Public investment in children's human capital is put forth as one of the key determinants of the size of the reduced-form intergenerational income relationship in Solon's (2004) log-linear version of Becker and Tomes' (1979) model of parental investments.

The creative methods underlying this recent work exploit the information content of uncommon surnames in lieu of actual intergenerational family links, and the results paint an extraordinarily different picture of mobility for Sweden as well as for other countries.4 The persistence rate for underlying status is estimated to be as high as 0.7-0.8 across a wide range of societies and time periods, leading to the conclusion that for Sweden, "The implied social mobility rates are as low as those of modern England or the United States" (Clark 2014, p. 41). These claims are quite controversial, with important implications for the interplay between policy and socioeconomic mobility. They also clearly contradict conclusions from prior intergenerational studies.

4 Güell et al. (2015) show that rare surnames do contain such information, developing a different method using the joint distribution of surnames and economic status to examine intergenerational transmission of status in Spain.

Acknowledging this incongruity, Clark and coauthors suggest that conventional methods have been limited in their measures of socioeconomic status. The main argument is that families have a general social status that underlies imperfect status measures such as income, education, or occupation, and these measures are linked to this underlying and unobserved latent factor with substantial random components. Formalized into a simple model, social mobility is reduced to a universal law of mobility, $x_{it+1} = b x_{it} + e_{it}$, where $x_{it}$ is the underlying status of family $i$ in generation $t$ (Clark, 2014). A single measure such as income is then assumed to be related to status with some additive random noise, $y_{it} = x_{it} + u_{it}$, whereby substituting this into the conventionally estimated equation (1) leads to the classical errors-in-variables attenuation bias. Averaging within a surname, $z$, then reveals true status, $\bar{y}_{zt} \approx \bar{x}_{zt}$, as $\bar{u}_{zt}$ is approximately zero for large enough surname groups.

For data without surnames, Clark & Cummins (2015) propose that if the information from multiple measures (for example, income, education, and occupation) were combined for an individual, then conventionally estimated persistence would rise. Applying an approach proposed by Lubotsky & Wittenberg (2006) to optimally aggregate information from multiple measures, Vosters (2015) tests this proposition using data from the Panel Study of Income Dynamics (PSID). The estimated persistence rates remain just under 0.5 and are insignificantly different from conventional estimates, even after accounting for multiple partial measures of underlying status. While this study shows that this approach does not substantially raise estimated persistence for the U.S., the question remains as to what information could be extracted from multiple status measures in a country with a more redistributive welfare state, such as Sweden. In fact, according to Clark's hypothesis, this approach should have a greater impact on estimated persistence in settings where persistence is conventionally estimated to be quite low. Therefore, we follow the above approach, performing similar tests to look for any evidence of this asserted attenuation bias in conventional estimates for Sweden.

We first provide estimates using measures constructed to take full advantage of the rich Swedish data. We construct nearly career-long income measures, which mitigates biases stemming from transitory fluctuations (Mazumder, 2005) and life-cycle effects (Haider & Solon, 2006; Nybom & Stuhler, forthcoming) in short-run incomes.
Our data also have more detailed occupation categories available, allowing us to better examine the degree to which information on status can be extracted from an individual's occupation. Moreover, the small sample size in Vosters (2015) yields low statistical precision and only very large attenuation biases can be formally rejected. In contrast, our sample consists of more than 167,000 parent-child pairs, which provides much greater statistical power.

We also examine the claim that persistence is uniform across countries. For this, we provide estimates using variables constructed similarly to those based on the PSID, facilitating a test of whether persistence in underlying status is indeed of equal magnitude in Sweden and the U.S. In doing so, we also indirectly address implications of various data limitations with the U.S. data, such as inaccurate measurement of long-run income and occupations. As such, not only do we obtain results comparable to those for the U.S. to evaluate the applicability of the simple law of mobility across countries, we also obtain more robust results on the magnitude of the hypothesized attenuation bias in the Swedish estimates.

We find no evidence to support the simple law of mobility, as persistence estimates remain around 0.25-0.30 even after multiple measures are combined. Further, our comparison with the U.S. confirms the prior perception that mobility is indeed substantially higher in Sweden. These results are robust across a variety of specifications and methods for constructing the measures, and the country difference in persistence appears even greater when using measures constructed to mimic those used for the comparable U.S. study. Our findings thus support those in Vosters (2015), as well as the discussion in Chetty et al. (2014), suggesting that the very low mobility rates provided by the surname approach strongly underestimate the degree of mobility in the population as a whole.5

5 See also Braun & Stuhler (2015), who discuss the surname approach in the context of mobility across multiple generations.

Although much of our evidence reaffirms results from existing literature rather than lending support to Clark's conclusions, we do find that the latent status framework of the simple law can be empirically relevant for certain groups (e.g., mothers). Motivated primarily by the concept of family status, we add the analogous status measures for mothers as we incorporate those for fathers. We find mothers' occupation to be the most important addition, though producing only a nominal rise in persistence. Further exploring this result, we examine persistence with respect to mothers' status alone, finding that both the estimates of mother-son and mother-daughter relationships increase substantially when multiple measures are used, though rising from low levels compared to those typically found for father-child relationships. Still, these results highlight the unintentional implications of this framework for measurement issues specific to women, showing that combining multiple proxy measures can provide more informative estimates in cases where appropriate income data is not available.

Our paper thus extends the literature on the measurement of intergenerational mobility. To date, research has mostly focused on the measurement of specific status indicators, with the approximation of lifetime (or permanent) income being the prime example. Inspired by the work by Gregory Clark and others, we complement this research by providing new evidence on whether such status indicators themselves, even when accurately measured, suffice to capture a broader concept of socioeconomic status. Our findings imply that for men detailed measures of long-run income are indeed good proxies for latent status.
In contrast, for women, combining individual income information with occupation improves the measurement of status substantially. We also add to the large literature on cross-national mobility differences (e.g., Solon, 2002). The finding based on surnames data, that social mobility is constant across countries, is put into question by our results; these show a Sweden-U.S. country differential that is in line with previous income-based evidence.

The rest of the paper unfolds as follows. Section 2 describes the data, and Section 3 discusses our empirical approach. Section 4 presents our main findings. Section 5 presents our extension to both parents, and the intergenerational associations related to mothers and daughters. The final section offers some concluding remarks.

2.2 Data

2.2.1 Sources and Sample Selection

We use administrative data from various sources, which have been merged by Statistics Sweden using unique personal identifiers. A multigenerational register links children to their biological parents; censuses provide data on parents' occupation and education; income tax declarations for both parents and offspring provide data on total individual income. Our main sample is based on a random draw of 35 percent of all children born in Sweden 1951-1961 and their biological parents.6 We restrict our analysis to these cohorts for a couple of reasons. Given the available income data, we can observe long-run prime-age incomes of both these offspring cohorts and their parents. Moreover, these are the cohorts used in Vosters (2015), so this selection further facilitates the comparison between our estimates for Sweden with those for the U.S.

2.2.2 Construction of Status Measures

For annual income, we use administrative data covering the years 1968-2007. The data are based on individual income-tax declarations and we define the income measure separately for fathers and mothers.
Our measure includes income before taxes from all sources except means-tested benefits and universal child benefits. These data come with a number of advantages: they are almost entirely free from attrition and reporting error, pertain to all jobs, and are not censored.7 For parents, we approximate log lifetime income by the average of log annual income over ages 30-60. For offspring, we construct measures of log lifetime income as the average of log income over ages 27-46. We require parents and offspring to have at least five non-missing annual income measures.8 Throughout all analyses we control flexibly for parental and offspring birth year using cohort dummies.

6 We exclude those with parents who were more than 40 years old at the birth of the child.
7 In contrast to many other administrative data sources, our data are not censored (nor truncated) in the top of the income distribution. Further, the Swedish system provides strong incentives to declare some taxable income since doing so is a requirement for eligibility to most social insurance programs. Hence, we expect very little missing data in the low end of the distribution.

While the tests we conduct focus on the measurement of "status", our income measure is particularly important because it also minimizes the potential for two well-known sources of bias in estimated intergenerational associations: bias arising from transitory shocks to income and from life-cycle effects. By using a long-run average of annual income observations, potential attenuation biases from transitory fluctuations are greatly reduced (Mazumder, 2005). In our sample, 88 percent of sons have 20 non-missing log income observations and 88 percent of fathers have at least 10 non-missing log income observations. Further, we measure income as long-run averages during mid-life in order to minimize so-called life-cycle bias (Nybom & Stuhler, forthcoming).
While there are slightly fewer mid-life income observations for fathers, 91 percent of them have at least one annual income observation from before age 50.

[8] Missing income is rare in our sample, and such occurrences could be due to quite different reasons; individuals could be living abroad, they could fail to file their tax declaration, or it might arise due to coding errors.

We use occupation data from national censuses conducted every five years between 1960 and 1990. The occupational classification employed in the censuses builds on the Nordic Occupational Classification (NYK), which is based on the International Standard Classification of Occupations (ISCO). The NYK categorizes occupations according to the end result of the tasks and duties undertaken in the job. Hence, level of education and professional status are typically not considered in the categorization (Statistics Sweden, 2004). The classification has a hierarchical structure, allowing for analyses at different aggregation levels. Three-digit codes denote unique occupations, two-digit codes denote minor occupation groups and one-digit codes denote major occupation groups. To fully exploit the available information, we use the unique occupation indicators in our main analyses, but also test the sensitivity of our results to using the broader classification levels. We define a parent's occupation as the occupation he or she had in the 1970 census. Fathers in our sample are, on average, about 44 years old in 1970, so this census provides a good prime-age occupation measure. If occupation is missing in this census, however, we use the corresponding data from the 1975 or the 1980 censuses.[9] For those with occupation still not coded, we include indicators for missing and undefined in our main specifications to flexibly account for these special cases.
Including missing and undefined as separate categories, the resulting sample holds 270 unique occupations classified into 61 minor occupation groups, or 12 major occupation groups. To demonstrate the nature of the classification, the major occupation groups are: 1) Professional work (arts and sciences); 2) Managerial work; 3) Clerical work; 4) Wholesale, retail, and commerce; 5) Agriculture, forestry, hunting, and fishing; 6) Mining and quarrying; 7) Transportation and communication; 8) Manufacturing; 9) Services; 10) Military/Armed forces; 11) Undefined; 12) Missing.

For parental education, we use data on final education in 1970 according to the data from Statistics Sweden's education register, which is based on a standard conversion translating each level into years of education. The measures of parental education reflect their highest educational attainment, with the levels including: less than nine years of primary school, nine years of primary school, two-year secondary school, three-year secondary school, less than three years of post-secondary school, three years or more of post-secondary school, and graduate school. We also perform a set of robustness tests in which we control for education more flexibly. First, we again use the above measure but now by including a dummy variable for each of the different levels. Second, we also exploit more detailed information on educational attainment from the same data source. In doing so, we include a large set of dummies reflecting length and type of education, distinguishing between various tracks within high school as well as a large number of different academic and vocational post-secondary educational categories.

[9] Incorporating the later censuses is primarily beneficial in obtaining more accurate information on occupation for mothers, who are more likely to have missing data in 1970. Very few fathers have missing occupation in 1970.
Because we are exploring implications of aggregating the information on parental income, education, and occupation, we only include parent-child pairs for which the parents have non-missing information on all of these measures and the child has the requisite non-missing income measures. Table B1 provides descriptives for our resulting main sample of 167,552 sons matched to 153,920 fathers.

2.2.3 Alternative Measures for U.S. Comparison

We also construct alternative measures to facilitate a Sweden-U.S. comparison based on comparable findings in Vosters (2015). The analysis by Vosters is based on data from the nationally representative part of the Panel Study of Income Dynamics (PSID), which began with a sample of about 3,000 families in 1968. Importantly, the PSID includes family links and follows original sample members and their children over time. Fathers are identified as the male head of the household in which the child resided at the time of the initial survey, which does not necessarily represent a biological link. Thus, our Swedish sample differs slightly in that we use biological rather than cohabiting fathers.[10]

[10] For approximately 95 percent of the sons in the Vosters (2015) PSID sample, the identified cohabiting father is in fact the biological father, so this difference is minor.

To enable a credible cross-country comparison, we construct alternative measures for Sweden that are analogous to those from the PSID. For offspring income, we use the log of annual income in 1991. Fathers' income is defined as the average of log income in 1968-72.[11] Our education measure is very similar, reflecting the highest level of attainment. For occupation, we use the major groups described above, which differ slightly but not much from the seven groups used in the PSID (see Vosters, 2015).
To better match the last "residual" category in the PSID, we add missing and undefined occupations to our military/armed services category, resulting in 10 major categories for the Swedish sample. In the U.S. data, education and occupation are from the 1968-1969 surveys, while our corresponding data are from 1970. Although there are minor differences in some variable definitions across the two countries, they are marginal at most and should have very little effect on our results. The sample with non-missing data on these measures includes 146,783 sons matched to 135,020 fathers.

We provide descriptive statistics for both the full sample and this restricted U.S. comparison sample in Table B1. The samples are very similar across all observables. Sons' average income is slightly higher than that for fathers; in logarithmic form, these averages are 12.22-12.29.[12] Fathers' average education of just over 9 years, as well as the distribution among the various levels of attainment, is nearly identical across samples, as are the proportions in each occupation category. Professional work and manufacturing comprise much of the sample of fathers, with 19 and 38 percent in the respective categories.

[11] Since our income data start in 1968, this measure is marginally different from the U.S. data that are based on earnings in 1967-71.

[12] Income is provided in 2005 Swedish kronor (SEK).

2.3 Empirical Approach

Our empirical approach is designed to test the hypothesis that estimates of intergenerational persistence in socioeconomic status approach 0.7-0.8 as we add the proposed partial measures. We then proceed to contrast our results with comparable estimates for the U.S. to also test the claim that persistence in latent status is the same across countries. We begin by obtaining a baseline estimate of persistence by estimating the usual intergenerational income equation above in (1).
To gauge the degree of attenuation bias in this estimate, we then add the additional measures of parental status to this equation. Although this provides insights into the sensitivity of conventional estimates to accounting for other status measures, it does not provide a single estimate of persistence in underlying status that combines information from all measures. Our preferred method, proposed by Lubotsky & Wittenberg (2006), estimates the persistence in latent status, aggregating the information in the included proxy measures.

To better illustrate our methodological approach, we first present the hypothesis in a simple latent variables framework, writing measurement equations for each of the partial measures, $y_{jit}$, of the form:

$$y_{jit} = \rho_j x^*_{it} + u_{jit}, \tag{2}$$

where $j$ indexes the measure, $i$ indexes family, and $t$ generation. We generalize the measurement equations from the simple law to allow for slope coefficients, $\rho_j$. Our main empirical specifications include equations for $y_{1it}$ for parental (e.g., fathers') income, $y_{2it}$ for parental education, and $y_{3it}, \dots, y_{kit}$ for the $k-2$ parental occupation indicators. $x^*_{it}$ is the unobserved latent status and the $u_{jit}$ are the measurement errors. The so-called structural equation can then be written:

$$y_{it+1} = \beta x^*_{it} + \varepsilon_{it+1}, \tag{3}$$

where $\beta$ is the measure of intergenerational persistence in underlying latent status. This notation shows that we do not explicitly address offspring's latent status with multiple partial measures.[13] However, the outcome variable we do use (a twenty-year average of annual incomes during mid-life) is likely one of the best single measures of socioeconomic status available. Further, the simple conditions underlying the simple law of mobility rely on the assumptions of a classical errors-in-variables model, under which measurement error on the left-hand side is innocuous. Under the classical assumptions, the measurement errors ($u_{jit}$) are all uncorrelated, and the coefficients $\rho_j$ are equal to one.
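As a hedged illustration of what these classical assumptions imply (a standard textbook result, not a derivation taken from this chapter), consider using a single proxy, fathers' income with slope coefficient equal to one, in place of latent status. The variance symbols and the reliability ratio $\lambda$ are notation introduced here for the sketch, and cohort controls are ignored.

```latex
\operatorname{plim}\,\hat{\beta}_{OLS}
  = \frac{\operatorname{cov}(y_{it+1},\, y_{1it})}{\operatorname{var}(y_{1it})}
  = \beta \,\frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_{u_1}}
  \equiv \beta\,\lambda, \qquad 0 < \lambda \le 1 .
% Illustrative arithmetic only: if \beta were 0.75 (the midpoint of the
% hypothesized 0.7-0.8 range) and the single-proxy estimate were 0.23
% (the baseline reported in Section 2.4.1), then
% \lambda = 0.23 / 0.75 \approx 0.31 .
```

Read this way, the hypothesis under test requires that income carry less than a third of the signal in latent status, which is the sense in which prior estimates would have to be heavily attenuated.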
In this simple case, there are several econometric methods available. For example, instrumental variables (IV) estimation using one measure to instrument for another is a common solution. We provide one such estimate, using father's education to instrument for father's income, which under the proposed law should estimate persistence levels in the 0.7-0.8 range.[14] Other possible approaches include the MIMIC (multiple indicators, multiple causes) or LISREL frameworks (see, e.g., Jöreskog & Goldberger, 1975, and Bollen, 1989). More recently, Black & Smith (2006) propose a GMM estimator with potential efficiency gains. However, each of these approaches relies critically on the assumption of uncorrelated measurement errors, and we find this restriction to be particularly concerning in the setting considered here.[15] First, the nature of the suggested measures (income, education, and occupation) makes the case of zero correlation among measurement errors unlikely. Second, the anecdotal examples used to motivate the concept of underlying latent status directly imply correlation among the measurement errors.[16] Further, our main purpose is not to point identify $\beta$; rather, we seek to test whether attenuation bias decreases as multiple proxies for latent status are taken into account. The LW approach is in this respect superior, allowing us to compare different lower bounds without making restrictive assumptions on cross-correlations of the measurement errors.

[13] To assess sensitivity to this choice, we performed two different tests. First, we created omnibus measures of status for fathers and sons (applying fathers' weights to sons' measures) and obtained an estimate of 0.237, which is nearly identical to the comparable estimate (0.238) using only fathers' measures. Second, we used average log incomes across same-sex siblings as a measure of offspring status (excluding those without same-sex siblings from the sample). While baseline estimates and thus the scaling differ in the latter case, the estimated decrease in attenuation bias is very similar.

[14] Note that in this particular IV setup, consistency requires only the coefficient in the income measurement equation to equal one (and the measurement errors still being uncorrelated), which is not problematic as this is the normalization we adopt for our preferred approach described below. This normalization simply sets the scale of latent status to be on that of fathers' income.

[15] If the measurement errors were positively correlated, Black & Smith (2006) point out that the IV estimate from using one measure to instrument for the other provides a benchmark for a lower bound. In our case though, the measurement errors may be negatively correlated, which would leave the IV estimate biased upward.

In addition to relaxing the assumption of zero correlation among the measurement errors ($u_{jit}$), we also allow the coefficients, $\rho_j$, in the measurement equations to be mostly unrestricted (subject to a normalization discussed below). The approach from Lubotsky & Wittenberg (2006; henceforth, LW) is ideally suited for this scenario, as it actually exploits the correlation in the measurement errors and estimates the coefficients in the measurement equations. In fact, the LW approach incorporates the information from all included measures of status in an optimal fashion, producing the estimate of persistence with the least attenuation bias. The LW estimator can be written as:

$$\beta_{LW} = \rho_1 \phi_1 + \rho_2 \phi_2 + \dots + \rho_k \phi_k, \tag{4}$$

where the $\rho_j$'s are estimates of the slope coefficients in the measurement equations, and the $\phi_j$'s are obtained from an auxiliary OLS regression described below. Hence, actual computation entails a multistep process.

The first step of the LW approach is to obtain the auxiliary OLS coefficient estimates of $\phi_j$
from regressing the dependent variable on all measures of status:

$$y_{it+1} = \phi_1 y_{1it} + \phi_2 y_{2it} + \dots + \phi_k y_{kit} + v_{it+1}. \tag{5}$$

[16] Clark (2014, p.11) refers to education being a poor measure of status for Bill Gates (who presumably has high status), as he is a college dropout but has incredibly high income. Conversely, the other example posits that income would be a poor measure of status for a philosophy professor, whose education would be a more appropriate measure. These scenarios imply a negative correlation among the measurement errors for income and education.

To identify the coefficients in the measurement equations, we need a normalization assumption on one of the $\rho_j$'s. We normalize $\rho_1 = 1$, which simply sets the scale of the latent status to be on that of fathers' log income.[17] This implies the following formula to obtain estimates of the $\rho_j$'s:

$$\rho_j = \frac{\operatorname{cov}(y_{it+1},\, y_{jit})}{\operatorname{cov}(y_{it+1},\, y_{1it})}. \tag{6}$$

Estimating this ratio can be done in a single step via IV estimation, with $y_{jit}$ as the outcome variable and using $y_{it+1}$ to instrument for $y_{1it}$. We obtain standard errors for the $\beta_{LW}$ estimate using a block bootstrap (100 replications) to account for within-family correlation. While not identifying $\beta$ itself, this estimator provides an estimate of $\beta$ with the least attenuation bias based on the joint information in the proxy measures of status. If the simple law of mobility does hold, we should see estimated persistence levels rising as we add these measures of status.

In addition to the proclaimed elevated persistence (i.e., lower mobility), the other controversial aspect of the simple law is the assertion that rates of mobility are constant across countries. To facilitate a cross-country comparison between Sweden and the U.S., we estimate analogous specifications using a Swedish sample with the measures constructed similarly to those used for the U.S. by Vosters (2015).
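The two computational steps behind equations (4) to (6) can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names, the simulated design (a true persistence of 0.5, two proxies labelled income and education, negatively correlated measurement errors), and the pure-Python linear solver are hypothetical choices made here. Cohort dummies and the block bootstrap are omitted.

```python
import random


def cov(a, b):
    """Sample covariance (divisor n; the divisor cancels in every ratio below)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gauss-Jordan elimination with partial pivoting."""
    k = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [aug[r][k] / aug[r][r] for r in range(k)]


def lw_estimate(y_child, proxies):
    """Return sum_j rho_j * phi_j, with rho for the first proxy normalised to one."""
    s = [[cov(p, q) for q in proxies] for p in proxies]
    c = [cov(y_child, p) for p in proxies]
    phi = solve(s, c)                # auxiliary OLS slopes, as in equation (5)
    rho = [c_j / c[0] for c_j in c]  # covariance ratios, as in equation (6)
    return sum(r * f for r, f in zip(rho, phi))  # equation (4)


def simulate(n=20000, beta=0.5, seed=0):
    """Hypothetical data: two proxies whose errors are negatively correlated."""
    rng = random.Random(seed)
    y_child, income, education = [], [], []
    for _ in range(n):
        status = rng.gauss(0, 1)  # latent status, unobserved in practice
        e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
        u_income = e1
        u_education = -0.5 * e1 + 0.75 ** 0.5 * e2
        income.append(status + u_income)
        education.append(0.8 * status + u_education)
        y_child.append(beta * status + rng.gauss(0, 0.75 ** 0.5))
    return y_child, income, education


y_child, income, education = simulate()
ols_income_only = cov(y_child, income) / cov(income, income)
lw_income_only = lw_estimate(y_child, [income])
lw_both = lw_estimate(y_child, [income, education])
print(ols_income_only, lw_income_only, lw_both)
```

With income as the only proxy the LW estimate reduces to the OLS slope, and adding the second proxy moves the estimate toward, but not up to, the persistence parameter used in the simulation. The covariance-ratio form is used for the slope coefficients instead of a separate IV routine because the two are numerically the same quantity.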
From these analogous specifications we can also examine the consequences of various data limitations within the Swedish setting, thus providing indirect evidence on whether the U.S. estimates would change if based on richer data. We also conduct various robustness checks with other constructs of the income, education, and occupation measures.

[17] This normalization hence allows the LW estimate to be directly comparable to the conventionally estimated intergenerational income elasticity. In fact, in the case where income is the only status measure used, it is easily seen from equations (4) and (5) that the LW estimate is identical to this conventional estimate.

Further, because the hypothesized simple law relies on the social status of families, we extend our analysis to other family members, by adding the analogous measures for mothers, as well as estimating specifications with only mothers' measures. This exercise provides some suggestive evidence not only on the role of mothers but also on new methods for measuring mothers' status. In addition, given the paucity of evidence on intergenerational persistence for daughters, we also extend our analysis to daughters.

2.4 Empirical Results

2.4.1 Main Results

We first examine the conventionally estimated intergenerational persistence of income in Sweden, and whether adding additional partial measures affects the estimated coefficient on log income. In these and all other estimations, we control flexibly for cohorts of each generation using birth-year dummies. For the results presented in Table B2, we use the long-run average of sons' log income as the dependent variable, and fathers' measures of status include the long-run average of fathers' log income, educational attainment, and unique occupation indicators. The first set of results in Table B2 provides OLS estimates (omitting those for the 269 occupation indicators for brevity), with columns [1]-[4] progressively adding measures of fathers' status.
Note that these estimates also correspond to the OLS components ($\phi_j$) of the LW estimate obtained in the auxiliary regression. The baseline OLS estimate of equation (1) is 0.23. This estimate of the intergenerational income elasticity is of similar magnitude to previous estimates for Sweden. Moving to column [2], fathers' educational attainment is added to the regression, but the coefficient estimate on fathers' income remains nearly identical. When instead fathers' occupation indicators are added to the regression in column [3], the coefficient on income does fluctuate some, falling to 0.21. This estimate is hardly affected by the inclusion of education, as shown in column [4], indicating that while there is some sensitivity to the addition of the occupation measure, we see very little sensitivity to the addition of educational attainment.

With two noisy measures of status, and assumptions of classical measurement error, IV estimation provides a consistent estimate of intergenerational persistence in underlying status. Considering this scenario, the next rows of Table B2 include first and second stage results when instrumenting for fathers' income using fathers' education. This estimate of persistence is 0.24, similar to conventionally estimated persistence for Sweden. However, it is important to recognize the possibility that the assumptions for consistency may be violated. In particular, the measurement error in income as a measure of social status may be correlated with the measurement error in educational attainment, leaving the direction of bias unknown without further information on the nature of the correlation.

The final estimation approach, proposed by Lubotsky & Wittenberg (2006), exploits such violations by using the information on the relationships among the measurement errors and providing the greatest lower bound on persistence in underlying status. The LW estimate in column [1] is identical to the OLS estimate (by construction).
However, as we incorporate more measures of status, this approach provides a single estimate from an optimal aggregation of the information from all measures. Given that the OLS estimates shown in the top of Table B2 are underlying components of the LW coefficient estimate, it is unsurprising that adding education does not change the LW estimate, as shown in column [2]. Similarly, given the sensitivity of the OLS coefficients to adding the occupation indicators, the increased persistence with the inclusion of occupation in column [3] is somewhat expected. However, the modest rise from the conventional estimate of 0.23 to 0.26 when all suggested partial measures are included (column [4]) does not support the hypothesis of substantial attenuation bias in prior estimates. This pattern of results is similar to that found for the U.S. (Vosters, 2015), exhibiting minimal increases in persistence when more partial measures of status are included, despite claims of elevated persistence in all countries. However, an important difference here is the statistical certainty. Due to our much smaller standard errors, we can reject even moderate drops in attenuation bias (for this specific model).

2.4.2 A Comparison of Sweden and the United States

Next we turn to directly address the hypothesis that persistence in status is in fact constant across countries. Our main results (provided again in Table B3) show that the persistence estimates for Sweden remain in the previously cited range of 0.20-0.30. Further, these estimates are substantially lower than the U.S. estimates of 0.44-0.47 found by Vosters (2015), illustrating a meaningful distinction in persistence between the two countries. However, the Swedish measures are constructed differently (e.g., the long-run income measure and the unique occupation indicators). While the measurement differences would likely bias the U.S.
estimates towards the Swedish estimates, we carefully construct our measures to mimic those used by Vosters to allow for a more sound comparison. Using the Swedish data, we also indirectly gauge what the estimated persistence in status might look like in the U.S. if based on richer data.

With the five-year average of log income and broad occupation categories constructed to match those for the U.S., we find that estimated persistence in Sweden is lower at 0.19-0.22. To check whether this might be due to sample composition differences between our main sample and this smaller sample, we also analyze the same sample using our original measure constructs, and find estimates (0.23-0.26) nearly identical to our main results. Sample composition does not appear to be driving the differences. While our results do show that the U.S. estimates may be somewhat attenuated, possibly by some 10-20 percent, we can also see that the increase in estimates as additional measures are added does not change regardless of how measures are constructed; in no cases do the estimates rise substantially when additional measures of status are included. Further, the estimates for Sweden remain in the approximate range previously asserted in the literature, albeit at the low end around 0.20, and clearly differ from the estimates for the U.S.[18] Thus, our results fail to support either aspect of the simple law of mobility.

2.4.3 Robustness of Main Results

Next we examine the sensitivity of our main results to various modifications to the measures of status. For our measure of income, we did see some sensitivity to the number of yearly income observations included in the average, as the five-year average used for the Sweden-U.S. comparison produced lower estimates than our longer-term measure. Another more arbitrary aspect of our measure construction is the choice to use the average of the annual log earnings rather than the log of average annual earnings.
We provide estimates based on this alternative income measure construction in Table B4. While these estimates are slightly higher than our main results, they still remain in the typical range of estimates for Sweden. Moreover, the general pattern of the estimates as additional measures of status are added remains unchanged.

[18] That the estimate for this specification is slightly lower than previous ones in the literature is not unexpected. While previous estimates have been based on long-run income measures and an optimal use of existing data, our goal here is to use data constructs comparable to the U.S.

The other adjustments to the income measure, as well as the education and occupation measures, are motivated in part by our chosen empirical approach. For example, our long-term income measure gives equal weight to each annual measure from age 30 to 60, while each annual measure entering separately would allow the LW method to optimally choose these weights, which may vary over the life cycle. However, since the LW method also excludes any observations with a missing covariate and several of the fathers in our sample have incomplete income histories, we estimate specifications that include annual log income from age 40 to 50, to reduce the data requirements while still focusing on income measures during mid-life. The persistence estimates are higher, ranging from 0.25 to 0.30, but are based on a much smaller and presumably more homogeneous sample of fathers who have log income observed in each of these eleven consecutive years. For comparison, we also estimate persistence based on the average of these annual log incomes, finding persistence estimates to be slightly lower (0.24-0.30), suggesting only trivial gains from allowing the LW method to determine the weights on the separate annual income measures. For another point of reference, the corresponding estimates using our baseline income measure for this sample are 0.28-0.33, which are even higher.
So it appears that this sample exhibits more persistence than the full sample, but we also see that our longer-term average is serving as a better proxy for status than using the more flexible annual income measures when limited to fewer years.

We also adjust the education and occupation measures. For our main specifications, educational attainment enters under the assumption of a linear relationship in years of schooling. We relax this by using indicators for each level of highest attainment. Even with this flexible approach, education does not appear to provide substantial information on status (conditional on income), with estimates increasing by less than 0.01. We also estimate specifications indicating the type of education along with each level, again with increases of less than 0.01 in the estimate. Our main specification used the most flexible measure available for occupation. However, these detailed occupation indicators can be grouped into minor or major occupation groups (similar to those used for our U.S. comparison), resulting in estimates of 0.25 and 0.24, respectively. We thus see some numerical sensitivity of the estimates in this regard, though not to an extent that would affect the conclusions reached with our main analysis. In our main analysis, we included observations with occupation missing or undefined, accounted for using separate category indicators. When excluding these two groups, the baseline estimates increase by around 0.05. However, this modest numerical change has little effect on our main conclusions regarding the level of persistence in Sweden (or the comparison to the U.S.).

In general, our robustness checks in Table B4 show that while there is some sensitivity of the estimates to how the partial measures are constructed, none of the changes are meaningful qualitatively.
In particular, they do not change our conclusion that estimated persistence is not converging to 0.7-0.8 as additional measures are included, nor the conclusion of higher mobility in Sweden relative to the U.S.

2.4.4 Extension to Mothers and Daughters

Our results thus far have focused on male lineages, as is common in the intergenerational literature (including Vosters' and Clark's work). However, the simple law is described to pertain to underlying latent family status. To more appropriately address the concept of family status, we perform tests analogous to those above but including mothers' income, education, and occupation in addition to the same measures for fathers. This extension is warranted both by the specific hypothesis we are testing and by the dearth of evidence pertaining to mothers. To supplement the limited evidence in the literature, we also estimate persistence based on only mothers' status, and then attempt to disentangle contributions of status measures separately for mothers and fathers, in determining their child's later socioeconomic status. Since intergenerational associations for daughters are also much less common in the literature (especially mother-daughter associations in individual income), we conduct all of these tests for daughters as well.[19]

[19] Chadwick & Solon (2002) for the U.S. along with Raaum et al. (2007) for several different countries look at intergenerational income associations for daughters, circumventing the labor force participation issues by using a family income measure. Altonji & Dunn (1991) comprehensively look at associations in family income and individual income, for all parent-child pairs, using U.S. data.

Similar to our main analysis, we begin by obtaining a baseline estimate via OLS and further augment the regression with additional measures.
The first set of results in Table B5 replicates the main analysis for fathers and sons, only now restricting the sample to sons matched to both a father and mother, to facilitate comparisons with the different parent-offspring samples considered here.[20] For sons, the coefficient on fathers' income is not affected by the addition of education, while for daughters, adding fathers' education does seem to matter. The estimates for both daughters and sons are somewhat sensitive to accounting for fathers' occupation. When we add the corresponding measures for mothers to each of these specifications, the changes in the coefficient estimates are negligible for both sons and daughters (comparing the first panel to the second). The last set of results is for specifications using only mothers' measures. As the coefficient on mothers' income is very low, these estimates illustrate why mothers are generally not considered in studies of intergenerational income persistence. While today Sweden indeed has a high rate of female labor force participation, it was much lower for the cohorts of mothers in our sample (born before 1940), and thus individual income is a very noisy indicator of socioeconomic status.

In Table B6 we present the LW results, which aggregate information from additional status measures for each of the different parent-child samples. For fathers and sons, the results are nearly identical to the main results from the full sample, with persistence estimates ranging 0.23-0.26. For daughters, the intergenerational persistence in status with regard to their fathers is slightly lower, with estimates ranging 0.15-0.19. An important difference is that fathers' education does matter for the association in status with daughters, while it did not for sons. Fathers' occupation is similarly important for persistence in status with daughters and sons. The results for mothers are more striking, showing that mothers' occupation is crucial for measuring mothers' status.
This holds especially when considering intergenerational associations with sons, as the estimated persistence rises from 0.03 to 0.24. For daughters, the corresponding increase is from 0.06 to 0.13. These estimates are similar to the mother-son association found for the U.S. by Altonji & Dunn (1991), though they obtained a larger mother-daughter estimate. Clearly income is a very poor measure of status for mothers, and this is further confirmed by the results in Table B6; what was not obvious in the OLS results in Table B5 is the substantial impact of accounting for mothers' occupation, which is made apparent by the LW method's aggregation of all information contained in mothers' income, education, and occupation. Education is also salient to mothers' status, as shown in columns [6] and [8], though less so than occupation.

[20] Descriptives for these samples can be found in Table B7. OLS and LW results for the full mother-offspring and father-offspring samples can be found in Tables B8 and B9, respectively.

Next we include mothers' and fathers' measures jointly, to consider how persistence might change if we take more literally the concept of family status. When we compare these estimates to those accounting for only fathers' status (i.e., estimates reflecting the same information as most of the literature), we see that mothers' occupation does contain additional information on family status with respect to intergenerational transmission to sons, and even more so for daughters. Further, mothers' income seems salient to transmission of family status for daughters, a result consistent with Altonji & Dunn's (2000) finding that factors underlying earnings had stronger intergenerational associations along gender lines.

To further assess the relative importance of mothers' and fathers' status measures, we also attempt to separate the relative contributions of each parent to the intergenerational persistence estimate.
Decomposing the estimate into portions due to mothers' and fathers' status, we see in the bottom portion of Table B6 that the vast majority of the persistence for sons is coming from fathers' measures, with only 4-5 percent from mothers in the income and income/education specifications.21 Mothers' occupation appears more important though, shifting more weight to mothers so they account for 15-16 percent. For daughters, the role of mothers' status is more substantial, accounting for 32-43 percent of estimated persistence in underlying status.

Whether mothers should contribute (conditional on fathers) to intergenerational persistence in family status is an empirical question. Theoretically, one could posit several stories. For example, if we believe there to be substantial positive assortative matching on latent status in the marriage market, then we might expect mothers' or fathers' status measures to serve as equally suitable measures of family status. Indeed, for the sample of sons, LW estimates from specifications including all measures for fathers are very similar to those including all measures for mothers (0.26 and 0.25, respectively). However, this is not the case when occupation indicators are omitted; nor does it hold as convincingly for the sample of daughters (with estimates of 0.19 and 0.14). While previous studies have found evidence of positive assortative matching in both Sweden (Hirvonen, 2008; Nakosteen et al., 2004) and the U.S. (Chadwick & Solon, 2002), this does not seem to explain our results here. In auxiliary correlational analyses, we find the mother-father correlation in educational attainment to be the highest at 0.55, but the correlation in long-run income is low (0.06). The correlation between mothers' and fathers' estimated latent status is also low (0.08), which is not surprising given that income both weighs heavily in the status measures and exhibits low parental matching.
More likely, our results are explained by the well-known issues with using mothers' income, some of which we mentioned above. For education, it is less clear what the explanation is; educational attainment is believed both to suffer less from measurement problems and to exhibit smaller male-female differences than income does. However, we do see that combining information from income and education can mitigate these measurement issues, and adding occupation is especially helpful. So Clark & Cummins' (2015) proposition that persistence estimates will rise when combining information from multiple measures seems to have some merit for capturing intergenerational associations with mothers. Each of our noisy measures contributes to measuring mothers' status, though not to the extent of raising persistence estimates to the levels proposed in the simple law.

21 The decomposition is done by separating the sum over status measures that forms the LW estimate into the sum of elements from fathers' measures and the sum of elements from mothers' measures.

2.5 Conclusions

Clark's work shifts the focus to underlying socioeconomic status, which is described as a slightly different (presumably more general) concept relative to the purely economic ones economists have thus far considered. While it is not entirely clear to what extent these concepts should differ, Clark's work is painting an entirely different landscape for intergenerational persistence, provoking a new set of studies (such as this one) testing the surname results and associated hypotheses. Very few of these papers confirm the results found with the surnames approach or the proposed reasoning for the contradictory results, and the present paper is no exception (e.g., Chetty et al., 2014; Braun & Stuhler, 2015; Vosters, 2015). We tested two facets of the hypothesized simple law of mobility, failing to find evidence to support either claim.
We first looked for evidence of substantially increased intergenerational persistence in underlying social status in Sweden when information from several partial measures of parental status was combined. Incorporating information on educational attainment has almost no effect on the conventionally estimated persistence rate of 0.23. When occupation is included, the estimate increases slightly to 0.26, but does not come close to the hypothesized "true" persistence rate of about 0.7-0.8. We then investigated the claimed uniform persistence across countries, by comparing our Swedish estimates with those for the U.S. (presented in Vosters, 2015). Even after harmonizing our sample and variables so as to mimic those used in the U.S. study, our estimates still differ substantially, with the U.S. estimates of persistence being more than twice as large as the Swedish ones. Our analysis thus confirms the previously established higher levels of intergenerational mobility for Sweden relative to the U.S.

Prior studies, such as Goldberger (1989), also recognized that non-income measures may be important in measuring persistence in socioeconomic status.22 However, Clark's theory formalizes this notion with a very simple measurement error framework and proposes an easily testable hypothesis. So while Clark is not the first to emphasize the importance of non-income measures, the exercise of considering a more general latent status has also prompted various extensions to the literature. For example, ours along with Vosters (2015) is one of the first studies to aggregate information from different dimensions of status into a single measure of persistence. While sociologists and economists have included, say, income and education in the same regression, these have been attempts to identify mechanisms, or simply reactions to data limitations, rather than for the purposes of obtaining one aggregate persistence estimate.
Coupled with our method for obtaining an aggregate estimate, Clark's theory regarding latent status unintentionally inspires another important contribution to both the measurement and intergenerational literatures, enabling further examination of intergenerational associations related to mothers. Studies rarely consider status transmission from mothers to children, or even fathers to daughters, due to data limitations stemming from lower labor force participation rates for females. In the context of Clark's work, despite the underlying theory being presented in the realm of male lineages, the latent variable approach might be more relevant for females. Hence, we first extended our analysis to more carefully consider the concept of family status by accounting for mothers' in addition to fathers' status measures, which had very little impact on estimated persistence, especially for sons, with persistence rising to only 0.28.

Although beyond the scope of the simple law, we do find the framework to be more relevant to females, especially mothers. We show that a modified version of the measurement error framework presented by Clark proves useful in estimating intergenerational associations between mothers and their offspring. In contexts where income is a very noisy measure of socioeconomic status, as is often the case for mothers, supplementing this information with additional noisy measures can make an important difference, as shown by our analyses incorporating mothers' education and occupation.23 In fact, for daughters the intergenerational persistence estimates accounting for all measures are only slightly lower for mothers relative to fathers.

22 Sociologists also consider non-income measures, instead often focusing on social "class" and various measures of occupational prestige.
While these results warrant future research for a better understanding of these transmission channels, our results here illustrate what information might be gained by considering other estimation approaches, and other measures of status.

23 If the available income data is of low quality (or observed only as short snapshots), the same approach could also be potentially useful when studying men.

APPENDIX

Table B1: Summary Statistics for Full Sample and U.S. Comparison Sample

                                                 Full sample           U.S. comparison sample
Variable                                         mean       std dev    mean       std dev
Offspring
  Year of birth                                  1956       3.14       1956       3.12
  Average income, age 27-46                      250,584    198,680    253,318    201,623
  Average log income, age 27-46                  12.22      0.53       12.25      0.50
  Non-missing incomes, age 27-46                 19.52      1.81       19.66      1.41
  Log income in 1991                                                   12.23      0.70
  Years of education                             11.79      2.40       11.83      2.41
  Number of offspring (N)                        167,552               146,783
Fathers
  Age when offspring born                        30.05      5.13       30.22      5.06
  Year of birth                                  1926       6.53       1925       6.29
  Average income, age 30-60                      245,013    166,478    249,144    164,416
  Average log income, age 30-60                  12.26      0.48       12.29      0.45
  Non-missing incomes, age 30-60                 17.78      6.56       17.71      6.39
  Average log income 1968-72                                           12.28      0.48
  Years of education                             9.14       2.88       9.15       2.91
  Educational attainment (years)
    < 9 years of primary school                  0.58       0.49       0.58       0.49
    9 years of primary school                    0.04       0.21       0.04       0.20
    2-year secondary school                      0.18       0.38       0.17       0.38
    3-year secondary school                      0.11       0.31       0.11       0.31
    < 3 years of post-secondary school           0.03       0.17       0.03       0.17
    3+ years of post-secondary school            0.06       0.23       0.06       0.24
    Graduate school                              0.01       0.08       0.01       0.08
  Occupation category
    1. Professional work (arts & sciences)       0.19       0.40       0.20       0.40
    2. Managerial work                           0.04       0.21       0.05       0.21
    3. Clerical work                             0.04       0.19       0.04       0.19
    4. Wholesale, retail, & commerce             0.08       0.28       0.08       0.27
    5. Agriculture, forestry, hunting, & fishing 0.10       0.30       0.10       0.30
    6. Mining & quarrying                        0.01       0.08       0.01       0.08
    7. Transportation & communication            0.09       0.29       0.09       0.29
    8. Manufacturing                             0.38       0.48       0.38       0.49
    9. Services                                  0.04       0.20       0.04       0.19
    10. Military / armed forces                  0.01       0.10       0.01       0.10
    Undefined                                    <0.00      0.01       0.00       0.01
    Missing                                      0.02       0.13       0.01       0.10
  Number of fathers (N)                          153,920               135,020

Notes. The main sample is the full sample used for our main analysis as well as robustness checks. The U.S. comparison sample is the subset that has the income measures needed to compute the PSID comparable income measures (average of log income in years 1968-72 for fathers and, for sons, log income in 1991).

Table B2: OLS, IV, and LW Estimates for Full Sample (Fathers and Sons)

                                          [1]      [2]      [3]      [4]
OLS estimates
  Fathers' log average income             0.231    0.230    0.208    0.207
    (s.e.)                                0.003    0.004    0.004    0.004
  Fathers' years of education                      0.000             0.000
    (s.e.)                                         0.001             0.001
  Fathers' unique occupation (indicators)                   x        x
IV estimates
  First stage (educ. IV for log income)   0.083 (s.e. 0.000)
  Second stage                            0.235 (s.e. 0.006)
LW estimates of the IGE
  B                                       0.231    0.231    0.260    0.260
    (s.e.)                                0.004    0.004    0.004    0.004
Observations (N)                          167,552  167,552  167,552  167,552

Notes. All specifications use the average of sons' log income as the dependent variable and include birth-year dummies of fathers and sons as controls. The noisy measures of status for fathers included in each model are: [1] income; [2] income and education; [3] income and occupation; [4] income, education and occupation. Because the occupation measure is 270 unique occupation categories, the OLS coefficients and standard errors for occupations are omitted from the table. OLS and IV standard errors are clustered by family and LW standard errors are computed using a block bootstrap to account for within-family correlation (100 repetitions).

Table B3: Comparison of LW Estimates - Sweden and the U.S.

                                                   N         [1]      [2]      [3]      [4]
Sweden estimates
  Main results (full sample)                       167,552   0.231    0.231    0.260    0.260
    (s.e.)                                                   0.004    0.004    0.004    0.004
  Main specifications for restricted sample
  used in U.S. comparable specification            146,783   0.231    0.231    0.262    0.262
    (s.e.)                                                   0.003    0.003    0.004    0.004
  Sweden estimates using U.S. (PSID)
  comparable specification                         146,783   0.194    0.194    0.215    0.215
    (s.e.)                                                   0.004    0.004    0.005    0.005
U.S. estimates (from Vosters, 2014)                415       0.439    0.445    0.465    0.473
    (s.e.)                                                   0.075    0.072    0.080    0.080

Notes. The noisy measures of status for fathers included in each model are: [1] income; [2] income and education; [3] income and occupation; [4] income, education and occupation. The main specifications use the average of sons' log income (age 27-46) as the dependent variable, average of log income (age 30-60) for father's income, unique occupation indicators for fathers' occupation, and include birth-year dummies of fathers and sons as controls. The PSID-comparable measures are: sons' log income in 1991; father's average log income 1968-1972; indicators for fathers' major occupation category. All specifications use years of education as the measure of educational attainment. LW standard errors are computed using a block bootstrap to account for within-family correlation (100 repetitions).

Table B4: Robustness of LW Estimates to Construction of Status Measures

                                                   N         [1]      [2]      [3]      [4]
Main results                                       167,552   0.231    0.231    0.260    0.260
    (s.e.)                                                   0.004    0.004    0.004    0.004
Adjusting the occupation measure
  Indicators for minor occupation (2-digit)        167,552   0.231    0.231    0.247    0.247
    (s.e.)                                                   0.004    0.004    0.004    0.004
  Indicators for major occupation                  167,552   0.231    0.231    0.238    0.238
    (s.e.)                                                   0.004    0.004    0.004    0.004
  Excluding "undefined" and missing                164,678   0.235    0.235    0.265    0.265
    (s.e.)                                                   0.003    0.003    0.004    0.004
Adjusting the education measure
  Indicators for each education level              167,552   0.231    0.233    0.260    0.261
    (s.e.)                                                   0.004    0.004    0.004    0.004
  Indicators for level/type of attainment          167,552   0.231    0.241    0.260    0.265
    (s.e.)                                                   0.004    0.004    0.004    0.004
Adjusting the income measure
  Log (average annual income)                      167,550   0.270    0.274    0.296    0.297
    (s.e.)                                                   0.003    0.003    0.003    0.003
  Separate log annual income measures, age 40-50   57,728    0.247    0.247    0.304    0.304
    (s.e.)                                                   0.008    0.008    0.009    0.010
  Average of log annual income, age 40-50          57,728    0.241    0.241    0.298    0.298
    (s.e.)                                                   0.008    0.008    0.009    0.009
  Main specification using this restricted sample  57,728    0.279    0.279    0.333    0.333
    (s.e.)                                                   0.007    0.007    0.009    0.009

Notes. All specifications use the average of sons' log income as the dependent variable and include birth-year dummies of fathers and sons as controls. The noisy measures of status for fathers included in each model are: [1] income; [2] income and education; [3] income and occupation; [4] income, education and occupation. LW standard errors are computed using a block bootstrap to account for within-family correlation (100 repetitions).

Table B5: OLS Estimates from Extensions with Mothers' Measures of Status

                              Sons                                Daughters
                              [1]      [2]      [3]      [4]      [5]      [6]      [7]      [8]
Fathers' measures
  Log average income          0.231    0.229    0.208    0.207    0.152    0.128    0.130    0.123
    (s.e.)                    0.003    0.004    0.004    0.005    0.003    0.004    0.004    0.004
  Education                            0.001             0.001             0.008             0.006
    (s.e.)                             0.001             0.001             0.001             0.001
Fathers' & Mothers' measures
  Fathers' log avg. income    0.225    0.227    0.203    0.203    0.142    0.125    0.125    0.120
    (s.e.)                    0.003    0.004    0.005    0.005    0.003    0.004    0.004    0.004
  Mothers' log avg. income    0.024    0.023    0.010    0.009    0.059    0.053    0.048    0.046
    (s.e.)                    0.002    0.002    0.002    0.002    0.002    0.002    0.003    0.003
  Fathers' education                   -0.001            -0.001            0.003             0.002
    (s.e.)                             0.001             0.001             0.001             0.001
  Mothers' education                   0.002             0.003             0.005             0.005
    (s.e.)                             0.001             0.001             0.001             0.001
Mothers' measures
  Log average income          0.034    0.021    0.002    -0.001   0.064    0.052    0.043    0.040
    (s.e.)                    0.002    0.002    0.002    0.002    0.002    0.002    0.003    0.003
  Education                            0.016             0.011             0.015             0.012
    (s.e.)                             0.001             0.001             0.001             0.001
Observations (N)              152,486  152,486  152,486  152,486  145,256  145,256  145,256  145,256

Notes. All specifications use the average of sons' or daughters' log income as the dependent variable and include birth-year dummies of included parents and offspring as controls. The noisy measures of status for parents included in each model are: [1], [5] income; [2], [6] income and education; [3], [7] income and occupation; [4], [8] income, education and occupation. Because the occupation measure is 270 unique occupation categories, the OLS coefficients and standard errors for occupations are omitted from the table. Standard errors are clustered by family to account for within-family correlation.

Table B6: LW Estimates from Extensions with Mothers' Measures of Status

                              Sons                                Daughters
                              [1]      [2]      [3]      [4]      [5]      [6]      [7]      [8]
Fathers                       0.231    0.231    0.262    0.262    0.152    0.163    0.188    0.193
    (s.e.)                    0.003    0.003    0.004    0.004    0.003    0.003    0.004    0.004
Mothers                       0.034    0.096    0.235    0.252    0.064    0.096    0.132    0.142
    (s.e.)                    0.002    0.006    0.014    0.015    0.002    0.003    0.004    0.004
Fathers & Mothers             0.234    0.235    0.283    0.283    0.209    0.218    0.273    0.276
    (s.e.)                    0.003    0.003    0.004    0.004    0.004    0.005    0.006    0.006
  Fathers' portion            0.225    0.222    0.242    0.239    0.142    0.140    0.160    0.157
    (share)                   96%      95%      85%      84%      68%      64%      59%      57%
  Mothers' portion            0.009    0.012    0.041    0.044    0.066    0.078    0.113    0.118
    (share)                   4%       5%       15%      16%      32%      36%      41%      43%
Observations (N)              152,486  152,486  152,486  152,486  145,256  145,256  145,256  145,256

Notes. These estimation samples have non-missing data on all measures for mothers and fathers. All specifications use the average of sons' or daughters' log income as the dependent variable and include birth-year dummies of included parents and offspring as controls. The noisy measures of status for parents included in each model are: [1], [5] income; [2], [6] income and education; [3], [7] income and occupation; [4], [8] income, education and occupation. Because the occupation measure is 270 unique occupation categories, the OLS coefficients and standard errors for occupations are omitted from the table. Standard errors are computed using a block bootstrap to account for within-family correlation (100 repetitions).
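
As an illustration of how an aggregate LW-type estimate can be split into fathers' and mothers' portions, as in the bottom rows of Table B6, the following is a minimal sketch on simulated data. It is not the authors' code. It assumes the standard Lubotsky and Wittenberg (2006) construction, in which the OLS coefficient on each proxy is weighted by the ratio of that proxy's covariance with the outcome to the benchmark proxy's covariance with the outcome. All variable names, loadings, and the data-generating process are hypothetical, and birth-year controls and the block bootstrap are omitted.

```python
import numpy as np


def lw_estimate(y, X, groups):
    """Return an LW-style estimate and its split by group of proxies.

    y      : (n,) outcome, e.g. offspring log income
    X      : (n, k) noisy proxies of parental status; column 0 is the benchmark
    groups : length-k labels saying which parent each proxy belongs to
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # OLS coefficients from regressing y on all proxies jointly.
    b, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    # Weight each coefficient by cov(y, x_j) / cov(y, x_1).
    cov_yx = Xc.T @ yc / (len(y) - 1)
    terms = b * cov_yx / cov_yx[0]
    # Sum the weighted terms separately for each parent.
    labels = np.asarray(groups)
    portions = {g: float(terms[labels == g].sum()) for g in dict.fromkeys(groups)}
    return float(terms.sum()), portions


# Simulated example (assumption: one latent family status drives all proxies).
rng = np.random.default_rng(0)
n = 200_000
beta = 0.5                                   # true persistence in latent status
status = rng.normal(size=n)
f_inc = status + rng.normal(size=n)          # fathers' income (benchmark proxy)
f_edu = 0.8 * status + rng.normal(size=n)    # fathers' education
m_inc = 0.3 * status + rng.normal(size=n)    # mothers' income (very noisy)
m_occ = 0.7 * status + rng.normal(size=n)    # mothers' occupation score
y = beta * status + rng.normal(scale=0.5, size=n)

X = np.column_stack([f_inc, f_edu, m_inc, m_occ])
groups = ["father", "father", "mother", "mother"]

lw_total, lw_portions = lw_estimate(y, X, groups)
lw_shares = {g: p / lw_total for g, p in lw_portions.items()}
# Single-proxy OLS slope, for comparison with the aggregate estimate.
ols_income_only = np.cov(y, f_inc)[0, 1] / np.var(f_inc, ddof=1)

print(f"OLS on fathers' income only: {ols_income_only:.3f}")
print(f"LW estimate, all proxies:    {lw_total:.3f}")
for g in lw_portions:
    print(f"  {g} portion: {lw_portions[g]:.3f} ({lw_shares[g]:.0%})")
```

In this simulated setting the aggregate estimate should lie between the single-proxy OLS slope and the true coefficient, and the two portions sum to the aggregate estimate by construction, which is the property used to report percentage shares.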
Table B7: Summary Statistics for Mothers & Fathers (Balanced Samples)

Offspring                                    Sons                  Daughters
Variable                                     mean      std dev     mean      std dev
  Year of birth                              1956      3.13        1956      3.12
  Average income, age 27-46                  252,496   203,293     169,955   78,675
  Average log income, age 27-46              12.23     0.53        11.84     0.54
  Non-missing incomes, age 27-46             19.52     1.79        19.57     1.66
  Years of education                         11.81     2.39        12.21     2.30
  Number of offspring (N)                    152,486               142,020

Parents                                      Sons' sample                            Daughters' sample
                                             Fathers             Mothers             Fathers             Mothers
Variable                                     mean      std dev   mean      std dev   mean      std dev   mean      std dev
  Age when offspring born                    29.93     5.12                          29.92     5.13
  Year of birth                              1926      6.38      1929      6.16      1926      6.38      1929      6.16
  Average income, age 30-60                  245,518   161,788   121,359   65,144    245,259   182,143   121,419   64,393
  Average log income, age 30-60              12.26     0.48      11.38     0.78      12.26     0.48      11.39     0.78
  Non-missing incomes, age 30-60             18.22     6.45      19.49     6.60      18.22     6.45      19.53     6.61
  Years of education                         9.18      2.90      8.53      2.36      9.17      2.89      8.53      2.37
  Educational attainment (years)
    < 9 years of primary school              0.57      0.49      0.63      0.48      0.57      0.49      0.63      0.48
    9 years of primary school                0.04      0.21      0.11      0.31      0.04      0.20      0.10      0.31
    2-year secondary school                  0.18      0.38      0.18      0.38      0.18      0.39      0.18      0.38
    3-year secondary school                  0.11      0.31      0.02      0.14      0.11      0.31      0.02      0.14
    < 3 years of post-secondary school       0.03      0.17      0.03      0.17      0.03      0.17      0.03      0.17
    3+ years of post-secondary school        0.06      0.24      0.03      0.18      0.06      0.24      0.03      0.18
    Graduate school                          0.01      0.08      0.00      0.02      0.01      0.08      0.00      0.02
  Occupation category
    1. Professional work                     0.20      0.40      0.18      0.38      0.20      0.40      0.18      0.38
    2. Managerial work                       0.04      0.21      0.01      0.08      0.04      0.20      0.01      0.08
    3. Clerical work                         0.04      0.19      0.16      0.36      0.04      0.19      0.16      0.36
    4. Wholesale, retail, & commerce         0.09      0.28      0.11      0.31      0.09      0.28      0.11      0.32
    5. Agriculture, forestry, hunting, fishing 0.09    0.29      0.06      0.23      0.09      0.29      0.06      0.23
    6. Mining & quarrying                    0.01      0.08      0.00      0.02      0.01      0.08      0.00      0.02
    7. Transportation & communication        0.09      0.29      0.04      0.19      0.09      0.29      0.04      0.19
    8. Manufacturing                         0.38      0.48      0.09      0.29      0.38      0.49      0.10      0.29
    9. Services                              0.04      0.20      0.26      0.44      0.04      0.20      0.26      0.44
    10. Military / armed forces              0.01      0.10      0.00      0.00      0.01      0.10      0.00      0.00
    Undefined                                0.00      0.01      0.00      0.04      0.00      0.02      0.00      0.04
    Missing                                  0.02      0.13      0.10      0.30      0.02      0.13      0.10      0.29
  Number of parents (N)                      140,052             140,234             133,884             134,108

Table B8: OLS Estimates from Extensions with Mothers' Measures of Status, for All Parent-Child Samples

                              Sons                                Daughters
                              [1]      [2]      [3]      [4]      [5]      [6]      [7]      [8]
Fathers' measures
  Log average income          0.231    0.230    0.208    0.207    0.153    0.129    0.130    0.122
    (s.e.)                    0.003    0.004    0.004    0.004    0.003    0.004    0.004    0.004
  Education                            0.000             0.000             0.008             0.006
    (s.e.)                             0.001             0.001             0.001             0.001
  Observations (N)            167,552  167,552  167,552  167,552  159,172  159,172  159,172  159,172
Fathers' & Mothers' measures
  Fathers' log avg. income    0.225    0.227    0.203    0.203    0.142    0.125    0.125    0.120
    (s.e.)                    0.003    0.004    0.005    0.005    0.003    0.004    0.004    0.004
  Mothers' log avg. income    0.024    0.023    0.010    0.009    0.059    0.053    0.048    0.046
    (s.e.)                    0.002    0.002    0.002    0.002    0.002    0.002    0.003    0.003
  Fathers' education                   -0.001            -0.001            0.003             0.002
    (s.e.)                             0.001             0.001             0.001             0.001
  Mothers' education                   0.002             0.003             0.005             0.005
    (s.e.)                             0.001             0.001             0.001             0.001
  Observations (N)            152,486  152,486  152,486  152,486  145,256  145,256  145,256  145,256
Mothers' measures
  Log average income          0.032    0.019    -0.002   -0.005   0.062    0.049    0.039    0.036
    (s.e.)                    0.002    0.002    0.002    0.002    0.002    0.002    0.002    0.002
  Education                            0.016             0.010             0.015             0.012
    (s.e.)                             0.001             0.001             0.001             0.001
  Observations (N)            173,608  173,608  173,608  173,608  165,161  165,161  165,161  165,161

Notes. All specifications use the average of sons' or daughters' log income as the dependent variable and include birth-year dummies of included parents and offspring as controls. The noisy measures of status for parents included in each model are: [1], [5] income; [2], [6] income and education; [3], [7] income and occupation; [4], [8] income, education and occupation. Because the occupation measure is 270 unique occupation categories, the OLS coefficients and standard errors for occupations are omitted from the table.
Standard errors are clustered by family to account for within-family correlation.

Table B9: LW Estimates from Extensions with Mothers' Measures of Status, for All Parent-Child Samples

                              Sons                                Daughters
                              [1]      [2]      [3]      [4]      [5]      [6]      [7]      [8]
Fathers                       0.231    0.231    0.260    0.260    0.153    0.164    0.190    0.194
    (s.e.)                    0.004    0.004    0.004    0.004    0.004    0.004    0.005    0.005
  Observations (N)            167,552  167,552  167,552  167,552  159,172  159,172  159,172  159,172
Mothers                       0.032    0.098    0.246    0.263    0.062    0.049    0.039    0.036
    (s.e.)                    0.003    0.007    0.012    0.012    0.003    0.004    0.006    0.007
  Observations (N)            173,608  173,608  173,608  173,608  165,161  165,161  165,161  165,161

Notes. All specifications use the average of sons' or daughters' log income as the dependent variable and include birth-year dummies of included parents and offspring as controls. The noisy measures of status for parents included in each model are: [1], [5] income; [2], [6] income and education; [3], [7] income and occupation; [4], [8] income, education and occupation. Because the occupation measure is 270 unique occupation categories, the OLS coefficients and standard errors for occupations are omitted from the table. Standard errors are computed using a block bootstrap to account for within-family correlation (100 repetitions).

REFERENCES

Altonji, J. G. & Dunn, T. A. (1991). Relationships among the family incomes and labor market outcomes of relatives (No. 3724). National Bureau of Economic Research.

Altonji, J. G. & Dunn, T. A. (2000). An intergenerational model of wages, hours, and earnings. Journal of Human Resources, 35(2), 221-258.

Becker, G. & Tomes, N. (1979). An equilibrium theory of the distribution of income and intergenerational mobility. Journal of Political Economy, 87, 1153-1189.

Björklund, A. & Jäntti, M. (2009). Intergenerational income mobility and the role of family background, in (W. Salverda, B. Nolan and T. Smeeding, eds.), Oxford Handbook of Economic Inequality, Oxford University Press.

Björklund, A., Jäntti, M., & Nybom, M. (2015). The contribution of early-life vs. labour-market factors to intergenerational income persistence: a comparison of the UK and Sweden. Mimeo, Stockholm University.

Black, D. A. & Smith, J. A. (2006). Estimating the returns to college quality with multiple proxies for quality. Journal of Labor Economics, 24(3), 701-728.

Black, S. E. & Devereux, P. J. (2011). Recent developments in intergenerational mobility, in (O. Ashenfelter and D. Card, eds.), Handbook of Labor Economics, (4), 1487-1541, Amsterdam: Elsevier.

Bollen, K. A. (1989). Structural equations with latent variables. Wiley, New York, NY.

Braun, S. & Stuhler, J. (2015). The transmission of inequality across multiple generations: Testing recent theories with evidence from Germany. Mimeo.

Chadwick, L. & Solon, G. (2002). Intergenerational income mobility among daughters. American Economic Review, 92(1), 335-344.

Chetty, R., Hendren, N., Kline, P., & Saez, E. (2014). Where is the land of opportunity? The geography of intergenerational mobility in the United States. Quarterly Journal of Economics, 129(4), 1553-1623.

Clark, G. (2014). The son also rises: surnames and the history of social mobility. Princeton University Press.

Clark, G. & Cummins, N. (2015). Intergenerational wealth mobility in England, 1858–2012: Surnames and social mobility. The Economic Journal, 125(582), 61-85.

Corak, M. & Piraino, P. (2011). The intergenerational transmission of employers. Journal of Labor Economics, 29(1), 37-68.

Güell, M., Rodríguez Mora, J. V. & Telmer, C. (2015). The informational content of surnames, the evolution of intergenerational mobility, and assortative mating. Review of Economic Studies, 82(2), 693-735.

Goldberger, A. S. (1989). Economic and mechanical models of intergenerational transmission. American Economic Review, 79(3), 504-513.

Haider, S. J. & Solon, G. (2006). Life-cycle variation in the association between current and lifetime earnings. American Economic Review, 96(4), 1308-1320.

Hirvonen, L. H. (2008). Intergenerational earnings mobility among daughters and sons: Evidence from Sweden and a comparison with the United States. American Journal of Economics and Sociology, 67(5), 777-826.

Jöreskog, K. G. & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70(351a), 631-639.

Lubotsky, D. & Wittenberg, M. (2006). Interpretation of regressions with multiple proxies. The Review of Economics and Statistics, 88(3), 549-562.

Mazumder, B. (2005). Fortunate sons: New estimates of intergenerational mobility in the United States using social security earnings data. The Review of Economics and Statistics, 87(2), 235-255.

Nakosteen, R. A., Westerlund, O., & Zimmer, M. A. (2004). Marital Matching and Earnings: Evidence from the Unmarried Population in Sweden. Journal of Human Resources, 39(4), 1033-1044.

Nybom, M. & Stuhler, J. (forthcoming). Heterogeneous Income Profiles and Life-Cycle Bias in Intergenerational Mobility Estimation. Journal of Human Resources.

Raaum, O., Bratsberg, B., Røed, K., Österbacka, E., Eriksson, T., Jäntti, M., & Naylor, R. A. (2007). Marital Sorting, Household Labor Supply, and Intergenerational Earnings Mobility Across Countries. The BE Journal of Economic Analysis & Policy, 7(2).

Solon, G. (1999). Intergenerational mobility in the labor market, in (O. Ashenfelter and D. Card, eds.), Handbook of Labor Economics, (3A), 1761-1800, Amsterdam: North-Holland.

Solon, G. (2002). Cross-country differences in intergenerational earnings mobility. The Journal of Economic Perspectives, 16(3), 59-66.

Solon, G. (2004). A model of intergenerational mobility variation over time and place. In M. Corak, Generational Income Mobility in North America and Europe (pp. 38-47). Cambridge University Press.

Statistics Sweden (2004). Jämförelse mellan yrkesuppgifter i FoB och yrkesregistret.
URL: http://www.scb.se/sv_/Hitta-statistik/Statistik-efter-amne/Arbetsmarknad/Sysselsattning-forvarvsarbete-och-arbetstider/Yrkesregistret-med-yrkesstatistik/59068/Jamforelser-mot-aldre-yrkesstatistik/ (Accessed: 2015-05-28)

Vosters, K. (2015). Is the simple law of mobility really a law? Testing Clark's hypothesis. Unpublished manuscript.

Chapter 3

Understanding and Evaluating SAS® EVAAS® Models for Measuring Teacher Effectiveness1

3.1 Introduction

A large literature examines many of the statistical methods that states or districts are using to estimate teacher effectiveness based on their students' test scores. However, one of the methodological approaches that has been adopted by several states and districts, the SAS® EVAAS® model, has experienced relatively limited exposure in these studies, in large part due to the proprietary nature of the analysis. Still, the EVAAS estimates have been incorporated into formal teacher evaluation programs used for accountability, including high stakes policies such as tenure, dismissal, or incentive pay. With high stakes programs such as these relying on estimated effectiveness, it is important to understand the strengths and limitations of the underlying methods.

The prevalence of such policies has grown in recent years, but the SAS EVAAS approach itself has a much longer history. The current name, EVAAS, stands for Education Value-Added Assessment System, which is a variant on the earlier and perhaps more familiar name Tennessee Value-Added Assessment System (TVAAS), as Tennessee was where it was developed and used since the early 1990's.2 In addition to the name change, documentation of the EVAAS methods has evolved over the years but the details that allow researchers to easily replicate the approach remain somewhat elusive. The nature of the documentation combined with proprietary programs and data likely impedes the implementation of EVAAS in many evaluation studies (Kupermintz, 2003; Amrein-Beardsley, 2008).

The EVAAS methods include two options for estimating teacher effectiveness: the multivariate response model (MRM) and the univariate response model (URM). The MRM, also referred to as the "layered" teacher model, involves joint modeling of scores from multiple tested subjects for multiple grades and cohorts in a 5-year period. Jointly modeling the test scores aims to improve efficiency, and using the complete set of scores available for a student attempts to account for any other student characteristics that might affect achievement. This model is generally limited to within-district purposes due to the large computational burden, and is sometimes not feasible if data requirements cannot be met. Hence, the URM was developed for these situations. The URM focuses on a single subject, and thus is less intensive computationally and more flexible with respect to data requirements. The method involves the computation of a single composite score for each student based on their lagged scores in the same subject as well as others, and then using this composite score as the only regressor in empirical Bayes' estimation of the teacher effects.

1 This chapter is coauthored with Cassandra Guarino and Jeffrey Wooldridge.
2 The name is often modified in a similar fashion in states which adopt the EVAAS methods, such as "PVAAS" for Pennsylvania (e.g., www.portal.state.pa.us, accessed 1/12/2015).

A number of studies have addressed signature features of the MRM, such as the omission of student covariates or joint modeling of subjects, typically focusing on a comparison to a generalized or modified version of the model (e.g., Ballou, Sanders, & Wright, 2004; McCaffrey et al., 2004; Lockwood et al., 2007). The URM has received less attention, with the exception of a recent report by Rose, Henry, & Lauen (2012) that studies the performance of nine estimators, one of which is the URM, in simulations and with administrative data. They find that under random assignment of students to teachers, a three-level hierarchical linear model (HLM) and the URM outperform several other popular estimation approaches and that under certain nonrandom assignment scenarios, the HLM approach outperforms the URM by a fair margin.

Our paper also focuses primarily on the URM and we include both simulations and the analysis of actual data. We build on the work of Rose et al. (2012), although our simulations are designed somewhat differently, and our results diverge from theirs. While we confirm that random effects approaches such as HLM are best under random assignment (a result we have found in prior work; see Guarino, Reckase, and Wooldridge 2015), we find that under the type of nonrandom assignment that we simulate, approaches that assume fixed rather than random teacher effects are better suited to capturing true teacher effects than the URM. The URM assumes random teacher effects and is thus inconsistent if teacher assignment is related to students' prior test scores. In contrast, OLS estimation of the regression of student achievement on teacher fixed effects and control variables including lagged student achievement scores is consistent even when nonrandom assignment based on lagged achievement generates correlation between the teacher dummy variables and the control variables. OLS, however, assumes fixed teacher effects and is still consistent under this type of teacher assignment. In other nonrandom assignment scenarios in which both the URM and OLS are inconsistent, OLS performs at least as well as the URM.

Our paper further contributes to the literature by drawing key theoretical connections between the URM and other types of estimation approaches. In particular, we show areas of overlap between the URM and OLS or empirical Bayes' estimation of typical value-added models, and we also show how and where the various estimation approaches differ. Through the theoretical discussion, simulations, and empirical work, we show that standard linear regression techniques perform very similarly, and in certain cases better, under plausible data scenarios. In addition, our detailed descriptions of the URM help make it more readily available for other researchers to implement and include in future evaluation studies.

We begin by describing common value-added model (VAM) approaches as well as the EVAAS approaches in Section 2, providing details on both the MRM and URM, and then we review relevant literature in Section 3. In Section 4, we discuss our simulation design and present results from the simulation. Section 5 describes our empirical analysis and results using administrative data. We summarize and conclude in Section 6.

3.2 Value-Added Models

Teacher value-added models (VAMs) are generally derived from or motivated by a so-called "education production function" (Hanushek, 1979; Todd & Wolpin, 2003; Guarino, Reckase & Wooldridge, 2015). In its most general formulation, academic achievement at any point in time is written as a function of all current and past child, family, and school inputs:

$$A_{it} = f(E_{it}, \ldots, E_{i0}, X_{it}, \ldots, X_{i0}, c_i, u_{it}) \qquad (1)$$

where $A_{it}$ is current achievement at time $t$ for student $i$, $E_{it}, \ldots, E_{i0}$ represent current and past education (school) inputs, $X_{it}, \ldots, X_{i0}$ represent current and past student or parent inputs, $c_i$ is unobserved student heterogeneity (e.g., motivation or some form of time-invariant innate ability), and $u_{it}$ is an idiosyncratic error term. Given that we cannot measure each of these elements during each time period, at least not in available data, researchers typically adopt a more parsimonious model with a simple (estimable) functional form. For example, with a set of simplifying assumptions, the education production function is reduced to an estimating equation such as:

$$A_{it} = \tau_t + \lambda A_{i,t-1} + X_{it}\gamma + E_{it}\beta + c_i + e_{it} \qquad (2)$$

where $\tau_t$ allows for a different intercept in each time period to capture time (e.g., year) effects, $A_{it}$ is the current test score at time $t$, $A_{i,t-1}$ is the lagged test score from the previous year, $E_{it}$ is a vector of observed education inputs at time $t$ (e.g., teacher assignment indicators), and $X_{it}$ is a vector of observed individual student characteristics.3 The simplifying assumptions that facilitate the transition from equation (1) to equation (2) include linearity and geometric decay in the parameters; see Guarino, Reckase, & Wooldridge (2015) for a detailed discussion and derivations. We cannot measure the individual student heterogeneity, $c_i$, so this is generally left in the error term in commonly used approaches. While there are methods to eliminate this term in panel data settings (e.g., adding student indicators, or fixed effects estimation), we seldom compute teacher value-added measures with multiple years of data on the same students, which would be required to identify these individual student effects.4 Rather, teacher effects are typically obtained using up to a few years of data on teachers (so multiple cohorts of different students).

Even with this relatively parsimonious model, administrative data may be missing test scores or characteristics for some students, or some students may not be linked to teachers. In traditional regression analysis such as OLS estimation, student observations missing these data are omitted from the estimation sample, but consistent estimates can still be obtained. For consistency, whether data (on the outcome or the regressors) are observed or missing for
a student can be related to the observed covariates that we control for (e.g., the lagged score, $A_{i,t-1}$, or student characteristics, $X_{it}$) but not to unobserved elements of the error term (see Wooldridge, 2010, Ch. 19). This is similar to the "missing at random" (MAR) assumption EVAAS methods are said to rely on (Wright et al., 2010), with the distinction that MAR generally assumes that the covariates related to whether data are missing are always observed themselves (Wooldridge, 2010, Ch. 19).

[3] In the empirical work presented later, the set of student characteristics includes race/ethnicity, gender, free- and reduced-price lunch eligibility, limited English proficiency, disability, and days absent.

[4] Such approaches actually performed quite poorly in the simulations conducted in Guarino, Reckase, & Wooldridge (2015). See the paper for details on the reasons for this for each grouping/assignment scenario.

3.2.1 Common Methods for Estimating Teacher Effects

Given that the student heterogeneity term in equation (2) is generally ignored when estimating value-added models, the estimating equation for a given subject $s$ can be written as:

$$A_{ist} = \tau_t + \lambda A_{is,t-1} + X_{it}\delta + E_{ist}\gamma + v_{ist} \tag{3}$$

where $v_{ist} = c_i + e_{ist}$ is the composite error term. OLS on this equation will estimate teacher effects, $\hat\gamma$. We call this estimator DOLS, to reflect the OLS estimation of the teacher effects and acknowledge the dynamic (D) specification containing the lag score on the right-hand side. This can easily be extended to incorporate multiple lagged scores in multiple subjects. With this approach, to consistently estimate the vector $\gamma$, we need teacher assignment ($E_{ist}$) to be uncorrelated with the student heterogeneity term, $c_i$. This means, for example, that principals cannot assign students with higher (or lower) unobserved ability to more effective teachers.

The next two methods omit the teacher assignment dummies ($E_{ist}$); we then obtain estimates of teacher effectiveness from the student-level residuals. One approach is to estimate the abbreviated version (omitting $E_{ist}$) of equation (3) via OLS, and then calculate the teacher effects as the within-teacher averages of the student-level OLS residuals. We refer to this as the average residual (AR) method. Again, consistency requires that teacher assignment not be based on the student heterogeneity. However, also note that any correlation between the lagged test score $A_{is,t-1}$ and the teacher assignment is not being partialled out of the teacher effects, so assignment based on prior scores also becomes problematic.

The last approach, which we will abbreviate to EB, involves empirical Bayes' estimation of this more parsimonious equation, obtaining the teacher effects from the shrunken residuals. The empirical Bayes' method is essentially a GLS or random effects approach, where the teacher effect estimates are effectively "shrunken" towards the mean teacher effect (Guarino et al., 2015).[5] The so-called shrinkage takes teachers' class sizes into account, and thus aims to reduce the noisiness of the estimates from a small number of observations contributing to the estimation of the teacher effects. Like the AR method, consistent estimation relies on teacher assignment being uncorrelated with student heterogeneity and student-level covariates contained in the model (including prior achievement). The latter is also relevant to the EVAAS URM approach we focus on in this paper.

[5] As described in Guarino et al. (2015), this method involves two stages, but is easily implemented in Stata with the "xtmixed" command specifying a random component at the teacher level, and then post-estimation using the "predict, reffects" command to get the teacher random effects. The first stage estimates the normal maximum likelihood (with the random teacher effects in the error term) and the second stage applies the shrinkage factor to these teacher effects.

3.2.2 EVAAS Methods

3.2.2.1 EVAAS Univariate Response Model (URM)

Similar to the OLS and EB approaches discussed above, the URM estimates teacher effectiveness for a single grade and subject (e.g., 5th grade math). There are two key differences between the common approaches just described and the URM. First, the URM uses prior test scores from multiple years and subjects in lieu of student characteristics to account for past student achievement or other student characteristics affecting current achievement. Second, the URM allows for students to be missing some of these prior test scores. The URM's strategy for allowing incomplete test score data generates the complex nature of the approach, but the complicated steps do not necessarily produce a more robust estimator. For instance, the consistency of the URM estimates relies on assumptions about the nature of these missing data that are very similar to the assumptions needed for OLS estimates to be consistent.

In fact, when there are no missing data, there is a direct relationship between the URM and simpler standard linear regression techniques. Consider the simplest case where students have no missing test score data, students are randomly assigned to teachers, teachers have identical class sizes, and estimation is based on one cohort of students per teacher. Then the teacher effect estimates from the URM are identical (up to a constant) to OLS estimates. When students are nonrandomly assigned to teachers based on the included prior test scores, the estimates diverge. OLS partials out this assignment mechanism and consistently estimates the teacher effects, while the URM does not partial out assignment and consequently produces biased estimates of the teacher effects. One goal of this paper is to derive and demonstrate these relationships. In the discussion that follows, we first provide a detailed explanation of the URM approach, expanding on the description in Wright et al. (2010), and then illustrate how the URM compares with standard linear regression methods.

The URM estimating equation for subject $s$ is:

$$A_{ist} = \alpha + \phi \hat{A}_{ist} + \theta + \varepsilon_{ist} \tag{4}$$

where, compared to equation (3), the intercept $\alpha$ does not have a time subscript, the lag score and student covariates have been replaced by a "composite score" $\hat{A}_{ist}$, and now the error term $\varepsilon_{ist}$ includes estimation error from using estimated components in $\hat{A}_{ist}$. This equation is estimated using empirical Bayes' to obtain the teacher effects $\theta$, where $\theta$ contains the random effect for the student's teacher. Although this appears relatively simple, the composite score $\hat{A}_{ist}$ is the result of a multi-step process using all available lagged test scores (Wright et al., 2010), so the model is not as parsimonious as it appears. The composite score is essentially a different approach to a control, using multiple lagged test scores to predict a student's current score, and this prediction serves as a sort of sufficient statistic for the student's past inputs. The idea is explained by Sanders et al. (2009): "by including all of a student's testing history, each student serves as his or her own control."

The URM involves multiple steps to compute the composite score, with each step performed separately for every year of data (i.e., student cohort) that contributes to the estimated teacher effects. Thus, to estimate teacher effectiveness during a three-year period (i.e., based on three cohorts of students), each of the initial steps, up to and including computing the composite score, is done separately for the first, second, and third years of data. Then the final step, empirical Bayes' estimation of the teacher effects, is performed pooling the three years of data.

In computing the composite scores, the URM allows for many prior test scores across different subjects and years. For clarity, we focus our discussion on an example where we are using 1-year and 2-year lagged test scores for both reading ($r$) and math ($m$). The URM computes a composite score in a specific subject (math shown in the equation below) as a linear combination of demeaned versions of the lagged test scores:

$$\hat{A}_{imt} = \hat\mu_{mt} + \hat\beta_{m,t-1}\ddot{A}_{im,t-1} + \hat\beta_{m,t-2}\ddot{A}_{im,t-2} + \hat\beta_{r,t-1}\ddot{A}_{ir,t-1} + \hat\beta_{r,t-2}\ddot{A}_{ir,t-2}. \tag{5}$$

In this equation, $\ddot{A}_{is,t-y}$ (for the 1-year and 2-year lagged scores in subject $s$) denotes a "demeaned" $y$-year lagged test score in subject $s$ for student $i$,

$$\ddot{A}_{is,t-y} = A_{is,t-y} - \hat\mu_{s,t-y} \tag{6}$$

In equations (5) and (6), the estimated means $\hat\mu_{s,t-y}$ are not the overall means of the test scores. Rather, each $\hat\mu_{s,t-y}$ (including $y=0$ for the current score) is the sum of two components: an average across teachers of the teacher-level mean score and an adjustment to account for students with missing test score data. We discuss each of these components in further detail below.

The weights in the composite score equation, $\hat\beta_{s,t-y}$, are coefficient estimates that maximize the correlation between the lagged scores and current score. With no missing data, $\hat\beta$ is essentially a vector of OLS coefficient estimates from the regression of $A_{imt}$ on an intercept, $A_{im,t-1}$, $A_{im,t-2}$, $A_{ir,t-1}$, $A_{ir,t-2}$, and teacher assignment indicators. So, this particular step would produce coefficients on the lags from a DOLS-type equation that includes lagged test scores in multiple subjects, where teacher assignment is partialled out of the coefficient estimates.

Rather than use regression, however, the URM takes a different approach to estimation to allow for certain patterns of missing data. In general, the URM requires a minimum of three lagged scores, and one of these must be the most recent lag in the same subject as the dependent variable. In our example, this means students must have records for $A_{im,t-1}$ and at least two scores out of the set $\{A_{im,t-2}, A_{ir,t-1}, A_{ir,t-2}\}$. The URM uses the EM Algorithm to estimate a variance-covariance matrix, $C$, for calculating the coefficients $\hat\beta$ (rather than estimating these directly with a regression, which would omit observations with missing data).[6]

[6] The EM Algorithm is an optimization algorithm that iterates between the E step (expectation) and the M step (maximization) until the values of all parameters sufficiently converge. The Stata code for estimation as described here is: mi impute mvn $\ddot{a}_{m0}$ $\ddot{a}_{m1}$ $\ddot{a}_{m2}$ $\ddot{a}_{r1}$ $\ddot{a}_{r2}$, emonly.

The EM Algorithm estimation step of the URM is done separately for each year of data. It uses a transformation of the current and lagged test scores where the teacher-level means are subtracted from each score so that $C$ is a "within-teacher" variance-covariance matrix. We denote these transformed scores used for the EM Algorithm estimation as:

$$\ddot{a}_{isy} = A_{is,t-y} - \hat\mu_{js,t-y} \tag{7}$$

where $\hat\mu_{js,t-y}$ is the average of $A_{is,t-y}$ across the students $i$ assigned to teacher $j$. Then the within-teacher variance-covariance matrix obtained via the EM Algorithm, for each year, is:

$$C = \begin{pmatrix} c_{\ddot{a}_{m0}\ddot{a}_{m0}} & c_{\ddot{a}_{sy}\ddot{a}_{m0}} \\ c_{\ddot{a}_{m0}\ddot{a}_{sy}} & C_{\ddot{a}_{sy}\ddot{a}_{sy}} \end{pmatrix} = \left(\begin{array}{c|cccc} c_{\ddot{a}_{m0}\ddot{a}_{m0}} & c_{\ddot{a}_{m1}\ddot{a}_{m0}} & c_{\ddot{a}_{m2}\ddot{a}_{m0}} & c_{\ddot{a}_{r1}\ddot{a}_{m0}} & c_{\ddot{a}_{r2}\ddot{a}_{m0}} \\ \hline c_{\ddot{a}_{m0}\ddot{a}_{m1}} & c_{\ddot{a}_{m1}\ddot{a}_{m1}} & c_{\ddot{a}_{m2}\ddot{a}_{m1}} & c_{\ddot{a}_{r1}\ddot{a}_{m1}} & c_{\ddot{a}_{r2}\ddot{a}_{m1}} \\ c_{\ddot{a}_{m0}\ddot{a}_{m2}} & c_{\ddot{a}_{m1}\ddot{a}_{m2}} & c_{\ddot{a}_{m2}\ddot{a}_{m2}} & c_{\ddot{a}_{r1}\ddot{a}_{m2}} & c_{\ddot{a}_{r2}\ddot{a}_{m2}} \\ c_{\ddot{a}_{m0}\ddot{a}_{r1}} & c_{\ddot{a}_{m1}\ddot{a}_{r1}} & c_{\ddot{a}_{m2}\ddot{a}_{r1}} & c_{\ddot{a}_{r1}\ddot{a}_{r1}} & c_{\ddot{a}_{r2}\ddot{a}_{r1}} \\ c_{\ddot{a}_{m0}\ddot{a}_{r2}} & c_{\ddot{a}_{m1}\ddot{a}_{r2}} & c_{\ddot{a}_{m2}\ddot{a}_{r2}} & c_{\ddot{a}_{r1}\ddot{a}_{r2}} & c_{\ddot{a}_{r2}\ddot{a}_{r2}} \end{array}\right) \tag{8}$$

where the first matrix shows subdivided "blocks" of the matrix (to be referenced below), with $\ddot{a}_{sy}$ referencing the vector of lagged test scores in both subjects. The second matrix, with the lines for the subdivided blocks, is fully expanded to show each element of $C$; the diagonal elements are the variance terms and the off-diagonal (symmetric) elements are the covariance terms. The URM uses the elements of $C$ to compute the set of within-teacher coefficient estimates, $\hat\beta_{s,t-y}$, by plugging into the familiar formula:

$$\hat\beta_p = C^{-1}_{\ddot{a}_{sy}\ddot{a}_{sy},p}\, c_{\ddot{a}_{sy}\ddot{a}_{m0},p} \tag{9}$$

where $p$ has been added to index each pattern of observed scores. With complete data for all students, the $p$ index is not needed, and this equation would be equivalent to the OLS estimator from the regression of $\ddot{a}_{m0}$ on $\ddot{a}_{m1}$, $\ddot{a}_{m2}$, $\ddot{a}_{r1}$, $\ddot{a}_{r2}$ (or, equivalently, with the original scores, from the regression of $A_{mt}$ on $A_{m,t-1}$, $A_{m,t-2}$, $A_{r,t-1}$, $A_{r,t-2}$, and teacher assignment indicators).[7] When students have incomplete records, though, the formula in (9) allows us to separately estimate a unique vector of coefficients, $\hat\beta_p$, for each pattern of observed scores, using the subset of matrix $C$ corresponding to the relevant observed scores. So, in our example, given that the first lag of the math score must be present, we would compute up to four vectors $\hat\beta_p$ to account for different missing scores. We could consider $p=0$ for complete records, $p=1$ for records missing $A_{m,t-2}$, $p=2$ for records missing $A_{r,t-1}$, and $p=3$ for records missing $A_{r,t-2}$. For students with $p=0$ the full matrix is used, while for students with $p=1$ (missing $A_{m,t-2}$) the 3rd row and 3rd column are dropped.

[7] Another equivalent representation for the case of full data, in the matrix notation often used for the OLS estimator, is $\hat\beta = (\ddot{a}_{sy}'\ddot{a}_{sy})^{-1}\ddot{a}_{sy}'\ddot{a}_{m0}$, where $\ddot{a}_{sy}$ contains $\ddot{a}_{m1}$, $\ddot{a}_{m2}$, $\ddot{a}_{r1}$, $\ddot{a}_{r2}$, or $\hat\beta = (X'X)^{-1}X'A_{mt}$, where $X$ includes $A_{m,t-1}$, $A_{m,t-2}$, $A_{r,t-1}$, $A_{r,t-2}$, an intercept, and teacher assignment indicators.

The EM Algorithm estimation also produces means that contribute to the $\hat\mu_{st}$ in the composite score equation and the $\hat\mu_{s,t-y}$ underlying the transformed scores ($\ddot{A}_{is,t-y}$) in (6). To be clear, in equations (5) and (6), the estimated mean is $\hat\mu_{s,t-y} = \hat\mu^{mtm}_{s,t-y} + \hat\mu^{EMm}_{s,t-y}$, which is not the overall mean of the lagged test score. The first term on the right-hand side is the mean-of-teacher-means $\hat\mu^{mtm}_{s,t-y}$ for each $y$-year lagged score in subject $s$. In other words, the mean lagged test score is computed for each teacher and then the average over all teachers is taken.[8] The second term on the right-hand side is produced by the EM Algorithm.[9] It is an adjustment to the mean of teacher means to account for missing data, i.e., students with incomplete records. Since the EM Algorithm estimation step uses demeaned test scores (specifically, the teacher-demeaned scores $\ddot{a}_{s,t-y}$), this term is zero when there is complete data for all students. But when some students are missing test scores (and thus not contributing to the mean-of-teacher-means for the missing score), the estimated $\hat\mu^{mtm}_{s,t-y}$ may be biased, and the URM includes the mean provided in the EM Algorithm output, $\hat\mu^{EMm}_{s,t-y}$, to reduce potential bias from missing lagged scores.

[8] To the best of our knowledge, based on the description in Wright et al. (2010), this average per teacher is across all of the teacher's students, even if the teacher teaches multiple classes. Regardless, this distinction is not important for our theoretical or empirical results and conclusions.

[9] The EM Algorithm estimates both the variance-covariance matrix discussed earlier as well as the means, $\hat\mu^{EMm}_{s,t-y}$, used here.

The transformation in (6) that subtracts these two mean components is similar to removing year effects, which would be done by instead subtracting the overall mean (or by including year dummies in a regression). Subtracting the mean-of-teacher-means ($\hat\mu^{mtm}_{s,t-y}$) instead ensures that the "average" teacher has a teacher effect of zero, and the EM Algorithm component ($\hat\mu^{EMm}_{s,t-y}$) corrects for potential bias in the mean-of-teacher-means from students missing test scores (Wright et al., 2010).

Finally, we compute the so-called composite score, $\hat{A}_{imt}$, according to equation (5). The composite score is the sum of the "adjusted mean" of the current math score ($\hat\mu_{st} = \hat\mu^{mtm}_{st} + \hat\mu^{EMm}_{st}$) plus a weighted average of transformed lagged scores $\ddot{A}_{s,t-y}$, with the weights being the coefficient estimates, $\hat\beta_p$. The composite score is a prediction of the current score ($A_{imt}$) based on the student's past test scores and assuming the student has the "average" teacher in the current year (Wright et al., 2010). After the composite scores are obtained, the final step in computing the teacher effects is the empirical Bayes' estimation of equation (4), as mentioned above.

Note that this discussion has focused on estimating teacher effects for math teachers. If one wished to estimate teacher effectiveness in, say, reading, then the outcome variable would be the current reading score, and the composite score would constitute a predicted reading score. While the same lagged scores could be used to obtain the composite score, the estimated elements (i.e., the $\hat\mu^{mtm}_{st}$, $\hat\mu^{EMm}_{st}$, and $\hat\beta_{st}$) would be different because they would be based on predicting the current reading score, using the sample of students satisfying the corresponding data requirements. So, in this respect, the URM is similar to the common VAM approaches that estimate teacher effects separately by subject (and grade).

3.2.2.1.1 Relating the EVAAS URM to Other Approaches

Unlike traditional regression-based VAM methods, the EVAAS approach handles at least some missing data patterns.
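
As a rough illustration of the complete-data case of equation (9), the following Python sketch computes composite-score weights from within-teacher cross-products and compares them with the coefficients on the lagged scores from an OLS regression that includes a full set of teacher dummies. It is not part of the EVAAS software: the simulated data, the assumed teacher effects, and the helper functions (solve, ols, demean_by_teacher) are all hypothetical, only two lagged scores are used rather than four, and no EM step is involved because no scores are missing.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def demean_by_teacher(values, teacher):
    """Subtract each teacher's mean from that teacher's students, as in equation (7)."""
    sums, counts = {}, {}
    for v, j in zip(values, teacher):
        sums[j] = sums.get(j, 0.0) + v
        counts[j] = counts.get(j, 0) + 1
    return [v - sums[j] / counts[j] for v, j in zip(values, teacher)]

# Hypothetical simulated data: 4 teachers, 25 students each, complete records.
random.seed(0)
n_teachers, class_size = 4, 25
teacher = [j for j in range(n_teachers) for _ in range(class_size)]
lag_math = [random.gauss(0, 1) + 0.5 * j for j in teacher]  # sorting on prior score
lag_read = [0.6 * m + random.gauss(0, 1) for m in lag_math]
effect = [0.3 * j for j in range(n_teachers)]  # assumed teacher effects
score = [0.5 * m + 0.2 * r + effect[j] + random.gauss(0, 1)
         for m, r, j in zip(lag_math, lag_read, teacher)]

# URM-style weights: within-teacher cross-products plugged into equation (9).
# Sums of cross-products are used instead of covariances; the common scaling cancels.
a0 = demean_by_teacher(score, teacher)
lags = [demean_by_teacher(lag_math, teacher), demean_by_teacher(lag_read, teacher)]
C_xx = [[sum(u * v for u, v in zip(p, q)) for q in lags] for p in lags]
c_xy = [sum(u * v for u, v in zip(p, a0)) for p in lags]
beta_urm = solve(C_xx, c_xy)

# OLS of the raw score on the raw lags plus a full set of teacher dummies.
X = [[m, r] + [1.0 if j == t else 0.0 for t in range(n_teachers)]
     for m, r, j in zip(lag_math, lag_read, teacher)]
beta_ols = ols(X, score)[:2]

print(beta_urm)
print(beta_ols)
```

Under these assumptions the two coefficient vectors agree up to floating-point error, which is the partialling-out equivalence that the complete-data discussion relies on. With missing scores the URM would instead obtain the covariance matrix from the EM Algorithm, which this sketch does not attempt.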
The EVAAS approach also uses empirical Bayes' shrinkage in the final step in order to account for teachers having different numbers of students. But is EVAAS very different from the standard regression estimators? In practice, differences in the estimated teacher VAMs may be minor. In fact, in the simplest scenario the two approaches yield numerically identical teacher effect estimates.

In the simplest setting, there are no missing data and only one year of data is used. Either shrinkage is not used or the number of students per teacher is identical, in which case shrinkage simply multiplies all of the teacher VAMs by the same constant. With a single year of data, a simple extension of DOLS to allow other lagged test scores comes from OLS estimation of the equation

$$A_i = X_i\beta + E_i\gamma + v_i, \tag{10}$$

where $X_i$ includes all lagged test scores in various subjects and $E_i$ is the vector of teacher assignment dummies. For simplicity, we drop the subscripts indicating subject and year. Technically, the OLS estimates from (10) are not the DOLS estimates described earlier because (10) includes other lagged test scores. But adding additional lags of the same and other subject test scores is a small modification, and produces no extra conceptual or computational difficulties. We could legitimately refer to the OLS estimates from (10) as DOLS estimates, as the motivation is the same: control for factors that predict current test scores and may be correlated with teacher assignment.

From the Frisch-Waugh partialling-out theorem, the OLS coefficients on the lagged test scores, $\hat\beta$, can be obtained in three steps:

(i) Regress $A_i$ on $E_i$ and obtain the residuals, $\ddot{A}_i$. Now, $\ddot{A}_i = A_i - E_i\hat\pi$ where, because the $E_i$ are teacher assignment dummies, $\hat\pi_j$ is the average of the $A_i$ (current test score) for teacher $j$. Therefore, $\ddot{A}_i$ is student $i$'s test score deviated from the average test score for the student's teacher.

(ii) Regress each lagged test score in $X_i$ on $E_i$ and collect the vectors of residuals, $\ddot{X}_i$. Just as with $\ddot{A}_i$, each element of $\ddot{X}_i$ is one of student $i$'s lagged test scores deviated from the mean for student $i$'s teacher.

(iii) Run the regression of $\ddot{A}_i$ on $\ddot{X}_i$ and obtain $\hat\beta$.

In other words, when the regression is restricted to a single year, and there are no missing data, the OLS and URM estimates of $\beta$ are identical; the URM simply performs the partialling out of teacher assignment in a separate step, rather than using the full regression in (10).

As described earlier, the next step in the URM is to construct the composite score in equation (5). But the composite score $\hat{A}_i$ can be written as

$$\hat{A}_i = X_i\hat\beta + \hat\kappa, \tag{11}$$

where $\hat\kappa$ depends on $\hat\beta$ and the overall means of the test scores. Now, the equation used to obtain the teacher effects is

$$A_i = \phi\hat{A}_i + E_i\gamma + \mathrm{error}_i, \tag{12}$$

where $\mathrm{error}_i$ includes estimation error because $\hat{A}_i$ depends on $\hat\beta$. The URM approach applies empirical Bayes' to (12), but that is simply to shrink the estimates of $\gamma$ towards the average teacher effect. Without shrinkage, or with the same number of students per teacher, we just apply OLS to (12). Again, without missing data, we know the result by the algebra of OLS: $\hat\phi = 1$ and $\hat\gamma$ will be identical to what is obtained from (10). The argument is simple. We know the DOLS estimates minimize the sum of squared residuals, and we know the $\hat\beta$ obtained from the URM is identical to the $\hat\beta$ from DOLS. So one cannot do any better than choosing $\hat\phi$ equal to unity and $\hat\gamma$ equal to the DOLS coefficients. The additive constant in (11) changes nothing because the DOLS regression, with a full set of teacher dummies, effectively estimates an intercept. So if (12) were estimated by OLS, then the URM and OLS estimates would be the same. However, when the coefficient on the composite score is estimated by EB, the coefficient is not unity, which breaks the equivalence. In fact, this seems to cause bias.

So how does the EVAAS URM generally differ from OLS? Even if we assume no missing data and ignore shrinkage in the final step of EVAAS, there is a difference with more than one time period. With OLS estimation, typically one would augment (10) by adding year dummy variables, and then the partialling out in steps (i) and (ii) is also done via pooled regression on the year dummies and teacher effects. By contrast, EVAAS does a teacher-year demeaning, which is the same as allowing a full set of interactive effects between the year dummies and teacher dummies. However, after obtaining the composite test scores, EVAAS then pools the data (say, over three years) to obtain a single teacher effect for each teacher. The resulting URM estimates cannot be characterized as coming from an OLS regression, but the difference may not be great.

If one thought that teacher effects vary over time, then one might estimate an equation by OLS separately for every year. This would precisely achieve the partialling out used by the URM. Then, given the $\hat\gamma_t$, one must decide how to combine these into single teacher effect estimates. The URM has one way to do that, but there are others, such as using a weighted average with weights chosen to reflect the relative precision of the $\hat\gamma_t$ across different years. Whether allowing for teacher-year specific effects is important is mainly an empirical issue, but it would not be surprising to find that adding year dummies to the OLS regression, and imposing constant teacher effects across time, generally produces similar results. Often OLS will provide good estimates of average partial effects when interaction terms are present but omitted from regression analyses. See, for example, Wooldridge (2010, Chapter 6).

3.2.2.2 EVAAS Multivariate Response Model (MRM)

The MRM is a multivariate, longitudinal, linear mixed model where the full set of observed scores (meaning all subjects and all years) is fitted simultaneously. Hence, this model simultaneously estimates teacher effects for these separate subject/grade/years, whereas the URM and other VAMs discussed earlier estimate teacher effects for a single subject/grade (possibly pooling over multiple years). With the joint modeling of scores across various grades and years, the MRM requires vertically scaled tests, or conversion of scale scores to NCEs (normal curve equivalents) (Wright et al., 2010). To show this, we begin with a set of equations that illustrate the need for the appropriately scaled test scores as well as the description "layered teacher model".

As portrayed in Ballou, Sanders, & Wright (2004), a student's set of, say, math scores, must satisfy the following equations:

$$\begin{aligned} y^{3}_{t} &= b^{3}_{t} + u^{3}_{t} + e^{3}_{t} \\ y^{4}_{t+1} &= b^{4}_{t+1} + u^{3}_{t} + u^{4}_{t+1} + e^{4}_{t+1} \\ y^{5}_{t+2} &= b^{5}_{t+2} + u^{3}_{t} + u^{4}_{t+1} + u^{5}_{t+2} + e^{5}_{t+2} \end{aligned} \tag{13}$$

where $y^{g}_{t}$ is the test score (gain) for grade $g$ in year $t$ and $b^{g}_{t}$ is the district-level average test score for grade $g$ in year $t$. $u^{g}_{t}$ is the grade $g$ teacher's input to the student's test score in year $t$, and $e^{g}_{t}$ is a student-level idiosyncratic error term for the grade $g$ score in year $t$. The year subscript on the teacher effects shows that teacher effects vary over the years, so this approach is estimating the effect of each teacher in each year (i.e., teacher/year effects). (More precisely, when we consider the full model with multiple subjects, the approach actually estimates teacher/year/subject effects.) However, for a given student, the effects of past teachers do not change over time: a student's 3rd grade teacher's contribution to their 4th grade score is the same as that same teacher's contribution was to the student's 3rd grade score. In other words, a teacher's effect on a student's achievement does not diminish as the student progresses through grades. This highlights the importance of using vertically scaled test scores. The meaning of the teacher effect (resulting in test score gain/loss) must be the same in any grade as well as throughout the test score distribution. So moving down the set of equations in (13) from the first line to the second, an additional "layer" (teacher effect from the next teacher and next idiosyncratic shock) is added in each year, motivating the nickname "layered teacher model" commonly used to describe the MRM.[10]

[10] Another representation of the MRM, as given in Wright et al. (2010), is the algebraic equation $y_{ijkl} = \mu_{jkl} + \left(\sum_{k^* \le k}\sum_{t=1}^{T_{ijk^*l^*}} w_{ijk^*l^*t} \times \tau_{ijk^*l^*t}\right) + \epsilon_{ijkl}$, where the inner summation adds across all teachers the student has in a given subject/grade/year, with the $w_{ijk^*l^*t}$ term capturing the fraction of time spent with a particular teacher, and the outer summation is where the "layered" aspect comes in, adding the cumulative teacher effects over previous grades and years in the same subject. Note that this representation highlights the ability to accommodate team-teaching or students switching teachers during the year; this is possible with other approaches as well, but is rarely done in practice.

The more technical representation of the MRM begins with presenting the linear mixed model (with notation similar to earlier sections of our paper):

$$A = X\beta + E\gamma + \varepsilon. \tag{14}$$

Now $A$ contains the set of all test scores (gain scores), meaning all subjects tested over all grades and years for all students during the period being studied (up to 5 years). The matrix $X$ consists of subject/grade/year indicators, and $\beta$ is the vector of coefficients that are treated as fixed. The $E$ matrix contains teacher/grade/subject/year assignment indicators, and the teacher random effects are contained in $\gamma$. The joint distribution of $\gamma$ and $\varepsilon$ is such that $E(\gamma) = E(\varepsilon) = 0$ and the variance-covariance matrix is block diagonal with $\mathrm{Var}(\gamma) = G$, $\mathrm{Var}(\varepsilon) = R$, and $\mathrm{Cov}(\gamma, \varepsilon) = 0$. Estimates of $\beta$ and $\gamma$ are obtained as solutions to Henderson's mixed model equations (see Wright et al. (2010) for explicit equations) so that the resulting estimator for the teacher effects is:

$$\hat\gamma = GE'(EGE' + R)^{-1}(A - X\hat\beta) \tag{15}$$

where $\hat\beta$ is the GLS estimator for $A$ on $X$ (so $A - X\hat\beta$ is the vector of GLS residuals) and $GE'(EGE' + R)^{-1}$ is the shrinkage factor. Although the shrinkage factor may look complicated in matrix form, the idea is the same as that for the shrinkage used in the EB and URM approaches.[11] The greater the student-level noise (i.e., the larger the variances along the diagonal elements of $\mathrm{Var}(\varepsilon) = R$), the more the estimated residuals ($A - X\hat\beta$) are shrunk towards the mean (zero).[12][13] Since the district mean (gain) scores are estimated in $\beta$, the teacher effects are deviations from the district mean, and gains attributed to teachers are estimated by adding the teacher effect to the district mean (gain).

[11] For basic intuition, consider Ballou, Sanders, & Wright's (2004) example with the simple case where $\gamma$ contains one teacher effect, so the shrinkage factor reduces to the reliability ratio, $\mathrm{var}(\gamma) / \left(\mathrm{var}(\gamma) + [\mathrm{var}(\varepsilon)/N]\right)$.

[12] $R$ captures the within-student covariances in student test score residuals, $\varepsilon$. Sorting the scores, $A$, by student, $R$ is block diagonal with a block $R_i$ for each student, and all other elements are zero, reflecting the imposed zero correlation between students. To form this matrix, consider an overall covariance matrix, $R_0$, that contains a row and column for each subject/grade, so covariances among subjects and grades are assumed to be the same for all years (cohorts), but $R_0$ is otherwise unrestricted. Similar to the URM's accommodation of incomplete records, in the MRM each student has a block $R_i$ composed of the subset of elements in the overall covariance matrix, $R_0$, that correspond to the subject/grades for which the student has test scores, regardless of whether the student is linked to a teacher for these scores. Hence, the $R$ matrix allows the MRM to incorporate information from all available scores from each student.

[13] $G$ captures the variance of teacher effects, and is block diagonal with a block for each subject/grade/year. The (block) diagonal form reflects the assumption that teacher effects are not correlated across subjects or years, allowing teachers' effectiveness to vary from year to year and subject to subject. Each block has the form $\sigma^2_{jkl}I$, where $\sigma^2_{jkl}$ is the teacher variance for the $j$th subject in the $k$th grade in the $l$th year, allowing the variance of teacher effects to vary across subjects, grades, and years.

As noted in Wright et al. (2010), when $G$ and $R$ are known, $\hat\gamma$ is the best linear unbiased predictor (BLUP) of $\gamma$, $\hat\beta$ is the best linear unbiased estimator (BLUE) of $\beta$, and the solution is equivalent to GLS. If $\gamma$ and $\varepsilon$ are Normal, then the solution is MLE. Generally $G$ and $R$ are not known, so estimates are used instead; the solution approaches MLE as the estimated $G$ and $R$ approach their true population values.

This approach is computationally burdensome, and hence generally limited to district-level rather than state-level analysis, so we do not estimate the MRM in this paper. Further, many characteristics of the approach, such as the joint modeling of scores from different subjects, or the accommodation of missing data, have been evaluated in other studies (e.g., Lockwood et al., 2007; McCaffrey et al., 2011).

3.3 Prior Literature Evaluating EVAAS Methods

While the research literature on estimating teacher effectiveness has been growing rapidly in recent years, only a handful of these studies have ever implemented either of the EVAAS teacher models in simulations or using administrative data (Lockwood et al., 2003; Ballou, Sanders, & Wright, 2004; McCaffrey et al., 2004; Lockwood et al., 2007; McCaffrey et al., 2008; Rose, Henry, & Lauen, 2012). The majority of these studies focus on a specific assumption or characteristic of the EVAAS MRM, such as the complete persistence of teacher effects, and only mention in passing that this is part of the EVAAS method. To our knowledge, Rose, Henry, & Lauen (2012) is the only other study to evaluate the URM, among several other estimators they consider. We focus specifically on the URM, providing a more detailed discussion of the method and also of how the method relates to other (simpler) approaches. In particular, we show that standard linear regression using OLS is a simpler, and in some cases more robust, alternative to this EVAAS method.

There are several papers made available by the SAS Institute (SAS White Papers) that discuss the theoretical advantages of the EVAAS methods, and some also evaluate the performance of the EVAAS methods (e.g., Sanders, 2006; Wright, Sanders, & Rivers, 2006; Wright, 2010). These papers tend to focus on the scaling of the test scores, measurement error in the test scores, and missing data. The test-score scaling issue stems from the fact that some approaches, such as the MRM and other gain-score VAMs, require test scores to be vertically scaled, not just from grade to grade, but in a way that leaves the meaning of a 1-unit change in score the same at any point in the distribution. The URM and other lag-score VAMs, however, do not require such scaling, and, for the approaches that do, the entire issue can be circumvented by converting the scores to normal curve equivalents (NCEs) (Wright et al., 2010). The second concern is that the measurement error in the lagged test scores will cause bias in the estimates of teacher effects, also contributing to instability in the estimates. In a paper aimed at evaluating a standardized gain model and student growth percentile model, the URM and MRM are also estimated for comparison, and their estimated teacher effects are shown to have smaller correlations with the percent of students in a teacher's class who are eligible for free- and reduced-price lunches (Wright, 2010). Hence, the proposed solution for measurement error bias is to include multiple lag scores (at least three) to mitigate the attenuation bias, as the measurement error tends to average out (e.g., Wright, 2010). However, this can worsen missing data issues in some approaches, lending advantage to the EVAAS MRM, as it uses all possible test scores (Wright, 2010). This leads to the last of the major concerns: students with incomplete test score records. Both the MRM and URM incorporate ways to mitigate missing-data issues, with the MRM including students with any observed test scores, while the URM requires at least three prior scores (Wright et al., 2010).[14] Other noted features include the use of shrinkage estimation (empirical Bayes') (Sanders, 2006) and the MRM's layering of all past, present and "future" test scores (Wright et al., 2010), both of which are thought to improve stability of teacher effect estimates.

[14] The distinction is also made that the EVAAS methods require data to be "missing at random" (MAR) as opposed to other methods, which require the data to be "missing completely at random" (Wright, 2010; Wright et al., 2010). However, as noted above, the consistency of an OLS approach such as DOLS only relies on something similar to MAR.

In one of the early papers to implement EVAAS, McCaffrey et al. (2004) propose a "general model" which encompasses several other VAMs as special cases. Their approach for the general model is similar to the EVAAS MRM, differing in that it allows for the inclusion of student covariates and does not impose complete persistence of teacher effects. In a theoretical discussion, which is supported by their simulation and empirical evidence, they find that omitting student covariates results in biased teacher effect estimates when the distribution of covariates differs by school (and school effects are omitted), but that in other cases (when the distribution of covariates differs, say, by classroom) the use of within-student correlation mitigates this bias. Another feature of the EVAAS MRM is the assumed complete persistence of teacher effects (e.g., the contribution of the 3rd grade teacher persists undiminished for the scores in all subsequent grades). Given that this assumption is not theoretically or empirically justified, it is perhaps unsurprising that McCaffrey et al. (2004) find no evidence to support this, estimating the persistence parameters to be 0.1 to 0.3 (none of which are significantly different from zero). However, with the small simulation and limited administrative data (678 students from 5 elementary schools in a single suburban district, with free- and reduced-price lunch eligibility as the only covariate), even the authors admit the evidence on both the omission of student covariates and persistence is insufficient and warrants future research.

More recently, Lockwood et al. (2007) develop a Bayesian framework which is better suited to scale to large datasets than the maximum likelihood methods used by McCaffrey et al. (2004). Further, they expand the analysis by jointly modeling reading and math scores, and explore the implications of using different approaches for addressing missing data. They use five years of data on one cohort of students from a large urban district, as well as simulations, and, again, find persistence estimates are substantially less than 1. They conclude that joint versus marginal modeling does not affect teacher effects significantly (rank correlations between teacher effects from joint and marginal models are greater than 0.99 for the variable persistence model and greater than 0.97 for the complete persistence model). They also note that their results are robust to which method is chosen to handle missing data. To further examine the implications of missing data, McCaffrey & Lockwood (2011) extend the approach to explicitly allow for data to be missing not at random, but find little impact on the estimated teacher effects, suggesting that violations of the missing at random (MAR) assumption may not be problematic.

Also using their proposed model (a generalization of the MRM), Lockwood & McCaffrey (2007) use simulations to explore how the potential bias from omitting student covariates changes, depending on the assumptions one is willing to make about the way in which the student heterogeneity relates to the measures, and whether a random effects or fixed effects approach is taken. They argue that even when omitted student heterogeneity is related to other variables, the GLS estimator arising from the mixed model approach (similar to the MRM), which jointly models test scores from different subjects, has additional information available on the heterogeneity, and this increases efficiency and also reduces the bias (relative to modeling a single subject).[15]

Ballou, Sanders, & Wright (2004) also focus on the issue of omitting student covariates, but do so specifically with the EVAAS MRM.[16] The authors obtain the usual EVAAS estimates of the teacher effects as well as estimates from a modified EVAAS approach that controls for students' FRL eligibility, non-white race, gender, and interactions between these covariates. This modification is implemented in a first stage to obtain quasi-residuals from estimation using the gain score as the dependent variable and student characteristics and teacher-by-year indicators as covariates. Then they use these quasi-residuals in the usual EVAAS estimation. They find that the estimated teacher effects do not differ substantially, with high rank correlations between estimates and also similar numbers of teachers classified as "excellent". To explore whether this result is due to the history of prior scores accounting for student covariates, they also compare the $R$ matrix for the original and modified EVAAS approaches, finding the elements to be approximately 18% smaller in the latter. They conclude that including prior test scores does control for "much" of the information contained in student-level covariates. While this is suggestive evidence in support of omitting student covariates, this and earlier evidence i
slimitedtotheMRM,asnoneofthestudiesmentionedherehaveincludedtheURM.ArecentpaperbyRose,Henry,&Lauen(2012),however,notonlyprovidescomparisonsusingtheEVAASURM,butactuallyprovidesthesecomparisonswithabroadersetof(nine)value-addedapproaches.Thepaperdiscussesassumptionsandimplicationsofviolations,andalsoprovidessimulationandstatewideempiricalevidenceusingthreeyearsofadministrativedatafrom15Forthispaper,theyusemaximumlikelihoodmethodsforthemixedmodels,butalsonotethatseparatesimulationsusingaBayesianapproachdoesNOT???changetheresultssubstantively.16Interestingly,theauthorsdiscusstheomissionofstudentcovariatesasavirtueoftheEVAASMRM,inthatthisreducesdatarequirements.Indeedthesedata(suchasFRLeligibility,gender,race,absences,etc.)canbemissingforsomestudents,butthesedataaregenerallyavailable,andasimilarargumentcouldbemadeforthemanyVAMsthatutilizeonlyoneortwopriortestscores,asusingacompletehistoryoftestscorescanbequiteonerouswhenusinglargeadministrativedatasets.Forexample,Lockwoodetal.(2007)notethatonly20%oftheirsampleofstudentshadacompletesetofreadingandmathscoresoverthe5years(grades)used.85NorthCarolina.TheestimationapproachesincludetheURM,threeHLMapproacheswhichalsotreattheteachere!ectsasrandom,twowhichtakethewithin-teacheraverageoftheresiduals,andthreewhichtreattheteachere!ectsasÞxed.ThespeciÞcationsvarywithrespecttothenumberof(andsubjectof)laggedscores,schoole!ects,andtime-constantstudentcovariates.TheyÞndhighagreementamongmostoftheapproaches,butoverallrecommendtheURM,thetwoapproachesthattreattheteachere!ectsasrandomandaccountforaschoolrandome!ect,aswellasastudentÞxede!ectsapproach.173.4Simulation3.4.1SimulationDesignWeconductsimulationstoassesstheperformanceoftheDOLS,EB,AR,andURMestimatorsundervariousstudentgroupingandassignmentscenarios.ThisallowsustoknowtheÒtrueÓteachere!ect(whichwegenerate),andthenevaluatetheabilityofeachoftheestimatorstocapturethise!ectÑsomethingnotpossiblewithadministrativedata.Wegeneratedatafor3cohortsof800studentseach,creatingacurrentscoreandtwolagge
dscoresforeachstudent.Forouranalysis,wefocusonasinglegrade,sousingoneobservationperstudent,but3cohortsofstudentsperteacher.Thesimulationsaredesignedwithelementarygradesinmind,sowecanthinkofthissettingaslookingat5thgradestudentsandteachers.Classsizeissetto20,foratotalof40teachers.Togeneratethetestscores,weÞrstobtainabaselinescore(i.e.,theÞrstgradetested)drawnfromastandardnormaldistribution.Eachofthesubsequenttestscores,Ait,isthengeneratedaccordingtotheequationbelow:Ait=!Ai,t!1+"it+ci+uit(16)whereAi,t!1islaggedachievement,"itistheteachercontributiontothecurrentscore(thetrueteachere!ect),ciisthetime-constantunobservedstudente!ect,anduittheidiosyncraticerror.Thedecayparameter,!,issettoeither0.5(substantialdecay)or1(nodecay).Thecorrelation17Rose,Henry,&Lauen(2012)alsonotethattheirresults(andhenceconclusions)fortheDOLSestimatormaydi!erfromthoseinGuarino,Reckase&Wooldridge(forthcoming)duetosimulationdesign.86betweenlaggedachievementandthestudentÞxede!ectis0.5.Thethreerandomparametersaredrawnfromnormaldistributions:studentÞxede!ectci!N(0,.52),teachere!ect!!N(0,.252),andtheidiosyncraticerroruit!N(0,1)(sotheirrespectiveproportionsofthetotalvarianceintestscoresare19%,5%,and76%).Tolookatnonrandomsortingofstudents,wemakethedistinctionbetweengrouping(howstudentsaregroupedintoclassrooms)andassignment(howstudentsareassignedtoteachers),allowingforstudentstobe,say,groupedbasedonpriorachievementlevels,butthenrandomlyassignedtoteachers.Welookatgroupingbasedonthelaggedscore(referredtoasdynamicgrouping),basedontheoriginalbaselinescore(aformofÒstaticÓgroupingreferredtoasbaselinegrouping),andbasedonthestudentindividualheterogeneity(anotherformofstaticgrouping,referredtoasheterogeneitygrouping).Welookatthreedi!erentassignmentmechanismsforeachofthesegroupingscenarios:randomassignment,positiveassignment(e.g.,betterstudentstobetterteachers),andnegativeassignment(e.g.,strugglingstudentstobetterteachers).Inthecasesofnonrandomassignment,theassignmentisnotperfectlyseparatingstudentsinrankorderof,s
ay,laggedachievement,ratherassignmentisnoisywiththenoisebeingdrawnfromastandardnormaldistribution.Inadditiontovaryingthegroupingandassignmentmechanismsandthedecayparameter("),wealsoconductsimulationsusinglargerteachere!ects,with!!N(0,.62)(andci!N(0,.52),sotheirrespectiveproportionsofthetotalvarianceintestscoresare21%each).Weconduct100MonteCarlorepetitionsforeachgrouping-assignment-parameterscenario.Weexaminetheperformanceoffouroftheestimatorsdiscussedabove(DOLS,AR,EB,EVAASURM).FortheÞrstthreeestimatorsweconsideraÒcommonÓspeciÞcation,similartoequation(3),wherethecovariatesincludealaggedtestscore,andinthecaseofDOLS,teacherassignmentindicators.(Wedonotincorporatee!ectsforstudentcharacteristicsintothesimulation.)FortheURM,weusebasethecompositescoreonthissamelaggedtestscoreaswellasatwo-yearlaggedtestscore.18Asdiscussedabove,wealsoestimatespeciÞcationsthatÒmimicÓtheEVAASURM18IncorporatingfurtherlaggedscoresorlaggedscoresinothersubjectswouldnotcontributesubstantivelytoourevaluationofthetheoreticalimplicationsofsortingorassignmentfortheURMestimator,asthesewouldconstitutethesameissuesashavingonevs.twolags.Weprefertopresentthesimplecaseoftwolagstofacilitatetransparencyinoursimulationdesignandresults.87approach,usingDOLS,AR,andEB,toillustratewheredivergencesintheperformanceoftheestimatorsiscomingfrom.Hence,forthesimulations,thismeansincludingboththeone-yearandtwo-yearlaggedtestscoresintheestimatingequation.ForallestimatorsandspeciÞcations,weestimatetheteachere!ectsÞrstusingoneyear(cohort)ofdata,andthenestimatethempoolingoverthreecohorts(years)aswell.OurÞrstmetricforevaluatingtheperformanceoftheseestimatorsistheSpearmanrankcorrelationsbetweentheestimatesandthetrueteachere!ects,toexaminetheirabilitytouncoverthetruee!ect.WealsolookatthecorrelationsbetweentheestimatesobtainedviatheURMandthosefromtheotherestimators,tolookathowsimilarlytheyrankteachers.3.4.2SimulationResultsWeÞrstassesstheabilityofeachoftheestimatorstouncoverthetrueteachere!ect,lookingatthecorrelationsbetweentheesti
matedandtruee!ects.Forourmainresults,wefocusontheÒsmallÓteachere!ects,whichaccountfor5percentofthevariationintestscores.Inpractice,itwouldbeconvenienttouseoneyearofdata(i.e.,onecohortofstudents)toevaluateteachere!ectiveness,soTableC1providestherankcorrelationsbetweenthetrueteachere!ectsandtheestimatedteachere!ectsinthissetting.PanelAshowsthecaseofsubstantialdecay(!=.5)andPanelBthecaseofcompletepersistence(!=1).Withineachpanel,10grouping-assignmentscenariosareexplored.TheestimatorsconsideredÞrstareDOLS,AR(averageresidual),andEBontheÒcommonÓspeciÞcationwhichcontrolsonlyforonelagscore(inadditiontotheteachere!ectsinthecaseofDOLS).Thenextsetofcolumnsarebasedonapproacheswhichalsoincludeatwo-yearlaggedscore,tomimictheinformationinthecompositescoreoftheURM.TableC1showsthatunderrandomgroupingandrandomassignment,therankcorrelationsare0.69forallestimators,andnonrandomgroupingdoesnotcauselargedeparturesfromthis,aslongasassignmenttoteachersisrandom.Theestimatorsactuallyperformbestinthepositiveassignmentcases,inparticularwhengroupingisbasedonthestudentheterogeneity,withrankcorrelationsranging.78Ð.80,aresultarisingfrombiasthatexpandsthedistributionofestimatedteachere!ects,makingiteasiertodistinguishbetweenteachers(seeGuarino,Reckase,&Wooldridge(2015)fora88moredetaileddiscussionofthisresult).Conversely,theestimatorsperformtheworstwhenstudentsaregroupedonheterogeneityandthennegativelyassignedtoteachers,withrankcorrelationsranging.41Ð.43when!=.5and.45Ð.46when!=1.AlsoevidentinTableC1isthecloserelationshipbetweentheEVAASURMandusingEBtoestimateaspeciÞcationwiththesamelaggedtestscores,asthecorrelationsintheURMandEB-mimiccolumnsarenearlyidentical.Further,weseethatunderdynamicgroupingwithpositiveornegativeassignment,althoughallestimatorsperformworserelativetorandomassignment,DOLSperformssubstantiallybetterthanAR,EB,orURM,regardlessofthevalueof!.Thisarisesfromthefactthattheseapproachesarenotcorrectlypartiallingouttheassignmentmechanismfromtheteachere!ects.EBandtheURMbothareclosertoDOLSthanA
R,though,becauseasthenumberofstudentsperteachergetslarger,theEmpiricalBayesÕestimatesoftheteachere!ects(underlyingEBandURM)willgetclosertoDOLS(seeGuarinoetal.(2015)foramoredetaileddiscussionofthisresultthattherandome!ects(RE)estimateswillconvergetotheÞxede!ectsestimatesasthesamplesizeincreases).Althoughusingonecohortofstudentsisconvenient,inpracticemultiplecohortsareoftenused,sowealsopresentresultsfromusingthreecohortsofstudents(i.e.,threeyearsofdataonteachers)toestimateteachere!ectiveness.Giventhatthisisincreasingtheamountofinformationonteachers(andteachere!ectsdonotvarybyyearinoursimulation),weexpecttheperformanceofalloftheestimatorstoimprove.TherankcorrelationsinTableC2showthisimprovedperformance,buttheresultsalsofollowthesamerelativeperformancepatternsacrossscenariosandestimators.Thecorrelationsunderrandomgroupingandrandomassignmentarenowlargerat.84.Inthecaseofgroupingbasedonstudentheterogeneitywithpositiveassignmenttoteachers,thecorrelationsarenow.89forallestimators.Whenstudentsareinsteadnegativelyassignedtoteachers(basedonheterogeneity),thecorrelationsare.52Ð.56when!=.5and.57Ð.58when!=1.Underthisscenario,thecorrelationsfortheÒmimicÓspeciÞcationestimatorsareslightlylargerthanthosefromtheÒcommonÓspeciÞcationswhen!=.5,butthisresultcomesfromtheamountofdecay;when!=1thereisnomotivationforcontrollingforasecondlaggedscore.AgainweseethenearlyidenticalperformanceoftheURMandEB-mimic.TheissueofpoorperformanceofAR,EB,andURMundernonrandomassignmentbasedonthelaggedscoreremains.Again,theURMandEB89estimatorsperformmoresimilarlytoDOLSthanARexhibitingtheconvergenceoftherandome!ectsapproach(EB,URM)totheÞxede!ectsapproach(DOLS).ARperformstheworstbecausetheassignmentmechanismisnotpartialledoutatall.TableC3showsthecorrelationsfortheDOLS,AR,andEBestimatescomparedwiththeEVAASURMestimatesunderthevariousgroupingandassignmentscenarios,Þrstwhenestima-tionusesonecohortofstudentsandthenwhenestimationisbasedonthreecohortsofstudents.AgreementwiththeURMishigh(.99Ð1.00)forallestimatorsundermost
scenarios, with the smallest correlations being for the cases of nonrandom assignment based on the lagged test score. In these cases, the correlations between DOLS and the URM are still above .90 (ranging .92–.97), reflecting the difference in accounting for the assignment mechanism discussed above.

3.4.3 Sensitivity of Simulation Results

While some sensitivity analyses were presented as part of the main results (e.g., using one versus three cohorts of students, or choosing $\lambda = .5$ versus $\lambda = 1$), we also conducted simulations with larger teacher effects. In this case, the teacher effect and the student heterogeneity each account for about 21% of the total variation in test scores. With the teacher effects accounting for more of the variation in the test scores, we naturally expect the performance of the estimators to improve. As shown in Table C4, this is certainly the case. Still, the results follow the same general patterns discussed for the small teacher effects case. Similarly, Table C5 shows comparable agreement between DOLS, AR, EB and the URM, with the correlations being of similar magnitude except for the dynamic grouping and nonrandom assignment cases. The agreement in these cases is higher, but still illustrates the divergence in estimates that arises from how the approaches account for the assignment mechanism.

3.5 Empirical Analysis

3.5.1 Administrative Data

We use administrative data on students in grades 5 and 6 during years 2002–2007 in a large urban anonymous district.19 Similar to our example used in the EVAAS URM discussion, we focus on math scores as the outcome and use one-year and two-year lagged math and reading scores as covariates in some specifications. The data contain information on student race/ethnicity, days absent, gender, disability, limited English proficiency (LEP), and free- or reduced-price lunch eligibility (FRL). We exclude students who are not linked to mathematics teachers, students who are assigned to classes (i.e., teacher/year groups) with fewer than 10 students, and students who were retained or have duplicate grade-year observations. All estimations also require that students have, at a minimum, a current math score and a one-year lagged math score.

19 Our data sharing agreement does not allow us to name the district or state.

Sample characteristics and average scores for the fifth and sixth grade samples are provided in Table C6 for the students with data satisfying the minimum sample inclusion requirements just described; these estimation samples cover years 2002–2007. The first set of descriptives in Panel A is for the sample of fifth grade students, while Panel B contains the descriptives for the sixth grade sample. Across grades, the average student characteristics are very similar, with about 61% of the students being Hispanic, 28% Black, and 50% female. Approximately 52% are classified as limited English proficient (LEP) and about 70% are FRL-eligible. The sample sizes are also provided for each variable in the table, to illustrate how the samples could change depending on which lagged test scores are included. For example, adding a two-year lagged math score in a regression would mean 3,433 (3.1%) fifth grade students are omitted. For sixth grade, the sample falls by 3,412 (3.4%) with the addition of two-year lagged math. This indicates that including a longer history of scores does impose data restrictions, though as discussed above, the URM is able to relax these restrictions somewhat.

We estimate teacher effects separately for 5th and 6th grade, focusing on math teachers only (so we use math scores as our outcome variable). We compute estimates using one or two years of data for teachers (i.e., one or two cohorts of students). Similar to the approach for the simulations, we estimate several specifications using AR, DOLS, and EB. One set of specifications includes student characteristics and either the 1-year or both the 1-year and 2-year lagged math scores. The other specifications are designed to be more similar to the URM (and omit student characteristics); one specification includes the 1-year and 2-year lagged scores in math and reading, and the other specification uses the composite score as the only regressor.

3.5.2 Empirical Results

With the administrative data, we estimate a few specifications with each of the three "common" estimation methods considered in this paper (DOLS, AR, and EB). The first specification is the lag-score specification shown in equation (3), which controls for the 1-year lagged math score, other student-level covariates, and year effects.20 The second specification is augmented with a 2-year lagged math score. The third specification omits student covariates but includes the same lagged
scores as the composite score computed for the URM, hence attempting to "mimic" the information used in the URM estimation. The last specification uses the composite score itself as the only covariate (so when using EB estimation, this is identical to the URM). We consider teacher effects computed based on one year of data or pooled over two years of data, covering the years 2002–2007. We then examine agreement among the estimators in each year and present results on average agreement during this time period.

20 The student-level covariates include controls for days absent, race/ethnicity, disability, LEP, FRL-eligibility, and female.

In Table C7, we provide average Spearman correlations between the EVAAS URM estimates and those from each of the other estimator/specification combinations. Within each specification, the rank correlations do not change significantly when pooling over an additional year of data for estimation and also do not differ substantially between estimators. In column [1], the correlations show that agreement with the URM is slightly better in the 6th grade analysis for all estimators, and there we also see that agreement is highest for EB, slightly lower for DOLS, and lowest for AR. When we add a 2-year lagged math score to the specification (column [2]), the rank correlations all improve substantially, to around .97 for 5th grade and slightly higher, around .98, for 6th grade (with the exception of AR, which is lower at .96 for 6th grade). In column [3] we omit student characteristics and use 2 lag scores each in reading and math, and now find even greater agreement with the URM estimates, with rank correlations above .99. Finally, in column [4], we use the composite score as the only regressor, and now the rank correlations are even higher. (The rank correlations for EB are exactly 1 because this is the URM approach itself.)

Within each grade/specification combination, the EB rank correlations are at least as large as those for DOLS or AR, which indicates that the estimation approach matters somewhat. However, the specification seems to be more important in our data. Agreement with the URM increases for all estimators as we get closer to using the same specification as the URM (moving left to right from column [1] to column [4]); when we use the composite score as the only regressor, all of the rank correlations are very close to 1.

The results in column [3] also show that the differences between the URM and the regression-based approaches using the same lag scores are not large. The complicated nature of the URM stems from taking extra steps to include students with certain patterns of partially missing test score records, since regression-based methods omit these students from estimation. Given that consistent estimation for DOLS and the URM requires very similar (if not identical) assumptions regarding the way in which data are missing, it is not surprising that the two approaches reach similar results. The estimates from simple DOLS estimation of a similar specification with teacher indicators correlate very highly (.99) with the complicated multi-step EVAAS URM estimation. The high agreement between DOLS and the URM also suggests that there is not substantial nonrandom assignment based on prior achievement in our data. Our simulation results showed that DOLS is robust to this type of nonrandom assignment while the URM (along with EB and AR) is not.

For another illustration related to a policy context, Table C8 shows the average percent (and number) of teachers for which each of the other estimators would disagree with the URM on the classification of teachers in the top decile of the distribution of estimated teacher effects. This could represent a scenario where the top 10 percent of teachers receive a pay increase or bonus. The disagreement rates range from 0.2% to 2.6%, with the smallest for EB estimation of the specification that "mimics" the URM, which is expected. In this case, during the 2002–2007 period only 4 or 5 sixth grade (9 or 10 for fifth grade) teacher effects were classified in the top 10 percent with the URM estimates but classified as below the 90th percentile with the EB-mimic estimates. The analogous results in column [3] for DOLS show disagreement rates on the top decile of .7% (30 or 34 teacher effects) for fifth grade and 1.2% (18 or 22 teacher effects) for sixth grade.

3.6 Summary and Conclusions

We have shown how, in a simplified setting, the multi-step EVAAS URM estimation approach relates very closely to simple OLS estimation using the same lagged test scores. While this exact relationship is more difficult to see when we extend to settings with missing data or multiple years, we show how similar the estimates are, and under what conditions they are expected to diverge, using both simulations and administrative data.

Our simulation evidence shows that the URM exhibits similar performance patterns to those seen with empirical Bayes' estimation in Guarino et al. (2015). While the URM and EB perform similarly to DOLS under the ideal conditions of random assignment and random grouping, DOLS is most robust to nonrandom assignment, especially assignment based on the lagged score, which is certainly a plausible assignment mechanism. Our results based on administrative data suggest that there may not be substantial sorting in this district, given the similarity between the URM/EB and DOLS, regardless of specification.

Although our simulations showed that OLS generally does as well as, or better than, the more complicated EVAAS URM in recovering true teacher effects, our analysis of administrative data suggests the extent of the differences may not be extremely problematic in practice. This is perhaps reassuring given that the EVAAS methods are already used in several states and districts for teacher evaluation purposes, in some cases for high-stakes decision making.

APPENDIX

Table C1: Correlations Between Estimated and True Teacher Effects (1 Cohort of Students)

Small teacher effects, 1 cohort of students

                            "common"                    "mimic"
                            1-yr lag score       1-yr and 2-yr lag scores
Grouping       Assignment   DOLS   AR     EB     URM    DOLS   AR     EB

PANEL A - Substantial decay (lambda = 0.5)
Random         Random       0.69   0.69   0.69   0.69   0.69   0.69   0.69
Dynamic        Random       0.70   0.70   0.70   0.70   0.70   0.70   0.70
               Positive     0.67   0.49   0.53   0.53   0.68   0.50   0.53
               Negative     0.70   0.53   0.58   0.58   0.70   0.53   0.57
Baseline       Random       0.67   0.67   0.67   0.68   0.68   0.68   0.68
               Positive     0.75   0.72   0.73   0.71   0.73   0.69   0.71
               Negative     0.50   0.49   0.50   0.55   0.55   0.53   0.55
Heterogeneity  Random       0.64   0.64   0.64   0.65   0.64   0.65   0.65
               Positive     0.80   0.79   0.79   0.79   0.79   0.78   0.79
               Negative     0.41   0.41   0.41   0.43   0.43   0.43   0.43

PANEL B - Complete persistence (lambda = 1)
Random         Random       0.69   0.69   0.69   0.69   0.69   0.69   0.69
Dynamic        Random       0.68   0.68   0.68   0.68   0.68   0.68   0.68
               Positive     0.65   0.43   0.46   0.46   0.65   0.43   0.46
               Negative     0.70   0.49   0.53   0.52   0.70   0.49   0.53
Baseline       Random       0.69   0.69   0.69   0.69   0.69   0.69   0.68
               Positive     0.69   0.64   0.66   0.66   0.69   0.63   0.65
               Negative     0.63   0.59   0.61   0.62   0.64   0.59   0.61
Heterogeneity  Random       0.65   0.65   0.65   0.65   0.65   0.65   0.65
               Positive     0.79   0.78   0.79   0.79   0.79   0.78   0.79
               Negative     0.46   0.45   0.45   0.45   0.45   0.45   0.45

Notes. This table provides the Spearman rank correlations with the true teacher effects. The URM and the "mimic" DOLS, AR, and EB are based on specifications with a 1-year and 2-year lagged score, while the "common" DOLS, AR, and EB estimates are based on the specification with just the 1-year lagged score. These results are based on simulations with small teacher effects, and 1 cohort of students.

Table C2: Correlations Between Estimated and True Teacher Effects (3 Cohorts of Students)

Small teacher effects, 3 cohorts of students

                            "common"                    "mimic"
                            1-yr lag score       1-yr and 2-yr lag scores
Grouping       Assignment   DOLS   AR     EB     URM    DOLS   AR     EB

PANEL A - Substantial decay (lambda = 0.5)
Random         Random       0.84   0.84   0.84   0.84   0.84   0.84   0.84
Dynamic        Random       0.84   0.84   0.84   0.84   0.84   0.84   0.84
               Positive     0.84   0.66   0.76   0.76   0.84   0.66   0.76
               Negative     0.83   0.68   0.77   0.77   0.83   0.68   0.77
Baseline       Random       0.82   0.82   0.82   0.83   0.83   0.83   0.83
               Positive     0.88   0.87   0.88   0.87   0.87   0.85   0.87
               Negative     0.65   0.65   0.65   0.71   0.72   0.70   0.71
Heterogeneity  Random       0.81   0.81   0.81   0.82   0.82   0.82   0.82
               Positive     0.89   0.89   0.89   0.89   0.89   0.89   0.89
               Negative     0.52   0.52   0.52   0.55   0.56   0.55   0.55

PANEL B - Complete persistence (lambda = 1)
Random         Random       0.84   0.84   0.84   0.84   0.84   0.84   0.84
Dynamic        Random       0.84   0.84   0.84   0.83   0.84   0.84   0.84
               Positive     0.84   0.59   0.70   0.70   0.84   0.59   0.70
               Negative     0.84   0.62   0.72   0.72   0.84   0.62   0.72
Baseline       Random       0.84   0.84   0.84   0.84   0.84   0.84   0.84
               Positive     0.85   0.81   0.84   0.84   0.85   0.80   0.84
               Negative     0.80   0.76   0.79   0.79   0.81   0.76   0.79
Heterogeneity  Random       0.82   0.82   0.82   0.82   0.82   0.82   0.82
               Positive     0.89   0.89   0.89   0.89   0.89   0.89   0.89
               Negative     0.58   0.57   0.58   0.58   0.58   0.57   0.58

Notes. This table provides the Spearman rank correlations with the true teacher effects. The URM and the "mimic" DOLS, AR, and EB are based on specifications with a 1-year and 2-year lagged score, while the "common" DOLS, AR, and EB estimates are based on the specification with just the 1-year lagged score. These results are based on simulations with small teacher effects, and 3 cohorts of students.

Table C3: Correlations - URM vs. Other Estimators (Small Teacher Effects)

Small teacher effects, "common" estimating equation

                            1 cohort of students 3 cohorts of students
Grouping       Assignment   DOLS   AR     EB     DOLS   AR     EB

PANEL A - Substantial decay (lambda = 0.5)
Random         Random       0.99   0.99   0.99   0.99   0.99   0.99
Dynamic        Random       0.99   0.99   0.99   0.99   0.99   0.99
               Positive     0.94   0.98   0.99   0.97   0.96   0.99
               Negative     0.96   0.99   0.99   0.97   0.97   0.99
Baseline       Random       0.99   0.99   0.99   0.99   0.99   0.99
               Positive     0.99   0.99   0.99   0.99   1.00   0.99
               Negative     0.99   0.99   0.99   0.98   0.98   0.98
Heterogeneity  Random       0.99   0.99   0.99   0.99   0.99   0.99
               Positive     0.99   0.99   1.00   1.00   1.00   1.00
               Negative     0.99   0.99   0.99   0.99   0.99   0.99

PANEL B - Complete persistence (lambda = 1)
Random         Random       1.00   1.00   1.00   1.00   1.00   1.00
Dynamic        Random       0.98   1.00   1.00   0.99   0.99   0.99
               Positive     0.92   0.99   1.00   0.93   0.97   1.00
               Negative     0.93   0.99   1.00   0.94   0.97   1.00
Baseline       Random       1.00   1.00   1.00   1.00   1.00   1.00
               Positive     0.99   1.00   1.00   1.00   0.99   1.00
               Negative     1.00   1.00   1.00   1.00   0.99   1.00
Heterogeneity  Random       1.00   1.00   1.00   1.00   1.00   1.00
               Positive     1.00   1.00   1.00   1.00   1.00   1.00
               Negative     1.00   1.00   1.00   1.00   1.00   1.00

Notes. This table provides the Spearman rank correlations with the URM estimates, for the DOLS, AR, and EB estimates. The URM composite score uses a 1-year and 2-year lagged score, while the DOLS, AR, and EB estimates are based on the "common" specification with just the 1-year lagged score. These results are based on simulations with small teacher effects.

Table C4: Correlations - Estimated vs. True Teacher Effects (Large Teacher Effects)

Large teacher effects; DOLS, AR, EB on "common" specification

                            1 cohort of students        3 cohorts of students
Grouping       Assignment   DOLS   AR     EB     URM    DOLS   AR     EB     URM

PANEL A - Substantial decay (lambda = 0.5)
Random         Random       0.90   0.90   0.90   0.90   0.95   0.95   0.95   0.95
Dynamic        Random       0.90   0.89   0.90   0.90   0.95   0.95   0.95   0.95
               Positive     0.89   0.68   0.83   0.83   0.95   0.75   0.93   0.94
               Negative     0.90   0.76   0.87   0.87   0.95   0.83   0.94   0.94
Baseline       Random       0.89   0.89   0.89   0.90   0.95   0.95   0.95   0.95
               Positive     0.91   0.88   0.90   0.90   0.96   0.94   0.96   0.95
               Negative     0.82   0.82   0.82   0.84   0.89   0.89   0.89   0.91
Heterogeneity  Random       0.86   0.87   0.86   0.87   0.94   0.94   0.94   0.94
               Positive     0.93   0.92   0.93   0.93   0.96   0.95   0.96   0.96
               Negative     0.77   0.76   0.77   0.78   0.84   0.83   0.84   0.85

PANEL B - Complete persistence (lambda = 1)
Random         Random       0.90   0.90   0.90   0.90   0.95   0.95   0.95   0.95
Dynamic        Random       0.89   0.89   0.90   0.90   0.95   0.95   0.95   0.95
               Positive     0.89   0.63   0.77   0.77   0.95   0.69   0.92   0.92
               Negative     0.90   0.70   0.84   0.84   0.95   0.77   0.94   0.94
Baseline       Random       0.90   0.90   0.90   0.90   0.95   0.95   0.95   0.95
               Positive     0.90   0.85   0.89   0.89   0.95   0.91   0.95   0.95
               Negative     0.87   0.85   0.87   0.87   0.93   0.92   0.93   0.94
Heterogeneity  Random       0.87   0.88   0.87   0.87   0.94   0.94   0.94   0.94
               Positive     0.93   0.92   0.93   0.93   0.96   0.95   0.96   0.96
               Negative     0.79   0.78   0.79   0.79   0.86   0.85   0.86   0.86

Notes. This table provides the Spearman rank correlations with the true teacher effects. The URM composite score uses a 1-year and 2-year lagged score, while the DOLS, AR, and EB estimates are based on the "common" specification with just the 1-year lagged score. These results are based on simulations with large teacher effects.

Table C5: Correlations - URM vs. Other Estimators (Large Teacher Effects)

Large teacher effects, "common" estimating equation

                            1 cohort of students 3 cohorts of students
Grouping       Assignment   DOLS   AR     EB     DOLS   AR     EB

PANEL A - Substantial decay (lambda = 0.5)
Random         Random       1.00   0.99   1.00   1.00   1.00   1.00
Dynamic        Random       0.99   0.99   0.99   0.99   0.99   0.99
               Positive     0.97   0.94   1.00   0.99   0.88   1.00
               Negative     0.98   0.94   0.99   0.99   0.91   0.99
Baseline       Random       0.99   0.99   0.99   1.00   0.99   1.00
               Positive     0.99   0.99   1.00   1.00   0.99   1.00
               Negative     0.99   0.99   0.99   0.99   0.99   0.99
Heterogeneity  Random       1.00   0.99   1.00   1.00   1.00   1.00
               Positive     1.00   0.99   1.00   1.00   0.99   1.00
               Negative     0.99   0.99   0.99   0.99   0.99   0.99

PANEL B - Complete persistence (lambda = 1)
Random         Random       1.00   1.00   1.00   1.00   1.00   1.00
Dynamic        Random       0.99   0.99   1.00   1.00   0.99   1.00
               Positive     0.94   0.95   1.00   0.99   0.86   1.00
               Negative     0.97   0.94   1.00   0.99   0.88   1.00
Baseline       Random       1.00   1.00   1.00   1.00   1.00   1.00
               Positive     1.00   0.98   1.00   1.00   0.98   1.00
               Negative     1.00   0.99   1.00   1.00   0.99   1.00
Heterogeneity  Random       1.00   1.00   1.00   1.00   1.00   1.00
               Positive     1.00   0.99   1.00   1.00   0.99   1.00
               Negative     1.00   1.00   1.00   1.00   1.00   1.00

Notes. This table provides the Spearman rank correlations with the URM estimates, for the DOLS, AR, and EB estimates. The URM composite score uses a 1-year and 2-year lagged score, while the DOLS, AR, and EB estimates are based on the "common" specification with just the 1-year lagged score. These results are based on simulations with large teacher effects.

Table C6: Descriptive Statistics for Students in Sample, by Grade

                      Obs       Mean      Std. Dev   Min    Max

Panel A: Grade 5
Math score            110,147   1638.62   232.26     569    2456
Reading score         109,878   1572.35   314.13     474    2713
1-yr lag Math         110,147   1485.78   254.84     569    2330
2-yr lag Math         106,714   1344.95   287.95     375    2225
1-yr lag Reading      109,879   1523.12   317.68     295    2638
2-yr lag Reading      106,510   1297.28   350.45     86     2514
Disability            110,147   0.11      0.32       0      1
LEP                   110,147   0.51      0.50       0      1
Female                110,147   0.50      0.50       0      1
FRL                   110,147   0.70      0.46       0      1
Black                 110,147   0.28      0.45       0      1
Hispanic              110,147   0.60      0.49       0      1

Panel B: Grade 6
Math score            101,307   1652.63   242.93     770    2492
Reading score         101,122   1635.41   302.95     539    2758
1-yr lag Math         101,307   1634.20   220.28     569    2456
2-yr lag Math         97,895    1460.65   247.73     569    2330
1-yr lag Reading      101,008   1550.82   306.45     474    2713
2-yr lag Reading      97,572    1504.67   313.31     86     2638
Disability            101,307   0.06      0.24       0      1
LEP                   101,307   0.52      0.50       0      1
Female                101,307   0.51      0.50       0      1
FRL                   101,307   0.71      0.46       0      1
Black                 101,307   0.28      0.45       0      1
Hispanic              101,307   0.61      0.49       0      1

Table C7: Spearman Rank Correlations, Comparing EVAAS URM to Other Estimators

Specifications: [1] 1-yr lag Math, Student Char.; [2] 1-yr & 2-yr lag Math, Student Char.; [3] 1-yr & 2-yr lags in Math & Reading; [4] Composite score

                      [1]     [2]     [3]     [4]

Panel A: 5th grade
1-year estimates
  DOLS                0.918   0.971   0.997   0.999
  AR                  0.922   0.972   0.995   0.998
  EB                  0.920   0.972   0.998   1.000
  N                   5016    5016    5016    5016
2-year estimates
  DOLS                0.918   0.971   0.997   0.999
  AR                  0.921   0.971   0.994   0.997
  EB                  0.920   0.972   0.998   1.000
  N                   4203    4203    4203    4203

Panel B: 6th grade
1-year estimates
  DOLS                0.941   0.982   0.995   0.997
  AR                  0.931   0.964   0.987   0.990
  EB                  0.944   0.984   0.998   1.000
  N                   1814    1814    1814    1814
2-year estimates
  DOLS                0.945   0.982   0.995   0.997
  AR                  0.935   0.963   0.986   0.990
  EB                  0.948   0.984   0.998   1.000
  N                   1536    1536    1536    1536

Notes. This table provides the average Spearman rank correlation between the EVAAS URM estimate and the other estimator/specifications. N = the number of teacher effect observations underlying the average. For the 1-year estimates this corresponds to the number of Teacher-Year observations from 2002-2007. For the 2-year estimates this corresponds to the number of teacher effects computed (for each estimator/specification) during 2003-2007.

Table C8: Disagreement with the URM in Classification of Teachers Above the 90th Percentile

Specifications: [1] 1-yr lag Math, Student Char.; [2] 1-yr & 2-yr lag Math, Student Char.; [3] 1-yr & 2-yr lags in Math & Reading; [4] Composite score

                      [1]     [2]     [3]     [4]

Panel A: Grade 5
1-year estimates (N=5,016 teacher effect estimates)
  DOLS                2.6%    1.5%    0.7%    0.7%
    n                 129     73      34      35
  AR                  2.6%    1.4%    0.9%    0.8%
    n                 130     71      47      41
  EB                  2.6%    1.4%    0.2%    0.0%
    n                 128     72      10      0
2-year estimates (N=4,203 teacher effect estimates)
  DOLS                2.5%    1.4%    0.7%    0.7%
    n                 106     58      30      31
  AR                  2.5%    1.4%    1.0%    0.8%
    n                 108     58      42      35
  EB                  2.5%    1.4%    0.2%    0.0%
    n                 106     57      9       0

Panel B: Grade 6
1-year estimates (N=1,814 teacher effect estimates)
  DOLS                2.1%    1.1%    1.2%    1.1%
    n                 38      21      22      20
  AR                  2.0%    2.0%    1.6%    1.5%
    n                 36      36      29      27
  EB                  2.0%    0.8%    0.3%    0.0%
    n                 36      15      5       0
2-year estimates (N=1,536 teacher effect estimates)
  DOLS                2.0%    1.2%    1.2%    1.0%
    n                 31      19      18      16
  AR                  1.9%    2.0%    1.6%    1.5%
    n                 29      31      25      23
  EB                  1.8%    0.8%    0.3%    0.0%
    n                 28      12      4       0

Notes. This table provides the average percent of teachers whose classification changes from the top 10 percent in the distribution of EVAAS URM estimated teacher effects to below the top 10 percent in the distribution of teacher effects based on the other estimator/specification combinations. The average is taken as the simple average of the percent misclassified in each year 2002-2007 for the 1-year estimates and 2003-2007 for the 2-year estimates. n = the number of teachers for whom classification changes in this way.
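The two agreement measures reported in Tables C7 and C8 can be illustrated with a short sketch. The Python code below is not the code used for the analysis in this chapter; it is a minimal example, assuming two hypothetical vectors of teacher effect estimates, that computes a Spearman rank correlation and the number and share of teachers who fall in the top 10 percent under a reference set of estimates (playing the role of the URM) but not under a comparison set. The function names and the simulated inputs are illustrative assumptions, and ties in the estimates are assumed away.

```python
import numpy as np


def spearman_rank_corr(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks.

    Assumes there are no ties, which is reasonable for continuous estimates.
    """
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


def top_decile_disagreement(reference, other):
    """Teachers in the top 10% of `reference` who are not in the top 10% of `other`.

    Returns the count and the count as a share of all teachers.
    """
    reference = np.asarray(reference, dtype=float)
    other = np.asarray(other, dtype=float)
    k = max(1, int(round(0.10 * reference.size)))  # number of teachers in the top decile
    top_reference = set(np.argsort(reference)[-k:])
    top_other = set(np.argsort(other)[-k:])
    n_changed = len(top_reference - top_other)
    return n_changed, n_changed / reference.size


if __name__ == "__main__":
    # Hypothetical estimates: the comparison set is the reference set plus noise.
    rng = np.random.default_rng(0)
    reference_estimates = rng.normal(size=200)
    comparison_estimates = reference_estimates + rng.normal(scale=0.1, size=200)

    rho = spearman_rank_corr(reference_estimates, comparison_estimates)
    n, share = top_decile_disagreement(reference_estimates, comparison_estimates)
    print(f"rank correlation: {rho:.3f}")
    print(f"top-decile disagreement: n = {n}, share = {share:.1%}")
```

Dividing the count by the total number of teacher effects, rather than by the size of the top decile, follows the way the percentages and counts in Table C8 relate to one another (for example, 129 of 5,016 is about 2.6%).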