INFERENCESINMIXED-EFFECTSMODELSWITHMISSINGCOVARIATESAND APPLICATIONTOMETA-ANALYSIS By AkshitaChawla ADISSERTATION Submittedto MichiganStateUniversity inpartialentoftherequirements forthedegreeof Statistics-DoctorofPhilosophy 2015 ABSTRACT INFERENCESINMIXED-EFFECTSMODELSWITHMISSING COVARIATESANDAPPLICATIONTOMETA-ANALYSIS By AkshitaChawla Thisdissertationconsistsoffourchapters.Thechaptermotivatesproblemsofin- terestandgivesabriefliteraturereview. Thesecondchapterinvestigatespopularmethodsoftestinginlinearmixedmodels.The linearmixedmodelisprominentlyusedinresearchinvolvinghumanandanimalsubjects. Drawinginferencesonmodelparametersisprimarilyimportantforexplainingthebiological outcomes.Unlikethestandardlinearregressionmodels,thelinearmixedmodelinferenceis basedonasymptotictheory.Thusastudyofsampleperformanceoftheexistingproce- durecouldbeofpracticalinterest.Thischapterreviewsthefollowingpopularapproachesof testingedinalinearmixedmodel:(1)likelihoodratiotest,(2)restrictedmaximum likelihoodratiotest(3)Bartlettcorrectedlikelihoodratiotest(4)Bartlettcorrected Cox-Reidlikelihoodratiotestand(5)Kenward-RogerapproximateF-test.Theperformance ofthesemethodsarecomparedbasedonType-Ierrorrateviaanextensivesimulationstudy. WeconcludethattheKenward-RogertestisthebestinpreservingType-Ierrorrate. Thethirdchapterdevelopsatestfortsinsmallsamplelinearmixedmodelwith missingcovariates.Partiallyobservedvariablesarecommoninscienresearch.Ignoring thesubjectswithpartialinformationmayleadtobiasedandortestimators,and consequentlyanytestbasedonlyonthecompletelyobservedsubjectsmaytheerror probabilities.Missingdataissuehasbeenextensivelyconsideredintheregressionmodel, especiallyintheindependentlyidentically(IID)datasetup.Relativelylessattentionhas beenpaidforhandlingmissingcovariatedatainthelinearmixedmodel{adependent datascenario.Incaseofcompletedata,Kenward-Roger'sFtestisawellestablishedmethod fortestingofinalinearmixedmodel.Inthischapter,wepresentamo Kenward-Rogertypetestfortestinginalinearmixedmodelwhenthecovariates aremissingatrandom.Intheproposedmethod,weattempttoreducebiasfromthree sources,thesmallsamplebias,thebiasduetomissingvalues,andthebiasduetoestimation ofvariancecomponents.Theoperatingcharacteristicsofthemethodisjudgedandcompared withtwoexistingapproaches,listwisedeletionandmeanimputation,viasimulationstudies. Thefourthchapterappliestherandommodeltometa-analysisofrareeventdata. Inclinicaltrialsandmanyotherapplications,meta-analysisismainlyconductedtosum- marizethesizeonprimaryendpointsoronkeysecondaryendpoints.However,there aretimeswhensafetyendpointssuchasriskofcomplicationsinpancreaticsurgeryorthe riskofmyocardialinfarction(MI)isthemainpointofinterestinadrugstudy.Asthese typesofsafetyendpointsarerareinnature,moststudiesreportzerosuchincidences;the generalstatisticalframeworkofmeta-analysisbasedonlargesampletheoryfallsapartto combinethesize.Asaworkaround,eithertrialswithbotharmshavingzeroevents aredeletedora0.5correctionisapplied.Inarandomizedcontroltrial(RCT)set-up,Cai, ParastandRyan,2010,proposedmethodsbasedonPoissonrandommodelstodraw inferencesonrelativerisksfortwoarmswithrareeventdata.Inthischapter,wegivethe generalframeworkofhowtheirassumptionofhavingRCTcanbefurtherrelaxedanduti- lizedfornon-RCTstudiestodrawinferencesfortwotreatments.Wealsodeveloptwonew approachesbasedonzeroPoissonrandommodelsthataremoreappropriate withexcessivezerocountsdata. ACKNOWLEDGMENTS IwishtoexpressmysincerethankstomyadvisorProfessorTapabrataMaitiforguiding andinspiringmethroughmygraduatestudy.Ithankhimforsuggestinginterestingand challengingproblemsandalsoforprovidingmewithinnumerableopportunitiesthroughmy timehere. IthankProfessorHiraKoulforhavinggivenmetheopportunitytopursuegraduatestudy atMichiganStateUniversityandforalwaysbeingavailableforguidance.Ialsothankhim andhisfamilyforbeingsowelcoming.IthankProfessorChaeYoungLim,ProfessorPing- ShouZhongandProfessorRobertTempelmanforservingonmydissertationcommittee. Ithankmyparentsfortheirconstantloveandsupport,withoutwhichthisworkwould nothavebeenpossible. FinallyIwouldliketothankmyfriendsAbhishek,ShaheenandAtishiformakingthis journeyjoyous. ThisresearchhasbeenpartlysupportedbyNSFgrantsDMS-1106450andSES-0961649, forwhichIamgrateful. iv TABLEOFCONTENTS LISTOFTABLES .................................... vii LISTOFFIGURES ................................... xi Chapter1Introduction ............................... 1 Chapter2Smallsampleinferencesinlinearmixedmodels ......... 6 2.1Likelihoodratiotest(LRT)...........................7 2.2Restrictedmaximumlikelihoodratiotest(RM-LRT).............9 2.3Bartlettcorrectedlikelihoodratiotests.....................11 2.3.1Bartlett-correctedplikelihoodratiotest(BC-LRT).......11 2.3.2Bartlett-correctedCox-Reidprolikelihoodratiotest (BC-CRT).................................12 2.4Kenward-RogerapproximateF-test(KRT)...................14 2.5SimulationStudy.................................17 2.5.1StudyI:longitudinalstudy........................18 2.5.2StudyII:balancedincompleteblockdesign...............23 2.6Realdataexample................................28 Chapter3Kenward-Rogerapproximationforlinearmixedmodelswith missingcovariates ............................ 30 3.1Inferenceforthe...........................33 3.1.1Estimationofthevarianceoftheestimator b .............33 3.1.2Approximatingthedistributionoftheteststatistic..........36 3.2Simulation.....................................37 3.3Realdataexamples................................46 3.3.1Example1.................................46 3.3.2Example2.................................47 3.4ProofsofChapter3................................49 3.4.1Assumptions................................49 3.4.2EstimatesofandinTheorem1.................50 3.4.3Derivationof e E and e V usedinTheorem2...............56 3.4.4ImplementationofSimulationStudy..................63 Chapter4Application:Meta-analysisusingmodels .... 76 4.1Likelihood-basedapproachduetoCPR.....................78 4.1.1Poissonmodelwithrelativerisk:POIF..............78 4.1.2Poissonmodelwithrandomrelativerisk:POIR............79 4.2Proposedmethodsformeta-analysisbasedonzeroPoissonmodels.80 v 4.2.1ZeroPoissonmodelwithrelativerisk:ZIPF......80 4.2.2ZeroPoissonmodelwithrandomrelativerisk:ZIPR.....80 4.3Numericalstudies.................................81 4.3.1Example1:OverallcomplicationsduetoneedlesinEUS-FNA....81 4.3.2Example2:Rosiglitazonestudy.....................88 4.4SimulationII...................................90 4.5Discussion.....................................93 4.6ProofsofChapter4................................94 BIBLIOGRAPHY ................................. 99 vi LISTOFTABLES Table2.1Type-Ierrorratesfor =0.HereLRT,RM-LRT,BC-LRT,BC- CRTandKRTrepresentlikelihoodratiotest,restrictedmaximum likelihoodratiotest,Bartlett-correctedlikelihoodratiotest, Bartlett-correctedCoxReidadjustedlikelihoodratiotestandKenward- RogerapproximateFtest,respectively.................19 Table2.2Type-Ierrorratesfor 11 =0.HereLRT,RM-LRT,BC-LRT,BC- CRTandKRTrepresentlikelihoodratiotest,restrictedmaximum likelihoodratiotest,Bartlett-correctedlikelihoodratiotest, Bartlett-correctedCoxReidadjustedlikelihoodratiotestandKenward- RogerapproximateFtest,respectively.................19 Table2.3Type-Ierrorratesfor =0.HereLRT,RM-LRT,BC-LRT,BC- CRTandKRTrepresentlikelihoodratiotest,restrictedmaximum likelihoodratiotest,Bartlett-correctedlikelihoodratiotest, Bartlett-correctedCoxReidadjustedlikelihoodratiotestandKenward- RogerapproximateFtest,respectively.................20 Table2.4Type-Ierrorratesfor 11 =0.HereLRT,RM-LRT,BC-LRT,BC- CRTandKRTrepresentlikelihoodratiotest,restrictedmaximum likelihoodratiotest,Bartlett-correctedlikelihoodratiotest, Bartlett-correctedCoxReidadjustedlikelihoodratiotestandKenward- RogerapproximateFtest,respectively.................21 Table2.5Simulatedsizeandsimulatedpowerfor =0.HereLRT,RM-LRT, BC-LRT,BC-CRTandKRTrepresentlikelihoodratiotest,restricted maximumlikelihoodratiotest,Bartlett-correctedlikelihood ratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotest andKenward-RogerapproximateFtest,respectively.........22 Table2.6Type-Ierrorratesfor 11 =0.HereLRT,RM-LRT,BC-LRT,BC- CRTandKRTrepresentlikelihoodratiotest,restrictedmaximum likelihoodratiotest,Bartlett-correctedlikelihoodratiotest, Bartlett-correctedCoxReidadjustedlikelihoodratiotestandKenward- RogerapproximateFtest,respectively.................22 Table2.7Case1..................................24 vii Table2.8Type-Ierrorratesfor =0.HereLRT,RM-LRT,BC-LRT,BC- CRTandKRTrepresentlikelihoodratiotest,restrictedmaximum likelihoodratiotest,Bartlett-correctedlikelihoodratiotest, Bartlett-correctedCoxReidadjustedlikelihoodratiotestandKenward- RogerapproximateFtest,respectively.................24 Table2.9Type-Ierrorratesfor 2 =0.HereLRT,RM-LRT,BC-LRT,BC- CRTandKRTrepresentlikelihoodratiotest,restrictedmaximum likelihoodratiotest,Bartlett-correctedlikelihoodratiotest, Bartlett-correctedCoxReidadjustedlikelihoodratiotestandKenward- RogerapproximateFtest,respectively.................24 Table2.10Type-Ierrorratesfor 2 =0and 3 =0.HereLRT,RM-LRT,BC- LRT,BC-CRTandKRTrepresentlikelihoodratiotest,restricted maximumlikelihoodratiotest,Bartlett-correctedlikelihood ratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotest andKenward-RogerapproximateFtest,respectively.........25 Table2.11Case2..................................25 Table2.12Type-Ierrorratesfor =0.HereLRT,RM-LRT,BC-LRT,BC- CRTandKRTrepresentlikelihoodratiotest,restrictedmaximum likelihoodratiotest,Bartlett-correctedlikelihoodratiotest, Bartlett-correctedCoxReidadjustedlikelihoodratiotestandKenward- RogerapproximateFtest,respectively.................26 Table2.13Type-Ierrorratesfor 2 =0.HereLRT,RM-LRT,BC-LRT,BC- CRTandKRTrepresentlikelihoodratiotest,restrictedmaximum likelihoodratiotest,Bartlett-correctedlikelihoodratiotest, Bartlett-correctedCoxReidadjustedlikelihoodratiotestandKenward- RogerapproximateFtest,respectively.................26 Table2.14Type-Ierrorratesfor 2 =0and 3 =0.HereLRT,RM-LRT,BC- LRT,BC-CRTandKRTrepresentlikelihoodratiotest,restricted maximumlikelihoodratiotest,Bartlett-correctedlikelihood ratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotest andKenward-RogerapproximateFtest,respectively.........27 Table3.1HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0 atthe5%levelofcewhen n =27at40%missingness. HereKRM,LD-LRT,MI-LRTandMI-WTrefertoKenward-Roger adjustmentformissingcovariates,listwisedeletionfollowedbylike- lihoodratiotest,meanimputationfollowedbylikelihoodratiotest andmeanimputationfollowedbyWaldTest..............40 viii Table3.2HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0 atthe5%levelofcewhen n =13at40%missingness. HereKRM,LD-LRT,MI-LRTandMI-WTrefertoKenward-Roger adjustmentformissingcovariates,listwisedeletionfollowedbylike- lihoodratiotest,meanimputationfollowedbylikelihoodratiotest andmeanimputationfollowedbyWaldTest..............40 Table3.3HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0at the5%levelofwhen ˆ = ˙ 2 b =˙ 2 e , m =15and n i =2for all i =1 ;:::; 15.HereKRM,LD-LRTandMI-LRTrefertoKenward- Rogeradjustmentformissingcovariates,listwisedeletionfollowedby likelihoodratiotestandmeanimputationfollowedbylikelihoodratio test.....................................41 Table3.4HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0at the5%levelofwhen ˆ = ˙ 2 b =˙ 2 e , m =30and n i =2for all i =1 ;:::; 30.HereKRM,LD-LRTandMI-LRTrefertoKenward- Rogeradjustmentformissingcovariates,listwisedeletionfollowedby likelihoodratiotestandmeanimputationfollowedbylikelihoodratio test.....................................42 Table3.5HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0at the5%levelofwhen ˆ = ˙ 2 b =˙ 2 e , m =15and n i =2for all i =1 ;:::; 15whenthemissingmechanismisMAR.HereKRM, LD-LRTandMI-LRTrefertoKenward-Rogeradjustmentformissing covariates,listwisedeletionfollowedbylikelihoodratiotestandmean imputationfollowedbylikelihoodratiotest...............43 Table3.6HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0at the5%levelofwhen ˆ = ˙ 2 b =˙ 2 e , m =15and n i =2for all i =1 ;:::; 15whenthepartiallymissing(20%missing)covariate isgeneratedfromheavytaileddistributions.HereKRM,LD-LRT, MI-LRTandMI-KRTrefertoKenward-Rogeradjustmentformissing covariates,listwisedeletionfollowedbylikelihoodratiotestandmean imputationfollowedbylikelihoodratiotest...............45 Table4.1Initialproportionestimatesofthetreatmentgroupswithandwithout continuitycorrection(CC).......................82 Table4.2EstimatesoflogrelativeriskforEUS-FNAstudy...........84 Table4.3Mahalanobisdistances(EUS-FNAstudy)...............87 Table4.4EstimatesoflogrelativeriskforRosiglitazonestudy.........89 ix Table4.5Mahalanobisdistances(Rosiglitazonestudy).............90 Table4.6ComparisonofmethodsonthebasisofMSE,SizeandPoweroftest92 x LISTOFFIGURES Figure4.1Summaryofthe22/25Gdata......................77 xi Chapter1 Introduction Longitudinaldataandrepeatedmeasurementsarisefrequentlyinappliedsciences.Insuch studies,theobservationsarecollectedovertime.Linearmixedmodels(LairdandWare, 1982)arewellsuitedforsuchdata,whichdisplayheterogeneityofresponsestotreatment. Thestatisticaltestproceduresformodelparametersareusuallybasedonlargesampletheory. However,veryoftenlongitudinaldataarisinginmedicaldatainvolveveryfewnumberof subjectsalongwithunbalanceddata.Therefore,itisessentialtoinvestigatetheperformance ofmixedmodelsinthecontextofsmalltomoderatesamples. Inalinearmixedmodel,themodelparametersaretypicallyestimatedusingMLorREML estimates(PattersonandThompson,1977;Harville,1977).Inordertodrawinferenceson theasymptotictheoryisused.Anymethodthatignorestheuncertaintyof theestimatedvariancecomponentintheinferenceoftheparameter willresult ininterval,speciallywhensamplesizeissmall.Therefore,inamixed modelsetup,wehavetwobiasestodealwith:smallsamplebiasandvariancebias.The mostcommonlyusedstatisticaltoolisthelikelihoodratiotest,whichdoesnotaccount foranyoftheabovementionedbiases.Thenulldistributionofthelikelihoodratiotest statisticisapproximatedasa ˜ 2 distributionuptotheapproximation O p ( n 1 ),where n denotesthesamplesize.However,whenthesamplesizeissmall,theinferencesusing thisapproximationcouldbehighlyinaccurate.Bartlett,1937proposedacorrectiontothe likelihoodratiotesttoobtainabetterapproximationtothenulldistribution.Zuckeretal., 1 2000proposedanadjustmenttothelikelihoodratiostatisticincaseofalinearmixedmodel, thenulldistributionofwhichisapproximatedtoa ˜ 2 uptotheapproximationof O p ( n 2 ) andHall,1988),thusaddressingthebiasduetosmallsamplesize. Toaddressthesecondbias,CoxandReid,1987proposedformsofadjustedlikelihoodwhich reducetheofnuisanceparametersontheestimationofedZuckeretal., 2000alsoderivedtheBartlettcorrectionfortheCox-Reidadjustedlikelihoodincaseof linearmixedectsmodel,thusaddressingbothbiases. Theproblemofapproximatingthesmallsampleprecisionof b hasalsobeenstudied byKackar&Harville(1984),Harville(1985)andHarville&Jeske(1992).Inordertoaddress thesecondproblemofunderestimationofvarianceof b ,theestimatorof ,anapproximate toranF-statisticisused.Ingeneral,totestthehypothesis H 0 : L T =0,aWald-typetest statisticisusedwhichisapproximatedasanFdistributionwiththenumeratordegreesof freedomasrank( L )andthedenominatordegreesoffreedombeingestimatedfromthedata. Thedegreesoffreedomswereusedtobeestimatedbytheresidualmethod,containment methodortheSatterthwaite-typeapproximation.LaterKenward&Roger(1997),intheir seminalpaper,proposedatapproachofdealingwiththosetwoproblems.They combinedtheresultsfromKackarandHarville,1984toshowthatthevarianceofestimated canbepartitionedintotwocomponentsandapproximatedsuchthatthevariance biasduetoestimationofvariancecomponentsisaccountedfor.TheyconstructedaWald- typeteststatisticandfurtherreducedthesmallsamplebiasbyapproximatingitsnull distributiontoan F distributionupto O p ( n 5 = 2 ).Theteststatisticandthedenominator degreesoffreedomoftheFdistributionderivedbythemusestheadjustedestimatorofthe covariancematrixoftheestimatedItisaverywidelyusedmethodoftesting insmallsamplesandhasbeenincorporatedinthestatisticalsoftwareSAS. 2 InChapter2,wereviewrentapproachesoftestingnamelylikelihood ratiotestanditsBartlettcorrection,Cox-ReidadjustedlikelihoodratiotestanditsBartlett- correctionandtheKenward-RogerapproximateF-test.Allthesetestsareapproximate teststhataimatobtainingthemostaccuratesolutionbyreducingthetofnuisance parametersandaccountingforsmallsamplebias.Thischapteraddressesthepractical problemofchoosingthemostrobustapproachtotesttheinalinearmixed modelwhenthesamplesizeissmall.Ourfocusisonthecasewhenthecovariancematrix hasalinearstructure.Theparameterofinterestvariesfromasingleparametercaseto amulti-parametercase.Toassesstheperformanceoftheabovemethods,wepresentan extensivesimulationstudyonawiderangeofpracticalproblems. Partiallymissingvariableiscommoninclinicalstudies.Dependingonthemissingness mechanismmissingdatacanbeintothreemaincategories,missingcompletely atrandom(MCAR)wherethemissingnessmechanismiscompletelyrandomanddoesnot dependonanyvariables,missingatrandom(MAR)wheremissingnessmechanismdepends onlyontheobservedvaluesofthevariables,andmissingnotatrandom(MNAR)wherethe missingnessmechanismmaydependontheunobservedvaluesofthevariablesalongwith thecompletelyobservedvariables.Thetwomechanismsareignorableinalikelihood frameworkwhilethethirdisnot.Missingvaluesmayoccurinaresponsevariable,ina covariateorinboth.Thereareseveralmethodsofdealingwithmissingresponsesdepending onthemissingmechanism.Amongthemostcommonmethodsarelistwisedeletionwhere thesubjectwithanymissingvalueisdeletedfromthestudy,meanormultipleimputation andthemaximumlikelihoodmethodthatdonotinvolvemodellingthemissingmechanism whenthemissingmechanismiseitherMCARorMAR.PadillaandAlgina(2004)showed viaasimulationstudythatwhentheresponsesaremissing(MCARorMAR)inasmall 3 samplesetup,Kenward-RogertestpreservestheTypeIerrorratesforasinglefactorwithin- subjectANOVAascomparedtotheHotelling-Lawley-McKeonprocedure(McKeon,1974). Therefore,KenwardRogerFtestisproventobesuperiorevenincaseofmissingresponses. Inseveralaricles,Ibrahimandcoworkersconsideredpartiallymissingresponseorcovariate inregressionmodels(Ibrahim,1990;Ibrahim&Chen,2001;Stubbendick&Ibrahim,2003; Ibrahimetal.,2005;Chenetal.,2008).However,littlehasbeendoneincaseofmissing covariatesinalinearmixedmodel. InChapter3,wedealwithmissingcovariatedata,andassumethatthecovariatedata aremissingatrandom(MAR).Weassumeaparametricdistributiononthepartiallymissing covariate.Underthissetup,wederiveaKenward-Rogertypeadjustedtestforthe inalinearmixedmodelincaseofsmallsamples.Toderiveanimprovedtest,we needtoovercomethreebiases:thevariancebias(suchthatthevariabilityofthevariance componentsistakenintoaccount),thesmallsamplebiasandthebiasduetomissingness. Kenward&Roger(1997)consideredthetwotypesofbiasesintheirseminalpaper forfullyobserveddata.Here,weneedtodevelopasimilartestbyaccountingforthe missingnessinthecovariates.Weconsiderthebiasintheestimationofthecovariance matrixof b andthenproposeanewWaldstatisticthatusesthisnewadjustedcovariance matrix.Further,thisWaldstatisticisapproximatedtoanFdistributionwiththedegrees offreedomcalculatedusingthenewcovariancematrix.Todemonstratetheaccuracyofthis method,wecomparethismethodtothelistwisedeletionandimputationmethodsthrough simulationstudies. InChapter4,welookatapplicationofmodelsinmeta-analysisofrare events.Inmeta-analysis,thestudyspsizessuchasmeanerences,relative risksoroddsratiosarecombinedusingeithertheorrandomectsmodels. 4 Inmodels,thestudyspareassumedtobeidenticalacrossstudies; however,inrandommodelsthisassumptionisrelaxed.Usingthelargesampletheory, DerSimonianandLaird,1986proposedarandommodelforestimatingthemeanof therandomdistribution.However,therandommodelbasedonlargesample theorydoesnotworkwellwhenthestudyoutcomesacrossstudieshavelowincidencesor zeroevents.Incaseofzeroevents,MantelandHaenszelmethod,1959requireszero-cell0.5 correctionstoobtaintheestimatesofthesize.Petomethod,1985doesnotrequire anysuchcorrections;howevertrialswithnoeventsinbotharmsarenotgivenanyweight andarehenceexcludedfromtheanalysis.Forthedetailswerecommendthediscussions fromBradburn etal. ,2007andLane,2013. Aplethoraofliteratureisavailablethatdiscussestmethodsapplicableformeta- analysis;however,veryfewdiscussthemethodsapplicableforrareeventsdata(seeBradburn etal. ,2007;DanielsandGatsonis,1999;Lane,2013.ForthemodelTian et al. ,2009proposedanexactapproach.UsingPoissonrandommodel,Cai,Parastand Ryan,2010proposedanelegantapproachtocalculatethesizeforrareeventdata incaseofrandomizedtrials.However,themodelsproposedbythemareunabletohandle casesbeyondacertainthresholdofoccurrenceofzeroevents.Theyalsorequirethetrials usedintheiranalysistohavetwogroupspertrial.Inthischapter,ourmaincontribution istoproposezero-iPoissonmodels.Furthermore,ourmodelswillalsobecapableof analysingtrialswherebothgroupsarenotavailableforeachtrial.Weshowthatourmodels arehighlysuitedforthecaseofextremelyrareeventsandillustratetheiradvantagesviaa comprehensivesimulationstudy. 5 Chapter2 Smallsampleinferencesinlinear mixedmodels Weconsiderthefollowinglinearmixedmodelwiththeresponsevector Y .Thedesignmatrix fortheisdenotedby X whilethedesignmatrixfortherandomisdenoted by Z .Themodelerrorvectorwillbedenotedby : Theresponsevectorisrelatedto X;Z; bythefollowingrelation, Y = X + Z b + : (2.0.1) Inthisrelation, =( 1 ;:::; p ) 0 2 R p isthevectorof b =( b 1 ;:::;b q ) 0 ˘ N q (0 ;G )isaq-dimensionalrandomvectorwhere G involves q ( q +1) = 2unknown parameters,and =( 1 ;:::; n ; ) 0 isann-dimensionalerrorvector,theelementsofwhichare assumedtobei.i.dnormalwithmean0andunknownvariance ˙ 2 e ,i.e. ˘N n (0 ;˙ 2 e ) : Furthermore, b and areassumedtobeindependent.Themodel(2.0.1)canberewritten asfollows Y = X + ; (2.0.2) 6 where ˘N n (0 ; ˙ )), ˙ )= ZGZ 0 + ˙ 2 e and ˙ isan r = q ( q +1) = 2+1dimensional vectorofvariance-covarianceparameters. Inthefollowing,ourobjectiveistotestthehypothesis H 0 : L 0 =0 ; (2.0.3) where L 0 isany l p matrixand isasnedin(2.0.1).Weconsiderthecasewhen islinearin ˙ .If ˙ isknownandthentheproblemistrivialsincethemodelreduces toalinearmodel.However,inthecaseoflinearmixedmodels, ˙ isunknownandhasto beestimated.Inthefollowing,wediscussvarioustestprocedureforthehypothesis(2.0.3) whicharepopularlyusedintheliterature. 2.1Likelihoodratiotest(LRT) ThemostwidelyusedmethodfortestingisthelikelihoodratiotestusingMLestimation.The maximumlikelihoodestimatesoftheparametersofinterestarecomputedbymaximizingthe likelihoodwhichisbelow.Itfollowsfromclassicallikelihoodtheory,thatunder standardregularityconditions,thetestingoftsiscarriedoutbyapproximating thelikelihoodratiotoachi-squaredistribution(CoxandHinkley,1990)Webeginwiththe likelihoodofthemodel(2.0.2),whichcanbewrittenasfollows L ( ; ˙ )= 1 2 ˇ n= 2 j ˙ ) j 1 = 2 exp ˆ 1 2 ( Y X ) 0 ˙ ) 1 ( Y X ) ˙ : 7 Hence,theloglikelihoodisgivenby l ( ; ˙ )= 1 2 n log j ˙ ) j +( Y X ) 0 ˙ ) 1 ( Y X ) o : (2.1.4) Bymaximizing(2.1.4),weobtaintheestimateof intermsof ˙ asfollows, ~ ( ˙ )=( X 0 ˙ ) 1 X ) 1 X 0 ˙ ) 1 Y : Thenthelikelihoodisgivenby l p ( ˙ )= l ( ~ ; ˙ ) = 1 2 n log j ˙ ) j +( Y X ~ ) 0 ˙ ) 1 ( Y X ~ ) o : Maximizing l p w.r.t ˙ givesthemaximumlikelihoodestimateof ˙ as ^ ˙ ML : Substitutingit backin ~ ( ˙ )weobtainthemaximumlikelihoodestimateof as ^ = ~ ( ˙ ) ˙ = ^ ˙ ML . Theteststatisticfortesting(2.0.3)isgivenby, LR =2[ l ( ^ ; ^ ˙ ML ) l ( ^ 0 ; ^ ˙ 0 )] ; (2.1.5) where( ^ 0 ; ^ ˙ 0 )=argmax 2 H 0 ˙ 2 l ( ;˙ ) : Understandardregularityconditionsand(2.0.3),theLRstatisticconvergesindistribu- tiontoa ˜ 2 l distributionand E ( LR )= l + O ( n 1 ) 8 where l =rank(L).Therefore,therejectionregionisgivenby LR>˜ 2 1 ;l : (2.1.6) 2.2Restrictedmaximumlikelihoodratiotest(RM- LRT) Restrictedmaximumlikelihood(REML)(PattersonandThompson,1971)iswellestablished asamethodforestimatingtheparametersofalinearmixedmodel.Incontrasttoearlier maximumlikelihoodestimation,REMLcanproduceunbiasedestimatesofvarianceand covarianceparameters.ThetestingprocedureusingREMLisdescribedinWelhamand Thompson,1997.Herethemarginalloglikelihoodisusedtoestimate ˙ whichisgivenby, l R ( ˙ )=log Z L ( ; ˙ ) d ; where Z L ( ; ˙ ) d = Z 1 (2 ˇ ) n= 2 j ˙ ) j 1 = 2 exp 1 2 ( Y X ) 0 ˙ ) 1 ( Y X ) d : (2.2.7) Therefore,therestrictedmaximumlikelihoodisgivenby, l R ( ˙ )= 1 2 [log j j +( Y X ~ ) 0 1 ( Y X ~ )] 1 2 log j X 0 1 X j + const: = l p ( ˙ ) 1 2 log j A j + const: (2.2.8) 9 Wecanmaximize(2.2.8)toobtaintheREMLestimateof ˙ whichisdenotedby ^ ˙ REML . Theestimateof isgivenby ^ =( X 0 1 ( ^ ˙ ) X ) 1 X 0 1 ( ^ ˙ ) Y : (2.2.9) TheREMLlikelihoodunder H 0 isgivenby l R 0 ( ˙ )= 1 2 [log j j +( Y X 0 ~ 0 ) 0 1 ( Y X 0 ~ 0 )] 1 2 log j X 0 0 1 X 0 j + const: (2.2.10) where X 0 and 0 areportionsof X and suchthat H 0 issaIfweconsiderthe enceofthetwolog-likelihoods(2.2.8)and(2.2.10),thereisnocommonbasisofcomparison. ThereforeWelhamandThompson,1997proposedatestbythemodelunderthenull asusual,andthereafterlookingatthechangeinthelog-likelihoodwhenthefullmodelis i.e. l R 1 ( ˙ )= 1 2 [log j j +( Y X ~ ) 0 1 ( Y X ~ )] 1 2 log j X 0 0 1 X 0 j + const: (2.2.11) Therefore,theteststatisticbasedontheREMLisgivenby LR REML = 2( l R 0 ( ˙ ) l R 1 ( ˙ ))andunderstandardregularityconditions, LR REML canbeapproximatedtoa ˜ 2 l distribution. Although,theREMLbasedteststatisticisalsoapproximatedasa ˜ 2 distributionupto theorderof n 1 ,butitprovidesunbiasedestimatesofthecovarianceparameters,resulting inbetterinferenceascomparedtoLRT. 10 2.3Bartlettcorrectedlikelihoodratiotests 2.3.1Bartlett-correctedlikelihoodratiotest(BC-LRT) The LR statistic(2.1.5)isdistributedas ˜ 2 toorder O ( n 1 ).Itisknownthatthe orderapproximationdoesnotworkwellforsmallsamplesizes.Soinordertoachievehigher accuracy,Bartlett,1937proposedmultiplyingthe LR statisticbyaconstant,thusgiving theBartlettcorrectedteststatistic, LR = LR 1+ C=l ; (2.3.12) whereLRisgivenin(2.1.5), l isthenumberoftobetestedand C isaconstant oforder n 1 suchthat E ( LR )= l + O ( n 3 = 2 ) : InordertocalculateC,reparametrizationisrequiredsuchthattheparametersofinterest areorthogonaltotheremainingparametersi.e.theexpectedmixedpartialderivativeofthe likelihoodwithrespecttotheparameterofinterestandanyoneofthenuisanceparameters iszero.Totestforthesome =( 1 ; 2 ; ; l )i.e. H 0 : = 0 against H 1 : 6 = 0 ,areparametrizationisusedinwhichtheparametersofinterest andthe nuisanceparametersareorthogonal.Let ~ X l denotethe l columnsofXand ~ X n l be therestofthecolumns.Afterreparametrization,theexpressionforCisgivenby C = tr ( D 1 ( 1 2 M + 1 4 P 1 2 ( + ) ˝ 0 )) : 11 Wheretheelementsof D;M and P aregivenby, D = 1 2 tr ( _ j _ k ) ; M = tr ( ~ ~ X l 0 _ j ~ ~ X l ) 1 ( ~ ~ X l 0 jk ~ ~ X l +2 _ ~ X l 0 _ j ~ ~ X l ) ; P = tr ( ~ ~ X l 0 _ j ~ ~ X l )( ~ ~ X l 0 1 ~ ~ X l ) 1 ( ~ ~ X l 0 _ k ~ ~ X l )( ~ ~ X l 0 1 ~ ~ X l ) 1 ; and ˝ , and arevectorswhose j th elementsaregivenby tr (( ~ ~ X l 0 1 ~ ~ X l ) 1 ( ~ ~ X l 0 _ j ~ ~ X l )), tr ( D 1 A ( j ) )and tr (( ~ X 0 n l 1 ~ X n l ) 1 ( ~ X 0 n l _ j ~ X n l ))respectively.Intheaboveexpres- sions, _ j = @ @ ˙ j ; _ j = @ 1 @ ˙ j = 1 _ j 1 ; jk = @ 2 @˙ j @˙ k ; jk = @ 2 1 @˙ j @˙ k = 2 _ k _ j 1 1 jk 1 ; ~ ~ X l =[ I ~ X n l ( ~ X 0 n l 1 ~ X n l ) 1 ~ X 0 n l 1 ] ~ X l and _ ~ X l = @ ~ ~ X l @˙ j : Ithasbeenshownthat LR is ˜ 2 l distributed,uptoanerroroforder n 2 in NeilsenandHall 6 .Therefore,itisexpectedtobemoreaccuratethanlikelihoodratio testespeciallyforsmallsamples. 2.3.2Bartlett-correctedCox-Reidlikelihoodratiotest (BC-CRT) BartlettcorrectedLRTtakescareofthebiasduetosmallsamplesizetosomeextent. However,inthelinearmixedmodel,speciallyinsmallsamples,inferenceforthe ishighlybytheestimationofnuisanceparameters.Toreducetheof 12 nuisanceparameters,CoxandReid 7 proposedanadjustedformofthelikelihood. Atransformationoftheparametervectorisrequiredsuchthattheparametersofinterest areorthogonaltothenuisanceparameters.Let denotethe( l 1)vectorofparameterof interestand ˚ bethetransformednuisanceparametervector.TheformofCox-Reidadjusted log-likelihoodisgivenby l CR ( )= l ( ; ^ ˚ ( )) 1 2 log fj l ˚˚ ( ^ ˚ ( )) jg where ^ ˚ ( )isthemaximumlikelihoodestimatorof ˚ foravalueof and l ˚˚ isthe matrixofsecondderivativesof l withrespectto ˚ .Whentestingfortheentiree vectorinalinearmixedmodels,theCox-Reidadjustedlog-likelihoodisthesameasthe REMLlog-likelihoodgivenin(2.2.8).TheCox-Reidadjustedlikelihoodratiostatisticfor testing H 0 : = 0 = 0 isgivenby LR CR =2 f l CR ( ~ ) l CR ( 0 ) g : (2.3.13) Understandardregularityconditions,the LR CR convergesindistributionto ˜ 2 l uptoan errorof O ( n 1 ).DiCiccioandStern,1994gavethegeneralBartlettcorrectedlikelihoodfor Cox-ReidlikelihoodandZuckeretal.,2000workedouttheBartlettcorrectedteststatistic fortesting H 0 : = 0 incaseoflinearmixedmodels,whichisgivenby LR ? CR = LR CR 1+ C ? =p ; (2.3.14) 13 where C ? = tr ( D 1 ( M + 1 4 P + ? ˝ 0 )) : HereD,M,Pand ˝ aresameasgivenassection2.3.1andthe j th elementof ? isgivenby tr ( D 1 tr ( _ k _ j 1 _ l )). ThedetailsoftheproofcanbefoundinMeloetal.,2009andZuckeretal.,2000. The LR ? CR isapproximatedas ˜ 2 l distributeduptoanerrorof O ( n 2 ).ThereforeBartlett correctedCox-Reidlikelihoodratiotestaddressesbothissues;itreducestheofnuisance parameterswhiletakingcareofsmallsamplebias. 2.4Kenward-RogerapproximateF-test(KRT) Theabovefourmethodswerebasedonthelikelihoodratioteststatistic.Kenwardand Roger,1997proposeaWald-typeteststatistic.TheyusetheREMLestimationtoestimate thecovarianceparameters,describedinsection(2.2).Theestimateof thatusestheREML estimateofisgivenby(2.2.9).Tomakeinferenceforthe ,thecovariance matrixofitsasymptoticdistributionisusedinaWald-typeteststatistic.Howeverthe variabilityintheestimateofisnottakenintoaccount.Thishasseriousrepercussions insmallsamplestudiesinthesensethatitcanimpacttheprecisionofthe tly.KenwardandRoger,1997usedanestimatorofthecovariancematrixof ^ whichisadjustedforsmallsamplebias.Theysuggestedthatthevariabilityof ^ canbe partitionedintotwocomponents, V [ ^ ]= A =+ ; 14 where ˙ )isthecovariancematrixoftheasymptoticdistributionof ^ ,theestimateof whichisgivenby ^ =( X 0 1 ( ^ ˙ ) X ) 1 andaccountsforthevariabilityin ^ ˙ .Let ˙ =( ˙ 1 ;:::;˙ r ) 0 .KackarandHarville,1984showthatthevalueofcanbeapproximated by = 0 @ r X i =1 r X j =1 w ij ( Q ij P i P j ) 1 A + O ( n 5 = 2 )(2.4.15) where P i = X 0 @ 1 @˙ i X = X 0 _ i X and Q ij = X 0 1 @˙ i 1 @˙ j X = X 0 _ i _ j X and w ij isthe( i;j ) th elementof V ( ^ ˙ ).Thebiasin ^ isapproximatedusingaTaylorseries expansionabout ^ ˙ whichgives E [ ^ =+ 1 2 r X i =1 r X j =1 @ 2 @˙ i @˙ j + O ( n 5 = 2 ) ˇ + 0 @ r X i =1 r X j =1 w ij ( P i P j Q ij ) 1 A + 1 2 r X i =1 r X j =1 w ij R ij (2.4.16) Hence,using2.4.15and2.4.16,thenewestimatorofthecovariancematrixof ^ isgivenby ^ A = ^ +2 ^ 2 4 r X i =1 r X j =1 ^ w ij ( ^ Q ij ^ P i ^ ^ P j 1 4 ^ R ij ) 3 5 ^ where E [ ^ A ]= Var ( ^ )+ O ( n 5 = 2 ) Further,todrawinferencesabouted l linearcombinationsof( ): , L being a( l p )xedmatrix,KenwardandRogerapproximatedthedistributionoftheWald-type teststatisticwhichusesthenewadjustedcovariancematrix.TheWaldtypepivotisgiven 15 by F = 1 l ( ^ ) 0 L ( L 0 ^ A L ) 1 L 0 ( ^ ).WiththehelpofaTaylorseriesexpansion,the expectedvalueandthevarianceofthisteststatisticisgivenby E [ F ]=1+ A 2 l + O ( n 3 = 2 ) Var [ F ]= 2 l (1+ B )+ O ( n 3 = 2 ) ; where B = 1 2 l ( A 1 +6 A 2 ) ; A 1 = r X i =1 r X j =1 w ij tr ( ^ ^ ^ P i ^ tr ( ^ ^ ^ P j ^ ;A 2 = r X i =1 r X j =1 w ij tr ( ^ ^ ^ P i ^ ^ ^ ^ P j ^ for= L ( L 0 ^ L ) 1 L 0 : (2.4.17) Inordertoestimatethedenominatordegreesoffreedom( m )andthescalefactor( ), suchthat F ? = ˘ F ( l;m ),thetwomomentsof F ? arematchedtothoseofthe approximatingFdistribution.Thereforethefollowingresultsareobtained, m =4+ 2+ l lˆ 1 where ˆ = V [ F ] 2 E [ F ] 2 ; and = m E [ F ]( m 2) : ThedetailedproofscanbefoundinKenwardandRoger,1997andAlnosaier,2007.The overallprocedurecanbeappliedtoconstructtestsandintervalsford OncomparisonoftheBartlettcorrectedLRTandtheKenwardRogertest,weobserve thattheBartlettcorrectedtestusesthetraditionallikelihoodratiotestandapproximates 16 ittoa ˜ 2 approximationwhichisrescaledusingitsexpectedvalue.Ontheotherhand, Kenward-RogertestusesaWald-typestatisticandapproximatesittoanFdistribution whilemakinguseofthenewadjustedcovariancematrix.Closelyobservingthetwoformulas weget, E ( LR )=1+ C=l + O ( n 3 = 2 ) ; E ( F )=1+ A 2 =l + O ( n 3 = 2 ) whereboth C and A 2 aretraceofmatricesconstructedusingandsecondorderderiva- tivesof 2.5SimulationStudy Theperformanceofthelikelihoodratiotest(LRT),restrictedmaximumlikelihood ratiotest(RM-LRT),Bartlett-correctedlikelihoodratiotest(BC-LRT),Bartlett- correctedCoxReidadjustedlikelihoodratiotest(BC-CRT)andKenward-Rogerapproxi- mateFtest(KRT)intestingthedofalinearmixedmodelisassessedthrough simulationstudiesonthebasisofType-Ierrorrates.Wehaveconductedsimulationstudies fortwoentsettings.Inboththesettings,thecovariancematrixofrandomhas alinearstructureandsodoesIneachstudy,wewillrunthesimulationsfortwocases: (i)testingforthewholevectorof(ii)testingforasingleForeach case,5000datasetsaregenerated. 17 2.5.1StudyI:longitudinalstudy Thesimulationstudyisalongitudinalstudywherethedataconsistsofagrowthdata for11girlsand16boys.Measurementsaretakenoverfourtimeintervals(ageoftheperson): 8,10,12and14.ThedatasetupistakenfromVerbekeandMolenberghs,2000.Let x i denotetheindicatorvariableforsexand t j bethetimepointsatwhichtheobservationis recorded.Themodelcanbeexpressedas Y ij = 0 + 01 x i + 10 t j + 11 t j x i + b 0 + b 1 t j + ij : Inthematrixnotation Y i = X i + Z i b + i ,thedesignmatrix X i andthe designmatrixcanbewrittenas X i = 0 B B B B B B B B B @ 1 x i 88 x i 1 x i 1010 x i 1 x i 1212 x i 1 x i 1414 x i 1 C C C C C C C C C A Z i = 0 B B B B B B B B B @ 18 110 112 114 1 C C C C C C C C C A Themeasurementerrorstructure i = ˙ 2 I 4 .Weassumeanunstructuredcovariance matrixGfortherandom b i givenby G = 0 B @ ! 1 ! 2 ! 2 ! 3 1 C A Therefore,theresultingcovariancematrixof Y i isgivenby Z i GZ 0 i + ˙ 2 I 4 ,hencethecovariance matrixhasalinearstructure.Wefocuson ,thefullvectorofandon 11 ,the 18 ofgenderonslope.Tosimulatedata,weassume x i =1forgirlsand x i =0forboys. Wekeep ˙ 2 at0.05and ! 1 at1.Weconsidermultiplecasesbychoosingt valuesof ! 2 and ! 3 rangingfrom0to1.Foreachsetting,5000datasetsaregenerated. Wecomparetheetestproceduresonthebasisofsimulatedsizebasedona0.05nominal level.Table2.1givestheType-Ierrorratesofthetteststatisticswhentestingfor H 0 : =0whileTable2.2givesthesamefor H 0 : 11 =0. Table2.1Type-Ierrorratesfor =0.HereLRT,RM-LRT,BC-LRT,BC-CRTandKRT representlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett-corrected likelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotestand Kenward-RogerapproximateFtest,respectively. ! 2 ! 3 LRT RM-LRT BC-LRT BC-CRT KRT 00.5 0.075 0.071 0.055 0.053 0.053 01 0.089 0.074 0.072 0.062 0.056 0.250.5 0.060 0.057 0.048 0.046 0.049 0.251 0.068 0.065 0.052 0.049 0.047 Table2.2Type-Ierrorratesfor 11 =0.HereLRT,RM-LRT,BC-LRT,BC-CRTandKRT representlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett-corrected likelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotestand Kenward-RogerapproximateFtest,respectively. ! 2 ! 3 LRT RM-LRT BC-LRT BC-CRT KRT 00.5 0.065 0.060 0.047 0.045 0.053 01 0.056 0.052 0.045 0.044 0.046 0.250.5 0.057 0.054 0.042 0.040 0.047 0.251 0.060 0.061 0.058 0.038 0.049 BylookingatTables2.1and2.2,weobservethatingeneralalltestingproceduresperform 19 betterwhentestingforsingleparameterascomparedtoamulti-parametercase.Also,while testing H 0 : =0,forallvaluesof ! 2 and ! 3 ,BartlettCorrectedCox-Reidadjusted likelihoodratiotestandKenward-Rogertestpreservethesizeofthetestat0.05nominal levelbestascomparedtotheotherthreemethods.Fortestingofasingleparameteri.e. 11 =0,KRTpreservesthenominallevelwhileBC-CRTfailstodosowhen ! 2 6 =0. Toinvestigatefurther,wealsoconsiderasetupofsmallersizei.e.n=14.Arandom sampleofsizefourteenisdrawnfromtheoriginaldatasettothegenderandobservation times.Again,simulatedsizeiscalculatedbasedon0.05nominallevel.Tables2.3and2.4 givethecomparisonoftheetestprocedures. Table2.3Type-Ierrorratesfor =0.HereLRT,RM-LRT,BC-LRT,BC-CRTandKRT representlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett-corrected likelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotestand Kenward-RogerapproximateFtest,respectively. ! 2 ! 3 LRT RM-LRT BC-LRT BC-CRT KRT 00.5 0.121 0.090 0.057 0.053 0.050 01 0.118 0.098 0.077 0.058 0.056 0.250.5 0.102 0.076 0.063 0.055 0.056 0.251 0.096 0.084 0.063 0.056 0.049 20 Table2.4Type-Ierrorratesfor 11 =0.HereLRT,RM-LRT,BC-LRT,BC-CRTandKRT representlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett-corrected likelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotestand Kenward-RogerapproximateFtest,respectively. ! 2 ! 3 LRT RM-LRT BC-LRT BC-CRT KRT 00.5 0.080 0.067 0.054 0.053 0.051 01 0.075 0.069 0.059 0.055 0.052 0.250.5 0.082 0.078 0.060 0.056 0.056 0.251 0.070 0.065 0.074 0.055 0.048 Duetothedecreaseinsamplesize,theBartlettcorrectionsindicateamuchbetterim- provementascomparedtothestandardlikelihoodratiotests.However,theperformanceof alltestingproceduresexceptKRTdeteriorateintermsofpreservationofType-Ierrorrates. ThesimulatedsizeofKRTisstillveryclosetothenominallevel. Next,weconsideranunbalancedlongitudinalstudy.LittleandRubin,1987deleted9 observationsfromthedatasetresultingin9(outof27)incompletesubjects.Deletionis tomeasurementstakenatage10.So,werepeatthesimulationstudyforincomplete data.Table2.5andTable2.6givetheType-Ierrorratesofthetestingprocedures. 21 Table2.5Simulatedsizeandsimulatedpowerfor =0.HereLRT,RM-LRT,BC-LRT, BC-CRTandKRTrepresentlikelihoodratiotest,restrictedmaximumlikelihoodratiotest, Bartlett-correctedlikelihoodratiotest,Bartlett-correctedCoxReidadjustedlikeli- hoodratiotestandKenward-RogerapproximateFtest,respectively. ! 2 ! 3 LRT RM-LRT BC-LRT BC-CRT KRT 00.5 0.061 0.055 0.047 0.045 0.046 01 0.090 0.080 0.073 0.061 0.054 0.250.5 0.071 0.062 0.050 0.048 0.048 0.251 0.076 0.068 0.057 0.053 0.051 Table2.6Type-Ierrorratesfor 11 =0.HereLRT,RM-LRT,BC-LRT,BC-CRTandKRT representlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett-corrected likelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotestand Kenward-RogerapproximateFtest,respectively. ! 2 ! 3 LRT RM-LRT BC-LRT BC-CRT KRT 00.5 0.067 0.061 0.039 0.038 0.046 01 0.063 0.057 0.049 0.046 0.052 0.250.5 0.065 0.060 0.058 0.054 0.048 0.251 0.066 0.061 0.050 0.046 0.054 TheobservationremainsthesamethatforamultiparametercasebothKRTandBC- CRTseemtobeworkequallywellforallvaluesof ! 2 and ! 3 ,butforasingleparametercase KRTseemstobethebestchoice.Again,fromtheincompletedataset,fourteensubjects arechosenatrandom.Thereforewehaveafewincompletesubjectsinthedata.Asimilar simulationstudywasconductedandtheresultsweresimilartothoseobtainedintheprevious cases(notshownhereforthesakeofbrevity).Therefore,foralongitudinalstudy(balanced orunbalanced)withsmallsamplesizes,werecommendtheapproximateF-testproposedby 22 KenwardandRoger. 2.5.2StudyII:balancedincompleteblockdesign Next,followingasimilarsimulationsetupasinAlnosaier,2007,weconsiderabalanced incompleteblockdesignwherewehave n observationsinall: q blocks, p treatmentsand s treatmentsperblock.Themodeloftheblockdesignisgivenby y ij = + i + b j + e ij (2.5.18) for i =2 ;:::;p , j =1 ;:::;q ,where isthegeneralmean, i arethetreatment oftreatmentsfromtreatment1), b j aretheblock b j ˘ N (0 ;˙ 2 b ), e ij ˘ N (0 ;˙ 2 e ),and b j 'sand e ij 'sareindependent.Thecovariancematrixof Y i isgivenby i = ˙ 2 e I s + ˙ 2 b 1 s where 1 s isan s dimensionalvectorof1's.Ourmainobjectiveistotestthe followinghypothesis(i) H 0 : 2 = 3 = = p =0and(ii) H 0 : 2 =0.Tocomputethe simulatedsizeoftests,wegeneratethedataunderthenullhypothesisi.e.assumingthere isnocanttreatment5000timesandapplyallemethodsoftestingonit.We alsocalculatethesimulatedpowerforalltheetests.Wechooseditvaluesof ˆ = ˙ 2 b ˙ 2 e rangingfrom0.25to4. Case1:p=6,q=10,n=30 Thedatasetupisgivenasfollows 23 Table2.7Case1 Block Treat. Block Treat. Block Treat. Block Treat. Block Treat. 1 1,2,5 3 1,3,4 5 1,4,5 7 2,3,5 9 3,5,6 2 1,2,6 4 1,3,6 6 2,3,4 8 2,4,6 10 4,5,6 Thesimulationresultsareasfollows: Table2.8Type-Ierrorratesfor =0.HereLRT,RM-LRT,BC-LRT,BC-CRTandKRT representlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett-corrected likelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotestand Kenward-RogerapproximateFtest,respectively. ˆ LRT RM-LRT BC-LRT BC-CRT KRT 0.25 0.105 0.091 0.067 0.062 0.049 0.5 0.104 0.085 0.069 0.067 0.048 1 0.105 0.088 0.075 0.069 0.051 2 0.098 0.080 0.067 0.065 0.046 4 0.108 0.087 0.075 0.071 0.051 Table2.9Type-Ierrorratesfor 2 =0.HereLRT,RM-LRT,BC-LRT,BC-CRTandKRT representlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett-corrected likelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotestand Kenward-RogerapproximateFtest,respectively. ˆ LRT RM-LRT BC-LRT BC-CRT KRT 0.25 0.087 0.079 0.055 0.053 0.044 0.5 0.088 0.078 0.056 0.052 0.049 1 0.089 0.076 0.048 0.045 0.048 2 0.092 0.079 0.048 0.047 0.048 4 0.096 0.085 0.053 0.049 0.052 24 Table2.10Type-Ierrorratesfor 2 =0and 3 =0.HereLRT,RM-LRT,BC-LRT,BC-CRT andKRTrepresentlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett- correctedlikelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratio testandKenward-RogerapproximateFtest,respectively. ˆ LRT RM-LRT BC-LRT BC-CRT KRT 0.25 0.120 0.105 0.074 0.069 0.060 0.5 0.120 0.103 0.069 0.064 0.057 1 0.104 0.086 0.048 0.040 0.050 2 0.115 0.095 0.056 0.046 0.053 4 0.120 0.102 0.060 0.049 0.058 FromtheresultsinTables2.8,2.9and2.10weobservethatLRTandRM-LRTdoa verypoorjobonpreservingtheType-Ierrorrates,sincetheesamplesizeisonlyten. KRTcontinuestopreservethesizeveryaccuratelyforallvaluesof ˆ . Asimilarsimulationstudyisconductedforsixblocksandsixtreatments,threetreat- mentsperblock. Case2:n=18,p=6,q=6 Thedatasetupisasfollows Table2.11Case2 Block Treat. Block Treat. 1 1,2,5 4 2,3,4 2 1,2,6 5 3,5,6 3 1,3,4 6 4,5,6 25 ThesimulatedsizeofthetestsisgiveninTables2.12,2.13and2.14. Table2.12Type-Ierrorratesfor =0.HereLRT,RM-LRT,BC-LRT,BC-CRTandKRT representlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett-corrected likelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotestand Kenward-RogerapproximateFtest,respectively. ˆ LRT RM-LRT BC-LRT BC-CRT KRT 0.25 0.183 0.143 0.108 0.085 0.053 0.5 0.184 0.142 0.113 0.087 0.053 1 0.176 0.131 0.107 0.092 0.054 2 0.175 0.126 0.105 0.095 0.047 4 0.174 0.121 0.105 0.094 0.050 Table2.13Type-Ierrorratesfor 2 =0.HereLRT,RM-LRT,BC-LRT,BC-CRTandKRT representlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett-corrected likelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratiotestand Kenward-RogerapproximateFtest,respectively. ˆ LRT RM-LRT BC-LRT BC-CRT KRT 0.25 0.138 0.123 0.082 0.071 0.052 0.5 0.133 0.113 0.069 0.055 0.048 1 0.145 0.125 0.067 0.060 0.051 2 0.139 0.114 0.060 0.053 0.046 4 0.136 0.115 0.065 0.059 0.049 26 Table2.14Type-Ierrorratesfor 2 =0and 3 =0.HereLRT,RM-LRT,BC-LRT,BC-CRT andKRTrepresentlikelihoodratiotest,restrictedmaximumlikelihoodratiotest,Bartlett- correctedlikelihoodratiotest,Bartlett-correctedCoxReidadjustedlikelihoodratio testandKenward-RogerapproximateFtest,respectively. ˆ LRT RM-LRT BC-LRT BC-CRT KRT 0.25 0.138 0.123 0.082 0.071 0.052 0.5 0.133 0.113 0.069 0.055 0.048 1 0.145 0.125 0.067 0.060 0.051 2 0.139 0.114 0.060 0.053 0.046 4 0.136 0.115 0.065 0.059 0.049 Intheabovecase,theesamplesizeisonlysix,thusithasatimpacton theperformanceofalltests.KRTstillmanagestopreservethenominallevel. FromtheabovesimulationstudyweobservethatLRTandRM-LRTarenotwellsuited forasmallsamplestudy.Theoretically,wewouldexpecttheKenward-Rogermethodto performbetterthantheBartlettcorrectedtests,intermsofpreservingthesizeofthe test.Thisisbecause,Bartlettcorrectedtestsrescalethe ˜ 2 distributionusingtheexpected value.Thiswouldimplythatthedistributionwillbemoreaccurateinthemedianregion ascomparedtothetailregion.Ontheotherhand,KRmethodusesanFdistributionwith suitabledegreesoffreedomsuchthatthesecondmomentofthestatisticsmatchaswell; henceprovidingagoodinthetailregions.Thisfacthasbeeninthesimulation studyresultsaswell. 27 2.6Realdataexample Weusethemethodsdescribedonastudyof20preadolescentgirlsreportedbyGoldstein, 1979,whoseheightaremeasuredonayearlybasisfromage6to10.Thegirlsare accordingtotheheightoftheirmotherwhichisadiscretevariablegroupedfrom1(shortin height)to3(tallinheight).Themeasurementsforthegirlwerereportedas114.5,112, 126.4,131.2and135.Thesecondmeasurementseemsinaccurateandhasbeenreplacedby 122asdoneinVerbekeandMolensberghs,2000.Overallthereare100measurements.We usethefollowingmodel Y ij = 0 + 1 t ij + 20 G 2 i + 30 G 3 i + b 0 i + b 1 i t ij + ij with i =1 ; 2 ; ; 20and j =1 ; ; 4,where Y ij istheheightofgirls, t ij isthe j th timepoint atwhich i th subject'sheightwasrecorded, G 2 i isadummyvariableindicatingthatmother's heightisingroup2and G 3 i istheindicatorthatmother'sheightisingroup3.Assuming ( b 0 i ;b 1 i ) 0 iid ˘ N 2 (0 ;D ),whereDisthecovariancematrixand ij iid ˘ N (0 ;˙ 2 e )independentof b i . Thegoalofthestudyistoinvestigatetheectofmother'sheightontheheightofthe girl,i.e.wewanttotestthefollowinghypothesis: H 0 :( 20 ; 30 ) 0 =0against H 0 :( 20 ; 30 ) 0 6 =0(2.6.19) Theteststatisticandp-valuefortestingtheabovehypothesis2.6.19forthettest proceduresare(i)LRT:8.658(p-value=0.0132),(ii)RM-LRT:8.496(p-value=0.0143) (iii)BC-LRT:7.273(p-value=0.0263)(iv)BC-CRT:8.611(p-value=0.0137)and(v)KRT: 28 8.225(p-value:0.0032).Thep-valuesfor(i)-(iv)werebasedona ˜ 2 2 distributionwhile(v) wasbasedontheF-distribution.Alltestsyieldthesameresultthatthenullhypothesisis rejectedat5%leveli.e.theheightofmotherhasatimpactonthe heightofgirls. However,inthehopeofdrawingthesameconclusionfromasubsetofthisdataset,we randomlysampled15subjectsfromthedataset(75overallmeasurements),andusedallthe etestingproceduresagaintotestthehypothesisgivenin(2.6.19).Weobtainthefollowing teststatisticsandp-valuesforthetests:(i)LRT:5.576(p-value=0.0616),(ii)RM-LRT: 5.347(p-value=0.0690)(iii)BC-LRT:4.544(p-value=0.103)(iv)BC-CRT:5.416(p-value= 0.0666)and(v)KRT:4.576(p-value:0.0333).Fromtheaboveresults,onlyKRTrejectsthe nullhypothesisat5%level. Therefore,weconcludethattheheightofthemotherhasacantimpactonthe heightofgirls.ThisconclusionisconsistentwiththeresultsobtainedbyVerbekeand Molenberghs,1997. 29 Chapter3 Kenward-Rogerapproximationfor linearmixedmodelswithmissing covariates Consideralinearmixedmodelwith m groupsand n i measurementsinthe i th group.Denote n = P m i =1 n i thetotalnumberofobservations.Foreachgroup,weobservean n i 1vector ofresponses Y i ,let X i bean n i p designmatrixwhosecolumnisavector ofonestoaccountfortheintercept, Z i bean n i q designmatrix, bea p 1vectorofcotsand b i beagroup-spvectorofrandomregression cots.Themodelcanbewrittenas Y i = X i + Z i b i + i ;i =1 ; ;m; where i ˘ N n i (0 ;˙ 2 e I n i ), b i ˘ N q (0 ;V )and i and b i areindependentlydistributed. Wecandeducethat Y 1 ; ; Y m areindependentand Y i ˘ N n i ( X i ; i )with i = Z i VZ T i + ˙ 2 e I n i .Denotethestackedvectors Y =( Y T 1 ; ; Y T m ) T , b =( b T 1 ; ; b T m ) T , = ( T 1 ; ; T m ) T andthestackedmatrices X n p =( X T 1 ; ;X T n ) T , Z n mq = diag ( Z 1 ; ;Z m ) 30 and n n = diag 1 ; ; n ).Thenthemodelcanbewrittenas Y = X + ? ; where ? = Z b + suchthat ? ˘ N n (0 ; ˙ )).Thecovariancematrixisafunctionof 1+ q ( q +1) = 2parameters,wheretheparameteristhevarianceofthemodelerrorsi.e., ˙ 2 e andtherest q ( q +1) = 2parameterscharacterizetherandomcovariancematrix V: These parametersarerepresentedbythevector ˙ =( ˙ 1 ;˙ 2 ; ;˙ r ) T where r =1+ q ( q +1) = 2 : Let X ( 1 ) =( X 1(1) ;X 2(1) ; ;X n (1) ) T bethecolumnofthecovariatematrix X .Then X n p canbewrittenas X =( X ( 1 ) : X ( 1) )where X ( 1 ) isofdimension n 1and X ( 1) is a n ( p 1)matrixofthecovariatesotherthantheWithoutlossofgenerality,assume that X (1) containspartiallymissingcovariates,andnethemissingindicatoras B j = 8 > < > : 0if X j (1) ismissing 1otherwise ; for j =1 ; ;n .Assumethatthepartiallymissingcovariate X ( 1 ) followsaparametricmodel f ( X ( 1 ) j X ( 1 ) ; )thatisknownuptoadimensionalparameter =( 1 ; ; k ) T .In thepresenceofmissingdata,wewrite Y = X ? + ? ; 31 where X ? n p =( X ? (1) : X ( 1) ), X ? ( 1 ) =( X ? 1(1) ; ;X ? n (1) ) T suchthat, X ? j (1) = 8 > < > : E ( X j ( 1 ) j X j ( 1 ) ; )= ( )if B j =0 X j (1) if B j =1 ; for j =1 ; ;n .Thisapproachisverycommonlyused,foreg.,Little&Rubin(2002). Nowwecanestimate bymaximizingthelikelihood L = n Y j =1 f ( X j (1) j X j ( 1) ; ) 8f j : B j =1 g : (3.0.1) ThestandarderroroftheparameterestimatescanbedeterminedfromtheHessianmatrix. Wewillbeworkingwiththefollowingmodelsetupintherestofthechapter.Nowthe n -dimensional Y followsamultivariatenormaldistribution, Y ˘ Normal( X ? ; ; where X ? isan n p matrixofknown/estimatedcovariates,assumedtobeoffullrank. is a p 1vectorofunknownparametersandisanunknownvariancecovariancematrixwhose elementsarefunctionsof ˙ =( ˙ 1 ; ;˙ r ) T whichinturnisafunctionof =( 1 ; ; k ) T . ThisisbecausethecovariateXisafunctionof andfromtheREMLequationsweknow thattheestimatesof ˙ dependonX. TheREMLbasedestimatedleastsquaresestimatorof is b = X ? T ( b 1 f b ˙ ( b ) g X ? ( b ) 1 X ? T ( b 1 f b ˙ ( b ) g Y ; (3.0.2) 32 where b ˙ ( b )istheREMLestimatorof ˙ .Inourstudy, istheparameterofinterestand f ˙ ( ) g isthenuisanceparameter.Wewilltestthehypothesis H 0 : L T =0,where L is a p l matrix. Remark :Themethodcanbegeneralizedforthecaseoftwoormorepartiallymissing independentcovariates.Withoutlossofgenerality,assumethat X (1) and X (2) contain partiallymissingcovariatesandthattheyfollowparametricdistributions f ( X ( 1 ) j X ( 1 ) ; ˝ 1 ) and f ( X ( 2 ) j X ( 2 ) ; ˝ 2 )respectively.Themissingvaluesforeachofthecovariatescanbe imputedusingsimilarmaximumlikelihoodprocedureasdescribedabove.ThentheREML basedestimatorof isgivenby b givenin(3.0.2)where =( ˝ 1 ; ˝ 2 ) T . 3.1Inferenceforthe Tomakeinferencesaboutsuchasintervalestimationandtestingof hypothesis,weneedtoanapproximatepivotandcomputeitsdistributioninsmall samplesettings. 3.1.1Estimationofthevarianceoftheestimator b Underthelinearmixedmodelframework,theasymptoticvarianceof b isgivenby b = f b ˙ ( b ) g = X T ( b 1 f b ˙ ( b ) g X ( b ) 1 ,where b istheestimatorobtainedaftermaximizing thelikelihoodgivenin(3.0.1)and b ˙ ( b )istheREMLestimateof ˙ obtainedafterusing theimputeddataset.Therearethreesourcesofbiaswhen b isusedasanestimator forthevarianceof b forsmallsamples: b isabiasedestimatorof ˙ ( )),doesnot takethevariabilityof b ˙ and b intoaccount.Weconsideranapproximationtothesmall samplecovariancematrixof b undermissingcovariates,whichaccountsforthevariabilityin 33 both b ˙ and b ,thusreducingthesmallsamplebiasandanybiasduetothemissingness.For completedata,KackarandHarville(1984)addressedthesecondsourceofbiasbypartitioning thevarianceoftheestimated intotwocomponents:+FurtherKenwardandRoger (1997)addressedthesourceofbiasandcombinedbothadjustmentsthusproposinga newestimatorof b KR = b + b ; wheretheestimatorsofandaresuchthat E ( b = e + R + O ( n 5 = 2 ) ; (3.1.3) E ( b = e + O ( n 5 = 2 ) ; (3.1.4) with =[ X ? T ( 1 f ˙ ( ) g X ? ( )] 1 ; (3.1.5) e = r X l =1 r X m =1 cov( b ˙ l ; b ˙ m Q lm P l P m ; (3.1.6) 34 and O ( n r )denotes O ( n r ) =n r isaboundednumberas n !1 ,forsome r .Intheabove expressions P l = X ? T ( 1 f ˙ ( ) g @ @˙ l 1 f ˙ ( ) g X ? ( ) ; Q lm = X ? T ( 1 f ˙ ( ) g @ @˙ l 1 f ˙ ( ) g @ @˙ m 1 f ˙ ( ) g X ? ( ) ; R lm = X ? T ( 1 f ˙ ( ) g @ 2 @˙ l @˙ m 1 f ˙ ( ) g X ? ( ) ; R = 1 2 r X l =1 r X m =1 cov( b ˙ l ; b ˙ m R lm : Wenowconsiderthebiasin b asanestimatorof( b ˙ ( ))thusadjustingformissingness. Theorem1 UndertheassumptionsgiveninAppendixA1,theadjustedvarianceof b can beapproximatedas b A = b + b + b : Theestimatorsof , and aregivenby b , b and b respectivelysuchthat(3.1.3)-(3.1.6) holdand E ( b =+ O ( n 2 ) ,where = k X i =1 b ˙ ( )) K i var ( b i ) K T i T ( b ˙ ( ))+ k X i =1 k X j =1 i 6 = j b ˙ ( )) K i cov ( b i ; b j ) K T i T ( b ˙ ( )) : Proof:TheproofisdetailedinAppendixA2. 35 3.1.2Approximatingthedistributionoftheteststatistic Supposeweareinterestedinmakinginferencesabout l linearcombinationsoftheelements .Inotherwords,weareinterestedintesting H 0 : L T =0where L T isamatrixof dimension( l p ).Acommonstatistictotest H 0 istheWaldstatisticgivenby F = 1 l ( b ) T L ( L T b A L ) 1 L T ( b ) ; (3.1.7) where b A isanadjustedcovariancematrixfor b .Wewillfollowasimilarprocedureasused inKenward&Roger(1997)andAlnosaier(2007)byapproximatingthetwomomentsof theFstatisticgivenin(3.1.7)andmatchthemomentstoapproximatethescalingfactorand thedenominatordegreesoffreedomoftheteststatistici.e., and d suchthat ˘ F ( l;d ) indistribution.Here F ( l;d )denotesF-distributionwithdegreesoffreedom( l;d ). Theorem2 UndertheassumptionsgiveninAppendixA1,thetwomomentsofthetest statistic(3.1.7)under H 0 : L T =0 are E ( F )=1+ A 2 l A 4 l + O ( n 3 = 2 ) ; (3.1.8) var ( F )= A 1 l 2 + 2 l + 6 A 2 l 2 ; (3.1.9) where b = L ( L T b L ) 1 L T , A 1 = P r l =1 P r m =1 cov ( b ˙ l ; b ˙ m ) trace ( b b P l b trace ( b b P m b , A 2 = P r l =1 P r m =1 cov ( b ˙ l ; b ˙ m ) trace ( b b P l b b b P m b , A 3 = P r l =1 P r m =1 cov ( b ˙ l ; b ˙ m ) trace f b Q lm P l b P m R lm = 4) g , A 4 = P r l =1 P r m =1 cov ( b ˙ l ; b ˙ m ) trace ( b b , andusing(3.1.8)and(3.1.9),weget e d =4+(2+ l ) = ( l e ˆ 1) ,where e ˆ = e V= 2 e E 2 ,and 36 e = e d= f e E ( e d 2) g ,where e E and e V aredinAppendixA3. Proof:Theexpectationandvarianceareapproximatedusingthefollowingconditionalar- gumentsthat E ( F )= E f E ( F j b ˙ ) g andvar( F )= E f var( F j b ˙ ) g +var f E ( F j b ˙ ) g .Afterusing theTaylorseriesexpansionweobtain E ( F )= e E + O ( n 3 = 2 ) ; andvar( F )= e V + O ( n 3 = 2 ) ; wheretheexplicitexpressionsof e E and e V arederivedinAppendixA3.Inordertodetermine and d ,wematchthetwomomentsof withthoseof F ( l;d )i.e., E f F ( l;d ) g = d d 2 ; andvar f F ( l;d ) g =2 d d 2 2 l + d 2 l ( d 4) ; toobtaintheexpressionsfor e d and e . 3.2Simulation Toexplorethebehaviourofthestatisticsderivedintheprevioussections,weconducted simulationstudiesfortwoentsettings. Design1:RandomCotRegressionModel Weconsideralongitudinalstudywherethedataconsistsofmeasurementson n subjects. Measurementsaretakenoverfourtimeintervals(say):8,10,12and14.Let t j bethetime pointsatwhichtheobservationisrecorded.Themodelcanbeexpressedas Y i = X i + Z i b + i ; 37 where X i isthe(4 p )matrix X i andthedesignmatrix Z i canbewrittenas Z i = 0 B B B B B B B B B @ 18 110 112 114 1 C C C C C C C C C A Themeasurementerrorstructure i = ˙ 2 I 4 .Weassumeanunstructuredcovariance matrixGfortherandom b i givenby G = 0 B @ ! 1 ! 2 ! 2 ! 3 1 C A Therefore,theresultingcovariancematrixof Y i isgivenby Z i GZ 0 i + ˙ 2 I 4 .Tosimulatethe data,weconsider p =5andgeneratedthecovariatematrix X fromamultivariatenormal distributionwithmeanzeroandcovariancematrixwhose( i;j ) th elementisgivenby j 0 : 2 j i j . Weintroduced20%and40%missingnessinoneofthecolumnsofX(i.e.,missingnessin onecovariate)usingabernoullidistributioni.e.assumingthemissingmechanismtobe MCAR.Thenwemaximizedtheobservedlikelihoodofthepartiallymissingcovariatetoget theestimateofthemeanandthevariance,andreplacedthemissingvaluesinthecovariate matrixbytheestimatedmean,thusobtainingtheimputeddataset.Finally,wedeployed ourmethodontheimputeddatasettotestthehypothesis H 0 : =0vs H a : 6 =0.We keep ˙ 2 at0.05and ! 1 at1andconsidermultiplecasesbychoosingt valuesof ! 2 and ! 3 rangingfrom0to1.Foreachsetting,5000datasetsaregeneratedand 38 Type-Ierrorratesarenoted. Design2:BlockDesign Inthenextsimulationstudy,weconsideredablockdesignwithonerandom( q =1), whosevarianceis ˙ 2 b .Hereweconsideredthenumberofi.e., p =5.Tosimulate thedata,asusual,wegeneratedthecovariatematrix X fromagaussiandistributionand introducedmissingnessunderMCARmechanism.Further,torelaxtheMCARassumption, weconsideredanothercasetomimictheMARmechanismwhereweintroducedmissingness inoneofthecovariatessuchthattheprobabilityofanobservationbeingmissingdepends onthevalueofanothercovariate.Ourobjectiveistotestthehypothesis H 0 : =0vs H a : 6 =0.Totestthis,wegenerated Y from N (0 ; fortvaluesof ˆ = ˙ 2 b =˙ 2 e rangingfrom0.25to4.Foreachsettingwegenerated5000datasetsandcomputedthe simulatedsize(Type-Ierrorprobabilities). Implementation: Tobeabletorunthesimulation,weneedtocalculatethefollowingterms: @X ? ( ) =@ and @ b ˙ ( )) =@ .Thetermis @X ? ( ) =@ = I ( X ? = ( )),andusingthe chainruleofofderivativethesecondtermis @ b ˙ ( )) =@ = P r i =1 ( @ =@ b ˙ 1 ) ( @ b ˙ 1 =@ ).The detailedexpressionofthesetermsaregiveninAppendixA4. Inoursimulationstudy,wewillcompareourtestingprocedurei.e.,KenwardRoger adjustmentformissingcovariates(KRM)tothefollowingmethodswhichareusedtodeal withmissingcovariatesinalinearmixedmodelsetuponthebasisofType-Ierrorrates:1) listwisedeletionfollowedbylikelihoodratiotest(LD-LRT),2)meanimputationfollowedby likelihoodratiotest(MI-LRT),3)meanimputationfollowedbyWaldtest. Results: ForDesign1,Tables3.1andTable3.2presenttheType-Ierrorrateswhenthe missingmechanismisMCARfor n =27and n =13respectively 39 Table3.1HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0atthe5%level ofwhen n =27at40%missingness.HereKRM,LD-LRT,MI-LRTandMI- WTrefertoKenward-Rogeradjustmentformissingcovariates,listwisedeletionfollowedby likelihoodratiotest,meanimputationfollowedbylikelihoodratiotestandmeanimputation followedbyWaldTest. ! 2 ! 3 KRM LD-LRT MI-LRT MI-WT 00.5 0.059 0.085 0.068 0.084 01 0.070 0.118 0.122 0.096 0.250.5 0.065 0.091 0.077 0.095 0.251 0.068 0.119 0.145 0.097 Table3.2HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0atthe5%level ofwhen n =13at40%missingness.HereKRM,LD-LRT,MI-LRTandMI- WTrefertoKenward-Rogeradjustmentformissingcovariates,listwisedeletionfollowedby likelihoodratiotest,meanimputationfollowedbylikelihoodratiotestandmeanimputation followedbyWaldTest. ! 2 ! 3 KRM LD-LRT MI-LRT MI-WT 00.5 0.075 0.157 0.103 0.135 01 0.062 0.162 0.124 0.109 0.250.5 0.070 0.130 0.116 0.113 0.251 0.055 0.144 0.154 0.103 ForDesign2,Tables2.8and2.9providetheType-Ierrorrateswhenthemissingmecha- nismisMCARforsamplesizes15and30respectively,Table2.10providestheresultsunder theMARmechanism. 40 Table3.3HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0atthe5% levelofwhen ˆ = ˙ 2 b =˙ 2 e , m =15and n i =2forall i =1 ;:::; 15.HereKRM, LD-LRTandMI-LRTrefertoKenward-Rogeradjustmentformissingcovariates,listwise deletionfollowedbylikelihoodratiotestandmeanimputationfollowedbylikelihoodratio test. Percentageof ˆ KRMLD-LRTMI-LRTMI-WT missingdata 200.250.0490.1480.1130.094 0.50.0550.2460.1220.087 10.0590.1520.1200.099 20.0570.1880.1270.095 40.0630.1730.1290.094 400.250.0420.2980.1120.094 0.50.0570.2050.1260.089 10.0520.3750.1230.093 20.0580.3750.1300.098 40.0630.4150.1240.088 41 Table3.4HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0atthe5% levelofwhen ˆ = ˙ 2 b =˙ 2 e , m =30and n i =2forall i =1 ;:::; 30.HereKRM, LD-LRTandMI-LRTrefertoKenward-Rogeradjustmentformissingcovariates,listwise deletionfollowedbylikelihoodratiotestandmeanimputationfollowedbylikelihoodratio test. Percentageof ˆ KRMLD-LRTMI-LRTMI-WT missingdata 200.250.0440.0850.0780.067 0.50.0550.0830.0870.068 10.0500.0940.0820.066 20.0550.1060.0830.073 40.0470.0880.0790.063 400.250.0470.1000.0730.065 0.50.0480.1430.0810.063 10.0480.1040.0720.066 20.0510.2170.0790.066 40.0480.1690.0760.069 42 Table3.5HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0atthe5%level ofwhen ˆ = ˙ 2 b =˙ 2 e , m =15and n i =2forall i =1 ;:::; 15whenthemissing mechanismisMAR.HereKRM,LD-LRTandMI-LRTrefertoKenward-Rogeradjustment formissingcovariates,listwisedeletionfollowedbylikelihoodratiotestandmeanimputation followedbylikelihoodratiotest. ˆ KRMLD-LRTMI-LRTMI-WT 0.250.0580.1030.1020.094 0.50.0530.1140.0910.087 10.0500.1640.1030.093 20.0490.1170.0890.087 40.0630.1410.0990.095 FromtheresultsinTables3.1-3.5,weobservethatforthecaseoflistwisedeletion, theType-Ierrorratesarecloseto0.1whenthereis20%missingnessinthedataandit getsworsewhenthepercentageofmissingnessincreasesto40%.Asthesampleincreases, thereissomedecreaseintheerrorrates,however,theyarestilltlylarge.Mean imputationseemsaviableoptionsinceitprovidesacompletedatatoworkwith.Itworks betterthanlistwisedeletionasexpected,butthestandardlikelihoodratio(MI-LRT)used ontheimputeddatasetstilldoesnotworkwell,sincethe ˜ 2 testisbasedonlargesample approximations.Further,asseenfromresultsofdesign1,Wald-testperformsbetterthan LD-LRTandMI-LRTbutthesizeisnowhereclosetothenominallevel.Weconducted anotherad-hoctestprocedurewhereweusedtheoriginalKenwardRogermethodonthe imputeddatasetwithoutreducingthebiasduetomissingness(resultsnotshownhere).It performsreasonablywell,allType-Ierrorrateswerebetween0.045to0.075;howeveritis nottheoreticallyFurther,wealsocomparedWaldtypetestsusingtwot covariancematricesof b namely(i) b whichdoesnottakethevariabilityof b ˙ and b into 43 accountand(ii) b KR whichtakesthevariabilityinonly b ˙ intoaccount.Thesearealso ad-hoctestsandinthesetwocases,noapproximationismadetothedenominatordegrees offreedomofFdistribution(weusetheresidualdegreesoffreedom).Toavoidredundancy, theresultsfrommethods(i)and(ii)arenotshownhere.Finally,ourproposedmethod (KRM)outperformstheotherproceduresforallvaluesof ˆ;n andthetypeandpercentage ofmissingness.ItpreservestheType-Ierrorratescloseto0.05foreachcase.Wealsotested forsingleparameteri.e. H 0 : 1 =0(resultsnotshownhere)inwhichcasethe resultsarehesame. Robustness: Inpractice,thedistributionofthepartiallymissingcovariateisnotknown. Therefore,tochecktherobustnessofourprocedure,wegeneratedthepartiallymissing covariatefromheavytaileddistributionsliketheLaplacedistributionwithmeanparameter aszeroandvarianceaseight,andtheStudent'st-distributionwithonedegreeoffreedom. Weensured20%missingnessinthecovariate.ThisisdoneforDesign2.Thecorresponding resultsaregiveninTable3.6. 44 Table3.6HerewepresentType-Ierrorprobabilitiesforthetest H 0 : =0atthe5%level ofwhen ˆ = ˙ 2 b =˙ 2 e , m =15and n i =2forall i =1 ;:::; 15whenthepartially missing(20%missing)covariateisgeneratedfromheavytaileddistributions.HereKRM, LD-LRT,MI-LRTandMI-KRTrefertoKenward-Rogeradjustmentformissingcovariates, listwisedeletionfollowedbylikelihoodratiotestandmeanimputationfollowedbylikelihood ratiotest. Distribution ˆ KRMLD-LRTMI-LRTMI-WT Laplace(0,2)0.250.0520.1060.0950.093 0.50.0560.1150.1020.094 10.0490.1850.0940.089 20.0490.1700.0990.090 40.0510.1990.1000.095 Student'st(1)0.250.0500.1530.0990.085 0.50.0510.1050.0960.093 10.0470.1370.1040.098 20.0550.1420.1050.093 40.0590.1390.1070.094 Table3.6showsthatourtestingprocedureisrobust.TheType-Ierrorratesforthe KRMprocedurearecontainedbetween0.049and0.059evenwhenthedistributionofthe covariateismisspWhile,asexpectedtheerrorprobabilitiesforLD-LRT,MI-LRT andMI-WTare 45 3.3Realdataexamples 3.3.1Example1 Wedeploythemethoddetailedinsection3.1onthedatamentionedintheintroduction. ThedataisdescribedasthevagaltonedatainWeiss(2005).Vagaltoneisameasureof theactivityintheparasympatheticnervoussystem(PNS).Itissupposedtobehighbut itgetslowerinresponsetostress.Thisdatasetsconsistsofevagaltonemeasurements takenbeforeandafterthecatheterization,twoweretakenpriortothesurgery(timepoints: pre1andpre2)andthreeweretakenpostsurgery(timepoints:post1,post2andpost3). Theresponsevariableattwotimepointsisapproximatelythesame,thenitdecreases atpost1andincreasesbacktonormalforthelasttwomeasurements.Inouranalysis,we considerthreemeasurements(attimepointspre2,post1andpost2).Thecovariatesinthe dataaregender,ageinmonths,durationofthecatheterizationinminutesandameasureof illnessseverity.Thedataishighlyunbalanced.Inaddition,weintroduce30%missingness inthecovariateage.Themodelcanbeexpressedasfollows. Y ij = 0 + 1 X 1 i + 2 X 2 i + 3 X 3 i + 4 X 4 i + 5 t ij + b 0 i + ij with i =1 ; ; 21and j =1 ; 2 ; 3,where Y ij isthevagaltonemeasurementfor i th subject atthe j th timepoint, X 1 i ;X 2 i ;X 3 i and X 4 i indicatethegender,age,durationofsurgery andseverityofillnessrespectivelyforthe i th subject.Assuming( b 0 i ) iid ˘ N (0 ;˙ 2 b )and ij iid ˘ N (0 ;˙ 2 e )isindependentof b 0 i . Ourobjectiveinthisexampleistoquantifytheofageoftheinfantonthevagal tonemeasurement.Thereforewetest H 0 : 2 =0.Theteststatistic,demoninatordegrees 46 offreedom( ddf )andthe p -valueforttestproceduresare:(i)KRM:1 : 626( ddf = 25 : 11, p -value=0 : 214)(ii)LD-LRT:2 : 807( p -value=0.093)and(iii)MI-LRT:1 : 994( p - value=0 : 157).Listwisedeletionconcludesmarginalofageonthevagaltone measurementwhiletheMI-LRTsuggestsachangeinthedirectionoftheconclusion.KRM thechangeindirectionbyproducingap-valueof0.214sinceitisabletoaccount forthebiasduetosmallsamplesize,thebiasduetoestimationofvarianceparametersand thebiasduetomissingnessresultinginamoreaccuratetestingprocedure. 3.3.2Example2 Weconsiderthedatafromarandomized,double-blind,studyofAIDSpatientswithad- vancedimmunesuppression(CD4countsoflessthanorequalto50 cells=mm 3 ).Thedata descriptionisasinthedatasetssectioninFitzmaurice(2004):PatientsinAIDSClinicalTrial Group(ACTG)Study193AwererandomizedtodualortriplecombinationsofHIV-1reverse transcriptaseinhibitors.Sp,patientswererandomizedtooneoffourdailyregimens containing600mgofzidovudine:zidovudinealternatingmonthlywith400mgdidanosine; zidovudineplus2.25mgofzalcitabine;zidovudineplus400mgofdidanosineorzidovudine plus400mgofdidanosineplus400mgofnevirapine(tripletherapy).Thepatientscharacter- isticslikeageandsexwerealsorecorded.Theresponsevariablei.e.measurementsofCD4 countswerescheduledtobecollectedatbaselineandat8-weekintervalsduringfollow-up. However,duetoskippedvisitsanddropouts,themeasurementscouldnotbetakenatreg- ularintervalsthereforewehavetheweeksasanothervariable.Forouruse,weusethelog transformedCD4countslog(CD4counts+1)astheresponsevariable.Thedataisavailable for1313patients,however,sincewearetestingforsmallsamplestudies,wewilltruncate thedatabyrandomlychoosing40patientsfromthestudy.Weintroduce30%missingness 47 inthecovariateageanddeploythediscussedmethodsontheobtaineddata.Themodel underconsiderationis Y ij = 0 + 1 X 1 i + 2 X 2 i + 3 t ij + 20 G 2 i + 30 G 3 i + 40 G 4 i + b 0 i + ij (3.3.10) with i =1 ; 2 ; ; 40and j =1 ; ;n i n i varyingfrom1to9,where Y ij isthelogtransformed CD4counts, G 2 i isadummyvariableindicatingthatthepatientwasintreatmentgroup2 and G 3 i istheindicatorthatpatientwasintreatmentgroup3and G 4 i isdelikewise, X 1 i representsthegenderofthepatient(1=male,0=female), X 2 i representstheageofthe patientinyearsand t ij isthe j th timepoint(inweeks)atwhich i th patient'smeasurement wasrecorded.Assuming( b 0 i ) iid ˘ N (0 ;˙ 2 b )and ij iid ˘ N (0 ;˙ 2 e )isindependentof b 0 i .Inthe abovemodel(3.3.10),weintroducemissingnessinthevariable X 2 i . Thegoalofourstudyistoinvestigateifthereisanytbetween theimpactofthefourtreatmentgroupsontheCD4counts.Therefore,wewilltest H 0 :( 20 ; 30 ; 40 ) T =0against H 0 :( 20 ; 30 ; 40 ) T 6 =0.Whenwetestthehypoth- esisusingallthedataavailablei.e.1313patients,weobtainap-valueofapproximately 0.008whichsuggeststhattreatmentgrouptypeistinthestudy.However,totest ourmethodology,werandomlyselectasampleofsize40.Theteststatisticand p -value fortestingtheabovehypothesisforthettestproceduresare(i)KRM:3 : 201( p - value=0 : 029),(ii)LD-LRT:8.068( p -value=0 : 044)(iii)MI-LRT:8.775( p -value=0 : 033). The p -valuefor(i)isbasedontheapproximateF-distributionwithnumeratordegreesof freedomasthreeandthedenominatordegreesoffreedomas62 : 50whicharecalculatedfrom thedataandthep-valuesfor(ii)and(iii)arebasedona ˜ 2 3 distributionwhiletherestare basedontheF-distributionwithtdegreesoffreedom.Alltestsyieldtheresultthat 48 thenullhypothesisisrejectedat5%anceleveli.e.thetreatmentgrouptypehas atimpactontheCD4counts.Althoughtheconclusionfromallmethodsisthe same,thep-valueoftheKRMtestistheleastandwecanhavemoreinthistest sinceitistheoretically 3.4ProofsofChapter3 3.4.1Assumptions WeimposethefollowingassumptionsaboutthemodelasdoneinAlnosaier(2007) C.1 isablockdiagonalandnonsingularmatrix.Also 1 ;@ =@˙ l ;@ =@ i ;@ 2 =@˙ l @˙ m and @ 2 =@ i @ j arebounded. C.2 E ( b ˙ )= ˙ + O ( n 3 = 2 ). C.3 Thepossibledependencebetween b ˙ ( b )and b isignored. C.4 =( X T ( 1 ( ˙ ( )) X ( )) 1 = O ( n 1 ) ; ( L T L )= O ( n ) : C.5 @ e =@˙ l = O ( n 1 = 2 ) ;@ 2 e =@˙ l @˙ m = O ( n 1 = 2 )where e = f X T ( 1 ( ˙ ( )) X ( ) g 1 X T ( 1 ( ˙ ( )) Y : C.6 Theexpectationof b exists. 49 3.4.2Estimatesof , and inTheorem1 Let ˙ =( ˙ 1 ; ;˙ r ) T and =( 1 ; ; k ) T .WebeginbyexpandingtheREMLbasedesti- mateof fortheimputeddatasetwhichisgivenby b =( X ? T ( b 1 ( b ˙ ( b )) X ? ( b )) 1 X ? T ( b 1 ( b ˙ ( b )) Y = C 1 + C 2 + C 3 + C 4 + C 5 + C 6 where C 1 =( X ? T ( b 1 ( b ˙ ( b )) X ? ( b )) 1 X ? T ( b ^ ˙ (^ )) 1 ^ ˙ ( )) 1 ) Y ; C 2 =( X ? T ( b 1 ( b ˙ ( b )) X ? ( b )) 1 ( X ? T (^ ) X ? T ( ^ ˙ ( )) 1 Y ; C 3 =( X ? T ( b 1 ( b ˙ ( b )) X ? ( b )) 1 ( X ? T ( b 1 ( b ˙ ( b )) X ? T ( )) 1 ) X ? T ( ^ ˙ ( )) 1 Y ; C 4 =( X ? T ( b 1 ( b ˙ ( b )) X ? ( )) 1 ( X ? T ( b 1 ( b ˙ ( )) X ? T ( )) 1 ) X ? T ( ^ ˙ ( )) 1 Y ; C 5 =( X ? T ( b 1 ( b ˙ ( )) X ? ( )) 1 ( X ? T ( 1 ( b ˙ ( )) X ? T ( )) 1 ) X ? T ( ^ ˙ ( )) 1 Y ; C 6 =( X ? T ( 1 ( b ˙ ( )) X ? ( )) 1 X ? T ( ^ ˙ ( )) 1 Y : Fornotationalconvenience,wewilldenote X ? as X .BytheTaylorseriesexpansionaround , ^ ˙ (^ )) 1 ^ ˙ ( )) 1 ˇ k X i =1 @ b ˙ ) 1 @ i ( b i i )+ 1 2 k X i =1 k X j =1 @ 2 b ˙ ) 1 @ i @ j ( b i i )( b j j ) : 50 Since E ( b )= + O p ( n 1 = 2 )and E ( Y )= X ,weget ^ ˙ (^ )) 1 ^ ˙ ( )) 1 = k X i =1 @ b ˙ ) 1 @ i ( b i i )+ O p ( n 1 ) = k X i =1 b ˙ ) 1 @ b ˙ ) @ i b ˙ ) 1 ( b i i )+ O p ( n 1 ) (3.4.11) Substituting(3.4.11)intheexpressionfor C 1 weget C 1 = e C 1 + O ( n 1 = 2 ),where f C 1 = f X T ( b ˙ ( )) X ( ) g 1 X T ( ) P k i =1 b ˙ ) 1 f @ b ˙ ) =@ i g 1 ( b ˙ )( b i i ) g X . Let b ˙ ( ))= f X T ( 1 ( b ˙ ( )) X ( ) g 1 ,then C 1 = b ˙ ( )) X T ( ) k X i =1 b ˙ ) 1 @ b ˙ ) @ i b ˙ ) 1 ( b i i ) X + O p ( n 1 = 2 ) : (3.4.12) Consider C 2 =( X T ( b 1 ( b ˙ ( b )) X T ( b )) 1 ( X T (^ ) X T ( ^ ˙ ( )) 1 Y . BytheTaylorseriesexpansion, X T ( b ) X T ( )= k X i =1 @X T @ i ( b i i )+ O p ( n 1 ) : Therefore, C 2 = f C 2 + O p ( n 1 = 2 ) ; where f C 2 = b ˙ ; ) k X i =1 @X T @ i ( b i i ^ ˙ ( )) 1 X : (3.4.13) Consider C 3 = f ( X T ( b 1 ( b ˙ ( b )) X T ( b )) 1 ( X T ( b 1 ( b ˙ ( b )) X T ( )) 1 g X T ( ^ ˙ ( )) 1 Y: 51 Again,usingtheTaylorseriesexpansion, ( X T ( b ^ ˙ ( )) 1 X ( b )) 1 ( X T ( b ^ ˙ ( )) 1 X ( )) 1 = k X i =1 @ @ i ( X T ( b ^ ˙ (^ )) 1 X ( )) 1 ( b i i )+ O p ( n 1 ) : Hence, C 3 = e C 3 + O p ( n 1 = 2 ) ; where e C 3 = k X i =1 ( X T ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 @X ( ) @ i ( b i i ) = b ˙ ( )) k X i =1 X T ( ^ ˙ ( )) 1 @X ( ) @ i ( b i i ) : (3.4.14) Similarly, e C 4 = k X i =1 b ˙ ; ) X T ( ) @ ^ ˙ ( )) 1 @ i X ( b i i ) = k X i =1 b ˙ ; ) X T ( ^ ˙ ( )) 1 @ b ˙ ( )) @ i ^ ˙ ( )) 1 X ( b i i ) ; (3.4.15) e C 5 = k X i =1 b ˙ ; ) @X T @ i ^ ˙ ( )) 1 X ( b i i ) ; (3.4.16) C 6 =( X T ( 1 ( b ˙ ( )) X T ( )) 1 X T ( ^ ˙ ( )) 1 Y = : Let H ( b ˙ )= e C 1 + e C 2 + e C 3 + e C 4 + e C 5 .Further,weneedtocomputethevarianceof b . V ( b )= V ( H ( b ˙ ))+ V ( ) : Totakeintoaccountthevariabilityin b ˙ ( )when iswefollowthesameproofasin 52 Kenward&Roger(1997)andAlnosaier(2007), V ( )= V ( e )+ V ( e ) ; where e =( X T ( ˙ ( )) 1 X ( )) 1 X T ( ˙ ( )) 1 Y : V ( )=+ : Theestimatesofandaregivenby b and b suchthat E ( b = e + R + O ( n 5 = 2 ) ; E ( b = e + O ( n 5 = 2 ) ; where =( X T ( 1 ( ˙ ( )) X ( )) 1 ; e = r X l =1 r X m =1 cov( b ˙ l ; b ˙ m Q lm P l P m : Intheaboveexpressions P l = X T ( 1 ( ˙ ( )) @ @˙ l 1 ( ˙ ( )) X ( ) ; Q lm = X T ( 1 ( ˙ ( )) @ @˙ l 1 ( ˙ ( )) @ @˙ m 1 ( ˙ ( )) X ( ) ; R lm = X T ( 1 ( ˙ ( )) @ 2 @˙ l @˙ m 1 ( ˙ ( )) X ( ) ; R = 1 2 r X l =1 r X m =1 cov( b ˙ l ; b ˙ m R lm : 53 Now,from(3.4.12)-(3.4.16) H ( b ˙ )= k X i =1 b ˙ ( )) X T ( ^ ˙ ( )) 1 @ b ˙ ) @ i ^ ˙ ( )) 1 X ( ) g + f @X T ( ) @ i ^ ˙ ( )) 1 X ( ) gf X T ( ^ ˙ ( )) 1 @X ( ) @ i g + f X T ( ^ ˙ ( )) 1 @ b ˙ ) @ i ^ ˙ ( )) 1 X ( ) g @X T ( ) @ i ^ ˙ ( )) 1 X ( ) g ( b i i ) = k X i =1 b ˙ ( )) K i ( b i i ) ; where K i = X T ( ^ ˙ ( )) 1 @ b ˙ ) @ i ^ ˙ ( )) 1 X ( ) g + f @X T ( ) @ i ^ ˙ ( )) 1 X ( ) g X T ( ^ ˙ ( )) 1 @X ( ) @ i g + f X T ( ^ ˙ ( )) 1 @ b ˙ ) @ i ^ ˙ ( )) 1 X ( ) g @X T ( ) @ i ^ ˙ ( )) 1 X ( ) g ; and = V ( H ( b ˙ ))= k X i =1 b ˙ ( )) K i V ( i ) K T i T ( b ˙ ( )) + k X i =1 k X j =1 i 6 = j b ˙ ( )) K i cov( i ; j ) K T j T ( b ˙ ; ) : NowusingtheTaylorseriesexpansionabout ˙ weget, b ˙ )=( ˙ )+ r X l =1 @ @˙ l ( b ˙ l ˙ l )+ r X l =1 r X m =1 @ 2 @˙ l @˙ m ( b ˙ l ˙ l )( b ˙ m ˙ m )+ : (3.4.17) 54 Consider @ ˙ ) @˙ l = @ @˙ l k X i =1 K i var( b i ) K T i T + k X i =1 k X j =1 i 6 = j K i cov( b i ; b j ) K T j T = k X i =1 @ K i @˙ l var( b i ) K T i T + k X i =1 K i var( b i ) @K T i T @˙ l + k X i =1 k X j =1 i 6 = j @ K i @˙ l cov( b i ; b j K j + k X i =1 k X j =1 i 6 = j K i cov( b i ; b j ) @K T j T @˙ l (3.4.18) Now, @ K i @˙ l = k X i =1 @ @˙ l K i + k X i =1 @K i @˙ l = k X i =1 ( P l K i + k X i =1 @K i @˙ l : (3.4.19) and @K i @˙ l = f X T ( ) @ @˙ l ( @ 1 @ i ) X ( )+ @X T ( ) @ i @ 1 @˙ l X ( ) X T ( ) @ 1 @˙ l @X ( ) @ i X T ( ) @ @˙ l @ 1 @ i X ( ) @X T ( ) @ i @ 1 @˙ l X ( ) g = X T ( ) @ 1 @˙ l @X ( ) @ i : Letusdenote M li = X T ( ) @ 1 =@˙ l @X ( ) =@ i : Therefore, @ K i @˙ l = k X i =1 ( P l K i + k X i =1 X T ( ) @ 1 @˙ l @X ( ) @ i = k X i =1 ( P l K i M li : (3.4.20) 55 Using(3.4.19)and(3.4.20),(3.4.18)becomes, @ ˙ ) @˙ l = k X i =1 k X i =1 n ( P l K i M li )cov( b i ; b j ) K T j T + K i cov( b i ; b j )( P l K j M lj ) T o : Substitutingtheabovein(3.4.17),weget b ˙ ) = ˙ )+ r X l =1 k X i =1 n ( P l K i M li )var( b i ) K T i T + K i var( b i )( P l K i M li ) T o + k X i =1 k X i =1 i 6 = j n ( P l K i M li )cov( b i ; b j ) K T j T + K i cov( b i ; b j )( P l K j M lj ) T o + r X l =1 r X m =1 @ 2 @˙ l @˙ m ( b ˙ l ˙ l )( b ˙ m ˙ m ) = ˙ )+ O p ( n 2 ) : (3.4.21) Therefore,theadjustedcovariancematrixof b isgivenby b A = b +2 b b R ? + b and E ( b A )=var( b )+ O ( n 2 ).Thiscompletestheproof. 3.4.3Derivationof e E and e V usedinTheorem2 WebeginbycomputingtheexpectedvalueandvarianceoftheFstatistic F = 1 l ( b ) T L ( L T b A L ) 1 L T ( b ) : 56 Theexpectedvalueisapproximatedusingtheconditionalexpectationarguments. E ( F )= E f E ( F j b ˙ ) g ; E ( F j b ˙ )= E f 1 l ( b ) T L ( L T b A L ) 1 L T ( b ) j b ˙ g = 1 l E f ( b ) T L g ( L T b A L ) 1 E f L T ( b ) g +trace f ( L T b A L ) 1 var( L T ( b )) g : Since b isanunbiasedestimatorof ,therefore lE ( F j b ˙ )=trace f ( L T b A L ) 1 ( L T VL ) g ; where V =var( b )=++ : (3.4.22) Since b A = b + b A + b where b A =2 b P r l =1 P r m =1 cov( b ˙ l ; b ˙ m )( b Q lm b P l b b P m 1 4 b R lm ) b (Alnosaier2007),weget ( L T b A L ) 1 ( L T VL )= f ( L T b L )+( L T b A L )+( L T b L ) g 1 ( L T VL ) = ( L T b L ) I +( L T b L ) 1 ( L T b A L )+( L T b L )( L T b L ) 1 ( L T VL ) = I +( L T b L ) 1 ( L T b A L )+( L T b L )( L T b L ) 1 ( L T b L ) 1 ( L T VL ) =( L T b L ) 1 ( L T VL ) ( L T b L ) 1 ( L T b A L )( L T b L ) 1 ( L T VL ) ( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T VL ) : FromAlnosaier(2007),wenoticethat E [trace f ( L T b L ) 1 ( L T VL ) g ]= l + A 2 +2 A 3 + O p ( n 3 = 2 ) ; (3.4.23) 57 E [trace f ( L T b L ) 1 ( L T b A L )( L T b L ) 1 ( L T VL ) g ]= E f b A ) g =2 A 3 + O ( n 3 = 2 ) ; (3.4.24) where = L ( L T L ) 1 L T ; A 2 = r X l =1 r X m =1 cov( b ˙ l ; b ˙ m P l P m ; A 3 = r X l =1 r X m =1 cov( b ˙ l ; b ˙ m )trace Q lm P l P m 1 4 R lm ) : Next,weneedtoconsider ( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T VL )=( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T L )+ ( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T L )+ ( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T L )+ =( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T L )+ O ( n 2 ) : 58 UsingtheTaylorseriesexpansionfor( L T b L ) 1 about ˙ ,weget ( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T L )+ O ( n 2 ) = ˆ ( L T L ) 1 + r X l =1 ( b ˙ l ˙ l ) @ ( L T L ) 1 @˙ l + 1 2 r X l =1 r X m =1 ( b ˙ l ˙ l )( b ˙ m ˙ m ) @ 2 ( L T L ) 1 @˙ l @˙ m ˙ ( L T b L ) 1 ˆ ( L T L ) 1 + r X l =1 ( b ˙ l ˙ l ) @ ( L T L ) 1 @˙ l + 1 2 r X l =1 r X m =1 ( b ˙ l ˙ l )( b ˙ m ˙ m ) @ 2 ( L T L ) 1 @˙ l @˙ m ˙ ( L T L ) =( L T L ) 1 ( L T b L )+ O p ( n 3 = 2 ) : AsshowninTheorem1, E ( b =+ O p ( n 2 ),therefore E [trace f ( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T VL ) g ]= E f b g = A 4 + O ( n 3 = 2 ) ; (3.4.25) where A 4 = r X l =1 r X m =1 cov( b ˙ l ; b ˙ m b : Therefore,substituting(3.4.23)-(3.4.25)in(3.4.22),weobtaintheapproximationofthe expectedvalueoftheFstatistics(3.1.7). E ( F )=1+ A 2 l A 4 l + O ( n 3 = 2 ) : (3.4.26) 59 Now,weneedtocomputethevarianceoftheFstatistic. var( F )= E f var( F j b ˙ ) g +var f E ( F j b ˙ ) g : (3.4.27) Consider var( F j b ˙ )= 1 l 2 var ( b ) T L ( L T b A L ) 1 L T ( b ) j b ˙ = 2 l 2 trace ( L T b A L ) 1 ( L T VL ) 2 ; sinceitisaquadraticform. (3.4.28) UsingsimilarargumentsasusedinAlnosaier(2007),weapproximatethefollowing ( L T b A L ) 1 ( L T VL )= f ( L T b L )+( L T b A L )+( L T b L ) g 1 ( L T VL ) = ( L T b L ) I +( L T b L ) 1 ( L T b A L )+( L T b L )( L T b L ) 1 ( L T VL ) = I +( L T b L ) 1 ( L T b A L )+( L T b L )( L T b L ) 1 ( L T b L ) 1 ( L T VL ) =( L T b L ) 1 ( L T VL ) ( L T b L ) 1 ( L T b A L )( L T b L ) 1 ( L T VL ) ( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T VL ) : 60 Therefore trace ( L T b A L ) 1 ( L T VL ) 2 =trace ( L T b L ) 1 ( L T VL ) 2 2trace f ( L T b L ) 1 ( L T VL )( L T b L ) 1 ( L T b A L )( L T b L ) 1 ( L T VL ) g 2trace f ( L T b L ) 1 ( L T VL )( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T VL ) g + O p ( n 2 ) =trace ( L T b L ) 1 ( L T L + L T L + L T L )( L T b L ) 1 ( L T L + L T L + L T L ) 2trace ( L T b L ) 1 ( L T L + L T L + L T L )( L T b L ) 1 ( L T b A L ) ( L T b L ) 1 ( L T L + L T L + L T L ) 2trace ( L T b L ) 1 ( L T L + L T L + L T L ) ( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T L + L T L + L T L ) + O p ( n 2 ) =trace ( L T b L ) 1 ( L T L )( L T b L ) 1 ( L T L ) + trace ( L T b L ) 1 ( L T L )( L T b L ) 1 ( L T L ) + 2trace ( L T b L ) 1 ( L T L )( L T b L ) 1 ( L T L ) 2trace ( L T b L ) 1 ( L T L )( L T b L ) 1 ( L T b A L )( L T b L ) 1 ( L T L ) 2trace ( L T b L ) 1 ( L T L )( L T b L ) 1 ( L T b L )( L T b L ) 1 ( L T L ) + O p ( n 2 ) UsingtheTaylorseriesexpansionabout ˙ ,weget =trace I +2 r X l =1 ( b ˙ l ˙ l ) @ @˙ l ( L T b L ) 1 ( L T L )+ r X l =1 r X m =1 cov( b ˙ l ; b ˙ m ) @ 2 @˙ l @˙ m ( L T b L ) 1 ( L T L ) + trace r X l =1 r X m =1 cov( b ˙ l ; b ˙ m ) @ @˙ l ( L T b L ) 1 ( L T L ) @ @˙ m ( L T b L ) 1 ( L T L ) + 2trace ( L T b L ) 1 ( L T L ) +2trace ( L T b L ) 1 ( L T L ) 2trace ( L T b L ) 1 ( L T b A L ) 2trace ( L T b L ) 1 ( L T b L ) + O p ( n 3 = 2 ) : 61 Using E ( b =+ O ( n 2 )and(3.4.28),weapproximatetheexpectedvalueoftheconditional varianceasfollows, E f var( F j b ˙ ) g = 2 l 2 ( l +3 A 2 )+ O ( n 3 = 2 ) : (3.4.29) Also,usingtheexpressionofthemeani.e.(3.4.26),weobtain var f E ( F j b ˙ ) g = A 1 l 2 + O ( n 3 = 2 ) ; where(3.4.30) A 1 = r X l =1 r X m =1 cov( b ˙ l ; b ˙ m P l P m : Using(3.4.29)and(3.4.30)in(3.4.27)weobtain var( F )= A 1 l 2 + 2 l + 6 A 2 l 2 + O ( n 3 = 2 ) : ThereforetheapproximateexpectedvalueandvarianceoftheFstatisticthatwillbe usedformatchingmomentsof withthoseof F ( l;d )distributionaregivenas e E =1+ A 2 l A 4 l ; e V = A 1 l 2 + 2 l + 6 A 2 l 2 : Thiscompletestheproof. 62 3.4.4ImplementationofSimulationStudy Wederivetherequiredquantitiesforablockdesignexperiment,wherethe covariatesaregeneratedfromaGaussiandistribution.Therefore, @X ? ( ) @ 1 = 0 B @ 0: X ? = X 1: X ? = 1 1 C A and @X ? ( ) @ 2 =0 : Also,nowweneedtocompute @ b ˙ 2 e @ and @ b ˙ 2 b @ .Inthefollowing,= ˙ 2 e D 2 + ˙ 2 b D 1 ,and forourcasewetake D 2 = I ;therefore @ @ b ˙ 2 e = D 2 = I @ @ b ˙ 2 b = D 1 : TheREMLequationsaregivenby: 2 @l ( ˙ ) @˙ 2 e = trace( G )+ Y T GG Y since D 2 = I; (3.4.31) 2 @l ( ˙ ) @˙ 2 b = trace( GD 1 )+ Y T GD 1 G Y ; (3.4.32) where G = 1 1 X ? ( X ? 1 X ? ) 1 X ? T 1 : Again,fornotationalconvenience,wewillreplace X ? by X .ToobtainMLEof ˙ 2 e and 63 ˙ 2 b ,weneedtosettheequations(3.4.31)and(3.4.32)tozero. @l @˙ 2 e ˙ 2 e = b ˙ 2 e ;˙ 2 b = b ˙ 2 b =0 ; = ) trace( G ( b ˙ ( )))= Y T G ( b ˙ ( )) G ( b ˙ ( )) Y = ) @ @ trace( G ( b ˙ ( )))= @ @ f Y T G ( b ˙ ( )) G ( b ˙ ( )) Y g : (3.4.33) and @l @˙ 2 b ˙ 2 e = b ˙ 2 e ;˙ 2 b = b ˙ 2 b =0 ; = ) trace( G ( b ˙ ( )) D 1 )= Y T G ( b ˙ ( )) D 1 G ( b ˙ ( )) Y = ) @ @ trace( G ( b ˙ ( )) D 1 )= @ @ f Y T G ( b ˙ ( )) D 1 G ( b ˙ ( )) Y g : (3.4.34) LetusconsidertheLHS(lefthandside)ofequation(3.4.31) @ @ trace f G ( b ˙ ( )) g =^ ˙ ( )) 1 ) trace f ^ ˙ ( )) 1 X ( )( X T ( ^ ˙ ( )) 1 X ) 1 X T ( ^ ˙ ( )) 1 g ] = f @ @ b ˙ 2 e ^ ˙ ( )) 1 ) @ b ˙ 2 e @ + @ @ b ˙ 2 b ^ ˙ ( )) 1 ) @ b ˙ 2 b @ g @ @ (trace( M ( ))) ; where M =^ ˙ ( )) 1 X ( )( X T ( ^ ˙ ( )) 1 X ) 1 ) X T ( ^ ˙ ( )) 1 ; =trace( @ ^ ˙ ( )) 1 @ @ @ b ˙ 2 e ) @ b ˙ 2 e @ +trace( @ ^ ˙ ( )) 1 ) @ @ @ b ˙ 2 b ) @ @ (trace( M ( ))) =trace( b ˙ ( )) 2 T D 2 ) @ b ˙ 2 e @ +trace( b ˙ ( )) 2 T D 1 ) @ b ˙ 2 b @ trace( @M ( ) @ ) : (3.4.35) 64 Weneedtoexplicitlysolveoneofthetermsin(3.4.35), @M ( b ˙ ( )) @ = @ ^ ˙ ( )) 1 @ X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 + (3.4.36) ^ ˙ ( )) 1 ( @X ( ) @ )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 + (3.4.37) ^ ˙ ( )) 1 X ( ) @ ( X 0 ( ^ ˙ ( )) 1 X ( )) 1 @ X T ( ^ ˙ ( )) 1 + (3.4.38) ^ ˙ ( )) 1 X ( X 0 ( ^ ˙ ( )) 1 X ( )) 1 @X ( ) @ ^ ˙ ( )) 1 + (3.4.39) ^ ˙ ( )) 1 X ( X 0 ( ^ ˙ ( )) 1 X ( )) 1 @ ^ ˙ ( )) 1 @ : (3.4.40) Further,weneedtosolvealltheterms(3.4.36)-(3.4.40)separatelyusingstandardmatrix calculus.Thereforeconsider(3.4.36), @ ^ ˙ ( )) 1 @ b ˙ 2 e @˙ 2 e @ + @ ^ ˙ ( )) 1 @ b ˙ 2 b @ b ˙ 2 b @ X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 = ˆ ( ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 @ b ˙ 2 e @ ) ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 @ b ˙ 2 b @ ) ˙ X ( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 : (3.4.41) 65 Solving(3.4.38)weget, ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 ˆ @X T ( ) @ ^ ˙ ( )) 1 X ( ) X T ( ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 X ( ) @ b ˙ 2 e @ X T ( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X @ b ˙ 2 b @ + X T ^ ˙ ( )) 1 @X ( ) @ ˙ ( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 : (3.4.42) Solving(3.4.40)weget, ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 ˆ ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 @ b ˙ 2 e @ ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 @ b ˙ 2 b @ ˙ : (3.4.43) 66 Substitutingtheexpressionsfrom(3.4.37) ; (3.4.39) ; (3.4.41) ; (3.4.42)and(3.4.43)weget, @M ( b ˙ ( )) @ = @ b ˙ 2 e @ ˆ ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 +^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 +^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 ˙ + @ b ˙ 2 b @ ˆ ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 +^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 ˙ +^ ˙ ( )) 1 @X ( ) @ b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) @X T ( ) @ ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 X T ( ^ ˙ ( )) 1 @X @ b ˙ ( )) X T ( ^ ˙ ( )) 1 +^ ˙ ( )) 1 X ( b ˙ ( )) @X T ( ) @ ^ ˙ ( )) 1 : (3.4.44) 67 Nowsubstituting(3.4.44)in(3.4.35),weobtainthefollowing, @ trace( G ( b ˙ ( ))) @ = @ b ˙ 2 e @ ˆ trace( b ˙ ( )) 2 T D 2 ) ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 ) ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ) ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 ) ˙ + @ b ˙ 2 b @ ˆ trace( b ˙ ( )) 2 T D 1 ) ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 ) ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ) ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 ) ˙ ^ ˙ ( )) 1 @X ( ) @ ( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 ) ^ ˙ ( )) 1 X ( b ˙ ( )) @X T ( ) @ ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ) ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 @X ( ) @ b ˙ ( )) X T ( ^ ˙ ( )) 1 ) ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 @X T ( ) @ ^ ˙ ( )) 1 ) : (3.4.45) Movingon,wecomputetheRHS(righthandside)ofequation(3.4.33), @ @ ( Y T G ( b ˙ ( )) G ( b ˙ ( )) Y ) = Y T @G ( b ˙ ( )) @ G ( b ˙ ( )) Y + Y T G ( b ˙ ( )) @G ( b ˙ ( )) @ Y (3.4.46) 68 Consider, @G ( b ˙ ( )) @ = @ @ f ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 g = ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 @ b ˙ 2 e @ @M ( b ˙ ( )) @ : (3.4.47) 69 Nowusing(3.4.44)in(3.4.47),weget, @ @ ( Y T G ( b ˙ ( )) G ( b ˙ ( )) Y ) = @ b ˙ 2 e @ ˆ ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 ˙ + @ b ˙ 2 b @ ˆ ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 ˙ ^ ˙ ( )) 1 @X ( ) @ b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) @X T ( ) @ ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 @X ( ) @ b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) @X T ( ) @ ^ ˙ ( )) 1 : (3.4.48) 70 Fornotationalconvenienceletusthreenewquantities U;U 1 and V asfollows, U = ˆ ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 ˙ ; (3.4.49) U 1 = ˆ ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 ˙ ; (3.4.50) and V = ^ ˙ ( )) 1 @X ( ) @ ( X 0 ( ^ ˙ ( )) 1 X ( )) 1 X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) @X T ( ) @ ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 @X ( ) @ b ˙ ( )) X T ( ^ ˙ ( )) 1 ^ ˙ ( )) 1 X ( )( X 0 ( ^ ˙ ( )) 1 X ( )) 1 @X T ( ) @ ^ ˙ ( )) 1 : (3.4.51) Now,substitutingtheequations(3.4.48) ; (3.4.49) ; (3.4.50) ; (3.4.51)in(3.4.46)wegetRHS 71 of(3.4.33)as Y T ˆ @ b ˙ 2 e @ ( ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 + U )+ @ b ˙ 2 b @ ( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 + U 1 ) + V ˙ G ( b ˙ ( )) Y + Y T G ( b ˙ ( )) ˆ @ b ˙ 2 e @ ( ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 + U ) + @ b ˙ 2 b @ ( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 + U 1 )+ V ˙ Y = @ b ˙ 2 e @ ˆ Y T UG Y + Y T GU Y Y T ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 Y T G ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 Y ˙ + @ b ˙ 2 b @ ˆ Y T U 1 G Y + Y T GU 1 Y Y T ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 Y T G ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 Y ˙ + Y T VG Y + Y T GV Y : (3.4.52) EquatingtheRHS(3.4.52)andLHS(3.4.45)of(3.4.31)weget, @ b ˙ 2 e @ ˆ trace( b ˙ ( )) 2 T D 2 )+trace( U ) Y T UG Y Y T GU Y + Y T ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 G Y + Y T G ^ ˙ ( )) 1 D 2 Y ˙ + @ b ˙ 2 b @ ˆ trace( b ˙ ( )) 2 T D 1 )+trace( U 1 ) Y T U 1 G Y Y T GU 1 Y + Y T ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 G Y + Y T G ^ ˙ ( )) 1 D 1 Y ˙ = Y T VG Y + Y T GV Y trace( V ) : (3.4.53) 72 Similarly,weneedtosolvebothsidesoftheequation(3.4.32).ConsidertheLHS @ @ f ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 X ( b ˙ ( )) X T ( ^ ˙ ( )) 1 D 1 ) g = @ @ b ˙ 2 e ^ ˙ ( )) 1 D 1 ) @ b ˙ 2 e @ + @ @ b ˙ 2 b ^ ˙ ( )) 1 D 1 ) @ b ˙ 2 b @ @ @ (trace( M ( ) D 1 )) =trace f @ @ ^ ˙ ( )) 1 D 1 ) @ @ b ˙ 2 e g +trace f @ @ ^ ˙ ( )) 1 D 1 ) @ @ b ˙ 2 b g @ b ˙ 2 b @ @ @ trace( M ( ) D 1 ) : (3.4.54) Now, @ @ trace( M ( ) D 1 )=trace( @MD 1 ) =trace( @M @ D 1 ) =trace f ( @ b ˙ 2 e @ U @ b ˙ 2 b @ U 1 V ) D 1 g =trace( UD 1 ) @ b ˙ 2 e @ trace( U 1 D 1 ) @ b ˙ 2 b @ trace( VD 1 ) : (3.4.55) Using(3.4.55)in(3.4.54),weget @ @ trace( GD 1 )= @ b ˙ 2 e @ trace( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 ) T D 2 +trace( UD 1 ) g + @ b ˙ 2 b @ trace( ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 ) T D 1 +trace( U 1 D 1 ) g +trace( VD 1 ) : (3.4.56) 73 SolvingRHSof(3.4.34)weget, @ @ ( Y T GD 1 G Y ) = Y T ( @G @ ) D 1 G Y + Y T GD 1 @G @ Y = Y T ˆ @ b ˙ 2 e @ ( U ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 ) + @ b ˙ 2 b @ ( U 1 ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 )+ V ˙ D 1 G Y + Y T GD 1 ˆ @ b ˙ 2 e @ ( U ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 ) + @ b ˙ 2 b @ ( U 1 ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 )+ V ˙ Y = @ b ˙ 2 e @ ˆ Y T UD 1 G Y + Y T GD 1 U Y Y T ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 D 1 G Y Y T GD 1 ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 Y ˙ + @ b ˙ 2 b @ ˆ Y T U 1 D 1 G Y + Y T GD 1 U 1 Y Y T ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 D 1 G Y Y T GD 1 ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 Y ˙ + Y T VD 1 G Y Y T GD 1 V Y : (3.4.57) Therefore,equatingLHS(3.4.56)=RHS(3.4.57)weget, @ b ˙ 2 e @ ˆ ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 ) T D 2 )+trace( UD 1 ) Y T UD 1 G Y Y T GD 1 U Y + Y T ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 D 1 G Y + Y T GD 1 ^ ˙ ( )) 1 D 2 ^ ˙ ( )) 1 Y ˙ + @ b ˙ 2 b @ ˆ ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 ) T D 1 )+trace( U 1 D 1 ) Y T U 1 D 1 G Y Y T GD 1 U 1 Y + Y T ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 D 1 G Y + Y T GD 1 ^ ˙ ( )) 1 D 1 ^ ˙ ( )) 1 Y ˙ = Y T VD 1 G Y + Y T GD 1 V Y trace( VD 1 ) : (3.4.58) Finally,wecansolvetheequations(3.4.53)and(3.4.58)simultaneouslytoobtain @ b ˙ 2 e =@ 74 and @ b ˙ 2 b =@ . Remark: Herewehavederivedthequantitiestoimplementthesimulationforastudy withonerandomInthestudies,whichhavemorethanonerandomsimilar methodcanbeusedtoderivetherequiredquantities. 75 Chapter4 Application:Meta-analysisusing models Meta-analysisprovidesawayofcombiningoutcomesacrosststudies.Theuseof meta-analysisisveryfrequentinclinicaltrialsandmanyotherapplications.TheInter- nationalConferenceonHarmonizationE9guidelinestatesthatmeta-analysesshouldbe prospectivelyplannedwiththeclinicaltrialsprograminthedevelopmentofanewtreat- ment.Inclinicaltrials,meta-analysisisprimarilyusedtoquantifythesizeandits uncertaintyoftheprimaryorthekeysecondaryendpoints.TheCochraneCollaboration providestheworld'slargestrepositoryofsystematicreviewandmeta-analysisfrequently usedinclinicaltrials.Theunderlyingassumptionofthecommonlyusedstatisticalmethods isthatallkeyendpointshavereasonablenumberofacrosststudies. However,thereareinstances(mainlyinsafetystudies)whentheprimaryinterestsofthe outcomesfromtstudiesarerare(lowincidences)andmeta-analysisisrequiredto capturethesize.Forexample,considerarecentlyinvestigatedmeta-analysisstudyof threetypesofneedlesusedintheEUS-FNA 1 procedureforgastrointestinallesions(forthe tialityreason,wearenotdisclosingthenameofthesponsorhere).Thisprocedureis usedtoobtainbiopsiesbeforeperforminganypancreaticsurgery.Thethreemostcommonly usedneedlesinthisprocedureare19-gauge(19G),22-gauge(22G)and25-gauge(25G)with 1 EUS-FNA:endoscopicultrasoundneedleaspiration 76 the19Gneedlehavingthelargestdiameter.Giventhelargersize,the19Gneedlecanyield ttissuefordiagnosisbutisrelativelytomaneuverandissuspectedtolead tomorecomplicationslikeexcessivebleeding(Trindade&Berzin,2013)especiallywiththe less-experiencedendosonographers.Thefocusoftherecentlyconductedmeta-analysiswas tocomparetheoverallcomplicationratesduetopancreatitis,bleedingduetotheneedle, perforation,infection,fever,painorelevatedpancreaticenzyme.Sincetheusageof19Gis limited,onlysixstudiesareavailableintheliteraturereportingoverallcomplicationswhich areasfollows0 = 38,0 = 72,2 = 18,1 = 44,0 = 30and0 = 13,wherethedenominatorrepresents thetotalnumberofsubjectsinthestudyandnumeratorrepresentsthenumberofsubjects reportingcomplications.Ontheotherhand,45studiesareavailablereportingtheusageof 22Gor25Gneedlesofwhich27studiesreportedzerooverallcomplicationsand18studies reportedsomeamountofcomplicationsmostlybetween1to5.Asummaryofthedatais providedinFigure1. Figure4.1:Summaryofthe22/25Gdata Thestudiesincludedintheaboveinvestigationarebasedonasimilarpatientpopulation. Thepaucityoftheeventsfromtstudiesandnon-randomizationnatureofthese studiesposesaseriousthreattothestatisticalcomparisonandexacerbatestheuseofall statisticalmethodsthatarenotdesignedforthesesituations. Inthefollowing,section4.1andsection4.2presentourextensionoftheselectedCPR 77 methodsandthenewproposedmethodsincorporatingmodels,section4.3 presentstheimplementationofthemethodsonrealworlddata,section4.4presentsasimu- lationstudywherethecomparisonsaremadetotheMHmethod,1959andthePetomethod, 1985andsection4.5drawstheconclusion. 4.1Likelihood-basedapproachduetoCPR Supposeweobservedatafrom n tstudies,outofwhich m arefromtreatmentarm 1and n m arefromtreatmentarm2.Let Y i bethenumberofadverseeventsobserved among n i subjectsinthe i th study.Let X i betheindicatorcovariateindicatingtreatment arm1.Inthisstudy,weareinterestedindetectingthebetweenthetwotreatment groups.Sincetheresponsevariableisdiscrete,webeginbyassumingaPoissonmodelon Y i . ThemethodssuggestedinCPRwereusedonlyforrandomizedstudieswhereeverystudy observedsubjectsfrombothtreatmentarms(treatmentarm1and2).However,inour setup,wewanttodrawacomparisonbetweentwotreatmentswheresomeorallstudiesdo nothaveobservedsubjectsinbothtreatmentarms.WederivethemethodsinCPRforour setup.Inthefollowingmodels,wewritedowntheexplicitlikelihoodandmaximizeittoget theestimatesoftheparametersofinterest. 4.1.1Poissonmodelwithrelativerisk:POIF FollowingasimilarmodelasinCPR, Y i followsaPoissondistributionwithrate n i i where, i = ˘ i e ˝X i : (4.1.1) 78 FollowingsimilarnotationasinCPR,theparameter ˘ i representsthebaselineeventrate while ˝ representsthelogrelativeriskassociatedwiththetreatment.Herethebaselineevent ratesvaryfromstudytostudy,buttherelativerisk( e ˝ )isconstant.Agammadistribution isassumedonthe ˘ i i.e. ˘ i ˘ Gamma( ; ).Thelikelihoodofthemodelcanbewrittenas follows, L n ( )= n Y i =1 y i + ) )( + n i e ˝X i ) ( y i + ) ( n i e ˝X i ) y i y i ! (4.1.2) where, =( ;;˝ ) 0 representsvectorofmodelparameters.Theabovelikelihoodequation canbemaximisedtoobtaintheestimatedparameters. 4.1.2Poissonmodelwithrandomrelativerisk:POIR Model(4.1.1)isextendedbyassuminglogrelativerisktoberandom,hence,inthiscase bothbaselineriskandrelativerisksarevariedfromstudytostudy.Therefore,themodelis i = ˘ i e ( ˝ i X i ) ; (4.1.3) wherethebaselineriskis ˘ i ˘ Gamma( ; )andthelogrelativeriskis ˝ i ˘ N ( ˙ 2 ).The resultinglikelihoodfunctionisgivenby, n Y i =1 Z 1 y i + ) )( + n i e ˝X i ) ( y i + ) ( n i e ˝X i ) y i y i ! 1 p 2 ˇ˙ e ( ˝ ) 2 = 2 ˙ 2 d˝: (4.1.4) Weapproximatetheabovelikelihoodfunctionusinggaussianquadratureandthenobtain theestimateof =( ;;˙ ) 0 : 79 4.2Proposedmethodsformeta-analysisbasedonzero Poissonmodels OnfurtherexaminationoftheEUS-guidedFNAdatadescribedintheintroduction,we observeamajorityofresponsesaszeros.Wesuspectthattherearemorezerosinthedata thanwouldbepredictedbyastandardPoissonmodel.Sincethedataisoverdispersed,we proposezeromodelswherethemeanandvariancedonothavetobeequal.This leadsustoproposethefollowingmodels. 4.2.1ZeroPoissonmodelwithrelativerisk:ZIPF Weassumethat Y i followsaPoissondistributionwithmean n i i andparameter ˇ i ; i.e. Y i ˘ 8 > < > : P ( n i i )w.p. ˇ i 0w.p.1 ˇ i 9 > = > ; : Asassumedinpreviousmodels, i = ˘ i e ˝X i where ˘ i ˘ Gamma ( ; ) : Further,weuse abetadistributiontomodeltheparameteri.e. ˇ i ˘ Beta ( c;d ) : Thefulllike- lihood L ( ;;˝;c;d )hasbeenderivedintheappendix.Wecanobtaintheestimatesof =( ;;˝;c;d ) 0 bymaximizingthelikelihood. 4.2.2ZeroPoissonmodelwithrandomrelativerisk:ZIPR Inmethod(4.2.1),weaccountedforthestudyvariation ˘ i .Nowwewouldliketointro- ducevariationin ˝ .Therefore,weextendmodel4.2.1to i = ˘ i e ˝ i X i toallowforpos- 80 sibilityofstudytostudyheterogeneitywithrespecttorelativerisk.Boththebaseline parametersandtherelativeriskareassumedtoberandom.Here, ˘ i ˘ Gamma ( ; )and ˝ i ˘ Normal ( ˙ 2 ) : Furthertheparameterisassumedtofollowabetadistribution i.e. ˇ i ˘ Beta ( c;d ).Tobeabletogettheestimatesoftheparameters( ;;˙;ˇ ),we needtocomputethelikelihoodofthespmodel,whichisapproximatedusingnumerical methodsliketheGaussianquadratureandLaplaceapproximations.Themethodisdetailed intheappendix. 4.3Numericalstudies Thissectionexplicatesthenumericalstudiesusingthemethodsdiscussedintheprevious section.First,wediscusstheanalysisusingtheEUS-FNAstudydiscussedintheintroduction andshowhowtheproposedmethodscanbeappliedinanonRCTsetup.Next,weexplicate therandomizedmeta-analysisbyrevisitingarecentlypublishedcontroversialmeta-analysis studypublishedintheNewEnglandJournalofMedicinejournal(Nissen&Wolski,2007, 2011). 4.3.1Example1:OverallcomplicationsduetoneedlesinEUS- FNA Asstatedinsection1,theobjectiveofthestudyistocomparetheoverallcomplicationrates duetothe19Gandthe22G/25GneedlesinEUSguidedFNAprocedure.Only6studies arefoundintheliteraturereportingtheoverallsurgicalcomplicationswith19G;whereas45 studiesarefoundusingeitherthe22Gneedleorthe25Gneedlereportingthesame.The traditionalmeta-analysisexamineseachtreatmentgroupseparatelybycomputingthepooled 81 estimatesandthecorrespondingintervalsoftheoverallcomplicationratesacross allstudies.Sincemanyoftheobservationsarezeroseitherin19Ggrouporin22G/25G group,almostallstatisticalsoftwaretoourknowledgebecamecomputationallyvulnerable; therefore,tobypassthecomputationalerrors,acontinuitycorrection0 : 5isaddedtothe studieswithzerocounts.ThenaiveestimatesforthetwogroupsareshowninTable4.1. Table4.1Initialproportionestimatesofthetreatmentgroupswithandwithoutcontinuity correction(CC) Treatment1:19GTreatment2:22/25GRelativeRisk Rate95%C.I.Rate95%C.I.Estimate95%C.I. With0.5CC0.0036[0,0.0180]0.0063[0.0039,0.0086]0.571[0.204,1.602] WithoutanyCC0.0139[0,0.0295]0.0124[0.0096,0.0152]1.121[0.262,4.801] BasedontheresultsinTable4.1,bycomparingtheestimates,wenoticethattheoverall complicationratesof22/25G(0 : 63%)ishigherthanthatof19G(0 : 36%).However,the 95%intervalsofthecomplicationratesof19Gand22/25Ggroupoverlap.Notice thattheintervalforthe19Ggroupismuchwiderascomparedtothe22/25Ggroup,as thenumberofstudiesandthecorrespondingsamplesizesin19Ggroupismuchsmaller thanthe22/25Ggroup.However,inthisanalysis,wehaveaddedacontinuitycorrection of0.5tothestudieswithzerocountswhichcanleadtoincorrectresults.Whenwebase ourcalculationsonrawdata(withoutanycontinuitycorrection);wethat3outof215 peoplehadcomplicationsinthe19Ggroupproducingtherisktobe0.0139asopposedto 74outof5957(risk=0.0124)inthe22G/25Ggroup.Hence,weobtainarelativeriskof 1.121suggestingthattheoverallcomplicationrateisabout12%higherinthe19Ggroupas comparedtothe22G/25Ggroup.Basedonthetraditionalmethods,noconclusioncanbe 82 drawnandfurtherinvestigationisrequired. Forfurtheranalysis,weusethemodelsdescribedinsections4.1and4.2.Sincethedata foreachgroupcontainsexcessivezeros,itisrelevanttotestifatnumberofzeros arecomingfromazerostateasopposedtothePoissondistribution.Forthispurpose,we conductalikelihoodratiotesttotestthefollowinghypothesis. H 0 : ˇ i =1 8 i =1 ;::;n inthe model Y i ˘ Poisson ( n i ˘ i e ˝X i )withtheparameteras ˇ i ,where ˘ i ˘ Gamma ( ; ), ˝ i ˘ N ( ˙ 2 )and ˇ i istheprobabilitythatthedatafollowsaPoissondistributionwithout anyzeroThelikelihoodratioteststatisticisgivenby 2 log ^ L ( ^ H 0 ) ^ L ( ^ ) ! (4.3.5) where, ^ H 0 = argmax ;H 0 ^ L ( ).Sincewearetestingfor ˇ i =1,whichisontheboundaryof theparameterspace,thelikelihoodratioteststatisticsfollowsamixtureof ˜ 2 distribution with50:50of ˜ 2 0 and ˜ 2 1 asdiscussedinSelfandLiang,1987.Thevalueoftheteststatistic is3.298andthep-valueis0.03.Basedonthetestresultswerejectthenullhypothesis therebyindicatingthatazeroPoissonmodelwouldbemoreappropriateforthis dataset. Estimationofrelativerisk ThemethodsinCPRwereonlyderivedforarandomized-controlstudy.Itwasnottested fornonrandomizedstudieswherethenumberofstudiesinthetwoarmsareunequal;like ourproposeddata.However,wewereabletoextendthetwomethodsbasedonPoisson modelsfromCPRforoursetup.Hence,weuseallmodelstothedataforcomparison.For modelsPOIFandZIPF,where ˝ isestimationofrelativeriskisstraightforward.We cancalculatetheasymptoticcointervalof ˝ byusingtheinformationmatrix.Butfor 83 modelsPOIRandZIPR,where ˝ israndom,weneedtocomputetheposteriordistribution of ˝ i i.e. f ( ˝ i j y ).Subsequently,theposteriormeancanbecalculated.Assuming ˝ i , ˘ i and ˇ i tobemutuallyindependent,theposteriordistributionof ˝ i when X i =1canbeobtained by f ( ˝ i j y )= f ( y j ˝ i ) :f ( ˝ i ) R f ( y j ˝ i ) :f ( ˝ i ) d˝ i ; = f ( y i j ˝ i ) :f ( ˝ i ) R f ( y i j ˝ i ) :f ( ˝ i ) d˝ i ; since ˝ i and y j areindependentfor i 6 = j where f ( y i j ˝ i )= R 1 f ( y i j ˝ i ; i ) f ( i ) d i : Here, i = ˘ i forPOIRand i =( ˘ i ;ˇ i )forZIPR.Theestimateoflogrelativeriskforthe i th studywhen f X i =1 g isdenotedby^ ˝ i = E ( ˝ i j y )canbeapproximatedusingnumerical methodssuchasgaussianquadrature.Thiswassubsequentlyusedtoobtaintheestimate ofthelogrelativerisk.Weusedparametricbootstraptocomputethestandarderrorofthe estimate.TheresultsforthefourmodelsaregiveninTable4.2.Here,formodelsPOIRand ZIPR,wereporttheaverageof^ ˝ 0 i s asanestimateofthelogrelativerisk. Table4.2EstimatesoflogrelativeriskforEUS-FNAstudy logrelativerisk( ˝ )C.I.ofrelativerisk ModelLoglikelihoodEstimateSE95%C.I. POIF-67.8860.360.77(0.32,6.48) POIR-67.8360.130.83(0.22,5.79) ZIPF-66.2370.470.77(0.35,7.23) ZIPR-66.1790.260.64(0.37,4.55) Fromtheresults,themeanofthebaselineeventrateisapproximately0.01(notshownin 84 thetable)forallmethods,indicatinglittlebetweenstudyvariations.FormodelsZIPFand ZIPRweobservethatabout60%(meanofparameter)oftheobservationscome fromthePoissondistributionwhiletheremaining40%comefromazerostate.Sinceweare interestedincomparingtheoverallcomplicationsinthe2treatmentgroups,ourparameter ofinterestintheabovemodelsisthelogrelativeriskgivenby ˝ .FrommodelZIPR,the pointestimateoftherelativerisk( e ˝ )is1.297,whichsuggeststhatthesubjectsintreatment group1(19G)haveabout30 : 0%higherriskofoverallcomplicationsascomparedtothose intreatmentgroup2(22/25G). TheestimatesgiveninTable4.2canbeusedtodrawinferencesaboutrelativerisk.The 95%cintervalfortherelativeriskformodelZIPRis(0 : 37 ; 4 : 55)whichsuggests thatthereisa95%chancethattherelativeriskofoverallcomplicationsinthe19Ggroup comparedto22/25Gisbetween0.37and4.55.Allmodelsindicatehighercomplicationrates inthe19Ggroup,however,theestimatesfromnoneofthemodelsarestatisticallyt; theintervalsareverywide.Althoughtherelativeriskestimateisnotstatistically tfromone,itisworthnoticingthatthestandarderrorof ˝ issmallestforZIPR therebyresultingintightercintervalascomparedtotheotherthreemodels.The tablealsogivesthevalueoftheloglikelihoodforthefourmodelswhichismaximumforthe modelZIPR. TheestimatesobtainedfromallfourmodelsaretfromthoseobtainedinTable 4.1whichwerebasedonnaivecalculationswherea0.5correctionwasmadetostudieswith zeroevents.Noinformationislostfromtheoriginaldata;theydonotmakeanycorrections ordeleteanypartofthedata.Also,theexcessivezeroscanbeaccommodatedwhenthey areobservedbyusingazerodistribution.Further,wewillshowbysimulationthat thezeromodels(ZIPFandZIPR)areabetterforthedata. 85 Simulationstudy Weneedtothemodelthatbestthedata.Weuseaparametricbootstrapapproachas describedinEfron&Tibshirani,1993andAllcroft&Glasbey,2003.Theideaistosimulate datafromeachmodelandallthemodelstothesamedatasetandcomputeandcompare thegoodnessofstatisticswhichinourcaseisthemaximizedloglikelihoodestimate.By simulatingfromeachmodelinturn,wedeterminewhetheranymodelisconsistentwiththe data.Thebestmodelisexpectedtobehaveinthesamemannerastheoriginaldataset. Theformalstepbystepprocedureisgivenasfollows, 1. Fitthe4modelstotheoriginalobserveddataandestimatethemodelparametersby optimizingthelikelihood.Recordthevaluesofthemaximizedloglikelihoodinthe vector u =( u 1 ;u 2 ;u 3 ;u 4 ) 0 . 2. Fromeachofthefourmodelsinstep1,replicatethedata200timesusingthe modelparametersobtainedinStep1,generating800(200 4)datasetsinall. 3. allthefourmodelstoeachreplicateddatasetandrecordthemaximizedlog likelihoodestimates.So,foreachdatasetwehavefourmaximizedloglikelihood estimatescorrespondingtothefourmodelssay l =( l 1 ;l 2 ;l 3 ;l 4 ) 0 .Thereare200such vectorswhenthedataisreplicatedfromthe i th model.Let u i denotethemeanofthe these200vectorsand S i denotethesamplecovariancematrixforthese200vectors. 4. Comparethemaximizedloglikelihoodsoftheoriginalobserveddatatothatofthe replicateddataforeachofthe4models. Now,sinceweusenegativeloglikelihoodasthegoodnessofcriteriatheirjointdistribution ismultivariatenormal.Therefore,wecanuseMahalanobissquareddistancetocomparethe 86 models(Mardia etal. ,1979)whichisgivenby, D 2 i =( u u i ) 0 S 1 i ( u u i ) : (4.3.6) Thisisthestandardizedsquareddistancebetweenthepointobtainedfromtheoriginaldata andthemeanofthe i th model.Wecantestthehypothesis H 0 : i th modeliscorrect,using thestatistic D 2 i .Underthenullhypothesis D 2 i isdistributedas4 F 4 ; 199 .Thereforethe p-valuescanbeobtained. Table4.3Mahalanobisdistances(EUS-FNAstudy) Model D 2 P-value POIF30.35 ˝ 0.001 POIR2.660.617 ZIPF2.600.626 ZIPR1.9280.748 BylookingattheresultsinTable4.3,wecanconcludethatPOIFisnotappropriateto thisdata.Also,closecomparisonshowsthatthevalueofMahalanobisdistanceforthe ZIPRmodelistheleastamongallfourmethods.Thistheuseofthezero PoissonmodelasopposedtoaPoissonmodel. Discussionofresults 19Gneedlesarethelargestindiameteramongthethreeneedlesizes.Therefore,itissus- pectedtoyieldthelargestamountoftissuesample,butitcanincreasethechancesof complicationsasmentionedinGerke,2009andTrindade&Berzin,2013.Fromtheresults ofallfourmodels,thepointestimateofrelativerisksuggeststhattherateofoverallcompli- 87 cationswouldbehigherinthe19Ggroupascomparedtothe22/25Ggroupwhichcomplies withthisreasoning.However,theresultsfromallmodelsarenotstatisticallyt andwebelievethatthisisbecausewehavelimiteddataavailableontheusageofthe19G needle.Outofthefourmodels,ZIPRprovidesuswiththetightestintervaland isshowntothemodelbestviathesimulationstudy. 4.3.2Example2:Rosiglitazonestudy WedeployourmodelstoanotherdatasetthatisalsodiscussedinCPR.Thedataused inthisstudyisthemyocardialinfarction(MI)relateddeathdatafromtherosiglitazone study.Rosiglitazoneisusedtotreatpatientswithdiabetesandresearchersareinterested inassessingtheofthisdrugonmyocardialinfarctions(MI)andcardiovascular(CV) mortality.AkeycontributiontowardsthisstudywasmadebyNissenandWolski,2007.The dataconsistedofthenumberofdeathsandthenumberofmyocardialinfarctionsin48studies inthetreatmentgroup(groupadministeredwiththerosiglitazonedrug)and48studiesin thecontrolgroup.Theyused42outofthe48studiessincetherewerenoeventsin6ofthe studiesinbothgroups.TheiranalysisconcludedthatrosiglitazoneincreasestheriskofMI andCVmortalitytly.However,thispaperwasconsideredverycontroversialand manyparallelanalyseswereperformed.TheRosiglitazoneEvaluatedforCardiacOutcomes andRegulationofGlycaemiainDiabetes(RECORD)byHome etal. ,2009)concludedno tincreaseofMIriskandatdecreaseinCVmortalityriskdueto rosiglitazonewhichcontradictedwiththepreviousresultsofNissenandWolski.Further analysisusinglargerdatasetswasdonebyGraham etal. ,2010andNissenandWolski, 2011whichthenegativeimpactofrosiglitazoneonMIriskandCVmortality, duetowhichthereislimitedaccesstothisdruginsomecountries.However,theof 88 rosiglitazonestillremaincontroversial.Inthisstudy,weconducttheanalysisusingthe48 studiesasdiscussedinNissenandWolski,2007andCPR.Asdoneinthepreviousexample, weconductalikelihoodratiotesttotest H 0 : ˇ i =1 8 i =1 ;::;n andfailtorejectthe nullhypothesis.Hence,wecannotestablishtheappropriatenessofonemodeloveranother forthisdata.Therefore,wecomparetheresultsobtainedfromallfourmodels.Table4.4 givesestimatesofthemeanandstandarderror(SE)oftherelativeriskbasedont methods. Table4.4EstimatesoflogrelativeriskforRosiglitazonestudy logrelativerisk( ˝ )C.I.ofrelativerisk ModelLoglikelihoodMeanSE95%C.I. POIF-124.410.250.17(0.92,1.79) POIR-129.350.250.16(0.94,1.75) ZIPF-129.440.250.28(0.74,2.22) ZIPR-129.380.240.15(0.95,1.70) Inthisstudy,onlyabout10%ofthezeroswerefromazerostate.Theresultsobtained fromallfourmodelsaresimilar.WeconcludethattheriskofMIisapproximately30% higherinthetreatmentgroup.Bylookingattheintervals,allfourmodels indicatemarginalAlthoughtheloglikelihoodvaluesdonotfavorthezero modelsoverthePoissonmodels,theZIPRmodelprovidesthetightest interval.Wewillalsoshowbysimulationthattheproposedmodelsusingzeroi Poissondistributionareabetterforthedata. Simulationstudy Weconductasimilarsimulationstudyasdescribedinsection(4.3.1)fortherosiglitazone dataandobtaintheresultsinTable4.5. 89 Table4.5Mahalanobisdistances(Rosiglitazonestudy) Model D 2 P-value POIF1.6400.80 POIR1.1110.89 ZIPF1.5920.81 ZIPR0.6690.95 Fromtheresults,weobservethatallthemodelsquitesatisfactorilytothedatabut closecomparisonshowsthatthevaluesofMahalanobisdistance( D 2 )forthemodelZIPR istheleastamongallfourmethods.AmongPOIFandZIPFwhere ˝ isassumedtobe non-random,ZIPFperformsbetterthanPOIFintermsofgoodnessof Discussionofresults FrommodelZIPR,weobtainthe95%C.I.forrelativeriskas(0.95,1.70)andap-value of0.10fortesting ˝ =0indicatingmarginalWehencetheresults obtainedbyNissenandWolski,2007,2011andCai,Parast&Ryan,2010;thatrosiglitazone causesamarginallytincreaseintheriskofmyocardialinfarctions. 4.4SimulationII Inthissectionwenumericallyanalyzetheperformanceofourmodelsundertsetups. WealsocomparetheirperformancetotheMantel-Haenszel(MH)methodandthePeto method.SinceMHandPetomethodscannotcomputetheestimateofrelativeriskwhen thestudiesarenon-randomized,thereforenocomparisonscouldbemadefordatasimilarto example1(EUS-FNAstudy).Hence,weusedthestudysizesfromexample2(rosiglitazone 90 study)tosimulateourdata.AllsimulationsweredoneinR,andtheestimatesforMHand Petomethodwerecomputedusingthepackage`meta'(GuidoSchwarzer,2014). Simulationsetup :Inthisstudy,thenumberofstudiesintreatmentandcontrolarmis chosenas48.Thestudysizesarechosentobesameasthoseinrosiglitazonestudy.Wegen- eratethenumberofadverseeventsusingabinomialdistributionwitherentprobabilities ofadverseeventstosimulateextremelyrareeventdatatomoderatelyrareeventdata.The probabilitieswerechosenas0.001(suchthatabout75%oftheobservationsarezeros)and 0.005(suchthatabout45%oftheobservationsarezeros).Forsimulationpurposes,thevalue ofrelativeriskistakenasonei.e.thereisnobetweenthetwogroups.Foreach setup,500datasetsaregenerated.Wecomputetheestimatesofrelativerisksfollowedby themeansquarederror(MSE).Wealsocomputethesizeofthetestforeachmethodtotest thefollowinghypothesis H 0 : ˝ =0,where ˝ isthelogrelativeriskat0 : 05level. Furthercomparisonsincludethepowerofthetestswherethetruevalueofthelog-relative risk ˝ isassumedtobe0.5.TheresultsfromthesimulationstudyaresummarizedinTable 4.6. 91 Table4.6ComparisonofmethodsonthebasisofMSE,SizeandPoweroftest ModelMSESizePower Extremelyrareevents (75%zeros) MH0.1730.0320.98 Peto0.1830.0360.99 POIF0.1830.1300.88 ZIPF0.1070.0500.89 Moderatelyrareevents (45%zeros) MH0.0310.0400.96 Peto0.0320.0560.98 POIF0.0300.0540.94 ZIPF0.0310.0540.94 Wealsocomputedtheestimatesforthemodels(POIRandZIPR)and obtainedsimilarresultsasinthemodels(POIFandZIPFrespectively).Hence, toavoidrepetitionandredundancy,hereweonlyincludetheestimatesfrom modelsintheTable.AsshowninTable4.6,forextremelyrareeventdata,the Poissonmodelprovidestheleastvalueofmeansquarederrorwhilepreservingthesizeofthe test.FormoderatelyrareeventdatathePoissonandPoissonmodelperform equallywellsincethePoissondistributionisabletoaccommodateforalmostallthezeros inthedata.Inbothcases,eventhoughtheMHandthePetomethodshavereasonablylow meansquarederrors,buttheyareunabletopreservethesizeofthetest.MHandPeto 92 methodsprovidebetterpowerbutwebelievethisisbecausetheyareunabletopreservethe size. 4.5Discussion Inthischapter,weproposedtheuseofzeroPoissonmodelwithrandomectsfor aclinicaltrialwhentheeventrateisrare.Earlierworkhasbeendoneunderarandomized controlsetupwheretheincidencerateislow.ThePoissonrandommodelsexplained inCPRdonotaccountfortheunequalnumberofstudiesinthetwotreatmentgroups.We extendthemodelsinCPRtoanonrandomizedtrialwithrareeventstodrawinferenceson twotreatments.Inourmethods,weareabletowritetheexplicitlikelihoodforeachstudy ineacharmseparatelywhichallowsustouseourmethodsforanontreatment-controltype study.Themodelingfunctionassumesthatthebaselineeventrate,logrelativeriskandthe zeroratesareindependentofeachother.Although,wehaven'tseenanyconcrete examplessuggestingotherwise,thismaynotalwaysbetrue. Inthischapter,themotivationalexampleoftheEUS-FNAstudyhasonly6studiesin treatmentgroup1asopposedto45intreatmentgroup2.Toaddresssomeoftheconcerns regardingreliabilityofresultsforsuchasetup,weconductedasimulationstudyforavariety ofcaseswithvaryingnumberofstudiesineachgroupwherewecomputetheestimatesand meansquarederror(MSE)ofthelog-relativerisk.MSEisusedasameasureofreliability. Weconcludedthatinallcases,thelog-relativerisk ˝ isestimatedaccurately,however,as thenumberofavailablestudiesincreases,MSEreducestlyapproximatelyatarate of p n . Onthebasisofdataanalysisandsimulationresults,thezeroPoissonmodels 93 areevidentlysuperiortothePoissonmodelswhentheeventrateisextremelyrareandthe ratioofthenumberofstudiesinthetwoarmsishighlyuneven(EUS-FNAstudy).However, thenumberofparameterstobeestimatedincreaseswhenweuseazeroPoisson model.Also,itisworthpointingoutthat,ingeneral,Poissonmodelwithrandom iscomputationallytohandle.CPRadoptedabayesianapproachusingMCMC methodstoestimatetherelativerisk.Poissonmodelwithrandombeing amorecomplicatedmodelrequirestheapproximationandmaximizationofanon-convex likelihoodfunctionoversixparameters.Theestimatesobtainedfromstandardoptimization routineslike nlminb and optim inR,areheavilydependentonthechoiceofinitialvalues.To dealwiththisissue,weusedatialEvolutionoptimizationalgorithmimplemented inthepackageDEoptim(Mullenetal.,2011)in R whichdoesnotrequirethefunctionto beeithercontinuousortiableorconvex.Itoptimizesthefunctionoveragivenrange ofparametervaluesandthecomputingtimeisreasonablyfast. 4.6ProofsofChapter4 Derivationofmodel4.2.1 Assume Y i followsaPoissondistributionwithmean n i i andparameter ˇ i ; i.e. Y i ˘ 8 > < > : P ( n i i )w.p. ˇ i 0w.p.1 ˇ i 9 > = > ; : 94 Here i = ˘ i e ˝X i and ˘ i ˘ Gamma ( ; ) : a i = 8 > < > : 0: Y i =0 1: Y i > 0 9 > = > ; : Theconditionallikelihoodof Y given ˘ iswrittenas n Y i =1 1 ˇ i + ˇ i e n i ˘ i e ˝X i 1 a i ˇ i e n i ˘ i e ˝X i ( n i ˘ i e ˝X i ) y i y i ! ! a i : (4.6.7) Since ˘ i ˘ Gamma ( ; )wecanintegrateout ˘ i from(4 : 6 : 7),toobtainthemarginallikeli- hood. Considerthedensityof Y i forsome i , ) Z 1 0 1 ˇ i + ˇ i e n i ˘ i e ˝X i 1 a i ˇ i e n i ˘ i e ˝X i ( n i ˘ i e ˝X i ) y i y i ! ! a i ˘ 1 i e ˘ i d˘ i : (4.6.8) Now,if a i =0,(4 : 6 : 8)becomes ) Z 1 0 (1 ˇ i + ˇ i e n i ˘ i e ˝X i ) ˘ 1 i e ˘ i d˘ i =1 ˇ i + ˇ i ) Z ˘ 1 i e ˘ i ( + n i 1 e ˝X i ) d˘ i =1 ˇ i + ˇ i ( + n i e ˝X i ) : (4.6.9) Similarly,if a i =1,(4 : 6 : 8)becomes ) Z 1 0 ˇ i e n i ˘ i e ˝X i ( n i ˘ i e ˝X i ) y i y i ! ˘ 1 i e ˘ i d˘ i = ˇ i ( n i e ˝X i ) y i y i ! ) Z ˘ y i + 1 i e ˘ i ( + n i e ˝X i ) d˘ i = ˇ i ( n i e ˝X i ) y i y i ! ) y i + ) ( + n i e ˝X i ) y i + : (4.6.10) 95 Therefore,using(4 : 6 : 9)and(4 : 6 : 10)thedensityof Y i i.e.(4 : 6 : 8)isgivenby 1 ˇ i + ˇ i ( + n i e ˝X i ) 1 a i ˇ i ( + n i e ˝X i ) y i + y i + ) ) ( n i e ˝X i ) y i y i ! ! a i : (4.6.11) Further,weassumeaBetadistributionontheparameteri.e. ˇ i ˘ Beta ( c;d ). Thereforeweneedtoexpressthelikelihoodintermsof c and d .Again,weconsiderthe densityof Y i .If a i =0,whenintegratedover ˇ i ,(4 : 6 : 11)canbewrittenas 1 ( c;d ) Z 1 0 (1 ˇ i + ˇ i ( + n i e ˝X i ) ) ˇ c 1 i (1 ˇ i ) d 1 dˇ i = 1 ( c;d ) Z 1 0 1 ˇ i (1 ( + n i e ˝X i ) ) ˇ c 1 i (1 ˇ i ) d 1 dˇ i = 1 ( c;d ) Z 1 0 (1 ( + n i e ˝X i ) )(1 ˇ i )+ ( + n i e ˝X i ) ˇ c 1 i (1 ˇ i ) d 1 dˇ i = 1 ( c;d ) (1 ( + n i e ˝X i ) ) Z 1 0 ˇ c 1 i (1 ˇ i ) d 1 dˇ i + ( + n i e ˝X i ) = ( c;d +1) ( c;d ) (1 ( + n i e ˝X i ) )+ ( + n i e ˝X i ) : (4.6.12) If a i =1,whenintegratedover ˇ i ,(4 : 6 : 11)canbewrittenas ( + n i e ˝X i ) y i + y i + ) ) ( n i e ˝X i ) y i y i ! 1 ( c;d ) Z 1 0 ˇ i ˇ c 1 i (1 ˇ i ) d 1 dˇ i = ( + n i e ˝X i ) y i + y i + ) ) ( n i e ˝X i ) y i y i ! ( c +1 ;d ) ( c;d ) : (4.6.13) Henceusing(4 : 6 : 12)and(4 : 6 : 13),theresultinglikelihoodbecomes L ( ;;˝;c;d )= n Y i =1 ( c;d +1) ( c;d ) 1 ( + n i e ˝X i ) + ( + n i e ˝X i ) 1 a i ( c +1 ;d ) ( c;d ) ( + n i e ˝X i ) y i + y i + ) ) ( n i e ˝X i ) y i y i ! ! a i 96 Wemaximizetheabovelikelihoodandobtaintheestimateof =( ;;˝;c;d ) 0 . Derivationofmodel4.2.2 Considerthedensityof Y i forsome i 2f treatmentarm1 g .Since ˝ i ˘ N ( ˙ 2 ),for a i =0,(4 : 6 : 9)becomes Z 1 1 f X i =1 g (1 ˇ i + ˇ i ( + n i e ˝ ) ) 1 p (2 ˇ ) ˙ e 1 2 ( ˝ ) 2 ˙ 2 d˝ = 1 f X i =1 g 0 @ 1 ˇ i + ˇ i p (2 ˇ ) ˙ Z 1 e 1 2 ( ˝ ) 2 ˙ 2 = ( + n i e ˝ ) d˝ 1 A = 1 f X i =1 g 1 ˇ i + ˇ i p ( ˇ ) ˙ Z 1 e t 2 = ( + n i e p 2 ˙t + ) dt; where ˝ p 2 ˙ = t ! = 1 f X i =1 g 1 ˇ i + ˇ i p (2 ˇ ) Z 1 f ( t ) w ( t ) dt ! ; where f ( t )= 1 ( + n i e p 2 ˙t + ) and w ( t )= e t 2 : WecanintegratetheaboveexpressionusingGauss-HermiteQuadrature. If a i =1,thedensityof Y i forsome i 2f treatmentarm1 g isgivenby 1 f X i =1 g ˇ i p 2 ˇ˙ y i + ) ) n y i i y i ! Z 1 e ˝y i ( + n i e ˝ ) y i + e 1 2 ( ˝ ) 2 ˙ 2 d˝ = 1 f X i =1 g ˇ i p 2 ˇ˙ y i + ) ) n y i i y i ! ˙e 1 2 ( 2 ˙ 2 ( ˙ + y i ˙ ) 2 ) Z e t 2 2 ( + n i e ˙t + + y i ˙ 2 ) y i + dt; where ˝ ˙ ( ˙ + y i ˙ )= t = 1 f X i =1 g ˇ i p ˇ y i + ) ) n y i i y i ! e 1 2 ( 2 ˙ 2 ( ˙ + y i ˙ ) 2 ) Z 1 f 1 ( t ) w 1 ( t ) dt; where f 1 ( t )= 1 ( + n i e p 2 ˙t + + y i ˙ 2 ) y i + and w 1 ( t )= e t 2 : WecanuseGauss-HermiteQuadraturetoapproximatethedensityof Y i .Next,weassume 97 abetadistributionon ˇ andfollowingsimilarargumentstheresultinglikelihoodisgivenby L ( ;;˙;c;d )= n Y i =1 1 f X i =1 g ( c;d +1) ( c;d ) (1 h 1 i p ˇ )+ h 1 i p ˇ 1 a i 0 @ ( c +1 ;d ) ( c;d ) p ˇ y i + ) ) y i ! n y i i e 1 2 ( 2 ˙ 2 ( ˙ + y i ˙ ) 2 ) h 2 i 1 A a i Y i 1 f X i =0 g ( c;d +1) ( c;d ) (1 ( + n j ) )+ ( + n j ) 1 a i ( c +1 ;d ) ( c;d ) ( + n j ) y j + y j + ) ) y j ! n y j j ! a i ; where h 1 i = Z 1 e t 2 = ( + n i e p 2 ˙t + ) dt and h 2 i = Z 1 e t 2 = ( + n i e p 2 ˙t + + y i ˙ 2 ) y i + dt: 98 BIBLIOGRAPHY 99 BIBLIOGRAPHY [1] AllcroftDJ,GlasbeyCA,Asimulation-basedmethodformodelevaluation. Statistical Modelling,AnInternationalJournal 2003; 3 (1):1-13. [2] AlnosaierWS. Kenward-RogerApproximateFTestforFixedctsinMixedLinear Models. PhDthesis,OregonStateUniversity,USA,2007. [3] BarOEandHallP.Onthelevel-errorafterBartlettadjustmentofthe likelihoodratiostatistic. Biometrika 1988; 75 :374-378. [4] BartlettMS.Propertiesofandstatisticaltest. ProcRSocLondon,SerA 1937; 160 :268-282. [5] BradburnMJ,DeeksJJ,BerlinJA,LocalioAR:Muchadoaboutnothing:acomparison oftheperformanceofmeta-analyticalmethodswithrareevents. StatisticsinMedicine 2007; 26 :53-77. [6] CaiT,ParastL,RyanL.Meta-analysisforrareevents. StatisticsinMedicine 2010; 29 : 2078-2089. [7] ChenQ,IbrahimJG,ChenMH,SenchaudhuriP,TheoryandInferenceforregression modelswithmissingresponsesandcovariates. JournalofMultivariateAnalysis 2008; 99 (6):1302-1331. [8] CoxDRandHinkleyDV. TheoreticalStatistics. ChapmanandHall,1990. [9] CoxDRandReidN.Parameterorthogonalityandapproximateconditionalinference. J RStatSocB 1987; 49 :1-39. [10] DanielsMJ,GatsonisC.Hierarchicalgeneralizedlinearmodelsintheanalysisofvaria- tionsinhealthcareutilization. JournaloftheAmericanStatisticalAssociation 1999; 94 : 29-30. [11] DerSimonianR,LairdN.Meta-analysisinclinicaltrials. ControlledClinicalTrials 1986; 7 :177-188. 100 [12] DiCiccioTJandSternSE.FrequentistandBayesianBartlettcorrectionofteststatistics basedonadjustedlikelihoods. JRStatSocB 1994; 56 :397-408. [13] EfronB,TibshiraniRJ.AnIntroductiontotheBootstrap. ChapmanandHall,New York 1993. [14] FitzmauriceG,LairdN,WareJ,AppliedLongitudinalAnalysis. JohnWiley&Sons: NewYork 2004. [15] GerkeH.EUS-guidedFNA:bettersampleswithsmallerneedles? GastrointestinalEn- doscopy 2009; 70 (6):1098-1100. [16] GoldsteinH. DesignandAnalysisofLongitudinalStudies. London:AcademicPress, 1979. [17] GrahamDJ,Ouellet-HellstromR,MaCurdyTEetal.Riskofacutemyocardialinfarc- tion,stroke,heartfailure,anddeathinelderlyMedicarepatientstreatedwithrosiglita- zoneorpioglitazone. JournaloftheAmericanMedicalAssociation 2010; 304 (4):411-418. [18] GregoryKB. AComparisonofDenominatorDegreesofFreedomApproximationMeth- odsintheUnbalancedTwo-WayFactorialMixedModel. Masterspaper,TexasA&M University,USA,2011. [19] GuidoSchwarzer.meta:Meta-AnalysiswithR. Rpackageversion3.7-0. 2014. [20] HarvilleDA,Maximumlikelihoodapproachestovariancecomponentestimationand relatedproblems. JournalofAmericanStatisticalAssociation 1977; 72 :320-340. [21] HarvilleDA,Decompositionofpredictionerror. JournaloftheAmericanStatistical Association 1985; 80 :132-138. [22] HarvilleDA,JeskeDR,Meansquarederrorofestimationorpredictionunderageneral linearmodel. JournaloftheAmericanStatisticalAssociation 1992; 87 :724-731. [23] HenryK,EriceAetal.,Arandomized,controlled,double-blindstudycomparingthe survivalboffourtreversetranscriptaseinhibitortherapies(three-drug, two-drug,andalternatingdrug)forthetreatmentofadvancedAIDS.AIDSClinical TrialGroup193AStudyTeam. Journalofacquiredimmunesyndromesand humanretrovirology. 1998; 19 (4):339-349. 101 [24] HigginsJPT,GreenS. CochraneHandbookforSystematicReviewsofInterventions. JohnWiley&SonsLtd,England. [25] HomePD,PocockSJ,Beck-NielsenHetal.Rosiglitazoneevaluatedforcardiovascular outcomesinoralagentcombinationtherapyforType2diabetes(RECORD):amulti- centre,randomised,open-labeltrial. Lancet 2009; 373 (9681):2125-2135. [26] IbrahimJG,Incompletedataingeneralizedlinearmodels. JournaloftheAmerican StatisticalAssociation 1990; 85 (411):765-769. [27] IbrahimJG,ChenMH,LipsitzSR,Missingresponsesingeneralizedlinearmixedmodels whenthemissingdatamechanismisnonignorable. Biometrika 2001; 88 (2):551-564. [28] IbrahimJG,ChenMH,LipsitzSR,HerringAH,Missing-Datamethodsforgeneralized linearmodels. JournaloftheAmericanStatisticalAssociation 2005; 100 (469):332-346. [29] InternationalConferenceonHarmonisation(ICH)ofTechnicalRequirementsforReg- istrationofPharmaceuticalsforHumanUse.ICHHarmonisedTripartiteGuideline.Sta- tisticalpriniciplesforclinicaltrials:E9 [30] KackarANandHarvilleDA.Approximationsforstandarderrorsofestimatorsof andrandomtsinmixedlinearmodels. JAmStatAssoc 1984; 79 :853-862. [31] KenwardMGandRogerJH.Smallsampleinferenceforfromrestricted maximumlikelihood. Biometrics 1997; 53 :983-997. [32] LairdNMandWareJH.modelsforlongitudinaldata. Biometrics 1982; 38 :963-974. [33] LanePW.Meta-analysisofincidenceofrareevents. StatisticalMethodsinMedical Research 2013; 22 :117-132. [34] LeeperJD,ChangSW,Comparisonofgeneralmultivariateandmodels forgrowthcurveanalysis:incompletedatasmallsamplesituations. JournalofStatistical ComputationandSimulation 1992; 44 (1-2):93-104. [35] LittleRJAandRubinDB. StatisticalAnalysiswithMissingData. NewYork:John Viley&Sons,1987. [36] MantelN,HaenszelW.Statisticalaspectsoftheanalysisofdatafromretrospective studiesofdisease. JournaloftheNationalCancerInstitute 1959; 22 (4):719-748. 102 [37] MardiaKV,KentJT,BibbyJM. MultivariateAnalysis. AcademicPress,London,1979. [38] McKeonJJ,FapproximationstothedistributionofHotelling's T 2 0 . Biometrika 1974; 61 (2):381-383. [39] MeloTFN,FerrariSLPandCribari-NetoFransisco.Improvedtestinginferenceinmixed linearmodels. ComputStatDataAnal 2009; 53 (7):2573-2582. [40] MullenK,ArdiaD,GilD,WindoverD,ClineJ.DEoptim:AnRPackageforGlobal OptimizationbytialEvolution. JournalofStatisticalSoftware 2011; 40 (6):1-26. [41] NissenS,WolskiK.ofrosiglitazoneontheriskofmyocardialinfarctionanddeath fromcardiovascularcauses. NewEnglandJournalofMedicine 2007; 356 (24):2467-2471. [42] NissenS,WolskiK.Anupdatedmeta-analysisofriskformyocardialinfarctionand cardiovascularmortality. ArchivesofInternalMedicine 2010; 170 (14):1191-1201. [43] PattersonHDandThompsonR.Recoveryofinter-blockinformationwhenblocksizes areunequal. Biometrika 1971; 58 :545-554. [44] RDevelopmentCoreTeam.R:Alanguageandenvironmentforstatisticalcomputing. RFoundationforStatisticalComputing,Vienna,Austria 2014. [45] SatterthwaiteFF.Synthesisofvariance. Psychometrika 1941; 6 :309-316. [46] SelfSG,LiangKY.AsymptoticPropertiesofMaximumLikelihoodEstimatorsandLike- lihoodRatioTestsUnderNonstandardConditions. JournaloftheAmericanStatistical Association 1987; 82 (398):605-610. [47] StubbendickAL,IbrahimJG,Maximumlikelihoodmethodsfornonignorablemissing responsesandcovariatesinrandommodels. Biometrics 2003; 59 (4):1140-1150. [48] TheCochraneCollaboration.TheCochranePolicyManual [49] ThompsonR.Maximumlikelihoodestimationofvariancecomponents. Mathematische OperationsforschungundStatistik,SeriesStatistics 1980; 11 :545-561. [50] TianL,CaiT,PfeM,PiankovN,CremieuxP-Y,WeiLJ.Exactandtinfer- enceprocedureformetaanalysisitsapplicationtotheanalysisofindependent2x2tables withallavailabledatabutwithoutcontinuitycorrection. Biostatistics 2009; 10 : 275-281. 103 [51] TrindadeAJ,BerzinTM.Clinicalcontroversiesinendoscopicultrasound. Gastroen- terologyReport1 2013;33-41. [52] VerbekeGandMolenberghsG. Linearmixedmodelsinpractice:ASASorientedap- proach. NewYork:Springer-Verlag,1997. [53] VerbekeGandMolenberghsG. Linearmixedmodelsforlongitudinaldata. NewYork: Springer,2000. [54] WeissRE,ModelingLongitudinalData. Springer:NewYork 2005. [55] WelhamSJandThompsonR.Likelihoodratiotestsformodeltermsusingresidual maximumlikelihood. JRStatSocB 1997; 59 (3):701-714. [56] YusufS,PetoR,LewisJ,CollinsR,SleightP.Betablockadeduringandaftermyocardial infarction:anoverviewoftherandomizedtrials. ProgressinCardiovascularDiseases 1985; 27 (5):335-371. [57] ZuckerDM,LiebermanOandManorO.Improvedsmallsampleinferenceinthemixed linearmodel:Bartlettcorrectionandadjustedlikelihood. JRStatSocB 2000; 62 (4): 827-838. 104