EMPIRICAL LIKELIHOOD BASED FUNCTIONAL DATA ANALYSIS AND HIGH DIMENSIONAL INFERENCE WITH APPLICATIONS TO BIOLOGY

By

Honglang Wang

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Statistics - Doctor of Philosophy

2015

ABSTRACT

EMPIRICAL LIKELIHOOD BASED FUNCTIONAL DATA ANALYSIS AND HIGH DIMENSIONAL INFERENCE WITH APPLICATIONS TO BIOLOGY

By

Honglang Wang

High dimensional data analysis has been a rapidly developing topic in statistics with various applications in areas such as genetics/genomics, neuroscience, social science and so on. With the rapid development of technology, statistics as a data science requires more and more innovations in methodologies as well as breakthroughs in mathematical frameworks. In the high dimensional world, classical statistical methods designed for finite dimensional models are often doomed to fail. This thesis focuses on two types of high dimensional data analysis. One is the study of the typical "large p small n" problem in linear regression with high dimensional covariates X ∈ R^p but small sample size n, and the other is functional data analysis. Functional data belong to the class of high dimensional data in the sense that every data object consists of a large number of measurements, which may be larger than the sample size. But the key characteristic is that functional objects can be modeled as smooth curves or surfaces. We make use of Empirical Likelihood (EL), introduced by [Owe01], to solve some fundamental problems in these two particular high dimensional settings.
The first part of the thesis considers the problem of testing functional constraints in a class of functional linear regression models where both the predictors and the response are functional data measured at discrete time points. We propose test procedures based on the empirical likelihood with bias-corrected estimating equations to conduct both pointwise and simultaneous inference. The asymptotic distributions of the test statistics are derived under the null and local alternative hypotheses, where sparse and dense functional data are considered in a unified framework. We find a phase transition in the asymptotic distributions and the orders of detectable alternatives from sparse to dense functional data. Specifically, the proposed tests can detect alternatives of root-n order when the number of repeated measurements per curve is of an order larger than n^{θ₀}, with n being the number of curves. The transition points θ₀ are different for pointwise and simultaneous tests, and both are smaller than the transition point in the estimation problem.

In the second part of the thesis, we consider hypothesis testing problems for a low-dimensional coefficient vector in a high-dimensional linear model under heteroscedastic error. Heteroscedasticity is a commonly observed phenomenon in many applications including financial and genomic studies. Several statistical inference procedures have been proposed for low-dimensional coefficients in a high-dimensional linear model with homoscedastic noise. However, those procedures designed for homoscedastic error are not applicable for models with heteroscedastic error, and the heteroscedasticity issue has not been investigated and studied. We propose a unified inference procedure based on empirical likelihood to overcome the heteroscedasticity issue. The proposed method is able to make valid inference under the heteroscedastic model even when the conditional variance of the random error is a function of the high-dimensional predictor. We apply our inference procedure to three recently proposed estimating equations and establish the asymptotic distributions of the proposed methods.

For both of the two parts, simulation studies and real data analyses are conducted to demonstrate the proposed methods.
I would like to dedicate this thesis to my beloved parents, Daofu Wang and Shuizhen Wang, and my little brother, Hailang Wang.

ACKNOWLEDGMENTS

First and foremost, I would like to express my sincere gratitude to my two advisors, Dr. Yuehua Cui and Dr. Ping-Shou Zhong, for their continuous support, guidance, understanding, patience and encouragement during my PhD study and research. Dr. Cui and Dr. Zhong have pushed me into contact with a multitude of disciplines, and their guidance about how to approach research, write, and give talks has been invaluable. They have also provided me excellent environments for doing research in the development of methodology and theory as well as in real data analysis. Without their guidance and persistent help, this dissertation would not have been possible. For all of this, I am very thankful to both of my advisors.

I would also like to thank the other wonderful members of my research committee, Dr. C. Robin Buell and Dr. Hyokyoung (Grace) Hong. In getting my dual PhD degree in Quantitative Biology, Dr. Buell has provided much guidance and assistance, which also makes me more confident to become a Bio-statistician. I have been enjoying the involvement in the potato project and learning a lot from monthly group calls, annual meetings and the Bioinformatics workshops. Her guidance provided me with the unique opportunity to gain a wider breadth of experience in biological science, which is especially important for a Bio-statistician.

Besides that, I thank all the other professors and staff in this wonderful Department of Statistics and Probability who have never flinched about answering a question from a nagging graduate student, something that is embedded in the culture of Wells Hall. My special thanks go to Dr. Hira L. Koul, Dr. Yimin Xiao and Dr. Tapabrata Maiti for their interesting courses, valuable advice and encouragement.

Coming to friends, I am grateful to Yuzhen Zhou, Tao He, Jikai Lei, Chen Yue, Xin Qi, Liqian Cai, Xiaoqing Zhu, Bin Gao, Xu Liu and all other fellow students from the Department of Statistics and Probability for the friendship and the fun time we spent together in the past five years.
Finally and most importantly, I would like to express my profound gratitude to my beloved parents, Daofu Wang and Shuizhen Wang, and my little brother, Hailang Wang, for their love, endless support and faith in me in all of my endeavors.

TABLE OF CONTENTS

LIST OF TABLES .................................... ix
LIST OF FIGURES ................................... x
KEY TO ABBREVIATIONS ............................. xi
Chapter 1 Introduction ............................... 1
  1.1 Empirical Likelihood ............................... 1
  1.2 Big Data Analysis ................................. 4
    1.2.1 Functional Data Analysis ......................... 4
    1.2.2 High Dimensional Data Analysis .................... 7
Chapter 2 Unified pointwise empirical likelihood ratio tests for functional linear models and the phase transition from sparse to dense functional data ....................... 10
  2.1 Introduction .................................... 10
  2.2 A bias-corrected estimator and some preliminary results ........... 14
    2.2.1 A bias-corrected estimator ........................ 14
    2.2.2 Regularity conditions and preliminary results ............. 15
  2.3 A unified pointwise test ............................. 19
  2.4 Implementation issues .............................. 22
    2.4.1 Bandwidth selection ........................... 22
    2.4.2 Covariance Estimation .......................... 24
  2.5 Simulation studies ................................ 25
  2.6 Technical Details ................................. 28
    2.6.1 Proof of Theorem 1 ............................ 28
    2.6.2 Proofs of Propositions .......................... 29
      2.6.2.1 Some Useful Lemmas ...................... 30
      2.6.2.2 Proof of Propositions ...................... 48
      2.6.2.3 Existence of RMELE and the asymptotic expression for β̃ .. 49
Chapter 3 Unified simultaneous empirical likelihood ratio tests for functional linear models and the phase transition from sparse to dense functional data ..........................
66
  3.1 Introduction .................................... 66
  3.2 A unified simultaneous test ........................... 68
    3.2.1 Null distribution and local power .................... 69
    3.2.2 Wild bootstrap procedure ........................ 73
  3.3 Simulation studies ................................ 74
  3.4 Real data analysis ................................. 77
    3.4.1 CD4 data analysis ............................ 77
    3.4.2 Ergonomics data analysis ......................... 78
  3.5 Technical Details ................................. 81
    3.5.1 Proofs of Main Theorems ........................ 81
      3.5.1.1 Proof of Theorem 2 ....................... 81
      3.5.1.2 Proof of Corollary 1 ...................... 84
      3.5.1.3 Proof of Theorem 3 ....................... 85
      3.5.1.4 Proof of Theorem 4 ....................... 89
    3.5.2 Proofs of Proposition and Lemma .................... 91
Chapter 4 Empirical Likelihood in Testing Coefficients in High Dimensional Heteroscedastic Linear Models ................ 94
  4.1 Introduction .................................... 94
  4.2 Preliminary and Existing Methods ....................... 95
    4.2.1 Lasso Projection ............................. 97
    4.2.2 KFC Projection .............................. 99
    4.2.3 Inverse Projection ............................. 102
  4.3 Empirical Likelihood Based Approach ...................... 104
  4.4 Theoretical Examples ............................... 106
    4.4.1 Lasso Projection ............................. 106
    4.4.2 Inverse Projection ............................. 107
    4.4.3 KFC Projection .............................. 108
  4.5 Simulation Studies ................................ 109
  4.6 Real Data Analysis ................................ 117
    4.6.1 WGCNA of correlated genes ....................... 118
    4.6.2 Significance Test ............................. 118
    4.6.3 Presence of Heteroscedasticity ...................... 120
    4.6.4 Results for Top 4 Genes with Heteroscedasticity ............ 121
  4.7 Technical Details ................................. 124
    4.7.1 Assumptions for Theoretical Examples ................. 124
    4.7.2 Proof of Theorems ............................ 127
Chapter 5 Conclusions and Future Directions
................. 156
  5.1 Summary and Contributions ........................... 156
  5.2 Future Directions ................................. 157
BIBLIOGRAPHY .................................... 159

LIST OF TABLES

Table 1.1 Transition phase point from sparse to dense data and optimal detectable order of local alternatives for both pointwise and simultaneous inference. Note that we lowered the transition phase point θ₀, which was 1/4 in the existing literature. ........ 6

Table 2.1 Empirical coverage probability (%) and average length of pointwise confidence intervals (in parentheses) for β₁(t) at t = 0.3, 0.5 and 0.7. .... 26

Table 3.1 Empirical size and power for testing H_{0A}: β₁(·) = β₂(·) under scenario A. .................................. 76

Table 3.2 P-values for pairwise comparison among different treatment groups. ... 79

Table 3.3 P-values for testing each coefficient function in the quadratic model (3.4.4). ................................... 81

Table 4.1 Power comparison. Covariate: Toeplitz matrix with ρ = 0.2; Error: N(0,1). ................................ 114

Table 4.2 Power comparison. Covariate: Toeplitz matrix with ρ = 0.2; Error: 0.7 X₁ N(0,1). ............................. 115

Table 4.3 Power comparison. Covariate: Toeplitz matrix with ρ = 0.2; Error: (1/√(p−1)) X₁ Σ_{j=2}^p X_{j−1} X_j N(0,1). .......... 116

Table 4.4 Module Sizes. .............................. 118

LIST OF FIGURES

Figure 2.1 Panels (a) and (b) are boxplots for bandwidths selected for model (2.5.19) with β₁(t) = (1/2) sin(πt) and β₂(t) = 2 sin(πt + 0.5) using the proposed bandwidth selection method in Section 2.4. Panels (c) and (d) are the plots of the logarithm of median(ĥ) vs log(nm). ........ 27

Figure 3.1 Empirical size and power for testing H_{0B}: β₂(·) = 0 at the 5% nominal level under scenario B. The left panel is for ρ = 0.2 and the right panel is for ρ = 0.5. .......................... 76

Figure 4.1 Empirical Size and Power Comparison among Empirical Likelihood based approaches and among the Holy Trinity, with p = 100.
(a) "EL-KFC" represents the EL approach with KFC projection, "EL-INV" represents the EL approach with inverse projection and "EL-LASSO" represents the EL approach with Lasso projection; (b) "Wald" represents the Wald type test, "Score" represents the Score test and "EL" represents the likelihood ratio test. ...... 112

Figure 4.2 Empirical Size and Power Comparison among Empirical Likelihood based approaches and among the Holy Trinity with Heteroscedastic Noise (1/√(p−1)) X₁ Σ_{j=2}^p X_{j−1} X_j N(0,1) and p = 500. (a) "EL-KFC" represents the EL approach with KFC projection, "EL-INV" represents the EL approach with inverse projection and "EL-LASSO" represents the EL approach with Lasso projection; (b) "Wald" represents the Wald type test, "Score" represents the Score test and "EL" represents the likelihood ratio test. ...... 113

Figure 4.3 Breast Cancer Cohort Studies. (a) Clustering dendrogram of genes, with dissimilarity based on topological overlap, together with assigned module colors. (b) Manhattan plot for Module 3. ...... 119

Figure 4.4 Wandering Schematic Plot for Top 4 Genes with Heteroscedasticity. ...... 122

Figure 4.5 Manhattan Plot for Top 4 Genes with Heteroscedasticity. ...... 123

KEY TO ABBREVIATIONS

EL: Empirical Likelihood;
IID: Independent and Identically Distributed;
KL: Kullback-Leibler;
SNPs: Single Nucleotide Polymorphisms;
GWAS: Genome Wide Association Studies;
KFC: Key conFounder Controlling;
WGCNA: Weighted Gene Co-expression Network Analysis;

Chapter 1

Introduction

1.1 Empirical Likelihood

In Statistics, the likelihood principle is the primary principle, as stated in [Edw84]:

Within the framework of a statistical model, all the information which the data provide concerning the relative merits of two hypotheses is contained in the likelihood ratio of those hypotheses on the data. ... For a continuum of hypotheses, this principle asserts that the likelihood function contains all the necessary information.
However, for the inference procedure to be more widely applicable, some non-parametric version of the likelihood is desirable so that we can not only gain robustness and flexibility but also keep the effectiveness as well as some other merits of the likelihood principle. In the late eighties, Professor Art B. Owen proposed the great idea of "Empirical Likelihood" (EL) [Owe88, Owe90], which is a non-parametric likelihood. The well known "Wilks Phenomenon" belonging to the parametric likelihood still holds for EL [Owe90, Owe01]. EL also enjoys the Bartlett correction property [DHR91, CC06]. Besides, it produces more natural data driven shapes of confidence regions.

We consider the univariate mean inference problem to introduce the EL idea. Given n IID observations {X_i ∈ R, i = 1, 2, …, n} from an unknown underlying distribution F₀ with the first two moments finite, we want to conduct inference for the univariate mean μ₀ := E_{F₀}(X_i). A natural point estimate of μ₀ is the sample mean X̄, but how to get an efficient confidence interval with a given confidence level is not that simple, since we have no idea about the underlying distribution beyond the first two moments. According to [Owe90], the empirical likelihood for μ is the product of the probability weights, say {0 ≤ p_i ≤ 1, i = 1, 2, …, n}, sitting on the sample points {X_i, i = 1, 2, …, n}, that is ∏_{i=1}^n p_i, with the moment constraint Σ_{i=1}^n p_i X_i = μ, i.e.

L_EL(μ) = max_{{p_i}} { ∏_{i=1}^n p_i : Σ_{i=1}^n p_i = 1, p_i ≥ 0, Σ_{i=1}^n p_i (X_i − μ) = 0 }.   (1.1.1)

Actually, we can derive the above formulation (1.1.1) in the following formal way. The statistical model with the moment restriction could be phrased formally as the set of all probability measures that are compatible with the moment condition, i.e. P = ∪_μ P(μ), where

P(μ) = { probability measure P on R : ∫ (X − μ) dP = 0 }.

Note that it is correctly specified if and only if P includes the true measure dF₀(x) as its member. The following function could be regarded as a measure of the divergence between two probability measures P and Q:

D(P, Q) = ∫ φ(dP/dQ) dQ,

as long as φ is chosen to be convex. And we know that the Kullback-Leibler (KL) divergence between probability measures P and Q is a special case obtained by taking φ(x) = −log(x).
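The constrained maximization in (1.1.1) is typically computed by profiling out the weights with a Lagrange multiplier: p_i = 1/[n{1 + λ(X_i − μ)}], where λ solves Σ_{i=1}^n (X_i − μ)/{1 + λ(X_i − μ)} = 0. The following Python sketch of this computation is illustrative only (the function name `el_log_ratio` and the bisection scheme are my own choices, not from the thesis); it returns the log-EL-ratio statistic 2 Σ_i log{1 + λ(X_i − μ)}.

```python
import math

def el_log_ratio(x, mu):
    """Empirical likelihood ratio statistic for a candidate mean mu.

    Profiles out the weights p_i = 1/(n*(1 + lam*(x_i - mu))) and returns
    2 * sum_i log(1 + lam*(x_i - mu)), where the Lagrange multiplier lam
    solves sum_i (x_i - mu)/(1 + lam*(x_i - mu)) = 0.  That equation is
    strictly decreasing in lam, so plain bisection is enough.
    """
    z = [xi - mu for xi in x]
    n = len(z)
    if max(z) <= 0 or min(z) >= 0:
        return float("inf")                  # mu outside the convex hull: L(mu) = 0
    lo = (1.0 / n - 1.0) / max(z) + 1e-12    # keeps every 1 + lam*z_i >= 1/n
    hi = (1.0 - 1.0 / n) / (-min(z)) - 1e-12

    def score(lam):
        return sum(zi / (1.0 + lam * zi) for zi in z)

    for _ in range(200):                     # bisection: score(lo) > 0 > score(hi)
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * zi) for zi in z)
```

By the Wilks property of EL, this statistic is asymptotically χ²₁ at the true mean, so {μ : el_log_ratio(x, μ) ≤ 3.84} is an approximate 95% confidence interval; the statistic is +∞ whenever μ lies outside the convex hull of the data, where the constraint set in (1.1.1) is empty.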
If the model is correctly specified, we have the following nice property at the population level:

μ₀ = arg inf_μ inf_{P ∈ P(μ)} D(P, F₀).

Hence a natural statistical procedure for the estimation of the mean can be obtained by replacing the unknown F₀ with the empirical measure F_n and searching over the restricted statistical model P = ∪_μ P(μ), where

P(μ) = { F_p := Σ_{i=1}^n p_i δ_{X_i} : ∫ (X − μ) dF_p = 0 }.

And then the estimate of the mean is defined as the minimizer of the following optimization problem:

inf_μ inf_{F_p ∈ P(μ)} D(F_p, F_n) = inf_μ inf_{Σ_i p_i(X_i−μ)=0, Σ_i p_i=1, p_i≥0} (1/n) Σ_{i=1}^n φ(n p_i).   (1.1.2)

In particular, with the KL divergence in (1.1.2), we have

inf_μ inf_{Σ_i p_i(X_i−μ)=0, Σ_i p_i=1, p_i≥0} −(1/n) Σ_{i=1}^n log(n p_i),

which naturally leads to the log empirical likelihood as defined in (1.1.1):

ℓ_EL(μ) := log L_EL(μ) = max_{{p_i}} { Σ_{i=1}^n log p_i : Σ_{i=1}^n p_i = 1, p_i ≥ 0, Σ_{i=1}^n p_i (X_i − μ) = 0 }.

Most importantly, [Owe90] proved the following Wilks property:

−2 ℓ_EL(μ₀) − 2n log n →_d χ²₁.

Based on this asymptotic result, we can not only perform hypothesis testing but also construct a confidence interval for the mean parameter with a data driven shape.

An overview of the EL methods can be found in [Owe01] and [CVK09].

1.2 Big Data Analysis

In the age of information and technology, along with the advancement of the technological revolution, information acquisition is becoming easy and cheap, which leads to the explosion of data collection through automated data collection processes. From various fields such as biomedical sciences, engineering and social sciences, massive data characterized by high dimensionality are popping up all the time. For example, with the rapid development of next generation sequencing technology, hundreds of thousands of genetic variants such as single nucleotide polymorphisms (SNPs) are potential features in genome wide association studies (GWAS). Time series with very dense time points can be collected from hundreds of thousands of regions in economics, earth sciences, as well as neuroscience. In the Big Data era, documents, images, videos and other objects can all be regarded as forms of massive data.
Statisticians have also been proposing new statistical methodologies to discover knowledge from those big data: for example, moving from studying data points in Euclidean spaces to studying curves (i.e. functional data analysis), surfaces, and even manifolds directly in infinite dimensional spaces.

1.2.1 Functional Data Analysis

We consider the following general functional linear regression model,

Y_i(t_ij) = β₀^⊤(t_ij) X_i(t_ij) + ε_i(t_ij), i = 1, …, n; j = 1, …, m_i,   (1.2.3)

where X_i(t) ∼ {μ(t), Γ(s,t)}, t_ij ∼ f(t) and ε_i(t) ∼ {0, Σ(s,t)} are mutually independent. For convenience, assume that the m_i's (1 ≤ i ≤ n) are all of the same order as m = n^θ for some θ ≥ 0. Data with θ = 0 are called sparse functional data, i.e. longitudinal data; those satisfying θ ≥ θ₀, where θ₀ is a transition point to be specified, are referred to as dense functional data. The scenarios with θ ∈ (0, θ₀) are in a grey zone in the literature and we refer to them as "moderately dense".

Historically, sparse and dense functional data were analyzed with different methodologies. For dense functional data, one can smooth each curve separately and proceed with further estimation and inference based on the pre-smoothed curves. A partial list of recent literature on dense functional data includes [CLS86], [RS91], [ZC07], [EH08] and [BHK+09]. For sparse functional data, the pre-smoothing approach is not applicable and, instead, one needs to pool all data together to borrow strength from individual curves [YMW05a, YMW05b]. [HMW06] investigated the theoretical properties of functional principal component analysis based on local linear smoothers. They found that, for dense functional data with θ ≥ 1/4, the pre-smoothing errors are asymptotically negligible and quantities such as the mean, covariance and eigenfunctions can be estimated with a parametric root-n rate, while these quantities can only be estimated with a nonparametric convergence rate for sparse functional data with θ = 0. Since sparse and dense functional data are asymptotic concepts and are hard to distinguish in reality, [LH10] proposed an estimation procedure treating all types of functional data under a unified framework including the moderately dense cases.
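To make the sparse versus dense regimes concrete, the toy generator below draws data from a scalar (p = 1) version of model (1.2.3) with m = ⌈n^θ⌉ measurements per curve, so θ = 0 mimics sparse longitudinal data and larger θ gives denser designs. This is a minimal sketch under assumed distributions: the name `simulate_flm`, the Gaussian X process, the uniform design density and the noise level are all illustrative choices, not the settings used in the thesis.

```python
import math
import random

def simulate_flm(n, theta, beta=lambda t: math.sin(math.pi * t), seed=0):
    """Draw n curves from Y_i(t_ij) = beta(t_ij) * X_i(t_ij) + eps_i(t_ij).

    Each curve is observed at m = ceil(n**theta) random time points in
    [0, 1]; returns a list of (times, x_values, y_values) triples.
    """
    rng = random.Random(seed)
    m = max(1, math.ceil(n ** theta))
    data = []
    for _ in range(n):
        level = rng.gauss(1.0, 0.5)                   # random level of the X_i process
        t = sorted(rng.random() for _ in range(m))    # t_ij ~ Uniform(0, 1)
        x = [level + 0.3 * math.cos(2 * math.pi * s) for s in t]
        y = [beta(s) * xs + rng.gauss(0.0, 0.2) for s, xs in zip(t, x)]
        data.append((t, x, y))
    return data
```

With n = 200 curves, theta = 0 yields one observation per curve (the sparse extreme), while theta = 0.3 already gives about five per curve; the phase-transition results of Chapters 2 and 3 describe how inference behaves as θ crosses the thresholds 1/8 and 1/16.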
More recently, [KZ13] proposed a unified self-normalizing approach to construct pointwise confidence intervals for the mean function of functional data. The aforementioned papers established θ₀ = 1/4 as the transition point to the parametric convergence rate.

In contrast to estimation, less is known about inference for functional data, with a few exceptions such as [ZC07] and [KZ13]. In Chapters 2 and 3 of the thesis, we propose pointwise and simultaneous inference procedures for the functional linear model under a unified framework for all types of functional data and investigate the phase transition from sparse to dense data. We are not only the first to propose an inference procedure in the regression setup which can cover all types of functional data, but also the first to investigate the transition phase from sparse to dense functional data, for the following very broad hypothesis testing problem

H₀: H{β₀(t)} = 0 vs H_{1n}: H{β₀(t)} = b_n d(t),   (1.2.4)

where H(·) is any specified functional satisfying some regularity conditions and b_n is the detectable order of local alternatives to be specified (Table 1.1). In Chapters 2 and 3, we not only derive the asymptotic distributions under the null hypothesis and local alternatives, but also propose a wild bootstrapping approach to unify the inference procedure in practice, along with a nice bandwidth selection method.

Table 1.1: Transition phase point from sparse to dense data and optimal detectable order of local alternatives for both pointwise and simultaneous inference. Note that we lowered the transition phase point θ₀, which was 1/4 in the existing literature.

         Pointwise Inference (θ₀ = 1/8)      Simultaneous Inference (θ₀ = 1/16)
         0 ≤ θ < θ₀       θ ≥ θ₀             0 ≤ θ < θ₀       θ ≥ θ₀
b_n      n^{−4(1+θ)/9}    n^{−1/2}           n^{−8(1+θ)/17}   n^{−1/2}
ThetheoreticalpropertiesoftheLassoestimatorsuchastheoracleproperty,whichrefers toconsistentlyrecoveringthesparsepatternandestimatingtheparametersoftheco cientvector,andselectionconsistencyhavebeeninvestigatedby[MY09,BRT09,BTW + 07, VdG08,Zha09,NRWY12]and[MB06,ZY06,Wai09].Thenonconvexrepresentativesinclude SCAD[FL01],MCP[Zha10],amongothers.Acomprehensiveoverviewofhighdimensional estimationforhomoscedasticregressionmodelscanbefoundin[BVDG11]. Despiteitsprevalenceinstatisticaldatasets,heteroscedasticityhasbeenlargelyignored inhighdimensionalstatisticsliterature.[WWL12]analyzedtheheteroscedasticityinhigh dimensionalcasebyusingquantileregression.[DCL12]proposedamethodologythatallows nonconstanterrorvariancesforhighdimensionalestimationbutwithaparametricformof thevariancefunction.Andrecently,[BCW14]cameupwithaself-tuning p Lassoestimation methodthatsolvedthisimportantprobleminhighdimensionalregressionanalysis. Althoughpeoplehavemadetprogresstowardsunderstandingtheestimation theoryforhighdimensionalmodels,verylittleworkhasbeendoneforconstructing intervals,statisticaltestingandassigninguncertaintyforpenalizedestimatorsinhighdimen- sionalsparsemodels.Inanearlywork,[KF00]showedthatthelimitingdistributionofthe Lassoestimatorisnotnormaleveninthelowdimensionalsetting.Recently,[GVHF11]and 7 [CG14]consideredglobaltestingwithhighdimensionalalternative.[MMB09]and[WR09] consideredp-valuesbasedonthesamplesplittingtechnique.Stabilityselection[MB10]and itsmo[SS13]provideanotherproceduretoestimateerrormeasuresforfalseposi- tiveselectionsingeneralhighdimensionalsettings.Forthelassoestimator,[LTTT14]and [TLTT14]consideredaninterestingconditionalinferencewithrandomhypothesis,whichis philosophicallytwiththetraditionalunconditionalinference. 
In terms of testing the significance of one single regression coefficient, the classical z test (or t test) is no longer applicable because of the high dimensionality. People have been proposing low-dimensional projection procedures to conduct hypothesis testing and construct confidence regions [ZZ14, B+13, JM13, vdGBR13, LZL+13, NL14]. The way to select the projection variables varies from method to method. Some of them use the node-wise Lasso procedure to select the projection variables, and some of them use the so-called Key conFounder Controlling (KFC) method motivated by screening approaches [FL08].

However, all the above inference procedures assumed homoscedasticity for the error term; in particular, that the conditional variance of the error is a constant. This is essential for their inference procedures to be valid, since they require accurate estimation of the error variance. Without homoscedasticity, it is hard for them to carry out the estimation of the error variance in high dimensional settings. But homoscedasticity hardly holds in practice. There is rarely good cause to have strong belief in the assumption that the errors are homoscedastic, and similarly there is rarely sufficient information to enable accurate specification of the variance function. The use of incorrect variance models will, in general, lead to inferences that are not asymptotically valid [Bel02]. [WD12] generalized the asymptotic results of [KF00] to the case of a fixed parameter dimension under heteroscedastic errors. But there is little work dealing with heteroscedasticity under a dimension growing along with the sample size. To bridge this gap, in Chapter 4 of this thesis, we propose to use Empirical Likelihood (EL) to test statistical hypotheses and construct confidence regions for low dimensional components in high dimensional linear models with heteroscedastic noise.
Chapter 2

Unified pointwise empirical likelihood ratio tests for functional linear models and the phase transition from sparse to dense functional data

2.1 Introduction

We consider statistical inference problems under a general functional linear regression model, where both the response Y(t) and the covariate X(t) = {X^{(1)}(t), …, X^{(p)}(t)}^⊤ are defined continuously on a time interval [a, b]. The relationship between Y(t) and X(t) is given by

Y(t) = β₀^⊤(t) X(t) + ε(t),   (2.1.1)

where β₀(t) = (β₁₀(t), …, β_{p0}(t))^⊤ is a p-dimensional vector of unknown functions and ε(t) is a zero mean error process, independent of X and with a covariance function Σ(s,t) = Cov{ε(s), ε(t)}. The model in (2.1.1) is also referred to as the concurrent functional linear model in [SR05], which includes the varying coefficient models and functional analysis of variance (fANOVA) models [MC06, ZHM+10] as special cases. In many fANOVA applications, some components of X(t) are random indicators of treatment assignments with complicated cross or nested structures; see [FZ00] for more discussions on the relationship and differences between model (2.1.1) and the varying coefficient models. Without loss of generality, we allow X(t) to be a multivariate random process with mean function μ(t) = E{X(t)} and covariance function Γ(s,t) = Cov{X(s), X(t)}.

Let {Y_i(t), X_i(t)}, i = 1, …, n, be independent realizations of {Y(t), X(t)}. Instead of observing the entire trajectories, one can only observe Y_i(t) and X_i(t) on discrete time points {t_ij, j = 1, …, m_i}. For convenience, denote Y_ij = Y_i(t_ij) and X^{(k)}_ij = X^{(k)}_i(t_ij), and assume that the m_i's (1 ≤ i ≤ n) are all of the same order as m = n^θ for some θ ≥ 0; that is, the m_i/m are bounded below and above by some constants. Functional data are considered to be sparse or dense depending on the order of m [HMW06, LH10]. Data with bounded m, or θ = 0, are called sparse functional data; those satisfying θ ≥ θ₀, where θ₀ is a transition point to be specified below, are referred to as dense functional data. The scenarios with θ ∈ (0, θ₀) are in a grey zone in the literature and we refer to them as "moderately dense" in this chapter.
As we mentioned in Section 1.2.1, sparse and dense functional data were analyzed with different methodologies. But sparse and dense functional data are asymptotic concepts and are hard to distinguish in practice; [LH10] proposed an estimation procedure treating all types of functional data under a unified framework including the moderately dense cases, and they found that θ₀ = 1/4 is the transition point to the parametric convergence rate in estimation. In contrast to estimation, less is known about inference for functional data, with a few exceptions such as [ZC07] and [KZ13]. The focus of this chapter is on proposing pointwise and simultaneous inference procedures for the functional linear model in (2.1.1) under a unified framework for all types of functional data and investigating the phase transition from sparse to dense data. We are interested in testing

H₀: H{β₀(t)} = 0 vs H₁: H{β₀(t)} ≠ 0,   (2.1.2)

where H{z} is a q-dimensional function of z = (z₁, …, z_p)^⊤ ∈ R^p such that C(z) := ∂H(z)/∂z^⊤ is a q × p full rank matrix (q ≤ p) for all z.

The test problem in (2.1.2) is very broad, including many interesting hypotheses as special cases. For instance, if H{z} = z, the null hypothesis is equivalent to H₀: β_{k0}(·) = 0 for all k. If H{z} = (z₁ − z₂, z₂ − z₃, …, z_{p−1} − z_p)^⊤, then (2.1.2) is essentially an ANOVA hypothesis for the coefficient functions β_{k0}(·). If H{z} = Az − c₀ for a q × p known constant matrix A and a known vector c₀, then (2.1.2) becomes H₀: Aβ₀(·) = c₀, which is a test for linear constraints on β₀(·). Similar hypothesis testing problems have been studied by [ZC07] and [Zha11]. However, their methods only apply to dense functional data with θ > 5/4.

In this chapter, we propose nonparametric tests based on the empirical likelihood (EL) to test (2.1.2) pointwisely. We show that the EL-based tests enjoy a nice self-normalizing property such that we can treat both sparse and dense functional data under a unified framework.
There have been some works on EL methods for sparse functional data with θ = 0. Among them, [XZ07] proposed an EL method for constructing pointwise confidence intervals and a Bonferroni type simultaneous confidence band for the mean function. [CZ10] studied an EL-based method for testing ANOVA type hypotheses in partial linear models with missing values.

To investigate the power of the tests, we consider the local alternatives

H_{1n}: H{β₀(t)} = b_n d(t),   (2.1.3)

where b_n is a sequence of numbers converging to 0 at a rate to be specified later and d(t) ≠ 0 is any q-dimensional function. For a given test, b_n is the smallest order of the local alternatives such that the test has non-trivial power for any non-zero d(·). Thus b_n quantifies the order of signals that a test can detect. For sparse data with θ = 0, it is known that the EL method using a global bandwidth h [CZ10] can detect alternatives of order b_n = (nh)^{−1/2} for pointwise tests. Since h → 0 in a typical nonparametric regression setting, the detectable order here is larger than n^{−1/2}. However, for dense data with θ > 0, the detectable order b_n is still largely unknown. One key interest in this chapter is to understand the effect of θ on b_n in the pointwise test. The optimal b_n is obtained by maximizing the power of the test (i.e., minimizing the order of b_n) while controlling the type I error at the desired level. Under some mild conditions, we find that, for the pointwise test, the optimal rate b_n is larger than n^{−1/2} for θ ≤ 1/8 and equals n^{−1/2} for θ > 1/8. The transition point 1/8 will be referred to as θ₀ for the pointwise tests. Once θ > θ₀, with a properly chosen bandwidth, the proposed tests can detect a signal at a parametric rate.

The rest of the chapter is organized as follows. In Section 2.2, we present a bias-corrected estimator and some preliminary results. We propose the unified pointwise EL test in Section 2.3, where we investigate the asymptotic distributions of the test statistic under both the null and local alternatives, and the transition phases for b_n. In Section 2.4, we address implementation issues such as bandwidth selection and covariance estimation. Simulation studies are presented in Section 2.5. All the technical details are relegated to Section 2.6.
2.2 A bias-corrected estimator and some preliminary results

In this section, we will first introduce an initial local linear estimator β̂(t) [FG96] for β₀(t) and then introduce a bias-corrected estimator β̄(t) and some preliminary results.

2.2.1 A bias-corrected estimator

Let K(·) be a symmetric probability density function that we use as a kernel, h be a bandwidth, and denote K_h(·) = K(·/h)/h. For any t in a neighborhood of t₀, β_{k0}(t) can be approximated by

β_{k0}(t) ≈ β_{k0}(t₀) + [∂β_{k0}(t₀)/∂t](t − t₀) := a_k + b_k(t − t₀), k = 1, 2, …, p.

Denote ϑ = (a₁, …, a_p, hb₁, …, hb_p)^⊤ and D_ij(t) = (X_ij^⊤, ((t_ij − t)/h) X_ij^⊤)^⊤. Put

Y_i = (Y_{i1}, Y_{i2}, …, Y_{im_i})^⊤, Y = (Y₁^⊤, Y₂^⊤, …, Y_n^⊤)^⊤,
D_i(t) = (D_{i1}(t), D_{i2}(t), …, D_{im_i}(t))^⊤, D(t) = (D₁^⊤(t), D₂^⊤(t), …, D_n^⊤(t))^⊤,
W_i(t) = (1/m_i) diag{K_h(t_{i1} − t), K_h(t_{i2} − t), …, K_h(t_{im_i} − t)},
and W(t) = diag(W₁(t), W₂(t), …, W_n(t)).

An estimator for ϑ is obtained as

ϑ̂ = argmin_ϑ [Y − D(t₀)ϑ]^⊤ W(t₀) [Y − D(t₀)ϑ] = [D^⊤(t₀) W(t₀) D(t₀)]^{−1} D^⊤(t₀) W(t₀) Y.   (2.2.4)

Thus the local linear estimator for β₀(t₀) is

β̂(t₀) = (I_p, 0_p) ϑ̂ = (I_p, 0_p) [D^⊤(t₀) W(t₀) D(t₀)]^{−1} D^⊤(t₀) W(t₀) Y,   (2.2.5)

where I_p is a p × p identity matrix and 0_p is a p × p zero matrix. It is shown in Lemma 1 in Section 2.6.2 that

sup_{t∈[a,b]} |β̂(t) − β₀(t)| = O{ h² + (log n/n + log n/(nmh))^{1/2} } a.s.   (2.2.6)

Since the bias of β̂(t) is of order h², undersmoothing is typically needed for an unbiased testing procedure based on β̂(t) [XZ07]. To avoid undersmoothing and reduce the estimation bias in β̂(t), we define β̄(t) as the solution of the following residual-adjusted [XZ07] estimating equation for β(t):

g_n{β(t)} := (1/n) Σ_{i=1}^n g_i{β(t)} = 0,   (2.2.7)

with g_i{β(t)} = (1/m_i) Σ_{j=1}^{m_i} [ Y_ij − β^⊤(t) X_ij − {β̂(t_ij) − β̂(t)}^⊤ X_ij ] X_ij K_h(t_ij − t), where β̂(t) is the local linear estimator for β₀(t).
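For a single covariate (p = 1), the weighted least squares problem (2.2.4) is a two-parameter fit, so the matrix inversion in (2.2.5) reduces to a closed-form 2×2 solve. The sketch below is illustrative only: the Epanechnikov kernel, the function name `local_linear_beta` and the (times, x, y)-triple data layout are my own assumptions, not specifications from the thesis.

```python
def local_linear_beta(data, t0, h):
    """Local linear estimate of beta(t0) in Y_ij = beta(t_ij) * X_ij + eps_ij.

    Minimizes sum_i (1/m_i) sum_j K_h(t_ij - t0) *
    (Y_ij - a*X_ij - b*((t_ij - t0)/h)*X_ij)^2 over (a, b) and returns
    a = beta_hat(t0), using the Epanechnikov kernel K(u) = 0.75*(1 - u^2).
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t, x, y in data:                         # one (times, x, y) triple per curve
        m = len(t)
        for tj, xj, yj in zip(t, x, y):
            u = (tj - t0) / h
            if abs(u) >= 1.0:
                continue                         # outside the kernel support
            w = 0.75 * (1.0 - u * u) / (h * m)   # K_h(t_ij - t0) / m_i
            s11 += w * xj * xj
            s12 += w * xj * xj * u
            s22 += w * xj * xj * u * u
            r1 += w * xj * yj
            r2 += w * xj * yj * u
    det = s11 * s22 - s12 * s12
    return (s22 * r1 - s12 * r2) / det           # the intercept a, as in (2.2.5)
```

Because the fit is exact for coefficient functions that are linear over the kernel window, the leading bias of the estimator comes from curvature, which is the O(h²) term in (2.2.6); the residual-adjusted equation (2.2.7) is designed to remove this bias without undersmoothing.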
2.2.2 Regularity conditions and preliminary results

We now present some preliminary results regarding the asymptotics of β̄(t). Assume that the t_ij are i.i.d. random variables following a probability density function f(t). For convenience, define Γ(t) = Γ(t,t), Σ(t) = Σ(t,t), C(t) = C{β₀(t)} and A(t) = Γ(t) f(t). We will also use õ_p and Õ_p to represent that, respectively, o_p and O_p hold uniformly for all t ∈ [a,b]. The following conditions are needed for our asymptotic results.

(C1): The kernel function K(·) is a symmetric probability density function with bounded support in [−1, 1].

(C2): Assume that E{sup_{t∈[a,b]} ||X(t)||^{α₁}} < ∞ and E{sup_{t∈[a,b]} |ε(t)|^{α₂}} < ∞ for some α₁, α₂ ≥ 5, where ||·|| is the L₂ norm for a vector.

(C3): Assume that f(t) and Γ(t) have continuous second derivatives on [a,b], β₀(t) has continuous third derivatives on t ∈ [a,b], and C(t) is uniformly bounded on t ∈ [a,b].

(C4): Define α = min{α₁, α₂} and let h = n^{−γ₀} with γ₀ ∈ (0,1) being the order of the bandwidth. Assume that (i) γ₀ < 1 − 2/α if θ ∈ [0, 1/8] and γ₀ < 1/2 − 1/α if θ > 1/8; (ii) (1+θ)/9 < γ₀ if θ ∈ [0, 1/8] and 1/8 < γ₀ < θ if θ > 1/8.

Conditions (C1) and (C3) are commonly used regularity conditions in nonparametric regression. Condition (C2) is similar to that in [LH10]. The upper bounds on the bandwidth h in (C4)(i) are adapted from [LH10]. Detailed explanation of the restrictions on h in (C4)(ii) will be given in Remark 2 after Proposition 2. Selecting a bandwidth that satisfies (C4) will be discussed in Section 2.4.

The following proposition provides an asymptotic expansion for β̄(t).

Proposition 1. Under conditions (C1)-(C3) and (C4)(i),

β̄(t) − β₀(t) = A^{−1}(t) ξ_n(t) {1 + õ_p(1)} + Õ_p(h⁴),   (2.2.8)

where ξ_n(t) = n^{−1} Σ_{i=1}^n ξ_i(t) and ξ_i(t) = m_i^{−1} Σ_{j=1}^{m_i} X_ij ε_ij K_h(t_ij − t). Let

r̄ = lim_{n→∞} n^{−1} Σ_{i=1}^n m/m_i, ν_{ts} = ∫ u^s K^t(u) du;

then

Var{ξ_n(t)} = Σ(t) Γ(t) f(t) { r̄ ν₂₀/(mnh) + (m − r̄) f(t)/(nm) } {1 + õ(1)}.   (2.2.9)

The proof of Proposition 1 is provided in Section 2.6.2.

Remark 1. Proposition 1 implies that the mean square error (MSE) of β̄(t) is MSE(t) = O{h⁸ + 1/(mnh) + 1/n}. Hence the optimal h_opt that minimizes the MSE of β̄(t) is h_opt ∼ (mn)^{−1/9} = n^{−(1+θ)/9}. It follows that

β̄(t) − β₀(t) = O_p{h⁴_opt + (mnh_opt)^{−1/2} + n^{−1/2}} = O_p{n^{−1/2} + n^{−4(1+θ)/9}}.

Then the optimal convergence rate of β̄(t) is of order n^{−4(1+θ)/9} if θ ≤ 1/8 and of order n^{−1/2} if θ > 1/8. Thus, θ₀ = 1/8 is the transition point for the convergence rate of β̄(t). When θ > θ₀, β̄(t) is no longer sensitive to the choice of h and its convergence rate remains at O_p(n^{−1/2}) as long as h = O(n^{−1/8}) and h ≫ m^{−1} = n^{−θ}.

The following proposition provides the asymptotic distribution of β̄(t) and its proof is provided in Section 2.6.2.

Proposition 2. Suppose mh → κ₀ ∈ [0, ∞],

C₀ = {n/(mh)}^{1/2} if κ₀ < ∞, and C₀ = n^{1/2} if κ₀ = ∞,   (2.2.10)
Proposition 1 implies that the mean squared error (MSE) of $\bar\beta(t)$ is $\mathrm{MSE}\{\bar\beta(t)\}=O\{h^8+\frac{1}{mnh}+\frac1n\}$. Hence the optimal $h_{\mathrm{opt}}$ that minimizes the MSE of $\bar\beta(t)$ is $h_{\mathrm{opt}}\sim(mn)^{-1/9}=n^{-(1+\theta)/9}$. It follows that
\[
\bar\beta(t)-\beta_0(t)=O_p\{h_{\mathrm{opt}}^4+(mnh_{\mathrm{opt}})^{-1/2}+n^{-1/2}\}=O_p\{n^{-1/2}+n^{-4(1+\theta)/9}\}.
\]
Then the optimal convergence rate of $\bar\beta(t)$ is of order $n^{-4(1+\theta)/9}$ if $\theta\le 1/8$ and of order $n^{-1/2}$ if $\theta>1/8$. Thus $\theta_0=1/8$ is the transition point for the convergence rate of $\bar\beta(t)$. When $\theta>\theta_0$, $\bar\beta(t)$ is no longer sensitive to the choice of $h$ and its convergence rate remains at $O_p(n^{-1/2})$ as long as $h=O(n^{-1/8})$ and $h\gg m^{-1}=n^{-\theta}$.

The following proposition provides the asymptotic distribution of $\bar\beta(t)$; its proof is provided in Section 2.6.2.

Proposition 2. Suppose $mh\to\kappa_0\in[0,\infty]$,
\[
C_0=\begin{cases}\{n/(mh)\}^{1/2}, & \text{if }\kappa_0<\infty,\\[2pt] n^{1/2}, & \text{if }\kappa_0=\infty,\end{cases} \tag{2.2.10}
\]
and $B(t)=\sigma(t,t)\Gamma(t)f(t)\{(r\nu_{20}+\kappa_0 f(t))I(\kappa_0<\infty)+f(t)I(\kappa_0=\infty)\}$. Under conditions (C1)-(C4), we have
\[
nC_0^{-1}\{\bar\beta(t)-\beta_0(t)\}\stackrel{d}{\to}N(0,V(t)), \tag{2.2.11}
\]
where $V(t)=A^{-1}(t)B(t)A^{-1}(t)$.

Remark 2. By Proposition 1, the bias in $nC_0^{-1}\{\bar\beta(t)-\beta_0(t)\}$ is of order $O_p(nh^4/C_0)$. Since the bias can lead to invalid tests, we use Condition (C4)(ii) to ensure that the bias is asymptotically negligible. When $\theta\le 1/8$, the condition $\gamma_0>(1+\theta)/9$ warrants that $mh<\infty$ and hence $nh^4/C_0=n^{1/2}m^{1/2}h^{9/2}=n^{(1+\theta-9\gamma_0)/2}=o(1)$. When $\theta>\theta_0$, the condition $1/8<\gamma_0<\theta$ implies $mh\to\infty$ and $nh^4/C_0=n^{1/2}h^4=n^{1/2-4\gamma_0}\to 0$.

By Proposition 2 and the Delta method, we can show that, under $H_0$,
\[
nC_0^{-1}H\{\bar\beta(t)\}\stackrel{d}{\to}N(0,R^{-1}(t)), \tag{2.2.12}
\]
where $R(t)=\{C(t)V(t)C(t)^\top\}^{-1}$. The asymptotic variances of $H\{\bar\beta(t)\}$ are different under the sparse and dense cases. A Wald-type test statistic could be constructed using (2.2.12) if an appropriate estimator for the variance of $H\{\bar\beta(t)\}$ could be obtained. But we will not pursue this direction, because the estimation of the asymptotic variance involves many nonparametric functions, e.g.
$\Gamma(t)$, $\sigma(t,t)$ and $f(t)$, which requires properly selecting several bandwidths. Instead, we propose a self-normalizing EL method in the next section which avoids estimating the asymptotic variance explicitly.

2.3 A pointwise test

In this section, we will introduce a test for $H_0$ at any fixed time $t$, based on the empirical likelihood ratio (ELR) statistic. To construct an ELR statistic for testing (2.1.2), we first define the EL function at $\beta(t)$ for a fixed $t\in[a,b]$. Following [Owe90], the empirical likelihood for $\beta(t)$ is defined as
\[
L\{\beta(t)\}=\max_{p_1,p_2,\dots,p_n}\Bigl\{\prod_{i=1}^n p_i:\sum_{i=1}^n p_i=1,\;p_i\ge 0,\;\sum_{i=1}^n p_i g_i\{\beta(t)\}=0\Bigr\}.
\]
Applying the Lagrange multiplier method, the log-EL function becomes
\[
l\{\beta(t)\}:=\log L\{\beta(t)\}=-\sum_{i=1}^n\log\{1+\lambda^\top(t)g_i\{\beta(t)\}\}-n\log n,
\]
where $\lambda(t)$ is a solution to the following equation:
\[
Q_{1n}\{\beta(t),\lambda(t)\}:=\frac1n\sum_{i=1}^n\frac{g_i\{\beta(t)\}}{1+\lambda^\top(t)g_i\{\beta(t)\}}=0. \tag{2.3.13}
\]
The maximum log-EL without any constraint is $l\{\beta(t)\}=-n\log n$. It follows that the negative log-ELR for testing $H_0:H\{\beta_0(t)\}=0$ is
\[
\ell(t):=\min_{H\{\beta(t)\}=0}l_0\{\beta(t)\}, \tag{2.3.14}
\]
where $l_0\{\beta(t)\}=\sum_{i=1}^n\log\{1+\lambda^\top(t)g_i\{\beta(t)\}\}$. To solve (2.3.14), we minimize the following objective function [QL95]:
\[
M\{\beta(t),\nu(t)\}=\frac1n l_0\{\beta(t)\}+\nu^\top(t)H\{\beta(t)\},
\]
where $\nu(t)$ is a $q\times 1$ vector of Lagrange multipliers. Differentiating $M(\beta,\nu)$ with respect to $\beta$ and $\nu$ and setting the derivatives to zero, we have
\[
Q_{2n}\{\beta(t),\lambda(t),\nu(t)\}:=\frac1n\frac{\partial l_0\{\beta(t)\}}{\partial\beta(t)}+C^\top\{\beta(t)\}\nu(t)=0
\quad\text{and}\quad H\{\beta(t)\}=0.
\]
Combining equation (2.3.13) for $\lambda(t)$, the constrained minimization problem in (2.3.14) is equivalent to solving the following estimating equation system:
\[
Q_{1n}\{\beta(t),\lambda(t)\}=0,\quad Q_{2n}\{\beta(t),\lambda(t),\nu(t)\}=0\quad\text{and}\quad H\{\beta(t)\}=0. \tag{2.3.15}
\]
We show in Section 2.6.2.3 that a consistent solution to (2.3.15), denoted as $(\tilde\beta(t),\tilde\lambda(t),\tilde\nu(t))$, exists almost surely. We call $\tilde\beta(t)$ the Restricted Maximum Empirical Likelihood Estimator (RMELE). The test statistic in (2.3.14) then becomes
\[
\ell(t)=l_0\{\tilde\beta(t)\}. \tag{2.3.16}
\]
The following proposition provides an asymptotic expansion for $2\ell(t)$.

Proposition 3.
Under conditions (C1)-(C4) and under $H_0$, we have, for each $t\in[a,b]$,
\[
2\ell(t)=U_n(t)^\top U_n(t)+O_p(nh^4/C_0), \tag{2.3.17}
\]
where $U_n(t)=nC_0^{-1}G(t)\xi_n(t)$, $G(t)=R^{1/2}(t)C(t)A^{-1}(t)$, and $R(t)$ and $A(t)$ are the same as defined in (2.2.12).

The asymptotic expansion in (2.3.17) makes a connection between $2\ell(t)$ and the bias-corrected estimator $\bar\beta(t)$ described in Section 2.2. By Proposition 1 and (2.2.12), $U_n(t)=nC_0^{-1}R^{1/2}(t)H\{\bar\beta(t)\}+o_p(1)$, which asymptotically follows a $q$-dimensional multivariate standard normal distribution. Naturally, $2\ell(t)\stackrel{d}{\to}\chi^2_q$ under the null hypothesis. The fact that the asymptotic distribution of $2\ell(t)$ does not depend on $m$ (or $\theta$) shows that it is a self-normalized test statistic, no matter whether the data are sparse or dense. This is a very appealing property, because the test procedure is the same for all types of functional data and solving (2.3.15) does not require estimating the variance of $H\{\bar\beta(t)\}$.

The following theorem summarizes the asymptotic distribution of $2\ell(t)$ under both $H_0$ and the local alternative (2.1.3).

Theorem 1. Under conditions (C1)-(C4), and supposing $H\{\beta_0(t)\}=b_nd(t)$ for $t\in[a,b]$, where $b_n=n^{-1}C_0$ and $d(t)$ is any fixed real vector of functions, we have
\[
2\ell(t)\stackrel{d}{\to}\chi^2_q\{d^\top(t)R(t)d(t)\},
\]
where $d^\top(t)R(t)d(t)$ is the noncentrality parameter.

A proof of Theorem 1 is provided in Section 2.6.1.

Remark 3. Under $H_0$, $d(t)=0$ and Theorem 1 suggests that $2\ell(t)$ follows a $\chi^2_q$ distribution asymptotically. An asymptotic $\alpha$-level test is given by rejecting $H_0$ at a fixed point $t$ if $2\ell(t)>\chi^2_{q,\alpha}$, where $\chi^2_{q,\alpha}$ is the upper $\alpha$ quantile of $\chi^2_q$. By taking the special function $H\{\beta\}=\beta_j(t)$, we can also construct a $(1-\alpha)100\%$ confidence interval for $\beta_j(t)$ $(j=1,\dots,p)$ as $\mathrm{CI}=\{\beta_j(t):2\ell(t)<\chi^2_{q,\alpha}\}$, which can be computed numerically. This provides an alternative self-normalized confidence interval to those based on a self-normalized normal approximation [KZ13]. Compared to Kim and Zhao's method, our method does not require estimating the bias, because we use bias-corrected estimating equations.
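To make the pointwise test concrete, the inner optimization (2.3.13) and the resulting statistic $2\ell(t)$ can be sketched as below for a fixed $t$, taking the estimating functions $g_i\{\beta(t)\}$ as given (here replaced by synthetic Gaussian vectors). This is a hedged sketch only: a plain Newton iteration with step-halving for the Lagrange multiplier $\lambda(t)$, not the full profiling over $\beta(t)$ that the RMELE requires; all names are illustrative.

```python
import numpy as np

def el_lambda(G, iters=100):
    """Newton iterations for the Lagrange multiplier in (2.3.13):
    solves (1/n) sum_i g_i / (1 + lam' g_i) = 0 for lam, where the
    rows of G are the estimating functions g_i evaluated at beta(t)."""
    n, q = G.shape
    lam = np.zeros(q)
    for _ in range(iters):
        d = 1.0 + G @ lam
        if np.any(d < 1.0 / n):          # step left the feasible region
            lam *= 0.5                   # crude step-halving back toward 0
            continue
        score = (G / d[:, None]).mean(axis=0)      # Q_1n(lam)
        if np.linalg.norm(score) < 1e-12:
            break
        hess = -(G / d[:, None] ** 2).T @ G / n    # negative definite
        lam = lam - np.linalg.solve(hess, score)
    return lam

def neg2_log_elr(G):
    """2*ell(t) = 2 * sum_i log(1 + lam' g_i); ~ chi^2_q under H0."""
    lam = el_lambda(G)
    return 2.0 * np.sum(np.log1p(G @ lam))

rng = np.random.default_rng(1)
G = rng.normal(size=(400, 2))    # synthetic stand-in for g_i{beta(t)}, q = 2
stat = neg2_log_elr(G)
reject = stat > 5.991            # chi^2_{2} upper 5% critical value
```

Per Remark 3, the test rejects when the statistic exceeds the upper $\chi^2_q$ critical value ($5.991$ for $q=2$ at $\alpha=0.05$); inverting the same inequality over candidate parameter values yields the self-normalized confidence interval.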
We define the size of the detectable signal $b_n^*$ as the smallest order of $b_n$ in (2.1.3) that the proposed test can detect. For a given significance level $\alpha$,
\[
b_n^*=\min_h b_n\quad\text{subject to (i) Type I error}\le\alpha\text{ under }H_0 \tag{2.3.18}
\]
and (ii) the power is non-trivial under $H_{1n}$.

Theorem 1 guarantees that the proposed test controls the Type I error at the nominal level asymptotically. For the sparse and moderately dense cases ($\theta\le 1/8$), Condition (C4) implies $mh\to 0$ and hence $b_n=(nmh)^{-1/2}$ by Theorem 1. In this case, $b_n^*$ is equivalent to
\[
\min_h b_n=(nmh)^{-1/2}\quad\text{subject to condition (C4) on }h.
\]
The optimal $h$ that solves the minimization problem above is $h^*=n^{-(1+\theta+\epsilon^*)/9}$ for an arbitrarily small $\epsilon^*>0$. This implies that the optimal $b_n$ is $n^{-4(1+\theta)/9+\epsilon^*/18}$, which results in $b_n^*=n^{-4(1+\theta)/9}$ by letting $\epsilon^*\to 0$. For dense data ($\theta>1/8$), (C4) leads to $mh\to\infty$. Theorem 1 implies that the proposed test has non-trivial power under a local alternative of size $b_n=n^{-1/2}$, which is the detectable order of a parametric test.

2.4 Implementation issues

2.4.1 Bandwidth selection

The performance of the estimation and test procedures depends on the bandwidth $h$, and our asymptotic theory relies on $h$ falling in the range specified in Condition (C4). For longitudinal data (sparse functional data), where subjects are assumed to be independent, one may apply a "leave-one-out" cross-validation strategy [RS91] to choose the bandwidth. However, cross-validation is time-consuming and, in general, its performance for dense functional data is unknown.

We propose to select the bandwidth by minimizing the conditional integrated mean squared error (IMSE) of the local polynomial estimator $\hat\beta(t)$. By (2.2.5), the bandwidth $h$ that minimizes the IMSE of $\hat\beta(t)$ is of the order $n^{-(1+\theta)/5}$, which satisfies condition (C4) for both the sparse and dense cases. Let $\mathcal D=\{(t_{ij},X_{ij}):j=1,2,\dots,m_i;\,i=1,2,\dots,n\}$. It is not difficult to show that, for any $t$,
\[
\mathrm{MSE}(\hat\beta(t)\mid\mathcal D)=b^\top(t)b(t)+\mathrm{tr}\{\mathrm{Cov}(\hat\beta(t)\mid\mathcal D)\},
\]
where $b(t)=\mathrm{Bias}(\hat\beta(t)\mid\mathcal D)$. The IMSE is defined as
\[
\mathrm{IMSE}(\hat\beta(\cdot)\mid\mathcal D)=\int_a^b \mathrm{MSE}(\hat\beta(t)\mid\mathcal D)\,\varpi(t)f(t)\,dt,
\]
where $\varpi(t)$ is a known weight function and $f(t)$ is the probability density function of $t_{ij}$.
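For comparison with the IMSE criterion above, the simpler leave-one-curve-out cross-validation mentioned at the start of this subsection can be sketched as follows on simulated sparse curves. This is an illustrative sketch only, not the proposed plug-in rule (which additionally requires pilot estimates of $\beta^{(2)}$ and $\Sigma$, omitted here); the names are hypothetical, and a small ridge term guards against nearly empty kernel windows.

```python
import numpy as np

def epan(u):
    return 0.75 * np.clip(1.0 - u ** 2, 0.0, None)

def llin_fit(t0, tt, X, Y, h):
    """Local linear coefficient estimate at t0 (cf. (2.2.4)-(2.2.5))."""
    u = (tt - t0) / h
    w = epan(u) / h
    D = np.hstack([X, u[:, None] * X])
    A = D.T @ (w[:, None] * D) + 1e-10 * np.eye(2 * X.shape[1])
    return np.linalg.solve(A, D.T @ (w * Y))[: X.shape[1]]

def cv_bandwidth(curves, grid):
    """Leave-one-curve-out CV: return the h in `grid` minimizing the
    out-of-curve squared prediction error.  curves = [(t_i, X_i, Y_i)]."""
    scores = []
    for h in grid:
        sse = 0.0
        for k, (tk, Xk, Yk) in enumerate(curves):
            rest = [c for j, c in enumerate(curves) if j != k]
            tt = np.concatenate([c[0] for c in rest])
            X = np.vstack([c[1] for c in rest])
            Y = np.concatenate([c[2] for c in rest])
            for t0, x0, y0 in zip(tk, Xk, Yk):
                sse += (y0 - x0 @ llin_fit(t0, tt, X, Y, h)) ** 2
        scores.append(sse)
    return float(grid[int(np.argmin(scores))])

# sparse toy curves: Y = 0.5 * 1 + sin(pi t) * x + noise
rng = np.random.default_rng(2)
curves = []
for _ in range(15):                       # n = 15 curves, m = 8 points each
    t = rng.uniform(size=8)
    X = np.column_stack([np.ones(8), rng.normal(size=8)])
    Y = 0.5 * X[:, 0] + np.sin(np.pi * t) * X[:, 1] + 0.1 * rng.normal(size=8)
    curves.append((t, X, Y))
h_cv = cv_bandwidth(curves, [0.15, 0.3, 0.6])
```

Deleting an entire curve, rather than a single observation, respects the within-subject dependence noted in the text; the selected bandwidth is whichever grid value best predicts held-out curves.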
The conditional bias is
\[
b(t)=(I,0)\{D^\top(t)W(t)D(t)\}^{-1}D^\top(t)W(t)l(t),
\]
where $l(t)=(l_{11}(t),\dots,l_{1m_1}(t),l_{21}(t),\dots,l_{nm_n}(t))^\top$ with
\[
l_{ij}(t)=X_{ij}^\top\beta(t_{ij})-X_{ij}^\top[\beta_0(t)+(t_{ij}-t)\beta^{(1)}(t)]
=X_{ij}^\top[\beta(t_{ij})-\beta_0(t)-(t_{ij}-t)\beta^{(1)}(t)]
\approx X_{ij}^\top\beta^{(2)}(t)(t_{ij}-t)^2/2,
\]
and $\beta^{(s)}(t)=\{\beta_1^{(s)}(t),\dots,\beta_p^{(s)}(t)\}^\top$, $s=1,2$, is the $s$-th derivative of $\beta_0(t)$. The conditional covariance is
\[
\mathrm{Cov}(\hat\beta(t)\mid\mathcal D)=(I,0)\{D^\top(t)W(t)D(t)\}^{-1}D^\top(t)W(t)\Sigma W(t)D(t)\{D^\top(t)W(t)D(t)\}^{-1}\binom{I}{0},
\]
where $\Sigma=\mathrm{Cov}(Y\mid\mathcal D)=\mathrm{diag}(\Sigma_1,\Sigma_2,\dots,\Sigma_n)$ and $\Sigma_i=\{\sigma(t_{ij},t_{ik})\}_{j,k=1}^{m_i}$.

An estimator of the covariance $\sigma(s,t)$ is described in Section 2.4.2. To estimate $\beta^{(2)}(t)$, we use a higher-order local polynomial estimator of $\beta_0(t)$ with a pilot bandwidth $h^*$. The pilot bandwidth is obtained by minimizing the residual squares criterion in [ZL00]. By replacing $\beta^{(2)}(t)$ and $\Sigma$ with their estimators $\widehat{\beta^{(2)}}(t)$ and $\hat\Sigma$, we obtain estimators of the conditional bias and covariance, $\hat b(t)$ and $\widehat{\mathrm{Cov}}(\hat\beta(t)\mid\mathcal D)$. Then the bandwidth $h$ is chosen by minimizing the empirical IMSE:
\[
\hat h=\arg\min_h \frac1N\sum_{i=1}^n\sum_{j=1}^{m_i}\widehat{\mathrm{MSE}}\{\hat\beta(t_{ij})\mid\mathcal D\}\,\varpi(t_{ij}),
\]
where $N=\sum_{i=1}^n m_i$ and $\widehat{\mathrm{MSE}}(\hat\beta(t)\mid\mathcal D)=\hat b^\top(t)\hat b(t)+\mathrm{tr}\{\widehat{\mathrm{Cov}}(\hat\beta(t)\mid\mathcal D)\}$.

2.4.2 Covariance estimation

The covariance function $\sigma(\cdot,\cdot)$ can be estimated by the nonparametric kernel estimator of [YMW05a], which is uniformly consistent [LH10]. However, the nonparametric covariance estimator is not necessarily positive semi-definite. Instead, we adopt the semiparametric covariance estimation of [FHL07]. Supposing the covariance function can be decomposed as $\sigma(s,t)=\sigma(s)\rho(s,t)\sigma(t)$, we model the variance function $\sigma^2(t)$ nonparametrically and the correlation function $\rho(s,t)$ parametrically. For estimation, we first apply the nonparametric kernel estimators of $\sigma(s,t)$ and $\sigma^2(t)$ [YMW05a] to get information about the parametric structure of $\rho(s,t)$. Then we fit a parametric model to $\rho(s,t)$ using the quasi-maximum
likelihood estimator of [FHL07]. The parametric structure guarantees the positive semi-definiteness of the estimated correlation function. For more details of the implementation, see Section 2.5.

2.5 Simulation studies

Simulation studies were conducted to evaluate the finite-sample performance of the proposed inference procedures. We generated data from the following model:
\[
Y_i(t_{ij})=\beta_1(t_{ij})X_i^{(1)}(t_{ij})+\beta_2(t_{ij})X_i^{(2)}(t_{ij})+\epsilon_i(t_{ij}) \tag{2.5.19}
\]
for $i=1,2,\dots,n$ and $j=1,2,\dots,m$, where the $t_{ij}$ are IID Unif$[0,1]$ distributed, $X_i^{(1)}(t_{ij})=1+2e^{t_{ij}}+v_{ij}$ and $X_i^{(2)}(t_{ij})=3-4t_{ij}^2+u_{ij}$. Here $u_{ij}$ and $v_{ij}$ are IID $N(0,1)$ random variables, which are independent of $t_{ij}$ and $\epsilon_i(t_{ij})$. The random error $\epsilon_i(t_{ij})$ was generated from a zero-mean AR(1) process such that $\mathrm{Var}\{\epsilon(t)\}=1$ and $\mathrm{Cov}\{\epsilon(t),\epsilon(t-s)\}=\rho^{10s}$ for some $\rho\in(0,1)$. To evaluate the proposed methods for both sparse and dense data, we set $m=5,10$ and $50$. The sample sizes were chosen to be $100$ and $200$. The Epanechnikov kernel $K(x)=\frac34(1-x^2)_+$ was used for estimation, where $(a)_+=\max(a,0)$. Bandwidth selection was conducted for every simulated data set using the method proposed in Section 2.4.

We set $\beta_1(t)=\frac12\sin t$ and $\beta_2(t)=2\sin(t+0.5)$ in Model (2.5.19) and applied the procedure in Section 2.3 to construct pointwise CIs for $\beta_1(t)$. Table 2.1 summarizes the empirical coverage probability (CP) in percentage and the average length (AL) of the CIs (in parentheses) for $\beta_1(t)$ at $t=0.3,0.5$ and $0.7$, based on 1000 simulation replicates. These results were obtained using the data-driven bandwidth. As we can see from the table, the CPs are close to the nominal level 95% in both the sparse and dense cases, and the ALs are shorter under the larger sample size. In addition, the ALs improve as $m$ increases from 5 to 50.

Table 2.1: Empirical coverage probability (%) and average length of pointwise confidence intervals (in parentheses) for $\beta_1(t)$ at $t=0.3,0.5$ and $0.7$.
t    n      m=5, rho=0.2   m=5, rho=0.5   m=10, rho=0.2  m=10, rho=0.5  m=50, rho=0.2  m=50, rho=0.5
0.3  100    92.1 (0.272)   92.9 (0.268)   92.9 (0.203)   92.5 (0.203)   93.7 (0.107)   93.9 (0.107)
     200    92.3 (0.205)   92.3 (0.205)   93.5 (0.152)   93.0 (0.152)   94.7 (0.081)   94.4 (0.081)
0.5  100    92.9 (0.270)   93.5 (0.267)   94.5 (0.210)   94.0 (0.209)   93.3 (0.107)   93.1 (0.108)
     200    93.6 (0.201)   93.3 (0.200)   94.6 (0.152)   94.4 (0.152)   94.0 (0.083)   93.8 (0.081)
0.7  100    92.1 (0.273)   92.5 (0.272)   92.2 (0.211)   92.1 (0.208)   93.4 (0.106)   92.8 (0.106)
     200    92.3 (0.201)   92.4 (0.201)   94.1 (0.153)   93.3 (0.153)   93.9 (0.083)   93.8 (0.081)

To further demonstrate the performance of the proposed bandwidth selection method in Section 2.4, we show in panels (a) and (b) of Figure 2.1 the boxplots of $\hat h$ selected for model (2.5.19) with $\beta_1(t)=\frac12\sin(\pi t)$ and $\beta_2(t)=2\sin(\pi t+0.5)$, based on 500 replicates. Both the median and the spread of $\hat h$ decreased as $n$ and $m$ increased, and the correlation $\rho$ had little impact on the bandwidth selection results. These plots also show that our bandwidth selection procedure is very stable, as there are very few outliers in each case. In panels (c) and (d) of Figure 2.1, we plot the logarithm of $\mathrm{Median}(\hat h)$ against the logarithm of $nm$ for each value of $\rho$. These plots show clear linear decreasing trends, confirming that the selected bandwidth decreases at a polynomial order of $nm$.

Figure 2.1: Panels (a) ($n=100$) and (b) ($n=200$) are boxplots of the bandwidths selected for model (2.5.19) with $\beta_1(t)=\frac12\sin(\pi t)$ and $\beta_2(t)=2\sin(\pi t+0.5)$ using the proposed bandwidth selection method in Section 2.4. Panels (c) ($\rho=0.2$) and (d) ($\rho=0.5$) are plots of the logarithm of $\mathrm{Median}(\hat h)$ vs $\log(nm)$.

2.6 Technical details

This section contains the proofs for the main theorems in Section 2.3. Proofs for the propositions can be found in the next subsection.

2.6.1 Proof of Theorem 1

Proof of Theorem 1. For convenience, we suppress the argument $t$ of all the functions.
By Lemma 4 in Section 2.6.2, the limit matrix $\Sigma$ of the Taylor-expansion system satisfies
\[
\Sigma^{-1}=\begin{pmatrix} -B^{-1}+B^{-1}APAB^{-1} & B^{-1}AP & B^{-1}AQ^\top\\ PAB^{-1} & P & Q^\top\\ QAB^{-1} & Q & -R \end{pmatrix},
\]
where $P=V(I-C^\top Q)$ and $Q=RCV$. By a Taylor expansion of the equations (2.3.15) at $(\beta_0,0,0)$, as in Lemma 4 in Section 2.6.2, we have
\[
\begin{pmatrix} C_0^2n^{-1}\tilde\lambda\\ \tilde\beta-\beta_0\\ \tilde\nu\end{pmatrix}
=-\Sigma^{-1}\begin{pmatrix} n^{-1}\sum_{i=1}^n g_i(\beta_0)+o_p(\Delta_n)\\ o_p(\Delta_n)\\ H(\beta_0)+o_p(\Delta_n)\end{pmatrix}
=-\begin{pmatrix} -B^{-1}+B^{-1}APAB^{-1}\\ PAB^{-1}\\ QAB^{-1}\end{pmatrix}\Bigl\{\frac1n\sum_{i=1}^n g_i(\beta_0)\Bigr\}
-\begin{pmatrix} B^{-1}AQ^\top\\ Q^\top\\ -R\end{pmatrix}H(\beta_0)+o_p(\Delta_n),
\]
where $\Delta_n=\|\tilde\beta-\beta_0\|+\|\tilde\lambda\|+\|\tilde\nu\|$. Then, under the local alternative hypothesis $H_1:H\{\beta_0(t)\}=n^{-1}C_0d(t)$, we have
\[
\Delta_n=\Biggl\|\begin{pmatrix}\tilde\lambda\\ \tilde\beta-\beta_0\\ \tilde\nu\end{pmatrix}\Biggr\|
\lesssim\Biggl\|\begin{pmatrix} C_0^2n^{-1}\tilde\lambda\\ \tilde\beta-\beta_0\\ \tilde\nu\end{pmatrix}\Biggr\|
\le O_p(C_0/n)+o_p(\Delta_n),
\]
which implies that $\Delta_n=O_p(C_0/n)$. Thus, for $\tilde\nu$, we have
\[
\tilde\nu=-QAB^{-1}\Bigl\{\frac1n\sum_{i=1}^n g_i(\beta_0)\Bigr\}+RH(\beta_0)+o_p(C_0/n)
=-RCA^{-1}\Bigl\{\frac1n\sum_{i=1}^n g_i(\beta_0)\Bigr\}+RH(\beta_0)+o_p(C_0/n). \tag{2.6.20}
\]
Accordingly, we have $nC_0^{-1}R^{-1/2}\{\tilde\nu-RH(\beta_0)\}\stackrel{d}{\to}N(0,I_q)$. Under the local alternative hypothesis $H_1:H\{\beta_0(t)\}=n^{-1}C_0d(t)$, we have $nC_0^{-1}R^{-1/2}\tilde\nu\stackrel{d}{\to}N(R^{1/2}d,I_q)$. Thus $2\ell(t)=n^2C_0^{-2}\tilde\nu^\top R^{-1}\tilde\nu+o_p(1)\stackrel{d}{\to}\chi^2_q(d^\top Rd)$.

2.6.2 Proofs of propositions

In this section, we provide the proofs for all the propositions in this chapter and for the existence of the RMELE $\tilde\beta(t)$. An asymptotic expression for the Lagrange multiplier $\tilde\lambda(t)$ in (2.3.13) is also included.

2.6.2.1 Some useful lemmas

We present some useful lemmas and their proofs before the proofs of the propositions. Denote $\delta_n=\delta_{n1}+h^2$ and $\delta_{n1}=\bigl(\frac{d_n\log n}{nh^2}\bigr)^{1/2}$, where $d_n=h^2+rh/\bar m$.

Lemma 1. Under assumptions (C1)-(C3) and (C4)(i), we have
\[
\sup_{t\in[a,b]}|\hat\beta(t)-\beta_0(t)|=O(\delta_n)\quad a.s.
\]
Proof.
Bytheexpressionof ^ ( t ),usingaTaylorexpansion,wehave ^ ( t ) 0 ( t )= I p ; 0 p f D | ( t ) W ( t ) D ( t ) g 1 D ( t ) W ( t ) Y 0 ( t ) = I p ; 0 p n X i =1 D | i ( t ) W i ( t ) D i ( t ) 1 n X i =1 D | i ( t ) W i ( t ) Y i 0 ( t ) = I p ; 0 p n X i =1 D | i ( t ) W i ( t ) D i ( t ) 1 n X i =1 D | i ( t ) W i ( t )[ B i ( t )+ i ] ; where B i ( t )= ( t i 1 t ) 2 X | i 1 (2) 0 ( t i 1 ) = 2 ; ; ( t im i t ) 2 X | im i (2) 0 ( t im i ) = 2 | with t ij be- tween t and t ij and i =( i 1 ; 12 ;:::; im i ) | : Observethatfordenominator I ( t ):= 1 n P n i =1 D | i ( t ) W i ( t ) D i ( t ),wehave I ( t )= 1 n n X i =1 1 m i m i X j =1 0 B @ X ij X | ij K h ( t ij t ) X ij X | ij K h ( t ij t ) t ij t h X ij X | ij K h ( t ij t ) t ij t h X ij X | ij K h ( t ij t )( t ij t h ) 2 1 C A := 0 B @ I 11 ( t ) I 12 ( t ) I 21 ( t ) I 22 ( t ) 1 C A : Inordertogettheuniformboundfor I ( t ),weuseLemma2in[LH10]for I ij ( t ) ;i;j =1 ; 2. 30 For I 11 ( t ),wehave E f I 11 ( t ) g = E 8 < : 1 n n X i =1 1 m i m i X j =1 X i ( t ij ) X | i ( t ij ) K h ( t ij t ) 9 = ; = E 8 < : E [ 1 n n X i =1 1 m i m i X j =1 X i ( t ij ) X | i ( t ij ) K h ( t ij t ) j t ij ] 9 = ; = E 8 < : 1 n n X i =1 1 m i m i X j =1 ( t ij ) K h ( t ij t ) 9 = ; = 1 n n X i =1 1 m i m i X j =1 Z ( s ) K h ( s t ) f ( s ) ds = Z ( s ) K h ( s t ) f ( s ) ds = Z ( t + uh ) K ( u ) f ( t + uh ) du = ( t ) f ( t )+ ~ O ( h 2 ) ; aslongas 12 < 1 whichistruebycondition(C1)and[ ( t ) f ( t )] 00 isuniformlybounded on t 2 [ a;b ]by(C3),where ~ O denoteuniformorderforall t 2 [ a;b ]andalsoforthe~ o below. 
Hence,undertheconditionthat E n sup t 2 [ a;b ] k X ( t ) k 1 o < 1 forsome5 1 < 1 ,and d 1 n ( log n n ) 1 2 1 = o (1),whichistrueunder(C4)(i),byLemma2in[LH10],wehave sup t 2 [ a;b ] k 1 n n X i =1 1 m i m i X j =1 X ij X | ij K h ( t ij t ) ( t ) f ( t ) k = O ( n ) ;a:s:: Bysimilarcalculationsforotherthreeterms,wehave E f I 12 ( t ) g = Z ( s ) K h ( s t ) s t h f ( s ) ds = ~ O ( h ) ; under 12 < 1 and[ ( t ) f ( t )] 0 uniformlyboundedon t 2 [ a;b ],whicharetrueunder(C1) 31 and(C3)respectively.And E f I 22 ( t ) g = Z ( s ) K h ( s t )( s t h ) 2 f ( s ) ds = ( t ) f ( t ) 12 + ~ O ( h 2 ) ; under[ ( t ) f ( t )] 00 isuniformlyboundedon t 2 [ a;b ]by(C3).Henceinsummary,wehave underconditions(C1)-(C3)and(C4)(i), I ( t )= 0 B @ ( t ) f ( t )+ ~ O ( n ) ~ O ( n 1 + h ) ~ O ( n 1 + h ) ( t ) f ( t ) 12 + ~ O ( n ) 1 C A ;a:s:: Thenwehave I 1 ( t )= 0 B @ ( t ) f ( t )0 0 ( t ) f ( t ) 12 1 C A 1 + ~ O ( n 1 + h ) ;a:s:: (2.6.21) Forthenumerator II ( t ):= 1 n P n i =1 D | i ( t ) W i ( t ) B i ( t ),wehave II ( t )= 1 n n X i =1 1 m i m i X j =1 0 B B B @ X ij X | ij ( t ij t ) 2 (2) 0 ( t ij ) 2 K h ( t ij t ) X ij X | ij ( t ij t ) 3 h (2) 0 ( t ij ) 2 K h ( t ij t ) 1 C C C A := 0 B @ II 1 ( t ) II 2 ( t ) 1 C A : Similarasthedenominator,undertheconditionthat 0 ( t )hascontinuoussecondderivative 32 on t 2 [ a;b ](C3),wehave E f II 1 ( t ) g = E 8 < : 1 n n X i =1 1 m i m i X j =1 X ij X | ij ( t ij t ) 2 (2) 0 ( t ij ) 2 K h ( t ij t ) 9 = ; = E 8 < : 1 n n X i =1 1 m i m i X j =1 X ij X | ij ( t ij t h ) 2 K h ( t ij t ) 9 = ; ~ O ( h 2 ) = ( t ) f ( t ) 12 ~ O ( h 2 )= ~ O ( h 2 ) if 12 < 1 bycondition(C1)and ( t ) f ( t )uniformlyboundedon t 2 [ a;b ](C3),and E f II 2 ( t ) g = E 8 < : 1 n n X i =1 1 m i m i X j =1 X ij X | ij ( t ij t ) 3 h (2) 0 ( t ij ) 2 K h ( t ij t ) 9 = ; = E 8 < : 1 n n X i =1 1 m i m i X j =1 X ij X | ij ( t ij t h ) 3 K h ( t ij t ) 9 = ; ~ O ( h 3 ) =[ ( t ) f ( t )] 0 14 ~ O ( h 3 )= ~ O ( h 3 ) if 14 < 1 (C1)and[ ( t ) f ( t )] 
0 uniformlyboundedon t 2 [ a;b ](C3). ByLemma2in[LH10],underthecondition E n sup t 2 [ a;b ] k X ( t ) k 1 o < 1 forsome 5 1 < 1 ,and d 1 n ( log n n ) 1 2 1 = o (1)whichistrueunder(C4)(i),wecanhave sup t 2 [ a;b ] k 1 n n X i =1 1 m i m i X j =1 X ij X | ij ( t ij t ) 2 (2) 0 ( t ij ) 2 K h ( t ij t ) k = h 2 O ( n 1 +1) ;a:s:; and sup t 2 [ a;b ] k 1 n n X i =1 1 m i m i X j =1 X ij X | ij ( t ij t ) 3 h (2) 0 ( t ij ) 2 K h ( t ij t ) k = h 2 O ( n 1 + h ) ;a:s:: 33 Notethat III ( t ):= 1 n n X i =1 D | i ( t ) W i ( t ) i = 1 n n X i =1 1 m i m i X j =1 0 B @ X ij ij K h ( t ij t ) X ij ij K h ( t ij t ) t ij t h 1 C A : Similarly,bycondition(C2)and(C3),wehavethefollowingduetoLemma2in[LH10] sup t 2 [ a;b ] k 1 n n X i =1 1 m i m i X j =1 X ij ij K h ( t ij t ) k = O ( n 1 ) ;a:s:; and sup t 2 [ a;b ] k 1 n n X i =1 1 m i m i X j =1 X ij ij K h ( t ij t ) t ij t h k = O ( n 1 ) ;a:s:: Thuswehave ^ ( t ) 0 ( t )= I p p ; 0 p p 0 B @ ( t ) f ( t )0 0 ( t ) f ( t ) 12 1 C A 1 8 > < > : h 2 0 B @ ~ O ( n 1 +1) ~ O ( n 1 + h ) 1 C A + 0 B @ ~ O ( n 1 ) ~ O ( n 1 ) 1 C A 9 > = > ; = h 2 ~ O ( n 1 +1)+ ~ O ( n 1 )= ~ O ( n ) ;a:s:; since n = n 1 + h 2 : Lemma2. Underconditions(C1)-(C3)and(C4)(i),wehave E ( g i f 0 ( t ) g )= ~ O ( h 4 ) and Var ( g i f 0 ( t ) g )= ˆ 1 m i h ( t t ) f ( t ) 20 + m i 1 m i ( t t ) f 2 ( t ) ˙ f 1+~ o (1) g : 34 Proof. 
Bytheof g i f 0 ( t ) g ,wedecompose g i f 0 ( t ) g asthefollowingtwoparts g i f 0 ( t ) g = m 1 i m i X j =1 X ij X | ij f [ ^ ( t ) 0 ( t )] [ ^ ( t ij ) 0 ( t ij )] g K h ( t ij t ) + m 1 i m i X j =1 X ij ij K h ( t ij t ):= L 1 i ( t )+ ˘ i ( t ) : Toanalyzetheterm L 1 i ( t )intheaboveexpression,wefurtherobtaintheexpansion for ^ ( t ) 0 ( t )inthefollowing.Bytheexpressionof ^ ( t )andaTaylorexpansion,weobtain ^ ( t ) 0 ( t )= I p p ; 0 p p n n 1 n X i =1 D | i ( t ) W i ( t ) D i ( t ) o 1 n n 1 n X i =1 D | i ( t ) W i ( t )( B i ( t )+ T i ( t )+ i ) o ; where B i ( t )= 1 2 ( X | i 1 (2) 0 ( t )( t i 1 t ) 2 ; ; X | im i (2) 0 ( t )( t im i t ) 2 ) | and T i ( t )= 1 6 ( X | i 1 (3) 0 ( t i 1 )( t i 1 t ) 3 ; ; X | im i (3) 0 ( t im i )( t im i t ) 3 ) | with t ij isbetween t and t ij .Itthenfollowsthat ( ^ ( t ) 0 ( t )) ( ^ ( t ij ) 0 ( t ij ))= 1 n n X k =1 1 m k m k X l =1 n 1 ;kl ( t ) 1 ;kl ( t ij ) +( 2 ;kl ( t ) 2 ;kl ( t ij ))+( 3 ;kl ( t;t 1 ) 3 ;kl ( t;t 2 )) o f 1+~ o p (1) g 35 where t 1 isbetween t and t kl and t 2 isbetween t ij and t kl and 1 ;kl ( t )= f 1 ( t ) 1 ( t ) X kl kl K h ( t kl t ) 2 ;kl ( t )= 1 2 f 1 ( t ) 1 ( t ) X kl X | kl ( t kl t ) 2 (2) 0 ( t ) K h ( t kl t ) 3 ;kl ( t;t )= 1 6 f 1 ( t ) 1 ( t ) X kl X | kl ( t kl t ) 3 (3) 0 ( t ) K h ( t kl t ) : Thenwecanwrite L 1 i ( t )= f I 1 i ( t )+ I 2 i ( t )+ I 3 i ( t ) gf 1+~ o p (1) g where I 1 i ( t )= 1 m i m i X j =1 1 n n X k =1 1 m k m k X l =1 X ij X | ij 1 ;kl ( t ) K h ( t ij t ) 1 m i m i X j =1 1 n n X k =1 1 m k m k X l =1 X ij X | ij 1 ;kl ( t ij ) K h ( t ij t ):= I 11 ;i ( t ) I 12 ;i ( t ) ; I 2 i ( t )= 1 m i m i X j =1 1 n n X k =1 1 m k m k X l =1 X ij X | ij 2 ;kl ( t ) K h ( t ij t ) 1 m i m i X j =1 1 n n X k =1 1 m k m k X l =1 X ij X | ij 2 ;kl ( t ij ) K h ( t ij t ):= I 21 ;i ( t ) I 22 ;i ( t )and I 3 i ( t )= 1 m i m i X j =1 1 n n X k =1 1 m k m k X l =1 X ij X | ij f 3 ;kl ( t;t 1 ) 3 ;kl ( t ij ;t 2 ) g K h ( t ij t ) : For I 1 i ( t ),wehave 
E f I 1 i ( t ) g =0and Var f I 11 i ( t ) g = n 1 n 2 m i h 2 X k 6 = i 1 m k 1 ( t t ) 2 20 +2 m i 1 n 2 m i h X k 6 = i 1 m k 1 ( t t ) f ( t ) 20 + m i 1 n 2 m i X k 6 = i m k 1 m k 1 ( t t ) f 2 ( t ) o f 1+~ o (1) g where 1 ( t )= E f X i ( t ) X | i ( t ) 1 ( t ) X i ( t ) X | i ( t ) g .Theleadingordervarianceof I 12 i ( t )is 36 thesameasthatof I 11 i ( t ).Insummary,wehaveVar f I 1 i ( t ) g = ~ O ( 1 nm 2 h 2 ) 1 ( 0 < 1 )+ ~ O ( 1 n ) 1 ( 0 = 1 ). Bycondition(C3), 0 ( t )hascontinuousthirdderivative,and ( t ) f ( t ),[ ( t ) f ( t )] 0 ,[ ( t ) f ( t )] 00 , 1 ( t ), f ( t ), f 1 ( t )areuniformlyboundedon[ a;b ],wehave E f I 21 ;i ( t ) g = ˆ 1 2 n 1 ( t )+ n 1 2 n ( t ) ˙ f ( t ) (2) 0 ( t ) 12 h 2 + ~ O ( h 4 )and E f I 22 ;i ( t ) g = ˆ 1 2 n 1 ( t )+ n 1 2 n ( t ) ˙ f ( t ) (2) 0 ( t ) 12 h 2 + ~ O ( h 4 ) : Therefore E f I 2 ;i ( t ) g = E f I 21 ;i ( t ) g E f I 22 ;i ( t ) g = ~ O ( h 4 ).Toevaluatethevarianceof I 2 i ( t ), weevaluatethevarianceof I 21 ;i ( t ).Notethat ( nm i ) 2 I 21 ;i ( t ) I | 21 ;i ( t ) = 1 m 2 i m i X j 1 ;j 2 =1 m i X l 1 ;l 2 =1 X ij 1 X | ij 1 2 ;il 1 ( t ) K h ( t ij i t ) X ij 2 X | ij 2 2 ;il 2 ( t ) K h ( t ij 2 t ) + n X k ( 6 = i )=1 m i X j 1 ;j 2 =1 m k X l 1 ;l 2 =1 1 m 2 k X ij 1 X | ij 1 2 ;kl 1 ( t ) K h ( t ij i t ) X ij 2 X | ij 2 2 ;kl 2 ( t ) K h ( t ij 2 t ) + n X ( k 1 6 = k 2 )=1 m i X j 1 ;j 2 =1 m k 1 X l 1 =1 m k 2 X l 2 =1 1 m k 1 m k 2 X ij 1 X | ij 1 2 ;k 1 l 1 ( t ) K h ( t ij i t ) X ij 2 X | ij 2 2 ;k 2 l 2 ( t ) K h ( t ij 2 t ) :=( nm i ) 2 f J 1 ( t )+ J 2 ( t )+ J 3 ( t ) g : Let 2 ( t )= E f X i ( t ) X | i ( t ) X i ( t ) X | i ( t ) g .Itiseasytoseethatthedominanttermof 37 E f I 21 ;i ( t ) I | 21 ;i ( t ) g is E f J 3 ( t ) g .Carefulderivationshowsthat,uptoascaleconstant, E f J 3 ( t ) g = n 1 m i 2 ( t ) f ( t ) 2 12 20 h 3 + 2 ( t ) f 2 ( t ) 2 12 h 4 o f 1+~ o (1) g : SimilarderivationshowsthatVar f I 22 ;i ( t ) g isofthesameorderasVar f I 21 ;i ( t ) g .Therefore, 
insummary,wehaveVar f I 2 ;i ( t ) g = ~ O ( h 3 =m ) 1 ( 0 < 1 )+ ~ O ( h 4 ) 1 ( 0 = 1 ).For I 3 ;i ( t ),it canbeshownthat E f I 3 ;i ( t ) g = ~ O ( h 4 )and Var f I 3 ;i ( t ) g = ~ O ( h 6 =m ) 1 ( 0 < 1 )+ ~ O ( h 7 ) 1 ( 0 = 1 ) : Finally,weevaluatetheorderof ˘ i ( t ).Itisclearthat E f ˘ i ( t ) g =0and Var f ˘ i ( t ) g = n 1 m i h ( t t ) f ( t ) 20 + m i 1 m i ( t t ) f 2 ( t ) o f 1+~ o (1) g : (2.6.22) Insummary, E ( g i f 0 ( t ) g )= ~ O ( h 4 )andbycomparingthevarianceof ˘ i ( t )tothevariances of I 1 ;i ( t )to I 3 ;i ( t ),wehaveVar( g i f 0 ( t ) g )=Var f ˘ i ( t ) gf 1+~ o (1) g .Thiscompletesthe proofofthisLemma. Lemma3. Underconditions(C1)-(C4),wehavefortrue 0 ( t ) C 1 0 n X i =1 g i f 0 ( t ) g d ! N ( 0 ; B ( t )) ; where C 0 and B ( t ) aredinProposition2inSection2.2. Proof. Let ˘ i ( t ):= m 1 i P m i j =1 X ij ij K h ( t ij t )andusingtheproofofLemma2, g i f 0 ( t ) g = ˘ i ( t ) f 1+~ o p (1) g + ~ O p ( h 4 ) ; (2.6.23) 38 and V i ( t ):=Var f ˘ i ( t ) g = ~ O f ( mh ) 1 g 1 ( 0 < 1 )+ ~ O f 1 g 1 ( 0 = 1 ) : Wewillshowthattheasymptoticnormalityof P n i =1 g i f 0 ( t ) g isthesameastheasymp- toticnormalityof P n i =1 ˘ i ( t ). Firstconsiderthecase 0 < 1 ,i.e. mh ! [0 ; 1 ),with(2.6.23)andcondition(C4),we have ( mh ) 1 = 2 p n n X i =1 g i f 0 ( t ) g = ( mh ) 1 = 2 p n n X i =1 ˘ i ( t )+~ o p (1) : (2.6.24) Asabove,wecancheckthat E n ( mh ) 1 = 2 P n i =1 ˘ i ( t ) = p n o =0and Var n ( mh ) 1 = 2 p n n X i =1 ˘ i ( t ) o = 1 n n X i =1 [ m m i 20 + m i 1 m i f ( t )] ( t t ) f ( t ) f 1+~ o (1) g ! [ r 20 + 0 f ( t )] ( t t ) f ( t )= B ( t ) : Next,weconsiderthecase 0 = 1 ,i.e. mh !1 .Againby(2.6.23)andcondition(C4) 1 p n n X i =1 g i f 0 ( t ) g = 1 p n n X i =1 ˘ i ( t )+~ o p (1) : (2.6.25) Similarly,itcanbecheckedthat E n P n i =1 ˘ i ( t ) = p n o =0and Var n n X i =1 ˘ i ( t ) = p n o = 1 n n X i =1 m i 1 m i ( t t ) f 2 ( t ) f 1+~ o (1) g ! 
f 2 ( t ) ( t t )= B ( t ) : Toshowtheasymptoticnormalityunderbothcases,applyingthecramer-wolddevice,it isenoughtoshowtheasymptoticnormalityof P n i =1 | ˘ i ( t ) =C 0 forany 2 R p atany 39 timepoint t .ItremainstochecktheLyapunovcondition.Tothisend,notethat s 2 n =Var f n X i =1 | ˘ i ( t ) g = n X i =1 | V i ˘ C 2 0 : Andontheotherhand,for m !1 , n X i =1 E | ˘ i ( t ) 2+ 0 o = n X i =1 E m 1 i m i X j =1 | X ij ij K h ( t ij t ) 2+ 0 o C n X i =1 E f sup t j | X ( t ) j 2+ 0 g E f sup t j ( t ) j 2+ 0 g˘ n bytaking 2 =2+ 0 intheassumption(C2).Thuswehave 1 s 2+ 0 n n X i =1 E | ˘ i ( t ) 2+ 0 o ˘ n n 1+ 0 = 2 ! 0 ;n !1 : Andsimilarly,for m isbounded, n X i =1 E | ˘ i ( t ) 2+ 0 o Cn h 2+ 0 E f sup t j | X ( t ) j 2+ 0 g E f sup t j ( t ) j 2+ 0 g˘ n=h 2+ 0 : Then,itfollowsthat 1 s 2+ 0 n n X i =1 E | ˘ i ( t ) 2+ 0 o ˘ n=h 2+ 0 ( n=h ) 2+ 0 2 = 1 n ( 0 2 0 0 0 ) = 2 : Theaboveratiogoesto0ifandonlyif 0 < 0 = 2+ 0 .Bytaking 2 =2+ 0 ,thiscondition isequivalentto 0 < 0 = 2+ 0 =1 2 2 .Byassumption(C4),thisconditionis because 0 < 1 2 < 1 2 2 .ThiscompletestheproofofthisLemma. Lemma4. Underassumptions(C1)-(C4),andforeach t 2 [ a;b ] underthenullhypothesis 40 H 0 : H f 0 ( t ) g =0 ,wehave 2 ` ( t ) d ! ˜ 2 q : Proof. First,forconveniencewesuppresstheargument t inthefunctions ( t ), ~ ( t )and A ( t ),sincewe t 2 [ a;b ]inthisproof.Theproofissimilartothatin[QL95]. Weobtaintheirderivativeswithrespecttothethreevariables ; and . 
@Q 1 n ( ; ) @ | = 1 n n X i =1 @g i ( ) @ | (1+ | ( ) g i ( )) g i ( ) | @g i ( ) @ | (1+ | ( ) g i ( )) 2 ; @Q 1 n ( ; ) @ | = 1 n n X i =1 g i ( ) g | i ( ) (1+ | ( ) g i ( )) 2 ; @Q 1 n ( ; ) @ | =0 ; @Q 2 n ( ; ; ) @ | = 1 n n X i =1 @ 2 g | i ( ) @ | @ (1+ | ( ) g i ( )) @g | i ( ) @ | @g i ( ) @ | (1+ | ( ) g i ( )) 2 + @C | ( ) @ | ; @Q 2 n ( ; ; ) @ | = 1 n n X i =1 @g | i ( ) @ | @g | i ( ) @ | g | i ( ) (1+ | ( ) g i ( )) 2 ; @Q 2 n ( ; ; ) @ | = C | ( ) ; @H ( ) @ | = C ( ) ; @H ( ) @ | =0 ; @H ( ) @ | =0 : Hence,wehavethefollowingTaylorexpansionsofthesystemofequationsat( 0 ; 0 ; 0).Let 41 n = k ~ 0 k + k ~ k + k ~ k . 0= Q 1 n ( ~ ; ~ ; ~ ) = Q 1 n ( 0 ; 0 ; 0)+ @Q 1 n ( 0 ; 0 ; 0) @ | ( ~ 0 )+ @Q 1 n ( 0 ; 0 ; 0) @ | ( ~ 0) + @Q 1 n ( 0 ; 0 ; 0) @ | ( ~ 0)+ o p n ) = 1 n n X i =1 g i ( 0 )+ 1 n n X i =1 @g i ( 0 ) @ | ( ~ 0 ) 1 n n X i =1 g i ( 0 ) g | i ( 0 ) ~ + o p n ) ; 0= Q 2 n ( ~ ; ~ ; ~ ) = Q 2 n ( 0 ; 0 ; 0)+ @Q 2 n ( 0 ; 0 ; 0) @ | ( ~ 0 )+ @Q 2 n ( 0 ; 0 ; 0) @ | ( ~ 0) + @Q 2 n ( 0 ; 0 ; 0) @ | ( ~ 0)+ o p n ) = 1 n n X i =1 @g | i ( 0 ) @ ~ + C | ( 0 ) ~ + o p n ) ; and0= H ( ~ )= H ( 0 )+ C ( 0 )( ~ 0 )+ o p n )= C ( 0 )( ~ 0 )+ o p n ) : Puttingthe aboveequationsintoamatrixform,weobtain 0 B B B B B @ n 1 P n i =1 g i ( 0 )+ o p n ) o p n ) o p n ) 1 C C C C C A = n 0 B B B B B @ C 2 0 n 1 ~ ~ 0 ~ 1 C C C C C A : where n = 0 B B B B B @ C 2 0 P n i =1 g i ( 0 ) g | i ( 0 ) n 1 P n i =1 @g i ( 0 ) @ | 0 n 1 P n i =1 @g | i ( 0 ) @ 0 C | ( 0 ) 0 C ( 0 )0 1 C C C C C A : 42 Thenwehave n P ! 
= 0 B B B B B @ BA 0 A 0 C | 0 C 0 1 C C C C C A : Bycalculation,wehave 1 = 0 B B B B B @ B 1 + B 1 APAB 1 B 1 APB 1 AQ | PAB 1 PQ | QAB 1 Q R 1 C C C C C A ; where P = V ( I C | Q ), R =( CVC | ) 1 , Q = RCV ; V =( AB 1 A ) 1 : Thuswehave thefollowing 0 B B B B B @ C 2 0 n 1 ~ ~ 0 ~ 1 C C C C C A = 1 0 B B B B B @ n 1 P n i =1 g i ( 0 ) 0 0 1 C C C C C A + o p n ) Bythis,wecouldoutthat n = k 0 B B B B B @ ~ ~ 0 ~ 1 C C C C C A kk 0 B B B B B @ C 2 0 n 1 ~ ~ 0 ~ 1 C C C C C A k = k 1 0 B B B B B @ 1 0 0 1 C C C C C A ( 1 n n X i =1 g i ( 0 ) ) + o p n ) k O p ( C 0 =n )+ o p n ) ; 43 whichimpliesthat n = O p ( C 0 =n ). Insummaryoftheaboveresults,wehave 0 B B B B B @ C 2 0 n 1 ~ ~ 0 ~ 1 C C C C C A = 0 B B B B B @ B 1 + B 1 APAB 1 PAB 1 QAB 1 1 C C C C C A ( 1 n n X i =1 g i ( 0 ) ) + o p ( C 0 =n ) : (2.6.26) Thuswehavetheasymptoticexpressionfor ~ , ~ = RCA 1 f 1 n n X i =1 g i ( 0 ) g + o p ( C 0 =n ) : (2.6.27) Fortheasymptoticexpressionof ~ 0 ,(2.6.26)togetherwith(2.6.27)gives ~ 0 =[ A 1 + VC | RCA 1 ] f 1 n n X i =1 g i ( 0 ) g + o p ( C 0 =n ) = A 1 f 1 n n X i =1 g i ( 0 ) g + VC | RCA 1 f 1 n n X i =1 g i ( 0 ) g + o p ( C 0 =n ) = A 1 f 1 n n X i =1 g i ( 0 ) g VC | ~ + o p ( C 0 =n ) : (2.6.28) Usingtheexpressionof in(2.6.40)andtheaboveasymptoticexpressionfor ~ 0 , 44 theempiricallog-likelihoodratiostatisticcanbewrittenas 2 ` ( t )=2 n X i =1 ~ | g i ( ~ ) n X i =1 ~ | g i ( ~ ) g | i ( ~ ) ~ + o p (1) = n ( 1 n n X i =1 g | i ( ~ )) n C 2 0 B 1 ( 1 n n X i =1 g i ( ~ ))+ o p (1) = n 2 C 2 0 ~ | CVAB 1 AVC | ~ + o p (1)= n 2 C 2 0 ~ | R 1 ~ + o p (1) : By(2.6.27),wehave 2 ` ( t )= 1 C 2 0 f n X i =1 g i ( 0 ) g | A 1 C | RCA 1 f n X i =1 g i ( 0 ) g + o p (1) : (2.6.29) Weseethat E ( R 1 = 2 CA 1 f P n i =1 g i ( 0 ))=0andas n !1 , C 1 0 Var R 1 = 2 CA 1 f n X i =1 g i ( 0 ) g ! 
R 1 = 2 CA 1 BA 1 C | R 1 = 2 = R 1 = 2 C f ABA g 1 C | R 1 = 2 = R 1 = 2 CVC | R 1 = 2 = R 1 = 2 R 1 R 1 = 2 = I q q Thus,bycentrallimittheorem,wehave R 1 = 2 CA 1 f C 1 0 P n i =1 g i ( 0 ) g d ! N ( 0 ; I q ) : Then by(2.6.29),wehave2 ` ( t ) d ! ˜ 2 q . Denote n =( d n log n nh 2 ) 1 2 + h 2 forsome0 << 1 6 . Lemma5. Underassumptions(C1)-(C3)and(C4)(i),wehavethesolution ( t ) tothe estimatingequation(2.2.7) (a) sup t 2 [ a;b ] k ( t ) 0 ( t ) k = O ( n 1 + h 4 ) ;a:s:: (b) Andforeach t 2 [ a;b ] ,inthesphere n ( t ):sup t 2 [ a;b ] k ( t ) 0 ( t ) k n o ,where 45 0 ( t ) isthetrueparameter,wehave 2 ` ( t )= n 2 C 2 0 H | f ( t ) g R ( t ) H f ( t ) g + o p ( nh 4 =C 0 ) : Proof. Weprove(a).Usingtheestimatingequation(2.3),oneobtain 0= 1 n n X i =1 g i f ( t ) g = 1 n n X i =1 1 m i m i X j =1 ij X ij K h ( t ij t ) + 1 n n X i =1 1 m i m i X j =1 | ;ij ( t ) X ij X ij K h ( t ij t ) ; where ;ij ( t )=[ ^ ( t ) 0 ( t )] [ ( t ) 0 ( t )] [ ^ ( t ij ) 0 ( t ij )] Itfollowsthat n 1 n n X i =1 1 m i m i X j =1 X | ij X ij K h ( t ij t ) o [ ( t ) 0 ( t )] = 1 n n X i =1 1 m i m i X j =1 ij X ij K h ( t ij t ) + 1 n n X i =1 1 m i m i X j =1 n [ ^ ( t ) 0 ( t )] [ ^ ( t ij ) 0 ( t ij )] o | X ij X ij K h ( t ij t )= g n f 0 ( t ) g ; (2.6.30) Sincewehave g n f 0 ( t ) g = ~ O p ( n 1 + h 4 ),andwealsoknowthatfromtheproofofLemma 1, sup t 2 [ a;b ] k 1 n n X i =1 1 m i m i X j =1 X ij X | ij K h ( t ij t ) ( t ) f ( t ) k = O ( n ) ;a:s:: Thus(2.6.30)givessup t 2 [ a;b ] k ( t ) 0 ( t ) k = O ( n 1 + h 4 ) ;a:s: .Thiscompletestheproof ofpart(a). 
For (b), by (a), for each $t\in[a,b]$ we have $\|\bar{\beta}(t)-\beta_0(t)\| = O_p(C_0/n + h^4)$, and the following Taylor expansion for $n^{-1}\sum_{i=1}^n g_i\{\bar{\beta}(t)\}$ holds:
$$
0 = \frac{1}{n}\sum_{i=1}^n g_i\{\bar{\beta}(t)\}
= \frac{1}{n}\sum_{i=1}^n g_i\{\beta_0(t)\} + \frac{1}{n}\sum_{i=1}^n\frac{\partial g_i\{\beta_0(t)\}}{\partial\beta^{\top}(t)}[\bar{\beta}(t)-\beta_0(t)] + o_p(C_0/n+h^4)
= \frac{1}{n}\sum_{i=1}^n g_i\{\beta_0(t)\} + A(t)[\bar{\beta}(t)-\beta_0(t)] + o_p(C_0/n+h^4), \quad (2.6.31)
$$
which gives
$$
\bar{\beta}(t) - \beta_0(t) = -A^{-1}(t)\Big\{\frac{1}{n}\sum_{i=1}^n g_i\{\beta_0(t)\}\Big\} + o_p(C_0/n+h^4). \quad (2.6.32)
$$
The Taylor expansion for $H\{\bar{\beta}(t)\}$ around $\beta_0(t)$ can be expressed as follows by plugging in (2.6.32):
$$
H\{\bar{\beta}(t)\} = H\{\beta_0(t)\} + C(t)[\bar{\beta}(t)-\beta_0(t)] + o_p(C_0/n+h^4)
= H\{\beta_0(t)\} - C(t)A^{-1}(t)\Big\{\frac{1}{n}\sum_{i=1}^n g_i\{\beta_0(t)\}\Big\} + o_p(C_0/n+h^4)
= H\{\beta_0(t)\} + R^{-1}(t)\tilde{\nu}(t) - H\{\beta_0(t)\} + o_p(C_0/n+h^4)
= R^{-1}(t)\tilde{\nu}(t) + o_p(C_0/n+h^4), \quad (2.6.33)
$$
where the second-to-last equality is due to a similar result as (2.6.27) for general $H\{\beta_0(t)\}$. Thus we can easily see from the proof of Lemma 4 that
$$
2\ell(t) = \frac{n^2}{C_0^2}\tilde{\nu}^{\top}R^{-1}\tilde{\nu} + o_p(nh^4/C_0)
= \frac{n^2}{C_0^2}H^{\top}\{\bar{\beta}(t)\}R(t)H\{\bar{\beta}(t)\} + o_p(nh^4/C_0).
$$

2.6.2.2 Proof of Propositions

In this section, we provide the proofs for the Propositions in this chapter.

Proof of Proposition 1. By (2.6.32), we have
$$
\bar{\beta}(t) - \beta_0(t) = -A^{-1}(t)\Big\{\frac{1}{n}\sum_{i=1}^n g_i\{\beta_0(t)\}\Big\} + o_p(C_0/n+h^4).
$$
And by Lemma 2, we have
$$
g_i\{\beta_0(t)\} = \xi_i(t)\{1+\tilde{o}_p(1)\} + \tilde{O}_p(h^4).
$$
Combining these two results together, we have $\bar{\beta}(t)-\beta_0(t) = -A^{-1}(t)\bar{\xi}_n(t)\{1+\tilde{o}_p(1)\} + \tilde{O}_p(h^4)$, where $\bar{\xi}_n(t) = n^{-1}\sum_{i=1}^n\xi_i(t)$. And for $\mathrm{Var}\{\bar{\xi}_n(t)\}$, from (2.6.22) in the proof of Lemma 2, we can easily get (2.2.9) in the proposition.

Proof of Proposition 2. By Lemma 3 and Proposition 1, and under the bandwidth condition (C4), which makes the bias negligible, we have
$$
nC_0^{-1}\{\bar{\beta}(t)-\beta_0(t)\} \xrightarrow{d} N(\mathbf{0}, V(t)),
$$
where $V(t) = A^{-1}(t)B(t)A^{-1}(t)$.

Proof of Proposition 3.
By (b) of Lemma 5, we have
$$
2\ell(t) = \frac{n^2}{C_0^2}H^{\top}\{\bar{\beta}(t)\}R(t)H\{\bar{\beta}(t)\} + o_p(nh^4/C_0).
$$
From (2.6.33), we have that, under $H_0: H\{\beta_0(t)\} = \mathbf{0}$,
$$
R^{1/2}(t)H\{\bar{\beta}(t)\} = -R^{1/2}(t)C(t)A^{-1}(t)\Big\{\frac{1}{n}\sum_{i=1}^n g_i\{\beta_0(t)\}\Big\}\{1+\tilde{o}_p(1)\}
= -R^{1/2}(t)C(t)A^{-1}(t)\bar{\xi}_n(t)\{1+\tilde{o}_p(1)\} + \tilde{O}_p(h^4)
= G(t)\bar{\xi}_n(t)\{1+\tilde{o}_p(1)\} + \tilde{O}_p(h^4).
$$
By $U_n(t) = nC_0^{-1}G(t)\bar{\xi}_n(t)$, we have
$$
2\ell(t) = U_n(t)^{\top}U_n(t) + O_p(nh^4/C_0).
$$

2.6.2.3 Existence of the RMELE and the asymptotic expression for $\tilde{\lambda}$

In this section, we study the existence of the RMELE $\tilde{\beta}(t)$ and the order of the Lagrange multiplier $\tilde{\lambda}(t)$. To this end, denote $\delta_n = \{d_n\log n/(nh^2)\}^{1/2} + h^2$ for some $0 < \alpha < 1/6$, where $d_n = h^2 + rh/m$.

Lemma 6. Under assumptions (C1)-(C3) and (C4)(i), in the sphere
$$
\Big\{\beta(t): \sup_{t\in[a,b]}\|\beta(t)-\beta_0(t)\| \le \delta_n\Big\}, \quad (2.6.34)
$$
where $\beta_0(t)$ is the true parameter, we have:
(a) $\sup_t\|n^{-1}\sum_{i=1}^n g_i\{\beta(t)\}\| = O_p(\delta_n)$;
(b) $\sup_t\max_i\|g_i\{\beta(t)\}\| = o_p(\delta_n'^{-1})$, with $\delta_n' = n\delta_n/C_0^2$; and
(c) $\lim_{n\to\infty}P\big(\inf_t\lambda_{\min}\big[C_0^{-2}\sum_{i=1}^n g_i\{\beta(t)\}g_i^{\top}\{\beta(t)\}\big] > 0\big) = 1$.

Proof. For (a), notice that $n^{-1}\sum_{i=1}^n g_i\{\beta(t)\} = T_1(t) + T_2(t)$, where
$$
T_1(t) = \frac{1}{n}\sum_{i=1}^n\frac{1}{m_i}\sum_{j=1}^{m_i}\epsilon_{ij}X_{ij}K_h(t_{ij}-t)
\quad\text{and}\quad
T_2(t) = \frac{1}{n}\sum_{i=1}^n\frac{1}{m_i}\sum_{j=1}^{m_i}\{\Delta_{\beta,ij}^{\top}(t)X_{ij}\}X_{ij}K_h(t_{ij}-t),
$$
where $\Delta_{\beta,ij}(t) = [\hat{\beta}(t)-\beta_0(t)] - [\beta(t)-\beta_0(t)] - [\hat{\beta}(t_{ij})-\beta_0(t_{ij})]$.

For $T_1(t)$: by Lemma 1 in [LH10] applied to the process $\epsilon(t)X(t)$, under condition (C2), as we proved in Lemma 1, since $E\{T_1(t)\} = \mathbf{0}$ we have $\sup_t\|T_1(t)\| = O(\delta_{n1})$, a.s.

For $T_2(t)$, by Lemma 1 and the assumption for $\beta(t)$ in (2.6.34),
$$
\sup_t\|T_2(t)\| \le \sup_t\frac{1}{n}\sum_{i=1}^n\frac{1}{m_i}\sum_{j=1}^{m_i}\|\Delta_{\beta,ij}(t)\|\,\|X_{ij}\|^2K_h(t_{ij}-t)
\le \Big(2\sup_t\|\hat{\beta}(t)-\beta_0(t)\| + \sup_t\|\beta(t)-\beta_0(t)\|\Big)\sup_t\frac{1}{n}\sum_{i=1}^n\frac{1}{m_i}\sum_{j=1}^{m_i}\|X_{ij}\|^2K_h(t_{ij}-t) = O_p(\delta_n).
$$
Thus we have $\sup_t\|n^{-1}\sum_{i=1}^n g_i\{\beta(t)\}\| = O_p(\delta_n)$. This finishes the proof of part (a).
For proving part (b), note that
$$
\sup_t\|g_i\{\beta(t)\}\| \le \sup_t\Big\|\frac{1}{m_i}\sum_{j=1}^{m_i}\epsilon_{ij}X_{ij}K_h(t_{ij}-t)\Big\| + \sup_t\Big\|\frac{1}{m_i}\sum_{j=1}^{m_i}\{\Delta_{\beta,ij}^{\top}(t)X_{ij}\}X_{ij}K_h(t_{ij}-t)\Big\|
$$
$$
\le \sup_t\|\epsilon_i(t)X_i(t)\|\sup_t\frac{1}{m_i}\sum_{j=1}^{m_i}K_h(t_{ij}-t)
+ \Big\{2\sup_t\|\hat{\beta}(t)-\beta_0(t)\| + \sup_t\|\beta(t)-\beta_0(t)\|\Big\}\sup_t\|X_i(t)\|^2\sup_t\frac{1}{m_i}\sum_{j=1}^{m_i}K_h(t_{ij}-t)
$$
$$
\le \Big\{\sup_t\|\epsilon_i(t)X_i(t)\| + C_1\delta_n\sup_t\|X_i(t)\|^2\Big\}\sup_t\frac{1}{m_i}\sum_{j=1}^{m_i}K_h(t_{ij}-t).
$$
If the $m_i$'s are bounded, then we have $\sup_t m_i^{-1}\sum_{j=1}^{m_i}K_h(t_{ij}-t) = O_p(1/h)$. And if the $m_i$'s tend to infinity, then by the theorem in [Sil78] we have $\sup_t m_i^{-1}\sum_{j=1}^{m_i}K_h(t_{ij}-t) = O_p(1)$ under the regularity conditions on the kernel function in (C1).

For the case where the $m_i$'s are bounded, we have $\delta_n' = n\delta_n/C_0^2$ and
$$
\sup_t\|g_i\{\beta(t)\}\| \le \frac{C}{h}\Big(\sup_t\|\epsilon_i(t)X_i(t)\| + \delta_n\sup_t\|X_i(t)\|^2\Big).
$$
Then we have, for any $\epsilon > 0$, by assumption (C4),
$$
P\Big(\max_{1\le i\le n}\sup_t\|g_i\{\beta(t)\}\| > \epsilon\delta_n'^{-1}\Big)
\le nP\Big\{\frac{C}{h}\Big(\sup_t\|\epsilon_i(t)X_i(t)\| + \delta_n\sup_t\|X_i(t)\|^2\Big) > \epsilon\delta_n'^{-1}\Big\}
$$
$$
\le nP\Big(\sup_t\|\epsilon(t)X(t)\| > \frac{\epsilon}{2C\delta_n}\Big) + nP\Big(\sup_t\|X(t)\|^2 > \frac{\epsilon}{2C\delta_n^2}\Big)
\le n\Big(\frac{\epsilon}{2C\delta_n}\Big)^{-\alpha_1}E\Big\{\sup_t\|\epsilon(t)X(t)\|^{\alpha_1}\Big\} + n\Big(\frac{\epsilon}{2C\delta_n^2}\Big)^{-\alpha_2/2}E\Big\{\sup_t\|X(t)\|^{\alpha_2}\Big\}
$$
$$
\le Cn\big\{(\epsilon/\delta_n)^{-\alpha_1} + (\epsilon/\delta_n)^{-\alpha_2}\big\} \le Cn(\epsilon/\delta_n)^{-\alpha} \to 0,
$$
where $\alpha = \min\{\alpha_1,\alpha_2\}$. This implies $\sup_t\max_i\|g_i\{\beta(t)\}\| = o_p(\delta_n'^{-1})$.

For the case that the $m_i$'s tend to infinity, we have
$$
\sup_t\|g_i\{\beta(t)\}\| \le C\Big\{\sup_t\|\epsilon_i(t)X_i(t)\| + \delta_n\sup_t\|X_i(t)\|^2\Big\}.
$$
Then we have, for any $\epsilon > 0$, by assumption (C4),
$$
P\Big\{\max_{1\le i\le n}\sup_t\|g_i\{\beta(t)\}\| > \epsilon\delta_n^{-1}\Big\} \le Cn\big\{(\epsilon/\delta_n)^{-\alpha_1} + (\epsilon/\delta_n)^{-\alpha_2}\big\} \le Cn(\epsilon/\delta_n)^{-\alpha} \to 0,
$$
where $\alpha = \min\{\alpha_1,\alpha_2\}$. This implies $\sup_t\max_i\|g_i\{\beta(t)\}\| = o_p(\delta_n^{-1}) = o_p(\delta_n'^{-1})$. This completes the proof of part (b).
For (c), we need to show that, for any $u\in\mathbb{R}^p$,
$$
\lim_{n\to\infty}P\Big(\inf_t C_0^{-2}\sum_{i=1}^n u^{\top}g_i\{\beta(t)\}g_i^{\top}\{\beta(t)\}u > 0\Big) = 1. \quad (2.6.35)
$$
In fact, note that
$$
C_0^{-2}\sum_{i=1}^n g_i\{\beta(t)\}g_i^{\top}\{\beta(t)\}
= C_0^{-2}\sum_{i=1}^n\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}\epsilon_{ij}\epsilon_{il}X_{ij}X_{il}^{\top}K_h(t_{ij}-t)K_h(t_{il}-t)
$$
$$
+\ C_0^{-2}\sum_{i=1}^n\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}\{\Delta_{\beta,ij}^{\top}(t)X_{ij}\}\epsilon_{il}X_{ij}X_{il}^{\top}K_h(t_{ij}-t)K_h(t_{il}-t)
+ C_0^{-2}\sum_{i=1}^n\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}\{\Delta_{\beta,il}^{\top}(t)X_{il}\}\epsilon_{ij}X_{ij}X_{il}^{\top}K_h(t_{ij}-t)K_h(t_{il}-t)
$$
$$
+\ C_0^{-2}\sum_{i=1}^n\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}\{\Delta_{\beta,ij}^{\top}(t)X_{ij}\}\{\Delta_{\beta,il}^{\top}(t)X_{il}\}X_{ij}X_{il}^{\top}K_h(t_{ij}-t)K_h(t_{il}-t)
$$
$$
= C_0^{-2}\sum_{i=1}^n\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}\epsilon_{ij}\epsilon_{il}X_{ij}X_{il}^{\top}K_h(t_{ij}-t)K_h(t_{il}-t) + \tilde{o}_p(1),
$$
by Lemma 1 and the assumption (2.6.34) for $\beta(t)$. Thus we have, for any $\epsilon_u > 0$,
$$
P\Big(\inf_t C_0^{-2}\sum_{i=1}^n u^{\top}g_i\{\beta(t)\}g_i^{\top}\{\beta(t)\}u > 0\Big)
\ge P\Big(\inf_t C_0^{-2}\sum_{i=1}^n\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}u^{\top}\epsilon_{ij}\epsilon_{il}X_{ij}X_{il}^{\top}u\,K_h(t_{ij}-t)K_h(t_{il}-t) > 2\epsilon_u\Big) - P\big(|\tilde{o}_p(1)| \ge \epsilon_u\big).
$$
Now, since $\lim_{n\to\infty}P(|\tilde{o}_p(1)| \ge \epsilon_u) = 0$, for proving (2.6.35) we only need to prove that, for some $\epsilon_u > 0$,
$$
\lim_{n\to\infty}P\Big(\inf_t C_0^{-2}\sum_{i=1}^n\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}u^{\top}\epsilon_{ij}\epsilon_{il}X_{ij}X_{il}^{\top}u\,K_h(t_{ij}-t)K_h(t_{il}-t) > \epsilon_u\Big) = 1. \quad (2.6.36)
$$
To this end, note that
$$
E\Big\{\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}\epsilon_{ij}\epsilon_{il}X_{ij}X_{il}^{\top}K_h(t_{ij}-t)K_h(t_{il}-t)\Big\} = \mathrm{Var}\{\xi_i(t)\} = \tilde{O}\{(mh)^{-1}\}\mathbf{1}\{mh\to 0\} + \tilde{O}(1)\mathbf{1}\{mh\to\infty\}.
$$
By the strong law of large numbers, we have
$$
C_0^{-2}\sum_{i=1}^n\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}\epsilon_{ij}\epsilon_{il}X_{ij}X_{il}^{\top}K_h(t_{ij}-t)K_h(t_{il}-t) \to L(t), \quad \text{a.s.},
$$
where $L(t) = \sigma(t,t)\Gamma(t,t)f(t)K^{(2)}(0)\mathbf{1}\{mh\to 0\} + \sigma(t,t)\Gamma(t,t)f^2(t)\mathbf{1}\{mh\to\infty\}$. By taking $\epsilon_u = \frac{1}{2}\inf_t u^{\top}L(t)u > 0$, we have
$$
\lim_{n\to\infty}P\Big(\inf_t C_0^{-2}\sum_{i=1}^n\frac{1}{m_i^2}\sum_{j,l=1}^{m_i}u^{\top}\epsilon_{ij}\epsilon_{il}X_{ij}X_{il}^{\top}u\,K_h(t_{ij}-t)K_h(t_{il}-t) > \epsilon_u\Big) = 1.
$$
Hence (c) is proved.

Lemma 7. Under assumptions (C1)-(C3) and (C4)(i), in the sphere
$$
\Big\{\beta(t): \sup_{t\in[a,b]}\|\beta(t)-\beta_0(t)\| \le \delta_n\Big\}, \quad (2.6.37)
$$
where $\beta_0(t)$ is the true parameter, the equation $Q_{1n}\{\beta(t),\lambda(t)\} = 0$ almost surely has a root $\lambda(t) = \lambda\{\beta(t)\}$, and $\sup_{t\in[a,b]}\|\lambda(t)\| = O_p(\delta_n')$, where $\delta_n' = n\delta_n/C_0^2$.

Proof. Similar to the proof in [Owe90], let $\lambda(t) = \rho(t)\theta(t)$ with $\|\theta(t)\| = 1$ and $\rho(t) \ge 0$; then, from the equation
$$
Q_{1n}\{\beta(t),\lambda(t)\} = \frac{1}{n}\sum_{i=1}^n\frac{g_i\{\beta(t)\}}{1+\lambda^{\top}(t)g_i\{\beta(t)\}} = 0,
$$
we have
$$
\frac{\rho(t)\theta^{\top}(t)S(t)\theta(t)}{1+\rho(t)\sup_t\max_i\|g_i\{\beta(t)\}\|} - \frac{1}{n}\Big|\theta^{\top}(t)\sum_{i=1}^n g_i\{\beta(t)\}\Big| \le 0, \quad (2.6.38)
$$
where $S(t) = n^{-1}\sum_{i=1}^n g_i\{\beta(t)\}g_i^{\top}\{\beta(t)\}$. By applying (a)-(c) in Lemma 6 to (2.6.38), we have
$$
\rho(t)\theta^{\top}(t)\,nC_0^{-2}S(t)\,\theta(t)\big\{1+\rho(t)\sup_t\max_i\|g_i\{\beta(t)\}\|\big\}^{-1}
\le C_0^{-2}\Big|\theta^{\top}(t)\sum_{i=1}^n g_i\{\beta(t)\}\Big|
= \{1+\rho(t)\tilde{o}_p(\delta_n'^{-1})\}\tilde{O}_p(\delta_n') = \tilde{O}_p(\delta_n') + \rho(t)\tilde{o}_p(1),
$$
which implies
$$
\rho(t) \le \frac{\tilde{O}_p(\delta_n')}{\theta^{\top}(t)\,nC_0^{-2}S(t)\,\theta(t) + o_p(1)} \sim \tilde{O}_p(\delta_n'),
$$
since $S(t) \sim C_0^2/n$ uniformly for $t\in[a,b]$. Namely, we proved $\sup_t\|\lambda(t)\| = O_p(\delta_n')$.

Remark 4. For $\lambda\{\beta_0(t)\}$, we have $\sup_t\|\lambda\{\beta_0(t)\}\| = O_p(n\gamma_n/C_0^2)$. This is because Lemma 1, Lemma 6 and Lemma 7 are still true if we replace $\beta(t)$ by $\beta_0(t)$ and $\delta_n$ by $\gamma_n$. This implies that $\sup_t\|\lambda\{\beta_0(t)\}\| = O_p\{(\log n/(nh))^{1/2}\}$ for sparse data and $\sup_t\|\lambda\{\beta_0(t)\}\| = O_p\{(\log n/n)^{1/2}\}$ for dense data.
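Lemma 7 controls the order of the multiplier $\lambda(t)$ solving $Q_{1n}\{\beta(t),\lambda(t)\} = 0$, and the expansion developed next shows that the one-step value $\{n^{-1}\sum_i g_ig_i^{\top}\}^{-1}\{n^{-1}\sum_i g_i\}$ is already accurate to first order. As a hedged numerical illustration of that one-step formula on synthetic stand-ins for the $g_i$ values (not the functional-data estimating functions themselves; all names below are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.normal(size=(500, 3))       # synthetic stand-ins for g_i{beta(t)}

S = (g.T @ g) / len(g)              # n^{-1} sum_i g_i g_i'
gbar = g.mean(axis=0)               # n^{-1} sum_i g_i
lam = np.linalg.solve(S, gbar)      # one-step Lagrange multiplier

# residual of the original equation Q_1n(lam) = n^{-1} sum_i g_i / (1 + lam' g_i)
resid = (g / (1.0 + g @ lam)[:, None]).mean(axis=0)
```

The residual of $Q_{1n}$ at the one-step $\lambda$ is much smaller than the residual $\bar g$ at $\lambda = \mathbf{0}$, reflecting the $\tilde{o}_p(\delta_n')$ remainder in the expansion.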
Expression for $\lambda(t)$: From the equation $Q_{1n} = 0$, we have
$$
0 = Q_{1n}\{\beta(t),\lambda(t)\} = \frac{1}{n}\sum_{i=1}^n\frac{g_i\{\beta(t)\}}{1+\lambda^{\top}(t)g_i\{\beta(t)\}}
= \frac{1}{n}\sum_{i=1}^n g_i\{\beta(t)\} - \frac{1}{n}\sum_{i=1}^n g_i\{\beta(t)\}g_i^{\top}\{\beta(t)\}\lambda(t) + \frac{1}{n}\sum_{i=1}^n\frac{g_i\{\beta(t)\}[\lambda^{\top}(t)g_i\{\beta(t)\}]^2}{1+\lambda^{\top}(t)g_i\{\beta(t)\}}. \quad (2.6.39)
$$
In the following, we want to show that the order of the third term is $\tilde{o}_p(\delta_n')$. To this end, we observe that
$$
|\lambda^{\top}(t)g_i\{\beta(t)\}| \le \sup_t\max_i\|g_i\{\beta(t)\}\|\sup_t\|\lambda(t)\| = \tilde{o}_p(\delta_n'^{-1})\tilde{O}_p(\delta_n') = \tilde{o}_p(1).
$$
Thus we have
$$
\frac{1}{n}\sum_{i=1}^n\frac{g_i\{\beta(t)\}[\lambda^{\top}(t)g_i\{\beta(t)\}]^2}{1+\lambda^{\top}(t)g_i\{\beta(t)\}} \sim \frac{1}{n}\sum_{i=1}^n g_i\{\beta(t)\}[\lambda^{\top}(t)g_i\{\beta(t)\}]^2.
$$
Let $\lambda^{\top}(t) = (\lambda_1(t),\ldots,\lambda_p(t))$ and $g_i^{\top}\{\beta(t)\} = (g_{i1}\{\beta(t)\},\ldots,g_{ip}\{\beta(t)\})$, $i = 1,2,\ldots,n$. Then the $u$-th component of $n^{-1}\sum_{i=1}^n g_i\{\beta(t)\}[\lambda^{\top}(t)g_i\{\beta(t)\}]^2$ is
$$
\frac{1}{n}\sum_{i=1}^n\sum_{j,k=1}^p\lambda_j(t)\lambda_k(t)g_{iu}\{\beta(t)\}g_{ij}\{\beta(t)\}g_{ik}\{\beta(t)\},
$$
whose absolute value can be bounded by
$$
\Big|\frac{1}{n}\sum_{i=1}^n\sum_{j,k=1}^p\lambda_j(t)\lambda_k(t)g_{iu}\{\beta(t)\}g_{ij}\{\beta(t)\}g_{ik}\{\beta(t)\}\Big|
\le \Big(\sup_t\|\lambda(t)\|\Big)^2\sup_t\max_i|g_{iu}\{\beta(t)\}|\,\frac{1}{n}\sum_{i=1}^n\sum_{j,k=1}^p g_{ij}\{\beta(t)\}g_{ik}\{\beta(t)\}
$$
$$
\le C\Big(\sup_t\|\lambda(t)\|\Big)^2\sup_t\max_i\|g_i\{\beta(t)\}\|\,\sup_t\frac{p}{n}\sum_{i=1}^n\|g_i\{\beta(t)\}\|^2
= \tilde{O}_p\{(\delta_n')^2\}\,\tilde{o}_p(\delta_n'^{-1})\,\tilde{O}_p(1) = \tilde{o}_p(\delta_n').
$$
This means that the third term in (2.6.39) is of order $\tilde{o}_p(\delta_n')$. It then follows that
$$
\lambda(t) = \Big\{\frac{1}{n}\sum_{i=1}^n g_i\{\beta(t)\}g_i^{\top}\{\beta(t)\}\Big\}^{-1}\Big\{\frac{1}{n}\sum_{i=1}^n g_i\{\beta(t)\}\Big\} + \tilde{o}_p(\delta_n'). \quad (2.6.40)
$$

Lemma 8. Under assumptions (C1)-(C3) and (C4)(i), in the sphere $\{\beta(t): \sup_{t\in[a,b]}\|\beta(t)-\beta_0(t)\| \le \delta_n\}$, the equation system (2.3.15) almost surely has a root in
$$
U_n = \Big\{(\beta(t),\lambda(t),\nu(t)): \sup_t\big(\|\beta(t)-\beta_0(t)\| + \|\lambda(t)\| + \|\nu(t)\|\big) \le \delta_n\Big\}.
$$
And any solution is indeed a solution to the minimization problem (3.2).

Proof.
Since we have already proved in Lemma 7 that, for every $\beta(t) \in \{\beta(t): \sup_{t\in[a,b]}\|\beta(t)-\beta_0(t)\| \le \delta_n\}$, the equation $Q_{1n}\{\beta(t),\lambda(t)\} = 0$ almost surely has a root $\lambda(t) = \lambda\{\beta(t)\} = \tilde{O}(\delta_n')$, we only have to prove the following:

(a) For every $\beta(t) \in \{\beta(t): \sup_t\|\beta(t)-\beta_0(t)\| \le \delta_n\}$, $\nu(t) = \nu\{\beta(t)\} = \tilde{O}(\delta_n')$ can be solved from the equations $Q_{2n}\{\beta(t),\lambda(t),\nu(t)\} = 0$.

(b) There almost surely exists a solution $\tilde{\beta}(t) \in U_n$ to the equation system (3.3).

(c) Any solution is indeed a solution to the minimization problem (3.2).

In order to prove (a), recall the expression in (2.6.40) and the asymptotic variance $B(t)$ in Lemma 3; by the uniform strong law of large numbers (SLLN), we have
$$
C_0^{-2}\sum_{i=1}^n g_i\{\beta(t)\}g_i^{\top}\{\beta(t)\} = B(t) + \tilde{o}_p(1).
$$
Thus
$$
\lambda\{\beta(t)\} = \Big\{\frac{1}{n}\sum_{i=1}^n g_i\{\beta(t)\}g_i^{\top}\{\beta(t)\}\Big\}^{-1}\Big\{\frac{1}{n}\sum_{i=1}^n g_i\{\beta(t)\}\Big\} + \tilde{o}_p(\delta_n')
= B^{-1}(t)\Big\{\frac{1}{C_0^2}\sum_{i=1}^n g_i\{\beta(t)\}\Big\} + \tilde{o}_p(\delta_n'), \quad (2.6.41)
$$
and
$$
\lambda\{\beta_0(t)\} = \tilde{O}_p(n\gamma_n/C_0^2) = \tilde{o}_p(n\delta_n/C_0^2) = \tilde{o}_p(\delta_n'). \quad (2.6.42)
$$
We have
$$
\frac{\partial\lambda\{\beta(t)\}}{\partial\beta^{\top}(t)} = B^{-1}(t)\Big\{\frac{1}{C_0^2}\sum_{i=1}^n\frac{\partial g_i\{\beta(t)\}}{\partial\beta^{\top}(t)}\Big\} + \tilde{o}_p(\delta_n'). \quad (2.6.43)
$$
Because the uniform SLLN gives $n^{-1}\sum_{i=1}^n\partial g_i\{\beta_0(t)\}/\partial\beta^{\top}(t) = A(t) + \tilde{o}_p(1)$, where $A(t) = -\Gamma(t)f(t)$, we have the following:
$$
\frac{\partial\lambda\{\beta_0(t)\}}{\partial\beta^{\top}(t)} = B^{-1}(t)\Big\{\frac{1}{C_0^2}\sum_{i=1}^n\frac{\partial g_i\{\beta_0(t)\}}{\partial\beta^{\top}(t)}\Big\} + \tilde{o}_p(\delta_n') = nC_0^{-2}B^{-1}(t)A(t) + \tilde{o}_p(\delta_n'). \quad (2.6.44)
$$
Let
$$
\bar{S}\{\beta(t)\} = \frac{1}{n}\sum_{i=1}^n\frac{\partial g_i^{\top}\{\beta(t)\}/\partial\beta(t)}{1+\lambda^{\top}\{\beta(t)\}g_i\{\beta(t)\}};
$$
then
$$
Q_{2n}\{\beta(t),\lambda(t),\nu(t)\} = \bar{S}\{\beta(t)\}\lambda\{\beta(t)\} + C^{\top}\{\beta(t)\}\nu(t). \quad (2.6.45)
$$
For the Taylor expansion of $Q_{2n}\{\beta(t),\lambda(t),\nu(t)\}$ at $\beta_0(t)$, we need the following:
$$
\bar{S}\{\beta(t)\} = \frac{1}{n}\sum_{i=1}^n\frac{\partial g_i^{\top}\{\beta(t)\}}{\partial\beta(t)}\Big\{1 - \frac{\lambda^{\top}\{\beta(t)\}g_i\{\beta(t)\}}{1+\lambda^{\top}\{\beta(t)\}g_i\{\beta(t)\}}\Big\}
= \frac{1}{n}\sum_{i=1}^n\frac{\partial g_i^{\top}\{\beta(t)\}}{\partial\beta(t)} + \tilde{O}_p(\delta_n'), \quad (2.6.46)
$$
which implies that
$$
\bar{S}\{\beta_0(t)\} = A(t) + \tilde{O}_p(\delta_n'). \quad (2.6.47)
$$
Hence we have
$$
\frac{\partial\bar{S}\{\beta(t)\}}{\partial\beta^{\top}(t)} = \frac{1}{n}\sum_{i=1}^n\frac{\partial^2 g_i^{\top}\{\beta(t)\}}{\partial\beta^{\top}(t)\partial\beta(t)} + \tilde{O}_p(\delta_n'), \quad (2.6.48)
$$
$$
\frac{\partial\bar{S}\{\beta_0(t)\}}{\partial\beta^{\top}(t)} = E\Big[\frac{\partial^2 g_i^{\top}\{\beta_0(t)\}}{\partial\beta^{\top}(t)\partial\beta(t)}\Big] + \tilde{O}_p(\delta_n') := D(t) + \tilde{O}_p(\delta_n'). \quad (2.6.49)
$$
Let $W\{\beta(t)\} = \bar{S}\{\beta(t)\}\lambda\{\beta(t)\}$ and write $\bar{S}\{\beta(t)\} = (\bar{S}_1,\bar{S}_2,\ldots,\bar{S}_p)$, where $\bar{S}_j$ is the $j$-th column of $\bar{S}\{\beta(t)\}$. Then, by (2.6.42), (2.6.44), (2.6.47), (2.6.49) and the assumption about $\beta(t)$, we have
$$
W\{\beta(t)\} = W\{\beta_0(t)\} + \bar{S}\{\beta_0(t)\}\frac{\partial\lambda\{\beta_0(t)\}}{\partial\beta^{\top}(t)}[\beta(t)-\beta_0(t)] + \sum_{j=1}^p\frac{\partial\bar{S}_j}{\partial\beta^{\top}(t)}\lambda_j\{\beta_0(t)\}[\beta(t)-\beta_0(t)] + \tilde{O}_p\{(\delta_n)^2\}
$$
$$
= \{A(t)+\tilde{O}_p(\delta_n')\}\tilde{o}_p(\delta_n')
+ \big\{[A(t)+\tilde{O}_p(\delta_n')][nC_0^{-2}B^{-1}(t)A(t)+\tilde{o}_p(\delta_n')] + [D(t)+\tilde{O}_p(\delta_n')]\tilde{o}_p(\delta_n')\big\}[\beta(t)-\beta_0(t)] + \tilde{O}_p\{(\delta_n)^2\}
$$
$$
= nC_0^{-2}A(t)B^{-1}(t)A(t)[\beta(t)-\beta_0(t)] + \tilde{o}_p(\delta_n').
$$
By plugging the above into (2.6.45), we get
$$
0 = nC_0^{-2}A(t)B^{-1}(t)A(t)[\beta(t)-\beta_0(t)] + C^{\top}\{\beta(t)\}\nu(t) + \tilde{o}_p(\delta_n'). \quad (2.6.50)
$$
Since $A(t)B^{-1}(t)A(t)$ is invertible, by multiplying $C(t)\{A(t)B^{-1}(t)A(t)\}^{-1}$ on both sides of (2.6.50) we have
$$
0 = nC_0^{-2}C(t)[\beta(t)-\beta_0(t)] + C(t)\{A(t)B^{-1}(t)A(t)\}^{-1}C^{\top}\{\beta(t)\}\nu(t) + \tilde{o}_p(\delta_n'). \quad (2.6.51)
$$
From the third equation of the equation system (3.3),
$$
0 = H\{\beta(t)\} = H\{\beta_0(t)\} + C(t)[\beta(t)-\beta_0(t)] + \tilde{o}(\delta_n) = C(t)[\beta(t)-\beta_0(t)] + \tilde{o}(\delta_n),
$$
we have
$$
C(t)[\beta(t)-\beta_0(t)] = \tilde{o}(\delta_n). \quad (2.6.52)
$$
Combining (2.6.51) and (2.6.52),
$$
C(t)\{A(t)B^{-1}(t)A(t)\}^{-1}C^{\top}\{\beta(t)\}\nu(t) = -nC_0^{-2}C(t)[\beta(t)-\beta_0(t)] + o_p(\delta_n') = o_p(\delta_n');
$$
that is,
$$
\nu(t) = \big[C(t)\{A(t)B^{-1}(t)A(t)\}^{-1}C^{\top}\{\beta(t)\}\big]^{-1}o_p(\delta_n') = o_p(\delta_n'). \quad (2.6.53)
$$
Hence we proved (a).
For proving (b), from (2.6.50) and (2.6.53), we have
$$
0 = nC_0^{-2}A(t)B^{-1}(t)A(t)[\beta(t)-\beta_0(t)] + C^{\top}\{\beta(t)\}\nu(t) + o_p(\delta_n')
= nC_0^{-2}A(t)B^{-1}(t)A(t)[\beta(t)-\beta_0(t)] + o_p(\delta_n'),
$$
which implies that
$$
0 = A(t)B^{-1}(t)A(t)[\beta(t)-\beta_0(t)] + o_p(\delta_n). \quad (2.6.54)
$$
Now consider the above equation (2.6.54) and define a function $\phi$ on the unit disk in $\mathbb{R}^p$ by
$$
\phi\Big(\frac{\beta(t)-\beta_0(t)}{\delta_n}\Big) = -A(t)B^{-1}(t)A(t)[\beta(t)-\beta_0(t)] + o_p(\delta_n).
$$
We know that $\phi$ is a continuous function on the unit disk. Also, we have
$$
\frac{1}{\delta_n}[\beta(t)-\beta_0(t)]^{\top}\phi\Big(\frac{\beta(t)-\beta_0(t)}{\delta_n}\Big)
= -\frac{1}{\delta_n}[\beta(t)-\beta_0(t)]^{\top}A(t)B^{-1}(t)A(t)[\beta(t)-\beta_0(t)] + o_p(\delta_n).
$$
Hence, on the circle $\|\beta(t)-\beta_0(t)\| = \delta_n$, we have
$$
\frac{1}{\delta_n}[\beta(t)-\beta_0(t)]^{\top}\phi\Big(\frac{\beta(t)-\beta_0(t)}{\delta_n}\Big) \le -\delta_n\tau_0(t) + o_p(\delta_n) < 0, \quad \text{if } n \text{ is big enough},
$$
where $\tau_0(t) > 0$ is the smallest eigenvalue of $A(t)B^{-1}(t)A(t)$, which is positive definite. Thus, by the lemma in [AS58], there exists a point $\tilde{\beta}(t) \in U_n$ with $\phi\{(\tilde{\beta}(t)-\beta_0(t))/\delta_n\} = 0$, which means $\tilde{\beta}(t)$ is a solution to the equation system (3.3).

Next, we have to prove (c). Assuming that $\tilde{\beta}(t)$ is a solution in $U_n$, we let $\beta(t)$ be a point in a neighborhood of $\tilde{\beta}(t)$ contained in $U_n$ such that $H\{\beta(t)\} = \mathbf{0}$ and $\|\beta(t)-\tilde{\beta}(t)\| \ge \epsilon\delta_n > 0$.
Then, by expanding $l_0\{\beta(t)\}$ at $\tilde{\beta}(t)$, we have
$$
l_0\{\beta(t)\} - l_0\{\tilde{\beta}(t)\} = \frac{\partial l_0\{\tilde{\beta}(t)\}}{\partial\beta^{\top}(t)}[\beta(t)-\tilde{\beta}(t)] + \frac{1}{2}[\beta(t)-\tilde{\beta}(t)]^{\top}\frac{\partial^2 l_0\{\beta^*(t)\}}{\partial\beta(t)\partial\beta^{\top}(t)}[\beta(t)-\tilde{\beta}(t)], \quad (2.6.55)
$$
where $\beta^*(t) \in U_n$. We wish to show that $l_0\{\beta(t)\} - l_0\{\tilde{\beta}(t)\} > 0$.

Next, we approximate the two terms on the right side of (2.6.55). For the first term, note that
$$
\frac{\partial l_0\{\tilde{\beta}(t)\}}{\partial\beta^{\top}(t)}
= \sum_{i=1}^n\frac{g_i^{\top}\{\tilde{\beta}(t)\}}{1+\lambda^{\top}\{\tilde{\beta}(t)\}g_i\{\tilde{\beta}(t)\}}\frac{\partial\lambda\{\tilde{\beta}(t)\}}{\partial\beta^{\top}(t)}
+ \sum_{i=1}^n\frac{\lambda^{\top}\{\tilde{\beta}(t)\}}{1+\lambda^{\top}\{\tilde{\beta}(t)\}g_i\{\tilde{\beta}(t)\}}\frac{\partial g_i\{\tilde{\beta}(t)\}}{\partial\beta^{\top}(t)}
$$
$$
= \sum_{i=1}^n\frac{\lambda^{\top}\{\tilde{\beta}(t)\}\,\partial g_i\{\tilde{\beta}(t)\}/\partial\beta^{\top}(t)}{1+\lambda^{\top}\{\tilde{\beta}(t)\}g_i\{\tilde{\beta}(t)\}}
= n\lambda^{\top}\{\tilde{\beta}(t)\}\bar{S}^{\top}\{\tilde{\beta}(t)\} = nW^{\top}\{\tilde{\beta}(t)\}. \quad (2.6.56)
$$
By (2.6.45), we have
$$
W^{\top}\{\tilde{\beta}(t)\} = -\tilde{\nu}^{\top}(t)C\{\tilde{\beta}(t)\}. \quad (2.6.57)
$$
From the Taylor expansion of $H\{\beta(t)\}$ at $\tilde{\beta}(t)$, we have
$$
0 = H\{\beta(t)\} - H\{\tilde{\beta}(t)\} = C\{\tilde{\beta}(t)\}[\beta(t)-\tilde{\beta}(t)] + \tilde{o}(\delta_n),
$$
from which we could obtain
$$
C\{\tilde{\beta}(t)\}[\beta(t)-\tilde{\beta}(t)] = \tilde{o}(\delta_n). \quad (2.6.58)
$$
Thus, for the first term of (2.6.55), combining (2.6.56)-(2.6.58), we have
$$
\frac{\partial l_0\{\tilde{\beta}(t)\}}{\partial\beta^{\top}(t)}[\beta(t)-\tilde{\beta}(t)] = nW^{\top}\{\tilde{\beta}(t)\}[\beta(t)-\tilde{\beta}(t)]
= -n\tilde{\nu}^{\top}(t)C\{\tilde{\beta}(t)\}[\beta(t)-\tilde{\beta}(t)] = \frac{n^2}{C_0^2}\tilde{o}_p\{(\delta_n)^2\}. \quad (2.6.59)
$$
For the second term of (2.6.55), we have
$$
\frac{\partial^2 l_0\{\beta(t)\}}{\partial\beta(t)\partial\beta^{\top}(t)} = n\frac{\partial W^{\top}\{\beta(t)\}}{\partial\beta(t)}
= n\Big\{\frac{\partial\lambda^{\top}\{\beta(t)\}}{\partial\beta(t)}\bar{S}^{\top}\{\beta(t)\} + \lambda\{\beta(t)\}\frac{\partial\bar{S}^{\top}\{\beta(t)\}}{\partial\beta(t)}\Big\}
$$
$$
= n[nC_0^{-2}A(t)B^{-1}(t)+\tilde{o}_p(\delta_n')][A(t)+\tilde{O}_p(\delta_n')] + n\tilde{O}_p(\delta_n')[D(t)+\tilde{O}_p(\delta_n')]
= n\{nC_0^{-2}A(t)B^{-1}(t)A(t)+\tilde{O}_p(\delta_n')\}.
$$
It follows that
$$
\frac{1}{2}[\beta(t)-\tilde{\beta}(t)]^{\top}\frac{\partial^2 l_0\{\beta(t)\}}{\partial\beta(t)\partial\beta^{\top}(t)}[\beta(t)-\tilde{\beta}(t)]
= \frac{n^2}{2C_0^2}[\beta(t)-\tilde{\beta}(t)]^{\top}A(t)B^{-1}(t)A(t)[\beta(t)-\tilde{\beta}(t)] + \frac{n^2}{C_0^2}\tilde{o}_p\{(\delta_n)^2\}. \quad (2.6.60)
$$
Hence, plugging (2.6.59) and (2.6.60) into (2.6.55), we have
$$
l_0\{\beta(t)\} - l_0\{\tilde{\beta}(t)\}
= \frac{n^2}{C_0^2}\Big\{\frac{1}{2}[\beta(t)-\tilde{\beta}(t)]^{\top}A(t)B^{-1}(t)A(t)[\beta(t)-\tilde{\beta}(t)] + \tilde{o}_p\{(\delta_n)^2\}\Big\}
\ge \frac{n^2}{C_0^2}(\epsilon\delta_n)^2\Big\{\frac{1}{2}\tau_0(t)+\tilde{o}_p(1)\Big\} > 0, \quad \text{if } n \text{ is big enough},
$$
where $\tau_0(t) > 0$ is the smallest eigenvalue of $A(t)B^{-1}(t)A(t)$, which is positive definite.

Chapter 3

Simultaneous Empirical Likelihood Ratio Tests for Functional Linear Models and the Phase Transition from Sparse to Dense Functional Data

3.1 Introduction

In this chapter, we continue to consider the same model (2.1.1) as discussed in Chapter 2, and we are interested in the same hypothesis testing problem as in (2.1.2),
$$
H_0: H\{\beta_0(\cdot)\} = \mathbf{0} \quad \text{vs} \quad H_1: H\{\beta_0(\cdot)\} \neq \mathbf{0}. \quad (3.1.1)
$$
But instead of testing the coefficient functions at a fixed point $t$ as in Chapter 2, we would like to test the functions simultaneously on the whole support $[a,b]$.

In this chapter, we propose a nonparametric test, based on the pointwise empirical likelihood ratio test in Chapter 2, to test (2.1.2) simultaneously. Since in Chapter 2 we showed that the EL-based pointwise tests enjoy a nice self-normalizing property such that both sparse and dense functional data can be treated under a unified framework, the simultaneous testing procedure to be developed here can also treat all types of functional data with different denseness in a unified way.
To investigate the power of the tests, we consider the same local alternatives (2.1.3) as in Chapter 2, but now for the entire functions $\beta_0(\cdot)$ simultaneously:
$$
H_{1n}: H\{\beta_0(\cdot)\} = b_n d(\cdot). \quad (3.1.2)
$$
For the sparse data with $\theta = 0$, it is also known that the EL method using a global bandwidth $h$ [CZ10] can detect alternatives of order $b_n = n^{-1/2}h^{-1/4}$ for the simultaneous test, which is also larger than $n^{-1/2}$. Similarly to the pointwise case in Chapter 2, for dense data with $\theta > 0$ the detectable order $b_n$ is still largely unknown. This leads to the same key interest in this chapter as in the last chapter: understanding the effect of $\theta$ on $b_n$. We use the same principle to get the optimal $b_n$ by maximizing the power of the test (i.e., minimizing the order of $b_n$) while controlling the type I error at the desired level. Under some mild conditions, we find that, for the simultaneous test, $b_n$ is larger than $n^{-1/2}$ for $\theta \le 1/16$ and equals $n^{-1/2}$ for $\theta > 1/16$. The transition point $1/16$ will still be referred to as $\theta_0$, as in the pointwise case, for this simultaneous test. Once $\theta > \theta_0$, with a properly chosen bandwidth, the proposed tests can detect a signal at a parametric rate. This phase transition result echoes the similar phenomena discovered by [LH10] for estimation problems.

The rest of the chapter is organized as follows. We propose the simultaneous test in Section 3.2, where we investigate the asymptotic distributions of the test statistic under both the null and local alternatives, and the transition phases for $b_n$. Simulation studies are presented in Section 3.3, followed by two real data analysis examples, one for sparse and one for dense functional data, in Section 3.4. All the technical details are relegated to Section 3.5.

3.2 A simultaneous test

We assume the same regularity conditions (C1)-(C4) on the kernel function, the moments of the underlying processes, the smoothness of the related functions and the selection of the bandwidth as in Section 2.2.2 in Chapter 2.
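The building block of the simultaneous test is the pointwise EL ratio $2\ell(t)$ from Chapter 2, which is calibrated by a $\chi^2_q$ limit. As a purely illustrative sketch of how such an EL log-likelihood ratio is computed in the simplest setting (a plain i.i.d. vector-mean empirical likelihood in the spirit of [Owe90], not the bias-corrected functional estimating equations of this thesis; all function names are ours), the Lagrange multiplier can be found by a Newton iteration:

```python
import numpy as np

def el_loglik_ratio(g, n_iter=50, tol=1e-10):
    """Empirical-likelihood log-ratio statistic for H0: E[g_i] = 0.

    g : (n, q) array of estimating-function values g_i.
    Solves sum_i g_i / (1 + lam' g_i) = 0 for the Lagrange multiplier
    lam by Newton's method, then returns 2 * sum_i log(1 + lam' g_i),
    which is asymptotically chi-square with q degrees of freedom.
    """
    n, q = g.shape
    lam = np.zeros(q)
    for _ in range(n_iter):
        denom = 1.0 + g @ lam                        # 1 + lam' g_i
        grad = (g / denom[:, None]).sum(axis=0)      # the estimating equation in lam
        hess = -(g / denom[:, None] ** 2).T @ g      # its Jacobian
        step = np.linalg.solve(hess, grad)
        lam -= step
        if np.abs(step).max() < tol:
            break
    return float(2.0 * np.log1p(g @ lam).sum())

rng = np.random.default_rng(0)
g = rng.normal(size=(200, 2))     # synthetic data with H0 true (mean zero)
stat = el_loglik_ratio(g)         # roughly chi-square with 2 df under H0
```

Under the null, comparing `stat` against $\chi^2_q$ quantiles gives the pointwise test; the simultaneous test developed next integrates such statistics over $t$.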
We now consider a simultaneous test of $H_0$ in (3.1.1) for all $t \in [a,b]$. By Lemma 5 in Section 2.6.2 in Chapter 2,
$$
2\ell(t) = \frac{n^2}{C_0^2}H^{\top}\{\bar{\beta}(t)\}R(t)H\{\bar{\beta}(t)\} + \tilde{o}_p(1).
$$
Intuitively, $2\ell(t)$ measures the distance between $H\{\beta_0(t)\}$ and $\mathbf{0}$ at any $t\in[a,b]$. To test the hypothesis (3.1.1) simultaneously, we propose a Cramer-von Mises type test statistic
$$
T_n = \int_a^b 2\ell(t)w(t)\,dt, \quad (3.2.3)
$$
where $w(\cdot)$ is a known probability density function. The construction of $T_n$ allows us to borrow information across the time domain and yields a more powerful test than the pointwise test. Similar constructions were used by [HM93] and [CZ10]. The weight function $w(t)$ is a subjective choice of the practitioner. The most commonly used weight function is a uniform density, which puts equal weight on all points, but if there is prior knowledge on the importance of a particular subinterval one can change $w(t)$ to put more weight on the important subinterval.

3.2.1 Null distribution and local power

By the asymptotic decomposition of $2\ell(t)$ in Proposition 3 in Chapter 2, we need to understand the covariance structure of the process $U_n(t)$ in order to investigate the distribution of $T_n$.

Proposition 4. Under Conditions (C1)-(C4) and $H_0$, $\mathrm{Cov}\{U_n(s),U_n(t)\} = \Gamma_n(s,t)\{1+o_p(1)\}$, where
$$
\Gamma_n(s,t) = \begin{cases} \{K^{(2)}(0)\}^{-1}K^{(2)}\big(\frac{s-t}{h}\big)I_q, & \text{if } m^2h\to 0, \\ I_q\,I(s=t) + mh\,\Gamma_0(s,t)I(s\neq t), & \text{if } m^2h\to\infty \text{ and } mh\to 0, \\ \Gamma_0(s,t), & \text{if } mh\to\infty, \end{cases}
$$
$K^{(2)}(x) = \int K(y)K(x-y)\,dy$ and $\Gamma_0(s,t) = G(s)\Gamma(s,t)G^{\top}(t)\sigma(s,t)f(s)f(t)$.

Obviously, the leading term in the covariance of $U_n(t)$ is different under different asymptotic scenarios. In the second case in the expression of $\Gamma_n(s,t)$, the $I_q\,I(s=t)$ term seems to dominate but is only non-zero on a set of Lebesgue measure $0$; the $mh\,\Gamma_0(s,t)I(s\neq t)$ term is nonzero almost everywhere and produces the leading-order variance of $T_n$ in this case.
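The integral in (3.2.3) is one-dimensional, so in practice $T_n$ can be approximated by evaluating the pointwise statistics $2\ell(t)$ on a grid and applying a quadrature rule. A minimal sketch (the pointwise statistics here are a made-up smooth stand-in, and `cvm_statistic` is our own name, not part of any package):

```python
import numpy as np

def cvm_statistic(two_ell, grid, w=None):
    """Trapezoid-rule approximation of T_n = int_a^b 2*ell(t) w(t) dt.

    two_ell : values of the pointwise statistic 2*ell(t_j) on `grid`.
    w       : weight density evaluated on `grid`; defaults to the
              uniform density on [a, b].
    """
    if w is None:
        w = np.full_like(grid, 1.0 / (grid[-1] - grid[0]))
    y = two_ell * w
    # trapezoid rule: sum of interval widths times average endpoint values
    return float(np.sum((y[1:] + y[:-1]) * np.diff(grid)) / 2.0)

grid = np.linspace(0.0, 1.0, 201)
two_ell = 3.0 + np.sin(2.0 * np.pi * grid)   # stand-in for 2*ell(t)
Tn = cvm_statistic(two_ell, grid)            # integral of 3 + sin(2*pi*t) is 3
```

With a uniform weight, as used in the simulations of Section 3.3, `w` reduces to the constant density on the support.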
Suppose the covariance function $\Gamma_n(s,t)$ has the following spectral decomposition [Bal60]:
$$
\Gamma_n(s,t) = \sum_{k=1}^{\infty}\lambda_{nk}\phi_{nk}(s)\phi_{nk}^{\top}(t) \quad \text{for any } s,t\in[a,b],
$$
where $\lambda_{n1} \ge \lambda_{n2} \ge \cdots \ge 0$ are the ordered eigenvalues and $\phi_{n1}(t), \phi_{n2}(t), \ldots$ are the associated eigenfunctions. The eigenfunctions are vector-valued orthonormal functions satisfying $\int_a^b\phi_{nk}^{\top}(t)\phi_{nl}(t)w(t)\,dt = \delta_{kl}$, where $\delta_{kl} = 1$ if $k = l$ and $0$ otherwise. Even though the eigenvalues $\lambda_{nk}$ change under different asymptotic scenarios, it is easy to verify that $\sum_{k=1}^{\infty}\lambda_{nk} = \mathrm{tr}\{\int\Gamma_n(t,t)w(t)\,dt\} = q$ for all cases in Proposition 4. Also note that, in the third case of Proposition 4, $\Gamma_n = \Gamma_0$ does not depend on $n$, and therefore $\lambda_{nk}\equiv\lambda_k$ and $\phi_{nk}(t)\equiv\phi_k(t)$ for all $k$.

To establish the asymptotic distribution of $T_n$, we need all the conditions in Chapter 2, with condition (C4)(ii) replaced by

(C4)(ii'): $2(1+\theta)/17 < \kappa_0$ if $\theta\in[0,1/8]$, and $1/8 < \kappa_0 < \theta$ if $\theta > 1/8$.

Under the null hypothesis, we can define a $q$-dimensional Gaussian process $U(t)$, with mean $\mathbf{0}$ and covariance $\mathrm{Cov}(U(s),U(t)) = \Gamma_n(s,t)$, as a counterpart of the process $U_n(t)$. We will show that the limiting distribution of $T_n$ is the same as that of $Z_n = \int_a^b U^{\top}(t)U(t)w(t)\,dt$, which follows a $\chi^2$-mixture distribution. This result is described in the following theorem, the proof of which is provided in Section 3.5.1.

Theorem 2. Under $H_0$ in (3.1.1) and Conditions (C1)-(C3), (C4)(i) and (C4)(ii'), $T_n \overset{d}{=} Z_n\{1+o_p(1)\}$, where $Z_n \overset{d}{=} \sum_{k=1}^{\infty}\lambda_{nk}\chi^2_{1,k}$ and $\chi^2_{1,k}$, $k = 1,2,\ldots$, are independent chi-square random variables with one degree of freedom.

Remark 5.
The asymptotic $\chi^2$-mixture distribution in Theorem 2 is quite different from the asymptotic normal distribution for classic empirical likelihood ratio tests for independent data, time series or sparse longitudinal data [CHL03, CZ10]. In fact, for dense functional data, our calculation shows that $E\{(T_n - ET_n)^4\} \neq 3\,\mathrm{var}^2(T_n)$, and hence $T_n$ can behave quite differently from a Gaussian variable. However, for sparse or moderately dense functional data with $\theta \le 1/16$, the $\chi^2$-mixture is also asymptotically normal. This result is collected in the following corollary, the proof of which is given in Section 3.5.1.

Corollary 1. Under the same conditions as those in Theorem 2, if $\theta \le 1/16$, we have
$$
h^{-1/2}(T_n - q) \xrightarrow{d} N(0, q\sigma_0^2),
$$
where $\sigma_0^2 = 2\{K^{(2)}(0)\}^{-2}\int_a^b w^2(t)\,dt\int_{-2}^2\{K^{(2)}(u)\}^2\,du$.

Corollary 1 makes a connection between our general results in Theorem 2 and the classic results. The null distribution of $T_n$ is different under different asymptotic scenarios and may depend on some unknown quantities such as $\lambda_{nk}$, which makes it difficult to use in practice. In the next subsection, we will propose a bootstrap method unanimously applicable to all types of functional data to estimate this null distribution. Next, we study the power of the simultaneous test under the local alternatives.

Theorem 3. Suppose that the local alternative hypothesis in (3.1.2) holds and Conditions (C1)-(C3), (C4)(i) and (C4)(ii') are satisfied.

(a) If $\theta \le 1/16$ and $b_n = n^{-1/2}(m^2h)^{-1/4}$, then
$$
h^{-1/2}(T_n - q) \xrightarrow{d} N(\gamma_0, q\sigma_0^2),
$$
where $\gamma_0 = \int_a^b d^{\top}(t)R(t)d(t)w(t)\,dt$ and $\sigma_0^2$ is defined in Corollary 1.

(b) If $1/16 < \theta \le 1/8$ and $b_n = n^{-1/2+\epsilon}$ for an arbitrarily small $\epsilon > 0$, then
$$
\sigma_1^{-1}(T_n - q - nb_n^2\gamma_0) \xrightarrow{d} N(0,1),
$$
where $\sigma_1^2 = 4nb_n^2(mh)^2\gamma_1$ and
$$
\gamma_1 = \int_a^b\int_a^b d^{\top}(t)R^{1/2}(t)\Gamma_0(t,s)R^{1/2}(s)d(s)w(t)w(s)\,dt\,ds.
$$

(c) If $\theta > 1/8$ and $b_n = n^{-1/2}$, let $u_k = \int_a^b[R^{1/2}(t)d(t)]^{\top}\phi_k(t)w(t)\,dt$. Then
$$
T_n \xrightarrow{d} \sum_{k=1}^{\infty}\lambda_k\chi^2_{1,k}(u_k^2/\lambda_k),
$$
where the $\chi^2_{1,k}(u_k^2/\lambda_k)$ are independent noncentral chi-square random variables with one degree of freedom and noncentrality parameters $u_k^2/\lambda_k$.

We can use Theorem 3 to examine the power and the size of detectable signals of the simultaneous test under different scenarios. We use the same principle (2.3.18) in Chapter 2 to determine the optimal rate for $b_n$. When $\theta \le 1/16$, following part (a) of Theorem 3, the asymptotic power of the test is $B(d) = \Phi(-z_{\alpha} + \gamma_0/\sqrt{q}\sigma_0)$, where $\gamma_0$ and $\sigma_0$ are defined in Theorem 3 and Corollary 1 and $\Phi(\cdot)$ is the CDF of a standard normal distribution. The test has nontrivial power for signals of size $b_n = n^{-1/2}(m^2h)^{-1/4}$. Under the constraints (C4)(i) and (C4)(ii') on $h$, $b_n$ attains its minimum at $h = n^{-2(1+\theta+\epsilon)/17}$ for any arbitrarily small $\epsilon > 0$, such that $b_n = n^{-8(1+\theta)/17+\epsilon/34}$. By letting $\epsilon \to 0$, the optimal detectable order is $b_n = n^{-8(1+\theta)/17}$.

When $1/16 < \theta \le 1/8$, by our calculations in Proposition 4 and Theorem 2 the null distribution of $T_n$ is a $\chi^2$-mixture with mean $(\sum_{k=1}^{\infty}\lambda_{nk})\{1+o(1)\} = q\{1+o(1)\}$ and variance $(2\sum_k\lambda_{nk}^2)\{1+o(1)\} = \mathrm{tr}\{\iint\Gamma_n^2(s,t)w(s)w(t)\,ds\,dt\}\{1+o(1)\}$. Therefore, the threshold for an $\alpha$-level test is of the form $q + c_{\alpha}$, where $c_{\alpha} \asymp (2\sum_k\lambda_{nk}^2)^{1/2} = O(mh)$ by Chebyshev's inequality. By part (b) of Theorem 3, the asymptotic power is
$$
B(d) = \Phi\Big(-\frac{c_{\alpha}}{2\sqrt{n}\,b_n\,mh\sqrt{\gamma_1}} + \frac{\gamma_0\sqrt{n}\,b_n}{2\,mh\sqrt{\gamma_1}}\Big) \to 1,
$$
for $b_n = n^{-1/2+\epsilon}$ with an arbitrarily small $\epsilon > 0$. This also means that the test has nontrivial power for signals of size $b_n = n^{-1/2}$.

Similarly, the power of the test under case (c) is
$$
B(d) = P\Big(\sum_{k=1}^{\infty}\lambda_k\chi^2_{1,k}(u_k^2/\lambda_k) > q + c_{\alpha}\Big),
$$
where $q + c_{\alpha}$ is the $(1-\alpha)$-th quantile of $\sum_{k=1}^{\infty}\lambda_k\chi^2_{1,k}$. In this case, $B(d)$ is a constant as long as $d(t)$ is a non-zero function, which implies that the test has non-trivial power if $b_n = n^{-1/2}$. Combining parts (b) and (c), the optimal detectable order of the simultaneous test is $b_n = n^{-1/2}$ when $\theta > 1/16$.

Note that the optimal detectable order for the simultaneous test is smaller than that of the pointwise test obtained in Chapter 2 when $\theta \le 1/8$. This is understandable because the simultaneous test borrows information over the entire time domain and is more powerful. Both the pointwise and simultaneous tests can detect signals of root-$n$ order for dense functional data with $\theta > 1/8$.
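The $\chi^2$-mixture null distribution in Theorem 2 has no closed-form quantiles, but it is straightforward to simulate once eigenvalues are available, truncating the infinite sum. A small sketch with toy eigenvalues chosen to sum to $q = 2$ (all numbers illustrative, not estimates from any data):

```python
import numpy as np

def chi2_mixture_draws(weights, size, rng):
    """Draws from sum_k weights[k] * chi2_{1,k}, built from independent
    one-degree-of-freedom chi-squares (squared standard normals)."""
    z = rng.normal(size=(size, len(weights)))
    return (z ** 2) @ np.asarray(weights)

rng = np.random.default_rng(2)
lam = np.array([1.2, 0.5, 0.2, 0.1])      # toy eigenvalues, sum = q = 2
draws = chi2_mixture_draws(lam, 200_000, rng)
crit = float(np.quantile(draws, 0.95))    # 5%-level critical value
```

The mean of the mixture equals $\sum_k\lambda_{nk} = q$, matching the centering $T_n - q$ used above, and the simulated 95% quantile plays the role of the threshold $q + c_{\alpha}$.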
3.2.2 Wild bootstrap procedure

The asymptotic distributions of $T_n$ are different for sparse and dense functional data, but the boundary between different scenarios is only defined in the asymptotic sense, making the different asymptotic scenarios very difficult to distinguish in practice. To unify the inference procedure, we propose a wild bootstrap procedure [Mam93]. Some residual-based bootstrap procedures have also been proposed in [Far97] and [ZC07] for dense functional data, but the consistency of such procedures was not investigated. The proposed bootstrap procedure consists of the following steps:

Step 1: Generate bootstrap samples $\{Y_{ij}^{(b)}, t_{ij}^{(b)}, X_{ij}^{(b)}\}_{b=1}^B$ according to the following model:
$$
Y_{ij}^* = \tilde{\beta}^{\top}(t_{ij})X_{ij} + \epsilon_{ij}^*,
$$
where $\tilde{\beta}(t_{ij})$ is the solution of the estimating equations in (2.3.15) in Chapter 2. The residual vector $\epsilon_i^* = (\epsilon_{i1}^*,\ldots,\epsilon_{im_i}^*)^{\top}$ is generated from an $m_i$-dimensional multivariate normal distribution with mean $\mathbf{0}$ and covariance $\hat{\Sigma}_i = (\hat{\sigma}(t_{ij},t_{ik}))_{j,k=1}^{m_i}$, where $\hat{\sigma}(t,s)$ is a consistent estimator of $\sigma(t,s)$ described in Section 2.4.2 in Chapter 2.

Step 2: Based on the $b$-th bootstrapped sample, compute a bootstrapped version of $T_n$, denoted as $T_n^{(b)}$.

Step 3: Repeat Steps 1 and 2 a large integer $B$ times to obtain $B$ bootstrap values $\{T_n^{(b)}\}_{b=1}^B$, and then the $100(1-\alpha)\%$ quantile of $\{T_n^{(b)}\}_{b=1}^B$, denoted as $\hat{t}_{\alpha}$. Reject the null hypothesis if $T_n > \hat{t}_{\alpha}$.

The following theorem justifies the above bootstrap procedure.

Theorem 4. Let $\mathcal{X}_n = \{(Y_{ij}, X_{ij}, t_{ij}), j = 1,\ldots,m_i, i = 1,\ldots,n\}$ denote the original data and $\mathcal{L}(T_n)$ be the asymptotic distribution of $T_n$ under the null hypothesis. Under the same conditions as Theorem 2, and supposing that $\hat{\sigma}(s,t)$ is a consistent covariance estimator, the conditional distribution of $T_n^*$ given $\mathcal{X}_n$, $\mathcal{L}(T_n^*\,|\,\mathcal{X}_n)$, converges to $\mathcal{L}(T_n)$ almost surely.
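Steps 1-3 above can be sketched in a few lines. The sketch below is a simplified stand-in: `compute_stat` is whatever routine produces $T_n$ from a (bootstrap) sample, the covariance $\hat{\Sigma}$ is treated as given and common to all curves, and all function names are ours rather than from any package.

```python
import numpy as np

def bootstrap_critical_value(sigma_hat, fitted, compute_stat,
                             B=500, alpha=0.05, rng=None):
    """Steps 1-3: simulate Gaussian residuals with covariance sigma_hat,
    recompute the statistic B times, return the (1 - alpha) quantile."""
    if rng is None:
        rng = np.random.default_rng()
    n, m = fitted.shape
    L = np.linalg.cholesky(sigma_hat)              # sigma_hat = L L'
    stats = np.empty(B)
    for b in range(B):
        eps = rng.normal(size=(n, m)) @ L.T        # rows ~ N(0, sigma_hat)
        stats[b] = compute_stat(fitted + eps)      # Step 2: bootstrap T_n
    return float(np.quantile(stats, 1.0 - alpha))  # Step 3: quantile t_hat

# toy illustration: "statistic" = squared grand mean of the sample
rng = np.random.default_rng(3)
sigma = 0.5 * np.eye(4) + 0.5      # exchangeable toy residual covariance
fitted = np.zeros((100, 4))        # stand-in for fitted curves beta~'(t)X
t_hat = bootstrap_critical_value(sigma, fitted, lambda y: y.mean() ** 2,
                                 B=400, rng=rng)
```

The null hypothesis would then be rejected when the observed statistic exceeds `t_hat`, exactly as in Step 3; in the actual procedure, `compute_stat` would rerun the full estimating-equation and integration pipeline behind (3.2.3).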
3.3 Simulation studies

For the simulation studies of simultaneous inference, we consider the same setup as in the simulation studies for the pointwise inference in Section 2.5 in Chapter 2. We considered two scenarios, A and B, corresponding to two hypotheses on $\beta(t)$. In scenario A, we used $H\{(z_1,z_2)^{\top}\} = z_1 - z_2$ to test
$$
H_{0A}: \beta_1(\cdot) = \beta_2(\cdot) \quad \text{vs} \quad H_{1A}: \beta_1(\cdot) \neq \beta_2(\cdot),
$$
where we set $\beta_1(t) = \frac{1}{2}\sin t$ and $\beta_2(t) = (\frac{1}{2}+a)\sin t$ for $a = 0, 0.1, 0.2, 0.3$ and $0.4$ in (2.5.19) in Chapter 2 to evaluate the empirical size (when $a = 0$) and powers (when $a > 0$). In scenario B, we set $H\{(z_1,z_2)^{\top}\} = z_2$ to test
$$
H_{0B}: \beta_2(\cdot) = 0 \quad \text{vs} \quad H_{1B}: \beta_2(\cdot) \neq 0,
$$
where we chose $\beta_1(t) = \frac{1}{2}\sin t$ and $\beta_2(t) = c$ for $c = 0, 0.02, 0.04, \ldots, 0.14$. In the construction of the test statistic $T_n$, we chose the weight function $w(t) = 1$ for $t \in (0,1)$ and $0$ otherwise. The covariance function was estimated by the quasi-maximum likelihood method of [FHL07]. All simulation results below were based on 500 simulation replicates, and the critical value of the test was estimated by 500 bootstrap samples in each simulation run. We performed the same bandwidth selection procedure in each bootstrap sample to take into account the extra variation in the test caused by bandwidth selection.

Table 3.1 summarizes the empirical sizes and powers for hypothesis $H_{0A}$ at the 5% nominal level. It can be seen that the empirical sizes are reasonably controlled around the nominal level. As we expected, the empirical power increases with the sample size $n$ and the number of repeated measurements $m$, which confirms our theoretical results in Section 3.2. In addition, the correlation $\rho$ does not have a clear impact on the power, indicating that the proposed procedure is robust with respect to the covariance structure of the random error.

The simulation results for scenario B are illustrated in Figure 3.1. The results under $n = 100$ and $n = 200$ are represented by solid and dashed lines, respectively. We observed a very similar pattern to that under scenario A. The size is well controlled at the 5% nominal level and the power increases as the value of $c$ increases. At each value of $c$, the power increases as we increase $n$ or $m$.
75 Table3.1:Empiricalsizeandpowerfortesting H 0 A : 1 ( )= 2 ( )underscenarioA. m =5 m =10 m =50 anˆ =0.2 ˆ =0.5 ˆ =0.2 ˆ =0.5 ˆ =0.2 ˆ =0.5 0.01000.0620.0580.0640.0480.0700.054 2000.0600.0520.0680.0440.0580.066 0.11000.1340.1320.1880.2120.7720.764 2000.2240.2280.3880.3440.9840.966 0.21000.3440.4060.6760.7081.0001.000 2000.7240.7340.9480.9481.0001.000 0.31000.7460.7480.9760.9821.0001.000 2000.9740.9740.9981.0001.0001.000 0.41000.9620.9601.0001.0001.0001.000 2001.0001.0001.0001.0001.0001.000 (a) ˆ =0 : 2 (b) ˆ =0 : 5 Figure3.1: Empiricalsizeandpowerfortesting H 0 B : 2 ( )=0atthe5%nominallevelunder scenarioB.Theleftpanelisfor ˆ =0 : 2andtherightpanelisfor ˆ =0 : 5. 76 3.4Realdataanalysis Weappliedourproposedmethodstotworealfunctionaldatasets,oneissparseandthe otherisdense. 3.4.1CD4dataanalysis Thisdatasetwascollectedfromarandomizeddouble-blindedstudyofAIDSpatientswith advancedimmunesuppression(CD4counts 50cells/mm 3 )conductedbytheAIDSClinical TrialGroup(ACTG)Study193A.Patientswererandomlyassignedtodualortriplecombi- nationsofHIV-1reversetranscriptaseinhibitors.Sp,patientswererandomizedto oneoffourdailyregimenscontaining600mgofzidovudine:zidovudinealternatingmonthly with400mgdidanosine(treatmentI);zidovudineplus2.25mgofzalcitabine(treatmentII); zidovudineplus400mgofdidanosine(treatmentIII);orzidovudineplus400mgofdidano- sineplus400mgofnevirapine(treatmentIV).Therewasatotalof1309patientsincluded inthestudyand325,324,330and330patientswere,respectively,assignedtotreatments I-IV.MeasurementsofCD4countswerecollectedatbaselineandat8-weekintervalsduring follow-up.Butduetovariousreasons,suchasdropoutandskippedvisits,therepeated measurementswereunbalanced.Thenumberofrepeatedmeasurementsduringthe40 weeksoffollow-upvariedfrom1to9,withamedianof4.Thus,thedatacanbeconsidered assparsefunctionaldata.Moredetailsofthestudycanbefoundin[KAC + 98]. 
Our interest is to study the treatment effects on the CD4 counts. We consider the response variable to be log(CD4 counts + 1). To test for treatment effects, we set treatment IV as the baseline and define three dummy variables $T_1$, $T_2$ and $T_3$ as indicators of treatments I-III, respectively. Then we fit the data with the following functional linear model:
\[
Y_i(t_{ij}) = \beta_0(t_{ij}) + \beta_1(t_{ij})T_{1i} + \beta_2(t_{ij})T_{2i} + \beta_3(t_{ij})T_{3i} + \beta_4(t_{ij})\mathrm{Age}_i(t_{ij}) + \beta_5(t_{ij})\mathrm{Gender}_i + \beta_6(t_{ij})\mathrm{PreCD4}_i + \epsilon_i(t_{ij}),
\]
for $i=1,\ldots,1309$ and $j=1,\ldots,m_i$, where $Y(t)=\log(\text{CD4 counts}+1)$ is the response and $t$ is the time (in weeks). We also included Age, Gender and PreCD4 as covariates in the model and allowed the Age effect to change over $t$.

To test for treatment effects, we considered the global hypotheses
\[
H_{01}: \beta_1(\cdot)=\beta_2(\cdot)=\beta_3(\cdot)=0 \quad\text{vs}\quad H_{11}: \text{at least one of } \beta_k(\cdot)\neq 0,\ k=1,2,3.
\]
We applied the proposed simultaneous test based on 1000 bootstrap replicates. The bandwidth was selected by the proposed procedure in Section 2.4. We obtained a p-value of $<0.001$, indicating that the treatment effects are indeed significant. To further dissect the differences between treatments, we conducted pairwise comparisons among treatments. The results are summarized in Table 3.2. All the p-values for the pairwise comparisons, except the one comparing treatments II and III, are less than 5%. The results indicate that the pairwise differences in time effects between different treatment groups are statistically significant, except for treatment II vs III.

Table 3.2: P-values for pairwise comparisons among different treatment groups.

  Comparison    Hypothesis                                p-value
  I vs II       $H_{02}: \beta_1(\cdot)=\beta_2(\cdot)$   0.040
  I vs III      $H_{03}: \beta_1(\cdot)=\beta_3(\cdot)$   0.000
  I vs IV       $H_{04}: \beta_1(\cdot)=0$                0.000
  II vs III     $H_{05}: \beta_2(\cdot)=\beta_3(\cdot)$   0.078
  II vs IV      $H_{06}: \beta_2(\cdot)=0$                0.000
  III vs IV     $H_{07}: \beta_3(\cdot)=0$                0.002

3.4.2 Ergonomics data analysis

As part of a study of the body motions of automobile drivers, researchers at the Center for Ergonomics at the University of Michigan collected data on the motion of a single individual reaching to 20 target locations within a test car. For each location, the researchers measured 3 times the angle formed at the right elbow between the upper and lower arms, which yielded a sample of size $20\times 3 = 60$. The angle of each motion was recorded repeatedly from the start to the end of each test drive. The time period of each motion varied in length, because the targets were at different distances from the driver and the driver may reach them at different speeds. The objective of the study was to model the shape of the motion, but not the speed at which it occurred. Thus, in this study, $t$ is used to represent the proportion, not the time, of the motion between the start and the end. See [Far97] and [SF04] for a more detailed description of this data set.

Let $Y(t)$ represent the angle at a proportion $t$ for $t\in[0,1]$. For a given motion, $Y(t)$ is observed on an equally spaced grid of points. Although the number of such points in the original data varies from observation to observation, the number of repeated measurements for each motion is 20 after imputation, so the data were considered dense functional data, as in [Zha11]. The purpose of our study was to find a model for predicting the right elbow angle curve $Y(t)$, $t\in[0,1]$, given the coordinates $(c_x,c_y,c_z)$ of the target, where $c_x$ represents the "left to right" direction, $c_y$ represents the "close to far" direction, and $c_z$ represents the "down to up" direction. The coordinates $(c_x,c_y,c_z)$ of each of the 20 targets in the experiment were known and used as predictors in our model. [SF04] compared a linear model, a quadratic model and a one-way ANOVA model. They found that a quadratic model of the following form fits the data adequately:
\[
Y_i(t_{ij}) = \beta_1(t_{ij}) + c_{xi}\beta_2(t_{ij}) + c_{yi}\beta_3(t_{ij}) + c_{zi}\beta_4(t_{ij}) + c_{xi}^2\beta_5(t_{ij}) + c_{yi}^2\beta_6(t_{ij}) + c_{zi}^2\beta_7(t_{ij}) + c_{xi}c_{yi}\beta_8(t_{ij}) + c_{yi}c_{zi}\beta_9(t_{ij}) + c_{zi}c_{xi}\beta_{10}(t_{ij}) + \epsilon_i(t_{ij}), \tag{3.4.4}
\]
for $i=1,\ldots,60$ and $j=1,\ldots,20$.
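Pointwise estimation of coefficient functions in models like (3.4.4) relies on local linear kernel smoothing. The following is a minimal sketch of that step; the Epanechnikov kernel is an assumption, and the bandwidth would in practice be chosen by the selection procedure of Section 2.4.

```python
import numpy as np

def local_linear_beta(t0, tij, X, Y, h):
    """Local linear estimate of beta(t0) in Y_ij = X_ij' beta(t_ij) + eps_ij.

    tij, Y: pooled (N,) observation times and responses; X: (N, q) covariates.
    Minimizes sum_j K_h(t_ij - t0) * (Y - X'a - (t_ij - t0) X'b)^2 over (a, b)
    and returns a = beta_hat(t0).  Epanechnikov kernel (an assumption).
    """
    u = (tij - t0) / h
    w = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0) / h   # K_h weights
    D = np.hstack([X, (tij - t0)[:, None] * X])                    # [X, (t - t0) X]
    A = D.T @ (w[:, None] * D)
    b = D.T @ (w * Y)
    q = X.shape[1]
    return np.linalg.solve(A, b)[:q]
```

For constant coefficient functions this recovers the true coefficients exactly (up to numerical error), which is a convenient sanity check.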
We started with model (3.4.4) and tested each of the coefficient functions $\beta_k(t)$, $k=1,\ldots,10$, to check which terms could be dropped from the model. Table 3.3 summarizes the p-values for testing each coefficient function. At the 5% significance level, we can see that $\beta_7(t)$, $\beta_9(t)$ and $\beta_{10}(t)$ are not significant, suggesting that they be deleted from the quadratic model (3.4.4). We then obtained the reduced model
\[
Y(t) = \beta_1(t) + c_x\beta_2(t) + c_y\beta_3(t) + c_z\beta_4(t) + c_x^2\beta_5(t) + c_y^2\beta_6(t) + c_xc_y\beta_8(t) + \epsilon(t).
\]
From the reduced model, we can see that the angle curve $Y(t)$ has a significant linear relationship with the "down to up" coordinate $z$, but a significant quadratic relationship with the "left to right" coordinate $x$ and the "close to far" coordinate $y$. The model selected above is consistent with the model chosen by [Zha11].

Table 3.3: P-values for testing each coefficient function in the quadratic model (3.4.4).

  Hypothesis                      p-value    Hypothesis                          p-value
  $H_{01}: \beta_1(\cdot)=0$      0.000      $H_{06}: \beta_6(\cdot)=0$          0.032
  $H_{02}: \beta_2(\cdot)=0$      0.006      $H_{07}: \beta_7(\cdot)=0$          0.050
  $H_{03}: \beta_3(\cdot)=0$      0.006      $H_{08}: \beta_8(\cdot)=0$          0.004
  $H_{04}: \beta_4(\cdot)=0$      0.005      $H_{09}: \beta_9(\cdot)=0$          0.080
  $H_{05}: \beta_5(\cdot)=0$      0.038      $H_{0,10}: \beta_{10}(\cdot)=0$     0.109

3.5 Technical Details

This section contains the proofs for the main theorems in Section 3.2. Proofs for the propositions can be found in the next section.

3.5.1 Proofs of Main Theorems

3.5.1.1 Proof of Theorem 2

Proof of Theorem 2. We first prove the case with $\theta\in[0,1/8]$, under which we choose the bandwidth $h=n^{-\theta_0}$ with $2(1+\theta)/17 < \theta_0 < 1/2$. In this scenario, it is easy to see that $mh\to 0$. In this case, we have the decomposition $T_n = T_{n1} + T_{n2}$, where
\[
T_{n1} = C_0^{-2}\sum_{i=1}^n\int_a^b \xi_i^\top(t)G^\top(t)G(t)\xi_i(t)w(t)\,dt,\qquad
T_{n2} = C_0^{-2}\sum_{i=1}^n\sum_{k\neq i}\int_a^b \xi_i^\top(t)G^\top(t)G(t)\xi_k(t)w(t)\,dt.
\]
It can then be shown that
\[
E(T_{n1}) = q + \frac{qh(m-r)}{r\,\nu_{20}}\int_a^b f(t)w(t)\,dt + O(h^2),
\]
\[
\mathrm{Var}(T_{n1}) = \Big\{q + \frac{qh(m-r)}{r\,\nu_{20}}\int_a^b f(t)w(t)\,dt\Big\}^2 + O(h^2+1/n) - \Big\{q + \frac{qh(m-r)}{r\,\nu_{20}}\int_a^b f(t)w(t)\,dt\Big\}^2 + O(h^2) = O(h^2+1/n),
\]
and $E(T_{n2})=0$,
\[
\mathrm{Var}(T_{n2}) = \frac{2qh}{\nu_{20}^2}\int_{-2}^2 [K^{(2)}(u)]^2\,du\int_a^b w^2(t)\,dt + 2(mh)^2\int_a^b\int_a^b \mathrm{tr}\{\Omega_0(t,s)\Omega_0(s,t)\}w(t)w(s)\,dt\,ds + O(mh^2 + h/n).
\]
Hence we have $\mathrm{Var}(T_{n1}) = O(h^2+1/n) = o\{\mathrm{Var}(T_{n2})\}$. It follows that
\[
T_n - E(T_n) = T_{n1} - E(T_{n1}) + T_{n2} = T_{n2}\{1+o_p(1)\}.
\]
Thus, to study the asymptotic behavior of $T_n$, we only need to study that of $T_{n2}$.

In fact, we can write $T_{n2}$ as
\[
T_{n2} = \frac1n\sum_{i\neq k}\int_a^b Z_i^\top(t)Z_k(t)w(t)\,dt,
\]
where $Z_i(t) = \sqrt{mh}\,G(t)\xi_i(t)$. Let
\[
U_n = \frac{1}{n-1}T_{n2} = \frac{2}{n(n-1)}\sum_{1\le i<k\le n}\int_a^b Z_i^\top(t)Z_k(t)w(t)\,dt.
\]

Next consider the case with $\theta > 1/8$. In this case, we choose the bandwidth $h = n^{-\theta_0}$ with $1/8 < \theta_0 < \min\{\theta, 1/2\}$. Under this scenario, we have $mh\to\infty$. By Lemma 9 in Section 3.5.2, we know the process $U_n(t)$ asymptotically converges to a Gaussian process $U(t)$ with mean 0 and covariance function $\Omega_0(s,t)$. Thus the limiting distribution of $T_n$ is the same as the distribution of $Z = \int_a^b U^\top(t)U(t)w(t)\,dt$, so we only need to find the distribution of $Z$. To this end, we use the following Karhunen-Loève representation for $U(t)$ [Bal60]:
\[
U(t) = \sum_{k=1}^\infty \xi_k\phi_k(t),
\]
where the $\xi_k = \int_a^b U^\top(t)\phi_k(t)w(t)\,dt$ are independent ($k=1,2,\ldots$) normal random variables with mean 0 and variance $\lambda_k$. Here $\lambda_k$ and $\phi_k(t)$ are, respectively, the $k$-th ordered eigenvalue of $\Omega_0(s,t)$ and the corresponding eigenfunction in $\mathbb R^q$. Then we have
\[
Z = \sum_{k=1}^\infty\sum_{l=1}^\infty \xi_k\xi_l\int_a^b\phi_k^\top(t)\phi_l(t)w(t)\,dt = \sum_{k=1}^\infty \xi_k^2.
\]
Since the $\xi_k$ are independent $N(0,\lambda_k)$, we have $T_n \xrightarrow{d} Z = \sum_{k=1}^\infty \lambda_k\chi^2_{1,k}$.
Thus, by combining the above two cases, we complete the proof of part (b).

3.5.1.2 Proof of Corollary 1

Proof. From Theorem 2, $T_n$ has the same limiting distribution as $Z_n = \sum_{k=1}^\infty \lambda_{nk}\chi^2_{1,k}$, so we only need to show the asymptotic normality of $\sum_{k=1}^\infty\lambda_{nk}\chi^2_{1,k}$. By the Lyapunov central limit theorem, if the condition
\[
\sum_{k=1}^\infty \lambda_{nk}^4\Big/\Big(\sum_{k=1}^\infty\lambda_{nk}^2\Big)^2 \to 0 \tag{3.5.5}
\]
holds, then we have
\[
\frac{Z_n - \sum_{k=1}^\infty\lambda_{nk}}{\sqrt{2\sum_{k=1}^\infty\lambda_{nk}^2}} \xrightarrow{d} N(0,1).
\]
Using Proposition 4, $\Omega_n(s,t) = \nu_{20}^{-1}K^{(2)}\{(s-t)/h\}I_q$ and, in particular, $\Omega_n(t,t) = I_q$, we find that
\[
\sum_{k=1}^\infty\lambda_{nk} = \int_a^b\mathrm{tr}\{\Omega_n(t,t)\}w(t)\,dt = q\int_a^b w(t)\,dt = q
\]
and
\[
\sum_{k=1}^\infty\lambda_{nk}^2 = q\int_a^b\int_a^b \nu_{20}^{-2}\big[K^{(2)}\{(s-t)/h\}\big]^2 w(s)w(t)\,ds\,dt = qh\sigma_0^2/2,
\]
where $\sigma_0^2$ was defined in the corollary. Therefore, the conclusion holds once we verify condition (3.5.5). Let $\Phi(s,t) = \nu_{20}^{-1}K^{(2)}\{(s-t)/h\}$. Then
\[
\sum_{k=1}^\infty\lambda_{nk}^4 = q\iiiint \Phi(s,t)\Phi(t,l)\Phi(l,m)\Phi(m,s)w(s)w(t)w(l)w(m)\,ds\,dt\,dl\,dm = qh^3C_0\nu_{20}^{-4}\int_a^b w^4(t)\,dt,
\]
where $C_0 = \int K^{(2)}(u_1)K^{(2)}(u_2)K^{(2)}(u_3)K^{(2)}(u_1+u_2+u_3)\,du_1\,du_2\,du_3$ is a constant. Thus condition (3.5.5) holds. This completes the proof of the corollary.

3.5.1.3 Proof of Theorem 3

Proof of Theorem 3. First notice that $-2\ell(t) = n^2C_0^{-2}\tilde\Delta^\top(t)R^{-1}(t)\tilde\Delta(t) + o_p(h^{1/2})$ and, under the local alternative,
\[
\tilde\Delta(t) = R(t)C(t)A^{-1}(t)\frac1n\sum_{i=1}^n g_i\{\beta_0(t)\} + R(t)H\{\beta_0(t)\} + \tilde o_p(\delta_n).
\]
We then define
\[
U_n^+(t) = G(t)C_0^{-1}\sum_{i=1}^n\xi_i(t) - nC_0^{-1}R^{1/2}(t)H\{\beta_0(t)\}.
\]
First consider the proof of part (a), with $0\le\theta\le 1/16$, under which we choose the bandwidth $h = n^{-\theta_0}$ with $2(1+\theta)/17 < \theta_0 < 1/2$. In this scenario, we have $m^2h\to 0$.
We have
\[
T_n = \int_a^b -2\ell(t)w(t)\,dt = \int_a^b U_n^{+\top}(t)U_n^+(t)w(t)\,dt + o_p(h^{1/2})
\]
\[
= C_0^{-2}\sum_{i=1}^n\sum_{k=1}^n\int_a^b\xi_i^\top(t)G^\top(t)G(t)\xi_k(t)w(t)\,dt - 2nC_0^{-2}\sum_{i=1}^n\int_a^b\xi_i^\top(t)G^\top(t)R^{1/2}(t)H\{\beta_0(t)\}w(t)\,dt
\]
\[
\quad + n^2C_0^{-2}\int_a^b H^\top\{\beta_0(t)\}R(t)H\{\beta_0(t)\}w(t)\,dt + o_p(h^{1/2}) := R_{n1} - 2R_{n2} + R_{n3} + o_p(h^{1/2}).
\]
Then, by the result in Corollary 1, we have $h^{-1/2}\{R_{n1}-q\}\xrightarrow{d} N(0, q\sigma_0^2)$. For $R_{n2}$, obviously $E(R_{n2})=0$, and
\[
\mathrm{Var}(R_{n2}) = n^2b_n^2C_0^{-4}\sum_{i=1}^n\sum_{j=1}^{m_i}\sum_{l=1}^{m_i}\frac{1}{m_i^2}\,E\Big[\int_a^b\int_a^b X_{ij}^\top G^\top(t)R^{1/2}(t)d(t)\,X_{il}^\top G^\top(s)R^{1/2}(s)d(s)\,\epsilon_{ij}\epsilon_{il}K_h(t_{ij}-t)K_h(t_{il}-s)w(t)w(s)\,dt\,ds\Big]
\]
\[
= O(n^3b_n^2C_0^{-4}) = O\{n(b_nmh)^2\}.
\]
Since in this case $b_n = (nm)^{-1/2}h^{-1/4}$, we have $\mathrm{Var}(R_{n2}) = O(mh^{3/2})$. Thus $h^{-1/2}R_{n2}\xrightarrow{p}0$, since $mh^{1/2}\to 0$ in this case. For $R_{n3}$, which is non-random, we have
\[
h^{-1/2}R_{n3} = \int_a^b d^\top(t)R(t)d(t)w(t)\,dt.
\]
Thus we have $h^{-1/2}(T_n - q)\xrightarrow{d} N(\delta_0, q\sigma_0^2)$, where $\delta_0 = \int_a^b d^\top(t)R(t)d(t)w(t)\,dt$.

For part (b), with $1/16 < \theta\le 1/8$, we choose the bandwidth $h = n^{-\theta_0}$ with $2(1+\theta)/17 < \theta_0 < 1/2$; in this scenario the choice also ensures that $m^2h\to\infty$ and $mh\to 0$.
We write
\[
U_n^+(t) = C_0^{-1}\sum_{i=1}^n G(t)\xi_i(t) - b_nR^{1/2}(t)d(t) := \frac{1}{\sqrt n}\sum_{i=1}^n Z_i^+(t),
\]
where $Z_i^+(t) = \sqrt{mh}\,G(t)\xi_i(t) - b_nR^{1/2}(t)d(t)$. Then we have
\[
T_n = \int_a^b -2\ell(t)w(t)\,dt = \int_a^b U_n^{+\top}(t)U_n^+(t)w(t)\,dt + o_p(mh) = \frac1n\sum_{i=1}^n\sum_{k=1}^n\int_a^b Z_i^{+\top}(t)Z_k^+(t)w(t)\,dt + o_p(mh) := T_{n1}^+ + T_{n2}^+ + o_p(mh),
\]
where $T_{n1}^+ = \frac1n\sum_{i=1}^n\int_a^b Z_i^{+\top}(t)Z_i^+(t)w(t)\,dt$ and $T_{n2}^+ = \frac1n\sum_{i\neq k}\int_a^b Z_i^{+\top}(t)Z_k^+(t)w(t)\,dt$. By a calculation similar to that for $T_{n1}$ under the null hypothesis, we have
\[
E(T_{n1}^+) = q + \frac{(m-r)qh}{r\,\nu_{20}}\int_a^b f(t)w(t)\,dt + mhb_n^2\delta_0 + O(h^2),\qquad \mathrm{Var}(T_{n1}^+) = O(h^2+1/n).
\]
For $T_{n2}^+$, we define the U-statistic
\[
U_n = \frac{T_{n2}^+}{n-1} = \frac{1}{n(n-1)}\sum_{i\neq k}\int_a^b Z_i^{+\top}(t)Z_k^+(t)w(t)\,dt = \frac{1}{n(n-1)}\sum_{i\neq k}\mathcal K(Z_i^+,Z_k^+),
\]
where the kernel function $\mathcal K$ is the same as in the proof for the null case. It is easy to show that $\vartheta := E\,\mathcal K(Z_1^+,Z_2^+) = mhb_n^2\delta_0$. The first projection
\[
\mathcal K_1(Z_1^+) = E\{\mathcal K(Z_1^+,Z_2^+)\mid Z_1^+\} = b_n\sqrt{mh}\int_a^b Z_1^{+\top}(t)R^{1/2}(t)d(t)w(t)\,dt
\]
has variance $\zeta_1$, which can be obtained from
\[
E\,\mathcal K_1^2(Z_1^+) = b_n^2\,mh\int_a^b\int_a^b d^\top(t)R^{1/2}(t)E\{Z_1^+(t)Z_1^{+\top}(s)\}R^{1/2}(s)d(s)w(t)w(s)\,dt\,ds.
\]
Therefore, we have $\zeta_1 = b_n^2(mh)^2\Delta_1 + O(b_n^2mh^2)$, where $\Delta_1$ is defined in Theorem 3. We also have $\zeta_2 = \mathrm{Var}\{\mathcal K(Z_1^+,Z_2^+)\} = (mh)^2V + O(h + b_n^2m^2h^2)$, where
\[
V = 2\int_a^b\int_a^b \mathrm{tr}\{\Omega_0(s,t)\Omega_0(t,s)\}w(t)w(s)\,dt\,ds.
\]
Thus, by U-statistic theory, if $\zeta_2 = o(n\zeta_1)$, which is equivalent to $b_n^{-1} = o(\sqrt n)$, we have $U_n\sim \mathrm{AN}(\vartheta, 4\zeta_1/n)$, provided that the projection sequence $\{\mathcal K_1(Z_i^+)\}_{i=1}^n$ satisfies the Lyapunov condition, which can be verified as follows. Since $E\,\mathcal K_1(Z_i^+) = \vartheta$, $\mathrm{Var}\{\mathcal K_1(Z_i^+)\} = \zeta_1$ and $E\{\mathcal K_1(Z_i^+)\}^4 \asymp b_n^4(mh)^4$ up to a constant, we have
\[
\frac{\sum_{i=1}^n E\{\mathcal K_1(Z_i^+)\}^4}{\big[\sum_{i=1}^n\mathrm{Var}\{\mathcal K_1(Z_i^+)\}\big]^2} \asymp \frac{nb_n^4(mh)^4}{n^2b_n^4(mh)^4} = \frac1n \to 0.
\]
Thus, if $b_n^{-1} = o(\sqrt n)$, we have $T_{n2}^+\sim \mathrm{AN}\big(n\vartheta,\ 4nb_n^2(mh)^2\Delta_1\big)$, and the conclusion in part (b) holds.

For part (c), with $\theta > 1/8$, we choose the bandwidth $h = n^{-\theta_0}$ with $1/8 < \theta_0 < \min\{\theta, 1/2\}$. In this scenario, we have $mh\to\infty$. Since $b_n = n^{-1/2}$ and $C_0 = n^{1/2}$, we have
\[
U_n^+(t) = G(t)C_0^{-1}\sum_{i=1}^n\xi_i(t) - R^{1/2}(t)d(t).
\]
By Lemma 9 in Section 3.5.2, we know $U_n^+(t)$ asymptotically converges to a Gaussian process $U^+(t)$ with mean $-R^{1/2}(t)d(t)$ and covariance function $\Omega_0(s,t)$. Thus the limiting distribution of $T_n$ is the same as the distribution of $Z^+ := \int_a^b U^{+\top}(t)U^+(t)w(t)\,dt$, so we only need to find the distribution of $Z^+$. To this end, we use the Karhunen-Loève decomposition for $U^+(t)$ [Bal60], $U^+(t) = \sum_{k=1}^\infty\xi_k^+\phi_k(t)$, where the $\xi_k^+ = \int_a^b U^{+\top}(t)\phi_k(t)w(t)\,dt$ are independent ($k=1,2,\ldots$) normal random variables with mean $u_k$ and variance $\lambda_k$. Here $\lambda_k$ and $\phi_k(t)$ are the $k$-th ordered eigenvalue of $\Omega_0(s,t)$ and the corresponding eigenfunction in $\mathbb R^q$. Then we have
\[
Z^+ = \sum_{k=1}^\infty\sum_{l=1}^\infty\xi_k^+\xi_l^+\int_a^b\phi_k^\top(t)\phi_l(t)w(t)\,dt = \sum_{k=1}^\infty\xi_k^{+2}.
\]
Because the $\xi_k^+$ are independent $N(u_k,\lambda_k)$, we have $T_n\xrightarrow{d}\sum_{k=1}^\infty\lambda_k\chi^2_{1,k}(u_k^2/\lambda_k)$, a weighted sum of independent noncentral chi-square variables with noncentrality parameters $u_k^2/\lambda_k$. This completes the proof of part (c).

3.5.1.4 Proof of Theorem 4

Proof of Theorem 4. Conditional on the data $\mathcal X_n = \{Y_{ij},X_{ij},t_{ij}\}_{i=1}^n$, the bootstrapped sample is generated according to $Y_{ij}^* = \tilde\beta^\top(t_{ij})X_{ij} + \epsilon_{ij}^*$, which can be regarded as an analog of model (2.1.1) with true coefficient function $\tilde\beta(t)$, where $\epsilon_{ij}^*$ has mean 0 and covariance $\hat\Omega(s,t)$. Let $o_p^*(1)$ and $O_p^*(1)$ denote stochastic orders with respect to the conditional probability measure given the original sample.
Based on the bootstrapped sample $\{Y_{ij}^{(b)}, t_{ij}, X_{ij}: i=1,\ldots,n;\ j=1,\ldots,m_i\}$, we estimate the true $\tilde\beta(t)$ by local linear smoothing with the estimator $\hat\beta^*(t)$, which differs from the original $\hat\beta(t)$ only through the error terms. Our estimating equation is then constructed as
\[
g_i^*\{\beta(t)\} = \frac{1}{m_i}\sum_{j=1}^{m_i}\Big[Y_{ij}^* - \beta^\top(t)X_{ij} - \{\hat\beta^*(t_{ij}) - \hat\beta^*(t)\}^\top X_{ij}\Big]X_{ij}K_h(t_{ij}-t).
\]
Since we proved in the proof of Lemma 1 in Section 2.6.2 that
\[
\sup_{t\in[a,b]}\Big\|\frac1n\sum_{i=1}^n\frac{1}{m_i}\sum_{j=1}^{m_i}X_{ij}X_{ij}^\top K_h(t_{ij}-t) - \Gamma(t)f(t)\Big\| = O(\delta_n)\quad\text{a.s.},
\]
and by a proof similar to that of Lemma 2 in Section 2.6.2, we have
\[
g_i^*\{\tilde\beta(t)\} = \xi_i^*(t)\{1+\tilde o_p(1)\} + \tilde O(h^4)\quad\text{a.s.},
\]
where $\xi_i^*(t) = \frac{1}{m_i}\sum_{j=1}^{m_i}X_{ij}\epsilon_{ij}^*K_h(t_{ij}-t)$. Here and below, the almost sure convergence is with respect to the original probability measure; that is, the statements hold for almost all sample points in the sample space of $\mathcal X_n$ when $n$ is sufficiently large. Then, using the fact that $\sup_t\|\tilde\beta(t)-\beta_0(t)\| = O(\delta_{n1}+h^4)$ a.s., and similarly to (2.6.29), we have the following results almost surely:
\[
-2\ell^*(t) = C_0^{-2}\Big\{\sum_{i=1}^n g_i^*(\tilde\beta)\Big\}^\top A^{-1}C^\top RCA^{-1}\Big\{\sum_{i=1}^n g_i^*(\tilde\beta)\Big\} + o_p^*(1) + \tilde O(\delta_{n1}+h^4)
\]
\[
= C_0^{-2}\Big\{\sum_{i=1}^n g_i^*(\tilde\beta)\Big\}^\top G^\top G\Big\{\sum_{i=1}^n g_i^*(\tilde\beta)\Big\} + o_p^*(1) + \tilde O(\delta_{n1}+h^4)
= U_n^{*\top}(t)U_n^*(t)\{1+o_p^*(1)\} + \tilde O(\delta_{n1}+h^4),
\]
where $U_n^*(t) = C_0^{-1}G(t)\sum_{i=1}^n\xi_i^*(t)$ with $G(t) = R^{1/2}(t)C(t)A^{-1}(t)$.

Thus the bootstrapped version of the test statistic, $T_n^*$, can be represented as
\[
T_n^* = \int_a^b U_n^{*\top}(t)U_n^*(t)w(t)\,dt\,\{1+o_p^*(1)\} + o(1)\quad\text{a.s.}\tag{3.5.6}
\]
Let $d(F,G)$ denote the maximum norm distance between two distribution functions $F$ and $G$, that is, $d(F,G)=\sup_x|F(x)-G(x)|$. From the proof of Theorem 2, the conditions required to show the convergence $d\big(\mathcal L(\int_a^b U_n^\top(t)U_n(t)w(t)\,dt),\ \mathcal L(T_n)\big)\to 0$ are the independence between $X_i(t)$ and $\epsilon_i(t)$, and that the $\epsilon_i(t)$ are independent with $E\{\epsilon_i(t)\}=0$ and finite moments for $i=1,\ldots,n$.
To show that $d\big(\mathcal L(\int_a^b U_n^{*\top}(t)U_n^*(t)w(t)\,dt\mid\mathcal X_n),\ \mathcal L(T_n)\big)\to 0$ as $n\to\infty$, we note that the difference between $\int_a^b U_n^{*\top}U_n^*w\,dt$ and $\int_a^b U_n^\top U_nw\,dt$ is that $\epsilon_i(t)$ is replaced by $\epsilon_i^*(t)$, which has mean 0 and covariance $\hat\Omega(s,t)$. Since $\hat\Omega(s,t)$ is a consistent estimator of $\Omega(s,t)$, and by the construction of $\epsilon_i^*(t)$ in the wild bootstrap procedure, given $\mathcal X_n$ we have independence between $X_i(t)$ and $\epsilon_i^*(t)$, $E\{\epsilon_i^*(t)\}=0$, and $\epsilon_i^*(t)$ has finite moments. Thus, by a standard modification of the proof of Theorem 2, we have $d\big(\mathcal L(\int_a^b U_n^{*\top}U_n^*w\,dt\mid\mathcal X_n),\ \mathcal L(T_n)\big)\to 0$. Together with (3.5.6), this gives $d\big(\mathcal L(T_n^*\mid\mathcal X_n),\ \mathcal L(T_n)\big)\to 0$ almost surely.

3.5.2 Proofs of Proposition and Lemma

Lemma 9. Under assumptions (C1)-(C4), for dense functional data, $U_n(t)$ converges to a multivariate Gaussian process with mean 0 and covariance matrix $\Omega_0$ defined in Proposition 4 in Section 3.2.

Proof. It is clear that $E\{U_n(t)\} = C_0^{-1}\sum_{i=1}^n G(t)E\{\xi_i(t)\} = 0$ and
\[
\mathrm{Cov}\{U_n(s),U_n(t)\} = C_0^{-2}\sum_{i=1}^n G(s)E\{\xi_i(s)\xi_i^\top(t)\}G^\top(t).
\]
For computing $E\{\xi_i(s)\xi_i^\top(t)\}$, by similar calculations as before, we have
\[
E\{\xi_i(s)\xi_i^\top(t)\} = \frac{K^{(2)}\{(s-t)/h\}}{m_ih}\Gamma(s,s)\Omega(s,s)f(s) + \frac{m_i-1}{m_i}\Gamma(s,t)\Omega(s,t)f(s)f(t) + \tilde O(h^2),
\]
where $K^{(2)}(x) = \int K(y)K(y-x)\,dy$, $\Gamma(s,t) = E\{X(s)X^\top(t)\}$ and $\Omega(s,t) = E\{\epsilon(s)\epsilon(t)\}$. Then we have
\[
\mathrm{Cov}\{U_n(s),U_n(t)\} = G(s)\Gamma(s,s)\Omega(s,s)G^\top(t)f(s)K^{(2)}\{(s-t)/h\}\,C_0^{-2}\sum_{i=1}^n\frac{1}{m_ih} + G(s)\Gamma(s,t)\Omega(s,t)G^\top(t)f(s)f(t)\,C_0^{-2}\sum_{i=1}^n\frac{m_i-1}{m_i} + n\tilde O(h^2)C_0^{-2}G(s)G^\top(t).\tag{3.5.7}
\]
By the definition of $C_0$, we have
\[
\mathrm{Cov}\{U_n(s),U_n(t)\} \sim G(s)\Gamma(s,t)\Omega(s,t)G^\top(t)f(s)f(t) = \Omega_0(s,t),
\]
so $\mathrm{Cov}\{U_n(s),U_n(t)\} = \Omega_0(s,t) + \tilde o(1)$. The proof of Lemma 3 establishes the central limit theorem for the joint distribution of $\{U_n(t_1),\ldots,U_n(t_s)\}$ at any finite set of time points $\{t_1,\ldots,t_s\}$.
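The kernel-convolution quantities $K^{(2)}(x) = \int K(y)K(y-x)\,dy$ and $\nu_{20} = K^{(2)}(0) = \int K^2(y)\,dy$ appearing in these covariance calculations are easy to evaluate numerically; a small sketch (the Epanechnikov kernel is an assumption, any symmetric density kernel works):

```python
import numpy as np

def K(u):
    """Epanechnikov kernel (an assumption)."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def K2(x, grid=20001):
    """Convolution K^(2)(x) = int K(y) K(y - x) dy, via a Riemann sum."""
    y = np.linspace(-1.0, 1.0, grid)
    dy = y[1] - y[0]
    return float(np.sum(K(y) * K(y - x)) * dy)

nu20 = K2(0.0)   # nu_20 = K^(2)(0) = int K(y)^2 dy; equals 3/5 for Epanechnikov
```

Since $K$ is supported on $[-1,1]$, the convolution $K^{(2)}$ vanishes outside $[-2,2]$, which is why the corresponding integral in $\mathrm{Var}(T_{n2})$ runs over $(-2,2)$.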
Weak convergence of $U_n(t)$ now follows (Billingsley (1968), page 95) if, for all $a\in\mathbb R^q$,
\[
a^\top E\{[U_n(s)-U_n(t)][U_n(s)-U_n(t)]^\top\}a \le C(s-t)^2
\]
can be established. To this end, we note that
\[
a^\top E\{[U_n(s)-U_n(t)][U_n(s)-U_n(t)]^\top\}a = a^\top G(s)B(s)G^\top(s)a - a^\top\Omega_0(s,t)a - a^\top\Omega_0(t,s)a + a^\top G(t)B(t)G^\top(t)a
\]
\[
= 2a^\top a - a^\top\Big[\Omega_0(s,s) + (t-s)\frac{\partial\Omega_0(s,s)}{\partial t} + (t-s)^2\frac{\partial^2\Omega_0(s,s)}{\partial t^2}\Big]a - a^\top\Big[\Omega_0(t,t) + (s-t)\frac{\partial\Omega_0(t,t)}{\partial s} + (t-s)^2\frac{\partial^2\Omega_0(t,t)}{\partial s^2}\Big]a
\]
\[
\le |s-t|\,\Big|a^\top\Big\{\frac{\partial\Omega_0(s,s)}{\partial t} - \frac{\partial\Omega_0(t,t)}{\partial s}\Big\}a\Big| + C_1(s-t)^2 \le C(s-t)^2,
\]
where we used $\Omega_0(s,s) = \Omega_0(t,t) = I_q$, and the last two inequalities follow from the continuity condition (C3).

Proof of Proposition 4. By (3.5.7) in the proof of Lemma 9 and the definition of $C_0$, we have, up to a factor $1+o_p(1)$,
\[
\mathrm{Cov}\{U_n(s),U_n(t)\} = \nu_{20}^{-1}K^{(2)}\{(s-t)/h\}I_q + mh\,\Omega_0(s,t)\quad\text{when } mh\to 0,
\]
and $\mathrm{Cov}\{U_n(s),U_n(t)\} = \Omega_0(s,t)$ when $mh\to\infty$. Since $K^{(2)}\{(s-t)/h\} = \nu_{20}$ when $s=t$, we further have, up to a factor $1+o_p(1)$,
\[
\mathrm{Cov}\{U_n(s),U_n(t)\} =
\begin{cases}
\nu_{20}^{-1}K^{(2)}\{(s-t)/h\}I_q, & \text{if } m^2h\to 0,\\
I_q\,\mathbb 1(s=t) + mh\,\Omega_0(s,t)\,\mathbb 1(s\neq t), & \text{if } m^2h\to\infty \text{ and } mh\to 0,\\
\Omega_0(s,t), & \text{if } mh\to\infty,
\end{cases}
\]
which completes the proof of the proposition.

Chapter 4

Empirical Likelihood in Testing Coefficients in High Dimensional Heteroscedastic Linear Models

4.1 Introduction

As mentioned in Section 1.2.2, significant progress has been made towards understanding the estimation theory, but very little work has been done on statistical inference for high dimensional linear models, especially with heteroscedastic noise. Empirical likelihood has the ability of internal studentization, which avoids explicit variance estimation and can thus help resolve the heteroscedasticity issue.
In Section 4.2, we study the asymptotic normality of Wald type statistics for the existing methods under heteroscedastic noise. In Section 4.3, we propose a general empirical likelihood framework for analyzing the estimating equations proposed in different ways, although they all follow the low dimensional projection idea. In Section 4.4, we provide implications of the general results in three different cases: projection via Lasso estimation, projection via inverse regression, and projection via KFC set selection. Section 4.5 provides numerical results and Section 4.6 presents real data analysis. We defer all proofs to the Technical Details in Section 4.7.

The following notation is adopted throughout this chapter. For $v = (v_1,v_2,\ldots,v_d)^\top\in\mathbb R^d$, we define $\|v\|_q = (\sum_{i=1}^d|v_i|^q)^{1/q}$ for $0<q<\infty$. When $p>n$, the OLS estimator is no longer valid. Instead of projecting onto the space spanned by all of the remaining covariates, one selects the projection space based on the correlations between $X_j$ and the others.

4.2.1 Lasso Projection

In [ZZ14, vdGBR13, NL14], a sparse regularized linear regression procedure such as the Lasso is used to select the projection space.
Define $\epsilon_{ij} := X_{ij} - X_{i,-j}^\top\Sigma_{-j,-j}^{-1}\Sigma_{-j,j}$; that is, $X_{ij} = X_{i,-j}^\top w_j^0 + \epsilon_{ij}$ with $w_j^0 = \Sigma_{-j,-j}^{-1}\Sigma_{-j,j}$, which leads to the following generalized version of (4.2.3) with relaxed projection:
\[
\hat\beta_j^{(\mathrm{lin})} = \frac{Z_j^\top Y}{Z_j^\top X_j},\quad\text{where } Z_j = X_j - X_{-j}\hat w_j,\tag{4.2.4}
\]
with $\hat w_j$ an estimator of $w_j^0$. However, $\hat\beta_j^{(\mathrm{lin})}$ is biased. To solve this issue, [ZZ14] proposed the de-biased estimator
\[
\hat\beta_j^{(\mathrm{de})} = \frac{Z_j^\top Y - \sum_{k\neq j}Z_j^\top X_k\hat\beta_k}{Z_j^\top X_j},\tag{4.2.5}
\]
where $\hat\beta$ is some initial estimator of $\beta^0$. The de-biased estimator (4.2.5) can be regarded as the solution to the estimating equation built on the population quantity $\epsilon_{ij}\epsilon_i = \{X_{ij} - E(X_{ij}\mid X_{i,-j})\}(Y_i - X_i^\top\beta^0)$, that is,
\[
\sum_{i=1}^n m_{ni}^{(\mathrm{lasso})}(\beta_j) := \sum_{i=1}^n\big(X_{ij} - X_{i,-j}^\top\hat w_j\big)\big(Y_i - X_{ij}\beta_j - X_{i,-j}^\top\hat\beta_{-j}\big) = 0.\tag{4.2.6}
\]
By simple algebra, we have
\[
m_{ni}^{(\mathrm{lasso})}(\beta_j^0) = \underbrace{\epsilon_i\epsilon_{ij}}_{W_{ni}^{(\mathrm{lasso})}} + \underbrace{\epsilon_{ij}(\beta_{-j}^0 - \hat\beta_{-j})^\top X_{i,-j} + (w_j^0-\hat w_j)^\top X_{i,-j}\big(Y_i - X_{ij}\beta_j^0 - X_{i,-j}^\top\hat\beta_{-j}\big)}_{R_{ni}^{(\mathrm{lasso})}}.
\]
By simple calculation, we have $E(W_{ni}^{(\mathrm{lasso})}) = E\{\epsilon_i(X_{ij} - \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}X_{i,-j})\} = 0$ and, writing $Z_i := \epsilon_iX_i$ and $\Lambda := E(Z_iZ_i^\top) = E(\epsilon_i^2X_iX_i^\top)$,
\[
E[(W_{ni}^{(\mathrm{lasso})})^2] = E\{\epsilon_i^2(X_{ij} - \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}X_{i,-j})^2\}
= E\{\epsilon_i^2(X_{ij}^2 - 2X_{ij}\Sigma_{j,-j}\Sigma_{-j,-j}^{-1}X_{i,-j} + \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}X_{i,-j}X_{i,-j}^\top\Sigma_{-j,-j}^{-1}\Sigma_{-j,j})\}
\]
\[
= E\{Z_{ij}^2 - 2Z_{ij}\Sigma_{j,-j}\Sigma_{-j,-j}^{-1}Z_{i,-j} + \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}Z_{i,-j}Z_{i,-j}^\top\Sigma_{-j,-j}^{-1}\Sigma_{-j,j}\}
= \Lambda_{jj} - 2\Sigma_{j,-j}\Sigma_{-j,-j}^{-1}\Lambda_{-j,j} + \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}\Lambda_{-j,-j}\Sigma_{-j,-j}^{-1}\Sigma_{-j,j} := \sigma^2_{n,\mathrm{lasso}}.
\]
Note that if we assume independence between the error term and the covariates, we obtain the simpler form
\[
E[(W_{ni}^{(\mathrm{lasso})})^2] = \sigma^2(\sigma_{jj} - \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}\Sigma_{-j,j}).
\]
This shows the difference between our heteroscedastic case and the homoscedastic case. For the homoscedastic case, as discussed in [ZZ14] and [vdGBR13], the inference procedure based on asymptotic normality needs to estimate the asymptotic variance $\sigma^2/(\sigma_{jj} - \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}\Sigma_{-j,j})$. Under heteroscedastic noise, we can still show the following asymptotic normality, but with a much more complicated asymptotic variance.

Proposition 5. Under model (4.2.1) with heteroscedastic noise, if Assumption 1 in the appendix holds, we have
\[
\sqrt n(\hat\beta_j^{(\mathrm{de})} - \beta_j^0)\xrightarrow{d} N(0,\sigma^2_{\mathrm{lasso}}),\tag{4.2.7}
\]
where the asymptotic variance is defined as
\[
\sigma^2_{\mathrm{lasso}} = \lim_{n\to\infty}\frac{\Lambda_{jj} - 2\Sigma_{j,-j}\Sigma_{-j,-j}^{-1}\Lambda_{-j,j} + \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}\Lambda_{-j,-j}\Sigma_{-j,-j}^{-1}\Sigma_{-j,j}}{(\sigma_{jj} - \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}\Sigma_{-j,j})^2}.\tag{4.2.8}
\]
Such a complex asymptotic variance (4.2.8) makes the Wald type inference procedure hard to use in practice, since it is not easy to obtain a good estimate of the asymptotic variance. Thus, naively using the Wald type test procedure proposed by [ZZ14] in the heteroscedastic case will lead to invalid results, as demonstrated in the simulation study in Section 4.5.

4.2.2 KFC Projection

[LZL+13] proposed another way to select the projection space, based on the so-called KFC set $\mathcal S = \{l\neq j: |\sigma_{jl}| > c\}$ for some pre-specified threshold value $c>0$; this is essentially the set of all key confounders associated with $X_j$. The estimator can then be obtained by projection with respect to the covariates indexed by $\mathcal S$,
\[
\hat\beta_j^{(\mathrm{kfc})} = \frac{X_j^\top Q_{\mathcal S}Y}{X_j^\top Q_{\mathcal S}X_j} = \frac{\tilde X_j^\top\tilde Y}{\tilde X_j^\top\tilde X_j},\tag{4.2.9}
\]
with the transformed response and target predictor $\tilde Y = Q_{\mathcal S}Y$ and $\tilde X_j = Q_{\mathcal S}X_j$. Based on the de-biasing idea, we propose the following de-biased KFC estimator:
\[
\hat\beta_j^{(\text{kfc-de})} = \frac{\tilde X_j^\top\tilde Y - \sum_{k\in\bar{\mathcal S}}\tilde X_j^\top\tilde X_k\hat\beta_k}{\tilde X_j^\top\tilde X_j},\tag{4.2.10}
\]
where $\bar{\mathcal S} = (\mathcal S^+)^c$, i.e. the complement of $\mathcal S^+ := \{j\}\cup\mathcal S$, and $\hat\beta_{\bar{\mathcal S}}$ is an initial estimator.
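Both de-biased constructions follow the same recipe: project out the other covariates, then correct the plug-in bias using an initial estimator. As a concrete numerical illustration, here is a minimal sketch of the Lasso version (4.2.4)-(4.2.5), with a tiny coordinate-descent Lasso standing in for the initial estimators; the penalty levels are placeholders (in theory they are of order $\sqrt{\log p/n}$).

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Tiny coordinate-descent Lasso for (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for k in range(p):
            r_k = y - X @ b + X[:, k] * b[k]           # partial residual
            rho = X[:, k] @ r_k / n
            b[k] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[k]
    return b

def debiased_lasso_coord(X, Y, j, lam_beta=0.05, lam_w=0.05):
    """Sketch of the de-biased estimator (4.2.5) for the j-th coefficient."""
    n, p = X.shape
    idx = np.delete(np.arange(p), j)
    w_hat = lasso_cd(X[:, idx], X[:, j], lam_w)        # relaxed projection (4.2.4)
    Z_j = X[:, j] - X[:, idx] @ w_hat
    beta_hat = lasso_cd(X, Y, lam_beta)                # initial estimator of beta^0
    num = Z_j @ Y - Z_j @ (X[:, idx] @ beta_hat[idx])  # subtract bias from other coords
    return num / (Z_j @ X[:, j])
```

The sketch is low-dimensional for readability; the same code applies verbatim when $p > n$, which is where the Lasso steps become essential.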
In fact, the above de-biased KFC estimator is the solution to the estimating equation built on the population quantity $\epsilon_{ij,\mathcal S}\epsilon_i := \{X_{ij} - E(X_{ij}\mid X_{i\mathcal S})\}(Y_i - X_i^\top\beta^0)$, that is,
\[
\sum_{i=1}^n m_{ni}^{(\mathrm{kfc})}(\beta_j) := \sum_{i=1}^n\big(\tilde Y_i - \tilde X_{ij}\beta_j - \tilde X_{i\bar{\mathcal S}}^\top\hat\beta_{\bar{\mathcal S}}\big)\tilde X_{ij} = 0,\tag{4.2.11}
\]
where $m_{ni}^{(\mathrm{kfc})}(\beta_j^0)$ decomposes into the leading term $\epsilon_i\epsilon_{ij,\mathcal S}$ plus remainder terms arising from (i) replacing the population projection $\Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}X_{i\mathcal S}$ by its sample counterpart built from $X_{\mathcal S}(X_{\mathcal S}^\top X_{\mathcal S})^{-1}$, and (ii) the initial estimation error $\beta_{\bar{\mathcal S}}^0 - \hat\beta_{\bar{\mathcal S}}$. We denote the leading term by $W_{ni}^{(\mathrm{kfc})}$ and collect all the others into $R_{ni}^{(\mathrm{kfc})}$. For simplicity, we assume normality of the covariates, $X_i\sim N(0,\Sigma)$, throughout this KFC projection section. Now $W_{ni}^{(\mathrm{kfc})} = \epsilon_i(X_{ij} - \Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}X_{i\mathcal S})$, $i=1,\ldots,n$, are IID with $E\,W_{ni}^{(\mathrm{kfc})}=0$ and
\[
E[(W_{ni}^{(\mathrm{kfc})})^2] = E\{\epsilon_i^2(X_{ij} - \Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}X_{i\mathcal S})^2\}
= E\{Z_{ij}^2 - 2Z_{ij}\Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}Z_{i\mathcal S} + \Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}Z_{i\mathcal S}Z_{i\mathcal S}^\top\Sigma_{\mathcal S\mathcal S}^{-1}\Sigma_{\mathcal S j}\}
\]
\[
= \Lambda_{jj} - 2\Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}\Lambda_{\mathcal S j} + \Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}\Lambda_{\mathcal S\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}\Sigma_{\mathcal S j} := \sigma^2_{n,\mathrm{kfc}}.
\]
Note that if we assume independence between $\epsilon_i$ and $X_i$, we have $E[(W_{ni}^{(\mathrm{kfc})})^2] = \sigma^2(\sigma_{jj} - \Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}\Sigma_{\mathcal S j})$, and hence the simple asymptotic variance $\sigma^2/(\sigma_{jj} - \Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}\Sigma_{\mathcal S j})$ for $\hat\beta_j^{(\text{kfc-de})}$, as discussed in [LZL+13]. But under model (4.2.1) with heteroscedastic error, we have the following asymptotic normality with a more complicated variance.

Proposition 6. Under Assumption 3 in the appendix, we have
\[
\sqrt n(\hat\beta_j^{(\text{kfc-de})} - \beta_j^0)\xrightarrow{d}N(0,\sigma^2_{\mathrm{kfc}}),\tag{4.2.12}
\]
where the asymptotic variance is defined as
\[
\sigma^2_{\mathrm{kfc}} = \lim_{n\to\infty}\frac{\Lambda_{jj} - 2\Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}\Lambda_{\mathcal S j} + \Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}\Lambda_{\mathcal S\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}\Sigma_{\mathcal S j}}{(\sigma_{jj} - \Sigma_{j\mathcal S}\Sigma_{\mathcal S\mathcal S}^{-1}\Sigma_{\mathcal S j})^2}.\tag{4.2.13}
\]
Again, the expression (4.2.13) for the asymptotic variance is complicated, which makes the corresponding Wald type statistic hard to use in practice.
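The KFC projection estimator (4.2.9) itself is straightforward to compute; a minimal sketch follows, where the threshold `c` is a placeholder and the projection onto the orthogonal complement of $X_{\mathcal S}$ plays the role of $Q_{\mathcal S}$.

```python
import numpy as np

def kfc_estimate(X, Y, j, c=0.3):
    """Sketch of the KFC projection estimator (4.2.9).

    S collects covariates whose sample covariance with X_j exceeds c in
    absolute value (the 'key confounders'); X_j and Y are residualized on X_S.
    """
    n, p = X.shape
    Sigma_hat = X.T @ X / n
    S = [l for l in range(p) if l != j and abs(Sigma_hat[j, l]) > c]
    Xj = X[:, j]
    if S:
        XS = X[:, S]
        H = XS @ np.linalg.solve(XS.T @ XS, XS.T)   # hat matrix of X_S
        Xj_t, Y_t = Xj - H @ Xj, Y - H @ Y          # Q_S X_j and Q_S Y
    else:
        Xj_t, Y_t = Xj, Y
    return (Xj_t @ Y_t) / (Xj_t @ Xj_t), S
```

In practice this estimator would then be de-biased as in (4.2.10) before inference.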
4.2.3 Inverse Projection

So far we have constructed estimators for the target coefficient $\beta_j$ directly. However, to conduct the hypothesis testing problem (4.2.2), [LL14] proposed an equivalent test based on the projection of $X_{ij}$ onto $(Y_i, X_{i,-j}^\top)^\top$:
\[
X_{ij} = (Y_i, X_{i,-j}^\top)\theta_j^0 + \epsilon_{ij,y},\tag{4.2.14}
\]
where $\epsilon_{ij,y}$ satisfies $E\,\epsilon_{ij,y} = 0$ and $\mathrm{Cov}\{\epsilon_{ij,y}, (Y_i,X_{i,-j}^\top)\} = \mathbf 0$. Under the linear model (4.2.1) with heteroscedastic noise, as long as $\mathrm{Cov}(X_i,\epsilon) = \mathbf 0$, we can still show that the vector $\theta_j^0$ satisfies
\[
\theta_j^0 = \sigma^2_{j,y}\big(\beta_j^0/\sigma^2,\ -\{\beta_j^0\beta_{-j}^{0\top}/\sigma^2 + \Omega_{-j,j}^\top\}\big)^\top,
\]
where $\sigma^2_{j,y} = \mathrm{Var}(\epsilon_{ij,y}) = \{(\beta_j^0)^2/\sigma^2 + w_{jj}\}^{-1}$ with $\Omega = \Sigma^{-1} = ((w_{jk}))$. Because $\mathrm{Cov}(\epsilon_i,X_i) = \mathbf 0$, we have
\[
\mathrm{Cov}(\epsilon_i,\epsilon_{ij,y}) = -\theta_{j1}^0\,\mathrm{Cov}(\epsilon_i,Y_i) = -\sigma^2_{j,y}\beta_j^0 := b_j^0.\tag{4.2.15}
\]
Hence the test (4.2.2) is equivalent to $H_0: b_j^0 = 0$. Based on the idea proposed in [LL14], we can estimate $b_j^0$ by
\[
\hat b_j = \frac1n\sum_{i=1}^n\big(Y_i - X_i^\top\hat\beta\big)\big(X_{ij} - (Y_i,X_{i,-j}^\top)\hat\theta_j\big),\tag{4.2.16}
\]
where $\hat\beta$ and $\hat\theta_j$ are initial estimators of $\beta^0$ and $\theta_j^0$.

Observe that $\hat b_j$ is the solution to the estimating equation built on the mean-zero population quantity $\epsilon_{ij,y}\epsilon_i - b_j^0 = \{X_{ij} - E(X_{ij}\mid X_{i,-j},Y_i)\}(Y_i - X_i^\top\beta^0) + \sigma^2_{j,y}\beta_j^0$, that is,
\[
\sum_{i=1}^n m_{ni}^{(\mathrm{inv})}(b_j) := \sum_{i=1}^n\big(Y_i - X_i^\top\hat\beta\big)\big(X_{ij} - (Y_i,X_{i,-j}^\top)\hat\theta_j\big) - nb_j = 0,\tag{4.2.17}
\]
and by simple algebra,
\[
m_{ni}^{(\mathrm{inv})}(b_j^0) = \underbrace{\epsilon_i\epsilon_{ij,y} - b_j^0}_{W_{ni}^{(\mathrm{inv})}} + \underbrace{\epsilon_i(Y_i,X_{i,-j}^\top)(\theta_j^0 - \hat\theta_j) + X_i^\top(\beta^0-\hat\beta)\big(X_{ij} - (Y_i,X_{i,-j}^\top)\hat\theta_j\big)}_{R_{ni}^{(\mathrm{inv})}}.
\]
With simple calculations, we have $E(W_{ni}) = 0$ and
\[
\mathrm{Var}(W_{ni}) = \mathrm{Var}(\epsilon_i\epsilon_{ij,y}) = \mathrm{Var}\big(\epsilon_i\{X_{ij} - \theta_{j1}^0X_i^\top\beta^0 - \theta_{j1}^0\epsilon_i - X_{i,-j}^\top\theta_{j,-1}^0\}\big),
\]
which expands into a lengthy combination of second-, third- and fourth-order mixed moments of $(\epsilon_i, X_i)$; we denote the resulting quantity by $\sigma^2_{n,\mathrm{inv}}$.

Note that if we assume independence between $\epsilon_i$ and $X_i$, the variance simplifies. Since $X_{ij} = \theta_{j1}^0X_i^\top\beta^0 + \theta_{j1}^0\epsilon_i + X_{i,-j}^\top\theta_{j,-1}^0 + \epsilon_{ij,y}$ and $\mathrm{Cov}(\epsilon_i,X_i)=\mathbf 0$, we have $\mathrm{Cov}(\epsilon_i,\ \theta_{j1}^0\epsilon_i + \epsilon_{ij,y}) = 0$, i.e. $\theta_{j1}^0\mathrm{Var}(\epsilon_i) = -\mathrm{Cov}(\epsilon_i,\epsilon_{ij,y})$. Hence
\[
\mathrm{Var}(W_{ni}) = \mathrm{Var}\big(\epsilon_i(\epsilon_{ij,y} + \theta_{j1}^0\epsilon_i) - \theta_{j1}^0\epsilon_i^2\big) = \mathrm{Var}(\epsilon_i)\mathrm{Var}(\epsilon_{ij,y}) + (\theta_{j1}^0)^2\{\mathrm{Var}(\epsilon_i^2) - \mathrm{Var}^2(\epsilon_i)\}.
\]
If, furthermore, we assume normality of the error term, then $\mathrm{Var}(\epsilon_i^2) - \mathrm{Var}^2(\epsilon_i) = E(\epsilon_i^4) - 2\{E(\epsilon_i^2)\}^2 = 3\sigma^4 - 2\sigma^4 = \mathrm{Var}^2(\epsilon_i)$, which leads to the same result as Theorem 3.1 of [LL14]:
\[
\mathrm{Var}(W_{ni}) = \mathrm{Var}(\epsilon_i)\mathrm{Var}(\epsilon_{ij,y}) + (\theta_{j1}^0)^2\{\mathrm{Var}(\epsilon_i^2) - \mathrm{Var}^2(\epsilon_i)\} = \mathrm{Var}(\epsilon_i)\mathrm{Var}(\epsilon_{ij,y}) + \{\mathrm{Cov}(\epsilon_i,\epsilon_{ij,y})\}^2 = \sigma^2\sigma^2_{j,y} + (\theta_{j1}^0)^2\sigma^4 = \sigma^2\sigma^2_{j,y} + (\beta_j^0)^2\sigma^4_{j,y},
\]
which is much more likely to be estimable. In general, however, we can still obtain asymptotic normality, as stated in the following proposition.

Proposition 7. Under Assumption 2 in the appendix, we have
\[
\sqrt n(\hat b_j - b_j^0)\xrightarrow{d}N(0,\sigma^2_{\mathrm{inv}}),\tag{4.2.18}
\]
where $\sigma^2_{\mathrm{inv}} = \lim_{n\to\infty}\sigma^2_{n,\mathrm{inv}}$. But the asymptotic variance of $\hat b_j$ is far too complicated, which again makes the Wald type statistic hard to use in practice with heteroscedastic noise.

4.3 Empirical Likelihood Based Approach

To avoid the complexity of estimating the asymptotic variance in the heteroscedastic case, we propose an EL based approach. Note that the three procedures in Sections 4.2.1, 4.2.2 and 4.2.3 correspond to the three estimating equations (4.2.6), (4.2.11) and (4.2.17), all of the form $m_n(X_i,Y_i;\beta_j,\hat\beta_{-j},\hat\nu)$, where the nuisance parameter $\beta_{-j}$ and the other nuisance parameters, denoted $\nu$, are replaced by their estimators $\hat\beta_{-j}$ and $\hat\nu$. To keep notation simple, we write $m_{ni}(\beta_j) = m_n(X_i,Y_i;\beta_j,\hat\beta_{-j},\hat\nu)$ in general.

Note that the estimating equations (4.2.6), (4.2.11) and (4.2.17) share the same structure: the first term is the population-level term, which will be shown to be dominant and asymptotically normal, while the other terms are estimation errors that need to be controlled. We propose the following general framework by assuming that the estimating equations evaluated at the truth $\beta_j^0$ can be decomposed as
\[
m_{ni}(\beta_j^0) := m_n(X_i,Y_i;\beta_j^0,\hat\beta_{-j},\hat\nu) := W_{ni} + R_{ni},\tag{4.3.19}
\]
where the $\{W_{ni}\}_{i=1}^n$ are IID and the $\{R_{ni}\}_{i=1}^n$ satisfy the following conditions:

(C0) $P\big(\min_{1\le i\le n} m_{ni} < 0 < \max_{1\le i\le n} m_{ni}\big)\to 1$;
(C1) the $W_{ni}$'s are IID with mean 0 and variance $\sigma_n^2$, with $\sigma_n^2\to\sigma_w^2$;

(C2) $\frac{1}{\sqrt n}\sum_{i=1}^n R_{ni} = o_p(1)$ and $\max_{1\le i\le n}|R_{ni}| = o_p(n^{1/2})$.

According to [Owe01], with estimating equations we can construct an empirical likelihood to conduct the inference. Define the following empirical likelihood ratio function of the target parameter $\beta_j$:
\[
\mathrm{EL}_n(\beta_j) = \max\Big\{\prod_{i=1}^n np_i:\ p_i>0,\ \sum_{i=1}^n p_i = 1,\ \sum_{i=1}^n p_im_{ni}(\beta_j) = 0\Big\}.\tag{4.3.20}
\]
Under this framework, with the above general conditions, we have the following powerful Wilks theorem.

Theorem 5. If (C0)-(C2) hold, then $-2\log\mathrm{EL}_n(\beta_j^0)\xrightarrow{d}\chi_1^2$.

Based on Theorem 5, an asymptotic $\alpha$-level test rejects $H_0$ if $-2\log\mathrm{EL}_n(\beta_j^0) > \chi^2_1(\alpha)$, where $\chi^2_1(\alpha)$ is the upper $\alpha$-quantile of $\chi^2_1$. We can also construct a $(1-\alpha)100\%$ confidence interval for $\beta_j$ as $\mathrm{CI} = \{\beta_j: -2\log\mathrm{EL}_n(\beta_j) < \chi_1^2(\alpha)\}$. Since the asymptotic distribution is chi-square, we do not need to estimate any additional parameters, such as the asymptotic variance.

4.4 Theoretical Examples

This section works out the three examples discussed above in Sections 4.2.1, 4.2.2 and 4.2.3 to demonstrate interesting and powerful applications of Theorem 5; we need to check conditions (C0)-(C2) for these problems.

From Propositions 5, 6 and 7, we see that the Wald type inference procedure is hard to implement due to the complex asymptotic variances. Fortunately, we do not need to estimate those variances in order to conduct inference using the self-studentized EL procedure. In fact, we have already verified condition (C1) for the three procedures of Sections 4.2.1, 4.2.2 and 4.2.3, respectively. We can control the second term $R_{ni}$ under certain assumptions, which leads to the following theorems.

4.4.1 Lasso Projection

The first example uses Lasso estimation to obtain the low dimensional projection, as discussed in Section 4.2.1.

Theorem 6. Assume the typical conditions for the initial estimators in Assumption 1 in the appendix, and that $X_i$ and $\epsilon_i$ are both sub-Gaussian. As long as $s\log p/\sqrt n = o(1)$, conditions (C0) and (C2) can be verified. Assume $\sigma^2_{n,\mathrm{lasso}}\to\sigma^2_{\mathrm{lasso}}$ for some $\sigma^2_{\mathrm{lasso}}<\infty$; then we have
\[
-2\log\mathrm{EL}_n^{(\mathrm{lasso})}(\beta_j^0)\xrightarrow{d}\chi^2_1.
\]
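The profile empirical likelihood ratio (4.3.20) has a standard one-dimensional dual: maximizing over the $p_i$ gives $p_i = n^{-1}\{1+\lambda m_{ni}\}^{-1}$, with the Lagrange multiplier $\lambda$ solving $\sum_i m_{ni}/(1+\lambda m_{ni}) = 0$. A minimal computational sketch (bisection for the multiplier; condition (C0) is exactly what guarantees a root exists, with probability tending to one):

```python
import numpy as np

def neg2_log_EL(m):
    """-2 log EL for scalar estimating-equation values m_i, as in (4.3.20).

    Solves sum_i m_i / (1 + lam*m_i) = 0 by bisection (the function is
    strictly decreasing in lam), then returns 2 * sum_i log(1 + lam*m_i).
    """
    m = np.asarray(m, dtype=float)
    assert m.min() < 0 < m.max(), "condition (C0) fails in this sample"
    # lam must keep all 1 + lam*m_i > 0, i.e. lam in (lo, hi)
    lo = -1.0 / m.max() + 1e-10
    hi = -1.0 / m.min() - 1e-10
    f = lambda lam: np.sum(m / (1.0 + lam * m))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:      # root lies to the right of mid
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * m))
```

When the $m_{ni}$ have mean zero, the multiplier is near zero and the statistic is near zero; a systematic shift inflates it, which is what the $\chi^2_1$ calibration of Theorem 5 detects.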
Notice that in the homoscedastic noise case, [ZZ14] and [vdGBR13] used the Wald type test statistic for testing $H_0$ based on the same estimating equation as we use here, while [NL14], with the same estimating equation, instead proposed a Score test statistic for testing $H_0$. Although the two are asymptotically equivalent, the differences between them can be found in [NL14]. We use the same estimating equation to construct a likelihood-ratio type statistic for testing $H_0$. Since we use empirical likelihood, the procedure not only enjoys the Wilks phenomenon but also has other nice properties: the shape of the confidence interval is data driven, and the procedure is more robust to the distributional assumption on the error term, since it only requires moment assumptions. The key advantage of our method is that we allow heteroscedasticity of the error term, due to the self-studentization property of the empirical likelihood. Please refer to the empirical studies in the simulation section for a performance comparison of our method with the Wald type test and the Score test.

4.4.2 Inverse Projection

The second example uses inverse regression to obtain the low dimensional projection, as discussed in Section 4.2.3.

Theorem 7. Assume the conditions for the initial estimators in Assumption 2 in the appendix, and that $(X_i^\top,\epsilon_i)^\top$ is sub-Gaussian. As long as $s\log p/\sqrt n = o(1)$, conditions (C0) and (C2) can be verified. Assume $\sigma^2_{n,\mathrm{inv}}\to\sigma^2_{\mathrm{inv}}$ for some $\sigma^2_{\mathrm{inv}}<\infty$; then we have
\[
-2\log\mathrm{EL}_n^{(\mathrm{inv})}(b_j^0)\xrightarrow{d}\chi^2_1.
\]
Note that since this is an equivalent test, this inference procedure does not yield a confidence interval for $\beta_j^0$.

4.4.3 KFC Projection

The third example is the projection obtained by selecting the KFC set, as discussed in Section 4.2.2.

Theorem 8. Under Assumption 3 in the appendix, conditions (C0) and (C2) can be verified. Assume $\sigma^2_{n,\mathrm{kfc}}\to\sigma^2_{\mathrm{kfc}}$ for some $\sigma^2_{\mathrm{kfc}}<\infty$; then we have
\[
-2\log\mathrm{EL}_n^{(\mathrm{kfc})}(\beta_j^0)\xrightarrow{d}\chi^2_1.
\]
About the KFC set selection, we propose the following procedure. Based on the normality assumption of the predictors, we have the well known conditional distribution result for any given subset $S$:
\[
\rho_{jk}(S) := \mathrm{Corr}(X_{ij}, X_{ik} \mid X_{iS}) = \sigma_{jk} - \Sigma_{jS}\Sigma_{SS}^{-1}\Sigma_{Sk}.
\]
The sample partial correlation can be evaluated by $\hat\rho_{jk}(S) = \tilde X_j^\top \tilde X_k / n$. For testing whether a partial correlation is zero or not, we could apply Fisher's z-transformation
\[
\hat F_{jk} = \frac{1}{2}\log\Big(\frac{1+\hat\rho_{jk}(S)}{1-\hat\rho_{jk}(S)}\Big).
\]
Classical decision theory yields the following rule when using the significance level $\alpha$: reject the null hypothesis $H_0: \rho_{jk}(S)=0$ against the two-sided alternative $H_a: \rho_{jk}(S)\neq 0$ if $\sqrt{n-|S|-3}\,|\hat F_{jk}| > z_{1-\alpha/2}$. So we could then select the smallest size of $S$ by comparing $\max_{k\in S}\sqrt{n-|S|-3}\,|\hat F_{jk}|$ with $z_{1-\alpha/2}$.

The number of CNAIs exceeds the sample size $n$. To reduce the dimensions, we apply the square-root Lasso to select CNAIs that are predictive of gene expression levels and CNAIs that are explanatory of the variability. These selected variables are then used in the Goldfeld-Quandt test with the specified predictors for the response. Since the square-root Lasso is not that sensitive to the selection of the tuning parameter, and it is also desirable here to select more variables, we simply set the tuning parameter to be $\sqrt{\log p/n}$.

We found that 19 out of 654 genes demonstrate heteroscedasticity at the significance level $0.05/654$. The presence of heteroscedasticity for these genes suggests the need to use our method for identifying the CNAIs that are associated with gene expression. As further evidence for the existence of heteroscedasticity, we apply the "wandering schematic plot" [Tuk77]. This slices the predicted value into bins and uses m-letter summaries (generalizations of boxplots) to show the location, spread, and shape of the residuals for each bin. The m-letter statistics are further smoothed in order to emphasize overall patterns rather than chance deviations. Figure 4.4 presents the "wandering schematic plots" for genes PDK3 (Chr23), TPST2 (Chr22), ELF3 (Chr1) and SNRPE (Chr22), which are the top 4 genes for heteroscedasticity.
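The Fisher z-transformation testing rule above can be sketched numerically as follows; this is a minimal illustration, not the thesis code, computing the sample partial correlation from least-squares residuals and applying the $z_{1-\alpha/2}$ cutoff with the $\sqrt{n-|S|-3}$ scaling.

```python
import numpy as np
from scipy.stats import norm

def partial_corr(Xj, Xk, XS):
    """Sample partial correlation of Xj and Xk given the columns of XS,
    computed from least-squares residuals."""
    if XS.shape[1] > 0:
        coef = np.linalg.lstsq(XS, np.column_stack([Xj, Xk]), rcond=None)[0]
        fitted = XS @ coef
        Xj, Xk = Xj - fitted[:, 0], Xk - fitted[:, 1]
    return Xj @ Xk / np.sqrt((Xj @ Xj) * (Xk @ Xk))

def fisher_z_test(Xj, Xk, XS, alpha=0.05):
    """Reject rho_{jk}(S) = 0 iff sqrt(n - |S| - 3)|F| exceeds z_{1-alpha/2}."""
    n, s = len(Xj), XS.shape[1]
    r = partial_corr(Xj, Xk, XS)
    F = 0.5 * np.log((1 + r) / (1 - r))
    return np.sqrt(n - s - 3) * abs(F) > norm.ppf(1 - alpha / 2)

rng = np.random.default_rng(2)
n = 500
z = rng.normal(size=n)
Xj = z + 0.1 * rng.normal(size=n)  # Xj and Xk dependent only through z
Xk = z + 0.1 * rng.normal(size=n)
S = z.reshape(-1, 1)
marginal = fisher_z_test(Xj, Xk, np.empty((n, 0)))  # rejects: strong dependence
conditional = fisher_z_test(Xj, Xk, S)              # conditioning on z removes it
```

In this toy example the marginal test rejects, while after conditioning on the common factor $z$ the partial correlation is near zero, so the second test typically does not reject.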
4.6.4 Results for Top 4 Genes with Heteroscedasticity

We apply our Empirical Likelihood based approach to the four genes discussed in the previous section and demonstrated in Figure 4.5, and compare its performance with those of the Wald type test and the Score type test. For example, we use gene TPST2 on Chr22, as shown in Figure 4.5b, for demonstration. For each inference procedure testing all covariates, we can get a sequence of p-values $\{p_j\}_{j=1}^p$. With the ordered p-values $p_{(1)} \le p_{(2)} \le \cdots \le p_{(j)} \le \cdots \le p_{(p)}$, we adopt the Benjamini-Hochberg (BH) algorithm to make the decision. As a result, we found that only EL-INV and EL-LASSO can detect signals, while all of the other procedures found nothing significant. Moreover, EL-INV and EL-LASSO found two consistent signals at the 305-th CNAI and the 307-th CNAI, both of which are on Chromosome 17 with Cytoband "17q12-17q12".

Figure 4.4: Wandering Schematic Plot for Top 4 Genes with Heteroscedasticity (panels (a)-(d)).

Figure 4.5: Manhattan Plot for Top 4 Genes with Heteroscedasticity (panels (a)-(d)).

4.7 Technical Details

4.7.1 Assumptions for Theoretical Examples

Assumption 1. (1) Assume the initial estimator $\hat\beta$ satisfies $\|\hat\beta - \beta_0\|_1 = O_p(s\sqrt{\log p/n})$.
(2) Suppose the initial estimators $\hat w_j$ satisfy $\max_{1\le j\le p}\|\hat w_j - w_j^0\|_1 = O_p(a_n)$, where $a_n = o(1/\sqrt{\log p})$.
(3) The prediction errors satisfy $\|X(\hat\beta - \beta_0)\|_2^2/n = O_p(s\log p/n)$ and $\max_{1\le j\le p}\|X_{-j}(\hat w_j - w_j^0)\|_2^2/n = O_p(b_n)$, where $X_{-j}$ is the design matrix $X$ with the $j$-th column deleted and $b_n = o(1/\sqrt{n})$.
(4) $X_i$ and $\epsilon_i$ are all sub-Gaussian.
(5) $s\log p/\sqrt{n} = o(1)$.

Remark 6. 1. With (4), that $X_i$ and $\epsilon_i$ are all sub-Gaussian, we have $X_{ik}\epsilon_i$ sub-exponential with $\mathrm{E}(\epsilon_i X_{ik}) = 0$. By the Bernstein inequality [Ver10] and the union bound inequality, we have
\[
\mathrm{P}\Big( \Big\|\frac{1}{n}\sum_{i=1}^n X_i\epsilon_i\Big\|_\infty \ge t \Big) \le C_1 p \exp\big( -C\min(t^2/C_2, t/C_3)\, n \big).
\]
By taking $t = C_0\sqrt{\log p/n}$ for some positive constant $C_0$ such that $CC_0^2 > C_2$, we have
\[
\Big\|\frac{1}{n}\sum_{i=1}^n X_i\epsilon_i\Big\|_\infty = O_p\Big( \sqrt{\frac{\log p}{n}} \Big). \tag{4.7.21}
\]
2. For $\eta_{ij} = X_{ij} - \mathrm{E}(X_{ij}\mid X_{i,-j})$, we have $\eta_{ij}$ sub-Gaussian since $X_i$ is sub-Gaussian.
And for any $k \neq j$, we have $\mathrm{E}(X_{ik}\eta_{ij}) = \mathrm{E}\{X_{ik}[X_{ij} - \mathrm{E}(X_{ij}\mid X_{i,-j})]\} = \mathrm{E}\{X_{ik}X_{ij} - \mathrm{E}[X_{ik}X_{ij}\mid X_{i,-j}]\} = 0$. Similarly, we have for any $t > 0$ and $1 \le j \neq k \le p$,
\[
\mathrm{P}\Big( \Big|\frac{1}{n}\sum_{i=1}^n X_{ik}\eta_{ij}\Big| \ge t \Big) \le C_1 p \exp\big( -C\min(t^2/C_2, t/C_3)\, n \big),
\]
which leads to
\[
\Big\|\frac{1}{n}\sum_{i=1}^n \eta_{ij} X_{i,-j}\Big\|_\infty = O_p\Big( \sqrt{\frac{\log p}{n}} \Big). \tag{4.7.22}
\]
3. For the properties of the initial estimators in (1), (2) and (3) under the heteroscedastic noise case, we can use the square-root Lasso estimator as in [BCW14]. According to Theorem 7 in [BCW14], the square-root Lasso estimators under certain conditions have these properties satisfied.

Assumption 2. (1) Assume the same assumption as in the Lasso projection case for the initial estimator: $\|\hat\beta - \beta_0\|_1 = O_p(s\sqrt{\log p/n})$.
(2) Assume a similar assumption as in the Lasso projection case for the initial estimators $\hat\gamma_j$, i.e. $\max_{1\le j\le p}\|\hat\gamma_j - \gamma_j^0\|_1 = O_p(a_n)$, where $a_n = o(1/\sqrt{\log p})$.
(3) Assume a similar assumption as in the Lasso projection case for the prediction errors, i.e. $\|X(\hat\beta - \beta_0)\|_2^2/n = O_p(s\log p/n)$ and $\max_{1\le j\le p}\|(Y, X_{-j})(\hat\gamma_j - \gamma_j^0)\|_2^2/n = O_p(b_n)$ with $b_n = o(1/\sqrt{n})$.
(4) $(X_i^\top, \epsilon_i)^\top$ is sub-Gaussian.
(5) $s\log p/\sqrt{n} = o(1)$.

Remark 7. For condition (2) above, if we assume $a = \max_{1\le j\le p} s_j$ with $s_j = \|\gamma_j^0\|_0$, then the square-root Lasso estimators for $\gamma_j^0$ satisfy this condition with $a_n = a\sqrt{\log p/n}$. For condition (3) above, since we assume that $(X_i^\top, \epsilon_i)^\top$ is sub-Gaussian (which makes $\beta_0^\top X_i$ also sub-Gaussian), then due to $\mathrm{Cov}(\beta_0^\top X_i, \epsilon_i) = \mathrm{E}(\epsilon_i\beta_0^\top X_i) = 0$, we have $\epsilon_i\beta_0^\top X_i$ sub-exponential, and by the Bernstein inequality we have, for any $t > 0$,
\[
\mathrm{P}\Big( \Big|\frac{1}{n}\sum_{i=1}^n X_i^\top\beta_0\,\epsilon_i\Big| \ge t \Big) \le 2\exp\big\{ -C_1 n\min(t^2/C_2^2, t/C_2) \big\}.
\]
This leads to
\[
\Big|\frac{1}{n}\sum_{i=1}^n X_i^\top\beta_0\,\epsilon_i\Big| = O_p\big( \sqrt{\log p/n} \big), \tag{4.7.23}
\]
as long as $\log p/n \to 0$. And with the same argument, we have
\[
\Big|\frac{1}{n}\sum_{i=1}^n X_{ik}\eta_{ij,y}\Big| = O_p\big( \sqrt{\log p/n} \big), \tag{4.7.24}
\]
\[
\Big|\frac{1}{n}\sum_{i=1}^n (Y_i, X_{i,-j}^\top)\gamma_j^0\,\eta_{ij,y}\Big| = O_p\big( \sqrt{\log p/n} \big). \tag{4.7.25}
\]

Assumption 3. (1) For the eigenvalues of $\Sigma$, there exist some constants $\lambda_{\min}$ and $\lambda_{\max}$ such that $0 < \lambda_{\min} < \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) < \lambda_{\max} < \infty$.
(2) Assume $X_i \sim N(\mathbf{0}, \Sigma)$ and $\epsilon_i$ to be sub-Gaussian.
(3) Assume the same as in the Lasso projection case for the initial estimator: $\|\hat\beta - \beta_0\|_1 = O_p(s\sqrt{\log p/n})$.
(4) $m^3\log p/n = o(1)$, $s\sqrt{(\log p)^2 m^3/n} = o(1)$, $s\sqrt{(\log p)^3 m^2/n^2} = o(1)$.
(5) Assume $s\sqrt{\log p}\,\sup_{S:|S|\le m}\max_{k\in S}\big|\sigma_{jk} - \Sigma_{jS}\Sigma_{SS}^{-1}\Sigma_{Sk}\big| = o(1)$ to control the partial correlation between the target covariate $X_{ij}$ and $X_{iS}$.

4.7.2 Proof of Theorems

Proof of Theorem 5. As in [Owe01], by (C0), with probability tending to 1,
\[
-2\log\mathrm{EL}_n(\beta_{0j}) = 2\sum_{i=1}^n \log(1 + \lambda m_{ni}), \quad\text{where } \sum_{i=1}^n \frac{m_{ni}}{1+\lambda m_{ni}} = 0. \tag{4.7.26}
\]
The next step is to bound the magnitude of $\lambda$. Let $\lambda = |\lambda| u$, where $u = \mathrm{sign}(\lambda) \in \{-1, 1\}$. Now by $\sum_{i=1}^n m_{ni}/(1+\lambda m_{ni}) = 0$, we have
\[
0 = \sum_{i=1}^n \frac{u m_{ni}}{1+\lambda m_{ni}} = \sum_{i=1}^n u m_{ni}\Big( 1 - \frac{\lambda m_{ni}}{1+\lambda m_{ni}} \Big),
\]
which implies
\[
\sum_{i=1}^n u m_{ni} = \sum_{i=1}^n \frac{|\lambda| m_{ni}^2}{1+\lambda m_{ni}} \ge \frac{|\lambda|\sum_{i=1}^n m_{ni}^2}{1 + |\lambda|\max_{1\le i\le n}|m_{ni}|}.
\]
Thus we have
\[
u\,\frac{1}{n}\sum_{i=1}^n m_{ni} \ge \frac{|\lambda|}{1 + |\lambda|\max_{1\le i\le n}|m_{ni}|}\,\frac{1}{n}\sum_{i=1}^n m_{ni}^2,
\]
which implies
\[
|\lambda|\Big( \frac{1}{n}\sum_{i=1}^n m_{ni}^2 - \big( \max_{1\le i\le n}|m_{ni}| \big)\, u\,\frac{1}{n}\sum_{i=1}^n m_{ni} \Big) \le u\,\frac{1}{n}\sum_{i=1}^n m_{ni} \le \Big| \frac{1}{n}\sum_{i=1}^n m_{ni} \Big|. \tag{4.7.27}
\]
From (C1), by Lemma 3 in [Owe90], we have $\max_{1\le i\le n}|W_{ni}| = o_p(n^{1/2})$, and together with (C2), we have
\[
\max_{1\le i\le n}|m_{ni}| = o_p(n^{1/2}). \tag{4.7.28}
\]
And since for any $\varepsilon > 0$,
\[
\frac{1}{n\sigma_n^2}\sum_{i=1}^n \mathrm{E}\big[ W_{ni}^2 \mathbf{1}(|W_{ni}| > \varepsilon\sqrt{n}\sigma_n) \big] = \sigma_n^{-2}\mathrm{E}\big[ W_{n1}^2 \mathbf{1}(|W_{n1}| > \varepsilon\sqrt{n}\sigma_n) \big],
\]
where obviously $W_{n1}^2 \mathbf{1}(|W_{n1}| > \varepsilon\sqrt{n}\sigma_n) \xrightarrow{p} 0$ due to $\mathrm{P}(|W_{n1}| > \varepsilon\sqrt{n}\sigma_n) \to 0$, we have by the Dominated Convergence Theorem
\[
\frac{1}{n\sigma_n^2}\sum_{i=1}^n \mathrm{E}\big[ W_{ni}^2 \mathbf{1}(|W_{ni}| > \varepsilon\sqrt{n}\sigma_n) \big] \to 0.
\]
Thus by the Lindeberg-Feller Central Limit Theorem, we have
\[
\frac{1}{\sqrt{n}}\sum_{i=1}^n W_{ni} \xrightarrow{d} \mathrm{N}(0, \sigma_w^2). \tag{4.7.29}
\]
By (4.7.29), together with (C2), we have
\[
\frac{1}{\sqrt{n}}\sum_{i=1}^n m_{ni} = \frac{1}{\sqrt{n}}\sum_{i=1}^n W_{ni} + \frac{1}{\sqrt{n}}\sum_{i=1}^n R_{ni} = \frac{1}{\sqrt{n}}\sum_{i=1}^n W_{ni} + o_p(1) \xrightarrow{d} \mathrm{N}(0, \sigma_w^2). \tag{4.7.30}
\]
And by (C1) and (C2) we have
\[
\frac{1}{n}\sum_{i=1}^n m_{ni}^2 = \frac{1}{n}\sum_{i=1}^n W_{ni}^2 + \frac{1}{n}\sum_{i=1}^n R_{ni}^2 + \frac{2}{n}\sum_{i=1}^n W_{ni}R_{ni} = \frac{1}{n}\sum_{i=1}^n W_{ni}^2 + o_p(1) \to \sigma_w^2. \tag{4.7.31}
\]
Actually the above follows from checking the WLLN for triangular arrays. First of all,
\[
\sum_{i=1}^n \mathrm{P}(W_{ni}^2 > n) = n\,\mathrm{P}(W_{n1}^2 > n) \le \mathrm{E}\big[ W_{n1}^2 \mathbf{1}(W_{n1}^2 > n) \big] \to
\]
0; and
\[
n^{-2}\sum_{i=1}^n \mathrm{E}\big[ W_{ni}^4 \mathbf{1}(W_{ni}^2 \le n) \big] = n^{-1}\mathrm{E}\big[ W_{n1}^4 \mathbf{1}(W_{n1}^2 \le n) \big] = n^{-1}\int_0^n 2y\,\mathrm{P}(W_{n1}^2 > y)\,dy \to 0,
\]
since $y\,\mathrm{P}(W_{n1}^2 > y) \le \mathrm{E}\big( W_{n1}^2 \mathbf{1}(W_{n1}^2 > y) \big) \to 0$ as $y \to \infty$.

Thus by (4.7.27), (4.7.28), (4.7.30) and (4.7.31), we have
\[
|\lambda|\Big( \frac{1}{n}\sum_{i=1}^n m_{ni}^2 + o_p(1) \Big) = O_p(n^{-1/2}),
\]
and hence
\[
|\lambda| = O_p(n^{-1/2}). \tag{4.7.32}
\]
Then it follows from (4.7.28) that $\max_{1\le i\le n}\big| \lambda m_{ni}/(1+\lambda m_{ni}) \big| = o_p(1)$. Therefore, from (4.7.26), we have
\[
0 = \frac{1}{n}\sum_{i=1}^n \frac{\lambda m_{ni}}{1+\lambda m_{ni}} = \frac{1}{n}\sum_{i=1}^n \Big\{ \lambda m_{ni} - \frac{[\lambda m_{ni}]^2}{1+\lambda m_{ni}} \Big\} = \frac{1}{n}\sum_{i=1}^n \lambda m_{ni} - \frac{1+o_p(1)}{n}\sum_{i=1}^n [\lambda m_{ni}]^2,
\]
which leads to
\[
\frac{1}{n}\sum_{i=1}^n \lambda m_{ni} = \frac{1+o_p(1)}{n}\sum_{i=1}^n [\lambda m_{ni}]^2. \tag{4.7.33}
\]
Again by using (4.7.26) together with (4.7.30), we have
\[
0 = \frac{1}{n}\sum_{i=1}^n \frac{m_{ni}}{1+\lambda m_{ni}} = \frac{1}{n}\sum_{i=1}^n m_{ni} - \frac{\lambda}{n}\sum_{i=1}^n m_{ni}^2 + \frac{1}{n}\sum_{i=1}^n \frac{m_{ni}[\lambda m_{ni}]^2}{1+\lambda m_{ni}} = \frac{1}{n}\sum_{i=1}^n m_{ni} - \frac{\lambda}{n}\sum_{i=1}^n m_{ni}^2 + o_p(n^{-1/2}),
\]
where the last step uses $\max_{1\le i\le n}|\lambda m_{ni}/(1+\lambda m_{ni})| = o_p(1)$, $|\lambda| = O_p(n^{-1/2})$ and (4.7.31). This leads to
\[
\lambda = \Big( \frac{1}{n}\sum_{i=1}^n m_{ni}^2 \Big)^{-1}\frac{1}{n}\sum_{i=1}^n m_{ni} + o_p(n^{-1/2}). \tag{4.7.34}
\]
Finally, by Taylor expansion together with (4.7.30), (4.7.31), (4.7.33) and (4.7.34), we have
\[
-2\log\mathrm{EL}_n(\beta_{0j}) = 2\sum_{i=1}^n \log(1+\lambda m_{ni}) = 2\sum_{i=1}^n \lambda m_{ni} - [1+o_p(1)]\sum_{i=1}^n [\lambda m_{ni}]^2 = [1+o_p(1)]\sum_{i=1}^n [\lambda m_{ni}]^2
\]
\[
= [1+o_p(1)]\lambda^2\sum_{i=1}^n m_{ni}^2 = [1+o_p(1)]\Big( \frac{1}{\sqrt{n}}\sum_{i=1}^n m_{ni} \Big)\Big( \frac{1}{n}\sum_{i=1}^n m_{ni}^2 \Big)^{-1}\Big( \frac{1}{\sqrt{n}}\sum_{i=1}^n m_{ni} \Big) + o_p(1) \xrightarrow{d} \chi^2_1,
\]
as $n \to \infty$. This completes the proof of the theorem.

Proof of Theorem 6. We only need to control the term $R_{ni}$, whose pieces will be controlled one by one.
By Assumption 1, we have (4.7.21) and (4.7.22), which leads to
\[
\frac{1}{n}\sum_{i=1}^n R_{ni,1} = \frac{1}{n}\sum_{i=1}^n (Y_i - X_i^\top\beta_0)(w_j^0 - \hat w_j)^\top X_{i,-j} = (w_j^0 - \hat w_j)^\top \frac{1}{n}\sum_{i=1}^n X_{i,-j}\epsilon_i
\]
\[
\le \|w_j^0 - \hat w_j\|_1 \Big\| \frac{1}{n}\sum_{i=1}^n X_{i,-j}\epsilon_i \Big\|_\infty = O_p(a_n)\,O_p\Big( \sqrt{\frac{\log p}{n}} \Big) = O_p\Big( a_n\sqrt{\frac{\log p}{n}} \Big).
\]
In order to have $\frac{1}{n}\sum_{i=1}^n R_{ni,1} = o_p(n^{-1/2})$ we need $a_n = o(1/\sqrt{\log p})$, which is true according to (2) in Assumption 1.

For $R_{ni,2}$, we have
\[
\frac{1}{n}\sum_{i=1}^n R_{ni,2} = \frac{1}{n}\sum_{i=1}^n (X_{ij} - \hat w_j^\top X_{i,-j})X_{i,-j}^\top(\beta_{0,-j} - \hat\beta_{-j})
\]
\[
= \frac{1}{n}\sum_{i=1}^n \eta_{ij}X_{i,-j}^\top(\beta_{0,-j} - \hat\beta_{-j}) + \frac{1}{n}\sum_{i=1}^n (w_j^0 - \hat w_j)^\top X_{i,-j}X_{i,-j}^\top(\beta_{0,-j} - \hat\beta_{-j})
\]
\[
\le \Big\| \frac{1}{n}\sum_{i=1}^n \eta_{ij}X_{i,-j} \Big\|_\infty \|\beta_{0,-j} - \hat\beta_{-j}\|_1 + \sqrt{ \frac{1}{n}\sum_{i=1}^n \big[ (w_j^0 - \hat w_j)^\top X_{i,-j} \big]^2 }\,\sqrt{ \frac{1}{n}\sum_{i=1}^n \big[ X_{i,-j}^\top(\beta_{0,-j} - \hat\beta_{-j}) \big]^2 }
\]
\[
= O_p\big( \sqrt{\log p/n} \big)O_p\big( s\sqrt{\log p/n} \big) + O_p\big( \sqrt{b_n} \big)O_p\big( \sqrt{s\log p/n} \big) = O_p\Big( \frac{s\log p}{n} + \sqrt{\frac{b_n s\log p}{n}} \Big).
\]
In order to have $\frac{1}{n}\sum_{i=1}^n R_{ni,2} = o_p(n^{-1/2})$ we need $s\log p/\sqrt{n} = o(1)$ and $b_n = o(1/\sqrt{n})$. Thus with (3) and (5) in Assumption 1, we have verified the first half of the condition in (C2), $\frac{1}{n}\sum_{i=1}^n R_{ni} = o_p(n^{-1/2})$.

Now for the second half of the condition in (C2),
\[
\max_{1\le i\le n}|R_{ni,1}| = \max_{1\le i\le n}\big| (Y_i - X_i^\top\beta_0)(w_j^0 - \hat w_j)^\top X_{i,-j} \big| = \max_{1\le i\le n}\big| (w_j^0 - \hat w_j)^\top X_{i,-j}\epsilon_i \big|
\]
\[
\le \|w_j^0 - \hat w_j\|_1 \max_{1\le i\le n}\| X_{i,-j}\epsilon_i \|_\infty = \|w_j^0 - \hat w_j\|_1 \max_{1\le i\le n}\max_{1\le k\le p}| X_{ik}\epsilon_i |.
\]
Now since $X_i$ and $\epsilon_i$ are all sub-Gaussian, $X_{ik}\epsilon_i$ is sub-exponential, and by the union bound we have
\[
\mathrm{P}\Big( \max_{1\le i\le n}\max_{1\le k\le p}|X_{ik}\epsilon_i| \ge t \Big) \le \sum_{1\le i\le n}\sum_{1\le k\le p}\mathrm{P}\big( |X_{ik}\epsilon_i| \ge t \big) \le pnC_1 e^{-C_2 t}.
\]
By taking $t = \log(pn)/C$ with $C < C_2$, we obtain $\max_{1\le i\le n}\max_{1\le k\le p}|X_{ik}\epsilon_i| = O_p(\log(pn))$. Similarly, $n\,\mathrm{P}(|\epsilon_i^2| \ge t) \le nC_1 e^{-C_2 t}$, which implies that $\max_{1\le i\le n}|\epsilon_i^2| = O_p(\log n)$. Thus we have $\max_{1\le i\le n}|R_{ni,1}| = O_p(a_n\log(pn))$. In order to achieve $\max_{1\le i\le n}|R_{ni,1}| = o_p(n^{1/2})$, we need $a_n\log(pn)/\sqrt{n} = o(1)$, which is true since $a_n = o(1/\sqrt{\log p})$.
For R ni; 2 = ij;y X | i ( 0 ^ )= ij;y X ij ( 0 j ^ j )+ X | i; n j ( 0 n j ^ n j ) = ij;y [( Y i ; X | i; n j ) 0 j + ij;y ]( 0 j ^ j )+ X | i; n j ( 0 n j ^ n j ) = 2 ij;y ( 0 j ^ j )+ ij;y ( Y i ; X | i; n j ) 0 j ( 0 j ^ j )+ ij;y X | i; n j ( 0 n j ^ n j ) ; 135 similarlyas R ni; 1 ,bycondition(1)and(4.7.24),(4.7.25),wehave 1 n n X i =1 R ni; 2 = 1 n n X i =1 2 ij;y ( 0 j ^ j )+ O p ( s p log p=n p log p=n ) = 1 n n X i =1 2 ij;y ( 0 j ^ j )+ O p ( s log p=n ) = O p ( s p log p=n p 1 =n )+ O p ( s log p=n )= O p ( s p log p=n ) : Soinordertohave 1 n P n i =1 R ni; 2 = o p ( n 1 = 2 ),weneedtohave s p log p=n = o p ( n 1 = 2 ),i.e. s p log p=n = o p (1).Notethat max 1 i n j R ni; 2 j max 1 i n j 2 ij;y ( 0 j ^ j ) j +max 1 i n j ij;y ( Y i ; X | i; n j ) 0 j ( 0 j ^ j ) j +max 1 i n j ij;y X | i; n j ( 0 n j ^ n j ) j = O p ( s p log p=n log( pn ))= o p ( p n ) since s p log p=n log( pn ) = p n = o ( p log p=n )= o (1). Nowfor R ni; 3 = X | i ( 0 ^ ) ( Y i ; X | i; n j )( 0 j ^ j ) =( 0 ^ ) | X i ( Y i ; X | i; n j )( 0 j ^ j ), wehaveby(3)inAssumption2 1 n n X i =1 R ni; 3 = 1 n n X i =1 ( 0 ^ ) | X i ( Y i ; X | i; n j )( 0 j ^ j ) v u u t 1 n n X i =1 [( 0 ^ ) | X i ] 2 v u u t 1 n n X i =1 [( Y i ; X | i; n j )( 0 j ^ j )] 2 = O p ( p s log p=n ) O p ( p b n )= O p ( p b n s log p=n ) : Soinordertohave 1 n P n i =1 R ni; 3 = o p ( n 1 = 2 ),weneedtohave p b n s log p=n = o p ( n 1 = 2 ), 136 i.e. p b n s log p = o p (1).Andwealsohave max 1 i n j R ni; 3 jk 0 ^ k 1 k 0 j ^ j k 1 max 1 i n max 1 j p j X ij j max 1 i n j Y i j +max 1 i n max 1 j p j X ij j = O p ( s p log p=na n log( pn ))= o p ( n 1 = 2 ) : Nowweneedtocheckoutcondition(C0).Fromtheaboveanalysis,wehavemax 1 i n j R ni j = o p (max 1 i n j W ni j ).Thusweonlyneedtoprovethat P(min 1 i n W ni < 0 < max 1 i n W ni ) ! 1 ; whichjustfollowsfromtheGilvenko-Gantellitheoremoverhalf-spacesasinpage219in [Owe01]. ProofofTheorem8. 
Recallthat 1 p n n X i =1 R ni = R 1 n + R 2 n + R 3 n + R 4 n where R 1 n = 1 p n n X i =1 X | i S ( X | S X S ) 1 X | S X ij j S 1 SS X i S ; R 2 n = 1 p n n X i =1 i X | i S ( X | S X S ) 1 X | S j S 1 SS X i S X | j X S ( X | S X S ) 1 X i S ; R 3 n = 1 p n n X i =1 X ij j S 1 SS X i S X | i S X | i S ( X | S X S ) 1 X | S X S [ 0 S ^ S ] ; R 4 n = 1 p n n X i =1 j S 1 SS X i S X | j X S ( X | S X S ) 1 X i S X | i S X | i S ( X | S X S ) 1 X | S X S [ 0 S ^ S ] : 137 Nowfor R 1 n ,wehave R 1 n = 1 p n n X i =1 X ij j S 1 SS X i S X | i S ( X | S X S ) 1 X | S = n 1 n n X i =1 X ij j S 1 SS X i S X | i S on p n ( X | S X S ) 1 X | S o : Nowweneedtoboundthetwoterms 1 n P n i =1 X ij j S 1 SS X i S X i S and p n ( X | S X S ) 1 X | S . Infact,forevery k 2S ,wehavethatthetwoGaussianrandomvariables X ij j S 1 SS X i S and X ik havethefollowingproperties: E( X ik )=E( X ij j S 1 SS X i S )=0; E( X 2 ik )= ˙ kk ; E[( X ij j S 1 SS X i S ) 2 ]= ˙ jj j S 1 SS S j ; Cov( X ik ;X ij j S 1 SS X i S )=E[ X ik ( X ij j S 1 SS X i S )]= ˙ kj j S 1 SS S k = ˙ kj j S 1 SS SS e k = ˙ kj j S e k = ˙ kj ˙ jk =0 : Thuswehave 0 B @ X ik X ij j S 1 SS X i S 1 C A ˘ N 0 ; 0 B @ ˙ kk 0 0 ˙ jj j S 1 SS S j 1 C A : (4.7.35) Under(1)inAssumption3,byLemmaA.3from[BL08],wehavethereexistsconstants C;C 1 ;C 2 > 0suchthat P 1 n n X i =1 X ij j S 1 SS X i S X ij t C 1 exp( C 2 nt 2 ) ; for0 t C: 138 Byunioninequality,wethenhave P max S : jS m 1 n n X i =1 X ij j S 1 SS X i S X i S 1 t C 1 mp m exp( C 2 nt 2 ) ; for0 t C ,where jfSf 1 ; 2 ;; ;p g : jSj m gj p m . 
For mp m exp( C 2 nt 2 )=exp( C 2 nt 2 + m log p +log m ),take t = s m log p +log m + C log p ( C 2 n ) ˘ p m log p=n; andthenwehave max S : jS m 1 n n X i =1 X ij j S 1 SS X i S X i S 1 = O p ( p m log p=n ) : Nowinordertocontrol p n ( X | S X S ) 1 X | S ,noticethatbythefollowingmatrix equality[HS81] ( X | S X S =n ) 1 = SS +( X | S X S =n SS ) 1 = 1 SS 1 SS I +( X | S X S =n SS ) 1 SS 1 ( X | S X S =n SS ) 1 SS | {z } S ; (4.7.36) 139 wehave k p n ( X | S X S ) 1 X | S k 1 = k ( X | S X S =n ) 1 X | S = p n k 1 k 1 SS X | S = p n k 1 + k S X | S = p n k 1 p jSjk 1 SS X | S = p n k 2 + p jSjk S X | S = p n k 2 p jSjk 1 SS X | S = p n k 2 + p jSjk S k 2 k X | S = p n k 2 : OneofthemostimportantresultsinmatrixanalysisistheCauchy(eigenvalue)inter- lacingtheorem.Itassertsthattheeigenvaluesofanyprincipalsubmatrixofasymmetric matrixinterlacethoseofthesymmetricmatrix.Forexample,ifan n n symmetricmatrix S canbepartitionedas S = 0 B @ AB B | C 1 C A ; inwhich A isan r r principlesubmatrix,thenforeach i 2 1 ; 2 ; ;r ,wehave i ( S ) i ( A ) n r + i ( S ) : Inparticular,wehave min ( ) min ( SS )and max ( ) max ( SS ).Thusbythe ofmaximumeigenvalue,wehave k 1 SS X | S = p n k 2 1 min k X | S = p n k 2 : 140 So k p n ( X | S X S ) 1 X | S k 1 p jSj 1 min k X | S = p n k 2 + p jSjk S k 2 k X | S = p n k 2 = p jSj 1 min + k S k 2 k X | S = p n k 2 : Nowwehavetocontrol k X | S = p n k 2 and k S k 2 .Inordertocontroltheone,bythe sub-Gaussiantailedcondition(2)inAssumption3, P(max S : jS m k X | S = p n k 2 t p n ) P(max S : jS m max j 2S j 1 n n X i =1 X ij i j t= p m ) p m m exp( Cnt 2 =m ) ; followedfromtheBernsteininequalityfor t small.For p m m exp( Cnt 2 =m )=exp( m log p + log m Cnt 2 =m ),take t = p m q m log p +log m + C 1 log p Cn ˘ p m 2 log p=n .Thenwehavethe followingorder max S : jS m k X | S = p n k 2 = O p ( m p log p ) : Nowfor k S k 2 with S = 1 SS I +( X | S X S =n SS ) 1 SS 1 ( X | S X S =n SS ) 1 SS , wehavetocontrol X | S X S =n SS Notethat P sup S : jS m k X | 
S X S =n SS k 2 P sup S : jS m max j;k j X | j X k =n ˙ jk j m 2 p m P j X | j X k =n ˙ jk j C 1 m 2 p m exp( C 2 2 =m 2 ) wherethelastinequalityisalsofollowedfromLemmaA.3in[BL08]withconstants C 1 ;C 2 > 0.For m 2 p m exp( C 2 2 =m 2 )=exp(2log m + m log p C 2 2 =m 2 ),bytaking = m r m log p +2log m + C 1 log p C 2 n ˘ 141 p m 3 log p=n ,wehave sup S : jS m k X | S X S =n SS k 2 = O p ( q m 3 log p=n ) : Itfollowsthen k S k 2 = k 1 SS I +( X | S X S =n SS ) 1 SS 1 ( X | S X S =n SS ) 1 SS k 2 k 1 SS k 2 2 k I +( X | S X S =n SS ) 1 SS 1 k 2 k X | S X S =n SS k 2 = O p ( q m 3 log p=n ) ; since k 1 SS k 2 = 1 = 2 max ( 2 SS ) 1 min . Thuswehave k p n ( X | S X S ) 1 X | S k 1 p jSj 1 min + k S k 2 k X | S = p n k 2 = O p ( q m 3 log p=n ) ; i.e.sup S : jS m k p n ( X | S X S ) 1 X | S k 1 = O p ( p m 3 log p=n ). Insummary,wethenhave sup S : jS m n 1 n n X i =1 X ij j S 1 SS X i S X | i S on p n ( X | S X S ) 1 X | S o sup S : jS m 1 n n X i =1 X ij j S 1 SS X i S X | i S 1 sup S : jS m p n ( X | S X S ) 1 X | S 1 = O p ( p m log p=n ) O p ( q m 3 log p=n )= O p ( m 2 log p=n ) : Andhence R 1 n = o p (1). 
142 For R 2 n ,wehave R 2 n = 1 p n n X i =1 i X | i S ( X | S X S ) 1 X | S j S 1 SS X i S X | j X S ( X | S X S ) 1 X i S = j S 1 SS X | j X S ( X | S X S ) 1 1 p n n X i =1 X i S i X i S X | i S ( X | S X S ) 1 X | S = j S 1 SS X | j X S ( X | S X S ) 1 n 1 p n n X i =1 X i S i 1 n n X i =1 X i S X | i S p n ( X | S X S ) 1 X | S o = j S 1 SS X | j X S ( X | S X S ) 1 n 1 p n n X i =1 X i S i X | S = p n o =0 : Observethatwecanrewrite R 3 n as R 3 n = 1 p n n X i =1 X ij j S 1 SS X i S X | i S X | i S ( X | S X S ) 1 X | S X S [ 0 S ^ S ] = 1 p n X | j I X S ( X | S X S ) 1 X | S X S [ 0 S ^ S ] ; 143 where 1 p n X | j I X S ( X | S X S ) 1 X | S X S canbecontrolledasfollows k 1 p n X | j I X S ( X | S X S ) 1 X | S X S k 1 =max k 2S j 1 p n X | j I X S ( X | S X S ) 1 X | S X k j p n max k 2S n X | j X k =n ˙ jk + ˙ jk j S 1 SS S k + [ X | j X S =n j S ] 1 SS S k + j S 1 SS [ X | S X k =n S k ] + j S S S k + j S S [ X | S X k =n S k ] + [ X | j X S =n j S ] 1 SS [ X | S X k =n S k ] + [ X | j X S =n j S ] S S k + [ X | j X S =n j S ] S [ X | S X k =n S k ] o p n max k 2S n X | j X k =n ˙ jk + ˙ jk j S 1 SS S k + k X | j X S =n j S k 1 p jSj 1 min max + p jSj 1 min max k X | S X k =n S k k 1 + 2 max k S k 2 + p jSj max k S k 2 k X | S X k =n S k k 1 + k X | j X S =n j S k 2 1 min k X | S X k =n S k k 2 + p jSjk X | j X S =n j S k 1 k S k 2 max + k X | j X S =n j S k 2 k S k 2 k X | S X k =n S k k 2 o : Andwehavethat P sup S : jS m max k 2S j ˙ jk 1 n X | j X k j p m +1 P j ˙ jk 1 n X | j X k j C 1 p m +1 exp( C 2 2 ) wherethelastinequalityisalsofollowedfromLemmaA.3in[BL08]withconstants C 1 ;C 2 > 0.For p m +1 exp( C 2 2 )=exp(( m +1)log p C 2 2 ),bytaking = r ( m +1)log p + C 1 log p C 2 n ˘ p m log p=n ,wehave sup S : jS m max k 2S j ˙ jk 1 n X | j X k j = O p ( p m log p=n ) : 144 Similarly,wehave sup S : jS m k j S 1 n X | j X S k 1 = O p ( p m log p=n ) sup S : jS m max k 2S k X | S X k =n S k k 1 = O p ( p m log p=n ) Bysup S : jS m k S k 2 = O p ( p m 3 
log p=n ),wehave sup S : jS m k 1 p n X | j I X S ( X | S X S ) 1 X | S X S k 1 p n sup S : jS m max k 2S n X | j X k =n ˙ jk + ˙ jk j S 1 SS S k + k X | j X S =n j S k 1 p jSj 1 min max + p jSj 1 min max k X | S X k =n S k k 1 + 2 max k S k 2 + p jSj max k S k 2 k X | S X k =n S k k 1 + p jSjk X | j X S =n j S k 1 k S k 2 max + k X | j X S =n j S k 2 1 min k X | S X k =n S k k 2 + k X | j X S =n j S k 2 k S k 2 k X | S X k =n S k k 2 o = p n sup S : jS m max k 2S ˙ jk j S 1 SS S k + O p f p n q m 3 log p=n g ; since p m 3 log p=n = o (1).Undercondition(4)and(5)inAssumption3,wehavethat 145 R 3 n = o p (1). Notethat R 4 n = 1 p n n X i =1 j S 1 SS X i S X | j X S ( X | S X S ) 1 X i S X | i S X | i S ( X | S X S ) 1 X | S X S [ 0 S ^ S ] = 1 p n n X i =1 j S 1 SS X i S X | i S [ 0 S ^ S ] j S 1 SS ( X | S X S = p n )[ 0 S ^ S ] 1 p n n X i =1 X | j X S ( X | S X S ) 1 X i S X | i S [ 0 S ^ S ] + X | j X S ( X | S X S ) 1 X | S X S = p n [ 0 S ^ S ]=0 : Thuswehavevthat 1 n P n i =1 R ni = o p ( n 1 = 2 ). Andfor R ni; 1 ,wehave max 1 i n j R ni; 1 j = k ( X | S X S ) 1 X | S k 1 max 1 i n k X ij j S 1 SS X i S X | i S k 1 = k ( X | S X S ) 1 X | S k 1 max 1 i n max k 2S X ij j S 1 SS X i S X ik wheresup S : jS m k ( X | S X S ) 1 X | S k 1 = O p ( p m 3 log p=n ).Andsince X ij j S 1 SS X i S isGaussianundertheassumptionthat X isGaussian,wehave X ij j S 1 SS X i S X ik sub-exponential.So P sup S : jS m max 1 i n max k 2S X ij j S 1 SS X i S X ik >t p m nmC 1 exp( C 2 t ) whichleadstosup S : jS m max 1 i n max k 2S X ij j S 1 SS X i S X ik = O p ( m log p ). 146 Thuswehave sup S : jS m max 1 i n j R ni; 1 j = O p ( m log p q m 3 log p=n )= o p ( n 1 = 2 ) since( m log p=n ) p m 3 log p=n = o (1). 
Andfor R ni; 2 ,wehave max 1 i n j R ni; 2 jk j S 1 SS X | j X S ( X | S X S ) 1 k 1 max 1 i n k X i S i X i S X | i S ( X | S X S ) 1 X | S k 1 ; where k j S 1 SS X | j X S ( X | S X S ) 1 k 1 = k j S 1 SS n 1 X | j X S ( X | S X S =n ) 1 k 1 = k j S 1 SS n 1 X | j X S ( 1 SS S ) k 1 ( j S n 1 X | j X S ) 1 SS k 1 + k n 1 X | j X S S k 1 ( j S n 1 X | j X S ) 1 SS k 1 + k ( n 1 X | j X S j S ) S k 1 + k j S S k 1 : Andbysimplealgebra,wehave sup S : jS m k ( j S n 1 X | j X S ) 1 SS k 1 = O p ( q m 3 log p=n ) ; sup S : jS m k ( n 1 X | j X S j S ) S k 1 = O p ( m 2 log p=n ) ; sup S : jS m k j S S k 1 = O p ( m 2 p log p=n ) : 147 Nowfor max 1 i n k X i S i X i S X | i S ( X | S X S ) 1 X | S k 1 max 1 i n k X i S i k 1 +max 1 i n k X i S X | i S ( X | S X S ) 1 X | S k 1 max 1 i n k X i S i k 1 + k ( X | S X S ) 1 X | S k 1 max 1 i n k X i S X | i S k 1 ; since X ik i issub-exponential,wehave P sup S : jS m max 1 i n k X i S i k 1 >t =P sup S : jS m max 1 i n max k 2S j X ik i j >t p m mnC 1 e C 2 t whichleadstosup S : jS m max 1 i n k X i S i k 1 = O p ( m log p ).Andsince X ik X il issub- exponential,wehave P sup S : jS m max 1 i n k X i S X | i S k 1 >t P sup S : jS m max 1 i n p m max k;l 2S j X ik X il j >t p m m 2 nC 1 e C 2 t whichleadstosup S : jS m max 1 i n k X i S X | i S k 1 = O p ( p mm log p ). Sincesup S : jS m k ( X | S X S ) 1 X | S k 1 = O p ( p m 3 log p=n ),wehave sup S : jS m max 1 i n k X i S i X i S X | i S ( X | S X S ) 1 X | S k 1 = O p ( m log p + p mm log p q m 3 log p=n )= O p ( m log p (1+ m 2 p log p=n )) : 148 Insummary, sup S : jS m max 1 i n j R ni; 2 j = O p f m 3 log p p log p=n (1+ m 2 p log p=n ) g ; sincelog p=n ! 0.Inordertohavesup S : jS m max 1 i n j R ni; 2 j = o p ( n 1 = 2 ),weneed tohave m 3 (log p= p n ) p log p=n = o (1),whichistrueunder(4)inAssumption3since m 3 (log p= p n ) p log p=n = p m 3 log p=n p (log p ) 2 m 3 =n = o (1). 
Observethatmax 1 i n j R ni; 3 jk 0 S ^ S k 1 max 1 i n k X ij j S 1 SS X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 : Since k X | i S ( X | S X S ) 1 X | S X S k 1 max k 2S j X | i S 1 SS S k j + j X | i S 1 SS ( X | S X k =n S k ) j + j X | i S S S k j + j X | i S S ( X | S X k =n S k ) j ; (4.7.37) wehave max 1 i n k X ij j S 1 SS X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 max 1 i n max k 2S j X ij j S 1 SS X i S X ik j +max 1 i n max k 2S j X ij j S 1 SS X i S X | i S 1 SS S k j +max 1 i n max k 2S max l 2S j X ij j S 1 SS X i S X il jk 1 SS ( X | S X k =n S k ) k 1 +max 1 i n max k 2S max l 2S j X ij j S 1 SS X i S X il j p m k S k 2 k S k k 2 +max 1 i n max k 2S max l 2S j X ij j S 1 SS X i S X il jk S ( X | S X k =n S k ) k 1 : 149 Nowsince P sup S : jS m max 1 i n max k 2S j X ij j S 1 SS X i S X ik j >t p m +1 nC 1 e C 2 t ; wehave sup S : jS m max 1 i n max k 2S j X ij j S 1 SS X i S X ik j = O p ( m log p ) : Similarly,wehave sup S : jS m max 1 i n max k 2S j X ij j S 1 SS X i S X | i S 1 SS S k j = O p ( m log p ) ; sup S : jS m max 1 i n max l 2S j X ij j S 1 SS X i S X il j = O p ( m log p ) : Andthenbysimplealgebra,wehave sup S : jS m max k 2S k ( k S n 1 X | k X S ) 1 SS k 1 = O p ( q m 3 log p=n ) ; sup S : jS m max k 2S k ( n 1 X | k X S k S ) S k 1 = O p ( m 2 log p=n ) : Thuswehave sup S : jS m max 1 i n k X ij j S 1 SS X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 = O p f m log p (1+ q m 3 log p=n + m 2 p log p=n + m 2 log p=n ) g = O p f m log p (1+ q m 3 log p=n + m 2 log p=n ) g ; 150 whichleadsto sup S : jS m max 1 i n j R ni; 3 j = O p ( s p log p=nm log p (1+ q m 3 log p=n + m 2 log p=n )) : Inordertohavesup S : jS m max 1 i n j R ni; 3 j = o p ( n 1 = 2 ),weneed s p log p=n ( m log p= p n )(1+ q m 3 log p=n + m 2 log p=n )= o (1) ; whichistrueunder(4)inAssumption3. 
Andformax 1 i n j R ni; 4 j = k 0 S ^ S k 1 max 1 i n k j S 1 SS X | j X S ( X | S X S ) 1 X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 : Andfor max 1 i n k j S 1 SS X | j X S =n ( 1 SS S ) X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 =max 1 i n k ( j S X | j X S =n ) 1 SS X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 +max 1 i n k ( X | j X S =n j S ) S X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 +max 1 i n k j S S X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 ; 151 by(4.7.37),wehave max 1 i n k ( j S X | j X S =n ) 1 SS X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 max 1 i n max k 2S max l 2S k ( j S X | j X S =n ) 1 SS k 1 j X il X ik j +max 1 i n max k 2S max l 2S k ( j S X | j X S =n ) 1 SS k 1 X 2 il k 1 SS S k k 1 +max 1 i n max k 2S max l 2S k ( j S X | j X S =n ) 1 SS k 1 X 2 il k 1 SS ( X | S X k =n S k ) k 1 +max 1 i n max k 2S max l 2S k ( j S X | j X S =n ) 1 SS k 1 X 2 il k S S k k 1 +max 1 i n max k 2S max l 2S k ( j S X | j X S =n ) 1 SS k 1 X 2 il k S ( X | S X k =n S k ) k 1 = O p ( m 3 log p p log p=n ) ; undertheconditionthat m 3 log p=n ! 0.Similarlywehave sup S : jS m max 1 i n k ( X | j X S =n j S ) S X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 = O p f m 7 = 2 (log p ) 2 =n g sup S : jS m max 1 i n k j S S X i S X | i S X | i S ( X | S X S ) 1 X | S X S k 1 = O p f m 7 = 2 log p p log p=n g if m 3 log p=n ! 0.Insummary,if m 3 log p=n ! 0, sup S : jS m max 1 i n j R ni; 4 j = O p f sm 7 = 2 (log p ) 2 =n g : 152 Thusinordertohavesup S : jS m max 1 i n j R ni; 4 j = o p ( n 1 = 2 ),weneed sm 7 = 2 (log p ) 2 =n 3 = 2 = o (1) ; whichistrueunderthecondition(4)inAssumption3since sm 7 = 2 (log p ) 2 =n 3 = 2 = s r (log p ) 4 m 7 n 3 = s q (log p ) 2 m 3 n m 2 log p=n = o (1). Fromtheaboveanalysis,wehavemax 1 i n j R ni j = o p (max 1 i n j W ni j ).Thusweonly needtoprovethat P(min 1 i n W ni < 0 < max 1 i n W ni ) ! 
1 ; whichjustfollowsfromtheGilvenko-Gantellitheoremoverhalf-spacesasinpage219in [Owe01]. Fortheproofofthethreepropositions,theyarejustfollowedfromtheproofofthe correspondingtheorems.WeherejustprovetheProposition2. ProofofProposition6. Inordertogettheasymptoticnormalityof ^ (kfc-de) j ,wehavetodeal with 1 n P n i =1 ~ X 2 ij .Nowsince 1 n n X i =1 ~ X 2 ij = 1 n n X i =1 X ij X | j X S ( X | S X S ) 1 X i S 2 = 1 n X | j X j 1 n X | j X S ( X | S X S ) 1 X | S X j = 1 n X | j I X S ( X | S X S ) 1 X | S X j ; 153 wehave j 1 n n X i =1 ~ X 2 ij ( ˙ jj j S 1 SS S j ) j = j 1 n X | j X j 1 n X | j X S ( X | S X S =n ) 1 X | S X j =n ( ˙ jj j S 1 SS S j ) j n X | j X j =n ˙ jj +2 k X | j X S =n j S k 1 p jSj 1 min max + 2 max k S k 2 +2 p jSj max k S k 2 k X | S X j =n S j k 1 + 1 min k X | S X j =n S j k 2 2 + k S k 2 k X | S X j =n S j k 2 2 o : Andsince P sup S : jS m j ˙ jj 1 n X | j X j j p m P j ˙ jj 1 n X | j X j j C 1 p m exp( C 2 2 ) wehave sup S : jS m j ˙ jj 1 n X | j X j j = O p ( p m log p=n ) : Nowfortheterm k j S 1 n X | j X S k 1 ,wehaveprovedabovethat sup S : jS m k j S 1 n X | j X S 1 SS k 1 = O p ( p m log p=n ) : 154 Bysup S : jS m k S k 2 = O p ( p m 3 log p=n ),wehave sup S : jS m j 1 n n X i =1 ~ X 2 ij ( ˙ jj j S 1 SS S j ) j sup S : jS m n X | j X j =n ˙ jj +2 k X | j X S =n j S k 1 p jSj 1 min max + 2 max k S k 2 +2 p jSj max k S k 2 k X | S X j =n S j k 1 + 1 min k X | S X j =n S j k 2 2 + k S k 2 k X | S X j =n S j k 2 2 o =sup S : jS m n O p ( p m log p=n )+ O p ( p m log p=n ) p jSj 1 min max + 2 max O p ( q m 3 log p=n )+ p jSj max O p ( q m 3 log p=n ) O p ( p m log p=n ) + jSj O p ( p m log p=n ) 2 1 min + jSj O p ( p m log p=n ) 2 O p ( q m 3 log p=n ) o = O p f q m 3 log p=n g : Thuswehave sup S : jS m 1 n n X i =1 ~ X 2 ij ( ˙ jj j S 1 SS S j ) = O p f q m 3 log p=n g = o p (1) : (4.7.38) HencewehavethefollowingasymptoticnormalitybySlutsky'stheorem p n ( ^ (kfc-de) j 0 j )= 1 p n P n i =1 m (kfc) ni ( 0 j ) 1 n P n i =1 ~ X 2 ij d ! 
\mathrm{N}(0, \sigma^2_{\mathrm{kfc}}),
\]
where $\sigma^2_{\mathrm{kfc}} = \lim_{n\to\infty}\big( \Omega_{jj} - 2\Sigma_{jS}\Sigma_{SS}^{-1}\Omega_{jS} + \Sigma_{jS}\Sigma_{SS}^{-1}\Omega_{SS}\Sigma_{SS}^{-1}\Sigma_{Sj} \big) / \big( \sigma_{jj} - \Sigma_{jS}\Sigma_{SS}^{-1}\Sigma_{Sj} \big)$.

Chapter 5

Conclusions and Future Directions

In this chapter, we aim to reiterate the main contributions of this thesis, and to outline some of the things that could possibly follow as future developments on the results presented here. In Section 5.1, we start with a summary of the main ideas in the thesis, especially from Chapters 2, 3 and 4. Section 5.2 lays out some natural extensions of the ideas in this thesis.

5.1 Summary and Contributions

In Chapters 2 and 3, we proposed EL based procedures to make pointwise and simultaneous inferences on functional linear models, treating sparse and dense functional data in a unified framework. We showed that EL is a nice tool to accomplish this goal. We studied the asymptotic distributions of the EL based test statistics under the null and local alternative hypotheses for both sparse and dense functional data. We established the transition phase in $\theta$, the order of repeated measurements, for pointwise and simultaneous tests. The transition point $\theta_0$ was shown to be $1/8$ for the pointwise test and $1/16$ for the simultaneous test. If $\theta < \theta_0$, we showed that the proposed method is able to detect alternatives of size $b_n = n^{-4(1+\theta)/9}$ for the pointwise test and of order $b_n = n^{-8(1+\theta)/17}$ for the simultaneous test. For dense functional data such that $\theta > \theta_0$, we found that the proposed tests are able to detect alternatives of magnitude $n^{-1/2}$ both pointwisely and simultaneously, which is the same order of alternative a parametric test can detect. Moreover, we proposed a practical bandwidth selection method for functional data. Many bandwidth selection methods were proposed for independent or weakly dependent data, but bandwidth selection for functional data remained a challenging problem; see [ZPW13] for a recent study. Numerical experiments in Chapter 2 showed that the proposed bandwidth selection method works well in practice.
In Chapter 4, we proposed a unified framework for high dimensional inference based on the empirical likelihood, which is constructed with estimating equations. It can be used to test statistical hypotheses and construct confidence intervals, which have a more natural, data driven shape. To broaden the applicability of the method, the general theory was presented with the general conditions to be verified. In principle, all of the methods proposed in the existing literature can be reconsidered under our framework to make fair comparisons among them, although the technical details can be different case by case. Moreover, the key advantage of our proposed likelihood ratio based method, compared with others such as the Wald type method and the Score based method, is that it can allow heteroscedastic error noise. This is largely due to the nice self normalization property of the empirical likelihood formulation. In particular, we did not assume independence between the error term and the covariates, which is a common assumption in the existing literature, although we made the uncorrelatedness assumption.

5.2 Future Directions

This thesis focused on applying empirical likelihood to solve some fundamental problems in simple statistical models, especially linear models. Hence a natural direction for future research is to generalize our methodologies to more complicated statistical models, such as generalized linear models and survival models. For the functional linear models in Chapters 2 and 3, we gained robustness in terms of the correlation structure of the error process.
But if we have prior knowledge of the error process, how to incorporate the error correlation information into the estimation and inference procedures to increase the efficiency is a very interesting topic for future investigation. We only considered one general type of hypothesis in Chapters 2 and 3. There is another hypothesis problem, goodness-of-fit testing, which could be another promising research problem. For the high dimensional linear model in Chapter 4, we only focused on one estimating equation. But when we have more than one estimating equation, how to combine all of the estimating equations to make more efficient inference is worthy of further investigation. In general, the self-normalization property of EL is powerful and we should make use of it to solve problems in various kinds of statistical analysis.

BIBLIOGRAPHY

[AS58] J. Aitchison and S. D. Silvey, Maximum-likelihood estimation of parameters subject to restraints, The Annals of Mathematical Statistics 29 (1958), no. 3, 813–828.

[B+13] Peter Bühlmann et al., Statistical significance in high-dimensional linear models, Bernoulli 19 (2013), no. 4, 1212–1242.

[Bal60] A. V. Balakrishnan, Estimation and detection theory for multiple stochastic processes, Journal of Mathematical Analysis and Applications 1 (1960), no. 3, 386–410.

[BCW14] Alexandre Belloni, Victor Chernozhukov, and Lie Wang, Pivotal estimation via square-root lasso in nonparametric regression, The Annals of Statistics 42 (2014), no. 2, 757–788.

[Bel02] David A. Belsley, An investigation of an unbiased correction for heteroskedasticity and the effects of misspecifying the skedastic function, Journal of Economic Dynamics and Control 26 (2002), no. 9, 1379–1396.

[BHK+09] Michal Benko, Wolfgang Härdle, Alois Kneip, et al., Common functional principal components, The Annals of Statistics 37 (2009), no. 1, 1–34.

[BL08] Peter J. Bickel and Elizaveta Levina, Regularized estimation of large covariance matrices, The Annals of Statistics (2008), 199–227.

[BRT09] Peter J. Bickel, Ya'acov Ritov, and Alexandre B. Tsybakov, Simultaneous analysis of lasso and dantzig selector, The Annals of Statistics (2009), 1705–1732.
[BTW+07] Florentina Bunea, Alexandre Tsybakov, Marten Wegkamp, et al., Sparsity oracle inequalities for the lasso, Electronic Journal of Statistics 1 (2007), 169–194.

[BVDG11] Peter Bühlmann and Sara van de Geer, Statistics for high-dimensional data: methods, theory and applications, Springer Science & Business Media, 2011.

[CC06] Song Xi Chen and Hengjian Cui, On Bartlett correction of empirical likelihood in the presence of nuisance parameters, Biometrika 93 (2006), no. 1, 215–220.

[CG14] Song Xi Chen and Bin Guo, Tests for high dimensional generalized linear models, arXiv preprint arXiv:1402.4882 (2014).

[CHL03] Song Xi Chen, Wolfgang Härdle, and Ming Li, An empirical likelihood goodness-of-fit test for time series, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65 (2003), no. 3, 663–678.

[CLS86] P. E. Castro, W. H. Lawton, and E. A. Sylvestre, Principal modes of variation for processes with continuous sample curves, Technometrics 28 (1986), no. 4, 329–337.

[CVK09] Song Xi Chen and Ingrid Van Keilegom, A review on empirical likelihood methods for regression, Test 18 (2009), no. 3, 415–447.

[CZ10] Song Xi Chen and Ping-Shou Zhong, Anova for longitudinal data with missing values, The Annals of Statistics 38 (2010), no. 6, 3630–3659.

[DCL12] Z. John Daye, Jinbo Chen, and Hongzhe Li, High-dimensional heteroscedastic regression with an application to eQTL data analysis, Biometrics 68 (2012), no. 1, 316–326.

[DHR91] Thomas DiCiccio, Peter Hall, and Joseph Romano, Empirical likelihood is Bartlett-correctable, The Annals of Statistics 19 (1991), no. 2, 1053–1061.

[Edw84] Anthony William Fairbank Edwards, Likelihood, CUP Archive, 1984.

[EH08] R. L. Eubank and Tailen Hsing, Canonical correlation for stochastic processes, Stochastic Processes and their Applications 118 (2008), no. 9, 1634–1661.

[Far97] Julian J. Faraway, Regression analysis for a functional response, Technometrics 39 (1997), no. 3, 254–261.

[FFS10] Jianfeng Feng, Wenjiang Fu, and Fengzhu Sun, Frontiers in computational and systems biology, vol. 15, Springer Science & Business Media, 2010.
[FG96] Jianqing Fan and Irene Gijbels, Local polynomial modelling and its applications: Monographs on Statistics and Applied Probability 66, vol. 66, Chapman & Hall/CRC, 1996.

[FHL07] Jianqing Fan, Tao Huang, and Runze Li, Analysis of longitudinal data with semiparametric estimation of covariance function, Journal of the American Statistical Association 102 (2007), 632–641.

[FL01] Jianqing Fan and Runze Li, Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association 96 (2001), no. 456, 1348–1360.

[FL08] Jianqing Fan and Jinchi Lv, Sure independence screening for ultrahigh dimensional feature space, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70 (2008), no. 5, 849–911.

[FZ00] Jianqing Fan and Jin-Ting Zhang, Two-step estimation of functional linear models with applications to longitudinal data, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 62 (2000), no. 2, 303–322.

[GQ65] Stephen M. Goldfeld and Richard E. Quandt, Some tests for homoscedasticity, Journal of the American Statistical Association 60 (1965), no. 310, 539–547.

[GVHF11] Jelle J. Goeman, Hans C. van Houwelingen, and Livio Finos, Testing against a high-dimensional alternative in the generalized linear model: asymptotic type I error control, Biometrika 98 (2011), no. 2, 381–390.

[HM93] Wolfgang Härdle and Enno Mammen, Comparing nonparametric versus parametric regression fits, The Annals of Statistics (1993), 1926–1947.

[HMW06] Peter Hall, Hans-Georg Müller, and Jane-Ling Wang, Properties of principal component methods for functional and longitudinal data analysis, The Annals of Statistics (2006), 1493–1517.

[HS81] Harold V. Henderson and Shayle R. Searle, On deriving the inverse of a sum of matrices, SIAM Review 23 (1981), no. 1, 53–60.

[HTS+99] Trevor Hastie, Robert Tibshirani, Gavin Sherlock, Michael Eisen, Patrick Brown, and David Botstein, Imputing missing data for gene expression arrays, 1999.

[JM13] Adel Javanmard and Andrea Montanari, Confidence intervals and hypothesis testing for high-dimensional regression, arXiv preprint arXiv:1306.3171 (2013).
[KAC+98] Henry K., Erice A., Tierney C., Balfour H. H. Jr., Fischl M. A., Kmack A., Liou S. H., Kenton A., Hirsch M. S., Phair J., Martinez A., and Kahn J. O., A randomized, controlled, double-blind study comparing the survival benefit of four different reverse transcriptase inhibitor therapies (three-drug, two-drug, and alternating drug) for the treatment of advanced AIDS. AIDS Clinical Trial Group 193A Study Team, J Acquir Immune Defic Syndr Hum Retrovirol (1998), 339–349.

[KF00] Keith Knight and Wenjiang Fu, Asymptotics for lasso-type estimators, The Annals of Statistics (2000), 1356–1378.

[KZ13] Seonjin Kim and Zhibiao Zhao, Unified inference for sparse and dense longitudinal models, Biometrika (2013), ass050.

[LH08] Peter Langfelder and Steve Horvath, WGCNA: an R package for weighted correlation network analysis, BMC Bioinformatics 9 (2008), no. 1, 559.

[LH10] Yehua Li and Tailen Hsing, Uniform convergence rates for nonparametric regression and principal component analysis in functional/longitudinal data, The Annals of Statistics 38 (2010), no. 6, 3321–3351.

[LL14] Weidong Liu and Shan Luo, Hypothesis testing for high-dimensional regression models.

[LTTT14] Richard Lockhart, Jonathan Taylor, Ryan J. Tibshirani, and Robert Tibshirani, A significance test for the lasso, The Annals of Statistics 42 (2014), no. 2, 413.

[LZL+13] Wei Lan, Ping-Shou Zhong, Runze Li, Hansheng Wang, and Chih-Ling Tsai, Testing a single regression coefficient in high dimensional linear models.

[Mam93] Enno Mammen, Bootstrap and wild bootstrap for high dimensional linear models, The Annals of Statistics (1993), 255–285.

[MB06] Nicolai Meinshausen and Peter Bühlmann, High-dimensional graphs and variable selection with the lasso, The Annals of Statistics (2006), 1436–1462.

[MB10] ——, Stability selection, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72 (2010), no. 4, 417–473.

[MC06] Jeffrey S. Morris and Raymond J. Carroll, Wavelet-based functional mixed models, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68 (2006), no. 2, 179–199.

[MMB09] Nicolai Meinshausen, Lukas Meier, and Peter Bühlmann, P-values for high-dimensional regression, Journal of the American Statistical Association 104 (2009), no. 488.
[MY09] Nicolai Meinshausen and Bin Yu, Lasso-type recovery of sparse representations for high-dimensional data, The Annals of Statistics (2009), 246–270.

[NL14] Yang Ning and Han Liu, A general theory of hypothesis tests and confidence regions for sparse high dimensional models, arXiv preprint arXiv:1412.8765 (2014).

[NRWY12] Sahand N. Negahban, Pradeep Ravikumar, Martin J. Wainwright, and Bin Yu, A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers, Statist. Sci. 27 (2012), no. 4, 538–557.

[Owe88] Art B. Owen, Empirical likelihood ratio confidence intervals for a single functional, Biometrika 75 (1988), no. 2, 237–249.

[Owe90] ——, Empirical likelihood ratio confidence regions, The Annals of Statistics 18 (1990), no. 1, 90–120.

[Owe01] ——, Empirical likelihood, CRC Press, 2001.

[PZB+10] Jie Peng, Ji Zhu, Anna Bergamaschi, Wonshik Han, Dong-Young Noh, Jonathan R. Pollack, and Pei Wang, Regularized multivariate regression for identifying master predictors with application to integrative genomics study of breast cancer, The Annals of Applied Statistics 4 (2010), no. 1, 53.

[QL95] Jin Qin and Jerry Lawless, Estimating equations, empirical likelihood and constraints on parameters, Canadian Journal of Statistics 23 (1995), no. 2, 145–159.

[RS91] John A. Rice and Bernard W. Silverman, Estimating the mean and covariance structure nonparametrically when the data are curves, Journal of the Royal Statistical Society. Series B (Methodological) (1991), 233–243.

[Ser80] Robert J. Serfling, Approximation theorems of mathematical statistics, John Wiley & Sons, 1980.

[SF04] Qing Shen and Julian Faraway, An F test for linear models with functional responses, Statistica Sinica 14 (2004), no. 4, 1239–1258.

[Sil78] Bernard W. Silverman, Weak and strong uniform consistency of the kernel estimate of a density and its derivatives, The Annals of Statistics 6 (1978), no. 1, 177–184.

[SR05] Bernard Walter Silverman and James O. Ramsay, Functional data analysis, Springer, 2005.

[SS13] Rajen D. Shah and Richard J. Samworth, Variable selection with error control: another look at stability selection, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 75 (2013), no. 1, 55–80.
[SZ12] Tingni Sun and Cun-Hui Zhang, Scaled sparse linear regression, Biometrika (2012), ass043.

[Tib96] Robert Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society. Series B (Methodological) (1996), 267–288.

[TLTT14] Jonathan Taylor, Richard Lockhart, Ryan J. Tibshirani, and Robert Tibshirani, Post-selection adaptive inference for least angle regression and the lasso, arXiv preprint (2014).

[Tuk77] John W. Tukey, Exploratory data analysis, Reading, MA 231 (1977), 32.

[VdG08] Sara A. van de Geer, High-dimensional generalized linear models and the lasso, The Annals of Statistics (2008), 614–645.

[vdGBR13] Sara van de Geer, Peter Bühlmann, and Ya'acov Ritov, On asymptotically optimal confidence regions and tests for high-dimensional models, arXiv preprint arXiv:1303.0518 (2013).

[Ver10] Roman Vershynin, Introduction to the non-asymptotic analysis of random matrices, arXiv preprint arXiv:1011.3027 (2010).

[Wai09] Martin J. Wainwright, Sharp thresholds for high-dimensional and noisy sparsity recovery using l1-constrained quadratic programming (lasso), IEEE Transactions on Information Theory 55 (2009), no. 5, 2183–2202.

[WD12] Jens Wagener and Holger Dette, Bridge estimators and the adaptive lasso under heteroscedasticity, Mathematical Methods of Statistics 21 (2012), no. 2, 109–126.

[WR09] Larry Wasserman and Kathryn Roeder, High dimensional variable selection, The Annals of Statistics 37 (2009), no. 5A, 2178.

[WWL12] Lan Wang, Yichao Wu, and Runze Li, Quantile regression for analyzing heterogeneity in ultra-high dimension, Journal of the American Statistical Association 107 (2012), no. 497, 214–222.

[XZ07] Liugen Xue and Lixing Zhu, Empirical likelihood for a varying coefficient model with longitudinal data, Journal of the American Statistical Association 102 (2007), no. 478, 642–654.

[YMW05a] Fang Yao, Hans-Georg Müller, and Jane-Ling Wang, Functional data analysis for sparse longitudinal data, Journal of the American Statistical Association 100 (2005), 577–590.

[YMW05b] ——, Functional linear regression analysis for longitudinal data, The Annals of Statistics 33 (2005), no. 6, 2873–2903.
[ZC07] Jin-Ting Zhang and Jianwei Chen, Statistical inferences for functional data, The Annals of Statistics 35 (2007), no. 3, 1052–1079.

[Zha09] Tong Zhang, Some sharp performance bounds for least squares regression with l1 regularization, The Annals of Statistics 37 (2009), no. 5A, 2109–2144.

[Zha10] Cun-Hui Zhang, Nearly unbiased variable selection under minimax concave penalty, The Annals of Statistics (2010), 894–942.

[Zha11] Jin-Ting Zhang, Statistical inferences for linear models with functional responses, Statistica Sinica 21 (2011), no. 3, 1431.

[ZHM+10] Lan Zhou, Jianhua Z. Huang, Josue G. Martinez, Arnab Maity, Veerabhadran Baladandayuthapani, and Raymond J. Carroll, Reduced rank mixed effects models for spatially correlated hierarchical functional data, Journal of the American Statistical Association 105 (2010), no. 489, 390–400.

[ZL00] Wenyang Zhang and Sik-Yum Lee, Variable bandwidth selection in varying-coefficient models, Journal of Multivariate Analysis 74 (2000), no. 1, 116–134.

[ZPW13] Xiaoke Zhang, Byeong U. Park, and Jane-Ling Wang, Time-varying additive models for longitudinal data, Journal of the American Statistical Association 108 (2013), no. 503, 983–998.

[ZY06] Peng Zhao and Bin Yu, On model selection consistency of lasso, The Journal of Machine Learning Research 7 (2006), 2541–2563.

[ZZ14] Cun-Hui Zhang and Stephanie S. Zhang, Confidence intervals for low dimensional parameters in high dimensional linear models, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76 (2014), no. 1, 217–242.