EMPIRICALLIKELIHOODBASEDFUNCTIONALDATAANALYSISANDHIGH
DIMENSIONALINFERENCEWITHAPPLICATIONSTOBIOLOGY
By
HonglangWang
ADISSERTATION
Submittedto
MichiganStateUniversity
inpartialentoftherequirements
forthedegreeof
Statistics|DoctorofPhilosophy
2015
ABSTRACT
EMPIRICALLIKELIHOODBASEDFUNCTIONALDATAANALYSISAND
HIGHDIMENSIONALINFERENCEWITHAPPLICATIONSTOBIOLOGY
By
HonglangWang
Highdimensionaldataanalysishasbeenarapidlydevelopingtopicinstatisticswith
variousapplicationsinareassuchasgenetics/genomics,neuroscience,socialsci-
enceandsoon.Withtherapiddevelopmentoftechnology,statisticsasadatascience
requiresmoreandmoreinnovationsinmethodologiesaswellasbreakthroughsinmathe-
maticalframeworks.Inhighdimensionalworld,classicalstatisticalmethodsdesignedfor
dimensionalmodelsareoftendoomedtofail.Thisthesisfocusesontwotypesofhigh
dimensionaldataanalysis.Oneisthestudyoftypical\large
p
small
n
"probleminlinear
regressionwithhighdimensionalcovariates
X
2
R
p
butsmallsamplesize
n
,andtheother
isthefunctionaldataanalysis.Functionaldatabelongtotheclassofhighdimensionaldata
inthesensethateverydataobjectconsistsofalargenumberofmeasurements,whichmay
belargerthanthesamplesize.Butthekeycharacteristicisthatfunctionalobjectscanbe
modeledassmoothcurvesorsurfaces.WemakeuseofEmpiricalLikelihood(EL)introduced
by[Owe01],tosolvesomefundamentalproblemsinthesetwoparticularhighdimensional
problems.
Thepartofthethesisconsiderstheproblemoftestingfunctionalconstraintsina
classoffunctionallinearregressionmodelswhereboththepredictorsandtheresponseare
functionaldatameasuredatdiscretetimepoints.Weproposetestproceduresbasedon
theempiricallikelihoodwithbias-correctedestimatingequationstoconductbothpointwise
andsimultaneousinference.Theasymptoticdistributionsoftheteststatisticsarederived
underthenullandlocalalternativehypotheses,wheresparseanddensefunctionaldataare
consideredinaframework.Weaphasetransitionintheasymptoticdistributions
andtheordersofdetectablealternativesfromsparsetodensefunctionaldata.Sp,
theproposedtestscandetectalternativesofroot-
n
orderwhenthenumberofrepeated
measurementspercurveisofanorderlargerthan
n

0
with
n
beingthenumberofcurves.
Thetransitionpoints

0
aretforpointwiseandsimultaneoustestsandbothare
smallerthanthetransitionpointintheestimationproblem.
Inthesecondpartofthethesis,weconsiderhypothesistestingproblemsforalow-
dimensionalcotvectorinahigh-dimensionallinearmodelunderheteroscedasticer-
ror.Heteroscedasticityisacommonlyobservedphenomenoninmanyapplicationsincluding
andgenomicstudies.Severalstatisticalinferenceprocedureshavebeenproposedfor
low-dimensionalcotsinahigh-dimensionallinearmodelwithhomoscedasticnoise.
However,thoseproceduresdesignedforhomoscedasticerrorarenotapplicableformod-
elswithheteroscedasticerrorandtheheterscedasticityissuehasnotbeeninvestigatedand
studied.Weproposeainferenceprocedurebasedonempiricallikelihoodtoovercomethe
heteroscedasticityissue.Theproposedmethodisabletomakevalidinferenceunderhet-
eroscedasticitymodelevenwhentheconditionalvarianceofrandomerrorisafunctionof
thehigh-dimensionalpredictor.Weapplyourinferenceproceduretothreerecentlyproposed
estimatingequationsandestablishtheasymptoticdistributionsoftheproposedmethods.
Forbothofthetwoparts,simulationstudiesandrealdataanalysesareconductedto
demonstratetheproposedmethods.
Iwouldliketodedicatethisthesistomybelovedparents,DaofuWangandShuizhen
Wang,andmylittlebrother,HailangWang.
iv
ACKNOWLEDGMENTS
Firstandforemost,IwouldliketoexpressmysinceregratitudetomytwoadvisorsDr.
YuehuaCuiandDr.Ping-ShouZhongfortheircontinuoussupportguidance,understanding,
patienceandencouragementduringmyPhDstudyandresearch.Dr.CuiandDr.Zhong
havepushedmeintocontactwithamultitudeofdisciplines,andtheirguidanceabouthow
toapproachresearch,write,andgivetalkshasbeeninvaluable.Theyhavealsoprovidedme
excellentenvironmentsfordoingresearchinthedevelopmentofmethodologyandtheoryas
wellasintherealdataanalysis.Withouttheirguidanceandpersistenthelp,thisdissertation
wouldnothavebeenpossible.Forallofthis,Iamverythankfultobothofmyadvisors.
Iwouldalsoliketothanktheotherwonderfulmembersofmyresearchcommittee,Dr.
C.RobinBuellandDr.Hyokyoung(Grace)Hong.IngettingmydualPhDdegreein
QuantitativeBiology,Dr.Buellhasprovidedmuchguidanceandassistance,whichalso
makememorettobecomeaBio-statistician.Ihavebeenenjoyinginvolvement
inthepotatoprojectandlearningalotfrommonthlygroupcalls,annualmeetingsand
theBioinformaticsworkshops.Herguidanceprovidedmewiththeuniqueopportunityto
gainawiderbreadthofexperienceinbiologyscience,whichisespeciallyimportantfora
Bio-statistician.
Besidesthat,IthankalltheotherprofessorsandinthiswonderfulDepartment
ofStatisticsandProbabilitywhohaveneverhedaboutansweringaquestionfroma
nagginggraduatestudent|somethingthatisembeddedinthecultureofWellsHall.My
specialthanksgotoDr.HiraL.Koul,Dr.YiminXiaoandDr.TapabrataMaitifortheir
interestingcourses,valuableadviseandencouragement.
Comingtofriends,IamgratefultoYuzhenZhou,TaoHe,JikaiLei,ChenYue,XinQi,
v
LiqianCai,XiaoqingZhu,BinGao,XuLiuandallotherfellowstudentsfromtheDepartment
ofStatisticsandProbabilityforthefriendshipandthefuntimewespenttogetherinthe
pasteyears.
Finallyandmostimportantly,Iwouldexpressmyprofoundgratitudetomybeloved
parents,DaofuWangandShuizhenWangandmylitterbrother,HailangWangfortheir
love,endlesssupportandfaithinmeinallofmyendeavors.
vi
TABLEOFCONTENTS
LISTOFTABLES
....................................
ix
LISTOFFIGURES
...................................
x
KEYTOABBREVIATIONS
.............................
xi
Chapter1Introduction
...............................
1
1.1EmpiricalLikelihood...............................1
1.2BigDataAnalysis.................................4
1.2.1FunctionalDataAnalysis.........................4
1.2.2HighDimensionalDataAnalysis....................7
Chapter2pointwiseempiricallikelihoodratiotestsforfunctional
linearmodelsandthephasetransitionfromsparsetodense
functionaldata
..............................
10
2.1Introduction....................................10
2.2Abias-correctedestimatorandsomepreliminaryresults...........14
2.2.1Abias-correctedestimator........................14
2.2.2Regularityconditionsandpreliminaryresults.............15
2.3Apointwisetest.............................19
2.4Implementationissues..............................22
2.4.1Bandwidthselection...........................22
2.4.2CovarianceEstimation..........................24
2.5Simulationstudies................................25
2.6TechnicalDetails.................................28
2.6.1ProofofTheorem1............................28
2.6.2ProofsofPropositions..........................29
2.6.2.1SomeUsefulLemmas......................30
2.6.2.2ProofofPropositions......................48
2.6.2.3ExistenceofRMELEandtheasymptoticexpressionfor
~

.49
Chapter3simultaneousempiricallikelihoodratiotestsforfunc-
tionallinearmodelsandthephasetransitionfromsparseto
densefunctionaldata
..........................
66
3.1Introduction....................................66
3.2Asimultaneoustest............................68
3.2.1Nulldistributionandlocalpower....................69
3.2.2Wildbootstrapprocedure........................73
3.3Simulationstudies................................74
vii
3.4Realdataanalysis.................................77
3.4.1CD4dataanalysis............................77
3.4.2Ergonomicsdataanalysis.........................78
3.5TechnicalDetails.................................81
3.5.1ProofsofMainTheorems........................81
3.5.1.1ProofofTheorem2.......................81
3.5.1.2ProofofCorollary1......................84
3.5.1.3ProofofTheorem3.......................85
3.5.1.4ProofofTheorem4.......................89
3.5.2ProofsofPropositionandLemma....................91
Chapter4EmpiricalLikelihoodinTestingCotsinHighDimen-
sionalHeteroscedasticLinearModels
................
94
4.1Introduction....................................94
4.2PreliminaryandExistingMethods.......................95
4.2.1LassoProjection.............................97
4.2.2KFCProjection..............................99
4.2.3InverseProjection.............................102
4.3EmpiricalLikelihoodBasedApproach......................104
4.4TheoreticalExamples...............................106
4.4.1LassoProjection.............................106
4.4.2InverseProjection.............................107
4.4.3KFCProjection..............................108
4.5SimulationStudies................................109
4.6RealDataAnalysis................................117
4.6.1WGCNAofcorrelatedgenes.......................118
4.6.2Test.............................118
4.6.3PresenceofHeteroscedasticity......................120
4.6.4ResultsforTop4GeneswithHeteroscedasticity............121
4.7TechnicalDetails.................................124
4.7.1AssumptionsforTheoreticalExamples.................124
4.7.2ProofofTheorems............................127
Chapter5ConclusionsandFutureDirections
.................
156
5.1SummaryandContributions...........................156
5.2FutureDirections.................................157
BIBLIOGRAPHY
....................................
159
viii
LISTOFTABLES
Table1.1
Transitionphasepointfromsparsetodensedataandopti-
maldetectableorderoflocalalternativesforbothpointwise
andsimultaneousinference.
Notethatweloweredthetransition
phasepoint

0
whichwas1/4intheexistingliterature........6
Table2.1Empiricalcoverageprobability(%)andaveragelengthofpointwise
intervals(inparenthesis)for

1
(
t
)at
t
=0
:
3
;
0
:
5and0.7.26
Table3.1Empiricalsizeandpowerfortesting
H
0
A
:

1
(

)=

2
(

)undersce-
narioA..................................76
Table3.2P-valuesforpairwisecomparisonamongttreatmentgroups.79
Table3.3P-valuesfortestingeachcotfunctioninthequadraticmodel
(3.4.4)...................................81
Table4.1
Powercomparison.
Covariate:Toeplitzmatrixwith
ˆ
=0
:
2;Er-
ror:N(0
;
1)................................114
Table4.2
Powercomparison.
Covariate:Toeplitzmatrixwith
ˆ
=0
:
2;Er-
ror:0
:
7
X
1
N(0
;
1).............................115
Table4.3
Powercomparison.
Covariate:Toeplitzmatrixwith
ˆ
=0
:
2;Er-
ror:
1
p

1
X
1
P
p
j
=2
X
j

1
X
j
N(0
;
1)....................116
Table4.4
ModuleSizes.
.............................118
ix
LISTOFFIGURES
Figure2.1
Panels(a)and(b)areboxplotsforbandwidthsselectedformodel(2.5.19)
with

1
(
t
)=
1
2
sin(
ˇt
)and

2
(
t
)=2sin(
ˇt
+0
:
5)usingtheproposed
bandwidthselectionmethodinSection2.4.Panels(c)and(d)arethe
plotsofthelogarithmofmedian(
^
h
)vslog(
nm
).
.............27
Figure3.1
Empiricalsizeandpowerfortesting
H
0
B
:

2
(

)=0atthe5%nominal
levelunderscenarioB.Theleftpanelisfor
ˆ
=0
:
2andtherightpanelis
for
ˆ
=0
:
5.
................................76
Figure4.1
EmpiricalSizeandPowerComparisonamongEmpiricalLike-
lihoodbasedapproachesandamongHolyTrinityand
p
=
100
.
(a)\EL-KFC"representsELapproachwithKFCprojection,
\EL-INV"representsELapproachwithinverseprojectionand\EL-
LASSO"representsELapproachwithLassoprojection;(b)\Wald"
representsWaldtypetest,\Score"representsScoretestand\EL"
representslikelihoodratiotest......................112
Figure4.2
EmpiricalSizeandPowerComparisonamongEmpiricalLike-
lihoodbasedapproachesandamongHolyTrinitywithHet-
eroscedasticNoise
1
p

1
X
1
P
p
j
=2
X
j

1
X
j
N
(0
;
1)
and
p
=500
.
(a)\EL-KFC"representsELapproachwithKFCprojection,\EL-
INV"representsELapproachwithinverseprojectionand\EL-LASSO"
representsELapproachwithLassoprojection;(b)\Wald"represents
Waldtypetest,\Score"representsScoretestand\EL"represents
likelihoodratiotest............................113
Figure4.3
BreastCancerCohortStudies.
(a)Clusteringdendrogramof
genes,withdissimilaritybasedontopologicaloverlap,togetherwith
assignedmodulecolors.(b)ManhattanplotforModule3......119
Figure4.4
WonderingSchematicPlotforTop4GeneswithHeteroscedas-
ticity.
..................................122
Figure4.5
ManhattanPlotforTop4GeneswithHeteroscedasticity.
.123
x
KEYTOABBREVIATIONS

EL:EmpiricalLikelihood;

IID:IndependentandIdenticallyDistributed;

KL:Kullback-Leibler;

SNPs:SingleNucleotidePolymorphisms;

GWAS:GenomeWideAssociationStudies;

KFC:KeyconFounderControlling

WGCNA:WeightedGeneCo-expressionNetworkAnalysis;
xi
Chapter1
Introduction
1.1EmpiricalLikelihood
InStatistics,thelikelihoodprincipleistheprimaryprincipleasstatedin[Edw84],
Withintheframeworkofastatisticalmodel,alltheinformationwhichthedataprovide
concerningtherelativemeritsoftwohypothesesiscontainedinthelikelihoodratioof
thosehypothesesonthedata....Foracontinuumofhypotheses,thisprincipleasserts
thatthelikelihoodfunctioncontainsallthenecessaryinformation.
However,fortheinferenceproceduretobemorewidelyapplicable,somenon-parametric
versionofthelikelihoodisdesirablesothatwecannotonlygainrobustnessandy
butalsokeeptheenessaswellassomeothermeritsofthelikelihoodprinciple.Inthe
lateeighties,ProfessorArtB.Owenproposedthegreatidea,\EmpiricalLikelihood"(EL)
[Owe88,Owe90],whichisanon-parametriclikelihood.Thewellknown\WilksPhenomenon"
belongingtotheparametriclikelihoodstillholdsforEL[Owe90,Owe01].ELalsoenjoysthe
Bartlettcorrectionproperty[DHR91,CC06].Besides,itproducesmorenaturaldatadriven
shapeofregions.
WeconsidertheunivariatemeaninferenceproblemtointroducetheELidea.Given
n
IIDobservations
f
X
i
2
R
;i
=1
;
2
;

;n
g
fromanunknownunderlyingdistribution
F
0
withtwomoments,wewanttoconducttheinferencefortheunivariatemean

0
:=
E
F
0
(
X
i
).Anaturalpointestimationof

0
isthesamplemean

X
,buthowtoget
1
antnceintervalwithagivenlevelisnotthatsimplesincewehave
noideaabouttheunderlyingdistributionuptothetwomoments.According
to[Owe90],theempiricallikelihoodfor

istheproductoftheprobabilityweights,say
f
0

p
i

1
;i
=1
;
2
;

;n
g
,sittingonthesamplepoints
f
X
i
;i
=1
;
2
;

;n
g
,thatis
Q
n
i
=1
p
i
,withthemomentconstraint
P
n
i
=1
p
i
X
i
=

,i.e.
L
EL
(

)=max
f
p
i
g
n
i
=1
(
n
Y
i
=1
p
i
:
n
X
i
=1
p
i
=1
;p
i

0
;
n
X
i
=1
p
i
(
X
i


)=0
)
:
(1.1.1)
Actually,wecanderivetheaboveformulation(1.1.1)inthefollowingformalway.The
statisticalmodelwiththemomentrestrictioncouldbephrasedformallyasthesetofall
probabilitymeasuresthatarecompatiblewiththemomentcondition,i.e.
P
=
S

P
(

),
where
P
(

)=
ˆ
probabilitymeasure
P
on
R
:
Z
(
X


)
dP
=0
˙
:
Notethatitiscorrectlyspifandonlyif
P
includesthetruemeasure
dF
0
(
x
)asits
member.Thefollowingfunctioncouldberegardedasameasureforthedivergencebetween
twoprobabilitymeasures
P
and
Q
:
D
(
P;Q
)=
Z
˚
(
dP
dQ
)
dQ;
aslongas
˚
ischosentobeconvex.AndweknowthattheKullback-Leibler(KL)divergence
betweenprobabilitymeasures
P
and
Q
isaspecialcasebytaking
˚
(
x
)=

log(
x
).
Ifthemodeliscorrectlyspwehavethefollowingnicepropertyatthepopulation
level

0
=inf

inf
P
2P
(

)
D
(
P;F
0
)
:
2
Henceanaturalstatisticalprocedurefortheestimationofthemeancanbeobtainedby
replacingtheunknown
F
0
withtheempiricalmeasure
F
n
andsearchingovertherestricted
statisticalmodel
P
=
S

P
(

),where
P
(

)=
(
F
p
:=
n
X
i
=1
p
i

X
i
:
Z
(
X


)
dF
p
=0
)
:
Andthentheestimationofthemeanisnedastheminimizerofthefollowingoptimization
problem
inf

inf
F
p
2
P
(

)
D
(
F
p
;F
n
)=inf

inf
P
n
i
=1
p
i
(
X
i


)=0
;
P
n
i
=1
p
i
=1
;p
i

0
1
n
n
X
i
=1
˚
(
np
i
)
:
(1.1.2)
Inparticular,withtheKLdivergencein(1.1.2),wehave
inf

inf
P
n
i
=1
p
i
(
X
i


)=0
;
P
n
i
=1
p
i
=1
;p
i

0
1
n
n
X
i
=1

log(
np
i
)
;
whichnaturallyleadstothelogempiricallikelihoodasnedin(1.1.1)
`
EL
(

):=log
L
EL
(

)=max
f
p
i
g
n
i
=1
(
n
X
i
=1
log
p
i
:
n
X
i
=1
p
i
=1
;p
i

0
;
n
X
i
=1
p
i
(
X
i


)=0
)
:
Mostimportantly,[Owe90]provedthefollowingWilksproperty

2
`
EL
(

0
)

2
n
log
n
d
!
˜
2
1
:
Basedonthisasymptoticresult,wecannotonlyperformhypothesistestingbutalsocon-
structintervalforthemeanparameterwithdatadrivenshape.
3
AnoverviewoftheELmethodscanbefoundin[Owe01]and[CVK09].
1.2BigDataAnalysis
Intheageofinformationandtechnology,alongwiththeadvancementoftechnologicalrev-
olution,informationacquisitionisbecomingeasyandcheap,whichleadstotheexplosion
ofdatacollectionthroughautomateddatacollectionprocesses.Fromvarioussuch
asbiomedicalsciences,engineeringandsocialsciences,massivedatacharacterizedbyhigh
dimensionalityarepoppingupallthetime.Forexample,withtherapidnextgeneration
sequencingtechnologydevelopment,hundredsofthousandsofgeneticvariantssuchassingle
nucleotidepolymorphisms(SNPs),arepotentialfeaturesingenomewideassociationstudies
(GWAS).Timeserieswithverydensetimepointscanbecollectedfromhundredsofthou-
sandsofregionsineconomics,earthsciences,aswellasneuroscience.IntheBigDataera,
documents,images,videosandotherobjectscanallberegardedasformsofmassivedata.
Statisticianshavebeenalsoproposingnewstatisticalmethodologiestodiscoverknowledge
fromthosebigdata.Forexample,fromstudyingdatapointsintheEuclideanspaces
tostudyingcurves(i.e.functionaldataanalysis),surfaces,evenmanifoldsdirectlyin
dimensionalspaces.
1.2.1FunctionalDataAnalysis
Weconsiderthefollowinggeneralfunctionallinearregressionmodel,
Y
i
(
t
ij
)=

|
0
(
t
ij
)
X
i
(
t
ij
)+

i
(
t
ij
)
;i
=1
;

;n
;
j
=1
;

;m
i
(1.2.3)
4
where
X
i
(
t
)
˘f

(
t
)
;

(
s;t
)
g
,
t
ij
˘
f
(
t
)and

i
(
t
)
˘f
0
;

s;t
)
g
aremutuallyindependent.
Forconvenience,assumethatwith
m
i
's(1

i

n
)areallofthesameorderas
m
=
n

forsome


0.Datawith

=0,arecalledsparsefunctionaldata,i.e.longitudinaldata;
thosesatisfying


0
,where

0
isatransitionpointtobesparereferredasdense
functionaldata.Thescenarioswith

2
(0
;
0
)areinagreyzoneintheliteratureandwe
referthemas\moderatelydense".
Historically,sparseanddensefunctionaldatawereanalyzedwithtmethodologies.
Fordensefunctionaldata,onecansmootheachcurveseparatelyandproceedwithfurther
estimationandinferencebasedonthepre-smoothedcurves.Apartiallistofrecentliterature
ondensefunctionaldataincludes[CLS86],[RS91],[ZC07],[EH08]and[BHK
+
09].For
sparsefunctionaldata,thepre-smoothingapproachisnotapplicableand,instead,oneneeds
topoolalldatatogethertoborrowstrengthfromindividualcurves[YMW05a,YMW05b].
[HMW06]investigatedthetheoreticalpropertiesoffunctionalprincipalcomponentanalysis
basedonlocallinearsmoothers.Theyfoundthat,fordensefunctionaldatawith


1
=
4,
thepre-smoothingerrorsareasymptoticallynegligibleandquantitiessuchasthemean,
covarianceandeigenfunctionscanbeestimatedwithaparametricroot-
n
rate,whilethese
quantitiescanonlybeestimatedwithanonparametricconvergencerateforsparsefunctional
datawith

=0.Sincesparseanddensefunctionaldataareasymptoticconceptsand
arehardtodistinguishinreality,[LH10]proposedanestimationproceduretreatingall
typesoffunctionaldataunderaframeworkincludingthemoderatelydensecases.
Morerecently,[KZ13]proposedauself-normalizingapproachtoconstructpointwise
intervalsforthemeanfunctionoffunctionaldata.Theaforementionedpapers
established

0
=1
=
4asthetransitionpointtoparametricconvergencerate.
Incontrasttoestimation,lessisknownabouttheinferenceforfunctionaldata,witha
5
fewexceptionssuchas[ZC07]and[KZ13].InChapter2and3ofthethesis,wepropose
pointwiseandsimultaneousinferenceproceduresforthefunctionallinearmodelundera
frameworkforalltypesoffunctionaldataandinvestigatingthephasetransition
fromsparsetodensedata.Wearenotonlytheonetoproposeaninference
procedureintheregressionsetupwhichcancoveralltypesoffunctionaldata,butalsothe
onetoinvestigatethetransitionphasefromsparsetodensefunctionaldata,forthe
followingverybroadhypothesistestingproblem
H
0
:
H
f

0
(
t
)
g
=0vs
H
1
n
:
H
f

0
(
t
)
g
=
b
n
d
(
t
)(1.2.4)
where
H
(

)isanyspfunctionalwithsomeregularconditionand
b
n
isthedetectable
orderoflocalalternativestobespe(Table1.1).InChapter2and3,wenotonly
derivetheasymptoticdistributionsunderthenullhypothesisandlocalalternatives,butalso
proposeawildbootstrappingapproachtounifytheinferenceprocedureinpracticealong
withanicebandwidthselectionmethod.
Table1.1:
Transitionphasepointfromsparsetodensedataandoptimalde-
tectableorderoflocalalternativesforbothpointwiseandsimultaneousinfer-
ence.
Notethatweloweredthetransitionphasepoint

0
whichwas1/4intheexisting
literature.
PointwiseInference

0
=1
=
8SimultaneousInference

0
=1
=
16
0

<
0


0
0

<
0


0
b
n
n

4(1+

)
=
9
n

1
=
2
n

8(1+

)
=
17
n

1
=
2
6
1.2.2HighDimensionalDataAnalysis
Rapidprogresshasbeenmadeduringthepastdecadeinhighdimensionalstatistics,es-
peciallyinlinearregressionmodelasoneoftheclassicalmodelsinstatisticaltheory.The
vastmajorityofexistingliteraturehasbeenpursuedforestimationundersparsityandho-
moscedasticitybasedonregularizationwithtpenalties,eitherconvexornoncon-
vex.ThemostpopularrepresentativeofconvexpenaltiesistheLassopenalty[Tib96].
ThetheoreticalpropertiesoftheLassoestimatorsuchastheoracleproperty,whichrefers
toconsistentlyrecoveringthesparsepatternandestimatingtheparametersoftheco
cientvector,andselectionconsistencyhavebeeninvestigatedby[MY09,BRT09,BTW
+
07,
VdG08,Zha09,NRWY12]and[MB06,ZY06,Wai09].Thenonconvexrepresentativesinclude
SCAD[FL01],MCP[Zha10],amongothers.Acomprehensiveoverviewofhighdimensional
estimationforhomoscedasticregressionmodelscanbefoundin[BVDG11].
Despiteitsprevalenceinstatisticaldatasets,heteroscedasticityhasbeenlargelyignored
inhighdimensionalstatisticsliterature.[WWL12]analyzedtheheteroscedasticityinhigh
dimensionalcasebyusingquantileregression.[DCL12]proposedamethodologythatallows
nonconstanterrorvariancesforhighdimensionalestimationbutwithaparametricformof
thevariancefunction.Andrecently,[BCW14]cameupwithaself-tuning
p
Lassoestimation
methodthatsolvedthisimportantprobleminhighdimensionalregressionanalysis.
Althoughpeoplehavemadetprogresstowardsunderstandingtheestimation
theoryforhighdimensionalmodels,verylittleworkhasbeendoneforconstructing
intervals,statisticaltestingandassigninguncertaintyforpenalizedestimatorsinhighdimen-
sionalsparsemodels.Inanearlywork,[KF00]showedthatthelimitingdistributionofthe
Lassoestimatorisnotnormaleveninthelowdimensionalsetting.Recently,[GVHF11]and
7
[CG14]consideredglobaltestingwithhighdimensionalalternative.[MMB09]and[WR09]
consideredp-valuesbasedonthesamplesplittingtechnique.Stabilityselection[MB10]and
itsmo[SS13]provideanotherproceduretoestimateerrormeasuresforfalseposi-
tiveselectionsingeneralhighdimensionalsettings.Forthelassoestimator,[LTTT14]and
[TLTT14]consideredaninterestingconditionalinferencewithrandomhypothesis,whichis
philosophicallytwiththetraditionalunconditionalinference.
Intermsoftestingtheofonesingleregressioncot,theclassical
z

test
(or
t

test)isnolongerapplicablebecausethehighdimensionality.Peoplehavebeenpropos-
inglow-dimensionalprojectionproceduretoconducthypothesistestingandconstructcon-
regions[ZZ14,B
+
13,JM13,vdGBR13,LZL
+
13,NL14].Thewaytoselectthe
projectionvariablesvariesfrommethodtomethod.Someofthemusenode-wiseLassopro-
ceduretoselecttheprojectionvariables,andsomeofthemusethesocalledKeyconFounder
Controlling(KFC)methodmotivatedbyscreeningapproaches[FL08].
However,alltheaboveinferenceproceduresassumedhomoscedasticityfortheerrorterm,
inparticular,theconditionalvarianceoftheerrorisaconstant.Thisisessentialfortheir
inferenceproceduretobevalidsincetheyrequiretheaccurateestimationoftheerrorvari-
ance.Withouthomoscedasticity,itishardforthemtocarryouttheestimationoftheerror
varianceinhighdimensionalsettings.Butthishardlyholdsinpractice.Thereisrarely
goodcausetohavestrongbeliefintheassumptionthattheerrorsarehomoscedasticand
similarlythereisrarelytinformationtoenableaccuratespofthevariance
function.Theuseofincorrectvariancemodelswill,ingeneral,leadtoinferencesthatare
notasymptoticallyvalid[Bel02].[WD12]generalizedtheasymptoticresultsof[KF00]for
thecaseofaparameterdimensionunderheteroscedasiticerrors.Butthereislittle
workindealingwithheteroscedasticityundergrowingdimensionalongwithsamplesize.To
8
bridgethisgap,inChapter4ofthisthesis,weproposetouseEmpiricalLikelihood(EL)to
teststatisticalhypothesesandconstructregionsforlowdimensionalcomponents
inhighdimensionallinermodelswithheteroscedasticnoise.
9
Chapter2
pointwiseempiricallikelihood
ratiotestsforfunctionallinearmodels
andthephasetransitionfromsparse
todensefunctionaldata
2.1Introduction
Weconsiderstatisticalinferenceproblemsunderageneralfunctionallinearregressionmodel,
whereboththeresponse
Y
(
t
)andthecovariate
X
(
t
)=
f
X
(1)
(
t
)
;:::;X
(
p
)
(
t
)
g
|
are
continuouslyonatimeinterval[
a;b
].Therelationshipbetween
Y
(
t
)and
X
(
t
)isgivenby
Y
(
t
)=

|
0
(
t
)
X
(
t
)+

(
t
)
;
(2.1.1)
where

0
(
t
)=


10
(
t
)
;

;
p
0
(
t
)

|
isa
p
-dimensionalvectorofunknownfunctionsand

(
t
)isazeromeanerrorprocess,independentof
X
andwithacovariancefunction
s;t
)=
Cov
f

(
s
)
;
(
t
)
g
.Themodelin(2.1.1)isalsoreferredastheconcurrentfunctionallinearmodel
in[SR05],whichincludesthevaryingcotmodelsandfunctionalanalysisofvariance
(fANOVA)models[MC06,ZHM
+
10]asspecialcases.InmanyfANOVAapplications,some
10
componentsof
X
(
t
)arerandomindicatorsoftreatmentassignmentswithcomplicatedcross
ornestedstructures,see[FZ00]formorediscussionsontherelationshipandbe-
tweenmodel(2.1.1)andthevaryingcotmodels.Withoutlossofgenerality,weallow
X
(
t
)tobeamultivariaterandomprocesswithmeanfunction

(
t
)=
E
f
X
(
t
)
g
andcovariance
function

(
s;t
)=Cov
f
X
(
s
)
;
X
(
t
)
g
.
Let
f
Y
i
(
t
)
;
X
i
(
t
)
g
,
i
=1
;:::;n
,beindependentrealizationsof
f
Y
(
t
)
;
X
(
t
)
g
.Insteadof
observingtheentiretrajectories,onecanonlyobserve
Y
i
(
t
)and
X
i
(
t
)ondiscretetimepoints
f
t
ij
;j
=1
:::;m
i
g
.Forconvenience,denote
Y
ij
=
Y
i
(
t
ij
)and
X
(
k
)
ij
=
X
(
k
)
i
(
t
ij
),andassume
that
m
i
's(1

i

n
)areallofthesameorderas
m
=
n

forsome


0.Thatis
m
i
=m
are
boundedbelowandabovebysomeconstants.Functionaldataareconsideredtobesparse
ordensedependingontheorderof
m
[HMW06,LH10].Datawithbounded
m
,or

=0,
arecalledsparsefunctionaldata;thosesatisfying


0
,where

0
isatransitionpointto
bespbelow,arereferredasdensefunctionaldata.Thescenarioswith

2
(0
;
0
)are
inagreyzoneintheliteratureandwereferthemas\moderatelydense"inthischapter.
AswementionedinSection1.2.1,sparseanddensefunctionaldatawereanalyzedwith
tmethodologies.Butsparseanddensefunctionaldataareasymptoticconceptsand
arehardtodistinguishinpractice,[LH10]proposedanestimationproceduretreatingall
typesoffunctionaldataunderaframeworkincludingthemoderatelydensecasesand
theyfound

0
=1
=
4isthetransitionpointtoparametricconvergencerateintheestimation.
Incontrasttoestimation,lessisknownabouttheinferenceforfunctionaldata,withafew
exceptionssuchas[ZC07]and[KZ13].Thefocusofthechapterisonproposingpointwiseand
simultaneousinferenceproceduresforthefunctionallinearmodelin(2.1.1)undera
frameworkforalltypesoffunctionaldataandinvestigatingthephasetransitionfromsparse
11
todensedata.Weareinterestedintesting
H
0
:
H
f

0
(
t
)
g
=0vs
H
1
:
H
f

0
(
t
)
g6
=0(2.1.2)
where
H
f
z
g
isa
q
-dimensionalfunctionof
z
=(
z
1
;

;z
p
)
|
2
R
p
suchthat
C
(
z
):=
@H
(
z
)
@
z
|
isa
q

p
fullrankmatrix(
q

p
)forall
z
.
Thetestproblemin(2.1.2)isverybroad,includingmanyinterestinghypothesesasspecial
cases.Forinstance,if
H
f
z
g
=
z
,thenullhypothesisisequivalentto
H
0
:

k
0
(

)=0forall
k
.
If
H
f
z
g
=(
z
1

z
2
;z
2

z
3
;

;z
p

1

z
p
)
|
,then(2.1.2)isessentiallyanANOVAhypothesis
forthecotfunctions

k
0
(

).If
H
f
z
g
=


c
0
fora
q

p
knownconstantmatrix

andaknownvector
c
0
,then(2.1.2)becomes
H
0
:


0
(

)=
c
0
,whichisatestforlinear
constraintson

0
(

).Similarhypothesistestingproblemshavebeenstudiedby[ZC07]and
[Zha11].However,theirmethodsonlyapplytodensefunctionaldatawith
>
5
=
4.
Inthischapter,weproposenonparametrictestsbasedontheempiricallikelihood(EL)to
test(2.1.2)pointwisely.WeshowtheEL-basedtestsenjoyaniceself-normalizingproperty
suchthatwecantreatbothsparseanddensefunctionaldataunderaframework.
TherehavebeensomeworksonELmethodsforsparsefunctionaldatawith

=0.Among
them,[XZ07]proposedanELmethodforconstructingpointwiseintervaland
aBonferronitypesimultaneousbandforthemeanfunction.[CZ10]studiedan
EL-basedmethodfortestingANOVAtypehypothesesinpartiallinearmodelswithmissing
values.
Toinvestigatethepowerofthetests,weconsiderthelocalalternatives
H
1
n
:
H
f

0
(
t
)
g
=
b
n
d
(
t
)
;
(2.1.3)
12
where
b
n
isasequenceofnumbersconvergingto0ataratetobesplaterand
d
(
t
)
6
=0is
any
q
-dimensionalfunction.Foragiventest,
b
n
isthesmallestorderofthelocalalternatives
sothatthetesthasanon-trivialpowerforanynon-zero
d
(

).Thus
b
n
quanthe
orderofsignalsthatatestcandetect.Forthesparsedatawith

=0,itisknownthatthe
ELmethodusingaglobalbandwidth
h
[CZ10]candetectalternativesoforder
b
n
=(
nh
)

1
=
2
forpointwisetests.Since
h
!
0inatypicalnonparametricregressionsetting,thedetectable
orderhereislargerthan
n

1
=
2
.However,fordensedatawith
>
0,thedetectableorder
b
n
isstilllargelyunknown.Onekeyinterestinthischapteristounderstandtheof

on
b
n
inthepointwisetest.Theoptimal
b
n
isobtainedbymaximizingthepowerofthe
test(i.e.,minimizingtheorderof
b
n
)whilecontrollingthetypeIerroratthedesiredlevel.
Undersomemildconditions,wethat,forthepointwisetest,theoptimalrate
b
n
islarger
than
n

1
=
2
for


1
=
8andequalsto
n

1
=
2
for
>
1
=
8.Thetransitionpoint1
=
8willbe
refereedas

0
forthepointwisetests.Once
>
0
,withaproperlychosenbandwidth,the
proposedtestscandetectasignalataparametricrate.
Therestofthechapterisorganizedasfollows.InSection2.2,wepresentabias-corrected
estimatorandsomepreliminaryresults.WeproposethepointwiseELtestinSection
2.3whereweinvestigatetheasymptoticdistributionsoftheteststatisticunderboththe
nullandlocalalternatives,andthetransitionphasesfor
b
n
.InSection2.4,weaddress
implementationissuessuchasbandwidthselectionandcovarianceestimation.Simulation
studiesarepresentedinSection2.5.AllthetechnicaldetailsarerelegatedtotheSection
2.6.
13
2.2Abias-correctedestimatorandsomepreliminary
results
Inthissection,wewillintroduceaninitiallocallinearestimator
^

(
t
)[FG96]for

0
(
t
)
andthenintroduceabias-correctedestimator


(
t
)andsomepreliminaryresults.
2.2.1Abias-correctedestimator
Let
K
(

)beasymmetricprobabilitydensityfunctionthatweuseasakernel,
h
beaband-
width,anddenote
K
h
(

)=
K
(

=h
)
=h
.Forany
t
inaneighborhoodof
t
0
,

k
0
(
t
)canbe
approximatedby

k
0
(
t
)
ˇ

k
0
(
t
0
)+
@
k
0
(
t
0
)
@t
(
t

t
0
):=
a
k
+
b
k
(
t

t
0
)
;k
=1
;
2
;

;p:
Denote
#
=(
a
1
;:::;a
p
;hb
1
;:::;hb
p
)
|
and
D
ij
(
t
)=(
X
|
ij
;
t
ij

t
h
X
|
ij
)
|
.Put
Y
i
=(
Y
i
1
;Y
i
2
;:::;Y
im
i
)
|
;
Y
=(
Y
|
1
;
Y
|
2
;

;
Y
|
n
)
|
;
D
i
(
t
)=(
D
i
1
(
t
)
;
D
i
2
(
t
)
;

;
D
im
i
(
t
))
|
;
D
(
t
)=(
D
|
1
(
t
)
;
D
|
2
(
t
)
;

;
D
|
n
(
t
))
|
;
W
i
(
t
)=
1
m
i
diag
f
K
h
(
t
i
1

t
)
;K
h
(
t
i
2

t
)
;:::;K
h
(
t
im
i

t
)
g
;
and
W
(
t
)=diag(
W
1
(
t
)
;
W
2
(
t
)
;:::;
W
n
(
t
))
:
Anestimatorfor
#
isobtainedas
^
#
=argmin
#
[
Y

D
(
t
0
)
#
]
|
W
(
t
0
)[
Y

D
(
t
0
)
#
]
;
(2.2.4)
=[
D
|
(
t
0
)
W
(
t
0
)
D
(
t
0
)]

1
D
|
(
t
0
)
W
(
t
0
)
Y
:
14
Thusthelocallinearestimatorfor

0
(
t
0
)is
^

(
t
0
)=

I
p
;
0
p

^
#
=

I
p
;
0
p

[
D
|
(
t
0
)
W
(
t
0
)
D
(
t
0
)]

1
D
|
(
t
0
)
W
(
t
0
)
Y
;
(2.2.5)
where
I
p
isa
p

p
identitymatrixand
0
p
isa
p

p
zeromatrix.ItisshowninLemma1
inSection2.6.2that
sup
t
2
[
a;b
]
j
^

(
t
)


0
(
t
)
j
=
O
ˆ
h
2
+(
log
n
n
+
log
n
nmh
)
1
=
2
˙
a:s:
(2.2.6)
Sincethebiasof
^

(
t
)isoforder
h
2
,undersmoothingistypicallyneededforanunbiased
testingprocedurebasedon
^

(
t
)[XZ07].Toavoidundersmoothingandreducetheestima-
tionbiasin
^

(
t
),we


(
t
)asthesolutionofthefollowingresidual-adjusted[XZ07]
estimatingequationfor

(
t
)

g
n
f

(
t
)
g
:=
1
n
n
X
i
=1
g
i
f

(
t
)
g
=0
;
(2.2.7)
with
g
i
f

(
t
)
g
=
1
m
i
P
m
i
j
=1
n
Y
ij


|
(
t
)
X
ij
f
^

(
t
ij
)

^

(
t
)
g
|
X
ij
o
X
ij
K
h
(
t
ij

t
),where
^

(
t
)isthelocallinearestimatorfor

0
(
t
).
2.2.2Regularityconditionsandpreliminaryresults
Wenowpresentsomepreliminaryresultsregardingtheasymptoticsof


(
t
).Assumethat
t
ij
arei.i.d.randomvariablesfollowingaprobabilitydensityfunction
f
(
t
).Forconvenience,


(
t
)=

(
t;t
),
t
)=(
t;t
),
C
(
t
)=
C
f

0
(
t
)
g
and
A
(
t
)=

(
t
)
f
(
t
).Wewillalsouse
~
o
p
and
~
O
p
torepresentthat,respectively,
o
p
and
O
p
holduniformlyforall
t
2
[
a;b
].The
15
followingconditionsareneededforourasymptoticresults.
(C1):
Thekernelfunction
K
(

)isasymmetricprobabilitydensityfunctionwithabounded
supportin[

1
;
1].
(C2):
Assumethat
E
n
sup
t
2
[
a;b
]
k
X
(
t
)
k

1
o
<
1
and
E
n
sup
t
2
[
a;b
]
j

(
t
)
j

2
o
<
1
forsome

1
;
2

5where
kk
isthe
L
2
normforavector.
(C3):
Assumethat
f
(
t
)and

(
t
)havecontinuoussecondderivativeson[
a;b
],

0
(
t
)has
continuousthirdderivativeson
t
2
[
a;b
],and
C
(
t
)isuniformlyboundedon
t
2
[
a;b
].
(C4):


=min
f

1
;
2
g
andlet
h
=
n


0
with

0
2
(0
;
1)beingtheorderofthe
bandwidth.Assumethat(i)

0
<
1


2

if

2
[0
;
1
=
8]and

0
<
1
=
2

1

if
>
1
=
8;(ii)(1+

)
=
9
<
0
if

2
[0
;
1
=
8]and1
=
8
<
0
<
if
>
1
=
8.
Conditions(C1)and(C3)arecommonlyusedregularityconditionsinnonparametric
regressions.Condition(C2)issimilartothatin[LH10].Theupperboundsonthebandwidth
h
in(C4)(i)areadaptedfrom[LH10].Detailedexplanationontherestrictionson
h
in(C4)(ii)
willbegiveninRemark2afterProposition2.Selectingabandwidththat(C4)will
bediscussedinSection2.4.
Thefollowingpropositionprovidesanasymptoticexpansionfor


(
t
).
Proposition1.
Underconditions(C1)-(C3)and(C4)(i),


(
t
)


0
(
t
)=

A

1
(
t
)

˘
n
(
t
)
f
1+~
o
p
(1)
g
+
~
O
p
(
h
4
)
;
(2.2.8)
16
where

˘
n
(
t
)=
n

1
P
n
i
=1
˘
i
(
t
)
and
˘
i
(
t
)=
m

1
i
P
m
i
j
=1
X
ij

ij
K
h
(
t
ij

t
)
.Let

r
=lim
n
!1
n

1
n
X
i
=1
m=m
i
;
ts
=
Z
u
s
K
t
(
u
)
du;
then
Var
f

˘
n
(
t
)
g
=

(
t

t
)
f
(
t
)
n

r
mnh

20
+
m


r
nm
f
(
t
)
o
f
1+~
o
(1)
g
:
(2.2.9)
TheproofofProposition1isprovidedinSection2.6.2.
Remark1.
Proposition1impliesthatthemeansquareerror(MSE)of


(
t
)
isMSE


(
t
)

=
O
f
h
8
+
1
mnh
+
1
n
g
:
Hencetheoptimal
h
opt
thatminimizetheMSEof


(
t
)
is
h
opt
˘
(
mn
)

1
=
9
=
n

(1+

)
=
9
.Itfollowsthat


(
t
)


0
(
t
)=
O
p
f
h
4
opt
+(
mnh
opt
)

1
2
+
n

1
2
g
=
O
p
f
n

1
=
2
+
n

4(1+

)
=
9
g
:
Thentheoptimalconvergencerateof


(
t
)
isoforder
n

4(1+

)
=
9
if


1
=
8
andoforder
n

1
=
2
if
>
1
=
8
.Thus,

0
=1
=
8
isthetransitionpointfortheconvergencerateof


(
t
)
.
When
>
0
,


(
t
)
isnolongersensitivetothechoiceof
h
anditstheconvergencerate
remainsat
O
p
(
n

1
=
2
)
aslongas
h
=
O
(
n

1
=
8
)
and
h
˛
m

1
=
n


.
Thefollowingpropositionprovidestheasymptoticdistributionof


(
t
)anditsproofis
providedinSection2.6.2.
Proposition2.
Suppose
mh
!

0
2
[0
;
1
]
,
C

0

=
8
>
>
>
<
>
>
>
:
f
n=
(
mh
)
g
1
=
2
;
if

0
<
1
;
n
1
=
2
;
if

0
=
1
(2.2.10)
17
and
B
(
t
)=

(
t

t
)
f
(
t
)
f
(
r
20
+

0
f
(
t
))
I
(

0
<
1
)+
f
(
t
)
I
(

0
=
1
)
g
.Underconditions
(C1)-(C4),wehave
nC

1

0


(
t
)


0
(
t
)

d
!
N
(
0
;
V
(
t
))
:
(2.2.11)
where
V
(
t
)=
A

1
(
t
)
B
(
t
)
A

1
(
t
)
.
Remark2.
ByProposition1,thebiasin
nC

1

0

f


(
t
)


0
(
t
)
g
isoforder
O
p
(
nh
4
=C

0

)
.
Sincethebiascanleadtoinvalidtests,weuseCondition(C4)(ii)toensurethatthebiasis
asymptoticallynegligible.When


0
=1
=
8
,thecondition

0
>
(1+

)
=
9
warrantsthat
mh<
1
andhence
nh
4
=C

0

=
n
1
=
2
m
1
=
2
h
9
=
2
=
n
(1+


9

0
)
=
2
=
o
(1)
.When
>
0
,the
conditionthat
1
=
8
<
0
<
implies
mh
!1
and
nh
4
=C

0

=
n
1
=
2
h
4
=
n
1
=
2

4

0
!
0
.
ByProposition2andtheDeltamethod,wecanshowthat,under
H
0
,
nC

1

0

H
f


(
t
)
g
d
!
N
(
0
;
R

1
(
t
))(2.2.12)
where
R
(
t
)=
f
C
(
t
)
V
(
t
)
C
(
t
)
|
g

1
.Theasymptoticvariancesof
H
f


(
t
)
g
aretunder
sparseanddensecases.AWald-typeteststatisticmaybeconstructedusing(2.2.12)ifan
appropriateestimatorforthevarianceof
H
f


(
t
)
g
canbeobtained.Butwewillnotpursue
thisdirectionbecausetheestimationoftheasymptoticvarianceinvolvesmanynonparametric
functionse.g.

(
t
)
;

t
)and
f
(
t
),whichrequiresproperlyselectingseveralbandwidths.
Instead,weproposeaself-normalizingELmethodinthenextsectionwhichavoidsestimating
theasymptoticvarianceexplicitly.
18
2.3Apointwisetest
Inthissection,wewillintroduceatestfor
H
0
atanytime
t
,whichisbased
ontheempiricallikelihoodratio(ELR)statistic.ToconstructanELRstatisticfortesting
(2.1.2),wetheELfunctionat

(
t
)fora
t
2
[
a;b
].Following[Owe90],the
empiricallikelihoodfor

(
t
)isas
L
f

(
t
)
g
=max
p
1
;p
2
;:::;p
n
(
n
Y
i
=1
p
i
:
n
X
i
=1
p
i
=1
;p
i

0
;
n
X
i
=1
p
i
g
i
f

(
t
)
g
=0
)
:
ApplyingtheLagrangemultiplier,thelog-ELfunctionbecomes
l
f

(
t
)
g
:=log
L
f

(
t
)
g
=

X
log
f
1+

|
(
t
)
g
i
f

(
t
)
gg
n
log
n
where

(
t
)isasolutiontothefollowingequation
Q
1
n
f

(
t
)
;

(
t
)
g
:=
1
n
n
X
i
=1
g
i
f

(
t
)
g
1+

|
(
t
)
g
i
f

(
t
)
g
=0
:
(2.3.13)
Themaximumlog-ELwithoutanyconstraintis
l
f


(
t
)
g
=

n
log
n:
Itfollowsthatthe
negativelog-ELRfortesting
H
0
:
H
f

0
(
t
)
g
=0is
`
(
t
):=min
H
f

(
t
)
g
=0
l
0
f

(
t
)
g
;
(2.3.14)
where
l
0
f

(
t
)
g
=
P
n
i
=1
log
f
1+

|
(
t
)
g
i
f

(
t
)
gg
.Tosolve(2.3.14),weminimizethefollowing
objectivefunction[QL95]
M
f

(
t
)
;

(
t
)
g
=
1
n
l
0
f

(
t
)
g
+

|
(
t
)
H
f

(
t
)
g
;
19
where

(
t
)isa
q

1vectorofLagrangemultipliers.tiating
M
(

;

)withrespectto

and

andsettingthemtozero,wehave
Q
2
n
f

(
t
)
;

(
t
)
;

(
t
)
g
:=
1
n
@l
0
f

(
t
)
g
@

|
(
t
)
+
C
|
f

(
t
)
g

(
t
)=0and
H
f

(
t
)
g
=0
:
Combiningequation(2.3.13)for

(
t
),theconstrainedminimizationproblemin(2.3.14)is
equivalenttosolvingthefollowingestimatingequationsystem
Q
1
n
f

(
t
)
;

(
t
)
g
=0;
Q
2
n
f

(
t
)
;

(
t
)
;

(
t
)
g
=0and
H
f

(
t
)
g
=0
:
(2.3.15)
WeshowinSection2.6.2.3thataconsistentsolutionto(2.3.15),denotedas(
~

(
t
)
;
~

(
t
)
;
~

(
t
)),
existsalmostsurely.Wecall
~

(
t
)theRestrictedMaximumEmpiricalLikelihoodEstimator
(RMELE).Thentheteststatisticin(2.3.14)becomes
`
(
t
)=
l
0
f
~

(
t
)
g
:
(2.3.16)
Thefollowingpropositionprovidesanasymptoticexpansionfor2
`
(
t
).
Proposition3.
Underconditions(C1)-(C4),andunder
H
0
,wehave,foreach
t
2
[
a;b
]
,
2
`
(
t
)=
U
n
(
t
)
|
U
n
(
t
)+
O
p
(
nh
4
=C

0

)
;
(2.3.17)
where
U
n
(
t
)=
nC

1

0

G
(
t
)

˘
n
(
t
)
,
G
(
t
)=
R
1
=
2
(
t
)
C
(
t
)
A

1
(
t
)
and
R
(
t
)
and
A
(
t
)
arethe
sameasdin(2.2.12).
Theasymptoticexpansionin(2.3.17)makesaconnectionbetween2
`
(
t
)andthebias-
correctedestimator


(
t
)describedinSection2.2.ByProposition1and(2.2.12),
U
n
(
t
)=
20
nC

1

0

R
1
=
2
(
t
)
H
f


(
t
)
g
+
o
p
(1)andasymptoticallyfollowsa
q
-dimensionalmultivariate
standardnormaldistribution.Naturally,2
`
(
t
)
d
!
˜
2
q
underthenullhypothesis.Thefact
thattheasymptoticdistributionof2
`
(
t
)doesnotdependon
m
(or

)provesthatitisa
self-normalizedteststatisticnomatterthedataaresparseordense.Thisisaveryappealing
propertybecausethetestprocedureisthesameforalltypesoffunctionaldataandsolving
(2.3.15)doesnotrequireestimatingthevarianceof
H
f


(
t
)
g
.
ThefollowingTheoremsummarizestheasymptoticdistributionof2
`
(
t
)underboth
H
0
andthelocalalternative(2.1.3).
Theorem1.
Underconditions(C1)-(C4)andsuppose
H
f

0
(
t
)
g
=
b
n
d
(
t
)
for
t
2
[
a;b
]
,
where
b
n
=
n

1
C

0

and
d
(
t
)
isanydrealvectoroffunctions,wehave
2
`
(
t
)
d
!
˜
2
q
f
d
|
(
t
)
R
(
t
)
d
(
t
)
g
where
d
|
(
t
)
R
(
t
)
d
(
t
)
isthenoncentralityparameter.
AproofofTheorem1isprovidedintheSection3.5.1.
Remark3.
Under
H
0
,
d
(
t
)=0
andTheorem1suggeststhat
2
`
(
t
)
followsa
˜
2
q
distribution
asymptotically.Anasymptotic

leveltestisgivenbyrejecting
H
0
atadpoint
t
if
2
`
(
t
)
>˜
2
q
where
˜
2
q
istheupper

quantileof
˜
2
q
.Bytakingaspecialfunction
H
f

g
=

j
(
t
)
,wecanalsoconstructa(
1


)100%cdenceintervalfor

j
(
t
)
(
j
=1
;

;p
)
asCI

=
f

j
(
t
):2
`
(
t
)
<˜
2
q
g
,whichcanbecomputednumerically.Thisprovidesan
alternativeself-normalizedceintervaltothosebasedonaself-normalizednormal
approximation[KZ13].ComparingtoKimandZhao'smethod,ourmethoddoesnotrequire
estimatingthebiasbecauseweusebias-correctedestimatingequations.
21
Wethesizeofthedetectablesignal
b

n
asthesmallestorder
b
n
in(2.1.3)thatthe
proposedtestcandetect.Foragiventlevel

,
b

n
=min
h
b
n
subjectto(i)TypeIerror


under
H
0
(2.3.18)
and(ii)thepowerisnon-trivialunder
H
1
n
.
Theorem1guaranteesthattheproposedtestcontrolstheTypeIerroratthenominal
levelasymptotically.Forthesparseandmoderatedensecases(


1
=
8),Condition(C4)
implies
mh
!
0andhence
b
n
=(
nmh
)

1
=
2
byTheorem1.Inthiscase,
b

n
isequivalentto
min
h
b
n
=(
nmh
)

1
=
2
subjecttocondition(C4)on
h
.
Theoptimal
h
thatsolvestheminimizationproblemaboveis
h

=
n

(1+

+

)
=
9
foran
arbitrarilysmall
>
0.Thisimpliestheoptimal
b
n
is
n

4(1+

)
=
9+
=
18
,whichresultsin
b

n
=
n

4(1+

)
=
9
byletting

!
0.Fordensedata(
>
1
=
8),(C4)leadsto
mh
!1
.
Theorem1impliesthattheproposedtesthasanon-trivialpowerunderalocalalternative
ofsize
b

n
=
n

1
=
2
,whichisthedetectableorderofaparametrictest.
2.4Implementationissues
2.4.1Bandwidthselection
Theperformanceoftheestimationandtestproceduresdependsonthebandwidth
h
andour
asymptotictheoryrelieson
h
fallingintherangeinCondition(C4).Forlongitudinal
data(sparsefunctionaldata)wheresubjectsareassumedtobeindependent,onemayapply
22
a\leave-one-out"cross-validationstrategy[RS91]tochoosebandwidth.However,cross-
validationistime-consumingandingeneral,itsperformancefordensefunctionaldatais
unknown.
Weproposetoselectthebandwidththroughminimizingtheconditionalintegratedmean
squarederror(IMSE)ofthelocalpolynomialestimator
^

(
t
).By(2.2.5),thebandwidth
h
thatminimizingtheIMSEof
^

(
t
)isattheorderof
n

(1+

)
=
5
,whichcondition(C4)
forbothsparseanddensecases.Let
D
=
f
(
t
ij
;
X
ij
)
;j
=1
;
2
;

;m
i
;i
=1
;
2
;

;n
g
.It
isnottoshowthatforany
t
,
MSE(
^

(
t
)
jD
)=
b
|
(
t
)
b
(
t
)+tr
f
Cov(
^

(
t
)
jD
)
g
where
b
(
t
)=Bias(
^

(
t
)
jD
).TheIMSEisas
IMSE(
^

(

)
jD
)=
Z
b
a
MSE(
^

(
t
)
jD
)
$
(
t
)
f
(
t
)
dt
where
$
(
t
)isaknownweightfunctionand
f
(
t
)istheprobabilitydensityfunctionof
t
ij
.
Theconditionalbiasis
b
(
t
)=(
I
;
0
)(
D
|
(
t
)
W
(
t
)
D
(
t
))

1
D
|
(
t
)
W
(
t
)
l
(
t
)
;
where
l
(
t
)=(
l
11
(
t
)
;

;l
1
m
1
(
t
)
;l
21
(
t
)
;

;l
nm
n
(
t
))
|
with
l
ij
(
t
)=
X
|
ij

(
t
ij
)

X
|
ij
[

0
(
t
)+(
t
ij

t
)

(1)
(
t
)]
=
X
|
ij
[

(
t
ij
)


0
(
t
)

(
t
ij

t
)

(1)
(
t
)]
ˇ
X
|
ij

(2)
(
t
)(
t
ij

t
)
2
=
2
;
23
and

(
s
)
(
t
)=
f

(
s
)
1
(
t
)
;

;
(
s
)
p
(
t
)
g
|
,
s
=1
;
2,isthe
s
-thderivativeof

0
(
t
).Theconditional
covarianceis
Cov(
^

(
t
)
jD
)=(
I
;
0
)(
D
|
(
t
)
W
(
t
)
D
(
t
))

1
D
|
(
t
)
W
(
t
)


W
(
t
)
D
(
t
)(
D
|
(
t
)
W
(
t
)
D
(
t
))

1
0
B
@
I
0
1
C
A
;
where

=Cov(
Y
jD
)=diag(

1
;

2
;

;

n
)and

i
=


t
ij
;t
ik
)

m
i
j;k
=1
.
Anestimatorofthecovariance
s;t
)isdescribedinSection2.4.2.Toestimate

(2)
(
t
),
weuseahigherorderlocalpolynomialestimatorof

0
(
t
)withapilotbandwidth
h

.The
pilotbandwidthisobtainedbyminimizingtheresidualsquarescriterionin[ZL00].By
replacing

(2)
(
t
)and

withtheirestimators
d

(2)
(
t
)and
^

,weobtainestimatorsofthe
conditionalmeanandcovariance,
^
b
(
t
)and
d
Cov(
^

(
t
)
jD
).Thenthebandwidth
h
ischosen
byminimizingtheempiricalIMSE
^
h
=argmin
h
1
N
n
X
i
=1
m
i
X
j
=1
[
MSE
f
^

(
t
ij
)
jDg
$
(
t
ij
)
where
N
=
P
n
i
=1
m
i
and
[
MSE(
^

(
t
)
jD
)=
^
b
|
(
t
)
^
b
(
t
)+tr
f
d
Cov(
^

(
t
)
jD
)
g
:
2.4.2CovarianceEstimation
Thecovariancefunction

;

)canbeestimatedbythenonparametrickernelestimatorof
[YMW05a],whichisuniformlyconsistent[LH10].However,thenonparametriccovariance
estimatorisnotnecessarilypositiveInstead,weadoptthesemiparametric
covarianceestimationof[FHL07].Supposethecovariancefunctioncanbedecomposedas
24

s;t
)=
˙
(
s
)
ˆ
(
s;t
)
˙
(
t
),wemodelthevariancefunction
˙
2
(
t
)nonparametricallyandthe
correlationfunction
ˆ
(
s;t
)parametrically.Forestimation,weapplythenonparametric
kernelestimatorsof(
s;t
)and
˙
2
(
t
)[YMW05a]togetinformationabouttheparametric
structureof
ˆ
(
s;t
).Thenweaparametricmodelto
ˆ
(
s;t
)usingthequasimaximum
likelihoodestimatorof[FHL07].Theparametricstructureguaranteesthepositivesemi-
oftheestimatedcorrelationfunction.Formoredetailsoftheimplementation,
seeSection2.5.
2.5Simulationstudies
Simulationstudieswereconductedtoevaluatetheperformanceoftheproposedin-
ferenceprocedures.Wegenerateddatafromthefollowingmodel
Y
i
(
t
ij
)=

1
(
t
ij
)
X
(1)
i
(
t
ij
)+

2
(
t
ij
)
X
(2)
i
(
t
ij
)+

i
(
t
ij
)(2.5.19)
for
i
=1
;
2
;

;n
and
j
=1
;
2
;

;m
where
t
ij
'sareIIDUnif[0,1]distributed,
X
(1)
i
(
t
ij
)=
1+2
e
t
ij
+
v
ij
and
X
(2)
i
(
t
ij
)=3

4
t
2
ij
+
u
ij
.Here
u
ij
and
v
ij
areIID
N
(0
;
1)randomvariables,
whichareindependentwith
t
ij
and

i
(
t
ij
).Therandomerror

i
(
t
ij
)wasgeneratedfrom
azeromeanAR(1)processsuchthatVar
f

(
t
)
g
=1andCov
f

(
t
)
;
(
t

s
)
g
=
ˆ
10
s
for
some
ˆ
2
(0
;
1).Toevaluatetheproposedmethodsforbothsparseanddensedata,weset
m
=5
;
10and50.Thesamplesizeswerechosentobe100and200.TheEpanechnikov
kernel
K
(
x
)=
3
4
(1

x
2
)
+
wasusedforestimation,where(
a
)
+
=max(
a;
0).Bandwidth
selectionwasconductedforeverysimulateddatasetusingthemethodproposedinSection
2.4.
25
Weset

1
(
t
)=
1
2
sin
t
and

2
(
t
)=2sin(
t
+0
:
5)inModel(2.5.19)andappliedthe
procedureinSection2.3toconstructpointwiseCIsfor

1
(
t
).Table2.1summarizesthe
empiricalcoverageprobability(CP)inpercentageandtheaveragelength(AL)oftheCIs
(inparentheses)for

1
(
t
)at
t
=0
:
3
;
0
:
5and0.7basedon1000simulationreplicates.These
resultswereobtainedusingthedata-drivenbandwidth.Aswecanseefromthetable,the
CPsareclosetothenominallevel95%inbothsparseanddensecasesandtheALsare
shorterunderalargersamplesize.Inaddition,theALsimproveas
m
increasesfrom5to
50.
Table2.1:Empiricalcoverageprobability(%)andaveragelengthofpointwise
intervals(inparenthesis)for

1
(
t
)at
t
=0
:
3
;
0
:
5and0.7.
m
=5
m
=10
m
=50
tnˆ
=0
:
2
ˆ
=0
:
5
ˆ
=0
:
2
ˆ
=0
:
5
ˆ
=0
:
2
ˆ
=0
:
5
0.310092.1(0.272)92.9(0.268)92.9(0.203)92.5(0.203)93.7(0.107)93.9(0.107)
20092.3(0.205)92.3(0.205)93.5(0.152)93.0(0.152)94.7(0.081)94.4(0.081)
0.510092.9(0.270)93.5(0.267)94.5(0.210)94.0(0.209)93.3(0.107)93.1(0.108)
20093.6(0.201)93.3(0.200)94.6(0.152)94.4(0.152)94.0(0.083)93.8(0.081)
0.710092.1(0.273)92.5(0.272)92.2(0.211)92.1(0.208)93.4(0.106)92.8(0.106)
20092.3(0.201)92.4(0.201)94.1(0.153)93.3(0.153)93.9(0.083)93.8(0.081)
Tofurtherdemonstratetheperformanceoftheproposedbandwidthselectionmethodin
Section2.4,weshowinpanels(a)and(b)ofFigure2.1theboxplotsof
^
h
selectedformodel
(2.5.19)with

1
(
t
)=
1
2
sin(
ˇt
)and

2
(
t
)=2sin(
ˇt
+0
:
5)basedon500replicates.Both
themedianandspreadof
^
h
decreasedasthe
n
and
m
increasedandthecorrelation
ˆ
had
littleimpactonthebandwidthselectionresult.Theseplotsalsoshowthatourbandwidth
selectionprocedureisverystableasthereareveryfewoutliersineachcase.Inpanels(c)
and(d)ofFigure2.1,weplotthelogarithmofMedian(
^
h
)againstthelogarithmof
nm
for
eachvalueof
ˆ
.Theseplotsshowclearlineardecreasingtrends,contheselected
26
bandwidthdecreasesinapolynomialorderof
nm
.
(a)
n
=100
(b)
n
=200
(c)
ˆ
=0
:
2
(d)
ˆ
=0
:
5
Figure2.1:
Panels(a)and(b)areboxplotsforbandwidthsselectedformodel(2.5.19)with

1
(
t
)=
1
2
sin(
ˇt
)and

2
(
t
)=2sin(
ˇt
+0
:
5)usingtheproposedbandwidthselectionmethodin
Section2.4.Panels(c)and(d)aretheplotsofthelogarithmofmedian(
^
h
)vslog(
nm
).
27
2.6TechnicalDetails
ThissectioncontainstheproofsforthemaintheoremsinSection2.3.Proofsforthepropo-
sitionscanbefoundinthenextsection.
2.6.1ProofofTheorem1
ProofofTheorem1.
Forconvenience,wesuppresstheargumentofallthefunctionsont.


1
=
0
B
B
B
B
B
@

B

1
+
B

1
APAB

1
B

1
APB

1
AQ
|
PAB

1
PQ
|
QAB

1
Q

R
1
C
C
C
C
C
A
;
where
P
=
V
(
I

C
|
Q
)and
Q
=
RCV
.ByTaylorexpansionoftheequations(2.3.15)at
(

;
0
;
0)asinLemma4inSection2.6.2,wehave
0
B
B
B
B
B
@
C
2

0

n

1
~

~


0
~

1
C
C
C
C
C
A
=


1
0
B
B
B
B
B
@

n

1
P
n
i
=1
g
i
(

0
)+
o
p

n
)
o
p

n
)

H
(

0
)+
o
p

n
)
1
C
C
C
C
C
A
=
0
B
B
B
B
B
@

B

1
+
B

1
APAB

1
PAB

1
QAB

1
1
C
C
C
C
C
A
(

1
n
n
X
i
=1
g
i
(

0
)
)
+
0
B
B
B
B
B
@
B

1
AQ
|
Q
|

R
1
C
C
C
C
C
A

H
(

0
)
g
+
o
p

n
)
;
28
where
n
=
k
~


0
k
+
k
~

k
+
k
~

k
.Thenunderlocalalternativehypothesis
H
1
:
H
f

0
(
t
)
g
=
n

1
C

0

d
(
t
),wehave

n
=
k
0
B
B
B
B
B
@
~

~


0
~

1
C
C
C
C
C
A
kk
0
B
B
B
B
B
@
C
2

0

n

1
~

~


0
~

1
C
C
C
C
C
A
k
O
p
(
C

0

=n
)+
o
p

n
)
;
whichimpliesthat
n
=
O
p
(
C

0

=n
).
Thusfor
~

,wehave
~

=

QA
|
B

1
f
1
n
n
X
i
=1
g
i
(

0
)
g
+
R
H
(

0
)+
o
p
(
C

0

=n
)
=

R
C
A

1
f
1
n
n
X
i
=1
g
i
(

0
)
g
+
R
H
(

0
)+
o
p
(
C

0

=n
)
:
(2.6.20)
Accordingly,wehave
nC

1

0

R

1
=
2
f
~


R
H
(

0
)
g
d
!
N
(
0
;
I
q
)
:
Underlocalalternative
hypothesis
H
1
:
H
f

0
(
t
)
g
=
n

1
C

0

d
(
t
),wehave
nC

1

0

R

1
=
2
~

d
!
N
(
R
1
=
2
d
;
I
q
)
:
Thus2
`
(
t
)=
n
2
C
2

0

~

|
R

1
~

+
o
p
(1)
d
!
˜
2
q
(
d
|
Rd
)
:
2.6.2ProofsofPropositions
Inthissection,weprovidetheproofsforallthepropositionsinthischapterandtheexistence
oftheRMELE
~

(
t
).AnasymptoticexpressionfortheLagrangemultiplier
~

(
t
)in(2.3.13)
isalsoincluded.
29
2.6.2.1SomeUsefulLemmas
WepresentsomeusefullemmasandtheirproofsbeforetheproofsforthePropositions.
Denote

n
=

n
1
+
h
2
,

n
1
=(
d
n
log
n
nh
2
)
1
2
where
d
n
=
h
2
+
rh=m
.
Lemma1.
Underassumptions(C1)-(C3)and(C4)(i),wehave
sup
t
2
[
a;b
]
j
^

(
t
)


0
(
t
)
j
=
O
(

n
)
a:s::
Proof.
Bytheexpressionof
^

(
t
),usingaTaylorexpansion,wehave
^

(
t
)


0
(
t
)=

I
p
;
0
p

f
D
|
(
t
)
W
(
t
)
D
(
t
)
g

1
D
(
t
)
W
(
t
)
Y


0
(
t
)
=

I
p
;
0
p

n
X
i
=1
D
|
i
(
t
)
W
i
(
t
)
D
i
(
t
)


1

n
X
i
=1
D
|
i
(
t
)
W
i
(
t
)
Y
i


0
(
t
)
=

I
p
;
0
p

n
X
i
=1
D
|
i
(
t
)
W
i
(
t
)
D
i
(
t
)


1

n
X
i
=1
D
|
i
(
t
)
W
i
(
t
)[
B
i
(
t
)+

i
]

;
where
B
i
(
t
)=

(
t
i
1

t
)
2
X
|
i
1

(2)
0
(
t

i
1
)
=
2
;

;
(
t
im
i

t
)
2
X
|
im
i

(2)
0
(
t

im
i
)
=
2

|
with
t

ij
be-
tween
t
and
t
ij
and

i
=(

i
1
;
12
;:::;
im
i
)
|
:
Observethatfordenominator
I
(
t
):=
1
n
P
n
i
=1
D
|
i
(
t
)
W
i
(
t
)
D
i
(
t
),wehave
I
(
t
)=
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
0
B
@
X
ij
X
|
ij
K
h
(
t
ij

t
)
X
ij
X
|
ij
K
h
(
t
ij

t
)
t
ij

t
h
X
ij
X
|
ij
K
h
(
t
ij

t
)
t
ij

t
h
X
ij
X
|
ij
K
h
(
t
ij

t
)(
t
ij

t
h
)
2
1
C
A
:=
0
B
@
I
11
(
t
)
I
12
(
t
)
I
21
(
t
)
I
22
(
t
)
1
C
A
:
Inordertogettheuniformboundfor
I
(
t
),weuseLemma2in[LH10]for
I
ij
(
t
)
;i;j
=1
;
2.
30
For
I
11
(
t
),wehave
E
f
I
11
(
t
)
g
=
E
8
<
:
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
i
(
t
ij
)
X
|
i
(
t
ij
)
K
h
(
t
ij

t
)
9
=
;
=
E
8
<
:
E
[
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
i
(
t
ij
)
X
|
i
(
t
ij
)
K
h
(
t
ij

t
)
j
t
ij
]
9
=
;
=
E
8
<
:
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1

(
t
ij
)
K
h
(
t
ij

t
)
9
=
;
=
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
Z

(
s
)
K
h
(
s

t
)
f
(
s
)
ds
=
Z

(
s
)
K
h
(
s

t
)
f
(
s
)
ds
=
Z

(
t
+
uh
)
K
(
u
)
f
(
t
+
uh
)
du
=

(
t
)
f
(
t
)+
~
O
(
h
2
)
;
aslongas

12
<
1
whichistruebycondition(C1)and[

(
t
)
f
(
t
)]
00
isuniformlybounded
on
t
2
[
a;b
]by(C3),where
~
O
denoteuniformorderforall
t
2
[
a;b
]andalsoforthe~
o
below.
Hence,undertheconditionthat
E
n
sup
t
2
[
a;b
]
k
X
(
t
)
k

1
o
<
1
forsome5


1
<
1
,and
d

1
n
(
log
n
n
)
1

2

1
=
o
(1),whichistrueunder(C4)(i),byLemma2in[LH10],wehave
sup
t
2
[
a;b
]
k
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij
X
|
ij
K
h
(
t
ij

t
)


(
t
)
f
(
t
)
k
=
O
(

n
)
;a:s::
Bysimilarcalculationsforotherthreeterms,wehave
E
f
I
12
(
t
)
g
=
Z

(
s
)
K
h
(
s

t
)
s

t
h
f
(
s
)
ds
=
~
O
(
h
)
;
under

12
<
1
and[

(
t
)
f
(
t
)]
0
uniformlyboundedon
t
2
[
a;b
],whicharetrueunder(C1)
31
and(C3)respectively.And
E
f
I
22
(
t
)
g
=
Z

(
s
)
K
h
(
s

t
)(
s

t
h
)
2
f
(
s
)
ds
=

(
t
)
f
(
t
)

12
+
~
O
(
h
2
)
;
under[

(
t
)
f
(
t
)]
00
isuniformlyboundedon
t
2
[
a;b
]by(C3).Henceinsummary,wehave
underconditions(C1)-(C3)and(C4)(i),
I
(
t
)=
0
B
@

(
t
)
f
(
t
)+
~
O
(

n
)
~
O
(

n
1
+
h
)
~
O
(

n
1
+
h
)

(
t
)
f
(
t
)

12
+
~
O
(

n
)
1
C
A
;a:s::
Thenwehave
I

1
(
t
)=
0
B
@

(
t
)
f
(
t
)0
0

(
t
)
f
(
t
)

12
1
C
A

1
+
~
O
(

n
1
+
h
)
;a:s::
(2.6.21)
Forthenumerator
II
(
t
):=
1
n
P
n
i
=1
D
|
i
(
t
)
W
i
(
t
)
B
i
(
t
),wehave
II
(
t
)=
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
0
B
B
B
@
X
ij
X
|
ij
(
t
ij

t
)
2

(2)
0
(
t

ij
)
2
K
h
(
t
ij

t
)
X
ij
X
|
ij
(
t
ij

t
)
3
h

(2)
0
(
t

ij
)
2
K
h
(
t
ij

t
)
1
C
C
C
A
:=
0
B
@
II
1
(
t
)
II
2
(
t
)
1
C
A
:
Similarasthedenominator,undertheconditionthat

0
(
t
)hascontinuoussecondderivative
32
on
t
2
[
a;b
](C3),wehave
E
f
II
1
(
t
)
g
=
E
8
<
:
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij
X
|
ij
(
t
ij

t
)
2

(2)
0
(
t

ij
)
2
K
h
(
t
ij

t
)
9
=
;
=
E
8
<
:
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij
X
|
ij
(
t
ij

t
h
)
2
K
h
(
t
ij

t
)
9
=
;
~
O
(
h
2
)
=

(
t
)
f
(
t
)

12
~
O
(
h
2
)=
~
O
(
h
2
)
if

12
<
1
bycondition(C1)and

(
t
)
f
(
t
)uniformlyboundedon
t
2
[
a;b
](C3),and
E
f
II
2
(
t
)
g
=
E
8
<
:
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij
X
|
ij
(
t
ij

t
)
3
h

(2)
0
(
t

ij
)
2
K
h
(
t
ij

t
)
9
=
;
=
E
8
<
:
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij
X
|
ij
(
t
ij

t
h
)
3
K
h
(
t
ij

t
)
9
=
;
~
O
(
h
3
)
=[

(
t
)
f
(
t
)]
0

14
~
O
(
h
3
)=
~
O
(
h
3
)
if

14
<
1
(C1)and[

(
t
)
f
(
t
)]
0
uniformlyboundedon
t
2
[
a;b
](C3).
ByLemma2in[LH10],underthecondition
E
n
sup
t
2
[
a;b
]
k
X
(
t
)
k

1
o
<
1
forsome
5


1
<
1
,and
d

1
n
(
log
n
n
)
1

2

1
=
o
(1)whichistrueunder(C4)(i),wecanhave
sup
t
2
[
a;b
]
k
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij
X
|
ij
(
t
ij

t
)
2

(2)
0
(
t

ij
)
2
K
h
(
t
ij

t
)
k
=
h
2
O
(

n
1
+1)
;a:s:;
and
sup
t
2
[
a;b
]
k
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij
X
|
ij
(
t
ij

t
)
3
h

(2)
0
(
t

ij
)
2
K
h
(
t
ij

t
)
k
=
h
2
O
(

n
1
+
h
)
;a:s::
33
Notethat
III
(
t
):=
1
n
n
X
i
=1
D
|
i
(
t
)
W
i
(
t
)

i
=
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
0
B
@
X
ij

ij
K
h
(
t
ij

t
)
X
ij

ij
K
h
(
t
ij

t
)
t
ij

t
h
1
C
A
:
Similarly,bycondition(C2)and(C3),wehavethefollowingduetoLemma2in[LH10]
sup
t
2
[
a;b
]
k
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij

ij
K
h
(
t
ij

t
)
k
=
O
(

n
1
)
;a:s:;
and
sup
t
2
[
a;b
]
k
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij

ij
K
h
(
t
ij

t
)
t
ij

t
h
k
=
O
(

n
1
)
;a:s::
Thuswehave
^

(
t
)


0
(
t
)=

I
p

p
;
0
p

p

0
B
@

(
t
)
f
(
t
)0
0

(
t
)
f
(
t
)

12
1
C
A

1

8
>
<
>
:
h
2
0
B
@
~
O
(

n
1
+1)
~
O
(

n
1
+
h
)
1
C
A
+
0
B
@
~
O
(

n
1
)
~
O
(

n
1
)
1
C
A
9
>
=
>
;
=
h
2
~
O
(

n
1
+1)+
~
O
(

n
1
)=
~
O
(

n
)
;a:s:;
since

n
=

n
1
+
h
2
:
Lemma2.
Underconditions(C1)-(C3)and(C4)(i),wehave
E
(
g
i
f

0
(
t
)
g
)=
~
O
(
h
4
)
and
Var
(
g
i
f

0
(
t
)
g
)=
ˆ
1
m
i
h

(
t

t
)
f
(
t
)

20
+
m
i

1
m
i

(
t

t
)
f
2
(
t
)
˙
f
1+~
o
(1)
g
:
34
Proof.
Bytheof
g
i
f

0
(
t
)
g
,wedecompose
g
i
f

0
(
t
)
g
asthefollowingtwoparts
g
i
f

0
(
t
)
g
=
m

1
i
m
i
X
j
=1
X
ij
X
|
ij
f
[
^

(
t
)


0
(
t
)]

[
^

(
t
ij
)


0
(
t
ij
)]
g
K
h
(
t
ij

t
)
+
m

1
i
m
i
X
j
=1
X
ij

ij
K
h
(
t
ij

t
):=
L
1
i
(
t
)+
˘
i
(
t
)
:
Toanalyzetheterm
L
1
i
(
t
)intheaboveexpression,wefurtherobtaintheexpansion
for
^

(
t
)


0
(
t
)inthefollowing.Bytheexpressionof
^

(
t
)andaTaylorexpansion,weobtain
^

(
t
)


0
(
t
)=

I
p

p
;
0
p

p

n
n

1
n
X
i
=1
D
|
i
(
t
)
W
i
(
t
)
D
i
(
t
)
o

1

n
n

1
n
X
i
=1
D
|
i
(
t
)
W
i
(
t
)(
B
i
(
t
)+
T
i
(
t
)+

i
)
o
;
where
B
i
(
t
)=
1
2
(
X
|
i
1

(2)
0
(
t
)(
t
i
1

t
)
2
;

;
X
|
im
i

(2)
0
(
t
)(
t
im
i

t
)
2
)
|
and
T
i
(
t
)=
1
6
(
X
|
i
1

(3)
0
(
t

i
1
)(
t
i
1

t
)
3
;

;
X
|
im
i

(3)
0
(
t

im
i
)(
t
im
i

t
)
3
)
|
with
t

ij
isbetween
t
and
t
ij
.Itthenfollowsthat
(
^

(
t
)


0
(
t
))

(
^

(
t
ij
)


0
(
t
ij
))=
1
n
n
X
k
=1
1
m
k
m
k
X
l
=1
n

1
;kl
(
t
)


1
;kl
(
t
ij
)
+(

2
;kl
(
t
)


2
;kl
(
t
ij
))+(

3
;kl
(
t;t

1
)


3
;kl
(
t;t

2
))
o
f
1+~
o
p
(1)
g
35
where
t

1
isbetween
t
and
t
kl
and
t

2
isbetween
t
ij
and
t
kl
and

1
;kl
(
t
)=
f

1
(
t
)


1
(
t
)
X
kl

kl
K
h
(
t
kl

t
)

2
;kl
(
t
)=
1
2
f

1
(
t
)


1
(
t
)
X
kl
X
|
kl
(
t
kl

t
)
2

(2)
0
(
t
)
K
h
(
t
kl

t
)

3
;kl
(
t;t

)=
1
6
f

1
(
t
)


1
(
t
)
X
kl
X
|
kl
(
t
kl

t
)
3

(3)
0
(
t

)
K
h
(
t
kl

t
)
:
Thenwecanwrite
L
1
i
(
t
)=
f
I
1
i
(
t
)+
I
2
i
(
t
)+
I
3
i
(
t
)
gf
1+~
o
p
(1)
g
where
I
1
i
(
t
)=
1
m
i
m
i
X
j
=1
1
n
n
X
k
=1
1
m
k
m
k
X
l
=1
X
ij
X
|
ij

1
;kl
(
t
)
K
h
(
t
ij

t
)

1
m
i
m
i
X
j
=1
1
n
n
X
k
=1
1
m
k
m
k
X
l
=1
X
ij
X
|
ij

1
;kl
(
t
ij
)
K
h
(
t
ij

t
):=
I
11
;i
(
t
)

I
12
;i
(
t
)
;
I
2
i
(
t
)=
1
m
i
m
i
X
j
=1
1
n
n
X
k
=1
1
m
k
m
k
X
l
=1
X
ij
X
|
ij

2
;kl
(
t
)
K
h
(
t
ij

t
)

1
m
i
m
i
X
j
=1
1
n
n
X
k
=1
1
m
k
m
k
X
l
=1
X
ij
X
|
ij

2
;kl
(
t
ij
)
K
h
(
t
ij

t
):=
I
21
;i
(
t
)

I
22
;i
(
t
)and
I
3
i
(
t
)=
1
m
i
m
i
X
j
=1
1
n
n
X
k
=1
1
m
k
m
k
X
l
=1
X
ij
X
|
ij
f

3
;kl
(
t;t

1
)


3
;kl
(
t
ij
;t

2
)
g
K
h
(
t
ij

t
)
:
For
I
1
i
(
t
),wehave
E
f
I
1
i
(
t
)
g
=0and
Var
f
I
11
i
(
t
)
g
=
n
1
n
2
m
i
h
2
X
k
6
=
i
1
m
k

1
(
t

t
)

2
20
+2
m
i

1
n
2
m
i
h
X
k
6
=
i
1
m
k

1
(
t

t
)
f
(
t
)

20
+
m
i

1
n
2
m
i
X
k
6
=
i
m
k

1
m
k

1
(
t

t
)
f
2
(
t
)
o
f
1+~
o
(1)
g
where

1
(
t
)=
E
f
X
i
(
t
)
X
|
i
(
t
)


1
(
t
)
X
i
(
t
)
X
|
i
(
t
)
g
.Theleadingordervarianceof
I
12
i
(
t
)is
36
thesameasthatof
I
11
i
(
t
).Insummary,wehaveVar
f
I
1
i
(
t
)
g
=
~
O
(
1
nm
2
h
2
)
1
(

0
<
1
)+
~
O
(
1
n
)
1
(

0
=
1
).
Bycondition(C3),

0
(
t
)hascontinuousthirdderivative,and

(
t
)
f
(
t
),[

(
t
)
f
(
t
)]
0
,[

(
t
)
f
(
t
)]
00
,


1
(
t
),
f
(
t
),
f

1
(
t
)areuniformlyboundedon[
a;b
],wehave
E
f
I
21
;i
(
t
)
g
=
ˆ
1
2
n

1
(
t
)+
n

1
2
n

(
t
)
˙
f
(
t
)

(2)
0
(
t
)

12
h
2
+
~
O
(
h
4
)and
E
f
I
22
;i
(
t
)
g
=
ˆ
1
2
n

1
(
t
)+
n

1
2
n

(
t
)
˙
f
(
t
)

(2)
0
(
t
)

12
h
2
+
~
O
(
h
4
)
:
Therefore
E
f
I
2
;i
(
t
)
g
=
E
f
I
21
;i
(
t
)
g
E
f
I
22
;i
(
t
)
g
=
~
O
(
h
4
).Toevaluatethevarianceof
I
2
i
(
t
),
weevaluatethevarianceof
I
21
;i
(
t
).Notethat
(
nm
i
)
2
I
21
;i
(
t
)
I
|
21
;i
(
t
)
=
1
m
2
i
m
i
X
j
1
;j
2
=1
m
i
X
l
1
;l
2
=1
X
ij
1
X
|
ij
1

2
;il
1
(
t
)
K
h
(
t
ij
i

t
)
X
ij
2
X
|
ij
2

2
;il
2
(
t
)
K
h
(
t
ij
2

t
)
+
n
X
k
(
6
=
i
)=1
m
i
X
j
1
;j
2
=1
m
k
X
l
1
;l
2
=1
1
m
2
k
X
ij
1
X
|
ij
1

2
;kl
1
(
t
)
K
h
(
t
ij
i

t
)
X
ij
2
X
|
ij
2

2
;kl
2
(
t
)
K
h
(
t
ij
2

t
)
+
n
X
(
k
1
6
=
k
2
)=1
m
i
X
j
1
;j
2
=1
m
k
1
X
l
1
=1
m
k
2
X
l
2
=1
1
m
k
1
m
k
2
X
ij
1
X
|
ij
1

2
;k
1
l
1
(
t
)
K
h
(
t
ij
i

t
)

X
ij
2
X
|
ij
2

2
;k
2
l
2
(
t
)
K
h
(
t
ij
2

t
)
:=(
nm
i
)
2
f
J
1
(
t
)+
J
2
(
t
)+
J
3
(
t
)
g
:
Let

2
(
t
)=
E
f
X
i
(
t
)
X
|
i
(
t
)
X
i
(
t
)
X
|
i
(
t
)
g
.Itiseasytoseethatthedominanttermof
37
E
f
I
21
;i
(
t
)
I
|
21
;i
(
t
)
g
is
E
f
J
3
(
t
)
g
.Carefulderivationshowsthat,uptoascaleconstant,
E
f
J
3
(
t
)
g
=
n
1
m
i

2
(
t
)
f
(
t
)

2
12

20
h
3
+

2
(
t
)
f
2
(
t
)

2
12
h
4
o
f
1+~
o
(1)
g
:
SimilarderivationshowsthatVar
f
I
22
;i
(
t
)
g
isofthesameorderasVar
f
I
21
;i
(
t
)
g
.Therefore,
insummary,wehaveVar
f
I
2
;i
(
t
)
g
=
~
O
(
h
3
=m
)
1
(

0
<
1
)+
~
O
(
h
4
)
1
(

0
=
1
).For
I
3
;i
(
t
),it
canbeshownthat
E
f
I
3
;i
(
t
)
g
=
~
O
(
h
4
)and
Var
f
I
3
;i
(
t
)
g
=
~
O
(
h
6
=m
)
1
(

0
<
1
)+
~
O
(
h
7
)
1
(

0
=
1
)
:
Finally,weevaluatetheorderof
˘
i
(
t
).Itisclearthat
E
f
˘
i
(
t
)
g
=0and
Var
f
˘
i
(
t
)
g
=
n
1
m
i
h

(
t

t
)
f
(
t
)

20
+
m
i

1
m
i

(
t

t
)
f
2
(
t
)
o
f
1+~
o
(1)
g
:
(2.6.22)
Insummary,
E
(
g
i
f

0
(
t
)
g
)=
~
O
(
h
4
)andbycomparingthevarianceof
˘
i
(
t
)tothevariances
of
I
1
;i
(
t
)to
I
3
;i
(
t
),wehaveVar(
g
i
f

0
(
t
)
g
)=Var
f
˘
i
(
t
)
gf
1+~
o
(1)
g
.Thiscompletesthe
proofofthisLemma.
Lemma3.
Underconditions(C1)-(C4),wehavefortrue

0
(
t
)
C

1

0

n
X
i
=1
g
i
f

0
(
t
)
g
d
!
N
(
0
;
B
(
t
))
;
where
C

0

and
B
(
t
)
aredinProposition2inSection2.2.
Proof.
Let
˘
i
(
t
):=
m

1
i
P
m
i
j
=1
X
ij

ij
K
h
(
t
ij

t
)andusingtheproofofLemma2,
g
i
f

0
(
t
)
g
=
˘
i
(
t
)
f
1+~
o
p
(1)
g
+
~
O
p
(
h
4
)
;
(2.6.23)
38
and
V
i
(
t
):=Var
f
˘
i
(
t
)
g
=
~
O
f
(
mh
)

1
g
1
(

0
<
1
)+
~
O
f
1
g
1
(

0
=
1
)
:
Wewillshowthattheasymptoticnormalityof
P
n
i
=1
g
i
f

0
(
t
)
g
isthesameastheasymp-
toticnormalityof
P
n
i
=1
˘
i
(
t
).
Firstconsiderthecase

0
<
1
,i.e.
mh
!
[0
;
1
),with(2.6.23)andcondition(C4),we
have
(
mh
)
1
=
2
p
n
n
X
i
=1
g
i
f

0
(
t
)
g
=
(
mh
)
1
=
2
p
n
n
X
i
=1
˘
i
(
t
)+~
o
p
(1)
:
(2.6.24)
Asabove,wecancheckthat
E
n
(
mh
)
1
=
2
P
n
i
=1
˘
i
(
t
)
=
p
n
o
=0and
Var
n
(
mh
)
1
=
2
p
n
n
X
i
=1
˘
i
(
t
)
o
=
1
n
n
X
i
=1
[
m
m
i

20
+
m
i

1
m
i
f
(
t
)]

(
t

t
)
f
(
t
)
f
1+~
o
(1)
g
!
[
r
20
+

0
f
(
t
)]

(
t

t
)
f
(
t
)=
B
(
t
)
:
Next,weconsiderthecase

0
=
1
,i.e.
mh
!1
.Againby(2.6.23)andcondition(C4)
1
p
n
n
X
i
=1
g
i
f

0
(
t
)
g
=
1
p
n
n
X
i
=1
˘
i
(
t
)+~
o
p
(1)
:
(2.6.25)
Similarly,itcanbecheckedthat
E
n
P
n
i
=1
˘
i
(
t
)
=
p
n
o
=0and
Var
n
n
X
i
=1
˘
i
(
t
)
=
p
n
o
=
1
n
n
X
i
=1
m
i

1
m
i

(
t

t
)
f
2
(
t
)
f
1+~
o
(1)
g
!
f
2
(
t
)

(
t

t
)=
B
(
t
)
:
Toshowtheasymptoticnormalityunderbothcases,applyingthecramer-wolddevice,it
isenoughtoshowtheasymptoticnormalityof
P
n
i
=1

|
˘
i
(
t
)
=C

0

forany

2
R
p
atany
39
timepoint
t
.ItremainstochecktheLyapunovcondition.Tothisend,notethat
s
2
n
=Var
f
n
X
i
=1

|
˘
i
(
t
)
g
=
n
X
i
=1

|
V
i

˘
C
2

0

:
Andontheotherhand,for
m
!1
,
n
X
i
=1
E


|
˘
i
(
t
)

2+

0
o
=
n
X
i
=1
E

m

1
i
m
i
X
j
=1

|
X
ij

ij
K
h
(
t
ij

t
)

2+

0
o

C
n
X
i
=1
E
f
sup
t
j

|
X
(
t
)
j
2+

0
g
E
f
sup
t
j

(
t
)
j
2+

0
g˘
n
bytaking

2
=2+

0
intheassumption(C2).Thuswehave
1
s
2+

0
n
n
X
i
=1
E


|
˘
i
(
t
)

2+

0
o
˘
n
n
1+

0
=
2
!
0
;n
!1
:
Andsimilarly,for
m
isbounded,
n
X
i
=1
E


|
˘
i
(
t
)

2+

0
o

Cn
h
2+

0
E
f
sup
t
j

|
X
(
t
)
j
2+

0
g
E
f
sup
t
j

(
t
)
j
2+

0
g˘
n=h
2+

0
:
Then,itfollowsthat
1
s
2+

0
n
n
X
i
=1
E


|
˘
i
(
t
)

2+

0
o
˘
n=h
2+

0
(
n=h
)
2+

0
2
=
1
n
(

0

2

0


0

0
)
=
2
:
Theaboveratiogoesto0ifandonlyif

0
<
0
=
2+

0
.Bytaking

2
=2+

0
,thiscondition
isequivalentto

0
<
0
=
2+

0
=1

2

2
.Byassumption(C4),thisconditionis
because

0
<
1

2
<
1

2

2
.ThiscompletestheproofofthisLemma.
Lemma4.
Underassumptions(C1)-(C4),andforeach
t
2
[
a;b
]
underthenullhypothesis
40
H
0
:
H
f

0
(
t
)
g
=0
,wehave
2
`
(
t
)
d
!
˜
2
q
:
Proof.
First,forconveniencewesuppresstheargument
t
inthefunctions

(
t
),
~

(
t
)and
A
(
t
),sincewe
t
2
[
a;b
]inthisproof.Theproofissimilartothatin[QL95].
Weobtaintheirderivativeswithrespecttothethreevariables

;

and

.
@Q
1
n
(

;

)
@

|
=
1
n
n
X
i
=1
@g
i
(

)
@

|
(1+

|
(

)
g
i
(

))

g
i
(

)

|
@g
i
(

)
@

|
(1+

|
(

)
g
i
(

))
2
;
@Q
1
n
(

;

)
@

|
=

1
n
n
X
i
=1
g
i
(

)
g
|
i
(

)
(1+

|
(

)
g
i
(

))
2
;
@Q
1
n
(

;

)
@

|
=0
;
@Q
2
n
(

;

;

)
@

|
=
1
n
n
X
i
=1
@
2
g
|
i
(

)
@

|
@


(1+

|
(

)
g
i
(

))

@g
|
i
(

)
@


|
@g
i
(

)
@

|
(1+

|
(

)
g
i
(

))
2
+
@C
|
(

)
@

|

;
@Q
2
n
(

;

;

)
@

|
=
1
n
n
X
i
=1
@g
|
i
(

)
@

|

@g
|
i
(

)
@

|

g
|
i
(

)
(1+

|
(

)
g
i
(

))
2
;
@Q
2
n
(

;

;

)
@

|
=
C
|
(

)
;
@H
(

)
@

|
=
C
(

)
;
@H
(

)
@

|
=0
;
@H
(

)
@

|
=0
:
Hence,wehavethefollowingTaylorexpansionsofthesystemofequationsat(

0
;
0
;
0).Let
41

n
=
k
~


0
k
+
k
~

k
+
k
~

k
.
0=
Q
1
n
(
~

;
~

;
~

)
=
Q
1
n
(

0
;
0
;
0)+
@Q
1
n
(

0
;
0
;
0)
@

|
(
~


0
)+
@Q
1
n
(

0
;
0
;
0)
@

|
(
~


0)
+
@Q
1
n
(

0
;
0
;
0)
@

|
(
~


0)+
o
p

n
)
=
1
n
n
X
i
=1
g
i
(

0
)+
1
n
n
X
i
=1
@g
i
(

0
)
@

|
(
~


0
)

1
n
n
X
i
=1
g
i
(

0
)
g
|
i
(

0
)
~

+
o
p

n
)
;
0=
Q
2
n
(
~

;
~

;
~

)
=
Q
2
n
(

0
;
0
;
0)+
@Q
2
n
(

0
;
0
;
0)
@

|
(
~


0
)+
@Q
2
n
(

0
;
0
;
0)
@

|
(
~


0)
+
@Q
2
n
(

0
;
0
;
0)
@

|
(
~


0)+
o
p

n
)
=
1
n
n
X
i
=1
@g
|
i
(

0
)
@

~

+
C
|
(

0
)
~

+
o
p

n
)
;
and0=
H
(
~

)=
H
(

0
)+
C
(

0
)(
~


0
)+
o
p

n
)=
C
(

0
)(
~


0
)+
o
p

n
)
:
Puttingthe
aboveequationsintoamatrixform,weobtain
0
B
B
B
B
B
@

n

1
P
n
i
=1
g
i
(

0
)+
o
p

n
)
o
p

n
)
o
p

n
)
1
C
C
C
C
C
A
=

n
0
B
B
B
B
B
@
C
2

0

n

1
~

~


0
~

1
C
C
C
C
C
A
:
where

n
=
0
B
B
B
B
B
@

C

2

0

P
n
i
=1
g
i
(

0
)
g
|
i
(

0
)
n

1
P
n
i
=1
@g
i
(

0
)
@

|
0
n

1
P
n
i
=1
@g
|
i
(

0
)
@

0
C
|
(

0
)
0
C
(

0
)0
1
C
C
C
C
C
A
:
42
Thenwehave

n
P
!

=
0
B
B
B
B
B
@

BA
0
A
0
C
|
0
C
0
1
C
C
C
C
C
A
:
Bycalculation,wehave


1
=
0
B
B
B
B
B
@

B

1
+
B

1
APAB

1
B

1
APB

1
AQ
|
PAB

1
PQ
|
QAB

1
Q

R
1
C
C
C
C
C
A
;
where
P
=
V
(
I

C
|
Q
),
R
=(
CVC
|
)

1
,
Q
=
RCV
;
V
=(
AB

1
A
)

1
:
Thuswehave
thefollowing
0
B
B
B
B
B
@
C
2

0

n

1
~

~


0
~

1
C
C
C
C
C
A
=


1
0
B
B
B
B
B
@

n

1
P
n
i
=1
g
i
(

0
)
0
0
1
C
C
C
C
C
A
+
o
p

n
)
Bythis,wecouldoutthat

n
=
k
0
B
B
B
B
B
@
~

~


0
~

1
C
C
C
C
C
A
kk
0
B
B
B
B
B
@
C
2

0

n

1
~

~


0
~

1
C
C
C
C
C
A
k
=
k


1
0
B
B
B
B
B
@
1
0
0
1
C
C
C
C
C
A
(

1
n
n
X
i
=1
g
i
(

0
)
)
+
o
p

n
)
k
O
p
(
C

0

=n
)+
o
p

n
)
;
43
whichimpliesthat
n
=
O
p
(
C

0

=n
).
Insummaryoftheaboveresults,wehave
0
B
B
B
B
B
@
C
2

0

n

1
~

~


0
~

1
C
C
C
C
C
A
=
0
B
B
B
B
B
@

B

1
+
B

1
APAB

1
PAB

1
QAB

1
1
C
C
C
C
C
A
(

1
n
n
X
i
=1
g
i
(

0
)
)
+
o
p
(
C

0

=n
)
:
(2.6.26)
Thuswehavetheasymptoticexpressionfor
~

,
~

=

RCA

1
f
1
n
n
X
i
=1
g
i
(

0
)
g
+
o
p
(
C

0

=n
)
:
(2.6.27)
Fortheasymptoticexpressionof
~


0
,(2.6.26)togetherwith(2.6.27)gives
~


0
=[

A

1
+
VC
|
RCA

1
]
f
1
n
n
X
i
=1
g
i
(

0
)
g
+
o
p
(
C

0

=n
)
=

A

1
f
1
n
n
X
i
=1
g
i
(

0
)
g
+
VC
|
RCA

1
f
1
n
n
X
i
=1
g
i
(

0
)
g
+
o
p
(
C

0

=n
)
=

A

1
f
1
n
n
X
i
=1
g
i
(

0
)
g
VC
|
~

+
o
p
(
C

0

=n
)
:
(2.6.28)
Usingtheexpressionof

in(2.6.40)andtheaboveasymptoticexpressionfor
~


0
,
44
theempiricallog-likelihoodratiostatisticcanbewrittenas
2
`
(
t
)=2
n
X
i
=1
~

|
g
i
(
~

)

n
X
i
=1
~

|
g
i
(
~

)
g
|
i
(
~

)
~

+
o
p
(1)
=
n
(
1
n
n
X
i
=1
g
|
i
(
~

))
n
C
2

0

B

1
(
1
n
n
X
i
=1
g
i
(
~

))+
o
p
(1)
=
n
2
C
2

0

~

|
CVAB

1
AVC
|
~

+
o
p
(1)=
n
2
C
2

0

~

|
R

1
~

+
o
p
(1)
:
By(2.6.27),wehave
2
`
(
t
)=
1
C
2

0

f
n
X
i
=1
g
i
(

0
)
g
|
A

1
C
|
RCA

1
f
n
X
i
=1
g
i
(

0
)
g
+
o
p
(1)
:
(2.6.29)
Weseethat
E
(
R
1
=
2
CA

1
f
P
n
i
=1
g
i
(

0
))=0andas
n
!1
,
C

1

0

Var

R
1
=
2
CA

1
f
n
X
i
=1
g
i
(

0
)
g

!
R
1
=
2
CA

1
BA

1
C
|
R
1
=
2
=
R
1
=
2
C
f
ABA
g

1
C
|
R
1
=
2
=
R
1
=
2
CVC
|
R
1
=
2
=
R
1
=
2
R

1
R
1
=
2
=
I
q

q
Thus,bycentrallimittheorem,wehave
R
1
=
2
CA

1
f
C

1

0

P
n
i
=1
g
i
(

0
)
g
d
!
N
(
0
;
I
q
)
:
Then
by(2.6.29),wehave2
`
(
t
)
d
!
˜
2
q
.
Denote


n
=(
d
n
log
n
nh
2
)
1
2


+
h
2


forsome0
<<
1
6
.
Lemma5.
Underassumptions(C1)-(C3)and(C4)(i),wehavethesolution


(
t
)
tothe
estimatingequation(2.2.7)
(a)
sup
t
2
[
a;b
]
k


(
t
)


0
(
t
)
k
=
O
(

n
1
+
h
4
)
;a:s::
(b)
Andforeach
t
2
[
a;b
]
,inthesphere
n

(
t
):sup
t
2
[
a;b
]
k

(
t
)


0
(
t
)
k


n
o
,where
45

0
(
t
)
isthetrueparameter,wehave
2
`
(
t
)=
n
2
C

2

0

H
|
f


(
t
)
g
R
(
t
)
H
f


(
t
)
g
+
o
p
(
nh
4
=C

0

)
:
Proof.
Weprove(a).Usingtheestimatingequation(2.3),oneobtain
0=
1
n
n
X
i
=1
g
i
f


(
t
)
g
=
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1

ij
X
ij
K
h
(
t
ij

t
)
+
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1

|

;ij
(
t
)
X
ij
X
ij
K
h
(
t
ij

t
)
;
where

;ij
(
t
)=[
^

(
t
)


0
(
t
)]

[


(
t
)


0
(
t
)]

[
^

(
t
ij
)


0
(
t
ij
)]
Itfollowsthat
n
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
|
ij
X
ij
K
h
(
t
ij

t
)
o
[


(
t
)


0
(
t
)]
=
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1

ij
X
ij
K
h
(
t
ij

t
)
+
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
n
[
^

(
t
)


0
(
t
)]

[
^

(
t
ij
)


0
(
t
ij
)]
o
|
X
ij
X
ij
K
h
(
t
ij

t
)=
g
n
f

0
(
t
)
g
;
(2.6.30)
Sincewehave
g
n
f

0
(
t
)
g
=
~
O
p
(

n
1
+
h
4
),andwealsoknowthatfromtheproofofLemma
1,
sup
t
2
[
a;b
]
k
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij
X
|
ij
K
h
(
t
ij

t
)


(
t
)
f
(
t
)
k
=
O
(

n
)
;a:s::
Thus(2.6.30)givessup
t
2
[
a;b
]
k


(
t
)


0
(
t
)
k
=
O
(

n
1
+
h
4
)
;a:s:
.Thiscompletestheproof
ofpart(a).
46
For(b),wehavethefollowingTaylorexpansionfor
1
n
P
n
i
=1
g
i
f


(
t
)
g
by(a)foreach
t
2
[
a;b
],wehave
k


(
t
)


0
(
t
)
k
=
O
p
(
C

0

=n
+
h
4
)
0=
1
n
n
X
i
=1
g
i
f


(
t
)
g
=
1
n
n
X
i
=1
g
i
f

0
(
t
)
g
+
1
n
n
X
i
=1
@g
i
f

0
(
t
)
g
@

|
(
t
)
[


(
t
)


0
(
t
)]+
o
p
(
C

0

=n
+
h
4
)
=
1
n
n
X
i
=1
g
i
f

0
(
t
)
g
+
A
(
t
)[


(
t
)


0
(
t
)]+
o
p
(
C

0

=n
+
h
4
)
;
(2.6.31)
whichgives


(
t
)


0
(
t
)=

A

1
(
t
)
f
1
n
n
X
i
=1
g
i
f

0
(
t
)
gg
+
o
p
(
C

0

=n
+
h
4
)
:
(2.6.32)
TheTaylorexpansionfor
H
f


(
t
)
g
around

0
(
t
)canbeexpressedasfollowsbypluggingin
(2.6.32)
H
f


(
t
)
g
=
H
f

0
(
t
)
g
+
C
(
t
)[


(
t
)


0
(
t
)]+
o
p
(
C

0

=n
+
h
4
)
=
H
f

0
(
t
)
g
C
(
t
)
A

1
(
t
)
f
1
n
n
X
i
=1
g
i
f

0
(
t
)
gg
+
o
p
(
C

0

=n
+
h
4
)
=
H
f

0
(
t
)
g
+
R

1
(
t
)
~

(
t
)

H
f

0
(
t
)
g
+
o
p
(
C

0

=n
+
h
4
)
=
R

1
(
t
)
~

(
t
)+
o
p
(
C

0

=n
+
h
4
)
;
(2.6.33)
wherethesecond-to-lastequalityisduetosimilarresultas(2.6.27)forgeneral
H
f

0
(
t
)
g
.
47
ThuswecouldeasilyseefromtheproofofLemma4that
2
`
(
t
)=
n
2
C
2

0

~

|
R

1
~

+
o
p
(
nh
4
=C

0

)
=
n
2
C
2

0

H
|
f


(
t
)
g
R
(
t
)
H
f


(
t
)
g
+
o
p
(
nh
4
=C

0

)
:
2.6.2.2ProofofPropositions
Inthissection,weprovidetheproofforthePropositionsinthischapter.
ProofofProposition1.
By(2.6.32),wehave


(
t
)


0
(
t
)=

A

1
(
t
)
f
1
n
n
X
i
=1
g
i
f

0
(
t
)
gg
+
o
p
(
C

0

=n
+
h
4
)
:
AndbyLemma2,wehave
g
i
f

0
(
t
)
g
=
˘
i
(
t
)
f
1+~
o
p
(1)
g
+
~
O
p
(
h
4
)
:
Combiningthesetworesultstogether,wehave


(
t
)


0
(
t
)=

A

1
(
t
)

˘
n
(
t
)
f
1+~
o
p
(1)
g
+
~
O
p
(
h
4
).
AndforVar
f

˘
n
(
t
)
g
,from(2.6.22)intheproofofLemma2,wecaneasilyget(2.2.9)
intheproposition.
ProofofProposition2.
ByLemma3,andProposition1,andunderthebandwidthcondition
48
(
C
4)whichmakesthebiasnegligible,wehave
nC

1

0


(
t
)


0
(
t
)

d
!
N
(
0
;
V
(
t
))
where
V
(
t
)=
A

1
(
t
)
B
(
t
)
A

1
(
t
).
ProofofProposition3.
By(b)ofLemma5,wehave
2
`
(
t
)=
n
2
C
2

0

H
|
f


(
t
)
g
R
(
t
)
H
f


(
t
)
g
+
o
p
(
nh
4
=C

0

)
:
From(2.6.33),wehavethatunder
H
0
:
H
f

0
(
t
)
g
=0,
R
1
=
2
(
t
)
H
f


(
t
)
g
=

R
1
=
2
(
t
)
C
(
t
)
A

1
(
t
)
f
1
n
n
X
i
=1
g
i
f

0
(
t
)
ggf
1+~
o
p
(1)
g
=

R
1
=
2
(
t
)
C
(
t
)
A

1
(
t
)

˘
n
(
t
)
f
1+~
o
p
(1)
g
+
~
O
p
(
h
4
)
=

G
(
t
)

˘
n
(
t
)
f
1+~
o
p
(1)
g
+
~
O
p
(
h
4
)
:
By
U
n
(
t
)=
nC

1

0

G
(
t
)

˘
n
(
t
),wehave
2
`
(
t
)=
U
n
(
t
)
|
U
n
(
t
)+
O
p
(
nh
4
=C

0

)
:
2.6.2.3ExistenceofRMELEandtheasymptoticexpressionfor
~

Inthissection,westudytheexistenceofRMELE
~

(
t
)andtheorderoftheLagrange
multiplier
~

(
t
).Tothisend,


n
=(
d
n
log
n
nh
2
)
1
2


+
h
2


forsome0
<<
1
6
where
49
d
n
=
h
2
+
rh=m
.
Lemma6.
Underassumptions(C1)-(C3)and(C4)(i),inthesphere
(

(
t
):sup
t
2
[
a;b
]
k

(
t
)


0
(
t
)
k


n
)
;
(2.6.34)
where

0
(
t
)
isthetrueparameter,wehave(a)
sup
t
k
n

1
P
n
i
=1
g
i
f

(
t
)
gk
=
O
p
(


n
);
(b)
sup
t
max
i
k
g
i
f

(
t
)
gk
=
o
p
(

0

1
n
)
with

0
n
=


n
=C
2

0


n
;and(c)
lim
n
!1
P
(inf
t
C

2

0

n
X
i
=1
g
i
f

(
t
)
g
g
|
i
f

(
t
)
g
>
0)=1
:
Proof.
For(a),noticethat
1
n
P
n
i
=1
g
i
f

(
t
)
g
=
T
1
(
t
)+
T
2
(
t
)where
T
1
(
t
)=
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1

ij
X
ij
K
h
(
t
ij

t
)
and
T
2
(
t
)=
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1

|
;ij
(
t
)
X
ij
X
ij
K
h
(
t
ij

t
)
where
;ij
(
t
)=[
^

(
t
)


0
(
t
)]

[

(
t
)


0
(
t
)]

[
^

(
t
ij
)


0
(
t
ij
)].
For
T
1
(
t
),byLemma1in[LH10]fortheprocess

(
t
)
X
(
t
),underthecondition(C2),as
weprovedinLemma1,wehavesince
E
f
T
1
(
t
)
g
=0andhencesup
t
k
T
1
(
t
)
k
=
O
(


n
1
)
;a:s::
50
For
T
2
(
t
),byLemma1andtheassumptionfor

(
t
)in(2.6.34),
sup
t
k
T
2
(
t
)
k
sup
t
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
k

;ij
(
t
)
kk
X
ij
k
2
K
h
(
t
ij

t
)

(2sup
t
k
^

(
t
)


0
(
t
)
k
+sup
t
k

(
t
)


0
(
t
)
k
)

sup
t
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
k
X
ij
k
2
K
h
(
t
ij

t
)=
O
p
(


n
)
:
Thuswehavesup
t
k
n

1
P
n
i
=1
g
i
f

(
t
)
gk
=
O
p
(


n
)
:
Thistheproofforpart(a).
Forprovingpart(b),notethat,
sup
t
k
g
i
f

(
t
)
gk

sup
t
k
1
m
i
m
i
X
j
=1

ij
X
ij
K
h
(
t
ij

t
)
k
+sup
t
k
1
m
i
m
i
X
j
=1

|
;ij
(
t
)
X
ij
X
ij
K
h
(
t
ij

t
)
k

sup
t
k

i
(
t
)
X
i
(
t
)
k
sup
t
1
m
i
m
i
X
j
=1
K
h
(
t
ij

t
)
+
ˆ
2sup
t
k
^

(
t
)


0
(
t
)
k
+sup
t
k

(
t
)


0
(
t
)
k
˙
sup
t
k
X
i
(
t
)
k
2
sup
t
1
m
i
m
i
X
j
=1
K
h
(
t
ij

t
)


sup
t
k

i
(
t
)
X
i
(
t
)
k
+
C
1


n
sup
t
k
X
i
(
t
)
k
2

sup
t
1
m
i
m
i
X
j
=1
K
h
(
t
ij

t
)
:
If
m
i
'sarebounded,thenwehavesup
t
m
i

1
P
m
i
j
=1
K
h
(
t
ij

t
)=
O
p
(1
=h
).Andif
m
i
'stend
toy,thenbythetheoremin[Sil78]wehavesup
t
m
i

1
P
m
i
j
=1
K
h
(
t
ij

t
)=
O
p
(1)under
theregularityconditionsofthekernelfunctionin(C1).
51
Forthecase
m
i
'sbounded,wehave

0
n
=


n
and
sup
t
k
g
i
f

(
t
)
gk
sup
t
1
m
i
m
i
X
j
=1
K
h
(
t
ij

t
)
f
sup
t
k

i
(
t
)
X
i
(
t
)
k
+
C
1


n
sup
t
k
X
i
(
t
)
k
2
g

C
h
(sup
t
k

i
(
t
)
X
i
(
t
)
k
+


n
sup
t
k
X
i
(
t
)
k
2
)
:
Thenwehave,forany
>
0,byassumption(C4),
P
(max
1

i

n
sup
t
k
g
i
f

(
t
)
gk
>


0
n
)

n
P
ˆ
C
h
(sup
t
k

i
(
t
)
X
i
(
t
)
k
+


n
sup
t
k
X
i
(
t
)
k
2
)
>


n
˙

n
P
(sup
t
k

(
t
)
X
(
t
)
k
>

2
C

n
)+
n
P
(sup
t
k
X
i
(
t
)
k
2
>

2
C

2
n
)

n
E
f
[sup
t
k

(
t
)
X
(
t
)
k
]

g
(
2
C

n

)

+
n
E
f
[sup
t
k
X
(
t
)
k
]

1
g
(
2
C

2
n

)

1
=
2

n

(
2
C

n

)

E
f
sup
t
k

(
t
)
k

g
+(
2
C

2
n

)

1
=
2

E
f
[sup
t
k
X
(
t
)
k
]

1
g

Cn
f
(


n
)

+(


n
)

1
g
Cn
(


n
)

!
0
;
where

=min
f

1
;
2
g
.Thisimpliessup
t
max
i
k
g
i
f

(
t
)
gk
=
o
p
(

0

1
n
).
Forthecasethat
m
i
'stendtoy,wehave
sup
t
k
g
i
f

(
t
)
gk
sup
t
1
m
i
m
i
X
j
=1
K
h
(
t
ij

t
)
f
sup
t
k

i
(
t
)
X
i
(
t
)
k
+
C
1


n
sup
t
k
X
i
(
t
)
k
2
g

C
ˆ
sup
t
k

i
(
t
)
X
i
(
t
)
k
+


n
sup
t
k
X
i
(
t
)
k
2
˙
:
52
Thenwehave,forany
>
0,byassumption(C4),
P
ˆ
max
1

i

n
sup
t
k
g
i
f

(
t
)
gk
>


n
˙

Cn
n
(


n
)

+(


n
)

1
o

Cn
(


n
)

!
0
;
where

=min
f

1
;
2
g
.Thisimpliessup
t
max
i
k
g
i
f

(
t
)
gk
=
o
p
(


1
n
)=
o
p
(

0

1
n
).This
completestheproofofpart(b).
For(c),weneedtoshowthat,forany
u
2
R
p
,
lim
n
!1
P
(inf
t
C

2

0

n
X
i
=1
u
|
g
i
(

(
t
))
g
|
i
(

(
t
))
u
>
0)=1
:
(2.6.35)
Infact,notethat
C

2

0

n
X
i
=1
g
i
f

(
t
)
g
g
|
i
f

(
t
)
g
=
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
+
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1

|
;ij
(
t
)
X
ij
X
ij
X
|
il

il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
+
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1

|
;ij
(
t
)
X
il
X
ij
X
|
il

ij
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
+
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1

|
;ij
(
t
)
X
ij

|
;il
(
t
)
X
il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
=
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)+~
o
p
(1)
;
53
byLemma1,andtheassumption(2.6.34)for

(
t
).Thuswehaveforany

u
>
0,
P
(inf
t
C

2

0

n
X
i
=1
u
|
g
i
f

(
t
)
g
g
|
i
f

(
t
)
g
u
>
0)

P
(inf
t
C

2

0

n
X
i
=1
u
|
g
i
f

(
t
)
g
g
|
i
f

(
t
)
g
u
>
u
)
=
P
(inf
t
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1
u
|

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
u
+~
o
p
(1)
>
u
)

P
(inf
t
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1
u
|

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
u
>
2

u
;
j
o
p
(1)
j
<
u
)
=
P
(inf
t
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1
u
|

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
u
>
2

u
)

P
(inf
t
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1
u
|

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
u
>
2

u
;
j
o
p
(1)
j

u
)

P
(inf
t
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1
u
|

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
u
>
2

u
)

P
(
j
o
p
(1)
j

u
)
:
Nowsincelim
n
!1
P
(
j
o
p
(1)
j

u
)=0,forproving(2.6.35)weonlyneedtoprovethatfor
some

u
>
0,
lim
n
!1
P
(inf
t
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1
u
|

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
u
>
u
)=1
:
(2.6.36)
Tothisend,notethat
E
8
<
:
1
m
2
i
m
i
X
j;l
=1

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
9
=
;
=Var(
˘
i
(
t
))
=
~
O
f
(
mh
)

1
g
1
f

0
=0
g
+
~
O
(1)
1
f

0
=
1g
:
54
Bythestronglawoflargenumbers,wehave
C

2

0

n
X
i
=1
1
m
2
i
m
i
X
j;l
=1

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
!
L
(
t
)
;a:s:;
where
L
(
t
)=
r

(
t

t
)
f
(
t
)

20
1
f

0
=0
g
+

(
t

t
)
f
2
(
t
)
1
f

0
=
1g
:
Bytaking

u
=
1
2
inf
t
u
|
L
(
t
)
u
>
0,wehave
lim
n
!1
P

inf
t
1
n
n
X
i
=1
1
m
2
i
m
i
X
j;l
=1
u
|

ij

il
X
ij
X
|
il
K
h
(
t
ij

t
)
K
h
(
t
il

t
)
u
>
u

=1
:
Hence(c)isproved.
Lemma7.
Underassumptions(C1)-(C3)and(C4)(i),inthesphere
(

(
t
):sup
t
2
[
a;b
]
k

(
t
)


0
(
t
)
k


n
)
;
(2.6.37)
where

0
(
t
)
isthetrueparameter,theequation
Q
1
n
f

(
t
)
;

(
t
)
g
=0
almostsurelyhasroot

(
t
)=

f

(
t
)
g
and
sup
t
2
[
a;b
]
k

(
t
)
k
=
O
p
(

0
n
)
,where

0
n
=


n
=C
2

0


n
:
Proof.
Similartotheproofin[Owe90],let

(
t
):=
ˆ
(
t
)

(
t
)with
k

(
t
)
k
=1and
ˆ
(
t
)

0,
andthenfromtheequation
Q
1
n
f

(
t
)
;

(
t
)
g
=
1
n
X
i
g
i
f

(
t
)
g
1+

|
(
t
)
g
i
f

(
t
)
g
=0
;
wehave
ˆ
(
t
)

|
(
t
)
S
(
t
)

(
t
)
1+
ˆ
(
t
)sup
t
max
i
k
g
i
f

(
t
)
gk

1
n
j

|
(
t
)
n
X
i
=1
g
i
f

(
t
)
gj
0
;
(2.6.38)
where
S
(
t
)=
1
n
P
n
i
=1
g
i
f

(
t
)
g
g
|
i
f

(
t
)
g
.Byapplying(a)-(c)inLemma6and(2.6.38),we
55
have
ˆ
(
t
)

|
(
t
)
nC

2

0

S
(
t
)

(
t
)
f
1+
ˆ
(
t
)sup
t
max
i
k
g
i
(

(
t
))
kg
1
C
2

0

j

|
(
t
)
n
X
i
=1
g
i
f

(
t
)
gj
=
f
1+
ˆ
(
t
)~
o
p
(

0

1
n
)
g
~
O
p
(

0
n
)=
~
O
p
(

0
n
)+
ˆ
(
t
)~
o
p
(1)
;
whichimplies
ˆ
(
t
)

~
O
p
(

0
n
)

|
(
t
)
nC

2

0

S
(
t
)

(
t
)+
o
p
(1)
˘
~
O
p
(

0
n
)
;
since
S
(
t
)
˘
C
2

0

=n
uniformlyfor
t
2
[
a;b
].Namelyweprovedsup
t
k

(
t
)
k
=
O
p
(

0
n
)
:
Remark4.
For

f

0
(
t
)
g
,wehave
sup
t
k

(

0
(
t
))
k
=
O
p
(

n
=C
2

0

)
:
Thisisbecause
Lemma1,Lemma6andLemma7arestilltrueifwereplace

(
t
)
by

0
(
t
)
and


n
by

n
.This
impliesthat
sup
t
k

(

0
(
t
))
k
=
O
p
f
(log(
n
)
=nh
)
1
=
2
g
forsparsedataand
sup
t
k

(

0
(
t
))
k
=
O
p
f
(log(
n
)
=n
)
1
=
2
g
fordensedata.
Expressionfor

(
t
)
:
Fromtheequation
Q
1
n
,wehave
0=
Q
1
n
f

(
t
)
;

(
t
)
g
=
n

1
n
X
i
=1
g
i
f

(
t
)
g
1+

|
(
t
)
g
i
f

(
t
)
g
=
n

1
n
X
i
=1
g
i
f

(
t
)
g
n

1
n
X
i
=1
g
i
f

(
t
)
g
g
|
i
f

(
t
)
g

(
t
)
+
n

1
n
X
i
=1
g
i
f

(
t
)
g
[

|
(
t
)
g
i
f

(
t
)
g
]
2
1+

|
(
t
)
g
i
f

(
t
)
g
(2.6.39)
Inthefollowing,wewanttoshowthattheorderofthethirdtermis~
o
p
(

0
n
).Tothisend,we
observethat
j

|
(
t
)
g
i
f

(
t
)
gj
sup
t
max
i
k
g
i
f

(
t
)
gk
sup
t
k

(
t
)
k
=~
o
p
(


1
n
)
~
O
p
(


n
)=~
o
p
(1)
:
56
Thuswehave
n

1
n
X
i
=1
g
i
f

(
t
)
g
[

|
(
t
)
g
i
f

(
t
)
g
]
2
1+

|
(
t
)
g
i
f

(
t
)
g
˘
n

1
n
X
i
=1
g
i
f

(
t
)
g
[

|
(
t
)
g
i
f

(
t
)
g
]
2
:
Let

|
(
t
)=(

1
(
t
)
;
2
(
t
)
;

;
p
(
t
))and
g
|
i
f

(
t
)
g
=(
g
i
1
f

(
t
)
g
;

;g
ip
f

(
t
)
g
),
i
=1
;
2
;

;n
.
Then
u
-thcomponentof
n

1
P
n
i
=1
g
i
f

(
t
)
g
[

|
(
t
)
g
i
f

(
t
)
g
]
2
is
n

1
n
X
i
=1
p
X
j;k
=1

j
(
t
)

k
(
t
)
g
iu
f

(
t
)
g
g
ij
f

(
t
)
g
g
ik
f

(
t
)
g
whoseabsolutevaluecanbeboundedby
j
n

1
n
X
i
=1
p
X
j;k
=1

j
(
t
)

k
(
t
)
g
iu
f

(
t
)
g
g
ij
f

(
t
)
g
g
ik
f

(
t
)
gj

(sup
t
k

(
t
)
k
)
2
sup
t
max
i
j
g
iu
f

(
t
)
gj


n

1
n
X
i
=1
p
X
j;k
=1
g
ij
f

(
t
)
g
g
ik
f

(
t
)
g


(sup
t
k

(
t
)
k
)
2
sup
t
max
i
k
g
i
f

(
t
)
gk


n

1
n
X
i
=1
(
p
X
j
=1
g
ij
f

(
t
)
g
)
2


C
(sup
t
k

(
t
)
k
)
2
sup
t
max
i
k
g
i
f

(
t
)
gk
sup
t
p
n
n
X
i
=1
k
g
i
f

(
t
)
gk
2
=
~
O
p
f
(

0
n
)
2
g
~
o
p
(


1
n
)
~
O
p
(1)=~
o
p
(

0
n
)
;
Thismeansthatthethirdtermin(2.6.39)isoforder~
o
p
(

0
n
).Itthenfollowsthat

(
t
)=
(
n

1
n
X
i
=1
g
i
f

(
t
)
g
g
|
i
f

(
t
)
g
)

1
(
n

1
n
X
i
=1
g
i
f

(
t
)
g
)
+~
o
p
(

0
n
)
:
(2.6.40)
57
Lemma8.
Underassumptions(C1)-(C3)and(C4)(i),inthesphere
(

(
t
):sup
t
2
[
a;b
]
k

(
t
)


0
(
t
)
k


n
)
;
theequationsystem(2.3.15)almostsurelyhasrootin
U


n
=
f
(

(
t
)
;

(
t
)
;

(
t
)):sup
t
k

(
t
)


0
(
t
)+

(
t
)+

(
t
)
k


n
g
:
Andanysolutionisindeedasolutiontotheminimizationproblem(3.2).
Proof.
SincewehavealreadyprovedinLemma7thatforevery

(
t
)
2f

(
t
):sup
t
2
[
a;b
]
k

(
t
)


0
(
t
)
k


n
g
,theequation
Q
1
n
(

(
t
)
;

(
t
))=0almostsurelyhasroot

(
t
)=

(

(
t
))=
~
O
(

0
n
),weonlyhavetoprovethefollowing:
(a)
Forevery

(
t
)
2f

(
t
):sup
t
k

(
t
)


0
(
t
)
k


n
g
,

(
t
)=

f

(
t
)
g
=
~
O
(


n
)couldbe
solvedfromtheequations
Q
2
n
(

(
t
)
;

(
t
)
;

(
t
))=0.
(b)
Andtherealmostsurelyexistsasolution
~

(
t
)
2
U


n
totheequationsystem(3.3).
(c)
Anysolutionisindeedasolutiontotheminimizationproblem(3.2).
Inordertoprove(a),recalltheexpressionin(2.6.40)andtheasymptoticvariance
B
(
t
)
inLemma3,bytheuniformlystronglawoflargenumbers(SLLN),wehave
C

2

0

n
X
i
=1
g
i
f

(
t
)
g
g
|
i
f

(
t
)
g
=
B
(
t
)+~
o
p
(1)
:
58
Thus

f

(
t
)
g
=
(
n

1
n
X
i
=1
g
i
f

(
t
)
g
g
|
i
f

(
t
)
g
)

1
(
n

1
n
X
i
=1
g
i
f

(
t
)
g
)
+~
o
p
(

0
n
)
=
B

1
(
t
)
(
1
C
2

0

n
X
i
=1
g
i
f

(
t
)
g
)
+~
o
p
(

0
n
)
;
(2.6.41)
and

f

0
(
t
)
g
=
~
O
p
(

n
=C
2

0

)=~
o
p
(


n
=C
2

0

)=~
o
p
(

0
n
)
:
(2.6.42)
Wehave
@

f

(
t
)
g
@

|
(
t
)
=
B

1
(
t
)
(
C

2

0

n
X
i
=1
@g
i
f

(
t
)
g
@

|
(
t
)
)
+~
o
p
(

0
n
)
:
(2.6.43)
BecausetheuniformlySLLNgives
n

1
P
n
i
=1
@g
i
f

0
(
t
)
g
@

|
(
t
)
=
A
(
t
)+~
o
p
(1)where
A
(
t
)=

(
t
)
f
(
t
),wehavethefollowing
@

f

0
(
t
)
g
@

|
(
t
)
=
B

1
(
t
)
(
C

2

0

n
X
i
=1
@g
i
f

0
(
t
)
g
@

|
(
t
)
)
+~
o
p
(

0
n
)
=
nC

2

0

B

1
(
t
)
A
(
t
)+~
o
p
(

0
n
)
:
(2.6.44)
Let
S
f

(
t
)
g
=
n

1
P
n
i
=1
@g
|
i
f

(
t
)
g
=@

(
t
)
1+

|
f

(
t
)
g
g
i
f

(
t
)
g
,then
Q
2
n
f

(
t
)
;

(
t
)
;

(
t
)
g
=
S
f

(
t
)
g

f

(
t
)
g
+
C
|
f

(
t
)
g

(
t
)
:
(2.6.45)
59
Forthetaylorexpansionof
Q
2
n
f

(
t
)
;

(
t
)
;

(
t
)
g
at

0
(
t
),weneedthefollowing:
S
f

(
t
)
g
=
n

1
n
X
i
=1
@g
|
i
f

(
t
)
g
=@

(
t
)
1+

|
f

(
t
)
g
g
i
f

(
t
)
g
=
n

1
n
X
i
=1
@g
|
i
f

(
t
)
g
@

(
t
)
ˆ
1


|
f

(
t
)
g
g
i
f

(
t
)
g
1+

|
f

(
t
)
g
g
i
f

(
t
)
g
˙
=
n

1
n
X
i
=1
@g
|
i
f

(
t
)
g
@

(
t
)
+
~
O
p
(

0
n
)
;
(2.6.46)
whichimpliesthat
S
f

0
(
t
)
g
=
A
(
t
)+
~
O
p
(

0
n
)
:
(2.6.47)
Hencewehave
@S
f

(
t
)
g
@

|
(
t
)
=
n

1
n
X
i
=1
@
2
g
|
i
f

(
t
)
g
@

|
(
t
)
@

(
t
)
+
~
O
p
(

0
n
)
;
(2.6.48)
@S
f

0
(
t
)
g
@

|
(
t
)
=
E
@
2
g
|
i
f

0
(
t
)
g
@

|
(
t
)
@

(
t
)
+
O
p
(

0
n
):=
D
(
t
)+
~
O
p
(

0
n
)
:
(2.6.49)
Let
W
f

(
t
)
g
=
S
f

(
t
)
g

f

(
t
)
g
and
S
f

(
t
)
g
=(
S
1
;
S
2
;

;
S
p
)where
S
j
isthe
j
-th
columnof
S
f

(
t
)
g
.Thenby(2.6.42),(2.6.44),(2.6.47),(2.6.49)andtheassumptionabout
60

(
t
)wehave
W
f

(
t
)
g
=
W
f

0
(
t
)
g
+
S
f

0
(
t
)
g
@

f

0
(
t
)
g
@

|
(
t
)
(

(
t
)


0
(
t
))
+
p
X
j
=1
@
S
j
@

|
(
t
)

j
f

0
(
t
)
g
(

(
t
)


0
(
t
))+
~
O
p
f
(


n
)
2
g
=
f
A
(
t
)+
~
O
p
(

0
n
)
g
~
o
p
(

0
n
)
+
f
[
A
(
t
)+
~
O
p
(

0
n
)][
nC

2

0

B

1
(
t
)
A
(
t
)+~
o
p
(

0
n
)]
+[
D
(
t
)+
~
O
p
(

0
n
)]~
o
p
(

0
n
)
g
[

(
t
)


0
(
t
)]+
~
O
p
f
(


n
)
2
g
=
nC

2

0

A
(
t
)
B

1
(
t
)
A
(
t
)[

(
t
)


0
(
t
)]+~
o
p
(

0
n
)
:
Bypluggingtheaboveinto(2.6.45),weget
0=
nC

2

0

A
(
t
)
B

1
(
t
)
A
(
t
)[

(
t
)


0
(
t
)]+
C
|
f

(
t
)
g

(
t
)+~
o
p
(

0
n
)
:
(2.6.50)
Since
A
(
t
)
B

1
(
t
)
A
(
t
)isinvertible,bymultiplying
C
(
t
)
f
A
(
t
)
B

1
(
t
)
A
(
t
)
g

1
onbothside
of(2.6.50)wehave
0=
nC

2

0

C
(
t
)[

(
t
)


0
(
t
)]+
C
(
t
)
f
A
(
t
)
B

1
(
t
)
A
(
t
)
g

1
C
|
f

(
t
)
g

(
t
)+~
o
p
(

0
n
)
:
(2.6.51)
Fromthethirdequationoftheequationsystem(3.3),
0=
H
f

(
t
)
g
=
H
f

0
(
t
)
g
+
C
(
t
)[

(
t
)


0
(
t
)]+~
o
(

0
n
)
=
C
(
t
)[

(
t
)


0
(
t
)]+~
o
(

0
n
)
;
61
wehave
C
(
t
)[

(
t
)


0
(
t
)]=~
o
(

0
n
)
:
(2.6.52)
Combine(2.6.51)and(2.6.52),
C
(
t
)
f
A
(
t
)
B

1
(
t
)
A
(
t
)
g

1
C
|
f

(
t
)
g

(
t
)=

nC

2

0

C
(
t
)[

(
t
)


0
(
t
)]+
o
p
(

0
n
)=
o
p
(

0
n
)
;
Thatis

(
t
)=
n
C
(
t
)
f
A
(
t
)
B

1
(
t
)
A
(
t
)
g

1
C
|
f

(
t
)
g
o

1
o
p
(

0
n
)=
o
p
(

0
n
)
:
(2.6.53)
Henceweproved(a).
Forproving(b),from(2.6.50)and(2.6.53),wehave
0=
nC

2

0

A
(
t
)
B

1
(
t
)
A
(
t
)[

(
t
)


0
(
t
)]+
C
|
f

(
t
)
g

(
t
)+
o
p
(

0
n
)
=
nC

2

0

A
(
t
)
B

1
(
t
)
A
(
t
)[

(
t
)


0
(
t
)]+
o
p
(

0
n
)
;
whichimpliesthat
0=

A
(
t
)
B

1
(
t
)
A
(
t
)[

(
t
)


0
(
t
)]+
o
p
(


n
)
:
(2.6.54)
Nowconsidertheaboveequation(2.6.54)andafunction
˚
onthetheunitdiskin
R
p
by
˚


(
t
)


0
(
t
)


n

=

A
(
t
)
B

1
(
t
)
A
(
t
)[

(
t
)


0
(
t
)]+
o
p
(


n
)
:
62
Weknowthat
˚
isacontinuousfunctionontheunitdisk.Alsowehave


1
n
[

(
t
)


0
(
t
)]
|
˚


(
t
)


0
(
t
)


n

=


1
n
[

(
t
)


0
(
t
)]
|
A
(
t
)
B

1
(
t
)
A
(
t
)[

(
t
)


0
(
t
)]+
o
p
(


n
)
:
Henceonthecircle
k

(
t
)


0
(
t
)
k
=


n
,wehave


1
n
[

(
t
)


0
(
t
)]
|
˚


(
t
)


0
(
t
)


n

=


1
n
[

(
t
)


0
(
t
)]
|
A
(
t
)
B

1
(
t
)
A
(
t
)[

(
t
)


0
(
t
)]+
o
p
(


n
)


n
˝
0
(
t
)+
o
p
(


n
)
<
0
;
if
n
bigenough
;
where
˝
0
(
t
)
>
0isthesmallesteigenvalueof
A
(
t
)
B

1
(
t
)
A
(
t
),whichispositive
Thusbythelemmain[AS58],thereexistsapoint
~

(
t
)
2
U


n
and
˚
f
~

(
t
)


0
(
t
)


n
g
=0,which
means
~

(
t
)isasolutiontotheequationsystem(3.3).
Nextwehavetoprove(c).Assumingthat
~

(
t
)isasolutionin
U


n
,welet

(
t
)beapoint
inaneighborhoodof
~

(
t
)containedin
U


n
suchthat
H
f

(
t
)
g
=0and
k

(
t
)

~

(
t
)
k
>>
0.
Thenbyexpanding
l
0
f

(
t
)
g
at
~

(
t
)wehave
l
0
f

(
t
)
g
l
0
f
~

(
t
)
g
=
@l
0
f
~

(
t
)
g
@

|
(
t
)
[

(
t
)

~

(
t
)]
+
1
2
[

(
t
)

~

(
t
)]
|
@
2
l
0
f


(
t
)
g
@

(
t
)
@

|
(
t
)
[

(
t
)

~

(
t
)]
;
(2.6.55)
where


(
t
)
2
U


n
.Wewishtoshowthat
l
0
f

(
t
)
g
l
0
f
~

(
t
)
g
>
0
:
63
Next,weapproximatethetwotermsontherightsideof(2.6.55):Fortheterm,note
that
@l
0
f
~

(
t
)
g
@

|
(
t
)
=
n
X
i
=1
1
1+

|
f
~

(
t
)
g
g
i
f
~

(
t
)
g
g
|
i
f
~

(
t
)
g
@

f
~

(
t
)
g
@

|
(
t
)
+
n
X
i
=1
1
1+

|
f
~

(
t
)
g
g
i
f
~

(
t
)
g

|
f
~

(
t
)
g
@g
i
f
~

(
t
)
g
@

|
(
t
)
=
n
X
i
=1
1
1+

|
f
~

(
t
)
g
g
i
f
~

(
t
)
g

|
f
~

(
t
)
g
@g
i
f
~

(
t
)
g
@

|
(
t
)
=
n

|
f
~

(
t
)
g
S
|
f
~

(
t
)
g
=
nW
|
f
~

(
t
)
g
:
(2.6.56)
By(2.6.45),wehave
W
|
f
~

(
t
)
g
=

~

|
(
t
)
C
f
~

(
t
)
g
:
(2.6.57)
Fromthetaylorexpansionof
H
f

(
t
)
g
at
~

(
t
),wehave
0=
H
f

(
t
)
g
H
f
~

(
t
)
g
=
C
f
~

(
t
)
g
[

(
t
)

~

(
t
)]+~
o
(


n
)
;
fromwhichwecouldobtain
C
f
~

(
t
)
g
[

(
t
)

~

(
t
)]=~
o
(


n
)
:
(2.6.58)
Thus,forthetermof(2.6.55),combining(2.6.56)-(2.6.58)wehave
@l
0
f
~

(
t
)
g
@

|
(
t
)
[

(
t
)

~

(
t
)]=
nW
|
f
~

(
t
)
g
[

(
t
)

~

(
t
)]
=

n
~

|
(
t
)
C
f
~

(
t
)
g
[

(
t
)

~

(
t
)]=

n
2
C

2

0

~
o
p
f
(


n
)
2
g
:
(2.6.59)
64
Forthesecondtermof(2.6.55),wehave
@
2
l
0
f


(
t
)
g
@

(
t
)
@

|
(
t
)
=
n
@W
|
f


(
t
)
g
@

(
t
)
=
n
ˆ
@

|
f


(
t
)
g
@

(
t
)
S
|
f


(
t
)
g
+

f


(
t
)
g
@S
|
f


(
t
)
g
@

(
t
)
˙
=
n
[
nC

2

0

A
(
t
)
B

1
(
t
)+~
o
p
(

0
n
)][
A
(
t
)+
~
O
p
(

0
n
)]
+
n
~
O
p
(

0
n
)[
D
(
t
)+
~
O
p
(

0
n
)]
=
n
f
nC

2

0

A
(
t
)
B

1
(
t
)
A
(
t
)+
~
O
p
(

0
n
)
g
:
Itfollowsthat
1
2
[

(
t
)

~

(
t
)]
|
@
2
l
0
f


(
t
)
g
@

(
t
)
@

|
(
t
)
[

(
t
)

~

(
t
)]
=
1
2
[

(
t
)

~

(
t
)]
|
n
f
nC

2

0

A
(
t
)
B

1
(
t
)
A
(
t
)+
~
O
p
(

0
n
)
g
[

(
t
)

~

(
t
)]
=
n
2
2
C
2

0

[

(
t
)

~

(
t
)]
|
A
(
t
)
B

1
(
t
)
A
(
t
)[

(
t
)

~

(
t
)]+
n
2
C
2

0

~
o
p
f
(


n
)
2
g
:
(2.6.60)
Hence,plugging(2.6.59)and(2.6.60)into(2.6.55),wehave
l
0
f

(
t
)
g
l
0
f
~

(
t
)
g
=
n
2
C
2

0

ˆ
1
2
(

(
t
)

~

(
t
))
|
A
(
t
)
B

1
(
t
)
A
(
t
)(

(
t
)

~

(
t
))+~
o
p
f
(


n
)
2
g
˙

n
2
C
2

0

(


n
)
2
f
1
2
˝
0
(
t
)+~
o
p
(1)
g
>
0
;
if
n
bigenough
;
where
˝
0
(
t
)
>
0isthesmallesteigenvalueof
A
(
t
)
B

1
(
t
)
A
(
t
),whichispositive
65
Chapter3
simultaneousempirical
likelihoodratiotestsforfunctional
linearmodelsandthephasetransition
fromsparsetodensefunctionaldata
3.1Introduction
Inthischapter,wecontinuetoconsiderthesamemodel(2.1.1)aswediscussedinChapter
2.Andweareinterestedinthesamehypothesistestingproblemasin(2.1.2),
H
0
:
H
f

0
(

)
g
=0vs
H
1
:
H
f

0
(

)
g6
=0
:
(3.1.1)
Butinsteadoftestingthecotfunctionsatapoint
t
asinChapter2,wewould
liketotestthefunctionssimultaneouslyonthewholesupport[
a;b
].
Inthischapter,weproposenonparametrictestbasedonthepointwiseempiricallikelihood
ratiotestinChapter2,totest(2.1.2)simultaneously.SinceinChapter2,weshowedthe
EL-basedpointwisetestsenjoyaniceself-normalizingpropertysuchthatbothsparseand
densefunctionaldatacanbetreatedunderaframework,thesimultaneoustesting
66
proceduretobedevelopedherecanalsotreatalltypesoffunctionaldatawithdt
densenessinaway.
Toinvestigatethepowerofthetests,weconsiderthesamelocalalternatives(2.1.3)as
inChapter2fortheentirefunctions

0
(

)simultaneously
H
1
n
:
H
f

0
(

)
g
=
b
n
d
(

)
;
(3.1.2)
Forthesparsedatawith

=0,itisalsoknownthattheELmethodusingaglobalbandwidth
h
[CZ10]candetectalternativesoforder
b
n
=
n

1
=
2
h

1
=
4
forsimultaneoustest,whichis
alsolargerthan
n

1
=
2
.SimilarlyasinthepointwisecaseinChapter2,fordensedatawith
>
0,thedetectableorder
b
n
isstilllargelyunknown.Thisleadstothesamekeyinterest
inthischapterasinthelastchapter,understandingtheof

on
b
n
.Weusethesame
principletogettheoptimal
b
n
bymaximizingthepowerofthetest(i.e.,minimizingtheorder
of
b
n
)whilecontrollingthetypeIerroratthedesiredlevel.Undersomemildconditions,we
that,forthesimultaneoustest,
b
n
islargerthan
n

1
=
2
for


1
=
16andequalsto
n

1
=
2
for
>
1
=
16.Thetransitionpoints1
=
16willbestillrefereedas

0
asinthepointwisecase
forthissimultaneoustest.Once
>
0
,withaproperlychosenbandwidth,theproposed
testscandetectasignalataparametricrate.Thisphasetransitionresultechoesthesimilar
phenomenadiscoveredby[LH10]forestimationproblems.
Therestofthechapterisorganizedasfollows.Weproposethesimultaneoustest
inSection3.2whereweinvestigatetheasymptoticdistributionsoftheteststatisticunder
boththenullandlocalalternatives,andthetransitionphasesfor
b
n
.Simulationstudies
arepresentedinSection3.3,followedbytworealdataanalysisexamples,oneforsparseand
onefordensefunctionaldata,inSection3.4.Allthetechnicaldetailsarerelegatedtothe
67
Section3.5.
3.2Asimultaneoustest
Weassumethesameregularityconditions(C1)-(C4)forkernelfunction,momentsofthe
underlyingprocesses,smoothnessoftherelatedfunctionsandtheselectionofbandwidthas
in2.2.2inChapter2.
Wenowconsiderasimultaneousteston
H
0
in(3.1.1)forall
t
2
[
a;b
].ByLemma5in
Section2.6.2inChapter2
2
`
(
t
)=
n
2
C

2

0

H
|
f


(
t
)
g
R
(
t
)
H
f


(
t
)
g
+~
o
p
(1)
:
Intuitively,2
`
(
t
)measuresthedistancebetween
H
f

0
(
t
)
g
and0atany
t
2
[
a;b
].Totest
thehypothesis(3.1.1)simultaneously,weproposeaCramer-vonMisestypeteststatistic
T
n
=
Z
b
a
2
`
(
t
)
w
(
t
)
dt;
(3.2.3)
where
w
(

)isaknownprobabilitydensityfunction.Theconstructionof
T
n
allowsusto
borrowinformationacrossthetimedomainandyieldamorepowerfultestthanthepointwise
test.Similarconstructionswereusedby[HM93]and[CZ10].Theweightfunction
w
(
t
)
isasubjectivechoiceofthepractitioner.Themostcommonlyusedweightfunctionisa
uniformdensitytoputequalweightsonallpoints,butifthereispriorknowledgeonthe
importanceofaparticularsubintervalonecanchange
w
(
t
)toputmoreweightsonthe
importantsubinterval.
68
3.2.1Nulldistributionandlocalpower
Bytheasymptoticdecompositionof2
`
(
t
)inProposition3inChapter2,weneedtoun-
derstandthecovariancestructureoftheprocess
U
n
(
t
)inordertoinvestigatethedistribution
of
T
n
.
Proposition4.
UnderConditions(C1)-(C4)and
H
0
,Cov
f
U
n
(
s
)
;
U
n
(
t
)
g
=

n
(
s;t
)

f
1+
o
p
(1)
g
where

n
(
s;t
)=
8
>
>
>
>
>
>
>
>
<
>
>
>
>
>
>
>
>
:


1
20
K
(2)
(
s

t
h
)
I
q
;
if
m
2
h
!
0
;
I
q
I
(
s
=
t
)+
mh

0
(
s;t
)
I
(
s
6
=
t
)
if
m
2
h
!1
and
mh
!
0
;

0
(
s;t
)
;
if
mh
!1
;
K
(2)
(
x
)=
R
K
(
y
)
K
(
x

y
)
dy
and

0
(
s;t
)=
G
(
s
)

(
s;t
)
G
|
(
t

s;t
)
f
(
s
)
f
(
t
)
.
Obviously,theleadingterminthecovarianceof
U
n
(
t
)istundertasymp-
toticscenarios.Inthesecondcaseintheexpressionof

n
(
s;t
),the
I
q
I
(
s
=
t
)termseemsto
dominatebutisonlynon-zeroinanareawithLebesguemeasure0;the
mh

0
(
s;t
)
I
(
s
6
=
t
)
termisnonzeroalmosteverywhereandproducestheleadingordervarianceof
T
n
inthis
case.
Supposethecovariancefunction

n
(
s;t
)hasthefollowingspectraldecomposition[Bal60]

n
(
s;t
)=
P
1
k
=1

nk
˚
nk
(
s
)
˚
|
nk
(
t
)forany
s;t
2
[
a;b
]
;
where

n
1


n
2

0aretheorderedeigenvaluesand
˚
n
1
(
t
)
;˚
n
2
(
t
)
;

arethe
associatedeigenfunctions.Theeigenfunctionsarevectorvaluedorthonormalfunctionssat-
69
isfying
R
b
a
˚
|
nk
(
t
)
˚
nl
(
t
)
w
(
t
)
dt
=

l
k
where

l
k
=1if
k
=
l
and0otherwise.Eventhough
theeigenvalues

nk
changeundertasymptoticscenarios,itiseasytoverifythat
P
1
k
=1

nk
=tr
f
R

n
(
t;t
)
w
(
t
)
dt
g
=
q
forallcasesinProposition4.Alsonotethatinthe
thirdcaseofProposition4,

n
=

0
doesnotdependon
n
andtherefore

nk


k
and
˚
nk
(
t
)

˚
k
(
t
)forall
k
.
Toestablishtheasymptoticdistributionof
T
n
,weneedalltheconditionsinChapter2
withreplacingthecondition(C4)(ii)by
(C4)(ii'):
2(1+

)
=
17
<
0
if

2
[0
;
1
=
8]and1
=
8
<
0
<
if
>
1
=
8.
Underthenullhypothesis,wecana
q
-dimensionalGaussianprocess
U
(
t
),withmean
0
0
0andcovarianceCov(
U
(
s
)
;
U
(
t
))=

n
(
s;t
),asacounterpartoftheprocess
U
n
(
t
).Wewill
showthatthelimitingdistributionof
T
n
isthesameasthatof
Z
n
=
R
b
a
U
|
(
t
)
U
(
t
)
w
(
t
)
dt
,
whichfollowsa
˜
2
-mixturedistribution.Thisresultisdescribedinthefollowingtheorem,
theproofofwhichisprovidedintheSection3.5.1.
Theorem2.
Under
H
0
in(3.1.1)andConditions(C1)-(C3),(C4)(i)and(C4)(ii'),
T
n
d
=
Z
n
f
1+
o
p
(1)
g
,where
Z
n
d
=
P
1
k
=1

nk
˜
2
1
;k
and
˜
2
1
;k
,
k
=1
;
2
;:::
,areindependentchi-
squarerandomvariableswithonedegreeoffreedom.
Remark5.
Theasymptotic
˜
2
-mixturedistributioninTheorem2isquiteentfrom
theasymptoticnormaldistributionforclassicempiricallikelihoodratiotestsforindependent
data,timeseriesorsparselongitudinaldata[CHL03,CZ10].Infact,fordensefunctional
data,ourcalculationshowsthe
E
f
(
T
n

E
T
n
)
4
g6
=3
var
2
(
T
n
)
,andhence
T
n
canbehavequite
entlyfromaGaussianvariable.However,forsparseormoderatelydensefunctional
datawith


1
=
16
,the
˜
2
-mixtureisalsoasymptoticallynormal.Thisresultiscollectedin
thefollowingcorollary,theproofofwhichisgiveninSection3.5.1.
70
Corollary1.
UnderthesameconditionsasthoseinTheorem2,if


1
=
16
,wehave
h

1
=
2
(
T
n

q
)
d
!
N
(0
;q˙
2
0
)
where
˙
2
0
=2


2
20
R
b
a
w
2
(
t
)
dt
R
2

2
f
K
(2)
(
u
)
g
2
du
.
Corollary1makesaconnectionbetweenourgeneralresultsinTheorem2withtheclassic
results.Thenulldistributionof
T
n
istundertasymptoticscenariosandmay
dependonsomeunknownquantitiessuchas

nk
,whichmakesittouseinpractice.
Inthenextsubsection,wewillproposeabootstrapmethodunanimouslyapplicabletoall
typesoffunctionaldatatoestimatethisnulldistribution.Next,westudythepowerofthe
simultaneoustestunderthelocalalternatives.
Theorem3.
Supposethatthelocalalternativehypothesisin(3.1.2)holdsandConditions
(C1)-(C3),(C4)(i)and(C4)(ii')ared.
(a)
If


1
=
16
and
b
n
=
n

1
=
2
(
m
2
h
)

1
=
4
,then
h

1
=
2
(
T
n

q
)
d
!
N
(

0
;q˙
2
0
)
;
where

0
=
R
b
a
d
|
(
t
)
R
(
t
)
d
(
t
)
w
(
t
)
dt
and
˙
2
0
isdinCorollary1.
(b)
If
1
=
16
<

1
=
8
,

0
<
2

and
b
n
=
n

1
=
2+

foranarbitrarilysmall
>
0
,then
˙

1
1
(
T
n

q

nb
2
n

0
)
d
!
N
(0
;
1)
where
˙
2
1
=4
nb
2
n
(
mh
)
2

1
and

1
=
Z
b
a
Z
b
a
d
|
(
t
)
R
1
=
2
(
t
)

0
(
t;s
)
R
1
=
2
(
s
)
d
(
s
)
w
(
t
)
w
(
s
)
dtds:
(c)
If
>
1
=
8
and
b
n
=
n

1
=
2
,let
u
k
=
R
b
a
[
R
1
=
2
(
t
)
d
(
t
)]
|
˚
k
(
t
)
w
(
t
)
dt
.Then
T
n
d
!
P
1
k
=1

k
˜
2
1
;k

u
2
k

k

:
71
WecanuseTheorem3toexaminethepowerandsizeofdetectablesignalsofthesimul-
taneoustestundertscenarios.Weusethesameprinciple(2.3.18)inChapter2to
determinetheoptimalratefor
b
n
.When


1
=
16,followingpart(a)inTheorem3,the
asymptoticpowerofthetestis
B
(
d
)=


z

+

0
=
p
q˙
0

where

0
and
˙
0
arein
Theorem3and

)istheCDFofastandardnormaldistribution.Thetesthasnontrivial
powersforsignalsofsize
b
n
=
n

1
=
2
(
m
2
h
)

1
=
4
.Undertheconstraints(C4)(i)and(C4)(ii')
on
h
,
b
n
attainsitsminimumat
h

=
n

2(1+

+

)
=
17
foranyarbitrarysmall
>
0suchthat
b
n
=
n

8(1+

)
=
17+
=
34
.Byletting

!
0,theoptimaldetectableorderis
b

n
=
n

8(1+

)
=
17
.
When1
=
16
<

1
=
8,byourcalculationsinProposition4andTheorem2thenulldistri-
butionof
T
n
isa
˜
2
mixturewithmean(
P
1
k
=1

nk
)
f
1+
o
(1)
g
=
q
f
1+
o
(1)
g
andvariance
(2
P
k

2
nk
)
f
1+
o
(1)
g
=tr
f
RR

2
n
(
s;t
)
w
(
s
)
w
(
t
)
dsdt
gf
1+
o
(1)
g
=
O
(
mh
).Therefore,the
thresholdforan

-leveltestisoftheform
q
+
c

,where
c


(2
P
k

2
nk

)
1
=
2
=
O
(
mh
)
byChebyshev'sinequality.Bypart(b)ofTheorem3,theasymptoticpoweris
B
(
d
)=


c

2
p
nb
n
mh
p

1
+

0
2
p

1
p
nb
n

!
1
;
for
b
n
=
n

1
=
2+

withanarbitrarilysmall
>
0.Thisalsomeansthatthetesthasnontrivial
powersforsignalsofsize
b

n
=
n

1
=
2
.
Similarly,thepowerofthetestundercase(c)is
B
(
d
)=
P

1
X
k
=1

k
˜
2
1
;k

u
2
k

k

>q
+
c


where
q
+
c

isthe

-thquantileof
P
1
k
=1

k
˜
2
1
;k
.Inthiscase,
B
(
d
)isaconstantaslong
as
d
(
t
)isanon-zerofunction,whichimpliesthatthetesthasanon-trivialpowerif
72
b
n
=
n

1
=
2
.Combiningparts(b)and(c),theoptimaldetectableorderofthesimultaneous
testis
b

n
=
n

1
=
2
when
>
1
=
16.
Notethattheoptimaldetectableorderforthesimultaneoustestissmallerthanthat
ofthepointwisetestweobtainedinChapter2when


1
=
8.Thisisunderstandable
becausethesimultaneoustestborrowinformationovertheentiretimedomainandismore
powerful.Boththepointwiseandsimultaneoustestscandetectsignalsofroot-
n
orderfor
densefunctionaldatawith
>
1
=
8.
3.2.2Wildbootstrapprocedure
Theasymptoticdistributionsof
T
n
aretforsparseanddensefunctionaldata,but
theboundarybetweentscenariosisonlyintheasymptoticsense,making
tasymptoticscenariosveryculttodistinguishinpractice.Tounifytheinference
procedure,weproposeawildbootstrapprocedure[Mam93].Someresidualbasedbootstrap
procedureshavealsobeenproposedin[Far97]and[ZC07]fordensefunctionaldata,but
theconsistencyofsuchprocedureswasnotinvestigated.Theproposedbootstrapprocedure
consistsofthefollowingsteps:
Step1:
Generatingbootstrapsamples
f
Y

(
b
)
ij
;t
(
b
)
ij
;
X
(
b
)
ij
g
B
b
=1
accordingtothefollowing
model:
Y

ij
=
~

|
(
t
ij
)
X
ij
+


ij
:
where
~

(
t
ij
)isthesolutionoftheestimatingequationsin(2.3.15)inChapter2.Theresid-
ualvector


i
=(


i
1
;

;

im
i
)
|
isgeneratedfroman
m
i
-dimensionalmultivariatenormal
distributionwithmean
0
andcovariance
^

i
=(
^

t
ij
;t
ik
))
m
i
j;k
=1
where
^

t;s
)isaconsistent
estimatorof
t;s
)describedinSection2.4.2inChapter2.
73
Step2:
Basedonthe
b
-thbootstrappedsample,computeabootstrappedversionof
T
n
,
denotedas
T

(
b
)
n
.
Step3:
RepeatSteps1and2alargeinteger
B
timestoobtain
B
bootstrapvalues
f
T

(
b
)
n
g
B
b
=1
andthenthe100(1


)%quantileof
f
T

(
b
)
n
g
B
b
=1
,denotedas
^
t

.Rejectthe
nullhypothesisif
T
n
>
^
t

.
ThefollowingtheoremjusttheaboveBootstrapprocedure
Theorem4.
Let
X
n
=
f
(
Y
ij
;X
ij
;t
ij
)
;j
=1
;:::;m
i
;i
=1
;:::;n
g
denotestheoriginal
dataand
L
(
T
n
)
betheasymptoticdistributionof
T
n
underthenullhypothesis.Underthe
sameconditionsasTheorem2andsuppose
^

s;t
)
isaconsistentcovarianceestimator,the
conditionaldistributionof
T

n
given
X
n
,
L
(
T

n
jX
n
)
convergesto
L
(
T
n
)
almostsurely.
3.3Simulationstudies
Forthesimulationstudiesforsimultaneousinference,weconsiderthesamesetupasinthe
simulationstudiesforthepointwiseinferenceinSection2.5inChapter2.Weconsidered
twoscenariosAandB,correspondingtotwohypotheseson

(
t
).InscenarioA,weused
H
f
(
z
1
;z
2
)
|
g
=
z
1

z
2
totest
H
0
A
:

1
(

)=

2
(

)vs
H
1
A
:

1
(

)
6
=

2
(

)
;
whereweset

1
(
t
)=
1
2
sin
t
and

2
(
t
)=(
1
2
+
a
)sin
t
for
a
=0
;
0
:
1
;
0
:
2
;
0
:
3and0
:
4in(2.5.19)
inChapter2toevaluatetheempiricalsize(when
a
=0)andpowers(when
a>
0).In
74
scenarioB,weset
H
f
(
z
1
;z
2
)
|
g
=
z
2
totest
H
0
B
:

2
(

)=0vs
H
1
B
:

2
(

)
6
=0
;
wherewechose

1
(
t
)=
1
2
sin
t
and

2
(
t
)=
c
for
c
=0
;
0
:
02
;
0
:
04
;

;
0
:
14.Inthecon-
structionoftheteststatistic
T
n
,wechosetheweightfunction
w
(
t
)=1for
t
2
(0
;
1)and0
otherwise.Thecovariancefunctionwasestimatedbythequasimaximumlikelihoodmethod
of[FHL07].Allsimulationresultsbelowwerebasedon500simulationreplicatesandthe
criticalvalueofthetestwasestimatedby500bootstrapsamplesineachsimulationrun.We
performedthesamebandwidthselectionprocedureineachbootstrapsampletotakeinto
accounttheextravariationinthetestcausedbybandwidthselection.
Table3.1summarizestheempiricalsizesandpowersforhypothesis
H
0
A
atthe5%
nominallevel.Itcanbeseenthattheempiricalsizesarereasonablycontrolledaroundthe
nominallevel.Asweexpected,theempiricalpowerincreasesastheincreaseofthesample
size
n
andthenumberofrepeatedmeasurements
m
,whichourtheoreticalresults
inSection3.2.Inaddition,thecorrelation
ˆ
doesnothaveaclearimpactonthepower,
indicatingthattheproposedprocedureisrobustwithrespecttothecovariancestructureof
therandomerror.
ThesimulationresultsforscenarioBareillustratedinFigure3.1.Theresultsunder
n
=100and
n
=200arerepresentedbysolidanddashedlines,respectively.Weobserveda
verysimilarpatternasthatunderscenarioA.Thesizeiswellcontrolledatthe5%nominal
levelandthepowerincreasesasthevalueof
c
increases.Ateachvalueof
c
,thepower
increasesasweincrease
n
or
m
.
75
Table3.1:Empiricalsizeandpowerfortesting
H
0
A
:

1
(

)=

2
(

)underscenarioA.
m
=5
m
=10
m
=50
anˆ
=0.2
ˆ
=0.5
ˆ
=0.2
ˆ
=0.5
ˆ
=0.2
ˆ
=0.5
0.01000.0620.0580.0640.0480.0700.054
2000.0600.0520.0680.0440.0580.066
0.11000.1340.1320.1880.2120.7720.764
2000.2240.2280.3880.3440.9840.966
0.21000.3440.4060.6760.7081.0001.000
2000.7240.7340.9480.9481.0001.000
0.31000.7460.7480.9760.9821.0001.000
2000.9740.9740.9981.0001.0001.000
0.41000.9620.9601.0001.0001.0001.000
2001.0001.0001.0001.0001.0001.000
(a)
ˆ
=0
:
2
(b)
ˆ
=0
:
5
Figure3.1:
Empiricalsizeandpowerfortesting
H
0
B
:

2
(

)=0atthe5%nominallevelunder
scenarioB.Theleftpanelisfor
ˆ
=0
:
2andtherightpanelisfor
ˆ
=0
:
5.
76
3.4Realdataanalysis
Weappliedourproposedmethodstotworealfunctionaldatasets,oneissparseandthe
otherisdense.
3.4.1CD4dataanalysis
Thisdatasetwascollectedfromarandomizeddouble-blindedstudyofAIDSpatientswith
advancedimmunesuppression(CD4counts

50cells/mm
3
)conductedbytheAIDSClinical
TrialGroup(ACTG)Study193A.Patientswererandomlyassignedtodualortriplecombi-
nationsofHIV-1reversetranscriptaseinhibitors.Sp,patientswererandomizedto
oneoffourdailyregimenscontaining600mgofzidovudine:zidovudinealternatingmonthly
with400mgdidanosine(treatmentI);zidovudineplus2.25mgofzalcitabine(treatmentII);
zidovudineplus400mgofdidanosine(treatmentIII);orzidovudineplus400mgofdidano-
sineplus400mgofnevirapine(treatmentIV).Therewasatotalof1309patientsincluded
inthestudyand325,324,330and330patientswere,respectively,assignedtotreatments
I-IV.MeasurementsofCD4countswerecollectedatbaselineandat8-weekintervalsduring
follow-up.Butduetovariousreasons,suchasdropoutandskippedvisits,therepeated
measurementswereunbalanced.Thenumberofrepeatedmeasurementsduringthe40
weeksoffollow-upvariedfrom1to9,withamedianof4.Thus,thedatacanbeconsidered
assparsefunctionaldata.Moredetailsofthestudycanbefoundin[KAC
+
98].
OurinterestistostudythetreatmentontheCD4counts.Weconsidertheresponse
variabletobelog(CD4counts+1).TotestfortreatmentwesettreatmentIVas
thebaselineandthreedummyvariables
T
1
;T
2
and
T
3
asindicatorsoftreatments
77
I-III,respectively.Then,wethedatawiththefollowingfunctionallinearmodel:
Y
i
(
t
ij
)=

0
(
t
ij
)+

1
(
t
ij
)
T
1
i
+

2
(
t
ij
)
T
2
i
+

3
(
t
ij
)
T
3
i
+

4
(
t
ij
)Age
i
(
t
ij
)+

5
(
t
ij
)Gender
i
+

6
(
t
ij
)PreCD4
i
+

i
(
t
ij
)
;
for
i
=1
;

;
1309and
j
=1
;

;m
i
where
Y
(
t
)=log(CD4counts+1)istheresponse,
t
isthetime(inweeks).WealsoincludedAge,GenderandPreCD4asthecovariatesinthe
modelandallowedAgechangeover
t
.
Totestfortreatmentweconsideredtheglobalhypotheses
H
01
:

1
(

)=

2
(

)=

3
(

)=0vs
H
11
:atleastoneof

k
(

)
6
=0
;k
=1
;
2
;
3
:
Weappliedtheproposedsimultaneoustestbasedon1000bootstrapreplicates.Theband-
widthwasselectedbytheproposedprocedureinSection2.4.Wegota
p
-valueof
<
0
:
001
indicatingthatthetreatmentareindeedt.Tofurtherdissect
betweentreatments,weconductedpairwisecomparisonamongtreatments.Theresultsare
summarizedinTable3.2.Allthep-valuesforthepairwisecomparisonsexcepttheonefor
comparingtreatmentIIandIIIarelessthan5%.Theresultsindicatethatpairwise
encesintimebetweenttreatmentgroupsarestatisticallytexcept
fortreatmentIIvsIII.
3.4.2Ergonomicsdataanalysis
Aspartofastudyofthebodymotionsofautomobiledrivers,researchersattheCenterfor
ErgonomicsattheUniversityofMichigancollecteddataonthemotionofasingleindividual
78
Table3.2:P-valuesforpairwisecomparisonamongttreatmentgroups.
ComparisonHypothesisp-value
IvsII
H
02
:

1
(

)=

2
(

)0.040
IvsIII
H
03
:

1
(

)=

3
(

)0.000
IvsIV
H
04
:

1
(

)=00.000
IIvsIII
H
05
:

2
(

)=

3
(

)0.078
IIvsIV
H
06
:

2
(

)=00.000
IIIvsIV
H
07
:

3
(

)=00.002
to20targetlocationswithinatestcar.Foreachlocation,theresearchersmeasured3times
theangleformedattherightelbowbetweentheupperandlowerarms,whichyieldeda
sampleofsize20

3=60.Theangleofeachmotionwasrecordedrepeatedlyfromthestart
totheendofeachtestdrive.Thetimeperiodofeachmotionvariedinlengthbecauseof
thetargetsbeingattdistancesfromthedriverandthedrivermayreachthemat
tspeeds.Theobjectiveofthestudywastomodeltheshapeofthemotionbutnot
thespeedatwhichitoccurred.Thusinthisstudy,
t
isusedtorepresenttheproportion,not
thetime,ofthemotionbetweenthestartandtheend.See[Far97]and[SF04]foramore
detaileddescriptionofthisdataset.
Let
Y
(
t
)representtheangleataproportion
t
for
t
2
[0
;
1].Foragivenmotion,
Y
(
t
)is
observedonanequallyspacedgridofpoints.Althoughthenumberofsuchpointsinthe
originaldatavariesfromobservationtoobservation,thenumberofrepeatmeasurements
foreachmotionis20afterimputation,whichwasconsideredasdensefunctionaldataasin
[Zha11].Thepurposeofourstudywastoamodelforpredictingtherightelbowangle
curve
Y
(
t
)
;t
2
[0
;
1]giventhecoordinates(
c
x
;c
y
;c
z
)ofthetarget,where
c
x
representsthe
\lefttoright"direction,
c
y
representsthe\closetofar"direction,and
c
z
representsthe\down
toup"direction.Thecoordinates(
c
x
;c
y
;c
z
)ofeachofthe20targetsintheexperimentwere
knownandusedaspredictorsinourmodel.[SF04]comparedalinearmodel,aquadratic
79
modelandaone-wayANOVAmodel.Theyfoundthataquadraticmodelofthefollowing
formthedataadequately
Y
i
(
t
ij
)=

1
(
t
ij
)+
c
xi

2
(
t
ij
)+
c
yi

3
(
t
ij
)+
c
zi

4
(
t
ij
)
+
c
2
xi

5
(
t
ij
)+
c
2
yi

6
(
t
ij
)+
c
2
zi

7
(
t
ij
)
+
c
xi
c
yi

8
(
t
ij
)+
c
yi
c
zi

9
(
t
ij
)+
c
zi
c
xi

10
(
t
ij
)+

i
(
t
ij
)
:
(3.4.4)
for
i
=1
;

;
60and
j
=1
;

;
20.
Westartedwithmodel(3.4.4),andtestedeachofthecotfunctions

k
(
t
)
;k
=
1
;

;
10tocheckwhichtermcouldbedroppedfromthemodel.Table3.3summarizes
thep-valuesfortestingeachcotfunction.Atthe5%tlevel,wecansee
that

7
(
t
)
;
9
(
t
)and

10
(
t
)arenott,suggestingtodeletethemfromthequadratic
model(3.4.4).Wethenobtainedthereducedmodel
Y
(
t
)=

1
(
t
)+
c
x

2
(
t
)+
c
y

3
(
t
)+
c
z

4
(
t
)
+
c
2
x

5
(
t
)+
c
2
y

6
(
t
)+
c
x
c
y

8
(
t
)+

(
t
)
:
Fromtheabovereducedmodel,wecouldseethattheanglecurve
Y
(
t
)hasasigntlinear
relationshipwiththe\downtoup"coordinate
z
,butatquadraticrelationshipwith
the\lefttoright"coordinate
x
andthe\closetofar"coordinate
y
.Themodelselectedabove
isconsistentwiththemodelchosenby[Zha11].
80
Table3.3:P-valuesfortestingeachcotfunctioninthequadraticmodel(3.4.4).
Hypothesisp-valueHypothesisp-value
H
01
:

1
(

)=00.000
H
06
:

6
(

)=00.032
H
02
:

2
(

)=00.006
H
07
:

7
(

)=00.050
H
03
:

3
(

)=00.006
H
08
:

8
(

)=00.004
H
04
:

4
(

)=00.005
H
09
:

9
(

)=00.080
H
05
:

5
(

)=00.038
H
0
;
10
:

10
(

)=00.109
3.5TechnicalDetails
ThissectioncontainstheproofsforthemaintheoremsinSection3.2.Proofsforthepropo-
sitionscanbefoundinthenextsection.
3.5.1ProofsofMainTheorems
3.5.1.1ProofofTheorem2
ProofofTheorem2.
Weprovethecasewith

2
[0
;
1
=
8],underwhichwechoosethe
bandwidth
h
=
n


0
from2(1+

)
=
17
<
0
<
1


2

.Inthisscenario,itiseasyto
seethat
mh
!
0.Inthiscase,wehavethedecompositionfor
T
n
,
T
n
=
T
n
1
+
T
n
2
,where
T
n
1
=
C

2

0

n
X
i
=1
Z
b
a
˘
|
i
(
t
)
G
|
(
t
)
G
(
t
)
˘
i
(
t
)
w
(
t
)
dt
T
n
2
=
C

2

0

n
X
i
=1
n
X
k
6
=
i
Z
b
a
˘
|
i
(
t
)
G
|
(
t
)
G
(
t
)
˘
k
(
t
)
w
(
t
)
dt:
81
Itthencanbeshownthat
E
(
T
n
1
)=
q
+
qh
m


r

r
20
Z
b
a
f
(
t
)
w
(
t
)
dt
+
O
(
h
2
)
Var(
T
n
1
)=

q
+
qh
m


r

r
20
Z
b
a
f
(
t
)
w
(
t
)
dt

2
+
O
(
h
2
+1
=n
)


q
+
qh
m


r

r
20
Z
b
a
f
(
t
)
w
(
t
)
dt

2
+
O
(
h
2
)=
O
(
h
2
+1
=n
)
;
and
E
(
T
n
2
)=0,
Var(
T
n
2
)=2
q

2
20
Z
2

2
[
K
(2)
(
u
)]
2
du
Z
b
a
w
2
(
t
)
dt
+2(
mh
)
2
Z
b
a
Z
b
a
tr
f

0
(
t;s
)

0
(
s;t
)
g
w
(
t
)
w
(
s
)
dtds
+
O
(
mh
2
+
h=n
)
:
HencewehaveVar(
T
n
1
)=
O
(
h
2
+1
=n
)=
o
f
Var(
T
n
2
)
g
:
Itfollowsthat
T
n

E
(
T
n
)=
T
n
1

E
(
T
n
1
)+
T
n
2
=
T
n
2
f
1+
o
p
(1)
g
:
Thus,tostudytheasymptoticpropertyof
T
n
,weonlyneedtostudythatof
T
n
2
.
Infact,wecanwrite
T
n
2
as
T
n
2
=
1
n
n
X
i
6
=
k
Z
b
a
Z
|
i
(
t
)
Z
k
(
t
)
w
(
t
)
dt;
where
Z
i
(
t
)=
p
mh
G
(
t
)
˘
i
(
t
).Let
U
n
=
1
n

1
T
n
2
=
2
n
(
n

1)
P
1

i<k

n
K
(
Z
i
;
Z
k
),wherethe
symmetrickernel
K
(
Z
i
;
Z
k
)=
R
b
a
Z
|
i
(
t
)
Z
k
(
t
)
w
(
t
)
dt
.Weanoperator
A
K
associated
withthekernel
K
as
A
K
g
(
x
)=
R
1

K
(
x;y
)
g
(
y
)
dF
(
y
),where
F
isthedistributionof
Z
i
.
Thenwehavetheassociatedeigenvaluesandeigenfunctions,denotedas
f

k
; 
k
g
1
k
=1
.By
82
U-statistictheory[Ser80],wehave
n
U
n

1
X
k
=1

k
(
˜
2
1
;k

1)=
o
p
(1)
;
where
f
˜
2
1
;k
g
1
k
=1
areindependentchi-squaredistributedrandomvariableswith1degreeof
freedom.Thatis
T
n
2

P
1
k
=1

k
(
˜
2
1
;k

1)=
o
p
(1).Nowweonlyneedtoprovethat
f

k
g
1
k
=1
isthesameas
f

nk
g
1
k
=1
from

.
Infact,Cov(
Z
i
(
s
)
;
Z
i
(
t
))=

n
(
s;t
)=
P
1
k
=1

nk
˚
nk
(
s
)
˚
|
nk
(
t
).ThenwehavetheK-L
representationoftherandomprocess
Z
(
t
)=
P
1
k
=1
˘
z
k
˚
nk
(
t
).Then
A
K
˘
x
m
=
Z
1

K
(
x;y
)
˘
y
m
dF
(
y
)=
Z
1

Z
b
a
1
X
i
=1
1
X
j
=1
˘
x
i
˘
y
j
˚
|
ni
(
t
)
˚
nj
(
t
)
w
(
t
)
˘
y
m
dtdF
(
y
)
=
Z
b
a
1
X
i
=1
1
X
j
=1
˘
x
i
˚
|
ni
(
t
)
˚
nj
(
t
)
w
(
t
)[
Z
1

˘
y
j
˘
y
m
dF
(
y
)]
dt
=
Z
b
a
1
X
i
=1
1
X
j
=1
˘
x
i
˚
|
ni
(
t
)
˚
nj
(
t
)
w
(
t
)

m

m
nj
dt
=

nm
1
X
i
=1
˘
x
i
Z
b
a
˚
|
ni
(
t
)
˚
nm
(
t
)
w
(
t
)
dt
=

nm
1
X
i
=1
˘
x
i

m
i
=

nm
˘
x
m
:
Thatis
f

nk
; 
nk
g
1
k
=1
=
f

k
;˘
k
g
1
k
=1
.Thuswehave
T
n
2

P
1
k
=1

nk
(
˜
2
1
;k

1)=
o
p
(1),
andthen
T
n

E
(
T
n
)=
T
n
2
+
o
p
(1)=
P
1
k
=1

nk
(
˜
2
1
;k

1)+
o
p
(1).Itfollowsthat
T
n

q
=
P
1
k
=1

nk
(
˜
2
1
;k

1)+
o
p
(1)
:
Since
P
1
k
=1

nk
=
q
,wehave
T
n
=
P
1
k
=1

nk
˜
2
1
;k
f
1+
o
p
(1)
g
.
Weprovetheresultforthedensecase,i.e.
>
1
=
8.Inthiscase,wechoose
thenbandwidth
h
=
n


0
from1
=
8
<
0
<
min
f
;
1
=
2

1

g
.Underthisscenario,we
have
mh
!1
.ByLemma9inSection3.5.2,weknow
U
n
(
t
)asymptoticallyconvergesto
aGaussianprocess
U
(
t
;

)withmean0andcovariancefunction

(
s;t
).Thusthelimit-
ingdistributionof
T
n
isthesameasthedistributionof
Z
=
R
U
(
t
;

)
|
U
(
t
;

)
w
(
t
)
dt
.We
83
onlyneedtoshowthedistributionof
Z
.Tothisend,usingthefollowingKarhunen-Loeve
representationfor
U
(
t
;

)[Bal60]
U
(
t
;

)=
1
X
k
=1
˘
k
˚
k
(
t
)
;
where
˘
k
=
R
b
a
U
(
t
;

)
|
˚
k
(
t
)
w
(
t
)
dt
areindependent(
k
=1
;
2
;

;
1
)normalwithmean0
andvariance

k
.Here

k
and
˚
k
(
t
)are,respectively,the
k
-thorderedeigenvalueof

(
s;t
)
andthecorrespondingeigenfunctionsin
R
q
.Thenwehave
Z
=
1
X
k
=1
1
X
l
=1
˘
k
˘
l
Z
b
a
˚
k
(
t
)
|
˚
l
(
t
)
w
(
t
)
dt
=
1
X
k
=1
˘
2
k
:
Since
˘
k
areindependent
N
(0
;
k
),wehave
T
n
d
!
Z
=
P
1
k
=1

k
˜
2
1
;k
:
Thusbycombining
theabovetwocasestogether,wecompletetheproofofpart(b).
3.5.1.2ProofofCorollary1
Proof.
FromTheorem2,
T
n
hasthesamedistributionas
Z
n
=
P
1
k
=1

nk
˜
2
1
;k
:
Thuswe
onlyneedtoshowtheasymptoticalnormalityof
P
1
k
=1

k
˜
2
1
;k
.ByLyapunovcentrallimit
theorem,ifthefollowingconditionhold
1
X
k
=1

4
nk
=
(
1
X
k
=1

2
nk
)
2
!
0
;
(3.5.5)
Thenwehave
Z
n

P
1
k
=1

nk
q
2
P
1
k
=1

2
nk
d
!
N
(0
;
1)
:
84
UsingProposition4,

n
(
s;t
)=


1
20
K
(2)
(
s

t
h
)
I
q
andinparticular

n
(
t;t
)=
I
q
,we
that
P
1
k
=1

nk
=tr(

)=
q
R
b
a
w
(
t
)
dt
=
q
and
1
X
k
=1

2
nk
=
q
Z
b
a
Z
b
a


2
20
f
K
(2)
(
s

t
h
)
g
2
w
(
s
)
w
(
t
)
dsdt
=
qh˙
2
0
=
2
;
where
˙
2
0
wasinthecorollary.Therefore,theconclusioninthisLemmaholds.It
remainstoshowthecondition(3.5.5).Let

(
s;t
)=


1
20
K
(2)
(
s

t
h
).Then
1
X
k
=1

4
nk
=
q
ZZZZ

(
s;t
)

(
t;l
)

(
l;m
)

(
m;s
)
w
(
s
)
w
(
t
)
w
(
l
)
w
(
m
)
dsdtdldm
=
qh
3
C
0


4
20
Z
b
a
w
4
(
t
)
dt
where
C
0
=
R
K
(2)
(
u
1
)
K
(2)
(
u
2
)
K
(2)
(
u
3
)
K
(2)
(
u
1
+
u
2
+
u
3
)
du
1
du
2
du
3
isaconstant.Thus
thecondition(3.5.5)holds.Thiscompletestheproofofthiscorollary.
3.5.1.3ProofofTheorem3
ProofofTheorem3.
Firstnoticethat2
`
(
t
)=
n
2
C

2

0

~

|
(
t
)
R

1
(
t
)
~

(
t
)+
o
p
(
h
1
=
2
)and
underlocalalternative,
~

(
t
)=

R
(
t
)
C
(
t
)
A

1
(
t
)

1
n
n
X
i
=1
g
i
f

0
(
t
)
g

+
R
(
t
)
H
f

0
(
t
)
g
+~
o
p
(

n
)
:
Wethen
U
+
n
(
t
)=
G
(
t
)
C

1

0

P
n
i
=1
˘
i
(
t
)

nC

1

0

R
1
=
2
(
t
)
H
f

0
(
t
)
g
:
Firstconsideringtheproofforpart(a)with0


0
=1
=
16,underwhichwechoose
thebandwidth
h
=
n


0
with2(1+

)
=
17
<
0
<
1


2

.Inthisscenario,wehave
85
m
2
h
!
0.Wehave
T
n
=
Z
b
a
2
`
(
t
)
w
(
t
)
dt
=
Z
b
a
U
+
|
n
(
t
)
U
+
n
(
t
)
w
(
t
)
dt
+
o
p
(
h
1
=
2
)
=
C

2

0

n
X
i
=1
n
X
k
=1
Z
b
a
˘
|
i
(
t
)
G
|
(
t
)
G
(
t
)
˘
k
(
t
)
w
(
t
)
dt

2
nC

2

0

n
X
i
=1
Z
b
a
˘
|
i
(
t
)
G
|
(
t
)
R
1
=
2
(
t
)
H
f

0
(
t
)
g
w
(
t
)
dt
+
n
2
C

2

0

Z
b
a
H
|
f

0
(
t
)
g
R
(
t
)
H
f

0
(
t
)
g
w
(
t
)
dt
+
o
p
(
h
1
=
2
)
:=
R
n
1

2
R
n
2
+
R
n
3
+
o
p
(
h
1
=
2
)
:
ThenbytheresultinCorollary1,wehave
h

1
=
2
f
R
1
n

q
g
d
!
N
(0
;q˙
2
0
)
:
Andfor
R
2
n
,
obviouslywehave
E
(
R
2
n
)=0,and
Var(
R
2
n
)=
n
2
b
2
n
C

4

0

n
X
i
=1
m
i
X
j
=1
m
i
X
j
=1
1
m
2
i
E
Z
b
a
Z
b
a
X
|
ij
G
|
(
t
)
R
1
=
2
(
t
)
d
(
t
)

X
|
il
G
|
(
s
)
R
1
=
2
(
s
)
d
(
s
)

ij

il
K
h
(
t
ij

t
)
K
h
(
t
il

s
)
w
(
t
)
w
(
s
)
dtds
=
O
(
n
3
b
2
n
C

4

0

)=
O
f
n
(
b
n
mh
)
2
g
:
Sinceinthiscase
b
n
=(
nm
)

1
=
2
h

1
=
4
,wehaveVar(
R
2
n
)=
O
(
mh
3
=
2
)
:
Thuswehave
h

1
=
2
R
2
n
p
!
0sincewehave
mh
1
=
2
!
0underthiscase.Andfor
R
3
n
whichisnon-random,
wehave
h

1
=
2
R
3
n
=
R
b
a
d
|
(
t
)
R
(
t
)
d
(
t
)
w
(
t
)
dt:
Thuswehave
h

1
=
2
(
T
n

q
)
d
!
N
(

0
;q˙
2
0
),
where

0
=
R
b
a
d
|
(
t
)
R
(
t
)
d
(
t
)
w
(
t
)
dt
.
Forpart(b)with1
=
16
<

1
=
8,underwhichwechoosethebandwidth
h
=
n


0
with
2(1+

)
=
17
<
0
<
1


2

.Inthisscenario,wealsomaketohave
m
2
h
!1
and
86
mh
!
0.Wewrite
U
+
n
(
t
)=
C

1

0

n
X
i
=1

G
(
t
)
˘
i
(
t
)

b
n
R
1
=
2
(
t
)
d
(
t
)

:=
1
p
n
n
X
i
=1
Z
+
i
(
t
)
;
where
Z
+
i
(
t
)=
p
mh

G
(
t
)
˘
i
(
t
)

b
n
R
1
=
2
(
t
)
d
(
t
)

.Thenwehave
T
n
=
Z
b
a
2
`
(
t
)
w
(
t
)
dt
=
Z
b
a
U
+
|
n
(
t
)
U
+
n
(
t
)
w
(
t
)
dt
+
o
p
(
mh
)
=
1
n
n
X
i
=1
n
X
k
=1
Z
b
a
Z
+
|
i
(
t
)
Z
+
k
(
t
)
w
(
t
)
dt
+
o
p
(
mh
):=
T
+
n
1
+
T
+
n
2
+
o
p
(
mh
)
;
where
T
+
n
1
=
1
n
P
n
i
=1
R
b
a
Z
+
|
i
(
t
)
Z
+
i
(
t
)
w
(
t
)
dt
and
T
+
n
2
=
1
n
P
n
i
6
=
k
R
b
a
Z
+
|
i
(
t
)
Z
+
k
(
t
)
w
(
t
)
dt:
By
similarcalculationasinthenullhypothesisfor
T
n
1
,wehave
E
(
T
+
n
1
)=
q
+
(
m


r
)
qh

r
20
R
b
a
f
(
t
)
w
(
t
)
dt
+
mhb
2
n

0
+
O
(
h
2
)andVar(
T
+
n
1
)=
O
(
h
2
+1
=n
).
For
T
+
n
2
,wetheU-statisticasfollows
U
n
=
T
+
n
2
(
n

1)
=
1
n
(
n

1)
n
X
i
6
=
k
Z
b
a
Z
+
|
i
(
t
)
Z
+
k
(
t
)
w
(
t
)
dt
=
1
n
(
n

1)
n
X
i
6
=
k
K
(
Z
+
i
;
Z
+
k
)
;
wherethekernelfunction
K
isthesameasintheproofforthenullcase.Itiseasytoshow

=
E
K
(
Z
+
1
;
Z
+
2
)=
mhb
2
n

0
.Andtherstprojection
K
1
(
Z
+
1
)=
E
fK
(
Z
+
1
;
Z
+
2
)
jZ
+
1
g
=

b
n
p
mh
R
b
a
Z
+
|
1
(
t
)
R
1
=
2
(
t
)
d
(
t
)
w
(
t
)
dt
hasthevariance

1
,whichcanbeobtainedby
E
K
2
1
(
Z
+
1
)=
b
2
n
mh
Z
b
a
Z
b
a
d
|
(
t
)
R
1
=
2
(
t
)
E
fZ
+
1
(
t
)
Z
+
|
1
(
s
)
g
R
1
=
2
(
s
)
d
(
s
)
w
(
t
)
w
(
s
)
dtds:
Therefore,wehave

1
=
b
2
n
(
mh
)
2

1
+
O
(
b
2
n
mh
2
),where

1
isinTheorem3.
87
Wealsohave

2
=Var
fK
(
Z
+
1
;
Z
+
2
)
g
=(
mh
)
2
V
+
O
(
h
+
b
2
n
m
2
h
2
),where
V
=2
Z
b
a
Z
b
a
tr
f

0
(
s;t
)

0
(
t;s
)
g
w
(
t
)
w
(
s
)
dtds:
ThusbyU-statistictheory,if

2
=
o
(

1
),whichisequivalentto
b

1
n
=
o
(
p
n
),wehave
U
n
˘
AN(
;
4

1
n
)providedthattheprojectionsequence
fK
1
(
Z
+
i
)
g
n
i
=1
satisfytheLyapunov's
condition,whichcanbevasfollows.Since
E
K
1
(
Z
+
i
)=

,Var
fK
1
(
Z
+
i
)
g
=

1
and
E
fK
1
(
Z

i
)


g
4
˘
b
4
n
(
mh
)
4
uptoaconstant,wehave
P
n
i
=1
E
fK
1
(
Z
+
i
)


g
4

P
n
i
=1
Var
2
fK
1
(
Z
+
i
)
g

2
˘
nb
4
n
(
mh
)
4
n
2
b
4
n
(
mh
)
4
=
1
n
!
0
:
Thusif
b

1
n
=
o
(
p
n
),wehave
T
+
n
2
˘
AN(
nb
2
n

0
;
4
nb
2
n
(
mh
)
2

1
).Thentheconclusionin
part(b)holds.
Forpart(c)with
>
1
=
8,underwhichwechoosethebandwidth
h
=
n


0
with
1
=
8
<
0
<
1
=
2

1

.Inthisscenario,wehave
mh
!1
.Since
b
n
=
n

1
=
2
and
C

0

=
n
1
=
2
,wehave
U
+
n
(
t
)=
G
(
t
)
C

1

0

n
X
i
=1
˘
i
(
t
)

R
1
=
2
(
t
)
d
(
t
)
:
ByLemma9inSection3.5.2inChapter2,weknow
U
+
n
(
t
)asymptoticallyconvergesto
aGaussianprocess
U
+
(
t
;

)withmean

R
1
=
2
(
t
)
d
(
t
)andcovariancefunction

(
s;t
).Thus
thelimitingdistributionof
T
n
isthesameasthedistributionof
Z
+
:=
R
b
a
U
+
(
t
;

)
T
U
+
(
t
;

)
w
(
t
)
dt
.
Weonlyneedtoshowthedistributionof
Z
+
.Tothisend,usingthefollowingeigenvaluede-
compositionfor
U
+
(
t
;

)[Bal60]
U
+
(
t
;

)=
P
1
k
=1
˘
+
k
˚
k
(
t
)
;
where
˘
+
k
=
R
b
a
U
+
(
t
;

)
|
˚
k
(
t
)
w
(
t
)
dt
areindependent(
k
=1
;
2
;

;
1
)normalwithmean

u
k
andvariance

k
.Here

k
and
88
˚
k
(
t
)isthe
k
-thorderedeigenvalueof

0
(
s;t
)andcorrespondingeigenfunctionsin
R
q
.Then
wehave
Z
+
=
1
X
k
=1
1
X
l
=1
˘
+
k
˘
+
l
Z
b
a
˚
k
(
t
)
|
˚
l
(
t
)
w
(
t
)
dt
=
1
X
k
=1
˘
+2
k
:
Because
˘
+
k
areindependent
N
(

u
k
;
k
),wehave
T
n
d
!
P
1
k
=1

k
˜
2
1
;k


2
k

k

:
Thiscom-
pletestheproofofpart(c).
3.5.1.4ProofofTheorem4
ProofofTheorem4.
Conditionalonthedata
X
n
=
f
Y
ij
;X
ij
;t
ij
g
n
i
=1
,thebootstrappedsam-
plewasgeneratedaccordingto
Y

ij
=
~

|
(
t
ij
)
X
ij
+


ij
;
whichcanberegardedananalogof
themodel(2.1.1)withthetruecotfunction
~

(
t
)and


ij
hasmean0andcovari-
ance
^

s;t
).Let
o

p
(1)and
O

p
(1)bethestochasticorderwithrespecttotheconditional
probabilitymeasuregiventheoriginalsamples.
Basedonthisbootstrappedsample
f
Y

(
b
)
ij
;t
ij
;
X
ij
:
i
=1
;

;n
;
j
=1
;

;m
i
g
,we
estimatethetrue
~

(
t
)bylocallinearsmoothingwiththeestimator
^


(
t
),which
fromtheoriginal
^

(
t
)onlyviatheerror.Andthenourestimatingequationisconstructed
asfollowing
g

i
f

(
t
)
g
=
1
m
i
m
i
X
j
=1
n
Y

ij


|
(
t
)
X
ij
f
^


(
t
ij
)

^


(
t
)
g
|
X
ij
o
X
ij
K
h
(
t
ij

t
)
:
SincewehaveprovedthefollowingresultsintheproofofLemma1inSection2.6.2,
sup
t
2
[
a;b
]
k
1
n
n
X
i
=1
1
m
i
m
i
X
j
=1
X
ij
X
|
ij
K
h
(
t
ij

t
)


(
t
)
f
(
t
)
k
=
O
(

n
)a.s.
;
89
andbythesimilarproofasLemma2inSection2.6.2,wehave
g

i
f
~

(
t
)
g
=
˘

i
(
t
)
f
1+~
o

p
(1)
g
+
~
O
(
h
4
)a.s.
where
˘

i
(
t
)=
1
m
i
P
m
i
j
=1
X
ij


ij
K
h
(
t
ij

t
)andhereandbelow,thealmostsurelyconvergence
holdswithrespecttotheoriginalprobabilitymeasure,whichistruealmostsurelyforallthe
samplepointsinthesamplespaceof
X
n
when
n
istlarge.Thenbythefactthat
sup
t
k
~

(
t
)


0
(
t
)
k
=
O
(

n
1
+
h
4
)a.s..Thus,similarto(2.6.29),wehavethefollowing
resultsalmostsurely
2
`

(
t
)=
1
C
2

0

f
n
X
i
=1
g

i
(
~

)
g
|
A

1
C
|
RCA

1
f
n
X
i
=1
g

i
(
~

)
g
+
o

p
(1)+
~
O
(

n
1
+
h
4
)
=
1
C
2

0

f
n
X
i
=1
g

i
(
~

)
g
|
G
|
G
f
n
X
i
=1
g

i
(
~

)
g
+
o

p
(1)+
~
O
(

n
1
+
h
4
)
=
U

|
n
(
t
)
U

n
(
t
)
f
1+
o

p
(1)
g
+
~
O
(

n
1
+
h
4
)
;
where
U

n
(
t
)=
C

1

0

G
(
t
)
P
n
i
=1
˘

i
(
t
)with
G
(
t
)=
R
1
=
2
(
t
)
C
(
t
)
A

1
(
t
).
Thusthebootstrappedversionteststatistic
T

n
canberepresentedas
T

n
=
Z
b
a
U

|
n
(
t
)
U

n
(
t
)
w
(
t
)
dt
f
1+
o

p
(1)
g
+
o
(1)a.s.(3.5.6)
Let
d
(
F;G
)bethemaximumnormdistancebetweentwodistributionfunctions
F
and
G
suchthat
d
(
F;G
)=sup
x
j
F
(
x
)

G
(
x
)
j
.FromtheproofofTheorem2,weknowthatthe
requiredconditionsforshowingtheconvergenceof
d
(
L
(
R
b
a
U
|
n
(
t
)
U
n
(
t
)
w
(
t
)
dt
)
;
L
(
T
n
))
!
0
aretheindependencebetween
X
i
(
t
)and

i
(
t
),

i
(
t
)areindependentwithE
f

i
(
t
)
g
=0and


momentsfor
i
=1
;

;n
.
90
Toshowthat
d
(
L
(
R
b
a
U

|
n
(
t
)
U

n
(
t
)
w
(
t
)
dt
jX
n
)
;
L
(
T
n
))
!
0
;n
!1
,wenotethedif-
ferencebetween
R
b
a
U

|
n
(
t
)
U

n
(
t
)
w
(
t
)
dt
and
R
b
a
U
|
n
(
t
)
U
n
(
t
)
w
(
t
)
dt
isthat

i
(
t
)isreplaced
by


i
(
t
),whichhasmean0andcovariance
^

s;t
).Since
^

s;t
)isaconsistentestima-
torof
s;t
),andfromourconstructionof


i
(
t
)inthewildbootstrapprocedure,given
X
n
,wehavetheindependencebetween
X
i
(
t
)and


i
(
t
),E
f


i
(
t
)
g
=0and


i
(
t
)has
nite

moments.Thus,basedonthestandardmooftheproofofTheorem2,
wehave
d
(
L
(
R
b
a
U

|
n
(
t
)
U

n
(
t
)
w
(
t
)
dt
jX
n
)
;
L
(
T
n
))
!
0.Thistogetherwith(3.5.6),wehave
d
(
L
(
T

n
jX
n
)
;
L
(
T
n
))
!
0almostsurely.
3.5.2ProofsofPropositionandLemma
Lemma9.
Underassumptions(C1)-(C4),forthedensefunctionaldata,
U
n
(
t
)
converges
toamultivariateGaussianprocess
˘
(
t
)
withmean0andcovariancematrix

0
din
Proposition4inSection3.2.
Proof.
Itisclearthat
E
f
U
n
(
t
)
g
=
C

1

0

P
n
i
=1
G
(
t
)
E
f
˘
i
(
t
)
g
=0and
Cov
f
U
n
(
s
)
;
U
n
(
t
)
g
=
1
C
2

0

(
n
X
i
=1
G
(
s
)
E
f
˘
i
(
s
)
˘
|
i
(
t
)
g
G
|
(
t
)
)
:
Forcomputing
E
f
˘
i
(
s
)
˘
|
i
(
t
)
g
,bysimilarcalculationasbefore,wehavethefollowingresult,
E
f
˘
i
(
s
)
˘
|
i
(
t
)
g
=
K
(2)
(
s

t
h
)
m
i
h

(
s

s
)
f
(
s
)+
m
i

1
m
i

(
s;t

s;t
)
f
(
s
)
f
(
t
)+
~
O
(
h
2
)
;
where
K
(2)
(
x
)=
R
K
(
y
)
K
(
y

x
)
dy
and

(
s;t
)=
E
f
X
(
s
)
X
|
(
t
)
g
;

s;t
)=
E
f

(
s
)

(
t
)
g
.
91
Thenwehave
Cov
f
U
n
(
s
)
;
U
n
(
t
)
g
=
G
(
s
)

(
s

s
)
G
|
(
t
)
f
(
s
)
K
(2)
(
s

t
h
)
1
C
2

0

n
X
i
=1
1
m
i
h
+
G
(
s
)

(
s;t

s;t
)
G
|
(
t
)
f
(
s
)
f
(
t
)
1
C
2

0

n
X
i
=1
m
i

1
m
i
+
n
~
O
(
h
2
)
C
2

0

G
(
s
)
G
|
(
t
)
:
(3.5.7)
Bytheof
C

0

,wehavethefollowingresult,
Cov
f
U
n
(
s
)
;
U
n
(
t
)
g˘
G
(
s
)

(
s;t

s;t
)
G
|
(
t
)
f
(
s
)
f
(
t
)=

0
(
s;t
)
:
ThuswehaveCov
f
U
n
(
s
)
;
U
n
(
t
)
g
=

0
(
s;t
)+~
o
(1)
:
TheproofinLemma3provesthecentral
limittheoremthejointdistributionof
f
U
n
(
t
1
)
;

;
U
n
(
t
s
)
g
attimepoints
f
t
1
;

;t
s
g
.
Weakconvergenceof
U
n
(
t
)nowfollows(Billingsley(1968),page95)if
8
a
2
R
q
,
a
|
E
f
[
U
n
(
s
)

U
n
(
t
)][
U
n
(
s
)

U
n
(
t
)]
|
g
a

C
(
s

t
)
2
canbeestablished.Tothisend,wenotethat
a
|
E
f
[
U
n
(
s
)

U
n
(
t
)][
U
n
(
s
)

U
n
(
t
)]
|
g
a
=
a
|
G
(
s
)
B
(
s
)
G
|
(
s
)
a

a
|

0
(
s;t
)
a

a
|

0
(
t;s
)
a
+
a
|
G
(
t
)
B
(
t
)
G
|
(
t
)
a
=2
a
|
a
f
a
|
[

0
(
s;s
)+(
t

s
)
@

0
(
s;s
)
@t
+(
t

s
)
2
@
2

0
(
s;s

)
@t
2
]
a
f
a
|
[

0
(
t;t
)+(
s

t
)
@

0
(
t;t
)
@s
+(
t

s
)
2
@
2

0
(
t;t

)
@s
2
]
a
g

s

t
jj
a
|
f
@

0
(
s;s
)
@t

@

0
(
t;t
)
@s
g
a
j
+
C
1
(
s

t
)
2

C
(
s

t
)
2
;
92
whereweused

0
(
s;s
)=

0
(
t;t
)=
I
q
andthelasttwoinequalitiesfollowfromtheconti-
nuitycondition(C3).
ProofofProposition4.
By(3.5.7)intheproofofLemma9,andtheof
C

0

,we
havethefollowingresult,uptoafactor1+
o
p
(1),Cov
f
U
n
(
s
)
;
U
n
(
t
)
g
=


1
20
K
(2)
(
s

t
h
)
I
q
+
mh

0
(
s;t
)for
mh
!
0,andCov
f
U
n
(
s
)
;
U
n
(
t
)
g
=

0
(
s;t
)for
mh
!1
.
Since
K
(2)
(
s

t
h
)=

20
when
s
=
t
,wecanfurtherhave,uptoafactor1+
o
p
(1),
Cov
f
U
n
(
s
)
;
U
n
(
t
)
g
=
8
>
>
>
>
>
>
>
>
<
>
>
>
>
>
>
>
>
:


1
20
K
(2)
(
s

t
h
)
I
q
;
if
m
2
h
!
0
;
I
q
I
(
s
=
t
)+
mh

0
(
s;t
)
I
(
s
6
=
t
)if
m
2
h
!1
and
mh
!
0
;

0
(
s;t
)
;
if
mh
!1
;
whichcompletetheproofoftheproposition.
93
Chapter4
EmpiricalLikelihoodinTesting
CotsinHighDimensional
HeteroscedasticLinearModels
4.1Introduction
AsmentionedinSection1.2.2,peoplehavemadetprogresstowardsunderstanding
theestimationtheory,butverylittleworkhasbeendoneforstatisticalinferenceforhigh
dimensionallinearmodels,especiallywithheteroscedasticnoise.Empiricallikelihoodhas
theabilityofinternalstudentizingtoavoidvarianceestimation,whichcanhelpsolvethe
heteroscedasticityissue.
InSection4.2,westudytheasymptoticnormalityofWaldtypestatisticfortheexisting
methodsundertheheteroscedasticnoise.InSection4.3,weproposethegeneralempiri-
callikelihoodframeworkforanalyzingtheestimatingequationsproposedintways,
althoughtheyallfollowthelowdimensionalprojectionidea.InSection4.4,weprovide
implicationsofthegeneralresultsonthreetcases,projectionvialassoestimation,
projectionviainverseregressionandprojectionviaKFCsetselection.Section4.5provides
numericalresultsandSection4.6showssomerealdataanalysis.Wereferalloftheproofs
94
totheTechnicalDetails4.7.
Thefollowingnotationisadoptedthroughoutthischapter.For
v
=(
v
1
;v
2
;

;v
d
)
|
2
R
d
,we
k
v
k
q
=(
P
d
i
=1
j
v
i
j
q
)
1
=q
for0
<q<
1
,
k
v
k
0
=
j
supp(
v
)
j
wheresupp(
v
)=
f
j
:
v
j
6
=0
g
and
j
A
j
isthecardinalityofaset
A
,and
k
v
k
1
=max
1

j

d
j
v
i
j
.Forasymmetric
matrix
M
=((
M
jk
)),

min
(
M
)and

max
(
M
)aretheminimalandmaximaleigenvaluesof
M
.Foranymatrix
M
=((
M
jk
)),let
k
M
k
max
=max
j;k
j
M
jk
j
,
k
M
k
1
=max
k
P
j
j
M
jk
j
,
k
M
k
2
=
p

max
(
M
|
M
),and
k
M
k
1
=max
j
P
k
j
M
jk
j
.Wedenote
I
d
asthe
d

d
identity
matrix,andifthedimensionisobviousfromthecontext,wejustomitthesubscript
d
.For
Sf
1
;
2

;d
g
,let
v
S
=
f
v
j
:
j
2Sg
beasubvectorof
v
.Andforany
k
2f
1
;
2
;

;d
g
,
let
M
j
S
=
f
M
jl
;l
2Sg
asarowvectorand
M
S
j
=
f
M
lj
:
l
2Sg
asacolumnvector.
Denote
n
k
=
f
1
;
2
;

;k

1
;k
+1
;

;d
g
.Forasequenceofrandomvariables
X
n
,we
write
X
n
d
!
X
forsomerandomvariable
X
,if
X
n
convergesto
X
indistribution,andwrite
X
n
p
!
a
forsomeconstant
a
,if
X
n
convergesinprobabilityto
a
.Fornotationalsimplicity,
weuse
C;C
0
;C
00
;C
1
;C
2
;C
3
todenotegenericconstants,whosevaluescanchangefromline
toline.
4.2PreliminaryandExistingMethods
Weconsideralinearregressionmodel:
Y
=
X

0
+

;
(4.2.1)
where
Y
=(
Y
1
;Y
2
;

;Y
n
)
|
2
R
n
isaresponsevector,
X
=((
X
ij
))
2
R
n

p
isarandom
designmatrixwithcolumns
f
X
j
2
R
n
g
p
j
=1
androws
f
X
i
2
R
p
g
n
i
=1
,whichareassumed
95
tobeindependentandidenticallydistributed(IID)withE(
X
i
)=
0
andVar(
X
i
)=

,
and

0
2
R
p
isavectorofunknowntrueregressioncots.Theerrorterm
E(

i
j
X
i
)=0,andVar(

i
j
X
i
)=
˙
2

(
X
i
),whichallowsheteroscedasticity.Notethatwith
theseassumptions,
X
i
and

i
areuncorrelated,i.e.E(
X
i

)=
0
.Inaddition,weassumethe
marginalvarianceVar(

i
)=
˙
2

.Hereafterweassumethat
p
˛
n
.Denote
s
=
k

0
k
0
be
thenumberofnon-zerosof

0
andweassumesparsitywith
s<n
.Let
Z
i
=

i
X
i
bea
randomvectorwithmean
0
andcovariancematrix

=((

jk
)).AndassumeVar(

2
i
)=

andCov(

2
i
;
Z
i
)=
$
.
Inpractice,amonghundredsofthousandsofregressors,peoplewanttotestwhether
sometargetfeaturesaretornot.Forexample,onemaywanttoknowwhethera
particulargeneistornotamongthousandsofgenes.Toassessthe
ofasinglecot,wetestthefollowinghypothesisforanygiven
j
2f
1
;
2
;

;p
g
,
H
0
:

0
j
=0
vs:H
1
:

0
j
6
=0
:
(4.2.2)
Statisticalinferenceforlow-dimensionalcotsinhighdimensionallinearmodelwith
homoscedasticnoisehasreceivedincreasingattention.Lowdimensionalprojectionmethod
hasbeenintroducedby[ZZ14]and[B
+
13].
Under(4.2.1),andinlowdimensionalscenario,i.e.
p

n
,wehavetheordinaryleast
square(OLS)estimatorfor

0
j
,
^

j
=
(
X
?
j
)
|
Y
(
X
?
j
)
|
X
j
=
(
Q
n
j
X
j
)
|
Y
(
Q
n
j
X
j
)
|
X
j
=
(
Q
n
j
X
j
)
|
(
Q
n
j
Y
)
(
Q
n
j
X
j
)
|
(
Q
n
j
X
j
)
=
X
|
j
Q
n
j
Y
X
|
j
Q
n
j
X
j
;
(4.2.3)
where
X
?
j
istheprojectionof
X
j
totheorthogonalcomplementofthecolumnspacespaned
96
by
f
X
n
j
g
,and
Q
n
j
isasbelowforgeneral
Q
S
with
Sf
1
;
2

;p
g
and
jSj
<n
,
Q
S
=
I
P
S
=
I

X
S
(
X
|
S
X
S
)

1
X
|
S
2
R
n

n
:
Howeverinthehighdimensionallinearmodelwith
p>n
,theOLSestimatorisnolonger
valid.Insteadofprojectionontothespacespannedbyalloftherestcovariates,peopleselect
theprojectionspacebasedonthecorrelationsbetween
X
j
andtheothers.
4.2.1LassoProjection
In[ZZ14,vdGBR13,NL14],theyusedthelinearsparseregularizedregressionproceduresuch
asLassotoselecttheprojectionspace.

ij
:=
X
ij

X
|
i;
n
j


1
n
j;
n
j

n
j;j
.Thatis
X
ij
=
X
|
i;
n
j
w
0
j
+

ij
with
w
0
j
=


1
n
j;
n
j

n
j;j
,whichleadstothefollowinggeneralizedversionof(4.2.3)with
relaxedprojection
^

(lin)
j
=
Z
|
j
Y
Z
|
j
X
j
;
where
Z
j
=
X
j

X
n
j
^
w
j
(4.2.4)
with
^
w
j
asanestimatorof
w
0
j
.However,
^

(lin)
j
isbiased.Tosolvethisissue,[ZZ14]proposed
thede-biasedestimatorasfollows,
^

(de)
j
=
Z
|
j
Y

P
k
6
=
j
Z
|
j
X
k
^

k
Z
|
j
X
j
;
(4.2.5)
97
where
^

issomeinitialestimatorof

0
.Thisde-biasedestimator(4.2.5)canberegarded
asthesolutiontotheestimatingequation,whichisbasedonthepopulationsubject

ij

i
=

X
ij

E(
X
ij
j
X
i;
n
j
)

Y
i

X
|
i

0

,thatis
n
X
i
=1
m
(lasso)
ni
(

j
):=
n
X
i
=1

X
ij

X
|
i;
n
j
^
w
j

Y
i

X
ij

j

X
|
i;
n
j
^

n
j

=0
:
(4.2.6)
Andbysimplealgebra,wehave
m
(lasso)
ni
(

0
j
)=

i

ij
|
{z
}
W
(lasso)
ni
+

ij
(

0
n
j

^

n
j
)
|
X
i;
n
j
+(
w
0
j

^
w
j
)
|
X
i;
n
j

Y
i

X
ij

0
j

X
i
n
j
^

n
j

|
{z
}
R
(lasso)
ni
:
Bysimplecalculation,wehaveE(
W
(lasso)
ni
)=E


i
(
X
ij


j;
n
j


1
n
j;
n
j
X
i;
n
j
)

=0and
E[(
W
(lasso)
ni
)
2
]=E


2
i
(
X
ij


j;
n
j


1
n
j;
n
j
X
i;
n
j
)
2

=E
f

2
i
(
X
2
ij

2
X
ij

j;
n
j


1
n
j;
n
j
X
i;
n
j
+

j;
n
j


1
n
j;
n
j
X
i;
n
j
X
|
i;
n
j


1
n
j;
n
j

n
j;j
)
g
=E
f
Z
2
ij

2
Z
ij

j;
n
j


1
n
j;
n
j
Z
i;
n
j
+

j;
n
j


1
n
j;
n
j
Z
i;
n
j
Z
|
i;
n
j


1
n
j;
n
j

n
j;j
g
=

jj

2

j;
n
j


1
n
j;
n
j

j;
n
j
+

j;
n
j


1
n
j;
n
j

n
j;
n
j


1
n
j;
n
j

n
j;j
:=
˙
2
n;
lasso
:
Notethatifweassumetheindependencebetweentheerrortermandthecovariates,wehave
thefollowingform
E[(
W
(lasso)
ni
)
2
]=
˙
2

(
˙
jj


j;
n
j


1
n
j;
n
j

n
j;j
)
:
Thisshowsthebetweenourheteroscedasticcaseandthehomoscedasticcase.
Forthehomoscedasticcase,asdiscussedin[ZZ14][vdGBR13],theinferenceproce-
98
durebasedonasymptoticnormalityneedstoestimatetheasymptoticvariance
˙
2

=
(
˙
jj


j;
n
j


1
n
j;
n
j

n
j;j
).Undertheheteroscedasticnoise,wecanstillshowthefollowingasymp-
toticnormalitybutwithmuchmorecomplicatedasymptoticvariance.
Proposition5.
Undermodel(4.2.1)withheteroscedasticnoise,ifAssumption1inthe
appendixholds,wehave
p
n
(
^

(
de
)
j


0
j
)
d
!
N
(0
;˙
2
lasso
)(4.2.7)
wheretheasymptoticvarianceisdasfollows
˙
2
lasso
=lim
n
!1

jj

2

j;
n
j


1
n
j;
n
j

j;
n
j
+

j;
n
j


1
n
j;
n
j

n
j;
n
j


1
n
j;
n
j

n
j;j
(
˙
jj


j;
n
j


1
n
j;
n
j

n
j;j
)
2
:
(4.2.8)
Suchcomplexasymptoticvariance(4.2.8)makesithardtouseWaldtypeinference
procedureinpracticesinceitisytogetagoodestimatefortheasymptoticvariance.
ThusnaivelyusingtheWaldtypetestprocedureproposedby[ZZ14]intheheteroscedastic
casewillleadtoinvalidresults,whichwillbedemonstratedinthesimulationstudyinSection
4.5.
4.2.2KFCProjection
[LZL
+
13]proposedanotherwaytoselecttheprojectionspace,whichisbasedontheso
calledKFCset
S
=
f
l
6
=
j
:
j
˙
jl
j
>c
g
forsomepre-spthresholdvalue
c>
0.That
isessentiallythesetofallkeyconfoundersassociatedwith
X
j
.Andthentheestimatorcan
99
beobtainedbytheprojectionwithrespecttothecovariatesindexedby
S
,
^

(kfc)
j
=
X
|
j
Q
S
Y
X
|
j
Q
S
X
j
=
~
X
|
j
~
Y
~
X
|
j
~
X
j
;
(4.2.9)
withthedresponseandtargetpredictoras
~
Y
=
Q
S
Y
;
~
X
j
=
Q
S
X
j
.
Basedonthede-biasidea,weproposethefollowingde-biasedKFCestimator
^

(kfc-de)
j
=
~
X
|
j
~
Y

P
k
2S

~
X
|
j
~
X
k
^

k
~
X
|
j
~
X
j
;
(4.2.10)
where
S

=
S
+
c
,i.e.thecomplementof
S
+
:=
f
1
g[S
,and
^

S

isaninitialestimator.
Infact,theabovede-biasedKFCestimatoristhesolutiontotheestimatingequation
basedonthepopulationsubject

ij;
S

i
:=

X
ij

E(
X
ij
j
X
i
S
)

Y
i

X
|
i

0

,thatis
n
X
i
=1
m
(kfc)
ni
(

j
):=
n
X
i
=1
(
~
Y
i

~
X
ij

j

~
X
|
i
S

^

S

)
~
X
ij
=0
;
(4.2.11)
where
m
(kfc)
n
(

0
j
)canbedecomposedas
m
(kfc)
ni
(

0
j
)=

i

ij;
S
+


j
S


1
SS
X
i
S

X
ij

X
|
i
S
(
X
|
S
X
S
)

1
X
|
S

+


i

X
|
i
S
(
X
|
S
X
S
)

1
X
|
S


j
S


1
SS
X
i
S

X
|
j
X
S
(
X
|
S
X
S
)

1
X
i
S

+

X
ij

X
|
j
X
S
(
X
|
S
X
S
)

1
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


[

0
S


^

S

]
:
Wedenotethetermas
W
(kfc)
ni
andalltheothersaredenotedby
R
(kfc)
ni
.Andfor
simplicityweassumethenormalityof
X
i
˘
N(
0
;

)fortheKFCprojectionsection.Now
100
W
(kfc)
ni
=
f

i
(
X
ij


j
S


1
SS
X
i
S
)
g
n
i
=1
areIIDwithE
W
(kfc)
ni
=0and
E[(
W
(kfc)
ni
)
2
]=E


2
i
(
X
ij


j
S


1
SS
X
i
S
)
2

=E
f

2
i
(
X
2
ij

2
X
ij

j
S


1
SS
X
i
S
+

j
S


1
SS
X
i
S
X
|
i
S


1
SS

S
j
)
g
=E
f
Z
2
ij

2
Z
ij

j
S


1
SS
Z
i
S
+

j
S


1
SS
Z
i
S
Z
|
i
S


1
SS

S
j
g
=

jj

2

j
S


1
SS

j
S
+

j
S


1
SS

SS


1
SS

S
j
:
Notethatifweassumeindependencebetween

i
and
X
i
,wehaveE[(
W
(kfc)
ni
)
2
]=
˙
2

(
˙
jj


j
S


1
SS

S
j
).
Thusifweassumeindependencebetween

i
and
X
i
,wehavethesimpleasymptotic
variancefor
^

(kfc-de)
j
,
˙
2

=
(
˙
jj


j
S


1
SS

S
j
)asdiscussedin[LZL
+
13].Butundermodel
(4.2.1)withheteroscedasticerrorterm,wehavethefollowingasymptoticnormalitywith
morecomplicatedvariance.
Proposition6.
UndertheAssumption3intheappendix,wehave
p
n
(
^

(
kfc-de
)
j


0
j
)
d
!
N
(0
;˙
2
kfc
)
;
(4.2.12)
wheretheasymptoticvarianceisdas
˙
2
kfc
=lim
n
!1
(

jj

2

j
S


1
SS

j
S
+

j
S


1
SS

SS


1
SS

S
j
)
=
(
˙
jj


j
S


1
SS

S
j
)
2
:
(4.2.13)
Similarly,sincetheexpression(4.2.13)fortheasymptoticvarianceisreallycomplicated,
whichmakessuchWaldtypestatistichardtouseinpractice.
101
4.2.3InverseProjection
Sofarweconstructestimatorsforthetargetcotparameter

j
directly.However,to
conductthehypothesistestingproblem(4.2.2),[LL14]proposedanequivalenttestbasedon
theprojectionof
X
ij
onto(
Y
i
;
X
|
i;
n
j
)
|
,
X
ij
=(
Y
i
;
X
|
i;
n
j
)

0
j
+

ij;y
;
(4.2.14)
where

ij;y
E

ij;y
=0
;
Cov(

ij;y
;
(
Y
i
;
X
|
i;
n
j
))=
0
.Underthelinearmodel(4.2.1)
withheteroscedasticnoise,aslongasCov(
X
i
;
)=
0
,wecanstillshowthatthevector

0
j


0
j
=

˙
2

j;y


0
j
˙
2

;

0
j

0
|
n
j
˙
2

+

n
j;j

|
,where
˙
2

j;y
=Var(

ij;y
)=((

0
j
)
2
+
w
jj
)

1
with

=


1
=((
w
jk
)).BecauseCov(

i
;
X
i
)=
0
,wehave
Cov(

i
;
ij;y
)=

0
j
1
Cov(

i
;

Y
i
)=

˙
2

j;y

0
j
:=

b
0
j
:
(4.2.15)
Hencethetest(4.2.2)isequivalentto
H
0
:
b
0
j
=0.Basedontheideaproposedin[LL14],
wecanhavetheestimationfor
b
0
j
^
b
j
=

1
n
n
X
i
=1

Y
i

X
|
i
^


X
ij

(
Y
i
;
X
|
i;
n
j
)
^

j

(4.2.16)
where
^

and
^

j
aresomeinitialestimatorsfor

0
and

0
j
.
Observethat
^
b
j
isthesolutiontotheestimatingequationbasedon

ij;y

i
+
b
0
j
=

X
ij

E(
X
ij
j
X
i;
n
j
;Y
i
)

Y
i

X
|
i

0

+
˙
2

j
;y

0
j
,thatis
n
X
i
=1
m
(inv)
ni
(
b
j
):=
n
X
i
=1

Y
i

X
|
i
^


X
ij

(
Y
i
;
X
|
i;
n
j
)
^

j

+
n
b
j
=0
;
(4.2.17)
102
andalsobysimplealgebra,wehave
m
(inv)
ni
(
b
0
j
)=
f

i

ij;y
+
b
0
j
g
|
{z
}
W
(inv)
ni
+

i
(
Y
i
;
X
|
i;
n
j
)(

0
j

^

j
)+
X
|
i
(

0

^

)

X
ij

(
Y
i
;
X
|
i;
n
j
)
^

j

|
{z
}
R
(inv)
ni
:
Withsimplecalculations,wehaveE(
W
ni
)=0and
Var(
W
ni
)=Var(

i

ij;y
)=Var(

i
(
X
ij

X
|
i

0

0
j
1


i

0
j
1

X
|
i;
n
j

0
j;
n
1
))
=

jj
+(

0
j
1
)
2

0
|


0
+(

0
j
1
)
2

+

0
|
j;
n
1

n
j;
n
j

0
j;
n
1

2

0
j
1

0
|


;j

2

0
j
1
$
j

2

0
|
j;
n
1

n
j;j
+2(

0
j
1
)
2

0
|
$
+2

0
j
1

0
|


;
n
j

0
j;
n
1
+2

0
j
1

0
|
j;
n
1
$
n
j
:=
˙
2
n;
inv
:
Notethatifweassumetheindependencebetween

i
and
X
i
,wehavethefollowing
varianceexpression.Since
X
ij
=
X
|
i

0

0
j
1
+

i

0
j
1
+
X
|
i;
n
j

0
j;
n
1
+

ij;y
andCov(

i
;
X
i
)=0,
wehaveCov(

i
;
i

0
j
1
+

ij;y
)=0,i.e.


0
j
1
Var(

i
)=Cov(

i
;
ij;y
).Hence
Var(
W
ni
)=Var(

i
(

ij;y
+

i

0
j
1
)


2
i

0
j
1
)
=Var(

i
)Var(

ij;y
)+(

0
j
1
)
2
(Var(

2
i
)

Var
2
(

i
))
:
IffurthermoreweassumenormalityfortheerrortermthenwehaveVar(

2
i
)

Var
2
(

i
)=
E(

4
i
)

2[E(

2
i
)]
2
=3
˙
4


2
˙
4

=Var
2
(

i
),whichleadstothesameresultinTheorem3.1
103
from[LL14],i.e.
Var(
W
ni
)=Var(

i
)Var(

ij;y
)+(

0
j
1
)
2
(Var(

2
i
)

Var
2
(

i
))
=Var(

i
)Var(

ij;y
)+[Cov(

i
;
ij;y
)]
2
=
˙
2

˙
2

j
;y
+(

0
j
1
)
2
˙
4

=
˙
2

˙
2

j
;y
+(

0
j
)
2
˙
4

j
;y
;
whichismorelikelytobeestimable.
Butwecanstillgettheasymptoticnormalityasstatedinthefollowingproposition.
Proposition7.
UnderAssumption2intheappendix,wehave
p
n
(
^
b
j

b
0
j
)
d
!
N
(0
;˙
2
inv
)(4.2.18)
where
˙
2
inv
=lim
n
!1
˙
2
n;
inv
.
Butweseethattheasymptoticvarianceof
^
b
j
istoowaycomplicated,whichmakessuch
Waldtypestatisticshardtouseinpracticewithheteroscedasticnoise.
4.3EmpiricalLikelihoodBasedApproach
Toavoidthecomplexityofestimatingasymptoticvarianceunderheteroscedasiticcase,we
proposeELbasedapproach.NotethattheabovethreeproceduresinSections4.2.1,4.2.2
and4.2.3correspondtothreeestimatingequations(4.2.6),(4.2.11)and(4.2.17)oftheform
m
n
(
X
i
;Y
i
;
j
;
^

n
j
;
^

),wherethenuisanceparameters

n
j
andtheothernuisanceparameters
denotedas

replacedbytheirestimators
^

n
j
and
^

.Tokeepitsimple,wewrite
m
ni
(

j
)=
m
n
(
X
i
;Y
i
;
j
;
^

n
j
;
^

)ingeneral.
Notethattheestimatingequations(4.2.6),(4.2.11)and(4.2.17)havethesamestructure,
104
i.e.thetermisthepopulationlevelterm,whichwillbeshowntobedominantand
asymptoticallynormal,whiletheothertermsareallaboutestimationerrors,whichneed
tobecontrolled.Weproposethefollowinggeneralframeworkbyassumingtheestimating
equationsevaluatedatthetruth

0
j
canbedecomposedasfollows,
m
ni
(

0
j
):=
m
n
(
X
i
;Y
i
;
0
j
;
^

n
j
;
^

):=
W
ni
+
R
ni
(4.3.19)
where
f
W
ni
g
n
i
=1
whichareIIDand
f
R
ni
g
n
i
=1
needtosatisfythefollowingconditions:
(C0)
P

min
1

i

n
m
ni
<
0
<
max
1

i

n
m
ni

!
1;
(C1)
W
ni
'sareIIDwithmean0andvariance
˙
2
n
with
˙
2
n
!
˙
2
w
;
(C2)
1
p
n
P
n
i
=1
R
ni
=
o
p
(1)andmax
1

i

n
j
R
ni
j
=
o
p
(
n
1
=
2
).
Accordingto[Owe01],withestimatingequations,wecanconstructempiricallikelihood
tomaketheinference.thefollowingempiricallikelihoodratiofunctionofthetarget
parameter

j
EL
n
(

j
)=max
n
n
Y
i
=1
np
i
:
p
i
>
0
;
n
X
i
=1
p
i
=1
;
n
X
i
=1
p
i
m
ni
(

j
)=0
o
:
(4.3.20)
Underthisframeworkwiththeabovegeneralconditions,wehavethefollowingpow-
erfulWilkstheorem.
Theorem5.
If(C0)-(C2)hold,then

2log
EL
n
(

0
j
)
d
!
˜
2
1
:
105
BasedonTheorem5,anasymptotic

leveltestisgivenbyrejecting
H
0
if

2logEL
n
(

0
j
)
>
˜
2
1

where
˜
2
1

istheupper

quantileof
˜
2
1
.Wecanalsoconstructa(1


)100%
intervalfor

j
asCI

=
f

j
:

2logEL
n
(

j
)
<˜
2
1

g
.Sincetheasymptoticdistributionis
chi-square,wedonotneedtoestimateanyadditionalparameters,suchastheasymptotic
variance.
4.4TheoreticalExamples
ThissectionoutlinesthreeexamplesaswediscussedaboveinSections(4.2.1),(4.2.2)and
(4.2.3)todemonstrateinterestingandpowerfulapplicationsofTheorems5.Weneedto
checktheconditions(C0)-(C2)fortheseproblems.
FromProposition5,6and7,weseethatWaldtypeinferenceprocedureishardto
implementduetothecomplexasymptoticvariance.Fortunatelywedonotneedtoestimate
thatvarianceinordertoconductinferencebyusingtheselfstudentizedELprocedure.And
infact,wealreadyvcondition(C1)forthethreeproceduresinSection(4.2.1),(4.2.2)
and(4.2.3),respectively.Wecancontrolthesecondterm
R
ni
sundercertainassumptions,
whichleadstothefollowingtheorems.
4.4.1LassoProjection
TheexampleisaboutusingLassoestimationtogetthelowdimensionalprojectionas
wediscussedinSection4.2.1.
Theorem6.
UndersometypicalconditionsfortheinitialestimatorsasinAssumption1in
theappendixandassumethat
X
i
and

i
arebothsub-Gaussian.Aslongas
s
log
p=
p
n
=
o
(1)
,
theconditions(C0)and(C2)canbed.Assume
˙
2
n;
lasso
!
˙
2
lasso
forsome
˙
2
lasso
<
1
,
106
andthenwehave

2log
EL
(
lasso
)
n
(

0
j
)
d
!
˜
2
1
:
Noticethatunderthehomoscedasticnoisecase,[ZZ14]and[vdGBR13]usedtheWald
typeteststatisticfortesting
H
0
basedonthesameestimationequationasweusedhere.And
in[NL14],withthesameestimatingequation,theyinsteadproposedtheScoreteststatistic
fortesting
H
0
.Althoughtheyareasymptoticallyequivalent,thebetweenthese
twocanbefoundin[NL14].Weareusingthesameestimatingequationtoconstructthe
likelihoodratiotypestatisticfortesting
H
0
.Sinceweareusingempiricallikelihood,it
notonlyenjoystheWilksphenomenon,butalsohasotherniceproperties,suchasthe
shapeoftheintervalisdatadrivenandourprocedureismorerobusttothe
distributionassumptionfortheerrortermsinceitonlyrequiresmomentassumptions.The
keyadvantageofourmethodisthatweallowheteroscedasticityfortheerrortermduetothe
selfstudentizationpropertyoftheempiricallikelihood.Pleaserefertotheempiricalstudies
inthesimulationsectionfortheperformancecomparisonofourmethodwiththeWaldtype
testandScoretest.
4.4.2InverseProjection
Thesecondexampleisaboutusinginverseregressiontogetthelowdimensionalprojection
aswediscussedinSection4.2.3.
Theorem7.
UndersomeconditionsfortheinitialestimatorsasinAssumption2inthe
appendix,andassume
(
X
|
i
;
i
)
|
issub-Gaussian.Aslongas
s
log
p=
p
n
=
o
(1)
,thecondi-
tions(C0)and(C2)canbed.Assume
˙
2
n;
inv
!
˙
2
inv
forsome
˙
2
inv
<
1
,andthen
107
wehave

2log
EL
(
inv
)
n
(
b
0
j
)
d
!
˜
2
1
:
Notethatsincewearedoinganequivalenttest,fromthisinferenceprocedure,wecan
notgettheintervalfor

0
j
.
4.4.3KFCProjection
ThethirdexampleisabouttheprojectionbyselectingtheKFCsetaswediscussedinSection
4.2.2.
Theorem8.
UnderAssumption3intheappendix,theconditions(C0)and(C2)canbe
d.Assume
˙
2
n;
kfc
!
˙
2
kfc
forsome
˙
2
kfc
<
1
,andthenwehave

2log
EL
(
kfc
)
n
(

0
j
)
d
!
˜
2
1
:
AbouttheKFCsetselection,weproposethefollowingprocedure.Basedonnormality
assumptionofthepredictors,wehavethewellknownconditionaldistributionresultforany
givesubset
S
:
ˆ
jk
(
S
):=Corr(
X
ij
;X
ik
j
X
i
S
)=
˙
jk


|
S
j


1
SS

S
k
:
Thesamplepartialcorrelationcanbeevaluatedby,^
ˆ
jk
(
S
)=
~
X
|
j
~
X
k
=n
.Fortestingwhether
apartialcorrelationiszeroornot,wecouldapplyFisher'sz-transformation
^
F
jk
=
1
2
log
(
1+^
ˆ
jk
(
S
)
1

^
ˆ
jk
(
S
)
)
:
Classicaldecisiontheoryyieldsthefollowingrulewhenusingthelevel

.Reject
108
thenullhypothesis
H
0
:
ˆ
jk
(
S
)=0againstthetwo-sidedalternative
H
a
:
ˆ
jk
(
S
)
6
=0if
p
n
jSj
3
j
^
F
jk
j
>z
1

=
2
:
Sowecouldthenselectthesmallestsizeof
S
suchthat
max
k
2S

p
n
jSj
3
j
^
F
jk
j
<z
1

=
2
:
AndinordertomakethisKFCsetselectionmorestable,weadoptthestabilityselection
proposedby[MB10]and[SS13].Accordingto[SS13],wesplitthedataintohalffor
B
times
andselecttheKFCsetwithvariablesshownatleast50%ofthose2
B
KFCsets.
4.5SimulationStudies
Inthissection,weconductsimulationstudiestoinvestigatethesampleperformance
oftheproposedempiricallikelihoodratiotest,aswellascomparingtheperformancesfor
testimatingequationsproposedintheexistingliterature.Inparticular,togen-
eratethecovariates,wesimulate
n
=200
;
400independentsamplesfromamultivariate
GaussiandistributionN
p
(
0
;

)where
p
=100
;
200
;
500.Weconsider3tcovari-
ancematrices

=((
˙
jk
)),bandedmatrixwith
˙
jk
=
ˆ
j
j

k
j
1
(
j
j

k
j
<
2),Toeplitzma-
trixwith
˙
jk
=
ˆ
j
j

k
j
andblockdiagonalmatrixwithunitblock
0
B
B
B
B
B
@
1
ˆˆ
2
ˆ
1
ˆ
ˆ
2
ˆ
1
1
C
C
C
C
C
A
,where
ˆ
=0
:
2
;
0
:
5.Weconsiderescenariosfortheerrordistribution,standardnormalN(0
;
1),
mixturenormaldistribution0
:
7N(0
;
1)+0
:
3N(0
;
5
2
),tdistributionwithdegreesoffree-
109
dom3,andtwoheteroscedasticdistributions0
:
7
X
1
Z
and
1
p

1
X
1
Z
P
p
j
=2
X
j

1
X
j
where
Z
˘
N(0
;
1)independentof
X
.Notethatforthetwoheteroscedasticdistributions,wehave
Cov(
X
;
)=E(

X
)=
0
,although

isnotindependentwith
X
.Fortheheteroscedastic
casetheconditionalvarianceonlydependsonalowdimensionalcovariatesandthecon-
ditionalvarianceforthesecondheteroscedasticcasedependsonthetheentirevectorof
covariates.Thetruecots

0


0
1
=0
;
0
:
1
;
0
:
2
;
0
:
3
;
0
:
4
;
0
:
5(0forthesizeand
othersforthepoweranalysis),

0
4
=1
:
5
;
0
7
=2andallothersare0.Ourgoalistotest
H
0
:

0
1
=0
;
v.s.
H
1
:

0
1
6
=0
:
Thenumberofsimulationsis500.
Fortheinitialestimatorssuchas
^

,
^

1
and
^
w
1
,wejustusethescaledLasso[SZ12],
whichhastheadvantageofbeingtuninginsensitive.\EL-KFC"correspondstotheKFC
Projectionexample,\EL-INV"correspondstotheInverseProjectionexample,and\EL-
LASSO"correspondstoLassoProjectionexample.And\Wald"correspondstotheWald
typetestasproposedin[ZZ14]and[vdGBR13],while\Score"correspondstotheScoretype
testasproposedin[NL14]withLassoestimationfor
^
w
1
.
Andforthe\EL-KFC",inordertostabilizetheKFCsetselection,weusedthestability
selectionprocedurethroughsub-samplingproposedby[MB10]and[SS13].Accordingto
[SS13],wesplitthedataintohalffor10timesandselecttheKFCsetwithvariables
shownatleast50%ofthose20KFCsets.
Forillustration,weonlyshowsomeofthecaseshere.InTable4.1withToeplitzmatrix
with
ˆ
=0
:
2asthecovariancematrixforthepredictorsandstandardnormalerror,wecan
seethatalloftheprocedureshasreasonablywellcontrolledtypeIerroraround

levelat
110
5%.Andfortheempiricallikelihoodbasedapproachwithtestimatingequations,
theyhaveprettymuchsimilarpowerperformance.Aninterestingcomparisonamongthe
holytrinity,i.e.Waldtypetest,Scoretestandthelikelihoodratiotest,whichcorrespond
tothelastthreesectionsinTable4.1,showsthatthelikelihoodratiotesthasoverallbetter
powerperformancethantheothertwo,especiallyinthelowsamplesizesituation.
Themostexcitingpartisabouttheheteroscedasticity.InTable4.2,wesimulatethe
predictorswiththeToeplitzcovariancematrixwith
ˆ
=0
:
2andtheheteroscedasticnoise
0
:
7
X
1
N(0
;
1).Underthiscase,wecouldseeclearlythatalloftheempiricallikelihoodbased
inferenceprocedures,whichcorrespondstothefoursectionsinTable4.2,arevalid,i.e.
theyhavereasonablywellcontrolledtypeIerror.Thesameresultsarealsodemonstrated
inFigure4.1afor
p
=100.ButfortheWaldtypetestandScoretest,theirtypeIerrorsare
largelywhichindicatesthatthesetwoproceduresareinvalid.Wecanclearlyseethe
patternsinFigure4.1bfor
p
=100.Fortheotherheteroscedasticnoisewithconditionalerror
variancedependingonhighdimensionalcovariates,thatis
1
p

1
X
1
P
p
j
=2
X
j

1
X
j
N(0
;
1),we
canobservesimilarperformancesinTable4.3,aswellasinFigure4.2for
p
=500.This
showstheadvantageoftheempiricallikelihoodbasedinferenceprocedures.
111
(a)EmpiricalLikelihoodBased
(b)HolyTrinity
Figure4.1:
EmpiricalSizeandPowerComparisonamongEmpiricalLikelihood
basedapproachesandamongHolyTrinityand
p
=100
.
(a)\EL-KFC"representsEL
approachwithKFCprojection,\EL-INV"representsELapproachwithinverseprojection
and\EL-LASSO"representsELapproachwithLassoprojection;(b)\Wald"represents
Waldtypetest,\Score"representsScoretestand\EL"representslikelihoodratiotest.
112
(a)EmpiricalLikelihoodBased
(b)HolyTrinity
Figure4.2:
EmpiricalSizeandPowerComparisonamongEmpiricalLikeli-
hoodbasedapproachesandamongHolyTrinitywithHeteroscedasticNoise
1
p

1
X
1
P
p
j
=2
X
j

1
X
j
N
(0
;
1)
and
p
=500
.
(a)\EL-KFC"representsELapproach
withKFCprojection,\EL-INV"representsELapproachwithinverseprojectionand\EL-
LASSO"representsELapproachwithLassoprojection;(b)\Wald"representsWaldtype
test,\Score"representsScoretestand\EL"representslikelihoodratiotest.
113
Table4.1:
Powercomparison.
Covariate:Toeplitzmatrixwith
ˆ
=0
:
2;Error:N(0
;
1).

0
1
Methodpn00
:
10
:
20
:
30
:
40
:
5
EL-KFC1002000.0540.3040.7600.98411
4000.0520.4820.9641.00011
2002000.0520.2940.7620.97611
4000.0440.4600.9801.00011
5002000.0640.2920.7600.97211
4000.0400.4880.9721.00011
EL-INV1002000.0400.2960.7480.98411
4000.0540.4700.9621.00011
2002000.0440.2900.7740.97611
4000.0380.4580.9801.00011
5002000.0480.2760.7840.97811
4000.0340.4900.9721.00011
EL-LASSO1002000.0520.3120.7700.99011
4000.0540.4900.9701.00011
2002000.0480.3080.7860.98211
4000.0380.4620.9781.00011
5002000.0560.3000.7880.98011
4000.0420.5120.9761.00011
Wald1002000.0480.2660.7480.9641.0001
4000.0480.5020.9701.0001.0001
2002000.0640.2700.7420.9721.0001
4000.0380.4860.9781.0001.0001
5002000.0520.2840.7940.9680.9981
4000.0400.4860.9781.0001.0001
Score1002000.0500.2640.7460.9621.0001
4000.0520.4800.9661.0001.0001
2002000.0620.2680.7400.9701.0001
4000.0400.4740.9781.0001.0001
5002000.0620.2720.7940.9700.9981
4000.0380.4980.9761.0001.0001
114
Table4.2:
Powercomparison.
Covariate:Toeplitzmatrixwith
ˆ
=0
:
2;Error:
0
:
7
X
1
N(0
;
1).

0
1
Methodpn00
:
10
:
20
:
30
:
40
:
5
EL-KFC1002000.0620.2440.6240.9240.9861.000
4000.0400.3660.9160.9981.0001.000
2002000.0700.2300.6520.9200.9901.000
4000.0760.3500.8900.9901.0001.000
5002000.0600.2540.6360.9000.9860.996
4000.0580.4020.9020.9921.0001.000
EL-INV1002000.0580.2300.6200.9100.9861.000
4000.0400.3560.9180.9981.0001.000
2002000.0580.2220.6520.9100.9881.000
4000.0660.3420.8800.9901.0001.000
5002000.0600.2360.6240.8980.9800.996
4000.0500.4020.9020.9921.0001.000
EL-LASSO1002000.0560.2440.6340.9220.9881.000
4000.0460.3760.9261.0001.0001.000
2002000.0620.2320.6680.9260.9901.000
4000.0720.3560.8900.9881.0001.000
5002000.0680.2500.6400.9120.9860.996
4000.0520.4120.9020.9921.0001.000
Wald1002000.2560.4960.8600.98611
4000.2100.7060.9861.00011
2002000.2340.4640.8480.98011
4000.2360.6800.9681.00011
5002000.2080.5160.8740.97811
4000.2340.7360.9861.00011
Score1002000.2560.4900.8600.98611
4000.2180.7000.9861.00011
2002000.2340.4700.8460.98011
4000.2340.6720.9681.00011
5002000.2040.5180.8700.97811
4000.2300.7280.9841.00011
115
Table4.3:
Powercomparison.
Covariate:Toeplitzmatrixwith
ˆ
=0
:
2;Error:
1
p

1
X
1
P
p
j
=2
X
j

1
X
j
N(0
;
1).

0
1
Methodpn00
:
10
:
20
:
30
:
40
:
5
EL-KFC1002000.0660.8860.998111
4000.0480.9881.000111
2002000.0760.9321.000111
4000.0680.9881.000111
5002000.0600.9421.000111
4000.0541.0001.000111
EL-INV1002000.0620.8720.998111
4000.0380.9881.000111
2002000.0740.9361.000111
4000.0640.9881.000111
5002000.0560.9381.000111
4000.0421.0001.000111
EL-LASSO1002000.0660.8760.998111
4000.0460.9881.000111
2002000.0780.9341.000111
4000.0640.9881.000111
5002000.0640.9441.000111
4000.0461.0001.000111
Wald1002000.2220.9821111
4000.2141.0001111
2002000.2440.9901111
4000.2140.9981111
5002000.2600.9901111
4000.2401.0001111
Score1002000.2260.9841111
4000.2081.0001111
2002000.2360.9901111
4000.2060.9981111
5002000.2600.9901111
4000.2321.0001111
116
4.6RealDataAnalysis
Microarrayexpressionexperimentsandarray-basedcomparativegenomichybridization(ar-
rayCGH)experimentshavebeenconductedformorethan170primarybreasttumorspec-
imensinafewrecentbreastcancercohortstudies,collectedatmultiplecancercenters
[FFS10].
Weusedatotalof172tumorsampleswithbothcDNAexpressionmicroarrayandCGH
arraydata.Inourstudy,weusedthecopynumberalterationintervals(CNAIs),which
areasbasicCNAunits(genomeregions)inwhichallgenestendtobeor
deletedsimultaneouslyinasample.ForeachCNAIineachsample,themeanvalueofthe
estimatedcopynumbersofthegenesfallingintothisCNAIwascalculated.Thisresulted
ina172(samples)by384(CNAIs)numericmatrix.Afterglobalnormalizationforeach
expressionarray,wefocusedonasetof654breastcancerrelatedgenes,whichwasderived
basedonsevenpublishedbreastcancergenelists.Thisresultedina172(samples)by654
(genes)numericmatrix.Seemodeldetailsaboutthedataprocessingin[PZB
+
10].
Ourstudytendstorevealthesubtleandcomplicatedregulatoryrelationshipsamong
DNAcopynumbersandRNAtranscriptlevels.ThedependenceofRNAlevelsonDNAcopy
numberscanbemodeledthroughastraightforwardmultivariatelinearregressionmodelwith
theRNAlevelsasresponsesandtheDNAcopynumbersaspredictors.Whilemultivari-
atelinearregressioniswellstudiedinstatisticalliterature,thecurrentproblembearsnew
challengesduetohigh-dimensionalityintermsofbothpredictorsandresponses.Wewill
adoptsomedimensionreductionproceduresfortheRNAexpressionsfollowedbysignt
associationdetectusingtheproposedmethods.
117
4.6.1WGCNAofcorrelatedgenes
Inordertodealwiththecorrelationpatternsamonggenesacrossmicroarraysamples,we
adoptedtheWeightedGeneCo-expressionNetworkAnalysis(WGCNA)[LH08]whichcan
beusedforclusters(modules)ofhighlycorrelatedgenes.
ByusingofWGCNAwith
minModuleSize=10
,weiden5moduleslabeled1through
5inorderofdescendingsizeaslistedinTable4.4.Thelabel0isreservedforgenesoutsideof
allmodules.AndwecanseethatinFigure4.3athereareprettyclear5modulesclustered.
Table4.4:
ModuleSizes.
module012345
size20431653332523
Forsummarizingsuchcluster,weusethemoduleeigengenebyconductingPCAfor
eachoftheemodulestoselecttheprincipalcomponentasourresponsetodothe
associationanalysis.Butsincewehavemissingvaluesintheexpressiondatamatrix,we
imputethemissingvaluesbyusing
impute.knn
[HTS
+
99].Andwechoose
prcomp
in
R
to
performPCA.
4.6.2Test
AfterdoingWGCNA,imputationandPCA,wehaveprincipalcomponentforeachof5
modules.Weregressedeachmoduleeigengeneontothepredictors(CNAIs,totallywehave
384)separatelytoconductthesinglecotstudy.
Forillustration,wejustdemonstratetheresultsforModule3andseeothersinthe
appendix.InFigure4.3b,wecouldseethatalthoughtmethodshavetpower
118
performance,thespotsareverymuchconsistentalloverthemethods.Andthe
Scoremethodiskindofpowerlessinthisdataanalysisamongall6tmethods.
Foreachinferenceproceduretestingforallcovariates,wecangetasequenceofp-values
f
p
j
g
p
j
=1
.Withtheorderedp-values,
p
(1)

p
(2)

p
(
j
)

p
(
p
)
,weadoptthe
Benjamini-Hochberg(BH)algorithm:foravalue

=0
:
01%,let
j
max
bethelargest
indexforwhich
p
(
j
)

j
p

,andreject
H
0(
j
)
,thenullhypothesiscorrespondingto
p
(
j
)
,if
j

j
max
,accepting
H
0(
j
)
otherwise.TakeModule3asanexample,wefoundthatallof
theempiricallikelihoodbasedapproachesdetectedoneconsistentsignalwhichisthe269-th
CNAI,onChromosome15withCytoband\15q11.2-15q11.2".
(a)Clustering
(b)ManhattanPlot
Figure4.3:
BreastCancerCohortStudies.
(a)Clusteringdendrogramofgenes,with
dissimilaritybasedontopologicaloverlap,togetherwithassignedmodulecolors.(b)Man-
hattanplotforModule3.
119
4.6.3PresenceofHeteroscedasticity
Wetestforthepresenceofheteroscedasticityinthisdatasetforeachofthe654genesusing
theGoldfeld-Quandttest[GQ65].TheGoldfeld-Quandttestisoneofthemostwidelyused
testforheteroscedasticity.Itcomparesthevariancesoftwosubmodelsdividedbyasp
breakpointandrejectsifthevariancesTheGoldfeld-Quandttestisnotdirectly
applicablewhen
p>n
.Toreducethedimensions,weapplythe
p
LassotoselectCNAIs
thatarepredictiveofgeneexpressionlevelsandCNAIsthatareexplanatoryofvariability.
ThesevariablesarethenappliedintheGoldfeld-Quandttesttospecifypredictorsonthe
response.Sincethe
p
Lassoisnotthatsensitivetotheselectionofthetuningparameter
andwearealsodurabletoselectmorevariables,wejustsetthetuningparametertobe
p
log
p=n
.
Wefoundthat19outof654genesdemonstrateheteroscedasticityatthe
level0
:
05
=
654.Thepresenceofheteroscedasticityforthesegenessuggeststheneedtouse
ourmethodforidentifyingtheCNAIsthatareassociatedwithgeneexpression.Asfurther
evidencefortheexistenceofheteroscedasticity,weapplythe\wanderingschematicplot"
[Tuk77].Thisslicesthepredictedvalueintobinsandusesm-lettersummaries(generaliza-
tionsofboxplots)toshowthelocation,spread,andshapeoftheresidualsforeachbin.The
m-letterstatisticsarefurthersmoothedinordertoemphasizeoverallpatternsratherthan
chancedeviations.Figure4.4presentsthe\wanderingschematicplots"forgenesPDK3
(Chr23),TPST2(Chr22),ELF3(Chr1)andSNRPE(Chr22),whicharethetop4genes
fortheheteroscedasticity.
120
4.6.4ResultsforTop4GeneswithHeteroscedasticity
WeapplyourEmpiricalLikelihoodbasedapproachtothefourgenesdiscussedintheprevious
sectionanddemonstratedinFigure4.5andcompareitsperformanceswiththoseofWald
typetestandScoretypetest.Forexample,weusegeneTPST2onChr22asshownin
Figure4.5bfordemonstration.Foreachinferenceproceduretestingforallcovariates,we
cangetasequenceofp-values
f
p
j
g
p
j
=1
.Withtheorderedp-values,
p
(1)

p
(2)

p
(
j
)

p
(
p
)
,weadopttheBenjamini-Hochberg(BH)algorithmtomakethedecision.
Asaresult,wefoundthatonlyEL-INVandEL-LASSOcandetectsignalsandallofthe
otherproceduresfoundnothingt.MoreoverEL-INVandEL-LASSOfoundtwo
consistentsignalsatthe305-thCNAIand307-thCNAI,bothofwhichareonChromosome
17withCytoband\17q12-17q12".
121
(a)
(b)
(c)
(d)
Figure4.4:
WonderingSchematicPlotforTop4GeneswithHeteroscedasticity.
122
(a)
(b)
(c)
(d)
Figure4.5:
ManhattanPlotforTop4GeneswithHeteroscedasticity.
123
4.7TechnicalDetails
4.7.1AssumptionsforTheoreticalExamples
Assumption1.
(1)
Assumetheinitialestimator
^

satisfying
k
^


0
k
1
=
O
p
(
s
p
log
p=n
)
.
(2)
Supposetheinitialestimators
^
w
j
satisfy
max
1

j

p
k
^
w
j

w
0
j
k
1
=
O
p
(
a
n
)
,where
a
n
=
o
(1
=
p
log
p
)
.
(3)
Thepredictionerrorssatisfy
k
X
(
^


0
)
k
2
2
=n
=
O
p
(
s
log
p=n
)
and
max
1

j

p
k
X
n
j
(
^
w
j

w
0
j
)
k
2
2
=n
=
O
p
(
b
n
)
,where
X
n
j
isthedesignmatrix
X
withthe
j
-thcolumndeletedand
b
n
=
o
(1
=
p
n
)
.
(4)
X
i
and

i
areallsub-Gaussian.
(5)
s
log
p=
p
n
=
o
(1)
.
Remark6.
1.
With(4)that
X
i
and

i
areallsub-Gaussian,wehave
X
ik

i
sub-exponential
withE
(

i
X
ik
)=0
.ByBernsteininequality[Ver10]andunionboundinequality,we
have
P
(


1
n
n
X
i
=1
X
i

i


1

t
)

C
1
p
exp(

C
min(
t
2
=C
2
;t=C
3
)
n
)
:
Bytaking
t
=
C
0
q
log
p
n
forsomepositiveconstant
C
0
suchthat
CC
0
2
>C
2
,wehave
k
1
n
n
X
i
=1
X
i

i
k
1
=
O
p
(
r
log
p
n
)
:
(4.7.21)
2.
For

ij
=
X
ij

E
(
X
ij
j
X
i;
n
j
)
,wehave

ij
sub-gaussiansince
X
i
issub-gaussian.
Andforany
k
6
=
j
,wehaveE
(
X
ik

ij
)=
E
f
X
ik
[
X
ij

E
(
X
ij
j
X
i;
n
j
)]
g
=
E
f
X
ik
X
ij

124
E
[
X
ik
X
ij
j
X
i;
n
j
]
g
=0
.Similarly,wehaveforany
t>
0
and
1

j
6
=
k

p
,
P
(


1
n
n
X
i
=1
X
ik

ij


t
)

C
1
p
exp(

C
min(
t
2
=C
2
;t=C
3
)
n
)
;
whichleadsto


1
n
n
X
i
=1

ij
X
i;
n
j


1
=
O
p
(
r
log
p
n
)
:
(4.7.22)
3.
Forthepropertiesoftheinitialestimatorsin(1),(2)and(3)undertheheteroscedasitic
noisecase,wecanusethe
p
Lassoestimatorasin[BCW14].AccordingtoTheorem7
in[BCW14],wehavethatthe
p
Lassoestimatorsundercertainconditionshavethese
propertiesd.
Assumption2.
(1)
AssumethesameassumptionasLassoprojectioncasefortheinitial
estimator
k
^


0
k
1
=
O
p
(
s
p
log
p=n
)
.
(2)
AssumesimilarassumptionasLassoprojectioncasefortheinitialestimators
^

j
,i.e.
max
1

j

p
k
^

j


0
j
k
1
=
O
p
(
a
n
)
,where
a
n
=
o
(1
=
p
log
p
)
.
(3)
AssumesimilarassumptionasLassoprojectioncaseforthepredictionerrors,i.e.
k
X
(
^


0
)
k
2
2
=n
=
O
p
(
s
log
p=n
)
and
max
1

j

p
k
(
Y
;
X
n
j
)(
^

j


0
j
)
k
2
2
=n
=
O
p
(
b
n
)
and
b
n
=
o
(1
=
p
n
)
.
(4)
(
X
|
i
;
i
)
|
issub-Gaussian.
(5)
s
log
p=
p
n
=
o
(1)
.
Remark7.
Forthecondition(2)above,ifweassume
a
=max
1

j

p
s
j
with
s
j
=
k

0
j
k
0
andthenthe
p
Lassoestimatorsfor

0
j
satisfythisconditionwith
a
n
=
a
p
log
p=n
.Forthe
125
condition(3)above,sinceweassumethat
(
X
|
i
;
i
)
|
issub-Gaussian(whichmakes

0
|
X
i
alsosub-Gaussian),thenduetoCov
(

0
|
X
i
;
i
)=
E
(

i

0
|
X
i
)=0
,wehave

i

0
|
X
i
sub-
exponentialandbytheBernsteininequality,wehaveforany
t>
0
,
P
(


1
n
n
X
i
=1
X
|
i

0

i


t
)

2exp

C
1
n
min(
t
2
=C
2
2
;t=C
2
)
g
:
Thisalsoleadsto
1
n
n
X
i
=1
X
|
i

0

i
=
O
p
(
p
log
p=n
)
;
(4.7.23)
aslongas
log
p=n
!
0
.Andwiththesameargument,wehave
1
n
n
X
i
=1
X
ik

ij;y
=
O
p
(
p
log
p=n
)
;
(4.7.24)
1
n
n
X
i
=1
(
Y
i
;
X
|
i;
n
j
)

0
j

ij;y
=
O
p
(
p
log
p=n
)
:
(4.7.25)
Assumption3.
(1)
Fortheeigenvaluesof

,thereexistsomeconstants

min
and

max
suchthat
0
<
min
<
min
(

)


max
(

)
<
max
<
1
:
(2)
Assume
X
i
˘
N
(
0
;

)
and

i
tobesub-Gaussian.
(3)
AssumethesameastheLassoprojectionfortheinitialestimator
k
^


0
k
1
=
O
p
(
s
p
log
p=n
)
.
(4)
m
3
log
p=n
=
o
(1)
,
s
q
(log
p
)
2
m
3
n
=
o
(1)
,
s
r
(log
p
)
3
m
2
n
2
=
o
(1)
.
(5)
Assume
s
p
log
p
sup
S
:
jS
m
max
k
2S


˙
jk


j
S


1
SS

S
k


=
o
(1)
tocontrolthepartial
correlationbetweenthetargetcovariate
X
ij
and
X
i
S

.
126
4.7.2ProofofTheorems
ProofofTheorem5.
Asin[Owe01],by(C0),withprobabilitytendingto1,

2logEL
n
(

0
j
)=
2
P
n
i
=1
log(1+

ni
)where


n
X
i
=1
m
ni
1+

ni
=0
:
(4.7.26)
Thenextstepistoboundthemagnitudeof

.Let

=
j

j
u
where
u
=sign(

)
2
1
;
1
g
.
Nowby
P
n
i
=1
m
ni
1+

ni
=0,wehave
0=
n
X
i
=1
um
ni
1+

ni
=
n
X
i
=1
um
ni

1


ni
1+

ni

;
whichimplies
n
X
i
=1
um
ni
=
n
X
i
=1

2
ni
1+

ni
=
n
X
i
=1
j

j
m
2
ni
1+

ni
j

j
n
X
i
=1
m
2
ni
1+
j

j
max
1

i

n
j
m
ni
j
:
Thuswehave
u
1
n
n
X
i
=1
m
ni

j

j
1+
j

j
max
1

i

n
j
m
ni
j
1
n
n
X
i
=1
m
2
ni
:
whichimplies
j

j

1
n
n
X
i
=1
m
2
ni

(max
1

i

n
j
m
ni
j
)
u
1
n
n
X
i
=1
m
ni


u
1
n
n
X
i
=1
m
ni
:
(4.7.27)
127
From(C1),byLemma3in[Owe90],wehavemax
1

i

n
j
W
ni
j
=
o
p
(
n
1
=
2
),andtogetherwith
(C2),wehave
max
1

i

n
j
m
ni
j
=
o
p
(
n
1
=
2
)
:
(4.7.28)
Andsinceforany
>
0,
1
n˙
2
n
n
X
i
=1
E

W
2
ni
1
(
j
W
ni
j
>
p
n˙
n
)

=
˙

2
n
E

W
2
n
1
1
(
j
W
n
1
j
>
p
n˙
n
)

;
whereobviously
W
2
n
1
1
(
j
W
n
1
j
>
p
n˙
n
)
p
!
0duetoP(
j
W
n
1
j
>
p
n˙
n
)
!
0,wehaveby
DominatedConvergenceTheorem,
1
n˙
2
n
n
X
i
=1
E

W
2
ni
1
(
j
W
ni
j
>
p
n˙
n
)

!
0
:
ThusbyLindeberg-FellerCentralLimitTheorem,wehave
1
p
n
n
X
i
=1
W
ni
d
!
N(0
;˙
2
w
)
:
(4.7.29)
By(4.7.29)andtogetherwith(C2),wehave
1
p
n
n
X
i
=1
m
ni
=
1
p
n
n
X
i
=1
W
ni
+
1
p
n
n
X
i
=1
R
ni
=
1
p
n
n
X
i
=1
W
ni
+
o
p
(1)
d
!
N(0
;˙
2
w
)
:
(4.7.30)
Andby(C1)and(C2)wehave
1
n
n
X
i
=1
m
2
ni
=
1
n
n
X
i
=1
W
2
ni
+
1
n
n
X
i
=1
R
2
ni
+2
1
n
n
X
i
=1
W
ni
R
ni
=
1
n
n
X
i
=1
W
2
ni
+
o
p
(1)
!
˙
2
w
:
(4.7.31)
128
ActuallytheabovefollowsfromcheckingtheWLLNfortriangulararrays.Firstofall
P
n
i
=1
P(
W
2
ni
>n
)=
n
P(
W
2
n
1
>n
)

E

W
2
n
1
1
(
W
2
n
1
>n
)

!
0;and
n

2
n
X
i
=1
E

W
4
ni
1
(
W
2
ni

n
)

=
n

1
E

W
4
n
1
1
(
W
2
n
1

n
)

=
n

1
Z
n
0
2
y
P(
W
2
n
1
>y
)
dy
!
0
since
y
P(
W
2
n
1
>y
)

E(
W
2
n
1
1
(
W
2
n
1
>y
))
!
0as
y
!1
.
Thusby(4.7.27),(4.7.28),(4.7.30)and(4.7.31),wehave
j

j
(
1
n
n
X
i
=1
m
2
ni
+
o
p
(1))=
O
p
(
n

1
=
2
)
andhence
j

j
=
O
p
(
n

1
=
2
)
:
(4.7.32)
Thenitfollowsfrom(4.7.28),wehavemax
1

i

n


ni
1+

ni


=
o
p
(1).Therefore,from(4.7.26),
wehave
0=
1
n
n
X
i
=1

ni
1+

ni
=
1
n
n
X
i
=1

ni
n
1


ni
+
[

ni
]
2
1+

ni
o
=
1
n
n
X
i
=1

ni

[1+
o
p
(1)]
n
n
X
i
=1
[

ni
]
2
;
whichleadsto
1
n
n
X
i
=1

ni
=
[1+
o
p
(1)]
n
n
X
i
=1
[

ni
]
2
:
(4.7.33)
129
Againbyusing(4.7.26)andtogetherwith(4.7.30),wehave
0=
1
n
n
X
i
=1
m
ni
1+

ni
=
1
n
n
X
i
=1
m
ni
n
1


ni
+
[

ni
]
2
1+

ni
o
=
1
n
n
X
i
=1
m
ni


n
n
X
i
=1
m
2
ni
+
1
n
n
X
i
=1
m
ni
[

ni
]
2
1+

ni
=
1
n
n
X
i
=1
m
ni


n
n
X
i
=1
m
2
ni
+
O
p
n
max
1

i

n


m
ni
1+

ni


1
n
n
X
i
=1
[

ni
]
2
o
=
1
n
n
X
i
=1
m
ni


n
n
X
i
=1
m
2
ni
+
o
p
n
n
1
=
2

2
1
n
n
X
i
=1
m
2
ni
o
=
1
n
n
X
i
=1
m
ni


n
n
X
i
=1
m
2
ni
+
o
p
(
n

1
=
2
)
;
whichleadsto

=

1
n
n
X
i
=1
m
2
ni


1
1
n
n
X
i
=1
m
ni
+
o
p
(
n

1
=
2
)
:
(4.7.34)
Finally,byTaylorexpansiontogetherwith(4.7.30),(4.7.31),(4.7.33)and(4.7.34),wehave

2logEL
n
(

0
j
)=2
n
X
i
=1
log(1+

ni
)
=2
n
X
i
=1

ni

[1+
o
p
(1)]
n
X
i
=1
[

ni
]
2
=[1+
o
p
(1)]
n
X
i
=1
[

ni
]
2
=[1+
o
p
(1)]

2
n
X
i
=1
m
2
ni
=[1+
o
p
(1)]

1
p
n
n
X
i
=1
m
ni

1
n
n
X
i
=1
m
2
ni


1

1
p
n
n
X
i
=1
m
ni

+
o
p
(1)
d
!
˜
2
1
;
as
n
!1
:
130
Thiscompletestheproofofthetheorem.
ProofofTheorem6.
Weonlyneedtocontroltheterm
R
ni
,whichwillbecontrolledoneby
one.
By(3)inAssumption1,wehave(4.7.21)and(4.7.22),whichleadsto


1
n
n
X
i
=1
R
ni;
1


=


1
n
n
X
i
=1
(
Y
i

X
|
i

0
)(
w
0
j

^
w
j
)
|
X
i;
n
j


=


(
w
0
j

^
w
j
)
|
1
n
n
X
i
=1
X
i;
n
j

i


k
w
0
j

^
w
j
k
1
k
1
n
n
X
i
=1
X
i;
n
j

i
k
1
=
O
p
(
a
n
)
O
p
(
r
log
p
n
)=
O
p
(
a
n
r
log
p
n
)
:
Inordertohave


1
n
P
n
i
=1
R
ni;
1


=
o
p
(
n

1
=
2
)weneedtohave
a
n
=
o
(1
=
p
log
p
),whichis
trueaccordingto(2)inAssumption1.
For
R
ni;
2
,wehave


1
n
n
X
i
=1
R
ni;
2


=


1
n
n
X
i
=1
(
X
ij

^
w
|
j
X
i;
n
j
)
X
|
i;
n
j
(

0
n
j

^

n
j
)


=


1
n
n
X
i
=1

ij
X
|
i;
n
j
(

0
n
j

^

n
j
)+
1
n
n
X
i
=1
(
w
0
j

^
w
j
)
|
X
i;
n
j
X
|
i;
n
j
(

0
n
j

^

n
j
)


1
n
n
X
i
=1

ij
X
|
i;
n
j
(

0
n
j

^

n
j
)


+


1
n
n
X
i
=1
(
w
0
j

^
w
j
)
|
X
i;
n
j
X
|
i;
n
j
(

0
n
j

^

n
j
)


1
n
n
X
i
=1

ij
X
|
i;
n
j


1


0
n
j

^

n
j
)


1
+
v
u
u
t
1
n
n
X
i
=1

(
w
0
j

^
w
j
)
|
X
i;
n
j

2
v
u
u
t
1
n
n
X
i
=1

X
|
i;
n
j
(

0
n
j

^

n
j
)

2
=
O
p
(
p
log
p=n
)
O
p
(
s
p
log
p=n
)+
O
p
(
p
s
log
p=n
)
O
p
(
p
b
n
)
=
O
P

s
log
p=n
+
p
b
n
s
log
p=n

:
131
Inordertohave


1
n
P
n
i
=1
R
ni;
2


=
o
p
(
n

1
=
2
)weneedtohave
s
log
p=
p
n
=
o
(1)and
b
n
=
o
(1
=
p
n
).Thuswith(3)and(5)inAssumption1,wehavevthehalfcondition
in(C2),
1
n
P
n
i
=1
R
ni
=
o
p
(
n

1
=
2
).
Nowforthesecondhalfoftheconditionin(C2),
max
1

i

n
j
R
ni;
1
j
=max
1

i

n


(
Y
i

X
|
i

0
)(
w
0
j

^
w
j
)
|
X
i;
n
j


=max
1

i

n


(
w
0
j

^
w
j
)
|
X
i;
n
j

i


w
0
j

^
w
j


1
max
1

i

n


X
i;
n
j

i


1
=


w
0
j

^
w
j


1
max
1

i

n
max
1

k

p


X
ik

i


:
Nowsince
X
i
and

i
areallsub-Gaussianandthenwehave
X
ik

i
sub-exponential,and
thenbytheunionbound,wehave
P

max
1

i

n
max
1

k

p


X
ik

i


>t


X
1

i

n
X
1

k

p
P(
j
X
ik

i


>t
)

pnC
1
e

C
2
t
:
Bytaking
t
=log(
pn
)
=C
with
C<C
2
,wehavemax
1

i

n
max
1

k

p


X
ik

i


=
O
p
(log(
pn
)).
Hencewehave
max
1

i

n
j
R
ni;
1
j
=


w
0
j

^
w
j


1
max
1

i

n
max
1

k

p


X
ik

i


=
O
p
(
a
n
log(
pn
))
:
Inordertomakemax
1

i

n
j
R
ni;
1
j
=
o
p
(
n
1
=
2
),weneed
a
n
log(
pn
)
=
p
n
=
o
(1),whichistrue
underassumption(2)for
a
n
inAssumption1since
a
n
log(
pn
)
=
p
n
=
o
(log(
pn
)
=
p
n
log
p
)=
o
(
p
log
p=n
)=
o
(1).
132
Notethatmax
1

i

n
j
R
ni;
2
j
=max
1

i

n
j
(
X
ij

^
w
|
j
X
i;
n
j
)
X
|
i;
n
j
(

0
n
j

^

n
j
)
j

max
1

i

n
j
(
X
ij

w
0
|
j
X
i;
n
j
)
X
|
i;
n
j
(

0
n
j

^

n
j
)
j
+max
1

i

n
j
(
w
0
j

^
w
j
)
|
X
i;
n
j
X
|
i;
n
j
(

0
n
j

^

n
j
)
j
k

0
n
j

^

n
j
)
k
1
max
1

i

n
max
1

k

p
j

ij
X
ik
j
+
k
(
w
0
j

^
w
j
)
k
1
k
(

0
n
j

^

n
j
)
k
1

max
1

i

n
max
1

k

p
j
X
ik
j

2
:
Nowsince

ij
'sand
X
i
areallsub-Gaussian,andthenbysimilaranalysisasabovewehave
max
1

i

n
j
R
ni;
2
j
=
O
p
(
s
p
log
p=n
)
O
p
(log(
pn
))+
O
p
(
a
n
s
p
log
p=n
)
O
p
(log(
pn
))
=
O
p
(
s
p
log
p=n
log(
pn
))
:
Inordertomakemax
1

i

n
j
R
ni;
2
j
=
o
p
(
n
1
=
2
),weneed
s
p
log
p
log(
pn
)
=n
=
o
p
(1)
;
whichistrueunderassumption(5)inAssumption1since
s
p
log
p
log(
pn
)
=n
=
o
(
p
log
p=n
)=
o
(1).Thuswehavemax
1

i

n
j
R
ni
j
=
o
p
(
n
1
=
2
),whichvthesecondhalfinthecondi-
tion(C2).
Nowweneedtocheckoutcondition(C0).Fromtheaboveanalysis,wehavemax
1

i

n
j
R
ni
j
=
133
o
p
(max
1

i

n
j
W
ni
j
).Thusweonlyneedtoprovethat
P(min
1

i

n
W
ni
<
0
<
max
1

i

n
W
ni
)
!
1
;
whichjustfollowsfromtheGilvenko-Gantellitheoremoverhalf-spacesasinpage219in
[Owe01].
ProofofTheorem7.
Noticethat
1
n
n
X
i
=1
R
ni;
1
=
1
n
n
X
i
=1

2
i
(

0
j
1

^

j
1
)+
1
n
n
X
i
=1

i
X
|
i

0
(

0
j
1

^

j
1
)
+
1
n
n
X
i
=1

i
X
|
i;
n
j
(

0
j;
n
1

^

j;
n
1
)
:
Bycondition(2)inAssumption2and(4.7.23)impliedfromcondition(4)inAssumption2,


1
n
n
X
i
=1

i
X
|
i

0
(

0
j
1

^

j
1
)


=


(

0
j
1

^

j
1
)


1
n
n
X
i
=1

i

0
|
X
i


=
O
p
(
a
n
r
log
p
n
)
;
and


1
n
n
X
i
=1

i
X
|
i;
n
j
(

0
j;
n
1

^

j;
n
1
)


0
j;
n
1

^

j;
n
1


1


1
n
n
X
i
=1

i
X
i;
n
j


1
=
O
p
(
a
n
r
log
p
n
)
:
Thuswehave
1
n
n
X
i
=1
R
ni;
1
=
1
n
n
X
i
=1

2
i
(

0
j
1

^

j
1
)+
O
p
(
a
n
p
log
p=n
)=
O
p
(
a
n
p
1
=n
)
:
134
Soinordertohave
1
n
P
n
i
=1
R
ni;
1
=
o
p
(
n

1
=
2
),weneed
a
n
=
o
p
(1).Notethat
max
1

i

n
j
R
ni;
1
j
max
1

i

n
j

2
i
(

0
j
1

^

j
1
)
j
+max
1

i

n
j

i
X
|
i

0
(

0
j
1

^

j
1
)
j
+max
1

i

n
j

i
X
|
i;
n
j
(

0
j;
n
1

^

j;
n
1
)
j
=
j

0
j
1

^

j
1
j

max
1

i

n
j

2
i
j
+max
1

i

n
j

i
X
|
i

0
j

+
k

0
j;
n
1

^

j;
n
1
k
1
max
1

i

n
k

i
X
i;
n
j
k
1
=
j

0
j
1

^

j
1
j

max
1

i

n
j

2
i
j
+max
1

i

n
j

i
X
|
i

0
j

+
k

0
j;
n
1

^

j;
n
1
k
1
max
1

i

n
max
1

k

p
j

i
X
ij
j
:
Andbytheassumptionthat
X
i
and

i
aresub-Gaussian,wehave
X
|
i

0
issub-Gaussian
and

2
i
,

i
X
|
i

0
and
X
ij

i
areallsub-exponential.Thenwehave
P

max
1

i

n
j

2
i
j
>t


n
P(
j

2
i
j
>t
)

nC
1
e

C
2
t
whichimpliesthatmax
1

i

n
j

2
i
j
=
O
p
(log
n
).Thuswehavemax
1

i

n
j
R
ni;
1
j
=
O
p
(
a
n
log(
pn
)).
Inordertoachievemax
1

i

n
j
R
ni;
1
j
=
o
p
(
n
1
=
2
),weneed
a
n
log(
pn
)
=
p
n
=
o
(1),whichis
truesince
a
n
=
o
(1
=
p
log
p
).
For
R
ni;
2
=

ij;y
X
|
i
(

0

^

)=

ij;y

X
ij
(

0
j

^

j
)+
X
|
i;
n
j
(

0
n
j

^

n
j
)

=

ij;y

[(
Y
i
;
X
|
i;
n
j
)

0
j
+

ij;y
](

0
j

^

j
)+
X
|
i;
n
j
(

0
n
j

^

n
j
)

=

2
ij;y
(

0
j

^

j
)+

ij;y
(
Y
i
;
X
|
i;
n
j
)

0
j
(

0
j

^

j
)+

ij;y
X
|
i;
n
j
(

0
n
j

^

n
j
)
;
135
similarlyas
R
ni;
1
,bycondition(1)and(4.7.24),(4.7.25),wehave
1
n
n
X
i
=1
R
ni;
2
=
1
n
n
X
i
=1

2
ij;y
(

0
j

^

j
)+
O
p
(
s
p
log
p=n
p
log
p=n
)
=
1
n
n
X
i
=1

2
ij;y
(

0
j

^

j
)+
O
p
(
s
log
p=n
)
=
O
p
(
s
p
log
p=n
p
1
=n
)+
O
p
(
s
log
p=n
)=
O
p
(
s
p
log
p=n
)
:
Soinordertohave
1
n
P
n
i
=1
R
ni;
2
=
o
p
(
n

1
=
2
),weneedtohave
s
p
log
p=n
=
o
p
(
n

1
=
2
),i.e.
s
p
log
p=n
=
o
p
(1).Notethat
max
1

i

n
j
R
ni;
2
j
max
1

i

n
j

2
ij;y
(

0
j

^

j
)
j
+max
1

i

n
j

ij;y
(
Y
i
;
X
|
i;
n
j
)

0
j
(

0
j

^

j
)
j
+max
1

i

n
j

ij;y
X
|
i;
n
j
(

0
n
j

^

n
j
)
j
=
O
p
(
s
p
log
p=n
log(
pn
))=
o
p
(
p
n
)
since
s
p
log
p=n
log(
pn
)
=
p
n
=
o
(
p
log
p=n
)=
o
(1).
Nowfor
R
ni;
3
=
X
|
i
(

0

^

)

(
Y
i
;
X
|
i;
n
j
)(

0
j

^

j
)

=(

0

^

)
|
X
i
(
Y
i
;
X
|
i;
n
j
)(

0
j

^

j
),
wehaveby(3)inAssumption2


1
n
n
X
i
=1
R
ni;
3


=


1
n
n
X
i
=1
(

0

^

)
|
X
i
(
Y
i
;
X
|
i;
n
j
)(

0
j

^

j
)


v
u
u
t
1
n
n
X
i
=1
[(

0

^

)
|
X
i
]
2
v
u
u
t
1
n
n
X
i
=1
[(
Y
i
;
X
|
i;
n
j
)(

0
j

^

j
)]
2
=
O
p
(
p
s
log
p=n
)
O
p
(
p
b
n
)=
O
p
(
p
b
n
s
log
p=n
)
:
Soinordertohave
1
n
P
n
i
=1
R
ni;
3
=
o
p
(
n

1
=
2
),weneedtohave
p
b
n
s
log
p=n
=
o
p
(
n

1
=
2
),
136
i.e.
p
b
n
s
log
p
=
o
p
(1).Andwealsohave
max
1

i

n
j
R
ni;
3
jk

0

^

k
1
k

0
j

^

j
k
1
max
1

i

n
max
1

j

p
j
X
ij
j

max
1

i

n
j
Y
i
j
+max
1

i

n
max
1

j

p
j
X
ij
j

=
O
p
(
s
p
log
p=na
n
log(
pn
))=
o
p
(
n
1
=
2
)
:
Nowweneedtocheckoutcondition(C0).Fromtheaboveanalysis,wehavemax
1

i

n
j
R
ni
j
=
o
p
(max
1

i

n
j
W
ni
j
).Thusweonlyneedtoprovethat
P(min
1

i

n
W
ni
<
0
<
max
1

i

n
W
ni
)
!
1
;
whichjustfollowsfromtheGilvenko-Gantellitheoremoverhalf-spacesasinpage219in
[Owe01].
ProofofTheorem8.
Recallthat
1
p
n
n
X
i
=1
R
ni
=
R
1
n
+
R
2
n
+
R
3
n
+
R
4
n
where
R
1
n
=
1
p
n
n
X
i
=1

X
|
i
S
(
X
|
S
X
S
)

1
X
|
S


X
ij


j
S


1
SS
X
i
S

;
R
2
n
=
1
p
n
n
X
i
=1


i

X
|
i
S
(
X
|
S
X
S
)

1
X
|
S


j
S


1
SS
X
i
S

X
|
j
X
S
(
X
|
S
X
S
)

1
X
i
S

;
R
3
n
=
1
p
n
n
X
i
=1

X
ij


j
S


1
SS
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


[

0
S


^

S

]
;
R
4
n
=
1
p
n
n
X
i
=1


j
S


1
SS
X
i
S

X
|
j
X
S
(
X
|
S
X
S
)

1
X
i
S


X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


[

0
S


^

S

]
:
137
Nowfor
R
1
n
,wehave
R
1
n
=

1
p
n
n
X
i
=1

X
ij


j
S


1
SS
X
i
S

X
|
i
S
(
X
|
S
X
S
)

1
X
|
S

=

n
1
n
n
X
i
=1

X
ij


j
S


1
SS
X
i
S

X
|
i
S
on
p
n
(
X
|
S
X
S
)

1
X
|
S

o
:
Nowweneedtoboundthetwoterms
1
n
P
n
i
=1

X
ij


j
S


1
SS
X
i
S

X
i
S
and
p
n
(
X
|
S
X
S
)

1
X
|
S

.
Infact,forevery
k
2S
,wehavethatthetwoGaussianrandomvariables
X
ij


j
S


1
SS
X
i
S
and
X
ik
havethefollowingproperties:
E(
X
ik
)=E(
X
ij


j
S


1
SS
X
i
S
)=0;
E(
X
2
ik
)=
˙
kk
;
E[(
X
ij


j
S


1
SS
X
i
S
)
2
]=
˙
jj


j
S


1
SS

S
j
;
Cov(
X
ik
;X
ij


j
S


1
SS
X
i
S
)=E[
X
ik
(
X
ij


j
S


1
SS
X
i
S
)]=
˙
kj


j
S


1
SS

S
k
=
˙
kj


j
S


1
SS

SS
e
k
=
˙
kj


j
S
e
k
=
˙
kj

˙
jk
=0
:
Thuswehave
0
B
@
X
ik
X
ij


j
S


1
SS
X
i
S
1
C
A
˘
N

0
;
0
B
@
˙
kk
0
0
˙
jj


j
S


1
SS

S
j
1
C
A

:
(4.7.35)
Under(1)inAssumption3,byLemmaA.3from[BL08],wehavethereexistsconstants
C;C
1
;C
2
>
0suchthat
P


1
n
n
X
i
=1

X
ij


j
S


1
SS
X
i
S

X
ij


t


C
1
exp(

C
2
nt
2
)
;
for0

t

C:
138
Byunioninequality,wethenhave
P

max
S
:
jS
m


1
n
n
X
i
=1

X
ij


j
S


1
SS
X
i
S

X
i
S


1

t


C
1
mp
m
exp(

C
2
nt
2
)
;
for0

t

C
,where
jfSf
1
;
2
;;

;p
g
:
jSj
m
gj
p
m
.
For
mp
m
exp(

C
2
nt
2
)=exp(

C
2
nt
2
+
m
log
p
+log
m
),take
t
=
s
m
log
p
+log
m
+
C
log
p
(
C
2
n
)
˘
p
m
log
p=n;
andthenwehave
max
S
:
jS
m


1
n
n
X
i
=1

X
ij


j
S


1
SS
X
i
S

X
i
S


1
=
O
p
(
p
m
log
p=n
)
:
Nowinordertocontrol
p
n
(
X
|
S
X
S
)

1
X
|
S

,noticethatbythefollowingmatrix
equality[HS81]
(
X
|
S
X
S
=n
)

1
=


SS
+(
X
|
S
X
S
=n


SS
)


1
=


1
SS


1
SS

I
+(
X
|
S
X
S
=n


SS
)


1
SS


1
(
X
|
S
X
S
=n


SS
)


1
SS
|
{z
}

S
;
(4.7.36)
139
wehave
k
p
n
(
X
|
S
X
S
)

1
X
|
S

k
1
=
k
(
X
|
S
X
S
=n
)

1
X
|
S

=
p
n
k
1
k


1
SS
X
|
S

=
p
n
k
1
+
k

S
X
|
S

=
p
n
k
1

p
jSjk


1
SS
X
|
S

=
p
n
k
2
+
p
jSjk

S
X
|
S

=
p
n
k
2

p
jSjk


1
SS
X
|
S

=
p
n
k
2
+
p
jSjk

S
k
2
k
X
|
S

=
p
n
k
2
:
OneofthemostimportantresultsinmatrixanalysisistheCauchy(eigenvalue)inter-
lacingtheorem.Itassertsthattheeigenvaluesofanyprincipalsubmatrixofasymmetric
matrixinterlacethoseofthesymmetricmatrix.Forexample,ifan
n

n
symmetricmatrix
S
canbepartitionedas
S
=
0
B
@
AB
B
|
C
1
C
A
;
inwhich
A
isan
r

r
principlesubmatrix,thenforeach
i
2
1
;
2
;

;r
,wehave

i
(
S
)


i
(
A
)


n

r
+
i
(
S
)
:
Inparticular,wehave

min
(

)


min
(

SS
)and

max
(

)


max
(

SS
).Thusbythe
ofmaximumeigenvalue,wehave
k


1
SS
X
|
S

=
p
n
k
2


1
min
k
X
|
S

=
p
n
k
2
:
140
So
k
p
n
(
X
|
S
X
S
)

1
X
|
S

k
1

p
jSj


1
min
k
X
|
S

=
p
n
k
2
+
p
jSjk

S
k
2
k
X
|
S

=
p
n
k
2
=
p
jSj


1
min
+
k

S
k
2

k
X
|
S

=
p
n
k
2
:
Nowwehavetocontrol
k
X
|
S

=
p
n
k
2
and
k

S
k
2
.Inordertocontroltheone,bythe
sub-Gaussiantailedcondition(2)inAssumption3,
P(max
S
:
jS
m
k
X
|
S

=
p
n
k
2

t
p
n
)

P(max
S
:
jS
m
max
j
2S
j
1
n
n
X
i
=1
X
ij

i
j
t=
p
m
)

p
m
m
exp(

Cnt
2
=m
)
;
followedfromtheBernsteininequalityfor
t
small.For
p
m
m
exp(

Cnt
2
=m
)=exp(
m
log
p
+
log
m

Cnt
2
=m
),take
t
=
p
m
q
m
log
p
+log
m
+
C
1
log
p
Cn
˘
p
m
2
log
p=n
.Thenwehavethe
followingorder
max
S
:
jS
m
k
X
|
S

=
p
n
k
2
=
O
p
(
m
p
log
p
)
:
Nowfor
k

S
k
2
with

S
=


1
SS

I
+(
X
|
S
X
S
=n


SS
)


1
SS


1
(
X
|
S
X
S
=n


SS
)


1
SS
,
wehavetocontrol
X
|
S
X
S
=n


SS
Notethat
P

sup
S
:
jS
m
k
X
|
S
X
S
=n


SS
k
2


P

sup
S
:
jS
m
max
j;k
j
X
|
j
X
k
=n

˙
jk
j


m
2
p
m
P

j
X
|
j
X
k
=n

˙
jk
j


C
1
m
2
p
m
exp(

C
2

2
=m
2
)
wherethelastinequalityisalsofollowedfromLemmaA.3in[BL08]withconstants
C
1
;C
2
>
0.For
m
2
p
m
exp(

C
2

2
=m
2
)=exp(2log
m
+
m
log
p

C
2

2
=m
2
),bytaking

=
m
r
m
log
p
+2log
m
+
C
1
log
p
C
2
n
˘
141
p
m
3
log
p=n
,wehave
sup
S
:
jS
m
k
X
|
S
X
S
=n


SS
k
2
=
O
p
(
q
m
3
log
p=n
)
:
Itfollowsthen
k

S
k
2
=
k


1
SS

I
+(
X
|
S
X
S
=n


SS
)


1
SS


1
(
X
|
S
X
S
=n


SS
)


1
SS
k
2
k


1
SS
k
2
2
k
I
+(
X
|
S
X
S
=n


SS
)


1
SS


1
k
2
k
X
|
S
X
S
=n


SS
k
2
=
O
p
(
q
m
3
log
p=n
)
;
since
k


1
SS
k
2
=

1
=
2
max
(


2
SS
)


1
min
.
Thuswehave
k
p
n
(
X
|
S
X
S
)

1
X
|
S

k
1

p
jSj


1
min
+
k

S
k
2

k
X
|
S

=
p
n
k
2
=
O
p
(
q
m
3
log
p=n
)
;
i.e.sup
S
:
jS
m
k
p
n
(
X
|
S
X
S
)

1
X
|
S

k
1
=
O
p
(
p
m
3
log
p=n
).
Insummary,wethenhave
sup
S
:
jS
m


n
1
n
n
X
i
=1

X
ij


j
S


1
SS
X
i
S

X
|
i
S
on
p
n
(
X
|
S
X
S
)

1
X
|
S

o


sup
S
:
jS
m


1
n
n
X
i
=1

X
ij


j
S


1
SS
X
i
S

X
|
i
S


1
sup
S
:
jS
m


p
n
(
X
|
S
X
S
)

1
X
|
S


1
=
O
p
(
p
m
log
p=n
)
O
p
(
q
m
3
log
p=n
)=
O
p
(
m
2
log
p=n
)
:
Andhence
R
1
n
=
o
p
(1).
142
For
R
2
n
,wehave
R
2
n
=
1
p
n
n
X
i
=1


i

X
|
i
S
(
X
|
S
X
S
)

1
X
|
S


j
S


1
SS
X
i
S

X
|
j
X
S
(
X
|
S
X
S
)

1
X
i
S

=


j
S


1
SS

X
|
j
X
S
(
X
|
S
X
S
)

1

1
p
n
n
X
i
=1

X
i
S

i

X
i
S
X
|
i
S
(
X
|
S
X
S
)

1
X
|
S


=


j
S


1
SS

X
|
j
X
S
(
X
|
S
X
S
)

1

n
1
p
n
n
X
i
=1
X
i
S

i


1
n
n
X
i
=1
X
i
S
X
|
i
S

p
n
(
X
|
S
X
S
)

1
X
|
S

o
=


j
S


1
SS

X
|
j
X
S
(
X
|
S
X
S
)

1

n
1
p
n
n
X
i
=1
X
i
S

i

X
|
S

=
p
n
o
=0
:
Observethatwecanrewrite
R
3
n
as
R
3
n
=
1
p
n
n
X
i
=1

X
ij


j
S


1
SS
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


[

0
S


^

S

]
=
1
p
n
X
|
j

I

X
S
(
X
|
S
X
S
)

1
X
|
S

X
S

[

0
S


^

S

]
;
143
where
1
p
n
X
|
j

I

X
S
(
X
|
S
X
S
)

1
X
|
S

X
S

canbecontrolledasfollows
k
1
p
n
X
|
j

I

X
S
(
X
|
S
X
S
)

1
X
|
S

X
S

k
1
=max
k
2S

j
1
p
n
X
|
j

I

X
S
(
X
|
S
X
S
)

1
X
|
S

X
k
j

p
n
max
k
2S

n


X
|
j
X
k
=n

˙
jk


+


˙
jk


j
S


1
SS

S
k


+


[
X
|
j
X
S
=n


j
S
]


1
SS

S
k


+


j
S


1
SS
[
X
|
S
X
k
=n


S
k
]


+


j
S

S

S
k


+


j
S

S
[
X
|
S
X
k
=n


S
k
]


+


[
X
|
j
X
S
=n


j
S
]


1
SS
[
X
|
S
X
k
=n


S
k
]


+


[
X
|
j
X
S
=n


j
S
]

S

S
k


+


[
X
|
j
X
S
=n


j
S
]

S
[
X
|
S
X
k
=n


S
k
]


o

p
n
max
k
2S

n


X
|
j
X
k
=n

˙
jk


+


˙
jk


j
S


1
SS

S
k


+
k
X
|
j
X
S
=n


j
S
k
1
p
jSj


1
min

max
+
p
jSj


1
min

max
k
X
|
S
X
k
=n


S
k
k
1
+

2
max
k

S
k
2
+
p
jSj

max
k

S
k
2
k
X
|
S
X
k
=n


S
k
k
1
+
k
X
|
j
X
S
=n


j
S
k
2


1
min
k
X
|
S
X
k
=n


S
k
k
2
+
p
jSjk
X
|
j
X
S
=n


j
S
k
1
k

S
k
2

max
+
k
X
|
j
X
S
=n


j
S
k
2
k

S
k
2
k
X
|
S
X
k
=n


S
k
k
2
o
:
Andwehavethat
P

sup
S
:
jS
m
max
k
2S

j
˙
jk

1
n
X
|
j
X
k
j


p
m
+1
P

j
˙
jk

1
n
X
|
j
X
k
j


C
1
p
m
+1
exp(

C
2

2
)
wherethelastinequalityisalsofollowedfromLemmaA.3in[BL08]withconstants
C
1
;C
2
>
0.For
p
m
+1
exp(

C
2

2
)=exp((
m
+1)log
p

C
2

2
),bytaking

=
r
(
m
+1)log
p
+
C
1
log
p
C
2
n
˘
p
m
log
p=n
,wehave
sup
S
:
jS
m
max
k
2S

j
˙
jk

1
n
X
|
j
X
k
j
=
O
p
(
p
m
log
p=n
)
:
144
Similarly,wehave
sup
S
:
jS
m
k

j
S

1
n
X
|
j
X
S
k
1
=
O
p
(
p
m
log
p=n
)
sup
S
:
jS
m
max
k
2S

k
X
|
S
X
k
=n


S
k
k
1
=
O
p
(
p
m
log
p=n
)
Bysup
S
:
jS
m
k

S
k
2
=
O
p
(
p
m
3
log
p=n
),wehave
sup
S
:
jS
m
k
1
p
n
X
|
j

I

X
S
(
X
|
S
X
S
)

1
X
|
S

X
S

k
1

p
n
sup
S
:
jS
m
max
k
2S

n


X
|
j
X
k
=n

˙
jk


+


˙
jk


j
S


1
SS

S
k


+
k
X
|
j
X
S
=n


j
S
k
1
p
jSj


1
min

max
+
p
jSj


1
min

max
k
X
|
S
X
k
=n


S
k
k
1
+

2
max
k

S
k
2
+
p
jSj

max
k

S
k
2
k
X
|
S
X
k
=n


S
k
k
1
+
p
jSjk
X
|
j
X
S
=n


j
S
k
1
k

S
k
2

max
+
k
X
|
j
X
S
=n


j
S
k
2


1
min
k
X
|
S
X
k
=n


S
k
k
2
+
k
X
|
j
X
S
=n


j
S
k
2
k

S
k
2
k
X
|
S
X
k
=n


S
k
k
2
o
=
p
n
sup
S
:
jS
m
max
k
2S


˙
jk


j
S


1
SS

S
k


+
O
p
f
p
n
q
m
3
log
p=n
g
;
since
p
m
3
log
p=n
=
o
(1).Undercondition(4)and(5)inAssumption3,wehavethat
145
R
3
n
=
o
p
(1).
Notethat
R
4
n
=
1
p
n
n
X
i
=1


j
S


1
SS
X
i
S

X
|
j
X
S
(
X
|
S
X
S
)

1
X
i
S


X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


[

0
S


^

S

]
=
1
p
n
n
X
i
=1


j
S


1
SS
X
i
S
X
|
i
S


[

0
S


^

S

]


j
S


1
SS
(
X
|
S
X
S

=
p
n
)[

0
S


^

S

]

1
p
n
n
X
i
=1

X
|
j
X
S
(
X
|
S
X
S
)

1
X
i
S
X
|
i
S


[

0
S


^

S

]
+

X
|
j
X
S
(
X
|
S
X
S
)

1
X
|
S
X
S

=
p
n

[

0
S


^

S

]=0
:
Thuswehavevthat
1
n
P
n
i
=1
R
ni
=
o
p
(
n

1
=
2
).
Andfor
R
ni;
1
,wehave
max
1

i

n
j
R
ni;
1
j
=
k
(
X
|
S
X
S
)

1
X
|
S

k
1
max
1

i

n
k

X
ij


j
S


1
SS
X
i
S

X
|
i
S
k
1
=
k
(
X
|
S
X
S
)

1
X
|
S

k
1
max
1

i

n
max
k
2S


X
ij


j
S


1
SS
X
i
S

X
ik


wheresup
S
:
jS
m
k
(
X
|
S
X
S
)

1
X
|
S

k
1
=
O
p
(
p
m
3
log
p=n
).Andsince
X
ij


j
S


1
SS
X
i
S
isGaussianundertheassumptionthat
X
isGaussian,wehave

X
ij


j
S


1
SS
X
i
S

X
ik
sub-exponential.So
P

sup
S
:
jS
m
max
1

i

n
max
k
2S


X
ij


j
S


1
SS
X
i
S

X
ik


>t


p
m
nmC
1
exp(

C
2
t
)
whichleadstosup
S
:
jS
m
max
1

i

n
max
k
2S


X
ij


j
S


1
SS
X
i
S

X
ik


=
O
p
(
m
log
p
).
146
Thuswehave
sup
S
:
jS
m
max
1

i

n
j
R
ni;
1
j
=
O
p
(
m
log
p
q
m
3
log
p=n
)=
o
p
(
n
1
=
2
)
since(
m
log
p=n
)
p
m
3
log
p=n
=
o
(1).
Andfor
R
ni;
2
,wehave
max
1

i

n
j
R
ni;
2
jk

j
S


1
SS

X
|
j
X
S
(
X
|
S
X
S
)

1
k
1
max
1

i

n
k
X
i
S

i

X
i
S
X
|
i
S
(
X
|
S
X
S
)

1
X
|
S

k
1
;
where
k

j
S


1
SS

X
|
j
X
S
(
X
|
S
X
S
)

1
k
1
=
k

j
S


1
SS

n

1
X
|
j
X
S
(
X
|
S
X
S
=n
)

1
k
1
=
k

j
S


1
SS

n

1
X
|
j
X
S
(


1
SS


S
)
k
1

(

j
S

n

1
X
|
j
X
S
)


1
SS
k
1
+
k
n

1
X
|
j
X
S

S
k
1

(

j
S

n

1
X
|
j
X
S
)


1
SS
k
1
+
k
(
n

1
X
|
j
X
S


j
S
)

S
k
1
+
k

j
S

S
k
1
:
Andbysimplealgebra,wehave
sup
S
:
jS
m
k
(

j
S

n

1
X
|
j
X
S
)


1
SS
k
1
=
O
p
(
q
m
3
log
p=n
)
;
sup
S
:
jS
m
k
(
n

1
X
|
j
X
S


j
S
)

S
k
1
=
O
p
(
m
2
log
p=n
)
;
sup
S
:
jS
m
k

j
S

S
k
1
=
O
p
(
m
2
p
log
p=n
)
:
147
Nowfor
max
1

i

n
k
X
i
S

i

X
i
S
X
|
i
S
(
X
|
S
X
S
)

1
X
|
S

k
1

max
1

i

n
k
X
i
S

i
k
1
+max
1

i

n
k
X
i
S
X
|
i
S
(
X
|
S
X
S
)

1
X
|
S

k
1

max
1

i

n
k
X
i
S

i
k
1
+
k
(
X
|
S
X
S
)

1
X
|
S

k
1
max
1

i

n
k
X
i
S
X
|
i
S
k
1
;
since
X
ik

i
issub-exponential,wehave
P

sup
S
:
jS
m
max
1

i

n
k
X
i
S

i
k
1
>t

=P

sup
S
:
jS
m
max
1

i

n
max
k
2S
j
X
ik

i
j
>t


p
m
mnC
1
e

C
2
t
whichleadstosup
S
:
jS
m
max
1

i

n
k
X
i
S

i
k
1
=
O
p
(
m
log
p
).Andsince
X
ik
X
il
issub-
exponential,wehave
P

sup
S
:
jS
m
max
1

i

n
k
X
i
S
X
|
i
S
k
1
>t


P

sup
S
:
jS
m
max
1

i

n
p
m
max
k;l
2S
j
X
ik
X
il
j
>t


p
m
m
2
nC
1
e

C
2
t
whichleadstosup
S
:
jS
m
max
1

i

n
k
X
i
S
X
|
i
S
k
1
=
O
p
(
p
mm
log
p
).
Sincesup
S
:
jS
m
k
(
X
|
S
X
S
)

1
X
|
S

k
1
=
O
p
(
p
m
3
log
p=n
),wehave
sup
S
:
jS
m
max
1

i

n
k
X
i
S

i

X
i
S
X
|
i
S
(
X
|
S
X
S
)

1
X
|
S

k
1
=
O
p
(
m
log
p
+
p
mm
log
p
q
m
3
log
p=n
)=
O
p
(
m
log
p
(1+
m
2
p
log
p=n
))
:
148
Insummary,
sup
S
:
jS
m
max
1

i

n
j
R
ni;
2
j
=
O
p
f
m
3
log
p
p
log
p=n
(1+
m
2
p
log
p=n
)
g
;
sincelog
p=n
!
0.Inordertohavesup
S
:
jS
m
max
1

i

n
j
R
ni;
2
j
=
o
p
(
n
1
=
2
),weneed
tohave
m
3
(log
p=
p
n
)
p
log
p=n
=
o
(1),whichistrueunder(4)inAssumption3since
m
3
(log
p=
p
n
)
p
log
p=n
=
p
m
3
log
p=n
p
(log
p
)
2
m
3
=n
=
o
(1).
Observethatmax
1

i

n
j
R
ni;
3
jk

0
S


^

S

k
1

max
1

i

n
k

X
ij


j
S


1
SS
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1
:
Since
k
X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S

k
1

max
k
2S


j
X
|
i
S


1
SS

S
k
j
+
j
X
|
i
S


1
SS
(
X
|
S
X
k
=n


S
k
)
j
+
j
X
|
i
S

S

S
k
j
+
j
X
|
i
S

S
(
X
|
S
X
k
=n


S
k
)
j

;
(4.7.37)
wehave
max
1

i

n
k

X
ij


j
S


1
SS
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1

max
1

i

n
max
k
2S

j

X
ij


j
S


1
SS
X
i
S

X
ik
j
+max
1

i

n
max
k
2S

j

X
ij


j
S


1
SS
X
i
S

X
|
i
S


1
SS

S
k
j
+max
1

i

n
max
k
2S

max
l
2S
j

X
ij


j
S


1
SS
X
i
S

X
il
jk


1
SS
(
X
|
S
X
k
=n


S
k
)
k
1
+max
1

i

n
max
k
2S

max
l
2S
j

X
ij


j
S


1
SS
X
i
S

X
il
j
p
m
k

S
k
2
k

S
k
k
2
+max
1

i

n
max
k
2S

max
l
2S
j

X
ij


j
S


1
SS
X
i
S

X
il
jk

S
(
X
|
S
X
k
=n


S
k
)
k
1
:
149
Nowsince
P

sup
S
:
jS
m
max
1

i

n
max
k
2S

j

X
ij


j
S


1
SS
X
i
S

X
ik
j
>t


p
m
+1
nC
1
e

C
2
t
;
wehave
sup
S
:
jS
m
max
1

i

n
max
k
2S

j

X
ij


j
S


1
SS
X
i
S

X
ik
j
=
O
p
(
m
log
p
)
:
Similarly,wehave
sup
S
:
jS
m
max
1

i

n
max
k
2S

j

X
ij


j
S


1
SS
X
i
S

X
|
i
S


1
SS

S
k
j
=
O
p
(
m
log
p
)
;
sup
S
:
jS
m
max
1

i

n
max
l
2S
j

X
ij


j
S


1
SS
X
i
S

X
il
j
=
O
p
(
m
log
p
)
:
Andthenbysimplealgebra,wehave
sup
S
:
jS
m
max
k
2S

k
(

k
S

n

1
X
|
k
X
S
)


1
SS
k
1
=
O
p
(
q
m
3
log
p=n
)
;
sup
S
:
jS
m
max
k
2S

k
(
n

1
X
|
k
X
S


k
S
)

S
k
1
=
O
p
(
m
2
log
p=n
)
:
Thuswehave
sup
S
:
jS
m
max
1

i

n
k

X
ij


j
S


1
SS
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1
=
O
p
f
m
log
p
(1+
q
m
3
log
p=n
+
m
2
p
log
p=n
+
m
2
log
p=n
)
g
=
O
p
f
m
log
p
(1+
q
m
3
log
p=n
+
m
2
log
p=n
)
g
;
150
whichleadsto
sup
S
:
jS
m
max
1

i

n
j
R
ni;
3
j
=
O
p
(
s
p
log
p=nm
log
p
(1+
q
m
3
log
p=n
+
m
2
log
p=n
))
:
Inordertohavesup
S
:
jS
m
max
1

i

n
j
R
ni;
3
j
=
o
p
(
n
1
=
2
),weneed
s
p
log
p=n
(
m
log
p=
p
n
)(1+
q
m
3
log
p=n
+
m
2
log
p=n
)=
o
(1)
;
whichistrueunder(4)inAssumption3.
Andformax
1

i

n
j
R
ni;
4
j
=
k

0
S


^

S

k
1

max
1

i

n
k


j
S


1
SS

X
|
j
X
S
(
X
|
S
X
S
)

1

X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1
:
Andfor
max
1

i

n
k


j
S


1
SS

X
|
j
X
S
=n
(


1
SS


S
)

X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1
=max
1

i

n
k
(

j
S

X
|
j
X
S
=n
)


1
SS
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1
+max
1

i

n
k
(
X
|
j
X
S
=n


j
S
)

S
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1
+max
1

i

n
k

j
S

S
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1
;
151
by(4.7.37),wehave
max
1

i

n
k
(

j
S

X
|
j
X
S
=n
)


1
SS
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1

max
1

i

n
max
k
2S

max
l
2S
k
(

j
S

X
|
j
X
S
=n
)


1
SS
k
1
j
X
il
X
ik
j
+max
1

i

n
max
k
2S

max
l
2S
k
(

j
S

X
|
j
X
S
=n
)


1
SS
k
1
X
2
il
k


1
SS

S
k
k
1
+max
1

i

n
max
k
2S

max
l
2S
k
(

j
S

X
|
j
X
S
=n
)


1
SS
k
1
X
2
il
k


1
SS
(
X
|
S
X
k
=n


S
k
)
k
1
+max
1

i

n
max
k
2S

max
l
2S
k
(

j
S

X
|
j
X
S
=n
)


1
SS
k
1
X
2
il
k

S

S
k
k
1
+max
1

i

n
max
k
2S

max
l
2S
k
(

j
S

X
|
j
X
S
=n
)


1
SS
k
1
X
2
il
k

S
(
X
|
S
X
k
=n


S
k
)
k
1
=
O
p
(
m
3
log
p
p
log
p=n
)
;
undertheconditionthat
m
3
log
p=n
!
0.Similarlywehave
sup
S
:
jS
m
max
1

i

n
k
(
X
|
j
X
S
=n


j
S
)

S
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1
=
O
p
f
m
7
=
2
(log
p
)
2
=n
g
sup
S
:
jS
m
max
1

i

n
k

j
S

S
X
i
S

X
|
i
S


X
|
i
S
(
X
|
S
X
S
)

1
X
|
S
X
S


k
1
=
O
p
f
m
7
=
2
log
p
p
log
p=n
g
if
m
3
log
p=n
!
0.Insummary,if
m
3
log
p=n
!
0,
sup
S
:
jS
m
max
1

i

n
j
R
ni;
4
j
=
O
p
f
sm
7
=
2
(log
p
)
2
=n
g
:
152
Thusinordertohavesup
S
:
jS
m
max
1

i

n
j
R
ni;
4
j
=
o
p
(
n
1
=
2
),weneed
sm
7
=
2
(log
p
)
2
=n
3
=
2
=
o
(1)
;
whichistrueunderthecondition(4)inAssumption3since
sm
7
=
2
(log
p
)
2
=n
3
=
2
=
s
r
(log
p
)
4
m
7
n
3
=
s
q
(log
p
)
2
m
3
n
m
2
log
p=n
=
o
(1).
Fromtheaboveanalysis,wehavemax
1

i

n
j
R
ni
j
=
o
p
(max
1

i

n
j
W
ni
j
).Thusweonly
needtoprovethat
P(min
1

i

n
W
ni
<
0
<
max
1

i

n
W
ni
)
!
1
;
whichjustfollowsfromtheGilvenko-Gantellitheoremoverhalf-spacesasinpage219in
[Owe01].
Fortheproofofthethreepropositions,theyarejustfollowedfromtheproofofthe
correspondingtheorems.WeherejustprovetheProposition2.
ProofofProposition6.
Inordertogettheasymptoticnormalityof
^

(kfc-de)
j
,wehavetodeal
with
1
n
P
n
i
=1
~
X
2
ij
.Nowsince
1
n
n
X
i
=1
~
X
2
ij
=
1
n
n
X
i
=1

X
ij

X
|
j
X
S
(
X
|
S
X
S
)

1
X
i
S

2
=
1
n
X
|
j
X
j

1
n
X
|
j
X
S
(
X
|
S
X
S
)

1
X
|
S
X
j
=
1
n
X
|
j

I

X
S
(
X
|
S
X
S
)

1
X
|
S

X
j
;
153
wehave
j
1
n
n
X
i
=1
~
X
2
ij

(
˙
jj


j
S


1
SS

S
j
)
j
=
j
1
n
X
|
j
X
j

1
n
X
|
j
X
S
(
X
|
S
X
S
=n
)

1
X
|
S
X
j
=n

(
˙
jj


j
S


1
SS

S
j
)
j

n


X
|
j
X
j
=n

˙
jj


+2
k
X
|
j
X
S
=n


j
S
k
1
p
jSj


1
min

max
+

2
max
k

S
k
2
+2
p
jSj

max
k

S
k
2
k
X
|
S
X
j
=n


S
j
k
1
+


1
min
k
X
|
S
X
j
=n


S
j
k
2
2
+
k

S
k
2
k
X
|
S
X
j
=n


S
j
k
2
2
o
:
Andsince
P

sup
S
:
jS
m
j
˙
jj

1
n
X
|
j
X
j
j


p
m
P

j
˙
jj

1
n
X
|
j
X
j
j


C
1
p
m
exp(

C
2

2
)
wehave
sup
S
:
jS
m
j
˙
jj

1
n
X
|
j
X
j
j
=
O
p
(
p
m
log
p=n
)
:
Nowfortheterm
k

j
S

1
n
X
|
j
X
S
k
1
,wehaveprovedabovethat
sup
S
:
jS
m
k


j
S

1
n
X
|
j
X
S


1
SS
k
1
=
O
p
(
p
m
log
p=n
)
:
154
Bysup
S
:
jS
m
k

S
k
2
=
O
p
(
p
m
3
log
p=n
),wehave
sup
S
:
jS
m
j
1
n
n
X
i
=1
~
X
2
ij

(
˙
jj


j
S


1
SS

S
j
)
j

sup
S
:
jS
m
n


X
|
j
X
j
=n

˙
jj


+2
k
X
|
j
X
S
=n


j
S
k
1
p
jSj


1
min

max
+

2
max
k

S
k
2
+2
p
jSj

max
k

S
k
2
k
X
|
S
X
j
=n


S
j
k
1
+


1
min
k
X
|
S
X
j
=n


S
j
k
2
2
+
k

S
k
2
k
X
|
S
X
j
=n


S
j
k
2
2
o
=sup
S
:
jS
m
n
O
p
(
p
m
log
p=n
)+
O
p
(
p
m
log
p=n
)
p
jSj


1
min

max
+

2
max
O
p
(
q
m
3
log
p=n
)+
p
jSj

max
O
p
(
q
m
3
log
p=n
)
O
p
(
p
m
log
p=n
)
+
jSj
O
p
(
p
m
log
p=n
)
2


1
min
+
jSj
O
p
(
p
m
log
p=n
)
2
O
p
(
q
m
3
log
p=n
)
o
=
O
p
f
q
m
3
log
p=n
g
:
Thuswehave
sup
S
:
jS
m


1
n
n
X
i
=1
~
X
2
ij

(
˙
jj


j
S


1
SS

S
j
)


=
O
p
f
q
m
3
log
p=n
g
=
o
p
(1)
:
(4.7.38)
HencewehavethefollowingasymptoticnormalitybySlutsky'stheorem
p
n
(
^

(kfc-de)
j


0
j
)=
1
p
n
P
n
i
=1
m
(kfc)
ni
(

0
j
)
1
n
P
n
i
=1
~
X
2
ij
d
!
N(0
;˙
2
kfc
)
;
where
˙
2
kfc
=lim
n
!1
(

jj

2

j
S


1
SS

j
S
+

j
S


1
SS

SS


1
SS

S
j
)
=
(
˙
jj


j
S


1
SS

S
j
).
155
Chapter5
ConclusionsandFutureDirections
Inthischapter,weaimtoreiteratethemaincontributionsofthisthesis,andtooutlinesome
ofthethingsthatcouldpossiblyfollowasfuturedevelopmentsontheresultspresentedhere.
InSection5.1,westartwiththesummaryofthemainideasinthethesis,especiallyfrom
Chapters2,3and4.Section5.2layoutssomenaturalextensionsoftheideasinthisthesis.
5.1SummaryandContributions
InChapter2and3,weproposedELbasedprocedurestomakepointwiseandsimultaneous
inferencesonfunctionallinearmodels,treatingsparseanddensefunctionaldataina
framework.WeshowedthatELisanicetooltoaccomplishthisgoal.Westudiedthe
asymptoticdistributionsoftheELbasedteststatisticsunderthenullandlocalalternative
hypothesesforbothsparseanddensefunctionaldata.Weestablishedthetransitionphasein

,theorderofrepeatedmeasurements,forpointwiseandsimultaneoustests.Thetransition
point

0
wasshowntobe1
=
8forthepointwisetestand1
=
16forthesimultaneoustest.If


0
,weshowedthattheproposedmethodisabletodetectalternativesofsize
b

n
=
n

4(1+

)
=
9
forthepointwisetestandoforder
b

n
=
n

8(1+

)
=
17
forthesimultaneoustest.
Fordensefunctionaldatasuchthat
>
0
,wefoundthattheproposedtestsareableto
detectalternativesofmagnitude
n

1
=
2
bothpointwiselyandsimultaneously,whichisthe
sameorderofalternativeaparametrictestcandetect.Moreover,weproposedapractical
156
bandwidthselectionmethodforfunctionaldata.Manybandwidthselectionmethodswere
proposedforindependentorweaklydependentdata,butbandwidthselectionforfunctional
dataremainedachallengingproblem,see[ZPW13]forarecentstudy.Numericalexperiments
inChapter2showedthattheproposedbandwidthselectionmethodworkswellinpractice.
InChapter4,weproposedaedframeworkforhighdimensionalinferencebasedon
theempiricallikelihoodwhichisconstructedwithestimatingequations.Itcanbeusedto
teststatisticalhypothesisandconstructintervals,whichhavemorenaturaldata
drivenshape.Tobroadentheapplicabilityofthemethod,thegeneraltheorywaspresented
withthegeneralconditionstobeInprincipal,allofthemethodsproposedin
theexistingliteraturecanbere-consideredunderourframeworkandmakefaircomparison
amongthem,althoughthetechnicaldetailscanbetcasebycase.Moreover,the
keyadvantageofourproposedlikelihoodratiobasedmethodcomparingwithotherssuch
asWaldtypemethodandScorebasedmethodisthatitcanallowheteroscedasticerror
noise.Thisislargelyduetotheniceselfnormalizationpropertyoftheempiricallikelihood
formulation.Inparticular,wedidnotassumeindependencebetweentheerrortermandthe
covariates,whichisacommonassumptionintheexistingliterature,althoughwemadethe
uncorrelatednessassumption.
5.2FutureDirections
Thisthesisfocusedonapplyingempiricallikelihoodtosolvesomefundamentalproblems
insimplestatisticalmodels,especiallylinearmodels.Henceanaturaldirectionforfuture
researchistogeneralizeourmethodologiestomorecomplicatedstatisticalmodels,suchas
generalizedlinearmodelsandsurvivalmodels.ForfunctionallinearmodelsinChapter2
157
and3,wegainedtherobustnessintermsofthecorrelationstructureoftheerrorprocess.
Butifwehavepriorknowledgeoftheerrorprocess,howtoincorporatetheerrorcorrelation
informationintotheestimationandinferenceprocedurestoincreasetheisavery
interestingtopicforfutureinvestigation.Weonlyconsideredonegeneraltypeofhypothesis
inChapter2and3.Thereisanotherhypothesisproblem,goodnessoftesting,whichcould
beanotherpromisingresearchproblem.ForthehighdimensionallinearmodelinChapter
4,weonlyfocusedononeestimatingequation.Butwhenwehavemorethanoneestimating
equations,howtocombinealloftheestimatingequationstomakemoretinferenceis
worthyoffurtherinvestigation.Ingeneral,theself-normalizationpropertyofELispowerful
andweshouldmakeuseofittosolvesomeproblemsinvariousstatisticalanalysis.
158
BIBLIOGRAPHY
159
BIBLIOGRAPHY
[AS58]
J.AitchisonandS.D.Silvey,
Maximum-likelihoodestimationofparameterssub-
jecttorestraints
,TheAnnalsofMathematicalStatistics
29
(1958),no.3,813{
828.
[B
+
13]
PeterBuhlmannetal.,
Statisticalanceinhigh-dimensionallinearmodels
,
Bernoulli
19
(2013),no.4,1212{1242.
[Bal60]
A.V.Balakrishnan,
Estimationanddetectiontheoryformultiplestochasticpro-
cesses
,JournalofMathematicalAnalysisandApplications
1
(1960),no.3,386{
410.
[BCW14]
AlexandreBelloni,VictorChernozhukov,andLieWang,
Pivotalestimation
viasquare-rootlassoinnonparametricregression
,TheAnnalsofStatistics
42
(2014),no.2,757{788.
[Bel02]
DavidABelsley,
Aninvestigationofanunbiasedcorrectionforheteroskedastic-
ityandthectsofmisspecifyingtheskedasticfunction
,JournalofEconomic
dynamicsandControl
26
(2002),no.9,1379{1396.
[BHK
+
09]
MichalBenko,Wolfgangardle,AloisKneip,etal.,
Commonfunctionalprin-
cipalcomponents
,TheAnnalsofStatistics
37
(2009),no.1,1{34.
[BL08]
PeterJBickelandElizavetaLevina,
Regularizedestimationoflargecovariance
matrices
,TheAnnalsofStatistics(2008),199{227.
[BRT09]
PeterJBickel,Ya'acovRitov,andAlexandreBTsybakov,
Simultaneousanalysis
oflassoanddantzigselector
,TheAnnalsofStatistics(2009),1705{1732.
[BTW
+
07]
FlorentinaBunea,AlexandreTsybakov,MartenWegkamp,etal.,
Sparsityoracle
inequalitiesforthelasso
,ElectronicJournalofStatistics
1
(2007),169{194.
[BVDG11]
PeterBuhlmannandSaraVanDeGeer,
Statisticsforhigh-dimensionaldata:
methods,theoryandapplications
,SpringerScience&BusinessMedia,2011.
[CC06]
SongXiChenandHengjianCui,
Onbartlettcorrectionofempiricallikelihood
inthepresenceofnuisanceparameters
,Biometrika
93
(2006),no.1,215{220.
[CG14]
SongXiChenandBinGuo,
Testsforhighdimensionalgeneralizedlinearmodels
,
arXivpreprintarXiv:1402.4882(2014).
[CHL03]
SongXiChen,Wolfgangardle,andMingLi,
Anempiricallikelihoodgoodness-
testfortimeseries
,JournaloftheRoyalStatisticalSociety:SeriesB
(StatisticalMethodology)
65
(2003),no.3,663{678.
160
[CLS86]
P.E.Castro,W.H.Lawton,andE.A.Sylvestre,
Principalmodesofvariation
forprocesseswithcontinuoussamplecurves
,Technometrics
28
(1986),no.4,
329{337.
[CVK09]
SongXiChenandIngridVanKeilegom,
Areviewonempiricallikelihoodmethods
forregression
,Test
18
(2009),no.3,415{447.
[CZ10]
SongXiChenandPing-ShouZhong,
Anovaforlongitudinaldatawithmissing
values
,TheAnnalsofStatistics
38
(2010),no.6,3630{3659.
[DCL12]
ZJohnDaye,JinboChen,andHongzheLi,
High-dimensionalheteroscedastic
regressionwithanapplicationtoeqtldataanalysis
,Biometrics
68
(2012),no.1,
316{326.
[DHR91]
ThomasDiCiccio,PeterHall,andJosephRomano,
Empiricallikelihoodis
bartlett-correctable
,TheAnnalsofStatistics
19
(1991),no.2,1053{1061.
[Edw84]
AnthonyWilliamFairbankEdwards,
Likelihood
,CUPArchive,1984.
[EH08]
R.L.EubankandTailenHsing,
Canonicalcorrelationforstochasticprocesses
,
StochasticProcessesandtheirApplications
118
(2008),no.9,1634{1661.
[Far97]
JulianJFaraway,
Regressionanalysisforafunctionalresponse
,Technometrics
39
(1997),no.3,254{261.
[FFS10]
JianfengFeng,WenjiangFu,andFengzhuSun,
Frontiersincomputationaland
systemsbiology
,vol.15,SpringerScience&BusinessMedia,2010.
[FG96]
JianqingFanandIreneGijbels,
Localpolynomialmodellinganditsapplica-
tions:Monographsonstatisticsandappliedprobability66
,vol.66,Chapman
&Hall/CRC,1996.
[FHL07]
JianqingFan,TaoHuang,andRunzeLi,
Analysisoflongitudinaldatawith
semiparametricestimationofcovariancefunction
,JournaloftheAmericanSta-
tisticalAssociation
102
(2007),632{641.
[FL01]
JianqingFanandRunzeLi,
Variableselectionvianonconcavepenalizedlikeli-
hoodanditsoracleproperties
,JournaloftheAmericanstatisticalAssociation
96
(2001),no.456,1348{1360.
[FL08]
JianqingFanandJinchiLv,
Sureindependencescreeningforultrahighdimen-
sionalfeaturespace
,JournaloftheRoyalStatisticalSociety:SeriesB(Statistical
Methodology)
70
(2008),no.5,849{911.
[FZ00]
JianqingFanandJin-TingZhang,
Two-stepestimationoffunctionallinearmod-
elswithapplicationstolongitudinaldata
,JournaloftheRoyalStatisticalSociety:
SeriesB(StatisticalMethodology)
62
(2000),no.2,303{322.
[GQ65]
StephenMGoldfeldandRichardEQuandt,
Sometestsforhomoscedasticity
,
JournaloftheAmericanstatisticalAssociation
60
(1965),no.310,539{547.
161
[GVHF11]
JelleJGoeman,HansCVanHouwelingen,andLivioFinos,
Testingagainsta
high-dimensionalalternativeinthegeneralizedlinearmodel:asymptotictypei
errorcontrol
,Biometrika
98
(2011),no.2,381{390.
[HM93]
WolfgangHardleandEnnoMammen,
Comparingnonparametricversuspara-
metricregression
,TheAnnalsofStatistics(1993),1926{1947.
[HMW06]
PeterHall,Hans-GeorgMuller,andJane-LingWang,
Propertiesofprincipal
componentmethodsforfunctionalandlongitudinaldataanalysis
,Theannalsof
statistics(2006),1493{1517.
[HS81]
HaroldVHendersonandShayleRSearle,
Onderivingtheinverseofasumof
matrices
,SiamReview
23
(1981),no.1,53{60.
[HTS
+
99]
TrevorHastie,RobertTibshirani,GavinSherlock,MichaelEisen,Patrick
Brown,andDavidBotstein,
Imputingmissingdataforgeneexpressionarrays
,
1999.
[JM13]
AdelJavanmardandAndreaMontanari,
eintervalsandhypothesis
testingforhigh-dimensionalregression
,arXivpreprintarXiv:1306.3171(2013).
[KAC
+
98]
HenryK,EriceA,TierneyC,BalfourHHJr,FischlMA,KmackA,LiouSH,
KentonA,HirschMS,PhairJ,MartinezA,andKahnJO,
Arandomized,con-
trolled,double-blindstudycomparingthesurvivalboffourentreverse
transcriptaseinhibitortherapies(three-drug,two-drug,andalternatingdrug)for
thetreatmentofadvancedaids.aidsclinicaltrialgroup193astudyteam.
,JAc-
quirImmuneSyndrHumRetrovirol(1998),339{349.
[KF00]
KeithKnightandWenjiangFu,
Asymptoticsforlasso-typeestimators
,Annals
ofstatistics(2000),1356{1378.
[KZ13]
SeonjinKimandZhibiaoZhao,
dinferenceforsparseanddenselongitu-
dinalmodels
,Biometrika(2013),ass050.
[LH08]
PeterLangfelderandSteveHorvath,
Wgcna:anrpackageforweightedcorre-
lationnetworkanalysis
,BMCbioinformatics
9
(2008),no.1,559.
[LH10]
YehuaLiandTailenHsing,
Uniformconvergenceratesfornonparametricre-
gressionandprincipalcomponentanalysisinfunctional/longitudinaldata
,The
AnnalsofStatistics
38
(2010),no.6,3321{3351.
[LL14]
WeidongLiuandShanLuo,
Hypothesistestingforhigh-dimensionalregression
models
.
[LTTT14]
RichardLockhart,JonathanTaylor,RyanJTibshirani,andRobertTibshirani,
Aancetestforthelasso
,Annalsofstatistics
42
(2014),no.2,413.
[LZL
+
13]
WeiLan,Ping-ShouZhong,RunzeLi,HanshengWang,andChih-LingTsai,
Testingasingleregressioncoinhighdimensionallinearmodels
.
162
[Mam93]
EnnoMammen,
Bootstrapandwildbootstrapforhighdimensionallinearmodels
,
TheAnnalsofStatistics(1993),255{285.
[MB06]
NicolaiMeinshausenandPeterBuhlmann,
High-dimensionalgraphsandvariable
selectionwiththelasso
,TheAnnalsofStatistics(2006),1436{1462.
[MB10]
,
Stabilityselection
,JournaloftheRoyalStatisticalSociety:SeriesB
(StatisticalMethodology)
72
(2010),no.4,417{473.
[MC06]
JeSMorrisandRaymondJCarroll,
Wavelet-basedfunctionalmixedmodels
,
JournaloftheRoyalStatisticalSociety:SeriesB(StatisticalMethodology)
68
(2006),no.2,179{199.
[MMB09]
NicolaiMeinshausen,LukasMeier,andPeterBuhlmann,
P-valuesforhigh-
dimensionalregression
,JournaloftheAmericanStatisticalAssociation
104
(2009),no.488.
[MY09]
NicolaiMeinshausenandBinYu,
Lasso-typerecoveryofsparserepresentations
forhigh-dimensionaldata
,TheAnnalsofStatistics(2009),246{270.
[NL14]
YangNingandHanLiu,
Ageneraltheoryofhypothesistestsandcere-
gionsforsparsehighdimensionalmodels
,arXivpreprintarXiv:1412.8765(2014).
[NRWY12]
SahandN.Negahban,PradeepRavikumar,MartinJ.Wainwright,andBinYu,
Adframeworkforhigh-dimensionalanalysisof
m
-estimatorswithdecom-
posableregularizers
,Statist.Sci.
27
(2012),no.4,538{557.
[Owe88]
ArtBOwen,
Empiricallikelihoodratioceintervalsforasinglefunc-
tional
,Biometrika
75
(1988),no.2,237{249.
[Owe90]
,
Empiricallikelihoodratioceregions
,TheAnnalsofStatistics
18
(1990),no.1,90{120.
[Owe01]
,
Empiricallikelihood
,CRCpress,2001.
[PZB
+
10]
JiePeng,JiZhu,AnnaBergamaschi,WonshikHan,Dong-YoungNoh,
JonathanRPollack,andPeiWang,
Regularizedmultivariateregressionforiden-
tifyingmasterpredictorswithapplicationtointegrativegenomicsstudyofbreast
cancer
,Theannalsofappliedstatistics
4
(2010),no.1,53.
[QL95]
JinQinandJerryLawless,
Estimatingequations,empiricallikelihoodandcon-
straintsonparameters
,CanadianJournalofStatistics
23
(1995),no.2,145{159.
[RS91]
JohnARiceandBernardWSilverman,
Estimatingthemeanandcovariance
structurenonparametricallywhenthedataarecurves
,JournaloftheRoyalSta-
tisticalSociety.SeriesB(Methodological)(1991),233{243.
[Ser80]
RobertJ
Approximationtheoremsofmathematicalstatistics
,JohnWi-
ley&Sons,1980.
163
[SF04]
QingShenandJulianFaraway,
Anftestforlinearmodelswithfunctionalre-
sponses
,StatisticaSinica
14
(2004),no.4,1239{1258.
[Sil78]
BernardWSilverman,
Weakandstronguniformconsistencyofthekernelesti-
mateofadensityanditsderivatives
,TheAnnalsofStatistics
6
(1978),no.1,
177{184.
[SR05]
BernardWalterSilvermanandJamesO.Ramsay,
Functionaldataanalysis
,
Springer,2005.
[SS13]
RajenDShahandRichardJSamworth,
Variableselectionwitherrorcontrol:
anotherlookatstabilityselection
,JournaloftheRoyalStatisticalSociety:Series
B(StatisticalMethodology)
75
(2013),no.1,55{80.
[SZ12]
TingniSunandCun-HuiZhang,
Scaledsparselinearregression
,Biometrika
(2012),ass043.
[Tib96]
RobertTibshirani,
Regressionshrinkageandselectionviathelasso
,Journalof
theRoyalStatisticalSociety.SeriesB(Methodological)(1996),267{288.
[TLTT14]
JonathanTaylor,RichardLockhart,RyanJTibshirani,andRobertTibshirani,
Post-selectionadaptiveinferenceforleastangleregressionandthelasso
,arXiv
preprint(2014).
[Tuk77]
JohnWTukey,
Exploratorydataanalysis
,Reading,Ma
231
(1977),32.
[VdG08]
SaraAVandeGeer,
High-dimensionalgeneralizedlinearmodelsandthelasso
,
TheAnnalsofStatistics(2008),614{645.
[vdGBR13]
SaravandeGeer,PeterBuhlmann,andYa'acovRitov,
Onasymptoticallyop-
timalceregionsandtestsforhigh-dimensionalmodels
,arXivpreprint
arXiv:1303.0518(2013).
[Ver10]
RomanVershynin,
Introductiontothenon-asymptoticanalysisofrandomma-
trices
,arXivpreprintarXiv:1011.3027(2010).
[Wai09]
MartinJWainwright,
Sharpthresholdsforhigh-dimensionalandnoisysparsity
recoveryusing-constrainedquadraticprogramming(lasso)
,InformationTheory,
IEEETransactionson
55
(2009),no.5,2183{2202.
[WD12]
JensWagenerandHolgerDette,
Bridgeestimatorsandtheadaptivelassounder
heteroscedasticity
,MathematicalMethodsofStatistics
21
(2012),no.2,109{
126.
[WR09]
LarryWassermanandKathrynRoeder,
Highdimensionalvariableselection
,
Annalsofstatistics
37
(2009),no.5A,2178.
[WWL12]
LanWang,YichaoWu,andRunzeLi,
Quantileregressionforanalyzinghetero-
geneityinultra-highdimension
,JournaloftheAmericanStatisticalAssociation
107
(2012),no.497,214{222.
164
[XZ07]
LiugenXueandLixingZhu,
Empiricallikelihoodforavaryingcomodel
withlongitudinaldata
,JournaloftheAmericanStatisticalAssociation
102
(2007),no.478,642{654.
[YMW05a]
FangYao,Hans-GeorgMuller,andJane-LingWang,
Functionaldataanalysis
forsparselongitudinaldata
,JournaloftheAmericanStatisticalAssociation
100
(2005),577{590.
[YMW05b]
,
Functionallinearregressionanalysisforlongitudinaldata
,TheAnnals
ofStatistics
33
(2005),no.6,2873{2903.
[ZC07]
Jin-TingZhangandJianweiChen,
Statisticalinferencesforfunctionaldata
,The
AnnalsofStatistics
35
(2007),no.3,1052{1079.
[Zha09]
TongZhang,
Somesharpperformanceboundsforleastsquaresregressionwith
l1regularization
,TheAnnalsofStatistics
37
(2009),no.5A,2109{2144.
[Zha10]
Cun-HuiZhang,
Nearlyunbiasedvariableselectionunderminimaxconcave
penalty
,TheAnnalsofStatistics(2010),894{942.
[Zha11]
Jin-TingZhang,
Statisticalinferencesforlinearmodelswithfunctionalre-
sponses
,StatisticaSinica
21
(2011),no.3,1431.
[ZHM
+
10]
LanZhou,JianhuaZHuang,JosueGMartinez,ArnabMaity,Veerabhadran
Baladandayuthapani,andRaymondJCarroll,
Reducedrankmixedctsmodels
forspatiallycorrelatedhierarchicalfunctionaldata
,JournaloftheAmerican
StatisticalAssociation
105
(2010),no.489,390{400.
[ZL00]
WenyangZhangandSik-YumLee,
Variablebandwidthselectioninvarying-
comodels
,JournalofMultivariateAnalysis
74
(2000),no.1,116{134.
[ZPW13]
XiaokeZhang,ByeongUPark,andJane-lingWang,
Time-varyingadditivemod-
elsforlongitudinaldata
,JournaloftheAmericanStatisticalAssociation
108
(2013),no.503,983{998.
[ZY06]
PengZhaoandBinYu,
Onmodelselectionconsistencyoflasso
,TheJournalof
MachineLearningResearch
7
(2006),2541{2563.
[ZZ14]
Cun-HuiZhangandStephanieSZhang,
eintervalsforlowdimen-
sionalparametersinhighdimensionallinearmodels
,JournaloftheRoyalSta-
tisticalSociety:SeriesB(StatisticalMethodology)
76
(2014),no.1,217{242.
165