HIGH-DIMENSIONALINFERENCEFORSPATIALERRORMODELSByLiqianCaiADISSERTATIONSubmittedtoMichiganStateUniversityinpartialentoftherequirementsforthedegreeofStatistics|DoctorofPhilosophy2016.ABSTRACTHIGH-DIMENSIONALINFERENCEFORSPATIALERRORMODELSByLiqianCaiIntheliteratureofeconometrictheoryandapplication,issuesrelatingtourban,realestate,agricultural,andenvironmentaleconomics,etc.,wherethedataarecollectedspatiallyfromcross-sectionalunits,arecommonandinthesecircumstances,thespatialrelationamongthesamplingsitescannotbeignored.Spatialautocorrelationisthusintroducedtomodelthecorrelationamongvaluesofasinglevariablestrictlyattributabletotheirrelativelycloselocationalpositionsonatwo-dimensionalsurface,whichextendsautocorrelationintimeseriestospatialdimensions.Withthegrowthofcomputercapabilities,databasesarebecomingprogressivelylargerandmorecomplex,makingtraditionalstatisticalmethodslessctiveorsometimesevenunsuitable.Datafromhigh-frequencyeconomictransactions,detailedmacroeconomicdatacollectedbyamultitudeofsourceswithvaryingdataqualityandvaryingsamplingfrequen-cies,anddataonlargeeconomicandsocialnetworksarejustafewexamplesofthecontentofenormousdatabasesthatarenowsubjecttothoroughexamination.Thisdissertationdiscussesapplicable(high-dimensional)variableselectionandestima-tionmethodsandcorrespondingtheoriesfocusingonaspatialerrormodelwherethespatialautocorrelationcomesfromthedisturbancesacrosscross-sectionalunits,inaregressioncontext.Inthepart,weproposeageneralizedLasso-typeofestimatorforthespatialerrormodel,wherethedisturbancetermsareautocorrelatedacrosscross-sectionalunits.Wefurtherprovetheestimationconsistencyandselectionsignconsistencyoftheparameterestimatorunderboththelowdimensionalsettingwhenthedimensionoftheparameterpisandsmallerthanthesamplesizen,aswellasthehighdimensionalsettingwhenpisgreaterthanandcanbegrowingwithn.Thenumberofnon-zerocomponentsoftheparameterinbothsettingsareconsideredrelativelysmallerthanthenumberofobservations(sparsity).Inthesecondpart,wecontinuetoinvestigatepost-modelselectionestimatorsthatapplyestimationtothemodelselectedbyvariableselection.Weshowthatbyseparatingthemodelselectionandestimationprocess,thepost-modelselectionestimatorcanperformatleastaswellasthesimultaneousvariableselectionandestimationmethodintermsoftherateofconvergence.Theconvergencerateoftheestimationerrorinboththe`2andsupnormsarestudied.Moreover,underperfectmodelselection,thatis,whentheselectionprocessisabletocorrectlyidentifythetcovariatesofthetruemodelwithprobabilitygoesto1,theoracleconvergenceratecanbereached.Inthelastpart,asketchofthefutureworkonhigh-dimensionalanalysisonmixedregressive,spatialautoregressivemodel,wheretheresponseunitdependsnotonlyontheexplanatoryvariablesbutalsoontheresponsefromitsneighboringunits,isdescribed.Tomybelovedparentsandmyfoundation,HengyiCaiandHehuaZhong,fortheirendlesslove,supportandencouragement.ivACKNOWLEDGMENTSFirstandforemost,IwouldliketoexpressmysincerestgratitudetomymajordissertationadvisorDr.TapabrataMaiti,forhispatientguidance,invaluableassistance,constructivecommentsandimmenseknowledge.Hiscontinuoussupportandencouragementhelpedmethroughtheresearchandwritingprocessofthisdissertation.SpecialthankstoDr.RogerCalantone,forprovidingmewithapreciousexperienceworkingonreallifebusinessdataandlearningaboutpracticalmethodologiesandalsoforservingasoneofmycommitteemembers.SpecialthankstoDr.ArnabBhattarcharjee,forthecountlessdiscussionsandusefulsuggestionsonmywork.ThanksgoestoDr.ChaeYoungLimandDr.Ping-ShouZhong,forservingasmembersofmyguidancecommitteeandprovidingmewithconstructiveadvicesfor
mydissertation.ThanksgoestotheentirefacultyandmembersintheDepartmentofStatisticsandProbabilitywhohavetaughtmeandhelpedmeduringmystudyatMichiganStateUniversity.Thanksgoestothegraduateschool,theCollegeofNaturalScienceandtheDepartmentofStatisticsandProbabilitywhoprovidedmetheDissertationContinuationFellowship,DissertationCompletionFellowshipandtravelingfellowshipsforworkingonmyresearchandattendingacademicconferences.IwouldalsoliketothankmyacademicfamilymembersatDepartmentofStatisticsandProbabilityforallthetimewehavehadinthepastyears.Knowingallofyoumakeeventhetoughestdaysenjoyable.Lastbutnottheleast,Iwouldliketothankmybelovedparents,HengyiCaiandHehuaZhong,forgivingbirthtome,lovingmeandsupportingmetobecomewhoIwanttobeateverystageoflife.vTABLEOFCONTENTSLISTOFTABLES....................................viiLISTOFFIGURES...................................viiiChapter1Introduction...............................11.1SpatialEconometricModels...........................11.2VariableSelectioninHigh-dimensionalSetting.................51.3Overview......................................8Chapter2VariableSelectionWithSpatialAutoregressiveErrors.....102.1Introduction....................................102.2AgeneralizedmomentsLASSO(GLASSO)estimator.............152.3AsymptoticPropertiesforpandq.....................202.3.1ParameterConsistency..........................212.3.2SignConsistency.............................232.4Asymptoticsforlargepandq..........................272.4.1ParameterConsistency..........................282.4.2SignConsistency.............................322.5SimulationStudies................................332.6Applicationtoahedonichousingpricemodel.................372.7Proofs.......................................44Chapter3Post-modelSelectionEstimationforRegressionModelswithSpatialAutoregressiveError.....................653.1Introduction....................................653.2ModelEstimationandSelectionproperties...................673.3Post-modelestimationproperties........................703.4SimulationStudies................................743.5RealDataExample................................813.6Proofs.......................................83Chapter4Futurework...............................974.1AnextensiontoMixedRegressive,SpatialAutoregressiveModels......974.2FutureWork....................................100BIBLIOGRAPHY....................................102viLISTOFTABLESTable2.1:MeansofNZ,IZ,SCforSEMwhenpnwithpositiveˆ......36Table2.4:MeansofNZ,IZ,SCforSEMwhenp>nwithnegativeˆ......37Table2.5:Numericalsummaryofthetoflivingsizefromneighbors42Table3.1:MeansofREEfor^p,^and^oracleof100datasetsrepetitionforˆ=0:3..................................76Table3.2:MeansofREEfor^p,^and^oracleof100datasetsrepetitionforˆ=0:75.................................76Table3.3:MeansofREEfor^p,^and^oracleof100datasetsrepetitionforˆ=0:3.................................81Table3.4:MeansofREEfor^p,^and^oracleof100datasetsrepetitionforˆ=0:75................................81viiLISTOFFIGURESFigure2.1:TheAveiro-IlhavoHousingMarket................38Figure2.2:Idenneighborswithspilloveroflivingspace...42Figure2.3:Computationresults.........................43Figure3.1:Coveragerateofpost-modelselectionandoracleestimatorsforˆ=0:3................................77Figure3.2:Coveragerateofpost-modelselectionandoracleestimatorsforˆ=0:75...............................78Figure3.3:Coveragerateofpost-modelselectionandoracleestimatorsforˆ=0:3...............................79Figure3.4:Coveragerateofpost-modelselectionandoracleestimatorsforˆ=0:7
5..............................80viiiChapter1Introduction1.1SpatialEconometricModelsSpatialeconometricsisaofeconometricswhichdealswiththespatialcorrelationamonggeographicalunits.Theunits,dependingonthenatureoftheproblem,canrefertozipcodes,regions,cities,statesandsoon.Appliedworkrelatingtotransportation,housepricing,agriculturegrowth,etcreliesheavilyonsampleddatathatiscollectedfromtlocations.Thegeographicalunitsdonotneedtobeastheconcretephysicallocationsinspace,theycanalsobeusedtoexplaintheabstractinteractionbetweeneconomicagents,andagoodillustrationwillbetheconnectionthroughsocialnetworks.Whatseparatesspatialeconometricsfromtraditionaleconometricsisthatsampleddatawithalocationcomponentinvolvestwoissuesthattraditionaleconometricshasmostlyig-nored,spatialdependencebetweentheobservationsandspatialheterogeneityinthemodelingrelationships.Spatialdependencecomesfromthefactthatobservationfromlocationiisdependentonthevaluesofobservationj,wherei;jcanbeanysamplelocation.Andspatialheterogeneityreferstothethesituationwhenweexpectatregressionrelationshipforeverylocationinsamplespace.FromtheGauss-Markovassumptionusedinregression,theexplanatoryvariablesareinrepeatedsamplingandaconstantvarianceexistsfordatafromntlocations.ThusthesetwoissuesviolatetheGauss-Markovassumptionsandalternativemodelingproceduresareneededheretoaddresstheissues.1Onewaytodealwiththespatialeistoimposethespatialstructureontothenon-spatialregressionmodel.Startingwiththestandardlinearregressionmodel,whichtakestheformY=X+";whereYisan1vectorconsistingofobservationsofthedependentvariableineachsample,Xrepresentsannpmatrixofexplanatoryvariables,isthecorrespondingparametervec-torofinterestand"isthedisturbancevectorwithindependentlyandidenticallydistributederrorterms.Thegenerallinearregressionmodeliscommonlyestimatedbyordinaryleastsquaresestimator.However,whenspatialinteractionexist,aspatialeconometricsmodelcanbeconstructedbyaddingtcombinationsofinteractiontothelin-earmodel.Typically,wethinktheassociationofanobservationatasplocationwithobservationsatotherlocationscomefromthreesources:theendogenousinteractionamongthedependentvariable(Y),theexogenousinteractionamongtheindependentvariables(X),andtheinteractionctsamongtheerrorterms(").AfullmodelcontainingalltypesofspatialcanbepresentedasY=W1Y+X+W2X+u;u=ˆW3u+"whereW1Y,W2X,andW3udenotethespatialinteractionamongthedependentvariable,independentvariableanderrorterms,respectively.HereW1,W2,W3arennspatialweightmatricesusedtodescribethespatialarrangementofthegeographicalunitsinthesampleandtheymayormaynotbeidentical.andˆdenotethespatialautoregressive2parametersandveryoftentheparametersareassumedtobewithintherangeof-1and1,justlikeinatime-seriesmodel.Theparameterdescribestheclustering(positiveˆ)ordissimilarity(negativeˆ)inspaceofcertainrandomvariable.Ofthetwotypesofspatialautocorrelation,positiveautocorrelationisbyfarthemoreintuitive.Negativespatialauto-correlationimpliesachecker-boardpatternofvaluesanddoesnotalwayshaveameaningfulsubstantiveinterpretation.Eventhoughthestructureofthespatialeconometricsmodelsmentionedabovelookveryaliketheonesusedintimeseries,itisimportanttobeawarethatspatialeconometricsisnotastraightforwardextensionoftimeseriestotwodimensions.Intimeseries,thefocusisonthedependenceamongobservationsovertimeandeachobser-vationisonlycorrelatedwiththeobservationsfromthepast,whileinspatialeconometrics,moreattentionispaidonthespatialdependenceamongobservationsacrossspace,whichismulti-dimensionalandthereisnonatureorderingforthedataarrangement,sospatialeconometricsmethodologiescannotbeadirecttranspositio
nfromtime-series.Thespatialinteractionamongtlocations,whereverthesource,areallquanbythespatialweightmatrix.Thespatialweightmatrix,usuallydenoteW,isanonnegativematrixofknownconstants.Theelementsinthematrixaredecidedmainlyfromtwosourcesofinformation.TheoneisthelatitudeandlongitudeofthelocationinCartesianspace.Thecoordinatesofthelocationprovidethedistanceofanytwolocations.Andbasedonthefundamentaltheoremofregionalscience,observationsthatarenearerwillagreaterdegreeofspatialdependencethanthosemoredistantfromeachother.Thissuggestsuseoffunctionsofdistancebetweenlocationiandjaselementwijinthematrix.Theothersourceofinformationisthecontiguity,whichrepresentstherelativepositioninspaceofoneobservationtootherobservations.Fromthenatureofthesizeandshapeoftheobservationunits,wecandeterminetheofneighbors,andneighboringunits3shouldexhibitahigherdegreeofspatialdependencethanunitslocatedfurtherapart.Theneighborsinthespatialmatrixcanbebyelements1and0.Andthediagonalelementsarealwayssettozero,assumingnospatialunitcanbeviewedasitsownneighbor.Thefullmodelmentionedaboveincorporatesallinteractionyetinrealapplica-tions,modelsthatcontainfewersourcesofinteractioncanbeobtainedbyimposingrestrictionsononeormoreoftheparameters.Andtheoreticiansaremainlyfocusingonsomeofthemostlyusedmodels.Tostartwith,modelswithonlyinteractionamongtheerrortermsarecalledlinearregressionmodelwithspatialautoregressivedisturbance,alsoknownas,spatialerrormodels.ThemodelisspasY=X+u;u=ˆMu+";where"haszeromeanandvariance˙2I.Inthisregressionmodel,thedisturbancesu0isinuarefollowingaspatialautoregressiveprocess,andarecorrelatedtoeachotheracrossunits.Anotherpopulartypeofmodelwithonlyendogenousinteractiononthedependentvariableiscalledmixedregressive,spatialautoregressivemodelY=WY+X+";where"0isareindependentlyidenticallydistributedwithzeromeanandvariance˙2.ThismodelfromausualSpatialAutoregressiveProcessinthepresenceofexogenousre-gressorsXasexplanatoryvariablesinthemodel.TherehasalsobeenagrowinginterestinmodelscontainingmorethanjustonespatialinteractionAlotofeconometricprob-lemsusethemodelwhichcombinesendogenousinteractionandinteraction4amongtheerrortermsY=WY+X+u;u=ˆMu+";where"0isareindependentlyidenticallydistributedwithzeromeanandvariance˙2.Theestimationmethodsforthespatialeconometricmodels,whichhavebeenconsideredintheexistingliterature,aremainlythe(Quasi-)MaximumLikelihoodmethod,InstrumentalVari-ablemethod,GeneralizedMethodofMomentsorBayesianMarkovChainMonteCarlometh-ods.The(Quasi-)MaximumLikelihoodmethodassumestheerrorterm"tobenormalandpermitstheactualdistributiontobetfromnormaldistribution.Ithasgoodsamplepropertieswithoneorderspatiallag.However,itisnotcomputationallyattractiveforlargesamplesizeproblemsbecauseoftheestimationcomplexitiesofspatialautoregressiveparameters.TheMonteCarlomethodscometouseforthecomputationalchallenge.TheInstrumentalVariablemethodandGeneralizedMethodofMomentsarefeasibleforhigherspatiallagmodelsandtheydonotassumethenormalityoferrorterm".Plus,theGener-alizedMethodsofMomentsisalsocomputationallyfeasibleandasymptoticallyconsistentunderanexplicitsetofconditions.1.2VariableSelectioninHigh-dimensionalSettingInstatisticalresearch,wearecontinuouslydealingwiththeproblemofbuildingamodelusingacollectionofpotentiallyrelevantpredictorsforthepurposeofforecastingaresponseofinterest.Andvariableselectionservesafundamentalroleinidentifyingtherelevantpre-dictorsthattrulymakesacontributiontotheresponse.Themaingoalsofvariableselection5aretosimplifythepredictionmodelsinordertomakethemeasiertointerpret,toshortenthetrainingtimes,toenhancegeneralizationbyre
ducingovtting,aswellashopefullytoconstructanimprovedestimationmethod.Nowadays,withthedevelopmentofscienresearchandadvancedtechnology,thecollectionofvastquantitiesofdatabecomespossi-bleandincreasinglyeasy.Sometimesthedimensionoftheattributescollectedbecomessolarge,maybeevenlargerthanthesamplesize,thenitbecomesahighdimensionalstatisti-calproblem.Examplesofhigh-dimensionaldatacanbefrequentlyseeninhigh-frequencyeconomictransactions,genomics,high-resolutionimages,amongothers.Thegoodthingis,whendealingwithhigh-dimensionaldataproblem,wemaketheassumptionthattheregressionfunctionliesinalowdimensionalmanifold,andtheregressionparametervectorissparsewithmanyofthecomponentsbeingzero,whichisnotonlyreasonablebutalsomakeshighdimensionalstatisticsinferencepossible.Whentheattentionfocusesonidentifyingthetpredictors,criteriaareneededtoselectamanageablesubsetmodel.Inthelinearmodelcontext,theearliestdevelopmentsofvariableselectionwerebasedonattemptstominimizethemeansquarederrorofthepredictionwithtadjustmentsdependingonthegoalofmodeling.OneofthemostfamiliarcriteriaisMallow'sCpcriterion,whereCp=SSEpMSEfull(n2p),hereCpconsiderstheratioofSSEforpvariablemodeltoMSEforfullmodel,thenpenalizesforthenumberofvariables.Twootherpopularcriteria,motivatedfromverytpointsofview,areAIC(a.k.aAkaikeInformationCriterion)andBIC(a.k.aBayesianInformationCriterion).LettinglogLdenotethemaximumloglikelihoodofthecandidatemodelwithk-dimensionparameters,AICselectsthemodelwhichminimizes2k2logL,whereasBICselectsthemodelwhichminimizesklogn2logL.Traditionalvariableselectioncriteriaisaspformofpenalizedlikelihood,providingaframeworkforcomparison.However,inthe6highdimensionalsetupwhenthedimensionalitybecomescomparabletoorevenlargerthanthesamplesize,computationalandinferentialchallengesaret.Therehasbeenevolvingamountofliteratureworkingontechniquesthatarecapableofreducingthehighdimensionalityofthevariableaswellasproducingoptimalestimators.Approachestocopewithhighdimensionalityisusuallythepenalizedregressionmethods.ConsiderthelinearregressionmodelY=X+":Supposewehavenobservationsindexedbyi,andforeachobservation,oneresponsevariableyi,alongwithpfeaturesfxi1;;xipgareobserved.Typically,thistypeoflinearrelationbetweenYandXcanbeeasilysolvedbyleastsquaresestimators.However,itbecomesunfeasiblewhenthedimensionofthefeaturespbecomeslargerthanthesamplesizen.Penalizedregressionmethodscannowbeused,whichpenalizethemodelwithvariousregularizationtermstoencouragemodelsparsity,byminimizinganobjectivefunctionQwithageneralizedformQ()=1nL(jX;Y)+P();whichconsistsofalossfunctionLandapenaltytermP.ThepenaltytermPisindexedbyaregularizationparameterthatcontrolsthelevelofpenaltyoftheobjectivefunctionQ.Typically,thepenaltyfunctionPhasthefollowingproperties:itissymmetricabouttheorigin,P(0)=0,andPisnondecreasinginjjjj.OneofthemostpopularhasbeentheLeastAbsoluteShrinkageandSelectionOperator,asknownas,theLassomethod,proposed7byTibshirani(1996).TheLassoestimatorisestimatedbyminimizingtheobjectivefunctionQLasso()=1njjYXjj22+jjjj1;hopingtosimultaneouslyselectvariablesandestimatetheassociatedregressioncots.SincetheL1normoftheestimatoriscontrolledbythepenaltyterm,aportionofvaluesin^willbereducedtoexactzerowhileminimizingthesquaredloss.EversincethedevelopmentofLassoestimator,muchprogresshasbeenmadeinunderstandingitsstatisticalproperties.ThedevelopmentofLassoremediesthedisadvantagesofanearlierregularizationmethod,byHoerlandKennard(1970),knownastheridgeregression.TheyproposedtheobjectivefunctionasfollowsQRidge()=1njjYXjj22+jjjj22:Ridgeregressionheavilypenalizeslargecots,leadingtobiasedestimate
swhensomeofthecotsarelarge.Andalso,ridgeregressiondoesnotproducesparsesolutionsandthusfailstoimprovetheinterpretabilityofthemodel.1.3OverviewTherehavealreadybeenrichliteratureworkingonhigh-dimensionalvariableselectioninthelinearregressionsetup,butstillnotmuchhasbeentalkedaboutforspatialecono-metricmodels.High-dimensionalproblemsalsoarisefromtimetotimeineconomics,forexample,surveydatawhichcontainshundredsorthousandsofvariablesmayonlyhaveafewthatactuallyrelatetotheresponseofinterest;housepricedatawhichcontainsallcross-sectionalofgeographicneighborsmayonlyhavetrelationwithafew8neighborsnearby.Recently,Belloniandhisgrouphaveworkedonaseriesofpapersfocusingonhighdimensionalvariableselectionmethodsforsparseeconometricsmodels,withappli-cationfocusedoninstrumentalvariable.However,theydonotconsiderthepossiblespatialinteractionthatmightbeinvolvedinthemodel,andtherearenotmuchreferencestohigh-dimensionalvariableselectionforspatialeconometricmodels.Soweinthegaphere.InChapter2,wewillintroduceageneralizedvariableselectionandestimationmethodforaregressionmodelwithspatialautoregressiveerror,thebasicspatialeconometricmodel.Additionally,wewilldeveloptheparameterconsistencyandmodelselectionconsistencyfortheestimatorinboththelowdimensionalsettingwhenthedimensionoftheparameterpisandsmallerthanthesamplesizenaswellasthehighdimensionalsettingwhenpisgreaterthanandcanbegrowingwithn.InChapter3,wecontinuewiththesamemodelwetalkedaboutinChapter2andinvestigatepost-modelselectionestimatorsthatapplyleastsquaresestimationtothemodelselectedbypenalizedestimation.Wemanagetoshowthatbyseparatingthemodelselectionandestimationprocess,thepost-modelselectionestimatorcanperformjustaswellasthesimultaneousvariableselectionandestimationmethodintermsoftherateofconvergence.Anditcanstrictlyoutperformthesimultaneousvariableselectionandestimationestimatorwhentheselectionprocessisabletocorrectlyidentifythetcovariatesofthetruemodelwithlargeprobability.InChapter4,wewillextendtheworktomixedregressive,spatialautoregressivemodel,wheretheendogenousinteractionamongthedependentvariableareconsidered,andrelatedfutureworkisdiscussed.9Chapter2VariableSelectionWithSpatialAutoregressiveErrors2.1IntroductionIntheliteratureofeconometrictheoryandapplication,issuesrelatingtourban,realestate,agricultural,andenvironmentaleconomics,etc.,wherethedataarecollectedspatiallyfromcross-sectionalunitsarecommonandinthesecircumstances,thespatialrelationamongthesamplingsitescannotbeignored.Thusin1973,andOrdputforthaspatialautore-gressivemodel(alsoknownasSAR)tomodelthespatialautocorrelationofthedisturbancesacrosscross-sectionalunitsinaregressioncontext.ThismodelextendsautocorrelationintimeseriestospatialdimensionsandisavariantofthemodelsuggestedinWhittle(1954).InthisSpatialmodel,thedisturbancetermcorrespondingtoacross-sectionalunitismod-eledasaweightedaverageofdisturbancescorrespondingtoothercross-sectionalunits,plusaninnovation.Tobemoreprecise,thedisturbanceuniswrittenasun=ˆMnun+"n:10AndtheregressionmodelwithSARdisturbanceunisspeciasYn=Xn+un:Thesubscriptnindicatesthesamplesize.ThetermMnunisoftenreferredas\spatiallag".Typicallytheinnovations"nareassumedtobei.i.dwithmean0andvariance˙2andtheparameterofinterestisˆ,˙2and.Fornow,weassumethennspatialweightmatrixMnisknown.Contrarytotime-seriesmodelswhichareassociatedwithuni-directionaltimew,thespatialdatacanbeviewedasmulti-directional,witheachlocationcorrelatingwithalltheotherlocationsnearbyineverydirection.Becauseofthisparticularcharacteristicofspatialprocesses,asimpletranspositionoftime-seriesmethodologiescannotbeapplied.Sincethein
troductionofthespatialautoregressivemodel,severalmethodshavebeendevelopedforestimatingtheregressioncotsforspatialmodelswithspatialautore-gressiveerror.Tosummarize,themostwidelyknownmethodswiththeoreticalbasisarethe(Quasi-)MaximumLikelihoodmethod(Ord,1975,SmirnovandAnselin,2001),andthemethodsofmoments(KelejianandPrucha,1999).TheQuasi-MaximumLikelihoodmethodallowsforthecasewhentheactuallikelihoodfunctiontosomeextentfromthenormaldistributionassumed.OneobstacleoftheMLmethodinpracticeisitshugecomputationalburden,sincethemaximizationofthelog-likelihoodinvolvesanonlinearoptimizationthatrequirestheevaluationofthedeterminantofdimensionnn,wherenisthesizeofthedatasetforeachvalueoftheautoregressiveparameterˆused.Commoncomputationalapproachestothisproblemistheuseofeigenvaluesofthespatialweightsmatrix,asdoneinOrd(1975),butthecomputationoftheeigenvaluesquicklybecomesnumericallyunstableformorethan1000observations,orthesolutioncanbetheuseofMonteCarloestimation11toapproximateaswellasboundthedeterminantproposedbyBarryandPace(1999).InSmirnovandAnselin(2001),anewmethodforevaluatingtheJacobiantermwhichisbasedonthecharacteristicpolynomialofthespatialweightsmatrixMnisintroducedandthisalgorithmcanapproachlinearcomputationalcomplexity.However,thesimulationmethodremainsanapproximationandcannotyieldthetheoreticalproperties.ComparedwiththecomputationalyofMLmethod,KelejianandPrucha(1999)proposedanalternativeestimatorforthespatialautoregressiveparameterˆandvarianceparameter˙2inthespatialautoregressiveerrormodelbasedonageneralizedmomentsapproachwhichiscomputation-allysimpleirrespectiveofsamplesize.Further,theconditionsneededdonotinvolvetheassumptionofnormality.Thisestimatorofˆisalsoprovedtobeconsistent,thuscanbetreatedasanuisanceparameterandtheasymptoticpropertiesoftheregressionparametersolvedbasedontheestimatedˆcanretainallthegoodpropertiesoftheOLSforthemodelwhereˆisassumedtobeknown.Intheregressioncontext,oftenweourselvesinthefaceofneedtoidentifytheimportantfactorsinordertoexplaincertainphenomena.Currentdays,highdimensionalstatisticalproblemsarisefromdiverseofscienresearchandtechnologicaldevelop-ment.Here,highdimensionaldatareferstothegeneralcaseofgrowingdimensionalityandultra-highdimensionalwhichspthecasewherethedimensionalitygrowsatanon-polynomialrateasthesamplesizeincreases.Exampleofhigh-dimensionaldataincludesbutnotlimitedto:high-resolutionimages,microarrayorproteomicsdata,high-frequencydataandgenedata(FanandLv,2010).Andasaresultofthewideavailabilityofinexpensiveglobalpositioningsystemsandotheradvancesintechnology,thecollectionofvastquantitiesofdatawithgeo-referencedsamplelocationsbecomespossibleandthemodelsforspatiallycorrelateddatabecomeincreasinglyimportant.Sometimesthenumber12ofattributescollectedbecomessolarge,maybeevenlargerthanthesamplesize,andthismakesitimpossibletoconductthestandardestimationmethodthatwediscussedearlier.Thegoodthingis,webelievethatamongalltheinformationwecollect,manyofthemdonothavecantimpactonthesubjectvariableweareinterestedin,thusthep-dimensionalregressionparametersareassumedtobesparsewithmajorityofthecomponentsbeingzero.Recently,thiskindofhigh-dimensionalvariableselectionproblemhasdrawngreatattentionandmanymechanismsforlinearregressionmodelshavebeendiscovered.Amongall,oneofthemostpopularhasbeentheleastabsoluteshrinkageandselectionoperator,a.k.a.,theLassomethodintroducedbyTibshirani(1996)andmuchprogresshasbeenmadeinunderstandingthestatisticalpropertieseversince.Forexample,inKnightandFu(2000),theproblemoftheasymptoticdistributionofLasso-typeestimatorsisstudiedinthelow-dimensionalsettingwher
ethedimensionofregressionpissmallerthanthesamplesizenandisLaterinZhaoandYu(2006),analmostnecessaryandtIrrepresentableConditionforLassoisconstructedtoselectthetruemodelconsistentlybothinthepsettingandinthelargepsettingwhenpcangrowasthesamplesizengetslarger.OtherresultsconcerningtheasymptoticpropertiesoftheLassocanbefoundintheMeinshausenandBuhlmann(2006),Bickeletal(2009)andBuhlmannandvandeGeer(2011),amongothers.Econometricsarealsointheneedoftoolstodealwithhugeamountsofdataascomputersaregettingmoreinvolvedinthemiddleofeconomictransactions.Lassoanditspenalizedregressionestimatorscanbecomputedquitetlyandareprovidinggoodpredictionsinpractice(Varian,2014).Recently,Belloniandhisgroupintroducehighdimensionalvariableselectionmethodsforsparseeconometricsmodels,withapplicationfocusedonin-strumentalvariable.TheseriesofpapersincludeBelloniandChernozhukov(2011),Belloni,13ChernozhukovandWang(2011),Belloni,Chen,ChernozhukovandHansen(2012),andBel-loniandChernozhukov(2013).However,allthesevariableselectionmethodsassumethattheerrorinaregressionmodelisindependent,whichisnotthecaseinthespatialautore-gressivemodelcontext.Weargueavariableselectionmethodundertheindependenterrorassumption,e.g.standardLasso,maynotperformwellforspatiallydependentdata.TheliteratureregardingthetheoreticalresultsonasymptoticpropertiesoftheLassoestimatorwithspatialautoregressiveerrorsisverylimited.Sowellthegaphere.WecombinethespiritoftheLassomethodandthegeneralizedmethodofmomentsforspatialautoregressiveerrormodels,anddevelopageneralizedLassoestimatorwhichperformsthevariableselectionandestimationsimultaneouslyfortheregressionparameterinatwo-stageprocess.Also,weusetheconsistencypropertyoftheestimatorformodelparameterˆandthefactthatitisanuisanceparameter(KelejianandPrucha,1999),toprovethattheasymptoticpropertiesoftheLassoestimatorofremainvalidevenwhenthemodelparameterˆisreplacedbyitsmomentestimator.Boththeparameterconsistencyandmodelsignconsistencyoftheestimatorareaddressed.Here,parameterconsistencyreferstotheasymptoticpropertythatasthesamplesizengoestoiy,theresultingsequenceofestimatesconvergesinprobabilitytothetrueparametervalue,thatis,^n!;asn!1Andanestimateismodelsignconsistentifandonlyiftheprobabilitythatthesignofeachcomponentoftheestimatorequalstothatofthetrueparameterconvergestoone,thatis,14thereexistsn=f(n),afunctionofnandindependentofYnorXnsuchthatlimn!1P(^n(n)=sn)=1:Therestofthechapterisorganizedasfollows.Section2belowdescribestheproceduretoconstructageneralizedLassoestimatorinordertoselectandestimatethenonzerocompo-nentsoftheregressionparameterinaspatialautoregressiveerrorsetup.Section3discussestheasymptoticpropertiesoftheestimatorwhichincludestheparameterconsistencyandmodelsignconsistencyinthesetupwherethedimensionofparameterpisandsmallerthanthesamplesizen.Section4tacklestheasymptoticpropertiesoftheestimatorasinsection3inthehighdimensionsettingwhenpcanbegrowingwithn.Section5providesthesimulationstudiesoftheperformanceoftheestimatorfortchoicesofparameterˆinthespatialautoregressiveerrormodelwithproperselectionofthepenaltyparametern,whichwillbelater.Section6illustratesdataexampleofAveiro-IlhavourbanhousingmarketinPortugalwheretheproposedmethodcanbeapplied.Additionally,alltheproofsoflemmasandtheoremsindetailarerelegatedtothesection7.2.2AgeneralizedmomentsLASSO(GLASSO)estima-torInthissection,weproposeatwo-stageestimationprocedurewhichcombinesGMMandLASSOestimationatthesametime.Wewillfocusonthesimplespatialmodelwherethe15errortermisassumedtobespatiallyautoregressive:Yn=Xn+un;un=ˆMnun+"n;(2.1)whereYnisthen1vectorofobservationsonthedepende
ntvariable,Xnisthenpmatrixofobservationsontheexplanatoryvariables,isthep1vectorofunknownmodelparameters,andunisthevectorofspatialautoregressiveerrorswithspatialautoregressiveparameterˆ,ascalarparameter,Mnisaspatialweightingmatrix,annmatrixofknownconstants,and"nisann1vectorofidiosyncraticerrors.Forgenerality,wepermittheelementsofMnand"ntodependonn.Wemakeseveralstandardassumptionsasfollows:Assumption1:Foralln,theidiosyncraticerrors"1;"2;;"nareindependentlyandidenticallydistributedwithzeromeanandpositiveboundedvariance˙2.Additionally,weassumeE("41)<1.Assumption2:Mnisanexogenousnnmatrix.AlldiagonalelementsofMnarezero,jˆj<1andthematrixIˆMnisnonsingularforalljˆj<1.Mnisaspatialweightsmatrixwhoseelementstherelationshipbetweentunits.Inacross-sectionalsetting,iftheithandjthunitsarenotrelated,wecansetmij=mji=0wheremijisthe(i;j)thelementofMn.Often,Mnissetasacontiguity(adjacency)matrix,inwhichcasethenon-zeroelementsaresymmetricandhaveunitvalue.Inothercases,theelementsmayeconomicorgeographicdistancesbetweentheunits,inwhichcasetheyarenon-negativeandsymmetric;Mncanstillbeasymmetricifitisconsideredinrow-standardizedform.Inothermodelingcontexts,forexampleBaileyet16al.(2016),thematrixcanbesymmetricbuttheelementscanassumevalues1;0;1g.Inyetothercontexts,theweightscanbeasymmetricandwithoutanysignorotherrestrictions,beyondtheconditionsinAssumption2;see,forexample,Bhattacharjeeetal.(2016).InAssumption2,jˆj<1isastability(spatialgranularity)condition,andtheinvertibilityofthematrixIˆMnistoensureidentioninreducedform,thatis,theerrorvectorunisuniquelyintermsoftheidiosyncraticerrorvector"n,as(IˆMn)1"n.Theseassumptionsarestandard;seeforexample,KelejianandPrucha(1999)andLee(2004).Thestepofourestimationprocedureistoobtainageneralizedmomentsestimatorofˆ.TheestimationprocessfollowsthesamemethodasKelejianandPrucha(1999),andweoutlinethisbelowforconvenienceofexposition.Let~unbeapredictorforun.Further,letun=Mnunandun=MnMnun,andcorrespondingly,~un=Mn~un,and~un=MnMn~un.Similarly,let"n=Mn"n.Then,underAssumptions1and2:E[1n"0n"n]=˙2E[1n"0n"n]=˙2n1Tr(M0nMn)E[1n"0n"n]=0(2.2)Thespatialautoregressiveparameterˆisincludedintheabovemomentsequationsthroughtheexpressionn=unˆun.Thustheequationscanbeusedtoobtainageneralizedmomentsestimatorforˆ.FromEquation(2:1),andEquation(2:2),weobtainn[ˆ;ˆ2;˙2]0n=0:(2.3)Here17n=26666642nE(u0nun)1nE(u0nun)12nE(u0nun)1nE(u0nun)1nTr(M0nMn)1nE(u0nun+u0nun)1nE(u0nun)03777775;n=26666641nE(u0nun)1nE(u0nun)1nE(u0nun)3777775Nowifweconsiderthesamplemomentsbasedon~un,andusethesetoreplacethemomentsofunshownabove,similartoEquation(2:3),wegettheequation:Gn[ˆ;ˆ2;˙2]0gn=n(ˆ;˙2);(2.4)whereGn=26666642n~u0n~un1n~u0n~un12n~u0n~un1n~u0n~un1nTr(M0nMn)1n(~u0n~un+~u0n~un)1n~u0n~un03777775gn=26666641n~u0n~un1n~u0n~un)1n~u0n~un3777775The31vectorn(ˆ;˙2)canbeviewedasavectorofresiduals,andtheGMMestimatorforˆand˙2canbeasthenonlinearleastsquaresestimator,^ˆnand^˙2n,whichminimizes18thenormoftheresidualvector.Spec,(^ˆn;^˙2n)=argminˆ;˙2hGn[ˆ;ˆ2;˙2]0gni0hGn[ˆ;ˆ2;˙2]0gni:(2.5)SeveraladditionalassumptionsarerequiredtoobtaintheasymptoticpropertiesoftheGMMestimator.Assumption3:TherowandcolumnsumsofMnand(IˆMn)1areboundeduniformlyinabsolutevalue.Notethattheboundfor(IˆMn)1maydependonˆ.Assumption4:Let~ui;ndenotetheithelementof~un,weassumethatthereexistnitedimensional)randomvectorsdinandnsuchthatj~ui;nui;nj6kdinkknkwithn1Pni=1kdink2+=Op(1)forsome>0andn12knk=Op(1).Assumption5:Thesmallesteigenvalueof0nnisboundedawayfromzero,thatis,min0nn)>>0;wheremaydependonˆand˙2.Foradiscussionoftheseassumptions,werefertoKelejianandPrucha(1999).GivenAssumption1to5,theno
nlinearleastsquaresestimators^ˆnand^˙2ninEquation(2:5)areconsistentestimatorofˆand˙2,thatis,^ˆn!pˆand^˙2n!p˙2asn!1(KelejianandPrucha,1999).Letusnowfocusonthecontextofaspatialregressionmodelwhoseerrorsareautoregressive.Itiseasytoseethat,ifˆwereknown,wecouldrewritemodel(2.1)as(IˆMn)Yn=(IˆMn)Xn+"n:Then,LASSOvariableselectionandestimationofcanbeconductedusingtheL1penalized19leastsquarescriterion(YnXn)0n(ˆ)(YnXn)+npXj=1jjj;wheren(ˆ)=(IˆMn)0(IˆMn);foragivenpenaltyn,wedenotethisestimatoras^L(ˆ).Ofcourse,inpracticalapplicationsˆistypicallyunknown,andthusthedirectLASSOestimatoraboveisinfeasible.Inthiscase,wemayreplaceˆbythegeneral-izedmomentsestimator^ˆn,andproposeafeasiblegeneralizedmomentsLASSO(GLASSO)estimator^L(^ˆn)formodel(2:1)inthesecondstepoftheestimationprocess.Tobespec^L(^ˆn)=argmin(YnXn)0n(^ˆn)(YXn)+npXj=1jjj:(2.6)Theabovefunctioncanbenumericallyoptimizedusingthepackage\glmnet"inRde-velopedbyFriedmanetal.(2010).Theglmnetalgorithmsusecyclicalcoordinatedescent,whichoptimizestheobjectivefunctionovereachparametersuccessivelywhilekeepingotherswiththecyclesrepeatinguntilconvergence.Thetuningparameternischosenbycross-validationwithacertainlowerboundinferredfromthetheoreticalresultsdiscussedbelow.2.3AsymptoticPropertiesforpandqInthissection,weconsidertheasymptoticbehaviorofthegeneralizedmomentsLASSOestimator(2.6)underthesettingwhenp(thedimensionofallcandidatecovariates)andq(thedimensionofcovariateswithnon-zerocots)arebothandandsmallerthanthesamplesizen;thatis,q<0,thenpn(^L(^ˆn))!Dargmin(V(w));whereV(w)=2w0U+w0C(ˆ)w+0Ppj=1[wjsgn(j)I(j6=0)+jwjjI(j=0)],andU˘N(0;˙2C(ˆ)).TheabovetheoremestablishesparameterconsistencyoftheGLASSOestimatorinthesettingwhereboththedimensionofallcovariatespandthedimensionofnon-zerocovariatesqareandsmallerthanthesamplesizen.Further,ifwecontroltherateofconvergenceofthepenaltyparameterninaspway,theestimatorachievesasymptoticnormalitytowardstheminimizerofafunctionV(w).InthefunctionV(w),wisap1vector,Uisap1randomvectorwithnormaldistribution,andC(ˆ),edinAssumption6,involvesthespatialparameterˆandspatialweightmatrixMn.Sp,ifthetuningparameterngrowstoyataslowerratethanthesquarerootofn,wehaveaniceresult.ComparedwiththeasymptoticpropertiesofthenaiveLASSOestimatorinthelinearmodelsetting,herewehavespatialcorrelation.WethatthespatialautoregressiveparameterˆisinvolvedintheasymptoticdistributionoftheGLASSOestimatorandcontrolstheconvergencerate;ifˆ=0,theasymptoticdistributionreducestothesameasthatforthenaiveLASSO.222.3.2SignConsistencyAbove,wehaveshownparameterconsistencyofourgeneralizedmomentsLASSO(GLASSO)estimator^L(^ˆn).However,aconsistentestimatordoesnotnecessarilyconsistentlyselectthecorrectmodel.Here,wemayhavealargenumberofirrelevantpredictors,eveninthelowdimensionalsettings,andourprimarygoalistocorrectlyidentifythosewhicharerelevantsothatthemodelwillnotonlytwellbutalsobeeasilyinterpretable.Soanotherpropertywedesireisthemodelselectionconsistencyoftheestimation,whichrequiresthatP(fi:^i6=0g=fi:i6=0g)!1;asn!1:Thus,wefollowtheideaofZhaoandYu(2006)andachievetheresultthroughsignconsis-tencyoftheestimator,inwhichcase,sign(^L(^ˆn))=sign();wheresign()mapspositiveentryto1,negativeentryto1andzerotozero.Wedenotetheabovesignconsistencyconditionas^L(^ˆn)=s:Notethatsignconsistencyisstrongerthanmodelselectionconsistency,inthesensethat,ifourestimatorissignconsistent,thenthemodelselectionconsistencyconditionisautomat-icallyFurther,signconsistencyavoidstheundesirablesituationthatthemodelisestimatedonlywithzerosmatchedbutreversedsignsforsomeoftherelevantcovariates.23Notation:Assume=(1;;q;q+1;;p)0wherej6=0f
orj=1;;qandj=0forj=q+1;;p.Let(1)=(1;;q)0and(2)=(q+1;;p)0,andforanyp-columnmatrixZ,writeZ(1)andZ(2)astheqandpqcolumnsofZrespectively.Cn(ˆ)=1n[(IˆMn)Xn]0[(IˆMn)Xn].BysettingCn11(ˆ)=1n[(IˆMn)Xn](1)0[(IˆMn)Xn](1),Cn22(ˆ)=1n[(IˆMn)Xn](2)0[(IˆMn)Xn](2),Cn12(ˆ)=1n[(IˆMn)Xn](1)0[(IˆMn)Xn](2),andCn21(ˆ)=1n[(IˆMn)Xn](2)0[(IˆMn)Xn](1),wecanexpressCn(ˆ)asfollows:Cn(ˆ)=0B@Cn11(ˆ)Cn12(ˆ)Cn21(ˆ)Cn22(ˆ)1CA:ForthesamereasonasAssumption6,hereweassumethatCn11isinvertiblebasedontheuniquenessoftheparametrizationoftherelevantqcovariates.Since^ˆnisaconsistentestimatorofˆ,theinvertibilityofCn11(^ˆn)isinheritedfromthatofCn11(ˆ)whenthesamplesizeislargeenough.Thiscanholdevenforthehighdimensioncasewhenp>nandAssumption6doesnothold.Intherestofthepaper,wewillusethenotationCntodenoteCn(^ˆn)unlessspotherwise.ThefollowingpropositionplacesalowerboundontheprobabilityofLASSOpickingthetruemodelwhichquantitativelyrelatestotheprobabilityofLASSOselectingthecorrectmodel.ThisisamooftheProposition1inZhaoandYu(2006).Proposition2:1:AssumethatjCn21(Cn11)1sign((1))j61holdswithaconstant>0,wheretheinequalityholdselement-wise.Then,P(^L(^ˆn;)=s)>P(An\Bn)24forAn=fj(Cn11)1Wn(1)jrforanyr>0with06c<1,wehaveP(^L(^ˆn)=sn)=1o0@s(ˆ)encs2(ˆ)1A:Fromtheaboveresult,itisclearthattheconvergenceratefortheestimationmethodtochoosethecorrectmodelisaboundedfunctionofthespatialparameterˆtimesthe26exponentialofafunctionofnands.Here,s(ˆ)istheboundforthediagonalelementsofC111˙2andC22C21C111C12˙2.Becauseofthespatialstructureaddedtothelinearmodel,theconvergencerateisWhiletheconvergencerateinthei.i.dcaseisrelatedonlyton,nowthisdependsalsoonˆ.Oneremarkhereisthat,forTheorem2.2,theofthespatialcorrelationtotheestimatorintheformofafunctionofˆcaninsteadbeappliedtothepenaltyparameternasalowerbound.Inthisway,additionalinformationcanbeusedforthechoiceofnbesidescross-validation.2.4AsymptoticsforlargepandqIntheprevioussection,weprovedparameterconsistencyandsignconsistency,aswellastheasymptoticdistribution,ofourgeneralizedmomentsLASSOestimator^L(^ˆn)asn!1undertheclassicalsettingwherep,q,andareallandpandqaresmallerthann.ThesettingisinthesensethatitisnaturaltoassumetheregularityconditionsasstatedinAssumption6:C=limn!11nX0nˆ)XnwhereCisandnonsingular.However,inpractice,therearemanysituationswherelargepandthusqareneeded;itcaneitherbelargerthanthesamplesizenorincreaseatsomerateasn.Inthelargepandqcase,weallowthedimensionofthedesignsCngrowandmodelparameterchangeasngrows,thatis,p=pnandq=qnn.2.4.1ParameterConsistencyInthissection,weprovethat,withanappropriatechoiceofn,thegeneralizedmomentsLASSOestimator^L(^ˆn)obeysthefollowingoracleinequalitywithaprobabilitythatcanbemadearbitrarilyclosetounity.Thatis,forlargeenoughn,theconditionjj(I^ˆnMn)Xn(^)jj22n+njj^jj1642ns0˚20iswithanarbitrarilylargeprobability.Theinequalityprovidesaboundforjj^jj1,andthustheestimatorisconsistentiftheboundconvergestozero.Here,isthetruevalueoftheunknownparameter,n=Ologpn,wedenotetheGLASSOestimatorby^fornotationalsimplicity,s0isthecardinalityofthesetofnonzerocomponentsof,andS0and˚0areconstantsdependingonthedesignmatrixXn.Bytheofthegeneralizedmomentsestimator:^:=argmin˚(YnXn˚)0^ˆn)(YnXn˚)+njj˚jj1:Since^providestheminimaofthispenalizedobjectivefunction,wehavetheinequalitybelowwithchangeofscaleofn:jj(I^ˆnMn)(YnXn^)jj22n+njj^jj16jj(I^ˆnMn)(YnXn)22n+njjjj128Rearrangingtermsandusingthetriangleinequality,weobtainourBasicInequality:jj(I^ˆnMn)Xn(^)jj22n+njj^jj1620n(IˆM0n)1^ˆn)Xn(^)n+njjjj1:(2.7)NotethatthetermontheRHSoftheBasicInequality(2.7)canbeeasilyboundedintermsoftheL1-normofparametersinvolved:2"0n(IˆM0n)1^ˆn)Xn(^)6max16j6p2
j"0nT(j)jjj^jj1whereT(j)isthejthcolumnofthematrixT=(IˆM0n)1^ˆn)Xn.Next,weintroducetheset=:=ˆmax16j6p2j0nT(j)j=n60˙wherewearbitrarilyassumethat206ntomakesurethaton=wecangetridoftherandompartoftheproblem.Now,wehavethefollowingresult.Proposition2:2:SupposeAssumptions1-5hold,andfurtherassumealltheelementsofXnarenonstochasticanduniformlyboundedinabsolutevalue,thenforallt>0,ifwe0=2˙(exp[t2=2]+1)rlog2pn;wehaveP(=)>1Kexp[t2=2]:forsomepositiveconstantK.Proofisgiveninthelastsection.Sinceweareinasituationwherepisgrowingwith29n,andpossiblyp>n,wegenerallyconsiderthefactthatonlyafew,says0,ofthejarenon-zero.Toquantifythesparsityofthetrue0,wedenoteS0:=fj:0j6=0g;sothats0=jS0j.Intheliterature,S0iscalledtheactiveset,ands0thesparsityindexof0.Beforewestatetheoracleinequality,usingn>20andtheBasicInequality(2.7),wehaveon=,2jj(I^ˆnMn)Xn(^)jj22=n+2njj^jj16njj^0jj1+2njjjj1:Sincejj^jj1=jj^S0jj1+jj^Sc0jj1>jjS0jj1jj^S0S0jj1+jj^Sc0jj1;andalso,jj^0jj1=jj^S0S0jj1+jj^Sc0jj1:Therefore,2jj(I^ˆnMn)Xn(^)jj22=n+njj^Sc0jj163njj^S0S0jj1:(2.8)Here,nissomeregularizationparametersatisfyingtherelationshipwith0inProposition2.2.FromAssumption1,wehave0<˙20,andforallsatisfyingjjSc0jj163jjS0jj1,itholdsthatjjS0jj216(0X0n^ˆn)Xn)s0=(n˚20):Notethat,whenwesolvefortheLASSOestimatorinthesecondstepofourestimationprocess,^ˆnisconsideredaknownparameter.Finally,weobtainparameterconsistencyasfollows.Theorem2:3:SupposeCondition1holdsforS0,forsomet>0,andlettheregularizationparametern>20,thenon=,wehavejj(I^ˆnMn)Xn(^)jj22n+njj^jj1642ns0˚20:Theresultalsomeansthatwithprobabilityatleast1Kexp[t2=2],wehavejj(I^ˆnMn)Xn(^)jj22n+njj^jj1642ns0˚20:31Asdiscussedabove,theaboveresulttellsstatesthatwithhighprobability,theL1normofthebetweentheestimatorandthetruevalueoftheparameterofinterestisboundedbyafunctionofnands0(sameasthedimensionofnon-zeroparametersqn).Further,theconsistencyoftheestimatorisachievedwhentheboundconvergesto0asngoestoy,andpnandqninthiscaseneedtosatisfy:q2nlog2pnn!0:2.4.2SignConsistencyWehavealreadyprovedsignconsistencywhichinfersthemodelselectionconsistencyofourgeneralizedmomentsLASSOestimatorwithaconditionsimilartotheStrongIrrepresentableConditioninZhaoandYu(2006).Now,weextendtheresulttosignconsistencyoftheestimatorinthehighdimensionalcasewhenpandqarelargeandgrowingwithn,followingthepreviousargumentsbutwithanadditionalassumption:Assumption7:Thereexists06c10sothatthefollowingholds:1n(X0nˆ))ii6K1;8i;0Cn11(ˆ)>K2;8jjjj22=1;(2.9)qn=O(n2c1);n1c22mini=1;;qjnij>K3:Undertheaboveassumptions,wecanhavethefollowingresult.32Theorem2:4:UnderAssumptions1-5and7,iftheconditionj(Cn21)(Cn11)1sign((1))j61holdsforsome>0,thenforpn=o(n2(c2c1)),and8nthatnpn=O(nc2c12),wehaveP(^L(^ˆn;n)=s)>1Or(ˆ)n2c22o(1)!1;asn!1:Here,wedenoter(ˆ)asafunctionofthespatialparameterˆwhichcontrolsthemaximumoftheabsolutevalueoftheelementinthematrix(Cn11(ˆ))1.Thetermr(ˆ)controllingtheconvergencerateoftheestimatortocorrectlyselectthetruemodelinourspatialautore-gressiveerrorssettingfromthatinthetraditionalindependentdatalinearregressionsetting.2.5SimulationStudiesInthissectionwestudythesampleperformanceofthegeneralizedmomentsLASSOestimator(GLASSO)^L(^ˆn)inboththelow-dimensionalsettingandthehigh-dimensionalsettingandcomparethesewiththetraditionalLASSOestimator^Laswellastheordinaryleastsquaresestimator^OLS(whenapplicableinthelowdimensionalcase);boththe^Landthe^OLSignorespatialdependenceinthedata.Forthispurpose,weconductatwo-partMonteCarlostudy.Throughout,wesetthedistributionoftobenormal,andwithoutlossofgenerality,N(0;1).Thisisbecausetheestimatorsforˆearlierdonotdependon˙2.Weconsider6choicesofˆ,covering
therangefrom1to1,togetherwith5choicesofthesamplesizen,andthuswehaveatotalof30casesforoursimulationstudy.Foreach33case,theresultsaresummarizedover200MonteCarloreplications.Thedetaildesignofthestudyisasfollows.TheweightmatrixMnisasanidealizedweightingmatrixMnfollowingKelejianandPrucha(1999),whichmeans,Mnwasselectedsuchthateachelementofuniisdirectlyrelatedtotheoneimmediatelybeforeandafterit.Weassumetheaboverelationshiptobecircular,sothatunisrelatedtoun;1andun;n1,forinstance.Forsimplicity,wespecifyMnsuchthatallthenon-zeroelementsofMnareequalandthattherespectiverowssumto1.OurmainobjectofinterestliesintheabilityofthegeneralizedmomentsLASSOes-timatortoconsistentlychoosethecorrectparameter,andthesimulationresultshowsthemean(over200replicates)forthevalueofCorrectly,Falsely,andSign-correctlyidencomponentoftheparameterforourGLASSO,traditionalLASSOandOLS(onlyinthelowdimensionalcase),respectively.NotethatHuangetal.(2010)demonstratetheselec-tionconsistencyofusinggroupLASSOforvariableselectionwithacertainlowerboundforthepenaltyparametern.Fortheanalysis,wealsosetaproperchosenlowerboundforthecross-validationselectionofn,whichtheconditionsimpliedbyourtheoreticalresults.Inthelow-dimensionalsetup,thedimensionoftheparameterischosenasp=50withtheq=5non-zerocomponentsindependentlygeneratedfromauniformdistributionovertheinterval(2;5)andtherestarezerocots.ThecovariatesXi'sareIIDfroma50-dimensionalGaussiandistributionwitheachcomponenthavingmeanzeroandvariance1.Thepairwisecorrelationissettobecor(xij;xik)=0:5jjkj.Theresultsforthelow-dimensionalsetupareshowninTable2.1and2.2.Ineachofthetables,thereportedarethemeansofthestatisticsfrom200repetitions;NZrepresentsthecorrectlyselectedcomponents,IZrepresentstheincorrectlyselectedcomponentsandSCrepresentsthethe34numberofcorrectlyselectedcomponentswithcorrectsign.Forthehigh-dimensionalsetup,wesetthedimensionoftheparameterp=1000butthetruenumberofcomponentsthataretisonlyq=20.Mnisspthesameasitisinthelow-dimensionalsetting.The20non-zerocomponentsarealsogeneratedindependentlyfromauniformdistributionovertheinterval(2;5).ThedesignmatrixXnlikewiseisthesameasthatofthelow-dimensionaldesignwithonlychangeofdimensions.TraditionalOLSbecomesimpossiblesoweonlycomparetheperformanceofthegeneralizedLASSOandthetraditionalLASSO.Notehere,inthetraditionalLASSOapproach,weignoretheautocorrelationoftheerrorunandtreattheerrorsasIID.TheestimatorofLASSOisachievedbyusingthepackage"glmnet"inRandthepenaltyparameternischosenby10-foldcross-validation.AnotherissuethatdistinguishesourmethodfromthetraditionalLASSOistheuseofalowerboundforn,whichtheconsistencyofourapproach.TheresultsarerecordedinTables2.3and2.4.Table2.1:MeansofNZ,IZ,SCforSEMwhenpnwithpositiveˆnˆ=0.25ˆ=0.5ˆ=0.75NZ(20)IZ(980)SC(20)NZ(20)IZ(980)SC(20)NZ(20)IZ(980)SC(20)100GL15.577.415.415.679.715.615.280.715.2L16.4105.616.416.4113.316.415.9117.515.9200GL19.696.519.619.5104.519.519.3122.219.3L19.7116.719.719.7168.119.719.6254.219.6400GL2093.12019.9116.319.919.715819.7L20129.72020194.92019.9277.319.9600GL2048.5202069.12019.9127.019.9L20140.32020226.32020325.720800GL2014.9202023.7202061.820L20150.92020258.22020374.620parameterwhenthesamplesizengetslarger.Whatdistinguishesthemethodstrulyistheirabilitytoidentifytheirrelevantcomponentsandsetthesetozero.FromTables2.1and2.2,inthelow-dimensionalcase,itisclearthatthetraditionalLASSOisnotsuitablefordependentdataandOLSworksreasonablywellforallchoicesofn.However,eventhoughourgeneralizedmomentsLASSOestimator(GLASSO)falselyselectsmorezerocomponentsinsmallsamplesizes,theresultsgetmuchbetterwithincreasingdataandperforms
betterthantheOLSwhennexceeds400.Theseresultsareconsistentforallchoicesoftheautoregressiveparameterˆ.36Table2.4:MeansofNZ,IZ,SCforSEMwhenp>nwithnegativeˆnˆ=-0.25ˆ=-0.5ˆ=-0.75NZ(20)IZ(980)SC(20)NZ(20)IZ(980)SC(20)NZ(20)IZ(980)SC(20)100GL15.0969.5815.0714.3570.9014.3413.9367.0413.89L15.9888.4315.9514.9282.1814.8813.6460.0913.61200GL19.6272.6119.6219.5671.1219.5619.5068.4219.50L19.6571.6919.6519.4658.919.4619.0548.6719.05400GL19.9743.4319.9719.9941.8719.992040.3620L19.9954.2619.9919.9535.7219.9519.7623.8319.76600GL2014.10202015.27202015.0720L2046.13202028.382019.9516.7119.95800GL203.0720204.1220204.220L2042.16202024.282019.9914.1419.99Inthehigh-dimensionalsetting,sinceOLSbecomesunavailable,weonlycomparetheperformanceofthetraditionalLASSOandourtwo-stageGLASSOestimator.Still,fordif-ferentchoicesofˆ,eventhoughbothmethodscanmostlyselectthenon-zeroregressioncotscorrectly,thetraditionalLASSOperformspoorlyrelativetothegeneralizedmo-mentsLASSOestimatorincorrectlyidentifyingthezeroelements.ThemostinterestingobservationisthewaytheLASSOover-selectsinthepresenceofevenalittlebitofspatialdependence.Thisinabilityisnotasamplebias:ifanything,theproblemworsenswithsamplesize.Thereisalsoanimportantasymmetrybetweenpositiveandnegativedepen-dence,whichhastodowithchallenginginferencesinnegativeautocorrelationsituations.Insummary,theLASSOlosesitsselectionabilitywhentheerrorsarenotindependent.2.6ApplicationtoahedonichousingpricemodelInthissection,weillustratetheproposedtwo-stepGLASSObyapplicationtohousingmarketdataforthemunicipalitiesofAveiroandIlhavoandtheadjoiningperi-urbanandruralareaincentralPortugal(Figure2.1);seeBhattacharjeeetal.(2016)forthedataandforfurtherinformation.37Figure2.1:TheAveiro-IlhavoHousingMarketThedatasetwasprovidedbytheJanelaDigitalS.A,whichownsthelargestportalinPortugalforrealestateadvertisement,andcontainsn=12;467observations(housesonsale)sampledfrom76tlocationswithintheabovehousingmarketovertheperiodOctober2000andMarch2010.Ourinteresthereliesmainlyinestimatingthespatiallyvaryingimplicitpriceoflivingspacethatismodeledbythelivingspaceelasticityofhouseprice.Weestimatethiselasticitybyregressingthelogarithmofhousepricepersquaremeteroflivingareaonthelogarithmofsquaremetersoflivingspace.Thisisanexampleofahedonichousepricemodel;seeBhattacharjeeatal.(2016)forfurtherdiscussion.Potentially,severalotherregressorsrelatingtotheattributesofthehouse,aswellasthecharacteristicsoftheneighborhood,alsohousepricesandhenceshouldbeincludedascontrols.However,theofthesehedoniccharacteristicsonthespatiallyvaryingestimatesoflivingspaceelasticityisnotsubstantial,afterspatialdependenceisadequatelymodeled.Hence,forthisillustrativeexample,weabstractfromthefullestimationofahedonichousepricemodel,andfocusonspatialdependence.38Wemodelthespatialaspectsquitefully.Thisisdoneinthreeways.First,weallowfullspatialheterogeneitybyallowingtheshadowpriceoflivingspace(ii)tovaryacrosstheL=76locations.Inaddition,weallowforLlocationsp(i)toaccountforneighborhoodlevelunobservedheterogeneity.Second,wemodelspatialspilloversinhousepriceshocksbyspatialautoregressiveerrors,wherethespatialweightsmatrix(Mn)isarow-standardizedversionofinversegeographicaldistanceweights.Thatis,weconstructaweightsmatrixwhere,correspondingtotwohousesintlocations,thediagonalelementsarereciprocaloftheEuclideandistancebetweenthelocations;ifthehousesareinthesamelocation,thecorrespondingspatialweightisthereciprocalofhalfthedistanceofthatlocationtoitsnearestneighborlocation.Thisweightsmatrixisthenrow-standardizedbydividingeachelementbythesumofallentriesinitsrow,andthistransforme
dmatrixthenconstitutesourspatialweightsmatrixMn.Third,andmostimportantlyinthecontextofthiswork,weallowspilloversofthequalityofhousingstockfromneighboringlocationstohousingpriceinanindexlocation.ThemostpopularwaytoaccommodatesuchspilloversinexogenouscovariatesisthespatialDurbinmodel(LeSageandPace,2009):Yn=Xn+WnXn+un;un=ˆMnun+n:Here,inadditiontothe(direct)ofthecovariatesandthespatialautoregressiveerrors,thereisalsothecomingarisingfromcovariatevaluesintheneighborhood,andcapturedthroughaspatiallagterm(WnXn)withcorresponding.TheabovespatialDurbinmodelcanhavethestructuralinterpretationofcapturingthetruespilloversintheofcharacteristicsintheneighborhood,butmayalsosometimesbeseeninthereduced39formofomittedorinappropriatelymodeledspatialdependence(LeSageandPace,2009).Whateverthemechanism,thespatialDurbinmodelisanimportantworkhorsemodelincontemporaryspatialeconometrics.Typically,thespatialweightsmatrixWnisassumedknownapriori,andusuallytakentobethesameasMn.However,mis-measuredspatialweightscanhaveseriousimplicationsontheinferencesdrawn,andacurrentbranchoftheliteraturefocusesoninferencesonthespatialweightsthemselves;see,forexample,Bhattacharjeeetal.(2016).Here,weusetheGLASSOforidentifyingtheneighborsthatmatterandforestimatingtheimpliedweightsmatrixWn,whichhasL(L1)elements.Thisallowsspilloversandtheirstrengthtovaryoverthespatialdomain,whichisnaturalinthecurrentcontextofhedonicpricing.Inatypicalapplication,thiswouldimplyaddingcovariatesforalllocationsontherighthandsideoftheregressionmodelandthenuseLASSObasedmodelselectiontoestimateboththespatiallyvaryingslope()andspilloversfromotherlocations(Wn).Inthecontextofourapplication,theestimationofathreedimensionalfunctionalsurfaceofthespatialvaryingoflivingspacecanbetailoredtotheregressionofalinearcombinationoftheoflivingspaceovernearbylocations,besidestheoflivingspaceoneachsplocation.Thus,thegeneralizedmomentsLASSOvariableselectionandestimationmethodproposedisusefulwhenweselectneighboringlocationswhoselivingspacehaveanontheindexlocationandtoestimatehowlargetheis;intheprocess,wecanbuildaparsimoniousmodelbyeliminatingthoselocationsthatareirrelevantforhousingpricesateachindexlocation.Further,thereissomebecauseofreplicationsinourdata.Foreveryhouseinanindexlocation,housesinanyspotherlocationisexpectedtobeexchangeable,andhencewhatmattersisnotthelivingspacesofthesehouses,buttheiraverageateach40location.Tobespthechosenlinearmodelcanbedescribedas:Yij=i+xijii+Xk6=ixk:ik+uij;i=1;2;;L;j=1;2;;ni;n=Xini:HereYijisthelogarithmofhousepricepersquaremeteroflivingspaceforthejthreplicationintheithlocation,whileXijrepresentsthelogarithmoflivingspaceofthecorrespondinghouse,andtheaverageofthelogarithmoflivingspaceateachofotherlocationskisdenotedasxk:.Further,uijisaspatialautoregressiveerrorwithspatialweightmatrixbasedonthedistancebetweenthelocations(Mn).Underthemodel,foreachreplicationjinthelocationi,thelogarithmofhousepricecanbemodeledasthelinearcombinationofitscorrespondinglogarithmoflivingspace,alongwiththeaverageofsampledlogarithmsoflivingspaceatotherlocations,plusanerrorterm.Theresponsevariableyijisarrangedbylocationsintoacolumnvectorofdimensionn=12467.Weareinterestedinselectionandinferencesontheoftheaverageoflogarithmoflivingspaceatlocationkoneachofthesampleoflogarithmofhousepriceatlocationi,whichisdenotedik,togetherwithlocationsp(i)andspatiallyvaryinglivingspaceelasticitiesofhouseprice(ii).Thismakesourparameterofinteresttobeap=7677dimensionvector.Eventhoughinthisdataset,itisnotthehigh-dimensionalsettingweearlierwhichrequiresp>n,itishighdimensioninthesensethatpisconsiderablylarge
.Inadditiontoestimatingthespatiallyvaryingimplicitpriceoflivingspace,wewishtoidentifythoselocationswith\livingspace"oneachother,sohenceweimplementthetwostepmethodforvariableselectionandestimationfortheproposedmodel.WecomparetheseresultswiththetraditionalLASSOmethod.Byconductingvariableselectionandestimationfortheproposedmodelateachlocation,41Figure2.2:IdenneighborswithspilloveroflivingspacetheoflivingspacefromlocationsintheneighborhoodisthusidenPartoftheresultsareshowninTable2.5asillustrationandwesummarizethenumberofidenneighborsthroughboxplot(Figure2.2).Table2.5:NumericalsummaryofthetoflivingsizefromneighborsLocationNo.ofneighbors(GLASSO)No.ofneighbors(LASSO)1556255934634656552667257488343942510111Table2.5illustratestheinnumberofselectedneighborswithspilloveroflivingspaceidenbythegeneralizedmomentsLASSOestimatorandthetra-ditionalLASSOmethod.Coincidingwiththesimulationresults,thetraditionalLASSOestimatortendstoover-selectirrelevantvariablescomparedtothegeneralizedmomentsLASSOestimatorandthusweakenselectionpower.ThesummaryofboxplotsinFigure2.2furthersupportstheaboveconclusionusingtheoveralldistributionofthenumberof42(a)Spatialforlocation1(b)Spatialforlocation6Figure2.3:ComputationresultsneighborsidenWhiletheGLASSOselectsparsimoniousmodelswithamedianof5neighbors,thetraditionalLASSOselectsenormouslylargemodelswithamedianofabout40neighbors.ThenetworkgraphsinFigure2.3usetwoexamplelocationstoillustratetherelationshipbetweenthesesplocations(Locations1and6inourcase)andtheidenlocationswithtspilloverThemagnitudesofthelocationswithlargespatialarealsoshown.Eachpointislocatedatitsownlocationbycoordinatesandthedistancebetweenlocationsrepresenttherelativedistancebetweenpoints.TakeLocation1forexample.ThehousepriceinthislocationisbythelivingspaceateothersurroundinglocationsandLocation2hasthelargestofthese5locations.Wecanseethatidenlocationsarenotcompletelyrandombutsomewhatconditionedbydistancesonthespatialdomain.Atthesametime,onecanclearlyseetheadvantagesofallowingthepatternsofspilloverstobetintlocations.ComparingtheresultswithFigure2.1,weseethatourmethodcansuccessfullyidentifythespilloverlocationsinthenearbyarea.432.7ProofsPROOFofTheorem2:1.nearandomfunctionofˆand˚,Zn(˚;ˆ)=1n(YnXn˚)0ˆ)(YnXn˚)+nnpXj=1j˚jj:BytheofLASSOestimator,foranyˆ,Zn(˚;ˆ)isminimizedat˚=^L(ˆ).However,wenothavethetruevalueofˆ,butinstead,weusetheGMMestimator^ˆnasasubstitute.ThenthefunctionZn(˚;^ˆn)isminimizedatthegeneralizedmomentsLASSOestimator˚=^L(^ˆn).Furthermore,denotebythetruevalueoftheunknownparameter,andletZ(˚;ˆ)=(˚)0C(ˆ)(˚)+˙2:Then,itiseasytoseethatforanygivenˆ,Z(;ˆ)isminimizedat˚=.Foreach˚2Rp,Zn(˚;^ˆn)=1n(YnXn˚)0^ˆn)(YnXn˚)+nnpXj=1j˚jj=12+2+3where1=1n(YnXn˚)0^ˆn)(YnXn˚)2=1n(YnXn˚)0ˆ)(YnXn˚)3=nnpXj=1j˚jj44Sincenn!0,wehave3!0.Also,2=1n[(IˆMn)Xn(˚)+"n]0[(IˆMn)Xn(˚)+"n]=1n(˚)0X0nˆ)Xn(˚)+1n"0n(IˆMn)Xn(˚)+1n(˚)0X0n(IˆMn)0"n+1n"0n"n!p(˚)0C(ˆ)(˚)+˙2=Z(˚;ˆ);byAssumption6andtheweaklawoflargenumbers.45Moreover,since^ˆnisaconsistentestimatorofˆ,12=1n(YnXn˚)0^ˆn)ˆ)](YnXn˚)=1n(YnXn˚)0[(ˆ^ˆn)(Mn+M0n)+(^ˆ2nˆ2)M0nMn](YnXn˚)=1n[(˚)0X0n+"0n(IˆM0n)1][(ˆ^ˆn)(Mn+M0n)+(^ˆ2nˆ2)M0nMn][Xn(˚)+(IˆMn)1"n]=1n(ˆ^ˆn)(˚)0X0n(Mn+M0n)Xn(˚)+1n(^ˆ2nˆ2)(˚)0X0n(M0nMn)Xn(˚)+1n(ˆ^ˆn)"0n(IˆM0n)1(Mn+M0n)Xn(˚)+1n(^ˆ2nˆ2)"0n(IˆM0n)1(M0nMn)Xn(˚)+1n(ˆ^ˆn)(˚)0X0n(Mn+M0n)(IˆMn)1"n+1n(^ˆ2nˆ2)(˚)0X0n(M0nMn)(IˆMn)1"n+1n(ˆ^ˆn)"0n(IˆM0n)1(Mn+M0n)(IˆMn)1"n+1n(^ˆ2nˆ2)"0n(IˆM0n)1(M0nMn)(IˆMn)1"n!p0Therefore,Zn(˚;^ˆn)Z(˚;ˆ)!p0forany˚2Rp.CombinedwiththefactthatZn(˚;^ˆn)isaconvexfunctionof˚,wehavesup˚2KjZn(˚;^ˆn)Z(˚;ˆ)j!p0foranycompactsetKand^L(^ˆn)2Op(1)byapplyingtheconvexitylemmainPollar
2.7 Proofs

PROOF of Theorem 2.1. Define a random function of φ and ρ,

Z_n(φ; ρ) = (1/n)(Y_n - X_n φ)'Ω(ρ)(Y_n - X_n φ) + (λ_n/n) Σ_{j=1}^p |φ_j|.

By the definition of the LASSO estimator, for any ρ, Z_n(φ; ρ) is minimized at φ = β̂_L(ρ). However, we do not have the true value of ρ; instead we use the GMM estimator ρ̂_n as a substitute. Then the function Z_n(φ; ρ̂_n) is minimized at the generalized moments LASSO estimator φ = β̂_L(ρ̂_n). Furthermore, denote by β the true value of the unknown parameter, and let

Z(φ; ρ) = (φ - β)'C(ρ)(φ - β) + σ².

Then it is easy to see that for any given ρ, Z(φ; ρ) is minimized at φ = β. For each φ ∈ R^p,

Z_n(φ; ρ̂_n) = (1/n)(Y_n - X_n φ)'Ω(ρ̂_n)(Y_n - X_n φ) + (λ_n/n) Σ_{j=1}^p |φ_j| = (Δ_1 - Δ_2) + Δ_2 + Δ_3,

where
Δ_1 = (1/n)(Y_n - X_n φ)'Ω(ρ̂_n)(Y_n - X_n φ),
Δ_2 = (1/n)(Y_n - X_n φ)'Ω(ρ)(Y_n - X_n φ),
Δ_3 = (λ_n/n) Σ_{j=1}^p |φ_j|.

Since λ_n/n → 0, we have Δ_3 → 0. Also, writing Y_n - X_n φ = X_n(β - φ) + (I - ρM_n)^{-1}ε_n,

Δ_2 = (1/n)[(I - ρM_n)X_n(β - φ) + ε_n]'[(I - ρM_n)X_n(β - φ) + ε_n]
    = (1/n)(φ - β)'X_n'Ω(ρ)X_n(φ - β) + (2/n)ε_n'(I - ρM_n)X_n(β - φ) + (1/n)ε_n'ε_n
    →p (φ - β)'C(ρ)(φ - β) + σ² = Z(φ; ρ),

by Assumption 6 and the weak law of large numbers. Moreover, since ρ̂_n is a consistent estimator of ρ,

Δ_1 - Δ_2 = (1/n)(Y_n - X_n φ)'[Ω(ρ̂_n) - Ω(ρ)](Y_n - X_n φ)
          = (1/n)(Y_n - X_n φ)'[(ρ - ρ̂_n)(M_n + M_n') + (ρ̂_n² - ρ²)M_n'M_n](Y_n - X_n φ);

substituting Y_n - X_n φ = X_n(β - φ) + (I - ρM_n)^{-1}ε_n and expanding gives eight terms, each of which is o_p(1) because ρ̂_n - ρ = o_p(1), ρ̂_n² - ρ² = o_p(1) and the associated quadratic forms are O_p(1). Hence Δ_1 - Δ_2 →p 0.

Therefore Z_n(φ; ρ̂_n) - Z(φ; ρ) →p 0 for any φ ∈ R^p. Combined with the fact that Z_n(φ; ρ̂_n) is a convex function of φ, we have sup_{φ ∈ K} |Z_n(φ; ρ̂_n) - Z(φ; ρ)| →p 0 for any compact set K, and β̂_L(ρ̂_n) ∈ O_p(1), by applying the convexity lemma in Pollard (1991). From the above result we have argmin Z_n(φ; ρ̂_n) →p argmin Z(φ; ρ), which implies β̂_L(ρ̂_n) →p β.

For asymptotic normality of the estimator, we need λ_n to grow slowly, and we further assume that λ_n = O(√n). From the above proof we already know that

n Z_n(φ; ρ̂_n) = (Y_n - X_n φ)'Ω(ρ̂_n)(Y_n - X_n φ) + λ_n Σ_{j=1}^p |φ_j|

is minimized at φ = β̂_L(ρ̂_n). Now let w = √n(φ - β). Then n Z_n(φ; ρ̂_n) can be treated as a function of w,

n Z_n(φ; ρ̂_n) = (Y_n - X_n(w/√n + β))'Ω(ρ̂_n)(Y_n - X_n(w/√n + β)) + λ_n Σ_{j=1}^p |w_j/√n + β_j| =: Ṽ_n(w),

which is minimized at √n(β̂_L(ρ̂_n) - β). The same is true for

V_n(w) = Ṽ_n(w) - (Y_n - X_n β)'Ω(ρ̂_n)(Y_n - X_n β) - λ_n Σ_{j=1}^p |β_j|.

It follows (with λ_0 the limit of λ_n/√n) that

λ_n Σ_{j=1}^p [|w_j/√n + β_j| - |β_j|] → λ_0 Σ_{j=1}^p [w_j sgn(β_j) I(β_j ≠ 0) + |w_j| I(β_j = 0)].

Also, let

Γ_n(w) = (Y_n - X_n w/√n - X_n β)'Ω(ρ̂_n)(Y_n - X_n w/√n - X_n β) - (Y_n - X_n β)'Ω(ρ̂_n)(Y_n - X_n β) = [Γ_n(w) - Γ_1(w)] + Γ_1(w),

where Γ_1(w) = (Y_n - X_n w/√n - X_n β)'Ω(ρ)(Y_n - X_n w/√n - X_n β) - ε_n'ε_n. It is easy to see that

Γ_1(w) = (ε_n - (I - ρM_n)X_n w/√n)'(ε_n - (I - ρM_n)X_n w/√n) - ε_n'ε_n
       = -(2/√n) w'X_n'(I - ρM_n)'ε_n + (1/n) w'X_n'(I - ρM_n)'(I - ρM_n)X_n w
       →d -2w'U + w'C(ρ)w,

where U ~ N(0, σ²C(ρ)). Also,

Γ_n(w) - Γ_1(w) = -(2/√n) ε_n'(I - ρM_n')^{-1}[(ρ̂_n - ρ)(M_n' + M_n) - (ρ̂_n² - ρ²)M_n'M_n]X_n w - (1/n) w'X_n'[(ρ̂_n - ρ)(M_n' + M_n) - (ρ̂_n² - ρ²)M_n'M_n]X_n w →p 0,

where we use the consistency of ρ̂_n as in the proof above. Thus V_n(w) →d V(w), and combined with the fact that V_n is convex and V has a unique minimum, it follows from Geyer (1996) that

argmin(V_n) = √n[β̂_L(ρ̂_n) - β] →d argmin(V(w)).

PROOF of Proposition 2.1. By the definition of the estimator in the second estimation step,

β̂_L(ρ̂_n) = argmin_φ (Y_n - X_n φ)'Ω(ρ̂_n)(Y_n - X_n φ) + λ_n ||φ||_1,

where the estimator is the minimizer of the penalized least squares when the true spatial parameter ρ is replaced by its consistent estimator ρ̂_n. Let δ = φ - β, which is equivalent to w/√n in the proof of Theorem 2.1. The following argument is similar to the proof of Theorem 2.1. Define

D_n(δ) = (Y_n - X_n(δ + β))'Ω(ρ̂_n)(Y_n - X_n(δ + β)) + λ_n ||δ + β||_1 - (Y_n - X_n β)'Ω(ρ̂_n)(Y_n - X_n β).

Then δ̂ = β̂_L(ρ̂_n) - β = argmin_δ D_n(δ). Separate D_n(δ) into two parts, D_{n1}(δ) and D_{n2}(δ), where

D_{n1}(δ) = (Y_n - X_n(δ + β))'Ω(ρ̂_n)(Y_n - X_n(δ + β)) - (Y_n - X_n β)'Ω(ρ̂_n)(Y_n - X_n β)
          = [(I - ρ̂_n M_n)((I - ρM_n)^{-1}ε_n - X_n δ)]'[(I - ρ̂_n M_n)((I - ρM_n)^{-1}ε_n - X_n δ)] - ε_n'(I - ρM_n')^{-1}(I - ρ̂_n M_n')(I - ρ̂_n M_n)(I - ρM_n)^{-1}ε_n
          = -2δ'X_n'(I - ρ̂_n M_n)'(I - ρ̂_n M_n)(I - ρM_n)^{-1}ε_n + δ'X_n'(I - ρ̂_n M_n)'(I - ρ̂_n M_n)X_n δ
          = -2(√n δ)'W_n + (√n δ)'C_n(ρ̂_n)(√n δ),

with W_n = W_n(ρ̂_n) = X_n'Ω(ρ̂_n)(I - ρM_n)^{-1}ε_n/√n and C_n(ρ̂_n) = X_n'Ω(ρ̂_n)X_n/n. Differentiating D_{n1}(δ) with respect to δ, we have dD_{n1}(δ)/dδ = -2√n W_n + 2n C_n(ρ̂_n)δ. Let δ̂(1), W_n(1) and δ̂(2), W_n(2) denote the first q and the last p - q entries of δ̂ and W_n, respectively, and partition C_n(ρ̂_n) conformably into blocks C_n11, C_n12, C_n21, C_n22. Then the sign-consistency event {sign(β̂_{L,j}(ρ̂_n)) = sign(β_j), j = 1, ..., q} contains the event that there exists δ̂ with

C_n11(ρ̂_n)(√n δ̂(1)) - W_n(1) = -(λ_n/(2√n)) sign(β(1)),  |δ̂(1)| < |β(1)| componentwise,

and with the corresponding off-support stationarity condition holding componentwise; denote these two events by A_n and B_n. Hence the probability of correct sign recovery is at least P(A_n ∩ B_n). Thus

P(A_n ∩ B_n) ≥ 1 - P(A_n^c) - P(B_n^c)
            ≥ 1 - Σ_{i=1}^q P(|z_{ni}| > √n(|β_i| - (λ_n/(2n)) b_{ni})) - Σ_{i=1}^{p-q} P(|ζ_{ni}| > (λ_n/(2√n)) η_i),

where z_n = (z_{n1}, ..., z_{nq})' = (C_n11)^{-1}W_n(1), ζ_n = (ζ_{n1}, ..., ζ_{n,p-q})' = C_n21(C_n11)^{-1}W_n(1) - W_n(2), b_n = (b_{n1}, ..., b_{nq})' = (C_n11)^{-1}sign(β(1)), and the η_i are the constants appearing in the corresponding condition of the proposition. Since ρ̂_n is a consistent estimator of ρ, similarly to the proof of Theorem 2.1 and under the regularity conditions in Assumption 6, we have

(C_n11)^{-1}W_n(1) →d N(0, C_11^{-1}(ρ)σ²).

This is because

C_n = (1/n)X_n'Ω(ρ̂_n)X_n = (1/n)X_n'[Ω(ρ̂_n) - Ω(ρ)]X_n + (1/n)X_n'Ω(ρ)X_n
    = (1/n)X_n'[(ρ̂_n² - ρ²)M_n'M_n - (ρ̂_n - ρ)(M_n' + M_n)]X_n + (1/n)X_n'Ω(ρ)X_n →p C,

where the last step follows from Assumptions 3 and 6 together with the consistency of ρ̂_n; thus (C_n11(ρ̂_n))^{-1} →p (C_11(ρ))^{-1}. Similarly,

X_n'Ω(ρ̂_n)(I - ρM_n)^{-1}ε_n/√n = X_n'[(ρ̂_n² - ρ²)M_n'M_n - (ρ̂_n - ρ)(M_n' + M_n)](I - ρM_n)^{-1}ε_n/√n + X_n'(I - ρM_n')ε_n/√n
                                 = o_p(1)O_p(1) + X_n'(I - ρM_n')ε_n/√n.

Since X_n'(I - ρM_n')ε_n/√n →d N(0, σ²C(ρ)), we have W_n = X_n'Ω(ρ̂_n)(I - ρM_n)^{-1}ε_n/√n →d N(0, σ²C(ρ)). Thus W_n(1) →d N(0, σ²C_11(ρ)), and applying Slutsky's theorem, z_n = (C_n11)^{-1}W_n(1) →d N(0, (C_11(ρ))^{-1}σ²). Making use of this result, combined with the fact that C_n21(C_n11)^{-1}W_n(1) - W_n(2) = (C_n21(C_n11)^{-1}, -I_{p-q})W_n, we have

ζ_n = C_n21(C_n11)^{-1}W_n(1) - W_n(2) →d N(0, [C_22(ρ) - C_21(ρ)C_11(ρ)^{-1}C_12(ρ)]σ²).

Hence all z_{ni}'s and ζ_{ni}'s converge in distribution to Gaussian random variables with mean 0 and variance bounded by s²(ρ) for some constant function s(ρ). For t > 0, the Gaussian tail probability is bounded by (1/(t√(2π)))exp(-t²/2); with 0 ≤ c < 1 we therefore have

Σ_{i=1}^q P(|z_{ni}| > √n(|β_i| - (λ_n/(2n)) b_{ni})) = o(s(ρ) e^{-cn/s²(ρ)})

and

Σ_{i=1}^{p-q} P(|ζ_{ni}| > (λ_n/(2√n)) η_i) = o(s(ρ) e^{-cn/s²(ρ)}).

Theorem 2.2 follows.
PROOF of Proposition 2.2. We have

P(max_{1≤j≤p} 2|ε_n'T^{(j)}|/n > λ_0)
  = P(max_{1≤j≤p} |ε_n'(I - ρM_n)X_n^{(j)}/n + ε_n'(I - ρM_n')^{-1}Ω(ρ̂_n)X_n^{(j)}/n - ε_n'(I - ρM_n)X_n^{(j)}/n| > λ_0/2)
  ≤ P(max_{1≤j≤p} |ε_n'(I - ρM_n)X_n^{(j)}|/n + max_{1≤j≤p} |ε_n'(I - ρM_n')^{-1}Ω(ρ̂_n)X_n^{(j)} - ε_n'(I - ρM_n')^{-1}Ω(ρ)X_n^{(j)}|/n > λ_0/2).

Let r = σ√(log(2p)/n), and denote

A = max_{1≤j≤p} |ε_n'(I - ρM_n')^{-1}Ω(ρ̂_n)X_n^{(j)} - ε_n'(I - ρM_n')^{-1}Ω(ρ)X_n^{(j)}|/n,
A_1 = max_{1≤j≤p} |ε_n'(I - ρM_n')^{-1}(ρ̂_n² - ρ²)M_n'M_n X_n^{(j)}|/n,
A_2 = max_{1≤j≤p} |ε_n'(I - ρM_n')^{-1}(ρ̂_n - ρ)(M_n' + M_n)X_n^{(j)}|/n;

therefore P(A > r) ≤ P(A_1 + A_2 > r) ≤ P(A_1 > r/2) + P(A_2 > r/2). Since ρ̂_n is a consistent estimator of ρ, that is, ρ̂_n →p ρ, for any t > 0 set c = (1/2)exp(-t²/2); when n is large enough, P(|ρ̂_n - ρ| > c) < c and P(|ρ̂_n² - ρ²| > c) < c. Then

P(A_1 > r/2) = P({max_j |ε_n'(I - ρM_n')^{-1}M_n'M_n X_n^{(j)}|} |ρ̂_n² - ρ²|/n > r/2)
             = P({...} |ρ̂_n² - ρ²|/n > r/2, |ρ̂_n² - ρ²| > c) + P({...} |ρ̂_n² - ρ²|/n > r/2, |ρ̂_n² - ρ²| ≤ c)
             ≤ c + P(max_j |ε_n'(I - ρM_n')^{-1}M_n'M_n X_n^{(j)}| > rn/(2c)),

and

P(A_2 > r/2) ≤ c + P(max_j |ε_n'(I - ρM_n')^{-1}(M_n' + M_n)X_n^{(j)}| > rn/(2c)).

Next we need tail probabilities for max_j |ε_n'(I - ρM_n')^{-1}M_n'M_n X_n^{(j)}| and max_j |ε_n'(I - ρM_n')^{-1}(M_n' + M_n)X_n^{(j)}|. Note, however, that in our case we do not assume a Gaussian distribution for the error ε_n; instead, we only have the zero-mean and second-moment assumptions (Assumption 1). Thus we use the moment inequality derived from Nemirovski's inequality:

E[max_{1≤j≤p} |ε_n'U^{(j)}|²] ≤ 8 log(2p) Σ_{i=1}^n max_{1≤j≤p} |U_i^{(j)}|² E[ε_i²]

for any design matrix U, with U^{(j)} as its jth column. Based on the assumptions, the row and column sums of M_n and (I - ρM_n)^{-1} are bounded uniformly in absolute value, and each element of X_n is non-stochastic and uniformly bounded in absolute value. Also, if A_n and B_n are matrices that are conformable for multiplication with row and column sums uniformly bounded in absolute value, then the row and column sums of A_nB_n are also uniformly bounded in absolute value; this result extends to three or more matrices. Thus the row and column sums of I - ρM_n, (I - ρM_n')^{-1}M_n'M_n and (I - ρM_n')^{-1}(M_n' + M_n) are all bounded uniformly in absolute value. So every element of (I - ρM_n)X_n^{(j)}, (I - ρM_n')^{-1}M_n'M_nX_n^{(j)} and (I - ρM_n')^{-1}(M_n' + M_n)X_n^{(j)} is bounded; denote the common bound by B. Then

P(A_1 > r/2) ≤ c + E[max_j |ε_n'(I - ρM_n')^{-1}M_n'M_nX_n^{(j)}|²]/(rn/(2c))² ≤ c + 8(2c)² log(2p) σ²B/(n r²),

and similarly

P(A_2 > r/2) ≤ c + E[max_j |ε_n'(I - ρM_n')^{-1}(M_n' + M_n)X_n^{(j)}|²]/(rn/(2c))² ≤ c + 8(2c)² log(2p) σ²B/(n r²).

As a result,

P(A > r) ≤ P(A_1 > r/2) + P(A_2 > r/2) ≤ 2c + (2c)² log(2p) σ²B_0/(n r²)

for a constant B_0 absorbing the numerical factors.
Substituting the above probability bounds, we have

P(max_{1≤j≤p} 2|ε_n'T^{(j)}|/n > λ_0)
  ≤ P(max_{1≤j≤p} |ε_n'(I - ρM_n)X_n^{(j)}|/n + A > λ_0/2)
  ≤ P({max_j |ε_n'(I - ρM_n)X_n^{(j)}|/n + A > λ_0/2} ∩ {A > r}) + P({max_j |ε_n'(I - ρM_n)X_n^{(j)}|/n + A > λ_0/2} ∩ {A ≤ r})
  ≤ 2c + (2c)² log(2p) σ²B_0/(n r²) + P(max_j |ε_n'(I - ρM_n)X_n^{(j)}| > n(λ_0/2 - r))
  ≤ 2c + (2c)² log(2p) σ²B_0/(n r²) + E[max_j |ε_n'(I - ρM_n)X_n^{(j)}|²]/(n²(λ_0/2 - r)²)
  ≤ 2c + (2c)² log(2p) σ²B_0/(n r²) + log(2p) σ²B_0/(n(λ_0/2 - r)²)
  ≤ exp(-t²/2) + B_0 exp(-t²) + B_0 exp(-t²/2)
  ≤ K exp(-t²/2).

This then implies the result:

P(Ξ) = 1 - P(max_{1≤j≤p} 2|ε_n'T^{(j)}|/n > λ_0) ≥ 1 - K exp(-t²/2).

PROOF of Theorem 2.3. On the set Ξ, with λ_n ≥ 2λ_0,

2||(I - ρ̂_n M_n)X_n(β̂ - β)||_2²/n + λ_n||β̂ - β||_1
  = 2||(I - ρ̂_n M_n)X_n(β̂ - β)||_2²/n + λ_n||β̂_{S_0} - β_{S_0}||_1 + λ_n||β̂_{S_0^c}||_1
  ≤ 4λ_n||β̂_{S_0} - β_{S_0}||_1
  ≤ 4λ_n√(s_0) ||(I - ρ̂_n M_n)X_n(β̂ - β)||_2/(√n φ_0)
  ≤ ||(I - ρ̂_n M_n)X_n(β̂ - β)||_2²/n + 4λ_n² s_0/φ_0²,

where the last inequality follows from the fact that 4uv ≤ u² + 4v². Further, combining the oracle inequality with the Proposition regarding the set Ξ, the result follows.

PROOF of Theorem 2.4. Using the result of Proposition 2.1 and the line of proof of Theorem 2.2, we have

P(A_n ∩ B_n) ≥ 1 - P(A_n^c) - P(B_n^c) ≥ 1 - Σ_{i=1}^q P(|z_{ni}| > √n(|β_i| - (λ_n/(2n)) b_{ni})) - Σ_{i=1}^{p-q} P(|ζ_{ni}| > (λ_n/(2√n)) η_i),

where z_n = (z_{n1}, ..., z_{nq})' = (C_n11)^{-1}W_n(1), ζ_n = (ζ_{n1}, ..., ζ_{n,p-q})' = C_n21(C_n11)^{-1}W_n(1) - W_n(2) and b_n = (b_{n1}, ..., b_{nq})' = (C_n11)^{-1}sign(β(1)). Replace all the ρ̂_n in the notation above with the true parameter value ρ, and denote the resulting quantities by C_n0, W_n0, z_n0, ζ_n0 and b_n0 for simpler notation. Then each element of the first sum on the right-hand side of the above inequality satisfies, for any γ > 0,

P(|z_{ni}| > √n(|β_i| - (λ_n/(2n)) b_{ni}))
  = P(·, |z_{n0i} - z_{ni}| > γ, |b_{n0i} - b_{ni}| > γ) + P(·, |z_{n0i} - z_{ni}| ≤ γ, |b_{n0i} - b_{ni}| ≤ γ)
    + P(·, |z_{n0i} - z_{ni}| > γ, |b_{n0i} - b_{ni}| ≤ γ) + P(·, |z_{n0i} - z_{ni}| ≤ γ, |b_{n0i} - b_{ni}| > γ)
  = A_1 + A_2 + A_3 + A_4.

Since C_n - C_n0 →p 0 and W_n - W_n0 →p 0, we have z_n - z_n0 = o_p(1), ζ_n - ζ_n0 = o_p(1) and b_n - b_n0 = o_p(1). Note that here we cannot use C = lim_{n→∞} (1/n)X_n'Ω(ρ)X_n as in Assumption 6, since this matrix may not be nonsingular, or may not even be convergent, in the high-dimensional context. Thus A_1 + A_3 + A_4 can be made smaller than 3γ, and

A_2 = P(|z_{ni}| > √n(|β_i| - (λ_n/(2n)) b_{ni}), |z_{n0i} - z_{ni}| ≤ γ, |b_{n0i} - b_{ni}| ≤ γ)
    ≤ P(|z_{n0i}| > √n(|β_i| - (λ_n/(2n))(b_{n0i} + γ)) - γ).

Now if we write z_n0 = H_A'ε_n, where H_A' = (h_1^a, ..., h_q^a)' = (C_011)^{-1}(1/√n)[(I - ρM_n)X_n]_{(1)}', then

H_A'H_A = (C_011)^{-1}(1/n)[(I - ρM_n)X_n]_{(1)}'[(I - ρM_n)X_n]_{(1)}(C_011)^{-1} = (C_011)^{-1}.

Therefore z_{n0i} = (h_i^a)'ε_n with ||h_i^a||_2² ≤ 1/K_2 for all i = 1, ..., q. (2.12)

Similarly,

P(|ζ_{ni}| > (λ_n/(2√n)) η_i) = P(·, |ζ_{n0i} - ζ_{ni}| > γ) + P(·, |ζ_{n0i} - ζ_{ni}| ≤ γ) ≤ γ' + P(|ζ_{n0i}| > (λ_n/(2√n)) η_i - γ).

If we write ζ_n0 = H_B'ε_n where H_B' = (h_1^b, ..., h_{p-q}^b)' = C_021(C_011)^{-1}(1/√n)[(I - ρM_n)X_n]_{(1)}' - (1/√n)[(I - ρM_n)X_n]_{(2)}', then

H_B'H_B = (1/n)[(I - ρM_n)X_n]_{(2)}'{I - [(I - ρM_n)X_n]_{(1)}([(I - ρM_n)X_n]_{(1)}'[(I - ρM_n)X_n]_{(1)})^{-1}[(I - ρM_n)X_n]_{(1)}'}[(I - ρM_n)X_n]_{(2)}.

Since I - [(I - ρM_n)X_n]_{(1)}([(I - ρM_n)X_n]_{(1)}'[(I - ρM_n)X_n]_{(1)})^{-1}[(I - ρM_n)X_n]_{(1)}' has eigenvalues between 0 and 1, we have ζ_{n0i} = (h_i^b)'ε_n with ||h_i^b||_2² ≤ K_1 for all i. (2.13)

Also note that

(λ_n/n)||b_n0|| = (λ_n/n)||(C_011)^{-1}sign(β(1))|| ≤ (λ_n/n)K_2||sign(β(1))||_2 = (λ_n/n)K_2√q. (2.14)

Now, given (2.12) and (2.13), it can be shown that E(ε_{ni})⁴ < ∞ in Assumption 1 implies E(z_{ni})⁴ < ∞ and E(ζ_{ni})⁴ < ∞. In fact, for any constant n-dimensional vector a, E(a'ε_n)^{2k} ≤ (2k-1)! ||a||_2^{2k} E(ε_{ni})^{2k}. For i.i.d. errors with bounded fourth moments, the tail probability satisfies P(|z_{n0i}| > t) = O(t^{-4}). Therefore, for λ_n/√n = O(n^{(c_2 - c_1)/2}) and using (2.14), if we make γ arbitrarily small we have

Σ_{i=1}^q P(|z_{ni}| > √n(|β_i| - (λ_n/(2n)) b_{ni})) ≤ q(3γ + O([√n(|β_i| - (λ_n/(2n))(b_{n0i} + γ))]^{-4})) = O(r(ρ) n^{-2 + 2c_2}),

where r(ρ) is the bound for the absolute value of the elements of the matrix (C_n11(ρ))^{-1}. Likewise,

Σ_{i=1}^{p-q} P(|ζ_{ni}| > (λ_n/(2√n)) η_i) ≤ γ + (p - q)O(n²/λ_n⁴) = O(p n²/λ_n⁴) = o(1).

Adding these two terms, Theorem 2.4 follows.
Chapter 3

Post-model Selection Estimation for Regression Models with Spatial Autoregressive Error

3.1 Introduction

In the previous chapter we mentioned that there exists an extensive literature on spatial econometric models in which the data are collected spatially from cross-sectional units in one time period and the spatial relation among the sampling sites cannot be ignored. Among these, the spatial autoregressive model, introduced by Cliff and Ord (Cliff and Ord, 1973, 1981) as a variant of the model suggested by Whittle (Whittle, 1954), is one of the most widely referenced models of spatial autocorrelation. In a regression context, if the spatial effect comes only through the error terms, we can model the disturbance term for one cross-sectional unit as a weighted average of the disturbances corresponding to other cross-sectional units, plus an innovation. The weighted average involves a scalar parameter, denoted ρ, and a spatial weight matrix whose elements describe the spatial interactions. The innovations are typically assumed to be independent and identically distributed with mean zero and variance σ². The parameters of interest in this case are ρ, σ² and the vector of regression coefficients.

In a high-dimensional setup, which is easily encountered these days, traditional methods for regression models with spatial autoregressive errors cannot be directly applied. One of the most common approaches to variable selection and estimation in high-dimensional models has been the least absolute shrinkage and selection operator, the l1-penalized Lasso estimator introduced by Tibshirani (Tibshirani, 1996). It has been proved as a fundamental result that the Lasso-type l1-penalized estimator attains both parameter estimation consistency (Knight and Fu, 2000; Buhlmann and van de Geer, 2011) and model selection consistency (Zhao and Yu, 2006). Moreover, the l1-penalized least squares estimator achieves an l2 error convergence rate of √(s log p/n), which adds a factor of √(log p) to the oracle rate √(s/n) obtained when the true model is known. Here, n is the sample size, p is the total number of parameters and s is the number of parameters with non-zero coefficients (Bickel, Ritov and Tsybakov, 2009; Zhang and Huang, 2008).

Building on this literature, Belloni and Chernozhukov (2013) proposed a two-step procedure that applies ordinary least squares to the model selected by the Lasso estimator. They show that the post-model selection estimator performs at least as well as the Lasso in terms of the rate of convergence, even when the Lasso does an unsatisfactory job of eliminating irrelevant parameters in the variable selection step, and that it can be strictly better when the Lasso perfectly selects the true regression model. We want to derive similar results for the spatial model in the high-dimensional setting. In Chapter 2, we combined the idea of the generalized moments estimator with the l1-penalized estimator and developed a generalized two-stage Lasso estimator as a first-step model selection. It turns out that the set of variables selected in the first step contains the true parameter set, and that the difference between the selected set and the true set, m̂, is of the same order as s. The least squares estimator in the second step can then achieve an l2 error rate at least as good as the estimator from the first step. Furthermore, if the first step can perfectly select the true model, that is, if m̂ goes to zero in probability, then the two-step estimator attains the oracle rate √(s/n). A similar result for the sup-norm estimation error rate is also derived. This is non-trivial work, since the literature on high-dimensional models has focused mostly on l1 and l2 estimation errors; sup-norm bounds on the estimation error have been established for the linear regression model (Lounici, 2008; van de Geer, 2014) but have not been addressed for high-dimensional spatial models.

The rest of the chapter is organized as follows. Section 2 discusses the properties of the model selection and estimation obtained by the first-step model selection applying the generalized Lasso method described in Chapter 2. Section 3 proves the l2 error as well as the sup-norm error convergence rates for the least squares post-model selection estimation. Section 4 provides simulation studies of the performance of the post-model selection estimator compared with the simultaneous variable selection and estimation estimator. Section 5 applies the proposed method to a small Columbus crime data set as an illustration. All proofs are relegated to Section 6.

3.2 Model Estimation and Selection Properties

Following the estimation procedures described in the previous chapter, we can see that the objective of the model selection step is to recover the true support T, with card(T) = s, of the parameter vector, and the properties of the post-model selection estimators depend crucially on both the estimation and model selection properties of the generalized Lasso. In this section, we develop the estimation properties of the generalized Lasso in the form of the l2 norm, followed by the selection properties given by the selection support T̂ of the generalized Lasso estimator β̂_L(ρ̂_n) (which we will denote β̂ for ease of notation).
Assumption 8: Assume the elements of X_n are uniformly bounded in absolute value; further assume that max_{i,j} |x_{ij}| = O(1/√s).

Assumption 8 does nothing but control the magnitude of the components of the design matrix, and it can be achieved by normalizing the design matrix. For the analysis of the first-step estimator, we need the following restricted eigenvalue condition on the Gram matrix; similar statements can be found in Bickel, Ritov and Tsybakov (2009).

Condition (RE): For a given c̄ > 0, there exists a constant κ(c̄) such that

κ(c̄) = inf_{||δ_{T^c}||_1 ≤ c̄||δ_T||_1, δ ≠ 0} δ'X_n'X_nδ/(n||δ||_2²) > 0.

For the analysis of post-model selection estimators, we also need the following restricted sparse eigenvalue condition on the empirical Gram matrix; the same condition can be found in Belloni and Chernozhukov (2013).

Condition (RSE): For any given m ≥ 0,

τ_x = inf_{||δ_{T^c}||_0 ≤ m, δ ≠ 0} δ'X_n'X_nδ/(n||δ||_2²) > 0  and  ω_x = sup_{||δ_{T^c}||_0 ≤ m, δ ≠ 0} δ'X_n'X_nδ/(n||δ||_2²) < ∞.

An extended condition can be derived from Condition (RE) in order to cater for the spatial autoregressive error.

Lemma 3.1: Suppose Condition (RE) holds for c̄ = (c+1)/(c-1), for some c > 1, and that Ω(ρ) = (I - ρM_n)'(I - ρM_n). Then for the generalized moments estimator ρ̂_n, when n is large enough, we have

κ(c̄) λ_min(Ω(ρ)) ≤ inf_{||δ_{T^c}||_1 ≤ c̄||δ_T||_1, δ ≠ 0} δ'X_n'Ω(ρ̂_n)X_nδ/(n||δ||_2²),

where λ_min(Ω(ρ)) is the minimum eigenvalue of the matrix Ω(ρ).

From Lemma 3.1 we can see that the effect of the spatial autoregressive error in the model is to introduce a coefficient depending on the eigenvalues of Ω(ρ). A similar extension can be derived for Condition (RSE) as well. The following theorem establishes the main estimation properties of the generalized Lasso estimator β̂.

Theorem 3.1: Suppose that Assumptions 1-5 and 8 hold, and Condition (RE) holds for c̄ = (c+1)/(c-1), for some c > 1. Choose the regularization parameter λ_n ≥ 2cσ(exp(t²/2)+1)√(log(2p)/n). Then with probability at least 1 - K exp(-t²/2), t > 0, we have

||β̂ - β||_2 ≤ (1 + 1/c)√s λ_n / (κ(c̄) λ_min(Ω(ρ))).

The bound for the l2 norm of the parameter estimation error is derived while keeping the disturbance ε_n distribution free, and the convergence rate of the l2 error is √(s log p/n). A lower bound for the regularization parameter λ_n is required in order to control the randomness brought in by ε_n. In practice, we would like to select the regularization parameter close to this lower bound, since too much penalization has a negative effect on the selection capability. In fact, the estimation error converges to zero in l2 norm when √s λ_n → 0. In the following paragraphs, we discuss the model selection properties of the generalized Lasso estimator and provide bounds on the number of falsely selected variables.

Theorem 3.2: (1) If the coefficients are well separated from 0, that is,

min_{j ∈ T} |β_{0j}| > ν + t, for some t > 0, where ν = max_{j=1,...,p} |β̂_j - β_{0j}|,

then under the conditions used in Theorem 3.1, the true model is a subset of the selected model, T := support(β_0) ⊆ T̂ := support(β̂), with high probability.

(2) Suppose Assumptions 1-5, Assumption 8 and Conditions (RE), (RSE) hold, and choose the regularization parameter λ_n as in Theorem 3.1. Then with probability at least 1 - K exp(-t²/2), t > 0, we have an upper bound on the number of noise variables m̂ = card(T̂ \ T), namely m̂ ≲ s.

From Theorem 3.2 we obtain that the support T̂ selected by l1 penalization contains the true model T with probability converging to 1, and also that the number of selected noise variables m̂ is bounded by a value of the same order as the cardinality s. These are the two variable selection properties we need in order to proceed to the properties of estimators derived from post-model selection estimation. In fact, many known variable selection procedures satisfy these requirements, but we focus on l1 penalization for concreteness of illustration.
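Lemma 3.1 says that the spatially transformed Gram matrix X_n'Ω(ρ̂_n)X_n/n inherits a restricted-eigenvalue-type lower bound from X_n'X_n/n up to the factor λ_min(Ω(ρ)). A quick numerical check of that factor, computed on the full Gram matrices rather than on the restricted cone (which is all that can be evaluated directly), might look like the following R sketch; the weight matrix, the value of rho and all dimensions are illustrative assumptions, not quantities from the dissertation.

```r
set.seed(1)
n <- 100; p <- 10; rho <- 0.4

# A simple row-standardized neighbour matrix (circular, 2 neighbours each side).
M <- matrix(0, n, n)
for (i in 1:n) for (k in c(-2, -1, 1, 2)) M[i, ((i + k - 1) %% n) + 1] <- 1
M <- M / rowSums(M)

X     <- matrix(rnorm(n * p), n, p)
Omega <- crossprod(diag(n) - rho * M)     # Omega(rho) = (I - rho*M)'(I - rho*M)

# Minimum eigenvalues of the plain and the spatially transformed Gram matrices.
lam_min_Omega <- min(eigen(Omega, symmetric = TRUE)$values)
g_plain   <- min(eigen(crossprod(X) / n, symmetric = TRUE)$values)
g_spatial <- min(eigen(t(X) %*% Omega %*% X / n, symmetric = TRUE)$values)

# In the spirit of Lemma 3.1, g_spatial is bounded below by lam_min_Omega * g_plain.
c(lower_bound = lam_min_Omega * g_plain, transformed = g_spatial)
```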
3.3 Post-model Estimation Properties

In this section, we present a general result on the performance of a post-model selection estimator β̂_p based on the model selected in the previous step, in terms of both the l2 norm and the sup-norm error convergence rate. We will show that the estimator β̂_p performs at least as well as the estimates provided by l1-penalized estimation, and even strictly outperforms β̂ if certain properties are achieved by the first-step model selection. We define the least-squares post-model selection estimator as

β̂_p = argmin_β (1/n)||(I - ρ̂_n M_n)(Y_n - X_n β)||_2², subject to β_j = 0 for j ∈ T̂^c. (3.1)

A concrete sketch of this refit is given at the end of this section. The following lemma provides an upper bound for a stochastic term involving the disturbances ε_n, and it is a crucial ingredient in the derivation of the l2 convergence rate theorem.

Lemma 3.2: Suppose Assumptions 1-5 and 8 and Condition (RSE) hold, and that ρ̂_n is a generalized moments estimator of ρ. For m = 1, 2, ..., n - s, define

e_n(m, α) = (2σ√(ω_x λ_max(Ω(ρ)))/√n)(√(m log p) + √((m+s) log(3D)) + √((m+s) + log(1/α)))

for any α ∈ (0,1) and some constant D. Then for all m,

sup_{||δ_{T^c}||_0 ≤ m, ||δ||_2 > 0} |ε_n'(I - ρM_n')^{-1}Ω(ρ̂_n)X_n δ/(n||δ||_2)| ≤ e_n(m, α),

with probability at least 1 - e^{-s}/(1 - 1/e).

Now, with the lemma established above, the l2 error bound of the post-model selection estimator β̂_p is obtained with high probability.

Theorem 3.3: Let β̂ be any estimator from a model selector and let T̂ = supp(β̂). Suppose T̂ contains the true support T and m̂ = card(T̂ \ T) ≲ s. If β̂_p is the post-model selection estimator defined in (3.1), then under Assumptions 1-6 and Condition (RSE), for any α ∈ (0,1), with probability at least 1 - e^{-s}/(1 - 1/e),

||β̂_p - β_0||_2 ≤ (2/(τ_x λ_min(Ω(ρ)))) e_n(m̂, α).

This theorem establishes a bound for the l2 error of the post-model selection estimator. The convergence rate of this bound implies that the least squares post-model selection estimator performs at least as well as l1 penalization. In fact, based on the summarizing corollary below, the performance of the post-model selection estimator becomes strictly better than the generalized Lasso when the number of selected noise variables m̂ goes to 0 with probability tending to 1. And if the model selection step manages to perfectly select the true model, that is, T̂ = T with high probability, the l2 estimation error achieves the √(s/n) oracle rate of convergence.

Corollary 3.1: Let β̂_p be the post-model selection estimator defined in (3.1). Then under the conditions of Theorem 3.3,

||β̂_p - β_0||_2 ≲ √(s log p/n) in general;
||β̂_p - β_0||_2 ≲ √(o(1) s log p/n) if T ⊆ T̂ and m̂ = o(s) with probability tending to 1;
||β̂_p - β_0||_2 ≲ √(s/n) if T = T̂ with probability tending to 1.

Now that we have obtained the l2 estimation error bound, we proceed to establish the convergence rate of the estimation error in sup norm. At this point, we need one more assumption on the design matrix.

Assumption 9: Let Σ̃ = X_n'Ω(ρ̂_n)X_n/n; without loss of generality, assume Σ̃_{i,i} = 1 and max_{i ≠ j} |Σ̃_{i,j}| ≤ c̃/s for some constant c̃.

A similar assumption has been made in Donoho, Elad and Temlyakov (2006), where the authors require the value of max_{i ≠ j} |Σ̃_{i,j}| to be sufficiently small. In fact, this can also be derived from Assumption 8. With these, we are now able to proceed to the convergence rate of the estimation error in sup norm.

Lemma 3.3: Let Assumptions 1-5 and 8-9 be satisfied. For any α ∈ (0,1), take

h_n(m, α) := σ√(2/n)(√(m log p) + √(log(m+s)) + √(log(1/α))).

Then with probability at least 1 - 2pα/(p - 1),

max_{i ∈ T̃, card(T̃ \ T) ≤ m} (1/n)|Σ_{j=1}^n T_{i,j}ε_j| ≤ h_n(m, α), for any m ≤ n - s,

where T_{i,j} is the (i,j) element of the matrix T = X_n'Ω(ρ̂_n)(I - ρM_n)^{-1}.

Theorem 3.4: Let T̂ be the support of any first-step model selector, and let h_n(m, α) be the function defined in Lemma 3.3. Assume Assumptions 1-5, 8-9 and Condition (RSE) hold. Then with high probability,

||β̂_p - β_0||_∞ ≤ (1 + c̃(m̂ + s)/(s λ_min(Ω(ρ)) τ_x)) h_n(m̂, α).

Theorem 3.4 can be used to derive the rates of convergence of the estimation error in sup norm of the post-model selection estimator, and the result is summarized as follows.

Corollary 3.2: Let β̂_p be the post-model selection estimator defined in (3.1). Then under the conditions of Theorem 3.4,

||β̂_p - β_0||_∞ ≲ √(s log p/n) in general;
||β̂_p - β_0||_∞ ≲ √(o(s) log p/n) if T ⊆ T̂ and m̂ = o(s) with probability tending to 1;
||β̂_p - β_0||_∞ ≲ √(log s/n) if T = T̂ with probability tending to 1.
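For concreteness, the two-step estimator in (3.1) amounts to refitting, by spatially filtered least squares, only the coordinates selected in the first step. A minimal R sketch under assumed inputs (design X, response y, weight matrix M, first-step selection sel, and a moments estimate rho_hat) is given below; post_selection_fit() is our own illustrative name.

```r
# Post-model selection estimator (3.1): refit by least squares on the selected
# support only, after filtering both sides with (I - rho_hat * M).
post_selection_fit <- function(y, X, M, sel, rho_hat) {
  n   <- length(y)
  A   <- diag(n) - rho_hat * M
  y_t <- as.vector(A %*% y)
  X_t <- A %*% X[, sel, drop = FALSE]
  beta <- numeric(ncol(X))             # coefficients outside T-hat stay at zero
  beta[sel] <- qr.solve(X_t, y_t)      # least-squares solution on the selected columns
  beta
}

# Hypothetical usage, with sel the support returned by the first-step
# generalized moments LASSO (e.g. which(beta_hat != 0)):
# beta_post <- post_selection_fit(y, X, M, sel = which(beta_hat != 0), rho_hat)
```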
3.4 Simulation Studies

In this section, we use computational simulations to show how the post-model selection estimator β̂_p defined earlier outperforms the simultaneous variable selection and estimation estimator β̂ in terms of the l2 error rate. The distribution of ε_n in the Monte Carlo study is always set to be normal and, without loss of generality, N(0,1); this is because the estimators for ρ defined earlier do not depend on σ². The weight matrix M_n is specified as an idealized n x n weighting matrix in a "circular world" following Kelejian and Prucha (1999): M_n is specified such that each unit is directly related to the 5 units immediately before and after it. For simplicity, we specify M_n such that all the non-zero elements of M_n are equal and the respective rows sum to 1. In the n x p design matrix X_n, the covariates X_i are i.i.d. from a p-dimensional Gaussian distribution with each component having mean zero and variance 1, and the pairwise correlation is set to cor(x_{ij}, x_{ik}) = 0.5^{|j-k|}, for 1 ≤ j, k ≤ p. The q = 20 non-zero components of the p-dimensional parameter of interest β_0 are generated independently from a uniform distribution over the interval (2, 5). We consider 4 different choices of ρ, along with 4 x 4 = 16 combinations of (n, p), giving 64 model settings in total. For each case, the results are summarized over 100 Monte Carlo replications.

At the end of the calculations in each setting, we record and compare the Relative Estimation Error (REE), defined for an estimate β̃ by ||β̃ - β_0||_2/||β_0||_2, for the post-model selection estimator β̂_p, the generalized Lasso estimator β̂, and the oracle estimate for the spatial error model with only the true nonzero parameters. We denote these by REEp, REEg and REEo, respectively. In addition, the numbers in parentheses record the sum of the estimation variances for the 20 nonzero parameters. The l1-penalized computation used in the generalized Lasso estimator, and involved in obtaining the post-model selection estimator, is carried out with the "glmnet" R package developed by Friedman et al. (2010), and the penalty level is chosen by cross-validation controlled by a data-driven choice of lower bound. The idea is that, based on the proof, λ is chosen to dominate the randomness brought in by ε_n, that is,

λ_n ≥ 2c max_{1≤j≤p} |ε_n'T^{(j)}|/n, with probability at least 1 - α,

where T^{(j)} is the jth column of the matrix T = (I - ρM_n')^{-1}Ω(ρ̂_n)X_n, the probability 1 - α needs to be close to 1, and c is a constant greater than 1. Therefore, the lower bound for the penalization is proposed to be

λ = c_0 σ̂ Q(1 - α | X, ρ̂_n), for some fixed c_0 > c > 1,

where Q(1 - α | X, ρ̂_n) is the (1 - α) quantile of the maximum max_j |z_n'T^{(j)}|/n, z_n is an n x 1 standard normal vector and σ̂ is the estimate of σ.
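The data-generating design just described can be sketched as follows; this is a minimal R version under the stated settings, where the seed, the placement of the nonzero coefficients, and the helper structure are our own choices rather than the original simulation code.

```r
set.seed(2016)
n <- 225; p <- 500; q <- 20; rho <- 0.3

# "Circular world" weight matrix: each unit relates to the 5 units before and after it,
# all nonzero weights equal, rows summing to 1.
M <- matrix(0, n, n)
for (i in 1:n) for (k in c(-5:-1, 1:5)) M[i, ((i + k - 1) %% n) + 1] <- 1 / 10

# Covariates: standard normal components with cor(x_j, x_k) = 0.5^|j - k|.
Sigma <- 0.5 ^ abs(outer(1:p, 1:p, "-"))
X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)

# Sparse coefficient vector: q nonzero entries drawn from Uniform(2, 5).
beta0 <- numeric(p)
beta0[sample(p, q)] <- runif(q, 2, 5)

# Spatial error model: y = X beta0 + u, with u = rho * M u + eps and eps ~ N(0, 1).
eps <- rnorm(n)
u   <- solve(diag(n) - rho * M, eps)
y   <- as.vector(X %*% beta0 + u)
```

The generalized Lasso fit, the post-selection refit, and the oracle fit would then be computed from (y, X, M) as in the sketches of Sections 2.6 and 3.3, and the REE quantities recorded across replications.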
Tables 3.1 to 3.4 display the superiority of the post-model selection estimator over the simultaneous variable selection and estimation method via l1 penalization, and they also show that the post-model selection estimator reaches the same order of estimation error rate as the oracle estimator once the sample size is large enough. Figures 3.1 to 3.4 compare the coverage rates of the 20 nonzero variables in each (n, p) scenario at each value of ρ. Here, coverage is defined so that the true value of each variable lies within the 95% interval constructed from the estimates.

Table 3.1: Means of REE for β̂_p, β̂ and the oracle estimator over 100 data set repetitions for ρ = 0.3

                 p=500          p=800          p=1000         p=1200
n=225   REEp     0.084 (0.159)  0.101 (0.189)  0.119 (0.213)  0.123 (0.224)
        REEg     0.212          0.216          0.230          0.226
        REEo     0.031 (0.101)  0.029 (0.099)  0.031 (0.102)  0.029 (0.100)
n=400   REEp     0.032 (0.058)  0.037 (0.064)  0.041 (0.066)  0.043 (0.072)
        REEg     0.137          0.137          0.137          0.137
        REEo     0.021 (0.055)  0.022 (0.056)  0.021 (0.056)  0.021 (0.057)
n=625   REEp     0.022 (0.034)  0.022 (0.035)  0.023 (0.035)  0.024 (0.035)
        REEg     0.116          0.114          0.112          0.110
        REEo     0.017 (0.035)  0.017 (0.035)  0.017 (0.035)  0.017 (0.035)
n=900   REEp     0.016 (0.023)  0.016 (0.023)  0.017 (0.023)  0.016 (0.023)
        REEg     0.100          0.096          0.096          0.093
        REEo     0.014 (0.024)  0.014 (0.024)  0.015 (0.024)  0.014 (0.024)

Table 3.2: Means of REE for β̂_p, β̂ and the oracle estimator over 100 data set repetitions for ρ = 0.75

                 p=500          p=800          p=1000         p=1200
n=225   REEp     0.101 (0.221)  0.119 (0.261)  0.135 (0.299)  0.144 (0.290)
        REEg     0.232          0.221          0.236          0.268
        REEo     0.035 (0.134)  0.034 (0.129)  0.032 (0.134)  0.031 (0.126)
n=400   REEp     0.059 (0.142)  0.085 (0.176)  0.083 (0.167)  0.075 (0.173)
        REEg     0.179          0.166          0.152          0.137
        REEo     0.028 (0.120)  0.028 (0.114)  0.028 (0.106)  0.027 (0.108)
n=625   REEp     0.047 (0.173)  0.073 (0.209)  0.077 (0.219)  0.081 (0.230)
        REEg     0.218          0.173          0.246          0.222
        REEo     0.033 (0.197)  0.033 (0.187)  0.032 (0.194)  0.031 (0.191)
n=900   REEp     0.044 (0.173)  0.041 (0.177)  0.053 (0.215)  0.047 (0.169)
        REEg     0.192          0.145          0.138          0.200
        REEo     0.031 (0.190)  0.031 (0.189)  0.031 (0.207)  0.030 (0.182)

Figure 3.1: Coverage rate of post-model selection and oracle estimators for ρ = 0.3
Figure 3.2: Coverage rate of post-model selection and oracle estimators for ρ = 0.75
Figure 3.3: Coverage rate of post-model selection and oracle estimators for ρ = 0.3
Figure 3.4: Coverage rate of post-model selection and oracle estimators for ρ = 0.75

Table 3.3: Means of REE for β̂_p, β̂ and the oracle estimator over 100 data set repetitions for ρ = 0.3

                 p=500          p=800          p=1000         p=1200
n=225   REEp     0.082 (0.168)  0.091 (0.189)  0.106 (0.217)  0.120 (0.241)
        REEg     0.221          0.228          0.258          0.258
        REEo     0.028 (0.086)  0.028 (0.087)  0.029 (0.085)  0.028 (0.083)
n=400   REEp     0.034 (0.054)  0.039 (0.061)  0.042 (0.065)  0.042 (0.063)
        REEg     0.137          0.148          0.144          0.148
        REEo     0.020 (0.045)  0.019 (0.045)  0.019 (0.045)  0.020 (0.045)
n=625   REEp     0.020 (0.029)  0.020 (0.029)  0.022 (0.030)  0.023 (0.030)
        REEg     0.105          0.108          0.110          0.108
        REEo     0.015 (0.028)  0.015 (0.028)  0.015 (0.028)  0.015 (0.028)
n=900   REEp     0.015 (0.019)  0.016 (0.019)  0.015 (0.019)  0.016 (0.019)
        REEg     0.088          0.087          0.088          0.090
        REEo     0.012 (0.019)  0.013 (0.019)  0.012 (0.019)  0.012 (0.019)

Table 3.4: Means of REE for β̂_p, β̂ and the oracle estimator over 100 data set repetitions for ρ = 0.75

                 p=500          p=800          p=1000         p=1200
n=225   REEp     0.092 (0.197)  0.102 (0.220)  0.114 (0.253)  0.115 (0.252)
        REEg     0.215          0.217          0.235          0.235
        REEo     0.026 (0.093)  0.027 (0.092)  0.026 (0.094)  0.026 (0.092)
n=400   REEp     0.037 (0.062)  0.039 (0.070)  0.045 (0.077)  0.051 (0.082)
        REEg     0.126          0.130          0.134          0.142
        REEo     0.018 (0.048)  0.018 (0.048)  0.017 (0.048)  0.019 (0.048)
n=625   REEp     0.021 (0.033)  0.024 (0.034)  0.024 (0.034)  0.025 (0.034)
        REEg     0.099          0.100          0.101          0.099
        REEo     0.014 (0.030)  0.015 (0.030)  0.014 (0.029)  0.015 (0.029)
n=900   REEp     0.015 (0.020)  0.015 (0.021)  0.015 (0.021)  0.016 (0.021)
        REEg     0.080          0.078          0.079          0.079
        REEo     0.012 (0.020)  0.011 (0.020)  0.011 (0.020)  0.011 (0.020)

3.5 Real Data Example

In this section, we apply the proposed method to a small real-life data example as an illustration. The data we chose is a built-in sample data set in the R package "spdep", which can also be found in Anselin's (1988) book. It includes 49 observations describing Columbus crime, together with the necessary spatial information. The data set is not originally high-dimensional, but we intentionally picked it because its low-dimensional nature can be used as a criterion to check the capability of variable selection.

The "classic" Columbus crime regression predicts the variable CRIME, the residential burglaries and auto thefts per 1000 households, from the variables HOVAL, the house value, and INC, the income amount. The Lagrange Multiplier test statistic for spatial dependence is significant for spatial error models, and an initial analysis with the sample data returns estimates β̂_HOVAL = -1.17 and β̂_INC = -0.30, with standard errors 0.348 and 0.095, respectively. Both the house value and the income have a negative impact on the burglary and auto theft rate. To test the power of variable selection for spatial error models, we manually add 500 spurious covariates that are uncorrelated with the response, so that the sample data set becomes high-dimensional with dimension 49 x 502. A generalized l1-penalized variable selection method is then applied to the sample data set, with the penalization parameter chosen by cross-validation with a data-driven lower bound. The variable selection method shows great competence by correctly selecting the only two authentic variables, with estimates of -0.4592141 for HOVAL and -0.1476459 for INC. A post-model selection estimation is then conducted with the two selected variables, and it returns a much more accurate fit, with parameter estimates β̂_HOVAL = -0.99 and β̂_INC = -0.31 and standard errors 0.348 and 0.095, the same as in the oracle case.
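A rough outline of this analysis in R is sketched below, assuming the Columbus sample data and its neighbour list col.gal.nb are available as shipped with spdep (in recent releases the data objects live in the companion spData package); the seed, the noise-column names, and the reuse of gm_lasso() and post_selection_fit() from the earlier sketches are our own illustrative choices.

```r
library(spdep)                      # provides nb2mat(); the Columbus sample data
data(columbus)                      # may be loaded from spData in newer versions

y <- columbus$CRIME
X <- as.matrix(columbus[, c("INC", "HOVAL")])
W <- nb2mat(col.gal.nb, style = "W")          # row-standardized spatial weight matrix

# Append 500 spurious covariates, uncorrelated with the response.
set.seed(1988)
X_big <- cbind(X, matrix(rnorm(nrow(X) * 500), nrow(X), 500))
colnames(X_big) <- c("INC", "HOVAL", paste0("noise", 1:500))

# First step: generalized moments LASSO on the augmented design
# (gm_lasso() is the sketch from Section 2.6; rho_hat from a moments fit of the residuals).
# beta_hat  <- gm_lasso(y, X_big, W, rho_hat)
# selected  <- which(beta_hat != 0)

# Second step: post-model selection refit on the selected columns only,
# using post_selection_fit() from Section 3.3.
# beta_post <- post_selection_fit(y, X_big, W, sel = selected, rho_hat)
```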
3.6 Proofs

PROOF of Lemma 3.1. Note that the generalized moments estimator ρ̂_n is a consistent estimator of the spatial autoregressive parameter ρ. Then

δ'X_n'Ω(ρ̂_n)X_nδ = δ'X_n'Ω(ρ)X_nδ + δ'X_n'Ω(ρ̂_n)X_nδ - δ'X_n'Ω(ρ)X_nδ
                = δ'X_n'Ω(ρ)X_nδ + δ'X_n'[(ρ - ρ̂_n)(M_n' + M_n) + (ρ̂_n² - ρ²)M_n'M_n]X_nδ = Δ_1 + Δ_2,

where Δ_1 = δ'X_n'Ω(ρ)X_nδ and Δ_2 = δ'X_n'[(ρ - ρ̂_n)(M_n' + M_n) + (ρ̂_n² - ρ²)M_n'M_n]X_nδ. Now consider the term Δ_2: since ρ̂_n - ρ →p 0, ρ̂_n² - ρ² →p 0 and by the boundedness conditions on X_n and M_n, when n becomes large enough Δ_2/(n||δ||_2²) → 0. Thus

inf_{||δ_{T^c}||_1 ≤ c̄||δ_T||_1, δ ≠ 0} δ'X_n'Ω(ρ̂_n)X_nδ/(n||δ||_2²) = inf_{||δ_{T^c}||_1 ≤ c̄||δ_T||_1, δ ≠ 0} Δ_1/(n||δ||_2²).

We already know that Ω(ρ) = (I - ρM_n)'(I - ρM_n) is a symmetric and positive definite matrix, so there exists a unique decomposition Ω(ρ) = Q'UQ, where U = diag(υ_1, ..., υ_n) is the diagonal matrix composed of the eigenvalues of Ω(ρ) and Q is an orthogonal matrix. Based on this,

Δ_1/(n||δ||_2²) = δ'X_n'Q'UQX_nδ/(n||δ||_2²) = Σ_{i=1}^n υ_i (QX_nδ)_i²/(n||δ||_2²) ≥ λ_min(Ω(ρ)) δ'X_n'X_nδ/(n||δ||_2²).

Combining with Condition (RE), we obtain the result of Lemma 3.1.

PROOF of Theorem 3.1. By definition, the generalized Lasso estimator β̂ can be expressed as

β̂ = argmin_β (1/n)(Y_n - X_nβ)'Ω(ρ̂_n)(Y_n - X_nβ) + λ_n||β||_1,

where Ω(ρ̂_n) = (I - ρ̂_n M_n)'(I - ρ̂_n M_n). Thus, denoting by β_0 the true value of the parameter, by definition

(1/n)(Y_n - X_nβ̂)'Ω(ρ̂_n)(Y_n - X_nβ̂) + λ_n||β̂||_1 ≤ (1/n)(Y_n - X_nβ_0)'Ω(ρ̂_n)(Y_n - X_nβ_0) + λ_n||β_0||_1. (3.2)

Since

(1/n)(Y_n - X_nβ̂)'Ω(ρ̂_n)(Y_n - X_nβ̂) - (1/n)(Y_n - X_nβ_0)'Ω(ρ̂_n)(Y_n - X_nβ_0)
  = (1/n)[X_n(β_0 - β̂) + (I - ρM_n)^{-1}ε_n]'Ω(ρ̂_n)[X_n(β_0 - β̂) + (I - ρM_n)^{-1}ε_n] - (1/n)[(I - ρM_n)^{-1}ε_n]'Ω(ρ̂_n)[(I - ρM_n)^{-1}ε_n]
  = (1/n)[X_n(β_0 - β̂)]'Ω(ρ̂_n)[X_n(β_0 - β̂)] + (2/n)ε_n'(I - ρM_n')^{-1}Ω(ρ̂_n)X_n(β_0 - β̂)
  ≥ (1/n)||(I - ρ̂_n M_n)X_n(β_0 - β̂)||_2² - (2/n)(max_{1≤j≤p}|ε_n'T^{(j)}|)||β̂ - β_0||_1,

where T^{(j)} is the jth column of the matrix T = (I - ρM_n')^{-1}Ω(ρ̂_n)X_n. Then, according to the result in the previous chapter, the set

Ξ := {max_{1≤j≤p} 2|ε_n'T^{(j)}|/n ≤ λ_0},

on which the random part can be controlled, has probability at least 1 - K exp(-t²/2), where λ_0 = 2σ(exp(t²/2)+1)√(log(2p)/n). Now take an arbitrary constant c > 1, so that λ_n ≥ cλ_0; then on the set Ξ,

(2/n)(max_{1≤j≤p}|ε_n'T^{(j)}|)||β̂ - β_0||_1 ≤ (λ_n/c)||β̂ - β_0||_1.

Bringing this back into (3.2), on the set Ξ,

(1/n)(Y_n - X_nβ̂)'Ω(ρ̂_n)(Y_n - X_nβ̂) + λ_n||β̂||_1 ≤ (λ_n/c)||β̂ - β_0||_1 + λ_n||β_0||_1. (3.3)

With the simple decompositions

||β̂||_1 = ||β̂_T||_1 + ||β̂_{T^c}||_1 ≥ ||β_{0,T}||_1 - ||β̂_T - β_{0,T}||_1 + ||β̂_{T^c}||_1 and ||β̂ - β_0||_1 = ||β̂_T - β_{0,T}||_1 + ||β̂_{T^c}||_1,

we obtain the relationship between the generalized Lasso estimator and the true parameter value on the support and off-support sets:

(c/n)(Y_n - X_nβ̂)'Ω(ρ̂_n)(Y_n - X_nβ̂) + (c - 1)λ_n||β̂_{T^c}||_1 ≤ (c + 1)λ_n||β̂_T - β_{0,T}||_1.

Denote Δ = β̂ - β_0 for notational simplicity. Since the first term on the left-hand side is nonnegative, on Ξ,

||Δ_{T^c}||_1 ≤ ((c + 1)/(c - 1))||Δ_T||_1.

Thus Δ belongs to the restricted set in Condition RE(c̄), with c̄ = (c + 1)/(c - 1), and by Lemma 3.1,

(1/n)||(I - ρ̂_n M_n)X_nΔ||_2² ≥ κ(c̄) λ_min(Ω(ρ))||Δ||_2².

Based on (3.3),

κ(c̄) λ_min(Ω(ρ))||Δ||_2² ≤ (λ_n/c)(||Δ_T||_1 + ||Δ_{T^c}||_1) + λ_n(||Δ_T||_1 - ||Δ_{T^c}||_1),

so

κ(c̄) λ_min(Ω(ρ))||Δ||_2² ≤ (1 + 1/c)λ_n||Δ_T||_1 - (1 - 1/c)λ_n||Δ_{T^c}||_1 ≤ (1 + 1/c)λ_n||Δ_T||_1 ≤ √s(1 + 1/c)λ_n||Δ_T||_2.

The last inequality uses the Cauchy-Schwarz inequality, and √s is the price to pay when the l1 norm is replaced by the l2 norm. We thus finish the proof with the bound on the l2 norm of the estimation error,

||β̂ - β_0||_2 ≤ (1 + 1/c)√s λ_n/(κ(c̄) λ_min(Ω(ρ))).

PROOF of Theorem 3.2. (1) Based on the assumption on the magnitude of β_0, if T is not contained in T̂, then there exists k ∈ {1, 2, ..., p} such that β_{0k} ≠ 0 but β̂_k = 0. Then

ν = max_{j=1,...,p}|β̂_j - β_{0j}| ≥ |β̂_k - β_{0k}| = |β_{0k}| ≥ min_{j∈T}|β_{0j}| > ν,

a contradiction; thus T ⊆ T̂.

(2) Recall the definition of the generalized Lasso estimator in (2.6) and make use of the Karush-Kuhn-Tucker condition: for all j ∈ T̂,

d/dβ_j [(1/n)||(I - ρ̂_n M_n)Y_n - (I - ρ̂_n M_n)X_nβ||_2²]|_{β = β̂} = [-(2/n)(Y_n - X_nβ̂)'Ω(ρ̂_n)X_n]_j = -λ_n sign(β̂_j).

Thus, for all j ∈ T̂, |(2/n)((Y_n - X_nβ̂)'Ω(ρ̂_n)X_n)_j| = λ_n. Since we are only looking at the support set T̂ of the estimated parameter β̂,

√|T̂| λ_n = ||((2/n)(Y_n - X_nβ̂)'Ω(ρ̂_n)X_n)_{T̂}||_2
          = 2||((1/n)(Y_n - X_nβ_0 + X_nβ_0 - X_nβ̂)'Ω(ρ̂_n)X_n)_{T̂}||_2
          ≤ 2||((1/n)ε_n'(I - ρM_n')^{-1}Ω(ρ̂_n)X_n)_{T̂}||_2 + 2||((1/n)(β_0 - β̂)'X_n'Ω(ρ̂_n)X_n)_{T̂}||_2
          ≤ (λ_n/c)√|T̂| + 2Γ,

where Γ = ||((1/n)(β_0 - β̂)'X_n'Ω(ρ̂_n)X_n)_{T̂}||_2. The last inequality follows because ε_n lies on the set Ξ, which, as we know from Theorem 3.1, has probability at least 1 - K exp(-t²/2). On the other hand, using the Hölder inequality and the extension of Condition (RSE),

Γ ≤ sup_{||δ_{T^c}||_0 ≤ m̂, ||δ|| ≤ 1} |δ'(1/n)X_n'Ω(ρ̂_n)X_n(β̂ - β_0)|
  ≤ sup_{||δ_{T^c}||_0 ≤ m̂, ||δ|| ≤ 1} ||(1/√n)(I - ρ̂_n M_n)X_nδ||_2 ||(1/√n)(I - ρ̂_n M_n)X_n(β̂ - β_0)||_2
  ≤ ω_x(m̂) λ_max(Ω(ρ)) ||β̂ - β_0||_2.

To summarize, (1 - 1/c)λ_n√|T̂| ≤ 2ω_x(m̂) λ_max(Ω(ρ))||β̂ - β_0||_2; combined with the order of the l2 norm of the difference between the generalized Lasso estimator β̂ and β_0, it is then easy to obtain m̂ ≲ s.
PROOF of Lemma 3.2. For each nonnegative integer m ≤ n - s, consider for each set T̃ ⊂ {1, 2, ..., p} with card(T̃ \ T) ≤ m the class of functions

G_T̃ = {f_δ : δ ∈ R^p, support(δ) ⊆ T̃, ||δ||_2 = 1}, where f_δ(ε_i) = ε_i D_i'δ,

with D_i' the ith row of (I - ρM_n')^{-1}Ω(ρ̂_n)X_n. Further define the set F_m = {G_T̃ : T̃ ⊂ {1, ..., p}, card(T̃ \ T) ≤ m}, combining all possible choices of T̃. With the definition of e_n(m, α), it follows directly that

P(sup_{f ∈ F_m} |(1/n)Σ_{i=1}^n f(ε_i)| > e_n(m, α)) ≤ p^m max_{card(T̃\T) ≤ m} P(sup_{f ∈ G_T̃} |(1/n)Σ_{i=1}^n f(ε_i)| > e_n(m, α)). (3.4)

Now consider any two functions f, g ∈ G_T̃; recall that E[(1/√n)Σ_i f(ε_i)] = E[(1/√n)Σ_i g(ε_i)] = 0, and let

d(f, g) := √(E[(1/√n)Σ_i f(ε_i) - (1/√n)Σ_i g(ε_i)]²),

which can be seen as a "natural semimetric". The covering number of G_T̃ with respect to d obeys N(t, G_T̃, d) ≤ (3R/t)^{m+s} for each 0 < t ≤ R, where R bounds the d-radius of G_T̃. A maximal inequality for the process indexed by G_T̃ then gives

P(sup_{f ∈ G_T̃} |(1/n)Σ_i f(ε_i)| > e_n(m, α)) ≤ (3DR√n e_n(m, α)/(√(m+s) σ²ω_x λ_max(Ω(ρ))))^{m+s} P(|Z| > √n e_n(m, α)/(σ√(ω_x λ_max(Ω(ρ))))),

with Z standard Gaussian. By the definition of e_n(m, α), and denoting E = √n e_n(m, α) for notational simplicity,

(3DR E/(√(m+s) σ²ω_x λ_max(Ω(ρ))))^{m+s} P(|Z| > E/(σ√(ω_x λ_max(Ω(ρ)))))
  ≤ exp{-E²/(2σ²ω_x λ_max(Ω(ρ))) + (m+s) log(E/(√(m+s) σ√(ω_x λ_max(Ω(ρ))))) + (m+s) log 3D}
  = exp{-((m+s)/2)(E/(√(m+s) σ√(ω_x λ_max(Ω(ρ)))))² + (m+s) log(E/(√(m+s) σ√(ω_x λ_max(Ω(ρ))))) + (m+s) log 3D}.

Take D ≥ e/3, so that E/(√(m+s) σ√(ω_x λ_max(Ω(ρ)))) ≥ √2, and combine this with the fact that log x ≤ x²/4 if x ≥ √2; then the chain of inequalities continues as

  ≤ exp{-((m+s)/4)(E/(√(m+s) σ√(ω_x λ_max(Ω(ρ)))))² + (m+s) log 3D}
  = exp{-E²/(4σ²ω_x λ_max(Ω(ρ))) + (m+s) log 3D}
  ≤ exp{-m log p - (m+s) - log(1/α)},

so the above probability is bounded by e^{-m-s}/p^m (using α < 1). From (3.4),

P(sup_{f ∈ F_m} |(1/n)Σ_i f(ε_i)| > e_n(m, α)) ≤ e^{-m-s},

and thus

P(sup_{f ∈ F_m} |(1/n)Σ_i f(ε_i)| > e_n(m, α) for some m ≤ n - s) ≤ Σ_{m=0}^n e^{-m-s} ≤ e^{-s}/(1 - 1/e).

Lemma 3.2 is therefore proved.

PROOF of Theorem 3.3. By the definition of β̂_p,

(1/n)||(I - ρ̂_n M_n)(Y_n - X_nβ̂_p)||_2² ≤ (1/n)||(I - ρ̂_n M_n)(Y_n - X_nβ_0)||_2²;

thus

(1/n)||(I - ρ̂_n M_n)(Y_n - X_nβ̂_p)||_2² - (1/n)||(I - ρ̂_n M_n)(Y_n - X_nβ_0)||_2²
  = (1/n)(β̂_p - β_0)'X_n'Ω(ρ̂_n)X_n(β̂_p - β_0) - (2/n)ε_n'(I - ρM_n')^{-1}Ω(ρ̂_n)X_n(β̂_p - β_0) ≤ 0.

Suppose Assumptions 1-5 and 8 hold; combined with the fact that m̂ = card(T̂ \ T), using the result of Lemma 3.2 we obtain

(1/n)(β̂_p - β_0)'X_n'Ω(ρ̂_n)X_n(β̂_p - β_0) ≤ |(2/n)ε_n'(I - ρM_n')^{-1}Ω(ρ̂_n)X_n(β̂_p - β_0)| ≤ 2e_n(m̂, α)||β̂_p - β_0||_2,

with probability at least 1 - e^{-s}/(1 - 1/e). Since m̂ ≤ n - s, from the extension of Condition (RSE) we obtain

(1/n)(β̂_p - β_0)'X_n'Ω(ρ̂_n)X_n(β̂_p - β_0) ≥ τ_x λ_min(Ω(ρ))||β̂_p - β_0||_2²

(recall Ω(ρ) = (I - ρM_n)'(I - ρM_n)). Combining the above results, with probability at least 1 - e^{-s}/(1 - 1/e) the l2 norm of the post-model selection estimation error has the upper bound

||β̂_p - β_0||_2 ≤ 2e_n(m̂, α)/(τ_x λ_min(Ω(ρ))),

which proves the theorem.

PROOF of Lemma 3.3. For any m ∈ {0, 1, ..., n - s}, it is easy to see that the quantity of interest is max_{i ∈ T̃, card(T̃\T) ≤ m}(1/n)|Σ_{j=1}^n T_{i,j}ε_j|. Then

P(max_{i ∈ T̃, card(T̃\T) ≤ m}(1/n)|Σ_{j=1}^n T_{i,j}ε_j| > h_n(m, α)) ≤ (m+s) max_i P((1/√n)|Σ_{j=1}^n T_{i,j}ε_j| > √n h_n(m, α)).

Since the ε_j are i.i.d. N(0, σ²), the linear combination (1/√n)Σ_j T_{i,j}ε_j for each i is also normally distributed with mean 0 and variance (1/n)Σ_j T_{i,j}², which is the ith diagonal element of the matrix (1/n)TT'. Considering the fact that (1/n)TT' - Σ̃ →p 0 and that all diagonal elements of Σ̃ equal 1, for any α ∈ (0,1)

max_i P((1/√n)|Σ_{j=1}^n T_{i,j}ε_j| > √n h_n(m, α)) ≤ 2 exp{-n h_n²(m, α)/(2σ²)}.

Bringing in the definition of h_n(m, α),

P(max_{i ∈ T̃, card(T̃\T) ≤ m}(1/n)|Σ_{j=1}^n T_{i,j}ε_j| > h_n(m, α) for some m ≤ n - s) ≤ Σ_{m=0}^{n-s} 2α p^{-m} ≤ 2pα/(p - 1).

PROOF of Theorem 3.4. Again using the definition of the post-model selection estimator β̂_p,

(1/n)||(I - ρ̂_n M_n)(Y_n - X_nβ̂_p)||_2² ≤ (1/n)||(I - ρ̂_n M_n)(Y_n - X_nβ_0)||_2²,

and therefore it is easy to obtain

(1/n)(β̂_p - β_0)'X_n'Ω(ρ̂_n)X_n(β̂_p - β_0) - (2/n)ε_n'(I - ρM_n')^{-1}Ω(ρ̂_n)X_n(β̂_p - β_0) ≤ 0.

Since ||(β̂_p - β_0)_{T^c}||_0 = m̂ ∈ {0, ..., n - s}, combining with Condition (RSE) and Lemma 3.3,

λ_min(Ω(ρ)) τ_x ||β̂_p - β_0||_2² ≤ h_n(m̂, α)||β̂_p - β_0||_1.

The cardinality of the support of β̂_p - β_0 is bounded by m̂ + s, so ||β̂_p - β_0||_1² ≤ (m̂ + s)||β̂_p - β_0||_2² with high probability. Combining the two results, we have

||β̂_p - β_0||_1 ≤ h_n(m̂, α)(m̂ + s)/(λ_min(Ω(ρ)) τ_x).

On the other hand, since the T̂-subvector of β̂_p is the least squares estimator for the linear model with response vector Y_n and covariate matrix X_{T̂}, we have (1/n)X_{T̂}'Ω(ρ̂_n)(Y_n - X_{T̂}β̂_{p,T̂}) = 0, and it is easy to see that this equation is the same as (1/n)X_n'Ω(ρ̂_n)(Y_n - X_nβ̂_p) = 0. Keeping in mind the definition of Σ̃, we have

||Σ̃(β̂_p - β_0)||_∞ = ||(1/n)X_n'Ω(ρ̂_n)(X_nβ̂_p - Y_n) - (1/n)X_n'Ω(ρ̂_n)(X_nβ_0 - Y_n)||_∞
                   ≤ ||(1/n)X_n'Ω(ρ̂_n)(X_nβ̂_p - Y_n)||_∞ + ||(1/n)X_n'Ω(ρ̂_n)(X_nβ_0 - Y_n)||_∞
                   ≤ ||(1/n)X_n'Ω(ρ̂_n)(X_nβ_0 - Y_n)||_∞ ≤ h_n(m̂, α),

from the result of Lemma 3.3. Thus, with high probability, for 1 ≤ j ≤ p,

(Σ̃(β̂_p - β_0))_j = (β̂_{p,j} - β_{0,j}) + Σ_{i ≠ j} Σ̃_{i,j}(β̂_{p,i} - β_{0,i}),

so |(Σ̃(β̂_p - β_0))_j - (β̂_{p,j} - β_{0,j})| ≤ (c̃/s)Σ_{i ≠ j}|β̂_{p,i} - β_{0,i}|, hence |β̂_{p,j} - β_{0,j}| ≤ |(Σ̃(β̂_p - β_0))_j| + (c̃/s)Σ_{i ≠ j}|β̂_{p,i} - β_{0,i}|, and we have

||β̂_p - β_0||_∞ ≤ ||Σ̃(β̂_p - β_0)||_∞ + (c̃/s)||β̂_p - β_0||_1 ≤ (1 + c̃(m̂ + s)/(s λ_min(Ω(ρ)) τ_x)) h_n(m̂, α).

This proves the result of Theorem 3.4.
Chapter 4

Future Work

4.1 An Extension to Mixed Regressive, Spatial Autoregressive Models

Tobler's law of geography encapsulates this situation: "everything is related to everything else, but near things are more related than distant things." One way to approach this is through spatial interaction. According to Anselin and Bera (1998), high or low values of a random variable tend to cluster in space, or locations tend to be surrounded by neighbors with very dissimilar values. Spatial interactions generally come from three sources: endogenous interaction among the dependent variables, exogenous interaction among the independent variables, and interaction among the error terms. To capture the spatial dependence, the general approach in spatial econometrics is to impose structure on a model. In an empirical economic problem, if the spatial effect comes only from the error terms, econometricians prefer to use a regression model with spatial autoregressive errors, that is, a spatial error model as discussed in the previous chapters. Compared with other specifications, the spatial error model is conceptually simpler in the sense that the only problems involved are heteroskedasticity and non-linearity in the spatial parameter ρ.

Another popular type of model, which has also been heavily discussed in the literature, considers the endogenous interaction of the dependent variable in a regression context; this type of model is called the mixed regressive, spatial autoregressive model:

Y_n = ρW_nY_n + X_nβ + ε_n, (4.1)

where n is the total number of spatial cross-sectional units, Y_n is an n-dimensional vector of responses, X_n is an n x p matrix of constant regressors, W_n is the spatial weight matrix, analogous to the matrix M_n in the spatial error model, and ε_n is an n-dimensional vector of i.i.d. disturbances with mean zero and variance σ². The term W_nY_n in (4.1) is called a "spatial lag", and the spatial autoregressive parameter ρ represents the spatial effect due to the influence of neighboring units. The main interest in the estimation of the model is, in general, the parameters ρ, β and σ². The interpretation of the model is that the response of a unit depends not only on the explanatory variables but also on the responses of its neighboring units. Therefore, the spatial lag model is widely used in spatial econometrics, the social sciences, agriculture, and health (Bertrand, Luttmer and Mullainathan, 2000; Topa, 2001).

Clearly we would not want to run ordinary least squares (OLS) on this model, since the presence of Y_n on both sides of the equation means that there exists a correlation between the regressors and the disturbances, and the estimates would thus be biased and inconsistent. The estimation methods that have been widely discussed in the literature for mixed regressive, spatial autoregressive models are mainly the (quasi-)maximum likelihood estimator (Lee, 2004), two-stage least squares (2SLS) or instrumental variable methods (Kelejian and Prucha, 1998; Lee, 2002), and the generalized method of moments (Lee, 2007). The instrumental variables (IV) are usually generated from the exogenous regressors X_n and the spatial weight matrix W_n of the model, and most of these methods are computationally simple. However, they are inefficient relative to the ML estimator when the disturbances are normally distributed, so that the likelihood function is correctly specified. Also, as the IVs are functions of the spatial weight matrices and exogenous variables, the 2SLS method is not applicable to the (pure) spatial autoregressive process when there are no relevant exogenous variables in the model. The generalized method of moments (GMM) approach combines IV estimation with a generalization of the method of moments (MOM) in Kelejian and Prucha (1999) that has been discussed for the estimation of the spatial error model. Of all of these, the most popular and traditional estimation method is the (quasi-)maximum likelihood estimator under the assumption that the error term ε_n is normally distributed; the quasi-maximum likelihood estimator allows for the case when the true distribution of the error is different from normal.
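To make model (4.1) concrete, the following R sketch simulates from it and evaluates the Gaussian log-likelihood of its parameters; the weight matrix, the parameter values and the function name sar_loglik are illustrative assumptions, not part of the dissertation's code.

```r
set.seed(4)
n <- 100; p <- 8; rho <- 0.4

# Row-standardized "circular" weight matrix with 3 neighbours on each side (illustrative).
W <- matrix(0, n, n)
for (i in 1:n) for (k in c(-3:-1, 1:3)) W[i, ((i + k - 1) %% n) + 1] <- 1 / 6

beta <- c(2, -1.5, rep(0, p - 2))
X    <- matrix(rnorm(n * p), n, p)
eps  <- rnorm(n)

# Mixed regressive, spatial autoregressive model (4.1): Y = rho W Y + X beta + eps,
# i.e. Y = (I - rho W)^{-1} (X beta + eps).
S <- diag(n) - rho * W
Y <- as.vector(solve(S, X %*% beta + eps))

# Gaussian log-likelihood of (rho, beta, sigma^2) for this model; the log-determinant
# term log|I - rho W| is what distinguishes it from an ordinary regression likelihood.
sar_loglik <- function(rho, beta, sigma2, Y, X, W) {
  n <- length(Y)
  S <- diag(n) - rho * W
  r <- S %*% Y - X %*% beta
  -n / 2 * log(2 * pi) - n / 2 * log(sigma2) +
    as.numeric(determinant(S, logarithm = TRUE)$modulus) - sum(r^2) / (2 * sigma2)
}
```

The penalized estimator proposed next simply adds an l1 penalty on β to this log-likelihood objective.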
We want to continue the idea of exploring variable selection and estimation methods in the high-dimensional setup, where the data contain a larger number of parameters than the sample size but most of them are superfluous, for spatial econometric models, and to extend the theoretical discussion to a mixed regressive, spatial autoregressive model, where the response variable is spatially correlated with neighboring units in a regression context. Consider the mixed regressive, spatial autoregressive model (4.1) in a high-dimensional setting, and let S_n(ρ) = I_n - ρW_n. We advocate an l1-penalized likelihood estimator, defined as

β̂ = argmax_{β ∈ R^p} {ℓ̂_n(β) - λ_n||β||_1}, (4.2)

where

ℓ̂_n(β) = -(n/2)ln(2π) - (n/2)ln σ² + ln|S_n(ρ)| - (1/(2σ²))(S_n(ρ)Y_n - X_nβ)'(S_n(ρ)Y_n - X_nβ)

is the log-likelihood of the model parameters given the sample. These days there is an increasing number of data sets containing spatial interaction that comes through the response variable, and for many of them large numbers of irrelevant parameters are present because data collection has become easy. Because of this, it will be extremely helpful to establish asymptotic consistency as well as theoretical inference results to justify the penalized estimator in (4.2).

4.2 Future Work

So far, the spatial literature has not paid much attention to model selection in a high-dimensional context, and neither has the model selection literature accounted for spatial dependence in any substantial way. The combination of these two concepts can be widely extended to more complex spatial models. For example, besides the spatial error model and the mixed regressive, spatial autoregressive model discussed in this dissertation, where only one source of spatial interaction is considered, we can also develop similar approaches for the spatial autoregressive model with autoregressive disturbances, where both interactions from the neighbors are considered. Also, the spatial models we mentioned contain only one spatial lag term (the ρM_n or ρW_n part). Spatial models of higher order, which incorporate two or more spatial lags, have also been discussed in the literature (Lee and Liu, 2010), and how to develop consistent statistical approaches for them in the high-dimensional setup is worth studying.

In the existing literature of spatial econometrics, the spatial weight matrix plays an irreplaceable role in describing the interactions between cross-sectional units. However, it has been pointed out by Manski (1993) that the spatial autoregressive model family fails to specify how the spatial weight matrix should change when the sample size changes. Even though an increase of sample size will inevitably affect the magnitude of the spatial autocorrelation, and there exists a large literature working on the asymptotic properties of estimators in the low-dimensional setup, the study of how the spatial weight matrix should change is largely neglected. More work is needed on the specification of the spatial weight matrix to allow for a changing sample size. Besides, the spatial autoregressive model family in the literature always treats the spatial weight matrix as prior knowledge, and the spatial weights are typically defined as functions of some measure of distance. This choice may or may not be consistent with reality, and incorrect specification may result in serious consequences. Some literature has already noticed the issue (Bhattacharjee and Jensen-Butler, 2013; Bailey et al., 2016), but most of the methods are developed for panel data and do not take into consideration the high-dimensionality of the predictor variables. We anticipate that the analysis of the spatial weight matrix for data collected from one time period in a high-dimensional setup could lead us to a new understanding of spatial interaction, and the method can also be extended to a wider research area.

BIBLIOGRAPHY

[1] Anselin, L. and Bera, A. K. (1998). Spatial Dependence in Linear Regression Models with an Introduction to Spatial Econometrics. Handbook of Applied Economic Statistics, 237-289.
[2] Bailey, N., Holly, S. and Pesaran, M. H. (2016). A Two-Stage Approach to Spatio-Temporal Analysis with Strong and Weak Cross-Sectional Dependence. Journal of Applied Econometrics, 31, 249-280.
[3] Barry, R. P. and Pace, R. K. (1999). Monte Carlo Estimates of the Log Determinant of Large Sparse Matrices. Linear Algebra and its Applications, 289, 41-54.
[4] Belloni, A., Chen, D., Chernozhukov, V. and Hansen, C. (2012). Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain. Econometrica, 80, 2369-2429.
[5] Belloni, A. and Chernozhukov, V. (2011). High Dimensional Sparse Econometric Models: An Introduction. arXiv:1106.5242v2.
[6] Belloni, A. and Chernozhukov, V. (2013). Least Squares After Model Selection in High-dimensional Sparse Models. Bernoulli, 19, 521-547.
[7] Belloni, A., Chernozhukov, V. and Wang, L. (2011). Square-root Lasso: Pivotal Recovery of Sparse Signals via Conic Programming. Biometrika, 98, 791-806.
[8] Bertrand, M., Luttmer, E. F. P. and Mullainathan, S. (2000). Network Effects and Welfare Cultures. Quarterly Journal of Economics, 115, 1019-1055.
[9] Bhattacharjee, A., Castro, E., Maiti, T. and Marques, J. (2016). Endogenous Spatial Regression and Delineation of Submarkets: A New Framework with Application to Housing Markets. Journal of Applied Econometrics, 31, 32-57.
[10] Bhattacharjee, A. and Jensen-Butler, C. (2013). Estimation of the Spatial Weights Matrix under Structural Constraints. Regional Science and Urban Economics, 43, 617-634.
[11] Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous Analysis of Lasso and Dantzig Selector. The Annals of Statistics, 37, 1705-1732.
[12] Buhlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data. Springer.
[13] Cliff, A. D. and Ord, J. K. (1973). Spatial Autocorrelation. London: Pion.
[14] Cliff, A. D. and Ord, J. K. (1981). Spatial Processes: Models and Applications. London: Pion.
[15] Donoho, D. L., Elad, M. and Temlyakov, V. (2006). Stable Recovery of Sparse Overcomplete Representations in the Presence of Noise. IEEE Transactions on Information Theory, 52, 6-18.
[16] Fan, J. and Lv, J. (2010). A Selective Overview of Variable Selection in High Dimensional Feature Space. Statistica Sinica, 20, 101-148.
[17] Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33, 1-22.
[18] Fu, W. and Knight, K. (2000). Asymptotics for Lasso-type Estimators. The Annals of Statistics, 28, 1356-1378.
[19] Geyer, C. J. (1996). On the Asymptotics of Convex Stochastic Optimization. Unpublished manuscript.
[20] Hoerl, A. E. and Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12, 69-82.
[21] Huang, J., Horowitz, J. L. and Wei, F. (2010). Variable Selection in Nonparametric Additive Models. The Annals of Statistics, 38, 2282-2313.
[22] Kelejian, H. H. and Prucha, I. R. (1998). A Generalized Spatial Two-stage Least Squares Procedure for Estimating a Spatial Autoregressive Model with Autoregressive Disturbances. Journal of Real Estate Finance and Economics, 17, 99-121.
[23] Kelejian, H. H. and Prucha, I. R. (1999). A Generalized Moments Estimator for the Autoregressive Parameter in a Spatial Model. International Economic Review, 40, 509-533.
[24] Lee, L-F. (2002). Consistency and Efficiency of Least Squares Estimation for Mixed Regressive, Spatial Autoregressive Models. Econometric Theory, 18, 252-277.
[25] Lee, L-F. (2004). Asymptotic Distributions of Quasi-maximum Likelihood Estimators for Spatial Autoregressive Models. Econometrica, 72, 1899-1925.
[26] Lee, L-F. (2007). GMM and 2SLS Estimation of Mixed Regressive, Spatial Autoregressive Models. Journal of Econometrics, 137, 489-514.
[27] Lee, L-F. and Liu, X. (2010). Efficient GMM Estimation of High Order Spatial Autoregressive Models with Autoregressive Disturbances. Econometric Theory, 26, 187-230.
[28] LeSage, J. and Pace, R. (2009). Introduction to Spatial Econometrics. Boca Raton, FL: CRC Press.
[29] Lounici, K. (2008). Sup-norm Convergence Rate and Sign Concentration Property of Lasso and Dantzig Estimators. Electronic Journal of Statistics, 2, 90-102.
[30] Manski, C. F. (1993). Identification of Endogenous Social Effects: The Reflection Problem. The Review of Economic Studies, 60, 531-542.
[31] Meinshausen, N. and Buhlmann, P. (2006). High Dimensional Graphs and Variable Selection with the Lasso. The Annals of Statistics, 34, 1436-1462.
[32] Ord, J. K. (1975). Estimation Methods for Models of Spatial Interaction. Journal of the American Statistical Association, 70, 120-126.
[33] Pollard, D. (1991). Asymptotics for Least Absolute Deviation Regression Estimators. Econometric Theory, 7, 186-199.
[34] Smirnov, O. and Anselin, L. (2001). Fast Maximum Likelihood Estimation of Very Large Spatial Autoregressive Models: A Characteristic Polynomial Approach. Computational Statistics and Data Analysis, 35, 301-319.
[35] Tibshirani, R. (1996). Regression Shrinkage and Selection via the LASSO. Journal of the Royal Statistical Society, Series B, 58, 267-288.
[36] Topa, G. (2001). Social Interactions, Local Spillovers and Unemployment. The Review of Economic Studies, 68, 261-295.
[37] van de Geer, S. A. (2014). Statistical Theory for High-dimensional Models. Lecture Notes.
[38] Varian, H. R. (2014). Big Data: New Tricks for Econometrics. Journal of Economic Perspectives, 28, 3-28.
[39] Whittle, P. (1954). On Stationary Processes in the Plane. Biometrika, 41, 434-449.
[40] Zhang, C. H. and Huang, J. (2008). The Sparsity and Bias of the Lasso Selection in High-dimensional Linear Regression. The Annals of Statistics, 36, 1567-1594.
[41] Zhao, P. and Yu, B. (2006). On Model Selection Consistency of Lasso. Journal of Machine Learning Research, 7, 2541-2563.