A 1..-. . .. . . .. . , . ,... 449 4 4 .84‘4:£:049 4.4:! 5,4,0‘9319. 44
49.9 494 4. ._ 449 4. 100.999.... .99 9. .7994. 4 9 4 . A: 4 .Lln.4,...19. 9.9.4. 9.9... .499454. . 49.84”. 4 L. 494. 4 4 d. 4 . 44 . , . , . . .9 94 4.99 4g? Eat 4‘41}?
.54 .0194 ‘4 o .9. 9 999. 91994‘9901U.M9 49 4999 .9 9 .94. 9 .4 .. 94.4 44 41....94.9.ul$.«9...,._ . . , , ,. . .99 ‘9'."9H49i-Ml; .94. Anonrt9m9wvoub 9949B“. .,9. ~49 4.9%

.\~ . .99 9 99 u ‘9 94 99 9.9 . 4 0919999..

90 . 4 .

. 4.. . . . r :9... , 994.9 9. 4 994. g 4 .494 4. 99 . : .9
_ . . 4 04. . .4 99.4491. 4293443.. ”49 444.994.43990494r94 904. “9.4.4.... "4 9ant“ WXLWIL.,.I.§9‘ 149-499:x, .14
4 .99A49499uvq‘ 49,999. . . . , . . .... . . , .. 99 9.9.1.909} 4999 ,

 
  

     

             

 

    
   

    
 

..,..uwgﬁﬁwﬁuwmm

999.9534“ 4.. 94. ,..-4.9.4 494 44.4

v

91.449.239.994 9.... lid-.9
9 I

 

                   
                   

..,...9 ..94 9. 4.
.49-... 4.99., .9. I9 944. .

39. P4 ..94. 44- ... ,. , A , .. . . . . . 9.‘ 449. 9”}... 44,4‘449T4ﬂ2900

9. .9’44 ’ , 0‘:
9 eﬁil 99
99 V99‘: 94999.09.

0 4:094

         

 

..V 9.9949491! . . . . 9 4;... ..994 . . 9493.9. ..4. ..94 4 . .4

 

 

             
             

                                           
            

         

   
     

  

                 

       

 

       

 

       

       
     

               
  

 

    
      

    

                           
         

   

 

 

     

 

   
   
  

      

 

     

 

 

          

     

  
       

 

                                                      

         
 

  

  
 

 

                   

 

 

  
                

 

 

       
        

 

M. 8‘4‘ 9 ,
. 94A 99 ..,,44.9944499.0_944. 49.499 . . ,. . . . 4.4. 9.994. .4 4:49 . :99 .I 4
...9 4.4 .49 4 999 494. 1- . 4L 991., 99.44949}; 9494 999 ‘0. 160.9994. 93.44.14,..9 .4 .. . . . . . . . . . , . . ,. . . . . . , .. 9.. 9204.44. 4 9 944. 4 . NF! 0. 49,4.“ . .94 L 9
9 ’94 . ﬁ 4 .4.. .‘ 9492.1.“444 ..‘—,1093.9!“ .9. 4 9 9 4 9044.99 94'499494 41 994 ..A 49.4449. .999. . . , . . . 499499 9099 9 .4.-91‘. '4. ’4 ‘94. 9 .94 u. 4 9
. .9 '94 949 ..49 4 444.49 4 , . 949. .4 499. .,. 9.4.... :9.»- 44. 9:.94ﬂa9 994. ... "9.9.4594 44... . 9 ..4. 4.... 4 4 9A .11..“ ..4-4,442.44:- 9\. 999., .. J 9 1.4 4
04 . .9 . r94 44...... 9194.94 JR». 49‘ .1. :99. 2:. .,.I: ..4 ..4. ”—944 ., JIY. 404994 1.4.9.44 .4 9.99.4 . .9494... 444...... 4 4.94.92...) 4 444 .9 44 4 444.9- 9449 9. 13,319.
9... 944 449 .49.. 9 9.949 H4 ..99 .9.9. 934.... I99 4. .99 .4. .4344. ..4 .. 4 4 4944990949”..- 944 994 .4444 . 4 ._ 999. 4. . 9 4, 44 ,.U ””493. 9999.94.44. -.9. ”DD;
“9.9994440949149999 39".” “3,499.9? 04 A419 I .99 9.,.-.96 :44 9. 994.994.4919 994444410.- . . 9, . .49. 3‘9“: 9.44 49... (“49093094 944 99 3099994441349» 9 4 .‘Undhuku. A l, I
.9444 494.499....4A “4 4.9.494 . 9949. ...9: 99 4 49 9. . T0 . .. 44.4.4 499 , ,. , . ... 9 9 .94 9 9| 4 9. . 9. 499.9 .... .49 ..,9. 9. 4.49. hi, 9.. .. . 9 .44.... 999 41.94494.“ 9494 9939”,. «a 9....9.9,!. 4
94439.41“? 9449. .4. “4.4%349493 4 .44. .4 .C 4-$ $404.42.: 99.4.3.9 4i 49.4.9946. .9 4,94 .4 . . . . . . 4.... 4 :4. .99... .9. .9 ..9... .. 4. ..4999.......... .., ,9... . 4. . 4 4.24:... 4 4. , 4 .49 .4949 94. . 9.. 99. .4 7.» » 4.9;}13 ....h
94.9?“ .“J‘JM—w .9 9.4, .4994 03904904.”. ..' A949... 4. 9‘9 .9 4444 9 . . . 3999490444. 4 4. 9. A9 9 .9A9 . 9. C 49.9 ‘9 9 ..4. 499. .h‘. “.93? 4A9. 94 44
r“ 9.39. .99 1 444 .9 444994 9 94:999. . ’l. ‘4 44.944- 949.49 .9 49-4494 9944,. .04 99‘ .9n...9. 44. .94.. . 4 .99 .9994 9 99.49.54? 9 .‘94 ...4‘.4 .. . 494 , 9.49.. .9944): .J. 4
44* 99299491949 4.9"? . .C 45.94.9411 997,.9-4949999. I $999.93 4,99... ‘4. . . 94-49... «9494. 99.. 140499. 09 9 4.99A4.‘ 1. 449999.1‘969‘4 .994
9. 99...,4«.1.H 944.449... 4.44 3.44....494999944; . 9 39.49 9.0... 4 .49: 4999 49. 9144444 .. . 4. 4.... 4. 4.4.. .. 9., 29.4... . . : 4.0999444! 94.4. 1 .. 94.9.44 ...—9.... 4 .99. . .4949 ll 49.
1. .94 933. . 9 9494‘ 4 4‘ .0991 .,.-0944.94... . 9449499 44 4 .9 9... .9994 9999! .1940... 9‘. 4 . 999949 .99 4.9. 4 .9 99. 9.9 ..444 9 499494 .99. . 44. .94 9 9.... 9 4. 49.499 9. 9 4.4..‘..9 .4 . 9!...‘9.‘.‘ 0.4
04104949“. 9. .9." , 4.9 04 4 9 9. 9 .94 494.99 4 94.9.99 49 9 .4. .. 9099494. ..9. 949. .9. 4949 9.. 9 9 9h 9 49,94
,4uw» r9 9A.... , . . o.~‘wuuu.4 ..9’ 4 .494 .4‘4 . . 9O 9 9.. 9 4 9 .9 49999409994 444 0904 4 49499. 99 4 449, 4. .949 96 . .4 944 4‘40. 4.94 I 9. 4 99 9,4 .4 ., 09999 .44 9 4 4,4,9 4 94
. 99 94494. . . , $039449- 44 4 9‘... I399 44. .... . ,4. .494‘94‘494 4 99,. .9994... 9.4.9 . ‘2499 9.. 9999,99. 2994.49.99: 9.9 90. 999-9... 9.9 9 ..4. :94... .44. 9. 49.44.9494 ... ..4.... .4. 2" 4 .4...
9.49 9:44. 9. .. . . , .44 9.949". 49.4l4l4 9.4 ,.I. 9994 9999 9.9 9499 . . .. .4. . .999. 9.4. 44‘» . . . .
4. 4 99 .44” . 9 (94‘. 9094999. A9 4.999 A .99 I, ...-99.99491lf4499 99994. 99 9 , 0. 9. .‘949: .7.
.44 “anwyudh .r 144?. 9459304 9941.. . 4. .9 .9 599499.. .444. 9 4 909.. .. 94‘444J9HH . 4.9. 4.499 . . , . . 4.4 ...
“9 4‘09) 0”}4r9040443 49‘... 049 44 0 4.499 ..I—looh. 9. 099994. ‘0 499‘ 99 440.1’..‘9."! 4 . , . 09.09 9.9.944 4940.. 4 0999949
' 4494941490909 99999 .. . 44-9099 .9449 . ‘93 01994999. 99794940 . 474 9 0.4 , 99949 CA»... .44... 4 0. .. 94.4. 09.49 .. 9909940,!4 ' . . . . .
. 4.99., [409.499 9A... "4 44'. .,9 9 ”‘4. 44‘ 4 9 0 4 9 9 449 1.14094 040404 .44 994 4 9. . 9. A, 49 4 494.49 .‘9499 99‘439. . . 4.
9_'494~.9A9,.9 a; 0C. “In. ”’9. 94.49. 949 “44 . 44 9. 9.9.40.0- .. 49 9‘ .4 4 2.4.9-”. 4., 9 9.44 99. 9 . 44994.19. 4 . 4. 99.
..4. ..4.. «9 J99? 44.l44l 9.99999 399 4944499- 4 9.4... 9 94. , ..4471999l499.\1 . .9. 9 44. . . 499-4.... 44 . . . , . .. . 9.! .Q, ,
.4"! ,4 4 .9. 9.94 .9 9‘. .40’9 (‘99,. 9.9.. 09):... 99‘94,.9q99490 ,..-999909319049 . 4 ,4. 49 494 0.9.. . . . . .1 943.439. 99.0 9.,
4 .4 ”no? 4 . .9999. 93499959194849 44 099.9 194.34.49.40 344494249‘ .1‘912. I 9.94. .,40444.9999 949.09.949. 9 4 . 99.9494919949944 93.99.993.418. .. . 0..
944.4414. 9. 9 . . . 5 .9 9.9!: 49 .4990. 9.444 O. 4 .| 9009949.- ..9499.949 4 4 9494 9.49. 0. 94.04 499 499.4. 499.99. A 4. 9 4 49 9949 9 ...
4. 4,9094. . 0,449 .9094, '9 A 9 ’4 99944999134949; 49.334909 4*: ,.9 .94 ‘.. 94.4 44.990 .4, .99 3 9949994- 9949... 4 , . 44.9 9; L. ,994. .4499.
499449 9.934 . 4 4'9: . (...-4499‘”. 9 4, 44.34.44 - 999949994 9' .9.494994.'99644 444...! 4909.44.99? 0 9 . . . . . . . . , 9,9,9. 9 . .9. .. ..999999 2!,9- 99.9.“...4 .4 9999 9.
. 91999.29 949 4” u . .9 9 4 44.44.99 8d" 9:439:31. 99 9.. 44.449 949.999. 9.90 . .4 .9494 9944.49. . .. , . ... . . . . . 2499.904 4 49.... .9 94 .99 49494.9. .4 9. . .49... .
494... 1999949 ..4994. ,9 4 9.144.941.4949..ﬁ94ln‘94.4 99014 ,49 :9. 9. 99999994492944.0919. 9.94. . . . . , . .. 4. 4499 .H4t4n.4.'«m,..99,.. 9.,.
4 .9. van-J?“ 444“. 4. .4 V4 .999 9 94.I.99 9.39...J9.9)...l 4.44.19 . . .94. 1.47. .4 94.449. 94.3.9:
9&3 9rv44. ..f 0.9, ,.9.I944 4994144? 9499.99.24.49 19‘9“”...9.‘ 9194.99.49 4 .999 . . , . 92943.4 ‘84....
. . . . J4.499,‘14.49.n..9 ”9494 4?“ 110059.. . 4 . . 4 ... 499 9.4-9. . , . 49.144.099.4149.‘
W . Z 90 44 40.4.4”: 99394 .9 .999,l4 444499944.“4..4.4 .. , . ...9949499,.....A.L.4999
9.449. .A 994.. 9939-49 ln9 ‘4. 394992.394)“ 49 .. 30.9 .4949 9. 444099 ,9.. 4. 44-9. .9 9. .9 4.929.441.491-
4. 4,9 .. 4.449 49:4’4. 9~ 4.44, ..4. 499994 4494 9 ..9 94494444394444 44999.9... 94., .39.,944. 9.4494794. 99. 49 9994\499304
.4.. a” B 4. 4 I994. 994. 4999. 94.19.9994‘ 994., .4 4 4‘ 99 9944 .994999939" 99. 9. 4.4! 44. 9. 4 .4. 94A 4 . . , . . 9.4.9.9 - 7 . .4 4,44 . 49.9994 .I9944 9399999414 44499
449...... 44494 49 49.9 . 3499.44. .4 .9 Q {99..4100'99443444 4 1.994.499.9949 4.94 44 9.9. 49.9. .4 4'... 04.99.419.999: 9 - . . . 4.4.9.. .u.,. 9.. 5. .9. .49 99....(94l944 9499 “tan 4 44494499.,
4:4 ..4 444. 9‘25. .4 “1'4994‘. . 9 41.94.9449- 99.444). .,., .0944. 494,34 94. 99.. .99 9 4 .A..499 4 . . . . . 1 . A4 0.99... : .9993. ., n}: . «M94399.
«0»...- 4. .99 . u 4 ...9 4. 4H4}... 441.999.61.99 9.44.99.40.94 44444 9494!. 4‘93: ..4.-44.94. 1944.94.19! 8.04.24.39.41“. . 9.9444941... ..444. .99.... 4 .. . 9..&99....9!..49494
.I. . , II a 4444 . 949190.49. 99994.99“ 9'9. 4.999 . A9 9. .949 9. ..0 4 ..94‘499 04.9 4. 9 ..9 4.. ..9 9 J. 4 .‘09 ..9 0 ’9 4.9- . 41019449. 0 .4 9;.
94.9. 54.94409. ,44949 . .399 4 .49 ,4. 944 94499 .3.wa 9499.409” ... 99903.14.9‘.99.99. . 4.9999431390703019! ..‘J a . 49......44ﬁo9o 99. .9. ...ﬁ 9.9!.49. A9,. 9449.44.99.19 49":
.9909.i4.,4..’94. , . I4:‘}949‘ﬁ4..9u9 9.4 9 4949.99.9..0...O‘ 9.04.449. .94904 9909. 9,9 ”9 9. 19 .4... . . . 94.9.4111... .94 99.99 24994404494441
.9. 3.99.4499 49. A4 4 .949 4:94 9 . .1794... 90494.4 4494 490464. 9.44.9 990499 .04. 444:. 40...... .9“ .4293. .4. .9 . 9 9944993999494, 449.499.... 99 44999494949; 94 .4 .4.-49.4 4 . ....4. 419444.: 9.4.3....49444 9 44.44 194.. 49"....
49444.. . ,944 44.4 4.4.99.9. 3491.74.44! . . #4 49:" ..44- 4.4.9. 4 . 99“.,“ or .49 949. 4 9 994A ,. 9 9 ,... 9 I 9.9 99. .449 .9. A.. 9 4 .4041, \914 9‘4... 49
94 4 4 4 9 0 .994 54.049! 92. 4 4.409 4 4 9 . 3.94 9 '99 . 9 . . {9999.94.49.44 4 .49 .99 .‘9... 9 . 9
”1. 9&1; 4 944k?»“ 4.49 4 ‘19:.949909 Q 9994 4,9494 499 4. .. 449909 .94- ﬁoouuu. «J 9.9 th99h4"..4.9 , 99999 .3 9.9“..999unm0934v. “”99”. ..,”. .... "99 4. 4.. .. 93.93”“ "4.4 94 4 99 0904,. 4 .3‘99w ”444w, . 9
~49.94.994.59 99 9 999: 94444. .-. 49149 9.9949994 4 .9994 .9494. 9 ,9. 9.490 49.999.99.941... .4 9.94 94 4. 4A 499.990.494- . 4.9 999 . . . . . ,
—4.. .,..9949 .,. .43 I .4 - 49“.”. 4 .494. 4,4. .4... 9.049.444 "444.... .4449 49 . 9 .49 4. .444 u .994. .4 ‘9 4.4.... 4 9. .49.. .4 I444..9.9.;449.9.l4.4 . 44L . 94. 4 . 4 4. 4. . , . ,. 4.9444. . ,9..! 0 .99.. r .9
.4. 4.4 4494 ..‘4 349.144 4449 499444495 949 1‘..4,1.44.wr49.4ﬂv9 9 999 9 94. 097%4499 48 9 9 94 9 9994449994 . 94 .. . 499 99 . ,9.,. 999. 94 4-4491 $96.09; .. t .94 . . . , . r . _ .. . . . . . 44994.. 49944944949099.4- .4 4.4 44494 In 5 49 . ,
99,49 2:4 94.9 . :9 9 9.99944. “9..... 9 . 4.49. 4 49:9 444. ... 494.4154. . 3.44499 944494.949.) .99 4. 94 999 94 . 9 .9. 49 4 44.49.99. .9 . 94 . . . . . . . . . .994 . . .49 4.94.4 ,,.4.4.o49 ..4 4494‘: 499990.... 94439 999 ..9 4 . .9 Q9 .4 l
..4.... .ﬂaqu,..n9,.l4 44‘9“”..9”..44441.I449444 4449449 .9 1.449494. 4.4.494 49999., .... 99.4.49 .44. 94. .4 I04. iﬁolfJSQ 944999 4994 .. 4. .999.9,4.:9.9.|4992 44v .4. .94 .4 4 ...9944 .4419. 44 49 .99 ., . , .9 149.499.44.438.....4....9.9.4.‘49 449. 99...: 4.4%....9. ’94\\ 944.94 ..4‘4449I9‘0 9.9 999499 ”..4-99! 949‘
"99 9 4,99 949 9.465.- 94. “AIM ..:49,Io.. 949 4.4.9 ‘.\ ..4‘ . 94.4 .9. 7494999 4 -2949... 44A 94. ...9. 4.4\ . 9, .99.A49...4 ..4 .44\.4944h4 ',.9 44 991.4“- 44.44 .- $44M"!
4 4 . 94 ..4 9L9. 44. ... 499. .4....9 49.9 4.9” ﬂaw-444.0 4. ..4... .4999. 4.994 419949.969 49494494294 49944,: . 44990.. 9A. 41394994....994143990944: .,.-4. 1.4.444 499.. 34949.9. ..., 994 9 944.3969... .. ,, .49. ..9 4.A :%.4.\ 24...... 94I..9994.9944.44."919..“49.!49
99490 to. 4‘9 0 .4 99 994499. .4. 9 90... 49499499 .9 \ 94494994 9 490.469 49... 9 9 .9999. 995948.94.- 4 9 4404004. '. 9‘11. 4 .9. 9. . 49,99 . ,.4 94.9 4.99 .9 9., . 94 b. .9... I 4 ‘9. 94949 9949.9, .999 994. 9ﬁ
‘99,}.44. A 4. .949 t 9 A9. .4 49.14.8441 434949.41 4944.449. 94 4.?” .44. .4 cost-49999 999. 4.44 9, 4. 999. 94.99 4 4 4 9.9 -9994‘9 0949 9. 4449 99.... .44. . . 94, ,9..-.99- .40 9 .943 949 .. .94. 494
9“."49, 999 . 1L0”..9u44.449n4. ...-69%”: dd:.’. .9 .4349: .4 “9.,-9. 944794.. 9 4 . .4 4 ””4949 .4449. T 4-»... ..9 .9 A. 4. . 44 -.. lac-9444.... 9.. ...4J40H4l91.
........9 4- 4 . 44.4 . .4119 9.2.4... .I. 4 44. .... 4.143.494.9944. 14....9, ...... .49... 1.. .429; 4. 4.9 .
.. 9.44.9449. 94. . 9-...QLP. ..4, . 944, 499994. .9444} 9. .. - (.4... 4.. .94 4. 9 .9l ..9. ...: 49”“..N. l.r999 $94,949.49!
.44...‘.4944..9l4 4 4. .9. 4..449.94499.4. .4.‘49 . 494449941. ”1.9.9. .4940 94499.14. ..4 . . 4 44 4.... 4... 9.1.4. 4.. -49..«4 .49. .9. I .19.. .4 .9 ... . $9449.12“;
9..:4. 9 9 09 0199999..- 940A 9'. 4. 94. 99439. 9 994 9‘ 4‘ . .9. 9.; '9394Q9F{1 99.9 44,4441 9 .4. 19.99. . .44. 9‘ '90.
9 .444 4.994 .24.) 4 4 44.449.493.441}... ,4 44.9..9; 9.9 £4.49 4 ...4. 49:744. .99. 49 . .4 9.44m” . ...,944...9.4. ,9..? .34. .294. . ...44...4 . 4.40 414:..93444. 4.4. ..9 , 44 944.44 4. $494.2 49k”. “49..
93599744949 494.4. . .4. 44994 .49 .409. 9 4 94 94, 9,9 ....449 . 494- 4. . 9.9 9 944. . 94.9.: 4 49.1.9. ..4. 349.494. 99 44494 ,‘$4449 4. "444 .9 24949
0.99399‘9. 944.. .4.... 99.99,. 99.. .‘..4..(' ‘2994. 9494993 . 4. «.449, ,9 .. 4949.. .9449 . 994-4,(9949... 494449.. .42, 4-94 9! 44. 399,444“ a
4 4 4.9 .s ‘444. 94. .9949. 499.4 4094.... I: 949 9. 4 . , 4.. .9494 4494.49 99 9.9‘4.‘ 4,590.“.
,4v 94.44144... ...-4. 9 10.6.4494. .149 4.4. ..,... 4. 44,974.... . . , , .9 4.94.9490.” 9 9,944! 9.4949499‘JIS9I99.
.494! 999‘- 4999449 44 9.4.9.9149 4 I .4009}! 949 04.: OH . ., . . . . . 4 . . . , . . 404 9 4.9‘ .94 414.99.999‘.
P4n09..4...9p 49$ 349999“: L, .4..H949.4.4094 44. ..4 ”59.9%.? 3......94. . . . . . . .. . . . . . . . 30:4. 444499999. 4“ 244mg
4. «4V. 44’...9-49999w3 .4 .4. , 99...... 4.. ..,, 9.49 . . . . , . .. . . .. . . ... . ‘9. 4.: 19,4919: 394%. 9 49.4
99. . ... I94 94....M49J )4 9 4:454:9- 9. 4... .,.. . ... 47.44944 . , 449 4 .4. 99 4.. . .4. 4541‘-
4 4 W933. .09 ..ll49,4. OM49! .9 4 09. .99 4 «.44 9.9". ..4 9 . . 29999. éﬁj. .9 ”0.194 9J1W99’Pnu99r99 4.909 .V’vw

    

     

 

404.. 4 4:...
9.99.94 4.44 44.4.9 o 4999 99,044 4
99 . .44. 4.0.9 4. 949. 4 . . 0 9'... 9. .

    

4 99999.. .4. 9 9,.. 4994 4044 . 9.9" 4 .994 9949 .4
44“ 94M- 4 . .lmJn.‘ 9u . 4.49444 4 9., 9 4 444.9449, 0., ,9! 09.4999
. 499040. A 4'949». 0., 4, .4944I. 9‘ 44 9.9409 99 9 90999 9 , .9 7994909.. 4
2 ,9. .490 .4 9‘44, ,9 4 4. 99.. .9: .4 9 4 4A 9 9444 494.34 .‘4'99944, Q4...
, 34.44939. 99494498494 99 999‘994..‘3 44 O 949. 4 449.,94_ .9. 4 9.4999 - 9.
.1. A4 4 9. 45,4», .999..99 .. 4. 9.4.4. .( .1949}. 4 .49 £94. 4 9499 .4: 4.4. :94. . . 4. 4. .44. 9 4 .4444”: 4... .2499 .4 49.4.4449; ..4-A9494 .4..

9099 19.49.49’4949 4a 9349 4. 9‘.— n. 49494099 9. 944499909099. 4944 9. 9 9 ‘99. no. '0 999.949 '9.:94|999. 9444.. 4. 9409.. 39449.99 4... 9.4.4... 90. 5.94049914049, . 49 ‘4 A 9 O.
.. ..,. .. 4, 9 4 \94 9949 .. .4403 . 99 4404 944. 99. . 4 44441194. .4.. 4999.9. .49 .9 .44

 

   

         

                
    

        

.. 9.4.44 9 ...-994.4 .4...) .9... .444 4. .919
.49‘,, 9049.049 9 ’99. 94. '4 399 94.. 4s.

 

     

:4? 94 9.499
0'

 

 

 

      

 

                               

         

. .. ..9., . . _
. _ r 4.. 44.. 0.9 44 94,4... 9:44. 9 .49 49 . 994 . 4 4 . ... 44 4 .4. .. .v 4 0.084991...
4 309“}. ....49.‘ 9 9.9 I .89. 4999499. 4 39094 044 9.....“- 9 9.. 999.. 9 9469. 4 94.999.49.91 9. 91.94 .4499 4.4.4.1.... 9.99. 4.949 . 4. 49.9 9.9 4 04 499. ,9494 (H4. .43“?
..99 9.9494,. 0 99 .9 :9. ((9909.19‘ . .

 

.49. 49 4.499949999 .94. 499.0,..’.9.4.!.
9444 ..9 400944 .99. .44.. 9. .4 .4 ...944994‘9, 9 9 4494. 409949494. 99999 ’9‘w 94. 499049.094 .
.4 940,49 '44. 9C9. . l 99 A94... . 4990.9 9 9 I99 9' 4. 4 9944944 I .4

 

.4 9 .4 0944994. 0 91. 494 r09.4.9§9‘09 .3
.9449 4 4.4%: 41.94.29. .40. ..4. .449
44.99 0.4.

                                

 

      

,..4. 9
”$914399! 0 9.9. 44 . 44 9 4.9.9 {H.495- 9.4494399. §~v¢h 49.

44.494 94. 4 ‘94

 

 

 

 

9. :4 ,9.. . .941. 4. 9 .4 ..4-4.“! ...99 .9434. 94 4. {49 .... .... ..9 ”41944.4.ﬂ.9443|¢44.3)9‘49t $919,144.49?
- .4.” .9 499 . 444 .9 .....9994 A 4999494. 49. 194.494. .4. ..4. 1.4459449, 9 409499404494 .... d.) 490‘. 4 Q
..4 . .9. ..4, 44494,... .99... _ 4.. .4 .4. .9944 9. . 09...}.1...9... 999.104....‘49449449491594'9.a.9 act-449.4499
. 99' . 444.994.99.49. 99 9. .9994. 9..,4991999 9.4940 4..... 4 4 .94.‘A 4499991.4994994999,I9IA4.999‘94§, 54990::
94.94.4409. 4 .4949. 444. 9 44 4.. .44., 4444‘. 404999, 99, 9.4. .4944 . ..9J.‘.9N....x444. 90409919449492! 4.4””99, ,499'999499-9 93“»{39‘
4 44941.... 4. 9 49.44949. 94. .94.,-4 9 . 94.19... ..4. ,9. 9.994... 4. 4.. 9170944494, 9.99.. 9499‘94 4.4449... .1044. 499. 49 0.444444934449309

   

  

 

..9- 9... 949.. 3‘ .949 . . 9 ,9.. 9. 9.9. . 49.9.49 499,9. . .999, '14,?
4.44999 {”9 944. .. 9‘. 4‘9. 4.. ,. .4.-499”“... L99uldﬂ 94.99.9999‘4‘ a
4. . 924.949.5294J444 .9 4.49%.l3hitoﬁu904941944999’WI
494 4.49.. 149.999“. .

 

     

. , 904.“. 44. 99.499904 49 44499.
4 [,9444.90.09 9. . 99. 0. .... 99.94.4594 999. 49.64.449.12

. 994.. 9.
.944... 9.09 Q.

 

 

49.494 9.,.-00,99 ,9.
9.. 4 4. 4 4

      

 

.0 , . ....9.‘ 49. 444. . 44 492,9. 999994.-
.1449, ...-94.9.

 

     

\4 999-
. 1,494... 4.94 994.1,- .9. .9 .444... 19,. .. .. .9, 9.9, .4,

   
 

               

 

9. ..‘99
...4490499! - 9...... 49. 94. ....o, 499.

3.4.. 44 ﬁrm“; $4994.13. .49-4.14199 .1091 4.09.44.

.4...A 9 . .. .r‘ ..9 994... . 994.509.444.49 .9489I9..9..944449.’4.999449994‘.9

..9 .4 .94, 4-4. . . 94 4. 499' 4 ...-,0... 4 (.44- .44’994b92 9. 44.14.094.- 900941994395. , . ‘

. 9.2.9.0944! 94.499.94.99. .909... ’9 9.4. .29 494'"

. .99“. . 4‘29409924999 . 94.49- 0449. 4. :99: 3.9“": .

 

049.4. 9.99 4.04. 9.4 94
.49-’99.. .

 

 

   

49. 9.4. .9, “So-.4....

44.,4

       

          

 

                          

     

   

 

4.94449 . $4994 994...... 949444.944 99. 4.49.9.."u99’9‘9un94'4
b A 9 . . 4 9 4 I 9.4.. .4999 994! 9. 9 944991 9909
, 9449999.... .4. . 49 999.444 4 , 4.4.0.0499. 44 4. . 9.099 .949", .44\9449 4 , . . . .. . .9 ,0 9 14494.44: 9.9 999. 4 4 49.“\449,.- '“9094n49,499,..9959u994. .9 994 {29‘4’499‘9‘4C f9
9 ...9. .1 ., . .9 9.. .99.? 9.9. . . 99 . 99“” .44 4.094.4«494 V‘Q?.J4933919.01“Q991999944,4.40. 4.404449444404994
.4994. a... 9 449‘ . Iv. 9‘ .9 9994 . . . .4 9 9 4949 9. 90.... . . . . . . . 94 ‘ 949 949 299.04.90.9'9 949 94 9.
4.4 a... . . . 449., .249. .4.. . .. . . . . , . . . . 991444.099".- l-NV . ...44.,...,.4t444.Ad-§ ..Il:.‘0 t?! cal-0.99
. . . 99 40195394.),1‘99949 $.04 4944’.
.... y , k4... .49-99.9 4.9,. 999.00.49.46 [99.4.4.1
. 4

 

49
. . ..4.?! ....thi)...6.9.9..“19949994144421849HU9 0999.431“
. .,94..9499.4.9499\499..‘9. .449, 9. 4.4499939794949499\9909.4 .9. 999) 34!.t‘9999943‘l

             

 

   

 

.94 4.4 9.. .4, 499 4. .4 5 099.9499 ‘99 .44.9.9 4 9.9. , .,9440494444’49‘949 494.9440. 9:9.0909’.t9 .9!-

. . . . . . ... 9 I9 4 999 9 . ’4 4. 4 .. 499. 44090. 4d 99 9949,9449 ....9‘90949949‘..949‘094’9 r4.:.0’ it...

..4.. 9..... 9.9.9 ., . . . . . . . . 4 9 494...... . .. .493 9 ....9994 94997—99940 $494.40.! 9.990499999‘ 9999490. 444499994
.44.... 2. . . . . .. . . . . , . . ,. , , . . . .,4 4 4994 9.. . .. 94...!4490 9.9944444. 443.444.44.999-94.’é9gzhgg‘i
. , . . . . . . . .. : . , . 4,949. , ..4994 ....9 {4. {9.1. 9.9044949949‘ 39 4 4 99949499949

. . .. , , . .9 4 , 4.4419449. 449.119.12.939934943419 9 O 943‘ l‘

. . . . . 4,994. ,4 ..99‘99.‘99 4.49 9064. 99.9.4A .9I04494‘944 90.99 ’Ph‘: 3994,9494

. . . 44 A4 ...... ..,, ... 9 :94... 4. ... {494.425.444.44 ,9,. \. .. \ .4. 9999) ..4.;ll “0.334.494.444.34!999I!Ji. 4959;484-

... 19.41.03. 44. r.) 999 . 4-49 9, 9. 9 ...4.4.90.Ao.

 

.9. 4. 4A .949.-

.. . .304. 49.. 4.94.9. | 40-19 Jivx.’
9 99 9 994. 9.999. Q-. 49999.09 .94 .9J4u4'.‘

$.39 3.9949999. IL 4449-‘19999449931 044.9999. .9400"... ..4

 

 

.4495, a .4 991.3540!) .4 9.., 4 49,9

 

 

             

 

 

4.4.9. , .19 .94 . -4. ..4 29.4.5.9. 1.1.4799. ...4494 494:9}: 9.94. 9.943949449444944‘29ti

. 4. ..4.... ... 4. .. . ., . .4 4.1.3949. 44.0. . .4494944,94.9\4.194.99$839 A1399}! i

. .4. 4 4., ..9.. 9.9.... . 9... . 4 4. ..4464. ..4“. . . .949 2‘4... 994 94.....1904'999949. 9 ‘9’)..3‘942499. 449495999 3'09.
.9. 4 ..041999. 44994499.. 9 . 9 ‘. '9 ‘44. 949 4 49.9‘49 Z409

 

 

4.9.99.9 4999.045. 44.44.4454 9.949442!!! 9;..ﬁntbih‘34i

.9499! . J .94.... 9.443 41.199.9“4439“4‘9 9 .
$214-$444... 9.4.3499..- 9 949442.314 509.394 3.0499 994.40.

40. 404.994.51.490. ’9 990,499). 4,999! ‘iiuﬂ, 9i;

                      

. 9,. 40 4. 94 ...4.4..

                  

 

                 
  

   

 

            
   

9 9 . .4 9. 99 99 4 9999 9499.199 9 9‘, 849:6
, 4.34.49 4. ....9 . 4 . .. . . 4‘ .44.“).‘9 {44.494.999.01 4.9. 44.014949 .
9.4.4. . .94 .4 .4. 9 .9993 n. .l. 4 . .. . O 4 ‘94‘-9‘.O 394440449414‘393‘0’ 99.443.99.940,‘ ‘092915’0‘; .9...
. . , 4 ,, . , ,9. . . . .9 4449.1 9 49. .4 4 4.9. 4 90-. 4 .4 949. 4 9 9 ,49..9 9... 9 . 49 9 9. 44 94.990 .9 Pd. 9.9 9494‘0’494 4‘ 4 4.44 .
4— . 99.4.4 .4 . -,. '4. 4. 944.44 . 44.. . . , 9.9.9 . 4...!“99. ,. 9.. ,.4‘ .4 9.99 99994 9 9.99499. 9...) ..4l.‘ 4... 4.9 . . 44)..’94.‘¢Hl491”.’.319$409.’0’9§‘975 994‘:
. 4... , I . 99 . .4... o 4 4.. . . . . .........4. 4 39.9.}.94: , 4 9, .4399... 4. 4 9. 4 . . . 9.- ,: 4.4 444999.... 493.444. 09999.9.‘74999444'9 .4949 499.
.444 49.4 . 9 9. 9 . 9,. 9 ....9 9.9 .. ,9.. h4.9 . z A 9 .444. .194... A.. .94. ...‘499 .9999494 , . . .., . . . ... . . . . . . , . , ,. . . .. . .4 , $519.9994‘9 44 944."‘.9444',J99 {.3413 4, .
. 90.49 9. .4 . 9...‘ I I. ,,.49. 4- 449 49 .4 . - 4.... 9499. 44-999.... .9944 J . 9 9.9144 9 494 94.99494441999 . ., . . . . , , . , . . .. . . _ . . . . .... . . . . . . 4 9944.4..41999-24999‘44404.:. 40.0.44” 44‘. ‘9!¥‘.‘90 ill. ‘0‘.
. 1494.49 44.5449. .4. 9‘42: 4. .944. ....4 .. .99.....4&,.9, 4.444, 4.. .9 94. L9 4949 . . ., .. . _. .. . . . . . . . . 49249994440992.4931} .9494
4 ... 9 ..4. .9 .4 . A. .9.9. .9 9 , . . 4 .94 49.. . . .. H .. . L 99,449.99... 3449’9449’9‘44994999‘ 99.440, 9
. . 9. .94. 4,49. 09...: 4. 4... 94 4.94 9.9.4 .. ,4 J49..909§ on. 499490. 4.4 10.59..“- a... g’oi’I
_ . r . . (A :9 -. 4 94 9944.9...944‘9.’.94. 9.4994485 .99 .9. 349444149

 

...,999934..4...4..~4.499999. . . . . . ,_ . . 4.79.94.94.999
.. 9.699 . 9 9909.399... . , . . . . .. . , ... 4 u. . 99.9.9 ,.94

 

 

«D, . : .. . . . .. . , . ., . 9, '4.A....9.f...94(9..9 (9199:4094!‘4494944.0)942I!'9‘4 14.99.4094" .49.

 

 

 

 

    

 

, . . . .. . . . . 49,. . 494.9194 9.9.9.3.. 403-44991499‘49149334499 igi‘l‘hzc .-
,. .4 A... 4‘ 99 . 9.1.9.4444; 9990. . . 4 , . 4 . . . . . .. A . .. .. , . . . . , . . 4 , . 5:094- , 99.4.99. 94" -444 0.90,. D ‘ ’0‘}... .494‘:.‘9J.‘, 9g).
. 47.929 4 4. .4 .9 449. .4949 999 , , . . .. ., , . . . , . . I. 949.9949: .99.9.. .009. 49.94 ..9... 449.999. 949.:‘949..4944g. 93 QII
.. 4 3. I. . 4 94 . . , .. 4. ...-.4 44999 2.9.4.. .44. 4-4..49440.49994.9-9499249999944999914.‘.99’44049949994999.4011
4 .4 .1. .49 .. L ., .. 4.44.4494. .,........4.......I$4,1 ..4.. . 93.499.491.31... 4:999 {4494991994}: 994984.314 .....-
. .94 9 .9 4 .4 4..4 , 9 . 4 44. 4. 94 .944 . 9 94. 44-4. .444uﬁ944 49.... 4.9949999... '9. O4 0.34,..4194999‘90’4 .99! 49'94494 . 999.4991: 0’0
4... .9: 44 ..4 9. 9.4-4 4.4.9.. 4. 4994. 2744\994994. 9. .P D 1249?
.99 , .4 .9. 99994.9 49.. 4 94.9.. 999.994.429.193. (9594 ‘99 44N499499‘9979IH-‘99hi39'
. .... 42.. ..4..9-,,9. - L .l .. 4 ...99.4 4....49 4. .939 4419‘ 99.! .4 0 9\44 ,9444. 99“. .984413449. , .4... 9299959999,; 9999.399 9‘91;
.. 9 .4. 9. A 4 . A. 9 .. . . , . . . . . . . . .. . ,. .. , , .4.-2:, . (.,. ,... 4. .9 4.1.4.14494449ta.$a..314 .94449 99 4.44.-.‘. 39944944 94...?
.. - .4. . .4. .4 . . . . , .. . . , . . , . . . . . . ,,. . . . . , . . . ..4. I:..-’ 3.499.691.14591444‘9. 49,444.,4..99.94494..:O‘:994.4C 2:99.944 Ortiz A
A4999}- ..999994.99. .9 4 . 4 . , . . .. . .. A . , . . . . Z . . . . , . . . . . . .4'4 ~1.9494I4.4 4.09.4.9? .9J94494”0’,.9.4..044‘9049r i3. \‘49‘

 

9 O. 0 940194 4 I.
.. .44
' ..9949‘4994444

 

.. 4199 4 79:44.9. , . . . . . . . . , . . . 99 , 09 .44'39‘490 . .9‘9‘224, S 4%
4. .4 99.4, 4 . . . . . , . -. . 9. . .4.. , . . . ..9... {“9 0’4...9_|‘994 4‘9 50"499‘ 91.99"“? i
. . ..9 49.4.9. .., . . , , . - 4.9 9,949 -.4 49). 49940399 9 994,094 9. 544‘ .9..9.999£9441.4 9,90499‘9‘Q‘99Hg‘9tm‘i4

                 

 

4 9199494‘441494. 9" ‘9
..9 9‘999‘9938494.“4‘09,4 9‘49. i

.9909. .99 4 4,,
.4A_.‘!44944 4 9 9.4. 9999.

   

 

, 09‘ .999 .4. .94. . . . . . . . .. _ , . . . . .. . , , . .. 4. A 4 ,. . ,. . .. . . . ., . , 394.‘9,\J\‘9,14;‘D9;‘£.0“_4"4\:49,9-§
49 v 4 ... , . _, . .. ,, .,, .. . . . 4 . . . . . . . 4 . . 3.9499099114490999. ‘99904’40.’4,9§9- 990949.983.)

. . .4 $194.94.- 4 . . . , .. ,. , . . .. . .. . A . . . . . .. , . , . .. . . . . . . . . , 4999.94.14.3'9 4. 94.9494. 4944 .9994.l9444.99.;.’|94.914999‘§.J44“ 44.419559 9&0,
,. , , .. . . , r . . . ,. .. . . . , . . . . .. . , ,. . . 4 3.494 44.141.494.940. 9499‘;49449.999. 994.730.949.944 4912999399.!

. .. ..944. 9 9ll‘f9o9‘119' 9.94 49,4449‘. 994.“.9491? 4999 449999949954, 4‘943’49‘9.
., ..I. 4 42"; . 9.444 99400999.. 99,399,."091

  

        

.9449}: 944.
0 A 04

.9. 494. ..499 .9994. ...4 ,, ..4 . .9 .4 .4 9.9
9.4.1. 4.9:- 949494: .4.. 4,0
. 4.9. .9 ’9... .9.4.9.:..96,,.494.

   

9‘
444-459.,4914 9...
93.14.. 9

 

 

.4999
..4 .944 (44
. 404,... . .,.. ..

 

9.4 94.49, 999! b 4.§A 4 '3-.. 94'.
..49... 44. ,, 49. . ..4-9 I4 4 «..,-99 AIM...4 9. 99. ,944..IO44I491r49994-,9l44§994.4.299:9.91969
4.49 499.191.! 4.9.949. ,9..-4454:9039 9 “3.4099949. 4-19:199“; $3":

.4... 99.9494. 7.. 9 .94 . .44 4:9 4.99494 49,499. 03.9.9444. 99999, 99.4949

  

 

      

                       
                         

   

    

    

”..4. E 5......" u
..n-.%..: 49!.9? ..w44,. A

9.4,,90949.‘ -9
.994. 9.,-9994 .440

 

 

Z 1.4.99.4..9. 4944 994 9449.49 4 .4ﬁ9992‘999’19’aio'944 £440,129...
. ,, . 49444914914919. ...-0.039. 4.49995!- 49 0919959.. .9k‘9999..9{409
...-.9 a ..4... 9944.99494- 9991.39.99. 0.49049.‘.4.}a9979-4.199499.94.4.94‘9 4:944

..99 4.. . 4 .44.,9. .
. 4.5 .4944... 4. , 94. 9.449 .99 4191.9 .99. 4.9;.90‘999940‘34543949999949.4999.

 

...9. .IA. 9- 4. 9. ‘9‘. O .. .I . Alvis-x. 9|.9 4 .99“- 99.. 9142‘s.. .1.
2.9. 94 .94 0349’|94990.\I4.v"| ’99.. 9.49.}. 9.6 . 19 9 . 99H l4.9$9 l.\
4.9 4.40 4 24.9.49; 9'9 4.94'.9.' 9 0. p.994 9 ”.443: 90“. 9. A . 4999
.9 4 9.944 4 .4 4 ._ .. . I . .9 . , . . . .... l.

4 ., 94 49. .4.. ‘..4 9. 4.. .4
: , . . 9 .

     

    

  

  
 

  

. - 1...... . .... 9.: 43%. .- 12 _ ._ $444414.” , .. _. .. 4.4.4 .424 . «$1.45.... 9L... .4.. 2.419;... .3099...» rm.” 5.”..34w4m- ..4 .nMuA.W..-u...mmw-..._ . .,.-.4....
.. 4 9. ,.9. 9 . 4 4 4 04 .i 9 .4 4 4
4.44... ......mwmm. ....uw .__.. ..n.....+.....run._m $4.4 ... 9.4%} .wwwt 5x- ..Amjé? ......a “gray-,..... 4419......ei ”Kauai?” . ..a ......4............ in». aw, ,. wﬁnﬁw-
.9 0. 91.4“.“0‘4 4.9 9999 40.. 9.049194 4, 3.9 94.49.99 444” 4.9.449"...;9 9 499. .49Q.99§.. 44.94 “9.094. 3.4.9.4413... 94.4.75...” 9. .44-4?? . ...".4994 94.9 4.94.49“... 4.4.. .9. 4... ... ..0.‘..4,199.? 4499. 4.99. .9 .9
9 99. 43.99.49.144; 94,919.! . ,

 

 

 

 

} 2 LIBRARY
2 O * ' Michigan State
In .

 

 

 

This is to certify that the
dissertation entitled

APPLICATION OF SIMULTANEOUS CONFIDENCE BANDS
IN STATISTICAL INFERENCE FOR HETEROSCEDASTIC,
HIGH DIMENSIONAL AND FUNCTIONAL DATA

presented by

Qiongxia Song

has been accepted towards fulﬁllment
of the requirements for the

Doctoral degree in Statistics

 

 

Major Professor’sT‘Sirgnature
8/ 7 / 20/0
/ ,

Date

MSU is an Afﬁrmative Action/Equal Opportunity Employer

“‘4-L-.-.— —.---o-

 

-A_.-A-l-I-01I-'-'E'-'Al—n-l

 

gnu--.--o—o-l-A-'----—-u--o-a--o-I-A-A-A—A

..-.-.c--.-.-c-

PLACE IN RETURN Box to remove this checkout from your record.
To AVOID FINES return on or before date due.
MAY BE RECALLED with earlier due date if requested.

 

DATE DUE DATE DUE DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

5’08 KIProi/Aoc8Pres/ClRC/Dateouojndd

 

APPLICATION OF SIMULTANEOUS CONFIDENCE BANDS IN STATISTICAL
INFERENCE FOR HETEROSCEDASTIC, HIGH DIMENSIONAL AND
FUNCTIONAL DATA

By

Qiongxia Song

A DISSERTATION
Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
Statistics

2010

ABSTRACT

APPLICATION OF SIMULTANEOUS CONFIDENCE BANDS IN
STATISTICAL INFERENCE FOR HETEROSCEDASTIC, HIGH
DIMENSIONAL AND FUNCTIONAL DATA

By

Qiongxia Song

This dissertation studies simultaneous conﬁdence bands for heteroscedastic, high di-
mensional and functional data with their applications in statistical inference.

Nonparametric simultaneous conﬁdence bands are a powerful tool of global infer-
ence for functions. Chapter 1 provides a bird’s eye view of the state—of-the—art and
challenges for constructing such conﬁdence bands, a brief introduction to later chap-
ters. An introduction to the nonlinear spline smoothing and local linear smoothing
is also provided in chapter 1.

In Chapter 2, asymptotically exact and conservative conﬁdence bands are obtained
for possibly heteroscedastic variance function, using piecewise constant and piecewise
linear spline estimation, respectively. The variance estimation possesses oracle ef-
ﬁciency and the widths of the conﬁdence bands are of optimal order. Simulation
experiments provide strong evidence that corroborates the asymptotic theory while
the computing is extremely fast. Also, in simulation, the proposed conﬁdence bands
is compared with some other testing heteroscedasticity methods. As illustration of
the applicability of the methods, the linear spline band has been applied to test for
heteroscedasticity in a fossil data and in the motorcycle data.

Chapter 3 provides the method for constructing simultaneous conﬁdence bands
for nonlinear additive autoregressive models (NAAR), which have found wide use in
recent years to reduce dimension in nonparametric smoothing of time series. Under
weak conditions of smoothness and mixing, we propose spline-backﬁtted spline (SBS)

estimators of the component functions for nonlinear additive autoregressive model

that is both computationally expedient for analyzing high dimensional large time se—
ries data, and theoretically reliable as the estimator is oracally efﬁcient and comes
with asymptotically simultaneous conﬁdence band. Simulation evidence strongly cor-
roborates with the asymptotic theory.

Chapter 4 focuses on constructing conﬁdence bands for densely spaced functional
data. We illustrate the use of local linear smoothing to construct simultaneous con—
ﬁdence bands for the mean function. Our approach works under mild conditions for
the case of densely spaced observations and differs from sparse and irregular longi-
tudinal data. Simulation experiments provide strong evidence that corroborates the
asymptotic theory. The conﬁdence band procedure is illustrated by analyzing the

near infrared spectroscopy data.

ACKNOWLEDGMENT

I would like to thank many people who have helped me on the path towards this
dissertation. Most importantly, I want to thank my thesis adviser, Prof. Lijian Yang,
who set me on the right path after I have joined his group. His systematic guidance
and constant push not only work as the major sources of the completion of this
dissertation and its corresponding publications, but also cultivates me from a student
to an independent researcher. His support and encouragement in every research
related aspect sustain my conﬁdence in entering a highly competitive academic area.

I also wish to express my gratitude to my dissertation committee, Professor Con-
nie Page, Professor Yuehua Cui and Professor Timothy Vogelsang, for sparing their
precious time to serve on my committee and giving valuable comments and sugges-
tions.

I am grateful to the entire faculty and staff in the Department of Statistics and
Probability who have taught me and assisted me during my study at MSU. My
special thanks go to Prof. James Stapleton for his numerous help, constant support
and encouragement.

My PhD degree will not be completed without the help and support from my
family. My parents and my sister have been giving their support from the other side
of the earth. My husband, Weihua Geng, is the one I can really rely on when I have
to face difﬁculties and challenges. I owe them a great deal.

Michigan State University is such a great university to study, research and live.
In particular, I am grateful to the graduate school and the Department of Statistics
and Probability in terms of providing me Dissertation Completion Fellowship. This
dissertation is also supported by Prof. Yang’s NSF grant award DMS 0706518. I
would like thank the group members for their generous help, and they are Dr. Lan
Xue, Dr. Jing Wang, Dr. Lily Wang, Dr. Rong Liu, Mrs. Shujie Ma, Mr. Shuzhuan

Zheng and Mrs. Guanqun Cao.

iv

TABLE OF CONTENTS

List of Tables ................................. vii
List of Figures ................................ viii
Introduction to conﬁdence bands ................ 1
1.1 Status and challenges ........................... 1
1.2 Nonparametric smoothing ........................ 3
1.3 Variance function bands ........................ g . 5
1.4 SBS estimate and NAAR models bands ................. 6
1.5 Functional data bands .......................... 7
Spline conﬁdence bands for variance function .......... 10
2.1 Introduction ................................ 10
2.2 Main results ................................ 12
2.3 Error decomposition ........................... 19
2.4 Implementation .............................. 21
2.4.1 Implementing the exact band .................. 24
2.4.2 Implementing the conservative band .............. 24
2.4.3 Implementing the bootstrap band ............... 26
2.5 Examples ................................. 28
2.5.1 Simulation example ........................ 28
2.5.2 Fossil data and motorcycle data ................ 29
2.6 Appendix ................................. 32
Oracally efﬁcient spline smoothing of NAAR models with simulta—
neous conﬁdence bands ..................... 48
3.1 Introduction ................................ 48
3.2 The SBS estimator ............................ 51
3.3 Decomposition .............................. 57
3.4 Simulation example ............................ 60
3.5 Appendix ................................. 63
A simultaneous conﬁdence band for dense longitudinal regression 87
4.1 Introduction ................................ 87
4.2 Main results ................................ 89
4.3 Decomposition .............................. 93
4.4 Implementation .............................. 94
4.5 Simulation ................................. 95
4.6 Empirical example ............................ 96

4.7 Appendix ................................. 96
4.7.1 Preliminaries ........................... 96

4.7.2 Proof of Theorem ........................ 97

5 Summary of thesis contribution ................. 108
Bibliography .......................... 110

vi

2.1

2.2

2.3

3.1

3.2

4.1

LIST OF TABLES

Coverage probabilities for c = 100 from 500 replications. ....... 30
Coverage probabilities for c = 5 from 500 replications. ........ 31
Simulated rejection probabilities of test homoscedasticity from 500

replications. ............................... 32
Coverage frequencies from 500 replications. .............. 62
Comparison of computing time of Model (3.23). ............ 63
Coverage frequencies from 200 replications. .............. 96

vii

2.1

2.2

2.3

2.4

3.1

3.2

LIST OF FIGURES

For data generated from model (2.22) (with 00 = .5, c = 100) of differ-
ent sample size n and conﬁdence level 1 — 0, plots of conﬁdence bands
for variance (thick solid), the linear spline estimator 6% 2 (9:) (dotted),

and the true function a2 (:r) (solid). The bands are computed from
bootstrap method. ............................

For data generated from model (2.22) (with 00 = .5, c = 5) of different
sample size n and conﬁdence level 1 — a, plots of conﬁdence bands
for variance (thick solid), the linear spline estimator (“7% 2 (2:) (dotted),

and the true function 02 (3:) (solid). The bands are computed from
bootstrap method. ............................

For the fossil data, plots of variance conﬁdence bands (thick solid)
computed by bootstrap method, the linear spline estimator 6% 2 (1:)
(dotted) and a constant variance function that ﬁts in the conﬁdence
band (solid). The lower picture is the data scatter plot and the conﬁ-
dence band for mean (thin solid). ....................

For the motorcycle data, plots of variance conﬁdence bands (thick
2

solid) computed by bootstrap method, the linear spline estimator 62 2 (2:)

(dotted) and a constant variance function that ﬁts in the conﬁdence
band (solid). The lower picture is the data scatter plot and the conﬁ-
dence band for mean (thin solid). ....................
Plots of the eﬂiciency of SBS estimator ma,SBS corresponding to or-
acle smoother 7720’s for d = 4 and p = 0 (upper panel), p = .3 (lower
panel) of ma (ma) in (3.24), for a = 1 (thick curve for n = 1000, thin
curve for n = 500, and solid curve for n = 100). ............

Plots of the efﬁciency of SBS estimator ma SBS corresponding to or-

acle smoother ﬁrms for d = 10 and p = 0 (upper panel), p = .3 (lower
panel) of ma (13a) in (3.24), for a = 1 (thick curve for n = 1000, thin
curve for n = 500, and solid curve for n = 100). ............

viii

46

47

83

84

3.3

3.4

4.1

4.2

4.3

For p = 0, plots of the oracle smoother ﬁrms (dotted curve), SBS
estimator mmSBS (solid curve) and the 95% conﬁdence bands (upper

and lower dashed curves) of the function components ma (Lita) in (3.9)
with a = 1 (thin solid curve). ......................

For p = .3, plots of the oracle smoother mas (dotted curve), SBS
estimator ma,SBS (solid curve) and the 95% conﬁdence bands (upper

and lower dashed curves) of the function components ma (ma) in (3.9)
with a = 1 (thin solid curve). ......................

For data generated from model (4.11) (with 00 = .5) of different sample
size n and conﬁdence level 95%, plots of conﬁdence bands for mean
(dashed lines), the local linear estimator ﬁt (11:) (dotted line), and the
true function m (11:) (thick solid line). ..................
For data generated from model (4.11) (with 00 = 1) of different sample
size n and conﬁdence level 99%, plots of conﬁdence bands for mean
(dashed lines), the local linear estimator ﬁt (:13) (dotted line), and the
true function m (x) (thick solid line). ..................
The upper plot shows the Tecator data with the 95% conﬁdence band
(dashed thick lines) for the mean estimate (thick solid line). The lower
plot is the conﬁdence band (thin dashed lines) for the mean estimate
(thick solid line) in a different scale ....................

ix

85

86

105

106

107

Chapter 1

Introduction to conﬁdence bands

1.1 Status and challenges

Nonparametric regression has gained much attention since it relaxes the usual as-
sumption of linearity and enables one to explore the data more ﬂexibly. Many of the
properties of nonparametric regression estimators have been thoroughly investigated.
However, as Eubank and Speckman [13] pointed out, techniques for constructing in-
terval estimates to accompany the regression function estimators have been slow to
develop, even in the case of independent and identically distributed (IID) observa-

tions.

Consider the nonparametric regression model
Y,=m(X,-)+e,-, i=1,2,---,n. (1.1)

A natural deﬁnition for asymptotic exact (conservative) 100(1 —a)% conﬁdence bands
for an unknown function m(:i:) over interval [a, b] consists of an estimator rh(x) of

m(:r), lower and upper conﬁdence limit (In,L($) and ln,U($)) at every 2: E [a, b],

1

such that,

"Emmi? {m(:r) e [m(:z:) — sz (11:),rh(:1:) +1”), (23)] ,Vx 6 [a,b]} = 1— a,
I

17153) 1513}? {m (:13) e [m (:r) — 1”,]; (x) ,m (x) + In), (29] ,Vx e [a, b]

Conﬁdence bands are closely related to conﬁdence intervals, which represent the
uncertainty in an estimate of a single numerical value. While, conﬁdence bands
arise whenever a statistical analysis focuses on estimating a function or constructing
interval estimates. A conﬁdence band is used in statistical analysis to represent the
uncertainty in an estimate of a curve or function based on limited or noisy data.
For instance, with the simultaneous conﬁdence bands, we can test whether m is of
certain parametric form: H0 : m = me, where 6 E 9 and 9 is a parameter space.
For example, we can test whether m = c with c a constant or we can test whether
m(:c) = 60 + 31:1: with (60, 61) linear regression estimate. If so, then we accept at

level 1 — a the null hypothesis that m is constant or linear. Otherwise H0 is rejected.

Construction simultaneous conﬁdence bands has been developed slowly since it is
difﬁcult to establish asymptotic sample distribution theory for nonparametric re-
gression estimates. In the last two decades, many statisticians have worked on
the theory and applications of nonparametric simultaneous conﬁdence bands, see

[7, 13, 16, 22, 23, 25, 74, 75, 87, 89].

All these methods are local polynomial smoothing based. Conﬁdence bands of
kernel type estimators are computationally intensive since a least square estimation
has to be done at every point. In contrast, it is enough to solve only one least square
to get the polynomial spline estimator. Recently, some research has been done to
provide conﬁdence bands results using polynomial spline smoothing. See, Wang and
Yang [70] and Wang and Yang [72]. For the application, see Wang et. a1. [30]. In

this thesis, I tackle this difﬁcult problem in many scenarios, using polynomial spline

2

smoothing mainly. In this introductory chapter, I state, without proof, those basic
facts about our target models. We construct conﬁdence bands for all these models

with statistical inference.

1.2 Nonparametric smoothing

Smoothing techniquas make an important class of tools for identifying the true signal
hidden in highly noisy data. They offer the art of nonlinear curve/ surface estimation
by relaxing the linear assumption in regression and have very broad applications in
many areas. I give a brief introduction to the smoothing techniques used in our
research and analysis, namely regression splines and kernel smoothers.

Regression spline smoothing is a projection method for ﬁtting splines. Let { X 2°, Yi}?=1
be a strictly stationary process. Assume that Xi, 1 S i S n are supported on a com-
pact interval [a,b]. Polynomial splines begin by choosing a set of knots, and a set
of basis functions spanning a set of piecewise polynomials satisfying continuity and
smoothness constraints.

Let a =t1—k = = t0 <t1< < tN+1 = = tN+k = b beasequence
of equally spaced knots, dividing [a,b] into (N + 1) subintervals of length h = (b —
a) / (N + 1). The j-th B-spline of order k for the knot sequence T denoted by B ',k is

J
recursively deﬁned by the de Boor [10], i.e.

= (u -' tlej,k—1(“) _ (u - tj+lej+1,k—1(“)

 

 

BiW) J—ijSN
9’ tj+k—1 - tj tj+k — tj+1
for k > 1, with
1 t- < u < t -
— +1.
Bj,1(u) = ‘7 ’7

0 otherwise,

N
We denote by C(p-2)[a, b] the linear space spanned by {Bip (2:7) }J-1—k’ whose

3

elements are C(pT2)[a,b] functions that are polynomials of degree p — 1 on each

subinterval. We denote by
C (p) [a, b] = {mlthe p-th order derivative of m is continuous 0n[a, b]}.

The polynomial spline estimator for regression model (1.1) is

7int) = argmin {Yz' - 9(Xz')}2, k > 0
g(.)eG(k_2)IaibI 2521

Locally linear smoothing is used for the last chapter to develop the conﬁdence
bands for functional data. This smoother combines the strict local nature of the
data and the smooth weights of kernel smoothers. Kernel smoothers are expensive
to compute (0(n2) for the whole sequence), but are visually smooth if the kernel is
smooth.

A local linear approximation is
M1303 0 + 5(32' — I)

The local approximation can be ﬁtted by locally weighted least squares. A weight
function and bandwidth are deﬁned as kernel regression. In the case of local linear

regression, coefﬁcient estimates 6 and b are chosen to minimize

11
(6,6) =argmin {yi—a—b(:rZ-—:r)}2Kh (xi—2:)
i=1

with Kh (u) = IIIK (7%), h = hn —r 0, as n —> 00. When (XTWX) is invertible,

one has the explicit representation

a = 63‘ (XTWX)_1XTWY

4

in which Y = (Y1, . . . ,Yn)T, eg; 2 (1, 0), and the design matrix X is

1 (5171—33)

1 (am—3:) n><2

and W =diag{K (51515)};1.

1.3 Variance function bands

The importance of being able to detect heteroscedasticity in regression is widely
recognized because of efficient inference for the regression function requires that het—
eroscedasticity is taken into account. In many applications of regression models the
usual assumptions of homoscedastic disturbances cannot be guaranteed a priori. Al-
though the problem of testing hypothesis regarding the regression function has been
discussed by many researchers much less attention has been paid to the problem of
testing hypotheses regarding the variance structure in a nonparametric regression
model. By constructing conﬁdence bands for variance function, we provide a simple
consistent test for heteroscedasticity in a nonparametric regression set-up.

In the second chapter, we propose polynomial spline conﬁdence bands for het-
eroscedastic variance function in a nonparametric regression model, and the result is
the only existing conﬁdence band result for variance functions. The greatest advan-
tages of polynomial spline estimation are its simplicity of implementation and fast
computation. It is desirable from a theoretical as well as a practical point of view to
have conﬁdence bands for polynomial spline estimators.

We assume that observations {(Xi, Yi) ”1:1 and unobserved errors {52]le are

i.i.d. copies of (X, Y, s) satisfying the regression model (1.1) where the error 5 is

5

conditional noise, with E (5 |X ) E 0, E (82 IX) E 02 (X). We constructed a si-
multaneous conﬁdence band for 02 (:12) over [a, b]. In addition, the proposed variance
estimator is asymptotically as efﬁcient as the infeasible estimator, i.e., the asymp-
totic mean squared error is as small as if the conditional mean function m (:c) is given
(equivalently, as if the unobservable error e is actually observed).

We applied our result on a motorcycle data. The result shows that with a p—
value as small as 0.008, one rejects the null hypotheses that the conditional variance
function of the data is a constant as no horizontal line can be squeezed into the
99.2% variance function conﬁdence band. The details of the theoretic results and

applications are the content of the chapter two.

1.4 SBS estimate and NAAR models bands

Non— and semiparametric smoothing has been proven to be useful for analyzing com-
plex time series data due to the ﬂexibility to “let the data speak for themselves”. One
unavoidable issue in high dimensional smoothing is the “curse of dimensionality”, i.e.,
the poor convergence rate of nonparametric estimation of multivariate functions. Ad-
ditive regression models has been found wide use in recent years to reduce dimension
in nonparametric smoothing of time serials.

A nonlinear additive autoregressive model (NAAR) is of the form

d
Yi = m (Xi) +8i’ m (121, ...,SL‘d) = C+ 2 m7 ((137), (1.2)
7:1

72.
where the sequence (IQ, xi} 1 is a length n realization of a (d + 1)-dimensional
z:
strictly stationary process, the d-variate functions m (.) and a (-) are the mean and
standard deviation of the response Yz' conditional on the predictor vector X,- =

{Xi1,...,Xz-d}T, and E(€,- [Xi) = 0,E(522 [Xi) = a2 0‘2] In the context of

6

N AAR, each predictor X”, 1 S 'y g d can be observed lagged values of Y2" such as
X237 = Yi-w or of a different times series.
Inference of model (1.2) centers on the estimation and testing of {m/y (-)},C;___1.

The two-step estimators for model (1.2) possess oracle efﬁciency. If all compo-
d

nents {m - } and the constant c were known and removed from the re—
ﬂ( ) ﬂ=1ﬂaé7 n
sponses, one could estimate m7 () from the univariate data {IQ-,7, XIV} __ in which
n n _
{Yi'llzél are latent oracle responses to the 'y-th covariate {Xi7}i=1’
d
Yiy=m7(Xz')+5i=Yi_C— Z mﬁ(Xiﬁ),lsiSn,IS’YSd-
ﬁ=1ﬂ757

For the NAAR time series models, however, none of the existing methods pro-
vide any simultaneous conﬁdence band for may (). To address this need, we propose
an all new spline+spline oracally efﬁcient estimator that is theoretically superior as
it comes with an asymptotically simultaneous conﬁdence band for my (), and also
computationally more expedient than any existing estimators due to the use of spline

instead of kernel in all steps.

1.5 Functional data bands

Traditional statistical methods fail often as we deal with functional data. Indeed, if
for instance we consider a sample of ﬁnely discretized curves, two crucial statistical
problems appear. The ﬁrst comes from the ratio between the size of the sample and
the number of variables (each real variable corresponding to one discretized point).
The second, is due to the existence of strong correlations between the variables and
becomes an ill-conditioned problem in the context of multivariate linear model. So,
there is a real necessity to develop statistical methods/ models in order to take into

account the functional structure of this kind of data.

Functional data with different design are increasingly common in modern data
analysis. A simultaneous conﬁdence band for this data set has been more and more

in need. A functional data set has the form {X,- ij} ,1 S 2' S n,1 S j S N, in

h

j 7
which N observations are taken for each subject, with X,- j and Y, j the jt
th

predictor
and response variables, respectively, for the 2’ subject. In this paper we only deal
with the equally spaced design. Without loss of generality, the predictor X,j takes
values {1/N,2/N,...,N/N} for the ith subject, 2' = 1,2,...,n. For the ith subject,
its sample path { j /N , Yij} is the noisy realization of a continuous time stochastic
process €,-(:c) in the sense that I’,,- = 5,- (j /N ) + a (j /N ) 6,, ,with errors 5,- j satisfying
E (5,3) = 0, E9322]. = 1, and {€,-(x),:z: E X} are iid copies ofa process {£(rr),:c E X}
which is L2, i.e., EfX €2($)da: < +oo.

For the standard process {€(x),:r E X}, one deﬁnes the mean function m(a:) =
E{€(:z:)} and the covariance function G (mal) = cov {€(x),§(:r’)}. Let sequences
{Ak}z<_)__1,{wk(x)}z:1 be the eigenvalues and eigenfunctions of G (x,x’) respec-
tively, in which A1 2 A2 2 2 0,2211% < oo, {’t/Jk}zO=1 form an orthonor-
mal basis of L2 (X) and G (1r,:c’) = 220:1 Aktpk(a:)z/2k (x’), which implies that
fa (35,3!) pk (1") dx’ = Amp/C(23).

The process {{,-(:c), :1: E X } allows the Karhunen-Loeve L2 representation

62-(10) = m) + 2:, max).

where the random coefﬁcients §,k are uncorrelated with mean 0 and variances 1, and
the functions ¢k = ,/,\k2pk. In what follows, we assume that ’\k = 0, for k > K,
where K. is a positive integer or +00, thus G(:c,:r’) = Zz=1¢k($)¢k (513’) and the

data generating process is now written as

n,- = m (j/N) + 22:, em (gr/N) + 0 cm 5.,- (1.3)

8

The sequences {Ak}g=1 , {¢k(:r)}z___1 and the random coefﬁcients ail; exist mathe-
matically but are unknown and unobservable.

T wo distinct types of functional data have been studied: sparse longitudinal data
(1 S j S N,- and N,’s are iid copies of an integer valued positive random variable)
and dense functional data (N,- —> 00 as n —» 00). For the dense functional data,
strong uniform convergence rates are developed for local-linear smooth estimators,
but without uniform conﬁdence bands. The fact that simultaneous conﬁdence band
has not been established for functional data analysis is certainly not due to lack
of interesting applications, but to the greater technical difficulty to formulate such
bands for functional data and establish their theoretical properties. In this thesis, we
present simultaneous conﬁdence bands for m(:r) in dense longitudinal data given in

(1.3) via local linear smoothing approach.

Chapter 2

Spline conﬁdence bands for

variance function

2.1 Introduction

Quantiﬁcation of local variability of regression data is an indispensable ingredient for
many scientiﬁc investigations. The most intuitive measure of such is the conditional
variance function, whose estimation has been the subject of Miiller and Stadtmiiller
[50], Hall and Carroll [20], Ruppert et. al. [61] and Fan and Yao [15], which em-
ployed kernel type smoothing methods for the nonparametric variance function. Sim-
ilar smoothing methods have also been used to estimate noise-to—signal ratio in Yao
and Tong [83] with applications to time series volatility estimation. These existing
works estimate the conditional variance function via kernel smoothing of the squares
of residuals from an initial kernel smoothing of the regression data. Such two-stage
smoothing technique has also been used in estimating homoscedastic variance in Hall
and Marron [21]. More recently, a new approach to variance estimation based on dif-
ferencing has been proposed, which can successfully handle serially correlated errors,

see Dahl and Levine [9] and Brown and Levine [4].

10

What has been lacking is uniform conﬁdence band for the whole variance curve
over an entire bounded range, and explicit formula for the estimated variance function.
The former is useful for making inference on the shape of the variance function, such
as testing of homoscedasticity, while the latter is appealing to practitioners without
much statistics expertise but wish to implement nonparametric procedures. Uniform
conﬁdence bands have been constructed for conditional mean function in Hall and
Titterington [26], Hardle [23], Xia [75], Claeskens and Van Keilegom [7], and for
probability density function in Bickel and Rosenblatt [1]. All these and other related
works such as Mack and Silverman [46], are based on kernel smoothing and make use
of the “Hungarian embedding” type results such as in Rosenblatt [59] and Tusnady
[69]. More recently, Zhao et. a1. [89], Wang and Yang [70] constructed conﬁdence
bands for conditional mean function using polynomial spline method with explicit
formulae for both the estimated conditional mean function and the conﬁdence band.
In particular, Wang and Yang [70] allows for heteroscedastic and nonnormal errors,

and is useful for testing hypothesis on the shape of regression curve.

In this chapter, we propose polynomial spline conﬁdence bands for heteroscedastic
variance function in a nonparametric regression model. The greatest advantages of
polynomial spline estimation are its simplicity of implementation and fast computa-
tion, see for instance, Stone [67] and Huang [28] for the basic theory of polynomial
spline smoothing, and Xue and Yang [76] for computing speed comparison of spline
vs. kernel smoothing. Hence, it is desirable from a theoretical as well as a practical

point of view to have conﬁdence bands for polynomial spline estimators.

We assume that observations { (X,-, Y,) [1:1 and unobserved errors {s,-}?=1 are

i.i.d. copies of (X, Y, 5) satisfying the regression model

Y=m(X)+€, (2-1)

11

where the error 5 is conditional noise, with E (5 |X ) E 0, E (E2 [X ) _=_ 02 (X), see
Assumption (A4) in Section 2.2 for details. The conditional mean and conditional
variance functions m(:r) and 02 (2:), deﬁned on interval [a, b], need not be of any
known form.

Our goal is to construct a simultaneous conﬁdence band for 02 (1:) over [a,b].
In addition, the proposed variance estimator is asymptotically as efficient as the
infeasible estimator, i.e., the asymptotic mean squared error is as small as if the
conditional mean function m (2:) is given (equivalently, as if the unobservable error
e is actually observed). As an example, consider the motor cycle data, Figure 2.4
shows that with a p-value as small as 0.008, one rejects the null hypotheses that the
conditional variance function of the data is a constant as no horizontal line can be
squeezed into the 99.2% variance function conﬁdence band. For other methods of
testing the heteroscedasticity or the lack-of-ﬁt of regression function, see Dette and
Munk [11] and Bissantz et. a1. [2], and Section 2.5 for simulation comparison of our
method with that of Dette and Munk [11].

The chapter is based on a published work Song and Yang [63], and the chapter is
organized as follows. In Section 2.2, we state our main results on variance conﬁdence
bands using constant / linear splines. In Section 2.3 we investigate the error structure
of spline variance estimators leading to insights of proof. We give the actual steps
to implement the conﬁdence band in Section 2.4, and in Section 2.5, we report sim-
ulation results and applications to a fossil data and the well known motorcycle data.

Appendix contains all the technical proofs needed for the main results.

2.2 Main results

An asymptotic exact and conservative 100 (1 — a) % conﬁdence band for the unknown

02 (1r) over the interval [a, b] consists of an estimator 62 (5c) of a2 (2:), lower and upper

12

conﬁdence limits 62 (2:) — ln,L (2:), 62 (2:) + ln,U (2:) at every x E [a, b] such that

nli_)m@P{o2(x)€ [62(2: 2:)—l,,L(2:,) 02:2( 2:)+an(2: 2:)],V2:E[a,b]} = l—a,

ggiggP{02($)€[62(x)—l,,’L(2:),62 (x)+l,,,U(2:)],V$E[a,b]} 2 1—0.

respectively.

If the mean function m(2:) were known, one could compute the errors 5,- =
Y,- — m (X,) ,1 S 2' S n and make use of the fact that E (8,2 ]X,- = 2:) .=_ 02 (2:) to
carry out polynomial spline regression of the data {(X,, Z,-) lfl=1t in which Z,- = 5,2

are the squared errors. Speciﬁcally, one could deﬁne the “infeasible estimator” of

_ 2?: Zi-g X,) 2,in
960%: 2),”), 1f ( l

the variance function as 5,2,2 (2:) = argmin

which Gggz 2) — G (pg 2) [a, b] is the space of functions that are piecewise polyno-

mials of degree (p2 —— 12) on interval [a, b], deﬁned precisely below, for some positive

integer p2.

To mimic the above unattainable spline smoother, we deﬁne

. n . 2
p1,p2(>=?;gr31213 Zi=1{Zz’,p1—9(Xi)} , (2.2)
gEGN22 [a,b]

,2 . .' . . .
where 22.19152,“ are the squares of resrduals 5,4,1 obtained from spline regressron,

a,“ = y,- — mm (X,), 1 g i g n, (2.3)
for some positive integer p1, in which

mp1 (2:)—- — argmin 2:27.;1 {1",- — g (X,) }2 . (2.4)
gEG(pl1 2)[a, b]

13

To introduce spline functions, for the two steps V = 1, 2, we divide the ﬁnite interval
[a,b] into (NI/+1) subintervals Jj = [tj,tj+1) , j = 0, ....,Nu—1, JNV = [tNV’ b] .

A sequence of equally-spaced interior knots {tj} _ V1, are given as

t0=a<t1<---<tNV<b=tNV+1,tj=a+jhu,j=0,1,...,N1/+1,

in which hy = (b — a) / (NV +1) is the distance between neighboring knots. We
denote by G56:_2) = 0]];5—2) [a, b] the space of functions that are polynomials of
degree (191/ — 1) on each Jj and have continuous (pu - 2)th derivative. For example,
G N11, denotes the space of functions that are constant on each J j’ and G9,,” the space
of functions that are linear on each Jj and continuous on [a, b].

In what follows, II'IIoo denotes the supremum norm of a function w on [a, b], i.e.
llwlloO = squ€[a,b] [w (23)], and the moduli of continuity of a continuous function 21)
on [a,b] is denoted by w (w, hy) = maxx,2:’€[a,b],|x—2:’[Shy [21) (2:) — w (55’) I. That
limOw (w, hy) = 0 follows from the uniform continuity of w on compact [a, b].
VCur approach is to construct the error bound function In (2:) around the spline

estimators (“7,244,2 (2:). The technical assumptions we need are as follows:
(A1) The regression function m (-) E G (p 1) [a, b].
(A2) The density function f () of X is continuous and positive on the interval [a, b] .

(A3) The subinterval length hV ~ n—l/(ZPV‘H), i. e., the number of interior knots
NV N nl/(ZPV+1),V 21,2

(A4) The joint distribution F (23,5) of random variables (X, 8) satisﬁes:

(a) There exists a positive value 1} > 1/p2 and a ﬁnite positive M” such that

sup E (lel4+277 IX = 2:) < MU:
2:6[a,b]

14

(b) The error is conditional noise: E(e|X = :c) E 0, E (e2 [X = cc) E o2 (2:)
with E (e4 [X = 2:) E p4 (2:) which is a positive function on [a, b] with bounded
variation. The variance function 02 () E C(p2) [a,b] and has a positive lower

bound on [a, b].

Assumptions (A1)-(A4) are adapted from [70] for sample {(X,,Z,-)}?=1. In
particular, Assumption (A4) (a) implies that var (52 |X = 2:) E p4 (cc) — o4 (2:)
is the conditional variance of Z = 52, denoted as v% (2:). We denote also pa: =
min (p1,p2) ,p* = max (101,122) ,N* = min (N1,N2),N* = max (N1,N2). The idea
of allowing different degrees of smoothness for m and 0 comes from one referee.

To properly deﬁne the conﬁdence bands, we denote for any 2: E [a, b], deﬁne its

location and relative position indices ju (2:) ,ru (2:) as

J}, (:c) = my (:5) = min {[(2: — a)/hy],N1/}, n, (2:) = {2: —— 5.11,”) /h,,. (2.5)

tint/(33) _<_ (I) <

0 g ry(2:) < 1,‘v’x 6 [a,b), and TV (b) = 1. We denote by II¢II2

Since any 2: is between two consecutive knots, it is clear that

tjnu($)+1’
the theoretical L2 norm of a function 45 on [a, b], i.e. "dug = E {d2 (X)} :2

fcl,’ ¢2 (2:) f (2:) c122, and the empirical L2 norm as ”(Min = n“1 2311 ¢2 (Xi) , Cor-

responding inner products are deﬁned by

b
(¢.se)=/a ¢(x)<p(x)f(e)dx=E{¢(X)90(X)},
(<15. ‘Pln = 771 2:21 <f> (Xi) 90 (Xi)

for any L2-integrable functions ¢, (p on [a, b]. Clearly E (qb, ‘Pln = ((p, (,0).
Algebra shows that the space 0%5—2) can be spanned linearly by the B-spline
basis introduced below or the truncated power basis introduced in Section 2.4, see [10].

Hence the same estimator mp1 (25) can be expressed as a linear combination of either

15

of the two bases. While the truncated power basis is convenient for implementation,
it is easier to work with the B—spline basis for theoretical analysis. The B-spline basis
of Gag/1), the space of piecewise constant splines, are indicator functions of intervals
Jj, bj,1(2:) = Ij (2:) = [J], (2:) ,0 S j 3 NV. The B-spline basis of G0 u’ the space of
Nb
j:

piecewise linear splines, are {bj,2 (2a)} 1 , where

1‘ " tj+1 .
bj,2 (2:) = K (T) , J = —1,0, ...,NV, for K(u) = (1— |u|)+.

NV (Pu—2)
(x)}j=1—pu for GNV

Deﬁne the rescaled B-spline basis {B -
8m e) E bm (e naming—1, 1-... g.- s N...

Obviously all the rescaled basis functions will have theoretical norm 1.

N1

. . we
i=1-P1

To express the estimator mp1 (2:) based on the basis {Blipl (2:)}

introduce the following vectors in R": Y = (Y1, ..., Yn)T ,

T .
B,,,,1(X)={3,1,1(X1),...,B,,p1(xn)} , g = 1 — p1,...,N1,

and let the design matrix for spline regression be

then the estimator mm (2:) in (2.4) is expressed as

A _ T —1 T
mp1(2:) — {Bl—p1,p1($)""’BN1,p1($)} (Bpprl) BP1Y

= Z ”\j.p1Bj,p1($)’
2=1-P1

16

. A T
where the coefﬁcients {A1_p1,p1,...,/\N1 p1} are solutions of the following least

squares problem

N 2
.. . T n 1
{A1“Plvp1""’)‘N1,P1} = alrvgmin Zizl Y,— 2: ”2491333191 (X,) ,
or equivalently, of the normal equation
((3 B > l” i N1
jip , ‘I, . . ( jip ): _

_ N
= (n 1 2:; 323101 (Xi) mph—191‘

It is straightforward that <Bj,p1i 8]" p1> E 0, [j _ jII 2 191, thus the inner product
matrix on the left side of the normal equation is diagonal for the constant B spline
basis (p1 = 1), and tridiagonal for the linear B spline basis (p1 = 2). According

to Lemma 2.2, it is approximated by its deterministic version, whose inverse has an

explicit formula given in [70].

For p2 = 2, deﬁne the inverse of inner product matrix as S with its 2 x 2 diagonal

submatrices {'3ij 5 j S N2}

 

 

3332-1 3331'
(2.6)
The widths of the conﬁdence bands depend on the variance function:
2
2 ij($) ”Z ('U)f(‘U) dt) 2 N2 Bj, 2 (:13) Bl, 2(2)) Sjj’sll’vjl
'Un,1( = nub ”2 , ”71,2 (17):: Z I ’n a
2(r),1 2 j,j’,l,l’=—1
(2.7)

17

with j (:27) deﬁned in (2.5), and s”, in (2.6), and

(vjl)N2 = E = {log (2)) 83-3 (1)) 31,2 (11) f (v) dv}N2

' -/___ . . '
1,] — 1 ],J’=—1

Under all assumptions, applying [70] to the unobserved sample {(Xi, Zi) 21:1,

an asymptotic 100 (1 — a) ‘70 exact conﬁdence band for 02 (:3) over [a, b] is

5% (2:) i W (x) {2 log (N2 +1)}1/2 dn (a) ,

and

- —1 2
where vn,1 (1:) is given in (2.7) and replaceable by i) Z (:r) { f (x)nh1} /

 

log (1 — a) }+log log (N2 + 1) + log 47r

2 2 ], (2.8)

dn=l—{210g (N2 +1)}—1[log{-
and an asymptotic 100 (1 — a) % conservative conﬁdence band for o2 (I) over [a, b] is
5% (:r) d: ”71,2 (x) {210g (N2 +1) — 210g a}1/2,

where U712 (:16) is as in (2.7), replaceable by “0,1,2 (1:) in (2.16).

We state our main results in the next theorems.

Theorem 2.1. Under Assumptions (A1 )-(A4), as n —> 00, the spline estimator

612,1,p2 of 02 is asymptotically as eﬁicient as ”infeasible estimator”, i.e.

.2 ~ _ -2 ~2 _ — 2 +1
“01,1432 — OPQiioo — $31be 0p1,p2 (a3) — 0192 (1'), — op (n P1/( P1 )).

Theorem 2.1 and the aforementioned properties of 612,1,p2, imply the following:

Theorem 2.2. Under Assumptions (A 1)-(A4), an asymptotic 100 (1 — a) ‘70 exact or

18

conservative conﬁdence band for 02 (:13) over the interval [a, b] for p2 = 1 or 2 is

1(1):)i m (3:) {210g (N2 +1)}1/2dn(a),

2

1
(332(1) i um (:5) {210g (N2 +1) — 2 loga}1/2,
respectively. That is,

711190013 {02 (3:) e 57in (:r) i ”71,1 (:13) {210g (N2 +1)}1/2 dn (0) ,Va: 6 [a,b]}
=1— 0:,

1/2
gggP {0'2 (1:) 6 632 (:r) :l: Un,2 (:13) {210g (Iv—Zia} ,‘v’x 6 [a,b]}

21—0.

The proof of Theorem 2.1 and therefore also of Theorem 2.2, depend on Proposi-
tions 2.1, 2.2 and 2.3 in the next section, and the proofs of the propositions are given

in the Appendix.

2.3 Error decomposition

In this section, we break the estimation error 612,2,“ (2:) — 512,2 (2:) into three parts,
so we can deal with the convergence rate for each part in the proof. To understand
_2)

this decomposition, we begin by discussing the spline space C(pl introduced in

Chapter 1 and the representation of the linear spline estimators mp1 (1:) in (2.4) and

5,2,2,“ (1:) in (2.2).

We write Y as the sum of a signal vector m and a noise vector E

Y: m+E,
m = {m (X1),...,m(Xn)}T’

19

E ={el,...,en}T.

. . . —2
PrOJecting the response Y onto the linear space 0.),“ ) spanned by

N1
{Bj,p1 (X) }j=1—p1’ one gets

. . . T

= Proj Y: Proj m+ Proj E.
dim-2) 65521—2) Gym—2)

Correspondingly in the space 0031-2), one has

Th191 (33) = 771191 (5’3) + E101(5'7) ’

T ._
mp1(:c) = [{Bj,p1($)}:-:11_p1] (Bngpl) lBglm’

T _
5p, (11:) = [{Bj3p1(a:)}j_:11_p1] (Bﬁ‘prl) 132,13 (2.9)

T - T
- . _ 2 2 _ ~2 2
Regarding variance, we deﬁne Z — {51, ..., an} , Zp1 — {€1,131 , ..., €71,191} , then

T ._

5,2,2 (2:): [(3331,2 (x)}::21_p2] (352392) 13,222,
r _

532m (x): [{Bj,p2(x)}:-V=21_p2] (3523102) IB$2ZP1'

Taking difference,

6,292,191 (:16) — 5,2,2 (1:)

T -— -
[{Bj,p2(a:)}::11_p2] (3,1523%) 1131352 (zpl—z)

20

T _
= “831102 (”lg-21-102] (3523”) 113%“

Then one writes

(33%,, (:13) — 5,2,2 (:17) = 1,02,),1 (:17) + ”192,291 (x) + 111,024,1 (1:) , (2.10)

in which

[102,101 = [192,191 (1‘)

T _
= [{.,,,,..)};.2,_,,) (3528202) <>

”192,191 = ”192,191 (x)

T _.
= [{Bj,p2($)}:1_p2] (B1721???) 13%; (111,p1,...,11n,p1)T

T

T _
[{BM,2 (1)}:1_p2] (Bgzspz) 131% (1111,p1,...,111,,,p1)r

- 2 - - -
1..., = {m (Xi) — mp. (29)} + 8%, (Xi) + 2 {m (a) — m. (We. (Xi)

111,4,1 = 2 {m (X,) — mm (X,)} 5,.

2.4 Implementation

In this section, we describe procedures to implement the conﬁdence bands in Theorem
2.2. Our codes are written in XploRe for convenience in order to use kernel smoothing,

see Hardle et. al. [24].

21

Given any sample {(Xi,Y.,-)}?=1 from model (2.1), we use min (X1, ...,Xn) and
max (X 1, ...,Xn) respectively as the endpoints of interval [a, b]. Motivated by the
comment of one referee, we select the number of interior knots N V using a BIC criteria.
For knot location, we use equally space knots. According to Assumption (A3), the
optimal order of NV is nl/ (27)” +1). Thus we propose selecting the ”optimal” NV,
denoted by N3”, from [0.5mm min(5N7~V, Tb)], with NW = n1/ (21011“) and Tb =

n / 4 — 1 to ensure that the total number of parameters in the least square estimation

is less than n /4.

To be spec1fic let Qn- — (1 + Nn) be the total number of parameters. Then N opt

is the one minimizing the BIC value

ngt = argmin BIC(Nn)
Nn€[0.5Nr1/, mIII(5Nr1/,Tb)]
where BIC = log(MSE) + qn log (n) /n, with MSE = 22:10? — 17,-}2/n.The least

squares problem in (2.4) can be solved via the truncated power basis {1, x, ..., xp 1 _1 ,

(x — t .)p1-1 ' = 1 N In other words
J + ,] ,..., 1 .

F11

—1
mp1(“ =2: Vk‘Ek + Z 71‘ p1 (":- ill:1 ’

k=0

T
where the coefficients {’yo,...,”ypl_1,&1,p1,...,”bepl} are solutions to the fol-

lowing least squares problem

, . T
{70’ “'7 7N1,p1}
pl 1

= argmin 272:1 Y — Z 7kX2k— —J:'yjpi1(X— If)? 1

22

The variance estimators 612,14? (3:) are computed likewise.

When constructing the conﬁdence bands, one needs to evaluate the functions
v2”? (1‘) in (2.7) differently for the exact and conservative bands, and the descrip-
tion is separated into two subsections. For both cases, one estimates the unknown
functions f (2:) and v% (2:) and then plugs in these estimates, as in [70]. This is anal-
ogous to using 7 :l: 1.96 x sn/ﬁ instead of 7 :l: 1.96 x o/Jn as a large sample
95% conﬁdence interval for a normal population mean a, where the sample standard

deviation sn is a plugin substitute for the unknown population standard deviation 0.

... 2
Let K (11.) = 15 (I — U2) I {Iul S 1} /16 be the quadric kernel, sn =the sample

standard deviation of (Xi)?:1 and

 

A __ _1 n _1 ~ Xi-zr
f(:1:) _ n Zi=1h2mtJK (hmtf), (2.11)

(4701/10 (%)1/5 11—1/5

'12 rot,f 3”,

with h2 rot, f the rule-of—thumb bandwidth in Silverman [62].

A

_ __ . T _ .2 2
Deﬁne :p2 = {:i,p2’1 S i S n} , 5i,p2 = {Zi,p1 — 01,1432 (Xi)} , and

T
1 ,..., 1
X= X0!) = ,
X1 —£L’ ,..., Xn —1'

X' — a: n
W: W (21:) = diag K 2 ,
h2 rot,o i=1

where h2 rot,o is the rule-of-thumb bandwidth of Fan and Gijbels [14] based on data

n .
(X., Ei,p2)i=1‘ Deﬁne the following estimators of v22 (2:),

 

Z

- —1 _
11%“ (as) = ( 1, 0 ) (xwa) xTwapZ. (2.12)

23

The following uniform consistency results are provided in [1] and [14]

A

max su {)2 :1: —v2 a: su x _ :1: =0 . .
p,$€[:b][z,p,() Z()[+x6[:b][f() f() ,.(1) (213)

2.4.1 Implementing the exact band

The function ”71,1 (:13) is approximated by the following, with f (2:) and 22,1 (2:) de-
ﬁned in (2.11) and (2.12), j (:13) deﬁned in (2.5)

. - ._ _ —1 2
vn,1(:r) = vZ,1 (:r) f 1/2 (:13) n 1/2h2 / .
Then (2.13) and (2.8) imply that as n -—> 00, the band below is asymptotically exact

6%,1(I):tvn,1(sr){2log(N2 +1)}1/2¢,,. (2.14)

2.4.2 Implementing the conservative band

The band below is asymptotically conservative
.2 - __ 1 / 2
02,2 (:12) :i: Un,2 (at) {210g (N2 +1) 2 log a} , (2.15)

where the function ”71,2 (:r) in (2.7) for the linear band is estimated consistently by

. -1/2
m (x) = {AT (as) L,,(,.)A (2)}1/ 2 (22,2 (x) {g1 (as) nh2} , (2.16)
with 3'2 (2:) deﬁned in (2.5), and f(a:) and 222 2 (2:) deﬁned in (2.11) and (2.12), A (2:)
and Lj deﬁned as follows:
A (z) = Chm—1 {1 - T2 (13)} ,
Cj2($)7"2 (1‘)

24

I. j=0,...,N2-—1

lj+2,j+1 lj+2,j+2

The terms lik’ |i — k] S 1 are deﬁned through the following matrix inversion

 

 

( 1 ﬁ/4 0 )
\/2/4 1 1/4
1/4 1
MN2+2 1/4
1/4 1 (5/4
K0 «274 1 /(N2+2)><(N2+2)
(lik)(—1\1l2+2)x(N2+2) 1

and computed via (2.18), (2.19), and (2.20) given below, which are needed for (2.17).

Letting

 

 

z1=2+‘/§ z2=2_‘/§, 9=EZ=(2—\/§)2=7—41/§, (2.18)

and applying matrix theory from Gantmacher and Krein [19] and Zhang [84], we have

the following

l11 = lN2+2,N2+2
8.2%(1— 6N2+1) — 21(1 — 0N2)
8.2%(1— 9N2+1) — 2z1(1— 6N2) + (1— 0N2_1) /8,

 

25

’2' i: {821 (1 — 9N2+2“i) — (1 — 6N2+1‘i)} {821 (1 — 024) — (1 - (914)}
’ (21 —— 2.2) {642% (1 —- 9N2+1) —-— 1621 (1 — (9N2) + (1 - 9N2—1)}
(2.19)

 

for2SiSN2+1and

l12 = lN2+1,N2+2
(ax/2) 21 (1 — 0N2) — (1 — 6N2—1)/8
82% (1 — 9N2+1) — 221 (1 — (9N2) + 8 (1 — 6N2—1)/8,

z {8.21 (1 — 0N2+1—i) - (1 — 9N2-i)} {821 (1 — ei‘l) — (1 — (ii—2)}
i’i-H: 421(z1— 22) {647% (1 — 6N2+1) — 1621 (1— 9N2) + (1— 6N2—1)}
(2.20)

 

for 2 S i S N2. By the symmetry of the matrix M N2 +2, the lower diagonal entries

are 1141,, = 1,3241, W = 1, ..., N2 +1. See [70] for details.

2.4.3 Implementing the bootstrap band

In this subsection, we use wild bootstrap for improved performance following the
- - ‘. _ «2 _ .2 .
suggestion of one referee. We deﬁne the reSIduals 52,191,}? — Eiapl 0911132 (X,),
where E23131 are deﬁned in (2.3), and denote a predetermined integer by n 3, whose
default value is 500. The steps to compute bootstrap band, similar to Yang [77], are

described in the following.

Step 1, Let {afik}1SkSnB’ 1 S i S n be i.i.d. samples of the following discrete
distribution 5i,k = :l:1 with probability 1/2, it is easily veriﬁed that E(6z',k) = 0,
Var (62,16) 2 1.

Step 2, For any 1 S k S n3, deﬁne the k-th wild bootstrap sample 5:2

Wm =

n
.2 . *_ . . . __ .2 .
0p1,p2(Xz)+€i,p1,p252,ka1 S i S n.Tak1ng Ep1,k — {Ei’p1,k}z~=1 , we apply linear

26

spline on Ep1,k to get the spline estimate

T
.2 _ , N1 T *1 T
”p1,p2,k($) ‘ [{Bval (1)} j:1—-p1[ (BPIBPI) BPIEPLk (2'21)

Step 3, The wild bootstrap (1— a) pointwise conﬁdence interval for function value

02 (T) at one point T is [big/20:), 32U,a/2(£IJ)[ , where Elia/29‘) and 3%],01 /2($) are

the lower and upper 100(a/2)% quantiles of the set 61211,p2,k(x)1S/€STIB obtained

from (2.21) for each of the bootstrap sample generated in Step 2.

Step 4, According to [70], the uniform conﬁdence band is wider than the point-

 

wise conﬁdence interval by an inﬂation factor of 21—3042 \/ 2 {log(N2 + 1) — log(a / 2)}
when localized at any point T, hence we deﬁne the wild bootstrap (1 — a) conﬁdence

band for the function 02(2)) over [a,b] as [3%,a/2(I),3%j,a/2(SC):[ ,T E [a, b] where

aria/2(2) =
(31221102 (x) + (320/2013) - 6%1,p2<x>) 2111., )2 \/2 {log(N2 + 1) — logo/2)},

 

{Ian/2(3) =
612,1,” (T) + (3%],0/2(T) — 5%1,p2($)) 21—30/2 \/2 {log(N2 + 1) — log(a/2)}.

 

As one referee pointed out, instead of resampling at each point T and then in-
ﬂate by a universal factor Kn, it is also possible to resample the maximal deviation
distribution, as was done in Neumann and Kreiss [54], and obtain bootstrap lower
and upper 100(a/2)% quantiles of SUPTE[a,b] (7ng (T) — 02 (T) ”71%? (T). Our ap—
proach, however, has the advantage of adaptivity since the conﬁdence band is locally

calibrated at each point T, without the constraint of symmetry.

27

2.5 Examples

2.5.1 Simulation example

To illustrate the ﬁnite-sample behavior of our conﬁdence bands, we simulate data

from model (2.1), with X ~ U[—1/2,1/2], and

m (T) = sin (27rT) , a (T) = 00-33%, elT ~ N1{0,o2 (T)}. (2.22)

The noise levels are 00 = 0.2, 0.5, while sample sizes are taken to be n = 100, 200, 500.
Conﬁdence level 1 — a = 0.99, 0.95. For c = 100 and c = 5, Tables 2.1 and 2.2 contain
the coverage probabilities as the percentage of coverage of the true curve a (T) at all
data points {Xi}?=1 by the conﬁdence bands in (2.14), (2.15) and using bootstrap
method, over 500 replications of sample size n. Following the suggestion of one referee,
we have included variance functions 02 (T) that are strongly heteroscedastic (c = 5)

and nearly homoscedastic (c = 100).

In all cases, the performance of constant band is worse than the linear band in
terms of coverage, while the bootstrap band has the best coverage. In all cases the
coverage improves with sample sizes increasing, showing a positive conﬁrmation of
Theorem 2.2. The bootstrap band achieves reasonable coverage rate for moderate
sample size as low as 100, while for the nearly homoscedastic case of c = 100, the
asymptotic linear band has good coverage for sample size as low as n = 200. For
the strongly heteroscedastic case c = 5, it seems that the bootstrap band is the only
satisfactory one. We therefore recommend using the bootstrap band for analyzing

real data.

The graphs in Figures 2.1 and 2.2 are created based on two samples of size 100
and 500 respectively, for c = 100 and 5 respectively, each with three types of symbols:

center thin solid line (true curve), center dotted line (the estimated curve), upper and

28

lower thick solid line (bootstrap conﬁdence band). In all ﬁgures, the conﬁdence bands
for n = 500 are thinner and ﬁt better than those for n = 100.

We next compare by simulation the testing of heteroscedasticity based on the
proposed bootstrap conﬁdence band to the results of [11] for the following three

models

m(T) = 1 + sin(T), 0(T) = oexp(c.T) (monotone, model I)
m(T) = 1 + T, 0(T) = o {1 + csin(10T)}2 (high frequency, model 11) (2.23)

m(T) = 1 + T, 0(T) = 0(1 + CT)2 (unimodal, model III)

for c = 0, 0.5, 1.0 and o2 = 0.25 with standard normal errors. The design points X
were generated uniformly from [0,1] and the sample sizes were n = 50, 100, 200.
Table 2.3 shows the relative proportion of rejections for the various situations using
both our method and the results from [11], Table 1, p. 700 (in brackets). Our
method performs poorly when heteroscedasticity is weak (c = 0.5) for models I and
III, so the type II error is larger than [11]. For strongly heteroscedastic model (c =
1), however, our method achieves higher rejection power for models II and III, and
comparable rejection power for model I, so the type II error is either comparable to
[11] or lower. For homoscedastic model (c = 0), our rejection rate is always lower,
hence the bootstrap conﬁdence band based test has smaller type I error than [11].
Based on the above simulation, our method is better than [11] at detecting strong
heteroscedasticity and retaining homoscedasticity, while [11] is better than ours at

discovering weak heteroscedasticity.

2.5.2 Fossil data and motorcycle data

In this subsection we apply the bootstrap band to two real data sets, both of which

have sample size below 200.

29

Table 2.1: Coverage probabilities for c = 100 from 500 replications.

 

 

 

 

 

 

 

00 n 1 — a Constant Band Linear Band Bootstrap Band
0.99 0.882 0.886 0.944
100 0.95 0.806 0.858 0.858
0.99 0.940 0.970 0.996
0.2 200 0.95 0.874 0.958 0.968
0.99 0.984 0.994 1
500 0.95 0.942 0.992 0.984
0.99 0.764 0.892 0.956
100 0.95 0.690 0.870 0.886
0.99 0.896 0.970 0.992
0.5 200 0.95 0.830 0.962 0.960
0.99 0.974 0.996 0.998
500 0.95 0.926 0.994 0.984

 

 

 

 

 

 

The fossil data reflects global climate millions of years ago through ratios of stron-
tium isotopes found in fossil shells. These were studied by Chaudhuri and Marron[5]
to detect the structure via kernel smoothing. The corresponding penalized spline ﬁt
was provided in Ruppert et. al. [60]. In this section we test the heteroscedasticity of
the fossil data variance. The null hypothesis is H0 : 02 (T) = 03 > 0. The response Y
is the strontium isotopes ratio after linear transformation, Y = 0.70715+ratio*10—5,
since all the values are very close to 0.707, while the predictor X is the fossil shell

age in million years.

In Figure 2.3, the center dotted line is the linear spline ﬁt 632 (T) for the variance
function 02 (T). The upper/ lower thick solid lines represent bootstrap conﬁdence
band. The constant horizontal line between the upper/ lower thick lines represents
the average of the minimum of the upper line and the maximum of the lower line,
which indicates if one can ﬁt a constant line into the conﬁdence band. Since the
variance band of high conﬁdence level 100(1 — 0.20)% contains the ﬁtted constant
line entirely, we have failed to reject the null hypothesis of homoscedasticity with

p—value 0.20.

30

Table 2.2: Coverage probabilities for c = 5 from 500 replications.

 

 

 

 

 

00 n 1 — a Constant Band Linear Band Bootstrap Band
0.99 0.824 0.858 0.944
100 0.95 0.764 0.834 0.874
0.99 0.912 0.896 0.986
0.2 200 0.95 0.832 0.884 0.954
0.99 0.978 0.970 1
500 0.95 0.916 0.964 0.992
0.99 0.886 0.856 0.946
100 0.95 0.648 0.828 0.878
0.99 0.916 0.918 0.992
0.5 200 0.95 0.688 0.904 0.958
0.99 0.958 0.966 1
500 0.95 0.726 0.964 0.986

 

 

 

 

 

 

 

 

A second data used to illustrate our technique is the well-known motorcycle data.
The X -values denote time (in milliseconds) after a simulated impact with motorcycles.
The response variable Y is the head acceleration of a PTMO (post mortem human

test object).

In Figure 2.4, the center dotted line is the linear spline ﬁt 632 (T) for 02 (T).
The upper/ lower thick solid lines represent bootstrap conﬁdence band. The constant
line between the upper/lower thick lines represents the average of the minimum of
the upper line and the maximum of the lower line. Since the variance band of an
extremely high conﬁdence level 100(1 — 0.008)% does not contain the ﬁtted constant

line entirely, we reject the null hypothesis of homoscedasticity with p-value S 0.008.

In both Figures 2.3 and 2.4, there exists an exact correspondence of high (“7% 2 (T)
value in the upper plot to greater width of the conﬁdence band for the conditional

mean function in the lower plot, throughout the entire data range.

31

Table 2.3: Simulated rejection probabilities of test homoscedasticity from 500 repli-

cations.

 

n=50

n=100

n=200

 

2.5% 5% 10%

2.5% 5% 10%

2.5%

5%

10%

 

model I

 

0.5

1.0

0.004 0.004 0.012
(0.038) (0.056) (0.101)
0.014 0.020 0.030
(0.055) (0.084) (0.132)
0.038 0.058 0.110
(0.095) (0.148) (0.223)

0 0 0.002
(0.028) (0.057) (0.093)
0.002 0.006 0.018
(0.064) (0.097) (0.151)
0.024 0.072 0.254
(0.153) (0.215) (0.313)

0
(0.037)
0
(0.086)
0.150
(0.249)

0
(0.059)
0.004
(0.134)
0.362
(0.337)

0
(0.105)
0.034
(0.200)
0.690
(0.458)

 

model II

 

0.5

1.0

0.004 0.004 0.012
(0.031) (0.053) (0.100)
0.082 0.106 0.158
(0.197) (0.276) (0.390)
0.316 0.422 0.612
(0.272) (0.365) (0.481)

 

0 0 0.002
(0.026) (0.049) (0.089)
0.296 0.484 0.766
(0.333) (0.433) (0.568)
0.356 0.512 0.734
(0.477) (0.557) (0.674)

 

0
(0.032)
0.694
(0.527)
0.656
(0.693)

0
(0.056)
0.918
(0.637)
0.884
(0.790)

0
(0.100)
0.992
(0.761)

0.984
(0.884)

 

model III

 

0.5

1.0

 

0.004 0.004 0.012
(0.034) (0.054) (0.097)
0.02 0.034 0.066
(0.073) (0.113) (0.185)
0.078 0.112 0.216
(0.136) (0.198) (0.291)

 

0 0 0.002
(0.028) (0.053) (0.100)
0.010 0.030 0.110
(0.105) (0.158) (0.233)
0.122 0.312 0.642
(0.221) (0.304) (0.412)

 

0
(0.031)
0.032
(0.175)
0.668
(0.378)

0
(0.053)
0.142
(0.239)
0.984
(0.476)

0
(0.094)
0.394
(0.342)
0.978
(0.598)

 

2.6 Appendix

The goals of this Appendix are to prove Propositions 2.1, 2.2 and 2.3. These clearly
establish Theorem 2.1 and Theorem 2.2. In what follows, we denote by ||€|| the
Euclidean norm and by [5 I the largest absolute value of the elements of any vector 6.

We use c, C to denote positive constants in the generic sense.

The following result is based on Theorem 3.2 and Propositions 3.1, 3.2 of [70], see
also [28] and Leadbetter et. al. [38].

Lemma 2.1. Under Assumptions (AU-(A4), there eTists a constant 0171 > 0,p1 2 1

32

such that for any m E C (p1) [a,b] and the function mp1 (T) given in (2 9),

g Cpl inf Ilg — mu,>0 = 0,. (51:1) . (2.24)

[[2401 (1’) ‘ m (9” 96 0(191’

00

Moreover, for the function Epl (T) given in (2 9),

 

 

5p, (1:)“0O = 0,, (711171 M) . (2.25)

According to Lemma 2.1, the bias term mp1 (T) — m (T) is uniformly of order
0p(h11)1) = Op (n_p1/(2p1+1)), while the noise term Epl (T) is uniformly of
order Op (hi1)1 ﬂag?) = Op (n-pl/(291+1)\/Eﬁ).

The following lemma on uniform convergence of the empirical inner product to

the theoretical counterparts is from Lemma 3.1 of [70].

Lemma 2.2. Under Assumptions (A2) and (A3), as n —> oo,

 

 

') 1 —
91,92€G(p1-2) ”91“2 “92ll2
_—_ 0p (\/n—1h1—110g(n)) , (2.26)

The next result on the empirical inner product matrix is based on Lemma B2 of

[70] and Lemma A5 of [76].

Lemma 2.3. Under Assumptions (A2) and (A3), there eTist constants c( f ), C (f) >
0 independent of n but dependent on f, such that as n —> 00, with probability approach-
ing 1, for allé E RNV+pV,V = 1,2

c0910 3 (n-lBguspV)‘1:[sC(f)ls. (2.27)

 

—1
usual? 3 {ﬁn—11321310..) {SCUHKIIQ- (2.28)

33

Using the above three results, we establish two additional technical lemmas to be

used in proving Propositions 2.1, 2.2 and 2.3.

Lemma 2.4. Under Assumptions {A2} and (A3), as n —+ 00,

(191-205;-.. IR} W)

TE[a,b]
33. ((3,,,,,1)}=o(h.1/2). jzn’féjxp,{<8j,pwl>.}

j=1—Pu
= 01; (till/2 + \/n_1h,71 log n) . (2.30)

 

 

 

 

Proof. For each T 6 [a,b], at most pl, of the 8.71191! (T)’s are nonzero, (2.29) follows

directly from the deﬁnition of {B -

u
T } , and the sim le fact that

12 .
“bjpull2>ch/,1—PVSJSNV-

The same deﬁnition and fact also imply that

23)) x (31/2) _0( 1,2)

As all {B j 191/ (513)}j: ‘1—pu are standardized, the deﬁnition and rate of Amp” in
(2.26) imply the second half of (2.30).

Lemma 2.5. Under Assumptions (A2) and (A3), as n —> 00,

N2 N1 _1 2
Z Z {Tl 2:1,le j2p2( )(Z)Esz1p1 (XZ)}

i=1-P2 k=1—P1
: 0p (n5/2(2p*+1)-1/2(2p*+1)4) , (2.31)

34

while for any continuous function r deﬁned on [a, b],

N1

2 [n—1::=13,,, (X.) T (Xi) .,.]2

i=1-P1
s llallgo (1111?... (N1 +p1)n—1. (232)

Proof.

N2 N1 2
B z 2: [...-123., max-18.32.19»)

i=1—P2 k=1-P1

N2 N1
= Z Z n‘ZZB{B )p2( X-PBk,p,(X.-)Zo2(x.)}

j=1—p2k=1—p1 i=1
S ”—1maX(N1+PlaN2+PZ)N*—1N*

2 2 2
X max E 3' (X1) Bk (X1) 0 (X1) .
lk—J|SP1 { 3,121 ,p1 }

With the deﬁnition of Bj p1 (T ):— bjapl (T) ”bj,p1“2_lv 1—p1 Sj S N1, we have

2 2 2
Ik-mjélléplE 3,1,, (X1) Bk).1 (X1) 0 09)}

< 6(0) f()\/h1h2 2_C(f0)
— C(f)h1h2 vh1h2

 

Thus (2.31) follows from

N2 N1 1 2
E. Z Z {n 2i=1 Bj1p2()(7:)€ZBk,pl (XZ)}
3:1—192 k=1—p1
S n_1ma.x(N1 +P1,N2 +p2)N*—1N* x C(f,a)

0 (n5/2(2p*+1)—1/2(2p*+1)—1).

é”;
D—l)
D"
[\3

35

To prove (2.32), we argue that

N1
_1 n 2
Z [" Zilej,p1(Xi)T(Xi)5i]
J=1-p1
N1
_ —1 2 2 2
— '2 n B{B,-,,1(X1)r(X1)a (X1>}
3:1—171
N1
Hangonrnion‘l Z B{Bj,p1 (X1)‘-’-}
i=1-P1
Mango Ilrugo (N1 +101) n"1

|/\

The next three propositions show the asymptotical property of the three terms,
[192,191, IIp2,p1 and II Ip2,p1 in (2.10), decomposed from section 2.3, then estab—
lish Theorem 2.1.

Proposition 2.1. Under Assumptions {AU-(A4), HIp2’p1lloo = supxqa b] le2,p1 (11:)l,

as n —+ 00, is of order

0p(h§p110gn) = 01, (n—2p1/(2p1+1)10gn) = 0p (n-p2/(2p2+1)) ,

Proof. By Cauchy-Schwarz inequality,

’Ii,p1|<2{m(Xi) —mp1(X-)}2 +2Ep1(X

thus maacz._1 II -p1| is bounded by

2:
~ 2 ~ 2
S 2 llm‘mpllloo+ “51’1”“; '

36

2 {Wm (Xi) — mm M2 + {mil W}

It follows that

HIPIBBHOO
= sup ’{ij (13)}1Y2 (BT2 131,2) 113T (I,- ,1gign)T’,
TE[a,b] ’ 2 ]=1—p2 192 ml

which, as for each T E [a, b],B j p2 (T) 75 O for at. most p2 values of j, is bounded by

N2
7’2 max Bin”)

=1—p2 ("-13T23P2)_1X"_IBT2(lIi,B1l’15"5 ")Tl

 

Using (2.29) in Lemma 2.4 and (3. 41) in Lemma 3.10, the above is bounded by
—1 2
p2C(f)h2 / ><C(f) ><

max?=1 lliipl l, we have

 

T
n‘lBT 201.,- p1|v 1 < i < n) l . THen, using the bound on

”Ipl’pzlloo

s C(f)h§1/2 >< {Hm — mpluio+|lép1)l:o}Xj=IT1§§Xp2{<Bj,p2,1>n}

which, applying (2.24) and (2.25) in Lemma 2.1, and (2.26), (2.30) in Lemma 2.4, is
bounded by

—1 2 2 2 12 _ _
Op{h2 / x (h1p1+h1pllogn) x (122/ +\/n 1122110gn)}

2
= 0p (hlpl log n) .

 

Proposition 2.2. Under Assumptions {AU-{A4}, as n —-> 00,

”[1 H = sup III ml
49 p ,1)
P2 1 00 “[a,b] 2 1

_ 0p(n3/<2p*+1)—3/2) ( —p2/<2p2+1))

-01?

37

Proof. By deﬁnition

[Ip1,p2(l‘)
= {mam (x) , ...,BN2 m (2:)} (Bp23p2)_ sz (111p1,..,11n,p1)T
= I{Bl-P2P2(I)" BN2,P2 (I)}(" lBPzBP2) 1
N
xn{ —lzi= 1 ijp2(X 05131 (X08 III}j =21— —p2

Applying (3.41) in Lemma 3.10, |11p1,p2($)', with probability approaching 1, is
bounded by

C ”{Bl—mipz (I) , BN2,” (17))“ C(f)

—1 n .. N2
{" Zi=1BJ3p2 (X05101 (Xi) 5i}j:1_p2 H,

X

 

 

applying (2.29) in Lemma 2.4,

N2 ll

—1 2 _ n ~
SUP .11p1,p2($)l S CU) hg / “{n 122-:133'492 (X05191 (Xi) 5i}j=1-P2 I

TE[a,b]

Next, one can write for any 1 — p2 S j 5 N2,

"—122; 31,102 (Xi) 3’1 (Xi) Ii
= 22:1 BM. (xi {Bl—m (Xi) MN BB}
x (n—13$IBPI)—ln 113311;;
{II III Zn z=1 BJP2( ”5in P1 (Xi)}l:11-P1

xn( "IBT Bp1)—1n—1B$1E,

38

 

hence, sup II T is bounded by
TE[a,b] p1,p2

 

N1

—1/2 N2
C(f) 112 Z 5233' ,p2( Xilfz'B/wl (Xi)

j=1—p2 ni= 1

—1
1
xJC; BT 131,1) n-1B$1E

c —1/2 N2
(f) hg 2 £2733 j,(p2 Xi)€in,p1 (Xi)

j=1—p2 ni= 1

1
T11 T

k=1—p1

 

2

 

 

N1 2

|/\

k=1—p1

 

 

   

 

N N 2
”1/2 2 1 1 n
= 0 (W12 2 2 g 2333' Xilgima (Xi)

j=1—p2 k=1—p1 i=1

-—1
1 T 1 T

 

 

   

by Cauchy—Schwarz inequality.

Note that with probability approaching 1,
2
—1 T *1 -1 T
(n Bplnpl) n BplEll
T —1 T
n BPlE}

2 1 —-1 n 2
= C (f) 2 {n Zi=1 Bjapl (Xi) 5i}

k=1-p1
= 0,; {(N1 +p1) n_1} = 0p(N1n—1)

 

 

|/\

Q

C:

"Ra
PM

2 :3"
wF-J

53's

{11

V

according to (2.32) of Lemma 2.5 with function r (T) E 1. Meanwhile, according to

39

(2.31) in Lemma 2.5 we have,

sup IIIp1,p2(T)l
T€[a,b]

2 0p (h21/2 X n5/2(2p*+1)—1/2(2p*+1)—1 x m)

_ Op(n1/2<2p2+1) x n5/2<2p*+1>—1/2(2p*+1)—1 x n1/2(2p1+1>,,—1/2)

= op (n3/(2p*+1)—3/2) _

Proposition 2.3. Under Assumptions (A 1)-(A4), as n —> 00,

“”1132491 “00 = $2165“ IIUpz,p1 (1?)!

0p (n3/(2p*+1)-1) = 0p (n—P21/(2P2+1)) .

Proof.

”11111112 (~11)

1 T
= {Bl_p2’p2($),...,BN2,p2(13)} (331,213,?)

T T
i3” (1111 p1N,2111n,p1)
—1
_ 1 T
1 Tl
~,—,— :1 3,1,2 (X) {m (Xi) — mp1 (X.)}e.
2:
N2 1 T ’1

{2:13.3P2(W{m1 9P1(Xi)}51}

2:1

—1

N2

i=1-P2

N2

i=1-P2

40

+

N2 1 T “I
n
1 -
{g :1 39,192 (Xi) (9171 (Xi) - mp1 (Xi)}5i}
2:
N2 1 T ‘1
= 2 {83-2172 (x)}j=1—p2 (an2Bp2)

{$231192 (Xi) {m(Xi) - 9P1 (Xi) } 52'}

N2

i=1-P2

N2

i=1-P2

N2 1 T ‘1
+ 2 {81432 (I)}j=1_p2 (531723192)

1 " N1
; :1 33.102 (Xi) 5131;,“ (xi)
1,:

k=1—p1
N2

—1
1 T 1 T '
(EBpiBPI) ng1(gp1-m)}. '
J=1—P2

in which the spline function gpl E C(pT‘I) satisfies II m — gpl “00$ Chpl, and

gPi = {9101 (X1) , "-19:01 (Xn)}T-

With probability approaching 1, according to (2.29) in Lemma 2.4, the ﬁrst term

in the above is bounded by

 

N2 1 T ‘1
2 {Bl-p2,p2 (x)}j=1—p2 (RBPZBp?)

 

n N2
{% Z Bj,p2 (Xi) {m (Xi) — 9P1 (Xi) } 51'}
i=1 j=1-p2
1 " N2
S CU) ’12—1/2 {a Z BM (X1) {m (X1) - 9m (19)}52'}
2:1 j=1—p2

41

By (2.32) in Lemma 2.5 with 7‘ (x) = m (2:) — 9P1 (:13), the above has order

-1 2 2 __ ‘ N1 2
Op (122 /\/||0||gollm—gpllloo(N1+p1)n 1) =0p(N2./N1/n)

 

For the second term

—1

 

2{Bl—P2,P2(x ) ”(BN2,p2(;1{x)}( 31923102)
1 7’ N1
{; Z Bj,p2 (Xi) *3in,“ 0%)}

-—1
1 T 1 T
ng1BP1) £3191 (gpl—m)}

k=1—p1
N2

 

j=1-P2

C
1(/f2) {n12— Z Bj 3,202( XDEinpl 0(1)}

—1
(1 BT 131,1) 1.1137791 (gp1 —m)}

N1

|/\

k=1-p1
N2

 

 

i=1—P2

 

|/\

C(f) N2 N1 2
1/2 2 {% 233102 z')5in,p1 (1%)}

h2 j=1—p2 2:1 k=1—p1

—1
6 —BT 13101) -71;Bp1(gp1—m)H.

H

 

 

The order of

 

N2 N1 2

2 { £23 val-Ema}

j=1_p2 k=1—p1

*
is Op (n5/Z(2p*+1)-1/2(2p +1)—1) according to (2.31) in Lemma 2.5. And with

42

probability approaching 1, (3.42) of Lemma 3.10 implies that

 

 

(”ABIFTI’iBPIY1 ”—1351 (gm ‘ m) H

3 CU) “Tl-1351 (3P1 ‘ mlll’

while lln—lBgl (gpl — m) H is bounded by

 

N1 1 n 2
Z {5 Z Bj,p1 (Xi) l9p1 — ml 09)}

j=1-p1 z=1

 

N1

2
1 n
s “gm —mlloo , 2 {521323191 mo}
1 2‘:

J=1—p

 

N1

=0p(h11’1) 2 “ﬁx {<Bj,p1’1>n}2’

. =1—
J=1—P1J p1

 

which is of order Up {121191 x ‘/N1 x (hi/2 + \/n_1hl_llogn)} = 01901114) by

(2.30) in Lemma 2.4. Combining them, the order of the second term is

Op(h1—1/2 X n5/2(2p*+1)—1/2(2p*+1)——1 x #191)
= 01, (n5/2(2P*+1)—1/2(2p*+1)—3/2)

= 0p(n3/2(2p*+1)—1)_
Putting the ﬁrst and second term together, we have established that

raw <x>l=0p (n3/2<2p*+1>-1)-

43

n=100, Conﬁdence level= 95%

 

f

 

 

 

 

0.6- l
0AM.
0.25 ‘ when“-.. . ...-«m»... .. __ .4

WW
-0.2* .
-0.4>

-o.5 (3 Q5

n=500, Conﬁdence level= 95%
0.6* l

0.4

0.2% a”:
0 W
-o.2-
-04,

~05 o

 

W

 

 

 

 

 

0.5

n=100. Conﬁdence level= 99%

 

1

 

 

 

 

 

 

 

 

 

0.6-
0.4-
0.2?” ~... “,., w-w-"w..- ,__
0. .
tel/W.
—o.4L
—o.5 c 0.5
n=500. Conﬁdence level= 99%
0.6- ‘
04W
0.2—‘— ’4
0W
-o.2-
-o.4-
-o.5 6 0.5

 

Figure 2.1: For data generated from model (2.22) (with 00 = .5, c = 100) of different

sample size n and conﬁdence level 1 — 0, plots of conﬁdence bands for variance (thick

solid), the linear spline estimator 6g 2 (cc) (dotted), and the true function 02 (m)
(solid). The bands are computed from bootstrap method.

44

 

 

   

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

n=100, Conﬁdence level= 95% n=100. Conﬁdence level= 99%
0.4-
03*
0.2:
0.1 - E
_ . 0 . i g,
-0.1 L g ‘ _0-1 _/-’\/\/\.4 S
‘04 .5 6 0.5 _ 48.5 6 0.5
n=500, Conﬁdence level= 95% n=500, Conﬁdence level= 99%
0.4» ‘ Q4»
0.3 4 0.3 ‘
0'2 \/\’\—/l 0.2 \XV‘
0.1\< 0.1M
-O.1 > * -0.1 *
' 45.5 o 0.5 '0-‘85 6 0.5

Figure 2.2: For data generated from model (2.22) (with 00 = .5, c = 5) of different
sample size n and confidence level 1 — (1, plots of conﬁdence bands for variance (thick

solid), the linear spline estimator (“7% 2 (3:) (dotted), and the true function a2 (:15)
(solid). The bands are computed from bootstrap method.

45

 

variance conﬁdence band, p—value=0.20
2 . r . . . .

 

 

 

 

 

 

 

_o'%0 95 100 105 110 115 120 125

mean conﬁdence band, conﬁdence Ievel=0.99

T

 

I

0.7075 . ~

I
O

0
L

0.7074 '

0.7073 \ ‘ /

0.7072 . -

I

l

 

 

 

90 9‘5 160 165 1io 115 1éo 125

Figure 2.3: For the fossil data, plots of variance conﬁdence bands (thick solid) com-
puted by bootstrap method, the linear spline estimator 6% 2 (2:) (dotted) and a con-

stant variance function that ﬁts in the conﬁdence band (solid). The lower picture is
the data scatter plot and the conﬁdence band for mean (thin solid).

46

variance conﬁdence band, p—value=0.008

 

3000

I

2000

I

1 000

 

 

 

I

-1000

 

 

 

_20000 1‘0 2‘0 3'0 4‘0 5‘0 60

mean conﬁdence band. conﬁdence Ievel=0.99
100 T . . . .

 

50- o .0. o "‘

—100

 

—150-

 

 

 

-200

Figure 2.4: For the motorcycle data, plots of variance conﬁdence bands (thick solid)
computed by bootstrap method, the linear spline estimator 6% 2 (x) (dotted) and a

constant variance function that ﬁts in the conﬁdence band (solid). The lower picture
is the data scatter plot and the conﬁdence band for mean (thin solid).

47

Chapter 3

Oracally efﬁcient spline smoothing
of N AAR models with

simultaneous conﬁdence bands

3. 1 Introduction

Non- and semiparametric smoothing has been proven to be useful for analyzing com-
plex time series data due to the flexibility to “let the data speak for themselves”. One
unavoidable issue in high dimensional smoothing is the “curse of dimensionality”, i.e.,
the poor convergence rate of nonparametric estimation of multivariate functions. Ad-
ditive regression model of Hastie and Tibshirani [26] has been adapted by Chen and
Tsay [6] to autoregression and found wide use in recent years to reduce dimension in
nonparametric smoothing of time series. A nonlinear additive autoregressive model

(NAAR) is of the form

(1
Y2- =m(Xi)+5.l-, m(x1,...,:rd) =c+ 2 m», (3:7), (3.1)
7:1

48

 

 

T n
where the sequence {lg-,Xz- }
2

:1 is a length n realization of a (d + 1)-dimensional
strictly stationary process, the d-variate functions m (-) and a (-) are the mean and
standard deviation of the response Y,- conditional on the predictor vector Xi =
{X,-1,...,X,-d}T, and E(s,- |x,) = 0,E(e§|x,) = 02(xi). In the context of
NAAR, each predictor Xz-7,1 _<_ 7 _<_ d can be observed lagged values of Y2" such as
Xiy = Yi—fy’ or of a different times series. The component functions {mry()}g=1

are subjected to the identiﬁability condition Emry (X,- ) E O, 1 S 'y S d.

Inference of model (3.1) centers on the estimation and testing of {my (-.)}g=1
The marginal integration method of Tjostheim and Auestad [68] and Linton and
Nielsen [43] came with asymptotic distribution, which was extended in Sperlich,
Tjostheim and Yang [65] to include second order interactions. Other related works in-
clude Fan and Li [17], Yang, Park, Xue and Hardle [78] and Lu, Lundervold, Tjostheim
and Yao [44]. The backﬁtting idea promoted by [26] was made rigorous in a more
complicated form of smooth backﬁtting by Mammen, Linton and Nielsen [47] and
popularized by Nielsen and Sperlich [55] . These kernel based methods are extremely
computational intensive, limiting their use for high dimension d, see Martins-Filho
and Yang [48] for numerical comparison of these methods. Spline method of Stone
[66] had been extended in parallel to NAAR models in Huang and Yang [29], which
are fast and easy to implement but lack of limiting distribution. For applications of
additive model in medical and environmental research, see Liang et al [41], Roca—
Pardinas, Cadarso-Suarez and Gonzalez-Manteiga [57] and Roca—Pardiﬁas, Cadarso—

Suarez, Tahoces and Lado [58].

The two-step estimators of Linton [42] for model (3.1) possess oracle efﬁciency

and are theoretically superior to the aforementioned estimators of {772/7 (-)}idy=1. If
d
all com onents {m - } and the constant c were known and removed from
p 3‘) ﬂame

n
the responses, one could estimate m7 () from the univariate data {la-,7, X17}. 1 in
Z:

49

 

 

n n
which {Yiry} 1 are latent oracle responses to the 7—th covariate {Xi7}- 1,
Z: Z:
d
Yi'y =m7 (X737) +Ez' =Yi—C-ﬂ Zﬁ¢ m5 (Xm) ,1 Sigml SySd. (3.2)
=L 7

d
ﬁ=Lﬁ¢7

,. n
initial kernel estimates, create a pseudo univariate data Y”, Xi'y}- 1, and estab-
7,:

The key idea of [42] is to replace the true {mﬂ ()} and 0 above by some

lish the asymptotic equivalence of kernel / local polynomial estimators of my () using
either unobservable {127,X2-7}:=1 or {72-7, Xi'y}:=1- Recently, faster oracally ef-
ﬁcient estimators have been developed for NAAR time series data by Horowitz and
Marnmen [27], Wang and Yang [71], making use of orthogonal series/spline initial
estimates. The second step estimation is done by kernel method, with pointwise
asymptotic distribution. For the sake of discussion, we call the two-step estimator of

[42] kernel+kernel, of [27] orthogonal series+kernel and of [71] spline+kernel.

For the NAAR time series models, however, none of the existing methods pro-
vide any simultaneous conﬁdence band for my (-.) To address this need, we propose
an all new spline+spline oracally efﬁcient estimator that is theoretically superior as
it comes with an asymptotically simultaneous conﬁdence band for my (-), and also
computationally more expedient than any existing estimators due to the use of spline
instead of kernel in all steps. The asymptotically simultaneous conﬁdence band is
that of an univariate regression function in Wang and Yang [72], and is most con-
venient for inference in the global shape of function m7 (). Such conﬁdence band
methodology has been applied to compare the dependence of corn, soybean and wheat
crop yields on wetness index under various conditions, see Huang, Wang, Yang and
Kravchenko [30]. The spline+spline method is asymptotically oracally efﬁcient as the
spline+kernel method of [71], but can be hundreds of times faster in terms of com-

puting, see the comparison in Table 3.2. We see little hope of further reducing the

50

computing burden for model (3.1) over the proposed spline+spline method and still
retaining the simultaneous conﬁdence band and oracle efﬁciency. It seems that the
only alternative worth exploring is to use penalized spline instead of B spline smooth-
ing in the second step. For theoretical properties of penalized spline smoothing, see
Kauermann, Krivobokova and Fahrmeir [36] and Krivobokova and Kauermann [37].

The chapter is based on a published work Song and Yang [64]. The rest of the
chapter is organized as follows. Section 3.2 describes the spline-backﬁtted spline
(SBS) estimators and presents the main theoretical results. Section 3.3 illustrates the
idea of proof via decomposition of error. Simulation results are showed in Section 3.4.

Most of the technical proofs are in the Appendix.

3.2 The SBS estimator

In this section, we describe the spline-backﬁtted spline estimation procedure. For
convenience, we denote vectors as x = ($1, ..., crd) and take [I - I] as the usual Euclidean
norm on Rd, i.e., ”x“ = 1(Zd=1$%’ and I] - [[00 the sup norm, i.e., IIXIIoo =
SUPIS’YSd [x7|. In what follows, denote Y = (Y1, ..., Yn)T the response vector and
(X1, ..., Xn)T the design matrix. We denote by 1k the k-vector with all elements 1,
and Ikx k the k x k identity matrix. Throughout this chapter, we denote the space
of the second order smooth functions as 0(2) [0, 1] = {m lm” E C [0, 1] }.

While X7 may be distributed on (—00, oo), estimation of m is carried out only on
compact intervals, and without loss of generality, we take all intervals to be [0, 1] , 1 S
'7 S d. Let 0 = to < t1 < < tN+1 = 1 be a sequence of equally spaced
knots, dividing [0,1] into (N + 1) subintervals of length h = hn = 1/ (N + 1) with

1/5

a preselected integer N ~ n given in Assumption (A5), and let 0 = t6 < t’f <

.. < 15R“, +1 = 1 be another sequence of equally-spaced knots, dividing [0,1] into
(N* + 1) subintervals of length H = Hn = (N* + 1).1 where N* ~ n2/5logn is

51

another preselected integer, see Assumption (A5). Next, we deﬁne the constant spline
basis I Jul: for step one and the linear spline basis b J for step two de Boor ([10], page
89) as follows,

10($)EI,0<CC<1

1 J*Hg:c<(J*+1)H, * *
0 otherwise,

bJ(a:) = K64]

 

),1SJSN+1,K(u)=(1—[u|)+.

We denote by G7 the linear space spanned by {b J (17)}923 = {1, b J (2:7)}9211,
whose elements are called linear splines, piecewise linear functions of 237 which are con-
tinuous on [0, 1] and linear on each subinterval [t J, t J +1] ,0 S J S N. We denote by

N +1
071,7 C R” the corresponding subspace of R” spanned by {1, {bJ (Xi?) }:—1}J—-1 .

Similarly, deﬁne the {1+ dN*}-dimensional space 0* of additive constant spline
. d,N* .
functions as the space spanned by {1, I J4: (3:7) }7=1,J*=1’ and the corresponding
11 d,N*
..-... n {n {an (m
with probability approaching one, the dimension of 071,7 becomes N + 2, and the

astlCR". Asn——>oo,

dimension of 0,"; becomes 1 + dN*.

The function m (x) has a multivariate additive regression spline (MARS) estimator
rh (x) = ﬁzn (x), the unique element of 0*, so the vector {ﬁt (X1) , ...,ﬁr (Xn)}T E
G; best approximates the response vector Y. For spline regression, we introduce the

following weights,

17

W- = 1(0_<_X2-7£1),1§i§n,137§d, (3.3)

1 S i S n, (3.4)

W; = 1(0 3 X,- g 1) = ngle-y,

52

w* = diag(wf,...,w;;),

and impose on additive component functions the identiﬁability condition
Em,(X,-,)W?":0,1<ny<d (3.5)

Deﬁne next a weighted spline estimator of m as

d N*
_ )—argmin: {Y,— X,)}2 W,-’"= ;\0 + Z: 2 AJ* ,,IJa: (,xry) (3.6)
960* 7=1J*=1

where (ABA/1,1, ..., S‘lV“ d) is the solution of the weighted least squares problem

{16,13,1,...,1;V*,d}T

d N* 2
= Y,-—)\ —- ,\ I x- W-*.

Pilot estimator of each component function is

3137): Z 19*,7 IJ*( 1'7) —n_ 1:211J*(X' W) W-* , 137361617)
J*=1

which satisfies the empirical analog of (3.5): 1),—1 2:1:1 7717 (X, 7) W; — —0, 1

|/\

7 S (1. These pilot estimators are used to deﬁne pseudo-responses 17,7, V1 S 'y S

d, which approximate the “oracle” responses Y,-,7 in (3.2). Speciﬁcally, we deﬁne
‘ _ . d ~ . _ — _ —1 - -

Y,7 — Y,- — c — Zﬁ=1ﬂ757 mﬁ (X23) , where c — Yn — n zy=1Y,-,wh1ch IS a
ﬁ-consistent estimator of c by central limit theorem for strongly mixing sequences.

Correspondingly, we denote vectors

- . . T T
Y», = {Y1,,...,Ym] ,Y7 = {Y1,,...,Ym} . (3.8)

53

We deﬁne the spline-backﬁtted spline (SBS) estimator of my (SE/7) as 722,333 (27)

based on { Y,.,,X,,,}n=1, which attempts to mimic the would-be spline estimator
22

m7 S (:07) of my ($7) based on {Y ,7,X ,.,},_1 if the unobservable “oracle” re-

22
sponses Y- , were available. To be precise, for 0 S x S 1,
27 ,=1 7

n
. . . 2
m'y,SBS (any) = argmin Z {Yi’Y — 97 (X,,./)} Wi’Y’
92607 2:1

2727,8(357) = grgexrg:Z{i/,,— g,(x X,.,)}2W,-,. (3.9)
7

Before presenting the main results, we state the following assumptions.

(A1) The additive component functions m7 (3:7) 6 C (2) [0, 1] ,V7 = 1, ..., d

(A2) There exist positive constants K0 and A0 such that a(n ) < KOe e_)l0n holds

for all n, with the a-mizing coefﬁcients for {Z,- = (Xi’ 5M} _ deﬁned as

a(k)= sup [P (BﬂC)—P(B)P(C)], 1:21.
BEo{Zs,sSt}, CEa{Zs,s>t+k}
(3.10)

(A3) The noise 5, satisﬁes E (5, IX,- )- — 0, E ( 2 [X,- ) = o2 (X, ),E (lg-[2+6 IX, )<
M, for some 6 > 1/2 and a ﬁnite positive M5 and a (x) is continuous on [0, 1]d,

0<chinf

,do(x) S sup ]da(x) S CU < 00. Consequently, for

xE[0,1 x€[0,1
'y = 1, ..., (1,027 (3:7) = E {02 (X) [X7 = 3:7} satisﬁes also
co S infx7€[0,1]‘77 (3:7) S sup$7€[0,1] any (237) S C0.

(A4) The density function f (x) of X is continuous and 0 < c f S inf 1],, f (x)

xE[0,
S supx€[0,1]df(x) S of < 00.

(A5) The number of interior knots in estimation step one N * ~ n2/ 5 log n, i.e.,

54

cN*n2/510gn S N * S CN*n2/510gn for some positive constants CN*’ C N*'
1/5

The number of interior knots in estimation step two N ~ 22 .

Remark. 1. The smoothness Assumption (A1) is nearly minimal. (A2)-(A4)
are typical in the nonparametric literature, for instance, Fan and Gijbels [14]. For
(A5), the optimal order of N in the second step ensures bias and variance trade-
off. Theorem 3.4 on the oracle efﬁciency of 722,533 ($7) remains true if N * is of
the more general form n2/5N’, where the sequence N ’ satisﬁes log(n)/N’ = 0(1),
n’gN’ —» 0 for any 6 > 0, see Proposition A.1., A2 and Lemma A.1., A2 for the
proof of Theorem 3.4 in Appendix.

Remark. 2. Assumptions (A1)-(A4) are satisﬁed by many commonly used time

series models, such as those in Chen and Tsay [6].

Theorem 3.4. Under Assumptions (A1) to (A5), as n —> 00, the SBS estimator

m a: and the oracle smootherﬁi :5 given in 3.9 satisfy
7,SBS ’7 7,8 ’7

sup [7227,3135 (x7) — 772,75 (1:7)] = Op (n—2/5 (log n)_1) .
11:76[0,1]

Theorem 3.4 provides that the maximal deviation of 7227,3133 (11:7) from 2727’s ($7)
over [0,1] is of the order Op (n—Z/E’ (log n)_1) = op (n’2/5 (log n)1/2), which is
needed for the maximal deviation of 722,335 ($7) from my (2:7) over [0, 1] and the
maximal deviation of 2727’s (2:7) from m (1:7) to have the same asymptotic distri-
bution, of order n—2/5(log n)1/2. The estimator 1727,3133 ($7) is therefore asymp-
totically oracally efﬁcient, i.e., it is asymptotically equivalent to the oracle smoother
2727’s (x7) and in particular, the next theorem follows. The simultaneous conﬁdence
band given in (3.11) has width of order n72/5(log n)”2 at any point :57 6 [0,1], con-
sistent with published works on nonparametric simultaneous conﬁdence bands such

as Xia [75], Claeskens and Van Keilegom [7].

55

Theorem 3.5. Under Assumptions (AU-(A5), for any p 6 (0,1) , as n —» 00, an

asymptotic 100 (1 -— p) % simultaneous conﬁdence band for m7 (m7) is

mmses (an) n 2&7 (2,) {MT (1,) at,» (1‘7) log (Lg—1) f7 (no

nh}1/2 [1 — {210g (N +1)}_1 [log (p/4) + glog {4n log (N +1)}]] , (3.11)

where [77 (x7) and f7 (11:7) are some consistent estimators of 07 (11:7) and f7 (x7),

313:7): min{[$7/h] 1N} ,6 ($7) = {1‘7 - tj(x,,) } /h,

and

A($v)= Cj(n:,)_1{1-5(='3~7)} C]: \/§ j=0,N+1 ,
cj($7)6(:c7) 1 1SjSN

l- . l- .
J+1,]+1 ]+1,]+2 ,OSjSN,
lj+2,j+1 lj+2,j+2

where terms {lik}li—k|<1 are the entries of the inverse of the (N +2) x (N + 2)

 

 

matrixMN+2,
(1 72/4 0 )
ﬁ/4 1 1/4
1/4 1
MN+2= 1/4
1/4 1 ﬁ/4
(0 75/4 1 )

We refer the proof of the theorem to Wang and Yang [72].

56

3.3 Decomposition

In this section, we provide insight on the proof of Theorem 3.4. Recalling the notaion
of W,” and W,,, deﬁned in (3.4), (3.3), for any functions ¢,<p on [0,1]d, deﬁne the
empirical inner product, empirical norm and empirical mean restricted on [0,1]d
as MAME,” = "7‘1 Z?=1¢(Xz') r09) W5", “25”?” = n’1 23:1 <22 (Xi) W,*,
Eﬁqb = n—1 Zil=1 45 (X,) W2?k = (1, 55);,” respectively. In addition, if functions d), «p
are L2[0,1]d—integrable, deﬁne the theoretical inner product and its corresponding
theoretical L2 norm as (a, (p); = E {gs (x,) (p (x,) W,*} , ll¢ll§2 = E {(22 (x,) W,*}.
A function (f) is called theoretically centered (empirically centered) if E¢W,* = 0
(Eﬁqﬁ = 0). The additive component function m7 and its pilot estimator m, deﬁned
in (3.7) are therefore theoretically centered (empirically centered). In the second
step, for any functions <13, (,0 on [0,1], for any 1 S '7 S d, similarly deﬁne (q‘), 90>qu =
n—1 23:, (:5 (X17) «n (X.- ) w,,, Inna”, = 5122:, n2 (x,,) w,,, Enn¢ =
72—1 231:1 (1) (X20) W,,, = (1, (Mann respectively. In addition, if functions 45, (p are
L2 [0, 1]-integrable, deﬁne the theoretical inner product and its corresponding theoreti-
cal L2 norm as <¢n§0)2,7 = E {4’ (Xi )‘P (Xi'y) Wi'y} a ”ME, = E {4’2 (Xi’y) Wi'y}'
The function space 07 introduced in Section 3.2 is expressed more conveniently

for asymptotic analysis via the following standardized B spline basis

 

b (.2:
BJn'Y (x7) = ,,ZJ||;:,0 S J S N + 1. (3.12)
. . . d,N* . . .
leerSe, 0* 1s spanned by {1, B3,, ,7 (2:7)} 1 J* 1 , in which the new theoretl-
) ’Y: ’ :

cally centered and standardized B spline basis are

by: (5’37)
83*,(1‘7) =—’—7———,1 S’ySd,1 S J* SN*, (3.13)

’ =2
bJ*,7

 

 

 

lg
57

in which

b}*,7 (33,7) :1 J... +1” (1:7)— Cﬂ ’71J*,7(a:7), (314)
CJ*,'7 =’<1[J*7'>2
Simple linear algebra shows that
d N*
m(—_—x) i0+ Z Z I\J*VBJ*H(x),xe[0,1]d (3.15)
7=1J*=1

where (X0, 31,1, ..., 5‘N* d) are solutions of the following least squares problem

{30,11,1,...,:\N*,d}T

d N* 2
= argmin Z{Y,-— A0—: 2 AJ*7B* J... (X,,)} Wi*.(3.16)

Rd( N*)+1z' 1 7=1J*=1

Deﬁne for any n—dimensional vector A = {Az- ”:1, the spline function constructed
from the projection of A on the inner product space (Gn, (a 92,") as PnA (x) =
A * A o o A A A -

A0 + Eff/=1 ZIJV*=1 AJ*:’YB;*,7 (x7) , w1th coefﬁCIents (AO’A1,1"”’AN*,d) given
in (3.16) with Yi’s replaced by Ai’s. The multivariate function PnA (x) has empiri-

cally centered components Pn,7A (1:7), 7 = 1, ..., d

J*=1

The estimators Th (x) ,m, ($7) in (3.15) and (3.7) are rewritten as ﬁt (x) = PnY (x) ,
iii/7 (x7) = Pn,7Y (:37). For linear operators Pn, Pnﬂ, 'y = 1,...,d, using the

relation Y = m + E, where the signal and noise vectors are m = {m (Xi) }?=1 ,E =

58

{5i}?=1’ one has the following decomposition for ”y = 1, ..., d
m (x) = Th (x) + E (x), Th7 (x7) = my (x7) + 57 (x7) , (3.18)
in which the noiseless spline smoothers and the variance spline components are

m (X) = an (X) ,Thly (1277) = Pn,rym (IE7) ,

0m
A

N
v

II

T
Additionally, we can write §(x) = 5*TB* (x), 5* = {56,d’f,1,...,&}‘v*,d} =

—1
(B*TW*B*) B*TW*E, where vector B* (x) and matrix B* are deﬁned as

B* (x) = {1,3111 (31),...,B}"V*,d (1,1)}, 3* = {3* (X1),...,B* (xn)}T.
(3.20)

Clearly 5* equals to

—1

T

0 3* ,B*
dN* < J* ’7 J*I’7,>2,n 13,737,561)
ISJ* ,J*,SN*

1
“ﬂ 231:1 Wit-‘2’

1 .
lsvsd

1

where 019 is a p—vector with all elements 0.

The second step spline smoothing is interpreted similarly. For notational sim-
plicity, take 7 = 1 and denote Xi,-1 = (Xi2,...,Xz-d)T for 1 S i S n, and x_1 =
(2:2, ...,xd)T. Denote 83*,_1 (x_1) = (83*,2 (3:2) , ..., B3*,d (15(1))T, and so m_1 (x_1),
Th_1(x_1), Th_1 (x_1) and E_1(x_1). Deﬁne B(:c1) = {80,1(331),...,BN+1,1(:1:1)},

59

. T -1 T -
B = {B (X11) , ...,B (Xln)}T,thenm1,SBS ($1) = B (331) (W3) Bﬁ—wvl,

T -1 T ..
7711’s (:51) = B (3:1) (B—n‘ﬂ) —Bn——WY1, where Y1 and Y1 are deﬁned in (3.8).
Making use of the deﬁnition of 6 and the decomposition (3.18), the difference
between the smoothed backﬁtted estimator Th1,SBS (x1) and the smoothed “oracle”

estimator 7721’s (131) , both given above, is

 

BTWB) ’1 BT

n

TIl1,s(5’31)_ TA”1,SBS (5’31) = B (x1)( W (Y1 _ Y1)

T -1 T
- .....(B WE) (1......3...)

n n

 

‘1'), and \I'v are the following vectors

n T T N+1
_1 -
‘I’b = {" ZBJ,1(X2'1)W; {m_1(xi,-1) —m_1(Xz’,-l) }1d—1} (321)
i=1 J=1
n N+1
—1 *~ T
‘I’v = {71. Z BJ,1(X7:1)WZ- €_1 (Xi,_1) ld—I , (3.22)
i=1 J=1

here we need the fact that W; Wi’Y = W23“.

According to Propositions 3.1 and 3.2 in Appendix, both of these two terms have
order 0;; (h1/2n_2/5 (log n)_1) = 0p (71—1/2 (log n)_1).

3.4 Simulation example

In this section, we carry out simulation experiments to illustrate the ﬁnite-sample be-
havior of SBS estimators. The programming codes are available in R, see http: //www.r-
project.org.

The number of interior knots N * and N for the spline estimation are calculated

60

as N* = min ([c11n2/5 logn] + CH +1, [(n/2 —1)d-1]), and N = [C21n1/5] +
on + 1, in which [a] denotes the integer part of a. Tuning constants Cll = 5, 021 =
3, on = em = I worked well, and we used them by default. The additional constraint
that N * S (n/ 2 — 1) d_1 ensures that the number of terms in the linear least squares

problem (3.16), 1 + dN*, is no greater than n/Z.

Alternatively, one can use BIC to choose the number of knots. To be speciﬁc, in
the second step, let qn = (1 + Nn) be the total number of parameters. Then N opt
is the one minimizing the BIC value. BIC = log(MSE) + qn log (n) /n, with MSE =

21:1{1/2- — 37,-}2/71. For computing speed consideration, we have not experimented

with this option in this chapter.

Consider the following nonlinear additive heteroscedastic model

Yt = Ed: sin (27rXt’y) + Q, 5t lid N (0,02 (Xt)) , (3.23)
721

—1 2
in which Xt = {Xt1,...,Xtd}T is generated as Xt,’ = (I) {(1 — a2) / ZW} -

1 / 2, 1 S '7 S d where the Zm’s follow a vector autoregression (VAR) equation

—1
2 ~ N 0,1—a2 2 ,z =aZ_ +e,e ~N(0,E),2StSn,
I d t t1 t t

2 = (1_p)Idxd+p1d1§’ a=0.3, 0<p<1,

—1
with stationary distribution Zt = (Zt1,...,th)T ~ N (0d, (1 — a2) )3). Hence
X n_ is a sequence of geometrically strong mixing random variables with marginal
t t—l

distribution U [—0.5, 0.5]. The standard deviation function is

_ 1 5‘exp(zi=1lXtTI/d)
0(Xt) — 00g . 5 +exp (251:1 lXt’rl/d),

61

 

0’0 = 0.5,

which ensures that our design is heteroscedastic.

The SBS estimator #17533 (:07) and the oracle smoother 7717’s (:57) are com-
pared in terms of coverage probabilities of conﬁdence bands for sample sizes n =
100,500,1000, with conﬁdence level 1 — p = 0.95. Table 3.1 contains the coverage
probabilities as the percentage of complete coverage of the ﬁrst true curve sin(27rx)
at all data points {th}?:1 by the conﬁdence bands in (3.11), over 500 replications
of sample size n, for d = 4, 10 and p = 0, 0.3. The results are satisfactory as the em-
pirical probabilities rapidly become greater than the nominal probability of 0.95 as n

becomes large. To show that the SBS estimator m'ySBS ($7) is as efficient as the

Table 3.1: Coverage frequencies from 500 replications.

 

 

 

 

 

r n = 100 n = 500 n = 1000
0 0.86 0.972 0.966

d = 4 .3 0.876 0.956 0.964
0 .0848 0.974 0.97

d = 10 .3 0.842 0.962 0.966

 

 

 

 

 

oracle smoother ﬁlms (10,), we deﬁne the empirical relative efficiency of m7,SBS (:57)

with respect to 7727’s (1:7) as

(0<Xt7S1)

Z?=1{m%SBS(Xt7)m”721(XW} (oth~,31)

 

8H7 =

. 1/2
221:1 {mus (Xi?) "‘7 (Xt7)}21 l
) (3.24)

 

 

Theorem 3.4 indicates that the effry should be close to 1 for all ’y = 1, ..., d. Figure
3.1 and 3.2 provide the kernel density estimators of the above empirical efﬁciencies
computed over the 500 replications. Again, these plots show that the empirical dis-
tribution of e37 does rapidly converge to the point mass at 1 as n becomes larger.

Finally, Figure 3.3 and 3.4 show typical examples of the SBS estimator with the conﬁ-

62

dence bands in (3.11) and the corresponding empirical relative efﬁciencies. The plots
in these two ﬁgures illustrate graphically the summarized results on conﬁdence band
coverage and on the empirical relative efﬁciency.

Lastly, we provide the computing time of model (3.23) with dimension d = 10
from 100 replications on an ordinary PC with Intel(R) Quad CPU 2.4 GHz processor
and 3.0 GB RAM. The average time run by R in seconds to generate one sample of
size n and compute the SBS estimator and spline backﬁtted spline (SPBK) estimator
of [71] has been reported in Table 3.2. As expected, the computing time of SBS is
hundreds time faster than SPBK and this advantage widens with increasing sample

size.

Table 3.2: Comparison of computing time of Model (3.23).

 

Method 72. = 100 n = 500 n = 1000
SPBK 0.09 7.8 54
SBS 0.007 .064 0.32
Ratio 12.88 121.88 168.75

 

 

 

 

 

 

 

3.5 Appendix

Throughout this section, an >> bn means nl—i+moo bn/an = 0, and an ~ bn means
nimoo bn/an = c, where c is a nonzero constant. Whenever we write ~ 1 for some
quantity that depends on 0 S J * S N * or 0 S J S N + 1 it means it holds for all

possible J* or J values as n —-> oo.
A.1. Propositions

Recall from section 3.2 that “‘1’me = SUPO<J<N+1 I {‘Ilb}{lv=+01 |. In this sec-
tion, we show that the bias term ”‘I'blloo of (3.21) and the noise term \Ilv given in

(3.22) are uniformly of order Op (h1/2n_2/5 (log n)‘1).

63

Proposition 3.1. Under Assumptions (A1) to (A2), and (A4) to (A5)

“‘1'me = 019(h1/2 (n_1/2 + H)) = 01; (h1/2n—2/5 (logn)_1).

Lemma 3.1. Under Assumption (A1), there exists 9 (x) = C+EEIY=197 (3:7) 6 0*,
such that for ﬁt deﬁned in (3.19),

*
= 019(n—1/2 +H).
2,n

d

Th - 9 + 2 (1,97 (X7)>2,n
7:1

 

 

Proof. By the result on page 149 of [10], there exists a constant Coo > 0 and spline
functions 97 E 0*, such that “97 — m7|loo S Coo ”leHOOH, 7 = 1,2, ...,d. Thus
llg — mnoo _<_ 247:1”97 — MM.» s 000 29,21 llmslloo H and um - mug,” s
”9 - mllin S 000 Zg=1llm7lloo H. Noting that ”ﬁt — gllin < ”Th - mllin +

* d I
“9 - mll2,n S 2000 27:1 “mfyhoo H, one has

 

 

 

 

 

lam (stml s [M (W... —<1,m~,<x7>>;,n + |<1’m7 (x7)>3,nl
s CoollmgllooH+0p (72—1/2). (3.25)
So
(1 *
Tit-9+ X (1,97 (X7)>2,n
7:1 2,n
d
S llm-QHEJML Z l<1v97 (X7)>2,nl
=1
g 300. i ”mill... H + 0,.(n—1/2) = a, (..,—U2 + H).

Proof of Proposition 3.1. Clearly that “‘I’blloo S R1 + R2 + R3, where

 

 

 

N+1 _1 n d * *

R1: SUP n X Z BJ,1(X2'1)W2' <1,97 (Xi7)>2n ’
J=0 t=17=2 ,

N _.
R2 =Sljfgn IZQBJM i1)Wi*7{9 (Xi7)—m7(xi'l')}(’
— i=17=2
R3 = [XI-1E} ”—1: EngJ,“ Xi1)Wz'*
J=0 i—17—2

 

 

X {7712 (X27) ‘ 97 (xiv) + (1’97 (Xi7)>;n} '

According to (3.25) R1 = Op{h1/2 (H + n—1/2)}. For R2, using the result on

page 149 of [10], one has R2 S Cooh1/2H. To deal with R3, let B32” (x7) =

a:
B3,,” (x7) — <1, BJ*,7 (X7)>2,n’ for 1 S J* S N*, 1 S 7 S d, then m(x) —
a:
g(x) + 236:1 (1,97 (1(7)); n = &* + 261:1 £1};— _1 {13*HB3: (x7). Denote
d
next wJJ*,_1 (XI): {wJ J*,’7 (Xl)},y___2 ,qu J*,_1= qu J*,7 7:2, where

tau...” (x1) = 1_3J,1(X,1)B},.,,y (X17) Wf, ”WJ,J*,7 = Bow...” (x,). (3.26)

Thus,

T T
71,: BJ, 1(Xi1)W {“771{ -1 (sz1)T — 9_1 (X11) + En9_1 (X233) }1d—1

n
i=1 J*= -1

65

bounded by

N*
(d — 1) sup
2S'st J*Zzl

 

*
“J*nl

sup sup
1SJSN1SJ*SN*

 

 

)

TL
—1
11 Z wJ,J*,’)’ (X
i=1

n—IZBJ1(X2'1)W; 3.1137 (X27)
i=1

 

 

N*
g (d—l) sup |&** I sup SUP
2931]}; J ’7 13J:N13J*£N*

) ,

where An,1 is in (3.35). By Lemma 3.11,

n
+ An,1 71—1 Z BJ,1(Xz-1)Wz-*

 

 

sup sup < *< *
ISJSN 1_J _N

 

 

'n
71—1 isz,J*,’7 (X?)

S SUP
1SJSN1<JS*SN*

 

 

nZwJMJ*,’7(X _ﬂwJ,J*,’)’

 

 

+ sup sup
1<J<N1<J*<N*

=0 (Iogn/f) +0p ((Hh)1/2)——.O p.((Hh)1/2)

“Wm

Therefore, one has

N +1
sun)

1
J_;ZBJ,(X1 2'1)W

{771-1 (Xi_11)T — 9-1 (X 1-1)T + En9_1(X_1)T}1d—1l

N* 1/2
S {1: (w) {op<<w>+ope:>}

297561 J*=1

 

 

66

1v* 1/2
2
= OP h1/2{ Z ((13%?) }

 

 

 

J*=1
d *
7:1 2
Thus, by lemma 3.1
R3 = 019(111/2 (n_1/2 + H)) . (3.27)

Combining (3.25) and (3.27), one establishes Proposition 3.1.

Deﬁne an auxiliary entity

5-1 — NZ aJ*, -1 BJ* ,-1 (x 1) (3'28)
J*=1
- - d - . . . . . .
where aJ*,_1 = {aJ*,7}7=2 and aft” IS glven In (3.21). Deﬁnitions (3.17) imply

that E_1 (:c_1) deﬁned in (3.19) is the empirical centering of 533 (x_1), i.e.
n
g__1(x1)=5_1(-)_$1—121E1(Xi_)1 (329)

Proposition 3.2. Under Assumptions (A2) to (A5), one has

||\Ilvlloo = 0p (Hh1/2) = 019 (h1/2n_2/5(logn)ﬂl) .

According to (3.29), we can write \Ilv = @182) — W9), in which
N 1 N+1
+
1n_2 T
{‘11)} )}J-O = g: BJ,(X1 21)W21W-I5_1(Xi’_1) 1d—l ’
_ 3,2' ’=1 J20
(3.30)

67

N+1 T
{W£2)}J_O= B—-n W*§* _1(X 1)T1d_1. (3.31)

where E:*1 (X_1) is given in (3.28). By (3.26), (3.21) and (3.28), we have

1" Ni: T
n_ 2 Z 3J*,_1wJ,J*,_1(x_1) . (3.32)

”will =
0° l=1 J*=1

OSJSN+1

 

Proposition 3.2 follows from Lemmas 3.2 and 3.3.

Lemma 3.2. Under Assumptions (A2) to (A5), ﬁll/£1) in (3.30) satisﬁes

11$,”

 

 

= Op {h1/2N* (logn)2 /n}.

lOO

Proof. Based on (3.28), ”n—1 23:15: (Xi_1)T ld—lwi*

 

 

is bounded by
00

}.

&3*,7| S {N* (5*T§*) }1/2 = 019 (N*n—1/210gn) .

   

 

N*
(d—l) sup a** sup 3* * Wf"
QSVSCI J*=1 J ’7' 1<J*<N* 71%: J ,’Y( X37) 2

                         

Further, by (3.35),

sup sup
QS’YSdISJ*-<_N*

 

71—1 Zn: 133...,7 (Xh) w;
i=1

S An,1= 0p (n_1/210gn),

 

SO

= 0,, {N*(1ogn)2 /n} . (3.33)

l n "* X- T1 W*
"ZRM 2-1) (11—1 2'

 

 

68

By
n
sup n_1 2: BJ,1(Xi1)Wz'*

OSJSN—l-l i=1

OSJSEIfV+1(<1’BJ’1>2an _ <1’BJ’1>2) + OSJSEII)V+1<1’BJ’1>2

Op(log n/\/r_z) + Op (hl/z) = 0p (hl/Q) .

Thus with (3.33) the lemma follows immediately.

Lemma 3.3. Under Assumptions (A2) to (A5), we have @182) = Op (Hh1/2) .

OO

 

 

 

 

Lemma 3.3 follows from Lemmas 3.14 and 3.15.
A.2. Preliminaries

We ﬁrst give the Bernstein’s inequality for geometrically 7—mixing sequence, which

is used often in many of our proofs.

Lemma 3.4. [Theorem 1.4, page 31 of Bosq [3]] Let {Ebt E Z} be a zero mean real
valued a-mixing process, Sn = 222:1 52-. Suppose that there exists c > 0 such that
fori = 1, ...,n, k = 3,4, ...,E léilk S ckhzklEiﬁé2 < +00, then for each n > 1, integer
q E [1,n/2], each s > 0 and k 2 3

2 n 2k/(2k+1)
P(lSn| 2: ne) S a1.exp(—Fmg£;—5—cg) + a2 (k) a ([2133]) ,
2

2
wherea- is the a—mizin coe cient in 3.10 anda =2ﬂ+2 1+ 5 ,
() 9 1373 { ) 1 q ( 25m22+5c5)
5m2k/(2k+1)

a2(k)=11n 1+ k e , with m7- = maxlSiSn ||g,-||r, r 2 2.

 

Lemma 3.5. Under Assumptions (A4) and (A5), one has:
(2') |

 

2
b3*,7ll2 ~ H, where 53*,7 is given in (3.14).

69

(ii) for any 7 =1,2,...,d,

E{BJ* (X,- 7)BJ*, (X77)W7-*}~1,

for J*’ —J* g 1, and

 

 

E {BM (X22) BJ’a (X13) W23} N 1’

for |J’ — J. g 1. In addition,

E ~ Hl—k,

13*... 7,-(X 7)BJ*, (X77 )W-*k

ElBJv (X203 J’,7 (Xi?) Wiilk “'hl k

 

 

for k 2 1, where 33* 7 and BJ,7 are deﬁned in (3.13) and (3.12).

Lemma 3.6. Under Assumptions (A4) and (A5), there exist constants C0 > CO > 0

such that for any a* = (a6,a’i‘,1,...,a*N*,1,ai2,, aN* 2, a*1d,. ,a*N* d)’

*2
c0 a62+ Z afﬁx), S a6+ Z a;*,783*,7 SC'O a0 2+ Z aft,7
J*37 J*37 2 J*37
(3.34)
Lemma 3.7. Under Assumptions (A2), (A4) and (A6), one has
* *
A 1 = sup 1 B* ,.. 1 B* ,,. l (3.35)
n’ 13J*§N*,7 < J ”>2” < J ”>2

 

= Op (71—1/2 log n) ,

7O

A712 (3.36)
= sup

8* ,B* > _ <B* ,B* >
1<J*,J’*<N*,7 < J*,’)’ J/*,,7 2,n J*n, JHKY 2
= Op (n—1/2H_1/210gn),

A773 (3.37)

* at
B** 13*! I> _<B** 18*! l>
< J17 “’7 2,n J ’7 “’7 2

 

 

= sup
1SJ*.J’*SN*.7#7'

= Op (11—1/2 logn) .

 

 

Lemma 3.8. Under Assumptions (A2), (A4) and (A6), one has

 

An = sup

l<91,92)§,n-(g1,92)§l_ ( logn
9792601 |l91||§||92|l§ p

Denote next by V as the theoretical inner product of the B spline basis
{1,B"'},,.,7 ($7) , J* = 1, ...,N*,7 = 1, ...,d}, i.e.

T
1 0 ,..
V: M at (3.39)
0 * 3* 3* /
dN J*,7’ J*’,7’ 2 137.731.
1SJ,J’SN*
Let S be the inverse matrix of V, i.e.,
T T ’1 T T
1 ON ON 1 ON ON
_ —l-_ _
S—V — 0N V11 V12 ’ 0N S11 S12 - (3'40)
0N V21 V22 0N S21 322

Lemma 3.9. Under Assumptions (A4) and (A5), for V, S deﬁned in (3.39), (3.40),
there exist constants CV > cV > 0 and CS > 63 > 0 such that CVIdN*+1 S V S
CVIdN*+1a CsIdN*+1 ‘5 S S CSIdN*+1°

71

We refer the proofs of Lemmas 3.5 to 3.9 to Lemmas A.2, A.4, A.7, A.8 and A9
in [71].

Lemma 3.10. Under Assumptions (A2) and {A3}, there exist constants c( f ),

C (f) > 0 independent of n, such that as n —> 00, with probability approaching 1,

—1
GBTWB) c g

—1
c(f)||<l|2 s (TGBTWB) CsC(f)IIC||2,V<€RN+2- (3.42)

C(f)|C| S C(fHCI, (3-41)

 

 

The lemma and its proof is based on Lemma 8.2 of Wang and Yang [70].
Lemma 3.11. Under Assumptions (A4) to (A5), for ,qu J* 1 given in (3.26)

sup sup

I: O{(hH)1/2}.
OSJSN+113J*SN*

 

” wJ,J*,_1

Proof. For 7 = 2, ...,d, J = 0, ...,N +1, J* = 1, ..., N*, by the boundedness of the

(Batman. (29.)):

[01.../0’BJ1(211)BJ. ((,udu7)Ifu1,...)du1.. .dud

(If/0 .../O IBJ,1(u1)B:;*,7(ury)ldu1...dud

= 0.01.1 tubs.” )‘1/1/1 (w (.,. WW

= (lle:1ll2lb}*,2)7ll {: f0 le,“1)’J*+1 (“7)duld“7+
(11.111.13.11) ———,~.-.—;——-,/. /. W) Wm}

72

IA

|/\

 

where CJ*,7 = <1,IJ*’,7>2.

sup sup //b 1(ul)1 * (u )duldu7=0{hH},
0ngN+11gJ*gN* J’ J ’7 7

and the proof of the lemma is then completed by (i) of Lemma 3.5.

Lemma 3.12. Under Assumptions (A2), {A4) and (A5), one has

 

 

 

n
_1 { .
sup sup n w :1: (X ) — ,u } 0p (log n/ﬁ) ,
OSJSN+115J*3N* I; J", '1 l “’J,J*-1 00
(3.43)
1/2

sup (1) * = Op (hH) , (3.44)

0<J<N+11<J*<N* -1: J", 00 ( )

 

 

 

 

where wJ,J*,_1 (X1) and l‘wJ’J*,_1 are given in (3.26).

Proof. For simplicity, denote wiﬂ‘ﬂ (XI) = WJ,J*,,Y (X1) — . Then

2 .
E {“13”},ka (Xl)} = Ew3,J*,7 (X1) — H3J,J*,'y’ whlle

qu,J*,7

Ewiﬂﬂ (X1) = E (BJ,1(X11)WI*B;*,7 (le))2

_ 1 1
(HbJJH2Hb3*,7H2) 2/0 .../0 (bJ,1(U1))2 (b§*,7 (12))2

f (11.1, ...,ud)du1...dud,

2
2 ~ 2 2 =1: _
EwJ,J*,7 (X1) 1 and EwJ,J*,7 (XI) >> qu,J*,7. Hence E {qu’Jm’7 (Xl)} _
Ew3,J*,7 (X1) — 113, J J* ,7 2 0* for n sufficiently large and some positive constant

T v
c*, When r 2 3, the r-th moment E IUJJ,J*,,7 (KIM lS

 

J*n

T/l‘”/01bJ,1(u1)T|bJ*ﬂ ()u’ylr f(U1,.-.,’ud)dul...dud.
(Hulk: I )0

 

 

73

It is clear that E IBJ,1(X11)WI*B;* 7

to Lemma 3.11, one has IEWJ,J*,’Y (XDIT = IEBJ’1(X”)WI*B}*17 (X17)

(X17)Ir ~ h(1_r/2)H1_T/2. According

~

 

7. r
(hH)T/2, thus E IUJJ,J*’,7 (Xl)l >> I . In addition, for any J and J*,

 

“WJ,J*,7

E * X r< _——C (T—Z) 1E * X 2
in7J*17( l)’ _ (hH)1/2 T. - leaJ*17( l)| ,

so there exists 0* = ch_1/2H—1/2 such that

7

r _ 2
Elwiﬂnmwl SC: 27!Eiw3,J*,7(xl)'

which implies that {wiﬂgﬂ (Xl)}::1 satisﬁes the Cramér’s condition. By the

Bernstein’s inequality, for r = 3
1 n
P ngEJﬂ'r (X1) 2 p”

2 6/7
S a1 exp — 2qpn + a2 (3) a ([L])
25m2 + 5c*pn (I + 1

with m% ~ h_1, m3 = maxlSz-Sn Ile,J*,7 (XZ)H3 _<_ {CO (2h’1)2}1/3 and

 

 

 

logn n p2 5mg/7
p =p—,a =2—+2 1+ n ,a (3 =11n 1+ .
n x/nh 1 q ( 25mg + 5c*pn 2 ) Pn

Since 53*Pn = 0(1), by taking q such that [6%] 2 cologn, q 2 cln/logn for
constants c0,cl, one has a1 = 0(n/q) = O(log n), a2 (3) = 0 (n2). Assumption
(A2) yields that

 

a ([4:1])6/7 s Cn’6A000/7.

74

Thus, for n large enough,

1M;-

By (3.45), there exists large enough .value p > 0 such that for any J *,

{

which implies that

 

x/n—Ti

 

 

2
(){lpplogn} S (in—62p logn + Cn2—6AOCO/7. (3-45)

§|H

n
;%*7()Xz

 

>)p(nh)—1/210gn}<n—10,1<J*<N*,

 

 

 

00 n
1 .. logn
P sup sup n w ,.. (X1) 2p
”2:31 {ongN+11gJ*gN* [:21 J", ’ Vnh

 

|/\
M8
ME
M3.
“U
A
:1
.1.
M:
E
K.
K-‘x-
‘3
3
IV
bI—d
§a
3“ 3
W

00
3 Z 1\I(N”‘)n—10 < oo
n=1

Thus, Borel—Cantelli Lemma entails that

 

n
—1 :1:
sup sup n w ,.. (X1) :01) logn/Vnh . (3.46)
ongN+11gJ*gN* 1:21 W ’7 ( )
Then,
1 n
-' *
sup sup n w ,.. (XI) :01; logn/Vnh.
ongN+11gJ*gN* 1:21 W "1 00 ( )

 

 

 

 

As a result of Lemma 3.11 and (3.43), (3.44) holds.

The next lemma provides the size of a*Ta * ,Where 5* is deﬁned by (3.21).

75

Lemma 3.13. Under Assumptions (A2) to (A5), 5* satisﬁes

5*T5*— — if.” + E Z 51327—0-13{N* (log n)2 /n}. (3.47)
J*= —17=1 ’

Proof. According to (3.20) and (3.21), 5*TB*TW*B*5* = 5*T (B*TW*E).
Thus

1
W*B*~* *2 = ~*T
H a ”2,n a <B* B

* 5*: 5*T (n_1B*TW*E).
ﬂ” J* ,7 >

2,72
(3.48)

By (3.38), “B*5*|I;,2n is bounded below in probability by (1 — An) “B*5*”;2. AC-
cording to (3.34), one has
*2
.. 2 ~ 2 ~ 2
||W*B*a*“; = a3 + Z 033,273‘3‘...’7 2 c0 a5 + Z a}, . (3.49)
J*a’l’ 2 J*,’)’

Meanwhile one can Show that a"‘T (n _1B*TW*E) is bounded above by

 

J*a

2 2 1/2
~*2 ~=I=2 1 'n. 1 n =1: *
i=1 1*,7 i=1
(3.50)
Combining (3.48), (3.49) and (3.50), the squared norm 5*Té* is bounded by

2

2 n
002(1_An)2{%i:::152} + Z {iZBE’kﬂ (X17) W553}
J*,7 i=1

Truncating e as in Lemma 3.15, Bernstein inequality entails that

76

"—1 23:1 52' + maxng*gN*,7=1,...,d ”-12?=1BJ*,7 (X23) Wigil =
Op (log n/ﬁ) . Thus (3.47) holds since An is of order 019(1) by lemma 3.8.

A.3. Proof of Lemma 3.3 We denote

T
v*— 0 odN,

* *

o B* ,B* — B* ,B* ,

dN* < J*a J*’,’7’>2,n < J*n J*’,7’>2 157’7'3‘11
1gJ*,J*’£N*

then 5* in (3.21) can be rewritten as
1 *T ’1 1 T -1 1 T
5*: (5B W*B*) (;B* W*E) = (V +v*) (53* W*E) . (3.51)
. - A A .. - T
Now deﬁne a = {a0,a1’1,...,aN,1,a1,2, ...,aN,2} as
a = v—1 (n—lB*Tw*E) = s (n*1B*Tw*E) , (3.52)

(2) -

and deﬁne a theoretical version of ‘11,, 1n (3.32) as

-(2, ” A” T
‘11?) = 71—1 Z Z &3*,_1wJ,J*,_1 (Xi) . (3.53)

Lemma 3.14. Under Assumptions (A2) to (A5),

ll‘l’i’z)"i’i’2)lloo = 0,, {h1/2 (log n)2 /nH}.

Proof. By (3.51) and (3.52), one has V 5* = (V+V*) 5*, which implies that
v*a* = V (51* —a*). Using (3.36) and (3.37), one obtains that

IIV me)“; = “Wu; s 0,,(n—1/2H—110g.) 15*“;-

77

According to Lemma 3.13, “5*”; 2 Op (n‘1/2N*1/2 log n), so one has
||v (a*—5*) H; 3 Op {(log n)2 n—1N*3/2}.

By Lemma 3.9, H (a*—a*) H; = Op {(log n)2 n—1N*3/2}. Lemma 3.13 implies

 

 

 

llé*||§ _<_ “(e—awn; + “5*”; = op (10gn\/—N*/n) . (3.54)
Additionally,
(2) “ (2)” ~* A* 1
‘11 —‘I’ = sup a _ a __ U.) X .
l v v (X3 OSJSN'l'l J21 ( J*’-1 J*i-l) n lz-Zl J’J*,-1 ( I)
So
' sim—t?) l s x/N_*op {(—1n——°g ") 2)0), ((hH)1/2)
00 H

 

 

 

h1/2lon2
= Op{ 1ng ) }

Lemma 3.15. Under Assumptions (A2) to (A5), for @927; (3.53), one has

 

 

 

 

 

 

 

     

n
.. 2) _1 &*T * *
I 00 OSJSN+1 12:1 ( X2 )Jéla J -1 J -1 z- 2
= 0p(h1/2H) .
Proof. Note that all)? is bounded by Q1 + Q2, where
00
Q1: sup (1* * #w
0<J<N+1 2: J -1 JJ* 1

 

 

78

N* n

~ T —1
Q2 = sup a* ,,. n E {w * (x.) _ ”w }
OsjsN+1 J*=1 J ,-1 2:1 J,J ,-1 Z J,J*,_1

. (3.55)

 

By Cauchy-Schwartz inequality, (3.54), Lemma 3.12, and Assumptions (A5),

 

Q2 = 029 (10s w N*/n) W01) (Pi—i=7) = Op { (10%)3}. (3.56)

Deﬁne next

I71,7 = sup n_1 2 Z

OSJSNH 15i5n1<J*,J*’<N*

(Xi )Wz'lﬁz'

7

 

s B
“wJ.J*,7 J*+N*,J*'+1 J*’,1

F20 = sup n—1 2 Z

OSJSN“ 1$iSn1<J*,J*’<N*

3

(Xi )szEz'

 

qu,J*,78J*+N*,J*I+N*BJ*',7

then it is clear that

Q1 3 (d— 1) ( sup Fla + sup F2”).
ZSvsd 2Svsd

Next we will show that Fm -.—.- 0p(h1/2H). Let Dn = n90 (Tic; < 90 < g), where

6 is the same as in Assumption (A3). Define
52:0 = 511057;] S Dn), EZD = 52-1 (Isl-l > Dn), 6:21) = 527:1) — E (sap |Xi),

T T
Uiv'l' : ”an/S21 {BT,1(X211)’ - - - ’B]:N* (Xi1)} Wi1€;D.
Denote the truncation of'Fln, as F107 = ln_1 2:;1 Uiﬂl' Next we Show that

79

D
IFL’Y — F11?” -_- Op(h1/2H). Note that lFlw'Y -— Fl,’7l 5 Al,’)’ + 112,7, where

A = —
1,7 7121 Z ”wJ,J*,7SJ+N+1,J’+1
2_1<J* J*’ <N*

33*; 1(Xz'1lwz'1E (55D lxz') ,

 

n
1 >1:
= — 2 i E : B X W
AM n, l“"J,J*,78J+N+1,J’+1 J="’,1(7’1 1) 151D
2=11§J*,J*'§N*

T
Let ijﬁ = {qu,1,7’ ' " "qu,N*,'y} 3 then
N*

J*’ =1

A1,7= Mg] 7_321{n 123”!“ W11)W115(5;D|Xi)}

2:1

N* N* n 2 1/2
< 03 Z “wakﬂ Z {'71: ZBJ*,(1(Xi1)Wi1E(5»¢—D lxz)} °

J=1 J=1 i=1

By Assumption (A3), IE (55D IXz-N = IE (5:1) IXi)| S M5D;(1+6) and Lemma
3.4 entails that sup I% 21:1 BJ1(Xz-1)Wz-1‘ = Op (log n/JH). Therefore
Jn ’

— 1+5
Alf) S Man( )
x sup HwJ 8* ,..
OSJSN‘H. J*Z:12‘]’J*”Y J;=1{n 1:: J 1(X M}

= 0p{N*D;(1+6)h1/2log2 n/n} = op (h1/2H) ,

80

where the last step follows from the choice of Dn. Meanwhile

2+6
2%” P(ls l>D) < §_El€nl2+6_ 0° E(E'5"| IX")
” n) - D2+6 — D2+6
n=1 n=1 n=1 n
:— 2+—_6 <’°°
n=n1D

 

since 6 > 1/2. By Borel-Cantelli Lemma, one has with probability 1,

—1 ... . . ,_.+ _
n 2 2, #wJ,J*,7SJ*+N*,J*’+IBJ*',1(X”)W21”2',D _ 0’
z=11§J*,J* gN*

for large n. Therefore, one has
D _ 1 2
(FL, — le 3 A1,, + A2,, _ 0,, (h / H)

. Next we will show that F £7 = 0;; (h1/2H). Note that the variance of U,” 18
T * 1|: T *
pwJ,7321 var {Bl,l(Xi1)’ - - - , B1,N* (Xil)} Wilgz'l) 821“WJ,,7'

T
By Assumption (A3), ch11 S var ({Bf1(X,-1),~ BlN* (X,1)} W“) S
03V11a var (Um) ~ 11$ J,7321V11321#w J,.,Ve,D = 113’; J,,S21Mw J,,Ve,D: Where

1/2
V€,D = var {EZD IX, }. Let my = {“ngpqu} , then

‘3ch {“7}2 Vs,D 3 var (U2) S 0503 {"7l2 Ve,D

Simple calculation leads to that

Elm-”)2 {C0K7DnH—1/2} 2r!E|U,- ,2) <+oo,

81

where the last step follows from the choice of Dn. Meanwhile

 

2+6
00 1305 |>D) < {BE—kw: 00 E(E|€nl IX")
2+6 ’
n=1 D71

since 6 > 1 / 2. By Borel-Cantelli Lemma, one has with probability 1,

TL
_1 =1: , , + _
n E Z ””J,J*,78J*+N*,J*'+IBJ*’,1(X21)W218i’D_0’
z=11§J*,J*'§N*

for large n. Therefore, one has
D _ 1 2
|er7 — PM) 3 A1,, + A2,, _ op (h / H)

. Next we will show that F113, = Op (hl/zH). Note that the variance of U,-,,y is
T * * T *
pan’SQl var {Bl,l(Xi1)’ - . - ’BI,N* (291)} Wilei,D S2lquq'

T
By Assumption (A3), ch11 5 var ({BI,1(X,-1),... ’81,N* (X,1)} W,- ) S
2 T _ T
CaV11,Var(Ui,7) ~ qu,,521V11521MwJ’,Ve,D - PanS21l‘wJﬂl/5D1Where

1/2
Ve,D = var {EZD IX,- }. Let [£7 = {ngﬂpwJﬂ} , then

0503 {w}2 v5.5 s w.) s 0303 {,.,}2 Van-

Simple calculation leads to that

7‘ _1/2 T—2 ' 2
ElUml g{c0,e,DnH } nElUml <+oo,

81

n
for r 2 3, so {Ui 7}, satisﬁes the Cramér’s condition with Cramér’s constant
1 2:

Cal: = Cofﬁ'yDnH_1/2. Hence by the Bernstein’s inequality,

 

 

1 n (19% n 6/7
P n“ U- 2p 5a exp — +a 3 a([ J) ,

_ 1/3
where m% ~ {ma}2 VE’D, m3 3 {c{na}3H 1/2DnV€,D} , pn = ph1/2H,

 

6/7
2 5m
a1 == 22 + 2 (1+ p" ), (12(3) 2 lln (1 + ——pn3— . Similar arguments

 

 

q 25m§+5c*pn
2 2 5
as in Lemma 3.12 yield that as n —> oo, 2qpn ~ (1%? = pn g 2 —>
25m2+5c*pn 00(log n) / Dn

+oo. For c0, p large enough,
1 n
P a 2 U,” > ph1/2H S clognexp {—02p2 log n} + C'nz_6)‘060/7 S n_3,
i=1

for n large enough. Hence

00 D 1/2 = 00 _1_ n , 1/2 00 —3
P(|W1,7)2ph H) ZP ”EU, th H 3 Zn <oo.
n==1

ThuS, Borel-Cantelli Lemma entails that F 97 = 0;; (h1/2H) Noting the fact that
|FL, — F1137) = 0,0 (h1/2H), one has that F1” = 0,, (h1/2H). Similarly F2,7 =
01) (h1/2H). Thus

Q1 3 (d — 1) ( sup F17 + sup F2”) = Op(h1/2H) , (3.57)

2svsd ’ 2:73d

and one has Q1 = Op (hl/zH). The result follows from (3.55) and (3.56).

82

Eff of the 1-st estimator, d=4

 

3.0

 

2.0

—-4

1.0

—d

d l

050 of5 1.0 15 210 215 3T0
x

 

 

 

0.0

 

Eff of the 1-st estimator, d=4

 

3.0

2.0

 

1.0

 

 

 

C21 ,, 1 A

O r r l I T T r

0.0 0.5 1.0 1.5 2.0 2.5 3.0
x

 

Figure 3.1: Plots of the efﬁciency of SBS estimator ma,SBS corresponding to oracle

smoother 1710’s for d = 4 and p = 0 (upper panel), p = .3 (lower panel) of ma (cm)
in (3.24), for a = 1 (thick curve for n = 1000, thin curve for n = 500, and solid curve
for n = 100).

83

Eff of the 1-st estimator, d=10

 

3.0

 

2.0

i‘ \
1’ h

10

 

 

 

0.0

010 015 110 115 210 215 310
x

Eff of the 1-st estimator, d=10

 

3.0

 

2.0

1.0

 

 

 

0.0

115 210 215 310
x

 

010 015 110

Figure 3.2: Plots of the efﬁciency of SBS estimator ma,SBS corresponding to oracle

smoother 7710’s for d = 10 and p = 0 (upper panel), p = .3 (lower panel) of ma (mg)
in (3.24), for a = 1 (thick curve for n = 1000, thin curve for n = 500, and solid curve
for n = 100).

84

95% confidence band, n-100, d=4

‘ Y ‘

 

 

 

 

 

 

 

 

 

 

2

1

 

-2 -10

 

 

 

Figure 3. 3: For p = 0, plots of the oracle smoother ma S (dotted curve), SBS esti-
mator ma SBS (solid curve) and the 95% conﬁdence bands (upper and lower dashed

curves) of the function components ma (spa) in (3.9) with a— — 1 (thin solid curve).

85

95Vo confldence band, nil-100, d-1O

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(\l —
>— o A
‘7 ~
N _
I
95°/o conﬂdence band, n-500, d-‘IO
(\l —
>— o —
‘7' -
(\l _,
l
—d.4 ' 010 0'2 014
X
95% confidence band, n-1000, d-10
N -—
>— Q ~
‘7 a
(\l _
u
—d.4 ' 010 012 014
X

Figure 3.4: For p = .3, plots of the oracle smoother 7710’s (dotted curve), SBS
estimator 7720,3138 (solid curve) and the 95% conﬁdence bands (upper and lower

dashed curves) of the function components ma(:1:a) in (3.9) with a = 1 (thin solid
curve).

86

Chapter 4

A simultaneous conﬁdence band

for dense longitudinal regression

4.1 Introduction

Traditional statistical methods fail often as we deal with functional data. Indeed, if
for instance we consider a sample of ﬁnely discretized curves, two crucial statistical
problems appear. The ﬁrst comes from the ratio between the size of the sample and
the number of variables (each real variable corresponding to one discretized point).
The second, is due to the existence of strong correlations between the variables and
becomes an ill-conditioned problem in the context of multivariate linear model. So,
there is a real necessity to develop statistical methods/ models in order to take into

account the functional structure of this kind of data.

Functional data with different design are increasingly common in modern data
analysis. A functional data set has the form {X,j,1’,j}, 1 S i _<_ n,1 S j S N,, in
which N,- observations are taken for ith subject, with X,,- and Yz’j the jth predictor

th

and response variables, respectively, for the 2' subject. In this chapter we only

deal with the equally spaced design. For simplicity, we only consider the case N1 =

87

 

yar-. .‘

 

N2 = = Nn = N. Without loss of generality, the predictor X,j takes values
{1/N, 2/N,.. .N,/N} for the ith subject, 2' = 1, 2, ...,n. For the ith subject, its sample
path {j/N, j}Y, is the noisy realization of a continuous time stochastic process €,’(:L‘)

in the sense that

Yz’j = 52‘ (j/N) + 0 (370/761,,
with errors 8,-j satisfying E (23,-j) = 0, E<€22j) = 1, and {€,-(:c), a: E X} are iid copies
of a process {f(:r),z E X} which is L2, i.e., EfX 52(x)dx < +oo.

For the standard process {£(x),a: E X}, one deﬁnes the mean function m(x) =
E{£ (13)} and the covariance function G (11:,27’) = cov {{(x),{f(:r’)}. Let sequences
{Ak},c::1, {212k($)}g:1 be the eigenvalues and eigenfunctions of G (55,:c’) respec—
tively, in which A1 2 A2 2 Z 0 with 22:1)‘k < 00, {$1,321, form an or-
thonormal basis of L2 (X) and G (:13, a: ’)= 2,3: 1Akwk( :r)1,/Jk (1’), which implies
that [G (x,a:’) 11),, (33’) dx’ = Akz/zk(:r).

The process {€,-(:I:), :1: E X } allows the Karhunen-Loeve L2 representation

err) = m(x) + 2:, aim/em,

where the random coefﬁcients 5,- k are uncorrelated with mean 0 and variances 1, and
the functions (13,, = i/Akwk' In what follows, we assume that A], = 0, for k > K,
where n is a positive integer or +00, thus G(:r,:c’) = 2g=1 ¢k($)¢k (:c’) and the

data generating process is now written as

it, = m (j/N) + 22:, em (j/N) + 0 (2711715., (4.1)

The sequences {Mgr/21:1 , {¢k(x)}z=1 and the random coefﬁcients 5,), exist mathe-

matically but are unknown and unobservable.

Two distinct types of functional data have been studied. Yao, Miiller and Wang

88

[80, 81], Yao [82] and Ma, Yang and Carroll [45] studied sparse longitudinal data
for which 1 _<_ j S N,- and N,’s are iid copies of an integer valued positive ran-
dom variable. While Li and Hsing [39, 40] concern dense functional data. For the
dense functional data, strong uniform convergence rates are developed for local-linear
smooth estimators, but no uniform conﬁdence bands have been given. The fact that
simultaneous conﬁdence band has not been established for functional data analysis
is certainly not due to lack of interesting applications, but to the greater technical
difﬁculty to formulate such bands for functional data and establish their theoretical
properties.

In this chapter, we present simultaneous conﬁdence bands for m(x) in dense lon-
gitudinal data given in (4.1) via local linear smoothing approach.

The chapter is a joint work with Yang, L., Liu, R. and Shao, Q. We organize
our chapter as follows. In Section 4.2 we state our main results on conﬁdence bands
constructed from local linear smoothing. In Section 4.3 we provide further insights
into the error structure of local linear estimators. Section 4.4 describes the actual
steps to implement the conﬁdence bands. Section 4.5 reports ﬁndings of a simulation
study. An empirical example in Section 4.6 illustrates how to use the proposed spline
estimator with conﬁdence band for inference. Proofs of technical lemmas are in the

Appendix.

4.2 Main results

. 1 r 1/7‘
For any Lebesgue measurable function a on [0, 1], denote “(f)“r = {ID |d>(:r)l dz} ,
1 g r < 00 and llqblloo = SprE[O 1] |q§(a:)|, and for a continuous function a on [0,1]

denote the modulus of continuity as

w (M) = maxx,x’e[0,1tlx-x’|36 11"“ ‘ ,5 (13,)"

89

1:11}: ' .'

 

For any 6 E (O,1], we denote by CO’ﬁ [0,1] the space of order 5 Holder continuous

function on [0,1], i.e.,

ecu—as x’
001510.11: as: 1in = sup ] Q, )l < +oo .
x¢x',x,x’€[0,1] l1? — 33,]

in which ||qb||0ﬁ is called the Co’ﬂ-norm of 45. Clearly, C013 [0, 1] C C [0, 1] and if (I) E

0013 [0,1], then w(¢,6) S ll¢lloﬁ (Sﬁ. For any vector C = ((1, ...,CS) E R3, denote

the norm “cur = (151" +---+I<sl">1/",1 s r < +oo, ucuoo = max(lC1| . 14.1).
We are using the local linear estimation in this paper.

The technical assumptions we need are as follows:

(A1) The regression function m, E C [0,1] .

(A2) The standard deviation function o(:r) E C015 [0, 1]. For any k = 1,2, . ..K.,
(5,, (at) E Cotﬂ [0,1] for 3 6 [0,1] and minz€[0,1] G(;z:,:1:) > 0 ,

(A3) As n —+ 00, Nn_1/4 (logn)"1 —> 00 and N = 0 (n6) for some 0 > 5/8. The

bandwidth h satisﬁes Nh (logn)_1 —+ 00, nh4 —» 0 as n —» 00.

(A4) The number It of nonzero eigenvalues is ﬁnite. The variables (.5,- k)oo,1rt k=1 and

00,00 , _ _ Z:

(23,-j), 1 , 1 are independent. In addition, max1<k<nEI£1k|n1 < +oo for
z: ,3: _ _

some 771 > 4 while E|811|772 < +00, for some 172 > 4 + 20 with 0 being the

constant in Assumption (A3).

(A5) The kernel K ,,(x) is a second order smooth function and satisﬁes the following
conditions: K ,,(x) = %K (E), where K () is a density function with bounded
support [—1, 1] and symmetric about 0 unless special conditions are indicated.

It is Lipschitz continuous.

Denote by C (2:) ,a: 6 [0,1] a standardized Gaussian process such that EC (2:) E

90

0, EC2 (2:) 5 1,3: 6 [0, 1] with covariance function

EC (x)( (33’) = G (33,:13’) {G(:c,x) G (x’,x’)}—1/2 ,x,:c’ 6 [0,1]

and deﬁne the 100 (1 — a)th percentile of the absolute maxima distribution of C (:13),

for all a: E [0,1],

P ] sup |((:I:)|SQ1_O,] =1—a,VaE(O,1). L
xE[0,1]

Denote by Z1—a/2 the 100 (1 — a/2)th percentile of the standard normal distribution.

Deﬁne also the following ”infeasible estimator” of function m f]

m(x) =E(a:) "—12,—61 ,1: e] [,0 1]. (4.2)

The term ”infeasible” refers to the fact that m(x) is computed from unknown quantity
€,-(:r),:r 6 [0,1], while "m(x) would be the natural estimator of m(x) if all the iid
random curves £,(:c), a: E [0, 1] were observed, a view taken in Ferraty and Vieu [18].
We propose to estimate the mean function m(x) by solving the local linear least

square

 

n N . 2 .
(&,b)=argminzz{Yij—a—b(-IJV—x)} K}, ({V—x)
with K}, (u) = ,liK (ﬁ), h=hn -—»0, asn—+oo. For anyxE [0,1],
. . T T '1 T
m(x)=a=e0 (x WX) x WY (4.3)

_ _ TY
inwhichY=<1f,1,...,1/_,N)Y , «‘12:, Y,,, 1gjgN,eg=(1,0),and

91

the design matrix X is

1 (7%] —:1:)
X = ' , (4.4)
1 (71% —a:) Nx2

and W =diag{{Kh (j/N — :13) /N}§V=1}.

We now state our main results in the following theorems.

Theorem 4.1. Under Assumptions (AU-(AU, for Va E (0,1) ,as n —+ 00, the ”in-

feasible estimator” ﬁ(a¢) converges at the J77 rate
P{SquE[O,1]"1/2lm($l — m(~r)lG(:r,:r)_1/2 s Ql—a} —> 1— a,

P{n1/2 |m(a:) — m(:1:)|G(:lc,:1:)_1/2 S Zl—a/Z} —> 1— a,\7’:c 6 [0,1],

while the local linear estimator m is asymptotically equivalent to m

SprE[O,1]n1/2 [m(x) — m(x)| = 0p (1) .

Corollary 4.1. Under Assumptions (AU-(A4), as n —+ 00, an asymptotic 100 (1 — a) ‘70

exact conﬁdence band for m(x),:c 6 [0,1] is
m(x) a: G (x,:c)1/2 Q1_an_1/2,Va 6 (0,1)

while an asymptotic 100 (1 — a) % pointwise conﬁdence interval for m(x),2: 6 [0,1],

is m(x) :l: G (:c,:1:)1/2 Zl—a/Zn—1/2°

92

4.3 Decomposition

In this section, we break the estimation error m(x) —m(:c) into a bias term and a noise
term. To understand this decomposition, we begin by discussing the representation of

the local linear estimator m(x) in (4.3). We obtain the following crucial decomposition

m(x) = m(x) + 5(1) + Em, (4.5)
with

m(x) = e (XTWX)_1XTWm

“I
”a”
V

II

(b

T
0
g" (XTWX) _1 XTWe
T
0

{(2:) = e (XTWX)—1XTW£, (4.6)

in which m = (m (l/N) , . . . m (N/N))T is the signal vector,

T
e = (0(1/N) E.,1,...,o(N/N) 5.,N) ,qu = 72‘1 231:1 5ij’1 3 j S Nis the noise

vector and g = (Zg=1€.,k¢k(1/Nl ,..., 22:1 5.,k‘l’k (N/N))T are the eigenfunc-

tion vectors, where E k = n_12?=1§ik,1 S k S n.

The next three propositions concerns m(x), Etc) and E (:13) given in (4.5).

Proposition 4.1. Under Assumptions (A1) and (A3), as n ——> 00
1/2 ~ _ G _1/2 _
Spr€[0,1] n |m(x) m(x)] (133:) - 0(1).
Proposition 4.2. Under Assumptions (A2)-{A4}, as n —> oo

supr[O,1]n1/2 m(x) — m(x) —E(:c) = 0,00), (4.7)

93

and there is a version Z(:c) ofC (.75) such that
supxem,” n1/2 [6(1) — Em] = op (1), (4.8)
hence for any a 6 (0,1)

P{supxe[o,1]n1/21m<x>—m(x)lG<x,x>‘l/2sol—a} a 1-0,

P {supxe [0,1] M ]E<x)| G (16,2?)‘1/2 s Ql—a} —» 1— a. (4.9)
Proposition 4.3. Under Assumptions {A2j-(A4), as n —> oo
supxdo,” n1/2 |E(a:)| G(a:,:c)_1/2 = 0,, (1) .

The Appendix contains proofs for the above three propositions. Combining these
propositions with the decomposition of m(x) as given in (4.5), we can easily get

Theorem 4.1.

4.4 Implementation

In this section, we describe procedures to implement the conﬁdence bands and in-

,n
tervals given in Corollary 4.1. Given any data set (j /N, Y”) _ 1 , 1 from model
J: ,2:
(4.1), the local linear estimator m(x) is obtained by (4.3). When constructing the

conﬁdence bands, one needs to evaluate 100 (1 — a)th percentile Q1 —a by estimating
the unknown functions G (:c, 2:).

The pilot estimator of covariance function G (13,113,) is C (x, 93’) = a (x, x’) such

{a (and) ,hl (13,13’) ,132 ($,a:/)} = argmin :13,=1

a,b1,b2 ‘7

that

94

(04.3., — a — b1(j/N — x) — b2 (j’/N — x’) }2 K,, (j/N — as) Kh (j’/N — 12’),
where
ojj, = n—IZ?=1{Yz-j — mp (j/N)} {YU- — mp (j’/N)},
for 1 gj,j’gN.

Therefore. as n —+ 00,
m(x) j: C (1:, 101/2 Q1_an_1/2 (4.10)

and m(x) i C (1:, 1101/2 Z1_a/2n-1/2 have asymptotic conﬁdence level 1 — a.

4.5 Simulation

We carried out some simulations to illustrate the ﬁnite sample behavior of the pro-

posed conﬁdence bands deﬁned in Section 2. We generated data from model

. 2 . . .
Yij = m(J/N) +Zk=15ik¢k (J/N) +0€ij,1 S J S N, 1 S 2 S 71, (4-11)

with gik ~ Normal(0,1),k = 1,2, 5 ~ Normal(0,1), for 1 S i S n, and m(x) =
sin {27r (x — 1/2)}. We take orthonormal functions ¢1(a:) = —2 cos {7r (:1: — 1/2)} and
¢2(a:) = sin {7r (:1: — 1/2)} to be the eigenfunctions, thus A1 = 2, A2 = 1/2. Different
noise levels a = 0.5,1 were used to interpret the result, and the number of subjects
n was taken to be 50, 100, 200 and 500. We used N = [7108 log n] to determine the

number of grid for each subject.

Table 4.1 shows the coverage frequencies from 200 replications for the conﬁdence
levels 1 — a = 0.95 and 0.99. As we expected, the coverage rates go to the nominal

ones as the sample sizes increase.

95

Table 4.1: Coverage frequencies from 200 replications.

 

 

 

 

 

 

 

 

 

o n 1—a=.95 1-a=.99

.5 50 0.9 0.97
100 0.91 0.995
200 0.94 0.99
500 0.94 0.99

1 50 0.855 0.95
100 0.905 0.96
200 0.89 0.975
500 0.865 0.97

 

 

 

 

4.6 Empirical example

In this section, we have applied the conﬁdence band procedure of Section 4.4 to the
data are recorded on a Tecator Infrared Food and Feed Analyzer working in the wave-
length range 850 - 1050 nm by the Near Infrared Transmission (NIT) principle. Each
sample contains ﬁnely chopped pure meat with different moisture, fat and protein
contents. In this study, we used 240 meat samples with each consisting of a 100 chan-
nel spectrum of absorbance and the contents of moisture (water), fat and protein.
Figure 4.3 shows this data set with the conﬁdence band for the mean. We can clearly

see that there is no linear or quadratic pattern for the Tecator mean.

4.7 Appendix

Throughout this section, C means some nonzero constant in this whole section.

4.7. 1 Preliminaries

We ﬁrst state some results are used in the proofs of Lemma 4.2.

Lemma 4.1. [Theorem 2. 6.7 of [8]] Suppose that {751 S i g n are iid with E(fl) =

0,E(£%) = 1 and H (:23) > 0 (a: Z 0) is an increasing continuous function such

96

that xﬂz—lHCc) is increasing for some 7 > 0 and :chlogHCc) is decreasing with

EH (K 1]) < 00. Then there eaist constants C1,Cg,a > 0 which depend only on the

distribution ofél such that for any {snail satisfying H—1 (n) < an < C1 (nlogn)1/2
_ t

and St - 21:15i

P ' S—Wt <0 H “1.
{lgggnlt ()l>$n}_ 2n{ (axn)}

4.7 .2 Proof of Theorem

-1
PROOF OF PROPOSITION 4.1. m(x) = 6% (XTWX) XTWm. The dispersion

matrix

xwa = diag (1, h) DNJ diag (1, h),

where

DN _ 5N,O (:c) SN,1 (9:)
’17 —
5N,1 (2:) SN,2 (2:)

where sNJ (x) = N—1 29;, Kh (j/N — :13) {(j/N — 2:) /h}1 ,z = 0,1,2. Denote

”0,1500 me (K)

D3; =
#1,;e (K) #23; (K)
where
fix/h elK (v) do :1: e [0,h)
“Lac (K) = fl] le (v) do a: E [h, 1 — h]

f(11—$)/h le (v) do a: E (1 — h, 1]

Dnﬂ; 2 D3; + U (h)

@wa)”1 = diag(1,h_1) {0171+ U(h)} diag (1,114)

97

Without loss of generality, let :1: E [h, 1 — h], one has

m(x) — m(x) = g (XTWX)-1XTW {m — m (2:) X60 — m, (:c) X61}

6% diag (1, h—l) {D171 + U (h)}

diag (1, h_1) XTW {m —— m (x) XeO — m, (11:)Xe1}

diag (1, h_1) XTW {m — m (x) XeO — m, (:13)Xe1}

( N—lzﬁlKho/N-x) \
><m{ <j/N>— m(x)>—m’<x o/N-a}
N—IZN_1KhJ/N-x){(J/N—$)/h}
\x{m( jm/N)— (2:)—m ’(x> o/N—x)})
/ N—lzl‘; K}. (j/N—x) )
x{% m”(:c )(j/N—as)2+u(h2)}
Nj-‘L—lz Kh< o/N—va/N—a/h}
XE m”(x )(j/N—x)2+u(h2)} )

1m,,(x)h2N-12N= Kh<j/N—z>1{<j/N—x>/h}2+u(>
2 H121; Khv/N—scno/N—x)/h}3+u<1)

1 II 2 ”2,2: (K) + “(1)
= 5m (x) h ' .
”3,3: (K) + u (1)

 

 

 

 

Combining the above two big equations, we have

m(x) — m(x)

0 ( H }2 ua<K>+u<1>

98

1 II 2T- —1 —1
= Em (a:)h 60 d1ag(1,h )Dx

= U022).

Lemma 4.2. Under Assumptions (A2)-{A4), there exist iid standard normal random

#2 (K)+U(1) +U(h3)
#3 (K)+u(1)

variables Zik,§inj,z-:vl S i S n,1 Sj S N,1 S k S re and some 6 E (0,1/2) such

that as n —> 00

max 5,1: — Zeke] +

1<k<n lg.,j — 2,3,5] 2 Gas. (nﬁ—l) (4.12)

max
1£jSN

- - ‘ _ —1 ' _ —1 .. -
in which Z.,k,{ — n 221:1 Zikg’ 2.0-,5 -— n 274:1 Z235) 1 S j S N, 1 S k S
Ii.

PROOF. Aaccording to Assumption (A4), E léiklnl < +00, 771 > 4, E lazy-‘02 <
+oo,172 > 4 + 20, so there exists some 6 E (0,1/2) such that 711 > 2/6, 772 >
(2 + 0) /ﬁ. Let H(a‘) = $711,337; = nﬁ in Lemma 4.1, then

 

for some 71 > 1. Applying Borel-Cantelli Lemma, one ﬁnds iid variables Zik,£ ~
N(0, 1) such that

max max
lSkSnlStSn

2211527: ‘ Z:=1 Zz'k,€ ] = 035- ("ﬁl '

 

Likewise, if one lets H (:17) = $772,337; 2 nﬂ in Lemma 4.1, then

 

" = *772 1”723: -72—9
H(a;rn) a n 0(n )

for some 72 > 1. Applying Borel-Cantelli Lemma and Lemma 4.1, one ﬁnds iid

99

variables Z215 ~ N (0, 1) such that

 

 

>nﬁ}SC

t t
21:1 527' " Zi=1ZiJl€

max P{ max
1SjSN 1StSn

 

which entails that max1<j<NP{lZ?___1 162-]- — 2:21 Zij,e] > nﬁ} S Cn1—7725

and

P {133% lzil=1 513' — Zil=1zijﬁl > "5}
Z {Iii—.1521 _ 221:1 Zijﬁl > ”3}

P
ISJSN
< C'n1_772ﬂ x N S Cn1_7725+6 S Cn-72,

|/\

in which '72 > 1 as described before. Thus Borel—Cantelli Lemma implies that

lsmjast lzilzl 527' ‘ 2212131335] = 0a.s. (n5) .

Putting together all the above proves (4.12).

Denote
~ —1 ~
g (:13) = 63‘ (xwa) xng = 22:19: (at) ,
~ _ —1
where 5k (:13) = {keg (xwa) xTthk and 4),, = (ek (1 /N) , . . . , ek (N/N))T.

Let 5k (:13) be the solution to the least square problem

N
argmin 2 {wk (j/N) — a — b (M — a}? Kh o/N — x),
j=1

.., —1 ~ _ ~
_ T T T _ _
we have ek (as) _ e0 (x wx) x W¢k and 2km _ ZWMek (3:), k _ 1, . . . , n,
similar to the deﬁnition offh(:1:) and £k(:c) in (4.6). Also denote (km) = Z.,k,{¢k (:13) ,

100

. , K and deﬁne

”1/2{Z;=1¢i(x f} mzkﬂ 1C

nl/zG’ (:1:,:c)"1/2 219:1 Ck (cc) (4.13)

PROOF OF PROPOSITION 4.2: Note ﬁrst fact that 7. k 5 are independent N (0, n_1)

variables implies that max 7 = O n—l/2 . By Assumption A2 ,
15kg; .,k,§ 1)

45k (a?) E co’ﬂ [0, 1] . Similar to Proposition 4.1, one has

1SkShzl|Q5

, mufﬂe)-

The deﬁnition of 6(a) in (4.13), together with deﬁnition of m(x) in (4.2), the strong

approximation in (4.12), the above bound on max1< k< n I-Z: k E] entail that

While,

|/\

|/\

|/\

sup lace) — m (e) — Em]
:c€[0,1]

13%,. likl ,3ng live (1‘) - «31. (all...

C as (lee! + In - me!) 1.2
op (Ml/2122 + 725—4112)

op(n—1/2) .

sup [m(x) — m (x) — Zeal
:cE[0, 1]

1SkSnlE

0p (nBTI) = op(n"1/2) .

101

,1: Z .,,kg] :llgl] H451: (xllloo

Now for any :1: E [0,1], Z(ac) is Gaussian with EZ(:c) E 0, E? (1‘) '2' 1,2: E [0,1] and

covariance E6017)? (:c’) equal to

n1/2G(ac,:c)_1/2 721/20 ($1, I!) -1/2

X 00" “22:1 2.1%th 1 {22:17-,k.€¢k(x'>}l
= G (:c,:c)‘1/2 G (:c’,:c’)—1/2 G (:c,:c') ,V:c,:c’ E [0,1],

so £ {2(a) ,2: E [0,1]} = .C {C (1:) ,:c E [0, 1]} Proposition 4.2 is proved.
PROOF OF PROPOSITION 4.3

Proof. We use C,- to denote a constant in the context. Since G (:c, :c) is bounded, we

only need to consider sup '5 :c . Notice that
:cE[0,1]

E(:c) = e3" (XTWX)_1XTWe

egdiag(1,h_1) {D;1+ U(h)}diag (1,h_1)XTWe
= QN,h($){Co + U(h)}, (4-14)
where QN,h(x) = N-1 Z§V=1Kh(j/N -— 103,12 We discretize the interval [0,1]

and partition it into N * = V N / I13 subintervals {1k} of equal length. Let ask be the

center of 119' For :1: E Ik,

IQN,h($ll S IQN,h($) - QN,h($k)| + IQN,h($k)|
N
= lower)! + N‘1 Z MK}. (j/N — x) — K}, (j/N — xk)}€,,jl
j=1
= IQN,h(ack)I +op{(nNh)‘1/2}. (4.15)

The above is obtained because the kernel function K () is Lipchitz continuous. Ac-

102

cording to (4.14) and (4.15), we obtain

 

~ —1/2
sup |e(:c)| S max Q (:1: )l + o (n ). (4.16)
:cE[0,1] lsksN* N’h 1” p
In the following, we will show that
-—1/2
max :10 . I = o n . 4.17
1_<_kSN* QN,h( 1,) p( l ( )

 

Deﬁne Rj,h($) = N—lKh (j/N — :c) 2.0-,5, where 24,5 is deﬁned in Lemma A.2.
According to Lemma A.2, {Z-Jﬁ’l S j S N } are independent and identically dis-
tributed as N(O, 1/n). Then

N N
Z P(]Z_,j,5]5j_1/2) 3 Z E|Z_,j,5|4j2 < oo.
i=1 j=1

Based on the Borel—Cantelli Lemma, it is straightforward to show that with proba-

bility 1, for large enough j, |Z.,j,€| S j-1/2.

In the following, we only focus on large enough N such that IZ.,j,el S N _1/ 2

and deﬁne

R- =N K N— Z - I - . 4.18
Jih(x) h(]/ 1') 'iJaE {qu,€SBN} ( )

It is straightforward to show that {Rj,h($k)11 S j S N} are independent bounded
random variables with mean 0. Notice that le,h(xk)l S 01 / ( N 3/ 2h) and

N
Z amen}? _<_ 02/(nNh)-
j=1

103

Therefore, according to the Bernstein’s inequality,

N 1 2 3/2
PL: le,h($k)] Zn) szexp [717 / (Oz/(nNh)+C1n/(N h)}].
'=1

 

In particular, if n = \/log N/(nNh), P (29,:1 le,h($k)] 2 17) ——> 0 under the

Assumption (A3), which implies that

 

N
Z |Rj,h(:ck)l = op{¢10g N/(nNh)}
j=1

Hence,

N

Z RN,h($k)
i=1

= Op(nﬁ—1) + 0p {\/log N/(nNh)} = op(n_1/2).

N
IQN,h($k)l S N_1ZKh(j/N_$k)(§.,j—Z.,jl
i=1

+

 

 

 

 

This completes the proof.

104

95% conﬁdence band, n=100

 

1.0

0.5

0.0

-0.5

-1.0

 

 

 

 

1.0

0.5

 

 

 

 

Figure 4.1: For data generated from model (4.11) (with 00 = .5) of different sample
size n and conﬁdence level 95%, plots of conﬁdence bands for mean (dashed lines),
the local linear estimator m (:c) (dotted line), and the true function m (:15) (thick solid
line).

105

99% confidence band, n=100

 

1.0

0.5

-0.5

 

 

 

 

1.0

0.5

-0.5

 

 

 

 

Figure 4.2: For data generated from model (4.11) (with 00 = 1) of different sample
size n and conﬁdence level 99%, plots of conﬁdence bands for mean (dashed lines),
the local linear estimator m (:c) (dotted line), and the true function m (:c) (thick solid
line).

106

 

 

 

 

o
- - .. ll... - 5
- n.... H...“ ”N..”? 0
mm... W... “Mm. 1
. . “...u....Hh......HHh.II\ u Ill-..u..... ..
u .u.....w... HUM - o s
................ ”.u....m... 1......m.u 1 d
. ...........mm ...m...m mm.n n
n ... “nu uni... ”an" a
u... m..-m1um.m b
.. ....... ...... ....u.....m....uu....mmm. m e
.. ......h....+.h....... . ..mmm... w m
........ --..w -....me. -w m a
a ..............u...1.. ..Hﬁuuh e d
.. m Wu.“ . ....m.......u......: 9 v n
a u... . mama... a n
........-...u.... ...Wmm w o
..u....." .....eu. ..umm. c
"nun-dun ”u“
- -....u. hm w...“
w.” ...m..- .. Imam...” o
.. mmmﬁ -o 9
........”.....“........n... mum... 9
.u....u... .....ww... 1mm
u.” WWW... o
- 5
a

 

 

 

 

_ _ _ ﬁ _ _ _ q _ _ _ A _ _ S
3 3 3 3. m.” 0.” 2 ON 3 S 3 mm 3 3 3
85288 8528%

1 000 1 050

950
wavelength

107

900

 

850
Figure 4.3: The upper plot shows the Tecator data with the 95% conﬁdence band

(dashed thick lines) for the mean estimate (thick solid line). The lower plot is the
conﬁdence band (thin dashed lines) for the mean estimate (thick solid line) in a

different scale.

Chapter 5

Summary Of thesis contribution

The main contributions of this thesis are the follows: the construction of simultaneous

conﬁdence bands for heteroscedastic, high dimensional and functional data.

Construction of simultaneous conﬁdence bands has been develOped slowly since
it is difficult to establish asymptotic sample distribution theory for nonparametric
regression estimates. In the last two decades, many statisticians have worked on the
theory and applications of nonparametric simultaneous conﬁdence bands. For the
ﬁrst time, we constructed conﬁdence bands for variance function, nonlinear additive

models and dense functional data.

Among all the nonparametric smoothing methods, polynomial spline smoothing
has the advantage of fast computation and simple implementation, see for instance,
Stone [67] and Huang [28] for the basic theory Of polynomial spline smoothing, and
Xue and Yang [76] for computing speed comparison of spline vs kernel smoothing. We
used polynomial spline smoothing to do the nonparametric regression in the chapter

2 and chapter 3 of the thesis.

The importance of being able to detect heteroscedasticity in regression is widely
recognized because of efﬁcient inference for the regression function requires that het-

eroscedasticity is taken into account. In chapter 2, we proposed polynomial spline

108

conﬁdence bands for heteroscedastic variance function in a nonparametric regression
model. It is desirable from a theoretical as well as a practical point of view to have
conﬁdence bands for polynomial spline estimators.

In chapter 3, we proposed an all new spline+spline oracally efficient estimator.
For the NAAR time series models, none of any existing methods provide simultaneous
conﬁdence band for the additive components. To address this need, we proposed an all
new spline+spline oracally efﬁcient estimator that is theoretically superior as it comes
with an asymptotically simultaneous conﬁdence band for the additive component, and
also computationally more expedient than any existing estimators due to the use of
spline instead of kernel in all steps. The spline+spline method is asymptotically
oracally efﬁcient as the spline+kernel method of [71], but can be hundreds of times
faster in terms of computing, see the comparison in Table 3.2.

Locally linear smoothing is used in chapter 4 to develop the conﬁdence bands of
mean function for dense functional data. This smoother combines the strict local
nature of the data and the smooth weights of kernel smoothers. Kernel smoothers
are expensive to compute (0(n2) for the whole sequence), but are visually smooth
if the kernel is smooth. The conﬁdence bands for dense functional data by locally

linear smoothing is very easy to use for practitioners.

109

 

 

BIBLIOGRAPHY

[1] Bickel, P. J. and Rosenblatt, M. (1973). On some global measures of the devia-
tions of density function estimates. Ann. Statist. 1, 1071—1095.

[2] Bissantz, N., Claeskens, G., Holzmann, H. and Munk, A. (2009). Testing for lack
of ﬁt in inverse regression - with applications to biophotonic imaging. J. Royal
Statist. Soc. Ser. B. 71, 25—48.

[3] Bosq, D. (1998). Nonparametric Statistics for Stochastic Processes. Springer-
Verlag, New York.

[4] Brown, L. D. and Levine, M. (2007). Variance estimation in nonparametric re-
gression via the difference sequence method. Ann. Statist. 35, 2219—2232.

[5] Chaudhuri, P. and Marron, J. S. (1999). SiZer for exploration of structures in
curves. J. Amer. Statist. Assoc. 94, 807—823.

[6] Chen, R. and Tsay, R. S. (1993). Nonlinear additive ARX models. J. Amer.
Statist. Assoc. 88, 956—967.

[7] Claeskens, G. and Van Keilegom, I. (2003). Bootstrap conﬁdence bands for re-
gression curves and their derivatives. Ann. Statist. 31, 1852—1884.

[8] Csc’irgc’i, M. and Révész, P. (1981). Strong Approximations in Probability and
Statistics. Academic Press, New York-London.

[9] Dahl, C. M. and Levine, M. (2006). Nonparametric estimation of volatility mod-
els with serially dependent innovations. Statist. Probab. Lett. 76, 2007—2016.

[10] de Boor, C. (2001). A Practical Guide to Splines, Springer-Verlag, New York.

[11] Dette, H. and Munk, A. (1998). Testing heteroscedasticity in nonparametric
regression J. Royal Statist. Society B. 60, 693—708.

[12] DeVore,R. and Lorentz, G. (1993). Constructive approximation : polynomials
and splines approximation. Springer-Verlag, Berlin, New York.

110

[13] Eubank, R. L. and Speckman, P. L. (1993). Conﬁdence bands in nonparametric
regression. J. Am. Statist. Ass, 88, 1287-1301.

[14] Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications,
Chapman and Hall, London.

[15] Fan, J. and Yao, Q. (1998). Efficient estimation of conditional variance functions
in stochastic regression. Biometrika. 85, 645-660.

[16] Fan, J. and Zhang, W. (2000). Simultaneous conﬁdence bands and hypothesis
testing in varying-coefﬁcient models. Scandinavian Journal of Statistics. 27, 715-
731.

[17] Fan, Y. and Li, Q. (2003). A kernel-based method for estimating additive par-
tially linear models. Statistica Sinica. 13, 739—762.

[18] Ferraty, F. and Vieu, P. (2006). Nonparametric Functional Data Analysis: The-
ory and Practice. Springer Series in Statistics, Springer: Berlin.

[19] Gantmacher, F. R. and Krein, M. G. (1960). Oszillationsmatrizen, Oszilla-
tionskerne und kleine Schwingungen mechanischer Systeme. Akademie-Verlag,
Berlin.

[20] Hall, P. and Carroll, R. J. (1989). Variance function estimation in regression:
the effect of estimating the mean. J. Roy. Statist. Soc. Ser. B. 51, 3—14.

[21] Hall, P. and Marron, J. S. (1990). On variance estimation in nonparametric
regression. Biometrika. 77, 415—419.

[22] Hall, P. and Titterington, D. M. (1988). On conﬁdence bands in nonparametric
density estimation and regression. J. Multi. Analys. 27, 228-254.

[23] Hardle, W. (1989). Asymptotic maximal deviation of M-smoothers. J. Multi.
Analys. 29, 163—179.

[24] Hardle, W., Hlavka, Z. and Klinke, S. (2000). XploRe Application Guide.
Springer-Verlag, Berlin.

[25] Hardle, W. and Marron, J. S. (1991). Bootstrap simultaneous error bars for
nonparametric regression. Ann. Statist. 19, 778—796.

[26] Hastie, T. J. and Tibshirani, R. J. ( 1990). Generalized additive models. Chapman
and Hall, London.

[27] Horowitz, J. and Mammen, E. (2004). Nonparametric estimation of an additive
model with a link function. Ann. Statist. 32, 2412—2443.

[28] Huang, J. Z. (2003). Local asymptotics for polynomial spline regression. Ann.
Statist. 31,1600—1635.

111

[29] Huang, J. Z. and Yang, L. (2004). Identiﬁcation of nonlinear additive autore—
gression models. J. R. Stat. Soc. Ser. B. 66, 463—477.

[30] Huang, X., Wang, L., Yang, L. and Kravchenko, A. N. (2008). Management
practice effects on relationships of grain yields with topography and precipitation.
Agronomy Journal. 100, 1463-1471.

[31] Izem, R. and Marron, J. S. (2007). Analysis of nonlinear modes of variation for
functional data. Elect. J. Statist. 1, 641—676.

[32] James, G. M., Hastie, T. and Sugar, C. (2000). Principal Component Models for
Sparse Functional Data. Biometrika. 87, 587—602.

[33] James, G. M. (2002). Generalized linear models with functional predictors. J. R.
Stat. Soc. Ser. B. 64, 411—432.

[34] James, G. M. and Silverman, B. W. (2005). Functional adaptive model estima-
tion. J. Amer. Statist. Assoc. 100, 565—576.

[35] James, G. M. and Sugar, C. A. (2003). Clustering for sparsely sampled functional
data. J. Amer. Statist. Assoc. 98, 397—408.

[36] Kauermann, G., Krivobokova, T. and Fahrmeir, L. (2009). Some asymptotic
results on generalized penalized spline smoothing, J. R. Stat. Soc. Ser. B. 71
487-503.

[37] Krivobokova, T. and Kauermann, G. (2007). A note on penalized spline smooth-
ing with correlated errors. J. Amer. Statist. Assoc. 102, 328—1337.

[38] Leadbetter, M. R., Lindgren, G. and Rootzén, H. (1983). Extremes and related
properties of random sequences and processes. Springer-Verlag, New York.

[39] Li, Y. and Hsing, T. (2007). On rates of convergence in functional linear regres-
sion. J. Multi. Analys. 98, 1782—1804.

[40] Li, Y. and Hsing, T. (2009). Uniform convergence rates for nonparametric re-
gression and principal component analysis in functional / longitudinal data. Ann.
Statist. In press.

[41] Liang, H., Thurston, S., Ruppert, D., Apanasovich, T. and Hauser, R. (2008).
Additive partial linear models with measurement errors. Biometrika. 95, 667—
678.

[42] Linton, O. B. (1997). Efﬁcient estimation of additive nonparametric regression
models, Biometrika. 84, 469—473.

[43] Linton, O. B. and Nielsen, J. P. (1995). A kernel method of estimating structured
nonparametric regression based on marginal integration, Biometrika. 82, 93—101.

112

 

[44] Lu, Z., Lundervold, A., Tjostheim, D. and Yao, Q. (2007). Exploring spatial
nonlinearity using additive approximation. Bernoulli. 13, 447—472.

[45] Ma, S., Yang,L. and Carroll, R. J. (2010). Simultaneous conﬁdence band for
sparse longitudinal regression curve. Manuscript.

[46] Mack, Y. P. and Silverman, B. W. (1982). Weak and strong uniform consistency
of kernel regression estimates. Z. Wahrscheinlichkeitstheorie verm. Gebiete 61,
405—415.

[47] Mammen, E., Linton, O. B. and Nielsen, J. P. (1999). The existence and asymp—
totic properties of a backﬁtting projection algorithm under weak conditions. Ann.
Statist. 27, 1443-1490.

[48] Martins-Filho, C. and Yang, K. (2007). Finite sample performance of kernel-
based regression methods for non-parametric additive models under common
bandwidth selection criterion. J. Nonparametr. Stat. 19, 23—62.

[49] Morris, J. S. and Carroll, R. J. (2006). Wavelet-based functional mixed models.
J. R. Stat. Soc. Ser. B. 68, 179-199.

[50] Miiller, H. G and Stadtmiiller, U. (1997). Variable bandwidth kernel estimators
of regression curves. Ann. Statist. 15, 182-201.

[51] Miiller, H. G. and Stadtmiiller, U. (2005). Generalized functional linear models.
Ann. Statist. 33, 774—805.

[52] Miiller, H. G., Stadtmiiller, U. and Yao, F. (2006). Functional variance processes.
J. Amer. Statist. Assoc. 101, 1007—1018.

[53] Miiller, H. G. and Yao, F. (2008). Functional additive models. J. Amer. Statist.
Assoc. 103, 1534—1544.

[54] Neumann, M. H. and Kreiss, J. P. (1998). Regrassion-type inference in nonpara-
metric autoregression. Ann. Statist. 26, 1570—1613.

[55] Nielsen, J. P. and Sperlich, S. (2005). Smooth backﬁtting in practice. J. R. Stat.
Soc. Ser. B. 67, 43—61.

[56] Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis. Second
Edition. Springer Series in Statistics. Springer: New York.

[57] Roca-Pardiﬁas, J ., Cadarso-Suarez, C. and Gonzalez-Manteiga, W. (2005). Test-
ing for interactions in generalized additive models: Application to S 02 pollution
data. Stat. Comput. 15, 289—299.

[58] Roca-Pardiﬁas, J., Cadarso—Suarez, C., Tahoces, PG. and Lado, MJ. (2008).
Assessing continuous bivariate effects among different. groups through nonpara-
metric regression models: An application to breast cancer detection. Comput.
Statist. Data. Analysis. 52, 1958-1970.

113

[59] Rosenblatt, M. (1976). On the maximal deviation of k-dimensional density esti-
mates. Ann. Prob. 4, 1009—1015.

[60] Ruppert, D., Wand, M. P. and Carroll, R. J. (2003). Semiparametric regression.
Cambridge University Press, Cambridge.

[61] Ruppert, D., Wand, M. P., Holst, U. and HOssjer, O. (1997). Local polynomial
variance-function estimation. Technometrics. 39, 262—273.

[62] Silverman, B. W. (1986). Density estimation for statistics and data analysis.
Chapman and Hall, London.

[63] Song, Q. and Yang, L. (2009). Spline conﬁdence bands for variance function. J.
Nonparametri. Statis. 21, 589—609.

[64] Song, Q. and Yang, L. (2010). Oracally efﬁcient spline smoothing of nonlinear
additive autoregression models with simultaneous conﬁdence bands. J. Multi.
Analys. In press.

[65] Sperlich, S., TjOStheim, D. and Yang, L. (2002). Nonparametric estimation and
testing of interaction in additive models. Econometric Theory. 18, 197—251.

[66] Stone, C. J. (1985). Additive regrassion and other nonparametric models. Ann.
Statist. 13, 689—705.

[67] Stone, C. J. (1994). The use of polynomial splines and their tensor products in
multivariate function estimation. Ann. Statist. 22, 118—184.

[68] Tj¢stheim, D. and Auestad, B. (1994). Nonparametric identiﬁcation of nonlinear
time series: projections. J. Amer. Statist. Assoc. 89, 1398—1409.

[69] Tusnady, G. (1977). A remark on the approximation of the sample df in the
multidimensional case. Periodica M athematica Hungarica. 8, 53—55.

[70] Wang, J. and Yang, L. (2009). Polynomial spline conﬁdence bands for regression
curves. Statistica Sinica. 19, 325-342.

[71] Wang, L. and Yang, L. (2007). Spline-backﬁtted kernel smoothing of nonlinear
additive autoregression model. Ann. Statist. 35, 2474—2503.

[72] Wang, L. and Yang, L. (2010). Simultaneous spline conﬁdence bands for time
series prediction function. J. Nonparametri. Stat. in press.

[73] Wang, N ., Carroll, R. J. and Lin, X. (2005). Efficient semiparametric marginal
estimation for longitudinal/clustered data. J. Amer. Statist. Assoc. 100, 147—
157.

[74] Wu, W. and Zhao, Z. (2007). Inference of trends in time series. J. R. Stat. Soc.
Ser. B. 69, 391—410.

114

[75] Xia, Y. (1998). Bias-corrected conﬁdence bands in nonparametric regression. J.
Roy. Statist. Soc. Ser. B. 60, 797—811.

[76] Xue, L. and Yang, L. (2006). Additive coefﬁcient modeling via polynomial spline.
Statistica Sinica. 16, 1423—1446.

[77] Yang, L. (2008). Conﬁdence band for additive regression model. Journal of Data
Science 6, 207—217.

[78] Yang, L., Park, B.U., Xue, L. and Hardle, W. (2006). Estimation and testing
of varying coefﬁcients in additive models with marginal integration. J. Amer.
Statist. Assocg 101, 1212—1227.

[79] Yao, F. and Lee, T. C. M. (2006). Penalized spline models for functional principal
component analysis. J. Roy. Statist. Soc. Ser. B. 68, 3—25.

[80] Yao, F., Miiller, H. G. and Wang, J. L. (2005a). Functional linear regression
analysis for longitudinal data. Ann. Statist. 33, 2873—2903.

[81] Yao, F., Miiller, H. G. and Wang, J. L. (2005b). Functional data analysis for
sparse longitudinal data. J. Amer. Statist. Assoc. 100, 577—590.

[82] Yao, F. (2007). Asymptotic distributions of nonparametric regression estimators
for longitudinal or functional data. J. Multi. Analys. 98, 40—56.

[83] Yao, Q. and Tong, H. (2000). Nonparametric estimation of ratios Of noise to
signal in stochastic regression. Statistica Sinica. 10, 751—770.

[84] Zhang, F. (1999). Matrix theory: basic results and techniques. Springer-Verlag,
New York.

[85] Zhang, J. T. and Chen, J. (2007). Statistical inferences for functional data. Ann.
Statist. 35, 1052—1079.

[86] Zhao, X., Marron, J. S. and Wells, M. T. (2004). The functional data analysis
View of longitudinal data. Statistica Sinica. 14, 789—808.

[87] Zhao, Z. and Wu, W. (2008). Conﬁdence bands in nonparametric time series
regression. Ann. Statist. 36, 1854—1878.

[88] Zhou, L., Huang, J. and Carroll, R. J. (2008). Joint modelling of paired sparse
functional data using principal components. Biometrika. 95, 601—619.

[89] Zhou, S., Shen, X. and Wolfe, D. A. (1998). Local asymptotics of regression
splines and conﬁdence regions. Ann. Statist. 26, 1760—1782.

115

111'“

"'llll[l[]l][l[[i][[l[[l

0