.1— .
‘

yam? 3m ..
a u.” E

,n

i n '
.2»: W.

9‘.
m:

m:
iiiﬁn
1a 1 Vin

K.-
. u... L.
: 1. “m5?!
6... r: .31.!

, .
, b. _. I
#1?»

.
z
a...

 

, .. 1.11....
.13.. .

 

.35
:

 

:3. E
l.\'h 5v}. ,

if g 3.

a .
..,...z.. :3. ....

 

11...}. .. r . . V , . . .Y .. ‘ . . . I, .rlbrt‘uu , 1w. Aunt!”
; .3, 7 . . ‘ . _3,,....~.§..q. . a? may. 5.
, ‘. ‘ )1 .. ‘ .. ..... 5. ..1‘ . .

 

 

THESIS

F'» ;
J/O?

This is to certify that the

dissertation entitled

Nonlinear Wavelet-Based Nonparametric Curve
Estimation With Censored Data And Inference on
Long Memory Processes

presented by

Linyuan Li

has been accepted towards fulﬁllment
of the requirements for

Ph.D. degree in Statistics

 

 

) 7
Major professor

Hira L. Koul

Date May 20, 2002

MS U is an Afﬁrmative Action/Equal Opportunity Institution 0-12771

 

LIBRARY
Michigan State
University

 

 

 

PLACE IN RETURN BOX to remove this checkout from your record.
To AVOID FINES return on or before date due.
MAY BE RECALLED with earlier due date if requested.

 

DATE DUE DATE DUE DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

6/01 c:/CIRC/DateDue.p65-p. 15

NONLINEAR WAVELET-BASED NONPARAMETRIC CURVE
ESTIMATION WITH CENSORED DATA AND INFERENCE ON
LONG MEMORY PROCESSES
Bv

U

Linyuan Li

A DISSERTATION

Submitted to
Michigan State University
in partial fulﬁllment of the requirements

for the degree of
DOCTOR OF PHILOSOPHY
Department of Statistics and Probability

2002

ABSTRACT

NONLINEAR WAVELET-BASED NONPARAMETRIC CURVE ESTIMATION
WITH CENSORED DATA AND INFERENCE ON LONG MEMORY
PROCESSES
By

Linyuan Li

In the ﬁrst two parts of this thesis, we provide asymptotic formulaes for the
mean integrated squared error (MISE) of nonlinear wavelet-based density and haz-
ard rate estimators under randomly censored data. We show this MISE formula,
when the underlying survival density and hazard rate functions and the censoring
distribution function are only piecewise smooth, has the same expansion as anal-
ogous kernel density estimators. However, as to the kernel estimators, this MISE
formula holds only under the smoothness assumption. In addition, we establish
an asymptotic normality of non-linear wavelet estimator of hazard rate function,
which is useful to construct a conﬁdence interval of hazard rate function.

In the third part, we discusses the asymptotic behavior of Koul’s minimum
distance (m.d.) estimators of the regression parameter vector in linear regression
models with long memory moving average errors, when the design variables are
either known constants or i.i.d. random variables, independent of the errors. It is
observed that all these estimators are asymptotically equivalent to the least squares -

estimator in the ﬁrst order.

ACKNOWLEDGMENTS

I would like to express my deep gratitude to my dissertation advisor, Profes-
sor Hira L. Koul, for his constant guidance, generous support and extreme patience
shown during the writing of this dissertation. His dedication and contribution to
statistics have been my main source of inspiration during my graduate study. I
am also very grateful to Professor Winfried Stute for his valuable and constructive
suggestion and Professor Donatas Surgailis for providing me a preprint of Lemma

3.3.3 in this thesis.

I would also like to thank Professors LePage, Levental and Salehi for serving
on my thesis committee. My special thanks go to Professor Ibragimov who offers
wonderful courses during each summer and to Professor Page for training me as a

statistical consultant.

The research in this thesis was also partly supported by the NSF Grant DMS

0071619 under the RI. Professor Hira L. Koul.

iii

TABLE OF CONTENTS

1 Nonlinear Wavelet-based Density Estimator 1
1.1 Introduction ............................... 1
1.2 Notatlions and Estimators ....................... 5
1.3 Main results ............................... 7
1.4 Proofs of the theorems ......................... 8
1.5 Proofs of the propositions ....................... 23
2 Nonlinear Wavelet-based Hazard Rate Estimator 34
2.1 Introduction ............................... 34
2.2 Notations and Estimators ....................... 35
2.3 Main results ............................... 36
2.4 Proofs .................................. 38

3 Minimum Distance Estimators in Regression Models under Long

Memory 52
3.1 Introduction ............................... 52 —
3.2 Main results ............................... 54

iv

3.2.1 The case of non-random designs ................

3.2.2 The case of random designs ..................

3.3 Proofs of the theorems

3.4 Appendix .......

Bibliography

ooooooooooooooooooooooooo

Chapter 1

Nonlinear Wavelet-based Density

Estimator

1 .1 Introduction

The mathematical theory of wavelets and their applications in statistics have be-
come a well-known technique for non-parametric curve estimation: See e.g., Meyer
(1990), Daubechies (1992), Chui (1992), Mallat (1989), Donoho and Johnstone
(1994), Donoho, et a1. (1995, 1996) and Kerkyacharian and Picard (1992, 1993).
For a systematic discussion of wavelets and their applications see recent mono-
graph by Hardle, et a1. (1998). The major advantage of the wavelet method is its
adaptation to erratic behavior of the density and local adaptation to the degree
of smoothness of the unknown density. These wavelet estimators typically achieve

the optimal convergence rates over exceptionally large function spaces. They do

an excellent job of taking care of discontinuities in the target function, and in con-
sequence they enjoy very good convergence rate even if smoothness conditions are
imposed only in a piecewise sense.

Hall and Patil (1995) ﬁrst explicitly demonstrated that, in the no censorship
case, the discontinuities of densities have a negligible effect on the performance of
non-linear wavelet density estimators. The mean integrated squared error (MISE)

of the kernel estimator of density function f has the form

MISE ~ 61(nh)-1 + c212”,

71

where “~ means that the ratio of the left- and right-hand sides converges to 1 as
the sample size n —> 00, h is the bandwidth of the kernel estimator, r is the order of
the kernel and c1 and c2 are constants depending on both the kernel and unknown
density. The ﬁrst term derives from the variance and the second from the squared
bias. This expansion for kernel estimators generally fails if the underlying density
function does not have r derivatives (Hall and Patil, 1995, p.906). However, the
MISE expansion of non-linear wavelet estimators is still valid for only piecewise
smooth density function, and even has the same constants c1 and c2. Patil (1997)
provided similar results for non-linear wavelet-based hazard rate estimator with
complete data.

In industrial life-testing, medical follow-up research and other studies, the
observation of the occurrence of the failure event may be prevented by the previous

occurrence of the censoring event. So only part of observations are real failure times.

Formally, let X1, X2, ' -- ,Xn be i.i.d. survival times with a common distribution

function F and density function f. Also let Y1, ’2, - -- ,Yn be i.i.d. censoring times
with a common distribution function G. It is assumed that X.- is independent of
Y, for every 2'. Rather than observing X1, X2, - -- ,Xn, the variables of interest, in
the randomly right-censored models, one observes Z,- = min(X,-, Y,) = X,- A K and
6,- : I(X.- s K), i = 1,2, - -- ,n, where 1(A) denotes the indicator function of the
set A.

Antoniadis, et al. (1999) describe a wavelet method for the estimation of den-
sity and hazard rate functions from randomly right-censored data. The method is
based on dividing the time axis into a dyadic number of intervals and then counting
the number of events within each interval. The number of events and survival func-
tion of the observations are then separately smoothed over time via linear wavelet
smoothers. They provide estimator’s asymptotic normality and obtained best pos-
sible asymptotic MISE convergence rate under the assumption that survival time
density function f is r-times continuously differentiable and the censoring density
9 is continuous.

The objective of this chapter is to propose a non-linear wavelet estimator of
density function with censored data and derive a result similar to the main result,
Theorem 2.1 of Hall and Patil ( 1995). One of the consequence of this extension is

that we can show that MISE has the analogous expansion
MISE ~ kln'lp + kgp‘z’, (1.1.1)

where n denotes the sample size, p is the smoothing parameter, a wavelet analogue

of the bandwidth ’1‘1 for kernel estimators and k1 and k2 are constants depending

3

on the wavelet, unknown density and censoring distribution.

Recently Wu and Wells (1999) provided hazard rate estimation by non-linear
wavelet methods in the left truncation and right censoring model. They have n
observations (Xi,6,-, V.) with X,- 2 V,, where X,- = min(T.-, U.) and 6,- : I(Ti _<_ Ui).
They applied counting process techniques and obtained analogous MISE expansion,
but needed further truncation. They provided a wavelet-based estimator for hazard
rate function over bounded interval [L, T] which is chosen such that the size of risk
population satisﬁes the following conditions:

(Y1): P(Ym;n-s na) 2 o(n"2) for some a > 0, where Ymin = inf,€[,,,.]}"(t) and
Y(t) = 23:11“, 2 t 2 V,).

(Y2): ESUPtepnjlﬁr) — WI = 0(n‘1), where C(s) = E|Y(s)/n|.

Basically, the condition (Y1) means that the size of the risk population Y(t) is
large and the condition Y(2) means that Y(t) is uniformly close to its expectation,
for all t E [L, r]. In addition, they only obtained the approximation ( 1.1.1) for the
MISE, which is weaker than the result (1.3.1) given below.

In this thesis, we apply the method of Stute (1995) that approximates a
Kaplan-Meier integral by an average of i.i.d. random variables with a sufﬁciently
small rate. We provide a MISE expansion similar to that of Hall and Patil (1995)
for density function over (—00, T], for any ﬁxed T < T”, where TH = inf{:r : H (x) =
1} S 00 is the least upper bound for the support of H, the distribution function of
Zl.

In the next section, we give the elements of wavelet transform and provide non-

linear wavelet-based density estimators. The main results are described in Section

3, while their proofs appear in Sections 4 and 5.

1.2 Notations and Estimators

This section contains some facts about wavelets that will be used in the sequel. Let
¢(:r) and w(:r) be father and mother wavelets, having the properties: (15 and 11) are
bounded and compactly supported. I a? = f w? = 1, ,uk 5 f yk¢v(y) dy = O for

O S k S r — 1 and p, = rlrs: aé 0, where K. = (r!)‘1fy'w(y)dy. Let
$10?) = P1/2¢(P$ - j), wig-(I) = pE/wa — j), :r 6 IR

for arbitrary p > 0, —00 < j < 00 and p,- = 192", 2' Z 0. Then

f¢j1¢j2 = 61112, [wiijlwizh = 6i1i26j1j2’ [Chill/”’02 = 0’

where 6,)- denotes the Kronecker delta, i.e., 6,,- = 1 , if 2' = j; 0, otherwise. For the
more on wavelets see Daubechies (1992).

In our random censorship model, we observe Z ,- = min(X,:, I2), and 6,- = I (X,- g
Y,), i=1,2,--- ,n. Let T < TH be ﬁxed and f1(:r) = f(:r)I(:r g T). We estimate
f1(x), i.e. density function f (:r) for a: E (—oo,T]. The wavelet expansion of f1(:r),
assuming f1 6 £2, is

f1(:r) = Z b.¢.~(z)+2 Z amt-(x),

j=-oo i=01=-oo (1.2.1)

bj =/f1¢j, bij=/f1¢‘ijo

We propose a nonlinear wavelet estimator of f1(2:) :

. oo . q—l 00 A
M2) = Z bjcbjoc) + Z 1(|b,j| > 5)¢,,(x), (1.2.2)
jz—oo i=0 jz-oo

where 6 > 0 is a “threshold” and q _>_ 1 is another smoothing parameter, and the

wavelet coefficients (3]— and Iii]- are defined as follows:

 

 

__ 61:1(Zk S T)¢j(Zk)
bj =/¢j()1(:c < T) an2:( )= ”ELI l-én(Zk-) , (1.2.3)
.. = a: ___ _1_ n 6kI(Zk S Tli/Iij(Zk)
b. / w..()1( <T)dF(x> Hz. 1_ cf..(z.—) (1.2.4)

Here F}, and On denote the Kaplan-Meier estimators of distribution functions F

and G, respectively, i.e.,

—1_[" ___:lNZmSI)
k=1 n — 61: )+1

" _1_[ 1— 50¢) _] (2mg)
k=1 n — 1: +1

where, Z0.) is the k-th ordered Z-value and 6“.) is the concomitant of the i-th order
Z statistic, i.e., 6m 2 63- if Z0.) = Z,-. Note that 6k/n(1 — Cn(Zk-)) is the jump of

the Kaplan-Meier estimator F}, at Zk.

Remark 1.2.1 We can deﬁne the wavelet estimator of f (2:), say f (2:), instead of
f1(:r), similarly to (1.2.2)-(1.2.4). However, in this case, the MISE, i.e. Ef(f —
f )2 < 00 cannot be ensured. Thus we typically consider E ffoo( f — f )2 to eliminate

the endpoint effects. Since wavelet estimator f is the same as f1 whenever Z(,,) S T,

[:(f—f)2=/(f1—f1)"‘—/:f{"=/(f1—f1)2,

we have

provided that T 3 Z(,,). Thus, our analysis for fl is closely related to that for f

restricted to (—00, T].

Remark 1.2.2 Although here we consider survival times setting, the random vari-
ables by no means be necessary restrictedly to positive. Suppose there is no censor-
ing, i.e. G _=. 0 on (—00, 00). Then 6,. E 1, for all k = 1,2, - -- ,n and upon taking
T = TH = 71:, we see that f1 E f and the above estimator f1 = f of Hall and Patil

(1995)

1 .3 Main results

We assume that the smoothing parameters p, q and 6 satisfy the following condition:

A: p—->oo, q—>oo, p62—>0, p2r+162—+oo, 62CVn'llnn,where
q

C > Co E 2{r(2r + 1)‘1 sup f1(1 — G)’1}1/2.

Theorem 1.3.1 In addition to the conditions on (b and it) stated in section 1.2,
assume that the r-th derivative f(') is continuous on (—oo,oo) and is bounded,
monotone on (-—oo, -u) and (u, 00) for a suﬁiciently large positive it and the cen-

soring distribution function G is continuous. Also assume that condition (A) holds.

Then

El/(fl — f1)2 - {n’lp Til—G +p'2’n2(1 — 2‘2")“/f1‘”2}

 

(1.3.1)
= 0(n‘1p + 12‘”)-

Remark 1.3.1 This theorem is an analogue of Theorem 1 of Hall and Patil (1995),
where the monotonicity of f (’l on (u, 00) for large positive u is needed. However,

7

it is not needed for censored data case, because of the effect of truncation at T.

Remark 1.3.2 The result (1.3.1) is stronger than traditional asymptotic formula

for MISE. It implies a wavelet version of the MISE formula:

E/(fl - f1)2 N "—11? 1_f_IG “PP—2”?“ — 2-2rl-1/fir)2-

In the Theorem 1.3.1, we have assumed that survival time density f is r-times
continuously differentiable and censoring distribution function G is continuous for
simplicity and convenience of exposition. However, if f (’l and G are only piecewise

continuous, Theorem 1.3.1 still holds. That is the following:

Theorem 1.3.2 In addition to the conditions on d) and w stated in section 1.2,
assume that the r-th derivative f”) and G are only piecewise smooth, i.e. there
exist points 230 z —oo < 2:1 < 2:2 < < am < oo = ash/+1 such that the ﬁrst r
derivatives off exist and are bounded and continuous on (rung-+1) for 0 g i g N,
with left- and right-hand limits; and that f") is monotone on (—00, —u) and (u, 00)
for suﬁiciently large positive u. In particular, f and G themselves may be only

piecewise continuous. Also assume that condition (A) holds and pgr+1n‘2' —-> 00.

Then also (1.3.1) holds.

1.4 Proofs of the theorems

The proof of the above theorem follows along the lines in Hall and Patil (1995),
combined with Stute (1995) which establishes an asymptotic representation for

the Kaplan-Meier integral f (,0an as an average of i.i.d. random variables with a

8

sufﬁciently small error. This allows for a more traditional and direct approach to

the density estimation problem for the censored data, compared to the martingale

approach as, e.g., in the Wu and Wells (1999). We begin with some lemmas. To

state these lemmas, we ﬁrst need some addtional natation. Let

991(1) = ¢J($)I(‘T S. T): j : Gail-3:27”. a

Wij($)=¢ij($)1($ST), i=0,1,~-,q—1,j=0,i1,:l:2,~-,

~ 1 n 5k<Pj(Zkl .
bj=hzl———’J:O’i1’i2’on’

- C(Zk)
13,-,- = 12%, i=0,1,~-- ,q— 1; j=0,:l:1,:i:2,---,
- k
and
lezkl= Uj—(Zk) V( 21:) Wz'j(Zk) = U11 Zk) - lblzkli
_ 1 _, 1 " ,
W1 2 EZWJIZk), ll 13' = ; k=1wij(Zk)a
where
1— 6k T”
Uj(Zk) - l—jH—(Zsfzk 911(wlF(d~U),
1 — 5k T”
Uij(Zk) — rmﬁ) 941(0)) F014).

 

Viz" A: [1,. (pl (“2"”) G<dv>F<dw),

coll—)) U)ll1_ G’U(

 

)l
Vij(Zk)— “f: [m ({9sz “U < Zk C$03)]G'(dv)F(dw).

coll-H (Mill-

(1.4.1)
(1.4.2)

(1.4.3)

(1.4.4)

(1.4.5)

(1.4.6)

We also deﬁne

 

H3,(z) = n’IZHZ, < 2,6, =3), _7 = 0,1, z 6 1R,
i=1
and
B)... sn/_:—1n[1+n(1_Hl(z))J H3(dz)—/_:_%, (1.4.7)
C... E/ﬁjfl H——I%d;——"Z()Z)— 1:111 0d:—-})(, 1 g k g n. (1.4.8)

Lemma 1.4.1 Let b,- and bij be deﬁned as in equations (1.2.3) and (1.2.4). Then

the following equations hold.

. - ___, 1
bj 2' bj + WJ' + R110“, E(R,21,j) = 0 (m) [993 (IF, (1.49)
- - _, 1
bij = bij +ll1‘j + Rﬂ,ij1 E(R311])— 0 (£5) [99?]- dF. (1.410)

Proof. Because the proofs of (1.4.9) and (1.4.10) are similar, details are given only
for (1.4.10). The proof below uses many ideas of the proof of the main theorem
of Stute (1995). The main difference is that here the integrand wij depend on n,
instead of a ﬁxed function in Stute (1995). Another difference is that we need the

rate of reminder in second moment, instead of in probability.

10

Write 8” = f <p,-,-(:r)d15’n(a:). From a result in Stute (1995, p.434) and details

of the proof of the theorem in Stute (1995), we obtain

4 f¢z1(W)70H1dw+// ”<w_"°"((“’)'"”°( “’)H‘(du)H°(dv>

_ v < u, v < w)c,o,-,-(w)m(u,~)1-0 v ~1 w u H
///I( ' [1—H(v)]2 H (d )H (d )H..(d )+R....,.

 

 

where

Rn,z'j = nl,ij + 5712.0 + Rn1,ij — Rn2,ij + 23113.0 a (1-4-11)

Hn, H2 and H}, are the empirical (sub-) distribution function estimators of H , HO

and H1, respectively, 70(Zk) = 1/(1 — C(Zk)), and

5711,41: iiﬁpiﬂ ()Zk )"/0(Zk)5kBkm (1.4.12)
nk: 1
Sn2,ij 2% g; (pij(Zk)6keA"{Bkn + Ckn}2, (1.4.13)

 

--— --w w z u' [Hn(z (2)]2 0 *4 ~1 w
12...”- ff m m m < .)[,_H(z;.]'[1_(H wHHanHAd ). (1.4.14)

Raw/ff

x [H2(dv) — H°<dv)1[H.‘.(dw) — H‘wwn

+/// “22:23.: W)

x [Hn(du) — H(du))H°(dv)(H,1(dw) - H1(dw)]

+ [ff [(1) < ail: _<_ Zijgngwhom)

x [Hn(du) — H(du)][f13(dv) — H°(dv)]H1(dw), (1.4.15)

 

 

 

 

R.3,.,-— ==I(// v < w _<p,,((w (“(mlHﬂdv) _ H°(dv)][H,§(dw) — H1(dw)].

(1.4.16)

11

Writing the ﬁrst three integrals of 13,-,- as sum and using the deﬁnitions of Uij (2).), 1*; ,- (2,.)

and W,,-(Zk), we have
~ 1
biJ-zb'j‘t' 5k: U1j( ()Zk ——ZW Rnij

= BU + Wij + Rn,ij-
Thus, to prove (1.4.10), it sufﬁces to bound the ﬁve terms of the RHS of (1.4.11).
Hence, the lemma follows from the following ﬁve propositions. All proofs are given

in the next section.

Proposition 1.4.1 Under the assumptions of Theorem 1.3.1,

1
Ewan: 0(n)/4MF

Proposition 1.4.2 Under the assumptions of Theorem 1.3.1,

1
E(512121]) O k?) [943' dF'

Proposition 1.4.3 Under the assumptions of Theorem 1.3.1,

1
E (Riln'j) : 0 (a) [Vij dF'

Proposition 1.4.4 Under the assumptions of Theorem 1.3.1,

imamg=o(:)/%MF

Proposition 1.4.5 Under the assumptions of Theorem 1.3.1,

Emwp=o(;)/%MF

12

Lemma 1.4.2 Under the assumptions of Theorem 1.3.1,

= 002-11)).

 

2(5)” ‘ b1)? — "-119 f1

”EB . l—G
J

 

 

Proof. In view of (1.4.9),
~ - f ““72
Z(b,—b,-)2-n 1}) Til—é +EZW‘J-4-EZR2J
j j j

+ 2E: IBJ — 51'”le + 2E: “32‘ - ijle + 213: an.3‘|le|
j 2‘ j

31 SE

 

 

=311+ 312 + 813 + 314 + 315 + 3169 (say).

Noticing that

 

 

Gﬁy+iVm ”
weobtain
_ 2(p1—1 f1((y+J)/ __ 2
2w . W 1 izb
Sincef¢2=1,

Erma +,~)/p)/(1_ Guy +j)/p>) 4 [fl/(1 — G).
J
and Z]. bf z 0(ff12), then

EZ(hj—l) j 2=n p/f1/( 1(—G )+o(n‘1p).
.7

As a consequence of Zk g T for all k = 1, 2, - - - ,n, all denominators appearing in

(1.4.3) and (1.4.4) are bounded away from below. Thus they may be handled along
the same lines as those in Hall and Patil’s paper p.922 to show that Var( 23(13)- -
(5)2) = o(n‘2p2). So we obtain 311 = o(n’1p).

13

From (1.4.5), we have
312 g n" Zewﬁzl) g 2n'1:(EUJ-2(Zl) + E132(Z1)).
J 1
In view of (1.4.6), applying the Cauchy-Schwarz inequality and using the compact

support of o, we ﬁnally can obtain

1
[1—H(T)][1—GT

 

EU2(ZI)<)]p-1/¢2(u)f12((u +j)/p)dU- (1-4-17)

So we obtain

n-IZEU?()Z,= 0(n“1/¢2(u)§;p‘1f,( (u+j )/p)du)o(= 22-1 p.)

Y

By applying the same argument, we can obtain

1 —1 2 2 .
H(T)]2[1—G(T)]2p /¢(U)f1((u+J)/P)du2 (1.4.18)

 

Ev3?(z2)< [1_

thus, 71-12], EVJ-2(Zl) = o(n'1p) too. Hence 312 = 0(n’1p). By (1.4.9),

313 = o (ﬂax/2,2321}? = o (g) = 0(n-1p).

Finally, by applying Cauchy-Schwarz inequality twice to 314, together with s11 and

313, we obtain

1/2
314 S 2 (2 E091 _ (5)2 ' ZEerm) = 0(n'lp).
J J

By applying the same argument, we can show 315 = 316 = o(n‘1p) too. This

completes the proof of the lemma.

Lemma 1.4.3 Under the assumptions of Theorem 1.3.1,

22: 224022 2 21(lb22|>6)} o<n-2'/<'”+”).

i=0 j

14

Proof. The proof is analogous to that of Theorem 2.1 of Hall and Patil (1995,
p.916-918). The difference is that we need to take care of two additional terms,
Wij and Rmij. Thus we only provide the detail for these two parts. As there, let a
and [3 denote positive numbers satisfying a + B = 1, and set

2., = Z ZE{(82 — b22)2}1(|b22| > 22;),

__0 j

322-. — :ZE{(bij— bij)21(lbiJ-bi1l > B6»

i=0j

So s2 5 321 + 322. By (1.4.10), we have

821 S3iZEU5i1 — b,j)2}1(|b,-j| > 05)

i=0 j
q-l
+ BZZEWEqum-jl > at)
._0 J.
”23213123,,“ (lb,j|>a6)
i: 0 j

=3(321,1+ 521,2 + 821,3), (say).

Define f”,- = sup f1((y+j)/p,-)/(1—G((y+j)/p,~)). Since the denominator
116311139“)

1— C((y + j)/p,-) Z 1 — C(T), for all i,j and y, we have f1.ij S sup (1 —
yesuppw

G'(T))‘1f1((y + j)/p,~). Because fl is bounded and monotone in the extreme tails

and 112 has a compact support (—v, v), we have, for sufﬁently large K,

sup—Zf122-< (:r» ‘sup-l-Z sup f1((y+j)/p2)

n;i>0pi j miZOpi j yE(—v,v)

30-60))" sup— 2: f1((((u+j)/p2-)+Ksur>f1]

_n ,i>0pi
I] /P |>K

l'

g (1 —G’(T))’1 sup/l" Kf1(u/p.-+a:)d.r+Ksupf1],

_n;i20

 

15

where u = v or —v, depending on the monotony of f1. Hence we have

supp: 12h” < 00 and supp: 12/99?de < 00. (1.4.19)
J

n;1>0 n ,1'>O

Use this fact and an argument as in Hall and Patil (p.916-917) to obtain 321,1 =

o(n"2’/(2’"+1)). As to the 321,2, from (1.4.5),

2 q
3212 _<_ EZZ[EUEJ()Z1 +EV- 3(2.,)] (1.4.20)

i=0 J

By applying an argument similar to (1.4.17) in Lemma 1.4.2, we have

EUé-(zn = 0(2)? / ¢2(U)f12((u +j)/p.)du),

so that

where the second equality holds from q = 0(ln n), and Z]. pf1f12((y+j)/p,-) —> f ff.
By applying an argument similar to (1.4.18) to the second term of (1.4.20), we have

321,2 = o(n‘2’/(2'+1)). Next, from (1.4.10),

q—l
321,3<§2ER11JJ=O($)ZpiZP:I/Q?Jdlr

i=0 j 1‘0 J
1.4.21
= 0 (52-) by (1.4.19), ( )
n
= 0(n—2r/(2r+1)) by n—lpq _> 0

Thus, 321 = o(n‘2'/(2'+1)).

16

As to the 322, by (1.4.10) we have

822 <3QZIZE{(b1J—b1J)2II—b1jl>1‘36)}

1:0 J
q-1 __2 ,
+ 3ZZE{WUI(|b,-j — b2Jl > 66)}
1:0 j
+3ZZE{R"U bij_b1Jl > (36)}
1: 0 J

=3(S22,1 + 322.2 + 322.3)» (say).

By applying the similar argument as that in 321,2 and 8213, it is obvious that

3222 : 522,3 : o(n‘2’/(2’+1)). To complete the proof of this lemma, it thus suffices

to prove 322,1 : o(n‘2'/(2'+1)). In view of (1.4.10), we have

5221(:ZElwij_bij)21(lb1J-bijl> 0155)}

1:0 J

+2259” 0 2211-12.,|>2,52)}

+22%!) 22-— b2)21(|R2.-J|>22,132)}

1:0j

=822,11 + 322.12 + 822,132

where 011, 02 and 03 are positive numbers such that 01 + (12 + a3 : 1.
The term 322,11 is similar to 312 in Hall and Patil, following the argument there,

noticing 5,5 and fly-J- play the roles of hij and fij there, we can show that

322,11 = 0 (n’2r/(2r+l)) . (1.422)

17

AS 130 the 82213, 191'. A = {IBJ'J' - bijl S 01/36}, then,

322,13 ziZE{(Bij—bgj)(21(|Rnijl > 03136) )14}

 

1-0 ,1
+ZZE{(51J—bij)21(.|Rnijl>0336)14c}
1:0 J
<22}? 3262PI <.an2~2| > 2222)
1: 0 j
+EZE{(b ii ()0) )(21lb1’3 _‘bijl > (11/36)}
1: 0 j
<iza2 (3262E (12,8262 +322,11
1-0 j
=0 (:ZERn,1j)2 + 822.11
1— 0 J
:0 (n-2r/(2r+1)),

the last equality follows from (1.4.21) and (1.4.22). By applying the same argument
to 322,12, we can show that 322.12 = 0 (71‘2” ("2””) too. This completes the proof of

the Lemma 1.4.3.

Lemma 1.4.4 Under the assumptions of Theorem 1.3.1,

22222.2 I2..I<2 I) m

__0 J

= 0072')-

 

Proof. The proof here is again analogous to that of Hall and Patil (1995) and the
details are omitted. We only provide special treatment for the additional two terms

related to Wij and R" ,J. As there, let 6 > 0, and define

33O=qzlzbijl (lb1j|<6)1 331::lzb2Ji I{lbiJl<( (1+€)6}

i=0 j l: 0 J
q-l
332— _ 022123 1{|b,,-| < (1— a5}, A = XZbEJIIIb.J — b,,| > 26).
i=0 j i=0 J

18

Then

S32 - A S 830 S 331+ A . (1423)

By applying the arguments analogous to those of Hall and Patil (1995, p.918-921)

to 331 and 332, we can obtain

22. = 232 = 2‘2'2220 - 2‘2")'1 f If” + 002-2"). (1.4.24)
Now,
EA <ZZb b,,-—b.-,-|>7126)+qZ:Zb§,.1>(|I_I,,-I>7225)
'=o j 1:0 J
+22%!) (IRMA > 2226)
._0 J.

=A1 + A2 + A32 (WW2

where '71, '72 and 73 are positive numbers such that 71 + 72 + 73 : 1 .

(22” eXP(--I1— 2) f52f;:,2222])

=(::23)_0I)231

1:0 J

The ﬁrst equality follows from Bernstein’s inequality (see Hardle, et al. 1998,

p.244), while the second follows from 1162 —+ 00.

212-0 (2223, 53%;)” (3:223; 7/2322)

1-0 J 1‘ 0 J
q-l

= ., (2:22.) =22».
1:0 J'

The second equality follows from the arguments in (1.4.17) and (1.4.18), while the
third follows from 1162 -> 00. Similarly, we can show A3 = 0(331) too. Thus,

EA : 0(331). Combining (1.4.23) and (1.4.24), the proof of lemma follows.

19

Lemma 1.4.5 Under the assumptions of Theorem 1.3.1,

34:225.}: p)—22

1:9]

Proof. The proof follows from the step 3 of Theorem 2.1 of Hall and Patil (1995).
We are now in the position to give the proof of the Theorem 1.3.1 and 1.3.2.

Proof of the Theorem 1.3.]. Proof follows from the bound

E‘/(f1 - f1)2 - {71—119 1110 +P—2r52(1 " 2-2r)_l [f1r)2}

£81+32+S3+S4,

 

 

and Lemmas 1.4.2, 1.4.3, 1.4.4 and 1.4.5.
Proof of the Theorem 1.3. 2. We use the same notations as in Hall and Patil (1995).

Noticing that, by the orthogonality properties of (b and 1/2,

[(1—12)‘2’.,,=I(zz.z..),

where Z denotes the set of all intergers and

q-l
Iq(lp, $0, W1, . . .) = 2a)., — bj)2 + Z 2(81j_b1j)21(|b1jl > 6)
jeW 2:0 Jew.-
+§Zb3,1(1b,,|<6)+22b3,.
1:0 j6W11=q jE'I’g

20

By (1.4.9) and (1.4.10),

1,011,210, 211,, . . .)
=20}, — bj)2 + 2W: + ZREBJ + 22(51 — bj)Wj

J'EW J'EW J'ESP J’EWI

+2203,- 6,)12..,-+22W—7,-R,,-+:Z(6 .,-— 6,,)(21|6,,|>6)
JIEW JEW 1: OJEW

+:ZW3, 1( (|6,,-|>6) +QZZR,,,1 ()6,,|>6)
2:0 36!? 2:0 jEW
q—l

+2ZZ(6,,—6,)IT1(|6,,—|> 6) +2216 ,j— 6., )R,,-,-1(|6,,-| >6)
2:0 3912. 2= 0169?.-

+2EZWURM,I( ()6,,-| >6) )+q:::bf,l( ()6,,-| <6) +22%.
2‘: OjeW 1': 0 jeW i=0 jew

=11+12+I3+I4+15+15+I7+Ig+19+110
+111+112+113+1142 (say).

By the Theorem 1.1 of Stute (1995), when F and G are only piecewise con-
tinuous, a quartile transformation may be applied so as to trace everything back
to uniformly distributed Z ’3. Thus the above Lemma 1.4.1 still holds in this case
(see also Stute and Wang (1993, p.1605).

From (1.4.17) and (1.4.18) in the Lemma 1.4.2, we obtain E12 = o(n‘1p).
From (1.4.9) in the Lemma 1.4.1, it is easy to see E13 = 0(n‘2p) = 0(n‘1p). From
the proof of Lemma 1.4.3, we have E18 = 0(n‘zr/(2'+1)). From (1.4.21), we also
have E19 2 o(n“2’/(2'+1)). Applying the Cauchy-Schwarz inequality, we can show
14, I5, Is, 110, In and 112 are all of the order o(n‘2'/(2'+1)).

When fl is only piecewise smooth, let lI denote the ﬁnite set of points where
flm has discontinuities for some 0 g s S r. Suppose supp d) g (—v,v), supp

21

111 g (—v,v) and let

K={k:k€ (px—v,p:r.+v) forsome :rEl'I},

K,- = {k : k 6(1),:1: - v,p,-sc+v) for some :1: E H}.

Also let Kc, 1K3 denote their complements. Then, unless j 6 HQ, b,, and 13,-,- are con-
structed entirely from an integral over or an average of data values from an interval
where f1”) exists and is bounded. Also, unless j E K, b, and (3,- are constructed

solely from such regions. Thus we may write

1,(1,210,211,...) =[1(1K)+12+I3+I4+15+15+I7(K0,K1,K2,...)

+113(K01K11K21°°') + [14(K‘01K17K21'“)
+ 1,016) + 17(K3,1<:,1K.g, . . .)

+ 113(K3,KC,IK§, . . .) + 11412312212; . . . ), (1.4.25)

where

11(22): 2032- — 2,)2, 120122) = 2032 — 12V,

J'EK jeKC
q— 1
12(K2,K2,K2,...) =22b ,— b.,))(21(|6,-I >6)
i=0 16K.
-1
17(Kg,1<§,1<§,...) :qzza') ,,— 6,,) )21( ((6,,|>6)
1': OJ'EKf

the rest of the terms are deﬁned similarly. However, for our compactly supported
wavelet (b and 1/2, both K and K,- have no more than (211 + 1)(#H) elements for I
each 1'. Considering q = 0(lnn), we can show 11(K), 17(K0,K1,IK2,...), and

22

113(K0, K1, K2, . . .) are of the lower order 0(n‘2’/(2'+1)). Thus it is negligible com—

/

pared to the main terms of MISE. Although b,, is only of the order p,—1 2 when fl is

not r-times smooth, based on theorem’s additional assumption pg’ﬂn‘zr —> 00, we
readily see that 114(K0,1K1,K2, . . . ) = 0(n‘2r/(2'+1)). By tracing the whole proof of
Theorem 1.3.1 carefully, we will see the rest of the terms of the right hand side of

(1.4.25) have precisely the asymptotic properties claimed for f(f1 — f1 )2 in Theorem

1.3.2.

1.5 Proofs of the propositions

Proof of the Proposition 1.4 .1 . In view of (1.4.12), applying the moment inequality

to Sn”,- and taking expectation yields

 

 

1 n
E (5,31,,2) S E(; Z 9922j(Zk)73(Zk)6kBin)
2. 2:1 (1.5.1)
1
= a: E(991?j(zk)73(zk)6kE(Bgn|(Zka6k)))'
k=l
In view of (1.4.7), the deﬁnition of BI," and note that
$2
33—3- _ln(1+:r) SI, for 2:20,
we have
IB |<—1-/Zk- 113(212 < 1 (152)
kn _ 2n _oo [1— H,,(z)]2 _ n(1— Hn(Z,,—))' ' ’

Thus, conditionally on {Z)c = z} and {6), = d}, noticing an(z—) = 23:, I(Z,- <

z) is a binomial random variable with parameters 71 — 1 and p := H (z—), we have

 

1 2
E 2 = z = < < —'
(Bknlzk 25* d) - E(n2(1-— Haw—D3) ‘ 723(1-19)2

23

Thus, from (1.5.1), notice (Z1,61), (Z2,62), - -- , (Zmdn) are i.i.d., we have

 

E(Si...)< E(222,(22)23(zl)62 211-62112.» ) = 0(513) [2 2,2112

The last equality follows, because {Z1 3 T}, T < r”, imply that 1 — H (Z1—) 2

1—H(T)>0.

Remark 1.5.1 If we consider the estimation of f, instead of trunction f1. Then

we have

 

2 _ F(dx)
E(522'2)"(n21)1;1(/11—21121141211

The above integral will be inﬁnity for some i and j, such that 1,63, 2 K > O on
a neighborhood of r”. Since q —> 00 and j 6 (—oo, 00), there always exist some
i and j satisﬁng above condition. Thus we could not show 23:13:, E(S,2,1,-,)=

0(n'2r/(2’+1)) without the truncation.
Proof of the Proposition 1.4.2. In view of (1.4.13),
1 " 1 "
IS'222,2-j| S E 122—; l¢2j(Zk)|5k€A"an + 5 §|W1j(zk)|6keAkC£n2

Again applying the moment inequality to the average and taking expectation yields

E(Sﬁ...)<_ 32422212216 22228...) +§2E(22?.(22)62e222cz.).

nit: l k=1
Because the proof of the two terms are similar and the second term is more involved

and require more details, we here only need to prove, for any k,

1 - -
E (W?j(Zk)6ke2AkC;:n) = O ('11—'2 )/§0?j (IF. (1.0.3)

24

Writing LHS of (1.5.3) as

E(w?.(z.)6.E(e2-“‘*Cz.l<zm60))

_<_ E [Q?j(Zk)5kEl/2 (€4Ak |(Zk, 6k))E1/2 (CSnKZkv (5,0)] .

Next, we want to show, conditionally on (Z,c = z} and (6;. = d}, uniformly in k, z

and d,

E(64AkIZk = Z,(Sk = d) = 0(1), (1.5.4)

E(Cfnlzk = z,5,. = d) = o (—). (1. .

n4

C)!
O!
V

Then (1.5.3) follows, thus Proposition 1.4.2 follows.
We ﬁrst prove (1.5.4). Since (21,61), (22,62), . -- ,(chin) are i.i.d., we here
only consider k = 1 for the convenience of exposition. Recalling A1 is between two

terms: 5:1 and 11:1, where

.. _ Z“ 1 ~ _ 2,- HOW)
III—”[00 lnll+n(1—Hn(z)) H2(dZ), $1—/_oo Til—(3'

 

Thus there exists 0 S A g 1, such that
Al: 331+ /\(i:1— 3171): 11+ MB“, + Cl"),

where Bl" is as in (1.4.7) and C1,, is deﬁned as in (1.4.8). Notice 0 3 3:1 =

— ln(1 — G(Zl—)) S —1n(1— G(T)), which is bounded, and
0 s IABlnl s IBlnI s [n(1— H..(Zl-)]‘l s 1,
by n(1— Hn(Z1—) Z 1. So we obtain

e4Al = exp{4x1 + 4/\B1n + 4/\Cln} S (1 - C(T))_464€4AC‘" = C ' 84ml",

25

where C is a positive constant, the concrete value of which may changes from line to
line in the sequel. Thus, in order to prove (1.5.4), we only need to prove, uniformly

in 21 and d1,
E (eHHCMIZl 1: 21,61: (11) = 0(1). (1.06)

Note that

1 = _ 1— Hn(z) + 2

1— 11,12) [1 — 11(2)]2 1— H(z)
[Hn(Z) - 11(2)]?

[1 - H(Z)l2l1 - 11742)],

 

+

 

we can rewrite GM as

__ Z"1-H Hn(z) 0 Z“ 2 ~0 ~
01"“ f... 11- Hz)1 )1—7HH”) /... 1—H1z)H"(d‘)

Z1- 1 ~ Z1- [Hum z —H(z)]2
‘/_ 1—H1z)H°(dz)+/_.. 11—H1z )1211—H1~)1H0(d)

oo

 

211+ 12 + 13 + 14, (say).

The ﬁrst three terms of C1,, are bounded w.p.1, because {Z 1 g T} and T < TH, all
denominators are bounded away from below. Thus we need only to deal with the
fourth term 14. Because of A S 1, to prove (1.5.6), it sufﬁces to prove, uniformly in

21 and d1,

E(8414|21= 21,61: d1) = 0(1). (1.5.7)

26

Let J(:1:) = [(n — 1)“l 9:21(ZJ~ S :13) - H(x)][1 — H(x)]‘1, writing 14 as a

sum, conditionally on {Z1 = 21}, {61 = d1}, we have

 

 

J2”)! (ZkZ < )Z]1) H2(Zk)1( Zk < 21)
I.,_ < 2: +n-—2 _; 2[1 _

 

 

 

n[1 -— H()Zk)]2n[1 -' Hn (Zk)l
112,. < 21) 211217") " 112,.H < z.)
< 2:3;1212 Z n[1 «1112,.»+ n211— H T 12).— Z n[1 — 1112.)]
Clnn.

 

g 2supJ2(z )lnn +
z<T

Since the second term goes to zero, we only need to bound the ﬁrst term. From

8J2(z)

the above inequality, we have e414 S sup Cn . Since J (z) is a martingale in z

z<T
(see Koul, 1992, p.42), 118”“) is a sub-martingale. Thus, from the property of the

sub-martingale, we have, uniformly in zl, 61,
E(e 414121: 21,61: d1) S CE ("812173) .

Noticing H. I Z - _<_ T is a binomial random variable with parameter n — 1 and
1—2 J

p := H (T) Let b(k; n, p) = (2)115?"c , through direct calculation and enlargerment,

we have
8(k- Jug-+2) C ’1 16(k-np)2
En 8J2” zinw 1) (1'2) b(k; n —1,p) S ; nn'IU-P)’ b(k;n,p) =:p'1J.
k: 0 k=0

Since 19 = H (T) > 0, we need to bound the J. The idea is to divide the sum
. . . 16(k— —n )2
Into two parts according to the magnitude of n "H“ 12>”. To make it clear, let

A = {k; [11: - 111)] _<_ nd,d 6 (1/2,1)}. We write J = 216A+ZkEM =: J1+ J2. It is

easy to see J1 S Timn'M—Hlﬂ").2 = 0(1). As to the J2, when k > 7111) + 71“, we have

27

(see Feller, 1957 p.163)

2 1
1211.12.12) 3 11m; 12.10) «re-2222+“, 161 < '2'.
h _ k-—(n+1)p+l . .
w ere £1 — ———1(n+l)pq , (n +1)p — 1 < m S (n. +1)p and b(m, 11,19) 18 the central

term, which is O((27rnpq)"%) (see Feller, 1957 p.140). Thus, when k — up > n", we

have 61. 2 (p11)’1/2[nd"1/2+ +(1/2- P)” 1”l and

2
16(k- up) _1 2d— 1

2 71250-2) b(k;n,p)=0(nn16b(m;n,p)e 2" " )=O(1).

k>np+nd

By applying the same argument to the k < np — 71", we have .12 = 0(1). Thus we

prove (1.5.7) and hence (1.5.6).

5
As to the (1.5.5), in view of (1.4.8), we write C1,, , conditionally on {Z1 = 21}

and {61: d1}, as

 

 

IZ<z,6=0) 1"11Z<z,5=0
Cln:;1l_2(lc 1k )(__T_l:: k 1]: )
k=1

 

 

 

k=1 1— Hn( Z)
jg; Z1<21,151):O)(—/-:o“Il‘-I__°(Hc%.2__:j
£1 2“ < 31121;") 1.1—‘14)]

= : D1,. + D2".

Applying the moment inequality to D1,, and taking expectation yields, conditionally

0n{Z1= Z1} and {61 = d1},

 

 

 

1 1(2): < H21 )1an (Zk) HH(Zk)l8 _ _
ED<1 n—IZI - 21,51“ — 611) < E(Enl k: [1 _ n(Zk)l8l1-H(Zk)l8 Zl — 21,61_ d1)
(22 < Z1 )[HH (Zg)— H(Zg)]8 _ 7 = -

SCH“ 11-H122))2 2‘ ‘ “"6‘ d‘)‘

 

28

Conditionally on {Z1 = zl}, {61 = d1}, {22 = 22} and {62 2 d2} , an(22) 2:
1 +62, Q is a binomial random variable with parameter n — 2 and p 2: H (22). Thus,

through directly calculation, we have

 

[(22 <21)[Hn (22)- H(z2)]8
E( Mil—H (22 >18

<n _ n < _

Thus, uniformly in 21, d1,

21 = 21,61: d11Z2 = 32.52 = 612)

 

E(DIBnIZ1—Z 21,61: (11) = 0(-1—) . (1.5.9)

114
As to the D2", which is the sum of i.i.d. centered bounded random variables. By

Rosenthal’s inequality, it is easy to show

E(DSHIZ1= 21,61: (11) = O<-1—') . (1.510)

714
Combining (1.5.8), (1.5.9) and (1.5.10), we proved (1.5.5). TOgether with (1.5.4),

we proved (1.5.3). Thus Proposition 1.4.2 follows.

Proof of the Proposition 1.4.3. In view of (1.4.10),

 

-.— w 1(z.<..)( 1-6k)lH <>-z. war -1 .1
Ram] -/9Pij( )70(w w)”; [l—H ()Zk )Hl2l1“ "(an Hum ).

So

 

k=1
n (Z1: <Hn(w)[H(Zk)HH)(Zk)l4H1
[1 - H()Zk)l4[1 — Hn(Zk)l2 n

2 2 1 " I(Z,.<H112)[H()—ZkH(Zk)]2 2 ~,
Rn1,ij—</‘pij(w )70(IU) (; [1 _ H()Zk )]2[1 _ 11"an ) Hn(dw)

 

s /w?.(w)1§(w)%k (dw).

_ .1— n _ 1(Zk < 21))[Hn()Zk -I{(Zk)]4
— ngwtwmzoonk; [Hg (12.1111411-42112 -

=1

 

29

Thus

" 1(zk < zl)[H,.(Zk) — H (Zk)]4
[1 — H(Zk)]4[1 — 191,420]2

 

E(Rnl ,ij) < E(‘P?j(zl)7g(zl)61E(%

 

21,51».

(1.5.11)

k=1
By applying an argument similar to that in Proposition 1.4.2, we can Show, condi-

Zl =21,51=d1) =0(—1§),
n

uniformly in 21 and d1. Thus by (1.5.11), the proposition follows.

tionally on {Z1 = 21} and {51 = d1},

 

1 n I(Zk<21)[Hn( ()—Zk (Zk)]4
E(HZ [1—H()1'112k]4[1—()Zk]2

 

k=1

Proof of the Proposition 1.4.4. In view of (1.4.15), we write Rug”- as

Rn2,ij = 3712,00) + Rn2,ij(2) + Rn2,ij(3)'

Because of the similarity of the proof, we here only provide the details for the ﬁrst

term. We write

Ryan“) = / momma),

where

 

u)=I(‘/:/ ’U <1E1A_wH)¢ij)(]2‘7)0(10)[H0(dL) —H°(dv)][f1,ll(dw)— 1:11(dw)].

Hence
E(Rimu )) s E(K2(zl>) = E(E(K2(zl)lzl)). (1.5.12)
Conditionally on {21 = 21} and {61 = d1}, we rewrite K (21) as sum, we have

= $2 fwzk, w) - u(w)][f1,t(dw) — Wm»,
k=2

30

where

1(Zk < 21A w)901j(w)70(w)(1- 5k)

“2’“ w) = 11— H(Zu]? ’

 

u0=EMMLwﬂ k=za~.m. dam)

Again continuing to write K (21) as sum, we have

Kt.)=.,:,;{,

1 Z [b.(Zk Z051—U(Z1))61]H[h(zk,W)— H(U/ ')]H1(dw)}
k- 2

(:1
=— nz{:l 2 [h (21;, Z1)61— H(Zl)6l] " %u(Zk)6k

(#1:

_/[h( Z,c,111)))-u('w)]1f11 (61111)}
= — 7112214 Zk)61c +- 11:: {711 Z [h(Zk, 2051 - “(2051]

nk: 2 (#1:

—/[h(Zk,w) — u()w)]f11(dw }

1 n
— ;2' éuﬂﬂﬁ-n (n _1) _ZZ[’I( (Z1: Z1) 51 — "U (2051]

)kz 2l¢k

1 n n ”l
+ m ;;{[:(Zk, Z1)(51— u(Z1)61]—/[h(Zk,w)" “(U/”H (dw)}

=11+ 12 + 13, (303/)-

As to the 11, in view of (1.5.13), we have

1

E112 = o (E) E(u2(Zg)62) = o (”i ) [993, dF (1.5.14)

As to the 12, in View of (1.5.13), we have

|12| = 0 (5) iiwzzwznazw 0(-1—) 2:11:11 ()21 110(2116.

k=2 (#1:

Thus

E1; = 0 (”i ) [go-U. dF. (1.5.15)

31

AS t0 the 13, 18t H(Zk,Z1) = [h(Zk, Z1)61 — H(Z1)61] — f[h(Zk,'lU) — u(w)]f11(dw),

thus 13 = 0(n‘2) 22:2 2;“: H(Zk, Z1). Noticing

EH(Z,., 2,) = E(E(H(Zk, 20121)) = 0, k 5e 1. (1.5.16)

EH(Zk, Z1) 2 O, EH(Zk, 21) = 0, k #1.

Hence

E132 =0(-1:—)kEH2(Zk, Z1)

=21¢k
Z Z Z EH(Z,,, lelH(Zk, Z12)

““1 (#512 115512

)
) Z Z Z EH(Z,,, Zk)H(Z,,, Zk)
)
)

+0

13th

+0

:1)...

hill ““2 115512

2 Z Z EH(Zk le)H(lev Zk)

k¢11 k¢12 115512

+0

Sél 1...

+0

2: Z Z EH(Zk11 Z1)H(Z1, Zkg)

k1¢l k2¢l k1¢k2

AAAA

SAI ._.

=I3(1) + 13(2) + 13(3) + [3(4) + 13(5): (say).

The ﬁrst term

E(E(H2(ZkaZl)|Zk))

ﬁél ._.
a.

3H

[V]:

21!:

1L

= o (5) E1h<z,.,z,)(1 — 24206112
k=2t¢k
:: 0 (i4) ZE(h2(ZkaZl)6l)
n k=2z¢k
= 0 ($) ZE¢?j(Zz)’Y§(Zz)5I
k=2l¢k
1
= 0 (55) [99?de'

32

AS t0 the 13, let H(Zk, Z1) = [h(Zk, Z1)61 — H(Z1)(§1] — flh(Zk,'lD) — u(w)]H1(dw),

thus 13 = 001-2323;:=2 21;: H(Zk, Z1). Noticing

EH(Zk,Z1) = E(E(H(Zk, 20121)) = 0, k 5&1. (1.5.16)

EH(Zk, Z1) = 0, EH(Zk,Z1) = O, k 55 l.

Hence

131;: OF) ZEH2(Z)k,Z1
n4 =2l¢k

+0 :2 Z: EH(ZkaZli)H(Zkazlz)

k¢li k¢1211¢l2

)
)2 Z 2: EH( (1121,21) (2,2,21)
)

139'...

+
O
ﬁhl 1—4

(Well k¢12 1158512

ZZZEH( Zk Zli)H (levZk)

k¢li k¢1211¢12

4:): Z EH(ZkUZl)H(ZlaZk2)

=13(1)+ 13(2) + [3(3) + 13(4) + 13(5), (say).

il ._.

(
(
+0(
+06

The ﬁrst term

a
A
1—1
v
II
Q

E(E(H2(ZkaZl)|Zk))

a“) ._.
R.
3 II

M5

21!:

1L

15114252061 — 24206112

II
0
31>)"
a-

21k

‘lL

E(h2(Zk, Z1)61)

:5.) 1—1

v v V v V
a.
3 ll
M=

\
‘6
are

a.

I:

no
11'

all

M:

||
C

II
Q
AAA/“\A

E¢?j(Zt)7§(Zt)5t

3.4 ._.
3.

LI)
1'?

I:

ll
0

.F.

352' .._.

As to the 13(2), for k 95 l], k ¢ 12 and II ¢ 12, conditionally on {Z1c = zk} and

{61; = 611;}, we have
E(H(Zk, Z11)H(Zk, Z12)le, dk) = EH(Zk, Z11) EH(Zk, Z12) = O,

which is from (1.5.16). Thus 13(2) = 0. By applying the same argument, we have

13(3) = [3(4) = [3(5) = 0. Thus

1
E13 = 0 (E) [53,1111 (1.5.17)

Together with (1.5.14), (1.5.15) and (1.5.17), we deduce EK2(21) = 0(n“2) fgcfj (1F.
1.
From (1.5.12), we finally obtain ERﬁg‘U-(l) = 0(n‘2) f (pf,- (11“.

Proof of the Proposition 1.4 .5. The proof is basically the same as the previous

proposition.

33

Chapter 2

Nonlinear Wavelet-based Hazard

Rate Estimator

2. 1 Introduction

In this chapter, we consider the same setting of survival analysis with random
censorship as that in Chapter 1, with the extra assumption that random variables
X and Y are nonnegative. Our goal is to estimate the hazard rate function A(x)

with censored data,

 

_, P(a:$X<x+c|X_>_':r)_ f(1:)
A(x)—€l-1)rgi+ 6 _1—F(x—)’

:1: E (0, 00).
There is an extensive literature avaliable on estimating A(.r) from censored
data, see e.g., the survey paper Singpurwalla and Wong (1983) and the review paper

Padgett and McNichols (1984). Tanner and Wong (1983) and Lo, at al. (1983) -

studied a kernel estimation of density and hazard rate under random censorship

34

and provided Mean Square Error (MSE) and asymptotic normality of hazard rate
estimators.

The objective of this chapter, like that in the previous, is to provide a non-
linear wavelet-based hazard rate estimator for randomly censored data, its asymp-
totic formula for MISE and its asymptotic normality. We show this MISE formula,
when the underlying survival density function and censoring distribution function
are only piecewise smooth, has the analogous expansion for the kernel estimators.
However, as to the kernel estimators, this MISE formula holds only under the
smoothness assumption.

In the next section, we give the elements of wavelet transform and provide
nonlinear wavelet-based hazard rate estimators. The main results are described in

Section 3, while their proofs appear in Section 4.

2.2 Notations and Estimators

As that in Chapter 1, let T < TH be ﬁxed and A1(:r) = A(1:)I(:r g T). Since, in
general, hazard rate function A(:r) is not square integrable, we estimate A1(:r), i.e.

hazard rate function /\(:r) for :r E (0, T]. Like in Section 1.2. the wavelet expansion

of A1(:r) is

M2) = Z tat-m +2 2 bum-xx).

j=-oo i=01=—oo (2.2.1)

bj =/)\1¢j, sz =/)\1?1’zj-

35

We propose a nonlinear wavelet estimator of A1(x) :

oo q-l oo
:5) = Z b,¢,(a:) +2 2 b,,-I(|5,-,| > awn-(1:), (2.2.2)
j=-oo i=0 jz-oo

where now the wavelet coefﬁcients f) and hi]- are defined as follows:

 

an (__:_r__)
bj ___/$100“ —Fn (:r-)
_ _ "“121: < T)¢j(Zk)
Zl1‘—)Gll1-n(Zk—)l
an-T( )
b..— -/w.~.( T)——— Fm _)

 

__" M< ia<Twnn)
’ ”Zu- —)1[1—G (z.— )1

where Fn and C" are the Kaplan-Meier estimators of distribution function F and

G, respectively. For more details, see section 1.2.

2.3 Main results

Theorem 2.3.1 In addition to the conditions on (b and 1,12 stated in Section 1.2,
assume that the r-th derivative A?) is bounded and continuous on (0,T], the cen-

soring distribution G is continuous. Also assume that condition (A) holds with

Co E 2{r(2r + 1)‘1 sup A1(1 — H)‘1}1/2. Then

EI/(X. — A1)? — {n-‘pf 1 f}, +p-2'~2<1 — 2-2’)” / AS’”}|

=0m p+p43 (zan

 

Remark 2.3.1 This result is stronger than traditional asymptotic formula for

MISE. It implies a wavelet version of the MISE formula:

A - ’\ -—r —r-— 7‘2
E/(Al-A1)2~nlp/I_IH 2n2(1—22)1/Ag).

36

     

Remark 2.3.2 The truncation parameter q and threshold parameter 6 are chosen
to ensure that ;\1 is very close to A1 in case where A1 is smooth, yet at the same
time provide sufﬁcient local adaptability to produce automatic incorporation of
appropriate wavelet terms liq-21),]- in place where A1 is not smooth. For the details

about how to choose these smoothing parameters, see Hall and Patil (1995).

In the Theorem 2.3.1, we have assumed that hazard rate function A1 is r-times
continuous differentiable and the censoring distribution function G is continous for
simplicity and convenience of exposition. However, if A1 and G are only piecewise

smooth, the following theorem gives a similar result.

Theorem 2.3.2 In addition to the conditions on Q5 and it“ stated in Section 1.2,
assume that A?) and G are only piecewise smooth, i.e. there exist points :50 = 0 <
361 < r2 < - -- < am < T = 1N+1 such that the ﬁrst 7‘ derivatives of A1 exist and
are bounded and continuous on (r,,r,+1) for 0 g i g N, with left- and right-hand
limits; In particular, A1 and G themselves may be only piecewise continuous. Also
assume that condition (A) holds with Co as in Theorem 2.3.1 and pgr+1n‘2r —> 00.

Then (2.3.1) continues to hold.

While wavelet estimators allow us to obtain MISE and optimal convergence
rates analogous to kernel estimators under weaker assumption, there is a fundamen-.
tal instability in the asymptotic variance of wavelet estimator caused by the lack of
translation invariance of the wavelet transform. For more details, see Antoniadis, -

et al. (1994). Because wavelet estimators are only dyadic translation invariant, we

37

provide an asymptotic expansion of the variance and asymptotic normality result

at dyadic point r = U2", 1: and l are integers.

Theorem 2.3.3 In addition to the conditions on (b and 1,9 stated in Section 1.2,

assume A1(r) is r-times continuously differentiable at :c 2 U2", Also assume that
p = 2” = 0(n1/(2’+1)), q —> 00, pq62 -—> 0, 6 Z CVn‘l 1n n,
where C > C1§{(8r + 2)(2r + 1)‘1 sup A1(1 - H)“1}1/2. Then

\/np‘1():1(:z:) — /\1(:1:) + b(x)) =d> N((l,0'2(:r)),

where

Mr) = (Tn—um.) / u’ 2 ¢<u + 02(1) du -p-',

l

2 A1( ) . 2

1
Remark 2.3.3 This result is analogous to Theorem 4.2 in Lo, et a1. (1989), a
result of asymptotic normality of kernel estimator of hazard rate function with

censored data.

2.4 Proofs

The proofs of the Theorems 2.3.1 and 2.3.2 are very similar to those of Theorems
1.3.1 and 1.3.2. The difference is that we have one more term 1 — Fn in the de-

nominators of b,» and bi], which create an extra technical difﬁculty. We begin with

38

some lemmas. To state these, we need some additional notation. Let

 

 

 

 

' = _Z 6kcpj( (Zk) ‘ :ikz: 5k99j( ((ZZk))]
1[1—F(Zk)] ]G[1‘- n(Zk“ [l—H
n (2.4.1)
.j = _: 6k¢,j( (Zk)( .. 2121 51:99:)“: ((ZZk))]
l[1 - F(Zk)][1 — [1 _
We also define
_7 1 n , _7 1 n f
141,—.- —:w,(z,.), 121,-, = 5 Ply-(Zr).
n k=1 k=1
Wylzkl = 111(2):) — ”(21%), Iii-,(Zk) = Uole) — ”AZ/c)
and
_ 1 - 6k T” 992W)
UJ(Zk) — 1_ H(Zk) Zk 1_ F(w) F(dLLl),
1 — 5k T" W‘W)
Uij(Zk) = -———-/ —-’——- F(dw),

 

(pJ-(w Iv(<Zk/\w) v w
Z")=// [1— H(v)]2 C(le‘d )’

f“ _ (pm-(w I(‘U < Zk /\ (d)
l,J(Zk) — // [1_ H(v)]2 G(dv)F(dw).

Remark 2.4.1 We use the same notation of the coefficents bj, bij, their estimators

 

13,-, b,,- and Uj(Zk), Vj(Zk), Wj(Zk), etc. as those in Chapter 1 for easy comparision.

However here they all include one more term 1 — F", or 1 —- F in their denominators.

Lemma 2.4.1 Under the assumptions of Theorem 2.3.], we have

— ~

__, 1
b, =b,+W,+R,,,j, E(R§,j) =0(7-13)/90§d1’,

o<s>w

Proof. The proof follows along the same lines as those in Lemma 1.4.1, use (p,- / (1 —

(24.3)

EU = b,,- + W5,“ + Rm, E(Riy)

F) and 90,-]- / (1 — F) instead of go,- and apij. Because the denominators of b, and bi,-
are bounded away from zero, all needed conditions are satisﬁed.

39

Let

= iZth). W“) = hill/WM)
k=l k=l

6kI(Zk S t)
1- G(Zk) ’

(2.4.4)

Q(stt)= ”Mail: U(st 15—) V(stt)

and

U(Zk,t) H(SEQ/I(Zk<w<t)F(dw)’

H((wZ<t)(()Iv<Zk(/\w) .
Vt)(Zk, ://I( [1 _ v)]G[1— (11)] G(di)F(dtu).

Lemma 2.4.2 Under the assumptions (of Theorem 2.3.1, for any t _<_ T, we have

(2.4.5)

 

15w) = Q(t) + W(t) + R..(t), sup 5123a) = o (n'4) . (2.4.6)

th
Proof. From a result of Stute (1995, p.434) and details of the proof of the main
theorem of Stute (1995), we have (2.4.6), an approximation of the Kaplan-Meier
estimator when Mr) = I (:1: g t). The rest of the proof is completely analogous to
that of Lemma 1.4.1. Here we only consider the specific integrand 99(13), instead of
(pg-(1:) there. Also we consider the fourth moment of Rﬂ(t), instead of the second

moment there. However, from an argument similar to the one used in the proof of
Lemma 1.4. 1, we can obtain ER4(t) n ‘4) )f¢4(r )dF. Thus supterRf, (t )=

0(n‘4).

Remark 2.4.2 An argument similar to the one used in the proof of the above
lemma, we can show that, for any integer k 2 1, supthRfﬂt) = 0(n '2’“). Thus,
from Holder’s inequality, supthlRﬂt)! = 0(n‘a), for any a 2 1. This fact is
analogous to Lemma 2.1 in Lo, et al.(1989), where they show supthlR:(t)| =
0([ln n/n]°‘). Because of the technical reasons, they deﬁned estimator Fn slightly

4O

different from here. In addition, by Theorem 1.1 of Stute (1995), where F and
G are only piecewise continuous, Lemma 2.4.2 still holds, which will be used in

proving Theorem 2.3.2.

Lemma 2.4.3 Under the assumptions of Theorem 2.3.], we have

E{Z(5j - W} = owl-11)).

.7

Proof. In view of (2.2.3) and (2.4.1),

 

~._-.=_1_ " cases-(2k) E((ZZka-nzs.)
b] by Z -F(Zk)[1-Fn “)Gl[1-n(-Zk-)]

From the deﬁnition of 13}. and Gm we have

[1_ k‘“)][1-’n(‘—Zk )]_ __ n +1—nRanka : n -' Zl¢k;(Z1§ Zk)

 

 

Apply Lemma 2.4.2, and use the continuity of F to obtain

— sz—ZAJ- ()sz F(Zk) +" 312/1“ ()21: E(Zk)l/V(Zk)

nk: l

+_nZA1'()BZk 12,,(2,.)

= 113' + 123' + I3j7 (say),

where

6 - Z n
(96.1: —

 

— F(Zk)

F(Zk) : = Q(Zk) - F(Zk),

62(2),), W(Zk) and Rn(Zk) are deﬁned as in (2.4.4) and (2.4.5). Conditionally on

{Zl = 21}, {(51 2 d1}, through direct calculations, we have
EP4(21) = 0(n“2), uniformly in (11 and 21 S T. (2.4.7)

41

Conditionally on {Z1 = 21}, {61 = d1}, B(zl) = n/(n—V), where V = 2?:2I(Z, S
21) is a binomial random variable with parameter n — 1 and p := H (21). Thus,

through direct calculations, we have

EB4(21) = 0(1), uniformly in (11 and 21 S T. (2.4.8)
Now,
1 n
[123' = — A§(Zk)Bz(Zk)P2(Zk)
n2
k=1
1
— Z A (2,)3 )P(Zk) 4 (ZI)B(ZI)P(ZI)
+2” k=11=1,¢k

=15<1>+ Ito), (say).
The ﬁrst term
21,2,(1) = %E[A§(Z1)B2(Z1)P2(Zl)]
= 0(1)E[A§(21)E1/2(B4(21)

_0(ni)EA§(Z1)=O(£3)/cp§dF.

Hence, from (1.4.19), we obtain

Z1, 51)] (2.4.9)

 

Z1,61) E1/2(P4(Zl)

 

E21121“): o(n_1p). (2.4.10)

The second term

E[|A1(Zk)| Isa-(201 E(IB(Zk)l lB(Zz)| IP(Zk)| |P(Zz)l |z., 2., 626)].

42

Conditionally on (Z,c = 2k}, {6,c = dk}, {Z = 2,} and {6, = d,}, by the Cauchy-
Schwarz inequality, we have
4 4 4 4 ”4
1908(4)) (3(2))! (1442.)) |P(z:)l) 3 [EB (2)123 (zaEP (2.)EP (2)] .

Through direct calculations as that in (2.4.7) and (2.4.8), we have

513(2 =oni( )2 Z EI4( (Zk) )Il4(Zz)|

k==111,¢k

= o G) (/ lgojldFY. (2.4.11)

Hence
2213(2) = 061-) EU (0,.er = 0%) = 0(n-1p) (24.12)

V Z].(f|<,9j|oiF)2 < 00 and p —> so. This, together with (2.4.10), we have

E21112]- - o( n 1.p) Apply the previous same lines in 11, to 12,, we can show
E2]. 1221- = o(n’1p) too.

To complete the proof the lemma, it remains to show that E Z]. 13]- = o(n‘1p).
Apply the moment inequality to 13], we have E13,. 3 E[A§(ZI)BZ(ZI)R3,(Z1)].

Conditionally on {Z 1 = 21}, {(51 = (11}, by the Cauchy-Schwarz inequality, we have
E[B2( (21) )R3,( (21) )]< [EB4(~ )ER4( 21)]1/2 = 0(1/n2)

P E: —1 ,2 F 2.4.13

43

Lemma 2.4.4 Under the assumptions of Theorem 2.3.1, we have

- __ ,\ __
2(1).- — 4))2 — n 12/171) = o(n 1p).

Proof. The proof is similar to that in Lemma 1.4.2, but here we need one more step

 

to approximate (3,. In view of (2.4.3) and

Z(13,—b,)2=§j:(b,—b,)2+Z((3, b+)22Z(5, ”xi—(3,),

i

we have

SISE

:03.- — 4.)? —— n-lp /

j

 

/\1 —,"2 2
‘1—'_'—H +EZHJ- +EZRn’j
J J
+ 25215. — Mm...) + 2492113.- — bJIIWJ-I + £214.21?)
1' J"

J'
+EZ((3,- )2+2EZ|b,- —b,||(3,- —b|

J

 

=11+12+13+I4+15+15+I7+13, (say).

Notice that

2- 2 /\1($) 2
-b) _/¢j(‘r)md$—b]9

 

we have
E(b 2( _11A1((y+.7)/P _ ()2,
Z ’52)] ‘2 . -H((y+j)/)p?1i:
since f¢2 =1,

Zt)-Why +j)/p)/(1— H((y +jl/P7) —> /,\,/(1- H

= 0(f Ag), it follows that E2103]- — (21-)2 = n'lpfAl/(l — H) + o(n‘1p).
Because the denominator appearing in hj is bounded away from below. Thus these

44

(3,- may be handled along the the same lines as those of Hall and Patil (1995, p.922)
to show that Var{2j(13)—b2} - o(n n‘2 p2). So we obtain 11— - o(n p.) By Lemma
2.4.1,
12 = n—1 Z E1132(21) g 2n-1Z(EU,2(21) + E132(21)).
j 1
By direct calculation, notice all denominators are bounded away from below, we

have

 

2 _ z? ___ ,—1 2n 2 “+7
EUj(Zl)—El,(Zl) 0(1) /¢()Al( p )du)

Thus, 12 = 0(n’1)f¢2(U) ij'12¥((u +j)/P)du = 0(n’1P) by 2,1942%“?! +
77/19) —>f/\§ {ooandp—+00.

By Lemma 2.4.1 or (2.4.3), we have 13 = o(n‘lp). From Lemma 2.4.3, 17 =
o(n'lp). Applying the Cauchy—Schwarz inequality to the rest terms, we complete

the proof.

Lemma 2.4.5 Under the assumptions of Theorem 2.3.1, we have

stZE(((3.-s — .. 21(Ib..l>6)} (no-222+”).

1:0 j

Proof. Let a and )8 denote positive numbers satisfying a + B = 1, we have

.,<2zzg{a,_b 2IIb..)>6}+2ZZE{ ,.,._ 1),.) 21(lbs)|>6)}

i=0 j i=0 j
q—l
s2ZZE{(bs.-— b..- ))+2ZZE((I> 2'1" .. 21(Ib..)>a6)}
i=0 qj i=0 J
+2ZZE{(b .-.— .. 21).. 542|>B<5)}
i=0 j

= 2(821 + $22 + 823), (say).

Apply the argument analogous to (2.4.9), (2.4.11) and (2.4.13) appearing in the

proof of Lemma 2.4.3 to 321, we conclude that

0c >2;/w< >::;;;</w>

(_)::/J.JJJ+o(g)ii . (/.JJJJ—)2

i=0
Jew?)

= o(n—2r/(2r+l)).

The third equality follows from 2]. p, 1 f 99?] dF < 00, 2“ (f ISOJ'J‘IdF)2 < 00 and
q = 0(ln n), ‘while the last equality follows from 17.“po —> 0. Apply the same
argument as that in the proof of Lemma 1.4.3 to 322, use 517' instead of 13,-]- in there,

we conclude that 322 = 0(Tl-2r/(2r+l)). NOW let A = {IBij — bijl > 6}, then

323::IZE{(b iJ'" bij) 21Gb iJ" bij|>66)1 (A )}

i=0 J
+ZZE{(5 J-J— bJJ>(2IIbJJ-— bJJI>ﬁ6)IA( >}
i=0 J
SZZEW JJ— JJ 21 bJJ— JJI>6}+ZZ<52P IbJJ—bJJI>z36)
i=0 J i=0 .7
SZZE{(bij-1321(IIJbi)">6)}+ZZB—2E(bij-bij)2
i=0 J 1-0 J

= 323(1) + 323(2): (SUI/L

where 323(1) is analogous to 322 in section 1.4, which is o(n'2r/(2r+1)). While 323(2)
is 0(321) , which is o(n‘2'/(2’+1)) too. Together with .921 and 322, we prove the 2

lemma.

46

Lemma 2.4.6 Under the assumptions of Theorem 2.3.], we have

q-l
JJ, 5 E 22bfj1(|l3,j| g 6) _JJ-wu —2’2')‘1/J\‘{)’ = o(JJ—zr).
i=0 j

 

Proof. The proof follows the same lines as that of Lemma 1.4.4.

Lemma 2.4.7 Under the assumptions of Theorem 2.3.], we have

34 E i Z biz]- : o(p“2').

i=0 1'

Proof. The proof follows from the step 3 of Theorem 2.1 of Hall and Patil (1995).
We are now in the position to give the proof of the Theorem 2.3.1 and 2.3.2.

Proof of the Theorem 2.3.1. Observe that

Elf“ ‘ 2‘)? ‘ {212/ J f‘JJ +p'2'n2a — 22V / AW}

_<_sl+32+53+34.

 

 

Thus Lemma 2.4.4, 2.4.5, 2.4.6 and 2.4.7 together prove the Theorem 2.3.1.
Proof of the Theorem 2.3.2. The basic idea of the proof is similar to that of Theorem
1.3.2. We omit the details.

In the sequel we prove the Theorem 2.3.3. This will involve the following two

lemmas.

Lemma 2.4.8 Under the assumptions of Theorem 2.3.3, we have

J/np-1(Z<6J — bJ)¢J(z)) =2» M0, 02m).

where

02(JJ) = gala / [Z ¢(u + l)¢(l)]2du.

l

47

Proof. In view of (2.4.1),

”P-1(ZUSJ " bJ')¢J‘(17))

.i

l

 

3

:2 VnJc 3

:1

a-

where K( t, 2:) =2 (b( t — J) (J: -— j). For the wavelets in Section 1.2, the kernel
K (t,x) satisfies the moment condition (See Theorem 8.3 of Hardle, et a1. 1998,
p. 95), i..e f( t(-J: k)K(t, 1) dt— -— 60),, for k— — O, 1,- . ,r— 1. Notice (Zhdk) are i.i.d.

for k = 1,2,--- ,n,EV,,Jc = O, and

EVfJc = g/£%K2(pt,px) dt — %(/ A1(t)K(pt,p:r) dt)

1 A1(3+u/p)
—/1—H(:I:+u/p)

 

[EM u+p1r - J)<f>(p:r - J')]2du

- 515(//\1(I + U/p)2¢(u +prc -J')a>(p:c -J')dU)2

j
= i f 1-% [Z ¢(u + l)<;5(l)]2 du + 0(n‘1p‘l).
The second equality follows by the change of variable, while the third equality by
p = 2”, :L‘ = l/2", N —-> 00 and the Taylor expansion. Thus 22:1 E12,: = 02(33) +
0(p’1) —-> a 2:1:( ). In addition, K(t, 2:) being uniformly bounded, we have IV" kl <
cfnTl—p —> O, cis a positive constant. So for all 6 > 0,1im,,_,,,JD Z:_1E(|Vn klz; an kl >
e) = 0. Thus by Lindeberg-Feller CL’s Theorem, the lemma follows.

Let

=2? 2: but/ii” I(lbijl > 5)

i=0 j

48

Lemma 2.4.9 Under the assumptions of Theorem 2.3.3, EJJ;2 = o(n—1p).

Proof. In view of (2.4.3), write (30- as following

biJ' = bij + (sz— ()1) + VT 1] + Rn 33‘. (2.4.14)
Then
q-l
J6: :ZbiJViJ(I) I(lbijl > 6)+ ZZU’IJ - bij)¢ij($)1(lbijl > 5)
i=0 3' i=0 J
q—l
+ZZWJ,JJJ,-(x1) (l5,,| > 5)+Z:Rﬂ,,o,,1(|iJJ,-| > 5) (2-4-10)
i=0 j '

= 11+ 12 + 13 + 14, (say).
Because of the compact support of w(:r), for each i, there are only ﬁnite number of

j such that wJJ-(x) are nonzero. So

E122— 0(Q):ZE(b iJ_ biJ) 2¢i2j($ 1')

i—O j

221/ (Well

2:0 J

= 0(4) [-131 + 2]

n2 n
— o(n‘l p).
In the above, the second equality follows from the argument similar to (2.4.9),
(2.4.11) and (2.4.13) in Lemma 2.4.3, while the last equality is from pqn‘l —-> O and
q = 0(ln n). Similarly, we can show E12 2 E142 2 o(n-1p).

As to the first term of J6,

2:1:(b21b2J)w2J($ I(Ib21l> 6) )+ EZbijVij(x) I(lbijl > 5) - [11+ 112

i=0J 1:0]

49

Let a and B are positive numbers such that a + B = 1, so

”11' <21: '13:)" bJ‘jlle-J-(rr )|I(|bJ,~| > (16)

i=0 j
+ZZIbz-J— bJJIIu'J-J< >11<1bJJ—b.J-|>JJJ).
i=0 j
Because A1(Jv ) is r- times continuously differentiable at 11:, so |b,-]-| < (jg—(”1(2) or

b3 5 c2p:(2'+1), c is a constant (see Hall and Patil, 1995, p.917). Notice 62 =
0(ln n/n), p,- = p22, p = 0(n1/(2’+1)), thus I(IbJJ-l > 6) = 0 for large n, hence the
first term in the bound of 11 1 actually is zero for all sufﬁcient large n. In view of

(2.4.14), the leading term to approximate hij is 5,5. Apply a similar argument as in

(2.4.15) to 111. all the rest of the terms are of smaller order, we have

EIJ2J=0(q) )ZZEw J-J- bJ-J) 2wJ-2J(x)I(IbJJ-- bJJI>z361

i=0J'

(Q)§ZEl/a(b ij_ b0) )2a 1,1210; )Pl/b(|b,j — szl > 56)

1:0j

i _ __ 4 +1
=0(q)ZZ£p,-n d:0(q)p3n d 1, where d> 2:+1

 

—2r/(2r+1)) : 0(TIJ—1p).

= o(n
The second equality follows by Holder’s inequality, while the third equality by
Rosenthal’s and Bernstein’s inequality and let a —> 00, b —> 1 (see the details in
Hall and Patil, 1995, p.917—918). The ﬁfth equality follows by n‘lpq -—> 0. Apply

—(2r+1)

the same argument to 112, using (if, g c2 p,- ,VVe can show that E1122 2 o(n p)

too, which proves the lemma.

50

Proof of Theorem 2.3.3 In view of (2.2.2), by analogous equality of (2.4.14) to h,-

and the definition of J6 in Lemma 2.4.9, we have

XJ(z)—AJ(x)—b( 1:20» —bJ-) )csz-(x )+[ZbJJsJ-()- (x)— b<x )] +JJ
_ —:(b- — 12,-) (:12)+Z(13,- —5,) )o,-(::: )(+ 22272242)
+ZR,,JJJ,-() )+[Zb,-¢,(x—) ()— b(JJ )] +J6

= J1+J2+J3+J4 +J5+J6, (say).

By Lemma 2.4.8, WA -—-d-> N(O, 02(x)). By Lemma 2.4.9, we have @7116 —>
0. The terms J2, J3 and J4 are analogous to 12, 13 and 14 in Lemma 2.4.9, so ap-
plying the same argument, we can show that E122 2 EJ2= EJ2— — o(n p.) Thus
W12 l—J 0, same as J3 and J4. Hence, in order to prove the theorem, it suffices
to show that J5 = o(p”). Apply the same argument as in Lemma 2.4.8, using the

moment condition of K(t,:c), it is easy to see
J5: /[A1(t)- (:r)'(lplR thp$)dt- ME)
= [1w + u/p) — AJ(x>JK(px + JJJpx) du — bu)

/\(k)($
=/Zlk!( K(,ppJ:+up:r)du+o( )—b(:r)

 

 

A2415)
(fif— 2p; o(u + l)¢(l) dup" — b(x) + o(p")

1
: o(p—r),
the last equality follows from the moment condition of K (t, 2:), which proves the

theorem.

Chapter 3

Minimum Distance Estimators in
RegreSsion Models under Long

Memory

3. 1 Introduction

The practice of obtaining estimators of parameters by minimizing a certain distance
between some functions of observations and parameters has long been present in
statistics. These estimators have many desirable properties, including consistency,
asymptotic normality under weak assumptions and robustness against outlier in
the errors. Koul and DeWet (1983) and Koul (1985a, b; 1986) pointed out the
importance of this methodology in linear regression models, using certain weighted

empirical processes that arise naturally in these models. For more details and

references on this methodology, see the monograph by Koul (1992b).
Koul and Mukherjee (1993) extended the above results to linear regression
models with long range dependent errors that are either Gaussian or subordinate

to Gasussian. More speciﬁcally, they considered the multiple linear regression model
)fni :12;;i/3+€i1 5i :G(T]i)1 Z: 1721... in?

where {Jami 2 1} are known ﬁxed constants, C is a measurable function from IR
to R, {mi 2 1} is a stationary, mean zero, unit variance Gaussion process with
correlation p(k) :2 E77117”,C ~ k‘”L(k), k _>_2 1, 0 < 6 < 1, where L is a function
of positive integers, slowly varying at inﬁnity, and L(k) is positive for large 1:.
Thus 22:, p(k) = 00, implying the errors have long memory. For motivation and
arguments in support of this Gaussian and / or Gaussian subordinated long memory
error process, see Taqqu (1975), Dehling and Taqqu (1989) and a review paper by
Beran (1992).

The other class of long memory process is of the moving average type. For
more on their importance in economics and other sciences, see Robinson (1994),
Beran (1994), and Baillie (1996). These processes include an important class of
fractional ARIMA processes. For various theoretical results pertaining to the em-
pirical processes of long memory moving averages, see Ho and Hsing (1996, 1997),
Giraitis et a1. (1996), Koul and Surgailis (1997, 2001b), Giraitis and Surgailis
( 1999), among others.

Because of the importance of multiple linear models with long memory moving

average errors, and the desirable properties of the above mentioned minimum dis-

53

tance estimators, it is natural to investigate their properties under the long memory
moving average errors. The objective of this paper is to obtain the asymptotic dis-
tribution of the m.d. estimators of regression parameter in multiple linear model
with long memory moving average symmetric errors when the design variables are
either known constants or i.i.d. random variables, independent of the errors. These
results thus extend those of Koul (1985a,b) and Koul and Mukherjee (1993) to
these models.

The rest of this chapter is organized as follows. Section 2 provides the m.d.
estimators and their asymptotic normality under both ﬁxed and i.i.d. random

design cases, while their proofs appear in Section 3 and Section 4, respectively.

3.2 Main results

3.2.1 The case of non-random designs

Consider the linear regression model where one observes the response variable

{Ym}, 1 S i g n, satisfying

Yn, = 1;,5 + 5,, 1 g i g n, 5 6 RP. (3.2.1)

h I

Let X denote the n x p design matrix of known constants whose it row is 33,",
1 _<_ i g n. Here IR” denotes p—dimensional Euclidean space, IR = R1. In the
sequel, for the sake of convenience, the dependence of various entities on n will not

be exhibited. We assume the errors {55,1 3 i g n} to form a stationary moving

average sequence,
00
8J = Eng-4,, bk ~ L1(k)k’(1+9)/2, 0 < 0 < 1, 1 g i _<_ n, (3.2.2)
k=1

with the common distribution function F, where (3, s E Z are i.i.d. standard
random variables, symmetric around zero and L1 is a slowly varying function at
inﬁnity. This implies that p(k) = Cov(el,el+k) = L(k)k‘2, where L(k) = C9 L2(k),
Co = 2(2 — 9)‘1(1 — 6)’1fo°°(u + u2)‘(1+9)/2 du, and hence the errors have long

memory. We assume that (0 in (3.2.2) satisfy the following conditions:
A.1 |Ee2"29| S C (1 + |u|)‘5, for some C < oo, 6 > 0, Vu 6 IR.
A.2 E (COP < oo.

Giraitis et al. (1996, Lemma 1) proved that under the Condition A.1, the error dis-
tribution function F is inﬁnitely differentiable. The assumption A2 is a condition
on the decreasing rate of its density function in the tails.

Now, let 7,, :2 L1/2(n) n‘1‘QI/2 and deﬁne, following Koul and Mukherjee

(1993).
MN == r.:2 [Horn-”1:3“ [I(YJ — sz s y)-— 1(_1;+ sz < y)]}l|2dH(y).
o(A) == 7.2 /I|(X'X)“/2{ZIJ[I(JJ s y) — Me. > —y)]}

+(X'X12/2(A — ﬂ) my) + f(-y)I lIdeo), A 6 RP,

where I (A) is the indicator function of set .4, Hull denotes Euclidean norm and H -.

is a nondecreasing right continuous function from IR to IR. The m.d. estimator of

55

the regression parameter )3 is deﬁned by

A

B 2: argmin {Ag/(A), A 6 IR”}.

Note that [3 is the estimator 5+ deﬁned in Koul (1985b) for the independent errors
case and is the estimator [3; deﬁned in Koul and Mukherjee (1993) for the Gaussian
subordinated process errors. The motivation for considering these m.d. estimators
and its ﬁnite sample properties are discussed in Koul (1985b, 1992b). In particular,
for p = 1, xi E 1, H(I) = .r [H(r) = 1(1: 2 0)], 13 is the Hodge-Lehmann [Median
estimator] estimator of the one sample location parameter.

Before we state the asymptotic normality of B, we need the following additional

assumptions on the model (3.2.1) and (3.2.2):
A.3 (X’X)’1 exists for all n 2 p.

A.4 n max |$;(X’X)‘IJ:,-| = 0(1).

151571

A.5 f(1+ y2)’1dH(y) < 00.

Conditions A.3 and A.4 are the same as those in Koul and Mukherjee (1993), while
A.5 replaces the conditions ff’dH < 00, r = 1,2 and f0°°(1 — F)dH < 00 of the
above paper.

Let A = rn'1(X’X)1/2, B = 7,,(X'X)1/2, c,- = A‘lz,, d,- = B'lxi. We now

state the main result:

Theorem 3.2.1 In addition to (3.2.1) and (3.2.2), assume that A.1-A.5 hold. '

56

Then,

Am“ — ﬂ)
= (2/1’2222’2)1 / 24110::- S y) — 1(J,> —y)]f(y)dH(y) +0,,(1). (3.2.3)

The next result gives the asymptotic equivalence of the m.d. estimator in the

ﬁrst order to the least square estimator and its asymptotic normality.
Corollary 3.2.1 Under the assumptions of Theorem 3.2.1,
J,,-I(X'X)1/2(JJ — s) = —J,,-1(J"X)-l/2 23,-5.- + 0,,(1). (3.2.4)

1, i=1

Moreover,
G;‘/2TJ:2(X'X)2/2(3 — a) => NJ(0, 1...), (3.2.5)
where Ipxp is p x p identity matrix, and

an = T;2(X"X)_l/2X’RRX(‘XH‘¥)—1/21 Rn : (p(z - j))an1 21.2:1921 ° ' ° in-

3.2.2 The case of random designs

In this subsection, we consider the following multiple linear regression model
Y,- = 112,23 + 5,, 1 g 2' g n, 5 6 IR”, (3.2.6)

under the same assumptions as those in the previous section, except that here

{XJ,i 2 1} are i.i.d. random variables, independent of the errors and EX1 75 0. 2

57

Similarly, deﬁne
Mlm ,;2 =an -1/2{Zx[1(—x;3 s y) _1(_3-;+x;3 < y)]}|l2dH(y)
::T_2/Iln -1/2{ZX_[I( 5i<y)_1(€i> —y)]}
+ n-1/2(X'X)(A — a) my) + f(—y)] ||2dH(y)
The m.d. estimator of the parameter )3 in (3.2.6) is deﬁned by
Bl :2 argmin {AII(A),A 6 RP}.

Before we present the asymptotic normality of m.d. estimators, we need the fol-

lowing assumptions on the model (3.2.6).
A.6 E||X1||5 < 00.

Let an— - Tn- Til/2, bn — —Tnn1/2, C,- = angi, D,- = bngi, then we have the

following analogous result of Theorem 3.2.1 under the i.i.d. random design case.

Theorem 3.2.2 In addition to (3.2.6) and (3.2.2), assume that A.1, 24.2, A.5 and

A6 hold, then
(13(8) — 5)
z (4)2313)“ / 212 [1(5. _<_ y) — 1(5.> -y)]f(y)dH(y) + 0,,(1).
(3.2.7)

Corollary 3.2.2 Under the assumptions of Theorem 3.2.2,

T—1n1/2(B _ 3)T—1n—1/2ZX 514.0100 (3.2.8)

58

Moreover, let EXI = p ¢ 0, then

rglnl/2(Bl — ﬂ) = —p rn'ln“1/2:5, + 0,,(1), (3.2.9)
i=1

and

Tilvl'l/QZEI- => N(0.,1). (3.2.10)

i=1

3.3 Proofs of the theorems

The method of proof is similar to that of Koul (1985a or 1992b; Ch5) which requires
that M (A) istuniformly locally asymptotically approximated by quadratic form
Q(A) and shows ”.403 — )3)“ = 0(1). This approximation in turn is used to obtain
the asymptotic normality of m.d. estimators 5. F or more details, see Koul (1992b;
Ch5) and Koul (1985a).

In order to provide the details, we need some notations and several lemmas.
Let C stand for a generic constant which may change from line to line. As in Ho
and Hsing (1996, 1997) and Koul and Surgailis (1997, 2001a, b), put

I oo
Eu 3: E bkCi—ka 5:: I: E bkCi—k,
k=l

k=t+1 (3.3.1)
H(z) := 13(5) 5 :17), mm) := F,’(x).

The following two lemmas are analogous to Lemmas 5.1 and 5.2 of Koul and Sur-

gailis (2001b), thus their proofs can be deduced from there.

Lemma 3.3.1 Under the assumptions A] and 24.2, there exist lo 2 1 and a con-

59

stant C such that for any I 2 lo, :1: 6 1R,

|f"”(x)l + lflp’(r)| S C(1+ I113)", p = 0,1,2, (3.3.2)

If)(x) — f,_1(x)| g be(1+)x|3)“. (3.3.3)

Lemma 3.3.2 Let g7(x) 2: (1+ MP)"1 and h(x), x E R be a real valued function

such that, for some C < oo,
|h(x)| S Cg.,(x), 'y = 2,3. (3.3.4)

Then there exists a constant C7, depending only on C in (3.3.4), such that for any

x, y E R,
WI + y)| S 0797($)(1V lyll), (3-3-5)
where a V b :2 max{a, b}.

Remark 3.3.1 From (3.3.2) in the Lemma 3.3.1, f(x) and f, (x) satisfy conditions
of h(x) in Lemma 3.3.2, thus, |f(x + y)) S C(1+ x2)‘1(1 + yz), lf'(x + y)! S

C(1+x2)‘1(1+y2).

Lemma 3.3.3 (Surgailis). Under the assumptions A] and A.2, there exists a

constant C < 00 such that
|00v(I(Eo s 2). Ms. S x))| 5 00+ gem-0,
foralliEZ, xElR.

Proof. The proof is in Appendix.

60

Lemma 3.3.4 Under assumptions of A.I and A2, there exists a constant C such

that

lCov(I(x<£o SIC-+010), I($<Ei SIC-1'01)”

5 00+ r2)“r”[laol v Iaol3]‘”[la.l v 1313]”.
foralliEZ, xER, (106R, 036R.

Proof. The proof is very similar to that of Lemma 3.3.3, so here we only give the
outline. Let f,- be the a-ﬁeld generated by Q, k S i and F(x, y) := F(y) - F(x).

Write the telesc0ping identity:
00
I(x<e,~$x+ai)— F(,xx+a,-) =ZUi)(x,x+a,), (3.3.6)
(:1
where

U.,z(x,x + a.) = Fz—1($ - 51.1-1.2? + a. - €.,z_1)- E(I - 51.1.17 + a. - 61,1)

=U( ()(x, x+a,-)+U(3) (x, x+a,-),
where

1 - - - -
Ui(,,)(x,x + 01,-) = F((it — ELI—1,1? 'l' 011 — 831—1)“ 171(1‘ — 51,131? + 012' — 52,1),

2 - ... ~ ~
U.(,()(:r, 1' + at) = Fz—1(x - 52.1-1.3 + a. — 51.1-1) - E(x - 831—1. at + a. — 51.1—1).

In order to prove the lemma, as that of Lemma 3.3.3, we only need to show the

following (3.3.7) and (3.3.8).

2
E[U,-,,(z,x+a.-)] 30(1+22)-1(|a.lv|a,|3), 1:1,2,-.-,10, (3.3.7)-

2
E[U§j’(x,x+a,)] SC(1+x2)’ll‘1'9(|a,-IV|a,—|3), 1>10, q=1,2. (3.3.3)

61

Proof of { 3. 3. 7). According to the deﬁnition, we have

2
E[U.-,)(x,x + 0.)] S 2[EF,2_1($ -I012|- 52,14,513 + lail " 51,1—1)
+£mﬂx—m.—axm+m4—aﬂ

Sinx-m¢x+mm
lat"
= 4 f(x + v) dv
-l0:'|

S C(1+ x2)‘1(|a,-| V |a,~|3),

the last inequality follows from lemma 3.3.2 with h(x) replaced by f (x) and 7 = 2.

Proof of (3.3.8). For q =1,
(1) I+Og
(1,, (z. a: + a.) = / [fz(u — Inc.-. — é...) — fz(u — 5.3)] du. (3.3.9)

Follows the argument of Lemma 3.3.3, apply Lemma 3.3.1 and 3.3.2 with 7 = 2,

we can obtain the following analogous inequality
(U.(,i)($,$ + Gill S C(lszi-IIV lszt—1l2)(1+ 531)“ + $2)_1(lai| V lail3)-
From (3.3.9) and (3.3.2), we have |Ui(.})(x,x + a,)| g C(lb,(,-_,| A1), thus we obtain
E[Ui(,i)($,$ + 011)]2 S CEIb)C.-_,|2(1+ 55:1)(1 + Ira—100:1 V lail3)v

which is (3.3.8) for q = 1.

62

For q = 2, apply Lemma 3.3.1 and 3.3.2, we have

|U1-(3)(x,x + a,)| S

1+0.-
/ “(W-51,14)—f(-1(u-E,-J_1)]du!

x+|o,-| _1
S C bf(1+ Iu — 5,,,_1|2) du

I-lail

 

I'l'lotl
g be/ (1+u2)'1(1+§ﬁ,_1)du

~|01|

S Cblz(1+ €i,_1)(1+ x2)'1(|a,| V |a,-|3).

Again, as IU’,(§)(x,x + ai)| S 2, we obtain (3.3.8) for q = 2. Hence, we proved the
lemma.

We are now ready to state and prove the asymptotic uniform quadraticity of

AHA).
Lemma 3.3.5 Under the assumptions of Theorem 3.2.1, for all b 6 (0, 00),

E sup |M()3 + A'ls) — Q(8 + 24—13)]: 0(1), (3.3.10)
s€N(b)

where N(b) = {s 6 RP : ||s|| S b}.

Proof. The proof basically is similar to that of Theorem 2.1 of Koul (1985a). As

there, use the symmetry assumption of f (y), it is enough to show that Vb 6 (0,00),

sup f|)Zd.[F<y.y + czs) — czsnyn ”211113) = 0(1). (3.3.11)

s€N(b)

E sup /”Zd,-[I(y < 5,— S y + cis) — F(y,y + c§s)] H2dH(y) = 0(1), (3.3.12)
sEN(b) i

E sup [Ugh-[1(a- _<_ y)—F<y>+czs1(y>]||2dH<y) =0(1). (3.3.13)

sEN(b)
The ﬁrst equality (3.3.11) follows from the inﬁnite differentiablity of F, 2, dc: =

Ipxp, maxilcj-sl —> 0, (3.3.2) and assumption A.5.

63

As to the (3.3.12), here we only give the proof for ﬁxed 3 E N (b). The uniform
convergence can be obtained by the compactness of N (b), similar to that of Theorem
2.1 (Koul,1985a). Let dijz=the j-th entry of the vector d,. Thus the integrand of

the j-th summand of the left hand side (LHS) of (3.3.12) does not exceed

2: Idijdrjl |Cov(I(y < e,- S y + cﬁs), I(y < 5, g y + 0:3)“.

1 1"

Apply Lemma 3.3.4, notice “cis” S ||c,~||||s|| = 0(n‘9/2)||s|| —> 0, so, for any 0 <

h < 0, the above bound does not exceed

02:133..) (1 + yz)“(1+li- rl)‘”n-"/2nsn

!

s Cllslln“2‘”)n2'”n"‘/2(1 + 33-1,

where the last inequality follows from max,||d,~|| = 0(n’(2‘9)/2). Thus the j-th entry

of the LHS of (3.3.12) does not exceed
C||s||71_h/2/(1 + y2)“1dH(y) —> O, n —+ 00,

which proves the (3.3.12).

As to (3.3.13), we need to prove
j” Z dicisﬂylllzdmy) = 0(1), (3.3.14)

/ E|(Zd.[I(e. s y) — Fm] [l2dH<y) = 0(1). (3.3.15)

But 2,6116; = Ipxp, ||sf(y)||2 S ||s||2f2(y) S Cb2(1+ y2)‘1. Thus (3.3.14) follows
from (3.3.2) and assumption A.5. As to (3.3.15), like (3.3.12), apply Lemma 3.2.1,
the j-th entry of the LHS of (3.3.15) does not exceed cn‘2+9n2‘9 f(1+yz)"1 dH(y) <
oo, hence the lemma is proved.

64

Lemma 3.3.6 Let A = argmin{Q(A),A 6 RP}, assume the conditions of Theo-

rem 3.2.1 hold, then

IIA<8 — 3)!) = 0.0). (3.3.16)

||A(A — .3)” = 0,,(1). (3.3.17)

Proof. The proof of (3.3.16) follows from the following Lemma, while proof of

(3.3.17) basically is the same as that of (3.3.16).

Lemma 3.3.7 Assume the conditions of Theorem 3.2.1 hold, then

{a}. for any 5 > 0, there exists a O < z, < 00 and N15 such that
P(|M(,s)| g .2.) 21— e, for all n 2 N15.
(b). for any 5 > 0, 0 < z < 00, there exists N25 and a positive b > 0 such that

P( inf M()3+.4“s) 2 3) 21—5 for all n. 2 N25.

IISIiZb
Proof. The proof of part (a) is from ﬁnite moment Ell/1(3) < 00, which is from
(3.3.15). The part (b) is very similar to that of Lemma 3.1 of Koul (1985a) which
we omit here.
Finally, we are in the position to provide the proof of main theorem.
Proof of Theorem 3. 2.1. The proof follows that of Theorem 5.41 of Koul (1992b)
and Theorem 3.1 of Koul (1985a). We only give the sketch here. From Lemma

3.3.6, we have

Mu?) — 2(3)) = (Higgins + A-1s) — “ifgbow + A-‘s)|

 

S sup M(ﬁ+A'ls) — Q()3+A’1s)l.

IISIISb

65

From above inequality and Lemma 3.3.5, we get
111(3) = Q(A) + 0,,(1). (3.3.18)

The last equality (3.3.18) together with M(B) 2 62(3) + 0,,(1), yield 62(5) =

62(5) + 0,,(1), which is precisely ||A(,3 —- A)“ = 0,,(1). Thus

A

A(,s — s) = A(A — s) + 0,,(1). (3.3.19)

Now, from the deﬁntion of Q(A) and A, we readily get the (3.2.3) of Theorem 3.2.1
from (3.3.19).

In order to prove the Corollary 3.2.1, we need the following lemma.

Lemma 3.3.8 Let Sn(x) = 2?:1d,[1(€i S x) — F(x) +f(x)€,-] , under the assump-

tions of Theorem 3.2.], then

Proof. The proof of the lemma can be deduced from Theorem 3.1 of Koul and
Surgailis (2001c), where they proved more general case, i.e. the uniform reduction
priciple for weighted residuals empirical processes.

Proof of Corollary 3. 2.1. From the Theorem 3.2.1 and notation of Sn(x), we obtain

Av? — (a) = (2 f 12cm)" f [5.3) + 3,,(_,,) — 2: d.s.~f(y)]f(y)dH(1/) + 0,.(1)

i=1

= - 23,-5.- + 0,,(1),
i=1

the last equality , which is (3.2.4), follows from lemma 3.3.8, while (3.2.5) follows
from Theorem 2 of Giraitis et a1. (1996).

66

The following lemma is the asymptotic uniform quadraticity of M ((A) under

i.i.d. random case.

Lemma 3.3.9 Assume the conditions of Theorem 3.2.2 hold. Then, for all b E

(0, 00),

sup |Ml()3 + a;13)— Q10? + agls)| '2 0,,(1).

sEN(b)
Proof. The proof is similar to that of Lemma 3.3.5 except here {Xm' 2 1} are i.i.d.

r.v’s, instead of ﬁxed known constants. To prove the theorem, it is enough to show

that Vb e (0, oo),
2
:22) [HE D. (Po. 3 + as) — 0:313») || dH(y) = o.(1),

(3.3.20)

E SUP [HER-[Hy <8.- 5 y+CIS) -F(v,y+C£8)]

'2 dH(y) = 0(1), (3.3.21)
s€N(b)

 

E sup / H: D. (as. s y) — F(y) + 031(3)] ”2 dH(y) = 0(1).

s€N(b)
(3.3.22)

Proof of ( 3 3. 20). Write F( (x, y) =fyf )du, use the differentiability of f, we

can obtain

LHS of (3320 )< b4/II71}:‘IZX1XIIIX||/::f

 

                 

,)2|dz|| dH(y

(3.3.23)

Now, from Remark 3.3.1, we have

--1

fan
—a;1

 

my + blle-IIZ) ldz s Ca: (1 + b21321?) (1 + 32)“.

67

Thus, from (3.3.23), we have

n“ 2 X,Xﬂ

1

LHS of (3.3.20) S Ca;2b4|

 

 

 

hf/b+ﬁrﬁmn

 

4x2

 

(1+ b2||X,-

 

which is 0,,(1) from A.6, A.5 and of —-> 0, hence (3.3.20) is proved.
Proof of { 3. 3. 21 ) Similar to the proof of (3.3.12), we here only give the proof
for the ﬁxed 3 E N (b). Let D,j:=the j-th entry of the vector D,, which are analogous

to dij in the ﬁxed design case and
K.(Z) == 1(1/ < 2 S y + 0,3) - F(x/3+ 0:3).

Thus the integrand of the j-th summand of (3.3.21) does not exceed

 

22E [lDuDul lE [K1(€i)1\'r(sr) X.,X.] I]. (3.3.24)

 

Apply Lemma 3.3.4, as |C§s| S of,‘ |X,—||||s||, we have

 

(E [Ki(€i)K,-(Er) Xi, Apr]

 

sca+ﬂ40wawma4wawﬂma4WQWW?
(3.3.25)

Combining (3.3.24) and (3.3.25), use A.6, we obtain, for any 0 < h < 0, the j-th

entry of

LHS of (3.21) g C(llsll v ”smirk/2714+": Z(1+)1— r|)“’/(1 + y2)-1 dH(y)
scmwwmmwm/h+rrmmwam 33m,

which proves the (3.3.21).

68

Proof of (3. 3. 22) It sufﬁces to prove
fEHZDC’sﬂy) )|| dH(y =0(1), (3.3.26)

/ EHZD. (as. s y) — Fm] “2414(3) = 0(1). (3.3.27)

The ﬁrst equality (3.3.26) follows from the following inequality and assumptions

A.6, (3.3.2) and A.5.

 

 

2 / Home) < oo

As to the (3.3.27), like that of (3.3.15), we have the integrand of the j-th summand

LHS of (3.3.26) 3 ||3||2E ”72-1 Z X,X;

of the LHS of (3.3.27) does not exceed

ZZEUDU'DU'I IE{[I(51 Syl-F(y)][1 ]

Thus, from Lemma 3.3.3 and similar argument as (3.3.24), we obtain the j-th entry

 

X,, x.)

 

of

LHS of (3.3.27) 3 Cb;2ZZ(1+ Ii— mfg/(1+ (ﬂ—l dH(y).

Thus, LHS of (3.3.27) S Cn’iQ‘Oan‘oﬂl + y2)'1dH(y) < 00. Hence, lemma is
proved.

Proof of Theorem 3.2.2. The proof is completely analogous to that of Theorem
3.2.1.

Proof of Corollary 3.2.2. Proof of the (3.2.8) is completely analogous to (3.2.4) of

Corollary 3.2.1. From (3.2.8), we obtain

T—l n1/2(B— B): —r'1n'1/2Z(z\-—/L) "-W'n— 1n U225 +op(1). (3.3.28)

i=1

69

But, the variance of the ﬁrst term of the RHS of (3.3.28) goes to zero, thus the
ﬁrst term is 0,,(1), which proves (3.2.9). The last equality (3.2.10) follows from the

Lemma 5.1 of Surgailis (1982).

3.4 Appendix

Before we give the proof of Lemma 3.3.3, we need a following lemma.

Lemma 3.4.1 Let g(x) = (1 + lxI‘Q’)‘l and h(x), x 6 IR be a real valued function

such that
|h(~r)l 5 09(23). (3.4.1)
hold for any x E R. Then, for any x S O and any v, w 6 R
If W“ + v + w) ‘ h(u + W141!) S C(lvl V W) (1 v |w|3)(1 + :32)“. (3.4.2)

Proof. First consider |v| S 1, then by (3.4.1) and (3.3.5) with 7 = 3, the LHS of

(3.4.2) does not exceed

Clvl [3 (I + In + w|3)—ldu S C|v|(1 V |w|3) [—1 (1+ |u|3)-ldu
S C|v|(1V |w|3)(1+ 152)“-
Next, consider |v| > 1. Then the LHS of (3.4.2) does not exceed
C/1(1 + In +. v + w|3)—ldu + C/j (1+Iu + w|3)_1du. (3.4.3)

By (3.3.5), the ﬁrst term of (3.4.3) does not exceed

C(1V|v+w|3)/ (1+ lul3)‘1du 5 C|v|3(1 v |w|3)(1 +32)-1.

The second term of (3.4.3) follows similarly. This proves the lemma.
Proof of Lemma 3. 3. 3. Let f, be the o-ﬁeld generated by (1,, k S i. Write the

telescoping identity:

[(5, g .2) — F(x) = Z U,,,(x), (3.4.4)
(:1
where
Ui,l($) = F1-l(~T " giJ—l) — Fill? — 5w)
= Ugly.) + 1.33%), (3.4.5)
where

U,(,l)(l‘) = 171(33 - 53,1—1) - E(x — 51,1),
U,(3)(17) = 171-1“ — 51,1-1) - E(x — 521-1)-
The lemma 3.3.3 follows from the following (3.4.6) and (3.4.7).

2
E[U,-,,(x)] gC(1+x2)", 1:1,2,~-,10, (3.4.6)

2
E[U,‘j’(x)] 50(1+22)-11-1-9, 1>zo, q=1,2, (3.4.7)

where lo will be chosen sufﬁciently large in order that the bounds of Lemma 3.3.1

hold. Indeed, by orthogonality of (3.4.4) and (3.4.6), (3.4.7).

l:EU1,1+1($),(U01 (xll

lCov( I(eo_ < x),I (e, S x) )l

Now, it sufﬁces to show (346) and (3.4.7) for x S 0 only. As to (3.4.6),

E[U1,1($)]2 S 2[EF,2_1(1‘ ‘ 51.1—1) + E5205 — 51.1)]
_<_ 2[EFz—1(l‘ - 51,1-1) + EFIUC — 51.1)]
= 4F(x).
Notice F(x) = ffoof(u)du and by Lemma 3.3.1 (3.3.2), we have F(x) S C(l +

x2)", this proves (3.4.6).

Consider (3.4.7) for q = 1. In view of (3.4.5), as 5,-1-1 = b,(,-_, + in), we have

,U.-(,l)($) = [1 [f1(u -' szi—z - 51,1) - flat — E,,,)]du.

—00

Here, f, satisﬁes Lemma 3.4.1’s h(x) by Lemma 3.3.1. Thus from (3.4.2), we obtain
le-(j)($)| S C(lblCi—ll V lblCi—ll3)(1 V lgi,ll3)(1+ 1‘2)-1
3 C(lb,(,_,| v |b,(,-_,|3) (1 + |é,,,)3)(1+ x2)-1. (3.4.8)

Combining (3.4.8) with the estimate IU,(,})(x)| S C(lb)(,_)| /\ 1), which is an easy
consequence of (3.3.2), we obtain
E[U,.(,}’(3:)]2 s C(Elb,(,-.,|2 + E|b,(,_,|3) (1 + Elé,,,|3)(1+ .32)—l
gcwu+zﬁ*
300+204rkﬂ m4m

the second inequality follows from E |§,,)|3 < 00, which follows from the Rosenthal

inequality.

El: szzl3 S C: Elsz1l3 + C(: ElblCzl2)3/2,
(:1 [=1 l=l

72

this proves (3.4.7) for q = 1.
As to (3.4.7) for q = 2. From Lemma 3.3.1 (3.3.3) and Lemma 3.3.2 (3.3.5)

with ’7 = 3, we obtain

lU1(,i)($)l = l/ [f1(u — E.131—1) - fI—1(U ‘ ELI—llldu

 

S be/ (1+ I'll — 51.1-1l3)-1du

I

g Cb,2(1v|§,,,_1|3)/ (1+|u|3)“du

—oo

3 be(1v|é,,)_1|3)(1+ 2:2)-1.
Hence, as |U1.(f)(x)| S 2, similarly as q = 1, we obtain
1~:[U,‘f’(3)]2 g 03,2(1 + .42)-1 g C(1+ 30-11-14,

this, together with (3.4.9), proves (3.4.7). Hence the lemma is proved.

73

Bibliography

[1] Antoniadis, A., Gregoire, G. and McKeague, I. W. (1994) Wavelet methods

for curve estimation. J. Amer. Statist. Assoc. 89, 1340-1353.

[2] Antoniadis, A., Grégoire, G. and Nason, G. (1999) Density and hazard rate
estimation for right-censored data by using wavelet methods. J. R. Statist.

Soc. B, 61, 63-84.

[3] Baillie, RT. (1996) Long memory processes and fractional integration in econo-

metrics. ‘J. Econometrics, 73, 5-59.

[4] Beran, J. (1992) Statistical methods for data with long range dependence.

Statist. Science 7, 404-427.

[5] Beran, J. (1994) Statistics for Long-Memory Processes. Monographs on Statis-

tics and Applied Probab., 61. Chapman and Hall. NY.

[6] Chui, K. (1992) Wavelets: A tutorial in theory and applications , Boston:

Academic Press.

[7] Daubechies, I. (1992) Ten Lectures on Wavelets, SIAM, Philadelphia.

74

[8] Dehling, H. and MS. Taqqu. (1989) The empirical processes of some long
rang dependent sequences with application to U-statistics. Ann. Statist., 17,

1767-1783.

[9] Donoho, D. L. and Johnstone, I. M. (1994) Ideal spatial adaptation by wavelet

shrinkage. Biometrika, 81, 425-455.

[10] Donoho, D. L., Johnstone, I. .\/I., Kerkyacharian, G. and Picard, D. (1995)
Wavelet shrinkage: asymptopia? (with discussion). J. Roy. Statist. Soc. Ser.

B, 57, 301-369.

[11] Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D. (1996)

Density estimation by wavelet thresholding. Ann. Statist., 24, 508-539.

[12] Feller, W. (1957) An Introduction to Probability Theory and its Application.

Volume I, Second edition, John W'iley 8: Sons, Inc.

[13] Giraitis. L., Koul, H.L. and Surgailis, D. (1996) Asymptotic normality of re-

gression estimators with long memory errors Statist. Probab. Lett. 29, 317-335.

[14] Giraitis, L. and Surgailis, D. (1999) Central limit theorem for the empirical
process of a linear sequence with long memory. J. Statist. Plann. and Inference,

80, 81-93.

[15] Hall, P. and Patil, P. (1993) On the choice of smoothing parameter, threshold

and truncation in nonparametric regression by nonlinear wavelet methods. Re- -

75

search Report SMS-72—93, Center for Mathematics and Statistics. Australian

National University, Canberra.

[16] Hall, P. and Patil, P. (1995) Formulae for mean integated squared error of

non—linear wavelet-based density estimators. Ann. Statist., 23, 905-928.

[17] Héirdle, W., Kerkyacharian, G., Picard, D. and Tsybakov, A. (1998) Wavelets,

Approximation and Statistical Applications. Lecture Notes in Statistics 129

[18] Ho, HO. and Hsing, T. (1996) On the asymptotic expansion of the empirical

process of long memory moving averages. Ann. Statist. 24, 992-1024.

[19] Ho, HQ and Hsing, T. (1997) Limit theorems for functionals of moving aver-

ages, Ann. Probab. 25, 1636-1669.

[20] Kerkyacharian, G. and Picard, D. (1992) Density estimation in Besov space.

Statist. Probab. Lett. , 13, 15-24.

[21] Kerkyacharian, G. and Picard, D. (1993) Density estimation by kernel and
wavelet methods, optimality in Besov space. Statist. Probab. Lett. , 18, 327-

336.

[22] Koul, H.L. (1985a) Minimum distance estimation in multiple linear regression,

Sankhya Ser. A 47, 57-74.

[23] Koul, H.L. (1985b) Minimum distance estimation in multiple linear regression

with unknown error distributions Statist. Probab. Lett. 3, 1-8.

76

[24] Koul, H.L. (1986) Minimum distance estimation and goodness-of-ﬁt tests in

ﬁrst order autoregression. Ann. Statist. 14, 1194-1213.

[25] Koul, H.L. (1992a) M-estimators in linear models with long range dependent

errors. Statist. Probab. Lett. 14, 153-164.

[26] Koul, H.L. (1992b) Weighted Empiricals and Linear Models. IMS Lecture

Notes, 21.

[27] Koul, H.L. and DeWet, T. (1983) Minimum distance estimation in linear re-

gression models. Ann. Statist. 11, 921-932.

[28] Koul, H.L. and K. Mukherjee. (1993) Asymptotics of R-, MD- and LAD-
estimators in linear regression models with long range dependent errors.

Probab. Theory Related Fields, 95, 535-553.

[29] Koul, H.L. and D. Surgailis. (1997) Asymptotic expansion of M-estimators

with long memory errors. Ann. Statist. 25, 818-850.

[30] Koul, H.L. and D. Surgailis. (2001a) Asymptotics of the empirical process of
long memory moving averages with inﬁnite variance J. Stochastic Processes Ed

A ppl., 91(2), 309-336.

[31] Koul, H.L. and D. Surgailis. (2001b) Asymptotic expansion of the empirical

process of long memory moving averages. preprint.

[32] Koul, H.L. and D. Surgailis. (2001c) Robust estimators in regression models —

with long memory errors. preprint.

77

[33]

[34]

[35]

[36]

[37]

[38]

[39]

[40]

Lo, S.H., Mack, Y.P. and Wang, J .L. (1989) Density and hazard rate estimation
for censored data via strong representation of the Kaplan-Meier estimator.

Probab. Theory and Related Fields, 80, 461-473.

Mallat, S. (1989) A theory for multiresolution signal decomposition: the
wavelet representation. IEEE Transactions on Pattern Analysis and Machine

Intelligence, 11, 674-693.

Marron, J.S. and Padgett, W.J. (1987) Asymptotically optimal bandwidth
selection for kernel density estimators from randomly right-censored samples.

Ann. Statist., 15, 1520-1535.

Meyer, Y. (1990) 0ndelettes et operateurs, Hermann, Paris.

Padgett, W .J . and McNichols. D.T. (1984) Nonparametric density estimation

from censored data. Comm. Statist. Theory and Methods, 13, 1581-1611.

Patil, P. (1997) Nonparametric hazard rate estimation by orthogonal wavelet

methods. Journal of Statistical Planning and Inference, 60, 153-168.

Robinson, PM. (1994) Semiparametric analysis of long-memory time series.

Ann. Statist., 22, 515-539.

Singpurwalla, ND. and Wong, M.Y. (1983) Estimation of the failure rate, a
survey of the nonparametric methods. Part 1: Non Bayesian Methods. Comm.

Statist. Theory and Methods, 12, 559-588.

[41] Stute, W. (1995) The central limit theorem under random censorship. Ann.

Statist., 23, 422-439.

[42] Stute, W. and Wang, J. -L. (1993) The strong law under random censorship.

Ann. Statist. 21, 1591-1607.

[43] Surgailis, D. (1982) Zones of attraction of self-similar multiple integrals.

Lithuanian Mathematical J. 22, 327-340.

[44] Tanner, M. A. and Wong, W .H. (1983) The estimation of the hazard function

from randomly censored data by the kernel method. Ann. Statist., 11, 989-993.

[45] Taqqu, MS. (1975) Weak convergence to fractional Brownian motion and to

the Rosenblatt process. Z. Wahhrsch. 31, 287-302.

[46] Wu, S. and Wells, M. (1999) Estimating hazard rate with truncated and cen-

sored data by wavelet methods. preprint

[47] Zhang, B. (1996) Some asymptotic results for kernel density estimation under

random censorship. Bernoulli 2, 183-198.

79