Michigan State University

This is to certify that the dissertation entitled "Bootstrap approximation to the distributions of M-estimators", presented by Soumendra Nath Lahiri, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Statistics.

Major professor

MSU is an Affirmative Action/Equal Opportunity Institution

BOOTSTRAP APPROXIMATIONS TO THE DISTRIBUTIONS OF M-ESTIMATORS

By

Soumendra Nath Lahiri

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

1989

ABSTRACT

BOOTSTRAP APPROXIMATIONS TO THE DISTRIBUTIONS OF M-ESTIMATORS

By Soumendra Nath Lahiri

Consider the linear regression model Y_i = x_i β + ε_i, where the ε_i's are random variables with common distribution F and the x_i's are known constants. Let β̂_n be the M-estimator of β corresponding to a nondecreasing, bounded score function ψ. This thesis analyzes the asymptotic behavior of certain bootstrap approximations to the distribution of the normalized β̂_n. It is shown that the ordinary bootstrap procedure as such does not work in the present setup. As remedies, several modifications of this procedure are formulated. For studying the asymptotic behavior of these procedures, Edgeworth expansions of the distributions of β̂_n and the modified bootstrap estimators are obtained. It is proved that all the proposed modifications lead to a faster rate of approximation, viz. o( max{ |x_j| / ( Σ_{i=1}^n x_i² )^{1/2} : 1 ≤ j ≤ n } ), than the usual normal approximation. For the special case, when the score function ψ
is odd and the underlying error distribution F is smooth and symmetric, it is observed that by taking the resampling distribution to be a suitable symmetrized kernel estimator of F, one can attain an even higher rate of approximation, namely o( max{ |x_j|² / Σ_{i=1}^n x_i² : 1 ≤ j ≤ n } ).

The second part of the thesis considers bootstrap approximations to the distributions of M-estimators in a multivariate setting under a different model. Let X_1,...,X_n be independent and identically distributed k-dimensional random vectors with common distribution F_θ, θ ∈ Θ ⊂ R^p for some p ≥ 1. Let ψ be a function from R^k × R^p into R^p and θ̂_n be the M-estimator of θ corresponding to ψ. Under some regularity conditions on ψ, an Edgeworth expansion for the bootstrapped M-estimator is proved. Using this and the Edgeworth expansion for θ̂_n (obtained by Bhattacharya and Ghosh (1978): "On the validity of formal Edgeworth expansion", Ann. Statist. 6, 434-445), the rate of bootstrap approximation is shown to be o(n^{-1/2}). This extends a result of Singh (1981) ("On the accuracy of Efron's bootstrap", Ann. Statist. 9, 1187-1195) about the sample mean to M-estimators.

To my parents

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my thesis adviser, Professor Hira Lal Koul, for all the helpful advice, guidance and encouragement I have received over the last two years. I am thankful to Professor H. L. Koul and Professor R. Lepage for their constructive suggestions that significantly improved my presentation of this work and brought it up to its present form. I would also like to thank Professors R. V. Erickson, R. Lepage and H. Salehi for serving on my committee. Finally, I want to thank Ms. Loretta Ferguson for helping me with the typing of this manuscript.

TABLE OF CONTENTS

CHAPTER

Introduction

1. Bootstrap approximations to the distributions of normalized M-estimators in a simple linear regression model
1.1. Introduction
1.2.
Edgeworth expansion for the normalized M-estimator
1.3. Bootstrap approximations
1.4. Proofs

2. Bootstrap approximations to the distribution of normalized multivariate M-estimators
2.1. Introduction
2.2. Assumptions and main results
2.3. Proofs

Bibliography

INTRODUCTION

Ever since its introduction, the bootstrap method has found applications in a variety of statistical problems, in most cases with overwhelming success. The superiority of the bootstrap approximation in certain estimation problems was reported in the introductory paper, Efron (1979), on the basis of some numerical studies. Soon these empirical results were substantiated from the theoretical standpoint by Singh (1981), Bickel and Freedman (1981), Beran (1982), Babu and Singh (1984) and Hall (1988), among others. In fact, it was Singh (1981) who showed for the first time that the rate of bootstrap approximation to the distribution of the normalized sample mean is faster than the usual large-sample normal approximation. He derived an almost sure Edgeworth expansion for the distribution of the bootstrapped statistic and compared it with the standard Edgeworth expansion for the distribution of the normalized sample mean to arrive at this conclusion. It became clear from this work that the distribution of the bootstrapped statistic corrects itself for the possible skewness of the underlying distribution and thus provides a better approximation than the normal law. Subsequently, similar results on the rate of bootstrap approximation have been established in a number of cases where the statistic of interest is a smooth functional of the underlying distribution. See Babu and Singh (1983) for results on studentized k-sample means, Helmers (1988) for results on U-statistics, and Bose (1988) for bootstrapping an autoregression model. In this thesis we consider the behaviour of the bootstrap approximation to the distributions of M-estimators in two different problems.
The first problem concerns a simple linear regression model Y_i = x_i β + ε_i, i = 1,...,n, where the ε_i's are independent with a common distribution F and the x_i's are known, nonrandom constants. This model differs from the others mentioned earlier (except Bose (1988)) in that the observed values Y_1,...,Y_n are not identically distributed. Bootstrap approximation in similar nonidentical setups has been considered in Freedman (1981), Bickel and Freedman (1983) and Liu (1988). The first two papers prove the bootstrap central limit theorem for the least squares estimators of the multiple regression parameters, and Liu (1988) establishes the second order correctness of the bootstrap method for the sample mean of independent but not necessarily identically distributed observations. Here we consider bootstrapping the M-estimator β̂_n of β corresponding to a nondecreasing, bounded score function ψ (see Section 1.1 of Chapter 1 for the definition). Under certain smoothness conditions on ψ and/or F, an Edgeworth expansion for the distribution of the normalized β̂_n is obtained. This result is of independent interest for two reasons. First, such expansions for M-estimators in the general regression context were not known earlier; second, the method of proof is somewhat different from the conventional approach (cf. Ringland (1983)) based on Bhattacharya and Ghosh (1978). Bootstrapping β̂_n under the present model leads to some intriguing phenomena. In Section 1.3 of Chapter 1, we give an example which shows that the usual bootstrap procedure does not work in the present setup. The bootstrapped statistic in the example does not even converge to the limiting distribution of the unbootstrapped statistic. To overcome this drawback of the usual bootstrap procedure, we propose different modifications and show that each of these modifications actually attains a faster rate than the normal approximation.
In the second problem, we consider M-estimators of a higher dimensional parameter in a multivariate setting. The Edgeworth expansion of the normalized M-estimator was obtained by Bhattacharya and Ghosh (1978) and Bhattacharya (1985) under some smoothness conditions on the score function ψ. Here we follow the usual bootstrap procedure and select the bootstrap samples from the empirical distribution of the observations. Using the smoothness of ψ and a result of Babu and Singh (1984), we obtain an almost sure expansion of the distribution of the bootstrapped statistic along the lines of Bhattacharya and Ghosh (1978). Comparison of these two expansions establishes the superiority of the bootstrap approximation over the normal approximation. This extends a result (part (d) of Theorem 1) of Singh (1981) about the sample mean to M-estimators.

CHAPTER 1

1.1. Introduction.

Consider a simple linear regression model

(1.1)  Y_i = x_i β + ε_i,  i = 1,...,n,

where ε_1,...,ε_n are independent and identically distributed (i.i.d.) random variables (r.v.'s) with common distribution function (d.f.) F and where x_1,...,x_n are known nonrandom constants. Let ψ be a nondecreasing and bounded function from ℝ into ℝ. Define an estimator β̂_n of β to be a solution of the equation (in t)

(1.2)  Σ_{i=1}^n x_i ψ( Y_i − x_i t ) = 0.

Estimators {β̂_n} are known as M-estimators of β (Huber : 1973, 1981). Assume that

(1.3)  E ψ(ε_1) = 0.

Condition (1.3) ensures the asymptotic unbiasedness of β̂_n. For easy reference later on, let

(1.4)  a_n² = Σ_{i=1}^n x_i²  and  M_n = max{ |x_i| : 1 ≤ i ≤ n }.

The asymptotic normality of a_n(β̂_n − β) has been studied extensively in the literature under much more general settings : see Huber (1973, 1981) and the references therein. Relatively little is known about the Edgeworth expansions for the distributions of these estimators, especially when the score function ψ is not smooth.
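The defining equation (1.2) is easy to solve numerically: since ψ is nondecreasing, S_n(t) = Σ_i x_i ψ(Y_i − x_i t) is nonincreasing in t, so bisection on a wide bracket locates β̂_n. The sketch below is illustrative only; the choices ψ = arctan, the uniform-grid design, and the t-distributed errors are assumptions made here, not choices prescribed by the text.

```python
import numpy as np

def m_estimate(x, y, psi=np.arctan, lo=-1e3, hi=1e3, iters=200):
    """Solve sum_i x_i * psi(y_i - x_i * t) = 0 for t by bisection.

    S(t) is nonincreasing in t because psi is nondecreasing, so the
    sign of S(mid) tells us on which side of mid the root lies.
    """
    def S(t):
        return float(np.sum(x * psi(y - x * t)))

    a, b = lo, hi
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if S(mid) > 0:          # root lies to the right of mid
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)

# Simulated instance of model (1.1): known design, symmetric
# heavy-tailed errors, so E psi(eps) = 0 as in condition (1.3).
rng = np.random.default_rng(0)
n, beta = 200, 2.0
x = np.linspace(0.1, 1.0, n)            # known, nonrandom constants
eps = rng.standard_t(df=3, size=n)
y = x * beta + eps
beta_hat = m_estimate(x, y)
```

With 200 bisection steps the bracket collapses to machine precision, so Σ_i x_i ψ(Y_i − x_i β̂_n) is numerically zero at the returned value.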
Ringland (1983) considered the one-way layout model with p populations (p³/n → 0 as n → ∞) and obtained a two-term Edgeworth expansion for the distribution of studentized M-estimators. His method of proof was along the lines of Bhattacharya and Ghosh (1978). In particular, he required the score function ψ to be smooth and the design matrix elements to be 0's and 1's only. When p = 1, this forces x_i = 1 for all i, which is too restrictive in the regression context. For the one parameter case this paper gives an Edgeworth expansion of the distribution of a_n(β̂_n − β) when ψ is not necessarily smooth and the constants x_i satisfy only some mild growth conditions. The method of proof is completely different from that of Ringland (1983). Monotonicity of ψ enables one to obtain bounds on the probabilities involving β̂_n in terms of probabilities relating to sums of independent random variables. Thus one can apply the classical Edgeworth expansion techniques to these bounds to obtain an approximate expansion of the distribution of β̂_n. Then the smoothness of ψ and/or F is used to simplify these expressions into the stated forms.

BOOTSTRAPPING β̂_n : In order to describe the bootstrapping of β̂_n, let F_n be an estimator of the underlying error d.f. F based on the estimated residuals ε̂_i = Y_i − x_i β̂_n, i = 1,...,n. Also let ε*_1,...,ε*_n be a bootstrap sample from F_n, and define Y*_i = x_i β̂_n + ε*_i for i = 1,...,n. In accordance with (1.2), the bootstrap estimator β*_n of β is defined as a solution of the equation (in t)

(1.5)  Σ_{i=1}^n x_i ψ( Y*_i − x_i t ) = 0.

The role played by β in the original problem is to be played by β̂_n in the bootstrap setup. Accordingly one should have

(1.6)  E_n ψ(ε*_1) = E_n ψ( Y*_1 − x_1 β̂_n ) = 0,

where E_n denotes the expectation under F_n. In general, the choice of F_n that will satisfy condition (1.6) and at the same time be a good estimator of F seems to depend heavily on the forms of F and ψ.
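To make the resampling scheme concrete, here is one replicate of the procedure just described with F_n taken to be the plain e.d.f. of the residuals. The score ψ = arctan, the design, and the error law are illustrative assumptions, not choices made in the text. Note that for this F_n the quantity E_nψ(ε*_1) = n⁻¹ Σ_i ψ(ε̂_i) has no reason to vanish, which is exactly the failure of (1.6) examined in Section 1.3.

```python
import numpy as np

def m_estimate(x, y, psi=np.arctan, lo=-1e3, hi=1e3, iters=200):
    # Bisection for sum_i x_i * psi(y_i - x_i t) = 0; the sum is
    # nonincreasing in t because psi is nondecreasing.
    a, b = lo, hi
    for _ in range(iters):
        mid = 0.5 * (a + b)
        a, b = (mid, b) if np.sum(x * psi(y - x * mid)) > 0 else (a, mid)
    return 0.5 * (a + b)

rng = np.random.default_rng(1)
n, beta = 200, 2.0
x = np.linspace(0.1, 1.0, n)
y = x * beta + rng.standard_t(df=3, size=n)

beta_hat = m_estimate(x, y)
ehat = y - x * beta_hat                   # estimated residuals
en_psi = np.mean(np.arctan(ehat))         # E_n psi(eps*) under the plain e.d.f.

# One naive bootstrap replicate: eps* drawn i.i.d. from the e.d.f. of
# ehat, Y* = x * beta_hat + eps*, then refit to get beta*.
eps_star = ehat[rng.integers(0, n, size=n)]
y_star = x * beta_hat + eps_star
beta_star = m_estimate(x, y_star)
```

Here `en_psi` is typically small but nonzero; condition (1.6) demands an exact zero under F_n, which the plain e.d.f. does not deliver.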
In the case of bootstrapping the least squares estimator b_n of β, the corresponding requirement is E_n(ε*_1) = 0. Freedman (1981) considered the problem of bootstrapping b_n and ensured this condition by centering the estimated residuals ε̂_1,...,ε̂_n and then taking the bootstrap samples from the empirical distribution of these centered values. In fact, he pointed out that if one does not center the estimated residuals, the distribution of a_n(b*_n − b_n) does not converge to that of a_n(b_n − β). A similar remark applies in the present context as well. We give an example at the beginning of Section 1.3 where (1.6) does not hold and a_n(β*_n − β̂_n) does not have the same limiting distribution as the unbootstrapped statistic a_n(β̂_n − β). Therefore, one should consider only those F_n's for which condition (1.6) is satisfied. Clearly, (1.6) is not satisfied for general design points if β*_n is defined by (1.5) and F_n is taken to be the empirical distribution function (e.d.f.) of the estimated residuals ε̂_1,...,ε̂_n. Therefore, one has to look for appropriate modifications, if any, of the usual bootstrap procedure. In fact, there are at least two ways of attaining this. One is to change the resampling distribution and the other is to change the defining equation (1.5). As an example of the first possibility, F_n is taken to be a suitable weighted empirical distribution and β*_n is defined as a solution of (1.5) (see Section 1.3 for details). As an example of the other case, equation (1.5) is modified according to Shorack (1982) and β*_n is defined as a solution of the resulting equation (cf. equation (3.8) in Section 1.3). In the second case, it is shown that one can take F_n to be either the e.d.f. of the estimated residuals ε̂_1,...,ε̂_n or some smoother estimator of F, depending on the degree of smoothness of ψ and F.
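Freedman's centering step for the least squares case is easy to illustrate: without an intercept, the normal equation makes the residuals orthogonal to the design (Σ_i x_i ε̂_i = 0), but they need not average to zero, so one recenters them before resampling. A minimal numeric sketch, in which the design and the error law are arbitrary choices made here:

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta = 200, 2.0
x = np.linspace(0.1, 1.0, n)
y = x * beta + rng.standard_t(df=3, size=n)

# Least squares fit through the origin.
b_ls = np.dot(x, y) / np.dot(x, x)
ehat = y - x * b_ls

# Residuals are orthogonal to x, but their plain mean is generally
# nonzero; centering restores E_n(eps*) = 0 exactly.
centered = ehat - ehat.mean()

# Resample the centered residuals and form one bootstrap replicate.
eps_star = centered[rng.integers(0, n, size=n)]
y_star = x * b_ls + eps_star
b_star = np.dot(x, y_star) / np.dot(x, x)
```

The design choice is the whole point: resampling `ehat` directly would give bootstrap errors with a nonzero mean under F_n, which is what destroys the limit distribution of a_n(b*_n − b_n).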
With either modification the resulting bootstrap procedure corrects one term in the Edgeworth expansion of the distribution of the normalized β̂_n, and the rate of bootstrap approximation becomes o(M_n/a_n). Finally, in the case when the error d.f. F is smooth and symmetric and the score function ψ is odd, the rate of bootstrap approximation corresponding to a symmetrized kernel density estimator of F is shown to be o((M_n/a_n)²). This result is similar to a result of Babu and Singh (1984) about the sample mean, where the resampling distribution is taken to be the symmetrized e.d.f. of the observations centered about the sample mean. In a nutshell, in all the cases considered here the bootstrap approximation is shown to have a better rate than the normal approximation.

The layout of this chapter is as follows. Section 1.2 contains theorems giving the Edgeworth expansions for β̂_n. Section 1.3 deals with the bootstrap approximations to the distribution of β̂_n, and Section 1.4 contains the proofs of the results stated in Sections 1.2 and 1.3.

1.2. Edgeworth expansions for β̂_n.

This section gives the Edgeworth expansion for the distribution of the normalized β̂_n under some assumptions on ψ, F and the x_i's. Parts of these assumptions are on the underlying model (1.1) and will be assumed throughout the paper without explicit reference. The rest of the assumptions are required for the validity of some results in this section. Whenever used, one or more of these will always be mentioned in the statement of the corresponding assertion. Before stating the assumptions, we need to fix some notation. For x real, write

μ(x) = E ψ(ε_1 − x),  V(x) = σ²(x) = Var ψ(ε_1 − x),
μ_3(x) = E( ψ(ε_1 − x) − μ(x) )³  and  μ_4(x) = E( ψ(ε_1 − x) − μ(x) )⁴.

Since ψ is bounded, all these quantities are well defined. For any real valued function h defined on ℝ, let h^(i) denote the i-th derivative of h whenever it exists and ‖h‖ denote the supremum norm of h.
For convenience, h′, h″, h‴ will replace h^(1), h^(2) and h^(3) respectively. Define

(2.1)  A = −μ′(0)/σ(0),  d_{1n} = Σ_{i=1}^n x_i³/a_n³,  d_{2n} = Σ_{i=1}^n x_i⁴/a_n⁴,
       d_{3n} = Σ_{i=1}^n |x_i|³/a_n³  and  d_{4n} = max{ d_{3n}², d_{2n} }.

Next recall the definitions of M_n and a_n from (1.4). Note that d_{3n} = O(M_n/a_n) and d_{4n} = O(M_n²/a_n²). Let b_n = log a_n (whenever it is defined). For c > 0, define the set A_n(c) = { i : 1 ≤ i ≤ n, |x_i| > c·M_n } and let k_n(c) denote the number of elements of A_n(c). In addition to (1.3), assume that conditions (A.1)-(A.3) below are satisfied by the underlying model (1.1) throughout this chapter.

(A.1) : a_n → ∞ as n → ∞.
(A.2) : A = −μ′(0)/σ(0) > 0 (whenever it exists).
(A.3) : There exists a constant c, 0 < c < 1, such that b_n = o( k_n(c) ) as n → ∞.

Next, we list the remaining assumptions used in this section.

(A.4) : b_n⁴ M_n = o(a_n) as n → ∞.
(A.5) : b_n⁶ M_n² = o(a_n²) as n → ∞.
(A.6) : There exist constants M > 0, δ > 0 and 0 < q < 1 such that
        sup{ |E exp( it ψ(ε_1 − x) )| : |x| < δ and |t| > M } < q.

REMARK 2.1 : The first two assumptions are typical for proving the asymptotic normality of β̂_n and occur frequently in the literature (see, for example, Huber : 1973, 1981). Assumption (A.3) is rather uncommon and deserves some clarification. For obtaining the Edgeworth expansions of normalized sums of independent r.v.'s, one usually assumes that the absolute values of the characteristic functions of all the summands are uniformly bounded away from 1 outside every neighbourhood of zero. But in the present context, this would require min{ |x_i| : i ≥ 1 } > c for some constant c > 0, which would rule out many frequently used designs. Condition (A.3) relaxes this requirement on the x_i's. Another typical assumption made for proving the asymptotic normality of β̂_n is that M_n/a_n = o(1) as n → ∞. Conditions (A.4) and (A.5) are somewhat stronger versions of this and are required for obtaining the Edgeworth expansions up to the desired orders.
Note that both conditions are trivially satisfied for bounded x_i's as well as for x_i = i. Condition (A.6) is actually a modified Cramér condition (see Bhattacharya and Rao (1976), page 207, for the statement of Cramér's condition) and will be used for obtaining higher order expansions. See Remark 2.4 and the Proposition following it for a sufficient condition.

Before stating the theorems, we put down the explicit form of the Edgeworth expansions. To that effect, write H_i for the Hermite polynomial of degree i, i ≥ 1 (see Feller (1966), page 514). Let φ and Φ respectively denote the density and the d.f. of a standard normal r.v.. For Theorems 2.1 and 2.2, define

H_{1n}(x) = Φ(x) − d_{1n} [ ( μ″(0)/σ(0) − μ′(0)V′(0)/σ³(0) ) x²/2A² + ( μ_3(0)/6σ³(0) ) H_2(x) ] φ(x),

H_{2n}(x) = H_{1n}(x) − φ(x) [ d_{2n} { ( μ‴(0)/σ(0) + 3A V″(0)/2V(0) ) x³/6A³
    + ( μ_3′(0)/6Aσ³(0) ) x H_2(x) + ( ( μ_4(0) − 3σ⁴(0) )/24σ⁴(0) ) H_3(x) }
  + d_{1n}² { ( μ″(0)/σ(0) + A V′(0)/V(0) ) ( μ_3(0)/12A²σ³(0) ) x²H_3(x)
    + ( μ_3²(0)/72σ⁶(0) ) H_5(x) + ( μ″(0)/σ(0) + A V′(0)/V(0) )² x⁵/8A⁴
    − ( μ″(0)V′(0)/σ³(0) + 3A V′²(0)/2V²(0) ) x³/4A²
    − ( μ_3(0)V′(0)/4Aσ⁵(0) ) x H_2(x) } ].

Under the hypotheses of the following theorems, the functions σ, V, μ_3, μ_4 have sufficiently many derivatives so that H_{1n} and H_{2n} are well defined. Now we state the theorems of this section.

THEOREM 2.1 : Suppose that ψ has a uniformly continuous, bounded second derivative.
(a) If ψ(ε_1) is nonlattice and condition (A.4) holds, then
    sup_x | P( a_n(β̂_n − β) ≤ x ) − H_{1n}(Ax) | = o( M_n/a_n ).
(b) Suppose that ψ has a uniformly continuous, bounded third derivative. If, in addition, conditions (A.5) and (A.6) hold, then
    sup_x | P( a_n(β̂_n − β) ≤ x ) − H_{2n}(Ax) | = o( d_{4n} ) = o( M_n²/a_n² ),
where d_{4n} is as defined in (2.1).

Next we state a version of Theorem 2.1 under the corresponding regularity conditions on F, without assuming the differentiability of ψ.

THEOREM 2.2 : Assume that F has a uniformly continuous density f.
(a) If ψ(ε_1) is nonlattice and condition (A.4) holds, then
    sup_x | P( a_n(β̂_n − β) ≤ x ) − H_{1n}(Ax) | = o( M_n/a_n ).
(b) Suppose that f has a uniformly continuous, bounded second derivative. If, in addition, conditions (A.5) and (A.6) hold, then
    sup_x | P( a_n(β̂_n − β) ≤ x ) − H_{2n}(Ax) | = o( d_{4n} ) = o( M_n²/a_n² ).

PROPOSITION : … Then (A.6) holds.

1.3. Bootstrap Approximations.

We start this section with the following example. It shows that if condition (1.6) does not hold for some choice of the resampling distribution F_n, then the corresponding bootstrap procedure cannot be even first order correct.

EXAMPLE : In addition to being a nondecreasing, bounded, real valued function, assume that ψ has a bounded, uniformly continuous second derivative (e.g. one may take ψ(x) = tan⁻¹(x)). Also suppose that F and ψ jointly satisfy the hypotheses of Theorem 2.1(a) and E ψ(ε_1) = 0. For the sake of clarity in the resulting expressions, we take x_i = 0 or 1 according as i is even or odd. Note that for this choice of x_i's, a_n² = O(n) and b_n = O(log n). By Theorem 2.1, it follows that

(3.1)  a_n(β̂_n − β) converges in distribution to N( 0, A⁻² ),

where A = Eψ′(ε_1)/[Eψ²(ε_1)]^{1/2} as in (2.1). Next consider bootstrapping β̂_n. Let F_n denote the empirical distribution of the estimated residuals ε̂_i = Y_i − x_iβ̂_n, i = 1,...,n. Take an independent sample ε*_1,...,ε*_n from F_n. Note that in this case E_n(ψ(ε*_1)) is not necessarily zero. Hence, condition (1.6) does not hold. For t ∈ ℝ, write

S*_n(t) = Σ_{i=1}^n x_i ψ( ε*_i − x_i t ),  τ*_n(t) = the standard deviation of S*_n(t) under F_n.

By the monotonicity of ψ and the definition of β*_n, it follows that for all t ∈ ℝ,

(3.2)  P_n( S*_n(t) < 0 ) ≤ P_n( β*_n − β̂_n ≤ t ) ≤ P_n( S*_n(t) ≤ 0 ),

where P_n denotes the bootstrap probability under F_n. Now, using the Berry-Esseen theorem for independent random variables (cf.
Theorem 12.4 of Bhattacharya and Rao (1976)), one can conclude that almost surely, for all t ∈ ℝ,

(3.3)  sup_y | P_n( ( S*_n(t) − E_n S*_n(t) )/τ*_n(t) ≤ y ) − Φ(y) |
       ≤ 2.75 { Σ_{i=1}^n |x_i|³ E_n| ψ(ε*_i − x_i t) − E_n ψ(ε*_i − x_i t) |³ } / [τ*_n(t)]³.

Here, as before, E_n denotes the expectation under F_n and Φ denotes the distribution function of N(0, 1). Next we state the following results without proofs. Result 3.1 is derived in the proof of Theorem 2.1 below (see equation (4.4)) and Result 3.2 is a consequence of Lemma 4.2 of Section 1.4 below.

RESULT 3.1 : Let ψ have a bounded second derivative and β̂_n be defined by equation (1.2). Then there exists N > 1 such that for all n > N, P( a_n|β̂_n − β| > b_n ) < a_n⁻³.

RESULT 3.2 : Let F_n denote the empirical distribution of the estimated residuals ε̂_i, i = 1,...,n. Then for every M > 0 and every k ≥ 1,
    sup{ | E_n[ψ(ε*_1 − x)]^k − E[ψ(ε_1 − x)]^k | : |x| ≤ M } = o(1), a.s..

By Result 3.2 and the uniform continuity of ψ, it follows that for all x with |x| ≤ log n,

(3.4)  | [τ*_n(x/a_n)]² − [τ*_n(0)]² | = o(n), a.s..

Hence, from (3.2), (3.3) and (3.4), it follows that

(3.5)  sup_{|x| ≤ log n} | P_n( a_n(β*_n − β̂_n) ≤ x ) − Φ( −[E_n S*_n(x/a_n)]/τ*_n(x/a_n) ) | = o( n^{−1/2} ), a.s..

Next we simplify −[E_n S*_n(x/a_n)]/τ*_n(x/a_n). Since β̂_n satisfies (1.2), a Taylor expansion gives

0 = Σ_{i=1}^n x_i ψ( ε_i − x_i(β̂_n − β) )
  = Σ_{i=1}^n x_i ψ(ε_i) − (β̂_n − β) Σ_{i=1}^n x_i² ψ′(ε_i) + (β̂_n − β)² Σ_{i=1}^n x_i³ ψ″(ξ_i)/2,

where ξ_i is a point between ε_i and ε̂_i = Y_i − x_iβ̂_n, 1 ≤ i ≤ n. Now use Result 3.1 and the Law of the Iterated Logarithm (LIL) to conclude that

(3.6)  a_n(β̂_n − β) Eψ′(ε_1) = Σ_{i=1}^n x_i ψ(ε_i)/a_n + o(1) a.s..
Therefore, by (3.6) and the Result 3.2, one has —[EnSI:(x/an)]/r:(x/an) = Ax+ 2?=1(Xi—1)w(ei)/ano(0) + Rn(x) 16 where sup {an(X)| : |x| S log 11} = 0(1) a.s.. Recall that 02(0) = Ew2(cl). n Define Bn = 2 i= (xi — 1)Ib(ri)/ano(0). Then Bn has a limiting nondegenarate normal distribution ( viz. N( 0, 1)). Also, from (3.5), it follows that Iiniggn I Pn(an(fl;-fin)5x) — (Ax+Bn)|=o(1) 3,, Comparison of this with (3.1) shows that the usual bootstrap procedure fails to capture the limiting distribution of the unbootstrapped statistic and as a result, is not even first order correct. As indicated in the introduction and implied by the above example, we shall confine ourselves only to those cases in which condition (1.6) holds. First we consider a situation where (1.6) is ensured by changing the resampling distribution. Weighted Empirical Bootstrap Assume that xi's are either all nonnegative or all nonpositive. For n 2 1, 11 write pn = 2 i=1 1xil. Let, F1n be the d.f. putting mass lin/pn at fi’ i=1,..., 11. Take the resampling distribution Fn to be F and draw the la * * :1: * bootstrap samples (1,...,cn from Fn . With Yi 2 xi 3n + (i, i = 1 ,..., 11, define * the bootstrap estimator fln of 6 as a solution of (1.5). Note that for this choice of * __1 Il ' Fn’ Enwcl) = n 2 i=1]in (b(Yi —xiBn). Hence (1.2) and the fact that all xi 3 are of the same sign jointly imply that :1: . _1 n Enw(61) = (Sign of XI) pI1 2i_1xi flYi—Xifin) = 0. 17 Hence, in this case ( 1.6) holds. Before stating the theorems we introduce some more notation. For any resampling distribution Fn’ write m (x) — E I/(X—x) w (x) — s2(x) — E w2(c*—x)—m2(x) n _ n 1 ’ n " n — n 1 n (3.7) mi,n(x) = En(1b(6:— x) — mn(x))i , i = 3, 4 and An = — m1'1(0)/sn(0). 
Next define

H*_{1n}(x) = Φ(x) − d_{1n} [ ( m_n″(0)/s_n(0) − m_n′(0)W_n′(0)/s_n³(0) ) x²/2A_n² + ( m_{3,n}(0)/6s_n³(0) ) H_2(x) ] φ(x),

H*_{2n}(x) = H*_{1n}(x) − φ(x) [ d_{2n} { ( m_n‴(0)/s_n(0) + 3A_n W_n″(0)/2W_n(0) ) x³/6A_n³
    + ( m_{3,n}′(0)/6A_n s_n³(0) ) x H_2(x) + ( ( m_{4,n}(0) − 3s_n⁴(0) )/24s_n⁴(0) ) H_3(x) }
  + d_{1n}² { ( m_n″(0)/s_n(0) + A_n W_n′(0)/W_n(0) ) ( m_{3,n}(0)/12A_n² s_n³(0) ) x²H_3(x)
    + ( m_{3,n}²(0)/72s_n⁶(0) ) H_5(x) + ( m_n″(0)/s_n(0) + A_n W_n′(0)/W_n(0) )² x⁵/8A_n⁴
    − ( m_n″(0)W_n′(0)/s_n³(0) + 3A_n W_n′²(0)/2W_n²(0) ) x³/4A_n²
    − ( m_{3,n}(0)W_n′(0)/4A_n s_n⁵(0) ) x H_2(x) } ].

REMARK 3.1 : In the statements of Theorems 3.1-3.4, H*_{1n} and H*_{2n} are defined by the same expressions, but in each case the functions m_n, W_n, m_{3,n} and m_{4,n} are to be defined using the corresponding resampling distribution F_n. In the following, let P_n denote the bootstrap probability under F_n. We are now ready to state

THEOREM 3.1 : Assume that the hypotheses of Theorem 2.1(a) hold and that for every c > 0, Σ_{n=1}^∞ exp( −c p_n²/a_n² ) < ∞. If β*_n is defined as a solution of (1.5) with F_n = F_{1n}, then
(a) sup_x | P_n( a_n(β*_n − β̂_n) ≤ x ) − H*_{1n}(A_n x) | = o( M_n/a_n ) a.s..
(b) sup_x | P_n( A_n a_n(β*_n − β̂_n) ≤ x ) − P( A a_n(β̂_n − β) ≤ x ) | = o( M_n/a_n ) a.s.,
where A and A_n are as defined in (2.1) and (3.7) respectively.

Modified Scores Bootstrap :

Now we consider bootstrapping β̂_n using Shorack's modification. For any resampling distribution F_n, define β*_n as a solution t of

(3.8)  Σ_{i=1}^n x_i { ψ( Y*_i − x_i t ) − E_nψ(ε*_1) } = 0.

Clearly, with this modification, E_n{ ψ(Y*_1 − x_1β̂_n) − E_nψ(ε*_1) } = 0 for any resampling distribution F_n and any x_i's. Let G_n denote the empirical distribution of the estimated residuals ε̂_1,...,ε̂_n. If ψ is smooth, one can take F_n = G_n and still have the Edgeworth expansion for β*_n. More precisely, the following analog of Theorem 3.1 is true.

THEOREM 3.2 : Suppose that the hypotheses of Theorem 2.1(a) hold and β*_n is defined as a solution of (3.8) with F_n = G_n. Then,
(a) sup_x | P_n( a_n(β*_n − β̂_n) ≤ x ) − H*_{1n}(A_n x) | = o( M_n/a_n ) a.s..
(b) sup_x | P_n( A_n a_n(β*_n − β̂_n) ≤ x ) − P( A a_n(β̂_n − β) ≤ x ) | = o( M_n/a_n ) a.s..
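The modified scores bootstrap can be sketched the same way: draw from the e.d.f. G_n of the residuals, compute c_n = E_nψ(ε*_1) = n⁻¹ Σ_j ψ(ε̂_j), and solve the recentred equation (3.8). Subtracting the constant c_n from every score leaves the left side monotone in t, so the same bisection applies. The score ψ = arctan, design and error law are again illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, beta = 200, 2.0
x = np.linspace(0.1, 1.0, n)
y = x * beta + rng.standard_t(df=3, size=n)

def solve_monotone(S, lo=-1e3, hi=1e3, iters=200):
    # Bisection for S(t) = 0 when S is nonincreasing in t.
    a, b = lo, hi
    for _ in range(iters):
        mid = 0.5 * (a + b)
        a, b = (mid, b) if S(mid) > 0 else (a, mid)
    return 0.5 * (a + b)

beta_hat = solve_monotone(lambda t: np.sum(x * np.arctan(y - x * t)))
ehat = y - x * beta_hat

# Resample from G_n, the plain e.d.f. of the residuals ...
c_n = np.mean(np.arctan(ehat))            # E_n psi(eps*) under G_n
eps_star = ehat[rng.integers(0, n, size=n)]
y_star = x * beta_hat + eps_star

# ... and solve the recentred defining equation (3.8).
def S_star(t):
    return float(np.sum(x * (np.arctan(y_star - x * t) - c_n)))

beta_star = solve_monotone(S_star)
```

By construction E_n{ψ(Y*_1 − x_1β̂_n) − c_n} = 0, the analogue of condition (1.6), whatever the design points are.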
Now consider the case when ψ is not necessarily smooth and the differentiability conditions are imposed solely on F. Here, instead of taking the samples from G_n, one should take the bootstrap samples from some smoother estimator of F to guarantee the validity of the Edgeworth expansion for the bootstrapped estimator β*_n. Let k be a known probability density on the real line and {e_n} be a sequence of positive real numbers, e_n → 0 as n → ∞. Define

(3.9)  g_n(x) = e_n^{−1} ∫ k( (x − y)/e_n ) dG_n(y).

Now take F_n to be the d.f. corresponding to g_n. In this case the properties of F_n depend largely on the assumptions made about k and {e_n}. For r = 1, 2, let C(r) refer to the following conditions on k and {e_n} :

(i) For every c > 0, Σ_{n=1}^∞ exp( −c n e_n² ) < ∞,
(ii) ∫ |u| k(u) du < ∞, and
(iii) For s = 0, 1, ..., (r+1), k^(s) is of bounded variation.

THEOREM 3.3 : Assume that the hypotheses of Theorem 2.2(a) hold and that β*_n is defined by (3.8), taking F_n to be the d.f. corresponding to the density g_n. If k and {e_n} satisfy condition C(1), then
(a) sup_x | P_n( a_n(β*_n − β̂_n) ≤ x ) − H*_{1n}(A_n x) | = o( M_n/a_n ) a.s..
(b) sup_x | P_n( A_n a_n(β*_n − β̂_n) ≤ x ) − P( A a_n(β̂_n − β) ≤ x ) | = o( M_n/a_n ) a.s..

Theorems 3.1-3.3 show that appropriate bootstrap estimators correct the terms of order O(d_{1n}) (see equation (2.1) of Section 1.2 for the definition of d_{jn}, 1 ≤ j ≤ 4) in the Edgeworth expansion for the distribution of the normalized β̂_n and thus attain a higher rate than the normal approximation. In fact, under some symmetry assumptions on the model, the accuracy of the bootstrap procedure can be increased considerably with a minor modification. Assume that the score function ψ is odd and the underlying d.f. F is symmetric about zero (i.e. F(−x) + F(x) = 1). Under these conditions all the terms of order O(d_{3n}) in the Edgeworth expansion of β̂_n vanish.
As a result, the rate of normal approximation is typically of the order of O (d4n)' In such situations if one draws the bootstrap samples from an asymmetric resampling distribution Fn’ the terms of order O(d3n) do not necessarily vanish from the corresponding expansion for fl; Therefore, the rate of bootstrap approximation can at the best be 0( d3n ) which is much worse than the normal 21 approximation. In particular this implies that the ordinary bootstrap procedure fails in such situations. However, one can overcome this by changing the resampling distribution to a suitable symmetric distribution. Only the case with smooth F is considered below. Let gIl be as in (3.8). Since gn(x) may not be symmetric, we symmetrize gn and take the estimating density at x to be fn(x) = [gn(x) + gn(—x)] / 2. Now choose Fn to be the d.f. corresponding to tn. Note that for this choice of Fm, (3.8) reduces to (1.5) and the corresponding bootstrap estimators are the same. THEOREM 3.4 : Assume that the hypotheses of Theorem 2.2(b) hold and k and {en} satisfy condition C(2). Then for odd Ib, symmetric F and F11 equal to the d. f. corresponding to in, (a) Sip I Pn( an( a; —pn ) g x ) —H;n( Anx) I = o ((1411) = 0(MI21/a121) a.s.. (b) 82" I Pn( Anan< a; 43,) s x) — P< Aan( a, — A) s x )I = 0 (A4,) _ 2 2 — o ( Mn/an) a.s.. * In (a) and (b), fin is defined as a solution of (1.5) or (3.8). 22 1.4. Proofs. We start by stating Esseen's lemma (Lemma 2 of Feller (1966), page 512). LEMMA 4.1 (Esseen) : Let F be a probability distribution with vanishing expectation and characteristic function (p. Suppose that G is a function on the real line such that F — G vanishes at i 00 and G has a derivative g with | g|$ m. Finally, suppose that g has a continuously differentiable Fourier transform 7 such that 7(0) = 1 and 7'(0) = 0. 
hen for all real x and a > 0, | F(X) -G(X) I S l_,a,’{ | s0(t)-7(t) |/(W|t|)} dt + 24m/ M- Repeated use of this lemma with prOper choice of a and G will give the expansions upto the desired order. For the sake of completeness, we include here an inequality due to Hoeffding ( Theorem 2 of Hoeffding (1963)). HOEFFDING'S INEQUALITY : If X1, X2, ..., Xn are independent r.v.'s with ai SXiSbi (1 sign), thenforanyt >0, P( x — ,1 > t ) _<_ exp( —2n2t2/ 211(1), — 392) _1 n whereK=n 2 Xi and u=E(X). let 1=1 Before proving Theorem 2.1 we need to have some more notation. For t 6 1R, Sn(t) =2 21:1 xiw (Yi — xit), un(x) = E Sn( 5 + x/an), Vn(x) = 7121(x)= Var Sn( [3 + x/an), E4,n(x) = 2 21:1 x? 04(x xi/an), .th . ”Ln (x) =1 central moment of Sn ( fl + x/an), 1 = 3, 4. 23 pnpr) = E exp ( it I Sn( fl + x/an) — un(x))), vn(x,t) = [log Ipn(x,t)] + Vn(x) p2/2, w (x,t) = E exp ( it w (61 — x)). Next for real numbers x and y, define Klnpr) = e (y) — ( p3,,,Ix)/ ariiIx) ) H2(y) 4) (y). Kgnpr) = K1n(x,y)-— ¢(y) IIIp4,,,Ix) — semen/zeta» H3(y) + (”3,1100 / vzrfiIx) ) H5(y) I, plnIxx) = [ 1 e I p3,,Ix)/ 6rfiIx) ) (it)3 I exp I — t2/ 2 ) 72n(x,t) = 71n(x,t) + I ((#4,n(x) — 3E4,n(x))/24T:(x)) (it)4 + I pine) / 72rfiIx) ) (it)6 I exp I — 9/2 ). In the proofs that follow, we shall use D > 0 as a generic constant, independent of n, x, y etc. PROOF OF THEOREM 2.1 : Proofs of parts (a) and (b) follow more or less the same route . First we outline the arguments common to both the parts and then complete the remaining steps in the proof of each part separately. Note that boundedness of Ib, ill" and continuity of II)" guarantee that Jw'(y)dy < co and Ib' is uniformly continuous. This in turn implies that W is bounded. Therefore, the function u is twice continuously differentiable with a bounded second derivative. 
Hence there exist constants η_1 > 0 and c_1 > 0 such that for |x| < η_1,

(4.1) | u(x) | > c_1 |x|.

This inequality will be used to obtain a bound on the probability of the deviation of β̂_n from β. Next observe that the monotonicity of ψ implies S_n(t) is nonincreasing in t for every n ≥ 1. This and the definition of β̂_n give

(4.2) P( S_n( β + x/a_n ) < 0 ) ≤ P( a_n( β̂_n − β ) ≤ x ) ≤ P( S_n( β + x/a_n ) ≤ 0 ),
P( S_n( β + x/a_n ) > 0 ) ≤ P( a_n( β̂_n − β ) ≥ x ) ≤ P( S_n( β + x/a_n ) ≥ 0 ).

By Hoeffding's inequality, (4.1) and (4.2), there exists a constant C > 0 such that for all 0 < u < η_1 a_n,

(4.3) P( | β̂_n − β | > u ) ≤ 2 exp( −C u^2 a_n^2 ).

Now take u = b_n/a_n ( recall that b_n = log a_n ) in (4.3) to get an N > 1 such that for all n > N,

(4.4) P( a_n | β̂_n − β | > b_n ) ≤ a_n^{−3}.

Therefore, it is enough to consider the expansion of P( a_n( β̂_n − β ) ≤ x ) for |x| ≤ b_n. In view of (4.2), (4.3) and the form of H_1n(x), it is enough to find an expansion of P( S_n( β + x/a_n ) ≤ 0 ) that holds uniformly for |x| ≤ b_n, and to appraise sup { P( S_n( β + x/a_n ) = 0 ) : |x| ≤ b_n }.

PROOF OF (a) : Given an η > 0, choose an integer N and a constant b > 0 large enough such that for all y and |x| ≤ b_n, 24 | K_1n(x,y) | < bη. This is possible since both φ and its derivative are bounded. Take a = b a_n/M_n in Lemma 4.1. Then for all y in ℝ and for all x with |x| ≤ b_n,

| P( ( S_n( β + x/a_n ) − u_n(x) ) / τ_n(x) ≤ y ) − K_1n(x,y) |

(4.5) ≤ ∫_{−a}^{a} { | ρ_n( x, t/τ_n(x) ) − γ_1n(x,t) | / |t| } dt + η M_n/a_n.

As is customary, the integral on the R.H.S. is broken into two parts; one ranging over |t| ≤ δ a_n/M_n ( call it I ) and the other over δ a_n/M_n < |t| < a ( call it II ), for some δ > 0 which will be chosen later. Since ψ is bounded, continuous and nondecreasing, w is uniformly continuous. Therefore, for any D > 0, sup{ | w(x,t) − w(0,t) | : |t| ≤ D } → 0 as x → 0. Since w(0,0) = 1 and w is continuous, there exists δ > 0 such that for |x| < 2δ and |t| < 2δ,

(4.6) | w(x,t) | > .5.

This guarantees that ν_n( x, t/τ_n(x) ) is well defined for large n when |x| ≤ b_n and |t| ≤ δ a_n/M_n.
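The monotonicity of t ↦ S_n(t) exploited in (4.2) also gives a practical way to compute the M-estimator: bisect on the sign of the score. The sketch below is a minimal illustration under assumptions not taken from the text (ψ = tanh, simulated data, a fixed starting bracket).

```python
import numpy as np

def S_n(t, x, y, psi):
    """Score process S_n(t) = sum_i x_i * psi(Y_i - x_i * t).

    For nondecreasing psi this map is nonincreasing in t (the fact used in
    (4.2)): each summand has derivative -x_i**2 * psi'(Y_i - x_i t) <= 0."""
    return float(np.sum(x * psi(y - x * t)))

def m_estimate(x, y, psi, lo=-10.0, hi=10.0, iters=60):
    """Locate the zero-crossing of the monotone map t -> S_n(t) by bisection.

    Illustrative sketch: the bracket [lo, hi] is assumed wide enough that
    S_n(lo) > 0 > S_n(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if S_n(mid, x, y, psi) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(2)
x = rng.uniform(0.5, 1.5, size=100)            # known regression constants
beta = 2.0
y = x * beta + rng.standard_t(df=3, size=100)  # heavy-tailed errors
beta_hat = m_estimate(x, y, np.tanh)           # bounded, nondecreasing score
print(beta_hat)
```

Monotonicity makes the bracketing argument of (4.2) and this numerical scheme two views of the same fact: the event {a_n(β̂_n − β) ≤ x} is sandwiched between sign events of S_n.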
Since ψ is bounded, ν_n( x, t/τ_n(x) ) is infinitely differentiable in t over |t| ≤ δ a_n/M_n. Next note that for any complex number u with |u| < 1,

log(1 + u) = u − u^2 r(u), where | r(u) | < 1/(1 − δ_1) for all |u| < δ_1 < 1.

Therefore, Taylor's expansion of ν_n( x, t/τ_n(x) ) around t = 0, the continuity of the functions V(x), μ_3(x), μ_4(x) and the above result together yield ( possibly with a smaller δ > 0 )

(4.7) | ν_n( x, t/τ_n(x) ) − (it)^3 μ_3,n(x)/6τ_n^3(x) | < D |t|^4 ( Σ_j x_j^4 / τ_n^4(x) )

for all |x| ≤ b_n, |t| < δ a_n/M_n and large n. Without loss of generality we may suppose that for the same set of values of x, t and n,

(4.8) | ν_n( x, t/τ_n(x) ) | ≤ t^2/4, | (it)^3 μ_3,n(x)/6τ_n^3(x) | ≤ t^2/4.

Note that for all complex u and z,

(4.9) | exp(u) − 1 − z | ≤ ( |u − z| + |z|^2 ) exp(γ),
| exp(u) − 1 − z − z^2/2 | ≤ ( |u − z| + |z|^3 ) exp(γ), γ ≥ max( |u|, |z| ).

Now choose δ > 0 such that (4.6) − (4.8) hold simultaneously. For this choice of δ, one may use the bounds (4.7) − (4.9) to conclude that, uniformly in |x| ≤ b_n and for large n,

I = ∫_{|t| ≤ δ a_n/M_n} { | ρ_n( x, t/τ_n(x) ) − γ_1n(x,t) | / |t| } dt

= ∫_{|t| ≤ δ a_n/M_n} |t|^{-1} | exp( ν_n( x, t/τ_n(x) ) ) − 1 − (it)^3 μ_3,n(x)/6τ_n^3(x) | exp( −t^2/2 ) dt

≤ D ∫ [ |t|^3 ( Σ_j x_j^4 )/τ_n^4(x) + |t|^5 ( Σ_j x_j^3 )^2/τ_n^6(x) ] exp( −t^2/4 ) dt

(4.10) ≤ D ( M_n/a_n )^2.

This takes care of the first part of the integral. Now we estimate II. Note that for real numbers x and t, the differentiability of F gives | w(x,t) − w(0,t) | ≤ D |tx|. By the nonlatticeness of ψ(ε_1) and the above inequality, it follows that there exist 0 < q < 1 and N > 1 ( both depending on η through 'a' of (4.5) ) such that for all n > N,

(4.11) sup { | w( x x_j/a_n , t x_j/τ_n(x) ) | : j ∈ A_n(C), |x| ≤ b_n, δ a_n/M_n ≤ |t| ≤ b a_n/M_n } < q.

Hence, for n > N and |x| ≤ b_n,

II = ∫_{δ a_n/M_n ≤ |t| ≤ b a_n/M_n} { | ρ_n( x, t/τ_n(x) ) − γ_1n(x,t) | / |t| } dt

≤ D q^{k_n(C)} + ∫_{δ a_n/M_n ≤ |t|} { | γ_1n(x,t) | / |t| } dt

(4.12) ≤ D ( M_n/a_n )^2.
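The payoff of a one-term correction of the form K_1n can be seen in a toy Monte Carlo experiment. The sketch below applies the correction Φ(y) − (skew/6√n) H_2(y) φ(y) to a standardized i.i.d. sum of centered exponentials; the unweighted summands and all the numbers are illustrative assumptions, whereas the text's expansion is for the weighted sum S_n.

```python
import math
import numpy as np

def Phi(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def phi(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def edgeworth_cdf(y, skew, n):
    # One-term Edgeworth correction, with H_2(y) = y^2 - 1
    return Phi(y) - skew / (6.0 * math.sqrt(n)) * (y * y - 1.0) * phi(y)

rng = np.random.default_rng(3)
n, reps = 20, 200000
# standardized sums of Exp(1) - 1 summands: mean 0, variance 1, skewness 2
z = (rng.exponential(size=(reps, n)).sum(axis=1) - n) / math.sqrt(n)
ys = np.linspace(-2.0, 2.0, 9)
emp = np.array([(z <= y).mean() for y in ys])
err_normal = max(abs(emp[i] - Phi(y)) for i, y in enumerate(ys))
err_edgeworth = max(abs(emp[i] - edgeworth_cdf(y, 2.0, n)) for i, y in enumerate(ys))
print(err_normal, err_edgeworth)  # the corrected approximation has smaller error
```

The normal approximation error here is of order n^{-1/2} (driven by the skewness term), while the corrected one is of order n^{-1}, mirroring the d_3n versus d_4n distinction in the text.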
By (4.5), (4.10) and (4.12) it follows that given an r; > 0, there exists N > 1 and a D>0 ( both depending on F only through the nonlatticeness of w(cl) and the values of the function u, 0, p3 and their derivatives at zero ) such that for all n > N, 811p sup IP( (Snw + X/an) -un(X))/Tn(X) .<. y ) - K1n( X , y) | IXIS bn y )2 g D ( Mn/an + 17 Mn/an. 28 Since 17 > 0 is arbitrary, this gives the Edgeworth expansion of normalised Sn(,6+ x/an) with a remainder term of the order of o (d3n) uniformly in |x| 5 bn. The smoothness conditions on pensures that the functions II, V and p3 have a second derivative and u" is uniformly continuous. Taking Taylor's expansions ( of the terms involving x ) around x = 0, one gets, un(x)/rn(x) = xp'IOI/eIO) + d1n(u"(0)/o(0) — u'(0)V'(0)/03(0))x2/2 + Q1n(x)a M3,n(x)/Tg(x) = I 143,,,(0)/03(0) ) (dln) + aanx), where the remainder terms satisfy I Qme) I .<. D x2 (d3n) sup{ I 14"(y) —p"I0) I = lyl s enMn/en I. (4.13) I Q2n(X) I s D b, IM,,/e,,)2 for all x with |x| S bn' Here the constant D depends only on the values of functions II, V, m3 and their derivatives at 0. Using the above expansions, uniform continuity of u" and (4.2) one can conclude that, IxI‘ipb I P( an( 3,, — 4) s x) — H1n(Ax) I = o I (MD/ a.) ). This together with (4.4) completes the proof of part (a). PROOF OF PART (b) : The steps in the proof are similar to those in part (a). We 29 will mention only the major differences here. Given 1) > 0, choose b > 0 large enough such that for all y in IR and |x| 5 bn’ | 24 Kén(x , y) | < br). Take a = b d 4n in the Esseen's lemma and break up the integral into two parts as before. Note that for any complex 11 with |u| < 1, log (1+u) = u - 112/2 + 113 r(u) where |r(u)| < 1/(1 41) for |u| < 151 < 1. Using the differentiability of v(x, t) in t and the above result, choose 6 > 0 such that for |x| S bn, tI S 6an/Mn and large n, I vnIx.t/rnIx)) — (it)3 p3,,,Ix)/6rfiIx) — (it)4 [#4,n(x) — p4,nIx)I/24r§Ix) I < D III5 I) Iij5)/r,5,Ix). 
IvnIx.t/r,,Ix))I s 9/4, |(it)3 143,n(x)/67?,(X) + (it)4 M4,n(x)/24Tfi(x)l s 9/4. Now use the second part of (4.8) to conclude that for |t| 5 6 Mn/an, Ison(x,t/Tn(x)) — pgnIxxn s D (d4nMn/an) {|t|5+ |t|9} epr—tg/A). Hence, it follows that for large n, uniformly in |x| 5 bn’ (4.14) I = I {IpnIx.t/r,,Ix)) — 12n(x,t)l/It|} at s D d4nMn/an. ItIS 6 an / Mn For estimating 11, one has to use condition (A.6) instead of the nonlatticeness of w(el). In fact condition (A.6) guarantees that 30 k (c) I n (4.15) sup {len(x,t/rn(x))| . |x|$bn, 6an/MnsltISb d4n} < q Using (4.14) and (4.15) one can conclude (as in part a ) that (4.16) | 31<1pb 8311 IP( (Sn(fl + x/an) -un(x))/rn(x) S y ) — K2n( X , y )I X "' II 2 Now observe that the differentiability conditions 011 I6 implies that the functions II, V, ,u3 and 114 are three times differentiable and u'" is uniformly continuous. A tedious computation of Taylor's expansion gives un(x)/rn(x) = Ip'IDI/eIo» x + (em) {u"(0)/o(0) -— u'(0)V'(0)/o3(0)} x2/2 + [Iain/2) {3 u'(0)V'2(0)/205(0) — u"(0)V'(0)/03(0) } -+ 4,, {u"'(0)/0(0) — sp'Io)v"Io)/2e3Io))I x3/6 + D3,,Ix), 143,n(x)/r§(x) = d1n u3(0)/a3(0) + 4,, p3'Io) x/e3Io)) — 3 din 113(0) V'(0) x /2e5I0) + D4,,Ix), Ip4,,,Ix) - p4,,Ix) I/rfiIx) = 42,, [144(0) — 174(0) I/e4Io) + Q5nIx), pins/ego) = «1%,, p§Io)/e6Io) + Q6n(X) where for all x with |x| 5 bn, the remainder terms satisfy 31 |Q3n(x)l s D |x|3d4nsupl I It"'(y)-Iu"'(0)l = lyl < lumen/e»n I. (4.17) Max{ IQin(x)| :1: 4,5,6} < “121 (d4n Mn/an). The constant D depends only on the values of the functions It , V , #3 , I14 and their derivatives at zero. As in the previous case it now follows from (4.2), (4.16) and (4.17) that Iii“? b I PI'anI 73,, — r) s x) — H2n(x) I = o (d4n) = o I (Mn/ an?) By (4.4) the proof of part (b) is now complete. PROOF OF THEOREM 2.2 : Note that the hypotheses of Theorem 2.2 differ from those of Theorem 2.1 only in the differentiability conditions on the functions 7/) and F. 
From the proof of Theorem 2.1 it is evident that the differentiability of the function ψ has been used to guarantee that the functions u, V, μ_3 and μ_4 have sufficiently many derivatives. Since ψ is bounded and nondecreasing, for every k ≥ 1, ψ^k is of bounded variation. An application of integration by parts gives

∫ ψ^k( y − x ) dF(y) = ψ^k(∞) − ∫ F( y + x ) dψ^k(y).

As a consequence of this relation, the functions u, V, μ_3 and μ_4 have the smoothness required in the proof of Theorem 2.1. The only places where the differentiability of F has been used for other reasons are (4.6) and (4.11). But under the hypotheses of both parts, F has a density, and hence these follow easily by Scheffé's theorem.

PROOF OF THE PROPOSITION : Let p = Q{ (a, b) }. Then 0 < p ≤ 1. For any set B of ℝ, let 1_B denote the indicator of the set B. Note that, by the Riemann–Lebesgue lemma,

∫ exp( it ψ(y) ) 1_(a,b)(y) dQ(y) = ∫ exp( it y ) [ 1_(ψ(a), ψ(b))(y) q( ψ^{-1}(y) ) / ψ^{(1)}( ψ^{-1}(y) ) ] dy → 0 as |t| → ∞.

Hence, there exists a constant M > 0 such that for |t| > M,

(4.18) | ∫ exp( it ψ(y) ) 1_(a,b)(y) dQ(y) | < p/4.

Therefore, for any x in ℝ and |t| > M,

| E exp( it ψ( ε_1 − x ) ) | ≤ (1 − p) + | ∫ exp( it ψ( y − x ) ) 1_(a,b)(y) dQ(y) |

≤ (4 − 3p)/4 + | ∫ [ exp( it ψ( y − x ) ) − exp( it ψ(y) ) ] 1_(a,b)(y) dQ(y) |

≤ (4 − 3p)/4 + ∫ | q( y + x ) 1_(a−x, b−x)(y) − q(y) 1_(a,b)(y) | dy.

Note that the continuity of q on (a, b) implies

q( y + x ) 1_(a−x, b−x)(y) → q(y) 1_(a,b)(y) as x → 0.

Therefore, the above integral goes to zero, because ∫ q( y + x ) 1_(a−x, b−x)(y) dy = ∫ q(y) 1_(a,b)(y) dy for all x in ℝ. Hence, there exists δ > 0 such that whenever |x| < δ and |t| > M,

| E exp( it ψ( ε_1 − x ) ) | < (4 − 3p)/4 + p/4 = (2 − p)/2 < 1.

This completes the proof of the proposition.

For the proofs of Theorems 3.1 − 3.4, define w_n(x,t) = E_n( exp( it ψ( ε*_1 − x ) ) ), w_1n(x,t) = p_n^{-1} Σ_{j=1}^n x_j exp( it ψ( ε̂_j − x ) ) and w_2n(x,t) = n^{-1} Σ_{j=1}^n exp( it ψ( ε̂_j − x ) ). The basic facts required for proving Theorems 3.1 and 3.2 are given in Lemma 4.2 below.
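Before stating Lemma 4.2, here is a numerical illustration of the kind of uniform closeness it asserts between a resampling analogue of w(x,t) and the population quantity. The standard normal errors, ψ = tanh, the grids and the sample sizes below are all assumptions made for the sketch.

```python
import numpy as np

def w_hat(eps, x, t, psi=np.tanh):
    """Empirical analogue of w(x,t) = E exp(i t psi(eps_1 - x)),
    computed from a sample of errors."""
    return np.mean(np.exp(1j * t * psi(eps[:, None] - x)), axis=0)

def w_mc(x, t, psi=np.tanh, n_mc=100000, seed=0):
    """Monte Carlo stand-in for the population w(x,t) under N(0,1) errors."""
    eps = np.random.default_rng(seed).normal(size=n_mc)
    return np.mean(np.exp(1j * t * psi(eps[:, None] - x)), axis=0)

rng = np.random.default_rng(5)
eps_n = rng.normal(size=2000)          # observed errors, F = N(0, 1)
xs = np.linspace(-1.0, 1.0, 5)
ts = np.linspace(-3.0, 3.0, 7)
sup_diff = max(np.max(np.abs(w_hat(eps_n, x, ts) - w_mc(x, ts))) for x in xs)
print(sup_diff)  # small; shrinks further as the error sample grows
```

The supremum over the (x, t) grid is of the order n^{-1/2}, which is the behavior the almost sure o(1) statement of the lemma formalizes uniformly over compact sets.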
LEMMA 4.2 : Let Fn be either of the resampling distributions of Theorem 3.1 and 3.2. Then, for any M > 0, (4.20) sup { | wn(x,t) —w (x,t) | : |t| S M, |x| S M} = 0(1) a.s.. Let h be a function with a bounded first derivative. Then for every M > 0, (4.21) sup {lEnh( c: —x) —E h( ‘1— x) I : |x| 5 M} = 0(1) a.s.. PROOF OF LEMMA 4.2 : First we prove (4.20). For |t| S M, |x| g M and FD : Fln’ I Wn(X,t) —W1n(X,t) I 1 n - sll p( )II M ID |ij I ej—ej I l/pn, 1:1 2 By the assumption on xi's and (4.3), the R.H.S. tends to zero as. as 11 tends to infinity. Similarly, for Fn = Gn’ |t| S M, [X] 5 M, |wn(x,t)—w2n(x,t)| S D an I bn—flI /n. By (4.3), this tends to zero as. as 11 goes to infinity. Therefore, it is enough to 34 show that for i = 1, 2, sup{ | win(x,t)—w (x,t) | : |t| 5 M, |x| 5 M } =0 (1) a.s.. This is proved by adapting the idea of the proof of Lemma 2 in Babu and Singh (1984). Fix a > 0. Then there exists a constant C > 0 ( independent of r) ) such that for all n 2 1 and for all u with |u| < Cr), sup{ lwin(x+u, t+u)—win(x,t) I: |t| SM, |x| _<_M, i=1, 2}< 77 and ‘ sup{|w(x+u, t+u) —w(x,t)|:|t|$M, |x|$M}477) SP(maX{ I W1n(iCn,iCn)-W(i0v,i0n) I =i,j€B(M,n) } >277) sD 27-2 epr—I ppn)2/ 2e13, ). Similarly, P( sup{ | w2n(x,t)—w(x,t) | : |t| 5M, |x| SM } >417) —2 SD?) exp(—712n/2). 35 By Borel Cantelli lemma, first part of the lemma follows. The other part can be proved similarly. PROOF OF THEOREM 3.1 : Now we sketch the proof of Theorem 3.1. Since w is bounded, by Lemma 4.2 all ( central ) moments of w (61 -- x) under F n converges as. to the corresponding (central ) moments of It (61 — x) uniformly over |x| 3 M. Let N denote the set of all positive integers. Fix a sample point for which (4.20) holds for every M in N and mn(x), sn(x), m3,n(x), m4,n(x) and their derivatives respectively converge to u (x), o (x), u3(x), ,u4(x) and the corresponding derivatives uniformly over |x| 5 1. 
For this sample point, using Lemma 4.2 one can get bounds in the inequalities ( in the present set up ) corresponding to (4.1), (4.3), (4.4), (4.5), (4.10), (4.12) and (4.13), uniformly over all n ≥ N for some N > 1. Hence one can retrace the proof of Theorem 2.1(a) to obtain Theorem 3.1(a). Part (b) follows easily from Lemma 4.2.

PROOF OF THEOREM 3.2 : Similar to the proof of Theorem 3.1.

PROOF OF THEOREM 3.3 : Let G_1n denote the empirical distribution function of ε_1, ..., ε_n. Define g_1n(x) = [ ∫ k( (x − y)/ε_n ) dG_1n(y) ] / ε_n. First we show that the estimators g_n^(r)(x) converge to f^(r)(x) uniformly in x for r = 0, 1, a.s.. Under the hypothesis of Theorem 3.3, Lemma 2.2 of Schuster (1969) and a simple modification of Lemma 1 of Bhattacharya (1967) guarantee that

(4.22) max { || g_1n^(r) − f^(r) || : r = 0, 1 } = o(1) as n → ∞ a.s..

Therefore, it is enough to show that

(4.23) max { || g_1n^(r) − g_n^(r) || : r = 0, 1 } = o(1) as n → ∞ a.s..

Now,

|| g_1n^(r) − g_n^(r) || ≤ Σ_{i=1}^n sup_x | k^(r)( (x − ê_i)/ε_n ) − k^(r)( (x − ε_i)/ε_n ) | / ( n ε_n^{r+1} )

≤ D Σ_{i=1}^n | ê_i − ε_i | / ( n ε_n^{r+2} )

≤ D a_n | β̂_n − β | / ( n^{1/2} ε_n^{r+2} ).

The last step follows by an application of the Cauchy–Schwarz inequality. By (4.3) and the assumption on { ε_n }, (4.23) follows. Hence (4.22) and (4.23) jointly imply that

(4.24) max { || g_n^(r) − f^(r) || : r = 0, 1 } = o(1) as n → ∞ a.s..

For proving Theorem 3.3 we need the following lemma.

LEMMA 4.3 : Let F_n be the distribution corresponding to the density g_n. Then

sup { | w_n(x,t) − w(x,t) | : t ∈ ℝ, x ∈ ℝ } = o(1) as n → ∞ a.s..

For any bounded function h,

sup { | E_n h( ε*_1 − x ) − E h( ε_1 − x ) | : x ∈ ℝ } = o(1) as n → ∞ a.s..

PROOF OF LEMMA 4.3 : It is easy to see that for all x and for all t,

| w_n(x,t) − w(x,t) | ≤ ∫ | g_n(y) − f(y) | dy.

By (4.24) and Scheffé's theorem it follows that ∫ | g_n(y) − f(y) | dy → 0 as n → ∞ a.s.. This proves the first part of the lemma. The proof of the other part is similar.

Now we give an outline of the proof of Theorem 3.3.
Fix a sample point for which max { II gflr) — {(r) II : r = 0, l } -) 0 as n -+ 00. It is enough to show that for this sample point the inequalities in the proof of part (a) of Theorem 2.1 holds uniformly in all sufficiently large 11 when F is replaced by Fn' Note that for any real number x, sup{ I w(x,t)—w(0,t) I :tEIR} 5 II f(y+x)-f(y) I dy which tends to zero as x tends to zero. Hence, by the nonlatticeness of 16(61), Lemma 4.3 and the above observation, it follows that there exist N > 1, 6 > 0 and 0 M, IxI S 6} > .5, (4.25) sup{ I wn(x,t) I :ItI > M, IxI S 6}<(1+ q)/2 <1. Also by Lemma 4.3, Max{ II mHITI—p“) II, II Sn(r)-0(I)II:I'=0, 1, 2}-)0asn—+oo (4.26) " Max{ IImin(r)—ui(r)II:i=3, 4; i=0, 1, 2}-+0asn-+oo. Using (4.25) and (4.26), one can get bounds in the inequalities corresponding to (4.4), (4.5), (4.10), (4.11) and (4.12) uniformly over all sufficiently large 11. As for the counter part of (4.13) in this case, note that, sup{ I mI2)(y) -m,(,2)Io) I = lyl s Mnbn/en) s sup{ I My) -u(2)(0) I = lyl SMnbn/an } + 2 II mg?) -u(2) II 403511-100. 38 Hence part (a) of the Theorem 3.3 follows. Part (b) is trivial in view of Lemma4.3. PROOF OF THEOREM 3.4 : Using the conditions on {en}, k and the symmetry of the underlying density f, one can show ( as in the proof of Theorem 3.3 ) that max{ IIflgr)—f(r)||:r=0,1,2}-)Oasn-mo as. Therefore the conclusions of Lemma 4.3 hold in this case as well. Hence, one can complete the proof along the line of proofs of Theorem 3.3 and Theorem 2.2(b) with a similar observation on Q3n‘ CHAPTER 2 2.1. Introduction. Let X1, X2, be a sequence of independent and identically distributed (i.i.d) p—dimensional random vectors with distribution function (d.f.) F 0 where 0 lies in an Open subset O of IRm. Let 1]): Rp x 9 -) [Rm be a measurable function with respect to (w.r.t) the Borel o—algebras on RP x O and Rm such that (1.1) I w(x, a) dF0(x) = 0 for all a e e. Let 161,...,Ibm denote the components of (6. 
Then the M-estimator θ̂_n of θ corresponding to ψ is defined as a solution of the m equations ( in t )

(1.2) Σ_{j=1}^n ψ_i( X_j, t ) = 0, i = 1, 2, ..., m.

For n ≥ 1, denote the empirical distribution function of X_1, X_2, ..., X_n by F_n. Let X*_1, ..., X*_n be a random sample of size n from F_n. Define the bootstrapped M-estimator θ*_n as a solution of the system of equations ( in t )

(1.3) Σ_{j=1}^n ψ_i( X*_j, t ) = 0, i = 1, 2, ..., m.

In Sections 2.2 and 2.3 below, under some regularity conditions on ψ and F_θ, it is shown that θ̂_n exists for sufficiently large values of n and tends to θ as n → ∞ with probability 1 under θ. It is also shown that with high ( conditional ) probability under F_n, θ*_n exists and tends to θ at the rate O( n^{-1/2}(log n)^{1/2} ). For such sequences of estimators, an almost sure Edgeworth expansion of the distribution of √n( θ*_n − θ̂_n ) is given. The method of proof is similar to that of Bhattacharya and Ghosh (1978). Using the assumptions, an almost sure representation for θ̂_n is obtained. In fact, it is shown that there exist a sufficiently smooth function H and a Borel measurable function f : ℝ^p → ℝ^k, for some integer k ≥ 1, such that

θ̂_n = H( Z̄ ) + R_n, where Z̄ = n^{-1} Σ_{j=1}^n Z_j, Z_j = f( X_j ), j = 1, 2, ..., n,

and, with probability 1, || R_n || = o( n^{-(s-2)/2} ) for some integer s ≥ 3. Next, for almost all sample sequences ( X_1, X_2, ... ), outside a set of conditional probability o( n^{-(s-2)/2} ), θ*_n is expressed as

θ*_n = H( Ȳ ) + R*_n, where Ȳ = n^{-1} Σ_{j=1}^n Y_j, Y_j = f( X*_j ), j = 1, ..., n,

and P_n( || R*_n || > O( n^{-s/2}(log n)^{s/2} ) ) = o( n^{-(s-2)/2} ) almost surely (a.s.). Here P_n refers to the conditional probability given ( X_1, X_2, ..., X_n ). It should be pointed out that for almost all sample sequences ( X_1, X_2, ... ), θ*_n is expressed in terms of the same function H. The arguments in the proof following this point can be divided into two steps. In step 1, √n( θ*_n − θ̂_n ) is closely approximated by √n( H(Ȳ) − H(Z̄) ).
n n 41 Pr0perties of H, R11 and R; guarantee that for almost all sample sequence (X1, X2,...), the error of approximation, say Dn’ is small with high conditional probability. More precisely, D n satisfies Pn(|IDn|I > 0(n—(S_l)/2(log ins/2) = pal—(3‘3”) a.s.. Representation of 0; in terms of the same function H is crucial for carrying out this step. In step 2, an almost sure asymptotic expansion for the conditional distribution of J5 (H(Y) — H(Z)) is obtained. This, together with step 1, gives the almost sure asymptotic expansion for the distribution of JI—1(9; — 6n). Corresponding expansion for the distribution of Jfi( 611— 0) was obtained by Bhattacharya and Ghosh (1978). Comparison of these two expansion shows that the bootstrap distribution of J5“); — 6n) approximates the distribution of 45(611— 0) under 0, at the rate of o(n_1/2). 2.2. Assumptions and main results. Before proceeding further, we collect here the notations to be used in the rest of chapter 2. Let Z+ denote the set of all non—negative integers. Also, let i be a positive integer. For V=(l/1....V[)’ e (Z+)[ and x = (x1,...,x[)’ in IR‘, write (I V. t’ x”: 11 x.‘, u! = II (11.!) and IVI = u1+....+u,. For afunction leRZ-1IR i=1 1 i=1 1 having sufficiently many partial derivatives, denote by D jf the partial derivative 1/ l/ of f w.r.t. its j—th co—ordinate, j=1,..., l and set DVf = D11. . . D/ f. Let (DA and II) A respectively denote the distribution function and the density of normal distribution with mean zero and covariance matrix A for some positive definite 42 matrix A. For any matrix A, write A, = transpose of A. By II II and < , > denote, respectively, the norm and the inner product on appropriate Euclidean spaces. For a Borel set B g R‘, let B6 = {x: IIx—yII < e for some y E B}, c > 0 and 6B = boundary of B. Next, denote the underlying parameter value by 00. For i = 1,...., 1n, OSIVI S s—l and j 21, define the variables {Zuij} and {Yuij} by z =DV1/1i(Xj, 00), Y = Dthi(X’J!‘, 00). 
Write YI"), ZI") for the and (Z 11,1, j ”31,1 m—dimensional random vectors (Y V,I,j) i = 1,...,m V,i,j) i = 1,...,m - _. (V) .. (V) respectlvely. Set Zj - ( Zj ) 0 S IVI S 8—1, Yj — (Yj ) O s IVI 53—1. Then (Z1, Z2, ...) and (Y1, Y2, ...) are i.i.d k-dimensional random vectors with s 1 m+r—1 n k=m2r=0( r ). Write 29):) i=1 define V(V) and Y similarly. In the following we shall write P to denote the n Z(V)/n and Z -.= I am and .I j=1 .I (D product probability measure 0 F 0 on the space of all infinite sequences in IRp and 1 o E to denote the expectation under P. X1, X2, are then considered as co—ordinate variables. Also write En to denote the expectation under Pn. Let ”Vii = E (Z i = 1 ,..., m, 0 S |u| S s—l, u,i,1)’ (2'1) ”V=(‘uu,i)i=1,...,m and ”:(MV)0SIVISS—l' Also, let 2 = E ( Zl—p )( Zl—u )’ and Sn 2 En(Y1_ EnY1)(Y1- EnYl)’. Finally, define M = (Cl-ratd H(u)) (23) (Grad H (u))’ Mn = (Gard H(Z)) (Sn) (Grad H(Z)’. Now we state the assumptions. 43 (A1) There exists a Borel set C E RP such that F0(C) = 1 V 0 E O and the components of V) have continuous Vth order partial derivatives in 0 for 1 S |u| S s at each (x, 0) E C x O for some integer s 2 3. (A2) E IIDVIb(X1, 00) ”S < 00 for 0 S |u| S s—l, and there exists an e > 0 such that Max E( sup IIn”p(x,o)II3) 0 such that Pn ( ”0:1 — 00” < dln—1/2(log n)1/2, 0; solves (1.3) ) = 1 — 0(n—(S_2)/2). (b) There exists asequence {9n} ofstatistics such that P ( in solves (1.2) and ”in — 00" < dl-n_l/2 (log n)1/2 eventually) = 1. . :1: (e) Let {0n} and {011} be two sequences of statistics which respectively satisfy (b) and (a). Suppose that the characteristic function of Z1 under 00 satisfies the Cramer’s condition i (A.4) lim sup IE (e )I <1. IItllw Then, there exist polynomials a1(Fn, -), ..., as—2(Fn’ -) such that for almost all sample sequence (X1, X2, ...), 44 * ‘ n-ar/2 sup IPn(./n'(0n—0n)e13 )—IB(:1+2 (anx))d<1>M (x)I Bee? 
= o( n^{-(s-2)/2} ) a.s.,

where ℬ is a class of Borel subsets of ℝ^m satisfying

(2.2) sup_{B∈ℬ} Φ_M( (∂B)^ε ) = O(ε) as ε ↓ 0,

and a_1( F_n, · ), ..., a_{s-2}( F_n, · ) are polynomials whose co-efficients are continuous functions of moments of F_n of order s or less.

(d) If conditions (A1) − (A4) are satisfied with s = 3, then for almost all sample sequences ( X_1, X_2, ... ),

sup_{B∈ℬ_1} | P_n( √n M_n^{-1/2}( θ*_n − θ̂_n ) ∈ B ) − P( √n M^{-1/2}( θ̂_n − θ_0 ) ∈ B ) | = o( n^{-1/2} ),

where ℬ_1 is a class of Borel subsets of ℝ^m satisfying

(2.3) sup_{B∈ℬ_1} Φ( (∂B)^ε ) = O(ε) as ε ↓ 0.

REMARK 2.1 : Conditions (A1) − (A3) are similar to those of Bhattacharya (1985) and are somewhat weaker than the conditions in Bhattacharya and Ghosh (1978). Under some additional conditions, e.g. the continuity of the maps θ → F_θ and θ → D(θ) = E_θ( ( D_j ψ_i( X_1, θ ) ) ), Bhattacharya and Ghosh (1978) have obtained results similar to (b) and (c) of the Theorem uniformly in θ lying in compact subsets of Θ. But, in our case, such a uniformity does not seem to be necessary. Given the data X_1, ..., X_n, if we can find θ*_n and θ̂_n satisfying (a) and (b), we can use the approximations in parts (c) and (d) without any knowledge about θ. One such situation is of course that in which (1.2) and (1.3) have unique solutions. In the case of multiple solutions there is no rule which definitely specifies θ̂_n satisfying (b) ( or θ*_n satisfying (a) ), even in the presence of such uniformity.

REMARK 2.2 : Part (d) of the Theorem extends the pioneering result of Singh (1981) concerning the improvement of the rate of approximation by the bootstrap in the case of the sample mean. Taking ψ(x, t) = x − t for the sample mean, it is easy to see that assumptions (A1) − (A4) reduce exactly to the set of conditions required for the validity of the corresponding result ( part D of Theorem 1 ) of Singh (1981).

REMARK 2.3 : Though conditions (2.2) and (2.3) look similar, they are not equivalent in general.
If the largest eigenvalue A of M is less than or equal to 1, then every class of Borel sets satisfying (2.3) also satisfies (2.2). But for A>1, a class of Borel sets satisfying (2.3) need not satisfy (2.2) as shown by the following example. EXAMPLE : We consider the case m = 1. Let M = (A) with A > 1 and c = (4A)°5. 1/2 Also,letan=(clogn) ,n>2. DefinethesetBby B={an:n>2}.Then SE = B and (6B)6 = U (an— 6 , an+t ). Therefore, for 0 < c < a3, n>2 I (an exp ( —x2/2) dx ( —x2 x 2n>2 J(an—c, an+ c) e p ( /2 ) d g 26 2 DZ exp ( — (an—c )2/2) 46 Now choose a = ( 16 A )—1/4 and let N be an integer such that (1—2o)aN > 2a3. Then, ' I 6 exp(-—x2/2)dx S25(N+}: exp(-oaI21))=0(c). (dB) n>N Hence .3 = { B } satisfies condition (2.3). Now for sufficiently small 6 > 0 and for all integer n satisfying (n+1)c 2 1, an+1 — an < e. Write a = (— c log 6 )1/2. Then, I 6 exp ( —x2/2A) dx (013) 2 I exp(—x2/2A) dx ( a , co 2 (a‘1 — a—3 A ) exp (—a2/2A) = a_1 (1— a"2 A) 6 c/2A. Hence, it follows that 161151 3%wa exp(—x2/2A) dx (0 — 2A )/2A 2lim a—1(1—a_2A) c =+oo £1 0 So, .2 does not satisfy condition (2.2). REMARK 2.4 : Condition (A.4) may be difficult to verify in some situations. A sufficient condition for (AA) is given in Bhattacharya and Ghosh (1978) as assumption (A6) on page 439. In our set up, this can be stated as : (A6) of Bhattacharya and Ghosh (1978) together with the assumptions that C in (A1) is Open and the matrix ((E Ibi(X1, 00) . I/Jj(X1, 00))) is nonsingular. 47 2.3. Proofs. First we state and prove some lemmas. LEMMA 3.1. If E IIZIIIS < co for some 8 2 3 and Z1 satisfies (A.4), then for almost all sample sequences and for sufficiently large n, - —2 an(«/fi (Y-Z) e D) -I B (1 + 2 :1 n‘r/2 b,IF,,, x)) dSn(x)I sou—(3‘2”?) + c1 es ((6B)e—dn) I] for every Borel set B in Rk. Here (1 > 0, c1 are constants (independent of the sample sequence) and br(Fn’ ~), r = 1, ..., 8—2 are polynomials whose co—efficients are continuous functions of moments of Fn of order s or less. 
Lemma 3.1 is an easy consequence of Theorem 2 in Babu and Singh (1984). So we omit the proof. The next lemma gives an almost sure asymptotic expansion for the distribution of J13 (H(Y) — H(Z)). LEMMA 3.2 : Let Q = {x 6 RR: IIx — pII < 61} for some 61> 0 and let H: IRk —+ [Rm have continuous partial derivatves of all orders on Q. If Grad H(,u) is of full rank then, for almost all sample sequences, s—2 1 n_r/2 ar(Fn’ x))d ‘I’M (x)I Sup an(Jfi (H0?) — IHIZ))e D) - I (1+2 B62 B = O (n-(S'2)/2) where a1(. , .), ..., as_2(. , .) and .2 are as in the statement ofpart (c) of the Theorem. 48 PROOF OF LEMMA 3.2: without loss of generality, we may assume that the first m—columns of Grad H(p) are linearly independent. Write, s—2 —r/2 k 7,, n(x) = (1 + 2 1n br(Fn, x)) 4’s (x), x a; In and 1 1': Il gn(X) = J5 (H(Z + x/JE) — H(Z)), x e IRk so that JD (HO?) - H(Z)) = gnwfi (Y- Z) First, we show that (3.1) I -1 73,n(x) dx :IB(1+X— 1 n-T/2 ar(Fn’ x))d (PM (x) + 0(n_(s_2)/2) holds uniformly over all Borel sets B in Rm. To that effect let Vn denote the set {x 5 RR : I|xll 5 log n} and define the function kn : vn -» 111‘ by gn(X) k (x) = n (x)m-1I—1 where (x)mk k. Then, +1 denotes the vector of last (k—m) elements of x 6 IR By SLLN, Z —) It almost surely (P). Therefore kn has continuous partial derivatives of all orders and a non-singular gradient on VI1 eventually, a.s.(P). For sufficiently large n'; 49 (3.2) I (ng) v,,,,(x)dx x x n—'(s_2)/2 I (gnlB)nV r,,,( )4 +eI ) = 78,n(k;l(w)).ldet Grad kn(k;1(w))I—1dw {(o)'fe13}n kn(Vn) +0(n_(s—2)/2) where (at)?1 is the vector of first m elements of w E IRk. Next, we approximate det Grad kn(x) by taking co—ordinatewise Taylor's expansion. det Grad kn(x) Grad H( Z+ x/Jfi) = det d Grad H (Z) +2: n—rr/QA ,n(x ) + n_(S—1)/2Rn(x) = t e 0 Ik—m Here Ar n(x) are m x k matrices of polynomials in x and Rn(x) is a mxk matrix which satisfies |an (x )II < c2 IIxIIS—l, x e vn eventually, as. for some nonrandom constant c2. 
50 Grad H(Z) With B n = , we have 0 Ik—m (3.3) det Grad kn(x) = (det Bn) (1 + q1,n(n—1/2x) + n_(S—1)/2R1n(x)) where q1 n is a polynomial of degree S (8—2) and the remainder term R1n is ONE) uniformly on V n' Therefore , for all large n, we can write (3.4) (det Grad kn(x))—1 —1 —1 2 — s—1 2 = (det Bn) (1 + q2,n(n / x) + n ( )/ R2,n(x)) where q2 n and R2 11 respectively have pr0perties similar to q1 n and R1 11 in (3.3). Next observe that for almost all sample sequences, there exists a 6 > 0 such that {Z + x: ”x” 5 6} _<_; Q for sufficiently large values of n. Define the function I‘n on E = {x: “X” S 6} by H(Z+x) —H(Z) F (x): n ,xEE. mail Then, kn(x) = 41—1 Fn(n-1/2x), x 6 VII holds for all n such that log n g d-Jfi. Notice that I‘n is a diffeomorphism ( cf. Milnor (1965), page 4) onto its image. Hence 1:1 has continuous partial derivatives of all orders. In particular we can I express I‘; as the sum of a vector of polynomials q3 n and a remainder term 0(llwlls). As a consequence, for all w E kn(Vn), 51 (3.5) k;1(w) = n1/2(r;1(n‘1/2w)) = nl/2 qa nor” 2 w) + J5 0(Iln‘1/2wlls) = 331w.+28-:”—r/2 «:4 r as») + ads-1V 2dawns) r: 3 3 where for r = 1, ..., (s-2), q4 r n is a vector of polynomials. Now, using (3.4) and (3.5) in (3.2), we have _ 7 (KNX [3111B s,n = w —1 n—1/2 -1 w J{(w)TEB}nkn(V) and; H) |(det 3,) (1+q2,n( kn( mas + 0(n—(S—2V2). _ --l — |det Bnl n1/2 1/2 <13 (n w))(1 + q2,nMn(w) + o(n_(s_2)/2) J m (1 +2: ,Fr {(w)1eB}nkn(v ) where a1 r(Fn’°)’ r = 1,...,(s—2) are polynomials whose co—efficients are continuous , functions of moments of Fn of order s or less. Now, integrate out the variables (wm+1,...wk) to get (3.1). 52 N . __ 3—2 —r/2 m ext, wrlte {S n(x) — (1 + 2 1 n ar(Fn’ x)) ¢M (x), x 6 IR . From ’ r= 11 Lemma 3.1, it follows that for almost all sample sequences and for large n, (3.6) Iwmm — Hm) e B) — jB (s,n(xmxl sows—2V2) + c163 ((agngffii'n) II for every Borel subset. B of Rm. 
Following the arguments given in Bhattacharya and Ghosh (1978) ( page 444—445) it can be shown that there exists a constant a > 0 such that for large n —dn — (3.7) osnaogglme )) 3 «Mn ((313)e an n vn) + o(n_(S-2)/2) holds for every Borel set B Q lRm. Next use condition (2.2) to conclude that sup M ((6B)e—an n Vn) = 0(n“(S—2)/2) B63 11 This completes the proof of lemma 3.2. LEMMA 3.3. Let, U1,...,U be i.i.d. random vectors with common mean fl. Let n A denote the largest eigen value of the dispersion matrix of U1. Suppose that E||U1||S < 00 for some integer s 2 3. Then, Poi IIU— on > ((s—l) A 10g 101/?) g lids-2V2 (log 10—3/2 11 where Un = n_1 2 Ui and J is a function of A which is bounded on bounded i=1 set of values of A. 53 PROOF OF LEMMA 3.3 : See Von Bahr (1967). We are now ready to prove the theorem. PROOF OF (a): By assumption (Al), $1,...,z/2m have continuous partial derivatives of order s on C x 6. Taking Taylor's expansion of wi(x,o) around 00 fori = 1,...,m, we have (3.8) wi(x,t) = piano) +2 (t—oo)” DVwi(x,00)/V! + Rn,i(x,t) 1$|V| _<_s—1 where the remainder term Rn i(x, t) satisfies 8 an,i(X’ t)| S c ”t - 00“ max sup IDVib-(x, 0) I IVI=s II0-00ll era-Ina... .)1/2) < d3.—/2(,,g We for Oslulg s—l, when n is sufficiently large. Also note that by the LIL, ”EnYiV) — pull = 0(n-”200g log 101/2) almost surely (P). Therefore it follows that for almost all sample sequence, there exist an integer n02 1 such that for all 112110, (3.10) Pn(llY(") — null > d.1 {V2003 xii/2)) < d3 n_(s_2)/2(log n)_€‘/2 for some constant d4 > 0. Set R;(t) = (R; 1(t), ..., R; m(t))’. By similar arguments, it can be shown that for almost all sample sequences, there exists a constant (15 such that (without loss of generality) (3.11) Pnumgcm > Ilt—0olls (d, + eel—”200g Isl/2)) < (13 n—(s—2)/2 (log n).s/2 for all n 2 no. 
Hence, for almost all sample sequences, there exists n0 2 1 such that for all n 2 no, outside a set of P n—probability d6 n_(s_2)/2(log n)-S/2, we can write (3.9) as —1 1/ s 3.12 t—0 = D+ * 6*+ 2 13—0 u!+d t—o 6* ( ) ( O) ( on) (n 2s|u|5s—1( o) HV/ 7” on n) where (16 and d7 are constants and 1);, 6* and 6* are random elements depending 11 I1 ‘1/2 1/2 while neg” s 1. on (X*,...,XI";) and norms of n; and 8:1 are 0(n (log n) 55 Hence, there exist an integer n1 2 no( depending on the sample sequence) and a constant d8 such that for all n 2 n1, r.h.s. of (3.12) is less than d8 n—1/2 1/2 (log n) whenever Ht 0 II is less than d -1/2 1 In B B ' fi (1 ' -— 0 8 11 (0g 11) . y rowers xe pornt * theorem (Milnor (1965), page 14) it follows that there exist statistics {0n} such that for all n 2 n1, alt ' _ an: (3.13) Pn(||0n — 00" < d8 11 1/2(log n)1/2, on solves (1.3)) > 1 — (16 1143-4”2 (log 10—3/2. This completes the proof of part (a). PROOF OF (b): The proof is essentially the same as that of (a). Only exception is that we use LIL instead of Lemma 3.3 to get bounds on the deviations ||Z(V) — 140’) H for 0 5 |u| 5 s—l. This is also pointed out in Remark 1.8 of Bhattacharya and Ghosh (1978). PROOF 0F (c): Using (3.8) we can write 1 0 = if}: ._ 1bios] ’ 0n) j—l _ ‘ _ 1/ ... 20,, + 2 K I vl (H (on 00) 21],, /1/! + End _1 n * ' Where an : n E j=1 Rn,i(xj3 011)- Set Rn = ( Eula-.., Rum) . Then by assumption (A2), it follows that there exists a constant (19 > 0 such that (3.14) P (”Rnll < (19 n-S/2(log n)s/2 eventually) = 1 56 Fori=1,...,m define the function fi: Rk+m-+IR by fine, a): .+ E (a —oo)”/ul 1<|u| 0. Hence, by the uniqueness of H, we have (3.15) 0,, -- 11(3) where i = (il/l) is given by VJ = 2”,, forlg |u| gs—l and i=1,...,m Z0,i = ZO,i +Rn,i fOI‘ i: 1,...,m This gives the almost sure representation for in. 57 n __ =1: * Next expand the r.h.s of the equation 0 = n 1 2 wi(X j’ (in) into Taylor's i=1 series around 00 as in (3.8). 
Using (3.10) and (3.11) it can be shown (exactly in the same way as in Bhattacharya and Ghosh (1978)) that for almost all sample sequences, outside a set of P_n-probability O(n^{-(s-2)/2} (log n)^{-s/2}), θ*_n has the representation

(3.16)    θ*_n = H(Ỹ).

Here Ỹ = (Ỹ_{ν,i}) is defined as

    Ỹ_{ν,i} = Y_{ν,i}                   for 1 ≤ |ν| ≤ s-1 and i = 1, ..., m,
    Ỹ_{0,i} = Y_{0,i} + R̄*_{n,i}(θ*_n)   for i = 1, ..., m.

By (3.11) and (3.13), it follows that for almost all sample sequences, there exists a constant d_10 > 0 such that

(3.17)    P_n(‖R̄*_n(θ*_n)‖ > d_10 n^{-s/2} (log n)^{s/2}) = O(n^{-(s-2)/2} (log n)^{-s/2}).

Fix 0 < δ_2 < δ_1. Since H has continuous partial derivatives on Ω, by the mean value theorem, there is a constant d_11 > 0 such that whenever w_1, w_2 lie in {w : ‖w − μ‖ ≤ δ_2},

(3.18)    ‖H(w_1) − H(w_2)‖ ≤ d_11 ‖w_1 − w_2‖.

Write D_n = √n (θ*_n − θ̂_n) − √n (H(Y) − H(Z)). Then (3.14), (3.17) and (3.18) jointly imply that for almost all sample sequences, there exists a constant d_12 > 0 such that

(3.19)    P_n(‖D_n‖ > d_12 n^{-(s-1)/2} (log n)^{s/2})
          = P_n(‖(H(Ỹ) − H(Z̃)) − (H(Y) − H(Z))‖ > d_12 n^{-s/2} (log n)^{s/2})
          ≤ P_n(‖R̄*_n(θ*_n)‖ > d_10 n^{-s/2} (log n)^{s/2})
          = O(n^{-(s-2)/2} (log n)^{-s/2}).

Let ε_n = d_12 n^{-(s-1)/2} (log n)^{s/2}. Then, it follows from Lemma 3.2 and (3.6) that

(3.20)    sup_{B ∈ 𝓑} |P_n(√n (θ*_n − θ̂_n) ∈ B) − ∫_B ψ_{s,n}(x) dx|
          ≤ sup_{B ∈ 𝓑} |P_n(√n (θ*_n − θ̂_n) ∈ B) − P_n(√n (H(Y) − H(Z)) ∈ B)| + O(n^{-(s-2)/2})
          ≤ P_n(‖D_n‖ > ε_n) + sup_{B ∈ 𝓑} P_n(√n (H(Y) − H(Z)) ∈ (∂B)^{ε_n}) + O(n^{-(s-2)/2})
          = O(sup_{B ∈ 𝓑} Φ_{M_n}((∂B)^{aε_n} ∩ V_n)) + O(n^{-(s-2)/2})

for some constant a > 0. Now use the smoothness of H at μ and the LIL to get ‖M_n − M‖ = O(n^{-1/2} (log log n)^{1/2}) a.s. (P). Hence, it follows that

    sup_{B ∈ 𝓑} Φ_{M_n}((∂B)^{aε_n} ∩ V_n) = O(sup_{B ∈ 𝓑} Φ_M((∂B)^{aε_n})) = O(ε_n) = o(n^{-(s-2)/2}).

This completes the proof of (c).

PROOF OF (d): Write β_n = ε_n ‖M_n^{-1/2}‖, n ≥ 1. Then, as in the derivation of (3.20), one can show that for almost all sample sequences,

    sup_{B ∈ 𝓑} |P_n(√n M_n^{-1/2} (θ*_n − θ̂_n) ∈ B) − ∫_{M_n^{1/2} B} ψ_{s,n}(x) dx|
    ≤ O(n^{-(s-2)/2}) + O(sup_{B ∈ 𝓑} Φ_{M_n}((∂(M_n^{1/2} B))^{β_n}))
    = O(n^{-(s-2)/2}) + O(sup_{B ∈ 𝓑} Φ_{M_n}(M_n^{1/2} (∂B)^{β_n}))
    = o(n^{-(s-2)/2}).

The last step follows from condition (2.3). By exactly similar arguments as in the bootstrap case, it follows that

    sup_{B ∈ 𝓑} |P(√n M^{-1/2} (θ̂_n − θ_0) ∈ B) − ∫_{M^{1/2} B} ξ_{s,n}(x) dx| = o(n^{-(s-2)/2}),

where ξ_{s,n}(·) = (1 + Σ_{r=1}^{s-2} n^{-r/2} a_r(θ_0, ·)) φ_M(·) and the a_r(θ_0, ·) are polynomials obtained by replacing the moments of F_n by the corresponding moments of F_{θ_0}. Hence, the result follows from the SLLN and the continuity of the coefficients of the polynomials a_r(·, ·) in the moments of the corresponding distributions.

BIBLIOGRAPHY

1. Babu, G. J. and Singh, K. (1983). Inference on means using the bootstrap. Ann. Statist. 11, 999-1003.
2. Babu, G. J. and Singh, K. (1984). On one term Edgeworth correction by Efron's bootstrap. Sankhyā Ser. A 46, 219-232.
3. Bahr, B. von (1967). On the central limit theorem in R^k. Ark. Mat. 7, 61-69.
4. Beran, R. (1982). Estimated sampling distributions: the bootstrap and competitors. Ann. Statist. 10, 212-225.
5. Bhattacharya, P. K. (1967). Estimation of a probability density function and its derivatives. Sankhyā Ser. A 29, 373-382.
6. Bhattacharya, R. N. (1985). Some recent results on Cramér-Edgeworth expansions with applications. In Multivariate Analysis VI (P. R. Krishnaiah, ed.), 57-77.
7. Bhattacharya, R. N. and Ghosh, J. K. (1978). On the validity of the formal Edgeworth expansion. Ann. Statist. 6, 434-451.
8. Bickel, P. J. and Freedman, D. A. (1983). Bootstrapping regression models with many parameters. In A Festschrift for Erich L. Lehmann (P. J. Bickel, K. A. Doksum and J. L. Hodges, Jr., eds.), 28-48. Wadsworth, Belmont, Calif.
9. Bose, A. (1988). Edgeworth correction by bootstrap in autoregressions. Ann. Statist. 16, 1709-1722.
10. Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. 2. Wiley, New York.
11. Freedman, D. A. (1981). Bootstrapping regression models. Ann. Statist. 9, 1218-1228.
12. Hall, P. (1988). Rate of convergence in bootstrap approximations. Ann. Probab. 16, 1665-1684.
13. Helmers, R. (1988). Bootstrap approximations for studentized U-statistics. Preprint.
14. Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58, 13-30.
15. Huber, P. J. (1973). Robust regression: asymptotics, conjectures and Monte Carlo. Ann. Statist. 1, 799-821.
16. Huber, P. J. (1981). Robust Statistics. Wiley, New York.
17. Liu, R. Y. (1988). Bootstrap procedures under some non-i.i.d. models. Ann. Statist. 16, 1696-1708.
18. Milnor, J. W. (1965). Topology from the Differentiable Viewpoint. Univ. Press of Virginia, Charlottesville.
19. Ringland, J. T. (1983). Robust multiple comparisons. J. Amer. Statist. Assoc. 78, 145-151.
20. Schuster, E. F. (1969). Estimation of a probability density function and its derivatives. Ann. Math. Statist. 40, 1187-1195.
21. Shorack, G. R. (1982). Bootstrapping robust regression. Comm. Statist. A 11, 961-972.
22. Singh, K. (1981). On the asymptotic accuracy of Efron's bootstrap. Ann. Statist. 9, 1187-1195.