MICHIGAN STATE UNIVERSITY LIBRARIES

This is to certify that the dissertation entitled "An Invariance Principle Applicable to the Bootstrap," presented by John Kinateder, has been accepted towards fulfillment of the requirements for the Doctoral degree in Statistics. Date: May 17, 1990.

AN INVARIANCE PRINCIPLE APPLICABLE TO THE BOOTSTRAP

By

John Gerald Kinateder

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

1990

ABSTRACT

AN INVARIANCE PRINCIPLE APPLICABLE TO THE BOOTSTRAP

By John Gerald Kinateder

It is shown that the bootstrap of the sample mean can be viewed as a stochastic integral which isolates the roles of the resampling empirical distribution function $H^n$ and the normalized partial sum process $W^n$. An invariance principle is then established for the normalized partial sum process and the bootstrap partial sum process, encompassing both the normal and non-normal domains of attraction in the symmetric case. Using the sample representation of the form given by LePage, Woodroofe, and Zinn (1981) for the $\alpha < 2$ case, and Donsker's theorem in the finite variance case, we show that the processes $(W^n, W^n \circ H^n)$ converge jointly to $(W, W \circ H)$, where $W$ is the homogeneous independent increments S$\alpha$S process and $H$ is the limit of the resampling cumulative.

To my father, who worked hard, so I could play.

Acknowledgements

The author would like to thank Professor Raoul LePage for the suggestion of the problem and all of the helpful direction in solving it. He would also like to thank Professors Shlomo Levental and Anil Jain for their time and interest. Professor Hira Koul was particularly helpful with his meticulous reading of the original manuscript and many helpful corrections and suggestions. Finally, the author would like to thank the Office of Naval Research for supporting him for the last year and one half of his doctoral study.

Contents

List of Tables. vii
List of Figures. viii
1 Introduction. 1
2 The Stochastic Integral Representation. 6
3 The Invariance Principle. 11
3.1 The Main Theorem 11
3.2 Proof for the $\alpha < 2$ Case 12
3.3 Proof in the Finite Variance Case 29
4 The Limit Laws. 31
4.1 Infinite Variance ($\alpha < 2$): Symmetric Case 31
4.2 Finite Variance Case 34
5 Knight's result follows in the symmetric ($\alpha < 2$) case. 35
6 Simulation Results. 45
7 Remarks 52
7.1 Other resampling plans. 52
7.2 Only off by a scale. 53
A Appendix. 55
Bibliography. 63

List of Tables

2.1 Conditional distribution of $\sum_{k=1}^{2}(X_k^* - \bar X_2)$ given the data. 8
2.2 Conditional distribution of $\int r\, W^2 \circ H^2(dr)$ given the data. 8
2.3 Conditional distributions given the ordered data. 8
6.1 Analysis of sizes of bootstrap confidence radii. 51

List of Figures

6.1 Coverage of bootstrap method for various $\alpha$. 47
6.2 Distribution of bootstrap confidence radii for $\alpha = 1$, $n = 50$. 48
6.3 Distribution of bootstrap confidence radii for various $\alpha$. 49
6.4 Distribution of bootstrap confidence radii for $\alpha = 1$, $n = 200$. 50

Chapter 1

Introduction.

Suppose $X_1, X_2, \ldots$ are independent random variables distributed according to a distribution function $F$ with location parameter $\theta$. In order to make inferences about $\theta$, we may consider the distribution of the sample mean about $\theta$: $\bar X_n - \theta$. For example, the well-known Lindeberg-Levy Central Limit Theorem [Bil86] tells us that if $EX_1^2 < \infty$, then
$$n^{1/2}(\bar X_n - EX_1) \to_d N(0, \sigma^2),$$
where $\sigma^2$ denotes the variance of $X_1$. In the finite variance case, we can use this to make inferences about $\theta = EX_1$. Of course $EX_1^2 < \infty$ is not necessary for convergence in distribution of the sample mean.

Definition 1.1 $F$ is said to be in the domain of attraction of a distribution $\mu$ (not concentrated at one point) if there exist constants $a_n > 0$, $b_n$, and a random variable $Y$ with distribution $\mu$, such that
$$S_n = a_n^{-1}\sum_{j=1}^{n}(X_j - b_n) \to_d Y. \qquad (1.1)$$
Necessarily $a_n \sim c\,n^{1/\alpha}$ for some $\alpha \in (0,2]$ and $c > 0$; $Y$ and $\mu$ are said to be $\alpha$-stable.

In what follows, we assume that $F$ is in the domain of attraction of an $\alpha$-stable distribution and $X_1, X_2, \ldots$ is a sequence of i.i.d. $F$ random variables. If $0 < \alpha < 2$, then we fix a sequence $a_n > 0$ such that for each $y > 0$, $n(1 - F(a_n y)) \to y^{-\alpha}$ as $n \to \infty$. For such $a_n$, (1.1) holds with $b_n = 0$ for $0 < \alpha < 1$, $b_n = EX_1$ if $1 < \alpha < 2$, and $b_n = E\sin(X_1/a_n)$ if $\alpha = 1$. (For existence of such a sequence, see Feller [Fel71].) If $\alpha = 2$, then we choose $a_n$ such that
$$a_n^{-1}\sum_{j=1}^{n}(X_j - EX_1) \to_d N(0,1).$$
In either case, let $\mu$ denote the limit distribution.

Since the distribution $F$ is generally unknown, so is the distribution of $S_n$. Thus, if we are to use $\bar X_n$ to estimate $\theta$, then we need to have some idea of the variability of $S_n$. As was suggested by Efron [Efr79], in a wide variety of situations we can use resampling of the data $X_1, \ldots, X_n$ to estimate the distribution of an estimator. This is the essence of the bootstrap. Let $F_n$ be the empirical distribution function of $X_1, \ldots, X_n$:
$$F_n(x) = \frac{1}{n}\sum_{k=1}^{n} I(X_k \le x).$$
For each observation of the data $X_1, \ldots, X_n$, we consider the distribution of
$$S_m^* = a_m^{-1}\sum_{j=1}^{m}(X_j^* - \bar X_n),$$
where $X_1^*, \ldots, X_m^*$ are independent and distributed according to $F_n$. This is equivalent to simple random sampling from the original sample $X_1, \ldots, X_n$ with replacement, and applying the same statistic to the resampled data as we would to the original data. The resampled data is often called the bootstrap sample, and the conditional distribution of the statistic applied to the bootstrap sample (given the data $(X_1, \ldots, X_n)$) is referred to as the bootstrap distribution.

Bickel and Freedman [BF81] showed that in the finite variance case, the bootstrap distribution of $S_n^*$, given $(X_1,\ldots,X_n)$, converges weakly to $N(0,1)$. (Recall that the variance is removed here by the choice of $a_n$.) Singh [Sin81] showed that under the added assumptions that $E|X_1|^3 < \infty$ and $F$ is non-lattice, the bootstrap of the pivoted sample mean is actually asymptotically a better approximation to the true distribution than the normal, based on the Edgeworth expansion.
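As a concrete illustration of the resampling scheme just described, here is a minimal Monte Carlo sketch, in Python with NumPy, of the naive bootstrap of the centered sample mean. The function name, the default normalization $a_n = \sqrt n$, and the heavy-tailed example data are illustrative choices made here, not taken from the dissertation.

```python
import numpy as np

def bootstrap_statistics(x, n_boot=2000, a_n=None, rng=None):
    """Monte Carlo draws from the bootstrap distribution of
    S*_n = a_n^{-1} * sum_{j<=n} (X*_j - mean(x)), with X*_j i.i.d. from F_n."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    a_n = np.sqrt(n) if a_n is None else a_n   # sqrt(n) is an illustrative finite-variance scaling
    x_bar = x.mean()
    # Resample with replacement from the observed data (simple random sampling from F_n).
    resamples = rng.choice(x, size=(n_boot, n), replace=True)
    return (resamples - x_bar).sum(axis=1) / a_n

# Example use: data from a heavy-tailed symmetric law (alpha = 1 here),
# mirroring the X = eps * U^{-1/alpha} construction used later in Chapter 6.
rng = np.random.default_rng(0)
eps = rng.choice([-1.0, 1.0], size=50)
x = eps * rng.uniform(size=50) ** (-1.0)
s_star = bootstrap_statistics(x, a_n=50.0, rng=rng)  # a_n ~ c * n^{1/alpha} = n for alpha = 1
```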
Hall [Ha188] showed that when X1 has finite variance but EIXII3 = 00, the general situation is that the normal approximation and bootstrap approx- imation are asymptotically equivalent. Therefore, in this case, it is better to use the normal approximation in lieu of the computational cost of the bootstrap. For the case of 0 < a < 2, when the bootstrap sample size m is taken to be the same as the original sample size n, it was shown by Athreya ([Ath84] and [Ath87]) that the bootstrap distribution of S; does not converge weakly to a constant distribution along almost all sample sequences. He showed that it converges in distribution (with respect to the weak topology on the space of bounded measures) to a random distribution. He gave a representation for this random limit distribution in terms of Poisson random measures. Notice that if (M;,,...,M;,,) is a multinomial vector with parameters (n, (i, . . . ,fi ) independent of the sample sequence, then £(S;IX1,...,X..) = £(:Xk( :1 -1)|X,,...,X,.). (1.2) k=l The ML- can be thought of as counts; X,- is chosen Mg, times in the bootstrap sample. Following Athreya’s work, Knight [Kni89] gave a different representation for this limit law. Using the distributional relationship (1.2) above, and the sample sequence representation provided by LePage, Woodroofe, and Zinn [LWZBI], he gave the following explicit representation of the limit law: As in [LWZ81], define p by . l — F(y) p 3115301 - F(y) + F(-y-)' Let she”... be i.i.d., P(€1=l)= p = l — P(el = -1). Let F = (F1,I‘2,...) represent the arrival times of a Poisson process with unit rate; I". = {35:16,- where P(£,~ Z :r) = e" for all i (5,62,. . . are independent). Let Mf, 2‘, . .. be independent Poisson mean 1 random variables. Finally, assume that {6,}, {I‘,-}, {M;} are mutually independent. Then L(S;|X,,...,X,,) _.d £(chl‘;l/°’(1II;-1)|ckI‘;I/°,k 21). (1.3) k=1 Notice that the above convergence is in distribution. In fact, Giné and Zinn [G289] show that in the infinite variance case, this cannot be strength- ened to almost sure convergence (which does occur in the finite variance case). In the infinite variance case, the bootstrap distribution of S; does not converge in distribution to the limit distribution a obtained in the limit of the original 5,, sequence. That is, the bootstrap distribution is not a consistent estimate of u. Because of this phenomenon, it has been said that the bootstrap does not work in this case. (Although it is understood why this claim was made, Chapter 6 suggests that the method may actually be viable for applications in the a < 2 case.) What happens when the bootstrap sample size m is allowed to differ from the sample size n? Athreya [Ath85] showed that in the 0 < a < 2 case, the bootstrap can still be made to work if the bootstrap sample size is chosen small enough in relation to n. More precisely, if the bootstrap sample size m,, -+ 00 such that mn/n -+ 0, then the bootstrap distribution of 5;," converges weakly to a in probability. Arcones and Giné [AG88] added to this answer by showing that if m,, log log mn/n —» 0, then the bootstrap central limit theorem holds almost surely. That is, the conditional distribution of 5;," converges with respect to the weak topology to )1 almost surely. But if limninf mn log log mn/n > 0, then there is no almost sure convergence — not even to a random measure! In this thesis, we examine the relationship between the distribution of the partial sums and the resampling criteria. 
We give a decomposition of the bootstrap of the sample mean in the form of a stochastic integral. Then we develop an invariance principle explaining the behavior of the processes involved in the decomposition. When these processes are replaced by their limits in the stochastic integral, the integral obtained turns out to have the limiting distribution of the bootstrap for all a. This affords a general rep- resentation of the bootstrap limit law encompassing both the normal and nonnormal domains of attraction in one expression. We give the stochastic integral decomposition in Chapter 2. In Chapter 3 we introduce the invariance principle. The theorem is proved for the finite variance case as well as the a < 2 symmetric case. Chapter 4 gives a de- composition of the limiting distributions of the same form as that given in Chapter 2. Chapter 5 gives an alternate proof of Knight’s representation of the limit law (of course restricted to the symmetric case) using the invari- ance principle. We give some simulation results in Chapter 6 which suggest that in the a < 2 symmetric case the bootstrap of the sample mean actually performs very well. Chapter 7 contains some concluding remarks suggesting some of the value of the research as well as future directions. The appendix contains proofs of some of the more technical lemmas. Chapter 2 The Stochastic Integral Representation. Here we give a new decomposition of the bootstrap of the sample mean. Definition 2.1 For each pair of positive integers m,n, we define the follow- ing: (i) Z, = (Xm, . . . ,X(,,)) are the absolutely ordered observations WM 2 2. |X1n>|, so that X“) is the i‘h largest in absolute value. (ii) W" is the scaled partial sum process associated with (X1, . . . ,Xn): Int] W"(t)= ‘ 2X1, t 6 [0,1]. k=l (iii) (M ,;,,. .,M,‘,'m) is a multinomial (m, (3;, . . . , %)) vector independent of the observations (X1, . . . , X"). (iv) H (”'“l is the empirical distribution function of the centered multinomial vector: H‘m’x"’( $2210” rip—<2) Theorem 2.1 Let n be the number of observations, and let m be the boot- strap sample size. Then c (a;l XXX; — X.) | A?) = c (/tW" o HWth) | 3(2). k=l We will refer to W" 0 H ("“"l as the bootstrap partial sum process. It should be pointed out here that in reference to the bootstrap, since resampling with replacement has no dependence on the order of the data, m aim — X.) | 3?.) = 51:00: — X.) |X1,....X.). k=l k=l But this alternative conditioning is not valid for the stochastic integral rep- resentation; C(frW" o H‘m’”)(dr)|}(v,.) 51$ £(/r W" o H(m'")(dr) | X1,...,X,,). To see this, consider the following example. Example. Suppose the data comes in: X1 = 2, X2 = 1. Here we will let the resample size m = n = 2, so we will denote H (2'2) by H2. Ta- bles 2.1 and 2.2 describe the conditional distribution of 22.1100: — X2) and fr W2 o H2(dr) given (X1,X2) (with a; = 1). Thus, in particular, 2 P(Z(Xi - X2)=1|X1= 2.x. :1) =1/4, k=l but P(/rW2 0 H2011) .-. 1|Xl = 2,X. =1) = 0. But if we condition on (Xm, X (2)), then we must consider the conditional distribution of 2(X; —X2) and fr W" o H"(dr) given (X1,X2) = (1,2) and (X1,X2) = (2,1) separately, each with probability 1/2. In this case, we get the following two columns as possibilities for each of XXX; - X2) and fr W2 o H2(dr), each occuring with probability 1/2; see Table 2.3. Table 2.1: Conditional distribution of ZLAX; -— X") given the data. X; X; ram—X.) 1 l -1 l 2 0 2 1 0 2 2 l Table 2.2: Conditional distribution of fr W2 o f12(dr) given the data. 
Mg, M23 H2(—l) f12(0) H2(1) frW2 o H2(dr) 0 2 1/2 1/2 1 -1 l l 0 l l 0 I 1 0 l l 0 2 0 1/2 1/2 1 -1 Table 2.3: Conditional distributions given the ordered data. XXX; — X2) fr W" o H"(dr) -l l -1 1 0 0 0 0 0 0 0 0 l -l -l l Since the conditional probability of each of these values being achieved in both cases is US, it is clear that in this example, both 2(X; — X2) and fr W2 o H2(dr) have the same conditional distribution given the ordered sample (Xn), X(2)). Proof of Theorem 2.1. Let (M' m1,oo .,M;m) and Hm”) be as in the hypothesis. Notice that c (2.: X.( ;, — 1:) | Z.) = z: (for; — X.) l 27.). k=l k=l For real t let Almlu) = Z X.I( ;,,. — 73 g t). k=l n This process adds mass X,- at the points Mg”- — 12—. Then n m m n 2 x.(M;.. — 1’3) = Br — —)(2 MW... = 2) 1:1 71 r=0 n k=l = frA(m'")(dr). If we show that aglAlm'") has the same joint distribution with If. as the W" o Hlm'“) process, then the proof will be complete. 0 Lemma 2.1 For each m and n, (E; a;l/l('"'")(t), t E 13:405.; W" 0 11(m'")(t), t E R). Proof. Since both HIm'“) and AIm'") are right continuous and constant except for possible jumps at r — 13-, r = 0,1,...,m, it suffices to show that (YnmanoHIm’nlh— 3), r=0,1,...,m) (2.1) n =d (in; A(m'")(r — 31—), r = 0,1,... ,m). n 10 We do this by examining the increments. Condition on (Mgm . - . 141,7;71) and use the exchangeability of (X), . . . , X.) to show that amen-uh?) <32}; 2 X1”) 7‘ =0,l,...,m> (2.2) j=nH('"-")(r—1—§) r... (X); Z X,-I(M;U- = 0),...,: X,1(M,;, = m)). i=1 i=1 To see this, notice that nHlm'")(r — $3) = #{k S n : Mg. 5 r}, so that in particular nHlm'")(—l - %) = 0 and nHIm'")(m — 12-) = m. Thus, (2.1) follows by an application of a Borel-measurable transformation to both sides in (2.2). 0 Corollary 2.1 For any Bord-measurable function f, c (2:: mm... — 2)) Y) = c (/ f(r)W" o H‘“""(dr)| Y) . Proof. Notice that i x.“ .1 — 2) = / f(r)A‘"""’(dr) k=l and apply Lemma 2.1. Chapter 3 The Invariance Principle. Since Y. is a function of the partial sum process W", and conditionally on in, S; has distribution dependent on the bootstrap partial sum process W" o H (mm), it is clear that the behavior of S; jointly with )7" is dependent on the behavior of W“ o H ("h“) jointly with W". The following invariance principle helps explain this behavior for large n. 3.1 The Main Theorem. We define the Skorohod metric as it is defined in Billingsley [Bi168]. Definition 3.1 For each pair of functions a: and y in D[0,1], define the distance d,(.'c, y) as the infimum of all those values of6 for which there exists a strictly increasing and onto transformation A : [0,1] —* [0,1] such that ”A - 1H S <5 and Ill? - 1J0)” S 6, where t denotes the identity function on [0,1]. Theorem 3.1 Suppose H“ is a sequence of stochastic processes on R in- dependent of W" converging uniformly to H almost surely. Let W be the 11 12 homogeneous independent increments symmetric a-stable process with scale determined by 5,, —-»d W(1). (A) If X; has a symmetric distribution in the domain of attraction of an a-stable distribution (a < 2) and H is the distribution function of a discrete random variable which takes values in afinite set or a set which can be written in the form {d1,d2,. . .} such that d, < d2 < ---, then (We H",W")—+d(Wo 11, W) in the product space (D(R),ZJ) x (D[0,1],5) where U denotes the uniform topology on D(R) and 5 denotes the Skorohod topology on D[0,1]. 
(B) IfEX12< 00 then (We H",W")—».(Wo 1W) in the product space (D(R),L(1) x (D[0,1],L(2) where L11 denotes the uniform topology on 0(3) and U2 denotes the uniform topology on D[0,1]. Notice that in part (B), W is a Brownian motion. 3.2 Proof for the a < 2 case. Throughout this section we assume the hypotheses of Theorem 3.1(A). Definition 3.2 Let £1, £2, . . . be 2.2.d, P(€1=l)=P(€1=—l)=l/2. Let F represent the arrival times of a unit rate Poisson process as described in the introduction. 13 Let T1,T2,. .. be i.i.d., uniformly distributed on (0,1). Define W by W(t) = fury/“HT. g t), t 6 [0,1]. k=1 LePage [LeP80] showed that W is a homogeneous independent increments symmetric a-stable process. We start by exploring the way that a particular LePage-like representation W" with the same distribution as W" converges to W. Then we will use this convergence along with the almost sure uniform convergence of H " to [I to finish the proof. Let 1 - G be the distribution function of IXII. Let G“ be the usual inverse: For real y 6 (0,1), G"(y) = inf{:r : C(13) S y}. Define for each n and k = l,2,...,n, Y"). = a;‘G"(I‘k/I‘n+1). Notice that (€1Yn1, . . . , (”Ya") =4 0:1(Xu), . . . , X90). (3.1) As introduced by LePage [LeP80], we define random variables L? in such a way that the processes I (L? S t), j = 1,...,n facilitate scrambling of the ordered random variables, €1Yn1, . . . , 6,. Y..." t L',‘(t) = min{t: 735 L771}; < [nt] - 210' “L? S 1) L'-‘t J” _ n+1-j min{t:T- } j=2,...,n. Lastly, define W" to be the scrambled partial sum process associated with (‘51an, ° - ' téflYfln): Wu“) = z": cpl/"1.1(L2 S t) 16 [0,1]. (3.2) =1 14 W"(t) is constant on each of the intervals [fb 1%), j = 1,...,n, adding a random selection of one of the ckYnk’s at each of the times t = i, %, . . . , 1. Thus W" has the same distribution as W" in D[O,1]. We first examine the behavior of W" truncated to its N ——1 largest jumps. Let N—l N-I 143(1) = : ernJ-HL? g 1); and SN = Z emf/“10,50. j=1 j=l Proposition 3.1 For each N, Vfi —+ 5N as in the Skorohod topology. Proof. By definition of an, for each j, with probability one, Y,”- —> 17"“. Therefore the vector of ordered sizes of the N — l jumps of V; is approaching the vector of ordered sizes of the N — l jumps of SN. But the vector of locations of the N — l jumps of VA’,‘ is also converging to the corresponding vector for SN. To see this note that < -n is an L2(.7"¢)-Cauchy sequence because, for P > 71) Ema»? = 3:2 Z 172’“ k=n Andforp>q_>_n, Bum—nun“ = EEF(. ‘ Z arr?“ (21.30): k=q+l p 2 = E(s;’ Z 1‘; “1). k=q+l As p,q —+ 00, this goes to zero (by the bounded convergence theorem). Thus (f:(t))p>n is an L2(f})-Cauchy sequence which converges almost surely to T,.(t)/s,.. By completeness of L2(f}), T..(t)/s,. E L2(f}), and film-'13 T.(t)/s.. (3-7) Since E" is continuous on L’(f¢), by Lemma A.2 in the Appendix, E" (mo/s.) = lim Ecru ) P“°°° = 3;! Z I‘;"°E"c,.I(T,. g t) k=n = S;IZF;1/aql(Tk sn ZFk [n(l TkAs) —s(k 3)], k=n By (3.7) E(E}‘-(/P( t)— T.(t)/s.) ) —. 0. Therefore Efren) — Tun/8.)” _.,. 0. Thus there is a subsequence (m) such that E"(f.i’*(t) — T.(t)/s,,)2 -» 0 Hence by the Minkowski inequality for conditional expectation, More»? —) E" (T—g‘fl) n Thus by Lemma A.2 (in the Appendix) and its corollary, with probability one, Ef‘(T.(t)/s.)2= lim Ef‘s ’2 (Z cJ-FJ-VOHTJ _<_ t))2 J=n 2 3"2 lim Ef‘(ZeJ-FJ I‘7'Jl/°I(T <3) +:6J~F Iii/ONT 6(1tll)8 k-ooo 1:" J: —n PI: PI: = 8313?;(2 17W?)- s .) + 5: P;”°E"I 0 almost surely, V. is increasing almost surely. 
Concerning (c), the set [2:1,] F;2/° < 00] has probability one. On this set, fort < l ff,’(t) —+ V.(t). Since EV.(1) < oo, f:(1) —» V.(l) almost surely and f,‘,’(t) is ft—measurable for all t, V. is the almost sure limit of adapted continuous paths. By completeness, V. is predictable (see Metivier [Met82]). [3 Proof of Proposition 3.2(continued). Now we check that the conditions of Theorem 3.2 are satisfied. Relation (i) 18 clear. For (ii), by (3.2) and the MCT for conditional expectation, mtg/3.), = E(19:I‘(_.;2 Z r;”°zn(1 - T. A t))) k=n = Es.( -2 Z P-s/aEr(_1n(1— T). A t))). k=n But the processes {FJ} and {TJ} are independent and E(—ln(1 — T). A t)) = fol ln(l -- u A t) du = t. Hence (T /3. =tEZ(I‘ r;2/°/s3.)- k=n Lemma A.4 shows that 02((T./s.),) —) 0 as n —r 00. Thus, Chebyshev’s inequality gives (T./s.), —+,, t, for each t. Finally, Lemma A3 in the Appendix shows that F;2/°’/s,’, -—+ 0 with probability one. By the bounded convergence theorem (iii) follows. 0 Corollary 3.1 sup |T.(t)| -—+,, 0. Proof. Noting that sf, -> 0 almost surely, this follows by Proposition 3.2. D 21 Notice that this corollary is equivalent to the statement: In the uniform metric, Ee,r;"°1(:r, g .) _., W. (3.11) i=1 Now we will work on finding an apprOpriate bound for the tail sums of the processes W" described earlier. Define n (JR/(i) = Z GYMNL? S t); s}... = 2 Y3,» j=~ j=N We will apply a method similar to that which was used in PrOposition 3.2. Notice that here we must control the starting and ending indeces, whereas before we only needed to keep track of the former. Proposition 3.3 Indexed by t, URI/3N. is an Lz-martingale, with condi- tional variance " " ["‘1‘11L'P>5 (UN) “size.- 2 ———(’ ")- t Proof. For each n,t let I. = .(r,,.,1(L; s u)... 31.13113 n}. Notice that the process (U fi/sN.(t)) is adapted to (flu), and that for each n, j, and t, s”. and Y.; are measurable .77... By Lemma A.2 (Appendix) E’MUMt) = 2; E’MY..,-e,-I(L; g t) J: = Z YnjEfmchL? S t) j=N = Z YnJ'CJ'HL? S .3) j=N = UMs). 22 Also, for each t, E(Uhltl/Sanz -- E(Er..2;.(iy.,.,1(1;gt))2) =N _—. E(sN'f, J; Y,3.1Er 1(1; 3 1)) S Ean 2Y3 = lk/n) _ 8N1! 71;] n _ k k 0 j=N 23 [nt]-1 [(Ln> _) : SNnjzzYnz 12 fi. [:1 Proposition 3.4 For each t in [0,1], the following relations hold: E(ghf) = Inn—t] (3.12) Var<£j§i>t g 2E(::’:)+;%:t—E—fi. (3.13) Proof. By Proposition 3.3 n 2[nt]—1 [(1131 > 5)) El”) = when. 2 .-. 8N” ‘ j=N k=0 ’1 "“1" P111: > *- III‘) = an 2Y1? j: J _ n =N n lc k=0 _l___ntlE " X'iE.”i) _ [Iii] — it Also, 2 n n 2 var<”~) Ea) w— an t 3N1: t n Fix N and n and for eachj = N,...,n, let (.11-. I(L'-‘ > 5) P1=Y.,-2/3~.; 01': Z ‘75:?”— 11:0 Then EV"): = Egr(:.,.c.)“ 3N1: t 24 = E: 2:1).ij (CC) 1=N )— Notice that all the Cj’s are independent of P, so EP(C.-C,-) = E(Cng). Ifi = j, then “-1 W1 1 I(L;-‘> I(L'-‘ > K) = 4:; 2., :5) nu.) |/\ EZZ lc=0 u=0 ”(n-1n 11( Ln>5 Ln) >)n)) (n—k)(n—u n-ln-l P( Lu > k_\_/_u =22 k=0u=o(n-k) ("2'") “'1"- n(-—k)kA(n—u) =ZZ( Ic=0u=0n(n'_"c )(n—V) 1 n-1 n-1 = nZZ(n-k)v (n-u) nk=0 v=0 l"'12(n—k)—l = —2 <2. nk=0 (n—lc) " Ifi #j then EC;C‘ equals [ntl- --ij(L >_ kvtL > k) ["']"2["‘11P(L,->—",.-L > u) z 3.2.): +22: 2: k=0 u=k+l (n_ k)(n — V) k=0 [nil-l [nt]- 2 [Ml-l = (n—k)(n-k 2):“ (n-u)(n-—k—1) Z;( n()n—-n—-1( +2gu=zk11n(n-l)(n—lc)(n—V) 1 [nt]- 1 [nt]-— 2 [Ml-1 < 1 2 — nn—( 1)( Icz-O + lg) ugl 1) [nt]2 n(n—1)° 25 Thus E(Svi): g E(éjzfl) 2)+2 $1?“ij [m]: ___») S E(QPN+n(£:1t_]21))- Observing the definition of my and subtracting the squared mean, the propo- sition is proved. 
Ci Proposition 3.5 There exists a sequence (no(N)) such that Yn’(N)N 3Nn’(N) for every sequence (n’(N)) such that n’(N) Z no(N) for each N. -+ 0 a.s. -l/a Proof. Since for each j, Y",- —+ 1‘; a.s., we have for each N, YnzN 11-2/0 2 _i 2N -—2/aa' Z‘N+l Yr!) ZN.“ Fj Hence we can choose no(N) Z 2N large enough that for n 2 no(N), Y3” 1332’“ P | — _ a|>2"”)<2-”, ( 2% Y3, 2%.“; 1‘,” Fix a sequence (n’(N)) such that n’(N) 2 no(N) for each N. By a simple application of a Borel-Cantelli lemma, as N —> oo Ynimgzv P-2/a ENHY mm» 2%?” 1‘ I 2’ ° By the strong law of large numbers, I‘"’° 1‘3”“ ___ (mm-”a 2,91, 17”“ " NI‘;”° N(F2~/N)‘2/° -+ 0 a.s.. (3.14) —-+ 0 a.s.. (3.15) 26 Combining (3.14) and (3.15), we see that 2 Yn’(N)N N —* 0 a.s.. £lV+1 Ynz’(N)j But since n’(N) 2 2N, Yylimw < Y732’(N)N _ N ‘ sivn'm) 2i!“ Ynzluvb' Proposition 3.6 For each N, as n -+ 00, Ill/,3 o H" — SN 0 H” —+ O a.s.. Proof. Fix N. Let BN equal the set on which N VIM-TA —> o, i=1 N VlYnj-F-‘l/al -’ 0, J i=1 T,#H(t)foralli21 and tER, llHn-H” —> 0. By hypotheses, P(B~) = 1. We claim that ||V,{,‘ o H” - SN o H" -—+ 0 on BN. To see this, fix a: 6 EN and let 6 > 0. Suppress the argument w from the following relations. By the assumptions on H, there exists K such that for k2 K, H(dk) Z H(dK) > max{T,- : i g N}. Thus, we can choose 5 > 0 such that e < A IT.- — H(t)|. gyms}? 27 Fix M’ such that for n 2 M’ N V lLy—le 0 arbitrarily, so we are done. 0 With existence guaranteed by Propositions 3.1 and 3.6, choose n1(N) _>_ no(N) (no is defined in Proposition 3.5) such that for n 2 n1(N), P(||V,(,‘oH" —5,.oHu > N“) < N“, (3.19) 28 and P(d,(V§,SN) > N“) < N". (3.20) Define N(n) = max{N 21:n,(N)g n} V 1. Lemma 3.3 Vii/‘01) —>,, W in the Skorohod topology. Proof. Let (in) be a subsequence. Since n1(N) _>_ no(N) _>_ 2N —+ 00, we can choose a subsequence (nk’) such that N(nk,) < N(nk,.+,) for each j. Let 6 > 0. Now m.) 2 n1(N(nk’)) by definition of N(-), and forj large enough, N(nk,) > l/varepsilon, in which case i P(d.(v;.:.,,,,. W) 2 25) g P(d.(v,,";;,j,,5~(.,,1,) > e) + P(d,(SN(,.,j), W) > e) S 1 N(nkj) + P(d,(S~(MJ_), W) > 6), Since N(nk,.) —v 00, the claim is proved. 0 Lemma 3.4 U Run) / shun)“ converges in distribution to Brownian motion with respect to the uniform topology on D[0,1]. Proof. As in the proof of Lemma 3.3, choose subsequences (nk) and (nkj). Again we use Theorem 3.2. By Proposition 3.3, U 3100/ sNWn is an Lz-mar- tingale. Surely U N(n) /sN(,,),,(O) E 0. Also, since "1:, _>_ n1(N(nk,-)) Z no(N(nk,)), Propositions 3.4 and 3.5 give us that the conditional variance condition is sat- isfied. Finally, Proposition 3.5 gives us that the expected squared maximum jump also converges to 0. C1 29 Theorem 3.4 W" —>,, W in the Skorohod topology on D[0,1]. Proof. Notice that W" = V§(n)+U}(‘,(n). By the corollary to Proposition A.1, sN(,,),, —+, 0. Therefore, by Lemma 3.4, ”Ugh," -i,, 0. By definition of d, it is easy to show that if d,(;r,,,;r) —> 0, and ||y,,|| -+ 0, then d,(:c,, +y,,,:r) -> 0. Apply Lemmas 3.3 and 3.4 to complete the proof. Cl Proposition 3.7 W" o H" —i,, W o H in the uniform topology on D(R). Proof. For any n, ”W" o H” — w o I!” s ”W" o H" — v,:;(,,, 0 mu 1:) “ll/’3‘") o H" _ SM“, 0 H|1+1|S~(n)o H — Wo HI). A3 A3 A) S "W" — Vfiwfl = ||U,’{,(,,)|| —+, 0 as was shown in the proof of Theo- rem 3.4. A3 S “VNM — W” = IITN(,,)|| —i,, 0 by (2.2) because N(n) -+ 00. Also, since n1(N(n)) _<_ n, and N(n) —v 00, A2 —i,, 0. C1 Combining Theorem 3.4 and Proposition 3.7, (W" o H". W") ->. (W o H. W) in the prescribed space. 
Since these processes have the same joint distribution as (W" o H“, W"), Theorem 3.1(A) is proved. U 3.3 Proof in the Finite Variance Case. Assume the hypotheses of Theorem 3.1(B). Here EX 2 < 00 so a,, = 121/2 and Donsker’s Theorem applies: 0 30 W" _’d W, under the uniform topology on D[0,1], where W is Brownian motion (see Billingsley [Bi168]). Let W" be a process with the same distribution as W" such that W" converges uniformly almost surely to a continuous Brownian motion W. lwnoH" — W0 HHR S ||W"0H" — WoHnlln+ ”WOH" - WOHIIR 5 “W" — Wll(o.11+||W o H" — W o 11“,.t By construction, the first term on the right converges to 0 almost surely. With probability one W is uniformly continuous on [0,1]. Since H " —i H uniformly almost surely, the last term converges to 0 almost surely. With this representation, we have (W“0H“, W“) —+ (WoH, W) uniformly almost surely. Therefore, (w" o H", W") a. (w 0 1!, W) under the uniform topology in each coordinate. D Chapter 4 The Limit Laws. Under the usual resampling scheme, (i.i.d. resampling from the data), when we replace the processes involved in the stochastic integral decomposition by their limits obtained in the invariance principle, we get the limiting distribu- tion of the bootstrap. Throughout this chapter, let M ' be a Poisson (mean 1) random variable and let H(z) = P(M‘ — 1 S x). 4.1 Infinite Variance (a < 2): Symmetric Case In light of the results of Bickel and Freedman [BF81] and the result (1.3) by Knight [Kni89], and the invariance principle, the following theorem shows why the stochastic integral decomposition may be the natural way to view the bootstrap. Let 6,1‘, and W be as in Definition 3.2. Let M{,M2‘,... be i.i.d. ~ M“ independent of c, F. 31 32 Theorem 4.1 £(ZekP;1/°(M; —l)|ckI‘;1/°,k 21) k=1 = z: (/°° two H(dt)|ck[‘;‘/°',k _>_ 1) . Proposition 4.1 shows that this decomposition of the limit law carries over to the finite variance case. Lemma 4.1 With probability 1, Z c r;‘/°( (M; 4:215: c r; ‘/°I(M;— 1: r). (4.1) k=lr=_1 k=l Proof. Let Y = 2:11 ckF;1/a. Since Y is symmetric a-stable, it has charac- terisic function exp(—a°‘|0|°’) for some a > 0 (see [ST89, Definition 1.1.4]). Consider the partial sums K 00 Z r 2 ckI‘;1/°I(M; — 1 = r). (4.2) r=—l k=l By Theorem 3.3(A) and (B), for each 1' 2 —1, ZerFZ1/°(M;— 1)I(M,:-1 = r) := chkF;1/°I(M; —-‘1 = r) k=l k=1 :4 r(P(M,‘ —- 1 = r))1/°Y. By Theorem 3.3(C), as r varies, the inner sums in (4.1) are independent. Therefore, the characteristic function of the left hand side in (4.2) is K K II exp(-0°IT(P(ME-1 = r))”"’49|") = eXP(-0"|9l°' Z lr‘l"'1"(1‘41"-1 = 7‘)). 73-1 r=-l As K —> 00 this tends to exp(—d°|9|°E|M1‘ — 1|“). This is the characteristic function of (EIMf' — 1|“)1/aY. Thus, by the Continuity Theorem, as K —) 00, K 00 Z r )3 airy/“HM; — 1 = r)—+d(E|Ml'—1|°’)1/°'Y. (4.3) f=-l k=1 33 Since the left hand side is the partial sum of independent random variables, the sum must also converge almost surely (see [Bre68, Prop. 8.36]). By Lemma A.1 applied K + 1 times, the double sum in (4.3) equals 22;, arr/7M; - l)I(M; — 1 S K). But, by the same lemma, the left hand side in (4.1) almost surely equals :3 e r;‘/°'(M; — 1)I(M,; — 1 s K) + 2 arr/WM; — 1)I(M; — 1 > K). k=l By Theorem 3.3(B) the term on the right has the same distribution as (Ele—1|°I(M,'- 1 > K))‘/°'Y, which converges in probability to zero as K —+ co, ZrZekr;"°1(M; — 1 =r),):e,.1‘;‘/°(M;— 1). (4.4) r=-l k=l Coupled with the almost sure convergence of the left hand side of (4.4) al- ready established, the lemma is proved. 
C1 Proof of Theorem 4.1. By definition of W and Lemma A.1 co L:tWO H(dt) = Z riekf‘zl/aHTk E (H(r —1),H(r)]). r=-1 k=l Hence, by Lemma 4.1, it is enough to show that (:e.r;"°1(M; —— 1 = r),r 2 —1;e,.r,,,k _>_1) k=l =d <2 arr/GUT), E (H(r -1),H(r)]),r Z —1; ed}, k 21>. k=l For each k, (1(M; — 1 = r),r _>_ -1) :4 (1(7): 6 (H(7‘ - l),H(r)]),r Z —l). Also, for different k, these processes are independent of eachother. Since both processes are independent of e and F, Theorem 4.1 follows. Cl 34 4.2 Finite Variance Case. The following proposition shows that the above representation for the limit law carries over naturally to the finite variance case. Proposition 4.1 If W is a Brownian motion then ft W o H(dt) has the standard normal distribution. Proof. To see this, look at the characteristic functions again. The stochastic integral can be written as Z” r[W(H(r)) — W(H(r-))]. By independence r=-1 of the increments of Brownian motion, its characteristic function is lim fi exp(_r2t2[ll(r) - H(r—)]) n-ooo r=-1 2 2 n = lim exp(—% Z r2[H(r) — H(r—)]) n—ooo r=—l = exp(—£2-/r2 dH(r)) = exp(-t2/2). This is the characteristic function of the standard normal distribution, so the assertion is proved. D Assume W is a version of Brownian motion with continuous sample paths. Since conditioning on the ordered jumps of such a process provides us with no additional information, £(/tWoH(dt)| orderedjumpsofW) =£(/tWoH(dt)). By applying Bickel and Freedman’s results [BF81], using the proper scal- 1/ 2a', we see that in the symmetric case the result concerning a ing, an = n distribution in the domain of attraction of an infinite variance stable random variable can be viewed as an extension of what was already known in the finite variance case. Chapter 5 Knight’s result follows in the symmetric (a < 2) case. Here we focus on the case that the resampling is the usual simple random sampling from the original data with the resample size in" equal to n. As usual, denote H (am) by H ". The distribution on S; conditional on X1, . . . ,)(fl is random. We wish to show that this random distribution converges in distribution (with respect to the weak topology on the space of bounded measures) to the random distribution of 00 Z “PEI/”(Mi - 1) k=l conditional on c and I‘. This is the result which Knight proved. We will show that it follows using the proof of the invariance principle proved in Chapter 3. Each path a: in D[0, 1] has only countably many jumps. Let Ax denote the sequence of jumps of x ordered by absolute value from largest to smallest. Throughout this section, let W", W be as defined in Chapter 3. We will use CF to denote the process (air/G, k _>_ 1), so that the a-field generated by AW coincides with the a—field generated by 6P. 35 36 When put into our framework, our goal is equivalent to showing £(/rW"oH“(dr)|AW") -+d£(/rWoH(dr)|AW). (5.1) Since the function H on the right hand side above is deterministic, it may seem that when we condition the integral on the right by AW that the dis- tribution becomes degenerate. However, 0(AW) only contains information about the magnitude and directions of the jumps of W — nothing about their locations. We point out here that H ", H satisfy the hypotheses of Theorem 3.1( A). The only part which may not be obvious is stated in the following proposition. Proposition 5.1 H“ —+ H uniformly almost surely. Proof. Here, let AH"(t) and AH(t) denote H"(t) — H"(t—) and H(t) — H (t—) respectively. 
By Scheffe’s Theorem [Bil86, Theorem 16.11] it suffices to show that for each t AH"(t) -+ AH(t). (5.2) But for each t E(AH"(t)) = P(M,:l — 1 = t) —> P(M,' - l = t) = AH(t), and gamma) = 192% 2:“ I(M;, — 1 = 1))2 — (P(M;, — 1 = t))?. (5.3) The first term on the right in (5.3) equals n—l %P(M,:l—l=t)+ P(M;l_1:t,M;2—l=t). The first term here is asymptotically negligible. And for each integer k in (0,1,. . . ,n} by Stirling’s formula, [Bil86, Problem 27.18], 37 P(M;1= k,M,','2 = k) = k!k!(n — 2k)! ' n" nn+%e-n(n _ 2)n-—2k (k1)2(n - 2k)"-2*+%en-2knn — 6-1)2 -2k+2( n )1/2( Tl _ 2 )n—2k _ (I'— e n - 2k n — 21: e" 2 _i (721-) ' If lc is replaced by t+ 1 we see that this limit equals the limit of the last term on the right in (5.3). We have verified the sufficient condition in (5.2). U By Daley and Vere-Jones [DV88, Prop. 9.1.VIl(i)], it suffices to show that for bounded uniformly continuous f on R, an] r w" o Ham)" mm w an] r W o mam)" AW]. Proposition 5.2 If f is a bounded measurable function on R then em] 1' w" o H"(dr))|| AW") :1 E[f(/ r W" o H"(dr))|| AW"). Proof. We show that there is a measurable function T such that the left and right hand sides above equal T(AW") and T(AW") respectively. Then we invoke the fact that AW” =4 AW“. (See (3.1).) Let f be any bounded measurable function. Let 9 denote a measurable function such that g(W", H”) = fr W“ o H"(dr). Let [In be the set of all maps 7r taking (2:1, . . . ,xn) into partial sum processes of the form [Ml <2 1,“) : t6 [0, l]> k=l 38 where 1r is a permutation of n elements. Define h on the set of n‘h-order partial sum processes by h(w) = Ef(g(w, H")). By the Fubini theorem, h is measurable. Also Em] r W" o H"(dr))u AW") = Emma. H“))IIAW"] = E1E11n-l e A 39 n-l (‘J "M =9 Y'U) .1] = PK: e,y,,,-1(L; g H“(2:))> '=l 1:=-l = f(€1Ynlv ' ° 7 CnYnn) where f is just the name we give to the function which evaluates the previous line at the vector-value (elyn1,. . . ,enym). The right hand side in (5.4) equals n-l P[(Z Canj1(L? S H"(I))> 6 AIICIF1,€2F2,. ..] j=l I=-l n 71-! = P[ e A||61F1,...,c,,l‘n,l‘,,+1 i=1 "- ‘J=¢1 1‘71 =[‘, = PK:EiailG—le/‘YMOKL? 5 Hn(x))>,=:1 6 Al = f(e1a;‘G'1(I‘j/Pn+1), - - - . CadilG'I(Fj/Fn+1))- Since Y",- = a;'G“‘(I‘j/I‘n+1) for each j, we are done. 0 Corollary 5.1 If f is a bounded measurable function on R then Em] r W" o H"(dr))n AW) =. an] r W“ o was)" AW]. In light of Proposition 5.2 and Corollary 5.1, our task is reduced to show- ing an] r W" o was)" AW] _., an] r w o H(dr))|l AW]. We will show in fact that the above convergence occurs in probability. The proof of the following proposition will be the main tool in showing this con- vergence. 40 Proposition 5.4 For 6 > 0, P[|/rW" o H"(dr) — ero H(dr)| > 36 HAW] —>,, 0. Before we directly prove this statement, we will first state and prove a few lemmas. Lemma 5.1 ”ll/.1,W "<-°H"dr) [:rWonr) >6IIAW]—»,,0. (5.4) Proof. By Proposition 5.3 a]: .- W" o H"(dr)|AW") = c(/T r W" o H"(dr)|AW). By Proposition 5.1, and the proof of the invariance principle, W" o H" —*, W o H in the uniform topology. Since H", H change only at —1, 0,... , T in (—oo, T], T ~ 1' f r w" o H"(dr) —+,/ r w o H(dr). Using the Markov inequality, it is easy to show that for any collection of random variables, X, X1, X2, . . ., and any sub-a-field 9’, if X" —+, X then Xn —» X in g-conditional probability. That is, for any 6 > 0, P[|X,, — XI > sue] —+p 0, Thus (5.4) holds. 0 Lemma 5.2 For any 6 > 0, as T —* 00, P“ j: r w o H(dr)| 2 an AW] _. 0 a.s.. (5.5) 41 Proof. 
P °° W d >5 W 11/, r oH< r>1_ 11A 1 = p“ Z 5,,1‘;‘/°(M;— 1)I(M,;— 1 2 T)| 2 6 || AW] k=l < 13KB C1.1“.Z""(M,:—1)1(M,:— 1 2 T))2 k=l - 52 ‘Fl f: F;2/°)E[(M,' — 1)21(M; — 1 2 T)||cI‘ . k=1 1 m _<_ —((Z 5.13:1“)2 + 2 62 k=l The expectation term is a sequence of real numbers converging to zero the statement is proved. D For each T, define 91 on R by gT(a:) = (:1: — 1)I[T,°°)(:c — 1). Lemma 5.3 There exists a tight sequence of random variables 5,, such that for all 5 > 0, P“ / rW" o H"(dr)| 2 6||AW] < .1. _ 62 {nEg;~(M;1) a.s.. Proof. By Corollary 2.1, P11 / W?" o Warn 2 6IIAW] = P11 2 e.Y..gT(M;.)I 2 ans-14., .- _<_ n] k=1 1 3'5 |/\ EKZ ekYagflMJszlléanjJ S "l k=l IA 1 E(Cfl +dfl)i where cu = (ZZj¢k€j€kynjynnElngMril)gT(M:i2)ll€jYnj’jSnl’ d. = (:Y:.)ELq%(M;.)ne.-Y..,jSn]. k=l 42 Notice that the expectation term in c" is bounded by Eg§~(M,:,) by the Cauchy—Schwarz inequality. Also 1: 2 n ZZi¢k€j€kYfljYflk = (Z {kl/uh) — z: Yn2k' k=l k=1 The first of the terms on the right has the same distribution as (a; 1 22:, X k)2 which converges in distribution by hypothesis. Thus as a sequence of ran- dom variables it is tight in n. The second term converges in probability to 22;, F?“ by Lemma A.6 in the Appendix. Therefore (2 Zj¢ijEkYannk) must be tight in n. To complete the proof, let 5n = lzzj¢.€i‘kyniynk + Z Ynzk. C1 k=1 Lemma 5.4 For each T, Eg§~(M;,) —> EgflMf). Proof. Since we can recall the well-known weak convergence of the binomial (n, A/ n) to the Poisson (A) and apply the square of the bounded continuous function h;- on R defined by hT(:t) = (:1: - l)I[_.j,1-)(:z: -1)+ TI[T,OO)(I - l) + I(-oo,1)(1: - 1) it suffices to show that E(M;l — l)’ —i E(Ml‘ - 1):. By the weak convergence of (Mg, -1)2 to (M; —1)2, we need only establish that {(M;l - l)2 3;, is uniformly integrable. Apply the computation of the centered fourth moment of a binomial given by Lehmann in [Leh83] to get E(M;l_1)4=3(n—1)2+n—1(1_6(n-—1)). n n n2 It is easily seen that the right hand side is less than 4 for all n. D For intervals A let z..(A) = [A rW"oH"(dr) Z(A) = [A rWoH(dr). 43 Proof of Proposition 5.4. We will show that for any n > 0, “”1an P( PllZn(R) - Z(R)| Z 35 II 6F] 2 377) S 271- (5.6) First notice that Zn(—oo, —l) = Z(—oo, —1) = 0 a.s.. Also, for any n, P( Pllznl-1.°0) - Zl-1,00)| _>_ 35H 61‘] Z 371) S P(Pllan_1le_ Zl-ltTll Z 5|! 6F] 2 n) + P( PllZn(T,00)l 2 5H 6F] _>_ 17) + P( Pl|Z(T,<><>)| _>_. 5H 6F] 2 0)- By Lemma 5.2 we can choose To large enough that the third term on the right is less than 17 for T 2 To. Consider the sequence of random variables (5,.) provided by Lemma 5.3. Since 6,, is tight there exists B such that P( |£,,| 2 B) < 17 for all n. F ix T _>_ To such that 1762 Z 2BEg’(Mf). Then choose nT large enough so that for n 2 n1, Eg§-(M,f,) S 2Egz(Mf). Then for n 2 n7, P( P112.(T.oo)l 2 611d) 2 n) _<_ age/59am.) 2 n) S ’7- By Lemma 5.3, (5.6) holds. 1:] Proposition 5.5 For bounded uniformly continous f E[f(Z..(R))||AW] -*p E[f(Z(R))||AWl- Proof. Let 6 > 0. Since f is uniformly continuous, there exists a 6’ > 0 such that |f(a:) — f(y)| < 6/2 whenever la: - yl < 6’. Define A5: to be the set 44 [12,,(3) — 2(3)) < 6’]. Then P< 1121mm» — menu 4‘11 2 a) s P(|E[(f(Z.(R)) - 1(Z(R)))I...uer11 2 6/2) + P(IE[(f(Z.(R)) - nz>n,.n e111 2 6/2) = P(|E[(f(Z.(R)) — ((2(R)>)1.;,11er11 2 6/2) 3 arms ”61"] 2 6/(4llfll)) = P< P112. — 202112 6'1 er) 2 6/(4llfll) >. This last term converges to zero by Proposition 5.4. 0 Now apply the proposition described from Daley and Vere-Jones [DV88] to complete the proof. 
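Knight's representation also lends itself to direct simulation, which illustrates why the limit law is random: holding one realization of the sequence $(\epsilon_k \Gamma_k^{-1/\alpha})$ fixed and drawing fresh Poisson(1) multipliers yields draws from one conditional limit law, while regenerating the sequence changes that law. The following Python sketch is illustrative only; the truncation level and the function names are choices made here, not part of the dissertation.

```python
import numpy as np

def lepage_jumps(alpha, k_max, rng):
    """Truncated LePage sequence eps_k * Gamma_k^{-1/alpha}: Gamma_k are the
    arrival times of a unit-rate Poisson process, eps_k are symmetric signs."""
    gamma = np.cumsum(rng.exponential(size=k_max))
    eps = rng.choice([-1.0, 1.0], size=k_max)
    return eps * gamma ** (-1.0 / alpha)

def knight_limit_draws(jumps, n_draws, rng):
    """Given the (fixed) jump sequence, draw from the conditional limit law
    sum_k jump_k * (M*_k - 1), with M*_k i.i.d. Poisson(1), as in (1.3)."""
    m = rng.poisson(1.0, size=(n_draws, jumps.size))
    return (jumps * (m - 1)).sum(axis=1)

rng = np.random.default_rng(2)
jumps = lepage_jumps(alpha=1.2, k_max=500, rng=rng)   # truncating at 500 terms is a practical choice
cond_law = knight_limit_draws(jumps, n_draws=10000, rng=rng)
# Re-running with a fresh `jumps` sequence gives a different conditional law:
# the bootstrap limit is random, as in Athreya's and Knight's results.
```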
Chapter 6

Simulation Results.

For random variables which have infinite variance, we found that in the symmetric case the bootstrap of the sample mean does not perform so badly. In fact, in some ways, the method gives better results than it does in the finite variance case.

For various indices of stability $\alpha$ and confidence levels $\gamma$, we simulated observations of random variables $X_i$, symmetric about $\theta$, in the domain of attraction of an $\alpha$-stable random variable, and applied the bootstrap of the sample mean to create symmetric $\gamma$-confidence intervals for $\theta$ in the following manner. For a given sample $\mathbf X_n = (X_1, \ldots, X_n)$, the confidence interval $C_\gamma(\mathbf X_n)$ is given by
$$C_\gamma(\mathbf X_n) = [\,\bar X_n - T_\gamma(\mathbf X_n),\; \bar X_n + T_\gamma(\mathbf X_n)\,],$$
where $T_\gamma(\mathbf X_n)$ is estimated as a quantity which satisfies
$$P[\,|\bar X_n^* - \bar X_n| \le T_\gamma(\mathbf X_n)\;|\;\mathbf X_n\,] = \gamma.$$

A surprising observation was that the empirical coverage of this method was consistently higher than $\gamma$ for $\alpha < 2$. Figure 6.1 shows the observed coverage of the bootstrap method, applied 1000 times for each value of $\alpha$, with $\gamma = .95$ confidence, sample size $n = 50$, bootstrap resample size $m = 50$, and 500 bootstrap observations. We used Monte Carlo simulations with $X_i \sim \epsilon U^{-1/\alpha}$, where $P(\epsilon = 1) = 1/2 = P(\epsilon = -1)$ and $U$ is uniform on (0,1). Notice that for $\alpha > 2$, $F$ has finite variance, and hence it is expected that the coverage should be approximately .95.

Figure 6.1: Coverage of bootstrap method for various $\alpha$.

Consider the confidence radii obtained by the above method, scaled by $n^{1-1/\alpha}$, because $S_n^* = n\,a_n^{-1}(\bar X_n^* - \bar X_n)$ and $a_n \sim c\,n^{1/\alpha}$ (see Feller [Fel71]). In the finite variance case, the bootstrap distribution of the scaled and centered bootstrapped sample mean converges weakly to a fixed (normal) distribution almost surely. Since the limit distribution is continuous, the confidence radii given by the above method and then scaled as indicated converge almost surely to a fixed number as the sample size $n$ tends to infinity.

But by Athreya's early results [Ath84] we should not expect this phenomenon to occur in the infinite variance case. Our simulation results exemplify this. Figure 6.2 shows a frequency histogram of the observed confidence radii (logarithmically scaled) with $n = 50$, $\alpha = 1.0$. The vertical line represents the logarithm of the radius necessary for an unconditional confidence interval with confidence level equal to the coverage observed by applying this method. Figure 6.3 shows more of the same phenomena for various values of $\alpha$.

Figure 6.2: Distribution of bootstrap confidence radii for $\alpha = 1$, $n = 50$.

Figure 6.3: Distribution of bootstrap confidence radii for various $\alpha$ (panels for several values of $\alpha$ ranging from about 0.5 to 3.0, each with $n = 50$).

Figure 6.4: Distribution of bootstrap confidence radii for $\alpha = 1$, $n = 200$.

As expected, even as $n$ gets large, the distribution of the scaled radii obtained by this method is dispersed apparently continuously over the positive real axis. Figure 6.4 shows what happens when $n = 200$.
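For reference, the interval construction used in these simulations can be sketched as follows, with $T_\gamma(\mathbf X_n)$ approximated by the empirical $\gamma$-quantile of $|\bar X^*_n - \bar X_n|$ over bootstrap resamples. The function names and the particular quantile estimator are choices of this illustration, not taken from the dissertation.

```python
import numpy as np

def bootstrap_symmetric_ci(x, gamma=0.95, n_boot=2000, rng=None):
    """Symmetric bootstrap interval [x_bar - T, x_bar + T], where T is the
    gamma-quantile of |mean(X*) - mean(X)| over resamples drawn from F_n."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    x_bar = x.mean()
    resamples = rng.choice(x, size=(n_boot, n), replace=True)
    t_gamma = np.quantile(np.abs(resamples.mean(axis=1) - x_bar), gamma)
    return x_bar - t_gamma, x_bar + t_gamma

# One replication of the Chapter 6 setup: X = eps * U^{-1/alpha}, symmetric about theta = 0.
rng = np.random.default_rng(3)
alpha, n = 1.0, 50
x = rng.choice([-1.0, 1.0], size=n) * rng.uniform(size=n) ** (-1.0 / alpha)
low, high = bootstrap_symmetric_ci(x, gamma=0.95, rng=rng)
covered = low <= 0.0 <= high   # repeat over many samples to estimate coverage
```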
More is observed. Since the confidence widths generated by the method have such a wide distribution, we examined how well the procedure compares with the procedure which simply uses the unconditional distribution of the sample mean about 0. That is, we estimated T, such that P(an_0l S T7)=7' Since the st are in the domain of attraction of the stable random variable Y, we can do this not only by Monte Carlo simulation, but also by using the quantiles of Y, because 5,, —>d Y (see (1.1)). The results were startling. For a very small proportion of the applications, the confidence radii obtained by the method T,(Xn), were larger than T, But for the complement, the T,(Xn) was extremely small compared to T.” 51 Table 6.1: Analysis of sizes of bootstrap confidence radii. c Observed proportion of times T95(Xn) < cT963 1 .940 1/2 .881 1/4 .770 .107 .500 suggesting that the bootstrap performs well. Example. Here is an illustration of how the bootstrap confidence radii compare with confidence radii obtained using the unconditional distribution of the sample mean. We ran a simulation with 1000 observations for a = 1, n = 50, and 2000 bootstrap resamplings for each observation with m = 50. The observed coverage of bootstrap method was 0.968. The radius about sample mean necessary for unconditional confidence of .968 is T968 9- 32.13. The implications of these results could be very far reaching. 94% of the times the method was applied, the radius of the confidence interval for 0 was less than the radius necessary to give confidence equivalent to the empirical coverage obtained by the method. Maybe more substantial is how much smaller the observed radii were. Half of the time the bootstrap confidence interval radius was less than about a tenth of the radius necessary for unconditional confidence. More needs to be studied in this direction. The invariance principle proved in this paper will help to explain the phenomenon. Chapter 7 Remarks 7 .1 Other resampling plans. The stochastic integral representation given in Chapter 2 together with the invariance principle proved in Chapter 3 provide a new way of studying the resampling problem. One of the problems with the usual bootstrap in the infinite variance case is its inconsistency. We have examined resampling plans by means of multipliers different from the usual multinomial multipliers in an effort to produce a consistent “bootstrap” for the long tailed case. One such method is to define multipliers 6" = (6?, . . . ,63) whose distri— bution satisfies the following conditions: (a) For each i, P(6? = —1)=1/2 = P(6?=1); (b) 22:1 61: = 0i (c) Each permutation of the components of 6" has the same distribution. One of the virtues of the usual centered bootstrap is its location invari- 52 53 ance; if X,- = Y,- + 0, then 11 XX“ rile—1): ZYk( .ik'll k=l k=l because 22:1(M;k — 1) = 0. This method shares this property. Other benefits to this approach are that 6" —>d (61, 62, . . .) where 6;, 2d 6),. Also 61,6), :4 6),, so that in fact 11 00 -1 -1/a an E Xkbj: “(if ckl‘k . k=l k=l An analogous statement to the one made here about the conditional limit laws (which are the important ones to consider) can be made for this type of resampling using a similar, but easier, analysis. This method, which we call seamless resampling, is explored more thor- oughly in a forthcoming joint paper with R. LePage which calls on the in- variance principle proved here to prove its value. 7.2 Only off by a scale. Consider the symmetric case. Recall the result given by Knight [Kni89]. 
The representation he gives for the random limit law is the random conditional distribution of Z semi/KM; - 1) k=1 given the sequence elFl'l/a, 62F;1/°', . . .. It is shown by LePage in [LWZ81] that for our choice of an and b,,, 00 s, .4, Z ark-”a. k=l 54 By LePage’s Theorem 3.3(A), the unconditional distribution of the sum given by Knight is only a scale away from the correct limit law p : Z ekl‘;l/°'(M; — 1) =4 (E(M; — 1mm 2 2,1“;1/0. k=1 k=l Appendix Appendix A Appendix. Lemma A.l Suppose A, B Q R are disz'oint Bord-measurable sets. If {Yk} is a sequence of i.i.d. random variables independent ofe and F, and EIYI I“ < co, then with probability one, 5; arr/Wknuswk) = Z e.I‘;"°'mA(Y.) + Z c.P;"°Y.IB(Y.). k=l k=l k=l Proof. By Theorem 3.3(A), we see all three sums converge with probability one. Let a, b, and c, denote their respective almost sure limits. For each n, la -— (b + c)| 3 la — Z ekF;l/°Yk1AU3(Yk)I k=l + |b — Z ckr;‘/"Y,.IA(Y,.)| + Ic — E CkF;l/aYkIB(Yk)la k=1 k=l Since the right hand side converges to zero almost surely as n —+ 00, a = b+c almost surely. 0 Lemma A.2 Let T be an arbitrary random variable taking on values in (0, 1) and e be a random variable independent ofT with P(e 2: 1) = % = P(e = —1). 55 56 For 3 in (0,1) let .7, = o{eI(T S u),u S 3}. Then for B Q (s,1], with probability one, P(T E B) P[T e B||f,] = P(T > S) [(T > s), and E71210 e B) = 0. Proof. Intuitively, if f, represents the information we are provided with at time s, then at time s we know the value of T and e if T S 3. Let A, = {0,[T > s],[e = 77]fl[T S u] : n = :l:l,0 S u S s}. A, is a r-system generating .7, and fl is a finite union of sets in A,. Both terms on the right are f,-measurable. The proof is completed by checking that the integral condition for conditional expectation is satisfied on A, for both relations. (See [Bil86, Thm 34.1].) D Lemma A.3 With probability one, I‘;2/°’/s?, —+ 0. Proof. Let 5 > 0. We need to show that with probability 1 eventually ~2/a R < 5. (A1) m -2 a j=0 Fuel-{7' But (A.1) occurs if and only if (294/... < s[(§1)—2/a + i(r"+j)-2/a]; (A.2) n n i=1 n and (A.2) is true if and only if (><>>:<——> 6 n j=1 Choose m > (1 + €)/e. By the strong law of large numbers, we know that I‘M,- / n —> 1 almost surely for each j. Hence, for each to in a set of probability one, 3no(w) such that for n 2 no(w) 57 Thus for n 2 720(0)), <.°°;(— M) > :(W) 1 i=1 i=1 \/ AA P—J «1+ 0) \__/ A y—a l m v Lemma A.4 For each t, 02(V,,(t)) —+ 0. Proof. oo '2/0 2 5204(1)) .-_- E(X"2 F (ln(1 — :r,c A t) — t)) k=n '1 oo ;2/0 2 = EEI‘(Z ——(1n(1,—- :11. At) — 1)) 15:75 3n 2/a = EZ(P;% )202(-ln(1—Tk/\t))). k=n But 02(—1n(1 — T), /\ t)) is no greater than E(— ln(1 — T1))2 = 2. Thus, 2(133’“) ) 02(Vn(t)) 5 2E( k=n n _2/0, —2/01 3 E (1“,, Zr ) 2 2 3n k=n 3n = ware/33,). This last term converges to zero by the LDCT, since 1 2 P;2/°‘/s,2, -—+ 0, by Lemma A.3. D 58 Lemma A.5 If h is a Borel measurable function, the h(Tk)I(T;c S s) is measurable with respect to .7... Proof. By considering separately the sets {TkI(Tk S 8) Z 7} for 7 Z 1, ’7 6 (0,1), and 7 S 0, we see that T;J(T;c S s) is f,-measurable. Define g on R by g(a:) = h(a:)I(o,,](;r). We are done because 9 is Borel- measurable and g(TkI(Tk S 8)) = h(Tk)I(Tk S S). C] Lemma A.6 ForO3]. 1 — s Proof. By Proposition A.5, ln(1 — Tk A t)[T,c _<_ s] is f,-measurable. Thus, E" ln(1 — Tk At) = ln(1 — Tk At)[T,c _<_ s] + E" ln(1 — Th /\t)[T,c > 3]. (AA) Let A, be defined as in the proof of Lemma A.2. 
Since 0 is the only set in A, properly contained in [T1‘ > s], Ef‘ ln(1 - T]. A t)[T;c > s] is almost surely constant on {T1‘ > 3}. Call this constant c. By the definition of conditional expectation, c(1—s)=cP(Tk>s) = / ln(1—TkAt)dP {T198} 1 = /ln(1-u/\t)du = Atln(1—u)du+£l(l—t)du = (1—3)1n(1—3)+ S—t. 59 Thus, on {T1c > s}, —t Ef‘ln(1—Tk/\t) = ln(1—s)+ :_3 3 _- = ln(1—TkAs)+1_3. Combining this with (AA) the lemma is proved. [3 By Remarks 3 and 4 in [LWZ81] Z3" —+d 2:111:2/0 and our as- sumptions on F and an, but the following statement needs proof. i=1 YnJ' Proposition A.l 2Y2.- pin“. i=1 Proof. For each N, 72, let 1: N-l n 73 - —ZY n), =ZY .j, and R~.n = 2 Y” :1 =1 j=N Let (nk) be a subsequence. It suffices to show that there is a further subsequence (7119-) such that rm) J, 22°: I‘;2/°'. Since for each j, YnJ -+ PJ'I/a a.s., for each N, N SN,” —+ Z 171/0 a.s.. (A.5) i=1 For each j, choose N such that 13(2):”, 172/6” > 2") < 2”. Then for each j, choose n(NJ- ) > n(NJ - ), such that for n 2 n(NJ-), P(|SN n -—ZI"’2/°| > 2") )< 2". For each j, let nkj. = min{n;c : nk 2 n( NJ)}. Write 5N1“) as SJ, and RijJ- as R'. Since 73* —. —J-S + RJ, we only need to show that 51"“) 2f: 1 I“; We almost surely and RJ- Hp 0. The proofs of these facts follow. 60 Lemma A.7 Asj ——> 00, SJ- —* 2 1:2,“ as k=l Proof. Let A,- = {lsj — 22;, r;’/°'| > 2(2-J')}. Notice that N] . Nj m — a . A,- c {15,- - Err/“I > 2"}Uuzrk’” — Zn.“ l> 2’1}. k=l k=l k=l Therefore P(AJ) 5 2(2"). The proof is completed with an application of the first BoreLCantelli lemma. U Lemma A.8 RJ- —>,, 0. Proof. It suffices to show that for each subsequence (ju), there is a further subsequence (jyi), such that RJ-yJ --+,, 0. Let j,, be a subsequence. Since Tn —>d 2:11 Fri/a, (Tnkj) is tight. But then (SJ-,RJ) is tight in R2 because Sj,Rj Z 0 and TM), = Sj + RJ. Since (SJ-y, RJ-u) is tight in R2 there is a weakly convergent subsequence (5'in , RJ-v', ). Say it converges to (S, R). We must show that R = 0 a.s.. For a reduction of cumbersome notation, denote 23:1 I‘J-_2/°t by Z. Since (8,",'. + RJ'W) is a subsequence of (1'3”), SJ-VJ + RJW —’d Z. But 5,}... + RJ-W —»d S' + R by the continuous mapping theorem. Therefore Z :4 S + R. By Lemma A.7 S :4 Z. Since RJ'W _>_ 0, we must have R Z 0. Let (I) be the standard normal distribution function. 0 = E(S'+ R) — E(S) = E((S + R) — <1>(S))I(R > 0). Since (S' + R) — @(S) > 0 on {R > 0}, P(R > 0) = 0. Thus R = 0 almost surely. Cl 61 Notice that Proposition A.l and the fact that YnJ- —+ PJ—l/a almost surely together imply that for each N, RN. _., 2: Ff“ (A.6) k=N Corollary A.l 8N,n(N) —+,, 0 for any nondecreasz'ng sequence (n(N)) such that n(N) > N for each N. Proof. Let 6 > 0. There exists No such that P(ZfiNo Fri/a > e) < 5. By (A.6), we can choose N1 > No such that P(IRNO,n — ZEZNO F;2/al > 5) < e, for n 2 N1. But then for N such that N Z No and n(N) > N1, n(N) P(SNJIUV) > 26) ‘2 P(Z Yn2j > 26) j=N n(N) P(Z ij > 25) j=No "(M —2 2 P(|ZY§-— Z r; /°'|>5) +P( (Z r;/°> j=N0 k=No 1:: No < 25 C] |/\ |/\ Bibliography Bibliography [Aal78] [AG89] [Ath84] [Ath85] [Ath87] [BF81] [Bi168] [Bi186] [Bre68] Odd Aalen. Nonparametric inference for a family of counting pro- cesses. Annals of Statistics, 6:701-726, 1978. Miquel A. Arcones and Evarist Giné. The bootstrap of the mean with arbitrary sample size. Annals of Inst. of H. Poincare, 1989. K. B. Athreya. Bootstrap of the Mean in the Infinite Variance Case. Technical Report, Iowa State University, 1984. K. B. Athreya. Bootstrap of the Mean in the Infinite Variance Case- II. 
Technical Report, Iowa State University, 1985.

[Ath87] K. B. Athreya. Bootstrap of the mean in the infinite variance case. Annals of Statistics, 15:724-731, 1987.

[BF81] Peter J. Bickel and David A. Freedman. Some asymptotic theory for the bootstrap. Annals of Statistics, 9:1196-1217, 1981.

[Bil68] Patrick Billingsley. Convergence of Probability Measures. John Wiley, New York, 1968.

[Bil86] Patrick Billingsley. Probability and Measure. John Wiley, New York, 1986.

[Bre68] Leo Breiman. Probability. Addison-Wesley, New York, 1968.

[DV88] D. J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes. Springer-Verlag, 1988.

[Efr79] Bradley Efron. Bootstrap methods: another look at the jackknife. Annals of Statistics, 7:1-26, 1979.

[Fel71] William Feller. An Introduction to Probability Theory and Its Applications. John Wiley, New York, 1971.

[GZ89] Evarist Giné and Joel Zinn. Necessary conditions for the bootstrap of the mean. Annals of Statistics, 17:684-691, 1989.

[Hal88] Peter Hall. Rate of convergence in bootstrap approximations. Annals of Probability, 16:1665-1684, 1988.

[Kni89] Keith Knight. On the bootstrap of the sample mean in the infinite variance case. Annals of Statistics, 17:1168-1175, 1989.

[Leh83] E. L. Lehmann. Theory of Point Estimation. John Wiley, New York, 1983.

[LeP80] Raoul LePage. Multidimensional Infinitely Divisible Variables and Processes. Part I: Stable Case. Technical Report, Stanford University, 1980.

[LWZ81] Raoul LePage, Michael Woodroofe, and Joel Zinn. Convergence to a stable distribution via order statistics. Annals of Probability, 9:624-632, 1981.

[Met82] Michel Metivier. Semimartingales: A Course on Stochastic Processes. Walter de Gruyter, 1982.

[Pol84] David Pollard. Convergence of Stochastic Processes. Springer-Verlag, 1984.

[Sin81] Kesar Singh. On the asymptotic accuracy of Efron's bootstrap. Annals of Statistics, 9:1187-1195, 1981.

[ST89] Gennady Samorodnitsky and Murad Taqqu. Stable Non-Gaussian Random Processes. January 1989. Notes for future text.