,....... 4. 4...
44v ...44;.4.4..4

(4....
...-.4 4.. ..

 

4....
.4344 ......
.....4..........r.v.4

r .1244
4. 44.4.4 44 4

. .47
44.4.4.4...uy......44

.....44

4... ... .4; .4. 44.1 ....
. . . , r47r4 ”4.444453.

. .4 4 4 I. .4.

44 .. . .444/4 . . 4 y ....4 r! U“ 4r.444.4.44

41%.}:
.....4}.

.44

. , . ...? )4
42.4."...rr...4 , 14.4.41...

4.: .41 4...... , 74.4.3.9...4.
74.4.4541“ 444......5H14.

.r.;v..34.1..::»
.,.4r 1:..4p...

...
.4Il..4,.:4
4.4..» 344.

4..4.\344.,44.?.4
.4....44_..4.......4..
44.4 ...4... .44
4 4...... .4tf44.4...

.4...) if?!
v5.4?l.
, ... ...»r .
44 ... 4.2.1.,

, a.
....444..4..44>..44.
.45

 

 

 

 

      
   
   

5*:- .-- 4N5

LIBRARY

Michigan State
University

 
 

: . w

This is to certify that the

thesis entitled

THE PROBABILITIES OF MODERATE DEVIATIONS OF
U-STATISTICS AND EXCESSIVE DEVIATIONS
0F KOLMOGOROV-SMIRNOV AND KUIPER STATISTICS

presented by

Gerald Marlowe Funk

has been accepted towards fulﬁllment
of the requirements for

Ph.D. Statistics and

degree in
Probability

 

 

34w,” ﬂ/K/VV;

Major professor

 

Date May 21, 1971

 

0-7639

 

 

ABSTRACT
THE PROBABILITIES OF MODERATE DEVIATIONS OF
U-STATISTICS AND EXCESSIVE DEVIATIONS
OF KOLMOGOROV-SMIRNOV AND
KUIPER STATISTICS
BY

Gerald Marlowe Funk

Let X1, X2, . . . be independently and identically distributed
random variables. The U-statistic, Um) generated by a function
symmetric in its k arguments based on X1, . . . , Xn was defined by
Hoeffding (Ann. Math. Statis (1948)). Under certain moment condi-
tions Rubin and Sethuraman (Sankhy'a’ Ser. A (1965) ) obtained order

results for probabilities of moderate deviations of these statistics.

These results are extended to obtain expressions of the form

2

/2 -1/2n-1/Zc

P(U(n) - BUM) > ckcr(logn/n)1 )~ (chzlog n)

if c > 0 and o- > O is the standard deviation of the limiting distri-
bution of V}? (UM) - EU(n)). Similar results are obtained for
Lehmann-generalized U-statistics, and for some functions of
U-statistics.

A related problem, that of obtaining asymptotic expansions of
probabilities of excessive deviations of the Kolmogorov-Smirnov and
Kuiper statistics is considered. Let F denote the c.d.f. of Xi and
let Fn be the sample c. d. f. based on X1, . . . , Xn . Let
D: = Slip (F(x) - Fn(x)) . Suppose F is continuous and let n A: = 0(1)
and n A: ... 00.

Then

 

Ge rald Marlowe Funk

2

+ '2an

P[D > x ]~e
n 11

This result was obtained by Rubin and Sethuraman (Sankhya’ Ser. A
(1965)).

In this paper a different proof of this result is presented and
the problem is solved for larger deviations, i. e. , when the condition
n A: = 0(1) is relaxed. Similar results are obtained for the two-sided

Kolmogorov-Smirnov statistic and the Kuiper statistic.

 

 

THE PROBABILITIES OF MODERATE DEVIATIONS OF
U-STATISTICS AND EXCESSIVE DEVIATIONS
OF KOLMOGOROV-SMIRNOV AND
KUIPER STATISTICS

By

Gerald Marlowe Funk

A THESIS

Submitted to

Michigan State Unive r s ity

in partial fulfillment of the requirements

for the degree of

DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

1971

(5‘ 7Ja277

ACKNOWLEDGMENTS

My most sincere gratitude is extended to Professor Herman
Rubin whose enthusiasm and guidance made this undertaking possible.
I benefited immeasurably from our many discussions of large sample

theory and asymptotic expansions.

ii

TABLE OF CONTENTS

PROBABILITIES OF MODERATE DEVIATIONS OF
U-STATISTICS.

1. 0 Introduction and Summary
1.1 U-Statistics
l. 2 Extensions

PROBABILITIES OF EXCESSIVE DEVIATIONS OF THE
KOLMOGOROV-SMIRNOV AND KUIPER STATISTICS .

Introduction and Summary
The Sample Distribution Function
Some Asymptotic Expansions

NNNN
WNt—IO

Kolmogorov- Smirnov Statistics

N
uh

Kuiper Statistic .
2. 5 Probabilities of Large Deviations of

Kolmogorov- Smirnov Statistics .
2. 6 Probabilities of Large Deviations of the

Kuiper Statistic . . . . . .

B IBLIOGRAPHY

iii

Probabilities of Excessive Deviations of the

Probabilities of Excessive Deviations of the

Page

19
19
22
24
34
51
62
76

83

 

PROBABILITIES OF MODERATE DEVIATIONS

OF U-STATISTICS

1. 0 Introduction and Summary

The primary result of this chapter is the development of an

asymptotic expansion for probabilities of moderate deviations of U-

statistics.

Wassily Hoeffding [11] defined a U-statistic based on n indepen-
dent random variables, X1’ 2, . . . , Xn’ and a real valued function of

m(f_ n) arguments, Y, as

-l
U' (X ,...,X ) = [m1(n)] zwx. X )
\y 1 n m 1 1
1 m
where the summation is extended over all permutations (11, . . . , im) of
m different integers, iJ. , such that l S ij _<_n .

Without loss of generality, any U-statistic may be written as

-1
U¢(X1,...,X)=(n) Z ¢(X.1,...,X. ) (1.0.1)

1<i <. . .<i <n 1 1m
_1 m—

where (p is a real valued function symmetric in its m arguments and

the summation is extended over all distinct sets of m integers

{11,...,1}suchthat1:1jf_n and 1j<1j+1 for IEJEm-l.
I — ‘
90(0' 0: oz )2 —1 21140: oz )
l’2’°"’m n.' i""’i
l m

where the summation is extended over all permutations of {1, 2,

Among other results, Hoeffding found that

lim P[U (X1,...,Xn) > max/J3] = 1-¢(x)
n—>CD (’0
. ( )2
prov1ded E¢(X1,...,Xm) = O , E cp(X1,...,Xm) < co and
2
0‘2 = E E[¢(x1,...,xm)lxl]) > 0. Here
X 1X2
“ 1
¢(x) : _1_._ f e 2 dxl.
VZn -m

Clearly there was no loss of generality in assuming the first moment of

q) to be zero.

If all moments of (p up to order p exist for some p > x‘2 + 2

then for fixed x > O

1/2

logP[U(X,...,X)>mox(1—O£P—) ] Z

1. 740 1 n n
1m
n—>oo

-11...
_-Z

logn

The latter result was obtained by Herman Rubin and Jayaram

Sethuraman [17]. They define deviations of the U-statistic from its

1/2
mean of the form c<£9—§1—ll)

to be moderate deviations.

Our main result is that, with no additional assumptions,

 

1 1/2
P[U (X1,...,X )>mo*x(—952) ]
lim : ‘P n n , = 1 (1.0.2)
“Tm 1 - ¢(X\/I0gn)

Since the sample mean is a U-statistic, this may be viewed as a

generalization of the following theorem due to Herman Rubin and Jayaram

Sethuraman [l7] .

 

 

 

 

Theorem 1. 0 Let X1, X2, . . . be independently and identically distrib-
uted real valued random variables such that EX1 : O, EXZ : 0'2 and
E|X|p<ao forsome p>x2+2 and x>0. Then
1 n 10 n 1/2
P[— z X. > ax(—&—) ]
n ._1 1 n
lim 1‘ = 1 (1.0.3)

n->oo 1 - ¢(xm)

and

 

1 10 n 1/2
P[|H Z Xil>ox(—-§—-) I
1. i=1 __
1m — 1 .
n—>oo 2(1-¢(x'\/logn)

The theorem remains true if the constant x is replaced by

 

x+e wherec=o( 1).
n n logn
In section 1.1 the proof of (1. O. 2) is presented. In section 1.2
some extensions of this result are discussed. However, the problem of
extending these results to deviations larger than moderate deviations
remains open. In Chapter 2 moderate and excessive deviation results

are presented for the Kolmogorov-Smirnov statistics. This problem

is formally introduced in section 2. 0.

1. 1 U-Statistics

Let X1, X2, . . . be a sequence of independently and identically
distributed random variables defined on some probability space
(0,58, P). Let (p be a Borel measurable, real valued function symmetric
in its m arguments. The U-statistic generated by (p and the first n

random variables is defined as in (1. 0. 1):

 

—1
n
.,xn)=(m) 1. Z 1,)(xi,...,xi )(1.1.1)

Given this sequence of U-statistics, {Uém} , we wish to study
the rate at which the corresponding sequence of probabilities of moder-
ate deviations from the mean, {P[U((pn) - EUfpn)> c 12.5.2] } , tends to
zero as n becomes large. Our approach is to exhibit a sequence of
real numbers which is asymptotically equivalent to this sequence of
probabilities. (Two sequences of real numbers, {an} and {bu}, are
defined to be asymptotically equivalent if lim an/bn = l in which

n—>oo

case we write a m b ).
n n

Our main result is

Theorem 1. l

 

Let {USU} be the sequence of U-statistics defined in (1. 1. l)

and suppose that:

E|<p(X1,...,Xm)|p<oo for some p>x2+2 and x>O. (1.1.2)
E¢(X1,...,Xm) = 0. (1.1.3)
0'2 = E(E[<p(X x [X ])2>o (114)
1,..., m 1 .
Then
x2
(n) 10 n ”2 2 -1/2 “2‘
P U¢ > ma'x —§— ~(21rx logn) n (1.1.5)
and 2

1/2 _
P[lU;n)l > ma’x(1—O§—Q) ]~ 2(21rleogn)-l/2 n 2 . (l. 1.6)

Proof.

It will be shown that there is a real valued function (alt) such

that the U-statistics generated by (pl and X1, X2, . .. have the

following propertie s:

 

 

 

 

1/2 c ‘ Z
P[U(n) > x0‘ (1353) :1- “ 1 2]~(2nx210gn)‘1/2n‘x /2
s01 n (nlo /
g“)
(1.1. 7)
1/2 c 2.
P[|U(n)| > x0‘(—12&-11\) :I: n ]~ 2(21'rleogn)-1/Z n—x /2
(01 n (nlo 1/2
gn)
(1.1.8)
2
xmcre: -x /2
P[IU(n> - mU(n)| > 31/2]: 0(2-————) (1.1.9)
(0 ((71 (n log n) x/log n
where {an} is a positive sequence such that En = 0(1) and logn = o(sn).

Now for any two real valued random variables Y1 and Y2 and

constants a and 5 > 0

P[Y +Y >a] > P[Y >a+6] - P[Y < -5]

1 2 - 1 2
and
P[Y1+Y2 > a] : P[Y1> a-b] + P[Y2 > 6].
(n) 10 n ”2
Inparticular if Y +Y :U , Y =U -mU , a=mx0'———g——
1 2 (p 2 .p «)1 n
1/2

and 6 = (mx (ran) (n log n)_ then these inequalities together with

(1.1. 7) and (1.1. 9) will verify (1. l. 5). Similarly, the inequalities

P[|Y1+YZ| >a] _>_ P[|Y1l>a+6]- P[|YZ| > 6]

and

 

 

P[|Y1+Y2| >a] 5 P[|Y1|>a—6]+P[)Y2|> 6]

together with (1.1. 8) and (1.1. 9) will verify (1. 1. 6).

Define
._ l I I l ._
<pl(){1) ‘ E[¢(X1:XZ)--'9Xm)lxl_x1]
=/¢(xl,x2,...,xm)F(dx2)...F(dxm) a.s.
where X' , . . . , X' are inde endentl and identicall distributed with
1 m P Y Y

the common distribution F being that of the Xi's. Then (1. 1. 3) and

(1. 1. 4) insure the real valued random variables ¢1(X1).¢1(X2),
are independently and identically distributed with expectation

2 Z

E¢1(X1)=O and variance E(<p1(Xl)) :0- >0. Finally, (1.1.2)

and Jensen's inequality yield
Elqollp = E<IE[¢IX1HP> :EE[l¢lplX1] = Elwlp < co for some p > c2+2.

Thus the conditions of Theorem 1. l are satisfied for the sequence

cp(X1),(p(XZ), Since
(n) 1 n
U = — E (p (X.)
“’1 n i=1 1 1
and since
. i l ~x2/2
11m (— e )/(1-¢(x)) :1
X->oo x 211'

the asymptotic equivalences (1. 1. 7) and (l. l. 8) have been verified.
To show that (l. l. 9) is also true, Ufpn) is decomposed into k

U-statistics as follows:

 

 

Define (pm(x1,...,x )=go(xl,...,xm) andfor lir<m set

- l l 1__ . .
(pr(x1,...,xr) - E[¢(Xl,...,Xm)|Xi—xi for l_<_1:r](l.1.10a)

11 (x1, .xr)
ZCP (X r ’x )+(-1) 2 ~ (P (X ’X 9 ,X
r 1 r l<i <...<i <r r-1 1 1 lr-l)
—— 1 r-1—
2 r-l
+ (-1) E) 90. 2(X1' ,x.1 1+...+(-1) Z 90 (xi)
l<i <. .<i <r 1 r-Z 1< i<r
—1 r-Z— ——
(1.1.10b)

r=1 r

< 1 m m (n) (n) (n) m m (n)
Then U n : Z ( )U . In particular U - mU '2 23 U .
(P r ‘1’ 90 (P1 1.22

Note this is the same decomposition as that used in [17]. For

m = 1 Um) : U01) and (1.1. 9) is trivial. Suppose m > 1. Evidently,
’ ‘P (P1

 

p ,Um) (n) ”ma
-mU l > le
<p (pl (nlogn)

m [(m) l (n) m X(Ten
P U l > ]
r=2 r Iffr m-l (r110gr1)1/Z

|/\

 

| /\

V m V V
(ﬁ) (n1ogn)"/2 2 (m) E|U(n)l (1.1.11)
mxeen r y

r22 r
for O<v:p.

The latter inequality was obtained using the Markov inequality

[15, p. 158].

It is interesting to note that Yr often has zero (marginal) expec-

tion. In fact, if 1 Si _<_ r and the x. are distinct for l :j i r then

 

 

 

lvr(xl.....xr)F(dx.) = 0. (1.1.12)

To verify this fact it suffices to prove

H
O

[Yrbcl’ . .. ,xr)F(dx1)

Referring to the definition of (Ir (1. 1. 10b) it is evident that \yr may

be written as

1 <ir-j: r-J 11 r-j
_( Z 90 . (X , ,X. a
. . r—J—l 1 1 _._
ZE1I<...<1r_j_1:r 1 r] 1
T0(x x ) (x )
1 1' ’ r “”1 1
Clearly
[‘pl(x1)F(dx1): 0'
and
O .
[Tr-j(xl""’xr)F(dxl):O for O_<_J<r-1 (1.1.13)

since

 

 

 

jqp ( X (pr_j(x11, ,x r-J) 1f no x.J 15 x1
r-J 11 lr-j) (dxl) :
(p _._ (X.',...,X., . .
r J 1 11 lr-j-l) 1f one Xi. 15 x1
J
where {x.,,...,x., )2 {x. ,...,X. } - {x1}. This verifies
11 lr—j-l 11 lr-j
(l. l. 13) and hence (1. l. 12).
Returning to line (1. 1.11), if v is an even integer then
E|U(n)|v: (“1)“)
Yr r
n -1
:(r) EZ---Z\yr(Xi ,...,Xi )--oxyr(Xi ,...,Xi )
11 1r v1 vr
(1.1.14)

Of the (2) v terms in this summation of products all those terms in
which a particular X.1 occurs exactly once have zero expectation. Thus
a term with non-zero expectation can have no more than 352". distinct
X's .

To see that there is a common upper bound to these expectations,
observe that repeated applications of Hb'lder's inequality [15, p. 156]

yields

Eln vr(xi ,...,xi )| :Elvr(x'1,...,x;)lv
jzl j] jr

provided the moments on the right hand side exist. Referring back to
the definition of Yr in line (1. l. lO)it is evident that these moments
exist provided ijv < 00 for 1 :j _<_ r . Fortunately this is easily

shown since

_ 1 1 1 1V
Erp - E(E[¢(X ,...,Xm)lX1,...,Xj])
V 1 1 1 1
:E(E[(p(X,...,Xm)lX1,...,Xj])
= Eq,”

v/p
[EIcp [p] < co provided v is an even integer and v :p .

|/\

Here the conditional variant of Jensen's inequality [15, p. 159, 348]

was used and assumption (1.1. 2) assures the finiteness of E)<p Ip. We
v/
conclude that each term in (1. l. 14) is bounded by va(El(p (p) p.

Of course only those terms with £1 or fewer distinct X's

2

have nonzero expectation. The re are at most

rv
'2—

c(n) : Z c(n)
j=r JJ

such terms where cj is the number of ways j distinct X's may be

arranged in the rv positions in a term. Apparently c(n) = O(n ) so

that in (1.1.14)

TV TV

EIUY l" = (n)-vo<n7) = O(n-T) for 2__<__r<m.

Finally, returning to (1. 1. 11) it is seen that

N|<
v:

 

(1’1), XI’I’lO‘E

n —v ‘3
(p1 > [2] : O(gn (logn) n ).

P[|U(n) _ mU 1
(p (n log n)

Let v be an even integer such that c'2 < v E p then (1. l. 9) is verified

and the proof is finished.

11

1.2 Extensions.

Let F denote a distribution function and 1y a real valued function
m
_II

integrable with respect to the product measure
1:1

F.1 . Define a para-

meter 9(F) as
G(F)= /°'-/\y(xl,x2,...,xm)F(dxl)F(dx2) ---F(dxm), (1.2.1)

Hoeffding [11] calls O(F) a regular functional of F . Given n (_>_m)
independent random variables, X1, X2, . . "Xn , a reasonable method of

estimating 6(F) is to replace F by the sample distribution function

Fn. This leads to a statistic of the form:

1m n n
6(Fn):(ﬁ) Z Z Y(Xi,Xi,...,Xi ). (1.2.2)
11:1 1m=1 1 2 m

This statistic is closely related to the U-statistic discussed in
the previous section. (Note that 1)! does not depend on n.)

Define

cp(x1,...,x )=Z‘i’(x.,...,x.) (1.2.3)

where the summation is extended over all permutations of {1, 2, . . . ,m} .

For lij < m define

‘l’m_j(xl,...,xm_j) = Z ‘l’(xi1,...,xim) (1.2.4)

where the summation is over the distinguishable permutations in which

each Xi’ l :1 _<_m-j , occurs at least once. Then

m _ n (n) n (n) n (n)
n 9(Fn)—()U +(m )U +”'+()U‘Y (1.2.5)

m (P

 

12

Hoeffding showed that if E[9(F )]2

< 00 then the asymptotic distribu-
tion of (I; (earn) - 9(F)) is that of J; (Uén) - 6(F)) .

The following theorem shows, under additional moment condi-
tions, that a similar result is true for moderate deviations. (The

notation developed in (1.2. 1) through (1. 2. 4) is used in the statement

of the theorem)

Theorem 1. 2.1

 

Let {Xi} be a sequence of independently and identically dis-
tributed random variables with common distribution F and let
{8(Fn)} be the corresponding sequence of regular functionals of the

sample distribution function (1. 2. 2). If

[...] [<p(x1,...,xm)]pF(dx1)---F(dxm)<oo for some p>x2+2

(1.2.6)

andfor lijEm-l,

j” [[‘Y(X1,...XJ.1)]qF(dx ) °--F(dxj)<oo for some q> 1, q>x2

(1.2.7)

and 0'2 > 0 where

0'2 =/[/” -(/——1-—¢p(xl,...,xm) -9(F)) F(dx1) ---F(dxm_1)]2F(de)

(1.2.8)

then for x > 0

1/2 2
)- 9(F) > mxo-(lggll) I,” (anzlogn)-l/Z n-x /2

pl“F (1.2.9)

n

and

1/2 2
p[|e(Fn) _ 9(F)| >mx0'(.1£§l_r_l.) ]~ 2(21rleogn) ”2 n_.x({.22.10)

13

Proof.

Using the expansion (1. 2. 5) 9(Fn) may be written as

m-l . (n) m-l 1 m-j-l . (
9(F)=(II (1-1—))U ,+ z: —.—( r1 (1-1—))U
n n (p/m. :1 n‘] n

i=1 3' 1:1 m-j/(m-j)!

Since any weighted sum of finitely many U-statistics is itself a U-statis-
tic (provided the weights are constants independent of n), 9(Fn) may be

written as

m-l 1 (n)
9(F ) 2 23 —.-U (1.2.11)
n j=0 nJ

Clearly (p0 = go/m.l and the 475 are weighted sums of 90 and the 1y]. .

 

 

 

 

Evidently,
1 1/2
Pszan) -6(F)>mX0‘(—9-§—n—) :]
< pEJln) 9 (log n)1/2 En :|
_ I - (F) >mx0‘ ‘ 1/2
(p0 n (n logn)
m-l e nJ.1/Z
+ 2: FEUWB n 1/2 (1.2.12)
i=1 "’3' (m-1><Iogn)
and
(n) logn 1/2 E:n
P:PU,-9(F)>mx0‘ + er
(po 11 (nlogn)
m_1 _8 nj.1/2
_ 2: p[U(n,)< “ 172:] (1.2.13)

i=1 "’1 (m-1)<Iogn)

where 8n is the positive sequence defined after line (1. 1. 9). In View
of Theorem 1.1, (l. 2. 6) and (l. 2. 8) the first term on the right sides

of 1. 2. 12 and 1. 2. 13 are asymptotically equivalent to the right hand

14

side of(l. 2. 9). To see that the remaining terms are asymptotically

negligible Markov's inequality is used to obtain

)1)
pl |U(n,)l > 5 I < zs“’13:lU(’,1 [ (1.2.14)
(pJ. n — n cpj

and for v > 1 Minkowski's inequality is repeatedly used to Show

1/v 1/v

/\

F:

3
/\
17¢

(EIU .l") _ max {(Elcplv11/v.<E11.l">l/V}
(Pj ‘Pj _ 1<i<m-l 1 ‘
”‘“ “ (1.2.15)

Select v so that the moments on the right hand side of l. 2. 15 exist and

v>1 and v>x2. Set

n+J-1/Z

 

6:“

(n log n

)1/2

Then the right hand side of (l. 2. 14) is asymptotically negligible com-

2
..1/2 n-x /2 for 1:] :m-l , and (1.2.9) is verified.

pared to (log n)
A similar argument verifies (1. 2. 10).

The latter part of this proof is a special case of

Lemma 1. 2.1

 

Let {Xn) and {Yn} be sequences of real random variables

such that

1/2
PEXn > x(1_0§_) ]~1-¢(X~/logn) for x e (O,x0).

n
n
Then

Yn 10 n 1/2
P]:Xn+ 7>x(—g——) :'~1-¢(xlogn)

11
I).

if c > 1/2 and, for some fixed M, ElYnlviM< co for some

2

v>-2-—}E—_-l andalln.

 

 

15

Proof.

Use inequalities analogous to(1. 2. 12)and (1. 2.13)and apply
Markov's inequality.

We give one further example of how Theorem 1. 1 may be
extended. Let {XS-h , 1 :j i r , denote r independent sequences of

independently and identically distributed random variables and let Fj

denote the common distribution of the random variables in the jth

sequence. Let T0 be a Borel measurable, real valued function of

r
m: _2 mj arguments. Let T0(---,x.,~-,xk,--o):T0(--o,xk,--o,x.,-~

3:1 1 1

if mj<i<k:m.

J+1 for some j,0:j:r-l. Here m :0. Thus

0
arguments, next m2 arguments, etc.
Define a Lehmann-generalized U-statistic generated by T0 and

T0 is symmetric in its first m1

(3') (j) (j) - .
Xl’X2""’ch’1:J-<—r as.
r c.n -l
Um 11(3) “0061“. ,ng> , ,x;r>, ,x;r> )
j=1 j k k k k
1 ml 1 mr

where the summation is extended over all subscripts for which

_ ...j :c.n;l_<_j:r.
k1 km. J

J

Here the cj's (_>__ 1) are integral valued and represent the proportion

of X(J)'s in the ”sample. "

U'(n)

The sequence of statistics may be approximated by the

sequence of U-statistic s

-1 .
U(n)— (‘1) 8(2. , ,z. ) (1.2.16)
‘P m 1<il<.,.<1 in 1l 1m
where
_ (1) (1) (r) (r)
Zi+1_(xc i+l'°"'Xc (1+1)....,XC i+1,...,XC (1+1))(l.2.17)
1 1 r r
for i=0,1,2,..., ,and
r c. -l O
_ l
‘P(Z1’°°°’Zm) - (m. jEI—J—mjj) 2 2T (...) (1.2.18)

where the summation is over the ( m m m ) permutations of the
1, O O , r

subscripts of Z such that i . .< i for 0 :j E r-l and

<.
mil m(j+1)

J
over all permuations of the X (J) within each Z .

Theorem 1. 2. 2 Let {U'(n)} be the sequence of statistics defined in

 

(l. 2. 15). If the sequence of U-Statistics, {Ufpm} , satisfies the condi-

tions of Theorem 1. 1 and if

m
v r J -
/.../[T0(X(11),...,X(1),...,xgr),...,x(r))] II II F.(dle))<oo
m1 mr j=li=l 1

2
for some v>x , x>0, then

1/2 1/2
P[ '(n)>m0'x(-1—O—§—rl) :|~P[:U;n)>mcrx(1—O£-rl) :].

1'1

Proof.

i=1 3' J
and may be written as

r c.n (n) r c n
Note that II (rri )U' consists of Mn 2 II (r191 ) summands
=1

3'

summands of U'(n) not included in (Nn -Mn)U;n) . Thus U(n) may

be written as

.(n)_ (n) n (n)
U —U¢ +(Tn-m—U )

v
Condition (1.1.2) assures us that E|U(n)] < M < co for some

2 v Nn " 0 V
v>x +2 and condition(1.2.19)guarantees EITnl :l—M—-] EIT I
N n
v
and EITO] < co . Since 1712- = O(%—) , Lemma 1.2.1 applies and the
n

theorem is proved.

The theorem remains true if the independence conditions are
replaced with the assumption that the Z's are independent (and condi-

r
tion(l. 2. l9)replaced by the obvious moment conditions) and if E c. is
.:1
r J
replaced by Z) Cj + o(l) .
i=1
The final theorem in this sect-ion deals with functions of

U-statistics. Let {Xn} denote a sequence of independently and iden-

tically distributed random variables and let U(1n), . . . , Ulin) be k

(1) (Z) (k)

U-statistics generated by X , . . . , Xn and the ke rnals (p ,(p , . . . , (p

1
respectively. For each U-statistic, assume that conditions (1. 1. 3) and
(l. l. 4) of Theorem 1. 1 hold and that all moments of (pg) exist and are
finite. Define aij = E¢§1)¢(13) ; 1:1 :k, 1_<_j_<_1< where (p?) is

defined as in (1. 1. 10a). If the determinant laij] is positive then

18

Theorem 7. 1 of [11] asserts that the asymptotic joint distribution of

the Mn UJlm's is non-singular. Thus each nondegenerate linear com-

. . k (n)
b1nat1on .23 a.U.
J-1 J J

Theorem 1. 1. In View of Theorem 8 of [17] we have the following

is a U-statistic satisfying the conditions of

 

Theorem 1.2.3 Let f(xl, . . . ,Xk) be a function of k arguments
and
k < Hxll )
f(x1, . ..,Xk) — f(0,...,0) + :31 bixi +0 10g ”X“
2 k 2 . . . .
where ”X“ = 23 xi . Then, for the above-ment1oned U-statistics, if
121
x>0, and O'b>0,
1/2 2
PE(U(1n), . . .,ULM) - {(0, . . .,0) >xeb(1—9§ﬂ) :]~(21rleog n)'1/‘Z ﬁx )2
where
2 k k
0' = Z Z b.m.b.m.oz..
b :1 1:1 1 1 J J 1]

j

and mj is the number of arguments of (p(']) l :j _<_ n .
Similar results are valid for 8(Fn) and for Lehmann-gene ralized

U-statistic s.

PROBABILITIES OF EXCESSIVE DEVIATIONS OF
THE KOLMOGOROV-SMIRNOV AND

KUIPER STATISTICS

2. 0 Introduction and Summary

Let X1, X2, . . . ’Xn be a sequence of one -dimensional, indepen-

dent identically distributed random variables each having the continuous
cumulative distribution function F .
The empirical cumulative distribution function, Fn’ associated

with the sequence X . , Xn is defined by the relation

1...

SIH

Z
n xigx

The Kolmogorov-Smirnov statistics are defined as

D: = sup (Fn(x)-F(x))
—oo<x<+oo
D’ = sup (F(x)—F (x1)
n -oo<x<+oo n
D = sup IF (x)-F(x)].
n -oo<x<co n

The Kuiper statistic is defined as

V = sup (F (x)-F(x)) - inf (F (x)-F(x))
n -oo<x<oo n —oo<x<oo n
= D++D'
n n

19

20

The limit distributions of these statistics are well known. N. V.
Smirnov and A. N. Kolmogorov [10] have proved that, for any constant

c greater than 0 ,

2
lim P(D+ > cn-l/Z) : e-ZC
n—>oo n
2
lim P(D- > cn_1/2) = e"2C
n—>oo n
no .2 2
lim P(Dn>en‘1/2) = 2 z (.1)J'le"ZJ C ,
n—mo j=l
and Nicolass Kuiper [14, p. 43] has shown that
00 .2 2
lim P(Vr1 > cn-l/Z) : 2 (4j2c2 - 1) 8.2] C '
n->co j=l

Second order terms of these asymptotic expansions are also known.
[10, 14].

Following the terminology of Herman Rubin and Jayaram Sethuraman

-l/2
n

[17], deviations of these statistics from zero of the form c for

constant c will be called ordinary deviations. Any deviation of the form
-1/2
11

, for {Cu} a sequence unbounded above, will be called exces-

sive. If {Cu} is also an increasing sequence it is apparent that

1/2

limP(D+>c n“ ): 0.
n n

n—>oo
Of interest is the rate at which probabilities of excessive devia-
tions tend to zero as the sample size becomes large. Much work has

been done in this area.

21

Herman Rubin and Jayaram Sethuraman [17] made use of a

theorem by N. V. Smirnov [10, p. 154] to obtain the following excessive

3n-l/2

deviation result. Let cn —> co and CH =- O(l) , then
2
-2c
P[D+>c n-1/2]~e n
n n
-2cZ

P[D >c n-1/2]~28 n
n n

In section 2. 3 it is shown that asymptotic expansions similar to

n-1/6

these remain valid if the condition Cn = 0(1) is replaced by

nl/Z).

In section 2. 4 similar results are obtained for the Kuiper statis-

c=o(

tic;e.g., for Cn>C\/logn and cnn-1/6=O(l), c>1/2
2
-2c
P[V >c n-1/2]~8cze n
n n n

However, for larger deviations the results are fundamentally
different, that is, they are not a "natural" extension of the ordinary
deviation results. A constant deviation will be called large. Jayaram

Sethuraman [18] proved that for any constant c between zero and one

1
Flog P[Dn > c] log(3(c)

C l-x-c

(3(a) = sup <x/(x+c))x+ ((1-x)/(1-x-c>>

O<x<l-c

In fact Sethuraman obtained results similar to this for k-dimen-
sional random variables and variables defined on separable complete

metric space s .

22
In section 2. 5 it is shown that
+ n
P[Dn > C] ~oz(C) (13(0)) ;

where a(c) does not depend on n . Asymptotic expansions are also
given for D , and D- .
n n
Order results for large deviations of the Kuiper statistic have
also been worked out. Innis G. Abrahamson [1] has proved that

1

1'1

log P[Vn Z c] ~ log (3(c) ,

In section 2. 6 it is proved that P[Vn _>_ c] is asymptotically equal
to a1(c)n((3(c))n.

Section 2. 1 contains a brief review of some well known facts about
Fn, Dn and Vn and the relationship between the order statistics of uni-
form random variables and Poisson processes.

Section 2. 2 contains some theorems of a technical nature which

are used in the later sections.

2. 1 The Sample Distribution Function

In the following paragraphs some well known properties of the
sample distribution function are reviewed. No new results are presented

in this section.

2. 1. 1. Let X1, X2, . . . , Xr1 be n independent random variables with
common continuous cumulative distribution function F . The statistics
D+, D-, D and V are independent of F.
n n n n
For example, set Y1: F(Xi) then the Y.1 are independent

uniform random variables on the interval [0, 1]. Since

23

+ _
the statistics D , D , D and V generated by X ,.. . ,X are,
n n n n l n
with probability one, the same as those generated by Y1, . . . , Yn . In

the following sections it will be assumed that the X.1 themselves are

uniform random variable S .

2.1.2 The equation Fn(x) - x = D: can only be valid if x is one of

the observed values of the (uniform) random variables X1, . . . , Xn

From continuity argmnents, it is evident that Fn(x) -x takes, with

probability one, its maximal value at a unique point, Xmax . The

random variable Xmax is uniform on the unit interval. In the following
sections it will be assumed that the maximum deviation of Fn(x) - x is

unique. (Similar remarks apply to the infimum of Fn(x—) —X . [13, 2]).

2. 1. 3 The probability distribution of the order statistics,

X(l) _<_ X(Z) E . . . _<_ X(n) , of n independent uniform random variables

on the interval (5, t) and that of the jump points, T _<_ T E . . . _<_ Tr1

1 2

of a Poisson process, X(v) , is the same given that X(s) = m and
X(t) = m+n [6 p. 400 and 12, p. 239]. By a Poisson process with
parameter A is meant a separable, real process X(t) with stationary

independent integral valued increments and

 

24

2. 2 Some Asymptotic Expansions

The idea of obtaining asymptotic results for Kolmogorov-Smirnov
statistics by a consideration of n Fn(t) as a Poisson process, X(n t) ,
is not new (see D. A. Darling [20] and R. Pyke [16]).

This idea is employed here to obtain asymptotic expansions for

the conditional probabilities of the events

{F (t)-t<

m
n —n

-x for all t such that O<t<x}

and

-x for all t such that x<t< l}

m
{Fn(t) - t _<_ 3'

given that the mth order statistic occurs at x .
For the case x < t < 1 use is made of the following theorem

([12], p. 247-48).

Theorem 2. 2. l

 

Let T , . . . , Tm be the order statistics based on m indepen-
1
dent uniform random variables on the interval [5, t] and let

1
F (Y)= Z -
n,n1 T (y!)-
1—
Then
P[ sup (F (y)-(y-s))<0]= P[T >5+s k212 m]
n’m __ k—n 1 9 9 9
sixﬁt
n1 .
—1-1‘IE—:—S—)_ 1f0<m<n(t-S)

: 0 if m>n(t-S),

In particular,

25

m n-m
P[X<S:1I<) 1(Fn(t) - l2): :1— - X]X(m) = X] ‘—' 1- m (2.2.1)

For the case 0 < t < x the following theorem, due to Lajos

Takaes ([19], p. 56). is used.

Theorem 2. 2. 2

 

If {X(t); O i t < co} is a separable stochastic process with
stationary independent increments for which almost all sample functions

are nondecreasing step functions vanishing at t: 0 then for x i O

p[ sup (t—X(t))_<_x]:1-e'wx.

0:1:(00
where
E[e‘zx(t)] = e't‘W) for t 3 o, Rez 3 0,

and w is the largest real root of the equation ¢(z) = z .

This theorem was proved for Poisson processes by R. Pyke [16].

If X(t) is Poisson with parameter A then
4(2) = M1 -e'z) (2.2.2)

and w = A - A]: provided X > 1, A]: < l and Xe- = X, e . Evidently

as X tends to one, w tends to zero.

Lemma 2. 2.1

 

If A =l+a ;a >0, a 20(1) and w is the largest real
n n n n n

root of the equation w = A (l -e-W) then w ~ 2a
n n n

26

Proof.

Noting the remarks following Theorem 2. 2. 2 , w : )\ _ )3”
..)\ b _x’i‘ n n n
where )‘ne n = xle n and since the function g()() : he is

continuous and for X > 0 has a unique maximum at X = 1 it is evident

«I;

that )tn 1 1 implies )‘n 1 l . An application of the mean value theorem
yields:
a a2 3
e- =l-a+T-O(a);a>0.
Thus
(1+a)e-EL : l-a2/2+O(a3), a>0
(1 -b)eb :1-b2/2 - O(b3), b > o,
>:< ->\T1 >{: -)\:
Let A =l+a;)\ =l-b sothat 1e zxe ,thatis
-a n nb n n n n
n _ n . 2 3 _ 2 3
(1+an)e — (l - bn)e Wthh means that bn + O(bn) — an - O(an)
. _ _ 2 _ z
and Since an—o(l), bn—o(l) we have bn(l+o(l)) — an(l-o(l)).

Thusb~a andwza +b-2a.
n n n n n n

In fact w 22a (1+O(a )) .
n n n

. t . . . .
G1ven that the m h order statlst1c of n 1ndependent un1form

random variables on the unit interval occurs at x , the first m — 1

order statistics, X , . . . , X , have the same distribution as the
(1) (In-1)
order statistics of m - 1 independent uniform random variables on
the interval [0,x]. Also U. = n(x-X .); i=1,2,...,m-l would
(1) (In-1)

be the order statistics of m -1 uniform random variables on the inter-

val [0, n x]. Evidently

27

In _
P[0<5]:1I<)x(Fn(t)- t)< n - xlxhnl —X]
k C — —
: p[_n - X(k)< 3- -x, k—l,2,...,m-1]X( )—x]

P[-(m-1)F1r1n_l(t) +t_<_1; O:t:nx]X(0) = o, X(nx) =m-1]

Ln(x,m) , say. (2.2.3)

Here Filn-l (t) is the sample distribution function based on the {U1}

and X(u) is a Poisson process with jump points {U1} on (O,nx) [See 2.1.3].
The next theorem shows that, asymptotically, the condition

X(nx) = m-l in (2. 2. 3) may be dropped for suitable choice of m and

A .

Theorem 2. 2. 3

 

Let {X(u) ,u: 0} be a Poisson process and, for each n>0 ,

 

let {Xn(u) , u __>__ 0} be a Poisson process with parameter An = l +
where O < c < co , O < t < l and dn is a positive sequence such that
dn = 0(1) and ndr21> alogn for some a > 0. Let kr1 : [n(t+cdn)] .

Define

Ln(t,kn) = P[u-X(u):1;0<u<nt]X(nt) 2 kn- 1].

Then for any t0 and (:1 such that O < t0 < t < t1 < 1 and sequence

c Such that c > c and c d = 0(1),
11 n nn

Ln(t,kn) ~ PEu - Xn(u) _<_ I]
and
2cd

n

L

 

t, kn) = [1 + a(n,t,c)] (2.2.4)

n(

where

28

lim sup a(n,t,c) = 0 (2.2.5)
n—>co A

and the supremum is taken over all t and c such that 0 ~’ tO < t < t1 < 1

and l<c<c.
-' —n

Proof.

Suppose {X(u) ; u > 0} has parameter A. The probability of the
sample functions for which u - X(u) : 1 for all u > 0 is given in
Theorem 2. 2. 2. This probability may be expressed as a sum by con-
ditioning on the events {X(n t) : k - 1} for k 3 nt .

For reasons of notational simplicity, define

Q(>.)=p[u-X(u)5_1;0<u<a], (2.2.6)
Qn(>.,k) = P[u—X(u):1;u>nt]X(nt):k-l] (2.2.7)
and
pn(>.,k) = P[X(nt) : k- 1] . (2. 2. 8)
Since X(-) has independent increments, O(X) may be written
as
QM) = Z L (t.k)P (MMQ (Mk). (2.2.9)
k>nt n n n

and since X(.) has stationary increments,
Qn()\, k) = P[u -X(u) :k-nt; uZO]X(O) = 0]
In View of Theorem 2. 2. 2,

QM) = 1-6 (2.2.10)

and

 

29

'(k'nm’ if kznt. (2.2.11)

Qn(A.k) = 1 -e
Apparently Qn(A,k) and Ln(t,k) are nondecreasing functions of k and
we are led to the following inequalities:

Q(A) > ( Z P (A,j))L (t,k)Q (A,k) if k>nt (2.2.12)
— j>k n n n —

Q(A) < L (t,k)+( Z? P (AinL (t,k')+ Z P (Ad)
- n jZk n n jzk' n

if k'>k>nt.(2.2.13)

If k is large, if 2 P (A,j) is near one and if E P (A,j)
j>k n j>k' n

is near zero then L t,k) could be approximated by Q(A) . Of course

n(
Ln(t, k) is independent of A so the problem reduces to that of selecting
A so as to make the bounds in (2. 2.12) and (2. 2.13) reasonably tight
when k = kn' An intuitively appealing idea is to "aim" the sample
functions of the process {X(u)} at the point (n t, kn)’ i. e. , to select
A so that EX(n t) '= kn

Proceeding with this idea, let

nth;1 = [n(t+cdn)] + [nan] (2.2.14)

-1
where {an} is a positive sequence such that 8n = O(dn) , (neg) : 0(1)

and me: = 0(1) . Now if {Xi} is a sequence of independently distributed

Poisson random variables with parameter 1 then

ntA'
n

I .. _
Pn[An,k] _ P[i_21 Xi_k—1].

30

Cramér's theorem for probabilities of excessive deviations of

a sum of independent random variables from its mean [8, p. 517]

appl ie 5 so that

 

2 P [A'.j]~-—-——l-——- e 2.

. n n
J<kn VZnnci

Returning to (2.2. 12) with (2.2.10), (2.2. 11) and (2.2.15) itis

(2.2.15)

now evident that
2

1'18
1’1

-1
-(k -nt)w -
t,kn):(1-e-W>(1-e n ) (l+—-——2—-——e 2 ) (2.2.16)

V2nns:

 

Ln(
for all n > Ne where Ne is a constant determined by {an} alone,

and not on c or t (since Ar'1 > n(t0+dn) for all t and c satisfying the

restrictions after (2. 2. 5)).

Making use of Lemma 2. 2. 1 it is seen that

-w 2cdr1 8n
(l—e )<( +——)(1+acd)
— t to n n

 

for some constant a and that
ndZ
n

 

-(k -nt)w '1 t1 >
(i—e “ < (1+2e

for n sufficiently large. Thus

2cdn (
l
Ln(t, kn) < t 1+a (n,t, c)) (2.2.17)

 

where lim sup a'(n,t, c) = O .
n->00 A

31

In order to obtain a lower bound for Ln(t, kn) return to
. 1/2
11 _ _ 1 :
(2. 2.13)w1th nt An — [n(t+cdn)] [n en] and kr1 kn+ (n log n)
Then, using Lemma 2. 2. 1, line (2. 2. l7) and Crame’r's Theorem it is

s een that

 

 

 

t0 n
2
ns
_ n
2
2 10,103.11 3 e
j>k .
— n
and
Z P (A",j):(nlogn)ﬂ1/Z
j>k' n n
“ n
for n sufficiently large and some constant a . Thus
2cdn
Ln(t,kn) Z t (1+a (n,t,c)) (2.2.18)
where lim sup a"(n, t,c) = O . The Theorem is proved.

n—->00 A

The use of Cramér's theorem to obtain asymptotic expansions for

Z P (A' ,k ) and Z P (A", R) was not essential. An alterna-
k>k n n n k>k' n n

.... n — n _
tive approach is to note that if p(A, k) = Ake X/kl then repeated

integration by parts verifies [7, p. 163]

l w -x n
2) p(A, k) = 3—,] e x dx.
k=0 ' x

32

Of course 111 can be evaluated using Stirling's formula and asymptotic
expansions for the above integral, an incomplete gamma function, have
been worked out by W. Fulks [9].

The following lemma presents an asymptotic expansion for

binomial probabilities [7, p. 169] .

Lemma 2. 2. 2

 

Let {dn} be a positive sequence such that dn : 0(1) and

ndf'l > aologn for some a > O. For each t 6 (0,1) let {cn(t)} be

a sequence such that cn(t) :1 , m : mn(c,t) = n(t+cn(t)dn) is integral

valued and for some 5 > 0 , cn(t)dn < l-t -e . Define
_ n m n-m
Pn(t,m) — (m)t (l—t) (2.2.19)
and
f (t,m) : (21rn(t+c (t)d )(l-t-c (t)d )-1/2
n n n n
cn(t)dn ‘m cn(t)dn “(mm
(1 + ———E-—-—) (I - ——1—-_-—t—-) (2.2.20)
Then
Pn(t,m) : fn(t,m) (l :l: O(-nl—t)) (2.2.21)
Proof.

A refined version of Stirlings formula [7, p. 52] states that

l l

.. Z ..
l/2nn+1/2e n 12n+l . (2n)1/2nn+1/ e ne12n

 

(217)

m
[A
:1
[A

Thus

f (t ) _.1___ __1_ __.1___
n 'm 9"" 12n+1 ' 12m ‘ 12(n-m)

_<_ Pn(t, m)

1 1 1
3 fn(t’m)exp(12n ' 12m+l ' 12(n-m)+l)

 

and the result (2. 2. 21) follows after slight manipulation.

Lemma 2. 2.3.

 

Let K have a binomial distribution with parameters n and t .

Let m=n(t+cn), 0<m<n O<t<l, Cn>0' Then

(l-t-c ) (t+c
c n c n]]
szgslll-l—ﬁ-l (1+?)

Proof [3] .

For x > 1, K 2 O, xK is an increasing function so that

P[K > m] < min s“m(ts + 1—t)r1

S>l

where E SK 2 (ts + l-t)n Differentiate the right hand side with

respect to s to find the minimizing value is s = (£)(-1—-£) > 1 .
O n-m t

2.3. Probabilities of Excessive Deviations of the Kolmogorov-Smirnov

Statistics

Let X1,X2,...,X be a sequence of independent uniform random

n
variables on the unit interval and let Fn(x) denote the sample
distribution function. Let p(t, %.— t) denote the probability

"density" that the (unique) maximum of Fn(S)-S , for se[0,l] , is

attained at t and is equal to E;— t . Thus
n

l

n
‘jp z p(t, 9.- t)dt = 1 . (2.3.1)
m=l n
0

If pn m(t) denotes the probability density of the m-th order

statistic, X(m) , then
m _ = _ m _ . =
p(t, a. t) pn,m(t)P[Fn(s) s 5 E. t, 05s51|x(m) t] (2.3.2)
and we have the following.

Lemma72.3.l.

‘ ﬁ—‘Vﬁ'i

 

For p(t, %.~ t) defined as above and lgmgn

- t) = a(g)tm‘l(1—t)n‘m(1— n-m )Ln(t,m) . (2.3.3)

p(t, ETTZE)

:15

Proof:
The probability density of X(m) is [12]
= m-l _ n-m
pn m(.t) m(§)t (1 t) (2.3.4)

In view of theorem 2.2.1 ,

34

35

P[Fn(s)-S 5 g - t;tgs51]x(m)=t]=1 - 53.12.15) (2.3.5)

Since Ln(t,m) is defined in (2.2.3) as

P[u—X(u)sl;0<u<nt[X(nt)=m-l] = P[Fn(s)-s 5 g.- t;0<s<th(m)=t]

(2.3.6)

and since the conditional probabilities in (2.3.5) and (2.3.6) are
of conditionally independent events, the lemma follows.
Evidently,
l
P[D+ > dn] = Zp(t,-m- - t)dt. (2.3.7)
n ‘ n.
0 m2n(t+dn)
An asymptotic expansion for the integral on the right (in 2.3.7) is
obtained (for suitable sequences {dn}) by first finding an asymptotic
expansion for p(t, %.— t) and then one for p(t,% - t) for
mzn(t +dn)
fixed t and finally carrying out the required integration using the
method of Laplace.
Let {dn} and {cn(t)} be sequences as defined (and restricted)
in Lemma 2.2.2 and m=mn(c,t)=n(t+cn(t)dn) . The following Lemma

presents an asympotic expansion for p(t, % — t)

Lemma 2.3.2

 

Let p(t, %.— t) be the "density" defined prior to (2.3.1).
Let m:mn(c,t) and c=cn(t).
Define

cdn t+cd cd l-tcdn

2 __ n __ n
dn gn(c,t)— log(lﬁ-E—) (l l-t) (2.3.8)

 

36

and
_ 2c2 -1/2
bn(c,t) — t(l_t)[21r(t+cdn)(l--t--cdn)] . (2.3.9)

Then for any t and t such that 0 < t < t < t < 1 and sequence

0 1 0 l

c such that c > c and c d = 0(1)
n n n n

2
m 2 nd g (C,t)
p(t, B- - t) =/n dnbn(c,t)e “ “ (1+a(n,c,t))(2.3.10)
where 1im supa(n,t,c)=0 and the supremum is over all t ;
n—m A
0 < t0 < t < t1.< 1 and over all c ; 1 s c S on.

Proof:
The proof is achieved by using expressions (2.2.4) and (2.2.19-
2.2.21) in expression (2.3.3).
Next on the agenda is the task of evaluating
Z p(t, %— t)
m3n(t+dn)
for fixed t and sequence {dn}

For each k=0,l,2,... define the sequence {c:(t)} so that

n(t+c:(t)dn) = [n(t+dn)]+k if [n(t+dn)]<n(t+dn)
= [n(t+dn)]+k-l if [n(t+dn)]=n(t+dn)_
Let
m*(k,t) = n(t+ck(t)d ). (2.3.11)
n n n
Then
2 m*(k,t)
p(t,-“i-t)= 2 p(t.—-‘-‘—-t).
m3n(t+dn) n kzl n

It is interesting to observe that, when ndi is large, the

dominant term of (2.3.10) is a decreasing function of c .

37

In particular, if an is such that a>l and

n(t+andn) = [n(t+adn)] ,

the upper and lower Darboux sums of

2
a ndngn (C s t)

n
11an bn(C,t)€
c;(t)

obtained by breaking [c'(t),an] into intervals of width l/ndn are

dc

n(a -l)d —l
“ ndzg (c:(t>,t>

bn(ck(t),t)e n “ (2.3.13)
k=l m
and
n(an—l)dn
k ndign(c§(t>,t>
b (c (t),t)e (2.3.14)
= n I1
respectively.

This leads to

Lemma 2.3.3.
Let p(t, g.— t) be the ”density” defined prior to (2.3.1) and

{dn} be defined as in Lemma 2.2.2. then

2 ndzgn(l,t)
p(t, E:- t) = dn(t)e n (1+a(n,t)) (2.3.15)
m3n(t+dn) n

where
1/2 —1/2

an(t) = 2(ndﬁ) (2nt(l-t)) , (2.3.16)

gn(l,t) is defined in (2.3.8)

38

and
lim sup a(n,t) = 0 .
n+°° t0<t<tl
Proof.
For some a>1 let kn=[(a-l)ndn] and write the left hand side

of (2.3.15) as

n m:(k,t) :E: mg(k,t)
:3: p(t, ------~- — t) +- p(t, -——————- - t) (2.3.17)
=1 “ kgkn+1 “
= Q1+Q2 say .

It will be shown that Ql is asymptotically equivalent to the right

hand side of 2.3.15 and that Q2 is asymptotically negligible when

”a" is a sufficiently large real number.

Making use of (2.3.10) and (2.3.12) through (2.3.14), Ql may
be expressed as

-l a ndzg (c,t)
(ndan bn(c,t)e “ “ dc)(1+a*(n,t)) (2.3.18)

1

A
3\
Q.

:3 N
v
D

[_J

I

2 0
ndngn(Cn(t).t)

+

0n(bn(Cg(t),t)e

where

2 and lim sup a*(n,t) = 0
n+w t0<t<t1

lenl

IA

For large n , each integral exhibited in (2.3.10) is determined by
its behavior near one since gn(c,t) is a decreasing function of

Cal . To see this, recall the definition of gn(c,t) in (2.3.8):

39

cd cd
2 = _ __11 _ _ _-—4E
dngn(c,t) ((t+cdn)1og(l+ t )+(1 t cdn)log(l 1-t))-(2°3'19)

Differentiate with respect to c to obtain

cdn cdn
—dn(log(1+'—E—)-log(l-'I:E)) (2.3.20)

3 2
5F‘(dngn(c , t))

l t t+cdn
-d 1 _£;_. _—_—_. .
n( 0g t l-t-cdn

Clearly §2_.d§gn(c,t) < 0 since 1-t/t is a decreasing fraction of
c

t .

Differentiate again with respect to c to obtain

 

2 -d2
3 dzgn(c,t) = n < 0 . (2.3.21)
3c2 n (t+cdn)(l—t-cdn)

The mean value theorem is used to write

ndign(c,t) = ndﬁgn(1,t)+(e-1)ndil(§§_.gn(3,t))

for some 1 5 2 5 c .
Since for large n, bn(c,t) is an increasing function of c it
is now evident that for large n

2
ndngn(l,t)

a ndign(c,t) ndnbn(a,t)e nd§(a-l)gg(a,t)
nd bn(c,t)e dc 5 2
“ nd g'(1,t) eudu .
1 n n

 

0

(2.3.22)

and 2
ndn(a-1)8A(a.t)

nd g (l,t)
a ndzg (c,t) ndnbn(l,t)e n n
fbn(c,t)e n n dc _>_ eu du . (2.3.23)
1

 

2 1
ndngn(a,t) 0

40

8
V _
(Here gn(a,t) --§E— gn(c,t))
In particular if we let a = 1+0 where 0 > 0 , 0 = 0(1)
n n n n

l =
and 0(0n)

 

nd
n
then 2
nd g (l,t)
an ndign(c,t) bn(1,t)e n n
ndn bn(c,t)e dc== (l+o(l))_(2.3.24)
dnlgg(l,t)]
1
To verify (2.3.24) note that
ndzé g'(a t) < ndzd '(1 t)
nn n9 .. nng a
and
2 dn dn
1 _ _ ——-_ _.___
ndndng (l,t) — ndn0n(log(l+ t ) log(l l-t)) (2.3.25)
2
-ndn0n
< ____.__
2t(1-t)
provided
d < . to(l-to) tl(l-tl)
n min —*——-7{-* , 2

The latter inequality is obtained by noting that
f(x) = logC£:£. — log E:£:§;_-__3L__
t t+x 2t(1-t)
is such that f(0) = 0 and f is increasing if x is small.
Since 1im ndﬁdn = m we conclude that the integrals on the right
hand side ofn(2.3.22) and (2.3.23) converge (uniformly for t0<t<tl)

to —1 as n becomes large. Of course bn and gﬁ are continuous

and (2.3.24) follows.

41

Using (2.3.22) and an inequality analogous to 2.3.26 we have

a 2 an
nd g (c,t) ndzg (ct)
n n n n ’
bn(c,t)e dc bn(c,t)e dc

a 1
n

2__ 1
bn(a,t) ndn(a 1 20ﬁ)gn(an,t)

g —-'—-- e (2.3.26)
bn(1,t)

In View of(2.3.24),(2.3.25)and (2.3.26) ,

2

ndign(c,t) bn(l,t)e
ndn bn(c,t)e do = [1+a(n,t)] (2.3.27)
1 dnlgg<1,t>|

 

for some a(n,t) such that

lim sup a(n,t) = 0 .
n+m t <t<tl

0
Note that

/E'dzbn(1,t)
n

 

Nonn(t)
dn|8;(l,t)l

To complete the proof we need to Show that Q2 is small.
Returning to (2.3.3) it is seen that
p(t, 9:- t) : E"-(n)1:m(1—t)“'m
n t m

and, using (2.3.8) (2.3.11), (2.3.17) and Lemma 2.2.3, that

:E: n ndign(a,t)
Q2 = p(t, —_ t) ('t‘e
m3n(t+adn)+l °

55‘

42
Then, by(2.3.20), (2.3.21), the remark following it and 2.3.25, we have

2
ndngn(1,t) l-t t+dn -(a-1)ndn

 

Q2 5 P‘ e (T 'rrf-Tr‘)
t n
-(a-1)nd2
n d2 (1 t) (2 3 28)
< P. (e 2t(1"t) )en ngn ’ O .
" 1:

Since nd: > a0 log n for some aO > 0 , the constant a in (2.3.27)
may be selected so large as to make 02 asympotically negligible
compared to Q1 for all t0<t<tl (e.g. (a—l) 3 t(1—t)/a0)

The equivalences (2.3.17), (2.3.18), (2.3.27) and inequality
(2.3.28) verify (2.3.15).

Recalling that the point t at which Fn(s)—s ; se(0,1) attains
its maximum has a uniform distribution over the unit interval, line
(2.3.15) in Lemma 2.3.3 may be interpreted as an expression for the
conditional probability that
03:2.(Fn<:>"s)i ..,
given that the supremum is attained at the point t , t0<t<tl . The
following lemma presents an expression for the unconditional probability
that
sup Fn(s)-s>: dn .

O<s<1

Lemma 2.3.4.

 

Let {dn} be a positive sequence as defined in Lemma 2.2.2, and

p(t, E;— t) be the "density" defined prior to (2.3.1) then
n

 

43

l :E: m ndign(l,t:)
p(t, ;-— t)dt"‘e (2.3.29)
0 m3n(t+dn)

where gn(l,t) is defined in (2.3.8) and t3 is such that
SUP 8 (l,t) = g (1.t*) (2.3.30)
O<t<l n n n
Proof.

The method of Laplace ([4], [20]) will be employed to Show for

O<t0<1/2<tl<1 , that

t1 2 2
ndngn(l,t) ndngn(1,t:)
dn(t)e dt/Ve (2.3.31)

to

and

m:n(t+dn)

t0 1
:E: ndﬁgn(1,t*)
+ p(t.-§-- t>dt = o(e n ) (2.3.32)
0 t1

To verify (2.3.31), gn(l,t) is twice differentiated with respect to

t 3

 

2 = 2 = ._£_ _ _ l-t
dnfn(t) dngn(1,t) (t+dn)log t+d +(1 t dn)log( )

 

 

 

n 1—t-dn
(2.3.33)
d t+d

a 2 n 1-t n
dzf' t = ———-d 1,t =—————— - 1 -——— . 2.3.34
n n( ) at ngn< ) t(1_t) og< t l-t-dn) < >

2 2 2]“ l
dzf;(t) =-—§§ d gn(1,t) = —d 2 + 1 ;] < 0
n at n n t (t+dn) (1-t)2(1-t—d

(2.3.35)

44

By using Taylor expansions of the log—terms in the expression

2 d d d

_. = n _ _E. _ __E.
dnfn(t) t(1—t) log(l+ t )+ log(l l-t)

 

it is seen that t: , the (unique) maximum of gn(1,t) is in the
interval 6%(l—dn), 1/2) for all n sufficiently large. Using the

mean value theorem, gn(1,t) may be expressed as
(l t) = (l t*)+ -l—(t—t*)2f"(3 ) (2 3 36)
gm ’ gn , n 2 n n n e o

for some

6ne[[t:,tn] U [t,tgH

In evaluating the integral in (2.3.31) the following notation is

used:
_ —2 —1
hl(t) — t (t+dn) ,
_ —2 —1
h (t) — (1—t) (l—t-d ) ,
2 n
= *
t1,n tn + an ,
= 2': —
t0,n tn En ,
where
- —2 _ 2
En — 0(1) and En — o(ndn) ,

u = (nd:(hl(tl n)+h2(t0’n))l/2 :
_ 2 1/2
v — (ndn(h1(t0,n)+h2(tl,n)) ,

and for some e>0 , t0<1/2—e and tl>1/2+e .

In view of (2.33) through (2.3.36)

2
ndnfn(t:) 7

tl,n 2
nd f (t) 01 (t‘)e u (t -t*)':_x'
f a(t)e nn dtg n n n l,n ne2dx
t n u

0,n n

 

._ it
Un(t0,n tn)

45

2
nd f (t*) V (t -t*)
ndzf (t) a (tt)e n n n n lsn n 2
a (t)e “ “ dt 8 n -X
n > '— dx

2
_ *
Vn(t0,n tn)

 

' H
for some tn 6 [t0,n’t1,n] and tn €[t0,n’t1,n]

Because tgrvl/Z it is seen that

I
dn(tn) dn(tg) -1/2
___.__./v-—————-/~/(2n)
un vn
so that
t1,n 2 2
ndnfn(t) ndngn(l,t§)
dn(t)e dt/Ve ‘ (2.3.37)
t
0,n

Also, referring to the definition of dn(t) , (2.3.16), it is evident

that
t t
0’n ndﬁfp(t) 1/2 2 1/2 0,n “dzfn(t)
. — n
an(t)e dt 5 2(2nt0(1—t0)) (ndn) dt,
t0 to
(2.3.38)

Here, fn(t) may be expanded as
. __ , 1 _
fn(t) — fn(t0’n) + fn(7)(t t0,n) for t0<t5v§5t0 n
Of course t—tO n < 0 and f; is a decreasing function so that
9

fn(t)

IA

_ I
fn(t0,n) + (t to’n)fn(to’n)

IA

fn(t0,n) + 15 en(t-t0,n) (2.3.39)

for n sufficiently large.

46

The latter inequality is obtained by expanding f$(t0,n) as a

Taylor series about t; with the assumption that dn = 0(1) and

tn = 0(1)
Since ndﬁ En tends to infinity for large n
t 0 .
0,n 2 -u ﬂ 2 -l
‘jr’ 15ndn€n(t—t0,n) 1 ~[9 e du (lSndncn).
e dt = ——-:;-—-
15nd e
n n 2
t0 15ndn€n(to_t0,n)

(2.3.40)
Finally fn(t0,n) < fn(t:) so that lines (2.3.38) through (2.3.40)
guarantee
t

0’n ndzfn(t) ndzf (t*)
dn(t)e n dt = o(e n n n ) (2.3.41)

t0

2

provided ndien tends to infinity for large n .

A similar argument shows that

t t
1 ndifn(t) _1/2 1 ndﬁfn(t)
a (t)e dt < 2(2nt (l-t )) nd2 e dt ,
n — l 1 n
t t
1,n 1,n
and
fn(t) 5 fn(t1,n) - 15€n(t-t0,n)
so that

dn(t)e dt = o(e

1 2 2
nd f (t) nd f (t*)
f n n n n n )_ (2.3.42)

47

the asymptotic expansion (2.3.37), (2.3.41) and (2.3.42) verify
(2.3.31). To verify (2.3.32) recall Lemma 2.2.3 and Lemma 2.3.1 to

see that

 

 

n-l m l n m
p(t, - t) < n(m—1)t (1 t)
for 0 5 m 5 n
2 P(t, €11 "' t)
m3n(t+dn)
n(l—ted ) n(t+d ) n—l -n(l-t—d ) l
_ n n n __
<n< 1t) < t) (1-% (1—-—1——> (l-t-dn-n)
1—t—dn t+dn n(l—t-dn) l—t

Note that (1 — %)x < e—1 and (l - %)-X < (1 — %)—le for x > 0
so that
2
-1 l—t—d nd f (t)
n 11
Z p(t, g- t) : n(l- 3}) <———-—)e “
mzn(t+dn) 1't

where fn(t) is defined in 1ine(2.3.33).

Thus there is a constant c > 0 such that

t

t
() :E: 0 ndzfn(t)
p(t, E-- t) 5 cu e n dt . (2.3.43)
m2n(t+d ) n
0 1'1

0

As in (2.3.39), fn(t) may be bounded above by

fn(t) s fn(t0) + (t—t0)fg(t0)

for O < t 5 t Thus the right hand Side of (2.3.43) is bounded

0 .
above by

48

2
en ndnfn(t0)

2 (2.3.44)
Y

ndnfn(t0)

which is asymptotically negligible when compared to

2
*
ndnfn(tn)

This may be easily demonstrated by means of the following crude

inequalities. For t < t:

fg,<t> = (t—tpfmn) : (-21- - t)

and
_ t2 11
fn(t) — fn(2t)—tft'l(2t) + 2 End)
1
f 2t —-———
s n( ) l6t

provided 2t < t3.

Thus the expression in(2.3.44)is less than

 

2
ndn 2
_ IBIb
ce endnfn(2t0)
’i_ 2 (203045)
12 t0)dn

' *
prov1ded 2tO < tn .

. 2
Then Since fn(2t0) < fn(tﬁ) and ndn > aolog n selecting
tO < a0/16 also will assure that expression (2.3.45) and hence
expression (2.3.43) is asympotically negligible compared to (2.3.37).

A similar argument is used to show

11

1
:E: ndﬁfn(t*)
p(t, E:— t)dt = o(e “ ) (2.3.46)
m3n(t+dn) '
t1

 

 

49
Recalling Lemma 2.3.1 and Lemma 2.2.3 it is seen that

p(t, g — t) .<_ $319104?“

2' n(t+dn) ndzf (t)
m ________ n n .
p(t, 'n - t) _<_ e if t < 1-c1r1

t
m3n(t+dn)

and

and is zero if t > l—dn

Thus

ndgfn(tl)
[:2 p(t, 3,? — t>dt 5 en 3
m>n(t+dn )

and (2.3.46) is verified.
The results of this section are summarized as:

Theorem 2.3. 1

 

If {Xn} is a sequence of independent, identically distributed

random variables with continuous distribution functions, F , and

+

Dn = sup Fn(x) — F(x) is the Kolmogorov-Smirnov statistic generated
x
by X1,...,Xn then for any real sequence {dn} such that d > 0 ,
n

(1n = 0(1) and ndi > a0 log n for some aO > 0

2
nd f (t*)
+ nun
P[Dn 3_dn]nve

where

dn c1n
-dr21fn(t)= (t+dn)log(l+ —)+(1—t—dn)log(l— TH—t

and t: is the unique value at which fn(t) attains its maximum for

O < t < l .

 

50

Corollary_2.3.l

 

For the sequences {Km} and {dn} defined in the previous theorem

2
_ ndnfn(t;)
P[Dn > dn] Ne

ndif (til)
Pan > dn]r~/2e ‘1

2. 4. Probabilities of Excessive Deviations of the Kuiper Statistic

For a sequence of random variables, each with distribution
function F , the Kuiper statistic is defined in terms of the sample dis-

tribution function, Fn’ as

This statistic was originally suggested to test hypotheses about
distributions on a circle. It has the property that Vn is the same no
matter where on the circle the count to determine Fn(x) begins. To

make this statement more precise, define addition on the unit interval
by
s+t= s+t if s+t<1

s+t-l if s+t>l

and define the "interval” [s,s-i—t] by {x:s:x:s+t} if s+t<1
and {x:s:x:l or 0:x:s+t-l} if s+t>1.

Let X , X . , Xn be independent uniform random variables

2...

H

on the unit circle. Define X:(t) as the number of Xi in the interval

S

(s, S-i-t] and nF:(t) = Xn(t). Then

V = sup (F (x) - X) - inf (F (x) - X)
n 05x51 n 0:x_<_l n
s . s
= sup (F (x) - x) - 1nf (F (x) - x)
05x51 n 05x51 n

Let Dn(v) : Fn(v) - v. If the infimum of Dn(v) occurs at s and the

maximum at s-i—t and the mth order statistic in the interval (s, 8-1-1)

51

 

 

52

occurs at s-i—t then V = E- - t. Let p (t, E - t) be the joint
n n n n

"density" of the above event.

Then
1 l n-l m
23 p (t,— -t)ds dtzl.
m=l n n
0 0
Using the notation developed in (2. 6. 3), pn(t, It? - t) may be written
as
L(t, n: m) P(t, n) m) R(t, n: m).
Let A be the event that an order statistic occurs at s and the
th

m order statistic in the interval (5, S-i-l) occurs at S-i-t. The
joint density associated with this event is

n! m-l n-m-l

P(t'”'m) Z (m-l).'(n-m-l).' t ”'9

 

for O<s<l,0<t<l,and 0<m<n. Then pn(t,-r§—-t) may

be written as
p (t, — - t) : L(t, n,m) P(t,n,m) R(t, n,m) .
Here, L(t,n, m) and R(t,n,m) are defined by

L(t, n,m) = P[Fn(s) - s 5 Fn(u) -u _<_Fn(s+t) - s+t; ue (s, S—i—t) IA]
R(t,n,m) = P[F (s) -s < F (u) -uf_Fn(s-i-t) - s-i-t ; ue (s+t, s-i—1)|A]

The next three lemmas present asymptotic expansions for P(t, n,m),

L(t,n,m) and R(t,n, m). Since the proofs are similar to those of

53

Section 2. 3, most of the details are omitted.

Lemma 2. 4. l

 

Let {dn} be a sequence such that dn : 0(1) and ndrz1 > clogn

for some c > 0 . Then if m = [n(t+dnll

2
-1/2 3/2 ndn gn“)
r) (3 .

P(t,n,m)~ [21rt(1-t)]
Here, gn(t) is defined by

d d
2 _ n
_dngn(t)—(t+dn)log(l+—t-)+(l-t-dn)10g(l-——).

The proof involves using Stirling’s formula in a manner Similar

to that used in Lemma 2. 3. 2.

Lemma 2. 4. 2

 

If {dn} is a sequence of real numbers such that dn = 0(1)
and ndrz1 > clogn for some cZ :t/Z then for m = [n(t+dn)] ,
0 < t < 1

2d 2
L(t.n.m)~ (——3>.

Proof.

The proof consists of breaking L(t, n, m) into parts by con-
ditioning on Fug-2:1) . First note that P[X:(t/2) = le] follows the

binomial distribution with parameters 1/2 and m-l ;

P[X:(t/2) = le] = (mj‘1)(1/2)m“l for 0_<_j :m—l.

54

Let

Ll(t,n,m,j) : P[Dn(s) 3 Dn(v) :Dn(S-i-t) ; v e [s, s+t/2]]X:(t/2) = j,A]

L2(t, n,m,j) = P[Dn(s) :Dn(v) :Dn(s-i-t) ; v e [s-i-t/2, s-i—t]]X:(t/2) = j,A].

Then

m l m 1
( .- )L1(t,n,m,j) L2(t,n,m,j) .

0 J

m-l

M I

L(t, n,m) : (1/2)

1
As in sections 2. 2 and 2. 3 the relationship between uniform order

statistics and a Poisson process, X(-) , can be exploited.

P[Dn(s) : Dn(v) ; ve [s, s+t/2]|X:(t/2) = j,A]

P[u—X(u) : l ; 0 < u < nt/2]X(nt/2) : j]

Ln(t/Z,J+l).
If j = n/2(t+dn+o(dn)) then Theorem 2. 2. 3 yields
L (t/Z.j+1)~ Zd /t.

n n

Clearly,
L1(t,n,m,j) = P[D (s) < D (v) ; vs [s, s—i—t/2]|erl(t/2) = j,A]
_ P[Dn(s) _<_Dn(v) and Dn(v') > Dn(s—i-t); for all ye [s, s-i—t/2]

and some v' e [s, s-i—t/Z]]XE(t/Z) : j ,A].
It will be shown that for j : n/2(t+dn+o(dn)) that
P[Dn(v') > Dn(s-i—t) for some v' e[s, s-i—t]]Xi(t/2) : j,A] = O(dn)

so that

55

L1(t, n,m,j) ~ Zdn/t‘

To evaluate P[Dn(v) > Dn(s-i—t) for some ve (s, s-i-t/Z) Ix:(t/2) =j,A]

observe that, conditionally, there are j - 1 order statistics in the

interval (8, s-i-t/Z) so the conditional probability density that the kth

order statistic occurs at s-i-v is the probability density that the kth

of j - l uniform order statistics on the interval (0,t/2) occurs at
V 3

.. '. _ _ .-l-k
(t/Z) lkﬂklnﬁg)“ 1W??? 05v5t/2.

The density with respect to u = (t/2)-1v is

k-l j-l-k

k<jlgl><u1 (l-u) 051151.

The conditional probability that Dn(x) does not exceed k/n - v to the

”right" of v is (see Theorem 2.2.1)

EFL-i<— lf 0 < j-k < n(t/Z-V)
‘Z—(l-u)

I
p-o
I

P[Fn’j_k(x) - x30, 0 < x < t/2] _

0 if j-k _>_n(t/Z-v) .

The conditional probability that Dn(x) does not exceed k/n - v to

the ”left" of v is

P[Fn(x)-x:k/n-V 0<x<v]X( =v]=L(v,k).

k) n

Thus the probability density with respect to u and k that the supremum

of Dn(x) for 0 < x < t/2 occurs at v and is equal to k/n -v, given

x:(t/2) 23' and A is

l<(j) (u)k'1 (1 -u)j’k (1 - —l—'-l—‘—) Ln(v, k)

gin-u)

An asymptotic expansion for this density can be obtained by methods
similar to those used to obtain the expansion in Lemma (2. 3. 2). Let
s = o(d ) and j : n/2(t+d +2: ). We require k/n-v = d +t so that
n n n n n

k=n(v+cd) for c >1. Let
e n e—
(2ce —u)dr1 - usn
k = j u+
n (t+d +8) .
n n

Since knzju+c0jdn/t forsomee, 130:2;c31, we may

 

obtain, using Lemma 2. 3.4 that

P[Dn(v') > Dn(s-i-t) for some v' e [s, s-i-t]]erl(t/2) : j,A]

nnn
e

( 1(d2/t21g (1.u:))
O

~2jdi/t2
o(e )

O(n-cZ/t)

Similarly, L2(t, n, m,j) ~ 2dn/t . This may be verified by observing

that

P[Dn(v) :Dn(s-i-t) ; v e [s-i-t/Z, s t]|x:(t/2) = j,A]

p[u_x(u) :1 0 < u < nt/2]X(nt/2) = m-j]

Ln(t/2,m-j+l)

and then proceding with the proof that L1(t, n, m, j) ~ 2 dn/t .

57

Apparently
. . s 2. _
P[Dn(s) _<_Dn(v) :Dn(s+t)ve [s, s+t]]Xn(t/2) j,A] - 0
unless nt/2<j<n(-£+d). Inthiscase [a I<d . For
2 n n n

8 =9dn, [0]<l wehave

.2(l+0)dn 2(1-0)dr1
L1(t,n,m,j) L2(t,n,m,j) : O[(—_——t—_—)( t ):]

Recall that

l
n

Break this summation into three parts:
. n
(a) J- z<t+dn+o(dn)>
. n
(b) J=Z(t+(l+0)dn) |e|<1

(c) other j

 

2d 2
For part (a) the sum is asymptotically ( tn) (1 -o(l)> and parts

(b) and (c) are negligible compared to this. I

Corollary 2. 4.1

II
0
A

...—1
V

If {dn} is a sequence of real numbers such that (1n
and rid:1 > clogn for some c > 0 then for m = [n(t+dn], 0 < t <1

d
L(t,n,m) = 00—3)}.

Lemma 2. 4. 3

 

H
O
A

l--'
v

If {dn} is a sequence of real numbers such that dn

58

(l-t)
2

 

and ndr21> clog n for some c2 3 then for m = [n(t+dn)] ,

0<t<1

d 2
R(t,n,m) ~ (T-n—t) .

The proof is similar to that of the previous lemma. Note that

P[Xs+t(l-—E) = j [A] follows a binomial distribution with parameters
11 2

1/2 and n-m-l;
. n-m-l
P[Xs+t(—1—g-E) : le] : (m'9'1)(l) , 0 :j_<_n-m-1.

Let

R1(t.n.m.j)

_ . , . . -_1.:_£ ' __ _.
_ P[Dn(s):Dn(v) :Dn(s+t),ve [s+t,s+t+ 2 ]|A and xr1 ( 2 )_

R2(t.n.m.1)

_ P(Dn(s):Dn(v)_<_Dn(s+t).V€lS+t+ 2 ,s+llA and Xr1 (‘2’) —1

Then

It will now be shown that for j = %(l-t - dn + o(dn))

dn
R1“. n, m. J) ~ (I17)

Note that

 

59

R1(t.n.m.j)
= P[D (v) < D (s-i-t) ;ve [s-i—t, s+t+-1-i]]A and Xs+t(ﬂ) :1]
n — n 2 2
.. P[D (v) < D (s-i-t) ‘ve [s-i-t s-i—t-i-li]
n _ n v 9 2
and
Dn(v') < Dn(s) for some v'e [s+t,s+t-i——1£t]IA and Xs+t(——1£t) = j]

and for any Poisson process X(u) and j : %(1-t - dn+cn)

1-t - _
P[Dn(v) :Dn(s-i—t) ;ve [s-i-t, s-i—t-i- “‘Z—HA and xs+t(lZ—t) : j]

1’1

= P[X(u)-u:0 0<u< 2(1_t)]x(1?}(1-t))=j]

-_1_2' - 1_-t_
n(l-t) for 0<j<n( 2)

l

0 if j>n(—12t)‘

Also, using the notation of Theorem 2. 2. l,

P[D (v') < Dn(s) for some v' e [s-i—t, s-i—t-i—l—g—t [A and Xs+t(l-£—E) = j]

dn+€n
P]: sup (F . (u) -u>> ]
l-t n’3 2

0<u<—

 

where T1, T2, . . . , Tj are the order statistics based on j uniform

random variables on the interval [0,-Iii] . The "0 term" was

obtained in the same way as the corresponding term in Lemma 2.4. 2.

60

 

Similarly,
. _ 2(n-m-j-l) ._ _n_
R2(t,n,m,j) — O(l - n(l-t) ) for j— 2 (l-t-dn-l-en)
and
dn
R2(t,n,m,j)~ ITE- for en 2 o(dn)

and we can deduce
d 2
n
R(t, n,m) ~ (_l-t) '
These lemmas are used in the proof of the following Theorem

Theorem 2. 4. 1

 

If X,X

1 2, . . . is a sequence of independent, identically dis-

tributed random variables with continuous distribution functions and

V is the Kuiper statistic (as defined in Section 2. 0) generated by
n

X1, . . . , Xn’ and if {dn} is a real sequence such that dn > 0 ,
dn = 0(1) and ndi>1/210gn then
2 ndrzign(t:)
P[V >d ]~ 8nd e
n n n
where ng (t) = (t+d )log -—t— + (l-t-d )log -—l—‘£— and t]< is
n n n t+dn n l-t—dn n

the value at which g(t) attains its maximum 0 < t < 1 .

Proof.

The proof is similar to that of Lemma 2. 3.4.

By Lemma 2.4.1, 2. 4. 2, 2.4. 3 it is clear that for

m = n(t+cd)
C n

61

m

2
c 4 3/2 ndngn(c,t)
pn(t9 n 't) ~ (217)

—1/2
(cdn) r1 e

(t(1-t))‘5/2

,

here drz1 gn(c, t) is determined by replacing dn by cdn in the

definition of d: gn(t) and, repeating the reasoning of Lemma 2. 3. 3,

 

z; (t, m_____+k -t)
k=0 “
1-t
3.;-
_ _5 nd g (c,t)
~ 4nd I (2 ) 1/Z( (1 t) /Z( d )4 3/Ze dc
1
5/2
2
_1/2 ndn ) 2
~ (Z1T) 4<W endn gn(t)
l d“ d“ l
ndr1 log(l+T)- log(l- 1 -——t_)

It remains to integrate this asymptotic density with respect to s and
t . Integration with respect to 3 leaves the expression unchanged.
Integration with respect to t is carried out using the method of

Laplace (see the proof of Lemma 2. 3. 4) to obtain

I/hp n(t, m-]--l1--t)dsdt

 

 

 

 

2 5/2
nd
(211)-1/2(—.:r—n—*—) 1/‘2 2 (t*)
5 t (l-t) 2n e“ ngn n
d (d /(t")(1-t"‘)) nd . + I...
n ‘1 :[t ’2 2(t+dn ) (1-t‘)z(1-t"-dn)
2
2 n ngn(tn)

2. 5. Probabilities of Large Deviations of Kolmogorov-Smirnov

Statistics

Let X1, X2, . . . be a sequence of independent, identically

distributed, continuous random variables with common cumulative

distribution function F. Then Y = F(X1),

1 :F(XZ), Isa.

Y2
sequence of independent, uniform random variables on the interval

(0,1). Define D+ = sup (F (t) - t) where F (t) is the empirical
n O<t<1 ‘1 9
distribution function of Y1, Y2, . . . , Yn . Of course D: is equivalent

to the usual one-sided Kolmogorov-Smirnov statistic,
sXp (Gn (X) — F(X)), where G ( ) is the empirical distribution
-00< <co n
function of X ,X , ...,X
l 2 n
Of interest is the rate at which the probability that D: is

greater than a constant tends to zero as n becomes large.

Theorem 2. 5.1

 

Let D: be defined as above and let c be a real number,

O<c<l. Then

 

 

 

P[D:>c]~oz(c)[6(c)]n, (2.5.1)
where
* t>1<+c * l_t*-c
(3(6) = [1 f, ) (—l-f——) 1, (2.5.2)
t +c l-t -c
and
ei-w‘ -1/2
a(c)- 9%)]: 1 t ‘C + *2 :1 (2.5.3)
2(]t +6)2 (l-t ) (t'+o) ’

>1:
Also, t is a root of the equation

62

63

 

 

 

log(t .l't‘c) + C :0;0<t<1-c,

Proof.

Without loss of generality, it is assumed that {Xn} is a
sequence of independent, uniform random variables on the unit inter-
val. The structure of the proof is the same as for the excessive
deviation case. In particular, Lemma 2. 3. 1 is taken as the starting
point of the proof. The first task, then, is to obtain an asymptotic

expansion for p(t, 1;:— —t) as n becomes large for integer valued m,

m : n(t+c+9/n) (2.5.4)

where k E 6 < k+1, c is a fixed positive number, and t+c is bounded
awayfrom O and l;0<x0<t+c<x1.
In evaluating Ln(t,m) the inequalities (2. 2. 12) and (2. 2.13)

are again employed. Let

x=(1+t3) (2.5.5)

64

 

 

+ c+e +9/n

kn =(1+ 1: ) (2.5.6)
C-e +9/n

XI; = (1+ 2 ) (2.5.7)

where En > O and r15: = alogn if 31—252 < c2 and O elsewhere.

Let K be a Poisson random variable with parameter p . Then

-m K m m-
P[K :m] _<_ min S ES = (i) e H (2. 5. 8)
s>l

In particular, if p : ntxr-l and m is defined as above

 

 

 

 

 

[18
- - n
ntx m-ntk -
2 p (x',j) < < “)9 “<ez(t+°+9/”) < n‘81 (2.5.9)
j>rn n 1'1 — m —- —-
n8
_ n
+ . -
z Pn(>.n.3) : e 2(t+°+28n+97“ _<_ n 3 (2.5.10)
j>m+2n5
— n
and 3
1'18 n5
_ n + n
2 Pn(>.+,j) _<_ minsmEs‘K EeZ(t+c+9m 3(t+c+9/n)2 _<_ n-‘a
j_<_m n 521
(2.5.11)

. . + . .
Let W+ denote the root assoc1ated With )‘n as defined in Theorem

2. 2. 2. Similarly let w_ denote the root associated with x; . Then,

returning to the inequality (2. 2. 12) it is seen that

-W

1. (t,m) 5 (1—e J“)

n (HEN—a) (2.5.12)

for all n sufficiently large, and using inequality (2. 2. 13) it is seen

65

that
-w

Ln(t.m) 3 (1- e (2.5.13)

‘)<1-2n'a)

for all n suficiently large.
Since

X = xg'ze')‘ the

Finally, let w denote the root associated with x.

 

 

>1: >:< -
w = X - x where K < 1 is the solution to Xe
following inequalities obtain
2(5 + e/n) 2(.c.In + e/n)
w<w <w+ “t ;w- t <w <w.(2.5.14)
+ ’ (2.5.15)

Thus, selecting a > 1/2 , it is concluded that

Ln(t,m) = (1-e'w)(1+3(n’l/Z)) (2.5.16)

An asymptotic expansion for the expression

 

p (t,m) = m(n)tm’1(1-t)n‘m
n m
may be found in a mannersimilar to that used in Lemma (2. 2. 2). It
is found that
(t+c)n 1/2 n 6 k2
Pn(t, m) 2 ( 2 ) (p(t, c)) (a(t, c)> (1 +0(-——))
Z'rrt (l-t-c) n
(2.5.17)

 

 

 

where
_ t t+c l-t l-t-c
B<t,C) _ (t+c) (l-t-C)
and
t l-t-c
alt”) - (Tr: t+c)

Thus

66

 

-W 1/2 6 n
P(t, {til-t) = (C(:(-l?t) ))(2:((1tf:.)c)) (a(t,c)) (who) (1+an(t,k)>
(2.5.18)

— n—l/2

and sup an(t, k) = o( ) for x < t+c < x and 0 i k 3 kn: o(nl/Z)

0 1

The probability density that the maximum deviation of D: from 0

occurs at t and is greater than or equal to n(t+c) is

 

 

k
qn<t.c) : z p(t, "a" - t) (2.5.19)
k_>_0
k+9
where mk = n(t+c+ n) is integral valued and 0 gen < 1 . To
see that
m
p(t! _ -t)
qn(t’ C) 1-a(t, c) (2. 5. 20)
write
k' Ink rnk
q(t,c)= Ep(t,————-t)+ Z) p(t,—~t).
n k=O “ k:k'+1 “

The first sum on the right tends to the required limit for large k' ;

 

k' m m k'+1
1330 P“, ‘31: ‘ t) = P“, “29‘ “(I .13. Willa”) (1+3<n‘”2))

if k' = o(nl/Z).

The second sum on the right is bounded above by the sum of
the probability densities of the order statistics themselves. Thus,

employing Lemma 2. 2. 3.

. , linlllr'l.

67

m

 

 

z p(t,—£4): >2 k<§>tk‘1<1-t)“‘k
kzk'H “ t_>_k'+1
c+k' l-t—c-k' c+k' t+c+k' -n
:"Ul- 1..) (1+ .) 1
nk'
t l-t-k'-C' n
5- n(l-t ' c+t+k‘ ) (ﬁnd)

 

-l
: o(ﬁ(t,c))n for some k'=O((10rgln) )(2.5.21)

\

Thus qn(t, c) may be written as

9n
qn(t,c) = fn(t,c)(a(t,c)) (1+an) (2.5.22)
where Ian] : o(1) independent of s and c for s bounded away from

zero and s+c bounded away from 1 .

The probability that D: is at least c is
l-c
f qn(s. c)ds (2.5.23)
0
and for fixed t such that n(t+c) is an integer

t t
f qn(s,c)ds f1fn(s,c)[a(s,c)]e(l+o(l))ds

1
(1,3) (c-3-

t
(1 +o(1))f fn(s, c) [a(s, c)]n(t‘s)ds.
1

“H (2.5.24)

The integral on the right may be expressed as

t-— O
n
l
l uc -u u
~_f (t: C) [em][a(tyc)] [a(t-_:C)] du
O
1
uc
~._fn(t,c) f e-——t(1_t)du
O
, C
_ t(1-t) 't(l-t)
_ cn fn(t,c)|:1—e ]‘ (2. 5.25)

 

This may be seen by letting s = t - g; 0 i u < 1 then

t 1
f1fn(s,c)[a(s,c)]n(t“5)ds = if fn(t-%,c)[a(t-:1—1,c]udu
o

t--
1'1

and observing that a(t-g, c) ~ a(t, c) and that

 

 

t-E n(t+c) 14+}: n(l-t-c) u
n n’ n ’ l-t t l-t-c

CU.

~fn(t.c)e “1‘” [a(ncn‘.u

This final relationship is used to define g (s, c) , a smooth approxima-

c .
s(l-S) . Def1ne

 

tion to qn(s, c). Define b(s,c) : log a(s,c) +

C

_ b(s,c) s(l—s) - s(l-s)
fn(s, C)[1_e-b(s, C):] l:_c—:_ 1—e ](2. 5.26)

gn(s,c) -

 

if b(s,c);1é 0 and

69

C

gn(s,c) : fn(s’c)(S(lC-S))(1'e s(l-s))

 

if b(s,c) = 0. Then (by 2.6.24, 2.6.25)

And from the above discussion it is evident that for t bounded away

from O, and t+c bounded away from 1, that

t t
flqn(s,c)ds :(f1gn(s,c)ds)(l+on(t,c))

t-— t——
11 1’1

where (on(t,c)' < an and an is a sequence convergent to O inde-
pendent of t and c .

Thus for O < tO <t1< 1-c

The left hand term is the probability that the maximum deviation of
Fn(s) - s is at least c and is obtained at some 5 , t0 < s < t1.

The right hand term may be integrated using the

Method of Laplace [4, 20]:

 

Let ¢(x) and h(x) be two real continuous functions defined

on (a,b) such that

70

(i) <1>(x)en (X) is absolutely integrable for every positive

value of n >n .

0

(ii) h(x) has a single maximum in the interval, an interior
point of (a, b).
(iii) h"(x) is continuous, h'(y) = 0 and h"(y) < 0. Then

8.8 H96.)

b nh(x) ~21: ”2 nh( )
41X) e dX ~ ¢(Y) [m] e y

a

In our case

 

t+c l-t-c
t 1-t
h(s) : 10g(t+c) (l-t—c)
c

)_[ s+c Jl/Z(1—e‘w)[ b Jl: e- s('1_s)]
(MS — 2'rr(l-s-c 1-a 1_e‘b -

a = a(s, c), b = b(s,c) , w 2 Wk as defined above, and

 

g (S,c)ds : n (13(5) n (S)ds : ¢(y)( -”1T) en (y)
h (V)
to to
Since

 

t+c l-t-c
t l—t
NS) = ”g(m) (1-1;-..)

 

 

 

 

11"(8)

 

II
I
O
N
(_‘7
U}
3TH
+
O
+
T:
I
U)
AH
y—l
l
U)
I
O
l___l

71

it is evident that h'(s) has at most one root in the interval [0, l-c]

and it must be in the interior of the interval since

1im h'(s) : co lim h'(s) z -oo
s—>O s—>1-c

Jo
’\

>:< :
Call this root t and select t0 and t1, so that to < t < t1.

Evidently conditions (i), (ii), (iii) of the method of Laplace are

 

 

satisfied.
Thus
t1 t1
q (s c)ds~f gn(s,c)ds
to to
~(1-e-W ) l-t*-c + l -l/Ze-nh(t*)
C >}< 2 >1: 2 z}: 2 >'
t (t +c) (1-t) (t +c)
= a(C)[B(C)]n

(a(c) and (3(c) are defined in lines (2) and (3) of the Theorem),

To complete the proof it should be noted that

t t 1

0 l
P[D:>c] zf qn(t,c)dt+f qn(t,c)dt+f qn(t,c)dt

0 t0 t1

t 1

O
~a(c)[(3(c)]n+f qn(t,c)dt +f qn(t,c)dt.

0 t

It remains to be shown that the two integrals on the right are small

compared with 01(c)[(3(c)]n .

72

The integral

to
qn(t, c)dc

0

denotes the probability of the event that the maximum deviation of
Fn(s) - s is at least c and is obtained for some 5 in the interval

(0, t0) . The event described is realized only if at least nc of the

order statistics, Yk , are in the interval (0, t0) . The probability of
the latter event is P[K 2 nc] where K is a binomial random variable

with parameters n and t0. Thus

to
qn(t’ C)dt < P[K > nc] :min S-nc Es

 

+K
O — _ s>l
: min s-nc(ts +1 -t)n
s>l
= [1360.0]n say
l-c
Clearly 6(t0,6) < (3(6) if 0 < t0 <1/2 and to < (l-c) C c[(3(c)]l/C.
For any such tO ,
to
f qn(t.c>dt = o(a(c)[p(c)]“)
o

A similar argument demonstrates that

73

1
f qn(t,c)dt _<_ P[K > n(t1+c)] _<_ [(3(t1,1c1+c)]r1

t1

(here K is a binomial random variable with parameters n and t1) so

ti" < t < l-c

for any t1; 1

This concludes the proof.
The following two theorems are immediate.
Let X1, X2, . . . be a sequence of independent, identically

distributed random variables with common, continuous distribution

function F . Define the sample distribution function as

Define the Kolmogorov-Smirnov statistics:

D- = sup (F(x) - F (x))
n -w<x<m n
Dn = sup IFn(x) - F(x)l.

-m<x<m

Define functions of c , 0 < c < l as follows:

ti" ; the root of the equation

 

 

 

s . l-s-c c _ .
(10g(s+c l-s )) +(s(l-s)) — O, O<s<1-c

74

\‘I

W". ; the largest real root of the equation

 

 

 

 

(1+ 3),-)(1—e‘w) —w z: o
t
>i< >.< * 1- ta: 1- >5
B ; B :: exp{(t +C)(10g * )+(1-t -C)(10g_ji<—_)} : g;
t +c l-t -c l-t -c
>:< >1: >1: >§< _
w t [l-t -c t +c :] 1/2
a? 0‘ = T + —“‘::‘2‘
C t (l-t ) .
Theorem 2. 5. 2
Let D; be defined as above, then
n

P[D;>c]~a"(3 .

Theorem 2. 5. 3

 

Let Dn be defined as above, then

P[Dn>c]~ 26.67l

To verify theorem 2. 5. 2 assume, without loss of generality, that

X1,X2,...

define Uizl-X., i:1,2,3,... . Let

1

then

x - F:(x) .—. F:(l-x) - (l-x).

are uniform random variables on the unit interval and

75

Thus P[D;l > c] = P[sup(F:'(u) - u)> c] and the probability on the right
u
is determined in Theorem 2. 5. 1.

Theorem 2. 5. 3. follows from the observation that

P[D >c] = P[D+>c]+P[D‘>c]—P[D'>c,D+>c]
n n n n. n

and the probability on the right is small compared to the others.

2. 6. Probabilities of Large Deviations of the Kuiper Statistic.

Let X X be a sequence of independent, identically dis-

1, 2’ . . .
tributed random variables with common continuous distribution function

F . As before, the sample distribution function generated by the first

n random variables is defined as

F(X)= Z ;i=1,Z,...,n
n

x.<x
1—

and the Kuiper Statistic is defined as

V = D+ + D" = sup (F (x)-F(x)) _ inf (F (x)-F(x)).
n n n n n
-oo<X<co -a3<x<oo
I. G. Abrahamson [1] proved that the distributions of the Kuiper

statistic and the Kolmogorov-Smirnov statistic are of the same exponen-

tial order in the tails,

log P[Dn > c]

lim =l;O<c<l.
logP[Vn>c]

 

n—‘CD

The following theorem, together withTheorem 2. 4. 3 proves that the
probabilities of large deviations of Vr1 and Dn are not asymptotically

equivalent, in fact,

P[D > c]
lim n = 0; 0 < c <1.
n—>m p[Vn> C]

 

Theorem 2. 6.1

 

Let Vr1 be defined as above and let c be a real number,

O<c<l, then

76

77

P[Vn > c] ~ arl(c)n[(3(c)]n . (2.6.1)

The definitions of [5(c) , w;< and t).< are given in Theorem 2. 5. l and

 

 

2 1/2 2 -1/2
“1“) z “1-2.-t) 372 [7—1— + 21 1 (2-6'21
(l-t) (t+c) t (t+c) (l-t) (l-t-c)
w=w*, t= t*,

Proof.

The method of proof is similar to that of Theorem 2. 5. 1.
Assume, without loss of generality, that X1, X2, . . . , Xn are n inde—
pendent uniform random variables on the unit interval. The notation
s 4— t is used to describe addition on the unit circle. That is, for-
O_<_s<l and Oit<l define

s+t if Ois+til

s+t =
ll-s-tl if l<s+t_<_2

and (s, s-i-t) denotes the interval (3, s+t) if 0 < s+t _<_l and the

 

union of intervals (3,1) and [0,1-s-t') if l< s+t :2.

If the infimum of Fn(v) — v occurs at s and the maximum at
s—i—t and the mth order statistic on the interval (5, 5+1) occurs at
s-i-t then Vn : § - t . For large deviations, an asymptotic expres-
sion for the joint probability density of the above event for m : n(c+t)
is of interest.

To begin with, the probability density that an Xi occurs at s;

O<s<1 is nds.

78

The conditional probability density that the mth order statistic
in the interval (8, s-i-l) occurs at s-i-t, O < t < l , given that an Xi
occurs at s is the same as the probability density of the mth of n-l

order statistics of uniform random variables on the unit interval,

 

(n-l)! m-l n-m-l
(In-1): (n-m-l)! t (H) '
Thus the joint density of an order statistic at s and the mth order

statistic in the interval (s,s-i—l) at s-i-t is

P(t,n,m) z (m-l)!r(:1-m-1): tm’l(1-t)n‘m‘1

 

Finally, the joint probability density that the infimum of Fn(v) - v is

at s and the maximum is at 54-1 and that Vn = % - t may be

written as

L(t, n, m) P(t,n,m) R(t,n,m), (2.6.3)

L(t,n,m) = P[Fn(s)-s :Fn(u) -u E Fn(s-i—t) - s—i—t; u e (s, s-i-t)lA] ,
R(t,n,m) = P[Fn(s)—s _<_Fn(u)-u < Fn(s-i-t) - s-i—t; u e (s-i—t, s-i—1)|A],

A is the event which has probability density P(t, n, m) .

It will be shown that for m = n(t+c) and for large n

n

 

 

 

1/2 t+c l-t-c
p(t, n,m) ~((t+c)2(1-t-2<:)) n3/2[(t:c) (1:1) 1 (2.6.4)
2m: (l-t) .
2
R(t,n,m) ~ (1%?)

2 t 2
and L(t,n,m) ~(l -e-W) : (TY—E)

 

79

Thus the joint density is

pn(t, c) = L(t, n, m) P(t, n, m) R(t, n, m)

(wzczml be 3M2 3/2

(2161/7111: -t3> (t+c)

n

3f2 (“139(3) .

 

(2.6.5)

As was shown in Theorem 2. 5. 1

(3%, c+ E) ~ ak(t, c)(3n(t, c)

..,.) = (1%?) (MILE?)

The joint probability density that the infimum of Fn(v) - v is at s

and the maximum is at s-i-t and that Vr1 Z {:11 - t , m: n(t+c) , is

qn(t’ C) ~ its??? pn(t, c),

Apparently qn(t, c) is independent of s so that

1
,{ qn(t.C)ds = qn(t,C)

as in Theorem 2. 5.1 we have

t t
f quill, c)du ~f gn(u, c)du

1
t-H t-

Eli—-

for

 

gn(u.C) = fn(u,c) [lbw eb(u, C)c] ENC u:|l:1e eu(1C 11)]

 

 

80

fn(u, c) denotes the right hand side of (2. 6. 5)divided by l —a(t, c) , and
b(u, c) is defined as in(2. 5.26).

Finally the probability that Vn is greater than c is

t
l 1
P[Vr1 > c] = f qn(t,c)dt ~ I gn(t,c)dt
O t
O

for any t0 and t1 such that O < tO < t* < t1< l-c . The last integral
may be integrated using the method of Laplace thus obtaining
P[Vn > c] ~ 011(c)n[(3(c)]r1 . It remains to verify the asymptotic expan-
sions listed under (2. 6.4).
The asymptotic expansion of P(t, n,m) is found by using
Sterling's approximation of k.‘ as described following line (2. 2.21).
Next, for any Poisson stochastic process {X(v), v30} , define
L'(a,b, k) : P[a:v-X(v) _<_ 1 ; 0 < v < b|X(O) = o, X(b) = k]. For
L(t, n,m) as defined in(2. 6. 3), identifying the m - 1 order statistics
in the interval (s, 8-1-1) with the corresponding order statistics of
m - 1 independent random variables on the interval (0, nt) and these,

in turn, with the m - 1 jump points of a Poisson process in the

interval (0,nt) we observe that

L(t,n,m) = L'(nt-m, nt,m-1)
-1 _
= ‘JE L'(nt-m, 925-. [5117] +3)
- P[X<%5)=[1n—211—]+11X<nt)= n- 1]
- L'(nt-m E-t- m-l-lr—n-L-H-j]
’ 2 ’ 2

 

81

The summation is over those integers j for which

0_<_[-I—n—£-l—]+jf_n-1.

95-)

The conditional probability of X( 2 is binomial with parameters

l/2 and m-l . Using an excessive deviation result for binomial

random variables it is seen that, for m = n(c+t)’+k, k: O(x/H) and

 

ljl 5 an, an increasing faster than V; ,
.2 P[X(9-2-t-) = [53-211]+j|X(nt).—. m-1]~1. (2.6.7)
11 I 3a,,
For Ln(t,k) defined as in (2. 5.10), it is now apparent that

, _n_ m-l . ~ 1
L(nt-m’ ![ 2 ]+J) Ln(2:[

m-l

2 1+j) (2.6.8)

 

because

P[nt-m < v - X(v); o < v < -r;—t-IX(0):O, X(F‘ZE): [3521—] + j] ~ 1(2.6. 9)

N

and, for m, k and j defined as above and kn = [Ln—ii] +j , relation

2.5.16 applies so that
t -w
L (2’kn)~l-e >0 (2.6.10)

and w is the largest real root of the equation w : (l + E) (l -e-W) .

Applying (2. 6. 7) through (2. 6. 10) to (2. 6. 6) it is seen that
Z
)

L(t, n,m) ~ (1 - e‘W

The asymptotic expansion of R(t,n,m) is obtained in a similar

manner and we have

82

1w. n, m) ~ (ff-,-

This completes the proof.

 

 

BIBLIOGRAPHY

10.

ll.

12.

13.

BIBLIOGRAPHY

Abrahamson, Innis G. (1967). Exact Bahadur efficiences for the
Kolmogorov-Smirnov and Kuiper one-and two-sample statis-
tics. Ann. Math. Stat. 38, pp. 1475-1483.

 

Birnbaum, Z. W. and Ronald Pyke. (1958). On some distributions
related to the statistic D: . Ann. Math. Stat. 29. PP. 179-
187.

 

 

Blackwell, David and J. L. Hodges, Jr. (1959). The probability
in extreme tail of a convolution. Ann. Math. Stat. 30,
pp. 1113-1120.

 

 

Copson, E. T. (1965). Asymptotic Expansions. Cambridge Univer-
sity Press, Cambridge.

 

Darling, D. A. (1960). On the theorems of Kolmogorov-Smirnov.
Theory of Probability and _i_t_:§ Applications, 5, pp. 356-60.

 

 

Doob, J. L. (1953). Stochastic Processes. Wiley, New York.

 

Feller, William. (1957). _A_r_1_ Introduction £2 Probability Theory
and I_t§ Applications. Vol. 1, 2nd edition. Wiley, New York.

 

 

 

Feller, William (1966). £13 Introduction to Probability Theory
and It_s Applications. Vol. 2, Wiley and Sons, New York.

 

 

 

Fulks, W. (1951). A Generalization of Laplace's Method. Proceed-
in s _qf_the American Mathematical Society, Vol. 2, p. 613-
622.

 

Gnedenko, B. V., V. S. Koroluk and A. V. Skorokhod. (1960).
Asymptotic expansions in probability theory. Proc. Fourth
Berkeley Symp. Math. Stat. Prob. Vol. 2, pp. 153-170. Univ.
of California, Berkeley.

Hoeffding, W. (1948). On a class of statistics with asymptotically
normal distribution. Ann. Math. Stat. Vol. 19, pp. 293-323.

 

 

 

Karlin, Samuel. (1966). _A_First Course _ip_ Stochastic Processes.
Academic Press, New York.

Kuiper, Nicolaas H. (1960). On the random cumulative frequency
function. Proc. of the Royal Neth. Academy o_f Sciences
Seriesé Vol. 63, pp. 32-37.

 

83

14.

15.

l6.

17.

18.

19.

20.

84

Kuiper, Nichlass H. (1960). Tests concerning random points on
a circle. Proc. _qf_ the Royal Neth. Academy 3f Sciences
Series A. Vol. 63, pp. 38-47.

 

Loéve, Michel. (1963). Probability Theory (3rd edition). Van
Nostrand, Princeton.

 

Pyke, Ronald, (1959). The supremum and infimum of the Poisson
process. Ann. Math. Stat. Vol. 30, pp. 568-576.

 

 

Rubin, Herman and J. Sethuraman. (1965). Probabilities of
moderate deviations. Sankhya’. Series A. Vol. 27, pp. 325-
344. '

Sethuraman, J. (1964). On the probability of large deviations of
sample means. Ann. Math. Stat. Vol. 35, pp. 1304-1316.

 

Takacs, Lajos. (1967). Combinatorial Methods in the Theory 9_f_
Stochastic Processes. Wiley and Sons, New York.

 

 

 

Widder, D. V. (1941). The Laplace Transform. Princeton Univ-
ersity Press, Princeton.

 

 

.111 18111. 11.1

 

    

RSITY L

IE IBRARIES

56 7667

   

11m
30

0

Mlllmlsllhlllllm
3 1293