LIBRARY
Michigan State University

This is to certify that the thesis entitled "A Test for the Change-Point Problem Based on the Cramér–von Mises Statistic," presented by Jae-Sung Kim, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Statistics.

Major professor
Date: 1988

A TEST FOR THE CHANGE-POINT PROBLEM BASED ON THE CRAMER-VON MISES STATISTIC

by

Jae-Sung Kim

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

1988

ABSTRACT

A TEST FOR THE CHANGE-POINT PROBLEM BASED ON THE CRAMER-VON MISES STATISTIC

By Jae-Sung Kim

Let X_1, X_2, ..., X_n be independent random variables and let F, G be continuous distribution functions on R. We want to test the null hypothesis that X_1, X_2, ..., X_n are i.i.d. F against the alternative that X_1, X_2, ..., X_τ are i.i.d. F and X_{τ+1}, X_{τ+2}, ..., X_n are i.i.d. G, F ≠ G, where τ is an unknown positive integer called a change-point. The Cramér–von Mises statistic is proposed to test these hypotheses. The limiting distributions of the test statistic under the null hypothesis and under particular alternatives are obtained. Under general alternatives, an estimator τ̂_n of τ is proposed by maximizing the proposed test statistic. The consistency of τ̂_n is proved, and the asymptotic distribution of τ̂_n is expressed in terms of Kiefer processes. In the case of G(·) = F(· − θ), the two-sample location model, the asymptotic normality of θ̂_n := med{X_j − X_i, 1 ≤ i ≤ τ̂_n, τ̂_n + 1 ≤ j ≤ n} is also derived.
Finally, the distributions of the test statistic and the estimators in finite samples are studied, and the proposed test is compared with other tests via simulations.

To my wife Min-Soo and my sons Woo-Jin and Song-Jin

ACKNOWLEDGEMENTS

I wish to express my sincere thanks to Professor Hira L. Koul for suggesting the problem, for his continuous guidance and encouragement, and especially for his patience during the preparation of this dissertation. I would also like to thank Professors Joseph Gardiner, Dennis Gilliland and James Stapleton for serving on my committee. My special thanks go to Professor James Stapleton for his continuous encouragement, for his constructive comments on earlier versions of the thesis, and for his valuable help with the simulations. I would also like to thank my wife Min-Soo and my sons Woo-Jin and Song-Jin for their encouragement and patience during the preparation of this thesis. Finally, special thanks go to Ms. Cathy Sparks for her excellent typing of the manuscript.

TABLE OF CONTENTS

1. INTRODUCTION
2. TESTING A CHANGE-POINT
   2-1 The proposed test statistic
   2-2 The asymptotic null distribution of the test statistic
   2-3 The asymptotic distribution of the test statistic under particular alternatives
3. ESTIMATING A CHANGE-POINT AND A LOCATION PARAMETER
   3-1 The consistency of the estimator of a change-point
   3-2 The asymptotic distribution of the estimator of a change-point
   3-3 The asymptotic normality of the estimator of a location parameter
4. FINITE SAMPLES
   4-1 Distributions of estimators for finite samples
   4-2 Comparison of the proposed test with Schectman's test
REFERENCES

1. INTRODUCTION

Let X_1, X_2, ..., X_n be a sequence of independent random variables with distribution functions F_1, F_2, ..., F_n, respectively, F_i ∈ 𝓕, where 𝓕 is a class of continuous distribution functions on R.
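As a concrete illustration of this model, one can simulate a sequence that is i.i.d. F up to an index τ and i.i.d. G afterwards. The sketch below is illustrative only: the choices F = N(0,1) and G = N(θ,1) (a pure location shift) and all parameter values are my own assumptions, not part of the model.

```python
import random

def change_point_sample(n, tau, theta, seed=0):
    # X_1,...,X_tau ~ F = N(0,1); X_{tau+1},...,X_n ~ G = N(theta,1).
    # F, G and the parameter values are illustrative assumptions only.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) + (theta if i > tau else 0.0)
            for i in range(1, n + 1)]

xs = change_point_sample(n=100, tau=40, theta=1.0, seed=1)
```

Under this alternative the empirical mean of the post-change segment exceeds that of the pre-change segment, which is the feature the tests below are designed to detect.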
The so-called change-point problem concerns the null hypothesis

H_0: F_1 = F_2 = ... = F_n = F (say)

against the alternative hypothesis

H_1: F_1 = ... = F_τ ≠ F_{τ+1} = ... = F_n = G (say)

for some unknown τ, 1 ≤ τ < n. The parameter τ is called the change-point. The relevant questions are the following: How should the change-point be detected? If there is a change-point, how should it be estimated? If G(·) = F(· − θ) and there is a change-point, how should θ be estimated?

In the change-point problem there are two topics of interest: (i) testing for and estimating the change-point, and (ii) estimating the change under the alternatives. This thesis deals with both of these topics.

In detecting a change-point there are two main methods available in the literature: (i) sequential detection and (ii) quasi-sequential detection. In this thesis a quasi-sequential procedure is used. If n is not preassigned, a sequential method is a reasonable way of finding a change-point. About thirty years ago Page (1954, 1955, 1957) introduced the cumulative sum test to detect a change-point. Pollak (1985) derived a stopping rule, a limit of Bayes rules, to detect the change-point. Page's (1954) stopping time was shown by Moustakides (1986) to be optimal for the detection of changes in distribution. All of these papers dealt with sequential methods of detecting a change-point. Most change-point papers consider the quasi-sequential case, in which n is preassigned. We will assume that n is given, that is, X_1, X_2, ..., X_n are given.

This thesis studies the Cramér–von Mises statistic for testing H_0 and estimating a change-point τ under H_1 in a sequence of independent random variables in the quasi-sequential setting. For G(·) = F(· − θ) the Hodges–Lehmann (1963) estimator

θ̂_n = med{X_j − X_i, 1 ≤ i ≤ τ̂_n < j ≤ n}

of θ is proposed, where τ̂_n is chosen to maximize the test statistic in Section 3-1.
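The Hodges–Lehmann-type estimator above is simply the median of all between-segment pairwise differences. A minimal sketch, in which the split index stands in for τ̂_n (which the thesis obtains by maximizing the test statistic):

```python
import statistics

def hodges_lehmann_shift(xs, split):
    # Median of all pairwise differences X_j - X_i with i in the first
    # segment (i <= split) and j in the second (j > split); `split`
    # plays the role of the estimated change-point tau_hat_n.
    first, second = xs[:split], xs[split:]
    return statistics.median(xj - xi for xi in first for xj in second)

theta_hat = hodges_lehmann_shift([0.0, 0.0, 0.0, 1.0, 1.0, 1.0], split=3)
```

On this toy sequence every between-segment difference equals 1, so the estimate is exactly the shift.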
The limiting distributions of the test statistic under the null hypothesis and under particular alternatives are obtained in terms of two independent Kiefer processes in Sections 2-2 and 2-3. The finite-sample null distribution of the test statistic is simulated, because it is almost impossible to compute it exactly. The estimator τ̂_n of τ is shown to be consistent in Section 3-1, and the asymptotic distribution of τ̂_n under particular alternatives is expressed in terms of the Kiefer processes in Section 3-2. The asymptotic distribution of θ̂_n is shown to be the same as that of its analogue in the two-sample problem with no change. Properties in finite samples are studied and the comparison of the proposed test with other tests is carried out in Chapter 4.

Other approaches to the change-point problem in the quasi-sequential setting include, for testing and estimating the change-point and the change in mean: the maximum likelihood estimation of Hinkley, D.V. (1970), Sen and Srivastava (1975), Hinkley, D.V. and Hinkley, E.A. (1970), and Hawkins, D.M. (1977); the cumulative sum test approach of Hinkley, D.V. (1971) and Pettitt (1980); the non-parametric approach based on the Mann–Whitney statistic studied by Pettitt (1979), Schectman and Wolfe (1984) and Schectman (1982, 1983); and the least squares method of Hawkins, D.L. (1985). For estimating a change in slope there are the maximum likelihood estimation approach of Hinkley, D.V. (1969a, 1969b) and the non-parametric approach of Sen (1980, 1982). For more details see the bibliography of change-point problems in Hinkley, D.V. (1980). Very useful references for this thesis are Hawkins, D.L. (1985) and, for a discussion of the Kiefer process, "Empirical Processes" by Shorack and Wellner (1986).

2. TESTING THE CHANGE-POINT

Recall the hypotheses. Let X_1, X_2, ..., X_n be independent with X_i having distribution function F_i, i = 1, 2, ..., n. We want to test

(2.1)  H_0: F_1 = ... = F_n = F  against  H_1: F_1 = ... = F_τ ≠ F_{τ+1} = ... = F_n = G,

for some unknown τ, 1 ≤ τ < n, where F and G are unknown.

2-1.
THE PROPOSED TEST STATISTIC

To introduce the test statistic, proceed as follows. Divide the sequence X_1, ..., X_n into two parts: the first j random variables X_1, ..., X_j and the last (n−j) random variables X_{j+1}, ..., X_n, j = 1, 2, ..., n−1. Consider the (n−1) two-sample Cramér–von Mises statistics and take their maximum. We do not consider the case j = n, since j = n implies that there is no change-point. The Cramér–von Mises statistic for X_1, ..., X_j and X_{j+1}, ..., X_n is

S_nj = (j(n−j)/n) ∫_R [F̂_j(x) − Ĝ_{n−j}(x)]² dH_n(x)
    = (j(n−j)/n) ∫_R [(1/j) Σ_{i=1}^{j} I[X_i ≤ x] − (1/(n−j)) Σ_{i=j+1}^{n} I[X_i ≤ x]]² dH_n(x),

where F̂_j and Ĝ_{n−j} are the empirical distribution functions of the two parts and H_n is the empirical distribution function of the whole sample. Set j = [nt], t ∈ I, and let

A_n(t, x) = √((n−[nt])/(n[nt])) Σ_{i=1}^{[nt]} I[X_i ≤ x] − √([nt]/(n(n−[nt]))) Σ_{i=[nt]+1}^{n} I[X_i ≤ x],

so that A_n(t, x) = √(j(n−j)/n) V_nj(x) with V_nj(x) = F̂_j(x) − Ĝ_{n−j}(x).

Under H_1 with change-point τ, for j ≤ τ,

(2.13) V_nj(x) = (1/j) Σ_{i=1}^{j} (I[X_i ≤ x] − F(x)) − (1/(n−j)) Σ_{i=j+1}^{τ} (I[X_i ≤ x] − F(x)) − (1/(n−j)) Σ_{i=τ+1}^{n} (I[X_i ≤ x] − G(x)) + ((n−τ)/(n−j))(F(x) − G(x)).

Hence, for t ≤ t_0,

√(j(n−j)/n) V_nj(x) = √((n−j)/(nj)) Σ_{i=1}^{j} (I[X_i ≤ x] − F(x)) − √(j/(n(n−j))) [Σ_{i=j+1}^{τ} (I[X_i ≤ x] − F(x)) + Σ_{i=τ+1}^{n} (I[X_i ≤ x] − G(x))] + √(j(n−j)/n) ((n−τ)/(n−j))(F(x) − G(x))
 = L_n1(t, F(x), G(x)) + α_n1(t, F(x), G(x)) (say),   (2.14)

where α_n1 = √(j(n−j)/n) ((n−τ)/(n−j)) (F(x) − G(x)). For t > t_0,

√(j(n−j)/n) V_nj(x) = √((n−j)/(nj)) [Σ_{i=1}^{τ} (I[X_i ≤ x] − F(x)) + Σ_{i=τ+1}^{j} (I[X_i ≤ x] − G(x))] − √(j/(n(n−j))) Σ_{i=j+1}^{n} (I[X_i ≤ x] − G(x)) + √(j(n−j)/n) (τ/j)(F(x) − G(x))
 = L_n2(t, F(x), G(x)) + α_n2(t, F(x), G(x)) (say),   (2.15)

where α_n2 = √(j(n−j)/n) (τ/j)(F(x) − G(x)).

The last terms of (2.14) and (2.15) tend to ∞ as n → ∞ for any x ∈ R. So, in order to have meaningful results, we need some conditions on F(x) and G(x). We consider the following alternatives:

H_1n: G(x) − F(x) = n^{−1/2}(F(x) − F(x)^k), for some positive integer k.

Note that sup_x |G(x) − F(x)| = o(1). Then

α_n1(t, F(x), G(x)) → √(t/(1−t)) (1−t_0)(F(x) − F(x)^k) = α_1(t, F(x)) (say), and
(2.16) α_n2(t, F(x), G(x)) → √((1−t)/t) t_0 (F(x) − F(x)^k) = α_2(t, F(x)) (say),

uniformly in x ∈ R and t ∈ I as n → ∞. Note that under H_1n, the L_ni(t, F(x), G(x)) are functions of t and F(x) only; thus write L_ni(t, F(x), G(x)) = L_ni(t, F(x)), i = 1, 2. The limiting distributions of the L_ni(t, F(x)) are given by Lemma 2.
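The statistics S_nj above can be computed directly from the two empirical distribution functions, with the integral against H_n taken as an average over the pooled sample points. A brute-force sketch (O(n²) per split; the toy sample and the smallest-argmax convention are my own choices for illustration):

```python
def cvm_split_statistics(xs):
    # S_nj = (j(n-j)/n) * integral of [Fhat_j - Ghat_{n-j}]^2 dH_n,
    # the integral taken as the average over the pooled sample points.
    n = len(xs)
    def ecdf(sample, x):
        return sum(1 for s in sample if s <= x) / len(sample)
    stats = {}
    for j in range(1, n):
        first, second = xs[:j], xs[j:]
        integral = sum((ecdf(first, x) - ecdf(second, x)) ** 2 for x in xs) / n
        stats[j] = j * (n - j) / n * integral
    return stats

# Toy sample with an obvious change after the fourth observation.
S = cvm_split_statistics([0, 0, 0, 0, 5, 5, 5, 5])
j_hat = max(S, key=lambda j: (S[j], -j))  # smallest j attaining the max
```

On this sample the statistic is maximized exactly at the true split j = 4, where the two empirical distribution functions disagree completely on half of the pooled points.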
Lemma 2. Under H_1n, L_n1(t, F(x)) ⇒ L_1(t, F(x)) in (D(I×R), ‖·‖) as n → ∞, where, for 0 ≤ y ≤ 1 and t ≤ t_0,

(2.17) L_1(t, y) = √((1−t)/t) K_1(t, y) − √(t/(1−t)) K_1(t_0 − t, y) − √(t/(1−t)) K_2(1−t_0, y),

and L_n2(t, F(x)) ⇒ L_2(t, F(x)) for t > t_0, where

(2.18) L_2(t, y) = √((1−t)/t) K_1(t_0, y) + √((1−t)/t) K_2(t − t_0, y) − √(t/(1−t)) K_2(1−t, y),

and the K_i are independent Kiefer processes, i = 1, 2.

Proof. For t ≤ t_0,

L_n1(t, F(x)) = √((n−[nt])/(n[nt])) Σ_{i=1}^{[nt]} (I[X_i ≤ x] − F(x)) − √([nt]/(n(n−[nt]))) [Σ_{i=[nt]+1}^{τ} (I[X_i ≤ x] − F(x)) + Σ_{i=τ+1}^{n} (I[X_i ≤ x] − G(x))],

and the convergence follows from the weak convergence of these normalized sums to the Kiefer processes. □

Theorem 2. Under H_1n, as n → ∞,

(1) S_n[nt] converges in distribution to S_{t_0}(t) := ∫_0^1 [L_i(t, y) + α_i(t, y)]² dy, where L_i(t, y) and α_i(t, y) are as in (2.16), (2.17) and (2.18), with i = 1 for t ≤ t_0 and i = 2 for t > t_0, and

(2) M_n := max_{1≤j≤n−1} S_nj converges in distribution to sup_{t∈I} {S_{t_0}(t)} = S*_{t_0} (say).

Proof. For t ≤ t_0,

∫_R (L_n1(t, F(x)) + α_n1(t, F(x)))² dH_n(x) = ∫_0^1 (L_n1(t, F(H_n^{−1}(y))) + α_n1(t, F(H_n^{−1}(y))))² dy.

Let

(2.20) H̄_n(x) = E H_n(x) = (1/n)[τ F(x) + (n−τ) G(x)].

Note that

(2.21) sup_{x∈R} |H_n(x) − F(x)| ≤ sup_{x∈R} |H_n(x) − H̄_n(x)| + sup_{x∈R} |H̄_n(x) − F(x)| →_P 0

as n → ∞, because of the Glivenko–Cantelli lemma and because of H_1n. Next, to see that {L_n1(t, F(H_n^{−1}(y))) + α_n1(t, F(H_n^{−1}(y)))} and {L_n1(t, y) + α_n1(t, y)} have the same limiting distribution, observe that

sup_{t∈I} sup_{y∈[0,1]} |L_n1(t, F(H_n^{−1}(y))) + α_n1(t, F(H_n^{−1}(y))) − L_n1(t, y) − α_n1(t, y)|
 ≤ sup_{t∈I} sup_{y∈[0,1]} |L_n1(t, F(H_n^{−1}(y))) − L_n1(t, y)| + sup_{t∈I} sup_{y∈[0,1]} |α_n1(t, F(H_n^{−1}(y))) − α_n1(t, y)| →_P 0 as n → ∞,

since

sup_{y∈[0,1]} |F(H_n^{−1}(y)) − y| ≤ sup_{y∈[0,1]} |F(H_n^{−1}(y)) − H_n(H_n^{−1}(y))| + 1/n ≤ sup_{s} |F(s) − H_n(s)| + 1/n →_P 0 as n → ∞,

because of (2.21) and because L_n1(t, y) and α_n1(t, y) are tight. By Lemma 2, for t ≤ t_0,

L_n1(t, y) + α_n1(t, y) ⇒ L_1(t, y) + α_1(t, y) in (D(I×[0,1]), ‖·‖)

and

∫_0^1 [L_n1(t, y) + α_n1(t, y)]² dy →_D ∫_0^1 [L_1(t, y) + α_1(t, y)]² dy

in (D(I), ‖·‖) as n → ∞. Similarly, for t > t_0,

∫_0^1 [L_n2(t, y) + α_n2(t, y)]² dy →_D ∫_0^1 [L_2(t, y) + α_2(t, y)]² dy

as n → ∞. It follows that (1) and (2) in Theorem 2 hold. □

3. ESTIMATING A CHANGE-POINT AND A LOCATION PARAMETER

An estimator τ̂_n of τ is given by

τ̂_n = the smallest k such that S_nk = max_{j∈N(ε)} S_nj.
(3.1)

Here N(ε) = {j : [nε] ≤ j ≤ n − [nε]} for a fixed small ε > 0. When H_1 is true and G(·) = F(· − θ), an estimator θ̂_n of θ is

θ̂_n = med{X_j − X_i : 1 ≤ i ≤ τ̂_n < j ≤ n}.

3-1. THE CONSISTENCY OF THE ESTIMATOR OF A CHANGE-POINT

In this section it is shown that τ̂_n/n → t_0 as n → ∞, that is, (τ̂_n − τ)/n → 0 in probability, at the rate O(n^{−1/2+δ}) for δ > 0, where τ = [n t_0], t_0 ∈ I. Note that all expectations in this section are computed under H_1. In what follows, j is used instead of [nt] for the sake of convenience. From (2.13),

(3.4) E V_nj(x) = E{(1/j) Σ_{i=1}^{j} I[X_i ≤ x] − (1/(n−j)) Σ_{i=j+1}^{n} I[X_i ≤ x]}
 = ((n−τ)/(n−j)) [F(x) − G(x)],  j ≤ τ;
 = (τ/j) [F(x) − G(x)],  j > τ.

(3.5) Var V_nj(x)
 = (1/j) F(x)[1−F(x)] + ((τ−j)/(n−j)²) F(x)[1−F(x)] + ((n−τ)/(n−j)²) G(x)[1−G(x)],  j ≤ τ;
 = (τ/j²) F(x)[1−F(x)] + ((j−τ)/j²) G(x)[1−G(x)] + (1/(n−j)) G(x)[1−G(x)],  j > τ.

Use E X² = {E X}² + Var X to compute

(3.6) E V²_nj(x)
 = ((n−τ)/(n−j))² [F(x)−G(x)]² + (1/j) F[1−F] + ((τ−j)/(n−j)²) F[1−F] + ((n−τ)/(n−j)²) G[1−G],  j ≤ τ;
 = (τ/j)² [F(x)−G(x)]² + (τ/j²) F[1−F] + ((j−τ)/j²) G[1−G] + (1/(n−j)) G[1−G],  j > τ.

Let H(x) = t_0 F(x) + (1−t_0) G(x), x ∈ R, and

(3.7) f_n(t) = n^{−1} ([nt](n−[nt])/n) ∫_R E V²_n[nt](x) dH̄_n(x), t ∈ I.

From (2.20),

(3.8) sup_{x∈R} |H̄_n(x) − H(x)| → 0.

From (3.8), since the variance terms of (3.6) converge to 0 and H̄_n(x) → H(x) uniformly in x as n → ∞, the pointwise limit of f_n(t) is

(3.9) f(t) := lim_{n→∞} f_n(t)
 = lim_{n→∞} ((n−τ)²/n²)([nt]/(n−[nt])) ∫_R [F(x)−G(x)]² dH(x) = (1−t_0)² (t/(1−t)) B(t_0),  t ≤ t_0;
 = lim_{n→∞} (τ²/n²)((n−[nt])/[nt]) ∫_R [F(x)−G(x)]² dH(x) = t_0² ((1−t)/t) B(t_0),  t > t_0,

where B(t_0) = ∫_R [F(x) − G(x)]² dH(x). Let f̂_n(t) = n^{−1}([nt](n−[nt])/n) ∫_R V²_n[nt](x) dH_n(x) denote the empirical analogue of f_n(t). We need the following lemma to prove Theorem 3.

Lemma 3. Under H_1, for any δ > 0,

(3.10) n^{1/2−δ} sup_{t∈I} |f_n(t) − f̂_n(t)| →_P 0 as n → ∞.

Proof.

|f̂_n(t) − f_n(t)| = (j(n−j)/n²) |∫_R V²_nj dH_n − ∫_R E V²_nj dH̄_n|
 ≤ (j(n−j)/n²) {∫_R |V²_nj − E V²_nj| dH_n + ∫_R |H_n − H̄_n| d E V²_nj}.   (3.11)

Observe that j(n−j)/n² → t(1−t) uniformly in t ∈ I as n → ∞, and that for j ≤ τ,

V_nj(x) − E V_nj(x) = (1/j) Σ_{i=1}^{j} {I[X_i ≤ x] − F(x)} − (1/(n−j)) Σ_{i=j+1}^{τ} {I[X_i ≤ x] − F(x)} − (1/(n−j)) Σ_{i=τ+1}^{n} {I[X_i ≤ x] − G(x)}
 = (√n/j) K_n1(t, F(x)) − (√n/(n−j)) K_n2(t_0 − t, F(x)) − (√n/(n−j)) K_n3(1−t_0, G(x)),

where the K_ni's are as in the proof of Lemma 1.
The first term of (3.11) tends to zero because n^{1/2−δ} sup_{t∈I} (√n/[nt]) → 0 and n^{1/2−δ} sup_{t∈I} (√n/(n−[nt])) → 0 as n → ∞, and the K_ni's are tight, i = 1, 2, 3, which follows from Lemma 1. The second term of (3.11) tends to zero because j(n−j)/n² → t(1−t) uniformly in t ∈ I as n → ∞ and n^{1/2−δ} sup_{x} |H_n(x) − H̄_n(x)| →_P 0 as n → ∞ by the Glivenko–Cantelli lemma. Hence (3.10) holds. □

The consistency of τ̂_n is given by

Theorem 3. Under H_1, for any δ > 0, n^{1/2−δ} |t̂_n − t_0| →_P 0 as n → ∞, where t̂_n = τ̂_n/n.

Proof. Note that

(3.12) n^{1/2−δ} sup_{t∈I} |f_n(t) − f(t)| → 0 as n → ∞,

where f_n(t) and f(t) are as in (3.7) and (3.9), since

n^{1/2−δ} sup_{t∈I} |f_n(t) − f(t)| ≤ {[(2/n^{1/2+δ}) ([nt_0]/(n−[nt_0])) + n^{1/2−δ}/(n−[nt_0])] I[t ≤ t_0] + [(2/n^{1/2+δ}) ((n−[nt_0])/[nt_0]) + n^{1/2−δ}/[nt_0]] I[t > t_0]} B(t_0).

Claim: to prove the theorem it suffices to show that for any δ > 0,

(3.13) n^{1/2−δ} |f(t̂_n) − f(t_0)| →_P 0 as n → ∞.

To prove (3.13), first note that for any 0 < u < 1,

f(u) = ∫_0^u g(t) dt, where g(t) = B(t_0) [((1−t_0)/(1−t))² I[t ≤ t_0] − (t_0/t)² I[t > t_0]].

For t̂_n ≤ t_0,

f(t_0) − f(t̂_n) = ∫_{t̂_n}^{t_0} g(t) dt = B(t_0)(1−t_0)² ∫_{t̂_n}^{t_0} (1−t)^{−2} dt ≥ B(t_0) ((1−t_0)²/(1−t̂_n)²)(t_0 − t̂_n) ≥ B(t_0)(1−t_0)²(t_0 − t̂_n),

so that

(3.14) 0 ≤ n^{1/2−δ}(t_0 − t̂_n) ≤ n^{1/2−δ} [f(t_0) − f(t̂_n)] / ((1−t_0)² B(t_0)).

For t̂_n > t_0,

f(t̂_n) − f(t_0) = ∫_{t_0}^{t̂_n} g(t) dt = −B(t_0) t_0² ∫_{t_0}^{t̂_n} t^{−2} dt, so f(t_0) − f(t̂_n) = B(t_0) t_0² ∫_{t_0}^{t̂_n} t^{−2} dt ≥ B(t_0) t_0² (t̂_n − t_0).

Thus for t̂_n > t_0,

(3.15) 0 ≤ n^{1/2−δ}(t̂_n − t_0) ≤ n^{1/2−δ} [f(t_0) − f(t̂_n)] / (t_0² B(t_0)).

By (3.14) and (3.15), the claim is proved. To see that n^{1/2−δ}|f(t̂_n) − f(t_0)| →_P 0 as n → ∞, observe that

|f(t̂_n) − f(t_0)| ≤ |f(t̂_n) − f̂_n(t̂_n)| + |f̂_n(t̂_n) − f(t_0)|
 ≤ sup_{t∈I} |f(t) − f̂_n(t)| + |sup_{t∈I} f̂_n(t) − sup_{t∈I} f(t)|
 (by the definition of t̂_n and because f(t_0) = sup_{t∈I} f(t))
 ≤ 2 sup_{t∈I} |f(t) − f̂_n(t)|
 ≤ 2 sup_{t∈I} |f(t) − f_n(t)| + 2 sup_{t∈I} |f_n(t) − f̂_n(t)|.   (3.16)

The first term of (3.16) tends to zero by (3.12) and the second term of (3.16) tends to zero by Lemma 3. □

3-2. THE ASYMPTOTIC DISTRIBUTION OF τ̂_n UNDER PARTICULAR ALTERNATIVES

In this section the limiting distribution of τ̂_n under H_1n is obtained.

Theorem 4. Let k = [n t_k], t_k ∈ I.
Then

P_{H_1n}{τ̂_n > k} → Pr{sup[S_{t_0}(t) : ε ≤ t ≤ t_k] < sup[S_{t_0}(t) : t_k < t ≤ 1−ε]} as n → ∞.

Proof. Note that

{τ̂_n > k} = {max(S_nj : [nε] ≤ j ≤ k) < max(S_nj : k < j ≤ n − [nε])},

and the result follows from Theorem 2. □

3-3. THE ASYMPTOTIC NORMALITY OF THE ESTIMATOR OF A LOCATION PARAMETER

Lemma 4. Under H_1, for any x ∈ R, as n → ∞,

(1) (τ̂_n/(n−τ̂_n))^{1/2} n^{−1/2} Σ_{i=τ̂_n+1}^{n} (I[X_i ≤ x] − G(x)) = (τ̂_n/(n−τ̂_n))^{1/2} n^{−1/2} Σ_{i=τ+1}^{n} (I[X_i ≤ x] − G(x)) + o_P(1), and

(2) the analogous statement for the sums of I[X_i ≤ x] − F(x) over i ≤ τ̂_n and i ≤ τ.

Proof. We prove (1) only, since the proof of (2) is similar. On {τ ≤ τ̂_n},

(τ̂_n/(n−τ̂_n))^{1/2} n^{−1/2} Σ_{i=τ+1}^{τ̂_n} |I[X_i ≤ x] − G(x)| ≤ (τ̂_n/(n−τ̂_n))^{1/2} n^{−1/2} (τ̂_n − τ)
 = (τ̂_n n^{1/4}/n)^{1/2} (n/(n−τ̂_n))^{1/2} ((τ̂_n − τ)/n) n^{3/8} →_P 0 as n → ∞,

by Theorem 3 applied with δ = 1/4, 1/2, 1/8. □

The asymptotic normality of θ̂_n is given by

Theorem 5. If F has a continuous density h with 0 < ∫_R h²(x) dx < ∞, then under H_1 and G(·) = F(· − θ),

√k_n (θ̂_n − θ) →_D N(0, 1/(12 (∫_R h²(y) dy)²)) as n → ∞,

where k_n = τ̂_n(n−τ̂_n)/n.

Proof. Let U_n(u) = ∫_R Ĝ_{n−τ̂_n}(x) dF̂_{τ̂_n}(x − u), u ∈ R. Then

U_n(u) = (1/(τ̂_n(n−τ̂_n))) Σ_{i=1}^{τ̂_n} Σ_{j=τ̂_n+1}^{n} I[X_j − X_i ≤ u].

Note that U_n(θ̂_n−) ≤ 1/2 ≤ U_n(θ̂_n), and hence

(3.17) {U_n(u) > 1/2} ⊆ {θ̂_n ≤ u} ⊆ {U_n(u) ≥ 1/2}.

Consider the event {√k_n (θ̂_n − θ) ≤ x}, x ∈ R. From (3.17) we get the following relation:

(3.18) P{U_n(x/√k_n + θ) > 1/2} ≤ P{θ̂_n ≤ x/√k_n + θ} ≤ P{U_n(x/√k_n + θ) ≥ 1/2}.

If the first and third terms of (3.18) have the same limit, then we will get the limiting distribution of √k_n (θ̂_n − θ). Consider the first term:

(3.19) {U_n(x/√k_n + θ) > 1/2} = {√k_n [U_n(x/√k_n + θ) − ∫_R G(y + x/√k_n + θ) dF(y)] > √k_n [1/2 − ∫_R G(y + x/√k_n + θ) dF(y)]}.

Consider

√k_n [1/2 − ∫_R G(y + x/√k_n + θ) dF(y)] = √k_n [∫_R F(y) dF(y) − ∫_R G(y + x/√k_n + θ) dF(y)]
 = √k_n ∫_R [F(y) − F(y + x/√k_n)] dF(y) → (−x) ∫_R h(y) dF(y) = (−x) ∫_R h²(y) dy as n → ∞.
n (D 'x‘ Consider i/k; [Una/fig + 0) — IRG(y+x/JE'I; +0)dF(y)] = 91231116211; (y+x/./k;+0)dI3‘;n(y) - JR G(y+X/~/1§+0)dF(y)l T n = 91;; [1R(én_;n(y+x/,/1§1‘+o) — G(y+x/1E,+ 9))d13‘;n(y) + 1RG(Y+X/1/I<;+ 9)d[F;n(Y)-F(y)ll = «R; [1R(én_;n(y+x/,/t;+ a) — G(y+x/./1§;+0))dr(y) + 1R1é,_;n1y+x/¢r,‘+o) — Goa/75+ 0))d113“; 1y)—F1y)11 II + JR; [1RG(y+x/,/l§+ 0)d[F;n(Y)-F(Y))l 26 =(rn/(n—rn))1/2 IRK2,n_:rn(y+x/Jk;+0)dF(y) (3.20) + 1‘r,/1“r,))1/2 1RK2,,_;n1y+x/7k—,+o)d11~‘;n1y)—F1y)1 13.21) _ ((n.?rn)/3rn)1/2 JRK1,;n(y)dG(y+x/Jk; +0) (3.22) by integrating by parts, where K2 n—i' (u) = (IA/ID .E (I[Xi S Ill—GUI», n ._ I—Tn+1 K1,;n1u) = 1N?) if; 111xisu1—F1u)). To get the limit of (3.21), note that (rn/(n—rn))1/2EI—a(tO/(l—t0)1/2 as n—-ioo, and K2,n—Tn(Y+x/‘/§+ 0) =11/7m :5 111xjsy+x/./R;+01—G1y+x/7E,+0) j=Tn+1 = (l/mj_§+1(1[xjsy+x/./1g+111—o(y+x/,/rr;+o))+(rn/(n—rn))—1/2op(1) by Lemma 4. 27 By Lemma 1, (3.20) converges weakly to (to/(l—t0))1/Zj1 K2(1—t0,u)du 0 in uniform metric. To get the limit of (3.21), note that 1RK2,_:, 1y+x/7k;+o)d1F; 1y)—F1y)1 n n = 1RK2,,_:,n1y + f + 0d113“;n1y)—F1y)1 = l K2,,_;n(y + 4L1? + 0)dF;n(Y) - IRK2,,_;n(y + fi- + 9)dF(Y) R n n = )1K2 ‘ (F71(u) + _x_ +0)du — le _‘ (F—1(u) + i + 0)du 0 ,Il—Tn Tn JR; 0 2,n Tn n 1 . “—1 . —1 = {0(K2,n_Tn(F; (u)+0) — K2,n—rn(F (u)+0))du + op(1) I1 P—r—i 0 as n ——1 00 since {K2 n_‘T (t)} is tight by its convergence and . 7 n sup 1F;1(n)—F‘1(n) 3L0 as n—aa. OSuSI n To get the limit of (3.22), use the same method as (3.20) and we get the 1 limit, ((1—t0) /t0)1/ 2 ) K1(t0,u)du where K1(t,u) is the Kiefer process and 0 independent of K2(t,u). 28 Therefore the limit of LHS of (3.19) is 1/2 1 1/2 1 (to/(l—t0)) )0K2(1—t0,u)du — ((1—t0)/t0) )0K1(t0,u)du (3.23) 1 Note that } K1(t0,u)du has a normal distribution with mean 0 since for 0 each ue[0,1], EK1(t0,u) = 0, K1(t0,u) is normal and 1 a.s. n 1 j K (t ,u)du = lim 2 (k/n) K (t ,k/n). So is j K (l—t ,u)du. 
The variance of (3.23) is 1/12, since

Var ∫_0^1 K_1(t_0, u) du = E(∫_0^1 K_1(t_0, u) du)² = ∫_0^1 ∫_0^1 E K_1(t_0, u) K_1(t_0, v) du dv = ∫_0^1 ∫_0^1 t_0 (u∧v − uv) du dv
 = 2 t_0 ∫∫_{u≤v} (u − uv) du dv = (1/12) t_0,

and

Var ∫_0^1 K_2(1−t_0, u) du = (1/12)(1−t_0).

Hence the first term of (3.19) tends to

Pr{(1/√12) Z > (−x) ∫_R h²(y) dy} = Pr{(1/√12) Z < x ∫_R h²(y) dy} = Φ(√12 x ∫_R h²(y) dy),

and the third term of (3.19) has the same limit, where Z ~ N(0,1) and Φ is the cumulative distribution function of the standard normal. □

4. FINITE SAMPLES

4-1. DISTRIBUTIONS OF ESTIMATORS IN FINITE SAMPLES

To get some idea of the distributions of the estimators in finite samples, Tables 2 and 3 give some simulated results. For each row of Table 2 and Table 3, 1000 samples of size n (n = 40, 100) were generated from N(0,1) with a shift θ (θ = 1.0) at τ ∈ {n/4, n/2}, and ε = .05. The percentiles of τ̂_n and the sample mean x̄ and standard deviation s of θ̂_n are given.

TABLE 2. Distributions of estimators for n = 40

            Percentiles of τ̂_n           θ̂_n
 τ    θ    10   25   50   75   90      x̄      s
10   1.0    5    7    9   13   24    .942   .479
20   1.0   11   16   19   20   25    .959   .376

TABLE 3. Distributions of estimators for n = 100

            Percentiles of τ̂_n           θ̂_n
 τ    θ    10   25   50   75   90      x̄      s
25   1.0   20   24   25   27   33    .969   .242
50   1.0   45   48   50   52   56    .972   .238

From the tables we see that τ̂_n is almost median unbiased, in the sense that the median of its distribution is nearly equal to the true value of the parameter. Looking at θ̂_n, θ is underestimated, but θ̂_n gets closer to θ as n increases, regardless of τ. For each n, θ̂_n has smaller variance when τ = n/2. The comparison of our test with other tests is given in the next section.

4-2. COMPARISON

We will compare the test M_n with Schectman's test (1983). To compare them, their empirical powers are computed.
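Schectman's statistic, defined next, standardizes the Mann–Whitney count at every candidate split. A minimal sketch using the standard null moments E W_i = i(n−i)/2 and Var W_i = (n+1)i(n−i)/12; counting ties as 1/2 is my own assumption, and the toy sample is for illustration only:

```python
def schectman_v(xs):
    # V_i = (W_i - i(n-i)/2) / sqrt((n+1) i (n-i) / 12), where W_i counts
    # pairs (a from the first i observations, b from the rest) with b > a;
    # counting ties as 1/2 is an assumption of this sketch.
    n = len(xs)
    V = {}
    for i in range(1, n):
        first, second = xs[:i], xs[i:]
        w = sum(1.0 if b > a else 0.5 if b == a else 0.0
                for a in first for b in second)
        mean, var = i * (n - i) / 2, (n + 1) * i * (n - i) / 12
        V[i] = (w - mean) / var ** 0.5
    return V, max(abs(v) for v in V.values())

V, v_star = schectman_v([0, 0, 0, 0, 5, 5, 5, 5])
```

On this sample |V_i| peaks at the true split i = 4, where every one of the 16 between-segment pairs has the later observation larger.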
Schectman's test (1983) is as follows. Let W_i be the Mann–Whitney statistic for the two samples X_1, ..., X_i and X_{i+1}, ..., X_n of sizes i and n−i, let

V_i = [W_i − (1/2) i(n−i)] / [(n+1) i(n−i)/12]^{1/2},

and define the test statistic by

V* = max_{1≤i≤n−1} |V_i|.

For each row of Table 4, 1000 samples of size n (n = 40) were generated from the normal and the double exponential distributions with θ (θ = 1.0). Let Π_S and Π_k denote the empirical powers at α = .05 of Schectman's test and the M_n test, respectively, that is, the proportion of the 1000 samples for which H_0 was rejected.

TABLE 4. Empirical powers of M_n and Schectman's test for n = 40

From N(0,1):
 τ    θ    Π_S    Π_k
10   1.0   .301   .303
20   1.0   .405   .405

From double exponential:
 τ    θ    Π_S    Π_k
10   1.0   .494   .494
20   1.0   .683   .676

To compare M_n with Schectman's test, see Table 4. In every case of τ, Schectman's test and the M_n test have almost the same power.

REFERENCES

(1) Anderson, T.W. (1962). On the distribution of the two-sample Cramér–von Mises criterion. Ann. Math. Statist., 33, pp. 1148–1153.
(2) Billingsley, P. (1968). Convergence of probability measures. Wiley.
(3) Csorgo, M. (1983). Quantile processes with statistical applications. SIAM.
(4) Hawkins, D.L. (1985). A simple least squares method for estimating a change in mean. Tech. Report, Univ. of Texas at Arlington.
(5) Hawkins, D.M. (1977). Testing a sequence of observations for a shift in location. JASA, 72, pp. 180–186.
(6) Hinkley, D.V. (1969a). Inference about the intersection in two-phase regression. Biometrika, 56, pp. 495–504.
(7) Hinkley, D.V. (1969b). On the ratio of two correlated normal random variables. Biometrika, 56, pp. 635–639.
(8) Hinkley, D.V. (1970). Inference about the change-point in a sequence of random variables. Biometrika, 57, pp. 1–17.
(9) Hinkley, D.V. (1971). Inference about the change-point from cumulative sum tests. Biometrika, 58, pp. 509–523.
(10) Hinkley, D.V. (1980). Change-point problems. Tech. Report, Univ. of Minnesota.
(11) Hinkley, D.V. and Hinkley, E.A. (1970).
Inference about the change-point in a sequence of binomial variables. Biometrika, 57, pp. 477–488.
(12) Hodges, J.L. and Lehmann, E.L. (1963). Estimates of location based on rank tests. Ann. Math. Statist., 34, pp. 598–611.
(13) Moustakides, G.V. (1986). Optimal stopping times for detecting changes in distributions. Ann. Statist., 14, pp. 1379–1387.
(14) Page, E.S. (1954). Continuous inspection schemes. Biometrika, 41, pp. 100–114.
(15) Page, E.S. (1955). A test for a change in a parameter occurring at an unknown point. Biometrika, 42, pp. 523–527.
(16) Page, E.S. (1957). On problems in which a change in a parameter occurs at an unknown point. Biometrika, 44, pp. 248–252.
(17) Pettitt, A.N. (1979). A non-parametric approach to the change-point problem. Appl. Statist., 28, pp. 126–135.
(18) Pettitt, A.N. (1980). A simple cumulative sum type statistic for the change-point problem with zero-one observations. Biometrika, 67, pp. 79–84.
(19) Pollak, M. (1985). Optimal detection of a change in distribution. Ann. Statist., 13, pp. 206–227.
(20) Schectman, E. (1982). A non-parametric test for detecting changes in location. Commun. Statist. Theory Meth., 11(13), pp. 1475–1482.
(21) Schectman, E. (1983). A conservative non-parametric distribution-free confidence bound for the shift in the change-point problem. Commun. Statist. Theory Meth., 12(21), pp. 2455–2464.
(22) Shorack, G.R. and Wellner, J.A. (1986). Empirical processes with applications to statistics. Wiley.
(23) Sen, P.K. (1980). Asymptotic theory of some tests for a possible change in the regression slope occurring at an unknown time point. Z. Wahrsch. Verw. Gebiete, 52, pp. 203–218.
(24) Sen, P.K. (1982). Tests for change-points based on recursive U-statistics. Seq. Anal., 1, pp. 263–284.
(25) Srivastava, M. and Sen, A. (1975). On tests for detecting change in mean. Ann. Statist., 3, pp. 98–108.
(26) Wolfe, D.A. and Schectman, E. (1984). Nonparametric statistical procedures for the change-point problem. Journal of Statist.
Planning and Inference, 9, pp. 389–396.