This is to certify that the dissertation entitled

Minimum Distance Regression and Autoregressive Model Fitting

presented by Pingping Ni has been accepted towards fulfillment of the requirements for the Ph.D. degree in Statistics.

Major professor: Hira L. Koul
Date: May 20, 2002

MSU is an Affirmative Action/Equal Opportunity Institution

MINIMUM DISTANCE REGRESSION AND AUTOREGRESSIVE MODEL FITTING

By

Pingping Ni

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

2002

ABSTRACT

MINIMUM DISTANCE REGRESSION AND AUTOREGRESSIVE MODEL FITTING

By

Pingping Ni

This work proposes a class of tests for fitting a parametric regression model to a regression function when the underlying design variables are random and the model is possibly heteroscedastic. These tests are based on certain minimized L₂ distances between a nonparametric regression function estimator and the parametric model being fitted. The work obtains the asymptotic distribution of the proposed statistic under the null hypothesis. It also derives the asymptotic distribution of the corresponding minimum distance estimator. A class of tests based on a slightly different L₂ distance for fitting a parametric autoregressive model to an autoregressive function is also proposed in this thesis. The asymptotic properties of the underlying parameter estimator and the corresponding minimized distance are derived.

Copyright by

PINGPING NI

2002

ACKNOWLEDGMENTS

I would like to thank my advisor, Professor Hira L. Koul, for his guidance and many helpful discussions on the subject of this thesis.
He was always available when I had doubts or questions. His general way of thinking about statistical problems, and his ways of solving them, will help my future research and work. I would also like to thank all the other committee members, Professors Connie Page, Habib Salehi, and Lijian Yang, for serving on my guidance committee. Many thanks to Professor Connie Page for her advice when I was at the consulting service, to Professors Vincent Melfi, Habib Salehi, and James Stapleton, and to Cathy Sparks for their help with my simulation study. Finally, I would like to thank the Department of Statistics and Probability for offering me graduate assistantships so that I could come to the States to complete my graduate studies at Michigan State University. This research was partly supported by NSF Grant DMS 0071619, PI: Professor Hira Koul.

TABLE OF CONTENTS

LIST OF TABLES vii

LIST OF FIGURES ix

1 Introduction

2 Minimum Distance Regression Model Fitting 14
  2.1 Introduction 14
  2.2 Assumptions 15
  2.3 Consistency of θ*_n and θ̂_n 18
  2.4 Asymptotic distribution of θ̂_n 26
  2.5 Asymptotic distribution of the minimized distance 38
  2.6 Simulations 51

3 Minimum Distance Autoregressive Model Fitting 65
  3.1 Introduction 65
  3.2 Assumptions 67
  3.3 Consistency of θ_n 70
  3.4 Asymptotic distribution of √n(θ_n − θ₀) 76
  3.5 Asymptotic behavior of the minimum distance 86

4 Simulations 101

BIBLIOGRAPHY 122

LIST OF TABLES

2.1 Empirical sizes and powers for testing model 0 vs. models 1 to 4. 57
4.1 Tests for model 1 vs. model 2 with double exponential errors. 106
4.2 Tests for model 1 vs. model 2 with N(0, 0.1) errors. 107
4.3 Tests for model 1 vs.
model 3 with N(0, 0.1) errors. 107
4.4 Mean and s.d.(θ_n) under model 1 with double exponential errors. 108
4.5 Mean and s.d.(θ_n) under model 1 with normal errors. 108

LIST OF FIGURES

2.1 The density curve of √n(θ̂_n − 1). 53
2.2 The density curve of nh^{d/2}(M_{hw}(θ̂_n) − Ĉ_n). 54
2.3 The density of √n(θ_{1n} − 0.5). 58
2.4 The density of √n(θ_{2n} − 0.8). 59
2.5 The 2-dimensional density of √n(θ_n − θ₀) when n = 30. 60
2.6 The 2-dimensional density of √n(θ_n − θ₀) when n = 50. 61
2.7 The 2-dimensional density of √n(θ_n − θ₀) when n = 100. 62
2.8 The 2-dimensional density of √n(θ_n − θ₀) when n = 200. 63
2.9 The density of the test statistic under H₀. 64
4.1 The density of √n(θ_n − 0.8) when the errors are double exponential. 110
4.2 The density of √n(θ_n − 0.8) when the errors are N(0, 0.1). 111
4.3 The density of T_n(θ_n) under model 1 with double exponential errors. 112
4.4 The density of T_n(θ_n) under model 2 with double exponential errors. 113
4.5 The density of T_n(θ_n) under model 1 with N(0, 0.1) errors. 114
4.6 The density of T_n(θ_n) under model 2 with N(0, 0.1) errors. 115
4.7 The density of T_n(θ_n) under model 3 with N(0, 0.1) errors. 116
4.8 The density of the suitably scaled minimized distance under model 1 with double exponential errors. 117
4.9 The density of the suitably scaled minimized distance under model 2 with double exponential errors. 118
4.10 The density of the suitably scaled minimized distance under model 1 with N(0, 0.1) errors. 119
4.11 The density of the suitably scaled minimized distance under model 2 with N(0, 0.1) errors. 120
4.12 The density of the suitably scaled minimized distance under model 3 with N(0, 0.1) errors.
121

Chapter 1

Introduction

This thesis is concerned with the classical problem of using a set of variables, say a d-dimensional variable X, to explain a response Y, a one-dimensional real variable. In practice this is often done in terms of the conditional mean function of Y given X, known as the regression function and defined as

μ(x) = E(Y | X = x), x ∈ ℝ^d,

assuming, of course, E|Y| < ∞. In the context of time series, where X may be the vector of the previous d lagged variables, μ is called the autoregressive function. To be specific, let {(X_i, Y_i) : i = 1, …, n} be observable random variables, where (X_i, Y_i) has the same distribution as (X, Y) for all 1 ≤ i ≤ n. They are said to obey a regression model with regression function μ if, in addition, {(X_i, Y_i) : i = 1, …, n} are independent and identically distributed (i.i.d.). The data are said to have come from an autoregressive model of order d = 1 with autoregressive function μ if, in addition, X_{n+1} is also observable and Y_i = X_{i+1}, 1 ≤ i ≤ n. Let Θ ⊂ ℝ^q and let {m_θ(·) : θ ∈ Θ} be a given set of parametric models. The statistical problem addressed in this thesis is that of model checking, i.e., testing the goodness-of-fit hypothesis

(1.0.1) H₀ : μ(x) = m_{θ₀}(x), for some θ₀ ∈ Θ and for all x ∈ I, vs. H₁ : H₀ is not true,

based on the given data, where I is a compact subset of ℝ^d. Several researchers have used nonparametric techniques for model checking in the regression and autoregressive settings since the late 1980's. For instance, Eubank and Spiegelman (1990), Eubank and Hart (1992, 1993), Härdle and Mammen (1993), Stute (1996), and Stute, Thies, and Zhu (1998) address this problem in the regression setting, while An and Cheng (1991), Hjellvik, Yao, and Tjøstheim (1997), and Koul and Stute (1999) do so in the autoregressive setting. In the regression context, some of these works focus on a fixed design rather than a random one, and impose some restrictive assumptions on the error distribution.
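Throughout the thesis, the tests compare a kernel-smoothed estimate of the regression function μ with the fitted parametric family. As a point of reference, a minimal Nadaraya–Watson estimator of μ(x) can be sketched as follows; the uniform kernel, the bandwidth, and the toy linear model m_θ(x) = θx are illustrative assumptions, not the exact choices made later in the thesis.

```python
import numpy as np

def nw_estimate(x, X, Y, h):
    """Nadaraya-Watson kernel estimate of mu(x) = E(Y | X = x).

    Uses a uniform kernel on [-1, 1]; the thesis's results allow general
    kernel densities K with K_h(u) = K(u / h) / h^d.
    """
    w = (np.abs((x - X) / h) <= 1.0).astype(float)  # kernel weights
    if w.sum() == 0.0:
        return np.nan  # no observations in the local window
    return (w * Y).sum() / w.sum()

# Toy check: data generated under a null model mu(x) = theta0 * x.
rng = np.random.default_rng(0)
theta0 = 1.0
X = rng.uniform(-1.0, 1.0, size=2000)
Y = theta0 * X + rng.normal(0.0, 0.1, size=2000)
mu_hat = nw_estimate(0.5, X, Y, h=0.1)  # should be close to 0.5
```

Under H₀ the smoothed estimate and the parametric fit should agree up to sampling error, which is what the minimized L₂ distances below quantify.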
The proposed tests in these papers, except those of Stute (1996), Stute et al. (1998), and Koul and Stute (1999), are based on some nonparametric estimator of the regression function, while the tests in the latter papers are based on a certain partial-sum empirical process of the residuals. Here we shall briefly summarize the contents of some of these papers.

Eubank and Spiegelman (1990) consider the sequence of models where d = 1 and, at stage n, X_i = x_{in} with 0 ≤ x_{1n} < ⋯ < x_{nn} ≤ 1 known and nonrandom, and where

μ(x_{in}) = β₀ + β₁ x_{in} + f(x_{in}), 1 ≤ i ≤ n,

and f is a smooth unknown function. Moreover, here the errors Y_i − μ(x_{in}) are assumed to be i.i.d. N(0, τ²) with τ² unknown. It is also assumed that the x_{in} are generated by a continuous positive density w on [0, 1] through the relation

∫₀^{x_{in}} w(x) dx = (2i − 1)/2n.

The problem addressed in this paper is to test the hypothesis f = 0 versus the alternative that f ∈ L₂(w)/{1, x}, f is absolutely continuous, and its a.e. derivative f′ is absolutely continuous and square integrable. Here the space L₂(w)/{1, x} consists of all functions in L₂(w) orthogonal to 1 and the identity function. The paper proposes two tests. For one, they assume that f = T_{np} α, α ∈ ℝ^p, where T_{np} is a vector of known functions orthogonal to 1 and the identity function. Then the test is based on the least squares estimators of β₀, β₁, and α. The other test is based on the spline estimation of f and the least squares estimators of β₀, β₁. They prove the asymptotic normality of their proposed statistics under their null hypothesis. We note that the problem addressed in this paper may be thought of as equivalent to fitting a simple linear regression model, i.e., to testing H₀ of (1.0.1) with q = 2 and m_θ(x) = (1, x)θ, against a nonparametric class of alternatives.

Härdle and Mammen (1993) consider the problem of testing H₀ based on the model

(1.0.2) Y_i = μ(X_i) + ε_i,

where the ε_i's are allowed to be heteroscedastic with E(ε_i | X_i) = 0 and E(ε_i² | X_i = x) = σ²(x).
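The Eubank–Spiegelman design relation ∫₀^{x_{in}} w(x) dx = (2i − 1)/2n says the design points are quantiles of the density w. This can be sketched numerically as below; the grid-based inversion of the integral is an illustrative device of this sketch, not a construction from the paper.

```python
import numpy as np

def design_points(n, w, grid_size=100001):
    """Solve int_0^{x_in} w(x) dx = (2i - 1) / (2n) for i = 1..n.

    w is a positive density on [0, 1]; its integral is inverted
    numerically on a fine grid.
    """
    t = np.linspace(0.0, 1.0, grid_size)
    cdf = np.cumsum(w(t)) * (t[1] - t[0])  # crude cumulative integral
    cdf /= cdf[-1]                         # normalize so cdf(1) = 1
    targets = (2.0 * np.arange(1, n + 1) - 1.0) / (2.0 * n)
    return np.interp(targets, cdf, t)      # invert the cdf at the targets

# With the uniform density w = 1 the points are x_in = (2i - 1) / (2n).
pts = design_points(5, lambda t: np.ones_like(t))
```

For non-uniform w the points cluster where w is large, which is exactly the role the design density plays in their asymptotics.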
They propose a class of tests based on

(1.0.3) M_{hh}(θ) := ∫ [n⁻¹ Σ_{i=1}^n K_h(x − X_i)(Y_i − m_θ(X_i))]² {f̂_h(x)}⁻² dG(x),

f̂_h(x) := n⁻¹ Σ_{i=1}^n K_h(x − X_i), x ∈ ℝ^d, K_h(x) := h^{−d} K(x/h),

where K is a kernel density function on [−1, 1]^d and G is a σ-finite measure on ℝ^d. Their test is based on the statistic T_n := nh^{d/2} M_{hh}(θ̂), where the estimator θ̂ and the null model are assumed to satisfy the condition

(1.0.4) m_{θ̂}(x) − m_{θ₀}(x) = (1/n) Σ_{i=1}^n ⟨η(x), γ(X_i)⟩ ε_i + o_p((n log n)^{−1/2}), uniformly in x.

Here η and γ are bounded functions taking values in ℝ^k for some k. It is pointed out in the paper that this assumption holds for linear models, and for the weighted least squares estimators in nonlinear models if m_θ(·) is "smooth", with ṁ_θ(·) = (∂/∂θ)m_θ(·) at θ = θ₀. Apart from the usual assumptions, such as that the kernel K is symmetric and twice continuously differentiable with compact support, that X lies in a compact set with probability 1, and that the density f of X is bounded away from zero and infinity, they also assume that h_n = cn^{−1/(d+4)} for some known constant c > 0, that the regression function μ and the density f are twice continuously differentiable, and that E exp(tε_i) is uniformly bounded in i for |t| small enough. Under some additional assumptions, they conclude that the asymptotic null distribution of nh^{d/2}(M_{hh}(θ̂) − C_n) is N(0, V), where C_n is a centering constant depending on μ = m_{θ₀}, the kernel K, and h^{−d/2}, and where

V = 2 ∫ (σ⁴(x) g²(x) / f²(x)) dx ∫ (K^{(2)}(t))² dt,

K^{(2)} denotes the convolution K ∗ K, and g is the Lebesgue density of G. The choice of bandwidth h_n = cn^{−1/(d+4)} is asymptotically optimal for the class of twice continuously differentiable regression functions. It is also crucial in obtaining the rates of uniform consistency of the nonparametric estimators of μ and f, which in turn play a crucial part in the proofs of this paper. The paper gives details of the proof for the one-dimensional case only, i.e., for the case d = 1, and it is not clear how their proof can be extended to the case d > 1 without a concern for bandwidth selection.
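The statistic T_n = nh^{d/2} M_{hh}(θ̂) above is a weighted L₂ distance between smoothed residuals and zero. A rough d = 1 sketch follows, with a uniform kernel and a uniform grid standing in for the measure G; both are simplifying assumptions of this sketch, not choices made by Härdle and Mammen.

```python
import numpy as np

def hm_statistic(X, Y, m_theta, h, grid):
    """Sketch of M_hh(theta) and T_n = n * h^{1/2} * M_hh for d = 1.

    Uses a uniform kernel on [-1, 1] and approximates the dG-integral
    by a Riemann sum over an equally spaced grid.
    """
    n = len(X)
    dx = grid[1] - grid[0]
    M = 0.0
    for x in grid:
        k = (np.abs((x - X) / h) <= 1.0) / (2.0 * h)  # K_h(x - X_i)
        f_h = k.mean()                                # density estimate
        if f_h <= 0.0:
            continue  # skip grid points with an empty window
        num = np.mean(k * (Y - m_theta(X)))           # smoothed residuals
        M += (num / f_h) ** 2 * dx
    return n * np.sqrt(h) * M

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=500)
Y = 1.0 * X + rng.normal(0.0, 0.1, size=500)
grid = np.linspace(-0.8, 0.8, 81)
Tn_null = hm_statistic(X, Y, lambda x: 1.0 * x, h=0.2, grid=grid)  # true model
Tn_alt = hm_statistic(X, Y, lambda x: 0.0 * x, h=0.2, grid=grid)   # wrong model
```

Fitting the wrong model leaves a systematic component in the smoothed residuals, so the distance blows up, which is the source of the test's power.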
These authors also conducted Monte Carlo simulations of both the distribution of the test statistic and its asymptotic distribution. Their studies show that the simulated null distribution of the test statistic has a non-negligible departure from the limiting distribution in its mean, variance, and shape, and it is also proved in the paper that the naive bootstrap does not work for degenerate U-statistics. They therefore suggest using the wild bootstrap to calculate critical values.

Stute, Thies, and Zhu (1998) also considered the problem of testing H₀ of (1.0.1) for the model (1.0.2) with d = 1. They constructed a class of test statistics by first splitting the whole sample into two parts, 1 to n₁ and n₁ + 1 to n, with n₁ → ∞ and n − n₁ → ∞. The test statistic is based on the cusum process of the residuals of the second half. Let F_{n₁} be the empirical distribution function of X_{n₁+1}, …, X_n, let θ_{n₁} be a √n₁-consistent estimator of the true parameter under the null hypothesis based on (X_i, Y_i), 1 ≤ i ≤ n₁, and let

R_n(x) := (n − n₁)^{−1/2} Σ_{i=n₁+1}^n 1{X_i ≤ x} (Y_i − m_{θ_{n₁}}(X_i)).

The test uses an innovation-martingale transform of this process, of the form

T_n R_n(x) := R_n(x) − ∫_{−∞}^x ṁ_{θ_{n₁}}ᵀ(y) A_n^{−1}(y) [∫_y^{x₀} ṁ_{θ_{n₁}}(u) σ̂^{−2}(u) R_n(du)] F_{n₁}(dy).

Here A_n(y) := ∫_y^{x₀} ṁ_{θ_{n₁}}(u) ṁ_{θ_{n₁}}ᵀ(u) σ̂^{−2}(u) F_{n₁}(du), σ̂² is a consistent estimator of σ² based on the first half of the sample, and ṁ_θ := ∂m_θ/∂θ. Under the assumption that ∫_{−∞}^{x₀} ṁ_{θ₀}(u) ṁ_{θ₀}ᵀ(u) σ^{−2}(u) F(du) is positive definite for some x₀ < ∞, and under some additional smoothness assumptions on the null model, they proved that under H₀, T_n R_n → B ∘ F in distribution in D[−∞, x₀], where B is a standard Brownian motion and F is the distribution function of X. They then propose as a test statistic a suitably standardized version of ∫_{−∞}^{x₀} [T_n R_n(x)]² F_n(dx), and prove that it converges in distribution to ∫₀¹ B²(u) du under their null hypothesis.

An and Cheng (1991) considered the problem of testing the linearity of an autoregressive function.
They proposed a Kolmogorov–Smirnov type test statistic based on a process similar to R_n, with

ε̂_i = (X_i − X̄) − β̂(X_{i−1} − X̄), β̂ = Σ_{k=1}^n (X_k − X̄)(X_{k−1} − X̄) / Σ_{k=1}^n (X_k − X̄)², X̄ = n^{−1} Σ_{k=1}^n X_k.

The test statistic is defined to be a suitably standardized supremum, over x, of the cusum process of these residuals, computed with an integer m satisfying m → ∞ and m(ln ln n)/n → 0. It is proved in the paper that this test statistic converges in distribution, under the null hypothesis of linearity, to the supremum over [0, 1] of the absolute value of a standard Brownian motion.

(m3) For some nonnegative ℓ ∈ L₂(G),

|m_{θ₂}(x) − m_{θ₁}(x)| ≤ ‖θ₂ − θ₁‖ ℓ(x), ∀ θ₂, θ₁ ∈ Θ, x ∈ I.

(m4) The model m_θ is differentiable in θ in a neighborhood of θ₀, with the vector of derivatives ṁ_θ, such that for every ε > 0 and k < ∞,

lim_n sup P( sup_{1≤i≤n, (nh^d)^{1/2}‖θ−θ₀‖≤k} |m_θ(X_i) − m_{θ₀}(X_i) − (θ − θ₀)ᵀ ṁ_{θ₀}(X_i)| / ‖θ − θ₀‖ > ε ) = 0.

(m5) For every ε > 0 there is an N_ε < ∞ such that for every 0 < k < ∞,

P( max_{1≤i≤n, (nh^d)^{1/2}‖θ−θ₀‖≤k} h^{−d/2} ‖ṁ_θ(X_i) − ṁ_{θ₀}(X_i)‖ ≥ ε ) ≤ ε, ∀ n > N_ε.

About the bandwidth h_n, we shall make the following assumptions:

(h1) h_n → 0 as n → ∞.
(h2) nh_n^{2d} → ∞ as n → ∞.
(h3) h ~ n^{−a}, where a < min(1/(2d), 4/(d(d + 4))).

Conditions (h1) and (h2) suffice for the consistency of θ̂_n, while (h3) is needed for the asymptotic normality of θ̂_n and M_{hw}(θ̂_n). Of course, (h3) implies (h1) and (h2). It is well known that under (f), (k), (h1) and (h2), cf. Mack and Silverman (1982),

(2.2.1) sup_x |f̂_h(x) − f(x)| = o_p(1), sup_x |f̂_w(x) − f(x)| = o_p(1),

(2.2.2) sup_x |f(x)/f̂_w(x) − 1| = o_p(1).

These conclusions are often used in the proofs below. In the sequel, we write h for h_n and w for w_n; the true parameter θ₀ is assumed to be an inner point of Θ; and the integrals with respect to the G-measure are understood to be over the set I. The inequality (a + b)² ≤ 2(a² + b²), for any real numbers a, b, is often used without mention in the proofs below.

2.3 Consistency of θ*_n and θ̂_n

This section proves the consistency of θ*_n and θ̂_n. To state and prove these results we need some more notation. Let L₂(G) denote the class of square integrable real valued functions on ℝ^d with respect to G.
Define

ρ(ν₁, ν₂) := ∫_I (ν₁(x) − ν₂(x))² dG(x), ν₁, ν₂ ∈ L₂(G),

and the map T(ν) := argmin_{θ∈Θ} ρ(ν, m_θ), ν ∈ L₂(G). In the sequel we shall often use the following notation:

dφ_w := f̂_w^{−2} dG, dφ := f^{−2} dG.

Moreover, for any integral L := ∫ γ dφ_w, L̄ := ∫ γ dφ. Thus, e.g., T̄(ν) stands for T(ν) with φ_w replaced by φ, i.e., with f̂_w replaced by f. We also need to define

μ_n(x, θ) := n⁻¹ Σ_{i=1}^n K_h(x − X_i) m_θ(X_i),

μ̇_n(x, θ) := n⁻¹ Σ_{i=1}^n K_h(x − X_i) ṁ_θ(X_i),

U_n(x, θ) := n⁻¹ Σ_{i=1}^n K_h(x − X_i) Y_i − μ_n(x, θ) = n⁻¹ Σ_{i=1}^n K_h(x − X_i)(Y_i − m_θ(X_i)), U_n(x) := U_n(x, θ₀),

Z_n(x, θ) := μ_n(x, θ) − μ_n(x, θ₀) = n⁻¹ Σ_{i=1}^n K_h(x − X_i)[m_θ(X_i) − m_{θ₀}(X_i)], θ ∈ ℝ^q,

f̂_h(x) := n⁻¹ Σ_{i=1}^n K_h(x − X_i), f̂_w(x) := n⁻¹ Σ_{i=1}^n K_w(x − X_i), x ∈ ℝ^d,

Σ₀ := ∫ ṁ_{θ₀}(x) ṁ_{θ₀}ᵀ(x) dG(x).

To begin with we state

Lemma 2.3.1 Let m satisfy the conditions (m1), (m2), and (m3). Then the following hold.

(a) T(ν) always exists, ∀ ν ∈ L₂(G).

(b) If T(ν) is unique, then T is continuous at ν in the sense that for any sequence {ν_n} ⊂ L₂(G) converging to ν in L₂(G), T(ν_n) → T(ν); i.e., ρ(ν_n, ν) → 0 implies T(ν_n) → T(ν) as n → ∞.

(c) T(m_θ(·)) = θ, uniquely for every θ ∈ Θ.

Proof. The main ideas of the following proof are essentially as in Beran (1977).

Proof of part (a). Because Θ is compact, it suffices to show that for every ν ∈ L₂(G) the map θ ↦ ρ(ν, m_θ) is continuous. Accordingly, let θ_n be a sequence in Θ converging to a θ ∈ Θ. Then, by the Cauchy–Schwarz inequality, we obtain

|ρ(ν, m_{θ_n}) − ρ(ν, m_θ)| ≤ ρ(m_{θ_n}, m_θ) + 2ρ^{1/2}(ν, m_θ) ρ^{1/2}(m_{θ_n}, m_θ) → 0,

by (m3).

Proof of part (b). Let {ν_n}, ν in L₂(G) be such that

(2.3.1) ρ(ν_n, ν) → 0.

Set θ = T(ν), ϑ_n = T(ν_n). Then, by the definition of T,

ρ(ν_n, m_{ϑ_n}) ≤ ρ(ν_n, m_θ).

By subtracting and adding ν, expanding the quadratic, and using the Cauchy–Schwarz inequality on the cross-product term, the above bound is bounded above by

ρ(ν_n, ν) + ρ(ν, m_θ) + 2ρ^{1/2}(ν_n, ν) ρ^{1/2}(ν, m_θ).

In view of (2.3.1), we thus obtain

(2.3.2) lim sup_n ρ(ν_n, m_{ϑ_n}) ≤ ρ(ν, m_θ).
On the other hand, again by the definition of T, 6, and 19,, here, p(u, mg) _<_ p(V,m19,,) which, together with an argument like the above, implies P(me0n) -p(V»mv) Z P(Vn»mvn) —P(V,m19n) 2 pm. u) — 2p1/2 p(1/,m,;). Ptom this it follows that 6,, —) 6. For, suppose 19,, 4+ 19. Then, by the compact- - ness of 9, there is a subsequence {"6“} C {6"} such that 19,”, -—> 61 ¢ 60, and by 20 the continuity of the map 6 H p(V,6), and by (2.3.1), we Obtain p(1/,,k,m,9nk) ——> p(u, 7119,). Hence, by (2.3.3), p(l/, mm) = p(l/, 171.13), implying, in view of the unique- ness of T(V), a contradiction, unless 191 = 6. Proof of part (c) follows from the identifiability condition (m2), which implies that T(Tl’tg) = 6. D A consequence of this lemma is the following Corollary 2.3.1 Suppose H0, (e1), (62), (f), (m1), (m2), and (m3) hold. Then, 6;, —-—> 60, in probability under H0. Proof. We shall use part (b) of the Lemma 2.3.1 with 12,, = flaw, u = mgo. Note that Mgw(60) = p(;1hw,mgo), 6;, = T(un), and by the identifiability condition (m2), T(V) = 60 is unique. It thus suffices to prove (2.3.4) pm... me.) = opu). To show this, we note that by plugging in Y, = u(X,-) +5,- and note that u = mgo under H0, and expanding the quadratic integrand, p([ihw, u) is bounded above by the sum 2[C,,1 -+- Cn2(60)], where, C... := / Uitx)d¢w(x). cam) == / [i.(w)—f<:. 0, there exists an N,, such that (2.4.3) P (D,,(6n)/||6,, — 90))2 _>_ a + llgfiiflezob) > 1 — a, v n > N,,, where 20 is as in (2.3.1). The claim (2.4.1) then will follow from (2.4.3), (2.4.2), the positive definiteness of >30, and the fact nthn(6,,) = nhd||6,, — 9,,“2 [D,,(6,,)/l|6,, — 90”?) To that effect, let (2.4.4) u,, := (6,, — 60), d,,,- := mg (X,—) mgo(X,- ) — uzmgo(X,-), 1 S i _<_ n. We have M— S Dnl + Dnz, where ||9n - 90“2 D“ = f ”-127“— “(IIdT‘iIIlrdm 0.2 = f M] dad). llunll 27 By the assumption (m4) and the consistency of 6,,, one verifies by a routine argument that D,,1 = 0,,(1). 
For the second term we notice that (245) Dn2 > inf 2,,(6), where Z,,(b) := / [bT [1,,(x, 60)]2 d 0, and any two unit vectors b, b, 6 Rd, llb— b,” g 6, we have 2 |2n(b)- $7.01)! <5()5+2 )U ”“2194 33- Xilllmao(X zllldsotv ) i=1 But the expected value of the r.v.’s inside the square of the second factor tends to f Ilm(x )ll f(x )dcp(x), and hence this factor is 0,,(1). From these observations and the compactness of the set {b 6 W; ”b“ = 1}, we obtain that sup (2,,(b) — bTEObl = 0,,(1). l!b||=1 This fact together with (2.4.5) implies (2.4.3) in a routine fashion, and also concludes the proof of (2.4.1). We shall now prove the asymptotic normality of n1/2(6,, — 60). The proof is classical in nature. Recall the definitions (2.3.1) and (2.4.4), and let th(6) :2 —2/Un(x,6),ii,,(x,6)d<,b,,,(x). 28 Since 60 is an interior point of O, by the consistency, for sufficiently large n, 6,, will be in the interior of O and th(6,,) = 0, with arbitrarily large probability. But the equation th(6,,) = 0 is equivalent to (2.4.6) [Un(x)/1,,(x,6,,)d<,bw(x) = [Z,,(x,6,,)ii,,(x,6,,)d¢w(x). We shall show that n”2 x the left hand side of this equation converges in distribution to a normal r.v., while the right hand side of this equation equals R,,(6,, — 60), for all n 21,with R, = 20 —+- 0,,(1). To establish the first of these two claims, rewrite this r.v. as the sum 3,, -+- Sm + gnl + gn2 'i” 97,3 'i' 9,,4, where 5,, = /Un(x);ih(x)d 0, and 9190(1) is continuous in x E I. Then, under H0, nl/QSn —-)d 29 N(O, 2) , where 2 = zimhsof/EKux—Xmo—X)a?(X)nh z [02(xlmoo($)m£(x)92(x) f(x) dz. Moreover, iff is twice continuously difierentiable, and h satisfies (h3), then (2.4.7) n1/2|Sn1| = 0,,(1). Lemma 2.4.2 Under H0, (61), (62), (f), (k), (m1), (m2), (m4), (m5), (h1), (h2), (2-4-8) (9) Til/297:1 = 012(1)» (’9) ”mm = 0,,(1). 
(2-4-9) (6) n” 29713 = 012(1), (d) 711/ng = 019(1)- The proof of (2.4.7) is facilitated by the following lemma, which along with its proof appears as Theorem 2.2 part (2), in Bosq (1998). Lemma 2.4.3 Let fw be the kernel estimate associate with a kernel K‘ which sat- isfies a Lipschitz condition. If f is twice continuously differentiable with a compact 1 support, if wn is chosen to be on (log n/n)m where an ——> a0 > 0, then (10g;c fl)‘1(n/10gn)‘fi 81:11) lfwtr) - f (It)! —> 0, 61-S- for any positive integer k. Proof of Lemma 2.4.1. For convenience, we shall give the proof here only for the case d = 1, i.e., when uh(:r) is one dimensional. For multidimensional case, the ' 30 result can be proved by using linear combination of its components instead of uh(a:), and applying the same argument. Let sm- := th(:r - Xi)eiuh(:r)d 00, (2.4.10) E33,1 —) 2, (2.4.11) E {s§,,1(|snll > Til/2M} —> 0, VA > 0. But, E33,, E/Kh(:z: — X)8[4h($)d<,9(:v) >< /Kh(y - Xl€flh(y)d99(y) = f/EKm:—X)K,,(y—X)(I"(Xlflh(13)fih(y)dse($)d92(z) f(x) Hence (2.4.10) is proved. To prove (2.4.11), note that by the Holder inequality, the L.H.S. of (2.4.11) is 31 bounded above by A_6/2Tl_6/2E(Sn1)2+6 (fume — X)#h(x))2’iédso(r))2 [elm] . S A-6/2 ”_6/2E This upper bound is seen to be of the order 0((nhd)'5/2) = 0(1), by (h2), thereby proving (2.4.11). To prove (2.4.7), by the Cauchy-Schwarz inequality, the boundedness of uh(:1:), (2.3.6), and by Lemma 2.4.3, we obtain 983.. s Cn [(U.(x>9hzdso(z> sup (mo/fin) — 1 2 zEI = n Op((nlld)_1)0p((108k n>2 9‘?) = 0,, ((log,c n)2(log n)fi7 nad7h) = op(1),by (h3). This completes the proof of Lemma 2.4.1. [:1 Proof of Lemma 2.4.2. By the Cauchy-Schwarz inequality, Hal/29.412 s (1912/ / Uflxldwxl) (..1/2/ 11,249.90) —- flh(x)||2d99(r)) . By (2.3.5), and (112), (2.4.12) En1/2/U:($)dcp($) = 0(n'1/2h—d) = 0(1). To handle the second factor, first note that (1,,(93, (90) — ph(:c) is an average of centered i.i.d. r.v.’s. 
Using Fubini, and the fact that variance is bounded above by the second moment, we obtain that the expected value of the second factor of the above bound 32 is bounded above by (2.4.13) n’l/Q/EIIK),(I — X)r'ngo(:z:)[|2 deem) = 0(n'1/2h’d) = 0(1). This completes the proof of (2.4.8)(a). This together with (2.2.2) implies (2.4.8)(b). To prove (c), similarly, lin1/29n3ll2 S n/U§($)d— mom->1)? / (R.(x>)2999w(x> + / Rn|mh<49o>nds9w = can-+0.0). by (2.2.2), the assumption (m5), and by (2.4.1). This together with (m4) then implies that lanll = 0,,(1), and by the consistency of 6”, we also have ||Vnuffll = 0,,(1). Next, consider Ln. We have Ln = fflnbfloflflnwfln) -fln(9:.00)le¢w('-II) + ffln($»90)fl:($,90)d¢w($l = L711 + [1,0, say. But, by (2.2.1) and (m5), “Ln“ = 0,,(1), while an — fflhwigolfiflfigoldifidfl ! s / ((4144.90) — p.(x.9o)11299.,<4> +2 f ((44490) — 944,90)” (In. 0, andf is twice continuously difierentiable. Then, under H0, (2.4.15) Til/2(9), -— 90) = mini/23,, + 0,,(1). Consequently, n1/2(6n —— 60) => N(0. 2512231), where E is as in Lemma 2.4.1. Remark 2.4.1 Upon choosing g E f, one sees that E = f02(z)mgo($)mg;(a:)f(:1:)d$, 20 =/rngo(:r)mg;(a:)f($)dzr. It thus follows that in this case the asymptotic distribution of n1/2(6n — 60) is the same as that of the least square estimator. This analogy is in flavor similar to the one observed by Beran (1977) when pointing out that the minimum Hellinger distance estimator in the context of density fitting problem is asymptotically like the maximum likelihood estimator. Consider a: and 6 are one dimensional case. Let mg(:z:) = 61:, so 7349(2) = 2:. Let 6n be the minimum distance (MD) estimator, 6,, be the lease absolute distance (LAD) estimator. The variance of £49,, — 60) is denoted by V1, and = 03f12292($)f‘1(:r)dx (f1 9290(4))2 35 V1 The variance of \/r—i(6n — 60) is denoted by V2, and 1 V2 = W Let g(2:) = f2(:z:)l(a:). then a: f. $2f3(x)12($)dx V1 = 2 . 
(f1 x2f2(:r)l(x)) Now consider the example that X ~ N (0, 7'"), l (1:) = f‘1(a:), the error distribu- tion is N(0, 03), and I is a finite interval [—a, a], then V a: ffa 172f($ld$ of 1 (la x2f(z (1:13)2 ffa fiflxldxl 27m2 n02 V = e : ———£. 2 472 2T2 Take r = 1 and a large enough such that /a $2w($)da‘ > if“) x2w($)da:, 0. —oc where 7/1 stands for the standard normal density, then V1 < V2. Or take a = 1 and 7' small enough such that "HO 2 00 / y2w(y)dy > ;/ y2¢(y)dy, G r 00 then V1 < V2. Remark 2.4.2 Linear regression. Consider the linear regression model, where q = d+ 1, G) = Rd“, and m(r) = 61 + 6310, with 61 6 IR, 62 6 Rd. Because now the _ parameter space is not compact the above results are not directly applicable to this 36 model. But, now the estimator has a closed expression and this regression function satisfies the conditions (m1) - (m5) trivially. The same techniques as above yield the following result. With the notation in (2.3.1), in this case . , R1103) . EKh(.’L‘ - X) un(2:.9) E #1427) s ,uhm 2 1:” Kh(~73- Xilxi EKh(a:—X)X 230 = / 1 m g($)d$, En: [fln($)fln($)'d¢w($), :L‘ IBIL‘ _ 1 9’ Home» E ' f , f(x) dx’ IL‘ 231: MW) = [we — <9 — 9o)'4n(z>]299w(4~>. The positive definiteness of Z,, and direct calculations thus yield (9‘. — 90) = 2:3 / no) Un($)d90w(33l- From the fact that Z,, —-+ 20, in probability, parts (a) and (b) of Lemma 2.4.2, and from Lemma 2.4.1 applied to the linear case, we thus obtain that if (e2), (k) and (h3) hold, if the regression function is a linear parametric function, and if f ||x||2d0(r) < 00, f is twice continuously differentiable, then n1/2(6n — 60) = 251 / Un(:1:)[ih($)d N(0,2312251). Remark 2.4.3 Tightness. Consider when d = 1, from the definition, 0:, satisfies 37 the equation (2.4.16) A (mo; (:16) — m90(1:)) ma;($)dG(r) = An + Bn + C", where 1 n , dG'(:r) An = -— K x—Xie, ma-x . , film; "( ll "( )no) _ l n $_ -m _ _ r'n- dG(:c) Bn — /I‘(nt=ZIKh( X1) 00(Xz) Hal) 0,,(x) fh($), . 
dG($) Cu 2 EKhx—Xl m0X1 —mo:z: ma-x - , [I < M .() 9<>> MW) and #a = EKh(-’F — X1) (m90(X1l " m90(x))» and in stand for am/ae. The left hand side of (2.4.16) is approximately (a; — 9o) / miwcm. An and B, are op(1/\/r—il_i) by Cauchy-Schwarz inequality, consistency of 01;, and continuity of m on 6 E 9. But 0,, is approximately 40(4) f (a?) ' [fl/KW) (mach: — uh) — m60($)) TWOW) So if mao(') is not differentiable, and Vnh(m90(a: — h) — mgo(:1:)) is divergent, then Vnh(oz; — 60) is not tight. 38 2.5 Asymptotic distribution of the minimized dis- tance This section contains a proof of the asymptotic normality of the minimized distance th(6n). To state the result precisely, recall the definitions of C", CmC'n, I‘, f‘n from (1.0.7) and let P, := 2nd / / [EKh(a: — X)K,,(y — X)02(X)]2 d 7.0/2, is of the asymptotic size a, where .20 is the 100(1 — a)% percentile of the standard normal distribution. Our proof of this theorem is facilitated by the following five lemmas. Lemma 2.5.1 If (e1), (e2), (f), (g), (k) hold and if nhd —> 00, then nhd/2(th(60) —C,,) is asymptotically normally distributed with mean zero and vari- ance I‘. 39 Lemma 2.5.2 Suppose (e1), (62), (f), (1:), (m3), (m4), (m5), (h1), (h2) hold and Ea:4 < 00. Then nh‘l/2 th(60) — th(6n) = op(1). Lemma 2.5.3 Suppose, in addition to (61), (62), (k), (m3), (m4), (m5). and Be4 < 00, f is twice continuously difierentiable and h satisfies (h3). Then, ”ltd/2 thlgol — thlgol = 012(1)- Lemma 2.5.4 Under the same conditions as in Lemma 2.5.3, nhd/2(C'n — C3,) = 0,,(1). Lemma 2.5.5 Under the conditions of Theorem 2.5.1., f‘n — I‘ = 0,,(1). Conse- quently, the positive definiteness ofF implies, lf‘nF‘l — 1! = 0,,(1). Proof of Lemma 2.5.1. Note that th(60) can be written as the sum of Cu and Mug, where Mn2 = n—22/Kh($—Xi)Kh(IL‘—Xj) €15jd§0($). i¢j We shall prove that (2.5.1) nhd/2Mn2 is AN(O,1",,). To prove (2.5.1), we shall use Theorem 1 of Hall (1984) which is reproduced here for the sake of completeness. Theorem 2.5.2. 
Let TC, 1 S i S n, be i.i.d. random vectors, and let Un. I: Z Hn(Xi9X~j)i Gn($1y) = EHH(X1?$)HTI(X1ay)’ 131(an 40 where Hn is a sequence of measurable functions symmetric under permutation, with EHn()~(1,)~(2)|)~(1) = 0, a.s., and EH:(X1,X2) < 00, for each n _>_ 1. If - - - - - - 2 EGE,(X1,X2) +n-1EH3(X1,X2)] / [EH§(X1,X2)] —+ 0, then Un is asymptotically normally distributed with mean zero and variance n2 EH3 (U?) f(awrgwmi f(f KKdu>2dv by the continuity of 02 and f. This complete the proof of Lemma 2.5.1. [:1 43 Note that 0,, = n‘lEfKflx—Xl) 5f (190(27). Let 6,, :2 Ef Kflx—Xflef (190(1). Then, by routine calculations, , 2 E (Mid/2(6), — Cn)) n 2 = E (n-lhd/z 2 [/ Kflx — X06? (190(3) — enJ) i=1 3 72,-1th (/ Kin: — X06? (190(1))2 (f Km: — X1) d¢(x))2€i] = 0((nhdl’1) = 0(1). = n'lth Combining this with the Lemma 2.5.1, one obtains that nhd/2(th(60) — Cu) is AN (0, F”). Proof of Lemma 2.5.2. Recall the definitions of Un and Zn from (2.3.1). To prove part (b), add and subtract m90(X,-) to the 2"" summand inside the square integrand of th(én), to obtain that th(60) — th(én) = 2/Un($)Zn($,én) d¢w(:1:) —/Z§(3:,én) dgbw(2:) = 2Q1 - Q2~ 533’- We need to show that (2.5.7) (2) nhd/zQI 2 010(1), (22) nILd/2Q2 = 0,,(1). 44 By subtracting and adding (én — 90)Trhgo(X,-) to the 2"" summand of the second factor of integrand in Q1, we can rewrite Q1 as the sum of Q11 and Q12, where / Una) Q12 = (93. — 001T / Und¢w Q11 12-1 Z Kh(.’L‘ - Xi)dm'] d¢w(:z:), i=1 where dm- are as in (2.4.4). By (2.4.1), for every n > 0, there is a k < 00, N < 00, such that P(An) 2 1— n, for all n > N, where An := {(nhd)1/2||én — 00“ < k}. By the Cauchy-Schwarz inequality, (2.2.2), (2.3.6) and the fact that (2.5.8) / (Rn(:r))2 dam = 0,.(1), we obtain that on the event An, nhd/leul is bounded above by nl/zllén — 00||(nhd)1/2 sup in—I-L— Op((nhd)"1/2). 
i»(nhd)”’ll9-90Hpn(x,é.)d¢w(z), Q1222 = (én — 60)T/Un($) [lin($aén) — #413390) d‘iaw(x) Arguing as above, on the event An, (nhd/2IQ122I)2 is bounded above by nzhduén — eon? max 11mm» — moo(Xz-)||20p((nhd)‘1) = 0pc), lgign by (2.2.2), (2.3.6), (2.5.8), and assumptions (m5) and (h2). 45 Next, note that Qm is the same as the expression in the left hand side of (2.4.6). Thus, it is equal to (2.5.9) ((9,. — 601T / Zn(x.én>un(x,én)d¢w(z) = (6.. — 001T]z. +<én — 6017‘ f 2.42:. 62.) [w.én) — mm] mm 2 D1 + D2, say, But, by the Cauchy-Schwarz inequality, (2.2.2), (2.3.19), and (2.5.8), nhd/2IDll is bounded above by nhd/2llén ‘ 90l|20p(1l = 010(1), by Theorem 2.4.1 and the assumption (m5) and (h2). Similarly, one shows nhd/2|Dgl is bounded above by nhd/znén — 9o||20p(1) = 0pm. This completes the proof of (2.5.7)(i). The proof of (2.5.7)(ii) similar. Details are left out for the sake of brevity. D Proof of Lemma 2.5.3. Note that nhd/21thwo) — moon 5 mid/2 / U2 sup ”Ra/fax) — 1| 2:61 = mild/20A(nhd)‘1)0p((10gk”)(log n/nlfi) = 0,,(1), by (2.3.5) and Lemma 2.4.3. Hence the lemma. (3 Proof of Lemma 2.5.4. Let t:- = mg, (Xi) — mom), Ana) == mm) (rm) — Wm) . Then, 1 ” - 0,, = E: / K§(z—x,)(e. —t,-)2d-W) = 0pm. To prove the part (b) of (2.5.10), note that Aug can be written as the sum of Arm, Arm, and An233 where A... = 71-22 / X2 MUM) An22 = Z/Kh AM )dflx ) An23 = 71—2 til/Khwii) {)1'5 'it A nl‘( )d(,0(.’13). By taking the expected value and the usual calculation, one obtains that n-ZZ/Xm—X X>§d I“. 49 For the sake of convenience, write Kh(2: — X ,-) by K 1(2). Now, rewrite P" as the sum of the following terms: B1 = hd “1‘2: (fKr-(ale )(Ei-t1)(€r-tj)dso($))2. B2 = (fin-2 Z (/ X.(:c1X.-(x1(e.- — t.1(e.- -t1)An($)d¢($))2. B. = 422:”: Um. -(-t.-1(o- 4.1111(2)) x (fw X.(z1X.(e~1(e. — 111(5. — trlAn($)dde dz, i=1 (3”,, = [I (Z K,‘~:(r — X,)(Y, — )(énxy) (5: Kw(:c — X0) dz. The value of the test statstic is calculated by nhd/2(th(én) — C1,). 
In order to plot a density curve, we repeated the above sampling and calculation 1000 times. The density curves of the normalized θ̂_n and of the test statistic are plotted using the density-plot command with the Gaussian kernel option in S-PLUS 2000. We also ran the above simulation for n = 100 and n = 200. The first three graphs in Figure 2.1 are the Monte Carlo density curves of √n(θ̂_n − 1) from 1000 runs with sample sizes n = 50, n = 100, and n = 200, respectively. The fourth graph is the N(0, (0.173025)²) density, the density curve of the limiting distribution of √n(θ̂_n − 1) based on the theorem obtained in Section 2.4. The first three graphs in Figure 2.2 are the Monte Carlo density curves of nh^{d/2}(M_h(θ̂_n) − Ĉ_n) from 1000 runs with sample sizes n = 50, n = 100, and n = 200, respectively. The fourth graph is the density curve of the limiting distribution of nh^{d/2}(M_h(θ̂_n) − Ĉ_n) given in Theorem 2.5.1, which is N(0, (0.026344)²) in the present case.

Figure 2.1: The density curve of √n(θ̂_n − 1). (Panels: sample sizes 50, 100, 200, and the limiting distribution of the estimate of the parameter.)

Figure 2.2: The density curve of nh^{d/2}(M_h(θ̂_n) − Ĉ_n). (Panels: sample sizes 50, 100, 200, and the limiting distribution of the test statistic.)

The graphs show that the distribution of √n(θ̂_n − 1) resembles the asymptotic normal distribution quite well even for sample size 50. The distribution of nh^{d/2}(M_h(θ̂_n) − Ĉ_n) has a small negative bias compared with the asymptotic normal distribution for all three sample sizes, but the bias decreases as n increases.

A simulation for d = 2 and m = 2 was also conducted. The hypothesis to be tested is

H_0 : μ(x) = 0.5x_1 + 0.8x_2, vs. H_1 : H_0 is not true.
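The density-plot step used above in S-PLUS can be mimicked directly: a Gaussian-kernel density estimate of the 1000 normalized replicates, overlaid against the limiting normal curve. The sketch below uses Silverman's rule-of-thumb bandwidth, which is a common default and an assumption here (the S-PLUS default may differ); the sample of draws stands in for the 1000 simulated values of √n(θ̂_n − 1).

```python
import numpy as np

def gaussian_kde_curve(samples, grid):
    """Gaussian-kernel density estimate on `grid`, Silverman rule-of-thumb bandwidth."""
    n = len(samples)
    bw = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - samples[None, :]) / bw
    return np.exp(-0.5 * z * z).sum(axis=1) / (n * bw * np.sqrt(2 * np.pi))

rng = np.random.default_rng(3)
# stand-in for 1000 Monte Carlo values whose limit is N(0, 0.173025^2)
draws = 0.173025 * rng.standard_normal(1000)
grid = np.linspace(-1, 1, 401)
dens = gaussian_kde_curve(draws, grid)
area = float(np.trapz(dens, grid))   # a density curve should integrate to about 1
print(round(area, 3))
```

Plotting `dens` against `grid` alongside the N(0, 0.173025²) density reproduces the kind of comparison shown in Figures 2.1 and 2.2.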
The parametric model to be fitted is {m_θ(x_1, x_2) = θ_1 x_1 + θ_2 x_2, θ = (θ_1, θ_2)ᵀ ∈ ℝ², x = (x_1, x_2)ᵀ ∈ ℝ²}. We chose the following five models to generate simulated data from:

model 0. Y_i = 0.5X_{1i} + 0.8X_{2i} + ε_i,
model 1. Y_i = 0.5X_{1i} + 0.8X_{2i} + 0.3(X_{1i} − 0.5)(X_{2i} − 0.2) + ε_i,
model 2. Y_i = 0.5X_{1i} + 0.8X_{2i} + 0.3X_{1i}X_{2i} − 0.5 + ε_i,
model 3. Y_i = 0.5X_{1i} + 0.8X_{2i} + 1.4(exp{−0.2X_{1i}²} − exp{0.7X_{2i}²}) + ε_i,
model 4. Y_i = I{X_{2i} > 0.2} X_{1i} + ε_i.

The error distribution is N(0, 0.3). The X_{1i} are i.i.d. N(0, 0.7) and the X_{2i} are i.i.d. N(0, 1). The sample sizes chosen are 30, 50, 100, and 200. The nominal level used to implement the test is α = 0.05. There are 1000 replications for each combination of (model, sample size). Data from model 0 are used to study the empirical size, and data from models 1 to 4 are used to study the empirical power of the test. The empirical size (power) is computed as the

relative frequency of { value of the test statistic > F⁻¹(1 − α) },

where F is the asymptotic distribution of the test statistic under H_0. The bandwidth h is chosen to be n^{−1/4.5} and w is chosen to be (log n/n)^{1/(d+4)}; the measure G is taken to be the uniform distribution on [−1, 1]. The density curves of the normalized θ̂_n and of M_h(θ̂_n) are plotted using the density-plot command with the Gaussian kernel option in S-PLUS 2000 for one dimension, and Surface-Spline Fine Grid for two dimensions, where θ̂_n = (θ̂_{1n}, θ̂_{2n})ᵀ and θ_0 = (0.5, 0.8)ᵀ. The results of the power study are shown in Table 2.1, which gives the empirical sizes and powers for testing model 0 against models 1 to 4. The simulation results for the densities of √n(θ̂_n − θ_0) and of the minimum distance test statistic are shown in Figures 2.3 to 2.9. Figure 2.3 shows the Monte Carlo density curves of √n(θ̂_{1n} − 0.5) from 1000 runs with sample sizes n = 30, n = 50, n = 100, and n = 200, respectively. Figure 2.4 shows the Monte Carlo density curves of √n(θ̂_{2n} − 0.8). Figure 2.5 is the Monte Carlo density surface of √n(θ̂_n − θ_0) when n = 30.
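The empirical size and power computation described above is just a tail-frequency count against the null critical value. The sketch below assumes a standard normal null distribution, whose 0.95 quantile is 1.6449, and uses synthetic statistic values in place of the simulated test statistics.

```python
import numpy as np

def empirical_rejection_rate(stats, crit):
    """Relative frequency of {test statistic > F^{-1}(1 - alpha)}."""
    stats = np.asarray(stats)
    return float((stats > crit).mean())

rng = np.random.default_rng(5)
crit_095 = 1.6449                                  # N(0,1) quantile F^{-1}(0.95)
null_stats = rng.standard_normal(10000)            # statistics behaving as under H0
shifted_stats = 2.0 + rng.standard_normal(10000)   # statistics under an alternative
size = empirical_rejection_rate(null_stats, crit_095)
power = empirical_rejection_rate(shifted_stats, crit_095)
print(round(size, 3), round(power, 3))
```

With the null statistics the rejection rate estimates the size (close to 0.05 here); with the shifted statistics it estimates the power, which is how the entries of Table 2.1 are produced.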
Figure 2.6 is the Monte Carlo density surface of √n(θ̂_n − θ_0) when n = 50. Figure 2.7 is the Monte Carlo density surface of √n(θ̂_n − θ_0) when n = 100. Figure 2.8 is the Monte Carlo density surface of √n(θ̂_n − θ_0) when n = 200. Figure 2.9 is the Monte Carlo density of the test statistic under H_0 with sample sizes n = 30, n = 50, n = 100, and n = 200. In the following figures, a dotted line is for n = 30, a dashed line is for n = 50, a long-dashed line is for n = 100, and a heavy solid line is for n = 200.

Table 2.1: Empirical sizes and powers for testing model 0 vs. models 1 to 4.

           n = 30   n = 50   n = 100   n = 200
model 0     0.005    0.022    0.036     0.049
model 1     0.003    0.062    0.670     0.895
model 2     0.931    0.999    1.000     1.000
model 3     0.461    0.975    1.000     1.000
model 4     0.035    0.368    0.977     1.000

Figure 2.3: The density of √n(θ̂_{1n} − 0.5).
Figure 2.4: The density of √n(θ̂_{2n} − 0.8).
Figure 2.5: The two-dimensional density of √n(θ̂_n − θ_0) when n = 30.
Figure 2.6: The two-dimensional density of √n(θ̂_n − θ_0) when n = 50.
Figure 2.7: The two-dimensional density of √n(θ̂_n − θ_0) when n = 100.
Figure 2.8: The two-dimensional density of √n(θ̂_n − θ_0) when n = 200.
Figure 2.9: The density of the test statistic under H_0.
Chapter 3

Minimum Distance Autoregressive Model Fitting

3.1 Introduction

This chapter discusses the application of the minimum distance idea to fitting a parametric model to an autoregressive function. To be specific, let {X_n} be a real valued strictly stationary process having finite expectation. The autoregressive function is defined to be

μ(x) = E(X_n | X_{n−1} = x), n ∈ ℤ.

Let {m_θ(·) : θ ∈ Θ}, Θ ⊂ ℝ^m, Θ compact, be a given set of parametric functions. The statistical problem of interest here is to test the goodness-of-fit hypothesis

H_0 : μ(x) = m_{θ_0}(x), for some θ_0 ∈ Θ and for all x ∈ I, vs. H_1 : H_0 is not true,

based on a sample {X_i : i ∈ ℤ₊} from the stochastic process, where I is a compact subset of ℝ.

In the context of the regression fitting problem under the i.i.d. setup, the asymptotic properties of the minimum distance estimator θ_n of the parameter were studied, where θ_n is defined to be the argument that minimizes a transformation of the L_2(G) distance between a nonparametric estimate of the regression function μ and the parametric function m_θ. It was shown that the so defined minimum distance estimator is consistent and asymptotically normally distributed at the rate √n. The corresponding minimized distance is also asymptotically normally distributed. Thus a class of tests can be constructed from the suitably standardized minimum distance.

Encouraged by these results in the i.i.d. case, we apply the same idea to autoregressive model checking. When dealing with regression model fitting, to reduce the bias caused by f̂_h in M_h^w(θ) defined in Chapter 1, we used an optimal window width for the Nadaraya–Watson type estimator of f, i.e. f̂_w. But it still causes bias.
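The autoregressive function μ(x) = E(X_n | X_{n−1} = x) that the test targets can be estimated nonparametrically by a Nadaraya–Watson smoother of X_i on X_{i−1}. The sketch below does this for an AR(1) process with μ(x) = 0.8x; the Gaussian kernel, the bandwidth h = 0.3, and the evaluation points are choices made for the demo.

```python
import numpy as np

def nw_autoregression(x, h, points):
    """Nadaraya-Watson estimate of mu(x) = E(X_t | X_{t-1} = x)."""
    lag, resp = x[:-1], x[1:]
    z = (points[:, None] - lag[None, :]) / h
    w = np.exp(-0.5 * z * z)
    return (w @ resp) / w.sum(axis=1)

rng = np.random.default_rng(11)
n = 20000
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):                       # AR(1): mu(x) = 0.8 x
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()
pts = np.array([-1.0, 0.0, 1.0])
mu_hat = nw_autoregression(x, h=0.3, points=pts)
print(np.round(mu_hat, 2))
```

The estimated curve should track the line 0.8x near the center of the stationary distribution; it is exactly this kind of kernel estimate whose distance from the parametric family m_θ the minimum distance test measures.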
Hence in this chapter we consider using a slightly different L_2 distance, the M_h(θ) of (1.0.8), which is the L_2(G) distance between m_θ f and the kernel estimator

(1/n) Σ_{i=1}^n K_h(x − X_{i−1}) X_i,

where G is a σ-finite measure with bounded Lebesgue density g. The estimator of the parameter is defined as in (1.0.9). The test statistic T_n is defined to be

T_n = ( n h^{1/2} / Γ̂_n^{1/2} ) ( M_h(θ_n) − n⁻² Σ_{i=1}^n ∫_I K_h²(x − X_{i−1}) dG(x) ε̂_i² ),

where Γ̂_n is a consistent estimator of

Γ² := 2 (σ²)² ∫_I f²(x) g²(x) dx ∫ ( ∫ K(u) K(u+v) du )² dv,

and ε̂_i = X_i − m_{θ_n}(X_{i−1}). Similar to the discussion in Chapter 2, Γ̂_n can be chosen to be

Γ̂_n = 2 ∫_I ( n⁻¹ Σ_{i=1}^n K_h(x − X_{i−1}) ε̂_i² )² g²(x) dx ∫ ( ∫ K(u) K(u+v) du )² dv.

A stochastic process is said to be geometrically strong mixing (GSM) if there exist c_0 > 0 and ρ ∈ [0, 1) such that α(k) ≤ c_0 ρ^k, k ≥ 1, where

α(k) := sup_t α( σ{X_s, s ≤ t}, σ{X_s, s ≥ t + k} ),   α(𝒜, ℬ) := sup_{A∈𝒜, B∈ℬ} |P(A ∩ B) − P(A) P(B)|,

and σ{X_s, s ≤ t} stands for the σ-field generated by {X_s, s ≤ t}. It is also pointed out in Bosq (1998) that the usual linear processes are GSM.

Here we shall state the needed assumptions.

(M) The time series {X_i; X_i ∈ ℝ, i ∈ ℤ}, where ℤ stands for the set of all integers, is strictly stationary, satisfies the GSM mixing condition, and X_i = μ(X_{i−1}) + ε_i.

McKeague and Zhang (1994) pointed out that it is easier to check geometric ergodicity, which implies strong mixing with a geometric mixing rate. From Tweedie (1983), one obtains that a sufficient (but by no means necessary) condition for geometric ergodicity of a nonlinear autoregressive process is that μ and σ are bounded on compact sets, where σ² = E(ε_1² | X_0).

About the errors and the underlying design we assume the following:

(S1) The autoregressive function μ(·) satisfies ∫ μ²(x) dG(x) < ∞, where G is a σ-finite measure on ℝ.

(S2) The {ε_i} are i.i.d., ε_{i+1} is independent of X_j, j = 0, …, i, and σ² := Eε_1².

(S3) X_0 has a twice continuously differentiable Lebesgue density f that is bounded from below on I. Denote the first and second derivatives of f by f′ and f″, respectively.
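The remark that linear processes are GSM can be made concrete: while the mixing coefficients α(k) themselves are hard to compute, the geometric decay of dependence in a linear AR(1) process is visible in its autocorrelations, which equal 0.8^k for the model below. This is an illustration of geometric decay of dependence, not a computation of α(k) itself; the sample size and seed are demo choices.

```python
import numpy as np

rng = np.random.default_rng(13)
n = 200000
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):                      # stationary linear AR(1)
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()

xc = x - x.mean()
var = float(np.mean(xc * xc))
# empirical autocorrelations at lags 1..5; theory: acf(k) = 0.8**k
acf = [float(np.mean(xc[:-k] * xc[k:]) / var) for k in range(1, 6)]
print([round(a, 2) for a in acf])
```

The printed sequence decays geometrically, mirroring the bound α(k) ≤ c_0 ρ^k that condition (M) imposes.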
We also suppose that sup_{t_1,…,t_6} ||f_{t_1,…,t_6}||_∞ < ∞, where f_{t_1,…,t_6} is the joint density of X_{t_1}, X_{t_2}, X_{t_3}, X_{t_4}, X_{t_5}, and X_{t_6}.

About the kernel function K we shall assume the following: conditions (K), (A1), and (A2) are the same as those in Chapter 2.

(A3) For each θ, m_θ(x) and ṁ_{θ_0}(x) are a.s. continuous in x w.r.t. the integrating measure G.

(A4) The map θ ↦ m_θ is continuous in L_2(G): for any sequence θ_n, θ ∈ Θ, ||θ_n − θ|| → 0 implies ρ(m_{θ_n}, m_θ) → 0.

(A5) For every ε > 0, there is an N_ε < ∞ such that for every 0 < k < ∞,

max_{1≤i≤n, (nh)^{1/2}||θ−θ_0||≤k} h^{1/2} ||ṁ_θ(X_i) − ṁ_{θ_0}(X_i)|| = o_p(1).

About the bandwidth h we shall make the following assumption:

(H) h ∼ n^{−a} for some a > 0, and there is a γ > 0 such that n h^{2+γ} → ∞.

In this chapter we will often use an inequality from Bosq (1998). We list it here as a lemma.

Lemma 3.2.1 Let X and Y be real valued random variables such that X ∈ L^q(P), Y ∈ L^r(P), where q, r > 1 and 1/q + 1/r = 1 − 1/p. Then

|Cov(X, Y)| ≤ 2p (2α)^{1/p} ||X||_q ||Y||_r;

in particular, |Cov(X, Y)| ≤ 4α ||X||_∞ ||Y||_∞, where α = α(σ(X), σ(Y)) and ||X||_∞ = inf{b : P(|X| > b) = 0}.

Analogous to the notation defined in Section 2.3, we introduce some notation that will be needed in this chapter:

U_n(x, θ) = (1/n) Σ_{i=1}^n K_h(x − X_{i−1})(X_i − m_θ(X_{i−1})),   U_n(x) = U_n(x, θ_0) = (1/n) Σ_{i=1}^n K_h(x − X_{i−1}) ε_i,

and Z_n(x, θ), μ_n(x, θ), and the related quantities are as defined in Section 2.3, with X_i replaced by X_{i−1}. Note that

M_h(θ_0) = ∫_I U_n²(x) dG(x).

We also introduce the following notation:

J_h := {y ∈ ℝ : |x − y| ≤ h, x ∈ I}.

3.3 Consistency of θ_n

The main result of this section is the consistency of θ_n. Similar to the proof of consistency in the previous chapter, we first prove the consistency of θ*_n in Lemma 3.3.3, where θ*_n is defined to be θ*_n := argmin_{θ∈Θ} M*_h(θ), and

M*_h(θ) := ∫_I ( (1/n) Σ_{i=1}^n K_h(x − X_{i−1}) X_i − m_θ(x) f(x) )² dG(x).

This result is in turn used to prove the consistency of θ_n in Theorem 3.3.1.
Lemmas 3.3.1 and 3.3.2 list some results that will be needed in the proofs of Lemma 3.3.3 and the theorem.

Lemma 3.3.1 Let ψ_n be a sequence of m-dimensional vectors of real valued functions defined on ℝ, bounded on J_h uniformly in n. Then, under condition (M), the following hold for all x ∈ I and all 0 < a < 1:

(a) n⁻¹ Σ_{i=1}^n ( K_h(x − X_{i−1}) ψ_n(X_{i−1}) − E K_h(x − X_0) ψ_n(X_0) ) = O_p( 1/√(n h^{1+a}) ),

(b) ∫_I || n⁻¹ Σ_{i=1}^n ( K_h(x − X_{i−1}) ψ_n(X_{i−1}) − E K_h(x − X_0) ψ_n(X_0) ) || dG(x) = O_p( 1/√(n h^{1+a}) ),

(c) E ∫_I || n⁻¹ Σ_{i=1}^n ( K_h(x − X_{i−1}) ψ_n(X_{i−1}) − E K_h(x − X_0) ψ_n(X_0) ) ||² dG(x) = O( 1/(n h^{1+a}) ),

where || · || stands for the usual Euclidean norm on ℝ^m, i.e. ||(a_1, …, a_m)ᵀ||² = a_1² + ⋯ + a_m².

Proof. Note that the lemma holds for {ψ_n} if and only if it holds for each j-th component of {ψ_n}, 1 ≤ j ≤ m; hence we only need to prove the lemma for the case m = 1. Recall that m is the dimension of Θ.

Corollary 3.3.1 Under condition (M), for any continuous function ψ bounded on J_h,

n⁻¹ Σ_{i=1}^n K_h(x − X_{i−1}) ψ(X_{i−1}) → ψ(x) f(x), in probability, for all x ∈ I.

Proof. Note that by the continuity of f and ψ,

E K_h(x − X_i) ψ(X_i) = ∫ K(u) ψ(x − uh) f(x − uh) du → ψ(x) f(x),

so the corollary follows by applying Lemma 3.3.1(a) to ψ_n = ψ. □

Lemma 3.3.2 Under conditions (S1), (S2), and (K),

E ∫_I Z_n²(x, θ_n) dG(x) → 0.

Proof. By adding and subtracting K_h(x − X_{i−1}) X_i to the i-th summand in Z_n(x, θ_n), and expanding the quadratic term, one obtains

∫_I Z_n²(x, θ_n) dG(x) ≤ 2 M_h(θ_n) + 2 M_h(θ_0) ≤ 4 M_h(θ_0).

The second inequality follows from the definition of θ_n. Therefore, to prove the lemma it suffices to show that E M_h(θ_0) → 0. Note that by Fubini,

E M_h(θ_0) = (σ²/n) ∫_I E K_h²(x − X_0) dG(x) + (2/n²) Σ_{i<j} E ∫_I K_h(x − X_{i−1}) K_h(x − X_{j−1}) dG(x) ε_i ε_j.

The first term is O((nh)⁻¹) by direct calculation. The second term is 0, as is seen by first taking the conditional expectation given σ{X_s : s ≤ j}. Hence

(3.3.3) E M_h(θ_0) = E ∫_I U_n²(x) dG(x) = O((nh)⁻¹).

So the lemma is proved. □

Lemma 3.3.3 Under conditions (S1), (S2), (S3), (K), (A1), and (A4), θ*_n → θ_0, in probability under H_0.

Proof.
The proof is similar to that of Corollary 3.1 in Chapter 2. According to Lemma 3.1 in Chapter 2, it suffices to show that

(3.3.4) ∫_I ( (1/n) Σ_{i=1}^n K_h(x − X_{i−1}) X_i − m_{θ_0}(x) f(x) )² dG(x) = o_p(1).

Note that by plugging in X_i = m_{θ_0}(X_{i−1}) + ε_i, and adding and subtracting E K_h(x − X_{i−1}) m_{θ_0}(X_{i−1}) in the i-th summand of the integrand, the left hand side of (3.3.4) is bounded above by the sum of the following three terms:

(a) ∫_I U_n² dG(x),
(b) ∫_I ( (1/n) Σ_{i=1}^n [ K_h(x − X_{i−1}) m_{θ_0}(X_{i−1}) − E K_h(x − X_{i−1}) m_{θ_0}(X_{i−1}) ] )² dG(x),
(c) ∫_I ( E K_h(x − X_0) m_{θ_0}(X_0) − m_{θ_0}(x) f(x) )² dG(x).

The term (a) is O_p(1/(nh)) by (3.3.3). The term (b) is o_p(1) by Lemma 3.3.1(c) with ψ_n = m_{θ_0}. The term (c) is o(1) because it equals

∫_I ( ∫ K(u)( m_{θ_0}(x − uh) f(x − uh) − m_{θ_0}(x) f(x) ) du )² dG(x) = o(1),

by the continuity of m_{θ_0} and f, and the compactness of I. Hence (3.3.4) holds, and so does the lemma. □

Now we are ready to present the main theorem of this section.

Theorem 3.3.1 Under conditions (M), (S1), (S2), (S3), (K), (A1), and (A4), θ_n → θ_0, in probability under H_0.

Proof. The proof of this theorem is similar to that of Theorem 3.1 in Chapter 2; here we only sketch it. Recall the definition of ρ from Section 2.2 and note that M*_h(θ) = ρ²(r_n, m_θ f), where r_n(x) := n⁻¹ Σ_{i=1}^n K_h(x − X_{i−1}) X_i. By the same argument as in the proof of Theorem 2.3.1, with M^w and M_h^w replaced by M*_h and M_h, it suffices to prove the following result:

(3.3.5) sup_{θ∈Θ} |M_h(θ) − M*_h(θ)| = o_p(1).

To prove (3.3.5), add and subtract n⁻¹ Σ_{i=1}^n K_h(x − X_{i−1}) m_θ(X_{i−1}) inside the parentheses of M*_h(θ), expand the quadratic, and use the Cauchy–Schwarz inequality on the cross product, to obtain that the left hand side of (3.3.5) is bounded above by

sup_{θ∈Θ} C_n(θ) + 2 ( sup_{θ∈Θ} C_n(θ) )^{1/2} ( sup_{θ∈Θ} M_h(θ) )^{1/2},

where

C_n(θ) := ∫_I ( (1/n) Σ_{i=1}^n K_h(x − X_{i−1})( m_θ(X_{i−1}) − m_θ(x) ) )² dG(x).

To prove (3.3.5), it thus suffices to prove that

(a) sup_{θ∈Θ} C_n(θ) = o_p(1), and (b) sup_{θ∈Θ} M_h(θ) = O_p(1).

First we prove (a).
Note that K_h(x − X_{i−1}) is nonzero only if X_{i−1} ∈ 𝒥_1 for large n such that h ≤ 1, where 𝒥_1 := {y ∈ ℝ : |x − y| ≤ 1, x ∈ I}; so C_n(θ) is bounded above by

sup_{|y−x|≤h, x∈I, y∈𝒥_1} |m_θ(y) − m_θ(x)|² ∫_I ( (1/n) Σ_{i=1}^n K_h(x − X_{i−1}) )² dG(x).

As a consequence of Lemma 3.3.1(c) with ψ_n ≡ 1,

(3.3.6) ∫_I ( (1/n) Σ_{i=1}^n K_h(x − X_{i−1}) )² dG(x) = O_p(1).

And

sup_{θ∈Θ} sup_{|y−x|≤h, x∈I, y∈𝒥_1} |m_θ(y) − m_θ(x)| = o(1),

because of the continuity of m and the compactness of Θ and 𝒥_1. Hence sup_{θ∈Θ} C_n(θ) = o_p(1).

Next we prove (b). By plugging in X_i = m_{θ_0}(X_{i−1}) + ε_i, one obtains that M_h(θ) is bounded above by

2 ∫_I U_n²(x) dG(x) + 2 ∫_I Z_n²(x, θ) dG(x).

The first term is o_p(1) by (3.3.3). For large n such that h ≤ 1, the second term is bounded above by

4 sup_{θ∈Θ, y∈𝒥_1} m_θ²(y) ∫_I ( (1/n) Σ_{i=1}^n K_h(x − X_{i−1}) )² dG(x) = O_p(1),

by the continuity of m, the compactness of Θ and 𝒥_1, and (3.3.6). Hence sup_{θ∈Θ} M_h(θ) = O_p(1). So (3.3.5) is proved, and so is the theorem. □

3.4 Asymptotic distribution of √n(θ_n − θ_0)

In this section we prove the asymptotic normality of θ_n. Before that, we introduce some notation that will be used in this section. Define

(3.4.1) ξ_n(x) := E K_h(x − X_0) ṁ_{θ_0}(X_0),   η_n(x) := m_{θ_n}(x) − m_{θ_0}(x) − ṁ_{θ_0}ᵀ(x)(θ_n − θ_0),

and let ξ(x) and η² be as defined in (1.0.10) of Chapter 1. Note that under condition (M), θ_n is a solution of the equation ∂M_h(θ)/∂θ = 0, i.e.

∫_I U_n(x, θ_n) μ_n(x, θ_n) dG(x) = 0.
So the theorem about the asymptotic normality of 9,, follows. Now we start with three lemmas. Lemma 3.4.1 Under the conditions (M), (51), (52), 5(3), (K), (A1), and (A4), there is a function f such that the following hold: lunwo) — doll = 0.0), (a) SUP (b) sup Hm, 9..) — as!) = 0.0). 261' Lemma 3.4.2 Let Z be a real valued continuous function on I. Under the conditions (M), (51), (52), (53), (K), (AI), (A3), and (A4), fi/ (£21041? — Xi—1)l(Xi—1)Ei€n($)) dG(:1:) 78 converges in distribution to a normal random vector with mean zero and covariance matrix given by 0.1/W >m9.< MW 19% WW I The result is also true when {n are replaced by 5. Lemma 3.4.3 Under the conditions (M), (51), (52), (S3), (K), (A1), (A2), (A3), and (A4), (3-4'3lll0n — QOlI‘I/(i‘ ZK’I( x — Xi 1)nn(Xi—1))€($)d0($)= 012(1)- We will state the main theorem of this section. Theorem 3.4.1 Under the conditions (M), (51), (52), (53), (K), (A1), (A2), (A3), and (A4), was, — 60) converges in distribution to a normal random vec- tor with mean zero and covariance matrix 20 172261, where 20, n2 are as defined in (1.0.10). Proof. Note that the right hand side of (3.4.2) can be written as (6,. — 00)R,., where R, is a sum of following terms: R... -—- [fln($a90)fi:($,9nldc($l Rn2 : _/I(%IZIK’1($—Xi-€T)H—6£§%ll)lfln(x,6n)dG($). By Lemma (3.41), Rnl- — fI($ (II)dG( ) + op(1). By Lemma (3.4.3) and Lemma (3.4.1), Rug = op(1). Hence R4. converges in probability to 20. 79 Note that by adding and subtracting Kh(.’L‘ — Xi_1)€($) to the ith summand, the left hand side of (3.4.2) can be written as a sum of the following two terms: L1,. = fl madame), Li. [I Una) 112446..) - 6(4)] 40(4). By Lemma 3.4.2 with l = 1 and En = {, URL), converges in distribution to a normal random vector. By Lemma 3.4.1 and (3.3.3), the term L312 = op((nh)'1/2) . So the left hand side of (3.4.2) is op((nh)‘1/2) and the right hand side of (3.4.2) is (0,, — 60)Rn where Rn converges to 2 in probability. Hence (3.4.4) (6,. — 60) = op((nh)-1/2). 
Next we shall show that (θ_n − θ_0) is actually O_p(1/√n). Note that by adding and subtracting K_h(x − X_{i−1}) ṁ_{θ_0}(X_{i−1}) − ξ_n(x) to the i-th summand, the left hand side of (3.4.2) can also be written as the sum of the following three terms:

L*_{n1} = ∫_I U_n(x) ξ_n(x) dG(x),
L*_{n2} = ∫_I U_n(x) ( μ_n(x, θ_0) − ξ_n(x) ) dG(x),
L*_{n3} = ∫_I U_n(x) ( μ_n(x, θ_n) − μ_n(x, θ_0) ) dG(x).

By Lemma 3.4.2 with l ≡ 1, √n L*_{n1} converges in distribution to a normal random vector. The term L*_{n2} = o_p(1/√n) by the Cauchy–Schwarz inequality, Fubini, Lemma 3.3.1(c), and (3.3.3). The term L*_{n3} = o_p(1/√n) by (3.4.4), (3.3.3), and assumption (A5). Combine the above discussion to conclude that

R_n (θ_n − θ_0) = L*_{n1} + o_p(1/√n).

Hence the theorem holds by Lemma 3.4.2. □

Next we prove the three lemmas.

Proof of Lemma 3.4.1. We will prove a slightly more general form of this lemma: for any continuous function l on I,

(3.4.5) sup_{x∈I} | n⁻¹ Σ_{i=1}^n K_h(x − X_{i−1}) l(X_{i−1}) − l(x) f(x) | → 0, in probability,

where f is the density function of X_0. Because l(x) f(x) is continuous on the compact set I, it is bounded on I. So

sup_{x∈I} | E n⁻¹ Σ_{i=1}^n K_h(x − X_{i−1}) l(X_{i−1}) − l(x) f(x) | = sup_{x∈I} | ∫ K(u)[ l(x − uh) f(x − uh) − l(x) f(x) ] du | ≤ sup_{x∈I, |y−x|≤h} | l(y) f(y) − l(x) f(x) | → 0.

In order to complete the proof of (3.4.5), we still need to show that

(3.4.6) sup_{x∈I} | n⁻¹ Σ_{i=1}^n [ K_h(x − X_{i−1}) l(X_{i−1}) − E K_h(x − X_0) l(X_0) ] | = o_p(1).

Let ζ_n(x) := n⁻¹ Σ_{i=1}^n K_h(x − X_{i−1}) l(X_{i−1}). Consider covering the compact set I ⊆ B = {x ∈ ℝ : |x| ≤ b}, for some b < ∞, by ν_n closed sets B_{jn} = {x : |x − x_{jn}| ≤ b/ν_n}, where 1 ≤ j ≤ ν_n, such that B_{jn}° ∩ B_{kn}° = ∅ for j ≠ k.
'n — nx'n I - ' " —° J J ‘ 6p 8c2 q q eh “2c: Choose 11,, = n, and q = (fr—i/ h, then P( sup ICn(xjn) - E 6) ISjSVn 2 4c 1 1/2 S 411,,eIp(—8£—C2 - nhz+7> + 221/nq (1+ :5) a[h"’/2] 3 c1 ne‘c’ "h2 + c3n3/2h‘1p0" "h2/2 = 0(1), 82 for some positive constants c1, C2, and c3 by conditions (M) and (H) . Hence (3.4.7) is 0,,(1), so is (3.4.6). this also completes the proof of (3.4.5). By taking l = mgo in (3.4.5), then part (a) of the lemma is proved. To prove part (b) of the lemma, it suffices to prove that (34.8) 3;; Mutual — Mrflohl = 010(1)- Because for large n such that h S 1, Kh($ — X44) is nonzero only if )(.--1 E .71, so by the continuity of mg, compactness of 9 and .71, and the consistency of 0”, sun Handy) - m00(y)ll = 012(1)- 116.71 Apply (3.4.5) with l(I) = 1, one obtains that sup n’1 2 Kh(:t‘ — Xi_1) = 0,,(1). IEI i=1 Hence (3.4.8) is bounded above by sup 11min) — 614(4)” W 2: K44 — X.-.) = 0.41). 1163: 2:61 i=1 That completes the proof of the part (b) of the lemma. [:1 We are going to apply the Martingale Central Limit theorem, i.e. Corollary 3.1 of Hall and Heyde (1989) to prove Lemma 3.4.2. For the sake of completeness, we state the corollary here as a lemma: Lemma 3.4.4 Suppose Snkn = 2;, X"). and (19,14,314) is a zero-mean, square integrable martingale array with differences X7125 and n2 is an as finite random 83 variable. If {Xm} satisfy the following conditions: (a) V5 > 0, Z E[X:,I{lxn,l>5}|7n.i_1] ——> 0, in probability. i=1 (b) =ZE(X:|,.7m-_1) -—>r)2, in probability. (C) 95...- C 35.4.1... for i S i S kmn 21. Then, Snkn converges in distribution to a normal random variable with mean zero and variance n2. Proof of Lemma 3.4.2. W.L.O.G, here only gives the proof for the case that 9 is one dimension. We will construct a martingale array and verify the three conditions of the Lemma 3.4.4. Define j Snj = Zn—l/Q/IKh($_Xi—1)l(Xi—1)€n($)dG($)€iv 7n.) 
= σ{X_0, X_1, …, X_j, ε_1, …, ε_j}.

Then {S_{nj}, F_{nj}} is a zero-mean, square integrable martingale array with F_{nj} ⊆ F_{n+1,j} and differences X_{ni} = n^{−1/2} ∫_I K_h(x − X_{i−1}) l(X_{i−1}) ξ_n(x) dG(x) ε_i. So condition (c) holds. For any λ > 0 and c > 0,

(3.4.9) Σ_{i=1}^n E[ X_{ni}² 1{|X_{ni}| > λ} | F_{n,i−1} ] ≤ λ^{−c} Σ_{i=1}^n E[ |X_{ni}|^{2+c} | F_{n,i−1} ].

Because ξ_n(x) = E K_h(x − X_0) ṁ_{θ_0}(X_0) = ∫ K(u) ṁ_{θ_0}(x − uh) f(x − uh) du, the kernel function K has bounded support, and ṁ_{θ_0} and f are continuous, ξ_n is bounded uniformly in n at x ∈ I; suppose the bound is B_ξ. Furthermore, note that

(3.4.10) ∫_I K_h(x − X_i) |ξ_n(x)| dG(x) ≤ B_ξ ∫ K(u) g(X_i + uh) du ≤ B_ξ sup g < ∞.

So, by the stationarity of the X_i and the definition of F_{n,i−1}, (3.4.9) is bounded above by

λ^{−c} n^{−c/2} ( B_ξ sup g )^{2+c} E|ε_1|^{2+c} = λ^{−c} n^{−c/2} C = o(1),

for some constant C. Hence condition (a) holds. For condition (b), note that

Σ_{i=1}^n (1/n) E ( ( ∫_I K_h(x − X_{i−1}) l(X_{i−1}) ξ_n(x) dG(x) )² σ² )
= E ( ∫_I K_h(x − X_0) l(X_0) ξ_n(x) dG(x) )² σ²
= ∫_I ∫_I E( K_h(x − X_0) K_h(y − X_0) l²(X_0) ) ξ_n(x) ξ_n(y) dG(x) dG(y) σ²
→ σ² ∫_I l²(x) ṁ_{θ_0}²(x) f³(x) g²(x) dx.

Let V_{ni} denote ( ∫_I K_h(x − X_{i−1}) l(X_{i−1}) ξ_n(x) dG(x) )² σ². Note that V_{ni} is bounded uniformly in n and i. Then

(3.4.11) Var( n⁻¹ Σ_{i=1}^n V_{ni} ) = n⁻² Σ_{i=1}^n Var(V_{ni}) + n⁻² Σ_{i≠j} Cov(V_{ni}, V_{nj}) → 0,

by the uniform boundedness of the V_{ni}, Lemma 3.2.1, and condition (M). Hence condition (b) holds, and Lemma 3.4.2 is proved. □

Proof of Lemma 3.4.3. By (A2) and the consistency of θ_n, one obtains

(3.4.12) max_{0≤i≤n−1} |η_n(X_i)| / ||θ_n − θ_0|| = o_p(1).

Similar to the proof of (3.4.10), the ∫_I K_h(x − X_i) |ξ(x)| dG(x) are bounded uniformly in i for h ≤ 1; hence

∫_I K_h(x − X_i) |ξ(x)| dG(x) |η_n(X_i)| / ||θ_n − θ_0|| = o_p(1), uniformly in i.

So (3.4.3) is also o_p(1). □

3.5 Asymptotic behavior of the minimum distance

In Chapter 2 it was proved that, under the i.i.d. setup, the standardized minimum distance is asymptotically normally distributed at the rate nh^{1/2}. In this section we will show that the same result is also true when the observations come from a stochastic process satisfying a GSM condition. This result can be seen from the following three propositions. Before presenting the propositions, define ε̃_i := X_i − m_{θ_n}(X_{i−1}), i = 1, …, n.

Proposition 3.5.1 Under the conditions (M) to (H),

M_h(θ_n) − M_h(θ_0) = o_p( 1/(n h^{1/2}) ).

Proposition 3.5.2 Under the conditions (M) to (H),

(1/n²) Σ_{i=1}^n ∫_I K_h²(x − X_{i−1}) dG(x) ( ε̃_i² − ε_i² ) = o_p( 1/(n h^{1/2}) ).

Proposition 3.5.3 Under the conditions (M) to (H), n h^{1/2} U_n converges in distribution to N(0, Γ²).

In the proof of Proposition 3.5.2, the left hand side is decomposed into the terms (a) and (b) of (3.5.2). The first term of (a) is o_p(1/(n h^{1/2})) by taking the expectation of the summation and Theorem 3.4.1. Similar to the argument for the first term in (3.5.1), the second term of (a) is o_p(1/(n h^{1/2})). Hence the first part of (3.5.2) holds. Using the same technique and a similar argument as above, one obtains that the term (b) can be written as the sum of the following two terms:

−2 (θ_n − θ_0)ᵀ n⁻² Σ_{i=1}^n ∫_I K_h²(x − X_{i−1}) dG(x) ṁ_{θ_0}(X_{i−1}) ε_i,
−2 n⁻² Σ_{i=1}^n ∫_I K_h²(x − X_{i−1}) dG(x) η_n(X_{i−1}) ε_i.

By taking the expectation of the absolute value of the integral and using Theorem 3.4.1, the first term is o_p(1/(n h^{1/2})); similarly, the second term is also o_p(1/(n h^{1/2})). Hence the second part of (3.5.2) holds, and so does the proposition. □

Now we prove Proposition 3.5.3.

Proof of Proposition 3.5.3. Define

φ_{ij} := ∫_I K_h(x − X_{i−1}) K_h(x − X_{j−1}) dG(x) ε_i ε_j,   V_{nj} := (2/n²) Σ_{i=1}^{j−1} φ_{ij},   U_n := Σ_{j=2}^n V_{nj}.

Then U_n is a sum of martingale differences. In order to apply the M.G.C.L.T. to prove Proposition 3.5.3, one needs to check the following three conditions:

(C1) Var(U_n) = Σ_{j=2}^n E V_{nj}² =: σ_n²,
(C2) σ_n⁻² Σ_{j=2}^n E( V_{nj}² | F_{n,j−1} ) → 1, in probability,
(C3) σ_n⁻² Σ_{j=2}^n E[ V_{nj}² 1{|V_{nj}| > ε σ_n} | F_{n,j−1} ] → 0, in probability, for every ε > 0.

The proof of Proposition 3.5.3 is broken down into four lemmas.

Lemma 3.5.3 Under the conditions (M) to (H),

(3.5.3) E ( (h/n³) Σ_{j=2}^n Σ_{i≠l<j} φ_{ij} φ_{lj} )² = o(1).

For indices i_1, …, i_k, write

(3.5.4) K_{i_1,…,i_k}(x, y, s, t) := Π_{l=1}^k K_h(w_l − X_{i_l−1}) ε_{i_l}^{p_l},

where each w_l is one of x, y, s, t and each p_l is either 1 or 2. Then (3.5.3) can be written as

(3.5.5) (h²/n⁶) Σ_{ν=3}^6 c_ν Σ_{F_ν} E φ_{ij} φ_{lj} φ_{rs} φ_{ts} = c_3 A_3 + c_4 A_4 + c_5 A_5 + c_6 A_6, say,

for some constants c_3, c_4, c_5, and c_6, where F_ν denotes the set of index tuples with exactly ν distinct indices. In order to prove the lemma, it suffices to show that A_ν = o(1), for ν = 3, 4, 5, 6.
When ν = 3 or 4, by (3.5.4) and (3.5.5),

A_3 = O( (h²/n⁶) · n³ · h⁻⁴ ) = O( 1/(n³ h²) ) = o(1),   A_4 = O( (h²/n⁶) · n⁴ · h⁻⁴ ) = O( 1/(n² h²) ) = o(1).

So we only need to show that A_5 = o(1) and A_6 = o(1). Define

τ := min{ j ≤ 5 : d_j = i_{l+1} − i_l and E K_{i_1,…,i_l}(·, ·, ·, ·) = 0 for some l }.

It is seen from this definition that on F_ν,

(3.5.6) τ ≤ 8 − ν.

Next we show that

(3.5.7) A_ν = O( 1/(n^{4−τ} h²) ), ν = 5, 6.

Suppose d_τ = i_{l+1} − i_l for some l. On F_ν, ν = 5, 6, the summand in (3.5.5) is equal to

∫_{I×I×I×I} Cov( K_{i_1,…,i_l}(x, y, s, t), K_{i_{l+1},…,i_ν}(x, y, s, t) ) dG(x) dG(y) dG(s) dG(t).

By Lemma 3.2.1, the above term is bounded above by

(3.5.8) ∫_{I×I×I×I} 2p [2α(d_τ)]^{1/p} ||K_{i_1,…,i_l}||_q ||K_{i_{l+1},…,i_ν}||_r dG(x) dG(y) dG(s) dG(t),

for any p, q, r > 1 with 1/p + 1/q + 1/r = 1, where ||K_{…}||_q := (E |K_{…}(x, y, s, t)|^q)^{1/q}. By the usual calculation, (3.5.8) is bounded above by const · p [α(d_τ)]^{1/p} h⁻⁴. By taking q = r = 5 and summing over the index tuples (at most n^{ν−1} of them after summing over d_τ, using the geometric decay of α), A_ν is bounded above by

const · (h²/n⁶) n^{ν−1} Σ_{d≥1} [α(d)]^{3/5} h⁻⁴ = O( 1/(n^{7−ν} h²) ), ν = 5, 6,

which together with (3.5.6) yields (3.5.7); in either case A_ν = o(1) by condition (H). The lemma is therefore proved. □

Proof of Lemma 3.5.4. Because E φ_{ij} = 0 for any i ≠ j, E V_{nj} V_{nl} = 0 for j ≠ l, and E U_n = 0. Hence

(3.5.9) Var(U_n) = E U_n² = E ( Σ_{j=2}^n V_{nj} )² = Σ_{j=2}^n E V_{nj}² = (4/n⁴) Σ_{i<j} E φ_{ij}²,

and, writing K_i(x, y) := K_h(x − X_{i−1}) K_h(y − X_{i−1}),

E φ_{ij}² = σ⁴ ∫_{I×I} E( K_i(x, y) K_j(x, y) ) dG(x) dG(y),

which splits into (a) the covariance term σ⁴ ∫∫ Cov( K_i(x, y), K_j(x, y) ) dG(x) dG(y) plus the main term σ⁴ ∫∫ ( E K_h(x − X_0) K_h(y − X_0) )² dG(x) dG(y). By Lemma 3.2.1, the term (a) is bounded above by

(3.5.10) const · 2p [2α(j − i)]^{1/p} ∫_{I×I} ||K_i(x, y)||_q ||K_j(x, y)||_r dG(x) dG(y),

for any p, q, r > 1 with 1/p + 1/q + 1/r = 1. Taking p = q = r = 3, (3.5.10) is bounded above by

(3.5.11) const · [α(j − i)]^{1/3} h^{−4/3}.

Hence the total contribution of the covariance terms is bounded above by

const · n⁻³ h^{−4/3} Σ_{k≥1} [α(k)]^{1/3} = o( (n² h)⁻¹ ).

On the other hand, by direct calculation,

h ∫_{I×I} ( E K_h(x − X_0) K_h(y − X_0) )² dG(x) dG(y) → ∫_I f²(x) g²(x) dx ∫ ( ∫ K(u) K(u+v) du )² dv.

This together with (3.5.9) as well as Lemma 3.5.3 implies that

n² h · Var(U_n) = n² h · σ_n² = n² h Σ_{j=2}^n E V_{nj}² → 2σ⁴ ∫_I f²(x) g²(x) dx ∫ ( ∫ K(u) K(u+v) du )² dv = Γ².

So Lemma 3.5.4 is proved. □

Proof of Lemma 3.5.5. Note that

(3.5.12) n⁴ h² Σ_{j=2}^n E V_{nj}⁴ = o(1),

from which the Lindeberg condition (C3) follows. □

The following exponential inequality (cf. Bosq (1998)) is needed next. Let {ξ_i} be zero-mean random variables with |ξ_i| ≤ b. Then for each ε > 0 and n ≥ 2,

P( | Σ_{i=1}^n ξ_i | > nε ) ≤ c_1 q exp{ −n ε²/(16 b² p) } + c_2 q² (1 + 4b/ε)^{1/2} α(p),

for some constants c_1, c_2 > 0 and any 1 ≤ q ≤ [n/2], where p = [n/(2q)].

Proof.
First consider blocking. Let p be an integer between 1 and n, and let q = [n/(2p)] + 1. Partition {1, …, n} into consecutive blocks of length at most p and define the block sums V_i^{(1)} (over the odd blocks) and V_i^{(2)} (over the even blocks), i = 1, …, q, so that Σ_j ξ_j = Σ_i V_i^{(1)} + Σ_i V_i^{(2)}. For every ε > 0,

(3.5.13) P( | Σ_{j=1}^n ξ_j | > nε ) ≤ Σ_{k=1,2} P( | Σ_{i=1}^q V_i^{(k)} | > nε/2 ).

But by recursively using Bradley's Lemma 1.2 in Bosq (1998), there are independent random variables W_1^{(k)}, …, W_q^{(k)}, such that W_i^{(k)} and V_i^{(k)} have the same distribution and

(3.5.15) P( | W_i^{(k)} − V_i^{(k)} | > λ ) ≤ 11 ( ( ||V_i^{(k)}||_∞ + c ) / λ )^{1/2} α(p),

for any 0 < λ ≤ ||V_i^{(k)}||_∞ + c. Hence

(3.5.16) P( | Σ_i V_i^{(k)} | > nε/2 ) ≤ P( | Σ_i W_i^{(k)} | > nε/2 − qλ ) + P( ∪_i { | W_i^{(k)} − V_i^{(k)} | > λ } ).

Choose c := δ b p with δ > 1 and λ := min( nε/(4q), (δ − 1) b p ). Then ||V_i^{(k)}||_∞ + c ≤ (δ + 1) b p and ||V_i^{(k)}||_∞ + c ≥ c − ||V_i^{(k)}||_∞ ≥ (δ − 1) b p > 0, so 0 < λ ≤ ||V_i^{(k)}||_∞ + c. Hence, in view of (3.5.15), (3.5.16) is bounded above by

(3.5.17) P( | Σ_{i=1}^q W_i^{(k)} | > nε/4 ) + 11 q ( (δ + 1) b p / λ )^{1/2} α(p).

Choose δ such that (δ − 1) b p = nε/(4q); then (3.5.17) is bounded above by

(3.5.18) P( | Σ_{i=1}^q W_i^{(k)} | > nε/4 ) + 11 q (1 + 4b/ε)^{1/2} α(p).

But by applying Hoeffding's inequality to the independent, bounded W_i^{(k)} (|W_i^{(k)}| ≤ bp), one obtains

P( | Σ_{i=1}^q W_i^{(k)} | > nε/4 ) ≤ 2 exp{ −2(nε/4)² / ( q (2bp)² ) } ≤ 2 exp{ −n ε²/(16 b² p) }.

Hence (3.5.18) is bounded above by 2 exp{−nε²/(16b²p)} + 11 q (1 + 4b/ε)^{1/2} α(p), so (3.5.13) is bounded above by

16 · q · exp{ −n ε²/(16 b² p) } + 88 · q² (1 + 4b/ε)^{1/2} α(p).

The theorem is thus proved. □

We will apply the above inequality to prove Lemma 3.5.6.

Proof of Lemma 3.5.6. We will show that for any λ > 0,

P( n² h | Σ_{j=2}^n ( E(V_{nj}² | F_{n,j−1}) − E V_{nj}² ) | > λ ) = o(1),

which yields (C2). The summands involved are bounded, so the exponential inequality above applies: choosing q = q_n so that q_n → ∞ and n/(2q_n) = O(n^β) → ∞ as n → ∞, for some 0 < β < 1, both terms in the resulting bound (3.5.20) tend to 0 by condition (M). Therefore the proof of Lemma 3.5.6 is complete. □

Chapter 4

Simulations

This chapter contains a simulation study comparing three tests. More precisely, let {X_t, t = 0, ±1, ±2, …} be a stationary stochastic process satisfying X_t = μ(X_{t−1}) + ε_t, where the {ε_t} are i.i.d. random variables with mean zero and ε_t is independent of X_{t−1}, for all t. The parametric family of functions to be fitted to μ is chosen to be m_θ(x) = θx, x ∈ ℝ, θ ∈ ℝ, with θ_0 = 0.8. That is, the hypothesis to be tested is

H_0 : μ(x) = 0.8x, vs. H_1 : H_0 is not true.

We chose the following three models to generate simulated data from:

model 1. X_{t+1} = 0.8X_t + ε_{t+1},
model 2. X_{t+1} = (0.8 − 1.2 exp(−X_t²)) X_t + ε_{t+1} + 0.1,
model 3. X_{t+1} = 0.8X_t + 0.5(X_t − 0.5)² − 0.3(X_t − 0.5)³ + ε_{t+1}.

The error distribution is either N(0, 0.1) or double exponential. The sample sizes chosen are 50, 100, 200, and 500. The three different tests are those of Koul and Stute (1999), denoted by KS; An and Cheng (1991), denoted by AC; and the minimum distance test of Chapter 3, denoted by MD. The nominal level used to implement the tests is α = 0.05. There are 1000 replications for each combination of (model, sample size, error distribution).
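The three data-generating models can be simulated directly. One caveat: the thesis writes the error law as N(0, 0.1) without saying whether 0.1 is the variance or the standard deviation; the sketch below treats it as the variance (an assumption), and it discards a burn-in period so that the retained sample is approximately stationary.

```python
import numpy as np

def simulate(model, n, rng, burn=500):
    """Generate n observations from simulation models 1-3 with N(0, 0.1) errors."""
    sd = np.sqrt(0.1)                    # treating 0.1 as the error variance
    x = 0.0
    out = np.empty(n)
    for t in range(-burn, n):
        e = sd * rng.standard_normal()
        if model == 1:
            x = 0.8 * x + e
        elif model == 2:
            x = (0.8 - 1.2 * np.exp(-x * x)) * x + e + 0.1
        else:
            x = 0.8 * x + 0.5 * (x - 0.5) ** 2 - 0.3 * (x - 0.5) ** 3 + e
        if t >= 0:
            out[t] = x
    return out

rng = np.random.default_rng(17)
x1 = simulate(1, 20000, rng)
x2 = simulate(2, 2000, rng)
# Model 1 is a stationary AR(1); Var(X) = 0.1 / (1 - 0.8^2) = 0.2778
print(round(float(x1.var()), 2), bool(np.isfinite(x2).all()))
```

Model 1 serves as the null model (μ(x) = 0.8x exactly), while models 2 and 3 deviate from linearity and drive the power comparisons.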
Data from model 1 are used to study the empirical size, and data from models 2 and 3 are used to study the empirical power of these tests. The empirical size (power) is computed as the

relative frequency of { value of the test statistic > F⁻¹(1 − α) },

where F is the asymptotic distribution of the test statistic under H_0. The steps to compute the test statistics are as follows. Let X_{(0)} ≤ ⋯ ≤ X_{(n)} denote the ordered X_0, X_1, …, X_n.

1. Koul and Stute test:

Step 1: Compute the least squares estimate of θ_0 under H_0: θ̂_lse = Σ_i X_{i−1} X_i / Σ_i X_{i−1}².

Step 2: Compute V_n(X_i), i = 1, 2, …, n, where

V_n(x) = n^{−1/2} Σ_{i=1}^n ( X_i − θ̂_lse X_{i−1} ) I(X_{i−1} ≤ x), x ∈ ℝ.

Step 3: Compute A_n(X_{(i)}) for X_{(i)} ≤ x_0, where x_0 is the 99th percentile of the sample,

A_n(x) = ∫ y² I(y > x) G_n(dy) = n⁻¹ Σ_{i=1}^n X_{i−1}² I(X_{i−1} > x),   G_n(x) = n⁻¹ Σ_{i=1}^n I(X_{i−1} ≤ x).
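Steps 1 and 2 above can be sketched directly. The code below implements the least squares estimate and the partial-sum process V_n as written in the text; the AR(1) data-generation details (seed, sample size, error scale) are demo choices.

```python
import numpy as np

def ks_ingredients(x):
    """Least squares estimate (Step 1) and partial-sum process V_n (Step 2)."""
    lag, resp = x[:-1], x[1:]
    theta_lse = float(np.sum(lag * resp) / np.sum(lag * lag))
    resid = resp - theta_lse * lag
    def V_n(t):
        return float(np.sum(resid * (lag <= t)) / np.sqrt(len(resid)))
    return theta_lse, V_n

rng = np.random.default_rng(19)
n = 5000
x = np.empty(n + 1)
x[0] = 0.0
for t in range(1, n + 1):                   # model 1 under H0: theta0 = 0.8
    x[t] = 0.8 * x[t - 1] + np.sqrt(0.1) * rng.standard_normal()
theta_lse, V = ks_ingredients(x)
print(round(theta_lse, 2), np.isfinite(V(0.0)))
```

Under H_0 the estimate lands close to θ_0 = 0.8, and the KS statistic is then a functional of the process V_n evaluated along the ordered lagged observations.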