WIWIN!!!“NHIHIIHHUIHllIllM'JIWIWHHWI l ._\ —I (O 200 I _ (D “AS nu IBRARIES ll‘llllll‘lllllll llllllllll This is to certify that the dissertation entitled Weak Convergence of Weighted Empirical Processes Under Long Range Dependence with Applications to Robust Estimation in Linear Models presented by Kanchan Mukherjee has been accepted towards fulfillment of the requirements for PhD. degree in SLan§LjCS {41/94 fl Ma jorfiessor [fine July 12, 1993 MSU is an Affirmative Action/Equal Oppormm'ty lnsn'mn'on 0- 12771 LIBRARY Michlgan State University PLACE IN RETURN BOX to remove this checkout from your record. TO AVOID FINES return on or before date due. DATE DUE DATE DUE DATE DUE MSU Is An Affirmative Action/Equal Opportunity Institution oddlcm‘pmS-pn WEAK CONVERGENCE OF WEIGHTED EMPIRICAL PROCESSES UNDER LONG RANGE DEPENDENCE WITH APPLICATIONS TO ROBUST ESTIMATION IN LINEAR MODELS By Kanchan Mukherjee A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Statistics and Probability 1993 ABSTRACT WEAK CONVERGENCE OF WEIGHTED EMPIRICAL PROCESSES UNDER LONG RANGE DEPENDENCE WITH APPLICATIONS TO ROBUST ESTIMATION IN LINEAR MODELS By Kanchan Mukherjee A discrete time stationary stochastic process is said to be long range dependent if its covariances decrease to zero like a power of lag as the lag tends to infinity but their absolute sum diverges. In this dissertation, a uniform closeness result of weighted residual empirical process to its natural estimator is derived under the LRD setup. These are then used to prove the asymptotic uniform linearity of a class of linear rank statistics and the asymptotic uniform quadraticz'ty of a class of L2- distance statistics. These results, in turn, are applied to investigate the asymptotic behavior of the above estimators in a linear regression model when the errors are function of LRD Gaussian random variables. Some intriguing phenomena are observed in connection with the inherent nature of the limiting distributions of the above estimators. Unlike the weakly dependent case the limiting distributions may not be normal always. Moreover, when the errors are LRD Gaussian and the design matrix is centered, the asymptotic covariances of the class of rank and minimum distance estimators become those of the least square estimator- a phenomenon which is in complete contrast with the i.i.d. error case. Similar statement applies to LAD and a large class of M- estimators. These results are proved under some conditions on the design matrix that are very similar to those under the i.i.d. setup The dissertation also considers the asymptotic behavior of regression quantiles and regression rank scores which are natural generalization of the notion of order statistics and regression rank scores processes from the one sample model to the linear model. Under the LRD error setup the aforementioned uniform closeness result is used to obtain the asymptotic representations of regression '- .I‘J . _ -. lmflh’hwmmd—va-dwmhflal i «do! We. cumming-Inuit and nth. tic-u throughout my m a ‘ Michigan State Univrlsitv. lira mun-n...- rlwlh‘atiou and contributim to I“ nuk- in the area at moment-Mk \i o L... l 0’ ’ " .1 fa.” :4 -.. have be?!) my “shill Shut-1‘ satisfies. ’. ”atrium and l I I'b'vtflil'l fI-‘Illlll Linn. ‘ , , .f:'~“'l':..‘ ".Hlifll To my parents and grandparents ACKNOWLEDGMENTS I would like to express my deepest regard to Professor Hira Lal Koul for his careful guidance, encouragement and affection throughout my graduate study at Michigan State University. His erudition, dedication and contribution to statistics have been my main source of inspiration to work in the area of nonparametric statistics. I would like to thank Professors Joseph Gardiner, James Hannan and Habib Salehi for serving on my thesis committee and for their suggestions leading to the improvement of my dissertation. 1 cure my special thanks to Professors Dennis Gilliland, Connie Page and James Stapleton from whom I learned many aspects of good teaching and statistical consulting. The dedication of my parents and grandparents to the pursuit of my higher education is greatly appreciated. The support and encouragement of my sisters, Baishakhi and Mousumi, my friend Dr. Suman Majumdar, and his wife .layasri during my doctoral study were invaluable. The careful guidance provided by my teachers at UPANISHAD during the early stage of my education deserves special mention. TABLE OF CONTENTS Chapter 0. Introduction 1. Preliminaries 1.1 Introduction 1.2 Uniform closeness of weighted empiricals 1.3 Proofs 2. Rank estimation 2.1 Introduction 2.2 Asymptotic uniform linearity of linear rank statistics 2.3 Proofs 3. Minimum distance estimation 3.1 Introduction 3.2 Asymptotic representation of minimum distance estimators 4. Regression quantiles and related processes 4.1 Introduction 4.2 Theorems and proofs 4.3 L-estimators and regression rank scores statistics 4.4 Asymptotic uniform linearity of linear regression rank-scores statistics. 5. Bibliography vi Page 24 24 26 29 33 33 35 42 42 45 54 57 61 CHAPTER 0 INTRODUCTION A discrete time stationary stochastic process is called a long range dependent (LRD) or a long memory process if its correlations decay to zero like a. pOWer of lag as the lag tends to infinity but their absolute sum diverges. Quite often, econometric and time series data appears to be stationary and exhibit strong correlation between observations separated by large lag which does not decay to zero at a fast enough rate to be absolutely summable. Similar phenomena have been observed in hydrology in connection with the construction of the Aswan dam over the river Nile, Egypt, when hydrologist Hurst noticed that the annual volume of river-flow shows long term behavior over time (Mandelbrot and Van Ness, 1966). Mandelbrot and Van Ness proposed fractional gaussian noise to model observations with such strong correlation. Later, Granger and Joyeux (1980) and Hosking (1981) independently came up with fractional ARIMA model to include more processes with non-gaussian marginal distributions. The salient features of these processes are that their spectral density diverges at zero and their correlations are not absolutely summable and that creates considerable mathematical difficulties for its analysis. The usefulness of LRD processes in modeling a wide variety of physical phenomena heralded an upsurge of interest among many researchers who explored different probabilistic aspects of LRD processes in the last two decades. Investigation of the behavior of different statistics and estimators based on LRD observations are also carried out by some authors. Taqqu (1975) obtained the weak convergence results of partial sum processes based on random variables (r.v.’s) that are a measurable function of LRD Gaussian r.v.’s.Taqqu characterized 2 the limiting distribution of the partial sum process for r.v.’s having Hermite rank (see Remark 1.2.1) one and two. Later, Dobrushin and Major (1979) and Taqqu (1979) independently characterized the limiting process (called Hermite process) for r.v.’s having arbitrary Hermite ranks through a multiple Weiner-Ito integral representation. It was observed that the limiting process is non-gaussian if its Hermite rank is more than unity. Along with these technical results, parallel research has proliferated on the estimation of some parameters describing the correlation structure. Fox and Taqqu (1986) and Yajima (1985) proposed maximum likelihood estimation of the index of the LRD 0 (see (1.12)) based on LRD Gaussian sequence and Yajima (1985) considered the least square estimation (l.S.e.) of 9 based on LRD r.v.’s. In linear regression model with LRD errors, Yajima (1988, 1991) obtained the strong consistency and the asymptotic distribution of the l.S.e. of the regression parameters under some conditions on the cumulants of the marginal error distribution function. Koul (1992a) derived the asymptotic uniform linearity (AUL) 0f M-statistic and limiting representation of normalized M-estimators in a regression model when the errors are a function of LRD Gaussian r.v. and the score function is absolutely continuous. Motivated by the seminal work of Koul (1992a), in this dissertation we derive the asymptotic representation of some more robust estimators of the regression parameters in a linear regression model when the errors are a function of LRD Gaussian r.v.’s. In particular, we consider the behavior of a class of rank estimator (R—estimator) proposed by Jureckova (1969) and Jaeckel (1972), and minimum distance estimators (m.d.e.) proposed by Koul and Dewet (1983) and 3 Koul (1985b). Finally, we also consider regression quantiles (RQ) proposed by Koenker and Basset (1978) of which the special case is the least absolute deviation estimator (LAD). The investigation of the behavior of robust estimators based on dependent observations started with the work of Gastwirth and Rubin (1975). Gastwirth and Rubin studied the behavior of R—, M— and L-estimators in a location model under A-mixing errors. Koul (1977) generalized their results to linear regression model with strongly mixing errors which contains A-mixing class. In all of the above weakly dependent cases the correlations are absolutely summable and thus the effect of dependency becomes relegated, at least asymptotically. Consequently, the limiting distributions of the suitably normalized estimators is Gaussian as in the case of independent identically distributed (i.i.d.) errors. But, in the case of the LRD errors, the limiting distributions of these estimators are quite different from their weakly dependent counterparts in two fundamental ways. First of all, the normalizing factors are different and secondly the limiting distribution is not always normal. For more on these see Remark 2.2.1. The crux of proving AUL theorems in linear models is unified in the work of Koul (1991, 1992) in the form of a uniform closeness theorem for a weighted residual empirical process to its natural estimate. Hence the fundamental tool for proving most of the results in this dissertation is the, uniform closeness result of weighted residual empirical process to its natural estimate in a linear regression setting when the errors are a function of LRD Gaussian r.v. The technical difficulties for proving the uniform closeness result in this setup is surmounted by using a modification of an ingenious chaining argument .. .a 4 of Dehling and Taqqu (1989). Dehling and Taqqu came up with a chaining argument to prove the uniform weak reduction principle of ordinary empirical process. Theorem 3.1 of Dehling and Taqqu (1989) obtains an upper-bound for k _ _ Pl ..,. Ira‘En ”Quesx) - Fm - Jm(x> (m!) ‘Hmoim > 6], kgn, xEI 121 which converges to zero V6 > 0. Here m, Jm, Tn and Hm are defined in Section 1.2. In Theorem 1.2.1 of this dissertation, we invoke a similar chain to prove that - n - Pl :2Pll7nli§l7ni{1(5i 5X +6110 ‘ le + Eni) " Jmlx 'l‘ Enilim!) lle(7li) > 6i converges to zero. Then, this result is used to derive the uniform closeness result. In other words, we obtain a partial generalization of the. uniform weak reduction principle of Dehling and Taqqu from the ordinary empirical process to a very general weighted empirical processes with nonzero 5 lgi $11. This partial ui’ generalization allows us to prove the uniform closeness result, which along with some other results in Chapter 1, is used to obtain the asymptotic representations of the rank estimators, minimum distance estimators and regression quantiles. As a byproduct of Theorem 1.2.1, we also obtain the weak convergence results of weighted empiricals. In our case the novelty in proving Theorem 1.2.1 lies in choosing the chain suitable for weighted empirical process which, of course, reduces to the Dehling and Taqqu chain when the weighted empirical reduces to the ordinary empirical. Notation. In this dissertation. I(A) denotes the. indicator function of an event A. The index i in the summation varies from 1 to n unless specified t otherwise. For a vector u E R", u ( null ) denotes its transpose (Euclidian norm). If D is an nxp matrix, then diii’ 1 g i _<_ n, denotes its ith row and d.j’1 g j 5 p, its Warm constants whoa: 3:5 row is x3“. 2 "Haifa 09¢ olnerves 91w respun-u- \‘1I31.¢i>l,r {Y (1.1} .j, irli' ifiy: ‘..",’ j! . .,. {'Uldluz 4.741. tlu' linen R'C‘thlhat'fi'a‘ '. (I c", »"‘ "M‘Jmlbmf‘m Wfi‘wi ‘3" 1:.“ Mkémuflémwfimnk‘hfiaflhxmfieifl - Cafidualiww , lgigu, satisfying. I r) . RD 1. m be robust in anemia: a {3.111in of . «mm . we describe“ the ' ‘v-d ilxrnum “1‘ CHAPTERI PRELIMINARIES 1.1. Introduction. We consider the following multiple linear regression model in this dissertation. Let (ni, i2 1} be a stationary, mean zero, unit variance Gaussian process with correlation p(k) :2 E(n11]1+k), k _>_ 1. Suppose 6i := G(ni), i2 1, where G is a measurable function from R1 to R,1 and let X denote the n x p design t matrix of known constants whose ith row is xni, l _<_ i 5 n. Consider a linear model where one observes the response variable {Yni}, 1 5 i g n, satisfying, (1.1) Yni = xxtfifi + 6i, 1 g i g n, for some fl 6 RP. The long range dependence of the r.v.’s {111} is implied by assuming that for some 0 < 0 < 1, (1.2) pm = k-9 L(k). k 31, where L(k) is a slowly varying function at infinity, i.e., L(tx)/L(t)——> l as t—>oo for every X > 0. We assume that L(k) is positive for large k . From Lemma VIII.8 of Feller (1968, vol 2), it follOWS that 337(k) z 00. Examples of such functions L are 1 positive constants or L(k) :: log k. In this dissertation, we derive the asymptotic representation of some familiar estimators of the regression parameter [1 that are known to be robust in the linear models with independent errors. In particular, we consider a family of R-estimators, m.d.e., and regression quantiles. In this chapter, we describe the basic probabilistic results and their proofs that are needed throughout the rest of this dissertation. The technique of obtaining asymptotic representation of suitably normalized estimators defined as a solution of a system of equations goes back to Cramer (1946). The basic idea is to derive an asymptotic Taylor type expansion of a suitable score function and to ensure the existence of stochastically bounded solutions. Therefore using the same technique with suitable modifications one can obtain the asymptotic representation of R-estimators and M-estimators, defined as a solution of a system of equations. In linear regression models, different authors have used different techniques to derive a Taylor type expansion, see, e.g., Koul (1969) and Jureckova (1971) for R-estimators, among others. Koul (1991) envisaged a unified approach to these problems as a consequence of uniform closeness of some weighted residual empirical processes to its natural estimate, centered at its expectation. Using uniform. closeness and the smoothness of the error distribution function one can obtain an one step Taylor Type Expansions of the nonsmooth empirical processes and asymptotic uniform linearity is a consequence of that. 1.2. Uniform Closcness of Weighted Empiricals. Let 6 (n) have the same distribution as that of the marginal distribution of {€i, izl} ({ni, i21}). Let F be the distribution function of c and I := (x: 0_ 1. 8 A quick proof of the last fact with j¢k can be given as follows. Suppose that q < r and write 775 = pnk + 71*, where p:= p(j - k). Observe that Hq(pnk + rf") is a polynomial of degree q in 77k with coefficients involving 17*. Since 77k and 17* are independent the result follows from E nkH (17k) = 0 V0 3 c0 a.e. (Lebesgue) on I, (A.6) The functions Jm and .ll‘lf, are continuously differentiable with respective . . I +I I +I derivatives .1", and Jm. Moreover, .lm(x) and .lm(x) converge to zero as x converges to c 2: ian and d := sup I. Then, fi 10 (2'5) xsuepllsnbcll = op“) a (2-6) xsgpllUMX) - U300! = 0p(1), and (2-7) :gpllrfi‘FilrndIk; 3" + {nil-HE; lel'TrilZ7niénjftxll = 0p(1)' The proof of the above theorem uses a chaining argument similar to Dehling and Taqqu (1989) and appears in the next section. An analog of (2.6) when the errors are i.i.d. appears in Koul (1969, 1970) and was further generalized to include the case of strongly—mixing errors by Koul (1977, Proposition A1). Using the technique similar to the proof of Theorem 2.1 one can obtain pointwise and uniform convergence over compact versions of the above theorem. This is stated below in a form that is useful in Chapter 4. Theorem 2.2. Suppose 6i := C(ni), i 3 1 and (1.2) is satisfied. If (A.1), (A2) and (A3) hold, and F is continuous at x E I, then (2.8) we - stn = 0,.(1). If, in addition, F has (I. continuous density f at x 61 and (AA) holds, then (2.9) lrt‘jflniflm 5x + 6”,) — 1(6, 5x)} - T;.'i27m-£,,if(x)l= 0,,(1). In addition to (A.1) - (AA), suppose that the following hold: (A.7) f is continuous and positive on I —fi— 11 (A.8) Jm, J; are continuously differentiable. Then for every b e (0, oo), (zulieiuixlgb }slTfi‘ZiI7ni{I(6i X+£m)- 1(6-< 1 is still an open p1oblem. In the following corollary, the weak convergence is understood in D[—oo, 00] equipped with the a-field generated by the open balls of the sup metric. Corollary 21. Under (A 1)- (A.)6 the process {Tn ;7n1[1(5i5 - + {nil— F(. + 5m)“ is tight and converges weakly to the process {(1111) 1Jm( (.)Zm(7)} along a subsequence. // Some remarks about the Hermite ranks and the assumptions of the above theorem are now in order. Remark 2.1. Let 7, be a standard Gaussian r.v. and Q: R1411 be a measurable function such that EQ2(7]) < co. Recall from Sansone (1959) that the Hermite polynomials {Hk, k>0}, defined by :7: d)(x) = (-l)k Hk(x)¢(x) ( anal-Daisey, e< “ 2tx/=5Ojtk11k(x)/1a ) have the property that {HR/(1:01”) is an orthonormal basis for L“ (R1, Bl, (1(1)). The index of the first nonzero coefficient in the Fourier expansion of the r.v. (2(1)) with respect to this orthonormal basis is fi 12 called its Hermite rank (see Taqqu, 1975). Clearly, if Q is an odd (even) function then its Hermite rank is 1 (2). Also integration by parts shows that if Q is monotone and right continuous such that the function qu vanishes at -oo and 00 then the Hermite rank of Q is 1. // Remark 2.2. Here we consider some examples of the Hermite ranks of a class of functions, see (2.2). If Q is strictly monotone and continuous, then the Hermite rank 111 in (2.2) is equal to 1. To see this, consider the case when Q is strictly increasing. Then using the fact that ¢(x) Hr(x) dx = - d{¢(x) Hr_l(x)}, we obtain that for ye I, r21, sue gy)H,(n) = E10, 5 o-1(y))H,(r,) = — ¢(Q'l(y))Hr_l(Q'l(y)), which is nonzero for r = 1. The same is true when Q is strictly decreasing. Now let Q be an odd function with the additional property that {x E R1: Q(x) 5 0} equals either [0, 00) or (~30, 0]. Then also 111 is equal to one. To see this, consider the case {x 6 RI: Q(x) g 0) 2 (~00, 0]. Note that this implies that Q(x) Z 0 for X > 0- Then for y S 0. y 6 Ir EI(Q(7I) S")?! = EI(Q(7I) SYM 1(7) S 0) < 01 since the range of Q is I. Similarly, for y 2 0, y E I, EI(Q(77) S y)n = 131(c)(7)) S WK?) 5 0) + EM 5 Q(U) S y)771(72>0) <-¢(0) + ¢(0) = 0. The proof is similar in the case {x e R1: Q(x) g 0} = [0, 00). An example of Q for which in = 2 in (2.2) and conditions (A.5) and (A6) are satisfied is given by Q(x) = [XII/6, 5>1. Dehling and Taqqu (1989) showed that for any m3 1, there is a Q for which the Hermite rank of the class of functions {I(Q(17) g x), x E I} is 111. // fi 13 Remark 2.3. Note that Jm and J; are functions of bounded variation and hence are differentiable almost everywhere. Also, under (A.6), sup lJ;n(x)l and x61 sup IJ$'(X)| are finite. // x e I If G is strictly monotone and continuous with d.f. F, then from Remark 2.2, m=1. The following proposition states that in this case (A.6) is satisfied if the Fisher information 1(f) of the density fis finite. Proposition 2.1. Assume that G is strictly monotone, continuous and (A.5) holds. If, moreover C(11) has an absolutely continuous density f and I(f) := f[f’/f]2dF 'l(u)l g Ka[u(1-u)]'a. 0 g u _<_ 1. Fix a b e I such that F(b) > 0. Then for x > I), W) @“(thm s K. ax) [PM (more 5 K. [Punt-'1 jut) (It/[l-F(x)l“ s K. [F(b)l'a El-f’(t)/f(t)l/[l-F(t)l"‘dF(t) s K. [Foot-a {it -f’(t)/f(t)l"’ arm”? {in-Fare awn”? Upon choosing 2a < 1 in this inequality, it follows that )f(x)(l"l(F(x))l -> 0 as x —. (1. Similarly one can prove that |f(x)(l"l(F(x))l —> 0 as X —> c. // A “9-931 fi, 14 1.3. Proofs. In order to prove Theorem 2.1, we need some preliminary facts about Hermite expansions and ranks which are summarized below. These can be found in Taqqu (1975, 1979) and Dehling and Taqqu ( 1989). In what follows, L, with or without suffix, is a generic notation for slowly varying functions and c is a generic constant. Let {7711 12 1} and p be as in Section 1.1 and p > 0 be a fixed integer. Then the following facts hold: (111) zzhpo-rh = 0(112"’9[L(n)]”) if1>9 <1. = 0(11L0(n)) if pd = 1, = 0(11) if p6 > 1. (11.2) For any measurable function h E L2(Rl. Bl. d‘I’) with Hermite rank p < 1/0, Variance< i2 hot) i = (xi: glam-r) I) = ()(112"’9[L(n)l")- (11.3) (i) Reciprocal and product of slowly varying functions are slowly varying. (ii) For any slowly varying function v, V(u)ll6 o oo (0) for all 6>O (<0). Now, we shall give the proof of (2.5). That of (2.4) is similar and simpler. To begin with we obtain an upper bound for the expected. value of the square increment of the Sn process. Throughout the rest of this chapter, for a function h : Rl—tRl, h(x, y) stands for h(y) — 11(x), x gy. Lemma 3.1. Suppose fi := C(ni) , 13 1, and (1.2) is satisfied. Then (3.1) ra2E1 Sex. M2 5 Z: 2'7"inth [F(x+g“i, yam) F(x+5nj, y+§nj)]l/2 lp(i-j)]m+l. 1 Proof. The Heerite expansion of {HQ 5 x+£ni) - F(x+£ni)} and (2.3) yields that fi 15 TanI Sn(x1 Y)l2 =i2 Emmot- q 2‘12““ rngrlqecfini. Y+E,,i)/q! Jr(x+£,,-. y+c,,,-)/r! xEHq(nj)Hr(7/k)- Using (2.1), the above is equal to It? Jinn“,- q 2%“ Jq(x+£,,,, y+£,,,) Jt.(x+£.,J-. y+£,,,-)/q! p“(i—j)l 5 2i: :Ll'rnnnjl q 2%“ qu(X+E,,ir y+£,,i) Jt,(x+£,,jr y+£,,j)/q-'l lp'“+l(i-i)l 5 i2 phenol L sz'lii(x+f,tii Y+E,,;)/<1!ll/2 ( ZEN-Jamey. y+£,,,-)/r!J’/2 lp"‘+l(i-j)l. where the last step follows from the Cauchy-Schwarz inequality applied to the sum involving q. But the Hermite expansion of {l(x+£ni < 6i gy+£ni) - F(x+£“i, y+£m)} and (2.1) yield that q >ZmJ(2](x+Enll y+£ni)/ql : E [1(X+€lll<€l S y+£ni)-F(X+€ni, y+£ni)]2 S F(X+€ni’ y+£ni)’ V 121. Hence the lemma is proved. // Proof of (2.5). Without loss of generality, assume that 7"]- 3 0, 1 g i 5 n. For the general {711i} the result follows from 711i: 7m - 7;” and the triangle inequality. In what follows, we use a 111mlification of the chaining argument of Dehling and Taqqu ( 1989). Let (3.2) /\(x) :2 F(x) + .l$(x)/ ml, XE I. Note the following facts: —‘— 16 (3.3) /\ is strictly increasing, A(d) < co, and by (A.5), (A.6), A is differentiable with uniformly continuous derivative A' = f + Jay/111! satisfying /\’(x) _.0 as x—» c and d. (3-4) FRY) S Mer), XSy- (3-5) lJm(xa Yi/m” SJHXtYV 111’ S /\(Xr Y)r XSY- Fix a 6> 0 and an 11 3 1. Recall that r" :: 11(l‘mm/2Lm/2(n) and let (3.6) K, = x(6, n) := integer part of log2{/\(d)§: 7"] (6rn)'l} + 1. 1:] By (A.1), (A2) and the inequality _ n 02 — /2” ,2 I/‘-’ . “Ely“, 2 o'“/ [L(u)] Ema/(n (1:ng“ 7m): it follows that . _ n (3.7) fill-i517,“ —» 00 and /\(d)/2'r1 z 6/(Tnligl’7ni). I: _ Next define refining partitions of I as follows. Note that A is invertible and define (3.8) a, k ;= x1) 1(a)) 2-k],j = 0, 1, 2k, k=0, 1, 2, r. Clearly, c = 7r”, k < ”I, k < < 7r__)k_l, k < ”2k, 1.- = d, and (3.9) hr”, k) - he, k) = Aha/2k. For an x E I and a k E {0, l, ..., I17} define jff by the relation (3.10) Wjfi,k gx< Fjfi+l,k . Now define a chain linking each point x e I to c by (3.11) C = ij Then, (3'12) Sn(x)=snj(7rx 0 17rjX 1)+Sn7rj)l(( ,j,l17r)2( 2) <7r. < ..... <7r. 0, i 20 P1 1321' sa(1rjfi , 11 mm“, k) ( > 6/2] S jg) P1 1 Sn(7rjI k+1 1 7rj+1I k+1l1> 6/21 < 46-2 7:222 . I '_. "1+1 _ n _ Irma... 111111 x 1 2k+1_1 l 2 jg) [F(7rj1k+l + ("1’ 713+1. k+l + £111) 13(7rj,1<+l+ £11r17rj+1,k+1 '1‘ fin,” / S 4 6-2 T1122 ;7n17m‘( 1P( :1— _1)l1n+l l where the last inequality fOIIOWS from the Cauchy-Schwarz inequality applied to the factor involving F and the fact that jg) [F(7rj’ k'H + a, 7r1‘1'11k+l + a.) 511 V aeR_ Similarly, we also obtain Pl 8x112 ll SHUT-jig I h. 7'3" X_+l 1c l1> 6/21 S46 ) T11~zi:;7ni711r(1p( 1 1‘) llm+l- Hence from (3.17) we obtain that (3.18) P[ sup 1511(Xll > 6) x 61 S 4 (N '1' 116-2 71122 £72111 Inr 1P1 (1 1‘) HUI-H + PISXUP [HM (X)> 6/41- 1 1 We now analyze the first term in this upper-bound. Since for large 11, 7'n>1, and by the Cauchy- Schwarz inequality, $71115 11 12/ (.3 6) yields that IE +1< 2 + log.) [)1(d)/6] + c In 11, fi 21 where c is a constant and In denotes natural logarithm. We shall next show that, (3.19) an 1: (In 11) 7,2}; gym", |p(i-1‘) ("1+1 = 0(1), 0 < 1/111. 1 Consider first the case when (111 + 1)19 < 1. 1511 nl an S( In 11) (nlm 0, sup{ lun(x)| : [x1 > k, x e I} Ssupf 1Jin(x)1: N > k ‘ W11, X E 1} Th'Zl‘rniénil 1Hm(7li)1/m!- 1 By (A.6), sup{ lan(x)| : Ix) > k, er} ._. 0 as k _. 00. Hence, in view of (3.15) withj = 1, to prove (3.20) it suffices to show that V 0 < k < oo, (3.21) sup{ )1/I,(x)| : (x) g k, x e I} = oI,(1). Fix a k. Viewing 11,, as a process 011 [—k k] ()1, it is enough to show that (3.22) (a) 1/,,(x) : oI,(1) V x e I, and g c (x - y)? , for some c e (0, 00) (b) Ell/11(xl ' ”11(3)” For (a), note that E[Vn(x)]2 = 71122 27111710 Jm(x, x+€III) .lm(x, x+£IIj) pm(i—j)/ ml 1 1 < Unix. x+t -)1'-’ (we 13.1 > n" 1.322: ,le(i-r)l"‘~0. 1 15 511'“ s 1 s n Since Jm is continuous in a neighborhood of x and by (A.1), (11.1) the rest of the term is bounded. For (1)), note that the mean value theorem entails that for x 5 y, x1y 6 l'kv kl F11. ”11(3’) - V11(X) = Th]Z:7,,i{[-l1n(y+«£,,,)--lrn(X+€,,,)1 - 1-l1n(Y)--l111(X)1} (1110'le(771) 1 23 = Thl?7ni (y-x)[an(unixy) - Jin(uxylle(Ui)(m!l-11 where unixy E [x-H’ni, y+£ni], uxy E (x, y]. Therefore, Ell/“(1‘) - Vn(}')l2 = (Y’X)2 11122 Z7ni7nj Anixy Anjxy Pm(i-j)/ (11102 : Cn(Y'x)2» 1 J where, Anixy: Jin(unixy) - J;,,(uxy). Note that, lAnixyl 52 .1112 llJ,‘l’,’(x)l (00 V n, i, x, y. Hence using (A.l) and (F.1) once more as in (a) we obtain that C": 0(1). Now (3.22) follows from Billingsley (1968; Theorem 12.3). This completes the proof of (2.6) also. // Proof of (2.7) . This follows from (2.6) in conjunction with the following: SZPI'TH‘ZvndFe+sin—Fem?mam)! = 0(1), x 1 1 which can be proved by applying the mean value theorem on {F(x+£ni)}-F(x)} and using the uniform continuity of f and (AA). // Proof of Theorem 2.2. Proof of (2.8) follows directly from the Hermite expansion at each fixed x and the continuity of .l,,,, which in turn follows from the assumption of the continuity of the d.f. F. Proof of (2.9) follows from (2.8) in a simple manner. Finally, assertion (2.10) follows along the same line. as in the proof of (2.5) and (2.6) with suitable inmhfications. // fi CHAPTER 2 RANK ESTIMATION 2.1 Introduction. The idea of estimating location parameter based on rank statistics finds its root in the seminal work of Hodges and Lehmann (1963). Since then, a major branch of nonparametric statistics deals with the rank-estimation (R-estimation) of parameters by minimizing certain dispersions based on the ranks of observation. Generally, these dispersions are expressed in terms of linear rank statistics. Thus, the widespread applicability of linear rank statistics for a variety of testing problems leads to its use in estimation in a natural way. The key tool for studying the R-estimators is the AUL of linear rank statistics. To explain R-estimation in a regression set up, let 1,!) be a nondecreasing real-valued function 011 (0, 1) and {RiAv lgign, AGRP} denote the residual ranks, i.e., RiA is the rank of Yni - xiiiA among Ynj - xflJ-A, 1 gj 5 n. Define (1.1) S(A) 2: Xij(xm - x) WRiA /(n+l)) = [ 51(A), SP(A)]E MA) := sum /R1, g is nondecreasing, right continuous, g(1) - g(0) = 1}. Then V l) E (0, 00), (2.2) sup “13$le [1 + A;,l s) - fi(11)]-sf(F'l(u))“:—- 0,,(1), 05u51,seN(b) (23 B;,ls A‘ s f1v(F = 1, ) regigemb) ll [03+ w l+3 l‘W F)" Op” where, 27 (2.4) $(u) := £(xni- i) [I(ei S F'l(u) - u], 0 5 u 51, s == soc...- 2) was.» - i1. 17):: (i)¢(U)du- // The following lemma gives an approximation to BQNIS in terms of the Hermite polynomials that is useful in determining the limiting distributions of the above estimators. Lemma 2.1. Let {n}, i 2 1} be a stationary, mean zero, unit variance Gaussian process with correlation p(k) :2 E(nln1+k), kg 1. Suppose 6i :2 C(ni), i2 1 and assume that (1.1.2), (W.1), (W2) and (L2) hold. Then, (2.5) B133 = stud) + 0.11). where, l quj) 2: E 1/I(F(€))Hq(1;):—(j)'.lq(F'l(u))di/v(u),q 31, and for any nxp matrix D, (2.6) 3,1 := 13;,1 2d,,,H,,,(1,,)/m!. // l The next corollary gives the asymptotic representations of the suitably normalized R—estimators defined in (1.2) and (1.3). These representations are obtained from (2.3) in a routine fashion as in the proof of Theorem 4.1 of Jureckova (1971) and Theorem 3.3 of Jaeckel (1972). We omit the details for the sake of brevity. Corollary 2.1. Under the assmnptions of Theorem. 2.1 and condition (L.1), (2.7) Aw ([9,. -19) = Aw (23,,“ - 13) + 0,111 = {Ifd1/'(F)}“B;v‘3+ 0.11). (2-8) = {If dd(F)}'l swimw) + 0,,(1). // 28 Remark 2.1. In the i.i.d. errors case Koul (1992, Corollary 4.4.1) derived the asymptotic representation of 31, under the conditions (W.1), (W*.2), (A.5), (L.1) - (L.2) where (W*.2) is a slightly weaker condition than (W.2), namely, (w*.2) 111;,(wtwrlwni = 0(1). max 1 S i S 11 Using the AUL of S, Koul obtained that (WW/213m) = {If darn“ (W‘wrl/‘ZS + 0.11). . l _ . which converges in distribution to N(0, {ff(l'1/1(F)}“) [[1/v(u) - Ml du Ip xp ). 0 Note that the limiting representations in the LRD case differ from those of the i.i.d. errors case in tw0 fundamental ways. Firstly, they have different normalizations. If, for example, li‘rantW/n exists and is positive definite then (W"W)1/2 is of the order of 111/“2 whereas AW is of the order of nmg/Q/Lm/gh) . Secondly, unlike the i.i.d. errors case, the limiting distribution may not be always normal. The value of m is very crucial for determining the limiting distribution. If, either G is strictly monotone and continuous or G is an odd function with the property that {x e R: C(x) g 0} equals either (—oo, 0] or (0, 00] then m=1 (See Remark 1.2.2). In such cases the first. approximation of AWLdJ-fl) is exactly N(0, 031‘“) where I‘,,:: 7,,‘2(Wt'W)—l/2W"RW(WtW)‘l/2. R the dispersion matrix of (711, . .nn)t and a} :2 {ff di/2(F)}‘2 [El/’(F(F))I)]2. Yajima‘s (1991) results can be used to calculate the limit of I‘nunder some additional conditions on the design. // Remark 2.2. Conditions (W.1) and (W .2) are satisfied by many designs, in particular, by polynomial designs with Xnij: ij, lsisn, 13151) and by trigonometric designs with Xnij= cos(i;1j) or sin(i;1j), 111- # t‘k forj ¢ k. // 29 2.3. Proofs. To begin with we introduce some more processes that will be useful in the sequel. Accordingly, define FJM A) =n‘2HY Ym- e,Asyt yEL Tum, A) z: 2: (xni- yam/n,- xffiA 5 F-1(u)) 0 g u g 1, A 6 RP. l The basic idea of the proof of (2.2) can be sketched as follows. Note that (3.1) z(u, e + 11;,1 s) = 2(x x)I[Rank one, - lei/1;} s)< nu] l xIll :21): x", x)I[F,,S(€i - x-tAls ) 0, 3 6 > 0 3 (3.8) lim supIIP[ sup “B-“HTnhl, [3) - Tn(v, ,6” H > a ] < (Y. )u-v)<6 32 But (1.2.4) applied p times, with the choice of {7111} and {5m} as in the proof of (3.3), yields that 139) an E 10. 1] 1113-1111:, 11) - tum-1111113111 = 0.11), where E "3,111 .-= 0(1). Note that for u 5 v e [0, 1], the Cauchy-Schwarz inequality yields that 13.10) lJm(F’1(V)) - Jm(F"(u))1 s E1/11I1u|F<1y = 0(1). 86 N(b) Proof of (2.2). Using H a+bII 2 _<_ 2 (Hall? + II MI 2 ) for a, b e Rp and Fubini, 1.11.3. of (2.2) 32 [Elle-wl wa, ”If? (1y + 2 b2 If? (1y. . P New. IE || 13;: we, will dy = r..'-’,21IE[:aW..U {1ei 5y) - Fem? dy, J: I . - - t -1 2 where, for any matrix D, adnij 2: the Jth entry of the vector (D D) / dni' Using the Hermite expansion of {I(cigy) - F(y), yEI} in (1.2.3) and the properties stated in (1.2.1) the integrand of the jth summand is equal to - 7 I' I = Tn 2 El: gaw'fij OWNJ (1 gm (“gm qu) l/(l! Jq,(y)/q ‘ EHq(”1)HqI("f) z: Maya/q!Jen-r). : -2 Cl " 0' ' Tn ? Er: me WHI‘J (12m 2 m '_ . ‘2 < 2 Jg(y)/ql nlmsaiitsna wnij XXII) (1 1)l/ mu 37 (2.5) s F(Y)(1-F(Y))nlma2< ...(W‘WW MWL-"‘>:>:1pml $1511 (2-6) S K F(Yl(1' PM) Here, (2.5) is obtained from X) J3(y)/ql = E[I(€;SY)'F(Y)]2 = F(Y)(1’F(Y)) q_m while (2.6) from (F.1) and (W..2) Now (2.2) follows from the assumptions (D2) and (D.3). The assertion (2.3) follows from (W.3) and a monotonicity argument of Koul (1985a, Theorem 2.1). // Proof of (2.4). For fixed 3 6 RP, EIIIB'wl mm + As‘ s, 11)-131.! wa.y>lr+’dy S2IEllel Sw(fl + Aw.l S5 3') ' Bid Sw(fls 3)”.1 (1y mi x Aw 8 +3) ' Jm(Y)}(1n!)-1H111(7li) +2 [2 dy Ill where, for A 6 RP, y E R], Sw(A,Y)==ZVni{I(€i SXffim'flHW-F (x X,’,,(A fllfl’H 111( X.‘,,(A- -fl)+Y)(m' ) 1H111(71i)} I From (1.2.3), T1 is equal to p 2 2 Ira-2 E[ raw",- :5“ Jay, x,.iA;J s + Y)(<1)Hq(n;)] dy, j=l 1 q _ m where for a function h : Til—+111, 11(x, y) stands for h(y) — 11(x), x g y. Now using the properties in (1.2.1) as in the proof of (2.2) the jth integrand of the above expression Tn-Z Z: Zlawnija wnrj Iq '2 J( “(y xiliAw S+Y)'](I(Y9XIII'AW1 S+y)/q! l lpn](i_1~)| Zm+l 38 -2 2 _ :- 0" 25 ¥'“wni1'“wmj 1,11%“ 310% KEN s + y)/q!]l/’ [>2 13W, xtrAQJ s + 11)/(111W lpm(i-r)l <1> m+1 3,1th )1" n"‘"'"”’L-m0, V(n)/n‘5 : 0(1) . Therefore, with an :: 1"ng l xltliAw1 s s,| Fubini’s theorem yields that, I II y+an 1 max |F(x .A W‘s + y) - F(y)| dy 3)) 1 {(1)11111151 = 2 an -._.— 0(1). 1 513 n "i y—an Again the uniform convergence in (2.4) is achieved in a routine fashion, see, e.g., Koul ( 1985a). This completes the proof Theorem 2.1. // The following corollary gives the asymptotic. representations of the suitably normalized minimizer of MW(A). Corollary 2.1. Under the assmnptions of Theorem 2.1, (28) A113,, - 19) =- (If2 dx)‘l 13;: 1w, 1) dF(y) + ope), (2.9) =- (If? dx)-lsw1Jm dHu). Theorem 2.2. In addition to (1.1.1) and (1.1.2) assume that (X.1)-(X.3) and (D*.1)—(D*.4) hold. Then (2.10) A1091- fl) = - (2112 (1111113331{UaHvHUan-ynfiy) dH0) - (1 - (1) 111(115 0), u E R1, 0 _<_ u g 1. Note that, hl/2(u) 1' I u I /2 and so, 13,,(1/2) reduces to the well-known least absolute deviation (LAD) estimator of ,8. Theorem 3.1 of KB gives the following linear programming version of the above minimization problem (1.1): 43 (1.2) minimize a 1:“, r+ + (1 - a) 1), r' subject to Yn-Xb=r+-r' , (b, r+, r') eRpx 1x11", where 1:, z: [1, ..... 1] 1 X" and Yn is the response vector. By the linear programming theory, 13,,(0) is the convex hull of one or more basic solutions of the form (1.3) bh = 11,311,, where h is a subset of {1,2...n} of size p and X11 ( Yb ) denotes the sub-design matrix ( sub design vector ) with rows x)“, i e h ( co-ordinates Yni, i e h ). In general, one can choose 3“.) from (1.1) in such a way that it is a stochastic process called regression quantile process that has sample path in [D(0, 1)]p. There will also be ‘break-points’ 0 = (10 < a, < < OJ“: 1, such that 3“.) is constant over each interval (oi, “1+1)’ 0 g i g .ln-l. See Gutenbrunner and Jureckova (1992) (GJ) and references therein for more. on this. The corresponding dual program, 111entioned in the appendix of KB is the following: (1.4) 111axi11iize. Y),a with respect to a (1.5) subject to Xfla = (1-(1') Xhlna a6 [0, 1]". GJ investigated the statistical properties of the optimal solution of (1.4) when the errors are independent. Note that a maximizer of ( 1.4) also maximizes al'(Yn - Xt) subject to (1.5), for any t E RP. Now choose t = 3,,(0), where for some p-dimensional subset hn(cr) of {1,....n} an optimal solution for (1.2) is given by 311(0) = X31191) Yhn(0‘)' Then one particular solution of (1.4) corresponding to this choice of 3,,(0) can be given as follows: 44 For i not in hn(a), let A (1.6) 5111(0) .-= 1, Y", - xgis,,(a)>0, A 01 Y ' ' xt ,6,,(a)<0, 111 111 II and for i in hn(oz), 51,],(0) is the solution of the p linear equations: 1.7 =1- " .-" .IY.-".‘,,- >0. ( ) jEE}(O)xnjanJ(a) ( 05:511an jglxn‘l ( 11) anfl(n) ) GJ call 5,,(0) the regression rank-scores for each a e (0, 1). When p =1 and Xni '2 1, 1 Si 5 11, GJ observe that (1.8) 3111(0) = 1, (1 < (Rm - 1)/n 2: R111 - 1101, (R111 — 1)/n 5 a 3 RM /11 = 0, Rni /n < a, where Rni := Rank(Yni). Hence in the one sample location model, the processes {sni(.), 1 Si 3 11} reduces to the familiar rank process (See Hajek and Sidak, 1967, Section V35). GJ observed that 5,,(.) satisfying (1.6) and (1.7) can be chosen in such a way that it has piecewise linear paths in [C(O, 1)]n and 5,,(0) = 1n = ln- 21,,(1). See GJ also for a discussion on the generalization of the duality of order statistics and rank process from the one sample location model to the linear regression model by the RQ (z: regression quantiles) and RR (:2 regression rank-scores) processes. Using these processes one can construct various statistics such as L- statistics and RR statistics that are useful to make inference about the regression parameter 1?. In the context of i.i.d. errors, different types of L-estimators were 45 proposed by Ruppert and Carrol (1980), Koenker and Portnoy (1987), Portnoy and Koenker (1989) and GJ. Among them, Koenker and Portnoy (1987) and GJ considered smoothed L-estimators that are asymptotically equivalent to the well known M-estimators but have the added advantage of being invariant with respect to scale and reparametrization of the design. In this paper we shall consider these type of L-estimators and will observe that under appropriate conditions, the above asymptotic equivalence continues to hold even in the LRD setup. As pointed out by Jureckova (1992a), one of the major advantages of using RR statistics based 011 the residual is that the corresponding estimators of the some of the components of 5, when others are treated as nuisance parameters do not require the estimation of the nuisance parameters. The basic result needed to study these estimators is the AUL of the RR statistics based 011 residuals as given in Jureckova (1992a) for the i.i.d. errors. The corresponding result under (1.1.1) and (1.1.2) is given in section 4. Section 2 obtains the joint asymptotic distribution of the finite number of suitably normalized RQ ’s and asymptotic representations of RQ and RR processes. Section 3 applies this results to yield the asymptotic behavior of L- and RR statistics. The proofs heavily depend upon the uniform closeness result of Chapter 1. 4.2. Theorems and Proofs. To find out the asymptotic distribution of RQ, we first define the following minimum-distance type estimator of 6 by 2 9 (2.1) ,amd(01) Z: argminA "(X50442 ‘.I(A, 01) where, 46 €(A, a) := 2x11i{I(Y11i' x11iA _<_ 0)- a}, 0 5 a 51, A6 RP. I By the continuity of the d.f. F, SKA, a) is an almost everywhere differential of the function 2.3%de - xt-A) with respect to A. Therefore, I DI intuitively, minimizer of Zha(Yni - xi1iA) and f9md(a) should be asymptotically I equivalent. The following lemma is enunciated towards this direction. It also gives the joint asymptotic representation of the finite. number of RQ. To state it, we need to introduce 13(01):: [9 + F 1(.1( (1)01, 01 2: (1,0 ”0)", and (1(0) := f(F'l(a)), 030' g 1. Also, recall the definition of Sx from (3.2.8), conditions (X.1), (X.2) from Theorem 3.2.2 and conditions (A.7), (A8) from Theorem 1.2.2. Moreover, the following design condition is assumed: (X.0) The first column of the design matrix X consists of one only. Lemma 2.1. Assume that (1.1.1) (1.1.2) and (X0) - (x2) hold. Then (i) sup, 6 [0, 1] "Bruno. 0) - J...(F"(a))sxll = 0.11) (ii) If, in addition, (A.7) hold.- then )1 1.) e (1), 00) 11.11.1111 a e (0, 1), sung/(1.) “131mm )+ A3.l s a) - 21mm), «)1- s<1 0. This fact together with an argument like the one given in the proof of Theorem 2.1 below implies (iv)(a) in a routine fashion. (b) First we show that for every 0 < a < 1, (2‘3) “Bxlgifihiah 0)” 2 011(1): 48 From Theorem 3.3 of KB, we have the following algebraic identity: (granny...- xgifiaa) s 0)-a) - 2: Jenna... xrfima) 3 0101111,)“, = was), 1 i E hn(a) " where, hn(a) is as in (1.6) and each element of the pxl vector wn(a) belongs to the interval [a -1, 01]. Hence Bkl‘fl'(Bn(a), 0) : Bid . Z xniil ‘ a) "I“ Biclxi, (a) W11(0)- 1 E hn(a) 11 Therefore, (2.2) follows from (2.3) and (X2) by noting that the right hand side is t -l _ bounded by 2 p lmsaiicS n ”xnin I) — 0(1). Now the claim in (b) follows along the same line as in the proof of (2.5) below. The claim about (c) follows from (b) with the help of Cramer-Wold device. // Remark 2.1. An interesting special case of (2.2) is when k :1 and (11: 1/2. Here regression quantile reduces to the celebrated LAD (least absolute deviation) estimator. As an example of errors when the limiting distribution of the LAD estimator is nonnormal, consider the model (1.1.1) when the error r.v. is a chi- square centered at its median, i.e., C(x) : x2 - v, where v is the median of x? r.v. Then, as shown in DT, 111 equals two. Hence in the one sample location model with chi-square errors centered at its median, it follows that the asymptotic distribution of the LAD estimator is the same as that of u.{nl ' 6L(n)}'l 2(1)? - 1) which converges weakly to the r.v. 112.2(1) where {Z2(a), C16 [0, ll} is the Rosenblatt process as in example 2 of DT. // Remark 2.2. A very intriguing pl‘ienomenon is observed regarding the asymptotic efficiency of different estimators when the errors are exactly Gaussian, i.e., C(x) = x. In this case, in equals one. and all the estimators discussed in this thesis are asymptotically normally distributed. lVIoreover, (2.2) implies that the 49 asymptotic dispersion of the regression quantiles ([9,,(01); a e (0, 1)} is independent of a. In addition, it is same as the asymptotic dispersion of the least squares estimator, M-estimators and that of Koul’s 111. d.e. flit, irrespective of the scores — functions (in the case of M—estimators) and integrating measure H (in the case of m.d.e.). Similarly, in this situation, R-estimators and Koul’s m.d.e. [9K continue to have the same asymptotic dispersion irrespective of the score function C. This is in complete contrast with the i.i.d. errors case. Finding an estimator with better asymptotic efficiency in the case of Gaussian errors is still an open problem. // Remark 2.3. Recall that in the i.i.d. errors case, under (X0), (X.1), the assumption that lm Nl01 Ipxp 0 (1 - 0’) (1'2(0)l W- This essentially follows from the work of Koul (1992, section 5.4) by viewing fin(a) as an M-estimator corresponding to the convex score function 110. Under stronger conditions on the design matrix it also follows from KB. Hence, remark similar to that of Remark 2.2.1 is also applicable here by comparing (2.2) and (2.4). In the following theorem, we depict the asymptotic representation of the regression quantiles process uniformly over the compact subset of (0, l). The basic idea for its proof can be found in Jureckova (1971) and Koul (1985). Here, for an Rp valued stochastic process {Xn(a), a E [0, 1]}, “anla := sup{ "Xn(a)" ; a g a S La} and we say that Xn = 03(1) ( 0;“,(1) ) if H X” IL, = Op(1) (o,,(1)) for every a E (0, 1/2]. Theorem 2.1. Assume that (1.1.1), (1.1.2), (X.0) - (X2) and (A.7) - (A.8) hold. Then (2.5) Ax<3....1(a)-A(a)) + (Ma) Bin/1(a), 0) = 0;;(1). 50 (2.6) Ax11‘3..(a)- ammo) = 03(1). // Proof. The proof will be given in several steps. Let D,,(a) denote the set of minimizers of "D;l g (A, a)". Note that H D,1 ‘5' (A, a)” can take at most 2" possible values and the set Dn(a) is nonempty for each a in (0, 1 / 2]. Now define Ana) by AxlAaa) - 19(0)} =-—= - <1"(a) Bra/3(a), a). T11 to1rove (2.5), it is enough to show that for every a in (0, 1 / 2], (2.7) sup, 6 ,asupA E W, ll AAA-[1(a)] - Adina) - [3(a)] H = ope) Step 1. "133} '5'(A,,(a), A)" = 0;;(1). Proof. Follows from the above Lemma (2.1)(iii) by observing that AX1A..0, define Ob(a):- — {AeDn(a );x||A( A [i((1))” 1.) Then v M > 0, ))(|> 11 0,,(n e) #oVo'Ela] =0(1). (2.8) P[ snpa E I a'supA E 0b( II( Proof NotethatA(AA,, (1 = A,[A-(as )-q]-( 1(a))[n;,121(s(a),a)]. Therefore, the left hand side of (2.8) is less than P[supa E IasupA E 011(0) ll Ax[A-fi(a)] - (1,-1(a)) [B;1€r(/3(a),a)]))”> M, SUPGEIHHPAEO((1) (|(1'1((1)3(IB ‘.I( A, a) l” < M/2, 0b(0 (1') #4) V0613] +P[sup(316IaISUPAEO()(1llq.1(a)13xlzrl(A 0) )|| > M/2 0,,(0 a) ¢¢Vaelal 51 s Ptsupa 61am”, 6 ohm Hq“(a)B;.‘[‘I(A, a) - Wu), an - AAA-Ma» M > M/2, Ofia)¢¢Vaeg] + P[supa E [asupA 6 Ohm) ||(1-1(a)B;,'cy(A, a) H _>_ M/2, ohm) ¢ ¢ Va 6 Ia]. Now, the first and second terms are 0(1) by Lemma (2.1)(iii) and Step 2 respectively. Step 4. Given 6, M >0, 3 6>0 and "0 such that V n 2 110, (2.9) P[infa E Iainfusuz a ”Wm/3(a) + A338, 0)" > M] > 1 - a. Proof. The polar representation of vectors, the Cauchy-Schwarz inequality and the fact that V 06(0, 1) and V 96R", HtB;1[€I'(fl(n) + rAQO, a) is a monotone increasing function of r > 0, yields that for any 6 > 0, . -1 -l 1anle 6 "13,( mam) + AK 8, a)“ 3 infr2 6 arm : 1 otBflsmm) + Age r, a) 3 inf othflerwm) + A29 6, a). l|0||=l Therefore, using Lemma 2.1(iii), for all s1.1fficiently large n, the left hand side of (2.9) is not less than P[infa otngflma) + A330 6, a) > M] e lawfnoll = 1 2 P[infa E lainfflon : 1 [at-Bgonma), a) - 6 (101)] > M + 1] - 6/2 2P[ 6 inf (1(a) - inf a'n-‘wam, a) > M + 1 -5/2. (1 61a x a e lamrnon = 1 Now, using infa E [a q(a) > 0 and infCt 6 lainfl O‘B;19'(fl(d), a) _-_-. Op(1), we ION-:1 can choose 6 sufficiently large so that the above probability is not less than 1- e/ 2. 52 Step 5. Given 6, M >0, 3 no such that V n 2110, (2.10) P[supa E [asupA 6 011(0) “mm-{3(a)} - AX[A,,(a) - ,B(a)]” > M] < 6. Proof. By Step 4, choose 6 and 11 large such that P[ 66(0) 7t (25 for some a in Ia] < 6/3 . Then the left hand side of (2.10) is less than P[supa E IasupA 6 W0) ”Ax[A-fi(a)] - AX[A,,(a) - pm] ll >M, 0,,(a) ¢ ¢ Va e la] + P[()6(a) 75 43 for some a in la] S P[Supa E [aSUPA 6 06(0) [AAA-[“0” ' AxlAn(a) ’ fl(n)l ” >1“, 06(0) ¢ 45 V0 61a] +P[supa E [asupA 6 55(0) “AX[A-fl(a)]—AX[A“(G)-[i(a)] H > M, 05(0) ¢ (:5ch e lam/3. The first probability is less than 6/ 3 by Step 3. The second probability is less than 6/3 by the choice of 6. This completes the proof of (2.5). Assertion (2.6) can be a: following the same lines by observing that “Dying/a), a!) H = 0,,(1). // We now turn to the RR processes. To define these, let {Cni’ 1 _<_ i g n} be a triangular array of p x1 vectors and let Cn x p be the matrix with rows cai, 1 g i g n. Define sequences of weighted RR processes by 03(0) 3: Zc.,;{ft..i(a)- (1' (1’)}, 0 S a S 1, and an approximating sequence.l of weighted empirical process Uf',(a) := ZCniUifi > F'l(n)] - (1 - 0)}, 03051. For convenience we now recall the following algebraic identity from (5.15) of GJ. (2.10) 5M0) - (1 - a) = 1(6, > F'l(o)] - (1 - a) -{I[e, 5 Ma) + Xf,i(/9..(a) - fl(a)) - He, 3 F"(a)1} + 5mm) IlYni = xii/Bum”, 13 i 5 n, 0 g a g 1. 53 This identity is useful in approximating Bf, by U3. The following theorem gives the desired asymptotic representation of LIE. Theorem 2.2. In addition to (1.1.1), (1.1.2), (X.O) - (X.2) (A.7) - (A.8) assume that (C.1) - (C2) hold where, (C.1) (CtC)'l exists for all n 2p. (C2) 11 max c-'1(CtC =0 1 . l < i < n 6'“ Then the regression rank-scores process admits the following representation: (2.11) 3.1mm - U§1 = Be‘C‘XAglAa/‘mm - 2(a)) q(a) + 0;;(1), and (2.12) Be‘foa) = - Set-new» + 0;;(1). Consequently, (2.13) Erma) = - (S. - Bg‘C"XA;.’SXJ tum-1(a)) + 03(1). // Remark 2.4. Let g :: BglC‘XAQ. Then from its definition 9 = Dglc‘xng. Hence by the Cauchy-Schwarz inequality 9 is bounded. // Proof of the Theorem. Let En( Ax[fl,,(n -fl(o)]. From (2.10) we obtain that Bette) = Be‘foa) - gEm) q -(E;‘:c..,{1[e, s E 1m )+x.,,A;‘E.a( > He, s E-‘1}-9En(a)q(a)1 + BE] 2: )Clllalll(a)1[\7ni—_ xni('Bn(a 0)] iEhna =Bg‘U;(a)-9E,.(a) 1(a)- R1(a)+E)(a ) (say). By Theorem 2.1, (2.6) and Remark 2.4, R1 2 03(1). By (C.2), supa E [0, 1] “19(0)” = 0(1) almost surely. Hence (2.11) follows. The relation (2.12) follows as in the proof of Lemma 2.1(i). 54 Remark 2.5. As in Remark 2.1, the nature of the approximating process in (2.13) is quite different from that in the i.i.d. errors case. The leading r.v. (214) Z := -[S. - 1330"“:le J... in the right hand side of (2.13) is a product of a random quantity, independent of a, and a nonrandom continuous function of a, which also depends on the Hermite rank m. If m =1, then Z is a multivariate normal r.v. with mean 0 and dispersion matrix proportional to (2.15) Bglcta - X(X‘X)‘1X) R (I — X(X‘X)'1X)CB;1, where R is the dispersion matrix of (771,712. . .1'/,,)’. For 111 other than one the limiting distribution may not be normal. Also note that unlike the i.i.d. errors case the limiting distribution may not be distrilmtion free. // 4.3. L—estimators and regression rank scores statistics. In this section we derive the asymptotic distribution of smoothed L- estimators based on RQ processes. For a finite. signed measure 1/ on (O, 1) With compact support, an L-estimator of ,6 is defined by (3.1) Tn: [9,,(a) (1,/(a). c%-— Note that for u with dI/(a) = I(aga g 1-a)da, Tn reduces to an analog of the trimmed mean. The following theorem is an immediate consequence of Theorem 2.1. Theorem 3.1. Under the assumptions of Theorem. 2.1, with u :2 1/(0, 1), (3'2) AxlTn ' fill ' e1 iF-I(G)d"(“)i : ’SxiJndF-lini) (14(0) (ll/((7) + 0})(1)' // 0 55 Remark 3.1. Consider the linear model (1.1.1) where now {6i} are i.i.d. P. Let 1,!) be an absolutely continuous function from R1 to R1 such that [wdF =0, 0 du. inure. Y... ). mo. Y... ) denote €r(fl(a). at time). no) etc. respectively, when {Yni} are replaced by {Ynit} in their definitions. The following lemma is similar in spirit to Lemma 2.1 and Theorem 2.1. It gives the asymptotic representation of the regression quantiles processes based on the residuals {Ynit} and the proof is similar to that of Lemma 2.1. Here 6,,(1) (6p(1)) denotes a sequence of stochastic processes that converge to zero (bounded) in probability, uniformly over a _<_ a g 1 - a. lltllg L, V £1.6(0, 1/2], L e (0, 00). Also racall the notation 03(1) from Theorem 2.1. s , (1.1.2), (x.0) - (x2), (A.7), (A.8), (11.1) (J! Lemma 4.1 Assume that (1.1.1 V and (R2) hold. Then (4.1) IlBglmmo). o. Y...) - and). o))- E .‘X‘RAJ t ,(.. (o)1— — o (1). (4.2) sup{ 113113190.“ )+A,,l s, a, Yn )- (16(0). mY -crs<1( a)”}=0p(1) where the above supremum is taken over a g a g 1 - a, HS“ 5 K, ”til 3 L. (4.3) Axmmao. Y...) - (3(a)) = - n"(o){S.J...(F"(o)) + EQX‘RAflt q(o))+o,.(1), (4.4) A. (dunno. Y...) - Add. Y...)) = 5,,(1), (45) Annie. Y...) - (3(a)) = 0.11). // The following theorem gives the main result of this section. Theorem 4.1. Assume that (1.1.1), (1.1.2), (X0) - (X2), (A.7), (A.8), (RJ), (R.2) and (C.I) - (C3) hold, where (0.3) dx = Then (4.6) 330510, Y...) = 13313.net) + B;‘C"RA;‘t oto) + 6,.(1). Moreover, if the score function I) is of bounded variation and constant outside a compact subinternal of(0, 1), then V 0 < L < 00, (4.7) suthlIS L B1Vf,(Y -V,C,]- BEICtRA}1 t ith) db(a) = 0,)(1). // 59 Proof. Let E n(oz, t):— —— Ax [,Bn( (0, Ym)- )]. From (1. 6), for 1 sign, ani(a’ Ynt) : I[Ynit > Xiii 311((1’ YIN-n + 311““) Ynt) IlYnit: xltll 1811(0 Ynt)l : 1 - Hg g F'1 (a ) + x,,iAx1E11(Oa t) + rfliA} lt] +ad11i(a Ynt)IlY11it=x1t1i'Bn(a Ynt)l Hence. 13.111131... Y...) - 13.105101 ni X =-B‘.‘Zc..{1[e.s F“(a) + xf..A;.‘E..(a. 1) + rhiA‘Jtl-Iie. s W 1) + x A ‘E 16“ )1} l + Bel :c'a‘ (a, Ynt) IIYnit: xlll fl|l(n Yul” Bt‘l Xenia iii ((1') IlYni : xltll 3M0” 111 am (4.8) = - Rl(a, t) + 112((1, t) - 113(0), (say). By (C2), R2(a, t) = 6])(1) and 112(0 61—) — 01 *,(1). To handle R1(a, t), let T(a, s, t) 3=B,C..i{1l‘i