ESTIMATION OF DERIVATIVES OF AVERAGE OF μ-DENSITIES AND SEQUENCE-COMPOUND ESTIMATION IN EXPONENTIAL FAMILIES

Thesis for the Degree of Ph.D.
MICHIGAN STATE UNIVERSITY
RADHEY SHYAM SINGH
1974

This is to certify that the thesis entitled ESTIMATION OF DERIVATIVES OF AVERAGE OF μ-DENSITIES AND SEQUENCE-COMPOUND ESTIMATION IN EXPONENTIAL FAMILIES presented by Radhey Shyam Singh has been accepted towards fulfillment of the requirements for the Ph.D. degree in Statistics.

Major professor

ABSTRACT

ESTIMATION OF DERIVATIVES OF AVERAGE OF μ-DENSITIES AND SEQUENCE-COMPOUND ESTIMATION IN EXPONENTIAL FAMILIES

By Radhey Shyam Singh

Let X_1,...,X_n be independent random variables with μ-densities f_1,...,f_n, where μ is a σ-finite measure dominated by Lebesgue measure on the real line R. With a fixed integer v ≥ 0, we exhibit kernel estimators of f̄^(v) = n^{-1} Σ_1^n f_j^(v). For any subset D of R, we give sufficient and (somewhat) necessary conditions for asymptotic unbiasedness (asy. u.), almost sure (a.s.) and mean square (m.s.) consistencies, each uniform on D. We also prove integrated mean square (i.m.s.) consistency, and obtain convergence rates and exact rates for the asy. u., m.s. and i.m.s. consistencies. When f̄^(r), for an integer r > v, exists on D, we show that the error term is O((n^{-1} log n)^{(r-v)/2(1+r)}) with probability one, while the m.s. and i.m.s. errors are O(n^{-2(r-v)/(1+2r)}), each uniform on D. The vector (f̂^(v)(x_1),...,f̂^(v)(x_m)) is shown to be asymptotically m-variate normal. We extend this estimation to the multivariate case. Specifically, estimation of mixed partial derivatives of the average of p-variate μ-densities has been considered.

We make applications of f̂^(v) to sequence-compound squared error loss estimation (SELE). With an observation on X distributed according to P_ω in an exponential family wrt μ, and ω ∈ Ω, the natural parameter space, we take SELE of θ(ω) = ω, e^ω or ω^{-1} as our component problem. With (X_1,...,X_n) ~ P_n = P_{ω_1} ×...× P_{ω_n}, ω ∈ Ω^n, a (sequence-compound) estimator of θ = (θ(ω_1),...,θ(ω_n)) is φ = (φ_1,...,φ_n) with φ_i (X_1,...,X_i)-measurable. With a δ > 0, G_n the empiric distribution function of ω_1,...,ω_n, and R(G_n) the Bayes risk at G_n in the component problem, we say φ has a rate δ at θ if the modified regret n^{-1} Σ_1^n E(φ_i - θ(ω_i))^2 - R(G_n) is O(n^{-δ}). With α_i < β_i in Ω such that -α_i and β_i are increasing in i, we exhibit estimators (of θ) having certain rates uniformly in ω ∈ ×_1^n [α_i, β_i]. These rates depend on the speed at which |α_n| ∨ |β_n| ↑ ∞ as n ↑ ∞. When α_i, β_i are constants wrt i and satisfy certain conditions, we exhibit a divided difference estimator of ω with a rate 1/5, and kernel estimators (for each integer r > 0) of θ with rates (r-1)/(1+2r), r/(1+2r) or (r-ε)/(1+2r) (the last for every ε > 0) according as θ(ω) = ω, e^ω or ω^{-1}, where, for the case θ(ω) = ω, r > 1. When θ(ω) = ω and ω has identical components, we show that rates with the divided difference and the kernel estimators of ω are near, but cannot be more than, 2/5 and 2(r-1)/(1+2r), respectively.
ESTIMATION OF DERIVATIVES OF AVERAGE OF μ-DENSITIES AND SEQUENCE-COMPOUND ESTIMATION IN EXPONENTIAL FAMILIES

By
Radhey Shyam Singh

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

1974

TO MY PARENTS

ACKNOWLEDGEMENTS

I wish to express my deep gratitude to Professor James Hannan for the patience he accorded me in the preparation of this thesis. His careful criticism and invaluable suggestions aided greatly in improving and simplifying virtually all of the results in the thesis. In addition, among many others who helped, I wish to record my thanks to Professor Dennis Gilliland for reading and commenting on a difficult rough draft; to Mrs. Noralee Barnes for her excellent typing; and to my wife for her great understanding and endurance. Finally I would like to thank the Department of Statistics and Probability at Michigan State University for the generous support, financial and otherwise, during my stay at Michigan State University.

TABLE OF CONTENTS

Chapter 0   INTRODUCTION
    0.1  Estimation of Derivatives of the Average of Densities
    0.2  Sequence-Compound SELE with Applications of f̂^(v)
    0.3  Some Notational Conventions

Chapter 1   NON-PARAMETRIC ESTIMATION OF DERIVATIVES OF THE AVERAGE OF n μ-DENSITIES, AND CONVERGENCE RATES IN n
    1.0  Introduction
    1.1  Estimation of f̄^(v) and the Main Assumption
    1.2  Asymptotic Unbiasedness and the Exact Rate
    1.3  Strong Consistency with Rates
    1.4  Variance, Covariance and Asymptotic Normality
    1.5  Mean Square and Integrated Mean Square Consistencies with the Exact Rates
    1.6  Estimation of Mixed Partial Derivatives of the Average of Multivariate μ-Densities

Chapter 2   CONVERGENCE RATES IN SEQUENCE-COMPOUND SQUARED ERROR LOSS ESTIMATION OF CERTAIN UNBOUNDED FUNCTIONALS IN EXPONENTIAL FAMILIES
    2.0  Introduction
    2.1  A Bound for the Modified Regret
    2.2  Some Assumptions and Notations
    2.3  A Divided Difference Estimator of ω with a Rate 1/5
    2.4  Kernel Estimators with Rates Near 1/2 when θ(ω) = ω, e^ω or ω^{-1}
    2.5  Rates Near the Best Possible Rates with the Divided Difference and the Kernel Estimators of ω with Identical Components
    2.6  The Divided Difference Versus the Kernel Estimators

APPENDIX
    A.1  ON GLIVENKO-CANTELLI THEOREM FOR THE WEIGHTED EMPIRICALS BASED ON INDEPENDENT RANDOM VARIABLES
    A.2  A BOUND FOR THE v-th MEAN OF THE BOUNDED DIFFERENCE OF TWO RANDOM RATIOS

BIBLIOGRAPHY

0. INTRODUCTION

In this thesis we consider estimation of derivatives of the average of densities with applications to sequence-compound squared error loss estimation (SELE).

0.1 Estimation of Derivatives of the Average of Densities.
Estimation of a Lebesgue-density, hereafter L-density, has been studied by various authors, and a variety of methods have been used: for example, Watson and Leadbetter (1963) and Nadarya (1965) used the kernel method first introduced by Rosenblatt (1956) and studied in detail by Parzen (1962); Cencov (1962), Schwartz (1967), Kronmal and Tarter (1968), and Watson (1969) used the orthogonal series method; Weiss and Wolfowitz (1967), Rao (1969), and Wegman (1969) used maximum likelihood methods; Van Ryzin (1970) and Wahba (1971) used, respectively, histogram and polynomial (Lagrange) interpolation methods. Estimation of derivatives of an L-density has also been considered by Bhattacharya (1967) and Schuster (1969).

Estimation of non-Lebesgue densities and their derivatives arises in empirical Bayes problems, while that of the averages of non-Lebesgue densities and their derivatives arises in compound decision problems. Yu (1970), (Section 2 of the appendix), exhibits kernel estimators of a μ-density and its derivative, where dμ = u(x)dx and, for some a ≥ -∞, u(x) > 0 iff x > a. He gives rates for mean square errors (m.s.e.) at each point on the real line R. Samuel (1965), (Section 6), exhibits kernel estimators of the average of L-densities and, under uniform equicontinuity (hence, necessarily uniform equiboundedness) of the densities on a subset D of R, she proves asymptotic unbiasedness (asy. u.) and weak consistency, both uniform on D. Susarla (1970), (Section 1.3), exhibits kernel estimators of the average, and of its first partial derivatives, of m-variate normal densities with known covariance and uniformly bounded unknown means, and obtains rates for m.s.e. uniform on R^m. Samuel uses Parzen-type kernels, while Yu and Susarla use those of Johns and Van Ryzin (1972).

We consider here non-parametric estimation of derivatives of the average of non-Lebesgue densities. Let μ be a σ-finite measure with density u wrt Lebesgue measure on R. Let X_1,...,X_n be independent random variables with X_j having a μ-density f_j. With r ≥ v ≥ 0 fixed integers, we exhibit kernel estimators f̂^(v)(x), depending on X_1,...,X_n, u and r, of f̄^(v) = n^{-1} Σ_1^n f_j^(v). (If u were known to be at least as smooth as the f_j were, we would estimate derivatives of the average of the L-densities u f_j directly.)

In the remainder of this section we describe the main results contained in Chapter 1. Bounds obtained here are quite explicit. We make almost no assumption on u for some of the results on asy. u., a.s., m.s. and integrated mean square (i.m.s.) consistencies. For any subset D of R and any h = h_n ↓ 0 as n ↑ ∞, if sup_{x∈D} h^{-1} ∫_x^{x+h} |f̄^(v)(t) - f̄^(v)(x)| dt → 0 as n → ∞, and if, in case v > 0, the v-th order Taylor expansion of f̄(x + hy) about x with integral form of the remainder exists for all 0 < y < 1 and for each x in D, then, under certain boundedness conditions on 1/u, asy. u., a.s. and m.s. consistencies, uniform on D, are proved (in Sections 2, 3 and 5, respectively). (Thus, contrary to the assumption made for similar results in most of the papers on the subject, continuity of f̄^(v) at the estimation point is not needed.) In Section 4, we obtain rates and the exact rate for var(f̂^(v)) and prove the asymptotic normality of the vector (f̂^(v)(x_1),...,f̂^(v)(x_m)). In Section 5, we prove i.m.s. consistency. Under certain boundedness conditions on 1/u, the difference of f̂^(v) and its expectation converges to zero a.s. and in second mean; hence the three properties, asy. u., a.s. and m.s. consistency of the estimator, become equivalent.
Sufficient and (somewhat) necessary conditions for asy. u. (and hence for a.s. or m.s. consistency) uniform on D are also given in Section 2. These, specialized to f_j ≡ f and D = (a,∞) for an a ≥ -∞, become: if ∫_a^∞ f < ∞, then f̂^(v) is asy. unbiased uniformly on (a,∞) iff f^(v) is uniformly continuous there.

When r > v, and, for all 0 < y < 1 and for each x in D, f̄(x + hy), with h as indicated above, has an r-th order Taylor expansion about x with integral form of the remainder, then, under certain boundedness conditions on h^{-1} ∫_x^{x+h} |f̄^(r)(t)| dt and on 1/u on D, we obtain rates for the various convergences. In Section 2, we obtain rates and the exact rate for the bias term uniform on D. The result giving rates, specialized to the i.i.d. case with r = v + 1, u ≡ 1 and D = R, improves the corresponding one of Bhattacharya (1967), (see Remark 2.5). In Section 3, we show that the error term is O((n^{-1} log n)^{(r-v)/2(1+r)}) a.s. as n ↑ ∞, uniformly on D. This result, specialized to the i.i.d. case with r = v + 1, u ≡ 1 and D = R, improves the corresponding one obtained by Schuster (1969), (see Remark 3.3). Rates and exact rates for the m.s. error uniform on D and for the i.m.s. error are obtained in Section 5. These rates are shown to be O(n^{-2(r-v)/(1+2r)}) as n ↑ ∞. The results concerning bounds on the m.s. and i.m.s. errors, specialized to f_j ≡ f, u ≡ 1 and v = 0, improve the corresponding ones of Parzen (1962), Schwartz (1967) and Wahba (1971), (see Remarks (5.1), (5.2) and (5.3)), though only Schwartz considered i.m.s. consistency.

In Section 6, we estimate mixed partial derivatives of the average of multivariate μ-densities. Specifically, we exhibit kernel estimators of f̄^(v_1,...,v_m)(x) = ∂^{v_1+...+v_m} f̄(x)/(∏_1^m ∂x_i^{v_i}), where x ∈ R^m, f̄ = n^{-1} Σ_1^n f_j and the f_j are m-variate μ-densities. These estimators have asymptotic properties analogous to those possessed by the estimators prescribed in the univariate case. We verify some of these related to asy. u., m.s. and a.s. consistencies, each with and without rates.

0.2 Sequence-Compound SELE with Applications of f̂^(v).

In Chapter 2, we deal with sequence-compound SELE of certain unbounded functionals in exponential families. We use the estimators f̂^(v) in order to exhibit certain sequence-compound estimators whose modified regret converges to zero with certain rates.

Suppose 𝒫 = {P_ω | ω ∈ Ω} is a family of probability measures on R, and the component problem is SELE of real θ(ω). The sequence-compound problem consists of n repetitions of the component problem with the loss taken to be the average of the component losses. Thus one has ω = (ω_1,...,ω_n) ∈ Ω^n and (X_1,...,X_n) ~ P_{ω_1} ×...× P_{ω_n}. The i-th component of a (sequence-compound) estimator φ = (φ_1,...,φ_n) of θ = (θ_1,...,θ_n), where θ_j abbreviates θ(ω_j), is allowed to depend on (X_1,...,X_i).

With G_n, the empiric distribution function of ω_1,...,ω_n, and R(G_n), the Bayes risk versus G_n in the component problem, let

    D_n(ω,φ) = n^{-1} Σ_1^n E(θ_j - φ_j)^2 - R(G_n).

D_n(ω,φ) is called the modified regret of φ, and is often taken as a standard for evaluating compound procedures, (e.g., Hannan (1956), (1957), Samuel (1963), (1965), Gilliland (1966), (1968), Johns (1967), and Susarla (1970); of course with varying component problems). If δ > 0 and D_n(ω,φ) = O(n^{-δ}) as n → ∞, we will say φ has a rate δ (at θ).
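To fix ideas, the modified regret can be computed numerically in a toy version of the component problem. The sketch below (the routine names, the Monte Carlo approximation of R(G_n) and the naive estimator are ours and are only illustrative, not part of the development) uses the normal-means component problem that appears in the next paragraph in connection with Gilliland (1966): X ~ N(ω,1), θ the identity, squared error loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def bayes_response(x, atoms):
    # E[omega | X = x] when omega ~ G_n (uniform on the atoms omega_1,...,omega_n)
    # and X | omega ~ N(omega, 1): the Bayes estimate versus G_n under squared error.
    w = np.exp(-0.5 * (x - atoms) ** 2)
    return float(np.sum(atoms * w) / np.sum(w))

def bayes_risk(atoms, reps=20000):
    # Monte Carlo approximation of R(G_n), the Bayes envelope evaluated at G_n.
    om = rng.choice(atoms, size=reps)
    x = om + rng.standard_normal(reps)
    est = np.array([bayes_response(xi, atoms) for xi in x])
    return float(np.mean((est - om) ** 2))

def modified_regret(omega, estimator, reps=2000):
    # n^{-1} sum_j E(theta_j - phi_j)^2 - R(G_n), the expectation being
    # approximated over Monte Carlo replications of (X_1,...,X_n).
    n = len(omega)
    total = np.zeros(n)
    for _ in range(reps):
        x = omega + rng.standard_normal(n)
        total += (estimator(x) - omega) ** 2   # phi_j may use X_1,...,X_j only
    return float(total.mean() / reps - bayes_risk(omega))

omega = rng.uniform(-2.0, 2.0, size=50)        # an arbitrary parameter sequence
print(modified_regret(omega, lambda x: x))     # naive phi_j = X_j: about 1 - R(G_n)
```

For the naive choice φ_j = X_j the modified regret is approximately 1 - R(G_n), which does not tend to zero; the compound estimators constructed in Chapter 2 are designed to drive it to zero at the rates described below.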
In the references cited in the next paragraph, Ω is a bounded interval and, except in the case of Samuel, the rates are uniform in ω ∈ Ω^n. When θ(ω) = e^ω and 𝒫 is an exponential family satisfying certain conditions, Samuel (1965) exhibits estimators φ and shows that D_n(ω,φ) → 0 for each ω as n → ∞. When the components of ω are means of normal densities with variances unity, and θ is the identity, Gilliland (1966), (Chapter 3), obtains an estimator with a rate 1/5. Extending Gilliland's work to the m-variate case, Susarla (1970), (Section 1.4), exhibits, for each integer r > 1, estimators with rates (r-1)/2(m+r+1), and thus improves Gilliland's result. When 𝒫 is a certain family of discrete distributions and the component problem is linear loss two-action, Johns (1967) prescribes an estimator with a rate 1/2. The same rate, 1/2, is achieved by estimators prescribed by Gilliland (1968) in sequence-compound SELE of θ in certain discrete exponential families.

For our main results, 𝒫 is an exponential family wrt μ, where μ is a σ-finite measure with density u wrt Lebesgue measure on R such that, for an a ≥ -∞, u(x) > 0 iff x > a. The assumption that u(x) > 0 iff x > a is imposed in various papers either on empirical Bayes or on compound problems in exponential families, (e.g., Samuel (1965), (Section 6), Yu (1970), (Chapters 1 and 2), and Johns and Van Ryzin (1972)). In the case of Gilliland (1966), (Chapter 3), and in the univariate version of Susarla (1970), (Chapter 1), u is the standard normal density function. In each of the papers cited in the preceding paragraph, and in the paper of Hannan and Macky (1971), u is at least continuous on {u > 0}. We, instead, make certain assumptions on (local) boundedness of 1/u. In all the papers on compound decision problems so far available in the literature, Ω is assumed to be bounded. We relax this by taking Ω to be the natural parameter space. However, our assumptions restrict the speed at which max_{1≤j≤n} |ω_j| grows as n ↑ ∞.

We will now describe the main results of Chapter 2. We have treated only the cases θ(ω) = ω, e^ω or ω^{-1}. (The cases of ω^k, e^{Lω} or ω^{-m}, where k and m are positive integers and 0 < L < ∞, can be treated analogously.) For α_i < β_i in Ω for all i ≥ 1, with -α_i ↑ and β_i ↑, we exhibit estimators with certain rates. These rates are uniform in ω ∈ ×_1^n [α_i, β_i], and depend on how max_{1≤j≤n}(|α_j| ∨ |β_j|) grows as n ↑ ∞. The rates below are, for the sake of convenience, indicated only for the cases when α_i and β_i are constants wrt i and satisfy certain conditions.

We use the ideas of Gilliland (1966), (Chapter III), and exhibit an estimator of ω based on a divided difference estimator of (log f̄)^(1), where f̄ = n^{-1} Σ_1^n f_j and f_j is a μ-density of P_{ω_j}. This estimator is shown, in Theorem 1, to achieve a rate 1/5.

We use the estimators (introduced in the preceding section) of f̄^(v) to obtain certain kernel estimators of θ when θ(ω) = ω, e^ω or ω^{-1}. For each integer r > 1, we exhibit kernel estimators of ω which are shown, in Theorem 2, to achieve a rate (r-1)/(1+2r). When the ω_j's are means of normal densities, our estimators are preferable, for various reasons (see Remark 4.3), to the corresponding ones of Susarla (1970), (Section 1.4). For the case θ(ω) = e^ω, we obtain, for each integer r > 0, kernel estimators which are shown, in Theorem 3, to have a rate r/(1+2r), and thus improve (rate-wise) Theorem 6 of Samuel (1965), (also, see Remark 4.5).
When θ(ω) = ω^{-1}, we exhibit, for each integer r > 0, kernel estimators which are shown, in Theorem 4, to achieve a rate (r-ε)/(1+2r) for any ε > 0. The result here with u(x) = (Γ(τ))^{-1} x^{τ-1}[x > 0], τ > 0, generalizes and improves the main result of Section 2.1 of Susarla (1970), (see Remark 4.6).

In Theorems 5 and 6, we show that, when θ is the identity and ω has identical components, rates with the divided difference and the kernel estimators are near, but cannot be more than, 2/5 and 2(r-1)/(1+2r), respectively.

Finally, when θ is the identity, a comparison between the divided difference estimator and the kernel estimator is made in Section 6. The kernel estimator with r > 6 is preferable to the divided difference estimator in the sense that sup_ω |D_n(ω,·)| for the former converges to 0, as n → ∞, faster than it does for the latter.

0.3 Some Notational Conventions

We suppress the arguments of functions whenever it is convenient not to exhibit them. We denote elementary functions by their values and, except for emphasis, do not display the dummy variables of integrations. The indicator function of a set A is denoted by A itself, or by [A]. For any measure ξ, the ξ-integral of y is denoted by ξy, ξ(y) or ξ[y]. We abbreviate the space L_p(R) to L_p, with 1 ≤ p ≤ ∞, the L_p-norm to ‖·‖_p, g(t) - g(x) to g]_x^t, and, occasionally, sup_{t∈A} |g(t)| to ‖g‖_A. The symbol ≜ indicates that the equation holds by definition, or is a defining one. The symbol ∎ is used throughout to signal the end of a proof.

CHAPTER 1

NON-PARAMETRIC ESTIMATION OF DERIVATIVES OF THE AVERAGE OF n μ-DENSITIES, AND CONVERGENCE RATES IN n

1.0 Introduction.

Let μ be a σ-finite measure, dominated by Lebesgue measure on the real line R. Let X_1,...,X_n be independent real valued random variables with X_j distributed according to P_j << μ. With u, a fixed determination of dμ/dt, let f_j(t) = (u(t))^{-1} lim_{ε↓0} ε^{-1} ∫_t^{t+ε} dP_j if the limit exists for all j and u(t) > 0, and 0 otherwise. (From the properties of Lebesgue points of a function, see pp. 255-256 of Natanson (1955), all of the above limits exist a.e. Moreover, if f_j^* is a determination of dP_j/dμ, then almost every point is a Lebesgue point of u f_j^*, and hence f_j = f_j^* a.e.) Let f_j^(i) be the i-th order derivative of f_j. For a fixed v ≥ 0, we want to estimate f̄^(v) = n^{-1} Σ_1^n f_j^(v).

In Section 1, we exhibit a class of kernel estimators f̂^(v) of f̄^(v), and discuss the main assumption to be made in later sections. We obtain results on the bias in Section 2, on the error of the estimate in Section 3, and on the mean square and integrated mean square errors in Section 5. In Section 4, we prove the asymptotic normality of (f̂^(v)(x_1),...,f̂^(v)(x_m)). In Section 6, we treat the multivariate version of the problem; specifically, for x = (x_1,...,x_m) in R^m, we estimate f̄^(v_1,...,v_m)(x) = ∂^{v_1+...+v_m} f̄(x)/(∏_1^m ∂x_i^{v_i}), where f̄ is the average of n m-variate μ-densities. Unless stated otherwise, results are obtained at a fixed point x.

1.1 Estimation of f̄^(v) and the Main Assumption.

Let 𝒦 be the class of all real valued Borel-measurable functions on R vanishing off (0,1). For an integer r > v, let 𝒦_r^v ⊂ 𝒦 be such that if K ∈ 𝒦_r^v, then

(1.0)    k_i ≜ (i!)^{-1} ∫ y^i K(y) dy = [i = v],    i = 0,1,...,r-1.

Denote ∫ |y^i K(y)| dy / i! by |k|_i. The set 𝒦_r^v is non-empty, since it contains the v-th element (the one dual to y^v/v!) of the dual basis for the subspace of L_1(0,1) with basis {1, y/1!,...,y^{r-1}/(r-1)!}. Define 𝒦_v ≜ 𝒦_{v+1}^v. Let 0 < h ≜ h_n ≤ 1 be such that h_n ↓ 0 as n ↑ ∞.
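Before the estimator is written down formally in (1.1)-(1.2) below, a small numerical sketch may help; it is only an illustration (the function names, the toy data and the bandwidth constants are ours, and u ≡ 1 is taken for simplicity). The first routine solves the linear system behind the dual-basis remark above to produce a polynomial kernel in 𝒦_r^v, and the second evaluates a kernel estimate of f̄^(v)(x) of the form (nh^{v+1})^{-1} Σ_j K((X_j - x)/h)[u(X_j) > 0]/u(X_j).

```python
import numpy as np
from math import factorial

def dual_basis_kernel(r, v):
    # Coefficients c_0,...,c_{r-1} of a polynomial K(y) = sum_j c_j y^j on (0,1)
    # satisfying (i!)^{-1} int_0^1 y^i K(y) dy = [i == v] for i = 0,...,r-1,
    # i.e. a member of the class described by (1.0).
    H = np.array([[1.0 / (i + j + 1) for j in range(r)] for i in range(r)])
    b = np.array([float(factorial(i)) * (i == v) for i in range(r)])
    return np.linalg.solve(H, b)

def f_bar_deriv_hat(x, X, u, v, r, h):
    # Kernel estimate of the v-th derivative of the average mu-density at x,
    # of the form (n h^{v+1})^{-1} sum_j K((X_j - x)/h) [u(X_j) > 0] / u(X_j).
    c = dual_basis_kernel(r, v)
    y = (X - x) / h
    K = np.where((y > 0) & (y < 1), np.polynomial.polynomial.polyval(y, c), 0.0)
    uX = u(X)
    w = np.where(uX > 0, K / np.where(uX > 0, uX, 1.0), 0.0)
    return float(w.sum() / (len(X) * h ** (v + 1)))

# Toy check with u = 1 (Lebesgue case) and X_j i.i.d. standard exponential,
# so that f_bar(x) = exp(-x) and f_bar'(x) = -exp(-x) for x > 0.
rng = np.random.default_rng(0)
X = rng.exponential(1.0, size=20000)
for v in (0, 1):
    r = v + 2
    est = f_bar_deriv_hat(1.0, X, lambda t: np.ones_like(t), v, r,
                          h=len(X) ** (-1.0 / (1 + 2 * r)))
    print(v, round(est, 3), "target", round((-1.0) ** v * np.exp(-1.0), 3))
```

Polynomial kernels of this kind are also convenient later because they are bounded and of bounded variation on (0,1) (cf. Remark 3.1).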
For a fixed r 2 e and a fixed K 6 xi, let X.-- (1.1) Yj(-) = {Kr-fi——9/u(xj)}[u(xj) > 0] The proposed estimator of f(v) is (1.2) E(V) = (nhv+1)-1Zn Y, 1 J . -(v) :(v) Hereafter gg frequently_abbrev1ate f and f by gn and A gn, respectively. For estimating a Lebesgue density and its derivative (in Empirical Bayes linear loss two action in eXponential families) Johns and Van Ryzin (1972) introduce and use Lz-kernel functions satisfying the xt-conditions for r > 1 and v = 0 or 1, with the exception that, for the case v = 1, their kernels vanish off (0,2) instead of (0,1). With fn E f, Yu (1970), (Section 2 of 11 his appendix), considers estimation of f and f(1), and uses Johns and Van Ryzin type kernels. The orthogonality properties of K E K: and the assumption (Aér)), which is introduced below only for integers r > 0, are used in curtailing the bias of gm, (see (1.4) - (1.7) below). (Aér )): For each 0 < y < 1, there exists the r-th order Taylor expansion of f(t + by) about t with integral form of the re- mainder: fa: + by) = so 1 £11.51)“, + 2:137,-js+hy - E(r'1) = s t + e, and by repeated integrations of (see Van Vleck (1973),p pp 286- -7),f t 2 -(r) It f (t1)dt, t 5 t2 both sides, we get t t f(t + e) - zr-l {is-f f(j)(t)= :+6 P r 2 f(r)(t )dt dt 0 J. J: t 1 1 r -1 (t + e - t )r _ t+e l -(I‘) ” it (r-l)! f (t1)dt1 where the second equality follows by Fubini theorem. We introduce the notation f(r) ]:\dt (1 3) A (x) =-111fm 12 where the dependence of Ar on n and h is abbreviated by omission. For some of our results, we will assume Ar(x) = 0(1). f(r) is asymptotically g3: Note that Ar(x) = 0(1) whenever g(r) x+t equicontinuous at x (i.e., 1x a O as t 1 O and n 1 a0. If only finitely many Pj are distinct and x is a rt-Lebesgue f€r> J point of each of the , then again Ar(x) = 0(1). Let Bn denote the bias of gm, i.e., (1.4) B = P g - g where En = P1 X...X Pn' Since Xj has Lebesgue den31ty ufj and the fj's vanish wherever u vanishes, by (1.1) and (1.2) = h- (1.5) (Bn + gn> é gnsn(t) jx< 0, (Aév)) holds and K 6 X3, then the substitution in the rhs of (1.5) of the expansion of f(t + hy) given by (Aév)), and use of the orthogonality properties of K give - t+h - - (1.6) (Bn + gn)(t) = h VjK(y)jt y(c + by - z)v 1f(v)(z)dz/(v-l)!. If (Aér)) for r > v holds and K 6 xi, then, since RV = 1, by arguments similar to those giving (1.6), r-1E(r) (1.7) Bn(t) = h'vjx(y)j:+hy(c + hy - z) (z)dz/(r-l)!. For r = v + l kernels giving (1.6) and (1.7) belong to the same Class Xi: but, since (AéV+1)) é (Aév)), the two expressions are not eQuivalent. We will use (1.6) and (1.7) to prove the asymptotic unbiasedness of our estimators. 13 In what follows is a sequence of positive numbers, and D is a subset of R. Unless stated otherwise, all the limits, convergences and asymptotic equivalent relations (for functions depending on n) are wrt n a m. 1.2 Asymptotic Unbiasedness and the Exact Rate For the results of this section, X1"°"Xn need not be independent. Recall that gn and gm stand for f(v) and %(v) reSpectively. We will give sufficient and (somewhat) necessary conditions for “BnHD = 0(1). Under different conditions we will obtain two upper bounds for Ufl]\,andan asymptotic expression for Bn. We first prove the following, where by (t,n) a [0+,m) we mean t in a non-deleted rt—nbd of 0 converges to 0 and n 4 m. Theorem 1(a), Let K E Kt, and, for the case v = 0, be bounded. 
If (Aév)), whenever v > 0, holds at each point in a rt-nbd of x, and if (2.0) Av(x + t) = 0(1) as (t,n) ~+ [O+,oo) then (2.1) Bn(x +’t) = 0(1) as (t,n) a [0+,m). On the other hand, if K 6 X’ (K need not be in X3) is + bounded, (gn - gn')]: t = 0(1) as (t,n),n' a [O+,m),m, and for a , x+t _ - subsequence {m}, 11mmtm,t10 Bm]x - 0 and fm E L1[x,x+Tm] for x+t\ some Tm > hm, then, as (t,n) ~[0+,a0, ign1x = 0(1) (and hence, (2.0) holds). 14 Remark 2.1. The second part of the theorem essentially says that, in the presence of certain assumptions (which are always satisfied in the case fj E f and, for some T > O, Lebesgue- inf of the restriction to [t E [x,x+T)\f(t) > 0} of u is positive) (2.0) is necessary even for a weaker form of (2.1). Prggf, (Sufficiency of (2.0)). First consider the case v = 0. Since K E X3, vanishes off (0,1) and IK(y)dy = l, by the first equation in (1.5), we have (2.2) \Bnml = h’1\f:+hK(Zfi-t-)(E]Z)dy\ s \lKilm 1.00:) Thus, Since K is bounded, (2.1) for the case v = 0 follows from (2.2) and (2.0). Next consider the case v 2 1. Since K being in K: gives (2.3) (h'v/(v~1)!)fK(y)j:+hy(t + by - z)“'1dzdy = k 5 1, and, since, by our hypothesis, (1.6) holds at every t in N+(x), a rt-nbd of x, (2.4) (v-l)!Bn(t) = h'va(y)j:+hy(t + hy - z)V‘1(gn]:)dzdy V t e N+(x). Note that the integrand in (2.4) is bounded above by - z \K(y)\(hy)v llgnlt‘ which vanishes for y 4 (0,1). Thus, by (2.4), .‘ -v.. \Bn(t)\ S \k|v_1Av(t) V t € N+(x), and hence, Since K t X§ implies \k\v-1 < m, (2.1) for v > 0 follows from (2.0). Necessity ofg(2.0). Let m be a subsequence and Tm > h é m - x+t E 3 fm 6 L1[x, x + Tm] and 11mm1m,t10(Bm]x ) — 0. By (1.5), (Bm + gm)(') = E-ij(y)fm(- + gy)dy. Therefore, since K vanishes 15 off (0,1), by use of the transformation theorem X+§ (2.5) \(Bm + gm)]:+t\ s g(V+1)HKHm Ix \f ]:+t\dv = 0(1) as t t 0, m where, since K is bounded, f E L [x, x + T ] and T > g, the m l m m convergence in (2.5) follows by a theorem on continuity of transla- tion of Ll-functions, (e.g., see Hewitt and Stromberg (1965), p. 199). Since, by our hypotheses, for all Sufficiently large m, x+t x+t (gn ‘ gm)]x « O as (t,n) a [O+,a0, and Bm]x a O as t l O, the identity gn = (gn - gm) +(gm + Bm) - Bm and (2.5) yield x+t gnJX ~0 as (1:.m)-+[0+.<==~)-I Remark 2.2. The proof of the first part of Theorem 1(a) (v) 3130 proves that: If (A0 ), whenever v > 0, holds on D and if K 6 x3, then (2.6) “BnHDHAVle s “Rum or \k‘v-l according as v = 0 or >-0. Thus, if (2.6) holds and rhs of this is finite, then “AvHD = 0(1) implies (2.7) HBHHD = 0(1). In fact, “AvHD = 0(1) is somewhat a necessary condition for (2.7): If K e (K d t b ' V ' b d d ( - ) x+t = X’ nee no e in KQ) is oun e , SUngpi gn gn. 1x \ 0(1) as (t,n),n' 4 [0+,m),w, and for a subsequence {m}, lim sup \B ]x+t\ = 0 and f 6 L (U > [x x + T )) for m1m,t10 xED m x m l xED ’ m l = 0(1) x+t xED‘gnJX (and hence, “AvHD = 0(1)). Proof of this assertion follows from some Tm > hm’ then, as (t,n) « [0+,a9, Sup arguments identical to those given for that of the second part of Theorem 1(a). As an immediate corollary to this last result, we have l6 Corollary 1. Suppose K E K: is bounded and only finitely many P are distinct. For an a 2 -m, if each jmfj < m, then a x+t x J supx>aan(x)\ = 0(1) iff limtflosupx>algnj \ = 0(1). Remark 2.3. If only finitely many Pj are distinct, then AV(X) = 0(1) whenever x is a rt-Lebesgue point of each of the f our fgv). 
Thus, (from the first part of Remark 2.2) with fj f(v) estimator of is asymptotically unbiased at x under the f(v) weaker assumption than that of the continuity of at x imposed for Similar results in almost all papers on the subject. Sufficiency and necessity parts of Corollary 1 Specialized to the i.i.d. case with u E l, v = 0 and a = -m have been proved, reSpectively, by Nadarya (1964) and Schuster (1969) for their kernel estimators. Remark 2.4. For the case 0 = 0 and u E l, (2.7), (with different kernels), has been noted by Samuel (1965), (Section 6), under the uniform equicontinuity (and necessarily uniform equi- boundedness) of f1,f2,... on D. Theorem 1(b), If, for r > v, (Aér)) holds and K E Xi, then ‘HV’l‘l nX'H’l “(1') (2.8) h \Bni S Akir-l Jx \f l, and -r+v -(r) (2.9) \h Bn - krf \ s \k‘r-l Ar Proof. Inequality (2.8) follows immediately from (1.7), Since the absolute value of the rhs there at t = x is no more than ((r-1)1)'1jyr‘1lx(y)\dy(é \k\r_1) times hr‘V’lf:+h\t(r)\. 17 Also, since (2.10) ((r-i)!hr)'1fx(y)j:+hy(t + by - z)r'1dz dy = (r!)'1jer(y)dy é kr, from (1.7), the lbs in (2.9) at t is exceeded by (2.11) <(r-1)!hr>'1j \K(y>\j:+hy 0, then teal (2.13) h'(r'V)Bn(t) ~ krf(r)(t) uniformly on D. Thus, under certain conditions, the exact rate of convergence r-v for the bias of the estimator g“ is h Theorem 1(b) describes the situations where such rate is indeed achieved by gm. 18 Some global properties of Bn will be obtained in Section 5. Under varying conditions, we will show that, for a fixed a 2 -m, j:2;3: dt 0(1), j: 3“ 2dt 5 \k|r Ja\E \2 dt and 22 2 I" B Zdt ~ (r V)I: \£(r)\d a n khr 1.3 Strong,Consistency with Rates Let En denote the error of the estimator gm, that is, g(v) - gn, where gn and g“ denote reSpectively, and . Unless stated otherwise, all convergences in this section will be meant with probability one. We will give sufficient and (somewhat) necessary conditions for HE E = 0(1), and prove, for n‘D r>v, (3.0) uh'r+VEn - krf(r)“D = 0(1) Under conditions weaker than those used for (3.0), we will show that “En“D = 0((n-110g n)a) for 20 = (r-v)/(1+r). Hereafter denote gm - P g by On. In view of fin (3.1) En = cn + B“, if “CUHD = o(1),tflunlsufficient and (somewhat) necessary conditions for HEnHD = 0(1) can be obtained from Section 2 (Remark 2.2 and Corollary 1). Similarly, regarding rates of convergence, if _ .. .. I‘— anHCnuD - 0(1), then sufficient conditions for anhEndD — 0(1) and for (3.0) (with h-r+v = an) can be obtained from (2.8) and (2.9), reSpectively. Thus our objective in this section will be to obtain sufficient conditions under which anHCnHD = 0(1). 19 For the remainder of this chapter, let n (3.2) uh(.) = Leb-inf restriction to [t6[x,x+h)\v fj(t)>0} of u. 1 For the results in Theorems 2(a) and 2(b) below, K need not be in xi (but K E X). First consider the case when D = {x}. Theorem 213). Let “Knm < m.V n > O (3.3) Bauer.‘ 2 11] s 2exp{- g(hWIuhn/HKHOD)2} . Proof. By (1.2), the event on the lbs of (3.3) is -1 n v+1 [‘n 21(Yj- Pij)\ > Th ], and by (1.1) and (3.2), W“ s “K“ In a.s. for l s j s n. Hence, since Y ,...,Y are in- m h 1 n dependent, Theorem 2 of Hoeffding (1963), applied to random vari- ables Y1 and 4Yj here, completes the proof.II Clearly, when D contains finite, m, points, EHLHCnuD > n] s m times the rhs of (3.3) with uh there replaced by mintEDuh(t). We now consider the case when D is not finite. Theorem 2(b), Let K on (0,1), and, for each t in D, l/u on [t, t + h) be of bounded variations. 
Then, with Y (t) = K((- - t)/h)[u(°) > O]/u(°) (we may understand that by Yj we are abbreviating YX ), V n > 0, l (3.4) EDEHCnHD 2 n] S 4ngM exp(-2(M2 - l)+), v+l t+h where M - nah R/(suptED It ‘dY.(t)\). Remark 3.1. Kernel functions K 6 xi, which are of bounded variations always exist, e.g., take those K's in X: which are polynomials on (0,1). 20 Remark 3.2. Since Y (t), as a function of t, is of bounded variation, Y (t+) and Y (t-) exist for all t. There- fore, for any countable set S dense in D, SUPtGDY (t) = (t) is a random variable. supt SY (t). Consequently, Sup Y E tED X. v+1 -l n Similarly, HEHHD (= u(nh ) 21(ij - PjY' 1 len by (1.1) and (1.2» is a random variable, and the lhs of (3.4) is meaningful. Proof. Fix t in D until stated otherwise. Let F be the average of distribution functions of X .,Xn, and let 1,.. * - 2F (-) = n 122([Xj < .] +[Xj s .]). Since Lebesgue-Steiltjes integral I-dG does not depend on how G (monotone) is defined at points of discontinuity, from (1.1) and (1.2), t+h t h _ * '— (3.5) Cn(t) —j Y.d(F - F)(-). Since Y is of bounded variation on [t, t+h), it is con- tinuous there except on a countable subset C. But by the absolute . , — dF* - . . . "‘d’k continuity of F, En fC ~ deF — 0 which implies 00 F — 0 a.s. Consequently, (3.5) can be written as (3.5)' 2h("”)cn(t) = ft+h(y +Y._)d(F* - F)(-) a.s. t '+ (17*- F>(o+> + (11* - F>(--), K = 0 Since 2(F\ - E)(-) 'V y 4 (0,1), and Y is of bounded variation (and hence is the dif- ference of two increasing functions) on [t, t + h), by (3.5)' here and (V) of Theorem 21.67 of Hewitt and Stromberg (1965), the rhs , , t+h * -' . . of (3.5) 13 2ft (F - F)(-)d(Y ). Hence, s1nce our foreg01ng analysis in the proof holds good for each t E D, t+h (3.6) hMHCnHD S “If ‘ flimsupcen c \dY.(t)\ . 21 Now (3.4) follows from (3.6) here combined with Lemma A.l and Remark A.l with c1 =...= cn = n-25 of the appendix.'. Let vh(t) be the total variation of l/u on [t,t+h), and V(K) be that of K on (0,1). Then V y in [t,t+h), (u(Y))-1 5 (uh(t))-1 + Vh(t) and \K((y-t)/h)\ S \K(0)\ +'V(K) = V(K); and the total variation of Y.(t) on [t,t+h) is no more than Vn(‘)“K“m'+ vSuPts- n]'< m. Thus (3.8) follows by Borel-Cantelli Lemma. Similarly, (3.9) follows from (3.4) and (3.7).. Remark 3.3. Suppose for r > v, (Aér)) holds at each , r -l t+h -(r) _ p01nt in D, K E X§ and h suptED It ‘f \dt — 0(1), then by (2.8) of Theorem 1(b), (3.10) HBDHD = 0(hr'v). The choice of h that balances rhs's of (3.8)-(3.10) is proportional 22 -1 1/2(r+1) , , to {n (1 + log n)} - Thus with th1s h, if, for some n, . . ‘1 . uh > 0 for each p01nt 1n D (and “(uh) + 2vhHD < m, 1n case D is not finite), then (3.1) combined with (3-8)-(3.10) gives, with 2d = (r-v)/(1+r), (3.11) “En: = 0((n-llog n)a). \n The result in (3.11) specialized to the case u E l, f E f, r = v+1 and D = R, is proved by Schuster (1969) (for his estimators) under stronger assumptions that f and its first v+1 derivatives are bounded. If only finitely many P are distinct, then (3.11) can be j strengthened slightly by replacing log n there by log log :3 (This follows from (3.1), (3.6) and (3.10), Since HF* ‘ EHm 0((n-1 log log n)%), see Kiefer (1961)). 1.4 Variance, Covariance and Asymptotic Normality, In this section we prove the asymptotic normality of A on.A d A a (gn(x1), ,gn(xm)) where, as before, gn an gn abbrev1ate - t 2 . f(v) f(v). We first obtain an upper bound for oh = var Sn hZV+1)o§ ~ “KH§(f/u), and for x' # x, oh(x,x') é V+1)-1). 
Throughout this section, we and and show that (n A A 2 cov O] and gm = (nhv+1)-12'Yj. Since 1(1....,xn are independent, so are Y ”Yn’ it follows that 1,.. v+1 2 2 _ 2 (4.0) (nh ) on — 2 var(Yj) S 2 Pij 23 Lemma 1. V g 2 l, -1 t+h - -1 (4.1) n sz\Yj(t) - Pij(t)\§ s (2\\K[|m)§jt (f/ug ). nggf. By cr-equality (LoeVe (1963), p. 155), the lhs of (4.1) is exceeded by 2gn-lz Pj\Yj(t)\§ = 2§I\K(y-t)/h)\g(f/ug-1)dy which is bounded above by the rhs of (4.1), since K vanishes off (0,1).III Inequality (4.0) and the latter part of the arguments used in the preceding proof with g = 2 yield 2 2 2v+2 -l +h - (4.2) ch(x) s HKHm(nh ) f: (f/u). Remark 4.1. If u E l a.e. on {t‘fj(t) > 0 for some j 2 1}, then (4.2) is strengthened to nth+2Hg§Hm s “K”:, since then IEfiEsIVteR. Lemma 2. If _ f E (A1): n 1 :‘H‘l-Jo) - :(Xde = 0(1) then - 2 - (4.3) (nh) 1; Pij = “K“:(f/u) + 0(1). Remark 4.2. (A1) is implied if x is a rt-Lebesgue point of (u)-1, f(x) is bounded in n, A0(x) = 0(1), and either or sup is bounded in n. Obviously, Sprstci-i-hmun-l xst z{(j:+hij)(fx. fj)) 2 )- Now consider the case i 2 1. By the transformation 0(1) by (A theorem, (Ag1 ))+ and the orthogonality prOperties of K, the lhs of (4.5) is (4.6) hn'1z{fj 0 and if (4.8) holds, then (4.9) Wh2V+1 2 ~quzf/u , Remark 4.3. If (A1) holds (which is assumed indirectly for (4.9)), then the simple result in (4.2) gives a rate for oi equal to the exact rate obtained in (4.9). Theorem 3. With x1,...,x[n in R, suppose (4.7) for pairs (x1,xj), 1 ¥ j, i,j = l,...,m, and the hypotheses for (4.9) for each x., i = l,...,m, hold. If for each t = x I...,x 1 l m (4.10) (f(t)/u) 3/2h1j:+h o and (4.8) holds at x1,...,xm, then (4.11) implies 2v+1 (4.16) 11m Pn[(nh )%Cn(xi) S ti’ 1 s i s m] t i HKH2(ai/U(Xi)) m = U Q( ) 1 % 28 For the i.i.d. case with v = 0 and u E l, (4.16) is quite similar ~to the univariate version of the result obtained in Theorem 3.5 of Cacoullos (1966). In view of the identity En = Bn + Cn and (4.16), certain interesting results about the asymptotic distribution of (En(x1),...,En(xm)) can easily be obtained from (2.8) and (2.9) both at x1,...,xm. 1.5 Mean Square and Integrated Mean Square Consistencies with the Exact Rates. Define the mean square error (MSEn) and the integrated mean square error (IMSEn) of the estimators gn by 2 a. (5-0) MSEn = Pn(gn - gn) and IMSEn - fa MSEndt reSpectively, where a 2 -m is fixed. Obviously, 2 2 (5.1) MSEn - BU + on This section is divided into two parts. The first one deals with properties of MSEn and the other deals with those of IMSEn. We obtain, among other results, rates and the exact rates for MSEn and IMSEn. In view of (5.1), various results concerning MSEn can be obtained from those of Bn and on contained in Sections 2 and 4 respectively. We describe some of them as follows. By (4.2), if (nth+2)-13uptEDI:+h(f/u)dy = 0(1), then Sufficient and (somewhat) necessary conditions for HMSEUHD = 0(1) can be obtained from Remark 2.2 and Corollary 1. Regarding rates of convergence, we have for r > v, 29 Theorem 4. If (2.8) holds, then (5.2) MSEn s (\k\r_1hr’v’1j:+h\f(r)\)2 + “Ku:(nh2V+2)-1j:+h(E/u); and if (2.13) with D = {x}, and (4.9) hold , then -(r) (5.3) MSEn ~ (krhr'v f 2W1)-1 )2 + (nh HK“§(E/u). Proof. Inequalities (2.8) and (4.2) combined with (5.1) yield (5.2). Since an ~ bn > 0 and cn ~ dn > 0 imply an +'cn ~ bn + dn’ (5.3) is an immediate consequence of its hypothesis. I Remark 5.1. Suppose K is bounded. 
If for some 0 < p, {(I:+h‘f(r)\1/p) V (I:+h(f/u)1/q} is bounded in n, q S 1, suptED then (5.2) followed by use of Holder inequality gives 2v+1+q)-1 n-Zs(r-v-p)) 2-- (5.4) “MSEHHD = 0(h (r v p) + (nh ) = 0( where 3-1 = 2r + l - 2p + q, and the second equation follows by taking h proportional to n-8, a choice of h balancing the two terms in the middle of (5.4). The result in (5.4) specialized to v = 0, fn E f, u E l and D = R improves the corresponding result obtained in Theorem 2 of Schwartz (1967). Assuming f is continuous, of bounded vari- ation and xjf(r-J) 6 L2 for each j = 0,1,...,r, he exhibits an estimator of f by orthogonal series method, and shows that MSEn -(r-2)/r ) uniformly on R. This rate is ~(2r-1)/(2r+1) of his estimator is 0(n ) much slower (especially when r is not large) than 0(n obtained in (5.4) with p = %, q = 1 and v = 0, which is guaranteed t+e\f(r)‘ 2 tERJ't < °° in this case simply by the assumption that sup for some 6 > 0. Moreover, he requires r > 2 instead of r > O. 30 -1 t+h -(r) t+h - Remark 5.2. If suptEDh {(It \f \) V (It (f/U))} is bounded in n, then taking h proportional to n-1/(1+2r) (5.2), we get n-2(r-v)/(l+2r) 64v umgt=o< > improving the rate in (5.4) (with the excess in the rate of the order n.‘2 where c = 23{(r-v)q + (4v+1)p}/(1+2r)). For the case D = {x}, fn E f and v = 0,1, Yu (1970), (Section 2 of his appendix); and for the case D = {x}, fn E f, u E 1 and v = O, Parzen (1962), (Section 4), and Wahba (1971), (Theorem 2), obtain the rate in (5.4)' for their estimators. Yu makes a little stronger assumption f(r) that and flu are bounded on [x,x+h]; while Parzen and Wahba make still stronger assumption (adding others) that, reSpectively, (r) . f is continuous and is in L2. An optimal h, in the sense of minimizing the asymptotic expression for MSEn in (5.3) is given by (5.5) h1+2r = n-1(2v+1)HKH:f/(2(r-v)k:(f(r))2u) Thus approximations of the optimal h could be based on suitable - - 2 guesses or estimates of the magnitude of f/(f(r)) . Using h given by (5.5), (5.3) becomes '(r) 1+2v -1 2- r-v 2/(1+2r) (5.3)' MSEn ~ Cr,v{(krf ) (n HKHzf/u) } . where -l 2(r-v)/(l+2r) -1 (2v+1)/(1+2r) cr v = ((2v+1)(2r-2v) ) ’ + (2(r-v)(2v+1) ) Relations (5.3), (5.5) and (5.3)' specialized to the case fn E f, 31 u E 1 and v.= 0 coincide (up to the factors kr and “KH2) with (4.12), (4.15) and (4.16), respectively, of Parzen (1962). In the remainder of this section we derive certain pro- ' f 4 ° d parties 0 IMSEn - Ia MSEn(t) t. Lemma 4. For each n 2 1, m 2 - 2 - (5.6) I, a, s (nh2V+1) luxuz f:(f/u). Proof. Integrating both sides of the inequality in (4.0) and then making use of the Tonelli theorem at the second step below we get ‘ 2 - a: 2 .- (nhZV+1)I: ohdt S h 1I8I(K (y-t)/h)/u(y))f(y)dydt = f/h>dtdy 2 m - s “Kn, Jaw/u). l Lemma 5. Suppose r = v. If Bn = 0(1) a.e. on (a,m), V ‘gn‘ E L2(a,a0 and, for the case v > 0, (Aév)) holds on n (a,a9, then (5.7) J”: B: = 0(1) Proof. Consider first the case v = 0. Let s(t) = V I\K(y)‘f(t+hy)dy. Using Tonelli theorem and Schwarz inequality n at the second step below, we get 2 a - _ I:S (t)dt S faff\K(y)K(w)\(z f(t+hy))(: f(t+hw))dydwdt s £I‘K(y)K(w)‘(j: v f2(t+hy)dt)%(j: v f2(t+hm)dt)%dydm n n s II‘K(y)K(m)\f: V f2(t)dtdydw < m n 32 since K 6 L1 by its definition, and v f(t) E L2(a,m) by hypothesis. Since by (1.5) and by cr-izequality ‘Bn(t)\2 = (IK(y)f(t+hy)dt - f(t))2 s 2(sz(t) +'V f2(t)) and since by hypothesis Bn = 0(1) a.e. on (a,a0,nthe desired conclusion for the case v = 0 follows by dominated convergence theorem. 
Now consider the case 0 2 1. Since K vanishes off (0,1), (2.4) followed by (2.3) gives, V x E (a,m0, (5.8) \Bn(x)\ s b-1va-1\K(y)\j:+hy\gn(t)\dtdy +-\gn(x)‘ s jy"\1<(y>\j(1,\gn v, (1.7) holds a.e. on (a,m), then 2 h2(r-v) m g(r) (5'9) I: B: S \k‘r Ia< )2° Proof. By (1.7) we have, after use of the transformation h - - theorem, (r-l)!han(x) = IK(y) 0y zr 1f(r)(x+hy-z)dzdy for almost all x E (a,m). Therefore,using Tonelli theorem, and Schwarz in- equality at the second step below, we have 33 by hy 2 a) 2 '2co -1- h V a Bndx s ((r-l)!> ja1<(y2>\r \f‘r) v, suppose (2.8) holds on (a,m), and both (2.9) and Ar = 0(1) hold a.e. on (a,m). If v ‘§(r)\2 E L1(a,a9, n then oo--2 2- (5.10) [a\h 2(r V>Bn - kr\f(r)\2| = 0(1). Proof. By Schwarz inequality, the square of the lhs in (5.10) is bounded above by 11-12, where = @ -(r-V) _ -(r) 2 _ m ~(r-v) '(r) 2 I1 Ia‘h Bn krf \ and Iz—‘fa\h Bn+krf \. Since (2.8) holds on (a,a0, by transformation theorem and the Schwarz inequality, we get on (a,a9, (5.11) (\k\r_1hr’“)‘23:(t) s (jg\é(r)(t+hw)\dw)2 S V I3\f(r)(t+hw)\2dw . n Since V \f(r)‘2 E L1(a,m), so is the extreme rhs of (5.11)- By n cr-inequality the integrands in I1 and 12 are bounded in n by an L1(a,m)-function. Hence, since (2.9) and Ar - o(l) both hold a.e. on (a,oo) , by dominated convergence theorem, I112 = 0(1). I 34 Lemma 8. If (4.8) holds a.e. on (a,m) and (V f/u) E n L1(a,a0, then (5.12) f:\nhzv+1 a: - HKH:(f/u)‘ = 0(1). Proof. By (4.0), V t 6 (a,m), 2v+ (5.13) nh 1°§(t) s (nh)-1£: PjY§(t) = h‘lf:+hxz((y-t)/h)(f/u)dy s IK2(w)V (f(t+hw)/u(t+hw))dw . n Since (V f/u) E L1(a,a9, by an application of Tonelli theorem, we see that the extreme rhs of (5.13) is in L1(a,m). Hence, the integrand in (5.12) is bounded in n by a L1(a,a9—function, and, since (4.8) holds a.e. on (a,m), (5.12) follows by dominated con- vergence theorem. I. As an immediate corollary to Lemmas 7 and 8, we have - 2 Corollary 5. If kr # 0, lim inf f:\f(r)\ > 0 and (5.10) holds, then -2 - a 2 2 - 2 (5.14) h (r v)fa 3n ~ krj\t(r)\ ; and if lim inf f:(f/u) > 0, and (5.12) holds, then 2 2 2 - (5.15) nh V+1 f: “a . “KHZ j:(f/u) We will use (5.14) and (5.15) to prove (5.17) below. In view of (5.1), various results on IMSEn can be obtained 2 Q2 (.02 from those on I:On and IaBn’ e.g., if Jach - 0(1) (by (5.6) 2v+1 -1 2 m - _ it is sufficient that (nh ) “Ku2f8(f/u) - 0(1)), then (5.7) implies IMSEn = 0(1). Regarding rates of convergence, we have for r > v, 35 Theorem 5. If (1.7) holds a.e. on (a,a0, then 2 2 2 - m - 2 2v+l -1 - (5.16) IMSEn s \k‘r h (r V>fa\£(r)\ + uKu2(nh ) f:(f/u); and if (5.14) and (5.15) hold, then (5.17) IMSEn ~ rhs of (5.16) with ‘k‘r replaced by \kr\. nggf, Equation (5.1) followed by (5.6) and (5.9) yields (5.16). By (5.1),(5.17) is an immediate consequence of its hypotheses . I It may be recalled that a sufficient condition for (1.7) at a point is that (Aér)) holds at that point. Thus a simple assumption gives (via (5.16)) a rate for IMSEn quite similar to the exact rate obtained in (5.17). Since ‘kr‘ s “KHZ by Schwarz inequality, (5.16) with h = n-1/(1+2r) yields r-v 2 w -(r) 2 - (5.18) IMSEn s (“Kuzh ) ja{\f \ + (f/u)} . Remark 5.3. The result in (5.18) Specialized to the case u E l, f E f, v = 0 and a = -m improves the result in (3.6) of n Schwartz (1967) who exhibits an estimator of f by orthogonal series method. Assuming tjf(r-j)(t), j = 0,1,...,r, are in L he shows 2’ -(r-1)/r that IMSEn of his estimator is 0(n ). 
This rate is significantly weaker (eSpecially when r is not large) than our rate 0(n-2r/(1+2r)), which is guaranteed in this case if we only f(r) assume that 6 L2. Moreover, he restricts r > 1, while we assume r > O. 36 An optimal choice of h as a function of n and independent of the point at which gn is to be estimated can be obtained by considering a global measure of how good gm is as an estimator of g“. The integrated mean square error is a standard measure of this type. The global Optimal h, as the minimizer of the asymptotic eXpression in (5.17) for IMSEn is given by 1+2r (5.5)' h = (n1(1+2V)HKfo (f/u)/(2(r- -v)k :fa 00(EM) 2 ) and, hence, could be approximated by some suitable guess or estimate - - 2 of the magnitude of the ratio j:(f/u)/f:(f(r)) . Using h given by (5.5)', the asymptotically minimum possible value of IMSEn can be obtained by (5.17). 1.6 Estimation of Mixed Partial Derivatives of the Average of Multivariate u—Densities. Let X1,...,Xn be independent m-variate random variables with Xj " Pj << u, where Pj's and u are over Rm, and u is absolutely continuous wrt Lebesgue measure. Unless stated other- wise, throughout this section, the product H is over l,...,m. With t in Rm, and u , a fixed determination of du/dt, let -1 t1+€1 tm+€m fj(t) = (u(t)) llmeilo,1=l,ooo,mt)jt ... jtm de 1f lhnit exists V j 2 l and u(t) > 0, and 10 otherwise. For v and t in RIn with elements of v non-negative integers, and for V . \v‘ = 2? VJ, let f§V)(t) = a\v\fj(t)/(H3til). For a fixed vector v = (v1,...,vm ), vj 2 O integers, we consider estimation of g(V)= -121 f(\)) . 37 Let h = (h1,...,h ) be Such that 0 < h. E h, S 1 m 1 1,n and h t 0 as n t m. With r = (r ,...,r ), r, 2 v, integers, % 1 m 1 1 r let in be defined as in Section 1. For fixed Ki in K§1, let i i ,(v) -1 n Vi” -1 Xi-“xi (6.0) f (x) = n 2._ {{n(h. ) K.(-l--)}[u X.) > 01/U(Xo)] J'l 1 1 hi J J where le,...,ij are coordinates of X,. f(v) is our proposed estimator of g(v). Taking expectation of f(v)(x) wrt Pn = P1 X...X Pn’ and then making use of the transformation theorem, we get -v (6.1) an(V)(x) = j(nhi iKi(yi))§(x+h-y)dy ~ where h-y = (hlyl, . . . ,hmym) . . m _ For 2 and t 1n R , let (2)1(t) - (t1,...,ti,zr+1,...,zm). With 1 S L s m and with the first L-elements of r non-negative integers, we introduce (Aér))L: For V y E (0,1)m and for each i S L, f(x+h-y) has ri-th order Taylor eXpansion in hiyi about xi with integral form of the remainder, while other components Of (X + h‘Y) are held fixed. (Such expansion in the univariate case is given in (Aér)) in Section 1.) Suppose L (0 S L s m) elements of v are positive. With- out loss of generality, let these be v1....,v . Suppose (Aév)) 6 L liolds. Using Taylor formula, we eXpand f(x+h'y) (appearing in (6u1)) in hly1 about x1 with integral form of the remainder a}: the v ~th term, we perform the integration on the rhs of (6.1) l ‘vrt y1 and use the orthogonality properties of K1. Then using 38 Taylor formula, we expand f ((x+h-y)1(t)) (appearing in the resultant) in hzy2 about x2 with integral form of the remainder at the vz-th term, we perform the integration wrt y2 and use the orthogonality properties of K ; then we do the similar -(v1,v2,8,...,0) operations wrt (x3,v3,K3) with f ((X+h'Y)2(t)) (appearing in the resultant), and so on until such operations wrt (XL’YL’KL) are completed. We finally get A 'V. - (6.2) gnfmm = jam, min/1)) jJ;f(")( ,(tnstdy V '1 v = _ i _ , —l where Jy(t) “{[xi 5 ti < xi +hiyi](xi+hiyi t1) ((vi 1).) 
[vi>0]+ [vi = 0]}, st = Hidti and the second integral on the rhs of (6.2) v -v i i v = is L tuple. Since Iz Ki(z)dz vii, Imhi Ki(y1))ij(t)stdy 1. Consequently, using the transformation x v -1 +hiyi = ti’ L+l S i s‘m, i V L 1 = and the facts that Jy(t) s n1((hiyi) “vi-1)!) and K1 .. 0 off (0,1), all at the second step below, we get, with Bn = 2(v) '(v) Enf ' f ’ -Vi v -(v) . -(v) td (6.3) \Bn(x)\ Sfmhi \Ki(yi)\)j‘Jy(t)\f ((x+h y)L(t))-f (x)ldL y . (a), where Av(x) =‘I(flh;1[xi 5 ti < xi + hi])‘f(v)]:\dt, and ‘ik‘j = IZJ\K1(z)le/j!° Thus, if “Kium‘< m for 1+1 s i s m (bounded kernels Ki 6 x i, r vi i exhibited, see Section 1 and Remark 3.1),and (A8”)L holds and 2 vi, can always be Av(x) ’ 0(1), then (6.4) Bn(x) = 0(1). 39 (r) Now suppose for ri > 0.1, (A0 )In holds. An analysis similar to that given for (6.2) (this time use rl-th,r2-th,..., rm-th order coordinatewise Taylor expansion with integral form of the remainder) gives A(V) '(V) -v' r -(r) (6.5) gnf (x) = f (x) + fcnhi 1Ki(yi>)ijE(‘)\ s (n\.k\ )A (x) 1 n 1 r1 1 ri-l r Results obtained in (6.3), (6.6) and (6.7) for m = l coincide with (2.6), (2.8) and (2.9), respectively. Since X1,...,Xn are independent, the inequality var X S EX2 followed by the transformation theorem gives 2 . 2(v) 2(Vi+1) -1 - (6.8) oh(x) = var f (x) S M (nIIhi ) I(H[Xistiéxi+hi])(f/u)dt where M E HHKiHm. Since MSEn é En‘f(v) - “(v)\2 B: + 02, rates of convergence for MSEn can be obtained from (6.4), (6.6) and (6.8). If, corresponding to u and h here, uh is defined analogous to (3.2), then by Theorem 2 of Hoeffding (1963), V n > O P [ f(V) f(v) 9 hVi+1 /M)2 "n \ - En \ > n] S 29XP{' 2((H 1 )“ uh }' v,+l Thus by Borel-Cantelli lemma (fl hi1 2(V) PnE(V)\ = >\f O((n-llog n)%(M/uh)) a.s., and 40 ‘(v) rates for strong consistency of f (x) can be obtained from (6.4) and (6.6), since ‘%(v) - f(v)\ s \Bn\ + \%(v) - Pnf(v)‘. Though we have verified some of the asymptotic properties of g(v)’ it is not our intent to encounter and verify all the pro- (v) perties of E that we have already studied in the univariate case. However, regarding some of these, it can be verified that under the assumptions analogous to those given for (4.7), (4.8), (4.11), (5.2), (5.3), (5.16) and (5.17), results analogous to these also hold good in the multivariate case here (analogoue of (4.8) 2v£+1 2 2 _ 1 )Oh ' (HHK1H2)(f/u) = 0(1) and that of (5.16) is 2(ri-v.) 2 a an - IMSEnS (n‘iklr hi 1 )J‘a fa \f(r)\2dt 1 i l m is n(nh 2 -2vi-1 -1 m a .. (UHK1“2 hi )n I ... Ia (f/u)dt, etc.). Conditions (somewhat) m a 1 necessary for asymptotic unbiasedness, (and also for strong or mean m square consistency), uniform on any subset of R , are analogous to those given for the same in the univariate case (cf. Remark 2.2 and Corollary 1). CHAPTER 2 CONVERGENCE RATES IN SEQUENCE-COMPOUND SQUARED ERROR LOSS ESTIMATION OF CERTAIN UNBOUNDED FUNCTIONALS IN EXPONENTIAL FAMILIES 2.0 Introduction. Let n be a parameter Space indexing a family of proba- bility measures .0 = {Pw\w 6 Q} on a sample space I, With an observation on a random variable X ~ Pm’ let the component prob- lem be squared error loss estimation (SELE) of Egg; 9(w). Suppose this component problem occurs repeatedly and in- dependently. Then, after n such occurrences, we have an unknown vector 9 = (w1,...,wn) 6 0n and a correSponding vector of in- dependent random variables, X = (X ,...,X ) with X. ~ P E P . 
N l n J j wj With 0 abbreviating 9(wj), we consider estimation of each J component of e = (91,...,en) with loss taken to be the average of the squared-error losses in the individual components. We call m = ($1,...,qh) a sequence compound estimator (henceforth, compound estimator of simply estimator) if “3 is (X1,...,X )-measurable. Let G be the empiric distribution j 1 function of the first 1 components of w, and R(-) be the Bayes envelope for the component problem. With a 6 > O, we say m achieves a rate 6 (at 0) if the modified regret of W! defined by _ -l n _ 2 (0.1) D (93.33) —n 21§j(cpj ej) R(Gn) 41 42 is 0(n-5) as n a m, where 21 = P1 X...X Pj' We now describe the main results briefly as follows. In Section 1, we use the method of Gilliland (1968), (Section 2), to obtain an explicit bound for \Dn(9’gb\' In Section 2, we introduce some further assumptions and notations. For the results in Sections 3-6, I,= R,.9 is an exponential family wrt p, a c-finite measure dominated by Lebesgue measure on R and n is the natural parameter space. Using the technique developed by Gilliland (1966), (Chapter III), we exhibit, in Section 3, a divided difference estimator for 9 with a rate 1/5. Based on estimators (introduced in Chapter 1) of derivatives of the average of u-densities, we exhibit, in Section 4, kernel estimators of 9 for (integer) r > 1 when 9(w) = w, and for (integer) r > 0 when 9(w) = em or w-l. These estimators are shown to have rates (r-1)/(1+2r), r/(1+2r) or (r-)/(1+2r) in their reSpective cases of w, em or w-l. In Section 5, we show that, when 9 is an identity map and w has identical components, rates with the divided difference and the kernel estimators are near, but cannot be more than, 2/5 and 2(r-1)/(1+2r), respectively. A comparison between the divided difference and the kernel estimators, when 6 is identity, is made in Section 6. Because of the reason stated there, the latter one is preferable to the former one. 43 2.1 A fiound for the Modified Regret. In this section we will prove two simple but useful lemmas. Special forms of both have been studied, among others, by Gilliland (1968) and Susarla (1970). Lemma 1 is essentially due to Gilliland (1968), and Lemma 2 is a consequence of inequalities (8.8) and (8.11) of Hannan (1957), and of Lemma 1. With u some a-finite measure dominating P V j = l,...,n, J . f d let fj be a determination of de/du Let mi 2 maxlsjsi j an N1 2 maxlsjsi‘ej‘ be such that mi and Ni are non-decreasing. Recall that e abbreviates 6(w ). As the Bayes response against J 1 G1 in the component problem, we take the version of conditional expectation zief (1.0) , .._l__l_i [:1 f > 0] . 1 21 f 1 j 1 3 Thus ‘fii‘ s.N1. For the purpose of this section only, take *0 arbitrary real valued function on R, and for j 2 1, define “j E '1 ' *1-1' Lemma 1. With $0 taking values in [-Nn, Nn], n 21 Pi‘Ai(Xi)| s 2Nn(1 + log n)u(mh). Eroof. Abbreviate, throughout this proof, Nn by N. From (1.0) it follows that, for l s i s n, (91 ' w1-i)fi . a.e. P 1 21 fj A1 = 1 . Consequently, since \9 - Wi-l‘ s 2N for V 1 s i s n, i 44 2 n (filmn) n (1.1) 21 PilAi(Xi)\ s 2N Manual 1“ [m ) 21 j n Since by Lemma 2.1 of Gilliland (1968), Z: a:(2i a1).1 5 2: 1-1 for all 0 5 si s l, 1 s i s n and n 2 l, the rhs of (1.1) is bounded above by 21mg: 1-1)“.(mn) s 2N(1 + log mum“). I Lemma 2. For any estimator 3’: (¢1’°'°’¢h) with qh and, for i - 2,...,n, mi taking values in [-Nn,Nn] and [-N1,N1] reSpectively, ‘1) (“mm s An'lz“ N P \cp (X) -¢ (x )\ 9 ~ ~ 2 J~J j j 1'1 J -1N2 + 8n n(1 + log n)u(mn). 
Proof. Unless stated otherwise, sums in this proof are taken from 1 to n. Let the argument Xj in various summands below in this proof be abbreviated by omission. Inequalities (8.8) and (8.11) of Hannan (1957) specialized to the SELE problem here yield 2 2 The identity b2 - c2 = (b-c)(b+c) followed by (0.1) and (1.2) gives 2 33((qa - $3-1)(q3 +'¢j-1 - Zej)) S nDn(g,9) . - -2 (1 3) s z £1“ch ¢j)(cpj+1l!j 63)) = - - -2 . z §j((¢3 ¢j_1 Aj)(q3+vj 61)) Since *0 is arbitrary, we can (and do) take $0 = m1. Then, since, for j 2 2, ¢3’ $1, ¢ and ej are in [-Nj,Nj], and 3-1 45 'maxlsdsn‘qB +'¢j - 291‘ s 4N“, from (1.3), -422Njfj|¢3 ' *3-1‘ s nDn(g,go s 4(22Nj§j\¢3 - tj_1\ + NHZPJ\Aj\). The last inequalities and Lemma 1 now complete the proof. . 2.2 Some Assumptions and Notations. For the remainder of this chapter, we take I = R, the real line, and assume 9 << ,1, where p. is a a-finite measure dominated by Lebesgue measure on R. With u , a fixed deter- mination of du/dx, we assume the existence of an a 2 -m such that (2.0) u(x) > 0 iff x > a. Furthermore, we take (2.1) n = {w e R\(c(w))‘1 é feu’xdmx) < m}; and, for w E n, (2.2) fw(x) = C(u))emx for x >ra, (and zero otherwise), as a fixed density of Pw wrt u. (Thus, with fj abbreviating fw , ufj is a Lebesgue density of j X ). Let a1 5 min and Bi 2 max be in a for 1 each 1 S i s‘n, and ail and Bit. We also take lstiwj 1sjsiwj (2.3) m1 = sup{fw‘w E [ai’ei]} and Ni = Sup{\e(w)\\w E [01:51]}° For T > O and x > a, define (2.4) u (x) = Lebesgue-inf of the restriction to [X,X+T) of u. ° T 46 The conclusion of Lemma 2 will be used in obtaining certain rates for various estimators to be introduced in later sections. Since the upper bound in the lemma does not depend on the first component (with values in [-Nn,Nn]) of the estimator 9 there, without any_further indication, the such component of each of the estimators (yet to be introduced), is taken to be arbitrary with gagges in [-N1,N1]. Our work in each of the next two sections is comprised of mainly two steps: First to exhibit an appropriate estimator ¢i+l of $1 and then to obtain a suitable bound for N1+1£i+1\mi+1(xi+1) - ¢i(xi+l)‘ for each i = l,...,n-l. Using this and Lemma 2, we will obtain a bound for ‘Dn(m,qp\ uniformly in e e stabs. Let 0 < hn s hn-l 3...: h1 S 1. Unless stated otherwise, we, hereagier, fix 1 with, 1 s i s n-1 and drop the subscripts in mi, a1, Bi’ Ni’ hi and Xi+1. For aj E R, let a = i-lziaj. Note that log C(w) é -1og jew'du(-) is concave on [0:9] and, hence, so is log fw(x) = mx + log C(w) for each x. Thus, inf = f A f8. Hence, V y 2 0 f aSwSB w a (2.5) q é (mi+1/(fa A f )Y/Z) 2 £i+1/(f)Y/2. Y B For a real valued function g on R and for numbers b < c, abbreviate the retraction of g to [b,c] by (g)b C. Unless stated otherwise, all the limits (of functions depending on i) are taken as i 1 m (hence, necessarily as n 1 m). 47 2.3 A Divided Difference Estimator of m with a Rate 1/5. In this section we consider the case when 6 is the identity - 1 - map. Since fj(x) = C(wj)exp(wjx), by (1.0), ti = (log f)( )[f > O]. A Motivated by this expression, the compound estimator ! to be introduced here will be based on a divided difference estimator of log f. The main idea behind the construction of this kind of estimator is developed by Gilliland (1966), (Chapter 3), in sequence- compound SELE of means in the family of normal densities. 
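As a rough numerical illustration of the construction that follows (the functional Q of (3.0) and the retraction (3.1) below), consider the sketch here; the function names, the degenerate-case guard, the toy exponential family with u the standard exponential density (so a = 0), and the bandwidth are ours and are only indicative, not the construction analyzed below.

```python
import numpy as np

def divided_difference_estimate(x, X, u, h, lo, hi):
    # Sketch of a component estimate: a divided difference of log f_bar, retracted
    # to [lo, hi].  delta_bar(y) = i^{-1} sum_j [y <= X_j < y+h] / u(X_j) is an
    # unbiased estimate of the Lebesgue integral of f_bar over [y, y+h), and the
    # log-ratio Q(t)(x) = h^{-1} log(t(x+h)/t(x)) is in the spirit of (3.0) below.
    def delta_bar(y):
        inside = (X >= y) & (X < y + h)
        return float(np.mean(inside / u(X)))
    num, den = delta_bar(x + h), delta_bar(x)
    if num <= 0.0 or den <= 0.0:      # degenerate window: fall back to an endpoint
        return lo
    return float(np.clip(np.log(num / den) / h, lo, hi))

# Toy use: mu has density u(x) = exp(-x) on (0, infinity) (so a = 0), and
# X_j ~ C(omega_j) e^{omega_j x} u(x) dx, i.e. exponential with rate 1 - omega_j.
rng = np.random.default_rng(1)
omegas = rng.uniform(-0.5, 0.5, size=400)      # natural parameters in [alpha, beta]
X = rng.exponential(1.0 / (1.0 - omegas))
print(divided_difference_estimate(1.0, X, lambda t: np.exp(-t),
                                  h=400.0 ** (-0.2), lo=-0.5, hi=0.5))
```

The bandwidth i^{-1/5} here anticipates the choice h_i = c_0 i^{-1/5} made before Theorem 1.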
Our technique to be introduced here in defining i is, however, a little different than those of Gilliland (1966), (Chapter 3), Susarla (1970), (Section 1.2), and Hannan and Macky (1971); and does not require the continuity of u for g to have a rate. The method used here to get rid of the continuity requirement of u is partly due to Yu (1970), (Section 2 of the appendix), where he exhibits kernel estimators of a density function and its derivative. Define a real valued functional Q on the space of all real valued non-negative functions t on R by — + (3.0) Q(t)(x) = h 1(10g Effiyhl)[t(x+h) + t(x) > 0] . ZhN y+h Let = e and, for ° = l,...,i, let 6.( ) = f, and n J J y fy J Bj(y) = [y s xj < y+h]/u(Xj). Note that bj is well defined with probability one, and is an unbiased estimator of éj The compound estimator *9 which we pr0pose for w, has (i+1)st component (3.1) 11,100 = (Q(é)(X))a,B- Abbreviate Q(5)(x) and Q(6)(x) by Q(x) and 6(x) respectively. 48 For x > a, define * u (x) = Lebesgue-sup of the restriction to [x,x+2h) of u. by u . In Lemma 3 below and in its proof, Q, Q, Denote u * 2h * - u*, u ,m. and f all are evaluated at a fixed point x > a. Lemma 3. V v > 0 (3.2) 31(\Q - 6\ A 2N)Y s ko(y)(ih3f ui/u*)’Y/2 where ko(y) ' vP(v/2)(l6n3(1 ‘l--T\2)/31<‘+)'Y/2 with k = l - hnu*m. Proof. The lhs of (3.2) is A 2 (3.3) If," gum - Ql > v]d + p2(v)>d. where p1(v) = P1[(6 - Q) >'v] and p2(V) = P1[(Q - Q) > v]. Our method of the proof here involves obtaining an appropriate upper bound for p1(v) + p2(v) with 0 < v < 2N. Fix v in (0,2N) until stated otherwise. For 1 - 1,...,1, let Y1 = bj(x+b) - R ehvb Let vj - Pij and 02 = i var(§). We will first obtain (3.7) below by obtaining suitable upper bounds for 5 and 02. Notice j(x), where R = 5(x+h)/5(x). that Vj = 6j(x+h) - R ehv6j(x). Hence 5 = (l - ehv)5(x+h), and we get (3.4) —n5(x+b) s G s -bv5(x+b) By independence of Yl’°'°’Yi and by cr-inequality (Loave (1963), p. 155) we have 2 2hv 2 2 (x+h) +'R e P13§(x)). (3.5) 16 s 2: Pij s 22i(ijj 49 Since v < 2N, R - 8(x+h)/5(x) and, for y = x, x+h, ijio') s 61(y)/u*, by (3.5) we get a2 s 20 + Rn2)5(xH1)/u* = 2(<5)'1>52>/u,. Now, since, for l s j s 1, mi 6 ['N.N]. _ 6 (y) w (t-x) (3.6) nnlsfla-i-fyfiej dtsnn for y=x,x+h. 1 Therefore, weakening the final upper bound obtained above for 02 by the first inequality in (3.5) we get u*foz s 2(1+n2)m'152(x+n). This last inequality and (3.4) give - 2 h3v2fu* (3.7) fl;— 2 —-—-§—- . a 20+“ )11 Next we will obtain (3.10) below by obtaining apprOpriate lower bounds for oz, 6. vj and -Yj. By independence of ‘11,...31 and by the facts that v > O, Pj('51(°)) > 0, and 31(x'l'h)8 (x) I O with probability one, we get .1 (3.8) 02 2 i-lzi(var('6 (x+h))+ 1?.2var('8j (x))) . .1 * Now the definition of u and the second inequality in (3.6) yield, for y Ix, x+h, . t+h - 2 Var(3j(Y)) jy (fj/uo 610’) 2 (51m (1 - u*6j+/u*) 2 (6j(y)(l - hnp*fj)+7u*) 2 k+6j(y)/u*, 50 where k is as given in the lemma, and the last inequality follows from the definition of m given in (2.3). Consequently, from (3.8) we get (3.9) u*02 2 (5(x+h) + R25(x))k+ = (1+R)5(x+h)k+. h Next observe that -R e vBj(x) S'Yj S Bj(x+h). Therefore, since for y = x, x+h, Bj(y) s l/u* with probability one, Y S 1/u* J and -vj S RTVu*, These upper bounds for Yj and -vj together with (3.4) and (3.9) yield (Y j-vj)(-5/oz) s {(1+m)nu*/(k+(1+R)u*)} 2 * + - S n u (k u*) 1. Hence 2 * 2 (3.10) Y. - v. s 11—”— (- 9—) J J k+u v * We will use (3.7) and (3.10) to obtain a suitable upper bound for p1(v). Note that the event in p1(v) is [§.>’0]. 
Therefore,(3.10) and the Bernstein inequality stated in (2.13) of Hoeffding (1963) give , - 2 (3.11) p1(v) = Pi[Y - C) > -v] S eXp{- 1L v) 7 * } ~ 2 n o 2(1 +'IL§?'9 3k u* 3ik+h3vzfui 5 exp{- 3 2 * 161'] (l + T1)u where the last inequality follows by (3.7) and by the fact that 2 * - * (1 +‘“ u /(3kfh*)) S 4(3k+u*) lfizu , since n 2 1, k+ s 1 and * u 2L1 *. By interchanging x, x+h in the definition of Yj's and by applying the techniques used for bounding p1(V), we see that p2(v) is also bounded above by the extreme rhs in (3.11). 51 Now bounding above the integrand on the rhs of (3.3) by the upper bound just obtained for p1(v) + p2(v) and then per- forming the integration there after extending the range of integra- tion from (0,2N) to (0,oo) we get the desired conclusion. I Lemma 4. g(l) supt>a\(Q(5) - )(t)\ s 4(Nm2h . Proof. Since, for 1 s j s i, mj E [-N,N], for each integer v 2 0 and V t E [°, - + 2h] we have \£(V)(t)\ v wj(t-.) v (3-12) fj(') = \wj\e s N T], and f (t) w.(t-°) (3.13) -—i——- = e J 2 “'1 f.(') J For the purpose of this proof only, let gj = wglfj° Since g(t) = g(t+h) - g(t), by Cauchyvmean value theorem, see Graves (1956), p. 81, for some a in (0,1) (3.14) m =‘é((::£t + h + eh) ___ f(t + h + 5h) 5(t) g (t +-eh) f(t + eh) Therefore, by (3.14) and by mean value theorem, Q(5)(t) = 1 (1) t'=t+yh h' log(f(t + h + eh)/f(t + €h)) = (log E(t')) for some Y 6 (0,2). Making another use of mean value theorem at the third step below, we thus have, for some y', y” E (O,yh) 52 -<1) Eu) Eu) \Q<5)(t> -( >1= \(f )(t + Yh) - +\(-=-—>! f(t+\(h) - mm (3.15) “(2) l ”321%?“ “+3 >\ -<1> f + \(T)(t)"f(1)(t+¥")\) s 4h(N’n)2 where the last inequality follows by applying (3.12) for v = 2,1, (3.13) and the fact that lf(1)/f| SLN and y < 2. Since the rhs of (3.15) is independent of t, the proof of the lemma is complete. . Observe that edh s (6j(x+h)/5j(x)) s eBh for each 1 s j S 1. Therefore, Q is in [a,B]. Since Wi = f(1)/f and N = ‘a‘ V ‘5‘, by (3.1) and Lemma 4 we get (3.16) m- ind s \Q ,B\+ \Q Hi \ s (\Q - (N A 2N) + 4h(NTDZ. Therefore, (3.16) followed by cr-inequality (see Loéve (1963), p. 155), Lemma 3 and (2.5) leads to Lemma 5. V y 2 O 3 - 2 * 2 2 2 Emma» - imam s k6(v){(ih > Y/ Wu lug“ q!) + (W )Y) 4. where k6(y) = 2(Y-1) {(4112)Y v ko(v) with k in k0(y) replaced by mama - hnu*>}. 53 For the remainder of this section, let c0.Cl,... denote absolute constants, and let . -l/5 h - hi - coi . We will now state and prove our main result of this section. Numbers k1,k2,... below are finite and independent of n. Theorem 1. If V i = l,...,n, (111.0) h1‘p*m s 1 - c1, and if for a 5 6 [0.1] a a v 6 [5,1] and k and k2 a with 1 {1 I=2-l-y, V i =l,...,n, (A1.1) u((u*/ui)Y/2qY) s kiNY-1h(6-Y)(1-g), and (Al.2) N s k2h§(6'Y), then 3 a R3 3 \Dn(m,i)\ S R3112, uniformly in (B E Xrllllaiiei]. Remark 3.1. Assumptions (A1.0), (Al.1) and (Al.2) together imply the existence of a R4 3 V i = l,...,n, (3.17) u(m) s k4N-2h(85-11Y)/6 . To prove (3.17) we proceed as follows. Since T} 2 l, (A1.0) implies “umum < 11.1. Consequently, £01 A fB < (hu) -1. Therefore, since * 2 - m 2 m and since (u /u*) 2 (u) 1, from (2.5) the u-integrand 1+1 in (ALI) is no less than th/Z. Hence, by (A1.l) 54 (3.18) N2u 0, then the rhs integral in (3.19) is not finite, unless u(x)'~ O (as x aim) at least as fast as e-Lx for some L > (25 - ya)/(2-y), which holds, since 0 s y s 1, a4< B, a“ and '5 1 in i, only if N (= ‘3‘ V \a‘) = 0(1). Thus, in such situations (Al.l) holds only if N = 0(1). 
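For computational concreteness, the estimator of (3.0)-(3.1) with the bandwidth h_i = c_0 i^{-1/5} appearing in Theorem 1 can be sketched as follows. The Python code below is an illustration under simplifying assumptions, not the construction analyzed above: μ is taken to be Lebesgue measure on (0, ∞), so that u ≡ 1 there, the family is f_ω(x) = (-ω)e^{ωx} with ω in [α, β] ⊂ (-∞, 0), c_0 = 1, and the handling of empty windows is a choice made only for the sketch.

    import numpy as np

    rng = np.random.default_rng(1)

    # Illustrative assumptions: mu is Lebesgue measure on (0, infinity), so u = 1 there,
    # and f_omega(x) = (-omega) exp(omega x) for x > 0, with omega in [alpha, beta], beta < 0.
    alpha, beta = -2.0, -0.5

    def u(x):
        return np.ones_like(x)

    def delta_bar(y, X, h):
        # delta-bar(y) = i^{-1} sum_j [y <= X_j < y + h] / u(X_j), as in Section 3
        inside = (X >= y) & (X < y + h)
        return np.mean(np.where(inside, 1.0 / u(X), 0.0))

    def divided_difference_estimate(x, X, c0=1.0):
        # h^{-1} log( delta-bar(x + h) / delta-bar(x) ), retracted to [alpha, beta]:
        # an informal rendering of (3.0)-(3.1), with the bandwidth h_i = c0 * i**(-1/5).
        h = c0 * len(X) ** (-1.0 / 5.0)
        num, den = delta_bar(x + h, X, h), delta_bar(x, X, h)
        if num <= 0.0 or den <= 0.0:      # guard playing the role of the indicator in (3.0)
            return alpha                  # any value in [alpha, beta]; a choice for the sketch
        return float(np.clip(np.log(num / den) / h, alpha, beta))

    # Toy check with identical components omega_j = -1 (Exp(1) data): the estimate at
    # x = 1 should be near -1, though noticeably noisy at the i**(-1/5) bandwidth.
    X = rng.exponential(scale=1.0, size=50000)
    print(divided_difference_estimate(1.0, X))

The visible noise in the toy check at moderate i is in keeping with the slow bandwidth h_i = c_0 i^{-1/5} used in this section.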
It will be shown in the example following the proof of the theorem that, for the case u(x) = (2n)-%e-x2/2[-m < x < w], (Al.1) holds only if N 1 m not faster than (log i)%. From these examples we con- jecture here that for (Al.l) it is perhaps necessary that, for some T1.T2 non-negative, N = 0(1) + T1(log i)T2, whatever be the form of u. If this is true, then of course (A1.l) also implies (Al.2) . Proof of Theorem 1. By (Al.2), n (= eZhN) 1 l, and hence by (Al.0), k6 in Lemma 5 is bounded in i. Now fix v and 5 satisfying the hypothesis of the theorem. The trivial bound 55 yd s 2N yields - ¢1(X)\ S ZNI-Ygi+1wi+l(x) .. Ei+lwi+1(x) i-l/S ”1+1 ' §1(X)‘Y. Therefore, since h = , Lemma 5 gives k5 and k6 3 V’i = 1,...,n _ - 2 2 2 (3.20) N1+1£i+lwi+1(X) - “(ml s k5N1+1N1 Yi Y/5(LL((U*/U*)Y/ qY)+N Y) i-o/S SR6 , where the second inequality follows from (A1.l) and (Al.2). Since X abbreviates Xi+l and (3.20) holds for each 1 s i s n-l, {12; NjEjHj-1(xj) - $j_1(xj)\ s k6n'lzri'1j'Y/5. Thus, the first term on the rhs of the inequality in Lemma 2 with 53 there replaced by i is bounded above by (k3/2)h: uniformly in 9 E x:[ai,51], and so is the second term there by (3.17), since k4 is independent of i and n. I Now we will show how the conditions of Theorem 1 reduce to a single condition on N when the family of densities involved is normal- 2 Example N(m,1), Let u(x) = (2n)-%e-x /2[-m‘< x < m]. -— 2/2 Then a = -mq C(w) = e w and n R. LEt -cy=B=N>0. ‘We will show that all the assumptions of Theorem 1, with y = l and any fixed 6 6 [0,1], are satisfied iff 3 21) N é N = 0(1) + (LZQ l ')% ( . i 15 og l . We will first prove the 'if' part. Clearly (Al.2) holds. (Sonsidering the upper and lower bounds for the ratio u(t)/u(x) * 2h‘x‘ for x s t < x + 2b, we get u (x) S u(x)e and u*(x) 2 e-2h(\x\+h). Therefore, u(x) 56 (3.22) u*fw(x> s e2h\x‘u2-4h\X\)} S exp{2h(h + w sgn x)}. 4...). 2 By (3.22) u m = u SUP‘w‘Swa S eXp(2h + 2hN). Therefore, since h = coiull5 and hence hN is bounded uniformly wrt co in a neighborhood of zero, by a suitable choice of CO, (A1.0) holds. For this paragraph only, let N abbreviate N1+1 (instead of Ni). Now observe that mi+1(x) é sup‘w“wa(x) S exp(x2/2)[‘x\ S N] + exp(N\x\-N2/2)[\x\>N]. Therefore, since fa(x) A fB(X) 2 exp(-N\x\ - N2/2), by (2.5) Suez-erupt 2 1 2 (3.23) q1(x) s e \(egx [m SN] + e”2N ”MEN >511). * Moreover, using the bounds obtained above for u and u we get * * (3.24) (E'§'(x))% s (211)%8XP(1£X2 + 3h\x\ + 2h2) u* Thus by (3.23) and (3.24), u{((u*/u:)%ql)[\x\ S N]} S 2N exp(N2 +-3hN + 2h2) and 2 ui<%qi>[\x\ > NJ} s <2n>'%exp(- §"+ 2h2>fcxp<-tx2 + 3(h + §)\xk)dx 2 S czexp(2N + 9hN) Consequently, since hN is bounded, we get 2 1. (3.25) u((u*/u:)2q1) S c3e2N 57 Thus, by (3.21) and (3.25), (Al.l) holds with y = 1. Conversely, by Remark 2.2 we note that the lhs of (Al.1) is bounded below by the rhs of (3.19). Therefore, since B = -a B N, with v = 1, 5‘21‘12 2 (3. 25)‘ “((u */u*) ql) 2 (2”) exp(- -—)Iexp(- %—+ 2 = (8n)%e2N 3Nx 2 )dx Thus (A1.l) with y = l and any 5C[0, 1] holds only if 2 eZN s (8n)%k1h2(6-1) ’3 , or only if (3.21) holds. II The following corollary, which is a consequence of Theorem 1, asserts that for certain families of densities, Dn(m,§) = 0(hn) uniformly in m 6 an. It also shows how the condition (Al.l) of the theorem is simplified in fixed N case. Recall from (2.4) and the definition following (3.1), that u*(é UZh) and u* de- pend on i, l S i S n-l. Corollary 1. Let a and B be constantswrt i. 
If H is such that (Al.0) holds, and for a 6 6 [0,1], With W(a,8) = a ' (55/2): (3 26) “{(exp(xw(a,6))[x S 0] + exp(xw(B’a))[x > 0])(— u:)6/2< u* ]< m for i = l, (e.g., take any 6 in [0,1] with 6 < ZB/a, and u(x) = xT-l[x > 0], 'r 2 1 or 2:(j+1)[j S x < j+1]) , ~ 5 . . then Dn(9’!) = 0(hn) uniformly in m E [a,e]n Proof. Since N E ‘a‘ V ‘8‘ is constant wrt i, (Al.2) holds with y = 5. C(w) is clearly bounded away from 0 and m _._ 5/2 ‘ — (min/(fa A f8) ) s 2 *5/2 . . . * (u*/u ) times the u-lntegrand in (3.26). Thus, Since u i on [0231- Hence, since m 5 sup‘w‘ngw, q6 58 and u*1 in i, (Al.l) with y = 6 holds by (3.26).. As a final comment to this section, we state here that the rate 1/5, that is shown to be achieved, under certain conditions, by i, is perhaps much slower than that i could actually achieve. This will be supplemented in Section 5, by showing that Dn(w,§) = -(2/5)+- 0(n ) V w 6 an with identical components. 2.4 Kernel Estimators with Rates Near % when 9(w) = w, ew 9£_ m-l. In this section we consider the situations where 9(m) is w, em or m-l. In each of these cases we will exhibit, for each s >'0, a class of compound estimators with rates (l-e)/2 uniformly in 9 e xlEQi’Bil' The classes of compound estimators to be exhibited in this section are based on types of kernel functions introduced by Johns and Van Ryzin (1972) in Empirical Bayes Linear loss two-action problems in exponential families. Thus in this section we will have two sets of assumptions; the one, which was not needed in Section 2.3, involves the kernel functions defining the classes of estimators, and the other involves the family of densities. Recall from the latter part of Section 2 that the dependence of h,a,B,N,m and qY on i is abbreviated by omission, where i is fixed with l S i S n-l. For 0 = 0,1 and integer r > v, let x: be defined as in Section 1 of Chapter 1. As in Chapter 1, with a fixed Kv 6 xi, define v+l)-l i 3(v) , _. . - (4.0) f ( ) — (ih 21{(Kv((Xj o)/h)/u(Xj))[u(Xj) > 0]} . Estimation of Vi (and hence exhibition of compound estimators) in this section, involves estimation of one or both of 59 the functions f and f(1). It has been seen in Chapter 1 that g(v) (v) , as an estimator of f , has various asymptotic properties. we will make , according to our need, applications of one or both of the functions f(0) and f<1> in defining our compound estimators here. The reason we have taken here r > v (instead of r 2 v, as is taken in Chapter 1) is that x:, for any integer r > v, is non-empty and f here is (infinitely) differentiable. Moreover, we assume here that Kv 6 L2(O,l), (e.g., Kv could be the v-th element of the dual basis for the subspace of L2(0,l) with basis ”'11). Denote ‘yj‘Kv(y)‘dy/j! by Mini“ Let s = ‘a‘ V ‘5‘. (In case 9 is identity map, 8 = N-) {1.y.---,y ‘dd'h "(1‘) Since maxlSjSi‘wj‘ S s, by mean value theorem, x ‘f ‘dy S hsrehsf. Hence, (2.8) of Chapter 1 gives (4.1) ‘Pif(v) - E(V)‘ s ‘k‘r_1 ”hr-vsrehsf . - hs - Moreover, since (Lebesgue) ess-supo‘tsl(f/u)(- + ht) S e (f/uh)(-), the inequality in (4.0) of Chapter 1 followed by the equation in the proof of Lemma 1 there, gives (4.2) var(f(v)) S fehs(ih2v+luh)-1HKvH: . As in Remark 5.2, and in Inequality (5.18), both of Chapter 1, a choice of b, that balances the two terms hr-v 2 - v+l) k and (1h appearing in the bounds for the bias in (4.1) 2(v), i and the standard deviation in (4.2) of the estimator s (4.3) h = i‘1/(1+2r). 
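In computational form, (4.0) with the bandwidth (4.3) reads as follows. The Python sketch below is illustrative only: the kernels are polynomial solutions of a small Hilbert-matrix system on (0,1) (one admissible choice, in the spirit of the dual-basis example mentioned above), u is assumed known, and the toy check takes all ω_j = -1 with μ Lebesgue measure on (0, ∞); these concrete choices are assumptions of the sketch.

    import math
    import numpy as np

    def dual_kernel(v, r):
        # Polynomial K_v on [0, 1) with integral_0^1 y**j K_v(y) dy = v! * delta_{jv},
        # j = 0, ..., r, obtained by solving a Hilbert-matrix system; one admissible
        # choice of kernel (an assumption of this sketch).
        H = np.array([[1.0 / (j + k + 1) for k in range(r + 1)] for j in range(r + 1)])
        coef = np.linalg.solve(H, math.factorial(v) * np.eye(r + 1)[v])
        return lambda y: np.polyval(coef[::-1], y) * ((y >= 0.0) & (y < 1.0))

    def f_hat(x, v, r, X, u):
        # (4.0): f-hat^(v)(x) = (i h**(v+1))**(-1) sum_j K_v((X_j - x)/h)/u(X_j) [u(X_j) > 0],
        # with the bandwidth h = i**(-1/(1+2r)) of (4.3).
        i = len(X)
        h = i ** (-1.0 / (1.0 + 2.0 * r))
        K = dual_kernel(v, r)
        uX = u(X)
        terms = np.where(uX > 0.0, K((X - x) / h) / np.where(uX > 0.0, uX, 1.0), 0.0)
        return terms.sum() / (i * h ** (v + 1))

    # Toy check (assumptions: mu = Lebesgue on (0, inf), u = 1, all omega_j = -1, so that
    # f-bar is the Exp(1) mu-density): the estimate below should be close to exp(-1) ~ 0.37;
    # f_hat(1.0, 1, 2, X, u) analogously targets f-bar'(1) = -exp(-1), with larger variance.
    rng = np.random.default_rng(2)
    X = rng.exponential(size=20000)
    print(f_hat(1.0, 0, 2, X, lambda x: np.ones_like(x)))

With v = 0 this is a one-sided kernel estimate of f-bar itself; v = 1 targets its first derivative at the cost of a larger variance factor.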
60 This choice of h has been adopted by various authors, (e.g., Susarla (1970, Theorem.2; Yu (1970), Theorems 1.1 and 2.1; Johns and Van Ryzin (1972), Theorems 3 and 4), working on certain problems utilizing kernel estimators of a density or of its derivative. We too agggg (4.3) throughout this section and in Theorem 6 of the next section. For 0 < y S 2, let M v be the y-th mean error of f(v), 3 = 3(v) -(V) Y - = 2 i.e., My’v Bi‘f - f ‘ . Then, slnce M2,v (lhs of (4-1)) + lhs of (4.2), Liapounov's inequality followed by cr—inequality and (4.3) yields y(r-v) r- y - y/2 (4-4) MY’V 5 °v(Y)h {(8 f) + (f/uh) } where - hS y hs 2 y/Z Cv(Y) - (‘k‘r-1,ve ) V (e HKVHZ) . Hereafter, wg abbreviate f(v) by f and MY 0 by My' Recall that X abbreviates X . Inequality (4.4) will be used r+1 in obtaining an upper bound for Ei+11¢i+l(x) - ¢i(X)‘ where qa's are yet to be defined. Let c1,c2,... denote absolute constants. We now discuss the three cases separately. Case m. We consider here the case when 9 isthe identity map. Since fj(-) = C(wj)exp(mj-), (1.0) Specialized to the case 9(w ) = wj yields J _ -(1) - (4.5) ‘i - (f /f). Since x: is non-empty for any r > v, and f here is (infinitely) differentiable, we restrict, throughout our discussion of this case, r in (4.0) to be at least 2. Define a compound estimator ..I‘Il. 61 ; with its (i+l)st component, $1+1(X), given by (4'6) %1+1 = (%(l)/%)a.6’ where f”) is given by (4.0) with h = 1'1/(H2r). Define l-I - 111 by (a 7) H = hr-l é i-(r-l)/(1+2r). Recall that N = s = ‘a‘ V ‘B‘ and each of 0:9 and N hides subscript i. The following lemma which plays the central role in proving Theorem 2 below is a consequence of (4.4) and Lemma A.2 or the appendix. Lemma 6. V p > 0 and 0 < v S p A 2, - P P'V V rv v/Z (4-8) 21+1‘Wi(x) Wi+1(x)‘ 5 B(P)N H (N +‘u(qY/Uh )) + where B(p) = 2P+(y-1) (1 +-(hN)Y(l+2Y))max (Y)- v=0,lcv Proof. Fix 0 < y S p A 2. Since *1 and Wr+1 are in [u.B] and N = ‘a‘ v ‘3‘, by (4.5) and (4.6), ‘1i - yi+1‘ s 2N. Consequently, Lemma A.2 of the Appendix and the definitions of i d M v y el + _ p P+(Y‘1) p-Y - - Y Y (4.9) 31‘“ (1‘ s 2 N (f) V(MY 1 + (1+2 )N MY 0). , 3 Since 8 = N, by (4.3) and (4.4), h(v-1)YM is bounded above by Y2V - - - 2 Cv(Y)(Hf)Y(NrY'+ (fuh) Y/ ). Consequently, the rhs of (4.9) is - - - 2 bounded above by B(p)Np YHY(NrY+ (fuh) Y/ ), where B(p) is as given in the lemma. Since X abbrev1ates Xi+l’ taklng expectation wrt Pi+l on both sides of the inequality just obtained we get the desired 62 conclusion from the definition of qY given in (2.5). I Lemma 6 with p = 1 will be used to prove our main result below. The numbers b0,b1,... below are finite and independent of n. Theorem 2. Recall from (4.7) that H = hr"1 = i-(r-l)/(l+2r). If for a 5 e [0.1] a a b0 3 v i = l,...,n, (A2.0) u(m) S b iHé/(N2(l + log i)), 0 and if 3 s v 6 [6,1] and b1 and b2 3 with §-1 = 2 + y(r-1), (A23) u(qY/uzlz) s bINY'1n(5'Y)(1‘§) v i = l,...,n-l, and (A2o2) N s b2H§(5'Y) v i = l,...,n, 6 . . n then 3 3 b3 3 ‘Dn((£,i)‘ 5 b3Hn uniformly 1n (3 E XIEHi’Bi]. Remark 4.1. For r = 2, (A2.2) is equivalent to (Al.2), while (A2.l) is implied by (Al.l). Moreover, since the rhs in 1-5/2 (A2.0) is no less than b i /(N2(l + log i)), by Remark 2.1 0 via (3.17) there, for each r 2 2 (A2.0) is implied by (Al.0), (Al.l) and (Al.2) together. Thus assumptions of Theorem 1 are stronger than those of Theorem 2,at least for r = 2. Remark 4.2. By (2.5) via the definition of m, the lhs of (A2.l) is no less than (4.10) ”(EL—Y7?) = C_(%__ .‘m ul'Y/Ze(B'Yo/2 )xdx . 
(Ufa) CY (01) 8 Equation (4.10) is the same as (3.19). Hence the comments in Re- mark 2.2, regarding possible necessary conditions for the finiteness 63 of the integral on the rhs of (4.10) (and hence for (A2.l)),remain valid here too. Proof of Theorem 2. Since hr.1 5 H and (y-6)§ < (r-l)-1, by (A2.2) hN i 0. Therefore, since 3 = N, Cv(Y) in (4.4) is bounded in i, and so is B(l) in Lemma 6. Consequently, Lemma 6 with p = 1 gives a b such that 4 1- 2 (4.11) Ni+121+1‘¢i(X) - yi+1(X)‘ s b4N1+1N Yip/(NW + u(qY/ux/ )) S b51-6(r-1)/(1+2r) where, remembering g'1 5 2 + y(r-l), the last inequality follows from (A2.l) and (A2.2). Since (4.11) holds for each 1 S i S n-1, and X there -lzt11-li-6(r-l) /(l+2r). Thus, the first term on the rhs of the inequality in Lemma 2, with -l abbreviates Xt+1, n ngiEi‘wi-l(xi) - 61(Xi)‘ S bsn m there replaced by 3, is no more than (b3/2)H: uniformly in w 6 x:[ai’ai]’ and so is the second term there by (A2.0), since b0 is independent of i and n.- The hypotheses of Theorem 2 are satisfied for many exponential families. In Example N(m,l), introduced in Section 2, we will show that all the assumptions of Theorem 2, with -a = B = N>0, y = l and any fixed 6 6 [0,1] are satisfied iff , _ rLr-l)(l 4;) . s (4.12) N -~N1 — 0(1) + (2(l+r)(1+2r) log 1) . Note that for the case r = 2, (4.12) is the same as (3.21). Since the lhs of (3.25) is bounded below by u(ql/ug), we h have from there 64 2 g 2N (4-13) u(ql/uh) S cle Thus, if (4.12) holds, then, with Y = l, (A2.2) holds; from (4.13), (A2.l) holds; and,2from the fact that m(x) é sup‘m‘Sme(x) S e implies u(m) S eN l2, (A2.0) holds. 0n the other hand, we have noted in Remark 3.2 that the lbs of (A2.l) is no less than the rhs in (4.10). Therefore, with Y = 1 2 2 2 -% -N /4Ie-x /4 +.§E§-dx = (8n)%e2N . (4-14) u(ql/UE) 2 (2n) 8 2 Hence, (A2.l) with y = 1 holds only if (4.12) holds. II Remark 4.3. Theorem 2, specialized to the above Example N(w,l), improves the univariate version of the result in Theorem 3 of Susarla (1970). We have shown, through a simpler and shorter proof, the existence of less restrictive kernel estimators with rates (r-l)6/(l+2r) where 6 is given by (4.12). Our rates are strictly higher than the rates (r-l)/(4+2r) shown to be achieved by his kernel estimators in the bounded N-case, provided, in our unbounded N-case, N satisfies (4.12) with some 1 > 6 > (1+2r)/(4+2r). Note that the number of restrictions on the kernel functions increases as r increases. The following corollary shows how the conditions of Theorem 2 are simplified greatly in fixed N case. From (2.4) and (4.3), remember that depends on i with l S i S n-l. uh Corollary 2. Let a and B be constants wrt i. If for a 6 6110.1]. and with w(oz.e) = 01 - (53/2). 0/2 h (4-15) u{(eXP(m(o:,8))[x S 0] + eXP(XW(B,oz))[x > 0])/u } < co 65 for i = l, (e.g., take any 6 in [0,1] with 5.< ZB/a, and u(x) = xT-1[x > 0],'T 2 1 or E;(f+1)[i 5 x < 1+11), then Dn(m,i) = 0(H2) uniformly in 9 e [a,e]“. M. Since N 2 ‘cy‘ v ‘5‘ is constant wrt i, (A2.2) holds for y = 6. C(m) is clearly bounded away from 0 and m on [are]- Therefore. m(x) e sup f (x) s c2(exp(6x)[x >.0] + ‘w‘SN m exp(ax)[x S 0]), hence (A2.0) holds since a and B are in 0. Moreover, since (fB A fa)(x) = C(m)(exp(ax)[x >.0] + exp(Bx)[x S 0]), q5 é mi+ll 0 and 0 < v S p A 2, -y/2 - 2 - /2 , afllqyioo - ii+1(X)‘p) s B(p)H¥Np{er + N Y/ ”{qy((uh) Y + (uh) )1}. + where B(p) - ZPHY‘D (2 + 2Y)co(v)- Proof. Since wj E [a,5] for all 1 S j S i, by (1.0) e" si 5 e3. Thus (4.16) and (4.17) followed by the fact that N . 
e3 give ‘ S 2N. Therefore, since (EVE) 5N. ”i " i1+1 it follows from'Lemma A.2 of the appendix, that p P+(Y'1)+ ' “Y I")! Y Y (4,13) Ei‘wi - ¢i+1\ s 2 (f) N {MY + (1+2 )N My} . Since f' s Nf, from (4.4) with v 0, M‘ s c0(y)HY{(Nsr53')Y + (Nf/ufi)Y/2}. Consequently, by (4.4) with v = 0, rhs of (4.18) s B(p)NpHV{er + (Nf)-Y/2((uh)-Y/2 + (111;) ’Y/Zn, where B(p) is as given in the lemma. Now (4.18) followed by the preceding inequality and the definition of qY in (2.5) yield the des ired conclus ion . . Let the numbers bo,b1,... of n. Now we will obtain the main result for the case under study. below be finite and independent 67 - /' 1+2 Theorem 3. Recall from (4.7)' that H = hr = i r ( r). If for a 6 6 [0,1] 3 a b0 3 V i = l,...,n, . 6 2 (113.0) u(n) s boiH /N (l + log i), and if 3 a y 6 [6,1] and b1 and b2 9 V i = l,...,n-l, ‘Y/2 . "le -2+¥/2 5w (113.1) ”{qy((”h) + (uh) )} s blNifl H and ry -2 6-y (A3.2) s s b2N1+1H , 6 . n then 3 a b3 9 ‘Dn(e,i)‘ S b3Hn uniformly 1n 9 E X1[ai,ei]. Remark 4.4. If n S (-m,0], then taking 3 E 0 (implying N 5 eB E l) (A3.2) becomes ‘q‘rv S b Hé-Y. In general, (A3.2) 2 holds if ‘o‘ V ‘B‘ 1 m at rates not faster than b4log i for some th < r/(1+2r). Keeping the difference in H's in two Cases w and ew in mind, we see that (A3.0) implies (A2.0). Finally, since the lhs of (A3.l) is bounded below by the lhs of (A2.l), comments, quite similar to those given in Remark 3.2 regarding possible necessary conditions on o and B for (A3.l),can be stated here too. Proof of Theorem 3. Since h = 1-1/1+2r, H = hr and N1+1 2 N1, (A3.2) implies hs # 0. Hence C(V) in (4.4) is bounded in i, and so is B(l) in Lemma 7. Consequently, since N (.2 Ni) S N Lemma 7 with p = 1, and the hypotheses (A3.l) i+1 ’ and (A3.2) yield a b5 such that V i = l,...,n-l, .-6 / 1+2 (4.19) Niflgifl‘yim - imm‘ s bsl r ( r). 68 In view of (4.19) and (A3.0), the remainder of the proof follows by arguments identical to those given in the second para- graph of the proof of Theorem 2.]. Remark 4.5. Theorem 3 here improves Theorem 6 of Samuel (1965). Restricting m's to a bounded interval of a, She exhibits estimators a? and shows that, under certain conditions, V e > 0 Dn(m,gr) < e V n 2 some n0(m,e) < m. We do not require her continuity assumption on u; and her other hypotheses always imply ours, (this may readily be seen through Corollary 3 below). By analyses analogous to those made earlier in Example N(m,l), it can be verified that the hypotheses of Theorem 3, with y=l,6€[0,1] and -a=a>0 (sothat s=B and N = exp(e) 2 1) are satisfied iff = 0(1) + (932$)— log i) The hypotheses of Theorem 3 reduce to a rather simple one in the fixed N case, as we can see in the following corollary. Corollary 3. Let a and B be constants wrt i. If for a 6 6 [0,1], (4.15) with u:/2 there replaced by uglz + (ufi)6/2 holds, (e.g., take examples mentioned in Corollary 2), then Dn(9’i) = 0(Hg) uniformly in m E [q,B]n 2322;. The proof is identical to the one given for Corollary 3. I In the next section we will point out that, in certain cases, Dn(9’1) are 0(H:-). Thus one may eXpect ] to achieve rates much higher than those indicated in Theorem 3. 69 Case m-l. We now consider the situation where the com- -1 ponent problem is SELE of B(w) = w . One of the important examples, where such estimation arises, is sequence compound SELE of scale A in F(A.T)-family: (T(T))-1xT-1k-Te-XIx[x,T > 0]. 
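To see that this family fits the form (2.1)-(2.2) of Section 2.2, one may take (a routine verification; the identification ω = -1/λ is spelled out here only for convenience)

\[
u(x) \;=\; \frac{x^{\tau-1}}{\Gamma(\tau)}\,[x>0], \qquad
\omega \;=\; -\frac{1}{\lambda} \;<\; 0, \qquad
C(\omega) \;=\; \Bigl(\int e^{\omega x}\,d\mu(x)\Bigr)^{-1} \;=\; (-\omega)^{\tau},
\]

so that \(f_\omega(x)\,u(x) = (\Gamma(\tau))^{-1}x^{\tau-1}\lambda^{-\tau}e^{-x/\lambda}\,[x>0]\) is the \(\Gamma(\lambda,\tau)\) Lebesgue density, and SELE of the scale \(\lambda = -\omega^{-1}\) is, up to the sign of the estimate, SELE of \(\theta(\omega) = \omega^{-1}\).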
This of course includes the case of sequence-compound SELE of \(\sigma^2\) in the \(N(0,\sigma^2)\)-family, \((2\pi\sigma^2)^{-1/2}\exp(-x^2/(2\sigma^2))\,[-\infty < x < \infty]\), since \(x^2\) is sufficient for \(\sigma^2\). Throughout the study of this case, we assume that \(\beta_n < 0\) (thus \(\beta_i < 0\), and hence every \(\omega_j < 0\), for \(i = 1,\dots,n\)). Since \(f_j(x) = C(\omega_j)e^{\omega_j x}\) and, for \(j = 1,\dots,i\), \(\omega_j < 0\), we have \(\omega_j^{-1} f_j(x) = -\int_x^{\infty} f_j\). Thus specialization of (1.0) to \(\theta_j \equiv \theta(\omega_j) = \omega_j^{-1}\) gives

(4.20)  \(\psi_i(x) = -\bigl(\textstyle\int_x^{\infty} \bar f\bigr)\big/\bar f(x)\).

Since \(\alpha \le \omega_j \le \beta < 0\) for all \(j = 1,\dots,i\), by (1.0) \(\beta^{-1} \le \psi_i \le \alpha^{-1}\). For the remainder of this section, let \(\hat f\) (\(= \hat f^{(0)}\)) be given by (4.0) with an integer \(r > 0\), and let \(L\) denote \(\beta^{-1}\log H\). Motivated by (4.20), our proposed compound estimator has \((i+1)\)st component

(4.21)  \(\varphi_{i+1}(X) = \bigl(-\bigl(\textstyle\int_X^{X+L} \hat f\bigr)\big/\hat f(X)\bigr)_{\beta^{-1},\,\alpha^{-1}}\).

From (4.3) and (4.7)′ recall that \(H \equiv h^r = i^{-r/(1+2r)}\). Also, note that \(s \equiv |\alpha| \vee |\beta| = |\alpha|\) and \(N = \sup\{|\omega^{-1}| : \alpha \le \omega \le \beta\} = |\beta|^{-1}\).

Lemma 8. For every \(p > 0\) and \(0 < \gamma \le p \wedge 1\),

(4.22)  \(E_{i+1}|\psi_i(X) - \varphi_{i+1}(X)|^p \le B(p)\,\)
HY\a""(‘a‘rY + 1 + 31/2 (‘log 11“"2 + l)u(qY/uh+L)} 70 where B(p) = 29.3(c0(y) v 1) with co(y) given by (4.4). Proof. Fix 0 < v s p A 1. For j = l,...,i, ° f = _1 w __ in. , '03:] e B < 0. Therefore, ij(x) S ‘B‘-1Hfj(x), since L5 = log H and a S wj S (4.23) 1:41? s ‘e"1H f(x) . As in Case em, abbreviate M1 0, introduced preceding (4.4), to M Now (4.4), the inequality (x) S uh(t) V x S t < x+L, 1 ' uh+L and Schwarz inequality give (4.24) Bijfi+L‘f = f‘ = ‘:+LM1 S co(l)H{‘:+L(‘a‘rf + (f/uh)%)} s c0(1)H{‘o‘r‘:+Lf + (L(“:+Lf)/uh+L(X)))%] 2 s c0(l)H{‘ore'1‘f(x) + (Lf(X)/(‘B‘u (x))) } h+L since ‘:f S ‘B‘lf(x). Liapunov's inequality, (4.23), (4.24) and cr-inequality (Loéve (1963), p. 155) give oo- +L“-y co - x+L - “- y “15> BiUxf-J‘: 0 share], ‘f-fn s (com v 1>(‘e'1‘H>V(<‘a“Y + was)" W2 + (‘log H‘f(x)/u (x) ] h+L since cg(l) = c0(y) and L8 = log H. Since 3-1 S 61 S a-1 < 0, by (4.21) ‘wi - Wi+l\ S ‘B‘-1. Therefore, by (4.20), (4.21) and Lemma A.2 of the appendix, (4.26) 31‘)i(x) - yiflor)‘p s zp‘e‘Y’P(f(x))'Y{lhs of (4.25) + Z‘B‘-YMY(X)} . 71 But, since (4.4) followed by the inequality u gives h ‘ uh+L - - 2 MY 5 c0(y)HY{(‘oI‘rf )Y + (f/uh+L)Y/ ), by (4.25), (4.27) rhs of (4.26) s B(p)HY‘e"p{‘a‘rY + 1 + (f(x)u (x))‘V/2(‘log 11“”2 + 1)] h+L where B(p) is as given in the lemma. Since X ~ Pi+l’ (4.26) followed by (4.27) and the definition of qY in (2.5) leads to (4.22).- We will use Lemma 8 with p = l in order to prove our main result below. Numbers b ,b below are finite and O 1,.. independent of n. Theorem 4. Recall that H é hr = i 6 6 [0,1] and g > 0 a a b 9 v i = l,...,n, 0 (A4.0) u(m) S bOiHé‘log H‘Q‘B‘2/(l + log i), and if 3 y 6 [6,1] and b1 and b2 3 2 y/2 6- - /2 . (A4.1) ‘8‘ ”(qy/”h+1) s blH Y‘1og H‘g Y v l = l,...,n-l, and (A4.2) ‘5‘“2‘07‘rY s bZHb-Y‘log H‘Q v i = 1, .,n 6 . . then 3 8 b3 9 ‘Dn(m,g)‘ S bBHn‘log Hn‘Q uniformly in n Proof. Since 0 >‘B 1,by (A4.2), h‘a‘ é Hllr‘a‘ a 0. Therefore, since ‘3‘ = ‘a‘, by (4.4), B(l) in Lemma 8 is bounded in i. Consequently, since 3 abbreviates Bi’ Ni here is 72 ‘Bi‘-1’ H = i-r/(l+2r) and (A4.2) holds V i = l,...,n, Lemma 8 with p = 1 followed by (A4.2) and (A4.1) give a b4 3 V i = l,...,n-l, (4.28) Ni+1§i+1‘yi(X) - Wi+1(X)‘ s baHé‘log H‘C S béi-r6/(l+2r)‘log “n1;- In view of (4.28) and (A4.0), the remainder of the proof follows by arguments identical to those used in the second para- graph of the proof of Theorem 2. I Assumption (A4.1) of the theorem is the most stringent one. Comments, regarding a possible necessary condition for this, are the same as those contained in Remark 4.2. Corollary 4. If a and a are constants wrt i, and for 6 6 [0,1] and g > 0 3 a y 6 [6,1] and a b5 3 V i = l,...,n-l (4.29) (lhs of (4.15) with 6 and uh replaced by y and uh+L) S 5- - bSH Y‘log H‘g Y/Z, 6 . . then Dn(9,l) = 0(Hn‘log Hn‘g) unlformly 1n 9 6 [a,e]n. Proof. The proof is analogous to that of Corollary 2. . Example. For T > 0 fixed, let u(x) = (F(T)-1xT-1[x > 0]. Moreover, let a and B be constants wrt i. Then, since by cr- inequality, y/2 (4.30) (qyluzii) S {P(T)(XT-1[T 2 l] +-(x1_T + (h+L)1-T)[O < T < 1])] exP((B - Yo/2)X)[X > 0], (4.29), with any 0 < 6 s 1 3 o < Zs/d, g = (0/2) + (1-¢)[0 < T < 1] and y = 6, is satisfied uniformly in w 6 [0,8]“. 73 Remark 4.6. Section 2.1 of Susarla (1970) deals with sequence compound SELE in the example just mentioned. His condition on the parameter Space implies 23 < a < a, and his assumption (0.8) to- gether with his hypothesis T > 2 restricts T to be in {3,4,...,r+l} U {t‘t 2 r+2]. 
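For completeness, (4.21) too is easy to render computationally. The Python sketch below is an illustration only, not the estimator analyzed above: it uses the one-sided kernel K_0(y) = 4 - 6y on [0,1) with r = 1 for f-hat = f-hat^(0) in (4.0), replaces the integral from X to X+L of f-hat by a Riemann sum on a grid, and is checked on the toy model with μ Lebesgue measure on (0, ∞), u ≡ 1 and all ω_j = -1; the kernel, the grid size, the zero guard and the toy model are assumptions of the sketch.

    import numpy as np

    # Illustrative assumptions: mu = Lebesgue on (0, inf) with u = 1, r = 1, and the
    # one-sided kernel K_0(y) = 4 - 6y on [0, 1), which has integral 1 and first moment 0.
    def K0(y):
        return (4.0 - 6.0 * y) * ((y >= 0.0) & (y < 1.0))

    def f_hat(x, X, h):
        # f-hat(x) = (i h)**(-1) sum_j K_0((X_j - x)/h) / u(X_j); cf. (4.0) with v = 0, u = 1
        return np.mean(K0((X - x) / h)) / h

    def phi_reciprocal(x, X, alpha, beta, grid=200):
        # (4.21): phi_{i+1}(x) = ( -(integral_x^{x+L} f-hat) / f-hat(x) ) retracted to
        # [1/beta, 1/alpha], with h = i**(-1/(1+2r)), H = h**r and L = log(H)/beta.
        i, r = len(X), 1
        h = i ** (-1.0 / (1.0 + 2.0 * r))
        L = np.log(h ** r) / beta                # L > 0 since beta < 0 and H < 1
        t = np.linspace(x, x + L, grid)          # Riemann grid standing in for the integral
        integral = np.mean([f_hat(s, X, h) for s in t]) * L
        den = f_hat(x, X, h)
        if den <= 0.0:                           # zero guard; an assumption of the sketch
            return 1.0 / beta
        return float(np.clip(-integral / den, 1.0 / beta, 1.0 / alpha))

    # Toy check: all omega_j = -1 (Exp(1) data) and [alpha, beta] = [-2, -0.5], so the
    # target theta(omega) = 1/omega equals -1; the estimate at x = 1 should be near -1.
    rng = np.random.default_rng(3)
    X = rng.exponential(size=20000)
    print(phi_reciprocal(1.0, X, alpha=-2.0, beta=-0.5))

The clipping mirrors the retraction to [β^{-1}, α^{-1}] in (4.21); in the toy model the target ψ_i(1) equals -1 exactly.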
Moreover, his presentations are rather complicated and proofs of lemmas are lengthy. His estimator, which depends also on certain other auxiliary random variables independent of X1,...,Xn, achieves (a rather weaker) rate r/2(l+r) uniform in m E [a,a]n. Note that this example with T > 1/2 does not cover the case of sequence compound problems where the component is SELE 2 2 2 (based on X ) of c in N(0,o )-family. 2.5 Rates Near the Best Possible Rates with the Divided Difference and the Kernel Estimators of m with Identical Components. This section deals with only the case when w has identical components. We will show that, when 9 is identity, rates with the divided difference and the kernel estimators are arbitrarily close to, but cannot be more than, 2/5 and 1, respectively. We will also indicate that, for Case em, kernel estimators achieve rates near 1. Throughout this section, let w = (w,..-,w) 6 DP, and let a and 3 be constants wrt i such that a S w < B. Let f abbreviate fw. It may be noted at the outset that the conclusions of Lemmas 5, 6 and 7 remain valid if qY there is replaced by - 2 " e /f 1” Theorems 5 and 6 below are proved for the case 9 is identity, and for the lower bounds there we assume: 74 (5.0) 3 an e > 0 and a finite L > a 3 Lebesgue-sup of the restriction to (L,L+g) of u is finite, and Lebesgue-inf of the restriction to (L,L+g) of u is positive. With 6 and L in (5.0), we have (5.1) Oau*(x)f(x) < a and for a y 6 [0,2) and i = l (5.5) u((u*/u:)Y/2f1-Y/2) < a, then 3 a c1 3 Y (5.6) Dn(9,i) S Clhn . On the other hand, if (5.0) holds, then 2 . . (5.7) Dn(m,i) 2 CZhh V suff1c1ently large n. Proof. Throughout this proof, fix 1 with 1 S i S n-1, and abbreviate X to X. i+l Let y be given by (5.5). By (5.4), k6(y) in Lemma 5 is bounded in i (by choosing cO suitably, if necessary). Since the conclusion of that lemma holds even if q there is ~ fl-y/Z q , which here becomes replaced by , by (5.5), iY/S§i+1‘$i+1(x) - (1)"Y is bounded in i. This conclusion and the inequality in (5.3) give (5.6). To prove (5.7) we proceed as follows. Recall that a > w . Since by (3.1). 1i+i(x) = (Q(6)(X))Q 3’ (5.8) Pi+1‘1i+1(X) - w‘ 2 g-wEi+1[¢i+l(x) - w > v]dv 2 ~ Piflflt < X < L + e/2]“"8‘w§i[Q(6-)(x) > v + m']dv}, where (here and throughout this proof) L and g are given by (5.0). 76 Fix X E (L, L +'s/2) and v 6 (0, B-w) until stated otherwise. From Section 3, (following (3.0», for j = l,...,i, 6j(') = [- S X < - + h]/u(Xj). As in the second paragraph in the J proof of Lemma 3, for .j = l,...,i, let Yj = 6j(X +-h) - eh (W)6 Y j(X). Slnce X1,...,Xi are 1.1.d. so are Y1,..., i' The definition of Q in (3.0) and the Berry-Esseen theorem 15:1] (Loéve (1963), p. 288) lead to To .3!" L'lfir‘f (5.9) gi[Q(6)(X) > v + 0.)] =Ei[zin >0] Y - p Y .-e 3 ll£9135 1 291‘ 1 1 1‘ 2 Q( C ) - C3 3 S 1 01 2 where o = var of Y1 and 6 is the distribution function of N(0,l). Inequalities in the remainder of this proof are valid only V sufficiently large i. Since h i 0, take h S g/4, where e is as in (5.0). Let 01(X) = ‘:+hf(t)dt. Then, since _ hw (61(°+h)/61(')) - e , _ eh(v+u)) (5.10) P Y = 61(X+h) 01(X) = 61(X)ehm(1-ehv) 1 1 hp 2 2 hv61(X)e c4 v, where the last inequality follows from (5.2). Moreover, since 61(X+h)61(X) = 0 with probability one and P161(-) > 0, Cl 2 ezhwvar 61(X). Thus, -2hw 2 e 01 x+h 2 X (f/u) - 61(X) 2 var(61(X)) = * 3t 2 61(X)(l - u (X)61(X))/u (X) 2 c5h 77 -1 * since, by (5.0) and (5.2), su u (t)61(t) < m and pL 0 Consequently, y ( 10) P Y (5.11) -1—1 2 -c vh3/2. 
01 6 3 and, since by (5.0), Pl‘Y1 - P Y ‘ s(tonstant)oi l l P1W1 " P1Y1‘3 *5 (5.12) 3 S c7h . °l Now weakening the integrand on the extreme rhs of (5.8) by (5.9), (5.11) and (5.12) and then making the transformation ’5 c6v(ih3) = t we get, after recognizing that X has u-density f satisfying (5.2), (5.13) Pr+1‘$i+1(X) - w‘ 2 u(t <'x < L +.€/2) 3 3 42 (B-m)c6(ih )52 -% {°g(ih ) ‘0 i(-t)dt - C9(1h) }. .-l/S , . Since h = C01 , the integral 1n (5.13) converges to 135('t)dt as 1‘“ m, and hence by (5-1), 11/5 times the lhs of (5.13) is bounded below by a positive quantity for all large i. 2 2 Therefore, since Pt+1‘Wi+1(X) - w‘ 2 Pi+l‘mi+l(x) - w‘, (5-7) follows from the equality in (5.3). II Theorem 6. Let 2 be the kernel estimator introduced ~ under Case w in Section 4. (See (4.6). Also recall that W is ~ defined for each integer r > 1.) As in (4.3) and (4.7), take -l/(l+2r) r-l h = h1 5 i and H = Hi = h . If for a Y E [0:27 and i = l, (5.14) u(fl'Y/zlug/Z) < a 78 then 3 c 9 10 Y. (5.15) Dn(e,i) s CIOHn’ and if the kernel functions K0 and K1 defining W are ~ bounded, and (5.0) holds, then 2 o o (5.16) Dn(9’i) 2 CllHn V suff1c1ently large n. Proof. Fix i with l s i s n-1 and abbreviate X1+1 by X. In view of Lemma 6, (5.15) follows by arguments identical to those given for the correSponding part of Theorem 5. Now we prove (5.16). Recall that B > w. Since W = 2(1) 2 (f If)a:6’ B-w (5.17) Pi+l‘%i+1m - w‘ 2 O Ei+l[wi+l(x) - w > v]dv 2 Piflm, < x < L + e/ij‘o w§.[[%(1) - w? > v\f\]dv}, ) and f(1 ) is abbreviated by omission, H'H where the argument X in and L and e are given by (5.0). Fix X E (L, L + 3/2) and v E (O, B-w) until stated otherwise. For 1 S j s i, let {<-h 1K + wK + v\x Op (-—]—-——)}[u(X) > 0] (5.18) u(Xj)T=1 0 J where K.o and K1 are the kernels used in the definition of l. Since X1,...,Xi are i.i.d., so are T . The T1,..., 1 definitions of f(J), j = 0,1, given in (4.0) and Berry-Esseen 79 theorem give A t 1 - 2 1 (5.19) 21“” - wf > vm] 2 31021 Tj < 0] - 3 i p T i %P T - P T \ 2 §(_ ] l) _ C l] l l 1 cl 13 O3 1 whe 2 = T re 01 var 1. Inequalities in the remainder of this proof are obtained only V sufficiently large i. Since h 1 0, we take h 5 3/2, where e is as in (5.0). Let 21,22 and 23 denote, respectively, the first, second and the third term in the expression for T Then 1. the transformation theorem followed by r-th order Taylor expansion with integral form of the remainder and the orthogonality properties of K.j 6 x; for j = 0,1 gives P121 + hf(1)(X) = f3K1(y)I:+hy(X + hy-t)r-1f(r)(t)dtdy/(r-l)!. Thus, since K1 is bounded, by (5.2), P121 5 -hf(1)(X) +-const.hr. By similar arguments, P122 s hwf(X) + const.hr+1. Therefore, since by (5.2), P23 = V x+h\K0(Efiz§\f(t)dt s hv const. and since f(1) = wf, r r-l (5.20) PlTl S c14h(h + h + v) . Next observe that 2 2 (5.21) 01 2 g (21) + ii cov(Zj,Zj,). J j' Since h i O and K0 and K are bounded, writing the exact 1 expression for cov(Zl,Zz), we see, after making use of the trans- formation theorem and (5.0) and (5.2), that \cov(Zl,ZZ)\ is bounded in i. The same conclusion holds for cov(Zl,23), 2 cov(Zz,23) and (P121) . Therefore, there exists a finite constant 80 § (could be negative) such that 2 2 -l 2 (5.22) o 2 P z + g = h j‘K (t)((f/u) (X + ht))dt + g . l 1 l 1 Consequently, by (5.0) and (5.2), (5 23) h 2 ° 01 2 C15) and hence, by (5.20) .‘HI. 5‘ (5.24) 1 1 s c 113/203 + h“1 + v) 01 16 M ' b 5 O hP T P T 3 2 b 5 23 oreover, Since y ( . ), 1‘ 1 1 1‘ 5 const. 01’ y ( . ) 3 P \T - P T \ J 1 1 1 -% (5.25) 3 S c17h . 
“1 Now weakening the integrand on the extreme rhs of (5.17) by (5.19), (5.24) and (5.25), and then, doing the analysis exactly similar to that given (following (5.12)) in the proof of Theorem 5, we get the desired conclusion. II For Case em in Section 4 we have taken H, = h? with 1 1 h = i-l/(1+2r). i In this case, if for a y 6 [0,2) and i = l .. 2 _ - (5.26) u{f1 Y/ ((uh) y/Z + (ufi) Y/2)} < m, where ufi(-) in (5.26) stands for uh(-+l), then using a proof similar to that used for (5.6) and making an application of Lemma 7 with p = y, it follows that 1 given by (4.17) satisfies 3 Y 13110312) Cain) as n t on. * Remark 5.1. In view of the definitions of u*, u and uh, each of (5.5) and (5.26) implies (5.14). Densities satisfying 81 (5.5) and2(5.26) V y 6 [0,2) exist, e.g., take u(x) = - - 2 - - (2n) *8 x / [-w‘< x < m], (F(T)) le 1[x > O], T 2 1 or g;(j+l)[jl< x 5 1+1]. Thus situations exist where the lower and upper bounds in each of Theorems 5 and 6 are considerably tight. 2.6 The Divided Difference Versus the Kernel Estimators. The divided difference estimator introduced in Section 3 and the kernel estimator introduced under Case w, in Section 4 n are compound estimators of the same vector w = (w1,...,wn) 6 O - g; N Therefore, it is rather natural to make a comparison between them. Denote them here by i and ER reSpectively. Recall that ER is defined for each integer r > 1. under certain conditions, Theorems 2 and 5 show that ER with r > 6 is better than 1 in the sense that, V large n, ~ SUPw\Dn(‘£’iK)\ s con-(r-1)/(1+2r) ‘ elm-Us S supw\Dn(‘£’1)‘ where c0 and c1 are some finite positive constants. By Theorems 2 and 6, WK. with r > 6 is better than WK with r = 2 in the same sense. Results obtained in Theorems 2 and 6 for WK with r = 2 coincide, respectively, with those obtained in Theorems 1 and 5 for w. However, as we have noted in Remarks 4.1 and 5.1, condi- tions forlatter ones are stronger than those for former ones. ~ ‘Hence, ‘K’ even with r = 2, could be preferable to 1. Neverthe- less, $ is a more natural estimator compared to 6K. ~ ~ 82 Estimators i and 3K are somewhat (but not completely) similar to !f* and oi. respectively, prescribed by Susarla (1970), (Chapter 1), for the case u(x) = (2n)-%exp(-x2/2)[-O < x < m] and -01 - Bi 5 c2, a finite positive number. Results of Theorems 1 and 5 for i, specialized to the above case and (in Theorem 5) w = 0, coincide with those obtained by Susarla for 1f*. However, in order to make oi_ (which is, in comparison of EK’ rather ** complicated to exhibit) better than 1. in the sense described above he requires r > 12. APPENDIX APPENDIX - Here we prove two useful lemmas; one concerning the weighted empiricals based on independent random variables and the other concerning the difference of two random ratios. A.l. On Glivenko-Cantelli Theorem for the Weighted Empiricals figsed on Independent Random Variables. Let X1,...,Xn be independent real valued random variables, and, for w 6 [0,1], let Fj(x) = wP[Xj < x] + (l-w)P[Xj S x] and Yj(x) = w[Xj < x] +-(l—w)[Xj S x]. Furthermore, with c1,...,cn non-negative numbers such that 22C: = 1, let n * n H = F = n 21% J" Hn zch'YJ and + 1': Dn = supx’wmaxNSn(HN(x) - HN(X)). A Special case of the result in Remark A.l (following the proof of Lemma A.l below) is used in the proof of Theorem 2(b) of Chapter 1. Lemma A.l. With c = zch, V M 2 1, (1) P[D: 2 M] < 2c M exp(-2(M2 - 1)) * Proof. Let A = mastnGIN - HN). The remark following (2.17) of Hoeffding (1963), p. 
17, and Theorem 2 therein, applied to random variables chj with w = 1 yield 83 1r...» .... ._ 84 (2) P[A(x-) 2 n] S exp(-2n2) V x 6 R and V H > 0. Fix (temporarily) O < y‘< M and partition R into k intervals with endpoints -m = x < x1 <...< xk = m such that Hn(x ) S y for j = l,...,k. Since 0 S Hn(-) S c, we can J-l’xj -1 . (and do) take ki< cy +11. Since HN(xj_1,xj) S Hn(xj_1,xj) S y * for N S n, using the monotonicity of RN and HN’ we get * (3) sup j-[)\ = 0((n log rob with probability one. A.2. A Bougd_for the Yrth Mean of the Bounded Difference of Two Random_Ratios. We apply Lemma A.2 below in the proof of Lemmas 6, 7 and 8 of Chapter 2, in order to obtain certain suitable bound for the p-th mean distance between the compound and Bayes estimators there. Lemma A.2. Let y,z and L be in R with 2 fi 0 and L > 0. If Y and Z are two real valued random variables, then V y > 0 Y (-1)+ - (1) qu - g A L)Y 5 2% V ‘2‘ Y{E\y-Y\Y 1 + + (\E‘Y + 2‘“' ) LY)E\z-Z\Y}. Proof. Since [2\z-Z\ S ‘2‘] S [2‘2‘ 2 ‘2‘], the lhs of (l) is exceeded by (2) mi - §\V[2\2\ 2 \z\]) +LYE[2\z-Z\ 2 \z‘] . Now by Markov-inequality, the second term in (2) is no more than (2L)Y‘z|-YE\z-Z‘Y. By triangle inequality with intermediate term y/Z, and by c r-inequality (Loeve (1963), p.155), the first term in (2) is bounded above by 2Y+ O with probability one, then V y >-0 Y ( -1)+ - (3) Edi-2‘ /\L)Y52YJ'Y EH4 Y(\y.Y\Y+ (‘SY + + Z'W’l) LY)\z - 2‘3}. Thus (1) becomes a special case of (3). BIBLIOGRAPHY BIBLIOGRAPHY BHATTACHARYA, P.K. (1967). Estimation of a probability density function and its derivatives. Sankhya Ser. A 29 373-382. BILLINGSLEY, PATRICK (1968). Convergence of Probability Measures. John Wiley & Sons, Inc., New York. CACOULLOS, THEOPHILOS (1966). Estimation of a multivariate density. Ann. Inst. Statist. Math. 18 179-189. CENCOV, N.N. (1962). Evaluation~of an unknown distribution density from observations. Soviet Math. 3 1559-1562. GILLILAND, DENNIS C. (1966). Approximation to Bayes risk in sequences of non-finite decision problems. RM-162, Department of Statistics and Probability, Michigan State University. GILLILAND, DENNIS C. (1968). Sequential compound estimation. Ann, Math. Statist. 39 1890-1905. HANNAN, JAMES F. and ROBBINS, HERBERT (1955). Asymptotic solutions of the compound decision problem for two completely specified distributions. .Ann. Math. Statist. 26 37-51. HANNAN, JAMES F. (1956). The dynamic stafistical decision problem when the component problem involves a finite number, m, of distributions (Abstract). Ann. Math. Statist. 21 212. HANNAN, JAMES (1957). Approximation to Bayes risk in repeated play. Contributions t2_the Theory g§_Games 3 97-139. Ann. Math. Studies No. 39, Princeton Univ. Press. HANNAN, J.F. and VAN RYZIN, J.R. (1965). Rate of convergence in the compound decision problem for two completely Specified distributions. Ann. Math. Statist. 36 1743-1752. HANNAN, JAMES and MACKY, DAVID W. (1971). Empirical Bayes squared error loss estimation of unbounded functionals in exponential families. RM-290, Department of Statistics and Probability, Michigan State University. 87 53 88 HEWITT, EDWIN and STROMBERG, KARL (1965). Real and Abstract Analysis. Springer-Verlag'New York, Inc. HOEFFDING, WASSILY (1963). Probability inequalities for sums of bounded random variables. J, Amer. Stat. Assoc. 58 13-30. JOHNS, M;V., Jr. (1967). Two-action compound decision~problems. Proc. Fifth Berkeley Symp. Math. Statist. Prob. 1, University of California Press. ~ JOHNS, M{V., Jr. and VAN RYZIN, J. (1972). 
Convergence rates for empirical Bayes two-action problems II. Continuous case. Ann. Math. Statist. 43 934-947.
KIEFER, J. (1961). On large deviations of the empiric d.f. of vector chance variables and a law of the iterated logarithm. Pacific J. Math. 11 649-660.
KRONMAL, R. and TARTER, M. (1968). The estimation of probability densities and cumulatives by Fourier series methods. J. Amer. Statist. Assoc. 38 482-493.
LOEVE, MICHEL (1963). Probability Theory (3rd ed.). Van Nostrand, Princeton.
NADARAYA, E.A. (1965). On non-parametric estimates of density function and regression curves. Theor. Prob. Appl. 10 186-190.
NATANSON, I.P. (1955). Theory of Functions of a Real Variable. Frederick Ungar Publishing Co., New York.
OATEN, ALLAN (1972). Approximation to Bayes risk in compound decision problems. Ann. Math. Statist. 43 1164-1184.
PARZEN, EMANUEL (1962). On the estimation of probability density and mode. Ann. Math. Statist. 33 1065-1076.
RAO, B.L.S. PRAKASA (1969). Estimation of a unimodal density. Sankhya Ser. A 31 23-36.
ROSENBLATT, MURRAY (1956). Remarks on some nonparametric estimates of a density function. Ann. Math. Statist. 27 832-837.
SAMUEL, ESTER (1963). Asymptotic solutions of the sequential compound decision problem. Ann. Math. Statist. 34 1079-1094.
SAMUEL, ESTER (1965). Sequential compound estimators. Ann. Math. Statist. 36 879-889.
SCHWARTZ, STUART C. (1967). Estimation of a probability density by an orthogonal series. Ann. Math. Statist. 38 1261-1265.
SCHUSTER, EUGENE F. (1969). Estimation of a probability density function and its derivatives. Ann. Math. Statist. 40 1187-1195.
SUSARLA, V. (1970). Rates of convergence in sequence-compound squared-distance loss estimation and two-action problems. RM-262, Department of Statistics and Probability, Michigan State University.
VAN RYZIN, J. (1966). The compound decision problem with m × n finite loss matrix. Ann. Math. Statist. 37 412-424.
VAN RYZIN, J. (1970). On a histogram method of density estimation. Univ. of Wisconsin, Department of Statistics T.R. No. 226.
VAN VLECK, F.S. (1973). A remark concerning absolutely continuous functions. Amer. Math. Monthly 80 286-287.
WAHBA, GRACE (1971). A polynomial algorithm for density estimation. Ann. Math. Statist. 42 1870-1886.
WATSON, G.S. and LEADBETTER, M.R. (1963). On the estimation of the probability density I. Ann. Math. Statist. 34 480-491.
WATSON, G.S. (1969). Density estimation by orthogonal series. Ann. Math. Statist. 40 1496-1498.
WEGMAN, E.J. (1969). Maximum likelihood estimation of a unimodal density function. Inst. Statist. Mimeo Ser. No. 608, University of North Carolina at Chapel Hill.
WEISS, L. and WOLFOWITZ, J. (1967). Estimation of a density at a point. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 7 327-335.
YU, BENITO (1970). Rates of convergence in empirical Bayes two-action and estimation problems and in extended sequence-compound estimation problems. RM-279, Department of Statistics and Probability, Michigan State University.