A NON-REGULAR SQUARED-ERROR LOSS SET COMPOUND ESTIMATION PROBLEM

Dissertation for the Degree of Ph.D.
MICHIGAN STATE UNIVERSITY
YOSHIKO NOGAMI
1975

This is to certify that the thesis entitled

A NON-REGULAR SQUARED-ERROR LOSS SET COMPOUND ESTIMATION PROBLEM

presented by Yoshiko Nogami has been accepted towards fulfillment of the requirements for the Ph.D. degree in Statistics and Probability.

Major professor
Date: June 17, 1975

ABSTRACT

A NON-REGULAR SQUARED-ERROR LOSS SET COMPOUND ESTIMATION PROBLEM

By Yoshiko Nogami

For an integrable function $f \ge 0$, let $\mathcal{P}(f)$ be the family of distributions $P_\theta$ specified by a density proportional to the restriction of $f$ to the interval $[\theta, \theta+1)$ for $\theta$ in a real interval $\Omega$. The component problem is estimation of $\theta$ based on $X$ distributed according to $P_\theta$, with squared-error loss. For a prior $G$ on $\Omega$, let $R(G)$ denote the Bayes risk versus $G$ in the component problem. Let $X_1,\dots,X_n$ be $n$ independent random variables with each $X_j$ having $P_{\theta_j} \in \mathcal{P}(f)$. Let $G_n$ be the empiric distribution of $\theta_1,\dots,\theta_n$.

The work here is a generalization and continuation of R. Fox's (1968, 1970) work. Under $P_\theta$, the uniform distribution on $[\theta, \theta+1)$, he constructed a Lévy consistent distribution-valued estimate $\hat G_n$ of $G_n$. When the $\theta_j$ are iid $G$, he showed the convergence to $R(G)$ of the respective expected risks for $\hat{\boldsymbol\theta}$ with components Bayes versus $\hat G_n$ and for $\boldsymbol\varphi$ with components direct estimates of the posterior means wrt $G$.

In this work we introduce procedures $\hat{\boldsymbol\theta}$, $\boldsymbol\varphi$ and $\boldsymbol\theta_T$ (another direct estimate of the posterior mean wrt $G_n$). We generalize Fox's $\hat G_n$ to $\mathcal{P}(f)$ and, with all convergence rates for bounded $\Omega$, show that $D(\boldsymbol\theta, \hat{\boldsymbol\theta}) = E\{n^{-1}\sum_{j=1}^n(\hat\theta_j(\mathbf{X}) - \theta_j)^2\} - R(G_n)$ is $O((n^{-1}\log n)^{1/4})$. Even when $f \equiv 1$, the boundedness of $\Omega$ is necessary for the convergence of $D(\boldsymbol\theta, \mathbf{t})$ to zero whatever be the set compound procedure $\mathbf{t}$. The proof is based on the bound obtained for the risk difference in terms of Lévy distance.
In this connection we obtain a unified generalization of Lemmas 8 and 8' of Oaten (1969).

For a prior $G^k$ on $\Omega^k$, let $R^k(G^k)$ denote the Bayes risk against $G^k$ in squared-error loss estimation of $\theta_k$, based on $X_1,\dots,X_k$. Let $\boldsymbol\theta_j^k = (\theta_{j-k+1},\dots,\theta_j)$, $j = k,\dots,n$, and $G_n^k$ be the empiric distribution of $\boldsymbol\theta_k^k,\dots,\boldsymbol\theta_n^k$. Then $\boldsymbol\theta_T$ for $\mathcal{P}(f)$ and $\boldsymbol\varphi$ for $\mathcal{P}(1)$ both have $O(n^{-1/(2k+2)})$ for $D^k(\boldsymbol\theta, \boldsymbol\theta_T) = E\{(n-k+1)^{-1}\sum_{j=k}^n(\theta_{T,j}(\mathbf{X}) - \theta_j)^2\} - R^k(G_n^k)$ and $D^k(\boldsymbol\theta, \boldsymbol\varphi)$. It is shown that $D(\mathbf{0}, \boldsymbol\theta_T)$ has exact order $n^{-1/2}$.

A NON-REGULAR SQUARED-ERROR LOSS SET COMPOUND ESTIMATION PROBLEM

By Yoshiko Nogami

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

1975

TO MY PARENTS

ACKNOWLEDGEMENTS

I wish to express my deep gratitude to Professor James F. Hannan for his excellent guidance and patience during the preparation of this dissertation. His invaluable suggestions helped greatly to improve the manuscript. Among all others I thank him and Professor V. Fabian for their encouragement and advice during my study at Michigan State University.

My appreciation extends to Professor D. Gilliland for his careful reading of my rough manuscript and for suggesting appropriate changes. Furthermore, I thank Professors V. Fabian, J. Stapleton and J. Shapiro for reading the thesis.

I am also indebted to Noralee Burkhardt for her excellent typing and patience. Finally, I am grateful to the Department of Statistics and Probability at Michigan State University for the financial support.

TABLE OF CONTENTS

Chapter 0. INTRODUCTION
Chapter I. A BOUND AND RATE FOR A TWO-STAGE PROCEDURE
  1.0 Introduction
  1.1 An Upper Bound of the Modified Regret for $\hat{\boldsymbol\theta}$
  1.2 A Particular Procedure $\hat{\boldsymbol\theta}$ with a Rate (1/4)-
Chapter II. RATES FOR ONE-STAGE PROCEDURES FOR A FAMILY OF UNIFORM DISTRIBUTIONS
  2.0 Introduction
  2.1 A Procedure $\boldsymbol\theta_T$ with a Rate 1/4
  2.2 A Lower Bound of the Modified Regret $D(\mathbf{0}, \boldsymbol\theta_T)$
  2.3 Procedures $\boldsymbol\theta_T$ where $D(\mathbf{0}, \boldsymbol\theta_T)$ is of Exact Order $O(h^2)$
  2.4 The One-Stage Procedure $\boldsymbol\varphi$
  2.5 A Counterexample to $D(\boldsymbol\theta, \mathbf{t}) \to 0$ on $R$
Chapter III. RATES FOR ONE-STAGE PROCEDURES IN THE k-EXTENDED PROBLEM
  3.0 Introduction
  3.1 A Procedure $\boldsymbol\theta_T$ with a Rate $(2k+2)^{-1}$
  3.2 A Procedure $\boldsymbol\varphi$ for $\mathcal{P}(1)$ with a Rate $(2k+2)^{-1}$
APPENDIX
  A.1 Extension of Lévy Metric and Bounds for Difference of Two Integrals of a Bounded Function
  A.2 A Fatou Theorem for Variances
BIBLIOGRAPHY

CHAPTER 0

INTRODUCTION

The set compound problem simultaneously considers $n$ statistical decision problems each of which is structurally identical to the component problem. The loss is taken to be the average of $n$ component losses.

For a non-negative integrable function $f$, let $\mathcal{P}(f)$ denote the family of probability measures $\{P_\theta : \theta \in$ a real interval $\Omega\}$ with $P_\theta$ specified by a density proportional to the restriction of $f$ to the interval $[\theta, \theta+1)$. In this thesis, the component problem considered is the squared-error loss estimation of $\theta$ based on $X$ with distribution $P_\theta \in \mathcal{P}(f)$. For any prior distribution $G$ on $\Omega$, let $R(G)$ be the Bayes risk versus $G$ in this component problem.

Let $X_1,\dots,X_n$ be $n$ independent random variables with $X_j$ distributed according to $P_{\theta_j}$. Let $\mathbf{t} = (t_1,\dots,t_n)$ be a set compound procedure: for each $j = 1,2,\dots,n$, $t_j$ is an estimator of $\theta_j$ based on $\mathbf{X} = (X_1,\dots,X_n)$. Let $G_n$ denote the empiric distribution of $\theta_1,\dots,\theta_n$ and let

(1)  $D(\boldsymbol\theta, \mathbf{t}) = \int n^{-1}\sum_{j=1}^n (t_j(\mathbf{x}) - \theta_j)^2 \, d\mathbf{P} - R(G_n)$.

A bootstrap procedure based on component procedures Bayes versus an estimate of $G_n$ will be called a two-stage procedure, while a procedure based on a direct estimate of the component Bayes procedure versus $G_n$ will be called a one-stage procedure.

For the case where $f \equiv 1$ and $\Omega = (-\infty, \infty)$, Fox (1970) exhibited a distribution-valued Lévy consistent estimate $\hat G_n$ of $G_n$. In the Empirical Bayes problem where the $\theta_i$ are iid with common distribution $G$, Fox (1968, §4.3) obtained a convergence rate $o(1)$ of the expected risks to $R(G)$ for a two-stage procedure $\hat{\boldsymbol\theta}$ based on $\hat G_n$ and for a certain one-stage procedure $\boldsymbol\varphi$. The behavior in the compound problem of the generalizations of these procedures is the subject of this thesis.

If $\sup\{|D(\boldsymbol\theta, \mathbf{t})| : \boldsymbol\theta \in \Omega^n\} = O(n^{-\alpha})$, then we will say $\mathbf{t}$ has a rate $\alpha$. All rates are obtained only for bounded $\Omega$.

Chapter I is concerned with a two-stage procedure $\hat{\boldsymbol\theta}$. In Section 1 an upper bound of $D(\boldsymbol\theta, \hat{\boldsymbol\theta})$ for $\hat{\boldsymbol\theta}$ based on any distribution-valued estimate $\hat G$ of $G_n$ is obtained. In Section 2 we show that there is $\hat{\boldsymbol\theta}$ based on the generalization $\hat G_n$ of Fox (1970) with a rate (1/4)-.

Chapter II is specialized to $f \equiv 1$. We here deal with two one-stage procedures $\boldsymbol\theta_T$ and $\boldsymbol\varphi$ (the latter is completed in Chapter III) where $\boldsymbol\theta_T$ is based on retraction of the Bayes estimate versus a raw estimate of $G_n$ to the interval $(X-1, X]$, while $\boldsymbol\varphi$ is based on estimates of a modified form of the Bayes estimate versus $G_n$. In Section 1, a $\boldsymbol\theta_T$ with a rate 1/4 is displayed and in Sections 2 and 3 it is shown that $\boldsymbol\theta_T$ has exact order 1/2 at $\boldsymbol\theta = \mathbf{0}$. Section 4 shows that at $\boldsymbol\theta = \mathbf{0}$, $\boldsymbol\varphi$ is sometimes better than $\boldsymbol\theta_T$. Section 5 shows that when $\Omega = (-\infty, \infty)$, there is no sequence of estimates $\mathbf{t}$ of $\boldsymbol\theta$ for which $D(\boldsymbol\theta, \mathbf{t})$ converges to zero.

Chapter III considers the k-extended problem. For any prior distribution $G^k$ on $\Omega^k$, let $R^k(G^k)$ denote the Bayes risk against $G^k$ in the squared-error loss estimation of $\theta_k$ based on $X_1,\dots,X_k$.
Let $\boldsymbol\theta_j^k = (\theta_{j-k+1},\dots,\theta_j)$, $j = k,\dots,n$, and $G_n^k$ be the empiric distribution of $\boldsymbol\theta_k^k,\dots,\boldsymbol\theta_n^k$. Then,

($1^k$)  $D^k(\boldsymbol\theta, \mathbf{t}) = \int (n-k+1)^{-1}\sum_{j=k}^n (t_j(\mathbf{x}) - \theta_j)^2 \, d\mathbf{P} - R^k(G_n^k)$

is used as the standard in the k-extended problem. In Chapter III we exhibit two one-stage k-extended procedures $\boldsymbol\theta_T$ for $\mathcal{P}(f)$ and $\boldsymbol\varphi$ for $\mathcal{P}(1)$ with $\Omega$ bounded. These are respective generalizations of $\boldsymbol\theta_T$ and $\boldsymbol\varphi$ introduced in Chapter II, and have rate $(2k+2)^{-1}$.

In the Appendix, a unified generalization of Lemmas 8 and 8' of Oaten (1969, Appendix) is introduced in connection with Chapter I.

Notational Conventions. $P_j$ and $\mathbf{P}$ abbreviate $P_{\theta_j}$ and $\times_{i=1}^n P_{\theta_i}$, respectively. A distribution function also represents the corresponding measure. We often let $P(h)$ or $P(h(\omega))$ denote $\int h(\omega)\,dP(\omega)$. $G$ abbreviates the empiric distribution $G_n$ of $\theta_1,\dots,\theta_n$. $R$ denotes the real line. We often abbreviate $y-1$ to $y'$. We denote the indicator function of a set $A$ by $[A]$ or simply $A$ itself. For any function $h$, $h]_a^b$ or $[h(\cdot)]_a^b$ means $h(b) - h(a)$. $\vee$ and $\wedge$ denote the supremum and the infimum, respectively. $\triangleq$ denotes the defining property. We also use the notations $a^+ \triangleq 0 \vee a$ and $a^- \triangleq (-a)^+$. When we refer to (c.d) in the same section that we are dealing with, we simply write (d). Lemma b.d means Lemma d in Chapter b. The symbol ∎ is used throughout to signal the end of a proof. $EX$ and $\mathrm{Var}(X)$ mean the expectation and variance of a random variable $X$.

CHAPTER I

A BOUND AND RATE FOR A TWO-STAGE PROCEDURE

§1.0. Introduction

Let $f$ be a measurable function with $0 \le f \le 1$. With $\xi$ Lebesgue measure, we define $q(\theta) \triangleq (\int_\theta^{\theta+1} f \, d\xi)^{-1}$ and assume that $q$ is uniformly bounded by a finite constant, say $m$. Letting $p_\theta \triangleq dP_\theta/d\xi$ we denote by $\mathcal{P}(f)$ the family of probability measures given by

(0.1)  $\mathcal{P}(f) = \{P_\theta$ with $p_\theta = q(\theta)[\theta, \theta+1)f, \ \forall\, \theta \in \Omega\}$

where $\Omega$ is a real interval. The above assumptions apply throughout the body of this thesis.

Let $X_1,\dots,X_n$ be $n$ independent random variables with each $X_j$ distributed according to $P_j \in \mathcal{P}(f)$, where $P_j$ abbreviates $P_{\theta_j}$.
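The family (0.1) can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the helper names `q`, `density` and `sample` are hypothetical, `q` uses a simple midpoint rule in place of the exact integral, and `sample` uses rejection sampling, which is valid here only because of the standing assumption $0 \le f \le 1$.

```python
import random


def q(theta, f, steps=1000):
    """Approximate q(theta) = 1 / integral of f over [theta, theta + 1) (midpoint rule)."""
    width = 1.0 / steps
    integral = sum(f(theta + (i + 0.5) * width) for i in range(steps)) * width
    return 1.0 / integral


def density(y, theta, f):
    """Density p_theta(y) = q(theta) * [theta <= y < theta + 1] * f(y), as in (0.1)."""
    if theta <= y < theta + 1.0:
        return q(theta, f) * f(y)
    return 0.0


def sample(theta, f, rng):
    """Draw one X from P_theta by rejection: propose uniformly on [theta, theta + 1),
    accept with probability f(y). This relies on 0 <= f <= 1."""
    while True:
        y = theta + rng.random()
        if rng.random() < f(y):
            return y
```

With $f \equiv 1$ the sketch reduces to the uniform case treated in Chapter II, where $q \equiv 1$.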
Denote the empiric distribution function of $\theta_1, \theta_2, \dots, \theta_n$ by $G$ without exhibiting the subscript $n$. With squared-error loss, let $\boldsymbol\theta_G$ be the procedure whose component procedures are Bayes against $G$: $\boldsymbol\theta_G(\mathbf{X}) \triangleq (\theta_{1n}, \theta_{2n}, \dots, \theta_{nn})$ with, for each $j$,

(0.2)  $\theta_{jn} = G(\theta\, p_\theta(X_j))/G(p_\theta(X_j)) = \int_{X_j'+}^{X_j} \theta\, q(\theta)\, dG(\theta) \Big/ \int_{X_j'+}^{X_j} q \, dG$

where $y'$ is an abbreviation of $y-1$ and the affix $+$ is intended to describe the integration as over $(X_j', X_j]$. Henceforth we delete $+$ in lower limits of integrals.

A bootstrap procedure based on component procedures Bayes versus an estimate of $G$ will be called a two-stage procedure. Let $\hat G$ be a distribution-valued estimate of $G$. Define $\hat{\boldsymbol\theta} = (\hat\theta_1,\dots,\hat\theta_n)$ to be the two-stage procedure such that, for each $j$, $\hat\theta_j(\mathbf{X}) = \hat\theta_{jn}$ is of form (0.2) with $G$ replaced by $\hat G$ (0/0 is understood to be $X_j$).

The modified regret for a procedure $\mathbf{t}$ is of form

(0.3)  $D(\boldsymbol\theta, \mathbf{t}) = n^{-1}\sum_{j=1}^n \{\mathbf{P}(t_j(\mathbf{X}) - \theta_j)^2 - \mathbf{P}(\theta_{jn} - \theta_j)^2\}$.

In §1 we exhibit an upper bound of $D(\boldsymbol\theta, \hat{\boldsymbol\theta})$ (uniform wrt $\boldsymbol\theta$ in $\Omega^n$ when $\Omega$ is bounded) in terms of $L(G, \hat G)$, relying on Proposition A and Lemma A.3 in the Appendix. In §2 we construct a particular distribution-valued Lévy consistent estimate $\hat G$ of $G$ for $\Omega = R$. To show consistency, Theorem 2 of Hoeffding (1963) will be used. Under the additional assumption that $1/f$ satisfies a Lipschitz condition, we show by making use of the bound in §1 that the modified regret $D(\boldsymbol\theta, \hat{\boldsymbol\theta})$ has a rate (1/4)- when $\Omega$ is bounded.

In both sections, $\hat L$ abbreviates $L(G, \hat G)$, and (a.b)^ or (b)^ mean (a.b) or (b) with $G$ replaced by an estimate $\hat G$.

§1.1. An Upper Bound of the Modified Regret for $\hat{\boldsymbol\theta}$.

In this section we shall exhibit a bound of the modified regret $D(\boldsymbol\theta, \hat{\boldsymbol\theta})$ for the two-stage procedure $\hat{\boldsymbol\theta}$. To do so, the main development is Lemma 6 in which we show that the average expectation of $|\hat\theta_{jn} - \theta_{jn}|$ over the set where $\hat L \le \varepsilon$ is bounded by at most a constant times $\varepsilon$. For the proof of Lemma 6 we use a special case of Lemma A.2 of Singh (1974) and Proposition A and Lemma A.3 in the Appendix.
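The component Bayes estimate (0.2) and the modified regret (0.3) of §1.0 can be sketched in Python as follows. This is a minimal illustration, not part of the thesis: the function names are hypothetical, the prior is taken to be discrete (atoms with weights, which covers the empiric distribution $G$), $q \equiv 1$ is the default so that the uniform case $f \equiv 1$ is assumed unless another `q` is passed, and the regret is approximated by Monte Carlo for $f \equiv 1$ only.

```python
import random


def bayes_vs_discrete_prior(x, atoms, weights, q=lambda theta: 1.0):
    """Estimate of form (0.2) against a discrete prior: only atoms in (x - 1, x]
    contribute, each weighted by q(theta); the 0/0 case is read as x."""
    num = 0.0
    den = 0.0
    for theta, w in zip(atoms, weights):
        if x - 1.0 < theta <= x:
            num += theta * q(theta) * w
            den += q(theta) * w
    return num / den if den > 0.0 else x


def modified_regret(thetas, procedure, reps=2000, seed=0):
    """Monte Carlo approximation of (0.3) when f = 1, i.e. X_j uniform on
    [theta_j, theta_j + 1). `procedure` maps the list of observations to a
    list of estimates (a hypothetical interface for this sketch)."""
    rng = random.Random(seed)
    n = len(thetas)
    weights = [1.0 / n] * n
    total = 0.0
    for _ in range(reps):
        xs = [theta + rng.random() for theta in thetas]
        estimates = procedure(xs)
        for j in range(n):
            bayes = bayes_vs_discrete_prior(xs[j], thetas, weights)
            total += (estimates[j] - thetas[j]) ** 2 - (bayes - thetas[j]) ** 2
    return total / (reps * n)
```

A two-stage procedure in the sense of §1.0 would call `bayes_vs_discrete_prior` with the atoms and weights of an estimate $\hat G$ in place of those of $G$.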
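Lemma 6*, which improves the bound of Lemma 6 in the special case where $f \equiv 1$, is included because it also illustrates a different proof.

Let $\Omega = [c, d]$, where $-\infty < c \le d < +\infty$, throughout this section. Let $\hat G$ be a distribution-valued random variable which is an estimate of the empiric distribution $G$, obtained from $X_1,\dots,X_n$. Since $X_j' < \hat\theta_{jn} \le X_j$ by (0.2)^ whatever be the distribution $\hat G$, $|(\hat\theta_{jn} - \theta_j)^2 - (\theta_{jn} - \theta_j)^2| \le 2|\hat\theta_{jn} - \theta_{jn}|$. Hence, it follows from (0.3) that

(1.1)  $2^{-1}|D(\boldsymbol\theta, \hat{\boldsymbol\theta})| \le n^{-1}\sum_{j=1}^n \mathbf{P}|\hat\theta_{jn} - \theta_{jn}|$.

Now, Lévy distance for two distribution functions is defined by (A.1.1). For fixed $j$, since $|\hat\theta_{jn} - \theta_{jn}| \le 1$, for any $\varepsilon > 0$

(1.2)  $\mathbf{P}|\hat\theta_{jn} - \theta_{jn}| \le \mathbf{P}[\hat L > \varepsilon] + \mathbf{P}(|\hat\theta_{jn} - \theta_{jn}|[\hat L \le \varepsilon])$.

Before dealing with the second term of rhs(2), we introduce four lemmas.

Lemma 1. For any $\varepsilon \ge 0$ and $\delta \ge 0$ with $\varepsilon + \delta < 1$,

$n^{-1}\sum_{j=1}^n A_j \le 1 + d - c$

where $\forall\, j$,

$A_j = P_j\{[\theta_j + \delta \le X_j < \theta_j + 1 - \varepsilon]\,(\int_{X_j'+\varepsilon}^{X_j-\delta} q\, dG)^{-1}\}$.

Proof. Since $\forall\, j$,

(1.3)  $A_j = \int \{q(\theta_j)[\theta_j + \delta \le y < \theta_j + 1 - \varepsilon] \big/ \int_{y'+\varepsilon}^{y-\delta} q\, dG\}\, f(y)\, dy$

and the average wrt $j$ of the numerator of the quotient in rhs(3) equals the denominator, it follows that

$n^{-1}\sum_{j=1}^n A_j = \int [\int_{y'+\varepsilon}^{y-\delta} q\, dG > 0]\, f(y)\, dy \le 1 + d - c$. ∎

Lemma 2. For an arbitrary distribution function $F$ of a random variable and any $s, t \in R$ with $s < t$,

$\int F]_{y-t}^{y-s}\, dy = t - s$.

Proof. By the Fubini theorem $\int F]_{y-t}^{y-s}\, dy = \int\int_{u+s}^{u+t} dy\, dF(u) = t - s$. ∎

Lemma 3. For $s, t \in R$ with $s \le t$ and for any $\eta \in R$,

(1.4)  $n^{-1}\sum_{j=1}^n B_j \le t - s$

where $\forall\, j$,

$B_j = P_j\{G]_{X_j-t}^{X_j-s}\,[\theta_j + \eta^- \le X_j < \theta_j + 1 - \eta^+] \big/ \int_{X_j'+\eta}^{X_j+\eta} q\, dG\}$.

Proof. As in the proof of Lemma 1, $\forall\, j$,

(1.5)  $B_j = \int \{q(\theta_j)[\theta_j + \eta^- \le y < \theta_j + 1 - \eta^+] \big/ \int_{y'+\eta}^{y+\eta} q\, dG\}\, f(y)\, G]_{y-t}^{y-s}\, dy$.

Since $[\theta_j + \eta^- \le y < \theta_j + 1 - \eta^+] = [y' + \eta^+ < \theta_j \le y - \eta^-]$, the average wrt $j$ of the numerator in the quotient is no more than the denominator. Also, since $f \le 1$, taking the average wrt $j$ and interchanging the integral and average operation leads to lhs(4) $\le \int (G(y-s) - G(y-t))\, dy$, hence does not exceed $t - s$ by Lemma 2. ∎

Lemma 4. For all $s \in R$,

(1.6)  $n^{-1}\sum_{j=1}^n \mathbf{P}\{|(\hat G - G)(X_j - s)|\,[\hat L \le \varepsilon] \big/ \int_{X_j'}^{X_j} q\, dG\} \le \varepsilon(3 + d - c)$.

Proof.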
For $j$ fixed we let $z = X_j - s$. By the definition of $\hat L$ and the remark stated after (A.1.1) (that the infimum in the definition of Lévy distance is attained), if $\hat L \le \varepsilon$, then

(1.7)  $G(y - \varepsilon) - \varepsilon \le \hat G(y) \le G(y + \varepsilon) + \varepsilon$ for all $y$.

Hence

$|(\hat G - G)(z)|\,[\hat L \le \varepsilon] \le \varepsilon + (G]_{z-\varepsilon}^{z} \vee G]_{z}^{z+\varepsilon}) \le \varepsilon + G]_{z-\varepsilon}^{z+\varepsilon}$.

Hence,

lhs(6) $\le \varepsilon\, n^{-1}\sum_{j=1}^n P_j(\int_{X_j'}^{X_j} q\, dG)^{-1} + n^{-1}\sum_{j=1}^n P_j(G]_{X_j-s-\varepsilon}^{X_j-s+\varepsilon} \big/ \int_{X_j'}^{X_j} q\, dG)$.

Lemma 1 with $\varepsilon = \delta = 0$ and Lemma 3 with $(s, t, \eta) = (s - \varepsilon, s + \varepsilon, 0)$ lead to the bound of Lemma 4. ∎

We will invoke a special case of Lemma A.2 of Singh (1974, Appendix) in the proof of forthcoming Lemma 6, and also in later sections (§2.1 and §3.1).

Lemma 5 (Singh (1974)). For real random variables $Y$ and $Z$, and real numbers $y$ and $z \ne 0$,

$E(|Y/Z - y/z| \wedge 1) \le 2|z|^{-1}\{E|Y - y| + (|y/z| + 1)E|Z - z|\}$.

We shall now get an upper bound of the average wrt $j$ of the second term of rhs(2).

Lemma 6. For $\varepsilon > 0$,

$n^{-1}\sum_{j=1}^n \mathbf{P}(|\hat\theta_{jn} - \theta_{jn}|[\hat L \le \varepsilon]) \le a_0\varepsilon$

where $a_0 = 4m\{16 + 21m + (6 + 9m)(d - c)\}$.

Proof. Fix $n$ and $\boldsymbol\theta \in [c, d]^n$. We also fix $j$ thru (19). $X$ abbreviates $X_j$. Since

(0.2) $- X' = \int_{X'}^{X} (\theta - X')q(\theta)\, dG \big/ \int_{X'}^{X} q(\theta)\, dG$,

we abbreviate the quotient of the rhs to $y/z$ and that with $G$ replaced by $\hat G$ to $Y/Z$. Then,

(1.8)  $\hat\theta_{jn} - \theta_{jn} = Y/Z - y/z$.

Let $*$ denote conditioning on $X$ and $[\hat L \le \varepsilon]$. Then, by Lemma 5 and by the fact that $0 \le Y/Z,\ y/z \le 1$,

(1.9)  $\mathbf{P}^*|Y/Z - y/z| \le (2/z)\,\mathbf{P}^*(|Y - y| + 2|Z - z|)$.

By letting $I = (X', X]$, $G_I$ and $\hat G_I$ are defined in Proposition A in the Appendix. Then, by Proposition A,

(1.10)  [...]

where

(1.11)  $S = |(G - \hat G)(X')|$ and $T = |(G - \hat G)(X)|$.

Thus

(1.12)  when $\hat L \le \varepsilon$, $\hat L_I \le \varepsilon \vee S \vee T \le \varepsilon + S + T \triangleq \lambda$.

By applying Lemma A.3 in the Appendix, with $h(\theta)$ the restriction of $(\theta - X')q(\theta)$ to $(X', X]$, and weakening the resulting bound, when $\hat L_I \le \lambda$,

(1.13)  $|Y - y| \le 2\alpha(\lambda+) + m(S + T)$.

To bound $\alpha(\lambda+)$, pick $w_1, w_2 \in I$ such that $0 < w_2 - w_1 < \lambda$.
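Now, by the definition of $h$,

(1.14)  $h]_{w_1}^{w_2} = (w_2 - w_1)\Big(q(w_2) + \dfrac{w_1 - X'}{w_2 - w_1}\, q]_{w_1}^{w_2}\Big)$.

But, since by the definition of $q$, [...]

Theorem 1. For any $\varepsilon > 0$,

$2^{-1}|D(\boldsymbol\theta, \hat{\boldsymbol\theta})| \le \mathbf{P}[\hat L > \varepsilon] + a_0\varepsilon$ uniformly in $\boldsymbol\theta$

where $a_0$ is as defined in Lemma 6.

We can prove a strengthened version of Lemma 6 for $\mathcal{P}(1)$ using an alternative proof. To do so we need to introduce the following lemma.

Lemma 7. Let $\tau$ be a signed measure, $h$ be a measurable function and $I = (y', y]$ be an interval with $\int_I h\, d\tau \ne 0$. Let $\tau_y$ be the signed measure with density $Ih/\int_I h\, d\tau$ wrt $\tau$. Then,

$\int s\, d\tau_y(s) = y - \int_0^1 \tau_y(y', y'+t]\, dt$.

Proof. By Fubini's theorem applied to the lhs of the second equality below,

$y - \int s\, d\tau_y(s) = \int\int_{s-y'}^{1} dt\, d\tau_y(s) = \int_0^1 \tau_y(y', y'+t]\, dt$. ∎

Since $G(X) - G(X') \ge n^{-1}$, it follows by two applications of Lemma 7 above with $h = 1$, $\tau = \hat G$ and $G$ to $\hat\theta_{jn}$ and $\theta_{jn}$ (cf. (0.2) and (0.2)^ with $q = 1$) that if $\hat G(X) - \hat G(X') > 0$, then

(1.20)  $\hat\theta_{jn} - \theta_{jn} = \int_0^1 (W - \hat W)\, dt$

where for $0 \le t \le 1$,

(1.21)  $W = G]_{X'}^{X'+t} \big/ G]_{X'}^{X}$

and $\hat W$ is given by (21)^.

Lemma 6*. If $P_j \in \mathcal{P}(1)$, $j = 1, 2, \dots, n$, then for $0 < \varepsilon < 2^{-1}$,

(1.22)  $n^{-1}\sum_{j=1}^n \mathbf{P}(|\hat\theta_{jn} - \theta_{jn}|[\hat L \le \varepsilon]) \le 8(2 + d - c)\varepsilon$.

Proof. Fix $j$ until (35). Since $|\hat\theta_{jn} - \theta_{jn}| \le 1$,

(1.23)  $|\hat\theta_{jn} - \theta_{jn}| \le [\hat G]_{X'}^{X} = 0] + |\hat\theta_{jn} - \theta_{jn}|\,[\hat G]_{X'}^{X} > 0]$.

Now, if $\hat L \le \varepsilon$, then (7) holds. Hence, $G]_{X'+\varepsilon}^{X-\varepsilon} \le \hat G]_{X'}^{X} + 2\varepsilon$. Thus

(1.24)  $\mathbf{P}[\hat G]_{X'}^{X} = 0,\ \hat L \le \varepsilon] \le P_j[G]_{X'+\varepsilon}^{X-\varepsilon} \le 2\varepsilon]$
  $\le P_j[\theta_j \le X < \theta_j + \varepsilon] + P_j[\theta_j + 1 - \varepsilon \le X < \theta_j + 1] + 2\varepsilon P_j\{[\theta_j + \varepsilon \le X < \theta_j + 1 - \varepsilon]\,(G]_{X'+\varepsilon}^{X-\varepsilon})^{-1}\}$.

Therefore,

(1.25)  $n^{-1}\sum_{j=1}^n$ (lhs(24)) $\le 2(2 + d - c)\varepsilon$

because both the first and second terms of rhs(24) do not exceed $\varepsilon$ and $(2\varepsilon)^{-1} n^{-1}\sum_{j=1}^n$ (third term of rhs(24)) $\le 1 + d - c$ by Lemma 1.

On the other hand, by (20) and weakening the integrand,

(1.26)  $\mathbf{P}(|\hat\theta_{jn} - \theta_{jn}|\,[\hat G]_{X'}^{X} > 0,\ \hat L \le \varepsilon]) \le \mathbf{P}(\int_0^1 |\hat W - W|\,[\hat L \le \varepsilon]\, dt)$.

For any $a$, $b$ and $z \in R$, when $a \le b$, $(a - z)/(b - z) = 1 - (b - a)/(b - z)$ decreases from 1 to zero as $z$ increases from $-\infty$ to $a$.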
Applying the above analysis to the representation (21)^ with $a$, $b$ and $z$ defined by positional correspondence in (21)^ and then applying (7) at $X'$, we obtain that for $0 \le t \le 1$ and $\hat L \le \varepsilon$

$\dfrac{\hat G(X'+t) - G(X'+\varepsilon) - \varepsilon}{\hat G(X) - G(X'+\varepsilon) - \varepsilon} \le \hat W \le \dfrac{\hat G(X'+t) - G(X'-\varepsilon) + \varepsilon}{\hat G(X) - G(X'-\varepsilon) + \varepsilon}$.

Finally, making the lower bound smaller (and the upper bound larger) wrt $\hat G(X'+t)$ and $\hat G(X)$ and weakening by another four usages of $\hat L \le \varepsilon$, results in

(1.27)  $(G]_{X'+\varepsilon}^{X'+t-\varepsilon} - 2\varepsilon) \big/ G]_{X'+\varepsilon}^{X+\varepsilon} \le \hat W \le (G]_{X'-\varepsilon}^{X'+t+\varepsilon} + 2\varepsilon) \big/ G]_{X'-\varepsilon}^{X-\varepsilon}$.

Note for future use that if $u \le \hat W \le v$, then

(1.28)  $|\hat W - W| \le (W - u)^+ + (v - W)^+$.

Now, for any $a$, $b$, $y$ and $z \in R$

(1.29)  $z\{b/a - (y - 2\varepsilon)/z\} = 2\varepsilon + (b - y) + (b/a)(z - a)$.

Let $u$ = lhs(27). With $W = b/a$ and $u = (y - 2\varepsilon)/z$ where the component quantities are defined by positional correspondence in the definitions of $W$ and $u$, (29) and the relationships $z - a \le G]_{X}^{X+\varepsilon}$, $0 \le (b/a) \le 1$, and $b - y = G]_{X'}^{X'+\varepsilon} + G]_{X'+t-\varepsilon}^{X'+t}$ give

(1.30)  $(G]_{X'+\varepsilon}^{X+\varepsilon})(W - u) \le 2\varepsilon + G]_{X'+t-\varepsilon}^{X'+t} + G]_{X'}^{X'+\varepsilon} + G]_{X}^{X+\varepsilon}$.

Similarly, for any $a$, $b$, $y$ and $z \in R$,

(1.31)  $z((y + 2\varepsilon)/z - (b/a)) = 2\varepsilon + y - b + (b/a)(a - z)$.

Let $v$ = rhs(27). With $v = (y + 2\varepsilon)/z$ and $W = b/a$ where the corresponding quantities are defined by the positional correspondence in $v$ and $W$, (31) and the relationships $a - z \le G]_{X-\varepsilon}^{X}$, $0 \le (b/a) \le 1$ and $y - b = G]_{X'+t}^{X'+t+\varepsilon} + G]_{X'-\varepsilon}^{X'}$ show

(1.32)  $(G]_{X'-\varepsilon}^{X-\varepsilon})(v - W) \le 2\varepsilon + G]_{X'+t}^{X'+t+\varepsilon} + G]_{X'-\varepsilon}^{X'} + G]_{X-\varepsilon}^{X}$.

(26), (27), (28), (30) and (32) together give us that

(1.33)  $|\hat W - W|\,[\hat L \le \varepsilon] \le (G]_{X'+\varepsilon}^{X+\varepsilon})^{-1}$(rhs(30)) $+ (G]_{X'-\varepsilon}^{X-\varepsilon})^{-1}$(rhs(32)).

By Lemma 2, $\int_0^1 G]_{X'+t-\varepsilon}^{X'+t}\, dt \le \varepsilon$ and $\int_0^1 G]_{X'+t}^{X'+t+\varepsilon}\, dt \le \varepsilon$. Thus

(1.34)  $\int_0^1$ rhs(33) $dt \le \alpha + \beta$

where

$\alpha = (3\varepsilon + G]_{X'}^{X'+\varepsilon} + G]_{X}^{X+\varepsilon}) \big/ G]_{X'+\varepsilon}^{X+\varepsilon}$

and

$\beta = (3\varepsilon + G]_{X'-\varepsilon}^{X'} + G]_{X-\varepsilon}^{X}) \big/ G]_{X'-\varepsilon}^{X-\varepsilon}$.

Bounding $\int_0^1$ lhs(33) $dt$ by 1 (since $0 \le W, \hat W \le 1$) over the sets $X^{-1}[\theta_j, \theta_j + \varepsilon)$ and $X^{-1}[\theta_j + 1 - \varepsilon, \theta_j + 1)$, and extending the set $X^{-1}[\theta_j + \varepsilon, \theta_j + 1 - \varepsilon)$ in two different ways (as shown below) we get

(1.35)  $n^{-1}\sum_{j=1}^n \mathbf{P}(\int_0^1 |\hat W - W|\,[\hat L \le \varepsilon]\, dt) \le 2\varepsilon + n^{-1}\sum_{j=1}^n P_j(\alpha[\theta_j \le X < \theta_j + 1 - \varepsilon]) + n^{-1}\sum_{j=1}^n P_j(\beta[\theta_j + \varepsilon \le X < \theta_j + 1])$.

By applying Lemma 1 with $q = 1$ twice and Lemma 3 four times to the second and third terms of rhs(35), the second and third terms of rhs(35) are both $\le 3\varepsilon(1 + d - c) + 2\varepsilon$. Hence, in view of (35), (26), (25) and (23) we recognize that the sum of rhs(25) and $2\varepsilon + 2\{3\varepsilon(1 + d - c) + 2\varepsilon\}$ gives us rhs(22). ∎

§1.2. A Particular Procedure $\hat{\boldsymbol\theta}$ with a Rate (1/4)-.

We first construct a normalized (but not monotonized) estimate $G^*$ of the empiric distribution function $G$. The main work in this section is, under the extra assumption on $f$ (Lipschitz condition for $1/f$), to obtain the generalization (Lemma 8) of Lemma 3.1 of Fox (1970). Then, we exhibit a distribution-valued estimate $\hat G$ of $G$. Lemma 9, showing Lévy consistency of $\hat G$ to $G$, will be proved as in the proof of Theorem 3.1 of Fox (1970) by using Lemma 8. Finally, Theorem 2 shows that there exists a procedure $\hat{\boldsymbol\theta}$ with a rate (1/4)-.

In addition to the assumption on $f$ in the introduction of Chapter I we now assume that $1/f$ satisfies the Lipschitz condition:

(2.1)  $\vee\{(v - u)^{-1}|(f(v))^{-1} - (f(u))^{-1}| : u < v\} \le M$

for a finite constant $M$.

Let $\Omega = R$ until the proof of Lemma 9 is ended.

Let $Q$ be the distribution function defined by

$Q(y) = \int_{-\infty}^{y} q\, dG, \ \forall\, y$.

Then, letting $\bar p \triangleq \int p_\theta\, dG(\theta)$, we have by the definition of $p_\theta$ that $\bar p(y) = f(y)(Q(y) - Q(y'))$ and thus

(2.2)  $Q(y) = \sum_r \bar p(y - r)/f(y - r)$

where $\sum_r$ abbreviates $\sum_{r=0}^{\infty}$ throughout this section. Since $q \ge 1$ and $q$ is the density of $Q$ wrt $G$, it follows by Theorem 32.B of Halmos (1950) that

(2.3)  $G(y) = \int_{-\infty}^{y} (q(\theta))^{-1}\, dQ(\theta)$.

For each $y$, we let

$F^*(y) = n^{-1}\sum_{j=1}^n [X_j \le y]$

and for any $h > 0$

(2.4)  $\Delta F^*(y) = h^{-1} F^*]_{y}^{y+h}$.

We allow $h$ to depend on $n$ and assume $h < 1$ for convenience.

Let $\bar P \triangleq \int P_\theta\, dG$. Then, $\bar p = d\bar P/d\xi$ where $\xi$ is Lebesgue measure. We estimate $\bar p(y)$ by $\Delta F^*(y)$ and $Q(y)$ by

(2.5)  $Q^*(y) = \sum_r (\Delta F^*(y - r)/f(y - r))$.

Note that $Q^*$ has bounded variation because of (1). From the relation (3), we obtain a raw estimate $W$
of $G$ from

(2.6)  $W(y) = \int_{-\infty}^{y} (q(t))^{-1}\, dQ^*(t)$.

Since $F^*(y) \le G(y) \le F^*(y+1)$ for all $y \in R$, we furthermore estimate $G$ at a point $y$ by

$G^*(y) = (F^*(y) \vee W(y)) \wedge F^*(y+1)$.

The following Lemma 8 is a direct generalization of Lemma 3.1 of Fox (1970) in the sense that if $f \equiv 1$, then $m = 1$ and $M = 0$, and hence we get his bound $2\exp\{-2nh^2\varepsilon^2\}$.

Lemma 8. If $0 < h \le \varepsilon \le 1$, then for each $y$

(2.7)  $\mathbf{P}[\{G(y - \varepsilon) - \varepsilon \le G^*(y) \le G(y + \varepsilon) + \varepsilon\}^c] \le 2\exp\Big\{-\dfrac{2nh^2((\varepsilon - bh)^+)^2}{(1 + 4(\theta_{(n)} - \theta_{(1)} + 3)M)^2}\Big\}$

where $\theta_{(1)} = \min_{1 \le i \le n}\theta_i$, $\theta_{(n)} = \max_{1 \le i \le n}\theta_i$, and $b = 2^{-1}m(2M + 3(1 \wedge M))$.

Proof. For $y > \theta_{(n)} + 1$, $F^*(y) = G^*(y) = G(y + \varepsilon) = 1$ and for $y < \theta_{(1)} - 1$, $F^*(y+1) = G^*(y) = G(y - \varepsilon) = 0$; in both cases lhs(7) $= 0$ and (7) holds trivially.

For $y \in [\theta_{(1)} - 1, \theta_{(n)} + 1]$ it is sufficient to prove the lemma for the raw estimate $W$. For if $G(y - \varepsilon) - \varepsilon \le W(y) \le G(y + \varepsilon) + \varepsilon$, then, since $G(y - \varepsilon) - \varepsilon \le G(y) \le F^*(y+1)$ and $F^*(y) \le G(y) \le G(y + \varepsilon) + \varepsilon$, it follows that $G(y - \varepsilon) - \varepsilon \le W(y) \wedge F^*(y+1) \le G^*(y) \le G(y + \varepsilon) + \varepsilon$.

Pick $y \in [\theta_{(1)} - 1, \theta_{(n)} + 1]$. Since the summation on $r$ in (5) involves at most a finite number of non-zero terms, we shall freely interchange integral and summation on $r$ without further comment. In fact, if the $r$-th term is non-zero, then $r \le y - \theta_{(1)} + h$ and, for $y \le \theta_{(n)} + 1$,

(2.8)  $r \le \theta_{(n)} - \theta_{(1)} + 2 \triangleq a - 1$.

For each $j$, let

(2.9)  $N_j = \sum_r \int_{-\infty}^{y} (q(t))^{-1}\, d_t\{[t - r < X_j \le t - r + h]\,(h f(t - r))^{-1}\}$,

where the subscript $t$ in $d_t$ denotes the variable of integration. By the definition (6) of $W$, $W(y) = n^{-1}\sum_{j=1}^n N_j$.

We are going to find an upper and a lower bound of $\mathbf{P}W(y)$ in order to apply Hoeffding's bound (1963, Theorem 2). To do so we shall find an upper and a lower bound of $P_j N_j$, $\forall\, j$.

Fix $j$ until (24). We use the corresponding notations without subscript $j$ until (28). Now, Proposition III.2.1 of Neveu (1965) gives us a version of the relation $E\,E(h(t) \mid X) = E\,h(t)$ for an integrable function $h$ and probability measures. But, because of its proof it holds for finite measures and hence by two applications of it, it holds for finite signed measures.
Hence,

$P_\theta\{\int_{-\infty}^{y} (q(t))^{-1}\, d_t([t - r < X \le t - r + h]\,(f(t - r))^{-1})\} = \int_{-\infty}^{y} (q(t))^{-1}\, d_t\{P_\theta([t - r < X \le t - r + h]\,(f(t - r))^{-1})\}$.

Thus, by the definition of $N$

(2.10)  $P_\theta N = q(\theta)\sum_r \int_{-\infty}^{y} (q(t))^{-1}\, d_t S(t - r)$

where

(2.11)  $S(t) = (f(t))^{-1} h^{-1}\int_{t}^{t+h} [\theta \le s < \theta + 1]\, f(s)\, ds$.

Because a function satisfying a Lipschitz condition is absolutely continuous (cf. Royden (1968), p. 108, Exercise 16(a)) and the product of two absolutely continuous functions is absolutely continuous, $S(\cdot - r)$ is absolutely continuous. Since $1/q$ is clearly absolutely continuous, $S(\cdot - r)$ and $1/q$ are both of bounded variation. Applying integration by parts (Saks (1937), Theorem III.14.1) and using $d(q(t))^{-1} = (f(t+1) - f(t))\,dt$ gives us that

(2.12)  $\int_{-\infty}^{y} (q(t))^{-1}\, d_t S(t - r) = \dfrac{S(y - r)}{q(y)} - \int_{-\infty}^{y} S(t - r)\, f]_{t}^{t+1}\, dt$.

Now, by the assumption (1),

(2.13)  $\Big|\dfrac{f(s)}{f(t)} - 1\Big| \le M|s - t|$.

Until (22), we use the notation

(2.14)  $A(t) = h^{-1}\int_{t}^{t+h} [\theta \le s < \theta + 1]\, ds$.

Applying (13) to the definition (11) of $S$ and doing exact integration leads to the inequalities

(2.15)  $1 - \dfrac{Mh}{2} \le S(t)/A(t) \le 1 + \dfrac{Mh}{2}$.

Moreover, because $\sum_r A(y - r) = h^{-1}\int_{y}^{y+h} [\theta \le t]\, dt$,

$[\theta \le y] \le \sum_r A(y - r) \le [\theta \le y + h]$.

Hence, weakening the bounds by a usage of $[\theta \le \cdot]/q(y) \le 1$ shows that

(2.16)  $\dfrac{[\theta \le y]}{q(y)} - \dfrac{Mh}{2} \le \dfrac{\sum_r S(y - r)}{q(y)} \le \dfrac{[\theta \le y + h]}{q(y)} + \dfrac{Mh}{2}$.

On the other hand, in the integral of rhs(12) we make a change of the variable $t - r$ to $t$ to get

$\int_{-\infty}^{y} S(t - r)\, f]_{t}^{t+1}\, dt = \int S(t)\,[t \le y - r]\, f]_{t+r}^{t+r+1}\, dt$.

Let $[\![z]\!]$ denote the greatest integer $\le z$ if $z \ge 0$ and $-1$ if $z < 0$. Since $\sum_r [t \le y - r](f(t + r + 1) - f(t + r)) = [t \le y](f(t + [\![y - t]\!] + 1) - f(t)) = f(t + [\![y - t]\!] + 1) - f(t)$ (the latter because $[\![y - t]\!] = -1$ if $t > y$), it follows that

(2.17)  $\sum_r \int_{-\infty}^{y} S(t - r)\, f]_{t}^{t+1}\, dt = \int S(t)\, f]_{t}^{t+[\![y-t]\!]+1}\, dt$.

By one usage of (15) and the fact that $0 < f \le 1$ and $\int A(t)\, dt = 1$,

(2.18)  [...]

[...] $\mathbf{P}[W(y) > G(y + \varepsilon) + \varepsilon] \le \mathbf{P}[W(y) - \mathbf{P}W(y) > \varepsilon - bh]$

(2.28)  $\le \exp\Big\{-\dfrac{2nh^2((\varepsilon - bh)^+)^2}{(1 + 4aM)^2}\Big\}$.

Furthermore, by the first inequality of (25), $\{W(y) < G(y - \varepsilon) - \varepsilon\} \subset \{\mathbf{P}W(y) - W(y) > \varepsilon - bh\}$.
Hence by the symmetry of the tail bounds, $\mathbf{P}[W(y) < G(y - \varepsilon) - \varepsilon]$ has the same upper bound, rhs(28), which together with (28) gives us the asserted bound of Lemma 8. ∎

We let $\delta = N^{-1}$, $N$ being a positive integer depending on $n$, and consider the following grid on the real line: $\dots < -2\delta < -\delta < 0 < \delta < 2\delta < \dots$. We finally estimate $G$ at $y$ by

(2.29)  $\hat G(y) = \sup\{G^*(j\delta) : j\delta \le y,\ j = 0, \pm 1, \dots\}$.

Let $\hat L = L(G, \hat G)$ be the Lévy metric of $G$ and $\hat G$ (cf. (A.1.1)).

Lemma 9 (Fox (1970)). For any $\varepsilon > 0$, if $h \le \varepsilon$ and $\delta \le \varepsilon$, then

(2.30)  $\mathbf{P}[\hat L > 2\varepsilon] \le (\delta^{-1} + 1)[\varepsilon^{-1} + 1]$ rhs(7).

Proof. We rely on the proof of Theorem 3.1 of Fox (1970). Pick $\varepsilon > 0$ such that $h \le \varepsilon$ and $\delta \le \varepsilon$. Let $J$ be the largest integer such that $F^*(J\delta + 1) \le \varepsilon$. We also let

$T = \{j : F^*((j+1)\delta + 1) - F^*(j\delta) > \varepsilon,\ j \ge J,\ j = 0, \pm 1, \dots\}$

and $A_n = \bigcup_{j \in T}[j\delta, (j+1)\delta)$. Since only retraction and monotonicity properties of his respective estimates $G^*$ and $\hat G$ were used before Lemma 3.1 of Fox was applied, the following inequalities are still true for our estimates $G^*$ and $\hat G$.

(2.31)  $\mathbf{P}[\hat L > 2\varepsilon] = \mathbf{P}(\bigcup_{y \in A_n}(\{\hat G(y) > G(y + 2\varepsilon) + 2\varepsilon\} \cup \{\hat G(y) < G(y - 2\varepsilon) - 2\varepsilon\}))$
  $\le \mathbf{P}(\bigcup_{j\delta \in A_n}(\{G^*(j\delta) > G(j\delta + \varepsilon) + \varepsilon\} \cup \{G^*(j\delta) < G(j\delta - \varepsilon) - \varepsilon\}))$
  $\le \sum_{j\delta \in A_n}\mathbf{P}(\{G^*(j\delta) > G(j\delta + \varepsilon) + \varepsilon\} \cup \{G^*(j\delta) < G(j\delta - \varepsilon) - \varepsilon\})$.

Since there are at most $(\delta^{-1} + 1)[\varepsilon^{-1} + 1]$ grid points (see Fox (1970, p. 1850)) in $A_n$, by Lemma 8 the extreme rhs of (31) is no larger than rhs(30). ∎

Let $\hat{\boldsymbol\theta}$ be the procedure whose component procedures are Bayes versus $\hat G$ defined by (29). To get a rate of convergence of the modified regret for $\hat{\boldsymbol\theta}$ we use the bound of Theorem 1 in section 1. Since this bound is valid only for $\Omega = [c, d]$ where $-\infty < c \le d < +\infty$, we assume $\mathcal{P}(f)$ with $\Omega = [c, d]$.

Theorem 2. If $P_j \in \mathcal{P}(f)$ with $\Omega = [c, d]$, $j = 1, 2, \dots, n$, where $f^{-1}$ satisfies the Lipschitz condition (1), then there exist constants $b_1$ and $b_2$ so that, for $\hat{\boldsymbol\theta}$ with $b_1 h = b_2\delta = (n^{-1}\log n)^{1/4}$,

$D(\boldsymbol\theta, \hat{\boldsymbol\theta}) = O((n^{-1}\log n)^{1/4})$, uniformly in $\boldsymbol\theta \in [c, d]^n$.

Proof. We use Theorem 1 in section 1 with $\varepsilon$ replaced by $2\varepsilon$ and apply Lemma 9.
Then, choosing $\varepsilon = \delta = (2b + 1)h < 1$ (for sufficiently large $n$) and weakening the bound by changing $\theta_{(n)}$ and $\theta_{(1)}$ to $d$ and $c$, respectively, gives

(2.32)  $|D(\boldsymbol\theta, \hat{\boldsymbol\theta})| \le b_3 h + b_4 h^{-2} e^{-(nh^4/b_5)}$

where $b_3$ and $b_4$ are some constants, and $b_5 = 2\{1 + 4(d - c + 3)M\}^2$. Choose $b_1$ and $b_2$ so that $b_1^4 \le 4(3b_5)^{-1}$ and $b_2 = b_1(2b + 1)^{-1}$. Then, for $b_1 h\ (= b_2\delta) = (n^{-1}\log n)^{1/4}$, (32) leads to the asserted rate in Theorem 2. ∎

CHAPTER II

RATES FOR ONE-STAGE PROCEDURES FOR A FAMILY OF UNIFORM DISTRIBUTIONS

§2.0. Introduction

In Chapter I two-stage procedures were developed for estimation of $\boldsymbol\theta$ for the family $\mathcal{P}(f)$. For sufficiently smooth $f$ and certain two-stage procedures $\hat{\boldsymbol\theta}$, $D(\boldsymbol\theta, \hat{\boldsymbol\theta}) = O((n^{-1}\log n)^{1/4})$ uniformly in $\boldsymbol\theta$ (cf. Theorem 1.2). In this Chapter we consider one-stage procedures for estimating $\boldsymbol\theta$ for the family $\mathcal{P}(1)$, i.e., where $P_j$ is the uniform distribution $U[\theta_j, \theta_j + 1)$, and obtain $O(n^{-1/4})$.

Throughout this chapter let $X_1, X_2, \dots$ be independent with $X_j$ distributed according to $P_j = U[\theta_j, \theta_j + 1)$. We begin by motivating the structure of two one-stage procedures for estimating $\boldsymbol\theta = (\theta_1,\dots,\theta_n)$.

For fixed $j$, $1 \le j \le n$, we abbreviate $X_j$ to $x$. Then by (1.0.2) with $q = 1$ and Lemma 1.7 with $h = 1$, $y = x$ and $\tau = G$, the empirical distribution of $\theta_1,\dots,\theta_n$, $\boldsymbol\theta_G(\mathbf{X}) = (\theta_{1n},\dots,\theta_{nn})$ has $j$th coordinate

(0.1)  $\theta_{jn} = x - \int_0^1 G]_{x'}^{x'+t}\, dt \big/ G]_{x'}^{x}$.

For each $y$ let $p_\theta(y) = (dP_\theta/d\xi)(y) = [\theta \le y < \theta + 1]$ where $\xi$ is Lebesgue measure. Then $\bar p(y) \triangleq G(p_\theta(y)) = G(y) - G(y')$, which leads to the relationship

(0.2)  $G(y) = \sum_{r=0}^{\infty} \bar p(y - r)$.

[...]

Proposition 1. Let $Y_1,\dots,Y_r$ be independent and $a \le Y_i \le b$, $i = 1, 2, \dots, r$. Let $\bar Y = r^{-1}\sum_{i=1}^r Y_i$. Then, for every $\eta$,

$E|\bar Y - \eta| \le |E\bar Y - \eta| + \dfrac{b - a}{\sqrt{r}}\sqrt{\pi/2}$.

Proof. By the triangle inequality

(1.11)  $E|\bar Y - \eta| \le |E\bar Y - \eta| + E|\bar Y - E\bar Y|$.
Now, by the Fubini representation (10) of the integral, [...]

Also, using the first inequality of Lemma 2, and weakening the bound by use of $G(x'+t) - G(x') \le 1$ and $1 - h < 1$, gives us

(1.17)  $\int_0^1 (\Delta_t G - P_x(\Delta_t T_n))\, dt \le \int_0^h G]_{x'}^{x'+t}\, dt + \int_h^1 G]_{x'}^{x'+h}\, dt + n^{-1} \le h + G]_{x'}^{x'+h} + n^{-1}$.

Thus, taking the maximum of the two bounds of (16) and (17) (recall $h < 1/2$) and weakening the bound by a use of $(y + u) \vee (y + v + w) \le y + v + u \vee w$, leads to

first term of rhs(15) $\le h + G]_{x'}^{x'+h} + n^{-1}$.

Thus, in view of (15) and the fact $n^{-1} \le (n-1)^{-1/2}$, (14) is established. ∎

Lemma 4.

(1.18)  $P_x|\Delta T_n - \Delta G| \le G]_{x}^{x+h} + G]_{x'}^{x'+h} + (1 + \sqrt{\pi/2})\dfrac{1}{\sqrt{n-1}\,h}$.

Proof. Define $W_i = ((n-1)/(nh))[x < X_i \le x + h]$ for $i \ne j$. Then, by definition (5) of $T_n$, we can directly verify $\Delta T_n = \bar W$. Since $0 \le W_i \le h^{-1}$ for all $i$, by Proposition 1 with $b - a = h^{-1}$

(1.19)  lhs(18) $\le |P_x(\Delta T_n) - \Delta G| + \sqrt{\pi/2}\,\dfrac{1}{\sqrt{n-1}\,h}$.

But, by the second inequality of Lemma 2,

$P_x(\Delta T_n) - \Delta G \le G]_{x}^{x+h} + (nh)^{-1}$

and by the first inequality of Lemma 2,

$\Delta G - P_x(\Delta T_n) \le G]_{x'}^{x'+h} + n^{-1}$.

Thus, taking the maximum of the above two bounds and weakening the bound by use of $(u + v) \vee (s + t) \le u + s + t \vee v$ gives us

first term of rhs(19) $\le G]_{x}^{x+h} + G]_{x'}^{x'+h} + (nh)^{-1}$.

Therefore, in view of (19) and $n^{-1} \le (n-1)^{-1/2}$ the bound (18) is obtained. ∎

We now go back to the inequality (3). Applying the bounds from Lemmas 3 and 4, we get

$n^{-1}\sum_{j=1}^n P_j$(rhs(3)) $\le$ [...]

Therefore, by Lemma 1.1 with $\delta = \varepsilon = 0$ and by two applications of Lemma 1.3 with $(\eta, s, t) = (0, 1 - h, 1)$ and $= (0, -h, 0)$, we finally obtain in view of (3), (2) and (1) that

$|D(\boldsymbol\theta, \boldsymbol\theta_T)| \le (10 + 2(d + 1 - c))h + 2(3 + \sqrt{\pi/2})(d + 1 - c)\dfrac{1}{\sqrt{n-1}\,h}$.

Setting $h = n^{-1/4}$ gives us

Theorem 1. For $\boldsymbol\theta_T$ defined by (0.4) with $h = n^{-1/4}$,

$|D(\boldsymbol\theta, \boldsymbol\theta_T)| = O(n^{-1/4})$ uniformly in $\boldsymbol\theta$.

Remark 1. Theorem 1 of Chapter III is a k-extended generalization of Theorem 1 for non-regular families of distributions $\mathcal{P}(f)$.
Its specialization to $k = 1$ (unextended case) is itself a generalization of Theorem 1, which concerned the $\mathcal{P}(1)$ family. Theorem 1 was presented because of its simplicity and its significance in the motivation of Theorem 1 of Chapter III.

§2.2. A Lower Bound of the Modified Regret $D(\mathbf{0}, \boldsymbol\theta_T)$.

Let $X_1,\dots,X_{n+1}$ be i.i.d. random variables with the common distribution $P = U[0, 1)$. Let $\mathbf{X} = (X_1,\dots,X_{n+1})$. Here we consider $\boldsymbol\theta_T(\mathbf{X}) = (\theta_{T,1},\dots,\theta_{T,n+1})$ (for the definition, see (0.4) with $n$ replaced by $n+1$). Since $\theta_{T,1},\dots,\theta_{T,n+1}$ are identically distributed and since for all $j$, $\theta_{j,n+1} = 0$ (for the definition, see (0.1) with $n$ replaced by $n+1$), abbreviating $\theta_{T,n+1}$ to $\hat\theta$ we see in view of (1.0.3) that the modified regret of $\boldsymbol\theta_T$ at $\boldsymbol\theta = \mathbf{0}$ is given by

(2.1)  $D(\mathbf{0}, \boldsymbol\theta_T) = \mathbf{P}\hat\theta^2$.

For fixed $x = X_{n+1}$, we abbreviate $\varphi_{n+1,n+1}$ in the definition of $\hat\theta$ (see (0.5) with $j$ and $n$ replaced by $n+1$) to $\varphi$ and exhibit an explicit form of $\varphi$ in the a.s. $\mathbf{P}_x$ sense.

Lemma 5. For every $x \in [0, 1)$,

(2.2)  $\varphi = \Big(\sum_{j=1}^n (X_j - h)[x < X_j \le x + h] - h\sum_{j=1}^n [0 \le X_j \le x] - h + \sum_{j=1}^n [0 \le X_j \le x' + h]\Big) \Big/ \sum_{j=1}^n [x < X_j \le x + h]$  a.e. $\mathbf{P}_x$.

Proof. Fix $j$ and note that as a function of $t \in [0, 1]$, $\sum_{r=0}^{\infty}[X_j - x' + r - h \le t < X_j - x' + r]$ is equal to zero, is equal to its first term, or is equal to the sum of its first two terms according to whether $1 < X_j - x' - h$, $X_j - x' - h \le 1 < X_j - x'$ or $X_j - x' \le 1$. Integrating over $t \in [0, 1]$ for each case gives

$\int_0^1 \sum_{r=0}^{\infty}[X_j - x' + r - h \le t < X_j - x' + r]\, dt = (x + h - X_j)[x < X_j \le x + h] + h[X_j \le x]$.

Hence, it follows by the definition (see (1.5)) of $T_{n+1}$, abbreviated to $T$, that

(2.3)  $(n+1)h\int_0^1 T]_{x'}^{x'+t}\, dt = \sum_{j=1}^{n+1}(x + h - X_j)[x < X_j \le x + h] + h\sum_{j=1}^{n+1}[X_j \le x] - \sum_{j=1}^{n+1}\sum_{r=0}^{\infty}[x' - r < X_j \le x' - r + h]$.

But since $[x < x \le x + h] = 0$, $[x \le x] = 1$, $\sum_{r=0}^{\infty}[x' - r < x \le x' - r + h] = 0$ and, a.e. $\mathbf{P}_x$, $\sum_{r=1}^{\infty}[x' - r < X_j \le x' - r + h] = 0$, we have

rhs(3) $=$ [...]

[...] $[\varphi > x] = 0$. Hence, $\hat\theta$ has the following simpler form:

(2.4)  $\hat\theta = x' \vee \varphi$ for $x \in [0, 1 - h)$, and $\hat\theta = (x' \vee \varphi) \wedge x$ for $x \in [1 - h, 1)$.
Now, we let

(2.5) $J = [\varphi \ge x',\ x < 1-h]$

and recognize by (1) and the definition of $\hat\theta$ that

(2.6) $D(\mathbf{0}, \boldsymbol{\theta}_T) \ge P(\varphi^2 J) .$

Let $\xrightarrow{D}$ denote convergence in distribution. Also, $N(c,d)$ denotes the normal distribution with mean $c$ and variance $d$. To get lower bounds for $D(\mathbf{0}, \boldsymbol{\theta}_T)$ (Theorem 2) we use the relation (6) and the fact that for fixed $x$,

$h^{-1}\varphi J \xrightarrow{D} -2^{-1}$ and $S_n = \{\sqrt{nh}\,\varphi + 2^{-1}\sqrt{nh^3}\}\,J \xrightarrow{D} N(0, x^2) .$

We then apply a convergence theorem (cf. Loève (1963) 11.4, A(i)):

(2.7) If $U_n \xrightarrow{D} U$, then $\liminf E U_n^2 \ge E U^2$,

where $E$ means expectation, and Theorem A.1 in the appendix. We shall first prepare Lemmas 6, 7 and 8 to prove the above two convergences in distribution for the proof of the forthcoming Theorem 2.

Let $u = \sum_{j=1}^n [0 \le X_j \le x]$, $v = \sum_{j=1}^n [x < X_j \le x+h]$ and $w = \sum_{j=1}^n X_j [x < X_j \le x+h]$. We also define $X = (w - hv - xv - h)/(hv)$, $Y = (u - nx)/\sqrt{nx(1-x)}$ and $Z = (v - nh)/\sqrt{nh}$. Then, on the set $J$, $\varphi$ of the form (2) is alternatively written as

(2.8) $\varphi = hX + \dfrac{(nh)^{-1/2}\,xZ}{1 + (nh)^{-1/2}Z} - \dfrac{\sqrt{x(1-x)}\;n^{-1/2}\,Y}{1 + (nh)^{-1/2}Z} .$

Lemma 6. Given $x \in (0,1)$, if $h$ is a function of $n$ such that $nh \to \infty$ and $h \to 0$, then

$(Y, Z) \xrightarrow{D} N(\mathbf{0}, I)$,

where $\mathbf{0}$ is the zero vector in $R^2$ and $I$ is the $2 \times 2$ identity matrix.

Proof. For each $x \in (0,1)$ we restrict to $n$ such that $x < 1-h$. Pick $t$ and $s$ arbitrary, and let

$V_j = n^{-1/2}\{s\,(x(1-x))^{-1/2}([0 \le X_j \le x] - x) + t\,h^{-1/2}([x < X_j \le x+h] - h)\}$, for $j = 1,2,\ldots,n$.

Then, it is not hard to see that $\sum_{j=1}^n V_j = sY + tZ$. Since the $V_j$ are i.i.d., the characteristic function $K$ of $(Y, Z)$ at a point $(s,t) \in R^2$ is given by

(2.9) $K(s,t) = (J(1))^n$,

where $J$ is the characteristic function of $V = V_1$.
Since by XV (6.8) (Feller, 1971), for any complex numbers such that $|\alpha| \le 1$ and $|\beta| \le 1$, $|\alpha^n - \beta^n| \le n|\alpha - \beta|$,

(2.10) $|(J(1))^n - \exp(-\tfrac{1}{2}(s^2+t^2))| \le n\,|J(1) - \exp(-(2n)^{-1}(s^2+t^2))| .$

By the triangular inequality and by using $|1 - y - e^{-y}| = O(y^2)$ as $y \to 0$,

(2.11) rhs(10) $\le n\,\bigl|J(1) - 1 + \dfrac{s^2+t^2}{2n}\bigr| + O(n^{-1}) .$

Now, from the Taylor development of characteristic functions by XV (4.14) (Feller, 1971) and from the fact that $J(0) = 1$, $J'(0) = i\,P_x V = 0$ and $J''(0) = -P_x V^2$, it follows that

$|J(1) - 1 + \tfrac{1}{2}P_x V^2| \le \tfrac{1}{6}P_x|V|^3 .$

Now, we verify that

$P_x V^2 = n^{-1}\{(s^2+t^2) - t^2 h - 2st\,x\,(\sqrt{x}\,(1-x)^{-1/2} + \sqrt{1-x}\;x^{-1/2})\sqrt{h}\}$

and

$P_x|V|^3 = n^{-3/2}\{|s(x^{-1}-1)^{1/2} - t h^{1/2}|^3\,x + |t(1-h)h^{-1/2} - s x^{1/2}(1-x)^{-1/2}|^3\,h + |s x^{1/2}(1-x)^{-1/2} + t h^{1/2}|^3\,(1-x-h)\} .$

Hence,

$n^{-1}(s^2+t^2) - P_x V^2 = O(n^{-1}h^{1/2})$ and $P_x|V|^3 = O(n^{-3/2}h^{-1/2}) .$

Hence, applying the triangular inequality leads to

$\bigl|J(1) - 1 + \dfrac{s^2+t^2}{2n}\bigr| = O(n^{-1}h^{1/2} + n^{-3/2}h^{-1/2}) .$

Thus, in view of (11), (10) and (9),

$|K(s,t) - \exp(-\tfrac{1}{2}(s^2+t^2))| = O(h^{1/2} + n^{-1/2}h^{-1/2} + n^{-1}) .$

To get the conclusion we invoke the continuity theorem (cf. e.g. Breiman (1968), Theorem 11.6).

We shall next prove $X \xrightarrow{P} -2^{-1}$, where $\xrightarrow{P}$ means convergence in probability $P_x$ for given $x$.

Lemma 7. Under the same assumption as Lemma 6, $X \xrightarrow{P} -2^{-1}$.

Proof. For given $x \in (0,1)$, we restrict to $n$ such that $x < 1-h$. Then, $X$ is written as

(2.12) $X = C/(v/(nh)) - v^{-1}$,

where $C = (nh)^{-1}\sum_{j=1}^n U_j$, with $U_j = h^{-1}(X_j - x - h)I_j$ and $I_j = [x < X_j \le x+h]$. Since $v$ has the binomial distribution with parameters $n$ and $h$,

(2.13) $v/(nh) \xrightarrow{P} 1$ as $nh \to \infty$ and $h \to 0$.

By simple computations, $E U = -\tfrac{1}{2}h$ and $\mathrm{Var}(U) = h\,(12^{-1} + (1-h)/4)$. Thus, $EC = h^{-1}EU = -2^{-1}$ and $\mathrm{Var}(C) = (nh^2)^{-1}\mathrm{Var}(U) = (12^{-1} + (1-h)/4)/(nh)$. Therefore, by the Chebychev inequality,

(2.14) $C \xrightarrow{P} -2^{-1}$ as $nh \to \infty$ and $h \to 0$.

Applying (14), (13), (12) and Slutsky's Theorem completes the proof of Lemma 7.

Besides the above two lemmas we shall show that $P_x[\varphi \le x']$ vanishes when $nh \to \infty$ and $h \to 0$.

Lemma 8. Under the same assumption as Lemma 6, $P_x[\varphi \le x'] \to 0$ for fixed $x$.

Proof.
We restrict to $n$ such that $x < 1-h$. Let $W_j = h[0 \le X_j \le x] - (X_j - h - x')[x < X_j \le x+h]$ for $j = 1,2,\ldots,n$. Then, by the representation (2) of $\varphi$, $[\varphi \le x'] = [\bar W \ge -n^{-1}h]$, where $\bar W$ is the average of the i.i.d. $W_j$'s. Since $P_x W_j = -h(1 - x - 2^{-1}h)$,

(2.15) $P_x[\varphi \le x'] = P_x[\bar W - P_x\bar W \ge (1 - x - n^{-1} - 2^{-1}h)h] .$

But $\mathrm{Var}(\bar W) = n^{-1}\mathrm{Var}(W_1) = hn^{-1}\{1 - (1-x)(2-x)h + (\tfrac{4}{3} - x)h^2 - 4^{-1}h^3\} \le (7/3)hn^{-1}$. Hence, by the Chebychev inequality and for large $n$,

rhs(15) $\le (7/3)\,h^{-1}n^{-1}(1 - x - n^{-1} - 2^{-1}h)^{-2}$,

which tends to zero when $nh \to \infty$ and $h \to 0$.

We are now ready to prove

Theorem 2. (i) If $h$ is a function of $n$ such that $nh^3 \to \infty$ and $h \to 0$, then for any $\tfrac{1}{4} > \varepsilon > 0$, there exists $N < +\infty$ so that for all $n \ge N$,

$D(\mathbf{0}, \boldsymbol{\theta}_T) > (\tfrac{1}{4} - \varepsilon)h^2 .$

(ii) If $h$ is a function of $n$ such that $nh \to \infty$, $h \to 0$ and $nh^3 = O(1)$, then for any $\tfrac{1}{3} > \varepsilon > 0$, there exists $N < +\infty$ so that for all $n \ge N$,

$D(\mathbf{0}, \boldsymbol{\theta}_T) > (\tfrac{1}{3} - \varepsilon)\,\dfrac{1}{nh} .$

Proof. (i) Since $nh^3 \to \infty$ and $h \to 0$ implies $nh \to \infty$ and $h \to 0$, we have by Lemmas 6, 7 and 8 that given $x \in (0,1)$,

(2.16) $(Y, Z) \xrightarrow{D} N(\mathbf{0}, I)$, $X \xrightarrow{P} -2^{-1}$, $[\varphi \ge x'] \xrightarrow{P} 1$.

Hence, in view of (8) it follows from Slutsky's Theorem that if $x \in (0,1)$, then $h^{-1}\varphi J \xrightarrow{P} -2^{-1}$ (see (5) for the definition of $J$). By the convergence theorem (7), we have

(2.17) $\liminf P_x(h^{-2}\varphi^2 J) \ge \tfrac{1}{4}[0 < x < 1]$,

and hence by Fatou's theorem applied to the lhs below,

$\liminf P\,P_x(h^{-2}\varphi^2 J) \ge P(\text{lhs}(17)) \ge \tfrac{1}{4} .$

Thus, by (6) we get that $\liminf h^{-2}D(\mathbf{0}, \boldsymbol{\theta}_T) \ge \tfrac{1}{4}$, and (i) follows because of the definition of lim inf.

To prove (ii) we first recognize that for this choice of $h$, (16) still holds. Let $S_n = \{\sqrt{nh}\,\varphi + 2^{-1}\sqrt{nh^3}\}\,J$. Then, in view of (8) it follows from Slutsky's theorem that if $x \in (0,1)$, then

$S_n \xrightarrow{D} N(0, x^2) .$

Since $P_x\{(nh)\varphi^2 J\} = P_x(S_n - 2^{-1}\sqrt{nh^3}\,J)^2 \ge \mathrm{Var}(S_n)$, applying Theorem A.1 in the Appendix to the rhs leads to

(2.18) $\liminf P_x\{(nh)\varphi^2 J\} \ge x^2[0 < x < 1] .$

Thus, by Fatou's Lemma applied to the lhs below,

$\liminf P\,P_x(nh\,\varphi^2 J) \ge P(\text{lhs}(18)) \ge \tfrac{1}{3} .$

Therefore by (6) we get that $\liminf\,(nh)D(\mathbf{0}, \boldsymbol{\theta}_T) \ge \tfrac{1}{3}$, and the definition of lim inf leads to (ii).
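The lower bounds of Theorem 2 can be explored by simulation. The following is a minimal Monte Carlo sketch, assuming the explicit ratio form of $\varphi$ in (2.2) and the retraction of $\varphi$ to the interval from $x' = x - 1$ to $x$; the function names, the seed handling and the decision to report the event $[v = 0]$ separately (rather than fix a convention for the undefined ratio) are choices made here for illustration, not part of the thesis.

```python
import random


def theta_hat(sample, x, h):
    """Retracted ratio estimate at theta = 0; None when no X_j lies in (x, x+h]."""
    window = [xj for xj in sample if x < xj <= x + h]
    v = len(window)
    if v == 0:
        return None
    num = sum(xj - h for xj in window)
    num -= h * sum(1 for xj in sample if 0.0 <= xj <= x)
    num -= h
    num += sum(1 for xj in sample if 0.0 <= xj <= x - 1.0 + h)
    return min(max(num / v, x - 1.0), x)


def regret_at_zero(n, h, reps, seed=0):
    """Monte Carlo estimates of P(theta_hat^2 [v > 0]) and of P[v = 0].

    X_1..X_n and x = X_{n+1} are drawn i.i.d. from U[0,1).
    """
    rng = random.Random(seed)
    sum_sq = 0.0
    empty = 0
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        x = rng.random()
        est = theta_hat(sample, x, h)
        if est is None:
            empty += 1
        else:
            sum_sq += est * est
    return sum_sq / reps, empty / reps


if __name__ == "__main__":
    n, h = 400, 0.25
    sq, empty = regret_at_zero(n, h, reps=2000, seed=1)
    # Benchmarks suggested by Theorem 2: h^2 / 4 and 1 / (3 n h).
    print(sq, empty, h * h / 4.0, 1.0 / (3.0 * n * h))
```

The printed benchmarks are the limiting constants of parts (i) and (ii); since both parts are asymptotic statements, a finite-sample estimate need not sit above either of them.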
Theorem 2(i) implies that at any parameter sequence $(\theta_1, \theta_2, \ldots)$ where $\theta_1 = \theta_2 = \cdots$, $\boldsymbol{\theta}_T$ with the choice $h = n^{-1/4}$ has modified regret converging to zero at a rate no faster than $n^{-1/2}$. This leaves open the possibility of strengthening Theorem 1 of §1 by this improved rate. The next section develops a positive result in this direction by obtaining the improved rate at a fixed parameter sequence.

§2.3. Procedures $\boldsymbol{\theta}_T$ where $D(\mathbf{0}, \boldsymbol{\theta}_T)$ is of Exact Order $O(h^2)$.

In this section we show that the modified regret $D(\mathbf{0}, \boldsymbol{\theta}_T)$ has an upper bound of order $O(h^2)$ when $n^{-1/4}h^{-1} = O(1)$, and by this choice of $h$ we have a lower bound with the same order of magnitude. Specifically, if $h = n^{-1/4}$ (up to constants), then we get convergence of exact order $n^{-1/2}$.

As in §2, let $X_1,\ldots,X_{n+1}$ be i.i.d. observations from $P = U[0,1)$ and fix $x = X_{n+1}$. Let $\hat\theta$ and $\varphi$ abbreviate $\theta_{T,n+1}$ and $\varphi_{n+1,n+1}$, respectively. Let $v = \sum_{j=1}^n [x < X_j \le x+h]$ so that (2.2) reads, for each fixed $x \in [0,1)$,

(3.1) $\varphi = v^{-1}\Bigl\{\sum_{j=1}^n \{(X_j - h)[x < X_j \le x+h] - h[0 \le X_j \le x] + [0 \le X_j \le x'+h]\} - h\Bigr\}$ a.e. $P_x$.

Since $\hat\theta$ is the retraction of $\varphi$ to $(x', x]$ where $0 \le x < 1$, we have $|\hat\theta| \le 1$ and

(3.2) $P\,\hat\theta^2 \le P[v = 0] + P(\hat\theta^2[v > 0]) .$

Note that

(3.3) $P[v = 0] = \int_0^{1-h} (1-h)^n\,dy + \int_{1-h}^1 y^n\,dy .$

The first term on rhs(3) is $(1-h)^{n+1}$. Hence, the inequalities $1 = ((1-h) + h)^{n+2} \ge (n+2)h(1-h)^{n+1}$ imply that it is bounded by $((n+2)h)^{-1}$. The second term on rhs(3) is bounded by $(n+1)^{-1}$, so that (3) implies

(3.4) $P[v = 0] = O((nh)^{-1}) .$

Now, consider the last term on rhs(2). Since $x' < 0 \le x$, by the Fubini representation of the integral (cf. (1.10) with $r = 2$) and (1) it follows that

(3.5) $P(\hat\theta^2[v > 0]) \le P\Bigl(\int_0^1 P_x[U > h]\,ds^2\Bigr) + P\Bigl(\int_0^1 P_x[V < h]\,ds^2\Bigr)$,

where $U = \sum_j U_j$, $V = \sum_j V_j$, and for each $j = 1,2,\ldots,n$,

(3.6) $U_j = (X_j - s - h)[x < X_j \le x+h] - h[0 < X_j \le x] + [0 < X_j \le x'+h]$

and

(3.7) $V_j = U_j + 2s[x < X_j \le x+h] .$

The $U_j$ are i.i.d. with mean $-sh - 2^{-1}h^2$ for $x \in [0, 1-h)$ and mean $-s(1-x) - 2^{-1}(1-x)^2$ for $x \in [1-h, 1)$.
Then, by (7), each $V_j$ has mean

(3.8) $P_x V_j = sh - 2^{-1}h^2$ for $x \in [0, 1-h)$; $P_x V_j = s(1-x) - 2^{-1}(1-x)^2$ for $x \in [1-h, 1)$.

We will use the bound 2 for the range of $U_j$ and $V_j$ for all $0 \le s \le 1$ and $0 \le x < 1$ in Theorem 2 of Hoeffding (1963) in order to bound the tail probabilities in (5). Moreover, the Hoeffding bound developed for the last term of rhs(5) will also bound the first term, since $P_x V$ is closer to $h$ than $P_x U$ is to $h$. Having noticed these facts we now prove

Lemma 9.

(3.9) $P(\hat\theta^2[v > 0]) = O(h^2 + n^{-1/2} + (nh^2)^{-1}) .$

Proof. In view of the comment preceding Lemma 9 it suffices to show that the last term of rhs(5) has the order indicated in (9). Let $\bar V = n^{-1}V$. Using (8) and letting $a_1 = n^{-1} + \tfrac{1}{2}h$ and $a_2 = hn^{-1}(1-x)^{-1} + \tfrac{1}{2}(1-x)$, we have

$P_x[V < h] = P_x[-\bar V + P_x\bar V > h(s - a_1)]$ for $x \in [0, 1-h)$; $P_x[V < h] = P_x[-\bar V + P_x\bar V > (1-x)(s - a_2)]$ for $x \in [1-h, 1)$.

Applying Theorem 2 of Hoeffding (1963) with the bound 2 for the range of the $V_j$ gives

(3.10) $P_x[V < h] \le \exp\{-\tfrac{1}{2}b_1((s - a_1)_+)^2\}$ for $x \in [0, 1-h)$; $P_x[V < h] \le \exp\{-\tfrac{1}{2}b_2((s - a_2)_+)^2\}$ for $x \in [1-h, 1)$,

where $b_1 = nh^2$ and $b_2 = n(1-x)^2$. A direct calculation, together with the fact that $\int_0^\infty \exp(-\tfrac{1}{2}y^2)\,dy = \sqrt{\pi/2}$, shows that

(3.11) $\int_0^1 \exp\{-\tfrac{1}{2}b((s-a)_+)^2\}\,ds^2 \le \int_0^a ds^2 + \int_a^\infty e^{-\frac{1}{2}b(s-a)^2}\,ds^2 \le a^2 + (2b^{-1} + \sqrt{2\pi}\,a\,b^{-1/2})$,

where $a, b \ge 0$. Applying (10) and (11) with $a = a_1$ and $b = b_1$ gives

(3.12) $P\Bigl\{[0 \le x < 1-h]\int_0^1 P_x[V < h]\,ds^2\Bigr\} = O(h^2 + n^{-1/2} + (nh^2)^{-1}) .$

The treatment of the $[1-h \le x < 1]$ part involves more analysis because of the dependence of $a_2$ and $b_2$ on $x$. By (11) with $a = a_2$ and $b = b_2$, and by $h \le 2^{-1}$,

(3.13) $\int_0^1 P_x[V < h]\,ds^2 \le \Bigl\{\Bigl(\dfrac{h}{n(1-x)} + \dfrac{1-x}{2}\Bigr)^2 + c_1 n^{-1/2} + \dfrac{c_2}{n(1-x)^2}\Bigr\} \wedge 1 \le 2^{-1}(1-x)^2 + (2 + c_1)n^{-1/2} + \Bigl\{\dfrac{4 + c_2}{n(1-x)^2}\Bigr\} \wedge 1$

for $x \in [1-h, 1)$ and constants $c_1$ and $c_2$. But, for $d > 0$,

(3.14) $\int_{1-h}^1 \dfrac{d^2}{(1-y)^2} \wedge 1\,dy \le \int_{(1-h)\vee(1-d)}^1 dy + \int_{1-h}^{(1-h)\vee(1-d)} \dfrac{d^2}{(1-y)^2}\,dy = h \wedge d + d^2\{(h \wedge d)^{-1} - h^{-1}\} \le 2d .$

Thus,

(3.15) $P([1-h \le x < 1]\,\text{lhs}(13)) = O(h^3 + n^{-1/2}) .$

Therefore, (9) follows from (15), (13), (12) and (5).
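The closed form used for the truncated integral in (3.14) is easy to check numerically. The sketch below assumes the integrand reads $\{d^2/(1-y)^2\} \wedge 1$ over $y \in [1-h, 1]$ (the reading under which the stated value $h \wedge d + d^2\{(h \wedge d)^{-1} - h^{-1}\}$ is exact); the function names and the midpoint rule are illustrative choices, not taken from the thesis.

```python
def truncated_integral_closed_form(h, d):
    """h ^ d + d^2 * ((h ^ d)^(-1) - h^(-1)), the value stated in (3.14)."""
    m = min(h, d)
    return m + d * d * (1.0 / m - 1.0 / h)


def truncated_integral_numeric(h, d, steps=20000):
    """Midpoint rule for the integral of min(d^2 / u^2, 1) over u = 1 - y in (0, h)."""
    width = h / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * width
        total += min(d * d / (u * u), 1.0)
    return total * width


if __name__ == "__main__":
    for h, d in [(0.3, 0.05), (0.3, 0.5), (0.5, 0.5)]:
        closed = truncated_integral_closed_form(h, d)
        numeric = truncated_integral_numeric(h, d)
        # The last column is the bound 2d claimed at the end of (3.14).
        print(h, d, closed, numeric, 2.0 * d)
```

With $d$ of order $n^{-1/2}$ this is the step that turns the last term of (3.13) into an $O(n^{-1/2})$ contribution in (3.15).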
Hence, by (2.1), (2), (4) and Lemma 9, we get the following lemma:

Lemma 10.

(3.16) $D(\mathbf{0}, \boldsymbol{\theta}_T) = O(h^2 + n^{-1/2} + (nh^2)^{-1}) .$

Theorem 3. If $\boldsymbol{\theta}_T$ is defined by (0.4) with $h$ such that $n^{-1/4}h^{-1} = O(1)$, then there exists a constant $c_3$ so that for sufficiently large $n$,

$c_3^{-1}h^2 \le D(\mathbf{0}, \boldsymbol{\theta}_T) \le c_3 h^2 .$

Proof. If $n^{-1/4}h^{-1} = O(1)$, then there exists some constant $0 < M < +\infty$ such that

(3.17) $h \ge Mn^{-1/4} .$

Therefore, by Lemma 10 there exists a constant $c_4$ such that $D(\mathbf{0}, \boldsymbol{\theta}_T) \le c_4 h^2$ for sufficiently large $n$. On the other hand, by (17), $nh^3 \ge n^{1/4}M^3 \to +\infty$ as $n \to +\infty$. Hence, (i) of Theorem 2 in section 2 holds. Letting $c_3 > c_4 \vee 4$, we pick $\varepsilon > 0$ in (i) of Theorem 2 so that $\varepsilon < 4^{-1} - c_3^{-1}$. Then, from (i) of Theorem 2 the first inequality in Theorem 3 follows.

Remark. Here, we shall state values of $h$ and of bounds of the modified regret up to constants. That is, for example, $h = n^{-\alpha}$ means $h = cn^{-\alpha}$ for some positive constant $c$. We let large $n$ be fixed.

The lower bounds of $D(\mathbf{0}, \boldsymbol{\theta}_T)$ in Theorem 2 describe the strictly convex curve which attains the minimum value $n^{-2/3}$ at $h = n^{-1/3}$, has the value $(nh)^{-1}$ for $h$ less than $n^{-1/3}$ and $h^2$ for $h$ greater than $n^{-1/3}$.

On the other hand, the upper bound in (16) describes the strictly convex curve which attains its minimum value $n^{-1/2}$ at $h = n^{-1/4}$, has the value $(nh^2)^{-1}$ for $h$ less than $n^{-1/4}$ and $h^2$ for $h$ greater than $n^{-1/4}$.

Hence, these two curves coincide with each other for $h$ greater than and equivalent to $n^{-1/4}$, and attain the best exact order $n^{-1/2}$ at $h = n^{-1/4}$. For $h$ less than $n^{-1/4}$, our (upper and lower) bounds are not necessarily close.

§2.4. The One-Stage Procedure $\boldsymbol{\phi}$.

Consider the procedure $\boldsymbol{\phi}$ (0.6) originally considered by Fox (1968) in the empirical Bayes problem. It is a corollary to Theorem 2 of Chapter III that with bounded $\Omega$ and the choice $h = n^{-1/4}$, $|D(\boldsymbol{\theta}, \boldsymbol{\phi})| = O(n^{-1/4})$ uniformly in parameter sequences, the same rate established for $\boldsymbol{\theta}_T$ in Theorem 1.

In this section we study the risk behavior of $\boldsymbol{\phi}$ at $\mathbf{0}$.
For the modified regret of $\boldsymbol{\phi}$ we find the same lower bound as established in Theorem 2(ii) for $\boldsymbol{\theta}_T$ and an upper bound which is $O(n^{-1/2})$ for the choice $h = n^{-1/4}$.

As in sections 2 and 3, let $X_1,\ldots,X_{n+1}$ be i.i.d. $P = U[0,1)$ and replace $n$ by $n+1$ in the definition (0.6) of $\boldsymbol{\phi}$. Fix $x = X_{n+1}$ and let $\phi$ and $\psi$ abbreviate $\phi_{n+1,n+1}$ and $\psi_{n+1,n+1}$ in (0.6) and (0.7). As in section 2 define

$u = \sum_{j=1}^n [0 \le X_j \le x]$ and $v = \sum_{j=1}^n [x < X_j \le x+h] .$

Then, by (0.7), applying the definitions of $F^*$ and $T_{n+1}$ we can show, as we have done for $\varphi$ in (2.2), that $\psi$ has the following explicit form: for $x \in [0,1)$,

(4.1) $\psi = v^{-1}\Bigl\{xv - hu - h + \sum_{j=1}^n [0 \le X_j \le x'+h]\Bigr\}$ a.e. $P_x$.

From Lemma 5 in section 2 and the above (1), we can easily see that

(4.2) $0 \le \psi - \varphi \le h .$

In the same manner as (2.1) was obtained,

(4.3) $D(\mathbf{0}, \boldsymbol{\phi}) = P\,\phi^2 .$

Note that

$\phi = x' \vee \psi$ for $x \in [0, 1-h)$; $\phi = (x' \vee \psi) \wedge x$ for $x \in [1-h, 1)$.

We get a lower bound first. Let $J = [\psi \ge x',\ x < 1-h]$. By (1) and the definition of $\phi$,

(4.4) $P\,\phi^2 \ge P(\psi^2 J) .$

Define $Y = [nx(1-x)]^{-1/2}(u - nx)$ and $Z = (nh)^{-1/2}(v - nh)$. Then, we can easily see that

(4.5) $\psi = \dfrac{(nh)^{-1/2}\,xZ - n^{-1} - \sqrt{x(1-x)}\;n^{-1/2}\,Y}{1 + (nh)^{-1/2}Z} .$

Theorem 4. If $h$ is a function of $n$ such that $nh \to \infty$ and $h \to 0$, then for any $\tfrac{1}{3} > \varepsilon > 0$, there exists $N < +\infty$ so that for all $n \ge N$,

$D(\mathbf{0}, \boldsymbol{\phi}) > (\tfrac{1}{3} - \varepsilon)\,\dfrac{1}{nh} .$

Proof. Fix $x \in (0,1)$ until (7). Since by (2) $\varphi \le \psi$, it follows by Lemma 8 in section 2 that

(4.6) $P_x[\psi \le x'] \le P_x[\varphi \le x'] \to 0$, for given $x$.

Since by Lemma 6 in section 2, $(Y, Z) \xrightarrow{D} N(\mathbf{0}, I)$, where $\mathbf{0}$ is the 2-dimensional zero vector and $I$ the $2 \times 2$ identity matrix, and since by (6) $J \xrightarrow{P} 1$, it follows from Slutsky's theorem applied to rhs(5) that if $x \in (0,1)$, then

$\sqrt{nh}\,\psi J \xrightarrow{D} N(0, x^2) .$

As a consequence of the convergence theorem (2.7) (cf. Loève (1963) 11.4 A(i)) we have

(4.7) $\liminf\,(nh)P_x(\psi^2 J) \ge x^2[0 < x < 1] .$

Thus, by Fatou's Lemma applied to the lhs below,

$\liminf P\,P_x(nh\,\psi^2 J) \ge P(\text{lhs}(7)) \ge \int_0^1 y^2\,dy = 3^{-1} .$

Therefore, in view of (3) and (4), $\liminf\,(nh)D(\mathbf{0}, \boldsymbol{\phi}) \ge 3^{-1}$, and the definition of lim inf leads to the conclusion.
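The sandwich relation (4.2) between the two ratios can be checked directly on simulated data. The sketch below assumes the explicit forms (2.2) for $\varphi$ and (4.1) for $\psi$, which share the denominator $v$ and differ only in the leading sum; the function name `phi_and_psi` and the `None` return for $v = 0$ are hypothetical conventions adopted for the illustration.

```python
import random


def phi_and_psi(sample, x, h):
    """Return (phi, psi): the ratio of (2.2) and the ratio of (4.1).

    Both are undefined when no X_j lies in (x, x+h]; None is returned then.
    """
    window = [xj for xj in sample if x < xj <= x + h]
    v = len(window)
    if v == 0:
        return None
    u = sum(1 for xj in sample if 0.0 <= xj <= x)
    wrap = sum(1 for xj in sample if 0.0 <= xj <= x - 1.0 + h)
    common = -h * u - h + wrap  # terms shared by both numerators
    phi = (sum(xj - h for xj in window) + common) / v
    psi = (x * v + common) / v
    return phi, psi


if __name__ == "__main__":
    rng = random.Random(0)
    h = 0.2
    for _ in range(5):
        sample = [rng.random() for _ in range(50)]
        x = rng.random()
        pair = phi_and_psi(sample, x, h)
        if pair is not None:
            phi, psi = pair
            print(round(psi - phi, 6), "lies between 0 and", h)
```

The difference $\psi - \varphi$ equals the average of $x + h - X_j$ over the observations in $(x, x+h]$, which is why it never leaves $[0, h]$.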
We shall now find an upper bound. In the same manner as (3.2) was obtained,

(4.8) $P\,\phi^2 \le P[v = 0] + P(\phi^2[v > 0]) .$

In view of (3.4), we only consider the last term. As in (3.5),

(4.9) $P(\phi^2[v > 0]) \le P\Bigl\{\int_0^1 P_x[U > h]\,ds^2 + \int_0^1 P_x[V < h]\,ds^2\Bigr\}$,

where $U = \sum_j U_j$, $V = \sum_j V_j$ and, for each $j = 1,2,\ldots,n$,

(4.10) $U_j = (x - s)[x < X_j \le x+h] - h[0 \le X_j \le x] + [0 \le X_j \le x'+h]$

and

$V_j = U_j + 2s[x < X_j \le x+h] .$

The $V_j$ are i.i.d. with

(4.11) $P_x V_j = sh$ for $x \in [0, 1-h)$; $P_x V_j = (1-x)(s - 1 + x + h)$ for $x \in [1-h, 1)$.

Note that each $U_j$ has mean $-sh$ for $x \in [0, 1-h)$ and mean $(1-x)(-s - 1 + x + h)$ for $x \in [1-h, 1)$.

We will use the bound 3 for the range of $V_j$ and $U_j$ for all $0 \le s \le 1$ and $0 \le x < 1$ in Theorem 2 of Hoeffding (1963) in order to bound the two tail probabilities in (9). As in §3, for $0 \le x < 1-h$ the bound developed for the last term in the curly brackets of rhs(9) will also bound the first term in the same brackets, because $P_x V$ is closer to $h$ than $P_x U$ is to $h$. But, for $x \in [1-h, 1)$ such ordering varies with the value of $x$, and hence this case requires more treatment. Using these facts we prove

Lemma 11.

(4.12) $P(\phi^2[v > 0]) = O((nh^2)^{-1} + h^3 + hn^{-1/2}\log n + n^{-1/2}) .$

Proof. Let $\bar V = n^{-1}V$. Using (11) and letting $a_1 = n^{-1}$ and $a_2 = x + h - 1 - n^{-1}h(1-x)^{-1}$, we get

(4.13) $P_x[V < h] = P_x[-\bar V + P_x\bar V > h(s - a_1)]$ for $x \in [0, 1-h)$; $P_x[V < h] = P_x[-\bar V + P_x\bar V > (1-x)(s + a_2)]$ for $x \in [1-h, 1)$.

Fix $x \in [0, 1-h)$ until (15). We shall find the bound for the second term in the curly brackets of rhs(9) and double it to bound the curly brackets of rhs(9). By Theorem 2 of Hoeffding (1963) with the bound 3 for the range of the $U_j$ and $V_j$,

(4.14) $P_x[U > h] + P_x[V < h] \le 2\exp\{-\tfrac{1}{2}b_1((s - a_1)_+)^2\}$,

where $b_1 = 4nh^2/9$. Applying (14) and (3.11) with $a = a_1$ and $b = b_1$ gives

(4.15) $P\Bigl([0 \le x < 1-h]\int_0^1 \text{lhs}(14)\,ds^2\Bigr) = O((nh^2)^{-1}) .$

Fix $x \in [1-h, 1)$ until (19) and let $\bar U = n^{-1}U$. By the statement after (11),

(4.16) $P_x[U > h] = P_x[\bar U - P_x\bar U > (1-x)(s - a_2)] .$
Applying Hoeffding's bound (as (14) is obtained) to rhs (13) for x E [l-h,l) and rhs(l6) and weakening the bound as below, shows (4.17) lhs(l4) g 2 exp{- 6 b2((s-a2)+ A (s+a2)+)2} 2 where b2 = 4n(l-x) /9. Notice that a2 > 0 iff (l-h <)5l < x < 62 (< l) where - --\ _—._.-- _-___ -1 t 2 - ~ - V~" _ 51 = 1-2 (h + Jh -4n 1h) and 62 = 1-2 1(h - éh2-4n 1h) Hence, 2 - C -1 EeXP{‘% b2((s+a2)+) ], for X E (61.52) fl[1-h,l) (4.18) 2 rhs(l7) =j { exp{-% b2((s-a2)+)2], for x 6 (61,6 ) 2 67 Rec0gnizing lhs(3.ll) S l and applying (l8) and (3.11) with b = b2 c . and a — -a2 for x 6 (61,62) 0 [l-h,l), — a for x E (01 2 ’52)’ and weakening the bounds results in L9t14£2%.A 1 , for x 6 (61,62)Cr[1—h,l) 1 2 4n(1-X) (4.19) 10 (18) ds 3 2 9 '1 (x+h-l) +(——-——-—§)A1+c h(/FKI-x)) , 6 2n(1-x) for X C (61,62) where c6 is some positive constant. Simple computation gives us 1 2 _ 3 §1_h(y+h-1) dy - h /3. 6 :6i(1-y)-1dy = log((l-51)/(l-62)) S log n Also, by (3.14) Hence, we can easily check 68 (4.20) P([l-h S x < 1]]; (18) dsz) = 0(h3 + hn-%log n + n-%) 1 2 Thus, in view of (20), (18) and (17), P([i-h s x < 1]]01hs(14) ds ) equals rhs(20). Therefore, (12) follows from this, (15), (14) and (9). I Applying Lemma 11 and (3.4) to (8) we obtain, in view of (8) and (3), the following upper bound of D(0, ¢). Theorem 5, 2 - - - D(O,¢) = 0((nh ) 1 + h3 +'hn %log n + n 35) . Remark . As in the remark of section 3, we shall state values of h and of bounds of the modified regrets, up to constants. We let large n be fixed. The lower bound of D(0, ¢) in Theorem 4 describes the hyperbola (nh).1 which coincides with (up to constants) that of the lower bound D(0, 3T) in Theorem 2, for h less than n-1/3, -1 . and then decreases to n as h increases. 0n the other hand, the upper bound of D(0, @) in Theorem 5 describes the strictly convex curve which attains the minimum -% 2 1/6 value n for n- S h S;n- , has the value (nhz).1 for h H - 3 - less than n and h for h greater than n 1,6. 
Thus, from the remark at the end of section 3, we can easily see that for $h$ greater than $n^{-1/4}$, $D(\mathbf{0}, \boldsymbol{\phi})$ must be strictly below $D(\mathbf{0}, \boldsymbol{\theta}_T)$. Hence, for such $h$, $\boldsymbol{\phi}$ is strictly better than $\boldsymbol{\theta}_T$.

§2.5. A Counterexample to $D(\boldsymbol{\theta}, \mathbf{t}) \to 0$ for $\Omega = R$.

In §1 we demonstrated a procedure $\boldsymbol{\theta}_T$ such that $D(\boldsymbol{\theta}, \boldsymbol{\theta}_T) = O(n^{-1/4})$ uniformly in $\boldsymbol{\theta}$ in case of a bounded parameter set $\Omega = [c,d]$. Here we prove that the boundedness assumption on $\Omega$ is necessary for the modified regret to converge to zero.

Theorem 6. Let $X_1, X_2, \ldots$ be independent random variables where for each $j$, $X_j \sim U[\theta_j, \theta_j + 1)$, $\theta_j \in \Omega = R$. Let $\mathbf{t}(\mathbf{X}) = (t_1(\mathbf{X}),\ldots,t_n(\mathbf{X}))$ be an estimator of $\boldsymbol{\theta} = (\theta_1,\ldots,\theta_n)$, $n = 1,2,\ldots$ Then there exists a sequence $(\theta_1, \theta_2, \ldots) \in R^\infty$ such that $\limsup_n D(\boldsymbol{\theta}, \mathbf{t}) > 0$.

Proof. Since for each $j$, $P(t_j(\mathbf{X}) - \theta_j)^2 \ge P_j(P_x(t_j(\mathbf{X})) - \theta_j)^2$, it follows that

(5.1) $D(\boldsymbol{\theta}, \mathbf{t}) \ge n^{-1}\sum_{j=1}^n P_j(P_x(t_j(\mathbf{X})) - \theta_j)^2 - R(G) .$

Now, let $\mu$ be a joint prior measure on $(\theta_1, \theta_2, \ldots)$. Let $\mu^{\theta_j}$ be the conditional measure given $\theta_j$ and let $\mu_j$ be the marginal measure of $\theta_j$. Then, setting $s_j = \mu^{\theta_j}P_x(t_j(\mathbf{X}))$, $j = 1,2,\ldots,n$, we have that

(5.2) $\mu\Bigl\{n^{-1}\sum_{j=1}^n P_j(P_x(t_j(\mathbf{X})) - \theta_j)^2\Bigr\} \ge n^{-1}\sum_{j=1}^n \mu_j P_j(s_j - \theta_j)^2 .$

Now consider $\mu = \mu_1 \times \mu_2 \times \cdots$ where $\mu_j$ puts mass $\tfrac{1}{2}$ on each of the values $2j \pm r$, $j \ge 1$, where $r$ is some fixed number such that $0 < r < \tfrac{1}{2}$. Then

(5.3) $\mu_j P_j(s_j - \theta_j)^2 = \tfrac{1}{2}P_{2j-r}(s_j - (2j-r))^2 + \tfrac{1}{2}P_{2j+r}(s_j - (2j+r))^2 \ge \int_{2j+r}^{2j+1-r} \{\tfrac{1}{2}(s_j - (2j-r))^2 + \tfrac{1}{2}(s_j - (2j+r))^2\}\,dx \ge r^2(1-2r)$,

where the last inequality follows since the integrand on the lhs is not less than $r^2$.

Since $R(G) = n^{-1}\sum_{j=1}^n P_j(\theta_{jn} - \theta_j)^2$, where $\theta_{jn}$ is defined by the posterior mean (1.0.2) with $q = 1$, and since the $\theta_j$'s are apart from each other by more than 1, $\theta_{jn} = \theta_j$ for all $j$ and hence $R(G) = 0$. Thus, $\mu(R(G)) = 0$. Therefore, in view of (1), (2) and (3),

(5.4) $\mu\{D(\boldsymbol{\theta}, \mathbf{t})\} \ge r^2(1-2r)$

for all $n$. The retraction $\mathbf{t}^*$ of $\mathbf{t}$ formed by taking $t_j^* = (X_j \wedge t_j) \vee X_j'$ has modified regret bounded by 1 and satisfies (4).
Therefore, usin Fatou's lemma ives 8 g (5.5) u{ling(e, t*)} 2 12;;{u D(e, t*)} 2 r2(1-2r) > 0 . .___ .___ * By limnD(e, t) 2 limnD(9, t ) and (5), there exists a (61,92,...) 6 R such that 1imn n(e, t) > 0.. co CHAPTER III RATES FOR ONE -STAGE PROCEDURES IN THE k-EXI‘ENDED PROBLEM §3.0. Introduction Let X1,...,Xn be independent random variables each X, having the distribution Pj 6.9(f) (see (1.0.1) for the definition of .9(f)) where f satisfies the assumptions (stronger here in that f is now assumed bounded away from zero) that for given finite positive constants m (> 0) and M (2 0), (0.1) m" s f s 1 and -1 -1 -1 (0.2) v{(v-u) \(f(v)) -(f(w) \ : u < v} SM Throughout the chapter, we assume 0 = [c,d] with -oo < c S d < 00. R For each J = k,k+1,...,n, let ij = (zj-k+l’°"’zj)’ ; L '1 _ x - Xj, y - §§_1 and §.— (y,x). Throughout this chapter, G is interpreted as the empirical distribution of the (n-k+1) k-tuples 31:,91:+1,...,9: of parameters. ~ (qtn’°°"enn) where e'n is the posterior mean of ej given )_<_ wrt the prior J Let 96 be the k-extended Procedure with 9600 71 72 , namely , (0.3) 9. J“: 115+ raj q<§>flg§flf§+ q<§>dG<9§> where the affix + is intended to describe the integration as over (x',x_]. The modified regret for any procedure t relative ~ to the k-extended envelope is given by 2 k = _ _ _ 2 D(e.t) Av. Q(tjq) e) 13(9jn ej)} ~~ J where Av. means the average over 3‘ = k,...,n. Since Xf < ejn s X , J when X]: < tj(X) s Xj , we have \ (0.4) 2-1\Dk(e,t)‘ sAv. P\tj(X) - ejn We here introduce two one-stage procedures which are respective generalizations of unextended (k = l) procedures, 9T treated for 9(1) and 05 treated for the uniform distribution on the unit interval [0,1) in Chapter II. We exhibit in Section 1 the k-extended version of 9T for Q(f) and in Section 2 that of (b for 0(1) with the same rate (2k 4- 2) ‘1. 73 §3.1 A Procedure 9T with a Rate (2k +.2)'1. 
~ Hereafter throughout this Chapter, we interpret er as a ~ k-extended version of the procedure treated in Sections 1, 2 and 3 of Chapter II. We first derive 3T in analogous way used in §2.0. We ~ then bound the modified regret of 9T using Lemmas 1 through 5 and.Proposition 1 (analogous respectively to Lemmas 1.7, 1.1, 1.3, II.l and II.2 and'Proposition 1.1). Let u.= (v,u) E R X R and similarly (Q = (w,e) E Rkn1 X R. Let f(v) = fling). f(2) = ff(u> and «mm = n‘i‘iqmi). qua) = Q(w)Q(e)- Let Q be the measure with density q(u) at u. wrt G. Note that, since (0.1) hmplies q s m, (1.1) q(g) S‘mk for all 2.6 RR For fixed j, we abbreviate R: to x = (y,x) where y is the ~ first k—l coordinates and x 5 Xj. In view of (0.3), a. (1.2) e. =J“ 'e dQ(§)/j‘ 'dQ(Q) Vj =k,k+l,...,n B?§ The following lemma generalizes Lemma 1.7 with h E 1. Lemma 1. Let T be a signed.measure and I = (u),uj be a cube with T(I) # 0. Let Tu be the signed measure with density I/‘T(I) wrt T. Then, 74 — 1 I ' Isdeg(§) — uk f0 TUESk 2 uk + tjdt Proof. By the Fubini theorem applied to the lhs of the second equality below, f(uk-sk)d72(§) = f j:k-u£dt dTE(§) = I; TEESk s u; + t]dt . | Applying Lemma 1 with T the measure with density (Q(fi',§])-1 wrt Q, gives us that (1.3) ejn = x - njn where (1.4) “jn = f; Q((y'.y] x (X',X'+t])dt/Q(§'.§] . For every u_€ R , let 05> Mm=fp§mwm) where 75 k-l paw = pw(V)pe(U) = i=1 pwi(vi)pe(u) Then, by the form of densities (1.0.1) , (1.6) 6(2) = momma] . Hence, V (v,u) E Rka1 X R, (1.7) Q((V'NJ X (..,u1) = 2: ggilfiar) ’ where z abbreviates summation wrt the non-negative integer r and (also in (10) below) involves at most G(n) - 9(1) + 2 terms (when 0 = [c,d], at most d-c+2). (7) is a generalized form of (2 .0 .2) . Letting (1.8) F*(Q = Av. [X3( 3 u] where throughout this chapter Av. means average over and, for any 0 < h < 1, (1.9) Mm) = h'kF*(2.e+hl]. 76 _ * we estimate p(u) by AF (g). 
Thus, in view of (7), we estimate Q((v',v] X (-m,u]) by (a generalization of (2.0.3)) AF*( 2; (1'10) “3) = 3 f(v)¥(:-:) and Q(gfl.g] ‘by u * (1.11) T(v,°)]u, = AF (2)/f(2) Since 0 sLThn s l, we finally estimate ejn by (1.12) 9rj=x‘(0V(pjn)/\1 where 1 I (1.13) «3“ = $0 T(y.~>]:.+tdt/T]:. and 0/0 is taken to be zero. To get an upper bound of the modified regret for ET we use an analogous method to that used in Theorem 1 in Chapter II. The following lemma is a generalization of Lemma 1.1 with 5 = 0 = e. k Let {j =Pj_k+1 x...x Pj Vj =k,...,n. 77 UN: L" m n Lanma I =(y_',u_]. Then k k k 0 O Q 2- O (1 14) Av Ej {15(9J)/Q(I£)] s (d+ c) Proof. Since by the definition of Q, 5 k I = Q(I ) it follows by k usages of f s 1 that lhs(lh) = fig : Q(Iu) > O}f(L_1)dl_1_ s (d+2-c)k . The next lemma is a generalized analogue of Lemma 1.3 with n = 0. Lemma ;. For 3 st ERR, k (1.16) AV- P1;(G(2<_-£. §-§]/Q(2s'.£]) S H (ti-Si) " i=1 Proof. In view of (15) with e = 0 and |: u Is 111806) = [0201' .27.] > 016(2-£.w_-§Jf(2)d! S [G(z-Efl-aldw. - 78 By the Fubini Theorem, the rhs equals jg: d! dG(_® = rhs(l6) .I By the definitions ((3) and (12), respectively) of 9. and aT,j’ and by the fact that 0 s njn s l, (1.17) \e'r,j - ejn\ s \njn - (pjn‘ A 1 . But, by Lemma 1.5 and by another usage of 0 s njn s 1, it follows v. = x... x? that with {bk P1 XPj-k ij XPn (1.18) Ej’kfin / I ‘1 1 . x'+t .- jn - ijn‘ /\ 1) :- 2(Q(?£ :51) {Ejk‘j‘o T(y, )JX' dt 1 v - I0 Q((y',y] x (x',x'+t])dt\ + ZEj,k\T(y.°)]:.'(Q(£,§]) 1\} The following lemma is a modified analogue to Lemma 11.1. It is used to prove the forthcoming Lemma 5 which is a generalized analogue to Lemma 11.2. . k- Lemma 4. For g = (v,u) c R 1 x R and each j, 79 RV 1 v+hl u-r+h v (1.19) h Ej,k(T(‘—9) = Z f(v)f(u-r) I j Qj(s_)f(§_)d§ t=v s=u-v +(n-k)-1 Z{[v < y < v+hl, u-r < x $.u-r+h]/f(v)f(u-r)}, where 0j(B) é Q(B) - (n-k)-1[(y,x) E B]q(§:), V B E 5% . Proof. 
By the definition (10) of T, (1.20) 119 = h"‘ Av.{g[v < 9513:} s.v+hl, u-r < Xj s u-r+h]/(f(v)f(u-r))}. -2 The i=jth term gives h (second term of rhs (19)). Now, since for fixed r, taking the average operation in- side of the integrals gives us that k 9i+l -1 ~ k (n-k+l) Z:=k,#j k [v < t s v+hl, u-r < s s u-r+h]q(gi)f(§)d§ ~1 _ v+hl u-r~+h ., . t=v Is=u~r Qj(§ ’§]f(§)d§ ' Multiplying by (f(v)f(u-r))-1 and summing over r leads to the first term of rhs(l9) .' Lgmma’ég For 0 s t s l, 80 Q((Y'i'hldl X (X'+h,x'+t])[t 2 h]-a -3136" >113” s Q((yuwhu x (x',x'+t+h]) + o. -1 where a = 0(h+n ). Proof. We obtain by two applications of Lemma 3 (note that the second term of rhs(l9) with s=x is zero) and a change of variable u' to u in the second inner integral below, kv X '+t (1-21) h Ej,kT(y’ )lxu = lymm f(y) ME)“ ”h 41—65. ( XY “3“) where y+hl f(v ) .+h v , x'+t A= Iy f(y) [I. Qj((v ,v]X(-m,u])du]x. dv . Making a change of variable w = u-t in the positive term of A, using (22) to bound the f-ratios as needed for (21) and applying (1) gives us A s I31Ix:%((v',v]X(w,,w+t])dwdv(+50(hk+1) By this and by the upper bound of (22) together with the assumption h < 1, applied to the integral of the second term of rhs(24), 82 ::+h éj((v',v]x(w,w+t])dwdv +'hkq . rhs(24) s fi+hl I ( ') (1.25) (2) Weakening the bounds and applying Q-(n-1)D1 s 6j s Q gives, in view of (25), (24), (23) and (21), the bounds of asserted Lemma 5.. We finally introduce the following pr0position which gen- eralizes Proposition 11.1. 1,...,YN be (k-l)-dependent random variables and a s'Yi s b for all i = 1,2,...,N. Then, for Proposition 1. Let Y any fl and every N, (1.26) E\§ - n\ s \E; - ni +'h:§'/n72 . «51 where M is the greatest integer s N/k. Proof. We prove in the same way as Proposition 11.1. The extension of Theorem 2 of Hoeffding (1963, Section 5d) to (k-l)-dependent random variables gives 2M PUT- - ET‘ > s]ds s 2 exp{- Li] (b'a) Hence 83 E)? - ET] s 2f: exp{-252M(b-a)-2}ds = (b-aL/fi7(2M) . 
Thus, the triangle inequality leads to the asserted bound.‘ We shall now go back to (18). To get an upper bound of the k average expectation wrt Pj of lhs(18), we apply the following ~ Lemmas 6 and 7. These are replacements of Lemmas 11.3 and 11.4 combined with multiplication by (Q(xf,x])-1 and the average expectation. Let f(- é (n.k)-1£j],l=i=k Ki for any random variables K1 Lemma é, k -1 ,.1 v ' (1-27) AV- Ej{(Q(§'.§]) (J0\Ej,k(T(y.-J:.+t) -Q((y'.y]X(X',X'fi1)\dt)} = 0(h + n'1 + ([6 hk)’1). Proof. For notational simplicity we prove only for k = 2. Fix j until (32) and t 6 [0,1] until (30). Define -2 - Wi = h (f(y)) 1[y < Xi-l s y+h]g([x'+t-r < Xi s x'+t-r+h]/f(x'+t-r) -[x'-r < Xi S x'-r+h]/f(x'-r)) for i = 2,3,...,n. In view of U (20), we can directly verify that T(y,')]:,+t = W. Since W1 is a function of Xi-l and Xi’ the sequence of random variables W2,...,Wn are l-dependent. Also \Wi\ szmzh‘Z for all i. Thus, by Proposition 1, with b-a = ZmZh-Z, lllll' I ill-[III lit 84 (1.28) Ej,2\T(y,')1:v+t -Q(\ 2 _ v ' 2 s \Ej.2\ + m 2 (‘3' A x h where K is the least integer greater than %n-l. Let us denote the first term of rhs(28) by \P-Q\ where P and Q are defined by positional correspondence. Using \P-Q‘ = (Q-P)+ + (P-Q)+, applying the lower and upper bounds of Lemma 5 to P in (Q-P)+ and (P-Q)+, respectively, and performing simple computations, we bound \P-Q\ by Q((y'.y]X(X'.X'+t])[t < 111+ {Q((y'.y'+h]X(X'.X'+t]) (1.29) + Q((y'+h,y]x(x',x'+h])}[t 2 h] + Q((y',y]x(x'+t,x'+t+h]) + Q((y,y+h]X(X',x'+t+h]) + 0(h+n-1) By (1) with k = 2, Q((y',y]x(x'+t,x'+t+h]) s.m2 X G((y',y]x(x'+t,x'+t+h]). Also, from the Fubini Theorem x+h 1 ' -m . x'+t+h = (S‘X')A1 I joc<JX.+t dt jx. f(s-x'-h)vodtdsc“y .y]x(-«>.s1> s hc< sh where the subscript s in dS denotes the variable of integration. 85 Hence, 1 2 (1.30) $0 Q((y',y] X (x'+t,x'+t+h])dt s m h . 
Thus, taking the integral wrt Lebesgue measure over [0,1], and weakening the bound in various ways (including one usage of (l) and (30)) leads to (1.31) [3(29)dt s Q((y'.y'+h]><(x'.x]) + Q((y'+h.y]><(x'.x'+h]) +Quwwh1xmumhb+om+wfh By Lemma 2 with k = 2 and by three applications of (l) and Lemma 3. (1.32) Av. pj_in{ rhs<31> /Q1:. - Q((y',y1 x (x',x]>\} = 0(h + n-1 + (m h2)'1) Proof. The proof proceeds in the same way as that of Lemma 6. For simplicity, we let k = 2. Fix j until (38). Define Zi = h;k[§ 5 §: 5 x + h1]/f(x), V i = 2,3,...,n. Then, in view of (11), (9) and (8) we can directly verify T(y,-)]:, = Z: From the definition, Z2,...,Zn are l-dependent random variables. Also, by the assumption (0.1), 0 s Zi smzh-2 for 2 -2 all i. Thus, by Proposition 1 with b-a = m h (1.36) rzj,2\T(y.-)]:. ' Q(E'EN 2 -_ s 15,202,211» - «tutu + 7773- 2 ), h 87 where k is as defined just below (28). Proceeding the same way that (29) was obtained, we bound the first term of rhs(36) by (1-37) Q((y'.y]><(x,r+h]) + Q((y.y+h]X(X'.x+h]) + Q((y'+h,>']X(X'.X'+h]) + Q((y' .y'+h] x >/6<§>) where P(fi) is defined by (1.5). As in §2.0,estimating 5(3) by AF*(u) (see (1.9) with f E l), Q(u) by h-(k-1)F*((v,v+hl] X (-m,u]) and G((v',v] X (~m,u]) by T(u) é Z AF*(v,u-r), and noting from (1.4) that 0 s njn S 1, gives us an estimate of 9. JD as (2.2) ®jn = X - (0 V q’jn) A l where (2.3) an = {h'(k‘1)F*<(y.y+h]x(-m.x]> - T}/T1:. 90 where 0/0 is taken to be zero. We first investigate the relation between ¢ and er. We ~ abbreViate wjn to W and qfin in the definition of aT,j to m and will show m lies between w-h and w. In view of (3) and (1 13) (2.4) (9')) a)"; T(y.x'+t>dt - h‘(k‘1)F*(-.x)];+h)/T(y.31:. . 1n the definition (1.10) of T(y,x'), for every i, 1 f0 g[y'+t-r < xi 3 y'+t-r+h]dt = h, s‘h, = 0, according as X1 5 y, y < Xi s y+h, y+h < Xi. 
Applying two separate cases X1 5 y and y < X1 5 y+h to the rhs of the first equality below gives us h2 1T( x'+t)dt = Av [ < Xk-l s +h]( 1z[x'+t-r < X s x‘+t-r+h]dt) 10 y’ ' y .,1-1 Y 10 1 + 1 1 , , = hF (-,x)]y +[y < X1-1 s y+hl, x < xi 5 “H.102“ +t-r]x.. that T]x. = h {F <~.x+h>]§ - F < .x>1§ 3, we obtain (2.5) OStp-wsh, analogous to (2.4.1) in the k = 1 version. We now consider the modified regret of m. Since ~ X! < ¢jn 5 xj’ it follows by (0.4) and by the triangular inequality with QT j as an intermediate term that (2 6) 2-1'Dk ) 5A P" - '+A P - 9 ° \ (9’91 V; ‘an 9T,j‘ V' ~\e'1‘,j jn‘ Since (by (2), (1.12) and (5)) l¢jn - aT’j! s (W - q» A l s'h, (2.7) first term of rhs(6) s h 0n the other hand, since the inequality (1.41) for f = l bounds the second term of rhs(6), it follows by (6) that whg@\=Mh+R*+4G#34> - 2 2 Hence, taking h to be exact order n 1/( k+ ) gives 92 -l/ (2k+2) Theorem 2, For h with exact order n n-l/(2k+2) \Dk(g. g)\ = 0( ) , uniformly in e. APPENDIX APPENDIX Section A.l (the main development) relates to bounds for difference of two integrals of a bounded function in terms of exten- sions of Levy metric. Section A.2 relates to convergence of a sequence of variances. Some of the results in §A.l were used in §1.1 and the consequence in §1.2 was applied in §2.2. §A.l. Extensions of Lévy Metric and Bounds for Difference of Two Integrals of a Bounded Function. We first extend Levy metric L to the family 3 of in- creasing real functions on R and then introduce an extension 9 to the family 771 of measures on (R,B) determined by the variation of elements in 3. p is defined as the infimum of L's and Remark A.l shows the infimum is attained. Proposition A bounds L at retractions to an interval by the maximum of differences of values of the func- tions at end points of the interval and L at the unretracted functions. 
We then prove strengthened generalizations (Lemma A.2) of Lemmas 8' and 8 of Oaten (1969, Appendix) giving bounds on the difference of two integrals of a bounded function. Lemma A.3 introduces another family of bounds for the same difference. Proposition A and Lemma A.3 are used in the proof of Lemma 6 in §1.1. Although Lemma A.2 was derived for this purpose, it is now included only for its own sake (as a generalization of Oaten's results) since Lemma A.3 gives a better bound in this application.

For each $F \in \mathcal{F}$, let pre-subscripts on F denote composition with the indicated translation, that is, ${}_{\varepsilon}F(x) = F(\varepsilon + x)$; let $F^{\circ}(x) = x + F(x)$ and note $\varepsilon + ({}_{\varepsilon}F)^{\circ} = {}_{\varepsilon}(F^{\circ})$. For every $r \in R$, let $S_r$ be the interval

$S_r = \{\varepsilon \ge 0 : {}_{-\varepsilon}(F^{\circ}) \le r + G^{\circ} \le {}_{\varepsilon}(F^{\circ})\}$.

Note that (i) replacement by strict inequalities throughout would, at most, subtract an end point from $S_r$, and (ii) replacement by restrictions to a dense subset of R would, at most, add an end point to $S_r$. Therefore neither would affect the definitions which follow.

The Lévy distance L of F and G in $\mathcal{F}$ is defined by

(1.1)  $L(F,G) = \wedge S_0$.

That L is a pseudo metric will be seen in Lemma A.1, where L is shown to be the supremum of the difference of the quantiles of modifications.

For right continuous F and G, $S_r(F,G)$ is closed. For, taking r = 0 without loss of generality (since $S_r(F,G) = S_0(F, r+G)$) and letting $\varepsilon \downarrow L(F,G) = L$ through points of $S_0$ gives $G^{\circ} \le {}_{L}(F^{\circ})$ and, by symmetry of the Lévy distance, $F^{\circ} \le {}_{L}(G^{\circ})$, which is equivalent to ${}_{-L}(F^{\circ}) \le G^{\circ}$.

We define another distance function $\rho$ on $\mathcal{F}$ as follows: for any F and G in $\mathcal{F}$,

(1.2)  $\rho(F,G) = \bigwedge_{r \in R} L(F, r+G)$.

Note that $\rho$ is invariant under translates of values of F and G. Since functions in $\mathcal{F}$ which differ only by a constant except at discontinuity points induce the same measure, $\rho$ is actually a metric on $\mathcal{M}$:

(1.3)  $\rho(\mu, \nu) = \rho(F,G)$

for any F and G in $\mathcal{F}$ inducing the respective measures $\mu$ and $\nu$.
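As a numerical illustration of (1.1) and (1.2), the following Python sketch approximates L and $\rho$ for two uniform distribution functions. It is only a sketch under stated assumptions: the helper names (`uniform_cdf`, `in_S_r`, `levy`, `rho`), the finite grid of x values, the bisection bracket and the grid of shifts r are all choices made here for illustration and are not part of the thesis. Checking the defining inequalities on a grid only approximates the infimum.

```python
def uniform_cdf(theta):
    """Distribution function of the uniform law on [theta, theta + 1)."""
    def cdf(x):
        return min(1.0, max(0.0, x - theta))
    return cdf


def in_S_r(F, G, eps, r, grid, tol=1e-12):
    """Grid check that eps belongs to S_r, i.e. that
    F(x - eps) - eps <= r + G(x) <= F(x + eps) + eps
    (the x terms of F°, G° cancel) at every grid point x."""
    for x in grid:
        g = r + G(x)
        if F(x - eps) - eps > g + tol or g > F(x + eps) + eps + tol:
            return False
    return True


def levy(F, G, grid, r=0.0, hi=2.0, iters=30):
    """Approximate L(F, r + G) = inf S_r by bisection.
    S_r is an up-set in eps, and hi is assumed large enough to lie in S_r."""
    if in_S_r(F, G, 0.0, r, grid):
        return 0.0
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if in_S_r(F, G, mid, r, grid):
            hi = mid
        else:
            lo = mid
    return hi


def rho(F, G, grid, shifts):
    """Approximate rho(F, G) = inf over r of L(F, r + G) on a finite set of shifts."""
    return min(levy(F, G, grid, r=r) for r in shifts)


# Hypothetical example: uniform laws on [0, 1) and [0.2, 1.2).
grid = [-1.0 + 0.01 * i for i in range(351)]      # x from -1 to 2.5
shifts = [0.005 * k for k in range(-40, 41)]      # r from -0.2 to 0.2
F = uniform_cdf(0.0)
G = uniform_cdf(0.2)

levy_FG = levy(F, G, grid)
rho_FG = rho(F, G, grid, shifts)
print("L(F,G)   ~", round(levy_FG, 4))
print("rho(F,G) ~", round(rho_FG, 4))
```

For this pair the grid computation gives L close to 0.1, while the infimum over the shifts r is smaller (about 0.07 on this grid), which illustrates why $\rho$ is defined through the induced measures rather than through a particular choice of distribution functions.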
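Since $\bigwedge_r (\wedge S_r) = \wedge(\bigcup_r S_r)$ for any family of subsets $S_r$ of the extended real line, we see that

(1.4)  $\rho = \wedge(\bigcup_r S_r)$.

Although we have used + (−) in the subscript position to denote the positive (negative) part, we will also use + (−) on the line to denote right (left) limit.

Remark A.1. The infimum in the definition of $\rho$ is attained.

Proof. Pick a sequence $\{\varepsilon_n\}$ of numbers which strictly decreases to $\rho$. Then, by (4), there exists $r_n$ such that

$r_n + {}_{-\varepsilon_n}(F^{\circ}) \le G^{\circ} \le r_n + {}_{\varepsilon_n}(F^{\circ})$ for all n.

Thus, taking $\overline{\lim}$ and $\underline{\lim}$ on the lhs and the rhs respectively leads to

$\overline{\lim}\, r_n + {}_{-\rho}(F^{\circ})- \;\le\; G^{\circ} \;\le\; \underline{\lim}\, r_n + {}_{\rho}(F^{\circ})+$.

Therefore, for every $r \in [\underline{\lim}\, r_n, \overline{\lim}\, r_n]$, $L(F, r+G) = \rho$.

For each $F \in \mathcal{F}$ and $t \in R$, let $tF$ denote the t-th quantile of $F^{\circ}$. Note that $t \mapsto tF$ maps R onto R. Define $\eta$ by

$\eta(F,G) = \bigvee_t |tF - tG|$.

Lemma A.1. $L = \eta$.

Proof. Abbreviate $\eta(F,G)$ to $\eta$. For every $\delta > 0$,

$F^{\circ}(tG - \eta - \delta) \le F^{\circ}(tF - \delta) \le G^{\circ}(tG + \delta)$

and

$G^{\circ}(tG - \delta) \le F^{\circ}(tF + \delta) \le F^{\circ}(tG + \eta + \delta)$.

Since the mapping $t \mapsto tG$ is onto, these inequalities show that $L(F,G) \le \eta(F,G) + 2\delta$ and thus $L \le \eta$. On the other hand, if $L(F,G) < \varepsilon$, then ${}_{-\varepsilon}(F^{\circ}) \le G^{\circ} \le {}_{\varepsilon}(F^{\circ})$. Noticing that the t-th quantiles have the opposite ordering and that, by the definition of ${}_{\varepsilon}(F^{\circ})$, (t-th quantile of ${}_{\varepsilon}(F^{\circ})$) $= tF - \varepsilon$, we obtain $tF + \varepsilon \ge tG \ge tF - \varepsilon$. Thus $\eta(F,G) \le \varepsilon$ and therefore $\eta \le L$.

We now prove the following proposition.

Proposition A. Let $I = (a,b]$ be a finite interval and let $F_I$ be the retraction of F into the closed interval $[F(a+), F(b+)]$. Then,

$L(F_I, G_I) \le |(F-G)(a+)| \vee |(F-G)(b+)| \vee L(F,G)$.

Proof. Let

(1.5)  $\vee = F^{\circ} \vee G^{\circ}$ and $\wedge = F^{\circ} \wedge G^{\circ}$.

It is straightforward from the definition of t-th quantiles to show that, according as $t \le \vee(a+)$ or $t \ge \wedge(b+)$ or $t \in (\vee(a+), \wedge(b+))$, we have

$|tF_I - tG_I| \le |(F^{\circ}-G^{\circ})(a+)|$ or $\le |(F^{\circ}-G^{\circ})(b+)|$ or $\le |tF - tG|$,

where strict inequalities hold for $\wedge(a+) < t \le \vee(a+)$ or $\wedge(b+) \le t < \vee(b+)$. Thus,

$\eta(F_I, G_I) \le |(F^{\circ}-G^{\circ})(a+)| \vee |(F^{\circ}-G^{\circ})(b+)| \vee \eta(F,G)$.

Therefore, Lemma A.1 leads to the asserted inequality.

Definition A.1.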
With h a function defined on a real interval I, the modulus of continuity of h is the function given by

(1.6)  $\alpha(\varepsilon) = \bigvee\{|h(\omega_1) - h(\omega_2)| : \omega_1, \omega_2 \in I,\ |\omega_1 - \omega_2| < \varepsilon\}$

for every $\varepsilon > 0$.

Definition A.2. With h measurable on a real interval I supporting a finite measure $\tau$,

(1.7)  $\tau\text{-sup}\, h = \wedge\{\delta : \tau[h > \delta] = 0\}$,  $\tau\text{-inf}\, h = -(\tau\text{-sup}(-h))$

and, with $\tau_r^{\varepsilon}$ denoting the restriction of $\tau$ to the interval $(r - \varepsilon/2, r + \varepsilon/2)$, the $\tau$-modulus of continuity of h is the function given by

(1.8)  $\tau\text{-}\alpha(\varepsilon) = \bigvee\{\tau_r^{\varepsilon}\text{-sup}\, h - \tau_r^{\varepsilon}\text{-inf}\, h : r \in I\}$

for every $\varepsilon > 0$.

The following Lemma A.2 is a unified and slightly strengthened generalization of Lemmas 8' (corrected by replacing 3 by 4 in the bound) and 8 of Oaten (1969, Appendix), with proof evolving from those of Oaten.

Lemma A.2. Let I be a finite interval $\{a,b\}$ supporting finite measures $\mu$ and $\nu$ and let h be measurable on I into a finite interval $[c,d]$. Abbreviating $\rho(\mu,\nu)$ to $\rho$ and $L(\mu[a,\cdot], \nu[a,\cdot])$ to L, $|\int h\,d(\mu-\nu)|$ has the following families of upper bounds:

(1.9)  $\alpha\big(((b-a)/k \vee L)+\big)\{(k-1)L + |\mu-\nu|I + 2(\mu I \wedge \nu I)\} + ((-c) \vee d)\,|\mu I - \nu I|$, for every positive integer $k < (b-a)/L + 1$;

(1.10)  $(d-c)k\rho + (\mu+\nu)\text{-}\alpha\big((\rho + ((b-a)/k \vee 2\rho))+\big)(\mu I \wedge \nu I) + ((-d) \vee c)\,|\mu I - \nu I|$, for every positive integer $k < (b-a)/(2\rho) + 1$.

The bounds in (9) and (10) hold for every positive integer k, but those unlisted are dominated by the bounds corresponding to the largest k listed.

Remark. The bounds of Oaten are parametrized by $\lambda \in (L,\infty)$ and $(2L,\infty)$ respectively and are improved by the $\mu I = \nu I = 1$ specialization of (9) and (10) above with k taken to be the least integer greater than $(b-a)/\lambda$.

Proof of Lemma A.2. For a given $\sigma$ with $k-1 < (b-a)/\sigma < k$, let $\delta \triangleq k\sigma - (b-a)$ and let $x_j \triangleq a + j\sigma - 2^{-1}\delta$ for $j = 0,1,2,\ldots,k$. Since $\sigma < (b-a)/(k-1)$, it follows that $\delta < \sigma$ and hence $(x_0+x_1)/2$ and $(x_{k-1}+x_k)/2$ both lie inside the interval I.

Proof of (9). Note $L < (b-a)/(k-1)$ and take $\sigma > ((b-a)/k) \vee L$. Let $h_j \triangleq h((x_{j-1}+x_j)/2)$ for $j = 1,2,\ldots,k$. Then $|h(x) - h_j| \le \alpha(\tfrac{1}{2}\sigma+)$ for each $x \in (x_{j-1}, x_j]$, and $|h_j - h_{j+1}| \le \alpha(\sigma+)$ for each j. Let $D_j$
$= (\mu-\nu)(x_0, x_j]$, $j = 0,1,\ldots,k$. Then

(1.11)  $\int h\,d(\mu-\nu) = \sum_{j=1}^{k}\big\{\int_{(x_{j-1},x_j]}(h-h_j)\,d(\mu-\nu) + h_j(D_j - D_{j-1})\big\} \le \alpha(\tfrac{1}{2}\sigma+)\,|\mu-\nu|I + h_k D_k + \sum_{j=1}^{k-1}(h_j - h_{j+1})D_j$.

From $\sigma > L$,

$(D_j)_+ \le (\nu(x_j, x_{j+1}] + L) \wedge (\mu(x_{j-1}, x_j] + L) \le \nu(x_j, x_{j+1}] \wedge \mu(x_{j-1}, x_j] + L$

and, by interchange of $\mu$ and $\nu$,

$(D_j)_- \le \mu(x_j, x_{j+1}] \wedge \nu(x_{j-1}, x_j] + L$.

Thus, henceforth abbreviating $\mu I$ and $\nu I$ to $\mu$ and $\nu$,

$\sum_{j=1}^{k-1}|D_j| \le 2(\mu \wedge \nu) + (k-1)L$.

Therefore,

$\sum_{j=1}^{k-1}(h_j - h_{j+1})D_j \le \alpha(\sigma+)\{2(\mu \wedge \nu) + (k-1)L\}$.

Combining this with the inequality $h_k D_k \le d(\mu-\nu)_+ + (-c)(\nu-\mu)_+$, we obtain that

(1.12)  lhs(11) $\le \alpha(\sigma+)\{|\mu-\nu|I + 2(\mu \wedge \nu) + (k-1)L\} + d(\mu-\nu)_+ + (-c)(\nu-\mu)_+$.

Replacing h by −h gives us

(1.13)  −lhs(11) $\le \alpha(\sigma+)\{|\mu-\nu|I + 2(\mu \wedge \nu) + (k-1)L\} + (-c)(\mu-\nu)_+ + d(\nu-\mu)_+$.

We obtain (9) by taking the maximum of rhs(12) and rhs(13), recognizing

$(d(\mu-\nu)_+ + (-c)(\nu-\mu)_+) \vee ((-c)(\mu-\nu)_+ + d(\nu-\mu)_+) = ((-c) \vee d)\,|\mu-\nu|$,

and letting $\sigma$ decrease to $((b-a)/k) \vee L$.

Proof of (10). Since, by Remark A.1, $\rho = L(F,G)$ for some right continuous distribution functions F and G inducing $\mu$ and $\nu$, it suffices to prove (10) with $\rho$ replaced by $L = L(F,G)$. As in the proof of (9), note $2L < (b-a)/(k-1)$ and take $\sigma > ((b-a)/k) \vee 2L$. By the definition of L we can find $x_0 = y_0 < y_1 < \cdots < y_k = x_k$ so that, for each j, $|x_j - y_j| \le L$ and

(1.14)  $F(y_j-) - L \le G(x_j) \le F(y_j) + L$,

because

$\bigcup_y\{[F(y-) - L,\ F(y) + L] : |y - x_j| \le L\} = [F((x_j - L)-) - L,\ F(x_j + L) + L]$

and the intervals $[x_j - L, x_j + L]$ strictly increase wrt j.

We extend the domain of h to the interval $[x_0, x_k]$ by defining $2h = c + d$ on the complement of I. For each j, let $\wedge_j = x_j \wedge y_j$ and $\vee_j = x_j \vee y_j$. Let $\tau = \mu + \nu$, let $\tau_j$ denote the restriction of $\tau$ to the interval $[\wedge_j, \vee_{j+1}]$ and let $\underline{h}_j = \tau_j\text{-inf}\, h$. Then, define functions $h_1$ and $h_2$ by

$h_1(x_j, x_{j+1}] = h_2(y_j, y_{j+1}) = \underline{h}_j$ and $h_2(y_j) = h_2(y_j-) \vee h_2(y_j+)$.

Now $h - h_1 \le \tau\text{-}\alpha(L+\sigma+)$ a.e. $\tau$ on $(x_j, x_{j+1}]$ because, for every $\delta > 0$,

$\tau((x_j, \vee_{j+1}] \cap [h < \underline{h}_j + \delta]) + \tau([\wedge_j, x_{j+1}] \cap [h < \underline{h}_j + \delta]) > 0$,

so that if $\tau((x_j, x_{j+1}] \cap [h - \underline{h}_j > \lambda]) > 0$ then $\tau\text{-}\alpha(L+\sigma+) \ge \lambda - \delta$ and thus $\ge \lambda$. Also $h_2 - h \le 0$ a.e.
$\tau$ because $h \ge \underline{h}_{j-1} \vee \underline{h}_j \ge h_2$ a.e. $\tau$ on $[\wedge_j, \vee_j]$ and $h \ge \underline{h}_j = h_2$ a.e. $\tau$ on $(\vee_j, \wedge_{j+1})$.

Let $r \in R$. If $h_2(y_{j-1}, y_{j+1}) \le r$, then $h_1(x_{j-1}, x_{j+1}] \le r$. Conversely, $h_1(x_{j-1}, x_{j+1}] \le r$ implies $h_2((y_{j-1}, y_j) \cup (y_j, y_{j+1})) \le r$ and therefore $h_2(y_{j-1}, y_{j+1}) \le r$. Hence $h_2^{-1}(-\infty, r]$ is the union of at most k/2 intervals of the form $(y_i, y_j)$, and $h_1^{-1}(-\infty, r]$ is the union of the corresponding intervals $(x_i, x_j]$. We note that, by two applications of (14),

$\mu(y_i, y_j) \le \nu(x_i, x_j] + 2L$ for all $i < j$,

so that

(1.15)  $\mu h_2^{-1}(-\infty, r] \le \nu h_1^{-1}(-\infty, r] + kL$.

By two usages of the Fubini representation of the integral (cf. (2.1.10)) of a nonnegative function in the rhs of the first equality below,

(1.16)  $\int h_1\,d\nu - \int h_2\,d\mu - d(\nu I - \mu I) = \int (d - h_2)\,d\mu - \int (d - h_1)\,d\nu = \int_0^{d-c}(\mu h_2^{-1} - \nu h_1^{-1})(-\infty, d-t]\,dt \le (d-c)kL$.

Henceforth abbreviating $\tau\text{-}\alpha(\sigma + L+)$ to $\alpha$, $\mu I$ to $\mu$ and $\nu I$ to $\nu$, the triangle inequality and (16) bound $\int h\,d(\nu-\mu) - (d-c)kL$ by

(1.17)  $\int (h - h_1)\,d\nu + \int (h_2 - h)\,d\mu + d(\nu - \mu) \le \alpha\nu + d(\nu - \mu)$.

Applying (17) to −h with the measures interchanged gives the bound $\alpha\mu + (-c)(\mu - \nu)$. The minimum of those bounds is the former or latter according as $\mu \ge$ or $\le \nu$ and therefore

(1.18)  $\int h\,d(\nu-\mu) \le (d-c)kL + \alpha(\mu \wedge \nu) + c(\nu-\mu)_+ - d(\mu-\nu)_+$.

Applying (18) to −h gives rhs(18) with c, d replaced by −d, −c and gives

(1.19)  $|\int h\,d(\nu-\mu)| \le (d-c)kL + \alpha(\mu \wedge \nu) + (c \vee (-d))\,|\mu - \nu|$,

and (10) results on letting $\sigma$ decrease to $((b-a)/k) \vee 2L$.

In the following lemma, the natural generalization of the inverse probability integral transformation is used to develop bounds for the same difference of integrals without recourse to partitioning.

Lemma A.3. Let I, $\mu$, $\nu$ and h be as in Lemma A.2. Let F and G be distribution functions inducing $\mu$ and $\nu$ with $\vee(a-) \le \wedge(b+)$, where $\vee$ and $\wedge$ abbreviate $F^{\circ} \vee G^{\circ}$ and $F^{\circ} \wedge G^{\circ}$ (as in (5)). Then $|\int h\,d(\mu-\nu)|$ has the following family of bounds:

(1.20)  $\frac{d-c}{2}\{|(F-G)(a-)| + |(F-G)(b+)|\} + \alpha(L(F,G)+)\{\wedge(b+) - \vee(a-)\} + \frac{|d+c|}{2}\,|\mu I - \nu I|$.

Proof. Without loss of generality we can assume I is open.
For, for every $\varepsilon > 0$, $\{a,b\} \subset (a-\varepsilon, b+\varepsilon)$, to which h is extendible with the same modulus of continuity and, if (20) holds with a, b replaced by $a-\varepsilon$, $b+\varepsilon$, then letting $\varepsilon \downarrow 0$ gives (20) with $I = \{a,b\}$.

Let $I = (a,b)$ and let f denote the map $t \mapsto tF$. Since $f^{-1}\{u\} = [F^{\circ}(u-), F^{\circ}(u+)]$ and $F^{\circ}$ is strictly increasing, $f^{-1}(a,b) = \bigcup_{a<u<b} f^{-1}\{u\} = (F^{\circ}(a+), F^{\circ}(b-))$. Representing the integrals with respect to $F^{\circ}$ and $G^{\circ}$ as Lebesgue integrals over the corresponding quantile ranges (the Lebesgue parts of $F^{\circ}$ and $G^{\circ}$ cancel in the difference), we have

(1.21)  $\int h\,d(\mu-\nu) = \int_{\wedge(a)}^{\vee(a)}\varepsilon(S)h(tS)\,dt + \int_{\vee(a)}^{\wedge(b)}\{h(tF) - h(tG)\}\,dt + \int_{\wedge(b)}^{\vee(b)}\varepsilon(T)h(tT)\,dt$,

where $\varepsilon(F) = 1$, $\varepsilon(G) = -1$, and S and T have values in the set $\{F,G\}$ such that $S^{\circ}(a) = \wedge(a)$ and $T^{\circ}(b) = \vee(b)$. Hence, abbreviating $L(F,G)$ to L hereafter and using Lemma A.1, $\int h\,d(\mu-\nu) \le$

(1.22)  $d((G-F)(a))_+ + (-c)((F-G)(a))_+ + \alpha(L+)(\wedge(b) - \vee(a)) + d((F-G)(b))_+ + (-c)((G-F)(b))_+$.

Applied to −h, (22) is altered only by c, d changing to −d, −c:

(1.23)  $(-c)((G-F)(a))_+ + d((F-G)(a))_+ + \alpha(L+)(\wedge(b) - \vee(a)) + (-c)((F-G)(b))_+ + d((G-F)(b))_+$.

Since (22) + (23) $= (d-c)\{|(F-G)(a)| + |(F-G)(b)|\} + 2\alpha(L+)(\wedge(b) - \vee(a))$ and (22) − (23) $= (d+c)(\mu I - \nu I)$, (22) ∨ (23) = (20).

§A.2. A Fatou Theorem for Variances.

The following theorem is used in Section 2.2.

Theorem A.1. If $\{U_n\}$ is a sequence of random variables converging in distribution to a random variable U, then $\underline{\lim}\,\mathrm{Var}(U_n) \ge \mathrm{Var}(U)$.

Proof. It suffices to show this for $\{U_n\}$ such that $\mathrm{Var}(U_n)$ converges to a finite limit. With $\mu_n = EU_n$ and $\sigma_n^2 = \mathrm{Var}\,U_n$, the Tchebycheff inequality gives $P[|U_n - \mu_n| < \sqrt{2}\,\sigma_n] \ge 1/2$, while tightness provides a finite b independent of n for which $P[|U_n| \le b] > 1/2$. The nonemptiness of the intersection of these events shows $|\mu_n| < b + \sqrt{2}\,\sigma_n$, so that $\{\mu_n\}$ is bounded. Letting $\{\mu_m\}$ be a convergent subsequence with limit $\mu_{\infty}$, $U_m - \mu_m$ converges in distribution to $U - \mu_{\infty}$ and hence (cf. Loève (1963) 11.4, A(i))

$\lim \mathrm{Var}(U_n) = \lim E(U_m - \mu_m)^2 \ge E(U - \mu_{\infty})^2 \ge \mathrm{Var}\,U$.

BIBLIOGRAPHY

Breiman, L. (1968). Probability. Addison-Wesley.
Deely, J.J. and Kruse, R.L. (1968). Construction of sequences estimating the mixing distribution. Ann. Math. Statist. 39, 286-288.
Feller, W. (1968).
An Introduction to Probability Theory and its Applications, Volume I (3rd ed.). Wiley, New York.
Feller, W. (1971). An Introduction to Probability Theory and its Applications, Volume II (2nd ed.). Wiley, New York.
Ferguson, T.S. (1967). Mathematical Statistics: A Decision Theoretic Approach. Academic Press.
Fox, R. (1968). Contribution to compound decision theory and empirical Bayes squared error loss estimation. Research Memorandum RM-214, Department of Statistics and Probability, Michigan State University.
Fox, R. (1970). Estimating the empiric distribution function of certain parameter sequences. Ann. Math. Statist. 41, 1845-1852.
Gilliland, D.C. and Hannan, J.F. (1969). On an extended compound decision problem. Ann. Math. Statist. 40, 1536-1541.
Halmos, P.R. (1950). Measure Theory. Litton Educational Publishing.
Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58, 13-30.
Loève, Michel (1963). Probability Theory (3rd ed.). Van Nostrand, Princeton.
Neveu, J. (1965). Mathematical Foundations of the Calculus of Probability. Holden-Day.
Oaten, A. (1969). Approximation to Bayes risk in compound decision problems. Research Memorandum RM-233, Department of Statistics and Probability, Michigan State University.
Oaten, A. (1972). Approximation to Bayes risk in compound decision problems. Ann. Math. Statist. 43, 1164-1184.
Royden, H.L. (1963). Real Analysis. The Macmillan Company, New York.
Saks, S. (1937). Theory of the Integral (2nd ed.). Monografie Matematyczne.
Sibley, D.A. (1971). A metric for weak convergence of distribution functions. Rocky Mountain Journal of Math. 1, No. 3, 427-430.
Singh, Radhey S. (1974). Estimation of derivatives of average of μ-densities and sequence-compound estimation in exponential families. Research Memorandum RM-318, Department of Statistics and Probability, Michigan State University.
Swain, Donald D. (1965).
Bounds and rates of convergence for the extended compound estimation problem in the sequence case. Technical Report No. 81, Department of Statistics, Stanford University.
Yu, Benito Ong (1971). Rates of convergence in empirical Bayes two-action and estimation problems and in extended sequence-compound estimation problems. Research Memorandum RM-279, Department of Statistics and Probability, Michigan State University.