w , * I 1 \ l i x 1 APPROXEMATEON TO SAYES RISK EN SEQUENCES OE NDN-FEMTE DECESION PROBLEMS Thesis for 1%: fiagrm ci’ Ph. D. MICHKGAN SE‘ATE UNEVERSITY Dennis Crippan Giiliiand 129636 LIBRARY Michigan 5a. Um‘ "r7 This is to certify that the thesis entitled APPROXIMATION TO BAYES RISK IN SEQUENCES OF NON-FINITE DECISION PROBLEMS presented by Dennis Crippen Gilliland has been accepted towards fulfillment of the requirements for PhoDo ‘ degree in Statistics and Probability WW J Major professor Date August 10, 1966 0-169 ABSTRACT APPROXIMATION TO BAYES RISK IN SEQUENCES OF NON-FINITE DECISION PROBLEMS by Dennis Crippen Gilliland Consider a statistical decision problem where X N P9, 9 E Q, and an action A E A is to be taken with A allowed to depend upon the random variable X. A loss function L 2 0 is defined on Q X A, and a non-randomized procedure w incurs a risk R(9,¢) s‘fL(9,¢)dPe. Suppose this decision problem occurs n times with '9“ = (0 .,3n) 6 On and lg“ n (X1,...,Xn) ~ 1". P6 X...XPe . A strongly sequential compound rule 3 =~(¢1,¢2,...) l n is such that for each i, wi is the means by which the i'Eh action is taken and L(9,¢i) is -§i measurable for each 9. The compound risk up to stage n is taken to be the average of the component risks, Rn(§’$) = n'12: R(§i,¢i) where .2 - (91,62,...). The modified regret Dn(£J§) s Rn(§’2) - R(Gn) with Gn the empirical distribution of 3“ and R(G) the Bayes risk at G in the component problem has been used as a standard for compound procedures. The results of the thesis show that an(§J2)|1s 0(n-g) uniformly in 2_€ waor squared error loss, certain discrete exponential families including the Poisson and negative binomial, and realizable sequential procedures Q, Bounds like 0(1) and 0(1) uniform in ‘2 are obtained for larger classes of discrete exponential families. A problem.involving normal N(9,1) dis- Dennis Crippen Gilliland tributions is also investigated with a sequential compound pro- cedure 2 demonstrated such that an(§,$)l = 0(n-(1/5)) uniformly in ‘2. In the last chapter sequence strategies are exhibited which at each stage i depend upon Gi-l and which achieve various rates of modified regret depending upon the exact structure of the component game. Sequence strategies were given by Hannan ((1957). Contributions to the Theory of Games, 3, 97-139. Ann. Math. Studies No. 39, Princeton University Press.) for various M X N games with M finite. These procedures result in an absolute modified regret 0(n-5) uniform in player I move sequences. We give a direct generalization to the countable M case. Also a theorem is proved with applications in the uncountable M case. APPROXIMATION TO BAYES RISK IN SEQUENCES OF NON-FINITE DECISION PROBLEMS By Dennis Crippen Gilliland A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Statistics and Probability 1966 ACKNOWLEDGMENTS I am indebted to Professor James F. Hannan for the introduction to compound decision theory and for the guidance he gave to my thesis work. His comments and suggestions led the way to theorems and simplified proofs. Above all this, I deeply appreciate his efforts in teaching me many aspects of mathematical statistics. Thanks are also due to Professor Herman Rubin for many use- ful discussions and to Professor Esther Seiden for the original encouragement to do graduate work in mathematical statistics. The financial support provided by the Department of Statistics and Probability, the National Science Foundation, and the Office of Naval Research made my graduate studies possible. In this regard I also wish to thank Goodyear Aerospace Corporation for the leave of absence for education. ii TABLE OF CONTENTS CHAPTER I. INTRODUCTION TO SEQUENTIAL COMPOUND DECISION PROBLEMS . . . . . . . . . . . . . 1.1 The component problem . . . . . . . . . . . . . 1.2 The sequential compound decision problem . . . 1.3 A useful theorem . . . . . . . . . . . . . . 1.4 A bound for the modified regret under squared error loss . . . . . . . . II. SEQUENTIAL COMPOUND ESTIMATION FOR SQUARED ERROR LOSS AND SOME DISCRETE EXPONENTIAL FAMILIES . . . 2.1 Introduction . . . . . . . . . . . . . . . . . 2.2 Review of known results and an example . . . . -1 2.3 Bounds for n 2: E(|¢i - lil) . . . . . . . . -1 n H _ 2.4 Bounds for n 21 E(Icpi Wil) . . 2.5 Bounds for modified regret . . . . . . . . . . 2.6 Examples . . . . . . . . . . . . . . . . . . III. SEQUENTIAL COMPOUND ESTIMATION FOR SQUARED ERROR LOSS AND A FAMILY OF NORMAL DISTRIBUTIONS . . . . . 3.1 Introduction . . . . . . . . . . . . . . . . . 3.2 Estimation of ui(x) . . . . . . . . . . . . . . 3.3 A decision problem . . . . . . . . . . . . . . IV. ON THE BAYES RESPONSE TO THE EMPIRICAL DISTRIBUTION OF OPPONENTS PAST IN SEQUENTIAL DECISION PROBLEMS . 4.1 Introduction . . . . . . . . . . . . . . . . . 4.2 Games involving squared error loss . . . . . . 4.3 Games with countable M . . . . . . . . . . . 4.4 A theorem for non-finite games . . . . . . . . BIBLIOGRAPHY..................... iii PAGE 10 10 11 14 21 26 33 36 36 36 43 45 45 47 49 53 58 CHAPTER I INTRODUCTION TO SEQUENTIAL COMPOUND DEC IS ION PROBLEMS The notation necessary to describe the mathematics of compound decision theory is necessarily cumbersome. To reduce the complexity we will.find it convenient to suppress the display of some dependen- cies and to use Operator notation. For example, PX, P(X) or P(X(w)) denotes the integral fXKw) dP(w). Another reduction in notation is accomplished by letting square brackets denote the indicator function, [A] being the indicator function of the set A. 1, The Component Problem. The component problem has the structure of the usual statistical decision problem. Let 0 denote a parameter space indexing a set of probability measures {Pele 6 Q} and let IA signify the action space. A real valued loss function L 2 0 is defined on O X‘A. The action A is allowed to depend upon the realization x in the measurable space (X93) of a random variable X distributed according to P6“ A randomized decision function w is for each x a probability measure V¢(X) over some C-field of subsets of A such that L(9,¢(x)) = v (L(9,A)) is (X,B) measurable. The ¢(X) risk or risk function associated with w is the expected loss resulting from the use of w: R(6,¢) = P9 (L(9,¢(x))). 2, The Sequential Compound Decision Problem. Suppose the decision problem described in 51 occurs n times n _ . .,9n) 6 Q and £5." (X1,...,Xn) distributed with ‘Qn f (61,.. according to En = P1 X ... X Pn where we have made the identifi- cation Pi = P6 . Using the terminology of Hannan (1957, p. 105) i we make a distinction between weak and strong sequence games. In the weak sequence game n is known to player II at each component but not in the strong sequence game. Throughout this paper only the latter situation is considered. A strongly sequential compound rule w = ($1,¢2,...) is such that for each i, mi = ¢i(xi) is the means by which the AER action is taken and L(9,wi) is ‘51 measurable for each 9. The risk in the ith component problem is R(fli,¢i) = ‘£i(L(ei’¢i))' The total risk in the sequential problem up to stage n is taken to be the average of the component risks, Rh(§,m) = n"-1 E? R(gi’wi)’ where .9 denotes (91,92,93,...). If L(0,¢i) is xi measurable for each i we say m is a simple rule. If in addition the mi are identical, mi = $1, we say w is a simple symmetric rule. Simple symmetric rules are traditional and result in Rn(§,¢) = n-1 2: R(Oi,¢1). This is the same as the risk in the component problem resulting from the use of the procedure -¢1 if 6 is assumed to be the realization of a random variable distributed according to Gn the empirical distri- bution of fin. With R(G) denoting the Bayes risk versus G in the component problem the above observations lead to consideration of ' = a - <1) Dump) Rn(_,cp) men) as a standard for evaluating compound procedures. Loosely speaking, if Dn(§,¢) S Bn = 0(1) as n‘v “, then in the limit w incurs a compound risk no greater than R(Gn). Dn(§,¢) is called the modified regret. Later we will prove theorems concerning rates of convergence of IDn(§,¢)I for particular families of distributions and procedures. At this time we point out a necessary condition for the existence of any compound procedure w satisfying TEE“ sup {Dn@_,cp)l2 E 0”} < °°. It is the. existence of a procedure $1 in the component problem.satisfying sup {Pe(L(6,m1)) - ianL(6,A)|6 E O} < a. This follows from sup {Dn(§, - ii_1(x)|) and n'1 2? Pi - ¢i_1l>; particu- larly, in conditions under which there is convergence to zero Q uniform in parameter sequences .2 E O . To simplify the notation Y = - In what follows pe is a determination of EE—I and any ratios O/O will be interpreted as 0. A Bayes response versus G1 is (2) ¢. = [ti pj(x) > o]. The following example shows that for this response and a fixed bounded sequence .2, n-1'Eq'Pi(lYi($)l) need not converge to zero. Example I, Let p be counting measure on {2,3,...} and 6 6 Q = {1/2, 2/3, 3/4,...} index Pe degenerate on (1 - 9)-1. Then with Gj = j(j + 1)-1, we compute ¢i(x) = Eg+l (j - 1)j-1[x = j] so Yi(x) = i(i + 1)’1 [x = 1 +1] and Pi(l‘*’i(x)|) = i(i + 1Y1. i 2 1. Hence, n-1 8? Pi(lYi(x)l) d l as n r “. However, in TheOrem l are given conditions sufficient for convergence at the rate' 0(n7110g n) unifonm in .2. Theorem 1:. If 0 C [-A,A], A < °° and M(x) = sup {pe‘(x)|9 6 O} is integrable (X,B,u), then $1 given in (2) satisfies '1 2n [Y I "1 . o e n 1 Pi( i(x) ) = 0(n log n) uniformly in ._. The following lémma is needed. _ n 2 i -1 n .-1 Lemma‘l. Sn(a1,...,an) — 21 aid:1 aj) S E 1 1 = sn(1,...,1) for all 0 S ai S l, 1 S i S n, n 2 1. P f L t A - 2i a th t S = 2n a2 A.1 «335 = _£22_° e i ’ 1 j so a n 1 i i ’ Ba 2 -2 s (2anAn_1 + an) An 2 0. Therefore, Sn(a1,...,an) Sn(a1,...,an-1,l) for all 0 S ai S l, 1 S i S n. Also, for 1 S k:< n '353 l l) - (2 A + 2) A.2 - Zn (A + °-k)-2 aak(a1,...,ak, ,..., - ak k-l ak k k+1 k 1 2 a S n _ 2 -3 n ,_ -3 2 532 (a1,...,ak,1,...,l) - ZAk-IAk + 2 Ek+1 (Ak.+ 1 k) 0. k Since the second partial is non-negative and k-1H2 . -1 k-1a2 -1 n : we have Sn (a1, ..,ak,1,...,1) S Sn (a1, ..,ak_1,1,...,1) for all 0 S ai S 1, 1 S i S k. The proof is completed by backward induction on k. Proof pf Theorem 1. From (2) it follows that 91(X)13i'1(°i - ej) pj(X) 3i pj(X) 3: pj(X) i-l [21 (3) ‘l’i(x) = pj(x) > o] + BiEEi'l pj(X) = 0, pi(X) > 0] so if Q C [-A,A], 2A P (X) . . _ IY. (x)!S -=——-i-—x- [21‘1 p.(x) > 01+ ADS“1 p (x) = o, p.(x) > o] Pj(x) 1 J l j .1 and 912(X) (4) P. 11‘” l> s 2A 11(--—-(--)-[2 I; 1 p. (x) > 0]) X + A u(pi(X) [21'1 p (x) = 0, pi > 01) j for i 2 1. Therefore, (p. /M)2 ) :1 (pj(x)/M (5) n"1 2’; ridwiml) s 2A n" ucncx) 2“ + A n-1 u@? pi(x)[E]I'-1pj(x) = 0, pi(x) > 0]). Lemma 1 implies that the first term on the right hand side is bounded by 2A 11(M(x)) n’l 2‘1‘ i'l. Since M(x) 2 2‘1‘ pi(x)[2‘1"1 pj(x) = o, pi(x) > o] the second term is bounded by’ A u(M(x)) n-1 and the theorem is proved. We give an example to show that the bound on the rate of conver- gence indicated in Theorem 1 is tight. Example 2, Consider the family of geometric distributions with densities pe(x) = exu - e), x = o, 1, 2,... ; 6 e a = [0,351 with respect to counting measure. At §'= (0,%,0,%,...), ¢Zi(x) = (1/6)[x = o] + ab: > o], $21400 = 35(1 - 1)(3i - 1)'1[x = o] + 25b. > 0] so Y21(x) = (1/3)(3i - 1)-1[x = 0]. Hence, 2n n . -1 21 Pi(|Yi(x)l) 2 21 (1/6)(31 - 1) 2 (1/18) E: i”1 2 (1/18) log n 2 (1/36) log (2n) for n 2 2. The hypothesis of Theorem 1 need not imply Pi(IYi(x)|) r 0 as i r ” uniformly in '2 no matter what determination of the conditional expectation W1 is used. Example‘g. Consider the family of distributions of the preced- ing example. For C degenerate on 0, ¢G(x) = a(x)[x.> O] for some function a. Let b 6 Q be any number such that b ¥ 0 and b ¥ a(1). Define .§(i) = (0:1), Oéi),...) where Ogi) = b or 0 according as j = i or j # i. Then at Eli), ¢i(x) = b(1 - b)(i - b)‘1[x = o] + tab: > o], ¢i_1(x) = a(x)[x > 0] so Pi > lb - a<1>| PiEX = 1] = lb - a(1)l bu - b). The next prOposition shows that there is convergence as i r ” uniform in .g for a family of normal distributions. This result will be needed in Chapter III where a decision problem.involving this family is discussed. Proposition 1. Consider the family of N(9,1) measures, (ICZ [-A,A], A < “. Let p9 be the continuous normal density with respect to Lebesgue measure n. For the determination of hi given in (2), Pi(IYi(x)l) = 0(i-1) as i r a uniformly in .Q. Proof. Inequality (4) implies .-1 2 -1 <6) ridges!) s 2A 1 {u(p_A(X) pA‘ (x) [x < -AJ> -1 -1 2 -1 + (2n) B uE-A S x S A] + u(pA(x) p_A(x)[x >'A])} where B = min {pe(x) I (x,6) E [-A,A]2}. Since the factor of (6) appearing in curly brackets is finite and does not depend upon .2 the proof is complete. 4. A 3.3.9329. £9; the Modified Regret 233g Squared Error Loss. The loss function L(9,A) = (Bl-A)2 is continuous and convex in A for each fixed 9 so only non-randomized procedures need be considered. For the component problem this follows from an appli- cation of Jensen's inequality as is well known. It is true for the compound problem since the compound risk is monotone increasing in each component risk. A non-randomized procedure O has modified regret -1 _ n 2 (7) Dn(9_,

- n 81 giapi - 61> - R(cn). Inequalities (8.8) and (8.11) of Hannan (1957) show that (8) n'1 2‘; Piwi - 6);)2 s R(Gn) and -1 n 2 (9) n 31 Pi(¢i-1 - Oi) 2 R(Gn). Inequalities (7) and (8) imply l (10> Bum) s n‘ 2‘; gimpi - wimpi + ii - 261)) while subtracting and adding n-1 E? Pi“!i - 61)2 from the lower bound resulting from applying (9) to (7) proves -1 n (11) Dues) 2 n 21 giucpi - ii>< O and (A1) 0 = [0,8], 0 < B < m with B assumed known. (The condition g(x) >‘0 is assumed by Samuel (1965) and is implied by assumptions of Swain (1965, p. 25).) Throughout this chapter we will be concerned with the Bayes response (1.2) which for this family takes the Special form g(x) 3: pj(x+1) = 813% 151 <1) wise) = 21 ~ 1 g(X+1) 1 pj(x) g 21 Pj where we have chosen to suppress the display of dependence upon x and f is f evaluated at x+l. Any ratios O/O should be interpreted as 0. Since p (x) S239z p (x) for all x and 0 S 6 S B, the 9 h(B) B hypothesis of Theorem 1.1 is seen to be satisfied under (A1). There- fore, Corollary 1.1 implies that for any procedure taking values in [0,8] (2) lumen” s 28 n'1 8‘; gidcpi - til) + 0(n'11og n) 10. 11. where the 0(n.1 log n) is uniform in ‘Q. In (2) we have made an adjustment in constant apprOpriate to the situation O<3 [0,8]. However, for the most part we will be content to obtain bounds in terms of order and will not attempt to keep track of the constant. We will be explicit enough so that a constant can be recovered if anyone is so interested. In subsequent sections we introduce proce- dures and use (2) to obtain a bound on the rate of convergence of the absolute modified regret. Among other things it will be shown that if the family satisfies conditions in addition to (Al) then a bound of order 0(n-g) exists. Poisson and negative binomial families satisfy the conditions. Before doing this we review or reference what has been accomplished by others who have attacked the problem. ‘2. Review.g£ Known Results and 33 Example. Essentially, Samuel (1965) imposed the assumption + (A1 ) Q = [01,81 where a > 0 and B are known and considered the procedure ¢i(xi) where g {(u-n‘l 2;“ 3,) v 5.} (3) cP.(1'€) = ~ _ ._1 AB 1 sf((i-1) 1 z; aj> v m} With 5j = 5(Xj,x), the Kronecker delta function, and m(x) = min {pe(x)l9 6 [0,8]}. (In Samuel (1965) the truncation at B was not made explicit but we assume this was intended since the indicated proof of Theorem 4 depends upon the boundedness of the 12. procedure.) Samuel showed that under (A1+) there exists a null sequence Kh(§) > 0 such that Dn(2,¢) S Kn(§). (In 55 we point out that under (Al+), IDn(§,¢)I = 0(1) uniformly in .2 and under (A1), an(§,w)| = 0(1) at each fixed g, for two procedures @-> To motivate the introduction of other conditions in addition to (A1) or (Al+) we give an example that illustrates that the null sequence Kn(§) may be going to zero arbitrarily slowly. That is, given any null sequence Hn‘> 0, there exists an exponential family and a sequence ‘2 6 [0,8]” such that Dn(§,¢) 2 Hn for all large n. Example _1_. Consider p°(x) = 9xh(9)g(x), (2 = [0,8] with d'< B 8 1 and where g(0) = 1 and g(x) S x-3, x 2 1 is yet to be Specified. We first point out that m(x) = pa(x) for x 2 2. The function h(9) = (E: 9xg(x))-1 is differentiable on (0,1) with h'(6) = .2: x 8x-1g(x) n2(e) so P6(X) = -e h'(6)/h(6). For fixed x, 5%- pe(x) = 9x-1g(x) Ex Me) + 9 mm] = ex'1g(x) h(6) [x - rem]. Since Pe(X) = a: x 9xg(x) h(e) s h(0) 2°; :62 = E: x"2 = fi2/6 we see that §%-pe(x) > 0 for all 9 and x 2 2. Hence, m(x) = Ra(x) for x 2 2. We now consider the modified regret for the procedure O at the fixed sequence §_= l_= (1,1,1,...). Here D (1 cp) = m‘1 2“ P («p (x) - 1)2) n ’ 1 -i i -l n 2 i-l ~ - 6 = 2 n 21 §i((cpi(x) 1) [21 J. 0]) -1 n “ g(x) m(x+1) _ 2 i-l ~ = 2 n 21 x30 p1(x){g(x+l) m(x) A B 1}-21~1.[21 6j 0] -l n a ‘2 n-l ~ = 2 n 21 $2 p1(x){a - 1} En-IEZI 6j 0] 13. = (a - 1)2 z: p1““1 2 54 2°; p1(2x+1) (1 - p1<2x+2))“'1 3, x = 1, 3, 5,... and where we have taken a = %. Let g(x) = x- -3 x 2 g(x) = a(x), x = 2, 4, 6,... with a(x) strictly decreasing. Then Dump) 2 a 2‘1“ bu) <2x+1)'3 <1 - hm a<2x+2)>“'1 a -3 hglz n-l 2 % 2A“ h(1) (2x+1) (1 - “_1 ) where An = minfx|a(2x+2) S (n-1)'1}. Since .8:(2x+1)'3 2 25mm)”2 (8; g(x)).1 implies B_= (1 +-E: x-3)-1 5 h(1) 5 1: and h(l) we can write . B 1 n-l -2 Dn(l,¢) 2 l6 (1 - 3:3) (2An + 1) . By choice of a(x), An can be made to increase arbitrarily slowly. Since (1 - -l*)n-1 r e"1 n-l the example is complete. Samuel's procedure (3) depends upon a knowledge of a positive lower bound for the parameter set 0 as well as an upper bound. The procedures we will introduce do not depend upon knowledge of a positive lower bound. In fact, the realizable procedures O that we will be introducing are such that ¢i(x) = 0 on [Si-1 53 = 0] so that the lower bound for Dn developed in Example 1 applies to these procedures also. Swain (1965, pp. 25-26) makes four assumptions concerning the discrete exponential family. Under these conditions he demonstrates S S = a procedure w _such that Dn(2J¢) Bn for all 0. where 0 Bn 0(n-klog n). His assumptions are satisfied by the negative binomial 14. family for B < 1 and the Poisson family for B < l. (A trivial modification of his assumption (iv) and a slight alteration of the proof of his Lemma 6 extends his result to cover the Poisson family for any B. However, this will not be detailed here since results to follow, in particular the theorem at the end of §5, show that assumption (iv) is not needed.) In the following sections we will derive bounds for the absolute modified regret under various sets of conditions on the underlying family of distributions. 3. Bounds for n.-1 2n E (5) Eylcpgx) - ¢i(x)|) = (J; Eitcpyx) - M") s -tJdt . 3 +j‘ Eitcpyx) - tier) 2 t]dt o where E; is expectation with reSpect to the measure induced by (X1,...,Xi_1,Xi). We use exponential bounds of Hoeffding (1963) to approximate each integrand displayed on the right hand side of 15. (5). Temporarily we completely suppress the display of dependence on x and delete the primes on 6; and Bi. Also dropping the subscript from E1 we have for 0 S t S B i~ g E 6 EECPf-ll.2t]SE[t-1—i-(¢,+t)20] 1 1 21 6 1 g lj =EEiY 20 =EEiY -EY 2-21m: [ 1 j J C 1( j j) 1 j] . =-~ _ =5 . i = W1th in 6j Ri(t) 5j’ Ri(t) g(wi + t). Since .21 EYj 12:21 p120 and -Ri(t)-EYjSYj-EYjS1-EYj,Hoeffding's Theorem 2 (1963) yields ~2 i 2 ' 28(2 p) (6) step: 41 2 t] s exp {- 2 1 1 2 t2}. 1 ig (1+Ri(t)) Similarly, we treat the other tail writing for 0 S t S $1 g_2i 5 EN! - i. S -t] E[——-l—‘l- (W. - t) S 0] 1 1 ~216 1 i 1 E21 z,-Ez, s4: 32, [1(J J) J] l with z = 6 - Si(t) 6 Si(t) = g(txi - t). The inequalities J j J, .21 =-§L‘i s - -EZSZ-EZSl-EZ '1 1 EZj t g 1 pj 0 and Si(t)‘ j j j j imp y e . ’ 2 282 (31p) (7) EEO; h $1 S -t] S exp {- 1 j 2 t2 i g2 (1+Si(t)) Noting upper bounds for Ri and Si we combine (6) and (7) obtaining for all 0 S t S B 16. 22202in 2 (8) ENC?! “H 2 tJSZeXP {-' ~ t} 1 1 i g2 (1448:)2 Since I exp {-ct }dt S £(fl/c) , (5) and (8) imply 0 . _ g i -1 3 (9) E s c i'% for a constant C not depending upon ‘2 and i. Proposition 1. If the family of distributions satisfies (Al+), then I _ = ' -0 Q (11) PiEdcpi til) o(1) as 1 uniformly in g; and, consequently, (12) n-1 2n P E(l¢' - W I) = 0(1) as n r P I i i i ; uniformly in .2. Proof. It follows from (9) and the fact that w; and $1 take values in [0,B] that for each fixed x, E(|¢i(x) - ¢i(x)l) S fi(x)AB where fi(x) = B(l +‘-£L§l-) m71(x) i-g, m(x) = inf {pe(x)|6 E Q} 17. and m(x) >‘0 under (Al+). Since Pe S*%%§% PB’ PiE(|¢i - lil) 5 a-g; PB(fi(x)AB)’ The right hand side does not depend upon _9_ and goes to zero by the dominated convergence theorem. We now use the Berry-Esseen normal approximation (Loéve, 1963, p. 288) to bound the integrands diSplayed on the right hand side of (5) for each x 2 1. The resulting upper bound for E(|¢i - Vil) is more tractable than that provided by (9). ~ As before consider Y = 6 - R, t 6 and let W = 111‘): j 3 (Yj - EYj)/(1 + Ri(t)). Since leI S 1, E(IWJ| ) S E(W§) and it follows with r:(t) = Var(2i Yj) that -3 i 3 (13) Li(t) = r1 (t) 21 E(|Yj - £le ) S (E: E(W§))-% =r;1(t)(l + Ri(t)). ' - ~ 2 Explicitly, r:(t) = 2;.{pj + R:(t)pj - (p - Ri(t)pj) }. Note that J - q = inf {1 - pelx 2 1, e e 0} > o, 230:) s (2539, 5. s BE p .. -2 1 a J g j SO With T2 = 8'3 + 48232) X 2 1) g g i ~ 2 i ' 2 2 i (14) q’EI pj + q Ri(t) El pj S ri(t) S I? 21 pj. Hence, i ~ -% i -% (15) Li(t) S (q 31 PJ.) + (q 21 9].) . The Berry-Esseen theorem implies that for a constant c s 31 p (16) EDP; "l1 2 t] S <1>(- Eli-('61 t) + c Li(t) 18. where 6 denotes the distribution function of N(0,1). We can 0 B so write for a > 0, J” o(-at)dt s a'lj‘ ¢('r)d'r = 2'1 a"1 j'PHxl 2 TJdT = -1 -1 0 '° 5 -s -1 2 a P(|X|) where X m N(0,1). Therefore, I ¢(-at)dt S (2") a . 0 Using this result and the upper bound in (14) yields - i B as P (17) I <1>(- +531 t)dt s -——%—§—11——g . 0 g 1 (2") g(l‘.1 Pj) We combine (15)- (17) to prove B 1 E, i. '25 ‘ 1 ~ ’35 (18) g EEcpi - Ii 2 tldt S D{(g T + 1)(21 pj) +031 pj) } for a constant D. Unfortunately, the problem is not symmetric and the other tail - S t 6 . For J~2 1‘ ) 1 33' we have 8 J ~ must be treated separately. As before let Z = 6 2 2 __ i 2-3 0 s t 5 pi, Si(t) - Var(21 Zj), and U - 5 g + B i 1 ~ 2 i 2 2 (19) q'21.pj + q Si(t) 21 pj S si(t) S U 21 pj. With Li(t) = s;3(t) 2i E(|Zj - Ezjl3) we have Li(t) s s;1(t) (1 + Si(t)) so I S i I”. "% i '% (20) Li(t) (q 31 pj) + (q 21 pj) and proceeding as in the derivation of (18) i , ‘g i -% 1 ~ -% (21) iEEcpi ~11 s -t]dt s D[(g~U + 1)(E1 pj) + (21 pj) }. l9. -% ~ Since U S T s 3%“5g + 26-2, (18) and (21) combine to yield 8 a - - (22) EM; - 4‘1!) s Diég + Dali pjf’i + (2; :3) 35} g for each fixed x 2 l and a constant D. Proposition‘g. If the family of distributions satisfies (Al), then at each. 2.6 0a (23) PiEdcp].L - wil) = 0(1) as 1 ~ .. ; and, consequently, -l n ' _ d (24) n 21 PiE(l¢i wil) - 0(1) as n a. Proof. We can write PiE(I¢£ - wil) S E(l¢i(0) - ¢i(0)l) + B Pi[x.> O] S C i-%‘+ B Pi[x.> 0] using (10). Therefore, if 2_ is such that 6i H 0 as i m w then PiE(|¢i - wil) = 0(1). Otherwise, limi 6i > O and for each x, E: p (x) d 9. Thus, (22) applied for j x 2 1 and (10) for x = 0 shows that at each fixed x, E(l¢'(x) - ¢ (x)l) * 0. Noting P S 2191 P the proof is completed Ni 1 i h(B) B by application of the dominated convergence theorem. Under the assumption (Al+) the bound (22) takes the form if - <25) En‘lt‘i‘vimlcp:-w.l>sc'n"5+n'n’35 2u+-§EL>%u) 1 1 x=1 g (x+1) for appropriate constants C' and D'. This motivates the follow- ing assumptions concerning the family of distributions: (A2) 2 p%(x) < on .x B and g(x) pB(x> % < (A3) 1; ( g(x+'1) + Proposition 3. If the family of distributions satisfies (Al ), (A2) and (A3), then -1 n n ._ = $5 (28) 21 19,32ch1 wil) 0(n ) 21. uniformly in Q, .EEEEE- The proof follows directly from (27). It follows from (2) that the bounds on the rates of convergence indicated in Propositions 1, 2, and 3 also apply to an(2,¢')l. In §5 we will bound the absolute modified regret of realizable procedures ¢ by bounding no1 2: PiE(I¢' - ¢|) and applying a triangle inequality. Before doing that we show that under (A1) and assumptions stronger than (A2) and (A3) the bound 0(n-g) holds for a procedure w". -1 n H .3- Bounds for n 21 Edcpi ¢il). We introduce the procedure ¢;(Xi) where g(2i'1 éj + 5'1 + 2: Cj} (29) (Pg-(X) = 1_1 A B ~ ' {21 63 + 51} Here C1, C2,... is any sequence of independent random variables satisfying lel S 1, E gj = 0, with (C1,...,Cj) independent of (X1,---,XJ,X') for all j 2 l and 02 = j i the prime on 6; is omitted and E denotes expectation with 2: E C? m ". Henceforth, respect to the measure induced by (Xi,...,Xi_1,Xi,C1,...,Qi). Bounds for n-1 2? PiE(l¢g(x)§ ¢i(x)|) will be evolved by bounding for fixed x and then completing the Cesaro expectation. The artificial randomization makes the variance of a sum divergent and permits the use of Survila's strengthening of the Berry-Esseen theorem (Survila, 1962) to develop the bound for fixed x 2 1. Theorem (Survila). Let U1, U2,... be a sequence of independent 2 random variables with PUi = O and 5i = PlUil3 < a, Define Sn = 22. n -3 n =3 -! 4o Var (21 Ui) and Ln n 21 §i. If Ln 0 as n then n 2 -1 IPEEI Ui S tsn]-¢(t)l S c (1 + t ) Ln for all n and t where C is a constant. Let x 2 1 and 0 S t S.B be fixed and the diSplay of de- pendence upon x suppressed. Then we can write " _ i _ _ i (30) EEcpi 4:1 2 t] s E[Z‘.1(Yj EYj) s 213le — ~ . . a ‘8' ‘ :3 '~ - where Yj : éj + Qj - Ri(t)6j’ Ri(t) ~g($1+t), _EYj p.1 Ri(t)pj, and ZiEYj = - g 2|le t. With TLi(t) and ri(t) defined as in (13) it follows from le - 13le s 2(1+Ri(t)) that -1 (31) Li(t) S 2 r1 (t) (1+Ri(t)). 2 2 ‘ ~ 2 ~ 2 Explicitly, ri(t) = Oi + zlh’j +IRi(t)pj - (pj ', Ri(t)pj) } so 2 2 i 2 2 i (32) Oi + q Ri(t) 21p S ri(t) S 01'. + T2 21p 1 J ... ..2 . - ~ where T2 = B E + 452 3 . Since 0% -' ° and R.(t) S 28 5', 8 g2 1 1 g Li(t) * 0 as i m 0 for fixed x 2 1,‘0 S t S B. Survila's theorem and (30) imply 75 31p "s miP 2 j -l H - 2 S - 1 + L + '--l-i-t . (33> Etwi 11 t] ¢< g rict)‘) c i(t) {1 (g ri(t) ) 1 The inequality ri(t) S Oi + T03; pj)% together with the technique culminating in (17) imply 8 ~ 21 {0 + mi 9.)}5} I¢‘(_.g__1-__plt)dtsg i , 1 J .1 .1 and another application Of (a + b);5 S a% + b335, a, b > O on T2 shows 23. B gzi p (34) f ¢(- fit) dt S :—-£C:j-— +—L—g+ (%L)% O 21 p1 g2t2)p1dt s a'1 , (32), Li(t) s 2 The inequalities 1$0 + a -l i - - {oi + (q lej )%:n and an upper bound for T show that the 0 to B t-integral of last term of (33) is bounded by 01 8 3 5% i (35) B{?(211'P p) 2 + (§+ E, + m: p > % . _ M3 + 1x21 p) *1 ~35 1 j 8 for some constant B. (In deriving (35) we take 01 = 1 so that o;1 S l for all i.) W The other tail can be treated analogously and the integral 1 I EE¢2 - *1 S -t]dt has a bound with the same structure as the 0 r bound (34) and (35) provide for the t-integral of the left hand side of (33). Therefore, the followinginequality is proved: 8 3 0 g 15 (36) MW: -1«l)S 31-3—42: p) "2' + (1%— +§g + 1) p3 g g (21 )’1 + (£35 + no: >35} 1 Pj s 1 Pj 8 for some constant B independent of .2, i and x 2 1. Inequality (36) is analogous to (22). Recall that Lemma 1 and assumptions (A1+), (A2) and (A3) were needed to deduce the consequence of Proposition 3 from (22). At this point we prove a lemma which is useful in applying (36). Let O*={9|OS6<°°,2x9xg(x)<°°}. If B>o is in the 24. * interior of Q then each PO’ 0 S B, has an exponential rate on its tail probability. Write * (A2+) B 6 interior 0 . (It is clear that (A2+) implies (A2) and the exponential rate on * ' . the tail probability. For if B < d, d E Q , then pB(m) S PBEX 2 m] = Z:Bxh(B)g(x) S %%%%C% )m P dEx 2 m] < £_%%%GB_)m for each m.) Lemma 2: Let 1 S Bi S i be an increasing sequence. If (Al) ; and (112+) :obtain, then n i S = (37) 21 PiEEI pj(x) Bi] 0(Bn log n) uni formly in _9_. Proof. For fixed x, En pi (x)[E1 pj(x) S Bi] S Bn so 21 PiEZip jp(x) S131] S mBn + 81' P. 1..[x 2 111]. Since the" family hasmonotone increasing likelihood ratio and [x 2 m] is increasing in x, PszmJSPEme]. But under (A2+) PBEmeJScrm where :3" c=h(LB&-%, r jg for some B < d, d E 0* Therefore, for any m 2 0 (38) Zn liPEZI pj(x) S Bi] SmBn+cn rm. » B Letting m be such that n rIn = Bn’ it follows that m = (log End/log r S (log n)/log r-l; and, hence, the result follows from (38). Before proceeding we digress to compare this lemma with Swain's Lemma 6 (1965, p. 31) Specialized to k = 1. To prove (37) for the case Bi = is, 0 < e < l, assumptions (i), (iii) and (iv) were used, (iii) being 25. x+ (in) gg(y) S M' < Q for all 4x and y. * Remark. (iii) implies Q is open on the right; consequently, (iii) implies (A2+). Proof. We have p31(m) Pe[x.2 m] = h-1(B) 2: Bx-m’h(5) g(x-m) - * A_g(x) S h 1(B) M' = M for all 9 E 0 and every non-negative g(m)g(x-m) (j) inte [ 2 “J - m’l p9 . . E m ger 111. Thus, Pe X - TTO (1 - m) implies P9 1(sz S r where r = 1 -'1711< 1. From the exponential rate on the tail probability * * it follows for any B E 0 there exists a d > B such that d E Q . For with .Q = s > 1, h(B) Em dxg(x) = P (sx) = Z sx p (x) S B 0 B 0 8 2% x , , '1 O (sr) and the last series converges Wlth s < r . Preparing (36) for application of Lemma 2 the trivial bound |¢2 - Wil S B and partitioning by [2: pj S i%] lead to ads. 1 a 01 i g .3? i -% (39) Ese£2 p.SiJ+B{—=——+ +1}(2 p.) i i 1 J g ‘ék 1 J for some constant B and all x 2 1. 0n [x = O] a bound like (10) holds for $2. (The artificial randomization did not change the expectation of the Y and increased the range of Y J J by a factor of no more than three. Then Hoeffding's Theorem 2 - BY, J) (1963) implies the result for m; as it did for @i). Therefore, (39) leads to -1 n ,, -% -1 n i .% (40) n 21 Pimlcpi - wil) s 0(n ) + a n 21 PiEEl pj s 1 ] 1 o ifB-ng) % -1 e 1 x n 1 -% + B.n g(x+1) + -§—£—L- + 1}}:1 pi(x)(21 pj(X)) - x=1 g (x+1) 26. Write I k m “‘3’ EE% 1’8“) < ' Proposition 3. If the family of distributions satisfies (Al), ' (A2+) and (A3 ) and W; is such that i 2 0: + °, then -1 n u _ -% (41) n 21 23:2ch1 - 4:1!) - cm > uniformly in g, ‘ggggf. The proof follows from application of Lemmas 1 and 2 to (40) and the fact (A2+) implies (A2) which together with (A3.) implies (A3). In the next section we introduce some realizable procedures and bound their corresponding modified regrets. '2. Bounds for Modified Rggret. * ** Define two procedures ¢i(Xi) and wi (Xi) by 81-15 (42) cp: = —z—i1———-1As g 1 61 . g Zi-l 61 (43) CP: *=(X) 2.1 1 }/\B , g{1+21 5j and abbreviate Ei-l 6j = Si-l' Since $i = W: = ¢:* = B on .. ~ , * — = 2 . ' - = l I [g 81-1 B g (Si-1 + 1)] we can write |¢1 ¢il l¢' - ¢*II l¢' - ¢**l = l¢' - ¢**II. It then follows'by consideration 1 i ’ i i i i ' of the values taken on by W; on [Xi = x], [Xi = x + 1], and [Xi ¥ x,Xi ¥ x + 1] that 27. > o] * lap! - cp.| S BES. = 01+-§-—{x2 = xJES. > 0] +~—3—-—[x3 = x + IJES. 1 1 1-1 81-1 1 1-1 1 1-1 3 81-1 and > o] ** lipi (Pi I S E3[Si_1 0] + L + )[xi x + 1][S, g 81-1 S1-1 1’ 1 +.-§——[x: ¢ x,x! # x + IJES. > 0]- S1-1 1 1 1-1 From these two inequalities we have u * u _ ** _ B (44) lap].L - cPil \/ lcpj.L 1?1 l S BESi-l - 0] + g—[si > 0] i-l +1—L-[XE = x+ 1][S, > 0]. g Si-l 1 1-1 -1 Lemma 3. For each fixed x and i 2 2 [si_1 > 0] 1 _1 s (45) .31-1( 3. > 4 (£1 pj> . i-l , Proof For each x [S > 0] 8-1 S 4 (S + 6' + 1)-1 —' ’ 1-1 1-1 1-1 1 ' Since (Si-1 + 6£‘+ 1).1 is convex in 81- + 6;, Hoeffding's 1 Theorem 3 (Hoeffding, 1956) implies E((Si-l + 61.+ 1)-1) S 23 (j + 1)-1(§) pj(l - p)i.j where i p = 3 This last i 1 Pj' inequality implies E((Si-1 +-5£'+-1)-1) S i/((i + l) 2: pj) S (2: pi).1 and completes the proof. + Proposition 2. If the family of distributions satisfies (A1 ), then (46) P.E(|w: - mT|> = 0(1), P.E<|w: - wf*l) = 0(1) 1 1 1 l 1 l 28. uniformly in .2; and, consequently, -1 n , _ * _ -l n , ** _ (47) n 21 rims, mil) - a(1). n 2, Pinclcpi - cpi I) — 0(1) uniformly in .9. Proof. For fixed x the E expectation of the right hand side of (44) is bounded by B expf- Ei-l pj} + 4B(Zi pj)-1 + i -l _ g_~ s 4B pi (21 pj) — gi where Lemma 3 and g p1 Bpi have been used. Weakening the bound to B expf-(i-l) m} + 8B 1-1 m"1 = fi where as before m(x) = inf{p°(x)|6 E O} > O, we have * ** h o 1 _ I _ S . Finder), Civil) v Piradcp11 (pi I) Lima) PB(fi A B). The ught hand side is independent of g_ and converges to zero by the dominated convergence theorem. Theorem 1. If the family of distributions satisfies (Al+), then * ** (48> Inn(9_,cp >l =- so), lungs: >l = om uniformly in ‘2. Proof. The proof follows directly from (2), Propositions 1 and 5, and a triangle inequality. Proposition 6. If the family of distributions satisfies (A1), then at each fixed 2.6 Qca * ' ** . (49> Pinclcpi - evil) = a(1). Pindcpi - cpi I) a(1). and, consequently, 29. 0(1). -1 n , _ * _ -l n , ** (50) n 21 Pimlcpi wil) - a(1). n 2, 21mm)i - cpi I) Proof. Consider the bound gi of the proof of Proposition 5. . ———- i . *> —o¢ -0 If .g is such that limi 61 0 then 81 pj and g1 0 so the dominated convergence theorem implies the result. Write P,E(|¢' - ¢*l) S E(l¢'(0) - ¢f(0)l) + B P [x.> O]. The term 1 i i i i i * E(l¢i(0) b ¢i(0)|) v 0 at any .2 since m(O) > 0, and, therefore, gi(0) * O. For 9i m 0, PiEx.> 0]'* 0 so the result is proved for * ** the procedure mi. The same proof works for $1 . Theorem.2. If the family of distributions satisfies (A1), then at each fixed §_6 Qon ' * *‘k (51) Inn<_e_,cp )l = om, lungs: )l = on). ‘ggggf. The proof follows from (2) and Propositions 2 and 6. Example 1 illustrates that in order to deduce rates of convergence it will be necessary to examine the bound (44) in the case the family satisfies conditions in addition to (A1) or (Al+). One such condition is 0 (A2 ) PB(X) < °°. I and notice that (A2+) implies (A2 ). Lemma 3. (a) Under (A1) and (112*), n'1 2'; 3131-1 = 01 = 0(n.1 log n) uniformly in ‘2. (b) Under (Al) and (AZ'), 11"1 zl'EiESi-l = 0] = 0(n-g) uniformly in .2. ll. (Ill-III: 1‘11] 30. n n . . = S 2 = Proof We write 21 EiESi-l 0] Kim 1 pi(x) Iii-112814 O] + 2111 Pi[x 2 In]. The monotone likelihood ratio property and the fact that the. pi(x)_13_i_1[si_1 = o] =.£i[X1 e x,...,Xi_1 # x,xi = x] are probabilities of disjoint events together imply (52) 2? P [s, = o] s m.+ n P [x 2 m] -1 i- B 1 for any 111.2 0. Under (A2+), PBEX 2 m3 S c rm for some constants c, O < r‘< 1. The choice m = - log n/log r proves part (a). Under (A2'), PBEX 2 ma S PB(X.)m.-1 and the choice m = n% proves (b). This lemma allows us to treat the expectation of the Cesaro mean of the leading term in the bound (44). The second and third terms are now investigated. [3 o] . > Lemma 2. Under (Al) and (A2), n-1 21; Pi(——i§-1--—- 1-1 ) = 0(n-%) uniformly in .2. -1 -1 % > S > ' o] 31-1 ([81-1 0] 51-1) , Jensen s 41 S i-l([Si-l 1-1) 2 (2: pj)-%. The proof is completed by application of Lemma 1. _ gEX' = x + 1][S _ >‘0] Lemma 6. Under (A1), n.1 En'P E( 1 ~ 1 1 ) = -———- 1 i g Si-l Proof. From [8, --- i-l inequality and Lemma 3 it follows that .2 > 0] S 0(n.1 log n) uniformly in 2, . -l n 9 Proof. The left hand Side can be written n 21 20 pi(x) 911 -l -l°°n2 > S pi(x+1) g(X) g .(x + 1)'£i-1([Si-l 0] 51-1) 48 n 30 31 pi(X) (2: pj(x))-'1 where use has been made of Lemma 3. The rate 0(n-1 log n) follows from Lemma 1.1. Proposition 1. If the family of distributions satisfies 31. (Al), (A2) and (AZ'), then .. * .. .. - (53) n 1 2‘1‘ Finds; - epil) = 0(n 35), n 1 2’1‘ 21mm; - a?) = 0(n A:) uniformly in .2. Proof. The proof follows from (44) in view of Lemmas 4(b), 5 and 6. Theoremug. If the family of distributions satisfies (Al+), (A2), (A2.) and (A3), then (54) anQ, 0] i 1 1-1 . = o] + g (s ) 1-1 i 1 1-1 + 5' i where T =2; Q 1 j' Since lwi - wgl S B we can partition by ' 1 [81 p S 1%] and weaken the bound of (55) to BEzi s 1%] + Bis = o] + 2 g lTil 1 Pj 1-1 g (si_1 + a; + 1) j [2: p1 > 1%]. The first two terms can be diaposed of by Lemmas 2 and 4. As indicated in the proof of Lemma 3, 132(81.“1 + 6; + 1)..1 S (E; pj)-1. Hence, the independence of T1 and (X1,...X1,Xi) together X with fact EITiI S 0 show that for each fixed X1 1 32. I'ril I Si_1+51+1 _E( i .% i -% . 8 . ' if 01 S 1. Therefore, under 0A3 ) the Cesaro mean of the overall expectation of this term is 0(n-g) in view of Lemma 1. We have proved: Proposition‘g. If the family of distributions satisfies (A1), 8 I . (A2+), (A3 ) and £1,€ is such that Ci S i, then 2,... '1 n I n - -% (56) n 21 Pi Eclcpi - epil) - 0(n ) uniformly in .2. Theoremué. If the family of distributions satisfies (Al), (A2+) and (A3'), then (57) Inn<_e_,cp*)l = oo fixed. I In 51.2 a necessary condition was stated for the existence of a compound procedure m satisfying — co lim.n sup {Dn(2,¢)l§ E Q } < ”. For squared error loss the necessary condition reduces to the 34. existence of a non-randomized procedure $1 in the component problem satisfying <58) sup {Peue - cpl)2)le e 0} < ... With the negative binomial family all [O,l]-valued procedures ml have risk functions uniformly bounded by unity. However, (58) illustrates the necessity of assumption (Al) or (Al+) in the Theorems 1, 3 and 4 for the Poisson family. For with this family and unbounded parameter set, (58) fails for any procedure $1 (Lehmann, 1950, p. 4-13). We sketch the proof. Letting P6( $1) = 0 + h(B) the Cramer-Rao inequality proves Pe((9 - $1)2) 2 {(1 + b'(9))2/ P6((§% log pe)2)} + b2(0) = 0(1 + b'(9))2 + b2(0) from which it follows that EEO-a P‘((O - cpl)2) = +00. As a matter of possible independent interest we give an example which illustrates that this property exhibited by the Poisson family need not hold for other exponential distributions on the non-negative integers with unbounded natural parameter spaces. Consider pe(x) =' 9x(x!)-%h(9), 0 < 9 < “, and m(x) = x%. Then Pe(¢) =6 and Pam - <62) = no) {2: x e“(x1)"’5 - 02 114(6)} = h(e){e 1+ 221.. (xl)-% - ((x-2)1)"5]ex} = h(e){e + 2: (x35(x-l)-35 - 1)((x.2)1)"’5 6"}. It is easy to verify that x%(x-1)-J5 - l S (x(x-1))-AE for x 2 1 so Pe((° - cp)2) s h(e){e + 2: (x1)'35 6"} = h(e)(h'1(a) - 1) < 1 for all 6. The family of distributions in Example 1 satisfies the assump- tions (A1+), (A2) and (A2.) and illustrates the necessity of adding * *1: another condition in order that the procedures w or w attain 35. any rate 0(an) uniformly in ‘2, an a null sequence. We now give an example of a family satisfying the hypothesis of Theorem 3 which shows that the bound is fairly tight. Example 2. Consider the exponential family with g(O) = -4 -3 -a-l g(l) = l; g(x) = x , x = 2, 4, 6,...; g(x) = x log x, [q,l], O < a < l, where a > 0 is fixed x = 3, 5, 7,... and 0 but otherwise arbitrary. It is not difficult to verify that (A2), I (A2 ) and (A3) are satisfied; Then at .9 =.l with n 2 2 and any procedure ¢ such that [¢i(x) S a] 2 [2%-1 6j = 0], _ -1 n . 2 DnQflP) ‘ n 21 21((1 -' (Pi) ) 2 n’1 2‘; 21((1 - @9233?“ SJ, = 0]) 2 (1 -002 31112;“ 63. = 01 2 (l - a)2 E: p1(2x + l)(l - p1(2x + 2))n-1 2 - 2 (1 - o) (1 - 13%))“ 1 21A p1(2x + 1) n where An = min [x|(2x + 4)4 2 n-l} S g(n-l)%. Comparing the series 2 p (2x + l) with the integral ‘r x-3 log A 1 ZA+1 a integration by parts that for A 2 1, 2A p1(2x + l) 2 % h(l) -a-l . x dx shows Via (a + 3)-1(2A + l)"2 log-a-1(2A + 1). Therefore, Dnfll;¢) 2 Kn-J5 log-3 n for some constant K.> 0 and all large n. CHAPTER III SEQUENTIAL COMPOUND ESTIMATION FOR SQUARED ERROR LOSS AND A FAMILY OF NORMAL DISTRIBUTIONS 1, Introduction. Let P be a family of probability measures on the measurable space (X,B) where X is the reals and' B the Borel G-field. With n denoting Lebesgue measure suppose that for each P E P, P < < p and p =-%§ is a differentiable determination of the Radon-Nikodym derivative. (Consequently, p is the ordinary deriv- ative of the cumulative distribution function.) Let ‘2 = (P1,P2,...) where P1 6 P and let Xi ~ Pi be a sequence of independent' random variables. For each x 6.x we are interested in estimating (1) u (x) = 9—(1og 81 p (x)) 1 dx 1 j ' For example, when P = {Pele E Q}, Q C1X, and pe(x) = k(9) £(x) eex, a Bayes reSponse versus G1 the empirical distribution of .21 takes the form X (2) 110:) = ui(x) - £381. Therefore, if ui(x) can be estimated then ¢i(x) can be estimated through (2). '2. Estimation of ui(x). Define the distribution function . -1x 1 (3) Fi(x) = 1 L 21 pj(v) dv where we have abbreviated du(v) to dv. Also for 1 2 bi + O 36. 37. and any distribution function F define <4) ti(F)(x) = hf log —-’5 The mean value theorem for integrals implies that under suitable conditions ti(Fi)(x) - ui(x) * O as i v “. This will be discussed later after investigation of the problem of estimating ti(Fi)(x) from the sample (X1,...,Xi). We make the following assumption: there exist functions Mi such that ,_ a (A1) sup {Icil) s 1 1 i -1 l 1 1 ‘ . 32V JX + h (2 TI 1) hi Fi x 1 4c M. (l + eZMi) + 1 q (i F1]: + hi)35 . Now we introduce conditions that will ensure that ti(Fi) - ui * 0 as i *'°. The conditions seem arbitrary but they are con- venient for reducing the notation in the bounds and are easily verified for a family of normal N(6,1) distributions. We introduce the notation f(6)(x) = sup {lf| l ly - xl s 6}. A basic assumption is that each p is twice differentiable at each x. We also assume that there exist functions A and B such that for all P (A2) Pé1)(X) S A(X) < ° and (A3) p'E1)(X) S B(X) < °°. As immediate consequences we have (13) Ipl s A IAI for all p, IAI s 1 and 41. (14) lp'| s a(x) IAI for all p, IAI s 1. x+A x+A 2 Also, f;p(v) dv = IA{p(x) + (v - x) p'(x) + g(v - x) p"(z)} dv x- x- for some 2 E {v,x} proves by way of (A3) that x+A 2 (15) l(2 A) 1 IAp(v) dv - p(x)| S-§&§%-A-’ for all p, O < A S 1. x- We write (16) Iti - ui(x)l s I + ch) + K where _ -1 x + h. _ , ; -1 I(x) - llog(hi Fijx 1) log Fi(x + 2h1)l hi J(x) = Ilog(h-1 F ]x ) - log F'(x - %h )| h-1 i i x - hi i i i and - '1 I X+%h i I K(x) - lhi (log Fi)]x _ gh: dx log Fi(x)l . The inequality llog a - log bl S (a.A b)-1|a - bl and the technique used to prove (15) result in B C h i I 24 F1 '(17) IVJS where the diSplay of dependence on x is suppressed and the function C is defined by 1n£[p(x + A)I IAI s 1} = @183. 42; In order to put K in a more tractable form let ci 6 (x,x + khi) and 61 E (x - £h1,x) be such that log Fi(x + khi) = khi H I I - I _ = (Fi (ei)/Fi (61)) + log Fi(x) and log Fi(x khi) khi(F2(61)/Fi(61)) - log Fi(x). It follows that IF" | IF' (e > - F'l IF' (6 > - F'| i i i i i i i 2K(x) S —--—--, —---——-, + -——-——---, Fi(x) F1 (61) F1 (61) + IF; (e1) - F; (x)! + IF; (51) - Fg(x>l Fi(€i) F1 (51) Assuming there exists a function D such that for all .21 6 P1, i 2 1 IF; (X)| (A4) TIES-— S D(X) < a we have (AD + B) C hi (18) K S 2 F; Combining (16) - (18) yields (AD+B)Chi (19) Iti - nil s I F1 3 x + h B(x) hi 0 O o o S ' Inequality (15) implies FiJX _ hi 2hi Fi(x) + 3 and the bound (12) can be weakened to % * - (20) P.(|t. - t.(F.)|) s k' e3M1{ C +9—9— + (—9—,)’5}1 35 -1 1 1 1 3 2 , k g , h F hi (F1) hi F1.- 1 i for some constant k' not depending on .2, i and x. Combining 43. (19) and (20) with the choice hi = 1-1/5 shows that for each fixed x, * (AD + B + 3% eBMQ c (c + c5123}: _1/5 - s ' ° (21) -31(|t1 nil) k { Fi + (F1)% } 1 for some constant k. The following observation is of use in finding the functions Mi. Repeated application of the law of the mean proves that . -1 = II I _ t1)(€1 61) “1 for some Bi 6 <61,ei), - - S 6i E (x hi,x) and Si (x,x + hi)’ hence, Iti(Fi)(x)l 2 D(Bi) and we may take Mi = 2 D(1). In the case of a family of probability measures P {Pele E O}, Q = [-a,a], where pe(x) = p(x - 6) we may take A = Pb? + 1) and B = pQQ +_1). g. A Decision Problem. Consider the family of normal distributions P = {Pele E Q}, Q = [-a,a], a < m, with P0 denoting the N(6,l) law. Let Xj ; P6j denote the usual N(0,1) density. The Bayes response (2) takes the be a sequence of independent random variables and p form ¢i(X) = ui(x) + x. Theorem.l. The sequential compound procedure * (22) 1210(1) = ((ti_1(Xi) + Xi) A01) V(-01) is such that -1/5 23 - = < > gidcpi wil) o<1 ) and (24) Inne,cp>| = 001'”) uniformly in 2. Proof. For each fixed Xi + 1 = x the bound (21) applies to 21(lt:(X) + x - 1100') with A(X) = p20! 4, Doc), 300 = p'fo, + DOC), C(x) s exp {le +oz + 1;}, D(x) = lxl +oz and 1110;) = 2(lxl +01 + 1). If we weaken the bound with F£(x) 2 pa(x) [x S 0] + p41(x) [x:> O] and use pi + 1(1!) 5 P_a(x) [x< 41/] + [lx' S 0'] + pa(x) [x>0'], it follows that .gi‘+ 1(|¢i + 1 Since Proposition 1.1 implies Pi + 1(HIi + 1(x) - ¢i(x)|) = 0(i-1) ‘ *il) = 0(1-1/5) uniformly in .9. uniformly in .2, (23) is proved. The bound (24) is now an immediate consequence of Corollary 1.1. Swain (1965) suggests a different sequential compound procedure and proves that the modified regret is bounded from above by a bound 0(1) uniform in parameter sequences. CHAPTER IV ON THE BAYES RESPONSE TO THE EMPIRICAL DISTRIBUTION OF OPPONENTS PAST IN SEQUENTIAL DECISION PROBLEMS IF‘ Introduction. In the preceding chapters sequential compound procedures were suggested naturally by the structure of a Bayes response W1 versus G the empirical distribution of player 1's 1) (nature's) past through present moves. The problem of bounding the modified regret associated with certain natural procedures w was reduced to bounding a Cesaro mean difference n"1 3: 21 ('mi - Wil). Inequality (8.8) of Hannan (1957) shows that the simple procedure ¢i(Xi) achieves a non-positive modified regret and illustrates why one would try procedures that approximate ¢1(Xi) in order to obtain small modified regret. In fact, with the sequence (set) compound problem investigators haVe used procedures which at stage i are Bayes versus estimates of G1 or G1-1(Gn) where estimable, as in the‘finite problem, or procedures which approximate $1 or ¢i_1(¢n) in case the distribution functions are not estimable directly. The question arises as to what can be achieved in the sequential problem.when an estimation problem is removed by assuming that G is known at stage is Usable sequence i-l strategies were exhibited by Hannan (1956) and (1957) for various M X N games (M finite) which achieve uniform rates of 0 (n-%) in absolute modified regret. The strategy is to play Bayes versus hi-l + Z where h is an unbiased estimate 1-1 45. 46. of (possibly G itself) and Z is a suitable random G1-1 1-1 vectorI Theorem.l of Van Ryzin (1965) demonstrates that in the finite problem direct play against E at stage i results in -15) i-l a uniform bound 0(n if there is a certain non-degeneracy in the estimate. In some finite problems a bound of the same order obtains whether the procedure used is either Bayes versus an unperturbed G1_1 or an estimate 51,1. For consider a 2 X 2 statistical decision problem where player I's set of pure strategies is Q 3 {0,1} with P = N(9,1), and the loss matrix is (g 3), 6 a, b > 0. A Bayes rule versus (l-p, p) is b P0(x) bpe(X) + 891(X) tp(x) 3': < p] a: [x > 35 + 108 (%?)1 where tp(x) a 1 means decide 9 a 1. Let 0(p) denote the Bayes risk vector. We claim that the Bayes response satisfies a Lipschitz condition of order a s l, (8.13) of Hannan (1957). (L) .. om . o O' - Condition (8.13) reduces to showing 0(9) L+t o 0 105%) =O(t) as t" O uniformly in p where 01(p) is the risk function 0(p) evaluated at 9 a i. These and 01(9) - 0 orders can be verified and then via Hannan's Theorem 5, direct play versus results in a non-negative modified regret G1-1 of order 0(n-110g n) uniform in player I move sequences. Van Ryzin's Theorem 5 (1965) implies that a bound of the same order obtains if the procedure plays Bayes versus any unbiased estimate hi-l that has a bounded kernel. Throughout the remainder of this chapter we assume that 47. Gi-l is available to player II at each stage i. In the next section we point out some results which apply to statistical decision problems having a certain structure. The problems of Chapters II and III possess this structure. In subsequent sections we exhibit sequence strategies for nonéfinite games and prove rates of convergence for the modified regret. 2. Games Involving Squared §£32£.222§° For each 9 E 0 let there correspond a probability measure Pe'<11_1 +1:i - 261)). In view 0f Theorem 1.1 and (l) the following prepoSition is immediate. Proposition _1_. If 0 C [-A, A], A < a and M(x) . sup [pe(x)|9 E 0} is integrable (X,B,u), then (2) 0 S Dn (22$) 3 0(n.1 log n) uniformly in .9. Herman (1957) uses symbolism 0(x) to represent a risk function Bayes versus x. Then 8 a(x) denotes that function evaluated at e, a pure strategy for player I in the component game. Specialization of Hannan's Theorem 5 to a s'l shows that 48. (3) €[0(x) - 0(x+te)] S L (ii-E- for a constant L, all 8, all t > O and all probability measures x with finite support, is sufficient for (2) to obtain. We give an example to show that (3) is not a necessary condition for (2) to obtain. Example I. Consider the discrete exponential family of Chapter II with Q a [a, b]. For squared error loss, and in Hannan's notation, x degenerate on a and 6 a probability measure degenerate on b, ap +bt p a b 2 pa+tpb e[0(x) - O(x%te)] . Pb<(a-b)2) - Pb(< .. 2 w) t 2 2 , where .£(t) a Pb((2pa pb + t pb)/(pa + tpb) ). By Fatou s lemma, lim .i Gilt) 2 2 Pb(pb/pa). For the geometric family 2 t and O < a S b < l, Pb(pb/pa) n-hn; and, therefore, (3) fails. Since the hypothesis of PrOposition l is satisfied, (2) obtains.. Another application of Proposition 1 is now given. Consider the problem where the component game has the following structure. Player I picks an 6 5 [0,1] and II picks a 5 6 [0,1] with loss (6 - 5)2. This is Hannan's Example 2 (1957) where it is observed that the Lipschitz condition (3) is satisfied for 0(-) with 6 0(x) s (e - 5(x))2, 5(x) being the mean of the distribution x. Therefore, 0 S n-IZ:€iG(Gi_1) - R(Gn) - 0(n-110g n) uniformly in .E' This result can be deduced from Proposition 1 in the following way. Associate with each 49. e 6 [0,1] the same probability measure P degenerate on 0. ._ dP -._ If p-— P, then p(€) : du(6) - [6 - OJ and ¢1_1(€) r (i-l)'12i'lej[e - O]=- 5(G a.s. P. Since the hypothesis of i-l) Proposition 1 is satisfied, (2) obtains with Dn(g,¢) = -l n 2 o - n 1 81 (G1_1) R(Gn)' We now proceed to the main results of this chapter. An effort will be made to keep the presentation self-contained although the basis for the results is found in Hannan (1957). 3. _G_at_ng With Countable .111. Let player I's set of pure strategies be M = {1, 2,...}, possibly finite. We identify each 5 of N, the set of pure strategies for II, with the risk vector G'= (01,02,...) ‘where L(j,5) = Oj 2 0 is the loss to II when I's choice is j and II's choice is 5. We identify each j E M, with the vector of 0's and 1'8 with 0's in all but the jth position and let 6 be generic for such a vector. Then 60 denotes the inner product of 6 and C for all 6 E M, 0 E N. For '3 = (61, 62"...) 6 Mo and g: (01, 02,...) 6 No. the following identity ((6.5) of Hannan (1957)) is basic: n _ n n _ (4) 21 €101 - Encn+1 + 21 Ei-1(Oi ' 01+1) + :1 61(01 C$11-11) where Ei.= €1+"'+€i.= 1 G1, Eo== 0. We assume (Al) N is sequentially compact under pointwise convergence and (A2) sup {HOHI I O E N}‘= B < m 500 where H “1 denotes the £1 sequence norm. It follows from (Al) that for all ‘w E m+, the set of bounded sequences with non- negative components, inf {w 0*l0* 5 N} is attained. For let waivinf {wG*IO*EN} and for each j, Oi'wdj as k"'°° where 0 = (01,02,...) 6 N. By Fatou's lemma, w0'= E: ijj S .limk 3: wjoik - lim1 w 01'= inf [w'0*|0* E N} so the infimum is attained. Denote an infimizing 0 by 0(w). We may take C(-) to be positive homogeneous and such that each component Uj(-) is a measurable function. (Let 81 be the Borel subsets of 4X1 = [0,1] and consider (X,B) ‘where .X = ;.Xi and B is the product O-field. Then the 0j(o) may beltaken to be measurable functions from (X,B) to the reals.) Letting n denote the distribution of Z = (Z1,Zz,...) where the 2k are independent and identically distributed uniform [0,1] random variables, we investigate the randomized sequential procedures (5) 01 = C(Ei_ + 11101 2) 1 with the sequence of constants Hi satisfying (6) . 0 < Hi + and i H;1 + with respect to 1. Theorem 1. Assuming (Al) and (A2) and with H; = 1%, the --—--- 1 procedure (5) results in -l n _ '5 (7) In 11031 6101) - R(Gpl - 0(n ) 51. uniformly in g, 3522;. Writing Ei(01 - 01+1) 2 (E1 - (Ei + H1 2))(0i - 01+1) we have (8) 2:1! E:imi ' 01H) 2 "211‘ Hi 2(01 ' CHI) = Hn Z On+1 ' Ho 2 61 ' EI(Hi ' Hi-l) z 0i 2 -B H0 - Z‘l‘m1 - Hi_1)B = -B H“. Also, En on+1 - n R(Gn) ==En(0n+q - d(En)) 2 0 so (4) yields (9) a: e1 01 - n R(Gn) 2 -3 H“. In order to obtain an upper bound we note that Enon+l - n R(Gn) S - O' I- ’ -0 S- .- Hn Z( n+1 0(En)), and, similarly, Ei_1(01 1+1) Hi-l 2(0i 01+1) so that summation by parts and (4) imply (oi - 0 1+1)' n - S n (10) 21 6101 n R(Gn) B Hn + 21 e1 The expectation of the last term is bounded by a direct extension of Hannan's Lemma 2 (1957). Here we state and prove the needed specialization of that extension. Lemma 1. Under the assumptions of Theorem 1, (11> |u°i(w + Z) - 1101(w' + z>| s Bllw - w'll1 for each i and all w, w' in mi. a Proof. With X = [0,1] we write 52. n0i(w' + 2) = jkoiw- + z)d11(z) =1 qi(w + v)dV(v) M where v = nT-l is the measure induced by the transformation T defined by v = T2 = w' - W'+ 2. Therefore, pol(w + Z) - p01(w' + Z) S I 01(w + z)du(z) - I 01(w + v)dV(v) X XOTX and, since the restrictions of 11 and v to X F) TX are equal, nOi(w + z) - pOi(w' + 2) s B n(x- 110. Since 2 6 X - TX if and only if zj 6 [0,1] for all j and zj - w'j + w1 t [0,1] for some j, it follows that u(x - TX) s 8°;ntzj < w'j - wj or zj > 1 + w'j - wj] s E:{(w'j - wj)+ + (wJ - w'j)+} S “w - w'Hl. The proof is completed by interchanging the roles of w and w'. , _ -1 , = -1 We apply the lemma Wlth w - Hi-l E1_1 and w H1 E1. With this specification 6 8 -1 i -1 i -1 j *1 J - ' = - - “W W H1 H1 E1 H1-1 E11-1 + 1;: (Hi-1 E1-1 Hi Ei) i C 6 _ -1 -1 -1 i 1 -1 -1 ' Hi + (Hi " Hi-1)Ei-l + (i ' 1 ' Ei_1)(Hi_1 Hi ) - C S S Hi + (i 1)(Hi-l Hi ) 2 Hi . . -1 - 3 Therefore, the lemma implies ”(Si-(Gi Ui+1))| 2 3 Hi which proves via (9) and (10) that 53. n n -l - S - (12) B Hn 11(21 eioi) n R(Gn) s B{Hn + 2 21 Hi }. The choice Hi = i35 proves the theorem. 4. A Theorem for Non-Finite W. Let L 2 0 be a loss function over the cartesian product M X N* of arbitrary spaces M and NI, In the component game player I chooses an element 6 E M, II picks a O E N = [L (°,6)l6 E N*} and the loss to II is 60, 60 denoting 0 evaluated at e. If w is a discrete measure putting mass w on 61 E M. then wO denotes 81 w1(610). Assume that for i all discrete measure w with finite support in M, (Al') inf {w 0'0 E N} is attained. As before we denote an infimizing 0 by 0(w) and take * 0(o) to be positive homogeneous. (If N is a compact topological Space and each section L(e,-) is continuous * on N then (Al') is satisfied.) * Consider the game where M.= N = [0,1] and L(€,6) = Ie - 5'. In the sequential version a strategy ‘O’ With ‘¢i = 0(Ei-l)’ E1 = i G1, where G is the empirical dis- 1 tribution of (61,...,€i), is such that suprnQfimlg 6 [0,11% 2 % (Hannan, 1957, p. 130). This exemplifies the need for artificial randomization as was the case in the preceding section. However, here the measures E1_1 have support changing with ‘g and 1; since the union of all possible supports is M, uncountable, it is not clear how to add the randomization 54. in order to perturb the measure E This difficulty will i-l' be overcome by embedding a countable set in M. A basic assumption is that (AZ') sup{€0|€ E M, O E N} = B < a. We define the real valued function (13) d = sup{|ec - e'ol | o 6 N}. Clearly (M,d) is a pseudo-metric Space; and, if loss equivalent player I moves are identified, it constitutes a metric Space. Let Ji 2 1 be a non-decreasing integer valued sequence, = C = ° A {a1, a2,...} M and A1 {a1,...,aJi}. Corresponding to G each sequence .3 = (€1,ez,...) E M. there is a sequence I E} = (€1,62,...) where 6i is an element of A1 closest to I 61 in the metric d. We let G1 be the empirical distribution I I of (61,...,€i) and 20, 21, 22,... be independent and identically distributed uniform [0,1] random variables. We investigate the sequential procedure (14) Oi = 0031-1 + H1-1 E~1-1) where H1 satisfies (6) and ‘21 is to be interpreted as the measure placing mass Zj on aj, j = 1"°'Ji' Let u denote the distribution of (Zo,Zl,Zz,...). Remark. Let {a1,...,aj}<:‘M. be fixed but otherwise arbitrary and let W = {(w1,...,w:)l 0 S wi‘< 9, i = l,...,J} 55. be the set of finite measures with support in {a1,...,aJ}. Then Oj(-) is a function from. W, considered as a subset of Euclidean J-sPace,:to the reals. If {(a10,...,aJU)IO E N} is a compact subset of Euclidean J-sPace, there is a determination C(o) such that each component 0j(-) is measurable. In the case of countably infinite support, we noted in the preceding section that sequential compactness under coordinatewise convergence and I! boundedness for the set of risk vectors is sufficient for measurability. Theoremwg, Under assumptions (Al') and (A2'), and with each component of Oi measurable, n n' -l - s (15) lp(£1 e101) n R(Gn) I Bth Jn + Jn + 2 81 Hi } n I Proof. A bound for the left hand side is provided by anl + lCnI +'an| where - n I I Bn ‘ ”(21 6101) " “ R(Gn) n I Cn = "(21 (6101 ' 6101)) Dn = “(R(Gn) - R(Gn))0 Clearly 2"1‘(e'o11 - 6101) s 2“ (mi, 6 b and n(R(Gn ) - R(Gn ))= n info 21 sic - info 81e1051nf021(e;0 + d(e. , e £>> - n I info 21 :10 = 21 d(ei’€i) so 56. (16) Icnl V Innl s 2‘; d(ei, all). The identity (4) yields _ ' I n I (17) Bn - ME 0 > - En own) + 11031 Ei-l(oi - o )> n n+1 i+1 + “(2? 6301 " °i+1))' Since 8: Eiwi ' Cm) 2 ‘2111 Hi 51(01 ' c3+1) = Hn §n°n+1 " Ho 50°1 " trim]: 51 "’ H11-1 51-901 2 '3 H0‘10 " 2111(H1Ji ' Hi-lJi-l) B = -B 11an and E&(On+1 - 0(Eé)) 2 0, it follows that (18) 2‘; Bio: - n R(Gt'l) 2 -B Han. As in the derivation of (10), (19) 211‘ 6:101 - n R(sn) s B 11an + 2‘; 611631 - 01H). ' = 'U-O'IS For those 1 such that J we have |u(€i( i 1+1))l 1 Ji-l 23 H"1 by application of the Specialization (9.8) of Hannan's i _. '1 I I: '1 I Lemma 2 (1957) with w — Hi-l Ei-l’ w Hi E1. Therefore, n n -l n (20) Inc: e'(0, -0, ))l $2 213 H, +7... B 1 i 1 1+1 _ 1 iIJ - J iIJ. > J. i i-l 1 1-1 n -l S 57. and (17) - (20) combine to yield (21) -B H J S B S B[H J + J + 2 En HTI}. n n n n n n l 1 In view of (16) and (21) the theorem is proved. If the set, A is dense in the metric space (M,d) with d(€i’€i) + 0 as Ji 1 fl uniformly in .g, then a balance in the bound of (15) can be obtained by choice of the sequences Hi and Ji' We note that in the finite M problem with Ji 35 equal cardinality of M for large 1 and Hi = i , (15) shows that the expected modified regret resulting from the procedure (14) is 0(n-%) uniform in 5, We now apply Theorem 2 to a problem involving uncountable M. Example 2, Consider the game of absolute deviation on the unit square. Here d(e, e') = sup{| It - 5| - '6' - 5| I I5 6 [0,1]} = '3 - 3", and we let I-Ii = 18, J1 equal the greatest integer in - l, where a, b 6 (0,1) are yet to be specified. For the H u L. I 0 set A = {a1,a2,...} we take the points 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8,...; that is, ai = bkj where 1 = 21"1 + j - 1, bkj = (2j - l) Z-k, 1 S j S 2k-1, k 2 1. With this choice 1 1 S 2(ib - l)- ; and, hence, it follows from I S - d(€i) 61) 2 J1 (15) that n a+b b -a - s f" |u