w , * I

1
\
l

i
x

1

 

APPROXEMATEON TO SAYES RISK EN SEQUENCES
OE NDN-FEMTE DECESION PROBLEMS

Thesis for 1%: ﬁagrm ci’ Ph. D.
MICHKGAN SE‘ATE UNEVERSITY
Dennis Crippan Giiliiand

129636

 

LIBRARY

Michigan 5a.
Um‘ "r7

 

This is to certify that the
thesis entitled
APPROXIMATION TO BAYES RISK IN SEQUENCES

OF NON-FINITE DECISION PROBLEMS

presented by

Dennis Crippen Gilliland

has been accepted towards fulfillment
of the requirements for

PhoDo ‘ degree in Statistics
and Probability

WW

J Major professor

 

Date August 10, 1966

 

0-169

ABSTRACT
APPROXIMATION TO BAYES RISK IN SEQUENCES OF
NON-FINITE DECISION PROBLEMS

by Dennis Crippen Gilliland

Consider a statistical decision problem where X N P9, 9 E Q,
and an action A E A is to be taken with A allowed to depend
upon the random variable X. A loss function L 2 0 is defined
on Q X A, and a non-randomized procedure w incurs a risk
R(9,¢) s‘fL(9,¢)dPe. Suppose this decision problem occurs n
times with '9“ = (0 .,3n) 6 On and lg“ n (X1,...,Xn) ~

1".
P6 X...XPe . A strongly sequential compound rule 3 =~(¢1,¢2,...)

l n

is such that for each i, wi is the means by which the i'Eh
action is taken and L(9,¢i) is -§i measurable for each 9.
The compound risk up to stage n is taken to be the average of
the component risks, Rn(§’$) = n'12: R(§i,¢i) where

.2 - (91,62,...). The modified regret Dn(£J§) s Rn(§’2) - R(Gn)
with Gn the empirical distribution of 3“ and R(G) the
Bayes risk at G in the component problem has been used as a
standard for compound procedures.

The results of the thesis show that an(§J2)|1s 0(n-g)
uniformly in 2_€ waor squared error loss, certain discrete
exponential families including the Poisson and negative binomial,
and realizable sequential procedures Q, Bounds like 0(1) and

0(1) uniform in ‘2 are obtained for larger classes of discrete

exponential families. A problem.involving normal N(9,1) dis-

Dennis Crippen Gilliland

tributions is also investigated with a sequential compound pro-
cedure 2 demonstrated such that an(§,$)l = 0(n-(1/5)) uniformly
in ‘2.

In the last chapter sequence strategies are exhibited which
at each stage i depend upon Gi-l and which achieve various
rates of modified regret depending upon the exact structure of
the component game. Sequence strategies were given by Hannan
((1957). Contributions to the Theory of Games, 3, 97-139. Ann.
Math. Studies No. 39, Princeton University Press.) for various
M X N games with M finite. These procedures result in an
absolute modified regret 0(n-5) uniform in player I move
sequences. We give a direct generalization to the countable

M case. Also a theorem is proved with applications in the

uncountable M case.

APPROXIMATION TO BAYES RISK IN SEQUENCES OF

NON-FINITE DECISION PROBLEMS

By

Dennis Crippen Gilliland

A THESIS

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY
Department of Statistics and Probability

1966

ACKNOWLEDGMENTS

I am indebted to Professor James F. Hannan for the introduction
to compound decision theory and for the guidance he gave to my
thesis work. His comments and suggestions led the way to theorems
and simplified proofs. Above all this, I deeply appreciate his
efforts in teaching me many aspects of mathematical statistics.

Thanks are also due to Professor Herman Rubin for many use-
ful discussions and to Professor Esther Seiden for the original
encouragement to do graduate work in mathematical statistics.

The financial support provided by the Department of Statistics
and Probability, the National Science Foundation, and the Office
of Naval Research made my graduate studies possible. In this
regard I also wish to thank Goodyear Aerospace Corporation for

the leave of absence for education.

ii

TABLE OF CONTENTS
CHAPTER

I. INTRODUCTION TO SEQUENTIAL
COMPOUND DECISION PROBLEMS . . . . . . . . . . . . .

1.1 The component problem . . . . . . . . . . . . .
1.2 The sequential compound decision problem . . .
1.3 A useful theorem . . . . . . . . . . . . . .

1.4 A bound for the modified
regret under squared error loss . . . . . . . .

II. SEQUENTIAL COMPOUND ESTIMATION FOR SQUARED ERROR
LOSS AND SOME DISCRETE EXPONENTIAL FAMILIES . . .

2.1 Introduction . . . . . . . . . . . . . . . . .

2.2 Review of known results and an example . . . .
-1

2.3 Bounds for n 2: E(|¢i - lil) . . . . . . . .

-1 n H _
2.4 Bounds for n 21 E(Icpi Wil) . .

2.5 Bounds for modified regret . . . . . . . . . .
2.6 Examples . . . . . . . . . . . . . . . . . .

III. SEQUENTIAL COMPOUND ESTIMATION FOR SQUARED ERROR
LOSS AND A FAMILY OF NORMAL DISTRIBUTIONS . . . . .

3.1 Introduction . . . . . . . . . . . . . . . . .
3.2 Estimation of ui(x) . . . . . . . . . . . . . .

3.3 A decision problem . . . . . . . . . . . . . .

IV. ON THE BAYES RESPONSE TO THE EMPIRICAL DISTRIBUTION
OF OPPONENTS PAST IN SEQUENTIAL DECISION PROBLEMS .

4.1 Introduction . . . . . . . . . . . . . . . . .
4.2 Games involving squared error loss . . . . . .
4.3 Games with countable M . . . . . . . . . . .

4.4 A theorem for non-finite games . . . . . . . .

BIBLIOGRAPHY.....................

iii

PAGE

10
10
11
14
21
26

33

36
36
36

43

45
45
47
49
53

58

CHAPTER I
INTRODUCTION TO SEQUENTIAL COMPOUND

DEC IS ION PROBLEMS

The notation necessary to describe the mathematics of compound
decision theory is necessarily cumbersome. To reduce the complexity
we will.find it convenient to suppress the display of some dependen-
cies and to use Operator notation. For example, PX, P(X) or
P(X(w)) denotes the integral fXKw) dP(w). Another reduction in
notation is accomplished by letting square brackets denote the

indicator function, [A] being the indicator function of the set A.

1, The Component Problem.

The component problem has the structure of the usual statistical
decision problem. Let 0 denote a parameter space indexing a set
of probability measures {Pele 6 Q} and let IA signify the action
space. A real valued loss function L 2 0 is defined on O X‘A.
The action A is allowed to depend upon the realization x in the
measurable space (X93) of a random variable X distributed
according to P6“ A randomized decision function w is for each
x a probability measure V¢(X) over some C-field of subsets of A

such that L(9,¢(x)) = v (L(9,A)) is (X,B) measurable. The

¢(X)

risk or risk function associated with w is the expected loss

resulting from the use of w: R(6,¢) = P9 (L(9,¢(x))).

2, The Sequential Compound Decision Problem.
Suppose the decision problem described in 51 occurs n times

n _ .
.,9n) 6 Q and £5." (X1,...,Xn) distributed

with ‘Qn f (61,..

according to En = P1 X ... X Pn where we have made the identifi-

cation Pi = P6 . Using the terminology of Hannan (1957, p. 105)
i

we make a distinction between weak and strong sequence games. In
the weak sequence game n is known to player II at each component
but not in the strong sequence game. Throughout this paper only the
latter situation is considered. A strongly sequential compound rule
w = ($1,¢2,...) is such that for each i, mi = ¢i(xi) is the means
by which the AER action is taken and L(9,wi) is ‘51 measurable
for each 9. The risk in the ith component problem is R(ﬂi,¢i) =
‘£i(L(ei’¢i))' The total risk in the sequential problem up to stage
n is taken to be the average of the component risks, Rh(§,m) =
n"-1 E? R(gi’wi)’ where .9 denotes (91,92,93,...).

If L(0,¢i) is xi measurable for each i we say m is
a simple rule. If in addition the mi are identical, mi = $1,
we say w is a simple symmetric rule. Simple symmetric rules are
traditional and result in Rn(§,¢) = n-1 2: R(Oi,¢1). This is the
same as the risk in the component problem resulting from the use of
the procedure -¢1 if 6 is assumed to be the realization of a
random variable distributed according to Gn the empirical distri-

bution of ﬁn. With R(G) denoting the Bayes risk versus G in

the component problem the above observations lead to consideration of
' = a -
<1) Dump) Rn(_,cp) men)

as a standard for evaluating compound procedures. Loosely speaking,
if Dn(§,¢) S Bn = 0(1) as n‘v “, then in the limit w incurs a
compound risk no greater than R(Gn). Dn(§,¢) is called the
modified regret.

Later we will prove theorems concerning rates of convergence

of IDn(§,¢)I for particular families of distributions and

procedures. At this time we point out a necessary condition for
the existence of any compound procedure w satisfying

TEE“ sup {Dn@_,cp)l2 E 0”} < °°. It is the. existence of a procedure
$1 in the component problem.satisfying sup {Pe(L(6,m1)) -

ianL(6,A)|6 E O} < a. This follows from
sup {Dn(§,<p)|£ 6 00°} 2 sup {Dn@,cp)l§_ E diagonal 00°}
-1 . . m
= sup {n E: §i(l(ei,¢i)) - ianL(31,A)l§_E d1agona1.Q }
2 n‘1 sup “’9‘ L(6,cp1)) - ianL(6,A)|9 e a}.

3. A Useful Theorem.

Let [Pele 6 Q} be a family of probability measures on a

measurable space (X,B). Assume 0‘: E the real numbers, and

1:
Po << n for all 6 where u is a O-finite measure. With squared
error loss and a priori G on (I, a Bayes reSponse versus G is
provided by any determination of the conditional expectation of

0 given X =‘x. With ¢i generic for a Bayes reaponse versus

Gi’ the empirical distribution of 21’ W0 = 0, we are interested in
Pi<l¢i<x> - ii_1(x)|) and n'1 2? Pi<lwi<x> - ¢i_1<x>l>; particu-
larly, in conditions under which there is convergence to zero

Q
uniform in parameter sequences .2 E O . To simplify the notation

Y = -
In what follows pe is a determination of EE—I and any ratios

O/O will be interpreted as 0. A Bayes response versus G1 is

(2) ¢.<x> = [ti pj(x) > o].

The following example shows that for this response and a fixed

bounded sequence .2, n-1'Eq'Pi(lYi($)l) need not converge to zero.

Example I, Let p be counting measure on {2,3,...} and
6 6 Q = {1/2, 2/3, 3/4,...} index Pe degenerate on (1 - 9)-1.
Then with Gj = j(j + 1)-1, we compute ¢i(x) = Eg+l (j - 1)j-1[x = j]
so Yi(x) = i(i + 1)’1 [x = 1 +1] and Pi(l‘*’i(x)|) = i(i + 1Y1.

i 2 1. Hence, n-1 8? Pi(lYi(x)l) d l as n r “.

However, in TheOrem l are given conditions sufficient for convergence
at the rate' 0(n7110g n) unifonm in .2.
Theorem 1:. If 0 C [-A,A], A < °° and M(x) = sup {pe‘(x)|9 6 O}
is integrable (X,B,u), then $1 given in (2) satisfies
'1 2n [Y I "1 . o e
n 1 Pi( i(x) ) = 0(n log n) uniformly in ._.

The following lémma is needed.

_ n 2 i -1 n .-1
Lemma‘l. Sn(a1,...,an) — 21 aid:1 aj) S E 1

1 = sn(1,...,1)

for all 0 S ai S l, 1 S i S n, n 2 1.

P f L t A - 2i a th t S = 2n a2 A.1 «335 =
_£22_° e i ’ 1 j so a n 1 i i ’ Ba

2 -2
s
(2anAn_1 + an) An 2 0. Therefore, Sn(a1,...,an) Sn(a1,...,an-1,l)

for all 0 S ai S l, 1 S i S n. Also, for 1 S k:< n

 

'353 l l) - (2 A + 2) A.2 - Zn (A + °-k)-2
aak(a1,...,ak, ,..., - ak k-l ak k k+1 k 1
2
a S
n _ 2 -3 n ,_ -3 2
532 (a1,...,ak,1,...,l) - ZAk-IAk + 2 Ek+1 (Ak.+ 1 k) 0.
k

Since the second partial is non-negative and

k-1H2 . -1
k-1a2 -1 n :

we have Sn (a1, ..,ak,1,...,1) S Sn (a1, ..,ak_1,1,...,1) for all
0 S ai S 1, 1 S i S k. The proof is completed by backward induction

on k.

Proof pf Theorem 1. From (2) it follows that

91(X)13i'1(°i - ej) pj(X)
3i pj(X) 3: pj(X)

i-l

[21

 

(3) ‘l’i(x) = pj(x) > o]
+ BiEEi'l pj(X) = 0, pi(X) > 0]

so if Q C [-A,A],

2A P (X) . . _
IY. (x)!S -=——-i-—x- [21‘1 p.(x) > 01+ ADS“1 p (x) = o, p.(x) > o]
Pj(x) 1 J l j .1
and
912(X)
(4) P. 11‘” <x>l> s 2A 11(--—-(--)-[2 I; 1 p. (x) > 0])
X

+ A u(pi(X) [21'1 p (x) = 0, pi<x> > 01)

j

for i 2 1. Therefore,

(p. <x>/M<x>)2 )
:1 (pj(x)/M<x)>

 

(5) n"1 2’; ridwiml) s 2A n" ucncx) 2“

+ A n-1 u@? pi(x)[E]I'-1pj(x) = 0, pi(x) > 0]).

Lemma 1 implies that the first term on the right hand side is bounded

by 2A 11(M(x)) n’l 2‘1‘ i'l. Since M(x) 2 2‘1‘ pi(x)[2‘1"1 pj(x) = o, pi(x) > o]
the second term is bounded by’ A u(M(x)) n-1 and the theorem is
proved.

We give an example to show that the bound on the rate of conver-

gence indicated in Theorem 1 is tight.

Example 2, Consider the family of geometric distributions with
densities pe(x) = exu - e), x = o, 1, 2,... ; 6 e a = [0,351 with
respect to counting measure. At §'= (0,%,0,%,...), ¢Zi(x) =
(1/6)[x = o] + ab: > o], $21400 = 35(1 - 1)(3i - 1)'1[x = o] + 25b. > 0]
so Y21(x) = (1/3)(3i - 1)-1[x = 0].

Hence,

2n n . -1
21 Pi(|Yi(x)l) 2 21 (1/6)(31 - 1)

2 (1/18) E: i”1 2 (1/18) log n 2 (1/36) log (2n)

for n 2 2.
The hypothesis of Theorem 1 need not imply Pi(IYi(x)|) r 0 as
i r ” uniformly in '2 no matter what determination of the conditional

expectation W1 is used.

Example‘g. Consider the family of distributions of the preced-
ing example. For C degenerate on 0, ¢G(x) = a(x)[x.> O] for
some function a. Let b 6 Q be any number such that b ¥ 0 and
b ¥ a(1). Define .§(i) = (0:1), Oéi),...) where Ogi) = b or 0

according as j = i or j # i. Then at Eli), ¢i(x) =

b(1 - b)(i - b)‘1[x = o] + tab: > o], ¢i_1(x) = a(x)[x > 0] so
Pi<lYi<x)l> > lb - a<1>| PiEX = 1] = lb - a(1)l bu - b).

The next prOposition shows that there is convergence as i r ”
uniform in .g for a family of normal distributions. This result
will be needed in Chapter III where a decision problem.involving

this family is discussed.

Proposition 1. Consider the family of N(9,1) measures,
(ICZ [-A,A], A < “. Let p9 be the continuous normal density with
respect to Lebesgue measure n. For the determination of hi
given in (2), Pi(IYi(x)l) = 0(i-1) as i r a uniformly in .Q.

Proof. Inequality (4) implies
.-1 2 -1
<6) ridges!) s 2A 1 {u(p_A(X) pA‘ (x) [x < -AJ>
-1 -1 2 -1
+ (2n) B uE-A S x S A] + u(pA(x) p_A(x)[x >'A])}

where B = min {pe(x) I (x,6) E [-A,A]2}. Since the factor of (6)
appearing in curly brackets is finite and does not depend upon .2

the proof is complete.

4. A 3.3.9329. £9; the Modified Regret 233g Squared Error Loss.

The loss function L(9,A) = (Bl-A)2 is continuous and convex
in A for each fixed 9 so only non-randomized procedures need be
considered. For the component problem this follows from an appli-
cation of Jensen's inequality as is well known. It is true for the
compound problem since the compound risk is monotone increasing in
each component risk. A non-randomized procedure O has modified

regret

-1

_ n 2
(7) Dn(9_,<p> - n 81 giapi - 61> - R(cn).

Inequalities (8.8) and (8.11) of Hannan (1957) show that

(8) n'1 2‘; Piwi - 6);)2 s R(Gn)
and

-1 n 2
(9) n 31 Pi(¢i-1 - Oi) 2 R(Gn).

Inequalities (7) and (8) imply

l

(10> Bum) s n‘ 2‘; gimpi - wimpi + ii - 261))

while subtracting and adding n-1 E? Pi“!i - 61)2 from the lower

bound resulting from applying (9) to (7) proves
-1 n
(11) Dues) 2 n 21 giucpi - ii><<pi + ii - 291))
-1 n
Y -
+ n 21 P1( i<¢i + ¢i_1 261)).

Assume Q‘:»[-A,A]. Then W1 takes values in [-A,A] and, therefore,

if wi does, (10) and (11) imply

-1 n -l n
(12) -4A n 21 PiinI) - 4A n 21 gidcpi - til) s

A corollary to Theorem 1 is immediate.

Corollary‘l. Under the hypothesis of Theorem 1 if w is a

procedure taking values in [-A,A], then

1

(13) Inn@,<p)l s 4A n' 2221(ng - til) + our1 log n)

where 0(n"1 log n) is uniform in _9_.

CHAPTER II
SEQUENTIAL COMPOUND ESTIMATION FOR SQUARED ERROR LOSS

AND SOME DISCRETE EXPONENTIAL FAMILIES

1, Introduction.
Consider the discrete exponential family of probability measures

on the non-negative integers

Pe(x) = 9x h(9) g(x), x = 0, l, 2,... ; 0 E Q
where g(x) > O and
(A1) 0 = [0,8], 0 < B < m

with B assumed known. (The condition g(x) >‘0 is assumed by
Samuel (1965) and is implied by assumptions of Swain (1965, p. 25).)
Throughout this chapter we will be concerned with the Bayes response

(1.2) which for this family takes the Special form

g(x) 3: pj(x+1) = 813% 151

 

<1) wise) = 21 ~ 1
g(X+1) 1 pj(x) g 21 Pj
where we have chosen to suppress the display of dependence upon x
and f is f evaluated at x+l. Any ratios O/O should be
interpreted as 0.

Since p (x) S239z p (x) for all x and 0 S 6 S B, the

9 h(B) B

hypothesis of Theorem 1.1 is seen to be satisfied under (A1). There-

fore, Corollary 1.1 implies that for any procedure taking values in

[0,8]

(2) lumen” s 28 n'1 8‘; gidcpi - til) + 0(n'11og n)

10.

11.

where the 0(n.1 log n) is uniform in ‘Q. In (2) we have made an
adjustment in constant apprOpriate to the situation O<3 [0,8].
However, for the most part we will be content to obtain bounds in
terms of order and will not attempt to keep track of the constant.
We will be explicit enough so that a constant can be recovered if
anyone is so interested. In subsequent sections we introduce proce-
dures and use (2) to obtain a bound on the rate of convergence of the
absolute modified regret. Among other things it will be shown that
if the family satisfies conditions in addition to (Al) then a bound
of order 0(n-g) exists. Poisson and negative binomial families
satisfy the conditions. Before doing this we review or reference

what has been accomplished by others who have attacked the problem.

‘2. Review.g£ Known Results and 33 Example.

Essentially, Samuel (1965) imposed the assumption
+
(A1 ) Q = [01,81

where a > 0 and B are known and considered the procedure

¢i(xi) where

g {(u-n‘l 2;“ 3,) v 5.}

 

(3) cP.(1'€) = ~ _ ._1 AB
1 sf((i-1) 1 z; aj> v m}
With 5j = 5(Xj,x), the Kronecker delta function, and m(x) =

min {pe(x)l9 6 [0,8]}. (In Samuel (1965) the truncation at B
was not made explicit but we assume this was intended since the

indicated proof of Theorem 4 depends upon the boundedness of the

12.

procedure.) Samuel showed that under (A1+) there exists a null
sequence Kh(§) > 0 such that Dn(2,¢) S Kn(§). (In 55 we point
out that under (Al+), IDn(§,¢)I = 0(1) uniformly in .2 and under
(A1), an(§,w)| = 0(1) at each fixed g, for two procedures @->

To motivate the introduction of other conditions in addition to (A1)
or (Al+) we give an example that illustrates that the null sequence
Kn(§) may be going to zero arbitrarily slowly. That is, given any
null sequence Hn‘> 0, there exists an exponential family and a

sequence ‘2 6 [0,8]” such that Dn(§,¢) 2 Hn for all large n.

Example _1_. Consider p°(x) = 9xh(9)g(x), (2 = [0,8] with
d'< B 8 1 and where g(0) = 1 and g(x) S x-3, x 2 1 is yet to be
Specified. We first point out that m(x) = pa(x) for x 2 2. The
function h(9) = (E: 9xg(x))-1 is differentiable on (0,1) with
h'(6) = .2: x 8x-1g(x) n2(e) so P6(X) = -e h'(6)/h(6). For
fixed x, 5%- pe(x) = 9x-1g(x) Ex Me) + 9 mm] = ex'1g(x) h(6)
[x - rem]. Since Pe(X) = a: x 9xg(x) h(e) s h(0) 2°; :62 =
E: x"2 = ﬁ2/6 we see that §%-pe(x) > 0 for all 9 and x 2 2.

Hence, m(x) = Ra(x) for x 2 2. We now consider the modified regret

for the procedure O at the fixed sequence §_= l_= (1,1,1,...). Here
D (1 cp) = m‘1 2“ P («p (x) - 1)2)
n ’ 1 -i i
-l n 2 i-l ~
- 6 =
2 n 21 §i((cpi(x) 1) [21 J. 0])
-1 n “ g(x) m(x+1) _ 2 i-l ~ =
2 n 21 x30 p1(x){g(x+l) m(x) A B 1}-21~1.[21 6j 0]

-l n a ‘2 n-l ~ =
2 n 21 $2 p1(x){a - 1} En-IEZI 6j 0]

13.

= (a - 1)2 z: p1<x) <1 - p1<x+1)>““1

2 54 2°; p1(2x+1) (1 - p1<2x+2))“'1

3, x = 1, 3, 5,... and

where we have taken a = %. Let g(x) = x-
-3
x 2 g(x) = a(x), x = 2, 4, 6,... with a(x) strictly decreasing.

Then

Dump) 2 a 2‘1“ bu) <2x+1)'3 <1 - hm a<2x+2)>“'1

a -3 hglz n-l
2 % 2A“ h(1) (2x+1) (1 - “_1 )
where An = minfx|a(2x+2) S (n-1)'1}. Since .8:(2x+1)'3 2 25mm)”2

(8; g(x)).1 implies B_= (1 +-E: x-3)-1 5 h(1) 5 1:

and h(l)

we can write

. B 1 n-l -2
Dn(l,¢) 2 l6 (1 - 3:3) (2An + 1) .

By choice of a(x), An can be made to increase arbitrarily slowly.

Since (1 - -l*)n-1 r e"1

n-l the example is complete.

Samuel's procedure (3) depends upon a knowledge of a positive
lower bound for the parameter set 0 as well as an upper bound.
The procedures we will introduce do not depend upon knowledge of a
positive lower bound. In fact, the realizable procedures O that
we will be introducing are such that ¢i(x) = 0 on [Si-1 53 = 0]
so that the lower bound for Dn developed in Example 1 applies to
these procedures also.

Swain (1965, pp. 25-26) makes four assumptions concerning the
discrete exponential family. Under these conditions he demonstrates

S S =
a procedure w _such that Dn(2J¢) Bn for all 0. where 0 Bn

0(n-klog n). His assumptions are satisfied by the negative binomial

14.

family for B < 1 and the Poisson family for B < l. (A trivial
modification of his assumption (iv) and a slight alteration of the
proof of his Lemma 6 extends his result to cover the Poisson family
for any B. However, this will not be detailed here since results
to follow, in particular the theorem at the end of §5, show that
assumption (iv) is not needed.) In the following sections we will
derive bounds for the absolute modified regret under various sets

of conditions on the underlying family of distributions.
3. Bounds for n.-1 2n E<l¢l - ¢.l).
- -----—- 1 1 1
We first introduce a procedure w' which is not realizable

under the sequential problem as originally stated. Define ¢£(Xi) by

 

i-l ~ ,
(4) «g(x) = g {21 1 51+ 51} AB
.. i- ,
{21 SJ + 6i}
where 6J = 6(Xj,x) and 6; = 6(X£,x); Xi, Xi are identically
distributed Pi and Xi is independent of (X1,...,Xi) for all

i 2 1. (Recall that all ratios 0/0 are to be interpreted as 0.)

For each fixed Xi = x write
ii<x>
(5) Eylcpgx) - ¢i(x)|) = (J; Eitcpyx) - M") s -tJdt

. 3
+j‘ Eitcpyx) - tier) 2 t]dt
o

where E; is expectation with reSpect to the measure induced by
(X1,...,Xi_1,Xi). We use exponential bounds of Hoeffding (1963)

to approximate each integrand displayed on the right hand side of

15.

(5). Temporarily we completely suppress the display of dependence
on x and delete the primes on 6; and Bi. Also dropping the

subscript from E1 we have for 0 S t S B

i~
g E 6
EECPf-ll.2t]SE[t-1—i-(¢,+t)20]
1 1 21 6 1
g lj
=EEiY 20 =EEiY -EY 2-21m:
[ 1 j J C 1( j j) 1 j]
. =-~ _ =5 . i =
W1th in 6j Ri(t) 5j’ Ri(t) g(wi + t). Since .21 EYj
12:21 p120 and -Ri(t)-EYjSYj-EYjS1-EYj,Hoeffding's

Theorem 2 (1963) yields

~2 i 2
' 28(2 p)
(6) step: 41 2 t] s exp {- 2 1 1 2 t2}.
1 ig (1+Ri(t))

 

Similarly, we treat the other tail writing for 0 S t S $1

g_2i 5
EN! - i. S -t] E[——-l—‘l- (W. - t) S 0]
1 1 ~216 1
i

1

E21 z,-Ez, s4: 32,
[1(J J) J]

l

with z = 6 - Si(t) 6 Si(t) = g(txi - t). The inequalities

 

J j J,
.21 =-§L‘i s - -EZSZ-EZSl-EZ '1
1 EZj t g 1 pj 0 and Si(t)‘ j j j j imp y
e . ’ 2
282 (31p)
(7) EEO; h $1 S -t] S exp {- 1 j 2 t2

i g2 (1+Si(t))

Noting upper bounds for Ri and Si we combine (6) and (7) obtaining

for all 0 S t S B

16.

 

22202in 2

(8) ENC?! “H 2 tJSZeXP {-' ~ t}
1 1 i g2 (1448:)2
Since I exp {-ct }dt S £(ﬂ/c) , (5) and (8) imply
0
. _ g i -1 3
(9) E<l<pi wil) S B (1 + g)(21 pj) 1

for a constant B not depending upon .2, i and x. Inequality
(9) provides a high rate of convergence at x = 0 because

2: pj(0) 2 h(B) g(0) i; and, therefore,
(10) E(|¢£(0) - ¢i<0)l> s c i'%

for a constant C not depending upon ‘2 and i.

Proposition 1. If the family of distributions satisfies (Al+),

then

I _ = ' -0 Q
(11) PiEdcpi til) o(1) as 1
uniformly in g; and, consequently,
(12) n-1 2n P E(l¢' - W I) = 0(1) as n r P
I i i i ;

uniformly in .2.

Proof. It follows from (9) and the fact that w; and $1
take values in [0,B] that for each fixed x, E(|¢i(x) - ¢i(x)l) S

fi(x)AB where fi(x) = B(l +‘-£L§l-) m71(x) i-g, m(x) = inf {pe(x)|6 E Q}

17.

and m(x) >‘0 under (Al+). Since Pe S*%%§% PB’ PiE(|¢i - lil) 5
a-g; PB(fi(x)AB)’ The right hand side does not depend upon _9_

and goes to zero by the dominated convergence theorem.

We now use the Berry-Esseen normal approximation (Loéve, 1963,
p. 288) to bound the integrands diSplayed on the right hand side of
(5) for each x 2 1. The resulting upper bound for E(|¢i - Vil)
is more tractable than that provided by (9).

~

As before consider Y = 6 - R, t 6 and let W =
111‘): j

3
(Yj - EYj)/(1 + Ri(t)). Since leI S 1, E(IWJ| ) S E(W§) and

it follows with r:(t) = Var(2i Yj) that
-3 i 3
(13) Li(t) = r1 (t) 21 E(|Yj - £le )
S (E: E(W§))-% =r;1(t)(l + Ri(t)).

' - ~ 2
Explicitly, r:(t) = 2;.{pj + R:(t)pj - (p - Ri(t)pj) }. Note that

J -

q = inf {1 - pe<x>lx 2 1, e e 0} > o, 230:) s (2539, 5. s BE p

.. -2 1 a J g j
SO With T2 = 8'3 + 48232) X 2 1)

g g

i ~ 2 i ' 2 2 i
(14) q’EI pj + q Ri(t) El pj S ri(t) S I? 21 pj.
Hence,
i ~ -% i -%

(15) Li(t) S (q 31 PJ.) + (q 21 9].) .

The Berry-Esseen theorem implies that for a constant c

s 31 p
(16) EDP; "l1 2 t] S <1>(- Eli-('61 t) + c Li(t)

18.

where 6 denotes the distribution function of N(0,1). We can
0

B so
write for a > 0, J” o(-at)dt s a'lj‘ ¢('r)d'r = 2'1 a"1 j'PHxl 2 TJdT =
-1 -1 0 '° 5 -s -1
2 a P(|X|) where X m N(0,1). Therefore, I ¢(-at)dt S (2") a .
0
Using this result and the upper bound in (14) yields

- i
B as P
(17) I <1>(- +531 t)dt s -——%—§—11——g .
0 g 1 (2") g(l‘.1 Pj)

We combine (15)- (17) to prove
B 1 E, i. '25 ‘ 1 ~ ’35
(18) g EEcpi - Ii 2 tldt S D{(g T + 1)(21 pj) +031 pj) }

for a constant D.

Unfortunately, the problem is not symmetric and the other tail
- S t 6 . For
J~2 1‘ ) 1

33' we have

8

J

~

must be treated separately. As before let Z = 6
2

2 __ i 2-3
0 s t 5 pi, Si(t) - Var(21 Zj), and U - 5 g + B

i

1 ~ 2 i 2 2
(19) q'21.pj + q Si(t) 21 pj S si(t) S U 21 pj.

With Li(t) = s;3(t) 2i E(|Zj - Ezjl3) we have Li(t) s s;1(t)

(1 + Si(t)) so

I S i I”. "% i '%
(20) Li(t) (q 31 pj) + (q 21 pj)
and proceeding as in the derivation of (18)

i , ‘g i -% 1 ~ -%
(21) iEEcpi ~11 s -t]dt s D[(g~U + 1)(E1 pj) + (21 pj) }.

l9.

-% ~
Since U S T s 3%“5g + 26-2, (18) and (21) combine to yield
8

a - -
(22) EM; - 4‘1!) s Diég + Dali pjf’i + (2; :3) 35}
g

for each fixed x 2 l and a constant D.

Proposition‘g. If the family of distributions satisfies (Al),

then at each. 2.6 0a
(23) PiEdcp].L - wil) = 0(1) as 1 ~ .. ;

and, consequently,
-l n ' _ d
(24) n 21 PiE(l¢i wil) - 0(1) as n a.

Proof. We can write PiE(I¢£ - wil) S E(l¢i(0) - ¢i(0)l) +
B Pi[x.> O] S C i-%‘+ B Pi[x.> 0] using (10). Therefore, if 2_ is

such that 6i H 0 as i m w then PiE(|¢i - wil) = 0(1). Otherwise,

limi 6i > O and for each x, E: p (x) d 9. Thus, (22) applied for

j

x 2 1 and (10) for x = 0 shows that at each fixed x,

E(l¢'(x) - ¢ (x)l) * 0. Noting P S 2191 P the proof is completed
Ni 1 i h(B) B

by application of the dominated convergence theorem.

Under the assumption (Al+) the bound (22) takes the form

if -
<25) E<lcpi - wil) S no + §in pi) i
8

~

where we have used the relation pj Zcx-g pj and D is some
. n ,,
constant independent of 9, i and x 2 1. .Writing 21 P1E(l¢i -.¢i|) =

2". 2‘1‘ pi(x) Eiclic'pyx) ..wimlo, (10) and (25) yield‘

x=0

20.
-1 n '. -1 n .'}5
(26) n 21 PiEdcpi - wil) s c n 2.1 1

- a. ’5 - _
+ D n 1 2(1 +-§-S§-2-) Er; pi(x)(21' pj(x)) 5
x=l g (x+l) .

Lemma 1. 21; 31ml aj)-35 S 2 (73:11 ai)35 for all ai 2 0, l S i S n,

 

n 2 1.
O = O S -
2399; Let A1 21 a1 The inequality 0 A1 2 Ai Ai-l +
. 15 35 -35 A, g
S - S -
Ai-l implies ai 2(Ai Ai Ai-l) from which ai Ai 2(Ai Ai-l) ,

Summing from 1 to n proves the lemma.

Let M(x) = sup {pe(x)l6 E Q}. Dividing the pi(x) by M(x)
’5

we use the lemma to bound the last term in (26) by 2D n-

2192

203(1 + g%(x)/g%(x+l)) Max). Since M(x) S h(B) pB(x)

x=l

(26) becomes

a %
(27>n‘lt‘i‘vimlcp:-w.l>sc'n"5+n'n’35 2u+-§EL>%u)
1 1 x=1 g (x+1)

for appropriate constants C' and D'. This motivates the follow-

ing assumptions concerning the family of distributions:
(A2) 2 p%(x) < on
.x B

and

 

g(x) pB(x> % <

(A3) 1; ( g(x+'1)

+
Proposition 3. If the family of distributions satisfies (Al ),
(A2) and (A3), then
-1
n

n ._ = $5
(28) 21 19,32ch1 wil) 0(n )

21.

uniformly in Q,
.EEEEE- The proof follows directly from (27).
It follows from (2) that the bounds on the rates of convergence
indicated in Propositions 1, 2, and 3 also apply to an(2,¢')l.
In §5 we will bound the absolute modified regret of realizable
procedures ¢ by bounding no1 2: PiE(I¢' - ¢|) and applying a
triangle inequality. Before doing that we show that under (A1)
and assumptions stronger than (A2) and (A3) the bound 0(n-g)

holds for a procedure w".

-1 n H
.3- Bounds for n 21 Edcpi ¢il).

We introduce the procedure ¢;(Xi) where

 

g(2i'1 éj + 5'1 + 2: Cj}
(29) (Pg-(X) = 1_1 A B
~ '
{21 63 + 51}
Here C1, C2,... is any sequence of independent random variables
satisfying lel S 1, E gj = 0, with (C1,...,Cj) independent of

(X1,---,XJ,X') for all j 2 l and 02 =

j i

the prime on 6; is omitted and E denotes expectation with

2: E C? m ". Henceforth,

respect to the measure induced by (Xi,...,Xi_1,Xi,C1,...,Qi).

Bounds for n-1 2? PiE(l¢g(x)§ ¢i(x)|) will be evolved by bounding for
fixed x and then completing the Cesaro expectation. The artificial
randomization makes the variance of a sum divergent and permits the
use of Survila's strengthening of the Berry-Esseen theorem (Survila,

1962) to develop the bound for fixed x 2 1.

Theorem (Survila). Let U1, U2,... be a sequence of independent

2
random variables with PUi = O and 5i = PlUil3 < a, Define Sn =

22.

n -3 n
=3 -! 4o
Var (21 Ui) and Ln n 21 §i. If Ln 0 as n then
n 2 -1
IPEEI Ui S tsn]-¢(t)l S c (1 + t ) Ln for all n and t where
C is a constant.
Let x 2 1 and 0 S t S.B be fixed and the diSplay of de-

pendence upon x suppressed. Then we can write

" _ i _ _ i
(30) EEcpi 4:1 2 t] s E[Z‘.1(Yj EYj) s 213le
— ~ . . a ‘8' ‘ :3 '~ -
where Yj : éj + Qj - Ri(t)6j’ Ri(t) ~g($1+t), _EYj p.1 Ri(t)pj, and
ZiEYj = - g 2|le t. With TLi(t) and ri(t) defined as in (13)

it follows from le - 13le s 2(1+Ri(t)) that
-1
(31) Li(t) S 2 r1 (t) (1+Ri(t)).

2 2 ‘ ~ 2 ~ 2
Explicitly, ri(t) = Oi + zlh’j +IRi(t)pj - (pj ', Ri(t)pj) } so

2 2

i 2 2 i
(32) Oi + q Ri(t) 21p S ri(t) S 01'. + T2 21p

1 J

... ..2 . - ~
where T2 = B E + 452 3 . Since 0% -' ° and R.(t) S 28 5',
8 g2 1 1 g
Li(t) * 0 as i m 0 for fixed x 2 1,‘0 S t S B. Survila's
theorem and (30) imply
75 31p "s miP 2
j -l
H - 2 S - 1 + L + '--l-i-t .
(33> Etwi 11 t] ¢< g rict)‘) c i(t) {1 (g ri(t) ) 1
The inequality ri(t) S Oi + T03; pj)% together with the technique

culminating in (17) imply

 

8 ~ 21 {0 + mi 9.)}5}
I¢‘(_.g__1-__plt)dtsg i , 1 J
.1 .1

and another application Of (a + b);5 S a% + b335, a, b > O on T2 shows

23.

B gzi p
(34) f ¢(- ﬁt) dt S :—-£C:j-— +—L—g+ (%L)%
O
21 p1

g2t2)p1dt s a'1 , (32), Li(t) s 2

The inequalities 1$0 + a

-l i - -
{oi + (q lej )%:n and an upper bound for T show that the 0 to

B t-integral of last term of (33) is bounded by

01 8 3 5% i
(35) B{?(211'P p) 2 + (§+ E, + m: p >
% . _
M3 + 1x21 p) *1
~35 1 j
8
for some constant B. (In deriving (35) we take 01 = 1 so that
o;1 S l for all i.)
W The other tail can be treated analogously and the integral

1
I EE¢2 - *1 S -t]dt has a bound with the same structure as the
0 r

bound (34) and (35) provide for the t-integral of the left hand

side of (33). Therefore, the followinginequality is proved:

8 3 0 g 15
(36) MW: -1«l)S 31-3—42: p) "2' + (1%— +§g + 1)
p3 g g
(21 )’1 + (£35 + no: >35}
1 Pj s 1 Pj

8

for some constant B independent of .2, i and x 2 1.

Inequality (36) is analogous to (22). Recall that Lemma 1
and assumptions (A1+), (A2) and (A3) were needed to deduce the
consequence of Proposition 3 from (22). At this point we prove a
lemma which is useful in applying (36).

Let O*={9|OS6<°°,2x9xg(x)<°°}. If B>o is in the

24.

*
interior of Q then each PO’ 0 S B, has an exponential rate on

its tail probability. Write
*
(A2+) B 6 interior 0 .

(It is clear that (A2+) implies (A2) and the exponential rate on
* ' .
the tail probability. For if B < d, d E Q , then pB(m) S PBEX 2 m] =

Z:Bxh(B)g(x) S %%%%C% )m P dEx 2 m] < £_%%%GB_)m for each m.)

Lemma 2: Let 1 S Bi S i be an increasing sequence. If

(Al) ; and (112+) :obtain, then
n i
S =
(37) 21 PiEEI pj(x) Bi] 0(Bn log n)

uni formly in _9_.

Proof. For fixed x, En pi (x)[E1 pj(x) S Bi] S Bn so

21 PiEZip jp(x) S131] S mBn + 81' P. 1..[x 2 111]. Since the" family hasmonotone

increasing likelihood ratio and [x 2 m] is increasing in x,

PszmJSPEme]. But under (A2+) PBEmeJScrm where

:3"

c=h(LB&-%, r jg for some B < d, d E 0* Therefore, for any m 2 0

(38) Zn liPEZI pj(x) S Bi] SmBn+cn rm.

» B
Letting m be such that n rIn = Bn’ it follows that m = (log End/log r

S (log n)/log r-l; and, hence, the result follows from (38).
Before proceeding we digress to compare this lemma with Swain's

Lemma 6 (1965, p. 31) Specialized to k = 1. To prove (37) for the

case Bi = is, 0 < e < l, assumptions (i), (iii) and (iv) were used,

(iii) being

25.

x+

(in) g<x>g(y)

S M' < Q for all 4x and y.

*
Remark. (iii) implies Q is open on the right; consequently,
(iii) implies (A2+).

Proof. We have p31(m) Pe[x.2 m] = h-1(B) 2: Bx-m’h(5) g(x-m)

 

- *
A_g(x) S h 1(B) M' = M for all 9 E 0 and every non-negative
g(m)g(x-m) (j)
inte [ 2 “J - m’l p9 . . E m
ger 111. Thus, Pe X - TTO (1 - m) implies P9 1(sz S r

where r = 1 -'1711< 1. From the exponential rate on the tail probability

* *
it follows for any B E 0 there exists a d > B such that d E Q .

For with .Q = s > 1, h(B) Em dxg(x) = P (sx) = Z sx p (x) S
B 0 B 0 8

2% x , , '1

O (sr) and the last series converges Wlth s < r .

Preparing (36) for application of Lemma 2 the trivial bound

|¢2 - Wil S B and partitioning by [2: pj S i%] lead to

ads.

1 a 01 i g .3? i -%
(39) E<lcpv -¢.l>se£2 p.SiJ+B{—=——+ +1}(2 p.)
i i 1 J g ‘ék 1 J
for some constant B and all x 2 1. 0n [x = O] a bound like
(10) holds for $2. (The artificial randomization did not change
the expectation of the Y and increased the range of Y

J J

by a factor of no more than three. Then Hoeffding's Theorem 2

- BY,
J)

(1963) implies the result for m; as it did for @i). Therefore,

(39) leads to

-1 n ,, -% -1 n i .%
(40) n 21 Pimlcpi - wil) s 0(n ) + a n 21 PiEEl pj s 1 ]

 

1
o ifB-ng) %
-1 e 1 x n 1 -%
+ B.n g(x+1) + -§—£—L- + 1}}:1 pi(x)(21 pj(X)) -

x=1 g (x+1)

26.

Write

I k m
“‘3’ EE% 1’8“) < '

Proposition 3. If the family of distributions satisfies (Al),

'
(A2+) and (A3 ) and W; is such that i 2 0: + °, then

-1 n u _ -%
(41) n 21 23:2ch1 - 4:1!) - cm >

uniformly in g,

‘ggggf. The proof follows from application of Lemmas 1 and 2
to (40) and the fact (A2+) implies (A2) which together with (A3.)
implies (A3).

In the next section we introduce some realizable procedures and

bound their corresponding modified regrets.

'2. Bounds for Modified Rggret.

* **
Define two procedures ¢i(Xi) and wi (Xi) by

 

81-15
(42) cp: <x>= —z—i1———-1As
g 1 61
. g Zi-l 61
(43) CP: *=(X) 2.1 1 }/\B ,
g{1+21 5j
and abbreviate Ei-l 6j = Si-l' Since $i = W: = ¢:* = B on

.. ~ , *
— = 2 . ' - =

l I [g 81-1 B g (Si-1 + 1)] we can write |¢1 ¢il

l¢' - ¢*II l¢' - ¢**l = l¢' - ¢**II. It then follows'by consideration
1 i ’ i i i i '

of the values taken on by W; on [Xi = x], [Xi = x + 1], and

[Xi ¥ x,Xi ¥ x + 1] that

27.

> o]

*
lap! - cp.| S BES. = 01+-§-—{x2 = xJES. > 0] +~—3—-—[x3 = x + IJES.
1 1 1-1 81-1 1 1-1 1 1-1

3 81-1

and

> o]

**
lipi (Pi I S E3[Si_1 0] + L + )[xi

x + 1][S,
g 81-1 S1-1 1’

1
+.-§——[x: ¢ x,x! # x + IJES. > 0]-
S1-1 1 1 1-1

From these two inequalities we have

u * u _ ** _ B
(44) lap].L - cPil \/ lcpj.L 1?1 l S BESi-l - 0] + g—[si > 0]

i-l
+1—L-[XE = x+ 1][S, > 0].
g Si-l 1 1-1

-1

Lemma 3. For each fixed x and i 2 2

 

 

[si_1 > 0] 1 _1
s
(45) .31-1( 3. > 4 (£1 pj> .
i-l ,
Proof For each x [S > 0] 8-1 S 4 (S + 6' + 1)-1
—' ’ 1-1 1-1 1-1 1 '

Since (Si-1 + 6£‘+ 1).1 is convex in 81- + 6;, Hoeffding's

1
Theorem 3 (Hoeffding, 1956) implies E((Si-l + 61.+ 1)-1) S

23 (j + 1)-1(§) pj(l - p)i.j where i p = 3 This last

i
1 Pj'
inequality implies E((Si-1 +-5£'+-1)-1) S i/((i + l) 2: pj) S

(2: pi).1 and completes the proof.

+
Proposition 2. If the family of distributions satisfies (A1 ),

 

then

(46) P.E(|w: - mT|> = 0(1), P.E<|w: - wf*l) = 0(1)
1 1 1 l 1 l

28.

uniformly in .2; and, consequently,
-1 n , _ * _ -l n , ** _
(47) n 21 rims, mil) - a(1). n 2, Pinclcpi - cpi I) — 0(1)

uniformly in .9.

Proof. For fixed x the E expectation of the right hand

side of (44) is bounded by B expf- Ei-l pj} + 4B(Zi pj)-1 +
i -l _ g_~ s
4B pi (21 pj) — gi where Lemma 3 and g p1 Bpi have been
used. Weakening the bound to B expf-(i-l) m} + 8B 1-1 m"1 = fi
where as before m(x) = inf{p°(x)|6 E O} > O, we have
* ** h o
1 _ I _ S .

Finder), Civil) v Piradcp11 (pi I) Lima) PB(fi A B). The ught hand
side is independent of g_ and converges to zero by the dominated

convergence theorem.

Theorem 1. If the family of distributions satisfies (Al+), then
* **
(48> Inn(9_,cp >l =- so), lungs: >l = om

uniformly in ‘2.

Proof. The proof follows directly from (2), Propositions

1 and 5, and a triangle inequality.

Proposition 6. If the family of distributions satisfies (A1),

 

then at each fixed 2.6 Qca
* ' ** .
(49> Pinclcpi - evil) = a(1). Pindcpi - cpi I) a(1).

and, consequently,

29.

0(1).

-1 n , _ * _ -l n , **
(50) n 21 Pimlcpi wil) - a(1). n 2, 21mm)i - cpi I)

Proof. Consider the bound gi of the proof of Proposition 5.
. ———- i
. *> —o¢ -0
If .g is such that limi 61 0 then 81 pj and g1 0 so
the dominated convergence theorem implies the result. Write
P,E(|¢' - ¢*l) S E(l¢'(0) - ¢f(0)l) + B P [x.> O]. The term
1 i i i i i
*
E(l¢i(0) b ¢i(0)|) v 0 at any .2 since m(O) > 0, and, therefore,
gi(0) * O. For 9i m 0, PiEx.> 0]'* 0 so the result is proved for

* **
the procedure mi. The same proof works for $1 .

Theorem.2. If the family of distributions satisfies (A1),

then at each fixed §_6 Qon
' * *‘k
(51) Inn<_e_,cp )l = om, lungs: )l = on).

‘ggggf. The proof follows from (2) and Propositions 2 and 6.

Example 1 illustrates that in order to deduce rates of convergence
it will be necessary to examine the bound (44) in the case the family
satisfies conditions in addition to (A1) or (Al+).

One such condition is
0
(A2 ) PB(X) < °°.

I
and notice that (A2+) implies (A2 ).

Lemma 3. (a) Under (A1) and (112*), n'1 2'; 3131-1 = 01 =
0(n.1 log n) uniformly in ‘2.
(b) Under (Al) and (AZ'), 11"1 zl'EiESi-l = 0] = 0(n-g)

uniformly in .2.

ll. (Ill-III: 1‘11]

30.

n n
. . = S 2 =
Proof We write 21 EiESi-l 0] Kim 1 pi(x) Iii-112814 O] +
2111 Pi[x 2 In]. The monotone likelihood ratio property and the fact
that the. pi(x)_13_i_1[si_1 = o] =.£i[X1 e x,...,Xi_1 # x,xi = x]
are probabilities of disjoint events together imply

(52) 2? P [s, = o] s m.+ n P [x 2 m]
-1 i- B

1

for any 111.2 0. Under (A2+), PBEX 2 m3 S c rm for some constants
c, O < r‘< 1. The choice m = - log n/log r proves part (a).
Under (A2'), PBEX 2 ma S PB(X.)m.-1 and the choice m = n% proves (b).
This lemma allows us to treat the expectation of the Cesaro mean
of the leading term in the bound (44). The second and third terms
are now investigated.
[3 o]

. >
Lemma 2. Under (Al) and (A2), n-1 21; Pi(——i§-1--—-

1-1

) = 0(n-%)

uniformly in .2.

-1 -1 %
> S > '
o] 31-1 ([81-1 0] 51-1) , Jensen s

41
S
i-l([Si-l 1-1)

2 (2: pj)-%. The proof is completed by application of Lemma 1.

_ gEX' = x + 1][S _ >‘0]
Lemma 6. Under (A1), n.1 En'P E( 1 ~ 1 1 ) =
-———- 1 i g Si-l

Proof. From [8,
--- i-l

inequality and Lemma 3 it follows that .2 > 0] S

0(n.1 log n) uniformly in 2,

. -l n 9
Proof. The left hand Side can be written n 21 20 pi(x)
911 -l -l°°n2
> S
pi(x+1) g(X) g .(x + 1)'£i-1([Si-l 0] 51-1) 48 n 30 31 pi(X)
(2: pj(x))-'1 where use has been made of Lemma 3. The rate

0(n-1 log n) follows from Lemma 1.1.

Proposition 1. If the family of distributions satisfies

31.

(Al), (A2) and (AZ'), then
.. * .. .. -
(53) n 1 2‘1‘ Finds; - epil) = 0(n 35), n 1 2’1‘ 21mm; - a?) = 0(n A:)

uniformly in .2.
Proof. The proof follows from (44) in view of Lemmas 4(b),

5 and 6.

Theoremug. If the family of distributions satisfies (Al+),

(A2), (A2.) and (A3), then
(54) anQ,<p*)| = 001*), Inn@,cp**)| = 0(n“)

uniformly in 2.

Proof. The proof follows from (2) and PrOpositions 3 and 7.
* **

In order to prove rates for m and w in the case (Al)
rather than (A1+) is assumed, we investigate n-1 2? P1E(l¢; - mql).
A straight forward computation shows
(55) |¢' - ¢"| S BES + 6' 8“]:il [S + 6' > 0]

i 1 1-1 . = o] + g (s ) 1-1 i
1 1-1 + 5'
i
where T =2; Q

1 j' Since lwi - wgl S B we can partition by
' 1
[81 p

S 1%] and weaken the bound of (55) to

BEzi s 1%] + Bis = o] + 2 g lTil
1 Pj 1-1 g (si_1 + a; + 1)

j

 

[2: p1 > 1%].

The first two terms can be diaposed of by Lemmas 2 and 4. As
indicated in the proof of Lemma 3, 132(81.“1 + 6; + 1)..1 S (E; pj)-1.

Hence, the independence of T1 and (X1,...X1,Xi) together

X

with fact EITiI S 0 show that for each fixed X1

1

32.

I'ril

I
Si_1+51+1

_E(

 

i .% i -%

. 8 . '
if 01 S 1. Therefore, under 0A3 ) the Cesaro mean of the overall
expectation of this term is 0(n-g) in view of Lemma 1. We have

proved:

Proposition‘g. If the family of distributions satisfies (A1),

8

I .
(A2+), (A3 ) and £1,€ is such that Ci S i, then

2,...

'1 n I n - -%
(56) n 21 Pi Eclcpi - epil) - 0(n )
uniformly in .2.

Theoremué. If the family of distributions satisfies (Al),

(A2+) and (A3'), then
(57) Inn<_e_,cp*)l = o<n"’5), Inn@,<p**)l = 0035)

uniformly in .2.

'ggggf. The proof follows from (2) and the Propositions 4,
7 and 8.

Theorem 4 can be compared with another result. In view of
Lemma 2 and the remark that follows it, we can state a strengthened

version of Swain's Theorem 3 (Swain, 1965) Specialized to k = 1.

Theorem. If the family of distributions satisfies (Al),

(A2+) and

33.

(113’) (x) < co

Eﬁéﬁipe
then sup (Dn(9,¢**)l§ E 0”} S Bn = 0(n-% log n).
The higher rate of convergence indicated in Theorem 4 is obtained
under an assumption (A3.) stronger than (A3-).

The construction of the procedures w* and ¢** depends upon
knowledge of B. If B were not known, then the natural estimates
of *1 could be truncated back to ai where a1 + S. It can be

shown that these procedures result in bounds of order 0(ain-%)

is

replacing the 0(n- ) given in both Theorem 3 and Theorem
4.

Example 1 demonstrates if all that is assumed is (Al) or (Al+)
then 0(1) is the best available uniform bound on an(2,¢)l when
v is either ¢*,¢** or the procedure suggested by Samuel. In the

next section we give an example which satisfies the hypothesis of

Theorem 3 and shows that the bound 0(n-g) is fairly tight.

'6. Examples.

Examples of exponential families which satisfy conditions
(Al), (A2+), (A3) and (A3.) are provided by the Poisson family for
B < w and the negative binomial family pe(x) = (a + :1? 1)6x (1 - 9)a,
ess<1,a>o fixed. I

In 51.2 a necessary condition was stated for the existence of

a compound procedure m satisfying
— co
lim.n sup {Dn(2,¢)l§ E Q } < ”.

For squared error loss the necessary condition reduces to the

34.

existence of a non-randomized procedure $1 in the component

problem satisfying
<58) sup {Peue - cpl)2)le e 0} < ...

With the negative binomial family all [O,l]-valued procedures

ml have risk functions uniformly bounded by unity. However, (58)
illustrates the necessity of assumption (Al) or (Al+) in the Theorems
1, 3 and 4 for the Poisson family. For with this family and unbounded
parameter set, (58) fails for any procedure $1 (Lehmann, 1950, p.
4-13). We sketch the proof. Letting P6( $1) = 0 + h(B) the
Cramer-Rao inequality proves Pe((9 - $1)2) 2 {(1 + b'(9))2/

P6((§% log pe)2)} + b2(0) = 0(1 + b'(9))2 + b2(0) from which it
follows that EEO-a P‘((O - cpl)2) = +00.

As a matter of possible independent interest we give an example
which illustrates that this property exhibited by the Poisson family
need not hold for other exponential distributions on the non-negative
integers with unbounded natural parameter spaces. Consider pe(x) ='
9x(x!)-%h(9), 0 < 9 < “, and m(x) = x%. Then Pe(¢) =6 and
Pam - <62) = no) {2: x e“(x1)"’5 - 02 114(6)} = h(e){e 1+ 221.. (xl)-% -
((x-2)1)"5]ex} = h(e){e + 2: (x35(x-l)-35 - 1)((x.2)1)"’5 6"}. It
is easy to verify that x%(x-1)-J5 - l S (x(x-1))-AE for x 2 1 so
Pe((° - cp)2) s h(e){e + 2: (x1)'35 6"} = h(e)(h'1(a) - 1) < 1 for
all 6.

The family of distributions in Example 1 satisfies the assump-
tions (A1+), (A2) and (A2.) and illustrates the necessity of adding

* *1:
another condition in order that the procedures w or w attain

35.

any rate 0(an) uniformly in ‘2, an a null sequence.
We now give an example of a family satisfying the hypothesis of
Theorem 3 which shows that the bound is fairly tight.
Example 2. Consider the exponential family with g(O) =
-4 -3 -a-l
g(l) = l; g(x) = x , x = 2, 4, 6,...; g(x) = x log x,

[q,l], O < a < l, where a > 0 is fixed

x = 3, 5, 7,... and 0

but otherwise arbitrary. It is not difficult to verify that (A2),
I

(A2 ) and (A3) are satisfied; Then at .9 =.l with n 2 2 and

any procedure ¢ such that [¢i(x) S a] 2 [2%-1 6j = 0],

_ -1 n . 2
DnQﬂP) ‘ n 21 21((1 -' (Pi) )

2 n’1 2‘; 21((1 - @9233?“ SJ, = 0])

2 (1 -002 31112;“ 63. = 01

2 (l - a)2 E: p1(2x + l)(l - p1(2x + 2))n-1

2 -
2 (1 - o) (1 - 13%))“ 1 21A p1(2x + 1)
n

where An = min [x|(2x + 4)4 2 n-l} S g(n-l)%. Comparing the series

2 p (2x + l) with the integral ‘r x-3 log
A 1 ZA+1 a
integration by parts that for A 2 1, 2A p1(2x + l) 2 % h(l)

-a-l .
x dx shows Via

(a + 3)-1(2A + l)"2 log-a-1(2A + 1). Therefore,

Dnﬂl;¢) 2 Kn-J5 log-3 n

for some constant K.> 0 and all large n.

CHAPTER III
SEQUENTIAL COMPOUND ESTIMATION FOR SQUARED ERROR LOSS

AND A FAMILY OF NORMAL DISTRIBUTIONS

1, Introduction.

Let P be a family of probability measures on the measurable
space (X,B) where X is the reals and' B the Borel G-field.
With n denoting Lebesgue measure suppose that for each P E P,
P < < p and p =-%§ is a differentiable determination of the
Radon-Nikodym derivative. (Consequently, p is the ordinary deriv-
ative of the cumulative distribution function.) Let ‘2 = (P1,P2,...)

where P1 6 P and let Xi ~ Pi be a sequence of independent'

random variables. For each x 6.x we are interested in estimating
(1) u (x) = 9—(1og 81 p (x))
1 dx 1 j '

For example, when P = {Pele E Q}, Q C1X, and pe(x) = k(9)
£(x) eex, a Bayes reSponse versus G1 the empirical distribution

of .21 takes the form

X
(2) 110:) = ui(x) - £381.

Therefore, if ui(x) can be estimated then ¢i(x) can be estimated

through (2).

'2. Estimation of ui(x).

Define the distribution function
. -1x 1
(3) Fi(x) = 1 L 21 pj(v) dv

where we have abbreviated du(v) to dv. Also for 1 2 bi + O

36.

37.

and any distribution function F define

<4) ti(F)(x) = hf log —-’5

The mean value theorem for integrals implies that under suitable
conditions ti(Fi)(x) - ui(x) * O as i v “. This will be discussed
later after investigation of the problem of estimating ti(Fi)(x)
from the sample (X1,...,Xi). We make the following assumption:

there exist functions Mi such that
,_ a
(A1) sup {Ici<Fi)<x)l l 1: e P 1 5 Misc) < ...
Consider the estimator
* *
(5) ti - (tiwi) AMi) v (441)

*
where we have no longer diSplayed the dependence on x and F1

is the empirical distribution function of (X1,...,X ). In (5)

i
any undefined ratios are to be taken as zero. We use the method
of Chapter II in writing

- = - t F, 2 d
(6) gidti ciwpl) £21m, 1(1) v v

and bounding the integrand by use of the Berry-Esseen theorem.

Let 0 S v S 2Mi and write

* i
[ti - ti(Fi) 2 v] s [21 Yj 2 0]

where Yj = 6j - 6j exp {hi(ti(Fi) + v)} with 6j = 1 or 0 as

Xj 6 (x,x + hi] or not and 6j = l or O as Xj E (x - hi,x]

1‘ or .I‘ It" JI“? (1‘ ’1

1 I 1 II a ll fi.

38.

or not. The Berry-Esseen theorem implies

(7) 31h: - ti(Fi) 2 v] s §(r;1(v) 2i p (Y

j j)) + c Li(V)

for r:(v) = Var(2iin) and Li(v) = r;3(v) 2i Pj(|Yj - Pj(Yj)l3)°
We have
i _ .4 x + h. h.v . x + h.
:1 Pj(Yj) - 1 Filx 1(1 - e i ) S -1 hi Fi]x i v

and by consideration of the variances of the three point distributions

of the Y,,
J

2 x + h 2 6M x + h
o o S S o o o
q 1 Fijx i ri(v) i e 1 Fijx _ h:
with

l - q2 = sup {P[x,x + bl] I x 6.x, P E P}.

Since IY - P (Y )I S l + e3Mi implies L.(v) S (1 + eBMi) r:1(v)
a j j j 1 1

and I Q (-av) dv S (2n)-% a'l, we see that (7) implies
0

 

2M1 * emi (1211:“: :03
(8) I P.[t. - t.(F.) 2 v] dv s i
0 _1. 1 l 1 (2n .)% h F ]x + h,
1 i i x 1
2c Mi (l‘+ eBMi)

 

+ .
q (i Fi]:‘+ hi)%

To treat the other tail let 0 S v S 2Mi and write

* 1
[ti - ti(Fi) s -vJ s [21 zj s o]

39.

where Zj = 6j - Gj eXp {hi(ti(Fi) - v)}. The Berry-Esseen theorem

yields

(9) chI - t M ) s -v] s @(-s',’1(v) 2i (2 )) + c L!(v)

PJ .1

with s:(v)

Var (211- zj) and L;(v) = s;3(v) 2i dezj - Pj(zj)l3)'

We note

iiFJX+hi -1

(l - e hi v) 2 %i hi F iJX + hi v, 0 S v S h;

211(2)
1 J j
and that the variance satisfies

ZiFi]:+h, 2 ZMiF1x+h°

S 5 °
q 1 si(v) 1 e _ h: .

Since L 1(v) S (l + e Mi) s;1(v), it follows from (9) that

 

(71“,; ) A3112 eMi (F 1" ‘1’ 21¢
(10) ‘rP ﬁEt 3 ti(Fi) S -v] dv S i
o (2 n 10% hi Fin + hi

-: a((zui) A h;1)(1 + eMi)

+ __
q (i F1]: + “195

 

 

b
For any non-increasing function g and O < a S b, I g(v) dv S
a O
a-1 b I g(v) dv. Applying this observation with g(v) =

0 .
* -l
.- S- = 2 =
EiEti ti(Fi) v], a (ZMi) Ahi l, b 2Mi and the bound

(10) we obtain

 

2M 4M,”ie1(‘r"]::+hi)25

(11) J” 11313 - t.(F.) s -v] dv s 1 hi
0 '-1 1 1 1 % x'+ h
(2 11 i) hi Filx i

2c Mi (1 + eMi)

q Ii 1531:” hi)’5

 

+

40.

which when combined with (8) proves via (6) that

 

* (1 + 4M.) e3Mi an]: f 31¢
(12) P.<lt. - t.(F.>l) s 1 1 i
-1 l 1 1 ‘ . 32V JX + h
(2 TI 1) hi Fi x 1
4c M. (l + eZMi)
+ 1

 

 

q (i F1]: + hi)35 .
Now we introduce conditions that will ensure that ti(Fi) -
ui * 0 as i *'°. The conditions seem arbitrary but they are con-
venient for reducing the notation in the bounds and are easily

verified for a family of normal N(6,1) distributions. We introduce

the notation

f(6)(x) = sup {lf<y>| l ly - xl s 6}.

A basic assumption is that each p is twice differentiable at each

x. We also assume that there exist functions A and B such that

for all P

(A2) Pé1)(X) S A(X) < °
and

(A3) p'E1)(X) S B(X) < °°.

As immediate consequences we have
(13) Ip<x + A) - p(x>l s A<x> IAI for all p, IAI s 1

and

41.

(14) lp'<x + A) - p'<x>| s a(x) IAI for all p, IAI s 1.
x+A x+A 2

Also, f;p(v) dv = IA{p(x) + (v - x) p'(x) + g(v - x) p"(z)} dv
x- x-

for some 2 E {v,x} proves by way of (A3) that

x+A 2
(15) l(2 A) 1 IAp(v) dv - p(x)| S-§&§%-A-’ for all p, O < A S 1.
x-
We write
(16) Iti<Fi><x> - ui(x)l s I<x> + ch) + K<x>
where
_ -1 x + h. _ , ; -1
I(x) - llog(hi Fijx 1) log Fi(x + 2h1)l hi
J(x) = Ilog(h-1 F ]x ) - log F'(x - %h )| h-1
i i x - hi i i i
and

- '1 I X+%h i I
K(x) - lhi (log Fi)]x _ gh: dx log Fi(x)l .
The inequality llog a - log bl S (a.A b)-1|a - bl and the technique
used to prove (15) result in
B C h

i
I
24 F1

 

'(17) IVJS

where the diSplay of dependence on x is suppressed and the function

C is defined by

1n£[p(x + A)I IAI s 1} = @183.

42;

In order to put K in a more tractable form let ci 6 (x,x + khi)
and 61 E (x - £h1,x) be such that log Fi(x + khi) = khi

H I I - I _ =
(Fi (ei)/Fi (61)) + log Fi(x) and log Fi(x khi)

khi(F2(61)/Fi(61)) - log Fi(x). It follows that

 

 

IF" <x>| IF' (e > - F'<x>l IF' (6 > - F'<x>|
i i i i i i i
2K(x) S —--—--, —---——-, + -——-——---,
Fi(x) F1 (61) F1 (61)
+ IF; (e1) - F; (x)! + IF; (51) - Fg(x>l
Fi(€i) F1 (51)

Assuming there exists a function D such that for all .21 6 P1,

 

i 2 1
IF; (X)|
(A4) TIES-— S D(X) < a
we have
(AD + B) C hi
(18) K S 2 F;

Combining (16) - (18) yields

(AD+B)Chi

 

(19) Iti<Fi> - nil s

I
F1 3
x + h B(x) hi
0 O o o S '
Inequality (15) implies FiJX _ hi 2hi Fi(x) + 3 and the bound
(12) can be weakened to
%
* -
(20) P.(|t. - t.(F.)|) s k' e3M1{ C +9—9— + (—9—,)’5}1 35
-1 1 1 1 3 2 , k g , h F
hi (F1) hi F1.- 1 i

for some constant k' not depending on .2, i and x. Combining

43.

 

 

(19) and (20) with the choice hi = 1-1/5 shows that for each
fixed x,
* (AD + B + 3% eBMQ c (c + c5123}: _1/5
- s ' °
(21) -31(|t1 nil) k { Fi + (F1)% } 1

for some constant k.

The following observation is of use in finding the functions
Mi. Repeated application of the law of the mean proves that

. -1
= II I _
t1<F1)‘x) (Fi (51)/F1(Bi>)(€1 61) “1 for some Bi 6 <61,ei),
- - S

6i E (x hi,x) and Si (x,x + hi)’ hence, Iti(Fi)(x)l 2 D(Bi)

and we may take Mi = 2 D(1). In the case of a family of probability

measures P {Pele E O}, Q = [-a,a], where pe(x) = p(x - 6) we

may take A = Pb? + 1) and B = pQQ +_1).

g. A Decision Problem.

Consider the family of normal distributions P = {Pele E Q},
Q = [-a,a], a < m, with P0 denoting the N(6,l) law. Let
Xj ; P6j
denote the usual N(0,1) density. The Bayes response (2) takes the

be a sequence of independent random variables and p

form
¢i(X) = ui(x) + x.
Theorem.l. The sequential compound procedure
*
(22) 1210(1) = ((ti_1(Xi) + Xi) A01) V(-01)

is such that

-1/5
23 - =
< > gidcpi wil) o<1 )
and
(24) Inne,cp>| = 001'”)
uniformly in 2.
Proof. For each fixed Xi + 1 = x the bound (21) applies to

21(lt:(X) + x - 1100') with A(X) = p20! 4, Doc), 300 = p'fo, + DOC),
C(x) s exp {le +oz + 1;}, D(x) = lxl +oz and 1110;) = 2(lxl +01 + 1).
If we weaken the bound with F£(x) 2 pa(x) [x S 0] + p41(x) [x:> O]
and use pi + 1(1!) 5 P_a(x) [x< 41/] + [lx' S 0'] + pa(x) [x>0'],

it follows that .gi‘+ 1(|¢i + 1
Since Proposition 1.1 implies Pi + 1(HIi + 1(x) - ¢i(x)|) = 0(i-1)

‘ *il) = 0(1-1/5) uniformly in .9.

uniformly in .2, (23) is proved. The bound (24) is now an immediate
consequence of Corollary 1.1.

Swain (1965) suggests a different sequential compound procedure
and proves that the modified regret is bounded from above by a bound

0(1) uniform in parameter sequences.

CHAPTER IV

ON THE BAYES RESPONSE TO THE EMPIRICAL DISTRIBUTION OF

OPPONENTS PAST IN SEQUENTIAL DECISION PROBLEMS

IF‘

Introduction.
In the preceding chapters sequential compound procedures
were suggested naturally by the structure of a Bayes response

W1 versus G the empirical distribution of player 1's

1)
(nature's) past through present moves. The problem of bounding
the modified regret associated with certain natural procedures
w was reduced to bounding a Cesaro mean difference
n"1 3: 21 ('mi - Wil). Inequality (8.8) of Hannan (1957) shows
that the simple procedure ¢i(Xi) achieves a non-positive
modified regret and illustrates why one would try procedures
that approximate ¢1(Xi) in order to obtain small modified
regret. In fact, with the sequence (set) compound problem
investigators haVe used procedures which at stage i are Bayes
versus estimates of G1 or G1-1(Gn) where estimable, as in
the‘finite problem, or procedures which approximate $1 or
¢i_1(¢n) in case the distribution functions are not estimable
directly.

The question arises as to what can be achieved in the
sequential problem.when an estimation problem is removed by

assuming that G is known at stage is Usable sequence

i-l
strategies were exhibited by Hannan (1956) and (1957) for

various M X N games (M finite) which achieve uniform rates

of 0 (n-%) in absolute modified regret. The strategy is to

play Bayes versus hi-l + Z where h is an unbiased estimate

1-1

45.

46.

of (possibly G itself) and Z is a suitable random

G1-1 1-1
vectorI Theorem.l of Van Ryzin (1965) demonstrates that in the

finite problem direct play against E at stage i results in

-15)

i-l

a uniform bound 0(n if there is a certain non-degeneracy in

the estimate.

In some finite problems a bound of the same order obtains
whether the procedure used is either Bayes versus an unperturbed
G1_1 or an estimate 51,1. For consider a 2 X 2 statistical

decision problem where player I's set of pure strategies is

Q 3 {0,1} with P = N(9,1), and the loss matrix is (g 3),

6

a, b > 0. A Bayes rule versus (l-p, p) is

b P0(x)
bpe(X) + 891(X)

 

tp(x) 3': < p] a: [x > 35 + 108 (%?)1

where tp(x) a 1 means decide 9 a 1. Let 0(p) denote the
Bayes risk vector. We claim that the Bayes response satisfies
a Lipschitz condition of order a s l, (8.13) of Hannan (1957).

(L) .. om

. o O' -
Condition (8.13) reduces to showing 0(9) L+t

o
0
105%) =O(t) as t" O uniformly in p where

01(p) is the risk function 0(p) evaluated at 9 a i. These

and 01(9) - 0

orders can be verified and then via Hannan's Theorem 5, direct

play versus results in a non-negative modified regret

G1-1

of order 0(n-110g n) uniform in player I move sequences. Van
Ryzin's Theorem 5 (1965) implies that a bound of the same order

obtains if the procedure plays Bayes versus any unbiased estimate

hi-l that has a bounded kernel.

Throughout the remainder of this chapter we assume that

47.

Gi-l is available to player II at each stage i. In the next
section we point out some results which apply to statistical
decision problems having a certain structure. The problems of
Chapters II and III possess this structure. In subsequent

sections we exhibit sequence strategies for nonéfinite games and

prove rates of convergence for the modified regret.

2. Games Involving Squared §£32£.222§°

For each 9 E 0 let there correspond a probability
measure Pe'<S u on the measurable space (X,B). Let the loss
.function be squared error and consider the simple procedure

5 where 51 - ¢1_1 (Xi)’ W being the Bayes procedure (1.2).

i
Inequalities (1.9) and (1.10) imply

- - n
(1) o 5 Dn (2.1» s n 121 P1 ((11.1 -1i><11_1 +1:i - 261)).

In view 0f Theorem 1.1 and (l) the following prepoSition is

immediate.

Proposition _1_. If 0 C [-A, A], A < a and M(x) .

 

sup [pe(x)|9 E 0} is integrable (X,B,u), then
(2) 0 S Dn (22$) 3 0(n.1 log n)

uniformly in .9.

Herman (1957) uses symbolism 0(x) to represent a risk
function Bayes versus x. Then 8 a(x) denotes that function
evaluated at e, a pure strategy for player I in the component

game. Specialization of Hannan's Theorem 5 to a s'l shows that

48.

(3) €[0(x) - 0(x+te)] S L (ii-E-

for a constant L, all 8, all t > O and all probability
measures x with finite support, is sufficient for (2) to
obtain. We give an example to show that (3) is not a necessary

condition for (2) to obtain.

Example I. Consider the discrete exponential family of
Chapter II with Q a [a, b]. For squared error loss, and in
Hannan's notation, x degenerate on a and 6 a probability
measure degenerate on b,
ap +bt p

a b 2
pa+tpb

 

e[0(x) - O(x%te)] . Pb<(a-b)2) - Pb(<
.. <a~b>2 w) t

2 2 ,
where .£(t) a Pb((2pa pb + t pb)/(pa + tpb) ). By Fatou s

lemma, lim .i Gilt) 2 2 Pb(pb/pa). For the geometric family

2

t
and O < a S b

< l, Pb(pb/pa) n-hn; and, therefore, (3) fails.
Since the hypothesis of PrOposition l is satisfied, (2) obtains..
Another application of Proposition 1 is now given. Consider
the problem where the component game has the following structure.
Player I picks an 6 5 [0,1] and II picks a 5 6 [0,1] with
loss (6 - 5)2. This is Hannan's Example 2 (1957) where it is
observed that the Lipschitz condition (3) is satisfied for
0(-) with 6 0(x) s (e - 5(x))2, 5(x) being the mean of the
distribution x. Therefore, 0 S n-IZ:€iG(Gi_1) - R(Gn) -

0(n-110g n) uniformly in .E' This result can be deduced from

Proposition 1 in the following way. Associate with each

49.

e 6 [0,1] the same probability measure P degenerate on 0.
._ dP -._
If p-— P, then p(€) : du(6) - [6 - OJ and ¢1_1(€) r

(i-l)'12i'lej[e - O]=- 5(G a.s. P. Since the hypothesis of

i-l)
Proposition 1 is satisfied, (2) obtains with Dn(g,¢) =
-l n
2 o -
n 1 81 (G1_1) R(Gn)'
We now proceed to the main results of this chapter. An

effort will be made to keep the presentation self-contained

although the basis for the results is found in Hannan (1957).

3. _G_at_ng With Countable .111.

Let player I's set of pure strategies be M = {1, 2,...},
possibly finite. We identify each 5 of N, the set of pure
strategies for II, with the risk vector G'= (01,02,...) ‘where
L(j,5) = Oj 2 0 is the loss to II when I's choice is j and
II's choice is 5. We identify each j E M, with the vector
of 0's and 1'8 with 0's in all but the jth position and let
6 be generic for such a vector. Then 60 denotes the inner
product of 6 and C for all 6 E M, 0 E N. For '3 =

(61, 62"...) 6 Mo and g: (01, 02,...) 6 No. the following

identity ((6.5) of Hannan (1957)) is basic:

n _ n n _
(4) 21 €101 - Encn+1 + 21 Ei-1(Oi ' 01+1) + :1 61(01 C$11-11)

where Ei.= €1+"'+€i.= 1 G1, Eo== 0. We assume
(Al) N is sequentially compact under pointwise convergence
and

(A2) sup {HOHI I O E N}‘= B < m

500

where H “1 denotes the £1 sequence norm. It follows from
(Al) that for all ‘w E m+, the set of bounded sequences with non-
negative components, inf {w 0*l0* 5 N} is attained. For let
waivinf {wG*IO*EN} and for each j, Oi'wdj as k"'°°
where 0 = (01,02,...) 6 N. By Fatou's lemma, w0'= E: ijj S
.limk 3: wjoik - lim1 w 01'= inf [w'0*|0* E N} so the infimum
is attained. Denote an infimizing 0 by 0(w). We may take
C(-) to be positive homogeneous and such that each component
Uj(-) is a measurable function. (Let 81 be the Borel subsets
of 4X1 = [0,1] and consider (X,B) ‘where .X = ;.Xi and B
is the product O-field. Then the 0j(o) may beltaken to be
measurable functions from (X,B) to the reals.)

Letting n denote the distribution of Z = (Z1,Zz,...)
where the 2k are independent and identically distributed

uniform [0,1] random variables, we investigate the randomized

sequential procedures

(5) 01 = C(Ei_ + 11101 2)

1

with the sequence of constants Hi satisfying

(6) . 0 < Hi + and i H;1 + with respect to 1.

Theorem 1. Assuming (Al) and (A2) and with H; = 1%, the
--—--- 1

procedure (5) results in

-l n _ '5
(7) In 11031 6101) - R(Gpl - 0(n )

51.

uniformly in g,

3522;. Writing Ei(01 - 01+1) 2 (E1 - (Ei + H1 2))(0i - 01+1)

we have

(8) 2:1! E:imi ' 01H) 2 "211‘ Hi 2(01 ' CHI)
= Hn Z On+1 ' Ho 2 61 ' EI(Hi ' Hi-l) z 0i
2 -B H0 - Z‘l‘m1 - Hi_1)B = -B H“.

Also, En on+1 - n R(Gn) ==En(0n+q - d(En)) 2 0 so (4) yields

(9) a: e1 01 - n R(Gn) 2 -3 H“.

In order to obtain an upper bound we note that Enon+l - n R(Gn) S

- O' I- ’ -0 S- .-
Hn Z( n+1 0(En)), and, similarly, Ei_1(01 1+1) Hi-l 2(0i 01+1)

so that summation by parts and (4) imply

(oi -

0
1+1)'

n - S n
(10) 21 6101 n R(Gn) B Hn + 21 e1

The expectation of the last term is bounded by a direct extension
of Hannan's Lemma 2 (1957). Here we state and prove the needed

specialization of that extension.

Lemma 1. Under the assumptions of Theorem 1,

 

(11> |u°i(w + Z) - 1101(w' + z>| s Bllw - w'll1

for each i and all w, w' in mi.

a
Proof. With X = [0,1] we write

52.

n0i(w' + 2) = jkoiw- + z)d11(z) =1 qi(w + v)dV(v)
M

where v = nT-l is the measure induced by the transformation

T defined by v = T2 = w' - W'+ 2. Therefore,
pol(w + Z) - p01(w' + Z) S I 01(w + z)du(z) - I 01(w + v)dV(v)
X XOTX
and, since the restrictions of 11 and v to X F) TX are equal,

nOi(w + z) - pOi(w' + 2) s B n(x- 110.

Since 2 6 X - TX if and only if zj 6 [0,1] for all j and

zj - w'j + w1 t [0,1] for some j, it follows that
u(x - TX) s 8°;ntzj < w'j - wj or zj > 1 + w'j - wj]
s E:{(w'j - wj)+ + (wJ - w'j)+}
S “w - w'Hl.

The proof is completed by interchanging the roles of w and w'.

, _ -1 , = -1
We apply the lemma Wlth w - Hi-l E1_1 and w H1 E1.
With this specification
6 8
-1 i -1 i -1 j *1 J
- ' = - -
“W W H1 H1 E1 H1-1 E11-1 + 1;: (Hi-1 E1-1 Hi Ei)
i
C 6
_ -1 -1 -1 i 1 -1 -1
' Hi + (Hi " Hi-1)Ei-l + (i ' 1 ' Ei_1)(Hi_1 Hi )

- C S
S Hi + (i 1)(Hi-l Hi ) 2 Hi .

. -1
- 3
Therefore, the lemma implies ”(Si-(Gi Ui+1))| 2 3 Hi

which proves via (9) and (10) that

53.

n n -l
- S -
(12) B Hn 11(21 eioi) n R(Gn) s B{Hn + 2 21 Hi }.

The choice Hi = i35 proves the theorem.

4. A Theorem for Non-Finite W.

Let L 2 0 be a loss function over the cartesian product
M X N* of arbitrary spaces M and NI, In the component game
player I chooses an element 6 E M, II picks a O E N =
[L (°,6)l6 E N*} and the loss to II is 60, 60 denoting 0
evaluated at e. If w is a discrete measure putting mass
w on 61 E M. then wO denotes 81 w1(610). Assume that for

i

all discrete measure w with finite support in M,
(Al') inf {w 0'0 E N} is attained.

As before we denote an infimizing 0 by 0(w) and take
*

0(o) to be positive homogeneous. (If N is a compact
topological Space and each section L(e,-) is continuous

*
on N then (Al') is satisfied.)

*

Consider the game where M.= N = [0,1] and L(€,6) =

Ie - 5'. In the sequential version a strategy ‘O’ With

‘¢i = 0(Ei-l)’ E1 = i G1, where G is the empirical dis-

1
tribution of (61,...,€i), is such that suprnQﬁmlg 6 [0,11%
2 % (Hannan, 1957, p. 130). This exemplifies the need for
artificial randomization as was the case in the preceding
section. However, here the measures E1_1 have support changing
with ‘g and 1; since the union of all possible supports is

M, uncountable, it is not clear how to add the randomization

54.

in order to perturb the measure E This difficulty will

i-l'

be overcome by embedding a countable set in M.

A basic assumption is that
(AZ') sup{€0|€ E M, O E N} = B < a.
We define the real valued function
(13) d<e,e'> = sup{|ec - e'ol | o 6 N}.

Clearly (M,d) is a pseudo-metric Space; and, if loss equivalent
player I moves are identified, it constitutes a metric Space.

Let Ji 2 1 be a non-decreasing integer valued sequence,

= C = °
A {a1, a2,...} M and A1 {a1,...,aJi}. Corresponding to
G
each sequence .3 = (€1,ez,...) E M. there is a sequence

I
E} = (€1,62,...) where 6i is an element of A1 closest to

I
61 in the metric d. We let G1 be the empirical distribution
I I
of (61,...,€i) and 20, 21, 22,... be independent and identically
distributed uniform [0,1] random variables. We investigate

the sequential procedure

(14) Oi = 0031-1 + H1-1 E~1-1)
where H1 satisfies (6) and ‘21 is to be interpreted as the
measure placing mass Zj on aj, j = 1"°'Ji' Let u denote

the distribution of (Zo,Zl,Zz,...).

Remark. Let {a1,...,aj}<:‘M. be fixed but otherwise

arbitrary and let W = {(w1,...,w:)l 0 S wi‘< 9, i = l,...,J}

55.

be the set of finite measures with support in {a1,...,aJ}.

Then Oj(-) is a function from. W, considered as a subset of
Euclidean J-sPace,:to the reals. If {(a10,...,aJU)IO E N}

is a compact subset of Euclidean J-sPace, there is a determination
C(o) such that each component 0j(-) is measurable. In the

case of countably infinite support, we noted in the preceding
section that sequential compactness under coordinatewise

convergence and I! boundedness for the set of risk vectors is

sufficient for measurability.

Theoremwg, Under assumptions (Al') and (A2'), and with

each component of Oi measurable,
n n' -l
- s
(15) lp(£1 e101) n R(Gn) I Bth Jn + Jn + 2 81 Hi }
n I

Proof. A bound for the left hand side is provided by

anl + lCnI +'an| where

- n I I
Bn ‘ ”(21 6101) " “ R(Gn)
n I
Cn = "(21 (6101 ' 6101))
Dn = “(R(Gn) - R(Gn))0

Clearly 2"1‘(e'o11 - 6101) s 2“ (mi, 6 b and n(R(Gn ) - R(Gn ))=

n
info 21 sic - info 81e1051nf021(e;0 + d(e. , e £>> -

n I
info 21 :10 = 21 d(ei’€i) so

56.

(16) Icnl V Innl s 2‘; d(ei, all).

The identity (4) yields

_ ' I n I
(17) Bn - ME 0 > - En own) + 11031 Ei-l(oi - o )>

n n+1 i+1
+ “(2? 6301 " °i+1))'
Since
8: Eiwi ' Cm) 2 ‘2111 Hi 51(01 ' c3+1)
= Hn §n°n+1 " Ho 50°1 " trim]: 51 "’ H11-1 51-901
2 '3 H0‘10 " 2111(H1Ji ' Hi-lJi-l) B
= -B 11an
and E&(On+1 - 0(Eé)) 2 0, it follows that
(18) 2‘; Bio: - n R(Gt'l) 2 -B Han.
As in the derivation of (10),
(19) 211‘ 6:101 - n R(sn) s B 11an + 2‘; 611631 - 01H).

' = 'U-O'IS
For those 1 such that J we have |u(€i( i 1+1))l

1 Ji-l
23 H"1 by application of the Specialization (9.8) of Hannan's

i
_. '1 I I: '1 I
Lemma 2 (1957) with w — Hi-l Ei-l’ w Hi E1. Therefore,
n n -l n
(20) Inc: e'(0, -0, ))l $2 213 H, +7... B
1 i 1 1+1 _ 1
iIJ - J iIJ. > J.
i i-l 1 1-1

n -l
S

57.

and (17) - (20) combine to yield

(21) -B H J S B S B[H J + J + 2 En HTI}.
n n n n n n l 1

In view of (16) and (21) the theorem is proved.

If the set, A is dense in the metric space (M,d) with
d(€i’€i) + 0 as Ji 1 ﬂ uniformly in .g, then a balance in
the bound of (15) can be obtained by choice of the sequences
Hi and Ji' We note that in the finite M problem with Ji

35

equal cardinality of M for large 1 and Hi = i , (15)
shows that the expected modified regret resulting from the
procedure (14) is 0(n-%) uniform in 5, We now apply Theorem

2 to a problem involving uncountable M.

Example 2, Consider the game of absolute deviation on
the unit square. Here d(e, e') = sup{| It - 5| - '6' - 5| I I5 6 [0,1]} =
'3 - 3", and we let I-Ii = 18, J1 equal the greatest integer in

- l, where a, b 6 (0,1) are yet to be specified. For the

H
u
L.
I

0
set A = {a1,a2,...} we take the points 1/2, 1/4, 3/4, 1/8,
3/8, 5/8, 7/8,...; that is, ai = bkj where 1 = 21"1 + j - 1,
bkj = (2j - l) Z-k, 1 S j S 2k-1, k 2 1. With this choice

1

1 S 2(ib - l)- ; and, hence, it follows from

I S -
d(€i) 61) 2 J1

(15) that
n a+b b -a
- s f"
|u<z1 eioi) n R(Gn)| {n + n + 2 1 1 }

+2{xnb.,1+znb 4i-b}
ili <:2 ili 2 2
a+b + nl-a

nl-b}

S cfn +

for some constant c. The choice a = b = 1/3 results in an

-(1/3))

expected modified regret 0(n uniform in “g.

BIBLIOGRAPHY

Hannan, James F. (1956). The dynamic statistical decision problem
when the component problem involves a finite number, m, of
distributions. (Abstract) Ann. Math. Statist., 27, 212.

Hannan, James F. (1957). Approximation to Bayes risk in repeated
play. Contributions to the Theory of Games, 3, 97-139. '
Ann. Math. Studies No. 39, Princeton University Press.

Hoeffding, Wassily (1956). On the distribution of the number of
successes in independent trials. Ann. Math. Statist., 27,
713-721. "N

Hoeffding, wassily (1963). Probability inequalities for sums of
bounded random variables. J. Amer. Statist. Assoc., 58, 13-30.

Lehmann, E.L. (1950). Notes on the Theory of Estimation. University
of California, Berkeley.

Loave, Michel (1963). Probabiligz Theory (35d ed.) Van Nostrand,
Princeton.

Samuel, Ester (1965). Sequential compound estimators. Ann. Math.
Statist., 36, 879-889.

Survila, P. (1962). On the remainder term in the central limit
theorem. Litovsk. Mat. Sb. No; l, 2, 179-194.

Swain, Donald D. (1965). Bounds and rates of convergence for
the extended compound estimation problem.in the sequence
case. Tech. Report No. 81, Department of Statistics,
Stanford.

Van Ryzin, J. (1965). The sequential compound decision problem

with m.X n finite loss matrix. Tech. Memorandum.No. 54,
Applied Mathematics Division, Argonne National Laboratory.

58.