ON THE CONTINUITY OF THE
BAYES RESPONSE

Thesis for the Degree of Ph. D.

MICHIGAN STATE UNIVERSITY

MERRILEE KATHRYN HELMERS
1972


 

This is to certify that the

thesis entitled

ON THE CONTINUITY OF THE
BAYES RESPONSE

presented by

Merrilee Kathryn Helmers

has been accepted towards fulfillment
of the requirements for

Ph.D. degree in Statistics and Probability

Major professor

Date August 11, 1972

   
   

 
  


    
   

ABSTRACT

ON THE CONTINUITY OF THE
BAYES RESPONSE

By

Merrilee Kathryn Helmers

Consider a statistical decision problem with parameter set $\Theta$, observations $X \sim P_\theta$, action space $\mathcal{A}$, decision rules $\varphi$, losses $L(\theta,\varphi(X)) \ge 0$ and risk functions $R(\theta,\varphi) = E_\theta L(\theta,\varphi(X))$. When $\theta$ is random with prior distribution $G \in \mathcal{G}$, the class of all prior distributions, $R(G,\varphi) = E_G R(\theta,\varphi)$ is called the Bayes risk of $\varphi$ versus $G$. The Bayes envelope is defined on $\mathcal{G}$ by $R(G) = \inf_\varphi R(G,\varphi)$ and any $\varphi$ for which $R(G,\varphi) = R(G)$ is said to be Bayes with respect to $G$. If a determination of a Bayes rule with respect to $G$, $\varphi_G$, is made for each $G$, then $\varphi_G$ defines a mapping from $\mathcal{G}$ into the set of decision rules.

Hannan ((1957), Contributions to the Theory of Games, 3, 97-139. Ann. Math. Studies No. 39, Princeton University Press) investigated the relationship that certain continuity conditions on $\varphi_G$ have on the average risk stability of the procedure which plays Bayes against the past in a sequence of independent repetitions of a game. With $\theta_1,\theta_2,\ldots$ the sequence of parameters and $G_i$ denoting the empirical distribution of $\theta_1,\ldots,\theta_i$, this thesis continues Hannan's investigation of

    $D_n^* = n^{-1}\sum_{i=1}^{n} R(\theta_i,\varphi_{G_{i-1}}) - R(G_n)$

in the framework of a statistical decision problem component.


Chapter I gives the pertinent results of Hannan in the decision theoretic framework and offers some slight variations of conditions sufficient for $D_n^* \to 0$.

Chapter II examines in more detail the $\Theta$ finite case and begins by showing that the continuity of $R(\theta,\varphi_G)$ in $G$ is sufficient for $D_n^* \to 0$. For component decision problems where the action space $\mathcal{A}$ is also finite, Theorem 3 provides a characterization of a Hannan condition which is sufficient for $D_n^* \to 0$. This characterization reduces to a condition on the continuity of distributions of the likelihood ratios for components which are either $2 \times N$ or $M \times M$ classification problems, and the likelihood ratio condition is shown to be necessary for $D_n^* \to 0$. Hence, a complete characterization of $D_n^* \to 0$ is provided by the easily checked likelihood ratio condition.

Chapter III has $\Theta$ finite and investigates the relationship that the differentiability of $R$ has to the continuity of

ON THE CONTINUITY OF THE
BAYES RESPONSE

By

Merrilee Kathryn Helmers

A THESIS
Submitted to
Michigan State University

in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY
Department of Statistics and Probability

1972

TO MY PARENTS


ACKNOWLEDGEMENTS

I wish to express my sincerest gratitude to Professor
Dennis C. Gilliland for his guidance and encouragement in the
preparation of this thesis. His patience and willingness to
discuss any problem at any time are greatly appreciated.

I also wish to thank Professor James Hannan for his
critical reading of this thesis. In particular, I would like
to thank him for pointing out how the results of his game theory
paper could be used to shorten the proof of Theorem 2.3. Special
thanks are due to Mrs. Noralee Barnes for her excellent typing of
the manuscript and the cheerful attitude with which she did it.

Finally, I wish to express my gratitude to the National
Science Foundation and to the Department of Statistics and
Probability, Michigan State University for their financial support

during my stay at Michigan State University.


TABLE OF CONTENTS

Chapter

I     PRELIMINARIES

      1.1 The Component Statistical Decision Problem
      1.2 The Sequence of Component Problems
      1.3 Monotonicity Results for $R(\theta,\varphi_{G+t\theta})$ and $L(\theta,\varphi_{G+t\theta}(X))$

II    SUFFICIENT CONDITIONS AND NECESSARY CONDITIONS FOR ASYMPTOTIC AVERAGE RISK STABILITY

      2.1 Introduction
      2.2 Continuity of $R(\theta,\varphi_G)$ in $G$
      2.3 Condition $(H_0)$; A Characterization for $M \times N$ Decision Problems
      2.4 A Likelihood Ratio Characterization of Condition (I)

III   SMOOTHNESS OF THE BAYES ENVELOPE AND ITS RELATION TO (C)

      3.1 Introduction
      3.2 Some Mathematical Preliminaries
      3.3 Differentiability of the Bayes Envelope as a Function of $G$

BIBLIOGRAPHY

CHAPTER I

PRELIMINARIES

1. The Component Statistical Decision Problem

Consider a statistical decision problem with parameter set $\Theta$ and a random variable $X$ taking values in $\mathcal{X}$ with $P_\theta$ denoting the distribution of $X$ given parameter $\theta \in \Theta$. Let $L \ge 0$ be a (real-valued and measurable) loss function defined on $\Theta \times \mathcal{A}$ where $\mathcal{A}$ denotes the set of (random) actions. A (behavioral) decision rule $\varphi$ is a mapping from $\mathcal{X}$ into $\mathcal{A}$. The risk function of $\varphi$ is given by

(1)    $R(\theta,\varphi) \equiv \int_{\mathcal{X}} L(\theta,\varphi(x))\,P_\theta(dx).$

All risk functions are assumed finite valued and measurable with respect to $\theta$. If $G \in \mathcal{G}$, the class of all a priori probability measures on a $\sigma$-field of $\Theta$, the Bayes risk of $\varphi$ with respect to $G$ is given by

(2)    $R(G,\varphi) \equiv \int_{\Theta} R(\theta,\varphi)\,G(d\theta).$

We assume the $\sigma$-field of $\Theta$ contains all singleton sets and throughout this thesis will identify degenerate probability distributions with the singleton sets on which they concentrate all their probability. Then (2) provides a natural extension of the domain of $R(\cdot,\varphi)$ defined in (1). The Bayes envelope is given by

(3)    $R(G) \equiv \inf_{\varphi} R(G,\varphi).$

Any rule $\varphi$ such that $R(G,\varphi) = R(G)$ is said to be Bayes with respect to $G$ and will be denoted by $\varphi_G$. It is also called a Bayes response with respect to $G$. A minimal assumption that we impose is that a Bayes response exists with respect to $G$ for each prior measure $G$. If a Bayes response $\varphi_G$ is specified for each $G \in \mathcal{G}$, then $\varphi_{(\cdot)}$ defines a mapping from $\mathcal{G}$ into the class of decision rules. We will call $\varphi_{(\cdot)}$ a determination of the Bayes response.

It is convenient for our purposes to extend the domain of $R(\cdot,\varphi)$ to $\mathcal{H}$, the class of all finite signed measures on $\Theta$, and the domain of $\varphi_{(\cdot)}$ to $\mathcal{H}^+$, the subclass of all finite non-negative measures on $\Theta$. For $H \in \mathcal{H}$ let

(2')    $R(H,\varphi) \equiv \int_{\Theta} R(\theta,\varphi)\,H(d\theta)$

and note that $R(H,\varphi)$ is linear in $H$. Let $|H| = H(\Theta)$ and for $H \in \mathcal{H}^+$, $H \ne 0$, define

(4)    $\varphi_H \equiv \varphi_{H/|H|}.$

If $\varphi_0$ is specified (any $\varphi$ will do) then this specification together with (4) extends the domain of any determination of the Bayes response to the class $\mathcal{H}^+$. The resulting $\varphi_{(\cdot)}$ is positive homogeneous on $\mathcal{H}^+$, i.e.

    $\varphi_{kH} = \varphi_H$ for all $k > 0$, $H \in \mathcal{H}^+$

and

    $R(H,\varphi_H) \le R(H,\varphi)$ for all $\varphi$ and $H \in \mathcal{H}^+$.

In this thesis we will always work with positive homogeneous determinations of the Bayes response.

Throughout we will use square brackets to denote indicator functions; for example, $[f > 0]$ denotes the indicator function of the set $\{x \mid f(x) > 0\}$.

2. The Sequence of Component Problems.

Suppose the component problem is repeated with $\underline{\theta} = (\theta_1,\theta_2,\theta_3,\ldots)$ the sequence of parameter values and $\underline{X} = (X_1,X_2,X_3,\ldots)$ the sequence of observations. We assume independent repetitions; that is, the distribution of $\underline{X}$ given $\underline{\theta}$ is the product distribution $\times_{i \ge 1} P_{\theta_i}$.

Hannan (1957) launched an extensive study of the sequence of decision rules

(5)    $\underline{\varphi}^* = (\varphi_{G_0},\varphi_{G_1},\varphi_{G_2},\ldots)$

for use across the sequence of component problems where $G_0 = 0$ and

(6)    $G_i$ = empirical distribution of $\theta_1,\ldots,\theta_i$, $i = 1,2,\ldots$

Hannan (1957, §8) shows that

(7)    $\frac{1}{n}\sum_{i=1}^{n} R(\theta_i,\varphi_{G_i}) \le R(G_n) \le \frac{1}{n}\sum_{i=1}^{n} R(\theta_i,\varphi_{G_{i-1}})$

which suggests that under certain continuity conditions on the Bayes response $\varphi_{(\cdot)}$, $\frac{1}{n}\sum_{i=1}^{n} R(\theta_i,\varphi_{G_{i-1}})$ should be stable about $R(G_n)$. We let

(8)    $D_n^* = \frac{1}{n}\sum_{i=1}^{n} R(\theta_i,\varphi_{G_{i-1}}) - R(G_n)$

denote the excess over $R(G_n)$ of average risk resulting from $\underline{\varphi}^*$ and define an asymptotic uniform in $\underline{\theta}$ average risk stability resulting from $\underline{\varphi}^*$ as

(A)    $D_n^* \to 0$ as $n \to \infty$ uniformly in $\underline{\theta}$.
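As a numerical illustration of (5) through (8), the following minimal sketch computes $D_n^*$ for a small finite component problem. The densities `f`, the 0-1 loss matrix `L`, the tie-breaking rule (lowest act index), and the function names are all assumptions made for the example; none of them comes from the thesis.

```python
import numpy as np

# Hypothetical 2 x 2 component problem (all numbers are illustrative assumptions).
f = np.array([[0.7, 0.3],   # f[theta, x] = P_theta(X = x)
              [0.2, 0.8]])
L = np.array([[0.0, 1.0],   # L[theta, a], here 0-1 loss
              [1.0, 0.0]])

def bayes_rule(H):
    """Nonrandomized Bayes response versus a nonnegative measure H: at each x,
    pick an act minimizing sum_theta L(theta, a) H(theta) f_theta(x).
    Ties are broken by the lowest act index (an assumed determination)."""
    lam = np.einsum("ta,t,tx->xa", L, H, f)   # lam[x, a]
    return lam.argmin(axis=1)                 # act chosen at each x

def risk(theta, rule):
    """R(theta, phi) = sum_x L(theta, phi(x)) f_theta(x)."""
    return float(sum(L[theta, rule[x]] * f[theta, x] for x in range(f.shape[1])))

def envelope(G):
    """Bayes envelope R(G) = sum_theta G(theta) R(theta, phi_G)."""
    rule = bayes_rule(G)
    return sum(G[t] * risk(t, rule) for t in range(len(G)))

def excess(thetas):
    """Return D_n^* of (8) and the average of R(theta_i, phi_{G_{i-1}})
    minus R(theta_i, phi_{G_i}), which bounds D_n^* from above by (7)."""
    counts = np.zeros(f.shape[0])                 # n * G_n, starting from G_0 = 0
    past, current = 0.0, 0.0
    for th in thetas:
        past += risk(th, bayes_rule(counts))      # play Bayes against the past
        counts[th] += 1
        current += risk(th, bayes_rule(counts))   # Bayes against G_i
    n = len(thetas)
    return past / n - envelope(counts / n), (past - current) / n

thetas = np.random.default_rng(0).integers(0, 2, size=200)
d_n, upper = excess(thetas)
print(d_n, upper)
```

Because the empirical counts are passed unnormalized, the sketch relies on the positive homogeneity of the Bayes response; the initial rule $\varphi_{G_0}$ is whatever `argmin` returns on an all-zero table, which matches the remark that any $\varphi_0$ will do.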

The idea of comparing average risk with the envelope $R(\cdot)$ evaluated at $G_n$ is part of Robbins' (1951) original formulation of the compound decision problem. Hannan's (1957) game theoretic level investigations into variants of $\varphi_{G_{i-1}}$ were at least in part motivated by the earlier work of Robbins and of Hannan and Robbins (1955) in compound decision theory where statistical information replaces exact knowledge of $G_{i-1}$. The original formulation of the compound decision problem was set rather than sequence in nature. The idea in the sequence case is to estimate $G_{i-1}$ or, more generally, $\varphi_{G_{i-1}}$ using $X_1,\ldots,X_{i-1}$ and consider $\frac{1}{n}\sum_{i=1}^{n} R(\theta_i,\hat{\varphi}_{i-1}) - R(G_n)$. The convergence to zero has been demonstrated and bounds on rates obtained for the $2 \times 2$, $M \times N$, squared error loss estimation, two action, and other decision problems by Robbins (1951), Hannan and Robbins (1955), Hannan and Van Ryzin (1965), Van Ryzin (1966a), Samuel (1963), (1965), Gilliland (1968), Johns (1967), and many other authors too numerous to mention individually.

This thesis concerns the development of necessary conditions and sufficient conditions for (A) and its non-uniform in $\underline{\theta}$ counterpart. In some applications, for example, pattern recognition with supervised learning, $\underline{\varphi}^*$ is a procedure which is available to the decision maker and hence our results will have immediate application. Equally important is the potential use of our results in combination with results on estimation of empirical distributions for use in the important compound decision problem. A survey of known results concerning the convergence of $D_n^*$ and its loss analog to zero can be found in Gilliland (1972).

In compound decision problems authors seldom impose continuity conditions on the Bayes response because statistical estimation of $\varphi_{G_{i-1}}$ furnishes its own smoothing. Hannan (1956), (1957), Samuel (1963), (1966), Gilliland (1966), (1969) and Jilovec and Subert (1967) have investigated play against random perturbations of $G_{i-1}$ and have established asymptotic average risk stability about $R(G_n)$ without continuity conditions.

By differencing across the extremes of (7) we have

(9)    $0 \le D_n^* \le n^{-1}\sum_{i=1}^{n} [R(\theta_i,\varphi_{G_{i-1}}) - R(\theta_i,\varphi_{G_i})]$, for all $\underline{\theta}$.

Identifying $\theta_i$ with the probability measure degenerate on $\theta_i$ we have $i\,G_i = (i-1)G_{i-1} + \theta_i$; and therefore

    $\varphi_{G_i} = \varphi_{G_{i-1} + (i-1)^{-1}\theta_i}.$

This together with (9) suggests the investigation of $R(\theta,\varphi_{G+t\theta})$ as $t \downarrow 0$. Hannan (1957, p. 129) defines a condition on the component problem which we shall denote as condition (H):

(H)    $\lim_{t \downarrow 0} R(\theta,\varphi_{G+t\theta}) = R(\theta,\varphi_G)$ uniformly in $\theta \in \Theta$, $G \in \mathcal{G}$.

Proposition 1. (Hannan) (H) $\Rightarrow$ (A).

Proof. Since $\varphi_{G_i} = \varphi_{G_{i-1} + (i-1)^{-1}\theta_i}$ we see that (H) implies

(10)    $[R(\theta_i,\varphi_{G_{i-1}}) - R(\theta_i,\varphi_{G_i})] \to 0$ as $i \to \infty$ uniformly in $\underline{\theta}$.

Since the Cesàro mean preserves the uniform convergence of a sequence, (A) follows immediately from (9) and (10).

Hannan (1957, Th. 5) has observed that if $\varphi_{(\cdot)}$ satisfies a Lipschitz condition in (H) then the same analysis yields a bound on the rate of convergence of $D_n^* \to 0$ in (A).

Gilliland and Hannan (1969, p. 11) give an example involving an infinite $\Theta$ decision problem where (A) obtains for some determinations of $\varphi_{(\cdot)}$ but (H) fails for all determinations of $\varphi_{(\cdot)}$. Thus, (H) is not a necessary condition for (A) to obtain.

The next result concerns

(A')    $D_n^* \to 0$ as $n \to \infty$ for each $\underline{\theta}$

and

(H')    $\lim_{t \downarrow 0} R(\theta,\varphi_{G+t\theta}) = R(\theta,\varphi_G)$ uniformly in $G$ such that $G(\theta) > 0$

where $G(\theta)$ denotes the probability that $G$ assigns to the singleton set $\{\theta\}$.

Proposition 2. If $\Theta$ is finite and (H') holds for all $\theta \in \Theta$ then (A') obtains.

Proof. Since $\Theta$ is finite, (H') implies that for every $\varepsilon > 0$ there exists a $\delta > 0$ depending on $\varepsilon$ and not depending on $\theta$ or $G$ such that $G(\theta) > 0$ with $|R(\theta,\varphi_{G+t\theta}) - R(\theta,\varphi_G)| \le \varepsilon$ for all $0 \le t \le \delta$, $\theta$ and $G$ such that $G(\theta) > 0$. Thus,

(11)    $|R(\theta_i,\varphi_{G_{i-1}}) - R(\theta_i,\varphi_{G_i})| \le \varepsilon$ if $(i-1)^{-1} \le \delta$ and $G_{i-1}(\theta_i) > 0$.

For any sequence of parameters $\underline{\theta}$, $G_{i-1}(\theta_i) > 0$ except finitely often and since risks $R(\theta_i,\varphi)$ are finite, we see that (A') follows directly from (11) and (9).

Proposition 3. If $\Theta$ is finite of cardinality $M$ and the loss function $L$ is bounded, say $L(\theta,a) \le B < \infty$ for all $\theta \in \Theta$, $a \in \mathcal{A}$, then (H') holding for all $\theta \in \Theta$ implies (A) obtains.

Proof. With $\varepsilon > 0$ arbitrary and $\delta$ as in the proof of Proposition 2, we have from (9) that

    $0 \le D_n^* \le \frac{1}{n}\sum_{i \le n:\, G_{i-1}(\theta_i) = 0} [R(\theta_i,\varphi_{G_{i-1}}) - R(\theta_i,\varphi_{G_i})] + \frac{1}{n}\sum_{i \le n:\, G_{i-1}(\theta_i) > 0} [R(\theta_i,\varphi_{G_{i-1}}) - R(\theta_i,\varphi_{G_i})]$

    $\le \frac{1}{n}(BM) + \frac{1}{n}\sum_{i \le n:\, (i-1)^{-1} > \delta} B + \frac{1}{n}\sum_{i \le n:\, (i-1)^{-1} \le \delta} \varepsilon$

    $\le n^{-1}(M + \delta^{-1} + 1)B + \varepsilon.$

Thus, there exists an $n_0$ so large that $0 \le D_n^* \le 2\varepsilon$ for all $\underline{\theta}$ and $n \ge n_0$, i.e., (A) obtains.

In Chapter II we investigate other sufficient conditions for (A') and (A) and develop a necessary and sufficient condition for (A') for some special component problems. The next section of this chapter deals with the monotonicity of $R(\theta,\varphi_{G+t\theta})$ and $L(\theta,\varphi_{G+t\theta}(X))$ in $t$ and presents a useful necessary condition and a useful sufficient condition for a Bayes response to satisfy the limit condition in (H).

3. Monotonicity Results for $R(\theta,\varphi_{G+t\theta})$ and $L(\theta,\varphi_{G+t\theta}(X))$.

Hannan (1957, p. 129) has remarked that $R(\theta,\varphi_{G+t\theta})$ is monotone decreasing in $t \ge 0$. In this section we state and prove the Hannan result with an extension on the domain of $t$ (Proposition 4) and then establish an analogous a.s. loss monotonicity result (Proposition 5). For $\theta \in \Theta$ and $G \in \mathcal{G}$ let $G(\theta)$ denote the measure that $G$ assigns to the singleton set $\{\theta\}$.

Proposition 4. (Hannan). For all $\theta \in \Theta$ and $G \in \mathcal{G}$,

(12)    $R(\theta,\varphi_{G+t\theta}) \downarrow$ in $t \ge -G(\theta)$.

Proof. Let $\theta$ and $G$ be given and consider $t > 0$. The inequality

    $R(G + t\theta,\varphi_{G+t\theta}) \le R(G + t\theta,\varphi_G)$

can be weakened to

    $R(G + t\theta,\varphi_{G+t\theta}) \le R(G + t\theta,\varphi_G) + R(G,\varphi_{G+t\theta}) - R(G,\varphi_G).$

Using the linearity of $R(H,\varphi)$ in $H$ and the fact that $t > 0$ we obtain

(13)    $R(\theta,\varphi_{G+t\theta}) \le R(\theta,\varphi_G).$

Note that (12) is trivially satisfied if $G$ is degenerate at $\theta$ so we exclude this case in the remainder of the proof. Let $-G(\theta) \le t_1 < t_2$ be fixed and consider $G^* = (G + t_1\theta)/(1 + t_1)$ and $t = (t_2 - t_1)/(1 + t_1)$. $G^*$ is a probability measure since $t_1 \ge -G(\theta)$ and $t_1 = -1$ is excluded since $G$ is not degenerate at $\theta$. Since $\varphi_{G^*} = \varphi_{G+t_1\theta}$ and $\varphi_{G^*+t\theta} = \varphi_{G+t_2\theta}$, replacing $G$ by $G^*$ in (13) yields

    $R(\theta,\varphi_{G+t_2\theta}) \le R(\theta,\varphi_{G+t_1\theta}).$

Since $t_1$ and $t_2$ are arbitrary numbers satisfying $-G(\theta) \le t_1 < t_2$, the proof of (12) is complete.
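The monotonicity in (12) can be checked numerically on a small finite problem. The sketch below is only an illustration under assumed inputs: the densities `f`, the loss matrix `L`, the prior, and the lowest-index tie-breaking rule are hypothetical choices, not taken from the thesis.

```python
import numpy as np

# Hypothetical finite problem: 2 parameters, 3 observations, 3 acts.
f = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.3, 0.6]])      # f[theta, x] = P_theta(X = x)
L = np.array([[0.0, 1.0, 3.0],
              [4.0, 1.5, 0.0]])      # L[theta, a]

def risk_of_bayes_rule(theta, H):
    """R(theta, phi_H) for a nonrandomized Bayes response versus the
    nonnegative measure H (ties broken by the lowest act index)."""
    lam = np.einsum("ta,t,tx->xa", L, H, f)   # lam[x, a]
    acts = lam.argmin(axis=1)
    return float(sum(L[theta, acts[x]] * f[theta, x] for x in range(f.shape[1])))

def risk_path(theta, G, ts):
    """R(theta, phi_{G + t*theta}) evaluated along a grid of t >= -G(theta)."""
    out = []
    for t in ts:
        H = np.array(G, dtype=float)
        H[theta] += t                 # add mass t at the point theta
        out.append(risk_of_bayes_rule(theta, H))
    return out

G = [0.4, 0.6]
ts = np.linspace(-0.4, 5.0, 60)       # grid starts at t = -G(theta) for theta = 0
path = risk_path(0, G, ts)
print(all(b <= a + 1e-12 for a, b in zip(path, path[1:])))
```

The grid starts at $t = -G(\theta)$, which is the extended range of $t$ covered by Proposition 4.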

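Consider the following assumptions on the component decision problem.

(A1) There exists a $\sigma$-finite measure $\mu$ such that $P_\theta \ll \mu$ for all $\theta \in \Theta$.

(A2) For $H \in \mathcal{H}^+$, $R(H,\varphi)$ can be written $\int_{\mathcal{X}}\int_{\Theta} L(\theta,\varphi(x)) f_\theta(x)\,H(d\theta)\,\mu(dx)$ and a Bayes response with respect to $H$ is given by choosing $\varphi(x)$ to minimize the inner integral a.s. $P_H$.

In (A2) $f_\theta$ is a version of $dP_\theta/d\mu$ and

(14)    $P_H(A) = \int_{\Theta} P_\theta(A)\,H(d\theta)$

defines the measure ($H$ mixture) $P_H$ on the $\sigma$-field of $\mathcal{X}$. Let $f_H$ denote a version of $dP_H/d\mu$.

Under (A1) and (A2) $\varphi$ is Bayes with respect to $H$ if and only if

(15)    $\int_{\Theta} L(\theta,\varphi(x)) f_\theta(x)\,H(d\theta) = \inf_{a} \int_{\Theta} L(\theta,a) f_\theta(x)\,H(d\theta)$ a.s. $P_H$.

Proposition 5. Suppose (A1) and (A2) are satisfied. Then for all $\theta \in \Theta$ and $G \in \mathcal{G}$, and $-G(\theta) \le t_1 < t_2$,

(16)    $L(\theta,\varphi_{G+t_2\theta}(X)) \le L(\theta,\varphi_{G+t_1\theta}(X))$ a.s. $P_\theta$.

Proof. If $G$ is degenerate at $\theta$, (16) is trivially true. We therefore exclude this case in the remainder of the proof. The transformation given in the proof of Proposition 4 will establish the result for general $t_1,t_2$ such that $-G(\theta) \le t_1 < t_2$ once the result is proved for $0 = t_1 < t_2 = t$. Let $\theta$ and $G$ be given and fix $t > 0$. For any $\varphi$

(17)    $R(G+t\theta,\varphi) = \int_{[f_G > 0]}\int_{\Theta} \{L(\alpha,\varphi(x)) f_\alpha(x) + t\,L(\theta,\varphi(x)) f_\theta(x)\}\,G(d\alpha)\,\mu(dx) + t\int_{[f_G = 0,\, f_\theta > 0]} L(\theta,\varphi(x)) f_\theta(x)\,\mu(dx).$

Let $P_\theta = \nu_1 + \nu_2$ where $\nu_1 \ll P_G$ and $\nu_2 \perp P_G$ so that $\nu_1$ has density $f_\theta[f_G > 0]$ and $\nu_2$ has density $f_\theta[f_G = 0]$. Since $\varphi_{G+t\theta}$ minimizes $R(G+t\theta,\varphi)$ over choices of $\varphi$ we see from (17) that

    $L(\theta,\varphi_{G+t\theta}(x)) = \min_{a} L(\theta,a)$ a.s. $\nu_2$

so that

(18)    $L(\theta,\varphi_{G+t\theta}(X)) \le L(\theta,\varphi_G(X))$ a.s. $\nu_2$.

From (15) we have

(19)    $\int_{\Theta} \{L(\alpha,\varphi_{G+t\theta}(x)) f_\alpha(x) + t\,L(\theta,\varphi_{G+t\theta}(x)) f_\theta(x)\}\,G(d\alpha) \le \int_{\Theta} \{L(\alpha,\varphi_G(x)) f_\alpha(x) + t\,L(\theta,\varphi_G(x)) f_\theta(x)\}\,G(d\alpha)$ a.s. $P_G$

and

(20)    $\int_{\Theta} L(\alpha,\varphi_{G+t\theta}(x)) f_\alpha(x)\,G(d\alpha) \ge \int_{\Theta} L(\alpha,\varphi_G(x)) f_\alpha(x)\,G(d\alpha)$ a.s. $P_G$

which together imply

(21)    $t\,L(\theta,\varphi_{G+t\theta}(X)) f_\theta(X) \le t\,L(\theta,\varphi_G(X)) f_\theta(X)$ a.s. $P_G$.

Since $t > 0$ and $f_\theta(X) > 0$ a.s. $\nu_1$ where $\nu_1 \ll P_G$, (21) implies

(22)    $L(\theta,\varphi_{G+t\theta}(X)) \le L(\theta,\varphi_G(X))$ a.s. $\nu_1$.

Combining (18) and (22) we have

(23)    $L(\theta,\varphi_{G+t\theta}(X)) \le L(\theta,\varphi_G(X))$ a.s. $P_\theta$.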

We conclude this section with two propositions which will be used in the proof of Theorem 3 of Chapter II. The results are presented here because they apply to more general decision problems than the finite $\Theta$ problems investigated in detail in Chapter II. The first proposition gives a necessary condition for a Bayes response $\varphi_{(\cdot)}$ to satisfy

(24)    $\lim_{t \downarrow 0} R(\theta,\varphi_{G+t\theta}) = R(\theta,\varphi_G)$

and is presented at the game theoretic level by Hannan (1957, p. 129). We repeat the result here in our notation.

Proposition 6. (Hannan). Let $\varphi_{(\cdot)}$ be a determination of the Bayes response. Then (24) implies that

(25)    $R(\theta,\varphi_G) \le R(\theta,\varphi)$ for all $\varphi$ which are Bayes wrt $G$.

Proof. Let $G$ be given and suppose $\varphi_{(\cdot)}$ is a determination of the Bayes response satisfying (24). Let $\varphi$ be Bayes wrt $G$. Then as in the paragraph following (8.12) of Hannan (1957),

    $t\,R(\theta,\varphi_{G+t\theta}) + R(G) \le R(G+t\theta) \le R(G+t\theta,\varphi) = R(G) + t\,R(\theta,\varphi)$

so that

    $R(\theta,\varphi_G) - R(\theta,\varphi_{G+t\theta}) \ge R(\theta,\varphi_G) - R(\theta,\varphi)$ if $t > 0$.

By (24) we see that $R(\theta,\varphi_G) \le R(\theta,\varphi)$.

The next proposition is a slight variation of the results presented by Hannan (1957, p. 129) concerning sufficient conditions for (24) to obtain. We need the following

Definition. A Bayes response $\varphi_G$ is said to be $G$-dominant if

(26)    $R(\alpha,\varphi_G) \le R(\alpha,\varphi)$ a.s. $G$ for all $\varphi$ Bayes wrt $G$.

(Of course, $\varphi_G$ is $G$-dominant if and only if $R(\alpha,\varphi_G) = R(\alpha,\varphi)$ a.s. $G$ for all $\varphi$ Bayes wrt $G$.)

Proposition 7. (Hannan). Suppose the class of risk functions $\{R(\cdot,\varphi) \mid \varphi$ a decision rule$\}$ is sequentially compact under pointwise convergence on $\Theta$ and that $R(\theta,\varphi) \le B < \infty$ for all $\theta$ and $\varphi$. Let $G \in \mathcal{G}$. (i) If $\varphi_{(\cdot)}$ is such that $\varphi_G$ is $G$-dominant, then (24) obtains for all $\theta$ such that $G(\theta) > 0$. (ii) If $\varphi_{(\cdot)}$ is such that $\varphi_G$ is everywhere dominant then (24) obtains for all $\theta$.

Proof. (i) Suppose $\theta$ is such that $G(\theta) > 0$. Let $t_i \downarrow 0$ and by the sequential compactness assumption let $t^k$ be a subsequence and $\varphi$ a decision rule such that

(27)    $R(\alpha,\varphi) = \lim_{k} R(\alpha,\varphi_{G+t^k\theta})$, $\alpha \in \Theta$.

Since by the dominated convergence theorem,

    $R(G,\varphi) = \int \lim_{k} R(\alpha,\varphi_{G+t^k\theta})\,G(d\alpha) = \lim_{k} R(G,\varphi_{G+t^k\theta}),$

and since

    $R(G,\varphi_{G+t\theta}) \le R(G) + t\,R(\theta,\varphi_G)$, $t > 0$,

we see that $R(G,\varphi) \le R(G)$. Hence, the weak limit $\varphi$ is Bayes wrt $G$. Since $\varphi_G$ is $G$-dominant and $G(\theta) > 0$ we must have

(28)    $R(\theta,\varphi) = R(\theta,\varphi_G).$

Since $\lim_{t \downarrow 0} R(\theta,\varphi_{G+t\theta})$ exists by the monotonicity in $t$ (Proposition 4) we see that (27) and (28) combine to yield (24). (ii) The proof is carried by the same arguments used in the proof of (i).

CHAPTER II

SUFFICIENT CONDITIONS AND NECESSARY CONDITIONS FOR
ASYMPTOTIC AVERAGE RISK STABILITY

1. Introduction.

 

In this chapter and in Chapter III we consider component statistical problems in which $\Theta$ is of finite cardinality $M \ge 2$. If $\Theta = \{1,\ldots,M\}$, each a priori probability measure $G$ can be identified with the $M$-dimensional vector $(G(1),\ldots,G(M))$, where $G(\theta)$ is the mass put on $\theta$ by $G$. $\mathcal{G}$ may therefore be identified with the $M-1$ dimensional simplex in $E^M$ (Euclidean $M$ space). That is,

    $\mathcal{G} = \{y = (y_1,\ldots,y_M) \mid y_i \ge 0,\ 1 \le i \le M,\ \sum_{i=1}^{M} y_i = 1\}.$

We will equip $\mathcal{G}$ with the usual Euclidean distance, given by

    $d(y,y') = [\sum_{i=1}^{M} (y_i - y_i')^2]^{1/2}.$

This chapter concerns itself with sufficient conditions for (A) and (A') to obtain. The conditions fall into two categories. In Section 2, we discuss sufficient conditions involving the (joint) continuity of $R(\theta,\varphi_G)$ in $G$ as opposed to the continuity of $R(\theta,\varphi_G)$ along lines in $\mathcal{H}^+$ as given by conditions (H) and (H') in Section 1.2. In Section 3, we consider component problems in which the action space $\mathcal{A}$ is also finite. We study a pointwise version of (H) called $(H_0)$ and relate it to a condition (I) involving both the family of probability measures $\{P_\theta \mid \theta \in \Theta\}$ and the loss structure. Under certain assumptions these two conditions are equivalent and each is a sufficient condition for (A') to obtain. The condition (I) is often simpler to verify than $(H_0)$. In Section 4, we turn to some special component problems. In two problems, one in which $\Theta$ consists of two elements and $\mathcal{A}$ of any arbitrary (but finite) number of elements, and the other, the $M \times M$ classification problem with 0-1 loss, we find a condition $(I_0)$ equivalent to $(H_0)$ but stated in terms of the distributions of the likelihood ratios $f_\alpha(X)/f_\theta(X)$ only. This condition is, in general, quite simple to verify. Moreover, it is a necessary condition for (A'), and hence for (A), to obtain. In the third component problem examined in this section, one in which $\Theta$ consists of any arbitrary (but finite) number of elements and $\mathcal{A}$ consists of two elements, the structure of the problem plays a very important role. A condition on the likelihood ratios is given which is a necessary condition for (A') to obtain. It is not, in general, equivalent to $(H_0)$ although it provides a simple means of verifying that (A') does not obtain.

2. Continuity of $R(\theta,\varphi_G)$ in $G$.

The convergence conditions (H) and (H') defined in Section 1.2 concern the continuity of $R(\theta,\varphi_G)$ along lines in $\mathcal{H}^+$. The conditions we now study involve the (joint) continuity of $R(\theta,\varphi_G)$ in $G$. Consider

(C)    $R(\theta,\varphi_G)$ continuous in $G \in \mathcal{G}$

(C')    $R(\theta,\varphi_G)$ continuous in $G \in \mathcal{G}$ such that $G(\theta) > 0$.

Theorem 1. If $\Theta$ is finite and (C) obtains for all $\theta$, then (H) obtains; and, hence, by Proposition 1.1, (A) obtains.

Proof. If $R(\theta,\varphi_G)$ is continuous in $G \in \mathcal{G}$, then it is uniformly continuous in $G \in \mathcal{G}$. Since $\varphi_{G+t\theta} = \varphi_{G_t^*}$ where $G_t^* = (G + t\theta)/(1 + t) \in \mathcal{G}$ and

    $d^2(G_t^*,G) = \sum_{\alpha \in \Theta - \{\theta\}} (G(\alpha) - (1+t)^{-1}G(\alpha))^2 + [G(\theta) - (1+t)^{-1}(G(\theta) + t)]^2$

    $= \frac{t^2}{(1+t)^2}\sum_{\alpha \in \Theta - \{\theta\}} G^2(\alpha) + \frac{t^2}{(1+t)^2}[G(\theta) - 1]^2$

    $\le 2t^2,$

we see that $d(G_t^*,G) \to 0$ as $t \downarrow 0$ and hence $R(\theta,\varphi_{G+t\theta}) \to R(\theta,\varphi_G)$ uniformly in $G$ as $t \downarrow 0$. Since $\Theta$ is finite the convergence is uniform in $\theta$ and $G$.
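The distance computation in the proof of Theorem 1 is easy to verify numerically. The following sketch, with assumed helper names and randomly generated priors, evaluates $d^2(G_t^*,G)$ directly and compares it with the closed form and with the bound $2t^2$.

```python
import numpy as np

def shifted_prior(G, theta, t):
    """G_t^* = (G + t*theta) / (1 + t), theta being identified with a point mass."""
    H = np.array(G, dtype=float)
    H[theta] += t
    return H / (1.0 + t)

def squared_distance(G, theta, t):
    """Squared Euclidean distance d^2(G_t^*, G) on the simplex."""
    diff = shifted_prior(G, theta, t) - np.asarray(G, dtype=float)
    return float(diff @ diff)

def closed_form(G, theta, t):
    """t^2/(1+t)^2 * (sum_{alpha != theta} G(alpha)^2 + (G(theta) - 1)^2)."""
    G = np.asarray(G, dtype=float)
    others = float(G @ G) - G[theta] ** 2
    return t ** 2 / (1.0 + t) ** 2 * (others + (G[theta] - 1.0) ** 2)

rng = np.random.default_rng(0)      # the seed and sizes below are arbitrary choices
G = rng.dirichlet(np.ones(4))       # a random prior on a 4-point parameter set
for t in (1.0, 0.1, 0.01):
    print(t, squared_distance(G, 2, t), closed_form(G, 2, t), 2 * t ** 2)
```

The bound $2t^2$ does not depend on $G$ or $\theta$, which is what makes the convergence $d(G_t^*,G) \to 0$ uniform.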

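Theorem 2. If $\Theta$ is finite of cardinality $M$, the loss function $L$ is bounded, say $L(\theta,a) \le B < \infty$ for all $\theta \in \Theta$, $a \in \mathcal{A}$, and (C') obtains for all $\theta$, then (A') obtains.

Proof. Let $0 < \delta \le 1$ be fixed and define the sets

    $J_{n,\alpha} = \{i \mid 1 \le i \le n,\ \theta_i = \alpha,\ G_{i-1}(\alpha) < \delta\}$, $\alpha \in \Theta$

    $K_{n,\alpha} = \{i \mid 1 \le i \le n,\ \theta_i = \alpha,\ G_{i-1}(\alpha) \ge \delta\}$, $\alpha \in \Theta$

and let $|J_{n,\alpha}|$ and $|K_{n,\alpha}|$ denote their respective cardinalities. By (1.9),

(1)    $0 \le D_n^* \le \frac{1}{n}\sum_{\alpha \in \Theta} \{\sum_{i \in J_{n,\alpha}} [R(\theta_i,\varphi_{G_{i-1}}) - R(\theta_i,\varphi_{G_i})] + \sum_{i \in K_{n,\alpha}} [R(\theta_i,\varphi_{G_{i-1}}) - R(\theta_i,\varphi_{G_i})]\}.$

With $i_\alpha = \max J_{n,\alpha}$ we see that $G_{i_\alpha}(\alpha) < [(i_\alpha - 1)\delta + 1]/i_\alpha \le \delta + i_\alpha^{-1}$. Thus, either $i_\alpha \le \delta^{-1}$ or $G_{i_\alpha}(\alpha) < 2\delta$. Since $J_{n,\alpha} \subset \{1,\ldots,i_\alpha\}$ and $i_\alpha G_{i_\alpha}(\alpha) \ge |J_{n,\alpha}|$ we see that

    $|J_{n,\alpha}| \le \max\{\delta^{-1}, 2\delta i_\alpha\} \le \max\{\delta^{-1}, 2\delta n\}.$

The first double sum on the right hand side of (1) is seen to be bounded by

(2)    $\max\{B M \delta^{-1}/n,\ 2BM\delta\}.$

Since (C') obtains for all $\theta \in \Theta$, $R(\alpha,\varphi_G)$ is uniformly continuous on the closed set $\mathcal{G}_\alpha = \{G \mid G(\alpha) \ge \delta\}$ for each $\alpha$. Since $G_{i-1} \in \mathcal{G}_\alpha$ for each $i \in K_{n,\alpha}$ we have

(3)    $\lim_{n \to \infty} \frac{1}{n}\sum_{i \in K_{n,\alpha}} [R(\theta_i,\varphi_{G_{i-1}}) - R(\theta_i,\varphi_{G_{i-1} + (i-1)^{-1}\theta_i})] = 0.$

Combining (1) and (3) and weakening by the bound (2) we obtain

    $0 \le \limsup_{n} D_n^* \le 2BM\delta.$

Since $B$ and $M$ are fixed and $\delta > 0$ is arbitrarily small, the convergence $D_n^* \to 0$ as $n \to \infty$ is proved.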
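The counting step in the proof of Theorem 2, namely that the set of stages at which the current parameter value has past relative frequency below $\delta$ has cardinality at most $\max\{\delta^{-1}, 2\delta n\}$, can be checked on simulated parameter sequences. The sketch below is an illustration only; the function name and the random sequence are assumptions.

```python
import random

def low_frequency_counts(thetas, M, delta):
    """For each alpha in {0, ..., M-1}, count the stages i with theta_i = alpha
    and G_{i-1}(alpha) < delta, where G_0 is taken to be the zero measure."""
    counts = [0] * M          # occurrences of each alpha among theta_1..theta_{i-1}
    J = [0] * M
    for i, th in enumerate(thetas, start=1):
        prev_freq = counts[th] / (i - 1) if i > 1 else 0.0
        if prev_freq < delta:
            J[th] += 1
        counts[th] += 1
    return J

random.seed(0)                # arbitrary seed for a reproducible illustration
n, M = 2000, 4
thetas = [random.randrange(M) for _ in range(n)]
for delta in (0.5, 0.25, 0.1, 0.01):
    J = low_frequency_counts(thetas, M, delta)
    print(delta, J, max(1 / delta, 2 * delta * n))
```

Dividing the bound by $n$ and multiplying by $BM$ gives the expression in (2), whose limit as $n \to \infty$ is at most $2BM\delta$.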

A study of condition (C) has implications beyond those related to the asymptotic average risk stability (A). For example, in a decision problem with $\varphi_{(\cdot)}$ a Bayes response for which (C) is satisfied for each $\theta$, if $\hat{G}$ is any estimate of the a priori distribution $G$ which is independent of the data gathered in the given decision problem, then $R(\theta,\varphi_{\hat{G}}) \to R(\theta,\varphi_G)$ a.s. as $\hat{G} \to G$ a.s. for each $\theta$. Thus, under a bounded risk assumption, $R(G,\varphi_{\hat{G}}) \to R(G,\varphi_G) = R(G)$ a.s. by the dominated convergence theorem; i.e., playing Bayes versus a consistent estimate of $G$ is asymptotically optimal.

We now turn to the study of the variation $(H_0)$ of (H) and its characterization in terms of the loss structure in a finite action decision problem.

3. Condition $(H_0)$; A Characterization for $M \times N$ Decision Problems.

Let $\Theta = \{1,2,\ldots,M\}$ and $\mathcal{A} = \{1,2,\ldots,N\}$ where both $M$ and $N$ are finite so that the loss function $L$ is given by an $M \times N$ matrix of non-negative numbers whose $(\theta,a)$ element is $L(\theta,a)$. We will show (Theorem 4) that if the following condition

$(H_0)$    $\lim_{t \downarrow 0} R(\theta,\varphi_{G+t\theta}) = R(\theta,\varphi_G)$ for all $G$ such that $G(\theta) > 0$

holds for all $\theta \in \Theta$ then (C') also holds for all $\theta$ and hence, via Theorem 2, (A') obtains. First we develop a characterization of $(H_0)$ in terms of the structure of the $M \times N$ component decision problem (Theorem 3).

Throughout this section and the next we assume:

(4)    The columns of $(L(\theta,a))$ are mutually nondominated; that is, no pair of distinct $a,a' \in \mathcal{A}$ exists for which $L(\theta,a) \ge L(\theta,a')$ for all $\theta \in \Theta$.

Equivalently, for every pair of distinct $a,a' \in \mathcal{A}$ there exist $\theta,\theta' \in \Theta$ such that

(5)    $[L(\theta,a) - L(\theta,a')][L(\theta',a) - L(\theta',a')] < 0.$

As in Chapter I let $f_\theta$ denote a version of $dP_\theta/d\mu$ where $\mu$ is a $\sigma$-finite measure dominating each $P_\theta$. Since $\Theta$ is finite, $\mu$ can be taken to be the finite measure $\sum_{\alpha \in \Theta} P_\alpha$.

Definition 1. For each $a \in \mathcal{A}$, $H \in \mathcal{H}$, $x \in \mathcal{X}$, we define

    $\lambda_a(H,x) = \sum_{\alpha \in \Theta} L(\alpha,a) H(\alpha) f_\alpha(x).$

We note that a Bayes response versus $G \in \mathcal{G}$ is characterized a.e. $\mu$ by

(6)    $\varphi_G(x)(a) = 1$ if $\lambda_a(G,x) < \min_{a' \ne a} \lambda_{a'}(G,x)$,
       $\varphi_G(x)(a)$ arbitrary if $\lambda_a(G,x) = \min_{a' \ne a} \lambda_{a'}(G,x)$,
       $\varphi_G(x)(a) = 0$ if $\lambda_a(G,x) > \min_{a' \ne a} \lambda_{a'}(G,x)$,

with the proviso $\sum_{a} \varphi_G(x)(a) = 1$. We will find it convenient to let $\varphi_G(a \mid x)$ denote $\varphi_G(x)(a)$, the probability assigned to act $a$ by the probability measure $\varphi_G(x)$, and henceforth we employ this notation.

For the purpose of investigating $(H_0)$ for a particular fixed $\theta$ we can find a permutation of the acts $a_1,a_2,\ldots,a_N$ such that

    $L(\theta,a_1) \le L(\theta,a_2) \le \cdots \le L(\theta,a_N)$

and define the equivalence relation

    $a \sim a'$ iff $L(\theta,a) = L(\theta,a')$.

The permutation and the equivalence relation clearly depend on $\theta$ but we will not use additional notation to display this dependence. With $A_1,\ldots,A_r$ denoting the equivalence classes with the order on subscripts denoting the order on losses we have

(7)    $L(\theta,A_1) < L(\theta,A_2) < \cdots < L(\theta,A_r)$

and for any decision rule $\varphi$

(8)    $R(\theta,\varphi) = \int \sum_{j=1}^{r} L(\theta,A_j)\,\varphi(A_j \mid x)\,f_\theta(x)\,\mu(dx)$

where $L(\theta,A_j) = L(\theta,a)$ for $a \in A_j$ and $\varphi(A_j \mid x) = \sum_{a \in A_j} \varphi(a \mid x)$.

Definition 2. For each $\theta \in \Theta$, $H \in \mathcal{H}$, $x \in \mathcal{X}$, and $j = 1,\ldots,r$ define

(9)    $\rho_j(H,x) = \min_{a \in A_j} \lambda_a(H,x).$

The display of dependence on $\theta$ is again suppressed.

We see by (6) that a Bayes rule $\varphi_G$ must satisfy a.e. $\mu$

(10)    $\varphi_G(A_i \mid x) = 1$ if $\rho_i(G,x) < \min_{j \ne i} \rho_j(G,x)$,
        $\varphi_G(A_i \mid x) = 0$ if $\rho_i(G,x) > \min_{j \ne i} \rho_j(G,x)$.

Lemma 1. Let $\theta$ and $G$ such that $G(\theta) > 0$ be fixed. Then

(11)    $P_\theta[\rho_i(G,X) = \min_{j \ne i} \rho_j(G,X)] = 0$, $i = 1,\ldots,r$

if and only if $R(\theta,\varphi_G)$ is unique across all determinations of $\varphi_G$.

Proof. (Only if) For any decision rule $\varphi$, $R(\theta,\varphi)$ is given by (8). Using (8), (10) and (11) we see that

    $R(\theta,\varphi_G) = \sum_{i=1}^{r} L(\theta,A_i)\,P_\theta[\rho_i(G,X) < \min_{j \ne i} \rho_j(G,X)]$

for any determination of $\varphi_G$.

(If). Suppose there exists an $i$ for which

    $P_\theta[\rho_i(G,X) = \min_{j \ne i} \rho_j(G,X)] > 0.$

Then there exists a $j \ne i$ such that $P_\theta(B_{ij}) > 0$ where

(12)    $B_{ij} = [\rho_i(G,x) = \rho_j(G,x) \le \min_{k \ne i,j} \rho_k(G,x)].$

Consider two Bayes responses $\varphi_G$ and $\varphi_G'$ such that

    $\varphi_G(\cdot \mid x) = \varphi_G'(\cdot \mid x)$, for $x \notin B_{ij}$

    $\varphi_G(A_i \mid x) = 1$, $\varphi_G'(A_j \mid x) = 1$, for $x \in B_{ij}$.

Then

    $R(\theta,\varphi_G) - R(\theta,\varphi_G') = [L(\theta,A_i) - L(\theta,A_j)]\,P_\theta(B_{ij}) \ne 0$

by (7) and the fact that $P_\theta(B_{ij}) > 0$.
We now give a characterization of $(H_0)$ in

Theorem 3. A determination of the Bayes response $\varphi_{(\cdot)}$ satisfies $(H_0)$ for all $\theta \in \Theta$ if and only if

(I)    $P_\theta[\rho_i(G,X) = \min_{j \ne i} \rho_j(G,X)] = 0$ for all $i = 1,\ldots,r$ and $G$ such that $G(\theta) > 0$

holds for all $\theta \in \Theta$, in which case all determinations $\varphi_{(\cdot)}$ satisfy $(H_0)$ for all $\theta \in \Theta$.

Proof. (Only if). Suppose $\varphi_{(\cdot)}$ is a determination which satisfies $(H_0)$ for all $\theta \in \Theta$. Let $\theta$ and $G$ such that $G(\theta) > 0$ be fixed. By Proposition 1.6, $R(\alpha,\varphi_G) \le R(\alpha,\varphi_G')$ for all $\alpha$ such that $G(\alpha) > 0$ and $\varphi_G'$ Bayes with respect to $G$. Hence, $R(\alpha,\varphi_G) = R(\alpha,\varphi_G')$ for all $\alpha$ such that $G(\alpha) > 0$. In particular $R(\theta,\varphi_G) = R(\theta,\varphi_G')$. The uniqueness of $R(\theta,\varphi_G)$ across determinations of the Bayes response ensures that (11) obtains (Lemma 1). Since $\theta$ and $G$ such that $G(\theta) > 0$ are arbitrary we see that (I) holds for all $\theta \in \Theta$.

(If). Suppose that (I) obtains for all $\theta \in \Theta$ and let $\varphi_{(\cdot)}$ be any determination of the Bayes response. Fix $\theta$ and $G$ such that $G(\theta) > 0$. Lemma 1 shows that for all $\alpha$ such that $G(\alpha) > 0$, $R(\alpha,\varphi_G) = R(\alpha,\varphi_G')$ for all $\varphi_G'$ Bayes with respect to $G$. Hence, $\varphi_G$ is $G$-dominant. For the $M \times N$ decision problem the risk set is compact. Hence, by Proposition 1.7, $\lim_{t \downarrow 0} R(\theta,\varphi_{G+t\theta}) = R(\theta,\varphi_G)$. Since $\theta$ and $G$ such that $G(\theta) > 0$ are arbitrary we see that $(H_0)$ holds for all $\theta \in \Theta$.

The condition $(H_0)$ is a one-sided directional continuity condition for $R(\theta,\varphi_H)$ as a function of $H \in \mathcal{H}^+$. Our next theorem shows it to be equivalent to the (joint) continuity condition (C').

Theorem 4. $(H_0)$ holds for all $\theta \in \Theta$ if and only if (C') holds for all $\theta \in \Theta$.

For the proof of Theorem 4 we will use

Lemma 2. Let $\theta \in \Theta$ be fixed and let $A_1,\ldots,A_r$ be the corresponding equivalence classes (cf. (7)). If $H_n$ is a sequence of probability measures, i.e., $\{H_n\} \subset \mathcal{G}$, such that $H_n \to G$, then

(13)    $\rho_i(H_n,x) \to \rho_i(G,x)$ as $n \to \infty$ uniformly in $x$

for all $i = 1,2,\ldots,r$.

Proof. By Definitions 2 and 1 we see that

(14)    $|\rho_i(H_n,x) - \rho_i(G,x)| \le \max_{a \in A_i} |\lambda_a(H_n,x) - \lambda_a(G,x)| \le \max_{a \in A_i} \sum_{\alpha \in \Theta} L(\alpha,a)\,|H_n(\alpha) - G(\alpha)|\,f_\alpha(x).$

Since $\max_{\alpha,a} L(\alpha,a) = B < \infty$ and $\sup_{\alpha,x} f_\alpha(x) \le 1$ (by choice of the dominating measure $\mu = \sum_{\alpha} P_\alpha$) and since $\Theta$ and $\mathcal{A}$ are finite, we see that the right hand side of (14) converges to 0 as $n \to \infty$ uniformly in $x$ so that (13) obtains.

Proof of Theorem 4. (If) Suppose (C') holds for $\theta$ and fix $G$ such that $G(\theta) > 0$. Note that $\varphi_{G+t\theta} = \varphi_{G_t}$ where $G_t = (G + t\theta)/(1 + t)$ and note that $G_t \to G$ as $t \downarrow 0$. Hence, by (C'), $R(\theta,\varphi_{G_t}) \to R(\theta,\varphi_G)$ as $t \downarrow 0$, i.e., $(H_0)$ holds at $\theta$ and $G$.

(Only if) Assume $(H_0)$ holds for all $\theta$. Now fix $\theta$ and $G$ such that $G(\theta) > 0$. We will show that $R(\theta,\varphi_{H_n}) \to R(\theta,\varphi_G)$ as $H_n \to G$. By Theorem 3, (I) obtains and for any determination of $\varphi_{(\cdot)}$, (8) and (10) yield

(15)    $R(\theta,\varphi_G) = \sum_{i=1}^{r} L(\theta,A_i)\,P_\theta[\rho_i(G,X) < \min_{j \ne i} \rho_j(G,X)].$

For sufficiently large $n$, $H_n(\theta) > 0$ and we can write

(16)    $R(\theta,\varphi_G) - R(\theta,\varphi_{H_n}) = \sum_{i=1}^{r} L(\theta,A_i)\,[P_\theta(B_i) - P_\theta(C_{n,i})]$

where

    $B_i = \{x \mid \rho_i(G,x) < \min_{j \ne i} \rho_j(G,x),\ f_\theta(x) > 0\}$

    $C_{n,i} = \{x \mid \rho_i(H_n,x) < \min_{j \ne i} \rho_j(H_n,x),\ f_\theta(x) > 0\}$

for $i = 1,\ldots,r$. For any sets $B$ and $C$, $|P(B) - P(C)| \le P(B \cap C^c) + P(B^c \cap C)$. By Lemma 2,

    $\limsup_n C_{n,i} \subset \{x \mid \rho_i(G,x) \le \min_{j \ne i} \rho_j(G,x),\ f_\theta(x) > 0\}$

and since $\liminf_n C_{n,i} \supset B_i$,

    $\limsup_n C_{n,i}^c = (\liminf_n C_{n,i})^c \subset B_i^c.$

Thus,

    $\limsup_n P_\theta(B_i^c \cap C_{n,i}) \le P_\theta(B_i^c \cap \limsup_n C_{n,i}) \le P_\theta[\rho_i(G,X) = \min_{j \ne i} \rho_j(G,X)] = 0$

and

    $\limsup_n P_\theta(B_i \cap C_{n,i}^c) \le P_\theta(B_i \cap \limsup_n C_{n,i}^c) \le P_\theta(\emptyset) = 0$

and the right hand side of (16) converges to 0 as $n \to \infty$. Since $\theta$ and $G$ such that $G(\theta) > 0$ are arbitrary we see that (C') holds for all $\theta$.

Corollary 1. If $(H_0)$ obtains for all $\theta \in \Theta$ then (A') obtains.

Proof. The proof is carried by Theorem 4 together with Theorem 2.

4. A Likelihood Ratio Characterization of Condition (I).

As we have seen in the preceding section, (I) for all $\theta$ is equivalent to $(H_0)$ for all $\theta$ and it provides an alternative sufficient condition for (A') to obtain. In statistical decision problems condition (I) provides a means for verification of $(H_0)$. In this section we discuss three types of $M \times N$ decision problems and in two we develop a simple characterization of (I) in terms of the continuity of the distributions of the likelihood ratios $f_\alpha(X)/f_\theta(X)$. Consider

$(I_0)$    $P_\theta[f_\alpha(X)/f_\theta(X) = c] = 0$ for all $c \in (0,\infty)$ and for all $\alpha \in \Theta$, $\alpha \ne \theta$.

The three component problems we consider are

(i) $M = 2$, $N$ arbitrary

(ii) $M \times M$ classification problem, 0-1 loss

(iii) $M$ arbitrary, $N = 2$.

In (i) and (ii) we show that (I) holding for all $\theta$ is equivalent to $(I_0)$ holding for all $\theta$ and that $(I_0)$ holding for all $\theta$ is a necessary condition for (A') and hence for (A). In (iii) we give a condition involving continuity of distributions of certain likelihood ratios which turns out to be necessary for (I) to hold for certain $\theta$ and also necessary for (A').

Gilliland and Hannan (1969, Theorem 2) have shown that the existence of a determination of $\varphi_{(\cdot)}$ satisfying (H') is equivalent to $(I_0)$ holding for all $\theta$ in a $2 \times 2$ (testing) decision problem. Our results concerning problem (i) subsume the Gilliland and Hannan $2 \times 2$ results.
Problem (i). Let $M = 2$ and $N$ be arbitrary (finite).
Here $\Theta = \{1,2\}$ and $\mathcal{A} = \{1,2,\ldots,N\}$ where the acts are
labelled so that

\[
\begin{aligned}
L(1,1) &< L(1,2) < \cdots < L(1,N) \\
L(2,1) &> L(2,2) > \cdots > L(2,N) .
\end{aligned} \tag{17}
\]

This is possible since the columns of $L$ are mutually nondominated
(cf. (4)). When $M = 2$, the condition (I0) holds for a $\theta \in \Theta$ if
and only if it holds for the other parameter in $\Theta$ since if $c > 0$,
$P_\alpha[f_\theta(X)/f_\alpha(X) = c^{-1}] = c\,P_\theta[f_\alpha(X)/f_\theta(X) = c]$. Definition 1 yields
\[
\lambda_a(G,x) = L(1,a)G(1)f_1(x) + L(2,a)G(2)f_2(x)
\]
and for each $\theta$, the equivalence relation induced on acts is equality;
in fact, for $\theta = 1$, $A_j = \{j\}$ and for $\theta = 2$, $A_j = \{N-j\}$. We have

\[
\rho_j(G,x) = \lambda_j(G,x) \ \text{for } \theta = 1, \qquad
\rho_j(G,x) = \lambda_{N-j}(G,x) \ \text{for } \theta = 2 .
\]

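Condition (I) states

\[
P_\theta[\rho_i(G,X) = \min_{j \ne i} \rho_j(G,X)] = 0 \quad \text{for all } i = 1,2,\ldots,N \text{ and } G \text{ such that } G(\theta) > 0
\]

where it is understood that the $\rho_i$ within the square brackets
correspond to $\theta$ in the operator $P_\theta$. For both $\theta = 1,2$, (I) is
seen to be equivalent to

\[
P_\theta[\lambda_i(G,X) = \min_{j \ne i} \lambda_j(G,X)] = 0, \quad \text{for all } i = 1,\ldots,N \text{ and } G \text{ such that } G(\theta) > 0 \tag{18}
\]

that is,
\[
P_\theta\big[L(1,i)G(1)f_1(X) + L(2,i)G(2)f_2(X) = \min_{j \ne i}\{L(1,j)G(1)f_1(X) + L(2,j)G(2)f_2(X)\}\big] = 0, \tag{19}
\]
for all $i = 1,\ldots,N$ and $G$ such that $G(\theta) > 0$.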

Theorem 5. (I) holds for $\theta = 1,2$ if and only if (I0)
holds for $\theta = 1,2$.

Proof. (If) Suppose (I0) holds for $\theta = 1,2$. Let $G \in \mathcal{G}$
be such that $G(1) > 0$. If $G(1) = 1$ then (I) holds trivially for
$\theta = 1$. Thus, suppose $G(1) < 1$. For every pair of acts $a, a'$,
$a \ne a'$,
\[
c = \frac{G(2)[L(2,a) - L(2,a')]}{G(1)[L(1,a') - L(1,a)]} \in (0,\infty)
\]
by (17), and $\lambda_a(G,x) = \lambda_{a'}(G,x)$ if and only if $f_1(x) = c\,f_2(x)$.
By (I0) for $\theta = 1$, we have
\[
P_1[\lambda_a(G,X) = \lambda_{a'}(G,X)] = 0 .
\]
Hence, (18) holds for $\theta = 1$. The treatment of $\theta = 2$ is similar,
and we see that (I) holds for $\theta = 1,2$.
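
The threshold $c$ used in the (If) part can be checked numerically. The following is a minimal Python sketch and not part of the thesis: the loss matrix `LOSS` and the prior `G` are hypothetical values chosen only so that (17) holds, and `lam` and `tie_ratio` are illustrative helper names. It confirms on one example that two acts tie exactly when $f_1(x)/f_2(x) = c$ and that $c$ is positive.

```python
from fractions import Fraction as F

# Hypothetical 2 x 3 loss matrix satisfying (17):
# row theta = 1 is increasing in the act, row theta = 2 is decreasing.
LOSS = {1: [F(0), F(1), F(3)], 2: [F(4), F(2), F(0)]}


def lam(a, G, f1, f2):
    """lambda_a(G, x) = L(1,a) G(1) f1(x) + L(2,a) G(2) f2(x); acts are 1-based."""
    return LOSS[1][a - 1] * G[1] * f1 + LOSS[2][a - 1] * G[2] * f2


def tie_ratio(a, b, G):
    """Value c of f1(x)/f2(x) at which acts a and b have equal lambda."""
    num = G[2] * (LOSS[2][a - 1] - LOSS[2][b - 1])
    den = G[1] * (LOSS[1][b - 1] - LOSS[1][a - 1])
    return num / den


G = {1: F(1, 3), 2: F(2, 3)}  # assumed prior with 0 < G(1) < 1

for a in (1, 2, 3):
    for b in (1, 2, 3):
        if a != b:
            c = tie_ratio(a, b, G)
            assert c > 0                                   # c is in (0, infinity) by (17)
            assert lam(a, G, c, 1) == lam(b, G, c, 1)      # tie at f1/f2 = c
            assert lam(a, G, 2 * c, 1) != lam(b, G, 2 * c, 1)  # no tie elsewhere
```

Exact rational arithmetic is used so that the equality tests are not affected by rounding.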

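(Only if) Suppose (I) holds for $\theta = 1,2$. If (I0) fails
to hold for both $\theta = 1,2$ then there exists a $c \in (0,\infty)$ such that

\[
P_1\!\left[\frac{f_1(X)}{f_2(X)} = c\right] > 0 . \tag{20}
\]

Define the function $L^* : \mathcal{A} \times \mathcal{A} \to (0,\infty)$ by

\[
L^*(i,j) = \frac{L(2,i) - L(2,j)}{L(1,j) - L(1,i)}, \quad \text{for all } 1 \le i, j \le N,\ i \ne j . \tag{21}
\]

Let $a_0 < N$ be any act for which

\[
L^*(a_0,N) = \min_{i<N} L^*(i,N) .
\]

Then for any $c$ satisfying (20), there exists a (unique) $G \in \mathcal{G}$,
$0 < G(2) < 1$ for which
\[
c = \frac{G(2)}{G(1)}\,L^*(a_0,N) .
\]
Then

\[
0 < P_\theta\!\left[\frac{f_1(X)}{f_2(X)} = c\right] = P_\theta\!\left[\frac{f_1(X)}{f_2(X)} = \frac{G(2)}{G(1)}\,L^*(a_0,N)\right]
= P_\theta[\lambda_N(G,X) = \lambda_{a_0}(G,X) = \min_i \lambda_i(G,X)]
\]

and condition (I) does not hold at $\theta$.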

Theorem 6. If (A-) holds, (I) and therefore (I0) holds
for $\theta = 1,2$.

Proof. Suppose (I) does not hold; that is, there exists
$G$, necessarily with $0 < G(2) < 1$ such that
\[
P_\theta[\lambda_{a_1}(G,X) = \cdots = \lambda_{a_k}(G,X) < \min_{a \ne a_1,\ldots,a_k} \lambda_a(G,X)] > 0 \tag{22}
\]
for $\theta = 1,2$ and some set $\mathcal{A}_k = \{a_1,\ldots,a_k\}$, $k > 1$. Wlog,
$1 = a_1 < a_2 < \cdots < a_k$. If not, since for all $x$ in the set
\[
\frac{f_1(x)}{f_2(x)} = \frac{G(2)}{G(1)}\,L^*(a_i,a_j), \quad \text{for all } 1 \le i < j \le k,
\]
there exists a (unique) $G'$ such that $0 < G'(2) < 1$ and $\frac{G'(2)}{G'(1)}L^*(1,a_2) = \frac{G(2)}{G(1)}L^*(a_1,a_2)$ and we use $G'$ in place of $G$. We use (22) to
construct a parameter sequence $\underline{\theta} = (\theta_1,\theta_2,\ldots)$, for which
\[
\liminf_n D_n^* > 0 .
\]

We shall consider two distinct cases: i) $L^*(i,j)$, as
defined in (21), is constant for all $1 \le i < j \le N$; ii) $L^*(i,j)$
is not constant for all $i,j$. We shall consider the $2 \times 2$
problem as a special subcase of case i).

Case i. $L^*(i,j)$ constant for all $i,j$.

The set in (22) must be the set

\[
\{\lambda_1(G,x) = \cdots = \lambda_N(G,x)\} = A_{N+1}(G) .
\]

Define $A_j(G) = \{\lambda_j(G,x) < \min_{k \ne j} \lambda_k(G,x)\}$, $j = 1,\ldots,N$. Set
$\xi = P_1\{A_{N+1}(G)\}$ and $\gamma = P_1\{(1 - \varphi_G(1 \mid X))A_{N+1}(G)\}$. Necessarily,
$0 \le \gamma \le \xi$.

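With $G(2) = \eta$, we construct a parameter sequence $\underline{\theta}$ as
follows: $\theta_1$ arbitrary,

\[
\theta_i = \begin{cases}
2 & G_{i-1}(2) < \eta \\
2 & G_{i-1}(2) = \eta,\ \gamma = 0 \\
1 & G_{i-1}(2) = \eta,\ \gamma > 0 \\
1 & G_{i-1}(2) > \eta .
\end{cases} \tag{23}
\]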

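For $\theta = 1,2$ define

\[
\sigma_\theta^0 = \sum_{j=1}^{N} L(\theta,j)P_\theta\{A_j(G)\}
\]
\[
\sigma_\theta^- = \sigma_\theta^0 + L(\theta,1)P_\theta\{A_{N+1}(G)\}
\]
\[
\sigma_\theta^+ = \sigma_\theta^0 + L(\theta,N)P_\theta\{A_{N+1}(G)\} .
\]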

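Then for all $i$ such that

\[
\begin{cases}
\text{(i)}\ \ 0 < G_{i-1}(2) < \eta, & R(2,\varphi_{G_{i-1}}) \ge \sigma_2^- \\[2pt]
\text{(ii)}\ \ \eta < G_{i-1}(2) < 1, & R(1,\varphi_{G_{i-1}}) \ge \sigma_1^+ \\[2pt]
\text{(iii)}\ \ G_{i-1}(2) = \eta, & R(2,\varphi_{G_{i-1}}) \ge \sigma_2^+ + \frac{1-\eta}{\eta}[L(1,N) - L(1,1)]\xi, \ \text{if } \gamma = 0 \\[2pt]
 & R(1,\varphi_{G_{i-1}}) \ge \sigma_1^- + [L(1,N) - L(1,1)]\gamma, \ \text{if } \gamma > 0 .
\end{cases} \tag{24}
\]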

Consider the decomposition

\[
\sum_{i=1}^{n} R(\theta_i,\varphi_{G_{i-1}}) = \sum_{\theta_i = 2} R(2,\varphi_{G_{i-1}}) + \sum_{\theta_i = 1} R(1,\varphi_{G_{i-1}}) . \tag{25}
\]

By the construction (23) there are $nG_n(2)$ terms in the
first sum on the right hand side of (25) and $|G_n(2) - \eta| < n^{-1}$.
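
The tracking property of construction (23), namely that the empirical proportion of 2's stays within $n^{-1}$ of $\eta$, can be illustrated by simulation. The sketch below is an illustration only and not the thesis's argument: `build_sequence` and `max_scaled_gap` are hypothetical helper names, and the argument `tie_value` stands in for the choice that (23) makes according to whether $\gamma = 0$ or $\gamma > 0$.

```python
from fractions import Fraction as F


def build_sequence(eta, n, tie_value=2, first=1):
    """Build theta_1..theta_n by rule (23): play 2 when the running proportion
    of 2's is below eta, play 1 when it is above, and play `tie_value` when it
    equals eta. `first` is the arbitrary theta_1."""
    thetas = [first]
    for _ in range(2, n + 1):
        prop2 = F(thetas.count(2), len(thetas))  # G_{i-1}(2)
        if prop2 < eta:
            thetas.append(2)
        elif prop2 > eta:
            thetas.append(1)
        else:
            thetas.append(tie_value)
    return thetas


def max_scaled_gap(eta, n, **kw):
    """Largest value of m * |G_m(2) - eta| over m = 1..n."""
    count2, worst = 0, F(0)
    for m, th in enumerate(build_sequence(eta, n, **kw), start=1):
        count2 += (th == 2)
        worst = max(worst, abs(count2 - m * eta))
    return worst


for eta in (F(1, 2), F(1, 3), F(3, 7)):
    for tie in (1, 2):
        for first in (1, 2):
            # |G_m(2) - eta| < 1/m for every m up to 500
            assert max_scaled_gap(eta, 500, tie_value=tie, first=first) < 1
```

The check passes for either tie-breaking choice and either starting value, which matches the fact that each step moves $nG_n(2) - n\eta$ by $1-\eta$ or $-\eta$ toward zero.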

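Then (24) together with (25) yields
\[
\sum_{i=1}^{n} R(\theta_i,\varphi_{G_{i-1}}) \ge [n\eta - 2]\sigma_2^- + [n(1-\eta) - 2][\sigma_1^- + \delta_1] \quad \text{if } \gamma > 0 \tag{26}
\]
and
\[
\sum_{i=1}^{n} R(\theta_i,\varphi_{G_{i-1}}) \ge [n\eta - 2]\Big[\sigma_2^+ + \frac{1-\eta}{\eta}\delta_2\Big] + [n(1-\eta) - 2]\sigma_1^+ \quad \text{if } \gamma = 0 \tag{26'}
\]
where
\[
\delta_1 = [L(1,N) - L(1,1)]\gamma \ \text{if } \gamma > 0, \qquad
\delta_2 = [L(1,N) - L(1,1)]\xi \ \text{if } \gamma = 0 .
\]

Since $R(G) = \eta\sigma_2^- + (1-\eta)\sigma_1^- = \eta\sigma_2^+ + (1-\eta)\sigma_1^+$, (26) and (26') yield,
respectively,

\[
\sum_{i=1}^{n} R(\theta_i,\varphi_{G_{i-1}}) \ge nR(G) - 2[\sigma_2^- + \sigma_1^- + \delta_1] + n(1-\eta)\delta_1 \quad \text{if } \gamma > 0 \tag{27}
\]

and

\[
\sum_{i=1}^{n} R(\theta_i,\varphi_{G_{i-1}}) \ge nR(G) - 2\Big[\sigma_2^+ + \sigma_1^+ + \frac{1-\eta}{\eta}\delta_2\Big] + n(1-\eta)\delta_2 \quad \text{if } \gamma = 0 . \tag{27'}
\]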

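By construction $G_n \to G$ so that $R(G_n) \to R(G)$ and
\[
\liminf_n D_n^* \ge (1-\eta)\delta > 0 \quad \text{where} \quad \delta = \begin{cases} \delta_1 & \text{if } \gamma > 0 \\ \delta_2 & \text{if } \gamma = 0 \end{cases}
\]
at the parameter sequence $\underline{\theta}$ defined by (23). This contradicts
the assumption that (A-) is true, and therefore (I) must hold at
$\theta = 1,2$ and the proof is complete for case i.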

Case ii. $L^*(i,j)$ is not constant for every pair $(i,j)$.

It is necessary to make a few preliminary remarks concerning

Remark 1. For every triple $(i,j,k) \in \mathcal{A} \times \mathcal{A} \times \mathcal{A}$, $i < j < k$,
one and only one of the following conditions is satisfied:

A: $L^*(i,j) = L^*(i,k) = L^*(j,k)$

B: $L^*(i,j) > L^*(i,k) > L^*(j,k)$

C: $L^*(i,j) < L^*(i,k) < L^*(j,k)$
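
Remark 1 reflects the fact that, under (17), $L^*(i,k)$ is the mediant of $L^*(i,j)$ and $L^*(j,k)$: the numerators and the positive denominators in (21) add. A small Python sketch of the classification follows. It is an illustration under assumed inputs, not part of the thesis: `l_star` and `classify` are hypothetical names and the three loss matrices are made-up examples satisfying (17).

```python
from fractions import Fraction as F


def l_star(loss, i, j):
    """L*(i,j) = [L(2,i) - L(2,j)] / [L(1,j) - L(1,i)] as in (21).
    `loss` is [row for theta=1, row for theta=2]; acts are 1-based."""
    return (loss[1][i - 1] - loss[1][j - 1]) / (loss[0][j - 1] - loss[0][i - 1])


def classify(loss, i, j, k):
    """Return 'A', 'B' or 'C' for a triple i < j < k, following Remark 1."""
    x, y, z = l_star(loss, i, j), l_star(loss, i, k), l_star(loss, j, k)
    if x == y == z:
        return "A"
    if x > y > z:
        return "B"
    if x < y < z:
        return "C"
    raise AssertionError("no case applies; the loss matrix violates (17)")


# Hypothetical loss matrices, each with row 1 increasing and row 2 decreasing.
linear = [[F(0), F(1), F(2)], [F(2), F(1), F(0)]]     # acts collinear
useful = [[F(0), F(1), F(3)], [F(4), F(2), F(0)]]     # middle act below the chord
useless = [[F(0), F(1), F(2)], [F(4), F(3), F(0)]]    # middle act above the chord

print(classify(linear, 1, 2, 3), classify(useful, 1, 2, 3), classify(useless, 1, 2, 3))
```

In case C the middle act is never a strict minimizer, which is the content of Remark 2.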

Remark 2. For every triple $(i,j,k)$, $i < j < k$, satisfying
(C), $\lambda_j(G,x) > \min_{k \ne j} \lambda_k(G,x)$ for all $G$ such that $0 < G(2) < 1$ and for all
$x$ such that $f_1(x) > 0$ or $f_2(x) > 0$.

Set $\mathcal{A}' = \{a \in \mathcal{A} \mid (i,a,k)$ satisfies (A) or (B), for all
$i < a < k\}$. Since $\mathcal{A}_k \subset \mathcal{A}'$, by virtue of Remark 2, we may restrict
attention to $\mathcal{A}'$; that is, we may suppose that $\mathcal{A} = \mathcal{A}'$ and that
therefore, every triple $(i,j,k)$ satisfies (A) or (B).

We define a sequence of actions {aj.‘ i = 0’°°°’kO+l}

as follows:

 

 

r a = 1
jO
if (1,2,3) satisfies B
aj = max{j‘ 2 < j S N and (1,2,j) satisfies A}
1
if (1,2,3) satisfies A
f(ji-l’Ji-I+1’ ji-I+2) satisfies B
(28)}81 = mx;j\jb1<jSN and (j 1_1,ji._ 1+1 j,) satisfies A
i
. o o A
(Ji-l’ji-I+1’ji-l+2) satisfies
=-jN -1 if (N-2,N-1,N) satisfies B
a Lmin{j < N \ (j,N-1,N) satisfies A} if
jk
0
(N-2,N-1,N) satisfies A
La N
1 +1
kO
For k = l,...,ko+l, define the following sets:
(G) = {i (G,x) =...- i (6.x) < min i (G,x)}
AN+k jk-l 3k «Jk_1.t>1k "


where KL(G,X) = la (G,X), L = j09"°9j'k +1"
6 0
We shall continue to use the sets Aj (G) as defined pre-
viously, noting that there, )‘j (G,x) means xa(G,x) with a = j E 4.

Thus, for any prior G with 0 < G(2) < l,

N
R(e,sG) = j§1L(e.j>Pe{Aj<G>}

ko+1 jk
+- 2 z L(e,a )P {o (a \x) (6)} .
k=1 j=jk-1 j 9 G j Ah+k

» +
The sets Aj(G), 1 S] SN and AN+k(G), l S k S k0 l,

are pairwise disjoint; moreover, for all 1 S j S N,

 

(X)
*. 1 G(2)
AJ.(G) ._. {G(1_Q))_L (3,19 < £200 (6(1) L* (i ,3), for :11 i< j < k}

and for all 1 S k S k0+l

f(X>
= Q2). * =9—(—>-L HELL
AN+k(G) {6(1) L (ai’it') < f2(X> c(1) M‘ ’1) < 6(1) ﬂ( ae’ 1k 1)

for all (,<jk_lsi<ijk<L'}

We may, wlog, suppose that the set 2k given in (22) is in

fact {l,2,...,aj }; the set in (6) is then AN+1(G)

1
Set

£1 = P1{AN+1(G)}’ Y1 = P1“1 ch(1\X)) AN+1(G)}

go = P1{AN+RO+1(G)}’ Y0 = P1“1 ' ch(N}X)AN+1(G)},

where G satisfies (22). Necessarily O S Y1 S (:1, O S Y0 S g0.

g0 may be zero.


Let G(2) = n. We construct a parameter sequence Q, as

follows: 91 arbitrary

’ 2 614(2) < n

(29) e =< 2 Gi_1(2)=n, y1>0,y >0
1 ci_1(2) =11 , “>0, y0=0

L1 Gi-l(2) > 'n

 

For 9 = 1,2 define a; as before;

ko+l
0
0' + 2‘. L(Gﬁ )P (G)
9 k=1 jk-l BM‘H‘ }

0
ll

k'+
0 1

° + P .
9 09 kgl L(6,8jk) e{AN+k(G)}

0
ll

Then, for all i such that

 

’ 1) o < ci_1<2> < n, R(Z’CPGH) 2 a;
11) n < GHQ) < 1, R(1.cpci-1) 2 0:
iii) ci_1(2) = n,
(30) a Masai-1) 2 01+ 11311171413) - L(1,1)]g1, if “'1 = o
R(2,ngi-1) 2 a: + 1%? I'L(1,N) - L(1,N-l)]y0,
if yl > 0, Yo > o
L R(1,chi-1) 2 a; + [L(1,2) - L(l,l)]Y1,

if Y1 > 0, Y0 = O .


As in case i), we have the decomposition (25) with the first sum

on the right hand side of (25) containing nGn(2) terms and

\Gn(2) - ’n‘ < n-l. Then (30) together with (25) yields
(31) 2‘; R(Giwpc. 1) 2 [rm - 210; + [nu-m - 21w; + 61]
1-
if Y1 > 0, Y0 = O

with 31 = [L(l,2) - L(l,l)]y1 and
(31') 2: 11%,.pr ) 2 En“ - 210: + £3.71} 82] +0104» " Ho:
i-l

if y1=0 or y1>0,y0>0

with
[L(l,2) - L(l,l)]g1 if y1 = 0

[L(1,N) - L(l,N-l)]yo if Y1 > 0, yo > o .

As before, R(G) = “0;: + (l-mo; = Tb: + (1490:, and (31)
and (31') yield, reapectively,

(32.) 2‘11 R(ei,chi-1) 2 nR(G) - 2K0; + °1-+ 51] + n(1-'n)e1

if Y1 >'O, yo = O

and

)>_ nR(G) - 2[o:+ o++ 1:? 32]

, n
(32) 21 R(919(PG1- 1

1

if Y1=0 or y1>0,‘YO>0.

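By construction $G_n \to G$ so that $R(G_n) \to R(G)$ and
\[
\liminf_n D_n^* \ge (1-\eta)\delta > 0 \quad \text{where} \quad \delta = \begin{cases} \delta_1 & \text{if } \gamma_1 > 0,\ \gamma_0 = 0 \\ \delta_2 & \text{if } \gamma_1 = 0 \text{ or } \gamma_1 > 0,\ \gamma_0 > 0 \end{cases}
\]
at the parameter sequence $\underline{\theta}$ defined by (29). This contradicts
the hypothesis that (A-) is true and therefore (I) must hold at
$\theta = 1,2$ and the proof is complete.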

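Problem (ii). $M \times M$ classification problem, 0-1 loss.

Let $\Theta = \mathcal{A} = \{1,\ldots,M\}$, with loss function given by
$L(\theta,a) = 1$ if $\theta \ne a$ and $L(\theta,a) = 0$ if $\theta = a$. Condition (I)
at $\theta$ becomes
\[
0 = P_\theta[G(\theta)f_\theta(X) = \max_{a \ne \theta} G(a)f_a(X)] \quad \text{for all } G \text{ s.t. } G(\theta) > 0 .
\]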
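
Under 0-1 loss, minimizing $\lambda_a(G,x) = \sum_{\omega \ne a} G(\omega)f_\omega(x)$ is the same as maximizing $G(a)f_a(x)$, which is why condition (I) takes the form above. The following Python sketch shows this at a single observation. It is illustrative only: `bayes_acts` is a hypothetical helper, and the prior and density values are made-up numbers.

```python
from fractions import Fraction as F


def bayes_acts(G, f):
    """Set of acts minimizing lambda_a = sum over w != a of G(w) f_w(x)
    under 0-1 loss. G maps each state to its prior mass and f maps each
    state to the density value f_w(x) at the observed x."""
    total = sum(G[w] * f[w] for w in G)
    lam = {a: total - G[a] * f[a] for a in G}
    best = min(lam.values())
    return {a for a in G if lam[a] == best}


G = {1: F(1, 2), 2: F(1, 4), 3: F(1, 4)}       # assumed prior
f_tie = {1: F(1), 2: F(2), 3: F(1)}            # G(1) f_1 = G(2) f_2 = 1/2
f_clear = {1: F(1), 2: F(1), 3: F(1)}          # G(1) f_1 = 1/2 is the strict maximum

print(bayes_acts(G, f_tie))    # acts 1 and 2 tie
print(bayes_acts(G, f_clear))  # act 1 alone
```

Condition (I) at $\theta$ asks that observations of the first kind, where $\theta$ ties with another act for the maximum, have $P_\theta$-probability zero.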
Theorem 7. (I) holds for all $\theta$ iff (I0) holds for all $\theta$.

Proof. Suppose (I0) holds at $\theta$. Then for any $\alpha \ne \theta$ and
any $G$ such that $0 < G(\theta), G(\alpha) < 1$, set $c = G(\theta)/G(\alpha)$. Then
$c \in (0,\infty)$ and $P_\theta[f_\alpha(X)/f_\theta(X) = c] = 0$ entails $P_\theta[G(\theta)f_\theta(X) =
G(\alpha)f_\alpha(X)] = 0$. Moreover, for any $G$ such that $G(\theta) = 1$, or for
any $x$ such that $f_\theta(x) > 0$, $f_\alpha(x) = 0$, $G(\theta)f_\theta(x) > G(\alpha)f_\alpha(x)$.
Therefore, since $\alpha$ was arbitrary, (I) holds at $\theta$.

Conversely, fix $\theta$ and suppose there exists $c \in (0,\infty)$ for
which $P_\theta[f_\alpha(X)/f_\theta(X) = c] > 0$ for some $\alpha \ne \theta$. Then there exists
a (unique) $G$ such that $0 < G(\theta) < 1$, $G(\alpha) = 1 - G(\theta)$, and
$c = G(\theta)/G(\alpha)$. Moreover,

\[
0 < P_\theta[f_\alpha(X)/f_\theta(X) = c] = P_\theta[G(\theta)f_\theta(X) = G(\alpha)f_\alpha(X) = \max_\omega G(\omega)f_\omega(X)]
\]

and (I) does not hold at $\theta$.

Since $\theta$ was arbitrary, the proof is complete.

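Theorem 8. If (A-) holds, (I) and therefore (I0) holds
for every $\theta$.

Proof. Suppose (I0) does not hold at $\theta$; that is, there
exists $\alpha \ne \theta$ and $G$ such that $G(\theta) > 0$ and
\[
P_\theta[G(\theta)f_\theta(X) = G(\alpha)f_\alpha(X)] > 0 . \tag{33}
\]

Wlog, we may assume $G(\theta) + G(\alpha) = 1$, for if not, we may replace
$G$ by $G'$ where $G'(\omega) = G(\omega)/[G(\theta) + G(\alpha)]$ for $\omega = \alpha, \theta$. Then

\[
\begin{aligned}
\lambda_\theta(G,x) &= G(\alpha)f_\alpha(x) \\
\lambda_\alpha(G,x) &= G(\theta)f_\theta(x) \\
\lambda_\omega(G,x) &= G(\theta)f_\theta(x) + G(\alpha)f_\alpha(x) \quad \text{for every } \omega \ne \theta, \alpha
\end{aligned}
\]

and (33) becomes

\[
P_\theta[\lambda_\theta(G,X) = \lambda_\alpha(G,X)] > 0 . \tag{33'}
\]
Set

\[
\xi = P_\alpha[\lambda_\theta(G,X) = \lambda_\alpha(G,X)]
\]
\[
\gamma = P_\alpha\{\varphi_G(\theta \mid X)[\lambda_\theta(G,X) = \lambda_\alpha(G,X)]\} .
\]

Necessarily $0 \le \gamma \le \xi$.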

Let

do, = Pa[G(“)fa(X) < G(e)fe(x)]
o = Pe[G(e)fe(x) s G(a)fa(x)]


Pa[G(a) fa(X) S G (9) f9(X)]

69-} Q04-

PB[G(8)fe(X) < G(rr)fa(X)] .

Let G(9) = n. We construct the parameter sequence E as follows:

91 arbitrary,

9 if 614(9) < 11
e 'f G. (e) = . , = 0
(34) e. = 1 1‘1 n Y
1 a if ci_1(e) = n , y > o

For any (1' 63 with G'((y) = 1 - c'(e), o < G'(e) < 1,
R(e,cpc.) = PGEG'(a)fa(x) < G'(e)fe(x)]
+ Pe{¢G:(a‘X)[G'(a)fa(x) = G'(e)fe(-x)]}.

with a similar expression for R(a,qb,).

Therefore, by the construction (34), if

Bi = a, then Gi-1(9) S n and

R(93CPG ) 2 Ce
1.

1
if
y = 0, 9i = a, then Gi-1(e) > n and
R(cr,ch )2 0+ = 0- + C
1_1 a at
if

y > 0, 91 = a, then Gi-l(e) 2 n and

R(Q’ﬂPGi 1) 2 R(a.ch) = a; + v .

As in the proof of Theorem 6, we have the decomposition given by
(25) with $\theta$ replacing 2 and $\alpha$ replacing 1 in the right hand
side of (25), with $nG_n(\theta)$ terms appearing in the first sum on
the right hand side and $|G_n(\theta) - \eta| < n^{-1}$. We therefore have
2’; R(91.<pc > 2 23R(91»'ve, ) 2
i-l 1-1

[rm - 210; + [nu-n) - 21w; + a)

where

Since R(G) = n a; + (l-n)o; and Gn 4 G by construction (34),
1' D* l '
1m n 2 ( -ﬂ) 8 > 0

at the $\underline{\theta}$ given by (34), which violates the hypothesis that (A-)
is true.

Problem (iii). $M$ arbitrary, $N = 2$.

In the case $N = 2$, condition (I) becomes

\[
P_\theta[\lambda_1(G,X) = \lambda_2(G,X)] = 0 \quad \text{for all } G \text{ s.t. } G(\theta) > 0 .
\]

We shall assume that for all $\theta \in \Theta$, either $L(\theta,1) < L(\theta,2)$ or
$L(\theta,1) > L(\theta,2)$. Set $\Theta_1 = \{\theta \in \Theta;\ L(\theta,1) < L(\theta,2)\}$ and
$\Theta_2 = \Theta - \Theta_1 = \{\theta \in \Theta;\ L(\theta,1) > L(\theta,2)\}$. The following result is
not quite as strong as that given in Theorems 5 or 7 for $M \ge 3$.

Theorem 9. If (1) holds for every 9, then the following

condition holds for every 9:


(16) Pe[fa(X)/fe(X) = c] = o for all c 6 (0,00), x ~ P9,

and for all a s.t.

[L(a,1) ' L(a,2)][L(6.1) ' L(9,2)] < 0 .

Proof: Suppose (1) holds at 9 E @1. Then for every
0 E @2 and every c E (O,m), there exists a (unique) G E.&,

0 < G(G) < l, G(a) = l - G(9) such that

= 9m L(QLZ) - L(axl)

‘35) G(o!) L<e,1> - L<e,2>

Then P9[fa(X)/fe(X) = c] 8 Petx1(G,X) = x2(G,X)]'= 0 by (1).. Since
a and c were arbitrary, (16) holds at e. A similar proof yields
the same result if 6 6 @2.
Theorem 10. If (A-) holds, then (16) holds for every 9.
Proof: Suppose there exists 9 E @1 for which (16) does

not hold; that is, there exists a 6 @2 and c G (0,m) for which

f (X)
Péﬁﬁ=c]>0.
9
Fix 9,q,c and consider G satisfying (35), O < G(9) < 1,
G(a) = l - G(9)-

Define g, y as in the proof of Theorem 8. We may then
wlog suppose that M = 2 with ® = {a,e}, and proceed as in the
proof of Theorem 6, the case L* constant, by identifying a
with l, 9 with 2, and v with the y defined there.

We conclude this section with some examples to show that
(I) holding for all $\theta$ and (I0) holding for all $\theta$ need not be
related.

Example 1. (I) holds for every $\theta$ but (I0) does not,
although (16) does.

Let $\Theta = \{1,2,3\}$, $\mathcal{A} = \{1,2\}$ with $\Theta_1 = \{1\}$, $\Theta_2 = \{2,3\}$.
Let $P_1$, $P_2$, $P_3$ be uniform on $(0,1]$, $(1,2]$ and $(1,3]$,
respectively, and let $\mu$ be Lebesgue measure. Since $P_1$ and
$P_\alpha$, $\alpha \in \Theta_2$ have disjoint supports (I) and (16) are trivially
satisfied for each $\theta$. But

\[
P_2\!\left[\frac{f_3(X)}{f_2(X)} = \frac{1}{2}\right] = 1 \quad \text{and} \quad P_3\!\left[\frac{f_2(X)}{f_3(X)} = 2\right] = \frac{1}{2}
\]

and (I0) is violated at $\theta = 2$ and $\theta = 3$.
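
The two probabilities in Example 1 can be verified directly because the three uniform densities are constant on each of the cells $(0,1]$, $(1,2]$, $(2,3]$. The Python sketch below does this with exact arithmetic. It is illustrative only: `CELLS`, `DENSITY` and `ratio_atom` are names introduced here, not notation from the thesis.

```python
from fractions import Fraction as F

CELLS = [(0, 1), (1, 2), (2, 3)]           # the intervals (0,1], (1,2], (2,3]
DENSITY = {                                 # density value of P_theta on each cell
    1: [F(1), F(0), F(0)],                  # P1 uniform on (0,1]
    2: [F(0), F(1), F(0)],                  # P2 uniform on (1,2]
    3: [F(0), F(1, 2), F(1, 2)],            # P3 uniform on (1,3]
}


def ratio_atom(theta, alpha, c):
    """P_theta[ f_alpha(X) / f_theta(X) = c ] for the densities above."""
    mass = F(0)
    for (lo, hi), f_t, f_a in zip(CELLS, DENSITY[theta], DENSITY[alpha]):
        if f_t > 0 and f_a / f_t == c:
            mass += f_t * (hi - lo)         # P_theta-mass of the cell
    return mass


print(ratio_atom(2, 3, F(1, 2)))   # mass of {f3/f2 = 1/2} under P2
print(ratio_atom(3, 2, F(2)))      # mass of {f2/f3 = 2} under P3
```

Both values are positive, so (I0) fails at $\theta = 2$ and $\theta = 3$, while every ratio involving $P_1$ and a positive $c$ carries no mass because the supports are disjoint.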

Example 2. (I) and (I0) both hold for every $\theta$. Take
$\Theta$, $\mathcal{A}$, $\Theta_1$ and $\Theta_2$ as in Example 1, with $P_1$, $P_2$, $\mu$ also the same
as in Example 1, but replace $P_3$ with $P_3'$, uniform on $(2,3]$.
Then (I) and (I0) are trivially satisfied for every $\theta$.

Example 3. (IO) holds for every 9 but (I) does not.
Again take 6), d, @1 and @2 as in example 1.

Let P

P2, P3 be triangular distributions on (0,6),

1,
(1,5) and (2,4) reSpectively, with p = Lebesgue measure. Let

the loss matrix be

0 9
L(9,a)) = 8 o
o 1

For every x 6 (2,4), each likelihood ratio fa(X)/fe(x) is
the ratio of two first order polynomials and therefore, given any

c E (G,x), there exists unique solutions xc, xé 6 (2,4) for which


f (x ) f (x')
QC C...

f(x)=°’£(x')"°
6 c a c

and hence Pe{[fa(X)/fe(x) = c, 2 < x < 4] = o for all c 6 (0,00),

and for all a,6 E @.

 

f (X)
For every x E (l,2] U [4,5), f3(x) a O and fl(;)’ being
f (x) 2
the ratio of two first order polynomials, fl(;) = c and
2
f2(X)
f1(x) = c have unique solutions for every c e (0,m), Therefore

Pe{[fa(X)/fe(X) = c, 1 < X S 2 or 4 S X«< 5] = 0 for all c E (0,m),
and for all a,9 E @.

For every x E (O,l] U [5,6), both f2(x) and f3(x) are
zero and P1{[fa(X)/f1(X) = c, O < X S l or 5 S X < 6] = O for all

c E (G,x), a = 2,3. Thus (I is satisfied for every 9.

0)
However, for the given loss matrix, (I) does not hold at

any 9. For example, consider the uniform p-measure

G = (£3 %’, '31-) and the set
A = {X‘X1(Gax) = k2(G,X). 2 < X S 3} °

For each x E A,

11mm = ‘5’- f,<x), 12mm) = gen) + % got)

and hence A = (2,3]. Thus P9(A) > O for every 9 E @. In fact,
for any loss matrix in which L(l,1) = L(2,2) = L(3,l) = 0,
L(2,l) = -;-L(l,2) = '81-L(3,2) condition (1) will be violated at

every 9.

The results presented in Section 3 and in this section
give a complete characterization of (A-) in terms of an easily
verified condition on likelihood ratios (I0) for certain component
decision problems.

Theorem 11. Suppose the component decision problem is either
of Type (i) or (ii). Then (A-) obtains if and only if (I0) holds
for all $\theta \in \Theta$.

Proof. For each of these decision problems, Corollary 1
shows that (H0) for all $\theta$ $\Rightarrow$ (A-) and Theorem 3 shows that
(H0) for all $\theta$ $\Leftrightarrow$ (I) for all $\theta$. But Theorems 5 and 7 show
that (I) for all $\theta$ $\Leftrightarrow$ (I0) for all $\theta$ and Theorems 6 and 8 show
that (A-) $\Rightarrow$ (I0) for all $\theta$ for the respective problems. These
results carry the proof of Theorem 11.

CHAPTER III

SMOOTHNESS OF THE BAYES ENVELOPE AND ITS RELATION TO (C)

1. Introduction.

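Throughout this chapter as in Chapter II, $\Theta = \{1,2,\ldots,M\}$
where $2 \le M < \infty$ so that $\mathcal{G}$ is the $(M-1)$-dimensional simplex

\[
\mathcal{G} = \{(G(1),\ldots,G(M)) \mid G(\theta) \ge 0,\ \theta = 1,\ldots,M,\ \textstyle\sum_\theta G(\theta) = 1\} .
\]

Given $G = (G(1),\ldots,G(M)) \in \mathcal{G}$ let $\underline{G} = (G(1),\ldots,G(M-1))$ and
define
\[
\underline{\mathcal{G}} = \{\underline{G} \mid G \in \mathcal{G}\} .
\]

Then $\underline{\mathcal{G}}$ is a closed subset of Euclidean $(M-1)$-space whose interior
is non-empty and given by
\[
\mathrm{int}(\underline{\mathcal{G}}) = \{\underline{G} \mid G(\theta) > 0,\ \theta = 1,\ldots,M\} .
\]

The mapping $G \to \underline{G}$ is a one-to-one correspondence between $\mathcal{G}$ and
$\underline{\mathcal{G}}$. Let
\[
Q = \{1,2,\ldots,M-1\}
\]
and for $\theta \in Q$, $\underline{G} \in \underline{\mathcal{G}}$ and $t$ such that $-G(\theta) \vee (G(M) - 1) \le t \le
G(M) \wedge (1 - G(\theta))$ note that

\[
\underline{G} + t\theta = G + t\theta - tM . \tag{1}
\]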

In this chapter we define the Bayes envelope $R$ as a function of
$\underline{G}$, that is,
\[
R(\underline{G}) = \inf_\varphi R(G,\varphi) . \tag{2}
\]

For later purposes it is useful to note the representation,
\[
R(\underline{G}) = \sum_{\theta=1}^{M-1} G(\theta)R(\theta - M,\varphi_G) + R(M,\varphi_G) . \tag{3}
\]
Use will also be made of the fact that $R$ is a continuous and concave
function when defined on the simplex $\mathcal{G}$ in Euclidean $M$-space
(for example, see Wijsman (1970, Theorem 2)) and that these properties
are inherited by $R$ as defined by (2) as a function of $\underline{G} \in \underline{\mathcal{G}}$.
Gilliland and Hannan (1969, Theorem 1) have proved the equi-
valence of the existence of partials §§{%% for H.E§ﬂ* (R(H) is
defined for H 6 71+ by R(H) = m R(H/|H|) if H + o and R(H) = o
if H = 0) with a variant of continuity condition (H). It is our
purpose to relate the smoothness of R(-) as a function of g
with the continuity of R(e,qb) in G. Samuel (1963, Lemma 1)
(1966, Lemma 1) has attempted to establish an implication of this
type but as we show the attempts fall short. In fact both Lemmas
as stated are false as shown in Section 3.
For notational convenience in this chapter if $f$ is a
real-valued function defined on $E^k$, $f_i$ will denote the $i$th
partial derivative, $i = 1,\ldots,k$.

2. Some Mathematical Preliminaries.

We will need the following results concerning concave (convex)
functions.

Theorem 1. Let $f$ be a real-valued concave (or convex)
function defined on an open convex subset $C$ of Euclidean $k$-space.
(i) If the partials $f_i(x)$ exist for all $x \in C$ and $i = 1,\ldots,k$
then $f$ is differentiable on $C$. (ii) If a partial $f_i(x)$ exists
for all $x \in C$ then $f_i$ is continuous on $C$, $i = 1,\ldots,k$.

Proof. (i) See Rockafellar (1970, Th. 25.2).

(ii) See Rockafellar (1970, Th. 25.4).

We will now prove a lemma which will be used in a slight
extension of the $k = 1$ case of Theorem 1 (ii). The lemma extends
the result of Goffman (1961, Th. 9.7.1) to include a boundary case.

Lemma 1. Suppose $f$ is a real-valued function of a real
variable which is defined in $[a,b]$ where $a < b$. Suppose that
$f'$ exists, and is finite valued on $(a,b)$ and that the limits

\[
\lim_{t \downarrow 0} \frac{f(a+t) - f(a)}{t} \equiv f'(a), \qquad \lim_{t \uparrow 0} \frac{f(b+t) - f(b)}{t} \equiv f'(b) \tag{4}
\]

exist (possibly infinite). If $[\alpha,\beta] \subset [a,b]$ and $c$ is any
extended real number between $f'(\alpha)$ and $f'(\beta)$ inclusive, then
there exists a $\xi \in [\alpha,\beta]$ such that $f'(\xi) = c$.

Proof. If $c = f'(\alpha)$ or $f'(\beta)$ take $\xi = \alpha$ or $\beta$. Otherwise,
$\alpha < \beta$ and $c$ is real and strictly between $f'(\alpha)$ and $f'(\beta)$.
Consider the case $f'(\alpha) > f'(\beta)$ and let $F(x) = f(x) - cx$ so
that $F'(\alpha) > 0$ and $F'(\beta) < 0$. Hence, there exist $\alpha < \xi_1 < \xi_2 < \beta$ such that
$F(\xi_1) > F(\alpha)$, $F(\xi_2) > F(\beta)$. Thus, $F$ is maximized on the interior
of $[\alpha,\beta]$, say at $\xi$. Then $F'(\xi) = 0$ so that $f'(\xi) = c$. If
$f'(\alpha) < f'(\beta)$, $F'(\alpha) < 0$, $F'(\beta) > 0$ so the same ideas yield a
minimizer of $F(x)$ on $(\alpha,\beta)$.

Theorem 2. Let $f$ be a concave (or convex) function of a
real variable defined on an interval $[a,b]$ where $a < b$ and
suppose $f'$ exists and is finite on $(a,b)$. Then (i) $f'(a)$ and
$f'(b)$ exist (possibly infinite) as one-sided derivatives and (ii)
$f'$ is continuous from $[a,b]$ into the extended reals.

Proof. (i) That the limits (4) exist (in the extended
reals) is a consequence of the fact that for a concave function
$t^{-1}[f(x+t) - f(x)] \downarrow$ in $t \ne 0$.

(ii) By Lemma 1 we see that $f'$ assumes every value between
$f'(a)$ and $f'(b)$ inclusive. Since $f$ is concave, $f'(x)$
is monotone ($\downarrow$ in $x \in [a,b]$) and, therefore, must be continuous
on $[a,b]$.
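
The monotonicity of the difference quotient used in part (i) can be seen on a small example. The Python sketch below is an illustration only: the concave function `concave` (with a kink at $x = 1/3$), the grid `ts` and the helper names are assumptions made here, not objects from the thesis.

```python
from fractions import Fraction as F


def quotient(f, x, t):
    """Difference quotient [f(x + t) - f(x)] / t for t != 0."""
    return (f(x + t) - f(x)) / t


def is_nonincreasing_in_t(f, x, ts):
    """Check that the difference quotient at x does not increase as t increases."""
    qs = [quotient(f, x, t) for t in sorted(ts)]
    return all(q1 >= q2 for q1, q2 in zip(qs, qs[1:]))


def concave(x):
    """A concave piecewise-linear function with a kink at x = 1/3."""
    return min(2 * x, 1 - x)


def convex(x):
    """A convex function, for contrast."""
    return x * x


ts = [F(k, 20) for k in range(-6, 7) if k != 0]   # grid of nonzero t values
x0 = F(1, 3)

assert is_nonincreasing_in_t(concave, x0, ts)
assert not is_nonincreasing_in_t(convex, x0, ts)
# One-sided slopes at the kink: left slope 2, right slope -1.
assert quotient(concave, x0, F(-1, 20)) == 2
assert quotient(concave, x0, F(1, 20)) == -1
```

At the kink the one-sided limits exist but differ, which is the situation that the finiteness and existence assumptions on $f'$ in Theorem 2 rule out on $(a,b)$.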

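3. Differentiability of the Bayes Envelope as a Function of $\underline{G}$.

Definition. For $\theta \in Q$ and $G \in \mathcal{G}$ we define the line
segment

\[
L_{\theta,G} = \{\underline{G} + t\theta \mid -G(\theta) \vee (G(M) - 1) < t < G(M) \wedge (1 - G(\theta))\} . \tag{5}
\]

Note that $L_{\theta,G}$ is empty if and only if $G(\theta) + G(M) = 0$.

Theorem 3. Fix $\theta \in Q$ and $G_0 \in \mathcal{G}$ such that $G_0(\theta) +
G_0(M) > 0$. The partial derivative $R_\theta(\underline{G})$ exists and is finite
at all $\underline{G} \in L_{\theta,G_0}$ if and only if there exists a determination
$\varphi_{(\cdot)}$ satisfying

\[
\lim_{t \to 0} R(\theta - M,\varphi_{G+t\theta-tM}) = R(\theta - M,\varphi_G) \quad \text{for all } \underline{G} \in L_{\theta,G_0} \tag{6}
\]

in which case

\[
R_\theta(\underline{G}) = R(\theta - M,\varphi_G), \quad \underline{G} \in L_{\theta,G_0} \tag{7}
\]

and $R(\theta - M,\varphi_G)$ is seen to be unique across determinations
$\varphi_{(\cdot)}$.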

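Proof. For any determination of the Bayes response
and $t \ne 0$ such that $\underline{G} + t\theta \in \underline{\mathcal{G}}$ we can write

\[
\begin{aligned}
t^{-1}[R(\underline{G} + t\theta) - R(\underline{G})] ={}& t^{-1}[R(G + t\theta - tM,\varphi_{G+t\theta-tM}) - R(G + t\theta - tM,\varphi_G)] \\
&+ t^{-1}[R(G + t\theta - tM,\varphi_G) - R(G,\varphi_G)]
\end{aligned}
\]

and using the linearity of $R(H,\varphi_G)$ in $H$ this becomes

\[
\begin{aligned}
t^{-1}[R(\underline{G} + t\theta) - R(\underline{G})] ={}& t^{-1}[R(G + t\theta - tM,\varphi_{G+t\theta-tM}) - R(G + t\theta - tM,\varphi_G)] \\
&+ R(\theta - M,\varphi_G) .
\end{aligned} \tag{8}
\]

Since $R$ is concave on $\underline{\mathcal{G}}$, the left hand side of (8) is monotone
in $t$. The term in square brackets on the right hand side of (8)
is non-positive so that
\[
\lim_{t \downarrow 0} t^{-1}[R(\underline{G} + t\theta) - R(\underline{G})] \le R(\theta - M,\varphi_G) \tag{9}
\]
and
\[
\lim_{t \uparrow 0} t^{-1}[R(\underline{G} + t\theta) - R(\underline{G})] \ge R(\theta - M,\varphi_G) . \tag{10}
\]

For $\underline{G}$ on $L_{\theta,G_0}$ both limits (9), (10) apply so that the existence
of $R_\theta(\underline{G})$ implies the left hand sides of (9) and (10) are equal
and, hence, implies (7). The derivative of a concave function on
a closed interval is continuous there (Theorem 2(ii)) so that
$R_\theta(\underline{G})$ is a continuous function of $\underline{G}$ with respect to its $\theta$th
coordinate and (6) obtains.

Conversely, suppose (6) holds for a determination $\varphi_{(\cdot)}$
and consider the non-negative decomposition

\[
\begin{aligned}
t\,[R(\theta - M,\varphi_G) - R(\theta - M,\varphi_{G+t\theta-tM})] ={}& [R(G + t\theta - tM,\varphi_G) - R(G + t\theta - tM,\varphi_{G+t\theta-tM})] \\
&+ [R(G,\varphi_{G+t\theta-tM}) - R(G,\varphi_G)] .
\end{aligned} \tag{11}
\]

The left hand side of (11) being $o(t)$ as $t \to 0$ implies that
the first term on the right hand side of (11) is $o(t)$ as $t \to 0$.
Hence by (8) we see that $R_\theta(\underline{G})$ exists and is equal to
$R(\theta - M,\varphi_G)$.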
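
For $M = 2$ the conclusion of Theorem 3 says that, where the envelope is differentiable, its derivative in $g = G(1)$ equals $R(1,\varphi_g) - R(2,\varphi_g)$. The Python sketch below checks this on a small discrete problem. It is an illustration under assumed inputs and not the thesis's setting: the pmfs `F1` and `F2`, the 0-1 loss and the tie-breaking rule in `bayes_rule` are all hypothetical choices. Because the observation space is finite the envelope is piecewise linear, so it also has kinks where the one-sided limits in (9) and (10) differ.

```python
from fractions import Fraction as F

F1 = [F(1, 2), F(1, 4), F(1, 4)]   # hypothetical pmf of X under theta = 1
F2 = [F(1, 4), F(1, 4), F(1, 2)]   # hypothetical pmf of X under theta = 2
# 0-1 loss: act 1 is correct for theta = 1 and act 2 is correct for theta = 2.


def bayes_rule(g):
    """Non-randomized Bayes act at each x for the prior (g, 1 - g).
    lambda_1 = (1 - g) f2(x), lambda_2 = g f1(x); ties go to act 1."""
    return [1 if (1 - g) * f2 <= g * f1 else 2 for f1, f2 in zip(F1, F2)]


def risks(rule):
    """(R(1, phi), R(2, phi)) under 0-1 loss."""
    r1 = sum(f1 for f1, act in zip(F1, rule) if act == 2)
    r2 = sum(f2 for f2, act in zip(F2, rule) if act == 1)
    return r1, r2


def envelope(g):
    """Bayes envelope R(g) = g R(1, phi_g) + (1 - g) R(2, phi_g)."""
    r1, r2 = risks(bayes_rule(g))
    return g * r1 + (1 - g) * r2


def quotient(g, t):
    return (envelope(g + t) - envelope(g)) / t


g, t = F(2, 5), F(1, 100)
r1, r2 = risks(bayes_rule(g))
# Away from kinks both one-sided quotients equal R(1, phi_g) - R(2, phi_g).
assert quotient(g, t) == quotient(g, -t) == r1 - r2

# At the kink g = 1/3 the one-sided quotients bracket R(1 - 2, phi_g), as in (9) and (10).
k = F(1, 3)
k1, k2 = risks(bayes_rule(k))
assert quotient(k, t) <= k1 - k2 <= quotient(k, -t)
```

With continuous likelihood ratios, as in condition (I0), the kinks disappear and the envelope is differentiable throughout the interior.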

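Corollary 1. If the partials $R_\theta(\underline{G})$ exist for all $\theta \in Q$
and $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$ then for any determination of the Bayes response
$\varphi_{(\cdot)}$,

\[
R(\underline{G}) = \sum_{\theta=1}^{M-1} G(\theta)R_\theta(\underline{G}) + R(M,\varphi_G), \quad \text{for all } \underline{G} \in \mathrm{int}(\underline{\mathcal{G}}) . \tag{12}
\]

Proof. Let $\varphi_{(\cdot)}$ be any determination of the Bayes response.
If $R_\theta(\underline{G})$ exists for each $\theta \in Q$ and $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$, Theorem 3 shows
that for such $\theta$ and $\underline{G}$, $R_\theta(\underline{G}) = R(\theta - M,\varphi_G)$ and (12) follows from (3).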

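The next result is motivated by work of Samuel (1966, Lemma 1)
and establishes the joint continuity of $R(\theta,\varphi_G)$ in $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$.

Theorem 4. (i) Suppose $R_\theta(\underline{G})$ exists for all $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$
and $\theta \in Q$ and let $\varphi_{(\cdot)}$ be any determination of the Bayes response.
Then for every $\theta \in \Theta$, $R(\theta,\varphi_G)$ is continuous in $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$. Moreover,
for each $\theta \in \Theta$ and $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$, $R(\theta,\varphi_G)$ is unique across
determinations $\varphi_{(\cdot)}$. (ii) If there exists a determination $\varphi_{(\cdot)}$
such that for every $\theta \in \Theta$, $R(\theta,\varphi_G)$ is continuous in $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$,
then $R_\theta(\underline{G})$ exists for all $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$ and $\theta \in Q$.

Proof. (i) Let $\varphi_{(\cdot)}$ be any determination of the Bayes
response. By Theorem 1(ii), $R_\theta(\underline{G})$ is continuous in $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$
for each $\theta = 1,\ldots,M-1$. Since, in addition, $R(\underline{G})$ is continuous
in $\underline{G}$, (12) shows that $R(M,\varphi_G)$ is continuous in $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$.
Moreover, (12) shows that $R(M,\varphi_G)$ is unique across determinations
$\varphi_{(\cdot)}$ since $R_\theta(\underline{G})$, $\theta = 1,\ldots,M-1$ and $R(\underline{G})$ are well defined
and do not depend on $\varphi_{(\cdot)}$. By Theorem 3 we can write

\[
R(\theta,\varphi_G) = R_\theta(\underline{G}) + R(M,\varphi_G), \quad \theta = 1,\ldots,M-1,\ G \text{ such that } G(\theta) + G(M) > 0 . \tag{13}
\]

Therefore, (13) together with the continuity and uniqueness of
$R_\theta(\underline{G})$ and $R(M,\varphi_G)$ yields the continuity and uniqueness of
$R(\theta,\varphi_G)$, $\theta = 1,\ldots,M-1$ and completes the proof of (i).

(ii) Let $\varphi_{(\cdot)}$ be a determination such that for each $\theta$, $R(\theta,\varphi_G)$ is continuous in
$\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$. Then (6) obtains for any $\underline{G}_0 \in \mathrm{int}(\underline{\mathcal{G}})$ and all
$\theta = 1,\ldots,M-1$ and it follows from Theorem 3 that $R_\theta(\underline{G})$ exists
for all $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$ and $\theta \in Q$.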

Since the existence of partials is equivalent to differentiability
for concave functions (cf. Theorem 1 (i)), we see that
Theorem 4 gives a complete characterization for the differentiability
of the Bayes envelope. Also it shows that the differentiability
of $R$ on $\mathrm{int}(\underline{\mathcal{G}})$ depends in no way on the choice of the $M-1$
coordinates in the definition of $\underline{G}$.

The continuity condition (C) which was introduced in Section
2.2 is the continuity of $R(\theta,\varphi_G)$ in all $\underline{G}$, not just on the
interior. Samuel (1966, Lemma 1) claims that the differentiability
of $R$, which relates necessarily to differentiability of $R$ at
points in the interior of its domain, is sufficient for (C) for the
particular determination of $\varphi_{(\cdot)}$ defined by Samuel (1966, (6)).

Example 1 shows that $R$ may be differentiable on $\mathrm{int}(\underline{\mathcal{G}})$
and, in fact, may possess a one-sided partial derivative $R_\theta$ at a
point on the boundary of $\underline{\mathcal{G}}$ with $R(\theta,\varphi_G)$ and $R(M,\varphi_G)$ not continuous
at that point for any determination of $\varphi_{(\cdot)}$.
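Example 1. Let $\Theta = \{1,2,3\}$, $\mathcal{A} = \{1,2\}$ with loss matrix
given by

\[
(L(\theta,a)) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{pmatrix} .
\]

Let $P_1$ and $P_2$ be uniform on $(0,1)$ and $(1/2, 3/2)$, respectively,
and let $P_3$ be triangular on $(0,2)$ with $\mu$ = Lebesgue measure.
For every $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$, $P_\theta[\lambda_1(G,X) = \lambda_2(G,X)] = 0$, $\theta = 1,2,3$, and

\[
R(1,\varphi_G) = P_1[\lambda_1(G,X) < \lambda_2(G,X)]
\]
\[
R(\theta,\varphi_G) = P_\theta[\lambda_2(G,X) < \lambda_1(G,X)], \quad \theta = 2,3
\]

and it follows that $R(\theta,\varphi_G)$ is continuous in $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$ for
each $\theta = 1,2,3$. Hence, by Theorem 3, $R_\theta(\underline{G})$ exists and is equal
to $R(\theta - 3,\varphi_G)$ for each $\underline{G} \in \mathrm{int}(\underline{\mathcal{G}})$, $\theta = 1,2$, and by Theorem
1(i), $R$ is seen to be differentiable on $\mathrm{int}(\underline{\mathcal{G}})$. Consider the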

boundary point Q0 = (5,8). We see that for any determination

“’M’
lim R(1,tp + )= 5 , lim R(l, )= o
t10 Qo "1 tTO Cpgoﬂz
and
1
lim R(3, )= — , lim R(3,¢ )= 1, .
no cPgo+t1 8 ”0 Qo+t2


Thus, there is no determination for which (C) obtains for e = l
or e = 3. However, by Theorem 3 we see that R1(QO) exists as
a left limit and is equal to %- but that R1 is not continuous
at Q0 since lim R1(Qd+tl) = g' and lim R1(Qo+t2) = -% .

1:10 C10

The predecessor to Lemma 1 of Samuel (1966) is Lemma 1 of
Samuel (1963). This lemma pertains to an $M = 2$ decision problem,
in fact a $2 \times 2$ decision problem. Lemma 1 (1963) states that the
differentiability of $R$ on $\underline{\mathcal{G}}$ (here $\underline{\mathcal{G}} = [0,1]$ and we presume
that the differentiability at 0 or 1 refers to one-sided and finite
derivatives) implies that for any determination $\varphi_{(\cdot)}$ and either
$\theta$, $R(\theta,\varphi_G)$ is continuous in $\underline{G} \in [0,1]$.

Theorem 5 states that for any action space with bounded
loss function, the differentiability of $R$ on $\underline{\mathcal{G}}$ implies the
uniform continuity and unicity of $R(\theta,\varphi_g)$ in $(0,1)$ for each
$\theta$. Theorem 6 states that, in the $2 \times 2$ problem considered by
Samuel (1963), under the assumption of differentiability of $R$
in $(0,1)$, $R(\theta,\varphi_g)$ is uniformly continuous in $[0,1]$ for both
$\theta$'s if and only if $\varphi_0$ and $\varphi_1$ are admissible determinations of
the Bayes response.

Theorem 5. Suppose $M = 2$ and $L(\theta,a) \le B < \infty$ for all
$\theta \in \Theta$, $a \in \mathcal{A}$. (Since $M = 2$, $\underline{\mathcal{G}} = [0,1]$ and we let $g$ be generic
for an element of $\underline{\mathcal{G}}$ and $R'$ denote the derivative of $R$.) If
$R'$ exists on $(0,1)$ then (i) for each $\theta \in \Theta$ and $g \in (0,1)$,
$R(\theta,\varphi_g)$ is unique across determinations $\varphi_{(\cdot)}$; (ii) for each
$\theta \in \Theta$, $R(\theta,\varphi_g)$ is uniformly continuous in $g \in (0,1)$.

Proof. (i) Part of Theorem 4.

(ii) The continuity of $R(\theta,\varphi_g)$ in $g \in (0,1)$
is part of Theorem 4. We note that when $M = 2$ the probability
measure $G_t = (g+t, 1-g)/(1+t)$ is equal to $(g+\Delta, 1-g-\Delta)$ if
$t = \Delta/(1-g-\Delta)$. Hence, if $0 < \Delta < 1-g$ and with this choice of
$t$, $R(1,\varphi_{g+\Delta}) = R(1,\varphi_{G_t}) \le R(1,\varphi_g)$, by Proposition 1.5 and we see
that $R(1,\varphi_g) \downarrow$ in $g \in (0,1)$. Likewise, $R(2,\varphi_g) \uparrow$ in $g \in (0,1)$.

For $\theta \in \Theta$ define the functions $h_\theta$ by

\[
h_\theta(0) = \lim_{g \downarrow 0} R(\theta,\varphi_g), \qquad h_\theta(1) = \lim_{g \uparrow 1} R(\theta,\varphi_g),
\]
\[
h_\theta(g) = R(\theta,\varphi_g), \quad g \in (0,1) .
\]

The limits exist by the monotonicity of $R(\theta,\varphi_g)$ in $g$ and are
finite since $0 \le R(\theta,\varphi_g) \le B < \infty$ for all $\theta$ and $g$. Since
$h_\theta(g)$ is continuous on $[0,1]$ it is uniformly continuous on
$[0,1]$ and hence is uniformly continuous on any subdomain. In
particular, $R(\theta,\varphi_g)$ is uniformly continuous on $(0,1)$.

Example 1 shows that Theorem 5 cannot be extended to
M = 3 even when the action space is finite. For in this decision

problem and with Q_E intcg),

I
“-0

R(1,cpc) = 0, R(23%) = *9 R(3:(PG) - if 9. 6 £1

while

I
ooh-a

R(chc) = ’5. R(2.<pc) = 0. R(3.ch) - if

IO
n)
k.

for the regions described in Figure l.

[Figure 1. The Domain $\underline{\mathcal{G}}$; horizontal axis $G(1)$.]

By examining the behavior of R(e,qb) in a neighborhood of (%,%)
we see that R(e,qb) is not uniformly continuous in Q.E int(£9

for e = 1,2,3.

Theorem 6. Consider a 2 X 2 statistical decision problem

with loss matrix

(L(9,a)) = , a,b > 0 .

If $R$ is differentiable on $(0,1)$ then $R(\theta,\varphi_g)$ is uniformly
continuous in $g \in [0,1]$ for each $\theta$ if and only if $\varphi_0$ and
$\varphi_1$ are admissible.

2529;. For a decision rule m, let m(x) = m(2‘x). Since
R' exists on (0,1), Theorem 5 states that R(e,¢é) is continuous

in g on (0,1) for each a. The Bayes rules qb and m1 are


characterized a.s. u by

l , if f2(x) > 0

(X) =
Wb arb, if f2(x) = 0

V
O

0 , if f1(x)

(X) =
$1 arb, if f1(x) —

I
O

For any determination ¢( ) a calculation shows that

lim R(1,(p) " R(1,(p1) " 0
311 8

lim R(2,(pg)

R(Z. )- 0 .
3:0 CPO

that

lim R(l ,(pg)

a P [f (X) >'0]
8‘0 1 2

limR(2,cp) ahp f (X) >0] ,
311 g 2[ 1

and that
R(l,mb) = a P1[f2(X) > 0] +-a P1[qb(X)[f2(X) = 0]]
R(2,¢1) = b P2[f1(X) > 0] + b sz (1 - tp1(X))[f1(X) = 0]] .

We see that R(l,¢é) is continuous on [0,1] if and only if

P1[qb(X)[f2(X) - 0]] I 0 which is satisfied if and only if qb
is admissible. Likewise, R(2,¢h) is continuous on [0,1] if
and only if P2[(l - ¢1(X))[f1(X) - 0]] - 0 which is satisfied

if and only if m1 is admissible.

An equally simple and direct proof of Theorem 6 exists
based on Propositions 1.6 and 1.7(ii) and the fact that $\varphi_0$ and
$\varphi_1$ are admissible if and only if they are everywhere dominant
(cf. Hannan (1965)).

BIBLIOGRAPHY


Ferguson, Thomas S. Mathematical Statistics: A Decision Theoretic
Approach. New York: Academic Press, 1967.

 

Gilliland, Dennis C. (1966). Approximation to Bayes risk in sequences
of nonfinite decision problems. RM-162, Department of
Statistics and Probability, Michigan State University.

Gilliland, Dennis C. (1968). Sequential compound estimation. Ann.
Math. Statist. 39 1890-1905.

Gilliland, Dennis C. (1969). Approximation to Bayes risk in sequences
of non-finite games. Ann. Math. Statist. 40 467-474.

Gilliland, Dennis C. (1972). Asymptotic risk stability resulting
from play against the past in a sequence of decision problems.
To appear in September IEEE Transactions on Information Theory.

Gilliland, Dennis C. and Hannan, James F. (1969). On continuity of
the Bayes response and play against the past in a sequence
of decision problems. RM-216, Department of Statistics and
Probability, Michigan State University.

Goffman, Casper. Real Functions. New York: Holt, Rinehart and
Winston, 1961.

Hannan, James F. (1956). The dynamic statistical decision problem
when the component problem involves a finite number, m, of
distributions. (Abstract). Ann. Math. Statist. 27 212.

Hannan, James (1957). Approximation to Bayes risk in repeated play.
Contributions to the Theory of Games 3 97-139. Princeton
University Press.

Hannan, James (1965). MR 30 #3553.

Hannan, James F. and Robbins, Herbert (1955). Asymptotic solutions
of the compound decision problem for two completely specified
distributions. Ann. Math. Statist. 26 37-51.

Hannan, J.F. and Van Ryzin, J.R. (1965). Rate of convergence in the
compound decision problem for two completely specified
distributions. Ann. Math. Statist. 36 1743-1752.

Jilovec, S. and Subert, B. (1967). Repetitive play of a game against
nature. Apl. Mat. 12 383-396.


Johns, M.V., Jr. (1967). Two-action compound decision problems.
Proc. Fifth Berkeley Symp. Math. Statist. Prob. 463-478.
University of California Press.

Neyman, J. (1962). Two breakthroughs in the theory of statistical
decision making. Review of the International Statistical
Institute 30 11-27.

Robbins, Herbert (1951). Asymptotically subminimax solutions of
compound statistical decision problems. Proc. Second
Berkeley Symp. Math. Statist. Prob. 131-148. University
of California Press.

 

Rockafellar, R.T. Convex Analysis. Princeton: Princeton University
Press, 1970.

Samuel, E. (1963). Asymptotic solutions of the sequential compound
decision problem. Ann. Math. Statist. 34 1079-1094.

Samuel, E. (1965). Sequential compound estimators. Ann. Math.
Statist. 36 879-889.

Samuel, E. (1966). Sequential compound rules for the finite decision
problem. Journal Royal Statist. Soc., Series B 28 63-72.

Subert, Bruno (1967). An asymptotically optimal decision procedure.
Transactions of the Fourth Prague Conference on Information
Theory, Statistical Decision Functions, Random Processes.
Czechoslovak Academy, Prague 253-258.

Van Ryzin, J.R. (1966a). The compound decision problem with m X n
finite loss matrix. Ann. Math. Statist. 37 412-424.

Van Ryzin, J.R. (1966b). The sequential compound decision problem
with m X n finite loss matrix. Ann. Math. Statist. 37
954-975.