IIUHIIWIIUIWIHIHHIWIHHHIWIN!HIIWIHHWI I-s-x ICDNO '—I_._. 1513pr w I III IIIIIIIIII I II II I III IIIII 31293 00611W1896 II This is to certify that the dissertation entitled Finite State k-Extended Set Compound Decision Problem presented by Chitra Gunawardena has been accepted towards fulfillment of the requirements for Ph.D. degree in Stat'i StiCS Major professor Date 87/; I//I (257? MS U i: an Affirmative Action/Equal Opportunity Institution LIBRARY Michigan State University 0-12771 PLACE IN RETURN BOX to remove this checkout from your record. TO AVOID FINES return on or before date due. DATE DUE DATE DUE DATE DUE % MSU Is An Affirmative Action/Equal Opportunity Institution FINITE STATE k—EXTENDED SET COMPOUND DECISION PROBLEM By Chitra Gunawardena A DISSERTATION Submitted to Michi an State University in partial ful illment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Statistics and Probability 1989 ABSTRACT FINITE STATE k-EXTENDED SET COMPOUND DECISION PROBLEM By Chitra Gunawardena In compound decision theory the usual standard for evaluating compound decision rules is R(GN), where R is the Bayes enve10pe in the component problem and GN is the empirical distribution of the component states Q = (01, ..... ,0N). As introduced by (Johns and ) Swain (1965), a more stringent standard for evaluating compound rules is Rk(G11\§), where k k R game by Gilliland and is the Bayes envelope of a construct called I‘ Hannan (1969) and G11; is the empirical distribution of the overlapping k—tuples (0 ,....,0k), (0 """0k+1)’ (”N-k+2’°“"0N—1’0 ,01), (oN—k+3"”"0 ,01,02), ...., (0N—2""’0k-1)' The k+l standard is more stringent than the k standard and R1(GN) = R(GN). Ballard's thesis (1974) considered the sequence version of the finite state finite act compound decision problem with Rk(Gll\§) as its risk standard. He exhibited procedures which play I‘k Bayes against a delete estimate of G]; in the 01th component problem, V l 5 a 5 N, and showed that, on the average risk scale, the excess compound risk over Rk(GllfJ) for his procedures has rate 0(N‘1/5). Ballard, Gilliland and Hannan (1974) improved the rate to 0(N—é). We here consider the set version of the finite state compound decision problem with Rk(GIl\§) as its risk standard and treat both delete and k nondelete procedures which play I‘ Bayes against corresponding estimates of G113 in each of the component problems. In both cases we show that, on the average risk scale, the excess compound risk over Rk(G11fI) for our procedures has rate 0(N_I), when the action space is finite. Similar, but weaker results are obtained in Section 2.4 when the action space is infinite. In addition, we characterize extrema of the expected value of a function of a generalized Binomial random variable, under constant variance; an analogue to a work of Hoeffding (1956), under constant mean. We show that extrema are attained at points whose coordinates take on at most four different values, only two of which are distinct from 0 and l. To the memory of my Father iv ACKNOWLEDGMENTS I am truly grateful to my advisor Professor James Harman for his continuous encouragement and guidance in the preparation of this dissertation. His patience and willingness to discuss any problem at any time are greatly appreciated. I would also like to thank Professors Dennis C. Gilliland and R. V. Ramamoorthi for careful reading of the dissertation. Conversations with Professor Ramamoorthi concerning Chapter 3 were especially helpful. I would like to thank Professors R. V. Ericson and H. Salehi for serving on my guidance committee. I wish to thank my husband, K.L.D., and my daughter, Kalpanee, for their continued patience and support throughout. Finally, I wish to thank the Department of Statistics and Probability for providing the financial support which made my graduate studies in Statistics possible. TABLE OF CONTENTS CHAPTER PAGE 1. Introduction to the k—Extended Compound Decision Problem 1 1.1 The Component Problem ............... 2 1.2 The Set Compound Problem ............. 2 1.3 I‘k Decision Problem ................ 3 1.4 k—Extended Set Compound Decision Problem ...... 5 1.5 Literature Review ................. 6 2. Set Compound Decision Problem with m x n Component 8 2.1 Preliminaries ................... 8 2.2 Bootstrap Procedures and Estimation of the Empiric Gllfl 11 2.3 Definition of the Procedures and a Useful Upper Bound for the Modified Regret ...... 13 2.4 Asfymptotic Optimality ............... 16 2.5 In mite Action Space ............... 22 3. Extrema of Eg(X) for Generalized Binomial X with Constant Variance ......... 27 3.1 Introduction and Statement of the Problem ..... 27 3.2 Notations and Preliminaries ........... 28 3.3 Necessary Conditions for Extrema of Eg ...... 31 3.4 Characterization of Extrema ............ 38 BIBLIOGRAPHY ...................... 46 vi CHAPTER 1 INTRODUCTION TO THE k-EXTENDED COMPOUND DECISION PROBLEM This chapter presents the general k—extended compound decision problem. We begin with the introduction of some notations that will be used throughout Chapters 1 and 2. In Sections 1.1 and 1.2 we describe the compound decision problem with its usual standard (1.6) for evaluating compound procedures. In Section 1.4 we describe a more stringent standard (1.11) for evaluating compound procedures and with it we introduce the k—extended compound decision problem. In order to describe the concepts in Section 1.4 we devote Section 1.3 to present the I‘k decision problem introduced by Gilliland and Harman (1969). Notations: k and N will denote integers with k S N. The square brackets will be used to denote the indicator function. If fi are functions defined on some sets Ai for i = 1,2,....,j then i J 8 fi will denote the function; x 6 A1 x ..... x A. ~~—> II fi(xi) . i=1 1 i=1 For a sequence 3‘” = (u1,u2, ...... ), “i will denote (u1,u2, ..... ,ui); the subscript N will be abbreviated by omission. With indices arithmetic mod N, V 1 5 Li S N 3i will denote the k—tuple (“i—k +1, ...... ,ui) and ill-j the (j-i)—tuple (mod N) (ui+1, ...... ,uj) . 1.1 The Component Problem The component problem has the structure of a usual statistical decision problem, which is composed of a parameter set 9, indexing a family of probability measures {P0 : 0 e 9 } over a a—field .2 of a sample space .3 , an action space .1 , a loss function L : 9 x J -. [0,ao), decision rules cp : .3 -I .1 such that L(0,cp) is measurable for each 0, with risk (1.1) W) = E0 Lew) where E 0 denotes the expectation with respect to P 0 . 1.2 The Set Compound Problem When N independent problems each having the same structure of the component problem described in Section 1.1 are considered simultaneously, the N—fold global problem is called a compound decision problem. The loss in the compound problem is taken to be the sum of the losses in the N decision problems. Thus for each N, in the compound decision problem we have the parameter set 9N indexing the family of probability measures N {P0= x P0 : Q 6 ON} over ( .2N,.2N ), the action space AN, - i=1 i compound decision rules 59 = (p1,....,¢N) where for each 1 S a S N «pa : 2N -» J is such that L(0,Ipa) is measurable for each 0 with loss N “-2) LN(.Qa_‘£) = 0:1 L(00,¢a) a nth component risk (1.3) Rama = I L(0a,¢a) dPg and compound risk N (1.4) mm = 23 name. oz=l As standard in compound decision theory, we say that a compound decision rule 53 is simple symmetric if $00.0 =
monk .
11
Let S g [0,oo)Ink be the risk set of this I‘k problem. Then for each
I‘k decision rule cp we can associate a point 3 in S, with coordinates of
3 given by (2.4). For w e (I and s E S, let us denote the vector
inner product of w and s . We will also identify (01,02,....,0k) E 91‘
with the basis vector in n with 1 in the (01,02,....,0k) position. Thus
if s is the risk vector associated with (p then
Rk(,€ki‘P) = 2kg
and the Bayes risk of (p versus «2 E (I is
I1 .
as :11} qu (pj dpk
1:1
and
aka») = (120(0)) = A as.
863
That is, 0(w) is the risk vector associated with a (p(w) satisfying
(2.3) (0(a)) = o if j is not a minimizer of w u-I.
J
Remark 2.2
For every < wl,w2,xk > e (I x a x .2k and (p satisfying (2.8) ,
(2-9) ‘Pi(w1)(’,$k) ‘pj(“’2)(§k) > 0 only if “11(5k)w1 S 0 S ulj(§k)w2 °
11
Let E 0. denote the expectation w.r.t. P 0. and E denote the
.1 -l
expectation w.r.t. P =P0 .
The following lemma gives a useful upper bound for the risk of I‘k Bayes
rules.
Lemma 2.1
If H and H' are mappings from .ZN into 0 and tp(w) is I‘
Bayes against 112 e n, then for all x e s“ and 13k e 91‘
k
(2.10) Egk Lk(£krto(Ha(-))(-)) - some»
5 1:1,; r, [90i(Ha) ease» > 0] dflk ,
1] ~01
X
with not.) = Hag“, . ,a_N).
Proof
The assertion (2.10) is that of (2.7) in Remark (2.1) with (p and (0'
replaced by tp(Ha(r))(~) and (p(H'(2(_)) respectively. n
2.2 Bootstrap Procedures and Estimation of the Empiric GIIN‘I
Definition 2.1
A set compound rule 53 is called k—order non—delete bootstrap rule
associated with the II-valued estimator WN based on 2; if for each
1 5 a s N wad) = e(wN(x_))(xa) where no) is rk
w. The rule 53 will be called k—order delete bootstrap rule associated with
Bayes against
12
the fl—valued estimator awn—k based on axe—k if for each 1 5 a g N
In order to find k—order bootstrap rules in the k—extended compound
problem we need to estimate the empirics G15.
The question of estimating the empiric Gil; has already been solved in
Ballard (1974) in the following sense.
If the estimator h on s" to O: thk) = {hgk(1$k) : 9k E 9k} is
r
E hQSO) 18 an
such that E. h (I; ) = [Q = i ], then the estimate
Air Air 1‘ k k a=l
unbiased estimate of Gk V 1 5 k S r.
r I
It has been shown that such a function h exists if the set of densities
{fl,f2,....,fm} are linearly independent in L101). One such bounded h can
be obtained by taking bounded unbiased estimators h = (h1,h2,....,hm) of
.2 and defining the mapping h from .Zk to O componentwise by
( ) h k h 0 ok
2.11 = o e .
~0k i=1 I91 "1‘
Such an estimator is called a product estimator. Further the covariance
matrix of h has full rank under P0 V le OR if the covariance matrix
~k
of h has full rank under P0 V 0 6 9.
The details of the results stated above and the method of obtaining such
functions h are given in Section 3 of Ballard (1974).
Our theorems concern k—order bootstrap rules based on the bounded
unbiased product estimator h of 51‘ defined in (2.11).
13
The estimators of G115, we will be using in our procedures (Definition
2.1) are WN with
N
(2.12) thx) = 021110.50) = HN (say)
for the non-delete rules, awe—k with
a—k
1:0
for the delete rules.
The estimators WN and awe-k has (k-l)—dependent summands for
N > 2k, WN is unbiased for G113 and awn—k is independent of 130
for each a and, on the average scale, is asymptotically unbiased for GIlfI .
2.3 Definition of the Procedurm and a Useful Upper Bound for the Modified
Regret
With HN of (2.12) and aHa—k of (2.13) the set compound
procedures we investigate are
(2.14) 53 with spam) = (p(HN)QV(a) for 1 S a S N
and
* *
(2.15) g with roam) = naHHxxa) for 1 g a s N .
The following lemma gives useful upper bounds for the modified regret of
the k—extended compound problem evaluated at (g and (9* .
14
Lemma 2.2.
With WN defined in (2.12) let
(2.16) W§(-) = WN(.X_H, - WEN)-
Then, for (e and (9* defined in (2.14) and (2.15)
(2-17) 121‘ (£,se)< AN + BN
and
it! k It! It!
(2-17) 12 (M) .<. AN + BN
where
A —L 2: 2i su‘jwa 6 (2x Ox 2‘ and, I‘k Bayes rule (a in the
J—problem
(2.43) wf(w1)(ask) ¢§(w2)(25k) > 0 only if
b.b. b.b.
u l J(%$k)wl g 0 g u 1 J(?Sk)w2 for some { bi’bj} E { Ai’Aj}'
Proof
By the definition of (pf, for any on E Q and 95k 6 63k
¢§(w)(95k) > 0 only if (p(w)(§k)(Aj) > 0 .
Since 0 only if 3 an bj (a Bayes rule against an) E Aj such that
be
wu1(xk)50 VaeJ.
Using this fact with w = “’1 and w = a)? we obtain (2.43). n
24
Theorem 2.2
Consider the compound procedures 59 and 53* defined in (2.14) and
(2.15) but assuming (p(w) is 1‘k Bayes against 111 in the J—problem.
Then, under (2.40) and assumptions of Theorem 2.1
(2.44) $111132) = O(N)
(2441* 12km!) = O(N)-
v
2
Proof
With to, = w§(o) (cf. (2.16) ) and 1.12: HN (cf. (2.12) ) in (2.43),
an application of (2.23) of Lemma 2.3 yields
(2.45) 2 I ¢§(W§) e§IHNI > 0] so, N‘i,
and with “’1 = aHa—k (cf. (2.13)) and 1122 = HN in (2.43), (2.24) gives
*
(2.45) E. I 23,1104.) cp§IHNI > o I s c, N‘i.
For any compound decision rule Q = (6 ,....,6N) in the J—problem let
9‘ be the compound rule in the sub—problem given by Q‘ = (6f,...,61§).
Then v seeN and 15a5N
|L(aa,ia) - L(oa,i;)| g e
and, by integrating this with respect to P 0
25
(2.46) was - Rad!“ s c v i e 9”.
Express DIME as
(2.47) nk(g,_i) = A1+ A2+ A3+ A4
with
N
A1=;1{Ra(£’9—Ra(£’f)}
A — I31 R 16‘ R” (H
2‘3.—.1I a(r_)-E hams)”
N
__ k c _
A3 — 031 E {R (901$ (HN)) ,ea ”(HNH
and
N
k
A4 = 021 E {~00 ”(HN) " gay ”(GNU -
By (2.46) A1 5 Ne. Applying (2.42) with w = HN’ A3 5 N 6.
Note that the proof of Lemma 2.4 remains valid when a is the risk
vector associated with a 1‘1‘
Bayes rule in the J—problem.
Hence, an application of this generalized version of Lemma 2.4 gives
1}
A 4 5 C2N .
Since R a(Q,§‘) = 15; Rk(Qa,6;), on replacing 1p and (p' in (2.7) of
Remark 2.1 by I; and o ] dp .
Taking if = (q in (2.48), we apply (2.45) to the inner integral and bound
the iterated integral by C1 r2 N“; .
26
a:
*
Similarly, we obtain the same bound by applying (2.45) with Q = (q
in (2.48).
12
Hence A2 5 I: C1 r2 NI when Q =[ *
g .
With these bounds substituted into the R.H.S. (2.47)
234.2)
2.49 ..
( ) 2112.2 )
J52Nr+c2Ni+fclr2N*.
Since the R.H.S. (2.49) is asymptotically equal to 2N6 we have proved
*
(2.44) and (2.44) . n
CHAPTER 3
EXTREMA or Eg(X) FOR GENERALIZED BINOMIAL x
WITH CONSTANT VARIANCE
3.1 Introduction and Statement of the Problem
Let S be the number of successes in n independent trials, and let
p = (p1,p2,....,pn) with pj denoting the probability of success of the jth
trial.
Hoeffding (1956) considered the problem of finding the extremum of
Eg(S), the expected value of a given real valued function g on the range of
S when 2 pi is fixed and proved that extrema are attained when
p1,p2,....,pn take on at most three different values, only one of which is
distinct from 0 and 1. In this chapter we consider the analogous problem of
finding the extrema of Eg(S) when var S = 2 pi(1—pi) = A is fixed and
prove (Corollary 3.2) that extrema are attained when pl’p2"“"pn take on
at most four different values only two of which are distinct from 0 or 1.
The proof basically depends on the functions fn—k,i defined in (3.5) and
the representation of f = Eg given in (3.6). The characterization of
extrema in Theorem 3.3 asserts that if a is an extrema of f and has at
least three unequal coordinates in (0,1) then, any point b E D A (cf. (3.3))
having the same number of zero coordinates and unit coordinates as a and
n n
satisfying .2 bi = 2 a. is also an extrema of f. To prove this
l=l i=1 1 ,
assertion, first we show, inductively (Lemma 3.2), the functions
fIHni(al""’m) in (3.6) with m = # of coordinates of a in (0,1), are
zero V 3 5 i 5 m. Theorem 3.1 covers the case m = 3 and i = 3. Then
II 1:48
a. with
m
we use the fact that, a E D and b e D and 2 b. =
A A . 1 1 1
1:1 i
27
28
fn_m,i(al""’m) = o v 3 g i 5 m, in (3.6), to show f(a) = f(b) (cf. proof
of (3.23)). In Corollary 3.2 we exhibit such a point b 6 DA of the form
stated.
Theorem 3.1 is a corollary to Lemma 3.1 which in turn is a consequence
of the Implicit Function Theorem. Theorem 3.2 depends on a simple basic
result on the intersection of circles and lines, and is helpful in evaluating the
maximum.
3.2 Notations and Preliminaries
The notations used will be consistent with that of Hoeffding (1956).
For a p = (p1,p2,....,pn) with 0 5 pi 5 1
(3.1) 1(1)) = E(s) = 130 600 buyer)
with bn R(p) = P(S = k) given by
3.2 = - - -
( ) bn,k(p) n 1'21 pJ (1 DJ)
{i : E i. = k}
i=1 1
where i = (i1,....,in) with ij 6 {0,1}.
For 03152511,
11
(3.3) DA={p|05pi5l,.2 (pi—.5)2=.25n—A}.
1:
1
i ,..,i
For any given a E Rn, a1 111 will denote the (n—m)—dimensional
vector obtained from a by deleting the coordinates i1,...,im.
29
For any given a and b E It11 , a(bi ,...,bi ) will denote the vector
1 m
obtained from a when ai ,...,ai is replaced by bi ,...,bi .
1 m 1 m
Since f is symmetric under permutation of its coordinates and
linear in each coordinate, we can write
(34) f(p) = p, f,,_1,1(p‘) + f,,_1,0(p‘) v Isis n
where the functions fn_11 and fn_10 are independent of the index i
and symmetric and linear in the components of pi. In general, we will
define the functions fn—k,i by
fn,0(p) = f(p)
(3.5)
1,...,k+1 ...,k+1)
1,
) + fn—k—1,i(p
05i5k;05k5n—1.
1,...,k _
fn—k,i(p I — p1<+1 fn—k—1,i+1(p
By repeated application of (3.5) to the R.H.S. of (3.4)
m
(3'6) f(p) = 20 cmi(p1,°"ipm) fn_rn,i(p H.
where
and,for15i5m
h
cmi(p1,...,pm) = the it symmetric sum of p1,...,p
m.
30
With (0r,1s) denoting the point in lit-Is whose first r coordinates
are 0 and the remaining 3 coordinates are 1, let
(Plrmrpm) E {(Om—th) : h = 0,1,...,m}.
Evaluating (3.6) at (0M,1h,pl’2""’m) V h = 0,...,i we obtain a
system of linear equations in fn—m h(pl""’m) ; 0 < h 5 1. By solving this
system for fn—m i(pl"°"m)
i . .
(3.7) fn—m,i(p1,...,m) = hEO (_l)l h [111] {(om—h,lh,p1,...,m) .
Evaluating (3.6) at p(a1,...,am) and substracting it from (3.6)
m (11 mp )
(3.3) f(p) — f(p(al,...,am)) = £1 [cm’i (a1.....a:) J fn_m,i(p1,...,m) .
In particular when m = 3
(3'9) f(p) - f(p(alia2133) = (P1P2P3 ’ 313233) fn_3,3(p1’2’3)
if
PI+P2+93=31+32+33
and
2 2 2 2 2 2
31
3.3 Necessary Conditions for Extrema of Eg
In this section we state and prove two sets of necessary conditions
(Theorem 3.1, Theorem 3.2) for the maxima of Eg. The following lemma is
a consequence of the Implicit Function Theorem and will be used in the
proof of Theorem 3.1.
Lemma 3.1
Let h and k be functions from 113 me It defined by
h(x,y,z) = x + y + z
and
k(x,y,z) = x2 + y2 + 22 .
Suppose a = (al,aj,ak) 6 [0,1]3 is a solution to h = a and k = fl ,
where a and Q are known constants. If ai it aj , then there exists an
interval J containing ak and unique continuous functions tri and uj
defined on J such that V x e J the point
1100 = (11,00. 11,00. x) e [0.113
satisfy h(u(x)) = a and k(u(x)) = Q with u(ak) = (ai, aj, ak).
The interval J has the form
'(ak- 6. akl 10.1)? . {1}
(ak- 6, ak+ 6) if a e « (0,1)3
. Iak. ak+ 6) .(0.1)2 .. {0} .
(3.10) J
A
32
Proof
The functions h and k have the following properties.
(i) h and k are C1 functions on
136(4) = {x e n3 l IIx—all < t} , r>0.
(ii) h(a) = a k(a) = Q
6(h,k)
Xry
(iii) the Jacobian at a is non zero.
Therefore by Implicit Function Theorem 3 6 such that 0 < 6 < 6
and unique continuous functions ui and uj defined on the interval
(ak-Q, ak+6) such that V x E (ak—Q, ak+é) the point u(x) = (ui(x),
uj(x), x) e B€(a) with h(u(x)) = a, k(u(x)) = [i and ui(ak) = ai,
uj(ak) = aj.
Suppose (ai,aj,ak) e (0,1)3. Then for small enough 6 > 0 ,
B€(a) C (0,1)3. But the fact that 0 < 6 < 6 implies V x E (ak—Q, ak+6),
u(x) e (0,1)3.
Suppose (ai,aj,ak) 6 (0,1)2 x {1} , then for small enough 6 > 0 ,
B ((a) C (0,1)2 x (0,1+6) . Therefore V x E (ak-Q, ak+6) the first and
second coordinates of u(x) are in (0,1). Hence V x e (ak-Q,ak],
u(x) E (0,113 .
By a similar argument we can show that V x E [ak,ak+0)
u(x) e [0,1)3 if (al,aj,ak) 6 (0,1)2 x {0} .
Thus, we have shown the existence of an interval J of the form stated
in the lemma. 0
33
Remark 3.1
If two of ai, aj, ak are on the boundary of [0,1] the only solutions
in [0,1] for h = a and k = Q are the permutations of ai, aj, ak.
Remark 3.2
As a consequence of (3.10) of Lemma 3.1 for any (al,aj,ak,al) 6 (0,1)4
with o < ai # aj ,t ak< 1, we can always find a (bi’bj’bk) 6 (0,1)3 such
that
erasure
ai+aj+ak=bi+bj+bk
and
a?+a?+a§=b?+b§+bfi.
Theorem3.1
Leta maximizes f on DA andsupposefor i¢j¢k
ai ,I aj it ak with at least two of ai,aj,ak are in (0,1). Then
0 if one of the coordinates is 1
IV
' ' k . 2
(3.11) fn_3,3(al’1’ ) = 0 1f (ai,aj,ak) 6 (0,1) , ai ,t aj it ak
0 if one of the coordinates is 0.
IA
Proof
Let a +3 + =0: and a2+a2+ 2=fl
i j 31‘ . i j ak °
Since f is symmetric with respect to the permutation of ai,aj,ak
without loss of generality we will separate the assumptions of the theorem
into the following cases.
34
(i) (ai,aj,ak) 6 (0,1)2 x {1} with ai it aj
(ii)0