CE‘E’AF {\EECE EEE CCME BOUNCE DF CESEUEE FEEUBLEMS n EKT S‘ELETE’ CF SYE‘JEEE/EEEEEEFECATECNS CF PRC}? DUCT “‘1‘ £358 :‘w‘cvsk {'UeE) agree QEP‘IU - titi- Errx ‘Evr "V‘P'n i215}:- { dEh‘EE UF‘E‘EEF’ESE \E JEE‘E-SEEEE‘EG EEUAE‘EG 1%?0 thhlr This is to certify that the thesis entitled EQUIVARIANCE IN COMPOUND DECISION PROBLEMS AND A STABILITY OF SYMMETRIFICATIONS OF PRODUCT MEASURES presented by Jin-Sheng Huang has been accepted towards fulfillment of the requirements for Ph.D. degree inflitifitics & Probability ji’WiJééL VIA/la 4 L E Major professor Date February 24, 1970 0.169 LIBRA '{Y Michigan State Universi ry EQUIVARIANCE IN COMPOUND DECISION PROBLEMS AND A STABILITY OF SYMMETRIFICATIONS OF PRODUCT MEASURES BY Jin-Sheng Huang A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Statistics and Probability 1970 TO MY PARENTS ii ACKNOWLEDGEMENTS I wish to express my sincere thanks to Professor James F. Hannan who suggested the area of research and whose partic- ipation at each stage of the development actually amounts to co-authorship. I am also grateful to Professor Dennis C. Gilliland for reading the manuscript and for suggesting improve- ments. I am indebted to my former colleagues for their en- couragements during my M.S.U. years. Finally I wish to thank Mrs. Noralee Barnes for her excellent typing and her cheerful attitude in preparing the manuscript. iii- TABLE OF CONTENTS Part Page a EQUIVARIANT PROCEDURES IN THE COMPOUND DECISION PROBLEM WITH FINITE STATE COMPONENT PROBLEM Summary .. .............. .......... ............... 1 1. Notations and History ........... ................ 2 2. Equivariant Decision Procedures and Symmetrifications in 3 Compound Decision Pr0blem I. ..... O ....... ......OOOOOOOIO ........... 7 3. Representation of Equivariant Risk .............. 10 The Difference Between the Two Envelopes When 0 is Pairwise Non-Orthogonal ......... 13 S. The Difference Between the Two Envelopes When ‘9 may have some Pairwise Orthogonality ......... 16 b A STABILITY OF SYMMETRIFICATIONS OF PRODUCT MEASURES WITH FEW DISTINCT FACTORS O. sumary O. ...... O ...... .... OOOOOOOOOOOOOOOOOOOOO O 20 1. Introduction ......OOOOOOOOOOOOOO ...... .... ...... 21 2. PrOperties of Symmetrification and an ia-Norm ... 23 3. Permutations; Contraction Effect of Probability Factors .....0.........OOIOOOCOOOOOOOOOO. ........ 27 4. Two Distinct Factors with Unit Differences in Mu1tiplic1ties ......OOOOOCOOOOIOOOOOOO 0000000000 29 5. m1+ 1 Distinct Factors .......... ............. .. 37 PART 3 EQUIVARI'ANT PROCEDURES IN THE (DMPOUND DECISION PROBIEM WITH FINITE STATE COMPONENT PROBLEM 0. SUMMARY Let (IMB,P) be a probability measure space for each pea={F 0,...,Fm},<7 be an action Space and L be a loss function defined on I X 9 X a such that for each i, = j‘ v L(x,Fi,a)dFi(x) < no. -a In the compound problem, consisting of N components each with the above structure, we consider procedures equivariant under the permutation group. With . _3/2 _ am pm. BZB‘Fim) - Fjam and up) = 1—1 (5)9(19) , we show that the difference between the simple and the equivariant envelopes is bounded by (TI) {2K(P) E c?}% N“35 where p = V p, i 1 1.3 and by -% 0T2) 2m{2K(p) E c:}% N . where p = V{pij'pij < 1}. 1 The bound (T1) is infinite unless the F1 are pairwise non-orthogonal and 0T2) is designed to replace it in this case. l. NOTATIONS AND HISTORY Let (IyB,P) be a probability measure space for each 1: E 9 = {FO,F1,...,Fm}, d be an action space, L be a loss function which is defined on I X«9 X47 to the non-negative reals with value variously expressed <1) L = L = xLien. We assume that for each i, V Li(a) has finite lower integral a with respect to F1, (2) c1 = i: Li(a)dFi < co. Since the Space (7 serves only as a parameter Space for the class :5 = {L(a)la é a} of loss functions on I, X 0, it is without loss of generality to assume that 4’ contains no duplicates in this sense. To avoid the notational buildup attendant on the introduction of randomization at this and higher levels, we assume that (7 is its own extension to the class of all probability measures on the o-field of subsets of a generated by {L(x,>P)|x E I, P E 9} : given any such probability measure 5, <3) a a5 64 a Mag) =J‘ L. Vx.P- a is unique by the assumption of no duplicates. We observe that E for each x and P, L is linear in a. For a t t a1 + (l-t)a0 is in (7 if a and a are, and O 1 2 (4) L(x,P,at) =‘f L(x,P,-)dat = t L(x,P,a1) + (l-t)L(x,P,ao), where the first equality follows from (3) and the second one follows from linearity of integrals. Hereafter we Shall write integrals in operator notation, e.g. the integral in (3) would be expressed §[L(a)] or, and preferably, §[L]. Let .D be the family of all functions d on I. to a such that Lie d, the function which maps x to xLi(d(x)), isia-measurable for each i. For d 6.3 we define the risk of d at Fi as the integral of Li° d with respect to Fi’ (5) R(Fi’d) = Fi[Li° d] s ci' If G is a distribution on {0,...,m} we define the Bayes risk against G by (6) ¢(G) = A G[F1[Lio d]]. .8 We refer to the decision problem described above as the component problem. ‘When N decision problems each with above generic structure are considered simultaneously, the resulting N-fold global problem is called a compound decision problem with finite state components. Specifically, let :5 e r”, (3,553) = am“, 3 e g =9“, 6 d = 4N and let .0 be the family of all functions a S = (d1,...,dN) from I, to g such that (7) x Lio da(§) oz is 48-measurable for all a and i, Letting N (3) W(X.P.a) = N E L(Xa.Pd.aa). we define the risk of d E .D at P E 9, by (9) R1 s u'1 z Nici. ” " i g E 9* is called a simple (sometimes, simple symmetric) procedure if dabi) = d(xa) for all 01, a.e. e for some d E .8. Let § be the class of all simple procedures and let S E S be denoted by dN. It will follow directly from the definition of 6 in Section 2 that §C 9:, the subclass of .2 equivariant under the permutation group. AS functions of 3, A R(P,S) and A R(P,S) will be called the Simple envelope and the equivariant envelope, reSpectively. It is well known (cf. (27) ff.) that the former coin- cides with the component Bayes risk ¢(N) with N denoting the empiric distribution of P1,...,Ph. The compound decision problem was introduced by Robbins (1951). He argued that a bootstrap procedure which first estimates the empiric distribution of P1,...,Pfi and then plays Bayes against the estimate within each component may have its compound risk uni- formly close to the simple envelope. Hannan and Robbins (1955) considered 2 X 2 9 X a and (Theorem 3) bounded the average loss of a bootstrap procedure by the sum of an error of estimation and a loss-weighted Glivenko- Cantelli measure of deviations of the empiric distribution of x1,...,xN, thus obtaining strong convergence to zero uniformly in P* of the difference of the average loss from the Simple envelope for all correspondingly good estimators. Risk convergence (Theorem 4) followed as a corollary. Oaten (1969) permits loss dependence on x, replaces Bayes by any of a wide class of e-Bayes and otherwise generalizes these results to m X n 9 X 4 (Theorem 1 and its Corollary) and to certain compact ‘9 X compact a’ with L continuous for each x (Theorems 4 and 5). Under continuity and other restrictions on densities, analogues of generalizations to m X n 9 X a were given earlier by Suzuki (1966a). Hannan and Robbins (1955) also introduced the class of equivariant procedures and showed (Theorem 5) that the difference between the simple and equivariant envelopes convergasto zero uniformly in P as N t m. The proof depended heavily on a measure theoretic lemma Specializing Theorem 11.1 of Hannan (1953). Our Theorems 1 and 2 (CTl) and (T2) of our Summary) are a strengthened generalization of their result, with Theorem 1 correspondingly re- lated to Theorem 3 of Hannan and Huang (1969b) and Theorem 2 follow- ing as a somewhat involved corollary to Theorem 1. Hannan and Van Ryzin (1965), for 2 X 2 G’X GK and Van Ryzin (1966), for m X n 9 X a, have established a rate of OCN-k) (and under additional restrictions on '6’ and L, OGN-1)) for uniform risk convergence of bootstrap procedures based on estimators which are averages over x of a suitable kernel. 1,...,xN The importance of our results stems from the basic character of equivariant procedures in the compound problem. Until Oaten's (1969) e-Bayes relaxation, all of the bootstrap procedures con- sidered were essentially equivalent (cf. Lemma 3 of Oaten (1969)) to equivariant procedures. The equivariant envelope is then a clearly more appropriate yardstick of performance than the Simple one. The results themselves have already been used by Oaten (1969), together with his afore-mentioned Theorem 1, to prove risk con- vergence for a wide class of equivariant uniformly-e-Bayes pro- cedures (Theorem 2). AS noted in Section 3 of Hannan and Huang (1969b), a gen- eralization of the underlying measure theoretic lemma, Theorem 2 of Horn (1968), turns out to be distinctly improved by an immediate extension of the afore-mentioned Theorem II.1. Her corresponding result on the difference between the envelopes (Theorem 1) inherits the deficiencies of her Theorem 2 and only Shows convergence to zero for each ‘3 with a Stronger restriction on «9 than pairwise non-orthogonality, with a7 finite and with L constant with reSpect to x. Considerable other work relates to equivariant procedures in the compound problem. Stein (1956) and James and Stein (1961) (cf. Stein (1966), where the heuristic of the procedure is revealed, and Cogburn (1965)) obtained strong results with a Gaussian Squared- deviation-loss estimation component problem. In an important general development, Cogburn (196?) imbeds the compound and Empirical Bayes problems in a general theory of stringency. Section 4 of Samuel (1967) (cf. Robbins (1962) and Suzuki (1966b)) investigates a pro- cedure Bayes against uniform prior on proportions for the simplest 2 X 2 example. 2. EQUIVARIANT DECISION PROCEDURES AND SYMMETRIFICATIONS IN A COMPOUND DECISION PROBLEM Let .8 be the permutation group on N objects. The generic element g 6.8 will also be used to denote the transformation induced by g: 10 = ,..., . ( ) g X (yg1 ygN) Letting g(B) = {g x| x E B} for B 6.6, it follows from the trans- formation theorem (Theorem 39.C, Halmos (1950)) that for each P 6.9 and g E .9, g P is in 9 and satisfies ~ (11) P03) N” (g§)(g§). g 6 [3. Furthermore, it follows from the definition of W in (8) that, for each a E 4, ga is in d and (12) W(§:B:S) = w(g?ssg£:gg)- Thus the compound decision problem is invariant under .8. d 6.3 is equivariant (under .g) if for all g E.&, (13) d(gx) = gd(x) a.e. a”. Hannan and Robbins (1955) and Ferguson (1967) use the term invariant procedures instead of equivariant procedures. The latter was suggested by Wijsman (1968) to describe functions which transform properly rather than "invariantly". In our further references we will presuppose this change has been made. 7 It follows directly from definition (13) that d 6 6 if and N-l only if there exists a function v on I X I to <7, symmetric on IN-l, and such that for all a (M) dgy=vaw$> améfi where (15) Ed = (x1,...,xa_l, xa+1,...,xN). Equivalently, d 6 6 if and only if there exists a function 6 on I X IF to <7, symmetric on IE, such that for all a, (16) d (x) = 6(x ,x) a.e. 6N. Q ~ Q ~ It follows from the definition of S and 6 that (17) S s: 6. For each d 6.3 and g 6.9 define dg, the g-conjugate of d, by (18) gg(§) = g’1[g(g§>]- 8 Thus d 6 6 if and only if d = d for all g. Define the * symmetrification d of d 6.0 as the average of its conjugates, N (19) 9* = 010'1 2 dg. It follows immediately that III m l a. ___x- i a. m It» bur-J * W 5? = 2 Corresponding results hold for Subgroups of the permutation group and relate to certain extended (cf. Swain (1965), Johns (1967), Gilliland and Hannan (1969) where only the sequence version is con- sidered) compound decision problems. 3. REPRESENTATION OF EQUIVARIANT RISK As a basis for comparing the Simple and equivariant envelopes, we obtain in this section a convenient representation of the risk function of equivariant procedures and relate to the Bayes risk against a certain uniform prior. For each d 6.9’ and g 6.8, it follows directly from de- finition (9), (12) and the transformation theorem that the risk oftfiu! g-conjugate is ) = EEW(x.P.dg(§>)] (21) = fCW(s§.sf.g(g§))] Averaging the above over .& gives the risk of the symmetrification * d . <22) R<§,g*> = (N!)'1 2 R. 396) Hannan and Robbins (1955) consider the class E’ of all ~ .8 satisfying the constant risk prOperty (23). They Show that d E $(N) coincides with A RON,d), the "risk-invariant" envelope. We ~ E N" N wish to remark that the class 6’ need not be considered separately * because, for each d 6 £3 its symmetrification d has the same risk according to (22). For 6 6.9 it follows from the definition of risk (9) that NR(E:§) = g E[L(XQ,P&) o 5a(§)] =2 2 (Fixfa)[1*i°5o,3’ 1 a‘ Pa=Fi where Fi acts on xa and 3a E (P1,...,P&_1, Pa+1,...,PN) acts V on x . ~a In particular, if 6 6 6, then by (14) 6a (and therefore Li<>6a) is symmetric in fa’ and, with Nji E Nj-l or Nj depend- ing on j - i or 1 # i and 2 denoting Sum over i such that Ni > 0, we have the following representation of equivariant risk, N.. J1 = x (25) NR(N~,§) >3 NiFi j Fj [Lia 61]. Nji v where Fi acts on x1 and X Fj acts on x1. The order of the N J " Fj in X Fjji is immaterial Since the integrand is symmetric. J Let p be any measure dominating ‘9 and let fi = dFi/du. [L10 61] by T(61) for g 6 6, (25) Abbreviating 2 Nifi g Fj is expressible as (26) mm”) = u[T(61)]- 12 If 6 E 3, say 6 = dN, then 61 is a function of x1 alone. Thus 2 _ _.1_ o (27) R(N,d ) - E N FiELi d]. N The infimum over .8 of LHS (27) is, by definition, the Simple envelope. The infimum over .8 of RHS (27) is, by (6), (raj/N), the Bayes risk against the prior N/N. (This is also the infimum over .8 of the (N/N)N-weighted risk but we shall make no use of this interpretation). Hereafter we abbreviate WON/N) by (3(5). We now Show that WCN) is the Bayes risk in the compound problem against the uniform prior on the orbit indexed by E. Let UN denote such a prior. Applying the transformation theorem to the~mapping g _. g3, we obtain from (22) that (28) R634”) = 2 R. N N 20 The infimum over .8 of the LHS above is $01) by (20). The N infimum of the RHS is, by definition, the Bayes risk against UN’ N say, R(UN) . Thus ~ (29) ME) = MUN). and therefore d 6 .8 is e-Bayes if and only if (30) 1101,?) s 1793) =+ e. 4. THE DIFFERENCE BETWEEN.THE TWO ENVELOPES WHEN 9 IS PAIRWISE NON-ORTHOGONAL In this and the following section we bound the difference between the Simple and equivariant envelopes. Since v |F(B)-F 36/3 1 j 3 only when ‘9 is pairwise non-orthogonal. (B)‘ = 1 when Fi I.F , Theorem 1 is of interest Theorem 1. 'LEE pij = V ‘Fi(B) - Fj(B)‘, ci satisfy (2), B K(p> =fi-(jig-)3”)(1-p)'3/2 flilfiE p = v pij- Then 1.) ... 2 _ (31) mg) - Mg) s {2K- wcg) - <2K 0 iff i e 1}, l6 17 and let ”I be the restriction of u to Ii. Since I = 2 II I it follows that p, = E ”I. I Let 5 6 6. The risk of 6 is of form (25), which is expressible as integrals with respect to ”1’ N . N (42) NR]- (46) Nags”) - 1293.9} I v v v V For each I, define h = (h1,...,hfi) 6.8 by N 6v _ ..f Oa(X-1-,-...,X1\q’) 1 X0! E II (47) ha(X1,...Xfi) =€ d (x ) otherwise . I a L V V By (14) we see that h 6 6. By direct calculation using (26) and v the definition of h, v v V NRqu = sltmln + (u-sI)[T(dI)] (48> v 6 fi v v v = NRq,dI) - uI[T(dI) - T(61)]. Since dI is e-Bayes with respect to fi/N and h 6 6; (48) yields (49) ulticdl) - 161)] s MT) - 1763)) +Ne- It follows from (38) that, with 5'= V{pij‘i,j 6 I}, (50) RHS (49) s (21(6))15 2 Nfci +-Ns. iEI Summing (50) over all I with “I f 0, we obtain an upper bound for RHS (46). Since n1 # 0 implies E < l we shall weaken (50) by replacing 5 by p, and then dropping the restriction on the summand. Thus % l (51) R(N,dN) - R(N,6) S (2K(p));5N-1 2 E Nici +-N- V €2No I 161 I Since (51) holds for all 6 E 6 and all e > 0, and therfore for 19 e = O, the proof is complete upon using the Schwarz inequality in (51): (52) z: 2: NEC. =2 2N.cisz N (26:) PART b A STABILITY OF SYMMETRIFICATIONS 0F PRODUCT MEASURES WITH FEW DISTINCT FACTORS O . SUMMARY Let .9 = {F ..,Fm} be a class of probability measures on 0" * (1,8). For any signed measure T on 8“, let T be the average of T3 over all N! permutations g and let “T“ = V{‘7(C)‘:C 6 5”}. _ _ 12. 2 3/2 -3/2 Let dij HFi FjH and K(x) 11 (5) x(l-x) . For any non- negative integral partitions N = (N ,...,N ) and N' = ',...,N') ... O m ... 0 m = ' .. = t . of N, 1:et 61 N'N1 Ni and Ai (Ni A Ni) + l. Wlth . . * T = X Fi1 - X F11 and n = #{iléi * 0} - 1, we bound “T H2 by (T3) n K(d)Z a: A;1 with d = vfdijlai # 0, bj # 0} and, if .9 is internally connected by chains with non-orthogonal successive elements, by (T4) 1 m K(d)(2‘6 ‘)2(Z A-l) with d = V{d ‘F F } 2 i i ij i/1 j ‘ The bound (TB) is finite iff the F1 are pairwise non-orthogonal and (T4) is designed to replace it otherwise. 20 1. INTRODUCTION Section 2 investigates some general properties of signed measures and their symmetrifications w.r.t. general groups. Section 3 specializes to the permutation group and notes a contrac- tion effect of probability factors in product signed measures. The properties developed in Sections 2 and 3 will be used throughout the paper and, in particular, in Lemma 2, in the completion of the proof of Theorem 1, and in the proofs of Theorems 2, 3 and 4. Section 4 proves Theorem 1, which is the special case of (T3) for m = l and 61 = 1. This is the main result of the paper. Its proof contains a detailed outline of itself including Lemmas 2, 3 and 4. An example, consisting of the simplest special case, shows that the bound of Theorem 1 is sometimes asymptotically sharp to within a factor of 3.18... . Section 5 proves Theorems 2, 3, and 4, all as corollaries to Theorem 1. Theorem 2 is the Special case of (T3) for m = l, Theorem3 is (T3) and Theorem 4 is (T4). Our main results are a strengthened generalization of Theorem 11.1 of Hannan (1953). The latter is easily characterized in terms of the m = 1 case of (TB), * 2 2 (T2) Hi H s K(d01) 61 (A0 A ) . amounting to the assertion that, for m = l and fixed F0 er1’ 21 22 (T 11.1) HT*H2 -» o as RHS(TZ) —. 0. As partially indicated in Section 3 and Section 5, the derivation in Section 5 of (T3) and (T4) as corollaries of Theorem 2 would equally well yield weakened (as (T 11.1) weakens (T2)) forms of (T3) and (T4) as corollaries of (T 11.1). A corollary of the 61 = 1 case of (T 11.1), the Lemma of Hannan and Robbins (1955), was there used to show (Theorem 5) that the difference between the simple and equivariant envelopes con- verges to zero. A generalization of the Lemma, Theorem 2 of Horn (1968), is shown in Section 3 to be improved by the 61 = 1 case of a rather immediate corollary to (T 11.1). The Special case of (T3) with two non-zero 6i is used in Hannan and Huang (1969a) (Theorem 1) to bound a more general case of the difference. A similar application, Theorem 1 of Horn (l968),inherits the deficiencies of her Theorem 2. 2. PROPERTIES OF SYMMETRIFICATION AND OF AN ii-NORM Although in this paper we will only be concerned with the difference between two symmetrified probability measures, some of the prOperties used in our proofs hold true and are easier to prove in a more general context. This section investigates some of these properties of symmetrified signed measures and their ifi-norms. Let (Q,CD be a measurable space and .3 be a finite group of measurable transformations g on (u,CD. For a signed measure T on (u,CD, we define Tg as the induced signed measure and T* as the symmetrification of T by * (Tg)c = T(g(c)) c E G , T = AV(Tg) where AV denotes the average over g 6.3. Thus symmetrification (*, hereafter) is a linear Operator. * For any real valued function f on u, define fog and f by . f* = Av = j f. 23 24 For any signed measure r, define an eel-norm of r by (1) Hr” = V{\T(C)| :c e a}. It follows from the Jordan decomposition that <2) M -—- M v urn and hence, if T(l) = 0, m M = “in = urn. In particular, if p and Q are probability measures, then (4) o 5 HP - Q“ s 1, with equality at 0 iff P = Q, and equality at 1 iff P i_Q. For use in the proof of Lemma 1, let u be a measure 3' CIT/d“ ”183° Since dT+/du = (dT/du)+ and dT-Idu = (dT/dH)-a “TH = MdT/du)+ v u(dT/du)'. Hence, if T(l) = 0, <5> 2M = lefi-l). It follows from the transformation theorem (Theorem 39.C, Halmos (1950)) that it =£1a (dpfig dug’ 863- * In particular, if u = u , AV of the above equality yields * dT * _ dT (6) (31:) - d“. ' 25 Let u be a measure and let h and f be such that the * * * products h f and hf are u ~integrab1e. By the transformation * * * * theorem and symmetry, u (h ng) = u (h f) for all g 6.&. Averag- ing over .9 and interchanging h and f yields * * * * * * * (7) u-(hf)=uChf)=u(hf). * For the Special case of (7) obtained by letting h = du/du , we have * * h = l by (6) and therefore, if f (and hence also f ) is * u -integrable, then * * * * (8) N(f)=u(f)=u(f)o For use in the completion of proof of Theorem 1, we note that, by subadditivity of norm and by HTgH E “T“ for a signed measure T, (9) Wu s Av \bsu = urn. It follows from norm subadditivity and the Schwarz inequality (here- after referred to as NS-SI) that if T1 are signed measures then 2 2 (10> us: mm s o: n: M . If T = X T is a product signed measure, HT“ is simply T12 related to the correSponding norms of its factors. We abbreviate by omission subscripts on the norms. Since +++ --__+- -+- T - (T1 x T2) +071 x ¢2), T - (T1 X T2) +-(vr1 X T2), it follows easily that (11) HTIH'HTZH 5 Ha X T2“ 5 2HT1\\°HT2H, with the first equality iff either T1 or T2 is a measure or the 26 negative of a measure, the second equality iff T1(1) = T2(1) - O. In particular, if T is a probability measure, the first equality 2 of (11) yields (12) \hl X 12“ = “11H. To assist in comparing our results with Theorem 11.1 of Hannan (1953), consider a signed measure T with T(l) = 0. By (3). (B) M = Ml = V{T(f)|0 s f s 1}. * Since T (l) = 0 by linearity of * operator, upon applying (13) to T*, and applying (8) to T*(f), it follows that * * * (14) Mr H = v{r(£ )|o s f s 1} = v{r(£)|o s f = f s 1}, * with the second equality following from {f ‘0 s f s l} = {f|o s f = f* s l}. 3. PERMUTATIONS; CONTRACTION EFFECT OF PROBABILITY FACTORS Henceforth we Specialize .3 to be the group of transforma- tions on (Ig3)N induced by the group of permutations on N objects, where (IJB) is a measurable Space. We also let .& denote the permutation group itself. Thus a generic element 3 6.9 will be used both as a permutation and the transformation gx = (xgl’°'°’ng)' The following lemma will be used in the successive extensions, in Section 5, from Theorem 2 to Theorem 3 to Theorem 4. Starting from Theorem 11.1 of Hannan (1953) rather than Theorem 2, Lemma 1 together with NS-SI (10) would yield an extension paralleling our Theorem 3 extension from Theorem 2. It will be Shown that Lemma 1 alone would yield an extension improving Theorem 2 of Horn (1968), where there is a stronger restriction on the F1 than pairwise non-orthogonality, and where the non-zero 6i are 1 and -1 respectively. Lemma 1. ‘If T = T X P for a signed measure ‘T with T(l) = 0 and a probability measure P, then, abbreviating affixes on * 229.22 “ H by omission, \\«r*\\ s \\<>r’>*u- * v * Proof. Since r(1) = 0, “T H = V{T(P(f))|0 s f = f s 1} by (14). Since fi= P(f) is symmetric in the remaining variables and since 0 s f s 1, one more application of (14) completes the proof. - 27 28 N. n! For our extension of Theorem 11.1, let T = X F1l - X Fi1 where the F1 are pairwise non-orthogonal and fixed, and the 6i =‘N; - Ni are zero except for two i's. By judicious choice v Ni of g, we have Tg = T X P with P = X{Fi | 61= 0}. By * * * v * (Tg) a r and Lemma 1, Hr H 3 1| (r) H which, by (14) and Theorem II.l, converges to zero as this case of the bound in (T2) does. 4. TWO DISTINCT FACTORS WITH UNIT DIFFERENCES IN MULTIPLICITIES Theorem 1. Let F and F0 be pfmeasures and let 1 N = (N1,N0) be an integral partition of_ N with N1 2 0 < No. -3 2 With d = “Fl - F0“, with C(d) = d(l-d) I ang_with 2 -1 -1 -l K(d) = (.4) (6 +'5 ) C(d)/C(.4) N N N +1 N -1 l o * 1 o * 2 1 .1_ (15) “(F1 x F0 ) - (F1 x F0 ) H s K(d)(N—_1+1 +NO). 2522;. Since both sides of (15) vanish if d = 0, hence- forth assume F1 # F0. The proof proceeds according to the following outline: As in Hannan (1953), a parametric family of densities of the F1 is introduced and some of their moment properties are related to d. Starting as in Hannan (1953) but then weakening by the Schwarz. inequality, Lemma 2 obtains a family of upper bounds for a slight generalization of LHS (15). Lemma 3 develops lower bounds for modal binomial probabilities for application to the denominators of a bound of Lemma 2. Lemma 4 uses characteristic function in- versions to bound a generalization of the numerators of the bound of Lemma 2. The bound of (15) is then obtained as the minimum of d2, from Lemma 1, and a bound resulting from the application of the other lemmas. For 0 < p = l-q < l and i 1,0, let (l6) Fp = pF1 + qFO , fi = dFi/de, 29 30 (17) e = F1(f0)’ P = p F1 = 1 - Qi(= 1 - q Fo(fi))’ i and note (18) F. = F (£2) > (r (f >)2 = 1 1 1 p i p i by the Schwarz inequality, with equality eliminated by Fp # Fi. Thus (19) e = p'la - q F0050» < 1. Let u be any o-finite measure dominating the Fi’ let h1 = dFi/du -l = . ' = h h ' and let hp ph1 +qhO Since flfohp h1 0 p 2 h1 A ho, it follows that (20) e 2 u(h1 A ho) = l - d. Finally, note that from (17) and (18) (21) P1 - P0 = 1 - 9, P521 = pq F1(fi)FO(fi) > pq e. For integer N and p 6 [0,1]N, let b(k;p) denote the generalized binomial probability of k successes in N independent trials with success probabilities B. At the cost of notational complications, the following lemma could be stated and proved for m instead of 2 probability measures. Lemma 2. For non-negative integral partitions of. N, N1 N0 N1 N6 = '= " . = I: N 0N0,N1) and N 2 (NO,N1), define E F1 X F0 ,2 F X F and T=F-F'. For Or PN’r) k=N' r=N' (23) 4\\T*\\2 < N o J l J l b(k;p ) k=N1 r=N1 * * * * Proof. Since F (dg' /dF:) = F' (dF /ng), the second equality in (22) will follow from the first. * By (6) the integrand in LHS (22) is (dF'ldFE) , whence, by (8), LHS (22) is AV of dF' (24> F('=§‘°g) " dF P v . . . . N1 f Since the integrand in (24) is the product H1 f1(xga)IlE.+1 0(xga) of F-independent variables, the integral is expressible, in terms of K = #[cr >N,'1‘g01 < N1], has the product N'-K N -N'+K N -K 0 [F01 ° [F1(f0)]K[Fo(f1)] ° [F1(fl)] 1 (25) -N -N KN'-KN -KN;-N 'I'K _ 1 0-0 1 .11 = -p q PoQo P1 Q1 H00- Then (22) follows since NI NI N' N' N 1204131 PO 0) (26) AV H(K(g)) =2(k0)(N1.1_1k)(N ) 111(k) = N bCN1;P) with the first equality following from the transfermation theorem. From (5) and the Schwarz inequality, (27) 4\\T*\\2=1:(:r|—N’\)]2<:(|:J—;2\ ) = ”(31;)- Pp Fp dF Four-fold application of (22) to this bound results in (23), completing 32 the proof. For integers 0 s k s N, let (28) M = {01-1)qu b(k;pN) with p = (Mn/(NH). = -1 Lemma} aNN-l-k awkzaNOSe fig N100. Proof. Note that f(x) = (l +-x1)x+%§ as O < x t m since, with t = (2x + l)-1, log f(x) = l + t 2/3 e+~t4l5 + ... The lemma then follows from the representations, = ELL = km; N'” <29) am. ENk-l f(N-k) aho H f(N- j) ’ aNo= {Ii—+1} tm) The following lemma uses characteristic function inversions to bound a mixed second divided difference of generalized binomial probabilities. In our application to certain numerators of (23), the generalized binomials involve only two distinct probabilities but the added generality simplifies the proof and may serve to motivate other applications. Lemma 4, For integers o s k s M, P = (p1,...,gfi) e [0,1]M N 2 and (P,P+h) 6 [0,1] less the diagonal, let A and A reSpectively denote the divided difference operators from k to k+l and from P £2. P+h. ‘With B(x)= (4/n)on W(l y) %e 2x xydy and 2 M 0' - 21 PCY(1-Pa), (30) \c b b\ s 13(02). x3/2 -a (31) B(X) -* (211) as X 1 °°, 3/2 (32) s = sup {x B(x)\o s x < o} = .545447... 33 Proof. Note for later use that, with ¢(u) = Pe1U +-Q the ch.f. of a Bernoulli variable with parameter P = l - Q, (33) ‘¢(u)‘2 = l - 4PQ sin2 2 5 exp - 4PQ sin2 2 . 2 2 Since i_\.e-ink A ¢(u) = e.iUR 4 sin2 g, the iterated divided difference in (30) is given by the Fourier inversion l -iuk , 2 p (34) fig e 4 sin 2 2n <50, (u)du. H213 By application of (33) to the Qa’ the modulus of (34) is bounded by 2 2n 3 . 2 p -20 sin _ 2 (35) n I3 Sin 2 e Y'du - B(o ). (Since v-lsin v I, on 0 < v 3 11/2, RHS (33) 5 exp - 4l?Q(u/Tr)2 on ‘u‘ s n. The correSponding weakening of (35) implies 5 - s < n /22 7/2 = 1.553 ... 2 2 5/2 .2 - 2 2 -2x(u/n) _1, 2 -2x(u/Tr) _ 11 -3/2 B(x) S n f3 Sin 2 e du < n_f: u e du - E773'x .) With IO, I1 denoting the modified Bessel functions, 2 l -% -% -2x B(x) =[-;}“Oy <1-y> e ydyl' (36) = [-Ze'x10(x)]' = 2e'x[10(x) - 11(X)]. where the second equality follows by differentiation from the correSponding Laplace transform (cf. 29.3.124 of NBS-AMASS (1964)), ' = and the last from I0 11. In view of (36), (31) is an immediate consequence of the usual asymptotic expansions of IO and 11 (cf. 9.7.1 ibid). 34 % 3 2 - To verify (32), first note that [x / B(x)]' = x e xF(x) with F = (3-4x)I0 + (4x-l)Il. Since F(l) > 0 > F(2), the behavior of F' will imply that 3 x0 6 (1,2) with F s 0 on [0,x0] and F < 0 on (x0,w) as follows. Since 11 s x10, xF' = x(4x-5)IO +(l-x)(4x+l)I1 S -(15/16)xIO on [0,1]. Since x(4x-1)F' = (4x+l)(1-x)F - 3I l s x and O, F(x) 2 0 = F'(x) < 0 and hence I s x0 and F(xo) = 0 = F < 0 on (X0 ,m) ° Since, in addition,.00012>’F(1.452) >’0 > F(l.453) (p. 228 BAAS v VI (1937)), a -1.452 /2B(l.452) 5 (1.452) e (.00012)(.001) < 4(10'8) 0 S s - (1.452)3 which results in (32) and thus completes the proof. Completion g pro_of_ pf Iheoreml. From Lemma 2 with Ni = N1+l we have this case of the bounds (23). Let p = (N1+l)/(N+l) hence- forth. This choice insures both denominators in (this case of) (23) are equal and, by Lemma 3, are bounded below by {(N-l)pq}-%aNo. Application of Lemma 4 to the mixed second difference remaining in the numerator results in the upper bound s‘h‘o-3, with 2 h = Pl-PO = 1-9 s d and o = NlPlQl + (NO-npooO > (N-l)pq(l-d) / /2f by (20) and (21), so that, with cN 4'1s(1~i+1)3 ZCN-1)'3 (N); 4-lse = .3706 ... as N 1 w, 1 l c C.(d) — + ——). N (NIH No (37) \\r*\\2 a 42m c{pq}'1 35 * From Lemma 1, “T “2 s d2 independently of N. Since pq is maximal at p = %, (37) is minimal w.r.t. N given N at N1 = [N/Z] and, therefore, exceeds d2 for all d iff (38) max d2/C(d) = (.4>2/c<.4> s aN([‘i:—21'1 + Eli-1]“). d Since RHS (38) 1 w.r.t. N with first violation of (38) at N = 10, (N of (37) is replaceable by c9 = .5185 ... . This in turn can be improved by first increasing 8 to insure equality at N = 10, thus replacing cN by (.4)2(o'1 + 5'1)'1/c(.4) = .5070 ... and completing the proof. That the bound of the theorem is sometimes relatively sharp, will follow from examination of the simplest Special case. Example. Let F0 and F1 be Bernoulli probability measures b( ;§0) and b( ;§1) with 0 = g0 < €11< 1. Then N N N '1 N 0 * 0 and (F0 X F1 (F0 x F1 ) N measures on {0,1}N, putting mass, 6:)-1b(k;§11) and N _1 N +1 N (k) b(k;§1 ) respectively, on every x in {0,1} with exactly N " N+l for b(k;§11) and using b(k;§11 ) = +1 * 1 ) are symmetric probability k ones. Writing bk glbk-l + (1-§1)bk’ * (39) ZHT H =‘E1bk -§1bk-l ‘ (1-§I)bk‘ = glilbk ’ bk-l‘ = 2glbm with m the greatest integer not exceeding (N1+l)§1. As (N1+1)§1(1'§1) *‘ma (N1+1)§1(1'§1)b: ~ (2n)-1 and therefore N +1 E * 2 1 1 1 (40) “T H ” 2n l-§1 1 ° 36 When gl 4 0 and No/N1 d m (which is the most favorable case for the following comparison), the bound of the theorem exceeds RHS (40) /2 only by the factor .8n (30/11)(.6)3 = 3.18... 5. m+l DISTINCT FACTORS We now obtain various extensions of Theorem 1 as corollaries to that theorem. Theorems 1, 2 and 3 represent successive extensions each subsuming, yet corollary to, its predecessor. Theorem 3 is vacuous unless the F1 are pairwise non-orthogonal, and Theorem 4 is designed to replace it in this case. Thus, as implied in the summary, our final results are merely Theorems 3 and 4. Let Ni N; ¢=XFi -XFi ,dij =HFi-FjH i 1 (41) 0. =N! -N, , A, = (N3 AN.) +1, i,j =0, .,m. i i i i i 1 Theorem 2. For m = l, * 2 2 l l (42) Mr H s K(d01)51(A + A ). 0 1 Proof. Assume without loss of generality 61 2 l (we may rename N and N' otherwise), and for j = l,...,61, let N -j+l N +j-l N -j N +3 = 0 l 0 l (43) Tj F0 x F1 - F0 X F1 Since T = Z Tj, it follows from linearity of * and NS-SI (10) that (44) H'r*H2 s a, 2 MHZ- Applying Theorem 1 to each summand in RHS (44) completes the proof. 37 38 Theorem 3. .222 n = #{il6i # 0} - 1 23 positive (otherwise T* = 0), 95.19.19}. d = V{dij‘6i # O, 5 f 0}. Then 3 \\r*\\2 s n K(d) z (sf/£1. Proof. Given N and N' we construct a sequence of partitions 22 = NO’ N1,...,Nr = N', for some r s n, as follows. To construct 1=CN 22 10 and let t be such that 6t has opposite sign with 68. Let "'°’N1m)’ let s be such that |os| = A{‘6J“6j # 0}, = + = _ = ' . Nls Ns 68, N1t NC 68 and N1:] N1 for the other j 8 Thus N1 stays between N and N' coordinatewise and differs from N in two coordinates. Repeating this construction, the process ~ terminates in r s n steps since each successive Ni identifies at least one more coordinate with N'. Define ~ m N m N T1 = x Fjl"lj - x F ij, i = 1,...,r. i=0 i=0 3 Since Ni differs from Ni-l in two coordinates, say 3 and t, * * the fact that (Tig) = (Ti) enables us to use Lemma 1 to obtain N N,_ * N, N (45) HIZH s “(FSi-ls X Fti 1t) _ (F is X F it)*H' Applying Theorem 1 . to RHS (45), we note that, since each Ni stays between Ei-l in this application of RHS (42) are bounded below by As and At and Ni coordinatewise, the denominators +1 respectively. Since K is increasing, we further weaken this application of the bound (42) by replacing dst by d. Thus (46) [RHS (45)]2 s K(d)(Ni_1S - Nis)2(%- + 71—). S C 39 Since Ni-lj = Nij except for j # s,t, RHS (46) is m -l 2 K d 2 A - N. . ()3.an mmj lj> r Since T = 2 Ti, by NS-SI and the above representation of RHS (46), i=1 * 2 * 2 -1 2 (47) HT H s r )3 “'11“ s r K(d) zjjAj 21 (Ni-lj - Nil) . Since E (Ni-lj - N11) = 6j and Since Ni-lj - Nij are of the same sign for fixed j, the summation w.r.t. i in the last term 2 of (47) is bounded by oj. Since r s n the proof is complete. Theorem 4. Let F0,F1,...,Fm pg internally connected by chains with successive elements non-orthogonal and let V . d = v{dij‘Fi/‘(Fj}' Then HT*H2 s % m K(d)(213|51‘)2 2i A;1. ‘ggggf. For any connected graph of finitely many vertices there exists a vertex whose removal leaves the remaining graph connected. We shall rename F0,...,Fm in such a way that successive removal of F0,F1,... leaves the remaining connected. For each i, let t(i) be 9 t(i) > i and Ft(i) ,KFi. Given N and N' we consider the partition which differs from either N (if 6 S 0) or N' (if 6 > 0) only on the 0 0 O-th and the t(0)-th coordinates, where the 0-th coordinate is NO A N6 and the t(0)-th coordinate, compared to that of N or u', is increased by ‘6 By weakening Theorem 3 on both K o‘- and the second denominator (l + the t(0)-th coordinate of the new partition 2 At( ) + ‘601 2 A we see that the square norm of * of the difference between the product measures associated with 0 40 the two partitions is bounded by (48) 1((3’) 539% + X—1——). 0 t(O) (i) j j-th coordinate at the i-th iteration of this process, we see that Iterate the process. Letting 6 be the difference in the 6§1) = 0 for j < i and (49) 6§i) = 6j +-Z {6:r)‘0 s r < i, t(r) = j} for j 2 i. We also note that the 6st) above are disjoint sums of 6's from {60’°"’5i-1}' Since A(i), the minimum of the j-th coordinates for the two :1 partitions at i-th iteration plus 1, is increasing w.r.t. i, the bound correSponding to (48), further weakened by A(1) 2 A., is i i Since each iteration results in reducing one coordinate difference to zero, the process terminates in m steps. By NS- SI as in the proof of Theorem 3, we see that, by (50), 2 * v ' (51) HT “2 s m K(d) 2 6:1) (Al— + R—l—L i i t(i) . . 1 . . . (m) The coefficient of X—- in the summation above is, With 6m = 0, i . 2 2 (52) 6:1) +-z {6:r) |t(r) = i}. r By (49) and the comment following it, complemented by 2 61 = 0, we see that 6:1) and the 6ir) in (52) are disjoint sums of {50"°"6i-1’6i+l"°"6m}’ Thus the maximum of (52) Over 41 all g is s (2%9L - 5:)2 +-(§%§L - 5;)2 s %i(z|o|)2, and the proof is complete. The hypothesis of Theorem 4 fails to hold if and only if 9 is dis connected. Then either (i) there exists a component, i.e. a connected set of factors, whose N-multiplicity differs from its ~ * n'-multiplicity, in which case it follows easily that ”T H = l, or (ii) every component has identical N- and N'-multiplicity, in which case “T*H is simply related to the “(Tc)*H correspond- ing to the separate components: (53) v “(Tc)*H s HT*H s 2 “(TC)*H. c c where the second inequality follows from Lemma 1 and the triangular inequality. BIBLIOGRAPHY British Association for the Advancement of Science (1937). Mathematical Tables, Vol. VI, Bessel Functions, Part 1, Functions Of Order Zero and Unity. University Press, Cambridge. Cogburn, Robert (1965). On the estimation of a multivariate location parameter with squared error loss. Bernoulli (1723), Bayes (1763), Laplace (1813), Anniversary Vol., 24-29. Springer-Verlag. (1967). Stringent solutions to statistical decision problems. Ann. Math. Statist. 38, 447-464. Ferguson, Thomas S. (1967). Mathematical Statistics. Academic Press, New York and London. Fox, Richard (1968). Contributions to compound decision theory and empirical Bayes squared error loss estimation. RM-214, Department of Statistics and Probability, Michigan State University. Gilliland, Dennis C. and Hannan, James F. (1969). On an extended compound decision problem. Ann. Math. Statist. 40, Halmos, Paul R. (1950). Measure Theory. D. Van Nostrand, New York. Hannan, James F. (1953). Asymptotic solutions of compound decision problems. UNC Institute of Statistics Mimeo Series No. 68. Hannan, James F. and Huang, J.S. (1969a). Equivariant procedures in the compound decision problem with finite state component problem. RM-237, Department of Statistics and Probability, Michigan State University. Hannan, James F. and Huang, J.S. (1969b). A stability of symmetrica- tions of product measures with few distinct factors. RM-238, Department of Statistics and Probability, Michigan State University. Hannan, James F. and Robbins, Herbert (1955). Asymptotic solutions of the compound decision problem for two completely Specified distributions. Ann. Math. Statist. 26, 37-51. 4.2 43 (1965). Rate of convergence in Hannan, J.F. and Van Ryzin, J.R. the compound decision problem for two completely specified distributions. Ann. Math. Statist. 36, 1743-1752. Horn, Susan Dadakis (1968). The optimality for compound decision Tech. Report No. 10, Department of Statistics, problems. Stanford University. James, W. and Stein, Charles (1961). Estimation with quadratic loss. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 361-379. University of California Press. Two-action compound decision problems. Johns, M9V., Jr. (1967). Proc. Fifth Berkeley Symp. Math. Statist. Prob. 1, 463-478. University of California Press. " National Bureau of Standards Applied Mathematics Series 55 (1964). Handbook of Mathematical Functions, U.S. Government Printing Office, Washington, D.C. Oaten, Allan (1969). Approximation to Bayes risk in compound RM-233, Department of Statistics and decision problems. Probability, Michigan State University. Robbins, Herbert (1951). Asymptotically sub-minimax solutions of compound decision problems. Proc. Second Berkeley Symp. Math. Statist. Prob., 131-148. University of California Press. Robbins, Herbert (1962). Some numerical results on a compound decision problem. Recent Developments in Information and Decision Processes. The Macmillan Co., New York (1962), 56-62: Samuel, E. (1965). On simple rules for the compound decision problem. J. R. Statist. Soc. (B), 27, 238-244. Samuel, E. (1967). The compound statistical decision problem. Sankhya 29, 123-140. Inadmissibility of the usual estimator for the Proceedings of Stein, C. (1956). mean of a multivariate normal distribution. the Third Berkeley Symposium on Mathematical Statistics and University of California Press. Probabilipy, 197-206. An approach to the recovery of inter-block Stein, Charles (1966). information in balanced incomplete block design, . Research Papers in Statistics, Festschrift for J. Neyman, 351-366. John Wiley & Sons, London, New York and Sydney. Bounds and rates of convergence for the Swain, Donald D. (1965). extended compound estimation problem in the sequence case. Tech. Report No. 81, Department of Statistics, Stanford University. 44 Suzuki, Giitiro (1966a). Asymptotic solutions of the compound decision problem for many completely Specified distributions. RM No. l, The Institute of Stat. Math. Suzuki, G. (1966b). Discrete compound decision problem. Ann. Inst. Stat. Math. 18, 127-139. Van Ryzin, J.R. (1966). The compound decision problem with m X n finite loss matrix. Ann. Math. Statist. 37, 412-424. Wijsman, Robert A. (1968). Review of Thomas S. Ferguson: Mathematical Statistics. Ann. Math. Statist. 39, 2163-2167. NOVIWNWIWIW “WIWITIIITIHTIIWBIDWEs 3 1293 03082 9406