This is to certify that the dissertation entitled "Parametric Empirical Bayes Problems with Cost for Component Observations" presented by Inna Jung has been accepted towards fulfillment of the requirements for the Ph.D. degree in Statistics. Major Professor. Date: November 11, 1988.

PARAMETRIC EMPIRICAL BAYES PROBLEMS WITH COST FOR COMPONENT OBSERVATIONS

By

Inna Jung

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

1988

ABSTRACT

PARAMETRIC EMPIRICAL BAYES PROBLEMS WITH COST FOR COMPONENT OBSERVATIONS

By

Inna Jung

We consider the empirical Bayes decision problem where the component problem includes a constant cost per observation and the option to choose in advance the total number of observations. The usual empirical Bayes decision problem involves identical components with a given fixed sample size for all repetitions of the component. The empirical Bayes decision approach with our component permits data accumulated over past component problems to be used in selecting both the sample size and the decision rule to be used in the current component problem. The generality introduced by allowing sample sizes that are determined stochastically makes the result more useful in applications where, typically, the choice of sample size is an option based on past data.

The empirical Bayes version involves "independent" repetitions (a sequence) of the component decision problem. With the varying sample size possible, these are not identical components. However, we impose the usual assumption that the parameter sequence θ = (θ_1, θ_2,...) consists of independent G-distributed parameters where G is unknown. We assume that G ∈ 𝒢, a known family of distributions. The sample size N_i and the decision rule d_i for component i of the sequence are determined in an evolutionary way. The sample size N_1 and the decision rule d_1 ∈ D_{N_1} used in the first component are fixed and chosen in advance. The sample size N_2 and the decision rule d_2 are functions of X^1 = (X_{11},...,X_{1N_1}), the observations in the first component. The sample size N_3 and the decision rule d_3 are functions of (X^1, X^2). In general, N_i is an integer-valued function of (X^1, X^2,...,X^{i-1}) and, given N_i, d_i is a D_{N_i}-valued function of (X^1, X^2,...,X^{i-1}). (The action chosen in the i-th component is d_i(X^i), which hides the display of the dependence on (X^1, X^2,...,X^{i-1}).)

For a variety of models, we will construct empirical Bayes rules that are asymptotically optimal. We consider both parametric models involving squared error loss estimation and linear loss testing and show how more general cost functions are covered by the work. We will simulate one model to assess the small-to-moderate i risk plus cost behavior of one of the suggested asymptotically optimal empirical Bayes procedures.

To my wife Chairan and sons Sehyun, Sunggon

ACKNOWLEDGEMENTS

I wish to express my deepest appreciation to Professor Dennis C. Gilliland for his guidance throughout the preparation of this dissertation and his concern for my work.
I would like to thank Professor R. V. Erickson, Professor R. V. Ramamoorthi and Professor H. Salehi for serving on my committee. I am also thankful to Ms. JoAnn Peterson and Ms. Cathy Sparks who have helped me in many ways. My special thanks go to Ms. Loretta Ferguson for her patience and great care in typing my thesis.

TABLE OF CONTENTS

CHAPTER
1. INTRODUCTION ... 1
   1.1. A Statistical Decision Problem With Cost for Observations ... 1
   1.2. An Empirical Bayes Decision Problem with Random Sample Size Components ... 3
   1.3. Literature Review ... 8
2. ESTIMATION OF THE BINOMIAL PARAMETER ... 11
   2.1. The Component Problem ... 11
   2.2. An Empirical Bayes Decision Procedure ... 16
   2.3. Some Empirical Bayes Risk Calculations ... 19
3. TESTING THE BINOMIAL PARAMETER ... 22
   3.1. The Component Problem ... 22
   3.2. An Empirical Bayes Decision Procedure ... 26
4. ESTIMATION OF THE NORMAL MEAN ... 28
   4.1. The Component Problem ... 28
   4.2. An Empirical Bayes Decision Procedure ... 31
5. TESTING THE NORMAL MEAN ... 35
   5.1. The Component Problem ... 35
   5.2. An Empirical Bayes Decision Procedure ... 38
REFERENCES ... 41

LIST OF TABLES

1.1. Empirical Bayes Procedure with Stochastically Determined Sample Sizes ... 5
2.1. n*(α, β) and r(α, β) ... 20
2.2. Estimated Empirical Bayes Risks (m = 2, c = .001) ... 21

LIST OF FIGURES

2.1. A Risk Envelope ... 19

CHAPTER 1

INTRODUCTION

§ 1.1. A Statistical Decision Problem With Cost for Observations

Consider a statistical decision problem with parameter space Θ, action space 𝒜, nonnegative loss function L(·,·) on Θ × 𝒜, unknown prior distribution G on Θ and a cost c > 0 per observation. Let X_1, X_2,... be observations which are independently and identically distributed with a distribution P_θ given θ, taking values in a set 𝒳, the observation space. Let D_n be the set of all decision functions d: 𝒳^n → 𝒜, where 𝒳^n is the observation space for the vector X = (X_1,...,X_n). When θ is the parameter and a decision rule d ∈ D_n is used, the decision loss plus cost for observing X = (X_1,...,X_n) is L(θ, d(X)) + cn, where we assume that L is integrable for all θ, n and d ∈ D_n. Let R_n denote the risk and Bayes risk of the decision rule d ∈ D_n, i.e.,

(1.1)  R_n(θ, d) = ∫ L(θ, d(x)) dP_θ^n(x),

(1.2)  R_n(G, d) = ∫ R_n(θ, d) dG(θ),

and let r_n denote the risk and Bayes risk of the decision rule d ∈ D_n including cost for observations. Then

(1.3)  r_n(θ, d) = R_n(θ, d) + cn,

(1.4)  r_n(G, d) = R_n(G, d) + cn.

We define minimum Bayes risk and minimum Bayes risk plus cost in the usual way. We assume for each prior G and each n = 1, 2,... that a Bayes rule d_G^n ∈ D_n exists. Thus,

(1.5)  min_{d ∈ D_n} R_n(G, d) = R_n(G, d_G^n).

Let R_n(G) = R_n(G, d_G^n) and

(1.6)  r_n(G) = R_n(G) + cn.

Since R_n(G) is nonincreasing in n, a minimizer of r_n(G) exists among the integers 1, 2,.... We will denote a specified minimizer as n* = n*(G) and refer to it as an optimal fixed sample size.
Therefore, r(G) = r_{n*}(G) is the minimum Bayes risk in the component across all possible sample sizes and the corresponding classes of decision rules, i.e.,

(1.7)  r(G) = r_{n*}(G) = min { min {r_n(G, d) | d ∈ D_n} | n = 1, 2,... }.

Moreover, note that R_{n*}(G) + cn* ≤ R_1(G) + c < ∞, so that n* ≤ (R_1(G) + c)/c.

Example 1.1. Let X_1, X_2,... be i.i.d. N(θ, A) given θ, where A > 0 is known, let the loss be squared error, L(θ, a) = (θ - a)^2, and let the prior be G = N(μ, V). Let c > 0 denote the constant cost per observation. Then a Bayes decision function for estimating θ based on the observation X = (X_1,...,X_n) is

(1.8)  d_G^n(X) = [A/(A + nV)] μ + [1 - A/(A + nV)] X̄_n

and

(1.9)  r_n(G) = AV/(A + nV) + cn.

The function AV/(A + nV) + cn is a convex function of n ∈ (-A/V, ∞) with a minimum at η = (A/c)^{1/2} - A/V. Therefore, we can define an optimal fixed sample size n* as the smallest positive integer minimizer of (1.9), which is related to η by

(1.10)  n* = n*(A, V) =  1,                 if η < 1
                         η,                 if η ∈ {1, 2, 3,...}
                         [η] or [η] + 1 (whichever minimizes r_n(G)),  otherwise,

where [·] denotes the greatest integer function.
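For concreteness, the following short Python sketch (added here for illustration only; it is not part of the original computations, and the function names are illustrative) evaluates the optimal fixed sample size (1.10) and the envelope risk (1.7) for the component of Example 1.1.

```python
import math

def n_star_normal(A, V, c):
    """Optimal fixed sample size n*(A, V) of (1.10) for Example 1.1:
    squared error loss, X_i ~ N(theta, A) given theta, prior G = N(mu, V),
    cost c per observation.  Returns (n*, r(G))."""
    def r(n):
        # r_n(G) = AV/(A + nV) + cn, see (1.9)
        return A * V / (A + n * V) + c * n
    eta = math.sqrt(A / c) - A / V        # unconstrained minimizer, cf. the display before (1.10)
    if eta < 1:
        return 1, r(1)
    lo = math.floor(eta)
    n = lo if r(lo) <= r(lo + 1) else lo + 1   # [eta] or [eta]+1, whichever minimizes r_n(G)
    return n, r(n)

# Example: A = 1, V = 1, c = 0.01 gives eta = sqrt(100) - 1 = 9, so n* = 9.
print(n_star_normal(1.0, 1.0, 0.01))
```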
§ 1.2. An Empirical Bayes Decision Problem with Random Sample Size Components

When a statistical decision problem occurs repeatedly and independently with the same unknown prior G, one can apply an empirical Bayes approach where G is estimated using data collected from previous repetitions and a Bayes rule with respect to the estimated G is used in the current component problem. The empirical Bayes decision approach with our component permits data accumulated over past component problems to be used in selecting both the sample size and the decision rule to be used in the current component problem. The generality introduced by allowing sample sizes that are determined stochastically makes the result more useful in applications where, typically, the choice of a sample size is an option and based on past data.

We impose the usual assumption that the parameter sequence (θ_1, θ_2,...) consists of independent G-distributed parameters, where G is an unknown element of the known class of distributions 𝒢. The sample size N_i and the decision rule d_i for the components are determined in an evolutionary way. The sample size N_1 and the decision rule d_1 ∈ D_{N_1} used in the first component are given nonrandom choices. The sample size N_2 and the decision rule d_2 are functions of X^1 = (X_{11},...,X_{1N_1}), the observations in the first component. The sample size N_3 and the decision rule d_3 are functions of (X^1, X^2). In general, N_i is an integer-valued function of (X^1, X^2,...,X^{i-1}) and, given N_i, d_i is a D_{N_i}-valued function of (X^1, X^2,...,X^{i-1}).

Let N = (N_1, N_2,...) and d = (d_1, d_2,...). We will be concerned with the risk behavior of empirical Bayes procedures (N, d). (Here and henceforth, the term risk will refer to the expected loss plus cost for observations.) The risk for the decision about θ_i is

(1.11)  E r_{N_i}(G, d_i) = E R_{N_i}(G, d_i) + c E N_i,

where E denotes the expectation over the earlier observations X^1, X^2,...,X^{i-1}.

Definition 1.1. If the empirical Bayes procedure (N, d) possesses the property

(1.12)  lim_{i→∞} E r_{N_i}(G, d_i) = r(G)  for all G ∈ 𝒢,

we say it is asymptotically optimal (a.o.). This means that in the limit the empirical Bayes procedure has the best possible risk behavior, i.e., achieves minimum Bayes risk.

For a variety of models, we will construct empirical Bayes rules that are asymptotically optimal. All of our results concern parametric families of priors, 𝒢 = {G_ω : ω ∈ Ω}, where Ω is a specified subset of a finite-dimensional Euclidean space R^p. Families of conjugate priors will be used as the parametric families of priors. We will identify G_ω by ω and replace G accordingly in formulas for risk, etc. Also, we will use the empirical Bayes approach wherein the prior ω is estimated, say by ω̂, and n̂ = n*(ω̂) and d_ω̂ are used in defining the empirical Bayes procedure. (Note that we have dropped the superscript on d_ω̂^n̂.) The following table shows how the empirical Bayes procedure evolves using estimates ω̂_0 arbitrary, ω̂_1 = ω̂_1(X^1), ω̂_2 = ω̂_2(X^1, X^2), ω̂_3 = ω̂_3(X^1, X^2, X^3),.... The θ_1, θ_2,... are i.i.d. G_ω.

Table 1.1. Empirical Bayes Procedure with Stochastically Determined Sample Sizes

Stage  Parameter  Sample Size     Decision Rule   Observation  Estimated Prior     Risk
1      θ_1        N_1 = n*(ω̂_0)   d_1 = d_{ω̂_0}   X^1          ω̂_1(X^1)            E{L(θ_1, d_1(X^1)) + cN_1}
2      θ_2        N_2 = n*(ω̂_1)   d_2 = d_{ω̂_1}   X^2          ω̂_2(X^1, X^2)       E{L(θ_2, d_2(X^2)) + cN_2} = E r_{N_2}(ω, d_2)
3      θ_3        N_3 = n*(ω̂_2)   d_3 = d_{ω̂_2}   X^3          ω̂_3(X^1, X^2, X^3)  E{L(θ_3, d_3(X^3)) + cN_3} = E r_{N_3}(ω, d_3)

The convergence of the sequence of risks in the last column to the smallest possible risk r(ω) = r_{n*(ω)}(ω) is the asymptotic optimality property. The following remark shows how asymptotic optimality implies the convergence of the sample sizes N_i to the set of optimal fixed sample sizes.

Remark 1.1. Let s(ω) denote the set of integer minimizers of r_n(ω).
(a) If (N, d) is asymptotically optimal at ω, then

(1.13)  P(N_i ∈ s(ω)) → 1 as i → ∞.

(b) If r_{N_i}(ω, d_i) → r(ω) a.s., then

(1.14)  P(N_i ∈ s(ω) eventually) = 1.

Proof. For given ω, there exists an ε > 0 such that for all n' ∉ s(ω), r_{n'}(ω, d) - r(ω) ≥ ε for all d ∈ D_{n'}. On the event N_i ∉ s(ω), r_{N_i}(ω, d_i) - r(ω) ≥ ε, so that

E[r_{N_i}(ω, d_i)] - r(ω) ≥ ε P(N_i ∉ s(ω)),

which yields (1.13) by letting i → ∞. Since {N_i ∉ s(ω) i.o.} implies r_{N_i}(ω, d_i) - r(ω) ≥ ε i.o., (1.14) is proved. □

The following lemma will be used in subsequent chapters in establishing the asymptotic optimality property.

Lemma 1.1. For priors ω and ν, let n = n*(ω), m = n*(ν) be optimal fixed sample sizes and let d_ω^k, d_ν^k ∈ D_k denote Bayes decision rules with respect to ω, ν for k = 1, 2,.... Then

(1.15)  0 ≤ r_m(ω, d_ν^m) - r(ω) ≤ sup_k |R_k(ω, d_ν^k) - R_k(ν, d_ν^k)| + sup_k |R_k(ω, d_ω^k) - R_k(ν, d_ω^k)|.

Proof. The left inequality follows from the fact that r(ω) is the minimum Bayes risk over choices d ∈ D_k and sample sizes k. Adding and subtracting r_m(ν, d_ν^m) and noting that r_m(ν, d_ν^m) ≤ r_n(ν, d_ω^n) yields

(1.16)  r_m(ω, d_ν^m) - r_n(ω, d_ω^n) ≤ r_m(ω, d_ν^m) - r_m(ν, d_ν^m) + r_n(ν, d_ω^n) - r_n(ω, d_ω^n),

which together with (1.4) implies the right inequality of (1.15). □

In Chapters 2 and 3 we develop a.o. empirical Bayes procedures for squared error loss estimation and linear loss testing with a binomial component. Here the family of priors is the beta family. In Chapter 2 we give the results of computer simulations that provide estimates of risk behavior for small to moderate i. In Chapters 4 and 5 we treat the two loss functions in a normal component with normal priors.

The quadratic loss function L(θ, a) = b(θ - a)^2, where b > 0, is covered by our results by factoring b out and replacing c by c/b. Similarly, the linear loss function for testing with slopes -b and b for its arms is covered by our work. Our methods cover more general cost functions as well. If the cost function is c(n) and lim inf c(n) > R_1(G), then for any given G, inf {r_n(G) | n = 1, 2,...} is attained, and we can define n*(G) as the smallest minimizer. Moreover, the proof of Lemma 1.1 applies to give the same conclusion, that is, a bound for excess risk in terms of the supremum of differences in decision risks over varying sample size problems.
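The evolution displayed in Table 1.1 can be summarized as a loop: estimate the prior from the components observed so far, choose the next sample size by n*, and apply the corresponding Bayes rule. The following schematic Python sketch (ours; every function passed in is a hypothetical placeholder whose details depend on the component model) is included only to make that loop explicit.

```python
def run_empirical_bayes(omega0, n_star, bayes_rule, update_estimate,
                        draw_theta, draw_component, loss, cost, stages):
    """Schematic driver for the procedure of Table 1.1 (a sketch, not the
    author's code).  All callables are user-supplied placeholders:
      n_star(omega_hat)            -> sample size for the next component
      bayes_rule(omega_hat, x)     -> action for the current component
      update_estimate(components)  -> omega_hat based on X^1,...,X^i
      draw_theta()                 -> theta_i ~ G (unknown in practice)
      draw_component(theta, n)     -> observations X_{i1},...,X_{in}
    Returns the realized loss plus cost for each component."""
    components, omega_hat, totals = [], omega0, []
    for i in range(stages):
        N = n_star(omega_hat)                      # N_{i+1} = n*(omega_hat_i)
        theta = draw_theta()
        x = draw_component(theta, N)
        action = bayes_rule(omega_hat, x)          # d_{omega_hat_i}(X^{i+1})
        totals.append(loss(theta, action) + cost * N)
        components.append(x)
        omega_hat = update_estimate(components)    # omega_hat_{i+1}(X^1,...,X^{i+1})
    return totals
```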
§ 1.3. Literature Review

In the usual empirical Bayes decision problem we are given a stochastic process (θ_1, X_1), (θ_2, X_2),... of independent and identically distributed random vectors with the interpretation that, at the i-th component problem, the observation X_i has distribution P_θ given the parameter θ_i = θ, and θ_1, θ_2,... are i.i.d. with a fixed but unknown prior distribution G in a family of distributions 𝒢. The datum X_i may be a vector of summary statistics for the observations taken at the i-th component, e.g., the sample mean or other sufficient statistic based on a sample of specified size taken at that stage. The family of priors 𝒢 can be an unspecified subfamily of all priors on Θ or a certain parametric family, like conjugate priors. Morris (1983) uses the terminology nonparametric empirical Bayes (NPEB) for the former case and parametric empirical Bayes (PEB) for the latter case. Morris (1983) indicates that PEB is needed to deal with those cases in which the number of component problems is too small for the NPEB theory to approximate well.

Robbins (1951, 1956, 1964) introduced the empirical Bayes problem. Most of his work, and that which followed Robbins, is NPEB. It has mainly concerned constructing procedures in a variety of situations that are asymptotically optimal, i.e., such that lim_i R_k(G, d_i) = R_k(G) for all G ∈ 𝒢. Here k indicates the common sample size taken at each component and on which both the Bayes and empirical Bayes procedures d_G and d_i are based.

Two different approaches have been used in constructing empirical Bayes procedures. The first one is to estimate G from data accumulated from previous component problems and then to construct a Bayes procedure with respect to the estimated G. The second approach is to estimate the Bayes procedure d_G with respect to G directly using data from previous component problems, without estimating G itself. The first approach gives smoother procedures since the decision rules will be conditionally component Bayes.

O'Bryan (1972, 1976) introduced the nonparametric empirical Bayes decision problem with non-i.i.d. components by allowing unequal nonrandom sample sizes in the component problems. He followed the second approach in the situation that P_θ is in the discrete exponential family. O'Bryan (1976) defined asymptotic optimality for his case, which is necessarily more general than that of Robbins (1951), and showed the asymptotic optimality of his procedure. O'Bryan and Susarla (1975) studied the empirical Bayes decision problem with nonidentical components in which P_θ is normal with mean θ and known variance which is changing from component to component.

Laippala (1985), whose work is motivated by O'Bryan (1976), introduced an empirical Bayes problem with nonidentical components with cost for observations and random "floating" sample sizes for the components. Laippala (1985) defines the "optimal" sample size as

i_G = [inf {n | r_n(G) ≤ r_{n+1}(G)}] ∧ ī,

where ī is a given fixed integer. This is not optimal among the set of all fixed sample sizes since for all G ∈ 𝒢, r_{i_G}(G) ≥ r_{n*}(G), and for some G ∈ 𝒢 it is possible to have r_{i_G}(G) > r_{n*}(G). Laippala (1979) defines a floating optimal sample size î_{n+1} for use at the (n+1)-th component problem which is a function of the observations from the previous n components as well as the current observations.
It is pointed out in Gilliland and Karunamuni (1988) that this rule is not necessarily optimal when ī ≥ 3 and that the first line of the proof of Theorem 1 in Laippala (1985), claiming that the estimated sample size converges in probability to i_G, neglects the boundary set on which the convergence may fail. Laippala's results as claimed are nonparametric in the sense of Morris (1983).

The component problems that we will consider involve squared error loss estimation and linear loss testing. Many authors have considered the empirical Bayes problem with independent and identical repetitions of these components following Robbins (1956, 1964). Morris (1983) and Susarla (1982) give general discussions. Singh (1979) provides results on squared error loss estimation problems. Van Ryzin and Susarla (1977) and Gilliland and Hannan (1977) develop the theory for monotone multiple decision problems extending the results for linear loss testing of Johns and Van Ryzin (1971, 1972).

All empirical Bayes work cited above involves identical components with the exception of the varying sample size work of O'Bryan and Laippala. The variant of O'Bryan and Susarla (1975) has a linear loss component with a translated exponential distribution, with the scale parameter known and changing from component to component. Karunamuni (1985, 1988) and Gilliland and Karunamuni (1988) consider the possibility of varying stochastic sample sizes. Gilliland and Karunamuni (1988) develop the theory for finite state problems. Karunamuni (1985, 1988) studies an empirical Bayes problem with a sequential component with linear loss and multiple decision loss structures. He does not treat the optimal fixed sample size problem. Rather, assuming a consistent estimator for G, he shows that the risk of an empirical Bayes one-step sequential decision procedure converges to the Bayes risk attained by the one-step look ahead sequential decision procedure. This is not the asymptotic optimality defined by Robbins (1956).

CHAPTER 2

ESTIMATION OF THE BINOMIAL PARAMETER

§ 2.1. The Component Problem

Suppose that the rate θ at which defectives are produced by a given production process varies from day to day. On each day a random sample of at least two parts is taken at a cost of $.50 per part and an estimate θ̂ is made with loss $1000(θ̂ - θ)^2. If the sequence θ_1, θ_2,... is modeled as a stochastic sequence of independent and identically G-distributed variables with G unknown, then the empirical Bayes method is appropriate. For the case where G is restricted to the Beta(α, β) family and the sampling is two-at-a-time, we show how to construct a decision procedure with risk plus cost for observations converging to the lowest possible risk, whatever be α and β. In Section 2.3 we find that in this case the envelope risk plus cost is no greater than $18.00 per day, the minimax risk plus cost. Against the least favorable α = β = 2, the empirical Bayes risk is estimated to be below $20.00 after 15 days. The empirical Bayes sample size converges to the optimal 8 × 2 = 16 parts here. Other α, β values are tested in the computational work of Section 2.3. In this section and the next we develop the empirical Bayes procedure and prove its asymptotic optimality.

Let X_1, X_2,... be i.i.d. B(m, θ), where m is a given positive integer and the parameter θ has prior distribution G in the beta family 𝒢 = {Beta(α, β) : α > 0, β > 0}. Estimation of θ is considered for squared error loss. Here Θ = 𝒜 = [0, 1]. Let c > 0 be a constant cost per observation.
Let d ∈ D_n be a decision rule based on the observation X_n = (X_1,...,X_n). The decision loss plus cost for observation is given by [θ - d(X_n)]^2 + cn. The marginal distribution of X_i is Beta-Binomial. We let ξ and η denote the first two moments of G = Beta(α, β), that is,

(2.1)  ξ = E_G θ = α/(α + β),    η = E_G θ^2 = α(α + 1)/[(α + β)(α + β + 1)],

and note that 0 < ξ^2 < η < ξ < 1 for α > 0, β > 0. Also,

(2.2)  E(X_i) = mξ,    E(X_i^2) = m(ξ - η) + m^2 η,

and from (2.1) it follows that

(2.3)  α = ξ(ξ - η)/(η - ξ^2),    β = (1 - ξ)(ξ - η)/(η - ξ^2).

In the empirical Bayes application, (2.2) and (2.3) will be useful in the construction of consistent estimators for α and β. We will use the method of moments to obtain estimates of ξ and η and will use (2.3) to obtain estimates for the parameters α and β.

A Bayes rule exists and is given by the posterior mean of θ, given X_n. The posterior distribution of θ, given X_n, is Beta(α + nX̄_n, β + mn - nX̄_n), where X̄_n denotes the average of X_1,...,X_n. Hence, a Bayes rule d_G ∈ D_n is

(2.4)  d_G(X_n) = (α + nX̄_n)/(α + β + mn)

if G = Beta(α, β).

Remark 2.1. For G = Beta(α, β) and G' = Beta(α', β'),

(2.5)  R_n(G, d_{G'}) = [1/(α' + β' + mn)^2] {[(α' + β')^2 - mn]η - [2α'(α' + β') - mn]ξ + (α')^2},

(2.6)  |R_n(G, d_{G'}) - R_n(G', d_{G'})| ≤ 2|ξ - ξ'| + |η - η'|,

and

(2.7)  R_n(G) = αβ/[(α + β)(α + β + 1)(α + β + mn)].

Proof. Using (2.4) for G', we see that

R_n(G, d_{G'}) = E_G E_θ [θ - (α' + nX̄_n)/(α' + β' + mn)]^2
             = E_G E_θ [ ((α' + β')θ - α')/(α' + β' + mn) - (n/(α' + β' + mn))(X̄_n - mθ) ]^2
             = [1/(α' + β' + mn)^2] { E_G[(α' + β')θ - α']^2 + n^2 E_G E_θ (X̄_n - mθ)^2 }.

Using (2.1),

E_G[(α' + β')θ - α']^2 = (α' + β')^2 η - 2α'(α' + β')ξ + (α')^2

and

n^2 E_G E_θ (X̄_n - mθ)^2 = n^2 E_G[mθ(1 - θ)/n] = mn(ξ - η).

Hence

R_n(G, d_{G'}) = [1/(α' + β' + mn)^2] {(α' + β')^2 η - 2α'(α' + β')ξ + (α')^2 + mn(ξ - η)}
             = [1/(α' + β' + mn)^2] {[(α' + β')^2 - mn]η - [2α'(α' + β') - mn]ξ + (α')^2},

which proves (2.5). Letting G' = G in (2.5) and using (2.1) leads to (2.7). Finally, from (2.5) we obtain (2.6) since

|R_n(G, d_{G'}) - R_n(G', d_{G'})| = [1/(α' + β' + mn)^2] |[(α' + β')^2 - mn](η - η') - [2α'(α' + β') - mn](ξ - ξ')|

and

|(α' + β')^2 - mn| / (α' + β' + mn)^2 ≤ [(α' + β')^2 + mn] / (α' + β' + mn)^2 ≤ 1,
|2α'(α' + β') - mn| / (α' + β' + mn)^2 ≤ [2α'(α' + β') + mn] / (α' + β' + mn)^2 ≤ 2.  □

From (2.7), the minimum Bayes risk including cost for observations is

(2.8)  r_n(G) = αβ/[(α + β)(α + β + 1)(α + β + mn)] + cn.

We seek the optimal sample size n*. r_n(G) is a continuous and convex function of real n > -(α + β)/m. Consider the equation

0 = (d/dn) r_n(G) = -[mαβ/((α + β)(α + β + 1))] (α + β + mn)^{-2} + c.

Its larger solution is

(2.9)  ν = { [mαβ/(c(α + β)(α + β + 1))]^{1/2} - (α + β) } / m,

and an optimal fixed sample size n* = n*(α, β) is given by

(2.10)  n* =  1,                 if ν < 1
              ν,                 if ν ∈ {1, 2, 3,...}
              [ν] or [ν] + 1, depending on which integer minimizes r_n(G),  otherwise.

Here [·] denotes the greatest integer function and we take n* = [ν] if both [ν] and [ν] + 1 minimize r_n(G). By the comment preceding Example 1.1 and the fact that R_1(G) ≤ .25 for all G, it follows that n* ≤ (.25 + c)/c for all G.

If α and β were known constants, we could use d_G ∈ D_{n*} to achieve the minimum Bayes risk, i.e., r(G) = min {r_n(G) | n = 1, 2,...}. In the next section we show how (α, β) is estimated in the empirical Bayes problem with this component and establish the asymptotic optimality of the resulting procedure.
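The optimal fixed sample size (2.10) and the envelope risk are easily evaluated. The following Python sketch (added for illustration; it is not the program that produced the tables of Section 2.3, and the names are illustrative) does so directly from (2.8)-(2.10).

```python
import math

def envelope_binomial_estimation(alpha, beta, m, c):
    """Optimal fixed sample size n*(alpha, beta) of (2.10) and the envelope
    risk r(alpha, beta) for the squared error loss component of Section 2.1.
    Returns (n*, r)."""
    def r(n):
        # r_n(G) = alpha*beta / [(alpha+beta)(alpha+beta+1)(alpha+beta+mn)] + cn, see (2.8)
        s = alpha + beta
        return alpha * beta / (s * (s + 1) * (s + m * n)) + c * n
    s = alpha + beta
    nu = (math.sqrt(m * alpha * beta / (c * s * (s + 1))) - s) / m     # (2.9)
    if nu < 1:
        return 1, r(1)
    lo = math.floor(nu)
    n = lo if r(lo) <= r(lo + 1) else lo + 1                           # (2.10)
    return n, r(n)

# alpha = beta = 2, m = 2, c = 0.001 gives n* = 8 and 1000*r = 18.00,
# the least favorable case quoted in Section 2.1 and tabulated in Table 2.1.
n, risk = envelope_binomial_estimation(2.0, 2.0, 2, 0.001)
print(n, 1000 * risk)
```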
§ 2.2. An Empirical Bayes Decision Procedure

Consider the binomial component problem of the last section. Let α̂_0, β̂_0 be initial nonrandom estimates of α, β and let N_1 = n*(α̂_0, β̂_0) be the sample size chosen for the first component. (See (2.10) for the definition of the optimal fixed sample size function n*.) Recall that X^1 = (X_{11}, X_{12},...,X_{1N_1}) denotes the vector of observations from the first component. We will define a sequence of estimates α̂_i, β̂_i based on (X^1, X^2,...,X^i). Then for component i + 1, the empirical Bayes sample size is N_{i+1} = n*(α̂_i, β̂_i) and the empirical Bayes estimator of θ_{i+1} is

(2.11)  d_{i+1}(X^{i+1}) = (α̂_i + N_{i+1} Y_{i+1}) / (α̂_i + β̂_i + m N_{i+1}),   i = 0, 1,...

(see (2.4)), where

(2.12)  Y_i = (1/N_i) Σ_{j=1}^{N_i} X_{ij},   i = 1, 2,....

We will give estimates based on the method of moments and will find it useful to consider

(2.13)  Z_i = (1/N_i) Σ_{j=1}^{N_i} X_{ij}^2,   i = 1, 2,...,

and to denote the averages of Y_j, Z_j, j = 1, 2,...,i, by Ȳ_i, Z̄_i, i = 1, 2,....

Let ℱ_0 be the trivial σ-field and let ℱ_j = σ(X^1, X^2,...,X^j), j = 1, 2,.... The sample size N_j is ℱ_{j-1} measurable, j = 1, 2,..., and we see that

(2.14)  E(Y_j | ℱ_{j-1}) = mξ,    E(Z_j | ℱ_{j-1}) = m(ξ - η) + m^2 η,   j = 1, 2,...,

follow from (2.2). Since Y_j ≤ m and Z_j ≤ m^2, j = 1, 2,..., the strong law for centerings at conditional expectations (see Hall and Heyde (1980, Theorem 2.19)) implies

(2.15)  (1/i) Σ_{j=1}^{i} [Y_j - E(Y_j | ℱ_{j-1})] → 0 a.s.,    (1/i) Σ_{j=1}^{i} [Z_j - E(Z_j | ℱ_{j-1})] → 0 a.s.

From (2.14) and (2.15) we have

(2.16)  Ȳ_i → mξ a.s.,    Z̄_i → m(ξ - η) + m^2 η a.s.

Lemma 2.1. Let m ≥ 2. The estimators defined for i = 1, 2,... by

(2.17)  ξ̂_i = Ȳ_i / m,    η̂_i = (Z̄_i - Ȳ_i)/(m^2 - m)

and

(2.18)  α̂_i = [ξ̂_i (ξ̂_i - η̂_i)/(η̂_i - ξ̂_i^2)]^+,    β̂_i = [(1 - ξ̂_i)(ξ̂_i - η̂_i)/(η̂_i - ξ̂_i^2)]^+

are a.s. consistent. (In (2.18) take ratios 0/0 to be 0.)

Proof. The a.s. convergence of the estimates (2.17) follows from (2.16). The a.s. convergence of the estimates (2.18) then follows from (2.3). □

Refer to Table 1.1. Let ω = (α, β), let ω̂_0 be arbitrary and let ω̂_i = (α̂_i, β̂_i) be defined by (2.18). Let the sample size sequence N be defined by N_{i+1} = n*(α̂_i, β̂_i), i = 0, 1,..., where n* is defined by (2.10). Let the empirical Bayes decision rules d be defined by (2.11).

Theorem 2.1. Let m ≥ 2. The empirical Bayes procedure (N, d) defined above is asymptotically optimal at each G = (α, β).

Proof. By Lemma 1.1 and (2.6),

(2.19)  0 ≤ E r_{N_{i+1}}(G, d_{i+1}) - r(G) ≤ 4E|ξ̂_i - ξ| + 2E|η̂_i - η|.

Since |ξ̂_i - ξ| ≤ 1 and |η̂_i - η| ≤ 2 for all i, the DCT, Lemma 2.1 and (2.19) imply that E r_{N_{i+1}}(G, d_{i+1}) → r(G). □

Remark 2.2. In the component problem under consideration in this chapter and the next, the marginal distribution of a single observation is Beta-Binomial with parameters m, α, β. If m = 1, this is Binomial(1, α/(α + β)) and the pair (α, β) is not identified. Our method of estimation in the empirical Bayes version requires that m ≥ 2. This assumption can be removed if we require that the N_i ≥ 2 and use estimators based on pooled data. Requiring N_i ≥ 2 i.o. would suffice, but details of these variations will not be presented. In Chapters 4 and 5 we optimize sample size over n ≥ 2 for the purpose of simplifying the problem of estimating the prior.
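The estimators (2.17)-(2.18) and the empirical Bayes estimate (2.11) are simple to implement. A short Python sketch (ours, for illustration only; the function names are hypothetical) follows.

```python
def beta_parameter_estimates(components, m):
    """Method-of-moments estimates (2.17)-(2.18) of (alpha, beta) from the
    components X^1,...,X^i; `components` is a list of lists of observations,
    each value in {0,...,m}.  A sketch, not the author's program."""
    i = len(components)
    Ybar = sum(sum(x) / len(x) for x in components) / i                    # average of the Y_j of (2.12)
    Zbar = sum(sum(v * v for v in x) / len(x) for x in components) / i     # average of the Z_j of (2.13)
    xi = Ybar / m                                                          # (2.17)
    eta = (Zbar - Ybar) / (m * m - m)
    denom = eta - xi * xi
    if denom == 0.0:
        return 0.0, 0.0                             # the convention 0/0 := 0 of (2.18)
    alpha = max(xi * (xi - eta) / denom, 0.0)       # positive parts as in (2.18)
    beta = max((1.0 - xi) * (xi - eta) / denom, 0.0)
    return alpha, beta

def eb_estimate(alpha_hat, beta_hat, x, m):
    """Empirical Bayes estimate (2.11) of theta for the current component
    with observations x."""
    n = len(x)
    return (alpha_hat + sum(x)) / (alpha_hat + beta_hat + m * n)
```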
§ 2.3. Some Empirical Bayes Risk Calculations

In this section we treat the empirical Bayes problem of the last section. All risks are multiplied by 1000, which corresponds to a component with loss function 1000(a - θ)^2 and cost 1000c per observation. We have calculated the envelope risk r(α, β) and the optimal sample size(s) for various m, c, α, and β and present some of the results in Table 2.1. We have included the mean and standard deviation of the Beta(α, β) prior in each case. Figure 2.1 below is a graph of the envelope risk function r(α, α) plotted against α on a log scale. For this we have chosen m = 2 and c = .001.

Figure 2.1. A Risk Envelope. [Graph of the envelope risk 1000 r(α, α) against α on a log scale from .001 to 1000; the risk scale runs from 1.0 to 21.0. Here m = 2 and c = .001.]

Table 2.1. n*(α, β) and r(α, β)

  Prior                        c = .001        c = .002        c = .001        c = .002
  α     β     mean   s.d.      n*      r       n*      r       n*      r       n*      r
  0.1   0.1   0.50   0.456      4   9.081       3  12.720       4   7.415       3  10.529
  0.1   0.3   0.25   0.366      5  10.151       3  14.371       4   8.320       3  11.699
  0.1   0.9   0.10   0.212      4   9.000       3  12.429       4   7.462       2  10.429
  0.1   1.9   0.05   0.126      3   6.958       2   9.278       3   5.879       2   7.958
  0.2   0.2   0.50   0.423      6  11.760       4  16.503       5   9.638       3  10.529
  0.2   0.6   0.25   0.323      6  12.510       4  17.470       3  10.274       3  11.699
  0.2   1.2   0.14   0.226      5  11.266       4  15.599       4   9.330       2  10.429
  0.2   1.8   0.10   0.173      4  10.000       3  13.500       4   8.286       3  11.455
  0.3   0.3   0.50   0.395      7  13.421       5  18.844       5  11.010       4  15.440
  0.3   0.6   0.33   0.342      7  14.065       5  19.657       6  11.569       4  16.160
  0.3   1.2   0.20   0.253      6  13.111       4  18.105       5  10.818       4  15.111
  0.3   1.8   0.14   0.199      5  11.855       4  16.213       5   9.851       3  13.473
  0.5   0.5   0.50   0.354      7  15.333       5  21.364       6  12.579       4  17.615
  0.5   1.0   0.33   0.298      7  15.602       5  21.594       6  12.838       4  17.877
  0.5   1.5   0.25   0.250      7  14.812       5  20.417       6  12.250       4  16.929
  1.0   1.0   0.50   0.289      8  17.259       5  23.889       7  14.246       5  19.804
  1.0   1.5   0.40   0.262      8  17.266       5  23.714       7  14.295       5  19.796
  1.0   2.0   0.33   0.236      8  16.772       5  22.821       6  13.937       4  19.111
  1.5   1.5   0.50   0.250      8  17.868       5  24.423       7  14.813       5  20.417
  1.5   2.0   0.43   0.233      8  17.768       5  24.109       7  14.775       4  20.289
  2.0   2.0   0.50   0.224      8  18.000       5  24.286       7  15.000       4  20.500
  3.0   3.0   0.50   0.189      7  17.714       4  23.306       6  14.929       4  19.905
  4.0   4.0   0.50   0.167      7  17.101       3  21.873       6  14.547       3  19.072
  5.0   5.0   0.50   0.151      6  16.331       3  20.205       5  14.091       3  17.962
  10.0  10.0  0.50   0.109      1  11.823       1  12.823       2  11.158       1  12.352

For m = 2, c = .001 and selected α, β values, we have made Monte Carlo estimates of the empirical Bayes risk of our procedure with initial starting estimates α̂_0 = β̂_0 = 1. This is done for stages i = 10, 15, 20, 25, 50 and 100, and the results are presented in Table 2.2 along with the standard errors of the estimates.

Table 2.2. Estimated Empirical Bayes Risks (m = 2, c = .001)
(standard errors in parentheses)

  α     β       i = 10     i = 15     i = 20     i = 25     i = 50     i = 100    Envelope Risk
  0.1   0.1     10.22       9.83      10.13      10.00       9.28       9.13        9.081
               (0.18)     (0.07)     (0.14)     (0.14)     (0.05)     (0.01)
  0.5   0.5     17.31      15.97      15.68      15.56      15.40      15.37       15.333
               (0.67)     (0.10)     (0.05)     (0.03)     (0.01)     (0.00)
  1.0   1.0     21.27      19.05      18.26      18.05      17.41      17.32       17.259
               (0.73)     (0.43)     (0.25)     (0.28)     (0.02)     (0.00)
  2.0   2.0     21.26      19.67      19.89      19.44      19.09      18.27       18.000
               (0.43)     (0.25)     (0.30)     (0.25)     (0.20)     (0.04)
  3.0   3.0     20.43      19.73      19.36      19.75      18.73      18.47       17.714
               (0.28)     (0.24)     (0.21)     (0.25)     (0.14)     (0.17)
  4.0   4.0     19.98      19.34      19.05      18.95      18.66      18.10       17.101
               (0.29)     (0.19)     (0.16)     (0.16)     (0.15)     (0.12)
  0.1   0.9     12.25      12.58      13.12      13.05      10.69       9.41        9.000
               (0.27)     (0.34)     (0.42)     (0.44)     (0.31)     (0.31)
  0.2   1.8     12.79      13.34      13.24      13.28      12.38      10.86       10.000
               (0.19)     (0.24)     (0.29)     (0.29)     (0.28)     (0.17)
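A Monte Carlo estimate of the kind reported in Table 2.2 can be organized as below. This Python sketch is ours, reuses the hypothetical helpers sketched in Sections 2.1 and 2.2, and is not the program that produced Table 2.2.

```python
import random

def simulate_eb_risk(alpha, beta, m, c, stages, reps, seed=0):
    """Monte Carlo estimate of the empirical Bayes risk (decision loss plus
    cost), scaled by 1000 as in Section 2.3, at stages 1,...,stages.
    Uses envelope_binomial_estimation, beta_parameter_estimates and
    eb_estimate from the earlier sketches; initial estimates
    alpha_0 = beta_0 = 1 as for Table 2.2.  A sketch only."""
    rng = random.Random(seed)
    totals = [0.0] * stages
    for _ in range(reps):
        components, a_hat, b_hat = [], 1.0, 1.0
        for i in range(stages):
            n, _ = envelope_binomial_estimation(a_hat, b_hat, m, c)   # N_{i+1} = n*(a_hat, b_hat)
            theta = rng.betavariate(alpha, beta)                      # theta_{i+1} ~ Beta(alpha, beta)
            x = [sum(rng.random() < theta for _ in range(m)) for _ in range(n)]
            d = eb_estimate(a_hat, b_hat, x, m)                       # (2.11)
            totals[i] += (d - theta) ** 2 + c * n
            components.append(x)
            a_hat, b_hat = beta_parameter_estimates(components, m)    # (2.18)
            if a_hat <= 0.0 or b_hat <= 0.0:
                # our choice: fall back to the arbitrary initial estimates when
                # (2.18) degenerates to zero in the early stages
                a_hat, b_hat = 1.0, 1.0
    return [1000.0 * t / reps for t in totals]

# e.g. simulate_eb_risk(2.0, 2.0, m=2, c=0.001, stages=100, reps=200)
```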
CHAPTER 3

TESTING THE BINOMIAL PARAMETER

§ 3.1. The Component Problem

In connection with the estimation problem for the binomial parameter θ presented in Chapter 2, we consider a testing problem concerning the value of θ in B(m, θ), where m ≥ 2 is a given integer. As in Chapter 2, we assume the conjugate prior G = Beta(α, β) for the binomial parameter θ and a constant cost c > 0 per observation. The hypothesis to be tested is H_0: θ ≤ θ_0 against H_1: θ > θ_0 for a given θ_0 ∈ Θ = [0, 1]. Thus the action space 𝒜 consists of two actions a_0 and a_1, where a_0 = "accept H_0" and a_1 = "reject H_0". We assume the linear loss function L(·,·) on Θ × 𝒜:

(3.1)  L(θ, a_0) = (θ - θ_0)^+,    L(θ, a_1) = (θ_0 - θ)^+.

Conveniently, L(θ, a_0) - L(θ, a_1) = θ - θ_0.

Let X_1,...,X_n be i.i.d. P_θ, the distribution B(m, θ), with support 𝒳 = {0, 1,...,m}. Then P_θ^n, the joint distribution of X = (X_1,...,X_n), has support 𝒳^n. Let Δ_n denote the set of all nonrandomized decision functions

(3.2)  δ: 𝒳^n → {0, 1}.

When x ∈ 𝒳^n is observed, we take action a_{δ(x)} and thereby incur the loss

(3.3)  L(θ, a_{δ(x)}) = L(θ, a_0) - δ(x)[L(θ, a_0) - L(θ, a_1)] = L(θ, a_0) - δ(x)(θ - θ_0).

The Bayes risk of δ ∈ Δ_n at G is

(3.4)  R_n(G, δ) = E L(θ, a_{δ(X)}),

where E denotes the expectation with respect to the joint distribution of (θ, X). Using (3.3), we can write

(3.5)  R_n(G, δ) = ∫_{θ_0}^{1} (θ - θ_0) dG(θ) - Σ_{x ∈ 𝒳^n} δ(x) ∫_{0}^{1} (θ - θ_0) p_θ(x) dG(θ).

We see that in (3.5),

(3.6)  ∫_{0}^{1} (θ - θ_0) p_θ(x) dG(θ) = [E_G(θ | x) - θ_0] p(x),

where p_θ is the conditional mass function for X and E_G(θ | X) is the Bayes estimate of θ based on X defined by (2.4):

d_G(X) = E_G(θ | X) = (α + (X_1 + ... + X_n)) / (α + β + mn).

Thus (3.5) can be written as

(3.7)  R_n(G, δ) = ∫_{θ_0}^{1} (θ - θ_0) dG(θ) - Σ_{x ∈ 𝒳^n} δ(x)[d_G(x) - θ_0] p(x),

where p denotes the marginal mass function for X. Since δ(X) takes values 0 or 1, it is clear from (3.7) that R_n(G, δ) is minimized by taking

(3.8)  δ_G(x) = 1 if d_G(x) ≥ θ_0, and δ_G(x) = 0 otherwise,

which is a Bayes decision function with respect to G. From (3.8), we observe that a Bayes test δ_G ∈ Δ_n is determined by comparing a Bayes estimate d_G ∈ D_n with θ_0 for each n = 1, 2,.... This observation is useful in that an empirical Bayes test δ_n can be obtained from the empirical Bayes estimate d_n defined in Chapter 2.

Remark 3.1. Let g, g' be densities of G = Beta(α, β) and G' = Beta(α', β'). Then we have

(3.9)  |R_n(G, δ_{G'}) - R_n(G', δ_{G'})| ≤ 2 ∫_{0}^{1} |g'(θ) - g(θ)| dθ

and

(3.10)  |R_n(G, δ_{G'}) - R_n(G)| ≤ 2 ∫_{0}^{1} |g'(θ) - g(θ)| dθ

for all n = 1, 2,....

Proof. For δ = δ_{G'} in (3.7),

(3.11)  R_n(G, δ_{G'}) = ∫_{θ_0}^{1} (θ - θ_0) g(θ) dθ - Σ_{x ∈ 𝒳^n} δ_{G'}(x)[d_G(x) - θ_0] p(x).

Letting G = G' in (3.11) leads to

(3.12)  R_n(G', δ_{G'}) = ∫_{θ_0}^{1} (θ - θ_0) g'(θ) dθ - Σ_{x ∈ 𝒳^n} δ_{G'}(x)[d_{G'}(x) - θ_0] p'(x).

By subtraction,

|R_n(G, δ_{G'}) - R_n(G', δ_{G'})| ≤ ∫_{0}^{1} |θ - θ_0| |g(θ) - g'(θ)| dθ + Σ_{x ∈ 𝒳^n} | ∫_{0}^{1} (θ - θ_0) p_θ(x)[g(θ) - g'(θ)] dθ |
                                  ≤ 2 ∫_{0}^{1} |θ - θ_0| |g'(θ) - g(θ)| dθ ≤ 2 ∫_{0}^{1} |g'(θ) - g(θ)| dθ.

The second statement (3.10) follows immediately from (3.12) by changing the roles of G and G'. □

From (3.12), the Bayes decision risk of δ_G ∈ Δ_n at G = Beta(α, β) can be written as

(3.13)  R_n(G) = R_n(G, δ_G) = ∫_{θ_0}^{1} (θ - θ_0) g(θ) dθ - Σ_{x ∈ 𝒳^n} [d_G(x) - θ_0]^+ p(x),   n = 1, 2,....

We seek a minimizer of r_n(G) = R_n(G) + cn among the integers n = 1, 2,.... By the comment preceding Example 1.1 and the fact that R_1(G) ≤ 1 for all G, it follows that a minimizer n** satisfies n** ≤ (1 + c)/c for all G. We have chosen to denote the optimal sample size function for the test as n** = n**(α, β) to distinguish it from the optimal sample size n* for estimation. We do not have an explicit formula for n**, although it is easily computed for any given α, β. Thus, using the sample size n** and the test δ_G ∈ Δ_{n**}, we achieve the minimum Bayes risk r(G).
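One way to compute n**(α, β) is sketched below in Python (ours, for illustration; not the author's program). It uses the facts that the Bayes estimate (2.4) depends on x only through the total s = x_1 + ... + x_n, so the sum over 𝒳^n in (3.13) collapses to a sum over s, that marginally s has a Beta-Binomial(mn, α, β) distribution, and that the term of (3.13) not depending on n can be ignored when comparing sample sizes.

```python
import math

def betabinom_pmf(s, n, a, b):
    """P(S = s) for S ~ Beta-Binomial(n, a, b), computed via log-gamma."""
    return math.exp(math.lgamma(n + 1) - math.lgamma(s + 1) - math.lgamma(n - s + 1)
                    + math.lgamma(a + s) + math.lgamma(b + n - s) - math.lgamma(a + b + n)
                    + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

def n_star_star(alpha, beta, m, c, theta0):
    """Smallest minimizer n**(alpha, beta) of r_n(G) = R_n(G) + cn for the
    linear loss test of Section 3.1; a sketch, not the author's program."""
    n_max = int((1.0 + c) / c)        # n** <= (1 + c)/c, see the comment after (3.13)
    best_n, best_val = 1, float("inf")
    for n in range(1, n_max + 1):
        gain = sum(max((alpha + s) / (alpha + beta + m * n) - theta0, 0.0)
                   * betabinom_pmf(s, m * n, alpha, beta)
                   for s in range(m * n + 1))
        val = -gain + c * n           # r_n(G) up to the n-free integral term of (3.13)
        if val < best_val - 1e-15:
            best_n, best_val = n, val
        if c * n - 1.0 > best_val:    # the gain term is at most 1, so larger n cannot improve
            break
    return best_n
```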
§ 3.2. An Empirical Bayes Decision Procedure

In this section we consider the empirical Bayes decision problem with the linear loss testing component problem described in the last section. The prior G is assumed to be in the parametric family 𝒢 of beta priors on Θ = [0, 1]. Let G = Beta(α, β), where α, β > 0 are unknown constants. In the sequence of component problems resulting from the repetition of the component, we are given a sequence of parameters θ_1, θ_2,... which are assumed to be i.i.d. G = Beta(α, β).

Suppose that we have experienced i component problems by observing X^1 = (X_{11},...,X_{1N_1}),..., X^i = (X_{i1},...,X_{iN_i}). At the (i+1)-th component problem we will test H_0: θ_{i+1} ≤ θ_0 against H_1: θ_{i+1} > θ_0 with the linear loss function given by (3.1). Since θ_{i+1} ~ Beta(α, β) and α > 0, β > 0 are unknown, the optimal sample size n**(α, β) and the Bayes decision rule δ_G ∈ Δ_{n**(α, β)} are not directly available, so that the minimum Bayes risk r(G) cannot be achieved. However, if an estimate Ĝ_i of G is available at this stage, we estimate the optimal sample size n**(G) and the Bayes rule δ_G ∈ Δ_{n**(G)} at G by N_{i+1} = n**(Ĝ_i) and δ_{i+1} = δ_{Ĝ_i} ∈ Δ_{n**(Ĝ_i)} and, thus, define an empirical Bayes procedure (N, δ), δ = (δ_1, δ_2,...), as in Table 1.1.

For the estimates of α, β, assume m ≥ 2 and let α̂_i, β̂_i be given by (2.18). Let α̂_0, β̂_0 be arbitrary initial estimates. Then

(3.14)  N_{i+1} = n**(α̂_i, β̂_i),   i = 0, 1,...,

and

(3.15)  δ_{i+1}(x^{i+1}) = 1 if d_{i+1}(x^{i+1}) ≥ θ_0, and δ_{i+1}(x^{i+1}) = 0 otherwise,

where d_{i+1} is defined by (2.11).

Lemma 3.1. Let m ≥ 2 and let α̂_i, β̂_i be a.s. consistent estimators, e.g., as in (2.18). Let ĝ_i denote the beta density with parameters α̂_i, β̂_i and let g be the beta density with the governing parameter values α, β. Then

(3.16)  ĝ_i(θ) → g(θ), 0 < θ < 1, a.s.

Proof. At each θ, g(θ) is a continuous function of (α, β). □

Theorem 3.1. Let m ≥ 2. The empirical Bayes testing procedure (N, δ) defined by (3.14) and (3.15) is asymptotically optimal at each G = Beta(α, β).

Proof. From Lemma 1.1 and (3.10), it follows that

(3.17)  0 ≤ E r_{N_{i+1}}(G, δ_{i+1}) - r(G) ≤ 4 E ∫_{0}^{1} |ĝ_i(θ) - g(θ)| dθ.

Note that ĝ_i - g → 0 a.s. on the probability space of the empirical Bayes problem cross Lebesgue measure on (0, 1). The sequence ĝ_i + g dominates |ĝ_i - g| and converges to 2g, so by the generalized dominated convergence theorem the RHS of (3.17) converges to zero. □
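The empirical Bayes test (3.15) simply thresholds the Chapter 2 estimate at θ_0. A two-line Python sketch in the same hypothetical vein as the earlier ones:

```python
def eb_test(alpha_hat, beta_hat, x, m, theta0):
    """Empirical Bayes test (3.15): returns 1 (reject H_0) when the empirical
    Bayes estimate (2.11) is at least theta0, and 0 otherwise.  A sketch
    reusing eb_estimate from the Section 2.2 sketch."""
    return 1 if eb_estimate(alpha_hat, beta_hat, x, m) >= theta0 else 0

# The sample size for the next component would be chosen as in (3.14),
# N_{i+1} = n_star_star(alpha_hat, beta_hat, m, c, theta0), with
# (alpha_hat, beta_hat) from beta_parameter_estimates.
```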
CHAPTER 4

ESTIMATION OF THE NORMAL MEAN

§ 4.1. The Component Problem

The component problem considered in this chapter is the one introduced in Example 1.1. Here G = N(μ, V) and, letting

(4.1)  ρ = A/(A + nV),

the posterior distribution of θ given X = (X_1, X_2,...,X_n) is

(4.2)  N(ρμ + (1 - ρ)X̄_n, ρV).

With this notation, the Bayes estimator (1.8) can be written

(4.3)  d_G(X) = ρμ + (1 - ρ)X̄_n.

The following remark parallels Remark 2.1.

Remark 4.1. For G = N(μ, V) and G' = N(μ', V'), with ρ' = A/(A + nV'),

(4.4)  R_n(G, d_{G'}) = (1 - ρ')^2 (A/n) + ρ'^2 [(μ' - μ)^2 + V],

(4.5)  |R_n(G, d_{G'}) - R_n(G', d_{G'})| ≤ (μ' - μ)^2 + |V' - V|,

and

(4.6)  R_n(G) = AV/(A + nV).

Proof. By (4.3), d_{G'}(X) = ρ'μ' + (1 - ρ')X̄_n. Since expected squared deviation is variance plus bias squared,

R_n(G, d_{G'}) = E_G E_θ [ρ'μ' + (1 - ρ')X̄_n - θ]^2 = E_G { (1 - ρ')^2 (A/n) + ρ'^2 (μ' - θ)^2 } = (1 - ρ')^2 (A/n) + ρ'^2 [V + (μ' - μ)^2].

Then (4.6) follows by replacing G' by G above and using (4.1). Since

R_n(G', d_{G'}) = (1 - ρ')^2 (A/n) + ρ'^2 V',

it follows that

R_n(G, d_{G'}) - R_n(G', d_{G'}) = ρ'^2 [(μ' - μ)^2 + (V - V')],

which yields (4.5). □

We seek the optimal sample size n* which minimizes

r_n(G) = R_n(G) + cn = AV/(A + nV) + cn

among the integers n = 1, 2,.... We consider r_n(G) as a function of real n and the equation

0 = (d/dn) r_n(G) = -AV^2/(A + nV)^2 + c.

Its larger solution is

(4.7)  η = (A/c)^{1/2} - A/V.

We see that r_n(G) is convex in n ∈ (-A/V, ∞) and that the optimal sample size n* = n*(A, V) is given by (1.10).

In our empirical Bayes application the variance A of the conditional distribution N(θ, A) is assumed to be unknown but is assumed to be in a given bounded interval (0, a]. Thus we are taking A to be a nuisance parameter. It is convenient, though not necessary, to require that at least two observations be taken in each component of the empirical Bayes problem so that the estimation of A is simple. Therefore, we will optimize the sample size over the choices n = 2, 3,... in defining the envelope risk. It follows that

(4.8)  n* = n*(A, V) =  2,                 if η < 2
                        η,                 if η ∈ {2, 3,...}
                        [η] or [η] + 1 (whichever minimizes r_n(G)),  otherwise,

where η is given in (4.7). Since R_2(G) = AV/(A + 2V) ≤ A/2, it follows as in the comment preceding Example 1.1 that n* ≤ (A/2 + 2c)/c. Letting M be the integer [a/2c + 2] + 1, it follows that

(4.9)  2 ≤ n*(A, V) ≤ M < ∞

for all A ∈ (0, a] and priors G = N(μ, V). Notice that in the component problem

(4.10)  E X̄_n = μ,

(4.11)  E[(1/n) Σ_{k=1}^{n} (X_k - μ)^2] = V + A,

and, provided n ≥ 2,

(4.12)  E[(1/(n-1)) Σ_{k=1}^{n} (X_k - X̄_n)^2] = A.

§ 4.2. An Empirical Bayes Decision Procedure

In this section we construct a decision procedure for the empirical Bayes problem with the component of the last section. The unknown prior G is assumed to be from the family of normal distributions 𝒢, the family of conjugate priors. Let G = N(μ, V), where μ ∈ (-∞, ∞) and V ∈ (0, ∞).

Let Â_0, μ̂_0 and V̂_0 be initial nonrandom estimates of the component nuisance parameter A and the parameters μ, V of the prior. Let N_1 = n*(Â_0, V̂_0). Then X^1 = (X_{11},...,X_{1N_1}) is observed in the first component. The empirical Bayes procedure that we will study is defined through sequences of estimators Â_i, μ̂_i and V̂_i that are (X^1,...,X^i) measurable with

(4.13)  N_{i+1} = n*(Â_i, V̂_i),   i = 0, 1,...,

and

(4.14)  d_{i+1}(X^{i+1}) = ρ̂_{i+1} μ̂_i + (1 - ρ̂_{i+1}) Y_{i+1},   i = 0, 1,...,

where

(4.15)  ρ̂_{i+1} = Â_i/(Â_i + N_{i+1} V̂_i),   i = 0, 1,...,

and

(4.16)  Y_i = (1/N_i) Σ_{j=1}^{N_i} X_{ij},   i = 1, 2,....

We now define the estimators μ̂_i, Â_i and V̂_i, i = 1, 2,.... Motivated by (4.10) we define

(4.17)  μ̂_i = Ȳ_i = (1/i) Σ_{j=1}^{i} Y_j,   i = 1, 2,...,

the average of the sample means for the first i components. Motivated by (4.12) we define

(4.18)  Â_i = S̄_i ∧ a,   i = 1, 2,...,

where

(4.19)  S̄_i = (1/i) Σ_{j=1}^{i} S_j

is the average of the sample variances

(4.20)  S_j = (1/(N_j - 1)) Σ_{k=1}^{N_j} (X_{jk} - Y_j)^2,   j = 1, 2,...,

for the first i components. Finally, motivated by (4.11) we define

(4.21)  V̂_i = [T̄_i - Â_i]^+,   i = 1, 2,...,

where

(4.22)  T̄_i = (1/i) Σ_{j=1}^{i} T_{ji}

is the average of the average squared deviations from μ̂_i = Ȳ_i,

(4.23)  T_{ji} = (1/N_j) Σ_{k=1}^{N_j} (X_{jk} - μ̂_i)^2,   j = 1, 2,...,i.

In (4.23) the centerings change with i, which creates a more complicated stochastic structure than exists in (4.20). For purposes of triangulation, we introduce

(4.24)  T̃_i = (1/i) Σ_{j=1}^{i} T_j,

where

(4.25)  T_j = (1/N_j) Σ_{k=1}^{N_j} (X_{jk} - μ)^2,   j = 1, 2,....

Let ℱ_0 be the trivial σ-field and let ℱ_j = σ(X^1, X^2,...,X^j), j = 1, 2,.... The sample size N_j is ℱ_{j-1}-measurable, and we see that

(4.26)  E(Y_j | ℱ_{j-1}) = μ,    E(S_j | ℱ_{j-1}) = A,    E(T_j | ℱ_{j-1}) = V + A,   j = 1, 2,....
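The estimators (4.17), (4.18) and (4.21), and the empirical Bayes estimate (4.14)-(4.15), are also straightforward to compute. A short Python sketch (ours, with illustrative names only; each component is assumed to contain at least two observations):

```python
def normal_prior_estimates(components, a):
    """Estimates (4.17), (4.18), (4.21) of (mu, A, V) from components
    X^1,...,X^i of the normal model.  A sketch, not the author's program."""
    i = len(components)
    Y = [sum(x) / len(x) for x in components]                       # sample means, (4.16)
    mu_hat = sum(Y) / i                                             # (4.17)
    S = [sum((v - Y[j]) ** 2 for v in x) / (len(x) - 1)             # sample variances, (4.20)
         for j, x in enumerate(components)]
    A_hat = min(sum(S) / i, a)                                      # (4.18)
    T = [sum((v - mu_hat) ** 2 for v in x) / len(x) for x in components]   # (4.23)
    V_hat = max(sum(T) / i - A_hat, 0.0)                            # (4.21)
    return mu_hat, A_hat, V_hat

def eb_normal_estimate(mu_hat, A_hat, V_hat, x):
    """Empirical Bayes estimate (4.14)-(4.15) for the current component."""
    n = len(x)
    rho = A_hat / (A_hat + n * V_hat)
    return rho * mu_hat + (1.0 - rho) * (sum(x) / n)
```

The next component's sample size would then be taken as N_{i+1} = n*(Â_i, V̂_i) as in (4.13), with n* given by (4.8).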
Lemma 4.1. The sequences μ̂_i = Ȳ_i, S̄_i and T̃_i are a.s. consistent for μ, A and V + A, respectively.

Proof. We will use (4.26) and the theorem on stability about conditional expectations used earlier, i.e., Hall and Heyde (1980, Theorem 2.19). The sequences Y_i, S_i and T_i are not bounded. However, we will find random variables Y, S and T that are square integrable and stochastically larger than their absolute values. This implies the hypothesis of Theorem 2.19 that is sufficient for the a.s. convergence. Recall that 2 ≤ N_i ≤ M, i = 1, 2,.... Consider the component problem with sample size M and observations X_1, X_2,...,X_M. Let Y = Σ_{j=1}^{M} |X_j|, S = Σ_{j=1}^{M} X_j^2 and T = Σ_{j=1}^{M} (X_j - μ)^2. From the definitions (4.16), (4.20) and (4.25) we see that Y, S and T are stochastically larger than |Y_i|, |S_i| and |T_i|, i = 1, 2,.... Also Y^2 ≤ MS and, conditional on θ, the distributions of S and T are noncentral chi-square distributions with second moments that are integrable with respect to G = N(μ, V). Thus Y, S and T are square integrable. □

Lemma 4.2. The estimators T̄_i and V̂_i are a.s. consistent for V + A and V.

Proof. We have from (4.23) and (4.25) that

(4.27)  T_j - T_{ji} = (Ȳ_i - μ)(2Y_j - μ - Ȳ_i).

Since Ȳ_i = Σ_{j=1}^{i} Y_j / i, we have from (4.22) and (4.24) that

(4.28)  T̃_i - T̄_i = (Ȳ_i - μ)^2.

It follows from Lemma 4.1 that T̄_i is a.s. consistent for V + A. Using (4.21) and Lemma 4.1, it follows that V̂_i is consistent for V. □

Theorem 4.1. Let A ≤ a. Then the empirical Bayes procedure (N, d) defined by (4.13)-(4.23) is asymptotically optimal at each G = N(μ, V).

Proof. From Lemma 1.1 and (4.5),

(4.29)  0 ≤ E r_{N_{i+1}}(G, d_{i+1}) - r(G) ≤ 2E(μ̂_i - μ)^2 + 2E|V̂_i - V|.

Let Y, T be the random variables defined in the proof of Lemma 4.1. Then for p > 0, E|Y_j|^{2+p} ≤ E(Y + 1)^{2+p} < ∞ and E|T_j|^{1+p} ≤ E(T + 1)^{1+p} < ∞ for j = 1, 2,.... Hence, the {Y_j^2} and the {T_j} are uniformly integrable. Thus, {μ̂_i^2} and {T̃_i} are uniformly integrable, and the a.s. convergence (Lemma 4.1) implies that

(4.30)  E(μ̂_i - μ)^2 → 0

and

(4.31)  E|T̃_i - (V + A)| → 0.

It follows from the triangle inequality and (4.28) that

(4.32)  |V̂_i - V| ≤ |T̄_i - T̃_i| + |T̃_i - (V + A)| + |(V + A) - (V + Â_i)| = (Ȳ_i - μ)^2 + |T̃_i - (V + A)| + |Â_i - A|.

The dominated convergence theorem and Lemma 4.1 imply

(4.33)  E|Â_i - A| → 0,

which together with (4.29)-(4.32) establishes the result. □

CHAPTER 5

TESTING THE NORMAL MEAN

§ 5.1. The Component Problem

In this section we consider linear loss testing of the normal mean θ in N(θ, A). Specifically, we consider the problem of testing

(5.1)  H_0: θ ≤ θ_0 against H_1: θ > θ_0

with

L(θ, a_0) = (θ - θ_0)^+,    L(θ, a_1) = (θ_0 - θ)^+.

Using the analysis developed in Section 3.1 for the component of this section, we find that for any test δ,

(5.2)  R_n(G, δ) = ∫_{θ_0}^{∞} (θ - θ_0) dG(θ) - ∫_{𝒳^n} δ(x)[d_G(x) - θ_0] f(x) dx,

where 𝒳 = (-∞, ∞), f is the marginal density of X = (X_1,...,X_n) and d_G is given by (4.3). A Bayes test versus the prior G is

(5.3)  δ_G(X) = 1 if d_G(X) ≥ θ_0, and δ_G(X) = 0 if d_G(X) < θ_0.

Throughout this chapter we will take 𝒢 to be the family of normal distributions N(μ, V) with

(5.4)  E_G|θ| = ∫ |θ| dG(θ) ≤ K < ∞,

where K > 0 is a known constant.

Remark 5.1. Let g, g' be densities for G = N(μ, V), G' = N(μ', V') in 𝒢, and let Φ denote the standard normal distribution function. Then

(5.5)  |R_n(G, δ_{G'}) - R_n(G', δ_{G'})| ≤ 2 ∫_{-∞}^{∞} |θ - θ_0| |g'(θ) - g(θ)| dθ,

(5.6)  R_n(G) ≤ ∫_{-∞}^{∞} |θ - θ_0| g(θ) dθ ≤ K + |θ_0|,

(5.7)  ∫_{-∞}^{∞} |θ| g(θ) dθ = (2V/π)^{1/2} exp(-μ^2/(2V)) + μ(1 - 2Φ(-μ/√V)),

and

(5.8)  |∫ |θ| g'(θ) dθ - ∫ |θ| g(θ) dθ| ≤ 3|μ' - μ| + |√V' - √V| + 2|μ| |Φ(-μ'/√V') - Φ(-μ/√V)| + √V |exp(-μ'^2/(2V')) - exp(-μ^2/(2V))|.

Proof. The bounds (5.5) and (5.6) follow as in the proof of Remark 3.1 together with (5.4), and a direct calculation gives (5.7). Using (5.7) for G' and subtraction, the LHS of (5.8) is less than or equal to

|(2V'/π)^{1/2} exp(-μ'^2/(2V')) + μ'(1 - 2Φ(-μ'/√V')) - (2V/π)^{1/2} exp(-μ^2/(2V)) - μ(1 - 2Φ(-μ/√V))|
≤ |(2V'/π)^{1/2} - (2V/π)^{1/2}| exp(-μ'^2/(2V')) + (2V/π)^{1/2} |exp(-μ'^2/(2V')) - exp(-μ^2/(2V))| + |μ' - μ| + 2|μ' - μ| Φ(-μ'/√V') + 2|μ| |Φ(-μ'/√V') - Φ(-μ/√V)|
≤ 3|μ' - μ| + |√V' - √V| + 2|μ| |Φ(-μ'/√V') - Φ(-μ/√V)| + √V |exp(-μ'^2/(2V')) - exp(-μ^2/(2V))|,

the RHS of (5.8). □

We seek the smallest minimizer n** of r_n(G) = R_n(G) + cn, n = 2, 3,.... (As in Chapter 4 we are optimizing over n ≥ 2.) It follows as in the comment preceding Example 1.1 that n** ≤ (R_2(G) + 2c)/c, so using (5.6) and letting M denote the integer [{2(K + |θ_0|) + 2c}/c] + 1,

2 ≤ n** ≤ M < ∞

for all A ∈ (0, a] and all G ∈ 𝒢. Here n** = n**(A, μ, V) depends on the component variance A as well as on the prior parameters.
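As in Chapter 3, n** has no explicit formula but is easy to compute numerically. One way (our own reduction, not spelled out in the dissertation) uses (5.2) with δ = δ_G, which gives R_n(G) = E_G(θ - θ_0)^+ - E[(d_G(X) - θ_0)^+]; marginally d_G(X) is N(μ, V(1 - ρ)) with ρ = A/(A + nV), so both expectations are of the form E(W - θ_0)^+ for a normal W. The following Python sketch rests on that reduction and is illustrative only.

```python
import math

def _normal_plus_mean(mean, var, t):
    """E (W - t)^+ for W ~ N(mean, var)."""
    if var <= 0.0:
        return max(mean - t, 0.0)
    s = math.sqrt(var)
    z = (mean - t) / s
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - t) * Phi + s * phi

def n_star_star_normal(A, mu, V, c, theta0, K):
    """Smallest minimizer n**(A, mu, V) of r_n(G) = R_n(G) + cn over n >= 2
    for the linear loss test of Section 5.1.  A sketch based on the reduction
    described above; not the author's program."""
    def R(n):
        rho = A / (A + n * V)
        return _normal_plus_mean(mu, V, theta0) - _normal_plus_mean(mu, V * (1.0 - rho), theta0)
    n_max = int((2.0 * (K + abs(theta0)) + 2.0 * c) / c) + 1   # the bound M of Section 5.1
    best_n, best_r = 2, R(2) + 2 * c
    for n in range(3, n_max + 1):
        r = R(n) + c * n
        if r < best_r - 1e-15:
            best_n, best_r = n, r
        if c * n > best_r:          # R(n) >= 0, so larger n cannot improve
            break
    return best_n, best_r
```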
§ 5.2. An Empirical Bayes Decision Procedure

In this section we construct an empirical Bayes procedure (N, δ) for the testing component of the last section with risk converging to r(G) as i → ∞. This will be carried out by determining the sample size N_{i+1} and the decision rule δ_{i+1} ∈ Δ_{N_{i+1}} for i = 0, 1,....

Let Â_0, μ̂_0 and V̂_0 be nonrandom initial estimates of A, μ and V and let N_1 = n**(Â_0, μ̂_0, V̂_0). Then X^1 = (X_{11},...,X_{1N_1}) is observed in the first component. The empirical Bayes procedure that we will study is defined through Â_i, μ̂_i, V̂_i that are (X^1,...,X^i) measurable with

(5.9)  N_{i+1} = n**(Â_i, μ̂_i, V̂_i)

and

(5.10)  δ_{i+1}(x^{i+1}) = 1 if d_{i+1}(x^{i+1}) ≥ θ_0, and δ_{i+1}(x^{i+1}) = 0 otherwise,

where d_{i+1}(X^{i+1}) is defined by (4.14), for i = 0, 1,....

If we use the Â_i, μ̂_i and V̂_i defined by (4.18), (4.17) and (4.21) in constructing (N, δ) = ((N_1, N_2,...), (δ_1, δ_2,...)) given by (5.9), (5.10), then it is easy to see that they satisfy all the consistency properties proved in Lemmas 4.1 and 4.2 and in (4.30), (4.31), (4.33) of Theorem 4.1. The following lemma is useful in proving the asymptotic optimality of (N, δ) constructed above through Â_i, μ̂_i and V̂_i, i = 0, 1,....

Lemma 5.1. Let ĝ_i, g be densities of Ĝ_i = N(μ̂_i, V̂_i), G = N(μ, V) for i = 0, 1,.... Then

(5.11)  lim_i E ∫_{-∞}^{∞} |ĝ_i(θ) - g(θ)| dθ = 0

and

(5.12)  lim_i E ∫_{-∞}^{∞} |θ - θ_0| |ĝ_i(θ) - g(θ)| dθ = 0.

Proof. Note that ĝ_i - g → 0 a.e. on the measure space of the empirical Bayes problem cross Lebesgue measure on (-∞, ∞). Using the same argument as in the proof of Theorem 3.1, we obtain (5.11). For (5.12), it suffices to show that

(5.13)  lim_i E ∫_{-∞}^{∞} |θ| |ĝ_i(θ) - g(θ)| dθ = 0.

Since |θ| |ĝ_i(θ) - g(θ)| → 0 a.e. on the measure space of the empirical Bayes problem cross Lebesgue measure on (-∞, ∞), |θ|(ĝ_i(θ) + g(θ)) dominates the integrand |θ| |ĝ_i(θ) - g(θ)|, and |θ|(ĝ_i(θ) + g(θ)) → 2|θ| g(θ) a.e. on that product space, (5.13) will follow by the generalized dominated convergence theorem by showing that

(5.14)  E ∫_{-∞}^{∞} |θ| (ĝ_i(θ) + g(θ)) dθ → 2 ∫_{-∞}^{∞} |θ| g(θ) dθ.

Using (5.8) applied to g' = ĝ_i and taking expectations,

(5.15)  E | ∫ |θ| ĝ_i(θ) dθ - ∫ |θ| g(θ) dθ | ≤ 3E|μ̂_i - μ| + E|√V̂_i - √V| + 2|μ| E|Φ(-μ̂_i/√V̂_i) - Φ(-μ/√V)| + √V E|exp(-μ̂_i^2/(2V̂_i)) - exp(-μ^2/(2V))|,

which converges to 0 by the a.s. consistency and the mean consistency of μ̂_i and V̂_i. The proof is completed since

| E ∫ |θ| ĝ_i(θ) dθ - ∫ |θ| g(θ) dθ | ≤ E | ∫ |θ| ĝ_i(θ) dθ - ∫ |θ| g(θ) dθ |. □

Theorem 5.1. Let A ≤ a. Then the empirical Bayes decision procedure (N, δ) defined by (5.9), (5.10) through the estimates Â_i, μ̂_i and V̂_i given by (4.18), (4.17) and (4.21) is asymptotically optimal for all G with E_G|θ| ≤ K.

Proof. From Lemma 1.1, (5.5), and Lemma 5.1,

0 ≤ E r_{N_{i+1}}(G, δ_{i+1}) - r(G) ≤ 4 E ∫_{-∞}^{∞} |θ - θ_0| |ĝ_i(θ) - g(θ)| dθ → 0. □

REFERENCES

Gilliland, Dennis and Hannan, James (1977). Improved rates in the empirical Bayes monotone multiple decision problem with MLR family. Ann. Statist. 5 516-521.

Gilliland, Dennis and Karunamuni, Rohana (1988). On empirical Bayes with sequential component. Ann. Inst. Statist. Math. 40 187-193.

Hall, P. and Heyde, C. C. (1980). Martingale Limit Theory and Its Application. Academic Press, New York.

Johns, M. V. and Van Ryzin, J. (1971). Convergence rates for empirical Bayes two-action problems I. Discrete case. Ann. Math. Statist. 42 1521-1539.

Johns, M. V. and Van Ryzin, J. (1972). Convergence rates for empirical Bayes two-action problems II. Continuous case. Ann. Math. Statist. 43 934-947.

Karunamuni, Rohana (1985). Empirical Bayes with sequential components. Ph.D. Thesis, Dept. of Statistics and Probability, Michigan State University.

Karunamuni, Rohana (1988). On empirical Bayes testing with sequential components. Ann. Statist. 16 1270-1282.

Laippala, P. (1979). The empirical Bayes approach with floating optimal sample size in binomial experimentation. Scand. J. Statist. 6 113-118; correction note 7 105.

Laippala, P. (1985). The empirical Bayes rules with floating optimal sample size for exponential conditional distributions. Ann. Inst. Statist. Math. 37 315-327.
Morris, Carl (1983). Parametric empirical Bayes inference: Theory and applications. J. Amer. Statist. Assoc. 78 47-65.

O'Bryan, T. (1972). Empirical Bayes results in the case of non-identical components. Ph.D. Thesis, RM-306, Statistics and Probability, Michigan State University.

O'Bryan, T. (1976). Some empirical Bayes results in the case of component problems with varying sample sizes for discrete exponential families. Ann. Statist. 4 1290-1293.

O'Bryan, T. and Susarla, V. (1975). An empirical Bayes two-action problem with non-identical components for a translated exponential distribution. Comm. Statist. 4(8) 767-775.

Robbins, H. (1951). Asymptotically subminimax solutions of compound statistical decision problems. Proc. 2nd Berkeley Symp. Math. Statist. Prob. 131-148, University of California Press.

Robbins, H. (1956). An empirical Bayes approach to statistics. Proc. 3rd Berkeley Symp. Math. Statist. Prob. 1 157-163, University of California Press.

Robbins, H. (1964). The empirical Bayes approach to statistical decision problems. Ann. Math. Statist. 35 1-20.

Singh, R. S. (1979). Empirical Bayes estimation in Lebesgue-exponential families with rates near the best possible rate. Ann. Statist. 7 890-902.

Susarla, V. (1982). Empirical Bayes theory. Encyclopedia of Statistical Sciences (eds. Kotz and Johnson) 2 490-503, Wiley, New York.

Van Ryzin, J. and Susarla, V. (1977). On the empirical Bayes approach to multiple decision problems. Ann. Statist. 5 172-181.