RATES OF CONVERGENCE IN SEQUENCE-COMPOUND
SQUARED-DISTANCE LOSS ESTIMATION AND
TWO-ACTION PROBLEMS
Thesis for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
VYAGHRESWARUDU SUSARLA
1970
This is to certify that the
thesis entitled
RATES OF CONVERGENCE IN SEQUENCE-COMPOUND SQUARED-DISTANCE
LOSS ESTIMATION AND TWO-ACTION PROBLEMS
presented by
Vyaghreswarudu Susarla
has been accepted towards fulfillment
of the requirements for
Ph.D. degree in Statistics and Probability
Major professor
Date August 12, 1970
ABSTRACT
RATES OF CONVERGENCE IN SEQUENCE-COMPOUND SQUARED-DISTANCE
LOSS ESTIMATION AND TWO-ACTION PROBLEMS
BY
Vyaghreswarudu Susarla
We consider a sequence of repetitions of a statistical decision problem which has the structure of one of the statistical decision problems described below. These statistical decision problems will be referred to later on as component problems.
When the family of distributions $\mathcal{P}$ is (1) the family of m-variate normal distributions with covariance matrix $I$ and mean $\theta$ in $\Omega = [|\theta| \le \alpha]$, the problem is to estimate $\theta$ with squared-distance loss, (2) the family of $\Gamma(\alpha)$ distributions with scale parameter $\theta$ in $\Omega = [a,b]$ where $0 < a < b < \infty$, the problem is to estimate $\theta$ with squared-distance loss and (3) same as (2) except that the problem is a linear loss two-action problem. For any distribution $G$ on $\Omega$, let $R(G)$ denote the Bayes risk in the component problem.
$\underline{X} = \{X_n\}$ is a sequence of independent random variables with distributions $\{P_{\theta_n}\}$ in $\mathcal{P}$. Let $G_n$ be the empiric distribution of $\theta_1,\ldots,\theta_n$. Let $s$ be a positive integer and $\gamma$ be in $(0,1)$. All the orders stated here are uniform in the parameter sequences $\underline{\theta}$ in $\times\,\Omega$.
When the component problem is described by (1), we exhibit procedures $\underline{\psi}^{**}$, $\hat{\underline{\psi}}$ and ${}_0\hat{\underline{\psi}}$, which are functions of $X_1,\ldots,X_n$, such that $D_n(\underline{\theta},\underline{\psi}^{**}) = n^{-1}\sum_{j=1}^{n} E|\psi_j^{**} - \theta_j|^2 - R(G_n)$, $D_n(\underline{\theta},\hat{\underline{\psi}})$ and $D_n(\underline{\theta},{}_0\hat{\underline{\psi}})$ are $O(n^{-1/(m+4)})$, $O(n^{-(s-1)\gamma/[(2s+m)(1+\gamma)]})$ and $O(n^{-(s-1)/[2(s+m+1)]})$ respectively. Whenever $m \ge 5$ and $(s-1)\gamma(m+4) > 2(2s+m)(1+\gamma)$, $\hat{\underline{\psi}}$ is better than $\underline{\psi}^{**}$ in the sense that $\sup\{D_n(\underline{\theta},\hat{\underline{\psi}}) \mid \underline{\theta}\}$ converges to zero at a faster rate than $\sup\{D_n(\underline{\theta},\underline{\psi}^{**}) \mid \underline{\theta}\}$ does. A similar comparison has been given between $\underline{\psi}^{**}$ and ${}_0\hat{\underline{\psi}}$. The results stated above for $\underline{\psi}^{**}$ and $\hat{\underline{\psi}}$ have been extended to the case when the covariance matrix $I$ is replaced by $\sigma^2 I$ ($\sigma^2$ unknown) and the means $\theta_n$ lie in lower dimensional subspaces having the same dimension.
When the component problem is given by (2), we exhibit a procedure $\underline{\psi}^{*}$ such that $D_n(\underline{\theta},\underline{\psi}^{*}) = O(n^{-s/[2(s+1)]})$ when $a$, $b$ and $\alpha$ satisfy certain conditions. For the same set of conditions on $a$, $b$ and $\alpha$, when the component problem is described by (3) with loss function $L$, we define a procedure $\hat{\underline{\varphi}}$ such that $n^{-1}\sum_{j=1}^{n} E\,L(\theta_j,\hat{\varphi}_j) - R(G_n) = O(n^{-s/[2(s+1)]})$.
RATES OF CONVERGENCE IN SEQUENCE-COMPOUND SQUARED-DISTANCE
LOSS ESTIMATION AND TWO-ACTION PROBLEMS
BY
Vyaghreswarudu Susarla
A THESIS
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
Department of Statistics and Probability
1970
TO MY PARENTS
ACKNOWLEDGEMENTS
I wish to express my sincere gratitude to Professor J.F.
Hannan for introducing me to compound decision theory and for
suggesting the problems treated in the thesis. His comments aided
greatly in improving and simplifying most of the results of the
thesis.
I wish to thank Professors D.C. Gilliland and J.S. Huang
for going through the thesis and pointing out some misprints. I
wish to thank Mr. T. O'Bryan for suggesting changes in the
phraseology. Special thanks are due to Mrs. Noralee Barnes for
her excellent typing and cheerful attitude in the preparation of
the manuscript.
I am grateful to the Department of Statistics and Probability,
Michigan State University and the National Science Foundation for
the financial support during my stay at Michigan State University.
TABLE OF CONTENTS
Chapter
0   INTRODUCTION
I   RATES IN THE ESTIMATION PROBLEM FOR A FAMILY OF m-VARIATE NORMAL DISTRIBUTIONS
    1.0 Introduction and Notation
    1.1 A Bound for the Modified Regret $D_n(\underline{\theta},\underline{\varphi})$
    1.2 A Rate of Convergence for $D_n(\underline{\theta},\underline{\psi}^{**})$ with $\underline{\psi}^{**}$ Based on a Divided Difference Estimator for the Derivative of the log of a Density
    1.3 Rates Near $O(n^{-1/4})$ for $D_n(\underline{\theta},\hat{\underline{\psi}})$ with $\hat{\underline{\psi}}$ Based on Kernel Estimators for a Density and its Derivative
    1.4 Rates Near $O(n^{-1/2})$ for $D_n(\underline{\theta},{}_0\hat{\underline{\psi}})$ where ${}_0\hat{\underline{\psi}}$ is a Particular $\hat{\underline{\psi}}$
    1.5 A Lower Bound for $D_n(\underline{0},\underline{\psi}^{**})$
    1.6 Extension of Results in Sections §1.2 and §1.3 to Constrained Mean Vectors and Unknown Covariance Matrix
        1.6.1 Definition of $\underline{\psi}^{**}$ and a Rate of Convergence for $D_n(\underline{\theta},\underline{\psi}^{**})$
        1.6.2 Definition of $\hat{\underline{\psi}}$ and a Rate of Convergence of $D_n(\underline{\theta},\hat{\underline{\psi}})$
II  RATES IN THE ESTIMATION AND TWO-ACTION PROBLEMS FOR A FAMILY OF SCALE PARAMETER $\Gamma(\alpha)$ DISTRIBUTIONS
    2.0 Introduction and Notation
    2.1 Estimation Problem. Rates of Convergence for $D_n(\underline{\theta},\underline{\psi}^{*})$ with $\underline{\psi}^{*}$ Based on Kernel Estimators for a Density
    2.2 Two-action Problem. Rates of Convergence for $D_n(\underline{\theta},\hat{\underline{\varphi}})$ with $\hat{\underline{\varphi}}$ Based on Kernel Estimators for a Density
APPENDIX
BIBLIOGRAPHY
INTRODUCTION
In Chapter I, $\mathcal{P} = \{P_\theta\}$ is the family of m-variate normal distributions with covariance matrix $I$ and mean $\theta$ in $\Omega = [|\theta| \le \alpha]$ and the component problem is squared-distance loss estimation of $\theta$. In Chapter II, $\mathcal{P}$ is the family of $\Gamma(\alpha)$ distributions with scale parameter $\theta$ in $\Omega = [a,b]$ where $0 < a < b < \infty$ and the component problem is either squared-distance loss estimation or a linear loss two-action problem. For any distribution $G$ on $\Omega$, let $\psi_G$ and $R(G)$ denote the Bayes estimate and the Bayes risk in the component problem.
The sequence-compound problem consists of a sequence of repetitions of the component problem with the loss taken to be the average of the component losses. $\underline{X} = \{X_n\}$ is a sequence of independent random variables with distributions $\{P_{\theta_n}\}$ in $\mathcal{P}$ and the nth component decision $\varphi_n$ depends only on $X_1,\ldots,X_n$. With $G_n$ denoting the empiric distribution of $\theta_1,\ldots,\theta_n$, let
(0.1)   $D_n(\underline{\theta},\underline{\varphi}) = n^{-1}\sum_{j=1}^{n} E[L(\theta_j,\varphi_j)] - R(G_n)$.
$D_n(\underline{\theta},\underline{\varphi})$ is known as the modified regret of $\underline{\varphi}$.
Since the work reported here is a continuation of Gilliland
(1966, 1968) and Johns (1967), we describe some of the main results
contained in these references. All the orders stated below are
uniform in the parameter sequences concerned. For the purpose of
this introduction only, abbreviate $O(n^{-a})$ to order $-a$.
When $\mathcal{P}$ is the family of univariate normal distributions with variance unity and mean $\theta$ in $[-\alpha,+\alpha]$ and the component problem is squared-distance loss estimation, Gilliland (1966) exhibited a procedure whose modified regret is order $-1/5$. When $\mathcal{P}$ is a certain family of discrete distributions and the component problem is the linear loss two-action problem, Johns (1967) exhibited a procedure whose modified regret is order $-1/2$. When $\mathcal{P}$ is a certain discrete exponential family and the component problem is squared-distance loss estimation, Gilliland (1968) exhibited two procedures whose modified regrets are order $-1/2$.
Now we briefly describe the main results obtained in this work. In Chapter I, the Bayes estimate against $G_{n-1}$ is
$X_n + q/p$
with $p$ denoting the mixed density $\int p_\theta\,dG_{n-1}$, $q$ denoting the matrix of partial derivatives of $p$ and indication of the evaluation of both at $X_n$ abbreviated by omission.
In section §1.2, we define $\underline{\psi}^{**}$ based on a divided difference estimate of $q/p$ whose $D_n$ is order $-(m+4)^{-1}$. This generalizes the result of Gilliland (1966) for the $m = 1$ case.
In section §1.3, for each positive integer $s$ and $\gamma$ in $(0,1)$, we define $\hat{\underline{\psi}}$ based on kernel estimators for $p$ and $q$ analogous to the Johns and Van Ryzin (1967) estimates of $\int p_\theta\,dG$ and its derivative in the empirical Bayes two-action problem in exponential families and show that $D_n(\underline{\theta},\hat{\underline{\psi}})$ is order $-(s-1)\gamma/[(2s+m)(1+\gamma)]$. For each integer $s > 1$, we exhibit ${}_0\hat{\underline{\psi}}$, specializing $\hat{\underline{\psi}}$ but for the latter's retraction to $[\delta,\infty)$, whose $D_n$ is order $-(s-1)/[2(s+m+1)]$.
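In section §1.5, we show that $D_n(\underline{0},\underline{\psi}^{**}) \ge c\,n^{-2/(m+4)}$ where $c$ is a constant depending on $\alpha$. Hence, whenever $m \ge 5$ and $s$ and $\gamma$ are such that $(s-1)\gamma(m+4) > 2(2s+m)(1+\gamma)$, $\hat{\underline{\psi}}$ is better than $\underline{\psi}^{**}$ in the sense that $\sup\{D_n(\underline{\theta},\hat{\underline{\psi}}) \mid \underline{\theta}\}$ converges to zero at a faster rate than $\sup\{D_n(\underline{\theta},\underline{\psi}^{**}) \mid \underline{\theta}\}$. A similar comparison is made between $\underline{\psi}^{**}$ and ${}_0\hat{\underline{\psi}}$.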
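The comparison of rates above is pure arithmetic on exponents, so it can be checked exactly. The following is a minimal sketch, assuming only the exponents quoted in this introduction; the function names are illustrative and not from the thesis.

```python
from fractions import Fraction


def divided_difference_exponent(m):
    """Exponent 2/(m+4) in the lower bound c * n**(-2/(m+4)) of section 1.5."""
    return Fraction(2, m + 4)


def kernel_exponent(m, s, gamma):
    """Exponent (s-1)*gamma / ((2s+m)(1+gamma)) of the kernel-based procedure."""
    gamma = Fraction(gamma)
    return (s - 1) * gamma / ((2 * s + m) * (1 + gamma))


def kernel_is_faster(m, s, gamma):
    """The stated condition (s-1)*gamma*(m+4) > 2*(2s+m)*(1+gamma)."""
    gamma = Fraction(gamma)
    return (s - 1) * gamma * (m + 4) > 2 * (2 * s + m) * (1 + gamma)
```

Since the kernel exponent stays below 1/4 for every $s$ and every $\gamma$ in $(0,1)$, the condition can hold only when $2/(m+4) < 1/4$, which is the requirement $m \ge 5$.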
Section §1.6 extends the main results of sections §1.2 and §1.3 to the case when the covariance matrix $I$ is replaced by $\sigma^2 I$ ($\sigma^2$ unknown) under the additional assumption that the means lie in lower dimensional subspaces having the same dimension.
In Chapter II, as already indicated, $\mathcal{P}$ is the family of $\Gamma(\alpha)$ distributions with scale parameter $\theta$ in $\Omega = [a,b]$. In section §2.1, the component problem is squared-distance loss estimation. For each positive integer $s$, we define $\underline{\psi}^{*}$ based on kernel estimates for two densities and show that $D_n(\underline{\theta},\underline{\psi}^{*})$ is order $-s/[2(s+1)]$ whenever $a$, $b$ and $\alpha$ satisfy certain conditions. In section §2.2, the component problem is linear loss two-action. For each positive integer $s$, we define $\hat{\underline{\varphi}}$ based on kernel estimates for two densities and show that $D_n(\underline{\theta},\hat{\underline{\varphi}})$ is order $-s/[2(s+1)]$ whenever $a$, $b$ and $\alpha$ satisfy the conditions imposed on them in section §2.1.
Throughout this work, we let $\Phi$ and $\phi$ denote the standard normal distribution and its density respectively. We suppress the arguments of functions whenever it is convenient not to exhibit them. Indulging in the abuse of notation, we let sets denote their own indicator functions and, infrequently, are forced to let the value of a function denote the function. For any measure $\mu$, we let $\mu[f]$ or $\mu f$ denote $\int f\,d\mu$.
CHAPTER I
RATES IN THE ESTIMATION PROBLEM FOR A FAMILY OF
m-VARIATE NORMAL DISTRIBUTIONS
§1.0 Introduction and Notation.
For fixed $\alpha < \infty$ and for fixed positive integer $m$, let $\mathcal{P} = \{P_\theta \mid |\theta| \le \alpha\}$ be the family of distributions with $P_\theta$ denoting the m-variate normal law with mean $\theta$ and covariance $\sigma^2 I$, where $I$ is the $m \times m$ identity matrix and $\sigma^2 > 0$.
We consider the following estimation problem which will be called the component problem hereafter. Based on an observation of a random vector $X$ whose distribution $P_\theta$ belongs to $\mathcal{P}$, the problem is to estimate $\theta$ with squared-distance loss.
For any distribution $G$ on the m-sphere of radius $\alpha$, let $\psi_G$ and $R(G)$ denote the Bayes estimate and the Bayes risk versus $G$ in the above estimation problem. Since the problem considered here is the squared-distance loss estimation problem, $\psi_G$ is given by the conditional expectation of $\theta$ given $X$. If $p_\theta$ denotes the usual density of $P_\theta$ wrt Lebesgue measure on $(R^m, \mathcal{B}^m)$, then the conditional expectation of $\theta$ given $X$ is $G[\theta p_\theta]/G[p_\theta]$ which can be expressed as $X + \sigma^2 q_G$ where $q_G$ is the vector of partial derivatives of $\log G[p_\theta]$ wrt the various coordinates of $X$. Hence,
(0.1)   $\psi_G = X + \sigma^2 q_G$.
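The identity behind (0.1) can be checked numerically when $G$ is an empiric distribution with finitely many atoms. The sketch below is an illustration only: the function names are hypothetical, and NumPy is an assumption of this example rather than anything used in the thesis. It computes the Bayes estimate once as the posterior mean $G[\theta p_\theta]/G[p_\theta]$ and once as $X + \sigma^2 q_G$.

```python
import numpy as np


def normal_density(x, thetas, sigma2):
    """N(theta, sigma2 * I) density at x for each row theta of `thetas`."""
    m = x.shape[-1]
    sq = np.sum((x - thetas) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma2)) / (2.0 * np.pi * sigma2) ** (m / 2.0)


def bayes_estimate_posterior_mean(x, thetas, sigma2):
    """G[theta p_theta] / G[p_theta] with G uniform on the rows of `thetas`."""
    w = normal_density(x, thetas, sigma2)
    return (w[:, None] * thetas).sum(axis=0) / w.sum()


def bayes_estimate_score_form(x, thetas, sigma2):
    """x + sigma2 * gradient of log G[p_theta], the form used in (0.1)."""
    w = normal_density(x, thetas, sigma2)
    # d/dx log sum_j p_j(x) = sum_j p_j(x) (theta_j - x) / (sigma2 * sum_j p_j(x))
    grad_log = (w[:, None] * (thetas - x)).sum(axis=0) / (sigma2 * w.sum())
    return x + sigma2 * grad_log
```

The two functions agree up to rounding, which is exactly the content of (0.1): the posterior mean is the observation shifted by $\sigma^2$ times the score of the mixed density.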
We consider a sequence of component problems as described above. That is, let $\{X_n\}$ be a sequence of independent random variables with $X_n$ distributed as $P_{\theta_n}$ belonging to $\mathcal{P}$ and the problem is to estimate every component of $\{\theta_n\}$ with loss taken as the average of squared-distance losses in individual components. For each $n$, let the product measure $\times_{i=1}^{n} P_i$, where $P_i$ is an abbreviation for $P_{\theta_i}$, be denoted by $\mathbf{P}_n$. Let $\underline{\varphi} = \{\varphi_n\}$ be a sequence-compound procedure (abbreviated to procedure hereafter). For any parameter sequence $\underline{\theta} = \{\theta_n\}$ and for any non-randomized procedure $\underline{\varphi} = \{\varphi_n\}$, define
(0.2)   $D_n(\underline{\theta},\underline{\varphi}) = n^{-1}\sum_{j=1}^{n} \mathbf{P}_j[\,|\varphi_j - \theta_j|^2\,] - R(G_n)$
where $G_n$ is the empiric distribution of $\theta_1,\ldots,\theta_n$. $D_n(\underline{\theta},\underline{\varphi})$ is called the modified regret of the procedure $\underline{\varphi}$.
The orders stated in the results of sections §1.1, §1.2, §1.3 and §1.4 are uniform in all parameter sequences $\underline{\theta}$ in $\times\,[|\theta_n| \le \alpha]$ and the order stated in section §1.6 is uniform in all parameter sequences $\underline{\theta}$ belonging to $\times\,([|\theta_n| \le \alpha] \cap R_n)$, where, for each $n$, $R_n$ is a $d$ ($d < m$)-dimensional subspace of $R^m$. To reduce the complexity of the statements of various results in this chapter, the range of the parameter sequences will not be exhibited, but is understood to be as in the preceding sentence. Henceforth, we use these conventions.
In section §1.1, we get an upper bound for $|D_n(\underline{\theta},\underline{\varphi})|$ under the assumption that $\underline{\varphi}$ is in $\times\,[-\alpha,+\alpha]^m$ and a useful lemma, both results holding for each $\sigma^2$. In section §1.2, we exhibit a procedure $\underline{\psi}^{**}$ for which $D_n(\underline{\theta},\underline{\psi}^{**}) = O(n^{-1/(m+4)})$ when $\sigma^2 = 1$. In section §1.3, for each $\gamma > 0$, we exhibit a procedure $\hat{\underline{\psi}}$ for which $D_n(\underline{\theta},\hat{\underline{\psi}}) = O(n^{-(1/4-\gamma)})$ again for $\sigma^2 = 1$. In section §1.4, for each positive integer $s$, we exhibit a procedure ${}_0\hat{\underline{\psi}}$ for which $D_n(\underline{\theta},{}_0\hat{\underline{\psi}}) = O(n^{-(s-1)/[2(m+s+1)]})$ for $\sigma^2 = 1$. Section §1.5 shows that $D_n(\underline{0},\underline{\psi}^{**}) \ge c\,n^{-2/(m+4)}$ for all $n$, where $\underline{0} = \{0\}$ and $c$ is a positive constant. Section §1.6 has two subsections. These subsections extend respectively the main results of sections §1.2 and §1.3 to the case when $\sigma^2$ is unknown and when, for each $n$, $\theta_n$ lies in $R_n$ intersected with the m-sphere of radius $\alpha$.
Let $\mu$ denote the Lebesgue measure on $(R^m, \mathcal{B}^m)$. For any two points $u, v$ in $R^m$ with coordinates $u_1,\ldots,u_m$, $v_1,\ldots,v_m$ respectively, let $|u|^2 = \sum_{i=1}^{m} u_i^2$, $\|u\| = \sum_{i=1}^{m} |u_i|$ and $(u,v) = \sum_{i=1}^{m} u_i v_i$. The inequalities $|u| \le \|u\| \le \sqrt{m}\,|u|$ will be used without further comment. Also, a vector in $R^m$ will be denoted by $\langle\ \rangle$ with the general coordinate of the vector exhibited inside the brackets.
Let $p_n$ be an abbreviation for $p_{\theta_n}$, the density of $P_{\theta_n}$. For each $n$, let $\psi_{G_n}$ be abbreviated by $\psi_n$. Then, specializing (0.1),
(0.3)   $\psi_n = X + \sigma^2 q_n$
where $q_n$ is the vector of partial derivatives of the function $\log \sum_{j=1}^{n} p_j$ wrt the coordinates of $X$.
§1.1 A Bound for the Modified Regret $D_n(\underline{\theta},\underline{\varphi})$.
We state and prove two lemmas which are higher dimensional generalizations of Proposition 1 and Corollary 1 of Chapter I of Gilliland (1966) for the case of the family of normal distributions $\mathcal{P}$.
Lemma 1. $P_n[\,|\psi_n - \psi_{n-1}|\,] \le 2\alpha\,e^{4\sigma^{-2}\alpha^2}\,n^{-1}$ for $n > 1$.
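Proof. From $\psi_n = G_n[\theta p_\theta]/G_n[p_\theta]$, the triangle inequality and Jensen's inequality, respectively, it follows that
(1.1)   $|\psi_n - \psi_{n-1}| = \bigl(\sum_{j=1}^{n} p_j\bigr)^{-1}\bigl(\sum_{j=1}^{n-1} p_j\bigr)^{-1}\bigl|\sum_{j=1}^{n-1}(\theta_j - \theta_n)p_j\bigr|\,p_n \le 2\alpha\,p_n\bigl(\sum_{j=1}^{n} p_j\bigr)^{-1} \le 2\alpha\,n^{-2}\,p_n\sum_{j=1}^{n} p_j^{-1}$.
Since $p_n p_j^{-1} = \exp\{\sigma^{-2}(\theta_n - \theta_j,\ X - (\theta_n + \theta_j)2^{-1})\}$,
$P_n[p_n p_j^{-1}] = \exp\{\sigma^{-2}|\theta_n - \theta_j|^2\} \le \exp\{4\sigma^{-2}\alpha^2\}$
which, when substituted in (1.1), completes the proof.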
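The moment identity $P_n[p_n p_j^{-1}] = \exp\{\sigma^{-2}|\theta_n - \theta_j|^2\}$ used in the last step can be verified numerically. Here is a minimal sketch for $m = 1$, assuming Gauss-Hermite quadrature from NumPy as the integration method (an assumption of this illustration, not the thesis's argument); the function names are hypothetical.

```python
import math

import numpy as np
from numpy.polynomial.hermite import hermgauss


def expected_density_ratio(theta_n, theta_j, sigma2, nodes=40):
    """E[p_n(X) / p_j(X)] for X ~ N(theta_n, sigma2), by Gauss-Hermite quadrature."""
    t, w = hermgauss(nodes)                     # nodes/weights for weight exp(-t^2)
    x = theta_n + math.sqrt(2.0 * sigma2) * t   # change of variable to N(theta_n, sigma2)
    d = theta_n - theta_j
    ratio = np.exp(d * (x - 0.5 * (theta_n + theta_j)) / sigma2)
    return float(np.sum(w * ratio) / math.sqrt(math.pi))


def lemma1_bound(alpha, sigma2, n):
    """Right-hand side 2 * alpha * exp(4 * alpha^2 / sigma2) / n of Lemma 1."""
    return 2.0 * alpha * math.exp(4.0 * alpha ** 2 / sigma2) / n
```

For means confined to $[-\alpha,\alpha]$ the computed expectation never exceeds $\exp\{4\sigma^{-2}\alpha^2\}$, which is the only property of the ratio that the proof uses.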
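Lemma 2. If the procedure $\underline{\varphi}$ is in $\times\,[-\alpha,+\alpha]^m$, then, for each $\sigma^2 > 0$,
$|D_n(\underline{\theta},\underline{\varphi})| \le 4\alpha\,n^{-1}\sum_{j=1}^{n} \mathbf{P}_j[\,\|\varphi_j - \psi_{j-1}\|\,] + O(n^{-1}\log n)$,
where $\psi_0$ is an arbitrary decision rule taking values in $[-\alpha,+\alpha]^m$.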
Proof. Inequalities (8.8) and (8.11) of Hannan (1957) when specialized to the squared-distance loss estimation problem here give the inequality
(1.2)   $n^{-1}\sum_{j=1}^{n} P_j[\,|\psi_j - \theta_j|^2\,] \le R(G_n) \le n^{-1}\sum_{j=1}^{n} P_j[\,|\psi_{j-1} - \theta_j|^2\,]$.
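By bounding the term $R(G_n)$ appearing in the definition (0.2) of $D_n(\underline{\theta},\underline{\varphi})$ above and below by using (1.2), we obtain, by using the equality $|a|^2 - |b|^2 = (a + b,\ a - b)$ for $a, b$ in $R^m$, the double inequality
(1.3)   $n^{-1}\sum_{j=1}^{n} \mathbf{P}_j[(\varphi_j + \psi_{j-1} - 2\theta_j,\ \varphi_j - \psi_{j-1})] \le D_n(\underline{\theta},\underline{\varphi}) \le n^{-1}\sum_{j=1}^{n} \mathbf{P}_j[(\varphi_j + \psi_j - 2\theta_j,\ \varphi_j - \psi_j)]$.
from $R^m$ to $R^m$, where $F\square$ and $F\square_\ell$ represent the measures of $\square$ and $\square_\ell$ for $\ell = 1,\ldots,m$ respectively under $F$ and any undefined ratios are taken to be 1. We abbreviate $t(\bar{F})$ frequently by $t$ hereafter.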
Let the function $t(F^*)$, where $F^*$ is the empiric distribution of $X_1,\ldots,X_{n-1}$, be denoted by $t^*$. Let $X$ abbreviate $X_n$ and $X_1,\ldots,X_m$ denote the coordinates of $X$. Let
(2.1)   $\psi^{**} = \mathrm{tr}'(X + \sigma^2 t^*(X))$,   $\psi^{*} = \mathrm{tr}(X + \sigma^2 t^*(X))$
where $\mathrm{tr}'$ and $\mathrm{tr}$ stand for the coordinatewise retraction to the intervals $[-\alpha,+\alpha]$ and $[-\alpha - k - h,\ \alpha + k + h]$ respectively.
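The estimate in (2.1) can be sketched in code. The sketch assumes, as the later proofs indicate, that $\square$ is the cube with sides $[X_i, X_i + h]$, that $\square_\ell$ is the same cube with its $\ell$th side shifted by $k$, and that the $\ell$th coordinate of $t^*$ is $k^{-1}\log(F^*\square_\ell/F^*\square)$. The function name, the use of NumPy, and the treatment of every ratio with an empty denominator cube as 1 are assumptions of this illustration.

```python
import numpy as np


def divided_difference_estimate(x, past, h, k, alpha, sigma2=1.0):
    """Sketch of psi** = tr'(x + sigma2 * t*(x)) from past observations."""
    x = np.asarray(x, dtype=float)
    m = x.size
    past = np.asarray(past, dtype=float).reshape(-1, m)
    in_side = (past >= x) & (past <= x + h)             # side [x_i, x_i + h]
    in_shifted = (past >= x + k) & (past <= x + k + h)  # side shifted by k
    count_cube = int(np.sum(np.all(in_side, axis=1)))
    t_star = np.empty(m)
    for ell in range(m):
        sides = in_side.copy()
        sides[:, ell] = in_shifted[:, ell]              # shift only the ell-th side
        count_shifted = int(np.sum(np.all(sides, axis=1)))
        if count_cube == 0:
            ratio = 1.0                                 # undefined ratio taken as 1
        else:
            ratio = count_shifted / count_cube
        # an empty shifted cube gives log 0; the retraction then returns -alpha
        t_star[ell] = -np.inf if ratio == 0 else np.log(ratio) / k
    return np.clip(x + sigma2 * t_star, -alpha, alpha)  # retraction tr'
```

The retraction is what keeps the estimate usable when the counts are small: whatever the divided difference returns, the result stays in $[-\alpha,+\alpha]^m$, as Lemma 2 requires.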
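With $\psi$ abbreviating $\psi_{n-1}$, we have, since $|\psi| \le \alpha$ and $\psi^{**} = \mathrm{tr}'\psi^{*}$, $\|\psi^{**} - \psi\| \le \|\psi^{*} - \psi\|$. Therefore, by the triangle inequality
(2.2)   $\mathbf{P}_n[\,\|\psi^{**} - \psi\|\,] \le \mathbf{P}_n[\,\|\psi^{*} - (X + \sigma^2 t(X))\|\,] + P_n[\,\|X + \sigma^2 t(X) - \psi\|\,]$.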
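Lemma 3. For all $x$ in $R^m$:
(1) $x + \sigma^2 t(\bar{F})(x) \in [-\alpha - \tfrac{k}{2} - h,\ \alpha]^m$,
(2) $\bar{F}\square_\ell \ge p\,(h/\sigma)^m \exp\{-\tfrac{k+h}{\sigma^2}(\|x\| + \sqrt{m}\,\alpha + \tfrac{mh+k}{2})\}$ where $p$ is the density of $\bar{F}$ at $x$ and
(3) $\bar{F}\square'_\ell / \bar{F}\square_\ell \le \tfrac{k+h}{h}\exp\{\tfrac{k+h}{\sigma^2}(|x_\ell| + \alpha + k + h)\}$ for $\ell = 1,\ldots,m$ where $\square'_\ell = \times_{j=1}^{m} I'_j$ with $I'_j = I_j$ for $j \ne \ell$ and $I'_\ell = [x_\ell,\ x_\ell + k + h]$.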
Proof. In this proof, let $F_j$ denote the distribution of $X_j$ and $\theta_{j1},\ldots,\theta_{jm}$ denote the coordinates of $\theta_j$.
Proof of (1). Let $\ell$ be in $\{1,\ldots,m\}$. Since the coordinates of $X_j$ are independent, we can express $F_j\square_\ell$ and $F_j\square$ as the products of univariate normal probabilities. Therefore, by cancelling out the common terms in these products, we obtain that
$\dfrac{F_j\square_\ell}{F_j\square} = \dfrac{\Phi(\sigma^{-1}(x_\ell - \theta_{j\ell} + k + h)) - \Phi(\sigma^{-1}(x_\ell - \theta_{j\ell} + k))}{\Phi(\sigma^{-1}(x_\ell - \theta_{j\ell} + h)) - \Phi(\sigma^{-1}(x_\ell - \theta_{j\ell}))}$.
Applying Cauchy's mean value theorem (Graves (1946), p. 81) to the rhs of this equality over $(0, \sigma^{-1}h)$ with the function in the denominator to be taken as $\Phi(\sigma^{-1}(x_\ell - \theta_{j\ell}))$ while that in the numerator to be taken as $\Phi(\sigma^{-1}(x_\ell - \theta_{j\ell} + k))$, we obtain, by using $a^2 - b^2 = (a + b)(a - b)$ for $a, b$ in $R^1$, the existence of $\omega$ in $(0,1)$ such that
$\dfrac{F_j\square_\ell}{F_j\square} = \exp\{-\tfrac{k}{\sigma^2}(x_\ell - \theta_{j\ell} + \tfrac{k}{2} + \omega h)\}$.
Hence, since $|\theta_{j\ell}| \le \alpha$,
$\exp\{-\tfrac{k}{\sigma^2}(x_\ell + \alpha + \tfrac{k}{2} + h)\} \le \dfrac{F_j\square_\ell}{F_j\square} \le \exp\{\tfrac{k}{\sigma^2}(\alpha - x_\ell)\}$.
Since these bounds for $F_j\square_\ell/F_j\square$ are independent of $j$, they also bound $\bar{F}\square_\ell/\bar{F}\square$. These inequalities are equivalent to (1) in view of the definition of $t(\bar{F})$. Since $\ell$ is arbitrary the proof of (1) is complete.
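The mean value step in the proof of (1) can be illustrated numerically in one coordinate: solving the displayed exponential identity for $\omega$ should always return a value in $(0,1)$. The sketch below writes $d$ for $x_\ell - \theta_{j\ell}$; the function names are hypothetical and the check is an illustration, not part of the proof.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interval_ratio(d, h, k, sigma):
    """Probability of [d + k, d + k + h] over that of [d, d + h] under N(0, sigma^2)."""
    num = std_normal_cdf((d + k + h) / sigma) - std_normal_cdf((d + k) / sigma)
    den = std_normal_cdf((d + h) / sigma) - std_normal_cdf(d / sigma)
    return num / den


def implied_omega(d, h, k, sigma):
    """Solve ratio = exp(-(k / sigma^2) * (d + k/2 + omega * h)) for omega."""
    log_ratio = math.log(interval_ratio(d, h, k, sigma))
    return (-(sigma ** 2) * log_ratio / k - d - k / 2.0) / h
```

Because the solved $\omega$ lies strictly between 0 and 1, replacing it by its extremes gives bounds that depend on the unknown mean only through $|\theta_{j\ell}| \le \alpha$, which is how the proof obtains bounds free of $j$.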
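Proof of (2). We temporarily abbreviate $\mu[S\phi]$ by $\Phi(S)$ for any $S$ in $\mathcal{B}^1$. Then, $F_j\square_\ell = \Phi(\sigma^{-1}(I_\ell - \theta_{j\ell} + k))\prod_{i \ne \ell}\Phi(\sigma^{-1}(I_i - \theta_{ji}))$. Hence, applying the mean value theorem to $\Phi(\sigma^{-1}(I_i - \theta_{ji}))$ for $i \ne \ell$ and to $\Phi(\sigma^{-1}(I_\ell - \theta_{j\ell} + k))$, we obtain the existence of $\omega$ in $(0,1)^m$ such that
$F_j\square_\ell = (h/\sigma)^m \prod_{i=1}^{m} \phi\bigl(\sigma^{-1}(x_i - \theta_{ji} + \omega_i h + \delta_{i\ell}k)\bigr)$
where $\delta_{i\ell} = [i = \ell]$. Hence, since $-\log(\phi(u)/\phi(v)) = (u - v)(u + v)/2$, we obtain that
$-\sigma^2 \log\bigl(F_j\square_\ell / ((h/\sigma)^m p_j)\bigr) = \sum_{i=1}^{m} (\omega_i h + \delta_{i\ell}k)\bigl(x_i - \theta_{ji} + \tfrac{\omega_i h + \delta_{i\ell}k}{2}\bigr)$.
Hence, since the functions of $\omega_i$ appearing on the rhs of this equality, being convex, attain their maxima at $\omega_i = 0$ or 1, we obtain that the rhs of the last equality is exceeded by
$\sum_{i=1}^{m} \bigl\{\delta_{i\ell}k\bigl(x_i - \theta_{ji} + \tfrac{\delta_{i\ell}k}{2}\bigr)\bigr\} \vee \bigl\{(h + \delta_{i\ell}k)\bigl(x_i - \theta_{ji} + \tfrac{h + \delta_{i\ell}k}{2}\bigr)\bigr\}$
$\le \sum_{i=1}^{m} (h + \delta_{i\ell}k)\bigl(|x_i| + |\theta_{ji}| + \tfrac{h + \delta_{i\ell}k}{2}\bigr)$
$\le (k + h)\bigl(\|x\| + \sqrt{m}\,\alpha + \tfrac{mh + k}{2}\bigr)$.
Since this bound for $-\sigma^2\log(F_j\square_\ell/((h/\sigma)^m p_j))$ is independent of $j$, it also bounds $-\sigma^2\log(\bar{F}\square_\ell/((h/\sigma)^m p))$. Since this inequality is equivalent to (2), the proof of (2) is complete.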
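Proof of (3). Using the notation $\Phi(S)$ for $S$ in $\mathcal{B}^1$ introduced in the proof of (2), we have $F_j\square_\ell = \Phi(\sigma^{-1}(I_\ell - \theta_{j\ell} + k))\prod_{i \ne \ell}\Phi(\sigma^{-1}(I_i - \theta_{ji}))$ and $F_j\square'_\ell = \Phi(\sigma^{-1}(I'_\ell - \theta_{j\ell}))\prod_{i \ne \ell}\Phi(\sigma^{-1}(I_i - \theta_{ji}))$. Hence, applying the mean value theorem to $\Phi(\sigma^{-1}(I_\ell - \theta_{j\ell} + k))$ and $\Phi(\sigma^{-1}(I'_\ell - \theta_{j\ell}))$, we obtain the existence of $\omega, \omega'$ in $(0,1)$ such that
$\dfrac{F_j\square'_\ell}{F_j\square_\ell} = \dfrac{k + h}{h}\cdot\dfrac{\phi(\sigma^{-1}(x_\ell - \theta_{j\ell} + \omega'(k + h)))}{\phi(\sigma^{-1}(x_\ell - \theta_{j\ell} + k + \omega h))}$.
Hence, since $\log(\phi(u)/\phi(v)) = (v - u)(v + u)/2$, we obtain from the above equality that
$\sigma^2 \log\dfrac{h\,F_j\square'_\ell}{(k + h)\,F_j\square_\ell} = \bigl((1 - \omega')k + (\omega - \omega')h\bigr)\bigl(x_\ell - \theta_{j\ell} + \tfrac{1 + \omega'}{2}k + \tfrac{\omega + \omega'}{2}h\bigr)$.
Hence, since $0 < \omega, \omega' < 1$, we obtain from the above equality that
$\sigma^2 \log\dfrac{h\,F_j\square'_\ell}{(k + h)\,F_j\square_\ell} \le (k + h)(|x_\ell| + \alpha + k + h)$.
Since this bound is independent of $j$, it also bounds $\sigma^2\log(h\,\bar{F}\square'_\ell/((k + h)\bar{F}\square_\ell))$. This inequality is equivalent to (3). Hence the proof of (3) is completed.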
Now we bound the integrals on the rhs of (2.2). The method of bounding the first integral is essentially a generalization of that given in Chapter III of Gilliland (1966). We get a simpler method of bounding this integral because of the definition of $\psi^{**}$ in (2.1). This definition of $\psi^{**}$ differs from that of a similar function introduced by Gilliland (1966). The method of bounding the second integral of the rhs of (2.2) differs from that of Gilliland (1966). Let $c_1, c_2, \ldots$ denote finite functions of $\sigma^2$. Let
$K = \{k \mid 0 < k < (5 + \sigma^{-2}(2\alpha + 1))^{-1}\}$.
Lemma 4. If $k$ is in $K$, then
$\mathbf{P}_n[\,\|\psi^{*} - (X + \sigma^2 t(X))\|\,] \le c_1\Bigl(\dfrac{k + h}{n k^2 h^{m+1}}\Bigr)^{1/2} + c_2 (n h^m)^{-1/2}$.
Proof. Since the lhs is the sum of $\mathbf{P}_n$-integrals of the moduli of the coordinates of $\psi^{*} - X - \sigma^2 t(X)$, the lemma will be proved by showing that these integrals are bounded by rhs/m.
Let the dependency of $t$ on $X$ be suppressed and $X_1,\ldots,X_m$ denote the coordinates of $X$. We abbreviate in this proof the $\ell$th coordinates of $\psi^{*}$ and $t$ by omission. Let $\alpha'$ denote $2(\alpha + k + h)$.
Since $\psi^{*}$, by definition (2.1), is the retraction of $X_\ell + \sigma^2 t^*$ to $[-\alpha - k - h,\ \alpha + k + h]$ and since $X_\ell + \sigma^2 t$, by (1) of Lemma 3, is in $[-\alpha - \tfrac{k}{2} - h,\ \alpha]$, it follows that $|\psi^{*} - X_\ell - \sigma^2 t| \le \alpha'$ and $|\psi^{*} - X_\ell - \sigma^2 t| \le \sigma^2|t^* - t|$. Therefore,
(2.3)   $\mathbf{P}_{n-1}[\,|\psi^{*} - X_\ell - \sigma^2 t|\,] \le \int_0^{\alpha'} \mathbf{P}_{n-1}[\,|\psi^{*} - X_\ell - \sigma^2 t| > u\,]\,du \le \int_0^{\alpha'} \mathbf{P}_{n-1}[\sigma^2|t^* - t| > u]\,du = \int_0^{\alpha'} \mathbf{P}_{n-1}[\sigma^2(t^* - t) > u]\,du + \int_{-\alpha'}^{0} \mathbf{P}_{n-1}[\sigma^2(t^* - t) < u]\,du$.
The main part of the proof bounds $\mathbf{P}_{n-1}[\sigma^2(t^* - t) > u]$ for $0 \le u \le \alpha'$ and $\mathbf{P}_{n-1}[\sigma^2(t^* - t) < u]$ for $-\alpha' \le u < 0$ by using the Berry-Esseen theorem. The rest of the proof shows that the $P_n$-integral of $m$ times the bound for the rhs of (2.3) is exceeded by the bound in the lemma.
Let $X$ be fixed until otherwise stated. Let $\delta_j = [X_j \in \square_\ell]$, $\varepsilon_j = [X_j \in \square]$ and
(2.4)   $Y_j(u) = \delta_j - \varepsilon_j\,e^{k(t + u\sigma^{-2})}$
for $|u| \le \alpha'$.
Let the dependency of $Y_j$ on $u$ be suppressed hereafter. Let $\beta^2 = \mathrm{Var}(\sum Y_j)$ and $L = \beta^{-3}\sum P_j|Y_j - P_jY_j|^3$ where $\sum$ stands for summation over $j$ from 1 to $n-1$.
Sublemma. For ‘u‘ S a',
-2
kg (a+a'+‘xL‘)
2 E °
2__e
C4 (n-l)
‘P
141ij 2 0] - Me 1213an s
(RIB)
Proof. With $B$ denoting the Berry-Esseen constant, the Berry-Esseen theorem (Loeve (1963), p. 288) implies that $|\mathbf{P}_{n-1}[\sum Y_j \ge 0] - \Phi(\beta^{-1}\sum P_jY_j)|$ is exceeded by $BL$. Hence, we complete the proof of the sublemma by showing that $L$ is exceeded by $B^{-1}$ times the bound of the sublemma. In order to get a bound on $L$, we first get a lower bound on $\beta^2$.
By applying L1.A (see Appendix) to the $Y_j$, we obtain a lower bound for $\beta^2$. We observe that $Y_j$ defined by (2.4) takes three values; namely
(2.5)   $0$, $1$ and $-e^{k(t + u\sigma^{-2})}$
The definitions of $t^*$ and $Y_j$ imply that $[\sigma^2(t^* - t) > u] \le [\sum Y_j \ge 0]$. Hence, by the sublemma, it follows that $\mathbf{P}_{n-1}[\sigma^2(t^* - t) > u]$ is exceeded by
(2.10)   $\Phi(\beta^{-1}\sum P_jY_j)$ + bound in the sublemma.
Since $e^{kt} = \bar{F}\square_\ell/\bar{F}\square$ by the definition of $t$, $\sum P_jY_j = (n-1)\bar{F}\square_\ell(1 - \exp k\sigma^{-2}u) \le -(n-1)k\sigma^{-2}\bar{F}\square_\ell\,u$. Therefore, by using the upper bound for $\beta$ in (2.9), we obtain that $\Phi(\beta^{-1}\sum P_jY_j) \le \Phi(-((n-1)h^m k^2)^{1/2} f u)$ where $f$ is the positive solution of the equation
(2.11)   $\sigma^{-4} h^m\,\bar{F}\square'_\ell\,e^{2k\sigma^{-2}(\alpha + \alpha' + |X_\ell|)} f^2 = (\bar{F}\square_\ell)^2$.
Therefore, since (2.10) is a bound for $\mathbf{P}_{n-1}[\sigma^2(t^* - t) > u]$, we have, for $0 \le u \le \alpha'$,
(2.12)   $\mathbf{P}_{n-1}[\sigma^2(t^* - t) > u] \le \Phi(-((n-1)h^m k^2)^{1/2} f u)$ + bound in the sublemma.
Now we consider bounding the probability $\mathbf{P}_{n-1}[\sigma^2(t^* - t) < u]$ for $-\alpha' \le u < 0$. The definitions of $t^*$ and $Y_j$ imply that $[\sigma^2(t^* - t) < u] \le [\sum Y_j \le 0] = [\sum -Y_j \ge 0]$. Since the sublemma continues to hold when $\delta_j$ and $\varepsilon_j$ in the definition of $Y_j$ are replaced by $-\delta_j$ and $-\varepsilon_j$ respectively, we obtain, by applying the sublemma to $\mathbf{P}_{n-1}[\sum -Y_j \ge 0]$, that $\mathbf{P}_{n-1}[\sigma^2(t^* - t) < u]$ is at most
(2.13)   $\Phi(-\beta^{-1}\sum P_jY_j)$ + bound in the sublemma.
Again since $\sum P_jY_j = (n-1)\bar{F}\square_\ell(1 - \exp k\sigma^{-2}u) \ge -2^{-1}(n-1)k\sigma^{-2}\bar{F}\square_\ell\,u$ where the inequality follows since $k\sigma^{-2}\alpha' < 1$ by the hypothesis on $k$, we obtain, by using the upper bound for $\beta^2$ in (2.9) and the definition of $f$ in (2.11), that $\Phi(-\beta^{-1}\sum P_jY_j)$ is exceeded by $\Phi(2^{-1}((n-1)k^2h^m)^{1/2} f u)$. Therefore
$\mathbf{P}_{n-1}[\sigma^2(t^* - t) < u] \le \Phi(2^{-1}((n-1)k^2h^m)^{1/2} f u)$ + bound in the sublemma.
Integrating this inequality wrt $u$ on $[-\alpha', 0)$ and the inequality (2.12) wrt $u$ on $[0, \alpha']$, then bounding their first terms by using the inequality $\int_0^{\alpha'}\Phi(-au)\,du \le (2\pi)^{-1/2}a^{-1}$ for any $a > 0$, we obtain, by using the inequality (2.3), that
$\mathbf{P}_{n-1}[\,|\psi^{*} - X_\ell - \sigma^2 t|\,] \le \dfrac{3}{\sqrt{2\pi}}\cdot\dfrac{1}{((n-1)k^2h^m)^{1/2} f} + 2\alpha'$ (bound in the sublemma).
Hence we complete the proof of the lemma by showing below that the $P_n$-integrals of $m(h(k+h)^{-1})^{1/2}f^{-1}$ and $m(nh^m)^{1/2}$ (bound in the sublemma) are uniformly bounded in $n$.
By definition of $f$ in (2.11), we have
$f^{-1} = \sigma^{-2}\Bigl(\dfrac{\bar{F}\square'_\ell}{\bar{F}\square_\ell}\Bigr)^{1/2}\Bigl(\dfrac{h^m}{\bar{F}\square_\ell}\Bigr)^{1/2} e^{k\sigma^{-2}(\alpha + \alpha' + |X_\ell|)}$.
By bounding above $(\bar{F}\square'_\ell/\bar{F}\square_\ell)^{1/2}$ by using (3) of Lemma 3 and by bounding below $\bar{F}\square_\ell/h^m$ by using (2) of Lemma 3, we get an upper bound for $(h(k+h)^{-1})^{1/2}f^{-1}$. Weakening this upper bound for $(h(k+h)^{-1})^{1/2}f^{-1}$ by using the fact that $0 < h \le k < 1/5$, we obtain that
$\Bigl(\dfrac{h}{k+h}\Bigr)^{1/2} f^{-1} \le c_5\,p^{-1/2}\exp\{\tfrac{\sigma^{-2}}{5}(3|X_\ell| + \|X\|)\}$
for some $c_5$. Since $(2\pi\sigma^2)^m p_n^2 \le \exp(-\sigma^{-2}((|X| - \alpha)^+)^2)$ and $(2\pi\sigma^2)^m p^2 \ge \exp(-\sigma^{-2}(\alpha + |X|)^2)$, we obtain that the above upper bound for $(h(k+h)^{-1})^{1/2}f^{-1}$ is uniformly bounded in $n$ and $P_n$-integrable. Now by using (2) of Lemma 3, and the inequality $0 < h \le k < 1/5$, we obtain that $(nh^m)^{1/2}$ (bound in the sublemma) is exceeded by
$c_6\,p^{-1/2}\,e^{\sigma^{-2}(|X_\ell| + \|X\|)}$
for some $c_6$. Again, since $(2\pi\sigma^2)^m p_n^2 \le \exp(-\sigma^{-2}((|X| - \alpha)^+)^2)$ and $(2\pi\sigma^2)^m p^2 \ge \exp(-\sigma^{-2}(\alpha + |X|)^2)$, we obtain that the $P_n$-integral of the above upper bound for $(nh^m)^{1/2}$ (bound in the sublemma) is uniformly bounded in $n$. This completes the proof of the lemma.
The next lemma is a slight generalization of a particular case of Cauchy's mean value theorem (Graves (1946), p. 81).
Lemma 5. For each $j = 1,\ldots,n-1$, $i = 1,\ldots,m$, let the functions $f_{ji}$, $g_{ji}$ be real valued, continuous on $[a_i, b_i]$ and differentiable on $(a_i, b_i)$ and let the derivative of $g_{ji}$ be finite and positive. Then there exist $c_1$ in $(a_1, b_1), \ldots, c_m$ in $(a_m, b_m)$ such that
$\dfrac{\sum\prod f_{ji}]_{a_i}^{b_i}}{\sum\prod g_{ji}]_{a_i}^{b_i}} = \dfrac{\sum\prod f'_{ji}(c_i)}{\sum\prod g'_{ji}(c_i)}$
where $\sum$ stands for the summation over $j$ from 1 through $n-1$, $\prod$ stands for product over $i$ from 1 through $m$ and prime over any function denotes its derivative.
Proof. Define the functions $\xi_1$ and $\eta_1$ on $[a_1, b_1]$ as follows.
$\xi_1(x) = \sum f_{j1}(x)\prod_{i=2}^{m} f_{ji}]_{a_i}^{b_i}$
and
$\eta_1(x) = \sum g_{j1}(x)\prod_{i=2}^{m} g_{ji}]_{a_i}^{b_i}$
for $x$ in $[a_1, b_1]$.
With these definitions, we obtain that
(2.14)   $\dfrac{\sum\prod f_{ji}]_{a_i}^{b_i}}{\sum\prod g_{ji}]_{a_i}^{b_i}} = \dfrac{\xi_1]_{a_1}^{b_1}}{\eta_1]_{a_1}^{b_1}}$.
Since $f_{j1}$ and $g_{j1}$ are continuous on $[a_1, b_1]$ and differentiable on $(a_1, b_1)$ for all $j$, so are $\xi_1$ and $\eta_1$. Moreover, since the derivative of $g_{ji}$ is finite and positive for all $j$ and $i$ by assumption, so is the derivative of $\eta_1$. Hence, applying Cauchy's mean value theorem to the rhs of (2.14), we obtain that there exists $c_1$ in $(a_1, b_1)$ such that
(2.15)   $\dfrac{\sum\prod f_{ji}]_{a_i}^{b_i}}{\sum\prod g_{ji}]_{a_i}^{b_i}} = \dfrac{\xi'_1(c_1)}{\eta'_1(c_1)}$.
Now, we define $\xi_2$ and $\eta_2$ on $[a_2, b_2]$ as follows.
$\xi_2(x) = \sum f'_{j1}(c_1) f_{j2}(x)\prod_{i=3}^{m} f_{ji}]_{a_i}^{b_i}$
and
$\eta_2(x) = \sum g'_{j1}(c_1) g_{j2}(x)\prod_{i=3}^{m} g_{ji}]_{a_i}^{b_i}$
for $x$ in $[a_2, b_2]$. Then it follows that the ratio $\xi'_1(c_1)/\eta'_1(c_1)$ is identically the ratio $\xi_2]_{a_2}^{b_2}/\eta_2]_{a_2}^{b_2}$. Again, $\xi_2$ and $\eta_2$ are continuous on $[a_2, b_2]$ and differentiable on $(a_2, b_2)$ since $f_{j2}$, $g_{j2}$ are continuous on $[a_2, b_2]$ and differentiable on $(a_2, b_2)$ for all $j$. Also, since the derivative of $g_{ji}$ is finite and positive for all $j$ and $i$, the derivative of $\eta_2$ is finite and positive. Therefore, again using Cauchy's mean value theorem, the definitions of $\xi_2$ and $\eta_2$ and (2.15), we obtain the existence of $c_2$ in $(a_2, b_2)$ such that
(2.16)   $\dfrac{\sum\prod f_{ji}]_{a_i}^{b_i}}{\sum\prod g_{ji}]_{a_i}^{b_i}} = \dfrac{\xi'_2(c_2)}{\eta'_2(c_2)}$.
Iterating the above procedure of obtaining (2.16) from (2.15) $(m-2)$ times, we obtain the result of the lemma.
We apply this lemma to prove the following lemma.
Lemma 6. $|\ell\text{th coordinate of } X + \sigma^2 t - \psi| \le k\bigl(1 + \tfrac{\alpha^2}{\sigma^2}\bigr) + h\bigl(1 + m\tfrac{\alpha^2}{\sigma^2}\bigr)$ for $\ell = 1,\ldots,m$.
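Proof. Let the dependency of $t$ on $X$ be suppressed and abbreviate the indication of the $\ell$th coordinates of $t$ and $\psi$ by omission.
Let $H$ abbreviate $h^{-m}\bar{F}\square$ and $e_\ell$ denote the unit vector in the $\ell$th direction. Since $t = k^{-1}(\log H(X + k e_\ell) - \log H(X))$, by the mean value theorem, there exists $\varepsilon$ in $(0,1)$ such that
(2.17)   $t = \dfrac{\partial \log H}{\partial X_\ell}(X + \varepsilon k e_\ell)$.
Since $\psi - X_\ell = \sigma^2\,\partial \log p/\partial X_\ell$, the above equality together with the triangle inequality implies that
(2.18)   $|X_\ell + \sigma^2 t - \psi| \le \sigma^2(|I_1| + |I_2|)$
where
(2.19)   $I_1 = \dfrac{\partial \log p}{\partial X_\ell}(X + \varepsilon k e_\ell) - \dfrac{\partial \log p}{\partial X_\ell}(X)$
and
(2.20)   $I_2 = \dfrac{\partial \log H}{\partial X_\ell}(X + \varepsilon k e_\ell) - \dfrac{\partial \log p}{\partial X_\ell}(X + \varepsilon k e_\ell)$.
By the mean value theorem, $I_1 = \varepsilon k(\partial^2 \log p/\partial X_\ell^2)(X + \varepsilon^* k e_\ell)$ for some $\varepsilon^*$ in $(0, \varepsilon)$. With $\theta_{j1},\ldots,\theta_{jm}$ denoting the coordinates of $\theta_j$, we have
$\sigma^2\Bigl(1 + \sigma^2\dfrac{\partial^2 \log p}{\partial X_\ell^2}\Bigr) = \dfrac{\sum (X_\ell - \theta_{j\ell})^2 p_j}{\sum p_j} - \Bigl(\dfrac{\sum (X_\ell - \theta_{j\ell}) p_j}{\sum p_j}\Bigr)^2$.
The rhs of this equality can be recognized as the conditional variance of the $\ell$th coordinate of $X - \theta$ given $X$ when the pair $(\theta, X)$ has the joint distribution resulting from $G_{n-1}$ on $\theta$ and $P_\theta$ on $X$ for given $\theta$. Hence, since the support of $G_{n-1}$ is in the m-sphere of radius $\alpha$, we obtain that
(2.21)   $\sigma^2\Bigl|\dfrac{\partial^2 \log p}{\partial X_\ell^2}\Bigr| \le 1 + \dfrac{\alpha^2}{\sigma^2}$.
Hence
(2.22)   $\sigma^2|I_1| \le k\bigl(1 + \tfrac{\alpha^2}{\sigma^2}\bigr)$.
We complete the proof of the lemma by showing that $\sigma^2|I_2| \le h(1 + m\alpha^2\sigma^{-2})$ with the help of Lemma 5.
The definition of $H$ gives
(2.23)   $(n-1)h^m H = \sum F_j\square$
where, since the coordinates of $X_j$ are independent,
(2.24)   $F_j\square = \prod_{i=1}^{m} \Phi]_{(X_i - \theta_{ji})/\sigma}^{(X_i - \theta_{ji} + h)/\sigma}$.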
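Therefore,
(2.25)   $\dfrac{\partial F_j\square}{\partial X_\ell} = \sigma^{-1}\,\phi]_{(X_\ell - \theta_{j\ell})/\sigma}^{(X_\ell - \theta_{j\ell} + h)/\sigma}\prod_{i \ne \ell}\Phi]_{(X_i - \theta_{ji})/\sigma}^{(X_i - \theta_{ji} + h)/\sigma}$.
Now we apply Lemma 5 to the ratio $(\partial H/\partial X_\ell)/H$ obtained by using (2.23), (2.24) and (2.25) with the following identification. For all $j = 1,\ldots,n-1$, $f_{ji} = g_{ji} = \Phi(\sigma^{-1}(y - \theta_{ji}))$ for $i \ne \ell$, $f_{j\ell} = \phi(\sigma^{-1}(y - \theta_{j\ell}))$, $g_{j\ell} = \Phi(\sigma^{-1}(y - \theta_{j\ell}))$ and $(a_i, b_i) = (X_i, X_i + h)$ for all $i$. Then there exists an $\omega$ in $(0,1)^m$ such that
$\dfrac{\partial \log H}{\partial X_\ell} = \dfrac{\partial \log p}{\partial X_\ell}(X + h\omega)$.
By subtracting $\partial \log p/\partial X_\ell$ and then applying the mean value theorem to this function of $h$, we obtain the existence of $h'$ in $(0, h)$ such that
(2.26)   $\dfrac{\partial \log H}{\partial X_\ell} - \dfrac{\partial \log p}{\partial X_\ell} = h\sum_{i=1}^{m} \omega_i\,\dfrac{\partial^2 \log p}{\partial X_i\,\partial X_\ell}(X + h'\omega)$.
For $i \ne \ell$, we obtain directly that
$\sigma^4\dfrac{\partial^2 \log p}{\partial X_i\,\partial X_\ell} = \dfrac{\sum (\theta_{j\ell} - X_\ell)(\theta_{ji} - X_i)p_j}{\sum p_j} - \dfrac{\sum (\theta_{j\ell} - X_\ell)p_j}{\sum p_j}\cdot\dfrac{\sum (\theta_{ji} - X_i)p_j}{\sum p_j}$.
The rhs of this equality can be recognized as the $i,\ell$th element in the covariance matrix of $\theta - X$ conditional on $X$ when the joint distribution of $(\theta, X)$ results from $G_{n-1}$ on $\theta$ and $P_\theta$ on $X$ for given $\theta$. Hence, since the support of $G_{n-1}$ lies in the m-sphere of radius $\alpha$, it follows by Schwarz's inequality that
$\sigma^4\Bigl|\dfrac{\partial^2 \log p}{\partial X_i\,\partial X_\ell}\Bigr| \le \alpha^2$ for $i \ne \ell$.
This inequality, together with (2.21) and (2.26), implies that
$\sigma^2\Bigl|\dfrac{\partial \log H}{\partial X_\ell} - \dfrac{\partial \log p}{\partial X_\ell}\Bigr| \le h\bigl(1 + m\tfrac{\alpha^2}{\sigma^2}\bigr)$.
Thus, by (2.20), $\sigma^2|I_2| \le h(1 + m\alpha^2\sigma^{-2})$ and the proof of the lemma is complete.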
Before stating a theorem as a Corollary to Lemmas 2, 4 and 6, we make a remark on the proof of Lemma 6.
Remark 1. The method of proof of the lemma differs much from that of Gilliland (1966) for the $m = 1$ case. He has never used the fact that the conditional variances and covariances are uniformly bounded by explicit functions of $\alpha^2$. Moreover, the constants multiplying $k$ and $h$ in the result of the lemma are specific functions of $\alpha$ while those of Gilliland are complicated integrals. A proof similar to the proof obtained by particularizing our proof to $m = 1$ is simpler than that of Gilliland.
In the rest of the section, we let $h$ and $k$ depend on $n$. We assume in the theorem to be stated below that $\sigma^2 = 1$. The choices of $h$ and $k$ given in the following theorem are optimal for the convergence to 0 of the expression obtained by adding the right hand sides of Lemmas 4 and 6.
Theorem 1. If $h = n^{-1/(m+4)}$, $k = a\,n^{-1/(m+4)}$ for $a$ in $[1,\infty)$ and $\underline{\psi}^{**}$ is defined by (2.1), then
$\mathbf{P}_n[\,\|\psi^{**} - \psi\|\,] = O(n^{-1/(m+4)})$
and
$D_n(\underline{\theta},\underline{\psi}^{**}) = O(n^{-1/(m+4)})$.
Proof. The first result is a direct consequence of (2.2), Lemmas 4 and 6 and the definitions of $h$ and $k$. Since, by definition, $\underline{\psi}^{**}$ is in $\times\,[-\alpha,+\alpha]^m$, the second result follows from the first result and Lemma 2 with $\sigma^2 = 1$.
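The choice of $h$ and $k$ in Theorem 1 can be understood as balancing exponents of $n$. The sketch below assumes $h$ and $k$ are both of order $n^{e}$ and tracks the resulting exponents of the two terms in Lemma 4 and of the term $k + h$ in Lemma 6; the function name is illustrative.

```python
from fractions import Fraction


def bound_exponents(m, e):
    """Exponents of n in the three bound terms when h and k are of order n**e."""
    e = Fraction(e)
    lemma4_first = (-1 - (m + 2) * e) / 2   # ((k + h) / (n k^2 h^(m+1)))^(1/2)
    lemma4_second = (-1 - m * e) / 2        # (n h^m)^(-1/2)
    lemma6 = e                              # k + h
    return lemma4_first, lemma4_second, lemma6


def overall_exponent(m, e):
    """The slowest term governs the rate, so take the largest exponent."""
    return max(bound_exponents(m, e))
```

With $e = -1/(m+4)$ the first Lemma 4 term and the Lemma 6 term both have exponent $-1/(m+4)$ while the second Lemma 4 term is of smaller order; any other $e$ makes one of the two balanced terms decay more slowly.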
§1.3 Rates Near $O(n^{-1/4})$ for $D_n(\underline{\theta},\hat{\underline{\psi}})$ with $\hat{\underline{\psi}}$ Based on Kernel Estimators for a Density and its Derivative
In this section, for each positive integer $s$ and $\gamma$ in $(0,1)$, we exhibit a procedure $\hat{\underline{\psi}}$ belonging to a class of procedures whose modified regret $D_n(\underline{\theta},\hat{\underline{\psi}})$ is $O(n^{-(s-1)\gamma/[(2s+m)(1+\gamma)]})$. The definition of $\hat{\underline{\psi}}$ depends on kernel estimators for a density and its derivative. These kernel estimators are similar to those defined by Johns and Van Ryzin (1967) for estimating the unconditional density and its derivative in the empirical Bayes two-action problem in exponential families.
For L = 0,1,...,m, let $K_L$ be bounded with
$\mu[\|u\|^s |K_L|] = c_{Ls} < \infty$ and for all nonnegative integers
$t_1,\ldots,t_m$,
(3.1) $\mu[\prod_{j=1}^m u_j^{t_j} K_0] = 1$ or $0$ as $\sum t_j = 0$ or in $\{1,\ldots,s-1\}$
and, for $1 \le L \le m$, $u_L K_L$ satisfies (3.1) with s replaced
by s-1.
As a result of these conditions on $K_0,\ldots,K_m$ and their
intent, if f is a function on $R^m$ with partials of order s
uniformly bounded by M, then the substitution of the sth order
Taylor expansion with Lagrange's form of the remainder shows
(3.2) $|\mu[f K_0] - f(0)| \le M c_{0s}$
and if, in addition, all partials of f not involving the Lth
variable vanish at 0,
(3.3) $|\mu[f K_L] - f_L(0)| \le M c_{Ls}$
where $f_L$ stands for the first partial of f wrt the Lth
variable.
The notation to be introduced below is defined for each
n. We abbreviate by omission the dependency on n of the functions
to be defined below. We let $\sum$ denote summation over
j from 1 to n-1. Let $\epsilon, \delta$ be positive. As in section
§1.2, let X abbreviate $X_n$. Define
(3.4) $\hat p_j = \epsilon^{-m} K_0(\epsilon^{-1}(X_j - X)), \quad (n-1)\hat p = \sum \hat p_j$
and $\hat q = \langle \hat q_L \rangle$ where
- mﬁl l -l
3.5 - = . lth A =—1( .-x _
< > (n 1>aL z aLJ w 6 qu>
-1
KL“ (35]. X))
where IL is the m X m identity matrix reduced by l/2 in the
Lth diagonal element.
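The density estimator in (3.4) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, and the box kernel on $[-1/2, 1/2]^m$ is only one convenient choice (it integrates to 1 and has vanishing first moments, so it meets (3.1) with s = 2); the thesis does not prescribe this kernel.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def box_kernel(u: Vector) -> float:
    """Indicator of the cube [-1/2, 1/2]^m (assumed kernel, not the thesis's)."""
    return 1.0 if all(abs(c) <= 0.5 for c in u) else 0.0


def kernel_density_estimate(
    past: Sequence[Vector],
    x: Vector,
    eps: float,
    kernel: Callable[[Vector], float] = box_kernel,
) -> float:
    """Average over past observations X_j of eps^(-m) * K0((X_j - x) / eps)."""
    if not past:
        raise ValueError("need at least one past observation")
    m = len(x)
    total = 0.0
    for xj in past:
        # Rescaled offset of the past observation from the current point x.
        u = [(a - b) / eps for a, b in zip(xj, x)]
        total += kernel(u) / eps ** m
    return total / len(past)


if __name__ == "__main__":
    observations = [(0.0, 0.0), (0.1, -0.1), (3.0, 3.0)]
    print(kernel_density_estimate(observations, (0.0, 0.0), eps=0.5))
```

A smaller `eps` reduces the bias term and inflates the variance term, which is the trade-off quantified in the lemmas that follow.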
Now we state and prove some lemmas which will be useful
in obtaining a rate of convergence for the modified regret of
a certain procedure $\underline{\hat\psi}$ to be defined in the latter part of the
section. Let $c_1, c_2, \ldots$ denote finite functions of $\sigma^2$. In
the following lemmas, $\bar p$, the average of the densities of
$X_1,\ldots,X_{n-1}$, and $\bar q$, the vector of partial derivatives of $\bar p$,
are evaluated at X. We do not require the condition that
$|\theta_n| \le \alpha$ to prove Lemmas 7 and 8.
Lemma 7. $P_{n-1}[|\hat p - \bar p|] \le c_1(\epsilon^s + ((n-1)\epsilon^m)^{-1/2})$.
Proof. Since $\mu[p_j\,\epsilon^{-m}K_0(\epsilon^{-1}(\cdot - X))] = \mu[p_j(X + \epsilon\,\cdot)K_0]$, its
absolute difference from $p_j(X)$, by the uniform boundedness of
partials of order s of $p_j$ and (3.2), is at most $c_2\epsilon^s$.
Hence
(3.6) $|P_{n-1}[\hat p] - \bar p| \le c_2\epsilon^s$.
Let $V_X(\hat p)$ denote the conditional variance of $\hat p$
given X. Since
    $\mu[p_j\,\epsilon^{-2m}K_0^2(\epsilon^{-1}(\cdot - X))] = \epsilon^{-m}\mu[p_j(X + \epsilon\,\cdot)K_0^2] \le \epsilon^{-m}(2\pi\sigma^2)^{-m/2}\mu[K_0^2]$
and $\mu[K_0^2] < \infty$,
(3.7) $V_X(\hat p) \le c_3((n-1)\epsilon^m)^{-1}$.
Since for any random variable R, $E|R| \le |ER| + \mathrm{Var}^{1/2}(R)$,
(3.6) and (3.7) will yield the bound in the lemma with
$c_1 = c_2 \vee c_3$.
Since $\sigma^2\|\bar q\|/\bar p \le \sqrt{m}\,\alpha + \|X\|$ and since $|\theta_n| \le \alpha$
implies that $P_n[\|X\|]$ is uniformly bounded, the following
corollary is a direct consequence of Lemma 7.
Corollary 1. $P_n[\|\bar q\|\,|(\hat p/\bar p) - 1|] \le c_4(\epsilon^s + ((n-1)\epsilon^m)^{-1/2})$.
Lemma 8. $P_n[\|\hat q - \bar q\|] \le c_5(\delta^{s-1} + ((n-1)\delta^{m+2})^{-1/2})$.
Proof. In this proof, we abbreviate by omission the indication
of the Lth coordinates of $\hat q$ and $\bar q$. Since, by two usages of
the transformation theorem,
    $\mu[p_j\,\hat q_j] = \delta^{-1}\mu[K_L(p_j(X + I_L\delta\,\cdot) - p_j(X + \delta\,\cdot))]$,
its absolute difference from the partial derivative of $p_j$ wrt
the Lth coordinate, by the uniform boundedness of partials of
order s of $p_j$ and (3.3), is at most $c_6\delta^{s-1}$. Hence,
(3.8) $|P_{n-1}[\hat q] - \bar q| \le c_6\delta^{s-1}$.
Let $V_X(\hat q)$ denote the conditional variance of $\hat q$
given X. By the inequality $(a + b)^2 \le 2(a^2 + b^2)$ for a, b
in $R^1$ and the transformations as above, we have
    $\mu[p_j(\hat q_{Lj})^2] \le (\delta^{m+2})^{-1}\mu[K_L^2(2p_j(X + I_L\delta\,\cdot) + 2p_j(X + \delta\,\cdot))]$.
Hence, since $\mu[K_L^2] < \infty$,
(3.9) $V_X(\hat q) \le c_7((n-1)\delta^{m+2})^{-1}$.
Since for any random variable R, $E|R| \le |ER| + \mathrm{Var}^{1/2}(R)$,
inequalities (3.8) and (3.9) yield the bound in the lemma with
$c_5 = c_6 \vee c_7$.
Lemma 9. For any a in (0,1), there exists a finite function
of $\sigma^2$, $c_8$, such that
' a
Pn[p*** Let M be the minimum value of ‘Zl
for which rhs of (3.10) s 8' Since, for all t,
/2
P[|z|2/2 > t] s e'bt(1-b)'” for b in (0,1), we get from
(3.10) that
a -1 2 2
_a _ _a _a -§(M+Zo‘ a) -bM /2 -m/2
6 Pulp < B] s B Pn[\z\ > M] s c e (l-b)
which is bounded in M for b > a.
Corollary 2. For any a in (0,1), there exists a function of
$\sigma^2$, $c_9$, such that
    $P_n[(\|\bar q\|/\bar p)[\bar p < \beta]] \le c_9\beta^a$.
Proof. Since $\sigma^2\|\bar q\|/\bar p \le \sqrt{m}\,\alpha + \|X\|$ and, therefore, has all
moments, Hölder's inequality yields, for any r > 1, the bound
    $P_n^{(r-1)/r}[(\|\bar q\|/\bar p)^{r/(r-1)}]\;P_n^{1/r}[\bar p < \beta]$
for the lhs of the corollary. By Lemma 9, $P_n^{1/r}[\bar p < \beta] \le c_8^{1/r}\beta^{b/r}$
for b in (a,1). Choosing r such that a r = b, we get
the result of the corollary.
Henceforth, we take $\delta$ to minimize the bound in Lemma
8. That is,
(3.11) $\delta^{2s+m} = (n-1)^{-1}$.
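As a quick numerical check of (3.11), the sketch below evaluates the two terms of the Lemma 8 bound, $\delta^{s-1}$ and $((n-1)\delta^{m+2})^{-1/2}$, at the $\delta$ solving $\delta^{2s+m} = (n-1)^{-1}$. It assumes that reading of the lemma and of (3.11); the function names are hypothetical. Equating the two terms fixes the order of the bound; the exact minimizer may differ by a constant factor.

```python
import math
from typing import Tuple


def balanced_delta(n: int, s: int, m: int) -> float:
    """Bandwidth solving delta^(2s+m) = 1/(n-1), as in (3.11)."""
    return (n - 1) ** (-1.0 / (2 * s + m))


def lemma8_terms(n: int, s: int, m: int, delta: float) -> Tuple[float, float]:
    """Bias-type term delta^(s-1) and variance-type term ((n-1) delta^(m+2))^(-1/2)."""
    bias = delta ** (s - 1)
    noise = ((n - 1) * delta ** (m + 2)) ** -0.5
    return bias, noise


if __name__ == "__main__":
    n, s, m = 1001, 3, 2
    bias, noise = lemma8_terms(n, s, m, balanced_delta(n, s, m))
    # At the balanced bandwidth the two terms coincide.
    print(bias, noise, math.isclose(bias, noise))
```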
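We also choose $\epsilon$ to be such that
(3.12) $\delta^{(m+2)/m} \le \epsilon \le \delta^{(s-1)/s}$.
Let $\beta$ be defined by
    $\beta^{1+\gamma} = \delta^{s-1}$ for any $\gamma$ in (0,1).
With these choices for $\epsilon$, $\delta$ and $\beta$, we define $\underline{\hat\psi}$ as
follows. Let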
(3.13) $\hat\psi = \mathrm{tr}'(X + \sigma^2\,\hat q/\hat p')$
where tr' stands for retraction to $[-\alpha,+\alpha]^m$ and for y in
$R^1$, let $y' = y \vee \beta$.
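A minimal sketch of the estimator (3.13), assuming the reading $\hat\psi = \mathrm{tr}'(X + \sigma^2\hat q/(\hat p \vee \beta))$ with tr' acting as coordinatewise clipping to $[-\alpha, +\alpha]$. The function names are hypothetical, and the density and derivative estimates are taken as given inputs.

```python
from typing import List, Sequence


def retract(y: Sequence[float], alpha: float) -> List[float]:
    """Coordinatewise retraction of y onto the cube [-alpha, alpha]^m."""
    return [max(-alpha, min(alpha, c)) for c in y]


def truncated_estimate(
    x: Sequence[float],
    q_hat: Sequence[float],
    p_hat: float,
    beta: float,
    alpha: float,
    sigma2: float = 1.0,
) -> List[float]:
    """Retraction of x + sigma2 * q_hat / max(p_hat, beta)."""
    p_floor = max(p_hat, beta)  # y' = y v beta keeps the ratio bounded
    shifted = [xi + sigma2 * qi / p_floor for xi, qi in zip(x, q_hat)]
    return retract(shifted, alpha)


if __name__ == "__main__":
    print(truncated_estimate([0.5, -3.0], [1.0, 0.0], p_hat=0.001, beta=0.5, alpha=2.0))
```

Flooring the denominator at $\beta$ is what lets the proof of Lemma 10 use the inequality $\hat p' \ge \beta$.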
In the following lemma, $\psi$ is evaluated at X.
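Lemma 10. For each positive integer s and $\gamma$ in (0,1), there
exists $c_{10}$ such that if $\delta^{2s+m} = (n-1)^{-1}$,
$\delta^{(m+2)/m} \le \epsilon \le \delta^{(s-1)/s}$ and $\beta^{1+\gamma} = \delta^{s-1}$, then
    $P_n[\|\hat\psi - \psi\|] \le c_{10}(n-1)^{-\frac{s-1}{2s+m}\cdot\frac{\gamma}{1+\gamma}}$ for each n > 1.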
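Proof. Since $\psi$ lies in the m-sphere of radius $\alpha$ and $\hat\psi$ is
the retraction of $X + \sigma^2\hat q/\hat p'$ to $[-\alpha,+\alpha]^m$, we have by using
the inequality $\hat p' \ge \beta$
    $\sigma^{-2}\|\hat\psi - \psi\| \le \|\hat q/\hat p' - \bar q/\bar p\| \le \beta^{-1}\{\|\hat q - \bar q\| + (\|\bar q\|/\bar p)\,|\hat p' - \bar p|\}$.
Since $|\hat p' - \bar p| \le |\hat p - \bar p| + \beta[\bar p < \beta]$, the result of the lemma
follows from the above inequality, Lemma 8, Lemma 7 and Corollary
2 and the hypothesis on $\epsilon$, $\delta$ and $\beta$.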
Now we state the main result of this section.
Theorem 2. If $\sigma^2 = 1$, the hypothesis of Lemma 10 is satisfied
and $\underline{\hat\psi}$ is defined by (3.13), then
    $D_n(\underline{\theta},\underline{\hat\psi}) = O(n^{-\frac{s-1}{2s+m}\cdot\frac{\gamma}{1+\gamma}})$.
Proof. Since $\underline{\hat\psi}$, by definition (3.13), lies in $\times_n [-\alpha,+\alpha]^m$,
the theorem is a consequence of Lemma 2 with $\sigma^2 = 1$ and Lemma 10.
§1.4 Rates Near $O(n^{-1/2})$ for $D_n(\underline{\theta},\underline{\hat\phi})$ where $\underline{\hat\phi}$ is a Particular $\underline{\hat\psi}$
Let $\sigma = 1$ and let s > 1 be a fixed integer throughout
this section. Letting $\underline{\hat\phi}$ denote a specialization, less a retraction
to $[\beta,\infty)$, of the $\underline{\hat\psi}$ of section §1.3, with certain additional
assumptions on the kernels, we show that $D_n(\underline{\theta},\underline{\hat\phi}) = O(n^{-(s-1)/(2(m+s+1))})$.
We specialize $\hat p$ and $\hat q$ (defined by (3.4) and (3.5)
respectively) by setting $\epsilon = \delta$ and denote their common value by
h. Let
(4.1) $\hat\phi = \mathrm{tr}'(X + \hat q/\hat p)$
where tr' (as in previous sections §1.2 and §1.3) stands for retraction
to the cube $[-\alpha,+\alpha]^m$ and any undefined ratios are taken
to be zero. Let h2 = X - X, v 3 h(u + 61/5) and Yj(u) =
J J
with
(4.2) Y (u)= 1x 01 -x -vK)oz,=h"‘+1q _hmv’f).
LJ 2 L L L 0 J Lj j
In the following lemma, y will be evaluated at X. let
c1,c2,... denote constants.
S -
Lemma 11. If K0,...,Km are bounded with u[uuu KL] - CLs < m,
KO satisfies (3.1) and 111
with 8 replaced by s-l and are such that for |u\ s 20,
K1,...,uml{m satisfy condition (3.1)
h S S a,
-c X‘ Var (Y ), Va R Z ) X
(4.3) CI e 2‘ s m L1 r( O 0 i 3 c3 eca‘ ‘,
h ¢<|X|>
then gn[now - 1H] s c5(((n-1)h28+m)% + 1 m+2 %)'
((11-1)h )
Proof. Let the indication of the Lth coordinate of 0% and W
be abbreviated by omission. Since 0% lies in [-a,+a] and
since V lies in [-a,+a], it follows that \OW - 1‘ s 2a and
‘Ol - V‘ s ‘D| where
(4.4) D '
'0) 1“».
l
"U I |<~ |
Therefore
20! 2a
(4.5) Eo-luol - M] s g gu_1[|n| > u]du = g 3.1-1“) > u]du
0
i-I £h_1[D < -u]du.
-20
The main part of the proof bounds the integrands of the
rhs of this inequality by using the Berry-Esseen theorem and (4.3).
The rest of the proof shows that the Ph-integral of a bound for
the rhs of (4.5) is at most the bound in the lemma.
With 82 = Var(z YLJ) and L - 3-33 PJ‘YLJ - PJYleB,
the standardized range bound for L, together with lhs inequality of
(4.3), the inequality
(4.6) M s h(3a + |xL|)
and the fact that K ,...,K are bounded, implies that
O m
c7 (1 + h(3a + |xL|))
(4.7) L s
c1((n-1)hm)?’¢”(\x‘)e
for ‘u‘ S 2a .
-c21X‘/2
Let $0 \le u \le 2\alpha$. Then the definitions of D in (4.4) and
$Y_{Lj}$ in (4.2) imply that $[D > u] \le [\sum Y_{Lj} > 0] + [\hat p \le 0]$. The
Berry-Esseen theorem (Loève (1963), p. 288) and the triangle
inequality imply that $P_{n-1}[\sum Y_{Lj} > 0]$ is at most
+1 -1-
(4.8) M-(n-Dhmﬂe 1pu>+\¢(-h““ a p u)
Y,)| + B L.
-l
-§(B 23ij
C M
Since rhs inequality of (4.3) implies that 52 s c3hm¢(‘x\)e A ,
the first term in (4.8) can be bounded above by replacing B by
this upper bound for 6. Also, by the equality
(4.9) (n-l)hm+1p u +-z ijcj = (nbl)hm+1(h-1v( (p - p _[p])
+ gmli’c‘h - c3).
the lhs inequality in (4.3), the bounds (3.6), (3.8) and the in-
equality (4.6) imply that the second term in (4.8) is at most
c 8((n-1>hzs*““>}5
:% 2
(1 +-h(3o +-|XL‘))
"czTillz
(4.10)
WIXI)
Hence, with f defined as the positive solution of the equation
c4‘x‘ -2
(4.11) c3e ¢(‘X‘)f2 = p ,
we obtain that
(4'12)Eh-1[2Ytj> o] s ¢(-(n-l)hm+2)%f u) + (4.10) + B rhs of (4.7)
Now we consider -2a 5 u < O. The definitions of D in
(4.4) and YLj in (4.2) imply that [D < u] s [E YLj
[b s O]. The Berry-Esseen theorem and the triangle
< O] +
inequality imply that gn-1EEYLJ < 0] is at most
(4.13) o((n-1)hm+15'15 u) + |§((n-1)hm+la'lp u) - o(-e'lszYLj)\ + BL.
c \X‘
Since the rhs inequality of (4.3) implies that 82 s c3hm¢(‘X\fe 4 ,
the first term in (4.13) is bounded by Q((n-l)hm+2)%f u) where f
is the positive solution of (4.13). The lhs inequality of (4.3), the
equality (4.9) and the bounds (3.6), (3.8) and (4.6) imply that
the second term of (4.13) is at most (4.10). Therefore,
_ﬂ, 2)%f u) + (4.10) +-B rhs of (4.7).
m+
(4.14) P 1[2YLJ<03 s 2(((n-1)h
Integrating (4.12) wrt u over [0, 20] and (4.14) wrt u
over [-Za,0), then bounding their first terms by using the inequality
20
g Q(-At)dt s A.1 for A >10, we obtain.(since the corresponding Berry-
2a
Esseen, followed by normal tail bound, treatment of gn—l[§ S 0]du con-
tributes no more than 1+q2/8 times the rest) that §n_l[‘o$-¢|] is at most
2
l 1 m+2)% %’+'4a[(4.10) +'B rhs of (4.7)]}(2 + %—).
((n-l)h
Hence we complete the proof of the lemma by showing that the P -
c ‘Xl/Z n
. -l 2 -g
integrals of f and (l + h(3a + |XL\))e ¢ (‘X‘) are
uniformly bounded.
, 2 + 2 m-Z 2
Since (211)mpn S exp -((‘X‘ - a) ) and (Zn) p 2 exp -(a+1x‘) ,
we obtain from the definition of f in (4.11) that pnf"1 is at
most
c |X\/2
c8 ¢<+>c”<|xl>e “
95(le + a)
which is u-integrable. Again by using the upper bound pn, we can
cz‘Xl/Z %
show that the Pn-integral of (1 + h(3a +-‘XL‘))e ¢ (‘X\) is
uniformly bounded. This ends the proof of the lemma.
Now we state the main result of the section.
Theorem 3. If the kernel functions $K_0,\ldots,K_m$ satisfy the conditions
of Lemma 11, $h = a\,n^{-1/(m+s+1)}$ where $0 < a \le s^{-1}\alpha$ and $\underline{\hat\phi}$ is
defined by (4.1), then
    $D_n(\underline{\theta},\underline{\hat\phi}) = O(n^{-(s-1)/(2(m+s+1))})$.
Proof. Since $\underline{\hat\phi}$ lies in $\times_n [-\alpha,+\alpha]^m$, the result of the theorem
is a direct consequence of Lemma 11, the hypothesis on h and
Lemma 2.
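The rate in Theorem 3 can be checked with exact arithmetic. The sketch below assumes the bound of Lemma 11 has the form $c_5(((n-1)h^{2s+m})^{1/2} + ((n-1)h^{m+2})^{-1/2})$, which is a reading of a damaged statement, and it treats n-1 as n. The function name is hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def lemma11_exponents(s: int, m: int) -> Tuple[Fraction, Fraction]:
    """Powers of n in the two terms of the assumed Lemma 11 bound
    when h = n^(-1/(m+s+1))."""
    h_exp = Fraction(-1, m + s + 1)
    # (n * h^(2s+m))^(1/2)
    first = Fraction(1, 2) * (1 + (2 * s + m) * h_exp)
    # (n * h^(m+2))^(-1/2)
    second = Fraction(-1, 2) * (1 + (m + 2) * h_exp)
    return first, second


if __name__ == "__main__":
    for s, m in [(2, 1), (3, 2), (10, 5)]:
        first, second = lemma11_exponents(s, m)
        print(s, m, first, second, Fraction(-(s - 1), 2 * (m + s + 1)))
```

Both exponents equal $-(s-1)/(2(m+s+1))$, so this choice of h balances the two terms, and the exponent approaches 1/2 as s grows.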
Now we exhibit kernel functions $K_0,\ldots,K_m$ satisfying the
conditions of Lemma 11. We develop these kernels in the m = 2 case for
the sake of simplicity of the notation.
Let $[c_{ij}]$ be an $\infty \times \infty$ matrix whose ijth element is $c_{ij}$.
For each pair of positive integers i,j, let $W_{i,j}$ be the indicator
function of the south-west quadrant of (i,j) intersected with the
north-east quadrant of (0,0). We will determine $[a_{ij}]$, $[b_{ij1}]$
and $[b_{ij2}]$ with only finitely many entries different from zero
(4.15) K = z 1i’1,1<1= )3 hi ni’j and K2= z; b. 114
0 . i l . . 1'2
13.1 j 13.1 j 1’] J
satisfy the conditions of Lemma 11.
For any two positive integers S, T, let [aijjs T denote
the modification of [aij] obtained by replacing aij by zero if
i > S or j > T. We note that for any two sets of distinct non-
negative integers, k1,...,kS and L1,...,LT, the vectors
k L
k L
(4.16) [1 lj ST
11S T,..., [i 81 T13 T are a basis for R
kL kL
.r.t_ . rt
(For 2 crtil J ] - [0] lff z Crtx y
= O has the roots
$\{1,\ldots,S\} \times \{1,\ldots,T\}$, which by iterative application of Descartes'
rule of signs requires the $c_{rt}$ to vanish.) We use this fact to
show that certain norms are different from zero and to show that
certain coefficients are zero. The kernel conditions (3.1) on
K0 and K specialize to the following requirements on inner
1
products,
1 I = 1
(anol: [1111'sz = o 11 -3 2 3 +1. 3 5+1
1.] S LISLZD S £1 2
and
2 = =
(Eb 1 [11.1.1.2 > = L1 2’ L2 1
ijl’ J] 0 4SL1+LZSS+1
We choose [aij] for simplicity to be the
‘1 4’2
projection of [lj]s,s on L {[i j 15,3‘1 g L1,L2,
(4.17)
3 g L1 +~L2 3 3+1} divided by its squared norm,
and in order to satisfy the variance requirements (4.3), we take
b
[bijl] to e
projection of [izj] on L [[iLlez |(L L ) # (2 l)
3,8 13,3 l’ 2 ’ ’
(4.18)
l 3 L1 3 s, 1 s L2 5 3} divided by its squared norm.
The squared norms are non-zero by the aforenoted linear
independence for (S,T) = (s,s). Moreover, $b_{sj1} \ne 0$ for some j
in {1,...,s} for, otherwise, $[b_{ij1}]$ defined in (4.18) will lie
in $R^{(s-1)s}$ and is orthogonal to a basis in $R^{(s-1)s}$, hence is 0.
Let $M = \max\{j \mid b_{sj1} \ne 0\}$. Interchanging i and j, we get a
solution for $[b_{ij2}]$ such that $K_2$ satisfies the kernel conditions
culminating in (3.1).
With A denoting a bound of $K_0$, $K_1$ and $K_2$,
3 2
(4.19) V3r<¥Lj) s A2(§'+-v) En-ltgj E (X,X + sh) X (X,X +-sh)].
By the mean value theorem, the probability on the rhs of this in-
equality is szthj(X + §sh) for some g in the unit square.
Hence, factoring out h2¢(‘X\), the restriction h s s-la and
the inequality (4.6) show that the rhs of (4.19) is bounded by the
rhs of (4.3) for suitable c3 and c4.
Now we observe that YLJ defined by (4.2) takes finite
number of values including zero and Z-lbsM' The probability that
it takes the value zero is En-ID'Sj - x a! (0, sh) x (0, sh)] and
that it takes z-lbsM is gn_1[gj - x e (2(s-1)h,2sh) x ((M-l)h,Mh)].
Therefore by L1.A of the Appendix, we obtain that
(4.20) Var(Y1j) 2 c9 En-IEEj - X E (2(s-l)h, 23h) X ((M-l)h, Mh)].
By the mean value theorem, the probability on the rhs of this in-
2
equality is h p (X + §h) for some g in (2(s-l),25) X (M-1,M).
J
2 -
Hence, factoring out h ¢(|X|), the restriction h s 3 £1 shows
that the rhs of (4.20) is bounded below by the lhs of (4.3) for
suitable $c_1$ and $c_2$ when L = 1. That $\mathrm{Var}(Y_{2j})$ is bounded below by
the lhs of (4.3) can be proved similarly.
By following the argument given above, we can show that
$\mathrm{Var}(K_0 \circ Z_j)$ also satisfies inequality (4.3).
§1.5 A Lower Bound for $D_n(\underline{0},\underline{\psi}^{**})$.
In this section, we use the notation of section §1.2
specialized to the $\sigma^2 = 1$ case. Let $c_1, c_2, \ldots$ denote
absolute constants. With
(5.1) $\beta^2 = (n-1)k^2h^m$,
by using the Berry-Esseen theorem and Lemma 1 of the Appendix,
we show that $D_n(\underline{0},\underline{\psi}^{**}) \ge c_1^2\beta^{-2}$ under certain conditions on $\beta$.
Theorem 4. If $3(\tfrac{k}{2} + h) \le \alpha < \infty$, $\beta \to \infty$ and $\underline{\psi}^{**}$ is defined
by (2.1), then
    $D_n(\underline{0},\underline{\psi}^{**}) \ge c_1^2\beta^{-2}$.
**
Proof. Let the first coordinate of 1n be abbreviated by
** *
W and let the indication of the first coordinate of t
be abbreviated by omission. As in section §l.2, let X, with
coordinates X ,...,X , abbreviate X . Our method of proof is
1 m ~n
**
to show that §h[[X1 > a]‘¢ |] ,exceeds the square-root of the
bound of the theorem. This completes the proof of the theorem
1 n ** 2 ** 2
E P and P 2
2 ** emu), | 1 1m, \ 1
2 **
2.1m, llzznitxlndlv n.
**
Since, by definition, \ﬁ**| s a and since [‘W \ > u] =
** -
since Dn(9,l_ ) = n
*
[\Xl +~t | > u] for u < a, we obtain by Fubini's theorem
that
01 0'
(5.2) §n[\¢**\] = g gn[‘x1 + t*‘ > u]du 2 Ph[[x1 > a] _1[x1+t* > ujduj.
J's,
O
, m-l
Let x 1n (a,m) x R and u be in (0.0)
fixed until otherwise stated. As in section §l.2, let
".= 611, .= x.<—i and
6J [h 1] 5] [N] m]
~ t(u-xp
(5.3 Y. = 6 - 6 e .
) J j 1
With this definition of Yj’ we obtain that
*
(5.4) [x1 + t > u] = [8Y1 2 0]
where~3, as in sections §1.2, §1.3 and §1.4, denotes summation
over j from 1 to n-l. Note that X1 > 0 implies that
~ *
[ZYj 2 0, Z 5 = 0, 2 5 = 0] C [X1 + t > u] for u < a.
J
1
Since §1""’§n-l Y1,...,Yn_1.
Hence, with B denoting the Berry-Esseen constant, the
are i.i.d., so are
Berry-Esseen theorem and (5.4) give that
3
* (n-l)%P1Y1_%P1|Y1-P1Y1\
(5.5) P [x + t > u] 2 6H ) B(n- 1)
_11-1 1 SodoY
1 (so d.Y1)3
ek(u-X1)
The definition of Y1 gives that P‘Y1=Ffjl- Ff].
Hence, since the alternative expression for FIE /Fj[j in the
J L
proof of (2) of Lemma 3 when Specialized to the case
2 k
= ' = = ' = _ +._
0' 3 L 1 gives that FlDl/FlD exp h(X1 2 +mh)
for some w in (0,1), we obtain that
k
k(u-X1) '.k(u + - + (oh)
2
(5.6) Ply1 = F1131 e (e - 1)
2 -k F1D1(U +-;+ h) for k< (0+4)-
where the inequality follows from the inequalities u < a < X1
and e'k - l 2 -x.
Applying IJJA.(See Appendix) to the random variable Y1,
we obtain, since Y1 takes value 1 with probability Ffjl’
that Var (Y1) is at least (1 - F131 - FPWIDIO - FlCll).
Hence, Since (1 - F131 - Ffj) is bounded away from zero
for h < (a +-4)-1, we obtain that for some c2 > 0
2 -l
(5.7) Var (Y1) 2 c2 Ffjl for h < (a +-4) ,
Using (5.6), (5.7) and the definition of B in (5.1),
we obtain that, for k.< (a +'4)-1:
P Y
5 1 1 g_ k . m/2 a
o - 2 - + — = o
(5 8) (n l) s.d.Y1 % (u 2 +-h)f With h f (Fill)
c
2
3
The standardized range bound for Pl‘Yl - PiY1‘3/(s.d.Y1) ,
(u-X1)k
with the help of the inequality range of Y1 s 1 + e s 2
since u < a < X1, (5.7) and the definition of 8, gives that,
for h < (a + 4)'1,
_ 3
Plhr1 P1Y1|
(5.9) 3 .<. 2k
(n-l)%(s.d.Y1) a czf
Integrating the inequality obtained by weakening (5.5)
with the help of (5.8) and (5.9) wrt u over (0,0), then
using the transformation B(u +~%'+-h)f = cgv in the first
integral, we obtain that
0'
12
C55 «Tl-:34") /'I:2
B 8 En-IEXI +'t* > ujdu 2 f2. I §(-v)dv - Zia, for k < (0+4)-1.
(Lg—mus: Cif
In view of (5.2), we complete the proof by showing that
the Pn-integral of the first term of the rhs of this inequality
on [x1 > a] converges to a positive constant while that of
the second term on X > a conver as to zero.
1 8
Since f, defined in (5.8), converges to p1 and
Since Specialization of (2) of Lemma 3 to the case of
02 = L = l and n = 2 gives that f"1 is exceeded by
pit exp(HXH +,E§l) which is Pn-integrable on [X1 > a], it
follows by dominated convergence theorem and the hypothesis
on B that
B(a+%*h)/C: m
Pnux1 > o] 6(-v)dv] —. P1[[X1 > o] f %§(-v)dv] > 0
8(34h)f/c2 a(c2p1)-
and
Eli-q- Pn[[X1 > 0,ij .. o .
C2
The proof of the theorem is complete.
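Now we make a remark concerning the procedures $\underline{\psi}^{**}$,
$\underline{\hat\psi}$ and $\underline{\hat\phi}$ defined in sections §1.2, §1.3 and §1.4 respectively.
Remark 4. For the choice of h and k given in Theorem 1 of
section §1.2, we obtain by the theorem proved above that
    $D_n(\underline{0},\underline{\psi}^{**}) \ge c\,n^{-2/(m+4)}$
for some c > 0.
For any $\gamma > 0$, Theorem 2 of section §1.3 shows that we
can define a procedure $\underline{\hat\psi}$ such that $D_n(\underline{\theta},\underline{\hat\psi}) = O(n^{-(1/4-\gamma)})$.
Hence, since $\gamma < 1/36$ implies that $\tfrac14 - \gamma \ge \tfrac{2}{m+4}$ for $m \ge 5$,
it follows that the procedure $\underline{\hat\psi}$ is better than $\underline{\psi}^{**}$ in the
sense that
    $\sup D_n(\underline{\theta},\underline{\hat\psi}) \le c_1 n^{-(1/4-\gamma)} \le c_2 n^{-2/(m+4)} \le \sup D_n(\underline{\theta},\underline{\psi}^{**})$
where the sup is taken over all parameter sequences.
For any positive integer s, Theorem 3 of section §1.4
shows that we can define a procedure $\underline{\hat\phi}$ such that
$D_n(\underline{\theta},\underline{\hat\phi}) = O(n^{-(s-1)/(2(m+s+1))})$. Hence, if $ms \ge 5m + 8$, the procedure
$\underline{\hat\phi}$ is better than $\underline{\psi}^{**}$ in the sense described above.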
§1.6 Extension of Results in Sections §1.2 and §1.3 to Constrained
Mean Vectors and Unknown Covariance Matrix
Let Y be a d-variate normal with mean $\omega$ and covariance
matrix $\sigma^2 I$. If $\omega$ is assumed to lie in a lower
dimensional subspace $\mathcal{E}$, say of dimension m < d, then the
squared length of the projection of Y onto the subspace orthogonal
to $\mathcal{E}$ has expectation $\sigma^2(d-m)$ and variance $2\sigma^4(d-m)$. In
this section, this fact has been used to extend the results of
sections §1.2 and §1.3.
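The moment fact quoted above (expectation $\sigma^2(d-m)$, variance $2\sigma^4(d-m)$) can be checked by simulation. The sketch below is an illustration only: it assumes, without loss of generality, that the subspace is spanned by the first m coordinate axes, and the function name and parameter values are hypothetical.

```python
import random
import statistics
from typing import List, Sequence


def residual_squared_lengths(
    d: int, m: int, sigma: float, omega_head: Sequence[float],
    trials: int, seed: int = 0,
) -> List[float]:
    """Squared length of the part of Y orthogonal to the first m axes."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        # The mean lives in the first m coordinates; the rest have mean zero.
        y = [mu + rng.gauss(0.0, sigma) for mu in omega_head]
        y += [rng.gauss(0.0, sigma) for _ in range(d - m)]
        out.append(sum(c * c for c in y[m:]))
    return out


if __name__ == "__main__":
    d, m, sigma = 5, 2, 1.5
    sample = residual_squared_lengths(d, m, sigma, [1.0, -2.0], trials=40000)
    print("mean", statistics.fmean(sample), "target", sigma ** 2 * (d - m))
    print("var ", statistics.pvariance(sample), "target", 2 * sigma ** 4 * (d - m))
```

Dividing the squared length by d-m therefore gives an unbiased estimate of $\sigma^2$ that does not depend on the unknown mean.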
Let $\{Y_n\}$ be a sequence of independent random variables
with $Y_n$ distributed as d-variate normal with unknown covariance
matrix $\sigma^2 I$ and mean $\omega_n$ belonging to an m-dimensional subspace
$\mathcal{E}_n$ of $R^d$ intersected with the d-sphere of radius $\alpha$. While
stating the results of the present section in section §1.0, we
interchanged m and d in order to make proper references to
sections §1.2 and §1.3.
Let $B_n$ be an orthogonal matrix whose first m columns
generate $\mathcal{E}_n$. Let $X_n$ and $\theta_n$ denote the vectors formed by
the first m coordinates of $B_n' Y_n$ and $B_n' \omega_n$ respectively
where $B_n'$ is the transpose of $B_n$. Let $(d-m)Z_n$ denote the
squared length of the projection of $Y_n$ onto the subspace which is
orthogonal to $\mathcal{E}_n$. Let E stand for expectation wrt the joint
distribution of $X_1,\ldots,X_n$, $Z_1,\ldots,Z_n$.
This section is divided into two subsections. In the
first subsection, with the help of the procedure $\underline{\psi}^{**}$ defined
in (2.1), we exhibit a procedure $\underline{T}^{**}$ for which
$D_n(\underline{\theta},\underline{T}^{**}) = O(n^{-1/(m+4)})$ for each $\sigma^2$. In the second subsection,
for each positive integer s and each $\gamma$ in (0,1), with
the help of the procedure $\underline{\hat\psi}$ defined by (3.13), we exhibit a
$\underline{\hat T}$ for which
$D_n(\underline{\theta},\underline{\hat T}) = O(n^{-(s-1)\gamma/((2s+m)(1+\gamma))})$ for each $\sigma^2$. Let $\sum$ denote
summation over i from 1 to n.
§1.6.1 Definition of $\underline{T}^{**}$ and a Rate of Convergence for
$D_n(\underline{\theta},\underline{T}^{**})$
In this subsection, we use the notation of section
§1.2. We require the following notation for each n, but, as
in earlier sections, we suppress the dependency on n of
the functions to be defined below.
** **
Define I. = {T j as follows,
22
** , 1 *
= + —- tr t
(6.1) T tr (X n k )
where tr' (as in section §l.2) and tr)\ Stand for retractions
m
to [-o,+o]m and X [-k-IQXL‘ + o, + k + h), {1(‘xLl + a + k + h)]
L=1
respectively.
** a
Let T be the modification of T obtained by re-
- 1H: 2 *
placing n 1221 in the definition of T by a . Let T
be the modification of T obtained by replacing tr' in the
definition of T by retraction to the cube [-a',+u']m where
a' = a +-k + h. Let c1,c2,... denote finite functions of
2
O' o
c
Lemma 12 E\\T* - T“ s Tl .
n X
. , , 111
Proof. Since the distance between two p01nts retracted in R
to the same cube is at most the distance between the
points and since xutrxt*“ S “X“ +'ma', we obtain that
1HT* - T“ S (HXH + ma')‘n-122i - 02‘. Since
(d-m)Zl/02,...,(d-m)Zn/o2 are i.i.d. x2 - random variables
with d-m degrees of freedom, application of Schwarz inequality
to the rhs of the last inequality and the fact that
E[(HXH + ma')2] is bounded by a finite function of 02 completes
the proof of the lemma.
Theorem 5. If $h = n^{-1/(m+4)}$, $k = a\,n^{-1/(m+4)}$ for a in $[1,\infty)$,
$\lambda^{2(m+4)} = n^{-(m+2)}$ and $\underline{T}^{**}$ is defined by (6.1), then
    $D_n(\underline{\theta},\underline{T}^{**}) = O(n^{-1/(m+4)})$ for each $\sigma^2$.
Proof. Let $\sigma^2$ be fixed. In the proof, we consider only those
n for which $\lambda < \sigma^2$.
Since ‘W‘ S a and since T = tr'T*, it follows that
HT - Ml S \\T* ' TH and hence HTMr - 1H 5 HT“ - TH + HT" - N-
If the Lth coordinate of t* (its negative) > x(‘XL| +-a'),
then, since 1 < oz, T* and $* defined by (2.1) turn out
to equal a' (its negative). Hence, T* = 1*. Therefore the
last inequality, together with Lemas 12, 4 and 6 and the
definitions of 1, h and k, implies that EuT* - W“ =
-1 **
n /m+4). Since I. , by definition (6.1), takes values in
0(
X [-a,+u]m, Lemma 2 and this order relation give the result of
n
the theorem.
§l.6.2 Definition of i. and a Rate of Convergence of Dn(§.i)
In this subsection, we use the notation of section §l.3.
We require the following notation for each n, but as in previous
sections, we suppress the dependency on n of the functions to
be defined below.
Define i'= {T} as follows,
. 27- I
T = tr'(x +-—i tr 617))
n X.
P
where tr' (as in section §l.3) and trA stand for retractions
m
to [-a,+a]m and x [-1 1(|x | +-a), 1'1(\x | + a)] and
- 1 - (j=1 L L A
p = b V B ((3.13)). Let T be the modification of T
- a 2
obtained by replacing n 1221 in the definition of T by c .
* x
AS a consequence of replacing T by T, T of sub-
section 1.6.2 by T of this subsection and a' by a in the
proof of Lemma 12, we obtain the following lemma.
C
. 1
Lemma 13. EHT - T“ 3 —¥-.
n 1
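Now we state and prove the main result of the subsection.
Theorem 6. If the hypothesis of Lemma 10 is satisfied, $\underline{\hat T}$ is
defined by (1.6.2) and $\lambda = n^{\frac{s-1}{2s+m}\cdot\frac{\gamma}{1+\gamma} - \frac12}$, then
    $D_n(\underline{\theta},\underline{\hat T}) = O(n^{-\frac{s-1}{2s+m}\cdot\frac{\gamma}{1+\gamma}})$ for each $\sigma^2$.
Proof. Let $\sigma^2$ be fixed. In the proof, we consider only those
n for which $\lambda < \sigma^2$.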
If the Lth coordinate of T (its negative) > 1(‘XL1 + a),
then, Since 1 < 02, T and i defined by (3.13) turn out to
equal 0 (its negative). Hence T I $. Therefore the inequality
HT - v“ S “T - T“ + “I - w“, together with Lemmas l3 and 10 and
5-1 .;1_
the hypothesis of the theorem, gives that EHT - W“ = 0(n 28+m 1+v ).
Since, by definition, T is in [-a,+u]m, this order relation
and Lemma 2 complete the proof of the theorem.
CHAPTER II
RATES IN THE ESTIMATION AND TWO-ACTION PROBLEMS
FOR A FAMILY OF SCALE PARAMETER $\Gamma(\alpha)$ DISTRIBUTIONS
§2.0 Introduction and Notation
For $0 < a < b < 2a < \infty$ and $\alpha > 2$, let
$\mathcal{P} = \{P_\theta \mid \theta \in [a,b]\}$ be the family of distributions with $P_\theta$
representing the $\Gamma(\alpha)$ distribution with scale parameter $\theta$.
Let s be a positive integer.
Let $\{X_n\}$ be a sequence of independent random variables
with $X_n$ distributed as $P_{\theta_n}$ belonging to $\mathcal{P}$. Let
$\underline{X}_n = (X_1,\ldots,X_n)$, $\underline{\theta} = \{\theta_n\}$ and $G_n$ be the empiric distribution
of $\theta_1,\ldots,\theta_n$.
In section §2.1, we consider a sequence of estimation
problems each having the structure of the following component
estimation problem. Based on an observable random variable X
whose distribution $P_\theta$ belongs to $\mathcal{P}$, the problem is to estimate
$\theta$ with squared-error loss. Let $R(G_n)$ denote the Bayes risk
against $G_n$ in the estimation problem just described. Let
$\underline{\phi} = \{\phi_n\}$ be a randomized sequence-compound procedure (abbreviated
to randomized procedure hereafter). That is, for each
n, $\phi_n$ is a randomized function of $\underline{X}_n$. For any such $\underline{\phi}$ and $\underline{\theta}$
in $\times_n [a,b]$, let
(0.1) $D_n(\underline{\theta},\underline{\phi}) = n^{-1}\sum_{j=1}^n E|\phi_j - \theta_j|^2 - R(G_n)$
where E stands for expectation wrt the joint distribution of
all the random variables involved. In section §2.1, we exhibit
a randomized procedure $\underline{\psi}^* = \{\psi_n^*\}$ such that $D_n(\underline{\theta},\underline{\psi}^*) =
O(n^{-s/(2(s+1))})$ uniformly in all parameter sequences $\underline{\theta}$ in
$\times_n [a,b]$.
In section §2.2, we consider a sequence of two-action
problems each having the structure of the following component
two-action problem. Based on an observable random variable
X whose distribution $P_\theta$ belongs to $\mathcal{P}$, the problem is to
choose one of two possible actions $a_1$ and $a_2$ when the loss
functions corresponding to $a_1$ and $a_2$ are $L(a_1,\theta) = (\theta-c)^+$
and $L(a_2,\theta) = (\theta-c)^-$ for some c in (a,b). Let $R(G_n)$
denote the Bayes risk against $G_n$ in the two-action problem
described above. Then, in section §2.2, we exhibit a randomized
procedure 1 = {ﬁn} such that the absolute value of Dn(§,$)
defined by
-1 n
(0.2) Dn(a,i) = n z mare) - R(G)
j=1 j I1
is $O(n^{-s/(2(s+1))})$ uniformly in all parameter sequences $\underline{\theta}$
in $\times_n [a,b]$.
The orders stated in the results of both sections §2.1
and §2.2 are uniform in all parameter sequences $\underline{\theta}$ in
$\times_n [a,b]$. Hence, in order to reduce the complexity of the
statements of the results in this chapter, the range of the
parameter sequences will not be exhibited, but is understood
to be $\times_n [a,b]$.
We introduce some notation which is common to both
sections §2.1 and §2.2. Let $\{\lambda_n\}$ be a sequence of i.i.d.
random variables with the density of $\lambda_1$ as $(\alpha-1)\lambda_1^{\alpha-2}[0 < \lambda_1 < 1]$
wrt Lebesgue measure $\mu$ on $((0,\infty), \mathcal{B} \cap (0,\infty))$. Furthermore,
we assume that $\{\lambda_n\}$ is independent of $\{X_n\}$. Define, for each
n, $Y_n = \lambda_n X_n$. Then, $Y_n$ has $\Gamma(\alpha-1)$ distribution with scale
parameter $\theta_n$. We let $\sum$ and $\sum'$ denote summations over j
from 1 to n-1 and from 1 to s respectively.
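The randomization $Y_n = \lambda_n X_n$ can be sketched in a few lines. The code assumes the reading above: $\lambda$ has density $(\alpha-1)\lambda^{\alpha-2}$ on (0,1), so it can be drawn as $U^{1/(\alpha-1)}$ with U uniform, and X is $\Gamma(\alpha)$ with scale $\theta$. The function name and parameter values are hypothetical. The simulation compares the sample mean and variance of Y with the $\Gamma(\alpha-1)$ values $(\alpha-1)\theta$ and $(\alpha-1)\theta^2$.

```python
import random
import statistics
from typing import Tuple


def sample_pair(alpha: float, theta: float, rng: random.Random) -> Tuple[float, float]:
    """Draw (X, Y) with X ~ Gamma(alpha, scale theta) and Y = lambda * X."""
    x = rng.gammavariate(alpha, theta)  # shape alpha, scale theta
    # Inverse-CDF draw: P[lambda <= t] = t^(alpha-1) on (0, 1).
    lam = rng.random() ** (1.0 / (alpha - 1.0))
    return x, lam * x


if __name__ == "__main__":
    rng = random.Random(0)
    alpha, theta = 3.0, 2.0
    ys = [sample_pair(alpha, theta, rng)[1] for _ in range(40000)]
    print("mean", statistics.fmean(ys), "target", (alpha - 1) * theta)
    print("var ", statistics.pvariance(ys), "target", (alpha - 1) * theta ** 2)
```

Because $\lambda < 1$, each Y lies below its X, which is the fact used later when the proof notes that $P[Y_j \ge X_j] = 0$.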
Now we introduce some notation which is similar to that
introduced in section §1.4. Since the Vandermonde determinant
involved does not vanish, there exists a unique vector
$d = (d_1,\ldots,d_s)$ in $R^s$ such that $d_s \ne 0$ and
(0.3) $\sum' d_i\,(i^L - (i-1)^L) = 1$ for L = 1, and $= 0$ for L = 2,...,s.
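The weights in (0.3) can be computed exactly for small s. The sketch below assumes the reading $\sum' d_i\,(i^L - (i-1)^L) = [L = 1]$ for L = 1,...,s and solves the linear system by Gauss-Jordan elimination over the rationals. The function name is hypothetical.

```python
from fractions import Fraction
from typing import List


def difference_weights(s: int) -> List[Fraction]:
    """Solve sum_i d_i * (i^L - (i-1)^L) = 1 if L == 1 else 0, for L = 1..s."""
    a = [[Fraction(i ** L - (i - 1) ** L) for i in range(1, s + 1)]
         for L in range(1, s + 1)]
    rhs = [Fraction(1)] + [Fraction(0)] * (s - 1)
    for col in range(s):
        # Pick a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, s) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        inv = 1 / a[col][col]
        a[col] = [v * inv for v in a[col]]
        rhs[col] *= inv
        # Eliminate this column from every other row.
        for r in range(s):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
                rhs[r] -= f * rhs[col]
    return rhs


if __name__ == "__main__":
    for s in (1, 2, 3):
        print(s, difference_weights(s))
```

For s = 2 this gives d = (3/2, -1/2), so the last weight is indeed nonzero in that case.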
For any h > 0 and any real valued function g on
$(0,\infty)$, define
    $\Delta g(u) = h^{-1}(g(u+h) - g(u))$, u > 0.
With $\bar F$ and $\bar H$ denoting the averages of the distributions of
$X_1,\ldots,X_{n-1}$ and $Y_1,\ldots,Y_{n-1}$ respectively and with X abbreviating
$X_n$, let
(0.4) $\bar\pi = \sum' d_i\,\Delta\bar F(X + (i-1)h)$
and
(0.5) $\bar\xi = \sum' d_i\,\Delta\bar H(X + (i-1)h)$.
With $F^*$ and $H^*$ denoting the empiric distributions of
$X_1,\ldots,X_{n-1}$ and $Y_1,\ldots,Y_{n-1}$, let
(0.6) $\pi^* = \sum' d_i\,\Delta F^*(X + (i-1)h)$
and
(0.7) $\xi^* = \sum' d_i\,\Delta H^*(X + (i-1)h)$.
Let $p_\theta$ and $q_\theta$ denote the densities of the $\Gamma(\alpha)$
distribution with scale parameter $\theta$ and the $\Gamma(\alpha-1)$ distribution
with scale parameter $\theta$ respectively. Let $\bar p$ and $\bar q$ denote
the densities of $\bar F$ and $\bar H$ respectively. With $p_\theta^{(s)}$ and
$q_\theta^{(s)}$ denoting the sth order derivatives of $p_\theta$ and $q_\theta$
respectively, we assume throughout this chapter that $\alpha$ is such that
(0.8) $\sup\{|p_\theta^{(s)}|, |q_\theta^{(s)}| : a \le \theta \le b\} < \infty$.
Under this assumption (0.8), it follows from the condition
on d in (0.3) and (3.2) of Chapter I that
(0.9) $|\bar\pi - \bar p| \le k_6 h^s$
and
(0.10) $|\bar\xi - \bar q| \le k_7 h^s$
where $k_6$ and $k_7$ are constants.
§2.1 Estimation Problem. Rates of Convergence for $D_n(\underline{\theta},\underline{\psi}^*)$
with $\underline{\psi}^*$ Based on Kernel Estimators for a Density
In this section, under certain conditions on $\mathcal{P}$, we
show that, for each positive integer s, the modified regret of
the procedure $\underline{\psi}^*$ (to be defined below by (1.2)) is
$O(n^{-s/(2(s+1))})$ when the component problem involved is the
estimation problem described in section §2.0. The method of
proving this rate of convergence is similar to that of Theorem
3 of Chapter I.
Let $\psi$ denote the Bayes estimate against $G_{n-1}$ in
the component estimation problem described in section §2.0.
Then $\psi$ can be expressed as
(1.1) $\psi(u) = \dfrac{u\,\bar q(u)}{(\alpha-1)\,\bar p(u)}$ for u > 0.
Define the procedure $\underline{\psi}^*$ as follows. Let
(1.2) $\psi^* = \mathrm{tr}\left(\dfrac{X\,\xi^*}{(\alpha-1)\,\pi^*}\right)$
where tr stands for retraction to [a,b]. Any undefined ratios
are taken to be zero.
Let $K_1, K_2, \ldots$ denote constants in this section. Let
E stand for the expectation wrt the joint distribution of the
random variables involved unless otherwise specified. In the
following lemma, $\bar q$, $\bar p$ are evaluated at X.
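Lemma 1. If $\alpha > 2$, $b < 2a$, (0.8) is satisfied and h is in
$\mathcal{N} = \{h \mid 0 < h((\alpha-1)^{\alpha-1} \vee (\alpha-2)^{\alpha-2}) < e^{\alpha-2}\Gamma(\alpha-1)\,a\,r\,s^{-1}\}$ for some
r in (0,1/2), then
    $E[|\psi^* - \psi(X)|] \le K_1((nh)^{-1/2} + (nh^{2s+1})^{1/2})$.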
Proof. We have by the definition of conditional expectation
(1.3) $E[|\psi^* - \psi(X)|] = E[E[|\psi^* - \psi(X)| \mid X]]$
where $E[\,\cdot \mid X]$ stands for the conditional expectation operation
given X.
Since $\psi^*$, by definition (1.2), is the retraction of
$X\xi^*/(\alpha-1)\pi^*$ to [a,b] and since $\psi(X) = X\bar q/(\alpha-1)\bar p$, the Bayes
estimate against $G_{n-1}$ whose support lies in [a,b], is in
[a,b], we have $|\psi^* - \psi| \le b-a$ and $|\psi^* - \psi| \le X(\alpha-1)^{-1}|D|$
where
(1.4) $D = \dfrac{\xi^*}{\pi^*} - \dfrac{\bar q}{\bar p}$.
We then have
(1.5) $E[|\psi^* - \psi(X)| \mid X] \le \int_0^{b-a} P[|\psi^* - \psi| > u]\,du$
          $\le \int_0^{b-a} P[|D| > (\alpha-1)X^{-1}u]\,du$
          $= \int_0^{b-a} P[D > (\alpha-1)X^{-1}u]\,du + \int_{a-b}^{0} P[D < (\alpha-1)X^{-1}u]\,du$
where P stands for the joint probability measure of
$X_1,\ldots,X_{n-1}$ and $Y_1,\ldots,Y_{n-1}$.
The main part of the proof bounds $P[D > (\alpha-1)X^{-1}u]$
for $0 \le u \le b-a$ and $P[D < (\alpha-1)X^{-1}u]$ for $a-b \le u < 0$ by
using the Berry-Esseen theorem. The rest of the proof shows
that the expectation of a bound for the rhs of (1.5) is exceeded
by the bound in the lemma.
Let X > 0 be fixed until otherwise stated. Let
(1.6) $Z_j(u) = \sum' d_i\big([X + (i-1)h < Y_j < X + ih]$
          $- ((\alpha-1)X^{-1}u + \bar q/\bar p)\,[X + (i-1)h < X_j < X + ih]\big)$
for $|u| \le b-a$.
Let the dependency of $Z_j$ on u be suppressed. Let
$\beta^2 = \mathrm{Var}(\sum Z_j)$ and $L = \beta^{-3}\sum E|Z_j - EZ_j|^3$. Now we prove the
following sublemma.
Sublemma. For any $|u| \le b-a$ and any constant $k_2$ such that
$|d_i| \le k_2$ for i = 1,...,s,
    $L \le \dfrac{k_2(1 + 2(\alpha-1)X^{-1}b)}{k_4((n-1)h)^{1/2}(\Delta\bar H(X + (s-1)h))^{1/2}}$.
Proof. In order to obtain the result of the sublemma, we
need a lower bound for 52. This bound will be obtained by
applying L1.A (see Appendix) to the Zj'
Since Yj = ijj and Since the distribution of xj
is supported on (0,1), P[Yj 2 Xj] = 0 and hence Zj defined
by (1.6) takes 2-ls(s+5) + 1 values; namely,
0, di-dj((q-l)X-1u + %) for 1 S i S j S s,
(1.7) p
d1 for i = 1,...,8 and -di((a-1)X-1u + %) for i = 1,...
p
with nonzero probability.
The probability that Zj takes the value zero in (1.7)
is given by P[xj é (X,X + sh), Yj 4 (X,X + Sh)]. Since
,S.
e-uum S e.mmm for m > 0 and since a > 2 and ej > 2 by
assumption, we have
U
xx+h = 1 x-hgh(“——j)'o’1 ejd
P(Xj6(, s)] “Maj e u
8"(0’ 2) (1‘2
Sf73:133'((a 1)a V (0;?) )Sh.
(1.8) m__
= 1 X+Sh (u )w _2 ej
PLYJ. E (X,X + sh)] —-—1.(a_1)ej 111(8 du
-(a-2)
, e a-l a-Z
S W ((01-1) V (oz-2) )sh.
Therefore, it follows by the hypothesis on h that
$P[X_j \in (X, X + sh)]$ and $P[Y_j \in (X, X + sh)]$ are exceeded by r.
Hence, since $P(A \cap B) \ge P(A) + P(B) - 1$ for any two events
A, B, we obtain that $P[X_j \notin (X, X + sh),\ Y_j \notin (X, X + sh)] > 1 - 2r$.
Hence, since $Z_j$ takes the value $d_s$ with probability
$P[X + (s-1)h < Y_j < X + sh \le X_j]$, we obtain by L1.A that
(1.9) Var(ZJ.) 2 k§(P[X + (s-1)h < Yj < X + sh S Xj'j)
vahere k; = d:(1-2r) inf{1 - P(u + (s-l)h < Yj < u + sh]\u > 0, h E R3.
Vie observe that k3 # 0 since (18 # 0 and Zr < 1. Hence,
Since [x + (S-1)h <3!j < x +sh, xj - Yj 2 h] c [x + (s-l)h <
‘Yij < X + shS %] and since Xj - Yj and Yj are independent,
Vve: obtain that
2 .
Var(YJ.) 2 k3 in£{P[xj - Yj 2 hj\h e %}P[X + (s-l)h < Y], < x + sh].
Ufllerefore,
2 2 '-
(1.10) e 2 k4(n-l)h i H(X + (s-l)h)
to 2 _ 2 .
llere k4 - k3 1nf{P[Xj - Yj 2 h]\h 6 ﬂ}.
58
Since $|u| \le b-a$ and since $X\bar q/((\alpha-1)\bar p) \le b$, the maximum of the moduli of the values of $Z_j$ in (1.7) is at most
(1.11) $k_2(1 + 2(\alpha-1)X^{-1}b)$
where $k_2$ is the constant stated in the sublemma.
Therefore, the standardized range bound for $L$, together with the help of (1.10) and (1.11), gives the result of the sublemma.
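For readers who want to see a range bound of this kind at work, here is a small sketch. It assumes one standard form of the bound: if every summand satisfies $|Z_j| \le M$, then $E|Z_j - EZ_j|^3 \le 2M\,\mathrm{Var}(Z_j)$, hence $L \le 2M/\beta$. The exact constant used in the thesis is not recoverable from this passage, and the function names and the toy distributions are hypothetical.

```python
def lyapunov_ratio(summands):
    """Return (L, beta) for independent summands given as (values, probs) pairs."""
    variance_total = 0.0
    third_total = 0.0
    for values, probs in summands:
        mean = sum(v * p for v, p in zip(values, probs))
        variance_total += sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
        third_total += sum(p * abs(v - mean) ** 3 for v, p in zip(values, probs))
    beta = variance_total ** 0.5
    return third_total / beta ** 3, beta

def range_bound(summands):
    """2 * (largest modulus of any value) / beta, an upper bound for L."""
    max_modulus = max(abs(v) for values, _ in summands for v in values)
    _, beta = lyapunov_ratio(summands)
    return 2 * max_modulus / beta

# Toy example: 50 copies of one three-point law and 30 copies of a two-point law.
toy = [([0.0, 1.0, -2.0], [0.5, 0.3, 0.2])] * 50 + [([0.0, 0.7], [0.9, 0.1])] * 30
L, beta = lyapunov_ratio(toy)
print(L, range_bound(toy))
```

The point of the bound is that only a lower bound on $\beta$ and an upper bound on the moduli of the values are needed, which is how (1.10) and (1.11) enter.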
Proceeding with the proof of the lemma, we obtain an upper bound for $\beta^2$. The definition of $Z_j$ in (1.6) and (1.11) imply that
$EZ_j^2 \le k_2^2(1 + 2(\alpha-1)X^{-1}b)^2\big(F_j\big]_X^{X+sh} + H_j\big]_X^{X+sh}\big)$
where $F_j$ and $H_j$ are the distributions of $X_j$ and $Y_j$ respectively. Therefore, since $\beta^2 = \sum \mathrm{Var}(Z_j) \le \sum EZ_j^2$, we obtain that
(1.12) $\beta^2 \le k_2^2(1 + 2(\alpha-1)X^{-1}b)^2(n-1)\big(\bar F\big]_X^{X+sh} + \bar H\big]_X^{X+sh}\big)$.
Let $0 \le u \le b-a$. Then the definitions of $D$ in (1.4) and $Z_j$ in (1.6) imply that $[D > (\alpha-1)X^{-1}u] \le [\sum Z_j > 0] + [p^* \le 0]$. Hence, with $b(L)$ denoting the bound in the sublemma and $B$ denoting the Berry-Esseen constant, by the Berry-Esseen theorem, the sublemma and the triangle inequality, we obtain that $P[\sum Z_j > 0]$ is exceeded by
(1.13) $\Phi(-\beta^{-1}(n-1)h(\alpha-1)X^{-1}\bar p\,u) + |\Phi(-\beta^{-1}(n-1)h(\alpha-1)X^{-1}\bar p\,u) - \Phi(\beta^{-1}\sum EZ_j)| + B\,b(L)$.
By using the upper bound for $\beta^2$ in (1.12), we obtain that
(1.14) $\beta^{-1}(n-1)h(\alpha-1)X^{-1}\bar p \ge ((n-1)h)^{1/2}f$
where $f$ is the positive solution of the equation
(1.15) $k_2^2X^2(1 + 2(\alpha-1)X^{-1}b)^2\big(\bar F\big]_X^{X+sh} + \bar H\big]_X^{X+sh}\big)f^2 = h(\alpha-1)^2\bar p^2$.
Since $\sum EZ_j + (n-1)h(\alpha-1)X^{-1}\bar p\,u = (n-1)h\big(((\alpha-1)X^{-1}u + \bar q/\bar p)(\bar p - Ep^*) + Eq^* - \bar q\big)$ for all $|u| \le b-a$ and $X\bar q/((\alpha-1)\bar p) \le b$, it follows from (0.9) and (0.10) that
(1.16) $|\sum EZ_j + (n-1)h(\alpha-1)X^{-1}\bar p\,u| \le (n-1)h^{s+1}(2k_6b(\alpha-1)X^{-1} + k_7)$
for all $|u| \le b-a$.
Therefore, it follows by the mean value theorem and the lower bound for $\beta^2$ in (1.10) that the second term in (1.13) is exceeded by
(1.17) $\dfrac{((n-1)h^{2s+1})^{1/2}(2k_6b(\alpha-1)X^{-1} + k_7)}{k_4(\Delta\bar H(X + (s-1)h))^{1/2}}$.
Hence it follows from (1.13) and (1.14) that
(1.18) $P[\sum Z_j > 0] \le \Phi(-((n-1)h)^{1/2}f\,u) + (1.17) + B\,b(L)$
for $0 \le u \le b-a$.
Let $a-b \le u < 0$. Then the definitions of $D$ in (1.4) and $Z_j$ in (1.6) imply that $[D < (\alpha-1)X^{-1}u] \le [\sum(-Z_j) > 0] + [p^* \le 0]$. Hence, since the sublemma continues to hold if $d$ is replaced by $-d$, we have by the Berry-Esseen theorem, the triangle inequality and the sublemma that $P[\sum(-Z_j) > 0]$ is exceeded by
(1.19) $\Phi(\beta^{-1}(n-1)h(\alpha-1)X^{-1}\bar p\,u) + |\Phi(\beta^{-1}(n-1)h(\alpha-1)X^{-1}\bar p\,u) - \Phi(-\beta^{-1}\sum EZ_j)| + B\,b(L)$.
By using the upper bound for $\beta^2$ in (1.12), we obtain that the first term of (1.19) is bounded by $\Phi(((n-1)h)^{1/2}f\,u)$ where $f$ is the positive solution of (1.15). By using (1.16) and the lower bound for $\beta^2$ in (1.10), we obtain that the second term of (1.19) is exceeded by (1.17). Hence, it follows from (1.19) that
$P[D < (\alpha-1)X^{-1}u] \le \Phi(((n-1)h)^{1/2}f\,u) + (1.17) + B\,b(L)$
for $a-b \le u < 0$.
Integrating this inequality wrt $u$ over $[a-b, 0)$ and the inequality (1.18) wrt $u$ over $[0, b-a]$, then bounding their first terms by using the inequality $\int_0^{b-a}\Phi(-Au)\,du \le (2\pi)^{-1/2}A^{-1}$ for $A > 0$, we obtain from (1.5) that
2 a; + 2(b-a) (1.17) + 2(b-a>B b(L).
E1|l* - ll|XJ s
«mm 15
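The integral inequality used in this step rests on the identity $\int_0^\infty \Phi(-Au)\,du = (2\pi)^{-1/2}A^{-1}$, which follows from $\int_0^\infty P[Z > t]\,dt = E[Z^+] = (2\pi)^{-1/2}$ for a standard normal $Z$. The sketch below checks it numerically with a midpoint rule; the function names, the step count and the truncation point are illustrative assumptions.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def integrated_tail(A, upper, steps=20000):
    """Midpoint-rule approximation of the integral of Phi(-A u) over [0, upper]."""
    width = upper / steps
    return width * sum(normal_cdf(-A * (k + 0.5) * width) for k in range(steps))

def tail_bound(A):
    """The bound (2 pi)^(-1/2) / A."""
    return 1.0 / (A * math.sqrt(2.0 * math.pi))

for A in (0.5, 2.0):
    # A finite upper limit stays below the bound; a large one approaches it.
    print(A, integrated_tail(A, 1.0), integrated_tail(A, 40.0 / A), tail_bound(A))
```

With $A = ((n-1)h)^{1/2}f$ this is what turns the first terms of (1.18) and of the inequality for $u < 0$ into a term of order $((n-1)h)^{-1/2}f^{-1}$.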
In view of this inequality, (1.17) and the bound in the sublemma, we continue the proof of the lemma by showing that $f^{-1}$ and $(1 + X^{-1})(\Delta\bar H(X + (s-1)h))^{-1/2}$ are uniformly bounded and $P_n$-integrable.
h - -
= P(X +'eSh)/q(X + esh) for some
, -X+sh -X+s
Since ij /H]x
e in (0,1) by Cauchy's mean value theorem, Xd/(a-l)p 2 a,
and Since f is defined as the positive solution of (1.15),
we (obtain that f“1 is exceeded by
k2 X (l + 2(a-1)X-lb)(1-{l]§+8h)%((1+ ((a-l)a)-1(X + ssh))%
(1.20)
hgp
. ‘“ +Sh - -l
Slnce a]: = sh q(X + ﬁsh) for some 0 < 6 < 1, (01-1)
_ _ _ _ _ _ 1-
q(X + ssh)/P(X) s b(x + s)“ 2/x°’ 1 and b 0’ e Va _<. 1“(o)X o’pj a
Ei‘cy e-X/b, the condition b < 2a implies that the expectation
(Df the upper bound (1.20) for £71 is uniformly bounded.
Since, by the mean value theorem, 5 H(X + (S~l)h) =
d (}{ + €(S-l)h) for some 6 in (0,1) and since
r-(C¥_1)ba-la 2 Xa-Ze-X/a a-la-ae-X/b
and P(a)pn S X , the con-
Cli.tions b < 2a and a > 2 ‘imply that the expectation of
)(-1(A H(X + (s-l)h))15 is uniformly bounded.
The same method of bounding $P_n[p^* \le 0]$ completes the proof.
Now we state and prove the main result of the section. This result is a consequence of Lemma 1, Theorem 2.1 and (2.5) of Gilliland (1968).
Tﬂieorem 1. If a > 2, b < 2a, (0.8) is satisfied, h
-l/s+l
vn
- - - -1
wuh0<¥2reAMrs mrwm r
iri (O,%) and if is defined by (1.2), then
Dn(9.ll> = 0(n‘s‘2(s+l)>.
Proof. Since $p_\theta(u)$ is exceeded by $(\Gamma(\alpha))^{-1}a^{-\alpha}u^{\alpha-1}e^{-u/b}$ uniformly in all $\theta$ belonging to $[a,b]$ and $\mu[u^{\alpha-1}e^{-u/b}] < \infty$, it follows by Theorem 2.1 of Gilliland (1968) that
-1 n -1
l. E - X = 1
< > n jgl [\lj(xj) yj_1( j)l] 0(n og n)
where $O(n^{-1}\log n)$ is uniform in all parameter sequences $\underline{\theta}$ with components in $[a,b]$.
Since the inequality (2.5) of Gilliland (1968) continues to hold when the $\varphi_i$ mentioned there are randomized procedures and since the procedure defined by (1.2) takes values in $[a,b]$, it follows by (1.19) that
‘k -1 n * -1
‘Dn(_8_,i )1 S 4b n jEIEENJ - ¢j_1(Xj)H + 0(n log 11).
Hence the result of the theorem follows from Lemma 1 and the definition of $h$ in the statement of the theorem.
§2.2 Two-action Problem. Rates of Convergence for $D_n(\underline{\theta}, \underline{\varphi})$ with $\underline{\varphi}$ Based on Kernel Estimators for a Density
In this section, under certain conditions on $\mathcal{P}$, we show that, for each positive integer $s$, the modified regret of the procedure $\underline{\varphi}$ (defined below by (2.5) and (2.6)) is $O(n^{-s/2(s+1)})$ when the component problem involved is the two-action problem described in section §2.0. The method of proving this rate of convergence is similar to that of Johns (1967).
In this section, we make it a convention that the value of any decision function is the probability of taking action $a_1$. Define, for each $n$,
(2.1) $\gamma_n = (\theta_n - c)p_n$.
If $R(G_n)$ denotes the component two-action problem described in section §2.0, then
$R(G_n) = \inf_{\varphi}\mu\Big[\varphi\,n^{-1}\sum_{j=1}^n\gamma_j\Big] + n^{-1}\sum_{j=1}^n(\theta_j - c)^-$.
Hence, with $m_n$ defined by
(2.2) $m_n = \sum_{j=1}^n\gamma_j$,
(2.3) $n\,R(G_n) = -\mu[m_n^-] + \sum_{j=1}^n(\theta_j - c)^-$.
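With $E$ denoting the expectation operation, for any randomized procedure $\underline{\varphi} = \{\varphi_n\}$, the risk of using $\varphi_n$ to decide about $\theta_n$ is given by $(\theta_n - c)E[\varphi_n] + (\theta_n - c)^-$ and hence the average risk of using $\varphi_1,\dots,\varphi_n$ to decide about $\theta_1,\dots,\theta_n$ respectively is given by
$\frac{1}{n}\sum_{j=1}^n\mu\big[\gamma_j(u)E[\varphi_j \mid X_j = u]\big] + \frac{1}{n}\sum_{j=1}^n(\theta_j - c)^-$
where $E[\varphi_j \mid X_j = u]$ is a conditional expectation of $\varphi_j$ given $X_j = u$. Hence, it follows from (0.2) and (2.3) that
(2.4) $n\,D_n(\underline{\theta}, \underline{\varphi}) = \mu[\gamma_1\varphi_1] + \mu\Big[\sum_{j=2}^n\gamma_j(u)E[\varphi_j \mid X_j = u] + m_n^-(u)\Big]$.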
Let $h > 0$ be a function of $n$. We define $\underline{\varphi} = \{\varphi_n\}$ as follows. Let
(2.5) $\varphi_1 = 1$
and, for $n > 1$,
(2.6) $\varphi_n = [Xq^*/(\alpha-1) < c\,p^*]$
where $p^*$ and $q^*$ are defined by (0.6) and (0.7) respectively and $X$ is an abbreviation for $X_n$.
Let, for $n > 1$,
(2.7) $S_{n-1} = (n-1)\Big(\frac{u\,q^*}{\alpha-1} - c\,p^*\Big)$ for $u > 0$
where $q^*$ and $p^*$ are evaluated at $Y_1,\dots,Y_{n-1}, u$ and $X_1,\dots,X_{n-1}, u$ respectively,
(2.8) $m^*_{n-1} = E[S_{n-1}]$ for $u > 0$
and
(2.9) $\beta^2_{n-1} = \mathrm{Var}(S_{n-1})$ for $u > 0$.
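Lemma 2. If $\alpha > 2$, $b < 2a$, (0.8) is satisfied and $h$ is in $\mathcal{H} = \{h \mid 0 < h((\alpha-1)^{\alpha-1} \vee (\alpha-2)^{\alpha-2}) < e^{\alpha-2}\Gamma(\alpha-1)\,a\,r\,s^{-1}\}$ for some $0 < r < \tfrac{1}{2}$, then
$\mu\Big[|\gamma_n|\,\Big|\Phi\Big(-\frac{m^*_{n-1}}{\beta_{n-1}}\Big) - \Phi\Big(-\frac{m_{n-1}}{\beta_{n-1}}\Big)\Big|\Big] \le k_1((n-1)h^{2s+1})^{1/2}$ for $n > 1$.
Proof. By the mean value theorem and the inequality $\sqrt{2\pi}\,\Phi' \le 1$, we obtain that
(2.10) $\Big|\Phi\Big(-\frac{m^*_{n-1}}{\beta_{n-1}}\Big) - \Phi\Big(-\frac{m_{n-1}}{\beta_{n-1}}\Big)\Big| \le \frac{|m^*_{n-1} - m_{n-1}|}{\sqrt{2\pi}\,\beta_{n-1}}$.
Since (0.8) is satisfied by the hypothesis, it follows from (0.9), (0.10) and the definitions of $m^*_{n-1}$ in (2.8) and $m_{n-1}$ in (2.2) that
(2.11) $|m^*_{n-1} - m_{n-1}| \le (n-1)h^s\Big(\frac{u}{\alpha-1}k_6 + c\,k_7\Big)$.
Now we get a lower bound for $\beta^2_{n-1}$. Let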
(2.12) $hZ_j(u) = \sum_{i=1}^{s} d_i\big(u(\alpha-1)^{-1}[u + (i-1)h < Y_j < u + ih] - c[u + (i-1)h < X_j < u + ih]\big)$.
Then, since $P[Y_j \ge X_j] = 0$, $hZ_j$ defined above takes $2^{-1}s(s+5) + 1$ values; namely,
(2.13) $0$, $d_iu(\alpha-1)^{-1} - d_jc$ for $1 \le i \le j \le s$, $d_iu(\alpha-1)^{-1}$ for $i = 1,\dots,s$ and $-d_ic$ for $i = 1,\dots,s$
with nonzero probability. The probability that $hZ_j$ takes the value zero in (2.13) is given by $P[X_j \notin (u, u+sh),\ Y_j \notin (u, u+sh)]$. Then, it follows by the hypothesis on $h$, (1.8) and the inequality $P(A \cap B) \ge P(A) + P(B) - 1$ for any two events $A$ and $B$ that this probability is at least $1-2r > 0$. Hence, since $hZ_j$ takes the value $d_s$ with probability $P[u + (s-1)h < Y_j < u + sh \le X_j]$, we have from L1.A (see Appendix) that
(2.14) $\mathrm{Var}(hZ_j) \ge (1-2r)d_s^2\,P[u + (s-1)h < Y_j < u + sh \le X_j]\,(1 - P[u + (s-1)h < Y_j < u + sh \le X_j])$.
Hence, since $\inf\{1 - P[u + (s-1)h < Y_j < u + sh \le X_j] \mid u > 0,\ \theta_j \in [a,b],\ h \in \mathcal{H}\} > 0$, by using the argument given to obtain (1.10) from (1.9), we obtain that
(2.15) $h\beta^2_{n-1} \ge k_2^2(n-1)\Delta\bar H(u + (s-1)h)$.
Therefore, we have from (2.10) and (2.11) that
(2.16) $\Big|\Phi\Big(-\frac{m^*_{n-1}}{\beta_{n-1}}\Big) - \Phi\Big(-\frac{m_{n-1}}{\beta_{n-1}}\Big)\Big| \le \dfrac{((n-1)h^{2s+1})^{1/2}\big(k_6\frac{u}{\alpha-1} + c\,k_7\big)}{k_2(\Delta\bar H(u + (s-1)h))^{1/2}}$.
We have
(2.17) $a^{\alpha}\Gamma(\alpha)|\gamma_n| \le (b + c)u^{\alpha-1}e^{-u/b}$.
By the mean value theorem,
(2.18) $\Delta\bar H(u + (s-1)h) = \bar q(u + \epsilon h)$ for some $\epsilon$ in $(s-1, s)$.
Hence, since
(2.19) $b^{\alpha-1}\Gamma(\alpha-1)\bar q(u) \ge u^{\alpha-2}e^{-u/a}$,
it follows by the hypothesis on $h$,
(2.20) $b^{\alpha-1}\Gamma(\alpha-1)\Delta\bar H(u + (s-1)h) \ge u^{\alpha-2}e^{-1-u/a}$.
Since $b < 2a$ the result follows from the inequalities (2.16), (2.17) and (2.20).
Let
(2.21) $L_{n-1} = \beta_{n-1}^{-3}\sum E|Z_j - EZ_j|^3$
where $Z_j$ is defined by (2.12).
Lemma 3. If the hypothesis of Lemma 2 is satisfied, then
$\mu[|\gamma_n|L_{n-1}] \le k_3((n-1)h)^{-1/2}$ for $n > 1$.
Proof. The standardized range bound for $L_{n-1}$, together with (2.15) and the fact that the maximum of the moduli of the values of $hZ_j$ defined in (2.12) is at most
(2.22) $\max\{|d_1|,\dots,|d_s|\}(u(\alpha-1)^{-1} + c)$,
implies that
$L_{n-1} \le \dfrac{\max\{|d_1|,\dots,|d_s|\}(u(\alpha-1)^{-1} + c)}{k_2((n-1)h)^{1/2}(\Delta\bar H(u + (s-1)h))^{1/2}}$.
Since $b < 2a$ and $\alpha > 2$ implies that the $u$-integral of the rhs of the inequality obtained by weakening this inequality for $L_{n-1}$ by using (2.20) is uniformly bounded, the proof of the lemma is complete.
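Below, we get an upper bound for $\beta^2_{n-1}$. We have by the definition of $hZ_j$ in (2.12) and (2.22) that
$h^2EZ_j^2 \le (\max\{|d_1|,\dots,|d_s|\})^2(u(\alpha-1)^{-1} + c)^2\big(F_j\big]_u^{u+sh} + H_j\big]_u^{u+sh}\big)$
where $F_j$ and $H_j$ are the distribution functions of $X_j$ and $Y_j$ respectively. Therefore, since $\beta^2_{n-1} \le \sum EZ_j^2$, we have
(2.23) $h^2\beta^2_{n-1} \le (\max\{|d_1|,\dots,|d_s|\})^2(u(\alpha-1)^{-1} + c)^2(n-1)\big(\bar F\big]_u^{u+sh} + \bar H\big]_u^{u+sh}\big)$.
Lemma 4. $\mu[\beta_{n-1}] \le k_4((n-1)h^{-1})^{1/2}$ for $n > 1$.
Proof. By (2.23) and the inequality $(a+b)^{1/2} \le a^{1/2} + b^{1/2}$ for $a, b > 0$, we have
$((n-1)h^{-1})^{-1/2}\beta_{n-1} \le \max\{|d_1|,\dots,|d_s|\}(u(\alpha-1)^{-1} + c)\Big(\big(h^{-1}\bar F\big]_u^{u+sh}\big)^{1/2} + \big(h^{-1}\bar H\big]_u^{u+sh}\big)^{1/2}\Big)$.
By the mean value theorem $h^{-1}\bar F\big]_u^{u+sh} = s\,\bar p(u + \epsilon sh)$ and $h^{-1}\bar H\big]_u^{u+sh} = s\,\bar q(u + \delta sh)$ for some $0 < \epsilon, \delta < 1$. Hence, since $\Gamma(\alpha)a^{\alpha}\bar p \le u^{\alpha-1}e^{-u/b}$, $\Gamma(\alpha-1)a^{\alpha-1}\bar q \le u^{\alpha-2}e^{-u/b}$ and $u^{\alpha-2}e^{-u/b}$ is $u$-integrable, the result of the lemma follows.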
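The proof of the following theorem depends on Lemmas 2, 3, 4 and part of the method of proof of Theorem 1 of Johns (1967).
Theorem 2. For each positive integer $s$, if (0.8) is satisfied, $\alpha > 2$, $b < 2a$, $h = \gamma n^{-1/(s+1)}$ where $\gamma((\alpha-1)^{\alpha-1} \vee (\alpha-2)^{\alpha-2}) < e^{\alpha-2}\Gamma(\alpha-1)\,a\,r\,s^{-1}$ for some $0 < r < \tfrac{1}{2}$ and $\underline{\varphi}$ is defined by (2.5) and (2.6), then
$D_n(\underline{\theta}, \underline{\varphi}) = O(n^{-s/2(s+1)})$.
Proof. By (2.4) and the definition of $\underline{\varphi}$ in (2.5) and (2.6), we have
(2.24) $n|D_n(\underline{\theta}, \underline{\varphi})| \le \mu[|\gamma_1|] + \Big|\mu\Big[\sum_{j=2}^n\gamma_j(u)E[\varphi_j \mid X_j = u] + m_n^-(u)\Big]\Big|$.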
To start with, we consider bounding the integrand of the second term of the rhs of (2.24) on the set $[m_n > 0]$. Afterwards we consider the case when $m_n \le 0$. So, let $m_n > 0$ until otherwise stated. Since $m_n > 0$, we have by the definition of $\varphi_j$ in (2.6) and $S_{j-1}$ in (2.7) for $j > 1$ that
(2.25) $\sum_{j=2}^n\gamma_j(u)E[\varphi_j \mid X_j = u] + m_n^-(u) = \sum_{j=2}^n\gamma_j(u)P[S_{j-1} < 0]$
where $P$ stands for the joint probability measure of all the random variables involved.
By the triangle inequality,
(2.26) $\Big|\sum_{j=2}^n\gamma_j(u)P[S_{j-1} < 0]\Big| \le |D_1| + |D_2| + |D_3|$
where
(2.27) $D_1 = \sum_{j=2}^n\gamma_j\Big(P[S_{j-1} < 0] - \Phi\Big(-\frac{m^*_{j-1}}{\beta_{j-1}}\Big)\Big)$,
(2.28) $D_2 = \sum_{j=2}^n\gamma_j\Big(\Phi\Big(-\frac{m^*_{j-1}}{\beta_{j-1}}\Big) - \Phi\Big(-\frac{m_{j-1}}{\beta_{j-1}}\Big)\Big)$
and
(2.29) $D_3 = \sum_{j=2}^n\gamma_j\Phi\Big(-\frac{m_{j-1}}{\beta_{j-1}}\Big)$.
With $B$ denoting the Berry-Esseen constant, the Berry-Esseen theorem (Loève (1963), p. 288) gives
$|D_1| \le B\sum_{j=2}^n|\gamma_j|L_{j-1}$.
Therefore, by Lemma 3,
(2.30) $\mu[[m_n > 0]|D_1|] \le B\,k_3\sum_{j=2}^n((j-1)h)^{-1/2}$.
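To illustrate the kind of bound the Berry-Esseen theorem supplies, the sketch below compares the exact distribution function of a standardized sum of Bernoulli variables with $\Phi$ and reports the Lyapunov ratio $L$. This is a toy setting, not the $Z_j$ of (2.12). The function names are hypothetical, and the numerical value 0.56 used for the constant $B$ in the final check is an assumption of this sketch, chosen in line with published bounds on the constant; the argument in the text only needs that some absolute constant exists.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def binomial_normal_gap(n, p):
    """Return (sup |F - Phi|, L) for a standardized sum of n Bernoulli(p) variables."""
    q = 1.0 - p
    beta = math.sqrt(n * p * q)
    cdf, gap = 0.0, 0.0
    for k in range(n + 1):
        phi = normal_cdf((k - n * p) / beta)
        gap = max(gap, abs(cdf - phi))  # limit from the left of the jump at k
        cdf += math.comb(n, k) * p ** k * q ** (n - k)
        gap = max(gap, abs(cdf - phi))  # value at the jump
    # E|Z - p|^3 = p q^3 + q p^3 for a single Bernoulli(p) summand.
    lyapunov = n * (p * q ** 3 + q * p ** 3) / beta ** 3
    return gap, lyapunov

gap, L = binomial_normal_gap(40, 0.3)
print(gap, L, gap <= 0.56 * L)
```

Because the distribution function of the sum is a step function and $\Phi$ is increasing, the supremum of the difference is attained at the jump points, which is why only those are inspected.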
By Lemma 2, we have
(2.31) $\mu[[m_n > 0]|D_2|] \le k_1\sum_{j=2}^n((j-1)h^{2s+1})^{1/2}$.
Replacing $a_j$ and $s_j$ by $\gamma_j$ and $\beta_j$ in (2.6) through (2.13) of Theorem 1 of Johns (1967), we obtain that
n y n ‘y,‘
(2.32) \D \ s¢(0) z -1——+ z ——1—+(e +5 )(A§(A) +2@(0)) --
3 j=2 Bj_1 3:2 j2 n 1 1 1 r"
where $\max_{x>0} x\Phi(-x) = A_1\Phi(-A_1)$.
The lower bound for $\beta^2_{j-1}$ in (2.15), the lower bound for $\Delta\bar H(u + (s-1)h)$ in (2.20) and the upper bound for $\gamma_j$ in (2.17), together with the conditions $b < 2a$ and $\alpha > 2$, imply that $\mu[\gamma_j^2/\beta_{j-1}] \le k_5((j-1)h^{-1})^{-1/2}$. Hence, since (2.17) implies that $\mu[|\gamma_j|]$ is uniformly bounded, it follows from (2.32) and Lemma 4 that
$\mu[[m_n > 0]|D_3|] \le k_8\Big(\sum_{j=2}^n((j-1)h^{-1})^{-1/2} + \sum_{j=2}^n\frac{1}{j^2} + (nh^{-1})^{1/2} + 1\Big)$.
Hence (2.25) to (2.32) imply that
(2.33) $\Big|\mu\Big[[m_n > 0]\Big(\sum_{j=2}^n\gamma_j(u)E[\varphi_j \mid X_j = u] + m_n^-(u)\Big)\Big]\Big| = \Big|\mu\Big[[m_n > 0]\sum_{j=2}^n\gamma_j(u)P[S_{j-1} < 0]\Big]\Big|$
$\le k_9\Big(\sum_{j=2}^n((j-1)h)^{-1/2} + \sum_{j=2}^n((j-1)h^{2s+1})^{1/2} + \sum_{j=2}^n((j-1)h^{-1})^{-1/2} + \sum_{j=2}^n\frac{1}{j^2} + (nh^{-1})^{1/2} + 1\Big)$.
Now we consider bounding the integrand of the second term of the rhs of (2.24) on $[m_n \le 0]$. For $u$ in $[m_n \le 0]$, by the definition of $\varphi_j$ in (2.6) and $S_{j-1}$ in (2.7) for $j > 1$, we obtain that
$\sum_{j=2}^n\gamma_j(u)E[\varphi_j \mid X_j = u] + m_n^-(u) = -\gamma_1(u) - \sum_{j=2}^n\gamma_j(u)P[S_{j-1} \ge 0]$.
Following the same argument we gave to bound $\sum_{j=2}^n\gamma_j(u)P[S_{j-1} < 0]$ by rhs of (2.33), we obtain that
$\Big|\mu\Big[[m_n \le 0]\Big(\sum_{j=2}^n\gamma_j(u)E[\varphi_j \mid X_j = u] + m_n^-(u)\Big)\Big]\Big| \le$ rhs of (2.33) $+\ \mu[|\gamma_1|]$.
Since (2.17) implies that $\mu[|\gamma_1|]$ is uniformly bounded, this inequality and (2.33), together with (2.24) and the hypothesis concerning $h$, imply the result of the theorem.
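The last step relies on the choice $h = \gamma n^{-1/(s+1)}$ balancing the two dominant sums in the bound, $\sum_{j=2}^n((j-1)h)^{-1/2}$ and $\sum_{j=2}^n((j-1)h^{2s+1})^{1/2}$: after division by $n$, both are of order $n^{-s/2(s+1)}$. The sketch below checks this numerically. The function name `normalized_terms` and the default $\gamma = 1$ are illustrative assumptions; the constraint on $\gamma$ from the theorem is ignored here because it affects only constants, not the exponent.

```python
def normalized_terms(n, s, gamma=1.0):
    """Return both sums divided by n and by the claimed rate n^(-s/(2(s+1)))."""
    h = gamma * n ** (-1.0 / (s + 1))
    rate = n ** (-s / (2.0 * (s + 1)))
    # Sum coming from the Berry-Esseen error terms, as in (2.30).
    variance_sum = sum(((j - 1) * h) ** -0.5 for j in range(2, n + 1))
    # Sum coming from the bias terms, as in (2.31).
    bias_sum = sum(((j - 1) * h ** (2 * s + 1)) ** 0.5 for j in range(2, n + 1))
    return variance_sum / n / rate, bias_sum / n / rate

for n in (1000, 100000):
    print(n, normalized_terms(n, s=2))
```

If the normalized quantities stay bounded as $n$ grows, the two sums contribute $O(n^{-s/2(s+1)})$ to $D_n$, which is the rate stated in Theorem 2; a smaller or larger exponent for $h$ would make one of the two sums dominate.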
APPENDIX
We apply the following lemma for obtaining lower bounds for certain variances in Lemmas 4 and 11 of Chapter I and Lemmas 1, 2 and 3 of Chapter II. The inequality in the following lemma is trivially true when $p_0 = 1$.
Lemma 1.A. Let $p_0 < 1, p_1, \dots, p_i, \dots$ be a probability distribution on $\{0, 1, \dots, i, \dots\}$ and let $Z$ be the r.v. $Z(i) = z_i$ for specified $z_0 = 0, z_1, \dots, z_i, \dots$ with $\sum z_i^2p_i < \infty$. Let $q_i$ abbreviate $1 - p_i$ and let $I(\lambda) = \sum_1^\infty p_i(1 - \lambda q_i)^{-1}$. Then $I \uparrow$ from $q_0$ to $\#\{i \ge 1 \mid p_i > 0\}$ as $\lambda \uparrow$ from 0 to 1 and
$\mathrm{Var}(Z) \ge \lambda_1\sum_1^\infty z_i^2p_iq_i$
with $\lambda_1$ the unique root of $I(\lambda) = 1$. Since $I(p_0) \le 1$, $\lambda_1 \ge p_0$.
Proof. $I \uparrow$ since each summand $p_i(1 - \lambda q_i)^{-1}$ with $p_i > 0$ $\uparrow$. Since equality holds in the inequality when $\lambda_1 = 1$, we consider below the case $\lambda_1 < 1$.
To prove the inequality when $\lambda_1 < 1$, let $\psi(z) = \mathrm{Var}(Z) - \lambda_1\sum z_i^2p_iq_i$ for $z = (z_1, z_2, \dots)$. Denoting the first and second partials wrt $z_j$ by $\psi_j$ and $\psi_{jj}$ respectively,
$\psi_j(z) = 2p_j\{(1 - \lambda_1q_j)z_j - \sum z_ip_i\}$, $\psi_{jj}(z) = 2(1 - \lambda_1)p_jq_j$.
For $j$ with $p_j > 0$, $\psi$ is, therefore, minimal wrt $z_j$ variation iff $z_j = (1 - \lambda_1q_j)^{-1}\sum z_ip_i$. These conditions are satisfied iff, for some constant $c$, $z_j = c(1 - \lambda_1q_j)^{-1}$ for $j$ with $p_j > 0$. For such $z$,
$\psi(z) = c^2\{\sum p_i(1 - \lambda_1q_i)^{-2} - 1 - \lambda_1\sum p_iq_i(1 - \lambda_1q_i)^{-2}\} = 0$
which yields the nonnegativity of $\psi$ asserted by the lemma.
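The lemma can be explored numerically. The sketch below is an illustration under stated assumptions: it reads the lemma as $\mathrm{Var}(Z) \ge \lambda_1\sum_{i \ge 1} z_i^2p_iq_i$ with $\lambda_1$ the root of $\sum_{i \ge 1} p_i(1 - \lambda q_i)^{-1} = 1$, finds that root by bisection, and evaluates the gap $\psi(z)$ for a given $z$. The function names and the example distribution are hypothetical.

```python
def lambda_one(p):
    """Root of I(lam) = 1, where I(lam) = sum over i >= 1 of p_i / (1 - lam * q_i)."""
    support = [pi for pi in p[1:] if pi > 0]
    if len(support) <= 1:
        return 1.0  # I(1) = #{i >= 1 : p_i > 0} <= 1, so the root sits at lam = 1

    def I(lam):
        return sum(pi / (1.0 - lam * (1.0 - pi)) for pi in support)

    lo, hi = 0.0, 1.0  # I is increasing, I(0) = 1 - p_0 <= 1 and I(1) >= 2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if I(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return lo

def variance_gap(p, z):
    """Return (Var(Z) - lam_1 * sum z_i^2 p_i q_i, lam_1); z[0] is taken to be 0."""
    lam = lambda_one(p)
    mean = sum(zi * pi for zi, pi in zip(z, p))
    var = sum(zi * zi * pi for zi, pi in zip(z, p)) - mean ** 2
    weighted = sum(zi * zi * pi * (1 - pi) for zi, pi in zip(z, p))
    return var - lam * weighted, lam

p = [0.5, 0.2, 0.2, 0.1]
gap, lam = variance_gap(p, [0.0, 1.0, -2.0, 3.0])
print(gap, lam)

# The minimizing direction from the proof: z_j = (1 - lam_1 q_j)^(-1) for j >= 1.
z_ext = [0.0] + [1.0 / (1.0 - lam * (1.0 - pi)) for pi in p[1:]]
print(variance_gap(p, z_ext)[0])
```

For the extremal $z$ the gap should vanish up to rounding, matching the computation $\psi(z) = 0$ in the proof, while a generic $z$ gives a strictly positive gap. The bound $\lambda_1 \ge p_0$ is what is used in (1.9) and (2.14), with $p_0$ bounded below by $1 - 2r$.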
BIBLIOGRAPHY
Clemmer, Bennie A. and Krutchkoff, Richard G. (1968). The use of
empirical Bayes estimators in a linear regression model.
Biometrika 55, 525-534.
Fox, Richard (1968). Contributions to compound decision theory and
empirical squared error loss estimation. RM-214, Department
of Statistics and Probability, Michigan State University.
Gilliland, Dennis (1966). Approximation to Bayes risk in sequences
of non-finite decision problems. RM-162, Department of
Statistics and Probability, Michigan State University.
Gilliland, Dennis (1968). Sequential compound estimation. Ann.
Math. Statist. 39, 1890-1904.
Graves, Lawrence M. (1956). The theory of functions of real
variables, 2nd Edition. Macmillan.
Hannan, James F. (1957). Approximation to Bayes risk in repeated
play. Contributions to the Theory of Games, 3, 97-139.
Ann. Math. Studies No. 39, Princeton University Press.
Hannan, J. (1964). Mathematical Reviews 27, 828.
Johns, M.V., Jr. (1967). Two-action compound decision problems.
Proceedings of the Fifth Berkeley Symposium on Mathematical
Statistics and Probability, 463-478. University of
California Press.
Johns, M.V., Jr. and Van Ryzin, J. (1967). Convergence rates for
empirical Bayes two-action problems II. Continuous case.
Technical Report No. 132, Department of Statistics, Stanford
University.
Loève, Michel (1963). Probability Theory, 3rd Edition. Van Nostrand.
Martz, Harry F., Jr. and Krutchkoff, Richard G. (1969). Empirical
Bayes estimators in a multiple linear regression model.
Biometrika 56, 367-374.
Miyasawa, Koichi (1961). An empirical Bayes estimator of the mean
of a normal distribution. Bull. Inst. Internat. Statist.
38, 181-188.
Pi-Erh, Lin (1968). Estimation of a multivariate density and its
partial derivatives, with empirical Bayes applications.
Ph.D. Thesis, Columbia University.
Rutherford, John R. (1965). Some parametric empirical Bayes
techniques. Ph.D. Thesis, Virginia Polytechnic Institute.