

 

 

 


 

 

 

 

 

 

ON ADAPTIVE ESTIMATION

By
Anton Schick

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY
Department of Statistics and Probability
1983

ABSTRACT
ON ADAPTIVE ESTIMATION
By
Anton Schick

A general method for the construction of locally asymptotically
minimax (LAM) - adaptive estimates is given under conditions weaker than
those in Bickel (1982). In particular, we show that Bickel's condition
S* is not necessary for LAM-adaptive estimation and replace it by a
weaker condition. This new condition is found to be necessary and
sufficient for a class of estimates to be regular, a property which implies
LAM-adaptivity under Stein's (1956) necessary condition for LAM-adaptive
estimation and which coincides with Bickel's notion of adaptivity.

We demonstrate our method by constructing an LAM-adaptive estimate in
a situation where condition S* fails.

To my parents, my wife Jeanette and my son Andreas

ACKNOWLEDGEMENT

I wish to express my sincere thanks to Professor Václav Fabian
for his guidance in the preparation of this dissertation. The advice
and encouragement he gave are greatly appreciated. Also, I would like
to thank Professor H. Koul for awakening my interest in adaptive
estimation and for his review of my thesis and Professors J. Hannan
and S. Axler for serving on my committee.

Finally, I wish to thank my parents for their constant support
throughout my studies and my wife Jeanette for her understanding and
encouragement.

TABLE OF CONTENTS

1. Introduction
2. A necessary and sufficient condition
3. An example
4. An auxiliary result

Bibliography

1. INTRODUCTION

In a recent paper Fabian and Hannan (1982) use results on locally
asymptotically minimax (LAM) estimates for locally asymptotically normal
(LAN) families to reformulate Stein's (1956) heuristic arguments on
adaptive estimation. The authors define LAM adaptivity of estimates,
prove that a condition S, due to Stein, is necessary for the existence
of such estimates, and give a sufficient property - regularity - for
estimates to be LAM adaptive (see their Theorem 7.10).

Bickel (1982) formulates a condition S*, stronger than S. He
then constructs regular (and thus LAM adaptive) estimates under Condition
S* and when estimates of the nuisance parameter are available. He
uses this result to obtain regular estimates in several important cases.

Bickel states that S* is "heuristically necessary" for the existence
of LAM adaptive estimates (preceding Condition C, following Condition H).
We give a simple counterexample to the necessity of S* and obtain
results on the existence of regular estimates without S*. It is seen
that if S* does not hold, regular estimates are more difficult to
construct in that a certain rate of convergence for the estimate of the
nuisance parameter is required.

We consider estimates of a certain type and obtain a necessary
and sufficient condition for such an estimate to be regular. The class
of estimates we consider includes the estimates considered by Bickel, but
it is a larger class.

The results described above are derived in the case of i.i.d.
observations under weaker conditions than Bickel's regularity conditions
(see Remark 4.4).

Some notation will be introduced next. If $P, Q$ are probabilities
on a $\sigma$-algebra $\mathcal{X}$, and $Q_+$ is the absolutely continuous, with
respect to $P$, part of $Q$, then any Radon-Nikodym derivative of $Q_+$ with
respect to $P$ will be called a pseudodensity of $Q$ with respect to $P$,
and also a pseudodensity of $\int\cdot\,dQ$ with respect to $\int\cdot\,dP$. We shall talk
frequently about expectations using terms which make sense when applied
to the probabilities. If $E$ and $F$ are expectations on a $\sigma$-algebra
$\mathcal{S}$, then $dF/dE$ denotes the set of all pseudodensities of $F$ with
respect to $E$ which are non-negative and finite valued.

$\mathbb{R}^k$ denotes the $k$-dimensional Euclidean space, $\mathcal{B}_k$ the $\sigma$-algebra
of the Borel subsets of $\mathbb{R}^k$, $\mathbb{R} = \mathbb{R}^1$, $\mathcal{B} = \mathcal{B}_1$. $\langle\ \rangle$ will be used to
denote finite or infinite sequences, and, in particular, points in $\mathbb{R}^k$.
In matrix calculations, points in $\mathbb{R}^k$ are columns. $I$ denotes the
identity matrix. The dimension is not displayed in $I$ but will be
clear from the context.

If $H_n$ is an expectation on a $\sigma$-algebra $\mathcal{S}_n$, $g_n$ an
$\langle\mathcal{S}_n,\mathcal{B}_k\rangle$-measurable transformation for each $n = 1,2,\dots$, and if
$c \in \mathbb{R}^k$, then
(i) we write $g_n \to c$ in $\langle H_n\rangle$-prob. if $H_n\chi\{\|g_n - c\| > \varepsilon\} \to 0$ for every
$\varepsilon > 0$ and (ii) we say $\langle g_n\rangle$ is bounded in $\langle H_n\rangle$-prob. if $\langle F_n\rangle$ is
tight, where $F_n$ is the distribution of $g_n$ under $H_n$.

2. A NECESSARY AND SUFFICIENT CONDITION

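We begin by specifying the asymptotic estimation problem we shall
consider throughout this paper.

1. Assumption. $\Theta$ is a non empty set satisfying $\Theta = \Theta_1 \times \Theta_2$ with
$\Theta_1$ an open subset of $\mathbb{R}^m$. $\theta = \langle\theta_1,\theta_2\rangle$ is a point in $\Theta$. $\mathcal{S}$ is a
$\sigma$-algebra and for every $\delta \in \Theta$, $E_\delta$ is an expectation on $\mathcal{S}$. $X_1, X_2, \dots$
is a sequence of $s$-dimensional random vectors which are independent and
identically distributed with distribution $F_\delta$ under $E_\delta$, for each
$\delta \in \Theta$. There exists a $\sigma$-finite integral $J$ such that each $F_\delta$ has
a density $f_\delta$ with respect to $J$. For every $\delta = \langle t,v\rangle \in \Theta$ the map
$u \in \Theta_1 \to f^{1/2}_{\langle u,v\rangle}$ is differentiable at $t$ in $L_2(J)$ with derivative
$h_\delta = h(\cdot,t,v)$. Furthermore $\int h_\theta h_\theta^T\,dJ$ is nonsingular and the map
$u \in \Theta_1 \to h(\cdot,u,\theta_2)$ is componentwise continuous at $\theta_1$ in $L_2(J)$.

2. Notation. For every $\delta \in \Theta$ and $n = 1,2,\dots$ we denote by $\mathcal{S}_n$
the $\sigma$-algebra generated by $X_1,\dots,X_n$ and by $E_{n\delta}$ the restriction of
$E_\delta$ to $\mathcal{S}_n$. We set $\mathbf{E} = \langle E_{n\delta}, \delta \in \Theta\rangle$ and
$\mathbf{E}_v = \langle E_{n\langle t,v\rangle}, t \in \Theta_1\rangle$ for $v \in \Theta_2$. For convenience in notation
we shall often write $f(\cdot,t,v)$ instead of $f_{\langle t,v\rangle}$ and similarly for other
functions $g_\delta$, $\delta \in \Theta$. We also set for $\delta \in \Theta$ and $n = 1,2,\dots$

(1) $M(\delta) = 4\int h_\delta h_\delta^T\,dJ$,

(2) $\ell_\delta = 2 h_\delta f_\delta^{-1/2}\chi\{f_\delta > 0\}$

and

(3) $Y_{n\delta} = (nM(\delta))^{-1/2}\sum_{j=1}^{n}\ell_\delta(X_j)$.

By $\tilde\Theta$ we denote the set of all $\delta = \langle t,v\rangle$ in $\Theta$ for which $M(\delta)$
is nonsingular and $u \in \Theta_1 \to h(\cdot,u,v)$ is componentwise continuous at
$t$ in $L_2(J)$.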

3. Remark. We are interested in estimating the first component $\delta_1$
of an unknown point $\delta$ in $\Theta$. $\Theta_1$ specifies our knowledge about this
component while $\Theta_2$ summarizes our knowledge about the second component,
the nuisance parameter. If $\delta \in \tilde\Theta$ we obtain from Theorem 4.8 in Fabian
and Hannan (1980) that $\mathbf{E}_{\delta_2}$ satisfies condition
LAN$\langle\delta_1, nM(\delta), Y_{n\delta}\rangle$.
In this case we want our estimate to be LAMA$(A_\delta,\delta)$ where $A_\delta$ is the
class of all subproblems which are LAN$\langle n\tilde M(\delta), \tilde Y_{n\delta}\rangle$ for some $\tilde M(\delta)$
and $\tilde Y_{n\delta}$ and satisfy Stein's necessary condition $(\tilde M(\delta))_{12} = 0$, see
Fabian and Hannan (1982), Section 7. By Theorem 7.10 in the same paper
this can be done by constructing an estimate $\langle Z_n\rangle$ which is regular at
$\delta$, i.e. satisfies

(1) $(nM(\delta))^{1/2}(Z_n - \delta_1) - Y_{n\delta} \to 0$ in $\langle E_{n\delta}\rangle$-prob.

Since $\delta$ is unknown, this suggests constructing an estimate which is
(globally) regular, i.e. regular at each point in $\tilde\Theta$. In some cases,
however, estimates which are regular at just one point, say $\theta$, are
also of interest. For this reason we restrict ourselves to the construction
of an estimate regular at $\theta$. This will facilitate the treatment and
the reader will have no difficulty seeing under what conditions this
estimate is globally regular.

4. Remark. Bickel (1982) defines adaptivity at $\theta$ for an estimate
$\langle Z_n\rangle$ by

(1) For every sequence $\langle t_n\rangle$ in $\Theta_1$ such that $\langle n^{1/2}(t_n - \theta_1)\rangle$ is bounded
the distribution of $(nM(\theta))^{1/2}(Z_n - t_n)$ under $E_{n\langle t_n,\theta_2\rangle}$ converges weakly
to the $m$-dimensional standard normal distribution.

Condition (1) is equivalent to (3.1). This follows from Theorem
6.3 in Fabian and Hannan (1982), Theorem 6.1 in Bickel (1982) and the
note thereafter. Hence an estimate which is adaptive at $\theta$ in Bickel's
sense is regular at $\theta$.

Bickel claims that the existence of a regular estimate implies
that each subproblem obeying his regularity condition R satisfies
Stein's condition $(\tilde M(\delta))_{12} = 0$, for every regular point $\delta$. But the
proof of this claim is incorrect due to an inappropriate reference to
Hájek (1972): Bickel considers only local alternatives for the parameter
of interest and not local alternatives of both the parameter of interest
and the nuisance parameter as needed in Hájek's Theorem 4.2. Thus it
remains an open question whether Bickel's claim is indeed true.

Next we define a map $Q$ from $\Theta$ into $\mathbb{R}^m$ by

$Q(t,v) = \int \ell(\cdot,t,v) f(\cdot,t,\theta_2)\,dJ$

if the integral is well defined and 0 otherwise.

Bickel's condition S* is

(S*) $Q = 0$.

The example below shows that (S*) is not necessary for the construction
of regular estimates. Another example is given in Section 3.

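5. Example. Let $\Theta_1 = (0,\infty)$, $\Theta_2 = \mathbb{R}$ and let the $X_i$'s be normal
random variables with mean $\mu$ and standard deviation $\sigma$ under $E_{\langle\sigma,\mu\rangle}$,
i.e. we want to estimate the standard deviation in the presence of an
unknown mean. Easy calculations show that Assumption 1 holds and that

(1) $\ell(\cdot,\sigma,\mu) = -\sigma^{-1} + \sigma^{-3}(\cdot - \mu)^2$,

(2) $\tilde\Theta = \Theta$

and

(3) $Q(\sigma,\mu) = \sigma^{-3}(\mu - \theta_2)^2$, for $\sigma > 0$, $\mu \in \mathbb{R}$.

Furthermore for every $\delta \in \Theta$, the full problem $\mathbf{E}$ satisfies condition
LAN$\langle\delta, n\tilde M(\delta), \tilde Y_{n\delta}\rangle$, where with $\delta = \langle\sigma,\mu\rangle$

(4) $\tilde M(\delta) = \sigma^{-2}\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$

and

(5) $\tilde Y_{n\delta} = \Big\langle (2n)^{-1/2}\sum_{j=1}^{n}\Big(\frac{(X_j-\mu)^2}{\sigma^2} - 1\Big),\ n^{-1/2}\sum_{j=1}^{n}\frac{X_j-\mu}{\sigma}\Big\rangle$

and the estimate $\langle\tilde Z_n\rangle$ defined by

(6) $\tilde Z_n = \Big\langle \Big(n^{-1}\sum_{j=1}^{n}(X_j - \bar X_n)^2\Big)^{1/2},\ \bar X_n\Big\rangle$

with $\bar X_n$ the sample average satisfies

(7) $(n\tilde M(\delta))^{1/2}(\tilde Z_n - \delta) - \tilde Y_{n\delta} \to 0$ in $E_\delta$-prob.

Verifications of the above are easy. We refer the reader to Example
9.2.12 and Theorem 9.4.33 in Fabian and Hannan (1983). From (4) and (7)
we obtain that Stein's necessary condition for LAM-adaptive estimation
holds and that $\langle Z_n\rangle$, with $Z_n$ the first component of $\tilde Z_n$, is a globally
regular estimate.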
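
Formula (3) of this example can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates $Q(\sigma,\mu) = \int \ell(x,\sigma,\mu) f(x,\sigma,\theta_2)\,dx$ by the trapezoidal rule and compares it with the closed form $\sigma^{-3}(\mu-\theta_2)^2$. The function names and the integration settings (`steps`, `width`) are illustrative assumptions.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of the normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def score_sigma(x, sigma, mu):
    """Score for sigma from (1): l(x, sigma, mu) = -1/sigma + (x - mu)^2 / sigma^3."""
    return -1.0 / sigma + (x - mu) ** 2 / sigma ** 3

def q_numeric(sigma, mu, theta2, steps=20000, width=12.0):
    """Trapezoidal approximation of Q(sigma, mu); the score uses the mean mu,
    the density uses the true mean theta2 (hypothetical helper)."""
    lo = theta2 - width * sigma
    hi = theta2 + width * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * score_sigma(x, sigma, mu) * normal_pdf(x, theta2, sigma)
    return total * h

def q_closed_form(sigma, mu, theta2):
    """Closed form (3): Q(sigma, mu) = (mu - theta2)^2 / sigma^3."""
    return (mu - theta2) ** 2 / sigma ** 3

if __name__ == "__main__":
    print(q_numeric(2.0, 1.5, 0.5), q_closed_form(2.0, 1.5, 0.5))
```

The two values agree up to quadrature error, and both vanish when $\mu = \theta_2$, so $Q$ is not identically zero although a globally regular estimate exists.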

6. Remark. We now motivate a natural choice for an estimate regular
at $\theta$ and then give a necessary and sufficient condition for this type
of estimate to be regular at $\theta$. We begin with a definition.

7. Definition. We say $\langle U_n\rangle$ is an auxiliary estimate at $\theta$ if each
$U_n$ is a $\Theta_1$-valued $\mathcal{S}_n$-measurable random vector and $\langle n^{1/2}(U_n - \theta_1)\rangle$ is
bounded in $\langle E_{n\theta}\rangle$-prob.

We say $\langle W_n\rangle$ is a consistent estimate of the information matrix
at $\theta$ if the $W_n$ are positive definite matrix valued random vectors
on $\mathcal{S}_n$ such that $M(\theta)^{-1/2} W_n M(\theta)^{-1/2} \to I$ in $\langle E_{n\theta}\rangle$-prob.

We say $\langle t_n\rangle$ is a local sequence if $\langle t_n\rangle$ is a sequence in
$\Theta_1$ and $\langle n^{1/2}(t_n - \theta_1)\rangle$ is bounded.

8. Remark. In Section 4 we prove that

(1) $Z_n(U_n,\theta_2) = U_n + (nW_n)^{-1}\sum_{j=1}^{n}\ell(X_j,U_n,\theta_2)$

is regular at $\theta$ if $\langle U_n\rangle$ is a discrete auxiliary estimate at $\theta$ and
$\langle W_n\rangle$ is a consistent estimate of the information matrix at $\theta$. For a
discussion and the use of discrete estimates we refer to Fabian and
Hannan (1982) and Bickel (1982). The estimate in (1) is of limited
practical value since it presupposes the knowledge of $\theta_2$, but it suggests
an obvious candidate for a regular estimate: simply replace $\theta_2$ by an
estimate. A different method consists in estimating the score function
$\ell(\cdot,\cdot,\theta_2)$ directly as Bickel (1982) does. But the present method serves
us better for the purpose of illustration. Substituting an estimate for
$\theta_2$ in (1) has to be done with some care. For technical reasons we adopt
an idea Bickel (1982) uses, but modify it to obtain better estimates of
the nuisance parameter. Recall that Bickel splits the sample in two
unequal parts, estimates the nuisance parameter based on the observations
in the smaller subsample and evaluates the score function only at
observations of the larger part. We divide the sample in two equal parts, obtain
an estimate of the nuisance parameter from each part and when evaluating
the score function with an observation of the first part we use the estimate
of the nuisance parameter based on the second part and vice versa. Thus
our estimates of the nuisance parameter are based on half the sample
and not just on a small proportion of the sample. This improvement is
vital, since it turns out that the estimate described above is regular
at $\theta$ if we can construct an estimate of the nuisance parameter with
a certain rate of convergence.

To have the above method well defined we make the following
assumption.

9. Assumption. Assumption 1 holds. $\Theta_2$ is a topological space. The
map $\ell(\cdot,t,\cdot)$ is measurable for each $t \in \Theta_1$ and

(1) $\int\|\ell(\cdot,t,v) - \ell(\cdot,t,\theta_2)\|^2 f(\cdot,t,\theta_2)\,dJ \to 0$

as $\langle t,v\rangle$ in $\Theta$ converges to $\theta$.

For every $n = 1,2,\dots$ there is a measurable map $h_n$ from
$(\mathbb{R}^s)^n$ into $\Theta_2$ such that $V_n = h_n(X_1,\dots,X_n)$ converges to $\theta_2$ in
$E_\theta$-prob. $\langle U_n\rangle$ is an auxiliary estimate at $\theta$ and $\langle W_n\rangle$ is a consistent
estimate of the information matrix at $\theta$.

10. Remarks and Notation. Note that Assumption 9 implies (3.11) of
condition H' in Bickel (1982) with $\hat\ell_n(\cdot,\cdot,x_1,\dots,x_n) = \ell(\cdot,\cdot,v_n)$.

Indeed, we have

(1) $\int\|\ell(\cdot,t_n,V_{k_n}) - \ell(\cdot,t_n,\theta_2)\|^2 f(\cdot,t_n,\theta_2)\,dJ \to 0$

in $E_\theta$-prob. for every sequence $\langle t_n\rangle$ in $\Theta_1$ converging to $\theta_1$ and
every sequence of integers $\langle k_n\rangle$ tending to infinity.

Also observe that under Assumption 9 $\int \ell(\cdot,t,v) f(\cdot,t,\theta_2)\,dJ$ is
well defined for $\langle t,v\rangle$ in a neighborhood of $\theta$.

The estimate described in Remark 8 is formally defined by

(2) $\hat Z_n(\bar U_n) = \bar U_n + (nW_n)^{-1}\Big(\sum_{j=1}^{m_n}\ell(X_j,\bar U_n,V_{n2}) + \sum_{j=m_n+1}^{n}\ell(X_j,\bar U_n,V_{n1})\Big)$

with $\langle\bar U_n\rangle$ a discrete auxiliary estimate at $\theta$, $m_n$ the integer part
of $n/2$ and

(3) $V_{n1} = h_{m_n}(X_1,\dots,X_{m_n})$ and $V_{n2} = h_{n-m_n}(X_{m_n+1},\dots,X_n)$.
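
To make the sample-splitting construction in (2) and (3) concrete, here is a minimal sketch for the normal model of Example 5, where the parameter of interest is $\sigma$ and the nuisance parameter is the mean. Several choices are assumptions made only for illustration: the auxiliary estimate is a scaled mean absolute deviation, it is not discretized, the half-sample means play the role of $V_{n1}$ and $V_{n2}$, and $W_n = M(U_n) = 2/U_n^2$. All function names are hypothetical.

```python
import math
import random

def score_sigma(x, sigma, mu):
    """Score for sigma in the normal model: -1/sigma + (x - mu)^2 / sigma^3."""
    return -1.0 / sigma + (x - mu) ** 2 / sigma ** 3

def mean(values):
    return sum(values) / len(values)

def mad_scale(xs):
    """Auxiliary estimate of sigma (an assumed choice): for normal data
    E|X - mean| = sigma * sqrt(2/pi), so rescale the mean absolute deviation."""
    center = mean(xs)
    return math.sqrt(math.pi / 2.0) * mean([abs(x - center) for x in xs])

def split_one_step(xs, u):
    """One-step estimate in the spirit of (2): scores of the first half use the
    nuisance estimate from the second half, and vice versa."""
    n = len(xs)
    m = n // 2                      # m_n, the integer part of n/2
    first, second = xs[:m], xs[m:]
    v1 = mean(first)                # V_n1, based on X_1, ..., X_m
    v2 = mean(second)               # V_n2, based on X_{m+1}, ..., X_n
    w = 2.0 / u ** 2                # W_n = M(u), information for sigma
    total = sum(score_sigma(x, u, v2) for x in first)
    total += sum(score_sigma(x, u, v1) for x in second)
    return u + total / (n * w)

if __name__ == "__main__":
    random.seed(0)
    data = [random.gauss(1.0, 2.0) for _ in range(4000)]
    u = mad_scale(data)
    print(u, split_one_step(data, u))
```

Each half of the sample supplies the nuisance estimate used in the scores of the other half, which is the source of the conditional centering exploited in the proof of Theorem 11.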

Typically $\langle\bar U_n\rangle$ will be a discretized version of $\langle U_n\rangle$. But we do not
want the regularity at $\theta$ of $\langle\hat Z_n(\bar U_n)\rangle$ to depend on the way we discretize.
In other words, we want $\langle\hat Z_n(\bar U_n)\rangle$ to be regular at $\theta$ for each discrete
auxiliary estimate $\langle\bar U_n\rangle$ at $\theta$. We now give a necessary and sufficient
condition for this to happen.

11. Theorem. Suppose Assumption 9 holds. Then the following are
equivalent.

(1) $\langle\hat Z_n(\bar U_n)\rangle$ is regular at $\theta$ for every discrete auxiliary estimate
$\langle\bar U_n\rangle$ at $\theta$.

(2) For every local sequence $\langle t_n\rangle$

$n^{1/2} Q(t_n,V_n) \to 0$ in $E_\theta$-prob.

Proof: Note that (1) is equivalent to

(3) $\langle\hat Z_n(t_n)\rangle$ is regular at $\theta$ for every local sequence $\langle t_n\rangle$.

Also recall (see Remark 8) that

(4) $\langle Z_n(t_n,\theta_2)\rangle$ is regular at $\theta$ for every local sequence $\langle t_n\rangle$.

We shall show that for every local sequence $\langle t_n\rangle$

(5) $n^{1/2}\big(\hat Z_n(t_n) - Z_n(t_n,\theta_2) - W_n^{-1}R_n(t_n)\big) \to 0$ in $E_\theta$-prob.,

where $R_n(t) = n^{-1}\big(m_n Q(t,V_{n2}) + (n-m_n)Q(t,V_{n1})\big)$.
Combining the above shows that (1) is equivalent to

(6) $n^{1/2} W_n^{-1} R_n(t_n) \to 0$ in $E_\theta$-prob. for every local sequence $\langle t_n\rangle$.

By the consistency of $\langle W_n\rangle$ and the independence of $V_{n1}$ and
$V_{n2}$, (6) is equivalent to (2). Thus we are left to verify (5). Again
by the consistency of $\langle W_n\rangle$, (5) is equivalent to

(7) $n^{-1/2}\Big(\sum_{j=1}^{m_n}T_{n2}(X_j,t_n) + \sum_{j=m_n+1}^{n}T_{n1}(X_j,t_n)\Big) \to 0$ in $E_\theta$-prob.,

where $T_{ni}(\cdot,t) = \ell(\cdot,t,V_{ni}) - \ell(\cdot,t,\theta_2) - Q(t,V_{ni})$ for $i = 1,2$ and
$t \in \Theta_1$.

Now fix a local sequence $\langle t_n\rangle$. Abbreviate $E_{n\langle t_n,\theta_2\rangle}$ by $\bar E_n$
and note that $\langle\bar E_n\rangle$ and $\langle E_{n\theta}\rangle$ are mutually contiguous. Next observe
that for $j = 1,2,\dots,m_n$

(8) $\bar E_n\big(T_{n2}(X_j,t_n) \mid X_{m_n+1},\dots,X_n\big) = 0$ a.e. $\bar E_n$

and thus

(9) $\bar E_n\Big(\Big\|n^{-1/2}\sum_{j=1}^{m_n}T_{n2}(X_j,t_n)\Big\|^2 \,\Big|\, X_{m_n+1},\dots,X_n\Big)$
    $= n^{-1}\sum_{j=1}^{m_n}\bar E_n\big(\|T_{n2}(X_j,t_n)\|^2 \mid X_{m_n+1},\dots,X_n\big)$ a.e. $\bar E_n$
    $\le \int\|\ell(\cdot,t_n,V_{n2}) - \ell(\cdot,t_n,\theta_2)\|^2 f(\cdot,t_n,\theta_2)\,dJ$ a.e. $\bar E_n$

by a property of conditional variances. Using the mutual contiguity of
$\langle\bar E_n\rangle$ and $\langle E_{n\theta}\rangle$ and (10.1) we find that the bound in (9) converges to zero in
$\langle\bar E_n\rangle$-prob. This shows that

(10) $n^{-1/2}\sum_{j=1}^{m_n}T_{n2}(X_j,t_n) \to 0$ in $\langle\bar E_n\rangle$-prob.

In the same way we obtain that

(11) $n^{-1/2}\sum_{j=m_n+1}^{n}T_{n1}(X_j,t_n) \to 0$ in $\langle\bar E_n\rangle$-prob.

Using the mutual contiguity of $\langle\bar E_n\rangle$ and $\langle E_{n\theta}\rangle$ we conclude
from (10) and (11) that (7) holds. This completes the proof.
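
The reduction of (5) to (7) in the proof rests on an algebraic identity that is not written out above. The following LaTeX sketch spells it out under the definitions of the two estimates, of $T_{ni}$ and of $R_n$ used in this section; it is offered as a reading aid rather than as part of the original argument.

```latex
\begin{align*}
\hat Z_n(t) - Z_n(t,\theta_2)
 &= (nW_n)^{-1}\Big(\sum_{j=1}^{m_n}\big(\ell(X_j,t,V_{n2})-\ell(X_j,t,\theta_2)\big)
    +\sum_{j=m_n+1}^{n}\big(\ell(X_j,t,V_{n1})-\ell(X_j,t,\theta_2)\big)\Big)\\
 &= (nW_n)^{-1}\Big(\sum_{j=1}^{m_n}T_{n2}(X_j,t)+\sum_{j=m_n+1}^{n}T_{n1}(X_j,t)\Big)
    +(nW_n)^{-1}\big(m_nQ(t,V_{n2})+(n-m_n)Q(t,V_{n1})\big)\\
 &= (nW_n)^{-1}\Big(\sum_{j=1}^{m_n}T_{n2}(X_j,t)+\sum_{j=m_n+1}^{n}T_{n1}(X_j,t)\Big)
    +W_n^{-1}R_n(t).
\end{align*}
```

Multiplying by $n^{1/2}$ and using that $W_n$ stays close to the nonsingular matrix $M(\theta)$ shows that the left side of (5) tends to zero exactly when the normalized sum of the centered terms $T_{ni}$ does.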

12. Remarks. Note that (11.2) is trivially satisfied if (S*) holds
and in this case consistency of $\langle V_n\rangle$ guarantees the existence of an
estimate regular at $\theta$. This is Bickel's (1982) result. But if (S*)
fails, consistency of $\langle V_n\rangle$ alone does not suffice to construct an estimate
regular at $\theta$. In this sense LAM-adaptive estimation is more difficult
in cases when (S*) fails.

Assume for the moment that $\Theta_2$ is an open subset of $\mathbb{R}^p$ for
some positive integer $p$ and that the whole problem $\mathbf{E}$ satisfies condition
LAN$\langle\delta, n\tilde M(\delta), \tilde Y_{n\delta}\rangle$ for each $\delta \in \Theta$. Also assume regularity conditions
which allow the Taylor expansion

$Q(t,v) = Q(t,\theta_2) - \tilde M_{12}(\theta)(v-\theta_2) + o(\|t-\theta_1\|) + O(\|v-\theta_2\|^2)$

as $\langle t,v\rangle \to \theta$. In this case the necessary condition for LAM-adaptive
estimation $\tilde M_{12}(\theta) = 0$ implies that

(1) $Q(t,v) = o(\|t-\theta_1\|) + O(\|v-\theta_2\|^2)$.

Note that $Q(t,\theta_2) = 0$. Thus (1) shows that (11.2) is satisfied if

(2) $n^{1/4}(V_n - \theta_2) \to 0$ in $E_\theta$-prob.

Obviously (2) is weaker than

(3) $n^{1/2}(V_n - \theta_2)$ is bounded in $E_\theta$-prob.,

a condition which together with the existence of an auxiliary estimate
at $\theta$ and of a consistent estimate of the information matrix at $\theta$
suffices to construct LAM-adaptive estimates if Stein's condition holds
(cf. Theorem 6.15 and Theorem 7.10 in Fabian and Hannan (1982)).

We remind the reader that (1) is satisfied in Example 5.
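
The threshold rate in (2) can be illustrated with the closed form $Q(\sigma,\mu) = \sigma^{-3}(\mu-\theta_2)^2$ of Example 5. The sketch below is an illustration under an artificial assumption: the nuisance estimate is taken to deviate from $\theta_2$ by exactly $c\,n^{-a}$ (a deterministic stand-in for a stochastic rate), so that $n^{1/2}Q = c^2 n^{1/2-2a}/\sigma^3$. The function name `scaled_q` and its parameters are hypothetical.

```python
import math

def scaled_q(n, rate, sigma=1.0, c=1.0):
    """n^(1/2) * Q(sigma, theta2 + c * n^(-rate)) for the normal example,
    where Q(sigma, mu) = (mu - theta2)^2 / sigma^3."""
    deviation = c * n ** (-rate)
    return math.sqrt(n) * deviation ** 2 / sigma ** 3

if __name__ == "__main__":
    for rate in (0.2, 0.25, 0.3, 0.5):
        values = [scaled_q(n, rate) for n in (10 ** 2, 10 ** 4, 10 ** 6)]
        print(rate, values)
```

For $a > 1/4$ the scaled values decrease to zero, for $a = 1/4$ they stay constant, and for $a < 1/4$ they grow, which matches the role of $n^{1/4}(V_n - \theta_2) \to 0$ in (2).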

3. AN EXAMPLE

1. Description of the example

We consider the regression model

(1) $Y_j = \theta_1 + \theta_2(T_j) + e_j$, $j = 1,2,\dots$

where $T_1,T_2,\dots$ are i.i.d. random variables with uniform distribution
on $[0,1]$, $e_1,e_2,\dots$ are i.i.d. random variables with Lebesgue density
$g$ and independent of $T_1,T_2,\dots$, $\theta_1$ is a real number and $\theta_2$ is a
real valued absolutely continuous function on $[0,1]$ with square integrable
derivative $\theta_2'$ and $\int_0^1\theta_2(t)\,dt = 0$. We suppose that the density $g$
satisfies the following conditions

(2) $\int x\,g(x)\,dx = 0$

(3) $\int x^2 g(x)\,dx = \sigma^2 < \infty$

(4) $g$ is absolutely continuous with derivative $g'$ and has finite
Fisher information

$I(g) = \int\frac{(g'(x))^2}{g(x)}\,dx$

(5) with $L = -\frac{g'}{g}\chi\{g > 0\}$ we have

(5a) $\int_0^1\int_{-\infty}^{\infty}\big(L(x+v(t)) - L(x)\big)^2 g(x)\,dx\,dt \to 0$

and

(5b) $\int_0^1\int_{-\infty}^{\infty}L(x-v(t))\,g(x)\,dx\,dt = O\Big(\int_0^1 v^2(t)\,dt\Big)$

for $\int_0^1 v(t)\,dt = 0$ and $\int_0^1 v^2(t)\,dt \to 0$.

Note that (5) is satisfied if $L$ is twice continuously differentiable
with bounded derivatives $L'$ and $L''$.

2. Remark. The above regression model satisfies Assumption 2.1 with
$\Theta_1 = \mathbb{R}$, $\Theta_2$ the family of all real valued functions $v$ on $[0,1]$ which
are absolutely continuous with square integrable derivative and satisfy
$\int_0^1 v(t)\,dt = 0$, $X_j = \langle Y_j,T_j\rangle$, $J$ the integral induced by the Lebesgue
measure on the Borel field of $\mathbb{R}\times[0,1]$ and $f_\delta$ and $h_\delta$ defined by
$f(x,t,v) = g(x_1-t-v(x_2))$ and $h(x,t,v) = \frac12 L(x_1-t-v(x_2))\,g^{1/2}(x_1-t-v(x_2))$
with $x = \langle x_1,x_2\rangle$ in $\mathbb{R}\times[0,1]$ and $\delta = \langle t,v\rangle$ in $\Theta$. The
differentiability in $L_2(J)$ follows from (1.4) and Lemma A.3 in Hájek (1972),
while the required continuity of the derivative is a consequence of
Theorem 9.5 in Rudin (1974). Note also that $\tilde\Theta = \Theta$ by the translation
invariance of the Lebesgue measure.

Furthermore, if we endow $\Theta_2$ with the topology induced by the norm
$\|\cdot\|_2$ defined by $\|v\|_2^2 = \int_0^1 v^2(t)\,dt$ for $v$ in $\Theta_2$, then (2.9.1) follows
from (1.5). Also observe that the sample average $\langle n^{-1}\sum_{j=1}^{n}Y_j\rangle$ is an
auxiliary estimate at $\theta$ and that the sequence $\langle I(g),I(g),\dots\rangle$ is a
consistent estimate of the information matrix at $\theta$.

Next it is easily checked that $Q$ satisfies

$Q(t,v) = \int_0^1\int_{-\infty}^{\infty}L\big(x-(v(u)-\theta_2(u))\big)\,g(x)\,dx\,du$.

This and (1.5b) show that (2.11.2) is satisfied if $\langle V_n\rangle$ satisfies

(1) $n^{1/2}\|V_n - \theta_2\|_2^2 \to 0$ in $E_\theta$-prob.

We shall now construct such an estimate.

3. Construction of the estimate $\langle V_n\rangle$.

We let $\langle a_n\rangle$ denote a sequence of positive integers and set
$b_n = a_n^{-1}$. For each $n = 1,2,\dots$ we partition the unit interval $[0,1]$
in $a_n$ intervals $I_{ni}$, $i = 1,\dots,a_n$, of equal length $b_n$. We let $m_{ni}$
denote the midpoint of $I_{ni}$ and $\chi_{ni}$ the indicator of $I_{ni}$. Furthermore
we assume that the intervals $I_{ni}$ are numbered in such a way that
$m_{nj} < m_{nk}$ for $1 \le j < k \le a_n$. Next we set

(1) $U_n = n^{-1}\sum_{j=1}^{n}Y_j$

and

(2) $Y_{ni} = (nb_n)^{-1}\sum_{j=1}^{n}Y_j\chi_{ni}(T_j)$, $i = 1,\dots,a_n$

and define $V_n$ by

(3) $V_n(t) = Y_{n1} - U_n$ for $0 \le t \le m_{n1}$,
    $V_n(t) = Y_{ni} - U_n + \frac{t-m_{ni}}{b_n}(Y_{ni+1} - Y_{ni})$ for $m_{ni} \le t < m_{ni+1}$,
    $V_n(t) = Y_{na_n} - U_n$ for $m_{na_n} \le t \le 1$.

It is easily verified that $V_n$ is a $\Theta_2$-valued random vector, e.g.

(4) $\int_0^1 V_n(t)\,dt = \sum_{i=1}^{a_n}b_nY_{ni} - U_n = 0$.
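
The construction (1) to (3) translates directly into code. The sketch below is one possible implementation under illustrative assumptions: the data are passed as plain lists `ys` and `ts`, the function name `build_vn` is hypothetical, and an observation with $T_j = 1$ is assigned to the last interval.

```python
import math

def build_vn(ys, ts, a_n):
    """Return the piecewise linear function V_n of (3) built from responses ys,
    design points ts in [0, 1] and a_n intervals of length b_n = 1 / a_n."""
    n = len(ys)
    b_n = 1.0 / a_n
    u_n = sum(ys) / n                               # (1): overall sample average
    sums = [0.0] * a_n
    for y, t in zip(ys, ts):
        i = min(int(t * a_n), a_n - 1)              # index of the interval containing t
        sums[i] += y
    y_loc = [s / (n * b_n) for s in sums]           # (2): local averages Y_ni
    mids = [(i + 0.5) * b_n for i in range(a_n)]    # midpoints m_ni

    def v_n(t):
        if t <= mids[0]:
            return y_loc[0] - u_n
        if t >= mids[-1]:
            return y_loc[-1] - u_n
        i = min(int((t - mids[0]) / b_n), a_n - 2)  # segment between m_ni and m_ni+1
        frac = (t - mids[i]) / b_n
        return y_loc[i] - u_n + frac * (y_loc[i + 1] - y_loc[i])

    return v_n

if __name__ == "__main__":
    n = 200
    ts = [(j + 0.5) / n for j in range(n)]
    ys = [1.0 + math.sin(2.0 * math.pi * t) for t in ts]
    v = build_vn(ys, ts, 10)
    cells = 2000
    print(sum(v((k + 0.5) / cells) for k in range(cells)) / cells)
```

The printed value approximates $\int_0^1 V_n(t)\,dt$ and is zero up to rounding, in line with (4): the local averages, weighted by $b_n$, add up to $U_n$ whatever the data are.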

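4. Lemma. If the sequence $\langle a_n\rangle$ is chosen such that

(1) $nb_n^4 \to 0$ and $nb_n^2 \to \infty$

then $n^{1/2}E_\theta\|V_n - \theta_2\|_2^2 \to 0$.

Proof: For $i = 1,2,\dots,a_n$ set

(2) $C_{ni} = a_n\int_0^1\chi_{ni}(u)\theta_2(u)\,du$

and note that

(3) $E_\theta Y_{ni} = \theta_1 + C_{ni}$.

Easy calculations show that

(4) $E_\theta(Y_{ni} - \theta_1 - C_{ni})^2 \le 3(nb_n)^{-1}(\theta_1^2 + \|\theta_2\|_\infty^2 + \sigma^2)$

and

(5) $E_\theta(U_n - \theta_1)^2 \le n^{-1}(\sigma^2 + \|\theta_2\|_2^2)$.

Next note that by the Schwarz inequality for $0 \le u_1 < u_2 \le 1$

(6) $(\theta_2(u_2) - \theta_2(u_1))^2 = \Big(\int_{u_1}^{u_2}\theta_2'(x)\,dx\Big)^2 \le (u_2 - u_1)\int_{u_1}^{u_2}(\theta_2'(x))^2\,dx$.

Using this and the Schwarz inequality we obtain

(7) $\int(\theta_2(t) - C_{ni})^2\chi_{ni}(t)\,dt = \int\Big(a_n\int(\theta_2(t) - \theta_2(u))\chi_{ni}(u)\,du\Big)^2\chi_{ni}(t)\,dt$
    $\le a_n\int\int(\theta_2(t) - \theta_2(u))^2\chi_{ni}(u)\,du\,\chi_{ni}(t)\,dt$
    $\le b_n^2\int\chi_{ni}(t)(\theta_2'(t))^2\,dt$

and

(8) $(C_{ni+1} - C_{ni})^2 \le a_n^2\int_0^1\int_0^1(\theta_2(t) - \theta_2(u))^2\chi_{ni+1}(u)\chi_{ni}(t)\,du\,dt$
    $\le 2b_n\int(\chi_{ni+1} + \chi_{ni})(\theta_2')^2$.

Combining (4) and (8) shows that for some constant $C$

(9) $\sum_{i=1}^{a_n-1}E_\theta(Y_{ni+1} - Y_{ni})^2 \le C(n^{-1}b_n^{-2} + b_n)$.

Next define

(10) $\bar V_n = \sum_{i=1}^{a_n}Y_{ni}\chi_{ni} - U_n$

and

(11) $\bar\theta_n = \sum_{i=1}^{a_n}C_{ni}\chi_{ni}$.

It follows from (1) and (9) that

(12) $n^{1/2}E_\theta\|V_n - \bar V_n\|_2^2 \le n^{1/2}\sum_{i=1}^{a_n-1}E_\theta(Y_{ni+1} - Y_{ni})^2\,b_n \to 0$

and from (1), (4) and (5) that

(13) $n^{1/2}E_\theta\|\bar V_n - \bar\theta_n\|_2^2 \to 0$.

Furthermore by (1) and (7)

(14) $n^{1/2}\|\bar\theta_n - \theta_2\|_2^2 \to 0$.

Combining (12) to (14) gives the desired result.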
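
Condition (1) of the lemma leaves room for many choices of $\langle a_n\rangle$. As an illustration only (the lemma does not single out this choice), taking $a_n$ close to $n^{1/3}$ gives $nb_n^4 \approx n^{-1/3} \to 0$ and $nb_n^2 \approx n^{1/3} \to \infty$. The helper names below are hypothetical.

```python
def cube_root_bins(n):
    """Illustrative choice a_n = round(n^(1/3)), at least 1 (an assumption)."""
    return max(1, round(n ** (1.0 / 3.0)))

def lemma_terms(n, a_n):
    """Return (n * b_n^4, n * b_n^2) with b_n = 1 / a_n; condition (1) of the
    lemma asks the first to tend to 0 and the second to tend to infinity."""
    b_n = 1.0 / a_n
    return n * b_n ** 4, n * b_n ** 2

if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 6, 10 ** 9):
        a_n = cube_root_bins(n)
        print(n, a_n, lemma_terms(n, a_n))
```

Along this sequence the first term shrinks while the second grows, so both requirements of (1) are met.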

5. Remark. Remark 2 and Lemma 4 show that Assumption 2.9 and condition
(2.11.2) are satisfied. Therefore it follows from Theorem 2.11 that

(1) $Z_n = \bar U_n + (nI(g))^{-1}\Big(\sum_{j=1}^{m_n}L\big(Y_j - \bar U_n - V_{n2}(T_j)\big) + \sum_{j=m_n+1}^{n}L\big(Y_j - \bar U_n - V_{n1}(T_j)\big)\Big)$

is a regular estimate, where $\langle\bar U_n\rangle$ is a discretized version of $\langle U_n\rangle$
and $m_n$, $V_{n1}$ and $V_{n2}$ are as described in Remark 2.10.

Next observe that condition (S*) does not hold for a proper choice
of $g$, e.g. with $g(x) = \frac12 e^{-|x|}$ we obtain for $v = \theta_2 + r$ in $\Theta_2$
by easy calculations

(2) $Q(t,v) = \int_0^1\int_{-\infty}^{\infty}\mathrm{sign}(x - r(u))\,g(x)\,dx\,du$
    $= \int_0^1\mathrm{sign}(r(u))\big(e^{-|r(u)|} - 1\big)\,du$
    $= \int_0^1\mathrm{sign}(r(u))\big(e^{-|r(u)|} - 1 + |r(u)|\big)\,du$
    $= O(\|r\|_2^2)$

and from (2) it is easily seen that $Q$ is not identically zero. This
shows that also in an infinite-dimensional nuisance parameter space $\Theta_2$
condition S* is not necessary for the existence of a regular estimate.
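
That $Q$ in (2) is not identically zero can be seen numerically. The sketch below evaluates $\int_0^1 \mathrm{sign}(r(u))(e^{-|r(u)|} - 1)\,du$ by the midpoint rule for the perturbation $r(u) = u^2 - 1/3$. This $r$ is an assumed example chosen because it is smooth, integrates to zero over $[0,1]$ and is not symmetric about zero. The function names are hypothetical.

```python
import math

def q_laplace(r, steps=20000):
    """Midpoint-rule approximation of int_0^1 sign(r(u)) (exp(-|r(u)|) - 1) du,
    the value of Q for the double exponential density g(x) = exp(-|x|) / 2."""
    total = 0.0
    for i in range(steps):
        value = r((i + 0.5) / steps)
        if value != 0.0:
            total += math.copysign(1.0, value) * (math.exp(-abs(value)) - 1.0)
    return total / steps

def r_example(u):
    """Assumed perturbation r = v - theta_2 with integral zero over [0, 1]."""
    return u * u - 1.0 / 3.0

if __name__ == "__main__":
    print(q_laplace(r_example))
    print(q_laplace(lambda u: 0.0))
```

The first value is strictly positive while the second, for $r = 0$, is zero, so condition (S*) fails for this density even though $Q$ vanishes at $v = \theta_2$.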

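6. Remark. The estimate $\langle Z_n\rangle$ defined by (5.1) is LAM-adaptive $(A_\theta,\theta)$
where $A_\theta$ is the class of all LAN$\langle n\tilde M(\theta), \tilde Y_{n\theta}\rangle$ subproblems with
$(\tilde M(\theta))_{12} = 0$ (see Remark 2.3). We now describe a class of subproblems
which belong to $A_\theta$:

Let $\Gamma$ denote the family of all one to one maps $\gamma$ from an open
neighborhood $S$ around 0 in $\mathbb{R}^p$ into $\Theta_2$ for some positive integer
$p$ satisfying $\gamma(0) = \theta_2$ and

(1) $\|\gamma(a) - \gamma(0) - a^T\psi\|_2 = o(\|a\|)$ as $a \to 0$

for some vector $\psi = \langle\psi_1,\dots,\psi_p\rangle$ such that $\psi_i$ is in $\Theta_2$ for
$i = 1,\dots,p$ and $N = \int_0^1\psi(u)\psi^T(u)\,du$ is nonsingular.

For $\gamma$ in $\Gamma$ we define the subproblem $\langle\Theta_0,\varphi\rangle$ by $\Theta_0 = \Theta_1\times\gamma[S]$
and $\varphi(t,\gamma(a)) = \langle t,a\rangle$ for $\langle t,a\rangle \in \mathbb{R}\times S$ (for the definition of
subproblem see Definition 7.3 in Fabian and Hannan (1982)). This subproblem
satisfies condition LAN$\langle n\tilde M, \tilde Y_n\rangle$ with

(2) $\tilde M = I(g)\begin{pmatrix} 1 & 0 \\ 0 & N \end{pmatrix}$

and

(3) $\tilde Y_n = (n\tilde M)^{-1/2}\sum_{j=1}^{n}L\big(Y_j - \theta_1 - \theta_2(T_j)\big)\langle 1,\psi(T_j)\rangle$

and hence satisfies $\tilde M_{12} = 0$.
The above follows from Theorem 4.8 in Fabian and Hannan (1980),
since the map $\langle t,a\rangle \in \mathbb{R}\times S \to f^{1/2}(\cdot,t,\gamma(a))$ is differentiable in
$L_2(J)$ at $\langle\theta_1,0\rangle$ with derivative $\Delta$ given by

(4) $\Delta(x) = h(x,\theta_1,\theta_2)\langle 1,\psi(x_2)\rangle$ for $x = \langle x_1,x_2\rangle$ in $\mathbb{R}\times[0,1]$.

This is easily verified using (1.4), the arguments in the proof of
Lemma A.3 in Hájek (1972), Theorem 9.5 in Rudin (1974) and the properties
of $\gamma$.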

4. AN AUXILIARY RESULT

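1. Remark and Notation. In this section we shall prove that the estimate
$\langle Z_n(U_n,\theta_2)\rangle$ as given in (2.8.1) is regular at $\theta$. To simplify notation
we abbreviate $M(\theta)$ by $M$ and $Y_{n\theta}$ by $Y_n$. Also we shall use
$E_t$, $E_{nt}$, $M(t)$ and $Y_{nt}$ short for $E_{\langle t,\theta_2\rangle}$, $E_{n\langle t,\theta_2\rangle}$, $M(t,\theta_2)$ and
$Y_{n\langle t,\theta_2\rangle}$, with $t \in \Theta_1$.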

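2. Lemma. Suppose Assumption 2.1 holds, $\langle t_n\rangle$ and $\langle u_n\rangle$ are local
sequences and $g_n \in dE_{nu_n}/dE_{nt_n}$. Then

$\log g_n - w_n^T\tilde Y_{nt_n} + \frac12\|w_n\|^2 \to 0$ in $E_{\theta_1}$-prob.

with $w_n = (nM)^{1/2}(u_n - t_n)$ and $\tilde Y_{nt_n} = (nM)^{-1/2}\sum_{j=1}^{n}\ell(X_j,t_n,\theta_2)$.

Proof: Let $s_n = t_n - \theta_1$ and $T_n = \{t \in \Theta_1: t + s_n \in \Theta_1\}$ and set for
$t \in T_n$, $H_{nt} = E_{n,t+s_n}$, $h_{nt} = f^{1/2}(\cdot,t+s_n,\theta_2)$ and $\dot h_{nt} = h(\cdot,t+s_n,\theta_2)$.
Using the fact that the map $t \in \Theta_1 \to f^{1/2}(\cdot,t,\theta_2)$ is continuously
differentiable at $\theta_1$ in $L_2(J)$ we obtain for every local sequence $\langle\delta_n\rangle$
with $\delta_n$ in $T_n$ and for every $\varepsilon > 0$

(1) $n\int\big(h_{n\delta_n} - h_{n\theta_1} - (\delta_n - \theta_1)^T\dot h_{n\theta_1}\big)^2\,dJ \to 0$,

(2) $\int\|\dot h_{n\theta_1}\|^2\chi\{\|\dot h_{n\theta_1}\| > n^{1/2}\varepsilon h_{n\theta_1}\}\,dJ \to 0$

and

(3) $M(t_n) \to M$.

From (3) we obtain that $M(t_n)$ is invertible for all $n \ge n_0$, for some
integer $n_0$. We now obtain by Theorem 4.5 in Fabian and Hannan (1980)
that the family $\langle H_{nt}, t \in T_n, n \ge n_0\rangle$ satisfies condition
LAN$\langle nM(t_n), Y_{nt_n}\rangle$. This shows that

(4) $\log g_n - \tilde w_n^T Y_{nt_n} + \frac12\|\tilde w_n\|^2 \to 0$ in $\langle H_{n\theta_1}\rangle$-prob.,

with $\tilde w_n = (nM(t_n))^{1/2}(u_n - t_n)$. Now note that

(5) $w_n^T\tilde Y_{nt_n} = \tilde w_n^T Y_{nt_n}$

and that by (3)

(6) $\|\tilde w_n\|^2 - \|w_n\|^2 \to 0$.

The desired result follows now from (4), (5) and (6) and the mutual
contiguity of $\langle H_{n\theta_1}\rangle$ and $\langle E_{n\theta_1}\rangle$.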

3. Proposition. Suppose Assumption 2.1 holds, $\langle U_n\rangle$ is a discrete
auxiliary estimate at $\theta$ and $\langle W_n\rangle$ is a consistent estimate of the
information matrix at $\theta$. Then $\langle Z_n\rangle$ with $Z_n = Z_n(U_n,\theta_2)$ as given in (2.8.1)
is regular at $\theta$.

Proof: We have to show that

(1) $(nM)^{1/2}(Z_n - \theta_1) - Y_n \to 0$ in $E_{\theta_1}$-prob.

By the discreteness of $\langle U_n\rangle$ and the consistency of $\langle W_n\rangle$ it suffices
to show that

(2) $(nM)^{1/2}(t_n - \theta_1) + \tilde Y_{nt_n} - Y_n \to 0$ in $E_{\theta_1}$-prob.,

for every local sequence $\langle t_n\rangle$.

Let $\langle t_n\rangle$ be a local sequence and set $u_n = t_n + (nM)^{-1/2}u$ for
$u \in \mathbb{R}^m$. With $g_n \in dE_{nu_n}/dE_{nt_n}$ we obtain from Lemma 4.3 in Fabian and
Hannan (1982) and the mutual contiguity of $\langle E_{nt_n}\rangle$ and $\langle E_{n\theta_1}\rangle$

(3) $\log g_n - u^T(Y_n - \tilde t_n) + \frac12\|u\|^2 \to 0$ in $E_{\theta_1}$-prob.,

with $\tilde t_n = (nM)^{1/2}(t_n - \theta_1)$. On the other hand Lemma 2 shows that

(4) $\log g_n - u^T\tilde Y_{nt_n} + \frac12\|u\|^2 \to 0$ in $E_{\theta_1}$-prob.

Combining (3) and (4) shows that

(5) $u^T(\tilde t_n + \tilde Y_{nt_n} - Y_n) \to 0$ in $E_{\theta_1}$-prob.

for every $u \in \mathbb{R}^m$. From this (2) follows which concludes the proof.

4. Remark. Bickel (1982) constructs an estimate as in Proposition 3
under stronger conditions than ours. It is easily checked that his
regularity conditions R(i), R(ii) and UR(iii) imply continuous
differentiability at $\theta_1$ in $L_2(J)$. Also note that we can choose $W_n$ to
be $M(U_n)$ if the latter is nonsingular and $I$ otherwise. However,
estimates which are regular at $\theta$ and do require the knowledge of $\theta_2$
can be constructed under weaker conditions than ours; see Theorem 6.15
in Fabian and Hannan (1982). The estimates constructed there are based
on difference quotients rather than on the "derivative" $\ell$. Since the use
of $\ell$ facilitates our treatment we have chosen to work with Assumption 2.1.

BIBLIOGRAPHY

Bickel, P.J. (1982). On adaptive estimation. Ann. Statist. 10, 647-671.

Fabian, V. and Hannan, J. (1980). Sufficient conditions for local
asymptotic normality. RM-403. Department of Statistics and
Probability, Michigan State University.

Fabian, V. and Hannan, J. (1982). On estimation and adaptive estimation
for locally asymptotically normal families. Z. Wahrscheinlichkeitstheorie
verw. Gebiete 59, 459-478.

Fabian, V. and Hannan, J. (1983). Introduction to Probability and
Mathematical Statistics. (Forthcoming book)

Hájek, J. (1972). Local asymptotic minimax and admissibility in
estimation. Proc. Sixth Berkeley Sympos. Math. Statist.
Probab. 1, 175-194. Univ. Calif. Press.

Rudin, W. (1974). Real and complex analysis (2nd ed.). McGraw Hill,
New York.

Stein, C. (1956). Efficient nonparametric testing and estimation.
Proc. Third Berkeley Sympos. Math. Statist. and Probab. 1,
187-195. Univ. Calif. Press.