\[
\int_{|\lambda|>R} \left(\frac{f_1(\lambda)-f_0(\lambda)}{f_0(\lambda)}\right)^2 d\lambda < \infty
\]
for any \(R\).
In particular, if the spectral density \(f_0\) is such that
\[
f_0(\lambda) \asymp |\varphi(\lambda)|^2 \qquad (2.2.13)
\]
for sufficiently large \(|\lambda|\), where \(\varphi(\lambda)\) is the Fourier transform of some square integrable finite function, then there is a sufficient condition which is relatively easy to verify in practice.
Theorem 2.2.5. For spectral densities of the type given by (2.2.13), a necessary and sufficient condition for the equivalence of the Gaussian measures \(P_0\) and \(P_1\) on the paths of \(\{X(t), t \in D\}\), where \(D\) is any finite interval, is that the function
\[
h(\lambda) = \frac{f_0(\lambda) - f_1(\lambda)}{f_0(\lambda)}
\]
satisfies the condition
\[
\int_{|\lambda|>R} |h(\lambda)|^2 \, d\lambda < \infty \qquad (2.2.14)
\]
for any \(R < \infty\).
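Condition (2.2.14) can be checked numerically for concrete spectral densities. The sketch below is our own illustration, not part of the thesis; the Matérn-type densities and all parameter values are assumptions. It integrates \(|h(\lambda)|^2\) over \(\lambda > R\) for two densities in \(d = 1\) that share the same high-frequency tail:

```python
import numpy as np

# Hypothetical 1-d Matérn-type spectral densities (our illustration, not the
# thesis'): spec(lam) = s2 * th^(2 nu) / (th^2 + lam^2)^(nu + 1/2), nu = 1/2.
def spec(lam, s2, th, nu=0.5):
    return s2 * th ** (2 * nu) / (th ** 2 + lam ** 2) ** (nu + 0.5)

f0 = lambda lam: spec(lam, s2=1.0, th=1.0)   # "true" density
f1 = lambda lam: spec(lam, s2=2.0, th=0.5)   # same product s2 * th^(2 nu)

R = 5.0
lam = np.linspace(R, 1000.0, 200_000)        # grid for lambda > R
h = (f0(lam) - f1(lam)) / f0(lam)            # the function h in (2.2.14)
dx = lam[1] - lam[0]
integral = 2.0 * float(np.sum(h ** 2) * dx)  # factor 2 accounts for both tails

print(integral)  # finite, so (2.2.14) holds for this pair of densities
```

Because the two densities share the product \(\sigma^2\theta^{2\nu}\), here \(h(\lambda)\) decays like \(\lambda^{-2}\) and the integral is finite; a pair with mismatched tails would make it diverge.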
Another method of finding necessary and sufficient conditions for the equivalence of Gaussian measures is to use the reproducing kernel Hilbert space. This approach imposes no constraints such as stationarity or isotropy on the underlying process, so it could be a potential tool for analyzing nonstationary processes. We will review some important results given by Chatterji and Mandrekar (1978); it is worth noting that those results apply in any dimension:
Definition 2.2.6. Let \(D\) be any set and \(K\) be a real-valued function on \(D \times D\). Then \(K\) is called a covariance on \(D\) if (a) \(K(s,t) = K(t,s)\) for any \(s,t \in D\) and (b) \(\sum_{s,t \in I} a_s a_t K(s,t) \ge 0\) for all finite subsets \(I\) of \(D\) and \(\{a_s, s \in I\} \subset \mathbb{R}\).
Definition 2.2.7. Let \(D\) be any set. A class \(\mathcal{K}(K)\) of functions on \(D\) forming a Hilbert space is called the reproducing kernel Hilbert space (for short, rkhs) for a covariance \(K\) if \(K(\cdot, t) \in \mathcal{K}(K)\) for each \(t \in D\) and \(\langle f, K(\cdot,t)\rangle = f(t)\) for each \(f \in \mathcal{K}(K)\) and \(t \in D\).
The following theorem gives the existence and uniqueness of the rkhs.
Theorem 2.2.8. [Aronszajn (1950)] Let \(D\) be any set and \(K\) be a covariance on \(D \times D\). Then there exists a unique Hilbert space \(\mathcal{K}(K)\) of functions on \(D\) satisfying
(a) \(K(\cdot, t) \in \mathcal{K}(K)\) for each \(t \in D\);
(b) \(\langle f, K(\cdot,t)\rangle = f(t)\) for each \(f \in \mathcal{K}(K)\) and \(t \in D\).
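On a finite index set, the content of Theorem 2.2.8 reduces to linear algebra, which makes the reproducing property easy to verify directly. The following sketch is our own illustration (the Gaussian kernel and the set \(D\) are arbitrary choices): functions in the rkhs are of the form \(f = \sum_s \alpha_s K(\cdot, s)\), with inner product \(\langle K\alpha, K\beta\rangle = \alpha' K \beta\).

```python
import numpy as np

# Finite-D illustration of the reproducing property (our toy example).
rng = np.random.default_rng(0)
D = np.linspace(0.0, 1.0, 8)                       # a finite index set
K = np.exp(-(D[:, None] - D[None, :]) ** 2)        # a valid covariance on D x D

alpha = rng.standard_normal(len(D))
f = K @ alpha                                      # f = sum_s alpha_s K(., s)

t = 3                                              # index of a point t in D
# <f, K(., t)> under the rkhs inner product <K a, K b> = a' K b,
# where K(., t) is the t-th column of K:
inner = alpha @ K[:, t]
print(inner, f[t])                                 # equal: <f, K(., t)> = f(t)
```

Since \(K\) is symmetric, \(\alpha' K e_t = (K\alpha)_t\), so the reproducing identity holds exactly.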
To give the necessary and sufficient condition for the equivalence of two Gaussian measures, we need to introduce a "product" covariance function, defined as \(K_0 \otimes K_1[(t_1,t_2),(s_1,s_2)] = K_0(t_1,s_1) K_1(t_2,s_2)\), where \(K_0\) is a covariance function on \(D_1\) and \(K_1\) is a covariance function on \(D_2\). We will present the following two theorems in a more general setting. Let \(D\) be an index set. Let \(\{X(t), t \in D\}\) be a family of real random variables on a measurable space \((\Omega, \mathcal{A})\) and \(\mathcal{F} = \sigma\{X(t), t \in D\}\). Suppose \(P_0, P_1\) are two measures on \(\mathcal{F}\) such that \(\{X(t), t \in D\}\) is a Gaussian process on \((\Omega, \mathcal{F}, P_i)\), \(i = 0,1\). Denote \(K_0(t,s) = E_0(X(t) - m_0(t))(X(s) - m_0(s))\) and \(K_1(t,s) = E_1(X(t) - m_1(t))(X(s) - m_1(s))\), where \(E_0(X(t)) = m_0(t)\) and \(E_1(X(t)) = m_1(t)\).
Theorem 2.2.9. The following are equivalent.
(a) \(P_0 \equiv P_1\) on \(\mathcal{F}\).
(b) \(K_0 - K_1 \in \mathcal{K}(K_1 \otimes K_0)\) and \(m_1 - m_0 \in \mathcal{K}(K_0)\).
(c) (i) There exist \(\gamma_1, \gamma_2\) \((0 < \gamma_1 \le \gamma_2 < \infty)\) such that \(\gamma_1 K_0 \ll K_1 \ll \gamma_2 K_0\); (ii) \(K_0 - K_1 \in \mathcal{K}(K_0 \otimes K_0)\); and (iii) \(m_0 - m_1 \in \mathcal{K}(K_0)\).
Here \(\gamma_1 K_0 \ll K_1\) means that \(K_1 - \gamma_1 K_0\) is a covariance.
By virtue of the rkhs, we can obtain the analogue of Theorem 2.2.1 in the spatial domain. Let \(\mathcal{H}_D(m_i, K_i)\) be as defined before, namely the completion of the linear manifold of \(\{X(t), t \in D\}\) with respect to the inner product given by the corresponding second-order structure \((m_i, K_i)\), \(i = 0,1\). Let \(A\) be the bounded linear operator from \(\mathcal{H}_D(m_0, K_0)\) into \(\mathcal{H}_D(m_1, K_1)\) such that \(A h_t = h_t\) for any \(h_t\) a linear combination of \(\{X(t), t \in D\}\).
Theorem 2.2.10. \(P_0 \equiv P_1\) on \(\mathcal{F}\) if and only if
(a) \(A\) is one-to-one and bounded, with a bounded inverse, from \(\mathcal{H}_D(m_0,K_0)\) into \(\mathcal{H}_D(m_1,K_1)\);
(b) \((I - A^*A)\) is Hilbert-Schmidt, where \(I\) is the identity on \(\mathcal{H}_D(m_0,K_0)\);
(c) \(m_1 - m_0 \in \mathcal{K}(K_0)\).
Since these rkhs-based conditions for the equivalence of Gaussian measures place no constraints on the set \(D\), we can extend Theorem 2.2.3 and Theorem 2.2.5 to the high-dimensional case. We note that slightly weaker results are presented in Yadrenko (1983, p.154 and p.156), but we believe our proofs are more straightforward and clearer in the context of this thesis.
Let \(\{X_t, t \in D\}\) be a centered Gaussian random field with covariance function \(K_j\) and spectral measure \(F_j\) with spectral density \(f_j\) under the corresponding measure \(P_j\), \(j = 0,1\), where \(D \subset \mathbb{R}^d\) is bounded. Let \(b(s,t) = K_0(s,t) - K_1(s,t)\), \(s,t \in D\); as is well known, \(K_j(s,t) = \int e^{i\lambda'(s-t)} f_j(\lambda)\, d\lambda\). Suppose there exist constants \(\gamma_1, \gamma_2\) such that \(\gamma_1 K_0 \ll K_1 \ll \gamma_2 K_0\).
Theorem 2.2.11. A necessary and sufficient condition for the equivalence of the Gaussian measures \(P_0\) and \(P_1\) is that the covariance difference \(b(s,t)\), \(s,t \in D\), can be extended to a square integrable function \(b(s,t)\) on \(\mathbb{R}^d \times \mathbb{R}^d\) whose Fourier transform \(\varphi\) satisfies
\[
\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \frac{|\varphi(\lambda,\mu)|^2}{f_0(\lambda) f_0(\mu)}\, d\lambda\, d\mu < \infty. \qquad (2.2.15)
\]
Proof. It follows from Theorem 2.2.9(c) [Chatterji and Mandrekar (1978), p.183] that \(P_0 \equiv P_1\) is equivalent to
\[
b(s,t) \in \mathcal{K}(K_0 \otimes K_0),
\]
where the reproducing kernel Hilbert space
\[
\mathcal{K}(K_0 \otimes K_0) = \left\{\hat\varphi : \hat\varphi(s,t) = \iint e^{-i(s'\lambda + t'\mu)} \varphi(\lambda,\mu)\, dF_0(\lambda)\, dF_0(\mu),\ \varphi \in L^2(F_0 \times F_0)\right\},
\]
which follows from Theorem 2.2.8. Indeed,
\[
K_0 \otimes K_0[(s,t),(s_1,t_1)] = K_0(s-s_1) K_0(t-t_1) = \iint e^{i(t-t_1)'\lambda + i(s-s_1)'\mu}\, dF_0(\lambda)\, dF_0(\mu),
\]
and for any \(\hat\varphi \in \mathcal{K}(K_0 \otimes K_0)\) there exists \(\varphi \in L^2(F_0 \times F_0)\) such that
\[
\langle \hat\varphi,\, K_0 \otimes K_0[\cdot,(s,t)] \rangle = \iint e^{-i(s'\lambda + t'\mu)} \varphi(\lambda,\mu)\, dF_0(\lambda)\, dF_0(\mu) = \hat\varphi(s,t).
\]
We will employ the following well-known properties of the Fourier transform. For any square integrable functions (with respect to Lebesgue measure) \(\varphi_j(\lambda)\), \(\lambda \in \mathbb{R}^d\), there are square integrable functions \(a_j(t)\), \(t \in \mathbb{R}^d\), such that
\[
\varphi_j(\lambda) = \int_{\mathbb{R}^d} \exp(-i\lambda' t)\, a_j(t)\, dt, \qquad j = 1,2.
\]
Furthermore,
\[
\varphi_1(\lambda)\varphi_2(\lambda) = \int_{\mathbb{R}^d} \exp(-i\lambda' t)\, (a_1 * a_2)(t)\, dt, \qquad (2.2.16)
\]
\[
\int_{\mathbb{R}^d} \exp(i\lambda' t)\, \varphi_1(\lambda)\varphi_2(\lambda)\, d\lambda = (2\pi)^d (a_1 * a_2)(t), \qquad (2.2.17)
\]
where all the equalities are in the \(L^2(d\lambda)\) sense, and \(a_1 * a_2\) is the convolution, i.e.,
\[
(a_1 * a_2)(t) = \int_{\mathbb{R}^d} a_1(s)\, a_2(t-s)\, ds.
\]
Theorem 2.2.12. Suppose
\[
0 < f_0(\lambda) \asymp |\varphi(\lambda)|^2, \qquad |\lambda| \to \infty, \qquad (2.2.18)
\]
where \(\varphi\) is the Fourier transform of some square integrable function identically zero outside a bounded set \(T\). Let
\[
h(\lambda) = \frac{f_1(\lambda) - f_0(\lambda)}{f_0(\lambda)}
\]
satisfy the condition, for some \(M > 0\),
\[
\int_{|\lambda|>M} |h(\lambda)|^2\, d\lambda < \infty. \qquad (2.2.19)
\]
Then \(P_0 \equiv P_1\).
Proof. Since the function \(h(\lambda)\) is square integrable and \(f_0(\lambda) \asymp |\varphi(\lambda)|^2\), where \(\varphi(\lambda) = \int e^{i\lambda' t} c(t)\, dt\) and \(c(t) = 0\) when \(t\) lies outside the bounded set \(T\), Plancherel's Theorem (see, e.g., Yosida (1968), p.153) implies there exists a square integrable function \(a(t)\) such that \(h(\lambda) = \int e^{-i\lambda' t} a(t)\, dt\). Furthermore, for \((s,t) \in D \times D\),
\[
b(s,t) = \int e^{i\lambda'(s-t)} (f_1(\lambda) - f_0(\lambda))\, d\lambda = \int e^{i\lambda'(s-t)}\, h(\lambda)\, |\varphi(\lambda)|^2\, d\lambda.
\]
By (3.3.83),
\[
|\varphi(\lambda)|^2 = \int \exp(-i\lambda' t)\, (c * \tilde c)(t)\, dt, \qquad \tilde c(t) = \overline{c(-t)}.
\]
The right side of (2.3.9) does not depend on \(\omega\) and tends to 0 as \(n \to \infty\) by (2.3.7), so (2.3.3) follows. Switching the roles of \(K_0\) and \(K_1\) yields (2.3.4). Next, since
\[
\frac{E_0\, e_1(t,n)^2}{E_0\, e_0(t,n)^2} = \frac{E_0\, e_1(t,n)^2}{E_1\, e_1(t,n)^2} \cdot \frac{E_1\, e_1(t,n)^2}{E_0\, e_0(t,n)^2},
\]
(2.3.5) follows from (2.3.3) and (2.3.4). Again switching the roles of \(K_0\) and \(K_1\) yields (2.3.6). □
Another sensible measure of how well predictions based on \(K_1\) do when \(K_0\) is the correct covariance function is
\[
\frac{E_0(e_1(t,n) - e_0(t,n))^2}{E_0\, e_0(t,n)^2},
\]
i.e., how large the mean squared difference of the predictions is relative to the correct mean squared error. Because the mean squared error is often calculated in practice, it is also of interest to compare the presumed MSE with the actual MSE by evaluating the ratio \(E_1 e_1(t,n)^2 / E_0\, e_1(t,n)^2\). The following theorem is a special version of Stein (1999b, p.135).
Theorem 2.3.2. If the two covariance functions \(K_0\) and \(K_1\) define two equivalent Gaussian probability measures, and the set of sampling sites \(\{t_k, k = 1,2,\ldots\}\) is dense in \(D\), where \(D \subset \mathbb{R}^d\) is bounded, then uniformly in \(t \in D\) such that \(E_0\, e_0(t,n)^2 > 0\),
\[
\lim_{n\to\infty} \frac{E_0(e_1(t,n) - e_0(t,n))^2}{E_0\, e_0(t,n)^2} = 0, \qquad (2.3.11)
\]
\[
\lim_{n\to\infty} \frac{E_1\, e_1(t,n)^2}{E_0\, e_1(t,n)^2} = 1. \qquad (2.3.12)
\]
Sufficient conditions for equivalent probability measures exist in terms of the spectral density. Let \(f_i(\lambda)\) for \(\lambda \in \mathbb{R}^d\) be the spectral density corresponding to the covariance function \(K_i(h)\) for \(i = 0,1\). If the \(f_i(\lambda)\)'s are isotropic, i.e., depend only on \(\|\lambda\|\), it follows from Theorem 2.2.5 that the corresponding Gaussian measures are equivalent if for some \(\epsilon > 0\),
\[
f_1(\lambda)/f_0(\lambda) - 1 = O\!\left(\|\lambda\|^{-(d/2+\epsilon)}\right) \quad \text{as } \|\lambda\| \to \infty. \qquad (2.3.13)
\]
Condition (2.3.13) implies that \(f_1(\lambda)/f_0(\lambda) \to 1\) as \(\|\lambda\| \to \infty\) and imposes a rate on the convergence. Condition (2.3.13) is stronger than necessary for (2.3.11) and (2.3.12) to hold, as seen from the next theorem (see Stein 1999b, p.136).
Theorem 2.3.3. Let the underlying process \(X(t)\) be Gaussian under probability \(P_i\) with mean 0 and spectral density \(f_i\), \(i = 0,1\). If for some \(\vartheta > 1\), \(f_0(\lambda)\|\lambda\|^{\vartheta}\) is bounded away from 0 and \(\infty\), and
\[
\frac{f_1(\lambda)}{f_0(\lambda)} \to 1 \quad \text{as } \|\lambda\| \to \infty,
\]
then (2.3.11) and (2.3.12) hold.
If observations are taken on an infinite lattice \(\delta\mathbb{Z}^d\) while \(\delta \to 0\), the rate imposed on the convergence \(f_1/f_0 \to 1\) yields the rate of convergence to optimality of the predictions in (2.3.11) and (2.3.12) (see Stein 1999a, p.252). In analogy with (2.3.2) in the finite-sample case, we let \(e_i(t,\delta)\) be the prediction error of \(X(t)\) under measure \(P_i\) when the process is observed on the infinite lattice \(\delta\mathbb{Z}^d\).
Theorem 2.3.4. If \(f_i(\lambda)(1 + \|\lambda\|)^{\vartheta}\) is bounded away from 0 and \(\infty\) \((i = 0,1)\) and \(|f_1(\lambda)/f_0(\lambda) - 1| \le A(1 + \|\lambda\|)^{-\gamma}\) for some positive numbers \(\vartheta, \gamma, A\), then as \(\delta \to 0\),
\[
\sup_{t \notin \delta\mathbb{Z}^d} \frac{E_0(e_0(t,\delta) - e_1(t,\delta))^2}{E_0\, e_0(t,\delta)^2} = O\!\left(\delta^{\min(\vartheta,\,2\gamma)} (\log \delta^{-1})^{1_{\{\vartheta = 2\gamma\}}}\right), \qquad (2.3.14)
\]
\[
\sup_{t \notin \delta\mathbb{Z}^d} \left|\frac{E_1\, e_1(t,\delta)^2}{E_0\, e_1(t,\delta)^2} - 1\right| = O\!\left(\delta^{\min(\vartheta,\,\gamma)} (\log \delta^{-1})^{1_{\{\vartheta = \gamma\}}}\right). \qquad (2.3.15)
\]
Under an additional condition on the spectral densities, Stein (1999a, p.259) also obtained such convergence rates when the observations are unequally spaced on an interval.
2.4 Fixed-domain asymptotics of maximum likelihood estimators
Compared with increasing-domain asymptotics, fixed-domain asymptotic results for estimation are considerably fewer, owing to the lack of analytic tools for dealing with the increasingly strong correlations between nearby observations. More specifically, taking more and more data in a fixed domain gives more and more correlated observations; under increasing-domain asymptotics, by contrast, taking samples in an increasingly large domain gives roughly independent observations provided the correlations decay fast enough with distance. For the simplest model, the exponential model, the fixed-domain asymptotics of the exact MLE has been thoroughly studied by Ying (1991, 1993) and Chen, Simpson, and Ying (2000). For the general Matérn model, Zhang (2004) gives the strong consistency of the MLE with \(\theta\) fixed. However, the fixed-domain asymptotic distribution is not available even when data are observed along a line; we therefore first study and establish the fixed-domain asymptotic distribution of the MLE for the microergodic parameter in the general Matérn model. In the following, we present some of these results.
Let the second-order stationary Gaussian process \(X(t)\), \(t \in \mathbb{R}^d\), have mean 0 and an isotropic covariance function \(K(h; \theta, \sigma^2)\), where \(\sigma^2\) is the variance of the process and \(\theta\) is the parameter that controls how fast the covariance function decays. Given \(n\) observations \(X_n = (X(t_1), \ldots, X(t_n))'\), the log-likelihood is
\[
\ell_n(\theta, \sigma^2) = -\frac{n}{2}\log 2\pi - \frac{1}{2}\log[\det V_n(\theta,\sigma^2)] - \frac{1}{2} X_n' [V_n(\theta,\sigma^2)]^{-1} X_n, \qquad (2.4.1)
\]
where \(V_n(\theta,\sigma^2)\) denotes the covariance matrix of \(X_n\). The maximum likelihood estimator (MLE) of \((\theta,\sigma^2)\) maximizes the likelihood function \(\ell_n(\theta,\sigma^2)\), i.e., \((\hat\theta, \hat\sigma^2) = \arg\max \ell_n(\theta,\sigma^2)\). In this work, we will focus on the Matérn model; that is, the covariance function considered in (2.4.1) is the Matérn covariance defined as in (3.3.14). As we mentioned earlier, the product \(\sigma^2\theta^{2\nu}\) is shown to be consistently estimable, although neither of the parameters \(\sigma^2\) and \(\theta\) is if the spatial sampling domain is bounded [see, e.g., Zhang (2004)]. It is actually more important to estimate \(\sigma^2\theta^{2\nu}\) well for spatial interpolation [see, e.g., Zhang (2004), Stein (1999b)]. For the exponential model, which is the special case with \(\nu = 1/2\), the underlying process is known as the Ornstein-Uhlenbeck process. Because this process possesses some nice properties, such as the Markov property, the fixed-domain asymptotic analysis is relatively more tractable [Ying (1991), Chen et al. (2000)]. Ying (1991) gives the following fixed-domain strong consistency and asymptotic normality for the MLE of the consistently estimable parameter \(\theta\sigma^2\) for the exponential model:
Theorem 2.4.1. Let the underlying process \(\{X(t), t \in [0,1]\}\) be Gaussian with mean 0 and an exponential covariance function \(K(h) = \sigma^2\exp(-\theta h)\). The domain of the maximization in \((\theta,\sigma^2)\) is \(J = [a,\infty] \times [w,v]\) or \([a,b] \times [w,\infty]\), \(0 < a < b < \infty\) and \(0 < w < v < \infty\). Then, with probability one, the maximum likelihood estimator \((\hat\theta_{n,\mathrm{mle}}, \hat\sigma^2_{n,\mathrm{mle}})\) exists for all large \(n\), and as \(n \to \infty\),
\[
\hat\theta_n \hat\sigma_n^2 \to \theta_0\sigma_0^2 \quad a.s., \qquad (2.4.2)
\]
\[
\sqrt{n}\,(\hat\theta_n\hat\sigma_n^2 - \theta_0\sigma_0^2) \xrightarrow{d} N\!\left(0,\; 2(\theta_0\sigma_0^2)^2\right). \qquad (2.4.3)
\]
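Theorem 2.4.1 can be illustrated by simulation. The sketch below is ours, not the thesis'; the grid sizes, seed, and parameter values are assumptions. It exploits the Markov property of the Ornstein-Uhlenbeck process to evaluate the exact Gaussian likelihood cheaply and maximizes it over a grid; the product \(\hat\theta_n\hat\sigma_n^2\) lands near \(\theta_0\sigma_0^2\) even though \(\hat\theta_n\) and \(\hat\sigma_n^2\) individually sit on a likelihood ridge.

```python
import numpy as np

# Fixed-domain sketch for the exponential model (our illustration).
rng = np.random.default_rng(1)
n, theta0, s20 = 1000, 2.0, 1.0             # true theta0 * sigma0^2 = 2
d = 1.0 / (n - 1)                           # spacing of the grid on [0, 1]

rho0 = np.exp(-theta0 * d)                  # OU on a grid is an AR(1)
x = np.empty(n)
x[0] = rng.normal(0.0, np.sqrt(s20))
for i in range(1, n):
    x[i] = rho0 * x[i - 1] + rng.normal(0.0, np.sqrt(s20 * (1 - rho0 ** 2)))

def loglik(theta, s2):
    # exact log-likelihood (up to an additive constant) via the Markov property
    rho = np.exp(-theta * d)
    v = s2 * (1 - rho ** 2)
    resid = x[1:] - rho * x[:-1]
    return (-0.5 * (np.log(s2) + x[0] ** 2 / s2)
            - 0.5 * ((n - 1) * np.log(v) + np.sum(resid ** 2) / v))

thetas = np.linspace(0.2, 8.0, 80)
s2s = np.linspace(0.2, 5.0, 80)
ll = np.array([[loglik(t, s) for s in s2s] for t in thetas])
i, j = np.unravel_index(ll.argmax(), ll.shape)
print(thetas[i], s2s[j], thetas[i] * s2s[j])   # product should be near 2
```

With \(n = 1000\), (2.4.3) puts the standard deviation of \(\hat\theta_n\hat\sigma_n^2\) at roughly \(\sqrt{2}\,\theta_0\sigma_0^2/\sqrt{n} \approx 0.09\), so the product is estimated far more stably than either factor.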
For the general Matérn model, strictly speaking the underlying process has no such property as the Markov property, so the fixed-domain asymptotic properties of \((\hat\theta,\hat\sigma^2)\) obtained by jointly maximizing over both parameters are not yet available. However, Zhang (2004) proved that when \(\nu\) is known and \(\theta\) is fixed at any value \(\theta_1 > 0\), the maximizer \(\hat\sigma_n^2\) of the likelihood (2.4.1) yields the following strongly consistent estimator of the consistently estimable parameter \(\sigma^2\theta^{2\nu}\).
Theorem 2.4.2. Let the underlying process \(\{X(t), t \in \mathbb{R}^d\}\), \(d = 1, 2,\) or 3, be second-order stationary Gaussian with mean 0 and possess an isotropic Matérn covariogram (3.3.14) with the unknown parameter values \(\sigma_0^2, \theta_0\) and a known \(\nu\). Let \(D_n\), \(n = 1,2,\ldots\), be an increasing sequence of finite subsets of \(\mathbb{R}^d\) such that \(\bigcup_{n=1}^{\infty} D_n\) is bounded and infinite, and let \(L_n(\sigma^2, \theta)\) be the likelihood function when the process is observed at the locations in \(D_n\). For any fixed \(\theta_1 > 0\), let \(\hat\sigma_n^2\) maximize \(L_n(\sigma^2, \theta_1)\). Then \(\hat\sigma_n^2\theta_1^{2\nu} \to \sigma_0^2\theta_0^{2\nu}\) with \(P_0\) probability 1, where \(P_0\) is the Gaussian measure defined by the Matérn covariogram corresponding to the parameter values \(\sigma_0^2, \theta_0\) and \(\nu\).
We showed the asymptotic normality of this estimator.
Theorem 2.4.3. [Du, Zhang and Mandrekar (2009)] With the same notation and assumptions as in Theorem 2.4.2, for any fixed \(\theta_1\),
\[
\sqrt{n}\,(\hat\sigma_n^2\theta_1^{2\nu} - \sigma_0^2\theta_0^{2\nu}) \xrightarrow{d} N\!\left(0,\; 2(\sigma_0^2\theta_0^{2\nu})^2\right). \qquad (2.4.4)
\]
This theorem is proved based on theoretical results related to the equivalence of Gaussian measures. The detailed proof will be provided in the next chapter.
Chapter 3

Covariance tapering and fixed-domain properties of tapered maximum likelihood estimators
3.1 Introduction
As introduced at the very beginning, applying some traditional statistical approaches, such as best linear unbiased prediction or kriging, maximum likelihood estimation, or Bayesian inference, to large spatial data sets is often computationally infeasible because of the cubic-order matrix algorithms involved in computing a matrix inverse or determinant. To obviate these computational hurdles, a natural idea is to make the covariances exactly zero beyond a certain distance so that the resulting matrix has a high proportion of zero entries and is therefore sparse. Operations on sparse matrices take up less computer memory and run faster. However, this has to be done in such a way that the resulting matrix is still positive definite. Covariance tapering ensures that the tapered covariance matrix is positive definite while retaining most of the information. Technically, this method tapers the covariance function to zero beyond a certain range by directly multiplying it by a positive definite but compactly supported function; this results in the so-called tapered covariance matrix, which
can be efﬁciently handled by well-established sparse matrix algorithms. The tapered
covariogram is of the form
\[
\tilde K(h;\theta) = K(h;\theta)\, K_{\mathrm{tap}}(h), \qquad (3.1.1)
\]
where \(K(h;\theta)\) is the covariance function of the underlying process, which depends on a vector of parameters \(\theta\), and \(K_{\mathrm{tap}}(h)\) is the taper, a known correlation function that is 0 beyond a threshold distance. Some examples of tapers are the spherical and Wendland tapers (Wendland, 1995, 1998). See also Wu (1995), Gneiting (2002) and Mitra et al. (2003).
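As a concrete sketch of (3.1.1) (our own illustration; the exponential base covariance, the Wendland taper \(\psi_{3,1}(h) = (1 - h/\gamma)_+^4 (1 + 4h/\gamma)\), and the range \(\gamma = 0.4\) are choices made here, not prescribed by the text):

```python
import numpy as np

# Tapered covariogram K~(h) = K(h) * K_tap(h); all choices below are ours.
def k_exp(h, s2=1.0, theta=3.0):
    return s2 * np.exp(-theta * h)           # underlying covariance K(h; theta)

def k_tap(h, gamma=0.4):
    # Wendland psi_{3,1}: compactly supported correlation, valid in R^3
    return np.clip(1 - h / gamma, 0, None) ** 4 * (1 + 4 * h / gamma)

t = np.linspace(0, 1, 200)                   # sampling sites on [0, 1]
H = np.abs(t[:, None] - t[None, :])          # distance matrix
V = k_exp(H)                                 # dense covariance matrix
V_tilde = V * k_tap(H)                       # tapered: exact zeros beyond gamma

sparsity = np.mean(V_tilde == 0.0)
print(sparsity)                              # high proportion of zero entries
# positive definiteness is preserved (Schur product of PD matrices):
print(np.linalg.eigvalsh(V_tilde).min() > 0)
```

Since the taper matrix has unit diagonal, the Schur product theorem gives \(\lambda_{\min}(\tilde V_n) \ge \lambda_{\min}(V_n) > 0\), so tapering cannot destroy positive definiteness.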
One would then use \(\tilde K(h;\theta)\) in estimation and interpolation as if it were the correct covariance function. For example, we would maximize the following tapered log-likelihood function for Gaussian observations to obtain an estimate of \(\theta\):
\[
\ell_{n,\mathrm{tap}}(\theta,\sigma^2) = -\frac{n}{2}\log 2\pi - \frac{1}{2}\log[\det \tilde V_n] - \frac{1}{2} X_n' \tilde V_n^{-1} X_n, \qquad (3.1.2)
\]
where the tapered covariance matrix is the Hadamard product \(\tilde V_n = V_n(\theta,\sigma^2) \circ T_n\), \(T_n\) has \((i,j)\)th element \(K_{\mathrm{tap}}(\|t_i - t_j\|)\), and \(X_n\) is the vector of observations. Maximizing (3.1.2) is computationally feasible for extremely large datasets but results
in a pseudo-likelihood estimator whose properties need to be studied. Kaufman (2008) established consistency of the pseudo-likelihood estimator for the Matérn class. Very recently, Du, Zhang and Mandrekar (2009) gave some general conditions ensuring that tapering does not affect the efficiency of the maximum likelihood estimator. For spatial interpolation, Furrer, Genton and Nychka (2006) showed that under some regularity conditions, tapering produces asymptotically optimal prediction.
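The computational point behind (3.1.2) can be sketched as follows (our illustration; the taper, range, sites, and the use of SciPy's sparse LU factorization are our choices, not the thesis'): once the Hadamard product has exact zeros, both the log-determinant and the quadratic form can be computed from a sparse factorization.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import splu

# Evaluating the tapered log-likelihood (3.1.2) with sparse algebra (sketch).
rng = np.random.default_rng(0)
n = 300
t = np.sort(rng.uniform(0, 1, n))
H = np.abs(t[:, None] - t[None, :])

V = np.exp(-3.0 * H)                                    # V_n(theta, sigma2)
T = np.clip(1 - H / 0.2, 0, None) ** 4 * (1 + 4 * H / 0.2)  # Wendland taper
Vt = sparse.csc_matrix(V * T)                           # Hadamard product, sparse

x = rng.standard_normal(n)                              # stand-in observations
lu = splu(Vt)                                           # sparse LU factorization
logdet = np.sum(np.log(np.abs(lu.U.diagonal())))        # det > 0 for a covariance
quad = x @ lu.solve(x)                                  # x' Vt^{-1} x
ll_tap = -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

# agreement with the dense computation
sign, logdet_d = np.linalg.slogdet(V * T)
ll_dense = -0.5 * (n * np.log(2 * np.pi) + logdet_d
                   + x @ np.linalg.solve(V * T, x))
print(ll_tap, ll_dense)
```

For truly large \(n\), one would assemble \(\tilde V_n\) directly in sparse form (only pairs within the taper range) rather than tapering a dense matrix as done here for simplicity.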
All the conditions under which tapering leaves consistency, the asymptotic efficiency of estimation, or the asymptotic optimality of prediction unaffected put constraints on the tail behavior of the spectral density of the tapering function. Roughly speaking, this can be accounted for by noting that the spectral density is the Fourier transform of the covariance function. So the faster the spectral density of the taper decays, the smoother the tapering function is at the origin, and the less it changes the degree of differentiability of the underlying process, which plays a central role in any fixed-domain asymptotic results. Therefore, Section 3.2 is devoted to the tail properties of the tapering spectral density and the effect of tapering on spatial interpolation. The effect on estimation is addressed in Section 3.3, where we focus on the asymptotic distribution of the tapered MLE, since it was previously unknown whether covariance tapering causes any loss of asymptotic efficiency. The content of these two sections is from the joint papers [Du, Zhang and Mandrekar (2009)], which has been accepted by the Annals of Statistics, and [Zhang and Du (2008)], written with my advisors. In the absence of asymptotic distributions for the true MLE in the general case, we compare the log-likelihood function and the tapered log-likelihood function, and their derivatives, in Section 3.4. A simulation study and an application to climate data, demonstrating accuracy and computational efficiency, are presented in the last section.
3.2 Tail properties of tapering spectral density
If \(K(h)\) is the covariance function of \(X(t)\), and \(\tilde K(h) = K(h)K_{\mathrm{tap}}(h)\) the tapered covariance function, denote the spectral densities corresponding to \(K(h)\), \(\tilde K(h)\) and \(K_{\mathrm{tap}}(h)\) by \(f_0(\lambda)\), \(f_1(\lambda)\) and \(f_{\mathrm{tap}}(\lambda)\) for \(\lambda \in \mathbb{R}^d\), respectively. We have the following relationship from a well-known fact about the Fourier transform:
\[
f_1(\lambda) = \int_{\mathbb{R}^d} f_0(\lambda - x)\, f_{\mathrm{tap}}(x)\, dx. \qquad (3.2.1)
\]
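A discrete analogue of (3.2.1) can be checked directly with the DFT (a toy sketch of ours; the sequences below are arbitrary choices): the transform of a pointwise product of two sequences equals \(1/N\) times the circular convolution of their transforms.

```python
import numpy as np

# Discrete analogue of (3.2.1): spectrum of a product of covariance sequences
# equals the (normalized) convolution of their spectra. Toy sequences, ours.
N = 64
lag = np.minimum(np.arange(N), N - np.arange(N))   # circular distances
k0 = np.exp(-0.3 * lag)                            # "covariance" sequence
ktap = np.clip(1 - lag / 10.0, 0, None) ** 2       # compactly supported "taper"

F0, Ftap = np.fft.fft(k0), np.fft.fft(ktap)
Fprod = np.fft.fft(k0 * ktap)                      # spectrum of the product

# direct circular convolution of the two spectra
circ = np.array([np.sum(F0 * Ftap[(m - np.arange(N)) % N]) for m in range(N)])

err = np.max(np.abs(Fprod - circ / N))
print(err)   # numerically zero: DFT(k0 * ktap) = (1/N) (F0 circ-conv Ftap)
```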
Based on an equation equivalent to (3.2.1), Furrer, Genton and Nychka (2006) have shown the following result.
Theorem 3.2.1. Let \(f_0(\lambda) \propto \theta^{2\nu}/(\theta^2 + \|\lambda\|^2)^{\nu + d/2}\) for \(\lambda \in \mathbb{R}^d\) be the spectral density function of a Matérn covariance function. Let the spectral density \(f_{\mathrm{tap}}(\lambda)\) of the taper satisfy the following taper condition:
\[
0 < f_{\mathrm{tap}}(\lambda) \le M(1 + \|\lambda\|^2)^{-\nu - d/2 - \epsilon} \qquad (3.2.2)
\]
for some \(\epsilon \ge 0\) and \(M > 0\). Then \(f_1(\lambda)/f_0(\lambda)\) has a finite limit as \(\|\lambda\| \to \infty\), and this limit equals 1 if \(\epsilon > 0\).
This theorem and Theorem 2.3.3 imply that if \(f_{\mathrm{tap}}(\lambda)\) has a lighter tail than \(f_0(\lambda)\), then predictions under the two models will be nearly the same when a large sample is obtained from a bounded region.
We can give a stronger result which, combined with Theorem 2.3.4, gives the convergence rate of the MSEs based on a tapered covariance structure; this indicates the degree to which appropriate tapering maintains the optimality of prediction. It also provides the condition for the two measures corresponding to the two spectral densities, i.e., tapered and untapered, to be equivalent in the one-dimensional case.
Theorem 3.2.2. If the taper condition (3.2.2) holds for some \(\epsilon > d/2 + \gamma/2\), where \(0 < \gamma < 1\), the following hold.
(i)
\[
|f_1(\lambda)/f_0(\lambda) - 1| \le A(1 + \|\lambda\|)^{-\gamma}, \qquad (3.2.3)
\]
\[
C_1 < f_i(\lambda)(1 + \|\lambda\|)^{2\nu + d} < C_2 \quad (i = 0,1) \qquad (3.2.4)
\]
for some constants \(A, C_1, C_2 > 0\).
(ii) The two Gaussian measures corresponding to the two spectral densities are equivalent for \(d = 1\) and \(1 > \gamma > 1/2\).
Proof. We can assume \(\theta = 1\) without loss of generality because the results will be the same except for the magnitude of the constants \(A\) and \(C_i\), \(i = 1,2\). Then, in view of isotropy, \(f_0(\lambda)\) can be written as \(f_0^*(\|\lambda\|) = M_1/(1 + \|\lambda\|^2)^{\nu + d/2}\), where \(M_1\) is a positive constant [see (3.3.14)]. By continuity and \(f_0(\lambda) > 0\), to show (3.2.3) it suffices to show that there exists \(A > 0\) such that for \(\|\lambda\|\) large enough,
\[
\frac{|f_1(\lambda) - f_0(\lambda)|\,(1 + \|\lambda\|)^{\gamma}}{f_0(\lambda)} \le A. \qquad (3.2.5)
\]
Indeed, since the tapered covariance function \(\tilde K(\cdot)\) is also isotropic, from (3.2.1) it follows that, writing \(w = \|\lambda\|\), \(v = \lambda/w\), and letting \(U\) denote the (normalized) surface measure on the unit sphere \(S_d\),
\[
f_1(\lambda) = f_1^*(w) = \int_{S_d} \int_0^\infty f_0^*(\|ru - wv\|)\, f_{\mathrm{tap}}^*(r)\, r^{d-1}\, dr\, dU(u),
\]
and the LHS of (3.2.5) is therefore bounded by
\[
M_2 (1 + w)^{\gamma} \int_{S_d} \int_0^\infty \frac{|f_0^*(\|ru - wv\|) - f_0^*(\|wv\|)|}{f_0^*(w)}\, f_{\mathrm{tap}}^*(r)\, r^{d-1}\, dr\, dU(u). \qquad (3.2.8)
\]
Furthermore, for \(w\) sufficiently large, we will bound the inner integral over the intervals \([0, w-\Delta]\), \([w-\Delta, w+\Delta]\), \([w+\Delta, \infty)\) by quantities independent of \(u\), where \(\Delta = O(w^{\delta})\) for \(\delta = (d + 2\nu + \gamma)/(d + 2\nu + 1)\); clearly \(0 < \delta < 1\). Then the assessment of the whole iterated integral in (3.2.8) becomes straightforward because the outer integral has total mass one.
First, note that the Mean Value Theorem implies that there exists \(\xi\) between \(w\) and \(\|ru - wv\|\) such that
\[
f_0^*(\|ru - wv\|) - f_0^*(\|wv\|) = f_0^{*\prime}(\xi)\,(\|ru - wv\| - \|wv\|),
\]
and \(|f_0^{*\prime}(x)| = [2M_1(\nu + d/2)x]/[(1 + x^2)^{\nu + d/2 + 1}]\), which is decreasing in \(x > 0\). Therefore, for \(r \in [0, w-\Delta]\) or \([w+\Delta, \infty)\), we have \(\|ru - wv\| \ge |w - r| \ge \Delta\). This together with \(w > \Delta\) entails \(\xi > \Delta\), thus
\[
|f_0^*(\|ru - wv\|) - f_0^*(\|wv\|)| \le |f_0^{*\prime}(\Delta)|\, \|ru\| = \frac{M_3 \Delta\, r}{(1 + \Delta^2)^{\nu + d/2 + 1}}, \qquad (3.2.9)
\]
where \(M_3\) denotes \(2M_1(\nu + d/2)\). For \(r \in [w-\Delta, w+\Delta]\), we have \(\|ru - wv\| \ge |w - r|\) and \(w > \Delta > |w - r|\), which indicates \(\xi > |w - r|\), so
\[
|f_0^*(\|ru - wv\|) - f_0^*(\|wv\|)| \le |f_0^{*\prime}(|w - r|)|\, \|ru\| = \frac{M_3\, |w - r|\, r}{(1 + |w - r|^2)^{\nu + d/2 + 1}}. \qquad (3.2.10)
\]
From (3.2.9) and the taper condition (3.2.2), it follows that, for \(r\) ranging over \([0, w-\Delta] \cup [w+\Delta, \infty)\),
\[
\int \frac{|f_0^*(\|ru - wv\|) - f_0^*(\|wv\|)|}{f_0^*(w)}\, f_{\mathrm{tap}}^*(r)\, r^{d-1}\, dr
\le \frac{M_3 \Delta (1 + w^2)^{\nu + d/2}}{M_1 (1 + \Delta^2)^{\nu + d/2 + 1}} \int f_{\mathrm{tap}}^*(r)\, r^{d}\, dr \qquad (3.2.11)
\]
\[
\le \frac{M_3 M \Delta (1 + w^2)^{\nu + d/2}}{M_1 (1 + \Delta^2)^{\nu + d/2 + 1}} \left[\int_0^{w-\Delta} \frac{r^d}{(1 + r^2)^{\nu + d/2 + \epsilon}}\, dr + \int_{w+\Delta}^{\infty} \frac{r^d}{(1 + r^2)^{\nu + d/2 + \epsilon}}\, dr\right].
\]
As \(\epsilon > d/2 \ge 1/2 - \nu\), the first integral in (3.2.11) is finite and the second integral tends to 0 as \(w \to \infty\). To control the remaining inner integral over \([w-\Delta, w+\Delta]\), we need the monotonicity of \(r^d/[(1 + r^2)^{\nu + d/2 + \epsilon}]\), which is decreasing in \(r\) for \(\epsilon > d/2\). In view of this, (3.2.10) and the taper condition (3.2.2), we have
\[
\int_{w-\Delta}^{w+\Delta} \frac{|f_0^*(\|ru - wv\|) - f_0^*(\|wv\|)|}{f_0^*(w)}\, f_{\mathrm{tap}}^*(r)\, r^{d-1}\, dr
\le \frac{M_3 M (1 + w^2)^{\nu + d/2}}{M_1} \int_{w-\Delta}^{w+\Delta} \frac{|r - w|\, r^d}{(1 + |r - w|^2)^{\nu + d/2 + 1}(1 + r^2)^{\nu + d/2 + \epsilon}}\, dr
\]
\[
\le \frac{M_3 M (w - \Delta)^d (1 + w^2)^{\nu + d/2}}{M_1 (1 + (w - \Delta)^2)^{\nu + d/2 + \epsilon}} \int_{w-\Delta}^{w+\Delta} \frac{|r - w|}{(1 + |r - w|^2)^{\nu + d/2 + 1}}\, dr, \qquad (3.2.12)
\]
where the finiteness of the last integral is easily verified via a change of variables.
Since the total mass of the outer integral in (3.2.8) is one, combining (3.2.8) with (3.2.11) and (3.2.12) gives
\[
\limsup_{w \to \infty}\, (1 + w)^{\gamma} \int_{S_d} \int_0^\infty \frac{|f_0^*(\|ru - wv\|) - f_0^*(\|wv\|)|}{f_0^*(w)}\, f_{\mathrm{tap}}^*(r)\, r^{d-1}\, dr\, dU(u) < \infty.
\]