TAIL ESTIMATION OF THE SPECTRAL DENSITY UNDER FIXED-DOMAIN ASYMPTOTICS By Wei-Ying Wu A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Statistics 2011

ABSTRACT

TAIL ESTIMATION OF THE SPECTRAL DENSITY UNDER FIXED-DOMAIN ASYMPTOTICS

By Wei-Ying Wu

For spatial statistics, two asymptotic approaches are usually considered: increasing domain asymptotics and fixed-domain asymptotics (or infill asymptotics). Under increasing domain asymptotics, the sampled data increase with the increasing spatial domain, while under infill asymptotics, data are observed on a fixed region with the distance between neighboring observations tending to zero. The consistency and asymptotic results under these two asymptotic frameworks can be quite different. For example, not all parameters are consistently estimable under infill asymptotics, while consistency holds for those parameters under increasing domain asymptotics (Zhang 2004). For a stationary Gaussian random field on R^d with a spectral density f(λ) that satisfies f(λ) ∼ c |λ|^{−θ} as |λ| → ∞, the parameters c and θ control the tail behavior of the spectral density: θ is related to the smoothness of the random field, and c can be used to determine the orthogonality of probability measures for a fixed θ. Specifically, c corresponds to the microergodic parameter mentioned in Du et al. (2009) when a Matérn covariance is assumed. Additionally, under infill asymptotics, the tail behavior of the spectral density dominates the performance of prediction and determines the equivalence of probability measures. For these reasons, estimating c and θ is of significant statistical interest. When the explicit form of f is known, its corresponding covariance structure can be computed through the Fourier transform. Therefore, spatial domain methodologies such as the maximum likelihood estimator (MLE) or the tapered MLE can be used to estimate c and θ.
Unfortunately, the exact form of f is typically unknown in practice. In this situation, spatial domain methods cannot be applied without the covariance information. In my work, for data observed on grid points, two methods that utilize tail frequency information are proposed to estimate c and θ. One of them can be viewed as a weighted local Whittle type estimator. Under the proposed approaches, neither the explicit form of f nor a restriction on the dimension is necessary. The asymptotic properties of the proposed estimators under infill asymptotics (or fixed-domain asymptotics) are investigated in this dissertation, together with simulation studies.

ACKNOWLEDGMENT

I would like to give my deepest thanks to my supervisors, Professor Yimin Xiao and Professor Chae Young Lim, for their invaluable support and encouragement during my PhD studies at Michigan State University. They always patiently shared and explained their knowledge to me. Without their help and guidance, it would have been impossible for me to finish this work. Also, I thank Professor Chae Young Lim for providing me financial support from her MSU grant (MSU 08-IRGP-1532) for my studies in the year 2009-2010. I also appreciate the help from my other committee members, Professor Mark M. Meerschaert and Professor Zhengfang Zhou. Thank you for your detailed suggestions and the time spent on my dissertation. My next thanks go to my best friends and classmates, Wei-Wen Hsu, Gengxin Li, Hsiu-Ching Chang and Sumit Sinha, for their help and suggestions over the past years. I am indebted to Professor Lijian Yang, Ms. Suzanne Watson and Mr. Eric Segur for their tremendous help and constant support. I want to thank all the people who have helped and inspired me during my doctoral studies. Most importantly, I give my thanks to my wife Wei-Ling, my parents and other family members, whose patient love enabled me to complete this work.

TABLE OF CONTENTS

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Increasing domain and fixed-domain asymptotics . . . . . . . . . . 2
1.2 The tail behavior of the spectral density . . . . . . . . . . . . . . . 4
1.2.1 Equivalence of probability measures . . . . . . . . . . . . . . . . 5
1.2.2 Prediction under fixed-domain asymptotics . . . . . . . . . . . . 7

2 Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1 Preliminary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Asymptotic properties of a smoothed periodogram . . . . . . . . . 14
2.3 Approach I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.1 Estimation of c under the known θ . . . . . . . . . . . . . . . . 18
2.3.2 Estimation of θ under the known c . . . . . . . . . . . . . . . . 20
2.3.3 Estimation under unknown c and θ . . . . . . . . . . . . . . . . 21
2.4 Approach II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.4.1 Estimation of c under known θ . . . . . . . . . . . . . . . . . . 23
2.4.2 Estimation of θ under known c . . . . . . . . . . . . . . . . . . 24
2.4.3 Estimation under unknown θ and c . . . . . . . . . . . . . . . . 26

3 Simulation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

5 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.1 The properties of g_{c,θ}(λ) . . . . . . . . . . . . . . . . . . . . . 40
5.2 Proofs of Theorems in Section 2 . . . . . . . . . . . . . . . . . . . 45
5.2.1 Proofs of Theorems in Section 2.2 . . . . . . . . . . . . . . . . . 45
5.2.2 Proofs of Theorems in Section 2.3 . . . . . . . . . . . . . . . . . 47
5.2.3 Proofs of Theorems in Section 2.4 . . . . . . . . . . . . . . . . . 73

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

LIST OF TABLES

3.1 Estimation of θ under known c . . . . . . . . . . . . . . . . . . . . 31
3.2 Estimation of c under known θ . . . . . . . . . . . . . . . . . . . . 31
3.3 Estimation of θ under known c . . . . . . . . . . . . . . . . . . . . 31
3.4 Estimation of θ under known c . . . . . . . . . . . . . . . . . . . . 32
3.5 Estimation of c under known θ . . . . . . . . . . . . . . . . . . . . 32
3.6 Estimation of θ under known c . . . . . . . . . . . . . . . . . . . . 32
3.7 Estimation of θ under known c (Second approach) . . . . . . . . . . 33
3.8 Estimation of θ under unknown c for Example 1 . . . . . . . . . . . 33
3.9 Estimation of θ under unknown c for Example 2 . . . . . . . . . . . 33

LIST OF FIGURES

3.1 Histogram of Example 1 on different c. . . . . . . . . . . . . . . . . 34
3.2 Histogram of Example 2 on different c. . . . . . . . . . . . . . . . . 35
3.3 Histogram of Example 2 with different grid sizes on wrong c. . . . . 36

Chapter 1

Introduction

With recent advances in technology, we are facing enormous amounts of data. When such data sets are observed on a regular grid, spectral analysis is popular due to the fast computation afforded by the Fast Fourier Transform. For example, parameters of the spectral density of a stationary lattice process can be estimated using a Whittle likelihood [Whittle (1954)], which is computationally more efficient than the maximum likelihood method in the spatial domain. In my dissertation, I propose new methodologies, developed from the perspective of spectral analysis, to estimate parameters that control the tail behavior of the spectral density of a stationary Gaussian random field under fixed-domain asymptotics, one of the two standard sampling schemes in spatial statistics. The other sampling scheme is increasing domain asymptotics. Before explaining my research problem, I first introduce these two sampling schemes and their differences.
1.1 Increasing domain and fixed-domain asymptotics

Spatial data on a grid can often be regarded as a realization of a random field on a lattice. That is, for a random field Z(s) on R^d, data are observed at ϕJ for J ∈ ∏_{j=1}^d {1, · · · , m_j}, where ϕ is the grid length. When ϕ is fixed and the sample size is increasing (increasing domain asymptotics), asymptotic properties of parameter estimates in the spectral domain have been studied by many authors [see, e.g., Whittle (1954), Guyon (1982, 1995), Boissy et al. (2005) and Guo et al. (2009)]. For example, Guyon (1982) studied asymptotic properties of estimators using a Whittle likelihood or its variants when a parametric model is assumed for the spectral density of a stationary process on a lattice. Guo et al. (2009) studied asymptotic properties of estimators of long-range dependence parameters for anisotropic spatial linear processes using a local Whittle likelihood method, in which a parametric form is assumed only near zero frequency. This is an extension of Robinson's (1995) work on time series.

For spatial data, it is often natural to assume that the data are observed on a bounded domain of interest; therefore, more observations on the bounded domain means that the distance between observations, ϕ, decreases as the number of observations increases. This sampling scheme requires a different asymptotic framework, called fixed-domain asymptotics [Stein (1999)] (or infill asymptotics [Cressie (1993)]). It has been shown that asymptotic results under fixed-domain asymptotics can be different from those under increasing-domain asymptotics [see, e.g., Mardia and Marshall (1984), Ying (1991, 1993), and Zhang (2004)]. For example, Zhang (2004) showed that not all parameters in the Matérn covariance model of a stationary Gaussian random field on R^d are consistently estimable when d is smaller than or equal to 3.
He also showed that a reparameterized quantity, which is a function of the variance and scale parameters, can be estimated consistently by the maximum likelihood method. On the other hand, under increasing-domain asymptotics, the maximum likelihood estimators (MLEs) of the variance and scale parameters of a stationary Gaussian process are consistent and asymptotically normal [Mardia and Marshall (1984)].

Although not all parameters can be estimated consistently under fixed-domain asymptotics, a microergodic parameter can be [see, e.g., Ying (1991, 1993), Zhang (2004), Zhang and Zimmerman (2005), Du et al. (2009), and Anderes (2010)]. The microergodicity of functions of parameters determines the equivalence of probability measures, and a microergodic parameter is the quantity that affects the asymptotic mean squared prediction error under fixed-domain asymptotics [Stein (1990, 1999)].

Although more asymptotic results have become available recently under fixed-domain asymptotics, they are still few in contrast with the vast literature on increasing-domain asymptotics. Also, most results are for specific models of covariance functions. For example, Ying (1991, 1993) and Chen et al. (2000) studied asymptotic properties of estimators of a microergodic parameter in the exponential covariance function, while Zhang (2004), Loh (2005), Kaufman et al. (2008), Du et al. (2009) and Anderes (2010) investigated asymptotic properties of estimators for the Matérn covariance function. For the estimation of the fractal dimension in the spatial domain under fixed-domain asymptotics, Constantine and Hall (1994) estimated the effective fractal dimension using the variogram of a non-Gaussian stationary process on R. Chan and Wood (2004) introduced an increment-based estimator of the fractal dimension of a function of a stationary Gaussian random field on R^d when d = 1 or 2. These asymptotic results are established in the spatial domain.
Asymptotic results in the spectral domain under fixed-domain asymptotics are even scarcer. Stein (1995) studied asymptotic properties of a spatial periodogram of a filtered version of a stationary Gaussian random field. Lim and Stein (2008) extended the results of Stein (1995) and showed asymptotic normality of a smoothed spatial cross-periodogram under fixed-domain asymptotics. Regarding parameter estimation in the spectral domain under fixed-domain asymptotics, Chan et al. (1995) proposed a periodogram-based estimator of the fractal dimension of a stationary Gaussian random field when d = 1.

From the above discussion, it follows that the properties under increasing domain and fixed-domain asymptotics are quite different, and that more research is needed for fixed-domain asymptotics. In the next section, I introduce my research problem under fixed-domain asymptotics.

1.2 The tail behavior of the spectral density

In this dissertation, I propose estimators of parameters that control the tail behavior of the spectral density of a stationary Gaussian random field when the data are observed on a grid within a bounded domain, and I study their asymptotic properties under fixed-domain asymptotics. Let f(λ) be the spectral density of a stationary Gaussian random field Z(s) on R^d, and assume that

f(λ) ∼ c |λ|^{−θ} as |λ| → ∞, λ ∈ R^d, (1.1)

where | · | is the usual Euclidean norm and θ > d to ensure integrability of f. That is, we only assume a power law for the tail behavior of the spectral density and do not assume any specific parametric form of the spectral density. In the following subsections, the interest in the tail behavior is motivated from two perspectives: the equivalence of probability measures and prediction.

1.2.1 Equivalence of probability measures

Two probability measures P_1 and P_2 on a measurable space (Ω, F) are equivalent, denoted by P_1 ≡ P_2, if for any A ∈ F, P_1(A) = 0 implies P_2(A) = 0 and vice versa.
We usually assume F is generated by the paths of the process {Z(s), s ∈ D}. When the process is stationary, many criteria based on spectral densities have been developed to determine the equivalence of probability measures [see, e.g., Ibragimov (1978), Yadrenko (1983) and Du (2009a)].

Theorem 1. (Yadrenko (1983)) Let P_i, i = 1, 2, be two probability measures such that under P_i, the process {Z(s), s ∈ R^d} is stationary Gaussian with mean 0 and second-order spectral density f_i(λ), λ ∈ R^d. If, for some θ > d, f_1(λ)|λ|^θ is bounded away from 0 and ∞ as |λ| → ∞, and for some finite c,

∫_{|λ|>c} { (f_2(λ) − f_1(λ)) / f_1(λ) }^2 dλ < ∞, (1.2)

then P_1 ≡ P_2 on the paths of Z(s), s ∈ D, for any bounded subset D ⊂ R^d.

The integrability in (1.2) is determined by the tails of the spectral densities. For example, if the f_i(λ) are isotropic, i.e., depend only on |λ|, then (1.2) holds when there exists some ϵ > 0 such that

f_1(λ)/f_2(λ) − 1 = O(|λ|^{−(d/2+ϵ)}) as |λ| → ∞. (1.3)

This implies that the equivalence of probability measures can be verified from the decay rates of the spectral densities. Many applications of the equivalence of measures have been explored to reduce the computational burden, such as the tapering method. Let l_n(θ) be the log likelihood of the observed data:

l_n(θ) = −(n/2) log(2π) − (1/2) log[det V_n] − (1/2) X_n' V_n^{−1} X_n, (1.4)

where n is the sample size, X_n is the data vector and V_n is the covariance matrix. The computational cost of obtaining the maximum likelihood estimator (MLE) can be expensive. To reduce the computational burden, a tapering method on the covariance function can be used:

Ṽ(l, θ) = V(l, θ) ◦ V_tap(l),

where V(l, θ) is the covariance function of the underlying process that depends on the parameter θ (possibly a set of parameters), V_tap(l) is the taper, a known positive function that is 0 beyond a threshold distance, and "◦" is the Schur or Hadamard product.
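As a small aside, the Schur-product construction just described can be sketched in a few lines. This is only an illustration: the exponential covariance and the spherical taper below are common choices used here for concreteness, not models prescribed by this dissertation, and the grid and range values are arbitrary.

```python
import math

def exp_cov(h, sigma2=1.0, alpha=1.0):
    # exponential covariance C(h) = sigma^2 * exp(-alpha * |h|) (illustrative choice)
    return sigma2 * math.exp(-alpha * abs(h))

def spherical_taper(h, r=0.5):
    # a valid taper: nonnegative, equal to 1 at 0, exactly 0 beyond the range r
    x = abs(h) / r
    return (1 - 1.5 * x + 0.5 * x ** 3) if x < 1 else 0.0

# locations on a fixed (bounded) domain [0, 1]
locs = [i / 10 for i in range(11)]
V = [[exp_cov(s - t) for t in locs] for s in locs]

# Schur (elementwise) product with the taper: V_tilde = V o V_tap
V_tilde = [[V[i][j] * spherical_taper(locs[i] - locs[j])
            for j in range(len(locs))] for i in range(len(locs))]

# entries beyond the taper range are exactly zero, giving a sparse matrix
print(V_tilde[0][10])  # distance 1.0 > r = 0.5 -> 0.0
```

The point of the construction is visible in the last line: every covariance entry whose lag exceeds the taper range is exactly zero, which is what makes the tapered likelihood cheaper to evaluate.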
By replacing V(l, θ) with Ṽ(l, θ), the tapered likelihood is obtained as

l_{n,tap}(θ) = −(n/2) log(2π) − (1/2) log[det Ṽ_n(l, θ)] − (1/2) X_n' Ṽ_n(l, θ)^{−1} X_n. (1.5)

The consistency of the estimator based on l_{n,tap}(θ) holds if the probability measure under Ṽ(l, θ) is equivalent to the one under V(l, θ) [see Zhang (2004)]. More theoretical discussion of the tapering method can be found in Chapter 3 of Du (2009a).

1.2.2 Prediction under fixed-domain asymptotics

Another motivation for studying the tail behavior of the spectral density comes from its role in prediction. In spatial statistics, the best linear unbiased prediction is called kriging. Let Z(s) be a mean zero stationary process, and suppose the data are sampled at locations {s_1, s_2, s_3, . . .} which are dense in a bounded region D ⊆ R^d; that is, infill sampling is used. Further, let s* be a new location that we would like to explore. Let Ẑ(s*, n) be the best linear unbiased prediction of Z(s*) based on the data Z(s_1), Z(s_2), . . . , Z(s_n), and let e(s*, n) be the error between Z(s*) and Ẑ(s*, n). The following theorem compares the prediction performance under a correct measure P_1 and a misspecified measure P_2.

Theorem 2. (Stein (1999), p. 252) Let Z(s) be a mean zero stationary Gaussian random field under probability measure P_i with spectral density f_i, for i = 1, 2. If there exists some ρ > 0 such that f_1(λ)|λ|^ρ is bounded away from 0 and ∞, and f_2(λ)/f_1(λ) → 1 as |λ| → ∞, then

lim_{n→∞} E_1( e_2(s*, n) − e_1(s*, n) )^2 / E_1( e_1(s*, n) )^2 = 0,

lim_{n→∞} E_2( e_2(s*, n) )^2 / E_1( e_2(s*, n) )^2 = 1, (1.6)

where E_i(·) and e_i(·) are the expectation and prediction error under probability measure P_i, for i = 1, 2.

The above result means that, no matter which of the two probability measures is used, prediction performance is asymptotically equivalent under fixed-domain sampling if the tail behavior of f_2 matches that of f_1.
Thus, understanding the tail behavior of the spectral density is of great importance in spatial statistics.

In this dissertation, we introduce two approaches to estimate the parameters that control the tail behavior of the spectral density, that is, c and θ in (1.1). One of the proposed estimators is obtained by minimizing an objective function that can be viewed as a weighted Whittle likelihood, in which Fourier frequencies near a pre-specified non-zero frequency are considered. This approach is similar to the local Whittle likelihood method introduced by Robinson (1995) for estimating a long-range dependence parameter in time series analysis. For a stationary lattice process, Robinson (1995) proposed to estimate a long-range dependence parameter by minimizing the Whittle likelihood over Fourier frequencies near zero, since the long-range dependence parameter is controlled by the behavior of the spectral density near zero. In contrast, we are interested in estimating parameters that govern the spectral density of a random field when the frequency is very large, so we need to focus on Fourier frequencies that are away from zero.

In our work, we establish consistency and asymptotic normality of the estimators of c and of θ, respectively, when the other parameter is known. Some properties are also discussed when both parameters are unknown. In particular, if the Matérn covariance model is considered, c is related to a microergodic parameter. Consider the Matérn spectral density given by

f(λ) = σ^2 α^{2ν} / ( π^{d/2} (α^2 + |λ|^2)^{ν+d/2} ), λ ∈ R^d. (1.7)

The Matérn spectral density has three parameters (σ^2, α, ν), where σ^2 is the variance parameter, α is the scale parameter and ν is the smoothness parameter. Since the Matérn spectral density satisfies

f(λ) ∼ (σ^2 α^{2ν} / π^{d/2}) |λ|^{−(2ν+d)} as |λ| → ∞,

we have c ≡ σ^2 α^{2ν}/π^{d/2} and θ ≡ 2ν + d, and σ^2 α^{2ν} is a microergodic parameter. Thus, estimating σ^2 α^{2ν} when ν is known is equivalent to estimating c when θ is known.
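The Matérn-to-(c, θ) mapping above can be checked numerically. The sketch below uses arbitrary illustrative parameter values (σ² = 2, α = 1.5, ν = 0.5, d = 1) and verifies that f(λ)|λ|^θ approaches c = σ²α^{2ν}/π^{d/2} as |λ| grows:

```python
import math

def matern_sd(lam, sigma2, alpha, nu, d=1):
    # Matérn spectral density as parameterized in (1.7)
    return sigma2 * alpha**(2 * nu) / (math.pi**(d / 2) * (alpha**2 + lam**2)**(nu + d / 2))

sigma2, alpha, nu, d = 2.0, 1.5, 0.5, 1
c = sigma2 * alpha**(2 * nu) / math.pi**(d / 2)   # tail constant c = sigma^2 alpha^{2 nu} / pi^{d/2}
theta = 2 * nu + d                                # tail exponent theta = 2 nu + d

# f(lam) * |lam|^theta / c should approach 1 as |lam| grows
for lam in (10.0, 100.0, 1000.0):
    print(lam, matern_sd(lam, sigma2, alpha, nu, d) * lam**theta / c)
```

The printed ratios approach 1 at the rate (1 + α²/λ²)^{−(ν+d/2)}, confirming that the tail of (1.7) is exactly of the power-law form (1.1).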
There are several references that investigate estimation of σ^2 α^{2ν} in the spatial domain. Zhang (2004) showed that σ^2 and α can be estimated only in the form σ^2 α^{2ν} under fixed-domain asymptotics when ν is known and d ≤ 3. Du et al. (2009) investigated asymptotic properties of the MLE and the tapered MLE of σ^2 α^{2ν} when ν is known, α is fixed and d = 1 for a stationary Gaussian process. Anderes (2010) proposed an increment-based estimator of σ^2 α^{2ν} for a geometric anisotropic Matérn covariance function and showed that α can be estimated separately when d > 4.

The parameter θ is related to the fractal index (or fractal dimension) when {Z(s), s ∈ R^d} is a stationary isotropic Gaussian process. For example, for a stationary Gaussian random field on R^d, suppose that its covariance function C(t) satisfies

C(t) ∼ C(0) − k|t|^α as |t| → 0 (1.8)

for some k and 0 < α ≤ 2. In this case, α is the fractal index that governs the roughness of the sample paths of the process, and the fractal dimension D becomes D = d + (1 − α/2). This follows from Theorem 5.1 in Xue and Xiao (2010). When α = 2 in (1.8), the sample functions may be differentiable; this can be determined by the smoothness of C(t) in terms of the spectral measure of {Z(s), s ∈ R^d}. Further information can be found in Adler and Taylor (2007) and Xue and Xiao (2010). By an Abelian-type theorem, when (1.8) holds, the corresponding spectral density satisfies

f(λ) ∼ k′ |λ|^{−(α+d)} as |λ| → ∞,

so that θ ≡ α + d in our setting.

The rest of this dissertation is organized in the following manner. In Chapter 2, we explain our settings and assumptions, extend the results in Stein (1995) and Lim and Stein (2008) to a more relaxed condition, and then introduce our estimators and state theorems on the asymptotic properties of the proposed estimators. A simulation study is presented in Chapter 3. In Chapter 4, we discuss some issues related to our approach and possible extensions of the current work.
In the final chapter, we give proofs of our theoretical results.

Chapter 2

Main Results

2.1 Preliminary

In this work, we consider a stationary Gaussian random field Z(s) on R^d with spectral density f(λ) that satisfies (1.1). Define a lattice process Y_ϕ(J) by Y_ϕ(J) ≡ Z(ϕJ), where J ∈ Z^d, the set of d-dimensional integer-valued vectors. The corresponding spectral density of Y_ϕ(J) is

f̄_ϕ(λ) = ϕ^{−d} ∑_{Q ∈ Z^d} f( (λ + 2πQ)/ϕ ), for λ ∈ (−π, π]^d.

Typically, f̄_ϕ(λ) has a peak near the origin which gets higher as ϕ → 0. This causes a problem when estimating the spectral density using the periodogram [Stein (1995)]. To alleviate the problem, we difference the data with a discrete Laplacian operator, as proposed by Stein (1995). The Laplacian operator is defined by

∆_ϕ Z(s) = ∑_{j=1}^d { Z(s + ϕe_j) − 2Z(s) + Z(s − ϕe_j) },

where e_j is the unit vector whose jth entry is 1. Depending on the behavior of the spectral density at high frequencies, we can apply the Laplacian operator iteratively to control the peak near the origin. Define Y_ϕ^τ(J) ≡ (∆_ϕ)^τ Z(ϕJ) as the lattice process obtained by applying the Laplacian operator τ times. Then, as shown by Stein (1995), its corresponding spectral density becomes

f̄_ϕ^τ(λ) = { ∑_{j=1}^d 4 sin^2(λ_j/2) }^{2τ} f̄_ϕ(λ). (2.1)

Under condition (1.1), the limit of f̄_ϕ^τ(λ) as ϕ → 0, after scaling by ϕ^{d−θ}, is

ϕ^{d−θ} f̄_ϕ^τ(λ) → c { ∑_{j=1}^d 4 sin^2(λ_j/2) }^{2τ} ∑_{Q ∈ Z^d} |λ + 2πQ|^{−θ} for λ ≠ 0.

Define

g_{c,θ}(λ) = c { ∑_{j=1}^d 4 sin^2(λ_j/2) }^{2τ} ∑_{Q ∈ Z^d} |λ + 2πQ|^{−θ} for λ ∈ (−π, π]^d \ {0}, and g_{c,θ}(0) = 0. (2.2)

The limit function g_{c,θ}(λ) is integrable when τ is chosen such that 4τ − θ > −d. When d = 1, simple differencing is preferable, as discussed in Stein (1995); then 4τ is replaced with 2τ in our results. Now suppose that Z(s) is observed on the lattice ϕJ.
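For concreteness, the limit function g_{c,θ} in (2.2) can be evaluated numerically by truncating the sum over Q. The sketch below takes d = 1 with illustrative choices τ = 1, θ = 2 and a truncation level qmax; none of these values are prescribed by the dissertation.

```python
import math

def g(lam, c, theta, tau=1, qmax=200):
    # g_{c,theta}(lambda) in (2.2) for d = 1, with the sum over Q truncated at |Q| <= qmax
    if lam == 0.0:
        return 0.0
    factor = (4 * math.sin(lam / 2) ** 2) ** (2 * tau)
    tail = sum(abs(lam + 2 * math.pi * q) ** (-theta) for q in range(-qmax, qmax + 1))
    return c * factor * tail

# integrability requires 4*tau - theta > -d; here 4*1 - 2 > -1 holds
# (for theta = 2, tau = 1, the exact value at lambda = pi/2 works out to 2*c)
print(g(math.pi / 2, c=1.0, theta=2.0))
```

The truncation error is O(qmax^{1−θ}), so a few hundred terms suffice for the moderate θ used here; g is also linear in c, which is what makes the closed-form estimator of c in Chapter 2 possible.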
More specifically, we assume that we observe Y_ϕ^τ(J) at J ∈ T_m = {1, . . . , m}^d after differencing Z(s) with the Laplacian operator τ times. We further assume that ϕ = m^{−1}, so that the number of observations increases within a bounded observation domain. The spectral density of Y_ϕ^τ(J) can be estimated by a periodogram, which is defined using a discrete Fourier transform of the data. That is, the periodogram is defined by

I_m^τ(λ) = (2πm)^{−d} |D(λ)|^2,

where D(λ) is the discrete Fourier transform of the data given by

D(λ) = ∑_{J ∈ T_m} Y_ϕ^τ(J) exp{−i λ^T J}.

We consider the periodogram only at the Fourier frequencies 2πm^{−1}J for J ∈ T_m = {−⌊(m − 1)/2⌋, · · · , m − ⌊m/2⌋}^d, where ⌊x⌋ is the largest integer not greater than x. A smoothed periodogram at Fourier frequencies is defined by

Î_m^τ(2πJ/m) = ∑_{K ∈ T_m} W_h(K) I_m^τ( 2π(J + K)/m ),

with weights W_h(K) given by

W_h(K) = Λ_h(2πK/m) / ∑_{L ∈ T_m} Λ_h(2πL/m), (2.3)

where Λ_h(s) = (1/h) Λ(s/h) I_{{||s|| ≤ h}} for a symmetric continuous function Λ on R^d that satisfies Λ(s) ≥ 0 and Λ(0) > 0, and I_A is the indicator function of the set A. The norm || · || is defined by ||s|| = max{|s_1|, |s_2|, . . . , |s_d|}. For positive functions a and b, a(λ) ≍ b(λ) for λ ∈ A means that there exist constants C_1 and C_2 such that 0 < C_1 ≤ a(λ)/b(λ) ≤ C_2 < ∞ for all possible λ ∈ A. For the asymptotic results in this dissertation, we consider the following assumption on the spectral density f(λ).

Assumption 1. The spectral density f(λ) of a stationary Gaussian random field {Z(s), s ∈ R^d} satisfies:

A. f(λ) ∼ c |λ|^{−θ} as |λ| → ∞;

B. f(λ) is twice differentiable and there exists a positive constant C such that for |λ| > C,

f(λ) ≍ (1 + |λ|)^{−θ}, (∂/∂λ_j) f(λ) ≍ (1 + |λ|)^{−(θ+1)}, and (∂²/∂λ_j ∂λ_k) f(λ) ≍ (1 + |λ|)^{−(θ+2)} (2.4)

for j, k = 1, . . . , d.

2.2 Asymptotic properties of a smoothed periodogram

Asymptotic properties of a spatial periodogram and a smoothed spatial periodogram under fixed-domain asymptotics were investigated by Stein (1995) and Lim and Stein (2008).
They assume that the spectral density f is twice differentiable and satisfies (2.4) for all λ ∈ R^d. This assumption says that the spectral density f(λ) behaves like (1 + |λ|)^{−θ} for all λ, which is a much stronger condition than (1.1). However, this condition allows one to find asymptotic bounds on the expectation, variance and covariance of a spatial periodogram at the Fourier frequency 2πJ/m for each m ≠ 0 and J such that ∥J∥ ≠ 0. Consistency and asymptotic normality of a smoothed spatial periodogram at the Fourier frequency 2πJ/m, however, are shown when lim_{m→∞} 2πJ/m = µ ≠ 0; that is, J should not be close to zero asymptotically. Since we make use of asymptotic properties of a smoothed spatial periodogram at such Fourier frequencies under the more general Assumption 1, we extend some of the results in Stein (1995) and Lim and Stein (2008) under Assumption 1. We focus only on a smoothed spatial periodogram in the following theorem, but results for a smoothed spatial cross-periodogram can be shown similarly. Throughout the dissertation, →p denotes convergence in probability and →d denotes convergence in distribution.

Theorem 3. Suppose that the spectral density f of a stationary Gaussian random field Z(s) on R^d satisfies Assumption 1. Also suppose that 4τ > θ − 1 and h = Cm^{−γ} for some C > 0, where γ satisfies max{(d − 2)/d, 0} < γ < 1. Further, assume that lim_{m→∞} 2πJ/m = µ and 0 < ∥µ∥ < π. Then we have

Î_m^τ(2πJ/m) / f̄_ϕ^τ(2πJ/m) →p 1 (2.5)

and

m^η ( m^{−(d−θ)} Î_m^τ(2πJ/m) − g_{c,θ}(µ) ) →d N( 0, (Λ_2/Λ_1^2) (2π/C)^d g_{c,θ}^2(µ) ), (2.6)

where η = d(1 − γ)/2 and Λ_r = ∫_{[−1,1]^d} Λ^r(s) ds.

Remark 1. The function g_{c,θ} is integrable under 4τ > θ − d, which is implied by the condition 4τ > θ − 1. The condition 4τ > θ − 1 is necessary to show that E( Î_m^τ(2πJ/m) ) / f̄_ϕ^τ(2πJ/m) → 1, and the condition max{(d − 2)/d, 0} < γ < 1 is needed to show that Var( Î_m^τ(2πJ/m) / f̄_ϕ^τ(2πJ/m) ) → 0, so that (2.5) can be shown.
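To fix ideas, the pipeline leading to the smoothed periodogram of Theorem 3 can be sketched end to end for d = 1: difference the lattice data with the discrete Laplacian, compute the periodogram I_m^τ by a direct DFT, and average it with kernel weights. The white-noise input and the constant (uniform) kernel below are placeholders for illustration; a real study would simulate Z from a target spectral density.

```python
import cmath
import math
import random

m, tau = 64, 1
phi = 1.0 / m
# toy observations Z(phi * j) on a grid over a bounded domain (synthetic stand-in)
random.seed(0)
Z = [random.gauss(0.0, 1.0) for _ in range(m + 2 * tau)]

# apply the discrete Laplacian tau times: (Delta Y)(j) = Y(j+1) - 2 Y(j) + Y(j-1)
Y = Z[:]
for _ in range(tau):
    Y = [Y[j + 1] - 2 * Y[j] + Y[j - 1] for j in range(1, len(Y) - 1)]

def periodogram(Y, lam):
    # I_m^tau(lambda) = (2 pi m)^{-d} |D(lambda)|^2, D the discrete Fourier transform
    D = sum(y * cmath.exp(-1j * lam * (j + 1)) for j, y in enumerate(Y))
    return abs(D) ** 2 / (2 * math.pi * m)

def smoothed(Y, J, h):
    # smoothed periodogram at the Fourier frequency 2 pi J / m; a constant kernel
    # Lambda gives equal weights W_h(K) over {K : |2 pi K / m| <= h}
    ks = [k for k in range(-m // 2 + 1, m // 2 + 1) if abs(2 * math.pi * k / m) <= h]
    w = 1.0 / len(ks)
    return sum(w * periodogram(Y, 2 * math.pi * (J + k) / m) for k in ks)

print(smoothed(Y, J=m // 4, h=0.3))
```

With h = Cm^{−γ}, the window shrinks as m grows while the number of Fourier frequencies inside it still increases, which is the bandwidth regime used throughout Theorem 3.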
2.3 Approach I

To estimate the parameters c and θ, we consider the following objective function to be minimized:

L(c, θ) = ∑_{K ∈ T_m} W_h(K) { log( m^{d−θ} g_{c,θ}(2π(J + K)/m) ) + I_m^τ(2π(J + K)/m) / ( m^{d−θ} g_{c,θ}(2π(J + K)/m) ) }, (2.7)

where W_h(K) is given in (2.3). In L(c, θ), 2πJ/m is any given Fourier frequency that satisfies ∥J∥ ≍ m, so that 2πJ/m is away from 0. L(c, θ) can be viewed as a weighted Whittle likelihood function. When Λ is a nonzero constant function, W_h(K) ≡ 1/|K| for K ∈ K, where K = {K ∈ T_m : ||2πK/m|| ≤ h} and |K| is the number of elements in the set K. Then L(c, θ) is of the form of a local Whittle likelihood for the lattice data {Y_ϕ^τ(J), J ∈ T_m} in which the true spectral density is replaced with m^{d−θ} g_{c,θ}. Note that g_{c,θ}(λ) is the limit of the spectral density of Y_ϕ^τ(J), after being scaled by m^{−(d−θ)}, for non-zero λ when ϕ = m^{−1}. The summation in L(c, θ) is over the Fourier frequencies near 2πJ/m, by letting h → 0 as m → ∞. While a local Whittle likelihood method for estimating a long-range dependence parameter in time series considers Fourier frequencies near zero, we consider Fourier frequencies near a pre-specified non-zero frequency. For example, by choosing J such that ⌊2πJ/m⌋ = (π/2) 1_d, where 1_d is the d-dimensional vector of ones, L(c, θ) considers frequencies only near (π/2)1_d.

2.3.1 Estimation of c under the known θ

We consider the estimator of c obtained by minimizing L(c, θ) when θ is known. Thus, the proposed estimator of c when θ is known to be θ_0 is given by

ĉ = arg min_{c ∈ C} L(c, θ_0),

where C is the parameter space of c. ĉ has an explicit expression obtained by solving ∂L(c, θ_0)/∂c = 0:

ĉ = ∑_{K ∈ T_m} W_h(K) I_m^τ(2π(J + K)/m) / ( m^{d−θ_0} g_0(2π(J + K)/m) ), (2.8)

where g_0 ≡ g_{1,θ_0}. The following theorem establishes the consistency and asymptotic normality of the estimator ĉ.

Theorem 4. Suppose that the spectral density f of a stationary Gaussian random field Z(s) on R^d satisfies Assumption 1.
Also suppose that 4τ > θ_0 − 1 for a known θ_0 and h = Cm^{−γ} for some C > 0, where γ satisfies d/(d + 2) < γ < 1. Further, assume that J satisfies ⌊2πJ/m⌋ = (π/2) 1_d and that the true parameter c is in the interior of the parameter space C, which is a closed interval. Then, for ĉ given in (2.8), we have

ĉ →p c, (2.9)

and

m^η (ĉ − c) →d N( 0, c^2 (Λ_2/Λ_1^2) (2π/C)^d ), (2.10)

where Λ_r = ∫_{[−1,1]^d} Λ^r(s) ds and η = d(1 − γ)/2.

Remark 2. Theorem 4 can also be proved when we replace θ_0 in (2.8) with a consistent estimator θ̂, as long as the estimator θ̂ satisfies θ̂ − θ_0 = o_p((log(m))^{−1}).

Remark 3. We can prove Theorem 4 for J such that lim_{m→∞} 2πJ/m = µ and 0 < ∥µ∥ < π instead of the specific choice ⌊2πJ/m⌋ = (π/2)1_d, which we choose for simplicity in the proof.

When we choose Λ as a constant function and C = (1/2)π^2, we have

m^η (ĉ − c) →d N( 0, 2^d c^2 π^{−d} ).

For the Matérn spectral density given in (1.7) with d = 1, Du et al. (2009) showed that for any fixed α_1 with known ν, the maximum likelihood estimator of σ^2 satisfies

n^{1/2} ( σ̂^2 α_1^{2ν} − σ_0^2 α_0^{2ν} ) →d N( 0, 2(σ_0^2 α_0^{2ν})^2 ), (2.11)

where n is the sample size, and σ_0^2 and α_0 are the true parameters. Note that m is the sample size of Y_ϕ^τ, the τ-times differenced lattice process of Z(s). Since π^{1/2} c = σ^2 α^{2ν} for d = 1, we have the same asymptotic variance as in (2.11). However, our approach has a slower convergence rate, since η < 1/3 when d = 1, as we use only partial information. This is also the case for the local Whittle likelihood method in Robinson (1995).

2.3.2 Estimation of θ under the known c

To estimate θ, we assume that c is known to be c_0. The proposed estimator of θ is then given by

θ̂ = arg min_{θ ∈ Θ} L(c_0, θ), (2.12)

where Θ is the parameter space of θ. The consistency and the convergence rate of the proposed estimator θ̂ are given in the following theorem.

Theorem 5.
Suppose that the spectral density f of a stationary Gaussian random field Z(s) on R^d satisfies Assumption 1. Also suppose that 4τ > θ − 1 and h = Cm^{−γ} for some C > 0, where γ satisfies d/(d + 2) < γ < 1. Further, assume that J satisfies ⌊2πJ/m⌋ = (π/2) 1_d and that the true parameter θ is in the interior of the parameter space Θ, which is a closed interval. Then, for θ̂ given in (2.12), we have

θ̂ →p θ. (2.13)

In addition,

θ̂ − θ = o_p((log m)^{−1}). (2.14)

Remark 4. The consistency of θ̂ is not enough to determine the asymptotic distribution of θ̂, since θ appears in the exponent of m in the expression of L(c, θ). For the proof of the asymptotic distribution, we need the rate of convergence given in (2.14).

From Theorem 5, we can now show the following theorem for the asymptotic distribution of θ̂.

Theorem 6. Under the conditions of Theorem 5, we have

log(m) m^η (θ̂ − θ) →d N( 0, (Λ_2/Λ_1^2) (2π/C)^d ),

where η = d(1 − γ)/2.

Remark 5. Note that we have a different convergence rate for θ̂ compared to the convergence rate for ĉ given in Theorem 4. The additional log(m) term comes from the fact that θ appears in the exponent of m in the expression of L(c, θ).

2.3.3 Estimation under unknown c and θ

In the previous discussion, we considered the estimation of one parameter when the other parameter is known. In practice, however, both may be unknown. To handle this situation, c is assigned an arbitrary fixed value c*. The estimator of θ is then defined by

θ̂ = arg min_{θ ∈ Θ} L(c*, θ). (2.15)

Theorem 7. Suppose that the spectral density f of a stationary Gaussian random field Z(s) on R^d satisfies Assumption 1. Also suppose that 4τ > θ − 1 and h = Cm^{−γ} for some C > 0, where γ satisfies d/(d + 2) < γ < 1. Further, assume that J satisfies ⌊2πJ/m⌋ = (π/2) 1_d and that the true parameter θ is in the interior of the parameter space Θ, which is a closed interval. Then, for θ̂ given in (2.15), we have

θ̂ →p θ. (2.16)

Furthermore,

θ̂ − θ = O_p((log m)^{−1}).
(2.17)

In contrast to Theorem 5, the convergence rate of θ̂ here is slower. With this convergence rate, we cannot establish the asymptotic distribution of θ̂. We could also consider estimating c0 by minimizing L(θ̂, c), with θ̂ defined in (2.15); that is,

ĉ = Σ_{K∈T_m} W_h(K) I_m^τ(2π(J+K)/m) / ( m^{d−θ̂} g_{θ̂}(2π(J+K)/m) ),   (2.18)

where θ̂ is the estimate of θ given in (2.15) with the fixed c*. However, the consistency of ĉ is not guaranteed. Instead, we obtain the following result, which can be easily derived from Corollary 1:

ĉ − c0 = O_p(1).

2.4 Approach II

In Section 2.3, we developed a local Whittle type estimator which uses Fourier frequency information around 2πJ/m = (π/2)1_d. As the sample size increases, however, the Fourier frequencies used in the estimator become very close to 2πJ/m = (π/2)1_d. Thus, we could use g_{c,θ}(·) only at [2πJ/m]. In this section, we present another estimation methodology, which directly uses the smoothed periodogram at a fixed frequency. The alternative estimator is obtained by minimizing

R(c, θ) = log( m^{d−θ} g_{c,θ}(2πJ/m) ) + Î_m^τ(2πJ/m) / ( m^{d−θ} g_{c,θ}(2πJ/m) ).   (2.19)

Asymptotic properties are discussed in the rest of this section, organized as in Section 2.3. Most theoretical results for the new estimators are identical to those obtained in Section 2.3, but require some changes in the proofs.

2.4.1 Estimation of c when θ is known

The estimator of c is obtained by minimizing R(c, θ) when θ is known. When θ is known to be θ0, the proposed estimator of c is

ĉ = arg min_{c∈C} R(c, θ0),

where C is the parameter space of c. As in Section 2.3, the explicit form of ĉ is obtained by solving ∂R(c, θ0)/∂c = 0:

ĉ = Î_m^τ(2πJ/m) / ( m^{d−θ0} g_0(2πJ/m) ),   (2.20)

where g_0 ≡ g_{1,θ0}. The same consistency and asymptotic results as in Section 2.3 hold for this estimator.

Theorem 8.
Suppose that the spectral density f of a stationary Gaussian random field Z(s) on R^d satisfies Assumption 1. Also suppose that 4τ > θ0 − 1 for a known θ0, and h = Cm^{−γ} for some C > 0, where γ satisfies d/(d+2) < γ < 1. Further, assume that J satisfies ⌊2πJ/m⌋ = (π/2)1_d and that the true parameter c is in the interior of the parameter space C, which is a closed interval. Then, for ĉ given in (2.20), we have

ĉ →_p c,   (2.21)

and

m^η (ĉ − c) →_d N( 0, c² (Λ₂/Λ₁²) (2π/C)^d ),   (2.22)

where Λ_r = ∫_{[−1,1]^d} Λ^r(s) ds and η = d(1 − γ)/2.

2.4.2 Estimation of θ when c is known

Using (2.19), we can consider

θ̂ = arg min_{θ∈Θ} R(c0, θ),   (2.23)

where Θ is the parameter space of θ, when c is known to be c0. The following theorem gives the consistency and the convergence rate of the new estimator θ̂ defined in (2.23).

Theorem 9. Suppose that the spectral density f of a stationary Gaussian random field Z(s) on R^d satisfies Assumption 1. Also suppose that 4τ > θ − 1 and h = Cm^{−γ} for some C > 0, where γ satisfies d/(d+2) < γ < 1. Further, assume that J satisfies ⌊2πJ/m⌋ = (π/2)1_d and that the true parameter θ is in the interior of the parameter space Θ, which is a closed interval. Then, for θ̂ given in (2.23), we have

θ̂ →_p θ.   (2.24)

In addition,

θ̂ − θ = o_p((log m)^{−1}).   (2.25)

Remark 6. As in Section 2.3, the rate of convergence given in (2.25) is useful for studying the asymptotic properties of θ̂; the same result as in Section 2.3 can be shown.

Theorem 10. Under the conditions of Theorem 9, we have

log(m) m^η (θ̂ − θ) →_d N( 0, (Λ₂/Λ₁²) (2π/C)^d ),

where η = d(1 − γ)/2.

2.4.3 Estimation when both θ and c are unknown

In this subsection, we again consider the situation when both parameters are unknown. With a given c*, which may differ from the true value c0, the estimator of θ is

θ̂ = arg min_{θ∈Θ} R(c*, θ).   (2.26)

Then, we have the following results, similar to those in Section 2.3.3.

Theorem 11.
Suppose that the spectral density f of a stationary Gaussian random field Z(s) on R^d satisfies Assumption 1. Also suppose that 4τ > θ0 − 1 for a known θ0, and h = Cm^{−γ} for some C > 0, where γ satisfies d/(d+2) < γ < 1. Further, assume that J satisfies ⌊2πJ/m⌋ = (π/2)1_d and that the true parameter θ is in the interior of the parameter space Θ, which is a closed interval. Then, for θ̂ given in (2.26), we have

θ̂ →_p θ,   (2.27)

and

θ̂ − θ = O_p((log m)^{−1}).   (2.28)

Moreover, if

ĉ = Î_m^τ(2πJ/m) / ( m^{d−θ̂} g_{θ̂}(2πJ/m) )   (2.29)

is viewed as an estimator of the true value c0, we can show ĉ − c0 = O_p(1).

Depending on the value of c*, overestimation or underestimation of θ̂ relative to the true value θ0 occurs, as the following result shows.

Theorem 12. (i) When c* < c0, there exists M such that P(θ0 ≤ θ̂) = 0 for m > M. (ii) When c* > c0, there exists M such that P(θ0 ≥ θ̂) = 0 for m > M.

Remark 7. The overestimation and underestimation properties for the first approach are also observed in the simulation study. However, the theoretical results would be more complicated than for the second approach, because the effects of B_m and C_m would need to be examined.

Chapter 3

Simulation Study

In this chapter, simulation studies with various models are presented to validate the asymptotic results obtained in Chapter 2. Although the estimators constructed in Chapter 2 work in higher dimensions, the one-dimensional Matérn covariance model with various parameter values is considered here.

Let Z(s) be a stationary Gaussian process on R with a Matérn covariance function whose spectral density is (see, e.g., Stein 1999, p. 31)

f(λ) = σ² (α² + λ²)^{−ν−1/2}.   (3.1)

Data are generated using the Matlab routine "mvnrnd" with covariances following (3.1). We consider the region D = [0, 10] with grid sizes ϕ = 0.1, 0.05 and 0.025, corresponding to m = 100, 200 and 400. For each case, 500 data sets are simulated, giving 500 parameter estimates.
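As a concrete illustration of this setup, the following sketch simulates the first example of this chapter (ν = 1/2, where the Matérn covariance reduces to a scaled exponential) on a grid over D = [0, 10]. This is our own minimal Python analogue of the Matlab-based simulation; the normalizing constant π σ²/α comes from Fourier-inverting f(λ) = σ²(α² + λ²)^{−1}, and the function names are illustrative, not the dissertation's.

```python
import numpy as np

rng = np.random.default_rng(0)

def matern_half_cov(h, sigma2=1 / np.pi, alpha=1.0):
    # For nu = 1/2, f(lambda) = sigma2 (alpha^2 + lambda^2)^{-1}
    # Fourier-inverts to C(h) = (pi sigma2 / alpha) exp(-alpha |h|).
    return (np.pi * sigma2 / alpha) * np.exp(-alpha * np.abs(h))

m = 100                    # grid size, corresponding to phi = 0.1
grid = 0.1 * np.arange(m)  # observation sites in D = [0, 10]
cov = matern_half_cov(grid[:, None] - grid[None, :])
z = rng.multivariate_normal(np.zeros(m), cov)  # one simulated path
```

With σ² = 1/π and α = 1 the marginal variance is C(0) = 1, and (c, θ) = (1/π, 2) as in the first example below.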
To simplify computation, Λ is taken to be a constant function, so that W_h(K) is the same for each K ∈ K. The four-times finite difference operator (τ = 4) is applied to the simulated data, and C = 1 and γ = 1/3 are chosen for the bandwidth. The notation used in the tables is as follows: m is the sample size, |K| is the number of non-zero weights W_h(K), Bias is the average bias of the estimates, and STD is the standard deviation of the estimates.

In the first example, we consider (α, σ², ν) = (1, 1/π, 1/2). In this case, the true parameters (c, θ) are (1/π, 2). Tables 3.1 and 3.2 report the estimates of θ and c, respectively. The Bias columns of Tables 3.1 and 3.2 show that the errors between the estimates and the true values are less than 10^{−2}, and the STD columns show that the estimates are highly concentrated. Compared with the sample size m, the number of non-zero weights |K| under the present bandwidth setting is small; that is, only a small number of frequencies is used. A wider bandwidth is also considered by replacing C = 1 with C = 5; the results are shown in Table 3.3. Both Bias and STD are slightly improved under the wider bandwidth.

The second simulation example comes from (3.1) with (α, σ², ν) = (1, 1/π, 3/2), which implies (c, θ) = (1/π, 4). Under the same setting as in the previous example with C = 1, the Bias and STD in Tables 3.4 and 3.5 show similar results. Again, C = 5 is applied to widen the bandwidth, and the results are shown in Table 3.6. Although the STD improves, the Bias in Table 3.6 does not. Tables 3.3 and 3.6 suggest that the accuracy of estimation is affected by the choice of bandwidth; finding an optimal bandwidth is therefore important, and we leave this for future research.

Under the same simulation setting as Table 3.6, the second approach is also applied; the results are shown in Table 3.7.
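The constant-kernel computation described above can be mimicked in a few lines. This is only a schematic of the weighted periodogram average behind (2.8) in d = 1: `g0_vals` stands in for g₀ evaluated at the selected frequencies (the actual g₀ involves an aliased lattice sum), and all names and scalings here are our own illustrative choices.

```python
import numpy as np

def tau_difference(z, tau=4):
    # Apply the finite-difference operator tau times (tau = 4 above).
    for _ in range(tau):
        z = np.diff(z)
    return z

def weighted_c_hat(z, theta0, g0_vals, j_idx, half_width, tau=4):
    y = tau_difference(z, tau)
    m = len(y)
    # Periodogram of the differenced series at Fourier frequencies.
    periodogram = np.abs(np.fft.fft(y)) ** 2 / (2 * np.pi * m)
    # Frequencies 2*pi*(j_idx + k)/m for k = -half_width..half_width.
    ks = np.arange(j_idx - half_width, j_idx + half_width + 1)
    # Constant kernel: each of the |K| frequencies gets weight 1/|K|,
    # so the estimator reduces to an average of the scaled ratios
    # I(lambda_k) / (m^{d - theta0} g0(lambda_k)) with d = 1.
    return np.mean(periodogram[ks] / (m ** (1.0 - theta0) * g0_vals))
```

Plugging in the true g₀ values at frequencies near π/2 would reproduce the "known θ" column of the experiments; here the point is only the structure of the weighted average.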
Compared with Table 3.6, the performance of the second approach is similar to that of the first, which matches the theoretical results found above.

We next consider estimating θ when c is also unknown, in the two previous examples with true values (θ, c) = (2, 1/π) and (θ, c) = (4, 1/π). θ is estimated when c is fixed at 2, 1, 0.2 and 0.1. The simulation results for the two examples under the different values of c are shown in Tables 3.8 and 3.9, and the corresponding histograms appear in Figures 3.1 and 3.2. When the assumed c is larger than the true value, the Bias is positive and grows as c increases. In Figures 3.1 and 3.2, if the selected c is 1/π (the true value), the estimates are distributed around both sides of the true value of θ; when the selected c differs from 1/π, most estimates fall to the left or to the right of the true value. Moreover, in Figure 3.3, the estimates gradually move toward the true value as the sample size increases.

Table 3.1: Estimation of θ under known c
m     |K|   Wh(K)   Bias      STD
100    7    1/7     0.039     0.129
200   10    1/10    0.009     0.088
400   17    1/17    0.009     0.05

Table 3.2: Estimation of c under known θ
m     |K|   Wh(K)   Bias      STD
100    7    1/7     0.00072   0.12
200   10    1/10    0.0039    0.0945
400   17    1/17    0.0024    0.078

Table 3.3: Estimation of θ under known c
m     |K|   Wh(K)   Bias      STD
100   33    1/33    -0.0024   0.0618
200   52    1/52    0.004     0.038
400   83    1/83    0.002     0.0256

Table 3.4: Estimation of θ under known c
m     |K|   Wh(K)   Bias      STD
100    7    1/7     0.032     0.138
200   10    1/10    0.02      0.094
400   17    1/17    0.011     0.058

Table 3.5: Estimation of c under known θ
m     |K|   Wh(K)   Bias      STD
100    7    1/7     0.014     0.132
200   10    1/10    0.003     0.094
400   17    1/17    -0.003    0.077

Table 3.6: Estimation of θ under known c
m     |K|   Wh(K)   Bias      STD
100   33    1/33    0.04      0.066
200   52    1/52    0.031     0.042
400   83    1/83    -0.027    0.027

Table 3.7: Estimation of θ under known c (second approach)
m     |K|   Wh(K)   Bias      STD
100   33    1/33    0.004     0.077
200   52    1/52    0.026     0.047
400   83    1/83    -0.027    0.03

Table 3.8: Estimation of θ under unknown c for Example 1
c     |K|   Wh(K)   Bias      STD
2     52    1/52    0.4907    0.0421
1     52    1/52    0.2996    0.0419
1/π   52    1/52    0.004     0.0378
0.2   52    1/52    -0.1364   0.0417
0.1   52    1/52    -0.3180   0.0378

Table 3.9: Estimation of θ under unknown c for Example 2
c     |K|   Wh(K)   Bias      STD
2     52    1/52    0.5332    0.0418
1     52    1/52    0.2245    0.0415
1/π   52    1/52    0.031     0.042
0.2   52    1/52    -0.1309   0.0413
0.1   52    1/52    -0.3331   0.0415

[Figure 3.1: Histograms of the estimates of θ for Example 1 under c = 1/π, 1, 2, 0.2 and 0.1.]

[Figure 3.2: Histograms of the estimates of θ for Example 2 under the same values of c.]

[Figure 3.3: Histograms for Example 2 with grid sizes ϕ = 0.1, 0.05 and 0.025 under a wrong value of c.]

Chapter 4

Discussion

In this dissertation, we first extended the result of Lim and Stein (2008) under weaker assumptions. We then proposed two approaches to estimate c and θ, the parameters that govern the tail behavior of the spectral density of a stationary Gaussian random field on R^d. The proposed estimators are obtained by minimizing the objective functions given in (2.7) and (2.19). The first approach makes use of frequency information around 2πJ/m, while the second employs only the information at [2πJ/m] = (π/2)1_d. In terms of the proofs of the asymptotic results and the simulation comparisons, there is little difference between the two approaches.

As mentioned in Chapter 2, the objective function given in (2.7) is similar to the one used in the local Whittle likelihood method when the kernel function Λ in W_h(K) is constant. When we replace m^{d−θ} g_{c,θ} with f̄_ϕ^τ(λ) and remove W_h(K) in (2.7), it can be thought of as an approximation to the likelihood of Y_ϕ^τ(J).
This approximation, however, has not been verified under fixed-domain asymptotics. One might think that a similar technique could be applied to prove the validity of the Whittle approximation to the likelihood, since Y_ϕ^τ(J) is a lattice process. However, the spectral density f̄_ϕ^τ(λ) of Y_ϕ^τ(J) converges to zero, which requires a different approach; further investigation is needed.

The weights in (2.7) are controlled by the bandwidth h, which can be interpreted as the proportion of Fourier frequencies considered in the objective function. In our theorems, we assume h = Cm^{−γ} for some constant C. In the proofs, we make use of the properties of the smoothed spatial periodogram Î_m^τ. The simulation results also change with the bandwidth. Thus, one could seek the optimal bandwidth that minimizes the mean squared error of Î_m^τ. However, this requires explicit expressions for the bias and variance of Î_m^τ(λ), which calls for further investigation.

It would be more useful to estimate c and θ jointly, or to estimate θ when c is unknown. Due to the form of g_{c,θ}, proving the asymptotic properties of such estimators under fixed-domain asymptotics is challenging and requires different mathematical tools. Although some contributions, including theoretical results, have been made for the case in which both parameters are unknown, more work is still needed. In the current method, to estimate θ, c is fixed at a value c*, but the convergence rate of θ̂ may then be slower. To handle this problem, we believe that updating c* through θ̂ could be more reasonable, but how to update both estimators iteratively remains an open question.

Approaches based on the fractal index could be another way to study the tail behavior of the spectral density. By Abelian-type theorems, relationships between the tail of the spectral density and the behavior of the covariance function at the origin are known.
In this situation, methodologies for the fractal index may be useful, but the details have to be considered carefully. We also believe that our approaches should extend to processes with stationary increments. Finally, in our work, data are sampled on regular grid points; in practice, the irregularly-spaced case is more interesting. Several ideas developed for increasing-domain asymptotics may also be valid for fixed-domain asymptotics. We are also interested in extending our univariate approaches to the multivariate setting.

Chapter 5

Appendix

5.1 Properties of g_{c,θ}(λ)

Some properties of the function g_{c,θ}(λ) are discussed in this appendix. These properties are used in the proofs given in Appendix 5.2.1. Recall that

g_{c,θ}(λ) = c { Σ_{j=1}^d 4 sin²(λ_j/2) }^{2τ} Σ_{Q∈Z^d} |λ + 2πQ|^{−θ}.

For the function g_{c,θ}(λ), let ∇g be the gradient of g with respect to λ, and let ġ and g̈ denote the first and second derivatives of g_{c,θ}(λ) with respect to θ, respectively. That is, ∇g = (∂g/∂λ₁, ..., ∂g/∂λ_d), ġ = ∂g_{c,θ}(λ)/∂θ and g̈ = ∂²g_{c,θ}(λ)/∂θ².

We denote A_ρ = [−π, π]^d \ (−ρ, ρ)^d for a fixed ρ with 0 < ρ < 1. Since we assume in Chapter 2 that the parameter space Θ is a closed interval, let Θ = [θ_L, θ_U] with θ_L > d. Although Lemma 1 can be shown for any fixed ρ with 0 < ρ < 1, we further assume that ρ is small enough that all Fourier frequencies near (π/2)1_d considered in R(c, θ) are contained in A_ρ.

Lemma 1. The following properties hold for g_{c,θ}(λ). Let c > 0 be a fixed constant.

(a) There exist constants K_L and K_U such that for all (θ, λ) ∈ Θ × A_ρ,

0 < K_L ≤ g_{c,θ}(λ) ≤ K_U < ∞.   (5.1)

(b) For any θ₁, θ₂ ∈ Θ, there exist constants K_L and K_U such that for all λ ∈ A_ρ,

0 < K_L ≤ g_{c,θ₁}(λ)/g_{c,θ₂}(λ) ≤ K_U < ∞.   (5.2)

(c) ∇g, ġ, g̈, ġ/g and ∇(ġ/g) are uniformly bounded on Θ × A_ρ.

(d) g_{c,θ}(λ) is continuous on Θ × A_ρ.

Proof.
Since g_{c,θ}(λ) is linear in c, it is enough to consider g_{1,θ}(λ). First, we find upper and lower bounds for Σ_{Q∈Z^d} |λ + 2πQ|^{−θ}. For all (θ, λ) ∈ Θ × A_ρ, we have

Σ_{Q∈Z^d} |λ + 2πQ|^{−θ} ≥ π^{−θ_U} > 0

and

Σ_{Q∈Z^d} |λ + 2πQ|^{−θ} ≤ Σ_{Q∈Z^d\{0}} |λ + 2πQ|^{−θ_L} + ϵ^{−θ_U} ≤ (2π)^d ϵ^{d−θ_L}/(θ_L − d) + ϵ^{−θ_U},

where the last inequality follows from

Σ_{Q∈Z^d\{0}} |λ + 2πQ|^{−θ_L} ≤ ∫_{|y|≥1} |λ + 2πy|^{−θ_L} dy ≤ ∫_{|z|≥ϵ} (2π)^d |z|^{−θ_L} dz = ∫_{x≥ϵ} (2π)^d x^{d−1} x^{−θ_L} dx = (2π)^d ϵ^{d−θ_L}/(θ_L − d),   (5.3)

since θ_L > d. Thus, we have

0 < k_L ≤ Σ_{Q∈Z^d} |λ + 2πQ|^{−θ} ≤ k_U < ∞,   (5.4)

where k_L = π^{−θ_U} and k_U = (2π)^d ϵ^{d−θ_L}/(θ_L − d) + ϵ^{−θ_U}. Then (a) follows from (5.4), from

(4d sin²(ϵ/2))^{2τ} ≤ { Σ_{j=1}^d 4 sin²(λ_j/2) }^{2τ} ≤ (4d)^{2τ},

and by setting K_L ≡ c (4d sin²(ϵ/2))^{2τ} k_L and K_U ≡ c (4d)^{2τ} k_U.

(b) follows from observing that Σ_{Q∈Z^d} |λ + 2πQ|^{−θ} has lower and upper bounds that are uniform on Θ × A_ρ, as given in (5.4).

For (c), by the product rule we have

∂g/∂λ_i = c 4τ { Σ_{j=1}^d 4 sin²(λ_j/2) }^{2τ−1} sin(λ_i) Σ_{Q∈Z^d} |λ + 2πQ|^{−θ} − cθ { Σ_{j=1}^d 4 sin²(λ_j/2) }^{2τ} Σ_{Q∈Z^d} (λ_i + 2πQ_i) |λ + 2πQ|^{−θ−2},

so that |∂g/∂λ_i| ≤ K k_U for some constant K > 0 and k_U given in (5.4), which implies the uniform boundedness of ∇g on Θ × A_ρ.

For the uniform bounds of ġ and g̈, we first compute

ġ = −c { Σ_{j=1}^d 4 sin²(λ_j/2) }^{2τ} Σ_{Q∈Z^d} |λ + 2πQ|^{−θ} log|λ + 2πQ|,

g̈ = c { Σ_{j=1}^d 4 sin²(λ_j/2) }^{2τ} Σ_{Q∈Z^d} |λ + 2πQ|^{−θ} (log|λ + 2πQ|)².

Since, for a given β > 0, we can find x₀ and K such that |log x| ≤ K x^β for all x > x₀, we can show that there exist n₀, K₁ and K₂ satisfying

|ġ| ≤ K₁ + K₂ Σ_{Q∈Z^d, ∥Q∥≥n₀} |λ + 2πQ|^{−θ+β}

for some fixed β > 0. Choosing β = (θ_L − d)/2, we can show that

Σ_{Q∈Z^d, ∥Q∥≥n₀} |λ + 2πQ|^{−θ+β} < ∞

using an argument similar to the one for (5.3), which gives the uniform boundedness of ġ. Similarly, we can show the uniform boundedness of g̈.

The uniform boundedness of ġ/g follows from the uniform boundedness of ġ and (a).
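As a quick numerical companion to Lemma 1(a) in d = 1, one can evaluate g_{1,θ} by truncating the lattice sum; the truncation level and grid below are our own choices, and the summability of the discarded tail is exactly what the bound (5.3) guarantees.

```python
import numpy as np

def g_one(lam, theta, tau=4, n_trunc=500):
    # g_{1,theta}(lambda) in d = 1: (4 sin^2(lambda/2))^{2 tau}
    # times the lattice sum over Q, truncated at |Q| <= n_trunc.
    # The neglected tail is summable since theta > d = 1, by (5.3).
    q = np.arange(-n_trunc, n_trunc + 1)
    lattice = np.sum(np.abs(lam + 2 * np.pi * q) ** (-theta))
    return (4 * np.sin(lam / 2) ** 2) ** (2 * tau) * lattice

# Lemma 1(a): g stays bounded away from 0 and infinity on A_rho,
# here checked on a grid inside [0.5, pi] for theta = 2.
vals = [g_one(lam, 2.0) for lam in np.linspace(0.5, np.pi, 50)]
```

Evaluating the ratio g_one(λ, θ₁)/g_one(λ, θ₂) over the same grid likewise illustrates the uniform bounds in Lemma 1(b).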
To show the uniform boundedness of ∇(ġ/g), consider (with T(λ) = Σ_{Q∈Z^d} |λ + 2πQ|^{−θ})

∂(ġ/g)/∂λ_i = − [ Σ_{Q∈Z^d} (λ_i + 2πQ_i) |λ + 2πQ|^{−θ−2} (1 − θ log|λ + 2πQ|) ] / T(λ)
− θ [ Σ_{Q∈Z^d} (λ_i + 2πQ_i) |λ + 2πQ|^{−θ−2} ] [ Σ_{Q∈Z^d} |λ + 2πQ|^{−θ} log|λ + 2πQ| ] / T(λ)².

Since the denominators in the expression of ∂(ġ/g)/∂λ_i have uniform lower bounds as shown in (5.4), it is enough to find uniform bounds for the numerators. Observing that |λ_i + 2πQ_i| ≤ |λ + 2πQ| and |λ + 2πQ|^{−1} ≤ K for some K > 0 on A_ρ, each numerator in the expression of ∂(ġ/g)/∂λ_i is uniformly bounded on Θ × A_ρ by an argument similar to the one used for ġ.

To show (d), it is enough to show the continuity of Σ_{Q∈Z^d} |λ + 2πQ|^{−θ} on Θ × A_ρ, since { Σ_{j=1}^d 4 sin²(λ_j/2) }^{2τ} is continuous on A_ρ. It can be easily shown that

Σ_{Q∈Z^d, ∥Q∥>n} |λ + 2πQ|^{−θ}

converges to zero uniformly on Θ × A_ρ as n → ∞, which implies the uniform convergence of Σ_{Q∈Z^d, ∥Q∥≤n} |λ + 2πQ|^{−θ}. Thus, the continuity of g_{c,θ}(λ) follows from the continuity of |λ + 2πQ|^{−θ}.

5.2 Proofs of Theorems in Chapter 2

5.2.1 Proofs of Theorems in Section 2.2

Proof of Theorem 3. If f(λ) satisfies (2.4) for all λ, then (2.5) and (2.6) hold by results in Stein (1995) and Lim and Stein (2008). To prove (2.5) and (2.6) when (2.4) holds only for large λ, we need to show that the effect of f(λ) on |λ| ≤ C is negligible. Consider a spectral density k(λ) which satisfies k(λ) ∼ c|λ|^{−θ} as |λ| → ∞, is twice differentiable, and satisfies (2.4) for all λ. Also assume that k(λ) ≡ f(λ) for |λ| > C.

Let I_m^{f,τ}(λ) be the periodogram at λ from the observations under f(λ), and

a_{m,ϕ}^{f,τ}(J, K) = (2πm)^{−d} ∫_{R^d} { Σ_{j=1}^d 4 sin²(ϕλ_j/2) }^{2τ} f(λ) Φ(λ, J, K) dλ,

where

Φ(λ, J, K) = Π_{j=1}^d sin²(mϕλ_j/2) / [ sin(ϕλ_j/2 + πJ_j/m) sin(ϕλ_j/2 + πK_j/m) ].

Note that

E( I_m^{f,τ}(2πJ/m) ) = a_{m,ϕ}^{f,τ}(J, J),
Var( I_m^{f,τ}(2πJ/m) ) = a_{m,ϕ}^{f,τ}(J, J)² + a_{m,ϕ}^{f,τ}(J, −J)².
(2.5) and (2.6) follow from Theorems 3, 6 and 12 in Lim and Stein (2008), once those theorems hold for f under Assumption 1. The key part of the proofs of these theorems under Assumption 1 is to show

E( I_m^{f,τ}(2πJ/m) ) / f̄_ϕ^τ(2πJ/m) = 1 + O(m^{−β₁}),   (5.5)
Var( I_m^{f,τ}(2πJ/m) ) / f̄_ϕ^τ(2πJ/m)² = 1 + O(m^{−β₂}),   (5.6)

for some β₁, β₂ > 0. Once (5.5) and (5.6) are shown, the other parts of the proofs are similar to the proofs in Lim and Stein (2008). Since the results in Stein (1995) and Lim and Stein (2008) hold for k(λ), we have (5.5) and (5.6) for k(λ). Then, (5.5) and (5.6) for f(λ) follow from

a_{m,ϕ}^{f,τ}(J, ±J) − a_{m,ϕ}^{k,τ}(J, ±J) = O(m^{−d−4τ}),   (5.7)

for J that satisfies ∥J∥ ≍ m and 2J/m ∉ Z^d. (5.7) holds since

| a_{m,ϕ}^{f,τ}(J, ±J) − a_{m,ϕ}^{k,τ}(J, ±J) |
= (2πm)^{−d} | ∫_{|λ|≤C} { Σ_{j=1}^d 4 sin²(ϕλ_j/2) }^{2τ} (f(λ) − k(λ)) Φ(λ, J, ±J) dλ |
≤ (2πm)^{−d} ∫_{|λ|≤C} { Σ_{j=1}^d 4 sin²(ϕλ_j/2) }^{2τ} |f(λ) − k(λ)| Φ(λ, J, ±J) dλ
≤ v m^{−d−4τ}

for some positive constant v, since k(λ) ≡ f(λ) for |λ| > C and ϕλ_j/2 ± πJ_j/m stays away from zero and π when m is large.

5.2.2 Proofs of Theorems in Section 2.3

Proof of Theorem 4. To show the weak consistency of ĉ, we consider upper and lower bounds for ĉ. Let

K_U = arg max_{K∈T_m, W_h(K)≠0} g_0(2π(J+K)/m)

and

K_L = arg min_{K∈T_m, W_h(K)≠0} g_0(2π(J+K)/m).

Recall that g_0 = g_{1,θ0}. Then, we have

Σ_{K∈T_m} W_h(K) I_m^τ(2π(J+K)/m) / ( m^{d−θ0} g_0(2π(J+K_U)/m) ) ≤ ĉ ≤ Σ_{K∈T_m} W_h(K) I_m^τ(2π(J+K)/m) / ( m^{d−θ0} g_0(2π(J+K_L)/m) ),

which can be rewritten as

c Î_m^τ(2πJ/m) / ( m^{d−θ0} g_{c,θ0}(2π(J+K_U)/m) ) ≤ ĉ ≤ c Î_m^τ(2πJ/m) / ( m^{d−θ0} g_{c,θ0}(2π(J+K_L)/m) )   (5.8)

with probability one. Note that both g_{c,θ0}(2π(J+K_U)/m) and g_{c,θ0}(2π(J+K_L)/m) converge to g_{c,θ0}((π/2)1_d) by the continuity of g_{c,θ}(λ), and m^{−(d−θ0)} Î_m^τ(2πJ/m) converges to g_{c,θ0}((π/2)1_d) in probability by Theorem 3. Thus, it follows that ĉ converges to c in probability.
For the asymptotic distribution of ĉ, note that we have

m^η ( Î_m^τ(2πJ/m)/m^{d−θ0} − g_{c,θ0}((π/2)1_d) ) →_d N( 0, (Λ₂/Λ₁²) (2π/C)^d g_{c,θ0}²((π/2)1_d) )   (5.9)

from Proposition 12 in Lim and Stein (2008), and

m^η ( g_{c,θ0}(2π(J+K_E)/m) − g_{c,θ0}((π/2)1_d) ) → 0   (5.10)

for E = U or L, since 4τ > θ0 − 1, h = Cm^{−γ} and d/(d+2) < γ < 1. Then, (2.10) follows from (5.9) and (5.10).

To prove Theorem 5, we need the following lemmas.

Lemma 2. Consider the function h_m(x) = −log(x) + d_m(x − 1), where d_m is positive and depends on a positive integer m. Also assume that d_m → 1 as m → ∞. Then, for a given r with 0 < r < 1, there exist δ_r > 0 and M_r such that for all m ≥ M_r,

h_m(x) > δ_r for any x ∈ Z_r,

where Z_r = {z : |z − 1| > r, z > 0}.

Proof. It can be easily shown that, for any positive integer m, h_m(x) is a convex function on (0, ∞) and is minimized at x = 1/d_m with h_m(1/d_m) ≤ 0. Let h_∞(x) = −log(x) + x − 1. Since d_m → 1, for any r ∈ (0, 1) there exists M_r > 0 such that for all m ≥ M_r, we have |1/d_m − 1| ≤ r and

min{ h_m(1−r), h_m(1+r) } > (1/2) min{ h_∞(1−r), h_∞(1+r) } > 0.

Hence for all x ∈ Z_r, we have

h_m(x) ≥ min{ h_m(1−r), h_m(1+r) } > (1/2) min{ h_∞(1−r), h_∞(1+r) } ≡ δ_r.

The following lemma shows that L(c0, θ1) − L(c0, θ0) can be bounded from below by three terms, two of which can be neglected.

Lemma 3. For a positive integer m and θ1 ∈ Θ, we have

L(c0, θ1) − L(c0, θ0) ≥ A_m + B_m + C_m,

where

A_m = −log( m^{θ1−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ1}(2π(J+S_m)/m) )
+ ( Î_m^τ(2πJ/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K_M)/m) ) ) ( m^{θ1−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ1}(2π(J+S_m)/m) − 1 ),   (5.11)

B_m = log( g_{c0,θ0}(2π(J+S_m)/m) g_{c0,θ1}(2π(J+S_M)/m) / ( g_{c0,θ0}(2π(J+S_M)/m) g_{c0,θ1}(2π(J+S_m)/m) ) ),   (5.12)
C_m = ( Î_m^τ(2πJ/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K_M)/m) ) ) ( 1 − g_{c0,θ0}(2π(J+K_M)/m) / g_{c0,θ0}(2π(J+K_m)/m) ).   (5.13)

In (5.11)-(5.13), K_M, K_m, S_M and S_m are defined as

K_M = arg max_{K∈T_m, W_h(K)≠0} g_{c0,θ0}(2π(J+K)/m),
K_m = arg min_{K∈T_m, W_h(K)≠0} g_{c0,θ0}(2π(J+K)/m),
S_M = arg max_{K∈T_m, W_h(K)≠0} log( g_{c0,θ0}(2π(J+K)/m) / g_{c0,θ1}(2π(J+K)/m) ),
S_m = arg min_{K∈T_m, W_h(K)≠0} g_{c0,θ0}(2π(J+K)/m) / g_{c0,θ1}(2π(J+K)/m).

Furthermore,

sup_{θ∈Θ} |B_m| = o(1),   (5.14)
C_m = o_p(1),   (5.15)

where (5.15) holds under the conditions of Theorem 5.

Proof. From the expression of L(c, θ) given in (2.7), we have

L(c0, θ1) − L(c0, θ0)
= − Σ_{K∈T_m} W_h(K) log( m^{θ1−θ0} g_{c0,θ0}(2π(J+K)/m) / g_{c0,θ1}(2π(J+K)/m) )
+ Σ_{K∈T_m} W_h(K) m^{θ1−θ0} ( g_{c0,θ0}(2π(J+K)/m) / g_{c0,θ1}(2π(J+K)/m) ) I_m^τ(2π(J+K)/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K)/m) )
− Σ_{K∈T_m} W_h(K) I_m^τ(2π(J+K)/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K)/m) )
≥ −log( m^{θ1−θ0} g_{c0,θ0}(2π(J+S_M)/m) / g_{c0,θ1}(2π(J+S_M)/m) )
+ m^{θ1−θ0} ( g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ1}(2π(J+S_m)/m) ) Σ_{K∈T_m} W_h(K) I_m^τ(2π(J+K)/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K_M)/m) )
− Σ_{K∈T_m} W_h(K) I_m^τ(2π(J+K)/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K_m)/m) )
=: H_m.

H_m is further decomposed as

H_m = −log( m^{θ1−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ1}(2π(J+S_m)/m) )
+ ( Î_m^τ(2πJ/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K_M)/m) ) ) ( m^{θ1−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ1}(2π(J+S_m)/m) − 1 )
+ log( g_{c0,θ0}(2π(J+S_m)/m) g_{c0,θ1}(2π(J+S_M)/m) / ( g_{c0,θ0}(2π(J+S_M)/m) g_{c0,θ1}(2π(J+S_m)/m) ) )
+ ( Î_m^τ(2πJ/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K_M)/m) ) ) ( 1 − g_{c0,θ0}(2π(J+K_M)/m) / g_{c0,θ0}(2π(J+K_m)/m) ),

which is A_m + B_m + C_m given in (5.11)-(5.13).
Note that 2π(J+K_M)/m, 2π(J+K_m)/m, 2π(J+S_M)/m and 2π(J+S_m)/m converge to (π/2)1_d as m → ∞. Note also that the convergence of 2π(J+S_M)/m and 2π(J+S_m)/m holds uniformly in θ1 on Θ, because h → 0. The continuity of g_{c0,θ} in Lemma 1 implies that, as m → ∞,

log( g_{c0,θ0}(2π(J+S_m)/m) g_{c0,θ1}(2π(J+S_M)/m) / ( g_{c0,θ0}(2π(J+S_M)/m) g_{c0,θ1}(2π(J+S_m)/m) ) ) → 0   (5.16)

uniformly in θ1 on Θ; therefore sup_Θ |B_m| = o(1). Also, we have

m^{−(d−θ0)} Î_m^τ(2πJ/m) / g_{c0,θ0}(2π(J+K_M)/m) →_p 1,

since m^{−(d−θ0)} Î_m^τ(2πJ/m) / g_{c0,θ0}((π/2)1_d) converges to one in probability by Theorem 3 and g_{c0,θ0}(2π(J+K_M)/m) converges to g_{c0,θ0}((π/2)1_d). Together with

1 − g_{c0,θ0}(2π(J+K_M)/m) / g_{c0,θ0}(2π(J+K_m)/m) → 0,

this shows that C_m converges to zero in probability.

Theorem 13 (Egorov's theorem; Folland 1999). Suppose that ν(X) < ∞, and f₁, f₂, ... and f are measurable complex-valued functions on X such that f_n → f a.e. Then for every ϵ > 0 there exists E ⊆ X such that ν(E) < ϵ and f_n → f uniformly on E^c.

Proof of Theorem 5. Let (Ω, F, P) be the probability space on which the stationary Gaussian random field Z(s) is defined. To emphasize the dependence on m, we write θ̂_m instead of θ̂ in this proof. Note that we have

P( L(c0, θ̂_m) − L(c0, θ0) ≤ 0 ) = 1   (5.17)

for any positive integer m, by the definition of θ̂_m. We prove the theorem by deriving a contradiction to (5.17) when θ̂_m does not converge to θ0 in probability.

Suppose that θ̂_m does not converge to θ0 in probability. Then there exist ϵ > 0, δ > 0 and M₁ such that for m ≥ M₁,

P( |θ̂_m − θ0| > ϵ ) > δ.

We define D_m = {ω ∈ Ω : |θ̂_m(ω) − θ0| > ϵ}. By Lemma 3, we have

L(c0, θ̂_m) − L(c0, θ0) ≥ A_m + B_m + C_m,

where A_m, B_m and C_m are given in (5.11)-(5.13) with θ1 = θ̂_m.
Also, note that

A_m = h_m( m^{θ̂−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ̂}(2π(J+S_m)/m) ),

where h_m(·) is defined in Lemma 2 with

d_m = Î_m^τ(2πJ/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K_M)/m) ),   (5.18)

and K_M is defined in Lemma 3.

We will show that there exist a subsequence {m_k} of {m} and a subset of D_{m_k} such that, for large enough m_k, A_{m_k} + B_{m_k} + C_{m_k} is bounded away from zero.

By Theorem 3 and the convergence of g_{c0,θ0}(2π(J+K_M)/m) to g_{c0,θ0}((π/2)1_d), we have d_m →_p 1. Then there exists a subsequence {m_k} of {m} such that d_{m_k} converges to one almost surely. By (5.15) in Lemma 3, the almost sure convergence of d_{m_k} implies that C_{m_k}, given in (5.13), converges to zero almost surely. To use Lemma 2, we need uniform convergence of d_{m_k}, which is obtained by Egorov's theorem (Folland 1999): there exists G_δ ⊂ Ω such that d_{m_k} and C_{m_k} converge uniformly on G_δ and P(G_δ) > 1 − δ/2.

On the other hand, there exists M₂, which does not depend on ω, such that for m_k ≥ M₂,

| m_k^{θ̂_{m_k}−θ0} g_{c0,θ0}(2π(J+S_{m_k})/m_k) / g_{c0,θ̂_{m_k}}(2π(J+S_{m_k})/m_k) − 1 | > 1/2   (5.19)

for all ω ∈ D_{m_k}, because of the uniform boundedness of g_{c0,θ0}/g_{c0,θ1}. Let H_{m_k} = D_{m_k} ∩ G_δ. Note that P(H_{m_k}) > δ/2 > 0 for m_k ≥ M₁. Then, by Lemma 2 with r = 1/2, there exist δ_r > 0 and M_r such that for m_k ≥ M_r,

A_{m_k} = −log( m_k^{θ̂−θ0} g_{c0,θ0}(2π(J+S_{m_k})/m_k) / g_{c0,θ̂}(2π(J+S_{m_k})/m_k) )
+ ( Î_{m_k}^τ(2πJ/m_k) / ( m_k^{d−θ0} g_{c0,θ0}(2π(J+K_M)/m_k) ) ) ( m_k^{θ̂−θ0} g_{c0,θ0}(2π(J+S_{m_k})/m_k) / g_{c0,θ̂}(2π(J+S_{m_k})/m_k) − 1 )
> δ_r   (5.20)

uniformly on H_{m_k}. Note here that M_r ≥ max{M₁, M₂}. By the uniform convergence of |B_m| on Θ shown in Lemma 3, there exists M₃ such that for m_k ≥ M₃,

|B_{m_k}| < δ_r/4   (5.21)

with θ1 = θ̂_{m_k}(ω), uniformly for ω ∈ Ω. The uniform convergence of C_{m_k} on G_δ allows us to find M₄ such that for m_k ≥ M₄,

|C_{m_k}| < δ_r/4   (5.22)

uniformly on H_{m_k}.
Therefore, for m_k ≥ max{M_r, M₃, M₄}, we have

A_{m_k} + B_{m_k} + C_{m_k} ≥ A_{m_k} − |B_{m_k}| − |C_{m_k}| > δ_r/2

on H_{m_k}, which leads to

L(c0, θ̂_{m_k}) − L(c0, θ0) > δ_r/2   (5.23)

on H_{m_k}. Since P(H_{m_k}) > δ/2 > 0, this contradicts (5.17), which completes the proof. Here, we do not need P(∩_k H_{m_k}) > 0, since (5.17) must hold for every m > 0.

To show (2.14), it is enough to show that m^{θ̂−θ0} →_p 1, which is equivalent to showing that

g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ̂}(2π(J+S_m)/m) →_p 1,   (5.24)

m^{θ̂−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ̂}(2π(J+S_m)/m) →_p 1.   (5.25)

(5.24) follows from the consistency of θ̂ and the continuity of g_{c0,θ} shown in Lemma 1.

To show (5.25), notice that we have

P( L(c0, θ̂) − L(c0, θ0) ≤ 0 ) = 1   (5.26)

for each m > 0 by the definition of θ̂, and we have

P( L(c0, θ̂) − L(c0, θ0) ≥ A_m + B_m + C_m ) = 1

by Lemma 3. Suppose that (5.25) does not hold. Then there exist r > 0, δ > 0 and M₁ such that

P( | m^{θ̂−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ̂}(2π(J+S_m)/m) − 1 | > r ) > δ

for all m ≥ M₁. On the other hand, there exists a subsequence {m_k} of {m} such that d_{m_k} → 1, B_m → 0 and C_m → 0 almost surely, where d_m is given in (5.18), and B_m and C_m are given in (5.12) and (5.13) with θ1 = θ̂. Then, by Egorov's theorem, there exists Ω_δ ⊂ Ω such that P(Ω_δ) > 1 − δ/2 and d_{m_k}, B_{m_k} and C_{m_k} converge uniformly on Ω_δ. As in Lemma 2, for a_{m_k}, a nonzero solution of h_{m_k}(b_{m_k}) = 0, where

b_m = m^{θ̂−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ̂}(2π(J+S_m)/m),

there exists M₂ such that |a_{m_k} − 1| ≤ r uniformly on Ω_δ for all m_k ≥ M₂. Now, define

D_m = { ω : | m^{θ̂−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ̂}(2π(J+S_m)/m) − 1 | > r }.   (5.27)

Note that P(D_{m_k} ∩ Ω_δ) ≥ δ/2 > 0 for all m_k ≥ max{M₁, M₂}. Similarly to the proof of Lemma 2, for each m_k ≥ max{M₁, M₂}, there exists δ_r > 0 such that A_{m_k} > δ_r for all ω ∈ D_{m_k} ∩ Ω_δ. This implies that P(A_{m_k} > δ_r) ≥ δ/2 for each m_k ≥ max{M₁, M₂}. Note that δ_r does not depend on m_k, as can be seen in Lemma 2.
Meanwhile, there exists M₃ such that for m_k ≥ M₃, |B_{m_k}| ≤ δ_r/4 and |C_{m_k}| ≤ δ_r/4 for all ω ∈ Ω_δ. Hence we have

P( L(c0, θ̂) − L(c0, θ0) > δ_r/2 ) ≥ δ/2

for m_k ≥ max{M₁, M₂, M₃}, which contradicts (5.26). Thus, (5.25) is proved.

Alternative proof of Theorem 5. To show the consistency of θ̂, for a given ϵ with 0 < ϵ < min{θ_U − θ0, θ0 − θ_L}/2, define Θ_ϵ = {θ : |θ − θ0| ≤ ϵ} and let Θ_ϵ^c be the complement of Θ_ϵ. Then, we have

P( θ̂ ∈ Θ_ϵ^c ∩ Θ ) = P( inf_{Θ_ϵ^c ∩ Θ} L(c0, θ) ≤ inf_{Θ_ϵ ∩ Θ} L(c0, θ) )
≤ P( inf_{Θ_ϵ^c ∩ Θ} ( L(c0, θ) − L(c0, θ0) ) ≤ 0 ).

By Lemma 3, we also have

inf_{Θ_ϵ^c ∩ Θ} ( L(c0, θ) − L(c0, θ0) ) ≥ inf_{Θ_ϵ^c ∩ Θ} ( A_m + B_m + C_m )
≥ inf_{Θ_ϵ^c ∩ Θ} ( A_m − |B_m| ) + C_m
≥ inf_{Θ_ϵ^c ∩ Θ} A_m − sup_Θ |B_m| + C_m,

where A_m, B_m and C_m are given in (5.11)-(5.13). Thus, to show the consistency of θ̂, it is enough to show that there exists δ > 0 such that

P( inf_{Θ_ϵ^c ∩ Θ} A_m + C_m > δ ) → 1,

since B_m is deterministic with sup_Θ |B_m| → 0 as m → ∞. We can write A_m as

A_m = h_m( m^{θ−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ}(2π(J+S_m)/m) ),

where h_m(·) is defined in Lemma 2 with

d_m = Î_m^τ(2πJ/m) / ( m^{d−θ0} g_{c0,θ0}(2π(J+K_M)/m) ),   (5.28)

and K_M is defined in Lemma 3. For θ ∈ Θ_ϵ^c ∩ Θ, if θ > θ0 + ϵ,

m^{θ−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ}(2π(J+S_m)/m) → ∞

as m → ∞, because of the uniform boundedness of g_{c0,θ0}/g_{c0,θ} shown in Lemma 1. Similarly, if θ < θ0 − ϵ,

m^{θ−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ}(2π(J+S_m)/m) → 0

as m → ∞. Thus, there exists M₁ such that for m ≥ M₁,

| m^{θ−θ0} g_{c0,θ0}(2π(J+S_m)/m) / g_{c0,θ}(2π(J+S_m)/m) − 1 | > 1/2   (5.29)

for all θ ∈ Θ_ϵ^c ∩ Θ, because of the uniform boundedness of g_{c0,θ0}/g_{c0,θ}.

By Theorem 12 in Lim and Stein (2008) and the convergence of g_{c0,θ0}(2π(J+K_M)/m) to g_{c0,θ0}((π/2)1_d), d_m →_p 1. Similarly, we can show that C_m →_p 0.
Then there exists a $\delta > 0$ such that
$$ P\left( \inf_{\Theta_\epsilon^c \cap \Theta} A_m + C_m > \delta \right) \longrightarrow 1 \tag{5.30} $$
by Lemma 2 with $r = 1/2$ and the fact that the randomness of $A_m$ and $C_m$ comes from the same quantity $d_m$. This completes the proof of (2.13).

To prove Theorem 6, we need the following lemma.

Lemma 4. Under the conditions of Theorem 5, let $\eta = d(1-\gamma)/2$. We have

(a)
$$ m^\eta \left[ \sum_{K \in T_m} W_h(K)\, \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K)/m)} - 1 \right] \xrightarrow{d} N\left( 0,\ \frac{\Lambda_2}{\Lambda_1^2} \left( \frac{2\pi}{C} \right)^d \right), \tag{5.31} $$

(b)
$$ \sum_{K \in T_m} W_h(K) \left( 1 - \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K)/m)} \right) \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g_{c_0,\theta_0}(2\pi(J + K)/m)} = O_p(m^{-\eta}). \tag{5.32} $$

Proof. To prove (5.31), we find the asymptotic distribution of its lower and upper bounds. It can easily be shown that
$$ LB_m \le m^\eta \left[ \sum_{K \in T_m} W_h(K)\, \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K)/m)} - 1 \right] \le UB_m, $$
where
$$ LB_m = m^\eta \left( \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K_M)/m)} - 1 \right), \tag{5.33} $$
$$ UB_m = m^\eta \left( \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K_m)/m)} - 1 \right), \tag{5.34} $$
with $K_M$ and $K_m$ as given in Lemma 3. We rewrite $LB_m$ as
$$ LB_m = m^\eta \left[ \left( \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)} - 1 \right) \frac{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\theta_0}(2\pi(J + K_M)/m)} + \frac{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\theta_0}(2\pi(J + K_M)/m)} - 1 \right]. $$
By Lemma 1 and $\gamma > d/(d+2)$, we have
$$ \frac{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\theta_0}(2\pi(J + K_M)/m)} \longrightarrow 1, \qquad m^\eta \left( \frac{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\theta_0}(2\pi(J + K_M)/m)} - 1 \right) \longrightarrow 0. $$
Thus, by Theorem 3,
$$ LB_m \xrightarrow{d} N\left( 0,\ \frac{\Lambda_2}{\Lambda_1^2} \left( \frac{2\pi}{C} \right)^d \right). $$
Similarly, we can show
$$ UB_m \xrightarrow{d} N\left( 0,\ \frac{\Lambda_2}{\Lambda_1^2} \left( \frac{2\pi}{C} \right)^d \right). $$
The lower and upper bounds converge to the same distribution, which implies (5.31).

To show (5.32), we rewrite the LHS of (5.32) as
$$ \sum_{K \in T_m} W_h(K) \left( 1 - \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K)/m)} \right) \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g_{c_0,\theta_0}(2\pi(J + K)/m)} $$
$$ = \sum_{K \in T_m} W_h(K) \left[ \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g_{c_0,\theta_0}(2\pi(J + K)/m)} - \frac{\dot g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)} \right] - \sum_{K \in T_m} W_h(K)\, \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K)/m)}\, \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g_{c_0,\theta_0}(2\pi(J + K)/m)} + \frac{\dot g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}. $$
By Lemma 1 and $\gamma > d/(d+2)$, we can show that
$$ m^\eta \sum_{K \in T_m} W_h(K) \left[ \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g_{c_0,\theta_0}(2\pi(J + K)/m)} - \frac{\dot g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)} \right] \longrightarrow 0. $$
Also, it can easily be shown that
$$ LB_m \le \sum_{K \in T_m} W_h(K)\, \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K)/m)}\, \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g_{c_0,\theta_0}(2\pi(J + K)/m)} \le UB_m, $$
where
$$ LB_m = \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}\; \frac{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)\, \dot g_{c_0,\theta_0}(2\pi(J + P_m)/m)}{g^2_{c_0,\theta_0}(2\pi(J + P_m)/m)}, \qquad UB_m = \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}\; \frac{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)\, \dot g_{c_0,\theta_0}(2\pi(J + P_M)/m)}{g^2_{c_0,\theta_0}(2\pi(J + P_M)/m)}, $$
with
$$ P_M = \arg\max_{\{K \in T_m,\, W_h(K) \neq 0\}} \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g^2_{c_0,\theta_0}(2\pi(J + K)/m)}, \qquad P_m = \arg\min_{\{K \in T_m,\, W_h(K) \neq 0\}} \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g^2_{c_0,\theta_0}(2\pi(J + K)/m)}. $$
By Lemma 1, $\gamma > d/(d+2)$ and Theorem 3, we can show that
$$ m^\eta \left( LB_m - \frac{\dot g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)} \right) \xrightarrow{d} N\left( 0,\ \left( \frac{\dot g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)} \right)^2 \frac{\Lambda_2}{\Lambda_1^2} \left( \frac{2\pi}{C} \right)^d \right), $$
and the same limit holds for $m^\eta \left( UB_m - \dot g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)/g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d) \right)$. This completes the proof of (5.32).

Proof of Theorem 6. Let $\dot L = \partial L/\partial\theta$ and $\ddot L = \partial^2 L/\partial\theta^2$. To show the asymptotic distribution of $\hat\theta$, we consider the Taylor expansion of $\dot L(c_0, \hat\theta)$ around $\theta_0$:
$$ \dot L(c_0, \hat\theta) = \dot L(c_0, \theta_0) + \ddot L(c_0, \bar\theta)(\hat\theta - \theta_0), $$
where $\bar\theta$ lies on the line segment between $\hat\theta$ and $\theta_0$. Since $\dot L(c_0, \hat\theta) = 0$, we have
$$ \log(m)\, m^\eta (\hat\theta - \theta_0) = -\log(m)\, m^\eta \left( \ddot L(c_0, \bar\theta) \right)^{-1} \dot L(c_0, \theta_0). $$
Thus, it is enough to show
$$ (\log(m))^{-1} m^\eta\, \dot L(c_0, \theta_0) \xrightarrow{d} N\left( 0,\ \frac{\Lambda_2}{\Lambda_1^2} \left( \frac{2\pi}{C} \right)^d \right), \tag{5.35} $$
$$ (\log(m))^{-2}\, \ddot L(c_0, \bar\theta) \xrightarrow{p} 1. \tag{5.36} $$
Since
$$ \dot L(c_0, \theta_0) = -\log(m) + \sum_{K \in T_m} W_h(K)\, \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g_{c_0,\theta_0}(2\pi(J + K)/m)} - \sum_{K \in T_m} W_h(K)\, I^\tau_m(2\pi(J + K)/m)\, \frac{-\log(m)\, m^{d-\theta_0} g_{c_0,\theta_0}(2\pi(J + K)/m) + m^{d-\theta_0} \dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{\left( m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K)/m) \right)^2} $$
$$ = \log(m) \left[ \sum_{K \in T_m} W_h(K)\, \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K)/m)} - 1 \right] + \sum_{K \in T_m} W_h(K) \left( 1 - \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi(J + K)/m)} \right) \frac{\dot g_{c_0,\theta_0}(2\pi(J + K)/m)}{g_{c_0,\theta_0}(2\pi(J + K)/m)}, $$
we see that (5.35) follows from Lemma 4.

Next we prove (5.36). After some simplification, we have
$$ \ddot L(c_0, \bar\theta) = (\log(m))^2 \sum_{K \in T_m} W_h(K)\, \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\bar\theta}\, g_{c_0,\bar\theta}(2\pi(J + K)/m)} - 2\log(m) \sum_{K \in T_m} W_h(K)\, \frac{I^\tau_m(2\pi(J + K)/m)\, \dot g_{c_0,\bar\theta}(2\pi(J + K)/m)}{m^{d-\bar\theta}\, g^2_{c_0,\bar\theta}(2\pi(J + K)/m)} $$
$$ + 2 \sum_{K \in T_m} W_h(K)\, \frac{I^\tau_m(2\pi(J + K)/m)\, \dot g^2_{c_0,\bar\theta}(2\pi(J + K)/m)}{m^{d-\bar\theta}\, g^3_{c_0,\bar\theta}(2\pi(J + K)/m)} + \sum_{K \in T_m} W_h(K) \left( 1 - \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\bar\theta}\, g_{c_0,\bar\theta}(2\pi(J + K)/m)} \right) \frac{\ddot g_{c_0,\bar\theta}(2\pi(J + K)/m)}{g_{c_0,\bar\theta}(2\pi(J + K)/m)} - \sum_{K \in T_m} W_h(K)\, \frac{\dot g^2_{c_0,\bar\theta}(2\pi(J + K)/m)}{g^2_{c_0,\bar\theta}(2\pi(J + K)/m)} =: E_1 + E_2, $$
where $E_1$ is the first term (the one with $(\log(m))^2$) and $E_2$ collects the last four terms in the expression of $\ddot L(c_0, \bar\theta)$. First, we want to show that
$$ (\log(m))^{-2} E_1 \xrightarrow{p} 1. \tag{5.37} $$
It can easily be shown that $LB_m \le (\log(m))^{-2} E_1 \le UB_m$, where
$$ LB_m = \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}\; m^{\bar\theta - \theta_0}\, \frac{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\bar\theta}(2\pi(J + P_M)/m)}, \qquad UB_m = \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}\; m^{\bar\theta - \theta_0}\, \frac{g_{c_0,\theta_0}((\pi/2)\mathbf{1}_d)}{g_{c_0,\bar\theta}(2\pi(J + P_m)/m)}, $$
with
$$ P_M = \arg\max_{\{K \in T_m,\, W_h(K) \neq 0\}} g_{c_0,\bar\theta}(2\pi(J + K)/m), \qquad P_m = \arg\min_{\{K \in T_m,\, W_h(K) \neq 0\}} g_{c_0,\bar\theta}(2\pi(J + K)/m). $$
By Theorem 3, (2.14) in Theorem 5 and Lemma 1, we can show that both $LB_m$ and $UB_m$ converge to one in probability, which in turn implies (5.37). In a similar way, we can show that $(\log(m))^{-1} E_2 = O_p(1)$. Together with (5.37), this proves (5.36), which completes the proof.
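Every contradiction argument above runs through the function $h_m(x) = -\log(x) + d_m(x-1)$ of Lemma 2, which Lemma 5 below extends: $h_m$ is convex, vanishes at $x = 1$, attains its minimum at $x = 1/d_m$, and is bounded away from zero once $x$ is bounded away from $1$. A small numerical sketch of these properties (the value of $d_m$ is hypothetical, chosen only for illustration):

```python
import math

def h(x, d_m):
    """h_m(x) = -log(x) + d_m*(x - 1), the convex function of Lemmas 2 and 5."""
    return -math.log(x) + d_m * (x - 1)

d_m = 1.05  # hypothetical member of a sequence with limit d close to 1

# h_m(1) = 0 for every d_m
assert abs(h(1.0, d_m)) < 1e-12
# the minimum is at x = 1/d_m, with value log(d_m) + 1 - d_m <= 0
assert h(1 / d_m, d_m) <= 0
# away from x = 1, h_m grows without bound (property (i) of f_c in Lemma 5)
assert h(0.01, d_m) > 1 and h(100.0, d_m) > 1
```

The assertions mirror properties (i)–(iii) used in the proofs: the only region where $h_m$ can be small is a neighborhood of $x = 1$, which is exactly why forcing the ratio $b_m$ away from $1$ forces $A_m$ above a fixed $\delta_r$.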
In order to prove Theorem 7, we extend Lemma 2 to a more general situation.

Lemma 5. Consider a function $h_m(x) = -\log(x) + d_m(x - 1)$, $x > 0$, where $\{d_m\}$ is a sequence of positive numbers such that $d_m \to d > 0$ as $m \to \infty$. Then there exist $r_l \in (0, 1)$, $r_u \in (1, \infty)$ and $M$ such that for all $m \ge M$, $h_m(x) > 1$ for all $x \in (0, r_l] \cup [r_u, \infty)$.

Proof. Since $d_m \to d > 0$, for every $\epsilon \in (0, d)$ there exists $M$ such that for all $m \ge M$, $|d_m - d| < \epsilon$, that is, $d - \epsilon < d_m < d + \epsilon$. For any fixed $c > 0$, the function $f_c(x) = -\log(x) + c(x - 1)$, $x > 0$, has the following properties:

(i) $f_c(x) \to \infty$ as $x \to 0^+$ or $x \to \infty$.

(ii) $f_c'(x) = -\frac{1}{x} + c$, so $f_c'(x) = 0 \Leftrightarrow x = \frac{1}{c}$; moreover, $f_c'(x) < 0$ if $x < \frac{1}{c}$ and $f_c'(x) > 0$ if $x > \frac{1}{c}$.

(iii) $f_c$ attains its minimum at $x = \frac{1}{c}$ and $f_c(\frac{1}{c}) \le 0$ ($f_c(\frac{1}{c}) < 0$ if $c \neq 1$; otherwise $f_1(1) = 0$).

Hence we can find $x_1 < \frac{1}{c} < x_2$ such that $f_c(x) \ge 1$ if $0 < x \le x_1$ or $x \ge x_2$. Now we apply the above facts with $c = d - \epsilon$ and $c = d + \epsilon$ to get the following:

(a) If $0 < x \le x_1$, then $h_m(x) = -\log(x) + d_m(x - 1) \ge -\log(x) + (d + \epsilon)(x - 1) \ge 1$.

(b) If $x \ge x_2$, then $h_m(x) = -\log(x) + d_m(x - 1) \ge -\log(x) + (d - \epsilon)(x - 1) \ge 1$.

This proves the Lemma.

To prove Theorem 7, we first find a lower bound for $L(c^*, \theta_1) - L(c^*, \theta_0)$. The construction of this lower bound follows by replacing $c_0$ in (5.11), (5.12) and (5.13) of Lemma 3 with $c^*$. The lower bound again consists of three terms, two of which are dominated by the third.

Lemma 6. For a positive integer $m$ and any $\theta_1 \in \Theta$, we have
$$ L(c^*, \theta_1) - L(c^*, \theta_0) \ge A_m + B_m + C_m, $$
where
$$ A_m = -\log\left( m^{\theta_1 - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi(J + S_m)/m)}{g_{c^*,\theta_1}(2\pi(J + S_m)/m)} \right) + \frac{\hat I^\delta_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c^*,\theta_0}(2\pi(J + K_M)/m)} \left( m^{\theta_1 - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi(J + S_m)/m)}{g_{c^*,\theta_1}(2\pi(J + S_m)/m)} - 1 \right), \tag{5.38} $$
$$ B_m = \log\left( \frac{g_{c^*,\theta_0}(2\pi(J + S_m)/m)\; g_{c^*,\theta_1}(2\pi(J + S_M)/m)}{g_{c^*,\theta_0}(2\pi(J + S_M)/m)\; g_{c^*,\theta_1}(2\pi(J + S_m)/m)} \right), \tag{5.39} $$
$$ C_m = \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c^*,\theta_0}(2\pi(J + K_M)/m)} \left( 1 - \frac{g_{c^*,\theta_0}(2\pi(J + K_M)/m)}{g_{c^*,\theta_0}(2\pi(J + K_m)/m)} \right). \tag{5.40} $$
In (5.38)–(5.40), $K_M$, $K_m$, $S_M$ and $S_m$ are defined as
$$ K_M = \arg\max_{\{K \in T_m,\, W_h(K) \neq 0\}} g_{c^*,\theta_0}(2\pi(J + K)/m), \qquad K_m = \arg\min_{\{K \in T_m,\, W_h(K) \neq 0\}} g_{c^*,\theta_0}(2\pi(J + K)/m), $$
$$ S_M = \arg\max_{\{K \in T_m,\, W_h(K) \neq 0\}} \log\left( \frac{g_{c^*,\theta_0}(2\pi(J + K)/m)}{g_{c^*,\theta_1}(2\pi(J + K)/m)} \right), \qquad S_m = \arg\min_{\{K \in T_m,\, W_h(K) \neq 0\}} \frac{g_{c^*,\theta_0}(2\pi(J + K)/m)}{g_{c^*,\theta_1}(2\pi(J + K)/m)}. $$
Furthermore,
$$ \sup_{\theta \in \Theta} |B_m| = o(1), \tag{5.41} $$
$$ C_m = o_p(1), \tag{5.42} $$
where (5.42) holds under the conditions of Theorem 5.

Proof of Lemma 6. The proof proceeds in the same way as that of Lemma 3, so we omit the details.

Proof of Theorem 7. Let $(\Omega, \mathcal{F}, P)$ be the probability space on which the stationary Gaussian random field $Z(s)$ is defined. To emphasize the dependence on $m$, we write $\hat\theta_m$ instead of $\hat\theta$ in this proof. Note that we have
$$ P\left( L(c^*, \hat\theta_m) - L(c^*, \theta_0) \le 0 \right) = 1 \tag{5.43} $$
for any positive integer $m$, by the definition of $\hat\theta_m$. We prove the theorem by deriving a contradiction to (5.43) when $\hat\theta_m$ does not converge to $\theta_0$ in probability.

Suppose that $\hat\theta_m$ does not converge to $\theta_0$ in probability. Then there exist $\epsilon > 0$, $\delta > 0$ and $M_1$ such that for $m \ge M_1$,
$$ P(|\hat\theta_m - \theta_0| > \epsilon) > \delta. $$
We define $D_m = \{\omega \in \Omega : |\hat\theta_m - \theta_0| > \epsilon\}$. By Lemma 6, we have
$$ L(c^*, \hat\theta_m) - L(c^*, \theta_0) \ge A_m + B_m + C_m, $$
where $A_m$, $B_m$ and $C_m$ are given in (5.38)–(5.40) with $\theta_1 = \hat\theta$. Also, note that
$$ A_m = h_m\left( m^{\hat\theta - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi(J + S_m)/m)}{g_{c^*,\hat\theta}(2\pi(J + S_m)/m)} \right), $$
where $h_m(\cdot)$ is defined in Lemma 5 with
$$ d_m = \frac{\hat I^\delta_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c^*,\theta_0}(2\pi(J + K_M)/m)}, \tag{5.44} $$
where $K_M$ is defined in Lemma 6. We are going to show that there exist a subsequence $\{m_k\}$ of $\{m\}$ and a subset of $D_{m_k}$ on which, for large enough $m_k$, $A_{m_k} + B_{m_k} + C_{m_k}$ is bounded away from zero.

By Theorem 3 and the convergence of $g_{c^*,\theta_0}(2\pi(J + K_M)/m)$ to $g_{c^*,\theta_0}((\pi/2)\mathbf{1}_d)$, we have $d_m \xrightarrow{p} d = c_0/c^*$.
Then there exists a subsequence $\{m_k\}$ of $\{m\}$ such that $d_{m_k}$ converges to $d$ almost surely. By (5.17) in Lemma 3, almost sure convergence of $d_{m_k}$ implies that $C_{m_k}$ given in (5.40) converges to zero almost surely. To use Lemma 5, we need uniform convergence of $d_{m_k}$, which is obtained by Egorov's Theorem (Folland, 1999). By Egorov's Theorem, there exists $G_\delta \subset \Omega$ such that $d_{m_k}$ and $C_{m_k}$ converge uniformly on $G_\delta$ and $P(G_\delta) > 1 - \delta/2$.

On the other hand, there exists an $M_2$, which does not depend on $\omega$, such that for $m_k \ge M_2$,
$$ m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi(J + S_{m_k})/m_k)}{g_{c^*,\hat\theta_{m_k}}(2\pi(J + S_{m_k})/m_k)} \tag{5.45} $$
falls outside $(r_l, r_u)$ for all $\omega \in D_{m_k}$, because of the uniform boundedness of $g_{c^*,\theta_0}/g_{c^*,\theta_1}$.

Let $H_{m_k} = D_{m_k} \cap G_\delta$. Note that $P(H_{m_k}) > \delta/2 > 0$ for $m_k \ge M_1$. Then, by Lemma 5, there exist $\delta_r > 0$ and $M_r$ such that for $m_k \ge M_r$,
$$ A_{m_k} = -\log\left( m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi(J + S_{m_k})/m_k)}{g_{c^*,\hat\theta_{m_k}}(2\pi(J + S_{m_k})/m_k)} \right) + \frac{\hat I^\delta_{m_k}(2\pi J/m_k)}{m_k^{d-\theta_0}\, g_{c^*,\theta_0}(2\pi(J + K_M)/m_k)} \left( m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi(J + S_{m_k})/m_k)}{g_{c^*,\hat\theta_{m_k}}(2\pi(J + S_{m_k})/m_k)} - 1 \right) > \delta_r \tag{5.46} $$
uniformly on $H_{m_k}$. Note here that $M_r \ge \max\{M_1, M_2\}$. By the uniform convergence of $|B_m|$ on $\Theta$ shown in Lemma 6, there exists an $M_3$ such that for $m_k \ge M_3$,
$$ |B_{m_k}| < \frac{\delta_r}{4} \tag{5.47} $$
with $\theta_1 = \hat\theta_{m_k}(\omega)$, uniformly in $\omega \in \Omega$. The uniform convergence of $C_{m_k}$ on $G_\delta$ allows us to find $M_4$ such that for $m_k \ge M_4$,
$$ |C_{m_k}| < \frac{\delta_r}{4} \tag{5.48} $$
uniformly on $H_{m_k}$. Therefore, for $m_k \ge \max\{M_r, M_3, M_4\}$, we have $A_{m_k} + B_{m_k} + C_{m_k} \ge A_{m_k} - |B_{m_k}| - |C_{m_k}| > \delta_r/2$ on $H_{m_k}$, which leads to
$$ L(c^*, \hat\theta_{m_k}) - L(c^*, \theta_0) > \frac{\delta_r}{2} \tag{5.49} $$
on $H_{m_k}$. Since $P(H_{m_k}) > \delta/2 > 0$, this contradicts (5.43), which completes the proof. Here we do not need $P(\cap_k H_{m_k}) > 0$, since (5.43) must hold for every $m > 0$.

(2.17) follows from
$$ \lim_{m \to \infty} P\left( m^{\hat\theta - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi(J + S_m)/m)}{g_{c^*,\hat\theta}(2\pi(J + S_m)/m)} \in (r_l, r_u) \right) = 1; $$
otherwise, the same contradiction to (5.43) would be found.
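For intuition on why the minimizer $\hat\theta$ of the weighted Whittle-type objective sits at $\theta_0$: the inequality $-\log x + x - 1 \ge 0$ behind Lemma 2 implies that, if the normalized periodogram ordinates were exactly equal to their limiting values, the objective $\sum_K W_h(K)\,[\log(m^{d-\theta} g_{c_0,\theta}(\lambda_K)) + I^\tau_m(\lambda_K)/(m^{d-\theta} g_{c_0,\theta}(\lambda_K))]$ (the function whose $\theta$-derivative appears in the proof of Theorem 6) would be minimized exactly at $\theta_0$. A toy numerical sketch of this fact, assuming the idealized tail model $g_{c,\theta}(\lambda) = c\,|\lambda|^{-\theta}$ and noiseless "periodogram" values; all concrete numbers below are hypothetical and for illustration only:

```python
import numpy as np

def g(lam, c, theta):
    # hypothetical tail model g_{c,theta}(lambda) = c * |lambda|^(-theta)
    return c * np.abs(lam) ** (-theta)

def objective(theta, lam, I, W, m, d, c0):
    # weighted Whittle-type objective: sum_K W(K) [ log(scale) + I(K)/scale ]
    scale = m ** (d - theta) * g(lam, c0, theta)
    return np.sum(W * (np.log(scale) + I / scale))

m, d = 64, 1
c0, theta0 = 2.0, 1.5
lam = 2 * np.pi * np.arange(5, 15) / m      # frequencies near a fixed J (hypothetical choice)
W = np.ones_like(lam) / lam.size            # uniform weights W_h(K), summing to one
I = m ** (d - theta0) * g(lam, c0, theta0)  # idealized ordinates: equal to their limit, no noise

grid = np.linspace(1.0, 2.0, 201)
theta_hat = grid[np.argmin([objective(t, lam, I, W, m, d, c0) for t in grid])]
# with noiseless ordinates, each summand of L(theta) - L(theta0) is -log(rho) + rho - 1 >= 0,
# so the grid minimizer sits at theta0
assert abs(theta_hat - theta0) < 1e-8
```

With real data the ordinates fluctuate around their limits, which is exactly what the $A_m + B_m + C_m$ decomposition and the convergence of $d_m$ control in the proofs above.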
To prove (2.29), let
$$ K_M = \arg\max_{\{K \in T_m,\, W_h(K) \neq 0\}} g_{\hat\theta}(2\pi(J + K)/m), \qquad K_m = \arg\min_{\{K \in T_m,\, W_h(K) \neq 0\}} g_{\hat\theta}(2\pi(J + K)/m). $$
Write
$$ \hat c = \sum_{K \in T_m} W_h(K)\, \frac{I^\tau_m(2\pi(J + K)/m)}{m^{d-\hat\theta}\, g_{\hat\theta}(2\pi(J + K)/m)}. $$
Then
$$ c\, m^{\hat\theta - \theta}\, \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta}\, g_{\theta,c}(2\pi(J + K_M)/m)}\, \frac{g_\theta(2\pi(J + K_M)/m)}{g_{\hat\theta}(2\pi(J + K_M)/m)} \le \hat c \le c\, m^{\hat\theta - \theta}\, \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta}\, g_{\theta,c}(2\pi(J + K_m)/m)}\, \frac{g_\theta(2\pi(J + K_m)/m)}{g_{\hat\theta}(2\pi(J + K_m)/m)}. $$
By Theorem 3 and the convergence of $g_{c,\theta}(2\pi(J + K_m)/m)$ and $g_{c,\theta}(2\pi(J + K_M)/m)$ to $g_{c,\theta}((\pi/2)\mathbf{1}_d)$,
$$ \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta}\, g_{\theta,c}(2\pi(J + K_M)/m)} \xrightarrow{p} 1 \qquad \text{and} \qquad \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta}\, g_{\theta,c}(2\pi(J + K_m)/m)} \xrightarrow{p} 1. $$
Corollary 1 then follows, because $\hat\theta - \theta = O_p(\log(m)^{-1})$ and $g_{c,\theta}$ is bounded.

5.2.3 Proofs of Theorems in Section 2.4

The ideas used to verify the theoretical results for the second estimator, defined in Section 2.4, are similar to those of Section 2.3. The proofs are simpler and follow from Theorem 3 and Lemma 2.

Proof of Theorem 8. Compared with the first estimator in Section 2.3, the consistency of $\hat c$ is obtained directly, because $m^{-(d-\theta_0)}\, \hat I^\tau_m(2\pi J/m)$ converges to $g_{c,\theta_0}((\pi/2)\mathbf{1}_d)$ in probability by Theorem 3:
$$ \hat c = \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{\theta_0}(2\pi J/m)} \xrightarrow{p} c. $$
The asymptotic distribution of $\hat c$ comes from Theorem 3:
$$ m^\eta \left( \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}} - g_{c,\theta_0}((\pi/2)\mathbf{1}_d) \right) \xrightarrow{d} N\left( 0,\ \frac{\Lambda_2}{\Lambda_1^2} \left( \frac{2\pi}{C} \right)^d g^2_{c,\theta_0}((\pi/2)\mathbf{1}_d) \right). \tag{5.50} $$

Proof of Theorem 9. For all $\theta_1$ and $\theta_2$ in $\Theta$,
$$ R(c_0, \theta_1) - R(c_0, \theta_2) = -\log\left( m^{\theta_1 - \theta_2}\, \frac{g_{c_0,\theta_2}(2\pi J/m)}{g_{c_0,\theta_1}(2\pi J/m)} \right) + \frac{\hat I^\delta_m(2\pi J/m)}{m^{d-\theta_2}\, g_{c_0,\theta_2}(2\pi J/m)} \left( m^{\theta_1 - \theta_2}\, \frac{g_{c_0,\theta_2}(2\pi J/m)}{g_{c_0,\theta_1}(2\pi J/m)} - 1 \right). $$
We again suppose that $Z(s)$ is a stationary Gaussian random field defined on the probability space $(\Omega, \mathcal{F}, P)$, and we write $\hat\theta_m$ for $\hat\theta$ in this proof. The main idea of the proof is to find a contradiction to
$$ P\left( R(c_0, \hat\theta_m) - R(c_0, \theta_0) \le 0 \right) = 1, \tag{5.51} $$
which holds for any positive integer $m$ by the definition of $\hat\theta_m$, when $\hat\theta_m$ does not converge to $\theta_0$ in probability.
Suppose that $\hat\theta_m$ does not converge to $\theta_0$ in probability. Then there exist $\epsilon > 0$, $\delta > 0$ and $M_1$ such that for $m \ge M_1$,
$$ P(|\hat\theta_m - \theta_0| > \epsilon) > \delta. $$
We define $D_m = \{\omega \in \Omega : |\hat\theta_m - \theta_0| > \epsilon\}$ and
$$ d_m = \frac{\hat I^\delta_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi J/m)}. \tag{5.52} $$
By Theorem 3, we know $d_m \xrightarrow{p} 1$. Then there exists a subsequence $\{m_k\}$ of $\{m\}$ such that $d_{m_k}$ converges to one almost surely. To use Lemma 2, we need uniform convergence of $d_{m_k}$, which is obtained by Egorov's Theorem (Folland, 1999). By Egorov's Theorem, there exists $G_\delta \subset \Omega$ such that $d_{m_k}$ converges uniformly on $G_\delta$ and $P(G_\delta) > 1 - \delta/2$. On the other hand, there exists an $M_2$, which does not depend on $\omega$, such that for $m_k \ge M_2$,
$$ \left| m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c_0,\theta_0}(2\pi J/m_k)}{g_{c_0,\hat\theta_{m_k}}(2\pi J/m_k)} - 1 \right| > \frac{1}{2} \tag{5.53} $$
for all $\omega \in D_{m_k}$, because of the uniform boundedness of $g_{c_0,\theta_0}/g_{c_0,\theta_1}$.

Let $H_{m_k} = D_{m_k} \cap G_\delta$. Note that $P(H_{m_k}) > \delta/2 > 0$ for $m_k \ge M_1$. Then, by Lemma 2 with $r = 1/2$, there exist $\delta_r > 0$ and $M_r$ such that for $m_k \ge M_r$,
$$ -\log\left( m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c_0,\theta_0}(2\pi J/m_k)}{g_{c_0,\hat\theta_{m_k}}(2\pi J/m_k)} \right) + \frac{\hat I^\delta_{m_k}(2\pi J/m_k)}{m_k^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi J/m_k)} \left( m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c_0,\theta_0}(2\pi J/m_k)}{g_{c_0,\hat\theta_{m_k}}(2\pi J/m_k)} - 1 \right) > \delta_r \tag{5.54} $$
uniformly on $H_{m_k}$. Note here that $M_r \ge \max\{M_1, M_2\}$. Since $P(H_{m_k}) > \delta/2 > 0$, this contradicts (5.51), which completes the proof, because (5.51) holds for any $m > 0$.

To show (2.14), it is enough to show that $m^{\hat\theta - \theta_0} \xrightarrow{p} 1$, which is equivalent to showing
$$ m^{\hat\theta - \theta_0}\, \frac{g_{c_0,\theta_0}(2\pi J/m)}{g_{c_0,\hat\theta}(2\pi J/m)} \xrightarrow{p} 1, \tag{5.55} $$
because
$$ \frac{g_{c_0,\theta_0}(2\pi J/m)}{g_{c_0,\hat\theta}(2\pi J/m)} \xrightarrow{p} 1. \tag{5.56} $$
(5.56) follows from the consistency of $\hat\theta$ and the continuity of $g_{c_0,\theta}$ shown in Lemma 5.1. To show (5.55), notice that we have
$$ P\left( R(c_0, \hat\theta) - R(c_0, \theta_0) \le 0 \right) = 1 \tag{5.57} $$
for each $m > 0$ by the definition of $\hat\theta$. Suppose that (5.55) does not hold. Then there exist $r > 0$, $\delta > 0$ and $M_1$ such that
$$ P\left( \left| m^{\hat\theta - \theta_0}\, \frac{g_{c_0,\theta_0}(2\pi J/m)}{g_{c_0,\hat\theta}(2\pi J/m)} - 1 \right| > r \right) > \delta $$
for all $m \ge M_1$.
On the other hand, there exists a subsequence $\{m_k\}$ of $\{m\}$ such that $d_{m_k} \to 1$ almost surely. Then, by Egorov's Theorem, there exists $\Omega_\delta \subset \Omega$ such that $P(\Omega_\delta) > 1 - \delta/2$ and $d_{m_k}$ converges uniformly on $\Omega_\delta$. Now define
$$ D_m = \left\{ \omega : \left| m^{\hat\theta - \theta_0}\, \frac{g_{c_0,\theta_0}(2\pi J/m)}{g_{c_0,\hat\theta}(2\pi J/m)} - 1 \right| > r \right\}. \tag{5.58} $$
Note that $P(D_{m_k} \cap \Omega_\delta) \ge \delta/2 > 0$ for all $m_k \ge \max\{M_1, M_r\}$. Similarly to the proof of Lemma 2, for each $m_k \ge \max\{M_1, M_r\}$, there exists $\delta_r > 0$ such that $R(c_0, \hat\theta) - R(c_0, \theta_0) > \delta_r$ for all $\omega \in D_{m_k} \cap \Omega_\delta$. This implies that
$$ P\left( R(c_0, \hat\theta) - R(c_0, \theta_0) > \delta_r \right) \ge \delta/2 $$
for each $m_k \ge \max\{M_1, M_r\}$, which contradicts (5.57); thus (5.55) is proved. Note that $\delta_r$ does not depend on $m_k$, as can be seen in Lemma 2.

Proof of Theorem 10. Let $\dot R = \partial R/\partial\theta$ and $\ddot R = \partial^2 R/\partial\theta^2$. To show the asymptotic distribution of $\hat\theta$, we consider the Taylor expansion of $\dot R(c_0, \hat\theta)$ around $\theta_0$:
$$ \dot R(c_0, \hat\theta) = \dot R(c_0, \theta_0) + \ddot R(c_0, \bar\theta)(\hat\theta - \theta_0), $$
where $\bar\theta$ lies on the line segment between $\hat\theta$ and $\theta_0$. Since $\dot R(c_0, \hat\theta) = 0$, we have
$$ \log(m)\, m^\eta (\hat\theta - \theta_0) = -\log(m)\, m^\eta \left( \ddot R(c_0, \bar\theta) \right)^{-1} \dot R(c_0, \theta_0). $$
Thus, it is enough to show
$$ (\log(m))^{-1} m^\eta\, \dot R(c_0, \theta_0) \xrightarrow{d} N\left( 0,\ \frac{\Lambda_2}{\Lambda_1^2} \left( \frac{2\pi}{C} \right)^d \right), \tag{5.59} $$
$$ (\log(m))^{-2}\, \ddot R(c_0, \bar\theta) \xrightarrow{p} 1. \tag{5.60} $$
Since
$$ \dot R(c_0, \theta_0) = -\log(m) + \frac{\dot g_{c_0,\theta_0}(2\pi J/m)}{g_{c_0,\theta_0}(2\pi J/m)} - \hat I^\tau_m(2\pi J/m)\, \frac{-\log(m)\, m^{d-\theta_0} g_{c_0,\theta_0}(2\pi J/m) + m^{d-\theta_0} \dot g_{c_0,\theta_0}(2\pi J/m)}{\left( m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi J/m) \right)^2} $$
$$ = \log(m) \left( \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi J/m)} - 1 \right) + \left( 1 - \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c_0,\theta_0}(2\pi J/m)} \right) \frac{\dot g_{c_0,\theta_0}(2\pi J/m)}{g_{c_0,\theta_0}(2\pi J/m)}, $$
we see that (5.59) follows from Lemma 4. Next we prove (5.60).
After some simplification, we have
$$ \ddot R(c_0, \bar\theta) = (\log(m))^2\, \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\bar\theta}\, g_{c_0,\bar\theta}(2\pi J/m)} - 2\log(m)\, \frac{\hat I^\tau_m(2\pi J/m)\, \dot g_{c_0,\bar\theta}(2\pi J/m)}{m^{d-\bar\theta}\, g^2_{c_0,\bar\theta}(2\pi J/m)} + 2\, \frac{\hat I^\tau_m(2\pi J/m)\, \dot g^2_{c_0,\bar\theta}(2\pi J/m)}{m^{d-\bar\theta}\, g^3_{c_0,\bar\theta}(2\pi J/m)} + \left( 1 - \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\bar\theta}\, g_{c_0,\bar\theta}(2\pi J/m)} \right) \frac{\ddot g_{c_0,\bar\theta}(2\pi J/m)}{g_{c_0,\bar\theta}(2\pi J/m)} - \frac{\dot g^2_{c_0,\bar\theta}(2\pi J/m)}{g^2_{c_0,\bar\theta}(2\pi J/m)} =: E_1 + E_2, $$
where $E_1$ is the first term (the one with $(\log(m))^2$) and $E_2$ collects the last four terms in the expression of $\ddot R(c_0, \bar\theta)$. From Theorem 3, (2.14) in Theorem 5 and Lemma 1, we obtain
$$ (\log(m))^{-2} E_1 \xrightarrow{p} 1 \qquad \text{and} \qquad (\log(m))^{-1} E_2 = O_p(1), $$
which together prove (5.60).

Proof of Theorem 11. Let $(\Omega, \mathcal{F}, P)$ be the probability space on which the stationary Gaussian random field $Z(s)$ is defined. To emphasize the dependence on $m$, we write $\hat\theta_m$ instead of $\hat\theta$ in this proof. From the previous discussion,
$$ R(c^*, \hat\theta_m) - R(c^*, \theta_0) = -\log\left( m^{\hat\theta_m - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi J/m)}{g_{c^*,\hat\theta_m}(2\pi J/m)} \right) + \frac{\hat I^\delta_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c^*,\theta_0}(2\pi J/m)} \left( m^{\hat\theta_m - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi J/m)}{g_{c^*,\hat\theta_m}(2\pi J/m)} - 1 \right) $$
and
$$ P\left( R(c^*, \hat\theta_m) - R(c^*, \theta_0) \le 0 \right) = 1 \tag{5.61} $$
for any positive integer $m$, by the definition of $\hat\theta_m$. We prove the theorem by deriving a contradiction to (5.61) when $\hat\theta_m$ does not converge to $\theta_0$ in probability.

Suppose that $\hat\theta_m$ does not converge to $\theta_0$ in probability. Then there exist $\epsilon > 0$, $\delta > 0$ and $M_1$ such that for $m \ge M_1$,
$$ P(|\hat\theta_m - \theta_0| > \epsilon) > \delta. $$
We define $D_m = \{\omega \in \Omega : |\hat\theta_m - \theta_0| > \epsilon\}$ and
$$ d_m = \frac{\hat I^\delta_m(2\pi J/m)}{m^{d-\theta_0}\, g_{c^*,\theta_0}(2\pi J/m)}. \tag{5.62} $$
By Theorem 3, we have $d_m \xrightarrow{p} d = c_0/c^*$. Then there exists a subsequence $\{m_k\}$ of $\{m\}$ such that $d_{m_k}$ converges to $d$ almost surely. To use Lemma 5, we need uniform convergence of $d_{m_k}$, which is obtained by Egorov's Theorem (Folland, 1999). By Egorov's Theorem, there exists $G_\delta \subset \Omega$ such that $d_{m_k}$ converges uniformly on $G_\delta$ and $P(G_\delta) > 1 - \delta/2$.
On the other hand, there exists an $M_2$, which does not depend on $\omega$, such that for $m_k \ge M_2$,
$$ m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi J/m_k)}{g_{c^*,\hat\theta_{m_k}}(2\pi J/m_k)} \tag{5.63} $$
falls outside $(r_l, r_u)$ for all $\omega \in D_{m_k}$, because of the uniform boundedness of $g_{c^*,\theta_0}/g_{c^*,\theta_1}$. Let $H_{m_k} = D_{m_k} \cap G_\delta$. Note that $P(H_{m_k}) > \delta/2 > 0$ for $m_k \ge M_1$. Then, by Lemma 5, there exist $\delta_r > 0$ and $M_r$ such that for $m_k \ge M_r$,
$$ R(c^*, \hat\theta_{m_k}) - R(c^*, \theta_0) = -\log\left( m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi J/m_k)}{g_{c^*,\hat\theta_{m_k}}(2\pi J/m_k)} \right) + \frac{\hat I^\delta_{m_k}(2\pi J/m_k)}{m_k^{d-\theta_0}\, g_{c^*,\theta_0}(2\pi J/m_k)} \left( m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi J/m_k)}{g_{c^*,\hat\theta_{m_k}}(2\pi J/m_k)} - 1 \right) \tag{5.64} $$
$$ > \delta_r \tag{5.65} $$
uniformly on $H_{m_k}$. Note here that $M_r \ge \max\{M_1, M_2\}$. This contradicts (5.61), which completes the proof. Here we do not need $P(\cap_k H_{m_k}) > 0$, since (5.61) must hold for every $m > 0$.

Write
$$ \hat c = \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\hat\theta}\, g_{\hat\theta}(2\pi J/m)}. $$
Then
$$ \hat c = c\, m^{\hat\theta - \theta}\, \frac{\hat I^\tau_m(2\pi J/m)}{m^{d-\theta}\, g_{\theta,c}(2\pi J/m)}\, \frac{g_\theta(2\pi J/m)}{g_{\hat\theta}(2\pi J/m)}. $$
(2.29) then follows from $\hat\theta - \theta = O_p(\log(m)^{-1})$, the boundedness of $g_{c,\theta}$, and Theorem 3.

Lemma 7. Consider a function $h_m(x) = -\log(x) + d_m(x - 1)$, where $d_m$ is positive and depends on the positive integer $m$. Assume that $d_m \to d > 1$ (or $d < 1$) as $m \to \infty$. Then there exists some $M$ such that for all $m \ge M$, $h_m(x) > 0$ for any $x > 1$ (respectively, $x < 1$).

Proof. (For $d > 1$.) As discussed above, $h_m(x)$ is a convex function on $(0, \infty)$ for any positive integer $m$ and is minimized at $x = 1/d_m$, with $h_m(1/d_m) \le 0$. Because $d_m \to d > 1$, there exists $M$ such that $d_m > 1$ if $m \ge M$. The two intersection points of $h_m$ with the $x$-axis are $1$ and $u_m < 1$. By the convexity of $h_m$, when $m \ge M$, $h_m(x) > 0$ if $x > 1$. The case $d < 1$ is analogous.

Proof of Theorem 12. Suppose that result (i) of Theorem 12 does not hold. Then there exist $\delta$ and $M_1$ such that
$$ P\left( \theta_0 < \hat\theta_m \right) > \delta $$
for $m > M_1$. By Theorem 3, we have $d_m \xrightarrow{p} d = c_0/c^*$. Then there exists a subsequence $\{m_k\}$ of $\{m\}$ such that $d_{m_k}$ converges to $d$ almost surely.
To use Lemma 7, we need uniform convergence of $d_{m_k}$, which is obtained by Egorov's Theorem (Folland, 1999). By Egorov's Theorem, there exists $G_\delta \subset \Omega$ such that $d_{m_k}$ converges uniformly on $G_\delta$ and $P(G_\delta) > 1 - \delta/2$. Assume $c^* < c_0$. Then $d = c_0/c^* > 1$, and by the uniform convergence there exists $M$ such that $d_{m_k} > 1$ when $m_k > M$. Assume that $\omega \in \Omega_{m_k} = \{\omega : \theta_0 < \hat\theta_{m_k}\}$. Then
$$ g_{c^*,\theta_0}(2\pi J/m) > g_{c^*,\hat\theta_{m_k}}(2\pi J/m) $$
because of the monotonicity of $g_{c^*,\theta}$ in $\theta$, and hence
$$ m_k^{\hat\theta_{m_k} - \theta_0}\, \frac{g_{c^*,\theta_0}(2\pi J/m_k)}{g_{c^*,\hat\theta_{m_k}}(2\pi J/m_k)} > 1 \tag{5.66} $$
for all $\omega \in \Omega_{m_k} \cap G_\delta$. Because $R(c^*, \hat\theta_{m_k}) - R(c^*, \theta_0) > 0$ on $\Omega_{m_k} \cap G_\delta$ by Lemma 7, and $P(\Omega_{m_k} \cap G_\delta) > 0$ when $m_k > M$, this contradicts (5.61). Result (ii) is proven in a similar way.

BIBLIOGRAPHY

[1] Adler, R. J. and Taylor, J. E. (2007). Random Fields and Geometry. Springer Monographs in Mathematics, Springer, New York.

[2] Boissy, Y., Bhattacharyya, B. B., Li, X. and Richardson, G. D. (2005). Parameter estimates for fractional autoregressive spatial processes. Ann. Statist. 33, 2553-2567.

[3] Chan, G., Hall, P. and Poskitt, D. S. (1995). Periodogram-based estimators of fractal properties. Ann. Statist. 23, 1684-1711.

[4] Chan, G. and Wood, A. T. A. (2000). Increment-based estimators of fractal dimension for two-dimensional surface data. Statist. Sinica 10, 343-376.

[5] Chan, G. and Wood, A. T. A. (2004). Estimation of fractal dimension for a class of non-Gaussian stationary processes and fields. Ann. Statist. 32, 1222-1260.

[6] Chen, H.-S., Simpson, D. G. and Ying, Z. (2000). Infill asymptotics for a stochastic process model with measurement error. Statist. Sinica 10, 141-156.

[7] Constantine, A. G. and Hall, P. (1994). Characterizing surface smoothness via estimation of effective fractal dimension. J. R. Statist. Soc. B 56, 97-113.

[8] Cressie, N. A. C. (1993). Statistics for Spatial Data (rev. ed.). John Wiley, New York.

[9] Du, J. (2009a). Asymptotic and computational methods in spatial statistics. (Ph.D. Thesis). Michigan State University, 1-111.

[10] Du, J., Zhang, H. and Mandrekar, V. S. (2009). Fixed-domain asymptotic properties of tapered maximum likelihood estimators. Ann. Statist. 100, 993-1028.

[11] Folland, G. B. (1999). Real Analysis. Wiley-Interscience, New York.

[12] Furrer, R., Genton, M. G. and Nychka, D. (2006). Covariance tapering for interpolation of large spatial datasets. Journal of Computational and Graphical Statistics 15, 502-523.

[13] Guo, H., Lim, C. and Meerschaert, M. M. (2009). Local Whittle estimator for anisotropic random fields. J. Multivariate Anal. 100, 993-1028.

[14] Guyon, X. (1982). Parameter estimation for a stationary process on a d-dimensional lattice. Biometrika 69, 95-105.

[15] Guyon, X. (1995). Random Fields on a Network: Modeling, Statistics, and Applications. Springer-Verlag, New York.

[16] Ibragimov, I. A. and Rozanov, Y. A. (1978). Gaussian Random Processes. Springer, New York. MR0543837

[17] Kaufman, C., Schervish, M. and Nychka, D. (2008). Covariance tapering for likelihood-based estimation in large spatial datasets. J. Amer. Statist. Assoc. 103, 1545-1555.

[18] Loh, W.-L. (2005). Fixed-domain asymptotics for a subclass of Matern-type Gaussian random fields. Ann. Statist. 33, 2344-2394.

[19] Lim, C. and Stein, M. L. (2008). Properties of spatial cross-periodograms using fixed-domain asymptotics. J. Multivariate Anal. 99, 1962-1984.

[20] Mardia, K. V. and Marshall, R. J. (1984). Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika 71, 135-146.

[21] Robinson, P. M. (1995). Gaussian semiparametric estimation of long range dependence. Ann. Statist. 23, 1630-1661.

[22] Stein, M. L. (1988). Asymptotically efficient prediction of a random field with a misspecified covariance function. Ann. Statist. 16, 55-63.

[23] Stein, M. L. (1990). Uniform asymptotic optimality of linear predictions of a random field using an incorrect second-order structure. Ann. Statist. 18, 850-872.

[24] Stein, M. L. (1990). Bounds on the efficiency of linear predictions using an incorrect covariance function. Ann. Statist. 18, 1116-1138.

[25] Stein, M. L. (1993). A simple condition for asymptotic optimality of linear predictions of random fields. Statistics and Probability Letters 17, 399-404.

[26] Stein, M. L. (1995). Fixed-domain asymptotics for spatial periodograms. J. Amer. Statist. Assoc. 432, 1962-1984.

[27] Stein, M. L. (1999). Interpolation of Spatial Data. Springer, New York.

[28] Whittle, P. (1954). On stationary processes in the plane. Biometrika 41, 434-449.

[29] Xue, Y. and Xiao, Y. (2010). Fractal and smoothness properties of space-time Gaussian models. To appear in Frontiers Math.

[30] Yadrenko, M. (1983). Spectral Theory of Random Fields. Optimization Software, New York.

[31] Ying, Z. (1991). Asymptotic properties of a maximum likelihood estimator with data from a Gaussian process. J. Multivariate Anal. 36, 280-296.

[32] Ying, Z. (1993). Maximum likelihood estimation of parameters under a spatial sampling scheme. Ann. Statist. 21, 1567-1590.

[33] Zhang, H. (2004). Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics. J. Amer. Statist. Assoc. 465, 250-261.

[34] Zhang, H. and Zimmerman, D. L. (2005). Towards reconciling two asymptotic frameworks in spatial statistics. Biometrika 92, 921-936.