NONPARAMETRIC ESTIMATION OF INTEGRAL CURVES USING HARDI DATA

By

Michael DeLaura

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Statistics – Doctor of Philosophy

2019

ABSTRACT

NONPARAMETRIC ESTIMATION OF INTEGRAL CURVES USING HARDI DATA

By Michael DeLaura

We develop a fully nonparametric method for the estimation of curve trajectories using HARDI data. For a set of locations $X_i \in G$, with $G$ representing a region of the brain, we consider the diffusion process by applying multivariate kernel smoothing techniques for the estimation of a general function $f$ describing the signal process obtained from the MRI image. At each location $x \in G$ we search for the direction of maximum diffusion on the unit sphere to obtain estimates of curve trajectories. We establish the convergence of the deviation between the estimated and true curves to a Gaussian process, which yields tests for the likelihood of connectivity between regions. The method is computationally efficient, since at each step of the curve tracing we construct a pointwise confidence ellipsoid region rather than relying on exhaustive iterative sampling methods.

Copyright by MICHAEL DELAURA 2019

This work is dedicated to my parents. Thank you for always believing in me.

ACKNOWLEDGEMENTS

I would like to thank my advisor Lyudmila Sakhanenko for all of her help with this work. I will always cherish the time we spent working together. I would also like to thank my committee members: Vidyadhar Mandrekar, Yimin Xiao, and David Zhu. Lastly, I would like to thank Sue Watson and Albert Cohen, who both played a large role in my successful completion of this degree. Thank you all so very much.

TABLE OF CONTENTS

LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 STEJSKAL TANNER EQUATION
1.2 Definitions/Preliminaries
1.3 Simple Vector Model
1.4 DTI Model
1.5 HARDI Model
1.6 Main Result
CHAPTER 2 ESTIMATION & MAIN RESULT
2.1 Preliminaries
2.2 Simpson's Rule
2.3 Estimation
2.4 Main Result for Asymptotic Normality of Deviation Process
CHAPTER 3 MEAN AND COVARIANCE OF THE LIMITING GAUSSIAN PROCESSES
3.1 Preliminaries
3.2 Green's Function
CHAPTER 4 ALGORITHM
CHAPTER 5 SIMULATIONS
5.1 Artificial example
5.2 Real HARDI dataset
CHAPTER 6 PROOFS
6.1 Some Results for Simpson's Scheme and Kernel Smoothing
6.1.1 Existence of Unique Direction and Establishing Approximation of $\hat X_n - x$
6.1.2 Establishing Asymptotic Unbiasedness of Estimators of Derivatives of $f$
6.2 Calculation of Mean and Covariance of $\hat Z_n$
6.3 Mean Squared Error of the Process $\hat Z_n$
6.4 Asymptotic Normality of the $\hat Z_n$ Process
6.4.1 Lyapunov's Condition
6.4.2 Asymptotic Equicontinuity of the Process $\hat Z_n$
CHAPTER 7 CONCLUSION
BIBLIOGRAPHY

LIST OF FIGURES

Figure 5.1: The true curve is in blue, while the estimated curve is in red.
Figure 5.2: An enlargement of the previous figure showing the 95% confidence ellipsoid surrounding a point on the estimated curve. The true curve in blue touches it.
Figure 5.3: Diffusion ellipsoids illustrate the corresponding diffusion tensors along the fiber across the genu of corpus callosum.
Figure 5.4: Visualization of diffusion via ellipsoids using the DTI/HARDI tensor model.
Figure 5.5: Axonal fibers across the genu of Corpus Callosum. We traced each fiber branch for 40 steps of size δ = 0.01. The estimated curve is shown in magenta accompanied by blue 95% confidence ellipsoids.
Figure 5.6: Right Fornix. We traced the fiber for 30 steps of size δ = 0.01. The estimated curve is shown in magenta accompanied by blue 95% confidence ellipsoids.
Figure 7.1: A fiber across the genu of corpus callosum with diffusion "blobs" along it.
Figure 7.2: Visualization of diffusion via nonparametric function using our model.

CHAPTER 1 INTRODUCTION

Diffusion MRI makes use of the properties of protons under an applied magnetic field to measure dominant diffusion directions in cerebral white matter. Magnetic resonance imaging (MRI) utilizes the dynamics of self-spinning protons, most commonly in water molecules, as the source of energy to generate the MRI signal. Under a strong magnetic field, a group of these spins forms a net magnetization. This net magnetization can be perturbed by a radio frequency electromagnetic wave, and its wobbling (precessing) motion can be measured as signals by an MRI scanner. By manipulating the magnetic field with gradients, we can identify the locations of the signal, which in turn allows us to generate images. Because MRI involves no ionizing radiation, and thus the potential damage to the human body is minimal, it has become an important tool in both clinical and research applications.

The directions in which water diffuses correspond to the directions along which the neural pathways, or axons, are aligned. Water moves along, but not across, these pathways, which is a key component in determining path orientations through the use of magnetic fields as in dMRI. This phenomenon of water diffusion is further taken advantage of in MRI to develop diffusion weighted imaging (DWI).
Water diffusion in the presence of a magnetic field gradient leads to MRI signal loss. In an unrestricted environment, water and other molecules move or diffuse randomly in three dimensions as a result of thermal energy. This motion is called Brownian motion. The protons in water molecules, moving in constant random motion within neural pathways, behave as gyroscopes that either 'tilt' or 'align' under a magnetic field applied in a particular direction. A lack of alignment corresponds to a larger signal response, indicating that the applied magnetic field direction is not in agreement with the path orientation. This lack of alignment increases as the magnetic field gradient is applied more orthogonally with respect to a pathway. In the case that there is agreement between the applied direction and the pathway, we get an aligned assortment of protons, which corresponds to a lower signal response. Thus these methods make use of the signal response under an applied magnetic field to measure the degree to which water is diffusing in a given direction as a means of tracking neural pathways. With this knowledge one can build models relating the signal response to an orientation distribution function (ODF), as in kernel regression estimation. This ODF gives an empirical description of the distribution of diffusion at a location $x \in \mathbb{R}^3$ for each particular diffusion direction $b \in \mathbb{R}^3$.

Studying the Brownian motion of molecules (water molecules in our case) in the brain can provide information regarding neuronal structural connectivity in vivo. These measurements have been made possible with DWI [18, 19], which applies diffusion-weighted gradients in various directions to assess the diffusing directions of the water molecules. With DWI data, as in the commonly used diffusion tensor imaging (DTI) techniques, diffusivity values and the principal diffusion orientation can be estimated at each voxel. Since healthy axons contain intact myelin sheaths and tend to align in organized orientations, water diffusivity in a voxel tends to be preferentially along the direction of the axonal bundles. By inspecting the orientations of the diffusion tensors at neighboring voxels, axonal fiber bundles can be tracked. Successful axonal tracking can be used to understand the structural connections between brain regions [21, 25], and to assess axonal changes over time in applications such as brain maturation in young children [20], axonal degeneration in Alzheimer's disease [26], and potential axonal damage in traumatic brain injury [24]. However, successful tractography based on DWI data faces some fundamentally challenging demands, specifically the need for a high image signal-to-noise ratio (SNR), high spatial resolution, a relatively long scan time, the ability to resolve crossing fibers, full coverage of the tracts of interest, and the ability to track in regions with low diffusion anisotropy. To address the issue of crossing fibers, high angular resolution diffusion imaging (HARDI) [23] in DWI has gained some success. The issues related to neuronal fiber tractography in DWI motivated our research on integral curve estimation. The application of a diffusion gradient leads to low image SNR. The subject's motion as well as physiological signals within the relatively long scan time in DWI, with a typical range of 8 to 20 minutes, can further exacerbate the noise issue in DWI.
In this work, our goal is to understand the error propagation due to noise in neuronal fiber tractography under as few model assumptions as possible, leading to a nonparametric setup. Using the methods of dMRI one can also assess the likelihood that two regions are connected. This is of great scientific interest, since developing our understanding of the connective features of the brain would inevitably help in the planning of neurosurgeries. This knowledge would serve as a strong basis for the study of brain disorders, allowing research to build upon itself in a more fruitful manner. Creating an underlying physical connectivity map would serve as a highly useful foundation for the investigation of functional brain connectivity, i.e., fMRI. A more informed schematic of this structure on so small a scale could reveal potential patterns and insight into the proportions of the brain used for their respective purposes. Combining this with fMRI analysis, one could create a richer analysis of region interactions and a better understanding of their purposes. Indeed, continued analysis of brain microstructure should produce a trustworthy anatomy. These methods could potentially be developed further and applied to uncover nerve pathway interactions on a small scale.

Probabilistic fiber tractography [19] is one popular technique in DWI because it can assess the relative strength of fiber connections. However, this technique employs Monte Carlo sampling and bootstrap techniques, and depends on arbitrary prior parameter assumptions based on fully parametric models. Incorrect parameter assumptions will exacerbate the error due to the repeated Monte Carlo sampling. To reduce the need for parameter assumptions, Koltchinskii et al. [5] developed a theoretically more rigorous semiparametric approach for the simple vector model, and Carmichael and Sakhanenko investigated the "low-order" DTI model [2] and the "high-order" HARDI model [9]. With the latter approach, they demonstrated tighter confidence ellipsoids around the fibers, and their method is more robust in handling crossings of fibers than other DTI methods [10, 11]. Deterministic fiber tractography [22] is another popular technique in DWI, which does not provide an assessment of uncertainty such as confidence regions or occupancy probability estimates. Building on the improvement of the semiparametric approach in the work by Carmichael and Sakhanenko, we want to assess the fiber tracking performance based on a completely nonparametric model using HARDI data. Although the proposed technique is computationally intense, it avoids the typical limitations of deterministic tractography techniques, and recovers connectivity information similar to probabilistic tractography techniques.

1.1 STEJSKAL TANNER EQUATION

The general form of the Stejskal-Tanner equation relates the diffusion signal $E(x,b)$ to the diffusion tensor $D$ at a location $x \in G$ and for some applied magnetic gradient direction $b \in \mathbb{R}^3$:
\[ E(x,b) = \frac{R(x,b)}{R_0(x)} = e^{-c\, b^T D b}. \]
At a given location $x$, $R(x,b)$ denotes the relative amount of water diffusion along the spatial direction $b$, and $R_0(x)$ represents the amount of signal observed without any applied magnetic field. The ratio of the two gives a metric for the amount of diffusion at a given location and direction. This serves as a basis in dMRI methodology for the construction of any particular type of model.
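As a small numerical illustration of the attenuation relation above, the following sketch evaluates $e^{-c\,b^T D b}$ for a prolate tensor; the constant $c$ and the tensor entries are made-up values chosen only to mimic typical scales, not figures from this work.

```python
import numpy as np

# Illustration of E(x, b) = exp(-c * b^T D b); c and D are made-up values
# (c plays the role of the b-value/timing constant), not taken from the text.
c = 1000.0                                  # s/mm^2 scale, as with b = 1000
D = np.diag([1.7e-3, 0.3e-3, 0.3e-3])       # mm^2/s, fiber aligned with e1

for b in (np.array([1.0, 0.0, 0.0]),        # gradient parallel to the fiber
          np.array([0.0, 1.0, 0.0])):       # gradient perpendicular to it
    print(b, np.exp(-c * b @ D @ b))        # ~0.18 vs ~0.74: parallel attenuates most
```

The gradient aligned with the fiber produces the strongest signal loss, which is exactly the cue exploited below to locate dominant diffusion directions.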
The amount of diffusion in a particular direction at a given location $x \in G$ can be characterized by this diffusion tensor $D$. The above relationship between the signal and the diffusion tensor was first given by Stejskal and Tanner. According to this classical equation, we may report the observed log-loss of signal obtained from the DTI data (see [8]) as related to a diffusion tensor as follows:
\[ y(x,b) = -\log\Big(\frac{R(x,b)}{R_0(x)}\Big) = c\, b^T M(x) b, \tag{1.1} \]
where for each $x \in G$, $M(x)$ (in place of $D$) is a positive-definite symmetric matrix representing the 3D distribution of diffusion at the location $x$. This $M$ can be visualized as an ellipsoid describing diffusion at the given location: its principal axes are aligned with the eigenvectors of $M$, and its elongation along each axis is determined by the corresponding eigenvalue. Note this makes sense, since the process of water diffusion is Gaussian and the diffusion tensor represents the covariance matrix of the diffusion process at a particular point. This model is quite natural for a single fiber, where the ellipsoids would be strongly stretched out along the fiber, while in the grey matter most ellipsoids would be spheres.

1.2 Definitions/Preliminaries

Nonparametric statistics makes use of methods for which no assumption is placed on the data belonging to a parametric family of distributions. The need for these methods arises frequently in areas where little is known about the underlying data structure of interest (e.g., the setting of this paper). Considering the highly unknown structure of the brain, and the unavoidably high amounts of noise shrouding the details we can in fact uncover, it makes sense to approach the problem with fewer assumptions. One can then formulate a more theoretically rigorous foundation upon which to proceed with analysis. Before discussing specific modeling methods we introduce some basic concepts of the nonparametric methodology particular to kernel regression estimation.

In integral curve estimation we are concerned with estimating curves that can be represented as integrals of smooth functions. We have the general form of a regression model
\[ Y_i = f(X_i) + \sigma(X_i)\epsilon_i, \]
where the $\epsilon_i$ are i.i.d. random errors with $E\epsilon_i = 0$, and $\sigma > 0$ is a sufficiently smooth scaling function. The functions $f$ and $\sigma$ are unknown. We only assume that $f$ exists and is at least 'smooth' in some sense. The quantity of interest is the curve $x(t)$, known as the 'integral curve', which has the form
\[ x(t) = \int_0^t g(x(s))\,ds \]
for some smooth function $g$ dependent on $f$. Usually the $X_i$ are non-random points on a regular grid, but they can be approximated by an $X_i \sim \mathrm{Uniform}[0,1]$ setup, which makes the analysis slightly easier. Thus we observe $\{(X_i, Y_i)\}_{i=1}^n$, where $X_i \sim \mathrm{Uniform}[0,1]$ and the $Y_i$ are observed responses thought to be a function of the data. We write $Y_i = f(X_i) + \sigma(X_i)\epsilon_i$. The goal is then to first estimate $f$ on $[0,1]$, and then to estimate $\sigma$. The field of kernel regression estimation is interested in the function $\hat f_n$ for estimating $f$. Let us review some basics here. Consider again $X_1, \dots, X_n \sim \mathrm{Uniform}[0,1]$. If we construct a histogram to consider $f$, then a natural estimator of $f$ would be
\[ \hat f_n(x) = \frac{1}{nh}\sum_{i=1}^n I\Big(\frac{x - X_i}{h} \in [-1,1]\Big) Y_i, \]
where $h$ is called the bandwidth, or window-width; it is a smoothing parameter used to correct 'oversmoothing' or 'undersmoothing'. In this way we include all of the $X_i$ within $h$ of $x$. Alternatively one can introduce a weight function $w(x) = \frac{1}{2}I(|x| < 1)$. Our estimator would then have the form
\[ \hat f_n(x) = \frac{1}{nh}\sum_{i=1}^n w\Big(\frac{x - X_i}{h}\Big) Y_i. \]
Now one can go from uniform weights $w$ to arbitrary weights defined by $K$. More precisely, let $K(x)$ be a kernel function, which is a probability density satisfying the conditions (K):

(K1) $K$ is non-negative and symmetric about 0,
(K2) $\int_{-\infty}^{\infty} K(x)\,dx = 1$,
(K3) $\int_{-\infty}^{\infty} x K(x)\,dx = 0$,
(K4) $0 < \int_{-\infty}^{\infty} |x|^2 K(x)\,dx < \infty$.

Then rather than a simple weight function $w$, one may select $K$ to be Gaussian, for example. Then
\[ \hat f_n(x) = \frac{1}{nh_n}\sum_{i=1}^n K\Big(\frac{x - X_i}{h_n}\Big) Y_i, \]
where the subscript $n$ indicates that (as is usually the case) the bandwidth is a function of the sample size $n$.

A primary concern in regression function estimation is how to handle the error on a local versus global scale. Of course, there is a trade-off between minimizing the error locally and globally, and the problem of minimizing the bias and the variance of the estimator $\hat f$ of $f$ with respect to the bandwidth $h$ is delicate, since as $h$ increases so does the bias. Optimizing with respect to the mean squared error is the common practice. We consider the mean integrated squared error (MISE) between $f$ and its estimator $\hat f$, that is,
\[ \mathrm{MISE} = E\int \big(\hat f_n(x) - f(x)\big)^2\,dx. \]

1.3 Simple Vector Model

The first of these works explored by Sakhanenko et al. was the simple vector model (see [5]), in which the diffusion tensor $D$ was calculated at each location, characterized by a symmetric positive definite $3\times 3$ matrix. Recall one can envision this diffusion tensor as an ellipsoid whose principal eigenvector points in the direction of dominant diffusion, giving a measure of the path direction so that the fiber tract can be reconstructed in small steps. The method essentially uses Euler's method to reconstruct the curve locations. Stated explicitly, there is a vector field $v : G \to \mathbb{R}^3$ observed at uniformly i.i.d. locations $X_i \in G$ with i.i.d. random errors $\xi_i$, for which $E\xi = 0$ and $\mathrm{Cov}(\xi,\xi) = \Sigma$. The $\xi_i$ are taken to be independent of the locations $X_i$. The observations are
\[ (X_i, V_i) = (X_i, v(X_i) + \xi_i). \]
To trace the curves, we look at the Cauchy problem of solving the differential equation
\[ \frac{dx(t)}{dt} = v(x(t)), \quad t \ge 0, \quad x(0) = a \in G, \]
or equivalently,
\[ x(t) = a + \int_0^t v(x(s))\,ds. \]
If the vector field is a very simple one (e.g., a constant vector field), then integrating along the vector field gives an integral curve $x$ that is another regression function in a space of dimension one less than the dimension of the space of which $V_i$ is a member. Note that the set $G \subset \mathbb{R}^3$ is a bounded open set of Lebesgue measure one and represents some scaled region of the brain. The Nadaraya-Watson type estimator was used as an estimator of the vector field $v$ (see reference [5] of [5]):
\[ \hat V(x) = \hat V_n(x) = \frac{1}{nh^d}\sum_{i=1}^n K\Big(\frac{x - X_i}{h}\Big) V_i, \]
where the kernel $K$ satisfies the assumptions (K). The kernel $K$ is also defined on a region of bounded support wherein it is twice continuously differentiable. In particular, the estimate $\hat V(x) = 0$ outside a bounded neighborhood of $G$. Thus there is a plug-in estimate of the curve $x(t)$, defined as the solution $\hat X(t) = \hat X_n(t)$, $t \ge 0$, of the Cauchy problem
\[ \frac{d\hat X_n(t)}{dt} = \hat V_n(\hat X_n(t)), \quad t \ge 0, \quad \hat X_n(0) = a \in G, \]
or equivalently,
\[ \hat X_n(t) = a + \int_0^t \hat V_n(\hat X_n(s))\,ds. \]
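The following minimal sketch instantiates the one-dimensional kernel regression estimator written above, with a Gaussian kernel and a uniform design; the test function and the noise level are our own illustrative choices.

```python
import numpy as np

# Minimal sketch of the kernel regression estimator above (1-D, uniform
# design on [0,1], so the design density is 1 and no ratio is needed).
rng = np.random.default_rng(0)

n, h = 1000, 0.05
X = rng.uniform(0.0, 1.0, n)                 # design: X_i ~ Uniform[0,1]
f = lambda x: np.sin(2 * np.pi * x)          # "unknown" regression function
Y = f(X) + 0.2 * rng.standard_normal(n)      # Y_i = f(X_i) + sigma * eps_i

K = lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)   # Gaussian kernel

def fhat(x):
    # (1/(n h)) * sum_i K((x - X_i)/h) * Y_i, cf. the display above
    return np.mean(K((x - X) / h) * Y) / h

for x in np.linspace(0.1, 0.9, 5):
    print(f"x={x:.2f}  fhat={fhat(x):+.3f}  f={f(x):+.3f}")
```

The estimates track $f$ at interior points; shrinking $h$ with $n$ trades bias against variance exactly as in the MISE discussion above.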
1.4 DTI Model

A more complete account of how noise in DTI data impacts fiber trajectory estimates is provided by the low-order DTI model. As before, the main goal is to estimate the curve $x(t)$, only now driven by the vector field $v(M(x))$ for $x \in G$, where $M$ is a tensor field and $M(x)$ represents the calculated tensor at the location $x$. Thus $v(M(x))$ is a tensor-driven vector field. The observations are modeled according to the Stejskal-Tanner equation (1.1) with a heteroscedastic noise function $\sigma > 0$:
\[ y(x,b) = -\log\Big(\frac{R(x,b)}{R_0(x)}\Big) = c\, b^T M(x) b + \sigma(x,b)\xi. \]
Thus, given $N$ magnetic field gradient directions $b_1, \dots, b_N$ we have the elements $y(x,b_j)$, $j = 1, \dots, N$, that make up the vector $Y(x)$ for a location $x \in G$, modeled by
\[ Y(x) = B M(x) + \Sigma^{1/2}(x)\,\Xi_x. \tag{1.2} \]
Here the fixed matrix $B$ is related to the set of diffusion gradient directions and timing parameters of the imaging procedure. For DTI, we require at least $N = 6$ directions. Denoting each diffusion directional vector $b$ by $b = (b^{(1)}, b^{(2)}, b^{(3)})$, we then have for these $N$ directions $b_i$ the fixed matrix
\[ B = \begin{pmatrix} (b_1^{(1)})^2 & 2b_1^{(1)}b_1^{(2)} & 2b_1^{(1)}b_1^{(3)} & (b_1^{(2)})^2 & 2b_1^{(2)}b_1^{(3)} & (b_1^{(3)})^2 \\ (b_2^{(1)})^2 & 2b_2^{(1)}b_2^{(2)} & 2b_2^{(1)}b_2^{(3)} & (b_2^{(2)})^2 & 2b_2^{(2)}b_2^{(3)} & (b_2^{(3)})^2 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ (b_N^{(1)})^2 & 2b_N^{(1)}b_N^{(2)} & 2b_N^{(1)}b_N^{(3)} & (b_N^{(2)})^2 & 2b_N^{(2)}b_N^{(3)} & (b_N^{(3)})^2 \end{pmatrix}, \]
and in (1.2) the tensor $M(x)$ is stacked into a vector whose entry $M^{(i,j)}$ represents a measure of the amount of diffusion in the $(i,j)$ direction. To be clear, for example, the $(1,1)$ direction would be the direction $(1,0,0)$ and the $(1,2)$ direction would correspond to the direction $(1,1,0)$. Likewise the $(1,3)$ direction would correspond to the directional vector $(1,0,1)$. Note each of these matrices in the field is an average within a voxel located at $x$ that has been corrupted with noise. $\Sigma$ is an $N \times N$ symmetric positive definite tensor with entries $\Sigma_{ij}(x) = \mathrm{cov}(\sigma(x,b_i), \sigma(x,b_j))$. The $N \times 1$ tensor $\Xi_x$ is a vector of random noise with entries $\xi_j$, $j = 1, \dots, N$.

This second-order tensor $M$ describing the diffusion locally at each $x$ can be represented by a $3\times 3$ positive-definite symmetric matrix:
\[ M(x) = \begin{pmatrix} M^{(1,1)}(x) & M^{(1,2)}(x) & M^{(1,3)}(x) \\ M^{(2,1)}(x) & M^{(2,2)}(x) & M^{(2,3)}(x) \\ M^{(3,1)}(x) & M^{(3,2)}(x) & M^{(3,3)}(x) \end{pmatrix}. \]
The Cauchy problem of solving the ODE for the curve trajectory is now given by
\[ \frac{dx(t)}{dt} = v(M(x(t))), \quad t \ge 0, \quad x(0) = a \in G, \]
or equivalently,
\[ x(t) = a + \int_0^t v(M(x(s)))\,ds. \]
The vector field $v(M(x))$ consists of the leading eigenvectors of the tensors $M(x)$ for each $x \in G$: the direction of diffusion is the eigenvector corresponding to the maximal eigenvalue of $M(x)$. For a point $X_j \in G$ we estimate $M(X_j)$ using the ordinary least squares estimator
\[ \tilde M(X_j) = (B^T B)^{-1} B^T Y(X_j), \]
provided that $(B^T B)^{-1}$ exists. We use a kernel smoothing method to estimate $M(x)$ at locations $x \in G$ between our observations $X_j$:
\[ \hat M_n(x) = \frac{1}{nh_n^3}\sum_{j=1}^n K\Big(\frac{x - X_j}{h_n}\Big)\tilde M(X_j), \]
where $K$ is a kernel function and $h_n$ is a bandwidth. We then compute the eigenvectors and eigenvalues $v(\hat M_n(x))$ and $\lambda(\hat M_n(x))$ of $\hat M_n(x)$. These are our estimators of the true eigenvectors of $M(x)$. The eigenvector $v(\hat M_n(x))$ with the corresponding maximal eigenvalue $\lambda(\hat M_n(x))$ gives the solution for the estimate $\hat X_n(t)$ of $x(t)$:
\[ \frac{d\hat X_n(t)}{dt} = v(\hat M_n(\hat X_n(t))), \quad t \ge 0, \quad \hat X_n(0) = a. \]
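A short sketch of the OLS step just described: it builds the design matrix $B$ row by row from gradient directions and recovers the six free entries of $M$. The directions, the true tensor entries, and the noise level are illustrative assumptions.

```python
import numpy as np

# Sketch: build the DTI design matrix B from N gradient directions and
# recover the six free entries of M via M~ = (B^T B)^{-1} B^T Y.
rng = np.random.default_rng(1)

def design_row(b):
    b1, b2, b3 = b
    # ordering matches the matrix B in the text:
    # (b1^2, 2 b1 b2, 2 b1 b3, b2^2, 2 b2 b3, b3^2)
    return np.array([b1*b1, 2*b1*b2, 2*b1*b3, b2*b2, 2*b2*b3, b3*b3])

N = 12
dirs = rng.standard_normal((N, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # unit gradient directions
B = np.vstack([design_row(b) for b in dirs])          # N x 6 design matrix

M_true = np.array([1.7, 0.1, 0.0, 0.4, 0.0, 0.3])     # (M11,M12,M13,M22,M23,M33)
Y = B @ M_true + 0.01 * rng.standard_normal(N)        # noisy log-loss vector

M_hat, *_ = np.linalg.lstsq(B, Y, rcond=None)         # OLS solve
print(np.round(M_hat, 3))                             # close to M_true
```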
The low-order DTI approach utilizes a second-order tensor. This method requires one to assume that there is only one fiber present per voxel, that is, that only one fiber extends outward from each cubic region in consideration, rather than branching, touching, or 'kissing' of two or more fibers. These situations require a richer sampling of the directional diffusion space. This gives rise to the need for the higher-order model known as HARDI, or high angular resolution diffusion imaging, in which a tensor of order greater than two is utilized.

1.5 HARDI Model

Under the higher-order tensor approach the Stejskal-Tanner equation becomes
\[ \log\Big(\frac{R(x,b)}{R_0(x)}\Big) = -c\sum_{i_1=1}^3\cdots\sum_{i_M=1}^3 D_{i_1\ldots i_M}(x)\, b_{i_1}\cdots b_{i_M} + \sigma(x,b)\xi_b, \]
where $b$ is the vector in $\mathbb{R}^3$ denoting the applied magnetic field gradient direction on the unit sphere and $c$ is a constant that depends on the parameters of the imaging procedure. The numbers $D_{i_1\ldots i_M}(x)$ are the components of the high-order diffusion tensor $D(x)$, which is a positive definite supersymmetric $\underbrace{3\times\cdots\times 3}_{M\ \text{times}}$ tensor. Due to symmetry, $D(x)$ can be represented by a vector $D(x) \in \mathbb{R}^{J_M}$, where $J_M = (M+1)(M+2)/2$. Thus at locations $x \in G$ we observe log-losses of signal
\[ Y(x) = \log\Big(\frac{R(x,b)}{R_0(x)}\Big) \in \mathbb{R}^N, \]
stacked into the vector $Y$ for all directions $b$. Now the diffusion signal is modeled as
\[ Y(x) = B D(x) + \Sigma^{1/2}(x)\,\Xi_x, \tag{1.3} \]
where again we require $N \ge 6$. $B \in \mathbb{R}^{N\times J_M}$ is the fixed matrix whose components are combinations of the diffusion directions as in the DTI model. $\Xi_x \in \mathbb{R}^N$ is random noise and $\Sigma(x)$ is an $N\times N$ symmetric positive definite tensor.

When $M = 2$ we have the usual eigenvalue problem. Computing the eigenvalues and eigenvectors of a high-order tensor requires a different approach than the usual eigenvalue problem; for details see [13]. We outline the basic idea here. First, the rank $R$ of the high-order tensor $D$ is the minimal number $R$ for which
\[ D = \sum_{k=1}^R \underbrace{v_k \otimes\cdots\otimes v_k}_{M\ \text{times}} \quad \text{for some } v_1,\dots,v_R \in \mathbb{R}^3, \]
where $u \otimes w = u w^T$ denotes the outer product, $(u\otimes w)_{ij} = u_i w_j$. In other words, $R$ is the minimal number of vector outer products needed to sum to $D$.

As we are now considering the possibility of multiple fibers per voxel, we consider the tensor $D^{(r)}$ describing fiber $r$, for $r = 1, \dots, R$. The best rank-1 approximation of the tensor $D^{(1)} = D$ is $\lambda^{(1)} v^{(1)}\otimes\cdots\otimes v^{(1)}$, where $\lambda^{(1)} > 0$ and $v^{(1)}$ is a unit vector; $R$ is the minimum number of rank-1 tensors that sum to $D$. For $k = 2, \dots, R$, the pair $(\lambda^{(k)}, v^{(k)})$ makes up the best rank-1 approximation of the tensor
\[ D^{(k)} = D^{(k-1)} - \lambda^{(k-1)} v^{(k-1)}\otimes\cdots\otimes v^{(k-1)}, \]
where the best rank-1 approximation $(\lambda, v)$ minimizes the Frobenius norm
\[ \sum_{i_1=1}^3\cdots\sum_{i_M=1}^3\big(D^{(k)}_{i_1\cdots i_M} - \lambda v_{i_1}\cdots v_{i_M}\big)^2. \]
$\lambda^{(1)},\dots,\lambda^{(R)}$ are called the pseudo-eigenvalues and $v^{(1)},\dots,v^{(R)}$ the pseudo-eigenvectors of the tensor $D$.

For locations $X_j \in G$, we estimate $D(X_j)$ using the ordinary least squares estimator
\[ \tilde D(X_j) = (B^T B)^{-1} B^T Y(X_j), \quad j = 1,\dots,n, \]
or the weighted LSE
\[ \tilde D(X_j) = (B^T\Sigma^{-1}(X_j)B)^{-1} B^T\Sigma^{-1}(X_j) Y(X_j), \quad j = 1,\dots,n. \]
As before we use a kernel smoothing method to estimate $D$ at locations $x \in G$ between our observations $X_j$:
\[ \hat D_n(x) = \frac{1}{nh_n^d}\sum_{j=1}^n K\Big(\frac{x - X_j}{h_n}\Big)\tilde D(X_j), \]
where $K$ is a kernel function and $h_n$ is a bandwidth.
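To make the pseudo-eigen decomposition concrete, here is a sketch of the best rank-1 approximation for a supersymmetric order-4 tensor via the higher-order power method, with deflation as described above. The iteration scheme is the standard one from the tensor literature (cf. [13]); the two-fiber tensor is a made-up example.

```python
import numpy as np

# Best rank-1 approximation of a supersymmetric order-4 tensor via the
# higher-order power method, then deflation, as in the text.
rng = np.random.default_rng(2)

def sym_rank1(v):
    # v (x) v (x) v (x) v, the order-4 symmetric outer product
    return np.einsum('i,j,k,l->ijkl', v, v, v, v)

def best_rank1(D, iters=200):
    v = rng.standard_normal(3)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = np.einsum('ijkl,j,k,l->i', D, v, v, v)   # contract 3 of 4 indices
        v = w / np.linalg.norm(w)
    lam = np.einsum('ijkl,i,j,k,l->', D, v, v, v, v)
    return lam, v

# Two "fibers": dominant along e1, weaker along e2.
e1, e2 = np.eye(3)[0], np.eye(3)[1]
D = 2.0 * sym_rank1(e1) + 1.0 * sym_rank1(e2)

lam1, v1 = best_rank1(D)
print(lam1, np.round(v1, 3))                       # ~2.0 and +-e1
lam2, v2 = best_rank1(D - lam1 * sym_rank1(v1))    # deflate, as in the text
print(lam2, np.round(v2, 3))                       # ~1.0 and +-e2
```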
Given $\hat D_n(x)$, $x \in G$, we calculate its pseudo-eigenvalues $\hat\lambda^{(r)}_n(x)$ and pseudo-eigenvectors $\hat v^{(r)}_n(x)$ for $r = 1,\dots,R$ by minimizing the Frobenius norm above. Then we have our $r$-th curve estimate
\[ \frac{d\hat x^{(r)}_n(t)}{dt} = \hat v^{(r)}_n(\hat x^{(r)}_n(t)), \quad t \ge 0, \quad \hat x^{(r)}_n(0) = a. \]
It is the curve whose tangents are the pseudo-eigenvectors $\hat v^{(r)}_n$. In practice we trace it in small steps in the direction $\hat v^{(r)}_n$, starting at the location $a$.

1.6 Main Result

The primary result is that for each of the above models one can establish the convergence of the properly normalized deviation process $\sqrt{nh^{d-1}}\,(\hat X_n(t) - x(t))$ to a vector-valued Gaussian process $G(t)$ on $[0,T]$, i.e., the normalized deviation process converges in the space $C[0,T]$. Its mean and covariance are functions of the vector field estimates $\hat v$ or, in the case of the nonparametric approach discussed in this paper, $\hat f$, and in both cases of the noise $\Sigma$. Of course, the dimension $d$ is usually $d = 3$. This result allows one to provide hypothesis tests of whether two regions are connected. It also allows for the production of the popular probabilistic tractography p-value maps, and the construction of statistics such as $\inf_{t\in[0,T]}|\hat X(t) - z|^2$ that allow the consideration of the curve reaching a location of interest $z$. These are useful because they allow scientific users of such data to easily perform tests for the likelihood of regions being connected without requiring familiarity with the background processes.

CHAPTER 2 ESTIMATION & MAIN RESULT

2.1 Preliminaries

In the current scenario we wish to model the signal completely nonparametrically (note that the previous approaches were semiparametric). We consider the problem of modeling the diffusion signal with a general function $f$ with heteroscedastic error in the nonparametric regression setup:
\[ Y_{ijk} = f(X_i,\varphi_j,\theta_k) + S(X_i,\varphi_j,\theta_k)\,\epsilon_{ijk}, \]
where $(\varphi_j,\theta_k)$ are the applied directions on the unit sphere. Recall the Stejskal-Tanner equation (1.1). Under this regime we have
\[ Y_{ijk} = -\log\Big(\frac{R(x_i, b_{jk})}{R_0(x_i)}\Big) = f(x_i,\varphi_j,\theta_k) + S(x_i,\varphi_j,\theta_k)\,\epsilon_{ijk}, \tag{2.1} \]
where $R, R_0$ are as before and $b_{jk}$ is the vector in $\mathbb{R}^3$ corresponding to the magnetic field gradient applied in the direction $(\varphi_j,\theta_k)$ on the unit sphere. We also include the noise $S(x_i,\varphi_j,\theta_k)\epsilon_{ijk}$ in the Stejskal-Tanner equation; $S$ describes the covariance structure of the noise at a given location and direction. The $\epsilon_{ijk}$ are standardized error terms satisfying the following assumptions ($\Sigma$):

($\Sigma$1) $E\epsilon_{ijk} = 0$, $E\epsilon^2_{ijk} = 1$, and $E\epsilon_{i_1jk}\epsilon_{i_2jk} = \Sigma_{i_1i_2}$ for all $i, j, k, i_1, i_2$.
($\Sigma$2) $\{\epsilon_{i_1j_1k_1}\}$ and $\{\epsilon_{i_2j_2k_2}\}$ are independent when $j_1 \ne j_2$ or $k_1 \ne k_2$.

Let $\epsilon_{jk} := (\epsilon_{1jk},\dots,\epsilon_{njk})$ and $\Sigma^{jk}_{i_1i_2} = \mathrm{cov}(\epsilon_{i_1jk},\epsilon_{i_2jk})$, so that
\[ \mathrm{Var}(\epsilon_{jk}) = \Sigma^{jk} = \begin{pmatrix} \Sigma^{jk}_{11} & \ldots & \Sigma^{jk}_{1n} \\ \vdots & & \vdots \\ \Sigma^{jk}_{n1} & \ldots & \Sigma^{jk}_{nn} \end{pmatrix}. \]
Naturally, the degree of diffusion at the locations $x_{i_1}$ and $x_{i_2}$ exhibits short-range dependency for a given direction $(\varphi_j,\theta_k)$, so that the covariance matrix consists of a high number of zero or near-zero entries; for example, it could be diagonal or banded. This leads to the sparsity of $\Sigma$ described in the condition ($\Sigma$3):

($\Sigma$3) For some $\kappa > 0$, as $n \to \infty$ we have $h_n^3\sum_{k_1\ne k_2}\Sigma_{k_1k_2} \to \kappa$, where the sequence $h_n \to 0$ will be introduced in the main theorem.

As the diffusion is strongest at a location $x_i$ when $R$ is smallest (indeed, the order of magnitude of $R$ depends on the degree of proton disalignment), to search for the direction of dominant diffusion we seek to maximize the function $f$ in (2.1).
As in the previous works, this will drive the vector field of curve tangents, $v$. For each $x \in G$ we search for the direction $(\varphi^*(x),\theta^*(x))$ on the unit sphere for which
\[ \frac{\partial f}{\partial\varphi}(x,\varphi^*(x),\theta^*(x)) = \frac{\partial f}{\partial\theta}(x,\varphi^*(x),\theta^*(x)) = 0, \qquad \frac{\partial^2 f}{\partial\varphi^2}(x,\varphi^*(x),\theta^*(x)) < 0, \qquad \frac{\partial^2 f}{\partial\theta^2}(x,\varphi^*(x),\theta^*(x)) < 0, \tag{2.2} \]
\[ \frac{\partial^2 f}{\partial\varphi^2}\frac{\partial^2 f}{\partial\theta^2}(x,\varphi^*(x),\theta^*(x)) - \Big(\frac{\partial^2 f}{\partial\theta\partial\varphi}(x,\varphi^*(x),\theta^*(x))\Big)^2 > 0. \]
This indicates that $(\varphi^*(x),\theta^*(x))$ is the direction of the maximum gradient flow at $x$. We use this direction to define the ODE for which the curve $x^*$ is a solution:
\[ \frac{dx^*(t)}{dt} = v^*(x^*(t)) := \begin{pmatrix} \sin\theta^*(x^*(t))\cos\varphi^*(x^*(t)) \\ \sin\theta^*(x^*(t))\sin\varphi^*(x^*(t)) \\ \cos\theta^*(x^*(t)) \end{pmatrix}, \qquad x^*(0) = x_0. \tag{2.3} \]
This integral curve models an axonal fiber $x^*(t)$, $t \ge 0$, starting at $x_0$ and flowing along the maximal gradient direction. The location $x_0$ is typically given as a seed point located in a region of interest. We assume condition (F): the function $f$ is twice continuously differentiable in $G\times[-\pi,\pi]\times[0,\pi]$. Then the uniqueness of the direction is guaranteed for some open subset of $G$ containing $x_0$, and this implicitly determines the end time $t = T$. Thus, the goal is to estimate $x^*(t)$, $t\in[0,T]$, based on the dataset $(Y_{ijk}, X_i, \varphi_j, \theta_k)$, $i = 1,\dots,n$, $j,k = 1,\dots,N$.

A natural estimation procedure consists of estimating $f$ by some $\hat f_n$, then finding the direction of maximum estimated diffusion $(\hat\varphi^*_n(x),\hat\theta^*_n(x))$ as the pair that maximizes $\hat f_n(x,\varphi,\theta)$ at a given location $x$, followed by the ODE (2.3) where the unknown true $x^*, \varphi^*, \theta^*$ are replaced by their respective estimators $\hat x^*_n, \hat\varphi^*_n, \hat\theta^*_n$. Thus the estimated integral curve is defined as the solution of the ODE
\[ \frac{d\hat x^*_n(t)}{dt} = \hat v^*_n(\hat x^*_n(t)) = \begin{pmatrix} \sin\hat\theta^*_n(\hat x^*_n(t))\cos\hat\varphi^*_n(\hat x^*_n(t)) \\ \sin\hat\theta^*_n(\hat x^*_n(t))\sin\hat\varphi^*_n(\hat x^*_n(t)) \\ \cos\hat\theta^*_n(\hat x^*_n(t)) \end{pmatrix}, \qquad \hat x^*_n(0) = x_0. \]
Before introducing our estimator $\hat f_n$ we make a few comments on the nature of Simpson's integral approximation scheme, which will be used later.

2.2 Simpson's Rule

For a function $g$ on an interval $[a,b]$ we can approximate the area under the curve of $g$ from $a$ to $b$ using Simpson's rule. Rather than the trapezoidal rule, or the left or right Riemann sum, which use lines or constants, Simpson's scheme uses quadratic polynomials to approximate the function $g$. That is, for $x_i$, $i = 0,\dots,n$, forming a partition of $[a,b]$ with $x_0 = a$ and $x_n = b$, quadratic polynomials $P(x) = px^2 + qx + v$, $p,q,v \in \mathbb{R}$, can approximate $g$. $P$ is often called a second-order interpolating polynomial because it can be chosen such that it passes through the points in consideration. In particular, the Lagrange polynomials defined as
\[ P(x) = \sum_{k=1}^n p_k(x), \qquad p_k(x) = g(x_k)\prod_{\substack{j=1\\ j\ne k}}^n\frac{x - x_j}{x_k - x_j}, \]
pass through the curve $g$ at the points $x_i$, $i = 1,\dots,n$. Dividing $[a,b]$ into an even number $n$ of intervals, letting $\Delta x = \frac{b-a}{n}$, and requiring
\[ P(x_i) = g(x_i), \quad i = 0,1,\dots,n, \]
enables one to show that
\[ \int_a^b P(x)\,dx = \frac{\Delta x}{3}\big(g(a) + 4g(x_1) + 2g(x_2) + 4g(x_3) + \cdots + 4g(x_{n-1}) + g(b)\big), \]
and this result is used as an approximation to the integral:
\[ \int_a^b g(x)\,dx \approx \int_a^b P(x)\,dx. \]
An easy case demonstrating this is to take $[a,b] = [-h,h]$ and $n = 2$. We will use the Simpson rule in our estimator $\hat f_n$ when dealing with the angular variables.
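A quick numerical check of the composite rule just quoted, on a toy integral; the weight pattern $(1, 4, 2, 4, \dots, 4, 1)\cdot\Delta x/3$ is the same pattern reused (as the coefficients $a_m, b_l$) inside the estimator of Section 2.3.

```python
import numpy as np

# Composite Simpson rule on [a, b] with n (even) subintervals.
def simpson(g, a, b, n):
    x = np.linspace(a, b, n + 1)
    w = np.ones(n + 1)
    w[1:-1:2], w[2:-1:2] = 4.0, 2.0
    return (b - a) / (3 * n) * np.sum(w * g(x))

exact = 1.0 - np.cos(1.0)          # integral of sin over [0, 1]
for n in (2, 4, 8, 16):
    err = abs(simpson(np.sin, 0.0, 1.0, n) - exact)
    print(n, err)                  # error shrinks like n^{-4}, cf. Proposition 2
```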
2.3 Estimation

Our estimator $\hat f_n(x,\varphi,\theta)$ is a combination of multivariate kernel smoothing over $x$ with Simpson's scheme for the numerical approximation of an integral with respect to $\varphi$ and $\theta$. More precisely, introduce
\[ \hat f_n(x,\varphi,\theta) = \frac{1}{nh_n^3}\sum_{m=0}^{2N_\theta}\sum_{l=0}^{2N_\varphi}\sum_{k=1}^n \frac{a_m}{h_\theta}\frac{b_l}{h_\varphi}\, K_\theta\Big(\frac{\theta - \theta_m}{h_\theta}\Big)\, K_\varphi\Big(\frac{\varphi - \varphi_l}{h_\varphi}\Big)\, K\Big(\frac{x - X_k}{h_n}\Big)\, Y_{klm}, \tag{2.4} \]
where $a_m, b_l$ are the coefficients in Simpson's scheme, and $h_n, h_\varphi, h_\theta$ are bandwidths for the kernels $K, K_\varphi, K_\theta$, respectively, to be optimized later. We have the following conditions (K) on the kernels:

(K1) $K$ is a symmetric probability density in $\mathbb{R}^d$ with bounded support,
(K2) $\int K(u)\,du = 1$, $\int K'(u)\,du = 0$, $\int uK(u)\,du = 0$, $\int|u|^2K(u)\,du < \infty$.

These conditions hold for the kernels $K$, $K_\varphi$, and $K_\theta$ in $\mathbb{R}^3$, $\mathbb{R}$, and $\mathbb{R}$, respectively. Additionally we assume

(K3) $\mathrm{Supp}(K_\varphi) = [-c,c] \subset [-N,N]$, $\mathrm{Supp}(K_\theta) = [-d,d] \subset [-N,N]$.

Recall $N^2$ is the number of equally spaced angles for the observed directions $(\varphi_j,\theta_k)$ on the unit sphere. These conditions are quite typical in kernel density estimation practice. The following proposition will be useful in establishing the asymptotic unbiasedness of the estimators of the derivatives of the signal function $f$.

Proposition 1. For any kernel $K$ satisfying conditions (K) and any function $g$ twice continuously differentiable on compact sets, we have as $h \to 0$:
\[ \int h^{-d}K\Big(\frac{u - u_0}{h}\Big)g(u)\,du = g(u_0) + 0.5h^2 g''(u_0)K_2(1 + o(1)), \]
where $K_2 = \int|u|^2K(u)\,du$.

Next we discuss Simpson's scheme for the integral of a real function $g$ on a bounded interval $[a,b]$. Define
\[ S_N(g(u_0),\dots,g(u_{2N}),a,b) = \frac{1}{6N}\big[g(u_0) + g(u_{2N})\big] + \frac{2}{3N}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{3N}\sum_{m=1}^{N-1} g(u_{2m}), \]
with $u_m = a + \frac{m}{2N}(b-a)$, $m = 0,1,\dots,2N$. The following result is well known in the literature.

Proposition 2. For any four times continuously differentiable function $g$ defined on the interval $[a,b]$ we have as $N\to\infty$
\[ \int_a^b g(u)\,du = S_N(g(u_0),\dots,g(u_{2N}),a,b) - \frac{(b-a)^5}{180}\,g^{IV}(u^*)\frac{1}{(2N)^4} \]
with some $u^*$ in $[a,b]$, where $g^{IV}$ is the fourth-order derivative of $g$.

These two propositions together indicate that $E\hat f_n(x,\varphi,\theta) = f(x,\varphi,\theta) + o(1)$ as $h, h_\varphi, h_\theta \to 0$ and $n, N_\varphi, N_\theta \to \infty$. We establish the exact nature of the remainder term in the proofs chapter.

Our curve estimators are obtained through the estimation of $\hat f_n$. The direction $(\varphi,\theta)$ for which $\hat f_n$ is maximized yields the tangent vector $\hat v_n$. We apply Euler's method to obtain the curve estimators using this tangent vector field, meaning that we follow the vector field $\hat v_n$ in small steps. We establish the convergence of the properly normalized deviation between our estimator and the true underlying curve, $\hat X_n(t) - x(t)$, $t\in[0,T]$, to a Gaussian process. We derive test procedures for testing whether two regions of the brain are connected. This allows us to construct so-called p-value maps that can rival the probability occupancy maps often used in probabilistic tractography methods to make inference about the likelihood of connected regions.

There are two cases for each of the bandwidths:

(I) $h_n = O(n^{-1/2})$, $h_\varphi = h_\theta = h = O(n^{-1/2})$, $N_\varphi = N_\theta = N \ge \nu(n) = O(n^2)$ as $n \to \infty$; or

(II) $h_n = O(N^{-1/5}n^{-1/10})$, $h_\varphi = h_\theta = h = O(N^{-1/5}n^{-1/10})$, $N_\varphi = N_\theta = N = o(n^2)$ as $n \to \infty$.

In practice the number of gradient directions $N^2$ is much smaller (typically around 64) than the number of voxels $n$ (on the order of $10^6$) in a typical HARDI dataset, so case (II) is more practical; it is what we use in the simulation study with simulated and real data.
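The following compact sketch instantiates (2.4): a kernel average over voxel locations combined with Simpson-weighted kernel sums over the angular grid. We fold the interval length into the Simpson weights, a normalization convention of this sketch rather than a claim about the dissertation's exact $a_m, b_l$; all sizes and bandwidths are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def simpson_weights(a, b, two_n):
    # Composite Simpson weights on 2N subintervals; they sum to (b - a).
    step = (b - a) / two_n
    w = np.full(two_n + 1, 2.0)
    w[1::2] = 4.0
    w[0] = w[-1] = 1.0
    return w * step / 3.0

K3 = lambda u: np.exp(-0.5 * np.sum(u**2, axis=-1)) / (2 * np.pi)**1.5
K1 = lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def fhat(x, phi, theta, X, phis, thetas, Y, hn, hphi, htheta):
    # X: (n,3) voxel locations; phis: (2Nphi+1,); thetas: (2Ntheta+1,);
    # Y: (n, 2Nphi+1, 2Ntheta+1) observed log-losses, cf. (2.4).
    kx = K3((x - X) / hn) / (len(X) * hn**3)
    kp = simpson_weights(phis[0], phis[-1], len(phis) - 1) \
         * K1((phi - phis) / hphi) / hphi
    kt = simpson_weights(thetas[0], thetas[-1], len(thetas) - 1) \
         * K1((theta - thetas) / htheta) / htheta
    return np.einsum('k,l,m,klm->', kx, kp, kt, Y)

# smoke test on noise-free data f(x, phi, theta) = cos(theta)
n, Np, Nt = 20000, 8, 8
X = rng.uniform(0.0, 1.0, (n, 3))
phis = np.linspace(-np.pi, np.pi, 2 * Np + 1)
thetas = np.linspace(0.0, np.pi, 2 * Nt + 1)
Y = np.broadcast_to(np.cos(thetas), (n, 2 * Np + 1, 2 * Nt + 1)).copy()
print(fhat(np.array([0.5, 0.5, 0.5]), 0.0, 1.0, X, phis, thetas, Y,
           hn=0.25, hphi=0.4, htheta=0.4))  # near cos(1) ~ 0.54, up to smoothing bias
```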
2.4 Main Result for Asymptotic Normality of Deviation Process

Lemma 1. Suppose that (I) or (II) holds. Then uniformly in $t\in[0,T]$, $\hat X^*_n(t)$ is a consistent estimator of $x^*(t)$. That is,
\[ \sup_{t\in[0,T]}|\hat X^*_n(t) - x^*(t)| \stackrel{P}{\to} 0 \quad \text{as } n\to\infty. \]

Theorem 1. Under case (I), $n(\hat X_n(t) - x(t)) \stackrel{D}{\Rightarrow} G(t)$ in the space $C[0,T]$, where the Gaussian process $G$ has mean function $\mu(t)$ and covariance function $C(t_1,t_2)$ introduced in the next section. Under case (II), $n^{1/5}N^{2/5}(\hat X_n(t) - x(t)) \stackrel{D}{\Rightarrow} G_0(t)$ in $C[0,T]$, where the centered Gaussian process $G_0$ has covariance function $C_0(t_1,t_2)$ introduced in the next section.

We can then show by the delta method the following result.

Theorem 2. Suppose the conditions of the previous theorem hold under case (II). Moreover, suppose there exists a unique point $\tau\in(0,T)$ such that $\min_{t\in[0,T]}|x^*(t) - z|^2 = |x^*(\tau) - z|^2$. If $x^*(\tau) \ne z$ then the sequence
\[ n^{1/5}N^{2/5}\Big[\min_{t\in[0,T]}|\hat X^*_n(t) - z|^2 - |x^*(\tau) - z|^2\Big] \]
is asymptotically normal with zero mean and variance $4(x^*(\tau) - z)^T C_0(\tau,\tau)(x^*(\tau) - z)$. If $x^*(\tau) = z$ then the sequence $n^{2/5}N^{4/5}\min_{t\in[0,T]}|\hat X^*_n(t) - z|^2$ converges in distribution to the random variable $|Z|^2 - \big(\frac{d}{dt}x^*(\tau)^T Z\big)^2$, where $Z$ is a normal random vector with zero mean and covariance $C_0(\tau,\tau)$.

This allows us to construct hypothesis tests for the likelihood of different regions being connected as well as the so-called probabilistic tractography p-value maps (see [2], [5], [9]).

CHAPTER 3 MEAN AND COVARIANCE OF THE LIMITING GAUSSIAN PROCESSES

3.1 Preliminaries

Before we introduce the step-by-step algorithm for computing the estimated integral curve together with confidence ellipsoids, we need to discuss the mean and covariance functions of the limiting Gaussian processes, which in turn require the introduction of several auxiliary functions. Define the integrals
\[ \Psi(z) = \int K(u)K(z+u)\,du, \qquad \Psi'(z) = \int K(u)K'(z+u)\,du. \]
Then integration by parts of the above gives
\[ \Psi''(z) = -\int K'(u)K'(z+u)\,du. \]
For instance, for a standard Gaussian kernel $\Psi(z) = \frac{e^{-|z|^2/4}}{(2\sqrt{\pi})^3}$. Now out of the kernels $K_\varphi$ and $K_\theta$ build the matrix
\[ \Psi_0 = \begin{pmatrix} \Psi''_\varphi(0)\Psi_\theta(0) & \Psi'_\varphi(0)\Psi'_\theta(0) \\ \Psi'_\varphi(0)\Psi'_\theta(0) & \Psi_\varphi(0)\Psi''_\theta(0) \end{pmatrix}. \]
If both $K_\varphi$ and $K_\theta$ are Gaussian then $\Psi_0 = -\mathrm{diag}\big(\frac{1}{8\pi},\frac{1}{8\pi}\big)$. For $v\in\mathbb{R}^3$ define
\[ \psi(v) = \int_{\mathbb{R}}\Psi(-\tau v)\,d\tau, \]
which equals $\frac{1}{4\pi|v|}$ for a standard Gaussian kernel. Also introduce
\[ \Psi_0(v,x) = \int\big(\Psi(-v\tau) + \kappa\big)\begin{pmatrix} -\Psi''_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) & \Psi'_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi'_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) \\ \Psi'_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi'_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) & -\Psi_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi''_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) \end{pmatrix} d\tau. \]
In the case of a standard Gaussian kernel,
\[ \Psi_0(v,x) = -\frac{1}{D(x)}\begin{pmatrix} 1 + \big(\frac{\partial\theta^*}{\partial x}(x)\big)^2 & -\big(\frac{\partial\varphi^*}{\partial x}(x)\big)\big(\frac{\partial\theta^*}{\partial x}(x)\big) \\ -\big(\frac{\partial\varphi^*}{\partial x}(x)\big)\big(\frac{\partial\theta^*}{\partial x}(x)\big) & 1 + \big(\frac{\partial\varphi^*}{\partial x}(x)\big)^2 \end{pmatrix}, \]
where
\[ D(x) = 64\pi\Big(1 + \Big(\frac{\partial\varphi^*}{\partial x}(x)\Big)^2 + \Big(\frac{\partial\theta^*}{\partial x}(x)\Big)^2\Big)^{3/2}. \]
Also, define
\[ M(s) = \begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\Bigg|_{x^*(s)} \quad \text{and} \quad F(x,\varphi,\theta) = \begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\Bigg|_{(x,\varphi,\theta)}. \]
Next, Green's function $U(t,s)$ is defined as the solution of the PDE
\[ \frac{\partial U(t,s)}{\partial t} = \nabla v^*(x^*(t))U(t,s), \qquad U(s,s) = I \quad \forall s > 0, \]
where
\[ \nabla v(x^*(s)) = -M(s)F^{-1}(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi\partial x}(x^*(s)) \\ \frac{\partial^2 f}{\partial\theta\partial x}(x^*(s)) \end{pmatrix}. \]
We will show that in case (I) the limit process $G(t)$ has the mean function
\[ EG(t) = -\int_0^t U(t,s)M(s)F^{-1}(x^*(s))\begin{pmatrix} \frac12 K_{0,2}K_{\varphi,1,1}\frac{\partial^3 f}{\partial x^2\partial\varphi} + \frac16 K_{\varphi,1,3}\frac{\partial^3 f}{\partial\varphi^3} + \frac13 K_{\varphi,1,1}K_{\theta,0,2}\frac{\partial^3 f}{\partial\varphi\partial\theta^2} \\ \frac12 K_{0,2}K_{\theta,1,1}\frac{\partial^3 f}{\partial x^2\partial\theta} + \frac16 K_{\theta,1,3}\frac{\partial^3 f}{\partial\theta^3} + \frac13 K_{\theta,1,1}K_{\varphi,0,2}\frac{\partial^3 f}{\partial\varphi^2\partial\theta} \end{pmatrix}\Bigg|_{(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))} ds, \]
where $K_{p,q} = \int x^p K^{(q)}(x)\,dx$, and $K_{\varphi,p,q}$, $K_{\theta,p,q}$ denote the analogous moments of the kernels $K_\varphi$ and $K_\theta$. In case (I) the limit process $G$ also has covariance function satisfying
\[ \mathrm{Cov}(G(t_1),G(t_2)) = \int_0^{t_1\wedge t_2} U(t_1,s)M(s)F^{-1}(x^*(s))\Big\{\psi\Big(\frac{dx^*}{dt}(s)\Big)G(x^*(s))G^T(x^*(s)) + \frac{25}{81}\Psi_0\Big(\frac{dx^*}{dt}(s),x^*(s)\Big)\Big\}\, S^2(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\, M^T(s)(F^T)^{-1}(x^*(s))\,U^T(t_2,s)\,ds. \]
For case (II) the centered limit process $G_0(t)$ has covariance function defined as follows:
\[ C_0(t_1,t_2) = \int_0^{t_1\wedge t_2} U(t_1,s)M(s)F^{-1}(x^*(s))\frac{25}{81}\Psi_0\Big(\frac{dx^*}{dt}(s),x^*(s)\Big)\, S^2(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\, M^T(s)(F^T)^{-1}(x^*(s))\,U^T(t_2,s)\,ds. \]
Moreover, the variance function $C_v(t) = C_0(t,t)$ satisfies the ODE
\[ \frac{dC_v(t)}{dt} = M(t)F^{-1}(x^*(t),\varphi^*(x^*(t)),\theta^*(x^*(t)))\frac{25}{81}\Psi_0\Big(\frac{dx^*}{dt}(t),x^*(t)\Big)\, S^2(x^*(t),\varphi^*(x^*(t)),\theta^*(x^*(t)))\, M^T(t)(F^T)^{-1}(x^*(t),\varphi^*(x^*(t)),\theta^*(x^*(t))) + \nabla v^*(x^*(t))C_v(t) + C_v(t)\nabla^T v^*(x^*(t)), \qquad C_v(0) = 0. \tag{3.1} \]
Our method for numerically approximating the confidence regions is outlined in the following chapter.

3.2 Green's Function

We present the basic facts about, and the usefulness of, Green's function here so that the reader need not refer elsewhere. Let $\Theta, \Omega \subset \mathbb{R}^m$ be open and consider a linear differential operator $Q$ on $\Theta$. Then Green's function $U(t,s)$ on $\Theta\times\Omega$ at a point $(t,s)$ is a solution of
\[ QU(t,s) = \delta(t-s), \tag{3.2} \]
where $\delta$ is the usual delta function, i.e.,
\[ \delta(x) = \begin{cases} 1 & x = 0, \\ 0 & x \ne 0, \end{cases} \]
and, by the definition of $\delta$, for an integral over $\mathbb{R}^m$ we have a unit point mass at the point where this function is non-zero:
\[ \int_{\mathbb{R}^m}\delta(t-s)h(s)\,ds = h(t). \]
Now, to see the usefulness of Green's function, multiply both sides of (3.2) by $h(s)$:
\[ QU(t,s)h(s) = \delta(t-s)h(s). \]
Then integrate both sides over $\mathbb{R}^m$. For the linear differential operator $Q$ acting on $t\in\Theta$ we have
\[ Q\int_{\mathbb{R}^m}U(t,s)h(s)\,ds = \int_{\mathbb{R}^m}\delta(t-s)h(s)\,ds = h(t). \]
Thus if we have a differential equation of the form
\[ Qw(t) = h(t), \]
then
\[ Qw(t) = Q\int_{\mathbb{R}^m}U(t,s)h(s)\,ds; \]
in other words, we have the integral-form solution
\[ w(t) = \int_{\mathbb{R}^m}U(t,s)h(s)\,ds. \]
By Theorem 2.2 in Chapter 7 of Coddington and Levinson (1955) [17] the Green's function $U(t,s)$ exists and is unique.
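To make the mechanics concrete, here is a minimal sketch of propagating a Green's function and the variance ODE by forward Euler. The matrices $A(t)$ and $Q(t)$ are made-up stand-ins for $\nabla v^*(x^*(t))$ and the $M F^{-1}\frac{25}{81}\Psi_0 S^2 M^T (F^T)^{-1}$ term respectively; the Duhamel check at the end is the integral-form solution from Section 3.2.

```python
import numpy as np

def A(t):   # stand-in for grad v*(x*(t))
    return 0.3 * np.array([[0.0, 1.0, 0.0],
                           [-1.0, 0.0, 0.0],
                           [0.0, 0.0, -0.5]])

def Q(t):   # stand-in for the driving (diffusion) term of the variance ODE
    return 1e-3 * np.eye(3)

T, steps = 1.0, 2000
dt = T / steps
U = np.eye(3)                  # U(t,0): dU/dt = A(t) U, U(0,0) = I
C = np.zeros((3, 3))           # dC/dt = Q + A C + C A^T, C(0) = 0
Us = [U.copy()]
for k in range(steps):
    t = k * dt
    C = C + dt * (Q(t) + A(t) @ C + C @ A(t).T)
    U = U + dt * (A(t) @ U)
    Us.append(U.copy())

# Duhamel check: C(T) = int_0^T U(T,s) Q(s) U(T,s)^T ds, U(T,s) = U(T,0) U(s,0)^{-1}
UT = Us[-1]
C2 = sum(dt * (UT @ np.linalg.inv(Us[k])) @ Q(k * dt) @ (UT @ np.linalg.inv(Us[k])).T
         for k in range(steps))
print(np.max(np.abs(C - C2)))  # small: both are O(dt)-accurate Euler schemes
```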
CHAPTER 4 ALGORITHM

Our algorithm for calculating curve trajectories with their surrounding confidence ellipsoids is listed below. We consider how to trace the fiber in case (II), since this is the typical scenario in DT-MRI, where $n$ is on the order of millions and $N$ is 10–20. The steps are:

• Initialize $c_1, c_2, x_0, \delta$, $t_j = \delta j$, and $h_n = c_1 N^{-1/5}n^{-1/10}$, $h_\varphi = h_\theta = h = c_2 N^{-1/5}n^{-1/10}$, $N = n^2\varepsilon_n$ with $\varepsilon_n \to 0$.

• Let $\hat f_n(x,\varphi,\theta)$ be defined as in (2.4).

• Find a direction $(\hat\varphi^*_n(x),\hat\theta^*_n(x))$ on the unit sphere such that
\[ \frac{\partial}{\partial\varphi}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) = \frac{\partial}{\partial\theta}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) = 0, \]
\[ \frac{\partial^2}{\partial\varphi^2}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) < 0, \qquad \frac{\partial^2}{\partial\theta^2}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) < 0, \]
\[ \frac{\partial^2\hat f_n}{\partial\varphi^2}\frac{\partial^2\hat f_n}{\partial\theta^2}(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) - \Big(\frac{\partial^2\hat f_n}{\partial\theta\partial\varphi}(x,\hat\varphi^*_n(x),\hat\theta^*_n(x))\Big)^2 > 0. \]
This direction indicates where the maximum gradient flow is.

• Solve the ODE that governs a curve along the direction $(\hat\varphi^*_n(x),\hat\theta^*_n(x))$:
\[ \frac{d\hat x^*_n(t)}{dt} = \begin{pmatrix} \sin\hat\theta^*_n(\hat x^*_n(t))\cos\hat\varphi^*_n(\hat x^*_n(t)) \\ \sin\hat\theta^*_n(\hat x^*_n(t))\sin\hat\varphi^*_n(\hat x^*_n(t)) \\ \cos\hat\theta^*_n(\hat x^*_n(t)) \end{pmatrix}, \qquad \hat x^*_n(0) = x_0, \]
numerically using Euler's method as $\hat x^*_n(t_m) \approx \hat x^*_n(t_{m-1}) + \delta\,\hat v^*_n(\hat x^*_n(t_{m-1}))$.

• Now consider the noise scaling function $S$. To estimate it we propose two approaches.
Approach 1: If several b0-images are available, then one has $m$ sets of
\[ Y^{(l)}_{ijk} = f(X_i,\varphi_j,\theta_k) + S(X_i,\varphi_j,\theta_k)\varepsilon^{(l)}_{ijk}, \quad l = 1,\dots,m,\ i = 1,\dots,n,\ j,k = 1,\dots,N, \]
where $\{\varepsilon^{(l)}_{ijk} : i = 1,\dots,n,\ j,k = 1,\dots,N\}$ are independent for $l = 1,\dots,m$. Then for each $(X_i,\varphi_j,\theta_k)$ we estimate $S^2$ as follows:
\[ \hat S^2_{ijk} = \frac{1}{m-1}\sum_{l=1}^m\Big(Y^{(l)}_{ijk} - \frac{1}{m}\sum_{q=1}^m Y^{(q)}_{ijk}\Big)^2. \]
Approach 2: If only one b0-image is available, then we assume that $S$ is locally smooth, so that local averaging should serve as a reasonable estimator. This can be done as in step 3 in section 5 of [9], yielding $\hat S^2_{ijk}$.

• Obtain $\hat S^2_n(x,\varphi,\theta)$ by plugging $\hat S^2_{ijk}$ in instead of $Y_{ijk}$ in step 2.

• Calculate and estimate the limiting covariance $C_v(t) := \mathrm{Cov}(G(t),G(t))$, which is done by $\hat C_n$, the solution of the ODE similar to (3.1) where all unknown functions are estimated. The solution is then approximated by Euler's method as follows:
\[ \hat C_n(t_m) = \hat C_n(t_{m-1}) + \delta\,\hat M_n(t_{m-1})\hat F_n^{-1}\big(\hat x^*_n(t_{m-1}),\hat\varphi^*_n(\hat x^*_n(t_{m-1})),\hat\theta^*_n(\hat x^*_n(t_{m-1}))\big)\frac{25}{81}\Psi_0\Big(\frac{d\hat x^*_n}{dt}(t_{m-1}),\hat x^*_n(t_{m-1})\Big)\hat S^2\big(\hat x^*_n(t_{m-1}),\hat\varphi^*_n(\hat x^*_n(t_{m-1})),\hat\theta^*_n(\hat x^*_n(t_{m-1}))\big)\hat M_n^T(t_{m-1})(\hat F_n^T)^{-1}\big(\hat x^*_n(t_{m-1}),\hat\varphi^*_n(\hat x^*_n(t_{m-1})),\hat\theta^*_n(\hat x^*_n(t_{m-1}))\big) \]
\[ \quad + \delta\,\nabla\hat v^*_n(\hat x^*_n(t_{m-1}))\hat C_n(t_{m-1}) + \delta\,\hat C_n(t_{m-1})\nabla^T\hat v^*_n(\hat x^*_n(t_{m-1})), \qquad \hat C_n(0) = 0, \]
where
\[ \nabla\hat v^*_n(x) = -\hat M_n(s)\hat F_n^{-1}(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\begin{pmatrix} \frac{\partial^2\hat f_n}{\partial\varphi\partial x}(x^*(s)) \\ \frac{\partial^2\hat f_n}{\partial\theta\partial x}(x^*(s)) \end{pmatrix}, \]
\[ \hat M_n(t) = \begin{pmatrix} -\sin\hat\theta^*_n\sin\hat\varphi^*_n & \cos\hat\theta^*_n\cos\hat\varphi^*_n \\ \sin\hat\theta^*_n\cos\hat\varphi^*_n & \cos\hat\theta^*_n\sin\hat\varphi^*_n \\ 0 & -\sin\hat\theta^*_n \end{pmatrix}\Bigg|_{\hat x^*_n(t)}, \qquad \hat F_n(x,\varphi,\theta) = \begin{pmatrix} \frac{\partial^2\hat f_n}{\partial\varphi^2} & \frac{\partial^2\hat f_n}{\partial\varphi\partial\theta} \\ \frac{\partial^2\hat f_n}{\partial\varphi\partial\theta} & \frac{\partial^2\hat f_n}{\partial\theta^2} \end{pmatrix}\Bigg|_{(x,\varphi,\theta)}. \]

• The asymptotic $100(1-\alpha)\%$ confidence ellipsoid for $x(t_m)$, $m\ge 1$, is approximated by
\[ \Big\{\big|\hat C_n(t_m)^{-1/2}(\hat x_n(t_m) - x(t_m))\big| \le R_\alpha\, n^{-1/5}N^{-2/5}\Big\}, \qquad P(|Z|\le R_\alpha) = 1 - \alpha, \]
where $Z$ is a standard normal vector in $\mathbb{R}^3$.

• Repeat the steps above until $t_j$ reaches $T$.

Use of Euler's method for the numerical approximation of the solutions is justified by its simplicity and by the work of [6], where it was shown that for DTI using higher-order Runge-Kutta approximations gave no benefit. In fact, the statistical accuracy outweighs the numerical accuracy on scales typical for brain imaging applications, so it is of no concern.
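A skeleton of the tracing loop above: at the current location, maximize the smoothed signal over $(\varphi,\theta)$ and take an Euler step along the maximal direction. We use scipy's Nelder-Mead in place of Matlab's fminsearch, and a synthetic $f$ with a known maximal direction (the spiral example of Chapter 5) in place of $\hat f_n$; the seed point and perturbation scale are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def fhat(x, phi, theta):
    # synthetic stand-in for the smoothed signal estimate:
    # the spiral example f of Chapter 5, whose maximizer is known
    a, b, c = 0.5 - x[1], x[0] - 0.5, x[2]
    return (a*np.cos(phi) + b*np.sin(phi))*np.sin(theta) + c*np.cos(theta)

def trace(x0, delta=0.01, steps=40, rng=np.random.default_rng(4)):
    x, ang = np.array(x0, float), np.array([1.0, 1.0])   # initial (phi, theta)
    path = [x.copy()]
    for _ in range(steps):
        # small random perturbation of the previous optimum, as in the text,
        # keeps the simplex search from sticking at its starting point
        res = minimize(lambda a: -fhat(x, a[0], a[1]),
                       ang + 0.01*rng.standard_normal(2), method='Nelder-Mead')
        phi, theta = res.x
        v = np.array([np.sin(theta)*np.cos(phi),
                      np.sin(theta)*np.sin(phi),
                      np.cos(theta)])
        x = x + delta * v                                # Euler step
        path.append(x.copy()); ang = res.x
    return np.array(path)

print(trace([0.7, 0.5, 0.1])[::10])                     # every 10th traced point
```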
CHAPTER 5 SIMULATIONS

5.1 Artificial example

To simulate various curve scenarios, we need to design a function $f$ whose maximum will be in the direction tangential to the curves of interest. To this end, we consider a function of the form
\[ f(x,\varphi,\theta) = a(x)\cos\varphi\sin\theta + b(x)\sin\varphi\sin\theta + c(x)\cos\theta \]
with some generic real-valued functions $a, b, c$ to be chosen later. The maximal direction is
\[ \varphi^*(x) = \tan^{-1}\Big(\frac{b}{a}\Big), \qquad \theta^*(x) = \tan^{-1}\Big(\frac{\sqrt{a^2+b^2}}{c}\Big). \]
Then the integral curve is defined by the following ODE:
\[ \frac{dx^*(t)}{dt} = \begin{pmatrix} \sin\theta(x(t))\cos\varphi(x(t)) \\ \sin\theta(x(t))\sin\varphi(x(t)) \\ \cos\theta(x(t)) \end{pmatrix} = \frac{1}{\sqrt{a^2+b^2+c^2}}\begin{pmatrix} a \\ b \\ c \end{pmatrix}. \]
We first simulate a spiral in 3D which mimics a C-shaped fiber. So we take $a = -(x_2 - 0.5)$, $b = x_1 - 0.5$, and $c = x_3$, and have
\[ f(x,\varphi,\theta) = (0.5 - x_2)\cos\varphi\sin\theta + (x_1 - 0.5)\sin\varphi\sin\theta + x_3\cos\theta. \]
Then solving the ODEs yields
\[ x(t) = 0.5 + r_0\cos(\ln z(t) + c), \qquad y(t) = 0.5 + r_0\sin(\ln z(t) + c), \]
\[ t + \tilde c = \sqrt{z^2(t) + r_0^2} - r_0\,\mathrm{atanh}\Big(\sqrt{z^2(t) + r_0^2}\,/\,r_0\Big), \]
where
\[ c = \mathrm{atan}\Big(\frac{y(0) - 0.5}{x(0) - 0.5}\Big) - \ln z(0), \qquad r_0^2 = (x(0) - 0.5)^2 + (y(0) - 0.5)^2, \]
and the constant $\tilde c$ is determined by $z(0)$.

Figure 5.1: The true curve is in blue, while the estimated curve is in red.

Figure 5.2: This is an enlargement of the previous figure to show the 95% confidence ellipsoid surrounding a point on the estimated curve. The true curve in blue touches it.

We take $S$ to be a constant function, which we varied. We take $n = 4000$, $N = 100$, $m = 5$ (from step 5), $\delta = 0.02$. We trace the curve for 30 steps of size $\delta$. The tracing is very fast, but it relies heavily on the numerical search for the maximal direction in step 3, which we perform using Matlab's fminsearch. It requires an initial direction, which we take as $(\varphi,\theta) = (1,1)$. During the tracing, the initial direction is taken to be the maximal direction obtained at the previous iteration with a small random perturbation added. The perturbation prevents the optimization algorithm from getting stuck at a local maximum; without it, the optimization step 3 introduces a systematic numerical bias contrary to the expected zero bias. This is also the bottleneck of the implementation. It is possible to improve this step by utilizing a different optimization algorithm. The results are shown in Figures 5.1 and 5.2. The noise scaling is $S = 0.25$ and the noise $\varepsilon_{ijk}$ is taken to be standard normal, which corresponds to a signal-to-noise ratio of 4–5. If we increase $S$ the curve does not trace; it veers off and tends to go out of bounds. The confidence ellipsoids are very tight around the estimated curve: the norm of the limiting covariance function is on the order of $10^{-7}$. This is excellent in comparison to typical confidence ellipsoid sizes for the methods in [2] and [9].

Figure 5.3: Diffusion ellipsoids illustrate the corresponding diffusion tensors along the fiber across the genu of corpus callosum.

Figure 5.4: Visualization of diffusion via ellipsoids using DTI/HARDI tensor model.
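As a quick sanity check of the closed-form maximal direction used in this artificial example, the snippet below converts $(\varphi^*,\theta^*)$ back to a Cartesian vector and recovers $(a,b,c)$; the three coefficient values are arbitrary.

```python
import numpy as np

# For f = a cos(phi) sin(theta) + b sin(phi) sin(theta) + c cos(theta),
# the maximizer on the sphere is the unit vector (a, b, c)/|(a, b, c)|.
a, b, c = -0.2, 0.35, 0.8
phi = np.arctan2(b, a)                     # tan(phi*) = b / a
theta = np.arctan2(np.hypot(a, b), c)      # tan(theta*) = sqrt(a^2+b^2) / c
v = np.array([np.sin(theta)*np.cos(phi),
              np.sin(theta)*np.sin(phi),
              np.cos(theta)])
print(np.round(v * np.hypot(np.hypot(a, b), c), 6))   # recovers (a, b, c)
```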
5.2 Real HARDI dataset

A DWI dataset was collected from a twenty-something-year-old healthy male brain on a GE 3T Signa HDx MR scanner (GE Healthcare, Waukesha, WI) with an 8-channel head coil. The subject signed the consent form approved by the Michigan State University Institutional Review Board. DWI images were acquired with a spin-echo echo-planar imaging (EPI) sequence for 20 minutes with the following parameters: 32 contiguous 2.4-mm axial slices in an interleaved order, FOV = 22 cm × 22 cm, matrix size = 128 × 128, number of excitations (NEX) = 1, TE = 72.3 ms, TR = 7.5 s, 150 diffusion-weighted volumes (one per gradient direction) with b = 1000 s/mm², 9 volumes with b = 0, and a parallel imaging acceleration factor of 2.

The desired three-dimensional image is not obtained all at once, but rather by imaging a 'slice' repeatedly along the z-direction and then reconstructing using a Fourier transform to form the aggregate image (see [1]). Typically, the number of pixels $n$, representing the number in a sample of spatial locations, is on the order of $n = 128\times128\times32 = 524{,}288$; that is, 32 slices in the z-direction with $128\times128$ in each plane. The number can be increased to $n = 256\times256\times32$. The directions did not have the angular components on a regular grid, so we first used kernel smoothing of the Y-values based on 1350 pairs of $(\varphi,\theta)$ to obtain Y-values for 1600 pairs of regularly spaced angular components, which corresponds to $N = 40$. The sample of spatial locations $X_i$ had size $n = 128\times128\times32 = 524{,}288$. For all kernels we used Gaussian kernels of various dimensions. We used Matlab with C subroutines to perform the computations. To find the maximal directions $(\hat\varphi^*,\hat\theta^*)$ we used fminsearch. However, we observed that it is very sensitive to the choice of the initial vector near which it looks for the local optimizer. We ended up using the direction from the previous iteration perturbed by a small random vector. Without the perturbation, fminsearch would simply take the initial direction as the optimizer and trace out short straight lines. We speculate that it is possible to use a more sophisticated optimization code to get a faster and more robust numerical solution for step 3 of our algorithm.

Our rationale for choosing seed regions for evaluation is based on the following. The corpus callosum (CC) contains thick axonal fibers connecting the two cerebral hemispheres and enabling the communication between them. The general anatomical locations of these axonal fibers are well established, so these fibers can be used to evaluate new techniques in fiber tractography. The anterior part of the CC, called the genu of the CC, contains axonal fibers connecting the right and left frontal lobes. The second region is the right fornix, which is a much shorter fiber than the CC; it is quite challenging to trace, but important for early diagnostics of Alzheimer's disease.

The results are presented in Figures 5.5 and 5.6. The estimated curves are shown in magenta. The corresponding confidence ellipsoids are hardly visible, since the norm of the corresponding limiting covariance matrix was of the order $10^{-8}$. Both fibers follow the anatomical ground truth. The anterior fiber is perfectly centered where expected, while the fornix fiber is shifted a bit off the center of the expected location.

Figure 5.5: Axonal fibers across the genu of Corpus Callosum. We traced each fiber branch for 40 steps of size δ = 0.01. The estimated curve is shown in magenta accompanied by blue 95% confidence ellipsoids.

Figure 5.6: Right Fornix. We traced the fiber for 30 steps of size δ = 0.01. The estimated curve is shown in magenta accompanied by blue 95% confidence ellipsoids.
CHAPTER 6 PROOFS

6.1 Some Results for Simpson's Scheme and Kernel Smoothing

Proposition 3. For any kernel $K$ satisfying conditions (K) and any function $g$ twice continuously differentiable on compact sets, we have as $h \to 0$:
\[ \int h^{-d}K\Big(\frac{u - u_0}{h}\Big)g(u)\,du = g(u_0) + 0.5h^2g''(u_0)K_2(1 + o(1)), \]
where $K_2 = \int|u|^2K(u)\,du$.

Proof. The above follows easily by a Taylor expansion of the integrand. For a detailed description see [15]. □

We state the following standard result by Simpson for approximating integrals without proof.

Proposition 4. For any four times continuously differentiable function $g$ defined on the interval $[a,b]$ we have as $N\to\infty$
\[ \int_a^b g(u)\,du = S_N(g(u_0),\dots,g(u_{2N}),a,b) - \frac{(b-a)^5}{180}g^{IV}(u^*)\frac{1}{(2N)^4} \]
with some $u^*$ in $[a,b]$, where $g^{IV}$ is the fourth-order derivative of $g$.

Lemma 2. For any kernel $K\in C^{(j)}([0,1]^d)$ satisfying conditions (K) and any function $g\in C^{(j)}$ on a compact set, we have as $h\to 0$
\[ \int h^{-(d+j)}K^{(j)}\Big(\frac{u_0 - u}{h}\Big)g(u)\,du = g^{(j)}(u_0) + 0.5h^2g^{(j+2)}(u_0)K_2(1 + o(1)), \]
where $j$ is a non-negative integer and $K_2 = \int|u|^2K(u)\,du$ as before.

Proof. Using integration by parts and Proposition 3,
\[ \int h^{-(d+1)}K'\Big(\frac{u_0 - u}{h}\Big)g(u)\,du = \int h^{-d}K\Big(\frac{u_0 - u}{h}\Big)g'(u)\,du = g'(u_0) + 0.5h^2g'''(u_0)K_2(1 + o(1)). \]
By induction the proof is complete. □

Proposition 5. Consider the ordinary Simpson's scheme for $\int_a^b g(u)\,du$ defined by
\[ \frac{1}{6N}\big(g(u_0) + g(u_{2N})\big) + \frac{2}{3N}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{3N}\sum_{m=1}^{N-1}g(u_{2m}). \]
Then as $N\to\infty$
\[ \frac{1}{36N^2}\big(g(u_0) + g(u_{2N})\big) + \frac{4}{9N^2}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{9N^2}\sum_{m=1}^{N-1}g(u_{2m}) = \frac{5}{9N}\int_a^b g(u)\,du\,(1 + O(N^{-1})). \]

Proof. A straightforward calculation yields the result:
\[ \frac{1}{36N^2}(g(u_0) + g(u_{2N})) + \frac{4}{9N^2}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{9N^2}\sum_{m=1}^{N-1}g(u_{2m}) \]
\[ = \frac{2}{3N}\Big[\frac{1}{6N}(g(u_0) + g(u_{2N})) + \frac{2}{3N}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{3N}\sum_{m=1}^{N-1}g(u_{2m})\Big] - \frac{1}{12N^2}(g(u_0) + g(u_{2N})) - \frac{1}{9N^2}\sum_{m=1}^{N-1}g(u_{2m}) \]
\[ = \frac{2}{3N}\int_a^b g(u)\,du\,(1 + O(N^{-1})) - \frac{1}{9N}\int_a^b g(u)\,du\,(1 + O(N^{-1})) = \frac{5}{9N}\int_a^b g(u)\,du\,(1 + O(N^{-1})). \quad \square \]
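The smoothing-bias expansion of Proposition 3 is easy to verify numerically. Below is a one-dimensional check with a Gaussian kernel (so $K_2 = 1$); the test function and the bandwidth grid are our own choices.

```python
import numpy as np

# Check of Proposition 3 in d = 1: bias = 0.5 h^2 g''(u0) K2 + o(h^2).
g, u0 = np.cos, 0.3                # g''(u0) = -cos(u0); Gaussian K2 = 1
for h in (0.2, 0.1, 0.05):
    u = np.linspace(u0 - 8 * h, u0 + 8 * h, 20001)
    du = u[1] - u[0]
    Kh = np.exp(-0.5 * ((u - u0) / h)**2) / (np.sqrt(2 * np.pi) * h)
    bias = np.sum(Kh * g(u)) * du - g(u0)           # numerical smoothing bias
    print(h, bias, 0.5 * h**2 * (-np.cos(u0)))      # the two columns agree as h -> 0
```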
6.1.1 Existence of Unique Direction and Establishing Approximation of $\hat X_n - x$

We now prove the existence of the unique direction corresponding to the direction of dominant diffusion.

Lemma 3. The direction satisfying (6.2) exists and is unique for each x in a neighbourhood of a.

Proof. Define the function H : G × S² → R² as
$$H(x,\varphi,\theta) = \Big(\frac{\partial}{\partial\varphi} f(x,\varphi,\theta),\ \frac{\partial}{\partial\theta} f(x,\varphi,\theta)\Big).$$
It is continuously differentiable in G by assumption. For each x ∈ G, the direction (ϕ*(x), θ*(x)) yields the maximum of f. Thus H(x, ϕ*(x), θ*(x)) = 0 and
$$\det\Big\{\frac{\partial H(x,\varphi^*(x),\theta^*(x))}{\partial(\varphi,\theta)}\Big\} = \det\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\theta\partial\varphi} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\Bigg|_{(x,\varphi^*(x),\theta^*(x))} \neq 0.$$
By the implicit function theorem, there is an open set W ⊂ R³ containing x and a unique continuously differentiable function g : W → R² such that
$$g(x) = \big(\varphi^*(x), \theta^*(x)\big) \quad\text{and}\quad H\big(x', g(x')\big) = 0 \ \text{ for all } x' \in W. \qquad\Box$$

Remark on Lemmas 4 and 5. Before stating and proving Lemmas 4 and 5, we would like to explain their place in our estimation procedure. To approximate the process $\hat x^*_n(t) - x^*(t)$ we will consider a process $z_n(t)$, t ∈ [0,T], defined as the solution of the ODE
$$\frac{dz_n(t)}{dt} = \nabla v^*(x^*(t))\, z_n(t) + (\hat v^*_n - v^*)(x^*(t)), \qquad z_n(0) = 0.$$
Similarly to the arguments in [5], one can show that $z_n(t)$ approximates $\hat x^*_n(t) - x^*(t)$. In order to solve the ODE above, note that there exists a function U(t,s), (t,s) ∈ [0,T]², which satisfies the following conditions:

1. $\frac{\partial}{\partial t}U(t,s) = \nabla v^*(x^*(t))\,U(t,s)$, 0 ≤ s ≤ t ≤ T;
2. $U(t,t) = I_{3\times 3}$;
3. $U(t,s) \equiv 0$, 0 ≤ t < s ≤ T.

Then by Theorem 2.2 in Chapter 7 of Coddington and Levinson (1955) (see [17]), the unique solution of the ODE above is given via the Green's function U(t,s) as
$$\hat z_n(t) = \int_0^t U(t,s)\,(\hat v^*_n - v^*)(x^*(s))\,ds.$$
Even though $\hat z_n(t)$ approximates $\hat x^*_n(t) - x^*(t)$ up to terms of order $O(\|\hat x^*_n - x^*\|^2)$, we still need to relate the estimation procedure to terms based on the deviation $\hat f_n - f$. Thus we introduce:

Lemma 4. Let Δϕ = ϕ₁ − ϕ₂ and Δθ = θ₁ − θ₂. Then
$$\begin{pmatrix} \sin\theta_1\cos\varphi_1 - \sin\theta_2\cos\varphi_2 \\ \sin\theta_1\sin\varphi_1 - \sin\theta_2\sin\varphi_2 \\ \cos\theta_1 - \cos\theta_2 \end{pmatrix} = \begin{pmatrix} \Delta\theta\,\cos\theta_2\cos\varphi_2 - \Delta\varphi\,\sin\theta_2\sin\varphi_2 \\ \Delta\theta\,\cos\theta_2\sin\varphi_2 + \Delta\varphi\,\sin\theta_2\cos\varphi_2 \\ -\Delta\theta\,\sin\theta_2 \end{pmatrix} + O\Big(\max\big\{(|\Delta\theta| + |\Delta\varphi|)^2,\ (|\Delta\theta| + |\Delta\varphi|)^3\big\}\Big).$$

Proof. By the product-to-sum and sum-to-product identities,
$$\sin\theta_1\cos\varphi_1 - \sin\theta_2\cos\varphi_2 = \tfrac12\sin(\theta_1+\varphi_1) + \tfrac12\sin(\theta_1-\varphi_1) - \tfrac12\sin(\theta_2+\varphi_2) - \tfrac12\sin(\theta_2-\varphi_2)$$
$$= \sin\Big(\frac{\theta_1-\theta_2+\varphi_1-\varphi_2}{2}\Big)\cos\Big(\frac{\theta_1+\theta_2+\varphi_1+\varphi_2}{2}\Big) + \sin\Big(\frac{\theta_1-\theta_2-(\varphi_1-\varphi_2)}{2}\Big)\cos\Big(\frac{\theta_1+\theta_2-\varphi_1-\varphi_2}{2}\Big).$$
By a Taylor expansion at Δθ = Δϕ = 0,
$$= \Big(\frac{\Delta\theta}{2}+\frac{\Delta\varphi}{2}\Big)\bigg[\cos(\theta_2+\varphi_2) - \sin(\theta_2+\varphi_2)\Big(\frac{\Delta\theta}{2}+\frac{\Delta\varphi}{2}\Big) + O\Big(\Big(\frac{\Delta\theta}{2}+\frac{\Delta\varphi}{2}\Big)^2\Big)\bigg]$$
$$\quad + \Big(\frac{\Delta\theta}{2}-\frac{\Delta\varphi}{2}\Big)\bigg[\cos(\theta_2-\varphi_2) - \sin(\theta_2-\varphi_2)\Big(\frac{\Delta\theta}{2}-\frac{\Delta\varphi}{2}\Big) + O\Big(\Big(\frac{\Delta\theta}{2}-\frac{\Delta\varphi}{2}\Big)^2\Big)\bigg]$$
$$= \Delta\theta\,\cos\theta_2\cos\varphi_2 - \Delta\varphi\,\sin\theta_2\sin\varphi_2 + O\Big(\max\big\{(|\Delta\theta|+|\Delta\varphi|)^2,\ (|\Delta\theta|+|\Delta\varphi|)^3\big\}\Big).$$
The second component is treated in exactly the same way and yields
$$\sin\theta_1\sin\varphi_1 - \sin\theta_2\sin\varphi_2 = \Delta\theta\,\cos\theta_2\sin\varphi_2 + \Delta\varphi\,\sin\theta_2\cos\varphi_2 + O\Big(\max\big\{(|\Delta\theta|+|\Delta\varphi|)^2,\ (|\Delta\theta|+|\Delta\varphi|)^3\big\}\Big).$$
Finally,
$$\cos\theta_1 - \cos\theta_2 = -2\sin\Big(\frac{\theta_1+\theta_2}{2}\Big)\sin\Big(\frac{\theta_1-\theta_2}{2}\Big) = -\Delta\theta\,\sin\theta_2 + O(\Delta_\theta^2).$$
Combining the three components gives the stated vector identity. $\Box$
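Lemma 4 is simply the first-order expansion of the spherical-coordinate unit vector, and the 3 × 2 matrix of its linear term is exactly the matrix M(s) used below. A minimal numerical check (with arbitrary hypothetical angles and increments):

```python
import numpy as np

# Check of Lemma 4: the difference of unit vectors in spherical coordinates
# is linear in (dphi, dtheta) up to second-order terms.
def unit(phi, theta):
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

phi2, theta2 = 0.8, 1.1
dphi, dtheta = 1e-4, -2e-4
lhs = unit(phi2 + dphi, theta2 + dtheta) - unit(phi2, theta2)
M = np.array([[-np.sin(theta2) * np.sin(phi2), np.cos(theta2) * np.cos(phi2)],
              [ np.sin(theta2) * np.cos(phi2), np.cos(theta2) * np.sin(phi2)],
              [ 0.0,                          -np.sin(theta2)]])
rhs = M @ np.array([dphi, dtheta])
print(np.max(np.abs(lhs - rhs)))  # of order (|dphi| + |dtheta|)**2
```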
Lemma 5. In a neighbourhood of (x, ϕ*(x), θ*(x)) we have
$$\begin{pmatrix} \hat\varphi^*_n(x) - \varphi^*(x) \\ \hat\theta^*_n(x) - \theta^*(x) \end{pmatrix} = -\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial}{\partial\varphi}\hat f_n - \frac{\partial}{\partial\varphi} f \\ \frac{\partial}{\partial\theta}\hat f_n - \frac{\partial}{\partial\theta} f \end{pmatrix}$$
$$\qquad + O\bigg(\Delta_\varphi^2 + \Delta_\theta^2 + \Delta_\varphi\Big|\frac{\partial^2}{\partial\varphi^2}(\hat f - f)\Big| + \Delta_\theta\Big|\frac{\partial^2}{\partial\theta^2}(\hat f - f)\Big| + (\Delta_\varphi + \Delta_\theta)\Big|\frac{\partial^2}{\partial\varphi\partial\theta}(\hat f - f)\Big|\bigg).$$

Proof. By Lemma 3 we have for each x ∈ G a direction (ϕ*(x), θ*(x)) for which
$$\frac{\partial}{\partial\varphi} f(x,\varphi^*(x),\theta^*(x)) = \frac{\partial}{\partial\theta} f(x,\varphi^*(x),\theta^*(x)) = 0,$$
and likewise a direction $(\hat\varphi^*_n(x), \hat\theta^*_n(x))$ for $\hat f_n$ for which
$$\frac{\partial}{\partial\varphi}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) = \frac{\partial}{\partial\theta}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) = 0.$$
Write $\Delta_\varphi = \hat\varphi^*_n(x) - \varphi^*(x)$ and $\Delta_\theta = \hat\theta^*_n(x) - \theta^*(x)$. Since $\hat f'_\varphi(x,\varphi^*+\Delta_\varphi,\theta^*+\Delta_\theta) = 0$, expanding $\hat f'_\varphi$ at the point (x, ϕ*(x)+Δϕ, θ*(x)+Δθ) around (ϕ*(x), θ*(x)) gives
$$0 = \hat f'_\varphi(x,\varphi^*,\theta^*) + \hat f''_{\varphi\varphi}(x,\varphi^*,\theta^*)\Delta_\varphi + \hat f''_{\varphi\theta}(x,\varphi^*,\theta^*)\Delta_\theta + O(\Delta_\varphi^2 + \Delta_\theta^2).$$
Note that, by the continuity of $\hat f'_\varphi$ in (ϕ, θ), $\hat f'_\varphi(x,\varphi^*(x),\theta^*(x)) \to 0$ as Δϕ, Δθ → 0. Since $f'_\varphi(x,\varphi^*,\theta^*) = 0$, the zero-order term equals $\frac{\partial}{\partial\varphi}(\hat f_n - f)(x,\varphi^*,\theta^*)$, and replacing the second derivatives of $\hat f$ by those of f introduces errors of the sizes $\Delta_\varphi|\partial^2_{\varphi\varphi}(\hat f - f)|$ and $\Delta_\theta|\partial^2_{\varphi\theta}(\hat f - f)|$. Hence
$$f''_{\varphi\varphi}\Delta_\varphi + f''_{\varphi\theta}\Delta_\theta = -\frac{\partial}{\partial\varphi}(\hat f_n - f) + O\Big(\Delta_\varphi^2 + \Delta_\theta^2 + \Delta_\varphi\Big|\frac{\partial^2}{\partial\varphi^2}(\hat f - f)\Big| + \Delta_\theta\Big|\frac{\partial^2}{\partial\varphi\partial\theta}(\hat f - f)\Big|\Big).$$
Likewise, expanding $\hat f'_\theta$ at the same point gives
$$f''_{\theta\varphi}\Delta_\varphi + f''_{\theta\theta}\Delta_\theta = -\frac{\partial}{\partial\theta}(\hat f_n - f) + O\Big(\Delta_\varphi^2 + \Delta_\theta^2 + \Delta_\varphi\Big|\frac{\partial^2}{\partial\varphi\partial\theta}(\hat f - f)\Big| + \Delta_\theta\Big|\frac{\partial^2}{\partial\theta^2}(\hat f - f)\Big|\Big).$$
Thus we have the result:
$$\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\begin{pmatrix} \hat\varphi^*_n(x) - \varphi^*(x) \\ \hat\theta^*_n(x) - \theta^*(x) \end{pmatrix} = -\begin{pmatrix} \frac{\partial}{\partial\varphi}(\hat f_n - f) \\ \frac{\partial}{\partial\theta}(\hat f_n - f) \end{pmatrix} + O(\cdots),$$
and inverting the Hessian matrix gives the lemma. $\Box$

Therefore, we can approximate $\hat z_n$ by
$$\hat Z_n(t) = -\int_0^t U(t,s)\begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial}{\partial\varphi}(\hat f_n - f) \\ \frac{\partial}{\partial\theta}(\hat f_n - f) \end{pmatrix}\Bigg|_{x^*(s)} ds,$$
which approximates $\hat x^*_n(t) - x^*(t)$ in terms of $\hat f_n - f$.
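In effect, Lemma 5 says that the estimated maximal direction deviates from the true one by a Newton-type step driven by the gradient of the estimation error. A minimal numeric sketch, with a hypothetical Hessian F and gradient error:

```python
import numpy as np

# One Newton-type correction from Lemma 5: the deviation of the estimated
# maximal direction equals -F^{-1} times the gradient of (fhat - f), up to
# higher-order terms. F and grad_err below are hypothetical placeholders;
# F is negative definite, as at a maximum of f.
F = np.array([[-2.0, 0.3],     # [f_phiphi,   f_phitheta]
              [ 0.3, -1.5]])   # [f_phitheta, f_thetatheta]
grad_err = np.array([0.01, -0.02])   # (d/dphi, d/dtheta) of (fhat - f)

deviation = -np.linalg.solve(F, grad_err)  # (phihat* - phi*, thetahat* - theta*)
print(deviation)
```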
6.1.2 Establishing Asymptotic Unbiasedness of Estimators of Derivatives of f

Recall that our model is
$$Y_{ijk} = f(X_i,\varphi_j,\theta_k) + S(X_i,\varphi_j,\theta_k)\,\varepsilon_{ijk}, \qquad i = 1,\dots,n;\ j = 1,\dots,N_\varphi;\ k = 1,\dots,N_\theta,$$
where $X_i \sim U([0,1]^3)$, $\varphi_j = -\pi + \frac{2\pi j}{2N_\varphi}$, and $\theta_k = -\frac{\pi}{2} + \frac{\pi k}{2N_\theta}$.

First, define
$$w_{j,k}(x) = \frac{1}{nh_n^3}\sum_{i=1}^{n} K\Big(\frac{x - X_i}{h_n}\Big)\, Y_{ijk}.$$
Immediately note that
$$E w_{j,k}(x) = \int K(u)\, f(x - h_nu,\varphi_j,\theta_k)\,du = f(x,\varphi_j,\theta_k) + 0.5\,h_n^2\,\frac{\partial^2}{\partial x^2} f(x,\varphi_j,\theta_k)\,K_2\,(1+o(1)).$$

Secondly, define
$$u_k(x,\varphi) = S_{N_\varphi}\Big(\frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-\varphi_0}{h_\varphi}\Big) w_{0,k}(x), \dots, \frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-\varphi_{2N_\varphi}}{h_\varphi}\Big) w_{2N_\varphi,k}(x), -\pi, \pi\Big).$$
Again, note that
$$E u_k(x,\varphi) = S_{N_\varphi}\Big(\frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-\varphi_0}{h_\varphi}\Big) E w_{0,k}(x), \dots, \frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-\varphi_{2N_\varphi}}{h_\varphi}\Big) E w_{2N_\varphi,k}(x), -\pi, \pi\Big)$$
$$= \int_{\varphi-ch_\varphi}^{\varphi+ch_\varphi}\frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)\Big[f(x,u,\theta_k) + 0.5\,h_n^2\,\frac{\partial^2}{\partial x^2} f(x,u,\theta_k)\,K_2(1+o(1))\Big]du + \frac{2ch_\varphi}{180(2N_\varphi)^4}\,\frac{\partial^4}{\partial u^4}\Big[\frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big) f(x,u,\theta_k)\Big]\Big|_{u=u^*_k}$$
$$= f(x,\varphi,\theta_k) + 0.5\,h_n^2\,\frac{\partial^2}{\partial x^2} f(x,\varphi,\theta_k)\,K_2(1+o(1)) + 0.5\,h_\varphi^2\,\frac{\partial^2}{\partial\varphi^2} f(x,\varphi,\theta_k)\,K_{\varphi,2}(1+o(1)) + \frac{c}{1440 N_\varphi^4 h_\varphi^4}\,K^{IV}_\varphi\Big(\frac{\varphi-u^*_k}{h_\varphi}\Big) f(x,u^*_k,\theta_k)(1+o(1)),$$
where $u^*_k \in [\varphi - ch_\varphi, \varphi + ch_\varphi]$.

Thirdly, let
$$\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big) u_0(x,\varphi), \dots, \frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big) u_{2N_\theta}(x,\varphi), -\frac{\pi}{2}, \frac{\pi}{2}\Big).$$
Then
$$E\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big) Eu_0(x,\varphi), \dots, \frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big) Eu_{2N_\theta}(x,\varphi), -\frac{\pi}{2}, \frac{\pi}{2}\Big)$$
$$= \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-v}{h_\theta}\Big)\Big[f(x,\varphi,v) + 0.5\,h_n^2 f''_{xx}\,K_2(1+o(1)) + 0.5\,h_\varphi^2 f''_{\varphi\varphi}\,K_{\varphi,2}(1+o(1))\Big]dv + \frac{c}{1440 N_\varphi^4 h_\varphi^4}\,R + \frac{c}{1440 N_\theta^4 h_\theta^4}\,K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big) f(x,\varphi,\theta^*)(1+o(1)),$$
where the remainder is
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big) K^{IV}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big) f(x,u^*_0,\theta_0), \dots, \frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big) K^{IV}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big) f(x,u^*_{2N_\theta},\theta_{2N_\theta}), -0.5\pi, 0.5\pi\Big),$$
with $u^*_j \in [\varphi - ch_\varphi, \varphi + ch_\varphi]$. To assess R, fix $u^* \in [\varphi - ch_\varphi, \varphi + ch_\varphi]$ and write
$$K^{IV}_\varphi\Big(\frac{\varphi-u^*_j}{h_\varphi}\Big) f(x,u^*_j,\theta_j) = K^{IV}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f(x,u^*,\theta_j) + (u^*_j - u^*)\Big[K^{IV}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f'_\varphi(x,u^*,\theta_j) - \frac{1}{h_\varphi}K^{V}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f(x,u^*,\theta_j)\Big](1+o(1)).$$
Applying Simpson's approximation in θ then gives
$$R = K^{IV}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big)\int\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-u}{h_\theta}\Big) f(x,u^*,u)\,du\,(1+o(1)) + \cdots = K^{IV}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f(x,u^*,\theta)(1+o(1)) - 2cK^{V}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f(x,u^*,\theta) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\theta^4h_\theta^4}\Big).$$
Thus R = O(1), since $K^{IV}_\varphi$ and $K^{V}_\varphi$ are bounded.

Combining the previous calculations we obtain
$$E\hat f_n(x,\varphi,\theta) = f(x,\varphi,\theta) + 0.5\,h_n^2\,\frac{\partial^2}{\partial x^2} f\,K_2(1+o(1)) + 0.5\,h_\varphi^2\,\frac{\partial^2}{\partial\varphi^2} f\,K_{\varphi,2}(1+o(1)) + 0.5\,h_\theta^2\,\frac{\partial^2}{\partial\theta^2} f\,K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^4}K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big) f(x,\varphi,\theta^*) + \frac{c}{1440N_\varphi^4h_\varphi^4}\Big[K^{IV}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big) f(x,\varphi^*,\theta) - 2cK^{V}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big) f(x,\varphi^*,\theta)\Big](1+o(1))$$
for some ϕ* ∈ [ϕ − ch_ϕ, ϕ + ch_ϕ] and some θ* ∈ [θ − ch_θ, θ + ch_θ].
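The estimator $\hat f_n$ above is built in three stages: spatial kernel smoothing over the voxels, then Simpson-weighted kernel smoothing over the ϕ-grid, then over the θ-grid. A minimal sketch, assuming one-dimensional kernels combined into a product kernel and regular angular grids; all names and shapes here are hypothetical conventions, not the thesis code:

```python
import numpy as np

def simpson_weights(N, a, b):
    """Composite Simpson weights on the 2N+1 equally spaced nodes of [a, b]."""
    w = np.full(2 * N + 1, 2.0)
    w[1::2] = 4.0
    w[0] = w[-1] = 1.0
    return w * (b - a) / (6.0 * N)

def fhat(x, phi, theta, X, Y, hn, hphi, htheta, K, Kphi, Ktheta, phis, thetas):
    """Three-stage estimate of f(x, phi, theta).

    X: (n, 3) voxel locations; Y: (n, 2Nphi+1, 2Ntheta+1) smoothed signal
    values on the regular angular grids phis, thetas.
    """
    n = X.shape[0]
    # Stage 1: spatial smoothing -> w_{j,k}(x), a (2Nphi+1, 2Ntheta+1) array.
    kx = K((x - X) / hn).prod(axis=1) / hn**3          # product-kernel stand-in
    w = np.tensordot(kx, Y, axes=(0, 0)) / n
    # Stage 2: Simpson-weighted kernel smoothing over phi.
    aphi = simpson_weights((len(phis) - 1) // 2, -np.pi, np.pi)
    u = (aphi * Kphi((phi - phis) / hphi) / hphi) @ w   # (2Ntheta+1,)
    # Stage 3: Simpson-weighted kernel smoothing over theta.
    atheta = simpson_weights((len(thetas) - 1) // 2, -np.pi / 2, np.pi / 2)
    return (atheta * Ktheta((theta - thetas) / htheta) / htheta) @ u
```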
Lemma 6. Let $h_n, h_\varphi, h_\theta \to 0$ and $N_\varphi, N_\theta \to \infty$. Under conditions (K) on the kernels $K_\varphi$ and $K_\theta$ we have
$$E\frac{\partial}{\partial\varphi}\hat f_n(x,\varphi,\theta) = \frac{\partial}{\partial\varphi} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^5}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^4}\Big),$$
$$E\frac{\partial}{\partial\theta}\hat f_n(x,\varphi,\theta) = \frac{\partial}{\partial\theta} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^4}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^5}\Big),$$
$$E\frac{\partial^2}{\partial\varphi^2}\hat f_n(x,\varphi,\theta) = \frac{\partial^2}{\partial\varphi^2} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^6}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^4}\Big),$$
$$E\frac{\partial^2}{\partial\theta^2}\hat f_n(x,\varphi,\theta) = \frac{\partial^2}{\partial\theta^2} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^4}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^6}\Big),$$
$$E\frac{\partial^2}{\partial\theta\partial\varphi}\hat f_n(x,\varphi,\theta) = \frac{\partial^2}{\partial\theta\partial\varphi} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^5}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^5}\Big).$$

Proof. (1) By linearity of the differential operator,
$$\frac{\partial}{\partial\varphi}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)\frac{\partial}{\partial\varphi}u_0(x,\varphi), \dots, \frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)\frac{\partial}{\partial\varphi}u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big),$$
and by linearity of the expectation operator the same identity holds with expectations inside. First, note that
$$E\frac{\partial}{\partial\varphi}u_k(x,\varphi) = S_{N_\varphi}\Big(\frac{1}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi-\varphi_0}{h_\varphi}\Big)Ew_{0,k}(x), \dots, \frac{1}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi-\varphi_{2N_\varphi}}{h_\varphi}\Big)Ew_{2N_\varphi,k}(x), -\pi, \pi\Big)$$
$$= \int_{\varphi-ch_\varphi}^{\varphi+ch_\varphi}\frac{1}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)\big[f(x,u,\theta_k) + 0.5h_n^2 f''_{xx}(x,u,\theta_k)K_2(1+o(1))\big]du + \frac{2ch_\varphi}{180(2N_\varphi)^4}\frac{\partial^4}{\partial u^4}\Big[\frac{1}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)f(x,u,\theta_k)\Big]\Big|_{u=u^*_k}.$$
By Lemma 2,
$$E\frac{\partial}{\partial\varphi}u_k(x,\varphi) = f'_\varphi(x,\varphi,\theta_k) + 0.5h_n^2 f'''_{xx\varphi}(x,\varphi,\theta_k)K_2(1+o(1)) + 0.5h_\varphi^2 f'''_{\varphi\varphi\varphi}(x,\varphi,\theta_k)K_{\varphi,2}(1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^5}K^{V}_\varphi\Big(\frac{\varphi-u^*_k}{h_\varphi}\Big)f(x,u^*_k,\theta_k)(1+o(1)).$$
Thus by Proposition 4,
$$E\frac{\partial}{\partial\varphi}\hat f_n(x,\varphi,\theta) = \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta}K_\theta\Big(\frac{\theta-u}{h_\theta}\Big)\big[f'_\varphi(x,\varphi,u) + 0.5h_n^2 f'''_{xx\varphi}K_2(1+o(1)) + 0.5h_\varphi^2 f'''_{\varphi\varphi\varphi}K_{\varphi,2}(1+o(1))\big]du + \frac{c}{1440N_\varphi^4h_\varphi^5}R + \frac{c}{1440N_\theta^4h_\theta^4}K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*)(1+o(1))$$
$$= f'_\varphi(x,\varphi,\theta) + 0.5h_n^2 f'''_{xx\varphi}(x,\varphi,\theta)K_2(1+o(1)) + 0.5h_\varphi^2 f'''_{\varphi\varphi\varphi}(x,\varphi,\theta)K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f'''_{\varphi\theta\theta}(x,\varphi,\theta)K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^4}K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*)(1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^5}\Big[K^{V}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f(x,\varphi^*,\theta) - 2cK^{VI}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f(x,\varphi^*,\theta)\Big](1+o(1)).$$
The remainder R is calculated exactly as above. $\Box$
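The mechanism in part (1) is that differentiating the estimator in ϕ simply replaces the angular kernel $\frac1h K_\varphi(\cdot/h)$ by $\frac{1}{h^2}K'_\varphi(\cdot/h)$. A minimal sketch of this with a Gaussian kernel, whose derivative is available in closed form (grid, values and weights are hypothetical inputs):

```python
import numpy as np

# Differentiating the angular smoother in phi amounts to replacing
# (1/h) K((phi - grid)/h) by (1/h**2) K'((phi - grid)/h).
Kg  = lambda z: np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)   # Gaussian kernel
dKg = lambda z: -z * Kg(z)                               # its derivative

def dsmooth_dphi(phi, grid, vals, weights, h):
    """Simpson-weighted estimate of the phi-derivative of the smoothed signal."""
    return np.sum(weights * dKg((phi - grid) / h) / h**2 * vals)
```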
Proof. (2) As in the proof of part (1), the linearity of the differential and expectation operators gives
$$\frac{\partial}{\partial\theta}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)u_0(x,\varphi), \dots, \frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big)$$
and the same identity with expectations inside. By Proposition 4,
$$E\frac{\partial}{\partial\theta}\hat f_n(x,\varphi,\theta) = \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-v}{h_\theta}\Big)\big[f(x,\varphi,v) + 0.5h_n^2 f''_{xx}K_2(1+o(1)) + 0.5h_\varphi^2 f''_{\varphi\varphi}K_{\varphi,2}(1+o(1))\big]dv + \frac{c}{1440N_\varphi^4h_\varphi^4}R + \frac{c}{1440N_\theta^4h_\theta^5}K^{V}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*)(1+o(1))$$
$$= f'_\theta(x,\varphi,\theta) + 0.5h_n^2 f'''_{xx\theta}K_2(1+o(1)) + 0.5h_\varphi^2 f'''_{\varphi\varphi\theta}K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f'''_{\theta\theta\theta}K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^5}\Big[K^{V}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*) - 2cK^{VI}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*)\Big](1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^4}K^{IV}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f'_\theta(x,\varphi^*,\theta)(1+o(1)).$$
Note that R was calculated in the same way as before, but this time with
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)K^{IV}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big)f(x,u^*_0,\theta_0)(1+o(1)), \dots, \frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)K^{IV}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big)f(x,u^*_{2N_\theta},\theta_{2N_\theta})(1+o(1)), -0.5\pi, 0.5\pi\Big). \qquad\Box$$

Proof. (3) Similarly,
$$\frac{\partial^2}{\partial\varphi^2}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)\frac{\partial^2}{\partial\varphi^2}u_0(x,\varphi), \dots, \frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)\frac{\partial^2}{\partial\varphi^2}u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big),$$
so again by linearity of the expectation operator we must first calculate
$$E\frac{\partial^2}{\partial\varphi^2}u_k(x,\varphi) = S_{N_\varphi}\Big(\frac{1}{h_\varphi^3}K''_\varphi\Big(\frac{\varphi-\varphi_0}{h_\varphi}\Big)Ew_{0,k}(x), \dots, \frac{1}{h_\varphi^3}K''_\varphi\Big(\frac{\varphi-\varphi_{2N_\varphi}}{h_\varphi}\Big)Ew_{2N_\varphi,k}(x), -\pi, \pi\Big)$$
$$= \int_{\varphi-ch_\varphi}^{\varphi+ch_\varphi}\frac{1}{h_\varphi^3}K''_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)\big[f(x,u,\theta_k)+0.5h_n^2 f''_{xx}K_2(1+o(1))\big]du + \frac{2ch_\varphi}{180(2N_\varphi)^4}\frac{\partial^4}{\partial u^4}\Big[\frac{1}{h_\varphi^3}K''_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)f(x,u,\theta_k)\Big]\Big|_{u=u^*_k}.$$
By Lemma 2,
$$E\frac{\partial^2}{\partial\varphi^2}u_k(x,\varphi) = f''_{\varphi\varphi}(x,\varphi,\theta_k) + 0.5h_n^2 f^{IV}_{xx\varphi\varphi}K_2(1+o(1)) + 0.5h_\varphi^2 f^{IV}_{\varphi\varphi\varphi\varphi}K_{\varphi,2}(1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^6}K^{VI}_\varphi\Big(\frac{\varphi-u^*_k}{h_\varphi}\Big)f(x,u^*_k,\theta_k)(1+o(1)).$$
Thus by Proposition 4,
$$E\frac{\partial^2}{\partial\varphi^2}\hat f_n(x,\varphi,\theta) = f''_{\varphi\varphi}(x,\varphi,\theta) + 0.5h_n^2 f^{IV}_{xx\varphi\varphi}K_2(1+o(1)) + 0.5h_\varphi^2 f^{IV}_{\varphi\varphi\varphi\varphi}K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f^{IV}_{\varphi\varphi\theta\theta}K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^4}K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f''_{\varphi\varphi}(x,\varphi,\theta^*)(1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^6}\Big[K^{VI}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f(x,\varphi^*,\theta) - 2cK^{VII}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f(x,\varphi^*,\theta)\Big](1+o(1)).$$
Note that R was calculated in the same way as before, but this time with
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)K^{VI}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big)f(x,u^*_0,\theta_0)(1+o(1)), \dots, \frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)K^{VI}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big)f(x,u^*_{2N_\theta},\theta_{2N_\theta})(1+o(1)), -0.5\pi, 0.5\pi\Big). \qquad\Box$$

Proof. (4) Similarly,
$$\frac{\partial^2}{\partial\theta^2}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)u_0(x,\varphi), \dots, \frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big)$$
and the same identity with expectations inside. By Proposition 4,
$$E\frac{\partial^2}{\partial\theta^2}\hat f_n(x,\varphi,\theta) = \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-v}{h_\theta}\Big)\big[f(x,\varphi,v)+0.5h_n^2 f''_{xx}K_2(1+o(1))+0.5h_\varphi^2 f''_{\varphi\varphi}K_{\varphi,2}(1+o(1))\big]dv + \frac{c}{1440N_\varphi^4h_\varphi^4}R + \frac{c}{1440N_\theta^4h_\theta^6}K^{VI}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*)(1+o(1))$$
$$= f''_{\theta\theta}(x,\varphi,\theta) + 0.5h_n^2 f^{IV}_{xx\theta\theta}K_2(1+o(1)) + 0.5h_\varphi^2 f^{IV}_{\varphi\varphi\theta\theta}K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f^{IV}_{\theta\theta\theta\theta}K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^6}\Big[K^{VI}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*) - 2cK^{VII}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*)\Big](1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^4}K^{IV}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f''_{\theta\theta}(x,\varphi^*,\theta)(1+o(1)).$$
Note that R was calculated in the same way as before, but this time with
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)K^{IV}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big)f(x,u^*_0,\theta_0)(1+o(1)), \dots, \frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)K^{IV}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big)f(x,u^*_{2N_\theta},\theta_{2N_\theta})(1+o(1)), -0.5\pi, 0.5\pi\Big). \qquad\Box$$

Proof. (5) Similarly,
$$\frac{\partial^2}{\partial\theta\partial\varphi}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)\frac{\partial}{\partial\varphi}u_0(x,\varphi), \dots, \frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)\frac{\partial}{\partial\varphi}u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big)$$
and the same identity with expectations inside. By Proposition 4,
$$E\frac{\partial^2}{\partial\theta\partial\varphi}\hat f_n(x,\varphi,\theta) = \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-v}{h_\theta}\Big)\big[f'_\varphi(x,\varphi,v)+0.5h_n^2 f'''_{xx\varphi}K_2(1+o(1))+0.5h_\varphi^2 f'''_{\varphi\varphi\varphi}K_{\varphi,2}(1+o(1))\big]dv + \frac{c}{1440N_\varphi^4h_\varphi^5}R + \frac{c}{1440N_\theta^4h_\theta^5}K^{V}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*)(1+o(1))$$
$$= f''_{\varphi\theta}(x,\varphi,\theta) + 0.5h_n^2 f^{IV}_{xx\varphi\theta}K_2(1+o(1)) + 0.5h_\varphi^2 f^{IV}_{\varphi\varphi\varphi\theta}K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f^{IV}_{\varphi\theta\theta\theta}K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^5}\Big[K^{V}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*) - 2cK^{VI}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*)\Big](1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^5}K^{V}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f'_\theta(x,\varphi^*,\theta)(1+o(1)).$$
Note that R was calculated in the same way as before, but this time with
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)K^{V}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big)f(x,u^*_0,\theta_0)(1+o(1)), \dots, \frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)K^{V}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big)f(x,u^*_{2N_\theta},\theta_{2N_\theta})(1+o(1)), -0.5\pi, 0.5\pi\Big). \qquad\Box$$
6.2 Calculation of Mean and Covariance of $\hat Z_n$

Recall the process
$$\hat Z_n(t) = -\int_0^t U(t,s)\begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial}{\partial\varphi}(\hat f_n - f) \\ \frac{\partial}{\partial\theta}(\hat f_n - f) \end{pmatrix}\Bigg|_{x^*(s)} ds.$$
Keep in mind that $\frac{\partial f}{\partial\varphi}(x^*(s)) = \frac{\partial f}{\partial\theta}(x^*(s)) = 0$.

The partial derivatives of $\hat f_n$ are easier to write as
$$\frac{\partial \hat f_n(x)}{\partial\varphi} = \frac{1}{nh_n^3}\sum_{k=1}^{n}\sum_{l=0}^{2N_\varphi}\sum_{m=0}^{2N_\theta}\frac{a_m}{h_\theta}K_\theta\Big(\frac{\theta(x)-\theta_m}{h_\theta}\Big)\frac{b_l}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi(x)-\varphi_l}{h_\varphi}\Big)K\Big(\frac{x-X_k}{h_n}\Big)Y_{klm},$$
where $a_m = O(N_\theta^{-1})$ and $b_l = O(N_\varphi^{-1})$ are the coefficients in Simpson's scheme. Denote
$$K_{m,l} = \int u^l K^{(m)}(u)\,du.$$
Furthermore, denote
$$M(s) = \begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\Bigg|_{x^*(s)} \quad\text{and}\quad F(x) = \begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\Bigg|_{x}.$$
Consider the mean function of the process $\hat Z_n(t)$.

Lemma 7. Suppose $h_n \to 0$, $h_\varphi \to 0$, $h_\theta \to 0$. Then
$$E\hat Z_n(t) = -(1+o(1))\int_0^t U(t,s)M(s)F^{-1}(x^*(s))\Bigg[h_n^2\begin{pmatrix}\tfrac12 K_{0,2}K_{\varphi,1,1}\,\frac{\partial^3 f}{\partial x^2\partial\varphi} \\[1mm] \tfrac12 K_{0,2}K_{\theta,1,1}\,\frac{\partial^3 f}{\partial x^2\partial\theta}\end{pmatrix} + h_\varphi^2\begin{pmatrix}\tfrac16 K_{\varphi,1,3}\,\frac{\partial^3 f}{\partial\varphi^3} \\[1mm] \tfrac13 K_{\theta,1,1}K_{\varphi,0,2}\,\frac{\partial^3 f}{\partial\varphi^2\partial\theta}\end{pmatrix} + h_\theta^2\begin{pmatrix}\tfrac13 K_{\varphi,1,1}K_{\theta,0,2}\,\frac{\partial^3 f}{\partial\varphi\partial\theta^2} \\[1mm] \tfrac16 K_{\theta,1,3}\,\frac{\partial^3 f}{\partial\theta^3}\end{pmatrix}\Bigg]ds,$$
where all derivatives of f are evaluated at $(x^*(s), \varphi^*(x^*(s)), \theta^*(x^*(s)))$.

Proof. We calculated the asymptotic expressions for the expectations of the estimators of the derivatives of f in the previous section. The above is the result of applying those calculations to the defined process $\hat Z_n$; no other methods are needed. $\Box$

Next consider the covariance: for any $t_1, t_2 \in [0,T]$,
$$\mathrm{Cov}(\hat Z_n(t_1), \hat Z_n(t_2)) = \int\!\!\int I(s_1\in[0,t_1])\,I(s_2\in[0,t_2])\,U(t_1,s_1)M(s_1)F^{-1}(x^*(s_1))\begin{pmatrix} \mathrm{Cov}\big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\varphi}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\varphi}\big) & \mathrm{Cov}\big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\varphi}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\theta}\big) \\ \mathrm{Cov}\big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\theta}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\varphi}\big) & \mathrm{Cov}\big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\theta}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\theta}\big) \end{pmatrix} M^*(s_2)(F^*)^{-1}(x^*(s_2))U^*(t_2,s_2)\,ds_1ds_2. \tag{6.1}$$

Consider the covariance matrix in the middle of the above. For the first entry, writing $x_1 = x^*(s_1)$ and $x_2 = x^*(s_2)$, we have
$$\mathrm{Cov}\Big(\frac{\partial\hat f_n(x_1)}{\partial\varphi}, \frac{\partial\hat f_n(x_2)}{\partial\varphi}\Big) = \frac{1}{n^2h_n^6}\sum_{k_1,k_2=1}^{n}\sum_{l_1,l_2=0}^{2N_\varphi}\sum_{m_1,m_2=0}^{2N_\theta}\frac{a_{m_1}a_{m_2}}{h_\theta^2}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-\theta_{m_2}}{h_\theta}\Big)\frac{b_{l_1}b_{l_2}}{h_\varphi^4}K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_2}}{h_\varphi}\Big)$$
$$\qquad\times\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)Y_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)Y_{k_2l_2m_2}\Big).$$
The latter covariance consists of two terms,
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)f(X_{k_1},\varphi_{l_1},\theta_{m_1}),\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)f(X_{k_2},\varphi_{l_2},\theta_{m_2})\Big)$$
and
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)S(X_{k_1},\varphi_{l_1},\theta_{m_1})\varepsilon_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)S(X_{k_2},\varphi_{l_2},\theta_{m_2})\varepsilon_{k_2l_2m_2}\Big).$$
The first term is zero for $k_1 \neq k_2$ due to independence, while for $k_1 = k_2$ we have
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)f(X_{k_1},\varphi_{l_1},\theta_{m_1}),\ K\Big(\frac{x_2-X_{k_1}}{h_n}\Big)f(X_{k_1},\varphi_{l_2},\theta_{m_2})\Big)$$
$$= h_n^3\int K(u)K\Big(u + \frac{x_1-x_2}{h_n}\Big)f(x_1 - uh_n,\varphi_{l_1},\theta_{m_1})f(x_1 - uh_n,\varphi_{l_2},\theta_{m_2})\,du\,(1+O(h_n^2))$$
$$= h_n^3\,\Psi\Big(\frac{x_1-x_2}{h_n}\Big)f(x_1,\varphi_{l_1},\theta_{m_1})f(x_2,\varphi_{l_2},\theta_{m_2})(1+O(h_n^2)),$$
where we used the substitution $u = (x - x_1)/h_n$ and $\Psi(z) := \int K(u)K(u+z)\,du$.

The second term is zero when $l_1 \neq l_2$ or $m_1 \neq m_2$, while in the case $l_1 = l_2$, $m_1 = m_2$ we have for $k_1 = k_2$
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)S(X_{k_1},\varphi_{l_1},\theta_{m_1})\varepsilon_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_1}}{h_n}\Big)S(X_{k_1},\varphi_{l_1},\theta_{m_1})\varepsilon_{k_1l_1m_1}\Big) = h_n^3\,\Psi\Big(\frac{x_1-x_2}{h_n}\Big)S(x_1,\varphi_{l_1},\theta_{m_1})S(x_2,\varphi_{l_1},\theta_{m_1})(1+O(h_n^2)),$$
while for $k_1 \neq k_2$
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)S\varepsilon_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)S\varepsilon_{k_2l_1m_1}\Big) = \Sigma_{k_1k_2}\,h_n^6\,S(x_1,\varphi_{l_1},\theta_{m_1})S(x_2,\varphi_{l_1},\theta_{m_1})(1+O(h_n^2)).$$

Now we consider the summation for the (1,1)-component of (6.1):
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}\frac{a_{m_1}a_{m_2}}{h_\theta^2}\frac{b_{l_1}b_{l_2}}{h_\varphi^4}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-\theta_{m_2}}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_2}}{h_\varphi}\Big)f(x_1,\varphi_{l_1},\theta_{m_1})f(x_2,\varphi_{l_2},\theta_{m_2})$$
$$+\ \frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\sum_{m_1=0}^{2N_\theta}\sum_{l_1=0}^{2N_\varphi}\frac{a^2_{m_1}}{h_\theta^2}\frac{b^2_{l_1}}{h_\varphi^4}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-\theta_{m_1}}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_1}}{h_\varphi}\Big)S(x_1,\varphi_{l_1},\theta_{m_1})S(x_2,\varphi_{l_1},\theta_{m_1})$$
$$+\ \Big[\frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\sum_{m_1=0}^{2N_\theta}\sum_{l_1=0}^{2N_\varphi}\frac{a^2_{m_1}}{h_\theta^2}\frac{b^2_{l_1}}{h_\varphi^4}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-\theta_{m_1}}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_1}}{h_\varphi}\Big)S(x_1,\varphi_{l_1},\theta_{m_1})S(x_2,\varphi_{l_1},\theta_{m_1})(1+O(h_n^2)).$$
The first term above is an ordinary Simpson double integral. The next two terms have the coefficients squared, so we need Proposition 5. The (1,1)-component of (6.1) then becomes
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\frac{1}{h_\theta^2h_\varphi^4}\int\!\!\int\!\!\int\!\!\int K_\theta\Big(\frac{\theta(x_1)-u_1}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-u_2}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-v_1}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-v_2}{h_\varphi}\Big)f(x_1,v_1,u_1)f(x_2,v_2,u_2)\,dv_1dv_2du_1du_2$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\frac{25}{81N_\varphi N_\theta}\frac{1}{h_\theta^2h_\varphi^4}\int\!\!\int K_\theta\Big(\frac{\theta(x_1)-u}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-u}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-v}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-v}{h_\varphi}\Big)S(x_1,v,u)S(x_2,v,u)\,dv\,du,$$
with the angular integrals over $[\theta(x_i) - ch_\theta, \theta(x_i) + ch_\theta]$ and $[\varphi(x_i) - ch_\varphi, \varphi(x_i) + ch_\varphi]$. Now we make the substitutions $z_i = (\varphi(x_i) - v_i)/h_\varphi$ and $y_i = (\theta(x_i) - u_i)/h_\theta$ in the integrals:
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\frac{1}{h_\varphi^2}\int\!\!\int\!\!\int\!\!\int K_\theta(y_1)K_\theta(y_2)K'_\varphi(z_1)K'_\varphi(z_2)\,f(x_1,\varphi(x_1)-z_1h_\varphi,\theta(x_1)-y_1h_\theta)\,f(x_2,\varphi(x_2)-z_2h_\varphi,\theta(x_2)-y_2h_\theta)\,dz_1dz_2dy_1dy_2$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\frac{25}{81N_\varphi N_\theta}\frac{1}{h_\varphi^3h_\theta}\int\!\!\int K_\theta(y)K_\theta\Big(y + \frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)K'_\varphi(z)K'_\varphi\Big(z + \frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)$$
$$\qquad\times S(x_1,\varphi(x_1)-zh_\varphi,\theta(x_1)-yh_\theta)\,S(x_2,\varphi(x_1)-zh_\varphi,\theta(x_1)-yh_\theta)\,dz\,dy.$$
Recall properties (K).
Applying a Taylor expansion to the functions f and S yields the following expression for the second summand of the (1,1)-component of (6.1):
$$\Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big](-1)\Psi''_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)\Psi_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\varphi^3}\frac{1}{h_\theta}\,\frac{25}{81N_\varphi N_\theta},$$
where we have defined the integrals
$$\Psi(z) = \int K(u)K(z+u)\,du, \qquad \Psi'(z) = \int K(u)K'(z+u)\,du,$$
and integration by parts gives
$$\Psi''(z) = -\int K'(u)K'(z+u)\,du.$$

Consider the first summand of the (1,1)-component of (6.1):
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\bigg[\frac{\partial f}{\partial\varphi}(x_1,\varphi(x_1),\theta(x_1))\frac{\partial f}{\partial\varphi}(x_2,\varphi(x_2),\theta(x_2))K^2_{\varphi,1,1}$$
$$\quad + h_\varphi^2\Big(\frac{\partial f}{\partial\varphi}(x_1,\cdot)\frac{\partial^3 f}{\partial\varphi^3}(x_2,\cdot) + \frac{\partial^3 f}{\partial\varphi^3}(x_1,\cdot)\frac{\partial f}{\partial\varphi}(x_2,\cdot)\Big)K_{\varphi,1,1}K_{\varphi,1,3}$$
$$\quad + h_\varphi^4\Big(\frac{\partial f}{\partial\varphi}(x_1,\cdot)\frac{\partial^5 f}{\partial\varphi^5}(x_2,\cdot) + \frac{\partial^5 f}{\partial\varphi^5}(x_1,\cdot)\frac{\partial f}{\partial\varphi}(x_2,\cdot)\Big)K_{\varphi,1,1}K_{\varphi,1,5} + h_\varphi^4\,\frac{\partial^3 f}{\partial\varphi^3}(x_1,\cdot)\frac{\partial^3 f}{\partial\varphi^3}(x_2,\cdot)K^2_{\varphi,1,3} + \cdots\bigg],$$
where $K_{\varphi,m,l} = \int\varphi^lK^{(m)}(\varphi)\,d\varphi$. Recall that this expression is evaluated at $x_1 = x^*(s_1)$, $x_2 = x^*(s_2)$, where $\frac{\partial f}{\partial\varphi}$ is zero; that is, recall the equations
$$\frac{\partial}{\partial\varphi}f(x,\varphi^*(x),\theta^*(x)) = \frac{\partial}{\partial\theta}f(x,\varphi^*(x),\theta^*(x)) = 0,$$
$$\frac{\partial^2}{\partial\varphi^2}f(x,\varphi^*(x),\theta^*(x)) < 0, \qquad \frac{\partial^2}{\partial\theta^2}f(x,\varphi^*(x),\theta^*(x)) < 0,$$
$$\frac{\partial^2}{\partial\varphi^2}f(x,\varphi^*(x),\theta^*(x))\,\frac{\partial^2}{\partial\theta^2}f(x,\varphi^*(x),\theta^*(x)) - \Big(\frac{\partial^2}{\partial\theta\partial\varphi}f(x,\varphi^*(x),\theta^*(x))\Big)^2 > 0.$$
Then only the last displayed summand in the first summand of the (1,1)-component of (6.1) is nonzero. As a result, the (1,1)-component of (6.1) is
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)h_\varphi^4\,\frac{\partial^3 f}{\partial\varphi^3}(x_1,\varphi(x_1),\theta(x_1))\,\frac{\partial^3 f}{\partial\varphi^3}(x_2,\varphi(x_2),\theta(x_2))\,K^2_{\varphi,1,3}$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big](-1)\Psi''_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)\Psi_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\varphi^3h_\theta}\,\frac{25}{81N_\varphi N_\theta}.$$
The (2,2)-component of (6.1) is easily obtained from the above by the exchange of ϕ and θ, so it is
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)h_\theta^4\,\frac{\partial^3 f}{\partial\theta^3}(x_1,\varphi(x_1),\theta(x_1))\,\frac{\partial^3 f}{\partial\theta^3}(x_2,\varphi(x_2),\theta(x_2))\,K^2_{\theta,1,3}$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big](-1)\Psi''_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)\Psi_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\theta^3h_\varphi}\,\frac{25}{81N_\varphi N_\theta}.$$
Now consider the (1,2)-component of (6.1):
$$\mathrm{Cov}\Big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\varphi}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\theta}\Big) = \frac{1}{n^2h_n^6}\sum_{k_1,k_2=1}^{n}\sum_{l_1,l_2=0}^{2N_\varphi}\sum_{m_1,m_2=0}^{2N_\theta}\frac{a_{m_1}a_{m_2}}{h_\theta^3}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K'_\theta\Big(\frac{\theta(x_2)-\theta_{m_2}}{h_\theta}\Big)\frac{b_{l_1}b_{l_2}}{h_\varphi^3}K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_2}}{h_\varphi}\Big)$$
$$\qquad\times\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)Y_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)Y_{k_2l_2m_2}\Big),$$
which, exactly as for the (1,1)-component, splits into an ordinary Simpson double sum with the f-products, a squared-coefficient sum with the S-products for $k_1 = k_2$, and a $k_1 \neq k_2$ sum weighted by $\Sigma_{k_1k_2}$. Applying Simpson's approximation (Proposition 4 for the first sum, Proposition 5 for the squared-coefficient sums) and the substitutions
$$y_1 = \frac{\theta(x_1)-u_1}{h_\theta},\quad y_2 = \frac{\theta(x_2)-u_2}{h_\theta},\quad z_1 = \frac{\varphi(x_1)-v_1}{h_\varphi},\quad z_2 = \frac{\varphi(x_2)-v_2}{h_\varphi},\quad y = \frac{\theta(x_1)-u}{h_\theta},\quad z = \frac{\varphi(x_1)-v}{h_\varphi},$$
the above becomes
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\frac{1}{h_\varphi h_\theta}\int\!\!\int\!\!\int\!\!\int K_\theta(y_1)K'_\theta(y_2)K'_\varphi(z_1)K_\varphi(z_2)\,f(x_1,\varphi(x_1)-z_1h_\varphi,\theta(x_1)-y_1h_\theta)\,f(x_2,\varphi(x_2)-z_2h_\varphi,\theta(x_2)-y_2h_\theta)\,dz_1dz_2dy_1dy_2$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\frac{25}{81N_\varphi N_\theta}\frac{1}{h_\varphi^2h_\theta^2}\int\!\!\int K_\theta(y)K'_\theta\Big(y + \frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)K'_\varphi(z)K_\varphi\Big(z + \frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)$$
$$\qquad\times S(x_1,\varphi(x_1)-zh_\varphi,\theta(x_1)-yh_\theta)\,S(x_2,\varphi(x_1)-zh_\varphi,\theta(x_1)-yh_\theta)\,dz\,dy.$$
Finally, applying the Taylor expansions as before, the (1,2)-component of (6.1) is
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)h_\varphi^2h_\theta^2\,\frac{\partial^3 f}{\partial\varphi^3}(x_1,\varphi(x_1),\theta(x_1))\,\frac{\partial^3 f}{\partial\theta^3}(x_2,\varphi(x_2),\theta(x_2))\,K_{\varphi,1,3}K_{\theta,1,3}$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\Psi'_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)\Psi'_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\varphi^2h_\theta^2}\,\frac{25}{81N_\varphi N_\theta}.$$
Similarly, the (2,1)-component of (6.1) is
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)h_\varphi^2h_\theta^2\,\frac{\partial^3 f}{\partial\theta^3}(x_1,\varphi(x_1),\theta(x_1))\,\frac{\partial^3 f}{\partial\varphi^3}(x_2,\varphi(x_2),\theta(x_2))\,K_{\varphi,1,3}K_{\theta,1,3}$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\Psi'_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)\Psi'_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\varphi^2h_\theta^2}\,\frac{25}{81N_\varphi N_\theta}.$$
To balance both terms, with f and with S, in the covariance matrix (6.1) we need the following assumptions:
$$\frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2} = \frac{\kappa}{nh_n^3} \qquad\text{and}\qquad N_\varphi = N_\theta = N,\quad h_\varphi = h_\theta = h,$$
where κ > 0 is a constant.
Now we plug the expression for (6.1) into the covariance function of the process $\hat Z_n$. First we restate the following for convenience:
$$\Psi(z) = \int K(u)K(z+u)\,du, \qquad \Psi'(z) = \int K(u)K'(z+u)\,du,$$
and integration by parts gives
$$\Psi''(z) = -\int K'(u)K'(z+u)\,du.$$
For instance, for a standard Gaussian kernel $\Psi(z) = \dfrac{e^{-|z|^2/4}}{(2\sqrt\pi)^3}$. Now out of the kernels $K_\varphi$ and $K_\theta$ build the matrix
$$\Psi_0 = \begin{pmatrix} \Psi''_\varphi(0)\Psi_\theta(0) & \Psi'_\varphi(0)\Psi'_\theta(0) \\ \Psi'_\varphi(0)\Psi'_\theta(0) & \Psi_\varphi(0)\Psi''_\theta(0) \end{pmatrix}.$$
If both $K_\varphi$ and $K_\theta$ are Gaussian then $\Psi_0 = -\mathrm{diag}\big(\frac{1}{8\pi}, \frac{1}{8\pi}\big)$. For $v \in \mathbb R^3$ define
$$\psi(v) = \int_{\mathbb R}\Psi(-\tau v)\,d\tau,$$
which equals $\dfrac{1}{4\pi|v|}$ for a standard Gaussian kernel. Also introduce
$$\Psi_0(v,x) = \int_{\mathbb R}\begin{pmatrix} -\Psi''_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) & \Psi'_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi'_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) \\ \Psi'_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi'_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) & -\Psi_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi''_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) \end{pmatrix}\big(\Psi(-v\tau) + \kappa\big)\,d\tau.$$
In the case of a standard Gaussian kernel
$$\Psi_0(v,x) = -\frac{1}{D(x)}\begin{pmatrix} 1 + \big(\frac{\partial\theta^*}{\partial x}(x)v\big)^2 & -\big(\frac{\partial\varphi^*}{\partial x}(x)v\big)\big(\frac{\partial\theta^*}{\partial x}(x)v\big) \\ -\big(\frac{\partial\varphi^*}{\partial x}(x)v\big)\big(\frac{\partial\theta^*}{\partial x}(x)v\big) & 1 + \big(\frac{\partial\varphi^*}{\partial x}(x)v\big)^2 \end{pmatrix}, \qquad D(x) = 64\pi^2\Big(1 + \Big(\frac{\partial\varphi^*}{\partial x}(x)v\Big)^2 + \Big(\frac{\partial\theta^*}{\partial x}(x)v\Big)^2\Big)^{3/2}.$$
Also, define
$$M(s) = \begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\Bigg|_{x^*(s)} \quad\text{and}\quad F(x,\varphi,\theta) = \begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\Bigg|_{(x,\varphi,\theta)}.$$
The Green's function U(t,s) is defined as the solution of the PDE
$$\frac{\partial U(t,s)}{\partial t} = \nabla v^*(x^*(t))\,U(t,s), \qquad U(s,s) = I \ \ \forall s > 0,$$
where
$$\nabla v(x^*(s)) = -M(s)F^{-1}(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\begin{pmatrix} \frac{\partial^2}{\partial\varphi\partial x}f(x^*(s)) \\ \frac{\partial^2}{\partial\theta\partial x}f(x^*(s)) \end{pmatrix}.$$
Then, returning to the covariance function of the process $\hat Z_n$, we combine the above calculations using these definitions:
$$\mathrm{Cov}(\hat Z_n(t_1),\hat Z_n(t_2)) = \int\!\!\int I(s_1\in[0,t_1])I(s_2\in[0,t_2])\,U(t_1,s_1)M(s_1)F^{-1}(x^*(s_1))\bigg\{\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x^*(s_1)-x^*(s_2)}{h_n}\Big)h^4\,G(x^*(s_1))G^*(x^*(s_2))$$
$$\quad + \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x^*(s_1)-x^*(s_2)}{h_n}\Big) + \frac{\kappa}{nh_n^3}\Big]\frac{25}{81}\frac{1}{N^2h^4}\,\Psi(s_1,s_2)\,S(x^*(s_1),\varphi^*(x^*(s_1)),\theta^*(x^*(s_1)))\,S(x^*(s_2),\varphi^*(x^*(s_1)),\theta^*(x^*(s_1)))\bigg\}$$
$$\qquad\times M^*(s_2)(F^*)^{-1}(x^*(s_2))U^*(t_2,s_2)\,ds_1ds_2,$$
where
$$G(x) := \begin{pmatrix} \frac{\partial^3}{\partial\varphi^3}f(x)\,K_{\varphi,1,3} \\ \frac{\partial^3}{\partial\theta^3}f(x)\,K_{\theta,1,3} \end{pmatrix} \quad\text{and}\quad \Psi(s_1,s_2) := \begin{pmatrix} -\Psi''_\varphi\big(\frac{\Delta\varphi^*}{h_\varphi}\big)\Psi_\theta\big(\frac{\Delta\theta^*}{h_\theta}\big) & \Psi'_\varphi\big(\frac{\Delta\varphi^*}{h_\varphi}\big)\Psi'_\theta\big(\frac{\Delta\theta^*}{h_\theta}\big) \\ \Psi'_\varphi\big(\frac{\Delta\varphi^*}{h_\varphi}\big)\Psi'_\theta\big(\frac{\Delta\theta^*}{h_\theta}\big) & -\Psi_\varphi\big(\frac{\Delta\varphi^*}{h_\varphi}\big)\Psi''_\theta\big(\frac{\Delta\theta^*}{h_\theta}\big) \end{pmatrix},$$
with $\Delta\varphi^* = \varphi^*(x^*(s_2)) - \varphi^*(x^*(s_1))$ and $\Delta\theta^* = \theta^*(x^*(s_2)) - \theta^*(x^*(s_1))$.

Now consider the change of variable $s_2 = s_1 + \tau h$ with $ds_2 = h\,d\tau$. Then we have proven the following result:

Lemma 8.
$$\mathrm{Cov}(\hat Z_n(t_1),\hat Z_n(t_2)) = \int_0^{t_1\wedge t_2}\frac{1+O(h_n^2)}{nh_n^2}\,U(t_1,s)M(s)F^{-1}(x^*(s))\bigg\{h^4\,\psi\Big(\frac{dx^*}{dt}(s)\Big)G(x^*(s))G^*(x^*(s))$$
$$\qquad + \frac{25}{81}\frac{1}{N^2h^4}\,\Psi_0\Big(\frac{dx^*}{dt}(s), x^*(s)\Big)S^2(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\bigg\}M^*(s)(F^*)^{-1}(x^*(s))U^*(t_2,s)\,ds.$$
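The closed form for Ψ under a standard Gaussian kernel is easy to confirm numerically. A minimal sketch, using one-dimensional quadrature since the 3D Gaussian kernel factorizes; the grid resolution is an arbitrary choice:

```python
import numpy as np

# Check of Psi(z) = exp(-|z|^2/4) / (2*sqrt(pi))**3 for the standard
# Gaussian kernel; the 3D convolution is a product of 1D ones.
def psi_1d(z, grid):
    K = lambda u: np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)
    du = grid[1] - grid[0]
    return np.sum(K(grid) * K(grid + z)) * du

grid = np.linspace(-10, 10, 20001)
z = np.array([0.7, -0.3, 1.1])
numeric = np.prod([psi_1d(zi, grid) for zi in z])
closed = np.exp(-np.dot(z, z) / 4) / (2 * np.sqrt(np.pi))**3
print(numeric, closed)  # agree to quadrature accuracy
```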
6.3 Mean Squared Error of the Process $\hat Z_n$

Proposition 6. The optimal bandwidths $(h_n, h)$ that minimize the MSE of $\hat Z_n$ are attained under either one of the following cases:

(I) $h_n = O(n^{-1/2})$, $h = O(n^{-1/2})$, $N \ge O(n^2)$ as $n \to \infty$; or

(II) $h_n = O(N^{-1/5}n^{-1/10})$, $h = O(N^{-1/5}n^{-1/10})$, $N = o(n^2)$ as $n \to \infty$.

Remark. In practice in HARDI the number of gradient directions N is much smaller than the number of voxels n, so we are in case (II).

Proof. The MSE has the form
$$c_1h_n^4 + c_2h^4 + c_3h^2h_n^2 + \frac{c_4h^4}{nh_n^2} + \frac{c_5}{nN^2h^4h_n^2}.$$
Take partial derivatives with respect to $h_n$ and h and set them to zero:
$$4c_1h_n^3 + 2c_3h^2h_n - \frac{2c_4h^4}{nh_n^3} - \frac{2c_5}{nN^2h^4h_n^3} = 0$$
and
$$4c_2h^3 + 2c_3hh_n^2 + \frac{4c_4h^3}{nh_n^2} - \frac{4c_5}{nN^2h^5h_n^2} = 0.$$
Consider the second equation; multiplying through by $nh_n^2/2$ gives
$$c_3nhh_n^4 + 2c_2nh^3h_n^2 + 2c_4h^3 - \frac{2c_5}{N^2h^5} = 0,$$
which is quadratic with respect to $h_n^2$. Then
$$h_n^2 = \frac{-2c_2nh^3 \pm \sqrt{4c_2^2n^2h^6 - 4c_3nh\big(2c_4h^3 - \frac{2c_5}{N^2h^5}\big)}}{2c_3nh},$$
which is, up to terms of smaller order,
$$h_n^2 \approx -\frac{c_2}{c_3}h^2 \pm \frac{c_2}{c_3}h^2 \pm \frac{1}{c_2n}\Big(\frac{2c_5}{N^2h^8} - 2c_4\Big).$$
Since $c_2/c_3 > 0$ we take
$$h_n^2 \approx \frac{2c_5}{c_2nh^8N^2} - \frac{2c_4}{c_2n}.$$
Consider 2 cases:
(I) $N^2h^8 \to C \in (0, +\infty]$, so that $h_n^2 = O\big(\frac1n\big)$; and (II) $N^2h^8 \to 0$, so that $h_n^2 = O\big(\frac{1}{nN^2h^8}\big)$.

Now we consider the first equation with respect to h for case (I):
$$2c_1n^{-3/2} + c_3h^2n^{-1/2} - c_4h^4n^{1/2} - \frac{c_5n^{1/2}}{N^2h^4} = 0,$$
or equivalently
$$2c_1n^{-2} + c_3h^2n^{-1} - c_4h^4 - \frac{c_5h^4}{C} = 0,$$
and furthermore, with positive constants c, c₀,
$$h^4 - c_0h^2n^{-1} - cn^{-2} = 0,$$
which implies $h^2 = O\big(\frac1n\big)$. As a result, case (I) becomes
$$\text{(I)}\quad h_n = O(n^{-1/2}), \quad h = O(n^{-1/2}), \quad N \ge O(n^2) \ \text{as } n\to\infty.$$
Now consider case (II). The first equation with respect to h is
$$\frac{2c_1}{n^3h^{24}N^6} + \frac{c_3}{n^2h^{14}N^4} - \frac{c_4h^4}{n} - \frac{c_5}{nN^2h^4} = 0,$$
which, with $\varepsilon := Nh^4 \to 0$ and $h = \varepsilon^{1/4}N^{-1/4}$, is equivalent up to terms of higher order to
$$-2c_1N - nN^{1/2}\varepsilon^{5/2} + c_5n^2\varepsilon^5 = 0,$$
which is quadratic in $\varepsilon^{5/2}$. Then $\varepsilon = O(N^{1/5}n^{-2/5})$. As a result, case (II) becomes
$$\text{(II)}\quad h_n = O(N^{-1/5}n^{-1/10}), \quad h = O(N^{-1/5}n^{-1/10}), \quad N = o(n^2) \ \text{as } n\to\infty,$$
which completes the proof of the proposition. $\Box$

Remark. From the proof it is clear that in case (II) the second part of the covariance of $\hat Z_n$, the part involving S, is the leading term, while the first part is of smaller order. The bias squared is of smaller order as well.
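A minimal sketch of the resulting bandwidth rule; the constant c is hypothetical, and only the rates come from Proposition 6:

```python
import numpy as np

def optimal_bandwidths(n, N, c=1.0):
    """Rate-optimal bandwidths (h_n, h) from Proposition 6 (constants hypothetical).

    Case (I) applies when N grows at least like n**2; in HARDI N << n,
    so case (II), h_n = h = O(N**(-1/5) * n**(-1/10)), is the practical one.
    """
    if N >= n**2:                       # case (I)
        h = c * n**(-0.5)
    else:                               # case (II)
        h = c * N**(-0.2) * n**(-0.1)
    return h, h                         # (h_n, h)
```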
6.4 Asymptotic Normality of the $\hat Z_n$ Process

First we need to establish consistency, which will help justify the approximation of $\hat X^*_n(t) - x^*(t)$ by $\hat Z_n$.

Lemma 9. Suppose that (I) or (II) holds. Then, uniformly in $t \in [0,T]$, $\hat X^*_n(t)$ is a consistent estimator of $x^*(t)$. That is,
$$\sup_{t\in[0,T]}|\hat X^*_n(t) - x^*(t)| \xrightarrow{P} 0 \quad\text{as } n\to\infty.$$
In the following proof we use the Gronwall-Bellman inequality, which we formulate here for completeness: let F, G be non-negative continuous functions on [a,b] and D ≥ 0 a constant. Suppose that for all $t \in [a,b)$,
$$G(t) \le D + \int_a^t F(s)G(s)\,ds.$$
Then for all $t \in [a,b]$,
$$G(t) \le D\exp\Big\{\int_a^t F(s)\,ds\Big\}.$$

Proof. Now,
$$\hat x^*_n(t) - x^*(t) = \int_0^t(\hat v^*_n - v^*)(\hat x^*_n(s))\,ds + \int_0^t\big(v^*(\hat x^*_n(s)) - v^*(x^*(s))\big)\,ds$$
$$= \int_0^t(\hat v^*_n - v^*)(\hat x^*_n(s))\,ds + \int_0^t\nabla v^*(x^*(s))\big(\hat x^*_n(s) - x^*(s)\big)\,ds + O\Big(\sup_{t\in[0,T]}|\hat x^*_n(t) - x^*(t)|^2\Big)$$
$$= \int_0^t(\hat v^*_n - v^*)(x^*(s))\,ds + \int_0^t\nabla v^*(x^*(s))\big(\hat x^*_n(s) - x^*(s)\big)\,ds + O\Big(\sup_{t\in[0,T]}|\hat x^*_n(t) - x^*(t)|^2\Big),$$
since
$$\int_0^t\big[(\hat v^*_n - v^*)(\hat x^*_n(s)) - (\hat v^*_n - v^*)(x^*(s))\big]ds = \int_0^t\int_0^1(\hat v^*_n - v^*)'\big(\lambda\hat x^*_n(s) + (1-\lambda)x^*(s)\big)\big(\hat x^*_n(s) - x^*(s)\big)\,d\lambda\,ds = o\big(|\hat x^*_n(s) - x^*(s)|\big). \tag{6.2}$$
Let $r_n$ denote the $O(\sup_{t\in[0,T]}|\hat x^*_n(t) - x^*(t)|^2)$ error term, so that
$$\hat x^*_n(t) - x^*(t) = \int_0^t(\hat v^*_n - v^*)(x^*(s))\,ds + \int_0^t\nabla v^*(x^*(s))(\hat x^*_n(s) - x^*(s))\,ds + r_n. \tag{6.3}$$
Now,
$$z_n(t) = \int_0^t\nabla v^*(x^*(s))z_n(s)\,ds + \int_0^t(\hat v^*_n - v^*)(x^*(s))\,ds,$$
so that
$$\hat x^*_n(t) - x^*(t) - z_n(t) = \int_0^t\nabla v^*(x^*(s))\big[(\hat x^*_n(s) - x^*(s)) - z_n(s)\big]ds + r_n.$$
Then by the Gronwall-Bellman inequality,
$$\|\hat x^*_n(t) - x^*(t) - z_n(t)\| \le \|r_n\|\exp\Big\{\int_0^t\nabla v^*(x^*(s))\,ds\Big\}. \tag{6.4}$$
Now,
$$|\tilde Z_n(t)| \le \int_0^t\Bigg|U(t,s)M(s)F^{-1}(x^*(s))\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f)(x^*(s)) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)(x^*(s))\end{pmatrix}\Bigg|\,ds \le \|A\|\int_0^t\Bigg|\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f)(x^*(s)) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)(x^*(s))\end{pmatrix}\Bigg|\,ds,$$
so that
$$\sup_{t\in[0,T]}|\tilde Z_n(t)| \le \|A\|\,T\,\sup_{x\in G}\Bigg|\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f)(x) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)(x)\end{pmatrix}\Bigg|.$$
Here ‖A‖ is the operator norm, and we denote by $\|\cdot\|_\infty = \sup_{x\in G}|\cdot|$, as usual, the supremum norm. By GG Proposition 3.1 (see [14]) we have
$$\Big\|\frac{\partial}{\partial\varphi}\hat f_n - E\frac{\partial}{\partial\varphi}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^6}}\bigg) \quad\text{a.s.},$$
and thus, combining with the bias bounds of Lemma 6,
$$\Big\|\frac{\partial}{\partial\varphi}f - \frac{\partial}{\partial\varphi}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^6}}\bigg) \quad\text{and analogously}\quad \Big\|\frac{\partial}{\partial\theta}f - \frac{\partial}{\partial\theta}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^6}}\bigg) \quad\text{a.s.}$$
To control the error term, we note that
$$\begin{pmatrix}\Delta_\varphi \\ \Delta_\theta\end{pmatrix} = -F^{-1}(x,\varphi^*(x),\theta^*(x))\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f)(x) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)(x)\end{pmatrix} + O(\Delta_\varphi^2 + \Delta_\theta^2), \tag{6.5}$$
which follows from the expansion
$$\hat f'_{n\varphi}(x,\varphi^*(x^*),\theta^*(x^*)) - f'_\varphi(x,\varphi^*(x^*),\theta^*(x^*)) = \hat f''_{n\varphi\varphi}(x,\varphi^*(x^*),\theta^*(x^*))\Delta_\varphi + \hat f''_{n\varphi\theta}(x,\varphi^*(x^*),\theta^*(x^*))\Delta_\theta + O(\Delta_\varphi^2 + \Delta_\theta^2)$$
and its analogue in θ. Regarding the second-order derivatives of f we have, again by GG Proposition 3.1,
$$\Big\|\frac{\partial^2}{\partial\varphi^2}f - \frac{\partial^2}{\partial\varphi^2}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^7}}\bigg) \quad\text{and analogously}\quad \Big\|\frac{\partial^2}{\partial\theta^2}f - \frac{\partial^2}{\partial\theta^2}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^7}}\bigg) \quad\text{a.s.},$$
so that the terms governing
$$O\Big(\Delta_\varphi^2 + \Delta_\theta^2 + \Delta_\varphi\Big|\frac{\partial^2}{\partial\varphi^2}(\hat f - f)\Big| + \Delta_\theta\Big|\frac{\partial^2}{\partial\theta^2}(\hat f - f)\Big| + (\Delta_\varphi + \Delta_\theta)\Big|\frac{\partial^2}{\partial\varphi\partial\theta}(\hat f - f)\Big|\Big)$$
have almost sure convergence to zero.
Thus, as $n \to \infty$,
$$\sup_{t\in[0,T]}|\hat x^*_n(t) - x^*(t)| \xrightarrow{P} 0. \qquad\Box$$

We now prove the asymptotic normality of the normalized deviation process $\hat Z_n(t)$ in the space C[0,T]. First, we have to establish the convergence of the finite-dimensional distributions of this process for general functions. To this end we will establish that the CLT holds by checking Lyapunov's condition, and then prove asymptotic equicontinuity of the process $\hat Z_n(t)$.

Theorem 3. Under case (I),
$$n(\hat X_n(t) - x(t)) \overset{D}{\Rightarrow} G(t)$$
in the space C[0,T], where the Gaussian process G has mean function μ(t) and covariance function C(t₁,t₂) introduced in the next section. Under case (II),
$$n^{1/5}N^{2/5}(\hat X_n(t) - x(t)) \overset{D}{\Rightarrow} G_0(t)$$
in C[0,T], where the centered Gaussian process G₀ has covariance function C₀(t₁,t₂) introduced in the next section.

Under the assumption $h_n = h_\varphi = h_\theta =: h$ one can write
$$\hat Z_n(t) = -\frac{1}{nh_n^6}\sum_{k=1}^{n}\int_0^t U(t,s)M(s)F^{-1}(x^*(s))\sum_{m=0}^{2N_\theta}\sum_{l=0}^{2N_\varphi}a_mb_l\,K\Big(\frac{x^*(s)-X_k}{h}\Big)\begin{pmatrix}K'_\varphi\big(\frac{\varphi(x^*(s))-\varphi_l}{h}\big)K_\theta\big(\frac{\theta(x^*(s))-\theta_m}{h}\big) \\ K_\varphi\big(\frac{\varphi(x^*(s))-\varphi_l}{h}\big)K'_\theta\big(\frac{\theta(x^*(s))-\theta_m}{h}\big)\end{pmatrix}\big[f(X_k,\varphi_l,\theta_m) + S(X_k,\varphi_l,\theta_m)\varepsilon_{klm}\big]\,ds =: \frac{1}{n}\sum_{k=1}^{n}\eta_{n,k}.$$

6.4.1 Lyapunov's Condition

For case (I) one has to establish the following convergence:
$$\sum_{k=1}^{n}E|\eta_{n,k} - E\eta_{n,k}|^4 \to 0,$$
which is Lyapunov's condition for the process $n\hat Z_n$. For case (II) one has to establish the following convergence:
$$\big[n^{-4/5}N^{2/5}\big]^4\sum_{k=1}^{n}E|\eta_{n,k} - E\eta_{n,k}|^4 \to 0,$$
which is Lyapunov's condition for the process $n^{1/5}N^{2/5}\hat Z_n$.

Consider $E|\eta_{n,k} - E\eta_{n,k}|^4$. It is bounded by $h^{-24}$ times the expectation of the square of a double integral over $s_1, s_2 \in [0,t]$ of the centered kernel products
$$\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\Big\{K\Big(\frac{x^*(s_1)-X_k}{h}\Big)\big(f(X_k,\varphi_{l_1},\theta_{m_1}) + S(X_k,\varphi_{l_1},\theta_{m_1})\varepsilon_{kl_1m_1}\big)\cdot(\text{kernel pair at } s_1) - E[\cdots]\Big\}\Big\{\text{same at } s_2\Big\}B(s_1,t_1,s_2,t_2),$$
where
$$B(s_1,t_1,s_2,t_2) = (F^{-1}(x^*(s_1)))^*M^*(x^*(s_1))U^*(t_1,s_1)\,U(t_2,s_2)M(x^*(s_2))F^{-1}(x^*(s_2))$$
is a 2 × 2 matrix. The expression multiplying the Y's is a linear combination of 4 summands of the type
$$K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_2}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_2}}{h}\Big)$$
with bounded coefficients. Here $K^{(r)}$ stands for the r-th order derivative. One gets all 16 summands for the squared expression, since $r_1, r_2, r_3, r_4$ take values 0 and 1, and they are all bounded in the same way. So it is enough to consider one of them.
Thus, $E|\eta_{n,k} - E\eta_{n,k}|^4$ is bounded by
$$\frac{C}{h^{24}}E\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\sum_{m_1,\dots,m_4=0}^{2N_\theta}\sum_{l_1,\dots,l_4=0}^{2N_\varphi}\prod_{i=1}^{4}a_{m_i}b_{l_i}\,K^{(r_i)}_\theta\Big(\frac{\theta(x^*(s_i))-\theta_{m_i}}{h}\Big)K^{(1-r_i)}_\varphi\Big(\frac{\varphi(x^*(s_i))-\varphi_{l_i}}{h}\Big)$$
$$\times\Bigg[\prod_{i=1}^{4}K\Big(\frac{x^*(s_i)-X_k}{h}\Big)S(X_k,\varphi_{l_i},\theta_{m_i})\,\big(\delta_{l_1=l_2=l_3=l_4,\,m_1=m_2=m_3=m_4} + 6\,\delta_{l_1=l_2\neq l_3=l_4,\,m_1=m_2\neq m_3=m_4}\big)$$
$$\quad + 6\,K\Big(\frac{x^*(s_1)-X_k}{h}\Big)S(X_k,\varphi_{l_1},\theta_{m_1})\,K\Big(\frac{x^*(s_2)-X_k}{h}\Big)S(X_k,\varphi_{l_2},\theta_{m_2})\,\delta_{l_1=l_2,m_1=m_2}$$
$$\qquad\times\Big[K\Big(\frac{x^*(s_3)-X_k}{h}\Big)f(X_k,\varphi_{l_3},\theta_{m_3}) - EK\Big(\frac{x^*(s_3)-X_k}{h}\Big)f(X_k,\varphi_{l_3},\theta_{m_3})\Big]\Big[K\Big(\frac{x^*(s_4)-X_k}{h}\Big)f(X_k,\varphi_{l_4},\theta_{m_4}) - EK\Big(\frac{x^*(s_4)-X_k}{h}\Big)f(X_k,\varphi_{l_4},\theta_{m_4})\Big]\delta_{l_3=l_4,m_3=m_4}$$
$$\quad + \prod_{i=1}^{4}\Big[K\Big(\frac{x^*(s_i)-X_k}{h}\Big)f(X_k,\varphi_{l_i},\theta_{m_i}) - EK\Big(\frac{x^*(s_i)-X_k}{h}\Big)f(X_k,\varphi_{l_i},\theta_{m_i})\Big]\Bigg]\,ds_1ds_2ds_3ds_4.$$
This is further bounded by
$$\frac{Ch^3}{h^{24}}\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a^2_{m_1}b^2_{l_1}a^2_{m_2}b^2_{l_2}\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_1}}{h}\Big)K^{(r_3)}_\theta\Big(\frac{\theta(x^*(s_3))-\theta_{m_2}}{h}\Big)K^{(r_4)}_\theta\Big(\frac{\theta(x^*(s_4))-\theta_{m_2}}{h}\Big)$$
$$\times K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_1}}{h}\Big)K^{(1-r_3)}_\varphi\Big(\frac{\varphi(x^*(s_3))-\varphi_{l_2}}{h}\Big)K^{(1-r_4)}_\varphi\Big(\frac{\varphi(x^*(s_4))-\varphi_{l_2}}{h}\Big)$$
$$\times\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{h},\frac{x^*(s_3)-x^*(s_1)}{h},\frac{x^*(s_4)-x^*(s_1)}{h}\Big)\,f^2(x^*(s_1),\varphi_{l_1},\theta_{m_1})\big[f^2(x^*(s_1),\varphi_{l_2},\theta_{m_2}) + S^2(x^*(s_1),\varphi_{l_2},\theta_{m_2})\big]\,ds_1ds_2ds_3ds_4,$$
where
$$\Psi_4(z_1,z_2,z_3) = \int K(u)K(u+z_1)K(u+z_2)K(u+z_3)\,du.$$
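The fourth-order convolution Ψ₄ is evaluated the same way as Ψ. A minimal quadrature sketch for a Gaussian kernel (one-dimensional components, since the 3D case factorizes; the arguments are arbitrary test values):

```python
import numpy as np

# Psi4(z1, z2, z3) = integral of K(u) K(u+z1) K(u+z2) K(u+z3) du,
# evaluated by 1D quadrature for a Gaussian kernel.
K = lambda u: np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)

def psi4_1d(z1, z2, z3, grid):
    du = grid[1] - grid[0]
    return np.sum(K(grid) * K(grid + z1) * K(grid + z2) * K(grid + z3)) * du

grid = np.linspace(-10, 10, 20001)
print(psi4_1d(0.2, -0.5, 0.9, grid))
```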
Now using Simpson's scheme and Proposition 5 we bound the above by
$$\frac{Ch^3}{h^{24}N^4}\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\int_{\theta(x^*(s_1))-ch}^{\theta(x^*(s_1))+ch}\int_{\theta(x^*(s_2))-ch}^{\theta(x^*(s_2))+ch}\int_{\varphi(x^*(s_1))-ch}^{\varphi(x^*(s_1))+ch}\int_{\varphi(x^*(s_2))-ch}^{\varphi(x^*(s_2))+ch}K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_1}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_1}{h}\Big)K^{(r_3)}_\theta\Big(\frac{\theta(x^*(s_3))-\theta_2}{h}\Big)K^{(r_4)}_\theta\Big(\frac{\theta(x^*(s_4))-\theta_2}{h}\Big)$$
$$\times K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_1}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_1}{h}\Big)K^{(1-r_3)}_\varphi\Big(\frac{\varphi(x^*(s_3))-\varphi_2}{h}\Big)K^{(1-r_4)}_\varphi\Big(\frac{\varphi(x^*(s_4))-\varphi_2}{h}\Big)$$
$$\times\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{h},\frac{x^*(s_3)-x^*(s_1)}{h},\frac{x^*(s_4)-x^*(s_1)}{h}\Big)\,f^2(x^*(s_1),\varphi_1,\theta_1)\big[f^2(x^*(s_1),\varphi_2,\theta_2) + S^2(x^*(s_1),\varphi_2,\theta_2)\big]\,d\varphi_1d\varphi_2d\theta_1d\theta_2\,ds_1ds_2ds_3ds_4.$$
Then by a change of variables we have
$$\frac{Ch^7}{h^{24}N^4}\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\int_{-c}^{c}\!\!\int_{-c}^{c}\!\!\int_{-c}^{c}\!\!\int_{-c}^{c}K^{(r_1)}_\theta(v_1)K^{(r_2)}_\theta\Big(v_1 + \frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Big)K^{(r_3)}_\theta(v_2)K^{(r_4)}_\theta\Big(v_2 + \frac{\theta(x^*(s_4))-\theta(x^*(s_3))}{h}\Big)$$
$$\times K^{(1-r_1)}_\varphi(u_1)K^{(1-r_2)}_\varphi\Big(u_1 + \frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Big)K^{(1-r_3)}_\varphi(u_2)K^{(1-r_4)}_\varphi\Big(u_2 + \frac{\varphi(x^*(s_4))-\varphi(x^*(s_3))}{h}\Big)\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{h},\frac{x^*(s_3)-x^*(s_1)}{h},\frac{x^*(s_4)-x^*(s_1)}{h}\Big)$$
$$\times f^2(x^*(s_1),\varphi(x^*(s_1))-hu_1,\theta(x^*(s_1))-hv_1)\big[f^2(x^*(s_1),\varphi(x^*(s_3))-hu_2,\theta(x^*(s_3))-hv_2) + S^2(x^*(s_1),\varphi(x^*(s_3))-hu_2,\theta(x^*(s_3))-hv_2)\big]\,du_1du_2dv_1dv_2\,ds_1ds_2ds_3ds_4.$$
By a linear approximation, and defining
$$\Psi_{\theta,r_1,r_2}(z) := \int K^{(r_1)}_\theta(u)K^{(r_2)}_\theta(u+z)\,du$$
(and analogously $\Psi_{\varphi,r_1,r_2}$), this becomes
$$\frac{Ch^7}{h^{24}N^4}\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{h},\frac{x^*(s_3)-x^*(s_1)}{h},\frac{x^*(s_4)-x^*(s_1)}{h}\Big)\Psi_{\theta,r_1,r_2}\Big(\frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Big)\Psi_{\theta,r_3,r_4}\Big(\frac{\theta(x^*(s_4))-\theta(x^*(s_3))}{h}\Big)$$
$$\times\Psi_{\varphi,r_1,r_2}\Big(\frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Big)\Psi_{\varphi,r_3,r_4}\Big(\frac{\varphi(x^*(s_4))-\varphi(x^*(s_3))}{h}\Big)f^2(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1)))\big[f^2(x^*(s_1),\varphi(x^*(s_3)),\theta(x^*(s_3))) + S^2(x^*(s_1),\varphi(x^*(s_3)),\theta(x^*(s_3)))\big](1+o(h))\,ds_1ds_2ds_3ds_4.$$
By the change of variables $s_i = s_1 + \tau_ih$, i = 2, 3, 4, the above becomes
$$\frac{Ch^{10}(1+o(1))}{h^{24}N^4}\int_0^t\!\!\int\!\!\int\!\!\int f^2(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1)))\big[f^2(x^*(s_1),\varphi(x^*(s_3)),\theta(x^*(s_3))) + S^2(x^*(s_1),\varphi(x^*(s_3)),\theta(x^*(s_3)))\big]$$
$$\times\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{s_2-s_1}\tau_2, \frac{x^*(s_3)-x^*(s_1)}{s_3-s_1}\tau_3, \frac{x^*(s_4)-x^*(s_1)}{s_4-s_1}\tau_4\Big)\Psi_{\theta,r_1,r_2}\Big(\frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{s_2-s_1}\tau_2\Big)\Psi_{\theta,r_3,r_4}\Big(\frac{\theta(x^*(s_4))-\theta(x^*(s_3))}{s_4-s_3}(\tau_4-\tau_3)\Big)$$
$$\times\Psi_{\varphi,r_1,r_2}\Big(\frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{s_2-s_1}\tau_2\Big)\Psi_{\varphi,r_3,r_4}\Big(\frac{\varphi(x^*(s_4))-\varphi(x^*(s_3))}{s_4-s_3}(\tau_4-\tau_3)\Big)\,ds_1d\tau_2d\tau_3d\tau_4.$$
As h → 0, by the boundedness of the integrands on their bounded support and their continuity on that support, we have by the dominated convergence theorem
$$\int\!\!\int\!\!\int\Psi_4\big(v(x^*(s_1))\tau_2, v(x^*(s_1))\tau_3, v(x^*(s_1))\tau_4\big)\Psi_{\theta,r_1,r_2}\Big(\frac{d}{dt}\theta(x^*(s_1))\tau_2\Big)\Psi_{\theta,r_3,r_4}\Big(\frac{d}{dt}\theta(x^*(s_3))(\tau_4-\tau_3)\Big)\Psi_{\varphi,r_1,r_2}\Big(\frac{d}{dt}\varphi(x^*(s_1))\tau_2\Big)\Psi_{\varphi,r_3,r_4}\Big(\frac{d}{dt}\varphi(x^*(s_3))(\tau_4-\tau_3)\Big)\,d\tau_2d\tau_3d\tau_4 \tag{6.6}$$
as the limit of the inner integrals. Therefore
$$E|\eta_{n,k} - E\eta_{n,k}|^4 \le \frac{Ch^{10}(1+o(1))}{h^{24}N^4} =: \frac{C}{nM_n}, \tag{6.7}$$
and so
$$\sum_{k=1}^{n}E|\eta_{n,k} - E\eta_{n,k}|^4 \le \frac{C}{M_n} \to 0.$$
For case (II) we have $h = O(N^{-1/5}n^{-1/10})$ and $N^2 = n^2\alpha_n$ with $\alpha_n \to 0$.
Then, following the calculation done above (where h was arbitrary),
$$E|\eta_{n,k} - E\eta_{n,k}|^4 \le \frac{Ch^{10}(1+o(1))}{h^{24}N^4} = \frac{C(1+o(1))}{(n^2\alpha_n)^{6/5}n^{-7/5}} = \frac{C(1+o(1))}{n\,\alpha_n^{6/5}} \tag{6.8}$$
and
$$\big[n^{-4/5}N^{2/5}\big]^4\sum_{k=1}^{n}E|\eta_{n,k} - E\eta_{n,k}|^4 \le C\,n^{-8/5}\alpha_n^{-2/5} \to 0,$$
since $\alpha_n \ge n^{-2}$.

6.4.2 Asymptotic Equicontinuity of the Process $\hat Z_n$

Define $\hat Z_n(t) := W_n(g_t(\cdot))$, where
$$g_t(s) = I_{[0,t]}(s)\,U(t,s)M(s)F^{-1}(x^*(s)),$$
so that
$$\hat Z_n(t) = -\int g_t(s)\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)\end{pmatrix}\Bigg|_{x^*(s)}ds,$$
and write $\Xi_n(t) := n(\hat Z_n(t) - E\hat Z_n(t))$. Then
$$E|n\hat Z_n(t) - nE\hat Z_n(t)|^4 = E\Big|\sum_{k=1}^{n}(\eta_{n,k} - E\eta_{n,k})\Big|^4 = n(n-1)E^2|\eta_{n,k} - E\eta_{n,k}|^2 + nE|\eta_{n,k} - E\eta_{n,k}|^4, \tag{6.9}$$
where the last equality holds since the $\eta_{n,k}$ are row-wise independent. Consider the above with an arbitrary g:
$$E|nW_n(g) - nEW_n(g)|^4 \le \frac{1}{h^{12}}\bigg|\int\!\!\int\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_2}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_2}}{h}\Big)$$
$$\qquad\times\mathrm{Cov}\Big(K\Big(\frac{x^*(s_1)-X_k}{h}\Big)Y_{kl_1m_1},\ K\Big(\frac{x^*(s_2)-X_k}{h}\Big)Y_{kl_2m_2}\Big)\,G^{(i,j)}(s_1,s_2)\,ds_1ds_2\bigg|$$
for some (i,j) and $r_1, r_2 \in \{0,1\}$; here $g^*(s_1)g(s_2) = G(s_1,s_2)$ and $G^{(i,j)}(s_1,s_2)$ are the components of G. From the previous covariance calculation,
$$\mathrm{Cov}\Big(K\Big(\frac{x^*(s_1)-X_k}{h}\Big)Y_{kl_1m_1},\ K\Big(\frac{x^*(s_2)-X_k}{h}\Big)Y_{kl_2m_2}\Big) = (1+O(h))h^3\Big[\Psi\Big(\frac{x^*(s_1)-x^*(s_2)}{h}\Big)f(x^*(s_1),\varphi_{l_1},\theta_{m_1})f(x^*(s_2),\varphi_{l_2},\theta_{m_2}) + S(x^*(s_1),\varphi_{l_1},\theta_{m_1})S(x^*(s_2),\varphi_{l_1},\theta_{m_1})I_{\{m_1=m_2,\,l_1=l_2\}}\Big].$$
Thus $E|nW_n(g) - nEW_n(g)|^4$ is bounded above by
$$\int\!\!\int\frac{(1+O(h))h^3}{h^{12}}\Psi\Big(\frac{x^*(s_1)-x^*(s_2)}{h}\Big)\bigg[\sum_{m_1,m_2}\sum_{l_1,l_2}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\,K^{(r_1)}_\theta(\cdot)K^{(1-r_1)}_\varphi(\cdot)K^{(r_2)}_\theta(\cdot)K^{(1-r_2)}_\varphi(\cdot)\Big(f(x^*(s_1),\varphi_{l_1},\theta_{m_1})f(x^*(s_2),\varphi_{l_2},\theta_{m_2}) + S(x^*(s_1),\varphi_{l_1},\theta_{m_1})S(x^*(s_2),\varphi_{l_1},\theta_{m_1})I_{\{m_1=m_2,\,l_1=l_2\}}\Big)\bigg]G^{(i,j)}(s_1,s_2)\,ds_1ds_2.$$
First consider the sum
$$\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\,f(x^*(s_1),\varphi_{l_1},\theta_{m_1})f(x^*(s_2),\varphi_{l_2},\theta_{m_2})\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_2}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_2}}{h}\Big),$$
which factorizes, so consider just
$$\sum_{m_1=0}^{2N_\theta}\sum_{l_1=0}^{2N_\varphi}a_{m_1}b_{l_1}\,f(x^*(s_1),\varphi_{l_1},\theta_{m_1})\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big).$$
The error term in Simpson's method will be absorbed into the leading (1+O(h)) factor above and will be ignored. The sum is approximated by
$$\int_{\theta(x^*(s_1))-ch}^{\theta(x^*(s_1))+ch}\int_{\varphi(x^*(s_1))-ch}^{\varphi(x^*(s_1))+ch}f(x^*(s_1),v_1,u_1)\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-u_1}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-v_1}{h}\Big)\,du_1dv_1$$
$$= h^2\int_{-c}^{c}\int_{-c}^{c}f\big(x^*(s_1),\varphi(x^*(s_1))-\rho_1h,\theta(x^*(s_1))-\tau_1h\big)\,K^{(r_1)}_\theta(\tau_1)K^{(1-r_1)}_\varphi(\rho_1)\,d\tau_1d\rho_1,$$
which, by a Taylor approximation, equals
$$h^5\Big[\tfrac12 f'''_{\varphi\theta\theta}\int\tau_1^2K_\theta(\tau_1)\,d\tau_1\int\rho_1K'_\varphi(\rho_1)\,d\rho_1 + \tfrac16 f'''_{\varphi\varphi\varphi}\int\rho_1^3K'_\varphi(\rho_1)\,d\rho_1\int K_\theta(\tau_1)\,d\tau_1\Big]$$
or
$$h^5\Big[\tfrac12 f'''_{\theta\varphi\varphi}\int\rho_1^2K_\varphi(\rho_1)\,d\rho_1\int\tau_1K'_\theta(\tau_1)\,d\tau_1 + \tfrac16 f'''_{\theta\theta\theta}\int\tau_1^3K'_\theta(\tau_1)\,d\tau_1\int K_\varphi(\rho_1)\,d\rho_1\Big]$$
if $r_1 = 0$ or $r_1 = 1$, respectively. Call it $h^5C(x^*(s_1), r_1)$. Here we have used the Taylor approximation together with the fact that
$$f'_\varphi(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1))) = f'_\theta(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1))) = 0.$$
Then
$$\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\,f(x^*(s_1),\varphi_{l_1},\theta_{m_1})f(x^*(s_2),\varphi_{l_2},\theta_{m_2})\,K^{(r_1)}_\theta(\cdot)K^{(1-r_1)}_\varphi(\cdot)K^{(r_2)}_\theta(\cdot)K^{(1-r_2)}_\varphi(\cdot) = h^{10}C(x^*(s_1),r_1)C(x^*(s_2),r_2) + O(N^{-4}). \tag{6.10}$$
Now consider
$$\sum_{m_1=0}^{2N_\theta}\sum_{l_1=0}^{2N_\varphi}a^2_{m_1}b^2_{l_1}\,S(x^*(s_1),\varphi_{l_1},\theta_{m_1})S(x^*(s_2),\varphi_{l_1},\theta_{m_1})\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_1}}{h}\Big)$$
$$= \frac{(1+O(h))}{N^2}\int_{\theta(x^*(s_1))-ch}^{\theta(x^*(s_1))+ch}\int_{\varphi(x^*(s_1))-ch}^{\varphi(x^*(s_1))+ch}S(x^*(s_1),v_1,u_1)S(x^*(s_2),v_1,u_1)\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-u_1}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-u_1}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-v_1}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-v_1}{h}\Big)du_1dv_1\,(1+O(N^{-1}))$$
$$= \frac{(1+O(h))h^2}{N^2}\int_{-c}^{c}\int_{-c}^{c}K^{(r_1)}_\theta(q_1)K^{(r_2)}_\theta\Big(q_1 + \frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Big)K^{(1-r_1)}_\varphi(p_1)K^{(1-r_2)}_\varphi\Big(p_1 + \frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Big)$$
$$\qquad\times S(x^*(s_1),\varphi(x^*(s_1))-hp_1,\theta(x^*(s_1))-hq_1)\,S(x^*(s_2),\varphi(x^*(s_1))-hp_1,\theta(x^*(s_1))-hq_1)\,dq_1dp_1$$
$$= \frac{(1+O(h))h^2}{N^2}\,\Psi_{\theta,r_1,r_2}\Big(\frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Big)\Psi_{\varphi,r_1,r_2}\Big(\frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Big)\,S(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1)))\,S(x^*(s_2),\varphi(x^*(s_1)),\theta(x^*(s_1))). \tag{6.11}$$
Thus $E|nW_n(g) - nEW_n(g)|^4$ is bounded above by
Combining (6.10) and (6.11), $E|nW_n(g) - nEW_n(g)|^4$ is bounded above by
$$\frac{(1+O(h))h^3}{h^{12}}\int\!\!\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_2)}{h}\Bigr)\biggl[h^{10}\,C(x^*(s_1),r_1)\,C(x^*(s_2),r_2) + \frac{(1+O(h))h^2}{N^2}\, S\bigl(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1))\bigr)\, S\bigl(x^*(s_2),\varphi(x^*(s_1)),\theta(x^*(s_1))\bigr)\, \Psi_{\theta,r_1,r_2}\!\Bigl(\tfrac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Bigr)\Psi_{\varphi,r_1,r_2}\!\Bigl(\tfrac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Bigr)\biggr] G^{(i,j)}(s_1,s_2)\, ds_1\, ds_2.$$
The change of variable $s_2 = s_1 + \tau h$ yields
$$\frac{(1+O(h))h^4}{h^{12}}\int\!\!\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_1+\tau h)}{s_2-s_1}\,\tau\Bigr)\biggl[h^{10}\,C(x^*(s_1),r_1)\,C(x^*(s_1+\tau h),r_2) + \frac{(1+O(h))h^2}{N^2}\, S\bigl(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1))\bigr)\, S\bigl(x^*(s_1+\tau h),\varphi(x^*(s_1)),\theta(x^*(s_1))\bigr)\, \Psi_{\theta,r_1,r_2}\!\Bigl(\tfrac{\theta(x^*(s_1+\tau h))-\theta(x^*(s_1))}{s_2-s_1}\,\tau\Bigr)\Psi_{\varphi,r_1,r_2}\!\Bigl(\tfrac{\varphi(x^*(s_1+\tau h))-\varphi(x^*(s_1))}{s_2-s_1}\,\tau\Bigr)\biggr] G^{(i,j)}(s_1,s_1+\tau h)\, ds_1\, d\tau$$
$$\le \frac{K(1+O(h))}{n}\int\!\!\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_1+\tau h)}{s_2-s_1}\,\tau\Bigr) G^{(i,j)}(s_1,s_1+\tau h)\, ds_1\, d\tau = \frac{K(1+O(h))}{n}\int\!\!\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_1+\tau h)}{s_2-s_1}\,\tau\Bigr) \sum_{k=1}^{3} g_{ik}(s_1)\, g_{jk}(s_1+\tau h)\, ds_1\, d\tau$$
$$\le \frac{K(1+O(h))}{n}\sum_{k=1}^{3}\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_1+\tau h)}{s_2-s_1}\,\tau\Bigr)\Bigl(\int |g_{ik}(s_1)|^2 ds_1\Bigr)^{1/2}\Bigl(\int |g_{jk}(s_1+\tau h)|^2 ds_1\Bigr)^{1/2} d\tau \le \frac{C(1+O(h))}{n}\sum_{k=1}^{3}\int |g_{ik}(s_1)|^2 ds_1 = \frac{K^*(1+O(h))}{n}\int |g_{ip}(s_1)|^2 ds_1$$
for some $p \in \{1,2,3\}$ and some large enough constant $K$ to bound both the first and second terms in brackets. Thus
$$E|nW_n(g) - nEW_n(g)|^4 \le \Biggl[n(n-1)\biggl(\frac{C(1+O(h))}{n}\int |g_{ip}(s_1)|^2 ds_1\biggr)^2 + n\,\frac{C}{nM_n}\Biggr].$$
Since $E|\eta_{n,k} - E\eta_{n,k}|^4 \le \frac{C}{nM_n}$ and $g_t(s)$ is Lipschitz and bounded, we can find an $M > 0$ such that, with $g := g_{t_1} - g_{t_2}$, for its components
$$\int |g_{t_1,ip}(s_1) - g_{t_2,ip}(s_1)|^2\, ds_1 \le C|t_1 - t_2|.$$
Then
$$E\bigl|nW_n(g_{t_1} - g_{t_2}) - nEW_n(g_{t_1} - g_{t_2})\bigr|^4 = E\bigl|n\bigl(\hat{Z}_n(t_1) - E\hat{Z}_n(t_1)\bigr) - n\bigl(\hat{Z}_n(t_2) - E\hat{Z}_n(t_2)\bigr)\bigr|^4 = E\bigl|\Xi_n(t_1) - \Xi_n(t_2)\bigr|^4 \le C\bigl[|t_1 - t_2|^2 + K\bigr],$$
where $\Xi_n(t) := n(\hat{Z}_n(t) - E\hat{Z}_n(t))$. When $t_1 = t_2$, $K$ may be taken to be zero. If $|t_1 - t_2| > 0$, choose a constant $C$ large enough so that $K < C|t_1 - t_2|$. Therefore
$$E\bigl|\Xi_n(t_1) - \Xi_n(t_2)\bigr|^4 \le C|t_1 - t_2|^2.$$
To establish the asymptotic equicontinuity of the process $n(\hat{Z}_n(t) - E\hat{Z}_n(t))$, we will apply the following two lemmas from [16]. Here $\|\cdot\|_\psi = \|\cdot\|_4$. In terms of the above, we have
$$\|\Xi_n(t_1) - \Xi_n(t_2)\|_\psi \le C^{1/2}|t_1 - t_2|^{1/2} = K\, d(t_1, t_2).$$
For the semi-metric $d$, a ball of radius $\epsilon$ is the interval $[t - \epsilon^2, t + \epsilon^2]$ for each $t \in T$. Then we have $N(\epsilon, d) = \frac{T}{2\epsilon^2}$, the number of balls of radius $\epsilon$ needed to cover $T$. Note that since $\psi(x) = x^4$, $\psi^{-1}(x) = x^{1/4}$. To find the integral above, we note that
$$N(\epsilon, d) \le D(\epsilon, d) \le N\Bigl(\frac{\epsilon}{2}, d\Bigr).$$
As is shown in [16], the idea is to define a nested sequence of maximally separated subsets $S_0 \subset S_1 \subset \cdots \subset T$, with $d(s,t) > \eta 2^{-n}$ for $s, t \in S_n$, and to bound the maximum of the process on $S_n$, where the second maximum is taken over all $(s,t) \in S_n$ with $d(s,t) \le \delta$ whose chains end at a unique pair $s_0, t_0 \in S_0$.
The second lemma bounds the second term:
$$\psi^{-1}\bigl(D^2(\eta, d)\bigr)\max_{\substack{s \in S_n \\ t \in S_{n-1}}} \|\Xi_n(s) - \Xi_n(t)\|_\psi \le K\Bigl(\frac{2T}{\eta^2}\Bigr)^{1/2}\delta.$$
The integral can be bounded above by
$$\int_0^\eta \psi^{-1}\bigl(D(\epsilon, d)\bigr)\, d\epsilon \le \int_0^\eta \Bigl(\frac{2T}{\epsilon^2}\Bigr)^{1/4} d\epsilon.$$
Thus, given any $r > 0$, by Markov's inequality and the maximal inequality in the theorem,
$$P\Bigl(\sup_{|t_1 - t_2| \le \delta}\bigl|\Xi_n(t_1) - \Xi_n(t_2)\bigr| > r\Bigr) \le \frac{K}{r^4}\biggl[\int_0^\eta \Bigl(\frac{2T}{\epsilon^2}\Bigr)^{1/4} d\epsilon + \delta\Bigl(\frac{2T}{\eta^2}\Bigr)^{1/2}\biggr] = \frac{2K(2T)^{1/4}\eta^{1/2}}{r^4} + \frac{\sqrt{2T}\,\delta K}{r^4\,\eta}.$$
Choosing $\eta$ arbitrarily small, we have established the asymptotic equicontinuity of the process $n(\hat{Z}_n(t) - E\hat{Z}_n(t))$.

CHAPTER 7

CONCLUSION

The proposed methods provide a general mathematical and statistical framework for tractography based on HARDI data. As in the tensor model approach, the main advantage of the approach under consideration is that uncertainty is carried through from the acquired raw data level to the fiber level, as it propagates via our 'signal fields' to vector fields and finally to the integral curves. But unlike the tensor model approach, we do not impose structural assumptions on the diffusion signal, such as a supersymmetric tensor or a positive definite matrix. This approach to tracing curves with surrounding confidence ellipsoids is unique and offers a computationally cheap alternative to probabilistic tractography methods, which are tied to iterative MCMC sampling techniques. This way, the errors from data measurements are followed through the model to the level of axonal fibers in an easy-to-interpret way.

The methods in DTI and HARDI give fiber estimates within $O(n^{-1/3})$ of the true fiber, and they require $O(n)$ operations for Gaussian-type kernels as well as $O(n^2)$ operations for the asymptotic covariance calculation. Furthermore, in DTI and HARDI one has $h_n = O(n^{-1/6})$. Typically the number of locations $n$ in HARDI is on the order of hundreds of thousands or millions. The sampling of the directional space contains at least 6 directions for DTI and between 30 and 150 directions for HARDI; this corresponds to $N^2$ in our model. In the non-parametric scenario this is accommodated by case (II), and then the fiber estimates are within $O(n^{-1}\varepsilon_n^{-2/5})$ of the true fiber, and they require the same number of operations. Our bandwidth is $h_n = O(n^{-1/2}\varepsilon_n^{-1/5})$, where $\varepsilon_n \to 0$. If we take $\varepsilon_n = O(n^{-5/3})$, which corresponds to $N = O(n^{1/3})$, then we obtain the same order for bandwidth and accuracy as the methods based on tensor fields.

The practical downside of our approach is the third step of the implementation, in which we need to solve a complicated optimization problem numerically; a sketch of this step is given below.
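The following Python sketch illustrates the spirit of this optimization step: at a location $x$, we search for the direction $(\varphi, \theta)$ of maximal estimated diffusion on the sphere. Here `fhat` is a hypothetical surrogate for the smoothed signal estimate (our actual implementation uses Matlab's fminsearch, of which SciPy's Nelder-Mead is the analogue), and the random restarts shown are one possible mitigation for the local-trap behavior discussed next; none of this is a definitive implementation.

```python
import numpy as np
from scipy.optimize import minimize

def fhat(angles, x):
    """Hypothetical stand-in for the kernel-smoothed signal estimate
    fhat_n(x, phi, theta); a smooth bump with a single peak."""
    phi, theta = angles
    mu = np.array([np.pi / 3, np.pi / 4])        # hypothetical peak location
    return np.exp(-8 * ((phi - mu[0]) ** 2 + (theta - mu[1]) ** 2))

def max_direction(x, n_starts=12, seed=0):
    """Maximize fhat over (phi, theta) with several random restarts,
    reducing the chance of getting trapped in a local 'minimum well'."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        start = rng.uniform([0.0, 0.0], [2 * np.pi, np.pi])  # random angles
        res = minimize(lambda a: -fhat(a, x), start, method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    return best.x                                 # angles of maximal diffusion

print(max_direction(x=np.zeros(3)))
```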
In our simulation study this step was the bottleneck, and it often introduced a numerical bias that ruined the subsequent statistical estimation. This systematic bias from fminsearch is unexpected and difficult to control through Matlab. The expected bias should be zero; however, the process strays off in a line away from the true curve before regaining the correct estimated maximal direction. Moreover, the bias roughly maintains its initial distance from the curve, rather than switching back and forth as in the HARDI simulation scenarios. At this time we can only speculate as to why the Matlab function cannot escape these 'minimum wells' to find a more maximal direction estimate at the beginning of the curve trace. As a direction of future research, one could investigate the interplay between numerical errors and statistical errors, and balance them to arrive at a practical guide for selecting the tuning parameters in the optimization step.

For future work we will compare, in the same way, the performance of the completely non-parametric method to that of the higher-order HARDI model and the low-order classical DTI model in the C-pattern scenarios of [11]. It is also worth considering how one might build this same modeling method without the assumption of a unique maximal direction at each location, so that branching scenarios can be explored; however, the mathematical reasoning behind this may require an approach quite different from those we have discussed. We may consider the set of all points on a manifold and only require this existence, rather than requiring conditions on the function f under which the unique maximal direction can be found at each point x.

Although the method is imperfect in its inability to handle crossings or branchings of fibers, this study was very fruitful in showing that the data can speak for itself in these noisy MRI data scenarios in which we wish to recover curve estimates for the C pattern. The proof of concept that one may obtain these curve estimates is quite valuable, as the methods of acquiring MRI data continually improve, and with them the noise can be reduced through the gathering of multiple images. In addition, all of the methods could in theory trace a sequence of fibers, although one may be more tedious than another. For example, in the HARDI scenario we can trace multiple curves simultaneously without violating any assumptions. In the methodology discussed here, we could skip branch points and search nearby for dominant diffusion directions outside of a fiber cluster, and still, within reason, estimate a network of fiber trajectories.

Figure 7.1: A fiber across the genu of the corpus callosum with diffusion "blobs" along it.

Figure 7.2: Visualization of diffusion via a nonparametric function using our model.

BIBLIOGRAPHY

[1] Assemlal, H., Tschumperlé, D., Brun, L., Siddiqi, K. (2011) Recent advances in diffusion MRI modeling: Angular and radial reconstruction. Medical Image Analysis 15, 369–396.

[2] Carmichael, O. and Sakhanenko, L. (2016) Integral curves from noisy diffusion MRI data with closed-form uncertainty estimates. Statistical Inference for Stochastic Processes 19(3), 289–319.

[3] Coddington, E.A., Levinson, N. (1955) Theory of Ordinary Differential Equations. McGraw-Hill, New York.
[4] Daducci, A., Canales-Rodríguez, E., Descoteaux, M., Garyfallidis, E., Gur, Y., Lin, Y-C., Mani, M., Merlet, S., Paquette, M., Ramirez-Manzanares, A., Reisert, M., Rodrigues, P., Sepehrband, F., Caruyer, E., Choupan, J., Deriche, R., Jacob, M., Menegaz, G., Prčkovska, V., Rivera, M., Wiaux, Y., Thiran, J-P. (2013) Quantitative comparison of reconstruction methods for intra-voxel fiber recovery from diffusion MRI. IEEE Proc.

[5] Koltchinskii, V., Sakhanenko, L. and Cai, S. (2007) Integral curves of noisy vector fields and statistical problems in diffusion tensor imaging: Nonparametric kernel estimation and hypotheses testing. Ann. Statist. 35, 1576–1607.

[6] Sakhanenko, L. (2012) Numerical issues in estimation of integral curves from noisy diffusion tensor data. Statistics & Probability Letters 82, 1136–1144.

[7] Sakhanenko, L., DeLaura, M. (2019) Supplement to "Fully nonparametric model for a tensor field based on HARDI".

[8] Assemlal, H.-E., Tschumperlé, D., Brun, L. and Siddiqi, K. (2011) Recent advances in diffusion MRI modeling: Angular and radial reconstruction. Medical Image Analysis 15, 369–396.

[9] Carmichael, O. and Sakhanenko, L. (2015) Estimation of integral curves from high angular resolution diffusion imaging (HARDI) data. Linear Algebra and its Applications 473, 377–403.

[10] Sakhanenko, L. (2015) Using the Tractometer to assess performance of a new statistical tractography technique. Journal of Nature and Science 1(7): e130, 1–12. http://www.jnsci.org/content/130.

[11] Sakhanenko, L. and DeLaura, M. (2017) A comparison study of statistical tractography methodologies for Diffusion Tensor Imaging. International Journal of Statistics: Advances in Theory and Applications 1(1), 93–110.

[12] Assemlal, H., Tschumperlé, D., and Siddiqi, K. (2011) Recent advances in diffusion MRI modeling: Angular and radial reconstruction. Medical Image Analysis 15, 369–396.

[13] Ying, L., Zou, Y., Klemer, D., and Wang, J. (2007) Determination of fiber orientation in MRI diffusion tensor imaging based on higher-order tensor decomposition. Proceedings of the 29th Annual International Conference of the IEEE EMBS, 2065–2068.

[14] Giné, E., Guillou, A. (2002) Rates of strong uniform consistency for multivariate kernel density estimators. Annales de l'I.H.P. Probabilités et Statistiques 38(6), 907–921.

[15] Efromovich, S. (1999) Nonparametric Curve Estimation: Methods, Theory, and Applications. Springer, New York.

[16] van der Vaart, A.W., Wellner, J.A. (1996) Weak Convergence and Empirical Processes. Springer Series in Statistics, Springer, New York, pp. 95–104.

[17] Coddington, E.A., Levinson, N. (1955) Theory of Ordinary Differential Equations. McGraw-Hill, New York.

[18] Basser, P.J., Mattiello, J., LeBihan, D. (1994) MR diffusion tensor spectroscopy and imaging. Biophys J 66, 259–267.

[19] Behrens, T.E., Berg, H.J., Jbabdi, S., Rushworth, M.F., Woolrich, M.W. (2007) Probabilistic diffusion tractography with multiple fibre orientations: What can we gain? Neuroimage 34, 144–155.

[20] Chang, S.E., Zhu, D.C. (2013) Neural network connectivity differences in children who stutter. Brain 136, 3709–3726.

[21] Le Bihan, D., Mangin, J.F., Poupon, C., Clark, C.A., Pappata, S., Molko, N., Chabriat, H. (2001) Diffusion tensor imaging: concepts and applications. J Magn Reson Imaging 13, 534–546.

[22] Mori, S., Kaufmann, W.E., Davatzikos, C., Stieltjes, B., Amodei, L., Fredericksen, K., Pearlson, G.D., Melhem, E.R., Solaiyappan, M., Raymond, G.V., Moser, H.W., van Zijl, P.C. (2002) Imaging cortical association tracts in the human brain using diffusion-tensor-based axonal tracking. Magn Reson Med 47, 215–223.
[23] Ozarslan, E., Mareci, T.H. (2003) Generalized diffusion tensor imaging and analytical relationships between diffusion tensor imaging and high angular resolution diffusion imaging. Magn Reson Med 50, 955–965.

[24] Zhu, D.C., Covassin, T., Nogle, S., Doyle, S., Russell, D., Pearson, R.L., Monroe, J., Liszewski, C.M., DeMarco, J.K., Kaufman, D.I. (2015) A potential biomarker in sports-related concussion: brain functional connectivity alteration of the default-mode network measured with longitudinal resting-state fMRI over thirty days. J Neurotrauma 32, 327–341.

[25] Zhu, D.C., Majumdar, S. (2014) Integration of resting-state FMRI and diffusion-weighted MRI connectivity analyses of the human brain: limitations and improvement. J Neuroimaging 24, 176–186.

[26] Zhu, D.C., Majumdar, S., Korolev, I.O., Berger, K.L., Bozoki, A.C. (2013) Alzheimer's disease and amnestic mild cognitive impairment weaken connections within the default-mode network: a multi-modal imaging study. J Alzheimers Dis 34, 969–984.