NONPARAMETRIC ESTIMATION OF INTEGRAL CURVES USING HARDI DATA

By

Michael DeLaura

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Statistics – Doctor of Philosophy

2019

ABSTRACT

NONPARAMETRIC ESTIMATION OF INTEGRAL CURVES USING HARDI DATA

By Michael DeLaura

We develop a fully nonparametric method for the estimation of curve trajectories using HARDI data. For a set of locations $X_i \in G$, with $G$ representing a region of the brain, we consider the diffusion process by applying multivariate kernel smoothing techniques for the estimation of a general function $f$ describing the signal process obtained from the MRI image. At each location $x \in G$ we search for the direction of maximum diffusion on the unit sphere to obtain estimates of curve trajectories. We establish the convergence of the deviation between the estimated and true curves to a Gaussian process, which yields tests for the likelihood of connectivity between regions. The method is computationally efficient, since at each step of the curve tracing we construct a pointwise confidence ellipsoid region rather than relying on exhaustive iterative sampling methods.

Copyright by MICHAEL DELAURA 2019

This work is dedicated to my parents. Thank you for always believing in me.

ACKNOWLEDGEMENTS

I would like to thank my advisor Lyudmila Sakhanenko for all of her help with this work. I will always cherish the time we spent working together. I would also like to thank my committee members: Vidyadhar Mandrekar, Yimin Xiao, and David Zhu. Lastly, I would like to thank Sue Watson and Albert Cohen, who both played a large role in my successful completion of this degree. Thank you all so very much.

TABLE OF CONTENTS

LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 STEJSKAL TANNER EQUATION
1.2 Definitions/Preliminaries
1.3 Simple Vector Model
1.4 DTI Model
1.5 HARDI Model
1.6 Main Result
CHAPTER 2 ESTIMATION & MAIN RESULT
2.1 Preliminaries
2.2 Simpson's Rule
2.3 Estimation
2.4 Main Result for Asymptotic Normality of Deviation Process
CHAPTER 3 MEAN AND COVARIANCE OF THE LIMITING GAUSSIAN PROCESSES
3.1 Preliminaries
3.2 Green's Function
CHAPTER 4 ALGORITHM
CHAPTER 5 SIMULATIONS
5.1 Artificial example
5.2 Real HARDI dataset
CHAPTER 6 PROOFS
6.1 Some Results for Simpson's Scheme and Kernel Smoothing
6.1.1 Existence of Unique Direction and Establishing Approximation of $\hat X_n - x$
6.1.2 Establishing Asymptotic Unbiasedness of Estimators of Derivatives of $f$
6.2 Calculation of Mean and Covariance of $\hat Z_n$
6.3 Mean Squared Error of the Process $\hat Z_n$
6.4 Asymptotic Normality of the $\hat Z_n$ Process
6.4.1 Lyapunov's Condition
6.4.2 Asymptotic Equicontinuity of the Process $\hat Z_n$
CHAPTER 7 CONCLUSION
BIBLIOGRAPHY

LIST OF FIGURES

Figure 5.1: The true curve is in blue, while the estimated curve is in red.
Figure 5.2: An enlargement of the previous figure showing the 95% confidence ellipsoid surrounding a point on the estimated curve. The true curve in blue touches it.
Figure 5.3: Diffusion ellipsoids illustrate the corresponding diffusion tensors along the fiber across the genu of corpus callosum.
Figure 5.4: Visualization of diffusion via ellipsoids using the DTI/HARDI tensor model.
Figure 5.5: Axonal fibers across the genu of Corpus Callosum. We traced each fiber branch for 40 steps of size δ = 0.01. The estimated curve is shown in magenta accompanied by blue 95% confidence ellipsoids.
Figure 5.6: Right Fornix. We traced the fiber for 30 steps of size δ = 0.01. The estimated curve is shown in magenta accompanied by blue 95% confidence ellipsoids.
Figure 7.1: A fiber across the genu of corpus callosum with diffusion "blobs" along it.
Figure 7.2: Visualization of diffusion via nonparametric function using our model.

CHAPTER 1 INTRODUCTION

Diffusion MRI makes use of the properties of protons under an applied magnetic field to measure dominant diffusion directions in cerebral white matter. Magnetic resonance imaging (MRI) utilizes the dynamics of self-spinning protons, most commonly in water molecules, as the source of energy to generate the MRI signal. Under a strong magnetic field, a group of these spins forms a net magnetization. This net magnetization can be perturbed by a radio frequency electromagnetic wave, and its wobbling (precessing) motion can be measured as signals by an MRI scanner. By manipulating the magnetic field with gradients, we can identify the locations of the signal, which in turn allows us to generate images. Because MRI involves no ionizing radiation, and thus the potential damage to the human body is minimal, it has become an important tool in both clinical and research applications.

The directions in which water diffuses correspond to the directions along which the neural pathways, or axons, are aligned. Water moves along, but not across, these pathways, which is a key component in determining path orientations through the use of magnetic fields as in dMRI. This phenomenon of water diffusion is further taken advantage of in MRI to develop diffusion weighted imaging (DWI).
Water diffusion in the presence of a magnetic field gradient leads to MRI signal loss. In an unrestricted environment, water and other molecules move or diffuse randomly in three dimensions as a result of thermal energy. This motion is called Brownian motion. The protons in water molecules, moving in constant random motion within neural pathways, behave as gyroscopes that either 'tilt' or 'align' under a magnetic field applied in a particular direction. A lack of alignment corresponds to a larger signal response, indicating that the applied magnetic field direction is not in agreement with the path orientation. This lack of alignment increases as the magnetic field gradient is applied more orthogonally with respect to a pathway. In the case that there is agreement between the applied direction and the pathway, we get an aligned assortment of protons, which corresponds to a lower signal response. Thus these methods make use of the signal response under an applied magnetic field to measure the degree to which water is diffusing in a given direction as a means of tracking neural pathways. With this knowledge one can build models relating the signal response to an orientation distribution function (ODF), as in kernel regression estimation. This ODF gives an empirical description of the distribution of diffusion at a location $x \in \mathbb{R}^3$ for each particular diffusion direction $b \in \mathbb{R}^3$.

Studying the Brownian motion of molecules (water molecules in our case) in the brain can provide information regarding neuronal structural connectivity in vivo. These measurements have been made possible with DWI [18, 19], which applies diffusion-weighted gradients in various directions to assess the diffusing directions of the water molecules. With DWI data, as in the commonly used diffusion tensor imaging (DTI) techniques, diffusivity values and the principal diffusion orientation can be estimated at each voxel. Since healthy axons contain intact myelin sheaths and tend to align in organized orientations, water diffusivity in a voxel tends to be preferentially along the direction of the axonal bundles. By inspecting the orientations of the diffusion tensors at neighboring voxels, axonal fiber bundles can be tracked. Successful axonal tracking can be used to understand the structural connections between brain regions [21, 25], and to assess axonal changes over time in applications such as brain maturation in young children [20], axonal degeneration in Alzheimer's disease [26], and potential axonal damage in traumatic brain injury [24]. However, successful tractography based on DWI data faces some fundamentally challenging demands, specifically the need for a high image signal-to-noise ratio (SNR), high spatial resolution, a relatively long scan time, the ability to resolve crossing fibers, full coverage of the tracts of interest, and the ability to track in regions with low diffusion anisotropy. To address the issue of crossing fibers, high angular resolution diffusion imaging (HARDI) [23] in DWI has gained some success. The issues related to neuronal fiber tractography in DWI motivated our research on integral curve estimation. The application of a diffusion gradient leads to low image SNR. The subject's motion as well as physiological signals within the relatively long scan time in DWI, with a typical range of 8 to 20 minutes, can further exacerbate the noise issue in DWI.
In this work, our goal is to understand the error propagation due to noise in neuronal fiber tractography under as few model assumptions as possible, leading to a nonparametric setup. Using the methods of dMRI one can also assess the likelihood that two regions are connected. This is of great scientific interest, since developing our understanding of the connective features of the brain would inevitably help in the planning of neurosurgeries. This knowledge would serve as a strong basis for the study of brain disorders, allowing research to build upon itself in a more fruitful manner. Creating an underlying physical connectivity map would serve as a highly useful foundation for the investigation of functional brain connectivity, i.e., fMRI. A more informed schematic of this structure on so small a scale could reveal potential patterns and insight into the proportions of the brain used for their respective purposes. Combining this with fMRI analysis, one could create a richer analysis of region interactions and a better understanding of their purposes. Indeed, continued analysis of brain microstructure should produce a trustworthy anatomy. These methods could potentially be developed further and applied to uncover nerve pathway interactions on a small scale.

Probabilistic fiber tractography [19] is one popular technique in DWI because it can assess the relative strength of fiber connections. However, this technique employs Monte Carlo sampling and bootstrap techniques, and depends on arbitrary prior parameter assumptions based on fully parametric models. Incorrect parameter assumptions will exacerbate the error due to the repeated Monte Carlo sampling. To reduce the need for parameter assumptions, Koltchinskii et al. [5] developed a theoretically more rigorous semiparametric approach for the simple vector model, and Carmichael and Sakhanenko investigated the "low-order" DTI model [2] and the "high-order" HARDI model [9]. With the latter approach, they demonstrated tighter confidence ellipsoids around the fibers, and their method is more robust in handling crossings of fibers than other DTI methods [10, 11]. Deterministic fiber tractography [22] is another popular technique in DWI, which does not provide an assessment of uncertainty such as confidence regions or occupancy probability estimates. Building on the improvement of the semiparametric approach in the work by Carmichael and Sakhanenko, we want to assess the fiber tracking performance based on a completely nonparametric model using HARDI data. Although the proposed technique is computationally intense, it avoids the typical limitations of deterministic tractography techniques, and recovers connectivity information similar to probabilistic tractography techniques.

1.1 STEJSKAL TANNER EQUATION

The general form of the Stejskal-Tanner equation relates the diffusion signal $E(x,b)$ to the diffusion tensor $D$ at a location $x \in G$ and for some applied magnetic gradient direction $b \in \mathbb{R}^3$:
\[ E(x,b) = \frac{R(x,b)}{R_0(x)} = e^{-c\, b^T D b}. \]
At a given location $x$, $R(x,b)$ denotes the relative amount of water diffusion along the spatial direction $b$, and $R_0(x)$ represents the amount of signal observed without any applied magnetic field. The ratio of the two gives a metric for the amount of diffusion at a given location and direction. This serves as a basis in dMRI methodology for the construction of any particular type of model.
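As a small numerical illustration of the attenuation relation above, the following sketch evaluates $e^{-c\,b^T D b}$ for a prolate tensor; the constant $c$ and the tensor entries are made-up values chosen only to mimic typical scales, not figures from this work.

```python
import numpy as np

# Illustration of E(x, b) = exp(-c * b^T D b); c and D are made-up values
# (c plays the role of the b-value/timing constant), not taken from the text.
c = 1000.0                                  # s/mm^2 scale, as with b = 1000
D = np.diag([1.7e-3, 0.3e-3, 0.3e-3])       # mm^2/s, fiber aligned with e1

for b in (np.array([1.0, 0.0, 0.0]),        # gradient parallel to the fiber
          np.array([0.0, 1.0, 0.0])):       # gradient perpendicular to it
    print(b, np.exp(-c * b @ D @ b))        # ~0.18 vs ~0.74: parallel attenuates most
```

The gradient aligned with the fiber produces the strongest signal loss, which is exactly the cue exploited below to locate dominant diffusion directions.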
The amount of diffusion in a particular direction at a given location $x \in G$ can be characterized by this diffusion tensor $D$. The above relationship between the signal and the diffusion tensor was first given by Stejskal and Tanner. According to this classical equation, we may report the observed log-loss of signal obtained from the DTI data (see [8]) as related to a diffusion tensor as follows:
\[ y(x,b) = -\log\Big(\frac{R(x,b)}{R_0(x)}\Big) = c\, b^T M(x) b, \tag{1.1} \]
where for each $x \in G$, $M(x)$ (in place of $D$) is a positive-definite symmetric matrix representing the 3D distribution of diffusion at the location $x$. This $M$ can be visualized as an ellipsoid describing diffusion at the given location: its principal axes are aligned with the eigenvectors of $M$, and its elongation along each axis is determined by the corresponding eigenvalue. Note this makes sense, since the process of water diffusion is Gaussian and the diffusion tensor represents the covariance matrix of the diffusion process at a particular point. This model is quite natural for a single fiber, where the ellipsoids would be strongly stretched out along the fiber, while in the grey matter most ellipsoids would be spheres.

1.2 Definitions/Preliminaries

Nonparametric statistics makes use of methods for which no assumption is placed on the data belonging to a parametric family of distributions. The need for these methods arises frequently in areas where little is known about the underlying data structure of interest (e.g., the setting of this paper). Considering the highly unknown structure of the brain, and the unavoidably high amounts of noise shrouding the details we can in fact uncover, it makes sense to approach the problem with fewer assumptions. One can then formulate a more theoretically rigorous foundation upon which to proceed with analysis. Before discussing specific modeling methods we introduce some basic concepts of the nonparametric methodology particular to kernel regression estimation.

In integral curve estimation we are concerned with estimating curves that can be represented as integrals of smooth functions. We have the general form of a regression model
\[ Y_i = f(X_i) + \sigma(X_i)\epsilon_i, \]
where the $\epsilon_i$ are i.i.d. random errors with $E\epsilon_i = 0$, and $\sigma > 0$ is a sufficiently smooth scaling function. The functions $f$ and $\sigma$ are unknown. We only assume that $f$ exists and is at least 'smooth' in some sense. The quantity of interest is the curve $x(t)$, known as the 'integral curve', which has the form
\[ x(t) = \int_0^t g(x(s))\,ds \]
for some smooth function $g$ dependent on $f$. Usually the $X_i$ are non-random points on a regular grid, but they can be approximated by an $X_i \sim \mathrm{Uniform}[0,1]$ setup, which makes the analysis slightly easier. Thus we observe $\{(X_i, Y_i)\}_{i=1}^n$, where $X_i \sim \mathrm{Uniform}[0,1]$ and the $Y_i$ are observed responses thought to be a function of the data. We write $Y_i = f(X_i) + \sigma(X_i)\epsilon_i$. The goal is then to first estimate $f$ on $[0,1]$, and then to estimate $\sigma$. The field of kernel regression estimation is interested in the function $\hat f_n$ for estimating $f$. Let us review some basics here. Consider again $X_1, \dots, X_n \sim \mathrm{Uniform}[0,1]$. If we construct a histogram to consider $f$, then a natural estimator of $f$ would be
\[ \hat f_n(x) = \frac{1}{nh}\sum_{i=1}^n I\Big(\frac{x - X_i}{h} \in [-1,1]\Big) Y_i, \]
where $h$ is called the bandwidth, or window-width; it is a smoothing parameter used to correct 'oversmoothing' or 'undersmoothing'. In this way we include all of the $X_i$ within $h$ of $x$. Alternatively one can introduce a weight function $w(x) = \frac{1}{2}I(|x| < 1)$. Our estimator would then have the form
\[ \hat f_n(x) = \frac{1}{nh}\sum_{i=1}^n w\Big(\frac{x - X_i}{h}\Big) Y_i. \]
Now one can go from uniform weights $w$ to arbitrary weights defined by $K$. More precisely, let $K(x)$ be a kernel function, which is a probability density satisfying the conditions (K):

(K1) $K$ is non-negative and symmetric about 0,
(K2) $\int_{-\infty}^{\infty} K(x)\,dx = 1$,
(K3) $\int_{-\infty}^{\infty} x K(x)\,dx = 0$,
(K4) $0 < \int_{-\infty}^{\infty} |x|^2 K(x)\,dx < \infty$.

Then rather than a simple weight function $w$, one may select $K$ to be Gaussian, for example. Then
\[ \hat f_n(x) = \frac{1}{nh_n}\sum_{i=1}^n K\Big(\frac{x - X_i}{h_n}\Big) Y_i, \]
where the subscript $n$ indicates that (as is usually the case) the bandwidth is a function of the sample size $n$.

A primary concern in regression function estimation is how to handle the error on a local versus global scale. Of course, there is a trade-off between minimizing the error locally and globally, and the problem of minimizing the bias and the variance of the estimator $\hat f$ of $f$ with respect to the bandwidth $h$ is delicate, since as $h$ increases so does the bias. Optimizing with respect to the mean squared error is the common practice. We consider the mean integrated squared error (MISE) between $f$ and its estimator $\hat f$, that is,
\[ \mathrm{MISE} = E\int \big(\hat f_n(x) - f(x)\big)^2\,dx. \]

1.3 Simple Vector Model

The first of these works explored by Sakhanenko et al. was the simple vector model (see [5]), in which the diffusion tensor $D$ was calculated at each location, characterized by a symmetric positive definite $3\times 3$ matrix. Recall one can envision this diffusion tensor as an ellipsoid whose principal eigenvector points in the direction of dominant diffusion, giving a measure of the path direction so that the fiber tract can be reconstructed in small steps. The method essentially uses Euler's method to reconstruct the curve locations. Stated explicitly, there is a vector field $v : G \to \mathbb{R}^3$ observed at uniformly i.i.d. locations $X_i \in G$ with i.i.d. random errors $\xi_i$, for which $E\xi = 0$ and $\mathrm{Cov}(\xi,\xi) = \Sigma$. The $\xi_i$ are taken to be independent of the locations $X_i$. The observations are
\[ (X_i, V_i) = (X_i, v(X_i) + \xi_i). \]
To trace the curves, we look at the Cauchy problem of solving the differential equation
\[ \frac{dx(t)}{dt} = v(x(t)), \quad t \ge 0, \quad x(0) = a \in G, \]
or equivalently,
\[ x(t) = a + \int_0^t v(x(s))\,ds. \]
If the vector field is a very simple one (e.g., a constant vector field), then integrating along the vector field gives an integral curve $x$ that is another regression function in a space of dimension one less than the dimension of the space of which $V_i$ is a member. Note that the set $G \subset \mathbb{R}^3$ is a bounded open set of Lebesgue measure one and represents some scaled region of the brain. The Nadaraya-Watson type estimator was used as an estimator of the vector field $v$ (see reference [5] of [5]):
\[ \hat V(x) = \hat V_n(x) = \frac{1}{nh^d}\sum_{i=1}^n K\Big(\frac{x - X_i}{h}\Big) V_i, \]
where the kernel $K$ satisfies the assumptions (K). The kernel $K$ is also defined on a region of bounded support wherein it is twice continuously differentiable. In particular, the estimate $\hat V(x) = 0$ outside a bounded neighborhood of $G$. Thus there is a plug-in estimate of the curve $x(t)$, defined as the solution $\hat X(t) = \hat X_n(t)$, $t \ge 0$, of the Cauchy problem
\[ \frac{d\hat X_n(t)}{dt} = \hat V_n(\hat X_n(t)), \quad t \ge 0, \quad \hat X_n(0) = a \in G, \]
or equivalently,
\[ \hat X_n(t) = a + \int_0^t \hat V_n(\hat X_n(s))\,ds. \]
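The following minimal sketch instantiates the one-dimensional kernel regression estimator written above, with a Gaussian kernel and a uniform design; the test function and the noise level are our own illustrative choices.

```python
import numpy as np

# Minimal sketch of the kernel regression estimator above (1-D, uniform
# design on [0,1], so the design density is 1 and no ratio is needed).
rng = np.random.default_rng(0)

n, h = 1000, 0.05
X = rng.uniform(0.0, 1.0, n)                 # design: X_i ~ Uniform[0,1]
f = lambda x: np.sin(2 * np.pi * x)          # "unknown" regression function
Y = f(X) + 0.2 * rng.standard_normal(n)      # Y_i = f(X_i) + sigma * eps_i

K = lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)   # Gaussian kernel

def fhat(x):
    # (1/(n h)) * sum_i K((x - X_i)/h) * Y_i, cf. the display above
    return np.mean(K((x - X) / h) * Y) / h

for x in np.linspace(0.1, 0.9, 5):
    print(f"x={x:.2f}  fhat={fhat(x):+.3f}  f={f(x):+.3f}")
```

The estimates track $f$ at interior points; shrinking $h$ with $n$ trades bias against variance exactly as in the MISE discussion above.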
1.4 DTI Model

A more complete account of how noise in DTI data impacts fiber trajectory estimates is provided by the low-order DTI model. As before, the main goal is to estimate the curve $x(t)$, only now driven by the vector field $v(M(x))$ for $x \in G$, where $M$ is a tensor field and $M(x)$ represents the calculated tensor at the location $x$. Thus $v(M(x))$ is a tensor-driven vector field. The observations are modeled according to the Stejskal-Tanner equation (1.1) with a heteroscedastic noise function $\sigma > 0$:
\[ y(x,b) = -\log\Big(\frac{R(x,b)}{R_0(x)}\Big) = c\, b^T M(x) b + \sigma(x,b)\xi. \]
Thus, given $N$ magnetic field gradient directions $b_1, \dots, b_N$ we have the elements $y(x,b_j)$, $j = 1, \dots, N$, that make up the vector $Y(x)$ for a location $x \in G$, modeled by
\[ Y(x) = B M(x) + \Sigma^{1/2}(x)\,\Xi_x. \tag{1.2} \]
Here the fixed matrix $B$ is related to the set of diffusion gradient directions and timing parameters of the imaging procedure. For DTI, we require at least $N = 6$ directions. Denoting each diffusion directional vector $b$ by $b = (b^{(1)}, b^{(2)}, b^{(3)})$, we then have for these $N$ directions $b_i$ the fixed matrix
\[ B = \begin{pmatrix} (b_1^{(1)})^2 & 2b_1^{(1)}b_1^{(2)} & 2b_1^{(1)}b_1^{(3)} & (b_1^{(2)})^2 & 2b_1^{(2)}b_1^{(3)} & (b_1^{(3)})^2 \\ (b_2^{(1)})^2 & 2b_2^{(1)}b_2^{(2)} & 2b_2^{(1)}b_2^{(3)} & (b_2^{(2)})^2 & 2b_2^{(2)}b_2^{(3)} & (b_2^{(3)})^2 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ (b_N^{(1)})^2 & 2b_N^{(1)}b_N^{(2)} & 2b_N^{(1)}b_N^{(3)} & (b_N^{(2)})^2 & 2b_N^{(2)}b_N^{(3)} & (b_N^{(3)})^2 \end{pmatrix}, \]
and in (1.2) the tensor $M(x)$ is stacked into a vector whose entry $M^{(i,j)}$ represents a measure of the amount of diffusion in the $(i,j)$ direction. To be clear, for example, the $(1,1)$ direction would be the direction $(1,0,0)$ and the $(1,2)$ direction would correspond to the direction $(1,1,0)$. Likewise the $(1,3)$ direction would correspond to the directional vector $(1,0,1)$. Note each of these matrices in the field is an average within a voxel located at $x$ that has been corrupted with noise. $\Sigma$ is an $N \times N$ symmetric positive definite tensor with entries $\Sigma_{ij}(x) = \mathrm{cov}(\sigma(x,b_i), \sigma(x,b_j))$. The $N \times 1$ tensor $\Xi_x$ is a vector of random noise with entries $\xi_j$, $j = 1, \dots, N$.

This second-order tensor $M$ describing the diffusion locally at each $x$ can be represented by a $3\times 3$ positive-definite symmetric matrix:
\[ M(x) = \begin{pmatrix} M^{(1,1)}(x) & M^{(1,2)}(x) & M^{(1,3)}(x) \\ M^{(2,1)}(x) & M^{(2,2)}(x) & M^{(2,3)}(x) \\ M^{(3,1)}(x) & M^{(3,2)}(x) & M^{(3,3)}(x) \end{pmatrix}. \]
The Cauchy problem of solving the ODE for the curve trajectory is now given by
\[ \frac{dx(t)}{dt} = v(M(x(t))), \quad t \ge 0, \quad x(0) = a \in G, \]
or equivalently,
\[ x(t) = a + \int_0^t v(M(x(s)))\,ds. \]
The vector field $v(M(x))$ consists of the leading eigenvectors of the tensors $M(x)$ for each $x \in G$: the direction of diffusion is the eigenvector corresponding to the maximal eigenvalue of $M(x)$. For a point $X_j \in G$ we estimate $M(X_j)$ using the ordinary least squares estimator
\[ \tilde M(X_j) = (B^T B)^{-1} B^T Y(X_j), \]
provided that $(B^T B)^{-1}$ exists. We use a kernel smoothing method to estimate $M(x)$ at locations $x \in G$ between our observations $X_j$:
\[ \hat M_n(x) = \frac{1}{nh_n^3}\sum_{j=1}^n K\Big(\frac{x - X_j}{h_n}\Big)\tilde M(X_j), \]
where $K$ is a kernel function and $h_n$ is a bandwidth. We then compute the eigenvectors and eigenvalues $v(\hat M_n(x))$ and $\lambda(\hat M_n(x))$ of $\hat M_n(x)$. These are our estimators of the true eigenvectors of $M(x)$. The eigenvector $v(\hat M_n(x))$ with the corresponding maximal eigenvalue $\lambda(\hat M_n(x))$ gives the solution for the estimate $\hat X_n(t)$ of $x(t)$:
\[ \frac{d\hat X_n(t)}{dt} = v(\hat M_n(\hat X_n(t))), \quad t \ge 0, \quad \hat X_n(0) = a. \]
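A short sketch of the OLS step just described: it builds the design matrix $B$ row by row from gradient directions and recovers the six free entries of $M$. The directions, the true tensor entries, and the noise level are illustrative assumptions.

```python
import numpy as np

# Sketch: build the DTI design matrix B from N gradient directions and
# recover the six free entries of M via M~ = (B^T B)^{-1} B^T Y.
rng = np.random.default_rng(1)

def design_row(b):
    b1, b2, b3 = b
    # ordering matches the matrix B in the text:
    # (b1^2, 2 b1 b2, 2 b1 b3, b2^2, 2 b2 b3, b3^2)
    return np.array([b1*b1, 2*b1*b2, 2*b1*b3, b2*b2, 2*b2*b3, b3*b3])

N = 12
dirs = rng.standard_normal((N, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # unit gradient directions
B = np.vstack([design_row(b) for b in dirs])          # N x 6 design matrix

M_true = np.array([1.7, 0.1, 0.0, 0.4, 0.0, 0.3])     # (M11,M12,M13,M22,M23,M33)
Y = B @ M_true + 0.01 * rng.standard_normal(N)        # noisy log-loss vector

M_hat, *_ = np.linalg.lstsq(B, Y, rcond=None)         # OLS solve
print(np.round(M_hat, 3))                             # close to M_true
```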
The low-order DTI approach utilizes a second-order tensor. This method requires one to assume that there is only one fiber present per voxel, that is, that only one fiber extends outward from each cubic region in consideration, rather than branching, touching, or 'kissing' of two or more fibers. These situations require a richer sampling of the directional diffusion space. This gives rise to the need for the higher-order model known as HARDI, or high angular resolution diffusion imaging, in which a tensor of order greater than two is utilized.

1.5 HARDI Model

Under the higher-order tensor approach the Stejskal-Tanner equation becomes
\[ \log\Big(\frac{R(x,b)}{R_0(x)}\Big) = -c\sum_{i_1=1}^3\cdots\sum_{i_M=1}^3 D_{i_1\ldots i_M}(x)\, b_{i_1}\cdots b_{i_M} + \sigma(x,b)\xi_b, \]
where $b$ is the vector in $\mathbb{R}^3$ denoting the applied magnetic field gradient direction on the unit sphere and $c$ is a constant that depends on the parameters of the imaging procedure. The numbers $D_{i_1\ldots i_M}(x)$ are the components of the high-order diffusion tensor $D(x)$, which is a positive definite supersymmetric $\underbrace{3\times\cdots\times 3}_{M\ \text{times}}$ tensor. Due to symmetry, $D(x)$ can be represented by a vector $D(x) \in \mathbb{R}^{J_M}$, where $J_M = (M+1)(M+2)/2$. Thus at locations $x \in G$ we observe log-losses of signal
\[ Y(x) = \log\Big(\frac{R(x,b)}{R_0(x)}\Big) \in \mathbb{R}^N, \]
stacked into the vector $Y$ for all directions $b$. Now the diffusion signal is modeled as
\[ Y(x) = B D(x) + \Sigma^{1/2}(x)\,\Xi_x, \tag{1.3} \]
where again we require $N \ge 6$. $B \in \mathbb{R}^{N\times J_M}$ is the fixed matrix whose components are combinations of the diffusion directions as in the DTI model. $\Xi_x \in \mathbb{R}^N$ is random noise and $\Sigma(x)$ is an $N\times N$ symmetric positive definite tensor.

When $M = 2$ we have the usual eigenvalue problem. Computing the eigenvalues and eigenvectors of a high-order tensor requires a different approach than the usual eigenvalue problem; for details see [13]. We outline the basic idea here. First, the rank $R$ of the high-order tensor $D$ is the minimal number $R$ for which
\[ D = \sum_{k=1}^R \underbrace{v_k \otimes\cdots\otimes v_k}_{M\ \text{times}} \quad \text{for some } v_1,\dots,v_R \in \mathbb{R}^3, \]
where $u \otimes w = u w^T$ denotes the outer product, $(u\otimes w)_{ij} = u_i w_j$. In other words, $R$ is the minimal number of vector outer products needed to sum to $D$.

As we are now considering the possibility of multiple fibers per voxel, we consider the tensor $D^{(r)}$ describing fiber $r$, for $r = 1, \dots, R$. The best rank-1 approximation of the tensor $D^{(1)} = D$ is $\lambda^{(1)} v^{(1)}\otimes\cdots\otimes v^{(1)}$, where $\lambda^{(1)} > 0$ and $v^{(1)}$ is a unit vector; $R$ is the minimum number of rank-1 tensors that sum to $D$. For $k = 2, \dots, R$, the pair $(\lambda^{(k)}, v^{(k)})$ makes up the best rank-1 approximation of the tensor
\[ D^{(k)} = D^{(k-1)} - \lambda^{(k-1)} v^{(k-1)}\otimes\cdots\otimes v^{(k-1)}, \]
where the best rank-1 approximation $(\lambda, v)$ minimizes the Frobenius norm
\[ \sum_{i_1=1}^3\cdots\sum_{i_M=1}^3\big(D^{(k)}_{i_1\cdots i_M} - \lambda v_{i_1}\cdots v_{i_M}\big)^2. \]
$\lambda^{(1)},\dots,\lambda^{(R)}$ are called the pseudo-eigenvalues and $v^{(1)},\dots,v^{(R)}$ the pseudo-eigenvectors of the tensor $D$.

For locations $X_j \in G$, we estimate $D(X_j)$ using the ordinary least squares estimator
\[ \tilde D(X_j) = (B^T B)^{-1} B^T Y(X_j), \quad j = 1,\dots,n, \]
or the weighted LSE
\[ \tilde D(X_j) = (B^T\Sigma^{-1}(X_j)B)^{-1} B^T\Sigma^{-1}(X_j) Y(X_j), \quad j = 1,\dots,n. \]
As before we use a kernel smoothing method to estimate $D$ at locations $x \in G$ between our observations $X_j$:
\[ \hat D_n(x) = \frac{1}{nh_n^d}\sum_{j=1}^n K\Big(\frac{x - X_j}{h_n}\Big)\tilde D(X_j), \]
where $K$ is a kernel function and $h_n$ is a bandwidth.
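To make the pseudo-eigen decomposition concrete, here is a sketch of the best rank-1 approximation for a supersymmetric order-4 tensor via the higher-order power method, with deflation as described above. The iteration scheme is the standard one from the tensor literature (cf. [13]); the two-fiber tensor is a made-up example.

```python
import numpy as np

# Best rank-1 approximation of a supersymmetric order-4 tensor via the
# higher-order power method, then deflation, as in the text.
rng = np.random.default_rng(2)

def sym_rank1(v):
    # v (x) v (x) v (x) v, the order-4 symmetric outer product
    return np.einsum('i,j,k,l->ijkl', v, v, v, v)

def best_rank1(D, iters=200):
    v = rng.standard_normal(3)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = np.einsum('ijkl,j,k,l->i', D, v, v, v)   # contract 3 of 4 indices
        v = w / np.linalg.norm(w)
    lam = np.einsum('ijkl,i,j,k,l->', D, v, v, v, v)
    return lam, v

# Two "fibers": dominant along e1, weaker along e2.
e1, e2 = np.eye(3)[0], np.eye(3)[1]
D = 2.0 * sym_rank1(e1) + 1.0 * sym_rank1(e2)

lam1, v1 = best_rank1(D)
print(lam1, np.round(v1, 3))                       # ~2.0 and +-e1
lam2, v2 = best_rank1(D - lam1 * sym_rank1(v1))    # deflate, as in the text
print(lam2, np.round(v2, 3))                       # ~1.0 and +-e2
```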
Given $\hat D_n(x)$, $x \in G$, we calculate its pseudo-eigenvalues $\hat\lambda^{(r)}_n(x)$ and pseudo-eigenvectors $\hat v^{(r)}_n(x)$ for $r = 1,\dots,R$ by minimizing the Frobenius norm above. Then we have our $r$-th curve estimate
\[ \frac{d\hat x^{(r)}_n(t)}{dt} = \hat v^{(r)}_n(\hat x^{(r)}_n(t)), \quad t \ge 0, \quad \hat x^{(r)}_n(0) = a. \]
It is the curve whose tangents are the pseudo-eigenvectors $\hat v^{(r)}_n$. In practice we trace it in small steps in the direction $\hat v^{(r)}_n$, starting at the location $a$.

1.6 Main Result

The primary result is that for each of the above models one can establish the convergence of the properly normalized deviation process $\sqrt{nh^{d-1}}\,(\hat X_n(t) - x(t))$ to a vector-valued Gaussian process $G(t)$ on $[0,T]$, i.e., the normalized deviation process converges in the space $C[0,T]$. Its mean and covariance are functions of the vector field estimates $\hat v$ or, in the case of the nonparametric approach discussed in this paper, $\hat f$, and in both cases of the noise $\Sigma$. Of course, the dimension $d$ is usually $d = 3$. This result allows one to provide hypothesis tests of whether two regions are connected. It also allows for the production of the popular probabilistic tractography p-value maps, and the construction of statistics such as $\inf_{t\in[0,T]}|\hat X(t) - z|^2$ that allow the consideration of the curve reaching a location of interest $z$. These are useful because they allow scientific users of such data to easily perform tests for the likelihood of regions being connected without requiring familiarity with the background processes.

CHAPTER 2 ESTIMATION & MAIN RESULT

2.1 Preliminaries

In the current scenario we wish to model the signal completely nonparametrically (note that the previous approaches were semiparametric). We consider the problem of modeling the diffusion signal with a general function $f$ with heteroscedastic error in the nonparametric regression setup:
\[ Y_{ijk} = f(X_i,\varphi_j,\theta_k) + S(X_i,\varphi_j,\theta_k)\,\epsilon_{ijk}, \]
where $(\varphi_j,\theta_k)$ are the applied directions on the unit sphere. Recall the Stejskal-Tanner equation (1.1). Under this regime we have
\[ Y_{ijk} = -\log\Big(\frac{R(x_i, b_{jk})}{R_0(x_i)}\Big) = f(x_i,\varphi_j,\theta_k) + S(x_i,\varphi_j,\theta_k)\,\epsilon_{ijk}, \tag{2.1} \]
where $R, R_0$ are as before and $b_{jk}$ is the vector in $\mathbb{R}^3$ corresponding to the magnetic field gradient applied in the direction $(\varphi_j,\theta_k)$ on the unit sphere. We also include the noise $S(x_i,\varphi_j,\theta_k)\epsilon_{ijk}$ in the Stejskal-Tanner equation; $S$ describes the covariance structure of the noise at a given location and direction. The $\epsilon_{ijk}$ are standardized error terms satisfying the following assumptions ($\Sigma$):

($\Sigma$1) $E\epsilon_{ijk} = 0$, $E\epsilon^2_{ijk} = 1$, and $E\epsilon_{i_1jk}\epsilon_{i_2jk} = \Sigma_{i_1i_2}$ for all $i, j, k, i_1, i_2$.
($\Sigma$2) $\{\epsilon_{i_1j_1k_1}\}$ and $\{\epsilon_{i_2j_2k_2}\}$ are independent when $j_1 \ne j_2$ or $k_1 \ne k_2$.

Let $\epsilon_{jk} := (\epsilon_{1jk},\dots,\epsilon_{njk})$ and $\Sigma^{jk}_{i_1i_2} = \mathrm{cov}(\epsilon_{i_1jk},\epsilon_{i_2jk})$, so that
\[ \mathrm{Var}(\epsilon_{jk}) = \Sigma^{jk} = \begin{pmatrix} \Sigma^{jk}_{11} & \ldots & \Sigma^{jk}_{1n} \\ \vdots & & \vdots \\ \Sigma^{jk}_{n1} & \ldots & \Sigma^{jk}_{nn} \end{pmatrix}. \]
Naturally, the degree of diffusion at the locations $x_{i_1}$ and $x_{i_2}$ exhibits short-range dependency for a given direction $(\varphi_j,\theta_k)$, so that the covariance matrix consists of a high number of zero or near-zero entries; for example, it could be diagonal or banded. This leads to the sparsity of $\Sigma$ described in the condition ($\Sigma$3):

($\Sigma$3) For some $\kappa > 0$, as $n \to \infty$ we have $h_n^3\sum_{k_1\ne k_2}\Sigma_{k_1k_2} \to \kappa$, where the sequence $h_n \to 0$ will be introduced in the main theorem.

As the diffusion is strongest at a location $x_i$ when $R$ is smallest (indeed, the order of magnitude of $R$ depends on the degree of proton disalignment), to search for the direction of dominant diffusion we seek to maximize the function $f$ in (2.1).
As in the previous works, this will drive the vector field of curve tangents, $v$. For each $x \in G$ we search for the direction $(\varphi^*(x),\theta^*(x))$ on the unit sphere for which
\[ \frac{\partial f}{\partial\varphi}(x,\varphi^*(x),\theta^*(x)) = \frac{\partial f}{\partial\theta}(x,\varphi^*(x),\theta^*(x)) = 0, \qquad \frac{\partial^2 f}{\partial\varphi^2}(x,\varphi^*(x),\theta^*(x)) < 0, \qquad \frac{\partial^2 f}{\partial\theta^2}(x,\varphi^*(x),\theta^*(x)) < 0, \tag{2.2} \]
\[ \frac{\partial^2 f}{\partial\varphi^2}\frac{\partial^2 f}{\partial\theta^2}(x,\varphi^*(x),\theta^*(x)) - \Big(\frac{\partial^2 f}{\partial\theta\partial\varphi}(x,\varphi^*(x),\theta^*(x))\Big)^2 > 0. \]
This indicates that $(\varphi^*(x),\theta^*(x))$ is the direction of the maximum gradient flow at $x$. We use this direction to define the ODE for which the curve $x^*$ is a solution:
\[ \frac{dx^*(t)}{dt} = v^*(x^*(t)) := \begin{pmatrix} \sin\theta^*(x^*(t))\cos\varphi^*(x^*(t)) \\ \sin\theta^*(x^*(t))\sin\varphi^*(x^*(t)) \\ \cos\theta^*(x^*(t)) \end{pmatrix}, \qquad x^*(0) = x_0. \tag{2.3} \]
This integral curve models an axonal fiber $x^*(t)$, $t \ge 0$, starting at $x_0$ and flowing along the maximal gradient direction. The location $x_0$ is typically given as a seed point located in a region of interest. We assume condition (F): the function $f$ is twice continuously differentiable in $G\times[-\pi,\pi]\times[0,\pi]$. Then the uniqueness of the direction is guaranteed for some open subset of $G$ containing $x_0$, and this implicitly determines the end time $t = T$. Thus, the goal is to estimate $x^*(t)$, $t\in[0,T]$, based on the dataset $(Y_{ijk}, X_i, \varphi_j, \theta_k)$, $i = 1,\dots,n$, $j,k = 1,\dots,N$.

A natural estimation procedure consists of estimating $f$ by some $\hat f_n$, then finding the direction of maximum estimated diffusion $(\hat\varphi^*_n(x),\hat\theta^*_n(x))$ as the pair that maximizes $\hat f_n(x,\varphi,\theta)$ at a given location $x$, followed by the ODE (2.3) where the unknown true $x^*, \varphi^*, \theta^*$ are replaced by their respective estimators $\hat x^*_n, \hat\varphi^*_n, \hat\theta^*_n$. Thus the estimated integral curve is defined as the solution of the ODE
\[ \frac{d\hat x^*_n(t)}{dt} = \hat v^*_n(\hat x^*_n(t)) = \begin{pmatrix} \sin\hat\theta^*_n(\hat x^*_n(t))\cos\hat\varphi^*_n(\hat x^*_n(t)) \\ \sin\hat\theta^*_n(\hat x^*_n(t))\sin\hat\varphi^*_n(\hat x^*_n(t)) \\ \cos\hat\theta^*_n(\hat x^*_n(t)) \end{pmatrix}, \qquad \hat x^*_n(0) = x_0. \]
Before introducing our estimator $\hat f_n$ we make a few comments on the nature of Simpson's integral approximation scheme, which will be used later.

2.2 Simpson's Rule

For a function $g$ on an interval $[a,b]$ we can approximate the area under the curve of $g$ from $a$ to $b$ using Simpson's rule. Rather than the trapezoidal rule, or the left or right Riemann sum, which use lines or constants, Simpson's scheme uses quadratic polynomials to approximate the function $g$. That is, for $x_i$, $i = 0,\dots,n$, forming a partition of $[a,b]$ with $x_0 = a$ and $x_n = b$, quadratic polynomials $P(x) = px^2 + qx + v$, $p,q,v \in \mathbb{R}$, can approximate $g$. $P$ is often called a second-order interpolating polynomial because it can be chosen such that it passes through the points in consideration. In particular, the Lagrange polynomials defined as
\[ P(x) = \sum_{k=1}^n p_k(x), \qquad p_k(x) = g(x_k)\prod_{\substack{j=1\\ j\ne k}}^n\frac{x - x_j}{x_k - x_j}, \]
pass through the curve $g$ at the points $x_i$, $i = 1,\dots,n$. Dividing $[a,b]$ into an even number $n$ of intervals, letting $\Delta x = \frac{b-a}{n}$, and requiring
\[ P(x_i) = g(x_i), \quad i = 0,1,\dots,n, \]
enables one to show that
\[ \int_a^b P(x)\,dx = \frac{\Delta x}{3}\big(g(a) + 4g(x_1) + 2g(x_2) + 4g(x_3) + \cdots + 4g(x_{n-1}) + g(b)\big), \]
and this result is used as an approximation to the integral:
\[ \int_a^b g(x)\,dx \approx \int_a^b P(x)\,dx. \]
An easy case demonstrating this is to take $[a,b] = [-h,h]$ and $n = 2$. We will use the Simpson rule in our estimator $\hat f_n$ when dealing with the angular variables.
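A quick numerical check of the composite rule just quoted, on a toy integral; the weight pattern $(1, 4, 2, 4, \dots, 4, 1)\cdot\Delta x/3$ is the same pattern reused (as the coefficients $a_m, b_l$) inside the estimator of Section 2.3.

```python
import numpy as np

# Composite Simpson rule on [a, b] with n (even) subintervals.
def simpson(g, a, b, n):
    x = np.linspace(a, b, n + 1)
    w = np.ones(n + 1)
    w[1:-1:2], w[2:-1:2] = 4.0, 2.0
    return (b - a) / (3 * n) * np.sum(w * g(x))

exact = 1.0 - np.cos(1.0)          # integral of sin over [0, 1]
for n in (2, 4, 8, 16):
    err = abs(simpson(np.sin, 0.0, 1.0, n) - exact)
    print(n, err)                  # error shrinks like n^{-4}, cf. Proposition 2
```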
2.3 Estimation

Our estimator $\hat f_n(x,\varphi,\theta)$ is a combination of multivariate kernel smoothing over $x$ with Simpson's scheme for the numerical approximation of an integral with respect to $\varphi$ and $\theta$. More precisely, introduce
\[ \hat f_n(x,\varphi,\theta) = \frac{1}{nh_n^3}\sum_{m=0}^{2N_\theta}\sum_{l=0}^{2N_\varphi}\sum_{k=1}^n \frac{a_m}{h_\theta}\frac{b_l}{h_\varphi}\, K_\theta\Big(\frac{\theta - \theta_m}{h_\theta}\Big)\, K_\varphi\Big(\frac{\varphi - \varphi_l}{h_\varphi}\Big)\, K\Big(\frac{x - X_k}{h_n}\Big)\, Y_{klm}, \tag{2.4} \]
where $a_m, b_l$ are the coefficients in Simpson's scheme, and $h_n, h_\varphi, h_\theta$ are bandwidths for the kernels $K, K_\varphi, K_\theta$, respectively, to be optimized later. We have the following conditions (K) on the kernels:

(K1) $K$ is a symmetric probability density in $\mathbb{R}^d$ with bounded support,
(K2) $\int K(u)\,du = 1$, $\int K'(u)\,du = 0$, $\int uK(u)\,du = 0$, $\int|u|^2K(u)\,du < \infty$.

These conditions hold for the kernels $K$, $K_\varphi$, and $K_\theta$ in $\mathbb{R}^3$, $\mathbb{R}$, and $\mathbb{R}$, respectively. Additionally we assume

(K3) $\mathrm{Supp}(K_\varphi) = [-c,c] \subset [-N,N]$, $\mathrm{Supp}(K_\theta) = [-d,d] \subset [-N,N]$.

Recall $N^2$ is the number of equally spaced angles for the observed directions $(\varphi_j,\theta_k)$ on the unit sphere. These conditions are quite typical in kernel density estimation practice. The following proposition will be useful in establishing the asymptotic unbiasedness of the estimators of the derivatives of the signal function $f$.

Proposition 1. For any kernel $K$ satisfying conditions (K) and any function $g$ twice continuously differentiable on compact sets, we have as $h \to 0$:
\[ \int h^{-d}K\Big(\frac{u - u_0}{h}\Big)g(u)\,du = g(u_0) + 0.5h^2 g''(u_0)K_2(1 + o(1)), \]
where $K_2 = \int|u|^2K(u)\,du$.

Next we discuss Simpson's scheme for the integral of a real function $g$ on a bounded interval $[a,b]$. Define
\[ S_N(g(u_0),\dots,g(u_{2N}),a,b) = \frac{1}{6N}\big[g(u_0) + g(u_{2N})\big] + \frac{2}{3N}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{3N}\sum_{m=1}^{N-1} g(u_{2m}), \]
with $u_m = a + \frac{m}{2N}(b-a)$, $m = 0,1,\dots,2N$. The following result is well known in the literature.

Proposition 2. For any four times continuously differentiable function $g$ defined on the interval $[a,b]$ we have as $N\to\infty$
\[ \int_a^b g(u)\,du = S_N(g(u_0),\dots,g(u_{2N}),a,b) - \frac{(b-a)^5}{180}\,g^{IV}(u^*)\frac{1}{(2N)^4} \]
with some $u^*$ in $[a,b]$, where $g^{IV}$ is the fourth-order derivative of $g$.

These two propositions together indicate that $E\hat f_n(x,\varphi,\theta) = f(x,\varphi,\theta) + o(1)$ as $h, h_\varphi, h_\theta \to 0$ and $n, N_\varphi, N_\theta \to \infty$. We establish the exact nature of the remainder term in the proofs chapter.

Our curve estimators are obtained through the estimation of $\hat f_n$. The direction $(\varphi,\theta)$ for which $\hat f_n$ is maximized yields the tangent vector $\hat v_n$. We apply Euler's method to obtain the curve estimators using this tangent vector field, meaning that we follow the vector field $\hat v_n$ in small steps. We establish the convergence of the properly normalized deviation between our estimator and the true underlying curve, $\hat X_n(t) - x(t)$, $t\in[0,T]$, to a Gaussian process. We derive test procedures for testing whether two regions of the brain are connected. This allows us to construct so-called p-value maps that can rival the probability occupancy maps often used in probabilistic tractography methods to make inference about the likelihood of connected regions.

There are two cases for each of the bandwidths:

(I) $h_n = O(n^{-1/2})$, $h_\varphi = h_\theta = h = O(n^{-1/2})$, $N_\varphi = N_\theta = N \ge \nu(n) = O(n^2)$ as $n \to \infty$; or

(II) $h_n = O(N^{-1/5}n^{-1/10})$, $h_\varphi = h_\theta = h = O(N^{-1/5}n^{-1/10})$, $N_\varphi = N_\theta = N = o(n^2)$ as $n \to \infty$.

In practice the number of gradient directions $N^2$ is much smaller (typically around 64) than the number of voxels $n$ (on the order of $10^6$) in a typical HARDI dataset, so case (II) is more practical; it is what we use in the simulation study with simulated and real data.
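The following compact sketch instantiates (2.4): a kernel average over voxel locations combined with Simpson-weighted kernel sums over the angular grid. We fold the interval length into the Simpson weights, a normalization convention of this sketch rather than a claim about the dissertation's exact $a_m, b_l$; all sizes and bandwidths are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def simpson_weights(a, b, two_n):
    # Composite Simpson weights on 2N subintervals; they sum to (b - a).
    step = (b - a) / two_n
    w = np.full(two_n + 1, 2.0)
    w[1::2] = 4.0
    w[0] = w[-1] = 1.0
    return w * step / 3.0

K3 = lambda u: np.exp(-0.5 * np.sum(u**2, axis=-1)) / (2 * np.pi)**1.5
K1 = lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def fhat(x, phi, theta, X, phis, thetas, Y, hn, hphi, htheta):
    # X: (n,3) voxel locations; phis: (2Nphi+1,); thetas: (2Ntheta+1,);
    # Y: (n, 2Nphi+1, 2Ntheta+1) observed log-losses, cf. (2.4).
    kx = K3((x - X) / hn) / (len(X) * hn**3)
    kp = simpson_weights(phis[0], phis[-1], len(phis) - 1) \
         * K1((phi - phis) / hphi) / hphi
    kt = simpson_weights(thetas[0], thetas[-1], len(thetas) - 1) \
         * K1((theta - thetas) / htheta) / htheta
    return np.einsum('k,l,m,klm->', kx, kp, kt, Y)

# smoke test on noise-free data f(x, phi, theta) = cos(theta)
n, Np, Nt = 20000, 8, 8
X = rng.uniform(0.0, 1.0, (n, 3))
phis = np.linspace(-np.pi, np.pi, 2 * Np + 1)
thetas = np.linspace(0.0, np.pi, 2 * Nt + 1)
Y = np.broadcast_to(np.cos(thetas), (n, 2 * Np + 1, 2 * Nt + 1)).copy()
print(fhat(np.array([0.5, 0.5, 0.5]), 0.0, 1.0, X, phis, thetas, Y,
           hn=0.25, hphi=0.4, htheta=0.4))  # near cos(1) ~ 0.54, up to smoothing bias
```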
2.4 Main Result for Asymptotic Normality of Deviation Process

Lemma 1. Suppose that (I) or (II) holds. Then uniformly in $t\in[0,T]$, $\hat X^*_n(t)$ is a consistent estimator of $x^*(t)$. That is,
\[ \sup_{t\in[0,T]}|\hat X^*_n(t) - x^*(t)| \stackrel{P}{\to} 0 \quad \text{as } n\to\infty. \]

Theorem 1. Under case (I), $n(\hat X_n(t) - x(t)) \stackrel{D}{\Rightarrow} G(t)$ in the space $C[0,T]$, where the Gaussian process $G$ has mean function $\mu(t)$ and covariance function $C(t_1,t_2)$ introduced in the next section. Under case (II), $n^{1/5}N^{2/5}(\hat X_n(t) - x(t)) \stackrel{D}{\Rightarrow} G_0(t)$ in $C[0,T]$, where the centered Gaussian process $G_0$ has covariance function $C_0(t_1,t_2)$ introduced in the next section.

We can then show by the delta method the following result.

Theorem 2. Suppose the conditions of the previous theorem hold under case (II). Moreover, suppose there exists a unique point $\tau\in(0,T)$ such that $\min_{t\in[0,T]}|x^*(t) - z|^2 = |x^*(\tau) - z|^2$. If $x^*(\tau) \ne z$ then the sequence
\[ n^{1/5}N^{2/5}\Big[\min_{t\in[0,T]}|\hat X^*_n(t) - z|^2 - |x^*(\tau) - z|^2\Big] \]
is asymptotically normal with zero mean and variance $4(x^*(\tau) - z)^T C_0(\tau,\tau)(x^*(\tau) - z)$. If $x^*(\tau) = z$ then the sequence $n^{2/5}N^{4/5}\min_{t\in[0,T]}|\hat X^*_n(t) - z|^2$ converges in distribution to the random variable $|Z|^2 - \big(\frac{d}{dt}x^*(\tau)^T Z\big)^2$, where $Z$ is a normal random vector with zero mean and covariance $C_0(\tau,\tau)$.

This allows us to construct hypothesis tests for the likelihood of different regions being connected as well as the so-called probabilistic tractography p-value maps (see [2], [5], [9]).

CHAPTER 3 MEAN AND COVARIANCE OF THE LIMITING GAUSSIAN PROCESSES

3.1 Preliminaries

Before we introduce the step-by-step algorithm for computing the estimated integral curve together with confidence ellipsoids, we need to discuss the mean and covariance functions of the limiting Gaussian processes, which in turn require the introduction of several auxiliary functions. Define the integrals
\[ \Psi(z) = \int K(u)K(z+u)\,du, \qquad \Psi'(z) = \int K(u)K'(z+u)\,du. \]
Then integration by parts of the above gives
\[ \Psi''(z) = -\int K'(u)K'(z+u)\,du. \]
For instance, for a standard Gaussian kernel $\Psi(z) = \frac{e^{-|z|^2/4}}{(2\sqrt{\pi})^3}$. Now out of the kernels $K_\varphi$ and $K_\theta$ build the matrix
\[ \Psi_0 = \begin{pmatrix} \Psi''_\varphi(0)\Psi_\theta(0) & \Psi'_\varphi(0)\Psi'_\theta(0) \\ \Psi'_\varphi(0)\Psi'_\theta(0) & \Psi_\varphi(0)\Psi''_\theta(0) \end{pmatrix}. \]
If both $K_\varphi$ and $K_\theta$ are Gaussian then $\Psi_0 = -\mathrm{diag}\big(\frac{1}{8\pi},\frac{1}{8\pi}\big)$. For $v\in\mathbb{R}^3$ define
\[ \psi(v) = \int_{\mathbb{R}}\Psi(-\tau v)\,d\tau, \]
which equals $\frac{1}{4\pi|v|}$ for a standard Gaussian kernel. Also introduce
\[ \Psi_0(v,x) = \int\big(\Psi(-v\tau) + \kappa\big)\begin{pmatrix} -\Psi''_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) & \Psi'_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi'_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) \\ \Psi'_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi'_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) & -\Psi_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi''_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) \end{pmatrix} d\tau. \]
In the case of a standard Gaussian kernel,
\[ \Psi_0(v,x) = -\frac{1}{D(x)}\begin{pmatrix} 1 + \big(\frac{\partial\theta^*}{\partial x}(x)\big)^2 & -\big(\frac{\partial\varphi^*}{\partial x}(x)\big)\big(\frac{\partial\theta^*}{\partial x}(x)\big) \\ -\big(\frac{\partial\varphi^*}{\partial x}(x)\big)\big(\frac{\partial\theta^*}{\partial x}(x)\big) & 1 + \big(\frac{\partial\varphi^*}{\partial x}(x)\big)^2 \end{pmatrix}, \]
where
\[ D(x) = 64\pi\Big(1 + \Big(\frac{\partial\varphi^*}{\partial x}(x)\Big)^2 + \Big(\frac{\partial\theta^*}{\partial x}(x)\Big)^2\Big)^{3/2}. \]
Also, define
\[ M(s) = \begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\Bigg|_{x^*(s)} \quad \text{and} \quad F(x,\varphi,\theta) = \begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\Bigg|_{(x,\varphi,\theta)}. \]
Next, Green's function $U(t,s)$ is defined as the solution of the PDE
\[ \frac{\partial U(t,s)}{\partial t} = \nabla v^*(x^*(t))U(t,s), \qquad U(s,s) = I \quad \forall s > 0, \]
where
\[ \nabla v(x^*(s)) = -M(s)F^{-1}(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi\partial x}(x^*(s)) \\ \frac{\partial^2 f}{\partial\theta\partial x}(x^*(s)) \end{pmatrix}. \]
We will show that in case (I) the limit process $G(t)$ has the mean function
\[ EG(t) = -\int_0^t U(t,s)M(s)F^{-1}(x^*(s))\begin{pmatrix} \frac12 K_{0,2}K_{\varphi,1,1}\frac{\partial^3 f}{\partial x^2\partial\varphi} + \frac16 K_{\varphi,1,3}\frac{\partial^3 f}{\partial\varphi^3} + \frac13 K_{\varphi,1,1}K_{\theta,0,2}\frac{\partial^3 f}{\partial\varphi\partial\theta^2} \\ \frac12 K_{0,2}K_{\theta,1,1}\frac{\partial^3 f}{\partial x^2\partial\theta} + \frac16 K_{\theta,1,3}\frac{\partial^3 f}{\partial\theta^3} + \frac13 K_{\theta,1,1}K_{\varphi,0,2}\frac{\partial^3 f}{\partial\varphi^2\partial\theta} \end{pmatrix}\Bigg|_{(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))} ds, \]
where $K_{p,q} = \int x^p K^{(q)}(x)\,dx$, and $K_{\varphi,p,q}$, $K_{\theta,p,q}$ denote the analogous moments of the kernels $K_\varphi$ and $K_\theta$. In case (I) the limit process $G$ also has covariance function satisfying
\[ \mathrm{Cov}(G(t_1),G(t_2)) = \int_0^{t_1\wedge t_2} U(t_1,s)M(s)F^{-1}(x^*(s))\Big\{\psi\Big(\frac{dx^*}{dt}(s)\Big)G(x^*(s))G^T(x^*(s)) + \frac{25}{81}\Psi_0\Big(\frac{dx^*}{dt}(s),x^*(s)\Big)\Big\}\, S^2(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\, M^T(s)(F^T)^{-1}(x^*(s))\,U^T(t_2,s)\,ds. \]
For case (II) the centered limit process $G_0(t)$ has covariance function defined as follows:
\[ C_0(t_1,t_2) = \int_0^{t_1\wedge t_2} U(t_1,s)M(s)F^{-1}(x^*(s))\frac{25}{81}\Psi_0\Big(\frac{dx^*}{dt}(s),x^*(s)\Big)\, S^2(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\, M^T(s)(F^T)^{-1}(x^*(s))\,U^T(t_2,s)\,ds. \]
Moreover, the variance function $C_v(t) = C_0(t,t)$ satisfies the ODE
\[ \frac{dC_v(t)}{dt} = M(t)F^{-1}(x^*(t),\varphi^*(x^*(t)),\theta^*(x^*(t)))\frac{25}{81}\Psi_0\Big(\frac{dx^*}{dt}(t),x^*(t)\Big)\, S^2(x^*(t),\varphi^*(x^*(t)),\theta^*(x^*(t)))\, M^T(t)(F^T)^{-1}(x^*(t),\varphi^*(x^*(t)),\theta^*(x^*(t))) + \nabla v^*(x^*(t))C_v(t) + C_v(t)\nabla^T v^*(x^*(t)), \qquad C_v(0) = 0. \tag{3.1} \]
Our method for numerically approximating the confidence regions is outlined in the following chapter.

3.2 Green's Function

We present the basic facts about, and the usefulness of, Green's function here so that the reader need not refer elsewhere. Let $\Theta, \Omega \subset \mathbb{R}^m$ be open and consider a linear differential operator $Q$ on $\Theta$. Then Green's function $U(t,s)$ on $\Theta\times\Omega$ at a point $(t,s)$ is a solution of
\[ QU(t,s) = \delta(t-s), \tag{3.2} \]
where $\delta$ is the usual delta function, i.e.,
\[ \delta(x) = \begin{cases} 1 & x = 0, \\ 0 & x \ne 0, \end{cases} \]
and, by the definition of $\delta$, for an integral over $\mathbb{R}^m$ we have a unit point mass at the point where this function is non-zero:
\[ \int_{\mathbb{R}^m}\delta(t-s)h(s)\,ds = h(t). \]
Now, to see the usefulness of Green's function, multiply both sides of (3.2) by $h(s)$:
\[ QU(t,s)h(s) = \delta(t-s)h(s). \]
Then integrate both sides over $\mathbb{R}^m$. For the linear differential operator $Q$ acting on $t\in\Theta$ we have
\[ Q\int_{\mathbb{R}^m}U(t,s)h(s)\,ds = \int_{\mathbb{R}^m}\delta(t-s)h(s)\,ds = h(t). \]
Thus if we have a differential equation of the form
\[ Qw(t) = h(t), \]
then
\[ Qw(t) = Q\int_{\mathbb{R}^m}U(t,s)h(s)\,ds; \]
in other words, we have the integral-form solution
\[ w(t) = \int_{\mathbb{R}^m}U(t,s)h(s)\,ds. \]
By Theorem 2.2 in Chapter 7 of Coddington and Levinson (1955) [17] the Green's function $U(t,s)$ exists and is unique.
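To make the mechanics concrete, here is a minimal sketch of propagating a Green's function and the variance ODE by forward Euler. The matrices $A(t)$ and $Q(t)$ are made-up stand-ins for $\nabla v^*(x^*(t))$ and the $M F^{-1}\frac{25}{81}\Psi_0 S^2 M^T (F^T)^{-1}$ term respectively; the Duhamel check at the end is the integral-form solution from Section 3.2.

```python
import numpy as np

def A(t):   # stand-in for grad v*(x*(t))
    return 0.3 * np.array([[0.0, 1.0, 0.0],
                           [-1.0, 0.0, 0.0],
                           [0.0, 0.0, -0.5]])

def Q(t):   # stand-in for the driving (diffusion) term of the variance ODE
    return 1e-3 * np.eye(3)

T, steps = 1.0, 2000
dt = T / steps
U = np.eye(3)                  # U(t,0): dU/dt = A(t) U, U(0,0) = I
C = np.zeros((3, 3))           # dC/dt = Q + A C + C A^T, C(0) = 0
Us = [U.copy()]
for k in range(steps):
    t = k * dt
    C = C + dt * (Q(t) + A(t) @ C + C @ A(t).T)
    U = U + dt * (A(t) @ U)
    Us.append(U.copy())

# Duhamel check: C(T) = int_0^T U(T,s) Q(s) U(T,s)^T ds, U(T,s) = U(T,0) U(s,0)^{-1}
UT = Us[-1]
C2 = sum(dt * (UT @ np.linalg.inv(Us[k])) @ Q(k * dt) @ (UT @ np.linalg.inv(Us[k])).T
         for k in range(steps))
print(np.max(np.abs(C - C2)))  # small: both are O(dt)-accurate Euler schemes
```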
CHAPTER 4 ALGORITHM

Our algorithm for calculating curve trajectories with their surrounding confidence ellipsoids is listed below. We consider how to trace the fiber in case (II), since this is the typical scenario in DT-MRI, where $n$ is on the order of millions and $N$ is 10–20. The steps are:

• Initialize $c_1, c_2, x_0, \delta$, $t_j = \delta j$, and $h_n = c_1 N^{-1/5}n^{-1/10}$, $h_\varphi = h_\theta = h = c_2 N^{-1/5}n^{-1/10}$, $N = n^2\varepsilon_n$ with $\varepsilon_n \to 0$.

• Let $\hat f_n(x,\varphi,\theta)$ be defined as in (2.4).

• Find a direction $(\hat\varphi^*_n(x),\hat\theta^*_n(x))$ on the unit sphere such that
\[ \frac{\partial}{\partial\varphi}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) = \frac{\partial}{\partial\theta}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) = 0, \]
\[ \frac{\partial^2}{\partial\varphi^2}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) < 0, \qquad \frac{\partial^2}{\partial\theta^2}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) < 0, \]
\[ \frac{\partial^2\hat f_n}{\partial\varphi^2}\frac{\partial^2\hat f_n}{\partial\theta^2}(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) - \Big(\frac{\partial^2\hat f_n}{\partial\theta\partial\varphi}(x,\hat\varphi^*_n(x),\hat\theta^*_n(x))\Big)^2 > 0. \]
This direction indicates where the maximum gradient flow is.

• Solve the ODE that governs a curve along the direction $(\hat\varphi^*_n(x),\hat\theta^*_n(x))$:
\[ \frac{d\hat x^*_n(t)}{dt} = \begin{pmatrix} \sin\hat\theta^*_n(\hat x^*_n(t))\cos\hat\varphi^*_n(\hat x^*_n(t)) \\ \sin\hat\theta^*_n(\hat x^*_n(t))\sin\hat\varphi^*_n(\hat x^*_n(t)) \\ \cos\hat\theta^*_n(\hat x^*_n(t)) \end{pmatrix}, \qquad \hat x^*_n(0) = x_0, \]
numerically using Euler's method as $\hat x^*_n(t_m) \approx \hat x^*_n(t_{m-1}) + \delta\,\hat v^*_n(\hat x^*_n(t_{m-1}))$.

• Now consider the noise scaling function $S$. To estimate it we propose two approaches.
Approach 1: If several b0-images are available, then one has $m$ sets of
\[ Y^{(l)}_{ijk} = f(X_i,\varphi_j,\theta_k) + S(X_i,\varphi_j,\theta_k)\varepsilon^{(l)}_{ijk}, \quad l = 1,\dots,m,\ i = 1,\dots,n,\ j,k = 1,\dots,N, \]
where $\{\varepsilon^{(l)}_{ijk} : i = 1,\dots,n,\ j,k = 1,\dots,N\}$ are independent for $l = 1,\dots,m$. Then for each $(X_i,\varphi_j,\theta_k)$ we estimate $S^2$ as follows:
\[ \hat S^2_{ijk} = \frac{1}{m-1}\sum_{l=1}^m\Big(Y^{(l)}_{ijk} - \frac{1}{m}\sum_{q=1}^m Y^{(q)}_{ijk}\Big)^2. \]
Approach 2: If only one b0-image is available, then we assume that $S$ is locally smooth, so that local averaging should serve as a reasonable estimator. This can be done as in step 3 in section 5 of [9], yielding $\hat S^2_{ijk}$.

• Obtain $\hat S^2_n(x,\varphi,\theta)$ by plugging $\hat S^2_{ijk}$ in instead of $Y_{ijk}$ in step 2.

• Calculate and estimate the limiting covariance $C_v(t) := \mathrm{Cov}(G(t),G(t))$, which is done by $\hat C_n$, the solution of the ODE similar to (3.1) where all unknown functions are estimated. The solution is then approximated by Euler's method as follows:
\[ \hat C_n(t_m) = \hat C_n(t_{m-1}) + \delta\,\hat M_n(t_{m-1})\hat F_n^{-1}\big(\hat x^*_n(t_{m-1}),\hat\varphi^*_n(\hat x^*_n(t_{m-1})),\hat\theta^*_n(\hat x^*_n(t_{m-1}))\big)\frac{25}{81}\Psi_0\Big(\frac{d\hat x^*_n}{dt}(t_{m-1}),\hat x^*_n(t_{m-1})\Big)\hat S^2\big(\hat x^*_n(t_{m-1}),\hat\varphi^*_n(\hat x^*_n(t_{m-1})),\hat\theta^*_n(\hat x^*_n(t_{m-1}))\big)\hat M_n^T(t_{m-1})(\hat F_n^T)^{-1}\big(\hat x^*_n(t_{m-1}),\hat\varphi^*_n(\hat x^*_n(t_{m-1})),\hat\theta^*_n(\hat x^*_n(t_{m-1}))\big) \]
\[ \quad + \delta\,\nabla\hat v^*_n(\hat x^*_n(t_{m-1}))\hat C_n(t_{m-1}) + \delta\,\hat C_n(t_{m-1})\nabla^T\hat v^*_n(\hat x^*_n(t_{m-1})), \qquad \hat C_n(0) = 0, \]
where
\[ \nabla\hat v^*_n(x) = -\hat M_n(s)\hat F_n^{-1}(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\begin{pmatrix} \frac{\partial^2\hat f_n}{\partial\varphi\partial x}(x^*(s)) \\ \frac{\partial^2\hat f_n}{\partial\theta\partial x}(x^*(s)) \end{pmatrix}, \]
\[ \hat M_n(t) = \begin{pmatrix} -\sin\hat\theta^*_n\sin\hat\varphi^*_n & \cos\hat\theta^*_n\cos\hat\varphi^*_n \\ \sin\hat\theta^*_n\cos\hat\varphi^*_n & \cos\hat\theta^*_n\sin\hat\varphi^*_n \\ 0 & -\sin\hat\theta^*_n \end{pmatrix}\Bigg|_{\hat x^*_n(t)}, \qquad \hat F_n(x,\varphi,\theta) = \begin{pmatrix} \frac{\partial^2\hat f_n}{\partial\varphi^2} & \frac{\partial^2\hat f_n}{\partial\varphi\partial\theta} \\ \frac{\partial^2\hat f_n}{\partial\varphi\partial\theta} & \frac{\partial^2\hat f_n}{\partial\theta^2} \end{pmatrix}\Bigg|_{(x,\varphi,\theta)}. \]

• The asymptotic $100(1-\alpha)\%$ confidence ellipsoid for $x(t_m)$, $m\ge 1$, is approximated by
\[ \Big\{\big|\hat C_n(t_m)^{-1/2}(\hat x_n(t_m) - x(t_m))\big| \le R_\alpha\, n^{-1/5}N^{-2/5}\Big\}, \qquad P(|Z|\le R_\alpha) = 1 - \alpha, \]
where $Z$ is a standard normal vector in $\mathbb{R}^3$.

• Repeat the steps above until $t_j$ reaches $T$.

Use of Euler's method for the numerical approximation of the solutions is justified by its simplicity and by the work of [6], where it was shown that for DTI using higher-order Runge-Kutta approximations gave no benefit. In fact, the statistical accuracy outweighs the numerical accuracy on scales typical for brain imaging applications, so it is of no concern.
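A skeleton of the tracing loop above: at the current location, maximize the smoothed signal over $(\varphi,\theta)$ and take an Euler step along the maximal direction. We use scipy's Nelder-Mead in place of Matlab's fminsearch, and a synthetic $f$ with a known maximal direction (the spiral example of Chapter 5) in place of $\hat f_n$; the seed point and perturbation scale are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def fhat(x, phi, theta):
    # synthetic stand-in for the smoothed signal estimate:
    # the spiral example f of Chapter 5, whose maximizer is known
    a, b, c = 0.5 - x[1], x[0] - 0.5, x[2]
    return (a*np.cos(phi) + b*np.sin(phi))*np.sin(theta) + c*np.cos(theta)

def trace(x0, delta=0.01, steps=40, rng=np.random.default_rng(4)):
    x, ang = np.array(x0, float), np.array([1.0, 1.0])   # initial (phi, theta)
    path = [x.copy()]
    for _ in range(steps):
        # small random perturbation of the previous optimum, as in the text,
        # keeps the simplex search from sticking at its starting point
        res = minimize(lambda a: -fhat(x, a[0], a[1]),
                       ang + 0.01*rng.standard_normal(2), method='Nelder-Mead')
        phi, theta = res.x
        v = np.array([np.sin(theta)*np.cos(phi),
                      np.sin(theta)*np.sin(phi),
                      np.cos(theta)])
        x = x + delta * v                                # Euler step
        path.append(x.copy()); ang = res.x
    return np.array(path)

print(trace([0.7, 0.5, 0.1])[::10])                     # every 10th traced point
```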
CHAPTER 5 SIMULATIONS

5.1 Artificial example

To simulate various curve scenarios, we need to design a function $f$ whose maximum will be in the direction tangential to the curves of interest. To this end, we consider a function of the form
\[ f(x,\varphi,\theta) = a(x)\cos\varphi\sin\theta + b(x)\sin\varphi\sin\theta + c(x)\cos\theta \]
with some generic real-valued functions $a, b, c$ to be chosen later. The maximal direction is
\[ \varphi^*(x) = \tan^{-1}\Big(\frac{b}{a}\Big), \qquad \theta^*(x) = \tan^{-1}\Big(\frac{\sqrt{a^2+b^2}}{c}\Big). \]
Then the integral curve is defined by the following ODE:
\[ \frac{dx^*(t)}{dt} = \begin{pmatrix} \sin\theta(x(t))\cos\varphi(x(t)) \\ \sin\theta(x(t))\sin\varphi(x(t)) \\ \cos\theta(x(t)) \end{pmatrix} = \frac{1}{\sqrt{a^2+b^2+c^2}}\begin{pmatrix} a \\ b \\ c \end{pmatrix}. \]
We first simulate a spiral in 3D which mimics a C-shaped fiber. So we take $a = -(x_2 - 0.5)$, $b = x_1 - 0.5$, and $c = x_3$, and have
\[ f(x,\varphi,\theta) = (0.5 - x_2)\cos\varphi\sin\theta + (x_1 - 0.5)\sin\varphi\sin\theta + x_3\cos\theta. \]
Then solving the ODEs yields
\[ x(t) = 0.5 + r_0\cos(\ln z(t) + c), \qquad y(t) = 0.5 + r_0\sin(\ln z(t) + c), \]
\[ t + \tilde c = \sqrt{z^2(t) + r_0^2} - r_0\,\mathrm{atanh}\Big(\sqrt{z^2(t) + r_0^2}\,/\,r_0\Big), \]
where
\[ c = \mathrm{atan}\Big(\frac{y(0) - 0.5}{x(0) - 0.5}\Big) - \ln z(0), \qquad r_0^2 = (x(0) - 0.5)^2 + (y(0) - 0.5)^2, \]
and the constant $\tilde c$ is determined by $z(0)$.

Figure 5.1: The true curve is in blue, while the estimated curve is in red.

Figure 5.2: This is an enlargement of the previous figure to show the 95% confidence ellipsoid surrounding a point on the estimated curve. The true curve in blue touches it.

We take $S$ to be a constant function, which we varied. We take $n = 4000$, $N = 100$, $m = 5$ (from step 5), $\delta = 0.02$. We trace the curve for 30 steps of size $\delta$. The tracing is very fast, but it relies heavily on the numerical search for the maximal direction in step 3, which we perform using Matlab's fminsearch. It requires an initial direction, which we take as $(\varphi,\theta) = (1,1)$. During the tracing, the initial direction is taken to be the maximal direction obtained at the previous iteration with a small random perturbation added. The perturbation prevents the optimization algorithm from getting stuck at a local maximum; without it, the optimization step 3 introduces a systematic numerical bias contrary to the expected zero bias. This is also the bottleneck of the implementation. It is possible to improve this step by utilizing a different optimization algorithm. The results are shown in Figures 5.1 and 5.2. The noise scaling is $S = 0.25$ and the noise $\varepsilon_{ijk}$ is taken to be standard normal, which corresponds to a signal-to-noise ratio of 4–5. If we increase $S$ the curve does not trace; it veers off and tends to go out of bounds. The confidence ellipsoids are very tight around the estimated curve: the norm of the limiting covariance function is on the order of $10^{-7}$. This is excellent in comparison to typical confidence ellipsoid sizes for the methods in [2] and [9].

Figure 5.3: Diffusion ellipsoids illustrate the corresponding diffusion tensors along the fiber across the genu of corpus callosum.

Figure 5.4: Visualization of diffusion via ellipsoids using DTI/HARDI tensor model.
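As a quick sanity check of the closed-form maximal direction used in this artificial example, the snippet below converts $(\varphi^*,\theta^*)$ back to a Cartesian vector and recovers $(a,b,c)$; the three coefficient values are arbitrary.

```python
import numpy as np

# For f = a cos(phi) sin(theta) + b sin(phi) sin(theta) + c cos(theta),
# the maximizer on the sphere is the unit vector (a, b, c)/|(a, b, c)|.
a, b, c = -0.2, 0.35, 0.8
phi = np.arctan2(b, a)                     # tan(phi*) = b / a
theta = np.arctan2(np.hypot(a, b), c)      # tan(theta*) = sqrt(a^2+b^2) / c
v = np.array([np.sin(theta)*np.cos(phi),
              np.sin(theta)*np.sin(phi),
              np.cos(theta)])
print(np.round(v * np.hypot(np.hypot(a, b), c), 6))   # recovers (a, b, c)
```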
5.2 Real HARDI dataset

A DWI dataset was collected from a twenty-something-year-old healthy male brain on a GE 3T Signa HDx MR scanner (GE Healthcare, Waukesha, WI) with an 8-channel head coil. The subject signed the consent form approved by the Michigan State University Institutional Review Board. DWI images were acquired with a spin-echo echo-planar imaging (EPI) sequence for 20 minutes with the following parameters: 32 contiguous 2.4-mm axial slices in an interleaved order, FOV = 22 cm × 22 cm, matrix size = 128 × 128, number of excitations (NEX) = 1, TE = 72.3 ms, TR = 7.5 s, 150 diffusion-weighted volumes (one per gradient direction) with b = 1000 s/mm², 9 volumes with b = 0, and a parallel imaging acceleration factor of 2.

The desired three-dimensional image is not obtained all at once, but rather by imaging a 'slice' repeatedly along the z-direction and then reconstructing using a Fourier transform to form the aggregate image (see [1]). Typically, the number of pixels $n$, representing the number in a sample of spatial locations, is on the order of $n = 128\times128\times32 = 524{,}288$; that is, 32 slices in the z-direction with $128\times128$ in each plane. The number can be increased to $n = 256\times256\times32$. The directions did not have the angular components on a regular grid, so we first used kernel smoothing of the Y-values based on 1350 pairs of $(\varphi,\theta)$ to obtain Y-values for 1600 pairs of regularly spaced angular components, which corresponds to $N = 40$. The sample of spatial locations $X_i$ had size $n = 128\times128\times32 = 524{,}288$. For all kernels we used Gaussian kernels of various dimensions. We used Matlab with C subroutines to perform the computations. To find the maximal directions $(\hat\varphi^*,\hat\theta^*)$ we used fminsearch. However, we observed that it is very sensitive to the choice of the initial vector near which it looks for the local optimizer. We ended up using the direction from the previous iteration perturbed by a small random vector. Without the perturbation, fminsearch would simply take the initial direction as the optimizer and trace out short straight lines. We speculate that it is possible to use a more sophisticated optimization code to get a faster and more robust numerical solution for step 3 of our algorithm.

Our rationale for choosing seed regions for evaluation is based on the following. The corpus callosum (CC) contains thick axonal fibers connecting the two cerebral hemispheres and enabling the communication between them. The general anatomical locations of these axonal fibers are well established, so these fibers can be used to evaluate new techniques in fiber tractography. The anterior part of the CC, called the genu of the CC, contains axonal fibers connecting the right and left frontal lobes. The second region is the right fornix, which is a much shorter fiber than the CC; it is quite challenging to trace, but important for early diagnostics of Alzheimer's disease.

The results are presented in Figures 5.5 and 5.6. The estimated curves are shown in magenta. The corresponding confidence ellipsoids are hardly visible, since the norm of the corresponding limiting covariance matrix was of the order $10^{-8}$. Both fibers follow the anatomical ground truth. The anterior fiber is perfectly centered where expected, while the fornix fiber is shifted a bit off the center of the expected location.

Figure 5.5: Axonal fibers across the genu of Corpus Callosum. We traced each fiber branch for 40 steps of size δ = 0.01. The estimated curve is shown in magenta accompanied by blue 95% confidence ellipsoids.

Figure 5.6: Right Fornix. We traced the fiber for 30 steps of size δ = 0.01. The estimated curve is shown in magenta accompanied by blue 95% confidence ellipsoids.
CHAPTER 6 PROOFS

6.1 Some Results for Simpson's Scheme and Kernel Smoothing

Proposition 3. For any kernel $K$ satisfying conditions (K) and any function $g$ twice continuously differentiable on compact sets, we have as $h \to 0$:
\[ \int h^{-d}K\Big(\frac{u - u_0}{h}\Big)g(u)\,du = g(u_0) + 0.5h^2g''(u_0)K_2(1 + o(1)), \]
where $K_2 = \int|u|^2K(u)\,du$.

Proof. The above follows easily by a Taylor expansion of the integrand. For a detailed description see [15]. □

We state the following standard result by Simpson for approximating integrals without proof.

Proposition 4. For any four times continuously differentiable function $g$ defined on the interval $[a,b]$ we have as $N\to\infty$
\[ \int_a^b g(u)\,du = S_N(g(u_0),\dots,g(u_{2N}),a,b) - \frac{(b-a)^5}{180}g^{IV}(u^*)\frac{1}{(2N)^4} \]
with some $u^*$ in $[a,b]$, where $g^{IV}$ is the fourth-order derivative of $g$.

Lemma 2. For any kernel $K\in C^{(j)}([0,1]^d)$ satisfying conditions (K) and any function $g\in C^{(j)}$ on a compact set, we have as $h\to 0$
\[ \int h^{-(d+j)}K^{(j)}\Big(\frac{u_0 - u}{h}\Big)g(u)\,du = g^{(j)}(u_0) + 0.5h^2g^{(j+2)}(u_0)K_2(1 + o(1)), \]
where $j$ is a non-negative integer and $K_2 = \int|u|^2K(u)\,du$ as before.

Proof. Using integration by parts and Proposition 3,
\[ \int h^{-(d+1)}K'\Big(\frac{u_0 - u}{h}\Big)g(u)\,du = \int h^{-d}K\Big(\frac{u_0 - u}{h}\Big)g'(u)\,du = g'(u_0) + 0.5h^2g'''(u_0)K_2(1 + o(1)). \]
By induction the proof is complete. □

Proposition 5. Consider the ordinary Simpson's scheme for $\int_a^b g(u)\,du$ defined by
\[ \frac{1}{6N}\big(g(u_0) + g(u_{2N})\big) + \frac{2}{3N}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{3N}\sum_{m=1}^{N-1}g(u_{2m}). \]
Then as $N\to\infty$
\[ \frac{1}{36N^2}\big(g(u_0) + g(u_{2N})\big) + \frac{4}{9N^2}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{9N^2}\sum_{m=1}^{N-1}g(u_{2m}) = \frac{5}{9N}\int_a^b g(u)\,du\,(1 + O(N^{-1})). \]

Proof. A straightforward calculation yields the result:
\[ \frac{1}{36N^2}(g(u_0) + g(u_{2N})) + \frac{4}{9N^2}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{9N^2}\sum_{m=1}^{N-1}g(u_{2m}) \]
\[ = \frac{2}{3N}\Big[\frac{1}{6N}(g(u_0) + g(u_{2N})) + \frac{2}{3N}\sum_{m=1}^N g(u_{2m-1}) + \frac{1}{3N}\sum_{m=1}^{N-1}g(u_{2m})\Big] - \frac{1}{12N^2}(g(u_0) + g(u_{2N})) - \frac{1}{9N^2}\sum_{m=1}^{N-1}g(u_{2m}) \]
\[ = \frac{2}{3N}\int_a^b g(u)\,du\,(1 + O(N^{-1})) - \frac{1}{9N}\int_a^b g(u)\,du\,(1 + O(N^{-1})) = \frac{5}{9N}\int_a^b g(u)\,du\,(1 + O(N^{-1})). \quad \square \]
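The smoothing-bias expansion of Proposition 3 is easy to verify numerically. Below is a one-dimensional check with a Gaussian kernel (so $K_2 = 1$); the test function and the bandwidth grid are our own choices.

```python
import numpy as np

# Check of Proposition 3 in d = 1: bias = 0.5 h^2 g''(u0) K2 + o(h^2).
g, u0 = np.cos, 0.3                # g''(u0) = -cos(u0); Gaussian K2 = 1
for h in (0.2, 0.1, 0.05):
    u = np.linspace(u0 - 8 * h, u0 + 8 * h, 20001)
    du = u[1] - u[0]
    Kh = np.exp(-0.5 * ((u - u0) / h)**2) / (np.sqrt(2 * np.pi) * h)
    bias = np.sum(Kh * g(u)) * du - g(u0)           # numerical smoothing bias
    print(h, bias, 0.5 * h**2 * (-np.cos(u0)))      # the two columns agree as h -> 0
```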
6.1.1 Existence of Unique Direction and Establishing Approximation of $\hat X_n - x$

We now prove the existence of the unique direction corresponding to the direction of dominant diffusion.

Lemma 3. The direction satisfying (6.2) exists and is unique for each x in a neighbourhood of a.

Proof. Define the function H : G × S² → R² as
$$H(x,\varphi,\theta) = \Big(\frac{\partial}{\partial\varphi} f(x,\varphi,\theta),\ \frac{\partial}{\partial\theta} f(x,\varphi,\theta)\Big).$$
It is continuously differentiable in G by assumption. For each x ∈ G, the direction (ϕ*(x), θ*(x)) yields the maximum of f. Thus H(x, ϕ*(x), θ*(x)) = 0 and
$$\det\Big\{\frac{\partial H(x,\varphi^*(x),\theta^*(x))}{\partial(\varphi,\theta)}\Big\} = \det\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\theta\partial\varphi} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\Bigg|_{(x,\varphi^*(x),\theta^*(x))} \neq 0.$$
By the implicit function theorem, there is an open set W ⊂ R³ containing x and a unique continuously differentiable function g : W → R² such that
$$g(x) = \big(\varphi^*(x), \theta^*(x)\big) \quad\text{and}\quad H\big(x', g(x')\big) = 0 \ \text{ for all } x' \in W. \qquad\Box$$

Remark on Lemmas 4 and 5. Before stating and proving Lemmas 4 and 5, we would like to explain their place in our estimation procedure. To approximate the process $\hat x^*_n(t) - x^*(t)$ we will consider a process $z_n(t)$, t ∈ [0,T], defined as the solution of the ODE
$$\frac{dz_n(t)}{dt} = \nabla v^*(x^*(t))\, z_n(t) + (\hat v^*_n - v^*)(x^*(t)), \qquad z_n(0) = 0.$$
Similarly to the arguments in [5], one can show that $z_n(t)$ approximates $\hat x^*_n(t) - x^*(t)$. In order to solve the ODE above, note that there exists a function U(t,s), (t,s) ∈ [0,T]², which satisfies the following conditions:

1. $\frac{\partial}{\partial t}U(t,s) = \nabla v^*(x^*(t))\,U(t,s)$, 0 ≤ s ≤ t ≤ T;
2. $U(t,t) = I_{3\times 3}$;
3. $U(t,s) \equiv 0$, 0 ≤ t < s ≤ T.

Then by Theorem 2.2 in Chapter 7 of Coddington and Levinson (1955) (see [17]), the unique solution of the ODE above is given via the Green's function U(t,s) as
$$\hat z_n(t) = \int_0^t U(t,s)\,(\hat v^*_n - v^*)(x^*(s))\,ds.$$
Even though $\hat z_n(t)$ approximates $\hat x^*_n(t) - x^*(t)$ up to terms of order $O(\|\hat x^*_n - x^*\|^2)$, we still need to relate the estimation procedure to terms based on the deviation $\hat f_n - f$. Thus we introduce:

Lemma 4. Let Δϕ = ϕ₁ − ϕ₂ and Δθ = θ₁ − θ₂. Then
$$\begin{pmatrix} \sin\theta_1\cos\varphi_1 - \sin\theta_2\cos\varphi_2 \\ \sin\theta_1\sin\varphi_1 - \sin\theta_2\sin\varphi_2 \\ \cos\theta_1 - \cos\theta_2 \end{pmatrix} = \begin{pmatrix} \Delta\theta\,\cos\theta_2\cos\varphi_2 - \Delta\varphi\,\sin\theta_2\sin\varphi_2 \\ \Delta\theta\,\cos\theta_2\sin\varphi_2 + \Delta\varphi\,\sin\theta_2\cos\varphi_2 \\ -\Delta\theta\,\sin\theta_2 \end{pmatrix} + O\Big(\max\big\{(|\Delta\theta| + |\Delta\varphi|)^2,\ (|\Delta\theta| + |\Delta\varphi|)^3\big\}\Big).$$

Proof. By the product-to-sum and sum-to-product identities,
$$\sin\theta_1\cos\varphi_1 - \sin\theta_2\cos\varphi_2 = \tfrac12\sin(\theta_1+\varphi_1) + \tfrac12\sin(\theta_1-\varphi_1) - \tfrac12\sin(\theta_2+\varphi_2) - \tfrac12\sin(\theta_2-\varphi_2)$$
$$= \sin\Big(\frac{\theta_1-\theta_2+\varphi_1-\varphi_2}{2}\Big)\cos\Big(\frac{\theta_1+\theta_2+\varphi_1+\varphi_2}{2}\Big) + \sin\Big(\frac{\theta_1-\theta_2-(\varphi_1-\varphi_2)}{2}\Big)\cos\Big(\frac{\theta_1+\theta_2-\varphi_1-\varphi_2}{2}\Big).$$
By a Taylor expansion at Δθ = Δϕ = 0,
$$= \Big(\frac{\Delta\theta}{2}+\frac{\Delta\varphi}{2}\Big)\bigg[\cos(\theta_2+\varphi_2) - \sin(\theta_2+\varphi_2)\Big(\frac{\Delta\theta}{2}+\frac{\Delta\varphi}{2}\Big) + O\Big(\Big(\frac{\Delta\theta}{2}+\frac{\Delta\varphi}{2}\Big)^2\Big)\bigg]$$
$$\quad + \Big(\frac{\Delta\theta}{2}-\frac{\Delta\varphi}{2}\Big)\bigg[\cos(\theta_2-\varphi_2) - \sin(\theta_2-\varphi_2)\Big(\frac{\Delta\theta}{2}-\frac{\Delta\varphi}{2}\Big) + O\Big(\Big(\frac{\Delta\theta}{2}-\frac{\Delta\varphi}{2}\Big)^2\Big)\bigg]$$
$$= \Delta\theta\,\cos\theta_2\cos\varphi_2 - \Delta\varphi\,\sin\theta_2\sin\varphi_2 + O\Big(\max\big\{(|\Delta\theta|+|\Delta\varphi|)^2,\ (|\Delta\theta|+|\Delta\varphi|)^3\big\}\Big).$$
The second component is treated in exactly the same way and yields
$$\sin\theta_1\sin\varphi_1 - \sin\theta_2\sin\varphi_2 = \Delta\theta\,\cos\theta_2\sin\varphi_2 + \Delta\varphi\,\sin\theta_2\cos\varphi_2 + O\Big(\max\big\{(|\Delta\theta|+|\Delta\varphi|)^2,\ (|\Delta\theta|+|\Delta\varphi|)^3\big\}\Big).$$
Finally,
$$\cos\theta_1 - \cos\theta_2 = -2\sin\Big(\frac{\theta_1+\theta_2}{2}\Big)\sin\Big(\frac{\theta_1-\theta_2}{2}\Big) = -\Delta\theta\,\sin\theta_2 + O(\Delta_\theta^2).$$
Combining the three components gives the stated vector identity. $\Box$
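Lemma 4 is simply the first-order expansion of the spherical-coordinate unit vector, and the 3 × 2 matrix of its linear term is exactly the matrix M(s) used below. A minimal numerical check (with arbitrary hypothetical angles and increments):

```python
import numpy as np

# Check of Lemma 4: the difference of unit vectors in spherical coordinates
# is linear in (dphi, dtheta) up to second-order terms.
def unit(phi, theta):
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

phi2, theta2 = 0.8, 1.1
dphi, dtheta = 1e-4, -2e-4
lhs = unit(phi2 + dphi, theta2 + dtheta) - unit(phi2, theta2)
M = np.array([[-np.sin(theta2) * np.sin(phi2), np.cos(theta2) * np.cos(phi2)],
              [ np.sin(theta2) * np.cos(phi2), np.cos(theta2) * np.sin(phi2)],
              [ 0.0,                          -np.sin(theta2)]])
rhs = M @ np.array([dphi, dtheta])
print(np.max(np.abs(lhs - rhs)))  # of order (|dphi| + |dtheta|)**2
```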
Lemma 5. In a neighbourhood of (x, ϕ*(x), θ*(x)) we have
$$\begin{pmatrix} \hat\varphi^*_n(x) - \varphi^*(x) \\ \hat\theta^*_n(x) - \theta^*(x) \end{pmatrix} = -\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial}{\partial\varphi}\hat f_n - \frac{\partial}{\partial\varphi} f \\ \frac{\partial}{\partial\theta}\hat f_n - \frac{\partial}{\partial\theta} f \end{pmatrix}$$
$$\qquad + O\bigg(\Delta_\varphi^2 + \Delta_\theta^2 + \Delta_\varphi\Big|\frac{\partial^2}{\partial\varphi^2}(\hat f - f)\Big| + \Delta_\theta\Big|\frac{\partial^2}{\partial\theta^2}(\hat f - f)\Big| + (\Delta_\varphi + \Delta_\theta)\Big|\frac{\partial^2}{\partial\varphi\partial\theta}(\hat f - f)\Big|\bigg).$$

Proof. By Lemma 3 we have for each x ∈ G a direction (ϕ*(x), θ*(x)) for which
$$\frac{\partial}{\partial\varphi} f(x,\varphi^*(x),\theta^*(x)) = \frac{\partial}{\partial\theta} f(x,\varphi^*(x),\theta^*(x)) = 0,$$
and likewise a direction $(\hat\varphi^*_n(x), \hat\theta^*_n(x))$ for $\hat f_n$ for which
$$\frac{\partial}{\partial\varphi}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) = \frac{\partial}{\partial\theta}\hat f_n(x,\hat\varphi^*_n(x),\hat\theta^*_n(x)) = 0.$$
Write $\Delta_\varphi = \hat\varphi^*_n(x) - \varphi^*(x)$ and $\Delta_\theta = \hat\theta^*_n(x) - \theta^*(x)$. Since $\hat f'_\varphi(x,\varphi^*+\Delta_\varphi,\theta^*+\Delta_\theta) = 0$, expanding $\hat f'_\varphi$ at the point (x, ϕ*(x)+Δϕ, θ*(x)+Δθ) around (ϕ*(x), θ*(x)) gives
$$0 = \hat f'_\varphi(x,\varphi^*,\theta^*) + \hat f''_{\varphi\varphi}(x,\varphi^*,\theta^*)\Delta_\varphi + \hat f''_{\varphi\theta}(x,\varphi^*,\theta^*)\Delta_\theta + O(\Delta_\varphi^2 + \Delta_\theta^2).$$
Note that, by the continuity of $\hat f'_\varphi$ in (ϕ, θ), $\hat f'_\varphi(x,\varphi^*(x),\theta^*(x)) \to 0$ as Δϕ, Δθ → 0. Since $f'_\varphi(x,\varphi^*,\theta^*) = 0$, the zero-order term equals $\frac{\partial}{\partial\varphi}(\hat f_n - f)(x,\varphi^*,\theta^*)$, and replacing the second derivatives of $\hat f$ by those of f introduces errors of the sizes $\Delta_\varphi|\partial^2_{\varphi\varphi}(\hat f - f)|$ and $\Delta_\theta|\partial^2_{\varphi\theta}(\hat f - f)|$. Hence
$$f''_{\varphi\varphi}\Delta_\varphi + f''_{\varphi\theta}\Delta_\theta = -\frac{\partial}{\partial\varphi}(\hat f_n - f) + O\Big(\Delta_\varphi^2 + \Delta_\theta^2 + \Delta_\varphi\Big|\frac{\partial^2}{\partial\varphi^2}(\hat f - f)\Big| + \Delta_\theta\Big|\frac{\partial^2}{\partial\varphi\partial\theta}(\hat f - f)\Big|\Big).$$
Likewise, expanding $\hat f'_\theta$ at the same point gives
$$f''_{\theta\varphi}\Delta_\varphi + f''_{\theta\theta}\Delta_\theta = -\frac{\partial}{\partial\theta}(\hat f_n - f) + O\Big(\Delta_\varphi^2 + \Delta_\theta^2 + \Delta_\varphi\Big|\frac{\partial^2}{\partial\varphi\partial\theta}(\hat f - f)\Big| + \Delta_\theta\Big|\frac{\partial^2}{\partial\theta^2}(\hat f - f)\Big|\Big).$$
Thus we have the result:
$$\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\begin{pmatrix} \hat\varphi^*_n(x) - \varphi^*(x) \\ \hat\theta^*_n(x) - \theta^*(x) \end{pmatrix} = -\begin{pmatrix} \frac{\partial}{\partial\varphi}(\hat f_n - f) \\ \frac{\partial}{\partial\theta}(\hat f_n - f) \end{pmatrix} + O(\cdots),$$
and inverting the Hessian matrix gives the lemma. $\Box$

Therefore, we can approximate $\hat z_n$ by
$$\hat Z_n(t) = -\int_0^t U(t,s)\begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial}{\partial\varphi}(\hat f_n - f) \\ \frac{\partial}{\partial\theta}(\hat f_n - f) \end{pmatrix}\Bigg|_{x^*(s)} ds,$$
which approximates $\hat x^*_n(t) - x^*(t)$ in terms of $\hat f_n - f$.
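In effect, Lemma 5 says that the estimated maximal direction deviates from the true one by a Newton-type step driven by the gradient of the estimation error. A minimal numeric sketch, with a hypothetical Hessian F and gradient error:

```python
import numpy as np

# One Newton-type correction from Lemma 5: the deviation of the estimated
# maximal direction equals -F^{-1} times the gradient of (fhat - f), up to
# higher-order terms. F and grad_err below are hypothetical placeholders;
# F is negative definite, as at a maximum of f.
F = np.array([[-2.0, 0.3],     # [f_phiphi,   f_phitheta]
              [ 0.3, -1.5]])   # [f_phitheta, f_thetatheta]
grad_err = np.array([0.01, -0.02])   # (d/dphi, d/dtheta) of (fhat - f)

deviation = -np.linalg.solve(F, grad_err)  # (phihat* - phi*, thetahat* - theta*)
print(deviation)
```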
6.1.2 Establishing Asymptotic Unbiasedness of Estimators of Derivatives of f

Recall that our model is
$$Y_{ijk} = f(X_i,\varphi_j,\theta_k) + S(X_i,\varphi_j,\theta_k)\,\varepsilon_{ijk}, \qquad i = 1,\dots,n;\ j = 1,\dots,N_\varphi;\ k = 1,\dots,N_\theta,$$
where $X_i \sim U([0,1]^3)$, $\varphi_j = -\pi + \frac{2\pi j}{2N_\varphi}$, and $\theta_k = -\frac{\pi}{2} + \frac{\pi k}{2N_\theta}$.

First, define
$$w_{j,k}(x) = \frac{1}{nh_n^3}\sum_{i=1}^{n} K\Big(\frac{x - X_i}{h_n}\Big)\, Y_{ijk}.$$
Immediately note that
$$E w_{j,k}(x) = \int K(u)\, f(x - h_nu,\varphi_j,\theta_k)\,du = f(x,\varphi_j,\theta_k) + 0.5\,h_n^2\,\frac{\partial^2}{\partial x^2} f(x,\varphi_j,\theta_k)\,K_2\,(1+o(1)).$$

Secondly, define
$$u_k(x,\varphi) = S_{N_\varphi}\Big(\frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-\varphi_0}{h_\varphi}\Big) w_{0,k}(x), \dots, \frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-\varphi_{2N_\varphi}}{h_\varphi}\Big) w_{2N_\varphi,k}(x), -\pi, \pi\Big).$$
Again, note that
$$E u_k(x,\varphi) = S_{N_\varphi}\Big(\frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-\varphi_0}{h_\varphi}\Big) E w_{0,k}(x), \dots, \frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-\varphi_{2N_\varphi}}{h_\varphi}\Big) E w_{2N_\varphi,k}(x), -\pi, \pi\Big)$$
$$= \int_{\varphi-ch_\varphi}^{\varphi+ch_\varphi}\frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)\Big[f(x,u,\theta_k) + 0.5\,h_n^2\,\frac{\partial^2}{\partial x^2} f(x,u,\theta_k)\,K_2(1+o(1))\Big]du + \frac{2ch_\varphi}{180(2N_\varphi)^4}\,\frac{\partial^4}{\partial u^4}\Big[\frac{1}{h_\varphi} K_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big) f(x,u,\theta_k)\Big]\Big|_{u=u^*_k}$$
$$= f(x,\varphi,\theta_k) + 0.5\,h_n^2\,\frac{\partial^2}{\partial x^2} f(x,\varphi,\theta_k)\,K_2(1+o(1)) + 0.5\,h_\varphi^2\,\frac{\partial^2}{\partial\varphi^2} f(x,\varphi,\theta_k)\,K_{\varphi,2}(1+o(1)) + \frac{c}{1440 N_\varphi^4 h_\varphi^4}\,K^{IV}_\varphi\Big(\frac{\varphi-u^*_k}{h_\varphi}\Big) f(x,u^*_k,\theta_k)(1+o(1)),$$
where $u^*_k \in [\varphi - ch_\varphi, \varphi + ch_\varphi]$.

Thirdly, let
$$\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big) u_0(x,\varphi), \dots, \frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big) u_{2N_\theta}(x,\varphi), -\frac{\pi}{2}, \frac{\pi}{2}\Big).$$
Then
$$E\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big) Eu_0(x,\varphi), \dots, \frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big) Eu_{2N_\theta}(x,\varphi), -\frac{\pi}{2}, \frac{\pi}{2}\Big)$$
$$= \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-v}{h_\theta}\Big)\Big[f(x,\varphi,v) + 0.5\,h_n^2 f''_{xx}\,K_2(1+o(1)) + 0.5\,h_\varphi^2 f''_{\varphi\varphi}\,K_{\varphi,2}(1+o(1))\Big]dv + \frac{c}{1440 N_\varphi^4 h_\varphi^4}\,R + \frac{c}{1440 N_\theta^4 h_\theta^4}\,K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big) f(x,\varphi,\theta^*)(1+o(1)),$$
where the remainder is
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big) K^{IV}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big) f(x,u^*_0,\theta_0), \dots, \frac{1}{h_\theta} K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big) K^{IV}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big) f(x,u^*_{2N_\theta},\theta_{2N_\theta}), -0.5\pi, 0.5\pi\Big),$$
with $u^*_j \in [\varphi - ch_\varphi, \varphi + ch_\varphi]$. To assess R, fix $u^* \in [\varphi - ch_\varphi, \varphi + ch_\varphi]$ and write
$$K^{IV}_\varphi\Big(\frac{\varphi-u^*_j}{h_\varphi}\Big) f(x,u^*_j,\theta_j) = K^{IV}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f(x,u^*,\theta_j) + (u^*_j - u^*)\Big[K^{IV}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f'_\varphi(x,u^*,\theta_j) - \frac{1}{h_\varphi}K^{V}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f(x,u^*,\theta_j)\Big](1+o(1)).$$
Applying Simpson's approximation in θ then gives
$$R = K^{IV}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big)\int\frac{1}{h_\theta} K_\theta\Big(\frac{\theta-u}{h_\theta}\Big) f(x,u^*,u)\,du\,(1+o(1)) + \cdots = K^{IV}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f(x,u^*,\theta)(1+o(1)) - 2cK^{V}_\varphi\Big(\frac{\varphi-u^*}{h_\varphi}\Big) f(x,u^*,\theta) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\theta^4h_\theta^4}\Big).$$
Thus R = O(1), since $K^{IV}_\varphi$ and $K^{V}_\varphi$ are bounded.

Combining the previous calculations we obtain
$$E\hat f_n(x,\varphi,\theta) = f(x,\varphi,\theta) + 0.5\,h_n^2\,\frac{\partial^2}{\partial x^2} f\,K_2(1+o(1)) + 0.5\,h_\varphi^2\,\frac{\partial^2}{\partial\varphi^2} f\,K_{\varphi,2}(1+o(1)) + 0.5\,h_\theta^2\,\frac{\partial^2}{\partial\theta^2} f\,K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^4}K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big) f(x,\varphi,\theta^*) + \frac{c}{1440N_\varphi^4h_\varphi^4}\Big[K^{IV}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big) f(x,\varphi^*,\theta) - 2cK^{V}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big) f(x,\varphi^*,\theta)\Big](1+o(1))$$
for some ϕ* ∈ [ϕ − ch_ϕ, ϕ + ch_ϕ] and some θ* ∈ [θ − ch_θ, θ + ch_θ].
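The estimator $\hat f_n$ above is built in three stages: spatial kernel smoothing over the voxels, then Simpson-weighted kernel smoothing over the ϕ-grid, then over the θ-grid. A minimal sketch, assuming one-dimensional kernels combined into a product kernel and regular angular grids; all names and shapes here are hypothetical conventions, not the thesis code:

```python
import numpy as np

def simpson_weights(N, a, b):
    """Composite Simpson weights on the 2N+1 equally spaced nodes of [a, b]."""
    w = np.full(2 * N + 1, 2.0)
    w[1::2] = 4.0
    w[0] = w[-1] = 1.0
    return w * (b - a) / (6.0 * N)

def fhat(x, phi, theta, X, Y, hn, hphi, htheta, K, Kphi, Ktheta, phis, thetas):
    """Three-stage estimate of f(x, phi, theta).

    X: (n, 3) voxel locations; Y: (n, 2Nphi+1, 2Ntheta+1) smoothed signal
    values on the regular angular grids phis, thetas.
    """
    n = X.shape[0]
    # Stage 1: spatial smoothing -> w_{j,k}(x), a (2Nphi+1, 2Ntheta+1) array.
    kx = K((x - X) / hn).prod(axis=1) / hn**3          # product-kernel stand-in
    w = np.tensordot(kx, Y, axes=(0, 0)) / n
    # Stage 2: Simpson-weighted kernel smoothing over phi.
    aphi = simpson_weights((len(phis) - 1) // 2, -np.pi, np.pi)
    u = (aphi * Kphi((phi - phis) / hphi) / hphi) @ w   # (2Ntheta+1,)
    # Stage 3: Simpson-weighted kernel smoothing over theta.
    atheta = simpson_weights((len(thetas) - 1) // 2, -np.pi / 2, np.pi / 2)
    return (atheta * Ktheta((theta - thetas) / htheta) / htheta) @ u
```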
Lemma 6. Let $h_n, h_\varphi, h_\theta \to 0$ and $N_\varphi, N_\theta \to \infty$. Under conditions (K) on the kernels $K_\varphi$ and $K_\theta$ we have
$$E\frac{\partial}{\partial\varphi}\hat f_n(x,\varphi,\theta) = \frac{\partial}{\partial\varphi} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^5}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^4}\Big),$$
$$E\frac{\partial}{\partial\theta}\hat f_n(x,\varphi,\theta) = \frac{\partial}{\partial\theta} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^4}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^5}\Big),$$
$$E\frac{\partial^2}{\partial\varphi^2}\hat f_n(x,\varphi,\theta) = \frac{\partial^2}{\partial\varphi^2} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^6}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^4}\Big),$$
$$E\frac{\partial^2}{\partial\theta^2}\hat f_n(x,\varphi,\theta) = \frac{\partial^2}{\partial\theta^2} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^4}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^6}\Big),$$
$$E\frac{\partial^2}{\partial\theta\partial\varphi}\hat f_n(x,\varphi,\theta) = \frac{\partial^2}{\partial\theta\partial\varphi} f(x,\varphi,\theta) + O(h_n^2) + O(h_\varphi^2) + O(h_\theta^2) + O\Big(\frac{1}{N_\varphi^4h_\varphi^5}\Big) + O\Big(\frac{1}{N_\theta^4h_\theta^5}\Big).$$

Proof. (1) By linearity of the differential operator,
$$\frac{\partial}{\partial\varphi}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)\frac{\partial}{\partial\varphi}u_0(x,\varphi), \dots, \frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)\frac{\partial}{\partial\varphi}u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big),$$
and by linearity of the expectation operator the same identity holds with expectations inside. First, note that
$$E\frac{\partial}{\partial\varphi}u_k(x,\varphi) = S_{N_\varphi}\Big(\frac{1}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi-\varphi_0}{h_\varphi}\Big)Ew_{0,k}(x), \dots, \frac{1}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi-\varphi_{2N_\varphi}}{h_\varphi}\Big)Ew_{2N_\varphi,k}(x), -\pi, \pi\Big)$$
$$= \int_{\varphi-ch_\varphi}^{\varphi+ch_\varphi}\frac{1}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)\big[f(x,u,\theta_k) + 0.5h_n^2 f''_{xx}(x,u,\theta_k)K_2(1+o(1))\big]du + \frac{2ch_\varphi}{180(2N_\varphi)^4}\frac{\partial^4}{\partial u^4}\Big[\frac{1}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)f(x,u,\theta_k)\Big]\Big|_{u=u^*_k}.$$
By Lemma 2,
$$E\frac{\partial}{\partial\varphi}u_k(x,\varphi) = f'_\varphi(x,\varphi,\theta_k) + 0.5h_n^2 f'''_{xx\varphi}(x,\varphi,\theta_k)K_2(1+o(1)) + 0.5h_\varphi^2 f'''_{\varphi\varphi\varphi}(x,\varphi,\theta_k)K_{\varphi,2}(1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^5}K^{V}_\varphi\Big(\frac{\varphi-u^*_k}{h_\varphi}\Big)f(x,u^*_k,\theta_k)(1+o(1)).$$
Thus by Proposition 4,
$$E\frac{\partial}{\partial\varphi}\hat f_n(x,\varphi,\theta) = \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta}K_\theta\Big(\frac{\theta-u}{h_\theta}\Big)\big[f'_\varphi(x,\varphi,u) + 0.5h_n^2 f'''_{xx\varphi}K_2(1+o(1)) + 0.5h_\varphi^2 f'''_{\varphi\varphi\varphi}K_{\varphi,2}(1+o(1))\big]du + \frac{c}{1440N_\varphi^4h_\varphi^5}R + \frac{c}{1440N_\theta^4h_\theta^4}K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*)(1+o(1))$$
$$= f'_\varphi(x,\varphi,\theta) + 0.5h_n^2 f'''_{xx\varphi}(x,\varphi,\theta)K_2(1+o(1)) + 0.5h_\varphi^2 f'''_{\varphi\varphi\varphi}(x,\varphi,\theta)K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f'''_{\varphi\theta\theta}(x,\varphi,\theta)K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^4}K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*)(1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^5}\Big[K^{V}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f(x,\varphi^*,\theta) - 2cK^{VI}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f(x,\varphi^*,\theta)\Big](1+o(1)).$$
The remainder R is calculated exactly as above. $\Box$
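The mechanism in part (1) is that differentiating the estimator in ϕ simply replaces the angular kernel $\frac1h K_\varphi(\cdot/h)$ by $\frac{1}{h^2}K'_\varphi(\cdot/h)$. A minimal sketch of this with a Gaussian kernel, whose derivative is available in closed form (grid, values and weights are hypothetical inputs):

```python
import numpy as np

# Differentiating the angular smoother in phi amounts to replacing
# (1/h) K((phi - grid)/h) by (1/h**2) K'((phi - grid)/h).
Kg  = lambda z: np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)   # Gaussian kernel
dKg = lambda z: -z * Kg(z)                               # its derivative

def dsmooth_dphi(phi, grid, vals, weights, h):
    """Simpson-weighted estimate of the phi-derivative of the smoothed signal."""
    return np.sum(weights * dKg((phi - grid) / h) / h**2 * vals)
```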
Proof. (2) As in the proof of part (1), the linearity of the differential and expectation operators gives
$$\frac{\partial}{\partial\theta}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)u_0(x,\varphi), \dots, \frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big)$$
and the same identity with expectations inside. By Proposition 4,
$$E\frac{\partial}{\partial\theta}\hat f_n(x,\varphi,\theta) = \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-v}{h_\theta}\Big)\big[f(x,\varphi,v) + 0.5h_n^2 f''_{xx}K_2(1+o(1)) + 0.5h_\varphi^2 f''_{\varphi\varphi}K_{\varphi,2}(1+o(1))\big]dv + \frac{c}{1440N_\varphi^4h_\varphi^4}R + \frac{c}{1440N_\theta^4h_\theta^5}K^{V}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*)(1+o(1))$$
$$= f'_\theta(x,\varphi,\theta) + 0.5h_n^2 f'''_{xx\theta}K_2(1+o(1)) + 0.5h_\varphi^2 f'''_{\varphi\varphi\theta}K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f'''_{\theta\theta\theta}K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^5}\Big[K^{V}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*) - 2cK^{VI}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*)\Big](1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^4}K^{IV}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f'_\theta(x,\varphi^*,\theta)(1+o(1)).$$
Note that R was calculated in the same way as before, but this time with
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)K^{IV}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big)f(x,u^*_0,\theta_0)(1+o(1)), \dots, \frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)K^{IV}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big)f(x,u^*_{2N_\theta},\theta_{2N_\theta})(1+o(1)), -0.5\pi, 0.5\pi\Big). \qquad\Box$$

Proof. (3) Similarly,
$$\frac{\partial^2}{\partial\varphi^2}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)\frac{\partial^2}{\partial\varphi^2}u_0(x,\varphi), \dots, \frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)\frac{\partial^2}{\partial\varphi^2}u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big),$$
so again by linearity of the expectation operator we must first calculate
$$E\frac{\partial^2}{\partial\varphi^2}u_k(x,\varphi) = S_{N_\varphi}\Big(\frac{1}{h_\varphi^3}K''_\varphi\Big(\frac{\varphi-\varphi_0}{h_\varphi}\Big)Ew_{0,k}(x), \dots, \frac{1}{h_\varphi^3}K''_\varphi\Big(\frac{\varphi-\varphi_{2N_\varphi}}{h_\varphi}\Big)Ew_{2N_\varphi,k}(x), -\pi, \pi\Big)$$
$$= \int_{\varphi-ch_\varphi}^{\varphi+ch_\varphi}\frac{1}{h_\varphi^3}K''_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)\big[f(x,u,\theta_k)+0.5h_n^2 f''_{xx}K_2(1+o(1))\big]du + \frac{2ch_\varphi}{180(2N_\varphi)^4}\frac{\partial^4}{\partial u^4}\Big[\frac{1}{h_\varphi^3}K''_\varphi\Big(\frac{\varphi-u}{h_\varphi}\Big)f(x,u,\theta_k)\Big]\Big|_{u=u^*_k}.$$
By Lemma 2,
$$E\frac{\partial^2}{\partial\varphi^2}u_k(x,\varphi) = f''_{\varphi\varphi}(x,\varphi,\theta_k) + 0.5h_n^2 f^{IV}_{xx\varphi\varphi}K_2(1+o(1)) + 0.5h_\varphi^2 f^{IV}_{\varphi\varphi\varphi\varphi}K_{\varphi,2}(1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^6}K^{VI}_\varphi\Big(\frac{\varphi-u^*_k}{h_\varphi}\Big)f(x,u^*_k,\theta_k)(1+o(1)).$$
Thus by Proposition 4,
$$E\frac{\partial^2}{\partial\varphi^2}\hat f_n(x,\varphi,\theta) = f''_{\varphi\varphi}(x,\varphi,\theta) + 0.5h_n^2 f^{IV}_{xx\varphi\varphi}K_2(1+o(1)) + 0.5h_\varphi^2 f^{IV}_{\varphi\varphi\varphi\varphi}K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f^{IV}_{\varphi\varphi\theta\theta}K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^4}K^{IV}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f''_{\varphi\varphi}(x,\varphi,\theta^*)(1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^6}\Big[K^{VI}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f(x,\varphi^*,\theta) - 2cK^{VII}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f(x,\varphi^*,\theta)\Big](1+o(1)).$$
Note that R was calculated in the same way as before, but this time with
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)K^{VI}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big)f(x,u^*_0,\theta_0)(1+o(1)), \dots, \frac{1}{h_\theta}K_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)K^{VI}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big)f(x,u^*_{2N_\theta},\theta_{2N_\theta})(1+o(1)), -0.5\pi, 0.5\pi\Big). \qquad\Box$$

Proof. (4) Similarly,
$$\frac{\partial^2}{\partial\theta^2}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)u_0(x,\varphi), \dots, \frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big)$$
and the same identity with expectations inside. By Proposition 4,
$$E\frac{\partial^2}{\partial\theta^2}\hat f_n(x,\varphi,\theta) = \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-v}{h_\theta}\Big)\big[f(x,\varphi,v)+0.5h_n^2 f''_{xx}K_2(1+o(1))+0.5h_\varphi^2 f''_{\varphi\varphi}K_{\varphi,2}(1+o(1))\big]dv + \frac{c}{1440N_\varphi^4h_\varphi^4}R + \frac{c}{1440N_\theta^4h_\theta^6}K^{VI}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*)(1+o(1))$$
$$= f''_{\theta\theta}(x,\varphi,\theta) + 0.5h_n^2 f^{IV}_{xx\theta\theta}K_2(1+o(1)) + 0.5h_\varphi^2 f^{IV}_{\varphi\varphi\theta\theta}K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f^{IV}_{\theta\theta\theta\theta}K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^6}\Big[K^{VI}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*) - 2cK^{VII}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f(x,\varphi,\theta^*)\Big](1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^4}K^{IV}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f''_{\theta\theta}(x,\varphi^*,\theta)(1+o(1)).$$
Note that R was calculated in the same way as before, but this time with
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)K^{IV}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big)f(x,u^*_0,\theta_0)(1+o(1)), \dots, \frac{1}{h_\theta^3}K''_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)K^{IV}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big)f(x,u^*_{2N_\theta},\theta_{2N_\theta})(1+o(1)), -0.5\pi, 0.5\pi\Big). \qquad\Box$$

Proof. (5) Similarly,
$$\frac{\partial^2}{\partial\theta\partial\varphi}\hat f_n(x,\varphi,\theta) = S_{N_\theta}\Big(\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)\frac{\partial}{\partial\varphi}u_0(x,\varphi), \dots, \frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)\frac{\partial}{\partial\varphi}u_{2N_\theta}(x,\varphi), -0.5\pi, 0.5\pi\Big)$$
and the same identity with expectations inside. By Proposition 4,
$$E\frac{\partial^2}{\partial\theta\partial\varphi}\hat f_n(x,\varphi,\theta) = \int_{\theta-ch_\theta}^{\theta+ch_\theta}\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-v}{h_\theta}\Big)\big[f'_\varphi(x,\varphi,v)+0.5h_n^2 f'''_{xx\varphi}K_2(1+o(1))+0.5h_\varphi^2 f'''_{\varphi\varphi\varphi}K_{\varphi,2}(1+o(1))\big]dv + \frac{c}{1440N_\varphi^4h_\varphi^5}R + \frac{c}{1440N_\theta^4h_\theta^5}K^{V}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*)(1+o(1))$$
$$= f''_{\varphi\theta}(x,\varphi,\theta) + 0.5h_n^2 f^{IV}_{xx\varphi\theta}K_2(1+o(1)) + 0.5h_\varphi^2 f^{IV}_{\varphi\varphi\varphi\theta}K_{\varphi,2}(1+o(1)) + 0.5h_\theta^2 f^{IV}_{\varphi\theta\theta\theta}K_{\theta,2}(1+o(1))$$
$$\quad + \frac{c}{1440N_\theta^4h_\theta^5}\Big[K^{V}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*) - 2cK^{VI}_\theta\Big(\frac{\theta-\theta^*}{h_\theta}\Big)f'_\varphi(x,\varphi,\theta^*)\Big](1+o(1)) + \frac{c}{1440N_\varphi^4h_\varphi^5}K^{V}_\varphi\Big(\frac{\varphi-\varphi^*}{h_\varphi}\Big)f'_\theta(x,\varphi^*,\theta)(1+o(1)).$$
Note that R was calculated in the same way as before, but this time with
$$R = S_{N_\theta}\Big(\frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_0}{h_\theta}\Big)K^{V}_\varphi\Big(\frac{\varphi-u^*_0}{h_\varphi}\Big)f(x,u^*_0,\theta_0)(1+o(1)), \dots, \frac{1}{h_\theta^2}K'_\theta\Big(\frac{\theta-\theta_{2N_\theta}}{h_\theta}\Big)K^{V}_\varphi\Big(\frac{\varphi-u^*_{2N_\theta}}{h_\varphi}\Big)f(x,u^*_{2N_\theta},\theta_{2N_\theta})(1+o(1)), -0.5\pi, 0.5\pi\Big). \qquad\Box$$
6.2 Calculation of Mean and Covariance of $\hat Z_n$

Recall the process
$$\hat Z_n(t) = -\int_0^t U(t,s)\begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial}{\partial\varphi}(\hat f_n - f) \\ \frac{\partial}{\partial\theta}(\hat f_n - f) \end{pmatrix}\Bigg|_{x^*(s)} ds.$$
Keep in mind that $\frac{\partial f}{\partial\varphi}(x^*(s)) = \frac{\partial f}{\partial\theta}(x^*(s)) = 0$.

The partial derivatives of $\hat f_n$ are easier to write as
$$\frac{\partial \hat f_n(x)}{\partial\varphi} = \frac{1}{nh_n^3}\sum_{k=1}^{n}\sum_{l=0}^{2N_\varphi}\sum_{m=0}^{2N_\theta}\frac{a_m}{h_\theta}K_\theta\Big(\frac{\theta(x)-\theta_m}{h_\theta}\Big)\frac{b_l}{h_\varphi^2}K'_\varphi\Big(\frac{\varphi(x)-\varphi_l}{h_\varphi}\Big)K\Big(\frac{x-X_k}{h_n}\Big)Y_{klm},$$
where $a_m = O(N_\theta^{-1})$ and $b_l = O(N_\varphi^{-1})$ are the coefficients in Simpson's scheme. Denote
$$K_{m,l} = \int u^l K^{(m)}(u)\,du.$$
Furthermore, denote
$$M(s) = \begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\Bigg|_{x^*(s)} \quad\text{and}\quad F(x) = \begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\Bigg|_{x}.$$
Consider the mean function of the process $\hat Z_n(t)$.

Lemma 7. Suppose $h_n \to 0$, $h_\varphi \to 0$, $h_\theta \to 0$. Then
$$E\hat Z_n(t) = -(1+o(1))\int_0^t U(t,s)M(s)F^{-1}(x^*(s))\Bigg[h_n^2\begin{pmatrix}\tfrac12 K_{0,2}K_{\varphi,1,1}\,\frac{\partial^3 f}{\partial x^2\partial\varphi} \\[1mm] \tfrac12 K_{0,2}K_{\theta,1,1}\,\frac{\partial^3 f}{\partial x^2\partial\theta}\end{pmatrix} + h_\varphi^2\begin{pmatrix}\tfrac16 K_{\varphi,1,3}\,\frac{\partial^3 f}{\partial\varphi^3} \\[1mm] \tfrac13 K_{\theta,1,1}K_{\varphi,0,2}\,\frac{\partial^3 f}{\partial\varphi^2\partial\theta}\end{pmatrix} + h_\theta^2\begin{pmatrix}\tfrac13 K_{\varphi,1,1}K_{\theta,0,2}\,\frac{\partial^3 f}{\partial\varphi\partial\theta^2} \\[1mm] \tfrac16 K_{\theta,1,3}\,\frac{\partial^3 f}{\partial\theta^3}\end{pmatrix}\Bigg]ds,$$
where all derivatives of f are evaluated at $(x^*(s), \varphi^*(x^*(s)), \theta^*(x^*(s)))$.

Proof. We calculated the asymptotic expressions for the expectations of the estimators of the derivatives of f in the previous section. The above is the result of applying those calculations to the defined process $\hat Z_n$; no other methods are needed. $\Box$

Next consider the covariance: for any $t_1, t_2 \in [0,T]$,
$$\mathrm{Cov}(\hat Z_n(t_1), \hat Z_n(t_2)) = \int\!\!\int I(s_1\in[0,t_1])\,I(s_2\in[0,t_2])\,U(t_1,s_1)M(s_1)F^{-1}(x^*(s_1))\begin{pmatrix} \mathrm{Cov}\big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\varphi}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\varphi}\big) & \mathrm{Cov}\big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\varphi}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\theta}\big) \\ \mathrm{Cov}\big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\theta}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\varphi}\big) & \mathrm{Cov}\big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\theta}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\theta}\big) \end{pmatrix} M^*(s_2)(F^*)^{-1}(x^*(s_2))U^*(t_2,s_2)\,ds_1ds_2. \tag{6.1}$$

Consider the covariance matrix in the middle of the above. For the first entry, writing $x_1 = x^*(s_1)$ and $x_2 = x^*(s_2)$, we have
$$\mathrm{Cov}\Big(\frac{\partial\hat f_n(x_1)}{\partial\varphi}, \frac{\partial\hat f_n(x_2)}{\partial\varphi}\Big) = \frac{1}{n^2h_n^6}\sum_{k_1,k_2=1}^{n}\sum_{l_1,l_2=0}^{2N_\varphi}\sum_{m_1,m_2=0}^{2N_\theta}\frac{a_{m_1}a_{m_2}}{h_\theta^2}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-\theta_{m_2}}{h_\theta}\Big)\frac{b_{l_1}b_{l_2}}{h_\varphi^4}K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_2}}{h_\varphi}\Big)$$
$$\qquad\times\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)Y_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)Y_{k_2l_2m_2}\Big).$$
The latter covariance consists of two terms,
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)f(X_{k_1},\varphi_{l_1},\theta_{m_1}),\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)f(X_{k_2},\varphi_{l_2},\theta_{m_2})\Big)$$
and
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)S(X_{k_1},\varphi_{l_1},\theta_{m_1})\varepsilon_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)S(X_{k_2},\varphi_{l_2},\theta_{m_2})\varepsilon_{k_2l_2m_2}\Big).$$
The first term is zero for $k_1 \neq k_2$ due to independence, while for $k_1 = k_2$ we have
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)f(X_{k_1},\varphi_{l_1},\theta_{m_1}),\ K\Big(\frac{x_2-X_{k_1}}{h_n}\Big)f(X_{k_1},\varphi_{l_2},\theta_{m_2})\Big)$$
$$= h_n^3\int K(u)K\Big(u + \frac{x_1-x_2}{h_n}\Big)f(x_1 - uh_n,\varphi_{l_1},\theta_{m_1})f(x_1 - uh_n,\varphi_{l_2},\theta_{m_2})\,du\,(1+O(h_n^2))$$
$$= h_n^3\,\Psi\Big(\frac{x_1-x_2}{h_n}\Big)f(x_1,\varphi_{l_1},\theta_{m_1})f(x_2,\varphi_{l_2},\theta_{m_2})(1+O(h_n^2)),$$
where we used the substitution $u = (x - x_1)/h_n$ and $\Psi(z) := \int K(u)K(u+z)\,du$.

The second term is zero when $l_1 \neq l_2$ or $m_1 \neq m_2$, while in the case $l_1 = l_2$, $m_1 = m_2$ we have for $k_1 = k_2$
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)S(X_{k_1},\varphi_{l_1},\theta_{m_1})\varepsilon_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_1}}{h_n}\Big)S(X_{k_1},\varphi_{l_1},\theta_{m_1})\varepsilon_{k_1l_1m_1}\Big) = h_n^3\,\Psi\Big(\frac{x_1-x_2}{h_n}\Big)S(x_1,\varphi_{l_1},\theta_{m_1})S(x_2,\varphi_{l_1},\theta_{m_1})(1+O(h_n^2)),$$
while for $k_1 \neq k_2$
$$\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)S\varepsilon_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)S\varepsilon_{k_2l_1m_1}\Big) = \Sigma_{k_1k_2}\,h_n^6\,S(x_1,\varphi_{l_1},\theta_{m_1})S(x_2,\varphi_{l_1},\theta_{m_1})(1+O(h_n^2)).$$

Now we consider the summation for the (1,1)-component of (6.1):
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}\frac{a_{m_1}a_{m_2}}{h_\theta^2}\frac{b_{l_1}b_{l_2}}{h_\varphi^4}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-\theta_{m_2}}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_2}}{h_\varphi}\Big)f(x_1,\varphi_{l_1},\theta_{m_1})f(x_2,\varphi_{l_2},\theta_{m_2})$$
$$+\ \frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\sum_{m_1=0}^{2N_\theta}\sum_{l_1=0}^{2N_\varphi}\frac{a^2_{m_1}}{h_\theta^2}\frac{b^2_{l_1}}{h_\varphi^4}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-\theta_{m_1}}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_1}}{h_\varphi}\Big)S(x_1,\varphi_{l_1},\theta_{m_1})S(x_2,\varphi_{l_1},\theta_{m_1})$$
$$+\ \Big[\frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\sum_{m_1=0}^{2N_\theta}\sum_{l_1=0}^{2N_\varphi}\frac{a^2_{m_1}}{h_\theta^2}\frac{b^2_{l_1}}{h_\varphi^4}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-\theta_{m_1}}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_1}}{h_\varphi}\Big)S(x_1,\varphi_{l_1},\theta_{m_1})S(x_2,\varphi_{l_1},\theta_{m_1})(1+O(h_n^2)).$$
The first term above is an ordinary Simpson double integral. The next two terms have the coefficients squared, so we need Proposition 5. The (1,1)-component of (6.1) then becomes
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\frac{1}{h_\theta^2h_\varphi^4}\int\!\!\int\!\!\int\!\!\int K_\theta\Big(\frac{\theta(x_1)-u_1}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-u_2}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-v_1}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-v_2}{h_\varphi}\Big)f(x_1,v_1,u_1)f(x_2,v_2,u_2)\,dv_1dv_2du_1du_2$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\frac{25}{81N_\varphi N_\theta}\frac{1}{h_\theta^2h_\varphi^4}\int\!\!\int K_\theta\Big(\frac{\theta(x_1)-u}{h_\theta}\Big)K_\theta\Big(\frac{\theta(x_2)-u}{h_\theta}\Big)K'_\varphi\Big(\frac{\varphi(x_1)-v}{h_\varphi}\Big)K'_\varphi\Big(\frac{\varphi(x_2)-v}{h_\varphi}\Big)S(x_1,v,u)S(x_2,v,u)\,dv\,du,$$
with the angular integrals over $[\theta(x_i) - ch_\theta, \theta(x_i) + ch_\theta]$ and $[\varphi(x_i) - ch_\varphi, \varphi(x_i) + ch_\varphi]$. Now we make the substitutions $z_i = (\varphi(x_i) - v_i)/h_\varphi$ and $y_i = (\theta(x_i) - u_i)/h_\theta$ in the integrals:
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\frac{1}{h_\varphi^2}\int\!\!\int\!\!\int\!\!\int K_\theta(y_1)K_\theta(y_2)K'_\varphi(z_1)K'_\varphi(z_2)\,f(x_1,\varphi(x_1)-z_1h_\varphi,\theta(x_1)-y_1h_\theta)\,f(x_2,\varphi(x_2)-z_2h_\varphi,\theta(x_2)-y_2h_\theta)\,dz_1dz_2dy_1dy_2$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\frac{25}{81N_\varphi N_\theta}\frac{1}{h_\varphi^3h_\theta}\int\!\!\int K_\theta(y)K_\theta\Big(y + \frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)K'_\varphi(z)K'_\varphi\Big(z + \frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)$$
$$\qquad\times S(x_1,\varphi(x_1)-zh_\varphi,\theta(x_1)-yh_\theta)\,S(x_2,\varphi(x_1)-zh_\varphi,\theta(x_1)-yh_\theta)\,dz\,dy.$$
Recall properties (K).
Applying a Taylor expansion to the functions f and S yields the following expression for the second summand of the (1,1)-component of (6.1):
$$\Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big](-1)\Psi''_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)\Psi_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\varphi^3}\frac{1}{h_\theta}\,\frac{25}{81N_\varphi N_\theta},$$
where we have defined the integrals
$$\Psi(z) = \int K(u)K(z+u)\,du, \qquad \Psi'(z) = \int K(u)K'(z+u)\,du,$$
and integration by parts gives
$$\Psi''(z) = -\int K'(u)K'(z+u)\,du.$$

Consider the first summand of the (1,1)-component of (6.1):
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\bigg[\frac{\partial f}{\partial\varphi}(x_1,\varphi(x_1),\theta(x_1))\frac{\partial f}{\partial\varphi}(x_2,\varphi(x_2),\theta(x_2))K^2_{\varphi,1,1}$$
$$\quad + h_\varphi^2\Big(\frac{\partial f}{\partial\varphi}(x_1,\cdot)\frac{\partial^3 f}{\partial\varphi^3}(x_2,\cdot) + \frac{\partial^3 f}{\partial\varphi^3}(x_1,\cdot)\frac{\partial f}{\partial\varphi}(x_2,\cdot)\Big)K_{\varphi,1,1}K_{\varphi,1,3}$$
$$\quad + h_\varphi^4\Big(\frac{\partial f}{\partial\varphi}(x_1,\cdot)\frac{\partial^5 f}{\partial\varphi^5}(x_2,\cdot) + \frac{\partial^5 f}{\partial\varphi^5}(x_1,\cdot)\frac{\partial f}{\partial\varphi}(x_2,\cdot)\Big)K_{\varphi,1,1}K_{\varphi,1,5} + h_\varphi^4\,\frac{\partial^3 f}{\partial\varphi^3}(x_1,\cdot)\frac{\partial^3 f}{\partial\varphi^3}(x_2,\cdot)K^2_{\varphi,1,3} + \cdots\bigg],$$
where $K_{\varphi,m,l} = \int\varphi^lK^{(m)}(\varphi)\,d\varphi$. Recall that this expression is evaluated at $x_1 = x^*(s_1)$, $x_2 = x^*(s_2)$, where $\frac{\partial f}{\partial\varphi}$ is zero; that is, recall the equations
$$\frac{\partial}{\partial\varphi}f(x,\varphi^*(x),\theta^*(x)) = \frac{\partial}{\partial\theta}f(x,\varphi^*(x),\theta^*(x)) = 0,$$
$$\frac{\partial^2}{\partial\varphi^2}f(x,\varphi^*(x),\theta^*(x)) < 0, \qquad \frac{\partial^2}{\partial\theta^2}f(x,\varphi^*(x),\theta^*(x)) < 0,$$
$$\frac{\partial^2}{\partial\varphi^2}f(x,\varphi^*(x),\theta^*(x))\,\frac{\partial^2}{\partial\theta^2}f(x,\varphi^*(x),\theta^*(x)) - \Big(\frac{\partial^2}{\partial\theta\partial\varphi}f(x,\varphi^*(x),\theta^*(x))\Big)^2 > 0.$$
Then only the last displayed summand in the first summand of the (1,1)-component of (6.1) is nonzero. As a result, the (1,1)-component of (6.1) is
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)h_\varphi^4\,\frac{\partial^3 f}{\partial\varphi^3}(x_1,\varphi(x_1),\theta(x_1))\,\frac{\partial^3 f}{\partial\varphi^3}(x_2,\varphi(x_2),\theta(x_2))\,K^2_{\varphi,1,3}$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big](-1)\Psi''_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)\Psi_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\varphi^3h_\theta}\,\frac{25}{81N_\varphi N_\theta}.$$
The (2,2)-component of (6.1) is easily obtained from the above by the exchange of ϕ and θ, so it is
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)h_\theta^4\,\frac{\partial^3 f}{\partial\theta^3}(x_1,\varphi(x_1),\theta(x_1))\,\frac{\partial^3 f}{\partial\theta^3}(x_2,\varphi(x_2),\theta(x_2))\,K^2_{\theta,1,3}$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big](-1)\Psi''_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)\Psi_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\theta^3h_\varphi}\,\frac{25}{81N_\varphi N_\theta}.$$
Now consider the (1,2)-component of (6.1):
$$\mathrm{Cov}\Big(\frac{\partial\hat f_n(x^*(s_1))}{\partial\varphi}, \frac{\partial\hat f_n(x^*(s_2))}{\partial\theta}\Big) = \frac{1}{n^2h_n^6}\sum_{k_1,k_2=1}^{n}\sum_{l_1,l_2=0}^{2N_\varphi}\sum_{m_1,m_2=0}^{2N_\theta}\frac{a_{m_1}a_{m_2}}{h_\theta^3}K_\theta\Big(\frac{\theta(x_1)-\theta_{m_1}}{h_\theta}\Big)K'_\theta\Big(\frac{\theta(x_2)-\theta_{m_2}}{h_\theta}\Big)\frac{b_{l_1}b_{l_2}}{h_\varphi^3}K'_\varphi\Big(\frac{\varphi(x_1)-\varphi_{l_1}}{h_\varphi}\Big)K_\varphi\Big(\frac{\varphi(x_2)-\varphi_{l_2}}{h_\varphi}\Big)$$
$$\qquad\times\mathrm{Cov}\Big(K\Big(\frac{x_1-X_{k_1}}{h_n}\Big)Y_{k_1l_1m_1},\ K\Big(\frac{x_2-X_{k_2}}{h_n}\Big)Y_{k_2l_2m_2}\Big),$$
which, exactly as for the (1,1)-component, splits into an ordinary Simpson double sum with the f-products, a squared-coefficient sum with the S-products for $k_1 = k_2$, and a $k_1 \neq k_2$ sum weighted by $\Sigma_{k_1k_2}$. Applying Simpson's approximation (Proposition 4 for the first sum, Proposition 5 for the squared-coefficient sums) and the substitutions
$$y_1 = \frac{\theta(x_1)-u_1}{h_\theta},\quad y_2 = \frac{\theta(x_2)-u_2}{h_\theta},\quad z_1 = \frac{\varphi(x_1)-v_1}{h_\varphi},\quad z_2 = \frac{\varphi(x_2)-v_2}{h_\varphi},\quad y = \frac{\theta(x_1)-u}{h_\theta},\quad z = \frac{\varphi(x_1)-v}{h_\varphi},$$
the above becomes
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)\frac{1}{h_\varphi h_\theta}\int\!\!\int\!\!\int\!\!\int K_\theta(y_1)K'_\theta(y_2)K'_\varphi(z_1)K_\varphi(z_2)\,f(x_1,\varphi(x_1)-z_1h_\varphi,\theta(x_1)-y_1h_\theta)\,f(x_2,\varphi(x_2)-z_2h_\varphi,\theta(x_2)-y_2h_\theta)\,dz_1dz_2dy_1dy_2$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\frac{25}{81N_\varphi N_\theta}\frac{1}{h_\varphi^2h_\theta^2}\int\!\!\int K_\theta(y)K'_\theta\Big(y + \frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)K'_\varphi(z)K_\varphi\Big(z + \frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)$$
$$\qquad\times S(x_1,\varphi(x_1)-zh_\varphi,\theta(x_1)-yh_\theta)\,S(x_2,\varphi(x_1)-zh_\varphi,\theta(x_1)-yh_\theta)\,dz\,dy.$$
Finally, applying the Taylor expansions as before, the (1,2)-component of (6.1) is
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)h_\varphi^2h_\theta^2\,\frac{\partial^3 f}{\partial\varphi^3}(x_1,\varphi(x_1),\theta(x_1))\,\frac{\partial^3 f}{\partial\theta^3}(x_2,\varphi(x_2),\theta(x_2))\,K_{\varphi,1,3}K_{\theta,1,3}$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\Psi'_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)\Psi'_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\varphi^2h_\theta^2}\,\frac{25}{81N_\varphi N_\theta}.$$
Similarly, the (2,1)-component of (6.1) is
$$\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big)h_\varphi^2h_\theta^2\,\frac{\partial^3 f}{\partial\theta^3}(x_1,\varphi(x_1),\theta(x_1))\,\frac{\partial^3 f}{\partial\varphi^3}(x_2,\varphi(x_2),\theta(x_2))\,K_{\varphi,1,3}K_{\theta,1,3}$$
$$+\ \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x_1-x_2}{h_n}\Big) + \frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2}\Big]\Psi'_\varphi\Big(\frac{\varphi(x_2)-\varphi(x_1)}{h_\varphi}\Big)\Psi'_\theta\Big(\frac{\theta(x_2)-\theta(x_1)}{h_\theta}\Big)S(x_1,\varphi(x_1),\theta(x_1))S(x_2,\varphi(x_1),\theta(x_1))\,\frac{1}{h_\varphi^2h_\theta^2}\,\frac{25}{81N_\varphi N_\theta}.$$
To balance both terms, with f and with S, in the covariance matrix (6.1) we need the following assumptions:
$$\frac{1+O(h_n^2)}{n^2}\sum_{k_1\neq k_2}\Sigma_{k_1k_2} = \frac{\kappa}{nh_n^3} \qquad\text{and}\qquad N_\varphi = N_\theta = N,\quad h_\varphi = h_\theta = h,$$
where κ > 0 is a constant.
Now we plug the expression for (6.1) into the covariance function of the process $\hat Z_n$. First we restate the following for convenience:
$$\Psi(z) = \int K(u)K(z+u)\,du, \qquad \Psi'(z) = \int K(u)K'(z+u)\,du,$$
and integration by parts gives
$$\Psi''(z) = -\int K'(u)K'(z+u)\,du.$$
For instance, for a standard Gaussian kernel $\Psi(z) = \dfrac{e^{-|z|^2/4}}{(2\sqrt\pi)^3}$. Now out of the kernels $K_\varphi$ and $K_\theta$ build the matrix
$$\Psi_0 = \begin{pmatrix} \Psi''_\varphi(0)\Psi_\theta(0) & \Psi'_\varphi(0)\Psi'_\theta(0) \\ \Psi'_\varphi(0)\Psi'_\theta(0) & \Psi_\varphi(0)\Psi''_\theta(0) \end{pmatrix}.$$
If both $K_\varphi$ and $K_\theta$ are Gaussian then $\Psi_0 = -\mathrm{diag}\big(\frac{1}{8\pi}, \frac{1}{8\pi}\big)$. For $v \in \mathbb R^3$ define
$$\psi(v) = \int_{\mathbb R}\Psi(-\tau v)\,d\tau,$$
which equals $\dfrac{1}{4\pi|v|}$ for a standard Gaussian kernel. Also introduce
$$\Psi_0(v,x) = \int_{\mathbb R}\begin{pmatrix} -\Psi''_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) & \Psi'_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi'_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) \\ \Psi'_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi'_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) & -\Psi_\varphi\big(\frac{\partial\varphi^*}{\partial x}(x)v\tau\big)\Psi''_\theta\big(\frac{\partial\theta^*}{\partial x}(x)v\tau\big) \end{pmatrix}\big(\Psi(-v\tau) + \kappa\big)\,d\tau.$$
In the case of a standard Gaussian kernel
$$\Psi_0(v,x) = -\frac{1}{D(x)}\begin{pmatrix} 1 + \big(\frac{\partial\theta^*}{\partial x}(x)v\big)^2 & -\big(\frac{\partial\varphi^*}{\partial x}(x)v\big)\big(\frac{\partial\theta^*}{\partial x}(x)v\big) \\ -\big(\frac{\partial\varphi^*}{\partial x}(x)v\big)\big(\frac{\partial\theta^*}{\partial x}(x)v\big) & 1 + \big(\frac{\partial\varphi^*}{\partial x}(x)v\big)^2 \end{pmatrix}, \qquad D(x) = 64\pi^2\Big(1 + \Big(\frac{\partial\varphi^*}{\partial x}(x)v\Big)^2 + \Big(\frac{\partial\theta^*}{\partial x}(x)v\Big)^2\Big)^{3/2}.$$
Also, define
$$M(s) = \begin{pmatrix} -\sin\theta^*\sin\varphi^* & \cos\theta^*\cos\varphi^* \\ \sin\theta^*\cos\varphi^* & \cos\theta^*\sin\varphi^* \\ 0 & -\sin\theta^* \end{pmatrix}\Bigg|_{x^*(s)} \quad\text{and}\quad F(x,\varphi,\theta) = \begin{pmatrix} \frac{\partial^2 f}{\partial\varphi^2} & \frac{\partial^2 f}{\partial\varphi\partial\theta} \\ \frac{\partial^2 f}{\partial\varphi\partial\theta} & \frac{\partial^2 f}{\partial\theta^2} \end{pmatrix}\Bigg|_{(x,\varphi,\theta)}.$$
The Green's function U(t,s) is defined as the solution of the PDE
$$\frac{\partial U(t,s)}{\partial t} = \nabla v^*(x^*(t))\,U(t,s), \qquad U(s,s) = I \ \ \forall s > 0,$$
where
$$\nabla v(x^*(s)) = -M(s)F^{-1}(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\begin{pmatrix} \frac{\partial^2}{\partial\varphi\partial x}f(x^*(s)) \\ \frac{\partial^2}{\partial\theta\partial x}f(x^*(s)) \end{pmatrix}.$$
Then, returning to the covariance function of the process $\hat Z_n$, we combine the above calculations using these definitions:
$$\mathrm{Cov}(\hat Z_n(t_1),\hat Z_n(t_2)) = \int\!\!\int I(s_1\in[0,t_1])I(s_2\in[0,t_2])\,U(t_1,s_1)M(s_1)F^{-1}(x^*(s_1))\bigg\{\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x^*(s_1)-x^*(s_2)}{h_n}\Big)h^4\,G(x^*(s_1))G^*(x^*(s_2))$$
$$\quad + \Big[\frac{1+O(h_n^2)}{nh_n^3}\Psi\Big(\frac{x^*(s_1)-x^*(s_2)}{h_n}\Big) + \frac{\kappa}{nh_n^3}\Big]\frac{25}{81}\frac{1}{N^2h^4}\,\Psi(s_1,s_2)\,S(x^*(s_1),\varphi^*(x^*(s_1)),\theta^*(x^*(s_1)))\,S(x^*(s_2),\varphi^*(x^*(s_1)),\theta^*(x^*(s_1)))\bigg\}$$
$$\qquad\times M^*(s_2)(F^*)^{-1}(x^*(s_2))U^*(t_2,s_2)\,ds_1ds_2,$$
where
$$G(x) := \begin{pmatrix} \frac{\partial^3}{\partial\varphi^3}f(x)\,K_{\varphi,1,3} \\ \frac{\partial^3}{\partial\theta^3}f(x)\,K_{\theta,1,3} \end{pmatrix} \quad\text{and}\quad \Psi(s_1,s_2) := \begin{pmatrix} -\Psi''_\varphi\big(\frac{\Delta\varphi^*}{h_\varphi}\big)\Psi_\theta\big(\frac{\Delta\theta^*}{h_\theta}\big) & \Psi'_\varphi\big(\frac{\Delta\varphi^*}{h_\varphi}\big)\Psi'_\theta\big(\frac{\Delta\theta^*}{h_\theta}\big) \\ \Psi'_\varphi\big(\frac{\Delta\varphi^*}{h_\varphi}\big)\Psi'_\theta\big(\frac{\Delta\theta^*}{h_\theta}\big) & -\Psi_\varphi\big(\frac{\Delta\varphi^*}{h_\varphi}\big)\Psi''_\theta\big(\frac{\Delta\theta^*}{h_\theta}\big) \end{pmatrix},$$
with $\Delta\varphi^* = \varphi^*(x^*(s_2)) - \varphi^*(x^*(s_1))$ and $\Delta\theta^* = \theta^*(x^*(s_2)) - \theta^*(x^*(s_1))$.

Now consider the change of variable $s_2 = s_1 + \tau h$ with $ds_2 = h\,d\tau$. Then we have proven the following result:

Lemma 8.
$$\mathrm{Cov}(\hat Z_n(t_1),\hat Z_n(t_2)) = \int_0^{t_1\wedge t_2}\frac{1+O(h_n^2)}{nh_n^2}\,U(t_1,s)M(s)F^{-1}(x^*(s))\bigg\{h^4\,\psi\Big(\frac{dx^*}{dt}(s)\Big)G(x^*(s))G^*(x^*(s))$$
$$\qquad + \frac{25}{81}\frac{1}{N^2h^4}\,\Psi_0\Big(\frac{dx^*}{dt}(s), x^*(s)\Big)S^2(x^*(s),\varphi^*(x^*(s)),\theta^*(x^*(s)))\bigg\}M^*(s)(F^*)^{-1}(x^*(s))U^*(t_2,s)\,ds.$$
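The closed form for Ψ under a standard Gaussian kernel is easy to confirm numerically. A minimal sketch, using one-dimensional quadrature since the 3D Gaussian kernel factorizes; the grid resolution is an arbitrary choice:

```python
import numpy as np

# Check of Psi(z) = exp(-|z|^2/4) / (2*sqrt(pi))**3 for the standard
# Gaussian kernel; the 3D convolution is a product of 1D ones.
def psi_1d(z, grid):
    K = lambda u: np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)
    du = grid[1] - grid[0]
    return np.sum(K(grid) * K(grid + z)) * du

grid = np.linspace(-10, 10, 20001)
z = np.array([0.7, -0.3, 1.1])
numeric = np.prod([psi_1d(zi, grid) for zi in z])
closed = np.exp(-np.dot(z, z) / 4) / (2 * np.sqrt(np.pi))**3
print(numeric, closed)  # agree to quadrature accuracy
```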
6.3 Mean Squared Error of the Process $\hat Z_n$

Proposition 6. The optimal bandwidths $(h_n, h)$ that minimize the MSE of $\hat Z_n$ are attained under either one of the following cases:

(I) $h_n = O(n^{-1/2})$, $h = O(n^{-1/2})$, $N \ge O(n^2)$ as $n \to \infty$; or

(II) $h_n = O(N^{-1/5}n^{-1/10})$, $h = O(N^{-1/5}n^{-1/10})$, $N = o(n^2)$ as $n \to \infty$.

Remark. In practice in HARDI the number of gradient directions N is much smaller than the number of voxels n, so we are in case (II).

Proof. The MSE has the form
$$c_1h_n^4 + c_2h^4 + c_3h^2h_n^2 + \frac{c_4h^4}{nh_n^2} + \frac{c_5}{nN^2h^4h_n^2}.$$
Take partial derivatives with respect to $h_n$ and h and set them to zero:
$$4c_1h_n^3 + 2c_3h^2h_n - \frac{2c_4h^4}{nh_n^3} - \frac{2c_5}{nN^2h^4h_n^3} = 0$$
and
$$4c_2h^3 + 2c_3hh_n^2 + \frac{4c_4h^3}{nh_n^2} - \frac{4c_5}{nN^2h^5h_n^2} = 0.$$
Consider the second equation; multiplying through by $nh_n^2/2$ gives
$$c_3nhh_n^4 + 2c_2nh^3h_n^2 + 2c_4h^3 - \frac{2c_5}{N^2h^5} = 0,$$
which is quadratic with respect to $h_n^2$. Then
$$h_n^2 = \frac{-2c_2nh^3 \pm \sqrt{4c_2^2n^2h^6 - 4c_3nh\big(2c_4h^3 - \frac{2c_5}{N^2h^5}\big)}}{2c_3nh},$$
which is, up to terms of smaller order,
$$h_n^2 \approx -\frac{c_2}{c_3}h^2 \pm \frac{c_2}{c_3}h^2 \pm \frac{1}{c_2n}\Big(\frac{2c_5}{N^2h^8} - 2c_4\Big).$$
Since $c_2/c_3 > 0$ we take
$$h_n^2 \approx \frac{2c_5}{c_2nh^8N^2} - \frac{2c_4}{c_2n}.$$
Consider 2 cases:
(I) $N^2h^8 \to C \in (0, +\infty]$, so that $h_n^2 = O\big(\frac1n\big)$; and (II) $N^2h^8 \to 0$, so that $h_n^2 = O\big(\frac{1}{nN^2h^8}\big)$.

Now we consider the first equation with respect to h for case (I):
$$2c_1n^{-3/2} + c_3h^2n^{-1/2} - c_4h^4n^{1/2} - \frac{c_5n^{1/2}}{N^2h^4} = 0,$$
or equivalently
$$2c_1n^{-2} + c_3h^2n^{-1} - c_4h^4 - \frac{c_5h^4}{C} = 0,$$
and furthermore, with positive constants c, c₀,
$$h^4 - c_0h^2n^{-1} - cn^{-2} = 0,$$
which implies $h^2 = O\big(\frac1n\big)$. As a result, case (I) becomes
$$\text{(I)}\quad h_n = O(n^{-1/2}), \quad h = O(n^{-1/2}), \quad N \ge O(n^2) \ \text{as } n\to\infty.$$
Now consider case (II). The first equation with respect to h is
$$\frac{2c_1}{n^3h^{24}N^6} + \frac{c_3}{n^2h^{14}N^4} - \frac{c_4h^4}{n} - \frac{c_5}{nN^2h^4} = 0,$$
which, with $\varepsilon := Nh^4 \to 0$ and $h = \varepsilon^{1/4}N^{-1/4}$, is equivalent up to terms of higher order to
$$-2c_1N - nN^{1/2}\varepsilon^{5/2} + c_5n^2\varepsilon^5 = 0,$$
which is quadratic in $\varepsilon^{5/2}$. Then $\varepsilon = O(N^{1/5}n^{-2/5})$. As a result, case (II) becomes
$$\text{(II)}\quad h_n = O(N^{-1/5}n^{-1/10}), \quad h = O(N^{-1/5}n^{-1/10}), \quad N = o(n^2) \ \text{as } n\to\infty,$$
which completes the proof of the proposition. $\Box$

Remark. From the proof it is clear that in case (II) the second part of the covariance of $\hat Z_n$, the part involving S, is the leading term, while the first part is of smaller order. The bias squared is of smaller order as well.
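A minimal sketch of the resulting bandwidth rule; the constant c is hypothetical, and only the rates come from Proposition 6:

```python
import numpy as np

def optimal_bandwidths(n, N, c=1.0):
    """Rate-optimal bandwidths (h_n, h) from Proposition 6 (constants hypothetical).

    Case (I) applies when N grows at least like n**2; in HARDI N << n,
    so case (II), h_n = h = O(N**(-1/5) * n**(-1/10)), is the practical one.
    """
    if N >= n**2:                       # case (I)
        h = c * n**(-0.5)
    else:                               # case (II)
        h = c * N**(-0.2) * n**(-0.1)
    return h, h                         # (h_n, h)
```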
6.4 Asymptotic Normality of the $\hat Z_n$ Process

First we need to establish consistency, which will help justify the approximation of $\hat X^*_n(t) - x^*(t)$ by $\hat Z_n$.

Lemma 9. Suppose that (I) or (II) holds. Then, uniformly in $t \in [0,T]$, $\hat X^*_n(t)$ is a consistent estimator of $x^*(t)$. That is,
$$\sup_{t\in[0,T]}|\hat X^*_n(t) - x^*(t)| \xrightarrow{P} 0 \quad\text{as } n\to\infty.$$
In the following proof we use the Gronwall-Bellman inequality, which we formulate here for completeness: let F, G be non-negative continuous functions on [a,b] and D ≥ 0 a constant. Suppose that for all $t \in [a,b)$,
$$G(t) \le D + \int_a^t F(s)G(s)\,ds.$$
Then for all $t \in [a,b]$,
$$G(t) \le D\exp\Big\{\int_a^t F(s)\,ds\Big\}.$$

Proof. Now,
$$\hat x^*_n(t) - x^*(t) = \int_0^t(\hat v^*_n - v^*)(\hat x^*_n(s))\,ds + \int_0^t\big(v^*(\hat x^*_n(s)) - v^*(x^*(s))\big)\,ds$$
$$= \int_0^t(\hat v^*_n - v^*)(\hat x^*_n(s))\,ds + \int_0^t\nabla v^*(x^*(s))\big(\hat x^*_n(s) - x^*(s)\big)\,ds + O\Big(\sup_{t\in[0,T]}|\hat x^*_n(t) - x^*(t)|^2\Big)$$
$$= \int_0^t(\hat v^*_n - v^*)(x^*(s))\,ds + \int_0^t\nabla v^*(x^*(s))\big(\hat x^*_n(s) - x^*(s)\big)\,ds + O\Big(\sup_{t\in[0,T]}|\hat x^*_n(t) - x^*(t)|^2\Big),$$
since
$$\int_0^t\big[(\hat v^*_n - v^*)(\hat x^*_n(s)) - (\hat v^*_n - v^*)(x^*(s))\big]ds = \int_0^t\int_0^1(\hat v^*_n - v^*)'\big(\lambda\hat x^*_n(s) + (1-\lambda)x^*(s)\big)\big(\hat x^*_n(s) - x^*(s)\big)\,d\lambda\,ds = o\big(|\hat x^*_n(s) - x^*(s)|\big). \tag{6.2}$$
Let $r_n$ denote the $O(\sup_{t\in[0,T]}|\hat x^*_n(t) - x^*(t)|^2)$ error term, so that
$$\hat x^*_n(t) - x^*(t) = \int_0^t(\hat v^*_n - v^*)(x^*(s))\,ds + \int_0^t\nabla v^*(x^*(s))(\hat x^*_n(s) - x^*(s))\,ds + r_n. \tag{6.3}$$
Now,
$$z_n(t) = \int_0^t\nabla v^*(x^*(s))z_n(s)\,ds + \int_0^t(\hat v^*_n - v^*)(x^*(s))\,ds,$$
so that
$$\hat x^*_n(t) - x^*(t) - z_n(t) = \int_0^t\nabla v^*(x^*(s))\big[(\hat x^*_n(s) - x^*(s)) - z_n(s)\big]ds + r_n.$$
Then by the Gronwall-Bellman inequality,
$$\|\hat x^*_n(t) - x^*(t) - z_n(t)\| \le \|r_n\|\exp\Big\{\int_0^t\nabla v^*(x^*(s))\,ds\Big\}. \tag{6.4}$$
Now,
$$|\tilde Z_n(t)| \le \int_0^t\Bigg|U(t,s)M(s)F^{-1}(x^*(s))\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f)(x^*(s)) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)(x^*(s))\end{pmatrix}\Bigg|\,ds \le \|A\|\int_0^t\Bigg|\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f)(x^*(s)) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)(x^*(s))\end{pmatrix}\Bigg|\,ds,$$
so that
$$\sup_{t\in[0,T]}|\tilde Z_n(t)| \le \|A\|\,T\,\sup_{x\in G}\Bigg|\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f)(x) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)(x)\end{pmatrix}\Bigg|.$$
Here ‖A‖ is the operator norm, and we denote by $\|\cdot\|_\infty = \sup_{x\in G}|\cdot|$, as usual, the supremum norm. By GG Proposition 3.1 (see [14]) we have
$$\Big\|\frac{\partial}{\partial\varphi}\hat f_n - E\frac{\partial}{\partial\varphi}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^6}}\bigg) \quad\text{a.s.},$$
and thus, combining with the bias bounds of Lemma 6,
$$\Big\|\frac{\partial}{\partial\varphi}f - \frac{\partial}{\partial\varphi}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^6}}\bigg) \quad\text{and analogously}\quad \Big\|\frac{\partial}{\partial\theta}f - \frac{\partial}{\partial\theta}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^6}}\bigg) \quad\text{a.s.}$$
To control the error term, we note that
$$\begin{pmatrix}\Delta_\varphi \\ \Delta_\theta\end{pmatrix} = -F^{-1}(x,\varphi^*(x),\theta^*(x))\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f)(x) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)(x)\end{pmatrix} + O(\Delta_\varphi^2 + \Delta_\theta^2), \tag{6.5}$$
which follows from the expansion
$$\hat f'_{n\varphi}(x,\varphi^*(x^*),\theta^*(x^*)) - f'_\varphi(x,\varphi^*(x^*),\theta^*(x^*)) = \hat f''_{n\varphi\varphi}(x,\varphi^*(x^*),\theta^*(x^*))\Delta_\varphi + \hat f''_{n\varphi\theta}(x,\varphi^*(x^*),\theta^*(x^*))\Delta_\theta + O(\Delta_\varphi^2 + \Delta_\theta^2)$$
and its analogue in θ. Regarding the second-order derivatives of f we have, again by GG Proposition 3.1,
$$\Big\|\frac{\partial^2}{\partial\varphi^2}f - \frac{\partial^2}{\partial\varphi^2}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^7}}\bigg) \quad\text{and analogously}\quad \Big\|\frac{\partial^2}{\partial\theta^2}f - \frac{\partial^2}{\partial\theta^2}\hat f_n\Big\|_\infty = O\bigg(\sqrt{\frac{\log(1/h)}{nN^2h^7}}\bigg) \quad\text{a.s.},$$
so that the terms governing
$$O\Big(\Delta_\varphi^2 + \Delta_\theta^2 + \Delta_\varphi\Big|\frac{\partial^2}{\partial\varphi^2}(\hat f - f)\Big| + \Delta_\theta\Big|\frac{\partial^2}{\partial\theta^2}(\hat f - f)\Big| + (\Delta_\varphi + \Delta_\theta)\Big|\frac{\partial^2}{\partial\varphi\partial\theta}(\hat f - f)\Big|\Big)$$
have almost sure convergence to zero.
Thus, as $n \to \infty$,
$$\sup_{t\in[0,T]}|\hat x^*_n(t) - x^*(t)| \xrightarrow{P} 0. \qquad\Box$$

We now prove the asymptotic normality of the normalized deviation process $\hat Z_n(t)$ in the space C[0,T]. First, we have to establish the convergence of the finite-dimensional distributions of this process for general functions. To this end we will establish that the CLT holds by checking Lyapunov's condition, and then prove asymptotic equicontinuity of the process $\hat Z_n(t)$.

Theorem 3. Under case (I),
$$n(\hat X_n(t) - x(t)) \overset{D}{\Rightarrow} G(t)$$
in the space C[0,T], where the Gaussian process G has mean function μ(t) and covariance function C(t₁,t₂) introduced in the next section. Under case (II),
$$n^{1/5}N^{2/5}(\hat X_n(t) - x(t)) \overset{D}{\Rightarrow} G_0(t)$$
in C[0,T], where the centered Gaussian process G₀ has covariance function C₀(t₁,t₂) introduced in the next section.

Under the assumption $h_n = h_\varphi = h_\theta =: h$ one can write
$$\hat Z_n(t) = -\frac{1}{nh_n^6}\sum_{k=1}^{n}\int_0^t U(t,s)M(s)F^{-1}(x^*(s))\sum_{m=0}^{2N_\theta}\sum_{l=0}^{2N_\varphi}a_mb_l\,K\Big(\frac{x^*(s)-X_k}{h}\Big)\begin{pmatrix}K'_\varphi\big(\frac{\varphi(x^*(s))-\varphi_l}{h}\big)K_\theta\big(\frac{\theta(x^*(s))-\theta_m}{h}\big) \\ K_\varphi\big(\frac{\varphi(x^*(s))-\varphi_l}{h}\big)K'_\theta\big(\frac{\theta(x^*(s))-\theta_m}{h}\big)\end{pmatrix}\big[f(X_k,\varphi_l,\theta_m) + S(X_k,\varphi_l,\theta_m)\varepsilon_{klm}\big]\,ds =: \frac{1}{n}\sum_{k=1}^{n}\eta_{n,k}.$$

6.4.1 Lyapunov's Condition

For case (I) one has to establish the following convergence:
$$\sum_{k=1}^{n}E|\eta_{n,k} - E\eta_{n,k}|^4 \to 0,$$
which is Lyapunov's condition for the process $n\hat Z_n$. For case (II) one has to establish the following convergence:
$$\big[n^{-4/5}N^{2/5}\big]^4\sum_{k=1}^{n}E|\eta_{n,k} - E\eta_{n,k}|^4 \to 0,$$
which is Lyapunov's condition for the process $n^{1/5}N^{2/5}\hat Z_n$.

Consider $E|\eta_{n,k} - E\eta_{n,k}|^4$. It is bounded by $h^{-24}$ times the expectation of the square of a double integral over $s_1, s_2 \in [0,t]$ of the centered kernel products
$$\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\Big\{K\Big(\frac{x^*(s_1)-X_k}{h}\Big)\big(f(X_k,\varphi_{l_1},\theta_{m_1}) + S(X_k,\varphi_{l_1},\theta_{m_1})\varepsilon_{kl_1m_1}\big)\cdot(\text{kernel pair at } s_1) - E[\cdots]\Big\}\Big\{\text{same at } s_2\Big\}B(s_1,t_1,s_2,t_2),$$
where
$$B(s_1,t_1,s_2,t_2) = (F^{-1}(x^*(s_1)))^*M^*(x^*(s_1))U^*(t_1,s_1)\,U(t_2,s_2)M(x^*(s_2))F^{-1}(x^*(s_2))$$
is a 2 × 2 matrix. The expression multiplying the Y's is a linear combination of 4 summands of the type
$$K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_2}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_2}}{h}\Big)$$
with bounded coefficients. Here $K^{(r)}$ stands for the r-th order derivative. One gets all 16 summands for the squared expression, since $r_1, r_2, r_3, r_4$ take values 0 and 1, and they are all bounded in the same way. So it is enough to consider one of them.
Thus, $E|\eta_{n,k} - E\eta_{n,k}|^4$ is bounded by
$$\frac{C}{h^{24}}E\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\sum_{m_1,\dots,m_4=0}^{2N_\theta}\sum_{l_1,\dots,l_4=0}^{2N_\varphi}\prod_{i=1}^{4}a_{m_i}b_{l_i}\,K^{(r_i)}_\theta\Big(\frac{\theta(x^*(s_i))-\theta_{m_i}}{h}\Big)K^{(1-r_i)}_\varphi\Big(\frac{\varphi(x^*(s_i))-\varphi_{l_i}}{h}\Big)$$
$$\times\Bigg[\prod_{i=1}^{4}K\Big(\frac{x^*(s_i)-X_k}{h}\Big)S(X_k,\varphi_{l_i},\theta_{m_i})\,\big(\delta_{l_1=l_2=l_3=l_4,\,m_1=m_2=m_3=m_4} + 6\,\delta_{l_1=l_2\neq l_3=l_4,\,m_1=m_2\neq m_3=m_4}\big)$$
$$\quad + 6\,K\Big(\frac{x^*(s_1)-X_k}{h}\Big)S(X_k,\varphi_{l_1},\theta_{m_1})\,K\Big(\frac{x^*(s_2)-X_k}{h}\Big)S(X_k,\varphi_{l_2},\theta_{m_2})\,\delta_{l_1=l_2,m_1=m_2}$$
$$\qquad\times\Big[K\Big(\frac{x^*(s_3)-X_k}{h}\Big)f(X_k,\varphi_{l_3},\theta_{m_3}) - EK\Big(\frac{x^*(s_3)-X_k}{h}\Big)f(X_k,\varphi_{l_3},\theta_{m_3})\Big]\Big[K\Big(\frac{x^*(s_4)-X_k}{h}\Big)f(X_k,\varphi_{l_4},\theta_{m_4}) - EK\Big(\frac{x^*(s_4)-X_k}{h}\Big)f(X_k,\varphi_{l_4},\theta_{m_4})\Big]\delta_{l_3=l_4,m_3=m_4}$$
$$\quad + \prod_{i=1}^{4}\Big[K\Big(\frac{x^*(s_i)-X_k}{h}\Big)f(X_k,\varphi_{l_i},\theta_{m_i}) - EK\Big(\frac{x^*(s_i)-X_k}{h}\Big)f(X_k,\varphi_{l_i},\theta_{m_i})\Big]\Bigg]\,ds_1ds_2ds_3ds_4.$$
This is further bounded by
$$\frac{Ch^3}{h^{24}}\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a^2_{m_1}b^2_{l_1}a^2_{m_2}b^2_{l_2}\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_1}}{h}\Big)K^{(r_3)}_\theta\Big(\frac{\theta(x^*(s_3))-\theta_{m_2}}{h}\Big)K^{(r_4)}_\theta\Big(\frac{\theta(x^*(s_4))-\theta_{m_2}}{h}\Big)$$
$$\times K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_1}}{h}\Big)K^{(1-r_3)}_\varphi\Big(\frac{\varphi(x^*(s_3))-\varphi_{l_2}}{h}\Big)K^{(1-r_4)}_\varphi\Big(\frac{\varphi(x^*(s_4))-\varphi_{l_2}}{h}\Big)$$
$$\times\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{h},\frac{x^*(s_3)-x^*(s_1)}{h},\frac{x^*(s_4)-x^*(s_1)}{h}\Big)\,f^2(x^*(s_1),\varphi_{l_1},\theta_{m_1})\big[f^2(x^*(s_1),\varphi_{l_2},\theta_{m_2}) + S^2(x^*(s_1),\varphi_{l_2},\theta_{m_2})\big]\,ds_1ds_2ds_3ds_4,$$
where
$$\Psi_4(z_1,z_2,z_3) = \int K(u)K(u+z_1)K(u+z_2)K(u+z_3)\,du.$$
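The fourth-order convolution Ψ₄ is evaluated the same way as Ψ. A minimal quadrature sketch for a Gaussian kernel (one-dimensional components, since the 3D case factorizes; the arguments are arbitrary test values):

```python
import numpy as np

# Psi4(z1, z2, z3) = integral of K(u) K(u+z1) K(u+z2) K(u+z3) du,
# evaluated by 1D quadrature for a Gaussian kernel.
K = lambda u: np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)

def psi4_1d(z1, z2, z3, grid):
    du = grid[1] - grid[0]
    return np.sum(K(grid) * K(grid + z1) * K(grid + z2) * K(grid + z3)) * du

grid = np.linspace(-10, 10, 20001)
print(psi4_1d(0.2, -0.5, 0.9, grid))
```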
Now using Simpson's scheme and Proposition 5 we bound the above by
$$\frac{Ch^3}{h^{24}N^4}\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\int_{\theta(x^*(s_1))-ch}^{\theta(x^*(s_1))+ch}\int_{\theta(x^*(s_2))-ch}^{\theta(x^*(s_2))+ch}\int_{\varphi(x^*(s_1))-ch}^{\varphi(x^*(s_1))+ch}\int_{\varphi(x^*(s_2))-ch}^{\varphi(x^*(s_2))+ch}K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_1}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_1}{h}\Big)K^{(r_3)}_\theta\Big(\frac{\theta(x^*(s_3))-\theta_2}{h}\Big)K^{(r_4)}_\theta\Big(\frac{\theta(x^*(s_4))-\theta_2}{h}\Big)$$
$$\times K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_1}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_1}{h}\Big)K^{(1-r_3)}_\varphi\Big(\frac{\varphi(x^*(s_3))-\varphi_2}{h}\Big)K^{(1-r_4)}_\varphi\Big(\frac{\varphi(x^*(s_4))-\varphi_2}{h}\Big)$$
$$\times\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{h},\frac{x^*(s_3)-x^*(s_1)}{h},\frac{x^*(s_4)-x^*(s_1)}{h}\Big)\,f^2(x^*(s_1),\varphi_1,\theta_1)\big[f^2(x^*(s_1),\varphi_2,\theta_2) + S^2(x^*(s_1),\varphi_2,\theta_2)\big]\,d\varphi_1d\varphi_2d\theta_1d\theta_2\,ds_1ds_2ds_3ds_4.$$
Then by a change of variables we have
$$\frac{Ch^7}{h^{24}N^4}\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\int_{-c}^{c}\!\!\int_{-c}^{c}\!\!\int_{-c}^{c}\!\!\int_{-c}^{c}K^{(r_1)}_\theta(v_1)K^{(r_2)}_\theta\Big(v_1 + \frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Big)K^{(r_3)}_\theta(v_2)K^{(r_4)}_\theta\Big(v_2 + \frac{\theta(x^*(s_4))-\theta(x^*(s_3))}{h}\Big)$$
$$\times K^{(1-r_1)}_\varphi(u_1)K^{(1-r_2)}_\varphi\Big(u_1 + \frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Big)K^{(1-r_3)}_\varphi(u_2)K^{(1-r_4)}_\varphi\Big(u_2 + \frac{\varphi(x^*(s_4))-\varphi(x^*(s_3))}{h}\Big)\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{h},\frac{x^*(s_3)-x^*(s_1)}{h},\frac{x^*(s_4)-x^*(s_1)}{h}\Big)$$
$$\times f^2(x^*(s_1),\varphi(x^*(s_1))-hu_1,\theta(x^*(s_1))-hv_1)\big[f^2(x^*(s_1),\varphi(x^*(s_3))-hu_2,\theta(x^*(s_3))-hv_2) + S^2(x^*(s_1),\varphi(x^*(s_3))-hu_2,\theta(x^*(s_3))-hv_2)\big]\,du_1du_2dv_1dv_2\,ds_1ds_2ds_3ds_4.$$
By a linear approximation, and defining
$$\Psi_{\theta,r_1,r_2}(z) := \int K^{(r_1)}_\theta(u)K^{(r_2)}_\theta(u+z)\,du$$
(and analogously $\Psi_{\varphi,r_1,r_2}$), this becomes
$$\frac{Ch^7}{h^{24}N^4}\int_0^t\!\!\int_0^t\!\!\int_0^t\!\!\int_0^t\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{h},\frac{x^*(s_3)-x^*(s_1)}{h},\frac{x^*(s_4)-x^*(s_1)}{h}\Big)\Psi_{\theta,r_1,r_2}\Big(\frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Big)\Psi_{\theta,r_3,r_4}\Big(\frac{\theta(x^*(s_4))-\theta(x^*(s_3))}{h}\Big)$$
$$\times\Psi_{\varphi,r_1,r_2}\Big(\frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Big)\Psi_{\varphi,r_3,r_4}\Big(\frac{\varphi(x^*(s_4))-\varphi(x^*(s_3))}{h}\Big)f^2(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1)))\big[f^2(x^*(s_1),\varphi(x^*(s_3)),\theta(x^*(s_3))) + S^2(x^*(s_1),\varphi(x^*(s_3)),\theta(x^*(s_3)))\big](1+o(h))\,ds_1ds_2ds_3ds_4.$$
By the change of variables $s_i = s_1 + \tau_ih$, i = 2, 3, 4, the above becomes
$$\frac{Ch^{10}(1+o(1))}{h^{24}N^4}\int_0^t\!\!\int\!\!\int\!\!\int f^2(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1)))\big[f^2(x^*(s_1),\varphi(x^*(s_3)),\theta(x^*(s_3))) + S^2(x^*(s_1),\varphi(x^*(s_3)),\theta(x^*(s_3)))\big]$$
$$\times\Psi_4\Big(\frac{x^*(s_2)-x^*(s_1)}{s_2-s_1}\tau_2, \frac{x^*(s_3)-x^*(s_1)}{s_3-s_1}\tau_3, \frac{x^*(s_4)-x^*(s_1)}{s_4-s_1}\tau_4\Big)\Psi_{\theta,r_1,r_2}\Big(\frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{s_2-s_1}\tau_2\Big)\Psi_{\theta,r_3,r_4}\Big(\frac{\theta(x^*(s_4))-\theta(x^*(s_3))}{s_4-s_3}(\tau_4-\tau_3)\Big)$$
$$\times\Psi_{\varphi,r_1,r_2}\Big(\frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{s_2-s_1}\tau_2\Big)\Psi_{\varphi,r_3,r_4}\Big(\frac{\varphi(x^*(s_4))-\varphi(x^*(s_3))}{s_4-s_3}(\tau_4-\tau_3)\Big)\,ds_1d\tau_2d\tau_3d\tau_4.$$
As h → 0, by the boundedness of the integrands on their bounded support and their continuity on that support, we have by the dominated convergence theorem
$$\int\!\!\int\!\!\int\Psi_4\big(v(x^*(s_1))\tau_2, v(x^*(s_1))\tau_3, v(x^*(s_1))\tau_4\big)\Psi_{\theta,r_1,r_2}\Big(\frac{d}{dt}\theta(x^*(s_1))\tau_2\Big)\Psi_{\theta,r_3,r_4}\Big(\frac{d}{dt}\theta(x^*(s_3))(\tau_4-\tau_3)\Big)\Psi_{\varphi,r_1,r_2}\Big(\frac{d}{dt}\varphi(x^*(s_1))\tau_2\Big)\Psi_{\varphi,r_3,r_4}\Big(\frac{d}{dt}\varphi(x^*(s_3))(\tau_4-\tau_3)\Big)\,d\tau_2d\tau_3d\tau_4 \tag{6.6}$$
as the limit of the inner integrals. Therefore
$$E|\eta_{n,k} - E\eta_{n,k}|^4 \le \frac{Ch^{10}(1+o(1))}{h^{24}N^4} =: \frac{C}{nM_n}, \tag{6.7}$$
and so
$$\sum_{k=1}^{n}E|\eta_{n,k} - E\eta_{n,k}|^4 \le \frac{C}{M_n} \to 0.$$
For case (II) we have $h = O(N^{-1/5}n^{-1/10})$ and $N^2 = n^2\alpha_n$ with $\alpha_n \to 0$.
Then, following the calculation done above (where h was arbitrary),
$$E|\eta_{n,k} - E\eta_{n,k}|^4 \le \frac{Ch^{10}(1+o(1))}{h^{24}N^4} = \frac{C(1+o(1))}{(n^2\alpha_n)^{6/5}n^{-7/5}} = \frac{C(1+o(1))}{n\,\alpha_n^{6/5}} \tag{6.8}$$
and
$$\big[n^{-4/5}N^{2/5}\big]^4\sum_{k=1}^{n}E|\eta_{n,k} - E\eta_{n,k}|^4 \le C\,n^{-8/5}\alpha_n^{-2/5} \to 0,$$
since $\alpha_n \ge n^{-2}$.

6.4.2 Asymptotic Equicontinuity of the Process $\hat Z_n$

Define $\hat Z_n(t) := W_n(g_t(\cdot))$, where
$$g_t(s) = I_{[0,t]}(s)\,U(t,s)M(s)F^{-1}(x^*(s)),$$
so that
$$\hat Z_n(t) = -\int g_t(s)\begin{pmatrix}\frac{\partial}{\partial\varphi}(\hat f_n - f) \\ \frac{\partial}{\partial\theta}(\hat f_n - f)\end{pmatrix}\Bigg|_{x^*(s)}ds,$$
and write $\Xi_n(t) := n(\hat Z_n(t) - E\hat Z_n(t))$. Then
$$E|n\hat Z_n(t) - nE\hat Z_n(t)|^4 = E\Big|\sum_{k=1}^{n}(\eta_{n,k} - E\eta_{n,k})\Big|^4 = n(n-1)E^2|\eta_{n,k} - E\eta_{n,k}|^2 + nE|\eta_{n,k} - E\eta_{n,k}|^4, \tag{6.9}$$
where the last equality holds since the $\eta_{n,k}$ are row-wise independent. Consider the above with an arbitrary g:
$$E|nW_n(g) - nEW_n(g)|^4 \le \frac{1}{h^{12}}\bigg|\int\!\!\int\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_2}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_2}}{h}\Big)$$
$$\qquad\times\mathrm{Cov}\Big(K\Big(\frac{x^*(s_1)-X_k}{h}\Big)Y_{kl_1m_1},\ K\Big(\frac{x^*(s_2)-X_k}{h}\Big)Y_{kl_2m_2}\Big)\,G^{(i,j)}(s_1,s_2)\,ds_1ds_2\bigg|$$
for some (i,j) and $r_1, r_2 \in \{0,1\}$; here $g^*(s_1)g(s_2) = G(s_1,s_2)$ and $G^{(i,j)}(s_1,s_2)$ are the components of G. From the previous covariance calculation,
$$\mathrm{Cov}\Big(K\Big(\frac{x^*(s_1)-X_k}{h}\Big)Y_{kl_1m_1},\ K\Big(\frac{x^*(s_2)-X_k}{h}\Big)Y_{kl_2m_2}\Big) = (1+O(h))h^3\Big[\Psi\Big(\frac{x^*(s_1)-x^*(s_2)}{h}\Big)f(x^*(s_1),\varphi_{l_1},\theta_{m_1})f(x^*(s_2),\varphi_{l_2},\theta_{m_2}) + S(x^*(s_1),\varphi_{l_1},\theta_{m_1})S(x^*(s_2),\varphi_{l_1},\theta_{m_1})I_{\{m_1=m_2,\,l_1=l_2\}}\Big].$$
Thus $E|nW_n(g) - nEW_n(g)|^4$ is bounded above by
$$\int\!\!\int\frac{(1+O(h))h^3}{h^{12}}\Psi\Big(\frac{x^*(s_1)-x^*(s_2)}{h}\Big)\bigg[\sum_{m_1,m_2}\sum_{l_1,l_2}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\,K^{(r_1)}_\theta(\cdot)K^{(1-r_1)}_\varphi(\cdot)K^{(r_2)}_\theta(\cdot)K^{(1-r_2)}_\varphi(\cdot)\Big(f(x^*(s_1),\varphi_{l_1},\theta_{m_1})f(x^*(s_2),\varphi_{l_2},\theta_{m_2}) + S(x^*(s_1),\varphi_{l_1},\theta_{m_1})S(x^*(s_2),\varphi_{l_1},\theta_{m_1})I_{\{m_1=m_2,\,l_1=l_2\}}\Big)\bigg]G^{(i,j)}(s_1,s_2)\,ds_1ds_2.$$
First consider the sum
$$\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\,f(x^*(s_1),\varphi_{l_1},\theta_{m_1})f(x^*(s_2),\varphi_{l_2},\theta_{m_2})\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_2}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_2}}{h}\Big),$$
which factorizes, so consider just
$$\sum_{m_1=0}^{2N_\theta}\sum_{l_1=0}^{2N_\varphi}a_{m_1}b_{l_1}\,f(x^*(s_1),\varphi_{l_1},\theta_{m_1})\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big).$$
The error term in Simpson's method will be absorbed into the leading (1+O(h)) factor above and will be ignored. The sum is approximated by
$$\int_{\theta(x^*(s_1))-ch}^{\theta(x^*(s_1))+ch}\int_{\varphi(x^*(s_1))-ch}^{\varphi(x^*(s_1))+ch}f(x^*(s_1),v_1,u_1)\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-u_1}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-v_1}{h}\Big)\,du_1dv_1$$
$$= h^2\int_{-c}^{c}\int_{-c}^{c}f\big(x^*(s_1),\varphi(x^*(s_1))-\rho_1h,\theta(x^*(s_1))-\tau_1h\big)\,K^{(r_1)}_\theta(\tau_1)K^{(1-r_1)}_\varphi(\rho_1)\,d\tau_1d\rho_1,$$
which, by a Taylor approximation, equals
$$h^5\Big[\tfrac12 f'''_{\varphi\theta\theta}\int\tau_1^2K_\theta(\tau_1)\,d\tau_1\int\rho_1K'_\varphi(\rho_1)\,d\rho_1 + \tfrac16 f'''_{\varphi\varphi\varphi}\int\rho_1^3K'_\varphi(\rho_1)\,d\rho_1\int K_\theta(\tau_1)\,d\tau_1\Big]$$
or
$$h^5\Big[\tfrac12 f'''_{\theta\varphi\varphi}\int\rho_1^2K_\varphi(\rho_1)\,d\rho_1\int\tau_1K'_\theta(\tau_1)\,d\tau_1 + \tfrac16 f'''_{\theta\theta\theta}\int\tau_1^3K'_\theta(\tau_1)\,d\tau_1\int K_\varphi(\rho_1)\,d\rho_1\Big]$$
if $r_1 = 0$ or $r_1 = 1$, respectively. Call it $h^5C(x^*(s_1), r_1)$. Here we have used the Taylor approximation together with the fact that
$$f'_\varphi(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1))) = f'_\theta(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1))) = 0.$$
Then
$$\sum_{m_1,m_2=0}^{2N_\theta}\sum_{l_1,l_2=0}^{2N_\varphi}a_{m_1}b_{l_1}a_{m_2}b_{l_2}\,f(x^*(s_1),\varphi_{l_1},\theta_{m_1})f(x^*(s_2),\varphi_{l_2},\theta_{m_2})\,K^{(r_1)}_\theta(\cdot)K^{(1-r_1)}_\varphi(\cdot)K^{(r_2)}_\theta(\cdot)K^{(1-r_2)}_\varphi(\cdot) = h^{10}C(x^*(s_1),r_1)C(x^*(s_2),r_2) + O(N^{-4}). \tag{6.10}$$
Now consider
$$\sum_{m_1=0}^{2N_\theta}\sum_{l_1=0}^{2N_\varphi}a^2_{m_1}b^2_{l_1}\,S(x^*(s_1),\varphi_{l_1},\theta_{m_1})S(x^*(s_2),\varphi_{l_1},\theta_{m_1})\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-\theta_{m_1}}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-\theta_{m_1}}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-\varphi_{l_1}}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-\varphi_{l_1}}{h}\Big)$$
$$= \frac{(1+O(h))}{N^2}\int_{\theta(x^*(s_1))-ch}^{\theta(x^*(s_1))+ch}\int_{\varphi(x^*(s_1))-ch}^{\varphi(x^*(s_1))+ch}S(x^*(s_1),v_1,u_1)S(x^*(s_2),v_1,u_1)\,K^{(r_1)}_\theta\Big(\frac{\theta(x^*(s_1))-u_1}{h}\Big)K^{(r_2)}_\theta\Big(\frac{\theta(x^*(s_2))-u_1}{h}\Big)K^{(1-r_1)}_\varphi\Big(\frac{\varphi(x^*(s_1))-v_1}{h}\Big)K^{(1-r_2)}_\varphi\Big(\frac{\varphi(x^*(s_2))-v_1}{h}\Big)du_1dv_1\,(1+O(N^{-1}))$$
$$= \frac{(1+O(h))h^2}{N^2}\int_{-c}^{c}\int_{-c}^{c}K^{(r_1)}_\theta(q_1)K^{(r_2)}_\theta\Big(q_1 + \frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Big)K^{(1-r_1)}_\varphi(p_1)K^{(1-r_2)}_\varphi\Big(p_1 + \frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Big)$$
$$\qquad\times S(x^*(s_1),\varphi(x^*(s_1))-hp_1,\theta(x^*(s_1))-hq_1)\,S(x^*(s_2),\varphi(x^*(s_1))-hp_1,\theta(x^*(s_1))-hq_1)\,dq_1dp_1$$
$$= \frac{(1+O(h))h^2}{N^2}\,\Psi_{\theta,r_1,r_2}\Big(\frac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Big)\Psi_{\varphi,r_1,r_2}\Big(\frac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Big)\,S(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1)))\,S(x^*(s_2),\varphi(x^*(s_1)),\theta(x^*(s_1))). \tag{6.11}$$
Thus $E|nW_n(g) - nEW_n(g)|^4$ is bounded above by
Combining (6.10) and (6.11), $E|nW_n(g) - nEW_n(g)|^4$ is bounded above by
$$\frac{(1+O(h))h^3}{h^{12}}\int\!\!\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_2)}{h}\Bigr)\biggl[h^{10}\,C(x^*(s_1),r_1)\,C(x^*(s_2),r_2) + \frac{(1+O(h))h^2}{N^2}\, S\bigl(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1))\bigr)\, S\bigl(x^*(s_2),\varphi(x^*(s_1)),\theta(x^*(s_1))\bigr)\, \Psi_{\theta,r_1,r_2}\!\Bigl(\tfrac{\theta(x^*(s_2))-\theta(x^*(s_1))}{h}\Bigr)\Psi_{\varphi,r_1,r_2}\!\Bigl(\tfrac{\varphi(x^*(s_2))-\varphi(x^*(s_1))}{h}\Bigr)\biggr] G^{(i,j)}(s_1,s_2)\, ds_1\, ds_2.$$
The change of variable $s_2 = s_1 + \tau h$ yields
$$\frac{(1+O(h))h^4}{h^{12}}\int\!\!\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_1+\tau h)}{s_2-s_1}\,\tau\Bigr)\biggl[h^{10}\,C(x^*(s_1),r_1)\,C(x^*(s_1+\tau h),r_2) + \frac{(1+O(h))h^2}{N^2}\, S\bigl(x^*(s_1),\varphi(x^*(s_1)),\theta(x^*(s_1))\bigr)\, S\bigl(x^*(s_1+\tau h),\varphi(x^*(s_1)),\theta(x^*(s_1))\bigr)\, \Psi_{\theta,r_1,r_2}\!\Bigl(\tfrac{\theta(x^*(s_1+\tau h))-\theta(x^*(s_1))}{s_2-s_1}\,\tau\Bigr)\Psi_{\varphi,r_1,r_2}\!\Bigl(\tfrac{\varphi(x^*(s_1+\tau h))-\varphi(x^*(s_1))}{s_2-s_1}\,\tau\Bigr)\biggr] G^{(i,j)}(s_1,s_1+\tau h)\, ds_1\, d\tau$$
$$\le \frac{K(1+O(h))}{n}\int\!\!\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_1+\tau h)}{s_2-s_1}\,\tau\Bigr) G^{(i,j)}(s_1,s_1+\tau h)\, ds_1\, d\tau = \frac{K(1+O(h))}{n}\int\!\!\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_1+\tau h)}{s_2-s_1}\,\tau\Bigr) \sum_{k=1}^{3} g_{ik}(s_1)\, g_{jk}(s_1+\tau h)\, ds_1\, d\tau$$
$$\le \frac{K(1+O(h))}{n}\sum_{k=1}^{3}\int \Psi\Bigl(\frac{x^*(s_1)-x^*(s_1+\tau h)}{s_2-s_1}\,\tau\Bigr)\Bigl(\int |g_{ik}(s_1)|^2 ds_1\Bigr)^{1/2}\Bigl(\int |g_{jk}(s_1+\tau h)|^2 ds_1\Bigr)^{1/2} d\tau \le \frac{C(1+O(h))}{n}\sum_{k=1}^{3}\int |g_{ik}(s_1)|^2 ds_1 = \frac{K^*(1+O(h))}{n}\int |g_{ip}(s_1)|^2 ds_1$$
for some $p \in \{1,2,3\}$ and some large enough constant $K$ to bound both the first and second terms in brackets. Thus
$$E|nW_n(g) - nEW_n(g)|^4 \le \Biggl[n(n-1)\biggl(\frac{C(1+O(h))}{n}\int |g_{ip}(s_1)|^2 ds_1\biggr)^2 + n\,\frac{C}{nM_n}\Biggr].$$
Since $E|\eta_{n,k} - E\eta_{n,k}|^4 \le \frac{C}{nM_n}$ and $g_t(s)$ is Lipschitz and bounded, we can find an $M > 0$ such that, with $g := g_{t_1} - g_{t_2}$, for its components
$$\int |g_{t_1,ip}(s_1) - g_{t_2,ip}(s_1)|^2\, ds_1 \le C|t_1 - t_2|.$$
Then
$$E\bigl|nW_n(g_{t_1} - g_{t_2}) - nEW_n(g_{t_1} - g_{t_2})\bigr|^4 = E\bigl|n\bigl(\hat{Z}_n(t_1) - E\hat{Z}_n(t_1)\bigr) - n\bigl(\hat{Z}_n(t_2) - E\hat{Z}_n(t_2)\bigr)\bigr|^4 = E\bigl|\Xi_n(t_1) - \Xi_n(t_2)\bigr|^4 \le C\bigl[|t_1 - t_2|^2 + K\bigr],$$
where $\Xi_n(t) := n(\hat{Z}_n(t) - E\hat{Z}_n(t))$. When $t_1 = t_2$, $K$ may be taken to be zero. If $|t_1 - t_2| > 0$, choose a constant $C$ large enough so that $K < C|t_1 - t_2|$. Therefore
$$E\bigl|\Xi_n(t_1) - \Xi_n(t_2)\bigr|^4 \le C|t_1 - t_2|^2.$$
To establish the asymptotic equicontinuity of the process $n(\hat{Z}_n(t) - E\hat{Z}_n(t))$, we will apply the following two lemmas from [16]. Here $\|\cdot\|_\psi = \|\cdot\|_4$. In terms of the above, we have
$$\|\Xi_n(t_1) - \Xi_n(t_2)\|_\psi \le C^{1/2}|t_1 - t_2|^{1/2} = K\, d(t_1, t_2).$$
For the semi-metric $d$, a ball of radius $\epsilon$ is the interval $[t - \epsilon^2, t + \epsilon^2]$ for each $t \in T$. Then we have $N(\epsilon, d) = \frac{T}{2\epsilon^2}$, the number of balls of radius $\epsilon$ needed to cover $T$. Note that since $\psi(x) = x^4$, $\psi^{-1}(x) = x^{1/4}$. To find the integral above, we note that
$$N(\epsilon, d) \le D(\epsilon, d) \le N\Bigl(\frac{\epsilon}{2}, d\Bigr).$$
As is shown in [16], the idea is to define a nested sequence of maximally separated subsets $S_0 \subset S_1 \subset \cdots \subset T$, with $d(s,t) > \eta 2^{-n}$ for $s, t \in S_n$, and to bound the maximum of the process on $S_n$, where the second maximum is taken over all $(s,t) \in S_n$ with $d(s,t) \le \delta$ whose chains end at a unique pair $s_0, t_0 \in S_0$.
The second lemma bounds the second term:
$$\psi^{-1}\bigl(D^2(\eta, d)\bigr)\max_{\substack{s \in S_n \\ t \in S_{n-1}}} \|\Xi_n(s) - \Xi_n(t)\|_\psi \le K\Bigl(\frac{2T}{\eta^2}\Bigr)^{1/2}\delta.$$
The integral can be bounded above by
$$\int_0^\eta \psi^{-1}\bigl(D(\epsilon, d)\bigr)\, d\epsilon \le \int_0^\eta \Bigl(\frac{2T}{\epsilon^2}\Bigr)^{1/4} d\epsilon.$$
Thus, given any $r > 0$, by Markov's inequality and the maximal inequality in the theorem,
$$P\Bigl(\sup_{|t_1 - t_2| \le \delta}\bigl|\Xi_n(t_1) - \Xi_n(t_2)\bigr| > r\Bigr) \le \frac{K}{r^4}\biggl[\int_0^\eta \Bigl(\frac{2T}{\epsilon^2}\Bigr)^{1/4} d\epsilon + \delta\Bigl(\frac{2T}{\eta^2}\Bigr)^{1/2}\biggr] = \frac{2K(2T)^{1/4}\eta^{1/2}}{r^4} + \frac{\sqrt{2T}\,\delta K}{r^4\,\eta}.$$
Choosing $\eta$ arbitrarily small, we have established the asymptotic equicontinuity of the process $n(\hat{Z}_n(t) - E\hat{Z}_n(t))$.

CHAPTER 7

CONCLUSION

The proposed methods provide a general mathematical and statistical framework for tractography based on HARDI data. As in the tensor model approach, the main advantage of the approach under consideration is that uncertainty is carried through from the acquired raw data level to the fiber level, as it propagates via our 'signal fields' to vector fields and finally to the integral curves. But unlike the tensor model approach, we do not impose structural assumptions on the diffusion signal, such as a supersymmetric tensor or a positive definite matrix. This approach to tracing curves with surrounding confidence ellipsoids is unique and offers a computationally cheap alternative to probabilistic tractography methods, which are tied to iterative MCMC sampling techniques. This way, the errors from data measurements are followed through the model to the level of axonal fibers in an easy-to-interpret way.

The methods in DTI and HARDI give fiber estimates within $O(n^{-1/3})$ of the true fiber, and they require $O(n)$ operations for Gaussian-type kernels as well as $O(n^2)$ operations for the asymptotic covariance calculation. Furthermore, in DTI and HARDI one has $h_n = O(n^{-1/6})$. Typically the number of locations $n$ in HARDI is on the order of hundreds of thousands or millions. The sampling of the directional space contains at least 6 directions for DTI and between 30 and 150 directions for HARDI; this corresponds to $N^2$ in our model. In the non-parametric scenario this is accommodated by case (II), and then the fiber estimates are within $O(n^{-1}\varepsilon_n^{-2/5})$ of the true fiber, and they require the same number of operations. Our bandwidth is $h_n = O(n^{-1/2}\varepsilon_n^{-1/5})$, where $\varepsilon_n \to 0$. If we take $\varepsilon_n = O(n^{-5/3})$, which corresponds to $N = O(n^{1/3})$, then we obtain the same order for bandwidth and accuracy as the methods based on tensor fields.

The practical downside of our approach is the third step of the implementation, in which we need to solve a complicated optimization problem numerically; a sketch of this step is given below.
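The following Python sketch illustrates the spirit of this optimization step: at a location $x$, we search for the direction $(\varphi, \theta)$ of maximal estimated diffusion on the sphere. Here `fhat` is a hypothetical surrogate for the smoothed signal estimate (our actual implementation uses Matlab's fminsearch, of which SciPy's Nelder-Mead is the analogue), and the random restarts shown are one possible mitigation for the local-trap behavior discussed next; none of this is a definitive implementation.

```python
import numpy as np
from scipy.optimize import minimize

def fhat(angles, x):
    """Hypothetical stand-in for the kernel-smoothed signal estimate
    fhat_n(x, phi, theta); a smooth bump with a single peak."""
    phi, theta = angles
    mu = np.array([np.pi / 3, np.pi / 4])        # hypothetical peak location
    return np.exp(-8 * ((phi - mu[0]) ** 2 + (theta - mu[1]) ** 2))

def max_direction(x, n_starts=12, seed=0):
    """Maximize fhat over (phi, theta) with several random restarts,
    reducing the chance of getting trapped in a local 'minimum well'."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        start = rng.uniform([0.0, 0.0], [2 * np.pi, np.pi])  # random angles
        res = minimize(lambda a: -fhat(a, x), start, method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    return best.x                                 # angles of maximal diffusion

print(max_direction(x=np.zeros(3)))
```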
In our simulation study this step was the bottleneck, and it often introduced a numerical bias that ruined the subsequent statistical estimation. This systematic bias from fminsearch is unexpected and difficult to control through Matlab. The expected bias should be zero; however, the process strays off in a line away from the true curve before regaining the correct estimated maximal direction. Moreover, the bias roughly maintains its initial distance from the curve, rather than switching back and forth as in the HARDI simulation scenarios. At this time we can only speculate as to why the Matlab function cannot escape these 'minimum wells' to find a more maximal direction estimate at the beginning of the curve trace. As a direction of future research, one could investigate the interplay between numerical errors and statistical errors, and balance them to arrive at a practical guide for selecting the tuning parameters in the optimization step.

For future work we will compare, in the same way, the performance of the completely non-parametric method to that of the higher-order HARDI model and the low-order classical DTI model in the C-pattern scenarios of [11]. It is also worth considering how one might build this same modeling method without the assumption of a unique maximal direction at each location, so that branching scenarios can be explored; however, the mathematical reasoning behind this may require an approach quite different from those we have discussed. We may consider the set of all points on a manifold and only require this existence, rather than requiring conditions on the function f under which the unique maximal direction can be found at each point x.

Although the method is imperfect in its inability to handle crossings or branchings of fibers, this study was very fruitful in showing that the data can speak for itself in these noisy MRI data scenarios in which we wish to recover curve estimates for the C pattern. The proof of concept that one may obtain these curve estimates is quite valuable, as the methods of acquiring MRI data continually improve, and with them the noise can be reduced through the gathering of multiple images. In addition, all of the methods could in theory trace a sequence of fibers, although one may be more tedious than another. For example, in the HARDI scenario we can trace multiple curves simultaneously without violating any assumptions. In the methodology discussed here, we could skip branch points and search nearby for dominant diffusion directions outside of a fiber cluster, and still, within reason, estimate a network of fiber trajectories.

Figure 7.1: A fiber across the genu of the corpus callosum with diffusion "blobs" along it.

Figure 7.2: Visualization of diffusion via a nonparametric function using our model.

BIBLIOGRAPHY

[1] Assemlal, H., Tschumperlé, D., Brun, L., Siddiqi, K. (2011) Recent advances in diffusion MRI modeling: Angular and radial reconstruction. Medical Image Analysis 15, 369–396.

[2] Carmichael, O. and Sakhanenko, L. (2016) Integral curves from noisy diffusion MRI data with closed-form uncertainty estimates. Statistical Inference for Stochastic Processes 19(3), 289–319.

[3] Coddington, E.A., Levinson, N. (1955) Theory of Ordinary Differential Equations. McGraw-Hill, New York.
[4] Daducci, A., Canales-Rodríguez, E., Descoteaux, M., Garyfallidis, E., Gur, Y., Lin, Y-C., Mani, M., Merlet, S., Paquette, M., Ramirez-Manzanares, A., Reisert, M., Rodrigues, P., Sepehrband, F., Caruyer, E., Choupan, J., Deriche, R., Jacob, M., Menegaz, G., Prčkovska, V., Rivera, M., Wiaux, Y., Thiran, J-P. (2013) Quantitative comparison of reconstruction methods for intra-voxel fiber recovery from diffusion MRI. IEEE Proc.

[5] Koltchinskii, V., Sakhanenko, L. and Cai, S. (2007) Integral curves of noisy vector fields and statistical problems in diffusion tensor imaging: Nonparametric kernel estimation and hypotheses testing. Ann. Statist. 35, 1576–1607.

[6] Sakhanenko, L. (2012) Numerical issues in estimation of integral curves from noisy diffusion tensor data. Statistics & Probability Letters 82, 1136–1144.

[7] Sakhanenko, L., DeLaura, M. (2019) Supplement to "Fully nonparametric model for a tensor field based on HARDI".

[8] Assemlal, H.-E., Tschumperlé, D., Brun, L. and Siddiqi, K. (2011) Recent advances in diffusion MRI modeling: Angular and radial reconstruction. Medical Image Analysis 15, 369–396.

[9] Carmichael, O. and Sakhanenko, L. (2015) Estimation of integral curves from high angular resolution diffusion imaging (HARDI) data. Linear Algebra and its Applications 473, 377–403.

[10] Sakhanenko, L. (2015) Using the Tractometer to assess performance of a new statistical tractography technique. Journal of Nature and Science 1(7): e130, 1–12. http://www.jnsci.org/content/130.

[11] Sakhanenko, L. and DeLaura, M. (2017) A comparison study of statistical tractography methodologies for Diffusion Tensor Imaging. International Journal of Statistics: Advances in Theory and Applications 1(1), 93–110.

[12] Assemlal, H., Tschumperlé, D., and Siddiqi, K. (2011) Recent advances in diffusion MRI modeling: Angular and radial reconstruction. Medical Image Analysis 15, 369–396.

[13] Ying, L., Zou, Y., Klemer, D., and Wang, J. (2007) Determination of fiber orientation in MRI diffusion tensor imaging based on higher-order tensor decomposition. Proceedings of the 29th Annual International Conference of the IEEE EMBS, 2065–2068.

[14] Giné, E., Guillou, A. (2002) Rates of strong uniform consistency for multivariate kernel density estimators. Annales de l'I.H.P. Probabilités et Statistiques 38(6), 907–921.

[15] Efromovich, S. (1999) Nonparametric Curve Estimation: Methods, Theory, and Applications. Springer, New York.

[16] van der Vaart, A.W., Wellner, J.A. (1996) Weak Convergence and Empirical Processes. Springer Series in Statistics, Springer, New York, pp. 95–104.

[17] Coddington, E.A., Levinson, N. (1955) Theory of Ordinary Differential Equations. McGraw-Hill, New York.

[18] Basser, P.J., Mattiello, J., LeBihan, D. (1994) MR diffusion tensor spectroscopy and imaging. Biophys J 66, 259–267.

[19] Behrens, T.E., Berg, H.J., Jbabdi, S., Rushworth, M.F., Woolrich, M.W. (2007) Probabilistic diffusion tractography with multiple fibre orientations: What can we gain? Neuroimage 34, 144–155.

[20] Chang, S.E., Zhu, D.C. (2013) Neural network connectivity differences in children who stutter. Brain 136, 3709–3726.

[21] Le Bihan, D., Mangin, J.F., Poupon, C., Clark, C.A., Pappata, S., Molko, N., Chabriat, H. (2001) Diffusion tensor imaging: concepts and applications. J Magn Reson Imaging 13, 534–546.

[22] Mori, S., Kaufmann, W.E., Davatzikos, C., Stieltjes, B., Amodei, L., Fredericksen, K., Pearlson, G.D., Melhem, E.R., Solaiyappan, M., Raymond, G.V., Moser, H.W., van Zijl, P.C. (2002) Imaging cortical association tracts in the human brain using diffusion-tensor-based axonal tracking. Magn Reson Med 47, 215–223.
[23] Ozarslan, E., Mareci, T.H. (2003) Generalized diffusion tensor imaging and analytical relationships between diffusion tensor imaging and high angular resolution diffusion imaging. Magn Reson Med 50, 955–965.

[24] Zhu, D.C., Covassin, T., Nogle, S., Doyle, S., Russell, D., Pearson, R.L., Monroe, J., Liszewski, C.M., DeMarco, J.K., Kaufman, D.I. (2015) A potential biomarker in sports-related concussion: brain functional connectivity alteration of the default-mode network measured with longitudinal resting-state fMRI over thirty days. J Neurotrauma 32, 327–341.

[25] Zhu, D.C., Majumdar, S. (2014) Integration of resting-state FMRI and diffusion-weighted MRI connectivity analyses of the human brain: limitations and improvement. J Neuroimaging 24, 176–186.

[26] Zhu, D.C., Majumdar, S., Korolev, I.O., Berger, K.L., Bozoki, A.C. (2013) Alzheimer's disease and amnestic mild cognitive impairment weaken connections within the default-mode network: a multi-modal imaging study. J Alzheimers Dis 34, 969–984.