)V1531_] RETURNING MATERIALS: Place in book drop to LJBRARJES remove this checkout from .—c—. your record. FINES wiII be charged if book is returned after the date stamped be10w. ON THE CONVERGENCE OF THE GAUSS-GALERKIN METHOD FOR THE DENSITY OF SOME MARKOV PROCESSES BY Ali Haj Jafar A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Mathematics 1984 ABSTRACT ON THE CONVERGENCE OF THE GAUSS—GALERKIN METHOD FOR THE DENSITY OF SOME MARKOV PROCESSES BY Ali Haj Jafar This dissertation concerns the convergence of the Gauss-Galerkin method for some Markov processes. This method provides approximations to p(t,x), the solution of the Fokker-Planck equation it =-§%(ap)4v%‘;i;(b2p) corresponding to the stochastic differential equation dX(t) = a(X(t).t)dt4-b(X(t).t)dW(t) defined on (r1,r2) with O‘g r1 < r2 g_°. The approximation is in the sense of approximation of probability measures We first show that when the coefficients of the stochastic differential equation are polynomials. the resulting Gauss-Galerkin system is equivalent to the Hankel system of moments which is closed in an appropriate manner. We then compare the Gauss-Christoffel approximations to p(t,x) with the Gauss-Galerkin approximations and derive upper bounds for the inherent errors. Ali Haj Jafar The Gauss-Galerkin measures, which are atomic in nature, form the basis for numerical integration quadrature formulas. We show that under suitable conditions on the coefficients of the stochastic differential equation, these integration formulas converge to the true value of the integral. The proofs rely on the use of differential inequalities. Helly's theorems on weak compactness of measures and the spectral theory of linear operators. Numerical examples are presented that illustrate close agreements between the numerical and theoretical results. ACKNGNLEDGMENTS I wish to express my deep gratitude to Professor David Yen and Professor Habib Salehi, my thesis advisors, for all their patience, expert guidance and encouragement during the course of my work. I also wish to offer thanks to Professors L.M. Sonneborn, S. Dragosh and W.E. Kuan, other members of my thesis committee. I am grateful to Professor D. Dawson of Carleton University for many valuable conversations and for making his work on Galerkin approximation accessible to me. I also extend my appreciation to Professor S. Ethier for useful conversations and suggesting several related references. A special thanks goes to my wife, Fattaneh, for her unwavering support and infinite patience. I wish to thank Mrs. Berle Reiter, the Librarian of the Mathematics-Statistics library, for all her help during my graduate work at Michigan State University. Finally, I would like to thank Cindy Lou Smith and Tammy Hatfield for typing this dissertation. ii DEDICATED TO MY PARENTS AND MY WIFE TABLE OF CONTENTS Chapter 0. INTRODUCTION . . . . . . . . 0.1 Statement of Problem and Its Motivation 0.2 Organization of the Dissertation . PRELIMINARIES . . . . . . . . . . 1.1 Events and Random Variables . . 1.2 Probability and Distribution Functions 1.3 Conditional Expectation and Conditional Probabilities 1.4 Stochastic Processes 1.5 Markov Processes 1.6 Transition Probability and Density 1.7 The Wiener Process (Brownian Motion) 1.8 The Infinitesimal Generator 1.9 Diffusion Process . . . . . . 1.10 Backward and Forward Equations 1 .11 Stochastic Integrals and Stochastic Differential Equations . . . . 1.12 Existence and Uniqueness . . . NUMERICAL SOLUTIONS TO THE PROBABILITY DENSITY OF STOCHASTIC DIFFERENTIAL EQUATIONS IN ONE-DIMENSION . . . . . . . . . . . . . 2.1 The Mathematical Problem for the Density and Its Weak Form . . 2. 2 The Gauss-Christoffel Approximation of Measures . . . . . iii Page 10 12 13 16 17 19 2O 23 27 32 32 34 Chapter 2.3 The Gauss—Galerkin Method The Hankel System of Moments Comparison of the Gauss-Christoffel and the Gauss-Galerkin Measures 2.6 Steady-State Solutions 3. CONVERGENCE OF THE GAUSS-GALERKIN METHOD 3.1 Convergence of the Gauss—Christoffel Integration Formulas . . . . Assumptions and Boundary Conditions Some Results from Differential Inequalities Helly's Theorems . Preparatory Lemmas Cit» u L» w ONUIohUJN Main Convergence Theorems 4. NUMERICAL EXAMPLES 4.1 Example 4.2 Example 4.3 Example bWNH 4.4 Example APPENDIX A - A GEOSTOCHASTIC MODEL APPENDIX B - GALERKIN METHOD WITH FIXED NODES LIST OF REFERENCES iv Pace 39 47 53 58 6O 6O 63 75 77 85 100 100 107 110 . 117 121 125 132 CHAPTER 0 INTRODUCTION 0.1 Statement of Problem and Its Motivation We are concerned in this dissertation with numerical solutions of the probability law of a class of stochastic differential equations of the type (0.1.1) dX(t) = a(X(t) .t)dt+b(X(t).t)dW(t) where X is the spatial coordinate in one-dimension , t is time, W(t) is the standard Brownian motion, and the functions a(x,t) and b(x,t) are known as the "drift" and "diffusion" coefficients respectively. The stated equation models many problems in the physical, engineering and biological sciences. Since W(t) .is a random variable, the solution X(t) of the above equation for t > 0 subject to given initial condition X(0). whether random or deterministic. is a random process. Under suitable assumptions on a(x.t) and b(x,t), the existence and uniqueness of the above initial value problem for X(t) is known. Furthermore, under somewhat more stringent conditions, a probability density function p(t,x) for the process X(t)' can be shown to exist and satisfies the deterministic "Fokker—Planck" equation ap(t.x) _ 5(a(X.t1p) _1_ 62(b2(x.t)p) (00102) -- + at 5x 2 a 2 x (0.1.3) p(0,x) = given where x may lie in the infinite interval (—w,m), the semi-infinite interval [0,m) or a finite interval [r1,r2]. It is clear that additional appropriate boundary conditions on a, b and/or p must be posed. D. Dawson in [6] suggested a Galerkin type method for approximating p(t,x) by atomic measures. This method transforms the problem to one for a system of nonlinear ordinary differential equations for the "nodes" and "weights" of the atomic measures. The atomic measures provide approximations to the Gauss—Christoffel measures of the exact p(t,x). This method has been referred to as the Gauss-Galerkin method and the atomic measures the Gauss—Galerkin measures. Our aim in this dissertation is to make a detailed analysis of the Gauss-Galerkin method above. We shall establish some convergence theorems, in both the finite interval and semi-infinite interval case, for the convergence of the Gauss-Galerkin measures to p(t,x), as n 4 a (n = number of atoms), when the coefficients a(x,t) and b(x,t) satisfy appropriate continuity and growth conditions. The proofs of these theorems require techniques from both analysis and probability theory. Specifically, the proof involves the use of differential inequalities, Helley's theorems on the weak compactness of probability measures, the problem of moment, criteria for unique determinism of a measure by its moments, and the eigen- function expansion of a second order differential operator. A number of examples are presented and, whenever possible, compared with the exact solutions. We have also included examples which do not satisfy the assumptions of the convergence theorems but for which the Gauss- Galerkin method seems to work well. A Galerkin method based on fixed nodes is also developed and illustrated by applying it to an example. Unfortunately we do not have a convergence theorem for this method at this time. 0.2 Organization of the Dissertation This dissertation is organized as follows. Chapter I contains background materials from the probability theory. 4 We begin with a review of the notions of random variables, distribution functions and conditional expectations and probability. We then proceed to a discussion of stochastic processes including the Markov processes and Brownian motions. Diffusion processes and their governing back- ward and forward equations are then discussed. These are followed by stochastic integrals and stochastic differential equations, and, finally, we state several theorems on the existence and uniqueness of solutions of stochastic differential equations of the type we wish to consider. In Chapter II we develop the Gauss-Galerkin method by first presenting the weak form of the Fokker-Planck equation. We then review the Gauss—Christoffel approximations to the probability measures. This consideration leads to the development of the Gauss-Galerkin equations. An alternative formulation of the numerical problem in terms of the moments for the case where the Fokker-Planck equation involves polynomial coefficients leading to the so-called Hankel system is also presented. We conclude Chapter II with a discussion of the inherent errors in the Gauss- Galerkin approximations. The main convergence theorems of the Gauss-Galerkin method are presented in Chapter III for the cases of a semi-infinite interval and of a finite interval. We begin Chapter III with a discussion of the convergence of the Gauss-Christoffel approximations. This is followed by a detailed formulation of the assumptions involved in the Fokker-Planck equation and the boundary conditions. After some differential inequalities are presented, we state two Helly's theorems dealing with the weak compactness of probability measures. The proofs of the main convergence theorems are preceded immediately by a number of preparatory lemmas. Several numerical examples are presented in Chapter IV. The numerical solutions, whenever possible, are compared with known exact solutions. Appendix A contains the numerical solution of a nonlinear stochastic equation which is not of the type of equations studied in this dissertation. The numerical results for this problem. which was previously studied in [S] are encouraging enough and suggest that the Gauss- Galerkin method may indeed converge for much wider classes of stochastic equations. Appendix B contains the Galerkin method based on fixed nodes as mentioned before and includes numerical results for the problem treated in Appendix A. CHAPTER I PRELIMINARIES The following preliminaries are taken from the books on stochastic differential equations by Arnold [1] and Friedman [11]. 1.1 Events and Random Variables: Probability theory deals with mathematical models of trials whose outcomes depend on chance. ~We group together the possible outcomes (the elementary events) in a set D with typical element w E 0. An observable event A is a subset of 0: however, not every subset of Q is in general an observable or interesting event. Let 2 denote the set of observable events for a single trial. Of course, 2 must include 0, ¢ and for every event A, its complement A} Furthermore, given two events A and B in Z, the union A U B and the intersection A n B also belong to 2: thus, 2 is an algebra of events. An algebra E of events is called a sigma algebra if Q UAez n=l n H when An 6 L for n 2.1. The elements of Z are called measurable sets and the pair (0,2 is called a measurable space. Let 0 denote a family of subsets of Q. There exists in Q a smallest sigma algebra 2(a) that contains all sets belonging to a; This 2(a) is called the sigma algebra (o-algebra) generated by a; Let (0.2) and (0',Z') denote measurable spaces. A mapping X :0 * 0' is said to be (2-—Z')-measurab1e (and is called an 0’-va1ued random variable on (0,2)) if the preimage of measurable sets in 0' are measurable sets in 0. that is, for A’ E Z' 1 {w:X(w) EA'} = [X EA'] = X- (A') 6 Z If 0' is the d-dimensional Euclidean space Rd with the usual distance function, we shall always choose as the sigma algebra 2’ of events the sigma algebra Bd of Borel sets in Rd generated by the d-dimensional intervals. 1.2 Probability and Distribution Functions: Let (0.2) denote a measurable space. A set function P defined on 2 is called a probability measure or simply a probability if a) OgP(A) g1 forall A62, b) P(¢) = O. c) P( L) An) = ZDP(A if A 6 Z for all n=l n=l n ) n n 2 1, and An 0 Am = ¢ (n # m) (sigma-additivity) d) P(C) = 1. The triple (0,2,P) is called a probability space. Now, suppose that (0,2,P) is a probability space, that (0',£') is a measurable space, and that X is a random variable on (0,2) with values in 0'. X induces the probability P on the measurable space X of the images by PX(A’) = F(x‘ch')) = PIm :X(m) e A’} = P[X e A’] for all A’ E 2'. The function Px is called the distribution measure, or briefly the distribution of X. For an Rd-valued random variable, the distribution PX is uniquely defined on Ed by its distribution function F(x) = F(xl,...,x ) = P{m :X1(w) g_x1....,Xd(w) g_xdfl = d P[X g.x] It is called the joint distribution function of the d scalar random variables X X which are the components 1000;! d of X. 9 1.3 Conditional Expectation and Conditional Probabilities: For a real—valued random variable X, defined on (3,2,P) we define its expectation ax to be 5X = I X(w)dP(w) O For Rd-valued random variables, we define expectation componentwise. For p 2_l, we define the space Lp(Q,£,P) = {X :X is an Rd-valued random variable , dlxlp < m where |.| is the usual Euclidean norm 1 Let X E Ll(0,Z,P) be an Rd-valued random variable and let 3 c 2 be a sub o-algebra of 2. The probability space (0,3,P) is a coarsening of the original one and X is, in general, no longer (3-—Bd) measurable (S-measurable). We seek now an S-measurable coarsening Y of X that assumes, on the average, the same values as X, that is, an integrable random variable Y such that Y is S-measurable and J." yap = f xap for all c e a. ,. C 'C According to the Radon-Nikodynltheorem, there exists exactly one such Y, almost surely (a.s.) unique. It is called the conditional expectation of X given the o-algebra 3. We write 10 The conditional probability P(A (3) of an event A given the o-algebra 9 C 2 is defined by P(A (a) = 6(IA (a). 1.4 Stochastic Processes: Let I denote an arbitrary nonempty index set and let (0,2,?) denote a probability space. A family {X(t,-);t E I} of Rd-valued random variables on (0,2,P) is called a stochastic process with index set I and state space Rd. Sometimes we write Xt(-) instead of X(t.°). If [X(t,°):t 6 [tO,T]} is a stochastic process, then for every fixed w E 0. X (m) is an Rd-valued function defined on [tO,T] (sample functions). We wish to include the possibilities t0 = -~ or T = w in which case we write [t0,w), (-°.T] or (-m.w). The finite dimensional distribution functions of the stochastic process {X(t,-);t E [toT]} are given by P[X(t1) g-Xll = P{w :X1(t1.w) 3 x11. X2(tl.w) g x12.....Xd(t1,w) g xld} = Ftl(xl) P[X(t1) g X1, X(t2) g_x2] = Ftl,t2(xl'x2) P[X(tl) g-Xl""’X(tn) g_xn] = Ft .t (x1....,xn) 1'... n 11 where ti belongs to [tO,T] and x1 = (Xil'xi2""'xid) belong to Rd (the symbol g applies to the components), and n 2_l. This system of distribution functions satisfies the following two conditions: a) Condition of symmetry: If {i1,...,i ] is a permutation of the numbers l,...,n, then for arbitrary instants ti ,...,t.l and for arbitrary n1; 1, F (X. 'ooo'x.)=F (Xv-000x)- . o. . l tlloooltn 1 n b) Condition of compatibility: For m < n and arbitrary t .t E [tO'T]' m+1"" n F (XI-000x! ,...,CD) t . "tm'tm+1""'tn 1 m = Ft ,...,t (X1'°"'Xm)' 1 m Conversely by Kolmogorov's fundamental theorem, for every family of distribution functions that satisfies the symmetry and compatibility conditions, there exists a probability space (0,2,P) and a stochastic process [X(t,°):t 6 [tO,T]} defined on it that posesses the given distribution as its finite-dimensional distributions. For [X(t,-):t e [tO,T]} we shall write briefly x t or X(t), usually omitting the variable w. Two 12 stochastic process X(t) and X(t) defined on the same probability space are said to be (stochastically) equivalent if, for every t e [tO,T], we have X(t) = XTET with probability 1. Then X(t) is called a version of XTET' and vice versa. The finite dimensional distributions of X(t) and X??? coincide. However, since the set N£ for exceptional values of m for which X(t) # XTET depends in general on t, the sample functions of equivalent processes can have quite different analytical properties. 1.5 Markov Processes: A stochastic process (X(t);t 6 [tO,T]} defined on the probability space (0,2,P) With index set [tO,T] C [0,») and with state space Rd is called a Markov Process if the following so-called Markov Property is satisfied: For tO g_s g_t g.T and d all B 6 B (the Borel sets in Rd), the equation (1.5.1) P(X(t) e B |2([to,s])) = P(X(t) e B lxs) holds with probability 1. Here P(X(t) e B (XS) is P(X(t) e B (a) where a is the o-algebra generated by XS which is the smallest o-algebra w.r.t. which X3 is measurable, and X([to,s]) is the smallest o-algebra generated by X(t), tO git g.s. For given Markov Process X(t), equation (1.5.1) is equivalent to saying that "the past and future are statistically independent when the present is known". 13 1.6 Transition Probability and Density: Let X(t), tO th g_T, denote a Markov Process. The conditional probability P(X(t) 6 B IXS) determines a function P(s,x,t,B) of four arguments s,t E [tO,T], x 6 Rd and B 6 8d. It has the following properties: (Arnold [1]) (1.6.1) For fixed 5 K t and B 6 5d, we have A P(s,XS,t,B) = P(xt e B (XS) with probability 1. Here P(s,XS,t,B) is the conditional distribution (Arnold [1], p. 29) (1.6.2) P(s,x,t,o) is a probability on Ed for fixed s g.t and x 6 Ed. (1.6.3) P(s,-,t,B) is Sd-measurable for fixed 3 g_t and B 6 8d. (1.6.4) For to gis g'u g_t g.T and B 6 Ed and for all x 6 Rd with the possible exception of a set N C Rd_ such that P[X8 6 N] = 0, we have the so-called Chapman- Kolmogorov equation (1.6.5) P(s,x,t,B) P(u.y.t.B)P(S.X.u.dy) . F ”Rd It is always possible to choose P(s,x,t,B) in such a way that for all s 6 [tO,T] and B 6 Ed, we have 14 1 for x E B (1.6.6) P(s,x,s,B) = IB(x) == 0 for x Z B Definition 1.6.1: A function P(s,x,t,B) with the properties (1.6.2-6) is called a transition probability. If X(t) is a Markov Process and P(s,x,t,B) is a transition probability so that (1.6.1) is also satisfied, then P(s,x,t,B) is called a transition probability of the Markov Process X(t). We use the notation P(s,x,t,B) = P(X(t) E B IXs = x) which is the probability that the observed process will be in the set B at time t if at time s, where s g_t, it was in the state x. Definition 1.6.2: If p(s,x,t,y) is a non-negative function that is measurable with respect to (w.r.t.) y and whose integral is 1 and for all s,t E [tO,T], where s < t, all x 6 Rd and all B 6 6d, we have P(s,x,t,B) = f p(s,x,t,y)dy . B then we call p(s,x,t,y) a density for P(s,x,t,B) Remark 1.6.3: According to Definition 1.6.2 equation (1.6.5) reduces to p(s,x,t,y) = f d p(s,x.u.z)p(u.z.t.y)dz. R 15 Definition 1.6.4: A Markov Process X(t). t E [tO,T]. is said to be homogeneous (w.r.t. time) if its transition probability P(s,x,t,B) is stationary, that is, if the condition P(s +u.x.t+u.B) = P(s,x,t,B) is identically satisfied for tO g_s g.t g T and t0 g_s-+u g t-ku g_T. In this case the transition probability is then a function only of x, t-s and B. Hence we can write it in the form P(t-s,x,B) = P(s,x,t,B) , o _<_ t-s g T-to. Therefore, P(t,x,B) is the probability of transition from x to B in time t, regardless of the actual position of the interval of length t on the time axis. Remark 1.6.5: Every Markov Process X(t) can, by assuming time to be a state component, be transferred into a homogeneous Markov Process Y(t) = (t,X(t)) with state space [tO,T] de. The transition probability Q(t,y,B) for 'Y(t) for the special sets B = C xD is then given by Q(t,y,C xD) = Q(t,(s,x) ,C xD) = P(s,x,s+t,D)Ic(s+t) This uniquely determines the probability Q(t,y,-) on the entire set 81([tO,T]).de. 16 As an example of a Markov Process, we cite the Brownian motion. 1.7 The Wiener Process (Brownian motion): A Brownian motion is a stochastic process W(t), t 2_0, satisfying (i) W(O) = 0 (ii) For any 0 g.to < tl < --- < tn the random variables W(tk)-W(tk_l) (l g_k g_n) are independent. (iii) If 0 g_s < t, W(t) -W(s) is normally distributed with E(W(t) -W(s)) = (t-s)u, Var(W(t) -W(s)) = (t-s)02 where u, o are real constants, o # 0. u is called the drift and 02 is called the variance. As is well known, Brownian motion can be realized on the space of continuous functions with the property that its paths are nowhere differentiable with probability 1. For this process the transition density is given by 1 e-(x-y) 2/2t 2vt p(t.x.y) Defipition 1.7.1: A d-dimensional process W(t) = (W1(t),...,Wd(t)) is called a d-dimensional Brownian motion if each process Wi(t) is a Brownian motion and if the o-algebras 2(Wi(t),t 2_0) l g_i g_d, are independent. 17 1.8 The Infinitesimal Generator: We can assign to a general Markov Process X(t) a family of operators defined on a function space. Let X(t),t E [tO,T] be a homogeneous Markov Process with transition probability P(t,x,B). Define the operator T on the space B(Rd) t of bounded measurable scalar functions defined on Rd and equipped with the norm Hg” = sup lg(x)i as follows: XGR For t 6 [0,T-tO], let Ttg denote the function defined by = 6 = Ttg(x> xg> (Rd 9(Y)P(t.x,dY). Since TtIB(x) = P(t,x,B), we can derive the transition probability from the operator T These operators have t' the following properties: For t 6 [0,T-—t0] the operator Tt maps the space B(Rd) into itself, is linear, positive and continuous, and has norm HTtH = 1. The operator T0 is the identity, and Ts+t = TsTt = Tth whenever t,s,t-+s 6 [0,T-—to]. In particular, in the case [to,e) the Tt constitutes a commutative one-parameter semigroup of operators, the so-called semigroup of Markov transition operators. Definition 1.8.1: The infinitesimal operator (generator) A of a homogeneous Markov Process X(t) for tO g.t g,T is defined by 18 'rtg (x) -g (x) t Ag(x) = lim 0 g 6 B(Rd) t$O where the limit is uniform with respect to x. The domain of definition DA C B(Rd) consists of all functions for which the limit exists. The operator A is in general an unbounded closed linear operator. If the transition probabilities of X(t) are stochastically continuous, that is, if for every x 6 Rd and every 6 > 0 lim P(t,x,Ue) = 1, U8 = {y :Iy-—x| < e, tIO then P(t,x,B) is uniquely defined by A. (Arnold [l], p. 39) In the nonhomogeneous case, let X(t) for t E [tO,T] denote an arbitrary Markov Process with transition probability P(s,x,t,B). We refer to Remark 1.6.5 according to which Y(t) = (t,X(t)) is a homogeneous Markov process with the state space d d+l [tO.T]:xR C R We now define the Markov transition operator T and the infinitesimal operator A of X(t) t as being equal to the same quantities as in the case of the corresponding homogeneous proces Y(t) = (t,X(t)) under the definition given earlier, namely = " = P Ttg(s,x) asixg(s-+t,x(t-+s)) JRdg(s-i-t,y)P(s,x,t+s,dy), 0 g.t g T -s, where g(s,x) is a bounded measurable 19 d Ttg(S.x)-g(s.X) and Ag(s,x) = 1im tLO where the limit means the uniform limit in d function in [tO,T] XR t (s.x) e [tourlth 1.9 Diffusion Process: A Markov Process X(t), for to‘g t g_T, with values in Rd and almost certainly continuous sample functions is called a diffusion process if its transition probability P(s,x,t,B) satisfies the following three conditions: For every 5 E [tO,T], x 6 Rd, and e > 0 (a) lim'E%; f P(s,x,t,dy) = 0; txs Iy-x|>e (b) There exists an Rd-valued function a(x,s) such that . 1 - . 11m :t—:-S_ JP (y-X)P(S,Xotde) - a(xls)l (c) There exists a d.xd matrix-valued function b(s,x) such that 1im -L- P (y-x)(y-xIrP(s,x,t,dy) = b(x,s). Us t-S J ly-XK‘: The functions a and. b are called the coefficients of the diffusion process. In particular, a is called the drift vector and. b is called the diffusion matrix. b(x,s) is symmetric and non-negative-definite. The 20 Brownian motion is a diffusion with drift term being zero and diffusion term being 02 1.10 Backward and Forward Equations: To each diffusion process with coefficients a and t>= (bij) is assigned the second order differential operator 2 5x.ax. 1 J a 1 d 5‘. ai(X' S) axi + '5 151 j§1b1j(x' S) F‘ m II D. HCU L 9 can be formally written for every twice partially differentiable function g(x) and is determined by 2a and b. Every diffusion process is uniquely determined by its infinitesimal operator A. We calculate this operator from (1.10.1) Ag(s,x) = lim % f a (g(s+t,y) -g(s,x)) tto R P(s,x,t-ts,dy) by means of a Taylor expansion of g(s-tt,y) about (3.x) under the assumption that g is defined and bounded on [tO,T] de and is, on the set, twice continuously differentiable w.r.t. xi and once continuously differentiable w.r.t.s. When we use conditions (b) and (c) of the definition of diffusion process (1.9) we obtain for the right-hand members of (1.10.1) the operator Ji-+ L . Under certain conditions as on a and b we have, for all functions in DA 21 _ ii A—BS+L The following theorems and results are basic in this area and for our later results. (Gikhman and Skorokhod [12a]), and Arnold [l], p. 42). Theorem 1.10.1: Let X(t) for tO g.t g_T, denote a d-dimensional diffusion process with continuous coefficients a(x,s) and b(x,s) and suppose the limit relation in definition (1.9) holds uniformly in s E [tO,T]. Let g(x) denote a continuous and bounded scalar function and define u(S.x) = ES xg(X(t)) = r g(y)P(S.x.t.dy). de where t is fixed, 5 < t and x 6 Rd. Bu azu Suppose u, 3;; and SEiggj. for l g.1, j g.d are continuous and bounded. Then, u(s,x) is differentiable w.r.t. s and satisfies Kolmogorov's backward equation 33-+ L u = 0 as with the end condition 1im u(s,x) = g(x). sit Theorem 1.10.2: Suppose that the assumption of Theorem 1.10.1 regarding X(t) holds. If P(s,x,t,-) has a density p(s,x,t,y) which is continuous with 22 respect to s and if the derivatives .gEL and 2 Siig%_' exist and are continuous with respect to s, i 3' then p is a so-called FUNDAMENTAL SOLUTION of the backward equation with the end condition 1im p(s,x,t,y) = 6(x-—y). sit Theorem 1.10.3: For t E [tOT], let X(t) denote a d-dimensional diffusion process for which the limit relation in (1.9) holds uniformly in s and x and which posesses a transition density p(s,x,t,y). If the derivatives g—E, a(ai(y,t)p)/ayi and 62(ai(y,t)p)/ayiayj exist and are continuous functions, then for fixed 5 and x 6 Rd such that s g_t, this transition density p(s,x,t,y) is a fundamental solution of Kolmogorov's forward equation (the Fokker-Planck equation) aids (1.10.2) + E: 3;; (ai(y,t)p) l B 2 i=1 j=1 ayiayj 13 If we define the distribution X(to) in terms of the initial probability Pt we obtain from p(s,x,t,y) the O probability density p(t,y) of X(t) itself: 23 (1.10.3) p(t,y) = (Rd p(t0,x.t,y)PtO(dx). If we integrate (1.10.2)VVJ:J:.Pt (dx), we see that O p(t,y) also satisfies the Fokker-Planck equation (1.10.2) For a Brownian motion with drift term u and diffusion term 02 the Kolmogorov backward equation is 2 a l 2 a 3%.: 5.5 ——%-+ p gfi- for t > 0. -° < X: Y < m Bx and the Fokker—Planck equation is i2=1,2_262 _ at 2 ax2 “ ax 1.11 Stochastic integrals and stochastic differential Equations: Let W(t), t 2,0, be a one-dimensional Brownian motion on a probability space (0,3,P). Let ?t(t 2.0) be an increasing family of o-algebras such that the o-algebra 5(W(s), 0 g_s th) is contained in 3 and 3(W(k-+t)t—W(t), k 2,0) is independent of 3 t. for all t 2,0 (e.g., St = 3(W(s), 0 g_s g_t)). Let t 0 g_c < B < a. A stochastic process f(t) defined for a th g B is called a non-anticipative function with respect to 3t if: (i) f(t) is a separable process, i.e. there exists a countable dense set M = {t1,t2....) C:[o,B] and a set N E 3 of P-measure zero such that for every 24 open subinterval (a,b) of (o,B) and every closed subset A of Ra, the two sets {w =f(t.w) e A for all t 6 (a,b) n m} e a and {w :f(t,w) E A for all t 6 (a,b)} (not necessarily in F) differ, if at all, only on a subset of N. (ii) f(t) is a measurable process, i.e., the function (t,w) 4 f(t,w) from. [a,B] x0 into R1 is measurable. (iii) for each t E [c,B], f(t) is St measurable. We denote by LS[G,B] (1 g_p g.m) the class of all non—anticipative stochastic processes f(t) satisfying: B P PI! lf(t)l dt < a) = 1 d (P[ess sup If(t)[ < a} = 1 if P = “) dgtgfi P P . . We denote by Mw[a,B] the subset of Lm[G,B] con51st1ng of all stochastic processes f with r5 P a J [f(t)[ dt < a (1 (E[ess suplf(t)[ < a if p = m) cgtgfi (Friedman [8]) 25 Since any Brownian motion W(t) is nowhere differentiable with probability 1, the integral T f f(t)dW(t) for the stochastic function f(t) cannot 0 be defined in the usual Lebesgue-Stieltjes sense. K. Ito has given the definition of the integral above which we recall here briefly. Definition 1.11.1: (Friedman [11]) A stochastic process f(t) defined on [c,B] is called a step function if there exists a partition 0 = t0 < tl < --~ 0 with constants K and K* depending only on T and in addition if a(x,t), b(x,t) are d'x[0.°°). then the solution of continuous in (x,t) E R (1.12.1), (1.12.2) is a diffusion process with drift a(x,t) and diffusion matrix a(x,t) = b(x,t)b*(x,t); therefore, theorems 1.10.2 and 1.10.3 are valid for the solution. Remark 1.12.2: Theorems 1.10.2 and 1.10.3 are valid provided the transition density of the transition probability exists and is continuous. The following theorem guarantees the existence of a density. (Theorem 5.4, Friedman [11]). Theorem 1.12.3: If (i) There is a positive constant c such that Z}§i§jbij(x,t) 2.cl§(2 for all (x,t) E Rd><[0,T], §6Rd. 29 (ii) The functions ai'bij are bounded on d R x[0,T] and uniformly Lipschitz continuous in (x,t) d in compact subsets of R x[0,T]. (iii) The functions bij are Holder continuous in x (of order a), uniformly with respect to d (x,t) e R x [our]. Then the transition probability of the solution of (1.12.1) has a density. We close this chapter by stating the definition of a linear stochastic differential equation, a theorem concerning existence of moments of a linear stochastic differential equation and Ito's formula. (Arnold [l] and Friedman [11]). Definition 1.12.4: A stochastic differential equation dX(t) = a(X(t) .t)dt+b(X(t) .t)dW(t) for the d—dimensional process X(t) on the interval [tO,T] is said to be linear if the functions a(x,t) and b(x,t) are linear functions of x 6 Rd on Rd'x[tO,T], i.e. if a(x,t) = A(t)Xn+o(t), where A(t) is a (d) xn we have (2n-1)‘.K(s) = I e Therefore, (2.2.7) K(s) = e’5 for s > xn This shows that the integral in (2.2.6) is Q Q j K(s)sads = f e“3 saas = two-+1,xn) X X n n So in this case we have shown that (2.2.8) [E[f]] geM+BF(d+l,xn) To estimate e in (2.2.8) we note that E[P§(x)] = (O e.x P:(X)dx = (n!)2 38 (Krylov [14], p. 35). On the other hand, (n!)2 ]= (2n): 3‘“ K(s)ds O = E[Pfi] = E[xzn 50 x (2n)1(u(‘ n K(s)ds + f K(s)ds) 0 x n and m 00 t _x f K(s)ds = f e- dt = e n x x n n Thus in (2.2.8) we have -x e = (n!)2/(2n)1-e n Now let p(tgx) be the solution to the equations (2.1.3) and (2.1.4). By theorem 2.2.1 for any t the n-point Gauss Christoffel approximation to p(t,x)dx is un(t) = k; l;k(t)5xk(t) and by (2.2.4) (2.1.7) can be written as n (2.2.9) 5‘34 23 ak(t)V(Xk)+E[V]) k=1 n ~ ~ “2 ak( t) (Lv) (x.k(t)) +E[Lv] If fi(x) (i = 1,2,...,2n) is a basis for polynomials of degree less than or equal to 2n-—1, equation (2.2.9) for f (x)'s becomes 1 n (2.2.10) —‘i— 2 5k( (t).f (Xk(t)) tk=l 3‘ 2 =k:l ak( t) (Lf ) (xk(t)) +E[Lf ] for 1 = 1,2, ,2n The system (2.2.10) is a system of ordinary differential equations for the Gauss-Christoffel weights and nodes. The system (2.2.10) however, is not closed as p(t,x) is involved in E[Lfi] and thus cannot be used for finding fin(t) without knowing p(t,x). 2.3 The Gauss-Galerkin Method: The Gauss-Galerkin method for approximating p(t.x) (as introduced by D.A. Dawson [6]) is the following system obtained from (2.2.10) with the terms E[Lfi] dropped. 11 1'1 (2.3.1) dit glakumimkun = 1.21 akm (Lfi) (xk(t)) for i = 1,2,...,2n where [fi (x)]2n is a basis for polynomials of degree less than or equal to 2n-—l. We take the initial condition :3 ME ak" amt “0) as the Gauss-Christoffel approximation to p(0,x). 40 In matrix form the nonlinear system (2.3.1) with given initial condition p«3n<) can be written as (2.3.2) AX' = BX (2.3.3) X(0) = given where A11 A12 A = I A21 A22 Bl O B = . B2 0 T _ X - (allazoooopanpxlyooopxn) p fl(x1) fl(x2),... f1(xn) f2 (x1) f2 (x2) , . . . f2 (xn) ) fn(x2),... fn(xn) 41 I I I alf1(X1) azfl(X2) "' anf1(xn) I I I a1f2(x1) a2f2(x2) ... anf2(xn) A12 = I I alfn(xl) a2fn(xl) ... anfn(xn) fn+1( 1) fn+1(X2) ° fn+1(xn) fn+2(xl) fn+2(x2) "' fn+2(xn) A21 ' f2n(X1) f2n(X2) f2n(xn) I I 3l fn+1(xl ) a2fn+l(x2) ... anfn+1(xn) I I I alfn+2(xl) azfn+2(xz) a‘zif'n+2(X ) A22 = I I a1f2n(x1 ) a2f2n(X2) ‘3‘nf2n(X ) Lf1(xl) Lf1(x2) --- Lf1(xn) Lf2(x1) Lf2(x2) .-~ Lf2(xn) B = Lfn(x1) Lfn(x2) --- Lfn(xn) 42 and Lfn+1(x1) Lfn+1‘X2) "' Lfn+1(xn) Lfn+2(x1) Lf +2(X2) ' Lfn+2(xn) B2= ) Lf2n(xl‘ Lf2n(x2) Lf2n(xn) Throughout our work we shall take fi(x) = x!"-1 (i = l,...,2n). Note that the system (2.3.2) is nonlinear. In the remainder of this section we are concerned with the question of solvability of the system (2.3.2). In particular we shall show that under appropriate assumptions the matrix A in (2.3.2) is nonsingular. Lemma 2.3.1. If .x ,...,xn are distinct, then X1 2 the matrix A A /\ “l A Where Al1 = A11' A21 = A22' A12 = A12D . A22 = A22D and D = d1g(a1,a2,...,an), is non31ngular. Proof: Let f :R 4 R2n be defined by 2n-1 f(x)T= (l,x,...,x ) 43 Note that A I I A=(f(x1) f(x2) f(xn) 15 (x1) f (xn)) A Let F(xl,x2,...,xn) = det A. It is easy to see that 321.: det(f(x ) f(x ) f(x ) f”(x ) f’(x ) f’(x )) 5x1 1 2 "' n 1 2 "‘ n ' 522 3.5.: det(f'(xl) f(xz) ...£(xn)f”(x1) f'(x2) ...f'(xn)) x1 + det(f(xl) f(xz) ...f(xn) f”(x1) f'(x1) ...f’(xn)) and fi— 2 dt(f’( )f(x) f(x) f’”(x) f'(x) f’( )) axi— e X1 2 ... n 1 2 ... X11 + det(f(xl) f(x2) ...f(xn) f(4)(xl) f'(xl) ...f'(xn)) 3 = (x -x )h(x .....x ) because é-E-= 0 if we 2 l l n a 3 x1 put x1 = x2. Thus F(x1,....xn) = (x2-x1)4¢(x1....,xn). Similarly for o O — 4 a 1 # 3 we have F(xl,...,xn) - (xi-xj) ¢(x1,...,xn). Thu- A n 4 det A = C(X) n (x.-x.) . . 1 3 1>3 A Now since the degree of det A is the same as the degree n of H (x.-x.)4, C(X) must be a constant and the i>j ' proof is complete. 44 Lemma 2.3.2: With the same assumptions as in A A A_1 A . . Lemma 2.3.1 the matrix A22‘-A21 A11 A12 is non31ngular. A Proof: Note that All is the well known Vandermond matrix which is nonsingular. Now consider A A 1 0 A11 A12 A B = A A-l A A 7A21A11 I A21 A22 A A A11 A12 /\ A_1A A O ‘A21A11A12 +A22 Since A A A A [A a-“ det B — det A — (det A11)(det(A22- 21 11A12) #'0 . the proof is complete. Now with the assumption as in Lemma 2.3.1, if we let y: = (a£,aé,...,aé), yg = (alx',...,a x’) and nn aT = (a1,a2,...,an), then the system (2.3.2) becomes A A A11 A12 y1 B1a A A A21 A22 \ y2 B2a which is linear in yl, y2 and can be written as 45 T A_1 A A A /\_1/‘\ _1 A A-1 y1 ' A11IB1"A12(A22"A21A11A12) (32"A21 11131)1a (2.3.4) T _ A A A_1A _1 A A_1 y2 ‘ (A22"A21A11A12) (5’2“‘1‘2113‘11131’a The above system is a system of nonlinear differential equations for the Gauss-Galerkin points [xk(t)] coupled with a linear system of differential equations for the Gauss-Galerkin weights [ak(t)}. Now assume that the system (2.3.2), (2.3.3) has a solution (xk(t)] and [ak(t)] where X(o)T = (al(0).....an(0).x1(0)....,xn(0)) has been chosen as the Gauss-Christoffel approximation of p(0.x). By the continuity of xk(t) and ak(t) it is obvious that there exists an interval [0,T] so that for each t E [0,T], ak(t) is positive for each k, and x1(t), .... xn(t) are distinct. Also by Lemma 2.3.1 the systems (2.3.2) and (2.3.4) are equivalent on [0,T]. On such an interval the system (2.3.2) can be written as (2.3.5) x’ = A-lBX = F(X) A because A = A :diag(l,...,1,a1,...,an) which is invertible. Also 46 6x1 0x a i a 1 _ _A-l aBA A-le+A 1 gBX 1 xi = _A-l 5A XI+A-1 §B_ZC_ bx. x. 1 1 Since xi(t) and ai(t) for i = l,...,n are continuous functions on [0,T] and A”1 exists; it follows that A‘l, 5%?— and g? are all bounded. i i Therefore, F(X) is Lipschitz w.r.t xi for each i. It is easy to prove that F is Lipschitz w.r.t. X. Thus on [0,T] the system (2.3.4) has a unique solution 11 un(t) = 131 ak(t)6xk(t) which approximates the law of the process of the solution to the stochastic differential equation (2.1.1), (2.1.2). The corresponding approximated moments of the process are (2 3 6) r2 k . . mk(t) j' x (t)dun(t) r1 n k __J a. (t)x- (t) k = 0.1, o. 0 i=1 1 1 In the next section we discuss an alternative way to compute the above approximated moments, in the case where the coefficients of the stochastic differential 47 equations are polynomials. 2.4 The Hankel System of Moments: In this section we assume that the coefficients of the stochastic differential equation (2.1.1) are polynomials in x. From equation (2.3.6) we have n k (2.4.1) T = a; 1:31 ai(t)xi(t) n I k n k-l I £1 ai(t)xi(t) +1231 kai(t)xi xi(t) for k = 0,1,...,2n«-1. By (2.3.6) the sum n ii; ai(t)(Lfk)(xi(t)) becomes a polynomial gk(ml.m2....), involving finitely many moments. From the system (2.3.2) and (2.3.3) we have d (2.4.2) —dE-=gk(ml(t),m2(t),...) k=0.l.2,...,2n-l (2.4.3) mk(0) = given k = 0,1,...,2n -1 Let us suppose that a(x,t) is a polynomial of degree q in x and b(x,t) is a polynomial of degree 1 in x. Recall that Lf(x) is given by . 2 (Lf) (X) = a(x,t) §£+ lb2(x.t) Li . 0x 2 2 Bx For f(x) = xk, the polynomial 9k in (2.4.2) involves 48 For q‘g 1 and 2 g_l, we have g_2n-l so the pk system (2.4.2) is closed. 0n the other hand, if q 2_2 or E 2_2, p2n_1 = max[2n-+q-2, 21-t2nI—3} 2 2n, and the system (2.4.2) is not closed. Theorem 2.4.1: Suppose that the system (2.3.2) . and (2.3.3) has a solution in [0,T] with nodes that remain distinct as we have assumed. Then the system (2.4.2) may be made closed. Proof: We shall show that it is possible to express all the moments that appear in (2.4.2) with order higher than 2n-1 in terms of the lower moments. To do this we define Pn(x.t) = (x-x1(t))(x-x2(t)) (x-Xn(t)) Since un(t) is a measure concentrated on x1(t),...,xn(t), we have 1‘2 k (2.4.4) f x pn(x,t)dun(t) =0 k=0,l,2,... r 1 Let oi(t) i l,2,...,n be the sum of all products of i of the numbers x1(t) , x2 (t) , . . . ,xn(t) , without permutation cm'repetition. Then (2.4.4) becomes _i (2'4'5) mn+kmglmn+k~~l+ '+( 1) Oimn+k-i n _ .. +ooo+(—1)Gnmk—O for k-O,l'21..o 49 Let ' 1 mk mk+1 ° ' ° mn+1; mk+l mk+2 ' ° ° Inn+k+1 Mn .k _ _ mn+k mn+k+ 1 ' ' ° m2:14-14 .1 Any n consecutive equations of the system (2.4.5) form a linear system of equations with 01.02,...,on as unknowns. Therefore, we get (2.4.6) An = det Mn = 0 k = 0,1,2,... By the assumption that the nodes are distinct we see that for L < n and for real numbers yo.y1....,yz, 1"2 k z 2 ' we have fr x (yoi-ylx4--o-+-y‘x ) dun p081tive, i.e. L for any L K no Qz'k(Y) . .- mi+j+k Yin > OI 1,3—0 where y = (yl;...,y‘). This is equivalent to saying that in Mn the determinants of all principal minor ,k submatrices are positive. It is thus possible to determine m2 n in terms of mo,ml,...,m2n_1 from An,0 = 0. Similarly, from An,1 = 0 we get m2n+1 in terms of ml,m2,...,m2n and in general from An,k = 0 we get m2n+k in terms of previous moments. Thus the system (2.4.2) may be made closed. 50 Remark 2.4.2. Given a sequence of real numbers {mt};=1 which satisfy (2.4.6) with the additional assumption that the determinant of all minor submatrices of Mn (k = 0,1,...) are positive, we can obtain a ,k unique measure concentrated on n distinct points in (0,”) whose moments are the given sequence. For a proof we refer to a paper written by Ernest Fischer [9]. Actually the proof is based on the fact that the quadratic n-1 n-1 forms 2 and i,j=0 mi+inXj i'j=o mi+j+lxixj are positive definite. It is interesting to know that if the determinants of Mn 0 and Mn 1 are non—negative for all n, then there exist a measure u with [mk];;1 are as moments. If the determinants of Mn and Mn IO ,1 positive for all n, then u is a measure whose spectrum cannot be reduced to finitely many point. Remark 2.4.3: To close the system (2.4.2) in practice we need a simple computer algorithm to write higher moments in terms of previous ones. To do this we use the following theorem (George E. Forsythe and Cleve B. Moler [10]). Theorem 2.4.4: For a given square matrix A of order n, let Ak denote the principal minor submatrix made from the first k rows and column. Assume that 51 det(Ak) # 0 for k = 1,2,...,n-—l. Then there exist a unique lower triangular matrix L = (zij), with 111 = 122 = --- = inn = l and a unique upper triangular matrix U = (uij) so that LU = A. Moreover, det A = 1111 °u22 ---unn. Here A = Mn,k is a symmetric matrix of order n-tl; therefore, by A = LU = UTLT and b un' u n 55 of L and U we hav .. = u.. u.. y iq e e e L1] :1/ 33 if i # j. Thus = a for l g,i g ni-l u1i li Now multiplying all rows of L with index greater than or equal to 2 by the second column of U we get a 2 / 22 ’ 22"“12 u11 I: I and u2i = ai2"u1i 'ulz/fill' 3 é-l S-n*'1 Continuing this procedure we get the following algorithm" to compute ull.1122.....un-|_1 52 For I = 1, n+1 Do u1i = a1i For j = 2, ni—l Do '-1 (2.4.7) 3 2 .. = a.. - . “33 33 E “kJ/ukk k 1 For I = j-tl, ni-l Do j-l uij = aij 331 ukiukj/ukk In order to obtain m2n+k in terms of mk.mk+1...., m2n+k-1 which is a consequence of the equation An k = 0, we modify the last step which completes the algorithm (2.4.7) for the case j = ni-l and set un+1 n+1 = 0 which gives 2 un+1k/ukk m2n+k = an+1 n+1 - M5 k=1 We now prove the following theorem. Theorem 2.4.5: With the same assumptions as in Theorem 2.4.1 the systems (2.3.2) implies the system(2.4.2). Conversely the system 2.4.2 implies the system (2.3.2) provided Ank = 0 k = 0,1,2,... and the principal submatrices of Mnk have positive determinants. Proof: It is understood that we use equivalent initial values. We obtain the system (2.4.2) from the system (2.3.4) and it is obvious that any solution to the system (2.3.4) is a solution for the system (2.4.2). 53 Conversely, by Remark 2.4.2 any solution of the closed system (2.4.2) induces a unique measure uh(t) concentrated on n points xi(t)....,xn(t) with corresponding weights a1(t),...,an(t). Therefore by (2.3.6) we obtain the system (2.3.4) from (2.4.2) with “b(t) as a solution. Therefore any solution of the system (2.4.2) is also a solution to the system (2.3.4) and the proof is complete. 2.5 Comparison of the Gauss-Christoffel and the Gauss-Galerkin Measures: The Gauss-Christoffel approximation which was discussed in Section 2.2 can be considered as the "best possible" approximation of a measure because it determines the first 2n-—1 moments the same as exact moments. Now assuming that the system (2.3.4) has a solution on the interval [0,T] we wish to compare the Gauss-Christoffel approximation [43 Link) = k 1ak(t) 6%“) . with the Gauss-Galerkin approximation n ... V To do this we need the following lemma and corollary (Hale [13]). 54 Lemma 2.5.1: Let w(t,u) be continuous on an open connected set 0 C R2 and such that the initial value problem for the scalar equation (2.5.1) u’ = w(t,u) has a unique solution. If u(t) is a solution of (2.5.1) on a‘g t g_b and v(t) is a solution of (2.5.1) on a g_t < b with v(a) g u(a), then v(t) g u(t) for t 6 [a,b]. Corollary 2.5.2: Suppose that w(t,u) satisfies the conditions of Lemma 2.5.1 for a gnt < b, u 2'0, and let u(t) 2_0 be a solution of (2.5.1) on a g_t < b. If f :[a,b] an 4 RD is continuous and [f(t,X)l gw(t,[X|), a g t < b, x e Rn then the solution of X' = f(t,X), [X(a)| g_u(a) exists on [a,b) and |X(t)| g u(t) for t 6 [a,b]. Now the system (2.2.10) can be written in matrix form as ~~ (2.5.2) Ax’ = §x+e(t) *where A and B are of the same form as A and B in (2.3.2) with the Gauss-Galerkin nodes and weights 55 replaced by Gauss-Christoffel ones and a(t)T = (e1(t).e2(t),....en(t)) where ei(t) = E[Lfi] Note that the errors ei(t) depend on the nodes [xi(t)] and the weights (31(t)). By Lemma 2.3.1, A is invertible and equation (2.5.2) can be written as (2.5.3) 52' = fi'l§i+ii‘le(t) or (2.5.3)’ x’=F(x)+E(t) Where E(t) = E(X,5,t) is regarded as a known function. Again note that by neglecting E(t) from the system (2.5.3) we obtain the Gauss-Galerkin system. In the same way as before we can show that F (X) is Lipschitz with respect to X with some Lipschitz constant Mn' The two systems 52%) F(x(t))+E(t) x’(t) P(X(t)) lead to the system (2.5.4) ()2 -X) ’ = F(x) —F(X) +E(t) 56 We now consider the scalar initial value problem I u = Mu-+[E(t)[ u(0) = 0, which has the solution Mt t -MS u(t) = e n P lE(s)|e n ds Also we have ((52 -X) ’l g [F(x) -F(X)|+ [E(t)| g Man-Xl+ [E(t) [. Thus by Corollary 2.5.2, with X(O) = X(0) (as we always take Gauss-Christoffel nodes and weights as initial values), we have for t E [0,T] ~ M t t -M 5 (2.5.5) [X(t) -X(t)| gu(t) = e n j [E(s) (e n ds. 0 Thus we have proved Theorem 2.5.3: Assuming that the system (2.3.2) n has a solution u (t) = Z} ak(t)0 for t E [0,T] 1‘ k=l xk(t) with distinct nodes and with un(0) the Gauss- Christoffel approximation of p(0,x) and ~ n ~ un(t) = kg: ak(t)0;k(t) the Gauss-Christoffel approximation of p(t,x), then we have 57 M t t -M 5 (2.5.6) IX(t)-—X(t)[ g e‘“ 50 [E(s)[e “ ds where X(t) = (31(t)....,§n(t).£1(t),...,§n(t)) and X(t) = (a1(t),...,an(t),x1(t),...,xn(t)) Remark 2.5.4. In equation (2.5.6) E(s) = A-le(s) where e(s) is the Gauss-Christoffel error and [E(s)| g cn|e(s)l for some constant on. One may wish to use (2.5.6) to prove convergence of the Gauss- Galerkin weights and nodes to the Gauss-Christoffel ones as n 4 a. However the dependence of cn and the Lipschitz constant Mn on n does not make such convergence theorem possible. Remark 2.5.5. If the coefficients of the stochastic differential equation are polynomials, then the vector e(t) has its components equal to zero except for finitely many of some components. This number of non-zero components depends on the degree of the coefficients. If the degree of the coefficients is less than or equal to one, then e(t) = 0 and therefore, X(t) = X(t) for all t. Thus for a linear stochastic differential equation the Gauss-Galerkin and Gauss-Christoffel approximations are the same. 58 2.6 Steady-State Solutions: For some stochastic differential equations the density p(t,x) of the solution X(t) as t 4 4 becomes independent of t. In other words, the influence of the initial state fades away and the system tends to a steady-state governed by the stationary solution. In this case the density satisfies (2.1.3) with left hand side replaced by 0. Now assuming that a stochastic differential equation has a steady-state solution the corresponding system to (2.5.2) becomes (2.6.1) fix+e=0 or (2.6.1)I F(i)-+e = 0 which is a nonlinear system of equations and yields the Gauss-Christoffel steady-state approximation. It is to be noted that the system (2.6.1) is not closed in so far as X is concerned as the steady-state density is involved in e and thus cannot be used for finding the steady-state approximated solution. The corresponding system of equations given by the Gauss-Galerkin method has the form (2.6.2) BX = 0 or (2.6.2)’ F(X) = o 59 which is a closed system of nonlinear equations for X. Numerical methods such as Newton's method or the bisection method can be used to obtain the Gauss-Galerkin steady- state solution. Also an upper bound for the term IX-—XI as in Theorem 2.5.5 may be derived as follows. We define the function (2.6.3) H(X,s) = F(x)-tse = 0 Obviously for s = 0 we have the Gauss-Galerkin steady— state equations and for s = l the Gauss-Christoffel equations. From (2.6.3) we have (DF(X))X’(s) +e = o and this leads to (Chow and Hale [3], p. 21) [53 -XI 3 (in? (X) :1 (e) CHAPTER III CONVERGENCE OF THE GAUSS-GALERKIN METHOD For a given interval (r1,r2) where -¢ g,r1 < r2 / e and given p(x), assume that we have A a sequence of integration formulas Zn(f) a I(f) where E (f) = % afin)f(x1in)) and I(f) = J52 f(x)p(x)dx “ k=l r1 n = 1,2,... . It is of importance to know under what conditions the sequence 2%(f) converges to the true value of the integral as n 4 c. In this chapter we study the convergence of the integration formulas obtained by the Gauss-Galerkin method in Chapter II. 3.1 Convergence of the Gauss-Christoffel Integration Formulas: We shall begin by stating some results about the convergence of the Gauss—Christoffel integration formulas. If (rl,r2) is a finite interval, it is known that for a given positive measure u with finite moments, we 60 61 have ”1'2 "' (3.1.1) 1im Zl(f) = 1im ; f(x)du n4m n49 "r1 n r9 = I “ f(x)du = I(f) url for any continuous function f(x) defined on [rl,r2] (Stroud [21] p. 142). In the infinite interval case we generally do not have the same result as in the above. We state some theorems about the convergence of the Gauss-Christoffel integration formulas in the (0,“) and (-,¢) cases. Theorem 3.1.1: Let u be a positive measure defined on [0,“) whose moments of all order are finite. Let fin be the n-point Gauss-Christoffel approximation to u. Then there exists a positive measure v defined on [0,e) such that (3.1.2) 1im j food;n = f f(x)dv n49 0 0 for any continuous function f(x) on [0,”) such that as x 4 °, f(x) is dominated by a polynomial. (Shohat and Tamarkin [l9].p. 121). Theorem 3.1.2: Let u be a positive measure defined on (—°,°) whose moments of all orders are finite. Let fin be the n-point Gauss-Christoffel approximation to u. Then there exists a positive measure v defined on 62 ~ (-¢,¢) and a subsequence [u 7 such that ~ (3.1.3) 1im I f(x)dunk = f f(x)dv k-oa ”..a ..a: for any continuous function f(x) on (-¢,~) such that as Ix) 4 a, f(x) is dominated by a polynomial with non-negative coefficients. Corollary 3.1.3: If the measure u in Theorem 3.1.1 or 3.1.2 is uniquely determined by its moments, then the measure v will be the measure u "substantially". Remark 3.1.4: By "substantially" here we mean (3.1.4) [ f(t)du = j f(t)dv for any continuous function f(t) which vanishes for all sufficiently large values of It). J.V. Uspenski [25] has given sufficient conditions for a measure u with density p(x) that is determined uniquely by its moments (in both (0,9) and (-~,°) cases). For example in the (0,6) case he proves that if there exist constants C and R such that (3.1.5) m = [ xndu 0, it is essential. to have - convergence of 2% O(f) to IO(f) for f(x) belonging to some appropriate class of functions. For example 65 it is seen in Section 3.1 that if p(0,x) is determined uniquely by its moments, then we have convergence of zg’o(f) to Io(f) for all continuous functions f(x) such that as x 4 m, f is dominated by a polynomial. Throughout this chapter by "the stochastic differential equation satisfies the condition A on (r1.r2) x [0,T] we mean: Condition A l. The coefficients a(x,t) and b(x,t) are continuous on (r1,r2) x [0,T]. They are such that a unique solution exists and satisfy the inequality (3.2.3). 2. X(O) is a random variable independent of 3 (W(t), 0 g_t g_T) with density p(0,x) having finite moments of all order. 3. The Gauss-Christoffel integration formulas 2%(f) for p(0,x) converge to the true value of the integral I(f) for each function f that is continuous and such that as x 4 e, f is bounded by a polynomial. 4. The boundary conditions on p(t,x) for each t is such that * (3.2.5) (L p,fi) = (p,Lfi) for i = l,2,...,2n and fi(x) = xl-l. 66 5. The eigenvalue problem defined by Lu = Au and the boundary conditions above generates an infinite set of eigenfunctions [un(x)] such that the set .9 x , O . . (une ] is complgt: 1n L2(r1,r2) for some 60 > 0 o and such that use 4 0 as n 4 a. Furthermore any sufficiently smooth function P(x) can be approximated - x uniformly by combinations of the set [une O }. 6. The density p(t,x) governed by = L p x t (rl,r2) , and the boundary conditions above exists and is unique. We shall now make some comments on these conditions. Remark 3.2.1. Equation (3.2.5) for f1(x) = 1 implies that This means that boundary conditions are such that the probability in (r1,r2) is preserved. Remark 3.2.2. The assumptions made in Condition 4 may be analyzed by appealing to the spectral theory of the Operator L involved. For a smooth function @(x), the uniform convergence of its formal eigenfunction 67 expansion depends on the behavior of the expansion of the Green's function in terms of the eigenfunctions. See [4] or [17]. Actually for the finite interval case condition A5 is automatically true. See [4] Theorem 4.1. Remark 3.2.3. As Feller [6,b] points out in diffusion theory one usually starts with the assumption that the transition probability P(x,t,B) = P(0,x,t,B) has a probability density p(x,t,y) = p(0,x,t,y) which for fixed y satisfies the Fokker-Planck equation. As we have seen in Chapter I (equation (1.10.3)) the density p(0,x) and p(x,t,y) determines p(t,y) the solution of the FOkker-Planck equation. For simplicity we assume condition 6 here and refer to Gikhman and Skorokhod ([9,b]) where the existence of the density p(x,t,y) under more restrictive conditions on a and b is discussed in detail. Remark 3.2.4. The classical boundary conditions * that make L and L adjoint to each other are discussed by Feller [8,a]. Equation (3.2.5) is a special case of the above for polynomials. Let us present some examples of stochastic differential equations defined on (r1.r2) where 68 A 0 g_rl < r2 K 0 for which under suitable boundary conditions equation (3.2.3) is valid. We give examples defined on (0,1) and (0,”) only since by Ito's formula (using linear functions) any (r1,r2) can be treated similarly. The idea for the proof of the following examples are discussed in the work of S. Ethier [7]. Example 3.2.4. Consider the stochastic differential equation (3.2.6) dX(t) = a(X(t))dt+b(X(t))dW(t) (3.2.7) X(O) = given define on (0,1). Assume that a(x) and b(x) are continuous on (0,1), b(0) = b(l) = 0, a(0) 2_0 and a(l) g_0. Define a(x) = b2(x) and 1 if x 2.1 p(x)= 0} = 0 whenever 0 < tO < t1. Summing over all rationals t0 and. t1 we obtain P[Xs 6 (-°,0) U (1,”) for some 3 > 0] = 0 Therefore the probability is preserved in the interval (0,1). Thus if p(t,x) is the density of the stochastic differential equation (3.2.6) and (3.2.7) we have 1 (3.2.9) f p(t.X)dx = 1 0 Now (3.29) implies that (3.25) is valid for fl(x) = 1. For f(x) = Xk, k 2_1, we have 70 ° 1 2 2 * R (L pack) =3 (-%{2+%L%E)Xkdx 0 5x k . 1 292 1 1 k-12 1 = x (-ap4-bb p + E-b aX)[ i-E-kx b l O O + (p,ka) = (p,ka) - 111]!) ap x-O Therefore if a(l) # 0 the only boundary condition that we need here is lim p(t,x) = 0. x41 Example 3.2.5. Consider the stochastic differential equation (3.2.10) dX(t) = a(X(t))dt+b(X(t))dW(t) (3.2.11) X(0) = given defined on (0,0). Assume that b(0) = 0 and a(O) 2.0. Also assume that a(x) and b(x) satisfy the existence and uniqueness conditions. Define Cibd and p(x) the same as in Example 3.2.4. Therefore the stochastic differential equation 1 dX(t) = a °p(X(t))dt+ (c op)2(X(t))dW(t) X(O) given has a solution (0,3,X(t),P). The same equation as (3.2.8) implies that r ’ _ 1: P._X(S) E (4.0) t0 3 s 3 t1 and X(tl) X(to) < o, 0 and therefore with the same argument we have P(X(s) 6 (—¢,0) for some 8 > 0} = 0 and therefore equation (3.2.9) is true (only a is replaced by 1). Thus (3.2.5) is valid for fl(x) ='l. For f(x) = xk, k 2_1 we have a a - 1 (L p'xk) = (PvLXk) 'I-xk(-ap+bb'p+l b2 22‘ _kak 1.02pl 2 6x (3 2 0 = (p.ka) +lim Xk(-ap+bb'p+% b2 3.5.) x49 xdo Therefore here if we impose the boundary condition (3.2.12) 1im xkp = 0 and 1im xk 3.2 = o x40 x40 X the equation(3.2q5) is valid“ In Section 3.6 we see that if a(x) =bx+c, b(x) =a2.fx. then the boundary conditions (3.2.12) are automatically valid. 3.3 Some Results from Differential Inequalities: To continue our preparations for the convergence theorems of the Gauss-Galerkin formulas we need some lemmas from differential inequalities (Szarski [24]). 72 ~ Let Y = (yl.....yn)T and y = (§1....,§n) be two points of the n-dimensional space. We write Y3? if yigyi (i=l,2,....n) and N /\ Kt if yi < yi (1 = l,2,...,n) For a fixed index j we write Y g? § if yi g,§i (i = l,...,n) and yj = 9 Let a system of functions fi(t.y) = fi(t.y1.....y ) (i = l,...,n) be defined in a region D. We have Condition V+ (V-): System fi(t.y) i = l,2,...,n is said to satisfy condition V+ (V-) with respect to y in D if for every fixed index j the function f. (t,Y) is increasing (decreasing) w.r.t. each variable yl.....yj_1. yj+1.....yn separately. Condition W+ (W-): System fi(t,y) (i = l,2,...,n) is said to satisfy condition W+ (W-) with respect to Y in D if for every fixed index j, the following implication holds true Y 33 §,(t.Y) E D, (ti?) 6 D = fj(t,Y) g fj(t,§) (Y gj Q. (t.Y) e D. (tn?) 6 D = fj(t.Y) 2 fj(t.i‘)). 73 Definition 3.3.1 (Dini's derivatives) For a real-valued function w(t). defined in some neighborhood of the point t we denote D+w(to). D+o(to). D-$(to) 0! and D_o(to) by w(t0+h)-¢(to) + D w(t ) = lim sup 0 h40+ h w(t0+h)'-¢(to) D w(to) = lim inf h + h-vo+ _ cp(t0+h) -¢$(to) D w(to) = 1im sup h hfio' w(t0+h)‘-w(to) D w(to) = lim inf h ' h40‘ (the values +0 and -¢ being not excluded). Obviously if m is differentiable at tO all four derivatives are equal. Lemma 3.3.2. Let the right-hand side of the system dy. 1 = - = (3.3.1) dt Oi(t.yl,....yn) l l,2.....n be defined in some open region D and satisfy in D condition W+ with respect to Y. Let (tO.YO) E D. Assume that w(t) = (¢1(t).....on(t))T is continuous in [to.a) and that the curve Y = u(t) lies in D. Let Y(t) = (y1(t).....yn(t))T be an arbitrary solution of the system (3.3.1) passing through (tO.YO) and defined in some interval [tO,B). Under these assumptions. if 74 (3.3.2) w(to) < Y0 and D_oi(t) g_oi(t,ol(t).....on(t)) 1 = l,...,n for t E (to,a), then we have the inequality w(t) < Y(t) for t e [tO.G) n {tO.B) Corollary 3.3.3. If the system (3.3.1) in Lemma 3.3.2 has a unique solution. then Lemma 3.3.2 holds true when the strict inequalities (<) are replaced by (g). For a proof of the lemma and corollary we refer to Szarski [24]. Corollary 3.3.4. Let the right hand sides of the systems dy. l _ . = (3.3.3) dt — Oi(t.yl.....yn) 1 l,2,...,n and dyi (3.3.4) -d—t—= fi(t,y1.....yn) 1= l,...,n be defined in some region D and let the right-hand side of the system (3.3.3) satisfy condition W+ with respect to Y. Let (3.3.3) have a unique solution and assume that Y(t) = (y1(t),....yn(t))T and w(t) = (cpl(t).....cpn(t))T are solutions to (3.3.3) and (3.3.4) 75 respectively on [tO,G). With these conditions and if fi(t,- .....con) g U.(t.cp1.....cpn) (t.Y) 6 D . then w(t) g Y(t) for t E [t0.a). Proof. Notice that w(t) and Y(t) satisfies all conditions of Corollary 3.3.4. 3.4 Helly's Theorems: The following two theorems are needed in the proofcxfthe preparatory lemmas in Section 3.5. First we state Definition 3.4.1. A sequence of measures (uh) is said to converge to a measure u substantially if lim un(I) = u(I) n-OG for all (finite) intervals of continuity of u. Theorem 3.4.2. (The first theorem of Helly). Given a sequence {un} of positive measures and uniformly bounded. then there exists a subsequence (pk } and a measure u to which this subsequence converggs substantially. Furthermore, if the sequence {pm} itself does not converge substantially to u. then there exists another subsequence {“k'] converging substantially to another n measure u' which is not substantially equal to u. 76 Theorem 3.4.3. (The second theorem of Belly). Given a sequence {uh} of positive measures defined on (0,“) and uniformly bounded, which converges substantially to a measure u. Then Lim I f(x)du = F f(x)du n49 O n ”O for any function f(x) continuous in (0,9) and such D that, as intervals I T (0,6).f f(x)dun 4 I f(x)d',..n N IN 0 uniformly in n. 3.5 Preparatory Lemmas: We shall develop some lemmas that are needed in the proofs of the convergence theorems in the next section. In these lemmas we shall consider measures un(t) defined on the semi-infinite interval (O,a). It is clear that these lemmas also hold for measures defined on any finite interval (r .rz) contained 1 in (0.”). Lemma 3.5.1. Assume that the stochastic differential equation (3.5.1) dX(t) = a(X(t).t)dt+b(X(t).t)dW(t) (3.5.2) X(O) = given satisfies condition A on (0.“) x[O.T]. For t E [0,T] let 77 (t)=%a(n)’~ -12 “n t n - . .... (t) (n) k=1 k Xk(t) denotes the n-point Gauss-Galerkin solution to the corresponding system (3.5.1) with non-negative weights and distinct nodes. Let L be any fixed integer. Let min)(t) = F X ”o zdun(t): then the set (mén’uz): n>-§-(1+:). t6 [0,T]} is bounded and equi-continuous. Proof. We take fi(x) = x1, i = O,l,....2n—l. The system (2.3.1) can be written as dm(n7(t)/dt=j (Lfk) (map k k 0 1‘ 0:1,...,2n-1 Using condition A we have (3 5 3) dm(n)(t)/dt I“ (a x+b ) 6—55 d ° ' k 3 o 1 1 2»: “n 2 1 ° 2 a fk +-— (ax+b) -———du 2 IO 2 2 ax2 11 _ (n) (n) (n) " gk(m)<-2' mk-l' "‘k ) We let mo.m1....,m2n_1 be the solutions to the system dm'k (3.5.4) 7fi;-= gk(mk_2. mk_1. mk) k = O,l.....2n -l 78 with the same initial values as mk (O) k O.l.....2n -1. Then by Corollary 3.3.4 we have (n) l - (3.5.5) ml (t) g_mt(t) n > E(L-+l). t t [0,T] On the other hand the system (3.5.4) is the Hankel system for the stochastic differential equation (3.5.6) dX(t) (a x + b1)dt + (azx + b2)dW(t) 1 (3.5.7) X(O) = given If ”t = £(X(t)) is the law of the solution to the equations (3.5.6) and (3.5.7) we know by Remark 2.5.5 that the linearity of (3.5.6) implies ° k mk(t) = J" x dut k= O.l,....2n-l, o 1: that is. mk(t) is the exact k-th moment of the process. For each 0 g_k g 2n-l mk(t) is a continuous function on [0,T] and therefore bounded. Thus there exists a K2 such that for all n >-%(L+—l) and for all 0 g,t g.T (3 5 8) mm) (t) _<_m (t) gK ’ ' L z I (n) The non-negativeness of mi (t) implies that {mén)(t), n > %(L4—l). t E [0,T]} is bounded. To prove 79 the equicontinuity. we note that for given n > %(l4—t), by the mean-value theorem. we have lmén) (t2) my” (1:1)) = SEQ-(fl) “=2 ”‘1‘ for t1 < § < t2 but 3:12.251. = U: a(x,g) 5—8214» %b2(x.§) :gidunl g_f: [a(x,g)| %;é dx + %-f; [b2(x.§)|%;;% dun g f; (alx-i-bl) gaff- + %j; (azxfloz)2 22:3- dun _ (n) (n) e (n) Since g‘ is a polynomial with constant coefficients and m£2;(t). m£2i(t) and min)(t) are all uniformly bounded on t. there exists an ML such that (n) ‘1’“: (g) (n) (n (T) _<. Ig‘mbzm. m‘ {(9. my” (9) gm! Thus for fixed 3 and all n > %(z+-1) there exists an ML such that (n) (mJZ (t1) -m£n) (t2)l g MW:2 -t 1 l which implies the equicontinuity of {m§n)(t), n > %(z+-1), O g.t gDT) and the proof is complete. 80 Lemma 3.5.2. Assume that the condition A holds. Then there exists a subsequence {ukn(t)) of the Gauss- Galerkin measures (un(t)) and a sequence of functions {m:(t)] such that for any positive integer L. we have ° (k ) . n * le Int (t) = m‘(t) k an n (kn) fi" 1 uniformly on t e [0,T], where m (t) = = X duk (t). L do n Proof. By Lemma 3.5.1 the set {m{n)(t). n > 1. t E [0,T]) is bounded and equicontinuous; therefore. by the Ascoli theorem there exists a subsequence {k1 n] (k .n) ' contained in (n) such that {m1 1 (t)) converges * uniformly to a limit function m1(t). We can assume ( (k1 n) kl'n > 3/2. The set (m2 (t). kl'n > 3/2. t e [0,T]} is again bounded and equicontinuous: therefore. there exists a subsequence {k2 n) contained in (k ) such (k that {m2 2,n (t)) converges uniformly to a function l,n * m2(t). Continuing this process we get a subsequence (k1 n} such that (for any positive integer z) (k ) * {m ‘£.n } converges uniformly to a function m£(t). Now consider the subsequence {kn n} which we rename it (kn) that is contained in (n). It is obvious that (k ) * {m1 n (t)) converges uniformly to m‘(t) and the proof is complete. 81 Remark 3.5.3. In Lemma 3.5.2 if we start with any subsequence (kn) C (n). in the same way we can show the existence of a further subsequence {5k } C (kn) and a sequence of functions (m:(t)] such :hat (k ) * Lim m n(t)=m 1m uniformly in t E [0,T]. Lemma 3.5.4. For any t E [0,T], the elements of i the sequence (m‘(t)) are indeed moments of a measure, * i.e. there exists a measure P (t) such that for any L or}. O L * * dP (t) = mz(t) Proof. Assume {“k'} is a subsequence of (uh) such n that for any positive integer n. we have (kr'l) * Lim nu (t) = m‘(t) k an n By the first theorem of Helly there exists a subsequence ‘ . 'k of (kg) which we rename (kn) and a measure P (t) * such that f (t)) converges to P (t) substantially. ukn Now we claim that for any 1 and any 6 > 0, there b (k ) exists an interval (O.b) such that f X‘duk > m‘ n (t) -e O n for all n, i.e. L X d < e for all n Ib pkn 82 To prove the above claim. let us assume the contrary. Therefore, there exists an e > O and an n such that * . for any b there exists an kn 6 (kn) such that ° 1 (n g Xd *>_€ ub pkn Now consider (k’) .. Q n t+l L m (t) =V X :1, *(t) 2 xxd *(t) 1+1 d0 Ukn Ib ukn 2_b d *(t) 2_€ 'b j‘b “kn * Thus we have shown that for any M. there exists a kn * (k ) n such that mz+1 (t) 2.M or (kn) * lim.bd = a = M (t) kn“ 1+1 2+1 which is a contradiction. Thus by second theorem of Helly we have . ° z ° t * * Lim x d (t) = x dP (t) = m (t) kn“ IO H.kl'l j‘0 ‘ and the proof is complete. Note in Lemma 3.5.4 we have shown only the existence of a measure P*(t) which need not be unique. The following Lemma shows that if the measures P*(t) are uniquely determined by the moments m:(t). then the Gauss- Galerkin integration formulas converge to the true values of integrals. 83 Lemma 3.5.5. Under condition A if for each t * 'k the measure P (t) is uniquely determined by (mz(t)), then we have * 'k t r” mz(t)-m£(0) = IO “0 bx L-la(x.s)dP*(s)ds t + % f j 2(1-1)x“2b2(x.8)dP*(S)ds Proof. Let (kn) be a subsequence of positive integers such that for any 1., Lim mg n = mz(t) uniformly in t. By the assumption that for any t, {m:(t)} determines the measure P*(t) uniquely. the subsequence (pk (t)) converges to P*(t) substantially: otherwise. by thg first theorem of Helly for a fixed t there exists a subsequence (us } which converges to a ’ k n * * * measure Pl(t) and P1(t) is not P (t) substantially. In the same way as we did in Lemma 3.5.4 we can show that (s ) l (t) = m’;(t) = J” x‘dP:(t) = j‘ x‘dP*(t) . O O limrn and this is not possible. Also in the same way as we proved in Lemma 3.5.4. we can show that for any t and and any 3 >’O there exists b > 0 such that O f f(x)duk (t) is less than 6 for all k . where b n n f(x) is a polynomial with non-negative coefficients: 84 a Lim JP Xf (X) dpk = a k 4° 0 n n which is a contradiction. Now let a(x) be any continuous function which is dominated by a polynomial f(x) with non-negative coefficients. For any 6 > 0, there exists b > 0 such that I)" w(x)d‘ (t) _<_ )cp(x)ld,* (t) b F‘kn i J.‘b LICn _<_ ” f(x)d' (t) < e Jb Mkn for all k . Thus by the second theorem of Helly, we n have (3 59) L' j. ()d (t) (a ()dP*(t) . . 1m w x = m X ka- 0 Mkm o n Recall that (k) dmz n (t) ° L dt = IO 1x a(x,t)dukn(t) ”1 1-22 + -£(z-1)x b (x,t)d“ (t) I02 “kn 1-1 1. F (a X+b )x d' (t) S ”o 1 1 “kn 1 ° 2 2-2 + -2- LH-l) f0 (a2x+b2) X dukn(t) (kn) (kn) (kn) = g‘(mz_2 (t). m2_1 (t). m2 (t)) 85 We know that gt is uniformly bounded. By the continuity (kn) dm‘ (t) of dt we have (k ) (k ) t a L-l mg n (t) -mL n (O) = $0 (50 Lx a(x,s)dukn(s) a L 2-2 2 (s) +fo 2 l.(2.-l)x b (x,s)dp.kn )dS t (kn) (k ) S.) g‘(m1_2 (S).....m‘ n (s))ds 0 Thus by the Dominated Convergence Theorem and (3.5.9) we have t a a 'la(x.s)dp*(s) +j 15- u: -l)x"'-2b2 o (x.s)dp*(s))ds . 3.6 Main Convergence Theorems: As we have mentioned before all the Lemmas in section 3.5 are true for finite interval. Thus we can prove the following theorem. Theorem 3.6.1. Under condition A if (r1.r2) C (0.0) is a finite interval, then the Gauss-Galerkin integral formulas 2% t(f) converge to It(f) where f(x) is any continuous function defined on [r1.r2]. Proof. To prove the theorem. first we show that for any positive integer L and O g_t gDT, the sequence 86 7 {mén)(t)} converges to the exact moment 6(Xz(t)). If we consider the sequence (min)(t)) as a sequence in the complete metric space C([O,T]). it suffices to prove that any subsequence of (min)(t)) has a further subsequence which converges to the exact moment uniformly. Let (k;) be a subsequence of positive integers. By Lemma 3.5.2 there exists a subsequence {kn} C (k;} 'k and a sequence {mz(t)) such that for any L. (k ) , Lim nu’n (t) = mz(t) uniformly in t. By Lemma 3.5.4. k-n n 'I' * m‘(t).£ = 0,1,2,... are the moments of a measure P (t). Now since [r1.r2] is finite. the coefficients a and b in (3.5.1) are bounded and therefore. the equation (3.5.6) reduces to (3.6.1) dX(t) aldt-tbldW(t), a1 and b1 constants (3.6.2) X(O) given Since the density of the solution to the above stochastic differential equation is normally distributed (Arnold [1], p. 133). the moments of the solution satisfy . L mL(t) g C(2z+1) .R for some constants C and R independent of L that may depend on t. Thus for any I and n we have 87 m1 £.mz(t) g_C(2£-+1)1R * * and (m‘(t)} determines the measure P (t) uniquely. Thus by Lemma 3.5.5 for any 1 we have i * (3.6.3) mL(t)-m‘(0) .t .3” 1-1 * =.) ; AX a(x,s)dP (5)515 o "o 1 pt ” 1-2 2 * ds + E-JO IO l(£-—l)x b (x)dP (t)) t * L j. (P (S) .Lx )ds 0 On the other hand P(t) = p(t,x)dx. the exact measure, satisfies the same equation as above. Thus we have t (P(t).f)-(P(O).f) = f (P(s).Lf)ds O for all polynomials f. Now let a(x) be any C2-function on [rl.r2]. By the Weierstrass approximation theorem and the finiteness of the interval [r1.r2], for any e > 0 there exists a polynomial Pm(x) such that Hcp(x) -Pm(x)|) + Hcp'(x) —Pn’1(x)H+)):p”(x) -PI;(X)H < e for all x 6 [r1.r2]. Now we wish to prove that t (3.5.4) (P*(t).cp) - ‘) Thus by condition A5. (3.6.6) holds for any w(x) sufficiently smooth: therefore we have * * L P (t) = P(t) and mz(t) = 6(x (t)) . for any positive integer 1. Now this and equation (3.5.9) complete the proof of the theorem. In the remainder of this section we shall discuss the convergence of the Gauss-Galerkin method in the (O,w) case. First let us consider the special stochastic differential equation (3.6.7) dX(t) X(t)dt+X(t)dW(t) given (3.6.8) X(O) The Fokker-Planck equation for the above equation is 92 = _B(xp) +1 1:20:29) at 5x 2 a 2 x p(0,x) = given The Hankel system for the moments of p(t,x) is dmn(t) _l. 2 _ dt -2 (n+n )mn(t) n—C,l.... 90 with the solution %m+n%t m (t) =m (O)e n=O,l,... n n As these moments do not satisfy the sufficient conditions (3.1.5); p(t,x)dx may not be uniquely determined by these moments. Also it is obvious that the Gauss- Galerkin solutions of (3.6.7) and (3.6.8) are the same as the Gauss-Christoffel ones. Thus the sequence of functions (m:(t)) is the sequence of the exact moments. Thus. in general even if a linear stochastic differential equation satisfies condition A, Lemma 3.5.5 may not be true and therefore we do not have a convergence theorem similar to the Theorem 3.6.1. However, we have ' convergence for a more restrictive class of stochastic differential equations defined on (O.m). First consider the stochastic differential equations (3.6.9) dX(t) (bX(t) +a)dt+./2aX(t) dW(t) given where p(0,x) = Be-BX, B > 0 (3.6.9)’ mm for a 2_O defined on (O,w)><[O,T]. The Fokker-Planck equation of (3.6.9) is fig = ._B(bx+a)p+ azaxp at 5x ax2 ' 2 —bp+ (a-bx) -:-E+ ax é—E . x ax2 91 By the separation of variables, we seek solutions to above partial differential equation in the form e-x”v(x). Obviously v(x) satisfies the differential equation. xv”(x) + (1 -53 x)v'(x) -2 v(x) +2L v(x) = O a a a If b > O substituting u(g) = V(X) Where X = % 8 we have (3.6.13) gum) + (1 -s)u’(§) + (%-1)u(§) = o It is well known (see [l6],p.243) that a necessary and sufficient condition for the differential equation xy”+ (1 -x)y’+(ly = O to have a polynomial solution is that. p = n. Furthermore, Ln(x), the Laguerre polynomial of degree n is the 1 only solution. Now if we assume jgn-l = n, i.e. 1n = b(n-Pl) we may write 8 -K t (3.6.11) p(t,x) = Z) a e n Ln(§'x) as a solution for the Fokker-Planck equation. With p(0,x) = Be-Bx and since ([18]. p. 135 and 205)) (3.6.12) (0 e‘g Ln(g)Lm(§)dg = o m # n . (3.6.13) P e‘§[Ln(§)lzdg = 1 ”o (3.6.14) exp(-xz/l -z) = lenLn(z), [2| < 1, and .31. l-z Q (3.6.15) j 6'25 Ln(§)d§ = z‘r“1(z-.1)n . O we obtain from (3.6.11) -3 36"Bxe‘m/al L (2 x)6(-‘2 x) JO m a a ° Cbx/a b b b - _ _ — _ n:b e Ln(a x)Lm(a x)d(a X) en » Abx/a b 2 b - am Jo e [Lm(a X)] d(a X) am Thus by (3.6.13) we have _ ‘ag -n-l aé’n = .EE -1 a8 n an-B(1+b) (b) B(l+b) (17%;) and -bt °° -bt _ Es__ kn 19. (3.6.16) p(t.x) — “£41 £0 (b+aB) Ln(a x) b -bt = Bbe Abt exp(-bBe_btx/ b+aB-aBe -bt (b+aB-aBe )) In the case when b < 0 equation (3.6.10) becomes II 3 I x .— (Eu +(1+=)u +(l+m)u-O In this case for 1+Tg—I= l+n the solution of the _3 above equation is of the form u(g) = e b Ln(§). Thus equation (3.6.11) becomes 93 b an - X p(t,x) = z a e-nlblt e a L (lb-1- x) n=O n n a With p(0,x) = Be-BX using equations (3.6.12) -(3.6.15) we get a = B(f%%)-n—l ( Ba n n (En--1) . Thus formally 6 Ba - n ._121 t Z) lb) l:| 1 -|b|nt a X ,Ib] p( .X) a a e e Ln‘ a x) n=O lb‘ -Jfid.x e l | n igi'e a Z) (5&6; b e-Iblt) Ln(l§i'X) n=O Note that the series above converges for B > [bl/2a to the limit _ Bbe-bt -bBe-btx P(t'X) ' -bt ex? -bt b+aB-aBe b+aB-aBe which is the same as equation (3.6.16). However the limit is defined for all 8 without any restriction. Thus for any b. p(t,x) is exponential. In particular there exists Y such that 0 (3.6.17) f eYXp(t,x)dxv+I xeYXp(t.x)dx o o + f x2eYXp(t,x)dx g.M < ° 0 which is needed later. It is interesting to see that the stochastic differential equation (3.6.9). (3.6.10) satisfies Condition A. It is easy to verify Conditions 94 Al-A3. By integration by parts we can verify that A4 is valid. Also in this special case p(t.O) need not be zero. The eigenvalue prdblem of condition A5 for equation (3.6.9) becomes an azu (bx+a) -—+ ax—= ).u 5x 2 Bx As we saw above, depending on whether b is negative or positive, the eigenfunctions are Ln(x) or e-an(x) where Ln(x) is the Laguerre polynomial of degree n and they satisfy the requirements of the Condition A5. Finally we have derived the density explicitly (Equation (3.6.16)). We refer to (Feller [8,a]. p. 516) for the uniqueness of the solution and the validity of Condition A6 holds. Now consider any stochastic differential equation of the form (3.6.18) dX(t) a(X(t)) +b(X(t))dW(t) (3.6.19) X(O) given which satisfies Condition A on (0,») x[0,T]. In addition we assume (3.6.20) b(x) g_¢20x (o a positive constant) and X(O) has a moment generating function, i.e. n 6 mn(0) M= , n. < w for some 6 > 0 he 0 95 There exists a 81 > 0 such that gL-> (1+M)‘3-l or 1 «a -8 X n _ J lene 1 dx = n1(£L) 2,M9 nniil m (0) o ‘31 n In the same way as in Lemma 3.5.1 we can show that for any n, the n-th moment of the solution to the equation (3.6.18) is less than or equal to the n-th moment of the solution of an stochastic differential equation of type dX(t) = (bX-to)dt-h/23§ dW(t) with the initial density -Bx Be whose n-th moment is greater than or equal to mn(0) and therefore the functions {m;(t)} determine a measure uniquely. Thus all the Lemmas of Section 5 are valid. Let 81 be -bt _ . Bbe 9 - m1n Cbt 1' ogth b + 08 - oBe We are now ready to prove the following theorem. Theorem 3.6.2. Assume that the stochastic differ- ential equation (3.6.18) and the initial value (3.6.19) satisfy Condition A together with (3.6.20). If 90, the constant in A5. is less than or equal to 61, then the Gauss-Galerkin integral formulas converge to the true value of integral, i.e. 1.11:1 Zn,t(f) = It(f) where f(x) is any continuous function dominated by a polynomial. 96 ggggf: The proof is similar to that of the Theorem 3.6.1. Let (kg) be a subsequence of positive integers. By Lemma 3.5.2 there exists a subsequence (kn) C {k;} and a sequence [m;(t)) such that for any 4, Lim méknht) = m;(t) uniformly in t. Similar to k d” n what we did in Theorem 3.6.1 we can show that ‘1a(x)dp*(s)ds H L"; L1 N m;(t)-m;(0) l t a 36—2 2 * + J‘ J Lu -1)x b (x)dP (S)ds 2 o o t JP (P* (S) .Lx‘) ds 0 0n the other hand P(t) = p(t,x)dx, the exact measure. satisfies the same equation as above. Thus we have t (9*(t).f)«-(P*(0).f) .f (2*(s).Lf)6s O and t (P(t).f)-—(P(0).£) f (P(s).Lf)ds O for all polynomials f. Now let @(x) be a Cz—function -ex defined on [0,“) such that Lim $‘(x)e O = 0. By the x46 generalized.Weierstrass approximation theorem (see Buck [ 2L p. 74), for any a > 0 there exists a polynomial Pm(x) such that 6 x lWX) -Pm(X) I + (V(X) -Pn"(x) ) + It?” (1:) -P;1(x) l < €e 97 for all x E [0,”). Now we wish to show that .A t (3.6.21) (P*(t) .cs) - (FWD) .6?) =f (P*(s) .mds 0 To do this note that t I(P*(t) .P) - (P*(0) .12) -J‘ (9*(6) ,mmsl O * 0 ]('P (t) ,v-Pm+ Pm) -(P*(0) ,cp-Pm+ Pm) t * - (O (P (s) .L(=p-Pm) +LPm)dsl g I(P*(t) .62 ‘Pm‘ - (P*(0) .19 -Pm) t 'k - '0 (P (s) .L(=0-Pm))| * 'k g )(P (t).cp—Pm)l+lP (0) .12-Pm)! t i + (‘0 )(P (s).L(ca-Pm)))ds (CC where C is a constant depending on the bounds of no 9OX a 80x J e dP*(t), f xe dP*(t) O O a 9X and I x2e O dP*(t) O which is independent of t because of (3.6.17). Thus equation (3.6.4) holds for any C2-function v(x) such -8 x that ¢'(x)e O * O as x * °. We now let un be an eigenfunction for the eigenvalue problem 98 (3.6.22) lu = Lu Since u is a Ca-function and by Condition A5 Lim use = 0, equation (3.6.21) for un becomes t * _ * = P * (P (t) ,un) (P (O),un) JO (P (s),Lun)ds t i = 1 (0 (P (s),un)ds Thus in the same as in Theorem (3.6.1) we have (3.6.23) (P*(t),un) = (P(t),un) Now let ((x) be any Ca function with compact support contained in (0.”). By Condition A5 for any N -9 x e > 0 there exists a finite sum of the form Z} cnune 0 n=1 such that N -90x 1w - Z) cnu e I < 6 n=1 n 90x Then for @(x) = )(x)e we have N 90x ($1- 23 c u I < 6e n n n=1 Thus * N * N * (“P.P (t) -P(t)) = (ep- 23 cnun'P -P) + Z (enun.P -P) n=l n=1 N ... _ *— ._(o Eacnun,P P) 99 implies that D 9X )(cp.P*(t)-P(t))|< e(j‘ eodp*(t)+j‘ e dP(t)) O 0 (CE: Thus (3.6.23) holds for any v(x) with compact support: therefore, P*(t) = P(t). and m;(t) = 6(x‘(t)) . for any positive integer 3. Now this and equation (3.5.9) completes the proof of the theorem. CHAPTER IV NUMERICAL EXAMPLES We present in this chapter several numerical examples that serve to illustrate the Gauss-Galerkin method developed in the preceding chapters. We note that the Kolmogorov equation corresponding to nonlinear stochastic equations has been solved explicitly only in a few simple cases. We have thus included examples for some simple stochastic equations where the exact solution are known so that the numerical results may be compared with the exact ones. We have also included examples for problems which do not satisfy the hypotheses in the convergence theorem but whose Gauss-Galerkin approxi- mation seem to be accurate nonetheless. The Gauss-Galerkin solutions are obtained by using the welléknown programs (e.g the International Mathematical Statistical Library) to compute the initial Gauss- Christoffel weights and nodes and then using standard ordinary differential equation solvers (Dgear). 4.1 Example 1: Consider the stochastic differential equation lOO 101 (4.1.1) dX(t) = (1 —X(t))dt+V/_2X(t) dW(t) (4.1.2) p(0,x) .= 2 exp (~2x) studied in Chapter 3, by equation (3.6.16) p(t.x). the exact density, is given by t t (4.1.3) p(t.x) = —3€3— exp jig—’1. 2e 1-1 2e 3—1 and the exact n-th moment is given by (4.1.4) M (t) = n'.(1-]2'-('.="c)n n From (4.1.3), the exact S-point nodes and weights are computed and shown in Tables 4.1.5 and 4.1.6. The numerical solutions for the Gauss-Galerkin S-point nodes and weights are given in Tables 4.1.7 and 4.1.8. Using 5-point Gauss-Galerkin nodes and weights the first five moments are computed. Tables 4.1.9 and 4.1.10 show the exact and above computed moments. 0 61:16 'o'o'o'oin 102 Table 4.1.5 The exact 5-point nodes for Example 1 .13178 .18363 .21508 .23416 .24573 .25274 .25699 .26267 .26344 .26354 F4 (A P” I4 r4 14 P‘ (A .70670 .98477 .1534 .2557 .3178 .3554 .3782 .4086 .4127 .4133 .7982 .5058 .9349 .1952 .3531 .4489 .5069 .5843 .5948 .5962 .5429 .9369 .7824 .2953 .6063 .7950 .9094 .0619 .0825 .0853 6. 10 11 11 12 12 12 12 12 3204 .8073 .316 .231 .785 .122 .326 .598 .635 .640 'o'o'o'o .52176 .52176 .52176 .52176 .52176 .52176 .52176 .52176 .52176 .52176 103 Table 4.1.6 8‘2 .39867 .39867 .39867 .39867 .39867 .39867 .39867 .39867 .39867 .39867 a3 .75942E-1 .75942E-1 .75942E-1 .75942E-1 .759422-1 .759423-1 .75942E-1 .75942E—1 .75942E-1 .75942E-1 The exact 5-point weights for Example 1 a4 .361183—2 .36118E-2 .36118E-2 .36118E—2 .36118E-2 .36118E-2 .36118E-2 .36118E-2 .361183-2 .36118E-2 as .23370E-4 .23370E-4 .23370E-4 .23370E-4 .23370E-4 .2337OE-4 .23370E-4 .2337OE-4 .23370E-4 .23370E-4 The Gauss-Galerkin S-point nodes for 'o'o'o'olno .13178, .18366 .21507 .23410 .24560 .25259 .25686 .26258 .26338 .26353 (A H k' P‘ re 14 H 104 .70670 .98490 .1533 .2554 .3171 .3546 .3774 .4081 .4124 .4132 Table 4.1.7 :4 1.7982 2.5062 2.9347 3.1944 3.3514 3.4467 ‘ 3.5049 3.5830 3.5939 3.5959 Example 1 .5429 .9380 .7821 .2937 .6030 .7908 .9056 .0593 .0808 .0848 10 11 11 12 12 12 12 12 .3204 .8105 .315 .228 .780 .115 .319 .594 .632 .639 105 Table 4 .1 .8 The Gauss—Galerkin S-point weights for Example 1 N 0 UI O C) (D 0 U1 .52176 .52180 .52176 .52176 .52176 .52176 .52176 .52176 .52176 .52176 .39867 .39865 .39866 .39867 .39867 .39867 .39867 .39867 .39867 .39867 a3 .75942E-1 .75917E-1 .75939E-1 .75942E-1 .75942E-1 .75942E-1 .75943E-1 .75943E-1 .75943E-1 .75942E-1 a4 .36118E-2 .36082E—2 .36112E-2 .36117E-2 .36117E-2 .36117E-2 .36118E-2 .361183-2 .36118E-2 .36118E-2 as .23370E-4 .23312E-4 .2336OE-4 .23368E-4 .23368E-4 .23370E-4 .23370E-4 .233703-4 .23370E-4 .23370E-4 'o'o'o'o'o'o'o'o'oo H rd (A H (a H 106 Table 4.1.9 The exact moments for Example 1 0 1 2 3 4 .0000 .50000 .50000 .75000 1.5000 .0000 .81606 1.3319 3.2608 10.644 .0000 .93233 1.7385 4.8625 18.134 .0000 .97511 1.9017 5.5630 21.698 .0000 .99084 1.9635 5.8367 23.133 .0000 .99663 1.9865 5.9396 23.678 .0000 .99876 1.9950 5.9777 23.881 .0000 .99954 1.9982 5.9918 23.956 .0000 .99983 1.9993 5.9970 23.984 .0000 .99994 1.9998 5.9989 23.994 107 Table 4.1.10 The Gauss-Galerkin moments for Example 1 t m0 m1 m2 m3 m4 0 1.0000 .50000 .50000 .75000 1.5000 1.0 1.0000 .81599 1.3317 .2599 10.640 2.0 1.0000 .93186 1.7367 .8551 18.097 3.0 1.0000 .97456 1.8995 .5537 21.650 4.0 1.0000 .99029 1.9613 .8269 23.081 5.0 1.0000 .99626 1.9851 .9333 23.643 6.0 1.0000 .99848 1.9939 .9727 23.854 7.0 1.0000 .99930 1.9972 .9874 23.933 8.0 1.0000 .99967 1.9987 .9940 23.968 9.0 1.0000 .99986 1.9994 .9974 23.986 The agreement above appear to be excellent. We also note the rapid convergence of the above solutions to their steady-state values. 4.2 Example 2: Consider the stochastic differential equation (4.2.1) dX(t) = X(t) (1 -X(t)dW(t) (4.2.2) X(O) = given 108 defined on (0.1) x [0,T] where the initial process X(O) is a random variable whose density is oXexp(—l/(l—X)2) if Oh.H momma. OOOO.H m .mpmu 080m mgu pm monam> kept» on cam» paw mmoao kHQMCOmmmH mum mucmEOE was mm.mmm sm.Hmm Ha.omm 66.00H e sm.oma H6.Hme 60o.oo 04¢.em we sev.sm ~0~.0m m0m.0m Hmo.0a me msm.HH. msm.ea .mom~.m osso.m es seme.s 6060.6 Home.m 0666.~ me sows.e omen.e Nomm.H mmH~.H ms 66060. 06000. memos. sooem. He 0000.4 0000.H 0000.H 0000.H 06 0 o N H 0\6 Amvv mHmmem mo mucmEOE usmflm umuwm was m.e.v magma APPENDIX A A GEOSTOCHASTIC MODEL In This Appendix we consider a geostochastic model which is not covered by the kind of stochastic differen- tial equations discussed in previous chapters (A.1) dX(t) (X(t) -x2(t))dt+p(6(X(t)) -X(t))dt 2 +(YX(t))1/ dW(t) (21.2) X(O) given where Y > 0 and O g_p g_l defined on (0,”). The main difference is that in (A.l), the drift term depends on the law of the process if p # 0. The convergence theorems developed in this dissertation do not apply to this model. However the Gauss-Galerkin method can still" be applied to such problems, as we shall illustrate below. The Fokker-Planck equation corresponding to the equation (A.1) is %%= -63: [(p or yp(t,y)dy+ (l-p)x-x2)p(t.x)] 1 32 * +33 y -§-(xp(t.x)) = L P(t1X) Bx 121 122 The steady-state density of equation (A.1) is given by (see Dawson [5]) 2e (7'1) (A.3) Pe(x) = Cx exp(-(x-—l-+p)2/Y), x 2_0 Where C is a constant which makes C‘f Pe(x)dx = 1 and 0 e is a positive constant which satisfies the equation 0 (A.4) e = m(e) = p I xliahd dx 0 For fixed p. Y and any positive number e, m(e) is an integral which can be computed using the Gauss-Laguerre integration formula. Then we plot the graph of y = m(e) and y = e where the intersection provides an initial value for finding e*, the root of the equation e = m(e). Having Pe*(x), we can compute the exact moments and therefore the corresponding Gauss- Christoffel nodes and weights for comparison with the Gauss-Galerkin nodes and weights. The Hankel system of moments corresponding to equation (A.1) is given by dm1(t) (A.5) -_dE_— = pm1(t)4-aml(t)- m2(t) dmn(t) (A.6) T (npm1 (t) + §n(n - 1) Y)mn-1(t) +anmn(t) - nmn+l(t) , n 2 2 m(O)=6(X8), a=1-p 123 As we can see, the system (A.5) is not closed. Using the algorithm given in (2.9.1) we may close the system, solve it numerically and compare the steady- state results with exact steady-state moments. We studied the case of p = .5 and Y = 1. i.e. for the stochastic differential equation (A.7) dX(t) = (.5X(t) + .56(x(t)) -X2(t))dt + (X(t) fiat/e(t) The first six exact steady-state moments are m0 = 1.0000 1111 = .60284 m2 = .60434 (A.8) m = .77234 3 m4 = 1.1602 m5 = 1.9805 The S-point Gauss-Galerkin steady-state nodes and weights of the equation (A.7) are computed. Also using the Gauss-Galerkin nodes and weights, the first six steady-state moments are computed. The results are shown below. 124 m0 = 1.0000 m1 = .56044 m2 = .56044 (A.9) m = .71749 3 111.4 = 1.0762 m5 = 1.8154 The system (4.4.5) corresponding to equation (A.7) for n = 10 is solved and the first six moments are shown below. m0 = 1.0000 1111 = .56044 m2 = .56044 (A .10) m = .71748 m4 = 1.0762 m5 = 1.8153 By the equivalence of the system (2.2.4) and (2.5.12) we expect that the results 1J1 (A.9) and (A.lO) to be very close and indeed they are. The numerical results for first three moments are accurate within 5%. We have done computations for this example with Laguerre polynomials as a basis for polynomials of degree less than or equal to 2n-l x 2n-—1 instead of l,x,--~, , the numerical results are identical. APPENDIX B GALERKIN METHOD WITH FIXED NODES In Section 2.2 we discussed the n-point Gauss- Christoffel approximation for a measure u defined on (r1,r2) where -° g,r1 < r2 g_°.. Since the nodes [xk} are the zeros of orthogonal polynomials, the weights (5k) can be chosen so that r2 n. f(x)du = Z " f(" ) +E[f1 J‘rl k=l ak xk is exact for all polynomials of degree less than or equal 2n-l. This was the basis for the Gauss-Galerkin method that we developed in Chapter III. We shall consider in this Appendix another Galerkin method based on nodes that are fixed in time. Let x1.x2,...,xn be any n distinct points. We can find constants $1,...,an uniquely so that r2 n A f f(X)dLl= Z akf(xk)+E[f] r1 k=l 125 126 is exact for all polynomials of degree less than or equal n to n-l. (Stroud [21],p.107). The sum 2‘. f( ) k=l ak X7‘ is called a quadrature sum. Also, as before, if f(x).f(l)(x),...,f(n)(x) are continuous on [r1,r2]. then there exists a function K(s) so that ar2 n ‘ nr2 E[f]=J f(x)du- Z akf(xk)=J r1 k=l K(S)f(n) (s)ds r1 Let p(t,x) be the solution to the equation (2.1.3) and (2.1.4). For given nodes x1....,xn. that do not change n A with t, the n-point quadrature sum. 2: ak(t)6 is an *k:1 k (x. approximation to p(t,x)dx in the sense that equation (2.1.7) can be written as d n . (8.1) —( Z a. (t)v( )+E(V)) dt ]_1 k Xk n = Z 31((t) (Lv) (xk) +E[Lv] k=l If (fi(x)). i = l,...,n is a basis for polynomials of degree less than or equal to n-l, equation (8.1) for fi(x)'s becomes 6 E45 2 31((t)fi(xk) = £k(t)(1.fi) (xk) d (13.2) 5? k=1 k 1 + E[Lfi]. i = l,...,n 127 The system (B.2) is a system of ordinary differential equations for the weights ak(t). It is not a closed system as p(t,x) is involved in E[Lfi] and thus can not be used for the solution of the weights (ak(t)} without explicit knowledge of p(t,x). The Galerkin method with fixed nodes for approxi— mation p(t,x) is obtained from (B.2) with the terms E[Lfi] dropped d n A - _ n (13.3) 3,:- R231 ak(t)fi(xk) - XE]. ak(t) (Lfi(xk)) , for i = l,2,...,n where {fi(X)31 is a basis for polynomials of degree less than or equal to n-1, which we shall take to be 1,x....xn-1. The system (8.3) with given initial p(0.x) may be cast in matrix form as (13.4) AX = BX (B.5) X(O) = given where XT = (31(t)100003n(t)) r f1(x1) fl(x2) f1(xn) f2(x1) f2(x2) --~ f2(xn) fn(xn) fn(x2) --- fn(xn) 128 and Lf1(xl) Lfl(x2) ... Lf1(xn) Lf2(xl) Lf2(X2) ... Lf2(xn) B: ' Lfn(x1) Lfn(x2) ... Lfn(xn) The matrix A in (8.4) is the well known Vandermonde matrix which is non-singular. In the case when the coefficients of the stochastic differential equation are polynomials, we can find as in Gauss-Galerkin method the corresponding Hankel system of ordinary differential equations. (3.6) T=gk(m1.m2.~-) k: O.l,°°-,n-l (8.7) mk(O) given k ll 0 H :3 I .... In this case also if the degree of at least one of the coefficients is greater than one, moments of order higher than n«-l appears in some of the equations of the system. Theorem 8.1. The system (8.6) can be made closed. Proof. As before we shall show that it is possible to express all the moments that appear in (8.6) with order higher than n-l in terms of the lower order moments. 129 Recall that 1'1 m(t)= 23 (+6)“ n=o,1.--- 16:1?"k xk m0 /1 1 1 a1 In1 X1 x2 Xn 8‘2 (8 .8) = n-1 n-1 n-1 mn-l X1 x2 Xn an m xn xn xn a n l 2 n l n+1 n+1 n+1 n+1 1 x2 ... Xn a2 (3.9) : = : 2n-1 2n-1 2n-1 mZn-l/ X1 3‘2 Xn an In general if T _ Mk _ (mkn'mkn+l"°°'mkn+n-l) and gkn gkn ... an 1 2 n an+1 *kn+1 ... Xkn+1 l 2 n A = k I kn+n-l kn+n-l kn (n-l) X1 X2 "° Xn then we have M'k=AkX k=0,1,2,... From (A.l.8) we have Therefore, Mk = AkAglMo k = 1,2,... and the proof is complete. As we see from equation (8.4) and (8.8) A0 = A is independent of time in the present case and therefore we invert A only once while in the Gauss—Galerkin 0 method. We must perform matrix inversion at each time step. The above method is easy to implement and often yields satisfactory results. Unfortunately we do not have convergence theorem as n 4 a as those we established 131 in Chapter III for the Gauss-Galerkin method there. This should not come as a surprise as it is well known that quadrature formulas with fixed nodes named as the Newton-Cotes formulas do not have convergence in general though at each n we do have error bounds. As a numerical example we solve the system (8.6) for the geostochastic model (A.7) with n = 10 and an initial atomic measure chosen randomly with xi = i. a. = .l i = l,---,10. The first six steady-state moments 1 are shown below. m0 = 1.0000 m1 = .56242 1112 = .56242 m3 = .72058 m4 = 1.08086 m5 = 1.82393 The results are close to those obtained by the Gauss-Galerkin method. LI ST OF REFERENCES LIST OF REFERENCES Arnold:. Stochastic Differential Equations, Theory and Application, Wiley Intersciences, 1974. Buck: Studies in Modern Analysis, A.M.S. 1962. Chow and Hale: Methods of Bifurcation Theory, Springer, 1982. Coddington and Levinson: Theory of Differential Equations, New York, McGraw Hill, 1955. Dawson: Qualitative behavior of geostochastic systems, J. Stochastic Processes and their Application, 10 (1980) pp. 1-31. : Galerkin Approximation of Nonlinear Markov Processes: Statistics and Related Topics. Proceeding of the international symposium on Statistics and Related Topics held in Ottawa, Canada, May 5-7. 1980. North-Holland Publishing Company. Ethier: A Class of Degenerate Diffusion Processes Occuring in Population Genetics. Comm. Pure Appl. Math. 1976. pp. 483-493. Feller: (a) The Parabolic Differential Equations and the Associated Semi—Group of Transformation, Ann. Vol. 55. 1952. PP. 468-519. : (b) Diffusion Processes in One Dimension. Amer. Math. Soc. Trans. Vol. 77. 1954, pp. 1-31. Fischer: Ueber das Caratheodorysche Problem, Potenzreihen mit Positive reellen Teil betreffend, Rendiconti del Circolo Matematica di Palermo. 32 (1911), 240-256. 132 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 133 Forsythe and Moler: Computer Solution of Linear Algebraic Systems, Prentice-Hall, 1977. Friedman: Stochastic Differential Equations and Application to P.D.E.'s, Vol. I, Academic Press, Inc. 1975. Gikhman and Skordkhod: (a) Introduction to the Theory of Random Processes, Philadelphia W.8. Saunder, 1969. : (b) Stochastic Differential Equations, Springer 1972. Hale: Ordinary Differential Equations, Wiley- Intersciences, 1969. Krylov: Approximation Calculations of Integrals, Macmillan, New York, 1962 (Trans. from 1-st Russian ed. 1959 by A. Stroud). Mandl: Analytic treatment of One-Dimensional Markov Processes, Springer (1968). Mangnus, Oberhettinger and Soni: Formulas and Theorems for Special Functions of Mathematical Physics, Springer, 1966. Muller-Pfeiffer: Spectral Theory of Ordinary Differential Operators, John Wiley and Sons, 1981. Rainville: Special Functions, the Macmillan Co., 1960. Shohat and Tamarkin: The Problem of Moments, A.M.S. 1950. Strang and Fix: An Analysis of the Finite Element Method, Prentice Hall, Inc.. 1973. Stroud: Numerical Quadrature and Solutions of O.D.E.'s, Springer, 1974. Stroud and Kwan-wei Chen: Peano Error Estimates for Gauss-Lagrange Quadrature Formulas, SIAM. Numer. Vol. 9. No. 2 (1972). Stroud and Stanch: Quadrature Formulas with Multiple Gaussian nodes, J. SIAM. Numer. Anal. Ser. B - Vol. 2. No. 1. PP. 129-143. 134 24. Szarski: Differential Inequalities, Wars Zawa, 1965. 25. Uspensky:. On the Convergence of Quadrature Formulas Related to an Infinite Interval, Trans. Am. Math. Soc. 30 (1928). PP. 542-559.