M W W WU u I 1 \l‘ E W! H H .400 IN?» LN cr>""\J THESIO “ne‘esmzazggm 31 'I' "'“" - - . g. . ,M “1‘- ’-. "3"sz This is to certify that the dissertation entitled NUMERICAL APPROXIMATION OF THE LAWS OF SOME DIFFUSION PROCESSES presented by Ponniah Elancheran has been accepted towards fulfillment of the requirements for Ph.D. degree in Staijstjcs Major professor Date August 14, 1986 MS U is an Affirmative Action/Equal Opportunity Institution 0-12771 MSU LIBRARIES ” RETURNING MATERIALS: Place in book drop to remove this checkout from your record. FINES will be charged if book is returned after the date stamped below. NUMERICAL APPROXIMATION OF THE LAWS OF SOME DIFFUSION PROCESSES by Ponniah Elancheran A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Statistics and Probability 1986 ABSTRACT NUMERICAL APPROXIMATIONS OF THE LAWS OF SOME DIFFUSION PROCESSES by Ponniah Blancheran In this dissertation. the Gauss—Galerkin approximation of the laws of some diffusion processes is considered. The Gauss-Galerkin approximation is obtained through a basic differential equation describing the evolution of the expected values of certain functionals of the process. This differential equation is derived using the martingale property and also through the semigroup approach. Dawson [4] and HajJafar [7] derived this basic differential equation through the Fokker-Planck equation. They then obtained the Gauss-Galerkin approximation with polynomial basis functions. The approach considered here covers diffusion processes for which the Fokker-Planck equation may not be satisfied or situations where the polynomial basis functions are inadequate and the use of more general basis functions becomes appropriate. The convergence of the Gauss-Galerkin approximations using such general basis functions is proved in this dissertation. Two numerical examples. one of which concerns a degenerate initial distribution, are also presented. To my brother P. Elango ii ACKNOWLEDGEMENTS I wish to express my sincere thanks to Professors Habib Salehi and David Yen for their guidance in the preparation of this dissertation. Their advice, encouragement and the amount of time spent on discussions are greatly appreciated. I also wish to thank Professor R.Y. Erickson for his careful reading of the dissertation and useful discussions and Professor J.C. Gardiner for serving on my committee. Finally, I wish to thank Cathy Sparks for her patience, efficiency and great care in typing this thesis. iii TABLE OF CONTENTS Chapter INTRODUCTION 1. Statement of the Problem and Motivation of the Gauss—Galerkin Approximation 2. Organization of the Dissertation MATHEMATICAL PRELIMINARIES 1. Some Results from Probability Theory 2. Some Results from Numerical Analysis DERIVATION OF THE GAUSS-GALERKIN APPROXIMATION AND CONVERGENCE RESULTS. . . . . . . . 1. Derivation of the Gauss-Galerkin Approximation 2. Convergence Theorem. 3. Examples DEGENERATE INITIAL DISTRIBUTION. 1. Method 1 2. Method 2 NUMERICAL EXAMPLES 1. Example 1. 2. Example 2. REFERENCES iv 14 2O 26 26 30 32 32 35 37 CHAPTER 0 INTRODUCTION 1. Statement of the Problem and Motivation of the Gau§§—Galerkin Approximation. Let {X(t)= t 2 0} be a Feller process (see definition 1.1 of Chapter 1) with state space S = R or an interval of R. Let P(t,F) = P(X(t) e T), where F is a Borel subset of S, be the law of X(t), and assume that the initial law is known. In this dissertation we are concerned with numerical methods of approximating the law P(t,°). In particular, we wish to construct a sequence of discrete measures Pn(t,°) which converges weakly to P(t,°). In this way the expected values of certain functions or the probabilities of certain sets of interest can then be approximated. The approximation that we consider here, known as the Gauss—Galerkin approximation and proposed originally by Dawson [4], combines elements of the Gaussian quadrature and the Galerkin approximation. Dawson [4] and HajJafar [7] considered processes which are solutions of stochastic differential equations. They derived the Gauss-Galerkin approximation by starting with the Fokker—Planck equation governing the density p(t,x) and arrived at §; I f(x) p(t.x) dx = I (Lf)(x) p(t.x) dx or, written differently, (1.1) g? E f(X(t)) = E Lf(X(t)) 02(x) Q__ + b(x) —; ; a,b are the coefficients of dx the stochastic differential equation. Here E denotes the _ 1 L — 2 expectation operator. Dawson and HajJafar assumed that (1.1) holds for all mononials and, incorporating the ideas of the Gaussian quadrature, rewrote (1.1) as (1.2) 3; 121 §§“)(t) fm(§§“)(t))=igl §§“’(t)(Lfm)(§§n’(t)) + error(Lfm) for 1 g m S 2n. Here fm(x) = x“‘"1 m = 1.2,.... The system (1.2), however ~ cannot be used to find agn)(t), 2:“)(t) because of the error term error(Lfm). The Gauss-Galerkin approximation is obtained by solving the system of dynamic equations with the error term dropped. For more details see HajJafar [7], pp. 38-45. We consider the Gauss-Galerkin approximation for Feller processes. The foregoing derivation of the Gauss—Galerkin scheme starting with the Fokker-Planck equation and with monomial basis functions has two shortcomings. Firstly, (1.1), is in many cases more basic than the Fokker-Planck equation. Secondly, for a number of Markov processes (1.1) may not hold for all monomials (see Example 3.2 of Chapter 2). The point of departure in this dissertation is the equation (1.1). We shall derive this equation for functions in a certain class, using the martingale property. The class of functions for which (1.1) is valid is determined by deriving the appropriate boundary conditions. In this way the Gauss—Galerkin approximations are extended for more general types of basis functions than the monomials. 2. giganigation of the Di§§ertation. This dissertation is organized as follows. Chapter 1 contains the background materials from probability theory and numerical analysis. In Chapter 2 we develop the Gauss—Galerkin approximation. Then we prove the convergence of our approximation, when the state space is a finite interval. Some examples are then presented. In many applications it may happen that the initial distribution is degenerate. In this case, the Gauss-Galerkin system becomes singular initially. In Chapter 3 we consider approximating the laws when the initial distribution is degenerate. Several numerical examples are presented in Chapter 4. CHAPTER 1 MATHEMATICAL PRELIMINARIES In this chapter we introduce some notations and definitions and state some known results which are used in the sequel. 1. Some Results from Probability Theory. Our basic reference for this section is the book by Ethier and Kurtz [5], where further details can be found. Let {X(t)= t 2 0} be a time-homogeneous Markov process with the state space S = R or an interval of R. Let P(t,x,F), xeS, Fe$(S), where 3(8) denotes the Borel subsets of S, be the transition probability function. We denote the associated semigroup of operators by {T(t)= t 2 0}, the infinitesimal generator by A and the domain of A by @(A). Let C(S) denote the Banach space of all bounded continuous functions vanishing at infinity, with the norm Hf" = SpreS|f(x)l‘ Note that if S is compact, C(S) = C(S), the space of bounded continuous functions on S. Definition 1.1. A Markov process {X(t)= t 2 O} is said to be a Feller process if the associated semigroup {T(t)} is strongly continuous on C(S) and satisfies T(t) C(S) g 8(3). Therefore, without loss of generality we can restrict the Feller semigroup to C(S). We note that @(A) of this semigroup is contained in C(S) and is dense there. We now state a theorem which gives the stability property of Feller processes in the sense of weak convergence of measures with respect to the initial data. Theoremfil.2. Let {X(t)= t 2 0} be a Feller process. For n = 1,2,... let {Xn(t)= t 2 0} be a sequence of Markov processes with the same transition function as that of X. (Thus Xn are also Feller processes.) Then Xn(O) é X(O) implies Xn 3 X. {2 means weak convergence}. Proof. This is a special case of Theorem 2.5. Chapter 4 of Ethier and Kurtz [5] with Tn(t) = T(t) for all n. D Martingale Property 1.3. We conclude this section by mentioning the martingale property related to Markov processes. The martingale problem approach is a powerful tool for studying Markov processes. We simply state the martingale property. Let {X(t)= t 2 0} be a progressively measurable Markov process with the infinitesimal generator A and domain $(A). Then f(X(t)) - 13 (Af)(X(s)) as is an {FE} martingale for every f e @(A). Here F): = a(X(s): s g t). For more details see Ethier and Kurtz [5] p.162. 2. Some Results from,Numerical Analysis. Let I = [a,b] n m1, — m g a < b g m. Calculating the definite integral of a given function f(x) with respect to a given measure u(dx) on I f f(X) MdX) is a classical problem in mathematics. For some simple cases these integrals can be computed exactly. Otherwise we have to resort to numerical methods to approximate these integrals. Quadrature formulas are integration formulas of the type D (n) (n) f f(x) u(dx) = 2 a f(x ) + error(f) k k I k=1 The xk are called the points (or nodes) in the formula; the ak are called the coefficients (or weights) in the formula. We now describe the so-called "Gaussian quadrature" or the "Gauss—Christoffel" integration formulas. Assume, (2.1) u is a positive measure on I. (2.2) All moments “k = [ xk u(dx), k = 0.1.2,... exist I and are finite. (2.3) For polynomials p(x) which are nonnegative on I { p(x) p(dx) = 0 implies p(x) E 0, Condition (2 3) is satisfied if the measure u is not supported on a finite number of points. Assumptions (2.1) to (2.3) are generalizations of assumptions made by Stoer and Bulirsch [10] p. 142. Theorem 2.1. a) The n points {x£n)) and the n weights {a£n)} can be uniquely chosen so that “~() ~() (2.4) f f(x) p(dx) = 2 ak“ f(xkn ) k=1 holds for all polynomials of degree less than or equal to 2n-1. b) Let {pm(')} be the family of orthogonal polynomials associated with the measure u, that is, pm(°) is for each m a polynomial of degree m and fpm(X) pn(X) u(dX) = 0 m ¢ n- Then the nodes (;£H)1 k = 1.....n} are the 52125 of the polynomial pn(~). c) The weights {;£n): k = 1,...,n} are uniquely obtained as the solutions of the equations “ ~(n) ~(n) f f(x) p(dx) = 2 ak f(xk ) for all polynomials k=1 of degree less than or equal to n—l. For a full discussion see Stroud [11] or Stoer and Bulirsch [10]. At the nth approximation, the integration formulas x2n_1} and there are Zn (2.4) are exact for {1.x,..., unknown variables (n nodes and n weights). Therefore these nodes and weights can also be obtained as solutions of the system of nonlinear equations “ ~(n) ~(n) m m 2 a (x ) = f x p(dx) 0g m g 2n-1 k k k=1 Since at the nth stage these formulas are exact for all monomials of degree g 2n-1 the collection {1.x,x2,...} are called the basis functions of the Gaussian quadrature formulas. We extend these ideas to more general types of basis functions. Let E = g ;(n) 5 N n k=1 k {x£n)} Then “n is called the Gauss-Christoffel approximation of g with the basis functions {fml m = 1,2,...,} if n ~ ~ (2.5) 2 afin) fm(x£n)) = [ fm(x) p(dx) 1 < m < 2n k=1 There are 2n unknowns and 2n equations. The hope is that these equations can be solved for the unknowns. But the existence and uniqueness of such nodes and weights do not seem to be well studied in general. CHAPTER 2 DERIVATION OF THE GAUSS—GALERKIN APPROXIMATION AND CONVERGENCE RESULTS 1. Derivation of the Gauss-Galerkin Approximation. As we already indicated in the introductory chapter, if we start with the Fokker—Planck equation then the Gauss-Galerkin approximation can be derived only in restricted cases. In this section, we derive the Gauss-Galerkin approximation via the martingale approach for more general processes. Let {X(t)= t 2 0} be a Feller process. Let A be its infinitesimal generator with domain @(A). Then, as was pointed out in Section 1.3 of Chapter 1 f(X(c)) — I5 (Ar)(X(u)) du is a martingale for every f in @(A). Let s < t. Then, E[f(X(t))-f5(Af)(X01))dUI=E[f(X(S))-I(S)(Af)(X(U))dUJ for every f in @(A). where E denotes the expectation operator. It follows that (1 1) Ef(X(t)) — Ef(X(s)) = E}: (Af)(X(u))du. Af is bounded and therefore by Fubini’s theorem. E I: (Af)(X(u))du = ]: E(Af)(X(u))du. Hence (1.2) E f(X(t)) — Ef(X(s)) = I: E(Af)(X(u))du. The strong continuity of the semigroup (T(t)! t 2 0} gives us the continuity of E (Af)(X(t)). To establish this we will show that lim E (Af)(X(t)) = E (Af)(X(s)). tas E (Af)(X(t)) = I (Af)(y) P(t.dy) = I (Af)(y)(f P(t.x.dY) P(0 dX)) (By Fubini’s theorem) = I (I (Af)(y) P(t,x,dy)) P(O.dx) = ] T(t)(Af)(x) P(O,dx). Further, lim I T(t)(Af)(x) P(O,dx) = I lim T(t)(Af)(x) P(O,dx) t4s tes (by strong continuity of T(t)): I T(s)(Af)(x) P(0.dx) = E (Af)(X(s)). Therefore by (1.2) we obtain, 9_ (1.3) dt E f(X(t)) = E(Af)(x(t)) for every f in @(A). Equation (1.3) can also be derived through the semigroup approach. If fe@(A), then T(t)fe@(A) and %? T(t)f = T(t)Af d—If(y)1’(t.x,dy)=I(Af)(3')1’(t.x,dy)- Q. t 11 Therefore, I %; [I f(y)P(t.x.dy)JP(o.dx)=II(Af)(y)P(t.x.dy)P(o.dx). Since I f(y) P(t,x,dy) is a bounded function L.H.S. = g? I [I f(y) P(t,x,dy)] P(O,dx) = %; I f(y) [I P(t.x.dy) P(o.dx)] = §;-I f(y) P(t.dy) = g? E f(X(t)). By Fubini’s theorem, R.H.S. = I (Af)(y) [I P(t,x,dy) P(O,dx)] = E I (Af)(y) P(t.dy) = E (Af)(X(t)). Hence. g? E f(X(t)) = E(Af)(X(t)) v f e %(A). The appropriate choice of basis functions is suggested by the boundary conditions on $(A). In the infinite interval case we may have to use a limit argument to show that (1.3) holds for our choice of basis functions. The Gauss-Galerkin approximation can now be derived. Let {fmi m=1,2 ..... } be our choice of basis functions for ~ “ ~(n) which (1.3) holds. Let Pn(t.°) = 2 ak (t) 6 ”(n) be k=1 {xk (t)} the n point Gauss-Christoffel approximation with the basis functions {fmi m=1,2,....}. From (1.3) we get 12 (1.4) g? z ;£D)(t)fm(;£n)(t)) =k§1 ££H)(t)Afm(§£n)(t)) + error (Afm) The system (1.4) cannot be used to find aén)(t), xén)(t) because of the error term error(Lfm). The Gauss—Galerkin scheme is obtained by dropping the error term in equation (1.4). This way we obtain a sequence of discrete measures P (t,°) = g a£“)(t) 5 “ k=1 {x£n)(t)} where aén)(t), xén)(t) are solutions of the system of dynamic equations n n (1.5) %? ksl aén)(t) fm(x£n)(t)) =k21 a£“)(t) Afm(x£n)(t)) 1 g m g 2n. solved together with the initial values (n) _~(n) (n) _~(n). “ ~(n) ak (O) _ ak , xk (0) _ xk , E ak 5 ~(n) k=1 (Xk } being the Gauss-Christoffel approximation of the known initial measure P(O,°). n The measure P (t,°) = E a(n)(t) 6 is called “ k:1 1‘ (xfj‘hm the Gauss-Galerkin approximation of P(t,°) with the basis functions {fmi m = 1,2,...}. The system (1.5) is called the Gauss-Galerkin system. 13 If aén)(t), xén)(t) are differentiable then the system of dynamic equations (1.5) can be written as F C(11) Y(n).(t) = D(n) Y(n)(t) (1.6) _ Y(n)(0) = (an)...,§£n), QED)....,;£“))T where Y(n)(t) = (agn)(t),...,a£n)(t), xgn)(t)....,x£n)(t))T (n) (n) (n) c c D o C(n) = 11 12 ; D(n) = 1 as?) ea) DI“) 0 CI?) : (fi(x§n)))nxn ’ Gig) — (agn) fi(xgn)))nxn C2?) = (fn+i(x§n)))nxn ’ 02;) = (agn) fn+i(xgn)))nxn Din) = (Afi(xgn)))nxn ’ Dén) = (Afn+i(xgn)))nxn This is a nonlinear system of ordinary differential equations for which the question of existence and uniqueness of solutions is not settled. The system of differential equations (1.6) can be solved numerically. Numerical solutions of some examples indicate that the system (1.6) should admit solutions in reasonable cases. 14 2. Convergence Theorem. In this section we shall establish the weak convergence of the Gauss-Galerkin approximation P (tn) = 1% a(n)(t) a “ k=1 k {x§“’(t)} to the actual law P(t,°) under appropriate assumptions. Dawson [4] and HajJafar [7] proved weak convergence theorems when the state space is [0.1] or [0,”) and the basis functions are monomials. We shall prove similar convergence result when the state space is [0,1] but with more general type basis functions. The assumptions made here are similar to those assumed by HajJafar [7] in the case of monomial basis functions. The techniques of the proof are closely related to that of HajJafar. Theorem 2.1. Let (X(t)= t 2 0) be a Feller process with infinitesimal generator A. Denote the domain of A by @(A). Let the state space S be [0.1] and T > 0 be given. Let (fnin = 1,2,....} be a class of basis functions contained in @(A) E C[O,1], B = sp{f1,...f2 n }. Suppose n that the following assumptions are satisfied. (i) f1 E 1. (ii) The operator A admits a countable number of eigenfunctions (en: n = 1,2,....} and sp{en: n = 1.2,....} is dense in @(A) with respect to the supremum norm. 15 Moreover, for a given e > O and an eigenfunction en, there exists gm 6 Bm for some m, such that def. nlen — ngH = n en — gmH+HAen - AgmN < e. It can be deduced that sp{f1,f2,....} is dense in 2(A). (iii) The system of equations (1.5) has a solution such that V t e [0,T] the weights afin)(t) are nonnegative and the nodes xén)(t) belong to [0.1]. n Then Pn(t,’) = 2 aén)(t) 6 converges weakly k 1 {xff’un to P(t,°) Vte [0.T], where P(t,°) is the law of X(t). n Remark 2.2. Since f E 1, E a(n)(t) = 1 for all t 1 k k=1 and all n. Therefore with assumption (iii) Pn(t,°) are all probability measures. Remark 2.3. Assumption (ii) is certainly satisfied if the union of the graphs CD = {(f;Af)= feBn} are dense in the graph G = {(f;Af): fe%(A)} with respect to the norm HI(f:g)IH = urn + Hg”. Remark 2.4. The system of dynamic equations (1.5) is a nonlinear system of differential equations. Therefore, it is 16 hard to verify assumption (iii) in general. However, our examples illustrate that assumption (iii) holds for some special cases. Proof of the Theorem. Let Efi(t) = I fe(x) Pn(t,dx), te[0,T] Then by (1.5) %? Efi(t) = I Afe(x) Pn(t,dx) for e = 1,...,2n. Let 9 be a fixed positive integer. Since Afe e C [0.1], 3 K such that e lAfeI s K, Therefore, d e e SUpte[0,T]ld? En(t)l g K8 V n > 2 ' . e , e . Hence the class of functions {En(°)- n > 5 , t e [0,T]} IS equicontinuous. Also, this class is uniformly bounded. Therefore, there exists a subsequence that converges uniformly to E:(°) (say). By diagonalization argument we can show that there exists a subsequence kn such that for any positive integer e, E: (°) 6 E:(°) uniformly in [0,T]. n We will now show that {E£(t)= e = 1,2,....} determines a probability measure. Since {Pk (t,°)2 n = 1.2,....} is a n sequence of probability measure on the compact interval [0,1], this collection is relatively compact. (See 17 Billingsley [1]; Theorem 6.1, Chapter 1). Therefore every subsequence {Pk (t,°)} of {Pk (t,°)} has a further n' n subsequence {Pk (t.°)} which converges weakly to a n" H probability measure P*(t,°). Furthermore, (2.1) Efi(t) = lim E: (t) = lim I fe(x) Pk (t,dx) 1(1‘ln'_)m n" 1(Iln_)(n n" = I 13(x) P;(t,dx). sp{fe: e = 1.2,....} is dense in @(A) and because X is a Feller Process @(A) is dense in C [0.1]. Hence sp{fe: B = 1,2,....} is a measure determining class. By (2.1) we can thus conclude that every subsequence of {Pk (t,°)} has a further subsequence which converges to the n same limit P*(t,°) (say). Hence {Pk (t,°)} itself n converges weakly to the probability measure P*(t,°). Now a fl m Was A fl V II I (Afe)(x) Pkn(t,dx) which means 2 e t (2 2) Ekn(t) - Ekn(0) I0 I (Afe)(x) Pkn(s,dx) ds. Since IAfeI g Ke. 18 II Afe(x) Pkn(t,dx)l g Kg for all n. Taking limits in (2.2) ‘e e e e E t — E o = 1' [E (t - E 0)] *( ) *( ) knig kn ) kn( = l‘ t Af P s,dx ds knig [I0 I ( e)(x) kn( )1 By the dominated convergence theorem = t 1' Af )P (s,dx ds IO [knlg I ,(x kn ) ] = I5 I Afe(x) P*(s,dx)ds. Therefore, (2 3) Ee(t) - Ee(0) — It I Af (x) P (s dx)ds f 11 e . * * - 0 e * , 0 r a ‘ . Let Ee(t) = I fe(x) P(t,dx). Then by (1 1) Ee(°) also satisfy the same integral equation (2.3). Furthermore, initially E:(0) = Ee(0). We want to show that Ee(t) = E:(t) for all e and all t. Let e be an n eigenfunction. Then, by assumption (ii), for a given 6 > 0, there exists g e B for some m such that Hle — g I” = m m n m Ilen - gm" + IlAen - AgmH < 6. Then, II en(x) P*(t.dx) - I en(x)P*(0.dx) - I5 I (Aen)(x) P,(s.dx)dsl s I I (en(x) — gm(X)) P*(t.dx) - I (en(x) — gm(x)) P*(0.dx) — I3 I (Aen(x) - Agm(x)) P*(s.dx)ds I + I I gm(x) P*(t.dx) — I gm(x) P,(o.dx) — I5 I Agm(x) P*(s.dx)dsl < e + e + eT + 0. 19 the last expression being zero by (2.3) together with its linearity property. Hence, I en(x) P,(t.dx) - Ien(x) P*(o.dx) = I5 I (Aen)(x) P*(s,dx)ds i.e. %? I en(x) P*(t,dx) I Aen(x) P*(t,dx) KnI en(x) P*(t,dx). Therefore, h t I en(x) P*(t,dx) = e n I en(x) P*(O,dx). Likewise P(t,°) also satisfies, A t I en(x) P(t,dx) = e n I en(x) P(O,dx). Moreover I en(x) P*(O,dx) = I en(x) P(O,dx). From these we conclude that I en(x) P*(t,dx) = I en(x) P(t,dx) for all n. This together with the assumption about the denseness of the eigenfunctions imply, I f dP(t) = I f dP*(t) for all feC[O,1]. Hence Pk (t,°) a P(t,°). n The proof given above demonstrates that if we start with a subsequence {Pk (t,°)} we can show that there is a n further subsequence which converges weakly to P(t,°). Hence Pn(t,°) itself converges weakly to P(t,°) D 20 1 d2 d Remark 2.5. Let A E — a(x) ———-+ b(x) ——. If the 2 dX2 dx basis functions are polynomials and the drift coefficient b(x) and the diffusion coefficient a(x) are bounded then the graphs UGn = U{(f;Af): feBn} are dense in G = {(f;Af)= fe@(A)}. To prove this let fe@(A) E C2 [0,1]. Then, by the Weierstrass approximation theorem, for any 6 > 0, there exists a polynomial Pm(x) such that "f(x) - Pm(x)H + Hf'(x) — P$(x)fl + Nf"(x) - P;(x)fl < e Therefore, if the drift and diffusion coefficients are bounded, then for any 6 > 0 there exists a polynomial Pm(x) such that Hf - P N + HAf - AP N < e. m m 3. Examples. We now give some examples, where our techniques work, but the Gauss-Galerkin approximation cannot be derived through the Fokker-Planck equation or the monomial basis functions are inappropriate. Example 3.1. Consider the diffusion process charactorized by 2 AEX(1—X)9-—2' dx a(A) = c2[0.1] This example arises as a diffusion approximation for a model in population genetics (see Kurtz [8] p.29). In this example 21 also (1.3) cannot be obtained via the Fokker—Planck equation approach. It is known that for any t > 0 the fixation probabilities are positive (see Ewens [6] p.41), i.e. there is an accumulation of mass at 0 and 1. Therefore the law P(t,°) has a singular part. Although the density of the absolutely continuous part satisfies the Fokker-Planck equation the weak form of the latter does not lead to (1.3) for all monomials, for example for f(x) E 1. Eq. (1.3) can nevertheless be derived directly through the martingale approach for all monomials. For each n, there is a polynomial of degree n which is an eigenfunction of A (see Example 1.2 of Chapter 3 for more details). Therefore the assumptions (i) and (ii) of the Theorem 2.1 are satisfied for the monomial basis functions. It can be shown that the Gauss-Christoffel nodes and weights satisfy equation (1.6). Hence the assumption (iii) is satisfied. To show that the Gauss-Christoffel nodes and weights satisfy equation (1.6), we note that the measure P(t,°) has a nonzero absolutely continuous part (see Ewens [6] pp. 40-41). Therefore assumptions (2.1) - (2.3) of Chapter 1 are satisfied. Hence, the Gauss-Christoffel nodes and weights exist. The nodes belong to [0,1] and are distinct. The weights are positive. Secondly, these nodes are differentiable functions of t. This is so because they are zeros of the orthogonal polynomials with respect to P(t,°) whose coefficients are differentiable in t, and an 22 application of the inverse mapping theorem gives the zeros are differentiable. The differentiability of the coefficients of the orthogonal polynomials follows from the standard Gram-Schmidt orthogonalization procedure and the fact that polynomials belong to the domain of A which gives def. that (p1,p2)t = I p1(x) p2(x) P(t,dx) is differentiable in t for polynomials p1, p2. It then easily follows that the weights are also differentiable in t. Thirdly, these nodes and weights satisfy the system of equations (1.6). We note that if fm is a polynomial of degree m, then Afm is also a polynomial of degree less or equal to m. This together with equation (1.3) applied to fm = xm—1 and equation (2.4) of Chapter 1 applied to both fm and Afm imply that the Gauss-Christoffel nodes and weights satisfy equation (1.6). By the uniqueness of the solution of equation (1.6) for this case (HajJafar [7], p. 46) we conclude that the Gauss-Galerkin nodes and weights exists and are unique. Hence the assumption (iii) of Theorem 2.1 is satisfied. Example 3.2. Consider the reflecting Brownian motion X in [0,1] with the initial distribution uniform on [0,%]. Then <12; 2 dx a(A) = (f: feC2[0,1], £'(o+) = f’(1 AE Mp4 ) 0} 23 We Show that the function f(x) = x does not satisfy the basic equation (1.3) namely gI-E f(X(t)) = E Af(X(t)) For if f(x) = x satisfy (1.3) then, %? E(X(t)) = 0, v t 2 o and thus E(X(t)) = i v t 2 0. But as t a m P(t,°) converges weakly to the uniform distribution in [0.1] (See Lamperti [9], p. 176), which implies E(X(t)) % % as t a w. This is a contradiction. Hence the monomial basis functions are inadequate. The boundary conditions on @(A) suggest that an appropriate choice of basis functions is {1, cos wx, cos 2wx,....} which are eigen functions of the operator A. By the Stone-Weierstrass theorem sp(1,cos wx, cos 2wx,....} is dense in C[O,1]. Therefore, the assumptions (i) and (ii) of Theorem 2.1 are satisfied. Using the map x 4 cos wx, 0 g x g 1, we can reduce the discussion of this basis to a set of polynonomial basis with respect to a new measure on [-1,1]. Similar arguments to those in Example 3.1 can then be used to conclude that the Gauss-Christoffel nodes and weights with respect to this new measure have the desired properties. Transporting these to original space [0.1] will imply the corresponding nodes and weights satisfy equation (1.6). Hence assumption (iii) of theorem 2.1 is satisfied. 24 Example 3.3. Consider the absorbing Brownian motion in [0,1]. Then, A 5 l 9—— 2 2 a(A) = (f: f e C2[0,1], £"(o+) = f"(1_) = 0}. Here the probability measure P(t,°) has a singular part and therefore (1.3) cannot be derived via the Fokker-Planck equation approach. An appropriate choice of basis functions suggested by the boundary conditions on @(A) is {1,x,sin wx, sin 2wx,....} These are eigenfunctions of the operator A. Also, sp{1,x,sin wx, sin 2rx,....} is dense in C[O,1]. Hence assumptions (i) and (ii) of Theorem 2.1 are satisfied. Whether the assumption (iii) of Theorem 2.1 is satisfied is not resolved. Remark 3.4. Let 2 _ 1 d d A — 5 a(x) -—§ + b(x) 3; dx on the interval [0.1]. Assume a, b are bounded and continuous. Then at least one restriction of A will generate a Feller process (see Ethier and Kurtz [5], chapter 8). Assume further that, a e C2[0,1], b e C1[0,1], a(x) > 0 on [0,1]. Let the boundary conditions be of the form f'(0) = f'(1) = 0. If the operator A on @(A) is a self-adjoint operator then there exists a countable number of eigenfunctions 25 (en: n = 1,2,....} and sp{en= n = 1,2,....} is dense in C2[0,1] in the supremum norm. (See Coddingtion and Levinson [2] p.197.) The operator A of the reflecting Brownian motion satisfies these conditions. CHAPTER 3 DEGENERATE INITIAL DISTRIBUTION In many applications it may happen that the initial distribution is degenerate. For a degenerate distribution all the Gauss-Christoffel nodes coincide. Therefore, the system of dynamic equations (1.6) of Chapter 2 becomes singular initially and cannot be solved. We describe here two ways of handling this case. We point out that the techniques discussed below are applicable to any initial distribution as well as the degenerate initial distribution. 1. Method 1. Sometimes it may be possible to choose appropriate basis functions {fmt m = 1,2,....} so that the equation (1.3) of Chapter 2 can be solved before discretizing it. One such choice is the eigenfunctions of the infinitesimal generator A. Let {fmt m = 1,2,....} be the set of eigenfunctions with the corresponding eigenvalues Am. Then Amt (1.1) E fm(X(t)) = E fm(X(0)) e Discretizing this system we obtain 7\ t (1 2) afin)(t) fm(x£n)(t)) = E fm(X(O)) e m , 1gmg2n. r IIM'J 1 26 27 This is a system of nonlinear equations for the nodes and weights and may be solved numerically using subroutines available in IMSL (for example, ZSCNT or ZSPOW). Theorem 1.1. Let afin)(t) > 0, xén)(t)eS, 1 g k g n, be a solution of (1.2). Let Pn(t,°) = % aén)(t) a If sp{f1,f2,....} is k 1 {x£n)(t)}' dense in C(S). Then Pn(t,°) é P(t,°). Proof. By (1.2) Amt E fm(X(0)) e , lgmg2n = I rm dP(t,-). Let e > 0 and f e C(S). Then there exists N g = E c f 6 8:1 2 8 I rm dPn(t,°) such that Hf — g6" < 6. Then II t dpn(.,.) — I f dP(t.°)| s I If — geIdPn(t.°) +II g6 dPn(t.-) -I g6 dP(t.°)I + I If - gel dP(t.°) < e + 0 + e V n 2 g. Hence I f dPn(t,°) a I f dP(t,°) for every f e C(S) i.e. Pn(t,-) :>P(t,o) u 28 We now give some examples. Example 1.2. Let the infinitesimal generator 1 d2 d A E — a(x) ——— + b(x) ——, with b(x) a polynomial of degree 2 dx2 dx g 1, a(x) a polynomial of degree g 2 and either b(x) has degree 1 or a(x) has degree 2, then for each m 2 1 we can find a polynomial fm of degree (m-1) such that Afm Amfm. Therefore if the equation (1.3) of Chapter 2 is satisfied by all polynomials then we could choose {fmi m = 1,2,...) as our class of basis functions and approximate the law even with degenerate initial distribution. As an illustration of this let a(x) = x(1—x), b(x) = 0 so that A is defined on [0.1] by 2 x(l - x) 9—- dx2 with a(A) = c2[o,1]. (Example (3.1) of Chapter 2). To find the eigenfunction fm of degree m - 1, let m-l m-2 . fm(x) = x + am_2x + ... + a0. We want to find Am, am_2,...,a0 such that, Af = A f m m m i.e.(x—x2)[(m-1)(m-2)xm_3+am_2(m—2)(m-3)xm—4+...+2a2] m-l m-2 +a x + . +a = Am(x m-2 " o) 29 Equating the coefficients of each degree we get, for m > 2 Am = -(m—1)(m-2) Amam-2 = (m-1)(m-2) - (m-2)(m—3) am_2 Amam-B = (m-2)(m-3)am_2 — (m-3)(m-4)am__3 Amal = 2a2 a0 = 0 and this system can be solved for a0,a1,...,am_2. For m = 0,1; the eigenvalues are 0 and the corresponding eigenfunctions can be taken to be 1.x. The first few eigenvalues and the corresponding eigenfunctions are, 0, 0, -2 , —6 ,... 1 x x2—x x3 — g x2 + — x I 9 i 2 2 "" These are related to the Gegenbauer polynomials (see Ewens [6] p.140). We already know (see Example 3.1 of Chapter 2) that the Gauss-Christoffel nodes and weights exist. These nodes and weights satisfy equation (1.2). Example 1.3. a) Brownian motion in [0,1] with reflecting barriers, i.e. 2 1 d 2 . . . 5 $5 , @(A) = {f e C [0,1]' f (0+) = f (1 The basis functions can be chosen to be A ) = 0}. {1,cos wx, cos 2wx,...} 30 b) Brownian motion in [0,1] with absorbing boundaries. Ihen % ——§ @ A = f e C 0 1 ' f" 0 = f" 1 The basis functions can be chosen to be A 0). {1,x,sin wx, sin 2rx,...}. Both a) and b) have solutions satisfying (1.2). 2. Method 2. Let (X(t): t 2 0} be a Feller process with X(O) = x. Then by Theorem 1.2 of Chapter 1 we have the stability property with respect to the initial distribution, i.e. if (Xn(t): t 2 0} is a sequence of processes with the same transition probabilities as X(t) and such that Xn(0) E X(O) then Xn(t) E X(t). Therefore we can approximate the discrete initial distribution by a continuous distribution and approximate Xn(t) using Gauss-Galerkin techniques. The Gauss-Galerkin approximations of Xn(t) approximate X(t). We state these ideas in the following theorem. Theorem 2.1. Let (X(t): t 2 0} be a Feller process with X(O) = x. Let (Xn(t): t 2 0} be a sequence of Feller processes with the same transition probabilities as (X(t): t 2 0} (i.e. they have the same infinitesimal generator) and with initial distribution such that 31 xn(0) : X(O) = x. Let m P (t) = 2 a(m)(t) a n,m _ n,k (m) k_1 {Xn,k(t)} be the mth stage approximation of the law of Xn(t). If the convergence of the Gauss-Galerkin approximations holds, then for every bounded continuous function f 1“ lim lim I f dPn m(t) = lim lim 2 f(xgmfl(t)) a£m£(t) new mam ’ new mew k=1 ’ ’ E f(X(t)) I f dP(t). Ppppi. {Xn(t): t 2 0}, {X(t): t 2 0} all have the same transition probabilities and have Feller property. Therefore Xn(0) E X(O) implies Xn(t) E X(t) by Theorem 1.2 of Chapter 1. By the convergence result of Gauss-Galerkin approximation, for fixed n, m $1: kil f(x£T%(t)) = E f(Xn(t)) Vn, f e C(S) Xn(t) E X(t) gives lim E f(Xn(t)) = E f(X(t)) V f e C(S). new Therefore for all f in C(S) “1 lim lim k21 f(x£T£(t)) aéT;(t) = E f(X(t)). Note. When we approximate the degenerate initial distribution, the nodes of the approximating initial measure are close to each other and therefore the Gauss-Galerkin system becomes ill-conditioned initially. CHAPTER 4 NUMERICAL EXAMPLES In this chapter we present several numerical examples, that serve to illustrate the Gauss-Galerkin method developed in the preceeding chapters. We have considered examples where the Gauss-Galerkin approximations can be obtained using the techniques described in Section 1 of Chapter 2 and also using the techniques described in Section 1 of Chapter 3. The numerical results are then compared. Dawson [4], HajJafar [7] also have considered several numerical examples and analyzed them. The system of nonlinear equations are solved using the subroutine ZSCNT available in IMSL. The system of ordinary differential equations are solved using the subroutine DGEAR. 1. Example 1. Consider the reflecting Brownian motion in [0,1] with a given initial distribution. Then, 2 dx2 and a(A) = (i: ie02 [0.1], f'(0+) = f'(1—) = 0} We can choose the basis functions to be {1, cos rx, cos 2nx,....} The approximations were computed with 5 nodes and 5 weights at t values 0.0, 0.1,....,0.6. The numerical 32 33 results are given in Table 1. For each t value the first row gives the approximate values obtained by the Method 1 of Chapter 3 for the initial distribution 6{l}. The second 2 row gives the approximations obtained by the Method 1 of Chapter 3 for the initial distribution uniform on [0.4. 0.6]. The third row gives the approximations obtained by the Method 2 of Chapter 3 which amounts to solving the Gauss-Galerkin system (1.5) of Chapter 2, with the initial distribution uniform on [0.4, 0.6]. Note that x = 0.5 for all t. Also a1 = a x = x and a = 1-2 (a1+a 4 3 2)' As we pointed out earlier in Example 3.2 of Chapter 2 the process converges weakly to the stationary distribution which is uniform on [0,1]. Our numerical results show that the 5 points Gauss-Galerkin approximations converge to the values listed for t = 0.6. These values are the Gauss-Christoffel nodes and weights for the uniform distribution on [0,1] with the basis functions {1, coswx, cos 2wx,...}. This behavior of the approximations are consistent with the fact that the process converges weakly to the uniform distribution on [0,1]. 00 GOO 000 000 000 GOO 000 .12058 .12058 .16389 .16622 .16609 .19500 .19532 .19513 .19931 .19935 .19925 .19990 .19991 .19985 .19999 .19999 .19996 .20000 .20000 .20000 000 000 000 CO GOO GOO 000 34 TABLE 1 .23890 .23890 .21433 .21338 .21341 .20192 .20180 .20187 .20027 .20025 .20029 .20004 .20003 .20006 .20000 .20000 .20001 .20000 .20000 .20000 CO GOO GOO 000 000 GOO 000 .40958 .40958 .10623 .10576 .10623 .10074 .10069 .10072 .10010 .10009 .10011 .10001 .10001 .10001 .10000 .10000 .10000 .10000 .10000 .10000 000 GOO 000 CO GOO 000 CO .44661 .44661 .30800 .30750 .30799 .30116 .30108 .30113 .30016 .30015 .30017 0.30002 .30002 .30002 .30000 .30000 .30000 .30000 .30000 .30000 35 2. Example 2. Consider the diffusion process whose infinitesimal generator 91 2 dx a(A) = 02 [0,1]. A E x(1-x) The basis functions can be chosen to be the polynomials. The numerical results are given in Table 2. For each t value, the first row gives the approximate values obtained by the Method 1 of Chapter 3 for the initial distribution uniform on [0.4, 0.6]. The second row gives the approximation obtained by solving the Gauss-Galerkin system (1.6) of Chapter 2. Our numerical results show that the smallest and the largest Gauss-Galerkin nodes converge to 0 and 1 respectively and the corresponding weights converge to 0.5. This behavior of the approximations are consistent with the 1 fact that the rocess conver es weakl to - 6 + 6 . P g y 2( {O} {1}) 36 Table 2 Initial distribution U [0.4, 0.6] t a1=a5 a _a4 1=1-x 2=1-x 0 0 0.11846 0.23931 0.40938 0.44615 0.11846 0.23931 0.40938 0.44615 0.1 0.05146 0.24751 0.06075 0.26014 0.05246 0.24862 0.06186 0.26056 0.2 0.11098 0.23237 0.02391 0.22816 0.11246 0.23231 0.02418 0.22891 0.3 0.17433 0.20084 0.01252 0.21877 0.17499 0.20069 0.01251 0.21870 0.4 0.23063 0.16801 0.00756 0.21491 0.23118 0.16787 0.00757 0.21487 0.5 0.27844 0.13878 0.00501 0.21330 0.27894 0.13865 0.00501 0.21374 0 6 0.31818 0.11406 0.00353 0.21254 0.31864 0.11392 0.00354 0.21262 0.7 0.35096 0.09356 0.00258 0.21215 0.35129 0.09352 0.00260 0.21212 0.8 0.37788 0.07667 0.00195 0.21192 0.37810 0.07674 0.00196 0.21181 10. 11. 12. REFERENCES Billingsley, Patrick (1968). Convergence of Probability Measures. John Wiley, New York. Coddington, Earl A. and Levinson, Norman (1955). Theory of Ordinary Differential Equations. McGraw Hill, New York. Davis, Philip J. and Rabinowitz: Philip (1984). Methods of Numerical Integration. Academic, Orlando. Dawson, Donald A. (1981). Statistics and Related Topics. M. Csorgo et.el. editors. North-Holland, Amsterdam. Ethier, Stewart N. and Kurtz, Thomas G. (1986). Markov Processes. Characterization and Convergence. John Wiley and Sons, New York. Ewens, Warren J. (1979). Mathematical Population Genetics. Springer-Verlag, Berlin. HajJafar, Ali (1984). On the Convergence of the Gauss-Galerkin Method for the Density of Some Markov Process. Ph.D. Thesis, Department of Mathematics, Michigan State University. Kurtz, Thomas G. (1981). Approximation of population processes. CBMS-NSF Regional Conf. Series in Appl. Math. 36, SIAM, Philadelphia. Lamperti, John (1977). Stochastic Processes. Springer-Verlag, New York. Stoer, J. and Bulirsch, R. (1980). Introduction to Numerical Analysis. Springer-Verlag, New York. Stroud, A.H. (1974). Numerical Quadrature and Solution of Ordinary Differential Equations. Springer-Verlag, New York. Wentzell, A.D. (1981). A Course in the Theory of Stochastic Processes. McGraw Hill, New York. 37