M

W

 

W

 

 

WU

 

u

 

I
1

\l‘

 

E
W! H
H

 

.400
IN?» LN
cr>""\J

 

 

THESIO

 

“ne‘esmzazggm 31
'I' "'“" - - . g. . ,M “1‘- ’-. "3"sz
This is to certify that the

dissertation entitled

NUMERICAL APPROXIMATION OF THE
LAWS OF SOME DIFFUSION PROCESSES

presented by
Ponniah Elancheran

has been accepted towards fulﬁllment
of the requirements for

Ph.D. degree in Staijstjcs

Major professor

Date August 14, 1986

 

MS U is an Afﬁrmative Action/Equal Opportunity Institution 0-12771

 

 

MSU

LIBRARIES
”

 

 

RETURNING MATERIALS:
Place in book drop to
remove this checkout from
your record. FINES will
be charged if book is
returned after the date
stamped below.

 

 

 

 

 

NUMERICAL APPROXIMATION OF THE

LAWS OF SOME DIFFUSION PROCESSES

by

Ponniah Elancheran

A DISSERTATION
Submitted to
Michigan State University

in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY
Department of Statistics and Probability

1986

ABSTRACT
NUMERICAL APPROXIMATIONS OF THE

LAWS OF SOME DIFFUSION PROCESSES

by

Ponniah Blancheran

In this dissertation. the Gauss—Galerkin approximation
of the laws of some diffusion processes is considered. The
Gauss-Galerkin approximation is obtained through a basic
differential equation describing the evolution of the
expected values of certain functionals of the process. This
differential equation is derived using the martingale
property and also through the semigroup approach.

Dawson [4] and HajJafar [7] derived this basic
differential equation through the Fokker-Planck equation.
They then obtained the Gauss-Galerkin approximation with
polynomial basis functions. The approach considered here
covers diffusion processes for which the Fokker-Planck
equation may not be satisfied or situations where the
polynomial basis functions are inadequate and the use of
more general basis functions becomes appropriate. The
convergence of the Gauss-Galerkin approximations using such
general basis functions is proved in this dissertation.

Two numerical examples. one of which concerns a

degenerate initial distribution, are also presented.

To my brother P. Elango

ii

ACKNOWLEDGEMENTS

I wish to express my sincere thanks to Professors Habib
Salehi and David Yen for their guidance in the preparation
of this dissertation. Their advice, encouragement and the
amount of time spent on discussions are greatly appreciated.

I also wish to thank Professor R.Y. Erickson for his
careful reading of the dissertation and useful discussions
and Professor J.C. Gardiner for serving on my committee.

Finally, I wish to thank Cathy Sparks for her patience,

efficiency and great care in typing this thesis.

iii

TABLE OF CONTENTS

Chapter

INTRODUCTION

1. Statement of the Problem and Motivation
of the Gauss—Galerkin Approximation

2. Organization of the Dissertation

MATHEMATICAL PRELIMINARIES

1. Some Results from Probability Theory
2. Some Results from Numerical Analysis

DERIVATION OF THE GAUSS-GALERKIN APPROXIMATION
AND CONVERGENCE RESULTS. . . . . . . .

1. Derivation of the Gauss-Galerkin Approximation
2. Convergence Theorem.

3. Examples

DEGENERATE INITIAL DISTRIBUTION.

1. Method 1
2. Method 2

NUMERICAL EXAMPLES

1. Example 1.
2. Example 2.

REFERENCES

iv

 

14
2O

26

26
30

32

32
35

37

CHAPTER 0

INTRODUCTION

1. Statement of the Problem and Motivation of the
Gau§§—Galerkin Approximation.

Let {X(t)= t 2 0} be a Feller process (see definition
1.1 of Chapter 1) with state space S = R or an interval of
R. Let P(t,F) = P(X(t) e T), where F is a Borel subset
of S, be the law of X(t), and assume that the initial law
is known. In this dissertation we are concerned with
numerical methods of approximating the law P(t,°). In
particular, we wish to construct a sequence of discrete
measures Pn(t,°) which converges weakly to P(t,°). In
this way the expected values of certain functions or the
probabilities of certain sets of interest can then be
approximated.

The approximation that we consider here, known as the
Gauss—Galerkin approximation and proposed originally by
Dawson [4], combines elements of the Gaussian quadrature and
the Galerkin approximation. Dawson [4] and HajJafar [7]
considered processes which are solutions of stochastic
differential equations. They derived the Gauss-Galerkin
approximation by starting with the Fokker—Planck equation
governing the density p(t,x) and arrived at

§; I f(x) p(t.x) dx = I (Lf)(x) p(t.x) dx

or, written differently,

(1.1) g? E f(X(t)) = E Lf(X(t))

02(x) Q__ + b(x) —; ; a,b are the coefficients of

dx

the stochastic differential equation. Here E denotes the

_ 1
L — 2
expectation operator. Dawson and HajJafar assumed that

(1.1) holds for all mononials and, incorporating the ideas

of the Gaussian quadrature, rewrote (1.1) as

(1.2) 3; 121 §§“)(t) fm(§§“)(t))=igl §§“’(t)(Lfm)(§§n’(t))
+ error(Lfm)
for 1 g m S 2n.

Here fm(x) = x“‘"1 m = 1.2,.... The system (1.2), however

~

cannot be used to find agn)(t), 2:“)(t) because of the
error term error(Lfm). The Gauss-Galerkin approximation is
obtained by solving the system of dynamic equations with the
error term dropped. For more details see HajJafar [7], pp.
38-45.

We consider the Gauss-Galerkin approximation for Feller
processes. The foregoing derivation of the Gauss—Galerkin
scheme starting with the Fokker-Planck equation and with
monomial basis functions has two shortcomings. Firstly,
(1.1), is in many cases more basic than the Fokker-Planck
equation. Secondly, for a number of Markov processes
(1.1) may not hold for all monomials (see Example 3.2 of

Chapter 2). The point of departure in this dissertation is

 

the equation (1.1). We shall derive this equation for
functions in a certain class, using the martingale property.
The class of functions for which (1.1) is valid is
determined by deriving the appropriate boundary conditions.
In this way the Gauss—Galerkin approximations are extended
for more general types of basis functions than the

monomials.

2. giganigation of the Di§§ertation.

This dissertation is organized as follows. Chapter 1
contains the background materials from probability theory
and numerical analysis. In Chapter 2 we develop the
Gauss—Galerkin approximation. Then we prove the convergence
of our approximation, when the state space is a finite
interval. Some examples are then presented.

In many applications it may happen that the initial
distribution is degenerate. In this case, the
Gauss-Galerkin system becomes singular initially. In
Chapter 3 we consider approximating the laws when the
initial distribution is degenerate. Several numerical

examples are presented in Chapter 4.

CHAPTER 1

MATHEMATICAL PRELIMINARIES

In this chapter we introduce some notations and
definitions and state some known results which are used in

the sequel.

1. Some Results from Probability Theory. Our basic
reference for this section is the book by Ethier and Kurtz
[5], where further details can be found. Let {X(t)= t 2 0}
be a time-homogeneous Markov process with the state space

S = R or an interval of R. Let P(t,x,F), xeS, Fe$(S),
where 3(8) denotes the Borel subsets of S, be the
transition probability function. We denote the associated
semigroup of operators by {T(t)= t 2 0}, the infinitesimal
generator by A and the domain of A by @(A). Let C(S)
denote the Banach space of all bounded continuous functions
vanishing at infinity, with the norm Hf" = SpreS|f(x)l‘
Note that if S is compact, C(S) = C(S), the space of

bounded continuous functions on S.

Definition 1.1. A Markov process {X(t)= t 2 O} is
said to be a Feller process if the associated semigroup
{T(t)} is strongly continuous on C(S) and satisfies

T(t) C(S) g 8(3).

Therefore, without loss of generality we can restrict
the Feller semigroup to C(S). We note that @(A) of this
semigroup is contained in C(S) and is dense there.

We now state a theorem which gives the stability
property of Feller processes in the sense of weak
convergence of measures with respect to the initial data.

Theoremﬁl.2. Let {X(t)= t 2 0} be a Feller process.
For n = 1,2,... let {Xn(t)= t 2 0} be a sequence of
Markov processes with the same transition function as that
of X. (Thus Xn are also Feller processes.) Then

Xn(O) é X(O) implies Xn 3 X. {2 means weak convergence}.

Proof. This is a special case of Theorem 2.5. Chapter

4 of Ethier and Kurtz [5] with Tn(t) = T(t) for all n. D

Martingale Property 1.3. We conclude this section by
mentioning the martingale property related to Markov
processes. The martingale problem approach is a powerful
tool for studying Markov processes. We simply state the
martingale property.

Let {X(t)= t 2 0} be a progressively measurable
Markov process with the infinitesimal generator A and domain
$(A). Then

f(X(t)) - 13 (Af)(X(s)) as
is an {FE} martingale for every f e @(A). Here
F): = a(X(s): s g t).

For more details see Ethier and Kurtz [5] p.162.

2. Some Results from,Numerical Analysis.

Let I = [a,b] n m1, — m g a < b g m. Calculating the
definite integral of a given function f(x) with respect to
a given measure u(dx) on I

f f(X) MdX)
is a classical problem in mathematics. For some simple
cases these integrals can be computed exactly. Otherwise we
have to resort to numerical methods to approximate these

integrals.

Quadrature formulas are integration formulas of the

 

type
D (n) (n)
f f(x) u(dx) = 2 a f(x ) + error(f)
k k

I k=1
The xk are called the points (or nodes) in the formula;
the ak are called the coefficients (or weights) in the
formula.

We now describe the so-called "Gaussian quadrature" or

the "Gauss—Christoffel" integration formulas. Assume,

(2.1) u is a positive measure on I.

(2.2) All moments “k = [ xk u(dx), k = 0.1.2,... exist
I

and are finite.

(2.3) For polynomials p(x) which are nonnegative on I

{ p(x) p(dx) = 0 implies p(x) E 0,

Condition (2 3) is satisfied if the measure u is not
supported on a finite number of points. Assumptions (2.1)
to (2.3) are generalizations of assumptions made by Stoer

and Bulirsch [10] p. 142.

Theorem 2.1. a) The n points {x£n)) and the n

weights {a£n)} can be uniquely chosen so that
“~() ~()

(2.4) f f(x) p(dx) = 2 ak“ f(xkn )
k=1

holds for all polynomials of degree less than or equal to

2n-1.

b) Let {pm(')} be the family of orthogonal
polynomials associated with the measure u, that is, pm(°)
is for each m a polynomial of degree m and

fpm(X) pn(X) u(dX) = 0 m ¢ n-
Then the nodes (;£H)1 k = 1.....n} are the 52125 of the
polynomial pn(~).

c) The weights {;£n): k = 1,...,n} are uniquely

obtained as the solutions of the equations
“ ~(n) ~(n)
f f(x) p(dx) = 2 ak f(xk ) for all polynomials
k=1

of degree less than or equal to n—l.

 

For a full discussion see Stroud [11] or Stoer and
Bulirsch [10].
At the nth approximation, the integration formulas

x2n_1} and there are Zn

(2.4) are exact for {1.x,...,
unknown variables (n nodes and n weights). Therefore these

nodes and weights can also be obtained as solutions of the

system of nonlinear equations

“ ~(n) ~(n) m m
2 a (x ) = f x p(dx) 0g m g 2n-1
k k
k=1
Since at the nth stage these formulas are exact for all
monomials of degree g 2n-1 the collection {1.x,x2,...}

are called the basis functions of the Gaussian quadrature

 

formulas.
We extend these ideas to more general types of basis

functions. Let

E = g ;(n) 5 N
n k=1 k {x£n)}

Then “n is called the Gauss-Christoffel approximation of

 

g with the basis functions {fml m = 1,2,...,} if

 

n ~ ~
(2.5) 2 aﬁn) fm(x£n)) = [ fm(x) p(dx) 1 < m < 2n
k=1

There are 2n unknowns and 2n equations. The hope is that
these equations can be solved for the unknowns. But the
existence and uniqueness of such nodes and weights do not

seem to be well studied in general.

CHAPTER 2
DERIVATION OF THE GAUSS—GALERKIN APPROXIMATION

AND CONVERGENCE RESULTS

1. Derivation of the Gauss-Galerkin Approximation.

As we already indicated in the introductory chapter, if
we start with the Fokker—Planck equation then the
Gauss-Galerkin approximation can be derived only in
restricted cases. In this section, we derive the
Gauss-Galerkin approximation via the martingale approach for
more general processes.

Let {X(t)= t 2 0} be a Feller process. Let A be its
infinitesimal generator with domain @(A). Then, as was
pointed out in Section 1.3 of Chapter 1

f(X(c)) — I5 (Ar)(X(u)) du
is a martingale for every f in @(A). Let s < t. Then,

E[f(X(t))-f5(Af)(X01))dUI=E[f(X(S))-I(S)(Af)(X(U))dUJ

for every f in @(A). where E denotes the expectation

operator. It follows that
(1 1) Ef(X(t)) — Ef(X(s)) = E}: (Af)(X(u))du.

Af is bounded and therefore by Fubini’s theorem.

E I: (Af)(X(u))du = ]: E(Af)(X(u))du.

Hence
(1.2) E f(X(t)) — Ef(X(s)) = I: E(Af)(X(u))du.

The strong continuity of the semigroup (T(t)! t 2 0} gives
us the continuity of E (Af)(X(t)). To establish this we

will show that lim E (Af)(X(t)) = E (Af)(X(s)).

tas
E (Af)(X(t)) = I (Af)(y) P(t.dy)
= I (Af)(y)(f P(t.x.dY) P(0 dX))
(By Fubini’s theorem) = I (I (Af)(y) P(t,x,dy)) P(O.dx)
= ] T(t)(Af)(x) P(O,dx).
Further,

lim I T(t)(Af)(x) P(O,dx) = I lim T(t)(Af)(x) P(O,dx)
t4s tes

(by strong continuity of T(t)): I T(s)(Af)(x) P(0.dx)
= E (Af)(X(s)).

Therefore by (1.2) we obtain,

9_

(1.3) dt

E f(X(t)) = E(Af)(x(t))

for every f in @(A).

Equation (1.3) can also be derived through the semigroup

approach. If fe@(A), then T(t)fe@(A) and
%? T(t)f = T(t)Af
d—If(y)1’(t.x,dy)=I(Af)(3')1’(t.x,dy)-

Q.

t

11

Therefore,

I %; [I f(y)P(t.x.dy)JP(o.dx)=II(Af)(y)P(t.x.dy)P(o.dx).

Since I f(y) P(t,x,dy) is a bounded function

L.H.S. = g? I [I f(y) P(t,x,dy)] P(O,dx)
= %; I f(y) [I P(t.x.dy) P(o.dx)]
= §;-I f(y) P(t.dy)
= g? E f(X(t)).

By Fubini’s theorem,

R.H.S. = I (Af)(y) [I P(t,x,dy) P(O,dx)]
= E I (Af)(y) P(t.dy)
= E (Af)(X(t)).
Hence.
g? E f(X(t)) = E(Af)(X(t)) v f e %(A).

The appropriate choice of basis functions is suggested
by the boundary conditions on $(A). In the infinite
interval case we may have to use a limit argument to show
that (1.3) holds for our choice of basis functions.

The Gauss-Galerkin approximation can now be derived.

Let {fmi m=1,2 ..... } be our choice of basis functions for
~ “ ~(n)
which (1.3) holds. Let Pn(t.°) = 2 ak (t) 6 ”(n) be
k=1 {xk (t)}
the n point Gauss-Christoffel approximation with the basis

functions {fmi m=1,2,....}. From (1.3) we get

12
(1.4) g? z ;£D)(t)fm(;£n)(t)) =k§1 ££H)(t)Afm(§£n)(t))

+ error (Afm)

The system (1.4) cannot be used to find aén)(t), xén)(t)
because of the error term error(Lfm). The Gauss—Galerkin
scheme is obtained by dropping the error term in equation

(1.4). This way we obtain a sequence of discrete measures

P (t,°) = g a£“)(t) 5
“ k=1 {x£n)(t)}

where aén)(t), xén)(t) are solutions of the system of

dynamic equations

n n
(1.5) %? ksl aén)(t) fm(x£n)(t)) =k21 a£“)(t) Afm(x£n)(t))

1 g m g 2n.

solved together with the initial values

(n) _~(n) (n) _~(n). “ ~(n)
ak (O) _ ak , xk (0) _ xk , E ak 5 ~(n)

k=1 (Xk }
being the Gauss-Christoffel approximation of the known

initial measure P(O,°).

n
The measure P (t,°) = E a(n)(t) 6 is called
“ k:1 1‘

(xfj‘hm

the Gauss-Galerkin approximation of P(t,°) with the basis

 

functions {fmi m = 1,2,...}. The system (1.5) is called the

Gauss-Galerkin system.

13

If aén)(t), xén)(t) are differentiable then the system

of dynamic equations (1.5) can be written as

F
C(11) Y(n).(t) = D(n) Y(n)(t)

 

 

 

 

 

(1.6)
_ Y(n)(0) = (an)...,§£n), QED)....,;£“))T
where Y(n)(t) = (agn)(t),...,a£n)(t), xgn)(t)....,x£n)(t))T
(n) (n) (n)
c c D o
C(n) = 11 12 ; D(n) = 1
as?) ea) DI“) 0
CI?) : (fi(x§n)))nxn ’ Gig) — (agn) fi(xgn)))nxn
C2?) = (fn+i(x§n)))nxn ’ 02;) = (agn) fn+i(xgn)))nxn
Din) = (Afi(xgn)))nxn ’ Dén) = (Afn+i(xgn)))nxn

This is a nonlinear system of ordinary differential equations
for which the question of existence and uniqueness of
solutions is not settled. The system of differential
equations (1.6) can be solved numerically. Numerical
solutions of some examples indicate that the system (1.6)

should admit solutions in reasonable cases.

14

2. Convergence Theorem.
In this section we shall establish the weak convergence

of the Gauss-Galerkin approximation

P (tn) = 1% a(n)(t) a
“ k=1 k {x§“’(t)}

to the actual law P(t,°) under appropriate assumptions.
Dawson [4] and HajJafar [7] proved weak convergence theorems
when the state space is [0.1] or [0,”) and the basis
functions are monomials. We shall prove similar convergence
result when the state space is [0,1] but with more general
type basis functions. The assumptions made here are similar
to those assumed by HajJafar [7] in the case of monomial
basis functions. The techniques of the proof are closely

related to that of HajJafar.

 

Theorem 2.1. Let (X(t)= t 2 0) be a Feller process
with infinitesimal generator A. Denote the domain of A by
@(A). Let the state space S be [0.1] and T > 0 be

given. Let (fnin = 1,2,....} be a class of basis functions

contained in @(A) E C[O,1], B = sp{f1,...f2

n }. Suppose

n

that the following assumptions are satisfied.
(i) f1 E 1.
(ii) The operator A admits a countable number of
eigenfunctions (en: n = 1,2,....} and sp{en: n = 1.2,....}

is dense in @(A) with respect to the supremum norm.

 

15

Moreover, for a given e > O and an eigenfunction en, there

exists gm 6 Bm for some m, such that

def.
nlen — ngH = n en — gmH+HAen - AgmN < e.

It can be deduced that sp{f1,f2,....} is dense in

2(A).

(iii) The system of equations (1.5) has a solution such
that V t e [0,T] the weights aﬁn)(t) are nonnegative and

the nodes xén)(t) belong to [0.1].

n
Then Pn(t,’) = 2 aén)(t) 6 converges weakly

k 1 {xff’un

to P(t,°) Vte [0.T], where P(t,°) is the law of X(t).

n
Remark 2.2. Since f E 1, E a(n)(t) = 1 for all t
1 k
k=1
and all n. Therefore with assumption (iii) Pn(t,°) are
all probability measures.
Remark 2.3. Assumption (ii) is certainly satisfied if

the union of the graphs CD = {(f;Af)= feBn} are dense in
the graph G = {(f;Af): fe%(A)} with respect to the norm

HI(f:g)IH = urn + Hg”.

Remark 2.4. The system of dynamic equations (1.5) is a

nonlinear system of differential equations. Therefore, it is

16

hard to verify assumption (iii) in general. However, our

examples illustrate that assumption (iii) holds for some

special cases.

Proof of the Theorem. Let

Eﬁ(t) = I fe(x) Pn(t,dx), te[0,T]
Then by (1.5)
%? Eﬁ(t) = I Afe(x) Pn(t,dx) for e = 1,...,2n.

Let 9 be a fixed positive integer. Since Afe e C [0.1],

3 K such that

e
lAfeI s K,
Therefore,
d e e
SUpte[0,T]ld? En(t)l g K8 V n > 2 '
. e , e .

Hence the class of functions {En(°)- n > 5 , t e [0,T]} IS
equicontinuous. Also, this class is uniformly bounded.

Therefore, there exists a subsequence that converges
uniformly to E:(°) (say). By diagonalization argument we
can show that there exists a subsequence kn such that for
any positive integer e,

E: (°) 6 E:(°) uniformly in [0,T].
n

We will now show that {E£(t)= e = 1,2,....} determines a

probability measure. Since {Pk (t,°)2 n = 1.2,....} is a
n

sequence of probability measure on the compact interval

[0,1], this collection is relatively compact. (See

17

Billingsley [1]; Theorem 6.1, Chapter 1). Therefore every

subsequence {Pk (t,°)} of {Pk (t,°)} has a further
n' n

subsequence {Pk (t.°)} which converges weakly to a
n"

H

probability measure P*(t,°). Furthermore,

(2.1) Eﬁ(t) = lim E: (t) = lim I fe(x) Pk (t,dx)
1(1‘ln'_)m n" 1(Iln_)(n n"

= I 13(x) P;(t,dx).

sp{fe: e = 1.2,....} is dense in @(A) and because X is
a Feller Process @(A) is dense in C [0.1]. Hence

sp{fe: B = 1,2,....} is a measure determining class. By
(2.1) we can thus conclude that every subsequence of

{Pk (t,°)} has a further subsequence which converges to the
n
same limit P*(t,°) (say). Hence {Pk (t,°)} itself
n

converges weakly to the probability measure P*(t,°).

Now

a
ﬂ
m
Was
A
ﬂ
V
II

I (Afe)(x) Pkn(t,dx)

which means

2 e t
(2 2) Ekn(t) - Ekn(0) I0 I (Afe)(x) Pkn(s,dx) ds.

Since IAfeI g Ke.

 

 

18

II Afe(x) Pkn(t,dx)l g Kg for all n.

Taking limits in (2.2)

‘e e e e

E t — E o = 1' [E (t - E 0)]

*( ) *( ) knig kn ) kn(
= l‘ t Af P s,dx ds
knig [I0 I ( e)(x) kn( )1

By the dominated convergence theorem

= t 1' Af )P (s,dx ds
IO [knlg I ,(x kn ) ]

= I5 I Afe(x) P*(s,dx)ds.

Therefore,
(2 3) Ee(t) - Ee(0) — It I Af (x) P (s dx)ds f 11 e
. * * - 0 e * , 0 r a ‘ .

Let Ee(t) = I fe(x) P(t,dx). Then by (1 1) Ee(°) also
satisfy the same integral equation (2.3). Furthermore,
initially E:(0) = Ee(0). We want to show that

Ee(t) = E:(t) for all e and all t. Let e be an

n

eigenfunction. Then, by assumption (ii), for a given 6 > 0,

there exists g e B for some m such that Hle — g I” =
m m n m

Ilen - gm" + IlAen - AgmH < 6.

Then,

II en(x) P*(t.dx) - I en(x)P*(0.dx) - I5 I (Aen)(x)
P,(s.dx)dsl s I I (en(x) — gm(X)) P*(t.dx) - I (en(x)
— gm(x)) P*(0.dx) — I3 I (Aen(x) - Agm(x)) P*(s.dx)ds
I + I I gm(x) P*(t.dx) — I gm(x) P,(o.dx) — I5

I Agm(x) P*(s.dx)dsl < e + e + eT + 0.

19

the last expression being zero by (2.3) together with its
linearity property.
Hence,
I en(x) P,(t.dx) - Ien(x) P*(o.dx) = I5 I (Aen)(x)
P*(s,dx)ds
i.e. %? I en(x) P*(t,dx) I Aen(x) P*(t,dx)

KnI en(x) P*(t,dx).

Therefore,

h t
I en(x) P*(t,dx) = e n I en(x) P*(O,dx).

Likewise

P(t,°) also satisfies,

A t
I en(x) P(t,dx) = e n I en(x) P(O,dx).

Moreover
I en(x) P*(O,dx) = I en(x) P(O,dx).
From these we conclude that
I en(x) P*(t,dx) = I en(x) P(t,dx) for all n.
This together with the assumption about the denseness of the
eigenfunctions imply,
I f dP(t) = I f dP*(t) for all feC[O,1].

Hence Pk (t,°) a P(t,°).
n
The proof given above demonstrates that if we start with

a subsequence {Pk (t,°)} we can show that there is a
n

further subsequence which converges weakly to P(t,°). Hence

Pn(t,°) itself converges weakly to P(t,°) D

20

1 d2 d
Remark 2.5. Let A E — a(x) ———-+ b(x) ——. If the
2 dX2 dx

basis functions are polynomials and the drift coefficient
b(x) and the diffusion coefficient a(x) are bounded then
the graphs UGn = U{(f;Af): feBn} are dense in

G = {(f;Af)= fe@(A)}. To prove this let fe@(A) E C2 [0,1].
Then, by the Weierstrass approximation theorem, for any 6 >

0, there exists a polynomial Pm(x) such that

"f(x) - Pm(x)H + Hf'(x) — P$(x)ﬂ + Nf"(x) - P;(x)ﬂ < e

 

Therefore, if the drift and diffusion coefficients are
bounded, then for any 6 > 0 there exists a polynomial
Pm(x) such that

Hf - P N + HAf - AP N < e.
m m

3. Examples.

We now give some examples, where our techniques work,
but the Gauss-Galerkin approximation cannot be derived
through the Fokker-Planck equation or the monomial basis

functions are inappropriate.

Example 3.1. Consider the diffusion process

charactorized by

2
AEX(1—X)9-—2'
dx

a(A) = c2[0.1]
This example arises as a diffusion approximation for a model

in population genetics (see Kurtz [8] p.29). In this example

 

21

also (1.3) cannot be obtained via the Fokker—Planck equation
approach. It is known that for any t > 0 the fixation
probabilities are positive (see Ewens [6] p.41), i.e. there
is an accumulation of mass at 0 and 1. Therefore the law
P(t,°) has a singular part. Although the density of the
absolutely continuous part satisfies the Fokker-Planck
equation the weak form of the latter does not lead to (1.3)
for all monomials, for example for f(x) E 1. Eq. (1.3) can
nevertheless be derived directly through the martingale
approach for all monomials. For each n, there is a
polynomial of degree n which is an eigenfunction of A (see
Example 1.2 of Chapter 3 for more details). Therefore the
assumptions (i) and (ii) of the Theorem 2.1 are satisfied for
the monomial basis functions. It can be shown that the
Gauss-Christoffel nodes and weights satisfy equation (1.6).
Hence the assumption (iii) is satisfied.

To show that the Gauss-Christoffel nodes and weights
satisfy equation (1.6), we note that the measure P(t,°) has
a nonzero absolutely continuous part (see Ewens [6] pp.
40-41). Therefore assumptions (2.1) - (2.3) of Chapter 1 are
satisfied. Hence, the Gauss-Christoffel nodes and weights
exist. The nodes belong to [0,1] and are distinct. The
weights are positive. Secondly, these nodes are
differentiable functions of t. This is so because they are
zeros of the orthogonal polynomials with respect to P(t,°)

whose coefficients are differentiable in t, and an

22

application of the inverse mapping theorem gives the zeros
are differentiable. The differentiability of the
coefficients of the orthogonal polynomials follows from the
standard Gram-Schmidt orthogonalization procedure and the

fact that polynomials belong to the domain of A which gives

def.
that (p1,p2)t = I p1(x) p2(x) P(t,dx) is differentiable

in t for polynomials p1, p2. It then easily follows that
the weights are also differentiable in t. Thirdly, these
nodes and weights satisfy the system of equations (1.6). We
note that if fm is a polynomial of degree m, then Afm is
also a polynomial of degree less or equal to m. This
together with equation (1.3) applied to fm = xm—1 and
equation (2.4) of Chapter 1 applied to both fm and Afm
imply that the Gauss-Christoffel nodes and weights satisfy
equation (1.6). By the uniqueness of the solution of

equation (1.6) for this case (HajJafar [7], p. 46) we

conclude that the Gauss-Galerkin nodes and weights exists and

are unique. Hence the assumption (iii) of Theorem 2.1 is
satisfied.
Example 3.2. Consider the reflecting Brownian motion X

 

in [0,1] with the initial distribution uniform on [0,%].

Then

<12;

2
dx
a(A) = (f: feC2[0,1], £'(o+) = f’(1

AE

Mp4

) 0}

23

We Show that the function f(x) = x does not satisfy the

basic equation (1.3) namely

gI-E f(X(t)) = E Af(X(t))
For if f(x) = x satisfy (1.3) then, %? E(X(t)) = 0,
v t 2 o and thus E(X(t)) = i v t 2 0. But as t a m
P(t,°) converges weakly to the uniform distribution in [0.1]
(See Lamperti [9], p. 176), which implies E(X(t)) % % as
t a w. This is a contradiction. Hence the monomial basis

functions are inadequate.
The boundary conditions on @(A) suggest that an

appropriate choice of basis functions is

{1, cos wx, cos 2wx,....}
which are eigen functions of the operator A. By the
Stone-Weierstrass theorem sp(1,cos wx, cos 2wx,....} is
dense in C[O,1]. Therefore, the assumptions (i) and (ii) of
Theorem 2.1 are satisfied. Using the map x 4 cos wx,

0 g x g 1, we can reduce the discussion of this basis to a
set of polynonomial basis with respect to a new measure on
[-1,1]. Similar arguments to those in Example 3.1 can then
be used to conclude that the Gauss-Christoffel nodes and
weights with respect to this new measure have the desired
properties. Transporting these to original space [0.1] will

imply the corresponding nodes and weights satisfy equation

(1.6). Hence assumption (iii) of theorem 2.1 is satisfied.

24

Example 3.3. Consider the absorbing Brownian motion in
[0,1]. Then,

A 5 l 9——

2 2
a(A) = (f: f e C2[0,1], £"(o+) = f"(1_) = 0}.
Here the probability measure P(t,°) has a singular part and
therefore (1.3) cannot be derived via the Fokker-Planck
equation approach. An appropriate choice of basis functions

suggested by the boundary conditions on @(A) is

{1,x,sin wx, sin 2wx,....}
These are eigenfunctions of the operator A. Also,
sp{1,x,sin wx, sin 2rx,....} is dense in C[O,1]. Hence

assumptions (i) and (ii) of Theorem 2.1 are satisfied.
Whether the assumption (iii) of Theorem 2.1 is satisfied is

not resolved.

Remark 3.4. Let
2
_ 1 d d
A — 5 a(x) -—§ + b(x) 3;
dx
on the interval [0.1]. Assume a, b are bounded and
continuous. Then at least one restriction of A will generate

a Feller process (see Ethier and Kurtz [5], chapter 8).
Assume further that, a e C2[0,1], b e C1[0,1], a(x) > 0 on
[0,1]. Let the boundary conditions be of the form

f'(0) = f'(1) = 0.
If the operator A on @(A) is a self-adjoint operator then

there exists a countable number of eigenfunctions

25

(en: n = 1,2,....} and sp{en= n = 1,2,....} is dense in
C2[0,1] in the supremum norm. (See Coddingtion and Levinson
[2] p.197.)

The operator A of the reflecting Brownian motion

satisfies these conditions.

 

 

CHAPTER 3

DEGENERATE INITIAL DISTRIBUTION

In many applications it may happen that the initial
distribution is degenerate. For a degenerate distribution
all the Gauss-Christoffel nodes coincide. Therefore, the
system of dynamic equations (1.6) of Chapter 2 becomes
singular initially and cannot be solved. We describe here

two ways of handling this case. We point out that the

 

techniques discussed below are applicable to any initial

distribution as well as the degenerate initial distribution.

1. Method 1.

Sometimes it may be possible to choose appropriate
basis functions {fmt m = 1,2,....} so that the equation
(1.3) of Chapter 2 can be solved before discretizing it.

One such choice is the eigenfunctions of the infinitesimal

generator A. Let {fmt m = 1,2,....} be the set of

eigenfunctions with the corresponding eigenvalues Am. Then
Amt

(1.1) E fm(X(t)) = E fm(X(0)) e

Discretizing this system we obtain

7\ t

(1 2) aﬁn)(t) fm(x£n)(t)) = E fm(X(O)) e m , 1gmg2n.

r
IIM'J

1

26

27

This is a system of nonlinear equations for the nodes and
weights and may be solved numerically using subroutines

available in IMSL (for example, ZSCNT or ZSPOW).

Theorem 1.1. Let aﬁn)(t) > 0, xén)(t)eS, 1 g k g n,

be a solution of (1.2). Let

Pn(t,°) = % aén)(t) a If sp{f1,f2,....} is

k 1 {x£n)(t)}'

dense in C(S). Then Pn(t,°) é P(t,°).

Proof. By (1.2)
Amt
E fm(X(0)) e , lgmg2n

= I rm dP(t,-).

Let e > 0 and f e C(S). Then there exists

N

g = E c f
6 8:1 2 8

I rm dPn(t,°)

such that

Hf — g6" < 6.

Then
II t dpn(.,.) — I f dP(t.°)|
s I If — geIdPn(t.°) +II g6 dPn(t.-) -I g6 dP(t.°)I
+ I If - gel dP(t.°)
< e + 0 + e V n 2 g.
Hence I f dPn(t,°) a I f dP(t,°) for every f e C(S)

i.e. Pn(t,-) :>P(t,o) u

28
We now give some examples.

Example 1.2. Let the infinitesimal generator
1 d2 d

A E — a(x) ——— + b(x) ——, with b(x) a polynomial of degree
2 dx2 dx

g 1, a(x) a polynomial of degree g 2 and either b(x) has

degree 1 or a(x) has degree 2, then for each m 2 1 we

can find a polynomial fm of degree (m-1) such that Afm
Amfm. Therefore if the equation (1.3) of Chapter 2 is
satisfied by all polynomials then we could choose {fmi m =
1,2,...) as our class of basis functions and approximate
the law even with degenerate initial distribution.

As an illustration of this let a(x) = x(1—x), b(x) = 0
so that A is defined on [0.1] by

2

x(l - x) 9—-

dx2
with a(A) = c2[o,1]. (Example (3.1) of Chapter 2).

To find the eigenfunction fm of degree m - 1, let

m-l m-2 .
fm(x) = x + am_2x + ... + a0. We want to find
Am, am_2,...,a0 such that,
Af = A f
m m m
i.e.(x—x2)[(m-1)(m-2)xm_3+am_2(m—2)(m-3)xm—4+...+2a2]

m-l m-2
+a x + . +a

= Am(x m-2 " o)

 

29

Equating the coefficients of each degree we get, for m > 2

Am = -(m—1)(m-2)
Amam-2 = (m-1)(m-2) - (m-2)(m—3) am_2
Amam-B = (m-2)(m-3)am_2 — (m-3)(m-4)am__3
Amal = 2a2
a0 = 0
and this system can be solved for a0,a1,...,am_2. For
m = 0,1; the eigenvalues are 0 and the corresponding

eigenfunctions can be taken to be 1.x.
The first few eigenvalues and the corresponding
eigenfunctions are,

0, 0, -2 , —6 ,...

1 x x2—x x3 — g x2 + — x
I 9 i 2 2 ""

These are related to the Gegenbauer polynomials (see Ewens
[6] p.140). We already know (see Example 3.1 of Chapter 2)

that the Gauss-Christoffel nodes and weights exist. These

nodes and weights satisfy equation (1.2).

Example 1.3. a) Brownian motion in [0,1] with
reflecting barriers, i.e.

2
1 d 2 . . .
5 $5 , @(A) = {f e C [0,1]' f (0+) = f (1

The basis functions can be chosen to be

A

) = 0}.

{1,cos wx, cos 2wx,...}

 

30

b) Brownian motion in [0,1] with absorbing boundaries.

Ihen
% ——§ @ A = f e C 0 1 ' f" 0 = f" 1

The basis functions can be chosen to be

A

0).

{1,x,sin wx, sin 2rx,...}.

Both a) and b) have solutions satisfying (1.2).

2. Method 2.

Let (X(t): t 2 0} be a Feller process with X(O) = x.
Then by Theorem 1.2 of Chapter 1 we have the stability
property with respect to the initial distribution, i.e. if
(Xn(t): t 2 0} is a sequence of processes with the same
transition probabilities as X(t) and such that
Xn(0) E X(O) then Xn(t) E X(t). Therefore we can
approximate the discrete initial distribution by a
continuous distribution and approximate Xn(t) using
Gauss-Galerkin techniques. The Gauss-Galerkin
approximations of Xn(t) approximate X(t). We state these

ideas in the following theorem.

Theorem 2.1. Let (X(t): t 2 0} be a Feller process
with X(O) = x. Let (Xn(t): t 2 0} be a sequence of
Feller processes with the same transition probabilities as
(X(t): t 2 0} (i.e. they have the same infinitesimal

generator) and with initial distribution such that

 

 

31

xn(0) : X(O) = x. Let

m
P (t) = 2 a(m)(t) a
n,m _ n,k (m)

k_1 {Xn,k(t)}
be the mth stage approximation of the law of Xn(t). If
the convergence of the Gauss-Galerkin approximations holds,

then for every bounded continuous function f

1“
lim lim I f dPn m(t) = lim lim 2 f(xgmﬂ(t)) a£m£(t)
new mam ’ new mew k=1 ’ ’

E f(X(t))

I f dP(t).

Ppppi. {Xn(t): t 2 0}, {X(t): t 2 0} all have the
same transition probabilities and have Feller property.
Therefore Xn(0) E X(O) implies Xn(t) E X(t) by Theorem
1.2 of Chapter 1. By the convergence result of

Gauss-Galerkin approximation, for fixed n,

m

$1: kil f(x£T%(t)) = E f(Xn(t)) Vn, f e C(S)
Xn(t) E X(t) gives lim E f(Xn(t)) = E f(X(t)) V f e C(S).

new
Therefore for all f in C(S)

“1
lim lim k21 f(x£T£(t)) aéT;(t) = E f(X(t)).

Note. When we approximate the degenerate initial
distribution, the nodes of the approximating initial measure
are close to each other and therefore the Gauss-Galerkin

system becomes ill-conditioned initially.

CHAPTER 4

NUMERICAL EXAMPLES

In this chapter we present several numerical examples,
that serve to illustrate the Gauss-Galerkin method developed
in the preceeding chapters. We have considered examples
where the Gauss-Galerkin approximations can be obtained
using the techniques described in Section 1 of Chapter 2 and
also using the techniques described in Section 1 of Chapter
3. The numerical results are then compared. Dawson [4],
HajJafar [7] also have considered several numerical examples
and analyzed them.

The system of nonlinear equations are solved using the
subroutine ZSCNT available in IMSL. The system of ordinary
differential equations are solved using the subroutine
DGEAR.

1. Example 1.

Consider the reflecting Brownian motion in [0,1] with a

given initial distribution. Then,
2 dx2

and a(A) = (i: ie02 [0.1], f'(0+) = f'(1—) = 0}

We can choose the basis functions to be
{1, cos rx, cos 2nx,....}
The approximations were computed with 5 nodes and 5

weights at t values 0.0, 0.1,....,0.6. The numerical

32

33

results are given in Table 1. For each t value the first
row gives the approximate values obtained by the Method 1 of

Chapter 3 for the initial distribution 6{l}. The second
2

row gives the approximations obtained by the Method 1 of
Chapter 3 for the initial distribution uniform on [0.4.
0.6]. The third row gives the approximations obtained by
the Method 2 of Chapter 3 which amounts to solving the
Gauss-Galerkin system (1.5) of Chapter 2, with the initial

distribution uniform on [0.4, 0.6].

Note that x = 0.5 for all t. Also a1 = a

x = x and a = 1-2 (a1+a

4 3 2)'

As we pointed out earlier in Example 3.2 of Chapter 2
the process converges weakly to the stationary distribution
which is uniform on [0,1]. Our numerical results show that
the 5 points Gauss-Galerkin approximations converge to the
values listed for t = 0.6. These values are the
Gauss-Christoffel nodes and weights for the uniform
distribution on [0,1] with the basis functions {1, coswx,
cos 2wx,...}. This behavior of the approximations are
consistent with the fact that the process converges weakly

to the uniform distribution on [0,1].

00

GOO 000 000 000 GOO 000

.12058
.12058

.16389
.16622
.16609

.19500
.19532
.19513

.19931
.19935
.19925

.19990
.19991
.19985

.19999
.19999
.19996

.20000
.20000
.20000

000 000 000 CO

GOO GOO 000

34

TABLE 1

.23890
.23890

.21433
.21338
.21341

.20192
.20180
.20187

.20027
.20025
.20029

.20004
.20003
.20006

.20000
.20000
.20001

.20000
.20000
.20000

CO

GOO GOO 000 000 GOO 000

.40958
.40958

.10623
.10576
.10623

.10074
.10069
.10072

.10010
.10009
.10011

.10001
.10001
.10001

.10000
.10000
.10000

.10000
.10000
.10000

000 GOO 000 CO

GOO 000 CO

.44661
.44661

.30800
.30750
.30799

.30116
.30108
.30113

.30016
.30015
.30017

0.30002
.30002
.30002

.30000
.30000
.30000

.30000
.30000
.30000

 

 

35

2. Example 2.

Consider the diffusion process whose infinitesimal

generator

91

2
dx
a(A) = 02 [0,1].

A E x(1-x)

The basis functions can be chosen to be the
polynomials. The numerical results are given in Table 2.
For each t value, the first row gives the approximate
values obtained by the Method 1 of Chapter 3 for the initial
distribution uniform on [0.4, 0.6]. The second row gives
the approximation obtained by solving the Gauss-Galerkin
system (1.6) of Chapter 2.

Our numerical results show that the smallest and the
largest Gauss-Galerkin nodes converge to 0 and 1
respectively and the corresponding weights converge to 0.5.
This behavior of the approximations are consistent with the

1
fact that the rocess conver es weakl to - 6 + 6 .
P g y 2( {O} {1})

36

Table 2

Initial distribution U [0.4, 0.6]

 

t a1=a5 a _a4 1=1-x 2=1-x
0 0 0.11846 0.23931 0.40938 0.44615
0.11846 0.23931 0.40938 0.44615

0.1 0.05146 0.24751 0.06075 0.26014
0.05246 0.24862 0.06186 0.26056

0.2 0.11098 0.23237 0.02391 0.22816
0.11246 0.23231 0.02418 0.22891

0.3 0.17433 0.20084 0.01252 0.21877
0.17499 0.20069 0.01251 0.21870

0.4 0.23063 0.16801 0.00756 0.21491
0.23118 0.16787 0.00757 0.21487

0.5 0.27844 0.13878 0.00501 0.21330
0.27894 0.13865 0.00501 0.21374

0 6 0.31818 0.11406 0.00353 0.21254
0.31864 0.11392 0.00354 0.21262

0.7 0.35096 0.09356 0.00258 0.21215
0.35129 0.09352 0.00260 0.21212

0.8 0.37788 0.07667 0.00195 0.21192
0.37810 0.07674 0.00196 0.21181

10.

11.

12.

REFERENCES

Billingsley, Patrick (1968). Convergence of Probability
Measures. John Wiley, New York.

Coddington, Earl A. and Levinson, Norman (1955). Theory
of Ordinary Differential Equations. McGraw Hill, New
York.

Davis, Philip J. and Rabinowitz: Philip (1984). Methods
of Numerical Integration. Academic, Orlando.

Dawson, Donald A. (1981). Statistics and Related
Topics. M. Csorgo et.el. editors. North-Holland,
Amsterdam.

Ethier, Stewart N. and Kurtz, Thomas G. (1986). Markov
Processes. Characterization and Convergence. John Wiley
and Sons, New York.

Ewens, Warren J. (1979). Mathematical Population
Genetics. Springer-Verlag, Berlin.

HajJafar, Ali (1984). On the Convergence of the
Gauss-Galerkin Method for the Density of Some Markov
Process. Ph.D. Thesis, Department of Mathematics,
Michigan State University.

Kurtz, Thomas G. (1981). Approximation of population
processes. CBMS-NSF Regional Conf. Series in Appl.
Math. 36, SIAM, Philadelphia.

Lamperti, John (1977). Stochastic Processes.
Springer-Verlag, New York.

Stoer, J. and Bulirsch, R. (1980). Introduction to
Numerical Analysis. Springer-Verlag, New York.

Stroud, A.H. (1974). Numerical Quadrature and Solution
of Ordinary Differential Equations. Springer-Verlag,
New York.

Wentzell, A.D. (1981). A Course in the Theory of
Stochastic Processes. McGraw Hill, New York.

37