.4 1 t
. 2.5; . .
1

I (V
".9...

'm

 

a??? a.

 

 

. H 3 LIBRARY
Michigan State
University

 

 

 

This is to certify that the

dissertation entitled

Bayesian Modeling 0n
Inhomogeneous Point Patterns Via
Independent Increment Random Measures

presented by

Fanzhi Kong

has been accepted towards fulfillment
of the requirements for

W/

Majorg rofcssor

Dennis giiiiland

 

 

 

Date Auqust 5, 2002

MS U is an Afﬁrmative Action/Equal Opportunity Institution 0-12771

 

PLACE IN RETURN BOX to remove this checkout from your record.
TO AVOID FINES return on or before date due.
MAY BE RECALLED with earlier due date if requested.

 

DATE DUE DATE DUE DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

6/01 c:/CIRC/DaIeDue.p65-p.15

 

 

 

BAYESIAN MODELING ON
IN HOMOGENEOUS POINT PATTERNS VIA
INDEPENDENT IN CREMENT RANDOM MEASURES

By

Fanzhi Kong

A DISSERTATION
Submitted to
MICHIGAN STATE UNIVERSITY
in partial fulﬁllment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
Department of Statistics and Probability
2002

Abstract

BAYESIAN MODELING ON
INHOMOGENEOUS POINT PATTERNS VIA INDEPENDENT INCREMENT
RANDOM MEASURES

By

Fanzhi Kong

The main objective in this thesis is to investigate the inference and prediction
of the intensity functions governing inhomogeneous point patterns. The intensity
functions are modeled as random ﬁelds directed by independent increment random
measures in a reference state space. Random ﬁelds are the foundation of spatial data
analyses. One constructed model on point patterns is based on the Poisson general-
ized Gamma random ﬁelds. The generalized Gamma measures can be simulated by
the Inverse Lévy Measure algorithm. For highly clustered point patterns, it is more
appropriate to apply the geostatistical modeling approaches after the log transforma-
tion of count data. The related models are based on Wiener measure subordinated
random ﬁelds, which are closely related to the ordinary random ﬁelds speciﬁed directly

by mean and covariance functions.

Another concern is to make full use of collateral information obtained along
with point patterns. The collateral data, treated as ﬁxed covariates and covariate
stochastic processes, are considered in the setting of joint modeling on the intensities.
The involved modeling approaches follow both theoretical and simulation methods.
The corresponding MCMC schemes are proposed for each data model. The actual
implementation is demonstrated and performed for the joint model on counts and

binary clipped random ﬁelds.

Acknowledgements

My deepest regards and sincerest thanks go to my thesis advisors, Professors
Dennis Gilliland and Oliver Schabenberger, for their many instructions and great
enthusiasm support during this research and for their guidance through the early
years of chaos and confusion. I sincerely appreciate Professor Gilliland’s inspiration
and extreme patience in the ﬁnal stage of preparing this dissertation. My committee
members Professors V. S. Mandrekar, Habib Salehi and Yimin Xiao provided the
technical assistance that improved some results and eliminated errors; their help was

invaluable.

Fuxia Cheng and Yichuan Xia shared with me their knowledge of second order
analysis and provided many useful references and friendly encouragement. I am grate-
ful to the other members of the Department of Statistics and Probability at Michigan
State University who make the department such an exciting, friendly and inspiring

place to study and work, in particular to Professor James Stapleton.

For ﬁnancial support I thank Professors Oliver Schabenberger, Patrick Hart and
Shengyang He, who mainly provided the research assistantship support for this project
during my graduate studies. I should also mention that my graduate studies in the
United States were supported in part by the Department of Crop and Soil Sciences

and the DOE Plant Research Laboratory at Michigan State University.

East Lansing, Michigan, USA Fanzhi Kong
July 31, 2002

iii

Table of Contents

List of Tables

List of Figures

1 Spatial Poisson-Generalized Gamma Random Fields
1.1 Point Patterns and Collateral Information ...............
1.2 Thesis Synopsis ..............................
1.2.1 Terminology and Notation ....................
1.3 Poisson-Generalized Gamma Random Field Models ..........
1.3.1 Generalized Gamma Random Measures .............

1.3.2 Intensity Models and Modeling Objectives ...........

2 Joint Inference on Counts and Binary Random Fields
2.1 Model Speciﬁcation and Representation .................
2.2 Counts and Binary Data .........................
2.3 Joint Inference on Counts and Binary Random Fields .........
2.3.1 Graphical Representation ....................
2.3.2 MCMC Scheme ..........................
2.4 Simulation and Results ..........................
2.4.1 Simulation Setup .........................

2.4.2 Posterior Analysis .........................

3 Models Based on Log Gaussian Random Fields
3.1 Models for Log Counts ..........................
3.1.1 Model Speciﬁcation ........................
3.1.2 MCMC Inference .........................

3.2 Clipped Intensity Models .........................

iv

vi

vii

N01155:)

15

19
19
24
26
26
27
31
31
34

39
40
40
42
46

3.2.1 Second Order Structure of Binary Random Fields .......
3.2.2 MCMC Inference .........................
3.3 Joint Inference of Point Patterns and Geostatistical Random Fields

3.3.1 Association between Point Patterns and Covariate Processes .

3.3.2 MCMC Inference on Joint Models ................

4 Summary of Results and Further Investigation
4.1 Summary of Results ...........................

4.2 Further Investigation ...........................

Bibliography

48
50
54
55
59

64
64
65

67

2.1

List of Tables

True model and other models

oooooooooooooooooooooo

vi

2.1

2.2

2.3

2.4

2.5

2.6

2.7

3.1

3.2

3.3

List of Figures

Homogeneous Poisson Voronoi tessellation ............... 21
DAG of counts and binary random ﬁelds ................ 27
Empirical counts plot ........................... 32
Posterior means for intensities over lattice cells ............. 35
Posterior density for log—6 ........................ 36
Posterior density for b1 .......................... 37
Posterior density for b2 .......................... 38
DAG of log Gaussian models ....................... 43
DAG of clipped intensity models ..................... 50
DAG of point patterns and geostatistical models ............ 59

vii

Chapter 1. Spatial Poisson-Generalized Gamma

Random Fields

The study of inhomogeneous point patterns often involves the treatment of ad-
ditional spatial data types. The joint analysisof point patterns and collateral in-
formation may improve the inference and prediction efﬁciency. The consideration is
demanded because point patterns are practically most expensive and collateral data
are often available along with point patterns. Thus, the methods of joint analysis of
point patterns and their collateral data are highly recognized recently. The follow—
ing section gives a short introduction to point patterns and other spatial data types

considered in this thesis.

1.1 Point Patterns and Collateral Information

Consider a bounded subset S of the d—dimensional Euclidean space with d 2 1.
A point pattern, {S}; j = 1, - - - ,J}, is the set of locations of observed events from

all sampling windows {A,-} in S.

Other spatial measurements {Ygi = 1, - - - ,n} are taken from the corresponding
locations {3,} C S. The locations {5,} are explicitly / implicitly associated with sam-
pling windows {A;} for obtaining feasible measurements. Practical applications often
involve those random variates Y,- distributed on a regular lattice. Examples include
image analysis, where locations represent pixels, and crop experiments, where they
equate with plots in the ﬁeld. Another example is a two-way table with responses that
are the aggregate of independent row effects, column effects and higher order effects.

In this context, the locations {3,} are called the positive integral coordinates or

 

the generic locations. One may distinguish point patterns from other types of
measurements by associating Y, with the sampling window A, with s,- E A,- C S. The
sampling windows {A,-} may fall far away from each other or may make contact with
their neighbors. In the former case, {3,} can be assumed to have no distinguishable
points in the space concerning the sizes of sampling windows, and in the latter case,
{3‘} together with {24,-} form a partition of the study region. When the relative size
of sampling windows can not be neglected, and sampling windows are disconnected,
{8;} forms a network with features of the sampling windows as node attributes.
The corresponding data {Yi} may be called geostatistical data, lattice data and

network data, respectively.

The spatial data can also be categorized by the characteristics of observables and
may be grouped into three categories: geostatistical data, point pattern data
and categorical data (or random setsl). Cressie (1993) classiﬁes the general spatial

data into three types, i.e., geostatistical data, lattice data, and point pattern data.

1.2 Thesis Synopsis

The choice of spatial models is dictated by the particular form of the spatial data.
It is also inﬂuenced by the modeling objectives, which may be to summarize data,
to predict dynamics, or to model mechanistically in order to advance understanding.
In this thesis, the data representation and interpretation for different types of spatial
data will be based on the Voronoi tessellations. The corresponding models will be
constructed with the aid of their directed acyclic graphic representations. The direct
approaches for inference and prediction in inhomogeneous models are generally not

feasible. We will follow Monte Carlo and Markov Chain methods, and all the required

 

1The random set is the main subject of the spatial geometry.

full conditional distributions are derived or approximated for the prOposed models.

The thesis develops several models on inhomogeneous point patterns in the pres-
ence of covariates and covariate stochastic processes. A chapter based synopsis of the

involved models and their analyses follow.

The rest of this chapter concerns the Poisson generalized Gamma random ﬁelds.
We establish properties of the generalized Gamma measures that are useful for mod-
eling and inference for point patterns. We also discuss the incorporation of a covariate

into the analysis.

Chapter 2 performs the simulation study for the joint model of count data and
censored data. In Section 2.1, we continue the work on the two-stage model with a
particular generalized Gamma measure and with a particular data structure (count
and indicator variables). The data structure is more fully described in Section 2.2;
it consists of count data on a subset of the cells of a regular lattice and indicator
variables for counts in excess of a cutoff point on the other cells. In Section 2.3,
we develop the full conditional distributions needed for an MCMC implementation.
Section 2.4 gives the results of the MCMC implementation based on speciﬁc data

values.

Chapter 3 studies modeling on point patterns with the intensity function as the
transformed kernel mixture of Wiener measures, i.e., log Gaussian random ﬁelds. In
Section 3.1, we introduce a model for log-count data that includes covariates and an
additive error. In Section 3.2, we motivate a model for binary data derived from a
clipped intensity function. The second order structure is discussed. In Section 3.3,

we discuss inference for point patterns that associate with a covariate process.

Chapter 4 summarizes the main results in the thesis and discusses some issues

for further investigation.

1.2.1 Terminology and Notation

Some terminology and notation described here will be used through the thesis.
The Bayesian approach is natural for the type of statistical problems treated in the
thesis. The Bayesian approach to statistics uses probability measures, usually deﬁned
in terms of density functions, to quantify and assess unknown parameters and other
unknown quantities, so it involves a direct application of probability theory. Densities
and probability functions of random vectors X and Y will be denoted generically by
square brackets in the context, so that the joint, conditional and marginal forms
appear, respectively, as [X , Y], [X |Y], and [Y]. This notation is used by Gelfand et
al. (1990a). The usual marginalization by integration or summation procedure will

be denoted by forms such as [X] = [X [Y] [Y]

When applying the Monte Carlo Markov Chain methods, we will use the term
‘MCMC’ for simplicity. In deriving the conditional densities, the phrase “is propor-
tional to” is denoted with the symbol ‘oc’. For the log conditional densities, the
involved additive constant will be denoted by a generic constant ‘cg’. This convention
is convenient because it reduces the amount of notation, and since these densities are
seldom evaluated, it produces no confusion. The density or distribution of a random
vector X will denote by ‘~’, interpreted as “X is distributed as”. The transpose of a

vector or a matrix X will be denoted as X T.

In defining random measures the underlying probability space will be denoted by
(Q, f, P). We will denote the Lebesgue measure on d—dimensional Euclidean space Rd
by ‘| - |’. Among others, ‘GG’ is the shorthand for the generalized Gamma measures

or distributions with arguments supplied in the context. Finally, the geostatistical

random ﬁelds, models, and data will be used to emphasize the continuous nature over

a subset of Euclidean space Rd.

1.3 Poisson-Generalized Gamma Random Field Mod-

els

In analyzing point patterns the unifying theme is to model its spatial uncertainty
through its (first-order) intensity function. Point patterns are commonly modeled as
a Poisson process N (ds) over the state space S (and its Borel a-algebra S), which
provides the independency structure for mathematical manipulation. The N (ds) is
an inhomogeneous Poisson process with mean measure A if (i) for any A E S,
P(N(A) E {O,1,~ -- }) = 1, and for any collection of disjoint sets A1, - -- ,An 6 S, the

random variable N (A1), - -- ,N(An) are independent; (ii) for all s E S,

P(N(ds) = O) = 1 — A(ds) + 0(A(ds))
P(N(ds) = 1) = A(ds) + o(A(ds)) (1.1)
P(N(ds) > 1) = 0(A(ds))

where ds is an infinitesimal region located at s E S. From these two postulates, it

has been shown that N (A) has a Poisson distribution with mean measure A(A), for

all A E S, (e.g., pp13-14, Rogers, 1974),

A A j -A(A)
P<N<A> =1) = -(—)j‘i——— j = o.1.2.--- (1.2)
Point patterns may present spatial correlation. The double stochastic process

about to be described is a generic structure for creating spatial correlation. In this

study, the intensity function A(s) of a point pattern is modeled as a random ﬁeld,

given by the stochastic integral,

A(s) I=Lk(S,U)I\I(dU) (1.3)

where M (du) is an independent increment random measure on a reference state space
III (also a bounded subset of the d-dimensional Euclidean space) with the control
measure A(du) and the nonnegative deterministic kernel k is such that
sup k(s, u) E £2(lU,L(, A) (1.4)
sE S
where U is the Borel a-algebra of III. The kernel mixture type of independent incre-
ment random measures provides a ﬂexible family of models for inhomogeneous point
patterns with a simple and interpretable structure. The speciﬁcation of kernels often
relies on the preliminary analysis of semi-variogram structures. The resulting models

for point patterns are called Com M models.

The Cox M models have been used to explain the interaction among the point
events at small scales. Heisel et a1. (1996) use one to quantify the small range
covariance structure in a weeds application. Wolpert and Ickstadt (1998a, 1998b)
have deveIOped the theory of the Poisson Gamma random fields, that is, when M (du)
is a Gamma measure. Brix (1999) discusses the properties and the simulation of the

generalized Gamma measures with constant parameter functions.

In this study, we consider the generalized Gamma measures (CG-measures
for short), with possibly nonconstant parameter functions over III. The family of
GG-measures includes Gamma random measures, positive stable random measures
and inverse Gaussian measures. The family of Cox 66' models is a Poisson cluster
process type and includes Poisson processes and ordinary Neyman-Scott processes.
As such the family provides models which can be interpreted both as germ-grain

models (Stoyan et al., 1995) and as overdispersion models for point patterns.

1.3.1 Generalized Gamma Random Measures

The Laplace functional for random measures plays a role similar to the Laplace
transform for random variables. The distribution of a random measure M (du) on a
measurable space (M, .M) is uniquely determined by its Laplace functional (Stoyan et
al., 1995). The Lévy-Khinchin representation (Jacod and Shiryaev, 1987 and W'olpert
and Ickstadt, 1998a) for a nonnegative independent increment random measure M

has a Laplace functional

Ee"M[fl = exp (—ab]f] + // (e'hﬂu) - 1)€(dudh)) , (1.5)
u-

where f (u) is any compactly-supported bounded measurable function, and the drift
(1')] f] = In f (u)¢(du) with ¢(du) a a-ﬁnite positive measure on U. The inhomogeneous

Lévy measure [(dudh) on ID“ := ID x IR+ satisﬁes the integrability condition

// (1 /\ h)[’(dudh) < 00 (1.6)
BXIR+

for each compact B E U. Hence, the random measure AI is determined by the pair

of measures (d), f).

In the rest of this chapter and Chapter 2 we take M (du) to be the general-
ized Gamma measure. It is the GG(a(u), /\(du), b(u))-measure speciﬁed by the Lévy

measure,
_1_
F (1 - a(U))

where a(u) < 1 is the index parameter function, b(u) > O is the inverse scale

[(dudh) = h"°(“)‘le‘b(“)h)t(du)dh, (1.7)

parameter function, and the boundedly ﬁnite measure /\(du) is the control measure
on (11,“). By a boundedly ﬁnite measure, we mean it is ﬁnite on every bounded Borel
set. A suﬂicient condition for the Lévy measure 6 in (1.7) to satisfy the integrability

condition in (1.6) is given by: for every compact B G U,

/ b(u)°(“)‘1)\(du) < 00. (1.8)
B

We take a(u) and b(u) to be continuous on U unless speciﬁed otherwise.

The Generalized Gamma Distributions

The GG—measures are derived from the family of generalized gamma distribu-
tions, GG(a, A, b). This family is suggested by Tweedie (1984) and Hougaard (1986),
and considered in various directions by Bar-Lev and Enis (1986), Jorgensen (1987),
Aalen (1992), Hougaard et a1. (1997), and Brix (1998). A GG(a, A, b)-distribution is
concentrated on (0,00) and inﬁnitely divisible in A, which make it a natural basis for
deﬁning independent increment random measures. This also has an important impli-
cation in assigning independent random variables to disjoint subsets in the reference

state space in modeling spatial data.

The family of GG(a, A, b)-distributions is characterized by its Laplace transform,

A
L(t|a,A,b)=exp (— ((b+t)°—b°)), 120,

E
where a _<_ 1, A > O and b 2 0. The family includes three subfamilies, separated by

the index parameter a.

For a = O, the family is deﬁned only for b > 0 through the limit of its Laplace
transform, limano L(t|a, A, b). It consists of the Gamma distributions GG(O, A, b) with

the shape parameter A and the inverse scale b.

For a < 0 and b > O the family is a convolution of a Poisson distributed number
of Gamma random variables with shape —a and inverse scale b (Aalen, 1992), and its

absolutely continuous component has density

(hIaAb)-ex bh+5b° ESE—i— ——A—)k (19)
q ” ‘ p a hkzlk!I‘(—ka) ah“ '

on (0,00); the point mass at zero is given by

 

Ab“
q<0|a.A.b)=exp(a), ,\>o.

For 0 < a < 1, the family is the natural exponential family generated by the
positive stable distributions with the stable index a; in particular, for A > 0 and

b 2 0, the density is given by

A a 1 F(ka+1) A ", ..
q(h]a,A,b) = —exp (bh + Eb ) E Z T( ) Sln(0.fus). (1.10)

:x:
ah“
k=1

Its saddle point approximation is discussed by Hougaard (1986).

For a GG(a,A,b) distribution, all the moments exist for b > O, and the rth

cumulant, r > 1, is given by
x, = A(1-— a)(2 —a)---(r — 1 — a)b“".

The squared coefﬁcient of variation (i.e., the ratio of the variance and the square of
the mean) is (1 — a) / (Aba). The mean of the distribution is Ab“‘1 and the variance is
(1 — a)Ab“‘2. The conditional moments for the continuous part of the distribution in

the case a < 0 was considered by Aalen (p955. 1992).

Aalen (1992) uses the powerful tool Mathematica to perform numerical studies
of these densities and demonstrates the irregularly curved density surfaces for vari-
ous ranges of parameters. The densities of GG-distributions are generally difﬁcult to
compute due to multi-modality except for special cases, such as the Gamma distri—
bution GG(0, A, b) and the inverse Gaussian distribution GG(1/2, A, b). In statistical
problems involving CG—distributions, it may be difﬁcult to apply density computation
oriented methods such as the maximum likelihood inference. In analyzing inhomoge-
neous point patterns with the Cox M models, the MCMC method is the main tool

for numerical implementation. This will be used Chapter 2.

Poisson Processes Representations

Any independent increment random measure M (du) on U can be uniquely de-

composed into the following form, (see Theorem 6.3VIII, Daley and Vere-Jones, 1988)

M(du) = awn) + f... hN“(dudh.) + i Wj6u1(du). (1.11 )

0 j =1
Here the sequence {uJ} enumerates the countable set of ﬁxed atoms of M, 6,, is the
Dirac measure in u, and {Wj} is a sequence of mutually independent nonnegative
random variables determining the masses at these atoms. ¢(du) is a ﬁxed diffuse
boundedly ﬁnite measure on U and N *(dudh) is a Poisson process on U’, independent
of {Wj}. The intensity measure 8 of the process may be unbounded on sets of the form

B x (0, 6), but satisﬁes (1.6) for every bounded Borel set B, in which [({u} x (0, 00)) =
0 for all u E B.

The Poisson process N ‘ in (1.11) can be viewed as the generalized marked Pois-
son process on U with the marking space 1R1“. In particular, the GG-measures can be

represented as or approximated by the marked Poisson point processes.

For a probability space (9, .7“, P), a random measure M is said to have a fixed
atom u if P(M({u}) > 0) > 0 and is diffuse if M({u}) = 0 for every u E U (p4,
Karr, 1991). The following proposition shows that a GG-measure is almost surely
purely atomic.

Proposition 1. For a GG-measure [W with its Le’vy measure 8 given by (1.7), M is

almost surely purely atomic and has no ﬁxed atoms if and only if A(du) is diffuse.

Proof. The following proof can be extended to any random measure M having the
Laplace functional (1.5) without the drift (2’) and with its Lévy measure 6 satisfy-

ing (1.6).

10

Let N’ be the Poisson process on ID" with the intensity measure l(dudh) satisfy-

ing (1.6). Consider the random measure M’ on U given by
M’(du) = / hN’(dudh)
0

which is almost surely purely atomic. For a bounded set B 6 LI , the Laplace transform
of AI’(B) is then, for t 2 0,

E(e“""[’3]) = Eexp (—/U/0°°th13(u)N'(dudh))

= Eexp (— / thN’(B x dh))
O

= exp Occur“ — 1)e(B x (111))

0
= exp ( / (e-WBM — 1)£’(dudh))
ID 0

where the third equation follows from Campbell’s theorem; see Kingman (p82, 1993)
for a constructive proof. By taking limiting simple function approximations for all
compactly-supported measurable functions f on U, it follows from the dominated
convergence theorem and the integrability condition (1.6) that the Laplace functional

of M’ is as follows:

E(e-M’Ifl) = Eexp (—// f(u)hN’(dudh))
U 0
= exp (// (e‘hm‘) — l)€(dudh)) .
U 0
Thus, M and M’ have the same Laplace functional and consequently follow the same
distribution. The fact that a GG-measure M is almost surely purely atomic, and has

no ﬁxed atoms if and only if A(du) is diffuse follows from the result (1.11). C]

Later we use the so—called Inverse Le’vy Measure (ILM) algorithm2 to con-

struct a generalized marked Poisson process representation of GG-measures. Let

 

2Wolpert and Ickstadt (19983.) were the ﬁrst to propose this algorithm.

11

H(du) be a Borel probability measure on U and assume that A(du) have a density
A(u) with respect to the measure II(d-u). Let [(u, h) denote the density of the mea-
sure l(dudh) in (1.7) with respect to the measure 17 (du)dh. Let {Uj} be independent
identically distributed from the probability distribution 17(du), and tj 2 0 be the

successive jump times of a unit rate Poisson process. Set

Tu(h) :2/ l(u,:r)d.7: (1.12)
h
and, for each j = 1,2, - - ~ , deﬁne
Hj I: lnl{hZOITUJ(h) Stj}. (1.13)

Then, for a compactly-supported bounded measurable function f (u), the random

measure given by

Mm == 2 H.f<U.>

j <30

has the Laplace functional of the form ( 1.5) with its Lévy measure given by (1.7). In
particular, for a bounded Borel B in U, we have

M(B) = Z rig-603(3). (1.14)

J <30
The number of terms in (1.14) depends on the index parameter function a(u).
The following proposition shows that the sum (1.14) consists of inﬁnitely many terms
when 0 S a(u) < 1 on U. For a random measure M, a Borel B E L! and e > 0 deﬁne
N,(B) to be cardinality of the set {u 6 B; M({u}) 2 e}, i.e., the number of atoms for
M in B with mass at least 6 with e > 0. Denote by 511(3) the support of M within

'8.

Proposition 2. For any random measure M (du) with its Le’vy measure 6, N6 is a
Poisson process on (UN) with ﬁnite intensity measure €€( ) := €(- x [e,oo)). In

particular, if M (du) is a GG-measure with a(u) < 0 for all u in U, then, for any

12

compact B in U, the number ofjumps in SM(B) is ﬁnite; for 0 S a(u) < 1 for all u

in U and any Borel B in U, 5111(3) is dense in the support of A.

Proof. For each 6 > 0 and with the notation in (1.11), Nc can be written as N£(B) =
N "(B X [6, 00)), which is a Poisson process with intensity [6. Deﬁne No as the almost
sure limit of the process N( as e —+ 0. Recall that a(u) and b(u) are continuous, they
are bounded on a compact set B. If a(u) < 0 on U and b(u) > 0, then a(u) g —6

and b(u) 2 6 for some 6 > 0. Thus
lim0 66(B) = limO / / €(dudh)
i 1 6 7 B . (1.15)
u)a(“)A(du)

Z/BM —a<u)

which implies that No is a Poisson point process with ﬁnite intensity (’0. In general,

 

to in (1.15) is ﬁnite if b(u) is bounded from above and a(u) is bounded from zero. On
the other hand, for any Borel B in U with 0 g a(u) < 1, A(B) ———+ 00 as e ——> 0, so

that N0(SM(B)) is almost surely unbounded. D

The above proposition together with the marked Poisson process representation
of GG—measures imply that the number of terms in (1.14) is ﬁnite on a compact B,
on which a(u) < 0, with a Poisson intensity measure A(du)b(u)“(“)/(—a(u)) on U
and a Gamma marking distribution of shape —a( u) and inverse scale b(u). When
0 S a(u) < 1 the discrete measure M has dense support on the support of A for any
Borel set B. An approximate Poisson process representation can be simulated by the
ILM algorithm. The choice of a ﬁnite number J of atoms for the algorithm is discussed
by Wolpert and Ickstadt (1998a) and by Brix (1999). The truncation bound given by

13

Brix (1999) could be applied in most applications. The ILM algorithm implements
the sampling of atoms on the partially ordered sampling space, i.e., ®Jf=l(uj, hj) such
that TuB(hj)S 7213+] (hj+1) for every j < J. When simulating atoms of the
GG—measure with index 0 S a(u) < 1, J is used in tabulating 7‘1, which has no

closed form.

With the ILM algorithm, the joint distribution of the locations and marks of

atoms is provided by the following theorem. Deﬁne

Ea+1(h) = [hoe x‘a-le"xdx (1.16)
from which E1( h) is the exponential integral function (p220, Abramowitz and Stegun,
1964)

Theorem 1.3.1. The joint distribution of locations and jump sizes {(uj, hj)}j SJ C
U“ in the ILM algorithm has a density with respect to the product measure H]. SJ H(duJ-)dhj

proportional to

A(u1)b(uJ)a(u") Ea(uA)+l(b(uJ)hJ)

 

 

exp —
F (1 — a(UJD
J.)a(us)
-a(uB)-1 —b(u )h
x h,- e B B

311M747) F( 1 — a (a(u)-D J

on the set of {(11}, hj)}j SJ for which the measure
A u b u a(u)

7.0:) == F[—,)_—(;,%53Earui+l<b<u)h)

satisﬁes TuB(hj) S TUB+1(hJ+1) for everyj < J.

Proof. From the deﬁnition (1.16), for b > 0, by a change of variable,
Ea+1(bh) = b““/ x—“’le“bxdx
h

14

It follows from the deﬁnition in (1.12),

_ A(U) so —au-1—u:c
“(kl—ml. x " W
_ Mayday”)

which is decreasing in h and its inverse function r;1(t) exists for each ﬁxed u. Thus,

the greatest lower bound for inf h ru(h) = t is attained, and we can deﬁne

._ -1 _ _1_ -1 ————-—F(1_a(u))t
h .— Tu (t) — b(U) Ea(u)+l (A(u)b(u)°(“’)

Consider a sequence T1,T2,--- of independent exponential random variables, i.e.,
the waiting times for the successive jumps in the standard Poisson process. Then

H j = T531 (T1 + ° ° ° + TS) for each 3' _<_ J. Thus the theorem follows by a

change of variable. C]

For a special case of the above theorem, where the index function of the CG-

measure constant zero, see Corollary 2 to Theorem 2 in Wolpert and Ickstadt (1998a).

1.3.2 Intensity Models and Modeling Objectives

Generally, the intensity functions in (1.3) used for modeling point patterns are
complex mechanisms with unknown parameter functions. The kernel and parameter
functions are parameterized with problem speciﬁc unknown quantities 0. We suppress

the display of 0 in k9(s, u), a9(u), A9(du) and b9 (u) for the rest of this chapter.

Covariates and Intensity Models

For some applications, covariates a(s) are available to be included in the analysis,

and the intensity model in (1.3) may be treated as a “baseline” intensity Ao(s) of the

15

point patterns. The covariate factor 6(5) can be modulated by the Poisson intensity
locally in locations, given by 6(5) = :rT(s)6 with the covariate coefﬁcients ,8. We
adopt the multiplicative incorporation of collateral information into the analysis of
intensities for a different class of spatial Poisson models. In this case, the intensity

function A(s) for point patterns may be factored into two factors, i.e.,
NS) = €XP(8(S))A0(8)- (1-17)

The multiplicative model with the exponential link has the merits of simplicity and
positivity insured when further parameterizing the model (Diggle, 1990; Ickstadt and
Wolpert, 1996). Often in applications, the covariates 95(5) are determined only in
lattice cells of the state space S, and one takes :I:(s) = :z:(s,-) for all 5 in the lattice
cell with center 5,. In this case, on the cell with center 5,, x(5) = 2:36. For the
intensity model (1.17), the covariate factor 6(5) can be incorporated into the kernel
k(s, u) = exp(6(5))k0(s, u) with [to as the baseline kernel corresponding to the baseline

intensity A0(s).

The mean intensity surface in (1.3) with a GG-measure A/I(du) is given by

EA(5) =/wk(s,u)b(u)°(“)_lA(du)

and the independence property of GG-measures leads to the covariance function given

by
71((5, 5') = /Uk(s,u)k(s’,u)(1 —- a(u))b(u)°(“)“2A(du) — EA(5)EA(5’). (1.18)

This can also be derived by the Campbell theorem, see Brix (1999).

Inference on Intensity Models

With the observed point patterns {Sn}, there is a general interest in the spatial

correlation, which is speciﬁed through a parameterized kernel, say k(s, u). Another

16

aspect of spatial dependence is the locality, which is explained by the distribution of
a GG—measure M9(du) and covariate coefﬁcients 6. The inference on the intensity
model is carried out through estimation of the involved parameter, estimation of the
intensity at the observed locations, and prediction of the intensity at unobserved lo-
cations. The stochastic view to the inference problem is to treat the point patterns
{Sn} as part of one realization of the spatial Poisson process N (d5) with the given
intensity measure A(ds), while the unobserved part (of the same realization) is pre-
dicted using the point patterns data, location-dependent covariates and correlation
structure speciﬁed by the model. This has proved to be very useful as a surrogate for

the analyst’s lack of knowledge about the complex system generating {Sn}.

When modeling point patterns, uncertainty analysis is often of main interest.
One concern is actually the spatial homogeneity, which involves invariant properties
such as location-transformation, rotation, scaling, and location-dependent covariates.
Another concern is spatial trends and spatial patterns, such as clustering and aggre-
gation. The third concern lies with the spatial correlation. Traditionally, stationary
random ﬁelds together with other special conditions are assumed; inference in spa-
tial statistics is based on a combination of ad hoc nonparametric techniques, such
as distance-based methods and second-order methods, kernel smoothing (Silverman,
1986), maximum-likelihood parameter estimates and associated approximations and
simulation-based estimation and testing. With constant index and scale parameters,
a GG-measure is stationary if and only if its control measure A(du) is proportional

to the Lebesgue measure on (UN); see Brix (1999).

The inference on inhomogeneous Cox GG models is difﬁcult in part due to its nu-
merical implementation.- The emergence of MCMC algorithms (Gelfand and Smith,
1990a; Tierney, 1994; Gilks, Richardon and Spiegelhalter, 1996) has made it prac-

ticable, and increasingly common, to apply Bayesian statistical methods to spatial

17

analyses, where the full conditional distributions required for the Gibbs sampling ap—
proach are available. The posterior samples are obtained from the stabilized Markov
chain in a MCMC scheme. One realization obtained by this simulated chain can be re-
garded as a dependent, approximate sample from the posterior, and various posterior
inferences can be drawn from empirical data analysis of this sample. The estimates
of parameters and prediction of A(s) are obtained by the posterior sample means.
The visual appreciation can be obtained from the posterior empirical distributions

(histograms), which will be discussed in Chapter 2.

The implementation of MCMC schemes sometimes turns out to be difﬁcult be-
cause of the difﬁculty in sampling from full conditional distributions. An important
MCMC technique is data augmentation (Tanner and Wong, 1987). When the di~
rect sampling from the conditional distributions for certain parameters is difﬁcult,
the data augmentation may be needed and will be demonstrated later in context.
Another useful technique is the MetrOpolis-Hastings (MH) algorithm (see Smith and
Roberts, 1993), which makes it possible to sample from complicated full conditional
distributions. The MH scheme is not only straightforward to implement, but is also a
general purpose method. This makes it suitable for handling any form of conditional

distribution under general assumptions.

18

Chapter 2. Joint Inference on Counts and Binary

Random Fields

In this chapter, a simulation study is performed for the joint model of count
data and binary clipping data. In Section 2.1, we continue the work on the two-stage
Cox M model with a Gamma measure (Wolpert et al., 1998a) and with a particular
data structure (count and indicator variables). The data structure is more fully
described in Section 2.2; it consists of count data on a subset of the cells of a regular
lattice and indicator variables for counts in excess of a cutoff point on the other cells.
In Section 2.3, we develop the full conditional distributions needed for an MCMC
implementation. Section 2.4 gives the results of the MCMC implementation based

on speciﬁc data values.

2.1 Model Speciﬁcation and Representation

Consider a Poisson process N (d5) on S C R2 with the intensity deﬁned as in (1.3),

in which M (du) is a GG(0, A(du), b(u)) measure with parameter functions given by
A(du) = Adu, b(u) = exp(bTu/62)
where 0 > O and b E R2. Take the kernel to be
k9(s u) = —1—exp —-i[|s -— u||2 (2.1)
’ r02 02 ’

which is a two-dimensional stationary Gaussian kernel, normalized by the condition,

[W k = 1, supporting a range of possible spatial correlations. With the above choices

19

the mean intensity function is given by

1
EA(s) = /U 7763 exp (—-;—2|Is — 11H?) b—(1u—)A(du)

1 1 b 2
— Umexp(—-6—2 ([[s+§—u[|))du
2
= cA exp (512' (st + H—Zl))

where c = c(0, U). It is seen that the mean function has the directional trend. For
example, for b = (b1, 0)’, the mean function has an “eastness” trend. The covariance

function of the intensity random ﬁeld A(s) is given by

 

'7'(s,s’) = [J k(s‘u)k(sl’u)A(du)—EA(s)EA(s’)

 

WU)
/\ A 1 I I 9
1 1 s+s’+b 2 ,
X/ﬁexp (_ﬁllu—TH) du—EA(5)EA(8) (2-2)
I)

cA 1 , ,
= Wexp (_W ([[s —— s [I2 — 2bT(s + s ) — [|b[|2))

1 2
"‘C2A28Xp (a? (bT(8+S’)+ [lb2ll ))

where the covariance function consists of a stationary part and directional part.

Point Pattern Voronoi Tessellations

In spatial data analysis, Voronoi tessellation provides a way of translating the
state space into spatial attributes, 0, for instance, which can be in turn characterized
through parameter functions or unknown quantities for modeling inference. Voronoi
tessellation is a method of partitioning a (state) space into convex polygonal re-
gions; divisions of the plane are most often discussed. The Voronoi cell of a spatial

object is the set of all points in the state space closer to the object than to any other

20

 

I I I I I I I I I I I
10” + ‘I
9" + ..,
8' + +
7' +Uj

+
6" + C. "
+ l

 

+ +

 

 

1 1 _ 1 1 1 1 1 1 1 1

J
0 1 2 3 4 5 6 7 8 9 10

 

Figure 2.1: Homogeneous Poisson Voronoi tessellation

object in the set. The set of Voronoi cells for a set of spatial objects, called Voronoi
diagram (also known as Dirichlet or Thiessen polygons), provides the partition of a
set of spatial objects according to its spatial structure. For a theoretical treatment
of Voronoi tessellations and mathematical proofs concerning their properties refer to
Moller (1994) and Okabe et a1. (2000). The latter follows an application oriented
approach. In a two-dimensional Euclidean space, Voronoi cells are convex polygons,
whose edges are half-way between two points in U and perpendicular to the line seg—
ment joining them. Green and Sibson (1978) provide an efﬁcient computer algorithm
for constructing a Voronoi tessellation from a given set of points. Okabe et al. (2000)

compare the Green-Sibson to alternative construction methods.

21

The Voronoi tessellation generated from a random point pattern is a random par-
tition of the state space (Figure 2.1). Miles (1972) gives an excellent survey for the
case where a Poisson process generates the Voronoi tessellation. For a homogeneous
planar Poisson-Voronoi tessellation of rate A, the expected number of sides (or edges)
in an arbitrarily-chosen cell (e.g., the cell containing the origin) is 6, and its expected
area is 1/A. The connection between the intensity and tessellation cell attributes
thus provides a way of representing a point pattern. This can be extended to inho-
mogeneous point processes, in which the intensity measure is characterized through
the parameter functions. Probably the most useful application of the Voronoi model
lies in compilation of intensity surface maps from point data. As the area of cells
estimates the inverse value of the local intensity, assigning 1 / A to the point location
as local intensities, the map can be easily compiled with the aid of any kind of (either

deterministic or stochastic) interpolation method.

We will use the lattice or gridding tessellation of point patterns, where the corre-
sponding spatial models are then speciﬁed based on count data instead of point pat-
tern data directly. We shall take a state space S of the rectangular shape with length
11w and width 12w, where w is the length of the lattice cell. The number I = 11 x 12 is
the total number of lattice cells. A cell is referred to as Ci = 0 (2'1”). The center
of a lattice cell C,- is then ((21, 7:2) — I /2)w used for necessary approximation,
and the area of a lattice cell is wz. Each sampling window A,- can be considered as
the union of 0,5 with generic location at 5,, and A = Uz' Ai. Similarly, a reference
space U is deﬁned in an analogous way indexed by J with S C U for correcting the
edge effect and with generic locations uj. In general the sampling windows may be
any shape or size. We shall take the sampling windows as the union of neighboring

square lattice cells (or quadrants) for convenience.

22

Data Representation and Approximation

Tessellation is an effective way of approximating the sample paths of a stochas-
tic process on the state Space. However, the accuracy of the approximation highly
depends on the smoothness of the sample paths. The smoothness of parametric func-
tions is problem speciﬁc. For deterministic kernels, 1:, Marcus and Rosinski (2000)
derived explicit and easily veriﬁable sufﬁcient conditions for the almost sure continu-
ity of the stochastic integrals (1.3) directed by the GG-measures. Tessellation is used
to provide data driven approximation of parameter functions or unknown quantities
in the spatial models. For smooth parameter functions, we may integrate parameter
functions over tessellation cells or evaluate parameter functions at any point in the

tessellation cell.

In this study, we follow the generic (or cell center) location approximation for
the smoothing parameter functions b(u) and random intensity function A(s). Based
on the above gridding tessellation scheme, we have AJ- = A1172, b,- = exp(uTb/0;u E
(3,) z exp(u}‘b/0), kfj z k9(s,-, 11,-). With 9 = (0,1,,b,) ~ «(9), a lattice model

is given by

[Mi-10] ~ II 00(0. 13,11?)
1'

Ai = Z kzngdei (2.3)

.7

WW, {Mjll N UPON/M)-

With known counts {Ni}, Ickstadt et a1. (1996) Show that the features (including

mean vectors and covariance matrices) of the joint distribution of random variables

23

{A’Ij} and {N,-} conditional on 0 E O, can be studied from their Laplace exponents,

i.e., the negative logarithms of their Laplace transforms,

-10s Ewe-WW) = Z 10s(1 + 12/11.».-
j

— log Eg(€_ ZYf‘NYIMj) = :(1 — e-fB)k,-J-Mj

773'

— log E9(e_ZYf‘N‘) = Zlog 1+ :(1— e’fB)k,-j/bj
j 1'

XAJ'

(2.4)

2.2 Counts and Binary Data

In the rest of this chapter, we illustrate inference for partially censored data on
a lattice. The data consists of counts N = {N,-} for each lattice cell within sampling

windows, and binary variables Z = {2,}, where
Z,‘ = I{]Viz 2 71.0}.

Here no is the level of censoring and the unobserved counts N: are assumed to be on

the same path as the observed counts N,.

In simulation of Section 2.3, the exact locations of the events are known. The

counts in lattice cells are represented by independent Poisson random variables N + ——
{Ni} U {N 12,} With N 25 denoting the counts in a binary observation cell Oil.

The Poisson means are a linear combination (approximation) in (2.3) of unobserved

24

independent Gamma-distributed random measure M = {M -}, €J with uncertain non-
negative coefﬁcients Kg — -—z-6]{k -} Uncertainty about the parameters of the
Gamma distribution of the measures M]- ~ GG(O /\j,b'), and about the values or
collateral dependence of the coefﬁcients kg - is modeled through dependence of kg,
/\j = A? and Dj = b? on an unknown parameter 6 E O, regarded as a random

variable taking values in O C R” for some p, with intensity 7r(6) with respect to

Lebesgue measure d6. Dependence on 6 is indicated by sub- or super-scripts.

Given the random measure M, the N+ are conditionally independent Poisson
random variables with means A = {A,}. It follows from (2.4) that unconditionally the
{N,} are dependent, distributed as sums of independent negative-binomial random

variates. Their means and covariances are given by

EQN, = 2 199,1,- /b,-
J'

C)!

(2: )
_ 2 : i 6 6
COV(N,‘, Nil) — (617+ kij/bj)(ki’j/bj))‘
j
where (522-, is the Kronecker symbol, 1 if i = i’ and 0 otherwise. From (2. 5) we see
that N, and Ni’ are uncorrelated (and in fact independent) only if It)? 110%: 0

for all jand each i ¢ 1". That is, k, and [(330 are orthogonal 1n the sequence space

emit/((1.9)?)

25

2.3 Joint Inference on Counts and Binary Random

Fields

2.3.1 Graphical Representation

The speciﬁcation of spatial models may become complicated when introduc—
ing many parameters, especially in the presence of data augmentation and collateral
stochastic processes. Gilks et a1. (1996) develop a graphical representation, called
directed acyclic graph (DAG), to describe the conditional independence for model
components. The DAG representation of the model under consideration in this Chap-
ter is shown in Figure 2.2. The statement “6 inﬂuences M ” describes 6 as being a
parent of M, as shown by the arrow that goes from 6 to M. The DAG makes state-
ments about a sequence of conditional distributions. Each node is independent of all
its other ancestors given its parents. Types of nodes can be differentiated (Gilks et al.,
1996). We shall use squares about N and Z, and circles about 6 and M, for observed
and unknown quantities, respectively. In addition, the dotted circles about N‘ and
V, denote augmented data nodes. The model dependence structure and assumptions
can be readily derived from the directed acyclic graph. The implementation of a
MCMC scheme will accordingly visit each node in a DAG. We leave out the implicit
functional relationship among unknown quantities for simplicity in a DAG scheme,
although they are important in deriving full conditional distributions. Several DAGs

related intensity models of spatial point processes are given in the next chapter.

The implementation of MCMC inference below requires samples from the full
conditional distribution of the random measure M, given 6 and the observed counts

N = n, a mixture distribution from which direct sampling is cumbersome. Therefore,

26

"I

 

 

 

Po

Figure 2.2: DAG of counts and binary random ﬁelds

1', V

we introduce latent “augmentation” variables V = {l/zjli, j E I x J, which can
be sampled easily and which lead to a mixture-model representation of model (2.3)
and to the conjugate (Gamma) full conditional distribution for M. More speciﬁcally,
conditional on M = m, N = n and Z = z, where Z, is the indicator variable, no
is assumed to be a constant. Also the augmented counts {NZ-z} corresponding to
binaries variables {2,} are conditionally independent Poisson random variables with
AYB
means {A,}. Set Aij — hum]; A1, = Zj Aij, pzj — X17 and let {Vii}
have independent Multinomial distributions MN (TL: , p,). We might regard the

random variates l/z'j as the portions of counts 11, attributable to the jth measure,

j E J in some applications.

2.3.2 MCMC Scheme

We now study the (analytically intractable) posterior distribution of uncertain
quantities 6 and M by simulating steps from an ergodic Markov chain with the pos-
terior as its stationary distribution. The joint distribution of all uncertain quantities

are given by the following theorem

Theorem 2.3.1. The joint distribution of 6, M, V, N, N2, and Z has a density

function with respect to the product of counting measure for discrete and Lebesgue

27

measure for continuous variables, given by

[0,M, V, N, N3, Z] = 71(9)

 

mAB-F’UQB- leB
X H (Z:+1 — Z___z‘-) J
i pt 1 — pi) j EH] F(AJ)
16?.“
2J
x H J exp —-(ZbJ + ng)m
2] .7

where pf = IP(NJ-z 2 77.0), 1123' = Zi ’UiJ‘ and kQJ' = Zi kiJ'.

Proof. With the adopted notations, and from the directed graph presentation, we

have the following factorization under the considered model,

[6,M,V,N,N’,Z] =[6][M|6][V[6,M,N,N’,Z,no][N|6,M]
(Nae, M, z, noiizlo, M, no]
Note that [V|6,M,N,N“,Z] = [V|6,M,N,N"], in which Z is redundant, is the
product of multinomial distributions. [N [6, M] and [Nz|6, M] are products of Poisson
distributions. [Z|6, M] can be evaluated by computing the tail probabilities of a

Poisson distribution. Direct calculation will lead to the result. [:1

In the above theorem, we consider no to be a ﬁxed value. In practice, this may

be estimated by the available count data and collateral information.

Corollary 1. The conditional distribution of the augmented data vector (V, N z) given

the observed counts 11 and z, the signal measure M, and the parameter 6 is the

28

product of multinomials with parameters 71, and p,, independent fori E I and [Ni =

nz|6,M,z], i.e.,

[V = v,Nz = nz

nE.
HIV—[NW ,Pi) (1722 + q) 711! €XP(—‘/\z:)

n,z,m,6] =

 

 

Corollary 2. The conditional joint distribution ofM given N, Z, V, and 6 is again
gamma component, independent forj E J with parameters AJ' + UQj and bj + kgj,

i.e.,

[M[0, ’U, n, Z] '—" H GO“), AJ' + v2j, bj + kgj)
j

Corollary 3. The conditional density of6 given {n,}, {2,}, {2117]}, and {mJ} is

 

 

given by
[6|M,V, N] oc 71(0)
AB+v2B-1 /\B W13
11 ’"J' ‘5' 11 ’“‘
. . .l
jEJ F(AJ) i], v,J.
exp — E (DJ + ng)mJ-
j
Proof. Immediate from Theorem 2.3.1 and the deﬁnition of {th’ J'. C]

The above corollaries lead directly to the following MCMC scheme. Given a

prior density 7r(6) on 9, a parametric kernel {1626]}, a transition probability kernel

29

Q(6,d6’) = q(6,6’)d6’, parameters {AJ} and {bJ} for the Gamma distributions of the
random measures { M J}, and initial points 60 E O, and nonnegative integers {110‘},

we can generate successive points starting at t = 0 as follows:

Step I: Gibbs step to update the random measure. Given {n,}, {2,}, {12,9}. and
6’,
1+1 _ 03 1 1+1 63
. SCI, Aj — Aj + Uzj and bj :b’ + [112]}
. Mt+1 _ t+1 . G At+1 bt+1 .
o generate 1ndependent - — mj w1th C(O, j , j )
1+1 63 1+1 1+1 1+1
0 set Aijzijl’nk ’A1ZZjEJJAi"
t+1J t-I-l t+1
a, =A: / A

Step II: Gibbs step to update the augmentation points. Given {n,}, {2,} {mt-,H}

and 6‘,

o generate independent random variates A'f with probability

(- A ) (2-6)

       

nE.
Zi+1—ZinAz-\
pf 1- p;

for each iw here the indicator random variable Z, is observed;

0 generate { ICE-+1} j N M N (71:, pg“) independent for i E I .

Step III. Metropolis-Hastings step to update the parameter 6. Given {n, }, {UtJ-H ,
t-I-l
{mj }, and at

o generate a new candidate 6’ ~ Q(6’, 116’);

30

o calculate the acceptance probability

,3“
”(904(9’, 0’) H ’16] ‘8

 

 

P’ =
116111626) .,. 1.1.3
X (mt+1)/\B+U§El— l(b§I)/\l3 F()\§l 9)
j (mt+1)A[3S+U2Bl-l(b§S)AlBS[1(Agl)

X exp — 2(ng — bgs + keg—13m ”1

j
(2.7)
o generate 6’+1 = 6’ with probability (1 /\ P’) and = 6‘ otherwise.

Step IV: set t «no t + 1 and return to Step I.

2.4 Simulation and Results

2.4.1 Simulation Setup

Consider the model (2.3) with the state space, [0.2, l] x [0.2, 1], and the reference
state space, [0, 1.2] x [0, 1.2]. The width of the lattice cell (quadrant) is set to be 0.2.
The control measure is proportional to Lebesgue measure with the proportion coeffi-
cient A = 1000, the parameter 6 = 0.4 for modeling the circular correlation structure,
and the spatial trend parameters bT= (b1,b2)= (1,0) for modeling‘ ‘”eastness in this
case. One run with true parameters is performed to get the raw count data for all

16 lattice cells. We set no = 10 to be equal to the average of raw counts. Six cells

31

25\_...
20\

 

   
 

0.4
1 0.2 0.3

Figure 2.3: Empirical counts plot

are randomly selected, and their counts are compared with no to get the indicator
variables, Z, = I {N, 2 no}. Hence, the simulated data consist of six 0 — 1 indicator
values and raw counts for the rest of the 10 cells. The outcome of the simulated
data is shown in Figure 2.3, in which the true counts for indicators are shown in

parentheses.

The sensibility of model parameters in modeling this data is studied by compar-
ing the three models shown in Table 2.1. Model I has all three parameters unspeciﬁed
and to be estimated. It is most general in the sense of catching all the directional
covariance structure that is implied in (2.2). Model 11 is a wrong model in this case in

that it takes bl = 0 even though the data were generated from a model with bl = 1,

32

 

Model Parameters Estimates
True Model (0.4, 1,0)
Model I (6, b1, b2) (0.93, 0.89, —1.00)
Model II (6, 0,b2) (161,—, -0.91)
Model III (6, b1, 0) (1.77, -3.10,—)

 

 

 

Table 2.1: True model and other models

that is, the “eastness” is expected. Finally, Model III is the correct model in that
b2 = 0 is assumed and is expected to give better estimates and prediction. The values
in the ﬁrst column of the table is the posterior sample mean corresponding to each

model.

For each model we have chosen independent standard normal prior distributions
for each parameters 6, b1 and b2 and independent GG(0, AJ, bJ) distributions for MJ-
with AJ- E Aw2 constant in all j and b, as deﬁned before. To perform the Metropolis-
Hastings step, the transition distributions for b1 and b2 are also normal with standard
deviations 1,01 and 1,122. For the parameter 6, ﬁrst a transformation is imposed such
that E = log 6, then 6; H is generated from the proposal density qE( - [5,) = N (5,, v3),

and the value is accepted proportional to

WWW) 9

1 1 ,
(19(9l9’) = 5 exp (—2—w32— (log(6 /6) + log(6/6 ))).

 

The tuning scale parameters 1,11, control the rate of acceptance P’ in (2.7) of the
Metropolis-Hastings step, and a judicious selection is needed for efﬁciency of the
algorithm. If w,’s are too small, the rate of acceptance P’ will be high, but the steps
such as {2+1 — 6t will in general be small. Such a chain will move very slowly
through the support of the posterior distribution. On the other hand, if 11155 are too
large, the proposal distribution will generate large steps, often from the body to the
tails of the posterior distribution, so the rate of acceptance will be small. Such a

chain will frequently not move, producing slow mixing again. In both cases it would

33

take a large number of iterations for the chain to move through the support of the

posterior distribution, making the algorithm inefﬁcient.

A rule-of-thumb used by many practitioners is to select (by experimentation)
values of 11135 giving an empirical rate of acceptance of about 0.4 — 0.5. This rule
has some theoretical support for the case when the target and proposal distributions
are both normal (Gelman et al., 1996). In this study, for model I, we use 112, = 1 for
all i = 1,2,3, while for model II and III, 11), = 2, and the empirical acceptance rate
was about 0.40. Implementation for all models was based on 100, 000 MCMC steps,
and the posterior sampling was started at 90,000 and successive samples were taken
at every other 100 runs. The Monte Carlo iteration was realized by a C++ program.
The judgement as to how close an iterative algorithm is to convergence after a ﬁnite

number of iterations is difﬁcult. Some discussion is found in Kass et. a1. (1997).

2.4.2 Posterior Analysis

One of our simulation goals is to predict intensities {A,} for those lattice cells
without known count data. The predictions of the intensities are obtained by the
corresponding posterior sample means of size 100. The predictions for Model I are
shown in Figure 2.4. The smooth prediction surface for intensities is curved instead
of showing a pure “eastness” trend. This is because of the presence of the parameter
b2, which measures the “northness” trend. In fact, Model I can explain all direc-
tional covariance structure if present. However, the other models did not improve
the prediction surfaces in the sense of catching the covariance structure during the

experimentation.

The following ﬁgures are plotted using the function plotdensO from MatLab’s

statistics toolbox. Figures 2.5, 2.6 and 2.7 show the prior (solid curves), true value

34

 

 

 

1 0.2 0.3

Figure 2.4: Posterior means for intensities

35

 

over lattice cells

 

0.4

0.35 ~ true §=Iog(9) ‘

   

0.3 posterior density of E

0.25

0.2

0.15

0.1 prior density of E

  

 

cie—————'~,————————————-———-
O -'

 

 

 

0.05- _ - l I
01 f' . o o O ' o
...,-.»1..:-=v'-.1::-.-;=.;.sz..-..-;-1'-°'~<..:.:~'::~:= =- --:- .- ~ ’. 1
.005 1 1 A 1 1 ‘
-2 —1 0 1 2 3 4

Figure 2.5: Posterior density for log-6

(dashed vertical lines) and the estimated posterior density function (dotted curves)
for 5 = log 6, b1 and b2, respectively. The posterior means are listed in Table 2.1.
The estimated densities for 6 and b1 suggest that the systematic trends should be
included in the analysis. The estimated density for b2 tends to be ﬂat and suggests a

weak “northness” trend that is consistent with the true model.

While the above graphical outputs are very suggestive of which models ﬁt well
and which do not, a more objective and more quantitative approach, using so called
Bayes factors, can be applied (Ickstadt, et al., 1996). This was not done in our

example.

36

0.4

0.35

0.3

0.25

0.2

0.15

0.1

0.05

-0.05

 

 

 

 

 

37

Figure 2.6: Posterior density for b1

I I 1 r I 1 j
I
I
- I
I
I
- I
I
prior density :
I
I
_ I

I true b1
I
I
” I
I
empirical posterior density I
*- I
\ . |
I
_ » ' ' I
I
A .’ . ' 81.63%..- . o. 3""‘1 “o'v- . -" . °
1 l l l I l I
-15 -10 -5 0 5 10 15

 

0.4

0.35

0.3

0.25

0.2

0.15

0.1

0.05

-0.05

 

I

empirical posterior density

prior density

\

 

 

 

 

 

I
I
I
I
l
I
I
I
I
I
I
I
: true b2
I
I
I
I
l
I
I
I
I
I

 

 

-——-v fivvvvvrw'w'vv" _vvv'

l l l l

v-' va—W

 

-20 -10 0 10

Figure 2.7: Posterior density for b2

38

20

 

 

Chapter 3. Models Based on Log Gaussian

Random Fields

In this chapter we continue to focus on developing models based on inhomoge-
neous point patterns. For highly clustered point patterns, it may be more suitable to
perform the geostatistical analysis after an appropriate transformation of the count
data. The imposed transformation is convenient for factoring the intensity models,
and is required to achieve the Gaussian regularity for geostatistical models. This
commonly happens to be the log-transformation. More general transformation meth-
ods are discussed by Martinez (1997). Geostatistical models for log-counts tend to
explain large scale variation rather than small scale variation in point patterns (Heisel

et al., 1996).
In the geostatistical modeling setting, a log intensity model is suggested by the
intensity model (1.17) and is given by

77(8) = 6(8) + {(3)

NS) = exp(.3(8))Ao(S) = exp(n(8))

(3.1)

where {(5) is a Gaussian random ﬁeld on S. The direct speciﬁcation of covariance
function of 5(5) is convenient under stationarity. For non-stationary covariance func-
tion of {(5), we shall instead consider a second order random measure W with control

measure A on ll. Its subordinated random ﬁeld 6 (5) is given by

5(5) 2: /Uk(s,u)W(du), s 6 S (3.2)

then {(5) is diffuse and is well deﬁned on S; see page 10 for the concept of diffuse
measure. Clearly, 6 is a centered second order random ﬁeld, i.e., EE (5) = 0, and its

covariance structure can be speciﬁed by choosing appropriate kernels and parameter

39

functions deﬁning random measures. It is more convenient to work with the kernels
k(5, -) rather than directly with the covariance function 7(5, 5’), in which it might
be difﬁcult to ensure symmetry and positive deﬁniteness for all 5 and 5’. Under the
condition (1.4), the speciﬁcation of 5 in (3.2) is valid with arbitrary choice of kernels
and the derived the covariance function 7(5, 5’) of 5 is positive deﬁnite (Higdon et
al., 1999). Under the suggested transformation in (3.1), the relation between Ao(5)
and 5(5) is deﬁned through the exponential link function, which is chosen so that
the intensity is nonnegative. The covariate factor 6(5) is an additive term in the
transformed model, and in this setting, 5(5) may also be interpreted as a covariate

process to point patterns.

3.1 Models for Log Counts

3.1.1 Model Speciﬁcation

Consider the log-count data, denoted by Y = (Y(51), - - - ,Y(s,,))’ at the generic

locations 51, - -- ,5,,. In this section we discuss the model

Y, = sci-r6 + 5, + e, (3.3)

where 6i = {(80 is from an underlying spatial stochastic process and e, = 6(5,) is
from a spatial error process. In spatial data analysis, the spatial variation includes two
sources, i.e., the spatial uncertainty and measurement error. Measuring heterogeneity
is sometimes critical to the success of an error modeling system, especially in the case
where the measurements have underlying spatial patterns. Spatial processes 5(5) and

5(5) are assumed to be independent, and the Y,’s are thus conditionally independent

given 5(5).

40

In the sequel, 61,- -- ,5" are assumed to be i.i.d normal random variables with
mean zero and variance 02, and 5(5) is a Gaussian random ﬁeld with zero mean. i.e.,
E5 (5) = 0, and with covariance function 79(5, 5’) = cov(5(5),5(5')) depending 011 a
parameter 6 E 9. We let ’75 := (7(51,s), - ~- ,y(s,,,s))T and let F = {7(5,, 5J)} denote
the n x n covariance matrix at generic locations. Although not demanded, it is often

2 o

assumed that 5(5) has a constant variance 7' , 1.e.,

79(5, 5’) = 727545, 5’) (3.4)

where y" is the correlation function speciﬁed by 6" with 6 = (r, 6‘). The mean for Y,
is

yo.) = EY(s.-) = E(E(Y(81)l€)) = so.) = .235 (3.5)
where x, = :17(s,) is the p—dimensional covariate vector. The covariance matrix for Y
is given by
02 + y(5,, sJ), for i = j;
“(31131) = i

”7(81',Sj), IOTIS'éJ

 

K

and is conveniently written as
cov(Y) = 02(1 -+- 1702) = 022

The additive measurement error 02 is the variance of the conditional random ﬁeld Y,
analogous to a nugget eﬂ'ect in the spatial statistics literature. Also let 0'3 1:
(YE/(31, S), ‘ ' ' 731(an 8)), for any 5 E S and denote the n x p covariate matrix
by X.

41

3.1.2 MCMC Inference

In the inhomogeneous spatial models including the non-Gaussian cases, the main
difﬁculty in performing likelihood-based inference and prediction is the shortage of
multivariate families of distributions as ﬂexible and operationally convenient as the
Gaussian distribution. In most cases, those densities are analytically intractable.
The most widely implemented methodology for coping with non-Gaussian problems
is trans-Gaussian Kriging and lognormal Kriging (Cressie, 1993), which consist of

applying standard Gaussian methods after a marginal non-linear transformation.

When applying log Gaussian models (3.3) for point pattern data analysis, the
error term is often omitted (Diggle and Tawn, 1998). In this section we introduce
the Bayesian analysis of the log Gaussian model with the error term present. Let
g : (6(31)1° ' ° 760970), and let 5" : (€*(310)1 ' ° ' 16*(3m0)), for
the corresponding values of 5 (5) at locations s,o for which predictions are required.
We are then working on the extended model [5,6, 6,Y,0',5"] instead of the model
[5, 6, 6,Y, a], where 6 denotes the parameter vector in the covariance structure F9
of 5 (5), in particular, under constant variance in (3.4), 6 includes the variance 72,
and any further parameters in the speciﬁcation of the correlation structure of 5,
and 6 consists of all the regression parameters. Our objective is to use MCMC
methods to estimate the model parameters, 6, 6 and a, and to generate samples
from the conditional distribution of (5,5‘) given Y under the stated assumptions.
This allows us to obtain predictions for any functional L of interest associated with
this conditional distribution, while making proper allowance for uncertainty in the
parameter estimates. Under a standard MCMC scheme we need to generate samples

from the posterior distribution of [(6, 5, 6, o)|Y] for inference and from the posterior

of [5‘|(6,5,6,Y,o)] for prediction.

42

 

 

(”DAV—CD

 

O——-O'—@

Figure 3.1: DAG of log Gaussian models

The implementation of the MCMC scheme requires sampling from the condi-
tional distributions: [6|5,6,Y,o], [6[6,5,Y,o], and [5,|6,5_,,6,Y,0], where 5_,- =
{(5J); j aé i}. A graphical representation of the dependence structure of the model is

shown in Figure 3.1, the model can be factorized into

[9.5.6,YJIE‘I = IGII€|9llﬁlIUIIY|9,£,ﬁ.Ullﬁ'l9.£.61Y,0] (3.6)

The ﬁrst objective is inference about model parameters. We can simplify this struc-
ture by dropping the node 5'. It only requires visiting the nodes 6, 5, 6 and a in
Figure 3.1. Given 5, it is clear that 6 is conditionally independent of Y, and that 6 is
conditionally independent of 6 and 0. So for inference, a single cycle of the MCMC
algorithm involves ﬁrst updating [6|5], then [5[6,6,Y, a], [6|5,Y, a] and [0|5,6,Y].

Under the model (3.3), [Y[6,5,6,o] is the product of normal densities with

. T .
respective means 77, = 113, ,8 + 5, and variance 02

1 says
[309,660] = [Ylélﬂ 0] = 99645.60) (3-7)
and
M5610] = I9I€l = [69169] = <p(€|9)7r(9) (3-8)
and

[Silg-inaa Y) 0'] : [Y[€,ﬂ,0[[€i[6,€_i] (39)

43

where <p(£ I6) in (3.8) is a multivariate normal density. As a consequence. [$16.5 _,]

in equation (3.9) has a univariate Gaussian distribution. Finally,

lﬁ|9,€,YaUJ = [ﬂléeYeal 0< lYléaﬁyallBl (3.10)

and
{0l9,£,Y,ﬁl = [0l£.Y,ﬁ]0< lYléﬁﬂliUl (3-11)

In practice, any prior distributions [6], [6] and a will make MCMC work.

The MCMC algorithm consists of the following steps. Initial values need to be
chosen for 6, 6 and £ to start the algorithm. Arbitrary starting values for 6,6 , o and
5 may be chosen in the range speciﬁed by the prior. However, assuming 5 = 0, our
model reduces to a standard generalized linear model and the corresponding estimate
of b from this model may be used as an initial value, 6°. We then set the starting
value, 6?, for each 5,- by equating 17,- to y,- for each i in the link function deﬁned to

give {20 = yz’ — (Efﬁo. 0’0 can be set to the sample standard deviation.

In a MCMC cycle, we ﬁrst update all the components of the parameter vector
6 through a MH step to accept a new state 6' chosen uniformly from the parameter

space speciﬁed by the prior, with acceptance probability

,_ . p(£l9') }
A(6,6)—m1n{p(£|0),1

The second step is to update the signal é. Choose a new value, 5;, for the ith

 

component of £, from the transition probability function q(§,,€£) = [516,6], and

accept £1- with probability

 

.. {2 . 90(3/2'l‘52’ala) }
mas.) mm{w(yilw),1 .

Repeat the sub-steps for all z' = 1, - n ,n. The third step is to update the elements

of the regression parameter 6. Choose a new value 6' from some appropriate density

44

(1(393) = [B'I6]. Accept 6’ with probability

mieazamwcm 1}
w(yl£.ﬂ,a)q(ﬁ,ﬁ’)’ '

 

mm = min{

The fourth step is to update the parameter 0. Choose a new value 0’ from some

appropriate density q(a, 0’) 2: [or’|a]. Accept 0’ with probability

a ,2 . <p(y|£.ﬁ,0’)q(0’,0) }
A( ’0) mm{90(y|£,ﬁ,0)q(a,a’)’l ’

 

The choice of the transition kernel q is problem-speciﬁc. We iterate the above
steps until the chain is judged to have reached its equilibrium distribution. At
this point we draw a random sample from the multivariate Gaussian distribution
of [5'|6,5,6,Y,a], where (6,5,6, 0) are the values generated in previous updat-
ing steps. This step reduces to direct simulation from the Gaussian distribution of
[5"l6, 5], since our model implies that 5" is conditionally independent of Y, 6 and a,

given 5. Speciﬁcally, it follows from the assumptions that

[§*|9,g] ~ Muir-15,1“... — PIP-'11:.)
where I‘ = var(5), I‘. = cov(5,5’) and F... = var(5‘).

We then cycle over all the above steps as many times as necessary to obtain
the required number of realizations from the distribution of [(6,5,6,a)|Y] and of
[516,5]. After convergence we sample every rth realization of the chain. Increasing
the value of 7‘ reduces the serial correlation in the resulting output-sample. The last

step is then only necessary at every rth cycle.

45

3.2 Clipped Intensity Models

This thesis has been motivated in part by a research project concerning the pre-
diction and inference of deoxynivalerol (DON vomitoxin) intensity in wheat scab from
truckloads and wheat ﬁelds; see Schabenberger, Gregoire and Kong (1999). Unfor-
tunately, the developed models were not successfully applied to the DON assessment
due to lack of information about marking distributions. One issue raised in that re-
search is that the measurement of DON in ppm cannot be obtained when the DON
exceeds the threshold for the assay techniques. This leads to be the multinomial map-
ping problem in the geostatistical modeling. This type of spatial data has recently
received large attention in terms of “hotspots” and “excursion set” (Gregoire, Sch-
abenberger and Kong, 2000), where a spatial point process is highly clustered with
certain attributes. Other examples where this sort of situation occurs are numerous:
a geologic formation composed of three rock types, a contaminated geographic region
with a subregion determined by the locations where the contaminant concentration

surpasses a safety level, or below a limit of detection.

Now we consider a binary random ﬁeld { Z (s); s E S}, which assumes the values
0 or 1 for every location 3. Furthermore, we assume that the binary random ﬁeld

Z (5) come from clipping an underlying intensity surface at some given cutoff point

I

c, i.e.,

Z(s) = I{A(s) 2 c’}. (3.12)
Suppose that we have n observations Z = (21,- ~ ,Zn) based on (3.12) and a sin-
gle realization of the random ﬁeld A(s) given by (3.1), where 31, - .. , 3,, are known

sampling locations in S. Based on Z and our prior knowledge on parameters from
the random ﬁeld model, we want to predict the unobserved random vector Z* =

(2(801), ' ° ' , Z (30m)), where Z" comes from the same realization of A(s) as

46

 

the data vector Z, and 801, ° ° ° , 80m are known locations in S. To apply geo-
statistical analysis methods, the equivalent log transform on intensity to (3.12) is

considered, namely,

2(8) = 107(8) 2 C} (3-13)

where 77(3) = 33T(3)6 + 5(3) as deﬁned before. Since the binary data contain no
information about variance, we assume that {5 (3); 3 E S} is an underlying unobserved
random ﬁeld with the form (3.1), with E5 (3) = 0 and the covariance structure as

in (3.4).

Under the normality assumption for 5 (3), the connection between the two ran-
dom ﬁelds (3.12) and (3.13) provides an indirect way to model the family of ﬁnite
dimensional distributions of a binary random ﬁeld, enabling one to perform likelihood-
based inference. It also allows for an ‘exact’ sampling-based predictive inference in
the resulting binary random ﬁeld by exploiting the mathematical tractability of ran-
dom ﬁeld 5 (3). The likelihood of the parameters ( 6, 7', 6“, c), based on the binary data

2, is given by

1
L 0. = o o I —— F. _1/2
(ﬁaTa ,CIZ) Al [4" (27TT2)n/2l 9]

x exp «Elf—207 — XT6)TI‘;‘1(17 — XT6)) dn (3.14)

1 ., 1 ,_
= /, .../I WHEN—U“ exp (_§_+_2 TF9 1£)d£
Where 6 = (€(81)7. ' ' ,€(Sﬂ))” 7] = XTIB + £9 and

(-oo,c), if z, = 0;
A,=A(z,~)= forz'=l,-~,n

(0,00), IfZi=1.

—oo,c—ar,-T , if 2.3-=0;
A2=A'(z,)= ( '8) fori=1,--~,n
[c — :r,T6,oo), if z,- = 1.

47

It is apparent from (3.14) that a direct Bayesian analysis of this model is in-
tractable, even when using MCMC, because the likelihood of the model parameters
is given in terms of high-dimensional integrals, which are difﬁcult to evaluate. No-
tice from (3.14) that under this full model, the parameters are not identiﬁed. If
anp with p = 1, then for any constant a1 > 0, a2 6 R, the parameter vector
(a16 — a2, a'i’Tz, 6, alc - (12) all have the same likelihood, and in particular the binary
data contain no information about 7’2. To avoid this situation, we ﬁx the parameter

72 = 1 and c = 0 in the rest of this section so that our unknown model parameters

are (6,6).

3.2.1 Second Order Structure of Binary Random Fields

In the study of the mean and covariance structure of the resulting binary random
ﬁeld it is helpful to understand the global behavior of the binary realizations and their

relation with the parameters of the underlying random ﬁeld 5 (-.) Given (6,6) we have

var(Z(8)) = <P($T(S)B)(1 - ‘P($T(8)ﬁ)))
where (I) is the cdf of the standard normal distribution, so the mean and variance of

the binary random ﬁeld are controlled only by the parameter 6 as well as the location

dependent covariates. Likewise, conditional on (6,6), the correlation function of the

48

binary random ﬁeld is given by

00 OO
72(8,8')=/ 1"(m/ N([mi/92001:w2§7*(333’))dw1dw2
—.’L' S ‘17 S

 

/< \/<I><xT(s)m<1 — crown»

 

(ﬁbcwwl - <1><xT<s'>m> >

<P($T(8)ﬁ)¢($T(S’)6)
(1 - ¢($T(8)ﬁ))(1 - ‘P($T(8’)ﬂ))

 

 

(3.15)

where 992(w1,w2;7) is the bivariate normal density function with zero mean, unit
variance and correlation 7‘. Note that the correlation structure in (3.15) of the
binary random ﬁeld depends on both 6and 6, as well as on the covariates, even
though the correlation structure of the underlying field 5 (3) is independent of 6 and

only depends on 6.

Suppose that 5(-) has a constant mean, say 6(3) = 6. We assume this is so in
the remainder of this subsection. Using the formula given in Kedem (p35, 1980), the

correlation function (3.15) reduces to

dt

 

 

* , _ 1 ﬂ“) exp (—62/(1 + t))
“3’ 3) ‘ 27r<1><m<1— arm/o (142)”?

(3.16)

This provides a strictly increasing correspondence between 7‘(3, 3’) and 7;(3, 3’). For
any ﬁxed value of '7‘(3, 3’), 7X3, 3’) is an even function of 6, having its maximum at

6 = 0, and decreasing to zero in absolute value as I6] increases (Ochi and Prentice,

49

 

Z @

/

0 0 @

Figure 3.2: DAG of clipped intensity models

 

 

 

1984). For the case 6 = 0, the equation (3.16) reduces to (2/7r)sin‘1(7‘(3,3’)), so
for any 6 and 7‘(3,3’) we have [7§(3,3’)| S [7‘(3,3’)|. Therefore, the correlation
structure induced on the binary ﬁeld is always weaker than that of the underlying
Gaussian ﬁeld, and extremely so if [6] is large, i.e., if P(Z(3) = 1) is close to 0 or
1. In these extreme cases the realizations of the binary random ﬁeld will be ‘almost

constant,’ either 0 or 1.

3.2.2 MCMC Inference

The direct Bayesian analysis of the clipped Gaussian random ﬁeld is intractable.
We again use ’data augmentation’ and work with the extended model [6, 6, 17, Z, 17‘, 2"]
that explicitly includes the unobserved latent vectors 5 and 5'. In doing 0, we take
advantage of the mathematical tractability of the underlying Gaussian random ﬁeld
and, by using MCMC methods we are able to: 1) obtain posterior inference of the
model parameters, and 2) obtain a posterior sample of the latent vectors 5 and 5‘,
which provide the needed probabilistic bridge between Z and 2", allowing us to pre-

dict the former from the latter.

The model assumption with its graphic representation in Figure 3.2 is deﬁned

50

as follows:

[06,171,711 = [Bllﬁllnlﬂﬁllzlaﬂnlln‘IOﬁMJ (3.17)

To identify the full conditional distribution for each component of the vector (5, 6,6),
we need to pick up all the unknown factors in (3.17) depending on the component in

question. In doing so, and using

Tl

lnlﬂﬁl = Him-In-.. 9, ﬁlm—.19, 6]

i=1

we have for each component of the latent vector 17 or 5 that

[mm—39,612] = [0377-396 22']
= N(5’3zTﬁ + 7?P:2](Tl—i — X—iﬁ): 1 “ ’Yirrj’l’z')
X [{772' 6 Ar}
[636.“ 0’ﬁ’ Z] = N(72TF:2]€—iv 1 — 7{P:i7i)
X HQ 6 A2}

where 77—1; = {07ij 74 Z} and E—i = {(gj),j 75 Z}, and the matrix
P_i is F with the ith column and row removed, ‘7? is the ith row of I‘, and the

matrix X _z'i s X with the ith row removed, 2' = 1, - - - ,n. The above full conditional

distribution is a Normal distribution truncated to A, or A’.

Sampling from the truncated distribution can in principle be done using rejection
sampling, but this is inefﬁcient due to the large proportion of rejections. To draw

an observation from N (11,02) distribution, truncated to (a, b) with a,b 6 IR, the

51

inversion CDF methods can be applied (Devroye, 1986). If U ~ Uni. f (0.1), the

following statistic T, where

T==p+o¢-l<<»(—;—u>+v<¢<b¢>w<ar>>i

has the desired distribution. The ideas are easily extended to give a general account

 

of Bayesian analysis for order-restrictive parameters (Gelfand et al., 1990a). A mixed
rejection algorithm for sampling from univariate normal distribution truncated to

intervals is given in Geweke (1991).

Under the present framework, ﬁrst regarding our prior speciﬁcation, 6 and 6 are
assumed to be independent, with 6 ~ Np(60, BO“ 1), where 60 and Bo are chosen to
reflect prior information about P(Z (3) = 1). In most cases a vague prior distribution
may be obtained by setting 60 = 0 and Bo = 701,, with To small, say 0.05, which
assures that the inferences are mainly driven by the likelihood. The prior on 6 would
depend on the correlation function '7‘(3, 3’). Suppose that the prior distribution for

the model parameters is given by

1
loglﬁ. 9] = ’§(B — [BOVBOCB — 60)+10g[6] + Cg-
Then, for the full conditional distribution of 6, again from Figure 3.2. we have

[ﬂaw] = lﬁl9anl = [nlﬂﬁllﬁl
0< [n - XTﬁIﬂﬁllBl = [69.6116]

since conditional on n the data 2 are redundant. Similarly, for the full conditional

distribution for 6,
l9|ﬁ,n,Zl 0< l€|9,6l[9l

The last two full conditional distributions are of non-standard form and are difﬁcult to
directly sample from, so one has to perform a Metropolis-Hastings step. For example,

given the current value 6‘, a candidate 6’ for the next iteration is generated from a

52

proposal distribution q(6’|6‘) and this candidate is accepted as the next iterate with

probability

 

A(6‘, 9') = min [ [9.5.5149ng 1]

l9‘lﬁ,€lCI(9’|9‘)’
If the candidate 6’ is accepted, the next iterate becomes 6‘+1 = 6’; otherwise it is set
to the current state 6‘. The choice of proposal distribution depends on the choice

of correlation function 79(3, 3’) and so is problem speciﬁc. The MH algorithm also

works for updating 6.

Starting with initial value of (no, 60, 60), one cycle (iteration) of this MCMC al-
gorithm consists of successively (and in that order) sampling from the full conditional
distributions (4.9), and then performing the MH step described above. The generated
sequence, of length m say, is a trajectory of a Markov chain having [17, 6, 6|z] as its
limiting distribution. Therefore, after discarding an initial burn-in of the chain, say
7‘ iterations, long enough to surpass the transient stage of the chain, the subsequent
iterations are a dependent sample {(n‘, 6’, 6‘);t = r + 1, - -- , m} approximately dis-

tributed as [17, 6, 6|z]. This sample can be used to make inferences about the model

53

parameters and the latent vector 17. The MCMC algorithm is used to obtain approx-

imate samples from the posterior distribution of (6,5,6), which is given by

[n,GﬁIZl = [9llﬁllnl9,ﬁllzl9.ﬁ,nl

o< [Guam—“2
>< exp (€07 - XT6)TP‘1(77 - XT6)>
X Hlfﬂz' E A1}
—1/2 1 T —1 n I
[0.33121316113an exp (7: r e) [1133 e A.}
(3.18)

3.3 Joint Inference of Point Patterns and Geosta-

tistical Random Fields

In this section we consider the point patterns that associate with the covariate
stochastic processes. When the data sets of various types are closely related to the
same topic of interest, the joint modeling of those data sets may lead to more efﬁcient
analysis. This is possible, because we assume all the data sets are associated with
or depend on a spatial process (a random ﬁeld 5 (3)). Then we can make prediction
and inference based on this random ﬁeld. The joint modeling makes full use of the
data, such as the location information in the point pattern data. Point pattern

data associates with the random ﬁeld 5 (3) through the intensity function, and the

54

categorical data associates with the random ﬁeld through clipping. Consequently, we
can perform the joint analysis of point patterns and other types of spatial data. We

can also measure their association and dependence.

Assume that an observed point pattern {Sj} of N (d3) with intensity function
A(s), and observations {K} of a covariate stochastic process 5 (3) are both available
over the same region S. The covariate random ﬁeld is modeled as in (3.3), and the
point process may depend on the covariate process 5 through a functional link deﬁned
to be

log A(s) = C + 077(3) (3.19)

where 77(3) is deﬁned in (3.1) as 77(3) = 6(3) -+- 5(3).

3.3.1 Association between Point Patterns and Covariate Pro-

CBSSGS

Assume that the local intensity of a spatial inhomogeneous Poisson process N (d3)
depends on the realization of another spatial stochastic process 5 (3). The extent
of the dependence of the point process on the covariate process 5(3) is determined
by the unknown regression coefﬁcient a in (3.19). We examine the dependence of
the intensity A(s) on the level of the covariate process 5 (3) using a model of the

form (3.19).

The data consist of a partial realization of a point process, [S], J}, and mea-
surements {K ;2' = 1, - -- ,n} of a spatial process, 5 (3) on a region S. In the case
where 5 (3) has no effect on the intensity of the point process, a = 0. Conditional
on 77(3) and on the observed point pattern, {S}, J}, the locations of the points 5',-

from a subset S of the study region are independently and identically distributed

55

with probability density function proportional to Ms). Given (6,5), the conditional

log-likelihood equation for (a, 5) in (3.19) is

“3, cm) = —— /A exp<< + an(8))d8 + Z (c + an.) (3.20)

where 77,- = 77(8)). Inference for a and C can be based on Cox (1972), provided that the
true value of the 5 (3) throughout A is known. When 77 is estimated, the conditional
likelihood estimates of (I and its variance derived from Cox (1972) will be invalid, as
they do not allow for prediction or measurement error. However, we may estimate
(6,5) by Kriging estimates or other generic geostatistical methods described in the

previous sections.

Currie (1998) develops a test for association between realizations of a point
pattern and a covariate process 5(3) on the study region. The test is also valid with the
presence of covariates 6 (3). The test is motivated by the score test. Measurements Y,-
of this process are assumed to follow the model given in Equation (3.3). The intensity
of the point process depend on 5 and 6(3) jointly according to (3.19). Under this
model, the test for association is a test of the null hypothesis: Ho : a = 0. Up to
an additive constant, the conditional log-likelihood for the intensity [1(3) of the point
process given 77(3) is given in (3.20). The parameter 5 is a nuisance parameter, so we
maximize over it. Differentiating the expression with respect to 5, and then setting

the expression equal to zero gives the expression for C
C J
c = .
fA exp(a77(3))d3

Differentiating this equation with respect to a, and substituting for 5, the derivative

 

of the log-likelihood with respect to a is

(3.21)

 

gt _ __ J L, 77(8)exp(0z77(8))ds
a... — Z" fAexpmmsnds '

56

For testing the null hypothesis, 0 = 0, consider the statistic

J
U0 — $777 — WANfSldS

: ZWTﬁ + 57') — Iii—l [A(xT(8)6 + 5(3))ds

This is the score statistic for testing the association between A(3) and 5(3), the
derivative of the log-likelihood evaluated at the null hypothesis. If we had exact
ascertainment of 5 (3) throughout S, then we could calculate the mean and variance
of the distribution of this test statistic under the null hypothesis of a = 0, and hence
carry out the test of the null hypothesis. Note that calculating U0 only involves the
evaluation of (3.21) at a = 0. Thus we avoid the need for calculation of stochastic

integrals any more complex than fA 77(3)d3.

In practice we do not observe 77(3) directly but only through observation Y, on S.
From these measurements, we obtain the conditional expectation of 77(3) throughout
S. The Y,’s are conditionally independent given 77, according to the model (3.3). We
may use an estimator (e.g., Kriging, see Diggle et al., 1998) for 77. Substituting 77 for

77 in the expression, we obtain the approximate test statistic

J
00 = 0‘2 Z (X76 + ﬂit—WY — XT6))

i=1

J
02|A|

where ‘Yj = 733 = {COV(€J', I/z') ?=1 2 {COV(€j,€i)}zn___1 is the column

vector of length 77. Under the assumption that 5 (3) is Gaussian process, the distri-

 

/ (XT6 + 732-107 — X76» d3
A

bution of 00 is Gaussian. To ﬁnd the expected value of this statistic under the null

57

hypothesis, we take the expectation of Uow ith respect to Y, which is zero, that is

a(QEy U0): ZEY( (XTﬂ-(ﬁE (Y —X-%))
j: —-1
J

_l—AIA EY(XT3+7Z§: (Y —XT6))ds

Thus, the approximate score statistic U0 has expectation zero under the null hy-
pothesis a = 0. The variance of U0 with respect to the measurements Y is given

by

02 vary(U )=Z'ijE—1 ’7]- + 2 2732-17,:

7': lkaéj

J
O__IX|Z/7f21’ysd8

—1 7
+|——A2|2//78 T2 737d3d3.

Thus, given locations of events of a point processes, {Sj} on A, the measurements
{K} of the underlying continuous process {77(3)} are used to calculate U0. Then
conditioning on the locations of the process points, we use expressions for the mean
and variance of this statistics under the null hypothesis, to obtain a p-value for the

data.

Currie (1998) examines the performance of this test on a selection of simulated
data sets for a simple model. The test performs well, in the sense that it is able to

identify association between the point process intensity and the covariate process 5 (3)

58

  
 

a
@06)
O—~®

Figure 3.3: DAG of point patterns and geostatistical models

when the association is not evident to the eye. Currie also investigates the effect of
estimating the covariance parameters of the covariate process, rather than using their
true values. The replacement of the true covariance parameters by their maximum
likelihood estimates has only a small effect on the null distribution of the approximate

score statistic and on the power of the test.

3.3.2 MCMC Inference on Joint Models

Consider when both a point pattern {5]} and a geostatistical process Y(3) de-
pend on the realization of 5 (3) and on the parameters a, 6, and 6. The dependence
structure for the quantities in (3.19) can be referred to its graphic representation in
Figure 3.3. The graphical model also aids the factorization of the joint distribution of
all the quantities in our graph into the product of the full conditionals. The analysis
for factorization is to take the product over all nodes of the conditional density of
that node given its parents. With the given DAG, the joint distribution of our model

quantities can be factored as follows:

[9, W110, C, N, a, Y] = [Gllwlallﬁllalll’la Wﬁ, Ulla, CllNIO, W710“, Cl (3.2?)

Model speciﬁcation is involved in the distributional forms for each of the fac-

tors in (3.22). The covariate process 5 (3) is most commonly modeled as a stationary

59

Gaussian random ﬁeld with a partially uncertain covariance structure. Some no-
table references in this area include Besag (1974), Ripley (1981), Cressie and Chan
(1989), Clayton and Kaldor (1987), Isaaks and Srivastava (1990), Besag, York and
Mollié (1991), Cressie (1993), Bernardinelli et a1. (1997), and Moller, Syverseveen
and Waagepetersen (1998). Ideally, the distributional assumptions about the under-
lying properties of our model should be sufﬁciently ﬂexible to accommodate a wide
range of behavior of the random variates they describe. In this study, we are focusing
on the model in (3.2) with the inhomogeneous intensity. Modeling on the combined
spatial data with the kernel mixture type is best implemented through tessellation of
the reference state 11}. Consider a tessellation of J cells in [U with cell centers {27,-}.
The W(du; Uj E du) are assumed to be independent normal random variates with
mean zero and variance depending on some parameters. Approximately, the Gaus-
sian random ﬁeld 5 (3) is constructed through the kernel mixture of Wiener process
W(du) on U, given by,

s = ROW” (3.23)

where 5 = (5(31),--- ,5(3,,)) with 31, - ~ ,sni n S, and the elements of the random
vector W = {Wj } j EJ are independent normally distributed with the variance
subject to smooth variation in parameters 6 and cell attributes M(du). Note that
K = {kg} with kg]- = (60(87, Uj), and the covariance matrix of 5 is then
P = K K T. Thus, the density of 5 is given by
p€(z) = (27r)""/2|I‘[_1/2 exp (—%ZTF”IZ) .

The density of 77(3) = $T(3)6 + 5(3) is then

1032) = (23-7231“? exp (gt. - XTaVP-wz — #3)) .
Other model distributional assumptions are given by

[N(ds)l€]~ pow.»

[Y(S)I€l~ N(77(8), 02)-

60

When applying the MCMC method to simulate from the joint posterior density
function given in (3.22), we may employ independent vague priors on the model

parameters in the absence of genuine prior information, for example,
[91~ ”(9 )lﬁl~N 14330, )lal~ Mao, )lCl~1V (0C0. ( )lal~ gammmba )

Other steps for generating the chain are to update the other unknown quantities
conditional on the data, {53-} and Y, and visiting each of the other nodes in Figure 3.3
in turn. The full conditionals of the quantities 6, 6, a, C, a and W are derived as

follows

The log of the conditional density for 6 is derived based on the Bayesian relation,

[6[W, 6, a, C , N, a, Y] = [6]W]o< [W]6][6] then, up to an additive constant, we have
1 T —1
10g[91|9_1,£] = ‘108lfal — 55 F9 5 + Cg

where 5 is the vector of the {62-}2' E 1. We may employ a symmetric random walk
proposal on the feasible values of 6,’s, so the update steps for each of the 6,- reduce to

MH steps.

From the relation, [6[6, W,a,C,N, a, Y] or [Y|6,W,0][N]6,W,a,(][6], the log

of the conditional density of 6 is, up to an additive constant, given by

108l5l51Nl2—2—Lg (Y— n.)

l

- feXpK +an(8))d8

+ a 277,- + log[6] + cg
j

where 77, = 77(3,) and 773- = 77(5)).

61

For the full conditional density of 5, from the relation [5 [6, W, 6,01, N, a, Y] cx
[N[6, W, 0, {HQ}, we have

log[5[6, W,6,a, N, a, Y] = — /(exp(5 + 077(3))d3 + J5 + log[5] + cg.

The full conditional density of 0, follows [a|6,W,6,5 ,N , a, Y] or [N]6,W,a,(][a]

and is given by

[0'91W1781C1N107Yl

= — /(exp(5 + a77(3))d8 + a Z 77]“ +108lal°

.7

With the model, [0|6, W",6, a, 5, N, 0,Y]o< [Y|6, W,6, a][0]. Up to an additive

constant, the log density of the conditional density of a is

I
log[0|5, Y] = nloga — :2??- (Y,- — 77,-)2 + log[0] + cg.

Based on [5|6,W,6,a,C,N,o,Y,a] oc [Y|6,W,6,a][N|6,W,a,5][5|6], the log

density of covariate random vector 5 is given by

log[5|6,N,6,Y] = ——(17— (Y7 — 772')2

— /exp(5 + an(3))d3 + a Z 77]-

I
—§TF;177+cg.

62

From equation (3.23), K is constant in W and 5 , the log of the full conditional density

of W is given by the following, up to an additive constant,
log[W|6, NyﬁaYl = ’7 (Y2? — 771.)“
— €XP(C) 2 d3,- €XP(0’772')
7'

1
+ a 2 77.777,- - -2-WT(KKT)_1W + cg.

Then we can use an independent random walk proposal for the candidate W with
centered normal density on Wj. As such, the update step reduces to 3 MH step in

the MCMC algorithm.

63

Chapter 4. Summary of Results and Further

Investigation

4.1 Summary of Results

The thesis concerns the Cox M modeling approach to inhomogeneous point pat-
terns with the random intensity functions directed by independent increment random
measures in a reference state space. The idea of kernel mixture representations is
relatively new in spatial statistics. It has the potential to model non-stationary pro-
cesses, such as the Poisson/ Gamma mixture models by Wolpert and Ickstadt (1998a)

and Binary/ Gaussian models as in Section 3.2.

In Chapter 2, the proposed model on point patterns is based on the Poisson
generalized Gamma random ﬁelds. The generalized Gamma measures can be simu-
lated by the Inverse Lévy Measure algorithm. The tessellation representation of a
point pattern creates a lattice structure, which is the vehicle to tackle the problem
of spatial prediction. This is worked out in detail for an example where the data are
counts from a subset of lattice cells and binary variables on the complement. The
implementation is performed with MCMC for this type of model on counts and binary

clipped random ﬁelds.

For highly clustered point patterns, it is more appropriate to apply the geosta-
tistical modeling approaches after the log transformation of count data. The related
models are based on Wiener measure subordinated random ﬁelds, which are closely
related to the ordinary random ﬁelds speciﬁed directly by mean and covariance func-

tions. The proposed models in Chapter 3 are intended to make full use of collateral

64

information obtained along with point patterns. The collateral data, treated as ﬁxed
covariates and covariate stochastic processes, are considered in the setting of joint
modeling on the intensities. The corresponding MCMC schemes are proposed for

each data model.

The probability structure in the binary clipped models (in Section 3.2) that
contain latent Gaussian processes may be of great interest to spatial statisticians,
but very little progress has been made in the past in deriving likelihoods and in
studying the covariance structures in such processes. One may think of a myriad of
examples where the two types of processes are linked. For example, a smooth fertility
trend in soil will affect both the emergence of weed plants (a point process) as well

as the crop yield surface (a geostatistical process).

4.2 Further Investigation

Wolpert and Ickstadt (1998a) propose a data augmentation scheme for the Cox
M models when M is a Gamma measure. The data augmentation method does not
work with nonzero index parameter function a(u). For the discrete case in Chap-
ter 2, a product multinomial augmentation scheme works in general for the spatial
Poisson-generalized Gamma random ﬁelds. A new data augmentation scheme may be
needed in order to implement a MCMC inference on point patterns. The method is
computationally intensive and time consuming in practice. However, it is demanded

in that it utilizes the point patterns.

A point pattern may be obtained by thinning of another one, or by the super-
posing of others, or by the clipping of geostatistical random ﬁelds. The limiting point

patterns may result in geostatistical random ﬁelds. Weak convergence of random

65

ﬁelds has been studied largely using mixing conditions, see Yoshihara (1992). The
explicit conditions for GG—measures to be approximated by geostatistical random
ﬁelds are still under investigation, especially on the index set where a(u) sé 0 on
U. The classic asymptotic normality theory for a 06' distribution may suggest the
conditions for weak convergence of point patterns. For a 06' (a, A, b) random variable

M, the standardized variable is asymptotically normal, i.e.,

__ 0-1 _
= M Ab _ => N(0,1) as 1 a
\//\(1 —a)b° 2 Ab“

 

 

10

 

where the convergence is along the decreasing sequences of the squared coefﬁcient of
variation. The proof comes from taking the limit of the Laplace transforms; see also
Jargensen (1987). Note that when a —+ —oo the distribution of b.M/(1 — a) converges
to a Poisson distribution with mean Ab“ / (1 — a). This follows by applying a limit
argument to the Laplace transform; see also Aalen (1992). Then the property of
asymptotic normality of a 06' (a, A, b) may suggest that, for a large inverse scale b(u)
when a(u) > 0 or small b(u) when a(u) < 0, the limiting random ﬁeld may tend to
be a Gaussian random ﬁeld. The re-weighted GG—measures with a(u);£ 0 can not
be constructed under the current augmentation method. In application, this occurs
mostly for highly clustered point patterns; it might be more appropriate for applying

geostatistical methods as discussed in Chapter 3.

When testing of association of a point pattern and geostatistical random ﬁelds
in Section 3.3 the hypothesis being tested is Ho : a = 0. In case you reject, we plan to
study whether the direction of the association can be gleaned from the test statistic.
It would be interesting to know whether the latent process inhibits one process while
exacerbating the other. One might imagine that through a site-speciﬁc application
(spatially varied) of herbicide one can reduce the number of weeds (suppress the point

pattern) and increase the crop yield (accelerate the yield process).

66

Bibliography

AALEN, O. (1992). Modeling heterogeneity in survival analysis be the compound

poisson distribution. The Annals of Applied Probability 2 951—972.

ABRAMOWITZ, M. and STEGUN, I. A. (1972). Handbook of Mathematical Functions.

Dover, New York.

BAR-LEV, S. and ENIS, P. (1986). Reproducibility and natural exponential families

with power variance functions. Annals of Statistics 14 1507—1522.

BESAG, J. (1974). Spatial interaction and the statistical analysis of lattice system

(with discussion). J. R. Statist. Soc. B 35 192—236.

BESAG, J. YORK, J. and MOLLIE, A. (1991). Bayesian image restoration, with

two applications in spatial statistics (with discussion). Ann. Inst. Statist. Math.

43 1—59.

BRIX, A. (1998). Spatial point processes in weed science. Technical report De-
partment of Mathematics and Statistics Lancaster University, Lancaster, United

Kingdom.

BRIX, A. (1999). Generalized gamma measures and shot-noise cox processes. Ad-

vances in Applied Probability 31.

CLAYTON, D. G. and KALDOR, J. (1987). Empirical bayes estimates of age-

standardized relative risks for use in disease mapping. Biometrics 43 671—81.

COX, D. (1972). The statistical analysis of dependencies in point processes. In Lewis,
P., editor, Stochastic point processes: statistical analysis, theory, and applications.

Wiley.

67

CRESSIE, N. A. C. (1993). Statistics for Spatial Data. John \Niley & Sons.

CRESSIE, N. A. C. and CHAN, N. H. (1989). Spatial modeling of regional variables.
J. Amer. Statist. Assoc. 84 393—401.

CURRIE, J. E. (1998). Joint Modelling of Point Process and Geostatistical Measure-

ment Data. Ph.D. thesis Lancaster University.

DALEY, D. J. and VERE-JONES, D. (1988). Introduction to the Theory of Point

Processes. Springer, New York.

DEVROYE, L. (1986). Non-uniform random variate generation. Springer Verlag:

New York.

DIGGLE, P. J ., MOYEED, R. A. and TAWN, J. A. (1998). Model-based geostatis-
tics. Applied Statistics 47(3) 299—350.

DIGGLE, P. J. (1990). A point process modelling approach to raised incidence of
a rare phenomenon in the vincity of a prespeciﬁed point. Journal of the Royal

Statistical Society (Series A) 153 349—362.

GELFAND, A. E. and SMITH, A. F. M. (1990a). Sampling-based approaches to
calculating marginal densities. Journal of the American Statistical Association 85

398—409.

GELFAND, A. E., HILLS, S. E., RACINE-POON, A. and SMITH, A. F. M. (1990b).
Illustration of bayesian inference in normal data models using gibbs sampling. Jour-

nal of the American Statistical Association 85 972—985.

GELMAN, A., ROBERTS, G. O. and GILKS, W. R. (1996). Efficient metropolis
jumping rules. In J. M. Bernardo, J. O. Berger, A. P. D. and Smith, A. F. M.,

editors, Bayesian Statistics 5. Oxford University Press 599—607.

68

1

GEWEKE, J. (1991). Efﬁcient simulation from the multivariate normal and student-t
distributions subject to linear constraints. In Computing Science and Statistics:

Proceedings of the 23rd Symposium on the interface. Ammerican Statistical Asso-

ciation Alexandria, VA 571—578.

GILKS, W. R., RICHARDSON, S. and SPIEGBLHALTER, D. J. (1996). Markov Chain.

Monte Carlo in Practice. New York: Chapman and Hall.

GILKS, W. and ROBERTS, G. (1996). Markov chain Monte Carlo in practice chapter
Strategies for improving MCMC. Chapman 8; Hall 89-110.

GREEN, P. and SIBSON, R. (1978). Computing dirichlet tesselations in the plane.

Computer Journal 21 168—173.

GREGOIRE, T. G., SCHABENBERGER, O. and KONG, F. Z. (2000). Prediction from

an integrated regression equation: a forestry application. Biometrics 56 414—9.

HEISEL, T., ANDREASEN, C. and ERSBQLL. A. K. (1996). Annual weed distribu-

tions can be mapped with kriging. Weed Research 36 325—337.

HIGDON, D., SWALL, J. and KERN, J. (1999). Non-stationary spatial modeling. In
Bernardo, J. M., Berger, J. O., Dawid, A. P. and Smith, A. F. M., editors, Bayesian
Statistics: 6 6. Oxford University Press 761—768.

HOUGAARD, P., LEE, M.-L. and WHITMORE, G. (1997). Analysis of overdispersed

count data by mixtures of poisson variables and poisson processes. Biometrics 53

1225—1238.

HOUGAARD, P. (1986). Survival models for heterogeneous populations derived from

stable distributions. Biometrika 73 387—396.

69

ICKSTADT, K. and WOLPERT, R. L. (1996). Spatial correlation or spatial variation?
a comparison of gamma/poisson hierarchical models. Technical Report 01 Institute

of Statistics and Decision Sciences, Duke University, Durham NC 27708-0251.

JACOD, J. and SHIRYAEV, A. N. (1987). Limit Theorems for Stochastic Processes.
volume 288 of Crundlehren der mathematischen Wissenschaften. Berlin: Springer-

Verlag.

J ORGENSEN, B. (1987). Exponential dispersion models. Journal of the Royal Statis-

tical Society, Series B 49 127—162.

KARR, A. F. (1991). Point Processes and Their Statistical Inference. Markcel Dekker

second edition.

KASS, R. E., CARLIN, B. P., GELMAN, A. and NEAL, R. M. (1997). Mcmc in

practice: A rountable discussion. Technical report Carnegie Mellon University.
KEDEM, B. (1980). Binary Times Series. New York: Marcel Dekker.
KINGMAN, J. F. C. (1993). Poisson Processes. Oxford Science Publications.

MARCUS, M. B. and ROSID’ISKI (2000). Continuity of stochastic integrals with respect

to inﬁnitely divisible random measures. Department of Mathematics, The City

College of CUNY, New York, NY 10031, mbmarcus©earthlink.net.

MARTINEZ, V. D. O. (1997). Prediction in Some Classes of Non-Gaussian Random
Fields. Ph.D. thesis University of Maryland at College Park.

MILES, R. (1972). The random division of space. Suppl. Adv. Appl. Prob. 243—266.

MoLLER, J. SYVERSVEEN, A. R. and WAAGEPETERSEN, R. P. (1998). Log gaus-

sian cox process. Scand. J. Statist. 25.

MOLLER, J. (1994). Lectures on Random Voronoi Tessellations. Springer.

70

OKABE, A., Boors, B., SUGIHARA, K. and CHIU, S. (2000). Spatial Tessellations.

Wiley, Chichester.
RIPLEY, B. D. (1981). Spatial Statistics. John Wiley and Sons, New York.
ROGERS, A. (1974). Statistical analysis of spatial dispersion. Pion, London.

SCHABENBERGER, 0., GREGOIRE, T. G. and KONG, F. Z. (1999). Statistical
features of wheat scab sampling protocols. random volumes and inhomogeneous
point processes. Spring meetings of the Eastern North American Region of the

international biometric society.

SILVERMAN, B. W. (1986). Density estimation for statistics and data analysis.

Chapman and Hall: London.

SMITH, A. F. M. and ROBERTS, G. O. (1993). Bayesian computation via the gibbs
sampler and related markov chain monte carlo methods. J. R. Statist. Soc. B, 55,

3—23.

STOYAN, D., K. W. and MECKE, J. (1995). Stochastic Geometry and its Applica-

tions. Wiley, Chichester second edition.

TANNER, M. A. and WONG, W. H. (1987). The calculation of posterior distribution

by data augmentation. J. Am. Statist. Assoc. 82 528—50.

TIERNEY, L. (1994). Markov chains for exploring posterior distributions (with dis-

cussion and a rejoinder by the author). Ann. Statist. 22 1701—1762.

TWEEDIE, M. (1984). An index which distinguishes between some important ex-
ponential families. In Ghosh, J. and Roy, J ., editors, Statistics: Applications and
new directions. Proceedings of the Indian Statistical Institute golden jubilee inter-

national conference 579—604.

71

WOLPERT, R. and ICKSTADT, K. (1998a). Poisson/ gamma random ﬁeld models for

spatial statistics. Biometrika 85 251—267.

WOLPERT, R. L. and ICKSTADT, K. (1998b). Simulation of Le’vy Random Fields.
Practical Nonparametric and Semiparametric Bayesian Statistics. New York:

Springer-Verlag dipak dey, peter miiller and debajyoti sinha (editors) edition.

YOSHIHARA, K. (1992). Weakly Dependent Stochastic Sequences and Their Appli-

cations volume VOL. VII Generalized partial-sum processes. Sanseido Co., Ltd.

Misaki—cho.

72

I
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII

   

Ill[[I[II][[[][[I]]l]l|][][][lll