This is to certify that the dissertation entitled "Bayesian Modeling on Inhomogeneous Point Patterns via Independent Increment Random Measures," presented by Fanzhi Kong, has been accepted towards fulfillment of the requirements for the Ph.D. degree. Major Professor: Dennis Gilliland. Date: August 5, 2002.

BAYESIAN MODELING ON INHOMOGENEOUS POINT PATTERNS VIA INDEPENDENT INCREMENT RANDOM MEASURES

By Fanzhi Kong

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY, Department of Statistics and Probability, 2002

Abstract

BAYESIAN MODELING ON INHOMOGENEOUS POINT PATTERNS VIA INDEPENDENT INCREMENT RANDOM MEASURES

By Fanzhi Kong

The main objective of this thesis is to investigate inference and prediction for the intensity functions governing inhomogeneous point patterns. The intensity functions are modeled as random fields directed by independent increment random measures on a reference state space. Random fields are the foundation of spatial data analyses. One model constructed for point patterns is based on the Poisson generalized Gamma random fields. The generalized Gamma measures can be simulated by the Inverse Lévy Measure algorithm. For highly clustered point patterns, it is more appropriate to apply geostatistical modeling approaches after a log transformation of the count data. The related models are based on Wiener measure subordinated random fields, which are closely related to ordinary random fields specified directly by mean and covariance functions.
Another concern is to make full use of collateral information obtained along with point patterns. The collateral data, treated as fixed covariates and covariate stochastic processes, are considered in the setting of joint modeling of the intensities. The modeling approaches combine theoretical development with simulation methods, and a corresponding MCMC scheme is proposed for each data model. The actual implementation is demonstrated and performed for the joint model on counts and binary clipped random fields.

Acknowledgements

My deepest regards and sincerest thanks go to my thesis advisors, Professors Dennis Gilliland and Oliver Schabenberger, for their many instructions and enthusiastic support during this research and for their guidance through the early years of chaos and confusion. I sincerely appreciate Professor Gilliland's inspiration and extreme patience in the final stage of preparing this dissertation. My committee members, Professors V. S. Mandrekar, Habib Salehi and Yimin Xiao, provided the technical assistance that improved some results and eliminated errors; their help was invaluable. Fuxia Cheng and Yichuan Xia shared with me their knowledge of second order analysis and provided many useful references and friendly encouragement. I am grateful to the other members of the Department of Statistics and Probability at Michigan State University, who make the department such an exciting, friendly and inspiring place to study and work, in particular Professor James Stapleton. For financial support I thank Professors Oliver Schabenberger, Patrick Hart and Shengyang He, who mainly provided the research assistantship support for this project during my graduate studies. My graduate studies in the United States were also supported in part by the Department of Crop and Soil Sciences and the DOE Plant Research Laboratory at Michigan State University.
East Lansing, Michigan, USA
Fanzhi Kong
July 31, 2002

Table of Contents

List of Tables
List of Figures
1 Spatial Poisson-Generalized Gamma Random Fields
  1.1 Point Patterns and Collateral Information
  1.2 Thesis Synopsis
    1.2.1 Terminology and Notation
  1.3 Poisson-Generalized Gamma Random Field Models
    1.3.1 Generalized Gamma Random Measures
    1.3.2 Intensity Models and Modeling Objectives
2 Joint Inference on Counts and Binary Random Fields
  2.1 Model Specification and Representation
  2.2 Counts and Binary Data
  2.3 Joint Inference on Counts and Binary Random Fields
    2.3.1 Graphical Representation
    2.3.2 MCMC Scheme
  2.4 Simulation and Results
    2.4.1 Simulation Setup
    2.4.2 Posterior Analysis
3 Models Based on Log Gaussian Random Fields
  3.1 Models for Log Counts
    3.1.1 Model Specification
    3.1.2 MCMC Inference
  3.2 Clipped Intensity Models
    3.2.1 Second Order Structure of Binary Random Fields
    3.2.2 MCMC Inference
  3.3 Joint Inference of Point Patterns and Geostatistical Random Fields
    3.3.1 Association between Point Patterns and Covariate Processes
    3.3.2 MCMC Inference on Joint Models
4 Summary of Results and Further Investigation
  4.1 Summary of Results
  4.2 Further Investigation
Bibliography

List of Tables

2.1 True model and other models

List of Figures

2.1 Homogeneous Poisson Voronoi tessellation
2.2 DAG of counts and binary random fields
2.3 Empirical counts plot
2.4 Posterior means for intensities over lattice cells
2.5 Posterior density for log θ
2.6 Posterior density for b1
2.7 Posterior density for b2
3.1 DAG of log Gaussian models
3.2 DAG of clipped intensity models
3.3 DAG of point patterns and geostatistical models

Chapter 1. Spatial Poisson-Generalized Gamma Random Fields

The study of inhomogeneous point patterns often involves the treatment of additional spatial data types. The joint analysis of point patterns and collateral information may improve the efficiency of inference and prediction. This consideration matters because point pattern data are typically the most expensive to collect, while collateral data are often available along with them. Accordingly, methods for the joint analysis of point patterns and their collateral data have received much recent attention. The following section gives a short introduction to point patterns and the other spatial data types considered in this thesis.

1.1 Point Patterns and Collateral Information

Consider a bounded subset S of the d-dimensional Euclidean space with d ≥ 1. A point pattern, {s_j; j = 1, ..., J}, is the set of locations of observed events from all sampling windows {A_i} in S. Other spatial measurements {Y_i; i = 1, ..., n} are taken at the corresponding locations {s_i} ⊂ S. The locations {s_i} are explicitly or implicitly associated with sampling windows {A_i} for obtaining feasible measurements. Practical applications often involve random variates Y_i distributed on a regular lattice. Examples include image analysis, where locations represent pixels, and crop experiments, where they equate with plots in the field.
Another example is a two-way table with responses that are the aggregate of independent row effects, column effects and higher order effects. In this context, the locations {s_i} are called the positive integral coordinates, or the generic locations. One may distinguish point patterns from other types of measurements by associating Y_i with the sampling window A_i, where s_i ∈ A_i ⊂ S. The sampling windows {A_i} may fall far away from each other or may make contact with their neighbors. In the former case, {s_i} can be assumed to have no distinguishable points relative to the sizes of the sampling windows; in the latter case, {s_i} together with {A_i} form a partition of the study region. When the relative size of the sampling windows cannot be neglected and the sampling windows are disconnected, {s_i} forms a network with features of the sampling windows as node attributes. The corresponding data {Y_i} may be called geostatistical data, lattice data and network data, respectively. Spatial data can also be categorized by the characteristics of the observables and may be grouped into three categories: geostatistical data, point pattern data and categorical data (or random sets [1]). Cressie (1993) classifies general spatial data into three types, i.e., geostatistical data, lattice data, and point pattern data.

1.2 Thesis Synopsis

The choice of spatial models is dictated by the particular form of the spatial data. It is also influenced by the modeling objectives, which may be to summarize data, to predict dynamics, or to model mechanistically in order to advance understanding. In this thesis, the data representation and interpretation for the different types of spatial data will be based on Voronoi tessellations. The corresponding models will be constructed with the aid of their directed acyclic graph representations. Direct approaches to inference and prediction in inhomogeneous models are generally not feasible.
We will follow Markov chain Monte Carlo methods, and all the required full conditional distributions are derived or approximated for the proposed models.

[1] Random sets are the main subject of stochastic geometry.

The thesis develops several models for inhomogeneous point patterns in the presence of covariates and covariate stochastic processes. A chapter-based synopsis of the models and their analyses follows.

The rest of this chapter concerns the Poisson generalized Gamma random fields. We establish properties of the generalized Gamma measures that are useful for modeling and inference for point patterns. We also discuss the incorporation of a covariate into the analysis.

Chapter 2 performs a simulation study for the joint model of count data and censored data. In Section 2.1, we continue the work on the two-stage model with a particular generalized Gamma measure and with a particular data structure (count and indicator variables). The data structure is more fully described in Section 2.2; it consists of count data on a subset of the cells of a regular lattice and indicator variables for counts in excess of a cutoff point on the other cells. In Section 2.3, we develop the full conditional distributions needed for an MCMC implementation. Section 2.4 gives the results of the MCMC implementation based on specific data values.

Chapter 3 studies modeling of point patterns with the intensity function as a transformed kernel mixture of Wiener measures, i.e., log Gaussian random fields. In Section 3.1, we introduce a model for log-count data that includes covariates and an additive error. In Section 3.2, we motivate a model for binary data derived from a clipped intensity function; the second order structure is discussed. In Section 3.3, we discuss inference for point patterns that are associated with a covariate process.

Chapter 4 summarizes the main results of the thesis and discusses some issues for further investigation.
1.2.1 Terminology and Notation

Some terminology and notation described here will be used throughout the thesis. The Bayesian approach is natural for the type of statistical problems treated in the thesis. The Bayesian approach to statistics uses probability measures, usually defined in terms of density functions, to quantify and assess unknown parameters and other unknown quantities, so it involves a direct application of probability theory.

Densities and probability functions of random vectors X and Y will be denoted generically by square brackets, so that the joint, conditional and marginal forms appear, respectively, as [X, Y], [X|Y], and [Y]. This notation is used by Gelfand et al. (1990a). The usual marginalization by integration or summation will be denoted by forms such as

[X] = ∫ [X|Y] [Y].

We will use the term 'MCMC' for the Markov chain Monte Carlo methods. In deriving the conditional densities, the phrase "is proportional to" is denoted by the symbol '∝'. For the log conditional densities, the additive constant involved will be denoted by a generic constant 'c0'. This convention is convenient because it reduces the amount of notation, and since these densities are seldom evaluated, it produces no confusion. The distribution of a random vector X will be denoted by '~', interpreted as "X is distributed as". The transpose of a vector or matrix X will be denoted by X^T. In defining random measures, the underlying probability space will be denoted by (Ω, ℱ, P). We will denote the Lebesgue measure on the d-dimensional Euclidean space R^d by '| · |'. Among others, 'GG' is shorthand for the generalized Gamma measures or distributions, with arguments supplied in context. Finally, the terms geostatistical random fields, models, and data will be used to emphasize the continuous nature over a subset of the Euclidean space R^d.
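As a concrete illustration of the bracket notation, the marginalization [X] = Σ_Y [X|Y][Y] for a small discrete pair can be written out directly. The probability values below are made up purely for illustration:

```python
# Bracket notation [X] = sum over y of [X|Y=y][Y=y], for a toy discrete pair.
# All probability values are illustrative, not from the thesis.
p_y = {0: 0.4, 1: 0.6}                       # [Y]
p_x_given_y = {0: {0: 0.9, 1: 0.1},          # [X | Y = 0]
               1: {0: 0.2, 1: 0.8}}          # [X | Y = 1]

# Marginalization by summation over Y.
p_x = {x: sum(p_x_given_y[y][x] * p_y[y] for y in p_y) for x in (0, 1)}
```

The same bracket calculus underlies every full conditional derivation in the later chapters, with integrals replacing the sums for continuous components.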
1.3 Poisson-Generalized Gamma Random Field Models

In analyzing point patterns, the unifying theme is to model the spatial uncertainty through the (first-order) intensity function. Point patterns are commonly modeled as a Poisson process N(ds) over the state space S (and its Borel σ-algebra 𝒮), which provides the independence structure for mathematical manipulation. N(ds) is an inhomogeneous Poisson process with mean measure Λ if (i) for any A ∈ 𝒮, P(N(A) ∈ {0, 1, ...}) = 1, and for any collection of disjoint sets A_1, ..., A_n ∈ 𝒮, the random variables N(A_1), ..., N(A_n) are independent; (ii) for all s ∈ S,

P(N(ds) = 0) = 1 − Λ(ds) + o(Λ(ds))
P(N(ds) = 1) = Λ(ds) + o(Λ(ds))    (1.1)
P(N(ds) > 1) = o(Λ(ds))

where ds is an infinitesimal region located at s ∈ S. From these two postulates, it has been shown that N(A) has a Poisson distribution with mean Λ(A) for all A ∈ 𝒮 (e.g., pp. 13-14, Rogers, 1974),

P(N(A) = j) = Λ(A)^j e^{−Λ(A)} / j!,    j = 0, 1, 2, ...    (1.2)

Point patterns may present spatial correlation. The doubly stochastic process about to be described is a generic structure for creating spatial correlation. In this study, the intensity function Λ(s) of a point pattern is modeled as a random field, given by the stochastic integral

Λ(s) := ∫_U k(s, u) M(du)    (1.3)

where M(du) is an independent increment random measure on a reference state space U (also a bounded subset of the d-dimensional Euclidean space) with control measure λ(du), and the nonnegative deterministic kernel k is such that

sup_{s ∈ S} k(s, u) ∈ L²(U, 𝒰, λ)    (1.4)

where 𝒰 is the Borel σ-algebra of U. The kernel mixture type of independent increment random measures provides a flexible family of models for inhomogeneous point patterns with a simple and interpretable structure. The specification of kernels often relies on a preliminary analysis of semi-variogram structures. The resulting models for point patterns are called Cox M models.
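The postulates in (1.1) characterize N(A) as Poisson with mean Λ(A). For a fixed bounded intensity function, such a process can be simulated by Lewis-Shedler thinning of a dominating homogeneous process; this standard construction is not part of the thesis, and the function names below are ours:

```python
import math
import random

def poisson_sample(mu, rng):
    """Knuth-style Poisson(mu) sampler (adequate for moderate mu)."""
    threshold, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def thinned_poisson_pattern(intensity, lam_max, width, height, rng):
    """Inhomogeneous Poisson pattern on [0, width] x [0, height] by thinning.

    `intensity(x, y)` must be bounded above by `lam_max`.
    """
    n = poisson_sample(lam_max * width * height, rng)   # dominating process
    pattern = []
    for _ in range(n):
        x, y = rng.uniform(0.0, width), rng.uniform(0.0, height)
        # Retain each candidate with probability intensity / lam_max.
        if rng.random() < intensity(x, y) / lam_max:
            pattern.append((x, y))
    return pattern
```

The retained points form a realization of the inhomogeneous process, since independent thinning of a Poisson process is again Poisson with the thinned intensity.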
The Cox M models have been used to explain the interaction among point events at small scales. Heisel et al. (1996) use one to quantify the small range covariance structure in a weeds application. Wolpert and Ickstadt (1998a, 1998b) have developed the theory of the Poisson Gamma random fields, that is, the case in which M(du) is a Gamma measure. Brix (1999) discusses the properties and the simulation of the generalized Gamma measures with constant parameter functions. In this study, we consider the generalized Gamma measures (GG-measures for short), with possibly nonconstant parameter functions over U. The family of GG-measures includes Gamma random measures, positive stable random measures and inverse Gaussian measures. The family of Cox GG models is of Poisson cluster process type and includes Poisson processes and ordinary Neyman-Scott processes. As such, the family provides models which can be interpreted both as germ-grain models (Stoyan et al., 1995) and as overdispersion models for point patterns.

1.3.1 Generalized Gamma Random Measures

The Laplace functional for random measures plays a role similar to the Laplace transform for random variables. The distribution of a random measure M(du) on a measurable space (M, ℳ) is uniquely determined by its Laplace functional (Stoyan et al., 1995). The Lévy-Khinchin representation (Jacod and Shiryaev, 1987, and Wolpert and Ickstadt, 1998a) for a nonnegative independent increment random measure M has a Laplace functional

E e^{−M[f]} = exp( −φ[f] + ∫∫_{U*} (e^{−h f(u)} − 1) ℓ(du dh) ),    (1.5)

where f(u) is any compactly-supported bounded measurable function, and the drift is φ[f] = ∫_U f(u) φ(du) with φ(du) a σ-finite positive measure on U. The inhomogeneous Lévy measure ℓ(du dh) on U* := U × R_+ satisfies the integrability condition

∫∫_{B × R_+} (1 ∧ h) ℓ(du dh) < ∞    (1.6)

for each compact B ∈ 𝒰. Hence, the random measure M is determined by the pair of measures (φ, ℓ).
In the rest of this chapter and in Chapter 2 we take M(du) to be the generalized Gamma measure. It is the GG(a(u), λ(du), b(u))-measure specified by the Lévy measure

ℓ(du dh) = [1 / Γ(1 − a(u))] h^{−a(u)−1} e^{−b(u) h} λ(du) dh,    (1.7)

where a(u) < 1 is the index parameter function, b(u) > 0 is the inverse scale parameter function, and the boundedly finite measure λ(du) is the control measure on (U, 𝒰). By a boundedly finite measure, we mean one that is finite on every bounded Borel set. A sufficient condition for the Lévy measure ℓ in (1.7) to satisfy the integrability condition in (1.6) is: for every compact B ∈ 𝒰,

∫_B b(u)^{a(u)−1} λ(du) < ∞.    (1.8)

We take a(u) and b(u) to be continuous on U unless specified otherwise.

The Generalized Gamma Distributions

The GG-measures are derived from the family of generalized Gamma distributions, GG(a, λ, b). This family was suggested by Tweedie (1984) and Hougaard (1986), and considered in various directions by Bar-Lev and Enis (1986), Jorgensen (1987), Aalen (1992), Hougaard et al. (1997), and Brix (1998). A GG(a, λ, b)-distribution is concentrated on (0, ∞) and infinitely divisible in λ, which makes it a natural basis for defining independent increment random measures. This also has an important implication in assigning independent random variables to disjoint subsets of the reference state space when modeling spatial data. The family of GG(a, λ, b)-distributions is characterized by its Laplace transform,

L(t | a, λ, b) = exp( −(λ/a) ((b + t)^a − b^a) ),    t ≥ 0,

where a ≤ 1, λ > 0 and b ≥ 0. The family includes three subfamilies, separated by the index parameter a. For a = 0, the family is defined only for b > 0 through the limit of its Laplace transform, lim_{a→0} L(t | a, λ, b) = (b / (b + t))^λ. It consists of the Gamma distributions GG(0, λ, b) with shape parameter λ and inverse scale b.
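The Laplace transform above is easy to evaluate numerically, including the a → 0 Gamma limit. A small sketch (the function name is ours):

```python
import math

def gg_laplace(t, a, lam, b):
    """Laplace transform of a GG(a, lam, b) distribution.

    For a = 0 the limiting Gamma form (b / (b + t)) ** lam is used;
    otherwise L(t) = exp(-(lam / a) * ((b + t) ** a - b ** a)).
    """
    if a == 0.0:
        return (b / (b + t)) ** lam
    return math.exp(-(lam / a) * ((b + t) ** a - b ** a))
```

Setting a = 1/2 recovers the inverse Gaussian subfamily and a < 0 the compound Poisson-Gamma subfamily discussed below.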
For a < 0 and b > 0, the family is a convolution of a Poisson distributed number of Gamma random variables with shape −a and inverse scale b (Aalen, 1992), and its absolutely continuous component has density

q(h | a, λ, b) = exp( −b h + (λ/a) b^a ) (1/h) Σ_{k=1}^∞ [1 / (k! Γ(−ka))] ( −λ / (a h^a) )^k    (1.9)

on (0, ∞); the point mass at zero is given by

q(0 | a, λ, b) = exp( λ b^a / a ),    λ > 0.

For 0 < a < 1, the family is the natural exponential family generated by the positive stable distributions with stable index a; in particular, for λ > 0 and b ≥ 0, the density is given by

q(h | a, λ, b) = −(1/(π h)) exp( −b h + (λ/a) b^a ) Σ_{k=1}^∞ [Γ(ka + 1) / k!] ( −λ / (a h^a) )^k sin(akπ).    (1.10)

Its saddle point approximation is discussed by Hougaard (1986). For a GG(a, λ, b) distribution, all the moments exist for b > 0, and the rth cumulant, r > 1, is given by

κ_r = λ (1 − a)(2 − a) ⋯ (r − 1 − a) b^{a−r}.

The squared coefficient of variation (i.e., the ratio of the variance to the square of the mean) is (1 − a) / (λ b^a). The mean of the distribution is λ b^{a−1} and the variance is (1 − a) λ b^{a−2}. The conditional moments for the continuous part of the distribution in the case a < 0 were considered by Aalen (p. 955, 1992).

Aalen (1992) uses the powerful tool Mathematica to perform numerical studies of these densities and demonstrates the irregularly curved density surfaces for various ranges of parameters. The densities of GG-distributions are generally difficult to compute due to multi-modality, except for special cases such as the Gamma distribution GG(0, λ, b) and the inverse Gaussian distribution GG(1/2, λ, b). In statistical problems involving GG-distributions, it may be difficult to apply density computation oriented methods such as maximum likelihood inference. In analyzing inhomogeneous point patterns with the Cox M models, the MCMC method is the main tool for numerical implementation. This will be used in Chapter 2.
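The moment formulas above can be checked against numerical derivatives of the log Laplace transform: since log E e^{−tX} has derivative −E X and second derivative Var X at t = 0, a finite-difference check (our own, not from the thesis) is immediate:

```python
import math

def gg_log_laplace(t, a, lam, b):
    """log Laplace transform of GG(a, lam, b) for a != 0."""
    return -(lam / a) * ((b + t) ** a - b ** a)

def gg_mean_var_numeric(a, lam, b, h=1e-4):
    """Mean and variance of GG(a, lam, b) by central differences at t = 0."""
    f = lambda t: gg_log_laplace(t, a, lam, b)
    mean = -(f(h) - f(-h)) / (2.0 * h)             # -d/dt log L(0) = E X
    var = (f(h) - 2.0 * f(0.0) + f(-h)) / h ** 2   # d^2/dt^2 log L(0) = Var X
    return mean, var
```

This kind of check is useful precisely because the densities (1.9)-(1.10) are awkward to evaluate while the transform is not.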
Poisson Process Representations

Any independent increment random measure M(du) on U can be uniquely decomposed into the following form (see Theorem 6.3.VIII, Daley and Vere-Jones, 1988),

M(du) = φ(du) + ∫_0^∞ h N*(du dh) + Σ_{j=1}^∞ W_j δ_{u_j}(du).    (1.11)

Here the sequence {u_j} enumerates the countable set of fixed atoms of M, δ_u is the Dirac measure at u, and {W_j} is a sequence of mutually independent nonnegative random variables determining the masses at these atoms. φ(du) is a fixed diffuse boundedly finite measure on U and N*(du dh) is a Poisson process on U*, independent of {W_j}. The intensity measure ℓ of the process may be unbounded on sets of the form B × (0, ε), but satisfies (1.6) for every bounded Borel set B, in which ℓ({u} × (0, ∞)) = 0 for all u ∈ B.

The Poisson process N* in (1.11) can be viewed as a generalized marked Poisson process on U with marking space R_+. In particular, the GG-measures can be represented as, or approximated by, marked Poisson point processes. For a probability space (Ω, ℱ, P), a random measure M is said to have a fixed atom u if P(M({u}) > 0) > 0, and is diffuse if M({u}) = 0 for every u ∈ U (p. 4, Karr, 1991). The following proposition shows that a GG-measure is almost surely purely atomic.

Proposition 1. For a GG-measure M with Lévy measure ℓ given by (1.7), M is almost surely purely atomic, and has no fixed atoms if and only if λ(du) is diffuse.

Proof. The following proof can be extended to any random measure M having the Laplace functional (1.5) without the drift φ and with its Lévy measure ℓ satisfying (1.6).

Let N' be the Poisson process on U* with intensity measure ℓ(du dh) satisfying (1.6). Consider the random measure M' on U given by

M'(du) = ∫_0^∞ h N'(du dh),

which is almost surely purely atomic.
For a bounded set B ∈ 𝒰, the Laplace transform of M'(B) is then, for t ≥ 0,

E(e^{−t M'(B)}) = E exp( −∫_U ∫_0^∞ t h 1_B(u) N'(du dh) )
= E exp( −∫_0^∞ t h N'(B × dh) )
= exp( ∫_0^∞ (e^{−th} − 1) ℓ(B × dh) )
= exp( ∫∫_{U*} (e^{−th 1_B(u)} − 1) ℓ(du dh) ),

where the third equality follows from Campbell's theorem; see Kingman (p. 82, 1993) for a constructive proof. By taking limiting simple function approximations for all compactly-supported measurable functions f on U, it follows from the dominated convergence theorem and the integrability condition (1.6) that the Laplace functional of M' is as follows:

E(e^{−M'[f]}) = E exp( −∫_U ∫_0^∞ f(u) h N'(du dh) ) = exp( ∫∫_{U*} (e^{−h f(u)} − 1) ℓ(du dh) ).

Thus, M and M' have the same Laplace functional and consequently follow the same distribution. The fact that a GG-measure M is almost surely purely atomic, and has no fixed atoms if and only if λ(du) is diffuse, follows from the result (1.11). □

Later we use the so-called Inverse Lévy Measure (ILM) algorithm [2] to construct a generalized marked Poisson process representation of GG-measures. Let Π(du) be a Borel probability measure on U and assume that λ(du) has a density λ(u) with respect to the measure Π(du). Let ℓ(u, h) denote the density of the measure ℓ(du dh) in (1.7) with respect to the measure Π(du) dh. Let {U_j} be independent identically distributed from the probability distribution Π(du), and let t_j ≥ 0 be the successive jump times of a unit rate Poisson process. Set

τ_u(h) := ∫_h^∞ ℓ(u, x) dx    (1.12)

and, for each j = 1, 2, ..., define

H_j := inf{ h ≥ 0 : τ_{U_j}(h) ≤ t_j }.    (1.13)

Then, for a compactly-supported bounded measurable function f(u), the random measure given by

M[f] := Σ_{j<∞} H_j f(U_j)

has the Laplace functional of the form (1.5) with Lévy measure given by (1.7). In particular, for a bounded Borel B in 𝒰, we have

M(B) = Σ_{j<∞} H_j δ_{U_j}(B).    (1.14)

[2] Wolpert and Ickstadt (1998a) were the first to propose this algorithm.

The number of terms in (1.14) depends on the index parameter function a(u).
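For the Gamma case a(u) ≡ 0 with constant λ and b on U = [0, 1] (and Π uniform), τ_u(h) = λ ∫_h^∞ x^{−1} e^{−bx} dx, and the ILM construction can be sketched in a few lines. The quadrature scheme, the bisection tolerance and the function names below are our own illustrative choices, not the thesis's implementation:

```python
import math
import random

def tau(h, lam, b, n=500):
    """tau(h) = lam * integral_h^inf x^{-1} e^{-b x} dx  (Gamma case a = 0).

    The substitution x = h * e^y gives lam * integral_0^inf exp(-b h e^y) dy,
    a smooth bounded integrand handled here by the trapezoid rule.
    """
    if b * h >= 40.0:
        return 0.0                           # tail is numerically zero
    y_max = math.log(40.0 / (b * h))         # integrand ~ e^{-40} beyond this
    step = y_max / n
    total = 0.5 * (math.exp(-b * h) + math.exp(-40.0))
    for i in range(1, n):
        total += math.exp(-b * h * math.exp(i * step))
    return lam * step * total

def ilm_atoms(lam, b, n_atoms, rng):
    """First atoms (location, jump) of a GG(0, lam*du, b) measure on [0, 1]:
    invert tau at the successive arrival times of a unit-rate Poisson process."""
    atoms, t = [], 0.0
    for _ in range(n_atoms):
        t += rng.expovariate(1.0)            # next jump time t_j
        u = rng.random()                     # location U_j ~ Pi = Uniform[0, 1]
        lo, hi = 1e-30, 40.0 / b
        for _ in range(120):                 # bisection for tau(h) = t
            mid = math.sqrt(lo * hi)         # geometric mean: h spans many scales
            if tau(mid, lam, b) > t:
                lo = mid
            else:
                hi = mid
        atoms.append((u, hi))
    return atoms
```

Because τ is decreasing, later arrival times produce smaller jumps, which is what makes a finite truncation of the sum (1.14) effective in practice.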
The following proposition shows that the sum (1.14) consists of infinitely many terms when 0 ≤ a(u) < 1 on U. For a random measure M, a Borel B ∈ 𝒰 and ε > 0, define N_ε(B) to be the cardinality of the set {u ∈ B : M({u}) ≥ ε}, i.e., the number of atoms of M in B with mass at least ε. Denote by S_M(B) the support of M within B.

Proposition 2. For any random measure M(du) with Lévy measure ℓ, N_ε is a Poisson process on (U, 𝒰) with finite intensity measure ℓ_ε(·) := ℓ(· × [ε, ∞)). In particular, if M(du) is a GG-measure with a(u) < 0 for all u in U, then, for any compact B in 𝒰, the number of jumps in S_M(B) is finite; for 0 ≤ a(u) < 1 for all u in U and any Borel B in 𝒰, S_M(B) is dense in the support of λ.

Proof. For each ε > 0 and with the notation in (1.11), N_ε can be written as N_ε(B) = N*(B × [ε, ∞)), which is a Poisson process with intensity ℓ_ε. Define N_0 as the almost sure limit of the process N_ε as ε → 0. Recall that a(u) and b(u) are continuous, so they are bounded on a compact set B. If a(u) < 0 on U and b(u) > 0, then a(u) ≤ −δ and b(u) ≥ δ for some δ > 0. Thus

lim_{ε→0} ℓ_ε(B) = lim_{ε→0} ∫_B ∫_ε^∞ ℓ(du dh) = ∫_B [ b(u)^{a(u)} / (−a(u)) ] λ(du) < ∞,    (1.15)

so the number of atoms of M in B is almost surely finite. If 0 ≤ a(u) < 1 on U, then lim_{ε→0} ℓ_ε(B) = ∞ for every Borel B with λ(B) > 0, so that N_0(S_M(B)) is almost surely unbounded. □

The above proposition, together with the marked Poisson process representation of GG-measures, implies that the number of terms in (1.14) is finite on a compact B on which a(u) < 0, with Poisson intensity measure λ(du) b(u)^{a(u)} / (−a(u)) on U and a Gamma marking distribution of shape −a(u) and inverse scale b(u). When 0 ≤ a(u) < 1, the discrete measure M has dense support on the support of λ for any Borel set B. An approximate Poisson process representation can be simulated by the ILM algorithm. The choice of a finite number J of atoms for the algorithm is discussed by Wolpert and Ickstadt (1998a) and by Brix (1999). The truncation bound given by Brix (1999) can be applied in most applications.
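For constant a < 0, the proposition and the marked Poisson representation give a direct simulation recipe: draw a Poisson number of atoms with mean λ|B| b^a/(−a) and attach independent Gamma(−a, inverse scale b) jumps. A sketch under those assumptions (names ours):

```python
import math
import random

def gg_atoms_negative_index(a, lam, b, area, rng):
    """Jump sizes of a GG(a, lam*du, b) measure with constant a < 0 over a
    region of Lebesgue measure `area`: a Poisson(lam * area * b**a / (-a))
    number of atoms, each with an independent Gamma(-a, inverse scale b) jump."""
    if not (a < 0.0 < b):
        raise ValueError("requires a < 0 and b > 0")
    mean_atoms = lam * area * b ** a / (-a)
    # Poisson draw for the number of atoms (Knuth's method).
    threshold, n, p = math.exp(-mean_atoms), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            break
        n += 1
    # random.gammavariate takes (shape, scale); inverse scale b -> scale 1/b.
    return [rng.gammavariate(-a, 1.0 / b) for _ in range(n)]
```

Summing the jumps gives M(B), whose mean λ|B| b^{a−1} follows from the moment formulas of Section 1.3.1.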
The ILM algorithm implements the sampling of atoms on the partially ordered sampling space, i.e., ⊗_{j=1}^J (u_j, h_j) such that τ_{u_j}(h_j) ≤ τ_{u_{j+1}}(h_{j+1}) for every j < J. When simulating atoms of a GG-measure with index 0 ≤ a(u) < 1, J is used in tabulating τ_u^{−1}, which has no closed form. With the ILM algorithm, the joint distribution of the locations and marks of the atoms is provided by the following theorem. Define

E_{a+1}(h) = ∫_h^∞ x^{−a−1} e^{−x} dx,    (1.16)

from which E_1(h) is the exponential integral function (p. 220, Abramowitz and Stegun, 1964).

Theorem 1.3.1. The joint distribution of locations and jump sizes {(u_j, h_j)}_{j≤J} ⊂ U* in the ILM algorithm has a density with respect to the product measure Π_{j≤J} Π(du_j) dh_j proportional to

exp( −[ λ(u_J) b(u_J)^{a(u_J)} / Γ(1 − a(u_J)) ] E_{a(u_J)+1}(b(u_J) h_J) ) × Π_{j≤J} [ λ(u_j) / Γ(1 − a(u_j)) ] h_j^{−a(u_j)−1} e^{−b(u_j) h_j}

on the set of {(u_j, h_j)}_{j≤J} for which

τ_u(h) := [ λ(u) b(u)^{a(u)} / Γ(1 − a(u)) ] E_{a(u)+1}(b(u) h)

satisfies τ_{u_1}(h_1) ≤ ⋯ ≤ τ_{u_J}(h_J).

Proof. For a < 1 and b > 0, by a change of variable,

E_{a+1}(b h) = b^{−a} ∫_h^∞ x^{−a−1} e^{−b x} dx.

It follows from the definition in (1.12) that

τ_u(h) = [ λ(u) / Γ(1 − a(u)) ] ∫_h^∞ x^{−a(u)−1} e^{−b(u) x} dx = [ λ(u) b(u)^{a(u)} / Γ(1 − a(u)) ] E_{a(u)+1}(b(u) h),

which is decreasing in h, so its inverse function τ_u^{−1}(t) exists for each fixed u. Thus the greatest lower bound in (1.13) is attained, and we can define

h := τ_u^{−1}(t) = [1 / b(u)] E_{a(u)+1}^{−1}( Γ(1 − a(u)) t / (λ(u) b(u)^{a(u)}) ).

Consider a sequence T_1, T_2, ... of independent exponential random variables, i.e., the waiting times between the successive jumps of the standard Poisson process. Then H_j = τ_{U_j}^{−1}(T_1 + ⋯ + T_j) for each j ≤ J. Thus the theorem follows by a change of variables. □

For a special case of the above theorem, in which the index function of the GG-measure is constant zero, see Corollary 2 to Theorem 2 in Wolpert and Ickstadt (1998a).

1.3.2 Intensity Models and Modeling Objectives

Generally, the intensity functions in (1.3) used for modeling point patterns are complex mechanisms with unknown parameter functions.
The kernel and parameter functions are parameterized with problem-specific unknown quantities θ. We suppress the display of θ in k_θ(s, u), a_θ(u), λ_θ(du) and b_θ(u) for the rest of this chapter.

Covariates and Intensity Models

For some applications, covariates x(s) are available to be included in the analysis, and the intensity model in (1.3) may be treated as a "baseline" intensity Λ_0(s) of the point pattern. The Poisson intensity can be modulated locally by the covariate factor ξ(s) = x^T(s) β, with covariate coefficients β. We adopt the multiplicative incorporation of collateral information into the analysis of intensities for a different class of spatial Poisson models. In this case, the intensity function Λ(s) for point patterns may be factored into two factors, i.e.,

Λ(s) = exp(ξ(s)) Λ_0(s).    (1.17)

The multiplicative model with the exponential link has the merits of simplicity and of positivity being ensured when further parameterizing the model (Diggle, 1990; Ickstadt and Wolpert, 1996). Often in applications, the covariates x(s) are determined only on lattice cells of the state space S, and one takes x(s) = x(s_i) for all s in the lattice cell with center s_i. In this case, on the cell with center s_i, ξ(s) = x_i^T β. For the intensity model (1.17), the covariate factor ξ(s) can be incorporated into the kernel, k(s, u) = exp(ξ(s)) k_0(s, u), with k_0 as the baseline kernel corresponding to the baseline intensity Λ_0(s).

The mean intensity surface in (1.3) with a GG-measure M(du) is given by

E Λ(s) = ∫_U k(s, u) b(u)^{a(u)−1} λ(du),

and the independent increment property of GG-measures leads to the covariance function

γ(s, s') = ∫_U k(s, u) k(s', u) (1 − a(u)) b(u)^{a(u)−2} λ(du).    (1.18)

This can also be derived by the Campbell theorem; see Brix (1999).

Inference on Intensity Models

With the observed point patterns {S_n}, there is a general interest in the spatial correlation, which is specified through a parameterized kernel, say k(s, u).
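Returning briefly to the lattice-cell form of the multiplicative model (1.17): with cellwise-constant covariates, each cell's baseline intensity is simply scaled by exp(x_i^T β). The helper below is a hypothetical illustration of that bookkeeping (names and numbers ours):

```python
import math

def modulated_intensities(baseline, covariates, beta):
    """Cellwise intensities Lambda_i = exp(x_i^T beta) * Lambda0_i, cf. (1.17).

    `baseline` lists Lambda0_i per cell; `covariates` lists the x_i vectors.
    """
    out = []
    for lam0, x in zip(baseline, covariates):
        xi = sum(xj * bj for xj, bj in zip(x, beta))   # xi = x_i^T beta
        out.append(math.exp(xi) * lam0)                # positivity is automatic
    return out
```

The exponential link guarantees positive intensities for any value of β, which is the practical merit noted above.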
Another aspect of spatial dependence is locality, which is explained by the distribution of a GG-measure M_θ(du) and the covariate coefficients β. Inference on the intensity model is carried out through estimation of the involved parameters, estimation of the intensity at the observed locations, and prediction of the intensity at unobserved locations. The stochastic view of the inference problem is to treat the point patterns {S_n} as part of one realization of the spatial Poisson process N(ds) with the given intensity measure Λ(ds), while the unobserved part (of the same realization) is predicted using the point pattern data, location-dependent covariates and the correlation structure specified by the model. This has proved to be very useful as a surrogate for the analyst's lack of knowledge about the complex system generating {S_n}.

When modeling point patterns, uncertainty analysis is often of main interest. One concern is spatial homogeneity, which involves invariance properties such as location-transformation, rotation, scaling, and location-dependent covariates. Another concern is spatial trends and spatial patterns, such as clustering and aggregation. A third concern lies with the spatial correlation. Traditionally, stationary random fields together with other special conditions are assumed; inference in spatial statistics is based on a combination of ad hoc nonparametric techniques, such as distance-based methods and second-order methods, kernel smoothing (Silverman, 1986), maximum-likelihood parameter estimates and associated approximations, and simulation-based estimation and testing. With constant index and scale parameters, a GG-measure is stationary if and only if its control measure λ(du) is proportional to the Lebesgue measure on (U, 𝒰); see Brix (1999).
The inference on inhomogeneous Cox GG models is difficult in part because of its numerical implementation. The emergence of MCMC algorithms (Gelfand and Smith, 1990a; Tierney, 1994; Gilks, Richardson and Spiegelhalter, 1996) has made it practicable, and increasingly common, to apply Bayesian statistical methods to spatial analyses in which the full conditional distributions required for the Gibbs sampling approach are available. The posterior samples are obtained from the stabilized Markov chain in an MCMC scheme. One realization of this simulated chain can be regarded as a dependent, approximate sample from the posterior, and various posterior inferences can be drawn from an empirical data analysis of this sample. The estimates of parameters and the predictions of Λ(s) are obtained as posterior sample means. A visual appreciation can be obtained from the posterior empirical distributions (histograms), as will be discussed in Chapter 2.

The implementation of MCMC schemes sometimes turns out to be difficult because of the difficulty of sampling from the full conditional distributions. An important MCMC technique is data augmentation (Tanner and Wong, 1987). When direct sampling from the conditional distributions for certain parameters is difficult, data augmentation may be needed; it will be demonstrated later in context. Another useful technique is the Metropolis-Hastings (MH) algorithm (see Smith and Roberts, 1993), which makes it possible to sample from complicated full conditional distributions. The MH scheme is not only straightforward to implement but is also a general-purpose method, which makes it suitable for handling almost any form of conditional distribution under general assumptions.

Chapter 2. Joint Inference on Counts and Binary Random Fields

In this chapter, a simulation study is performed for the joint model of count data and binary clipping data.
In Section 2.1, we continue the work on the two-stage Cox model with a Gamma measure (Wolpert et al., 1998a) and with a particular data structure (count and indicator variables). The data structure is more fully described in Section 2.2; it consists of count data on a subset of the cells of a regular lattice and indicator variables for counts in excess of a cutoff point on the other cells. In Section 2.3, we develop the full conditional distributions needed for an MCMC implementation. Section 2.4 gives the results of the MCMC implementation based on specific data values.

2.1 Model Specification and Representation

Consider a Poisson process N(ds) on S ⊂ R² with the intensity defined as in (1.3), in which M(du) is a GG(0, Λ(du), b(u)) measure with parameter functions given by

Λ(du) = λ du,  b(u) = exp(−b^T u/θ²),

where θ > 0 and b ∈ R². Take the kernel to be

k_θ(s, u) = (1/(πθ²)) exp(−||s − u||²/θ²),    (2.1)

a two-dimensional stationary Gaussian kernel, normalized by the condition ∫_{R²} k = 1 and supporting a range of possible spatial correlations. With the above choices, the mean intensity function is given by

EΛ(s) = ∫_U (1/(πθ²)) exp(−||s − u||²/θ²) b(u)^{−1} Λ(du)
      = λ exp((1/θ²)(b^T s + ||b||²/4)) ∫_U (1/(πθ²)) exp(−(1/θ²)||s + b/2 − u||²) du
      = cλ exp((1/θ²)(b^T s + ||b||²/4)),

where c = c(θ, U). It is seen that the mean function has a directional trend. For example, for b = (b1, 0)' with b1 > 0, the mean function has an "eastness" trend. The covariance function of the intensity random field Λ(s) is given by

γ(s, s') = ∫_U k(s, u) k(s', u) b(u)^{−2} Λ(du) − EΛ(s) EΛ(s')
        = λ exp(−(1/(2θ²))(||s − s'||² − 2b^T(s + s') − ||b||²)) ∫_U (1/(πθ²))² exp(−(2/θ²)||u − (s + s' + b)/2||²) du − EΛ(s) EΛ(s')
        = (cλ/(2πθ²)) exp(−(1/(2θ²))(||s − s'||² − 2b^T(s + s') − ||b||²)) − c²λ² exp((1/θ²)(b^T(s + s') + ||b||²/2)),    (2.2)

so that the covariance function consists of a stationary part and a directional part.
Point Pattern Voronoi Tessellations

In spatial data analysis, Voronoi tessellation provides a way of translating the state space into spatial attributes, which can in turn be characterized through parameter functions or unknown quantities for modeling inference. Voronoi tessellation is a method of partitioning a (state) space into convex polygonal regions; divisions of the plane are most often discussed. The Voronoi cell of a spatial object is the set of all points in the state space closer to that object than to any other object in the set.

[Figure 2.1: Homogeneous Poisson Voronoi tessellation]

The set of Voronoi cells for a set of spatial objects, called the Voronoi diagram (also known as Dirichlet or Thiessen polygons), provides a partition of the space according to the spatial structure of the objects. For a theoretical treatment of Voronoi tessellations and mathematical proofs concerning their properties, refer to Møller (1994) and Okabe et al. (2000); the latter follows an application-oriented approach. In a two-dimensional Euclidean space, Voronoi cells are convex polygons whose edges are half-way between two points in U and perpendicular to the line segment joining them. Green and Sibson (1978) provide an efficient computer algorithm for constructing a Voronoi tessellation from a given set of points. Okabe et al. (2000) compare the Green-Sibson algorithm to alternative construction methods. The Voronoi tessellation generated from a random point pattern is a random partition of the state space (Figure 2.1). Miles (1972) gives an excellent survey for the case where a Poisson process generates the Voronoi tessellation. For a homogeneous planar Poisson-Voronoi tessellation of rate λ, the expected number of sides (or edges) of an arbitrarily chosen cell (e.g., the cell containing the origin) is 6, and its expected area is 1/λ.
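The Poisson-Voronoi mean-cell-area fact above can be checked numerically. The following is a minimal sketch, not from the dissertation: it estimates each cell's area by assigning a dense grid of points to its nearest seed, so that the owned fraction of the unit square approximates the cell area; the rate and grid resolution are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Homogeneous Poisson pattern of (illustrative) rate lam on the unit square
lam = 150.0
n = rng.poisson(lam)                     # number of seed points
seeds = rng.random((n, 2))

# Monte Carlo Voronoi areas: each grid point belongs to its nearest seed
g = np.linspace(0.0, 1.0, 100)
gx, gy = np.meshgrid(g, g)
grid = np.column_stack([gx.ravel(), gy.ravel()])
d2 = ((grid[:, None, :] - seeds[None, :, :]) ** 2).sum(axis=2)
areas = np.bincount(d2.argmin(axis=1), minlength=n) / grid.shape[0]

mean_area = areas.mean()                 # close to the theoretical value 1/lam
```

Because the estimated areas partition the unit square, their average is exactly 1/n, which is close to 1/λ when n ≈ λ.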
The connection between the intensity and tessellation cell attributes thus provides a way of representing a point pattern. This can be extended to inhomogeneous point processes, in which the intensity measure is characterized through the parameter functions. Probably the most useful application of the Voronoi model lies in the compilation of intensity surface maps from point data. As the area A of a cell estimates the inverse of the local intensity, assigning 1/A to the point location as the local intensity, the map can be easily compiled with the aid of any kind of (either deterministic or stochastic) interpolation method.

We will use the lattice or gridding tessellation of point patterns, where the corresponding spatial models are then specified based on count data instead of point pattern data directly. We shall take a state space S of rectangular shape with length l1·w and width l2·w, where w is the side length of a lattice cell. The number I = l1 × l2 is the total number of lattice cells. A cell is referred to as C_i, i = (i1, i2). The center of a lattice cell C_i, used for the necessary approximations, is ((i1, i2) − 1/2)w, and the area of a lattice cell is w². Each sampling window A_i can be considered as a union of cells C_i with generic location s_i, and A = ∪_i A_i. Similarly, a reference space U ⊃ S, indexed by J and with generic locations u_j, is defined in an analogous way for correcting the edge effect. In general the sampling windows may be of any shape or size; we shall take the sampling windows as unions of neighboring square lattice cells (quadrats) for convenience.

Data Representation and Approximation

Tessellation is an effective way of approximating the sample paths of a stochastic process on the state space. However, the accuracy of the approximation depends highly on the smoothness of the sample paths, and the smoothness of the parameter functions is problem specific.
For deterministic kernels k, Marcus and Rosinski (2000) derived explicit and easily verifiable sufficient conditions for the almost sure continuity of the stochastic integrals (1.3) directed by the GG-measures. Tessellation is used to provide a data-driven approximation of parameter functions or unknown quantities in the spatial models. For smooth parameter functions, we may integrate the parameter functions over the tessellation cells or evaluate them at any point in the tessellation cell. In this study, we follow the generic (cell center) location approximation for the smooth parameter function b(u) and the random intensity function Λ(s). Based on the above gridding tessellation scheme, we have

λ_j = λw²,  b_j = exp(−u^T b/θ²), u ∈ C_j, ≈ exp(−u_j^T b/θ²),  k_ij^θ ≈ k_θ(s_i, u_j).

With θ = (θ, λ, b) ~ π(θ), a lattice model is given by

[{M_j} | θ] ~ ∏_j GG(0, λ_j, b_j),
Λ_i = Σ_j k_ij^θ M_j,    (2.3)
[{N_i} | {Λ_i}] ~ ∏_i Po(Λ_i).

With known counts {N_i}, Ickstadt et al. (1996) show that the features (including mean vectors and covariance matrices) of the joint distribution of the random variables {M_j} and {N_i} conditional on θ ∈ Θ can be studied from their Laplace exponents, i.e., the negative logarithms of their Laplace transforms,

−log E_θ(exp(−Σ_j y_j M_j)) = Σ_j λ_j log(1 + y_j/b_j),
−log E_θ(exp(−Σ_i y_i N_i) | {M_j}) = Σ_{i,j} (1 − e^{−y_i}) k_ij M_j,    (2.4)
−log E_θ(exp(−Σ_i y_i N_i)) = Σ_j λ_j log(1 + Σ_i (1 − e^{−y_i}) k_ij / b_j).

2.2 Counts and Binary Data

In the rest of this chapter, we illustrate inference for partially censored data on a lattice. The data consist of counts N = {N_i} for each lattice cell within the sampling windows, and binary variables Z = {Z_i}, where Z_i = I{N_i^z ≥ n_0}. Here n_0 is the level of censoring, and the unobserved counts N_i^z are assumed to be on the same path as the observed counts N_i. In the simulation of Section 2.4, the exact locations of the events are known.
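The first Laplace exponent in (2.4) can be checked by simulation. The sketch below (illustrative parameter values, not the dissertation's) draws one Gamma-distributed cell mass M_j and compares the empirical Laplace exponent with the closed form λ_j log(1 + y/b_j).

```python
import numpy as np

rng = np.random.default_rng(1)

# One cell mass M_j ~ Gamma(shape lambda_j, rate b_j); values are illustrative
lam_j, b_j, y = 2.5, 1.5, 0.7
m = rng.gamma(lam_j, 1.0 / b_j, size=200_000)

# Empirical vs analytic Laplace exponent, first line of (2.4)
empirical = -np.log(np.mean(np.exp(-y * m)))
analytic = lam_j * np.log1p(y / b_j)
```

With 200,000 replicates the two quantities agree to about three decimal places.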
The counts in lattice cells are represented by conditionally independent Poisson random variables N^+ = {N_i} ∪ {N_i^z}, with N_i^z denoting the count in a binary observation cell C_i. The Poisson means are the linear combinations (approximations) in (2.3) of the unobserved independent Gamma-distributed random measure values M = {M_j}, with uncertain nonnegative coefficients K_θ = {k_ij^θ}. Uncertainty about the parameters of the Gamma distributions of the measures M_j ~ GG(0, λ_j, b_j), and about the values or collateral dependence of the coefficients k_ij^θ, is modeled through the dependence of k_ij^θ, λ_j = λ_j^θ and b_j = b_j^θ on an unknown parameter θ ∈ Θ, regarded as a random variable taking values in Θ ⊂ R^p for some p, with density π(θ) with respect to Lebesgue measure dθ. Dependence on θ is indicated by sub- or superscripts. Given the random measure M, the N^+ are conditionally independent Poisson random variables with means Λ = {Λ_i}. It follows from (2.4) that unconditionally the {N_i} are dependent, distributed as sums of independent negative-binomial random variates. Their means and covariances are given by

E_θ N_i = Σ_j k_ij^θ λ_j / b_j,
Cov(N_i, N_i') = Σ_j (δ_ii' + k_ij^θ/b_j^θ)(k_i'j^θ/b_j^θ) λ_j^θ,    (2.5)

where δ_ii' is the Kronecker symbol, 1 if i = i' and 0 otherwise. From (2.5) we see that N_i and N_i' are uncorrelated (and in fact independent) only if k_ij^θ k_i'j^θ = 0 for all j and each i ≠ i', that is, only if the coefficient sequences k_i^θ and k_i'^θ are orthogonal in the weighted sequence space with weights λ_j^θ/(b_j^θ)².

2.3 Joint Inference on Counts and Binary Random Fields

2.3.1 Graphical Representation

The specification of spatial models may become complicated when introducing many parameters, especially in the presence of data augmentation and collateral stochastic processes. Gilks et al. (1996) develop a graphical representation, called a directed acyclic graph (DAG), to describe the conditional independence of model components. The DAG representation of the model under consideration in this chapter is shown in Figure 2.2.
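Before turning to the sampler, the mean formula in (2.5) can be verified by forward simulation of the lattice model (2.3). The sketch below uses a tiny illustrative grid and hypothetical kernel weights, not the dissertation's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Tiny illustrative lattice: I observation cells, J reference cells
I, J = 4, 6
k = rng.random((I, J))                   # hypothetical kernel weights k_ij
lam = np.full(J, 2.0)                    # Gamma shapes lambda_j
b = np.full(J, 1.5)                      # Gamma rates b_j

R = 100_000                              # independent replications
M = rng.gamma(lam, 1.0 / b, size=(R, J)) # M_j ~ GG(0, lambda_j, b_j)
Lam = M @ k.T                            # Lambda_i = sum_j k_ij M_j
N = rng.poisson(Lam)                     # N_i | Lambda_i ~ Po(Lambda_i)

emp_mean = N.mean(axis=0)
exact_mean = k @ (lam / b)               # (2.5): E N_i = sum_j k_ij lam_j / b_j
```

The empirical means match the closed form to well within Monte Carlo error; the same simulation could be used to check the covariance formula.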
The statement "θ influences M" describes θ as being a parent of M, as shown by the arrow from θ to M. The DAG makes statements about a sequence of conditional distributions: each node is independent of all its other ancestors given its parents. Types of nodes can be differentiated (Gilks et al., 1996). We shall use squares for the observed quantities N and Z, and circles for the unknown quantities θ and M; in addition, dotted circles denote the augmented data nodes N^z and V. The model dependence structure and assumptions can be readily derived from the directed acyclic graph, and the implementation of an MCMC scheme will accordingly visit each node of the DAG. For simplicity we leave the implicit functional relationships among the unknown quantities out of the DAG scheme, although they are important in deriving full conditional distributions. Several DAGs related to intensity models of spatial point processes are given in the next chapter.

The implementation of MCMC inference below requires samples from the full conditional distribution of the random measure M, given θ and the observed counts N = n, a mixture distribution from which direct sampling is cumbersome.

[Figure 2.2: DAG of counts and binary random fields]

Therefore, we introduce latent "augmentation" variables V = {V_ij}, (i, j) ∈ I × J, which can be sampled easily and which lead to a mixture-model representation of model (2.3) and to the conjugate (Gamma) full conditional distribution for M. More specifically, we condition on M = m, N = n and Z = z, where Z_i is the indicator variable and n_0 is assumed to be a constant. The augmented counts {N_i^z} corresponding to the binary variables {Z_i} are conditionally independent Poisson random variables with means {Λ_i}. Set A_ij = k_ij m_j, Λ_i = Σ_j A_ij and p_ij = A_ij/Λ_i, and let the {V_ij} have independent multinomial distributions MN(n_i^+, p_i).
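One augmentation draw can be sketched directly from the definitions above. All numeric values below are illustrative, not from the simulation study.

```python
import numpy as np

rng = np.random.default_rng(3)

# One augmentation draw for a single cell i with J = 3 measure cells:
# A_ij = k_ij m_j, p_ij = A_ij / Lambda_i, V_i ~ MN(n_i, p_i)
k_i = np.array([0.6, 0.3, 0.1])          # kernel weights k_ij (hypothetical)
m = np.array([2.0, 1.0, 4.0])            # current measure values m_j
A = k_i * m                              # A_ij
p = A / A.sum()                          # p_ij, summing to one
n_i = 12                                 # count to be split among the cells

v = rng.multinomial(n_i, p)              # V_ij: portion of n_i per cell j
```

By construction the components of v are nonnegative and sum to n_i, which is exactly what makes the Gamma full conditional for M conjugate.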
We might regard the random variates V_ij as the portions of the count n_i attributable to the j-th measure cell, j ∈ J.

2.3.2 MCMC Scheme

We now study the (analytically intractable) posterior distribution of the uncertain quantities θ and M by simulating steps of an ergodic Markov chain with the posterior as its stationary distribution. The joint distribution of all uncertain quantities is given by the following theorem.

Theorem 2.3.1. The joint distribution of θ, M, V, N, N^z and Z has a density with respect to the product of counting measure for the discrete variables and Lebesgue measure for the continuous variables, given by

[θ, M, V, N, N^z, Z] = π(θ) ∏_{j∈J} (b_j^{λ_j} m_j^{λ_j + v_{·j} − 1} / Γ(λ_j)) exp(−(b_j + k_{·j}) m_j) × ∏_{i,j} (k_ij^{v_ij} / v_ij!) × ∏_i I{z_i = I(n_i^z ≥ n_0)},

where v_{·j} = Σ_i v_ij and k_{·j} = Σ_i k_ij, and where p_i^z = P(N_i^z ≥ n_0) denotes the Poisson tail probability used in the corollaries below.

Proof. With the adopted notation, and from the directed graph representation, we have the following factorization under the considered model:

[θ, M, V, N, N^z, Z] = [θ][M|θ][V|θ, M, N, N^z, Z, n_0][N|θ, M][N^z|θ, M, Z, n_0][Z|θ, M, n_0].

Note that [V|θ, M, N, N^z, Z] = [V|θ, M, N, N^z], in which Z is redundant, is a product of multinomial distributions; [N|θ, M] and [N^z|θ, M] are products of Poisson distributions; and [Z|θ, M] can be evaluated by computing the tail probabilities of a Poisson distribution. Direct calculation leads to the result. □

In the above theorem we consider n_0 to be a fixed value; in practice, it may be estimated from the available count data and collateral information.

Corollary 1. The conditional distribution of the augmented data vector (V, N^z), given the observed counts n and z, the signal measure M and the parameter θ, is the product of multinomials with parameters n_i^+ and p_i, independent for i ∈ I, and of the censored Poisson laws [N_i^z = n_i^z | θ, M, z], i.e.,

[V = v, N^z = n^z | n, z, m, θ] = ∏_{i∈I} MN(n_i^+, p_i) × ∏_i (Λ_i^{n_i^z} exp(−Λ_i) / (n_i^z! q_i)) I{z_i = I(n_i^z ≥ n_0)},

where q_i = (p_i^z)^{z_i}(1 − p_i^z)^{1−z_i} and the second product runs over the binary observation cells.

Corollary 2.
The conditional joint distribution of M given N, Z, V and θ is again a product of Gamma components, independent for j ∈ J, with parameters λ_j + v_{·j} and b_j + k_{·j}, i.e.,

[M | θ, v, n, z] = ∏_j GG(0, λ_j + v_{·j}, b_j + k_{·j}).

Corollary 3. The conditional density of θ given {n_i}, {z_i}, {v_ij} and {m_j} is given by

[θ | M, V, N] ∝ π(θ) ∏_{j∈J} (b_j^{λ_j} m_j^{λ_j + v_{·j} − 1} / Γ(λ_j)) ∏_{i,j} k_ij^{v_ij} exp(−Σ_j (b_j + k_{·j}) m_j).

Proof. Immediate from Theorem 2.3.1 and the definition of the {v_ij}. □

The above corollaries lead directly to the following MCMC scheme. Given a prior density π(θ) on Θ, a parametric kernel {k_ij^θ}, a transition probability kernel Q(θ, dθ') = q(θ, θ') dθ', parameters {λ_j} and {b_j} for the Gamma distributions of the random measures {M_j}, an initial point θ^0 ∈ Θ and nonnegative integers {v_ij^0}, we can generate successive points, starting at t = 0, as follows.

Step I: Gibbs step to update the random measure. Given {n_i}, {z_i}, {v_ij^t} and θ^t,
• set λ_j^{t+1} = λ_j^{θ^t} + v_{·j}^t and b_j^{t+1} = b_j^{θ^t} + k_{·j}^{θ^t};
• generate independent M_j^{t+1} = m_j^{t+1} with GG(0, λ_j^{t+1}, b_j^{t+1});
• set A_ij^{t+1} = k_ij^{θ^t} m_j^{t+1}, Λ_i^{t+1} = Σ_{j∈J} A_ij^{t+1}, and p_ij^{t+1} = A_ij^{t+1} / Λ_i^{t+1}.

Step II: Gibbs step to update the augmentation points. Given {n_i}, {z_i}, {m_j^{t+1}} and θ^t,
• for each binary cell i, where the indicator variable Z_i is observed, generate the augmented count N_i^z = n^z with probability

[N_i^z = n^z | ·] ∝ ((Λ_i^{t+1})^{n^z} exp(−Λ_i^{t+1}) / n^z!) I{z_i = I(n^z ≥ n_0)};    (2.6)

• generate {v_ij^{t+1}}_j ~ MN(n_i^+, p_i^{t+1}), independently for i ∈ I.

Step III: Metropolis-Hastings step to update the parameter θ. Given {n_i}, {v_ij^{t+1}}, {m_j^{t+1}} and θ^t,
• generate a new candidate θ' ~ Q(θ^t, dθ');
• calculate the acceptance ratio

P^t = (π(θ') q(θ', θ^t) / (π(θ^t) q(θ^t, θ'))) × ∏_{i,j} (k_ij^{θ'} / k_ij^{θ^t})^{v_ij^{t+1}} × ∏_j ((m_j^{t+1})^{λ_j^{θ'} + v_{·j}^{t+1} − 1} (b_j^{θ'})^{λ_j^{θ'}} Γ(λ_j^{θ^t})) / ((m_j^{t+1})^{λ_j^{θ^t} + v_{·j}^{t+1} − 1} (b_j^{θ^t})^{λ_j^{θ^t}} Γ(λ_j^{θ'})) × exp(−Σ_j (b_j^{θ'} − b_j^{θ^t} + k_{·j}^{θ'} − k_{·j}^{θ^t}) m_j^{t+1});    (2.7)

• set θ^{t+1} = θ' with probability (1 ∧ P^t) and θ^{t+1} = θ^t otherwise.

Step IV: set t ← t + 1 and return to Step I.
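One cycle of Steps I-III can be sketched in a few lines. The sketch below is a minimal illustration under strong simplifying assumptions: λ_j and b_j are held fixed (free of θ), the kernel is a hypothetical one-dimensional Gaussian shape, the prior on log θ is flat, and the steps are reordered (augmentation before the measure update) for convenience of initialization; none of these choices reproduce the dissertation's actual setup.

```python
import numpy as np

rng = np.random.default_rng(4)

I, J = 4, 6
s = np.linspace(0.1, 0.9, I)             # cell centers s_i (1-D for brevity)
u = np.linspace(0.0, 1.0, J)             # reference centers u_j
n = rng.poisson(5.0, size=I)             # stand-in "observed" counts
lam = np.full(J, 1.0)                    # fixed Gamma shapes lambda_j
b = np.full(J, 1.0)                      # fixed Gamma rates b_j

def kernel(theta):
    # hypothetical parametric kernel matrix k_ij(theta)
    return np.exp(-((s[:, None] - u[None, :]) / theta) ** 2)

def log_cond(theta, v, m):
    # Corollary 3 with lambda_j, b_j free of theta and a flat prior on log-theta
    k = np.maximum(kernel(theta), 1e-300)    # floor avoids log(0)
    return (v * np.log(k)).sum() - (k.sum(axis=0) * m).sum()

theta = 0.5
m = rng.gamma(lam, 1.0 / b)
for t in range(200):
    k = kernel(theta)
    # augmentation step: V_i ~ MN(n_i, p_i) with p_ij = k_ij m_j / Lambda_i
    A = k * m
    p = A / A.sum(axis=1, keepdims=True)
    v = np.array([rng.multinomial(n[i], p[i]) for i in range(I)])
    # conjugate Gamma update: M_j ~ Gamma(lambda_j + v_.j, b_j + k_.j)
    m = rng.gamma(lam + v.sum(axis=0), 1.0 / (b + k.sum(axis=0)))
    # Metropolis-Hastings step: symmetric random walk on log(theta)
    theta_new = theta * np.exp(0.2 * rng.standard_normal())
    if np.log(rng.random()) < log_cond(theta_new, v, m) - log_cond(theta, v, m):
        theta = theta_new
```

The conjugacy of the measure update is what makes the scheme practical; only the θ-update requires a Metropolis-Hastings correction.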
2.4 Simulation and Results

2.4.1 Simulation Setup

Consider the model (2.3) with state space [0.2, 1] × [0.2, 1] and reference state space [0, 1.2] × [0, 1.2]. The width of the lattice cell (quadrat) is set to 0.2. The control measure is proportional to Lebesgue measure with proportionality coefficient λ = 1000, the parameter θ = 0.4 models the circular correlation structure, and the spatial trend parameter b^T = (b1, b2) = (1, 0) models "eastness" in this case. One run with the true parameters is performed to obtain raw count data for all 16 lattice cells. We set n_0 = 10, equal to the average of the raw counts. Six cells are randomly selected, and their counts are compared with n_0 to obtain the indicator variables Z_i = I{N_i ≥ n_0}. Hence, the simulated data consist of six 0-1 indicator values and the raw counts for the remaining 10 cells. The simulated data are shown in Figure 2.3, in which the true counts underlying the indicators are shown in parentheses.

[Figure 2.3: Empirical counts plot]

The sensitivity of the model parameters in modeling these data is studied by comparing the three models shown in Table 2.1. Model I has all three parameters unspecified and to be estimated; it is the most general in the sense of catching all of the directional covariance structure implied in (2.2). Model II is a wrong model in this case in that it takes b1 = 0 even though the data were generated from a model with b1 = 1, that is, "eastness" is expected. Finally, Model III is the correct model in that b2 = 0 is assumed, and it is expected to give better estimates and predictions.

Model        Parameters     Estimates
True Model                  (0.4, 1, 0)
Model I      (θ, b1, b2)    (0.93, 0.89, -1.00)
Model II     (θ, 0, b2)     (1.61, -, -0.91)
Model III    (θ, b1, 0)     (1.77, -3.10, -)

Table 2.1: True model and other models

The values in the Estimates column of the table are the posterior sample means corresponding to each model.
For each model we have chosen independent standard normal prior distributions for the parameters θ, b1 and b2, and independent GG(0, λ_j, b_j) distributions for the M_j, with λ_j ≡ λw² constant in j and b_j as defined before. To perform the Metropolis-Hastings step, the transition distributions for b1 and b2 are also normal, with standard deviations ψ1 and ψ2. For the parameter θ, first the transformation ξ = log θ is imposed; then ξ_{t+1} is generated from the proposal density q_ξ(·|ξ_t) = N(ξ_t, ψ3²), which corresponds to the proposal density on θ

q_θ(θ'|θ) = (1/(θ'ψ3√(2π))) exp(−(1/(2ψ3²)) log²(θ'/θ)).

The tuning scale parameters ψ_i control the rate of acceptance P^t in (2.7) of the Metropolis-Hastings step, and a judicious selection is needed for efficiency of the algorithm. If the ψ_i are too small, the rate of acceptance will be high, but the steps such as θ^{t+1} − θ^t will in general be small; such a chain will move very slowly through the support of the posterior distribution. On the other hand, if the ψ_i are too large, the proposal distribution will generate large steps, often from the body to the tails of the posterior distribution, so the rate of acceptance will be small; such a chain will frequently not move, again producing slow mixing. In both cases it would take a large number of iterations for the chain to move through the support of the posterior distribution, making the algorithm inefficient. A rule of thumb used by many practitioners is to select (by experimentation) values of the ψ_i giving an empirical rate of acceptance of about 0.4-0.5. This rule has some theoretical support for the case when the target and proposal distributions are both normal (Gelman et al., 1996). In this study, for Model I we use ψ_i = 1 for all i = 1, 2, 3, while for Models II and III, ψ_i = 2; the empirical acceptance rate was about 0.40.
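The tuning trade-off described above is easy to demonstrate on a toy target. The sketch below (entirely illustrative; a standard normal target stands in for the posterior of ξ = log θ) runs a random-walk MH chain for three proposal scales and records the empirical acceptance rates.

```python
import numpy as np

rng = np.random.default_rng(5)

def acceptance_rate(psi, steps=20_000):
    """Random-walk MH for a standard normal target with proposal sd psi."""
    xi, accepted = 0.0, 0
    for _ in range(steps):
        cand = xi + psi * rng.standard_normal()
        # symmetric proposal: accept with probability min(1, pi(cand)/pi(xi))
        if np.log(rng.random()) < 0.5 * (xi ** 2 - cand ** 2):
            xi, accepted = cand, accepted + 1
    return accepted / steps

rates = {psi: acceptance_rate(psi) for psi in (0.1, 2.4, 20.0)}
# small psi: nearly every move accepted, but the steps are tiny;
# large psi: big proposed jumps, almost all rejected
```

The intermediate scale lands near the 0.4-0.5 acceptance range cited in the text, while the extreme scales give near-1 and near-0 acceptance, respectively.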
Implementation for all models was based on 100,000 MCMC steps; posterior sampling was started at step 90,000, and successive samples were taken every 100 iterations. The Monte Carlo iteration was realized by a C++ program. The judgement as to how close an iterative algorithm is to convergence after a finite number of iterations is difficult; some discussion is found in Kass et al. (1997).

2.4.2 Posterior Analysis

One of our simulation goals is to predict the intensities {Λ_i} for those lattice cells without known count data. The predictions of the intensities are obtained as the corresponding posterior sample means of size 100. The predictions for Model I are shown in Figure 2.4. The smooth prediction surface for the intensities is curved instead of showing a pure "eastness" trend; this is because of the presence of the parameter b2, which measures the "northness" trend. In fact, Model I can explain all directional covariance structure if present. However, the other models did not improve the prediction surfaces in the sense of catching the covariance structure during the experimentation.

[Figure 2.4: Posterior means for intensities over lattice cells]

The following figures are plotted using the function plotdens() from MATLAB's statistics toolbox.

[Figure 2.5: Posterior density for log θ]

Figures 2.5, 2.6 and 2.7 show the prior (solid curves), true value
The estimated density for b2 tends to be flat and suggests a weak “northness” trend that is consistent with the true model. While the above graphical outputs are very suggestive of which models fit well and which do not, a more objective and more quantitative approach, using so called Bayes factors, can be applied (Ickstadt, et al., 1996). This was not done in our example. 36 0.4 0.35 0.3 0.25 0.2 0.15 0.1 0.05 -0.05 37 Figure 2.6: Posterior density for b1 I I 1 r I 1 j I I - I I I - I I prior density : I I _ I I true b1 I I ” I I empirical posterior density I *- I \ . | I _ » ' ' I I A .’ . ' 81.63%..- . o. 3""‘1 “o'v- . -" . ° 1 l l l I l I -15 -10 -5 0 5 10 15 0.4 0.35 0.3 0.25 0.2 0.15 0.1 0.05 -0.05 I empirical posterior density prior density \ I I I I l I I I I I I I : true b2 I I I I l I I I I I -——-v fivvvvvrw'w'vv" _vvv' l l l l v-' va—W -20 -10 0 10 Figure 2.7: Posterior density for b2 38 20 Chapter 3. Models Based on Log Gaussian Random Fields In this chapter we continue to focus on developing models based on inhomoge- neous point patterns. For highly clustered point patterns, it may be more suitable to perform the geostatistical analysis after an appropriate transformation of the count data. The imposed transformation is convenient for factoring the intensity models, and is required to achieve the Gaussian regularity for geostatistical models. This commonly happens to be the log-transformation. More general transformation meth- ods are discussed by Martinez (1997). Geostatistical models for log-counts tend to explain large scale variation rather than small scale variation in point patterns (Heisel et al., 1996). In the geostatistical modeling setting, a log intensity model is suggested by the intensity model (1.17) and is given by 77(8) = 6(8) + {(3) NS) = exp(.3(8))Ao(S) = exp(n(8)) (3.1) where {(5) is a Gaussian random field on S. The direct specification of covariance function of 5(5) is convenient under stationarity. 
For a non-stationary covariance function of ξ(s), we shall instead consider a second order random measure W with control measure Λ on U. Its subordinated random field ξ(s) is given by

ξ(s) := ∫_U k(s, u) W(du),  s ∈ S;    (3.2)

then ξ(s) is diffuse and is well defined on S (see page 10 for the concept of a diffuse measure). Clearly, ξ is a centered second order random field, i.e., Eξ(s) = 0, and its covariance structure can be specified by choosing appropriate kernels and the parameter functions defining the random measure. It is more convenient to work with the kernels k(s, ·) than directly with the covariance function γ(s, s'), for which it might be difficult to ensure symmetry and positive definiteness for all s and s'. Under the condition (1.4), the specification of ξ in (3.2) is valid for an arbitrary choice of kernel, and the derived covariance function γ(s, s') of ξ is positive definite (Higdon et al., 1999). Under the suggested transformation in (3.1), the relation between λ0(s) and ξ(s) is defined through the exponential link function, which is chosen so that the intensity is nonnegative. The covariate factor β(s) is an additive term in the transformed model, and in this setting ξ(s) may also be interpreted as a covariate process for the point patterns.

3.1 Models for Log Counts

3.1.1 Model Specification

Consider the log-count data, denoted by Y = (Y(s_1), ..., Y(s_n))', at the generic locations s_1, ..., s_n. In this section we discuss the model

Y_i = x_i^T β + ξ_i + ε_i,    (3.3)

where ξ_i = ξ(s_i) is from an underlying spatial stochastic process and ε_i = ε(s_i) is from a spatial error process. In spatial data analysis, the spatial variation includes two sources, i.e., spatial uncertainty and measurement error. Measuring heterogeneity is sometimes critical to the success of an error modeling system, especially in the case where the measurements have underlying spatial patterns.
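The kernel construction (3.2) can be sketched on a grid. The following is a minimal one-dimensional illustration (grid, kernel shape and parameter values are all assumptions for the sketch): white-noise increments on a discretized reference space are convolved with a kernel, and the induced covariance matrix is positive semi-definite by construction, as the text asserts.

```python
import numpy as np

rng = np.random.default_rng(7)

# Discretized version of (3.2): xi(s) ~= sum_j k(s, u_j) W_j with
# independent increments W_j ~ N(0, du) on a grid of the reference space U
u = np.linspace(-0.2, 1.2, 141)          # reference grid (extends past S)
du = u[1] - u[0]
s = np.linspace(0.0, 1.0, 50)            # state-space locations
theta = 0.1                              # illustrative kernel bandwidth
K = np.exp(-0.5 * ((s[:, None] - u[None, :]) / theta) ** 2)

W = rng.normal(0.0, np.sqrt(du), size=u.size)
xi = K @ W                               # one centered realization of xi(s)

# Induced covariance gamma(s, s') = sum_j k(s, u_j) k(s', u_j) du:
# symmetric and positive semi-definite for any kernel choice
Gamma = (K * du) @ K.T
```

Extending the reference grid beyond the state space mirrors the edge-effect correction via U ⊃ S discussed in Chapter 2.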
The spatial processes ξ(s) and ε(s) are assumed to be independent, and the Y_i's are thus conditionally independent given ξ(s). In the sequel, ε_1, ..., ε_n are assumed to be i.i.d. normal random variables with mean zero and variance σ², and ξ(s) is a Gaussian random field with zero mean, i.e., Eξ(s) = 0, and with covariance function γ_θ(s, s') = cov(ξ(s), ξ(s')) depending on a parameter θ ∈ Θ. We let γ_s := (γ(s_1, s), ..., γ(s_n, s))^T for any s ∈ S, and let Γ = {γ(s_i, s_j)} denote the n × n covariance matrix at the generic locations. Although not demanded, it is often assumed that ξ(s) has a constant variance τ², i.e.,

γ_θ(s, s') = τ² γ*(s, s'),    (3.4)

where γ* is the correlation function specified by θ*, with θ = (τ, θ*). The mean of Y_i is

μ(s_i) = EY(s_i) = E(E(Y(s_i)|ξ)) = x_i^T β,    (3.5)

where x_i = x(s_i) is the p-dimensional covariate vector. The covariance matrix of Y has entries

cov(Y(s_i), Y(s_j)) = σ² + γ(s_i, s_j) for i = j, and γ(s_i, s_j) for i ≠ j,

and is conveniently written as

cov(Y) = σ²(I + Γ/σ²) = σ²Σ.

The additive measurement error variance σ² is analogous to a nugget effect in the spatial statistics literature. Finally, we denote the n × p covariate matrix by X.

3.1.2 MCMC Inference

In inhomogeneous spatial models, including the non-Gaussian cases, the main difficulty in performing likelihood-based inference and prediction is the shortage of multivariate families of distributions as flexible and operationally convenient as the Gaussian distribution. In most cases the relevant densities are analytically intractable. The most widely implemented methodology for coping with non-Gaussian problems is trans-Gaussian kriging and lognormal kriging (Cressie, 1993), which consist of applying standard Gaussian methods after a marginal non-linear transformation. When applying the log Gaussian model (3.3) for point pattern data analysis, the error term is often omitted (Diggle and Tawn, 1998).
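Before turning to the sampler, the second order structure of model (3.3) can be made concrete numerically. The sketch below (an exponential correlation function and all numeric values are illustrative assumptions) assembles cov(Y) = σ²(I + Γ/σ²) = σ²Σ and draws one realization of Y with X^T β = 0; the diagonal exhibits the nugget structure var(Y_i) = σ² + τ².

```python
import numpy as np

rng = np.random.default_rng(8)

s = np.linspace(0.0, 1.0, 8)                     # generic locations s_1..s_n
tau2, sigma2 = 2.0, 0.5                          # tau^2 and sigma^2
# Gamma = tau^2 * correlation matrix, as in (3.4), with an assumed
# exponential correlation function of range 0.3
Gamma = tau2 * np.exp(-np.abs(s[:, None] - s[None, :]) / 0.3)
Sigma = np.eye(s.size) + Gamma / sigma2
cov_Y = sigma2 * Sigma                           # cov(Y) = sigma^2 * Sigma

# One draw of Y under X^T beta = 0
y = rng.multivariate_normal(np.zeros(s.size), cov_Y)
```

The nugget σ² on the diagonal guarantees that cov(Y) is strictly positive definite even when Γ is nearly singular, which matters for the matrix inversions in the MCMC scheme below.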
In this section we introduce a Bayesian analysis of the log Gaussian model with the error term present. Let ξ = (ξ(s_1), ..., ξ(s_n)), and let ξ* = (ξ(s_{01}), ..., ξ(s_{0m})) be the corresponding values of ξ(s) at the locations for which predictions are required. We then work with the extended model [ξ, θ, β, Y, σ, ξ*] instead of the model [ξ, θ, β, Y, σ], where θ denotes the parameter vector in the covariance structure Γ_θ of ξ(s) (in particular, under constant variance as in (3.4), θ includes the variance τ² and any further parameters in the specification of the correlation structure of ξ), and β consists of all the regression parameters. Our objective is to use MCMC methods to estimate the model parameters θ, β and σ, and to generate samples from the conditional distribution of (ξ, ξ*) given Y under the stated assumptions. This allows us to obtain predictions for any functional L of interest associated with this conditional distribution, while making proper allowance for the uncertainty in the parameter estimates. Under a standard MCMC scheme we need to generate samples from the posterior distribution [(θ, ξ, β, σ)|Y] for inference and from the posterior [ξ*|(θ, ξ, β, Y, σ)] for prediction.

[Figure 3.1: DAG of log Gaussian models]

The implementation of the MCMC scheme requires sampling from the conditional distributions [θ|ξ, β, Y, σ], [β|θ, ξ, Y, σ] and [ξ_i|θ, ξ_{−i}, β, Y, σ], where ξ_{−i} = {ξ(s_j); j ≠ i}. A graphical representation of the dependence structure of the model is shown in Figure 3.1; the model can be factorized as

[θ, ξ, β, Y, σ, ξ*] = [θ][ξ|θ][β][σ][Y|θ, ξ, β, σ][ξ*|θ, ξ, β, Y, σ].    (3.6)

The first objective is inference about the model parameters. We can simplify this structure by dropping the node ξ*; inference then only requires visiting the nodes θ, ξ, β and σ in Figure 3.1. Given ξ, it is clear that θ is conditionally independent of Y, and that β is conditionally independent of θ and σ.
So for inference, a single cycle of the MCMC algorithm involves first updating [θ|ξ], then [ξ|θ, β, Y, σ], [β|ξ, Y, σ] and [σ|ξ, β, Y]. Under the model (3.3), [Y|θ, ξ, β, σ] is the product of normal densities with respective means η_i = x_i^T β + ξ_i and variance σ², say

[Y|θ, ξ, β, σ] = [Y|ξ, β, σ] = ∏_i φ_σ(y_i − η_i),    (3.7)

and [θ|ξ, β, Y, σ] = [θ|ξ] ∝ [ξ|θ][θ]. For any a1 > 0 and a2 ∈ R, the parameter vectors (a1β − a2, a1²τ², θ*, a1c − a2) all yield the same likelihood; in particular, the binary data contain no information about τ². To avoid this nonidentifiability, we fix the parameters τ² = 1 and c = 0 in the rest of this section, so that the unknown model parameters are (β, θ).

3.2.1 Second Order Structure of Binary Random Fields

In the study of the mean and covariance structure of the resulting binary random field, it is helpful to understand the global behavior of the binary realizations and their relation to the parameters of the underlying random field ξ(·). Given (β, θ), the variance var(Z(s)) = p(s)(1 − p(s)) follows from the tail probability p(s) = P(Z(s) = 1) of the underlying Gaussian field. Sampling from truncated normal distributions is a basic ingredient of the MCMC scheme below: if ξ ~ N(μ, σ²) is to be sampled subject to truncation to the interval (a, b), then with U ~ Uniform(0, 1),

ξ = μ + σ Φ^{−1}(Φ((a − μ)/σ) + U [Φ((b − μ)/σ) − Φ((a − μ)/σ)])

has the desired distribution. The ideas are easily extended to give a general account of Bayesian analysis for order-restricted parameters (Gelfand et al., 1990a). A mixed rejection algorithm for sampling from a univariate normal distribution truncated to an interval is given in Geweke (1991).

Under the present framework, first regarding our prior specification, β and θ are assumed to be independent, with β ~ N_p(β_0, B_0^{−1}), where β_0 and B_0 are chosen to reflect prior information about P(Z(s) = 1). In most cases a vague prior distribution may be obtained by setting β_0 = 0 and B_0 = τ_0 I_p with τ_0 small, say 0.05, which assures that the inferences are mainly driven by the likelihood. The prior on θ would depend on the correlation function γ*(s, s'). Suppose that the prior distribution for the model parameters is given by

log[β, θ] = −(1/2)(β − β_0)^T B_0 (β − β_0) + log[θ] + C_0.

Then, for the full conditional distribution of β, again from Figure 3.2,
we have

[β|θ, η, Z] = [β|θ, η] ∝ [η|β, θ][β] = [η − X'β|θ][β]

since conditional on η the data z are redundant. Similarly, for the full conditional distribution of θ,

[θ|β, η, Z] ∝ [ξ|θ, β][θ].

The last two full conditional distributions are of non-standard form and are difficult to sample from directly, so one has to perform a Metropolis–Hastings step. For example, given the current value θ^t, a candidate θ' for the next iteration is generated from a proposal distribution q(θ'|θ^t), and this candidate is accepted as the next iterate with probability

A(θ^t, θ') = min{ [θ'|β, ξ] q(θ^t|θ') / ( [θ^t|β, ξ] q(θ'|θ^t) ), 1 }.

If the candidate θ' is accepted, the next iterate becomes θ^{t+1} = θ'; otherwise it is set to the current state θ^t. The choice of proposal distribution depends on the choice of correlation function r_θ(s, s') and so is problem specific. The MH algorithm also works for updating β.

Starting with initial values (η^0, β^0, θ^0), one cycle (iteration) of this MCMC algorithm consists of successively (and in that order) sampling from the full conditional distributions (3.9) and then performing the MH step described above. The generated sequence, of length m say, is a trajectory of a Markov chain having [η, β, θ|z] as its limiting distribution. Therefore, after discarding an initial burn-in of the chain, say r iterations, long enough to surpass the transient stage of the chain, the subsequent iterates form a dependent sample {(η^t, β^t, θ^t); t = r + 1, …, m} approximately distributed as [η, β, θ|z]. This sample can be used to make inferences about the model parameters and the latent vector η.
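The accept/reject mechanics of the MH step can be sketched as follows. The target below is a stand-in standard normal log density rather than the non-standard conditional [θ|β, ξ], and the step size is an arbitrary illustrative choice; with a symmetric random-walk proposal the q-ratio in A(θ^t, θ') cancels:

```python
import random, math

random.seed(7)

def mh_step(theta, log_target, step=0.5):
    """One Metropolis-Hastings update with a symmetric random-walk
    proposal q(theta'|theta) = N(theta, step^2), so the proposal ratio
    cancels in the acceptance probability."""
    prop = random.gauss(theta, step)
    log_A = log_target(prop) - log_target(theta)
    if random.random() < math.exp(min(0.0, log_A)):
        return prop          # accept the candidate
    return theta             # keep the current state

# Stand-in target for the non-standard [theta | beta, xi]: a standard
# normal log density, known only up to an additive constant.
log_target = lambda t: -0.5 * t * t

chain, t = [], 0.0
for _ in range(60000):
    t = mh_step(t, log_target, step=1.0)
    chain.append(t)
m = sum(chain) / len(chain)
v = sum((x - m) ** 2 for x in chain) / len(chain)
print(round(m, 2), round(v, 2))
```

Because the target here is N(0, 1), the long-run sample mean and variance of the chain should approach 0 and 1, which gives a quick sanity check on the update.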
The MCMC algorithm is used to obtain approximate samples from the posterior distribution of (η, θ, β), which is given by

[η, θ, β|z] = [θ][β][η|θ, β][z|θ, β, η]
            ∝ [θ] |Γ_θ|^{-1/2} exp( −(1/2)(η − X'β)'Γ_θ^{-1}(η − X'β) ) Π_i 1{z_i ∈ A_i}   (3.18)

3.3 Joint Inference of Point Patterns and Geostatistical Random Fields

In this section we consider point patterns that are associated with covariate stochastic processes. When data sets of various types are closely related to the same topic of interest, joint modeling of those data sets may lead to a more efficient analysis. This is possible because we assume all the data sets are associated with, or depend on, a common spatial process (a random field ξ(s)); we can then make prediction and inference based on this random field. The joint modeling makes full use of the data, such as the location information in the point pattern data. Point pattern data are associated with the random field ξ(s) through the intensity function, and categorical data are associated with the random field through clipping. Consequently, we can perform a joint analysis of point patterns and other types of spatial data, and we can also measure their association and dependence.

Assume that an observed point pattern {S_j} of N(ds) with intensity function λ(s) and observations {Y_i} of a covariate stochastic process ξ(s) are both available over the same region S. The covariate random field is modeled as in (3.3), and the point process may depend on the covariate process ξ through a functional link defined by

log λ(s) = ζ + α η(s)   (3.19)

where η(s) is defined in (3.1) as η(s) = x'(s)β + ξ(s).

3.3.1 Association between Point Patterns and Covariate Processes

Assume that the local intensity of a spatial inhomogeneous Poisson process N(ds) depends on the realization of another spatial stochastic process ξ(s).
The extent of the dependence of the point process on the covariate process ξ(s) is determined by the unknown regression coefficient α in (3.19). We examine the dependence of the intensity λ(s) on the level of the covariate process ξ(s) using a model of the form (3.19). The data consist of a partial realization of a point process, {S_j; j = 1, …, J}, and measurements {Y_i; i = 1, …, n} of a spatial process ξ(s) on a region S. In the case where ξ(s) has no effect on the intensity of the point process, α = 0. Conditional on η(s) and on the observed point pattern {S_j; j = 1, …, J}, the locations of the points S_j from a subset A of the study region are independently and identically distributed with probability density function proportional to λ(s). Given (β, ξ), the conditional log-likelihood for (α, ζ) in (3.19) is

ℓ(α, ζ|β, ξ) = −∫_A exp(ζ + α η(s)) ds + Σ_j (ζ + α η_j)   (3.20)

where η_j = η(S_j). Inference for α and ζ can be based on Cox (1972), provided that the true value of ξ(s) throughout A is known. When η is estimated, the conditional likelihood estimate of α and its variance derived from Cox (1972) will be invalid, as they do not allow for prediction or measurement error. However, we may estimate (β, ξ) by kriging or the other generic geostatistical methods described in the previous sections.

Currie (1998) develops a test for association between realizations of a point pattern and a covariate process ξ(s) on the study region. The test remains valid in the presence of covariates x(s) and is motivated by the score test. Measurements Y_i of this process are assumed to follow the model given in Equation (3.3), and the intensity of the point process depends on ξ and x(s) jointly according to (3.19). Under this model, the test for association is a test of the null hypothesis H_0 : α = 0. Up to an additive constant, the conditional log-likelihood for the intensity λ(s) of the point process given η(s) is given in (3.20).
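The conditional log-likelihood (3.20) is straightforward to approximate numerically once η(s) is available on a grid. The sketch below uses a hypothetical one-dimensional region A = [0, 1], a surrogate η(s) = sin(2πs), and three made-up event locations, none of which come from the thesis; at α = 0 the integral has a closed form, which gives a check on the discretization:

```python
import math

# Hypothetical ingredients (not from the thesis data): a 1-D study
# region A = [0, 1], a smooth surrogate for eta(s), and J = 3 events.
eta = lambda s: math.sin(2.0 * math.pi * s)
events = [0.2, 0.4, 0.7]

def loglik(alpha, zeta, n_grid=2000):
    """Midpoint-rule approximation to (3.20):
       l(alpha, zeta) = -Int_A exp(zeta + alpha*eta(s)) ds
                        + sum_j (zeta + alpha*eta(S_j))."""
    h = 1.0 / n_grid
    integral = sum(math.exp(zeta + alpha * eta((k + 0.5) * h)) * h
                   for k in range(n_grid))
    return -integral + sum(zeta + alpha * eta(s) for s in events)

# With alpha = 0 the integrand is constant, so the value has the closed
# form -|A|*exp(zeta) + J*zeta, a useful check on the discretization.
print(round(loglik(0.0, 0.5), 4))
```

The same discretized integral is what a numerical maximizer of (3.20) over (α, ζ) would call repeatedly.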
The parameter ζ is a nuisance parameter, so we maximize over it. Differentiating the log-likelihood with respect to ζ and setting the derivative equal to zero gives

exp(ζ̂) = J / ∫_A exp(α η(s)) ds.

Differentiating with respect to α and substituting for ζ, the derivative of the log-likelihood with respect to α is

∂ℓ/∂α = Σ_j η_j − J ∫_A η(s) exp(α η(s)) ds / ∫_A exp(α η(s)) ds.   (3.21)

For testing the null hypothesis α = 0, consider the statistic

U_0 = Σ_j η_j − (J/|A|) ∫_A η(s) ds = Σ_j (x_j'β + ξ_j) − (J/|A|) ∫_A (x'(s)β + ξ(s)) ds.

This is the score statistic for testing the association between λ(s) and ξ(s): the derivative of the log-likelihood evaluated at the null hypothesis. If we had exact ascertainment of ξ(s) throughout A, then we could calculate the mean and variance of the distribution of this test statistic under the null hypothesis α = 0, and hence carry out the test of the null hypothesis. Note that calculating U_0 only involves the evaluation of (3.21) at α = 0; thus we avoid the need to calculate stochastic integrals any more complex than ∫_A η(s) ds.

In practice we do not observe η(s) directly but only through the observations Y_i on S. From these measurements, we obtain the conditional expectation of η(s) throughout S. The Y_i's are conditionally independent given η, according to the model (3.3). We may use an estimator η̂ (e.g., kriging; see Diggle et al., 1998) for η. Substituting η̂ for η in the expression, we obtain the approximate test statistic

Û_0 = Σ_{j=1}^J (x_j'β + γ_j'Σ^{-1}(Y − X'β)) − (J/|A|) ∫_A (x'(s)β + γ_s'Σ^{-1}(Y − X'β)) ds

where γ_j = {cov(ξ_j, Y_i)}_{i=1}^n = {cov(ξ_j, ξ_i)}_{i=1}^n is a column vector of length n and Σ = var(Y). Under the assumption that ξ(s) is a Gaussian process, the distribution of Û_0 is Gaussian.
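As the text notes, under exact knowledge of η(s) the score statistic U_0 reduces to one ordinary integral of η over A. A minimal sketch, again with a hypothetical region A = [0, 1], a surrogate η(s) and made-up event locations:

```python
import math

# Hypothetical setup: region A = [0, 1], surrogate eta(s), three events.
eta = lambda s: math.sin(2.0 * math.pi * s)
events = [0.2, 0.4, 0.7]

def score_stat(eta, events, n_grid=4000):
    """U0 = sum_j eta(S_j) - (J/|A|) * Int_A eta(s) ds, the score for
    alpha evaluated at alpha = 0; only Int_A eta(s) ds is needed."""
    h = 1.0 / n_grid
    # midpoint rule for Int_A eta(s) ds; here |A| = 1
    mean_eta = sum(eta((k + 0.5) * h) * h for k in range(n_grid))
    J = len(events)
    return sum(eta(s) for s in events) - J * mean_eta

u0 = score_stat(eta, events)
print(round(u0, 4))
```

For this surrogate η the integral over a full period vanishes, so the statistic is just the sum of η at the event locations; a positive value reflects events sitting preferentially where η is high.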
To find the expected value of this statistic under the null hypothesis, we take the expectation of Û_0 with respect to Y:

E_Y(Û_0) = Σ_{j=1}^J E_Y( x_j'β + γ_j'Σ^{-1}(Y − X'β) ) − (J/|A|) ∫_A E_Y( x'(s)β + γ_s'Σ^{-1}(Y − X'β) ) ds.

Thus the approximate score statistic Û_0 has expectation zero under the null hypothesis α = 0. The variance of Û_0 with respect to the measurements Y is given by

var_Y(Û_0) = Σ_j γ_j'Σ^{-1}γ_j + Σ_j Σ_{k≠j} γ_j'Σ^{-1}γ_k − (2J/|A|) Σ_j ∫_A γ_s'Σ^{-1}γ_j ds + (J²/|A|²) ∫_A ∫_A γ_s'Σ^{-1}γ_{s'} ds ds'.

Thus, given the locations of the events of a point process, {S_j} on A, the measurements {Y_i} of the underlying continuous process {η(s)} are used to calculate Û_0. Then, conditioning on the locations of the process points, we use the expressions for the mean and variance of this statistic under the null hypothesis to obtain a p-value for the data. Currie (1998) examines the performance of this test on a selection of simulated data sets for a simple model. The test performs well, in the sense that it is able to identify association between the point process intensity and the covariate process ξ(s) when the association is not evident to the eye. Currie also investigates the effect of estimating the covariance parameters of the covariate process, rather than using their true values. The replacement of the true covariance parameters by their maximum likelihood estimates has only a small effect on the null distribution of the approximate score statistic and on the power of the test.

[Figure 3.3: DAG of point patterns and geostatistical models]

3.3.2 MCMC Inference on Joint Models

Consider the case in which both a point pattern {S_j} and a geostatistical process Y(s) depend on the realization of ξ(s) and on the parameters α, β, and θ. The dependence structure of the quantities in (3.19) is given by its graphical representation in Figure 3.3. The graphical model also aids the factorization of the joint distribution of all the quantities in our graph into the product of the full conditionals.
The factorization is obtained by taking the product over all nodes of the conditional density of that node given its parents. With the given DAG, the joint distribution of our model quantities can be factored as follows:

[θ, W, β, α, ζ, N, σ, Y] = [θ][W|θ][β][σ][Y|θ, W, β, σ][α][ζ][N|θ, W, β, α, ζ]   (3.22)

Model specification consists of the distributional forms for each of the factors in (3.22). The covariate process ξ(s) is most commonly modeled as a stationary Gaussian random field with a partially uncertain covariance structure. Some notable references in this area include Besag (1974), Ripley (1981), Cressie and Chan (1989), Clayton and Kaldor (1987), Isaaks and Srivastava (1990), Besag, York and Mollié (1991), Cressie (1993), Bernardinelli et al. (1997), and Møller, Syversveen and Waagepetersen (1998). Ideally, the distributional assumptions about the underlying properties of our model should be sufficiently flexible to accommodate a wide range of behavior of the random variates they describe. In this study we focus on the model in (3.2) with inhomogeneous intensity.

Modeling of the combined spatial data of the kernel mixture type is best implemented through a tessellation of the reference state space U. Consider a tessellation of J cells in U with cell centers {U_j}. The W(du; U_j ∈ du) are assumed to be independent normal random variates with mean zero and variance depending on some parameters. Approximately, the Gaussian random field ξ(s) is constructed through the kernel mixture of the Wiener process W(du) on U, given by

ξ = K_θ W   (3.23)

where ξ = (ξ(s_1), …, ξ(s_n)) with s_1, …, s_n ∈ S, and the elements of the random vector W = {W_j}_{j∈J} are independent normally distributed with variances subject to smooth variation in the parameters θ and the cell attributes M(du). Note that K = {k_ij} with k_ij = k_θ(s_i, U_j), and the covariance matrix of ξ is then Γ = KK'. Thus, the density of ξ is given by

p_ξ(z) = (2π)^{-n/2} |Γ|^{-1/2} exp( −(1/2) z'Γ^{-1}z ).
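The finite-dimensional construction ξ = K_θ W can be sketched directly. The Gaussian kernel, the bandwidth, and the sites and cell centers below are illustrative choices, not those of the thesis; the point is that a draw of the cell variates W_j induces a correlated field ξ with covariance Γ = KK':

```python
import math, random

random.seed(3)

# Sketch of the tessellation kernel mixture (3.23): xi = K_theta W with
# independent N(0,1) cell variates W_j.  Kernel, bandwidth, sites and
# cell centers are all hypothetical stand-ins.
sites = [0.1 * i for i in range(11)]          # s_1, ..., s_n in S = [0, 1]
centers = [0.125, 0.375, 0.625, 0.875]        # cell centers U_j

def k_theta(s, u, bandwidth=0.3):
    # illustrative Gaussian kernel playing the role of k_theta(s_i, U_j)
    return math.exp(-0.5 * ((s - u) / bandwidth) ** 2)

K = [[k_theta(s, u) for u in centers] for s in sites]
W = [random.gauss(0.0, 1.0) for _ in centers]
xi = [sum(K[i][j] * W[j] for j in range(len(centers)))
      for i in range(len(sites))]

# Covariance of xi is Gamma = K K'; for example its (0,0) entry:
gamma_00 = sum(K[0][j] ** 2 for j in range(len(centers)))
print(len(xi), round(gamma_00, 3))
```

Repeating the draw of W many times and checking the sample variance of ξ(s_1) against the (0,0) entry of KK' confirms the induced covariance numerically.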
The density of η(s) = x'(s)β + ξ(s) is then

p_η(z) = (2π)^{-n/2} |Γ|^{-1/2} exp( −(1/2)(z − X'β)'Γ^{-1}(z − X'β) ).

The other model distributional assumptions are given by

[N(ds)|ξ] ~ Po(λ(s) ds),
[Y(s)|ξ] ~ N(η(s), σ²).

When applying the MCMC method to simulate from the joint posterior density given in (3.22), we may employ independent vague priors on the model parameters in the absence of genuine prior information, for example

[θ] ~ π(θ), [β] ~ N(β_0, B_0), [α] ~ N(α_0, A_0), [ζ] ~ N(ζ_0, C_0), [σ] ~ gamma(a_0, b_0).

The other steps for generating the chain update the remaining unknown quantities conditional on the data, {S_j} and Y, visiting each of the other nodes in Figure 3.3 in turn. The full conditionals of the quantities θ, β, α, ζ, σ and W are derived as follows.

The log of the conditional density for θ follows from the Bayesian relation [θ|W, β, α, ζ, N, σ, Y] = [θ|W] ∝ [W|θ][θ]; then, up to an additive constant, we have

log[θ_i|θ_{-i}, ξ] = −(1/2) log|Γ_θ| − (1/2) ξ'Γ_θ^{-1}ξ + c_g

where ξ is the vector of the {ξ_i}, i = 1, …, n. We may employ a symmetric random walk proposal on the feasible values of the θ_i's, so the update step for each θ_i reduces to an MH step.

From the relation [β|θ, W, α, ζ, N, σ, Y] ∝ [Y|θ, W, β, σ][N|θ, W, β, α, ζ][β], the log of the conditional density of β is, up to an additive constant, given by

log[β|·] = −(1/(2σ²)) Σ_i (Y_i − η_i)² − ∫_A exp(ζ + α η(s)) ds + α Σ_j η_j + log[β] + c_g

where η_i = η(s_i) and η_j = η(S_j).

For the full conditional density of ζ, from the relation [ζ|θ, W, β, α, N, σ, Y] ∝ [N|θ, W, β, α, ζ][ζ], we have

log[ζ|θ, W, β, α, N, σ, Y] = −∫_A exp(ζ + α η(s)) ds + Jζ + log[ζ] + c_g.

The full conditional density of α follows from [α|θ, W, β, ζ, N, σ, Y] ∝ [N|θ, W, β, α, ζ][α] and is given by

log[α|θ, W, β, ζ, N, σ, Y] = −∫_A exp(ζ + α η(s)) ds + α Σ_j η_j + log[α] + c_g.

For σ, the model gives [σ|θ, W, β, α, ζ, N, Y] ∝ [Y|θ, W, β, σ][σ]. Up to an additive constant, the log of the conditional density of σ is

log[σ|ξ, Y] = −n log σ − (1/(2σ²)) Σ_i (Y_i − η_i)² + log[σ] + c_g.
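The full conditionals of ζ and α both involve the same integral ∫_A exp(ζ + αη(s)) ds, which in practice is handled by discretization over the tessellation. The sketch below evaluates the log conditional of ζ under a flat prior (an assumption made here purely for simplicity, so the log[ζ] term drops out) and checks that the profile solution exp(ζ̂) = J / ∫_A exp(αη(s)) ds is its maximizer; the region, the surrogate η and the events are hypothetical:

```python
import math

# Illustrative stand-ins: A = [0, 1], surrogate eta(s), J = 3 events,
# alpha held fixed, and a flat prior on zeta, so that
#   log[zeta | ...] = -Int_A exp(zeta + alpha*eta(s)) ds + J*zeta + const.
eta = lambda s: math.sin(2.0 * math.pi * s)
events = [0.2, 0.4, 0.7]
alpha = 0.8

def integral_exp(alpha, zeta, n=2000):
    # midpoint-rule surrogate for Int_A exp(zeta + alpha*eta(s)) ds
    h = 1.0 / n
    return sum(math.exp(zeta + alpha * eta((k + 0.5) * h)) * h
               for k in range(n))

def log_cond_zeta(zeta):
    return -integral_exp(alpha, zeta) + len(events) * zeta

# The stationary point solves exp(zeta) = J / Int_A exp(alpha*eta(s)) ds,
# which is exactly the profile estimate of the nuisance parameter zeta.
zeta_hat = math.log(len(events) / integral_exp(alpha, 0.0))
print(round(zeta_hat, 3))
```

Since the integral factors as exp(ζ) times a constant in ζ, the log conditional is strictly concave in ζ, and the same discretized integral can be reused inside an MH update for α.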
Based on [ξ|θ, W, β, α, ζ, N, σ, Y] ∝ [Y|θ, W, β, σ][N|θ, W, β, α, ζ][ξ|θ], the log density of the covariate random vector ξ is given by

log[ξ|θ, N, β, Y] = −(1/(2σ²)) Σ_i (Y_i − η_i)² − ∫_A exp(ζ + α η(s)) ds + α Σ_j η_j − (1/2) ξ'Γ_θ^{-1}ξ + c_g.

From equation (3.23), K is constant in W and ξ, and the log of the full conditional density of W is given by the following, up to an additive constant:

log[W|θ, N, β, Y] = −(1/(2σ²)) Σ_i (Y_i − η_i)² − exp(ζ) Σ_i Δs_i exp(α η_i) + α Σ_j η_j − (1/2) W'(KK')^{-1}W + c_g.

We can then use an independent random walk proposal for the candidate W with a centered normal density on each W_j. As such, the update step reduces to an MH step in the MCMC algorithm.

Chapter 4. Summary of Results and Further Investigation

4.1 Summary of Results

The thesis concerns the Cox M modeling approach to inhomogeneous point patterns, with the random intensity functions directed by independent increment random measures in a reference state space. The idea of kernel mixture representations is relatively new in spatial statistics. It has the potential to model non-stationary processes, such as the Poisson/Gamma mixture models of Wolpert and Ickstadt (1998a) and the Binary/Gaussian models of Section 3.2.

In Chapter 2, the proposed model on point patterns is based on the Poisson generalized Gamma random fields. The generalized Gamma measures can be simulated by the Inverse Lévy Measure algorithm. The tessellation representation of a point pattern creates a lattice structure, which is the vehicle for tackling the problem of spatial prediction. This is worked out in detail for an example where the data are counts from a subset of lattice cells and binary variables on the complement. The implementation is performed with MCMC for this type of model on counts and binary clipped random fields. For highly clustered point patterns, it is more appropriate to apply geostatistical modeling approaches after a log transformation of the count data.
The related models are based on Wiener measure subordinated random fields, which are closely related to the ordinary random fields specified directly by mean and covariance functions. The proposed models in Chapter 3 are intended to make full use of collateral information obtained along with point patterns. The collateral data, treated as fixed covariates and covariate stochastic processes, are considered in the setting of joint modeling of the intensities. The corresponding MCMC schemes are proposed for each data model.

The probability structure of the binary clipped models (in Section 3.2) that contain latent Gaussian processes may be of great interest to spatial statisticians, but very little progress has been made in the past in deriving likelihoods and in studying the covariance structures of such processes. One may think of a myriad of examples where the two types of processes are linked. For example, a smooth fertility trend in soil will affect both the emergence of weed plants (a point process) and the crop yield surface (a geostatistical process).

4.2 Further Investigation

Wolpert and Ickstadt (1998a) propose a data augmentation scheme for the Cox M models when M is a Gamma measure. The data augmentation method does not work with a nonzero index parameter function α(u). For the discrete case in Chapter 2, a product multinomial augmentation scheme works in general for the spatial Poisson generalized Gamma random fields. A new data augmentation scheme may be needed in order to implement MCMC inference on point patterns. The method is computationally intensive and time consuming in practice; however, it is warranted in that it fully utilizes the point patterns.

A point pattern may be obtained by thinning another one, by superposing others, or by clipping geostatistical random fields. The limiting point patterns may result in geostatistical random fields.
Weak convergence of random fields has been studied largely using mixing conditions; see Yoshihara (1992). The explicit conditions for GG-measures to be approximated by geostatistical random fields are still under investigation, especially on the index set where α(u) ≠ 0 on U. The classic asymptotic normality theory for a GG distribution may suggest the conditions for weak convergence of point patterns. For a GG(α, λ, b) random variable M, the standardized variable is asymptotically normal, i.e.,

( M − λ b^{α−1} ) / √( λ(1 − α) b^{α−2} ) ⇒ N(0, 1) as (1 − α)/(λ b^α) → 0,

where the convergence is along decreasing sequences of the squared coefficient of variation. The proof comes from taking the limit of the Laplace transforms; see also Jørgensen (1987). Note that when α → −∞ the distribution of bM/(1 − α) converges to a Poisson distribution with mean λ b^α/(1 − α). This follows by applying a limit argument to the Laplace transform; see also Aalen (1992). The asymptotic normality of a GG(α, λ, b) then suggests that, for a large inverse scale b(u) when α(u) > 0, or a small b(u) when α(u) < 0, the limiting random field may tend to be a Gaussian random field. The re-weighted GG-measures with α(u) ≠ 0 cannot be constructed under the current augmentation method. In applications this occurs mostly for highly clustered point patterns, for which it might be more appropriate to apply the geostatistical methods discussed in Chapter 3.

When testing the association of a point pattern and geostatistical random fields in Section 3.3, the hypothesis being tested is H_0 : α = 0. In case of rejection, we plan to study whether the direction of the association can be gleaned from the test statistic. It would be interesting to know whether the latent process inhibits one process while exacerbating the other.
One might imagine that through a site-specific (spatially varied) application of herbicide one can reduce the number of weeds (suppress the point pattern) and increase the crop yield (accelerate the yield process).

Bibliography

AALEN, O. (1992). Modeling heterogeneity in survival analysis by the compound Poisson distribution. The Annals of Applied Probability 2 951–972.

ABRAMOWITZ, M. and STEGUN, I. A. (1972). Handbook of Mathematical Functions. Dover, New York.

BAR-LEV, S. and ENIS, P. (1986). Reproducibility and natural exponential families with power variance functions. Annals of Statistics 14 1507–1522.

BESAG, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. R. Statist. Soc. B 36 192–236.

BESAG, J., YORK, J. and MOLLIÉ, A. (1991). Bayesian image restoration, with two applications in spatial statistics (with discussion). Ann. Inst. Statist. Math. 43 1–59.

BRIX, A. (1998). Spatial point processes in weed science. Technical report, Department of Mathematics and Statistics, Lancaster University, Lancaster, United Kingdom.

BRIX, A. (1999). Generalized Gamma measures and shot-noise Cox processes. Advances in Applied Probability 31.

CLAYTON, D. G. and KALDOR, J. (1987). Empirical Bayes estimates of age-standardized relative risks for use in disease mapping. Biometrics 43 671–681.

COX, D. (1972). The statistical analysis of dependencies in point processes. In Lewis, P., editor, Stochastic Point Processes: Statistical Analysis, Theory, and Applications. Wiley.

CRESSIE, N. A. C. (1993). Statistics for Spatial Data. John Wiley & Sons.

CRESSIE, N. A. C. and CHAN, N. H. (1989). Spatial modeling of regional variables. J. Amer. Statist. Assoc. 84 393–401.

CURRIE, J. E. (1998). Joint Modelling of Point Process and Geostatistical Measurement Data. Ph.D. thesis, Lancaster University.

DALEY, D. J. and VERE-JONES, D. (1988). Introduction to the Theory of Point Processes. Springer, New York.

DEVROYE, L. (1986).
Non-Uniform Random Variate Generation. Springer-Verlag, New York.

DIGGLE, P. J., MOYEED, R. A. and TAWN, J. A. (1998). Model-based geostatistics. Applied Statistics 47(3) 299–350.

DIGGLE, P. J. (1990). A point process modelling approach to raised incidence of a rare phenomenon in the vicinity of a prespecified point. Journal of the Royal Statistical Society (Series A) 153 349–362.

GELFAND, A. E. and SMITH, A. F. M. (1990a). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85 398–409.

GELFAND, A. E., HILLS, S. E., RACINE-POON, A. and SMITH, A. F. M. (1990b). Illustration of Bayesian inference in normal data models using Gibbs sampling. Journal of the American Statistical Association 85 972–985.

GELMAN, A., ROBERTS, G. O. and GILKS, W. R. (1996). Efficient Metropolis jumping rules. In Bernardo, J. M., Berger, J. O., Dawid, A. P. and Smith, A. F. M., editors, Bayesian Statistics 5. Oxford University Press 599–607.

GEWEKE, J. (1991). Efficient simulation from the multivariate normal and Student-t distributions subject to linear constraints. In Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface. American Statistical Association, Alexandria, VA 571–578.

GILKS, W. R., RICHARDSON, S. and SPIEGELHALTER, D. J. (1996). Markov Chain Monte Carlo in Practice. New York: Chapman and Hall.

GILKS, W. and ROBERTS, G. (1996). Strategies for improving MCMC. In Markov Chain Monte Carlo in Practice. Chapman & Hall 89–110.

GREEN, P. and SIBSON, R. (1978). Computing Dirichlet tessellations in the plane. Computer Journal 21 168–173.

GREGOIRE, T. G., SCHABENBERGER, O. and KONG, F. Z. (2000). Prediction from an integrated regression equation: a forestry application. Biometrics 56 414–419.

HEISEL, T., ANDREASEN, C. and ERSBØLL, A. K. (1996). Annual weed distributions can be mapped with kriging. Weed Research 36 325–337.

HIGDON, D., SWALL, J. and KERN, J. (1999).
Non-stationary spatial modeling. In Bernardo, J. M., Berger, J. O., Dawid, A. P. and Smith, A. F. M., editors, Bayesian Statistics 6. Oxford University Press 761–768.

HOUGAARD, P., LEE, M.-L. and WHITMORE, G. (1997). Analysis of overdispersed count data by mixtures of Poisson variables and Poisson processes. Biometrics 53 1225–1238.

HOUGAARD, P. (1986). Survival models for heterogeneous populations derived from stable distributions. Biometrika 73 387–396.

ICKSTADT, K. and WOLPERT, R. L. (1996). Spatial correlation or spatial variation? A comparison of Gamma/Poisson hierarchical models. Technical Report 01, Institute of Statistics and Decision Sciences, Duke University, Durham, NC 27708-0251.

JACOD, J. and SHIRYAEV, A. N. (1987). Limit Theorems for Stochastic Processes. Volume 288 of Grundlehren der mathematischen Wissenschaften. Berlin: Springer-Verlag.

JØRGENSEN, B. (1987). Exponential dispersion models. Journal of the Royal Statistical Society, Series B 49 127–162.

KARR, A. F. (1991). Point Processes and Their Statistical Inference. Marcel Dekker, second edition.

KASS, R. E., CARLIN, B. P., GELMAN, A. and NEAL, R. M. (1997). MCMC in practice: A roundtable discussion. Technical report, Carnegie Mellon University.

KEDEM, B. (1980). Binary Time Series. New York: Marcel Dekker.

KINGMAN, J. F. C. (1993). Poisson Processes. Oxford Science Publications.

MARCUS, M. B. and ROSIŃSKI, J. (2000). Continuity of stochastic integrals with respect to infinitely divisible random measures. Department of Mathematics, The City College of CUNY, New York, NY 10031, mbmarcus@earthlink.net.

MARTINEZ, V. D. O. (1997). Prediction in Some Classes of Non-Gaussian Random Fields. Ph.D. thesis, University of Maryland at College Park.

MILES, R. (1972). The random division of space. Suppl. Adv. Appl. Prob. 243–266.

MØLLER, J., SYVERSVEEN, A. R. and WAAGEPETERSEN, R. P. (1998). Log Gaussian Cox processes. Scand. J. Statist. 25.

MØLLER, J. (1994). Lectures on Random Voronoi Tessellations.
Springer.

OKABE, A., BOOTS, B., SUGIHARA, K. and CHIU, S. (2000). Spatial Tessellations. Wiley, Chichester.

RIPLEY, B. D. (1981). Spatial Statistics. John Wiley and Sons, New York.

ROGERS, A. (1974). Statistical Analysis of Spatial Dispersion. Pion, London.

SCHABENBERGER, O., GREGOIRE, T. G. and KONG, F. Z. (1999). Statistical features of wheat scab sampling protocols: random volumes and inhomogeneous point processes. Spring meetings of the Eastern North American Region of the International Biometric Society.

SILVERMAN, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall: London.

SMITH, A. F. M. and ROBERTS, G. O. (1993). Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J. R. Statist. Soc. B 55 3–23.

STOYAN, D., KENDALL, W. S. and MECKE, J. (1995). Stochastic Geometry and its Applications. Wiley, Chichester, second edition.

TANNER, M. A. and WONG, W. H. (1987). The calculation of posterior distributions by data augmentation. J. Am. Statist. Assoc. 82 528–540.

TIERNEY, L. (1994). Markov chains for exploring posterior distributions (with discussion and a rejoinder by the author). Ann. Statist. 22 1701–1762.

TWEEDIE, M. (1984). An index which distinguishes between some important exponential families. In Ghosh, J. K. and Roy, J., editors, Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference 579–604.

WOLPERT, R. and ICKSTADT, K. (1998a). Poisson/Gamma random field models for spatial statistics. Biometrika 85 251–267.

WOLPERT, R. L. and ICKSTADT, K. (1998b). Simulation of Lévy random fields. In Dey, D., Müller, P. and Sinha, D., editors, Practical Nonparametric and Semiparametric Bayesian Statistics. New York: Springer-Verlag.

YOSHIHARA, K. (1992). Weakly Dependent Stochastic Sequences and Their Applications, Volume VII: Generalized Partial-sum Processes. Sanseido Co., Ltd., Misaki-cho.