ESTIMATiON QF THE PARAMETER IN THE STOCHASTIC MODEL FOR PHAGE ATTACHMENT T0 BACTSRIA Thesis gov HM Dogma a“ pin D. MICHIGAN STATE UNWERSITY Ramesh C. Srivastava 18965 i I .’ x l ! fr [HESHB J LIBRARY Michigan State University This is to certify that the thesis entitled ESTIMATION OF THE PARAMETER IN THE STOCHASTIC MODEL FOR PHAGE ATTACHMENT TO BACTERIA presented by Ramesh C. Srivastava has been accepted towards fulfillment of the requirements for Ph.D. degree in Statistics fig/%‘ Major In“ 0 Date May 24, 1965 0-169 ABSTRACT ESTIMATION OF THE PARAMETER IN THE STOCHASTIC MODEL FOR PHAGE ATTACHMENT TO BACTERIA by Ramesh C. Srivastava Recently Gani has considered the following stochastic model for phage attachment to bacteria in suspension. Let n 00 be the number of bacteria and v00 be the number of phages V a at time t = 0. Also let m = Egg-be the multiplicity of 00 phages and r be the saturation capacity of a bacterium. Further let ni(t) (i = 0,...,r) be the number of bacteria with exactly i phages attached to them at time t 2.0. If P(n0,...,nr;t) denotes the probability that there are no....,nr bacteria with 0,...r phages attached to them respectively at time t Z_0, then, under certain assump- tions, it is shown by Gani that n 1 r n.(t) _ 00 i P(“0"""‘r"‘) ‘ n z...n : " (801(t)) O r . 1:0 that is, at any fixed time t, O E.t E.to’ the distribution ofmn(t) = (n0(t),...,nr(t)) is multinomial. The probabilities a0j(t) are functions of a single parameter a, defined by j 0 I O O _ 3-1 r r-i -(r-1)ap(t) r-m exp(-uot) r-m where p(t) = %~log ( ) . In this thesis we investigate some of the basic properties of this model, describe a method of estimating the parameter a, and study the asymptotic properties of the estimate. In Chapter 1, we describe the deterministic and the stochastic models for phage attachment to bacteria and review different methods of estimation for Markov chains with continuous time parameter. In Chapter 2, the joint probability generating function of Q(t1) and %(t2) (t1 < t2) is derived and is used for calculating the mixed moments of the process. The rest of the chapter is devoted to the study of the asymptotic distribution of n(t) and the limiting joint distribution of nfitl),...,nfitk) (t1 < tk) as n tends to infinity. 00 The problem of estimating the parameter a in the stochastic model is considered in Chapter 3. In section 3.2 a method of estimation is described and is shown to yield a consistent estimate satisfying certain conditions. Then we obtain a lower bound to the variance of a consistent estimate satisfying; certain conditions and use our result to obtain the asymptotic efficiency of the estimate. Finally we indicate a simpler method of estimating the parameter a. The modified method of estimation yields an estimate with the same properties as that obtained by the original procedure. ESTIMATION OF THE PARAMETER IN THE STOCHASTIC MODEL FOR PHAGE ATTACHMENT TO BACTERIA .5?” o“ ‘\ Ramesh C3 Srivastava A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Statistics 1965 To My Parents ACKNOWLEDGMENTS I wish to express my thanks to Professor Leo Katz for his inspiring guidance throughout this investigation. His comments and suggestions were very helpful. Many thanks are also due to Professor Joseph Gani for suggesting the problem. His suggestions and many illuminating discussions were extremely helpful. I am also indebted to Professor J. F. Hannan for some useful suggestions and discussions that led to the present form of theorems 2.4.1 and 3.5.1. The financial support and creative atmosphere provided by the Department of Statistics at the Michigan State University were invaluable. I am also grateful to the National Science Foundation for their financial support under contracts G-18976, GP-2496, and GP-4307, under the direction of Professor Leo Katz. iii. TABLE OF CONTENTS CHAPTER Page 1. REVIEW OF PREVIOUS WORK . . . . . . . . . . . . . . . . . 1 1.0 Introduction . . . . . . . . . . . . . . . . . . . 1 1.1 A Deterministic Model for the Attachment of Phages to Bacteria . . . . . . . . . . . . . . . . . . . . 3 1.2 A Stochastic Model for the Attachment of Phages to Bacteria . . . . . . . . . . . . . . . . . . . . . 4 1.3 Estimation Methods for Evolutive Processes . . . . 8 1.4 Maximum Likelihood Estimation for a Finite Markov Chain with Continuous Time Parameter. . . . . . . . 12 1.5 An Estimation Procedure in the Emigration-immigra- tion Process. . . . . . . . . . . . . . . . . . . . 20 2. SOME ASPECTS OF THE STOCHASTIC MODEL FOR THE ATTACHMENT OF PHAGES TO BACTERIA . . . . . . . . . . . . . . . . . . 25 2.0 Introduction. . . . . . . . . . . . . . . . . . . . 25 2.1. The Joint p.g.f. and Moments of'3(t1) and 3(t2) (t:1 < t2) 25 2.2 General Discussion of the Model . . . . . . . . . . 30 2.3 A Useful Convergence Theorem . . . . . . . . . . . 32 2.4 The Limiting Joint Distribution of C(tl)""43(tk)' 35 3. ESTIMATION OF THE PARAMETER IN THE STOCHASTIC MODEL FOR PHAGE ATTACHMENT TO BACTERIA. . . . . . . . . . . . . . . 45 3.0 Summary . . . . . . . . . . . . . . . . . . . . . . 45 3.1 Introduction . . . . . . . . . . . . . . . . . . . 45 3.2 Derivation of the Estimation Equation . . . . . . . 48 3.3 Preliminary Results . . . . . . . . . . . . . . . . 49 iv. 3.4 An Extension of the Implicit Function Theorem. 3.5 PrOperties of the M.S.C.F. Estimate. 3.6 Efficiency of the M.S.C.F. Estimate. 3.7 A Modified Estimation Procedure. BIBLIOGRAPHY . 52 57 64 64 67 CHAPTER 1 REVIEW OF PREVIOUS WORK 1.0. Introduction: Recently Gani [91* has considered a stochastic model for the attachment of phages to bacteria. In the following we study some of the basic properties of this model and describe a method for estimating the parameter and studying the asymptotic properties of the estimate. This chapter consists of two parts. In the first part, which includes sections 1.1 and 1.2, we describe the deterministic model due to Yassky [20] and the stochastic model due to Gani for the attachment of phages to bacteria. In the second part, which includes sections 1.3, 1.4 and 1.5, we review different methods of estimation for finite MarkOV'Chains with continuous time parameter t. Before describing mathematical modelsfor the attachment of phages to bacteria, it may be useful to give a brief account of some essential facts about bacteriophage. It is known from plaque tests on bacterial su3pensions, that phages attack and destroy bacterial colonies, after first attaching themselves to some of the bacteria. A variety of other experiments indicate that one or more phages may attach themselves to a bacterium. In practice, a SUSpension of 1:100 bacteria in a nutrient medium are infected with v00 phages at time t - 0. Within a short time (4 or 5 minutes) one or more phages attach themselves to a * Numbers in the square bracket refer to the bibliography. bacterium and infect it. Inside an infected bacterium phage particles reproduce following a complex reproduction cycle and then the bacterial cell bursts out, releasing some 200 to 300 phage particles. The disintegration of the bacterium and the scattering of new phage particles is called lysis. The entire reproductive cycle takes place in about 20-25 minutes. Mean- while, the uninfected bacteria continue to reproduce. The new phages in turn attach themselves to other bacteria and infect them. This cycle goes on for awhile. Since the reproduction of phages is faster than that of bacteria, after some time all bacteria are killed. The probability of extinction of a bacterial colony by phages has been calculated by Gani [8]. Here we are mainly concerned with the phenomenon of phage attachment to bacteria. The mathematical models which are des- cribed in the sections 1.1 and 1.2 are derived under the following assumptions. Assumptions. A1. The duration of experiment is taken to be very small, so that the number of bacteria or phages neither increases nor decreases in this period. A . No attachment occurs between like particles. 2 A3. Collision of particles is due only to Brownian motion, both bacteria and phages being non-motile. A4. 0n the basis of Brenner's work [5], we assume that there exists a maximum number of phage particles that can become attached to a bacterium. This number is called the saturation capacity of a bacterium. 1,1. A Deterministic Model for the Attachment of Phages to Bagteria: Recently Yassky [20] has considered a deterministic model for the attachment of phages to bacteria in suSpension. Let ni(t) be the number of bacteria with exactly i (i=0,l,...,r) phage particles attached to them at time t.ZLO and let v0(t) be the concentration of free phage particles. Let n = n0(0) and v 00 = v0(0), then 00 r r 2 n.(t) = n and 2 i n.(t) = v - v (t). i=0 1 00 i=0 1 00 0 Assumption B The rate of attachment of a phage particle to a 1: bacterium which is already attacked by i phage particles is proportional to the product of ni(t) and v0(t). Under assumptions A1 through A4 and assumption B1, it is shown by Yassky that dn 0 = - I n v dt 0 0 0 0.0. NS 5—. ll (ii.1ni_1 - Xini) v0 (i=l,...,r-1) (1.1.1) (1 G10. H :3 fl ll r-lnr-l) \0 and dv r 0 T” ' v0 .2 )‘ini (1'1'2) i=0 d where I, >.0 is a constant of proportionalityr.and as a first approximation ni(t) are regarded as continuous and differentialbe functions of t. Assumption B2: Assume that Xi = (r - i) a 0.5 i < r =0 121' where a > 0. Solving (1.1.2 ), we get v (r-m) v0(t) = 00 (1.1.3) r exp(uat)-m V0 where m =‘-- is the multiplicity of phages, and u = n (r-m). n00 00 Next, solving (1.1.1 ), we get n (t) = n [e-pat-+-£- (l - e-uyt )]-Er ' 0 00 r-m : -_1. n (t) n (t) r i -r+i-l 0 0 “1(t) = n00 ( i > n 1' n 00 00 (1.1.4) r-l n (t) nr(t)=n00 <1- z :1). j=0 00 1.2. A Stochastic Model for the Attachment of Phages to Bacteria: Let n00 be the number of bacteria at time t = 0 in su3pension and ni(t) (i=0,1,...,r) denote the number of bacteria with exactly i phages attached to them at time t 2:0. Let P = P(n0,...,nr;t) be the probability that there are no,n .,n bacteria with r 1,.. 0,1,...,r phages attached to them at time t.Z.0 and v0(t) denote the number of unattached phages. In this model, the deterministic value of v0(t) given by (4) in [9] is taken. Assumption C: In addition to the assUmptions A and B1, we make the following assumption. C. The probability of attachment during interval (t,t + dt) of a phage to a bacterium already having i phages is linivodt + o(dt) (i=0,l,...,r - l) and the probability of attachment of more than one phage is o(dt). Then we obtain in the usual way dP r-l r-1 '3; = - .2 linivOP +'.Z Xi(ni + l)\)O P(n0,...,ni+l, 1=0 1=0 n1+1 f l,...,nr;t). If m(uo,...,ur;t) denotes the probability generating function (p.g.f.) of these probabilities. then it follows that .ET r-l fly at 7‘ 1:0 x1"o (”1+1 ' “1) 5‘11. (1.2.1) This is a particular case of the p.g.f. for the multivariate Markov process first considered by Bartlett [3], and has been solved for this particular case by Gani [9]. TO solve the first order linear differential equation (1.2.1), we consider the auxiliary equations {1% ‘=4 9611M 5%? = 1 vd‘z‘ij _ u) (i=0,l,...,r-l) (1.2.2) i 0 1+1 i These can be rewritten as d Pu _ = v ’1 -1 - ”u _ dt 0 O O )0 -- O 1 A1 Ur-l Ar-l-xr-l ur-l u I u r r r _. .. L -1- .J = v Lu . (1.2.3) The solution of this is of the form e'Lp(t)u = c (1.2.4) ’1: where 8 is the column vector (u ..,ur)', c a constant vector, o’“1" t L the matrix array of Ii and p(t) a g VO(T)dT. Thus we have o(gfi) = cp0(e-Lp(t)g) (1.2.5) where m is some suitable function of the new variables. 0 To determine the form of mo, we take into account the initial conditions and the fact that n00 m(u;0) = “0 . (1.2.6) Then n 1300,50) = ([e'L"(t?g]O> 0° (1.2.7) .. t . -Lt where [e Lp( £90 13 the 0th element of the column vector [e p( kg. We now proceed to calculate this element. Since by assumption the 11's are distinct and non-negative, the matrix L can be written in the canonical form L = NAM = N “v 4:AM (1.2.8) where N = M.1 and M is the matrix whose rows are eigenvectors correSponding to the eigenvalues 10,11,...,1r respectively. It is easy to see that s-l M..=1 M..=fl (1.2.1),-L) 11 13 s=t+1xs'xi j-l AS N11 = 1 Nij " 1 _A. (r 2;J > 1) (1.2.9) It follows from (1.2.8) that e'Lp(t)u_ = N e'A 9(t1M 3 (1.2.10) m Let e'LP(t) = N e'A 9(t) M.= || aij(t) || (1.2.11) r where aij(t) = 0 for i > j and 2 aoj(t) = 1. Then we see from 1‘0 (1.2.7) and (1.2.11), that r n00 ¢(g;t) = ( on aoj(t) uj) gives the p.g.f.ofa multinomial distribution and so n I n,(t) _ 00 r 1 P(n0,...,nr,t) - n !.....n I .n aOi(t) (1.2.12) 0 r 1=0 where a0j(t) = 1:0 NOi Mij e . (1.2.13) The Particular Case I, = (£7- 1) a: The above result holds generally for anylti; now we discuss the particular case obtained by setting xi = (r - i)o (i=0,l,...,r-l) and Ar = 0. It also follows from (1.2.9) that M.. = 0 for i > j 1J ii '= _ j-i r-i . = . Mij ( 1) ( j-i ) for J r+l,...,r. (1.2.14) and N.. = 0 for i > j 1] Nii = 1 (1.2.15) r 1 . . ij j-i ) for J r+1,...,r. Hence. from (1.2.13), (1.2.14) and (1.2.15) we have 1 j-i j . . . aoj(t) - .2 (.1)J"l ( r ) ( r-i )e'(r'1)ap(t) (1.2.16) i=0 where o(t) =§ 1ug( “n “P flail ) r-m and u = n00(r-m). In the present investigation this particular multivariate stochastic process is discussed 1n some detail, and the problem of estimating the parameter a is etnsidered. 1.3. Estimation Methods for Evolutive Processes: In an early paper Kendall [10] considered the problem of estimating the birth rate for a purely reproductive process. Let nO be the number of individuals at time t = 0 and suppose each individual is capable of giving birth to a new individual in accordance with the following: (a) The Sub-populations generated by two co-existing indiv1duals develop in complete independence of one another, (b) the probability that an individual existing at time t will reproduce a new individual during (t,t + dt) is ldt +- o(dt) and the probability of more than one birth is o(dt). Let Pn(t) be the probability that at time t, there are n individuals. If n = 1, then as shown in Kendall [10] O Pn(t) = e'At (l-e'M)“'1 (n 2 1). (1.3.1) If nO = a > 1, then it follows from (1.3.1), that Pn(t) = (::I ) e‘xta (1 - e'xt)“'a. (1.3.2) Now we consider a pure birth process and suppose observations are taken at times 0 T 2r . . . kt = T and the observed sizes of the population are reSpectively n n n 0 1 2 . . . k' The conditional probability P(n1,...,nk I no) is n k-l P(n1,...,nk I no) - :0 P(ni+1 I ni) and by (1.3.2). n -1 -n 1T n. -n. _ i+1 ' -17 1+1 1 Pm... I w - < .>e <1 - > Therefore the log likelihood function is --T k-l L = constant + (n -n ) log(l-e A ) - IT 2 n.. (1.3.3) k 0 i=0 1 Differentiating (1.3 3) with reSpect to I, we get aL (“k'no 41* k'1 —T'= -_-—_TRT Te - T 2 ni 5 1 - e . i=0 10. T - T Z n -1 101 A _ n + ... + nk n0 + ... + nk_1' Kendall also suggested another rough procedure for estimating I and obtained an expression for its asymptotic variance. In [11] and [12] Moran investigated the problem of estimation for some simple processes; for example the Poisson process, the pure birth process, birth and death processes, etc. For a pure birth process, Moran suggested the following alternative procedure: Let N(t) denote the size of the population at time t; clearly N(t) is a non-decreasing function of t. The number of births during an interval (0,T] is then given by k=N(T)-N(0). Suppose these births occurred at times tl""’tk' Now we consider a sample function of N(t) with k jumps. Starting from time ts, the time (ts+ - ts) to the next 1 jump is such that 21(N(0) + s) (ts+1 - ts) is distributed as x2 with 2 degrees of freedom. Then the log likelihood function is k-I L = E log[l(N(0) + s) e-X(N(O)+S)(ts+l-ts)] (t0 = 0) s=0 11. k-l = $50 [log 1 + log (N(O) + s) - 1(N(o) + s) (ts+1-ts)]. Equating the derivative of L to zero, we get k-l 31 - -1 - - ax - kA sEo (n(0) + s)(t8+1 - ts) — 0. (1-3 4) Therefore the maximum likelihood estimate X of I is A k I - k-l . (1.3.5) 2 (N(0) + s)(ts+1-ts) s=0 k-l The sum 2 (N(0) + s)(t - t ) is equal to the area under s=0 3+1 3 the curve N(t) and so g: __k_ = NH) - N(g) T T £N(t)dt IN(t)dt 0 1 T 1 Also E’I N(t)dt is an unbiased estimate of I- and its 0 sampling distribution is (2k}()”1 x2 with 2k degrees of freedom and so its variance is (kkz)-1. Further, Moran considered the problem of estimation for the parameters in a birth and death process. Anscombe [2] can also be consulted for sequential estimation in a birth and death process. 12. 1.4. Maximum Likelihood Estimation for a Finite Markov Chain with Continuous Time Parameter: Let {x(t), t 2 0} be a separable* finite Markov chain with continuous time parameter and let P(t) = IIpij(t)II be the stationary transition matrix function. Then under certain conditions P(t) can be written in the form P(t) = exp(tQ) where Q = IIqijII is an MXM matrix known as the "infinitesimal generator" of the process. For a finite Markov chain with continuous time parameter, two types of estimation problems have been studied. In [1], Albert has considered the problem of estimating Q = IIqijII, the "infinitesimal generator"of the process and in [4], Billingsley has considered the problem of estimating the parameter 0 when Q = IIqij(9)II (or equivalently when P(t) = IIpij(t,9)II) is a function of 9. Assume that pij(t) has a derivative pij'(t) with reapect to t for all t 2 0 and let = lim 1-pii(t) qi t—10 t and = lim Pij(t) i # . qij t~0 t J and let Q = IIqijII be the matrix where qii = -qi. It is well known (see for example Doob [6]) that the probabilities pij(t) satisfy Kolmogorov's forward and back- ward equations. ;TFET-EETinition and other questions regarding separability see Doob [6]. 13. Let (0,.3, P) be a probability Space and {X(t), t 2 0] be a separable Markov process defined on this Space and taking values in a finite set x0 = {l,...,M}. Then it is shown in Doob [6], chapter vi, that if IIpij(t)II is stationary transition matrix function satisfying the condition lim pij(t) = 1 for i = j t—10 = o for 1 # j (1.4.1) then the limits qi and qij exist for all i and j. Further it follows from theorem 1.4. page 248 of [6], that almost all sample functions are step functions with a finite number of jumps in any finite interval. (a) The Estimation of the Infinitesimal Generator: For estimating the "infinitesimal generator" of a Markov chain, first Albert [1] constructs a density on the set of all realizations of the process; then the likelihood equation is defined and finally large sample properties of maximum likelihood estimates are studied. Suppose observations are made on the process {X(t),0£ t < T} where T is finite. Due to theorem 1.4., page 248 of [6], the sample function is a step function. Let Xi denote the state after the ith jump and Ti be the time the system stays in state Xi' Then with probability one, a sample function X(w,'), 0 s t < T can be written as an ordered sequence {(xO,TO),...,(XV(T)_1,Tv(T)_1), XV(T)} (1.4.2) 14. where v(T) is the maximum n such that TO + ... + Tn-l S T < T1 + ... + Tn . Now we can obtain the probability distribution on the space of sample functions. Theorem.A(Albert): Let . . 0 1f 1 = j q'(1:J) = I . . 1f 1 <11)J # J and q(1) = q1 Then P[v(T) = v, X0 = x0, TO 5 do, , XV-1= xv_1, v_1 S av_1; -q(xv)T Xv = xv] - P[XO = x0] e X v-l -(q(xj) - q(xv))t. I n dt. q'(x,,x. ) e J (1.4.3) . J J J+1 S j=0 v v-l where Sv = {(t0""’tv-l) : jEO tj < T and 0 s tj s oj} if v > 0, and -q(xO)T P[v(T) = 0, X0 = x0] = P[XO = x0] e . Next we proceed to construct the density on the Space of all sample functions. Any sample function with v jumps in [0,T) can be represented as a point in v-l + xv =j:0 (XOXR ) X x0 15. where x0 = [1,...,M} and RI = (0,w). Let lbe the Lebesgue measure on Rf and let c be the counting measure on x0. Let C(v) be the product on Xv’ defined by v-l o(v) = I n (c x 2)} X c. i=0 Then almost all sample functions of the process {X(t), 0 s t < T} can be represented as a point in m X = Ji0 XV. For any Subset A of x for which Aan is o(v) measurable, we define 0*(A) = E o(v)(Aflxv). v=0 * * Thus 0 is defined on the o-field B , which is the smallest o-field containing all subsets A whose projection on v-l { n (x0 X R3} X x0 is a measurable set for each v. Let G be i=0 3 measure on the Space of all sample functions, defined for * all subsets B such that ane B , * o(B) = o (30x). Now we obtain the density on the set of all realizations of the process with respect to c which is a o-finite measure. Theorem B (Albert): If B is a subset of the Space of all sample functions over [0,T) which is measurable with reSpect to a, then 16. FEB] = I f (v)do(v) B Q where [' "’(xo)T . P[X(0) = x0] e if v = (x0) 'Q(X )T k'1 '[Q(X.)-Q(X )Jt. P[X(0) = x0] e V jzo q'(xj,xj+1)e J v J fQ(v) = if v - ((XO’tO)""’(Xv-1’tv-l)’ xv) (1.4.4) with v > 0, xj exo, tj 2 0 (j=0,...,v-l) v-l and 2 t, < T i=0 0 otherwise. Now we define the likelihood function and obtain the maximum likelihood estimate. If k independent realizations v1,...,vk of {X(t), 0 s t < T} are observed, then the likelihood function is defined by the equation k LQ I n 35¢ 1 fQ(vj). (1.4.5) j Let N; (i,j) be the total number of transitions from state i to j observed in k trials and A;(i) be the total length of time that state i is occupied during k trials. Then we can write log Lk Q (k)(i)j)log q(i,j)-2A§k)(i)q(i) (1.4.6) 7 Ck + g .2. NT . 1 j¥1 1 where ck is finite with probability one and does not depend on Q. 17. From (1.4.6), we conclude that (1) {Nék>(i,j), ATk)(i)}j#i is a sufficient statistic for Q. (2) The maximum likelihood estimate §¥(i,j) of qij is given by (k) . . A(k) NT (1).-I) . . (R) q (12]) a T Aék)(i) if i¥j and AT (i) > 0. If Aék)(i) = 0, then the maximum likelihood estimate does not exist and we define aTk)(i,j) = o if i¥j and Aék)(i) = 0. (1.4.7) In this context, the term 'large sample' can be interpreted in two ways. Many independent realizations of the process {X(t), 0 s t < T}, keeping T fixed could be observed or a single (finite k) realization of the process over long period of time may be observed. In both cases, Albert [1] proved that under certain conditions the maximum likelihood estimates are strongly consistent and asymptotically normally distributed. (b) Estimating the Parameter 0 When the Infinitesimal Generator is a Function of 9: Consider again a separable Markov process {X(t),t 2 0} defined on the probability Space (0,3,P9) and taking values in a finite set)(0 - {l,...;M} where 9 - (91""’9r) is a parameter, taking values in an open subset S in r-dimensional Euclidean Space Rr' Let P(t,9)=IIpij(t,9)II denote the stationary transition probability matrix function. In this case the condition expressed in (1.4.1) becomes lim p..(t,9) = 1 if i = j (1.4.8) t~O 13 =0 if ia‘j. 18. Then it follows from.Doob1[6, theorems 1.2 and 1.3 of chapter vi], that under certain conditions the limits lim l'pii(t’9) t—oO t 3 qi(9) < ° and (1.4.9) 1. p..(t,9) . .323 41—;— - q,j(e> 1 1‘ J exist for all i and j. Now we state the assumptions which are needed for proving the asymptotic properties of the maximum-likelihood estimate. Assumptions: D1. For each w, the sample function is a right continuous step function. The limits in (1.4.8) exist for all i and j and so there exist functions qi(9) and qij(9)(i ¢ j) satisfying (1.4.9). For all 9 and i, qi(9) >10. Since by assumption D the sample function is a right 19 continuous step function, the system starts at time t = 0 in state x0, remains there for time t then it jumps to some other 0’ state say x1, Stays there for time t1 and so on. If v(T) denotes the number of jumps in time [0,T), then it follows that {xn} is a Markov process and 'Q(Xn+1)9)a .,tngx0,...,xn+1} - e , o 2.0 (1.4.10) and 19. Q(Xn)j)e) P9{Xn+1=J I t0,...,tn;x0,...,xn] ='-a?;;:§y- (1.4.11) From (1.4.10) and (1.4.11) it follows that {(xn,tn), n=1,2,...} + is a Markov process on the Cartesian product x0 X R where pd I - (0,w). This process is called the imbedded process and as remarked by Billingsley (page 38, [4])the information contained in the sample {X(t),0 s t < T} is essentially the same as that in the sample {(xk’tk); k=0,...,\)(T)-1}. The complete sample contains only Slightly more informationtfluni the sample from the imbedded process. The additional information is the length for which the system is in state x v(T)' Let nij(0) = P9(Xn+1=J I xn = 1) 1 # 3. Then qij(9) = fiij(9) qi(9). We assume that the set D of pairs (i,j) for which nij(0) > 0 (or equivalently, for which qij(9) > 0) is independent of 9. (Note that (1,1) I D by construction). Hence, d s M(M - l) where d is the number of elements in D. Assumption D The set D of pairs (i,j) such that qij(9) > 0 2: is independent of 0 and the functions qij(9) have continuous third order derivatives throughout S. The d X r matrix aqi.(9) IIEKYJ II has rank r for all 9 e S. Assumption D For each 9 e S, the Markov chain {xn} has only 3: one ergodic set and there are no transient states. Now given a sample {(Xk’tk)’ 0 S k S v(T)} we can write 20. the log likelihood function from (1.4.6) by setting k = l, l l . . l . log Lé )= E N; )(1,J) log q..(9) - E Aé )(1) q.(9) (1.4.12) D 13 i 1 where Aé1)(i) denotes total time for which the system is in state i and N§1)(i,j) denotes the total number of direct transitions from state i to state j. In writing the expression for the log likelihood function we have, following Billingsley, dropped the term P9(X(0) = x0) because the information contained in P9(X(0) = x0) does not increase as v(T) a m. Thus the likelihood equations are: _§_ (1) (I) ' ° —5— (1) . i 9) 59” log LT =15) NT (1m) agu log qi.(9) -§13AT (1) agé—L J (1.4.13) u = 1,...,r. Finally we state the asymptotic properties of the maximum- likelihood estimate. For a proof see Billingsley [4]. Theorem (Billipgslgy): If (X(t), qij(9)’ S) satisfies conditions D1,D2 and D3 and 906 S is the true value of the parameter, then there exists a consistent solution 0 of (1.4.13). Moreover, , ”'3 . 0 11m v(I) (9 - 9 ) E 0, and .1. 11m v v (t1,3213) = .30 u21 , (2.1.7) then we have Lp r -Lp(t ,t ) “i‘t1) * - l 2 l 2 . (PO (8 32) = n (e 152% (2.1.8) i=0 -Lp(t1)t2) where (e €2)i is the ith element of the column vector e 32. We now proceed to calculate it. Since by assumption 11's are distinct and non-negative, the matrix L can be written in the form L2 NAM==N F10 M (2.1.9) 11 X _ r-J where N = M-1 and M is the matrix whose rows are eigenvectors of L corre3ponding to the eigenvalues 10,...,1r reSpectively. It is 28. easy to see that j-l 1 S a 0 N11 _ l Nij u 3:; xsmxj (r‘z J > 1) J' 1 . . M11 “ 1 Mij " ..izl (r.Z J > 1). (2.1.10) s=i+1 1 -1. s 1 In part1cular,Nir = 1 for all 1 = 0,...,r and Mir = - TEE—IMIr-l for i = 0,...,r-1. It follows from (2.15?) that —Lp(t t ) -Ap(t t ) 1’ 2 1’ 2 e ’92 = N e M792 , (2.1.11) ’Lp(t12t2) .. DAMtI’tZ) "’ (2-1-12) L91: 8 = N e ’ M = Ilaij(t1)t2)II r where aij(tl’t2) = 0 for 1 > j and janij(t1,t2) = l for all T a 0,...,r. Hence we see from (2.1.7) and (2.1.11), that n1(t1) (t1,t2)u2j ) , (2.1.13) which gives Wt1’t2 ; 31’32) 3 E g 301(t1) ( é aij(tl’t2)u2j)uli ] W (2.1.14) It is clear from (2.1.13) that the conditional distribution of E(tz) given E(tl) is the same as that of the sum of (r+l) independent random variables Such that the jth random variable has a multinomial distribution with parameter nj(t1) and probabilities aj0(t1,t2),...,ajr(t1,t2). 29. Next we proceed to calculate the second order mixed moments which will be needed in subsequent work. [I Evaluation of elements of R(t ,t ) E n(t ) n'(t ) 12 mlmz i! ll Ena(t1)nB(t2)II. Differentiating Y(t1,t u2) with respect to u and u and 2T1h 1a 25 ll puttingrp = u l where l is the column vector, we get 1 m2 m m r E na(t1) nB(t2) a n00(nOO-1) a0a(t1)[iz aOi(t1)aiB(tl,t2)]if B < o =5 3 H00 30o(tl) aas(t1’t2) v r ( \ + “00(“00'1) 80a(t1) E i28301~t1’aie(t1’t2> 3 if B_? o (2.1.15) - Ap(t ) -Ap(t )-Ap(t ,t ) Since N e 2 M = N e 1 1 2 M I p(t ) ‘ p(t 3t ) = N e 1 M N e 1 2 M , that is, Ilaij(t2)ll = IIaij(t1) II IIaij(t1)t2) II 2 then we have r . 301(t2) = Z an(t1) ajk(t1’t2) for 1 — 0,...,r i=0 Therefore (2.1.15) can be written as: = ( - .' < . E na(t1) nB(t2) n00\n00 l) 80o(tl) aOB(t2) if B o n00 30o(tl) aoB (t1’t2) + nOO(nOO-l) 800(t1) aOB(t2) if 8 2:0. (2.1.16) 30. This gives Cov “o(t1)n8(t2) = - n00 a01(t1) 306(t2) if B < a. = n00 a0y(t1) {aae(t1,t2)-aoe(t2)} if 5202. (2.1.17) 2.2. General Discussion of the Model: We have shown that for fixed t, 0 S_t‘5 to, n(t) has a multinomial distribution with ’\J parametertboand probabilities a00(t),...,a0r(t) where j C O O O j-1 r r-1 -(r-1)op(t) .0j(t) = .2 <-1) < i > < . > e 1=0 J-l These probabilities are functions of n O,m, and t; that is, they 0 depend on the initial concentration of bacteria, the multiplicity of phages and the duration of the experiment. In this section we confirm mathematically certain experimental facts. For example, we consider the limiting behavior of the probability distribution of n(t) when m, the multiplicity of phages, is large, and prove that in a short time all bacteria are saturated with the maximum of r phages, as is known experimentally. The following cases are considered: (a) Let m-# m, keeping t fixed. Then u = n (r—m) d -m as m- w. 00 -mt Since p(t) = é-log ( iEEIEL'——') 5 a (t) = f (411‘1 < r > ( r'i > < r‘m >r'i 0j i=0 ‘ i j-i r-m exp(-uyt) 31. and a0r(t) d l as m tends to infinity, that is, if the multiplicity of phages is large, then in a Short time all bacteria are saturated by the maximum of r phages. (b) If m is small and fixed, then m exp(-pat) is very small and l r P(t) 31; 10g 1:; (the Signgimeans approximate equality). In this case we get a fixed distribution after some time. In fact in a Short time all the phages attach themselves to bacteria and no more phages are left. Therefore further observations do not give us any more information. (c) Thus we see that if m is very large, then in a short time all the bacteria are saturated by the maximum of r phages, and if m is small, then in a short time all phages attach themselves to bacteria and no more phages are left. To avoid both these extreme cases, we choose m less than r in such a way that the product n -m) is constant; this proves to be an interesting 00 0 (2.2.1) is constant. If (2.2.1) is satisfied, it is clear from (1.2.12) that the distribution of n(t) tends to a multivariate normal distribution. '12 32. More precisely, if nj(t) - nOOaOi(t) l §j(t) = then as n00 tends to infinity, the joint distribution of §O(t),...,§r(t) tends to a multivariate normal distribution with means zero and covariance matrix W11 = II zij II where 2.. = 1 11 and z aOi(t)aoj(t) ‘% ij‘ "(1-a0,(t))<1-a0j’ ’ i # j. 2.3. A Useful Convergence Theorem: In this section we prove a convergence theorem which is used in the next section to derive the limiting joint distribution of E(tl),...,’p(tk),t1 < ,...,< tk. For the formulation of our theorem, we require the notion of UC* convergence. The notion of the UC* convergence has been introduced by Parzen in [13]; we use a slightly different defini- tion of it, already given by Sethuraman [18]. Let vn(0,'), n = 0,1,... be a family of sequences of probability measures on R the Euclidean Space of k dimensions. Assume that 8 takes k) values in a compact metric space I. Let Yn(u,9) denote the m characteristic function (c.f.) of vn(0,°), that is, “(3,6) = I exp(i p'pwnmmm) n = 0,1,... (2.3.1) Rk 33. Definition (Sethuraman): The family of sequences Vn(°;') is said * * to converge in the UC sense to vo(6,') (denoted by UC ) relative to 6 e I as n tends to infinity if (a) 3:1 I In)?” ' 1’0“?” I " 0 as n00 " ”2 (b) Y0(u,8) is equicontinuous in 6 at u = 0; m m and (c) YO(U,9) is a continuous function of 6 for each u. m m We shall also require the notion of weak convergence of distribution functions (d.f.g). Let Fn(x)(n = 0,1,...) be a sequence of d.f.'S. We say that Fn(x) converges weakly (denoted by Fn(x) w F0(xX)to F0(x) if If(x)an(x) tends to If(x)dFO(x) as n tends to infinity for all bounded continuous functions. Now let Hn(x1,...,xk) (n = 0,1,...) be a sequence of distribution functions (d.f.'S) in 111.1 + ... mk = m dimensional . 1 ._ Euclidean Space R.m where xi 6 Rmi. Also let Hn(x1) (n—0,1,...) be the correSponding sequence of the marginal d.f.'S and H:(inx1...xi_1) (n = 0,1,...) be the sequences of the conditional d.f.'S (i = 2,...,k). Then we have the following theorem ' l 1 Theorem 2.3.1. If (a) Hn(x1) w H0(x1) , and i * i (b) Hn(inx1,...,xi_1) Us H0(inx1,...,xi_1) relative to (x1,...,xi_1) e I where I is any compact subset of Rm + ... + m. ,and i = 2,...,k, 1 1-1 34. then Hn(x1,. .,xk) w H0(x1,...,xk) x1 1 k =I; H0(dx1) ..... 2E H0(dkux1,...,xk_1)- The proof of this theorem depends on the following result due to Sethuraman [18]. Let Fn(x1,x2) (n = 0,1,...) be a sequence of d.f.'S. Also let F;(x1) and F:(x2Ix1) (n=0,l,...) be the correSponding sequences of the marginal and the conditional d.f.'S, respectively. Theorem (Sethuraman): If (a) F:(x1) w F3(x1), and (b) F2(x Ix) 00* F2(x Ix) n 2 1 w 0 2 1 relative to x1 6 I where I is any compact subset of Rp and xzeRq, then Fn(x1’x2) 1' F0(x1’x2> x1 x2 2 = I F0(dx1) I F0(dx2Ix1). 0 0 Now we prove the theorem 2.3.1. Proof of theorem 2.3.1: We prove this theorem by induction. The result is true for k = 2 by Sethuraman's theorem. Suppose it is true for k, then again by Sethuraman's theorem it is true for k1+ l. 35. 2.4. The Limiting Joint Distribution of E(tl)""’n(tk): Let 1, m n(t ),...,n(t ) be k observations on the process n(t) and let m 1 m k m “1(51) ' “00301(tj) §.(t.) = ‘1 1 J {“00301(tj)(1‘a01(tj)) I: and 5.01.1) = I§O(tj)7 and 5 = S0031) slop srctl) J's'ruj) took) §r(tk) L ._ Also let W = E(gg'), that is, W is a square matrix of order mm k(r+l). It is the covariance matrix of the random vectorq§. r Further 2 §i(tj) = 0 for j = l,...,k; that is, the random i=0 vector 5 takes values in rk-dimensional subspace and W is a m . . 2 singular matrix of rank rk. The matrix W con51sts of k submatrices Wij (i,j = l,...,k) each of order (n+1) where Wjj = E(N(tj)§'(tj)) and W.. = E(§(t.) '(tj)) (i 9‘ J' = 1:'°°:k) 1] 1r» 36. Before deriving the limiting joint distribution of ,p(t1),...,p(tk), we prove several lemmas which are used for proving that the conditional distribution of §(t2) given * m §(t1) = E(tl) converges in the UC sense relative to any ’1: compact set I to a multivariate normal distribution (t1 < t2). Lemma 2.4.1: The c.f. of’€(t2) givenf5(t1) =.§(t1) is given by Tammy, I 3“?) 1. .1 = exp[-i no: ; (lag-(ti)))2 u2. j=0 "303‘ 2 J 1 1 2 a — {“00301(t1) + Kim) “00 a0, (t1) (1-aOi(t1))2 } x H r + 2 i=0 1 { E a (t t ) ( i u2j 1 )}] 08 - , exp " '(2.4.l) i=0 i3 1 2 (n00a0,>>2 Proof: As remarked in section 2.1, it follows from (2.1.13) that the conditional distribution of n(tz) given n(tl) is the same as ’b ’\o that of the sum of (n+1) independent random variables such that the jth random variable has a multinomial distribution with parameter nj(t1) and probabil1t1es aj0(t1,t2),...,a. (t1,t2). Jr So we have ( ;t |x(1- a0 .>> r 11E): a. jz(1:1,1: )exp( (n i= -0 j= —0 NIH where ni(t1) = nOOaOi(t1)+xi(tl)(nOOaOi(t1)(l-aOi(t1))) = exp[-1 1100 1 1. nOOa0i (t1)+xi (t1)nm2 aOi (t1)(1'aoi(t1))2} x NIH r + 2 { i=0 i ugi Hj‘ 1’“2 ) ex?‘ (n n00 aojoz2 1(1- -a0j>> NIH r log{ 2 a 1‘0 )1] The equation (2.4.1) may be compared with (2.1.13) which gives the conditional p.g.f. of n(t ) given n(t ). m2 ’1.1 Lemma 2.4.2: Let C be any compact subset of R the Euclidean n+1’ Space of (r+l) dimensions. Then for any fixed u m2, uggofiflmb=QfifiwQ1 00 00 unifromly in x(t1) on C. Here 50(32‘t2'i‘(t1) 1 l r _ .7 r a, (t ,t2)U2. = expEi { 2 Xi(t1) 80i(t1)2 (1-301(C1)) 2 lj 1 J % } i=0 i=0 (a0j(t,><1-aoj(c,>> 38. aij(tl’t2) aijxt1,t2)u2ju2j, r + 2 Oil(t ) { 2 l 1 jrj (aOJ(t,>aOJ.>(1-aOJ.(t,>))2 ll 0 2 - 1. g aij(t1,t2)(l-aij(tl,t2))u21 2 J 80J(t2)(1-80J(t2)) Proof: It is sufficient to prove the theorem for any bounded closed rectangle C = {E(tl) I x145 xi(t1) 5 xi"; 1 = 0,...,r} because any compact set can be enclosed by a bounded closed rectangle and uniform convergence on a set implies uniform convergence on all subsets of the set. From (2.4.1), we have 't 1: 1100932, 2he 1)) .1 . 2 r M(t ) ‘ exPE'l n00 JED (1-0 a0 J.(t2 ))% "2i 1. r + z {“00 301(t1) + "1(t1)(“00801“1)“”01“?”2 } X i=0 r i “2' 1og< 2 aJJ(t1,t,) exp< J —) J i=0 (n n00 aOJ '2' = exp[-1 1100 ED (f:ggj?2;7- ) u2j 39. 1 r .— + .20 {n00a01 + xi(c1)(n00a01(c1>(1-a01(c1>>>2 } x 1: log{l + i f: aii(t1’t2)u2j .1 i=0 (nooaoj(t2><1-aoj(c2)>2 2 2 - l- 5 aija0j.(t2)(1-a0j(t2>><1-aoj.(c25» 2 aij£t1,t2)(l-aii(t1,t2)u2j l 2 J 1 a0j(c2>(1-a0j(t2>) 41. l uniformly on C as n00 tends to infinity. Lemma 2.4.3: §b(u2;t2Ix(t1))is a continuous function of §(t1) ’b ’b for any fixed 32. The proof of this lemma is immediate. Lemma 2.4.4: If f(x,y) is a continuous function on [a,b] X [c,d], then the family of functions f(x,'), for x 6 [a,b], is equi- continuous in x at y = y0 e [c,d]. 'Egggf: First we note that the function f(x,y) is uniformly continuous since [a,b] X [c,d] is a compact set. The family of functions f(x,-), x 6 [a,b], is equicontinuous in x at y = y0 if given 6 > 0, there exists 6(a) (depending only on e) such that If(x,y) - f(x,y0) I < e if I y - y0 I < 6(a). But such a 6(a) exists since f(x,y) is a uniformly continuous function. Hence the family f(x,') is equicontinuous at y = yo. Lemma 2.4.5: 50(32;t2|§(t1)) is equicontinuous in §(t1) e: I at 32 = 0 where I is any compact subset of Rr+l' Proof: It is sufficient to prove the theorem for closed rectangles I H I = {§(t1)Ixi(t1) s xi(t1) S xi(t1),i = 0,...,r}. The function §b(22;t2I§(t1)) is continuous in (§(t1),gz) on I X J where J 18 any closed subset of Rr+l containing the p01nt 32 = 0. Hence by lemma 2.4.4, the function:§o(gz;t2I§(t1)) is equ1continuous in 3(t1) at £2 = 0. Theorem 2.4.1: The conditional distribution of §(t2) given ’b §(t1) =.§(t1) converges in the UC* sense relative to any compact ’b subsetIof Rr+l to a multivariate normal d.f. 42. Proof: SinceZEOgu2;t2L§(t1))is a c.f. of a multivariate normal d.f., we conclude from the lemmas 2.4.2, 2.4.3, and 2.4.5 that the conditional d.f. of §(t ) given §(t ) = x(t ) converges in the UC* I» 2 I» 1 I» 1 sense relative to any compact subset I of Rr+1 to a multivariate normal d.f. Lemma 2.4.6: If F(x,Y) is a d.f. such that the marginal d.f. G(x) has the c.f. expE- ] and the conditional d.f. “l 2 c: U U zijij 111j . _ ’ . .1 of Y given X - x has the c.f. exp[1 g xj u2j - 2 iZjBijUZiuzj, I then the joint d.f. F(X,Y) is multivariate normal d.f. Proof: The c.f. of F(X,Y) is expE- %- Z. BijUZiuzj]. E(exp[i g xj(ulj + u2j)] ). i,J J _ l. .1 ‘ exPE' 2 izjaij (“11 + u21)(“1j + “2j) 2 izjaijUZiUZjJ' ) I But this is the c.f. of a multivariate normal d.f. Corollary: If F(x1,...,xk) is a d.f. such that (i) the marginal -1 z 2 i,j alijuliulj] and (ii) the d.f. of X1 has the c.f. eXp[- conditional d.f. of X1 given X = x1,...,X 1 = x has the c.f. 1 i- 1-1 1 l exp[i 2 x .u. . - -' 2 j i'lyJ 1)] 2 “i j j"“ij“ij' jyj' J ) for i 2,...,k, then the joint d.f. F(x1,...,xk) is a multi- variate normal d.f. Proof: We prove this result by induction. This is true for k = 2 by the above lemma. Suppose it is true for k, then again by the above lemma, it is true for k+l. 43. Theorem 2.4.2: The joint d.f. of §(t1) and 5(t2) converges weakly to the multivariate normal d.f. N(O,II 11 12II) as n - W21 W 2 00 tends to infinity. Proof: It follows from theorem 2.3.1, that the joint d.f. of §(t ) and §(t ) converges weakly to a joint d.f. as n tends m 1 I» 2 00 to infinity. The limiting d.f. is a multivariate normal d.f. W11 W12 N(0,II II) by lemma 2.4.6 since the marginal d.f. of W21 w22 §(t1) converges weakly to a multivariate normal d.f. N(O,W11) N and by theorem 2.4.1 the conditional d.f. of §(t2) given §(t1) - 5(t1) converges in the UC* sence relative to any m compact subset I of Rr+l to a multivariate normal d.f. whose c. f. is E09255 I301» . Now we state and prove the main theorem of this section. Theorem 2.4.3: The joint d.f. of §(tj)(j = l,...,k) converges weakly to a multivariate normal d.f. N(O,W) as nOO tends to infinity. Proof: (1) It is clear from (1.2.12) that the marginal d.f. of §(t1) converges weakly to a multivariate normal d.f. N(0,W11). 'b (i1) From theorem 2.4.1, we know that the conditional ‘ * d.f. of §(t ) given §(t ) = x(t. ) converges in the UC sense q, 1 9 1-1 q, 1']. relative to any compact subset to a multivariate normal d.f. whose c.f. 15:§O€31;tiL§(ti-l)) as 1100 tends to infinity. This 13 true for 1 = 2,...,k. (iii) Fr'm the corollary to lemma 2.4.6 and theorem 2.3.1, 44. ,we conclude that the joint d.f. of 5(t1),...,§(tk) converges weakly to a multivariate normal d.f. N(O,W) as n00 tends to infinity. CHAPTER 3 ESTIMATION OF THE PARAMETER IN THE STOCHASTIC MODEL FOR PHAGE ATTACHMENT TO BACTERIA. 3.0. Summary: In this chapter we consider the problem of estimating the parameter a in the stochastic model for the attachment of phages to bacteria described in section 1.3. A simple method, of the type originated by Ruben [17], for estimating the parameter a in this model is described in section 3.2. The estimate is based on k observations 2ft1)’ ...,\n(tk) at times t = jT (T > o,j=1,...k) and is shown to J be consistent and asymptotically normally distributed. In section 3.6 we study the efficiency of the estimate. 3.1. Introduction: The stochastic model for phage attach- ment to bacteria gives rise to a multivariate stochastic process n(t), depending on a single unknown parameter a. This process R(t) is Markovian and its transition probabilities are functions of a. For a Markov process in general it does not seem unrealistic to expect that relatively efficient estimates for the transition probabilities, hence for the parameter a, may sometimes be obtained from the consecutive differences of the relative frequencies observed at discrete points in time. Following Ruben we call such an estimate the Mean Square Consecutive Fluctuation(M.S.C.F.) estimate. 45. 46. Consider the differences Fd-n>- E{ n, (“p(t1) - np(ci_1>>) = z, and (iii) for any point x e N(p(z)), F(x,fz(x)) = O and Ifz(x) - z | < CZ. (3.4.3) That is, for any point x e N(p(z)), fz(x) 6 NZ where Nz = (z - Cz’ z + CZ) Since fz is a continuous function, the set f;1(Nz) is an open set and contains p(z). Also f;1 (Nz) n N(p(z)) is an open set containing p(z). So we can choose a Spherical neighbourhood N*(p(z)) of p(z) such that * -1 -1 * N (9(2)) sz (NZ) 0 N(p(2)) and p (N (p(2)))C NZ -1 * because p is a continuous function. Now if p(zl)eN (p(22)) for any z in D, then p(zl) e p(Nz ) 2 but then 21 6 NZ . That is, due to inverse continuity of p 2 and continuity of f2 we can replace the neighbourhood N(p(z)) 1’22 * by the Spherical neighbourhood N (p(z)) with the additional 55. property * (iv) 1f p(zl) e: N (p(zz» for any 21,22 3 D, then 21 6: N22. ** Now consider the spherical neighbourhoods N (p(z)) with * radii equal to l/3 that of N (p(z)) with centre at p(z). Let ** N = U N (p(z)). The set N is clearly a neighbourhood of the 25D set S = {p(z) I z e D}. o ** ** We will show that if x e N (p(zl) n N (p(zz)), then fz (x0) = fz (x0). Since n**(p(zl>> n n**) ¢ ¢ 1 2 (where ¢ denotes the null set) we have, either p<21> e u*(p(zz>> (3.4.4) or p(zz) e N*>. (3.4.5) Suppose (3.4.4) is true. Then F. f22) = 0. But F(p(zl), le(p(zl))) = 0 and 21 e Nz2 , hence fzz(P(zl)) = fZI(P(Zl)) = 21- (3-4.6) ** * If x e N (p(zl)) flN (p(zz)), then fz is continuous and 2 * satisfies F(x,f (z)) = 0. Also f (x) e N for x e N (p(z )) z z z 1 2 l l and this implies that fz is the unique function, as is shown 1 below, which is continuous and has continuous first partial 56. ** derivatives in N (p(zl)) such that 0. (3.4.7) le(p(zl)) = 21 and F(x,le(X)) Suppose g is any other continuous function , having continuous ** first partial derivatives in N (p(zl)) such that g(p(zl)) = z1 and F(x,g(x)) = 0. Let B = {x I fzfx) - g(x) = 0}. Then B is a closed set since fzfx) - g(x) is a continuous function. Let x e B, then f (x) = g(x). Since f (x) e N , there exists an Open set G, z1 z1 21 containing le(x) (hence also g(x)) such that f;:(G) and g-1(G) are both contained in N(p(zl)). Hence by the implicit function theorem, fz (x) = g(x) on g-1(G), that is, x is an 1 interior point of B. So B is open. Therefore either B is the ** null set or the whole set N (p(zl)). Since B is not null, B ** is N (p(zl)). So fz is a unique continuous function, having 1 *9: continuous first partial derivatives in N (p(zl)) such that le(p(zl)) = 21 and F(x,le(x)) = 0. (3.4.8) Hence f (x0) = f (x0). (3.4.9) 21 z2 . *9: *9: If x e N = U N (p(z)), then x e N (p(z)) for some 2. ZeD Define f(x) = fz(x). It may be remarked here that in view of (3.4.9), we may take 57. ** any 2 such that x e N (p(z)). Thus we have defined a function on N. Clearly this function has the properties (a),(b) and (c) of the lemma. For (d), the neighbourhood can be taken to be id: U (N (p(2)) X NZ). zeD . 3.5. Properties of the M.S.C.F. Estimate: In this section we consider the question of existence of a root (or roots) of the estimation equation (3.2.2) and study the analytic properties of such a root. The idea of studying the analytic properties of an estimate was initiated by Rao [15] and has been found to be very useful in the theory of maximum likelihood estimation. As noted by Rao [15], "In fact, many probability statements concerning the maximum likelihood estimate are direct conse- quences of the continuity and differentiability properties of the maximum likelihood estimate as a function of the observed relative frequencies." This is also true for the M.S.C.F. estimate. First we prove that the M.S.C.F. estimate & is a continuous function of dp(i) dq(i) (i = l,...,k; p,q 0,...r-l) possessing continuous first partial derivatives with reSpect to each dp(i) dq(i). Then we deduce the consistency and asymptotic normality of the M.S.C.F. estimate &. We need the notion of Fisher Consistency (F.C.). Definition: A statistic T which is a function of dp(i) dq(i) (i = l,...,k; p,q = 0,...,r-l) only is F.C. if when the expected values of dp(i) dq(i), are substituted in T, the function T identically reduces to the value of the parameter. 58. If T is F.C. and is also a continuous function of dp(i)dq(i), then T converges in probability to a as n tends to infinity, 00 since as shown in (3.5.4), dp(i)dq(i) ‘ R converges in Probability ipq to zero as n00 tends to infinity. That is, in this case, F.C. implies the usual consistency (convergence in probability) Theorem 3.5.1: (i) As n tends to infinity, there exists, with 00 probability tending to one, one and only one function & of dp(i)dq(i) (i = l,...,k; p,q = 0,...,r-l) which satisfies the estimation equation (3.2.2) and has the following properties, (ii) & possesses continuous first partial derivatives with reSpect to all dp(i)dq(i), (111) &(R(a)) =0; for all a e D (which implies that &(d) is a consistent estimate of a), 1 (iv) n30 (& - a) is asymptotically normally distributed with mean zero and variance 02(a) as n00 tends to infinity, where 02(a) is defined in (3.5.6). Before we present the proof of this theorem, we make a few remarks. Remark 1: (t) is a one-to-one continuous function of a for 800 any fixed t. Proof: We have from (1.2.6) r- r 300“) = ( r-m exp(-n00(r-m)at)) ' Clearly a00(t) is a continuous function of 0. Suppose a1# 02 ( r-m exp(-n00(r-m)mf) r-m exp(-n00(r-m)a2t) 59. or m exp(-n00(r-m)alt) = m.exp(-n00(r-m)azt). (3.5.1) But (3.5.1) implies that 01 = a2. This proves the one-to-one continuit of a (t). y 00 Remark 2: Let d be a vector whose elements are dp(i)dq(i) (i = l,...,k; p,q = 0,...,r-l) and R(a) = E(d). Then R(a) is a one-to-one continuous function of a. Proof: Clearly R(a) is a continuous function of 0 because Ripq(a) (i = l,...,k; p,q = 0,...,r-l) is a continuous function of 0. Now R(a) is a one-to-one function of a if one element of R(a) is a one-to-one function of a. We show that R100(a) is a one-to-one function of a. Be definition R (m)=n'2 E(n(t)-n )2 100 00 O l 00 300(t1) (1 - a00(t1)) n 2 =(l-a(t))+ 00 1 00 Clearly R100(a) is a one-to-one function of a00(t1) which is a one-to-one function of 0. Hence R100(a) is a one-to-one function of a.’ Remark 3: If f(x) = (f1(x),...,fk(x)) is a one-to-one continuous vector valued function of a real variable x, then f(x) is inversely continuous if one of the functions f1(x),...,f (x) is one-to-one and inversely continuous. nggf: Suppose f1(x) is one-to-one and inversely continuous. Let fn(x0); n = 0,1,... be a sequence of points in the range Space of f(x) such that fn(x0) tends fo(x0) as n tends to infinity. Then f:1(x0) tends to ff)(x0). Due to the inverse continuity of -l n -l O _ f1(x), f1 (f1 (x0) tends to fO (f1(x0)) - x0. But 60. (f )-1(fn(x0)) = fi1(f?(xo), hence (ffi)-1(fn(xo)) tends to xO as n tends to infinity. This proves remark 3. Remark 4: In view of remarks 2 and 3, R(a) is a one-to-one and bicontinuous function of a. Proof of Theorem 3.5.1, (i) and (ii): The estimation equation is 1 pq . . -'2 2 R. d 1 d 1 - l = O . 1 p()qu rk . 1 P:q The expression on the left hand side of the above equation is a function of d and a. Denote this function by F(d,a). By assumption B2, a e (0,”). The function R(a) is a one-to- one and inversely continuous function of a as shown above. Also the function F(d,a) is continuous in d and a. Clearly‘gg exists and is given by pq .QE = l_.z z 5R1 d (i)d (1) ad rk i p,q 50 p q which is a continuous function of d and 0. Also the derivatives .bE . . a(d (i)d (1))ex1st and are continuous. We have P q 2 z R.pq R, = rk. (3.5.2) . 1 1pq 1 P:q Differentiating (3.5.2) with reSpect to a, we get Pq 5R 3R. 2 2 333 Ripq=-2 2:18;“ 3013‘! i qu i P29 = _ ..L ‘40 61. Thus we see that all the conditions of lemma 3.4.1 are satisfied- Hence there exists a neighbourhood N of the set S =[R(a)laeD} and a unique function &(d) from N to D such that (a) &(d) is continuous and has continuous first partial derivatives, (b) &(R(a)) = a for all a e D, (c) F(R(a),a) - 1 = O for all R(a) e N, (d) there exists a neighbourhood of the curve {(R(a),a)|aeD} in which the only zeros of the function F(x,z) are the points (d;&(d)). Thus we see that for d e N, the estimation equation has one and only one root &(d) which possesses the properties mentioned in (a) through d . B defin’tion R, = E d i d i ( > y x lpq ( p( > q( >) 53 E(np"“0q(ti-1):| 1 1 (a0p - a0p>(aoq(t.) - a0q) (3.5.3) 62. as n00 tends to infinity. Also dpdq>(nq(a0q(ti)-aoq(ci_1)> (3.5.4) as nOO tends to infinity. Hence from (3.5.3) and (3.5.4), we have R P 0 (3.5.5) dp(i)dq(i) ipq -' as n00 tends to infinity. Therefore given e,fl positive, there exists n(e,n) such that for n00 > n(e,n) P(Id-RI < e) > 1-1]. Hence with probability tending to one, as n00 tends to infinity the estimation equation has one and only one root which possesses the properties mentioned in the theorem. Proof of (iii): If follows from (b) that &(d) is F.C. Since x ; , P G(d) is a continuous function of d,and dp(i)dq(i) . Ripq + 0 as n00 tends to infinity, &(d) P'a as n tends to infinity. That is, 00 'o(d) is consistent.’ Proof of4(iv): To prove this part of the theorem, we need the following lemma: Lemma 3.5.1: If Xn - Yn‘g 0 and Fn(x) H F(x), then Gn(x) W F(x) where Fn(x) = P(X.n s x) and Gn(x) = P(Yn s x). This is a well kniwn result. 63. By Taylor's firmula, we have O(&(d) - a) = n2 O SNIo-d O 2 E (dp (i)dq (i)- m)[ J i p,q 3(dp (i)d q(i)) d* 0 where d: is a point on the line segment joining d and R. Let 2 Xn00=n00 E pzq(80q(ti)'80q(ti-1))[np(ti)'“p(ti-1)’aop(ti)'aop(ti-1)J' But (i) n63(nq(ti)-nq(ti g (a0q(ti)-a0q(ti_1)) as n00 —. on; (ii) we know from (3.5.3), that Ripq -+ (aOp(ti) - aOp(ti-1)) (aoq(ti) - an(ti-1)) as n00 tends to infinity; and (iii) [ <—5§ . 1 *.§ -j£L- a(dp(1)dq(1)) d aRipq as n00 tends to infinity, since &(d) has continuous partial derivatives. Clearly (i),(ii) and (iii) imply that n (& -(y) - X converges in probability to zero as n 00 n00 tends to infinity. But it follows from Rao[l4,§ 5e] that Xn 00 00 is asymptotically normally distributed with mean zero and variance 02(a) since X.n is a linear function of normally 00 distributed random variables. Here 2 2 [Riq REE] ] A‘ilAq i' E(Xp (i)Xp ,(i')) .,.' J L 'q' 02(0): 1 1 P Q,P aRqu_fi 2 ( Z 2 -- R ) (3.5.6) i pq aa ipq 1 2 . = - - - d where Xp(1) n00 ( np(ti) np(ti-1) aop(ti) aOp(ti—1)) an 64. q- Ai — (a0q(t.) (ti-1))' i - an 1 Hence n30 (& - a) is asymptotically normally distributed with mean zero and variance 02(a). 3.6. Efficiency of the M.S.C.F. Estimate: We now discuss the asymptotic efficiency of the M.S.C.F. estimate &. In the theorem 3.5.1. we proved that the M.S.C.F. estimate & is asymptotically normally distributed with mean a and variance 1 ;- 02(a). In section 3.3 we considered the whole class of 00 consistent estimates satisfying assymption F and proved that the variance of a consistent estimate is greater than or equal to %— . 0 Hence the efficiency of the M.S.C.F. estimate & of a is 1 given by 'ESSESEEZE) 3.7. A Modified Estimation Procedure: The estimation procedure described in section 3.2. is somewhat lengthy and is not suit- able for numerical computation. 'In practice the value of r (the maximum number of phages that can become attached to a bacterium) is usually quite large (for example, 130 or 140), and this requires the inversion of a 140 X 140 matrix. In this section we describe a simpka:method which is essentially a modification of the method described in 3.2. The modification consists in pooling the data in p classes. Let r be p positive integers such that r1 +...+rp = r-l 1,...,rp and let 65. r m0(t) = .2 nj(t) J=0 r2 m(t) = 2 n.(t) 1 '=r +1 J J 1 r1+.+rp m (t) = 2 n.(t). p j=r++r +1 J l ' p-l In practice, we may take p = 4 from the point of view of numerical calculation. Modified Estimation Equation: 1 I100 Let Q (i) = Fmoui) - m0(ti_1)] mp(ti) - mp(ti-1) and “1 = E(S i)€(i) ), then we have 1 ' -1 _ Emmni f(ifl' 1' ’b The equation (3.7.1) is true for all values of i, i = 1,.. This suggest that we use 4— 12; 5' n'1 a )= 1 pk i 1M1) i «.(i) ’ as an estimation equation. (3.7.1) .,k. (3.7.2) The estimation equation (3.7.2) may 66. be rewritten as l k p pq —— 5: 2 n 5 5 =1 (3.7.3) pk i=1 p q=0 1 p(1) q(i) where nipq denotes the (p,q)th element of ni-l. It may be noted that the estimation equation (3.7.3) connot be solved explicitly; however, numerical solutions can be obtained fairly readily. It is clear that the modified procedure of estimation will yield several estimates, one of these with the same asymptotic properties as that obtained by the original procedure described in 3.2. In general (3.7.3) will have many roots and the problem of selecting the root satisfying the conditions of theorem 3.5.1 does not seem to have a simple solution and needs some further investi- gation. Suppose a1,...,ap are p roots of (3.7.3), then sometimes it may be possible to determine p functions 6 ,...,&p of d such that l &l(d) = al,...,&p(d) = up. Now according to theorem 3.5.1 there is one and only one root which possessses the properties (ii) and (iii) of theorem 3.5.1 and we can select this root by checking these conditions. This is one possible solution and it is proposed to study this method in some detail. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] BIBLIOGRAPHY ALBERT, ARTHUR (1962). Estimating the Infinitesimal Generator of a Continuous Time Finite State Markov Process. Ann. Math. Statist. 33, pp. 727-753. ANSCOMBE, F. J. (1953). Sequential Estimation. Jour.Roy.Statist.Soc., B 15, pp.1-21. BARTLETT, M. S. (1949). Some Evolutionary Stochastic Processes. Jour.R9y.Statist.Soc., B 11, pp.211-229. BILLINGSLEY, P. (1961). Statistical Inference for Markov Processes. Institute of Mathematical Statistics, Univ. of Chicago. Statistical Research Monographs, Univ. of Chicago Press. BRENNER, S. (1955). The Adsorption of Bacteriophages by Sensitive and Resistant Cell of Escherichia Coli Strain B. Proc.R9y.Soc. 3.144, pp. 93-99. DOOB, J. L. (1953). Stochastic Processes. John Wiley, New York. FERGUSON, T. (1958). A Method of Generating Best Asymptotically Normal Estimate with Application to the Estimation of Bacterial Densities. Ann.Math. Statist. 29, pp.1046-1062. GANI, J. (1963). Models for a Bacterial Growth Process with Removals. Jour.ng.Statist.Soc.,B 25, pp.140-149. GANI, J. (1965). Stochastic Phage Attachment to Bacteria. To appear in Biometrics, March, 1965. KENDALL, D. G. (1949). Stochastic Processes and Population Growth. Jour.Roy.Statist.Soc.,B 11, pp. 230-264. MORAN, P.A.P. (1951). _Estimation Methods for Evolutive Processes. Jour.Roy.Statist.Soc., B 13, pp. 141-145. MORAN, P.A.P. (1953). The Estimation of the Parameter of a Birth and Death Process. Jour.Roy.Statist.Soc., B 15, pp. 241-245. PARZEN, E. (1954). On Uniform Convergence of Families of Sequences of Random Variables, University of Calif. publications is Statistics, 2, No. 2, pp. 23-54. 67. [14] [15] [16] [17] [18] [19] [20] 68. RAO, C. R. (1952). Advanced Statistical Methods in Biometric Research. RAO, C. R. (1957). Maximum Likelihood Estimation for the Multinomial Distribution. Sankhya 18, pp. 139-148. RUBEN, HAROLD (1962). Some Aspects of the Emigration- immigration Process. Ann.Math.Statist. 33, pp. 111-129. RUBEN, HAROLD (1963). The Estimation of a Fundamental Interaction Parameter in an Emigration-immigration Process. Ann.Math.Statist. 84, pp. 238-259. SETHURAMAN, J. (1961). Some Limit Theorems for Joint Distributions. Sankhya 23, A pp. 379-386. TAYLOR, A. E. (1955). Advanced Calculus. Ginn and Company. YASSKY, D. (1962). A Model for the Kinetics of Phage Attachment to Bacteria in Suspension. Biometrics 18, pp. 185-191. Mil A" 8" HM v.” H u H E” H N” I III EHI 93 03177 5962 3 JIM IIHIWIHH