ICHIGAN STA N Ilfllllllllillllllll ilifliii‘lili‘ifliififil 3 1293 01051 7872 This is to certify that the dissertation entitled Estimation in Interval Censorship Models presented by Zhiming Wang has been accepted towards fulfillment of the requirements for PhoDo degree in Stat‘iSt'iCS ACWW V . / Major podfessor Date October 28, 1993 MS U is art Affirmative Action/Equal Opportunity Institution 0-12771 LIBRARY Michigan State Unlverslty PLACE It RETURN BOX to remove thte checkout from your record. TO AVOID FINES return on or before date due. DATE DUE DATE DUE DATE DUE MSU le An Afflrmetlve ActlorVEqueI Opportunity lnetltulon Wilma-9.1 ESTIMATION IN INTERVAL CENSORSHIP MODELS By Zhiming Wang A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Statistics and Probability 1993 ABSTRACT ESTIMATION IN INTERVAL CENSORSHIP MODELS By Zhiming Wang Interval-censored data arise in the analysis of survival times where there is only periodic assessment of subjects for outcomes of interest. We address the problem of estimating the survival distribution of the time of onset of some biologic event when its time of occurrence is not observed directly, but is known to have taken place in some time interval determined by the pattern of the examination times. This event time may also be right-censored if the event has not occurred at the time of the last assessment. We introduce a statistical model incorporating the examination times and the event status at these times that indicates whether or not the outcome has already taken place. Two cases are considered, one in which the number of assessments that are made over the period of study is fixed, and in the other where this number is random. Identifiability of the distribution of the event time is proved. A class of estimators of the survival function is proposed in an interval censorship model in which there is one examination time. These estimators are shown to satisfy certain self-consistency equations and this provides a means of their computation using the EM algorithm. The relationship between this class of estimators and the ii nonparametric maximum likelihood estimates is investigated. We establish the strong consistency of our estimators and derive the weak convergence of certain functionals. A corresponding Bayesian estimator is constructed under the assumption of a Dirichlet process prior, and is shown that it does not necessarily converge to the nonparametric maximum likelihood estimator. Simulation studies are presented under parametric assumption with two examination times. Comparisons are made between the nonparametric Tumbull estimate of the survival function and the maximum likelihood estimate under the parametric model. iii To my wife Liqin and my daughter Cynthia iv ACKNOWLEDGMENTS I thank my advisor Professor Joseph C. Gardiner for his patient guidance and encouragement during the preparation of this dissertation. I would also like to thank Professor R.V. Ramamoorthi for an introduction to some problems studied here and for fruitful discussions. I extend my sincere thanks to Professor Habib Salehi and Shlomo Levental for having served on my guidance committee and for their comments. Finally I would also wish to thank Mr. Alexander White for helping me to improve the language of this thesis. TABLE OF CONTENTS Chapter Page 1. INTRODUCTION AND SUIVIMARY .............................................. 1 2. IDENTIFIABILITY ................................................................................. 6 2.1 Interval Censorship Model ................................................................. 6 2.2 Modell ............................................................................................... 8 2.3 Model 11 .............................................................................................. 12 3. STRONG CONSISTENCY AND WEAK CONVERGENCE 16 3.1 Introduction ........................................................................................ 16 3.2 Self-Consistency ................................................................................ 18 3.3 Strong Consistency ............................................................................. 22 3.4 Weak Convergence ............................................................................ 27 4. BAYESIAN ESTIMATION ................................................................. 34 4.1 Introduction ........................................................................................ 34 4.2 Right Censoring .................................................................................. 35 4.3 Interval Censoring .............................................................................. 38 5. SIMULATIONS .......................................................................................... 43 APPENDIX ....................................................................................................... 49 REFERENCES ................................................................................................ 54 vi Chapter 1 INTRODUCTION AND SUMMARY The past two decades have witnessed the rapid development of statistical methodology for the analysis of survival times from censored data. These data arise in several clinical, biomedical and epidemiological studies where outcomes of interest are response times, for example, time to tumor appearance, time to relapse, or time to death or failure. These endpoints, however, may not be observed in all subjects for various reasons. Subjects in the study may be lost to follow-up due to withdrawal or to the occurrence of an end point unrelated to the outcome of interest in the study. These data are turned right censored. The typical situation where right censorship occurs is when subjects enter a study at different (random) times and are followed until a specific endpoint is observed. The time of its occurrence T, measured from entry, will be right censored if by the time of termination of the study the event of interest has not taken place. The situation where the event/outcome of interest may be subject to left or right censoring is generally referred to as double censoring. Leiderman (1973) describes a study of infant precocity in a group of Kenyan children in which subjects entered the study at different times and were tested periodically to ascertain whether they had acquired certain skills. The time of onset T of the development of these skills was of interest. We note that T is left censored if a child at the time of entry into the study has attained the desired level of development, and right censored if, by the last available assessment time, the child has not reached the goal. 2 Interval censorship arises where continuous monitoring of outcomes of interest is impractical, and assessments of the study subjects can be conducted only periodically. The precise time T at which the outcome occurred is not observed, but is known to have taken place within a specified time interval determined by the sequence of observation times {Wk : k z 1}. What is known is that the event occurred in some time interval (Wk, WM]. Therefore, the event time T is said to be mm. For example, Peckham (1991) reported for the European Collaborative Study regarding the development of AIDS in 600 children born to mothers who tested positive for the human immune deficiency virus-1. These children were examined at birth and at 3 months intervals until 18 months of age, and again every 6 months thereafier until 4 years of age. Thus changes in their clinical status were not observed directly. They were seen only at the age at which disease was first detected and the age at the previous assessment and hence the time of onset of infection or disease is interval censored. In each of the cases of censorship described here, a basic problem is the estimation of the distribution of T from a sample of observations under nonparametric circumstances. For right censored data, Kaplan & Meier (1958) introduced the product limit (PL) estimator, which has become the cornerstone for almost all analysis of right censored survival data. The properties of the PL estimator, its asymptotic theory and the statistical inferential procedures based on it, have been extensively investigated by several researchers. Turnbull (1974) introduced an algorithm for estimating survival curves from double censored data, and Chang & Yang (1987) described a statistical model under which identifiability of the distribution of T and the strong consistency of an estimator of it were established. The weak convergence of the estimator was proved by Chang (1990). Samuelson (1989) describes another approach to the problem of estimation using a formulation based on counting processes. 3 Tumbull (1976), improving on a method of Peto (1973), developed an algorithm for computing an estimate of the survival function fi‘om interval censored observations. His method of estimation is based on maximum likelihood considerations and yields a system of equations (self-consistent equations) that may be solved using the EM algorithm. An investigation of the properties of the Turnbull estimator was made very recently by Groeneboom and Wellner (1992), who explored a connection in the construction of the nonparametric maximum likelihood estimator (NPMLE) and isotonic regression. They established the strong consistency and weak convergence of the NPMLE in two models of interval censorship. The focus of this thesis is on three issues in interval censorship: Identifiability of the distribution of T in a general model of interval censorship, Consistency of a sequence of estimators of the distribution of T and a Bayesian approach to this estimation. Chapter 2 introduces a general model of interval censorship that incorporates the entire pattern of potential examination times over the duration of a study. We consider two designs. The first permits only a fixed number of inspections of the units, while the other allows the number of assessments made during the follow-up period to be random. Within a nonparametric formulation and under some mild conditions, we prove the identifiability of the distribution of the event time and describe its relationship to other interval censorship models. Chapter 3 studies the strong consistency and weak convergence of the estimate of survival firnction. The estimator is defined implicitly through two equations ((3.3) and (3.4)) and we demonstrate that their solution is equivalent to the solution of the self- 4 consistency equation. Turnbull (1976) has suggested that a self-consistent estimator is a maximum likelihood estimator (MLE). However, we have found self-consistent estimators that depend on the initial mass assigned in the EM-algorithm and do not lead to maximum likelihood estimators. An example is given in section 3.2 showing we have different self-consistent estimates derived from one set of interval censored data. Strong consistency is proved for the class of self-consistent estimators under certain conditions which are also satisfied by the MLE. Therefore the MLE is also strongly consistent under these conditions. In Section 3.4, we study the weak convergence of our estimator. Let S(t) be the survival function of T and S‘"’(t) be a self-consistent estimator of S(t). For the double censorship model, Chang (1990) shows that JE(S(")(t)—S(t)) converges weakly to a Gaussian process. However, in our interval censoring model, we do not achieve such a property. Instead, we show that J5]: (S‘"’(s) -S(s))ds converges weakly to a normal distribution. In Chapter 4 we construct a Bayesian estimator of S(t) under the assumption of a Dirichlet process prior. For right censored data, Susarla & van Ryzin (1976) . demonstrated a nonparametric Bayesian solution to estimation under squared error loss. The resulting Bayesian estimator was shown to reduce to the PL estimator in the cases where or(R+)—>O, where a( - ) is a finite measure on R‘“ which serves as a parameter of the Dirichlet process prior. By following the process of construction of the Bayes estimate of S(t) for right censored data, we find that, for right censored data, the estimator satisfies certain self-consistency equations. Several relationships between the Kaplan-Meier PL estimator and Bayes analog are obtained and an expression for this estimate under 5 interval censorship is presented in section 4.3. It turns out that this Bayes solution does not necessarily reduce to the MLE as a(R+) —>O. Examples simulation studies are presented in Chapter 5. The underlying survival time T is assumed exponential and the inspection times are Gamma-distributed. We compare the Turnbull estimate of the survival function with that of the MLE obtained under the parametric assumption. For the former we employ our APL programs to carry out the task of solution of the self-consistency equations. The SAS LIFEREG procedure is used for the parametric model. Chapter 2 IDENTIFIABILITY 2.1 Interval Censorship Model Let T denote the time of occurrence of the event or response of interest, and let {Wk : k 2 1} be the increasingly ordered sequence of potential observation times, with all variables being measured from a common time origin. At the k-th examination time W,, the event status is designated by the indicator 5k= [ T s W, ], that is 6,, = 1 if the response has occurred by time W, and 5,: 0 otherwise. We set W0 = 0 and for each I, define N(t)=min{k21: 8k=l,W,St}, (2.1) if this minimum exists and set M!) = +00 otherwise. Also M(t) = max { k 2 l: W, s t } is the last examination at or before I, which takes place at time WM (see Figure 2.1). Since throughout this paper t is held fixed, we shall not exhibit the explicit dependence of the variables on t. Under the assmnption that W1 5 t almost surely, M(t) is finite, also N in (2.1) when finite, marks the first examination at which a positive diagnosis of the event 6 7 occurring at the unobserved time T is made by time t. Then L = WM1 and R = W N, define respectively the time of the last negative assessment and the time of the first positive assessment. In this case we have T e ( L , R]. On the other hand if N = +00, we only have the knowledge that T > W M, which makes T right censored. Therefore the information available on T is that Te (WN_,,WN] when N is finite, and for N = 00, T > WM. Define We, = WM. The problem addressed in this chapter is the identifiability of the distribution of T on the basis of the datum (WN_,,WN,N,M). Specifically, if N' and M are defined similarly for the nonnegative variable T ' and the positive increasing sequence {W k': kal} then we wish to show that {T, W, , k =1, 2, - - - } and {T', Wk' , k =1, 2, - - ~ } have the same distribution on [0, t], when the distributions of (WN_,,WN ,N, M) and (W,;,._,,W,;,.,N', M') are the same. Turnbull (1976) described a method of maximum likelihood estimation of the distribution of T based on data of the type T e (L , R], with R possibly infinite when T is right censored. Groeneboom and Wellner (1992) have developed estimation methods for two interval censoring models. In the simplest of these, called Case 1, there is a single examination time X, and the observable aspect of the model is the pair (X , 6 ) where 8=[TSXJ. This situation is obtained by taking X = W,, for any examination time Wk. In the second case, called Case 2, there are two examination times X and Y, with X < Y, with the observable datum being (X, Y,y ,y '), wherey = [TSXJ and y ' = [T e( X, Y]]. In the context of periodic assessments Case 2 is also too wide, since the times X and Y can be obtained from the examination history in several ways. For instance, from any two examination times W,, W]. with W, < W}, we get Case 2 by defining X = W, and Y = W}. Our model, on the other hand, focuses on the entire examination pattern and defines the variables that are relevant. 8 For our identifiability problem, we consider two situations. In both the times of inspection are random. In the first, a fixed number of inspections is made in each subject, that is M is a constant. We refer to this case as 'Model I'. In the second situation, the follow-up period t is fixed and then the number of inspections M over the period [0, t] is a random number. We refer to this case as 'Model 11'. Let S, (x) = P (T > x) denote the survival distribution of T, and V, = Wk- W,,_1 , k 21. The basic assumptions that we make to establish our results are the following: (A1) Tis independent of{ W,; k =1, 2, - - - }. (A2) {Vk : k 2 1} are independent and nonnegative, with the density g, of V, being continuous and positive on (0, 00). (A3) 0 < S,(x) < 1 for x e (0, co), and ST is continuous. The conditions (A1) and (A3) were used by Chang and Yang (1987) to prove identifiability in a model for double censorship. Identifiability of the distribution of T fails when the inspection times are degenerate; that is g, = O a.s.. See Remark (2) at the end of the proofs for an example. Therefore in this nonparametric framework we will not have identifiability if the inspections are made at fixed times. Only the probabilities assigned by the distribution of T to the fixed intervals (WM, Wk] can be identified. 2.2 Model 1: Random Inspection with Fixed Number of Inspections During Follow-up Since M is assumed here to be fixed, let M = m (21), where m is an integer. Also N is the first integer k for which T S W, or N = 00 if no such k exists. We first define 9 several sub-distributions that will be used in our proofs. All arguments x, x,, x2 6 R”r = [0, 00)- Let Ql (x)=P(Wl Sx ,N=l), Qa,(x) =P(Wm Sx , N=oo), and forkz 2, Q.(xl ,xz)=P(W.., Sx. , W,.Sx2 ,N=k). From our assumptions, Wk will have a density f, which is continuous and positive on [0, 00). In fact f, = g, and f, (x) = [I'm (u) g, (x — u)du, k 2 2. Then Q,, Q, and Q, for 122 can be expressed as follows: Q. (x) =Jo’{1-Sr(u)}f.(u)du, (2.2) Q. (x) = Jam/mm (2.3) on. ,x.) =1," If {5.6) — Sr(v)}f.-1 cog. (v - u)dvdu, k 2 2. (2.4) Our first theorem shows that T is identified from the datum ( Wm , WN, N ). Theorem 2.1 Under assumptions (A1)-(A3), ST is uniquely determined by Q,, Q00, and Q, (k = 2, . . ., m). Proof: For the random variables T ', W,’ , k 21, we define V,’ = Wk' - WH' ( Wo' = 0 ) and N ', M ' as in (2.1). Let S,(x) = P (T' > x) and fl, g,’ be the densities of Wk', V,‘ respectively. Assume the conditions similar to (A1) - (A3), and suppose they produce the same sub-distributions Ql , Q”, and Q, (k = 2, - - -, m) as in (2.2) - (2.4). We get 10 {1 - 510)”? (I) = {1 - 51(x)}f.'(X), (2-5) 5106)]; (x) = 5106)}? (X), (2-6) {57051) ' Si(x2)}fk-1 (x1) gk(x2 ' x1) = {Sr(x1) ' S1(x2)}fk.r'(xr) gk'(x2 ' x1), (2-7) where k = 2, 3, ~ - -, m. We will prove S, a S,. If m = 1, we only have (2.5) and (2.6), and together these yield f, (x) =f,‘ (x). Hence S, = S, since f, is assumed positive. For m 2 2, rewrite (2.5) - (2.7) in the form {1 - 3700} {f1 ()6) -f1'(X)} = {5206) - 51(X)}f1'(x), (2.3) SKX)%' (x)-f..(X)} = {5706) - S1(x)}f..'(x), (29) {(ST -S,)(x,) ' (str)(x2)}fi.r (x1) gr(x2 ‘ x,) = {Sr(x|) ' S1052» {fr-1.051) gr'(x2 ' x1) ’fk-r(x1)gk(x2 ' 351)}, (2-10) where k= 2, 3, - - -, m. Let h = ST - S,. Then if h at 0 we must have one of the following two cases: Case ( I ): h(x) 2 O (or symmetrically h(x) S O), x eR+. Case ( II ): There exist x,, x2 eR”, such that h(x,) > O and h(xz) < 0. Now we prove both cases will lead to a contradiction. 11 In Case ( I ), since h is continuous and h(O) = 0, there exists to eR+ such that h(x) 2 O for xe [0, to], and h(x) < h(to) for x e [0, to). From (2.10) with k = m, flu-1'05) gm'(t0 " x) (fin-1(x) 8.110 ' x): x E [09 to). So. fro.) = I,“ f -.'(x)g.'(t. —- x)dx < I: fin—I (x)gm (t0 — x)dx =f...(to )- (2.11) Since S,(to) - S, (to) > 0, we get from (2.9) 123(4)) >fm(to ) (2.12) Hence (2.11) and (2.12) are contrary to each other. Symmetrically if h S 0, the proof is the same. Therefore Case ( I ) cannot hold. In Case ( II ), since h is continuous and h(0) = 0, there exist t0 , t, with 0 S t0 < t,, such that h(to) = O, h(t,) = O, and h > O on ( t0 , t, ) (or h < O on (to, t,), in which case the proof is the same). Let us assume to >0, for otherwise we would essentially have Case ( I ) again. By (2.8) and (A3) we have h(to) =f1'(to), f1(t.) =f1'(t.), and/'10:) >fl' (X) for 15600 , 10- (2.13) Taking x, = to and x2 6(1‘0 , 1,) in (2.10) with k = 2, we obtain fr'(xl) g2'(x2 - xi) <fl (x.) g2 (x2 - x1), and in view of (2.13), also g2'(x) < g2 (x), for xe(0, t, - 0). (2.14) Taking x, e( to, I, )and x2 = t, in (2.10) with k = 2, and in view of (2.13), we obtain f1'(xr) gz'(x2 - 16,) >f.(x1) g2(x2 - x1) >f1'(x1) 82062 - x1)- So, g2'(x) > g2 (x), for xe(0, t, - to). (2.15) Hence (2.14) and (2.15) are contrary to each other. Therefore Case ( II ) cannot hold. 12 Corollary 2.1 The identifiability above is total identifiability, that is { f, , g, , k = l, 2, - - - , m } are also identified by Q, , Q 2 , - - . , QM, Q00. Proof: We need to provefis f,’ , g,-=- g,', k=1, 2, - - - , m. We have already proved S,(x) = S, (x) for x eR+. By (2.5) and (A3) we get g, =f, =f,’ =g,’ on R+. In (2.7) with k=2, take x, = 0, x2 =x eR“, then {1 - 510an (0) g; (I) = {1 - S,(X)}f1'(0) 82' (X), and it follows that g2(x) = g2'(x). From f2 (x) = ff, (u) g2 (x - u)du, we obtain f2 (x)=f2' (x). By the same arguments and inductively we can establish 1; (x) = fi'(x) and g, (x) = g,'(x) forlSkSm. 2.3 Model II: Random Inspection with Fixed Follow-up Period In this case the number of inspections M made over the observational period [0, t] is not fixed in advance. Then M becomes a random integer that is the total number of inspections made in this period. Note that N takes on values 1, 2, ~ - -, M and +00. Let Q (x, ,x2 , n , m) = P(WN_, Sx,,W,,, S x2,N = n, M: m), (2.16) where OSx,,x2 S t,1 S n S m andm _>. 1; and Qw(x,m)=P(Wme,N=oo,M=m), (2.17) where 0 S x S t and m 2 1. The following theorem shows that T is identified from the datum (Wm, WN, N, M). 13 Theorem 2.2 Under assumptions (A1) - (A3), on [0, t], S, is uniquely determined by the sub- distributions of (2. 16) and (2.17). Proof: Let Q: (x)=P(W,Sx,N= 1): = ZPU‘Vn—r SxraWN Sx,N=l,M=M), m-l = ZQ(x,,x,1,M), nt=l and Q2 (x,,x2)=P(W, Sx, , WZSx2,N=2), = iQ(x,,x2,2,m), m=1 where x, x, , x2 6 [0, t]. By the same argument leading to (2.2)-(2.4), Q, and Q2 can be expressed as follows: Q, (x)=],’{1-Sr(u)}fi(u)du. (2.18) Q2(x1 ,x. > =1," I,” {5.6) — Sr(v)}f1(u)g2(v - u>dvdu. (2.19) wherex,x, ,x,e[0,t]. For xe [0, t], let (72(x) = P(V2 > x) = fg,(u)du, then Qw(x, l )=P(W, Sx, T2 W,,M= 1), =P(W, Sx, T2 W, , W, + V2>t), = L’P(T>u,V, >t—u)f,(u)du, = 10’s,(u)f,(u)c_;,(t —u)du. (2.20) We will prove ST is uniquely determined by Q, , Q2 and Q0, . 14 As in the proof of theorem 2.1, for the random variables T ', W,’ , k 21, we define V,'= W,‘ - W,_,'( W,'=O)andN',M'asin(2.1). LetS, (x)=P( T'>x)andf,',g,’ bethe density of W,', V,’ respectively. Assume the conditions similar to (A1) - (A3), and suppose they produce the same sub-distributions Q and Q, as in (2.16) and (2.17). From (2.18)-(2.20) we get {1 ' 51(x)}f1 (x) = {1 ' S,(x)}f,'(x), (2-21) {51(x1) ‘ S7(x2)}f1 (x1) 32 (x2 ' x1) = {Sl(xl) ‘ Sr(x2)}fr'(xr) 32.052 " x1), (2-22) S7001: (xx—ix t - x ) = Slam '(x)C—32'(t - x), (2.23) where x, x, ,x, e[o, t], and (72'(x) = P(V2'> x) = fg;(u)du. Since (72 and (72' are survival distributions, taking x = t in (2.23) yields $10M (t)=S.(t)f.'(t)- (2.24) Together with (2.21), this gives f, (t) = f,'(t), and since f, and f,' are positive, we also get S,(t) = S, (t). Hence ST and S, agree at zero and t. Suppose S, :t S, on [0, t]. There exist t0 and t, with 0 S t0 < t, S t such that S,(to) = S,(t,), S,(t,) = S,(t,), and S,. > S, on (to, t,) (or S, < S, on (to, t,)). The rest of the proof follows the same arguments as in Case ( II ) of section 2.3 in order to obtain a contradiction. Therefore ST E S, on [0, t]. Corollary 2.2 The densities {j}, g,; k 21} are also identified by the sub-distributions Q and Q00. Proof: The proof is the same as corollary 2.1, note we have the sub-distributions: 15 Qk(xl ,x2)=P( Wit-13x19 Wkst’Nt=k) =Q(x,,x2,k,k)+Q(x,,x2,k,k+1)+ ' ' ' =P( W,_,_<_x, , W,,Sx2, W,_, t ), Y be the inspection time on R+ = [0, 00) with survival function S,(t) = P( Y > t ), and 5=[TS Y]. We observe n independent and identically distributed copies of (Y, 8), {(Y, ,8, ): i=1, 2, n }. Assume the following conditions to hold: (B1) T and Y are independent. (BZ) S, is continuous, and S,(s) - S,(t) > O, for V O S s < t < 00. (B3) S is continuous, and S(s) - S(t) > O, for V 0 S s < t < 00. Define W, (t)=P(Y >t, 6 =1), W. (t)=P(Y», 6 =0). teIO, co), which can be written in terms of the survival functions as M(t)=-]i1—St], hi the left censoring process Nimitz St, 8.-=1], 1'21 and the right censoring process main: St, 6.=01. i=1 Let WI‘"’ = -],f(1- s‘"’(s))dS;"’(s> , W (t) = -J: S“) (new (s). This approach can be applied to right censorship and double censorship. Chang and Yang (1987) utilize this method in the latter case. However, in our case of interval censoring, if we solve for S‘") and S,‘"’ from these equations, then S,.“) will be the empirical of S, and S‘") will not be a survival function. The modification made in (3.3) leads to the appropriate solution of 8"" as a survival function. (2) The solution S,"" from (3.3) and (3.4) is not the empirical S)”. Later in Section 3.3 we give its relationship to the empirical Sf”. 3.2 Self-Consistency Given { (Y, ,6, ): i=1,2,---,n } = 11/", computation of the conditional expectation F(t/\)u)[ F(t)—F(t/\u) E(— ;§[T>t] I Mn)gives H{_ [x_u]+ 1—F(u) [x > u]}dP,, (x,u), where P,,(x, u) = 12m S x, Y, S u], F = l-S. A self-consistent estimator S‘") of S is a solution of i=1 the equation (self-consistency equation) 50%): E(— 3121754 I 2,), "i=1 where the right-hand side is evaluated at S"). In this section we show the equivalence between the self-consistent estimator and the solution to the equations (3.3) and (3 .4). 19 Theorem 3.1 Given the observations { (Y, ,5, ): i =1,2,--°,n }, if S ('0 is a self-consistent estimator of S , then there exists a survival fitnction SI"), such that S 0') and SI") are the solution of the equations (3.3) and (3.4). Conversely if S 0" and S,(f') are the solution of the equations (3.3) and (3.4), then S 0') is a self-consistent estimator of S. Proof: Suppose S 1") is a self-consistent estimator of S, then 1—=s‘"’(t) ES,,{E,(t) ”1,... , 1;; 5,,---,5,,}, (3.5) where F,,(t) = 12”; S t] is the empirical of F = 1-S(t), and E W is the expectation n i=1 under the assumption that T, have the survival distribution S 0" for any i. So 1" 300“)" — 12E 8"" II]; St” X’si}: 1- S(")(t A 1;) + 50%: A 1;) — S(")(t) 1-50009 " SW01) __1_ " _ *n§{ (1 51)}, 1— S‘"’ (t) -sS‘"’( ) ,S(n) ( S) _S(n) ( I) We ) Recall the definition of W,"° and W2”), and that N L(oo) + N R(°°) = n. We get =—[NL()+ +I,""1 —dNL(s s)+ +I dN,(s)]. (3.6) 8%) = W."” Y,,, be an arbitrary point to put on the remaining mass of 5‘") and SI"). Then .S‘"’(00)=S§")(00) = O. This does not affect the self-consistency of 5‘"). So (3.7) becomes an no 1- S‘"’ (t) . (n) _ (n) (n) (n) (n) (u) (n) S (t) — W. m— LS (ads, + Lam—”4W1 -S (oIodS, (s), with S§"’(O) = 1. Using the fact that d(UV) = VdU + U_dV (3.10) for discontinuous functions U and V, this becomes 1—S‘"’(t) 1-S(")(s) which is (3.3). Hence S1") satisfies equations (3.3) and (3.4). W.‘"’(s)dW‘ (s). (3.12) Then (3.3) becomes We)=-I,°fS;"’(s—)ds‘"’(s). (3.13) Since W,<"’(t) + W,<'°(t) = Em), and using (3.9), we get Mme) = d§§"’(t) - dWZ‘Wt), = We) - S‘"’(t)dS§"’(t), = mm) — St’u» + (1 — S‘"’(r»dS§"’(t). (3.14) Differentiating both sides of (3.12) and using (3.10) yields dW,"”(s) _ 1- S(")(t) dV,(n)(t) ____ (In/‘0!) (l) _dS(")(l)fo l— S(n)(s) 1_ S(n)(t) dW."°(t). 23 = —dS(")(t)_[wl—df/(—g:;)%)-)u = _ds<">(r)I,°° “3:2,; 295:0» + ds‘"’I,°° ——4: : EL: 8 St") (s), = —dS‘"’ (t) I °° “5,352,; 2953(3)) + dS‘") (as?) (t—). (3.15) Now using dV,"” (t) = s;”’(t—)ds("’(t) by (3.13), we get ds<">(z)I°° d<§f;)£2,;f:;;(‘)) = o , t2 0. (3.16) Since { 1,: k = 1, 2, - - -, m} are the points ofjump of S‘") , so dS‘"’(t,) at 0 and (3.16) implies ,wd(§§"’(s)-S§"’(s)) :0 ,=, 2 , . (1—s<">(s)) --,m. (3.17) Hence mm) -S§"’(s» _ 0 1..-...) (l-S‘"’(s>> - ' Since 8"" is constant on [t,_,, t,), 1 "(n) (n) _ . . . ,_ S(")(tk_1)[,k:[,td)(sy (S) — Sy (S)) — O , whrch implres I,”d(-§1"’—S§"’) =0. (3,18) Therefore SI")(t,-)=S,(,")(t,—), k=1,2,..., m. (3.19) That is the left limits of SI") and SI") agree at every point of jump of 5"” . To obtain the strong consistency of a sequence of self-consistent estimators S"), we need to assign some condition on J, which is the points of jump of 5‘"). Define 24 A,,= { a): 3N,suchthatforanyn>N,J,,(co)n(t-e,t+e)¢ E}. Condition C: P(A, , ) = l for any t> 0 and every 8 > 0. The above condition also means that almost surely for any I > O and a > 0, there exists N = N(t, e) such that for any n > N, J,(co)n( t - e , t + 8 )¢ E. Namely, almost surely, in any neighborhood of t, we can always find points of jump of 8‘") for n large enough. Theorem 3.2 Under Condition C, S“) uniformly converges to S almost surely. That is P( lim 1301p) lS‘")(t)—S(t)l = 0 ) = 1. Since {S(")(t)}and {SW (t)} are series of survival functions, by Helly's theorem, there exists a sub-sequence {Sm (t), S?" (t)} and sub-survival functions S°(t) and S,’ (t) , such that 5"” —> So at continuous points of 5°, and SI”) —) S3 at continuous points of $3. With probability 1, W,W(t) —> W,(t), W,(")(t) —> W2 (t), and SIM) —> S,(t) uniformly for t e[0, 00), since W, , W2 and S, are continuous. Hence without loss of generality, we may assume uniform convergence on the whole space (2. Further since we will show every sub-sequence has the same limit, we will also assume {n’} = {n}. To prove the theorem, we need a few lemmas. Lemma 3.1 S: (t) is continuous on [0, 00). Proof: From (3.8), S§"’(t) = 1+ J: WMMKSLI E [0, 00)- s For continuity points of S3 (t) 0S s, 00 ”(W2 (52) - Wz(s1» . '(Sr (SD—Sr (31)) 5 ”3(32) (3.21) Hence S,0 is continuous on [0, 00) by the continuity of W,. So, SI") (t) —-> S,’ (t) uniformly for ta [0, co). Lemma 3.2 W,(t) = -I°° s°(s)ds;’(s). (3.22) Proof: Let t be a continuity point of So, by (3.4), We) = - I f S‘"’(s)dsi"’(s), = —I:d[s“’(s)si"’(s)1+Ifisitts-Msws) . = S‘"’ (1)5)” (t) + f 5;") (s-)dS‘") (s) . Lemma 3.1 implies SI") —> S,’ uniformly on [0, 00). Hence "5‘"’—"*'i+s°(t)S3(r> + I":S;’(s)ds°(s>, = —I”s°(s)ds3(s). Since the continuity points of S0 are dense in [0, co) and W2 is continuous, therefore W20) = - I °° S°(s>d83(s). new, no) . 26 Lemma 3.3 S3") converges to S, that is S3 2 S, . Proof: For te[0, 00), let s, e J, such that s, -t| = min{|t, —t ;t, 6 J,}. Then under Condition C, s, —-) t . Remember S3")(s, -) = S3"’(s, -), so Si") (t) " Sy(t)=[51(r") (l) ' Si") (S,. ')l + [Sim (Sn ") - Sy(S,. ‘)I + [Sr (Sn —) " Sy(t)] , 55+5+5 ,, —) 0 , by S3") —) S3 uniformly and the continuity of S3 . 1,, —-> 0 , by S3") -) S, uniformly on [0, oo) 1,, —> O , by the continuity of S, . So, S3")(t) —> S,(t) for te [0, so). Hence S3 5 S, on [0, 00), since both S3 and S, are continuous. Proof of theorem 3.2: By (3.2), Lemma 3.2 and Lemma 3.3, we have W,(z)=—I,°fs(s)ds,(s) and W,(t) = — I °° S°(s)dS,(s), te[0, 00). That is I”(S°(s)—S(s))d S,(s) = o, te[0, 00). So 5" = S as. [I]. But S ° is right continuous and S is continuous, so S 0 a S on [0, 00). That is S‘") -—) S , the continuity of S makes the convergence uniform. Therefore . (n) — = : P( 11m .ZEEJS (t) S(t)| 0) 1. "—)w Groeneboom (1992) studied the maximum likelihood estimator (MLE) of S, which he called the nonparametric MLE (NPMLE). He shows that the NPMLE is 27 strongly consistent. We prove here that if S 0" is the NPMLE, then S") must satisfies Condition C and therefore strongly consistent. This is stated below. Theorem 3.3 If S ‘"’ is the MLE of S, then S W satisfies Condition C. Proof: See Appendix. 3.4 Weak Convergence In this section, we discuss the weak convergence of self-consistent estimators. Now assume Tand Y are defined on [0, l] (or defined on [0, M] with M > 0). Define ' ut’m = 43 (We) — S(t)). u§"’(r> = 42 (SW) — S,(t». q:"’ = Jim/1"”(t) - W1(t)). q§"’(t) = JRWZWI) - W20», We) = J3 (§§”’(t) —S.(t,—) so u,">(1,—=) R‘")(t,—)= I:_dR("’(s).Hence I," :1::3——:d5; ,(s s)= I "' §§:)——(—(‘,)— I " dR Rs‘"’( ) (3.26) Applying the same technique we used to obtain (3.18) we may remove S(")(s) in the denominator, I;"u,<"’(s)ds,(s) = I:—[dq§"’(s) — S‘”’(s)dR(")(s)], k = 1, 2, - . -, m. (3.27) The following lemma will be used to find the asymptotic variance. Lemma 3.4 For any bounded measurable functions h, and h2 on [0, 1], the variance of the following integral is Var { I; h.(S)dql"’(S) + I modqr’tsn = —I,; h2(s)th(s) — I; (towns) — {I31 (s)dW.(s> + Igntswwsnz, 0 s t s 1. Proof: Since{Y,, 5,} are i.i.d., and by the definition of q,"" , qé") , W3") and W21") (t) , we have Var { I modqt’ts) + I meagre): = Var {JEIJIttsWWm + «HI; mansion. = Va, {_‘/1=:{h,(1;)[5, = 1,1; St]+h,(K)[8, = 0,11 S tl}}, "1.1 = Var {h,(I;)[5, =1,1; st]+h,(1:)[8,. =0,K Sill, = E {h,();)[8, =1,); st]+h,(r;)[5,=0,11 St]? - {E MODE). =L1§Stl+hz(1:)[51=0,XSII}}2, def. =L-A- 29 The cross-product term of I, is 0 when expanding, so I. = 80120916. = 1,1: 911+ Eihfms. =1,): s t1}, = -I,' momm— Igtétstzo). By the same principle, 1,: {I0}, (s)dW,(s) + Io'h, (s)dW2 (s)}2. Theorem 3.4 Let S‘") be the self-consistent estimators of S satisfying Condition C, then (i) ‘51-]: (S m (s) — S (S))dS, (s) converges weakly to a normal distribution with mean 0 and variance -I01S(s)(1— S (s))dS, (s). Further, if Y has positive continuous density g, then (ii) x/hI;(S("’(s)—S(s))ds converges weakly to a normal distribution with mean 0 and variance I0] S(S)fg1(;;9(3)) ds. Proof: (i) Taking t,=l in (3.27) yields Jit- I: (S 0') (s) — S(s))dS, (s) = I: [dq§") (s) — S”) (s)dR('" (s)], which converges weakly to a normal distribution since 5"” is strongly consistent. The variance is given by the asymptotic variance of I: [dq§") (s) — S(s)dR("’ (s)] = —I,:S(s)dq,‘"’(s) + I010 — S(s))dq§"’ (s), which is obtained by using Lemma 3.4 as ‘ (n) ' (n) Var {—I, S(s)dq. (s>+I, (1— S(s))dq. on I 2 l 2 l l 2 = —I, S (s)dW.(s) — I0 (1— S(s)) dmts>+{-IOS. 30 (ii) Define the function g, on [0, 1] by g,(s) = g(t,_,) for s e [t,,, , t,) , k = 1, 2, - - ~, m. Then almost surely —1— — -1— —-) O uniformly on [0, 1] since g is continuous. So 8,. 1 l l m d =— I") —dS, , Iou. (s) s l,“ mg“) (s) =-l'u.‘"’(s) 1 dS,(s)+0,,(1), 0 g..(S) = - m 31(S)I[r 1 ,uf")(s)dS,(s)+ 010(1)’ k=l n """ =-z k=l gn (S) l. .,ldq§"’(s)-S‘"’(S)dR‘")(S)1+or“), = _I‘;[dq;">(s) _ S("’(s)dR(")(s)]+ 0,,(1), ° g..(s) = ‘l ——-1dq;"’(s)- S(s>dR‘"’(s>1+ o (I) The desired result follows since -IOg—(1—S)[—dq§")(s) S(s)dR(")(s)] converges weakly to a normal distribution. The asymptotic variance is obtained by the same method as in (i). Groeneboom (1992) proved the weak convergence of MLE under certain conditions. The following theorems are proved by using Groeneboom's Lemma 5.4 (restated here as Lemma 3.5). Assume T and Y have bounded densities f and g respectively, satisfying fit) 2 or > 0, g(t) 2 or > 0, for t e [0, 1] and g has a bounded derivative on [0, l]. 31 Lemma 3.5 ( Groeneboom) Let or, = M n'm, where M > 0 is a constant. If S") is the MLE, then for any t0 6 [0, 1]. (i) The probability that S‘") does not have a jump in an interval of the form [t,,-or,, to+or,] can be made arbitrarily small. (ii) sup Is“) (1) —S(t0)l = 0,02%). Idle-(1,, .10 441,, The following theorem presents the relationship among S3"), S3"), and S,. Theorem 3.5 If 15"" and S3") are solution of (3.3) and (3.4), with 5‘") being the MLE, then (i) JEt such that S‘"’(s) = S"’(t,) for se [t, t,] and S3") (t,) = S3")(t,). Then t,- t = 0,(n"’3) by (i) of Lemma 3.5. Define P,(t,y) = £2.17; S t,)? S y], which is the empirical ofthe pairs (T,, Y,), and hi P (try) = P( T S I, Y Sy) = (1-S(t))(1-Sr0’))- n 1 l n n 1 n l 1 Then s; 1(1) = "WWW; ’(s). W.‘ ’0) =;§1T. > 11.1: >t1= I,I,,1s>y1d12.(s,y), ..n 1 " 1 1 and Si ’(t)=;2111 >tl= I,I,,dP.(s,y>. So i=1 1 S 3" (s) 1 1 (n) _ (I!) _ ('0 —______ s, (t,) S, (t)-I(] atW2 (s)— Swan)IOIW[s>y]dP,(s,y),and .. ~ I Si"’y] Si ’(r) -5; ’(t) = —I, 1,...{1- S,., (tn)}d1’..(s,y). = @1753 1..., {is > y] -S""(t..)}dR.(S,y)r =IS<»>(t, )Io It”, {[5 > y]— S‘"’(t.. )}dP(Ss y) 1 1 .. .»§,—,,—,(t—n)I0 Im {[s > y]—S( )(t,)}d(P,, — P)(S.y), def. = Iln+12n ' 1 I... = “WI...,1{S(S)‘ s‘"’(t.)}dS.(s). =S.,..( ——,{S‘"’< .)I,, as .s—() I S(s)dS.(s>} {(5330, )- S(9))(Sy (t)- S, (1.. D} (wherCGEUJ ,.l). =SW ) = 0,01%), since S‘"’(t,)-S(0) =o,(n‘V3) by (ii) of Lemma 3.5, and S,(t)—S,(t,) =o,(n“%). = o,( n“ ) by P,- P = o,( 11"” ) and t,-t = o,( n"’3 ). So s/Ztsi"’(t) - Site»: 0.0). (ii) follows from (i) since we know Jh(S3”(t) —S,(t)) converges weakly to a Gaussian Under these conditions, we can use Lemma 3.5 to extend the result of Theorem 3.4 to integrals over [0, t], for any t 6(0, 1). Theorem 3.6 Let 5"" be the MLE of S, then 33 (i) J; I; (S ("’(s)—S(s))dS, (s) converges weakly to a normal distribution with mean 0 and variance —I:S(s)(l — S(s))dS, (s). (ii) JZI; (S"')(s)-S(s))ds converges weakly to a normal distribution with mean 0 and variance I: S(s)(gl(-;;S’(s)) ds. Proof: We will show for any t e(0,1) I. uf"’(s)dS.(s)-I,1dq§"’(s)- S(s1dR‘"’(s)1—”>o. (328) since I: [dq§"’ (s) — S (s)dR(") (s)] converges weakly to a Gaussian process. By (i) of Lemma 3.5, fort e(0,l), there exists a jump point t,>t, such that t,- t = 0,(n"’3). Then by (3.27) and using the same principle as in the proof of Theorem 3.5, I, u:"’(s)dS.(s) — I,"u.‘"’(s)dS.(s) = —~/;I(,M(S(")(s) — S(s))dS,(s) = 0,0), and III-”41") (S) ‘ 5"" @de) (S)) - I0 [dq§"’ (s) — 5‘") (s)dR""(s)] = o, (1). (3.28) follows from the strong consistency of 5"". Hence JhI: (S(")(S)—S(s))dS',(s) converges weakly to a normal distribution. Having proved weak convergence, the variance and (ii) can be obtained in the same way as in Theorem 3.4. Chapter 4 BAYESIAN ESTIMATION 4.1 Introduction The problem of nonparametrically estimating a survival fimction from censored data has been addressed by a number of authors. For right censored data, Susarla & Van Ryzin (1976 ) gave a nonparametric Bayes solution to the estimator under squared error loss using Dirichlet process prior. The resulting Bayes estimator was shown to reduce to the PL estimator in the cases where or(R+) —) 0, where a( - ) is a finite measure on R+ which serves as a parameter of the Dirichlet process prior. In this chapter, with right censored data, we find another expression for the Bayes estimator, which is related to the self-consistency equation. This approach sheds more light on the connection between the Bayes estimator and the Kaplan-Meier's PL estimator (also the MLE). For interval censored data, we also present a Bayes estimator, but we find that in this case the Bayes estimator may not reduce to the MLE as a(R+) —> 0. Let (R+, d", i0 ) be the probability space, where R+ = (0, co), and fl is the Borel o-field on (0, 00). P is a random measure. We say Pe Wm) to mean P is a Dirichlet process on ( R+, d’ ) with parameter or. See Ferguson(1973) for the definition of Dirichlet process. J’( P(A)) = MAR) , forAe Ki T, , - - -, T, is called a sample of size n from P, if a flu, eC,,~-,T, EC, | P(A,),-~,P(A,,);P(C,),-~,P(C,)} = 1‘]P(C,.), a.s. i=1 We assume T, T, are the lifetimes with survival S(t), and further that they are a Sarnple from P, S(t) = P( t ,oo ). Pe .fla). The Bayes estimator is always under the 34 35 squared error loss L(S,S) = I: (S(u)—S(u))2dw(u), where w is a weight function and S(u) is a estimator of S(u). Ferguson (1973) proved: (i) The conditional distribution of P given T, ,-~,T, is a Dirichlet process with parameter B=a+:8, i=1 (ii) Let S(t)= P(t, 00), then, given T, ,- ",,,T the Bayes estimator of S(t) is Sh(t)= 3(S(t)| 71"",72). _ 0t(t, co) ot(R)+n +1ot(R)+nZ[’>] _ a(t,oo) + n "o(R)+n a(R)+n S, (t). (4.1) whereS, (t) = 1 2H; > t] is the empirical of S(t), and [T,>t] is the indicator of (T,>t) . n 1 4.2 Right Censoring We now consider right censored data, where T ,- « -, T, are the lifetimes and Y,,- . -,Y, are the censoring times. We observe {(Z,, 8,) ; i= 1,-~-, n }, with Z, = T, A Y,, 8, = {T, SY,], i= 1, - - -, n. Without loss of generality, assume8, = = 8, = l, 8,,, = = 8,= O and that Z, ,--°,Z, are all distinct. Let N , , (t) = Z[Z, 2 t,8, = 1] be the number of uncensored observations in [t, 00), and i=1 36 N, (t) = Z[Z, 2 t,8, = 0] be the number of censored observations in [t, oo). (’21 Let N (t) = NU (t) + NC (t), N " (t) = N (t+), then N(t) is right continuous counting process. Actually N(t) is the number at risk at time t. To find the Bayes estimator under such right censored data, Susarla & Van Ryzin use the following lemma and theorem. Lemma 4.1 Given ( Z, ,1), - - - (Z,, 1), the conditional distribution of P is a Dirichlet process k with parameter [5, = or + 28,, . i=l Theorem 4.1 (Susarla & Van Ryzin) Consider T,,, , - - -, T, as a sample of size n-kfi'om J45, ), which is obtained by Lemma 4.1. Then under condition (Z,,, , 0), - - -, ( Z,, 0), the Bayes estimator of survival function S(t) is S(t) = a(t,oo)+N+(t)I'-'I{ “(Ztaw)+N(Zi) (4.2) or(R)+n ,,, or(Z,,oo)+N+(Z,) Remark: From (4.2) , by taking a(R+) —> O, Susarla and Van Ryzin showed that the limit of S (t) is the Kaplan and Meier's PL estimator [2, s:.5,=l] (1 NZ, S..(r>= II{1-(———N,3,”} 37 The following theorem enable us to see more about the relation between the Bayes estimator and the Kaplan & Meier's PL estimator. The proof of lemma 4.2 is in Appendix. Lemma 4.2 Let S(t) = o"{S(t)| (Z,,8,),---,(Z,,8,)}. Then ”{7} >1 |(Z.,5.),---,(Z..,5..)} =[Z,>tl.if8,-=1; = S(th,) $12,) ,if8,=0;j=1,-~,n Theorem 4.2 Given {(Z,, 8,) ; i= 1,---, n }, S (t), the Bayes estimator of S(t) is the solution of the following equation S(t) = who, oo)+ N*(t)+ I %d(—N NC(t))}. (4.3) Proof: By (4.1) and Lemma 4.2, S(t) = J’{S(t)| (Z..6.>.-~,(z,..6.)}, = fl J’(S(t)IT..---.T.)1 (Z,,5,),...,(Z,,5,)}, a(t, 00) 0‘(R )+n+a(R*)+n g” >t]|(Zl’51)” (321.95,» _ a(t, co) a(R )+n 0t(R 1)+n,, 2 MT. >t| (2.,6.)-- (2,6.1), ———{a(t, oo)+Z[T >r]+ Z W(T >t| (Z,,8,,-) ,(Z,,8,))}, =a(R+)+ j= -k+l 38 S(th,) =a(R 1")+ ——{a(t, oo)+N,, U(t+)+j="2k;rl— S(Z) ——}, _ 1 00 S(_(t_d)( -——a,,.)+ {att )+N..(t+)+N.)}, 0.. . S(t) _ =a,, 1,) ———,.{a(t )+N (t)+I'S —d( Neon}. Remark: SinceSPL (t) is the solution of equation S.. = -‘-{N* 0 ’ where B(y, n)- — M ,and F(a+1)= aF(a) for a > O. (4.7) IW11) Even in the simplest cases, the expression for I1 and 12 can be quite complicated. For example, if all 8's are 0, then I"(CLOTH r‘(Ot 1+1)1"(a,.)”1"(0tn+1+"‘a2 +n)F(ai) 2 _r(a1)”r(an+l) r(an+1+an+1) l-‘(ari1~1"""'+’a'l4'") I"(0L(R+)) = r(a(R+)+n)a((n),°0)(<1(}f,._1),°°)+1)“(0“ (1)’°°)+"_1)’ 41 _ P(a(R*)) 1“(a(R*)+n) 0(Xn,a°°)(a(}fn_n ,oo) + N+ (Kn—1))) ' ' (“(111),”) + N+ (Yb) )) - If there exists at least one 8 equal to 1, then I2 is a summation of the forms above. Such as, suppose 8,“) = 1, then = P(a(R*)) 2 r(a(R+)+n—1) a(Kn)’w)(a(Kn-2) 9°°) + N+()(n—2)) "1) ”'(aaimw) + N+(Y(1))"1) _ l“(OIUVD I“(a(R*) +n) “(Kn)a°°)(a(Kn—1) ,oo) + N+(}(n—l))) ' ”((1021) ’00) + N+(Y(1)))° In each case, I, will have a similarly complicated expression. Theorem 4.3 tells us a way to calculate the Bayes estimator S(t), but it is too tedious to find an explicit expression. The following example show that given a Dirichlet process prior (1, with nonnull probability, there exist Bayes estimator S(t), which does not converge to the MLE as a(R+) —) 0. Example 4.1 Let y,, y,, y3 and y4 be fixed real numbers with O < y, < y2 < y3 < y,. Consider a Dirichlet prior such that oc[0, y,) = a[y,, y2) = (lb/2, y3) = only,” y4) =aDI4, co) = x. a(R+) = 5x, where O S x S 00. Let Y be a discrete random variable taking values on y,, y,, y3 and y,,. Let T be the lifetime with Dirichlet prior or. Suppose the intervals observed are (y,, 00), (y,, 00), (0, y,,], and (y,, 00), n = 4. The probability of getting such observations is P( Yl = y,, T, > Y,; Y2 = y,, T2 > Y2; Y3 = y,, T3 S y,; Y4 = y,, T4 > Y,, ), which is positive. Then the Bayes estimator S, when t = y,, is Sg(t): 5? S(t) IXIEO’v 0°): X2605, ‘30), X3€(0,y3], X46041, ‘30)}, 42 £(P(t,oo)P(0,y3]l'I P(ynw) w. 1 = iat3 _ i1'IP(y,-,oo»- fH P(ypoo», and #3 i 12= J’(1‘[P(y.,oo))- J’(1’[P(y.,oo)). Then, each terms in I, and I2 can be calculated by integration as in Susarla & Van Ryzin ( 1976). Denote a = a(R+). = Na) _ Na) 1, P(a+4)x(2x+l)(3x+2)(4x+3) ma”) x(2x+1)(2x+2)(3x+3)(4x+4), = Na) _ Na) 2 F(a + 3) x(3x+1)(4x+2) F(a + 4)x(2.x+l)(3x+2)(4x+3), I = (x +1)(3x + 2)(4x + 3)(5x + 4) — (x + 1)(2x + 2)(3x + 3)(4x + 4) S°’ 5") = I (3x+1)(4x+2>(5x+3)(5x+4)—(2x+1)(3x+2)<4x+3><5x+4>’ lingSU) = 13/22. On the other hand, the MLE is S"’(y,) = S"’( y,) = 1, Smog) = S")(y,) = 1/2. As x —) 0, so that a( R+ ) —-) O, S(t) does not converge to 1/2. Hence, S does not converge to the MLE asa(R+)—>O. Chapter 5 SIMULATIONS In this chapter, we investigate the performance of our estimation procedures through simulation studies from known distributions. a. Parametric assumptions We just consider the 2-inspection interval censorship model described in Chapter 2. The two inspection times are W,, W2 ( W, < W2 ) are generated through independent exponential rv's V,, V2 with mean 1; That is, W,=V,, W,=V,+V2. The survival time T of interest is also assumed exponential (mean 1/9) and independent of (V,, V2 ). The stopping number N(t) = min{ k >0: TSW,_W,. The distribution of N is then P[N=1] = 29—3—1, P[N=2] = (Ti—:11?" P[N= 00] = (9:1), 43 44 Thus 69—1 is the proportion of left censoring, 799—; the proportion of interval + +1) censoring and (Tl? the proportion of right censoring. The objective of our simulations + is to compare the nonparametric estimator of the survival distribution S(t) = e“, t >0 of T, based on the Turnbull scheme, with that obtained through maximum likelihood estimation of the parameter 6. The data are generated from n independent triples ( V, ,., V2,, T,),lSiSn. b. Maximum likelihood estimation of 9 The information on T may be written in the form T e (L, R], where L=O, R= V, when N=1; and L= V,, R= V, +V2 when N=2; and L= V, +V2, R= +00 when N= 00; The SAS LIFEREG procedure is used to estimate 9 form the data { T, e(L,,., R,): ISiSn}. c. Turnbull method for estimation of S(t) From the data { T, e(L,, R,]: ISiSn}, Tumle (1976) provided an algorithm for the (nonparametric) estimation of S(t). He constructed a set of disjoint intervals {(q,, p,]: ISjSm}, where qj's and pj's lie in the sets {L,.: ISiSn} and {R,: 1_<_iSn}. The likelihood is proportional to Lm= flown-Sm»). (5.1) Define the vector of probability 3 = (s,, - - ., s”) by s, = S(qj) - S(pj) . The problem of maximizing (5.1) reduces to one of maximizing £6) = N(Zaysj), (5.2) i=1 j=1 where a”. = 1 if (qj, pj] g (L,, R,], 0 otherwise. 45 For ISiSn, 15j.<_m, let 1,, = 1 if T,e(q,, p,] and 0 otherwise. Because of the censoring the value of 1,, may not be known, however its expectation is given by 1511.1 = a. s,-/ 2,1,5, = 11., (s). (5.3) k=1 If we treated (5.3) as observed rather than expected frequencies, the proportion of observations in intervals is (:u,(s))/ n = 11,. We say that the vector of probability s is i=1 self-consistent if s, = 1t,(S,, . . ., s") (ISjSm). (5.4) A self-consistent estimator of s is defined to be any solution of the simultaneous equations (5.4). The form of (5.4) immediately suggests a iterative procedure for finding the solution. (i) Obtain initial estimates s3 (13me). This can be any set of positive numbers summing to unit. (ii) Evaluated 11,, ( s°) for lSj_<_m. (iii) Obtain improved estimates s} by setting s}. = n,( s°) for IgSm. (iv) Return to step (ii) with s1 replacing s°, etc. (v) Stop when the required accuracy has been achieved. d. Results 1. Our first simulation study involves 10 replications of size n = 100 from each of the exponential distributions with 9= .5, 1, 2, 3. The distribution of N is denoted by p = e 9 1 9+1’ (9+1)2’ (9+1) ( 2 ) and [3 denotes the observed proportion of left, interval, and right censorship, averaged over the 10 replications. Table 5.1 summarizes these results 46 for the four distribution that we generated. Even though the number of replication is small, the degree of closeness of p to fl is very satisfactory. 2. For each distribution, maximum likelihood estimate 6 is obtained from the SAS LIFEREG procedure for each replication. In Table 5.2 we input the average of these estimates over the 10 replications. 3. For the nonparametric maximum likelihood estimation of S(t) we use the Turnbull self- consistency algorithm. APL program were written to efficiently carry out the task of solving the self-consistency equations. For each replication, the Turnbull estimator S (t) is close to a step function, with gaps, in the sense that it is undefined on some intervals. We extended the definition of S to these intervals by extending consecutively the steps from the lefi of the curve to the next jump in S. These step functions were then averaged over the 10 replications to obtain an estimator of S(t). The figures illustrate the Turnbull estimator based on a single replication, and that obtained by averaging across the 10 replications. The true survival curve is S(t) = e'z’; t>0. 0:0.5 0=1 0: =3 Censoring p i5 p i5 p 16 p 13 Left(N=1) .333 .317 .500 .510 .667 .655 .750 .744 Interval (N=2) .222 .220 .250 .230 .222 .224 .188 .191 Right (N=oo) .444 .463 .250 .260 .111 .121 .062 .065 Table 5.1 47 9 (3 .5 .4969 1 .9978 2 1.9678 3 2.9299 Table 5.2 TU RN BU LL ESTIMATION Based on one sample of size 100 1.’L 1.1 .‘—— .8‘. \\\‘_ .6" 4:"- 41- \\.- 21- \_._,~\ _ \ ° Turnbull 0.0- -.----.-..— ° True - I I 1 U I I survival -.5 0.0 .5 1.0 1.5 2.0 2.5 3.0 Figure 5.3 48 TURNBULL ESTIMATION Based on 10 samples of size 100 1.0- .8-- .6" .4" .20 __ ° Turnbull 0.0“- W ' ' ' ' ° True - , , , 1 , Survival 5 0.0 5 1.0 1.5 2.0 2.5 3.0 Figure 5.4 APPENDIX Proof of theorem 3.3: Let F”) = 1 - S"), then F") is the MLE of F. Let Y,,) < - - . < Y,,, be the ranked Y,'s, and let 8,,,be the 8 corresponding to Y,,). Groeneboom (1992) showed that the value of F") at Y,,, is the lefi derivative of H“ at i, where H“ is the convex minorant of the points 0,280,) on [0, n]. Call apointte{ Y, : i= 1, 2, - - -, n } avertex ofthe convex minorant jSi if it is a point such that F"’(t) < F‘"’(Y,) for any Y, > t, that is H“ changes its slope at k if r = Y,,). We know F") -) F0 on the continuous points of F’. If we could show F 0 is strictly increasing, then Condition C will be satisfied. Suppose F0 is not strictly increasing, there exist 3, and s,, s, < s,, such that 1306,) = F°(s2). For our convenience, assume F0 is continuous at s, and s2, F°(s)< F°(s,) for any 8 < s,, and F°(s) > F°(s,) for any 3 > S,. If P” is not continuous at s, and/or s,, we will consider ,S,+8 and/or S2-8 for small a, and the proofis similar. Lets,,,=max { 15s, :1 isavertex ofF‘") } ands2n=min { r 282:1isa vertex of F“), then s,,, —> s, and s,,, —) s2 , since 1“") —-) F0. Therefore, since H" is convex minorant, we have the number of 8,'s being 1 in (SM’SZ"] F0081) S ' the number of 8,'s 1n (Slnisznl s F<">(s,). (3.23) As it —>00, 1“") (3,) —) F0 (S1), F"’(s2) -—) F0 (S) = F” (3,). The numerator of (3.23) becomes —l—(the number of 8,'s being 1 in (s,,,s2,]) = “1'2”; 3 KY e(s,,,,s,,]], n n i=1 = I: 61W") U). n—no ‘1 ———> I, F(y)dFy(y), 49 50 since WI") —) W, and W, (t) = fFKQdF} (s) by (3.1). The denominator of (3.23) becomes 1(the number of6.'s in cams...» = 121): 60.88.11, n i=1 $4 I: dF,( y). So, as n —900, we have I: F (y)dFy(y) 0 1 32 = F (30° (3.24) I; dF Y 0’) By Mean Value Theorem of integral, there exists 0 e (s, , 32) such that F (0) = F0 (s, ). But H" is convex minorant, so the number of 8,'s being 1 in (s,,, ,0) . 2 F(n)(sl)- the number of 8,'s 1n (SING) Following the same argument as obtaining (3,24) as n —->00, I6 F (y)dFy (y) I: dFy (y) By the Mean Value Theorem of integral again, there exists 0, e (s,, 0) such that F(0,) _>_ 2 F°(s,) = 17(0). P(s,) = F (0), which is contrary to the fact that F is strictly increasing. Therefore F0 has to be strictly increasing. Proof of Lemma 4.2: The lemma is trivially true when 8,. =1. We now assume all 8's are 0, otherwise we could apply lemma 4.1 so that the new prior would become a Dirichlet process with parameter [3,. We want to prove 51 S(thj) MT,->t|(ZnO)a""(Z"’O))= 8(2) ’ or equivalently o"(P(th,-.°°)H P(Z..°o)) 6”(P(th,,oo)H P(Z,,00)) J’(P(Z,,00)HP(Z,,00)) ’ £(P(z,,oe)n P(Z,,00)) iatj (4.0) Denote the LHS of (4.0) by 1’}, denote the RHS of (4.0) by %. k+l lall a2, 0, 1 bl rsz ak+2 aln “n+1 I t 0 Z 22 HZ]. Z, ,,, Z" Figure4.1 Without loss of generality, we assume Z, < . - - < Z,, . (4.0) is obviously true if t 5 Z,, so let I e (Z, , Z,,,], and let B, = (0, Z, ], B, = (Z, , Z, ] ,- ~ -, Bn= (Z,,, Z,,], BM= (Z,, , 00), and 01, = 01( B, ), 01, = 01( B, ), - - ., 01,, = 01( B"), 01",, = a(B,,,,). Let b, = 01( Z,, t] and b, = 010, Z,,, ] , then b, + b, =01,” (see Figure 4.1). Use ‘I’Jfl'I (c—x)""'dx = CWHBWm) for 0 S c $1 and 7,11 >0, where B(y,n) = W, and F(a+ 1) = 01F(a) for 01 > 0. Then, we have Y 11 = mam» r(a(R+) +n) aml( am] T art-+1). ' ' ( “MT ' ' ' + ak+2+ "+1'(k+2)) (a,.1+ - - - + 01m+ b,+n+1-(k+2)+1 ) ( 01M+ . - - + (1141+ n+1-(k+1)+1 )... (01",,+ - - - + a,,,+n+1-(i+2)+1)(01,,,,+---+ 011+ n+1-j)-- -(01,,,,+- - -+ 01,+n-1), 52 = 1"(Ot(R*)) I‘(01(R+ ) +n+ l) an+l( (1",, + 01,,+1) ' ° ' (am1+ ' ' ' + 0942+ n+1'(k+2)) (01",,+ - - - + 01,,,+ b, +n+1-(k+2)+1 )( 01",,+ - - - + am+ b, +n+1-(k+1)+1 ). . . (01",,+ . - - + afi,+n+l-(i+2)+l )( 61M+ - - - + 0941+ n+1-(,'+1) +1) (amr+'°'+0lj+rrt1-j+l) ---(01n+,+o--+ or,+n), z 1"(<1(R*)) r(a(R+) +n) an+l( “n+1 + anTI) ' ' ' (am1+ ' ' ' + ak+l+ n+1'(k+1) ) ' ‘ ’ (“n+1+ ' ° ' + aj+2+n+l-(i+2) ) ( am1+ ° ' ° + aj+l+ ”TI'UTI)) (am1+"'+aj+”+1'j) '°'(an+|+"'+ “24774): and = P(a(R*)) F(01(R*) + n + 1) 01",,( 01",, + an+1) . -- (01",,+ - . - + 0101+ n+1-(k+1) ) . .. (a...+ - - - + a,..+n+1-o'+2) ) ( a...+ - ~ - + a...+ n+1-+1v+ 0 J [1(a(zi ,oo) + N(Zr)) 1:1 and B. F(a(R+)+n+1) D l"(a(R+)+n) 53 (01",,+---+01, +n+l—j)---(01,,,, +---+01, +n—1) _(a +---+01,,, +n+1—(j+1)+1)(01 +---+01, +n+1—j+1)---(01,,, +---+a, +n)’ n+1 n+1 fi