This is to certify that the dissertation entitled "Regression Models for Analysis of Medical Costs" presented by Elena Polverejan has been accepted towards fulfillment of the requirements for the Ph.D. degree in Statistics.

Major professor: Joseph Gardiner
Date: July 31, 2001

REGRESSION MODELS FOR ANALYSIS OF MEDICAL COSTS

By

Elena Polverejan

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

2001

ABSTRACT

The rising cost of health care and the need for evaluating costs of new medical interventions have led to interest in developing methods for medical cost analysis. Hospital costs constitute a significant proportion of overall expenditures in health care. Knowing the correlates of in-hospital length of stay (LOS) and cost is important for decisions on allocating resources.

Increasing availability of patient-specific LOS and cost permits analysis of these variables jointly, accounting for their likely correlation. A bivariate model is used to assess the impact of covariates on these outcomes.
Under marginal specification through parametric or Cox regression models for LOS and cost, standard errors of estimates of regression coefficients are obtained using a robust covariance matrix to account for correlation between LOS and cost that is otherwise left unspecified. In another model, we use a conditional approach to estimate mean costs as a function of patient hospital stay and adjust for the influence of patient characteristics on LOS and cost. The mean cost over a specified duration is a weighted average of the expected cumulative cost, with weighting determined by the distribution of LOS.

We extend this model to address costs and resource utilization in longitudinal studies when patient histories evolve through several health states. In these studies costs are incurred in random amounts at random times as patients transit through different health states. We describe the evolution of a patient's health history by a continuous time Markov process with finite state space. Dependence of the transition intensities on patient characteristics is modeled through semiparametric regression models. Two types of expenditures are incurred, one at transitions between health states and the other for sojourns in a health state. Over a fixed follow-up period, we consider net present values of all costs incurred in this period for the two types of expenditures. Conditional on the initial state and a specified covariate vector, we obtain consistent estimates of the expected net present values and derive their asymptotic distributions.

Our methods provide flexible approaches to estimating medical costs while controlling for the effects of covariates. In addition, for economic evaluation studies of competing medical interventions, our methods can be applied to estimate summary statistics and cost-effectiveness ratios.
The beauty of Science is that one can say something (useful) without having to say everything.

To my husband and parents, for all of their love and support!

ACKNOWLEDGMENTS

I would like to thank my advisor, Dr. Joseph C. Gardiner, for all of his patient guidance and support during my graduate studies. The freedom he has given me in my research to investigate different areas is greatly appreciated, and I am also thankful for his advice on issues unrelated to statistics. I also want to express my thanks to the other members of my committee for their friendly support. My research was supported by the Agency for Healthcare Research & Quality, under Grant 1R01HS09514.

To my family, I extend my love and thanks for being so supportive. My parents, Adrian and Valentina Feleaga, and my sister Anamaria have continually offered me encouragement. My deepest gratitude goes to my husband Mihai, for all the love and understanding he has given me over the years.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABBREVIATIONS
Introduction
Chapter 1 A BIVARIATE MODEL FOR HOSPITAL LENGTH OF STAY AND COST
1.1 Semiparametric Marginal Models
1.1.1 Estimation of the Regression Parameters and Integrated Baseline Hazards
1.1.2 Large Sample Properties of the Estimators of Regression Parameters and Integrated Baseline Hazards
1.1.3 Large Sample Properties of the Estimators of Survival Functions
1.1.4 Point Estimates and Confidence Intervals for Median LOS and Median Cost
1.2 Parametric Marginal Models
1.2.1 Estimation of the Model Parameters
1.2.2 Large Sample Properties of the Parameter Estimators
1.2.3 Point Estimates and Confidence Intervals for Median LOS and Median Cost; Application for the Bivariate Normal Case
1.3 Application
Chapter 2 ESTIMATING HOSPITAL COST OVER A SPECIFIED DURATION
2.1 Model Description
2.2 Application
Chapter 3 ESTIMATING MEDICAL COSTS IN LONGITUDINAL STUDIES
3.1 Model Description
3.1.1 A Markov Model for Describing Patient Health Histories
3.1.2 Incorporating Costs in the Markov Model
3.2 Estimation of the Mean Transition Cost and Mean Sojourn Cost
3.2.1 Estimation of the Regression Parameters and Integrated Baseline Intensities
3.2.2 Estimation of the Transition Probabilities
3.2.3 Estimation of the Mean Transition Cost
3.2.4 Estimation of the Mean Sojourn Rate
3.3 Large Sample Properties of the Mean Cost Estimators
3.3.1 Uniform Consistency of the Mean Cost Estimators
3.3.2 Asymptotic Distribution of the Mean Transition Cost
3.3.3 Asymptotic Distribution of the Mean Sojourn Cost
APPENDIX A EXTENSION OF SLLN ON $D_E([0,1]^2)$
APPENDIX B FUNCTIONAL DELTA METHOD
APPENDIX C RESULTS ON ITO INTEGRATION
REFERENCES

LIST OF TABLES

Table 1.1 Characteristics of Patients
Table 1.2 Length of Stay and Costs by Comorbidity and Discharge Status (Semiparametric Model)
Table 1.3 Length of Stay and Costs by Comorbidity and Discharge Status (Parametric Model)
Table 2.1 Estimates of mean cost at duration times by comorbidity and discharge status

LIST OF FIGURES

Figure 1.1 Distribution of Costs and LOS
Figure 1.2 Estimated LOS survival function and approximate 95% adjusted (solid line) and naive (dashed line) pointwise confidence intervals. Estimates were made for a patient discharged alive, who underwent CATH, with a CCI of 4+, ejection fraction 50+, age 65 at admission and no history of prior CABG
Figure 1.3 Estimated cost survival function and approximate 95% adjusted (solid line) and naive (dashed line) pointwise confidence intervals. Estimates were made for a patient discharged alive, who underwent CATH, with a CCI of 4+, ejection fraction 50+, age 65 at admission and no history of prior CABG
Figure 2.1 Estimated mean cost at duration times by comorbidity (for survivors with age 65 at admission, ejection fraction 50+, no history of prior CABG, who underwent catheterization during their hospital stay)

ABBREVIATIONS

CER - Cost-Effectiveness Ratio
CABG - Coronary Artery Bypass Grafting
CATH - Catheterization
LOS - Length of Stay
DRG - Diagnosis Related Group
AMI - Acute Myocardial Infarction
ICD - Implantable Cardioverter Defibrillator
i.i.d. - Independent Identically Distributed
(symbol) - End of Statement
(symbol) - End of Proof
CLT - Central Limit Theorem
SLLN - Strong Law of Large Numbers

INTRODUCTION

Over the past decade the need to control health care expenditures in an environment of limited budgets has led health care providers and government planners to turn to cost analyses and cost-effectiveness analyses as an aid to decision making in allocation of health care dollars. While the primary goals of clinical studies are centered on patient outcomes, more attention is being paid to collecting economic data alongside traditional clinical investigations of efficacy of interventions such as randomized clinical trials and prospective cohort studies. Discrete patient-level cost and resource use data will become increasingly available. Therefore there is interest in developing rigorous statistical techniques to analyze both cost and health outcomes. In many situations two competing interventions need to be compared on their health benefits and costs.
When an intervention is more effective and more costly than its comparator, the cost-effectiveness ratio (CER) is defined as the ratio of the incremental cost relative to the incremental benefit. With benefits measured in their natural units such as years of life saved or number of lives saved and costs measured in dollars, the CER is stated in dollars per unit of effectiveness. When health benefit is measured by gain in life expectancy, the cost-effectiveness ratio is the additional cost of the new intervention to deliver one unit of benefit and is expressed in dollars per life year saved. Assessment and estimation of the CER are an important part of conducting economic evaluations of health care programs.

A goal of our research is to address the specification, estimation and evaluation of statistical methods for costs and to demonstrate their application to cost-effectiveness analysis. In this thesis we focus on developing rigorous models for analysis of cost data. Integrating these cost models with appropriate models for assessing health outcomes is the subject of future research.

Usually costs accrue due to resource use over time. Both the amount of the resource and the time at which it is used vary across individuals. To fully integrate the role of time into analyses of costs we would need the cumulative cost histories as they manifest in each patient. For a treatment or intervention under study, let $C(t)$ denote the accumulating cost over time $t$ in an individual patient. Expenditures terminate at a random event time $T$ that signals the occurrence of some health outcome.
For example, in the treatment of cancer patients following diagnosis, the study endpoint $T$ is the time of death and the cost at the endpoint $C(T)$ is the lifetime cost from diagnosis. In a pharmacological intervention in patients with serum cholesterol elevated above 240 mg/dL, the endpoint $T$ might be the first time the cholesterol level falls below 200 mg/dL, so $C(T)$ is the total treatment cost. For hospital cost studies the endpoint is the length of stay (denoted LOS), and $C(T)$ is the cumulative cost from admission through discharge.

Estimating the distribution of the total cost $C(T)$ and assessing its correlates are among the objectives of cost studies. Correlates (also called covariates) might be demographic factors such as age, gender, race and education, or clinical factors such as severity of the disease. Once the significant correlates for total cost are determined, the next goal might be the estimation of a summary statistic such as the expected total cost $E(C(T) \mid Z)$ or the median cost $m(Z)$ for specified covariate profiles. For example, in a hospital cost study one might want to estimate the expected total hospital cost for a male patient, age 65 at admission, hospitalized for acute myocardial infarction (AMI), undergoing coronary artery bypass surgery (CABG). If there is interest in modeling patient cost histories, one objective might be the estimation of the expected cost $E(C(t) \mid Z)$ over a given fixed duration $t$, for specified covariate profiles. In the hospital cost example, the objective of interest could be the estimation of the expected hospital cost after 9 days of hospitalization for a male patient, age 65 at admission, hospitalized for AMI, undergoing CABG.
Statistical analyses of cost data must address several technical problems.[1] These include right-skewed cost data, a significant proportion of zero observations, right-censored cost data, correlation between time and cost outcomes, and dependent observations when costs are ascertained at multiple time points (for instance in each of several periods during the course of an intervention). Every statistical model for costs has to cope with at least one of these issues.

Skewness and Transformation

The distribution of costs might exhibit a considerable degree of skewness. Then simple methods of analysis based on an assumed normal distribution of the cost variable $C = C(T)$ will not be tenable. A transformation $g$ such as logarithm or square root can be used to mitigate the effects of this skewness. Standard regression analyses may then be applied to the transformed dependent variable $g(C)$, which permit assessment of the effects of patient and intervention characteristics $Z$ that influence the cost distribution. Using an observed sample $\{(C_i, Z_i) : 1 \le i \le n\}$, least-squares estimation of $\beta$ in the model $g(C_i) = Z_i'\beta + \varepsilon_i$ needs only a simple moment structure for the errors $\varepsilon_i$.[2, 3] A retransformation then reproduces the results of these analyses back in their original units of measurement, permitting easy interpretation. For example, we could use the model $\log(C_i) = Z_i'\beta + \varepsilon_i$, where the errors $\varepsilon_i$ have zero mean and variance $\sigma^2$. Then the mean cost at a specified covariate profile $Z_0$ is $E(C \mid Z_0) = \exp(\beta'Z_0)\,E(\exp \varepsilon)$. If $\varepsilon_i \sim N(0, \sigma^2)$, so that the costs are lognormally distributed, the simple closed form for the mean cost, $E(C \mid Z_0) = \exp(\beta'Z_0 + 0.5\sigma^2)$, makes the analyses quite straightforward.
If normality is untenable, one can use a smearing estimate for $E(\exp \varepsilon)$, based on the residuals $\hat\varepsilon_i = \log C_i - \hat\beta'Z_i$.[4] In the absence of covariates, classical maximum likelihood estimation of parameters in lognormal distributions has been used to compare mean and median costs in two independent samples.[5] When the parametric assumptions are suspect, the estimates of the standard error of $\hat\beta$ could be imprecise, leading to invalid inference. When transformations can substantially eliminate the skew in the distribution of $C$ and a parametric distribution can be assumed for $\varepsilon$, the advantage lies in the relative simplicity of the analysis and greater efficiency of estimates compared to a nonparametric approach that leaves the distribution of $\varepsilon$ unspecified.[4]

Two-Part Models

When sampling an eligible population for assessing the costs of medical services during a given period, a large proportion would not have used any services and so would not have incurred any medical costs. In these circumstances, the cost measure $C$ is positive only for users and the zero costs cannot be ignored econometrically. The two-part model assumes that $P(C > 0 \mid Z)$ is governed by a parametric binary probability model like logit or probit (part one) and that $E(g(C) \mid C > 0, Z)$ is a linear function of $Z$ (part two), where $g$ is a transformation applied to moderate the effects of skewness. The objective is to obtain an estimate of the overall mean $E(C \mid Z)$.[6] The first part governs the probability of some expenditure and the second part models the level of the expenditure, given that there is a positive expense. Two-part models are used not only for medical costs, but also for many other outcomes, such as measures of health care utilization (e.g. number of physician visits over a specified period), health care outcomes (claims data) or measures of substance use or abuse (tobacco, alcohol, illicit drugs).
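The two-part calculation can be illustrated numerically. The following is a minimal sketch, not the analysis used in this thesis: it assumes a logit model for part one, ordinary least squares on log cost for part two, and a smearing factor for the retransformation under homoscedastic errors. The simulated data, the function names `fit_logistic` and `two_part_mean`, and all parameter values are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Newton-Raphson fit of a logit model for P(y = 1 | X)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                       # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def two_part_mean(X, cost, x0):
    """Estimate E(C | Z = x0) as P(C > 0 | x0) * E(C | C > 0, x0)."""
    user = cost > 0
    gamma = fit_logistic(X, user.astype(float))    # part one: any expenditure
    log_c = np.log(cost[user])
    beta, *_ = np.linalg.lstsq(X[user], log_c, rcond=None)  # part two: level
    resid = log_c - X[user] @ beta
    smear = np.mean(np.exp(resid))                 # smearing estimate of E(exp eps)
    p_use = 1.0 / (1.0 + np.exp(-x0 @ gamma))
    return p_use * np.exp(x0 @ beta) * smear, p_use, smear

# Hypothetical simulated data: one standardized covariate plus an intercept.
rng = np.random.default_rng(0)
n = 2000
z = rng.uniform(-2.0, 2.0, n)
X = np.column_stack([np.ones(n), z])
used = rng.random(n) < 1.0 / (1.0 + np.exp(-(0.5 + 0.3 * z)))
cost = np.where(used, np.exp(8.0 + 0.2 * z + rng.normal(0.0, 0.8, n)), 0.0)

x0 = np.array([1.0, 0.5])                          # covariate profile of interest
mean_cost, p_use, smear = two_part_mean(X, cost, x0)
print(f"P(C>0)={p_use:.3f}, smearing factor={smear:.3f}, mean cost={mean_cost:.0f}")
```

Because the design matrix includes an intercept, the least-squares residuals average to zero, so the smearing factor is at least one; it plays the role that $\exp(0.5\sigma^2)$ plays under lognormality.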
Right-Censored Cost Observations

Due to incomplete patient follow-up, in many studies the endpoint $T$ and the total cost $C(T)$ for some patients are right-censored. For example, in clinical trials with staggered entry of patients, the signaling event of interest might not have occurred by the close of the trial. If $U$ denotes the follow-up time for a subject, with right-censorship the observable data are restricted to $X = \min(T, U)$, the smaller of $T$ and $U$, the indicator of non-censoring $\delta = [T \le U]$ that denotes whether $T$ (if $\delta = 1$) or $U$ (if $\delta = 0$) was observed, and the covariate vector $Z$. If $T$ is not censored we observe the true cost $C(T)$. If $T$ is censored we observe the cost up to the follow-up time $U$, but we know that $C(T) > C(U)$.

Because of the analogy with censored survival times, censored medical costs have been analyzed by Cox regression and other survival analysis techniques. An earlier work by Dudley et al.[7] explored the idea by comparing different analytic models for the cost of CABG surgery. In addition to the Cox model, these investigators studied the OLS method, with and without a log transformation of the cost variable $C$, a parametric Weibull regression model, and a binary logistic model using a dichotomization of $C$. In their analysis, the Cox model provided the most accurate estimates of the mean and median cost of CABG surgery, and the proportion of patients with high cost (>$20,000). This method was successfully reapplied to assess the determinants of costs in CABG surgery.[8]

Recent articles[9-12] have questioned the appropriateness of survival analytic methods for medical costs, particularly in the treatment of censoring. The total cost $C(T)$ would not, in general, be independent of $C(U)$ even if $T$ and $U$ are independent. To apply standard survival analysis methods, we would need the independence of $C(T)$ and $C(U)$ given the covariate profile $Z$.
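A small simulation can illustrate why censoring on the cost scale is not independent even when $T$ and $U$ are. The sketch below is illustrative only: it assumes a hypothetical patient-specific cost rate, so that $C(t)$ is the rate multiplied by $t$, with $T$ and $U$ drawn independently. The distributions and parameter values are assumptions and do not come from the studies cited above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

T = rng.exponential(10.0, n)                 # event times
U = rng.exponential(15.0, n)                 # follow-up times, independent of T
rate = rng.lognormal(0.0, 0.7, n)            # hypothetical cost per unit time

# Observable data under right-censoring.
X = np.minimum(T, U)                         # X = min(T, U)
delta = (T <= U).astype(int)                 # 1 if T observed, 0 if censored
observed_cost = rate * X                     # C(T) if delta = 1, else C(U)

# Latent quantities, available here only because the data are simulated.
cost_T = rate * T                            # total cost C(T)
cost_U = rate * U                            # cost at the censoring time C(U)

corr_time = np.corrcoef(T, U)[0, 1]
corr_cost = np.corrcoef(cost_T, cost_U)[0, 1]
print(f"corr(T, U)       = {corr_time:.3f}")
print(f"corr(C(T), C(U)) = {corr_cost:.3f}")
print(f"censored fraction = {1 - delta.mean():.3f}")
```

In this setup the shared cost rate induces a positive correlation between $C(T)$ and $C(U)$ although the times themselves are uncorrelated, which is the difficulty with treating $C(U)$ as an independent censoring variable for $C(T)$.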
When cumulative cost histories are available over time, greater flexibility in modeling is possible that skirts the issue of censored costs. In a discussion of different models for predicting the cost of illness, Lipscomb et al.[1] applied a proportional hazards model for the cost intensity $\alpha(c \mid Z)$ on a large data set of Medicare patients hospitalized for stroke. Costs were analyzed for a 36-month period following hospital discharge. The unit of analysis was a patient-month, making the potential cost incurred in month $j$ right-censored if the patient died during that month. If this occurred, only costs through the first $(j-1)$ months were considered, thereby skirting the issue of censored costs. Any dependence of costs in month $j$ on the stroke patient's cost history is captured in the regression model by using as covariates the initial cost of hospitalization and costs in follow-up months $(j-1)$ and $(j-2)$. Also included were patient characteristics such as age, race, gender and economic status. The investigators found that the Cox proportional hazards model and the two-part model were superior in their ability to predict accurately the distribution of costs, based on a logarithmic scoring rule to compare models. For predictions of mean and median costs these models and log-transformed linear models performed equally well.

Because $C(T)$ and $C(U)$ cannot in general be independent, Lin et al.[10] proposed two alternative ways to analyze cost in trials with incomplete patient follow-up. An assumption is made that patients are not censored because they accrue unusually high or low costs. Under one approach, if cost histories are available for each patient, an estimate called the Kaplan-Meier Sample Average (KMSA) estimate of the average cost $E\,C(T)$ is computed. It is essentially a sum, over several time periods in $[0, T]$, of the average cost incurred in each period, weighted by the Kaplan-Meier estimate of survival at the start of that period.
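The weighting idea behind the KMSA estimate can be sketched as follows. This is a simplified illustration in the spirit of the description above, not a reproduction of the published estimator: the function names, the period grid, the constant daily cost and the simulated follow-up times are all hypothetical, and the average cost in a period is taken over patients still under observation at the start of that period.

```python
import numpy as np

def km_survival_before(x, delta, a):
    """Kaplan-Meier estimate of P(T >= a) from observed times x and indicators delta."""
    s = 1.0
    for t in np.unique(x[(delta == 1) & (x < a)]):
        at_risk = np.sum(x >= t)
        events = np.sum((x == t) & (delta == 1))
        s *= 1.0 - events / at_risk
    return s

def kmsa_mean_cost(x, delta, period_costs, starts):
    """Sum over periods of KM survival at the period start times the average
    period cost among patients still under observation at that start.
    period_costs[i, k] is the cost patient i was observed to incur in period k."""
    total = 0.0
    for k, a in enumerate(starts):
        at_risk = x >= a
        if not at_risk.any():
            break
        total += km_survival_before(x, delta, a) * period_costs[at_risk, k].mean()
    return total

# Hypothetical data: four 30-day periods and a constant cost of 100 per day.
rng = np.random.default_rng(2)
n = 200
T = rng.exponential(60.0, n)
U = rng.uniform(30.0, 200.0, n)
x, delta = np.minimum(T, U), (T <= U).astype(int)
starts = np.array([0.0, 30.0, 60.0, 90.0])
period_costs = 100.0 * np.clip(x[:, None] - starts[None, :], 0.0, 30.0)
estimate = kmsa_mean_cost(x, delta, period_costs, starts)

# Sanity check: without censoring the estimate reduces to the sample mean cost.
full_costs = 100.0 * np.clip(T[:, None] - starts[None, :], 0.0, 30.0)
no_censoring = kmsa_mean_cost(T, np.ones(n, dtype=int), full_costs, starts)
print(estimate, no_censoring, full_costs.sum(axis=1).mean())
```

In this sketch, patients censored in the middle of a period contribute only part of that period's cost, so the version with censoring is reliable only when censoring occurs at period boundaries.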
If cost histories are not available, the second approach bases cost estimates on the subset of patients who experience the "event" at issue (in their example, death). The properties of these estimators are dependent on the assumption of discrete censoring times, which is not true in general. Bang and Tsiatis[13] introduced a class of weighted estimators for mean medical costs, which account appropriately for censoring. Besides the consistency and the asymptotic normality, the efficiency of their estimators was also studied. None of these methods incorporates covariates in the cost modeling.

The KMSA approach to defining average cost is similar to the approach taken by Gardiner et al.[14, 15] in evaluating the cost-effectiveness of the Implantable Cardioverter Defibrillator (ICD). Expected total cost over a fixed time interval $[0, t_0]$ was defined by $\int_0^{t_0} e^{-rt} S(t)\,dC(t)$, where $r$ is the discount rate, $S$ the survival function and $C(t)$ the value of cumulative resource use up to time $t$. Apart from the discounting, the integrand weights by $S(t)$ the incremental expenditure $dC(t)$ in the small interval $[t, t + dt]$. The cost $C(\cdot)$ was assumed to be nonstochastic. It was derived from Medicare payments, drug charges and physician fees for services that were associated with the interventions. In their application to the cost-effectiveness of the ICD, cost comprised expenditures over 72 30-day periods.

Very recently methods have been proposed to explicitly account for cost censoring and also adjust for patient characteristics. Lin[16] modified in several ways the familiar normal equations for least-squares estimation. The total cost in subjects with complete follow-up is used in the proposed methodology. More efficient estimators are provided when cost data are recorded in multiple time intervals. Lin[11] also developed a methodology that specifies multiplicative rather than additive covariate effects on the mean medical cost.
He proposed a semiparametric proportional means regression model for the cumulative medical cost. This model specifies that the mean cost function over time, conditional on a set of covariates, is equal to an arbitrary baseline mean function multiplied by an exponential regression function. The corresponding inference procedures are based on possibly censored observations of the lifetime cost.

Objectives and Structure of Thesis

The existing literature on cost analysis is still in its infancy. Several issues of practical importance have yet to be addressed. We believe that our research will help fill some methodological gaps in modeling and estimation of medical costs, particularly in longitudinal studies. With the growing availability of large databases on patient-level health care utilization and outcomes, there is a need to develop statistical techniques to analyze jointly both costs and patient outcomes. Current methods generally focus on a single measure of cost or health outcome and do not fully exploit the longitudinal dynamic mechanisms that engender cost and health outcome data. How individual characteristics might impact summary statistics such as mean cost, median cost and survival is key to predicting resource utilization and informing policy on allocating health care dollars.

A simple, although important, situation in which our proposed methods would apply is in assessing the correlates of hospital cost and LOS jointly. Several studies have used LOS as a proxy for resource utilization[17, 18], while others have focused on the correlates of hospital cost or charge[19-22]. Because of the likely correlation between cost and LOS, it is unclear if a model that explicitly recognizes this correlation would lead to different quantitative results. Other issues, such as the skewness in the distribution of cost and whether or not in-hospital deaths should be regarded as censoring events, also add to the complexity of the joint analyses of LOS and hospital cost.
In Chapter 1 we develop a bivariate model for the two outcomes, LOS and cost, from a common constellation of covariates that might influence their joint distribution. Cost per patient, measured in monetary units, is the total resource use from admission to discharge, and LOS, measured in time units, is the duration of the hospital stay. Our regression analyses account for three important aspects. First, because the distributions of cost and LOS are skewed, for each outcome we use either a semiparametric Cox model or a linear model applied to a transformation of the outcome. Second, the model accounts for incomplete observations in both outcomes. Finally, although the correlation between cost and LOS is not a primary concern, we use methods developed for multiple failure times[23] to adjust for its impact on the standard errors of regression coefficients. The proposed methods are applied to assess the influence of comorbidity and demographic factors on LOS and hospital costs in a cohort of patients who underwent CABG surgery.

The longitudinal framework that underlies survival analytic techniques provides a natural setting for a complete specification of alternative models for estimating costs. In Chapters 2 and 3 we develop several models of increasing complexity for health care costs and outcomes. In these models costs are considered to evolve dynamically over time.

In Chapter 2 we focus on cost alone, including LOS among other patient characteristics as potential correlates of the accumulating hospital cost. The model permits estimation of mean cost over a given duration of hospital stay. The mean cost over a specified duration is a weighted average of the expected cumulative cost, with weighting determined by the distribution of LOS. We demonstrate the application of this technique using the same study as in Chapter 1.
The described model can be incorporated in a more general setup, in longitudinal studies with multiple health states and transitions between them. When a health care intervention is deployed, costs are engendered through the use of resources. These occur in random amounts at random times that might differ among patients. Incorporating these components into statistical models that accurately reflect the patient health histories permits consideration of health outcomes and costs jointly. In the actuarial literature it is common to model the stochastic mechanism governing the events that trigger the payment of life insurance benefits as a Markov chain with finite state space, where each sample path is interpreted as the life history of the insured.[24-28] We extend and adapt these models to our context.

In Chapter 3 we propose longitudinal stochastic models that reflect the experience of patients in sustaining and changing states of health. We use a Markov model[29] to describe the evolution of patient histories over time. Dependence of the transition intensities on patient characteristics is modeled through the Cox regression model. We consider two types of costs that might be incurred in the course of follow-up: costs at transitions between health states (e.g., cost of diagnosis of a condition) and costs of sojourns in a health state (e.g., cost of the treatment of that particular condition). Present values are obtained by discounting all expenditures at a fixed rate. Conditional on the initial state and a specified covariate profile, we provide estimators of the expected present value of these two types of costs incurred over a fixed follow-up period. Under additional assumptions, these estimators can be shown to be consistent and asymptotically normal, which sets the stage for statistical inference.

Our proposed methods have the capability of incorporating concomitant covariate information for the time and cost outcomes.
Both fully parametric and semiparametric models were studied, including regression models for the transformed response variables, Cox regression and Markov models that specify covariate effects in transition intensities. These models are based on a natural time-cost setting and have easy interpretation. As a result, the methodologies developed in this thesis hold promise for providing a flexible, unified framework for statistical inference on summary statistics (such as the CER) used in cost-effectiveness analysis.

CHAPTER 1

A BIVARIATE MODEL FOR HOSPITAL LENGTH OF STAY AND COST

Hospitalizations constitute a significant proportion of overall expenditure in health care. Length of stay (LOS) is often used as a surrogate for hospital cost or charge. However, the increasing availability of databases with patient-specific LOS and cost permits analyses of these variables jointly, accounting for their likely correlation. In this chapter we explore the differences and advantages of using a bivariate model, compared to separate univariate models, for assessing the impact of covariates on LOS and cost. Under marginal specification through semiparametric or fully parametric regression models for LOS and cost, standard errors of estimates of regression coefficients are obtained using a robust covariance matrix[23, 30] to account for correlation between LOS and cost. These models account for incomplete observations in both outcomes.

In Section 1.1 we use a Cox regression model for each of the LOS and cost outcomes and in Section 1.2 a parametric model. In Section 1.3 we apply the proposed methods to LOS and hospital cost in a cohort of patients who underwent coronary artery bypass surgery (CABG).

1.1 Semiparametric Marginal Models

Consider $n$ individuals in a study. Censoring occurs when assessment of LOS and cost are made at some fixed calendar time. This is called administrative censoring. At that time some patients may not have completed their LOS or incurred all their costs.
For the i-th patient we observe the true LOS $T_i$ or the censoring time $T_i'$, whichever occurs first, and $Z_{1i}(\cdot)$, a vector of p explanatory variables that depends on the current time t since admission. We restrict the time t to a finite interval $[0,\tau_1]$, $\tau_1<\infty$. The nonnegative variables $T_i$, $T_i'$ and the process $Z_{1i}(\cdot)$ are defined on the probability space $(\Omega_1,\mathcal{F}_1,P_1)$. Let $X_{1i}=\min(T_i,T_i')$, $\delta_{1i}=[T_i\le T_i']$, where $[\cdot]$ is the indicator function of the displayed event.

For the i-th patient we also observe $X_{2i}=\min(C_i,C_i')$, $\delta_{2i}=[C_i\le C_i']$ and $Z_{2i}(\cdot)$, where $C_i$ is the total hospital cost, $C_i'$ is the censoring cost and $Z_{2i}(\cdot)$ is a vector of p explanatory variables that depends on the current cost c since admission. The costs are restricted to a finite interval $[0,\tau_2]$, $\tau_2<\infty$. The nonnegative variables $C_i$, $C_i'$ and the process $Z_{2i}(\cdot)$ are defined on the probability space $(\Omega_2,\mathcal{F}_2,P_2)$.

The following hazard functions $\alpha_{1i}$, $\alpha_{2i}$ relate the covariates to the distributions of $T_i$ and $C_i$.

For LOS:
\[ \alpha_{1i}(t,\beta_{10})=\alpha_{10}(t)\exp(\beta_{10}'Z_{1i}(t)). \tag{1.1} \]
For cost:
\[ \alpha_{2i}(c,\beta_{20})=\alpha_{20}(c)\exp(\beta_{20}'Z_{2i}(c)). \tag{1.2} \]

We assume the vectors of true values of the regression parameters $\beta_{10}$, $\beta_{20}$ to have dimension p. The underlying intensities $\alpha_{10}(\cdot)$, $\alpha_{20}(\cdot)$ are the baseline intensities corresponding to zero covariates and they are left completely unspecified. The subscripts allow us to distinguish between covariate effects on LOS and cost.

Given a fixed covariate profile $Z_0$ and $T_i\ge t$, the hazard $\alpha_{1i}(t,\beta_{10}\mid Z_0)$ is the instantaneous rate (probability per unit time) at which LOS would end just after time t. Similarly, given $Z_0$ and $C_i\ge c$, the hazard $\alpha_{2i}(c,\beta_{20}\mid Z_0)$ is interpreted as the instantaneous rate at which total cost would be realized just above the level c.

With independent identically distributed (i.i.d.)
data $(X_{1i},\delta_{1i},Z_{1i}(\cdot))$ on n patients we obtain on the basis of (1.1) the estimate $\hat\beta_1$ of $\beta_{10}$ by maximum partial likelihood estimation and the Nelson-Aalen estimate $\hat\Lambda_{10}(t,\hat\beta_1)$ of the integrated baseline hazard $\Lambda_{10}(t)=\int_0^t\alpha_{10}(u)\,du$. Analogously, with i.i.d. data $(X_{2i},\delta_{2i},Z_{2i}(\cdot))$ we use (1.2) to obtain the corresponding estimates $\hat\beta_2$ and $\hat\Lambda_{20}(c,\hat\beta_2)$. Following Wei et al. (1989) [23], $(\hat\beta_1',\hat\beta_2')'$ has an asymptotic 2p-variate normal distribution whose covariance matrix can be consistently estimated.

The survival distributions of LOS and cost at a fixed (i.e. time-independent) covariate profile $Z_0$ are, respectively,
\[ S_1(t\mid Z_0)=P(T_i>t\mid Z_0)=\exp\!\big(-\Lambda_{10}(t)\,e^{\beta_{10}'Z_0}\big) \]
and
\[ S_2(c\mid Z_0)=P(C_i>c\mid Z_0)=\exp\!\big(-\Lambda_{20}(c)\,e^{\beta_{20}'Z_0}\big). \]
Their estimates, denoted by $\hat S_1(t\mid Z_0)$ and $\hat S_2(c\mid Z_0)$, are obtained by replacing the unknown quantities by the aforementioned estimates. We will show that for fixed time t and cost c, given a fixed covariate profile $Z_0$,
\[ \big\{n^{1/2}(\hat S_1(t\mid Z_0)-S_1(t\mid Z_0)),\; n^{1/2}(\hat S_2(c\mid Z_0)-S_2(c\mid Z_0))\big\} \]
is asymptotically bivariate normal, with zero mean vector and a covariance matrix CS that can be consistently estimated from the data. We call CS the adjusted covariance matrix. Approximate pointwise 95% confidence intervals for $S_1(t\mid Z_0)$ and $S_2(c\mid Z_0)$ are calculated, and point estimates and approximate 95% confidence intervals for the median LOS and median cost are obtained from the estimated survival curves by the procedure described on pp. 511-512 of Andersen et al. (1993) [29].

1.1.1 Estimation of the Regression Parameters and Integrated Baseline Hazards

Define for each patient the processes $N_{1i}(t)=[X_{1i}\le t,\ \delta_{1i}=1]$ and $Y_{1i}(t)=[X_{1i}\ge t]$. Aggregated over all n patients, the processes $N_1(t)=\sum_{i=1}^n N_{1i}(t)$ and $Y_1(t)=\sum_{i=1}^n Y_{1i}(t)$ denote respectively the number of patients with completed LOS by time t and the number who have not completed their hospital stay just prior to time t from admission. Similarly define $N_{2i}(c)$, $Y_{2i}(c)$.
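The aggregated counting and at-risk processes defined above can be evaluated directly from the observed pairs. The sketch below is only an illustration; the function and variable names are hypothetical, and the same code applies on the cost scale with costs in place of times.

```python
def counting_processes(x, delta, u):
    """Evaluate the aggregated processes at the point u.

    N(u) = number of subjects with X_i <= u and delta_i = 1 (completed by u)
    Y(u) = number of subjects with X_i >= u (still at risk just before u)
    """
    n_u = sum(1 for xi, di in zip(x, delta) if xi <= u and di == 1)
    y_u = sum(1 for xi in x if xi >= u)
    return n_u, y_u


# Observed LOS (days) and completion indicators for three patients.
x_obs = [7, 20, 4]
d_obs = [1, 0, 1]
at_5 = counting_processes(x_obs, d_obs, 5)    # one stay completed, two still in hospital
at_25 = counting_processes(x_obs, d_obs, 25)  # nobody left at risk
```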
Here the aggregated process $N_2(c)$ denotes the number of patients whose total hospital costs, completely observed, do not exceed c, and $Y_2(c)$ the number of patients whose current hospital cost is at least c. In the sequel, for notational convenience, we use a single generic argument u for all processes, remembering that the subscript 1 is associated with time and the subscript 2 with cost.

We need some standard notation. For $k\in\{1,2\}$ let
\[ S_k^{(0)}(u,\beta_k)=\sum_{i=1}^n Y_{ki}(u)\exp(\beta_k'Z_{ki}(u)), \]
\[ S_k^{(1)}(u,\beta_k)=\sum_{i=1}^n Y_{ki}(u)Z_{ki}(u)\exp(\beta_k'Z_{ki}(u)), \qquad S_k^{(2)}(u,\beta_k)=\sum_{i=1}^n Y_{ki}(u)Z_{ki}(u)^{\otimes 2}\exp(\beta_k'Z_{ki}(u)), \]
\[ E_k(u,\beta_k)=\frac{S_k^{(1)}(u,\beta_k)}{S_k^{(0)}(u,\beta_k)}, \]
and let $s_k^{(m)}(u,\beta_k)$, $m\in\{0,1,2\}$, denote the corresponding expectations for a single subject, e.g. $s_k^{(0)}(u,\beta_k)=E\{Y_{k1}(u)\exp(\beta_k'Z_{k1}(u))\}$, with
\[ e_k=\frac{s_k^{(1)}}{s_k^{(0)}},\qquad v_k=\frac{s_k^{(2)}}{s_k^{(0)}}-e_k^{\otimes 2},\qquad \Sigma_k(\beta_k)=\int_0^{\tau_k}v_k(u,\beta_k)\,s_k^{(0)}(u,\beta_k)\,\alpha_{k0}(u)\,du. \]

We formulate our models and prove many of the results in the framework of multivariate counting processes. A survey of this theory can be found in Andersen et al. (1993) [29]. Our notation follows that of this reference.

The following list of conditions will be assumed to hold throughout this section:

Model Assumptions:

A.1 Conditional on $Z_{1i}(\cdot)$, $T_i$ and $T_i'$ are independent, and conditional on $Z_{2i}(\cdot)$, $C_i$ and $C_i'$ are independent;

A.2 $\{(X_{1i},X_{2i})',\ (\delta_{1i},\delta_{2i})',\ (Z_{1i}(\cdot),Z_{2i}(\cdot))'\}$, $1\le i\le n$, are i.i.d.;

A.3 $\Lambda_{10}(\tau_1)=\int_0^{\tau_1}\alpha_{10}(t)\,dt<\infty$, $\Lambda_{20}(\tau_2)=\int_0^{\tau_2}\alpha_{20}(c)\,dc<\infty$;

A.4 $Z_{1i}(\cdot)$, $Z_{2i}(\cdot)$ are bounded;

A.5 $Z_{1i}(\cdot)$, $Z_{2i}(\cdot)$ are adapted, left-continuous processes with right-hand limits;

A.6 $P(Y_{ki}(u)=1,\ \forall u\in[0,\tau_k])=P(Y_{ki}(\tau_k)=1)>0$;

A.7 $\Sigma_k=\Sigma_k(\beta_{k0})$ is positive definite.

Note: A.1 is also called the independent censoring assumption. A.2 implies that $(N_{ki}(u),Y_{ki}(u),Z_{ki}(u),\ u\in[0,\tau_k])$, $1\le i\le n$, are i.i.d. A.4 and A.5 assure that $Z_{1i}(\cdot)$, $Z_{2i}(\cdot)$ are bounded, predictable processes. For k = 1, the interpretation of A.6 is that there is a positive probability that at any time t from admission a subject might not have completed his/her hospital stay. For k = 2, the interpretation of A.6 is that there is a positive probability that the total cost of any subject might be larger than any cost c from admission. Assumption A.7 is crucial for the (asymptotic) existence of the regression parameter estimator.
Consider on the probability space $(\Omega_1,\mathcal{F}_1,P_1)$ the right-continuous nondecreasing family $(\mathcal{F}_{1i}(t),\ t\in[0,\tau_1])$, where $\mathcal{F}_{1i}(t)$ represents everything that happens up to the time t for the i-th patient. Formally $\mathcal{F}_{1i}(t)=\mathcal{F}_{1i}^{1}(t)\vee\mathcal{F}_{1i}^{2}(t)$, where $\mathcal{F}_{1i}^{1}(t)=\sigma\{N_{1i}(s),\ s\le t\}$, $\mathcal{F}_{1i}^{2}(t)=\sigma\{Z_{1i}(s),\ s\le t\}$. Similarly we define the filtration $(\mathcal{F}_{2i}(c),\ c\in[0,\tau_2])$ on the probability space $(\Omega_2,\mathcal{F}_2,P_2)$.

Under the independent censoring assumption, $N_{ki}(\cdot)$ is a counting process on $(\Omega_k,\mathcal{F}_k,P_k)$ with the $\mathcal{F}_{ki}(\cdot)$-intensity process $\lambda_{ki}(v,\beta_{k0})=\alpha_{ki}(v,\beta_{k0})Y_{ki}(v)$, where the hazard function $\alpha_{ki}(\cdot,\beta_{k0})$ was defined in (1.1), (1.2). The processes $M_{ki}$ defined by
\[ M_{ki}(u)=N_{ki}(u)-\int_0^u\lambda_{ki}(v,\beta_{k0})\,dv,\qquad u\in[0,\tau_k], \tag{1.3} \]
are $\mathcal{F}_{ki}(\cdot)$-local square integrable martingales on the interval $[0,\tau_k]$, with $\langle M_{ki}\rangle(u)=\int_0^u\lambda_{ki}(v,\beta_{k0})\,dv$ and $\langle M_{ki},M_{kj}\rangle=0$ for $i\ne j$, i.e. $M_{ki}$ and $M_{kj}$ are orthogonal for $i\ne j$. The process $\Lambda_{ki}(u,\beta_{k0})=\int_0^u\lambda_{ki}(v,\beta_{k0})\,dv$ is called the compensator of the counting process $N_{ki}(\cdot)$. For details of this result see Andersen et al. (1993) [29], Sections II.4.1 and III.2.2.

Let $\Omega_k^{(n)}=\Omega_k\times\dots\times\Omega_k$, $\mathcal{F}_k^{(n)}=\mathcal{F}_k\otimes\dots\otimes\mathcal{F}_k$, $P_k^{(n)}=P_k\otimes\dots\otimes P_k$ and $\mathcal{F}_k^{(n)}(u)=\mathcal{F}_{k1}(u)\otimes\dots\otimes\mathcal{F}_{kn}(u)$, where each product is over n factors. The family $(\mathcal{F}_k^{(n)}(u),\ u\in[0,\tau_k])$ is a filtration on the n-th sample space $(\Omega_k^{(n)},\mathcal{F}_k^{(n)},P_k^{(n)})$. Then (see Andersen et al. (1993) [29], Section II.4.3) $N_{ki}$ has the same compensator $\Lambda_{ki}$ with respect to the product sample space $(\Omega_k^{(n)},\mathcal{F}_k^{(n)},P_k^{(n)})$ and the filtration $(\mathcal{F}_k^{(n)}(u),\ u\in[0,\tau_k])$. When LOS and cost quantities are considered jointly, the stochastic properties are relative to the filtration $(\mathcal{F}_1^{(n)}(t)\otimes\mathcal{F}_2^{(n)}(c),\ (t,c)\in[0,\tau_1]\times[0,\tau_2])$ on the product space $(\Omega_1^{(n)}\times\Omega_2^{(n)},\ \mathcal{F}_1^{(n)}\otimes\mathcal{F}_2^{(n)},\ P_1^{(n)}\otimes P_2^{(n)})$.

An estimator $\hat\beta_k$ of $\beta_{k0}$ is obtained by maximizing the Cox partial likelihood (see Andersen et al. (1993) [29], pp. 483-484). The log-partial likelihood evaluated at time/cost u has the form
\[ C_k(u,\beta_k)=\sum_{i=1}^n\int_0^u\beta_k'Z_{ki}(v)\,dN_{ki}(v)-\int_0^u\log S_k^{(0)}(v,\beta_k)\,dN_k(v). \]
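For time-fixed covariates the log-partial likelihood above reduces to a sum over observed completions: each such subject contributes its linear predictor minus the logarithm of the risk-set sum at its own observed value. The following sketch evaluates this quantity at the end of the observation window. It is an illustration under simplifying assumptions (time-independent covariates, every observed value inside the window); the function name and the toy data are hypothetical.

```python
import math


def cox_log_partial_likelihood(x, delta, z, beta):
    """Cox log-partial likelihood for time-fixed covariates.

    x     : observed values X_i (time or cost)
    delta : completion indicators (1 = completely observed)
    z     : list of covariate vectors Z_i
    beta  : regression coefficient vector
    """
    total = 0.0
    for xi, di, zi in zip(x, delta, z):
        if di != 1:
            continue  # censored subjects only enter through the risk sets
        linear = sum(b * v for b, v in zip(beta, zi))
        # Risk-set sum over subjects j with X_j >= X_i of exp(beta' Z_j)
        s0 = sum(
            math.exp(sum(b * v for b, v in zip(beta, zj)))
            for xj, zj in zip(x, z)
            if xj >= xi
        )
        total += linear - math.log(s0)
    return total


loglik_null = cox_log_partial_likelihood([1, 2, 3], [1, 1, 1], [[0], [1], [0]], [0.0])
loglik_cens = cox_log_partial_likelihood([1, 2, 3], [1, 0, 1], [[0], [1], [0]], [0.0])
```

Maximizing this function over the coefficient vector (for example with a Newton-type iteration) gives the maximum partial likelihood estimate; the sketch only evaluates the objective.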
The vector $U_k(u,\beta_k)$ of derivatives of $C_k(u,\beta_k)$ with respect to $\beta_k$ is
\[ U_k(u,\beta_k)=\sum_{i=1}^n\int_0^u Z_{ki}(v)\,dN_{ki}(v)-\int_0^u\frac{S_k^{(1)}(v,\beta_k)}{S_k^{(0)}(v,\beta_k)}\,dN_k(v). \tag{1.4} \]
The maximum partial likelihood estimator $\hat\beta_k$ of $\beta_{k0}$ is defined as the solution of the likelihood equation $U_k(\tau_k,\beta_k)=0$. Then the Nelson-Aalen estimator of the integrated baseline hazard $\Lambda_{k0}(u)=\int_0^u\alpha_{k0}(v)\,dv$ is given by
\[ \hat\Lambda_{k0}(u,\hat\beta_k)=\int_0^u\frac{J_k(v)}{S_k^{(0)}(v,\hat\beta_k)}\,dN_k(v), \]
where $J_k(v)=[Y_k(v)>0]$ (see Andersen et al. (1993) [29], Sections IV.1 and VII.2.1).

1.1.2 Large Sample Properties of the Estimators of Regression Parameters and Integrated Baseline Hazards

The following conditions are needed for the asymptotic properties of our estimators. They were first introduced by Andersen and Gill (1982) [31]. We use their formulation from Andersen et al. (1993) [29]. Throughout this chapter the norm of a vector $a=(a_i)$ or a matrix $A=(a_{ij})$ is $\|a\|=\sup_i|a_i|$ and $\|A\|=\sup_{i,j}|a_{ij}|$, respectively.

Conditions C.a-C.f:

There exist a compact neighborhood $\mathcal{B}_k$ of $\beta_{k0}$, with $\beta_{k0}\in\mathcal{B}_k^{0}$ (the interior of $\mathcal{B}_k$), and scalar, vector and matrix functions $s_k^{(0)}$, $s_k^{(1)}$, $s_k^{(2)}$ defined on $[0,\tau_k]\times\mathcal{B}_k$ such that for $m\in\{0,1,2\}$:

C.a $\sup_{u\in[0,\tau_k],\,\beta_k\in\mathcal{B}_k}\big\|n^{-1}S_k^{(m)}(u,\beta_k)-s_k^{(m)}(u,\beta_k)\big\|\xrightarrow{P}0$;

C.b $s_k^{(m)}(\cdot,\cdot)$ is a uniformly continuous bounded function of $(u,\beta_k)\in[0,\tau_k]\times\mathcal{B}_k$;

C.c $s_k^{(0)}(\cdot,\cdot)$ is bounded away from zero;

C.d $s_k^{(1)}(u,\beta_k)=\dfrac{\partial}{\partial\beta_k}s_k^{(0)}(u,\beta_k)$, $\quad s_k^{(2)}(u,\beta_k)=\dfrac{\partial^2}{\partial\beta_k^2}s_k^{(0)}(u,\beta_k)$;

C.e $\Sigma_k$ is positive definite;

C.f $\int_0^{\tau_k}\alpha_{k0}(u)\,du<\infty$.

Under our model assumptions A.1-A.7 the conditions C.a-C.f are verified for the functions $s_k^{(0)}$, $s_k^{(1)}$, $s_k^{(2)}$ defined in the previous sub-section. For a proof and a discussion of these regularity conditions see Section 4 of Andersen and Gill (1982) [31].

Under these general conditions, with a probability tending to one, there exists a unique consistent solution $\hat\beta_k$ of the likelihood equation:

Theorem 1 (Theorem VII.2.1, p497, Andersen et al.
(1993) [29]) Under the assumptions A.1, A.2 and conditions C.a-C.f, the probability that the equation $U_k(\tau_k,\beta_k)=0$ has a unique solution $\hat\beta_k$ tends to 1 and $\hat\beta_k\xrightarrow{P}\beta_{k0}$ as $n\to\infty$.

The assumption of bounded covariates is very important for proving the asymptotic normality of $(\hat\beta_1',\hat\beta_2')'$. This assumption implies the Lindeberg-type condition used in Andersen et al. (1993) [29]. Wei et al. (1989) [23] proved the following theorem. Because some intermediate steps of the proof of this theorem will be used later, we sketch them below.

Theorem 2 Under the model assumptions A.1-A.7, $n^{1/2}(\hat\beta_1'-\beta_{10}',\ \hat\beta_2'-\beta_{20}')'$ converges in distribution to a zero mean normal 2p-dimensional random vector with covariance matrix $Q=(D_{kl},\ k,l\in\{1,2\})$, where the $p\times p$ matrix $D_{kl}$ is
\[ D_{kl}=\Sigma_k^{-1}E\big(w_{k1}(\beta_{k0})\,w_{l1}(\beta_{l0})'\big)\Sigma_l^{-1}, \]
with $w_{ki}(\beta_{k0})$ a p-dimensional vector,
\[ w_{ki}(\beta_{k0})=\int_0^{\tau_k}\big(Z_{ki}(u)-e_k(u,\beta_{k0})\big)\,dM_{ki}(u). \]

Sketch of the proof: By (1.4) the score function $U_k(u,\beta_k)$ has the form
\[ U_k(u,\beta_k)=\sum_{i=1}^n\int_0^u\big(Z_{ki}(v)-E_k(v,\beta_k)\big)\,dN_{ki}(v). \]
Replacing $N_{ki}(v)$ by (1.3), it follows immediately that, at the true value $\beta_{k0}$,
\[ U_k(u,\beta_{k0})=\sum_{i=1}^n\int_0^u\big(Z_{ki}(v)-E_k(v,\beta_{k0})\big)\,dM_{ki}(v). \]
By Taylor expansion of $U_k(\tau_k,\beta_k)$ around $\beta_{k0}$, we have
\[ n^{-1/2}U_k(\tau_k,\beta_{k0})=\big(n^{-1}I_k(\beta_k^*)\big)\,n^{1/2}(\hat\beta_k-\beta_{k0}), \]
where $-I_k(\beta_k)$ is the matrix of derivatives of $U_k(\tau_k,\beta_k)$ with respect to $\beta_k$ and $\beta_k^*$ is on the line segment between $\hat\beta_k$ and $\beta_{k0}$.

Step 1: $n^{-1/2}\big(U_1(\tau_1,\beta_{10})',\ U_2(\tau_2,\beta_{20})'\big)'$ converges in distribution to a zero mean normal 2p-dimensional random vector. The asymptotic covariance is $B=(B_{kl},\ k,l\in\{1,2\})$, where $B_{kl}=E\big(w_{k1}(\beta_{k0})\,w_{l1}(\beta_{l0})'\big)$.

Step 2: $n^{-1}I_k(\beta_k^*)\xrightarrow{P}\Sigma_k$ for any random $\beta_k^*$ such that $\beta_k^*\xrightarrow{P}\beta_{k0}$.

Step 3: $n^{1/2}(\hat\beta_1'-\beta_{10}',\ \hat\beta_2'-\beta_{20}')'$ converges in distribution to a zero mean normal 2p-dimensional random vector with asymptotic covariance matrix $Q=A^{-1}BA^{-1}$, where $A=\mathrm{diag}(\Sigma_1,\Sigma_2)$ and B was defined in Step 1. □

The next theorem gives a consistent estimator $\hat Q$ of the asymptotic covariance matrix Q. We follow closely the notation of Wei et al.
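(1989) [23].

Theorem 3 Under the model assumptions A.1-A.7, the asymptotic covariance matrix $Q=(D_{kl},\ k,l\in\{1,2\})$ of $n^{1/2}(\hat\beta_1'-\beta_{10}',\ \hat\beta_2'-\beta_{20}')'$ is consistently estimated by $\hat Q=(\hat D_{kl},\ k,l\in\{1,2\})$, with
\[ \hat D_{kl}=n^{2}\,I_k^{-1}(\hat\beta_k)\,\hat B_{kl}\,I_l^{-1}(\hat\beta_l), \]
where $\hat B_{kl}=n^{-1}\sum_{i=1}^n\hat W_{ki}(\hat\beta_k)\hat W_{li}(\hat\beta_l)'$ and $\hat W_{ki}(\hat\beta_k)$ is a p-dimensional vector,
\[ \hat W_{ki}(\hat\beta_k)=\int_0^{\tau_k}\big(Z_{ki}(u)-E_k(u,\hat\beta_k)\big)\,d\hat M_{ki}(u,\hat\beta_k), \]
\[ \hat M_{ki}(u,\hat\beta_k)=N_{ki}(u)-\int_0^u Y_{ki}(v)\exp(\hat\beta_k'Z_{ki}(v))\frac{J_k(v)}{S_k^{(0)}(v,\hat\beta_k)}\,dN_k(v). \]

Wei et al. (1989) [23] do not provide a proof of this result. Although they mention that the proof essentially uses the same techniques as in the proofs of Theorem 1 of Wei and Lachin (1984) [32] and of Theorem 3.2 and Corollary 3.3 of Andersen and Gill (1982) [31], these references are not in the context of our model. In the parametric section of this chapter we need a similar theorem and we provide all the details of its proof.

In the following we present several results related to the large sample properties of the integrated baseline hazard estimators. As mentioned in Section 1.1.1, the estimator of the integrated baseline hazard $\Lambda_{k0}(u)=\int_0^u\alpha_{k0}(v)\,dv$ is given by
\[ \hat\Lambda_{k0}(u,\hat\beta_k)=\int_0^u\frac{J_k(v)}{S_k^{(0)}(v,\hat\beta_k)}\,dN_k(v), \]
where $J_k(v)=[Y_k(v)>0]$. As shown in the proof of Theorem VII.2.3, p504, Andersen et al. (1993) [29], $n^{1/2}(\hat\Lambda_{k0}(u,\hat\beta_k)-\Lambda_{k0}(u))$ can be expanded as
\[ W_k(u)-n^{1/2}(\hat\beta_k-\beta_{k0})'\int_0^u e_k(v,\beta_{k0})\alpha_{k0}(v)\,dv+o_p(1), \]
where
\[ W_k(u)=n^{1/2}\int_0^u\frac{J_k(v)}{S_k^{(0)}(v,\beta_{k0})}\,dM_k(v)\quad\text{and}\quad M_k(u)=\sum_{i=1}^n M_{ki}(u)=N_k(u)-\int_0^u S_k^{(0)}(v,\beta_{k0})\alpha_{k0}(v)\,dv. \]
Therefore
\[ n^{1/2}\big(\hat\Lambda_{k0}(u,\hat\beta_k)-\Lambda_{k0}(u)\big)+n^{1/2}(\hat\beta_k-\beta_{k0})'\int_0^u e_k(v,\beta_{k0})\alpha_{k0}(v)\,dv \tag{1.5} \]
is asymptotically equivalent to $W_k(u)$.

Theorem 4 Under the model assumptions A.1-A.7, $(W_1(t),W_2(c))'$ converges in distribution to a zero mean bivariate normal random vector with covariance matrix $Q^*=(D^*_{kl}(u_k,u_l),\ k,l\in\{1,2\})$, where $D^*_{kl}(u_k,u_l)=E\big(w^*_{k1}(u_k,\beta_{k0})\,w^*_{l1}(u_l,\beta_{l0})\big)$, $w^*_{ki}(u,\beta_{k0})=\int_0^u dM_{ki}(v)/s_k^{(0)}(v,\beta_{k0})$, and $u_k$ is equal to t or c according as k = 1 or 2.

Proof of Theorem 4: We first show that
\[ \sup_{u\in[0,\tau_k]}\left|\,n^{-1/2}\int_0^u J_k(v)\left[\frac{n}{S_k^{(0)}(v,\beta_{k0})}-\frac{1}{s_k^{(0)}(v,\beta_{k0})}\right]dM_k(v)\right|\xrightarrow{P}0, \tag{1.6} \]
\[ \sup_{u\in[0,\tau_k]}\left|\,n^{-1/2}\int_0^u\big(1-J_k(v)\big)\frac{dM_k(v)}{s_k^{(0)}(v,\beta_{k0})}\right|\xrightarrow{P}0. \tag{1.7} \]
For this we use Lenglart's Inequality for local square integrable martingales and the following proposition:

Lenglart's Inequality (see p86, Andersen et al. (1993) [29]) For every $\eta>0$, $\delta>0$,
\[ P\Big(\sup_{u\in[0,\tau]}|M(u)|>\eta\Big)\le\frac{\delta}{\eta^2}+P\big(\langle M\rangle(\tau)>\delta\big), \]
where $\langle M\rangle$ is the predictable variation process of the local square integrable martingale M (the compensator of $M^2$).

Proposition (see Proposition II.4.1, p78, Andersen et al. (1993) [29]) Let the counting process N have intensity process $\lambda$, let $M=N-\int\lambda$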
and let H be locally bounded and predictable. Then M and $\int H\,dM$ are local square integrable martingales with
\[ \langle M\rangle=\int\lambda,\qquad \Big\langle\int H\,dM\Big\rangle=\int H^2\lambda. \]

In our case $M_k(u)=N_k(u)-\int_0^u S_k^{(0)}(v,\beta_{k0})\alpha_{k0}(v)\,dv$ and the quantities in the integrals of (1.6) and (1.7) are bounded and predictable by our model assumptions. Consequently both expressions in (1.6) and (1.7) are local square integrable martingales of the type $\int H\,dM$, so we can calculate their predictable variation processes and apply Lenglart's Inequality. Let $\eta>0$, $\delta>0$. For (1.6):
\[ P\left(\sup_{u\in[0,\tau_k]}\left|n^{-1/2}\int_0^u J_k(v)\left[\frac{n}{S_k^{(0)}(v,\beta_{k0})}-\frac{1}{s_k^{(0)}(v,\beta_{k0})}\right]dM_k(v)\right|>\eta\right) \]
\[ \le\frac{\delta}{\eta^2}+P\left(\int_0^{\tau_k}J_k(v)\left[\frac{n}{S_k^{(0)}(v,\beta_{k0})}-\frac{1}{s_k^{(0)}(v,\beta_{k0})}\right]^2 n^{-1}S_k^{(0)}(v,\beta_{k0})\alpha_{k0}(v)\,dv>\delta\right). \]
By conditions C.a, C.b, C.c, C.f and the Dominated Convergence Theorem,
\[ \int_0^{\tau_k}J_k(v)\left[\frac{n}{S_k^{(0)}(v,\beta_{k0})}-\frac{1}{s_k^{(0)}(v,\beta_{k0})}\right]^2 n^{-1}S_k^{(0)}(v,\beta_{k0})\alpha_{k0}(v)\,dv\xrightarrow{P}0, \]
so (1.6) is proved. For (1.7):
\[ P\left(\sup_{u\in[0,\tau_k]}\left|n^{-1/2}\int_0^u\big(1-J_k(v)\big)\frac{dM_k(v)}{s_k^{(0)}(v,\beta_{k0})}\right|>\eta\right)\le\frac{\delta}{\eta^2}+P\left(\int_0^{\tau_k}\frac{1-J_k(v)}{s_k^{(0)}(v,\beta_{k0})^2}\,n^{-1}S_k^{(0)}(v,\beta_{k0})\alpha_{k0}(v)\,dv>\delta\right). \]
But $1-J_k(v)=[Y_k(v)=0]=[Y_{ki}(v)=0\ \forall i]=[S_k^{(0)}(v,\beta_{k0})=0]$, so the integral in the previous relation is zero. Consequently (1.7) follows.

Relations (1.6) and (1.7) imply that $W_k(u)$ is asymptotically equivalent to
\[ n^{-1/2}\int_0^u\frac{dM_k(v)}{s_k^{(0)}(v,\beta_{k0})}=n^{-1/2}\sum_{i=1}^n w^*_{ki}(u,\beta_{k0}), \tag{1.8} \]
a sum of n i.i.d. random variables
\[ w^*_{ki}(u,\beta_{k0})=\int_0^u\frac{dM_{ki}(v)}{s_k^{(0)}(v,\beta_{k0})}. \]
The quantity $s_k^{(0)}(\cdot,\beta_{k0})$ is predictable and bounded away from 0, so, by the stated Proposition, the $w^*_{ki}(u,\beta_{k0})$ are zero-mean martingales. Because the $w^*_{ki}(u,\beta_{k0})$ are also i.i.d. random variables, by the Multivariate Central Limit Theorem, $(W_1(t),W_2(c))'$ converges in distribution to a zero mean bivariate normal random vector with covariance matrix $Q^*=(D^*_{kl}(u_k,u_l),\ k,l\in\{1,2\})$, where $D^*_{kl}(u_k,u_l)=E\big(w^*_{k1}(u_k,\beta_{k0})\,w^*_{l1}(u_l,\beta_{l0})\big)$ and $u_k$ is equal to t or c according as k = 1 or 2. Therefore Theorem 4 is proved. □

The following theorem gives a consistent estimator $\hat Q^*$ of the asymptotic covariance matrix $Q^*$.
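As previously mentioned, the techniques of the proof for this type of theorem will be provided in a similar theorem in the parametric section of this chapter.

Theorem 5 Under the model assumptions A.1-A.7, the asymptotic covariance matrix $Q^*=(D^*_{kl}(u_k,u_l),\ k,l\in\{1,2\})$ of $(W_1(t),W_2(c))'$ is consistently estimated by $\hat Q^*=(\hat D^*_{kl}(u_k,u_l),\ k,l\in\{1,2\})$, with
\[ \hat D^*_{kl}(u_k,u_l)=n^{-1}\sum_{i=1}^n\hat w^*_{ki}(u_k,\hat\beta_k)\,\hat w^*_{li}(u_l,\hat\beta_l) \]
and
\[ \hat w^*_{ki}(u,\hat\beta_k)=\int_0^u\frac{n\,J_k(v)}{S_k^{(0)}(v,\hat\beta_k)}\,d\hat M_{ki}(v,\hat\beta_k), \]
\[ \hat M_{ki}(u,\hat\beta_k)=N_{ki}(u)-\int_0^u Y_{ki}(v)\exp(\hat\beta_k'Z_{ki}(v))\frac{J_k(v)}{S_k^{(0)}(v,\hat\beta_k)}\,dN_k(v). \]

1.1.3 Large Sample Properties of the Estimators of Survival Functions

The survival distribution of LOS or cost at a fixed covariate profile $Z_0$,
\[ S_k(u\mid Z_0)=\exp\!\big(-\Lambda_{k0}(u)\,e^{\beta_{k0}'Z_0}\big), \]
is estimated by
\[ \hat S_k(u\mid Z_0)=\exp\!\big(-\hat\Lambda_{k0}(u,\hat\beta_k)\,e^{\hat\beta_k'Z_0}\big). \]
We want to determine the asymptotic joint distribution of $\hat S_1(t\mid Z_0)$ and $\hat S_2(c\mid Z_0)$ for fixed t and c. Because by (1.5)
\[ n^{1/2}\big(\hat\Lambda_{k0}(u,\hat\beta_k)-\Lambda_{k0}(u)\big)+n^{1/2}(\hat\beta_k-\beta_{k0})'\int_0^u e_k(v,\beta_{k0})\alpha_{k0}(v)\,dv \]
is asymptotically equivalent to $W_k(u)$, a first step is to consider the joint distribution of the vectors $(W_1(t),W_2(c))'$ and $n^{1/2}(\hat\beta_1'-\beta_{10}',\ \hat\beta_2'-\beta_{20}')'$.

Theorem 6 Under the model assumptions A.1-A.7,
\[ \big(W_1(t),\ W_2(c),\ n^{1/2}(\hat\beta_1-\beta_{10})',\ n^{1/2}(\hat\beta_2-\beta_{20})'\big)' \]
converges in distribution to a zero-mean normal random vector with covariance matrix
\[ \tilde Q=\begin{pmatrix}Q^* & P'\\ P & Q\end{pmatrix}. \]
The matrices Q and $Q^*$ and their consistent estimators are described in Theorems 2-5. The matrix P has the form
\[ P=\begin{pmatrix}0 & P_{21}(c)\\ P_{12}(t) & 0\end{pmatrix}, \]
where $P_{kl}(u_k)=\Sigma_l^{-1}E\big(w^*_{k1}(u_k,\beta_{k0})\,w_{l1}(\beta_{l0})\big)$ for $k\ne l$ is consistently estimated by
\[ \hat P_{kl}(u_k)=I_l^{-1}(\hat\beta_l)\sum_{i=1}^n\hat w^*_{ki}(u_k,\hat\beta_k)\,\hat W_{li}(\hat\beta_l). \]
(See Theorems 2-5 for notation.)

Proof of Theorem 6: As shown in the steps of the proof of Theorem 2,
\[ n^{1/2}(\hat\beta_k-\beta_{k0})=nI_k^{-1}(\beta_k^*)\Big(n^{-1/2}\sum_{i=1}^n w_{ki}(\beta_{k0})\Big)+o_p(1), \]
where $nI_k^{-1}(\beta_k^*)\xrightarrow{P}\Sigma_k^{-1}$ and $w_{ki}(\beta_{k0})=\int_0^{\tau_k}\big(Z_{ki}(u)-e_k(u,\beta_{k0})\big)\,dM_{ki}(u)$, $1\le i\le n$, are i.i.d. zero mean random vectors. By (1.8),
\[ W_k(u)=n^{-1/2}\sum_{i=1}^n w^*_{ki}(u,\beta_{k0})+o_p(1), \]
where $w^*_{ki}(u,\beta_{k0})=\int_0^u\dfrac{dM_{ki}(v)}{s_k^{(0)}(v,\beta_{k0})}$ are i.i.d. zero mean random variables.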
Consequently $\big(W_1(t),\ W_2(c),\ n^{1/2}(\hat\beta_1-\beta_{10})',\ n^{1/2}(\hat\beta_2-\beta_{20})'\big)'$ is asymptotically equivalent to the (2p+2)-dimensional vector
\[ \mathrm{diag}\big(1,\ 1,\ nI_1^{-1}(\beta_1^*),\ nI_2^{-1}(\beta_2^*)\big)\times n^{-1/2}\sum_{i=1}^n\rho_i(t,c,\beta_{10},\beta_{20}), \]
where the i.i.d. vectors $\rho_i$ have the components $w^*_{1i}(t,\beta_{10})$, $w^*_{2i}(c,\beta_{20})$, $w_{1i}(\beta_{10})$, $w_{2i}(\beta_{20})$, the first two being scalars and the last two p-dimensional vectors. It follows from the Multivariate Central Limit Theorem and Slutsky's Lemma that $\big(W_1(t),\ W_2(c),\ n^{1/2}(\hat\beta_1-\beta_{10})',\ n^{1/2}(\hat\beta_2-\beta_{20})'\big)'$ converges in distribution to a zero-mean normal random vector with covariance matrix
\[ \tilde Q=\begin{pmatrix}Q^* & P'\\ P & Q\end{pmatrix}, \]
where the matrices Q and $Q^*$ and their consistent estimators are described in Theorems 2-5. Write
\[ P=\begin{pmatrix}P_{11}(t) & P_{21}(c)\\ P_{12}(t) & P_{22}(c)\end{pmatrix}, \]
where $P_{kl}(u_k)$ is the asymptotic covariance between $W_k(u_k)$ and $n^{1/2}(\hat\beta_l-\beta_{l0})$, $k,l\in\{1,2\}$. For $k\ne l$, $P_{kl}(u_k)$ has the form
\[ P_{kl}(u_k)=\Sigma_l^{-1}E\big(w^*_{k1}(u_k,\beta_{k0})\,w_{l1}(\beta_{l0})\big) \]
and is consistently estimated by
\[ \hat P_{kl}(u_k)=I_l^{-1}(\hat\beta_l)\sum_{i=1}^n\hat w^*_{ki}(u_k,\hat\beta_k)\,\hat W_{li}(\hat\beta_l), \]
where $\hat W_{li}(\hat\beta_l)$ and $\hat w^*_{ki}(u_k,\hat\beta_k)$ are described in Theorems 3 and 5, respectively. We will show that $P_{kk}(u_k)$ has only zero components, so that
\[ P=\begin{pmatrix}0 & P_{21}(c)\\ P_{12}(t) & 0\end{pmatrix}. \]

For k = 1,
\[ P_{11}(t)=\Sigma_1^{-1}E\big(w^*_{11}(t,\beta_{10})\,w_{11}(\beta_{10})\big)=\Sigma_1^{-1}E\left[\int_0^t\frac{dM_{11}(u)}{s_1^{(0)}(u,\beta_{10})}\int_0^{\tau_1}\big(Z_{11}(u)-e_1(u,\beta_{10})\big)\,dM_{11}(u)\right]. \]
Both factors of the product in the expectation are local square integrable martingales. By the definition of the predictable covariation process $\langle M,M'\rangle$ of two local square integrable martingales M and M' (see p68, Andersen et al. (1993) [29]), $E\,MM'=E\langle M,M'\rangle$. If H, K are bounded predictable processes and M is a counting process martingale with $\langle M\rangle=\int\lambda$, then $\big\langle\int H\,dM,\int K\,dM\big\rangle=\int HK\lambda$ (see p78, Andersen et al. (1993) [29]). Therefore, using these results, for $j\in\{1,\dots,p\}$ we obtain that
\[ E\left[\int_0^t\frac{dM_{11}(u)}{s_1^{(0)}(u,\beta_{10})}\int_0^{\tau_1}\big(Z_{11j}(u)-e_{1j}(u,\beta_{10})\big)\,dM_{11}(u)\right] \]
\[ =E\left[\int_0^t\frac{1}{s_1^{(0)}(u,\beta_{10})}\big(Z_{11j}(u)-e_{1j}(u,\beta_{10})\big)\,\alpha_{10}(u)\exp(\beta_{10}'Z_{11}(u))\,Y_{11}(u)\,du\right]. \]
By Fubini's Theorem the last quantity is equal to
\[ \int_0^t\frac{\alpha_{10}(u)}{s_1^{(0)}(u,\beta_{10})}\left\{s_{1j}^{(1)}(u,\beta_{10})-\frac{s_{1j}^{(1)}(u,\beta_{10})}{s_1^{(0)}(u,\beta_{10})}\,s_1^{(0)}(u,\beta_{10})\right\}du=0. \]
Hence we proved $P_{11}(t)=0$. Similarly we can show $P_{22}(c)=0$. □

Now we have the tools to prove the following theorem about the asymptotic joint distribution of $\hat S_1(t\mid Z_0)$ and $\hat S_2(c\mid Z_0)$, for fixed time t and cost c.

Theorem 7 Under the model assumptions A.1-A.7,
\[ n^{1/2}\big(\hat S_1(t\mid Z_0)-S_1(t\mid Z_0),\ \hat S_2(c\mid Z_0)-S_2(c\mid Z_0)\big)' \]
converges in distribution to a zero mean bivariate normal random vector with covariance matrix $CS=(CS_{kl}(u_k,u_l),\ k,l\in\{1,2\})$, where
\[ CS_{kk}(u_k)=S_k(u_k\mid Z_0)^2\exp(2\beta_{k0}'Z_0)\times\left\{D^*_{kk}(u_k)+\Big(\int_0^{u_k}(Z_0-e_k(v,\beta_{k0}))\alpha_{k0}(v)\,dv\Big)'D_{kk}\Big(\int_0^{u_k}(Z_0-e_k(v,\beta_{k0}))\alpha_{k0}(v)\,dv\Big)\right\}, \]
\[ CS_{12}(t,c)=S_1(t\mid Z_0)\,S_2(c\mid Z_0)\exp(\beta_{10}'Z_0+\beta_{20}'Z_0)\times\Big\{D^*_{12}(t,c)+\Big(\int_0^t(Z_0-e_1(v,\beta_{10}))\alpha_{10}(v)\,dv\Big)'P_{21}(c) \]
\[ \qquad+\Big(\int_0^c(Z_0-e_2(v,\beta_{20}))\alpha_{20}(v)\,dv\Big)'P_{12}(t)+\Big(\int_0^t(Z_0-e_1(v,\beta_{10}))\alpha_{10}(v)\,dv\Big)'D_{12}\Big(\int_0^c(Z_0-e_2(v,\beta_{20}))\alpha_{20}(v)\,dv\Big)\Big\}. \]
(See Theorems 2-6 for notation.)

Proof of Theorem 7: We first state the Delta Method, a popular and elementary tool of asymptotic statistics, which we will apply repeatedly in the proof of this theorem. We will use the following version, stated on p109 of Andersen et al. (1993) [29]:

Delta Method Suppose for some random p-vectors $T_n$ and a sequence of numbers $a_n\to\infty$,
\[ a_n(T_n-\theta)\xrightarrow{D}Z\quad\text{as }n\to\infty, \]
where $\theta\in\mathbb{R}^p$ is fixed. Suppose $\phi:\mathbb{R}^p\to\mathbb{R}^q$ is differentiable at $\theta$ with $q\times p$ matrix $\phi'(\theta)$ of partial derivatives. Then
\[ a_n\big(\phi(T_n)-\phi(\theta)\big)\xrightarrow{D}\phi'(\theta)\times Z\quad\text{in }\mathbb{R}^q \]
and, indeed, $a_n(\phi(T_n)-\phi(\theta))$ is asymptotically equivalent to $\phi'(\theta)\times a_n(T_n-\theta)$.

Recall that for $k\in\{1,2\}$ the survival function at a fixed covariate profile $Z_0$ is $S_k(u\mid Z_0)=\exp(-\Lambda_k(u,\beta_{k0}\mid Z_0))$, where $\Lambda_k(u,\beta_{k0}\mid Z_0)=\Lambda_{k0}(u)\exp(\beta_{k0}'Z_0)$ is the integrated hazard. The survival function is estimated by $\hat S_k(u\mid Z_0)=\exp(-\hat\Lambda_k(u,\hat\beta_k\mid Z_0))$,
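where $\hat\Lambda_k(u,\hat\beta_k\mid Z_0)=\hat\Lambda_{k0}(u,\hat\beta_k)\exp(\hat\beta_k'Z_0)$. By the Delta Method, $n^{1/2}\big(\hat S_1(t\mid Z_0)-S_1(t\mid Z_0),\ \hat S_2(c\mid Z_0)-S_2(c\mid Z_0)\big)'$ has the same limiting distribution as
\[ -\begin{pmatrix}S_1(t\mid Z_0) & 0\\ 0 & S_2(c\mid Z_0)\end{pmatrix}n^{1/2}\begin{pmatrix}\hat\Lambda_1(t,\hat\beta_1\mid Z_0)-\Lambda_1(t,\beta_{10}\mid Z_0)\\ \hat\Lambda_2(c,\hat\beta_2\mid Z_0)-\Lambda_2(c,\beta_{20}\mid Z_0)\end{pmatrix}. \tag{1.9} \]
Therefore we will determine first the asymptotic distribution of
\[ n^{1/2}\begin{pmatrix}\hat\Lambda_1(t,\hat\beta_1\mid Z_0)-\Lambda_1(t,\beta_{10}\mid Z_0)\\ \hat\Lambda_2(c,\hat\beta_2\mid Z_0)-\Lambda_2(c,\beta_{20}\mid Z_0)\end{pmatrix}=n^{1/2}\begin{pmatrix}\exp(\hat\beta_1'Z_0)\hat\Lambda_{10}(t,\hat\beta_1)-\exp(\beta_{10}'Z_0)\Lambda_{10}(t)\\ \exp(\hat\beta_2'Z_0)\hat\Lambda_{20}(c,\hat\beta_2)-\exp(\beta_{20}'Z_0)\Lambda_{20}(c)\end{pmatrix}. \]
Consider the function $\phi:\mathbb{R}^{2p+2}\to\mathbb{R}^2$,
\[ \phi(a,b,c,d)=\begin{pmatrix}a\exp(c'Z_0)\\ b\exp(d'Z_0)\end{pmatrix}. \]
The function $\phi$ is everywhere differentiable, with the $2\times(2p+2)$ matrix $\phi'(a,b,c,d)$ of partial derivatives
\[ \phi'(a,b,c,d)=\begin{pmatrix}\exp(c'Z_0) & 0 & a\exp(c'Z_0)Z_0' & 0'\\ 0 & \exp(d'Z_0) & 0' & b\exp(d'Z_0)Z_0'\end{pmatrix}. \]
By the stated Delta Method, applied with $T_n=\big(\hat\Lambda_{10}(t,\hat\beta_1),\hat\Lambda_{20}(c,\hat\beta_2),\hat\beta_1',\hat\beta_2'\big)'$ and $\theta=\big(\Lambda_{10}(t),\Lambda_{20}(c),\beta_{10}',\beta_{20}'\big)'$, the vector above is asymptotically equivalent to $\phi'(\theta)\times n^{1/2}(T_n-\theta)$. Together with (1.5) and Theorem 6, it follows that
\[ n^{1/2}\begin{pmatrix}\hat\Lambda_1(t,\hat\beta_1\mid Z_0)-\Lambda_1(t,\beta_{10}\mid Z_0)\\ \hat\Lambda_2(c,\hat\beta_2\mid Z_0)-\Lambda_2(c,\beta_{20}\mid Z_0)\end{pmatrix} \]
converges in distribution to a zero mean normal random vector, with covariance matrix denoted $C=(C_{kl}(u_k,u_l),\ k,l\in\{1,2\})$. By the asymptotic independence of $W_k(u_k)$ and $n^{1/2}(\hat\beta_k-\beta_{k0})$,
\[ C_{kk}(u_k)=\exp(2\beta_{k0}'Z_0)\times\left\{D^*_{kk}(u_k)+\Big(\int_0^{u_k}(Z_0-e_k(v,\beta_{k0}))\alpha_{k0}(v)\,dv\Big)'D_{kk}\Big(\int_0^{u_k}(Z_0-e_k(v,\beta_{k0}))\alpha_{k0}(v)\,dv\Big)\right\}, \]
\[ C_{12}(t,c)=\exp(\beta_{10}'Z_0+\beta_{20}'Z_0)\times\Big\{D^*_{12}(t,c)+\Big(\int_0^t(Z_0-e_1(v,\beta_{10}))\alpha_{10}(v)\,dv\Big)'P_{21}(c)+\Big(\int_0^c(Z_0-e_2(v,\beta_{20}))\alpha_{20}(v)\,dv\Big)'P_{12}(t) \]
\[ \qquad+\Big(\int_0^t(Z_0-e_1(v,\beta_{10}))\alpha_{10}(v)\,dv\Big)'D_{12}\Big(\int_0^c(Z_0-e_2(v,\beta_{20}))\alpha_{20}(v)\,dv\Big)\Big\}. \]
By (1.9), $n^{1/2}\big(\hat S_1(t\mid Z_0)-S_1(t\mid Z_0),\ \hat S_2(c\mid Z_0)-S_2(c\mid Z_0)\big)'$ converges in distribution to a zero mean bivariate normal random vector with covariance matrix $CS=(CS_{kl}(u_k,u_l),\ k,l\in\{1,2\})$, where
\[ CS_{kk}(u_k)=S_k(u_k\mid Z_0)^2\,C_{kk}(u_k),\qquad CS_{12}(t,c)=S_1(t\mid Z_0)\,S_2(c\mid Z_0)\,C_{12}(t,c). \] □

A consistent estimator $\widehat{CS}=(\widehat{CS}_{kl}(u_k,u_l),\ k,l\in\{1,2\})$ of the asymptotic covariance matrix CS of $n^{1/2}\big(\hat S_1(t\mid Z_0)-S_1(t\mid Z_0),\ \hat S_2(c\mid Z_0)-S_2(c\mid Z_0)\big)'$ can be obtained by replacing $\beta_{k0}$ by $\hat\beta_k$, $S_k(u_k\mid Z_0)$ by the estimator $\hat S_k(u_k\mid Z_0)$, $D_{kl}$, $D^*_{kl}(u_k,u_l)$, $P_{21}(c)$, $P_{12}(t)$ by their consistent estimates, and $\int_0^{u_k}(Z_0-e_k(v,\beta_{k0}))\alpha_{k0}(v)\,dv$ by the quantity
\[ \int_0^{u_k}\big(Z_0-E_k(v,\hat\beta_k)\big)\frac{J_k(v)}{S_k^{(0)}(v,\hat\beta_k)}\,dN_k(v). \]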
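The plug-in quantities used in this subsection are simple sums over observed completions when covariates are time-fixed. The sketch below computes, for a supplied coefficient vector, the estimated integrated baseline hazard, the estimated survival probability at a covariate profile, and the plug-in for the integral of the profile minus the risk-set covariate average. It is an illustration only: the function name, the toy data and the restriction to time-independent covariates are assumptions of this example, and the coefficient vector is taken as given rather than estimated here.

```python
import math


def plug_in_quantities(x, delta, z, beta, z0, u):
    """Return (cum_hazard, survival, h) evaluated at the point u.

    cum_hazard : sum over completions X_i <= u of 1 / S0(X_i)
    survival   : exp(-cum_hazard * exp(beta' z0))
    h          : sum over completions X_i <= u of (z0 - E(X_i)) / S0(X_i)
    S0 and E are the risk-set sum and the risk-set weighted covariate mean.
    """
    p = len(beta)
    risk = [math.exp(sum(b * v for b, v in zip(beta, zi))) for zi in z]
    cum_hazard = 0.0
    h = [0.0] * p
    for xi, di in zip(x, delta):
        if di != 1 or xi > u:
            continue
        at_risk = [(zj, rj) for xj, zj, rj in zip(x, z, risk) if xj >= xi]
        s0 = sum(rj for _, rj in at_risk)
        e_bar = [sum(zj[m] * rj for zj, rj in at_risk) / s0 for m in range(p)]
        cum_hazard += 1.0 / s0
        for m in range(p):
            h[m] += (z0[m] - e_bar[m]) / s0
    score0 = math.exp(sum(b * v for b, v in zip(beta, z0)))
    return cum_hazard, math.exp(-cum_hazard * score0), h


cum_haz, surv, h_vec = plug_in_quantities(
    [1, 2, 3], [1, 1, 1], [[0], [1], [0]], beta=[0.0], z0=[0.0], u=3
)
```

With a zero coefficient the first output is the ordinary Nelson-Aalen estimate, which gives a quick sanity check on any implementation.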
1.1.4 Point Estimates and Confidence Intervals for Median LOS and Median Cost

The median LOS or cost for a fixed covariate profile $Z_0$ is defined as
\[ m_k(Z_0)=\inf\{u: S_k(u\mid Z_0)\le .5\} \]
and estimated by
\[ \hat m_k(Z_0)=\inf\{u:\hat S_k(u\mid Z_0)\le .5\}. \]
For constructing a confidence interval for the median we follow the approach described in Andersen et al. (1993) [29], pp. 511-512. The advantage of this procedure is that estimation of the density associated with $S_k$ is not needed. The confidence interval for the median $m_k(Z_0)$ can be read directly from the lower and upper pointwise confidence limits for the survival distribution in exactly the same manner as $\hat m_k(Z_0)$ can be read from the curve $\hat S_k(\cdot\mid Z_0)$ itself.

Let us first consider pointwise confidence intervals for $S_k(u\mid Z_0)$. By Theorem 7, $n^{1/2}(\hat S_k(u\mid Z_0)-S_k(u\mid Z_0))$ converges weakly to a zero mean normal distribution with variance $CS_{kk}(u)$. The "standard" asymptotic $100(1-\alpha)\%$ interval
\[ \Big[\hat S_k(u\mid Z_0)-z_{\alpha/2}\big(\widehat{CS}_{kk}(u)/n\big)^{1/2},\ \hat S_k(u\mid Z_0)+z_{\alpha/2}\big(\widehat{CS}_{kk}(u)/n\big)^{1/2}\Big], \]
where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution, might not be completely satisfactory for small sample sizes [33]. Kalbfleisch and Prentice (1980) [34] and Thomas and Grunkemeier (1975) [33] suggested that using transformations such as $g(x)=\log(-\log x)$, $x\in(0,1)$, might improve the small sample properties of the confidence intervals. Other transformations for constructing confidence intervals are described in Klein and Moeschberger (1997).

If $CS_{kk}(u)\ne 0$ and g is a real function differentiable in a neighborhood of $S_k(u\mid Z_0)$, with continuous derivative $g'$ different from zero at $S_k(u\mid Z_0)$, then by the Delta Method and Slutsky's Lemma,
\[ \frac{g(\hat S_k(u\mid Z_0))-g(S_k(u\mid Z_0))}{|g'(\hat S_k(u\mid Z_0))|\times\big(\widehat{CS}_{kk}(u)/n\big)^{1/2}}\xrightarrow{D}N(0,1). \]
Asymptotic $100(1-\alpha)\%$ confidence limits for $g(S_k(u\mid Z_0))$ are
\[ g(\hat S_k(u\mid Z_0))\pm z_{\alpha/2}\,|g'(\hat S_k(u\mid Z_0))|\times\big(\widehat{CS}_{kk}(u)/n\big)^{1/2}. \tag{1.10} \]
For the transformation $g(x)=\log(-\log x)$, $x\in(0,1)$, the derivative $g'$ exists and is continuous everywhere on (0,1), $g'(x)=(x\log x)^{-1}$. Also the inverse of this function has the form $g^{-1}(y)=\exp(-\exp y)$, $y\in\mathbb{R}$. If $\hat S_k(u\mid Z_0)\in(0,1)$ then, by retransforming (1.10), we obtain the following $100(1-\alpha)\%$ confidence interval for $S_k(u\mid Z_0)$: $[\hat S_{k1}(u\mid Z_0),\ \hat S_{k2}(u\mid Z_0)]$, where
\[ \hat S_{k1}(u\mid Z_0)=\hat S_k(u\mid Z_0)^{\exp(-PS_k(u))},\qquad \hat S_{k2}(u\mid Z_0)=\hat S_k(u\mid Z_0)^{\exp(PS_k(u))}, \tag{1.11} \]
\[ PS_k(u)=z_{\alpha/2}\,\frac{\big(\widehat{CS}_{kk}(u)/n\big)^{1/2}}{\hat S_k(u\mid Z_0)\log\hat S_k(u\mid Z_0)}. \]
Since $\log\hat S_k(u\mid Z_0)<0$, the quantity $PS_k(u)$ is negative, so $\hat S_{k1}\le\hat S_k\le\hat S_{k2}$.

We now turn to the construction of a confidence interval for the median $m_k(Z_0)$. By the procedure described in Andersen et al. (1993) [29], p277, we can take as an approximate $100(1-\alpha)\%$ confidence interval for $m_k(Z_0)$ all values u which satisfy
\[ \left|\frac{g(\hat S_k(u\mid Z_0))-g(.5)}{|g'(\hat S_k(u\mid Z_0))|\times\big(\widehat{CS}_{kk}(u)/n\big)^{1/2}}\right|\le z_{\alpha/2}, \]
where $g(x)=\log(-\log x)$, $x\in(0,1)$, i.e. all hypothesized values u of $m_k(Z_0)$ which are not rejected when testing $H_0: m_k(Z_0)=u$ against $H_1: m_k(Z_0)\ne u$ at level $\alpha$, based on the asymptotic normality of $g(\hat S_k(u\mid Z_0))$. Therefore the approximate confidence interval for $m_k(Z_0)$ is
\[ \Big\{u: g(.5)\text{ is between }g(\hat S_k(u\mid Z_0))\pm z_{\alpha/2}|g'(\hat S_k(u\mid Z_0))|\times\big(\widehat{CS}_{kk}(u)/n\big)^{1/2}\Big\}=\big\{u: 0.5\text{ is between }\hat S_{k1}(u\mid Z_0)\text{ and }\hat S_{k2}(u\mid Z_0)\big\}, \]
where $\hat S_{k1}(u\mid Z_0)$ and $\hat S_{k2}(u\mid Z_0)$ are given in (1.11). Define
\[ \hat m_{k1}(Z_0)=\inf\{u:\hat S_{k1}(u\mid Z_0)\le .5\},\qquad \hat m_{k2}(Z_0)=\inf\{u:\hat S_{k2}(u\mid Z_0)\le .5\}. \]
Then $[\hat m_{k1}(Z_0),\ \hat m_{k2}(Z_0)]$ is an approximate $100(1-\alpha)\%$ confidence interval for $m_k(Z_0)$.

This completes our construction of point estimates and confidence intervals for the median LOS and median cost in the case of semiparametric marginal models. An application of these results will be presented in Section 1.3.

1.2 Parametric Marginal Models

Using the same notation as in Section 1.1, suppose we observe for the i-th patient of a study with n individuals $X_{1i}=\min(T_i,T_i')$, $\delta_{1i}=[T_i\le T_i']$, $Z_{1i}$, with $T_i=g(T_i^*)$, $T_i'=g(T_i^{**})$, where $T_i^*$
is the true LOS, $T_i^{**}$ is the censoring time and $Z_{1i}$ is a vector of p time-independent explanatory variables. The monotonic transformation g (with inverse $g^{-1}$) is chosen to mitigate the effects of skewness that might be present in the data. For example, the log or square-root transformations are used when the data are right-skewed and have the advantage of permitting easy interpretation of the model. All variables $T_i^*$, $T_i^{**}$, $Z_{1i}$ are defined on the probability space $(\Omega_1,\mathcal{F}_1,P_1)$. The transformation g is chosen such that both $T_i$ and $T_i'$ are strictly positive variables.

In a similar manner we consider $X_{2i}=\min(C_i,C_i')$, $\delta_{2i}=[C_i\le C_i']$ and the p-vector $Z_{2i}$ of explanatory variables, where $C_i=g(C_i^*)$, $C_i'=g(C_i^{**})$, $C_i^*$ is the true hospital cost, $C_i^{**}$ is the censoring cost and both $C_i,C_i'>0$. All these variables are defined on the probability space $(\Omega_2,\mathcal{F}_2,P_2)$. The transformation applied to cost need not be the same as that applied to LOS.

The relationship of explanatory variables to the time-cost observations is modeled through the following linear regression model:
\[ T_i=\beta_{10}'Z_{1i}+\sigma_{10}\varepsilon_{1i},\qquad C_i=\beta_{20}'Z_{2i}+\sigma_{20}\varepsilon_{2i}, \tag{1.12} \]
where $(\varepsilon_{1i},\varepsilon_{2i})'$, $1\le i\le n$, are i.i.d., with distribution function $F(\cdot,\cdot\mid\rho_0)$. Here $\rho_0$ is the true value of a nuisance parameter $\rho$. Let $f(\cdot,\cdot\mid\rho_0)$ and $\alpha(\cdot,\cdot\mid\rho_0)$ be the density and the hazard function associated with the distribution function $F(\cdot,\cdot\mid\rho_0)$.

As in Section 1.1, the subscript k = 1 is associated with time and the subscript k = 2 is associated with cost. We assume the $\varepsilon_{ki}$, $1\le i\le n$, are i.i.d., with zero mean and distribution function $F_0$ that does not depend on $\rho_0$. This parameter is associated with the joint distribution function of $(\varepsilon_{1i},\varepsilon_{2i})$. Let $f_0$ be the density and $\alpha_0$ the hazard of the distribution of $\varepsilon_{ki}$.

Example 1.2.1: Suppose
\[ \begin{pmatrix}\varepsilon_{1i}\\ \varepsilon_{2i}\end{pmatrix},\ 1\le i\le n,\ \text{are i.i.d. } N\!\left(\begin{pmatrix}0\\0\end{pmatrix},\begin{pmatrix}1&\rho_0\\ \rho_0&1\end{pmatrix}\right). \]
Marginally $\varepsilon_{ki}$, $1\le i\le n$, are i.i.d. N(0,1), so $\alpha_0(u)=\dfrac{\varphi(u)}{1-\Phi(u)}$, where $\varphi$ and $\Phi$ are the density and distribution function of the standard normal distribution.
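The standard normal hazard of Example 1.2.1 can be evaluated with the complementary error function from the standard library. The sketch below also rescales it to a location-scale variable, using the general fact that if a variable equals a mean plus a scale times a standard error term, its hazard is the error hazard at the standardized argument divided by the scale. Function names are hypothetical.

```python
import math


def std_normal_hazard(u):
    """Hazard of N(0,1): density divided by the survival function 1 - Phi(u)."""
    density = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    survival = 0.5 * math.erfc(u / math.sqrt(2.0))
    return density / survival


def location_scale_hazard(t, mean, scale):
    """Hazard at t of mean + scale * epsilon, with epsilon ~ N(0,1)."""
    return std_normal_hazard((t - mean) / scale) / scale


h0 = std_normal_hazard(0.0)
h_scaled = location_scale_hazard(3.0, mean=1.0, scale=2.0)
```

For very large arguments the survival function underflows, so a production implementation would work on the log scale; that refinement is omitted here.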
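Example 1.2.2: Suppose
\[ \begin{pmatrix}\varepsilon_{1i}\\ \varepsilon_{2i}\end{pmatrix},\ 1\le i\le n,\ \text{are i.i.d. } F(y_1,y_2)=\frac{1}{1+e^{-y_1}+e^{-y_2}},\quad y_1,y_2\in\mathbb{R}, \]
so $(\varepsilon_{1i},\varepsilon_{2i})$ has the bivariate logistic distribution, with no parameter affecting the shape of the joint distribution of $\varepsilon_{1i}$ and $\varepsilon_{2i}$. Marginally $\varepsilon_{ki}$, $1\le i\le n$, are i.i.d. with
\[ F_0(u)=\alpha_0(u)=\frac{1}{1+e^{-u}},\quad u\in\mathbb{R}, \]
with mean $E(\varepsilon_{ki})=0$ and variance $\mathrm{var}(\varepsilon_{ki})=\pi^2/3$.

For the model (1.12) we denote by $\beta_{10},\beta_{20},\sigma_{10},\sigma_{20}$ the true values of the regression and scale parameters, reserving $\beta_1,\beta_2,\sigma_1,\sigma_2$ for the free parameters in the likelihood function. The vectors $\beta_{10},\beta_{20}$ each have dimension p and $\sigma_{10},\sigma_{20}$ are strictly positive scalars. Denote by $\theta_{k0}$ the (p+1)-dimensional vector $\theta_{k0}=(\beta_{k0}',\sigma_{k0})'$.

By the specification of the linear model (1.12), given the covariates $Z_{1i},Z_{2i}$ the time-cost vector $(T_i,C_i)'$ has the distribution function $F^*(\cdot,\cdot,\theta_{10},\theta_{20},\rho_0)$ and density $f^*(\cdot,\cdot,\theta_{10},\theta_{20},\rho_0)$, where
\[ F^*(t,c,\theta_{10},\theta_{20},\rho_0)=F\!\left(\frac{t-\beta_{10}'Z_{1i}}{\sigma_{10}},\ \frac{c-\beta_{20}'Z_{2i}}{\sigma_{20}}\ \Big|\ \rho_0\right), \]
\[ f^*(t,c,\theta_{10},\theta_{20},\rho_0)=(\sigma_{10}\sigma_{20})^{-1}f\!\left(\frac{t-\beta_{10}'Z_{1i}}{\sigma_{10}},\ \frac{c-\beta_{20}'Z_{2i}}{\sigma_{20}}\ \Big|\ \rho_0\right). \tag{1.13} \]
Marginally, given $Z_{1i}$, $T_i$ has distribution function $F_{1i}(\cdot,\theta_{10})$, density $f_{1i}(\cdot,\theta_{10})$ and hazard function $\alpha_{1i}(\cdot,\theta_{10})$, where
\[ F_{1i}(t,\theta_{10})=F_0\!\left(\frac{t-\beta_{10}'Z_{1i}}{\sigma_{10}}\right),\qquad f_{1i}(t,\theta_{10})=\frac{1}{\sigma_{10}}f_0\!\left(\frac{t-\beta_{10}'Z_{1i}}{\sigma_{10}}\right),\qquad \alpha_{1i}(t,\theta_{10})=\frac{1}{\sigma_{10}}\alpha_0\!\left(\frac{t-\beta_{10}'Z_{1i}}{\sigma_{10}}\right). \tag{1.14} \]
Similarly, given $Z_{2i}$, we define for the costs $C_i$ the functions $F_{2i}(\cdot,\theta_{20})$, $f_{2i}(\cdot,\theta_{20})$, $\alpha_{2i}(\cdot,\theta_{20})$.

With i.i.d. data $(X_{ki},\delta_{ki},Z_{ki})$ on n patients we estimate $\theta_{k0}$ by the maximum partial likelihood estimator $\hat\theta_k$. We will show that $(\hat\theta_1',\hat\theta_2')'$ has an asymptotic 2(p+1)-variate normal distribution and we will provide a consistent estimator of the asymptotic covariance matrix. Then point estimates and confidence intervals for the median LOS and median cost will be obtained for a specified covariate profile. Detailed formulas will be provided for the bivariate normal case (see Example 1.2.1).

1.2.1 Estimation of the Model Parameters

Consider the study observation period restricted to the finite time interval $[0,\tau_1]$, $\tau_1<\infty$.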
Similarly costs are assumed to be upper-bounded by the finite cost $\tau_2$. The processes $N_{ki}(u)$, $Y_{ki}(u)$, $N_k(u)$ have the same definition and interpretation as in Section 1.1.1. The argument u is again used to denote either t or c, depending on the subscript $k\in\{1,2\}$. As mentioned, u takes values in the finite interval $[0,\tau_k]$.

Define the $(p+1)\times(p+1)$ matrix
\[ \Sigma_k=\begin{pmatrix}\Sigma_k^{11} & \Sigma_k^{12}\\ (\Sigma_k^{12})' & \sigma_k^{22}\end{pmatrix}, \]
where, writing $v=(u-\beta_{k0}'Z_{k1})/\sigma_{k0}$ for the standardized argument and $\alpha_0'$ for the derivative of $\alpha_0$, the $p\times p$ matrix $\Sigma_k^{11}$ is
\[ \Sigma_k^{11}=\sigma_{k0}^{-3}E\left\{Z_{k1}Z_{k1}'\int_0^{\tau_k}\Big[\frac{\alpha_0'}{\alpha_0}(v)\Big]^2\alpha_0(v)\,Y_{k1}(u)\,du\right\}, \]
the p-dimensional vector $\Sigma_k^{12}$ has the form
\[ \Sigma_k^{12}=\sigma_{k0}^{-3}E\left\{Z_{k1}\int_0^{\tau_k}\frac{\alpha_0'}{\alpha_0}(v)\Big[1+v\,\frac{\alpha_0'}{\alpha_0}(v)\Big]\alpha_0(v)\,Y_{k1}(u)\,du\right\} \]
and the scalar $\sigma_k^{22}$ is
\[ \sigma_k^{22}=\sigma_{k0}^{-3}E\left\{\int_0^{\tau_k}\Big[1+v\,\frac{\alpha_0'}{\alpha_0}(v)\Big]^2\alpha_0(v)\,Y_{k1}(u)\,du\right\}. \]

The following assumptions, similar to those used in Section 1.1, will be assumed to hold throughout this section:

Model Assumptions:

B.1 Conditional on $Z_{1i}$, $T_i$ and $T_i'$ are independent, and conditional on $Z_{2i}$, $C_i$ and $C_i'$ are independent (Independent Censoring Assumption);

B.2 $\{(X_{1i},X_{2i})',\ (\delta_{1i},\delta_{2i})',\ (Z_{1i},Z_{2i})'\}$, $1\le i\le n$, are i.i.d.;

B.3 $Z_{1i}$, $Z_{2i}$ are bounded;

B.4 The hazard function $\alpha_0$ is strictly positive and its derivatives of first, second and third order exist and are continuous;

B.5 The matrices $\Sigma_1,\Sigma_2$ are nonsingular.

Under the independent censoring assumption, the counting process $N_{ki}(\cdot)$ has the compensator $\Lambda_{ki}(u,\theta_{k0})=\int_0^u\lambda_{ki}(v,\theta_{k0})\,dv$, where the intensity process $\lambda_{ki}(v,\theta_{k0})$ has the form
\[ \lambda_{ki}(v,\theta_{k0})=\alpha_{ki}(v,\theta_{k0})\,Y_{ki}(v), \]
with the hazard function $\alpha_{ki}(\cdot,\theta_{k0})$ defined in (1.14). Let $M_{ki}$ denote the local square integrable martingales
\[ M_{ki}(u)=N_{ki}(u)-\Lambda_{ki}(u,\theta_{k0}) \]
and $M_k=\sum_{i=1}^n M_{ki}$. Properties of stochastic processes, such as being a local martingale, are relative to a filtration $(\mathcal{F}_k^{(n)}(u),\ u\in[0,\tau_k])$ of sub $\sigma$-algebras on the n-th sample space $(\Omega_k^{(n)},\mathcal{F}_k^{(n)},P_k^{(n)})$; $\mathcal{F}_k^{(n)}(u)$ represents everything that happens up to the point u in the n-th model.
For the joint time-cost stochastic processes, we consider the filtration $(\mathcal{F}_1^{(n)}(t) \otimes \mathcal{F}_2^{(n)}(c), (t,c) \in [0,\tau_1] \times [0,\tau_2])$ on the product space $(\Omega_1^{(n)} \times \Omega_2^{(n)}, \mathcal{F}_1^{(n)} \otimes \mathcal{F}_2^{(n)}, P_1^{(n)} \otimes P_2^{(n)})$. Details of the filtration definitions are provided in Section 1.1.1.

An estimator $\hat\theta_k$ of $\theta_{k0}$ is obtained by maximizing the partial likelihood

$$L_{\tau_k}(\theta_k) = \prod_{i=1}^n \left\{ \prod_{u \in [0,\tau_k]} \lambda_{ki}(u, \theta_k)^{dN_{ki}(u)} \right\} \exp\left(-\int_0^{\tau_k} \lambda_{ki}(v, \theta_k)\,dv\right).$$

A summary of the main results on partial likelihood for counting processes can be found in Section II.7, Andersen et al. (1993)$^{29}$. Likelihood representations for general counting process models were first given by Jacod (1973, 1975)$^{36,37}$. Use of the product integral concept to make the otherwise rather involved formulas more interpretable goes back to Johansen (1983)$^{38}$. Arjas and Haara (1984)$^{39}$ were the first to describe the notions of independent censoring rigorously by formulating these in terms of likelihoods of intensity processes of general marked point processes.

The log-partial likelihood is

$$C_{\tau_k}(\theta_k) = \sum_{i=1}^n \left\{ \int_0^{\tau_k} \log \lambda_{ki}(u, \theta_k)\,dN_{ki}(u) - \int_0^{\tau_k} \lambda_{ki}(u, \theta_k)\,du \right\}.$$

Because $\int_0^{\tau_k} \log Y_{ki}(u)\,dN_{ki}(u) = [0 \le X_{ki} \le \tau_k]\,\delta_{ki} \log[X_{ki} \ge X_{ki}] = 0$, we can also write

$$C_{\tau_k}(\theta_k) = \sum_{i=1}^n \left\{ \int_0^{\tau_k} \log \alpha_{ki}(u, \theta_k)\,dN_{ki}(u) - \int_0^{\tau_k} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u)\,du \right\}. \tag{1.15}$$

Assuming we may interchange the order of differentiation and integration, the vector $U_{\tau_k}(\theta_k)$ of derivatives of $C_{\tau_k}(\theta_k)$ with respect to $\theta_k$ has the components

$$U_{\tau_k}^{j}(\theta_k) = \sum_{i=1}^n \left\{ \int_0^{\tau_k} \frac{\partial}{\partial\theta_{kj}} \log \alpha_{ki}(u, \theta_k)\,dN_{ki}(u) - \int_0^{\tau_k} \frac{\partial}{\partial\theta_{kj}} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u)\,du \right\}, \quad j \in \{1, \ldots, q\},\ q = p + 1.$$

The log-partial likelihood may have a number of local maxima, so the equation $U_{\tau_k}(\theta_k) = 0$ may have multiple solutions. We will consider the maximum partial likelihood estimator $\hat\theta_k$ given as a solution to the equation $U_{\tau_k}(\theta_k) = 0$. If more than one solution is found in a concrete situation, one could then check which of these gives the largest value of the log-partial likelihood function.
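As an illustration of evaluating the log-partial likelihood (1.15) and of choosing among several candidate roots of the score equation, the following sketch uses the logistic baseline hazard of Example 1.2.2, for which $\alpha_0(w) = 1/(1+e^{-w})$ has the antiderivative $\log(1+e^{w})$. It is a minimal sketch under stated assumptions: the covariate is scalar, the at-risk process is $Y_i(u) = [X_i \ge u]$, and all function names, data and candidate values are hypothetical and not taken from the study.

```python
import math


def alpha0(w):
    """Standard logistic hazard, alpha_0(w) = 1 / (1 + exp(-w))."""
    return 1.0 / (1.0 + math.exp(-w))


def cum_alpha0(w):
    """Antiderivative of alpha_0: log(1 + exp(w))."""
    return math.log1p(math.exp(w))


def log_partial_likelihood(beta, sigma, x, delta, z):
    """Evaluate (1.15) for alpha_i(u) = alpha0((u - beta * z_i) / sigma) / sigma.

    x: observed outcomes (all >= 0); delta: 1 if uncensored, 0 if censored;
    z: scalar covariates (a hypothetical simplification of the vector case).
    """
    total = 0.0
    for xi, di, zi in zip(x, delta, z):
        w_end = (xi - beta * zi) / sigma
        w_start = (0.0 - beta * zi) / sigma
        if di:
            # log alpha_i(X_i) contributes only for uncensored observations
            total += math.log(alpha0(w_end)) - math.log(sigma)
        # integral of alpha_i over [0, X_i], after substituting w = (u - beta*z_i)/sigma
        total -= cum_alpha0(w_end) - cum_alpha0(w_start)
    return total


def best_candidate(candidates, x, delta, z):
    """Among candidate (beta, sigma) pairs, return the one with the largest value of (1.15)."""
    return max(
        candidates,
        key=lambda th: log_partial_likelihood(th[0], th[1], x, delta, z),
    )


# Hypothetical toy data: four patients, one censored.
x = [4.0, 6.5, 3.2, 9.0]
delta = [1, 1, 0, 1]
z = [1.0, 2.0, 1.0, 3.0]
candidates = [(2.0, 1.5), (0.5, 1.0), (5.0, 3.0)]
chosen = best_candidate(candidates, x, delta, z)
```

In practice the candidates would be the distinct solutions returned by a numerical root finder for the score equation; the comparison step is the same.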
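1.2.2 Large Sample Properties of the Parameter Estimators

The asymptotic properties of the parameter estimators $(\hat\theta_1, \hat\theta_2)$ hold under some general "regularity" conditions D.a-D.d. These conditions are stated on pp. 420-421 of Andersen et al. (1993)$^{29}$. They were used by Borgan (1984)$^{40}$, who studied maximum likelihood estimation for the multiplicative intensity model.

We adopt the notation $\frac{\partial}{\partial\theta_k} g(\theta_{k0})$ for $\frac{\partial}{\partial\theta_k} g(\theta_k)\big|_{\theta_k = \theta_{k0}}$. The dimension of the vector $\theta_k = (\beta_k', \sigma_k)'$ of parameters is $q = p + 1$.

Conditions D.a-D.d:

D.a There exists a neighborhood $\Theta_{k0}$ of $\theta_{k0}$ such that for every $n$ and $i$, $\theta_k \in \Theta_{k0}$ and for almost all $u \in [0, \tau_k]$, the partial derivatives of $\alpha_{ki}(u, \theta_k)$ and $\log \alpha_{ki}(u, \theta_k)$ of first, second and third order with respect to $\theta_k$ exist and are continuous in $\theta_k$ for $\theta_k \in \Theta_{k0}$. Moreover, the log-partial likelihood may be differentiated three times with respect to $\theta_k \in \Theta_{k0}$ by interchanging the order of integration and differentiation.

D.b There exist a sequence $\{a_n, n \ge 1\}$ of non-negative constants increasing to infinity as $n \to \infty$ and finite functions $\sigma_k^{jl}(\theta_k)$ defined on $\Theta_{k0}$ such that for all $j, l \in \{1, \ldots, q\}$:

$$a_n^{-2} \int_0^{\tau_k} \sum_{i=1}^n \frac{\partial}{\partial\theta_{kj}} \log \alpha_{ki}(u, \theta_{k0}) \frac{\partial}{\partial\theta_{kl}} \log \alpha_{ki}(u, \theta_{k0})\, \lambda_{ki}(u, \theta_{k0})\,du \xrightarrow{P} \sigma_k^{jl}(\theta_{k0}) \quad \text{as } n \to \infty.$$

D.c The matrix $\Sigma_k = (\sigma_k^{jl}(\theta_{k0}),\ j, l \in \{1, \ldots, q\})$, with $\sigma_k^{jl}(\theta_{k0})$ defined in condition D.b, is positive definite.

D.d For every $n$ there exist predictable processes $G_{ki}$ and $H_{ki}$ not depending on $\theta_k$ such that for all $u \in [0, \tau_k]$:

D.d.1 $\sup_{\theta_k \in \Theta_{k0}} \left| \frac{\partial^3}{\partial\theta_{kj}\,\partial\theta_{kl}\,\partial\theta_{km}} \alpha_{ki}(u, \theta_k) \right| \le G_{ki}(u)$

D.d.2 $\sup_{\theta_k \in \Theta_{k0}} \left| \frac{\partial^3}{\partial\theta_{kj}\,\partial\theta_{kl}\,\partial\theta_{km}} \log \alpha_{ki}(u, \theta_k) \right| \le H_{ki}(u)$

for all $j, l, m \in \{1, \ldots, q\}$. Moreover:

D.d.3 $a_n^{-2} \int_0^{\tau_k} \sum_{i=1}^n G_{ki}(u)\,du$

D.d.4 $a_n^{-2} \int_0^{\tau_k} \sum_{i=1}^n H_{ki}(u)\,\lambda_{ki}(u, \theta_{k0})\,du$

D.d.5 $a_n^{-2} \int_0^{\tau_k} \sum_{i=1}^n \left\{ \frac{\partial^2}{\partial\theta_{kj}\,\partial\theta_{kl}} \log \alpha_{ki}(u, \theta_{k0}) \right\}^2 \lambda_{ki}(u, \theta_{k0})\,du$

converge in probability to finite quantities as $n \to \infty$, and for all $\varepsilon > 0$:

D.d.6 $a_n^{-2} \int_0^{\tau_k} \sum_{i=1}^n H_{ki}(u)\, I\!\left[a_n^{-1} H_{ki}(u)^{1/2} > \varepsilon\right] \lambda_{ki}(u, \theta_{k0})\,du \xrightarrow{P} 0.$

Note: Deriving the statistical properties of the maximum likelihood estimators $(\hat\theta_1, \hat\theta_2)$ involves martingale results and Taylor expansions.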
Condition D.b ensures the convergence in probability of the predictable covariation processes of certain martingales. Conditions D.b and D.c are crucial in the proof of the existence and consistency of the parameter estimators. By condition D.a the Taylor expansions are valid, whereas D.d ensures the remainder terms in these expansions will behave properly.

Proposition: Under our model assumptions B.1-B.5, the conditions D.a-D.d are verified. (The matrix $\Sigma_k$ from conditions D.b, D.c is the one defined at the beginning of Section 1.2.1.)

Proof of the Proposition: We consider $\Theta_{k0} = \Theta_{\beta_{k0}} \times I_{\sigma_{k0}}$, a neighborhood of $\theta_{k0}$ such that $\Theta_{\beta_{k0}} \subset \mathbb{R}^p$ and $I_{\sigma_{k0}} \subset (0, \infty)$ are compact sets and $\beta_{k0} \in \mathring{\Theta}_{\beta_{k0}}$, $\sigma_{k0} \in \mathring{I}_{\sigma_{k0}}$. We denote by $\mathring{A}$ the largest open set included in $A$, also called the interior of $A$.

Condition D.a: Recall that

$$\alpha_{ki}(u, \theta_k) = \sigma_k^{-1} \alpha_0\!\left[\frac{u - \beta_k'Z_{ki}}{\sigma_k}\right], \quad \text{where } \theta_k = (\beta_k', \sigma_k)' \in \mathbb{R}^p \times (0, \infty).$$

Then $\log \alpha_{ki}(u, \theta_k) = -\log \sigma_k + \log \alpha_0\!\left[\frac{u - \beta_k'Z_{ki}}{\sigma_k}\right]$. The log function is infinitely differentiable on $(0, \infty)$, with continuous derivatives. By assumption B.4, the first part of condition D.a is verified.

The log-partial likelihood was given in (1.15):

$$C_{\tau_k}(\theta_k) = \sum_{i=1}^n \left\{ \int_0^{\tau_k} \log \alpha_{ki}(u, \theta_k)\,dN_{ki}(u) - \int_0^{\tau_k} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u)\,du \right\},$$

where $\int_0^{\tau_k} \log \alpha_{ki}(u, \theta_k)\,dN_{ki}(u) = [0 \le X_{ki} \le \tau_k]\,\delta_{ki} \log \alpha_{ki}(X_{ki}, \theta_k)$. By the differentiability properties of $\log \alpha_{ki}$, what is left to be proved is that $\int_0^{\tau_k} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u)\,du$ can be differentiated three times with respect to $\theta_k \in \Theta_{k0}$ by interchanging the order of integration and differentiation. We will show only that

$$\frac{\partial}{\partial\beta_{kj}} \int_0^{\tau_k} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u)\,du = \int_0^{\tau_k} \frac{\partial}{\partial\beta_{kj}} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u)\,du, \tag{1.16}$$

the proofs of the other relations having similar arguments. The next theorem gives sufficient conditions for differentiation under the integral sign.
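For a proof for the 1-dimensional case see Theorem 4, p. 30, Fabian and Hannan (1985)$^{41}$:

Theorem (Differentiation under the integral sign): Let $\Theta \subset \mathbb{R}^q$ be a compact set, $\theta_0$ a point in $\Theta$ and $f(u, \theta)$ a function on $[0, \tau] \times \Theta$ such that
i) $f(\cdot, \theta)$ is Lebesgue measurable for every $\theta \in \Theta$;
ii) $f(\cdot, \theta_0)$ is integrable;
iii) for every $u \in [0, \tau]$ the partial derivative $\frac{\partial}{\partial\theta_j} f(u, \theta)$ exists on $\Theta$ for $j \in \{1, \ldots, q\}$ and there exist integrable functions $g_j$, $j \in \{1, \ldots, q\}$, such that

$$\left| \frac{\partial}{\partial\theta_j} f(u, \theta) \right| \le g_j(u) \quad \forall u \in [0, \tau],\ \theta \in \Theta.$$

Then $f(\cdot, \theta)$ is integrable for every $\theta \in \Theta$ and for every $j \in \{1, \ldots, q\}$

$$\frac{\partial}{\partial\theta_j} \int_0^{\tau} f(u, \theta)\,du = \int_0^{\tau} \frac{\partial}{\partial\theta_j} f(u, \theta)\,du.$$

For every fixed $\theta_k$, $\alpha_{ki}(u, \theta_k)\,Y_{ki}(u)$ is integrable because

$$\int_0^{\tau_k} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u)\,du \le \int_0^{\tau_k} \alpha_{ki}(u, \theta_k)\,du$$

and $\alpha_{ki}(\cdot, \theta_k)$ is continuous on $[0, \tau_k]$, so it is bounded. Also, for $\theta_k = (\beta_k', \sigma_k)' \in \Theta_{k0}$ and $j \in \{1, \ldots, p\}$:

$$\left| \frac{\partial}{\partial\beta_{kj}} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u) \right| \le \sup_{\theta_k \in \Theta_{k0}} \left| \frac{\partial}{\partial\beta_{kj}} \alpha_{ki}(u, \theta_k) \right|.$$

Using $\frac{\partial}{\partial\beta_k} \alpha_{ki}(u, \theta_k) = -\sigma_k^{-2} \alpha_0'\!\left[\frac{u - \beta_k'Z_{ki}}{\sigma_k}\right] Z_{ki}$ and the boundedness of the components of $Z_{ki}$, we obtain

$$\left| \frac{\partial}{\partial\beta_{kj}} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u) \right| \le b \sup_{\theta_k \in \Theta_{k0}} \sigma_k^{-2} \left| \alpha_0'\!\left[\frac{u - \beta_k'Z_{ki}}{\sigma_k}\right] \right| =: s(u),$$

where $b$ denotes a bound for the components of $Z_{ki}$. By the following Lemma (see Lemma 1, p. 635, Jennrich (1969)$^{42}$), $s(\cdot)$ is a continuous function on $[0, \tau_k]$. Consequently it is bounded and hence integrable over any finite interval.

Lemma 1: If $g$ is a real valued function continuous on the Cartesian product $\mathcal{X} \times \mathcal{Y}$ of two Euclidean spaces and if $Y$ is a bounded subset of $\mathcal{Y}$, then $\sup_{y \in Y} g(x, y)$ is a continuous function of $x$.

The conditions i-iii of the theorem for differentiation under the integral sign are therefore verified and (1.16) is proved.

Condition D.b: Let $a_n = n^{1/2}$. We will show

D.b.1 $n^{-1} \int_0^{\tau_k} \sum_{i=1}^n \left[\frac{\partial}{\partial\beta_k} \log \alpha_{ki}(u, \theta_{k0})\right] \left[\frac{\partial}{\partial\beta_k} \log \alpha_{ki}(u, \theta_{k0})\right]' \lambda_{ki}(u, \theta_{k0})\,du \xrightarrow{P} \Sigma_k^{11}$, where all the elements of the $p \times p$ matrix $\Sigma_k^{11}$ are finite;

D.b.2 $n^{-1} \int_0^{\tau_k} \sum_{i=1}^n \left[\frac{\partial}{\partial\sigma_k} \log \alpha_{ki}(u, \theta_{k0})\right]^2 \lambda_{ki}(u, \theta_{k0})\,du \xrightarrow{P} \sigma_k^{22} < \infty$;

D.b.3 $n^{-1} \int_0^{\tau_k} \sum_{i=1}^n \frac{\partial}{\partial\beta_k} \log \alpha_{ki}(u, \theta_{k0})\, \frac{\partial}{\partial\sigma_k} \log \alpha_{ki}(u, \theta_{k0})\, \lambda_{ki}(u, \theta_{k0})\,du \xrightarrow{P} \Sigma_k^{12}$, where all the elements of the $p \times 1$ matrix $\Sigma_k^{12}$ are finite.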
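The derivative functions involved in the relations D.b.1-D.b.3 are

$$\frac{\partial}{\partial\beta_k} \log \alpha_{ki}(u, \theta_k) = -\sigma_k^{-1} \frac{\alpha_0'}{\alpha_0}\!\left(\frac{u - \beta_k'Z_{ki}}{\sigma_k}\right) Z_{ki},$$
$$\frac{\partial}{\partial\sigma_k} \log \alpha_{ki}(u, \theta_k) = -\sigma_k^{-1} \left\{ 1 + \left[\frac{u - \beta_k'Z_{ki}}{\sigma_k}\right] \frac{\alpha_0'}{\alpha_0}\!\left(\frac{u - \beta_k'Z_{ki}}{\sigma_k}\right) \right\}.$$

Relations D.b.1-D.b.3 can be proved through a version of the Strong Law of Large Numbers (SLLN). We will show the details of the proof for D.b.1; the other two can be proved using similar arguments.

Consider the $p \times p$ matrix

$$V_i(u, \theta_{k0}) = \left[\frac{\partial}{\partial\beta_k} \log \alpha_{ki}(u, \theta_{k0})\right] \left[\frac{\partial}{\partial\beta_k} \log \alpha_{ki}(u, \theta_{k0})\right]' \lambda_{ki}(u, \theta_{k0}) = \sigma_{k0}^{-3} \left(\frac{\alpha_0'}{\alpha_0}\right)^{2}\!\left(\frac{u - \beta_{k0}'Z_{ki}}{\sigma_{k0}}\right) \alpha_0\!\left(\frac{u - \beta_{k0}'Z_{ki}}{\sigma_{k0}}\right) Y_{ki}(u)\, Z_{ki}Z_{ki}', \quad u \in [0, \tau_k].$$

The function $(\alpha_0'/\alpha_0)^2 \alpha_0$ is continuous and $Y_{ki}(\cdot)$ is left-continuous with right-hand limits. Let $V_i^*(u, \theta_{k0}) = V_i(\tau_k - u, \theta_{k0})$, $u \in [0, \tau_k]$. Then the $p^2$ components of the matrix $V_i^*(u, \theta_{k0})$ are random elements of $D[0, \tau_k]$, the set of right-continuous real valued functions with left-hand limits on $[0, \tau_k]$. The space $D[0, \tau_k]$ is endowed with the Skorohod topology. We will apply an extension of the SLLN for $D[0, \tau_k]$. For the proof for $D[0, 1]$, see Rao, R.R. (1963)$^{43}$.

SLLN for $D[0, \tau_k]$: Let $X_i$, $i \ge 1$, be i.i.d. random elements of $D[0, \tau_k]$. Suppose that $E \sup_{u \in [0, \tau_k]} |X_1(u)| < \infty$. Then

$$\sup_{u \in [0, \tau_k]} \left| n^{-1} \sum_{i=1}^n X_i(u) - E X_1(u) \right| \to 0 \quad \text{a.e. as } n \to \infty.$$

Applied to the components $V_{ijl}^*$ of $V_i^*$, this gives

$$\sup_{u \in [0, \tau_k]} \left| n^{-1} \sum_{i=1}^n V_{ijl}^*(u, \theta_{k0}) - E V_{1jl}^*(u, \theta_{k0}) \right| \to 0 \quad \text{a.e. as } n \to \infty,$$

which implies

$$\sup_{u \in [0, \tau_k]} \left| n^{-1} \sum_{i=1}^n V_{ijl}(u, \theta_{k0}) - E V_{1jl}(u, \theta_{k0}) \right| \to 0 \quad \text{a.e. as } n \to \infty.$$

Because $\tau_k < \infty$, the previous relation implies

$$\int_0^{\tau_k} n^{-1} \sum_{i=1}^n V_{ijl}(u, \theta_{k0})\,du \to \int_0^{\tau_k} E V_{1jl}(u, \theta_{k0})\,du = (\Sigma_k^{11})_{jl}.$$

Every $(j, l)$ component of the matrix $\Sigma_k^{11}$ is finite:

$$\left| (\Sigma_k^{11})_{jl} \right| \le \tau_k\, E \sup_{u \in [0, \tau_k]} \left| V_{1jl}(u, \theta_{k0}) \right| < \infty.$$

Therefore condition D.b.1 holds.

Condition D.c: By assumption B.5, the matrix $\Sigma_k$ is nonsingular. Consequently we need to show only that $\Sigma_k$ is positive semidefinite. Let $x \in \mathbb{R}^p$, $y \in \mathbb{R}$. It is easy to show that

$$(x', y)\, \Sigma_k \begin{pmatrix} x \\ y \end{pmatrix} = E \int_0^{\tau_k} \left( x' \frac{\partial}{\partial\beta_k} \log \alpha_{k1}(u, \theta_{k0}) + y \frac{\partial}{\partial\sigma_k} \log \alpha_{k1}(u, \theta_{k0}) \right)^{2} \lambda_{k1}(u, \theta_{k0})\,du \ge 0.$$

Condition D.d: Let $u \in [0, \tau_k]$ and $j, l, m \in \{1, \ldots, p+1\}$. Then

$$\sup_{\theta_k \in \Theta_{k0}} \left| \frac{\partial^3}{\partial\theta_{kj}\,\partial\theta_{kl}\,\partial\theta_{km}} \alpha_{ki}(u, \theta_k) \right| \le \sup_{u \in [0, \tau_k]} \sup_{\theta_k \in \Theta_{k0}} \left| \frac{\partial^3}{\partial\theta_{kj}\,\partial\theta_{kl}\,\partial\theta_{km}} \alpha_{ki}(u, \theta_k) \right| =: g_k(Z_{ki}) = G_{ki}.$$

The variables $G_{ki}$ are bounded because $g_k$ is continuous by Lemma 1 and $Z_{ki}$ are bounded. Therefore condition D.d.1 is verified. Because the variables $G_{ki}$ do not depend on the argument $u$, condition D.d.3 follows by the regular SLLN. Similar arguments can be used for the rest of the D.d conditions. This completes the proof of the Proposition.

The next results state that, with a probability tending to one, there exists a solution of the likelihood equation and this solution is consistent. However, this does not rule out the possibility of the likelihood equation having other, possibly inconsistent, solutions.

Under condition D.a the vector $U_{\tau_k}(\theta_k)$ of score statistics has the components

$$U_{\tau_k}^{j}(\theta_k) = \sum_{i=1}^n \int_0^{\tau_k} \frac{\partial}{\partial\theta_{kj}} \log \alpha_{ki}(u, \theta_k)\,dN_{ki}(u) - \sum_{i=1}^n \int_0^{\tau_k} \frac{\partial}{\partial\theta_{kj}} \alpha_{ki}(u, \theta_k)\,Y_{ki}(u)\,du.$$

Let $-\mathcal{I}_{\tau_k}^{jl}(\theta_k)$ and $R_{\tau_k}^{jlm}(\theta_k)$ denote the second and third order partial derivatives of the log-partial likelihood $C_{\tau_k}(\theta_k)$.

Theorem 1 (Theorem VI.1.1, p. 422, Andersen et al. (1993)$^{29}$): Under the assumptions B.1, B.2 and conditions D.a-D.d, with a probability tending to one, the equation $U_{\tau_k}(\theta_k) = 0$ has a solution $\hat\theta_k$ and $\hat\theta_k \xrightarrow{P} \theta_{k0}$ as $n \to \infty$.

Sketch of the proof: By condition D.a, a Taylor expansion gives for every $\theta_k \in \Theta_{k0}$, $j \in \{1, \ldots, q\}$ that

$$U_{\tau_k}^{j}(\theta_k) = U_{\tau_k}^{j}(\theta_{k0}) - \sum_{l=1}^q (\theta_{kl} - \theta_{k0l})\, \mathcal{I}_{\tau_k}^{jl}(\theta_{k0}) + \frac{1}{2} \sum_{l=1}^q \sum_{m=1}^q (\theta_{kl} - \theta_{k0l})(\theta_{km} - \theta_{k0m})\, R_{\tau_k}^{jlm}(\theta_k^*),$$

where $\theta_k^*$ is on the line segment joining $\theta_k$ and $\theta_{k0}$.

Step 1: $a_n^{-2}\, U_{\tau_k}^{j}(\theta_{k0}) \xrightarrow{P} 0$. Essential in this step is the representation

$$U_{\tau_k}^{j}(\theta_{k0}) = \sum_{i=1}^n \int_0^{\tau_k} \frac{\partial}{\partial\theta_{kj}} \log \alpha_{ki}(u, \theta_{k0})\,dM_{ki}(u).$$

Step 2: $a_n^{-2}\, \mathcal{I}_{\tau_k}^{jl}(\theta_{k0}) \xrightarrow{P} \sigma_k^{jl}(\theta_{k0})$.

Step 3: There exists $M_k < \infty$, not depending on $\theta_k$, such that

$$\lim_{n \to \infty} P(A_n) = \lim_{n \to \infty} P\left\{ \left| a_n^{-2} R_{\tau_k}^{jlm}(\theta_k) \right| < M_k\ \ \forall j, l, m,\ \forall \theta_k \in \Theta_{k0} \right\} = 1,$$

where $A_n$ denotes the event inside the braces.

Step 4: Combine the previous steps and finish the proof of the Theorem.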
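Next we prove the following result about the joint distribution of the maximum partial-likelihood estimators:

Theorem 2: Assume that our model assumptions B.1-B.5 hold. Let $\hat\theta_k$ be a consistent solution of the equation $U_{\tau_k}(\theta_k) = 0$. Then $n^{1/2}(\hat\theta_1' - \theta_{10}', \hat\theta_2' - \theta_{20}')'$ converges in distribution to a $2(p+1)$-dimensional normal random vector with zero mean and covariance matrix $C = (C_{kl},\ k, l \in \{1, 2\})$, where the $(p+1) \times (p+1)$ matrix $C_{kl}$ is

$$C_{kl} = \Sigma_k^{-1} B_{kl} \Sigma_l^{-1},$$

with $B_{kl} = E\left(v_{k1}(\theta_{k0})\, v_{l1}(\theta_{l0})'\right)$ and the $(p+1)$-dimensional vectors $v_{ki}(\theta_{k0})$ given by

$$v_{ki}(\theta_{k0}) = \int_0^{\tau_k} \frac{\partial}{\partial\theta_k} \log \alpha_{ki}(u, \theta_{k0})\,dM_{ki}(u).$$

Proof of Theorem 2: For $n$ sufficiently large, $\hat\theta_k \in \Theta_{k0}$. Expanding $U_{\tau_k}(\hat\theta_k)$ around $\theta_{k0}$ gives

$$n^{-1/2}\, U_{\tau_k}(\theta_{k0}) = n^{-1}\, \mathcal{I}_{\tau_k}(\theta_k^*)\, n^{1/2}(\hat\theta_k - \theta_{k0}),$$

where $\theta_k^*$ is on the line segment joining $\hat\theta_k$ and $\theta_{k0}$.

Step 1: $n^{-1/2}\left(U_{\tau_1}(\theta_{10})', U_{\tau_2}(\theta_{20})'\right)'$ converges in distribution to a $2q$-dimensional normal random vector ($q = p + 1$), with mean zero and covariance matrix $B = (B_{kl},\ k, l \in \{1, 2\})$.

Because $n^{-1/2}\, U_{\tau_k}(\theta_{k0}) = n^{-1/2} \sum_{i=1}^n v_{ki}(\theta_{k0})$ is a normalized sum of i.i.d. zero mean random vectors, the result of this step follows immediately from the Multivariate Central Limit Theorem.

Step 2: $n^{-1}\, \mathcal{I}_{\tau_k}(\theta_k^*) \xrightarrow{P} \Sigma_k$ for any random $\theta_k^*$ such that $\theta_k^* \xrightarrow{P} \theta_{k0}$.

With probability approaching 1, $\theta_k^*$ lies in $\Theta_{k0}$. When $\theta_k^* \in \Theta_{k0}$, by the Taylor expansion, for every $j, l \in \{1, \ldots, q\}$,

$$n^{-1}\, \mathcal{I}_{\tau_k}^{jl}(\theta_k^*) = n^{-1}\, \mathcal{I}_{\tau_k}^{jl}(\theta_{k0}) - n^{-1} \sum_{m=1}^q (\theta_{km}^* - \theta_{k0m})\, R_{\tau_k}^{jlm}(\theta_k^{**}),$$

where $\theta_k^{**}$ is on the line segment joining $\theta_k^*$ and $\theta_{k0}$.

By Step 2 of the proof of Theorem 1, $n^{-1}\, \mathcal{I}_{\tau_k}^{jl}(\theta_{k0}) \xrightarrow{P} \sigma_k^{jl}(\theta_{k0})$. We will show that the second term of the right-hand side of the previous expression converges in probability to zero. Let the constant $M_k$ and the sequence of events $A_n$ be defined as in Step 3 of Theorem 1. We have that $P(A_n) \to 1$. Consider an arbitrary $\varepsilon > 0$ and the sequence of events

$$B_n = \left\{ \left| n^{-1} \sum_{m=1}^q (\theta_{km}^* - \theta_{k0m})\, R_{\tau_k}^{jlm}(\theta_k^{**}) \right| > \varepsilon \right\}.$$

We need to show that $P(B_n) \to 0$. This follows from the inequality

$$0 \le P(B_n) = P(B_n \cap A_n) + P(B_n \cap A_n^c),$$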
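where $A_n^c$ is the complement of $A_n$, and from the results $P(B_n \cap A_n^c) \le P(A_n^c) \to 0$ and

$$P(B_n \cap A_n) \le P\left\{ M_k \sum_{m=1}^q |\theta_{km}^* - \theta_{k0m}| > \varepsilon \right\} \le P\left\{ \|\theta_k^* - \theta_{k0}\| > \frac{\varepsilon}{q M_k} \right\} \to 0$$

because $\theta_k^* \xrightarrow{P} \theta_{k0}$. We denote by $\|\cdot\|$ the supremum norm.

Step 3: $n^{1/2}(\hat\theta_1' - \theta_{10}', \hat\theta_2' - \theta_{20}')'$ converges in distribution to a $2(p+1)$-dimensional normal random vector with mean zero and covariance matrix $C = A^{-1} B A^{-1}$, where $A = \mathrm{diag}(\Sigma_1, \Sigma_2)$ and $B = (B_{kl},\ k, l \in \{1, 2\})$.

By a Taylor expansion we can write

$$n^{-1/2} \begin{pmatrix} U_{\tau_1}(\theta_{10}) \\ U_{\tau_2}(\theta_{20}) \end{pmatrix} = \mathrm{diag}\left(n^{-1}\mathcal{I}_{\tau_1}(\theta_1^*),\ n^{-1}\mathcal{I}_{\tau_2}(\theta_2^*)\right) n^{1/2} \begin{pmatrix} \hat\theta_1 - \theta_{10} \\ \hat\theta_2 - \theta_{20} \end{pmatrix}.$$

We apply the following lemma:

Lemma 2 (Theorem 10.1, p. 62, Billingsley (1961)$^{44}$): Let $E$ be a Euclidean space. Suppose $u_n$ is a random vector in $E$ satisfying $u_n \xrightarrow{D} \mu$, where $\mu$ is a probability measure on $E$. Suppose further that $v_n$ is a second random vector in $E$ satisfying either $|u_n - v_n| \le \varepsilon_n |u_n|$ or $|u_n - v_n| \le \varepsilon_n' |v_n|$, where $\varepsilon_n \xrightarrow{P} 0$, $\varepsilon_n' \xrightarrow{P} 0$. Then $v_n$ has the same limiting distribution $\mu$ as $u_n$.

We have

$$\left| n^{-1/2} \begin{pmatrix} U_{\tau_1}(\theta_{10}) \\ U_{\tau_2}(\theta_{20}) \end{pmatrix} - A\, n^{1/2} \begin{pmatrix} \hat\theta_1 - \theta_{10} \\ \hat\theta_2 - \theta_{20} \end{pmatrix} \right| \le \left\| \mathrm{diag}\left(n^{-1}\mathcal{I}_{\tau_1}(\theta_1^*),\ n^{-1}\mathcal{I}_{\tau_2}(\theta_2^*)\right) - A \right\| \times \left| n^{1/2} \begin{pmatrix} \hat\theta_1 - \theta_{10} \\ \hat\theta_2 - \theta_{20} \end{pmatrix} \right|.$$

By Step 2 the first factor of the right-hand side converges to zero in probability. Applying Lemma 2 we obtain that $A\, n^{1/2}(\hat\theta_1' - \theta_{10}', \hat\theta_2' - \theta_{20}')'$ has the same asymptotic distribution as $n^{-1/2}\left(U_{\tau_1}(\theta_{10})', U_{\tau_2}(\theta_{20})'\right)'$. By Step 1, the result of Step 3 is shown and therefore Theorem 2 is proved.

The next theorem provides a consistent estimator for the asymptotic covariance matrix in Theorem 2.

Theorem 3: Under the model assumptions B.1-B.5, the asymptotic covariance matrix $C = (C_{kl},\ k, l \in \{1, 2\})$ of $n^{1/2}(\hat\theta_1' - \theta_{10}', \hat\theta_2' - \theta_{20}')'$ is consistently estimated by $\hat C = (\hat C_{kl},\ k, l \in \{1, 2\})$, with

$$\hat C_{kl} = n^{2}\, \mathcal{I}_{\tau_k}^{-1}(\hat\theta_k)\, \hat B_{kl}\, \mathcal{I}_{\tau_l}^{-1}(\hat\theta_l),$$

where $\hat B_{kl} = n^{-1} \sum_{i=1}^n \hat V_{ki}(\hat\theta_k)\, \hat V_{li}(\hat\theta_l)'$ and $\hat V_{ki}(\hat\theta_k)$ is a $(p+1)$-dimensional vector,

$$\hat V_{ki}(\hat\theta_k) = \int_0^{\tau_k} \frac{\partial}{\partial\theta_k} \log \alpha_{ki}(u, \hat\theta_k)\,d\hat M_{ki}(u), \qquad \hat M_{ki}(u) = N_{ki}(u) - \int_0^{u} Y_{ki}(s)\, \alpha_{ki}(s, \hat\theta_k)\,ds.$$

Proof of Theorem 3: By Step 2 of the previous theorem and Slutsky's Lemma, what is left to be proved is the convergence $\hat B_{kl} \xrightarrow{P} B_{kl}$, $k, l \in \{1, 2\}$.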
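Recall that $B_{kl} = E\left(v_{k1}(\theta_{k0})\, v_{l1}(\theta_{l0})'\right)$, where the $(p+1)$-dimensional vectors $v_{ki}(\theta_{k0})$ have the form

$$v_{ki}(\theta_{k0}) = \int_0^{\tau_k} \frac{\partial}{\partial\theta_k} \log \alpha_{ki}(u, \theta_{k0})\,dM_{ki}(u).$$

Therefore we will show that for every $j, m \in \{1, \ldots, p+1\}$

$$n^{-1} \sum_{i=1}^n \hat V_{ki}^{j}(\hat\theta_k)\, \hat V_{li}^{m}(\hat\theta_l) \xrightarrow{P} E\left(v_{k1}^{j}(\theta_{k0})\, v_{l1}^{m}(\theta_{l0})\right). \tag{1.17}$$

The $j$-th components of the vectors $\hat V_{ki}(\hat\theta_k)$ and $v_{ki}(\theta_{k0})$ are

$$\hat V_{ki}^{j}(\hat\theta_k) = \delta_{ki}\,[0 \le X_{ki} \le \tau_k]\, \frac{\partial}{\partial\theta_{kj}} \log \alpha_{ki}(X_{ki}, \hat\theta_k) - \int_0^{\tau_k} \frac{\partial}{\partial\theta_{kj}} \alpha_{ki}(u, \hat\theta_k)\, Y_{ki}(u)\,du,$$
$$v_{ki}^{j}(\theta_{k0}) = \delta_{ki}\,[0 \le X_{ki} \le \tau_k]\, \frac{\partial}{\partial\theta_{kj}} \log \alpha_{ki}(X_{ki}, \theta_{k0}) - \int_0^{\tau_k} \frac{\partial}{\partial\theta_{kj}} \alpha_{ki}(u, \theta_{k0})\, Y_{ki}(u)\,du.$$

Consequently $n^{-1} \sum_{i=1}^n \hat V_{ki}^{j}(\hat\theta_k)\, \hat V_{li}^{m}(\hat\theta_l)$ can be expanded as $R1 - R2 - R3 + R4$, where

$$R1 = n^{-1} \sum_{i=1}^n \delta_{ki}\delta_{li}\,[0 \le X_{ki} \le \tau_k][0 \le X_{li} \le \tau_l]\, \frac{\partial}{\partial\theta_{kj}} \log \alpha_{ki}(X_{ki}, \hat\theta_k)\, \frac{\partial}{\partial\theta_{lm}} \log \alpha_{li}(X_{li}, \hat\theta_l),$$
$$R2 = n^{-1} \sum_{i=1}^n \delta_{ki}\,[0 \le X_{ki} \le \tau_k]\, \frac{\partial}{\partial\theta_{kj}} \log \alpha_{ki}(X_{ki}, \hat\theta_k) \int_0^{\tau_l} \frac{\partial}{\partial\theta_{lm}} \alpha_{li}(u, \hat\theta_l)\, Y_{li}(u)\,du,$$
$$R3 = n^{-1} \sum_{i=1}^n \delta_{li}\,[0 \le X_{li} \le \tau_l]\, \frac{\partial}{\partial\theta_{lm}} \log \alpha_{li}(X_{li}, \hat\theta_l) \int_0^{\tau_k} \frac{\partial}{\partial\theta_{kj}} \alpha_{ki}(u, \hat\theta_k)\, Y_{ki}(u)\,du,$$
$$R4 = n^{-1} \sum_{i=1}^n \int_0^{\tau_k}\!\!\int_0^{\tau_l} \frac{\partial}{\partial\theta_{kj}} \alpha_{ki}(t, \hat\theta_k)\, \frac{\partial}{\partial\theta_{lm}} \alpha_{li}(s, \hat\theta_l)\, Y_{ki}(t)\, Y_{li}(s)\,dt\,ds.$$

Similarly $E\left(v_{k1}^{j}(\theta_{k0})\, v_{l1}^{m}(\theta_{l0})\right)$ can be expanded as $L1 - L2 - L3 + L4$, where each $L_a$ is obtained from the corresponding $R_a$ by replacing the sample average with the expectation for a single subject and the estimators with the true parameter values. We will show that $R_a \xrightarrow{P} L_a$ for $a \in \{1, 2, 3, 4\}$ and $k, l \in \{1, 2\}$. We will provide the details for $k = 1$, $l = 2$, the other cases being similar.

We start by proving $R1 \xrightarrow{P} L1$. Recall that $\alpha_{1i}(t, \theta_1) = \sigma_1^{-1} \alpha_0\!\left[\frac{t - \beta_1'Z_{1i}}{\sigma_1}\right]$. Let

$$\xi(t, c, \theta_1, \theta_2, z_1, z_2) = \frac{\partial}{\partial\theta_{1j}} \log\left[\sigma_1^{-1} \alpha_0\!\left(\frac{t - \beta_1'z_1}{\sigma_1}\right)\right] \frac{\partial}{\partial\theta_{2m}} \log\left[\sigma_2^{-1} \alpha_0\!\left(\frac{c - \beta_2'z_2}{\sigma_2}\right)\right]$$

and

$$R1_i(\theta_1, \theta_2) = \delta_{1i}\delta_{2i}\,[0 \le T_i \le \tau_1][0 \le C_i \le \tau_2]\, \xi(T_i, C_i, \theta_1, \theta_2, Z_{1i}, Z_{2i}).$$

Then $R1 = n^{-1} \sum_{i=1}^n R1_i(\hat\theta_1, \hat\theta_2)$ and $L1 = E\left(R1_1(\theta_{10}, \theta_{20})\right)$. First we show that

$$\sup_{(\theta_1, \theta_2) \in \Theta_{10} \times \Theta_{20}} \left| n^{-1} \sum_{i=1}^n R1_i(\theta_1, \theta_2) - E\left(R1_1(\theta_1, \theta_2)\right) \right| \to 0 \quad \text{a.e. as } n \to \infty. \tag{1.18}$$

We will apply a SLLN for separable Banach spaces, first proved by Mourier (1953)$^{45}$:

SLLN 1: If $(\mathcal{X}, \|\cdot\|)$ is a separable Banach space and $\{V_n\}$ a sequence of i.i.d. random elements in $\mathcal{X}$ such that $E\|V_1\| < \infty$, then $\left\| n^{-1} \sum_{i=1}^n V_i - E V_1 \right\| \to 0$ a.e. as $n \to \infty$.

Consider the separable Banach space of continuous functions on the compact set $\Theta_{10} \times \Theta_{20}$, endowed with the supremum norm. Then $R1_i(\cdot, \cdot)$ are i.i.d. random elements of this space. The convergence (1.18) follows from the direct application of the stated SLLN 1 if we show that

$$E\left[ \sup_{(\theta_1, \theta_2) \in \Theta_{10} \times \Theta_{20}} \left| R1_1(\theta_1, \theta_2) \right| \right] < \infty.$$

Let $\varepsilon > 0$. With probability tending to one, $\hat\theta_k \in \Theta_{k0}$, so by the uniform convergence above

$$\left| n^{-1} \sum_{i=1}^n R1_i(\hat\theta_1, \hat\theta_2) - E\left(R1_1(\hat\theta_1, \hat\theta_2)\right) \right| < \varepsilon/2 \tag{1.20}$$

with probability tending to one as $n \to \infty$. Here $E\left(R1_1(\hat\theta_1, \hat\theta_2)\right)$ denotes the function $(\theta_1, \theta_2) \mapsto E\,R1_1(\theta_1, \theta_2)$ evaluated at the estimators. By the Dominated Convergence Theorem, $E\,R1_1(\cdot, \cdot)$ is a continuous function on the compact set $\Theta_{10} \times \Theta_{20}$. Then

$$\left| E\,R1_1(\hat\theta_1, \hat\theta_2) - E\,R1_1(\theta_{10}, \theta_{20}) \right| < \varepsilon/2 \tag{1.21}$$

with probability tending to one as $n \to \infty$. The inequalities (1.20) and (1.21) imply that

$$\left| n^{-1} \sum_{i=1}^n R1_i(\hat\theta_1, \hat\theta_2) - E\,R1_1(\theta_{10}, \theta_{20}) \right| < \varepsilon$$

with probability tending to one as $n \to \infty$. Hence we proved $R1 \xrightarrow{P} L1$.

Next we show that $R2 \xrightarrow{P} L2$. Let $\eta(t, c, \theta_1, \theta_2, z_1, z_2)$ be the function obtained from

$$\frac{\partial}{\partial\theta_{1j}} \log \alpha_{11}(t, \theta_1)\, \frac{\partial}{\partial\theta_{2m}} \alpha_{21}(c, \theta_2)$$

by replacing the covariates $Z_{11}, Z_{21}$ with the arguments $z_1, z_2$. Define also

$$R2_i(c, \theta_1, \theta_2) = \delta_{1i}\,[0 \le T_i \le \tau_1]\, \eta(T_i, c, \theta_1, \theta_2, Z_{1i}, Z_{2i})\, Y_{2i}(c).$$

By Fubini's Theorem

$$R2 = \int_0^{\tau_2} \left( n^{-1} \sum_{i=1}^n R2_i(c, \hat\theta_1, \hat\theta_2) \right) dc \quad \text{and} \quad L2 = \int_0^{\tau_2} E\,R2_1(c, \theta_{10}, \theta_{20})\,dc.$$

First we show that

$$\sup_{c \in [0, \tau_2]}\ \sup_{(\theta_1, \theta_2) \in \Theta_{10} \times \Theta_{20}} \left| n^{-1} \sum_{i=1}^n R2_i(c, \theta_1, \theta_2) - E\,R2_1(c, \theta_1, \theta_2) \right| \to 0 \quad \text{a.e. as } n \to \infty. \tag{1.22}$$

We will apply an extension of the SLLN to $D_E[0, \tau_2]$, the set of right-continuous functions with left-hand limits on $[0, \tau_2]$, taking values in a separable Banach space $E$. The space $D_E[0, \tau_2]$ is endowed with the Skorohod topology. For a proof of this result see Andersen and Gill (1982)$^{31}$.

SLLN 2: Let $\{V_n\}$ be a sequence of i.i.d. random elements of $D_E[0, \tau_2]$. Suppose $E\|V_1\| = E \sup_{c \in [0, \tau_2]} \|V_1(c)\| < \infty$. Then $\sup_{c \in [0, \tau_2]} \left\| n^{-1} \sum_{i=1}^n V_i(c) - E V_1(c) \right\| \to 0$ a.e. as $n \to \infty$.

The convergence (1.22) follows from SLLN 2 once we verify the moment condition

$$E \sup_{c \in [0, \tau_2]}\ \sup_{(\theta_1, \theta_2) \in \Theta_{10} \times \Theta_{20}} \left| R2_1(c, \theta_1, \theta_2) \right| < \infty. \tag{1.23}$$

Let $\varepsilon > 0$. With probability tending to one, $\hat\theta_k \in \Theta_{k0}$. Then, as a consequence of (1.22),

$$\sup_{c \in [0, \tau_2]} \left| n^{-1} \sum_{i=1}^n R2_i(c, \hat\theta_1, \hat\theta_2) - E\,R2_1(c, \hat\theta_1, \hat\theta_2) \right| < \frac{\varepsilon}{2\tau_2}$$

with probability tending to one as $n \to \infty$. Then

$$\left| \int_0^{\tau_2} n^{-1} \sum_{i=1}^n R2_i(c, \hat\theta_1, \hat\theta_2)\,dc - \int_0^{\tau_2} E\,R2_1(c, \hat\theta_1, \hat\theta_2)\,dc \right| \le \tau_2 \sup_{c \in [0, \tau_2]} \left| n^{-1} \sum_{i=1}^n R2_i(c, \hat\theta_1, \hat\theta_2) - E\,R2_1(c, \hat\theta_1, \hat\theta_2) \right| < \tau_2 \frac{\varepsilon}{2\tau_2} = \varepsilon/2 \tag{1.24}$$

with probability approaching one as $n \to \infty$. By (1.23) and the Dominated Convergence Theorem, the function

$$(\theta_1, \theta_2) \mapsto \int_0^{\tau_2} E\,R2_1(c, \theta_1, \theta_2)\,dc = E \int_0^{\tau_2} R2_1(c, \theta_1, \theta_2)\,dc$$

is continuous on $\Theta_{10} \times \Theta_{20}$, so

$$\left| \int_0^{\tau_2} E\,R2_1(c, \hat\theta_1, \hat\theta_2)\,dc - \int_0^{\tau_2} E\,R2_1(c, \theta_{10}, \theta_{20})\,dc \right| < \varepsilon/2 \tag{1.25}$$

with probability tending to one as $n \to \infty$. The inequalities (1.24) and (1.25) imply that

$$\left| \int_0^{\tau_2} n^{-1} \sum_{i=1}^n R2_i(c, \hat\theta_1, \hat\theta_2)\,dc - \int_0^{\tau_2} E\,R2_1(c, \theta_{10}, \theta_{20})\,dc \right| < \varepsilon$$

with probability approaching one as $n \to \infty$. Therefore the convergence $R2 \xrightarrow{P} L2$ is proved. The proof for $R3 \xrightarrow{P} L3$ is identical.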
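Finally we will show that $R4 \xrightarrow{P} L4$. Let $\zeta(t, c, \theta_1, \theta_2, z_1, z_2)$ be the function obtained from

$$\frac{\partial}{\partial\theta_{1j}} \alpha_{11}(t, \theta_1)\, \frac{\partial}{\partial\theta_{2m}} \alpha_{21}(c, \theta_2)$$

by replacing the covariates $Z_{11}, Z_{21}$ with the arguments $z_1, z_2$. Define also

$$R4_i(t, c, \theta_1, \theta_2) = \zeta(t, c, \theta_1, \theta_2, Z_{1i}, Z_{2i})\, Y_{1i}(t)\, Y_{2i}(c).$$

By Fubini's Theorem

$$R4 = \int_0^{\tau_1}\!\!\int_0^{\tau_2} \left( n^{-1} \sum_{i=1}^n R4_i(t, c, \hat\theta_1, \hat\theta_2) \right) dt\,dc \quad \text{and} \quad L4 = \int_0^{\tau_1}\!\!\int_0^{\tau_2} E\,R4_1(t, c, \theta_{10}, \theta_{20})\,dt\,dc.$$

The proof follows the same steps as in the proof of $R2 \xrightarrow{P} L2$. We will show only that

$$\sup_{(t, c) \in [0, \tau_1] \times [0, \tau_2]}\ \sup_{(\theta_1, \theta_2) \in \Theta_{10} \times \Theta_{20}} \left| n^{-1} \sum_{i=1}^n R4_i(t, c, \theta_1, \theta_2) - E\,R4_1(t, c, \theta_1, \theta_2) \right| \to 0 \quad \text{a.e. as } n \to \infty. \tag{1.26}$$

The SLLN that has to be applied in this case is an extension to the space $D_E([0, \tau_1] \times [0, \tau_2])$ and it is proved in Appendix A.

SLLN 3: Let $\{V_n\}$ be a sequence of i.i.d. random elements of $D_E([0, \tau_1] \times [0, \tau_2])$. Suppose $E\|V_1\| = E \sup_{(t, c)} \|V_1(t, c)\| < \infty$. Then $\sup_{(t, c)} \left\| n^{-1} \sum_{i=1}^n V_i(t, c) - E V_1(t, c) \right\| \to 0$ a.e. as $n \to \infty$.

… where $z_\alpha$ is the upper $\alpha$ quantile of the standard normal distribution and the point estimates of the medians and their standard errors are given in (1.27) and (1.28), respectively.

Application: Bivariate Normal Case

In the setting of Example 1.2.1, $\varepsilon_{ki}$ are i.i.d. $N(0, 1)$, which is a symmetric distribution with median $\varepsilon_{.5} = 0$. Using the notations of Theorem 2, consider

$$C_{kk} = \begin{pmatrix} C_{kk}^{\beta} & C_{kk}^{\beta\sigma} \\ (C_{kk}^{\beta\sigma})' & C_{kk}^{\sigma} \end{pmatrix},$$

where $C_{kk}^{\beta}$ is the $p \times p$ asymptotic covariance matrix of $n^{1/2}(\hat\beta_k - \beta_{k0})$ and $C_{kk}^{\sigma}$ is the asymptotic variance of $n^{1/2}(\hat\sigma_k - \sigma_{k0})$. Similarly define the estimators $\hat C_{kk}^{\beta}$, $\hat C_{kk}^{\beta\sigma}$, $\hat C_{kk}^{\sigma}$. In this case the estimates for the median LOS and median cost at a specified covariate profile $Z_0$ are

$$\hat T_{.5} = g^{-1}(\hat\beta_1'Z_0), \qquad \hat C_{.5} = g^{-1}(\hat\beta_2'Z_0),$$

and, because $\varepsilon_{.5} = 0$, the corresponding confidence intervals involve only the blocks $\hat C_{kk}^{\beta}$.

The vectors $\hat V_{ki}(\hat\theta_k)$ of Theorem 3 can be computed explicitly. Write $\hat\xi_{ki} = (X_{ki} - \hat\beta_k'Z_{ki})/\hat\sigma_k$. For the standard normal hazard, $\alpha_0'(w)/\alpha_0(w) = \alpha_0(w) - w$, so the first $p$ components of $\hat V_{ki}(\hat\theta_k)$ are

$$\hat\sigma_k^{-1} \left[ (1 - \delta_{ki})\,\alpha_0(\hat\xi_{ki}) - \alpha_0\!\left(-\frac{\hat\beta_k'Z_{ki}}{\hat\sigma_k}\right) + \delta_{ki}\hat\xi_{ki} \right] Z_{ki}. \tag{1.29}$$

For the last component,

$$\hat V_{ki}^{p+1}(\hat\theta_k) = -\delta_{ki}\hat\sigma_k^{-1}\left[1 + \hat\xi_{ki}\left(\alpha_0(\hat\xi_{ki}) - \hat\xi_{ki}\right)\right] + \int_0^{X_{ki}} \left[ \hat\sigma_k^{-2} \alpha_0\!\left(\frac{u - \hat\beta_k'Z_{ki}}{\hat\sigma_k}\right) + \hat\sigma_k^{-2}\, \frac{u - \hat\beta_k'Z_{ki}}{\hat\sigma_k}\, \alpha_0'\!\left(\frac{u - \hat\beta_k'Z_{ki}}{\hat\sigma_k}\right) \right] du$$
$$= \hat\sigma_k^{-1} \left[ \int_{-\hat\beta_k'Z_{ki}/\hat\sigma_k}^{\hat\xi_{ki}} \alpha_0(v)\,dv + \int_{-\hat\beta_k'Z_{ki}/\hat\sigma_k}^{\hat\xi_{ki}} \alpha_0'(v)\,v\,dv - \delta_{ki} - \delta_{ki}\left(\alpha_0(\hat\xi_{ki}) - \hat\xi_{ki}\right)\hat\xi_{ki} \right].$$

By assumption B.4 and integration by parts:

$$\int_a^b \alpha_0(v)\,dv + \int_a^b \alpha_0'(v)\,v\,dv = \alpha_0(b)\,b - \alpha_0(a)\,a.$$

Therefore

$$\hat V_{ki}^{p+1}(\hat\theta_k) = \hat\sigma_k^{-1} \left[ \alpha_0(\hat\xi_{ki})\hat\xi_{ki} + \alpha_0\!\left(-\frac{\hat\beta_k'Z_{ki}}{\hat\sigma_k}\right) \frac{\hat\beta_k'Z_{ki}}{\hat\sigma_k} - \delta_{ki} - \delta_{ki}\,\alpha_0(\hat\xi_{ki})\hat\xi_{ki} + \delta_{ki}\hat\xi_{ki}^{2} \right]$$
$$= \hat\sigma_k^{-1} \left[ (1 - \delta_{ki})\,\alpha_0(\hat\xi_{ki})\hat\xi_{ki} + \alpha_0\!\left(-\frac{\hat\beta_k'Z_{ki}}{\hat\sigma_k}\right) \frac{\hat\beta_k'Z_{ki}}{\hat\sigma_k} - \delta_{ki} + \delta_{ki}\hat\xi_{ki}^{2} \right]. \tag{1.30}$$

Let

$$\hat a_{ki} = (1 - \delta_{ki})\,\alpha_0(\hat\xi_{ki}) - \alpha_0\!\left(-\frac{\hat\beta_k'Z_{ki}}{\hat\sigma_k}\right) + \delta_{ki}\hat\xi_{ki},$$
$$\hat b_{ki} = (1 - \delta_{ki})\,\alpha_0(\hat\xi_{ki})\hat\xi_{ki} + \alpha_0\!\left(-\frac{\hat\beta_k'Z_{ki}}{\hat\sigma_k}\right) \frac{\hat\beta_k'Z_{ki}}{\hat\sigma_k} - \delta_{ki} + \delta_{ki}\hat\xi_{ki}^{2}.$$

By (1.29) and (1.30), the $(p+1)$-dimensional vector $\hat V_{ki}(\hat\theta_k)$ has the form

$$\hat V_{ki}(\hat\theta_k) = \hat\sigma_k^{-1} \begin{pmatrix} \hat a_{ki} Z_{ki} \\ \hat b_{ki} \end{pmatrix}.$$

Consequently

$$\hat B_{kl} = n^{-1} \sum_{i=1}^n \hat V_{ki}(\hat\theta_k)\, \hat V_{li}(\hat\theta_l)' = \hat\sigma_k^{-1}\hat\sigma_l^{-1} \begin{pmatrix} n^{-1} \sum_{i=1}^n \hat a_{ki}\hat a_{li}\, Z_{ki}Z_{li}' & n^{-1} \sum_{i=1}^n \hat a_{ki}\hat b_{li}\, Z_{ki} \\ n^{-1} \sum_{i=1}^n \hat b_{ki}\hat a_{li}\, Z_{li}' & n^{-1} \sum_{i=1}^n \hat b_{ki}\hat b_{li} \end{pmatrix}.$$

1.3 Application

We demonstrate the application of the methods proposed in Sections 1.1 and 1.2 to hospital LOS and costs in a cohort of patients admitted for coronary artery bypass graft surgery (CABG). Data are from a tertiary care, academically affiliated medical center. In this study there were 1268 consecutive admissions for CABG from August 23, 1993 to December 29, 1994. Computerized files were maintained on demographic characteristics, medical history, clinical outcomes, resource use and costs. Length of stay was defined as the number of days from admission to discharge, inclusive of the admission day. Costs were derived from services for operating room, nursing, laboratory, and pharmacy as well as room and board and convenience items. All professional fees were excluded.

The complete dataset was available, including cost histories for each patient. As an example for our techniques that incorporate censoring and assume that only total costs per patient were observed, we reconstructed the dataset as of six months after the study started and computed the total costs. At that time some patients were still in hospital and some costs had not yet been incurred. This resulted in a dataset of 465 subjects, with 7.53% of the LOS and 9.68% of the hospital costs being censored.

Our objective is to examine health care utilization in our sample, as measured by LOS and costs. Using both approaches (semiparametric and parametric), the medians of these two outcomes will be estimated and confidence intervals will be provided.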
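The reconstruction of a censored analysis dataset at a fixed cutoff date can be sketched as follows. This is only an illustration under assumptions that are not stated in the source: each admission is assumed to carry an admission day, a discharge day and a list of (posting day, amount) cost items, LOS counts the admission day, and a total cost is treated as censored whenever at least one item posts after the cutoff. The class, function and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Admission:
    admit_day: int                        # day of admission (study day index)
    discharge_day: int                    # day of discharge
    cost_items: List[Tuple[int, float]]   # (posting day, amount); assumed structure


def censor_at(adm: Admission, cutoff_day: int) -> Optional[dict]:
    """Return the observed (possibly censored) LOS and total cost at the cutoff.

    Patients admitted after the cutoff are not part of the interim dataset.
    delta_los / delta_cost equal 1 when the outcome is fully observed.
    """
    if adm.admit_day > cutoff_day:
        return None
    los_observed = adm.discharge_day <= cutoff_day
    last_day = adm.discharge_day if los_observed else cutoff_day
    los = last_day - adm.admit_day + 1    # inclusive of the admission day
    cost = sum(amount for day, amount in adm.cost_items if day <= cutoff_day)
    cost_observed = all(day <= cutoff_day for day, _ in adm.cost_items)
    return {
        "los": los,
        "delta_los": int(los_observed),
        "cost": cost,
        "delta_cost": int(cost_observed),
    }


# Hypothetical patient: discharged before the cutoff, one charge posted afterwards.
example = Admission(admit_day=10, discharge_day=15,
                    cost_items=[(10, 100.0), (14, 50.0), (20, 25.0)])
record = censor_at(example, cutoff_day=18)
```

Under this assumed rule a cost can be censored even when the LOS is observed, which is one way the two censoring proportions could differ.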
CORRELATES OF LOS AND COST

Potential correlates of LOS and costs included demographic and clinical variables that could be identified at admission, the use of cardiac catheterization during hospital stay and discharge status, an indicator of whether the patient was alive at discharge. The variables available at admission were age at admission, gender, race (White, African-American or Other), marital status (Married, Alone or Unknown), insurance status, comorbidity, ejection fraction and history of prior CABG. Insurance status was categorized as Medicare, private, Medicaid or other. Ejection fraction, a useful measure of cardiac function, is the volume of blood expelled at each systole as a fraction of the volume of blood contained in the ventricle at the end of the diastole. A value less than 50% is generally considered abnormal. Values of ejection fraction were grouped as below 35%, 35% to 49%, and 50% and above.

The Charlson Comorbidity Index (CCI)$^{46}$ was used to assess comorbidity. It is a weighted sum of the presence of 19 specified medical conditions at admission. These conditions include diabetes, liver disease, congestive heart failure, peripheral vascular disease, prior myocardial infarction, cerebrovascular disease, connective tissue disease, dementia, chronic obstructive pulmonary disease, hemiplegia, tumor, and acquired immunodeficiency syndrome (AIDS) or AIDS related complex. We formed three comorbidity groups based on CCI scores 0-1, 2-3 and 4 or more.

The Diagnosis Related Group variable (DRG) is in this case a binary variable that indicates whether a patient had cardiac catheterization during the hospital stay.

Table 1.1 shows the characteristics of the 465 patients in our sample. The mean age was 63.4 years and the median age 65 years.

RESULTS

There is a high correlation between LOS and cost (Spearman r = .79, n = 465), even if all censored observations are omitted (Spearman r = .77, n = 420). The range of the uncensored LOS observations was 3 to 35 days,
and the uncensored costs ranged from $11,669 to $90,088. Figure 1.1 shows a plot of LOS and cost (circles represent censored cost observations).

The bivariate models we use for LOS and cost allow inferences about regression parameters simultaneously for these two outcomes. For example, suppose we are interested in the effects of a covariate with three categories, such as CCI, on both LOS and cost. Two dummy variables are created for these three categories. Denote by

$$\eta_0 = \left((\beta_{10})_1, (\beta_{10})_2, (\beta_{20})_1, (\beta_{20})_2\right)'$$

the subvector of the vector of all true regression coefficients $(\beta_{10}', \beta_{20}')'$ corresponding to the dummy variables for both LOS and cost, and by $\hat\eta$ the vector of estimators $\hat\eta = (\hat\beta_{11}, \hat\beta_{12}, \hat\beta_{21}, \hat\beta_{22})'$. Let $\hat V$ be the consistent estimator of the asymptotic covariance matrix of $\hat\eta$, as defined in Theorems 3 of Sections 1.1 and 1.2. By the asymptotic normality of the regression parameter estimators, the quadratic form

$$W = \hat\eta'\, \hat V^{-1} \hat\eta$$

is asymptotically chi-square distributed with 4 degrees of freedom and can be used to test jointly the null hypotheses $H_0: (\beta_{k0})_j = 0$, $k \in \{1, 2\}$, $j \in \{1, 2\}$.

In the covariate selection procedure, correlates of LOS and cost were tested jointly so that the resulting model would have the same constellation of significant variables. Each potential covariate was assessed individually and then in combination with all others that were found to be significant by univariate analysis (p-value < 0.20). Only age at admission was regarded as a continuous independent variable. In the final regression model we retained only variables that were significant at p-value < 0.10. All analyses were performed with SAS software version 8 (SAS Institute Inc., Cary, NC).

Semiparametric Model

To implement the method developed in Section 1.1, we create two records for each patient, one for LOS and one for cost. For the categorical covariates sixteen dummy variables $d_i$,
$2 \le i \le 16$, were created and type-specific covariates were defined as follows:

For LOS: $Z_{11} = \text{age} \cdot [\text{type} = 1]$ and $Z_{i1} = d_i \cdot [\text{type} = 1]$, $2 \le i \le 16$;
For cost: $Z_{12} = \text{age} \cdot [\text{type} = 0]$ and $Z_{i2} = d_i \cdot [\text{type} = 0]$, $2 \le i \le 16$.

After the covariate selection procedure, the significant correlates in the final model were age at admission (p=.0958), DRG (p<.0001), indicator of being discharged alive (p<.0001), history of prior CABG (p=.0006), ejection fraction (three categories, p=.0246) and Charlson Comorbidity Index (three categories, p<.0001). Age was regarded as a continuous variable and seven dummy variables $d_i$, $2 \le i \le 8$, correspond to the significant categorical covariates.

Consider the vectors $Z_k = (Z_{1k}, Z_{2k}, \ldots, Z_{8k})'$, $\beta_k = (\beta_{1k}, \beta_{2k}, \ldots, \beta_{8k})'$, $1 \le k \le 2$, and $Z = (Z_1', Z_2')'$, $\beta = (\beta_1', \beta_2')'$. Note that for the LOS data $\beta'Z = \beta_1'Z_1$ and for the cost data $\beta'Z = \beta_2'Z_2$. We are essentially fitting two separate Cox models to LOS and cost, but this formulation is amenable to the SAS PHREG procedure and permits simultaneous estimation of $\beta_1$ and $\beta_2$ as well as direct estimation of the correlation between the two estimators.

We present a part of the final model estimates. For the three CCI categories, the two (out of seven) dummy variables $d_2, d_3$ were defined as follows:

for CCI 0-1: $d_2 = 1$, $d_3 = 0$;
for CCI 2-3: $d_2 = 0$, $d_3 = 1$;
for CCI 4 or more: $d_2 = 0$, $d_3 = 0$.

The LOS observations have $Z_{21} = d_2$, $Z_{31} = d_3$, $Z_{22} = 0$, $Z_{32} = 0$ and the cost observations have $Z_{21} = 0$, $Z_{31} = 0$, $Z_{22} = d_2$, $Z_{32} = d_3$. Estimates of the corresponding regression parameters $\beta_{21}, \beta_{31}, \beta_{22}, \beta_{32}$ are $\hat\beta_{21} = .683$, $\hat\beta_{31} = .230$, $\hat\beta_{22} = .766$, $\hat\beta_{32} = .140$. The estimated adjusted covariance matrix of these four beta estimators (upper triangle shown) is

$$\begin{pmatrix} 0.0155 & 0.0105 & 0.0140 & 0.0101 \\ & 0.0165 & 0.0102 & 0.0163 \\ & & 0.0219 & 0.0150 \\ & & & 0.0230 \end{pmatrix}.$$

When fitting independent proportional hazard models for LOS and cost,
the estimate for the so-called naive covariance matrix (upper triangle shown) is

$$\begin{pmatrix} 0.0224 & 0.0148 & 0 & 0 \\ & 0.0200 & 0 & 0 \\ & & 0.0229 & 0.0151 \\ & & & 0.0206 \end{pmatrix}.$$

Estimates and approximate 95% pointwise confidence intervals for the LOS and cost survival distribution functions were calculated for the 6 covariate profiles defined by CCI and discharge status (alive or dead), for a patient aged 65 years at admission, with ejection fraction above 50%, who had catheterization and had no history of prior CABG. As presented in Section 1.1.4, confidence intervals were obtained using the log(-log) transformation, and both the adjusted and naive variances were used. Figures 1.2 and 1.3 depict the LOS and cost survival function estimates and approximate confidence intervals for one of these profiles (CCI = 4+, discharged alive). The naive and adjusted confidence intervals are different, but close.

Median LOS and cost were estimated from the corresponding survival distributions (see Section 1.1.4). Table 1.2 shows these estimates for the 6 covariate profiles presented above. The adjusted and naive confidence intervals for the median LOS are the same for patients discharged alive, but they differ substantially for patients who died during their hospital stay. For the median cost the two types of confidence intervals were different for all profiles, possibly due to more variation in the cost data. For the survivor profiles the naive confidence intervals wholly contained the corresponding adjusted confidence intervals, but this pattern changed completely for the non-survivor profiles. In our sample of 465 subjects, 12 patients did not survive their hospital stay. The small number of deaths and the large variability of the outcomes in these 12 patients might be a reason for the instability of our estimates for the non-survivor profiles.

In Table 1.2 we notice how the LOS and cost median estimates increase with larger comorbidity.
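The joint Wald statistic $W = \hat\eta'\hat V^{-1}\hat\eta$ used in the covariate selection can be evaluated for the two CCI dummy variables from the four coefficient estimates and the adjusted and naive covariance matrices reported in this section. The following is a minimal sketch, not the SAS computation used in the study: the numbers are transcribed from the text (with $\hat\beta_{31}$ read as .230), the lower triangles are filled in by symmetry, the helper names are hypothetical, and the tail probability uses the closed form of the chi-square survival function for 4 degrees of freedom.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def symmetric(upper):
    """Build a full symmetric matrix from its upper-triangular rows."""
    n = len(upper)
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(upper):
        for k, v in enumerate(row):
            full[i][i + k] = full[i + k][i] = v
    return full


def wald(eta, cov):
    """Quadratic form eta' cov^{-1} eta."""
    return sum(e * xi for e, xi in zip(eta, solve(cov, eta)))


def chi2_sf_4df(w):
    """P(chi-square with 4 df > w) = exp(-w/2) * (1 + w/2)."""
    return math.exp(-w / 2.0) * (1.0 + w / 2.0)


# Order: (beta_21, beta_31, beta_22, beta_32), as transcribed from the text.
eta_hat = [0.683, 0.230, 0.766, 0.140]
adjusted = symmetric([[0.0155, 0.0105, 0.0140, 0.0101],
                      [0.0165, 0.0102, 0.0163],
                      [0.0219, 0.0150],
                      [0.0230]])
naive = symmetric([[0.0224, 0.0148, 0.0, 0.0],
                   [0.0200, 0.0, 0.0],
                   [0.0229, 0.0151],
                   [0.0206]])

w_adjusted = wald(eta_hat, adjusted)
w_naive = wald(eta_hat, naive)
p_adjusted = chi2_sf_4df(w_adjusted)
p_naive = chi2_sf_4df(w_naive)
```

Comparing `w_adjusted` with `w_naive` shows how much the joint test changes when the correlation between the LOS and cost estimators is ignored.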
Patients who survived their hospital stay had larger LOS and smaller costs than those who died.

Parametric Model

Both LOS and cost exhibit right skewness, but this is not severe. No simple transformation could eliminate it. In our analyses outcomes are in their original scale and we assume bivariate normality. The significant correlates in the final joint model were DRG (p<.0001), indicator of being discharged alive (p<.0001), history of prior CABG (p=.0282), ejection fraction (p=.0032) and Charlson Comorbidity Index (p<.0001). This parametric model has the same set of covariates as the semiparametric one, except age, which is not significant in this case. Again, these assessments were made jointly for both LOS and cost.

Following the calculations of Section 1.2.3, we computed the adjusted estimator of the asymptotic covariance matrix of the regression parameters and the point estimates and confidence intervals for the medians of both outcomes. Since no transformation was used, $g$ was the identity function.

Estimates and confidence intervals of the median LOS and median cost by comorbidity and discharge status are shown in Table 1.3 for patients with ejection fraction above 50%, no history of prior CABG, who had cardiac catheterization. The estimates are larger than in the semiparametric case, but similar patterns are noticed. The confidence intervals are wider for non-survivors than for survivors, and non-survivors have lower LOS and larger costs than survivors. As expected, LOS and cost increase with comorbidity.

DISCUSSION

We applied two models to estimate the median LOS and the median cost for patients hospitalized for CABG. Each model permits adjustments for correlates and recognizes the correlation between the dependent variables. The semiparametric approach specifies marginal models for LOS and cost and yields consistent estimates of the regression parameters as long as the marginal models are correctly specified.
The adjusted covariance matrix then accounts for the correlation between the outcomes, without explicitly specifying a joint distribution for them. The parametric approach specifies a joint distribution for LOS and cost, but the parameters related only to the joint distribution and not to the marginal models are considered nuisance parameters and are left unspecified. In our study, using the adjusted covariance estimates instead of the naive ones, which do not address the correlation between LOS and cost, gave qualitatively different results with respect to the confidence intervals for the median cost.

The final sets of covariates in both models were essentially the same: comorbidity, discharge status (alive or dead), DRG, history of prior CABG, ejection fraction and age at admission, which was significant only in the semiparametric model. Previous studies of LOS and hospital cost have shown the importance of these covariates and their statistical significance regardless of the model used [8, 22, 47].

A problem with LOS and in-hospital cost data is the appropriate treatment of in-hospital deaths. Previously, several authors [7, 8] have regarded deaths as an early curtailment of LOS and costs. In these studies observed cost and LOS of non-survivors are considered right censored because, if they had survived their hospital stay, their costs would have been higher and LOS longer. In this approach the independent censoring assumption is not verified. Besides this, for many applications estimates of costs for those who died are just as important as for survivors. By censoring at death no model can be used to derive predictions for a decision model or cost-effectiveness analysis in which death is an explicit outcome. If only the observations of those who died are used in the analysis, the investigator generally sacrifices considerable efficiency, since the majority of the total sample typically survives.
We regarded an in-hospital death along with other demographic and clinical characteristics as potential correlates of LOS and total hospital costs. Our finding was that the non-survivors had lower LOS and larger costs than survivors. Treating non-survivor costs as censored would have increased the bias of the cost estimators.

We have several limitations in the analyses of our application. One is the strong distributional assumption needed in the parametric model. We made a normality assumption and, even though the resulting model did not provide a very good fit for our data, we used it as a comparison for the semiparametric model. When a distributional assumption is plausible for a study, the parametric models have the advantage of the simplicity of calculations and the efficiency of the estimators.

Another limitation is the problem of censoring for costs. The use of survival analysis techniques to analyze medical care costs is relatively new and has sparked a lively debate [1, 9, 11, 48] on its applicability, given the assumptions that underlie traditional survival models for duration times. Patients who accumulate costs over time at relatively higher rates tend to generate larger cumulative costs at both the event time and the censoring time, leading to dependent censoring. This contradicts the usual assumption of independent censoring made in standard survival analyses. While no single approach can be expected to perform in all situations, we believe the traditional survival methods will still have a useful role in cost analyses, especially if cost histories are not available.

TABLE 1.1: Characteristics of Patients

Variable                      Subgroup         N     Percent
Discharged Alive              Yes              453   97.41
Gender                        Male             314   67.53
Race                          White            398   85.59
                              African Amer.     36    7.74
                              Other             22    4.73
                              Unknown            9    1.94
Marital Status                Married          333   71.61
                              Alone            112   24.09
                              Unknown           20    4.30
DRG                           with CATH        235   50.54
Ejection Fraction             < 35              58   12.47
                              35 - 49          126   27.10
                              50 +             281   60.43
History of prior CABG         Yes               36    7.74
Charlson Comorbidity Index    0-1              198   42.58
                              2-3              189   40.65
                              4 +               78   16.77
Insurance                     Medicare         275   59.14
                              Private          133   28.60
                              Medicaid          30    6.45
                              Other             27    5.81

FIGURE 1.1: Distribution of Costs and LOS [scatter plot of cost (1000 $) against LOS (days); separate symbols mark uncensored (*) and censored (o) costs]

FIGURE 1.2: Estimated LOS survival function and approximate 95% adjusted (____) and naive (- - - -) pointwise confidence intervals [survival probability against LOS (days)]. Estimates were made for a patient discharged alive, who underwent CATH, with a CCI of 4+, ejection fraction 50+, age 65 at admission and no history of prior CABG.

FIGURE 1.3: Estimated cost survival function and approximate 95% adjusted (____) and naive (- - - -) pointwise confidence intervals [survival probability against cost (1000 $)]. Estimates were made for a patient discharged alive, who underwent CATH, with a CCI of 4+, ejection fraction 50+, age 65 at admission and no history of prior CABG.

TABLE 1.2: Length of Stay and Costs by Comorbidity and Discharge Status (Semiparametric Model)

Confidence limits are adjusted, with naive limits in parentheses.

                  Length of Stay, days             Cost, $
CCI   Status   Median   95% LCL   95% UCL    Median    95% LCL     95% UCL
0-1   Alive    11       10        11         20,819    20,378      21,315
                        (10)      (11)                 (20,356)    (21,382)
      Dead     10       8         15         24,167    18,922      34,884
                        (9)       (12)                 (20,682)    (29,422)
2-3   Alive    13       12        14         24,666    24,140      25,149
                        (12)      (14)                 (24,116)    (25,192)
      Dead     12       9         19         30,155    21,415      79,103
                        (10)      (15)                 (24,201)    (44,637)
4 +   Alive    14       13        16         25,660    24,913      27,083
                        (13)      (16)                 (24,728)    (27,177)
      Dead     13       9         22         31,276    23,048      79,103
                        (10)      (16)                 (25,399)    (46,662)

Estimates for a patient who underwent CATH, with ejection fraction 50+, age 65 at admission and no history of prior CABG.

TABLE 1.3: Length of Stay and Costs by Comorbidity and Discharge Status (Parametric Model)

Confidence limits are adjusted, with naive limits in parentheses.

                  Length of Stay, days             Cost, $
CCI   Status   Median   95% LCL   95% UCL    Median    95% LCL     95% UCL
0-1   Alive    11.54    10.98     12.10      23,237    22,094      24,380
                        (10.73)   (13.35)              (21,493)    (24,981)
      Dead     8.59     4.27      12.91      29,705    17,778      41,632
                        (5.92)    (11.27)              (23,945)    (35,466)
2-3   Alive    13.47    12.47     14.47      27,841    25,790      29,892
                        (12.62)   (14.32)              (26,001)    (29,681)
      Dead     10.52    6.13      14.92      34,309    22,380      46,238
                        (7.89)    (13.16)              (28,636)    (39,982)
4 +   Alive    14.87    13.51     16.23      29,893    27,058      32,728
                        (13.66)   (16.07)              (27,287)    (32,499)
      Dead     11.92    7.35      16.49      36,361    23,278      49,445
                        (9.20)    (14.64)              (30,497)    (42,226)

Estimates for a patient who underwent CATH, with ejection fraction 50+ and no history of prior CABG.

CHAPTER 2

ESTIMATING HOSPITAL COST OVER A SPECIFIED DURATION

Considering accumulating cost as a process evolving over time, we construct in this chapter a regression model that permits the estimation of the mean cost over a specified duration of hospital stay and also adjusts for the influence of patient characteristics on LOS and cost. The proposed methods, which model the relationship between hospital cost and LOS, are applied to assess the mean cost in a cohort of hospitalized patients who underwent CABG surgery.
2.1 Model Description

Let $T$ denote the LOS and $Z_1$ a vector of $p$ explanatory variables that might have an impact on the distribution of $T$. The cumulative cost $C(t)$ through $t$ days of hospital stay is only observed at $t = T$ and therefore we will focus on the influence of covariates $Z_2$ on the distribution of the total cost $C = C(T)$. To fully integrate the role of time into analyses of costs we would need the cumulative cost histories as they manifest in each patient. Suppose that

\[ C(t) = \int_0^t B(u)\,du, \]

so that cost is incurred at the rate $B(u)$ at time $u$. The rate of hospital cost accumulation is observed at time $u$ only if $u$ is less than or equal to the hospital stay $T$. Therefore the observed cost has the form

\[ \int_0^t [T \ge u]\,B(u)\,du. \]

We want to estimate the expected value of the observed cost over a duration $t$ for a specified covariate profile $Z_0$. Given this covariate profile, we assume that $T$ is independent of the rate process $\{B(u), u > 0\}$. Then, by Fubini's Theorem, the mean observed cost is

\[ MC(t \mid Z_0) = \int_0^t b(u \mid Z_0)\,S(u \mid Z_0)\,du, \tag{2.1} \]

where $b(u \mid Z_0) = E(B(u) \mid Z_0)$ is the average potential rate at time $u$ and $S(u \mid Z_0) = P(T > u \mid Z_0)$ the survival function of $T$, both for the specified profile $Z_0$.

Consider $n$ individuals in the study. Using the same notation as in Chapter 1, for the $i$-th patient let $T_i$ denote the true LOS, $T_i'$ the censoring time, $X_{1i} = \min(T_i, T_i')$, $\delta_{1i} = [T_i \le T_i']$ the indicator of non-censoring for time and $Z_{1i}$ a vector of $p$ explanatory variables that influence $T_i$. Consider also the true hospital cost $C_i$, the censoring cost $C_i'$, $X_{2i} = \min(C_i, C_i')$, $\delta_{2i} = [C_i \le C_i']$ the indicator of non-censoring for cost and $Z_{2i}$ a vector of $p$ explanatory variables that influence $C_i$.

Let $\hat b(\cdot \mid Z_0)$, $\hat S(\cdot \mid Z_0)$ be estimators of $b(\cdot \mid Z_0)$, $S(\cdot \mid Z_0)$, respectively, obtained from the data $\{(X_{1i}, \delta_{1i}, Z_{1i}), (X_{2i}, \delta_{2i}, Z_{2i}), 1 \le i \le n\}$. Then the mean observed cost is estimated by

\[ \widehat{MC}(t \mid Z_0) = \int_0^t \hat b(u \mid Z_0)\,\hat S(u \mid Z_0)\,du. \tag{2.2} \]

The construction of the estimators $\hat b(\cdot \mid Z_0)$, $\hat S(\cdot \mid Z_0)$ relies heavily on the study data. In the application described below, a linear relationship between time and cost is considered appropriate and a linear regression model is used for the estimation of the expected rate. The survival function is estimated through a proportional hazard model.

In Chapter 3 we extend the model proposed here to a more general setup in which patients pass through different health states. In the present chapter a patient makes a single transition from the state of being hospitalized to the state of being discharged. A rate model is used for describing the accumulating hospital costs. With transitions between several states, costs might be incurred at transitions between health states and also during a sojourn in a health state. The latter costs can also be described through a rate model, whereas the "jump" costs at transition times between states can be described using marked point processes. For these cases we provide in Chapter 3 our methods of estimation of mean cost.

2.2 Application

We apply the presented method to the hospital costs of the patients from the same sample used in Section 1.3. In that section we described the study, the available potential correlates and the covariate selection procedure.

The proportional hazard model provided a good fit to the data. The significant correlates of LOS were age at admission (p=.0366), DRG (p<.0001), history of prior CABG (p=.0685), ejection fraction (p=.0650) and Charlson Comorbidity Index (p<.0001). The estimator $\hat S(\cdot \mid Z_0)$ of the LOS survival function was derived from the Cox regression model, as described in Section 1.1.

The plot of costs versus LOS (see Figure 1.1) suggests a linear relationship. The variance seems to change a little over time but for simplicity we do not consider this fact in the analyses of this example. As in the parametric model used in Section 1.3, we assume costs approximately normally distributed. For modeling total costs as a function of time,
we allow LOS and LOS$^2$ to compete for inclusion in the final model. We also regard the indicator of non-censoring for cost along with the other demographic and clinical characteristics as potential correlates of total hospital cost. The significant correlates are LOS (p<.0001), DRG (p=.0636), history of prior CABG (p=.0213) and indicator of being discharged alive (p<.0001). Even though the indicator of cost non-censoring is not significant (p=.6138), we include it in the final list of covariates because we want to be able to distinguish between the censored and the non-censored cost observations. We then consider the model

\[ C_i(t) = \beta_1 \delta_{2i} + \beta_2' Z_{2i} + \beta_3 t + \sigma \varepsilon_i, \]

in which the parameters $\beta_1, \beta_2, \beta_3, \sigma$ are estimated by maximum likelihood, assuming the errors $\varepsilon_i$ independently normally distributed with zero mean and unit variance. Consequently, for a given profile $Z_0$, we estimate the expected (uncensored) cost

\[ C_0(t \mid Z_0) = E(C(t) \mid Z_0) = E\Big(\int_0^t B(u)\,du \,\Big|\, Z_0\Big) = \int_0^t b(u \mid Z_0)\,du \]

by

\[ \hat C_0(t \mid Z_0) = \hat\beta_1 + \hat\beta_2' Z_0 + \hat\beta_3 t \ \text{ for } t > 0, \qquad \hat C_0(0 \mid Z_0) = 0. \tag{2.3} \]

The above model for $C(t)$ incorporates the dynamics of time into the accumulating hospital cost. In other applications the dependence on time might be more complex than the simple linear relation used here. However, in practice a polynomial in $t$ should adequately capture the dependence on the time $t$. In a related study of hospital charges in patients undergoing cardiac procedures, a quadratic in $t$ provided a reasonable fit to the data [49].

Suppose we want to estimate the mean observed cost over the fixed duration $t$. The estimated survival function $\hat S(\cdot \mid Z_0)$ is a step function that jumps at the observed LOS times. Let $T_{(l)}$, $l \ge 1$, denote the ordered observed LOS times in our sample (with the convention $T_{(0)} = 0$). Then the fixed duration $t$ is located between two of these times, $T_{(l-1)} < t \le T_{(l)}$, and, by (2.2),

\[ \widehat{MC}(t \mid Z_0) = \sum_{j=1}^{l-1} \hat S(T_{(j-1)} \mid Z_0)\big(\hat C_0(T_{(j)} \mid Z_0) - \hat C_0(T_{(j-1)} \mid Z_0)\big) + \hat S(T_{(l-1)} \mid Z_0)\big(\hat C_0(t \mid Z_0) - \hat C_0(T_{(l-1)} \mid Z_0)\big), \]

where $\hat C_0(\cdot \mid Z_0)$ is given by (2.3).
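The weighted sum above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the analyses: the function name `mean_observed_cost` and its arguments are hypothetical, `surv` stands in for the fitted step function $\hat S(\cdot \mid Z_0)$ and `cost` for $\hat C_0(\cdot \mid Z_0)$ from (2.3), and the toy numbers are invented.

```python
from bisect import bisect_left


def mean_observed_cost(t, los_times, surv, cost):
    """Survival-weighted sum of cost increments up to duration t (> 0).

    los_times : sorted distinct observed LOS times T_(1) < T_(2) < ...
    surv      : callable u -> estimated S(u | Z0), a right-continuous step function
    cost      : callable u -> estimated C0(u | Z0), with cost(0) == 0
    (All names are hypothetical; the fitted models are supplied by the caller.)
    """
    # Keep only the observed times strictly below t, so that the grid is
    # 0 = T_(0) < T_(1) < ... < T_(l-1) < t, matching T_(l-1) < t <= T_(l).
    k = bisect_left(los_times, t)
    grid = [0.0] + list(los_times[:k]) + [t]

    total = 0.0
    for left, right in zip(grid[:-1], grid[1:]):
        # Each cost increment is weighted by the survival estimate at the
        # left end of the interval, where the step function is constant.
        total += surv(left) * (cost(right) - cost(left))
    return total


# Toy inputs (invented): survival drops to 0.5 at day 2 and to 0 at day 5;
# cost has a fixed component of 100 plus 10 per day, and is 0 at time 0.
toy_surv = lambda u: 1.0 if u < 2 else (0.5 if u < 5 else 0.0)
toy_cost = lambda u: 0.0 if u == 0 else 100.0 + 10.0 * u

print(mean_observed_cost(4, [2, 5], toy_surv, toy_cost))  # 130.0 for these toy inputs
```

With these toy inputs the first increment (0 to 2 days) carries weight 1 and the second (2 to 4 days) carries weight 0.5, which is the sense in which the estimate is an average of cost increments weighted by survival.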
Thus the mean observed cost is an average of cost increments, weighted by the likelihood of surviving through each incremental period.

Table 2.1 shows the estimated mean costs at the LOS median and at the largest observed LOS (35 days) by comorbidity and discharge status (alive or dead). The estimates are for survivors, age 65 at admission, ejection fraction 50+, with no prior history of CABG, who had a cardiac catheterization during their hospital stay. Medians of the LOS distribution for different covariate profiles were estimated from the Cox model. The discharge status was not significant in the model for LOS, so the LOS medians do not differ between the survivors and non-survivors with the same characteristics. As expected, the mean costs increase with comorbidity. We also notice that non-survivors have larger costs than survivors. Thus for patients with CCI of 2 or 3, the estimated mean cost through their median LOS of 12 days was $21,651 for survivors and $34,030 for non-survivors. Figure 2.1 is a plot of the estimated mean observed cost by category of comorbidity for patients discharged alive, with the same profile described for Table 2.1. The fact that a model linear in time was used for the accumulating cost is noticed in the figure. After approximately 9 days, the survival function weights change the linear trend and make the estimated mean cost different by comorbidity.

In conclusion, the presented model estimates the mean observed cost for hospital stays of specified duration. The censored LOS observations were analyzed by standard survival analysis methods.
The indicator of non-censoring for cost was regarded as a potential correlate of total cost, even though it was not statistically significant. In this way we do not use the assumption of independent cost censoring, which might not be valid in many situations. When assessing mean costs over a given duration, the potential cost accumulation $C(t)$ through time $t$ was modeled by a linear function of time. In the absence of the cost histories of the patients, the model was fitted using the observed total cost and LOS in our sample. The overall mean cost was a weighted average of cost increments.

This approach is similar to the one that Gardiner et al. (1995, 1999) previously adopted in evaluating the cost-effectiveness of the implantable cardioverter defibrillator. In that study survival time was the underlying stochastic variable whose distribution was estimated by the Kaplan-Meier method or Cox regression. Censoring occurred due to incomplete follow-up of some patients. The cumulative cost $C(t)$ was assumed known, which gave the expected total cost over a fixed time interval $[0, t]$ as $\int_0^t e^{-ru} S(u \mid Z_0)\,dC(u)$, where $r$ is the discount rate and $S$ the survival function. Our proposed method extends this analysis in a very important practical way by incorporating the stochastic element in costs. Ignoring discounting (which was irrelevant for the relatively short hospital stays considered in this study), we constructed an estimator of the mean costs over $[0, t]$ as $\int_0^t S(u \mid Z_0)\,dC_0(u \mid Z_0)$, where $C_0(t \mid Z_0) = E(C(t) \mid Z_0)$ and $C(t)$ is now interpreted as the potential cumulative cost up to time $t$. This estimator exploits the underlying relationship between total hospital cost and LOS. While consistency of the estimator is immediate, other distributional properties will follow as special cases of the properties developed in the more general set-up of Chapter 3.
TABLE 2.1: Estimates of mean cost at duration times by comorbidity and discharge status

                LOS Median   Mean Cost ($)    Mean Cost ($)
CCI   Status    (days)       at LOS Median    at Overall Follow-up
0-1   Alive     10           19,190           22,759
      Dead      10           31,569           35,138
2-3   Alive     12           21,651           26,697
      Dead      12           34,030           39,076
4 +   Alive     13           23,020           29,272
      Dead      13           35,399           41,651

Estimates for patients with age 65 at admission, ejection fraction 50+, no history of prior CABG, who underwent catheterization during their hospital stay.

FIGURE 2.1: Estimated mean cost at duration times by comorbidity (for survivors with age 65 at admission, ejection fraction 50+, no history of prior CABG, who underwent catheterization during their hospital stay) [mean cost (1000 $) against LOS (days), with separate curves for comorbidity 0-1, 2-3 and 4+]

CHAPTER 3

ESTIMATING MEDICAL COSTS IN LONGITUDINAL STUDIES

In studies of the natural history of diseases and in medical interventions with long periods of follow-up, clinical conditions define several health conditions or states. For example, in the progression of HIV disease [50-52] stages can be defined by the CD4 cell count. A level below 200 ($\times 10^6$/L) triggers aggressive treatment, and levels between 200 and 500, and above 500, have recognizable clinical interpretations. In these situations costs are incurred in each state and at transitions between states through resource use that will depend on treatment and patient attributes. Assessing the influence of these variables on costs is one of the objectives of this chapter.
Multistate Markov models, which have their theoretical origin in survival models, have become the standard for modeling health related outcomes, especially in studying disease progression in patients [53-58]. They are also a framework for cost-effectiveness and decision analyses (see p152-153, Gold et al. (1996)). By describing the event histories of patients as sojourns through different health states, Markov models provide a sufficient degree of flexibility to model the probabilistic mechanisms that underlie the transitions between these states. The sojourn times in the states and transitions between them are random. The Markov assumption restricts their dependency on the past information and entails the conditional independence of sojourn times given the states.

In Section 3.1 of this chapter we use a Markov model to describe the experience of patients in sustaining and changing states of health. Two types of costs are considered: costs incurred at transitions between health states and costs of sojourns in a health state. Then present values are computed by discounting all costs at a fixed rate. In Section 3.2 we provide estimators of the mean present value of these two types of costs incurred over a fixed duration. The estimators are obtained conditional on an initial state and a given covariate profile. Large sample properties of these estimators are presented in Section 3.3.

3.1 Model Description

3.1.1 A Markov Model for Describing Patient Health Histories

Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $\{X(t), t \in \mathcal{T}\}$ with $\mathcal{T} = [0, \tau]$, $\tau$ [...] $> 0$, so the expected transition cost at any $t > 0$ does not depend on the initial health state.

It is known that if $N$ is a counting process with intensity process $\lambda$, $M = N - \int \lambda$ and $H$ is locally bounded and predictable, then $M$ and $\int H\,dM$ are local square integrable martingales, with $E(M) = E(\int H\,dM) = 0$ (see Proposition II.4.1, p70, Andersen et al. (1993) [29]). Then, by assumption (A.0.1),

\[ MPV_{hj}^{(1)}(t \mid i, Z_0) = E\Big(\int_0^t e^{-rs} C_{hj}(s)\,\lambda_{hj}(s)\,ds \,\Big|\, X_0 = i, Z_0\Big) = E\Big(\int_0^t e^{-rs} C_{hj}(s)\,Y_h(s)\,\alpha_{hj0}(s)\exp(\beta_0' Z_{hj0})\,ds \,\Big|\, X_0 = i, Z_0\Big). \]

By Fubini's Theorem:

\[ MPV_{hj}^{(1)}(t \mid i, Z_0) = \int_0^t e^{-rs} E\big(C_{hj}(s)\,Y_h(s) \mid X(0) = i, Z_0\big)\,\alpha_{hj0}(s)\exp(\beta_0' Z_{hj0})\,ds. \]

We can write

\[ E\big(C_{hj}(s)\,Y_h(s) \mid X_0 = i, Z_0\big) = E\big(C_{hj}(s) \mid X_0 = i, X(s-) = h, Z_0\big)\,P\big(X(s-) = h \mid X_0 = i, Z_0\big). \]

By assumption (A.0.2), $MPV_{hj}^{(1)}(t \mid i, Z_0)$ has the form

\[ MPV_{hj}^{(1)}(t \mid i, Z_0) = \int_0^t e^{-rs} c_{hj}(s \mid Z_0)\,P_{ih}(0, s \mid Z_0)\,\alpha_{hj0}(s)\exp(\beta_0' Z_{hj0})\,ds, \tag{3.1} \]

where $c_{hj}(s \mid Z_0) = E(C_{hj}(s) \mid X(s-) = h, Z_0)$.

Estimation of the transition probabilities is described in Section 3.2.2 and the method of estimation of $c_{hj}(s \mid Z_0)$ is presented in Section 3.2.3.

We now turn to the cost of sojourns in a health state. Suppose that the cost in state $h$ is incurred at the rate $B_h(u)$ at time $u$. The observed rate is zero at time $u$ whenever, just before $u$, the patient is not in state $h$ anymore, so $[X(u-) = h] = 0$. Then the observed present value of all expenditures in state $h$, started at time $s$ and ended after the duration time $d$, is given by

\[ C_h^{(2)}(s, d) = \int_s^{s+d} e^{-ru} B_h(u)\,Y_h(u)\,du, \]

where $r$ is the discount rate and $Y_h(u) = [X(u-) = h]$.

Conditional on the initial state, given the vector $Z_0$ of basic covariates, the mean of this present value is

\[ MPV_h^{(2)}(s, d \mid i, Z_0) = E\big(C_h^{(2)}(s, d) \mid X_0 = i, Z_0\big) = \int_s^{s+d} e^{-ru} E\big(B_h(u)\,Y_h(u) \mid X_0 = i, Z_0\big)\,du. \]

Conditions similar to (A.0.1) and (A.0.2) are assumed for $B_h(\cdot)$:

A.0.3 $B_h(\cdot)$ are bounded, non-negative real stochastic processes over $[0, \tau]$, adapted to $(\mathcal{F}_t)$.

A.0.4 $E(B_h(u) \mid X_0 = i, X(u-) = h, Z_0) = E(B_h(u) \mid X(u-) = h, Z_0)$ for all $u \in [0, \tau]$.

Denote $b_h(u \mid Z_0) = E(B_h(u) \mid X(u-) = h, Z_0)$. We can write

\[ E\big(B_h(u)\,Y_h(u) \mid X_0 = i, Z_0\big) = E\big(B_h(u) \mid X(u-) = h, X_0 = i, Z_0\big)\,P\big(X(u-) = h \mid X_0 = i, Z_0\big). \]

By assumption (A.0.4):

\[ MPV_h^{(2)}(s, d \mid i, Z_0) = \int_s^{s+d} e^{-ru}\,b_h(u \mid Z_0)\,P_{ih}(0, u \mid Z_0)\,du. \tag{3.2} \]

The method of estimation of $b_h(\cdot \mid Z_0)$ is presented in Section 3.2.4.
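To make the structure of (3.2) concrete, the following Python sketch evaluates the discounted integral numerically once a cost rate and a state-occupation probability are available. It is a minimal illustration under assumptions, not the estimation procedure of Section 3.2: the function name `mpv_sojourn`, the trapezoidal rule and the constant inputs in the usage lines are all hypothetical choices, and `rate` and `occupancy` stand in for $b_h(\cdot \mid Z_0)$ and $P_{ih}(0, \cdot \mid Z_0)$.

```python
import math


def mpv_sojourn(s, d, r, rate, occupancy, n_steps=1000):
    """Trapezoidal approximation of the integral in (3.2).

    s, d      : start time and duration of the window [s, s + d]
    r         : discount rate
    rate      : callable u -> b_h(u | Z0), mean cost rate while in state h
    occupancy : callable u -> P_ih(0, u | Z0), probability of being in h at u
    (Hypothetical helper; in practice both callables would be fitted estimates.)
    """
    def integrand(u):
        return math.exp(-r * u) * rate(u) * occupancy(u)

    step = d / n_steps
    # Trapezoidal rule: half weight on the two end points, full weight inside.
    total = 0.5 * (integrand(s) + integrand(s + d))
    for k in range(1, n_steps):
        total += integrand(s + k * step)
    return total * step


# Illustrative call with invented constants: a cost rate of 100 per unit time,
# certain occupancy of the state, a 10-unit window and a 5% discount rate.
value = mpv_sojourn(0.0, 10.0, 0.05, lambda u: 100.0, lambda u: 1.0)
print(round(value, 2))
```

With constant rate and occupancy the integral has the closed form $100\,(1 - e^{-0.5})/0.05$, which gives a quick check that discounting is applied at calendar time $u$ rather than at time since entry into the window.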
3.2 Estimation of the Mean Transition Cost and Mean Sojourn Cost

3.2.1 Estimation of the Regression Parameters and Integrated Baseline Intensities

Consider $n$ individuals in the study. We assume that given the random vector $X_0 = (X_{10}, \ldots, X_{n0})$ and the random processes $Z(\cdot) = (Z_1(\cdot), \ldots, Z_n(\cdot))$, independent Markov processes $X_1(\cdot), \ldots, X_n(\cdot)$ are constructed, with $X_i(0) = X_{i0}$, $1 \le i \le n$. Each process $X_i(\cdot)$ has the same description as that of $X(\cdot)$ previously presented in Section 3.1. For the $i$-th individual, $Z_i(t)$ is the $p$-dimensional vector of covariates measured at time $t$ and $X_{i0}$ the initial state. A multivariate counting process $N_i = (N_{hji}, h \ne j)$ is defined from $X_i(\cdot)$:

\[ N_{hji}(t) = \#\{0 \le s \le t : X_i(s-) = h, X_i(s) = j\}, \quad h \ne j. \]

Let $\mathcal{F}^0 = \sigma(X_0)$, $\mathcal{F}^Z(t) = \sigma\{Z(s), s \le t\}$, $\mathcal{N}_i(t) = \sigma\{N_{hji}(s), s \le t, h \ne j\}$ and $\mathcal{F}_i(t) = \mathcal{F}^0 \vee \mathcal{F}^Z(t) \vee \mathcal{N}_i(t)$.

By Theorem II.6.8, p94, Andersen et al. (1993) [29], $N_i = (N_{hji}, h \ne j)$ is a multivariate counting process with its $\mathcal{F}_i(t)$-transition intensities

\[ \lambda_{hji}(t) = \alpha_{hji}(t)\,Y_{hi}(t), \]

where $Y_{hi}(t) = [X_i(t-) = h]$ and $\alpha_{hji}$ is the transition intensity from state $h$ to state $j$ for the Markov process $X_i(\cdot)$. We assume the transition intensities $\alpha_{hji}$ have the form

\[ \alpha_{hji}(t, \beta_0) = \alpha_{hj0}(t)\exp(\beta_0' Z_{hji}(t)), \quad t \in \mathcal{T}, \]

where the type-specific covariate vector $Z_{hji}(t)$ is computed from the vector $Z_i(t)$ of basic covariates for the $i$-th individual. This is the standard proportional hazard model.

In the following we present the construction and the large sample properties of the estimators $\hat\beta$ and $\hat A_{hj0}(\cdot, \hat\beta)$ of the true value of the $p$-dimensional regression parameter $\beta_0$ and of the integrated baseline intensity $A_{hj0}(t) = \int_0^t \alpha_{hj0}(u)\,du$, respectively. Most of this is the development of the Cox regression model from Andersen et al. (1993) [29]. The stated main results will be needed for the proofs presented in Section 3.3 on the mean present values.

Let $(\Omega^{(n)}, \mathcal{F}^{(n)}, P^{(n)})$ denote the product probability space and we define the filtration $\mathcal{F}(t) = \mathcal{F}^0 \vee \mathcal{F}^Z(t) \vee \{\mathcal{N}_1(t), \ldots, \mathcal{N}_n(t)\}$.
This filtration is the same as the one generated by the covariate vectors and all $n$ Markov processes. By the conditional independence of $X_i(\cdot)$ and by the product construction (see Section II.4.3, Andersen et al. (1993) [29]), the multivariate counting process $N = (N_{hji}; i \in \{1, \ldots, n\}, h, j \in \{1, \ldots, k\}, h \ne j)$ has the intensity process $(\lambda_{hji}; i \in \{1, \ldots, n\}, h, j \in \{1, \ldots, k\}, h \ne j)$ with respect to the combined filtration $(\mathcal{F}(t), t \in \mathcal{T})$.

Next, suppose the observation of $N_i = (N_{hji}, h \ne j)$ is ceased after some random time $U_i > 0$. We say the process $N_i$ is right censored at $U_i$. Define the censoring indicator process $C_i(t) = [U_i \ge t]$, the filtration $\mathcal{G}_i(t) = \mathcal{F}_i(t) \vee \sigma\{C_i(s), s \le t\}$ and the right censored counting process $N_i^c = (N_{hji}^c, h \ne j)$, where

\[ N_{hji}^c(t) = \int_0^t C_i(s)\,dN_{hji}(s) = \int_0^{t \wedge U_i} dN_{hji}(s). \]

The censoring process $C_i(\cdot)$ is $\mathcal{G}_i$-predictable. Assume that given $Z_i(\cdot)$, $U_i$ is independent of $X_i(\cdot)$. Then $N_i(\cdot)$ has the same compensator both with respect to $(\mathcal{F}_i(t), t \in \mathcal{T})$ and with respect to $(\mathcal{G}_i(t), t \in \mathcal{T})$. This is the assumption of independent right censoring, referred to in the sequel as "independent censoring". Each $N_{hji}$ with $h \ne j$ has the decomposition

\[ N_{hji}(t) = \Lambda_{hji}(t) + M_{hji}(t), \]

where $M_{hji}$ is a local square integrable martingale with respect to $(\mathcal{F}_i(t), t \in \mathcal{T})$. Then

\[ N_{hji}^c(t) = \int_0^t C_i(s)\,dN_{hji}(s) = \int_0^t C_i(s)\,d\Lambda_{hji}(s) + \int_0^t C_i(s)\,dM_{hji}(s) = \Lambda_{hji}^c(t) + M_{hji}^c(t). \]

By the predictability and boundedness of $C_i(\cdot)$, $M_{hji}^c$ is a local square integrable martingale with respect to $(\mathcal{G}_i(t), t \in \mathcal{T})$. Thus, under independent censoring, $N_{hji}^c$ has the $(\mathcal{G}_i(t), t \in \mathcal{T})$-compensator $\Lambda_{hji}^c(t) = \int_0^t C_i(s)\,d\Lambda_{hji}(s)$, so $N_i^c$ has the intensity process $\lambda_i^c = \{\lambda_{hji}^c, h \ne j\}$, where $\lambda_{hji}^c(t) = \alpha_{hji}(t)\,Y_{hi}^c(t)$ and $Y_{hi}^c(t) = C_i(t)\,Y_{hi}(t) = [X_i(t-) = h, U_i \ge t]$. Therefore $N_{hji}^c$ has the same "individual intensity" $\alpha_{hji}$ as the uncensored process and the proportional hazards assumption is preserved for $N_{hji}^c$.
Also 13,”, is interpreted as the predictable indicator process for the i-th individual being observed in state h just before time t. Forn independent processes N,".1 S i S n. N" = (Nf.ie {l,...,n}) has intensity process 11" = (Nf .i E {l,...n}) (see Section 11.4.3, Andersen et al. (1993)”). From now on the superscript “c” will be dropped and although not explicit in the notation. N h}, and Y,,, are derived from censored observations. The following standard notations. similar to those of Chapter 1. will be used. For h at j: 2,}... (1)” = z,,,(t)z,.,,(t)’, ifm = 2 = Zh,,(t) . ifm =1 =l.ifm=0; Sig-”0.13) = Z Y,.,(r)Z,.,,(t)®"' exP(/9'Zr.,-.-(t)). me 10.1.21 ; i=1 15., (r. 13) = $1.10. 131/313?) 0. fl) ; Vac. fl) = Sif’tt. .6) I sif’tt. 13) — a.e. a)“; 10.13) = Z £V,,(u.fl)d1v,,(u). with N,,, = ZN,,,; hit} i=1 s1?’(afl>= E[Y.. (02.1.0)” eater/72.40)]. me (0.1.21; 110 states. A.1 A.2 A.3 A.4 A.5 A.6 A.7 4.0.19) = s1? 0. 0) I sf,” (2. fl) ; 01.10 | 20) = Ahj0(t’B)exp(3IZth) . Z(t, p) = Z £v,,(u. fl) sgj?’(u. fl)a,,,o(u)du. haej Consider the vector 1’, = (Yh,.he {l,...,k}) . where 1.....k label all the health As in Chapter 1. similar assumptions will be adopted throughout this chapter: Model Assumptions: Conditional on Z,(.). U, is independent of X ,(.); (N,(.).Y,(.).Z,(.)).1SiSn are i.i.d.; For h¢jz A,,0(2') = fiahjoomr < co; Zh,,(.) are bounded; Zh,,(.) are adapted. left continuous with right hand limits processes (so Z,,,,(.) are predictable processes); P(Yh,(t) = 1.Vre [0,1]) > 0; by E, = 2(1.,BO) is positive definite. notation The form of the partial likelihood is functionally the same as in the case of the ordinary survival Cox proportional hazards model. Thus the log-partial likelihood evaluated at time t (see p483, Andersen et al. (1993)”) is: 111 n It €0.13) =2 2 £[,B'Z,,fl(u)—log sjf’(r.,6)]d1v,,,(u). i=1 h.j=l he j Since sgjlo. 13) is the vector of first partial derivatives of 5:9)0. 13) with respect to 19. the vector U (1, [3) of partial derivatives of C(t. 
,8) with respect to ,B is n k U(t,fl)=2 Z £[Zh,,(u)—Eh,(u.fl)]dN,,,,(u). i=1 h.j=l h¢j The maximum partial likelihood estimator B of 50 is defined as the solution of the likelihood equation U (1. 3) =0. For h at j we estimate A,,o(t) by the Nelson- Aalen estimator A J (u) Aorta = 1W0». n n where N,,, =ZN,,,, . J,,(u) =[Y,,(u) >0]. Y, =ZY,,. We use the convention %=0. Let i=1 [:1 01.1.0 (t, 3) = -Z 01.10 (t, ,3). Thus the matrix of integrated baseline intensities jaeh 140(1) = (A,,,o(t).h. je{1.....k}) is estimated by 1100,19) = (Ahjo(t,3).h. je {1,....k}). As we have also seen in Chapter 1. under our model assumptions A.l-A.7. the following conditions necessary for the asymptotic properties of our estimators are verified. We denote by H . II the supremum norm of a vector or a matrix. 112 Conditions C.a-C.f: 0 There exist a compact neighborhood 5’ of 130 . with .60 e .6’ (the interior of [5’ ). and scalar. p-vector and px p matrix functions 3g». 3,? and 3,3). h at j. defined on A(.|Zo)=(A,,,(.|Z0).h.je {l,...,k})such that for me {0.1.2} and h.je {l,...,k}. hat j: C.a sup (l were. rlxb’" $1.701 1?)— arm mil—>0 C.b sg-"K. .) are uniformly continuous bounded functions of (t. fl)e [0.r]x B ; C.c s,,)(.. .) is bounded away from zero; C.d ...,”(r fl)=—- s.?’(t3% 19) s‘j’rt fi)=— a}, 313W 13): C.c E, is positive definite; C.f I: a,,o(r)dr < co. Theorem 1 (see Theorem V11.2.l. p497. Andersen et al. (1993)”) Under the assumptions A. 1. A.2 and conditions C.a-Cf. the probability that the P equation U (2'. fl) = 0 has a unique solution 13 tends to one and ,3 —> ,60 as n -) oo . 1:1 The next theorem gives the asymptotic normality of B and an estimator of the asymptotic covariance: 113 Theorem 2 (see Theorem v11.2.2. p498. Andersen et al. (1993)”) Assume A.l. A.2. AA and C.a-Cf. Then n”2(/3 - [30) converges in distribution to a zero mean normal p-dimensional random vector with covariance matrix 2;! and _l P. . .. by _l P n 1(t,B)-z(t.fio)|]—>0. In particular 2. 
= n [(1.13)—92,0 notation sup te[o.r] The next theorem provides a description of the asymptotic joint distribution of the estimators A,,o(t..8).h.je {1..k} and B. First we need to state some definitions. We denote by (M ) and [M] the predictable and the optional variation process of a martingale M. respectively. Definition (see p83, Andersen et al. (1993)”): A continuous k-dimensional vector martingale M = (M (t). t e 7' ) . 7' = [0.1) . re [2 is called Gaussian if: i) (M ) = V . a continuous deterministic k xk positive semidefinite matrix valued function on 7' . with positive definite increments. zero at time zero; ii) M (t) -M (s) has a multivariate normal distribution with zero mean and covariance matrix V(t) -V(s) and is independent of (M (u). u S s) . for all 0 S S St in 7.13 114 Definition Two sequences of processes (X n,(.).n _>. 1) and (X "Z(.).n 21) are called asymptotic independent if (X M. X n2)(.) converges weakly to a process (X ,. X 2)(.) with X, (.) independent of X 2(-) .0 Recall that E = {l,...,k} denotes the state space of all the Markov processes. Let E“ ={(h.j),h.je {l,...,k}.h ¢ j}. Theorem 3 (see Theorem v0.2.3. p503. Andersen et al. (1993)”) Assume A.l. A.2. A4 and C.a-C.f. Then n” 2(13 — ,Bo) and the processes Wat.) = n"2(/1.,.(..B>—A..,-.(.))+amtfi-fl.)’[,e.,-(u.fi.)a.,-.(u)du . (h. De E‘are independent. Let WM (.) = -2 Wk, (.). jack The limiting distribution of the k x k matrix-valued process W(.) = (Wh, (.),h. je {l,...,k}) is that of a k xk matrix-valued process (150 =(Ug,.,(.).h. je {l,...,k}). where 115., =—ZU;,, and {U;,.,(.).(h. j)e E‘} is a jath continuous Gaussian vector martingale. with i) 05.,(01 = 0. ii) (U;,,.Ugm,) =0 for (h. j) a: (m.r).(h. j).(m.r)e E‘. t by a ‘ (u) . = 2. = MO 111) (Uoh,>(t) (01., 0)me Emilu .0 115 Notes: 1) The sequence of vector processes W (.).(h. j)e Et weakly converges to hr {115,}. (.).(h. j)e E'} in D[0.r]"(""). the space of R"”"”-valued right-continuous with left-hand limits functions on [0,T] . 
endowed with the Skorohod topology. 2) Relation ii) implies that the processes $U^*_{0hj}(\cdot)$, $(h,j)\in E^*$, are independent. 3) For $s,t\in[0,\tau]$ and $(h,j)\in E^*$ we have that $\mathrm{Cov}(U^*_{0hj}(s),U^*_{0hj}(t))=\omega^2_{hj}(s\wedge t)$, where $\omega^2_{hj}$ is defined in iii). □

3.2.2 Estimation of the Transition Probabilities

The matrix of transition probabilities $P(s,t\mid Z_0)=(P_{hj}(s,t\mid Z_0),\ h,j\in\{1,\dots,k\})$ for individuals with given fixed basic covariates $Z_0$ and corresponding type-specific covariates $Z_{hj0}$ is defined as the product integral

$$P(s,t\mid Z_0)=\prod_{(s,t]}\left(I+dA(u\mid Z_0)\right),\quad\text{for } s\le t,\ s,t\in[0,\tau],$$

where the matrix of integrated intensity functions $A(\cdot\mid Z_0)=(A_{hj}(\cdot\mid Z_0),\ h,j\in\{1,\dots,k\})$ has elements $A_{hj}(t\mid Z_0)=A_{hj0}(t)\exp(\beta_0'Z_{hj0})$ for $h\neq j$ and $A_{hh}(\cdot\mid Z_0)=-\sum_{j\neq h}A_{hj}(\cdot\mid Z_0)$. For a review of the definition and properties of the product integration, see Section II.6 of Andersen et al. (1993).$^{29}$

We consider the estimators $\hat A_{hj}(t\mid Z_0)=\hat A_{hj0}(t,\hat\beta)\exp(\hat\beta'Z_{hj0})$ for $h\neq j$, $\hat A_{hh}(\cdot\mid Z_0)=-\sum_{j\neq h}\hat A_{hj}(\cdot\mid Z_0)$ and the matrix $\hat A(\cdot\mid Z_0)=(\hat A_{hj}(\cdot\mid Z_0),\ h,j\in\{1,\dots,k\})$. Then the matrix of transition probabilities $P(s,t\mid Z_0)$ is estimated by the product integral

$$\hat P(s,t\mid Z_0)=\prod_{(s,t]}\left(I+d\hat A(u\mid Z_0)\right),$$

this estimate being meaningful as long as $\Delta\hat A_{hh}(u\mid Z_0)\ge-1$ on $(s,t]$. A jump process $\Delta X$ is defined by $\Delta X(t)=X(t)-X(t-)$.

Next we state and prove the asymptotic properties of $\hat P(s,\cdot\mid Z_0)$ for a given $s\in[0,\tau)$. We provide the details of the proofs because some results and intermediate steps, such as representations or expansions of certain entities, will be used in Section 3.3. Even though we follow the development offered by Andersen et al. (1993)$^{29}$ on p521-516, clear statements and details of needed results were not provided in any reference.

Let $(h,j)\in E^*$. First we want to show that

$$n^{1/2}\left(\hat A_{hj}(\cdot\mid Z_0)-A_{hj}(\cdot\mid Z_0)\right)\ \text{is asymptotically equivalent to}\ X^n_{1hj}(\cdot)+X^n_{2hj}(\cdot),\tag{3.3}$$

where

$$X^n_{1hj}(t)=\exp(\beta_0'Z_{hj0})\,n^{1/2}(\hat\beta-\beta_0)'\int_0^t\left(Z_{hj0}-e_{hj}(u,\beta_0)\right)\alpha_{hj0}(u)\,du$$

and $X^n_{2hj}(t)=\exp(\beta_0'Z_{hj0})\,\tilde W_{hj}(t)$.
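The product-integral estimator of Section 3.2.2 can be illustrated with a minimal sketch. It assumes that the estimated integrated intensities are step functions with distinct jump times, so that the product integral reduces to a finite ordered product of matrices $I+\Delta\hat A(u)$. The function names and the toy jump sizes below are hypothetical and serve only as an illustration, not as the computation used in this work.

```python
def identity(k):
    """k x k identity matrix as a list of lists."""
    return [[1.0 if a == b else 0.0 for b in range(k)] for a in range(k)]


def mat_mul(A, B):
    """Product of two square matrices of equal size."""
    k = len(A)
    return [[sum(A[a][c] * B[c][b] for c in range(k)) for b in range(k)]
            for a in range(k)]


def product_integral(jumps, k):
    """Finite product of (I + dA(u)) over the ordered jump times in (s, t].

    jumps: list of (h, j, size) ordered in time, where `size` is the jump of
    the estimated integrated intensity of an h -> j transition (h != j).
    Assumption: one transition type jumps at each time point.
    """
    P = identity(k)
    for h, j, size in jumps:
        step = identity(k)
        step[h][j] += size   # off-diagonal increment dA_hj(u)
        step[h][h] -= size   # diagonal increment dA_hh(u) = -sum_j dA_hj(u)
        P = mat_mul(P, step)
    return P


# Toy three-state example (states 0, 1, 2) with two hypothetical jumps.
jumps = [(0, 1, 0.2), (1, 2, 0.5)]
P = product_integral(jumps, k=3)
for row in P:
    print([round(v, 4) for v in row])
```

Each factor has rows summing to one, so the product is a stochastic matrix as long as every diagonal jump satisfies $\Delta\hat A_{hh}\ge-1$, which is the condition stated above.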
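The proof of (3.3) is very similar to the one of Theorem VII.2.3, Andersen et al. (1993).$^{29}$ The process $\tilde W_{hj}(\cdot)$ defined as

$$\tilde W_{hj}(t)=n^{1/2}\int_0^t\frac{J_h(u)}{S^{(0)}_{hj}(u,\beta_0)}\,dM_{hj}(u)$$

is asymptotically equivalent to the process $W_{hj}(\cdot)$ defined in Theorem 3. "Asymptotic equivalence" means convergence in probability to zero of the supremum norm of the difference. We use the expansion:

$$n^{1/2}\left(\hat A_{hj0}(t,\hat\beta)\exp(\hat\beta'Z_{hj0})-A_{hj0}(t)\exp(\beta_0'Z_{hj0})\right)=$$
$$=n^{1/2}\int_0^tJ_h(s)\left\{\exp(\hat\beta'Z_{hj0})S^{(0)}_{hj}(s,\hat\beta)^{-1}-\exp(\beta_0'Z_{hj0})S^{(0)}_{hj}(s,\beta_0)^{-1}\right\}dN_{hj}(s)\,+$$
$$+\exp(\beta_0'Z_{hj0})\,n^{1/2}\int_0^tJ_h(s)\left\{S^{(0)}_{hj}(s,\beta_0)^{-1}dN_{hj}(s)-\alpha_{hj0}(s)\,ds\right\}+$$
$$+\exp(\beta_0'Z_{hj0})\,n^{1/2}\int_0^t(J_h(s)-1)\,\alpha_{hj0}(s)\,ds.$$

The third term above converges in probability to zero and the second term is $X^n_{2hj}(t)$. By Taylor expansion around $\beta_0$, the first term equals

$$\exp(\beta^{*\prime}Z_{hj0})\,n^{1/2}(\hat\beta-\beta_0)'\int_0^tJ_h(s)\left(Z_{hj0}-E_{hj}(s,\beta^*)\right)S^{(0)}_{hj}(s,\beta^*)^{-1}dN_{hj}(s),$$

with $\beta^*$ on the line segment between $\hat\beta$ and $\beta_0$. It can be shown that

$$\sup_{t\in[0,\tau]}\left\|\int_0^tJ_h(s)\left(Z_{hj0}-E_{hj}(s,\beta^*)\right)S^{(0)}_{hj}(s,\beta^*)^{-1}dN_{hj}(s)-\int_0^t\left(Z_{hj0}-e_{hj}(u,\beta_0)\right)\alpha_{hj0}(u)\,du\right\|\xrightarrow{P}0$$

for any $\beta^*\xrightarrow{P}\beta_0$. Then the first term of the expanded $n^{1/2}(\hat A_{hj}(t\mid Z_0)-A_{hj}(t\mid Z_0))$ is asymptotically equivalent to $X^n_{1hj}(\cdot)$, so (3.3) follows.

Then $n^{1/2}(\hat A_{hh}(\cdot\mid Z_0)-A_{hh}(\cdot\mid Z_0))=-n^{1/2}\left(\sum_{j\neq h}\hat A_{hj}(\cdot\mid Z_0)-\sum_{j\neq h}A_{hj}(\cdot\mid Z_0)\right)$ is asymptotically equivalent to $-\sum_{j\neq h}X^n_{1hj}(\cdot)-\sum_{j\neq h}X^n_{2hj}(\cdot)$, which we denote by $X^n_{1hh}(\cdot)+X^n_{2hh}(\cdot)$. Let $X^n_m(\cdot)=(X^n_{mhj}(\cdot),\ h,j\in\{1,\dots,k\})$, $m\in\{1,2\}$. Thus

$$n^{1/2}\left(\hat A(\cdot\mid Z_0)-A(\cdot\mid Z_0)\right)\ \text{is asymptotically equivalent to}\ X^n_1(\cdot)+X^n_2(\cdot).\tag{3.4}$$

Now we will show the uniform consistency of the estimator $\hat P(s,\cdot\mid Z_0)$ to $P(s,\cdot\mid Z_0)$. From (3.3) we have that for $(h,j)\in E^*$, $\hat A_{hj}(\cdot\mid Z_0)-A_{hj}(\cdot\mid Z_0)$ is asymptotically equivalent to $B^n_{hj}(\cdot)=B^n_{1hj}(\cdot)+B^n_{2hj}(\cdot)=n^{-1/2}\left(X^n_{1hj}(\cdot)+X^n_{2hj}(\cdot)\right)$. We have that

$$B^n_{1hj}(t)=\exp(\beta_0'Z_{hj0})(\hat\beta-\beta_0)'\int_0^t\left(Z_{hj0}-e_{hj}(u,\beta_0)\right)\alpha_{hj0}(u)\,du,\qquad B^n_{2hj}(t)=\exp(\beta_0'Z_{hj0})\int_0^t\frac{J_h(u)}{S^{(0)}_{hj}(u,\beta_0)}\,dM_{hj}(u).$$

By the boundedness of $Z_{hj0}$ and $e_{hj}(\cdot,\beta_0)$, assumption A.3: $\int_0^\tau\alpha_{hj0}(t)\,dt<\infty$, and the consistency of $\hat\beta$, we obtain $\sup_{t\in[0,\tau]}|B^n_{1hj}(t)|\xrightarrow{P}0$. By Lenglart's Inequality for local square integrable martingales (see p86,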
Andersen et al. (1993)$^{29}$), for every $\eta,\delta>0$:

$$P\left(\sup_{t\in[0,\tau]}\left|\int_0^t\frac{J_h(u)}{S^{(0)}_{hj}(u,\beta_0)}\,dM_{hj}(u)\right|>\eta\right)\le\frac{\delta}{\eta^2}+P\left(\int_0^\tau\frac{J_h(u)}{S^{(0)}_{hj}(u,\beta_0)^2}\,S^{(0)}_{hj}(u,\beta_0)\,\alpha_{hj0}(u)\,du>\delta\right)$$

and

$$n^{-1}\int_0^\tau\frac{J_h(u)}{n^{-1}S^{(0)}_{hj}(u,\beta_0)}\,\alpha_{hj0}(u)\,du\xrightarrow{P}0.$$

Thus $\sup_{t\in[0,\tau]}|B^n_{2hj}(t)|\xrightarrow{P}0$. Therefore we have proved $\sup_{t\in[0,\tau]}|B^n_{hj}(t)|\xrightarrow{P}0$, from which

$$\sup_{t\in[0,\tau]}\left\|\hat A(t\mid Z_0)-A(t\mid Z_0)\right\|\xrightarrow{P}0,$$

where $\|\cdot\|$ is the supremum norm.

Consider a fixed $s\in[0,\tau)$. By the continuity of the product integral (see Appendix B), the previous uniform convergence implies the uniform consistency of the estimator $\hat P(s,\cdot\mid Z_0)$ to $P(s,\cdot\mid Z_0)$:

$$\sup_{t\in[0,\tau]}\left\|\hat P(s,t\mid Z_0)-P(s,t\mid Z_0)\right\|\xrightarrow{P}0.$$

In the following we will describe the asymptotic distribution of $n^{1/2}(\hat P(s,\cdot\mid Z_0)-P(s,\cdot\mid Z_0))$. In (3.4) we have seen that $n^{1/2}(\hat A(\cdot\mid Z_0)-A(\cdot\mid Z_0))$ was asymptotically equivalent to $X^n_1(\cdot)+X^n_2(\cdot)$. By Theorem 2, $n^{1/2}(\hat\beta-\beta_0)$ converges in distribution to a zero mean normal distributed $p$-dimensional random vector $\xi$, with covariance matrix $\Sigma_\tau^{-1}$. Then $X^n_1(\cdot)$ converges weakly to a $k\times k$ matrix-valued process

$$U^*_1(\cdot\mid Z_0)=\left(U^*_{1hj}(\cdot\mid Z_0),\ h,j\in\{1,\dots,k\}\right),\tag{3.5}$$

where $U^*_{1hj}(\cdot\mid Z_0)=\xi'W^*_{hj}(\cdot\mid Z_0)$, with

$$W^*_{hj}(t\mid Z_0)=\exp(\beta_0'Z_{hj0})\int_0^t\left(Z_{hj0}-e_{hj}(u,\beta_0)\right)\alpha_{hj0}(u)\,du\quad\text{for } h\neq j$$

and $W^*_{hh}(\cdot\mid Z_0)=-\sum_{j\neq h}W^*_{hj}(\cdot\mid Z_0)$.

The process $\tilde W_{hj}(\cdot)$ is asymptotically equivalent to the process $W_{hj}(\cdot)$ described in Theorem 3. Then $X^n_2(\cdot)$ converges weakly to a $k\times k$ matrix-valued process

$$U^*_2(\cdot\mid Z_0)=\left(U^*_{2hj}(\cdot\mid Z_0),\ h,j\in\{1,\dots,k\}\right),\tag{3.6}$$

where $U^*_{2hj}(t\mid Z_0)=\exp(\beta_0'Z_{hj0})\,U^*_{0hj}(t)$ and the process $U^*_0(\cdot)$ is defined in the statement of Theorem 3.

Theorem 3 also implies that $X^n_1(\cdot)$ and $X^n_2(\cdot)$ are asymptotically independent. Then, by (3.5) and (3.6), $n^{1/2}(\hat A(\cdot\mid Z_0)-A(\cdot\mid Z_0))$ converges weakly (in the Skorohod sense) to $U^*(\cdot\mid Z_0)=U^*_1(\cdot\mid Z_0)+U^*_2(\cdot\mid Z_0)$, where the processes $U^*_1(\cdot\mid Z_0)$ and $U^*_2(\cdot\mid Z_0)$ are independent, with continuous sample paths. Thus $n^{1/2}(\hat A(\cdot\mid Z_0)-A(\cdot\mid Z_0))$ converges weakly to $U^*(\cdot\mid Z_0)$ in the supremum norm sense (see Appendix B).
By the compact differentiability of the product integral and the Functional Delta Method (see Appendix B), $n^{1/2}(\hat P(s,\cdot\mid Z_0)-P(s,\cdot\mid Z_0))$ converges weakly to

$$U(s,\cdot\mid Z_0)=\int_s^{\cdot}P(s,u\mid Z_0)\,dU^*(u\mid Z_0)\,P(u,\cdot\mid Z_0)=U_1(s,\cdot\mid Z_0)+U_2(s,\cdot\mid Z_0),\tag{3.7}$$

where $U_1(s,\cdot\mid Z_0)$ and $U_2(s,\cdot\mid Z_0)$ are independent,

$$U_m(s,\cdot\mid Z_0)=\int_s^{\cdot}P(s,u\mid Z_0)\,dU^*_m(u\mid Z_0)\,P(u,\cdot\mid Z_0),\quad m\in\{1,2\}.$$

We can write

$$U_m(s,t\mid Z_0)_{hj}=\sum_{g=1}^{k}\sum_{l\neq g}\int_s^tP_{hg}(s,u\mid Z_0)\,dU^*_{mgl}(u\mid Z_0)\,P_{lj}(u,t\mid Z_0)+\sum_{g=1}^{k}\int_s^tP_{hg}(s,u\mid Z_0)\Big(-\sum_{l\neq g}dU^*_{mgl}(u\mid Z_0)\Big)P_{gj}(u,t\mid Z_0)\ \dots$$

… $\hat\beta_{MLE}(\alpha)=\left(\sum_{i=1}^{n}X_i'V_i^{-1}(\alpha)X_i\right)^{-1}\sum_{i=1}^{n}X_i'V_i^{-1}(\alpha)C_i$ and follows a multivariate normal distribution with mean $\beta$ and covariance matrix $\mathrm{cov}(\hat\beta_{MLE}(\alpha))=\left(\sum_{i=1}^{n}X_i'V_i^{-1}(\alpha)X_i\right)^{-1}$.$^{65}$ Then the parameter $\alpha$ is estimated by its maximum likelihood (ML) or restricted maximum likelihood (REML) estimator.

Linear mixed models often contain many fixed effects and in such cases it might be important for the variance component estimation to explicitly take into account the loss of degrees of freedom involved in estimating the fixed effects. This can be done via restricted maximum likelihood estimation. The REML estimator for the variance components $\alpha$ is obtained from maximizing the likelihood function of error contrasts $U=A'Y$, where $Y=(Y_1',\dots,Y_n')'$ and $A$ is a $(n\times(n-p))$ full-rank matrix with columns orthogonal to the columns of $X$, the matrix obtained from stacking the matrices $X_i$ underneath each other. This likelihood can be written as:

$$L_{REML}(\alpha)=C\left|\sum_{i=1}^{n}X_i'V_i^{-1}(\alpha)X_i\right|^{-1/2}\times L_{ML}(\hat\beta_{MLE}(\alpha),\alpha),$$

where $C$ is a constant which does not depend on $\alpha$, so the resulting REML estimator does not depend on the error contrasts (i.e. on the choice of $A$). See p43-47, Verbeke and Molenberghs (2000)$^{62}$ or Diggle, Liang and Zeger (1994)$^{63}$ for reviews of the REML results and comparisons between ML and REML estimators.

Let $\hat\alpha$ denote the ML or REML estimator of $\alpha$ and $\hat V_i$ the estimator of $V_i$ obtained by replacing the variance components $\alpha$ from $D$ and $\Sigma_i$ by $\hat\alpha$.
We will then estimate $\hat\beta_{MLE}(\alpha)$ by

$$\hat\beta=\left(\sum_{i=1}^{n}X_i'\hat V_i^{-1}X_i\right)^{-1}\sum_{i=1}^{n}X_i'\hat V_i^{-1}C_i\tag{3.14}$$

and $\mathrm{cov}(\hat\beta_{MLE}(\alpha))$ by $\widehat{\mathrm{cov}}(\hat\beta)=\left(\sum_{i=1}^{n}X_i'\hat V_i^{-1}X_i\right)^{-1}$.

It follows from classical likelihood theory (see for example Chapter 9, Cox and Hinkley (1990)) that under some regularity conditions the REML estimator $\hat\alpha$ is consistent and its distribution can be well approximated by a normal distribution with mean vector $\alpha$ and covariance matrix given by the inverse of the Fisher information matrix. Given $\alpha$, suppose there exists the asymptotic matrix

$$\Sigma(\alpha)=\lim_{n\to\infty}\frac1n\sum_{i=1}^{n}X_i'V_i^{-1}(\alpha)X_i.$$

Then $n^{1/2}(\hat\beta_{MLE}(\alpha)-\beta)$ converges weakly to a multivariate normal distribution $MVN(0,\Sigma(\alpha)^{-1})$. In many situations (i.e. different covariance structures) the consistency of $\hat\alpha$ implies the consistency of our estimator $\hat\beta=\hat\beta_{MLE}(\hat\alpha)$ (called the feasible generalized least squares estimator in the econometrics literature) and also implies that $n^{1/2}(\hat\beta-\beta)$ and $n^{1/2}(\hat\beta_{MLE}(\alpha)-\beta)$ have the same asymptotic distribution $MVN(0,\Sigma(\alpha)^{-1})$. Amemiya (1985)$^{67}$ (see p186-222) provides detailed proofs of these results for several models with economic interpretation: 1) serial correlation; 2) seemingly unrelated regression models; 3) heteroscedasticity; 4) error components model and 5) random coefficients model. We assume that the above stated results hold in our situation. Thus

$$\hat\beta\xrightarrow{P}\beta\quad\text{and}\quad n^{1/2}(\hat\beta-\beta)\xrightarrow{D}\zeta_1,\ \text{where}\ \zeta_1\sim MVN(0,\Sigma(\alpha)^{-1}).\tag{3.15}$$

Now we need to interpret $c_{hj}(s\mid Z_0)=E(C_{hj}(s)\mid X(s-)=h,Z_0)$ and get a suitable estimator. For ease of explanation we will use the quadratic time model described previously as an example throughout this section. Recall that for transitions of the $h$ to $j$ type we recorded $(C_{il},\ l\in\{1,\dots,n_i\})$, the costs incurred at the $i$-th individual's transition times $(t_{il},\ l\in\{1,\dots,n_i\})$.
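The estimator (3.14) and its covariance can be sketched numerically as follows. This is only an illustration under stated assumptions: the covariance matrices passed in play the role of the $\hat V_i$ (assumed symmetric and already estimated, for instance by REML in a mixed-model routine), and the function name `gls_estimate` and the toy data are hypothetical.

```python
import numpy as np


def gls_estimate(X_list, V_list, C_list):
    """Feasible GLS estimate in the form of (3.14) and its covariance.

    X_list[i]: (n_i x p) design matrix of subject i
    V_list[i]: (n_i x n_i) symmetric estimated covariance of subject i's costs
    C_list[i]: length-n_i vector of subject i's observed costs
    """
    p = X_list[0].shape[1]
    info = np.zeros((p, p))   # sum_i X_i' V_i^{-1} X_i
    score = np.zeros(p)       # sum_i X_i' V_i^{-1} C_i
    for X, V, C in zip(X_list, V_list, C_list):
        VinvX = np.linalg.solve(V, X)   # V_i^{-1} X_i without forming V_i^{-1}
        info += X.T @ VinvX
        score += VinvX.T @ C            # equals X_i' V_i^{-1} C_i for symmetric V_i
    cov = np.linalg.inv(info)
    return cov @ score, cov


# Toy data: two subjects, intercept plus time, costs exactly linear in time.
beta_true = np.array([1.0, 0.5])
X_list = [np.array([[1.0, 0.0], [1.0, 1.0]]),
          np.array([[1.0, 2.0], [1.0, 3.0]])]
V_list = [np.array([[2.0, 1.0], [1.0, 2.0]])] * 2   # compound-symmetry example
C_list = [X @ beta_true for X in X_list]

beta_hat, cov_hat = gls_estimate(X_list, V_list, C_list)
print(beta_hat)
```

Because the toy costs are generated without noise, the estimate reproduces the coefficients used to build them whatever covariance matrices are supplied; with real data the choice of $\hat V_i$ changes both the estimate and its covariance.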
Based on our example model (3.13), the expected cost for the $i$-th individual with subject level covariates $F_{1i},\dots,F_{ri}$ at his/her $h$ to $j$ transition time $t_{il}$ is

$$E(C_{il})=(\beta_1F_{1i}+\beta_2F_{2i}+\dots+\beta_rF_{ri})+(\beta_{r+1}F_{1i}+\beta_{r+2}F_{2i}+\dots+\beta_{2r}F_{ri})\,t_{il}+(\beta_{2r+1}F_{1i}+\beta_{2r+2}F_{2i}+\dots+\beta_{3r}F_{ri})\,t_{il}^2=$$
$$=\beta_{1,r}'F_i+\beta_{r+1,2r}'F_i\,t_{il}+\beta_{2r+1,3r}'F_i\,t_{il}^2.\tag{3.16}$$

In practice, after the covariate selection procedure, the model (3.13) might be reduced so some of the $3r$ regression coefficients in (3.16) might be zero.

For an individual with given subject covariates $Z_0$, $c_{hj}(s\mid Z_0)$ is the expected cost of the $h$ to $j$ transition at time $s$. Thus we assume

$$c_{hj}(s\mid Z_0)=\beta_{1,r}'Z_0^c+\beta_{r+1,2r}'Z_0^c\,s+\beta_{2r+1,3r}'Z_0^c\,s^2,$$

where $Z_0^c$ is the $r$-dimensional covariate vector obtained by substituting $F_{1i},\dots,F_{ri}$ by their corresponding covariates in $Z_0$. Replacing the fixed effect regression parameter vector $\beta=(\beta_{1,r}',\beta_{r+1,2r}',\beta_{2r+1,3r}')'$ by its estimator given in (3.14), we estimate $c_{hj}(s\mid Z_0)$ by

$$\hat c_{hj}(s\mid Z_0)=\hat\beta_{1,r}'Z_0^c+\hat\beta_{r+1,2r}'Z_0^c\,s+\hat\beta_{2r+1,3r}'Z_0^c\,s^2.\tag{3.17}$$

Denote $Z^c_{01}(s)=\left((Z_0^c)',\,s(Z_0^c)',\,s^2(Z_0^c)'\right)'$, so we can write $\hat c_{hj}(s\mid Z_0)=\hat\beta'Z^c_{01}(s)$. The asymptotic properties (3.15) of the estimator $\hat\beta$ imply that

$$\sup_{s\in[0,\tau]}\left|\hat c_{hj}(s\mid Z_0)-c_{hj}(s\mid Z_0)\right|\xrightarrow{P}0\quad\text{and}$$
$$n^{1/2}\left(\hat c_{hj}(\cdot\mid Z_0)-c_{hj}(\cdot\mid Z_0)\right)\xrightarrow{D}c_{hj0}(\cdot\mid Z_0),\ \text{where}\ c_{hj0}(s\mid Z_0)\overset{D}{=}\zeta_1'Z^c_{01}(s).\tag{3.18}$$

Therefore the process $c_{hj0}(\cdot\mid Z_0)$ is Gaussian, with mean zero and $\mathrm{cov}(c_{hj0}(s\mid Z_0),c_{hj0}(t\mid Z_0))=Z^c_{01}(s)'\,\Sigma(\alpha)^{-1}Z^c_{01}(t)$. The matrix $\Sigma(\alpha)^{-1}$ is consistently estimated by $\hat\Sigma(\hat\alpha)^{-1}$.

Comments

Suppose there are some patients that do not incur any $h$ to $j$ transition costs. One possible approach for this situation is the use of a two-part model.$^{6,68,69}$ For each individual $i$ in the sample we observe the binary variable $\delta_{C_i}$ that indicates if the subject incurred any $h$ to $j$ transition expenses.
Then we assume that P(6c, =1) = 7t,(a) is governed by a parametric binary probability model (part one) and we consider the mixed-effects model (part two) C, = X,,6+Z,b, +s,, where all the assumed error and random effects distributions are conditional on the realization of the event (6,, = 1) . Then E(C,) = P(6c, =1)E(C, I6}, =1)= 7r,(a)X,fl. The component 7t,(a) can be specified through either a logit model: exp(a/Z,) ora robit model It. a ==a1+2f,;',(flii) E[X(trt)=5l+fl21‘} t, +1921”:- :5. le {l,...,n, }. Hence we assume c,,(s | 20) = 3,25 + (13;) z; +(fl3j) z; + 1932,} s + figzg s2 and we estimate it by replacing the fixed-effect regression parameter vector ,6 with its estimator given in (3.14). This strategy of utilizing the entire sample to estimate simultaneously all (Cry-(- | Z0),h ¢ j) has the advantage of drawing strength from other parts of the data set, when some patients do not have observed transitions of a specific type. Its limitation is that, when considering all types of transition costs, it might be difficult to distinguish any pattern in time that approximates the individual transition cost profiles. :1 131 3.2.4 Estimation of Mean Sojourn Rate For a given fixed covariate vector Z0 , we defined in Section 3.1.2 b,,(u | 20) = E(Bh (u) | X(u—) = h,Zo) as the expected expense rate at time u for sojoums in state h. This quantity is never truly observed unless we have a very fine time scale on which the accumulating cost history is observed. For example, daily or weekly costs incurred while sojourning in a state might provide in same applications an adequate representation of the rate of expenditures. We assume we do not have a detailed cost history and observation is restricted to the total expenditures for each sojourn, together with the time of entry and duration of the sojoums. The cumulative expected expense of a sojourn in state h with entry time s, after . +d . . duration d is C,,(s,d) = r b,, (u | Zo)du . 
We assume the rate of accumulating costs in a sojourn does not depend on the entry time in that sojourn.

For the $i$-th individual, all observed total costs incurred in sojourns in state $h$ are collected into a single vector $C_i=(C_{i1},\dots,C_{in_i})'$. For $l\in\{1,\dots,n_i\}$, $C_{il}$ is the total cost (up to transition in another state or up to censoring) of the $l$-th sojourn in state $h$ that had entry time $s_{il}$ and duration $d_{il}$. We assume the vector $C_i$ contains at least one cost value, for all $i$. The comments from the end of the previous section apply for the situation when this assumption does not hold.

Our approach is similar to our use of linear mixed models to derive estimates of the expected transition costs. We use the same two-stage analysis and we also make the assumption that the individual sojourn cumulative profile can be well approximated by a polynomial function of the duration of the sojourn. As an example we consider again a quadratic curve model. For the first stage model, let

$$C_{il}=\beta_{1i}+\beta_{2i}I^c_{il}+\beta_{3i}s_{il}+\beta_{4i}d_{il}+\beta_{5i}d_{il}^2+\varepsilon_{il},\tag{3.19}$$

where $I^c_{il}$ is the indicator that the $l$-th sojourn in state $h$ of the $i$-th individual is completely observed. The second stage model is similar to (3.12):

$$\beta_{ki}=\beta_{(k-1)r+1,kr}'F_i+b_{ki},\quad k\in\{1,2,3,4,5\}.\tag{3.20}$$

We use the notations from the previous section. Let $\beta=(\beta_{1,r}',\beta_{r+1,2r}',\dots,\beta_{4r+1,5r}')'$, $b_i=(b_{1i},\dots,b_{5i})'$. Replacing (3.20) in (3.19) we obtain the following final model:

$$C_{il}=\beta_{1,r}'F_i+\beta_{r+1,2r}'F_i\,I^c_{il}+\beta_{2r+1,3r}'F_i\,s_{il}+\beta_{3r+1,4r}'F_i\,d_{il}+\beta_{4r+1,5r}'F_i\,d_{il}^2+b_{1i}+b_{2i}I^c_{il}+b_{3i}s_{il}+b_{4i}d_{il}+b_{5i}d_{il}^2+\varepsilon_{il}.\tag{3.21}$$

Note: In practice, after the model selection procedure, some of the regression coefficients might be zero. □

Based on the model (3.21), an estimator $\hat\beta$ analogous to (3.14) can be calculated. For an individual $i$ with an observed $l$-th sojourn in state $h$ with entry time $s_{il}$ and duration $d_{il}$, the expected cost is

$$E(C_{il})=\beta_{1,r}'F_i+\beta_{r+1,2r}'F_i+\beta_{2r+1,3r}'F_i\,s_{il}+\beta_{3r+1,4r}'F_i\,d_{il}+\beta_{4r+1,5r}'F_i\,d_{il}^2=\beta'\left(F_i',\,F_i',\,s_{il}F_i',\,d_{il}F_i',\,d_{il}^2F_i'\right)'.$$

Accordingly, for an individual with given subject covariates $Z_0$, we assume that

$$C_h(s,d)=\beta'\left((Z_0^c)',\,(Z_0^c)',\,s(Z_0^c)',\,d(Z_0^c)',\,d^2(Z_0^c)'\right)'$$

and we estimate $C_h(s,d)$ by

$$\hat C_h(s,d)=\hat\beta'\left((Z_0^c)',\,(Z_0^c)',\,s(Z_0^c)',\,d(Z_0^c)',\,d^2(Z_0^c)'\right)',$$

where, as defined in the previous section, $Z_0^c$ is the $r$-dimensional covariate vector obtained on substituting the elements of $F_i$ by their corresponding covariates in $Z_0$.

The rate $b_h(\cdot\mid Z_0)$ was assumed not to depend on the entry time $s$. Therefore we estimate it by

$$\hat b_h(u\mid Z_0)=\frac{\partial}{\partial d}\hat C_h(0,d)\Big|_{d=u}=\hat\beta'Z^c_{02}(u),\tag{3.22}$$

where $Z^c_{02}(u)=\left[0',\,0',\,0',\,(Z_0^c)',\,2u(Z_0^c)'\right]'$; the factor $2u$ comes from differentiating the quadratic term $d^2$. As described in the previous section, under some regularity conditions

$$\sup_{u\in[0,\tau]}\left|\hat b_h(u\mid Z_0)-b_h(u\mid Z_0)\right|\xrightarrow{P}0\quad\text{and}$$
$$n^{1/2}\left(\hat b_h(\cdot\mid Z_0)-b_h(\cdot\mid Z_0)\right)\xrightarrow{D}b_{h0}(\cdot\mid Z_0),\ \text{with}\ b_{h0}(u\mid Z_0)\overset{D}{=}\zeta_2'Z^c_{02}(u),\tag{3.23}$$

where $n^{1/2}(\hat\beta-\beta)\xrightarrow{D}\zeta_2$, $\zeta_2\sim MVN(0,\Sigma(\alpha)^{-1})$ and $\Sigma(\alpha)=\lim_{n\to\infty}\frac1n\sum_{i=1}^{n}X_i'V_i^{-1}(\alpha)X_i$. We use the same notations from the previous section. This will not create confusion because we derive asymptotic properties separately for the mean transition cost and the mean sojourn rate. The process $b_{h0}(\cdot\mid Z_0)$ is Gaussian, with zero mean and $\mathrm{cov}(b_{h0}(u\mid Z_0),b_{h0}(w\mid Z_0))=Z^c_{02}(u)'\,\Sigma(\alpha)^{-1}Z^c_{02}(w)$.

3.3 Large Sample Properties of the Mean Cost Estimators

3.3.1 Uniform Consistency of the Mean Cost Estimators

By (3.1), conditional on the initial state $i$, given the vector $Z_0$ of basic covariates, the mean present value of all expenditures associated with the $h$ to $j$ transitions in $(0,t]$ is

$$MPV^{(1)}_{hj}(t\mid i,Z_0)=\int_0^t\gamma_{hj}(s)\,dA_{hj}(s\mid Z_0),$$

where $\gamma_{hj}(s)=e^{-rs}c_{hj}(s\mid Z_0)P_{ih}(0,s\mid Z_0)$ and $A_{hj}(s\mid Z_0)=\int_0^s\alpha_{hj0}(u)\exp(\beta_0'Z_{hj0})\,du$ is the integrated intensity function of a $h$ to $j$ transition. We estimate this quantity by

$$\widehat{MPV}^{(1)}_{hj}(t\mid i,Z_0)=\int_0^t\hat\gamma_{hj}(s)\,d\hat A_{hj}(s\mid Z_0),$$

where $\hat\gamma_{hj}(s)=e^{-rs}\hat c_{hj}(s\mid Z_0)\hat P_{ih}(0,s\mid Z_0)$ and $\hat A_{hj}(s\mid Z_0)=\hat A_{hj0}(s,\hat\beta)\exp(\hat\beta'Z_{hj0})$. See Sections 3.2.1, 3.2.2 and 3.2.3 for the definitions of the estimators $\hat A_{hj}(\cdot\mid Z_0)$, $\hat P_{ih}(0,\cdot\mid Z_0)$ and $\hat c_{hj}(\cdot\mid Z_0)$.

We will prove the uniform consistency of the mean transition cost estimator, that is

$$\sup_{t\in[0,\tau]}\left|\widehat{MPV}^{(1)}_{hj}(t\mid i,Z_0)-MPV^{(1)}_{hj}(t\mid i,Z_0)\right|\xrightarrow{P}0.\tag{3.24}$$

We first prove

$$\sup_{t\in[0,\tau]}\left|\hat\gamma_{hj}(t)-\gamma_{hj}(t)\right|\xrightarrow{P}0.\tag{3.25}$$

By the definition of $\hat\gamma_{hj}(\cdot)$ and $\gamma_{hj}(\cdot)$,

$$\sup_{t\in[0,\tau]}\left|\hat\gamma_{hj}(t)-\gamma_{hj}(t)\right|\le\sup_{t\in[0,\tau]}\left|\hat c_{hj}(t\mid Z_0)\hat P_{ih}(0,t\mid Z_0)-c_{hj}(t\mid Z_0)P_{ih}(0,t\mid Z_0)\right|\le$$
$$\le\sup_{t\in[0,\tau]}\left|c_{hj}(t\mid Z_0)\right|\sup_{t\in[0,\tau]}\left|\hat P_{ih}(0,t\mid Z_0)-P_{ih}(0,t\mid Z_0)\right|+\sup_{t\in[0,\tau]}\left|\hat c_{hj}(t\mid Z_0)-c_{hj}(t\mid Z_0)\right|.$$

By (3.18), $\sup_{t\in[0,\tau]}|\hat c_{hj}(t\mid Z_0)-c_{hj}(t\mid Z_0)|\xrightarrow{P}0$ and by the assumption A.0.1, $c_{hj}(\cdot\mid Z_0)$ is bounded. We have shown in Section 3.2.2 that $\sup_{t\in[0,\tau]}|\hat P_{ih}(0,t\mid Z_0)-P_{ih}(0,t\mid Z_0)|\xrightarrow{P}0$. Consequently (3.25) follows.

For $t\in[0,\tau]$:

$$\left|\widehat{MPV}^{(1)}_{hj}(t\mid i,Z_0)-MPV^{(1)}_{hj}(t\mid i,Z_0)\right|\le\int_0^t\left|\hat\gamma_{hj}(s)-\gamma_{hj}(s)\right|d\hat A_{hj}(s\mid Z_0)+\left|\int_0^t\gamma_{hj}(s)\left(d\hat A_{hj}(s\mid Z_0)-dA_{hj}(s\mid Z_0)\right)\right|,$$

so

$$\sup_{t\in[0,\tau]}\left|\widehat{MPV}^{(1)}_{hj}(t\mid i,Z_0)-MPV^{(1)}_{hj}(t\mid i,Z_0)\right|\le\sup_{s\in[0,\tau]}\left|\hat\gamma_{hj}(s)-\gamma_{hj}(s)\right|\hat A_{hj}(\tau\mid Z_0)+\sup_{t\in[0,\tau]}\left|\int_0^t\gamma_{hj}(s)\left(d\hat A_{hj}(s\mid Z_0)-dA_{hj}(s\mid Z_0)\right)\right|.$$

By (3.25) and the fact $\hat A_{hj}(\tau\mid Z_0)\xrightarrow{P}A_{hj}(\tau\mid Z_0)<\infty$, the first term of the right hand side of the above inequality converges to zero in probability. By our model assumptions $\gamma_{hj}(\cdot)$ is bounded on $[0,\tau]$. Then, as in the proof of (3.3), we can show that $\int_0^t\gamma_{hj}(s)\left(d\hat A_{hj}(s\mid Z_0)-dA_{hj}(s\mid Z_0)\right)$ is asymptotically equivalent to

$$B_{hj}(t)=\exp(\beta_0'Z_{hj0})\left\{\int_0^t\gamma_{hj}(s)\frac{J_h(s)}{S^{(0)}_{hj}(s,\beta_0)}\,dM_{hj}(s)+(\hat\beta-\beta_0)'\int_0^t\gamma_{hj}(s)\left(Z_{hj0}-e_{hj}(s,\beta_0)\right)\alpha_{hj0}(s)\,ds\right\}.$$

Using Lenglart's Inequality for square integrable martingales, consistency of $\hat\beta$ to $\beta_0$, $\int_0^\tau\alpha_{hj0}(s)\,ds<\infty$ and our model boundedness assumptions, one can prove that $\sup_{t\in[0,\tau]}|B_{hj}(t)|\xrightarrow{P}0$. Thus $\sup_{t\in[0,\tau]}\left|\int_0^t\gamma_{hj}(s)\left(d\hat A_{hj}(s\mid Z_0)-dA_{hj}(s\mid Z_0)\right)\right|\xrightarrow{P}0$ and (3.24) follows.

Next we will show the uniform consistency of the mean sojourn cost estimator. By (3.2), conditional on the initial state $i$, given the vector $Z_0$ of basic covariates, the mean present value of expenditures after duration time $d$ for the sojourns in state $h$ with entry time $s$ is

$$MPV^{(2)}_h(s,d\mid i,Z_0)=\int_s^{s+d}e^{-ru}\,b_h(u\mid Z_0)\,P_{ih}(0,u\mid Z_0)\,du.$$
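As an illustration of the mean present value of sojourn costs just defined, the integral $\int_s^{s+d}e^{-ru}b_h(u\mid Z_0)P_{ih}(0,u\mid Z_0)\,du$ can be approximated by a simple quadrature once the expense rate and the state occupation probability are available as functions of time. The sketch below assumes both are supplied as ordinary Python callables; the constant rate, the exponential occupation probability and the discount rate are made-up values chosen so that a closed form is available for comparison.

```python
import math


def mpv_sojourn(b_rate, p_occ, r, s, d, n_steps=1000):
    """Trapezoidal approximation of int_s^{s+d} exp(-r*u) * b(u) * p(u) du.

    b_rate: callable, expected expense rate in state h at time u
    p_occ:  callable, probability of being in state h at time u
    r:      discount rate
    """
    step = d / n_steps
    total = 0.0
    for m in range(n_steps + 1):
        u = s + m * step
        weight = 0.5 if m in (0, n_steps) else 1.0   # trapezoid end weights
        total += weight * math.exp(-r * u) * b_rate(u) * p_occ(u)
    return total * step


# Hypothetical inputs: constant rate of 100 per unit time, occupation
# probability exp(-0.1 u), discount rate 3%, entry at 0, duration 5.
value = mpv_sojourn(lambda u: 100.0, lambda u: math.exp(-0.1 * u),
                    r=0.03, s=0.0, d=5.0)
closed_form = 100.0 * (1.0 - math.exp(-0.13 * 5.0)) / 0.13
print(value, closed_form)
```

Replacing the two callables by estimates of $b_h$ and $P_{ih}$ gives the plug-in estimator of the same quantity; the quadrature itself is a convenience of this sketch and is not part of the estimator's definition.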
We estimate this quantity by

$$\widehat{MPV}^{(2)}_h(s,d\mid i,Z_0)=\int_s^{s+d}e^{-ru}\,\hat b_h(u\mid Z_0)\,\hat P_{ih}(0,u\mid Z_0)\,du,$$

where the estimators $\hat P_{ih}(0,\cdot\mid Z_0)$ and $\hat b_h(\cdot\mid Z_0)$ are defined in Sections 3.2.2 and 3.2.4, respectively. Using an argument similar to the one used to show (3.25), it can be shown that for every duration $d$ such that $0<s+d\le\tau$:

$$\sup_{a\in[0,d]}\left|\widehat{MPV}^{(2)}_h(s,a\mid i,Z_0)-MPV^{(2)}_h(s,a\mid i,Z_0)\right|\le\text{constant}\times\sup_{u\in[s,s+d]}\left|\hat b_h(u\mid Z_0)\hat P_{ih}(0,u\mid Z_0)-b_h(u\mid Z_0)P_{ih}(0,u\mid Z_0)\right|=o_P(1).$$

Therefore the uniform consistency of the mean sojourn cost estimator holds:

$$\sup_{a\in[0,d]}\left|\widehat{MPV}^{(2)}_h(s,a\mid i,Z_0)-MPV^{(2)}_h(s,a\mid i,Z_0)\right|\xrightarrow{P}0$$

for all $d$ such that $0<s+d\le\tau$.

… $\varphi_t:E\to\mathbb{R}$, $\varphi_t(x,y,z)=\int_0^tx(s)y(s)\,dz(s)$, where $E$ is a subset of $D[0,\tau]^3$ such that $\varphi_t$ is well defined. Notice we can write $MPV^{(1)}_{hj}(t\mid i,Z_0)$ as

$$\varphi_t(x_0,y_0,z_0)=\varphi_t\left(e^{-r\cdot}c_{hj}(\cdot\mid Z_0),\,P_{ih}(0,\cdot\mid Z_0),\,A_{hj}(\cdot\mid Z_0)\right).$$

If $\varphi_t$ has an extension to $D[0,\tau]^3$ that is Hadamard differentiable in $(x_0,y_0,z_0)$ then, under some extra-conditions, we can apply the Functional Delta Method to obtain our desired result. First we recall our convergence results from the previous sections.

From (3.3), (3.5) and (3.6), we obtained in Section 3.2.2 that for $(h,j)\in E^*$, $n^{1/2}(\hat A_{hj}(\cdot\mid Z_0)-A_{hj}(\cdot\mid Z_0))$ converged weakly in $D[0,\tau]$ (in the Skorohod sense) to the process $U^*_{hj}(\cdot\mid Z_0)=U^*_{1hj}(\cdot\mid Z_0)+U^*_{2hj}(\cdot\mid Z_0)$, where $U^*_{1hj}(\cdot\mid Z_0)$ and $U^*_{2hj}(\cdot\mid Z_0)$ were independent. We showed that

$$U^*_{1hj}(t\mid Z_0)\overset{D}{=}\xi'W^*_{hj}(t\mid Z_0),\tag{3.26}$$

where $\xi\sim MVN(0,\Sigma_\tau^{-1})$, the matrix $\Sigma_\tau$ being defined in assumption A.7, Section 3.2.1, and $W^*_{hj}(t\mid Z_0)=\exp(\beta_0'Z_{hj0})\int_0^t\left(Z_{hj0}-e_{hj}(u,\beta_0)\right)\alpha_{hj0}(u)\,du$. The other process is defined as

$$U^*_{2hj}(t\mid Z_0)=\exp(\beta_0'Z_{hj0})\,U^*_{0hj}(t),\tag{3.27}$$

where the matrix-valued process $U^*_0(\cdot)$ is described in Theorem 3, Section 3.2.1.

By (3.7), $n^{1/2}(\hat P_{ih}(0,\cdot\mid Z_0)-P_{ih}(0,\cdot\mid Z_0))$ converges weakly to the process $U_{ih}(0,\cdot\mid Z_0)=U_{1ih}(0,\cdot\mid Z_0)+U_{2ih}(0,\cdot\mid Z_0)$, where $U_{1ih}(0,\cdot\mid Z_0)$ and $U_{2ih}(0,\cdot\mid Z_0)$ are independent and for $m\in\{1,2\}$

$$U_{mih}(0,s\mid Z_0)=\sum_{g=1}^{k}\sum_{l\neq g}\int_0^sP_{ig}(0,u\mid Z_0)\left\{P_{lh}(u,s\mid Z_0)-P_{gh}(u,s\mid Z_0)\right\}dU^*_{mgl}(u\mid Z_0).$$

Using (3.26) and (3.27), we write

$$U_{1ih}(0,s\mid Z_0)\overset{D}{=}\xi'F_{ih}(0,s\mid Z_0),\tag{3.28}$$
$$U_{2ih}(0,s\mid Z_0)=\exp(\beta_0'Z_{hj0})\times\sum_{g=1}^{k}\sum_{l\neq g}\int_0^sP_{ig}(0,u\mid Z_0)\left\{P_{lh}(u,s\mid Z_0)-P_{gh}(u,s\mid Z_0)\right\}dU^*_{0gl}(u),\tag{3.29}$$

where $F_{ih}(0,s\mid Z_0)=\sum_{g=1}^{k}\sum_{l\neq g}\int_0^sP_{ig}(0,u\mid Z_0)\left\{P_{lh}(u,s\mid Z_0)-P_{gh}(u,s\mid Z_0)\right\}dW^*_{gl}(u\mid Z_0)$.

By (3.18), $n^{1/2}(\hat c_{hj}(\cdot\mid Z_0)-c_{hj}(\cdot\mid Z_0))$ converges weakly to the Gaussian process $c_{hj0}(\cdot\mid Z_0)$ described in Section 3.2.3:

$$c_{hj0}(s\mid Z_0)\overset{D}{=}\zeta_1'Z^c_{01}(s),\tag{3.30}$$

where $\zeta_1\sim MVN(0,\Sigma(\alpha)^{-1})$, $\Sigma(\alpha)=\lim_{n\to\infty}\frac1n\sum_{i=1}^{n}X_i'V_i^{-1}(\alpha)X_i$.

Next we prove the following lemma. We denote by $\int|dz|$ the total variation of the function $z$.

Lemma Let $E=\{(x,y,z)\in D[0,\tau]^3:\ \int_0^\tau|dz|\le C\}$, where $0<C<\infty$. For a fixed time $t\in(0,\tau]$ we define $\varphi_t:E\to\mathbb{R}$ by $\varphi_t(x,y,z)=\int_0^txy\,dz$. Let $(x_0,y_0,z_0)$ be a fixed point of $E$ such that $\int_0^\tau|d(x_0y_0)|<\infty$. Then $\varphi_t$ can be extended to the space $D[0,\tau]^3$ so as to be Hadamard differentiable at $(x_0,y_0,z_0)$, with derivative

$$d\varphi_t(x_0,y_0,z_0).(h,k,l)=\int_0^thy_0\,dz_0+\int_0^tx_0k\,dz_0+\int_0^tx_0y_0\,dl,\tag{3.31}$$

where the integral with respect to $l$ is defined by the integration by parts formula if $l$ is not of finite variation. □

Notes: 1) We interpret integration from $0$ to $\tau$ as being over the interval $(0,\tau]$. 2) The integration by parts formula gives

$$\int_0^tx_0y_0\,dl=x_0(t)y_0(t)l(t)-x_0(0)y_0(0)l(0)-\int_{(0,t]}l_-\,d(x_0y_0).$$

3) The extension asserted in this Lemma is not necessarily unique and the differentiability is shown only for the fixed point $(x_0,y_0,z_0)$. □

Proof Lemma: Obviously the hypothesized derivative $d\varphi_t(x_0,y_0,z_0)$ is a linear map. We will show that $d\varphi_t(x_0,y_0,z_0)$ is also continuous. Let $(h,k,l)\in D[0,\tau]^3$ be a fixed arbitrary point. Consider sequences $t_n\in\mathbb{R}_+$, $h_n,k_n,l_n\in D[0,\tau]$ that satisfy $t_n\to0$, $\|h_n-h\|\to0$, $\|k_n-k\|\to0$, $\|l_n-l\|\to0$, where $\|\cdot\|$ is the supremum norm, and define

$$x_n=x_0+t_nh_n,\qquad y_n=y_0+t_nk_n,\qquad z_n=z_0+t_nl_n,$$

and suppose $(x_n,y_n,z_n)\in E$ for each $n$.
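We need to prove that

$$d\varphi_t(x_0,y_0,z_0).(h_n,k_n,l_n)\to d\varphi_t(x_0,y_0,z_0).(h,k,l)\quad\text{as } n\to\infty.\tag{3.32}$$

The sequence $d\varphi_t(x_0,y_0,z_0).(h_n,k_n,l_n)$ has the form

$$d\varphi_t(x_0,y_0,z_0).(h_n,k_n,l_n)=\int_0^th_ny_0\,dz_0+\int_0^tx_0k_n\,dz_0+\int_0^tx_0y_0\,dl_n,$$

where the integral with respect to $l_n$ is defined by the integration by parts formula. We have

$$\left|\int_0^th_ny_0\,dz_0-\int_0^thy_0\,dz_0\right|=\left|\int_0^t(h_n-h)y_0\,dz_0\right|\le\|h_n-h\|\times\|y_0\|\times\int_0^\tau|dz_0|.$$

By the hypothesis of the Lemma we have $\int_0^\tau|dz_0|\le C$ and we also have $\|y_0\|<\infty$ because $y_0\in D[0,\tau]$. Then the convergence $\|h_n-h\|\to0$ implies that $\left|\int_0^th_ny_0\,dz_0-\int_0^thy_0\,dz_0\right|\to0$ as $n\to\infty$. Similarly $\left|\int_0^tx_0k_n\,dz_0-\int_0^tx_0k\,dz_0\right|\to0$ as $n\to\infty$.

By the integration by parts formula

$$\left|\int_0^tx_0y_0\,dl_n-\int_0^tx_0y_0\,dl\right|\le|x_0(t)|\times|y_0(t)|\times\|l_n-l\|+|x_0(0)|\times|y_0(0)|\times\|l_n-l\|+\|l_{n-}-l_-\|\times\int_0^t|d(x_0y_0)|.$$

But $\|l_n-l\|=\|l_{n-}-l_-\|$, so

$$\left|\int_0^tx_0y_0\,dl_n-\int_0^tx_0y_0\,dl\right|\le\|l_n-l\|\left(2\|x_0\|\times\|y_0\|+\int_0^\tau|d(x_0y_0)|\right).$$

Because $\int_0^\tau|d(x_0y_0)|<\infty$, $\|l_n-l\|\to0$ and $\|x_0\|,\|y_0\|<\infty$, the right hand side of the previous inequality converges to zero as $n\to\infty$, so $\left|\int_0^tx_0y_0\,dl_n-\int_0^tx_0y_0\,dl\right|\to0$ as $n\to\infty$. We proved (3.32), which implies the continuity of the mapping $d\varphi_t(x_0,y_0,z_0)$.

Next we apply the Lemma stated in Appendix B. In its context $B_1$ and $B_2$ are normed vector spaces, endowed with $\sigma$-algebras $\mathcal{B}_1$, $\mathcal{B}_2$, respectively, where $\mathcal{B}_i^{o}\subseteq\mathcal{B}_i\subseteq\mathcal{B}_i^{\bullet}$, $i=1,2$. The $\sigma$-algebras $\mathcal{B}_i^{o}$, $\mathcal{B}_i^{\bullet}$ are generated by the open balls and the open sets of $B_i$, respectively. In our case $B_2=\mathbb{R}$, $\mathcal{B}_2^{o}=\mathcal{B}_2^{\bullet}$ and $B_1=D[0,\tau]^3$ is endowed with the open ball topology. According to the Appendix B Lemma, if

$$t_n^{-1}\left\{\varphi_t(x_n,y_n,z_n)-\varphi_t(x_0,y_0,z_0)\right\}-d\varphi_t(x_0,y_0,z_0).(h,k,l)\to0\quad\text{as } n\to\infty\tag{3.33}$$

then $\varphi_t$ can be extended to $D[0,\tau]^3$ in such a way that it is differentiable at $(x_0,y_0,z_0)$, with derivative $d\varphi_t(x_0,y_0,z_0).(h,k,l)$ as defined in (3.31), so our Lemma holds. Therefore we will prove (3.33). By the continuity of $d\varphi_t(x_0,y_0,z_0)$, for (3.33) it is sufficient to prove that

$$S_n=t_n^{-1}\left\{\varphi_t(x_n,y_n,z_n)-\varphi_t(x_0,y_0,z_0)\right\}-d\varphi_t(x_0,y_0,z_0).(h_n,k_n,l_n)\to0\quad\text{as } n\to\infty.$$

This sequence is equal to

$$t_n^{-1}\int_0^tx_ny_n\,dz_n-t_n^{-1}\int_0^tx_0y_0\,dz_0-\int_0^th_ny_0\,dz_0-\int_0^tx_0k_n\,dz_0-\int_0^tx_0y_0\,dl_n.$$

We expand

$$t_n^{-1}\int_0^tx_ny_n\,dz_n=\int_0^t(t_n^{-1}x_0+h_n)(y_0+t_nk_n)\,d(z_0+t_nl_n)=$$
$$=t_n^{-1}\int_0^tx_0y_0\,dz_0+\int_0^tx_0y_0\,dl_n+\int_0^tx_0k_n\,dz_0+t_n\int_0^tx_0k_n\,dl_n+\int_0^th_ny_0\,dz_0+t_n\int_0^th_ny_0\,dl_n+t_n\int_0^th_nk_n\,dz_0+t_n^2\int_0^th_nk_n\,dl_n.$$

Thus

$$S_n=t_n\int_0^tx_0k_n\,dl_n+t_n\int_0^th_ny_0\,dl_n+t_n\int_0^th_nk_n\,dz_0+t_n^2\int_0^th_nk_n\,dl_n=$$
$$=\int_0^tx_0k_n\,d(z_n-z_0)+\int_0^th_ny_0\,d(z_n-z_0)+\int_0^t(x_n-x_0)k_n\,dz_0+\int_0^th_n(y_n-y_0)\,d(z_n-z_0)=T_{n1}+T_{n2}+T_{n3}+T_{n4}.$$

We will prove that $T_{ni}\to0$ as $n\to\infty$ for all $i\in\{1,2,3,4\}$. We start with $T_{n1}=\int_0^tx_0k_n\,d(z_n-z_0)$. We have

$$\left|\int_0^tx_0(k_n-k)\,d(z_n-z_0)\right|\le\|x_0\|\times\|k_n-k\|\times\left(\int_0^\tau|dz_n|+\int_0^\tau|dz_0|\right)\le2C\|x_0\|\times\|k_n-k\|\to0,$$

because $\|k_n-k\|\to0$ and $\|x_0\|<\infty$. The fact $\int_0^\tau|dz_n|\le C$ follows from our assumption that $(x_n,y_n,z_n)\in E$. Thus it is sufficient to show

$$\int_0^tx_0k\,d(z_n-z_0)\to0\quad\text{as } n\to\infty.\tag{3.34}$$

Let $f_0=x_0k$. Then $f_0$ is an element of $D[0,\tau]$, so for every $\varepsilon>0$ there exists $f_0^\varepsilon\in D[0,\tau]$ such that $f_0^\varepsilon$ is a step function with a finite number (say $N$) of jumps and $\|f_0-f_0^\varepsilon\|\le\varepsilon$. We have

$$\left|\int_0^t(f_0-f_0^\varepsilon)\,d(z_n-z_0)\right|\le\|f_0-f_0^\varepsilon\|\left(\int_0^\tau|dz_n|+\int_0^\tau|dz_0|\right)\le2C\varepsilon.\tag{3.35}$$

By partial integration

$$\left|\int_0^tf_0^\varepsilon\,d(z_n-z_0)\right|\le2\|f_0^\varepsilon\|\times\|z_n-z_0\|+\|z_n-z_0\|\times\int_0^\tau|df_0^\varepsilon|.$$

The mapping $f_0^\varepsilon$ is a step function with $N$ jump points, say $s_1,\dots,s_N$, so $\int_0^\tau|df_0^\varepsilon|\le\sum_{i=1}^{N}|f_0^\varepsilon(s_i)-f_0^\varepsilon(s_i-)|\le2N\|f_0^\varepsilon\|$. Thus

$$\left|\int_0^tf_0^\varepsilon\,d(z_n-z_0)\right|\le2(N+1)\|f_0^\varepsilon\|\times\|z_n-z_0\|\to0\quad\text{as } n\to\infty,\tag{3.36}$$

since $\|z_n-z_0\|=t_n\|l_n\|\to0$. Notice that similarly $\|x_n-x_0\|\to0$ and $\|y_n-y_0\|\to0$ as $n\to\infty$. By (3.35) and (3.36) we obtain $\limsup_{n\to\infty}\left|\int_0^tf_0\,d(z_n-z_0)\right|\le2C\varepsilon$. Since $\varepsilon$ was arbitrarily chosen, (3.34) is proved, so the convergence $T_{n1}\to0$ follows. The proof of $T_{n2}\to0$ is identical.

For $T_{n3}=\int_0^t(x_n-x_0)k_n\,dz_0$ we have that $|T_{n3}|\le\|x_n-x_0\|\times\|k_n\|\times\int_0^\tau|dz_0|$. Because $\|x_n-x_0\|\to0$, $\|k_n\|\le\|k_n-k\|+\|k\|<\infty$ and $\int_0^\tau|dz_0|\le C$, the convergence $T_{n3}\to0$ is an immediate consequence.

For $T_{n4}=\int_0^th_n(y_n-y_0)\,d(z_n-z_0)$,

$$|T_{n4}|\le\|h_n\|\times\|y_n-y_0\|\times\left(\int_0^\tau|dz_n|+\int_0^\tau|dz_0|\right)\le2C\|h_n\|\times\|y_n-y_0\|\to0\quad\text{as } n\to\infty.$$

This completes the proof of the Lemma. ■

Recall that in Section 3.2.3 we considered $C_i=(C_{i1},\dots,C_{in_i})'$ to denote the vector of all $h$ to $j$ transition costs related to the $i$-th subject.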
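The cost vectors $C_1,\dots,C_n$ are assumed to be independent. For technical reasons the following extra-assumptions are considered:

EA.1 $c_{hj}(\cdot\mid Z_0)$ is of finite variation over $[0,\tau]$. We write $\int_0^\tau|dc_{hj}(s\mid Z_0)|<\infty$.

EA.2 The cost vectors $C_1,\dots,C_n$ are independent of $(N_i,Y_i,Z_i)$, $1\le i\le n$.

Comments: 1) In the proof of the next theorem we will need

$$\int_0^\tau\left|d\left(e^{-rs}c_{hj}(s\mid Z_0)P_{ih}(0,s\mid Z_0)\right)\right|<\infty.\tag{3.37}$$

The function $s\mapsto e^{-rs}$ is monotone, so of bounded variation on $[0,\tau]$. The matrix-valued process $P(0,\cdot\mid Z_0)$ is (componentwise) right continuous with left hand limits and of bounded variation (see Theorem II.6.1, p90, Andersen et al. (1993)$^{29}$). By assumption EA.1, $c_{hj}(\cdot\mid Z_0)$ is of finite variation over $[0,\tau]$. A product of finite variation functions is also a function of finite variation, so (3.37) follows.

2) The estimator $\hat c_{hj}(\cdot\mid Z_0)$ of $c_{hj}(\cdot\mid Z_0)$ was obtained from the cost vectors $C_1,\dots,C_n$. The estimators $\hat A_{hj}(\cdot\mid Z_0)$, $\hat P_{ih}(0,\cdot\mid Z_0)$ were calculated from $(N_i,Y_i,Z_i)$, $1\le i\le n$. By assumption EA.2, we can consider that $\hat c_{hj}(\cdot\mid Z_0)$ is independent of $(\hat A_{hj}(\cdot\mid Z_0),\hat P_{ih}(0,\cdot\mid Z_0))$. This implies that

$$n^{1/2}\left(\hat c_{hj}(\cdot\mid Z_0)-c_{hj}(\cdot\mid Z_0)\right)\ \text{is asymptotically independent of}\ n^{1/2}\begin{pmatrix}\hat A_{hj}(\cdot\mid Z_0)-A_{hj}(\cdot\mid Z_0)\\ \hat P_{ih}(0,\cdot\mid Z_0)-P_{ih}(0,\cdot\mid Z_0)\end{pmatrix}.\tag{3.38}$$

Theorem 4 Under the assumptions A.0-A.7 and the extra-assumptions EA.1 and EA.2, for a fixed time $t$:

$$n^{1/2}\left(\widehat{MPV}^{(1)}_{hj}(t\mid i,Z_0)-MPV^{(1)}_{hj}(t\mid i,Z_0)\right)=$$
$$=n^{1/2}\left[\varphi_t\left(e^{-r\cdot}\hat c_{hj}(\cdot\mid Z_0),\hat P_{ih}(0,\cdot\mid Z_0),\hat A_{hj}(\cdot\mid Z_0)\right)-\varphi_t\left(e^{-r\cdot}c_{hj}(\cdot\mid Z_0),P_{ih}(0,\cdot\mid Z_0),A_{hj}(\cdot\mid Z_0)\right)\right]\xrightarrow{D}$$
$$\xrightarrow{D}d\varphi_t^E\left(e^{-r\cdot}c_{hj}(\cdot\mid Z_0),P_{ih}(0,\cdot\mid Z_0),A_{hj}(\cdot\mid Z_0)\right).\left(e^{-r\cdot}c_{hj0}(\cdot\mid Z_0),U_{ih}(0,\cdot\mid Z_0),U^*_{hj}(\cdot\mid Z_0)\right)=$$
$$=\int_0^te^{-rs}c_{hj0}(s\mid Z_0)P_{ih}(0,s\mid Z_0)\,dA_{hj}(s\mid Z_0)+\int_0^te^{-rs}c_{hj}(s\mid Z_0)U_{ih}(0,s\mid Z_0)\,dA_{hj}(s\mid Z_0)+\int_0^te^{-rs}c_{hj}(s\mid Z_0)P_{ih}(0,s\mid Z_0)\,dU^*_{hj}(s\mid Z_0),$$

a limit that we denote by $P(t)$. □

Proof Theorem 4: The functional $\varphi_t:E\to\mathbb{R}$ is defined by $\varphi_t(x,y,z)=\int_0^tx(s)y(s)\,dz(s)$, where $E=\{(x,y,z)\in D[0,\tau]^3:\ \int_0^\tau|dz|\le C\}$ and $C=A_{hj}(\tau\mid Z_0)+1$. Let $x_0(s)=e^{-rs}c_{hj}(s\mid Z_0)$, $y_0(s)=P_{ih}(0,s\mid Z_0)$ and $z_0(s)=A_{hj}(s\mid Z_0)$. By (3.37) and the previous Lemma, $\varphi_t$ can be extended to $D[0,\tau]^3$ so as to be Hadamard differentiable at $(x_0,y_0,z_0)$, with derivative

$$d\varphi_t(x_0,y_0,z_0).(h,k,l)=\int_0^thy_0\,dz_0+\int_0^tx_0k\,dz_0+\int_0^tx_0y_0\,dl,$$

where the integral with respect to $l$ is defined by the integration by parts formula if $l$ is not of finite variation. Denote by $\varphi_t^E$ the extension of $\varphi_t$ to $D[0,\tau]^3$. Define $\tilde x_n(s)=e^{-rs}\hat c_{hj}(s\mid Z_0)$,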
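$\tilde y_n(s)=\hat P_{ih}(0,s\mid Z_0)$, $\tilde z_n(s)=\hat A_{hj}(s\mid Z_0)$, $s\in[0,\tau]$. We have $(\tilde x_n,\tilde y_n,\tilde z_n)\in D[0,\tau]^3$ for every $n$ and $P\left(\int_0^\tau|d\tilde z_n|\le C\right)=P\left(\hat A_{hj}(\tau\mid Z_0)\le C\right)\to1$, because $\hat A_{hj}(\tau\mid Z_0)\xrightarrow{P}A_{hj}(\tau\mid Z_0)<C$. Thus $(\tilde x_n,\tilde y_n,\tilde z_n)\in E$ with probability tending to one. We have

$$n^{1/2}\left(\tilde x_n(\cdot)-x_0(\cdot)\right)\xrightarrow{D}X_0(\cdot)=e^{-r\cdot}c_{hj0}(\cdot\mid Z_0),$$
$$n^{1/2}\left(\tilde y_n(\cdot)-y_0(\cdot)\right)\xrightarrow{D}Y_0(\cdot)=U_{ih}(0,\cdot\mid Z_0),$$
$$n^{1/2}\left(\tilde z_n(\cdot)-z_0(\cdot)\right)\xrightarrow{D}Z_0(\cdot)=U^*_{hj}(\cdot\mid Z_0).$$

The processes $X_0,Y_0,Z_0$ are Gaussian and hence have versions that are almost surely continuous. Let $C[0,\tau]$ be the set of continuous functions on $[0,\tau]$. The subset $C[0,\tau]^3\subset D[0,\tau]^3$ is separable, so $(X_0,Y_0,Z_0)$ has separable support.

Because $\hat P(0,\cdot\mid Z_0)=\prod_{(0,\cdot]}(I+d\hat A(u\mid Z_0))$ and $P(0,\cdot\mid Z_0)=\prod_{(0,\cdot]}(I+dA(u\mid Z_0))$, the matrices $\hat P(0,\cdot\mid Z_0)$, $P(0,\cdot\mid Z_0)$ are functionals of $\hat A(\cdot\mid Z_0)$, $A(\cdot\mid Z_0)$, respectively. It can be easily shown that jointly:

$$n^{1/2}\begin{pmatrix}\hat P_{ih}(0,\cdot\mid Z_0)-P_{ih}(0,\cdot\mid Z_0)\\ \hat A_{hj}(\cdot\mid Z_0)-A_{hj}(\cdot\mid Z_0)\end{pmatrix}\xrightarrow{D}\begin{pmatrix}U_{ih}(0,\cdot\mid Z_0)\\ U^*_{hj}(\cdot\mid Z_0)\end{pmatrix}.$$

Consequently, by (3.38),

$$n^{1/2}\left[\left(\tilde x_n(\cdot),\tilde y_n(\cdot),\tilde z_n(\cdot)\right)'-\left(x_0(\cdot),y_0(\cdot),z_0(\cdot)\right)'\right]\xrightarrow{D}\left(X_0(\cdot),Y_0(\cdot),Z_0(\cdot)\right)'.$$

By the Functional Delta Method stated in Appendix B,

$$n^{1/2}\left(\varphi_t(\tilde x_n,\tilde y_n,\tilde z_n)-\varphi_t(x_0,y_0,z_0)\right)\xrightarrow{D}d\varphi_t^E(x_0,y_0,z_0).(X_0,Y_0,Z_0)=P(t),$$

with $P(t)$ as defined in the theorem statement. This completes the proof of Theorem 4. ■

By (3.26)-(3.30), we can write the limiting process $P(t)$ as

$$P(t)=\zeta_1'\int_0^te^{-rs}Z^c_{01}(s)P_{ih}(0,s\mid Z_0)\,\alpha_{hj0}(s)\exp(\beta_0'Z_{hj0})\,ds\,+$$
$$+\,\xi'\int_0^te^{-rs}c_{hj}(s\mid Z_0)F_{ih}(0,s\mid Z_0)\,\alpha_{hj0}(s)\exp(\beta_0'Z_{hj0})\,ds\,+$$
$$+\sum_{g=1}^{k}\sum_{l\neq g}\int_0^te^{-rs}c_{hj}(s\mid Z_0)\left[\int_0^sP_{ig}(0,u\mid Z_0)\left(P_{lh}(u,s\mid Z_0)-P_{gh}(u,s\mid Z_0)\right)dU^*_{0gl}(u)\right]\alpha_{hj0}(s)\exp(2\beta_0'Z_{hj0})\,ds\,+$$
$$+\,\xi'\int_0^te^{-rs}c_{hj}(s\mid Z_0)P_{ih}(0,s\mid Z_0)\exp(\beta_0'Z_{hj0})\left(Z_{hj0}-e_{hj}(s,\beta_0)\right)\alpha_{hj0}(s)\,ds\,+$$
$$+\int_0^te^{-rs}c_{hj}(s\mid Z_0)P_{ih}(0,s\mid Z_0)\exp(\beta_0'Z_{hj0})\,dU^*_{0hj}(s).$$

By Theorem 3, $\{U^*_{0hj},\ (h,j)\in E^*\}$ are independent, continuous Gaussian martingales. The integrals with respect to $U^*_{0hj}$ are Ito integrals and the theory from Appendix C can be applied.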
Therefore, by the Fubini-type Theorem from Appendix C applied to the bounded functions H,,(s,u lZo) = 1,0.,,(u)e'”c,,,-(sIZO)R-g(0,uIZO)(P,,,(u,sIZO)— P,,,(u,s |zo)) and the finite measure defined by ,u(0,s] = I: d,,,o(u)du , the third term of the previous sum can be written as It exp(zflézhj0)2 Z Epig (0!“ I 20) X g=ll¢g XI [IPM (u,s I 20) —- Pgh (u,s I 20)) (”C,,-(s I Zo)a,,,0(s)ds:IdU38,(u). Therefore the process P(t) has the form: P(t) = Pi(t) + P20) + 1’30) + 110). where P,(t) = {{T, (t), P20) = 5720). k B(t)=Z Z [f.,,(“)+ £f42h,(u)d (U5m>(u)- g=1 latg (l.g)¢(h.j) and“) ,0, u. A consistent estimator of the Shj (“r 160) By Theorem 3, for h it: j: (Ugh,>(t) = £ variance of P(t) is obtained replacing all unknown quantities by their corresponding consistent estimators. 3.3.3 Asymptotic Distribution of the Mean Sojourn Cost The asymptotic normality of n'” IMPVI2’(s,d |i,Z,,) - MPV,,m(s,d Ii, 20)) will be assessed also by the Functional Delta Method. The entry time s and duration time d are considered fixed throughout this sub-section. d Consider the functional (y,,, :D[0,1]2 —)R, (y,,,(x,y)= I” x(u)y(u)du. The mean present value MPV,,(2)(s,d Ii, 20) can be written as 152 MPV,,m(s,d |i,Z,,) = (y,,, (x0, yo) = w,_,, (e""b,,(. IZO),P,,, (0,. Izo». We will show that III“, is Hadamard differentiable in (x0, yo). Two convergence results from the previous sections will be needed. One is that n” 2 (13,, (0,. I 20) — P,-,,(0,. I Zo)) converges weakly to the process U,,,(0,. I 20) = U,,,,(0,. IZO) + U2,,,(0,. I Z0), where U,,,, (0,. | Z0), U2,,,(0,. I 20) are described in (3.28) and (3.29). The second is that, by (3.23), nm (3,,(. I20) — b,,(. I Zo)) converges weakly to the Gaussian process b,,o(. | 20) described in Section 3.2.4: bait. I2.) = 525.1). (3.44) where Z520 is a deterministic vector function, {2 ~ MVN(0,Z(a)") and 2(a) = lim liX,'V,_l(a)X,. n—roo . 
First we prove the following lemma.

Lemma. For fixed $s \in [0,1)$ and $d > 0$ such that $s + d \le 1$, define $\psi_{s,d} : D[0,1]^2 \to \mathbb{R}$ by
$$\psi_{s,d}(x,y) = \int_s^{s+d} x(u)\,y(u)\,du.$$
Then $\psi_{s,d}$ is Hadamard differentiable at every point $(x_0,y_0) \in D[0,1]^2$, with derivative
$$d\psi_{s,d}(x_0,y_0).(h,k) = \int_s^{s+d} h(u)\,y_0(u)\,du + \int_s^{s+d} x_0(u)\,k(u)\,du. \qquad \square$$

Proof of the Lemma: Let $(x_0,y_0)$ be an arbitrary element of the space $D[0,1]^2$. It is straightforward that $d\psi_{s,d}(x_0,y_0)$ is a continuous, linear mapping. Consider an arbitrary $(h,k) \in D[0,1]^2$ and sequences $t_n \in \mathbb{R}_+$, $h_n, k_n \in D[0,1]$ that satisfy $t_n \to 0$, $\|h_n - h\| \to 0$, $\|k_n - k\| \to 0$, where $\|\cdot\|$ is the supremum norm. Define $x_n = x_0 + t_n h_n$ and $y_n = y_0 + t_n k_n$. For each $n$, $(x_n,y_n) \in D[0,1]^2$. We want to show that $\psi_{s,d}$ is Hadamard differentiable at $(x_0,y_0)$, that is,
$$t_n^{-1}\{\psi_{s,d}(x_n,y_n) - \psi_{s,d}(x_0,y_0)\} - d\psi_{s,d}(x_0,y_0).(h,k) \to 0 \quad \text{as } n \to \infty. \tag{3.45}$$
By the continuity of $d\psi_{s,d}(x_0,y_0)$, it is sufficient to prove that
$$S_n := t_n^{-1}\{\psi_{s,d}(x_n,y_n) - \psi_{s,d}(x_0,y_0)\} - d\psi_{s,d}(x_0,y_0).(h_n,k_n) \to 0 \quad \text{as } n \to \infty.$$
This sequence is equal to
$$t_n^{-1}\int_s^{s+d} x_n(u)y_n(u)\,du - t_n^{-1}\int_s^{s+d} x_0(u)y_0(u)\,du - \int_s^{s+d} h_n(u)y_0(u)\,du - \int_s^{s+d} x_0(u)k_n(u)\,du.$$
We expand
$$t_n^{-1}\int_s^{s+d} x_n(u)y_n(u)\,du = \int_s^{s+d} (t_n^{-1}x_0 + h_n)(u)\,(y_0 + t_n k_n)(u)\,du$$
$$= t_n^{-1}\int_s^{s+d} x_0(u)y_0(u)\,du + \int_s^{s+d} h_n(u)y_0(u)\,du + \int_s^{s+d} x_0(u)k_n(u)\,du + t_n\int_s^{s+d} h_n(u)k_n(u)\,du.$$
As a result
$$S_n = t_n\int_s^{s+d} h_n(u)k_n(u)\,du = \int_s^{s+d} (x_n - x_0)(u)\,k_n(u)\,du.$$
We have $|S_n| \le d\,\|x_n - x_0\|\,\|k_n\| \to 0$ as $n \to \infty$, because $\|x_n - x_0\| \to 0$, $\|k_n - k\| \to 0$ and $\|k_n\| \le \|k_n - k\| + \|k\| < \infty$. Therefore (3.45) follows. This completes the proof of the stated Lemma. ■

In Section 3.2.4 we let $C_i' = (C_{i1},\ldots,C_{in_i})$ denote the vector of all total observed costs incurred in sojourns in state $h$ by the $i$-th individual. The cost vectors $C_1,\ldots,C_n$ are assumed independent. The same notation is used in Sections 3.2.3 and 3.2.4, but the context makes clear which one is meant.
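The key step in the proof of the Lemma is that the difference quotient minus the derivative equals $t_n \int_s^{s+d} h_n(u)k_n(u)\,du$, which vanishes as $t_n \to 0$. The sketch below checks this identity numerically for a fixed direction $(h,k)$. The functions `x0`, `y0`, `h`, `k`, the values of `s` and `d`, and the helper names are hypothetical choices made only for illustration; the integrals are approximated with a simple midpoint rule.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def psi(x, y, s, d):
    """psi_{s,d}(x, y): integral over [s, s+d] of x(u) * y(u)."""
    return integrate(lambda u: x(u) * y(u), s, s + d)

def dpsi(x0, y0, h, k, s, d):
    """Derivative of psi at (x0, y0) in the direction (h, k), as in the Lemma."""
    return integrate(lambda u: h(u) * y0(u) + x0(u) * k(u), s, s + d)

def remainder(x0, y0, h, k, t, s, d):
    """Difference quotient minus derivative; the proof shows it is t * int h*k."""
    xt = lambda u: x0(u) + t * h(u)
    yt = lambda u: y0(u) + t * k(u)
    return (psi(xt, yt, s, d) - psi(x0, y0, s, d)) / t - dpsi(x0, y0, h, k, s, d)

# Hypothetical inputs (assumptions, not taken from the dissertation).
x0 = lambda u: 100.0 * math.exp(-0.05 * u)   # a discounted cost rate
y0 = lambda u: math.exp(-2.0 * u)            # a state-occupation probability
h = lambda u: math.sin(3.0 * u)              # smooth direction
k = lambda u: 1.0 if u < 0.5 else -1.0       # cadlag direction with one jump

s, d = 0.1, 0.6
hk = integrate(lambda u: h(u) * k(u), s, s + d)
for t in (1e-1, 1e-2, 1e-3):
    # The two printed values should agree, and both shrink linearly in t.
    print(t, remainder(x0, y0, h, k, t, s, d), t * hk)
```

The remainder is linear in $t$ here because the direction is held fixed; in the Lemma the directions $h_n, k_n$ vary with $n$, and the bound $|S_n| \le d\,\|x_n - x_0\|\,\|k_n\|$ handles that case.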
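An extra assumption similar to EA.2 is considered:

EA.3 The cost vectors $C_1,\ldots,C_n$ are independent of $(N_i,Y_i,Z_i)$, $1 \le i \le n$.

The estimator $\hat b_h(\cdot \mid Z_0)$ of $b_h(\cdot \mid Z_0)$ was obtained from the cost vectors $C_1,\ldots,C_n$, and $\hat P_{ih}(0,\cdot \mid Z_0)$ is calculated from $(N_i,Y_i,Z_i)$, $1 \le i \le n$. By assumption EA.3, $\hat b_h(\cdot \mid Z_0)$ is independent of $\hat P_{ih}(0,\cdot \mid Z_0)$. This implies that
$$n^{1/2}\big(\hat b_h(\cdot \mid Z_0) - b_h(\cdot \mid Z_0)\big) \text{ is asymptotically independent of } n^{1/2}\big(\hat P_{ih}(0,\cdot \mid Z_0) - P_{ih}(0,\cdot \mid Z_0)\big). \tag{3.46}$$

Theorem 5. Under the assumptions A.0-A.7 and the extra assumption EA.3, for a fixed entry time $s$ and fixed duration $d$:
$$n^{1/2}\big(\widehat{MPV}_h^{(2)}(s,d \mid i,Z_0) - MPV_h^{(2)}(s,d \mid i,Z_0)\big)$$
$$= n^{1/2}\big[\psi_{s,d}\big(e^{-r\cdot}\hat b_h(\cdot \mid Z_0), \hat P_{ih}(0,\cdot \mid Z_0)\big) - \psi_{s,d}\big(e^{-r\cdot} b_h(\cdot \mid Z_0), P_{ih}(0,\cdot \mid Z_0)\big)\big]$$
$$\xrightarrow{D} d\psi_{s,d}\big(e^{-r\cdot} b_h(\cdot \mid Z_0), P_{ih}(0,\cdot \mid Z_0)\big).\big(e^{-r\cdot} b_{h0}(\cdot \mid Z_0), U_{ih}(0,\cdot \mid Z_0)\big)$$
$$= \int_s^{s+d} e^{-ru}\, b_{h0}(u \mid Z_0)\, P_{ih}(0,u \mid Z_0)\,du + \int_s^{s+d} e^{-ru}\, b_h(u \mid Z_0)\, U_{ih}(0,u \mid Z_0)\,du =: R(s,d). \qquad \square$$

Proof of Theorem 5: We defined $\psi_{s,d} : D[0,1]^2 \to \mathbb{R}$ by $\psi_{s,d}(x,y) = \int_s^{s+d} x(u)y(u)\,du$. Let $x_0(u) = e^{-ru} b_h(u \mid Z_0)$ and $y_0(u) = P_{ih}(0,u \mid Z_0)$, where both $x_0, y_0$ are elements of $D[0,1]$. By the previous Lemma, $\psi_{s,d}$ is Hadamard differentiable at $(x_0,y_0)$, with derivative
$$d\psi_{s,d}(x_0,y_0).(h,k) = \int_s^{s+d} h(u)y_0(u)\,du + \int_s^{s+d} x_0(u)k(u)\,du.$$
For every $n$ define $(\hat x_n, \hat y_n) \in D[0,1]^2$ by $\hat x_n(u) = e^{-ru}\hat b_h(u \mid Z_0)$ and $\hat y_n(u) = \hat P_{ih}(0,u \mid Z_0)$, $u \in [0,1]$. By (3.46),
$$n^{1/2}\left[\begin{pmatrix}\hat x_n(\cdot)\\ \hat y_n(\cdot)\end{pmatrix} - \begin{pmatrix}x_0(\cdot)\\ y_0(\cdot)\end{pmatrix}\right] \xrightarrow{D} \begin{pmatrix}X_0(\cdot)\\ Y_0(\cdot)\end{pmatrix},$$
where $X_0(\cdot) = e^{-r\cdot} b_{h0}(\cdot \mid Z_0)$ and $Y_0(\cdot) = U_{ih}(0,\cdot \mid Z_0)$. The processes $X_0, Y_0$ are Gaussian, so they have versions that are almost surely continuous. Let $C[0,1]$ be the set of continuous functions on $[0,1]$. The subset $C[0,1]^2 \subset D[0,1]^2$ is separable, so $(X_0,Y_0)$ has separable support. By the Functional Delta Method (see Appendix B),
$$n^{1/2}\big(\psi_{s,d}(\hat x_n,\hat y_n) - \psi_{s,d}(x_0,y_0)\big) \xrightarrow{D} d\psi_{s,d}(x_0,y_0).(X_0,Y_0) = R(s,d),$$
so Theorem 5 is proved.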
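To make the structure of the limit $R(s,d)$ concrete, the following sketch evaluates the mean sojourn cost $\psi_{s,d}(e^{-r\cdot}b_h, P_{ih}(0,\cdot))$ in a toy setting and splits the plug-in estimation error into the two first-order pieces that mirror the two integrals in Theorem 5: one driven by the error in the cost rate and one driven by the error in the transition probability. Everything numeric here is an assumption chosen for illustration (constant cost rate `rate`, exponential occupancy probability with hazard `lam`, and small deterministic perturbations standing in for estimation error); it is not the estimator of Chapter 3.

```python
import math

def integrate(f, a, b, n=4000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def mpv(b_h, p_ih, r, s, d):
    """psi_{s,d}(e^{-r.} b_h, P_ih(0,.)): discounted sojourn cost over [s, s+d]."""
    return integrate(lambda u: math.exp(-r * u) * b_h(u) * p_ih(u), s, s + d)

# Hypothetical "true" quantities (assumptions for illustration only).
r, s, d = 0.03, 0.2, 0.5
rate, lam = 250.0, 1.5
b_h = lambda u: rate                    # expected cost rate in state h
p_ih = lambda u: math.exp(-lam * u)     # probability of being in state h at u

# Hypothetical estimates: small perturbations of the truth.
b_hat = lambda u: rate + 4.0
p_hat = lambda u: math.exp(-(lam + 0.02) * u)

truth = mpv(b_h, p_ih, r, s, d)
closed_form = rate * (math.exp(-(r + lam) * s)
                      - math.exp(-(r + lam) * (s + d))) / (r + lam)
plug_in = mpv(b_hat, p_hat, r, s, d)

# First-order terms, one per integral in the definition of R(s, d).
cost_part = integrate(
    lambda u: math.exp(-r * u) * (b_hat(u) - b_h(u)) * p_ih(u), s, s + d)
prob_part = integrate(
    lambda u: math.exp(-r * u) * b_h(u) * (p_hat(u) - p_ih(u)), s, s + d)

print("numerical vs closed form:", truth, closed_form)
print("plug-in error vs linearization:", plug_in - truth, cost_part + prob_part)
```

The gap between the plug-in error and its linearization is the second-order term $\int_s^{s+d} e^{-ru}(\hat b_h - b_h)(\hat P_{ih} - P_{ih})\,du$, which is what the Hadamard differentiability argument shows to be negligible at the $n^{1/2}$ scale.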
I By (3.28), (3.29) and (3.44), the limiting process R(s,d)can be written as R(s,d) = R,(s,d) + R2(s,d) + R3(s,d) , where R,(s,d) = (2’s,(s,d), R2(s,d) = 6'52(s,d) , R3(S, d) = = EELS“! e'mb,(u IZO)I:EPi g(0 V Izo)( PM": u Izo) Pgh(v.uIZo))dU,;g,(v):Idu g= -ll$g and we denote by S,(s,d) , 52(s,d) the expressions: s+d -ru . S,(s,d)= I e 202(u)P,,,(0,u|Z,,)du, s d s,(s,d) = I + 6%,, (u |Z0)F,,,(0,u |Z,,)du. (See p139-l40 for the definition of F,,,(0,u I201) By the Fubini-type Theorem from Appendix C, applied to the bounded functions H,,(u,v) = I,,.,+,,,(u)e'"‘b,,(u Izo)1,,,,,(v)13.,(o,vIzo)(P,,,(v,u IZo)- P,,,(v,u Izo)), u,v 6 [0,1] and the Lebesgue measure on [0,1] , R3(s,d) can be written as k R3(s,d) = exp(3,’,Z,,,-o)z Z £13.,(0,v IZo) x g=ll¢g XIEIls,s+d](u)1[v.1](u)(1)1110)!“I20)“ P8,, (V,“ '20)) e-mbh(u IZo)du]dU;gl(V), 157 k 3+ It so R3(s,d)= 22 L dS3g,(v)dUog,(v), g=ll¢g where S38! (V) = exp(fiézhjo)flg (O,V '20) X 3+ d -m X s Ilv.s+d](U)(Pul(V,uIZo)—Pgh(v,u'20)) e bh(UIZO)du. Using the same approach from Section 3.3.3, we obtain that R(s,d) is normally distributed, with mean zero and variance Var(R(s,d)) = Sl(s,d) '>:(a)“s,(s,d) + 32(s,d)'2;'S2(s,d) + + i Z Em S328,(v)d(v), gal latg . a u . . . where for g at l : U (I) = —LIQS——)—du . A consrstent estimator of the variance of °" s‘?’(u 130) g 9 R(s,d) is obtained replacing all unknown quantities by their corresponding consistent estimators. Comments 1) The technique proposed in this chapter separates the temporal dynamics of movement between states from the actual expenses. Transition probabilities and intensities that capture the former are estimated by Markov models, while the level of expense is modeled through mixed models. 2) Consider there are only two states: the initial state ‘0’ and the state we label as ‘1’. Denote by T the random time of transition from the initial state to the state ‘1’. 
For a given profile $Z_0$, the mean present value of all expenditures in $(0,t]$ incurred during sojourns in state '0' is
$$MPV_0^{(2)}(t \mid Z_0) = \int_0^t e^{-ru}\, b_0(u \mid Z_0)\, S(u \mid Z_0)\,du,$$
where $S(u \mid Z_0) = P_{00}(0,u \mid Z_0) = P(T \ge u \mid Z_0)$ is the survival function and $b_0(u \mid Z_0) = E(B_0(u) \mid T > u, Z_0)$. Under the assumption that $T$ is independent of the rate process $\{B_0(u), u > 0\}$, $b_0(u \mid Z_0) = E(B_0(u) \mid Z_0)$ is the expected rate of the accumulating cost. Therefore, if '0' and '1' are labels for the states of a patient being 'in-hospital' and 'discharged', the model described in Chapter 3 for sojourn costs with no discounting reduces to the model proposed in Chapter 2.

APPENDIX A

EXTENSION OF SLLN ON $D_E([0,1]^2)$

Let $I^2 = [0,1]^2$ and let $(E, \|\cdot\|)$ be a separable Banach space. Following Neuhaus (1971)$^{70}$ we introduce the space $D_E(I^2)$. Let $|\cdot|$ be the maximum norm in $\mathbb{R}^2$. For $A \subset \mathbb{R}^2$, $\bar A$ denotes the closure and $A^{\circ}$ the interior of $A$ in the $|\cdot|$-topology. Let $P = \{\rho = (\rho_1,\rho_2);\ \rho_1,\rho_2 \in \{0,1\}\}$ be the set consisting of the four vertices of $I^2$. Consider $t = (t_1,t_2) \in I^2$ and $\rho = (\rho_1,\rho_2) \in P$. We define the quadrants $Q(\rho,t)$ and $\tilde Q(\rho,t)$ in $I^2$ with vertex $t$ by
$$Q(\rho,t) = I(\rho_1,t_1) \times I(\rho_2,t_2), \quad \text{where } I(0,t_k) = [0,t_k),\ I(1,t_k) = (t_k,1],\ k \in \{1,2\},$$
and
$$\tilde Q(\rho,t) = \tilde I(\rho_1,t_1) \times \tilde I(\rho_2,t_2),$$
where, for $k \in \{1,2\}$,
$$\tilde I(0,t_k) = \begin{cases}[0,t_k) & \text{if } t_k < 1\\ [0,1] & \text{if } t_k = 1\end{cases}, \qquad \tilde I(1,t_k) = \begin{cases}[t_k,1] & \text{if } t_k < 1\\ \emptyset & \text{if } t_k = 1\end{cases},$$
and $\emptyset$ is the null set. Figures A.1-A.4 provide a visualization of the defined quadrants.

The following properties are immediate consequences of the above definitions:
$Q(\rho,t) \subset \tilde Q(\rho,t) \subset \overline{Q(\rho,t)}$;
$Q(\rho,t) = \emptyset$ if and only if $\tilde Q(\rho,t) = \emptyset$;
$\tilde Q(\rho,t) \cap \tilde Q(\rho',t) = \emptyset$ if $\rho \ne \rho'$;
$\bigcup_{\rho \in P} \tilde Q(\rho,t) = I^2$ for every $t \in I^2$.
Also, for every $t \in I^2$ there exists one and only one $\rho = \rho(t) \in P$ (denoted $\sigma$) with $t \in \tilde Q(\sigma,t)$. The quadrants $Q(\sigma,t)$ and $\tilde Q(\sigma,t)$ are called continuity quadrants in $t$. For these quadrants $\tilde Q(\sigma,t) \ne \emptyset$ and $\overline{Q(\sigma,t)} = \tilde Q(\sigma,t)$.

Definition of the "quadrant limit"
Consider the function $f : I^2 \to E$.
If, for a point $t \in I^2$ and a vertex $\rho \in P$ with $Q(\rho,t) \ne \emptyset$, the sequence $\{f(t_n)\}$ converges for every sequence $\{t_n\} \subset Q(\rho,t)$ with $t_n \to t$, then the limit (necessarily unique) is denoted $f(t + 0_\rho)$ and is called a $\rho$-limit of $f$ in $t$ or a "quadrant limit". □

Definition of the space $D_E(I^2)$
The space $D_E(I^2)$ is the set of all functions $f : I^2 \to E$ for which the $\rho$-limit of $f$ in $t$ exists for every $\rho \in P$, $t \in I^2$ with $Q(\rho,t) \ne \emptyset$, and which are "continuous from above", in the sense that $f(t) = f(t + 0_\sigma)$ for every $t$. □

Definition of a partition generated by points of $I^2$
Let $t_1,\ldots,t_r \in I^2$. The collection of all rectangles $R$ of the form
$$R = [u_1,u_1'\rangle \times [u_2,u_2'\rangle,$$
where $u_j, u_j' \in K_j = \{t_{1j},\ldots,t_{rj}\} \cup \{0,1\}$, $u_j < u_j'$, $j \in \{1,2\}$, is called the partition generated by $t_1,\ldots,t_r$ and is denoted $\mathcal{P} = \mathcal{P}(t_1,\ldots,t_r)$. The symbol "$\rangle$" means ")" if the right endpoint of the interval is less than 1 and "]" if it is equal to 1. □

Neuhaus (1971)$^{70}$ generalized the Skorohod metrics $d, d_0$ on the space $D_{\mathbb{R}}[0,1]$ to metrics $d, d_0$ on $D_{\mathbb{R}}(I^2)$ (actually on $D_{\mathbb{R}}(I^k)$, $I^k = [0,1] \times \cdots \times [0,1]$). The space $D_{\mathbb{R}}(I^2)$ is separable and complete with respect to the metric $d_0$. Then, just like $D_{\mathbb{R}}(I^2)$, the space $D_E(I^2)$ is also separable and complete with respect to $d_0$, replacing the absolute value on $\mathbb{R}$ with the norm on the space $E$. The metrics $d$ and $d_0$ are equivalent. The characterizations of compact sets of $D_{\mathbb{R}}[0,1]$ given by Theorems 14.3 and 14.4 in Billingsley (1968)$^{71}$ and generalized by Neuhaus (1971)$^{70}$ to $D_{\mathbb{R}}(I^2)$ do not carry over directly to $D_E(I^2)$, since in $E$ a closed, bounded set is not necessarily compact.
However, the given conditions are still necessary for compactness, even if they are no longer sufficient.$^{31}$

Necessary condition for compactness
If $K \subset D_E(I^2)$ is compact then
$$\lim_{\delta \to 0}\ \sup_{x \in K} w_x''(\delta) = 0,$$
where
$$w_x''(\delta) = \sup_{\substack{t \in [t_1,t_2]\\ |t_2 - t_1| < \delta}} \min\big(\|x(t) - x(t_1)\|,\ \|x(t) - x(t_2)\|\big),$$
with $[t_1,t_2]$ for $t_1,t_2 \in I^2$ denoting the Cartesian product $[t_{11},t_{21}] \times [t_{12},t_{22}]$. □

On $\mathbb{R}^2$ we say that $t \le u$ if and only if $t_1 \le u_1$ and $t_2 \le u_2$ (and similarly with "$\le$" replaced by the strict inequality "$<$"). This is only a partial order, not a total order.

Definition of a $D_E(I^2)$-valued random variable
By a $D_E(I^2)$-valued random variable we understand a function $X = X(t,\omega)$ such that
1) for each fixed $t \in I^2$, $X(t,\omega)$ is a random variable;
2) $X(\cdot,\omega) \in D_E(I^2)$ for almost all $\omega$. □

For $x \in D_E(I^2)$ we define $\|x\|_d = \sup_{t \in I^2} \|x(t)\|$. Then $(D_E(I^2), \|\cdot\|_d)$ is a normed space. The following lemma is a generalized version of Lemma 2 of Rao RR (1963).$^{43}$

Lemma. Let $X$ be a $D_E(I^2)$-valued random variable such that $E\|X\|_d < \infty$. Then, for each $\varepsilon > 0$, there exists a partition of $I^2$ generated by some points $\tau_1,\ldots,\tau_r$ such that
$$\sup_{t,t' \in R} E\|X(t) - X(t')\| \le \varepsilon$$
for every rectangle $R$ of the partition $\mathcal{P}(\tau_1,\ldots,\tau_r)$. □

Proof of the Lemma: For $a < b$ we define
$$p(a,b) = \sup_{t,t' \in [0,b) \setminus [0,a)} E\|X(t) - X(t')\|.$$
Recall that for $a \in I^2$, $[0,a) = [0,a_1) \times [0,a_2)$. Consider the set $\mathrm{Diag} = \{t \in I^2 : t_1 = t_2\}$ endowed with the order relations "$\le$", "$<$", where $t \le u$ ($t < u$) if and only if $t_1 \le u_1$ ($t_1 < u_1$). Let $\varepsilon > 0$. If $p((0,0),(1,1)) \le \varepsilon$ then define the point $\tau_1 = (1,1)$. As $t$ "travels" on Diag from $(0,0)$ to $(1,1)$, the function $t \mapsto p((0,0),t)$, $t \in \mathrm{Diag}$, is increasing (with respect to the order relation "$\le$" on Diag) and also continuous, by the Lemma hypothesis. As a result we can otherwise define $\tau_1 = \inf\{t \in \mathrm{Diag} : p((0,0),t) > \varepsilon\}$. Generally, define $\tau_j = (1,1)$ if $p(\tau_{j-1},(1,1)) \le \varepsilon$, and otherwise let $\tau_j = \inf\{t \in \mathrm{Diag} : t > \tau_{j-1},\ p(\tau_{j-1},t) > \varepsilon\}$. Next we show that $\tau_j = (1,1)$ for some $j$. If this is not true, there would exist a sequence $\{t_n\} \subset \mathrm{Diag}$, $\tau_n \le t_n < \tau_{n+1}$, such that for each $n$
$$E\|X(t_n) - X(\tau_n)\| > \varepsilon/2. \tag{A.1}$$
The sequence $\{\tau_n\}$ is increasing (with respect to the order relation "$\le$" on Diag) and bounded, so there exists $\tau \in \mathrm{Diag}$ such that $\tau_n \to \tau$. Then $X(t_n) - X(\tau_n) \to 0$ in $E$ as $n \to \infty$. We also have that $\|X(t_n) - X(\tau_n)\| \le 2\|X\|_d$ with $E\|X\|_d < \infty$. By the Dominated Convergence Theorem, (A.1) is then not possible. Consequently there exists $r$ such that $\tau_r = (1,1)$, and we consider the partition generated by $\tau_1,\ldots,\tau_r$. The stated Lemma is proved. ■

Now we state and prove a Strong Law of Large Numbers (SLLN) on $D_E(I^2)$. We follow the ideas of the proof of the SLLN on $D_E[0,1]$ given by Andersen and Gill (1982).$^{31}$

SLLN on $D_E(I^2)$
Let $X, X_1, X_2, \ldots$ be a sequence of i.i.d. $D_E(I^2)$-valued random variables such that $E\|X\|_d < \infty$. Then
$$\Big\|n^{-1}\sum_{i=1}^n X_i - EX\Big\|_d \to 0 \quad \text{a.e. as } n \to \infty. \qquad \square$$

Proof of the SLLN: The space $D_E(I^2)$ is separable and complete with respect to the Skorohod $d_0$ metric. Then any random element of $D_E(I^2)$ is tight. This follows from the following statement:

Proposition (Theorem 1.4, p10, Billingsley (1968)$^{71}$)
If $(S,\mathcal{S})$ is a metric space with $\mathcal{S}$ the class of Borel sets in $S$, and $S$ is separable and complete, then each probability measure on $(S,\mathcal{S})$ is tight. □

Therefore, because $E\|X\|_d < \infty$,
i) for every $\varepsilon > 0$ there exists a compact set $K \subset D_E(I^2)$ such that $E\|X\|_d \mathbf{1}[X \notin K] < \varepsilon$.

Next we show that
ii) for every $\varepsilon > 0$ and every compact set $K \subset D_E(I^2)$ there exists $\delta > 0$ such that if $x \in K$ and $\alpha \le t < \beta \le \alpha + (\delta,\delta)$ then
$$\|x(t) - x(\alpha)\| \le \|x(\beta + 0_{\rho_0}) - x(\alpha)\| + \varepsilon,$$
where the vertex $\rho_0 = (0,0)$.

Let $\varepsilon > 0$ and let $K \subset D_E(I^2)$ be a compact set. Using the previously stated necessary condition for compactness, there exists $\delta > 0$ such that $\sup_{x \in K} w_x''(\delta) \le \varepsilon$. By the definition of $w_x''(\delta)$, if $x \in K$ and $\alpha \le t < \beta \le \alpha + (\delta,\delta)$ then
$$\min\big(\|x(t) - x(\alpha)\|,\ \|x(\beta - \varepsilon') - x(t)\|\big) \le \varepsilon$$
for every $\varepsilon' = (\varepsilon_1',\varepsilon_2')$, $\varepsilon_1',\varepsilon_2' > 0$, such that $\alpha \le t \le \beta - \varepsilon' < \beta$. Because $x \in D_E(I^2)$,
$$\lim_{\varepsilon' \to (0,0)} x(\beta - \varepsilon') = x(\beta + 0_{\rho_0}).$$
Consequently
$$\min\big(\|x(t) - x(\alpha)\|,\ \|x(\beta + 0_{\rho_0}) - x(t)\|\big) \le \varepsilon.$$
If $\|x(t) - x(\alpha)\| \le \varepsilon$ then ii) is obviously satisfied.
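If $\|x(\beta + 0_{\rho_0}) - x(t)\| \le \varepsilon$ then
$$\|x(t) - x(\alpha)\| \le \|x(\beta + 0_{\rho_0}) - x(\alpha)\| + \|x(\beta + 0_{\rho_0}) - x(t)\| \le \|x(\beta + 0_{\rho_0}) - x(\alpha)\| + \varepsilon,$$
so ii) is again verified.

The last property we prove is:
iii) for every $\varepsilon > 0$ and every $\delta > 0$ there exists a partition $\mathcal{P}$ of $I^2$ generated by some points $t_1,\ldots,t_{N-1}$ such that for each rectangle $R \in \mathcal{P}$, $R = [\alpha,\beta\rangle$, we have $|\beta - \alpha| < \delta$ and $E\|X(\beta + 0_{\rho_0}) - X(\alpha)\| \le \varepsilon$.

By the stated Lemma, there exists a partition of $I^2$ generated by some points $\tau_1,\ldots,\tau_r \in \mathrm{Diag}$ such that $\sup_{t,t' \in R} E\|X(t) - X(t')\| \le \varepsilon$ for every rectangle $R$ of the partition $\mathcal{P}(\tau_1,\ldots,\tau_r)$. Taking on Diag intermediate points between $\tau_j$ and $\tau_{j+1}$, we define a finer partition $\mathcal{P} = \mathcal{P}(t_1,\ldots,t_{N-1})$ of $I^2$ such that for every $R \in \mathcal{P}$, $R = [\alpha,\beta\rangle$, we have $|\beta - \alpha| < \delta$ and $E\|X(t) - X(t')\| \le \varepsilon$ for all $t,t' \in R$. Then, taking $\rho_0$-limits of $X$ in $\beta$ and $\alpha$ in the previous relation (possible because $E\|X\|_d < \infty$), we obtain $E\|X(\beta + 0_{\rho_0}) - X(\alpha)\| \le \varepsilon$, so iii) follows.

In the following we use properties i)-iii) to prove the SLLN. Consider an arbitrary $\varepsilon > 0$. We choose a compact set $K$ by i), a $\delta > 0$ by ii) and finally the partition $\mathcal{P} = \mathcal{P}(t_1,\ldots,t_{N-1})$ by iii). First we show that
$$\sup_{t \in [0,1) \times [0,1)} \Big\|n^{-1}\sum_{i=1}^n X_i(t) - EX(t)\Big\| \to 0 \quad \text{a.e. as } n \to \infty. \tag{A.2}$$
Let $t$ be a point of $[0,1) \times [0,1)$. Then there exists a rectangle $R$ in the partition $\mathcal{P}$ such that $t \in R$. Denote $R = [\alpha,\beta)$. We have that
$$\Big\|n^{-1}\sum_{i=1}^n X_i(t) - EX(t)\Big\| \le \varepsilon_n^K(t) + n^{-1}\sum_{i=1}^n \|X_i\|_d \mathbf{1}[X_i \notin K] + E\|X\|_d \mathbf{1}[X \notin K], \tag{A.3}$$
where
$$\varepsilon_n^K(t) = \Big\|n^{-1}\sum_{i=1}^n X_i(t)\mathbf{1}[X_i \in K] - EX(t)\mathbf{1}[X \in K]\Big\|.$$
The quantity
$$\varepsilon_n^K(t) \le \varepsilon_n^K(\alpha) + n^{-1}\sum_{i=1}^n \|X_i(t) - X_i(\alpha)\| \mathbf{1}[X_i \in K] + E\|X(t) - X(\alpha)\| \mathbf{1}[X \in K].$$
Then, by ii) and iii),
$$\varepsilon_n^K(t) \le \varepsilon_n^K(\alpha) + n^{-1}\sum_{i=1}^n \|X_i(\beta + 0_{\rho_0}) - X_i(\alpha)\| + \varepsilon + E\|X(\beta + 0_{\rho_0}) - X(\alpha)\| + \varepsilon$$
$$\le \varepsilon_n^K(\alpha) + n^{-1}\sum_{i=1}^n \|X_i(\beta + 0_{\rho_0}) - X_i(\alpha)\| + 3\varepsilon.$$
We will apply a SLLN on separable Banach spaces, first proved by Mourier (1953)$^{45}$:

SLLN on Banach spaces: If $(\mathcal{X}, \|\cdot\|)$ is a separable Banach space and $\{V_n\}$ is a sequence of i.i.d. random elements in $\mathcal{X}$ such that $E\|V_1\| < \infty$, then $\big\|n^{-1}\sum_{i=1}^n V_i - EV_1\big\| \to 0$ a.e.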
as $n \to \infty$. □

By this SLLN, $\varepsilon_n^K(\alpha) \to 0$ a.e., and by the regular SLLN (for real valued random variables)
$$n^{-1}\sum_{i=1}^n \|X_i(\beta + 0_{\rho_0}) - X_i(\alpha)\| \to E\|X(\beta + 0_{\rho_0}) - X(\alpha)\| \quad \text{a.e. as } n \to \infty.$$
Therefore by iii)
$$\limsup_{n \to \infty}\ \sup_{t \in [0,1) \times [0,1)} \varepsilon_n^K(t) \le 0 + E\|X(\beta + 0_{\rho_0}) - X(\alpha)\| + 3\varepsilon \le 4\varepsilon. \tag{A.4}$$
We apply again the SLLN on Banach spaces and then, by (A.3),
$$\limsup_{n \to \infty}\ \sup_{t \in [0,1) \times [0,1)} \Big\|n^{-1}\sum_{i=1}^n X_i(t) - EX(t)\Big\| \le \limsup_{n \to \infty}\ \sup_{t \in [0,1) \times [0,1)} \varepsilon_n^K(t) + 2E\|X\|_d \mathbf{1}[X \notin K].$$
By (A.4) and property i),
$$\limsup_{n \to \infty}\ \sup_{t \in [0,1) \times [0,1)} \Big\|n^{-1}\sum_{i=1}^n X_i(t) - EX(t)\Big\| \le 6\varepsilon.$$
Letting $\varepsilon$ converge to zero, we obtain (A.2).

The following is the extension of the SLLN to the space $D_E[0,\tau]$, proved by Andersen and Gill (1982)$^{31}$:

SLLN on $D_E[0,\tau]$
Let $\{V_n\}$ be a sequence of i.i.d. random elements of $D_E[0,\tau]$. Suppose $E\|V_1\| = E\big(\sup_{t \in [0,\tau]} \|V_1(t)\|\big) < \infty$. Then $\big\|n^{-1}\sum_{i=1}^n V_i - EV_1\big\| \to 0$ a.e. as $n \to \infty$. □

By this extended SLLN:
$$\sup_{t \in \{(t_1,1),\ t_1 \in [0,1]\}} \Big\|n^{-1}\sum_{i=1}^n X_i(t) - EX(t)\Big\| = \sup_{t_1 \in [0,1]} \Big\|n^{-1}\sum_{i=1}^n X_i(t_1,1) - EX(t_1,1)\Big\| \to 0 \quad \text{a.e.} \tag{A.5}$$
because $X_i(\cdot,1)$ are $D_E[0,1]$-valued random variables and $E \sup_{t_1 \in [0,1]} \|X_1(t_1,1)\| \le E\|X_1\|_d < \infty$. Similarly
$$\sup_{t \in \{(1,t_2),\ t_2 \in [0,1]\}} \Big\|n^{-1}\sum_{i=1}^n X_i(t) - EX(t)\Big\| \to 0 \quad \text{a.e.} \tag{A.6}$$
By (A.2), (A.5) and (A.6), the SLLN on $D_E(I^2)$ follows. ■
Figure A.1: Definition of the quadrants $\tilde Q(\rho,t)$ when $t \in [0,1) \times [0,1)$. Panels: $\rho = (0,0)$: $\tilde Q(\rho,t) = [0,t_1) \times [0,t_2)$; $\rho = (0,1)$: $\tilde Q(\rho,t) = [0,t_1) \times [t_2,1]$; $\rho = (1,0)$: $\tilde Q(\rho,t) = [t_1,1] \times [0,t_2)$; $\rho = (1,1)$: $\tilde Q(\rho,t) = [t_1,1] \times [t_2,1]$.

Figure A.2: Definition of the quadrants $\tilde Q(\rho,t)$ when $t \in \{(t_1,1),\ t_1 \in [0,1)\}$. Panels: $\rho = (0,0)$: $\tilde Q(\rho,t) = [0,t_1) \times [0,1]$; $\rho = (1,0)$: $\tilde Q(\rho,t) = [t_1,1] \times [0,1]$; $\rho \in \{(0,1),(1,1)\}$: $\tilde Q(\rho,t) = \emptyset$.

Figure A.3: Definition of the quadrants $\tilde Q(\rho,t)$ when $t \in \{(1,t_2),\ t_2 \in [0,1)\}$. Panels: $\rho = (0,0)$: $\tilde Q(\rho,t) = [0,1] \times [0,t_2)$; $\rho = (0,1)$: $\tilde Q(\rho,t) = [0,1] \times [t_2,1]$; $\rho \in \{(1,0),(1,1)\}$: $\tilde Q(\rho,t) = \emptyset$.

Figure A.4: Definition of the quadrants $\tilde Q(\rho,t)$ when $t = (1,1)$. Panels: $\rho = (0,0)$: $\tilde Q(\rho,t) = [0,1] \times [0,1]$; $\rho \in \{(0,1),(1,0),(1,1)\}$: $\tilde Q(\rho,t) = \emptyset$.

APPENDIX B

The Functional Delta Method

We briefly review some results from Gill (1989).$^{72}$ A concept of differentiability that allows a generalization of the usual Delta Method is that of Hadamard or compact differentiability. Let $B_1, B_2$ denote two normed vector spaces.

Definition
The functional $\varphi : B_1 \to B_2$ is compactly or Hadamard differentiable at a point $\theta \in B_1$ if and only if a continuous linear map $d\varphi(\theta) : B_1 \to B_2$ exists, such that for all real sequences $a_n \to \infty$ and all convergent sequences $h_n \to h \in B_1$,
$$a_n\big(\varphi(\theta + a_n^{-1}h_n) - \varphi(\theta)\big) \to d\varphi(\theta).h \quad \text{as } n \to \infty.$$
Here $d\varphi(\theta)$ is called the derivative of $\varphi$ at the point $\theta$. (See Definitions 1-3, p100 and the characterizations of differentiability, p102, Gill (1989)$^{72}$.) □

An important property of Hadamard differentiation is that it satisfies the chain rule: if $\varphi : B_1 \to B_2$ and $\psi : B_2 \to B_3$ are Hadamard differentiable at $x \in B_1$ and $\varphi(x) \in B_2$ respectively, then $\psi \circ \varphi : B_1 \to B_3$ is Hadamard differentiable at $x$, with derivative $d\psi(\varphi(x)).d\varphi(x)$.

Next we define the concept of weak convergence in normed vector spaces. Let $(B, \|\cdot\|)$ be a normed vector space endowed with a $\sigma$-algebra $\mathcal{B}$ such that $\mathcal{B}' \subset \mathcal{B} \subset \mathcal{B}''$, where $\mathcal{B}'$ and $\mathcal{B}''$ are the $\sigma$-algebras generated by the open balls and the open sets of $B$, respectively. Thus $\mathcal{B}''$ is the Borel $\sigma$-algebra; when $B$ is separable, $\mathcal{B}' = \mathcal{B}''$.
Definition (See Definition 4, Gill (1989)$^{72}$)
Let $X_n$ be a sequence of random elements of $(B,\mathcal{B})$ and let $X$ be another random element of that space. We say $X_n$ converges weakly (or in distribution) to $X$, and we write $X_n \xrightarrow{D} X$, if and only if $Ef(X_n) \to Ef(X)$ for all bounded, norm-continuous, $\mathcal{B}$-measurable $f : B \to \mathbb{R}$. □

The full functional version of the Delta Method is given by Gill (1989)$^{72}$, Theorem 3:

Theorem (Functional Delta Method)
Suppose $\varphi : B_1 \to B_2$ is compactly differentiable at a point $\mu \in B_1$ and both it and its derivative are measurable with respect to the $\sigma$-algebras $\mathcal{B}_1$ and $\mathcal{B}_2$ (each nested between the open ball and Borel $\sigma$-algebras). Suppose $X_n$ is a sequence of random elements of $B_1$ such that $Z_n = n^{1/2}(X_n - \mu) \xrightarrow{D} Z$ in $B_1$, where the distribution of $Z$ is concentrated on a separable subset of $B_1$. Suppose addition: $B_2 \times B_2 \to B_2$ is measurable (see Remark 2 below). Then
$$(1) \quad \big[n^{1/2}(X_n - \mu),\ n^{1/2}(\varphi(X_n) - \varphi(\mu)) - d\varphi(\mu).n^{1/2}(X_n - \mu)\big] \xrightarrow{D} (Z,0) \ \text{in } B_1 \times B_2$$
and consequently (in particular)
$$(2) \quad n^{1/2}(\varphi(X_n) - \varphi(\mu)) - d\varphi(\mu).n^{1/2}(X_n - \mu) \xrightarrow{P} 0,$$
$$(3) \quad n^{1/2}(\varphi(X_n) - \varphi(\mu)) \xrightarrow{D} d\varphi(\mu).Z. \qquad \square$$

Remark 1: Measurability of $d\varphi(\mu) : B_1 \to B_2$ can often be shown to follow from measurability of $\varphi$ (see Lemmas 4.4.3 and 4.4.4, van der Vaart (1988)$^{73}$). □

Remark 2: For $x = (x_1,x_2) \in B_1 \times B_2$ we define $\|x\| = \max(\|x_1\|,\|x_2\|)$ and we give the product spaces $B_1 \times B_2$ and $B_2 \times B_2$ their product $\sigma$-algebras. If $B_1$ and $B_2$ are $D[0,\tau]^p \times \mathbb{R}^q$ for some finite $p, q$ and $\mathcal{B}_1$, $\mathcal{B}_2$ are the open-ball $\sigma$-algebras, then all product $\sigma$-algebras are also the open-ball $\sigma$-algebras with respect to the max norm. If one is only interested in getting (3), it suffices (qua measurability) to assume that left and right hand sides here are random elements of $B_2$. □

The following is a useful lemma.
In many applications the mapping $\varphi$ is only a priori defined on certain members of $B_1$, and one could set about choosing a particular extension to all of $B_1$ such that the hypotheses of the Functional Delta Method are satisfied in each particular application.

Lemma (see Lemma 1, Gill (1989)$^{72}$)
Consider $x \in E \subset B_1$ and $\varphi : E \to B_2$. Suppose there exists a continuous linear map $d\varphi(x) : B_1 \to B_2$ such that for all $t_n \to 0$ ($t_n \in \mathbb{R}$) and $h_n \to h \in B_1$ such that $x_n = x + t_n h_n \in E$ for all $n$, we have
$$t_n^{-1}\big(\varphi(x + t_n h_n) - \varphi(x)\big) \to d\varphi(x).h \quad \text{as } n \to \infty.$$
Then $\varphi$ can be extended to $B_1$ in such a way that it is differentiable at $x$, with derivative $d\varphi(x)$. The derivative is unique if the closed linear span of possible limit points $h$ equals $B_1$. □

Comments: Let $D[0,1]$ be the space of real functions on $[0,1]$ that are right-continuous with left-hand limits. We endow this space with $\|\cdot\|_\infty$, the supremum norm. In Chapter 3 we consider spaces like $B = (D[0,1])^p$. Under the Skorohod topology these spaces are Banach and separable. Under the max-supremum norm separability no longer holds. In these spaces, if the limiting process has continuous sample paths then weak convergence in the sense of the Skorohod metric and in the sense of the supremum norm are exactly equivalent. Otherwise, supremum norm convergence is stronger. □

The next proposition characterizes the compact differentiability of the product integral. As a reference see Theorem 8, Gill and Johansen (1990).$^{74}$ We state the result in the form given by Andersen et al. (1993)$^{29}$ (see Proposition II.8.7, p114):

Proposition 2
Let $E_M^k \subset (D[0,1])^{k^2}$ be the set of $k \times k$ matrix cadlag functions with components of total variation bounded by the constant $M$. Let $\varphi : E_M^k \to D[0,1]^{k^2}$ be defined by
$$\varphi(X) = \prod_{[0,\cdot]} (I + dX).$$
Let $X$ be a fixed point of $E_M^k$.
Then $\varphi$ can be extended to $D[0,1]^{k^2}$ so as to be compactly differentiable at $X$, with derivative
$$(d\varphi(X).H)(t) = \int_{s \in [0,t]}\ \prod_{[0,s)} (I + dX)\ H(ds) \prod_{(s,t]} (I + dX),$$
where, when $H$ is not of bounded variation, the last integral is defined by the application (twice) of the integration by parts formula and the forward and backward integral equations (see Theorem 5, Gill and Johansen (1990)$^{74}$). □

The following continuity result is proved in Theorem 7, Gill and Johansen (1990)$^{74}$: if $X_n, X$ in $E_M^k$ are such that $X_n \to X$ in supremum norm, then $\prod(I + dX_n) \to \prod(I + dX)$ in supremum norm.

APPENDIX C

Results on Ito Integration

Let $(\Omega, \mathcal{F}, P)$ be a filtered, complete probability space, with filtration $\mathbf{F} = (\mathcal{F}_t)_{t \ge 0}$, $\mathcal{F}_0$ containing all null sets of $\mathcal{F}$. Let $M$ be a continuous Gaussian martingale on this space, $M = \{M(t), t \in \mathcal{T}\}$, with $\mathcal{T} = [0,\tau]$, $\tau < \infty$ (see the definition of Gaussian martingales in Section 3.2.1). Let $\langle M \rangle = V$ be a continuous, deterministic, positive, increasing function on $\mathcal{T}$, zero at time zero.

Note: The results in stochastic calculus are usually stated for the standard Brownian motion (e.g. Harrison (1985)$^{75}$, Oksendal (1995)$^{76}$). They can be easily translated for continuous Gaussian martingales, which are called "time-transformed" Brownian motions. □

Let $H^2$ be the set of all adapted processes $X$ on $(\Omega, \mathcal{F}, P, \mathbf{F})$ satisfying $E\int_0^\tau X^2(s)\,V(ds) < \infty$ [...] $X$ in $H^2$. As a reference see p92-95, Liptser-Shiryayev (1977).$^{77}$ For $X \in H^2$ there exists a random variable $I(X) \in L^2$, unique up to a null set, such that $I(X_n) \to I(X)$ in $L^2$ for each simple sequence $\{X_n\}$ satisfying $X_n \to X$ in $H^2$. Furthermore, $E[I(X)] = 0$ and $\|I(X)\| = \|X\|$ (see Proposition 11, p58-59, Harrison (1985)$^{75}$). This concludes our short sketch of the definition of $I_t(X)$ for $X \in H^2$ and a fixed time $t \ge 0$.

Properties of the Ito integral: Let $X, Y \in H^2$ and let $0 \le s$ [...] $\|X\|$). Then $I_t(X_n) \to I_t(X)$ in $L^2$, so $I_t(X_n)$ converges in distribution to $I_t(X)$.
The distribution function of $I_t(X_n)$ is $\Phi\big(x / \|X_n\|\big)$ (where $\Phi$ is the standard normal distribution function), which converges to $\Phi\big(x / \|X\|\big)$, the distribution function of $I_t(X)$. Thus iv) follows. ■

The next theorem is a type of Fubini's Theorem for stochastic integration.

Fubini-type Theorem
Let $(s,u) \mapsto H(s,u)$, $(s,u) \in \mathcal{T} \times \mathcal{T}$, be a bounded, $\mathcal{B} \times \mathcal{B}$ measurable function, where $\mathcal{B}$ is the set of Borelians on $\mathcal{T}$. Let $\mu$ be a finite measure on the space $(\mathcal{T},\mathcal{B})$. Then, for every $t \in \mathcal{T}$, $A \in \mathcal{B}$:
$$\int_A \left[\int_0^t H(s,u)\,dM(u)\right] \mu(ds) = \int_0^t \left[\int_A H(s,u)\,\mu(ds)\right] dM(u) \quad \text{for a.a. } \omega. \qquad \square$$
For more general versions of this theorem and their proofs, see p159-161, Protter (1990).$^{78}$

REFERENCES

1. Lipscomb J, Ancukiewicz M, Parmigiani G, Hasselblad V, Samsa G, Matchar DB. Predicting the Cost of Illness: A comparison of alternative models applied to stroke. Medical Decision Making. 1998;18 suppl:S39-S56.
2. Intriligator MD, Bodkin RG, Hsiao C. Econometric Models, Techniques, and Applications. Second ed. Upper Saddle River: Prentice Hall; 1996.
3. Wooldridge JM. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press; 1999.
4. Duan N. Smearing estimate: A nonparametric retransformation method. Journal of the American Statistical Association. 1983;78(383):605-610.
5. Zhou XH, Melfi CA, Hui SL. Methods for comparison of cost data. Annals of Internal Medicine. 1997;127(8):752-756.
6. Mullahy J. Much ado about two: Reconsidering retransformation and the two-part model in health econometrics. Journal of Health Economics. 1998;17:247-281.
7. Dudley RA, Harrell FE Jr, Smith LR, et al. Comparison of analytic models for estimating the effect of clinical factors on the cost of coronary artery bypass graft surgery. Journal of Clinical Epidemiology. 1993;46(3):261-271.
8. Smith LR, Milano CA, Molter BS, Elbeery JR, Sabiston DC, Smith PK. Preoperative determinants of postoperative costs associated with coronary artery bypass graft surgery. Circulation. 1994;90(5, Part 2):124-128.
Etzioni RD, Feuer EJ, Sullivan SD, Lin D, Hu C, Ramsey SD. On the use of survival analysis techniques to estimate medical care costs. Journal of Health Economics. 1999;18:365-380. Lin DY, Feuer EJ, Etzioni R, Wax Y. Estimating medical costs from incomplete follow-up data. Biometrics. 1997;53:419-434. Lin DY. Proportional means regression for censored medical costs. Biometrics. 2000;56:775-778. Hallstrom AP, Sullivan SD. On estimating costs for economic evaluation in failure time studies. Medical Care. l998;36(3):433-436. Bang H, Tsiatis AA. Estimating medical costs with censored data. Biometrika. 2000;87(2):329-343. Gardiner J, Hogan A, Holmes-Rovner M, Rovner D, Griffith L, Kupersmith J. Confidence intervals for cost-effectiveness ratios. Medical Decision Making. 1995;15:254-263. Gardiner J, Holmes-Rovner M, Goddeeris J, Rovner D, Kupersmith J. Covariate- adjusted cost-effectiveness ratios. Journal of Statistical Planning and Inference. 1999;75:291-304. Lin DY. Linear regression of censored medical costs. Biostatistics. 2000;1:35-47. Rapoport J, Teres D, Lemeshow S. Explaining variability of cost using a severity- of—illness measure for ICU patients. Medical Care. 1990;28:338-348. 184 18. 19. 20. 21. 22. 23. 24. 25. Jones KR. Predicting hospital charge and stay variation: the role of patient teaching status, controlling for diagnosis related groups, demographic characteristics, and severity of illness. Medical Care. 1995;23:220-235. Silberbach M, Shumaker D, Menashe V, Cobanoglu A, Morris C. Predicting hospital charge and length of stay for congenital heart disease surgery. Am J Cardiol. 1993;72:958-963. Calvin JE, Klein LW, VandenBerg BJ, Meyer P, Ramirez-Morgen LM, Parrillo JE. Clinical predictors easily obtained at presentation predict resource utilization in unstable angina. American Heart Journal. 1998;136:373-381. Benzaquen BS, Eisenberg MJ, Challapalli R, Nguyen T, Brown KJ, Topol El. 
Correlates of in-hospital cost among patients undergoing abdominal aortic aneurysm repair. American Heart Journal. 1998;136:696-702. Krumholz HM, Chen J, Murillo JE, Cohen DJ, Radford MJ. Clinical correlates of in-hospital costs for acute myocardial infarction in patients 65 years of age and older. American Heart Journal. 1998;135:523-531. Wei LJ, Lin DY, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modeling marginal distritutions. Journal of the American Statistical Association. 1989;84(408):1065-1073. Hoem J M, Aalen OO. Actuarial values and payment streams. Scandinavian Actuarial Journal. 1978:38-47. Praestgaard J. Nonparametric estimation of actuarial values. Scandinavian Actuarial Journal. 1991;2: 129- 143. 185 26. 27. 28. 29. 30. 31. 32. 33. 35. Norberg R. Payment measures, interest, and discounting - an axiomatic approach with applications to insurance. Scandinavian Actuarial Journal. 1990: 14-33. Norberg R. Reserves in life and pension insurance. Scandinavian Actuarial Journal. 1991:3-24. Norberg R. Hattendorffs Theorem and Thiele's Differential Equation Generalized. Scandinavian Actuarial Journal. 1992:2-14. Andersen PK, Borgan O, Gill RD, Keiding N. Statistical Models Based on Counting Processes. New York: Springer-Verlag; 1993. Lin DY, Wei U. The robust inference for the cox proportional hazards model. Journal of the American Statistical Association. 1989;84(408):1074-1078. Andersen PK, Gill RD. Cox's regression model for counting processes: A large sample study. Annals of Statistics. l982;10(4):1100-1120. Wei LJ, Lachin JM. Two-sample Asymptotically Distribution-Free Tests for Incomplete Multivariate Observations. Journal of the American Statistical Association. 1984;79:653-661. Thomas DR, Grunkemeicr GL. Confidence interval estimation of survival probabilities for censored data. Journal of American Statistical Association. 1975;70:865-871. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. 
New York: John Wiley & Sons; 1980. Klein JP, Moeschberger ML. Survival Analysis: Techniques for Censored and Truncated Data. New York: Springer-Verlag; 1997. 186 36. 37. 38. 39. 40. 41. 42. 43. 45. J acod J. Multivariate point processes: Predictable projection, Radon-Nikodym derivatives, representation of martingales. Z. Wahrsch. verw. Geb. 1975;31:235- 253. J acod J. On the stochastic intensity of a random point process over the half-line. Technical Report 15, Department of Statistics, Princeton University. 1973. J ohansen S. An extension of Cox's regression model. International Statistical Review. 1983;51:258-262. Arjas E, Haara P. A marked point process approach to censored failure time data with complicated covariates. Scandinavian Journal of Statistics. 1984;11:193- 209. Borgan O. Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scandinavian Journal of Statistics. 1984;11:1-16. Fabian V, Hannan J. Introduction to probability and mathematical statistics. 1985. J ennrich RI. Asymptotic properties of non-linear least squares estimators. Annals of Mathematical Statistics. 1969;40(2):633-643. Rao RR. The law of large numbers for D[O,1]-valued random variables. Theory of Probabability with Applications. 1963;8:70-74. Billingsley P. Statistical Inferences for Markov Processes. Chicago: University of Chicago Press; 1961. Mourier E. Elements aleatoires dans un espace de Banach. Ann. Inst. H. Poincare. 1953;13:161-244. 187 46. 47. 48. 49. 50. 51. 52. 53. Charlson ME, Pompei P, Ales KL, Mackenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. Journal of Chronic Diseases. 1987;5:373-383. Matsui K, Goldman L, Johnson PA, Kuntz KM, Cook F, Lee TH. Comorbidity as a correlate of length of stay for hospitalized patients with acute chest pain. Journal of General Internal Medicine. 1996;11:262-268. Fenn P, McGuire A, Backhouse M, Jones D. 
Modelling programme costs in economic evaluation. Journal of Health Economics. 1996;15:115-125. Polverejan E, Gardiner J C, Bradley CJ, Holmes-Rovner M, Rovner D. Estimating mean hospital cost as a function of length of stay and patient characteristics. In review. 2001. Longini IR, Byers RH, Hessol NA, Tan WY. Estimating the stage-specific numbers of HIV infection using a Markov model and backcalculation. Statistics in Medicine. 1992;] 1:831-843. Satten GA, Longini IR. Markov chains with measurement error: estimating the 'true' course of a marker of the progression of HIV disease. Applied Statistics. 1996;45:275-309. Aalen OO, Farewell VT, de-Angelis D, Day NE, Gill ON. A Markov model for HIV disease progression including the effect of HIV diagnosis and treatment: Application to AIDS prediction in England and Wales. Statistics in Medicine. 1997;16:2191-2210. Longini [M], Clark WS, Byers RH, et al. Statistical analysis of the stages of HIV infection using a markov model. Statistics in Medicine. 1989;8z831-843. 188 54. 55. 56. 57. 58. 59. 61. 62. Gentleman RC, Lawless JF, Lindsey J C, Yan P. Multi-state markov models for analysing incomplete disease history data with illustrations for HIV disease. Statistics in Medicine. 1994;13:805-821. Hansen BE, Thorogood J, Hermans J, Ploeg RJ, Bockel JHV, Houwelingen JCV. Multistate modelling of liver transplantation data. Statistics in Medicine. 1994;13:2517-2529. Dabrowska DM, Guo-wen S, Horowitz MM. Cox regression in a Markov renewal model: an application to the analysis of bone marrow transplant data. Journal of the American Statistical Association. 1994;89:876-877. Wanek LA, Elashoff RM, Goradia TM, Morton DL, Cochran A]. Application of multistage markov modeling to malignant melanoma progression. Cancer. 1994;73:336-343. Perez-Ocon R, Ruiz-Castro JE, Gamiz-Rerez ML. A multivariate model to measure the effect of treatments in survival to breast cancer. Biometrical Journal. 1998;40:703-715. 
59. Gold MR, Siegel JE, Russell LB, Weinstein MC, eds. Cost-Effectiveness in Health and Medicine. New York: Oxford University Press; 1996.
60. Jacobsen M. Statistical Analysis of Counting Processes. New York: Springer-Verlag; 1982.
61. Verbeke G, Molenberghs G, eds. Linear Mixed Models in Practice: A SAS-Oriented Approach. New York: Springer-Verlag; 1997.
62. Verbeke G, Molenberghs G. Linear Mixed Models for Longitudinal Data. New York: Springer-Verlag; 2000.
63. Diggle PJ, Liang KY, Zeger SL. The Analysis of Longitudinal Data. Oxford, UK: Oxford University Press; 1994.
64. Cnaan A, Laird NM, Slasor P. Using the general linear mixed model to analyse unbalanced repeated measures and longitudinal data. Statistics in Medicine. 1997;16:2349-2380.
65. Laird NM, Ware JH. Random-Effects Models for Longitudinal Data. Biometrics. 1982;38:963-974.
66. Cox DR, Hinkley DV. Theoretical Statistics. London: Chapman & Hall; 1990.
67. Amemiya T. Advanced Econometrics. Cambridge, MA: Harvard University Press; 1985.
68. Manning WG, Duan N, Rogers WH. Monte Carlo evidence on the choice between sample selection and two-part models. Journal of Econometrics. 1987;35:59-82.
69. Duan N, Manning WG Jr, Morris CN, Newhouse JP. Choosing between the sample-selection model and the multi-part model. Journal of Business & Economic Statistics. 1984;2(3):283-289.
70. Neuhaus G. On weak convergence of stochastic processes with multidimensional time parameter. Annals of Mathematical Statistics. 1971;42(4):1285-1295.
71. Billingsley P. Convergence of probability measures. New York: Wiley; 1968.
72. Gill RD. Non- and Semi-parametric Maximum Likelihood Estimators and the von Mises Method (Part 1). Scandinavian Journal of Statistics. 1989;16:97-128.
73. van der Vaart AW. Statistical estimation in large parameter spaces. Vol 44. Amsterdam: Centrum voor Wiskunde en Informatica; 1988.
74. Gill RD, Johansen S. A survey of product-integration with a view towards application in survival analysis. Annals of Statistics. 1990;18:1501-1555.
75. Harrison JM. Brownian Motion and Stochastic Flow Systems. New York: John Wiley; 1985.
76. Oksendal B. Stochastic Differential Equations. Fourth ed. New York: Springer-Verlag; 1995.
77. Liptser RS, Shiryayev AN. Statistics of Random Processes I. New York: Springer-Verlag; 1977.
78. Protter P. Stochastic Integration and Differential Equations. New York: Springer-Verlag; 1990.