MODELING HOSPITAL LENGTH OF STAY AND COST WITH HETEROGENEITY By Xiaoqin Tang A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Statistics 2010 ABSTRACT MODELING HOSPITAL LENGTH OF STAY AND COST WITH HETEROGENEITY By Xiaoqin Tang Hospital length of stay (LOS) is an important measure of healthcare utilization and is generally positively skewed and heterogeneous. We fit a Coxian phase-type distribution to LOS and identify the hidden states of the underlying latent homogeneous Markov model. We demonstrate that selecting an appropriate number of phases and a regression model for hazard rates can account for some heterogeneity in LOS. The Reversible Jump Markov Chain Monte Carlo (RJMCMC) method enables us to dynamically uncover the hidden stochastic Markov structure. A classification method is used to assign patients to different latent LOS groups according to their mean LOS in hospitals. Increasing availability of patient LOS and cost data permits joint analysis accounting for their possible correlations. A bivariate Coxian phase-type/log-normal (CPH-LN) model is proposed to assess the impacts of covariates simultaneously. Under marginal specification through parametric models for LOS and cost, shared random effects are introduced in the model as regressors and the model is easily estimated using SAS Proc NLMIXED. We also propose an innovative method for the consideration of two-level correlations between LOS and cost. In our model, we are concerned with intra-hospital correlations, crossequation correlations at both the hospital level and patient level. Full maximum likelihood (FML) is used to derive parameter estimates. A simulation study is conducted to illustrate our method. The methodologies are illustrated with application to hospital admissions for acute my- ocardial infarction (AMI) in the 2003 Nationwide Inpatient Sample (NIS) from the Healthcare Utilization Project (HCUP). Copyright by XIAOQIN TANG 2010 This thesis is dedicated to my parents for all their love and support! v ACKNOWLEDGMENTS I would like to express my sincere gratitude to my advisor Dr. Joseph C. Gardiner for his valuable advice, patience, support and continuing encouragement during my graduate studies in the Department of Epidemiology, Michigan State University. He allowed me to investigate different areas of research. Working under his guidance on various research projects strengthened by training as a biostatistician. I also want to thank Dr. Zhehui Luo, for her devoted help on my research and her encouragement on my life. Appreciation is extended to other members in my committee, Dr. R.V. Ramamoorthi, Dr. Lijian Yang, and Dr. Connie Page, for their friendly support. My dissertation research was supported by the Agency for Healthcare Research Quality under grant 1R 01 HS 14206. I give my special thanks to Dr. James Stapleton in the Department of Statistics and Probability. He is always ready to help students like a father. During the fives years of my graduate study at Michigan State University, he has given a lot of help and encouragement. I also thank all the other professors and my friends for their care and help. I thank my parents for all the endless love and support they give me. They have always been supportive of my efforts for carrying out my career goals. I dedicate this dissertation work to both of them. vi TABLE OF CONTENTS LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . LIST OF FIGURES ix . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Introduction 1.1 Background . . . . . . . . . . . . . . . . . . . 1.1.1 Length of Stay (LOS) . . . . . . . . . . 1.1.2 Cost . . . . . . . . . . . . . . . . . . . 1.1.3 Accelerated Failure Time (AFT) Model 1.1.4 Phase-type (PH) Model . . . . . . . . 1.1.5 Random-Effects (RE) Model . . . . . . 1.1.6 Copula-based Regression Model . . . . 1.2 Outline of the Thesis . . . . . . . . . . . . . . x . . . . . . . . 1 1 3 4 5 6 7 9 11 2 Modeling Hospital Length of Stay with a Coxian PH Regression Model 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Continuous Phase-type Distribution . . . . . . . . . . . . . . . . . . . . . . . 2.3 Coxian Phase-type Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4 Properties of Phase-type Distributions . . . . . . . . . . . . . . . . . . . . . 2.5 Bayesian Estimation Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5.1 Reversible Jump Markov Chain Monte Carlo (RJMCMC) . . . . . . . 2.5.2 Transformation Strategies . . . . . . . . . . . . . . . . . . . . . . . . 2.5.3 RJMCMC for Coxian PH Distribution . . . . . . . . . . . . . . . . . 2.6 Model Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.7 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.8 Computation of Partial Effects . . . . . . . . . . . . . . . . . . . . . . . . . . 2.8.1 Continuous Covariates . . . . . . . . . . . . . . . . . . . . . . . . . . 2.8.2 Discrete Covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.9 Modeling Hospital LOS for AMI Patients . . . . . . . . . . . . . . . . . . . . 2.9.1 Patient Sample . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.9.2 Estimation of Coxian phase-type Regression Model . . . . . . . . . . 2.10 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 12 14 20 23 25 28 29 31 34 35 36 37 38 39 39 43 51 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Joint Modeling of Hospital Length of Stay and Cost with Shared Random Effects 52 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 3.2 Shared Random-effects Model . . . . . . . . . . . . . . . . . . . . . . . . . . 54 vii 3.3 3.4 3.5 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Application to 2003 NIS AMI Hospitalized Patients . . . . . . . . . . . . . . Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Bivariate Copula Random-effects (BCRE) Model for Cost 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4.2 Copulas and Dependence: a Brief Overview . . . . . . 4.3 Bivariate Copula Random-effects (BCRE) Model . . . 4.3.1 BCRE Model and Likelihood . . . . . . . . . . 4.3.2 Estimation . . . . . . . . . . . . . . . . . . . . . 4.4 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . 4.5 Application to 2003 NIS AMI Hospitalized Patients . . 4.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 60 62 66 Length of Stay and . . . . . . . . 68 68 71 81 81 83 84 87 93 5 Conclusion and Future Research 5.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 95 98 APPENDICES 99 A Gaussian Quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 B Example SAS code for copula estimates 103 B.1 SAS Code for Gaussian Copula . . . . . . . . . . . . . . . . . . . . . . . . . 104 B.2 SAS Code for Clayton Copula . . . . . . . . . . . . . . . . . . . . . . . . . . 105 B.3 SAS Code for Gumbel Copula . . . . . . . . . . . . . . . . . . . . . . . . . . 105 BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 viii LIST OF TABLES 2.1 Descriptive Statistics for AMI patients in NIS in 2003 (N=11,749) . . . . . . 42 2.2 Estimates of posterior means (SD) based on 4-phase Coxian PH model and log-normal model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Estimated average partial effects on mean LOS (SD) associated with different covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 2.4 Patient characteristics by classification of LOS . . . . . . . . . . . . . . . . . 49 3.1 Characteristic of patients for LOS and Cost (N=1,1749) . . . . . . . . . . . 63 3.2 Estimates of coefficients and random effects (γ = 1) . . . . . . . . . . . . . . 65 3.3 Estimates of coefficients without random effects . . . . . . . . . . . . . . . . 66 4.1 Selected examples of copulas . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 4.2 Simulation results for bivariate copulas with normal margins . . . . . . . . . 86 4.3 Parameter estimates (SE) for log-normal margins with Gumbel copula . . . . 90 4.4 Parameter estimates (SE) for log-normal margins with Gumbel copula, reduced model A without random effects . . . . . . . . . . . . . . . . . . . . . 90 Parameter estimates (SE) for log-normal margins with Gumbel copula, reduced model B with independent errors . . . . . . . . . . . . . . . . . . . . . 91 4.6 Parameter estimates (SE) for log-normal margins with Gaussian copula . . . 91 4.7 Parameter estimates (SE) for log-normal margins with Clayton copula . . . . 92 2.3 4.5 ix LIST OF FIGURES 1.1 Phase-type model for incubation distribution of AIDS . . . . . . . . . . . . . 7 2.1 Two different representations for one PH distribution . . . . . . . . . . . . . 20 2.2 Representation of (m + 1)-state Markov process with a Coxian PH distribution 21 2.3 Histogram of hospital length of stay (LOS) and fitted 4-phases Coxian PH model. The solid line (—) is the histogram of LOS and the dotted line (- - -) is the model-based density function estimate (truncated at 50 days). . . . . . 41 2.4 Number of phases dynamically selected by RJMCMC . . . . . . . . . . . . . 46 2.5 Goodness-of-fit curves by procedure type. The solid lines (—) are the Empirical estimates of discharge probability, the dash lines (- - -) are the Coxian PH model-based estimates, the dotted lines (...) are the log-normal model-based estimates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 3.1 Plot of LOS vs. Log (Cost), truncated at 50 days . . . . . . . . . . . . . . . 64 4.1 Bivariate pdf contour plots induced by copula, N (0, 1) margins (Kendall’s tau=0.5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 Plot of Log (LOS) vs. Log (Cost), truncated at 50 days . . . . . . . . . . . . 87 4.2 x Chapter 1 Introduction 1.1 Background Studies of healthcare utilization from national and state administrative databases have been stymied by a lack of powerful methodological approaches to circumvent the lack of specificity that is often common in these databases. New statistical and econometric techniques are needed to account for unobserved heterogeneity and uncover hidden structures in healthcare utilization data. Predictive models are valuable in identifying factors associated with utilization so that adequate resources are properly allocated. These models are also used for risk-adjustment in payment for healthcare services and identifying high-risk patients for disease management programs. An important measure of healthcare resource use is hospital length of stay (LOS), measuring the number of days from admission to discharge. LOS can be terminated by cure, transfer to another care facility or death. In the past decade, health care providers and hospital administrators are interested in LOS predictions for both economic and organizational reasons. In addition to these aspects of quality control, there 1 is also patient interest in anticipated dates of discharge. The development of LOS models has been useful to economists and statisticians. Some standard parametric models, for example, log-normal, Weibull, log-logistic and Generalized Gamma models have been widely applied to survival data. Another key measure in healthcare utilization analysis is the cost per individual patient, which is defined to be the total resource use from admission to discharge. Some analysts treat the hospital cost independent of LOS in standard linear mixed models after log transformation, or by standard survival analysis techniques, for example, proportional hazards models or accelerated failure time models which specify different error distributions, such as the extreme-value distribution, normal distribution and logistic distribution, etc. However, the independence assumption is generally not valid since an important feature of LOS and cost is the correlation between them, possibly in hierarchical levels, i.e. both at the hospital and individual levels. Hence, although separate univariate models may suffice to assess the correlates of LOS and hospital cost, the advantage of a multivariate model over independent models is that the correlation between the outcomes can be estimated resulting in inferences that are more efficient (Polverejan et al., 2003; Gardiner et al., 2002). The shared random-effects models and the copula-based models are two different ways to account for this correlation. For example, the correlation between the healthcare outcomes of individuals in a cluster is often assessed using measures of dependence such as the Pearson’s correlation coefficient, the Spearman’s rho and the Kendall’s tau. In addition, to account for correlation in the healthcare outcomes within a cluster or equivalently heterogeneity between clusters, a random effect term is used, resulting in what are known as frailty models in survival analysis. In this thesis, however, we propose a joint bivariate copula random-effects model in a health economics study that includes both hospital-level 2 and individual-level correlation. This thesis is concerned with two closely related measures of healthcare resource utilization: LOS and cost. We consider models for LOS and joint models for LOS and cost. Methodological advances have been made for analysis of both measures during the past decade. In this chapter we first give a basic introduction to relevant models. We also briefly review related statistical literature. Models are described in detail in the subsequent chapters. 1.1.1 Length of Stay (LOS) Length of stay (LOS) is a term commonly used to measure the duration of a single episode of hospitalization. Inpatient days are calculated by subtracting day of admission from day of discharge. People entering and leaving a hospital on the same day have a LOS of one. Hospital LOS is an important measure of healthcare resource use because hospital spending represents approximately one third of the national health spending which is projected to reach 4.4 trillion by 2018 (Konetzka et al., 2008; Sisko et al., 2009). Various statistical models have been used to assess the influence of clinical and demographic factors on LOS and hospital charge. The primary goal is to estimate the expected value of LOS (Faddy et al., 2009; Li, 1999; Wang et al., 2002). Analyses of LOS present several challenges to researchers as LOS data are often skewed and therefore standard regression methods cannot be applied directly. In addition, several other features of LOS data that need to be addressed include incomplete observations and heteroscedasticity. Validity of statistical inference on LOS will depend upon the strength of the methodology in addressing these characteristics of hospital LOS. 3 1.1.2 Cost Hospital cost constitutes a significant proportion of overall expenditure in health care and analysis of cost data is another interesting area which has attracted considerable attention recently. Investigating appropriate models or assessing health outcomes and analysis of these data can help us identify the most cost-effective treatment and ascertain the determinants of cost (Liu, 2004). Patient cost is collected as a final total cost in some national and state databases. For a treatment or intervention under study, let C(t) be the accumulating cost over time t for an individual patient. Expenditures terminate at the discharge time from the hospital. In other situations we record cost C(tj ) in a regular time interval (tj−1 , tj ]. Similar to LOS, analysis of hospital cost may have several technical problems, such as skewness, incompleteness, etc. Sometimes other components of cost such as inpatient or outpatient cost exhibit a two-part feature as a result of a significant proportion of zero cost. In addition, hospital cost is often correlated with LOS. With escalating cost, knowing the correlates (also called covariates) of LOS and in-hospital cost is important for decisions on allocating resources. Naive analysis treats the hospital cost independent of LOS by standard linear mixed models after log transformation (Sirbu, 2004), or by standard survival analysis techniques (such as Kaplan-Meier estimators) (Etzioni et al., 1999; Lin et al., 1997; Lin, 2000; Hallstrom and Sullivan, 1998). However the estimates might be invalid due to the correlation between LOS and cost we mentioned before. Therefore there is an interest in developing a joint modeling approach to analyze both LOS and hospital cost simultaneously. 4 1.1.3 Accelerated Failure Time (AFT) Model The accelerated failure time (AFT) model is presented as an alternative to the proportional hazards model, which is widely used in medical research and time to event data analysis. The AFT models also have been studied extensively in the literature for analyzing right censored data (Buckley and James, 1979; Tsiatis, 1990; Wei et al., 1990; Jin et al., 2003). Recently, Tian and Cai (2006) studied a numerically efficient simple estimation procedure to the regression analysis of interval censored data using the AFT model. Consider the following model, log T = x β + σ (1.1) AFT models have a number of advantages, in particular, they offer a wider variety of shapes of hazard functions than the parametric proportional hazards models that assume a particular distribution for survival times, since the family includes distributions with unimodal hazard functions, such as the log-normal, log-logistic distributions. Moreover, the log-linear formulation of such models emphasizes that the roles of the regression parameters and dispersion parameters are clearly separated (Keiding et al., 1997). The regression parameters in an AFT model are also robust with respect to neglected random effects (Hougaard, 1999), less affected by the choice of probability distribution, which is not the case for proportional hazards models (Hougaard et al., 1994). In addition, regression parameters in the proportional hazards model are more sensitive to the distribution of the random component. 1 The survivor function can be written as S(t|x) = S0 (t exp(−x β )) σ , where S0 is the survivor function of the random variable exp( ). Clearly, the individual characteristics act 5 on the duration distribution by transforming the time scale T to T exp(−x β ). This may be an accurate description of the actual variation in the lifetime distribution of complex selfevolving organisms or mechanisms. Because of the one-to-one relation between a distribution and its hazard function, the AFT specification can be translated into a specification of the hazard function of T given x. Assuming a normal or logistic distribution for provides a closed form expression for S(t|x), which will be used in the later chapters when modeling cost. 1.1.4 Phase-type (PH) Model Phase-type (PH) distributions are defined as distributions of absorption times T in Markov processes with m < ∞ transient states (the phases) and one absorbing state labeled m + 1. Since their introduction by Neuts in 1981 (Neuts, 1981), PH distributions have attracted a lot of attention in the past decades and have been widely used in queuing theory, reliability analysis, insurance risk, survival analysis, telecommunications, and healthcare utilization, such as hospital LOS. Aalen (2002) connects PH models to problems in survival analysis, for example the incubation time of acquired immune deficiency syndrome (AIDS) shown in Figure 1.1. Asmussen et al. (1996) propose EM algorithm to estimate PH distributions and exhibit four samples of the lengths of incoming telephone calls to the service center of one of Israels major television cable companies. Olsson (1996) extends EM algorithm to cover estimation from censored data including right-censored and interval-censored observations. Bitran and Dasu (1994) analyze a queue to which the arrival process is the superposition of separate arrival streams, each of whose interarrival time distributions is of phase type, and the service time distribution is also of phase type. 6 HIV infection α- β - Acvanced HIV λ- AIDS 6 γ λθ ? Treatment Figure 1.1: Phase-type model for incubation distribution of AIDS PH distributions arise from a generalization of Erlang’s method of stages in a form that is particularly well suited for numerical computation (Erlang, 1917). The simplest examples are mixtures and convolutions of exponential distributions (in particular Erlang distributions, defined as gamma distributions with an integer shape parameter). More generally, the class comprises all series/parallel arrangements of exponential distributions, possibly with feedback. PH distributions are a versatile class of distributions, which is dense in the class of distributions defined on the non-negative real line. Hence any distribution on [0, ∞) can, at least, in principle be approximated by a PH distribution. Moreover, their formulation also allows the Markov structure of stochastic models to be retained when they replace the simple exponential distributions. Details will be provided in Chapter 2. 1.1.5 Random-Effects (RE) Model The random-effects (RE) model arises in the context of analysis of survival data, event counts, jointly dependent continuous and discrete variables and so forth. Random effects have been used to model dependence, but they can also be viewed as a general approach of 7 joint modeling. The RE modeling approach is built using regression models in which either common or correlated latent variables enter the models in the same manner as regressors. Such RE models have long history in statistics, for example, in the context of bivariate distributions they have appeared under a variety of names such as the shared frailty model in survival analysis (Hougaard, 2000), trivariate reduction model in multivariate count data analysis (Kocherlakota and Kocherlakota, 1992), and latent variable models for multilevel, longitudinal and structural equation analysis (Skrondal and Rabe-Hesketh, 2004). Generically they are all mixture models and can also be interpreted as random-effects models. Liu (2009) proposes a joint random effects model of longitudinal medical cost data and survival, taking into account the semi-continuous nature of medical costs. Liu and Huang (2009) propose a joint random-effects model for a repeated measures process and a recurrent events process, for which the correlation is modeled by shared random effects. RE models have been also established as an appealing approach to analyzing longitudinal and survival data (Vonesh et al., 2006; Ratcliffe et al., 2004; Wulfsohn and Tsiatis, 1997; Gruttola and Tu, 1994; Tsiatis et al., 1995; Henderson et al., 2000; Tsiatis and Davidian, 2004; Xu and Zeger, 2001; Guo and Carlin, 2004). Let us consider a shared random-effects model for two possibly correlated outcomes Y1 and Y2 with joint density f : ∞ f (y1 , y2 |x1 , x2 ) = 0 f1 (y1 |x1 , ν)f2 (y2 |x2 , ν)g(ν)dν (1.2) where f1 , f2 , and g are univariate densities. For example, let f1 (y1 |x1 , ν) and f2 (y2 |x2 , ν) denote normal marginal distributions for continuous variables Y1 and Y2 , with conditional mean µ1 = x1 β1 + ν and µ2 = x2 β2 + γν, where γ is a scale parameter as Y1 and Y2 might 8 be in different scales. This approach suggests a way of specifying correlated models based on a suitable choice of g(.). In fact, we can consider more flexible bivariate or multivariate parametric models by introducing correlated, rather than identical, unobserved heterogeneity components in marginal models. For example, suppose Y1 and Y2 are, respectively, N ormal(µ1 |ν1 ) and N ormal(µ2 |ν2 ), where ν1 and ν2 represent correlated unobserved heterogeneity. Dependence between Y1 and Y2 is induced if ν1 and ν2 are correlated. We refer to (ν1 , ν2 ) as latent factors. For example, we can assume (ν1 , ν2 ) to have bivariate normal distribution with correlation ρ. The joint density is of the form: f (y1 , y2 |x1 , x2 ) = f1 (y1 |x1 , ν1 )f2 (y2 |x2 , ν2 )g(ν1 , ν2 )dν1 dν2 (1.3) Similar to other generalized linear mixed models (GLMM’s), a full likelihood analysis of the above joint model is hindered by the high-dimensional integration. To avoid these computational problems, several approaches have been proposed, for example, Gaussian quadrature techniques can be used as a practical estimation tool and computations carried out in available software such as freely available softwares aML (http://www.appliedml.com/index.html) or SAS Proc NLMIXED. 1.1.6 Copula-based Regression Model Copulas are another approach for modeling dependence or correlation in the context of linear or nonlinear regression models. Copulas, originally introduced by Sklar in 1959 (Sklar, 1959), have been suggested as a useful method for generating joint distributions from the given marginals. For example, a copula approach can generate a Gaussian distribution when 9 the Gaussian copula is applied to the Gaussian marginal distributions. More importantly, a copula model can also generate many complicated non-Gaussian joint distributions. This approach is fruitful when the marginals can be specified with confidence, but the joint distribution is awkward to establish (Cameron et al., 2004). In the recent years, the copula models become popular for modeling dependencies between random variables, especially in such fields as biostatistics, finance and actuarial science (Genest and Rivest, 1993; Joe, 1997; Nelsen, 1999; Cap´ra` et al., 2000). Chen and Fan (2002) study the temporal dependence e a properties and the estimation of a class of semiparametric stationary Markov time series models, propose simple estimators of the unknown marginal distribution and the copula dependence parameter, and establish their large sample properties under easily verifiable conditions. Miller and Liu (2002) propose a minimum cross-entropy approach that recovers continuous joint distributions from the joint and marginal moments and the marginal densities. Smith (2003) models sample selection using Archimedean copulas with the specification of binary models that are designed to account for data selectivity. A copula is a function that connects the marginal distributions to restore the joint distribution. In the bivariate case, a joint distribution H(x, y) can be expressed in terms of its margins FX (x) and FY (y) and a copula function C(·, ·) such that C(FX (x), FY (y)) = H(x, y). In this approach, C models the dependence structure. Copulas separate marginal distributions from the dependence structure, and the appropriate copula for a particular application is the one which best captures dependence features of the data. Details will be shown in Chapter 4. 10 1.2 Outline of the Thesis The remainder of this thesis is organized as follows. In Chapter 2, we will systematically address the principles of PH and Coxian PH regression models. We will also describe the estimation methods, including a context within the Bayesian framework. Later, we describe the classification and calculation of partial effects based on the Bayesian posterior samples. The methodology is illustrated with application to hospital admissions for acute myocardial infarction (AMI) in the 2003 Nationwide Inpatient Sample (NIS) from the Healthcare Utilization Project (HCUP). In Chapter 3, we will describe a statistical approach based on shared random effects for joint modeling LOS and hospital cost. To begin with, conditional on a common latent factor ν, suppose hospital LOS (denote by T ) has a Coxian PH distribution and cost (denote by C(T )) is log-normally distributed, they are independent. From this model, we derive the marginal means, variances, covariances and correlations. The shared random-effects methodology is applied to the same AMI data set described in Chapter 2, Section 2.9. In Chapter 4, we address the problem in the correlation of LOS and cost with the bivariate copula random-effects model. We model the hospital-clustered unobserved heterogeneity through correlated random effects and individual correlation through the copula-based correlated measurement errors simultaneously. The full maximum likelihood (FML) is used for the estimation. We carry out a simulation study to assess model fit and performance of estimates. Similarly, the joint modeling approach is illustrated by the AMI data set again. Finally, Chapter 5 summarizes our work and gives some suggestions for future research. 11 Chapter 2 Modeling Hospital Length of Stay with a Coxian PH Regression Model In this chapter, we will give an overview of the phase-type (PH) distributions. To better understand this chapter, we need to know the definition of the PH distributions and its special subclasses, especially Coxian PH distributions, which we will use in our application with real data. We also provide some basic properties of the PH distributions in Section 2.4. 2.1 Introduction Phase-type (PH) distributions, introduced by Neuts in 1981 (see Neuts, 1981), are defined as the distributions of the time to reach an absorbing state in a finite-state continuoustime homogeneous Markov process. PH distributions have been received a lot of attention in a wide range of stochastic modeling applications, such as telecommunication, teletraffic modeling, queuing theory, reliability theory, insurance risk, survival analysis and healthcare utilization, like hospital length of stay (LOS). PH distributions have enjoyed such popularity 12 because they constitute a very versatile class of distributions defined on the non-negative real numbers that lead to algorithmically tractable models. Since the set of PH distributions is dense in the set of non-negative distributions with support on [0, ∞), any non-negative distribution, at least in principle, can be approximated arbitrarily closely by a PH distribution. In addition, their formulation also allows the Markov structure of stochastic models to be retained when they replace the simple exponential distributions. A Coxian PH distribution is a special form of PH distributions with the transient states ordered so that transitions are forward flowing with exit from any transient state. Coxian PH distributions have considerably fewer parameters than general PH distributions (Cumani, 1982). In the past two decades, Aalen (2002), Marshall and his colleagues (Marshall et al., 2000, 2002; Marshall and McClean, 2003; Marshall et al., 2005) have applied PH models to analyze LOS data. Marshall et al. (2007) estimate total inpatient cost for a cohort of patients based on a Coxian PH distribution for their hospital LOS. Fackrell (2009) gives a general survey of PH models with applications to healthcare data. More recently, Aus´ and colleagues (Aus´ and Lopes, ın ın 2007; Aus´ et al., 2008, 2009) proposed Bayesian methods for estimating a Coxian PH disın tribution, treated as “finite mixtures”. McGrory et al. (2009) use a fully Bayesian approach for inference in Coxian PH models with covariate-dependent mean duration. McGrory et al. (2009) incorporate covariates into Coxian PH distributions through a loglinear mean function with the intent of estimating covariate effects on the mean LOS. Aus´ ın and collegues (Aus´ and Lopes, 2007; Aus´ et al., 2008, 2009) mention the importance ın ın of ordering the incidence rate parameters in identifying parameter estimates, which has consequences for classification. Their focus is on the estimation of ruin probabilities in certain risk reserve processes in insurance claims and on the busy period in queuing systems, 13 but they do not consider the influence of covariates. Bayesian methods are used for estimation and inference. In this chapter, we apply a similar Bayesian method to estimate Coxian PH regression models that combine identification of transition probabilities and coefficients of covariates. We describe a classification method to assign patients to different latent groups, and explain differences in these groups by covariates. 2.2 Continuous Phase-type Distribution Continuous phase-type (PH) distributions describe the time to absorption T of an underlying finite-state continuous time Markov process {X(t); t ≥ 0}. We denote by E = {1, 2, . . . , m, m + 1} the state space where the single absorbing state is labeled ‘m + 1’ and the remaining states 1, 2, . . . , m are transient. X(t) is governed by the (m + 1) × (m + 1) transition intensity matrix, A = {αhj }, h, j ∈ E, with the elements P {X(t + ∆t) = j|X(t) = h} ∆t ∆t↓0 αhj = lim (2.1) αhh = − αhj , h, j ∈ E j=h The probability density function, survival function and k-th non-central moment of T 14 have closed forms 1 f (t) = π exp (Qt)(−Q1 ) (2.2) 1 S(t) = π exp (Qt)1 (2.3) mk = (−1)k k!Q−k1 , k = 1, 2, . . . (2.4) Here, exp (A) denotes the matrix exponential of the square matrix A (Golub and Van Loan, 1996). The initial probability distribution over the transient states is π (row vector), Q is the m × m intensity matrix for the Markov chain, restricted to the transient states, and 1 is a column m-vector of 1’s. Equation (2.2) is a representation of a PH distribution given π (π , Q). The class of all PH distributions with generator Q is denoted by PH(Q), including the representation (0, Q), which is the distribution with point mass at zero (or a process that is instantly absorbed in m + 1). The class PH(Q) can be described as the convex hull of the e points (0, Q) and (e k , Q), k = 1, 2, . . . , m, where e k is a m × 1 vector with k-th component equals to 1, and all other components equal to 0. We now give some special examples of PH distributions, assuming that there is no probability mass at zero. (1) Exponential distributions 1 λ - 2 The simplest non-trivial example of a PH distribution is the exponential distribution with the parameter λ , which has density function f (x) = λ exp(−λx) with a repre15 sentation π = 1, Q = (λ) (2) Hyper-Exponential (Mixture of exponential) distributions 1 ··· 2 λ1 m λ2 ? λm ? ? ? m+1 The hyper-exponential distribution with density function f (x) = m i=1 πi λi exp(λi x) can be represented as a PH distribution with π = (π1 , π2 , · · · , πm )  0 0  −λ1    0 −λ2 0  Q=  . . . . .  . . .  .  0 0 0 where m i=1 πi  ··· 0 ··· 0 .. . . . . ··· −λm           = 1. (3) Erlang distributions 1 λ- 2 λ- ··· λ- m λ- m+1 The m-Erlang distribution has two parameters, the shape an integer m > 0 and the rate λ > 0. This is sometimes denoted E(m, λ). The Erlang distribution with the 16 density function f (x) = λm xm−1 exp(−λx) has the PH representation (m − 1)! π = (1, 0, · · · , 0)           Q=         −λ 0 0 . . . 0 0 λ  0 0 ··· 0 0    −λ λ 0 · · · 0 0     0 −λ λ · · · 0 0    . . . .. . .  . . . . .  . . . . . .    0 0 0 · · · −λ λ    0 0 0 · · · 0 −λ m×m (4) Hypo-exponential distributions 1 λ1 2 λ2- ··· λk- m λm - m+1 The hypo-exponential distribution is a generalization of the Erlang distribution by having different rates for each transition (the non-homogeneous case), i.e. π = (1, 0, · · · , 0)   0 0  −λ1 λ1    0 −λ λ2 0 2     0 0 −λ3 λ3  Q=  . . . . . . .  . . . . .     0 0 0 0   0 0 0 0 17 ··· 0 0    ··· 0 0     ··· 0 0    . .  .. . .  . . .    · · · −λm−1 λm−1    ··· 0 −λm m×m (5) Coxian PH distributions ? 1 - ? 2 - ? ··· - m ? ? m+1 The Coxian distribution is a generalization of the hypo-exponential distribution. The absorbing state can be reached from any phase. The PH representation is given by, π = (1, 0, · · · , 0)   0 0  −λ1 p12 λ1    0 −λ2 p23 λ2 0     0 0 −λ3 p34 λ3  Q=  . . . . . . .  . . . . .     0 0 0 0   0 0 0 0 ··· 0 0     ··· 0 0     ··· 0 0    . . .. . .  . . .    · · · −λm−1 pm−1,m λm−1    ··· 0 −λm m×m where 0 < pk,k+1 ≤ 1, k = 1, 2, · · · , m − 1. In the case when all pk,k+1 ’s=1, it reduces to hypo-exponential distribution. The Coxian PH distribution is extremely important as any acyclic PH distribution has a generator that is upper triangular and has an equivalent Coxian representation. In addition, the process generalised Coxian PH distribution relaxes the condition that requires starting in the first phase. (6) Mixture of Erlang distributons The mixture of two Erlang distributions with parameters E(m1 , λ1 ), E(m2 , λ2 ) and 18 (ω1 , ω2 ), such that ω1 + ω2 = 1, ωi ≥ 0 for each i, has the PH representation π = (1, 0, · · · , 0)    Q1 0  Q=  0 Q2 m×m  0 0  −λk λk    0 −λk λk 0     0 0 −λk λk  where Qk =   . . . . . . .  . . . . .     0 0 0 0   0 0 0 0  ··· 0 0    ··· 0 0     ··· 0 0    . .  .. . .  . . .    · · · −λk λk    ··· 0 −λk mk ×mk As noted earlier the class of PH distributions is dense in the set of distributions on [0, ∞). Unfortunately, this distribution family is over-parameterized and thus not identifiable. Evπ ery PH distribution has several alternative representations (π , Q). This makes parameter estimation difficult (Fackrell, 2009). For a simple example where one PH distribution can be represented by two structures, see Figure 2.1. Consider two structures with m = 2,    −λ1 qλ1  π π PH(π 1 , Q1 ) and PH(π 2 , Q2 ), where π 1 = (1, 0), π 2 = (q, 1 − q), Q1 =  , and 0 −λ2    −λ2 λ2  Q2 =  , λ1 = λ2 . From (2.2), we get 0 −λ1 f1 (t) = f2 (t) = (1 − q)λ1 exp (−λ1 ) + 19 qλ1 λ2 [exp (−λ2 t) − exp (−λ1 t)] λ1 − λ2 q λ ?1 1 λ2 - ? 1−q λ ?2 λ ?1 2 1 - ? 2 ? 3 3 π PH(π 1 , Q1 ) π PH(π 2 , Q2 ) Figure 2.1: Two different representations for one PH distribution It is also apparent from the examples that representations for PH distributions do not necessarily have the same order, meaning that the number of transient phases need not be the same. In fact, there must be a representation that has the minimal order. A representation that has minimal order is called a minimal representation. Imposing a special structure on Q can guarantee that every distribution PH(Q) has a π unique representation in the form (π , Q). Although one could work with such a minimal representation, in this thesis we will use Coxian PH distributions because they present fewer problems in estimation and inference and also provide a simple interpretation of fit for LOS data. 2.3 Coxian Phase-type Models A Coxian PH distribution results when the transient states have a natural order and only forward transitions between them may occur, beginning in state 1: 1 → 2, 2 → 3, . . . , m−1 → m and exiting from any transient state: 1 → m + 1, 2 → m + 1, . . . , m → m + 1. Transitions between the m + 1 states is illustrated in Figure 2.2. The actual states of the Markov model are not observable, that is, we assume every patient enters the system from state 1, but do 20 not know the state from which the patients exit. ? 1 ? - 2 - ? ··· ? - m ? m+1 Figure 2.2: Representation of (m + 1)-state Markov process with a Coxian PH distribution π A Coxian PH distribution is represented by (π , Q) where π = (1, 0, . . . , 0), and   α12 0 ··· 0  −(α12 + α1,m+1 )      0 −(α23 + α2,m+1 ) α23 · · · 0    Q=    . . . . ... . . . .   . . . .     0 0 · · · · · · αm,m+1 (2.5) The number of parameters in a general PH distribution is m2 + m, but the number of parameters in a Coxian PH distribution is reduced to 2m − 1. The Coxian family is also dense in the class of all distributions on [0, ∞) and is appropriate for estimating long-tailed distributions. In the context of survival analysis we are interested in the hazard rate for sojourn time in each state and transition probabilities between states, thus it is constructive to reformulate this intensity rates α ’s to hazard rate in transient state k as λk = αk,k+1 + αk,m+1 , k = 1, 2, . . . , m − 1, and in transient state m as λm = αm,m+1 . Transition probabilities are pk,k+1 = αk,k+1 (αk,k+1 + αk,m+1 ) from k → k + 1, k = 1, 2, . . . , m − 1 with pm,m+1 = 1. Suppose we observe duration times of n patients t = (t1 , t2 , . . . , tn ) from a presumed Coxian PH with m transient states and covariate matrix X = {x1 , . . . , xn } where xi = 21 (x1i , x2i , . . . , xli ) . We can incorporate covariates in the classical way comparable to a generalized linear model (Faddy et al., 2009). Assuming λki = λk (xi ) = λ0k exp (−xiβ ), the conditional mean distribution is log-linear in xi , i.e, log (E(Ti |xi )) = a0 + xiβ , where β = (β1 , β2 , . . . , βl ) is the coefficient vector, and a0 is a function of pk,k+1 ’s and λ0k ’s. From (2.2) the corresponding likelihood function can be written as n t L(t |X, λ0 , β , p ) = n i=1  ˜ π exp(Qi ti )˜i q 1 π [exp (Λi Pti )](−Λi P1 ) = (2.6) i=1    0 ··· 0  0 ··· 0   1 −p12  −λ1i          0   0 1 −p23 · · · 0  −λ2i · · · 0     where Λi =  , , P =   .  . . . . .  .. .  .. . . . .   .  . . .  . . . .  . .   .  .     0 0 0 ··· 1 0 0 · · · −λmi ˜ ˜ 1 π = (1, 0, . . . , 0), Qi ti = Λi P and q i = −Λi P1 = (p1,m+1 λ1i , p2,m+1 λ2i , . . . , pm,m+1 λmi ) . The parameters of the model are λ0 = (λ01 , λ02 , . . . , λ0m ) , β = (β1 , β2 , . . . , βl ) , and p = (p12 , p23 , . . . , pm−1,m ) . The likelihood function (2.6) becomes             0     −λ01 p12 λ01 · · · (1 − p12 )λ01                     n     0   (1 − p23 )λ02   −λ02 · · · 0       π exp exp (−xiβ )   t  exp (−xiβ )      .   . .  i  . ..  . . .   .    .    i=1  . . .   .                      0 0 · · · −λ0m λ0m (2.7) We refer to (2.7) as the likelihood of our Coxian PH regression model. Covariates enter through a specific parameterization of the mean E(T |x) = exp (a0 + x β ). This maintains 22 the general form of the mean common to accelerated failure time models, log T = x β + σ when σ is a scale parameter and has a specified distribution on the real line. For example, if is extreme-value, then T has the Weibull distribution with E(T |x) = exp (log Γ(1 + σ) + x β ) where Γ is the Gamma function. The log-normal, log-logistic and Generalized Gamma failure time distributions also share this structure. At the expense of a formidable increase in the number of parameters, a general Coxian PH regression model could be described by specifying λk (xi ) = λ0k exp(−xiβ k ) with phasetype specific parameters β k . In addition, a logit model log pk,k+1 /(1 − pk,k+1 ) = ziγ k could be specified for transition probabilities. However, stability of ML estimates is often in doubt unless some structural simplifications can be made and exclusion restrictions can be imposed on the covariates x, z in the two parts of the regression model. Exploration of various strategies for model formulation and estimation is the subject of on-going research. 2.4 Properties of Phase-type Distributions In this section, we discuss some of the basic properties of the PH distributions. First, the set of PH distributions is quite broad and, in theory, any non-negative distribution can be approximated arbitrarily closely by a PH distribution. π Proposition 2.1. The distribution of PH(π , Q) is given by 1 F (t) = 1 − π exp (Qt)1 for t ≥ 0, where the matrix exponential is defined by exp(A) = 23 ∞ i=0 1 i A . The density i! π function of PH(π , Q) is given by 1 f (t) = π exp (Qt)(−Q1 ) π for t ≥ 0. Let T be a random variable with the PH(π , Q) distribution, then E T k = (−1)k k!Q−k1 , k = 1, 2, . . . π The Laplace transformation of PH(π , Q) is given by ˜ 1 T (s) = π (sI − Q)(−Q1 ) where I is an identity matrix and s is a complex number. π Theorem 2.2. Suppose that F and G are PH distributions with representations (π1 , Q1 ) of π order m1 , and (π2 , Q2 ) of order m2 , respectively. Then we have the following. π 1. The convolution F1 *F2 is a PH distribution with a representation (π , Q) of order m1 + m2 where π= π 1 π10π 2    Q1 −Q1eπ 1  Q=  0 Q2 where 0 is a m1 × m2 matrix of zeros and π10 is known as the point mass at zero. 2. The mixture ωF1 + (1 − ω)F2 , where 0 ≤ ω ≤ 1, is a PH distribution with a represen24 π tation (π , Q) of order m1 + m2 where π= π π ωπ 1 (1 − ω)π 2    Q1 0  Q=  0 Q2 Proof. See Neuts (1981). 2.5 Bayesian Estimation Methods Several approaches have been proposed for the estimation of PH/Coxian PH models. Asmussen et al. (1996) develop an expectation-maximization (EM) algorithm to calculate maximum likelihood (ML) estimators for general PH distributions. Olsson (1996) extends this algorithm to right censored and interval censored data. Using Matlab or R routines, Faddy and McClean (1999, 2005) use ML methods to estimate the LOS of geriatric patients with Coxian PH distributions. In Faddy (2002), a penalized ML method is carried out to fit Coxian PH distributions with high orders. The method of moments has also been used to fit PH distributions. Johnson (1993) develops an algorithm that matched the first three moments of a mixture of Erlang distributions to the empirical moments. Horv´th and Telek (2000) a propose a method that separately approximates the main and tail parts of a PH distribution. More recently, McGrory et al. (2009) propose an innovative fully Bayesian approach for inference where the number of phases is unknown and the mean duration depends on covariates. Unfortunately, the Coxian model is only identifiable up to permutation of the intensity rates (Cumani, 1982). This identifiability issue often affects the convergence of the Markov 25 Chain Monte Carlo (MCMC) algorithm and the interpretation of the estimated parameters. Aus´ and colleagues (Aus´ and Lopes, 2007; Aus´ et al., 2008, 2009) use Bayesian methın ın ın ods for estimating PH distributions and place ordering constraints on the intensity rates to address this identifiability problem. We incorporate covariates into the model and extend their Bayesian method to fit the Coxian PH regression model. Moment Matching Algorithm The moment matching algorithms are broadly used in computer science, engineering, operation research, etc., where the performance evaluation or optimization in stochastic environment is needed. The idea is to map a general probability distribution G, into a PH distribution P , such that some moments of P and G agree. Matching the first moment of any non-negative distribution is possible by a single exponential distribution, but unfortunately it is often not sufficient, as ignoring the higher moments can result in misleading conclusions. Generally in cases where matching only two moments suffices, it is possible to achieve solutions which perform very well. Sauer and Chandy (1975) provide a closed-form solution for matching two moments of a general distribution with squared coefficient of variation (CV). They use a two-phase hyper-exponential distribution, for matching distributions with squared coefficient of variability CV2 > 1, and a generalized Erlang distribution for matching distributions with CV2 < 1. Marie (1980) provides a closed-form solution for matching two moments of a general distribution with CV2 > 0. He uses a two-phase Coxian PH distribution for distributions with CV2 > 1, and a generalized Erlang distribution for distributions with CV2 < 1. Osogami (2005) improves existing estimates with respect to methods computational efficiency and provides the closed-form solutions for matching distributions with Erlang-Coxian (EC) distributions. For example, if G has sufficiently high 26 second and third moments, then a two-phase Coxian PH distribution alone suffices. If the variability of G is lower, however, we might append several exponential distributions to the two-phase Coxian PH distribution in order to get the variability of P to be low enough. Maximum likelihood estimate (EM algorithm) The Expectation-Maximization (EM) algorithm is an iterative method that aims to maximize the log-likelihood function (Dempster. et al., 1977). The EM algorithm is a broadly applicable approach to the iterative computation of maximum likelihood (ML) estimates, useful in a variety of incomplete data problems. On each iteration of the EM algorithm, there are two steps: the Expectation step (E-step) and the Maximization step (M-step). Asmussen et al. (1996) propose a novel EM algorithm fitting PH Distributions. Later Olsson (1996) extends the EM algorithm to censored data. EMpht is a C program for fitting PH distributions (http://home.imf.au.dk/asmus/pspapers.html). It can be used either to fit a PH distribution to a sample (which may contain censored observations), or to make a PH approximation of another continuous distribution. In addition, it is complemented by a Matlab program, PHplot, for graphical display of the fitted PH distribution. 27 2.5.1 Reversible Jump Markov Chain Monte Carlo (RJMCMC) Green’s (see Green, 1995) Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm is basically of Metropolis-Hastings type with specific trans-dimensional proposals carefully designed to move between different models in a way that is consistent with the desired stationary distribution of the MCMC algorithm. The idea has been widely applied in the finite normal mixture models (Richardson and Green, 1997). The key in RJMCMC is to allow moves between parameter subspaces of different dimensionality by permitting a series of different ‘move types’. Details can be found in Denison et al. (1998); Dellaportas and Forster (1999). A typical way to estimate parameters in finite mixture models is the EM algorithm we mentioned above. However, the basic EM algorithm has two main drawbacks, slow convergence and lack of an in-built procedure to compute the covariance matrix of parameter estimates. Moreover, some complex problems lead to intractable E-steps, for which Monte Carlo methods have been shown to provide efficient solutions. Though in the finite mixture problems, the E-step is easy to implement, the M-step is more difficult using Newton-Raphson method, Quasi-Newton method or other optimization methods. Diebolt and Robert (1994) provide the standard Bayesian formulation of the finite mixure models with a known number of components and its implementation via Markov Chain Monte Carlo (MCMC). For PH distributions, we first need to do some transformation to make the density function more like “finite mixture models”. 28 2.5.2 Transformation Strategies We can assume, without loss of generality, λ01 ≥ λ02 ≥ . . . ≥ λ0m . One way to incorporate the ordering restriction is to represent the hazard rates of the model as follows (Aus´ and ın Lopes, 2007; Aus´ et al., 2008, 2009): ın λ0k = λ01 ν2 ν3 · · · νk , 0 < νj ≤ 1, j, k = 2, 3, · · · , m. In addition, we can easily recover the transition probabilities from η = (η1 , · · · , ηm ). η1 = 1 − p12 η2 = p12 (1 − p23 ) . . . ηm−1 = p12 p23 · · · pm−2,m−1 (1 − pm−1,m ) ηm = p12 p23 · · · pm−2,m−1 pm−1,m where ηk is the total probability of existing from transient state k. Instead of the parameters (m, λ 0 , β , p ), we can work with the equivalent set of parameters (m, η , λ01 , ν , β ) , which is relatively stable and easy to sample in the MCMC framework. Based on the new parameter set, the likelihood function (2.6) can be written as n m ηk fk (ti |xi , k, ν , β ) L(t|X, m, λ01 , η , ν , β ) = i=1 n k=1 m (2.8) ηkπ exp(Qik ti )(−Qik1 ) = i=1 k=1 29 where fk (ti |xi , k, λ01 , ν , β ) = π exp(Qik ti )(−Qik1 ) is a k-phase generalized Erlang distribution and   −1      Qk = λ01 exp(−x β )      0 . . . 0 1 0 ··· 0 −ν2 ν2 · · · . . .. . . . . . 0 0 ··· 0 . . . −ν2 ν3 · · · νk           If all the rates are unequal, we can obtain an explicit form of (2.8) without complicated matrix exponential computation. Let k Cj,k = r=1,r=j r i=2 νi j r i=2 νi − i=2 νi for j = 1, 2, · · · , k Then  k fk (ti |xi , k, λ01 , ν , β ) = j  j νj  exp −λ01 exp(−xiβ )ti Cj,k λ01 exp(−xiβ )  r=2 j=1   νj  r=2 Due to this re-parameterization, we can borrow ideas from finite mixture models and it is relatively easy to implement the imputation steps in MCMC. 30 2.5.3 RJMCMC for Coxian PH Distribution We now define the prior distributions for the parameters (m, λ01 , η , ν , β ) as follows: First if the number of phases m is known, we assume the following prior distributions: η ∼ Dirichlet(ξ1 , ξ2 , · · · , ξm ) λ01 ∼ Improper, i.e. density ∝ 1 λ01 νk ∼ Beta(a, b), for k = 2, · · · , m 0 β ∼ M V N (0 , σ 2 Il ) We specify the priors by setting ξ1 = ξ2 = · · · ξm = 1, a = 1.1, b = 1 and σ = 1000 (Aus´ ın et al., 2008). The RJMCMC algorithm for our model consists of the following steps. (1) Update the total exiting weights η = (η1 , η2 , · · · , ηm ) ; (2) Update the intensity rate λ01 ; (3) Update the vector of ν = (ν2 , ν3 , · · · , νm ) ; (4) Update the coefficient parameter β = (β1 , β2 , · · · , βl ) ; (5) Update the current number of phases. Steps (1)-(4) do not alter the dimension of the current model and they are performed using the Gibbs Sampler with Metropolis-Hastings step. But step (5) accounts for the jumps between adjacent values of m, which is done using the two types of moves discussed in Richardson and Green (1997), i.e. the split-combine and birth-death moves suitably modified for specific needs. We construct a MCMC algorithm using a data augmentation step, introducing a missing η indicator variable Zi such that p(Zi = k|η , m) = ηk , f (ti |Zi = k) = fk (ti |k, λ01 , ν , β ), i = 31 1, 2, · · · , n. Then the posterior distribution is p(Zi = k|ti , η , λ01 , ν , β ) = ηk fk (ti |k, λ01 , ν , β ) m j=1 ηj fj (ti |j, λ01 , ν , β ) ∝ ηk fk (ti |k, λ01 , ν , β ), k = 1, 2, · · · , m With this missing data approach, the complete data set now is (ti , xi , zi ), i = 1, 2, · · · , n. Given the missing data z, the complete likelihood is calculated by n fzi (ti |xi , k, λ01 , ν2 , · · · , νZi , β ) L(t|z, X, m, η , λ01 , ν , β ) = i=1 Therefore, step (1) can be achieved as η |t, z ∼ Dirichlet(1 + n1 , 1 + n2 , · · · , 1 + nm ) where nj is the number of observations exited from phase j, j = 1, 2, · · · , m. One advantage of this reparameterization is that the total exiting probability η can be obtained directly from Gibbs sampling. Next we obtain the conditional posterior distributions of λ01 , ν and β , which do not have explicit forms, and require the Metropolis-Hastings algorithm in order to obtain the new updates through iterations. 32 n f (λ01 |t, z, X, ν , β ) ∝ fzi (ti |xi , k, λ01 , ν2 , · · · , νZi , β )π(λ01 ) i=1 n f (νk |t, z, X, λ01 , ν −k , β ) ∝ fzi (ti |xi , k, λ01 , ν2 , · · · , νZi , β )π(νk ), k = 2, 3, · · · , m i=1,zi ≥k β β f (β |t, z, X, λ01 , ν ) ∝ fzi (ti |xi , k, λ01 , ν2 , · · · , νZi , β )π(β ) ˜ We use a Gamma proposal distribution λ ∼ G(2, 2/λ(s) ) to update λ01 (step (2)) at (s+1) step s. To sample νk (step (3)), we propose a Beta mixture distribution as a proposal (s) distribution in order to preserve the mode of νk and avoid clustering around 1 (Aus´ et al., ın 2008).     1 1 1  1 , 2 + Beta 2, , k = 2, 3, · · · , m νk ∼ Beta  ˜ (s) (s) 2 2 1−ν ν k k The coefficient β are generated with the aid of a Metropolis-Hastings step as well, using a symmetric random walk. The log density of β , ignoring terms that do not depend on β , is β β T (β ) = log fzi (ti |xi , k, λ01 , ν2 , · · · , νZi , β ) + log(π(β )) β Assume that the maximum of T (β ) exists and that the Hessian matrix H is negative definite ˜ β β in a neighborhood of β (s) , define V(s) = −H(β (s) )−1 , and propose β ∼ M V N (β (s) , V(s) ). We now describe the relevant maps and constraints that must be satisfied to make these moves (in step (5)) reversible, so called “detailed balance” (Green, 1995). Assume a discrete uniform prior defined on [1, Mmax ] for m, where Mmax represents an assumed upper limit of m. If the current number of phases is M , then the proposed number is M , where M = M ± 1. We will need to consider split and combine moves, i.e. split one phase into 33 two adjacent phases, or combine two adjacent phases into one. We take a mapping approach to combine or split phases as follows. In the combine move, parameters are updated according to ηr = ηr1 + ηr2 , νr = νr1 νr2 , ˜ ˜ ˜ ˜ where we can obtain λ0r = λ0,r2 . For the case r = 1, λ01 = λ01 ν2 . In the split move, we generate u1 and u2 from U (0, 1). Let ηr1 = u1 ηr ˜ ηr2 = (1 − u1 )ηr ˜ νr1 = u2 + νr (1 − u2 ) ˜ νr2 = ˜ νr u2 + νr (1 − u2 ) ˜ For the case r = 1, we generate νr2 = u2 , and λ01 = λ01 /u2 . The parameters in the ˜ remaining phases are not modified. 2.6 Model Checking To assess model fitting, Faddy et al. (2009) define a generalized residual as observed outcome divided by the estimated mean from the fitted model, and produce a quantile-quantile plot for the residuals. However, the residual distribution of a Coxian PH model is unknown. An alternative approach, which also allows for censored data, compares the empirical KaplanMeier estimate of the survival function with the model based estimates (Lambert et al., 34 2004). The model-based estimated survival for each patient is ˆ ˆ Si (ti ) = π exp Qi ti 1   ˆ ˆ ˆ 0  −λ01 p12 λ01 · · ·      ˆ  0  −λ02 · · · 0  ˆ  = π exp exp(−xiβ )   .  . . .. . .  .  . . . .     ˆ 0 0 · · · −λ0m              ti  1         (2.9) Average predicted survival is obtained from point-wise averages of the survival curves over patients with a specified covariate profile. This procedure is an informal assessment of model adequacy. Johnson (2004) extends the classical Pearson χ2 -test for goodness of fit to the Bayesian context. A formal diagnostic to departures of the assumed model is beyond the scope of the present thesis. A unique aspect of PH model specification is the choice of the number of phases of the hidden Markov process, which is described in the previous section. 2.7 Classification One of the goals in analyzing healthcare utilization is to predict resource use by different groups of patients who exhibit a semblance of within-group homogeneity while having between-group heterogeneity. Such a classification would be valuable in healthcare resource management and efficient allocation. For inpatient stays, Marshall and McClean (2004) investigate the potential of Coxian PH distributions to identify common characteristics in groups of patients according to their observed LOS. We focus here on the expected mean LOS and exploit this measure to group patients in the study. Specifically, we use the following algorithm to group LOS in m classes. Recall the exit probabilities η = (η1 , η2 , · · · , ηm ) 35 from the transient states which can be obtained directly from MCMC posterior samples after the reparameterization. Patients are assigned into different clusters according to their LOS in the ratio η1 : η2 : · · · ηm . The k-th LOS group Gk is determined by:   K−1 K   ˆ Gk = t(i) : n ηk < i ≤ n ηk , K = 1, 2, · · · , m.   k=1 (2.10) k=1 ˆ where t(i) , i = 1, 2, · · · , n denotes the ordered expected mean LOS and n is the number of patients. Patient characteristics within each latent LOS group may then be explored to determine if they have any common characteristics and across LOS groups is also be explored to see if they differ. 2.8 Computation of Partial Effects Social scientists are sometimes interested in estimating quantities other than survival in Eq (2.9) or classifying individuals into relatively more homogeneous groups using (2.10). One estimand favored by social scientists is the marginal effect (or predictive margin, recycled prediction, partial effect) of covariates with respect to the mean of the dependent variable, which is, in our case, the mean survival time (Basu and Rathouz, 2005). Under the parameterization in (2.6) and (2.7), the conditional mean duration is E(T |x) = exp(a0 + x β ) (Faddy et al., 2009). The marginal effects are defined below for continuous and discrete covariates. 36 2.8.1 Continuous Covariates For continuous covariates, partial effects (PE) are usually computed as the derivative of the conditional mean with respect to the covariate. In order to assess the overall impact of a continuous covariate xb on the mean LOS, it is reasonable to consider the following PE: PE = d (E(T |x)) dx (2.11) β = E(T |x)β β = exp(a0 + x β )β The PE is estimated from the posterior samples β (s) ; 1 ≤ s ≤ S (s) PE 1 = n n by (s) β exp(a0 + xiβ (s) )β (s) (2.12) i=1 S 1 PE = S PE (s) (2.13) s=1 The standard deviation (SD) of PEb is given by SD(PEb ) = 1 S−1 S (s) PEb 2 − PEb (2.14) s=1 where the subscript ‘b’ denotes the b-th component of PE. Here, Eq (2.11) measures the local rate of change in E(T |X) considered as a function in x, relative to E(T |X), and can be viewed as an approximation to the percentage change in E(T |X) for one unit increase in x. Eq (2.13) and (2.14) compute the estimated average PE and SD derived directly from the posterior samples, instead of bootstrapping by resampling methods. 37 2.8.2 Discrete Covariates For a discrete covariate, the formula (2.11) is no longer valid. The partial effects are more appropriately derived by partial difference rather than partial derivatives. Without loss of generality, suppose a covariate has B categories. We can generate B -1 dummy variables for each level of the covariate other than the reference level. We will estimate the conditional mean twice. First, we set xb to 1 (e.g., b-th category indicator) in the entire sample; and then second, we set xb to 0. In both occasions, the values of the other B -2 dummy variables are set to 0 and the values of other covariates are left untouched. Then the proper analogs of (2.11) and (2.12) are PEb = E(Y |xb = 1, x(−b) ) − E(Y |xb = 0, x(−b) ) (s) PEb 1 = n n exp (s) a0 (s) + βb (s) + xiβ (−b) i=1 1 − n n (s) (2.15) (s) exp a0 + xiβ (−b) (2.16) i=1 The estimated average PE and SD are hence calculated in the same way as formula (2.13) and (2.14). The partial effect of the binary variable xb is the change in the mean LOS when xb changes from 0 to 1 with all other variables kept constant at their observed values. 38 2.9 2.9.1 Modeling Hospital LOS for AMI Patients Patient Sample We apply the above models to the analysis of length of stay (LOS) for patients with acute myocardial infarction (AMI) in the 2003 Nationwide Inpatient Sample (NIS). The NIS is a database of all hospital inpatient stays from a stratified sample of approximately 1,000 community hospitals in the US. In 2003, the NIS contains nearly 8 million discharges from 37 states. Sixty strata are defined by a combination of geographic region (Northeast, South, Midwest and West), location (urban, rural), ownership (public, private), teaching status, and bed size (small, medium and large). Summary measures derived from the sample can be extrapolated to national estimates using sampling weights. Patient demographics in the NIS include age at admission, gender, race, and primary payer. Patient clinical characteristics include treatment procedures undergone and comorbidities assessed during the stay. In addition to the stratification variables in the NIS, we used the following covariates in our Coxian regression model: age at admission in years, gender, procedure, comorbidity and insurance status (Pompei et al., 1991). The primary procedure that the patient underwent was based on the ICD-9 CM Procedure codes: CABG=Coronary Artery Bypass Graft, PTCA=Percutaneous Transluminal Coronary Angiography, CATH=Cardiac Catheterization, Other, or None=no procedure performed. We constructed the Charlson Comorbidity Index (CCI) to capture comorbidity based on ICD-9 CM diagnoses during the stay (Charlson et al., 1987; Matsui et al., 1996). The CCI is a weighted sum of 19 conditions such as diabetes, with and without complications, congestive heart failure, peripheral vascular disease, pulmonary disease, renal disease, etc. We categorized the CCI score into 4 subgroups: 1, 2, 39 3 and ≥ 4. Insurance status is defined as the expected primary payer: Medicare, Medicaid, Self-pay or Other (which includes private insurance 33.6%, no charge < 1%). The LOS of AMI patients could vary widely depending on severity of illness, comorbidity and type of procedure. Most patients are discharged after a relatively short period of time. However, some patients may remain in hospital over a long time period receiving continuous care. This suggests that the distribution of LOS reflects different hidden structures. Figure 2.3 shows the histogram (truncated at 50 days) of the distribution of LOS for 11,749 AMI patients in the 2003 NIS. The data are highly skewed to the right with a large number of outliers, because of variation in patient characteristics. The LOS ranges between 1 and 142 days with mean of 5.51 days and standard deviation of 5.90 days. The mean LOS of female patients (mean=6.01) was higher than that of male patients (mean=5.21). Patients who underwent CABG had the highest LOS, followed by patients who underwent OTHER, CATH, PTCA and NONE. Mean LOS was different according to CCI scores. Patients with CCI≥4 had the highest mean LOS (mean=7.89), followed by CCI=3, CCI=2, and CCI=1 with means 6.70, 5.61 and 3.87, respectively. Approximately 63% of our patient sample is male. Mean age at admission was 64.6 years (SD=12.7). The most prevalent procedure performed on these patients was PTCA (40.3%) followed by CATH (19.4%). By definition all patients had CCI≥1. Approximately 37.8% of patients had no other comorbidity (CCI=1), followed by CCI=2 (29.1%), CCI=3 (16.9%) and CCI≥4 (16.2%). As expected in our sample, Medicare was the primary payer for 52.7% of the patients. We dropped 23 patients from the sample who had invalid or missing insurance information. See Table 2.1 for additional information. 40 0.20 0.10 0.00 0.05 Percent 0.15 Histogram 4−phases Coxian PH model 0 10 20 30 40 50 Hospital Length of Stay (Days) Figure 2.3: Histogram of hospital length of stay (LOS) and fitted 4-phases Coxian PH model. The solid line (—) is the histogram of LOS and the dotted line (- - -) is the model-based density function estimate (truncated at 50 days). 41 Table 2.1: Descriptive Statistics for AMI patients in NIS in 2003 (N=11,749) Characteristics All Patients Gender Procedure Comorbidity, CCI Disposition of Patients Hospital Region Hospital Location/ Teaching Status Hospital Bedsize Primary Payer Group N(%) Male Female CABG&PTCA CABG PTCA CATH Other None 1 2 3 4+ Routine Transfer: Short-term Hospital Transfer: Other Type of Facility Home Health Care Died in Hospital Northeast Midwest South West Rural Urban non-teaching Urban teaching Small Medium Large Medicare Medicaid Self-pay Other 42 11,749 ( 100) 7378 (62.8) 4371 (37.2) 107 ( 0.9) 1301 (11.1) 4736 (40.3) 2275 (19.4) 1490 (12.7) 1840 (15.7) 4438 (37.8) 3423 (29.1) 1981 (16.9) 1907 (16.2) 7635 (65.0) 1451 (12.3) 1119 ( 9.5) 883 ( 7.5) 661 ( 5.6) 2524 (24.5) 2633 (22.4) 4986 (42.4) 1606 (13.7) 1041 ( 8.9) 5301 (45.1) 5407 (46.0) 950 ( 8.1) 2658 (22.6) 8141 (69.3) 6186 (52.7) 690 ( 5.9) 562 ( 4.8) 4311 (36.7) LOS (days) Mean (SD) 5.51 (5.90) 5.21 (5.54) 6.01 (6.43) 11.73 (8.78) 11.59 (8.81) 4.04 (3.38) 5.08 (5.12) 7.06 (8.16) 3.88 (2.89) 3.87 (3.79) 5.61 (6.22) 6.70 (6.54) 7.89 (7.33) 4.58 (3.51) 2.96 (2.96) 11.33 (10.3) 9.25 (10.8) 6.96 (6.65) 5.76 (6.60) 5.42 (5.43) 5.34 (5.72) 5.16 (6.00) 4.64 (4.20) 5.22 (5.16) 5.95 (6.76) 4.79 (5.65) 5.30 (6.24) 5.66 (5.81) 6.25 (6.36) 6.17 (7.17) 4.32 (3.64) 4.48 (4.97) 2.9.2 Estimation of Coxian phase-type Regression Model For the Bayesian estimation of the Coxian regression model described in Section 2.5, we performed 100,000 MCMC iterations, discarded the first 50,000 of these iterations as burnη in period and retained 10,000 thinned posterior samples of (η , λ01 , ν , β ). The selection of the best estimate for the number of phases m, using the RJMCMC algorithm, is based on the highest percent of times out of the 10,000 posterior samples where a particular number of phases is reached. In our data, the most likely number of phases of the underlying Coxian PH distribution is 4, with posterior probability of 0.313, followed by a 3-phase model with posterior probability of 0.273 and the 5-phase model with posterior probability of 0.255 (see Figure 2.4). The posterior means of the intercept parameters and covariate coefficients are given in Column 1, Table 2.2. We also use these posterior means as the starting values and apply the fully Bayesian approach (McGrory et al., 2009) to generate the posterior samples for measures of interest. Similar results are presented in Column 2, Table 2.2, except for the first two intensity rates without ordering. Based on the estimates of the 4-phase model, all patients start in state 1, almost all of them transfer from state 1 to state 2, then from state 2 to state 3. The total probabilities of exiting from states 1, 2, 3, and 4 were 0.0002, 0.001, 0.923 and 0.076, respectively. Among correlates, LOS was positively associated with age, female gender, CCI and procedure type (Table 2.2). As expected, older patients had longer LOS. A possible explanation is that older patients have a longer recovery period from their procedures that lengthen their hospital stay. The effect of female gender on LOS was an estimated 0.076 (β-coefficient), corresponding to an expected increase in LOS of 1.079 times that of males. Higher comorbidity would indicate a more severe condition possibly leading to a longer stay in hospital. For example, 43 CCI comorbidity scores 2, 3 and ≥4, led to an increase in LOS by 1.23, 1.46, and 1.78 times the LOS of patients with a CCI score of 1. The expected LOS for patients who underwent CABG, PTCA, CATH and ’other’ procedures were, respectively, 3.24, 1.25, 1.31, 1.43 times the LOS of patients without any procedure. In the present study, we found that Medicare and Medicaid insured individuals had increased mean hospital LOS, compared to self-pay and ‘other’ (primarily private insurance). LOS variation by region and hospital size has been explored previously (Xiao et al., 1997). In our study, the highest mean LOS was in the North East region whereas hospitals in the West region had the smallest mean LOS. To further justify the proposed 4-phase Coxian model, an alternative model using a heavy tailed distribution, like the log-normal distribution, was fitted for comparison with the same structure for the mean LOS. Some similarities are found in the covariate coefficients between the two different models. However, the adequacy of the specified log-normal is inferior to the proposed Coxian model. It is discussed later and visually illustrated in Figure 2.5. Mean LOS can be estimated from the model using a specific covariate profile or by averaging over the predicted values for each observation. The partial effects, defined in Section 2.8 for continuous and discrete covariates are reported in Table 2.3. For example, the mean LOS for patients with CCI equal to 3 was 1.97 days longer than the mean LOS for patients with CCI equal to 1. Compared to those having no procedures, patients who underwent CABG had much longer mean LOS, the difference was 8.30 days. Table 2.4 highlights the sample average of estimates of the fitted mean for different classes. Combining the fairly small total exit probabilities in the first two phases, an estimated 92.4% of the sample exited from first three phases. This is regarded as short-stay group, with predicted average LOS of 4.8 days. The remaining 7.6% of the sample are long-stay patients, 44 with predicted average LOS of 13.8 days. For the whole sample, mean LOS estimated from the model is 5.48 days, with a minimum of 1.75 days and a maximum of 22.46 days. There is large variation in the mean LOS between these two classes. To assess model fit, we compared the empirical Kaplan-Meier survival curves to the model-based average of subject-specific fitted curves from the 4-phase Coxian distribution. We can obtain values of estimated discharge probability 1 − S(t) for the individual, for LOS ranging from 1 to 142 days. The overall averaged fitted curves and empirical estimates are shown in the top left plot in Figure 2.5. The step functions, solid lines (—), are the empirical ˆ Kaplan-Meier estimates of S(t) the dash lines (- - -), the Coxian PH model-based estimates, are visually superimposed which indicate an adequate level of fitting. Within each stratum defined by procedure type, the model also fitted fairly well. Patients who had CABG surgery had the highest discharge probability. For comparison, the dotted lines (· · · ) in Figure 2.5 are the log-normal model-based estimates. Relative to the better fitting Coxian PH model, the log-normal model gives similar estimates in covariates coefficents, but inferior fit, especially for patients with CABG surgery. These patients tended to have longer LOS. 45 0.35 0.30 0.25 0.20 Percent 0.15 0.10 0.05 0.00 2 3 4 5 6 7 Number of Phases Figure 2.4: Number of phases dynamically selected by RJMCMC 46 Table 2.2: Estimates of posterior means (SD) based on 4-phase Coxian PH model and lognormal model Covariate λ01 λ02 λ03 λ04 η1 η2 η3 η4 Intercept Age Gender Procedure CCI Region Location/ Teaching status Bedsize Primary payer RJMCMC Female CABG PTCA CATH Other 2 3 4+ Northeast Midwest South Rural Urban non-teaching Small Medium Medicaid Self-Pay Other Posterior Mean (SD) 1.297 (0.052) 1.230 (0.046) 1.131 (0.065) 0.209 (0.011) 0.0002 (0.0002) 0.001 (0.001) 0.923 (0.005) 0.076 (0.005) 1.042 (0.026) 0.008 (0.0007) 0.076 (0.012) 1.177 (0.021) 0.222 (0.018) 0.272 (0.020) 0.360 (0.025) 0.205 (0.015) 0.378 (0.019) 0.576 (0.019) 0.094 (0.021) 0.055 (0.020) 0.085 (0.019) -0.135 (0.022) -0.069 (0.013) -0.155 (0.026) -0.067 (0.015) 0.072 (0.029) -0.031 (0.029) -0.063 (0.018) RJMCMC (McGrory) Posterior Mean (SD) 1.202 (0.103) 1.203 (0.104) 1.231 (0.104) 0.206 (0.013) 0.0003 (0.0003) 0.002 (0.002) 0.923 (0.007) 0.075 (0.007) 1.050 (0.024) 0.008 (0.0007) 0.072 (0.012) 1.171 (0.022) 0.220 (0.017) 0.268 (0.018) 0.361 (0.023) 0.205 (0.015) 0.377 (0.019) 0.574 (0.019) 0.092 (0.021) 0.056 (0.019) 0.086 (0.016) -0.134 (0.023) -0.072 (0.014) -0.156 (0.024) -0.066 (0.012) 0.072 (0.028) -0.031 (0.031) -0.069 (0.019) log-normal Posterior Mean (SD) N.A. 0.365 (0.055) 0.007 (0.0007) 0.071 (0.012) 1.237 (0.025) 0.237 (0.022) 0.285 (0.021) 0.339 (0.022) 0.194 (0.024) 0.357 (0.019) 0.565 (0.020) 0.084 (0.022) 0.055 (0.021) 0.084 (0.019) -0.167 (0.024) -0.075 (0.014) -0.176 (0.025) -0.085 (0.015) 0.056 (0.029) -0.036 (0.030) -0.064 (0.018) 0.433 (0.006) Scale Note: CCI=Charlson Comorbidity Index. Reference group: Gender=male, Procedure=none, CCI=1, Region=West, Bed size=large, Location/teaching status=urban teaching, Primary payer=Medicare. 47 Table 2.3: Estimated average partial effects on mean LOS (SD) associated with different covariates Covariate Estiamte (SD) Covariate Estimate (SD) ∗ Age 0.043 (0.004) Location/teaching status Rural -0.719 (0.111) Procedure Urban non-teaching -0.379 (0.071) CABG 8.296 (0.205) PTCA 0.918 (0.072) CATH 1.154 (0.087) Bedsize Other 1.602 (0.117) Small -0.809 (0.125) Medium -0.365 (0.081) CCI 2 0.974 (0.072) Primary Payer 3 1.971 (0.106) Mediciad 0.424 (0.172) Self-Pay -0.167 (0.159) ≥4 3.343 (0.127) Other -0.340 (0.097) Region Northeast 0.507 (0.111) Midwest 0.288 (0.102) South 0.455 (0.097) Note: CCI=Charlson Comorbidity Index. ∗ The partial effect for age was estimated at the mean age. Reference group:Procedure=none, CCI=1, Region=West, Bed size=large, Location/teaching status=urban teaching, Primary payer=Medicare. 48 Table 2.4: Patient characteristics by classification of LOS Covariate Total Short LOSa Long LOSa N=11,749 N=10,856 N=893 Mean LOS (SD) 5.50 (5.90) 4.8 12.8 Mean Age (SD) 64.6 (12.7) 64.26 68.22 N(%) N(%) N(%) Gender Female 4371 (37.0) 4019 (37.0) 352 (39.4) Male 7378 (63.0) 6837 (63.0) 541 (60.6) Procedure CABG 1408 (12.0) 516 ( 0.5) 892 (99.9) PTCA 4736 (40.3) 4736 (43.6) 0 ( 0.0) CATH 2275 (19.4) 2275 (21.0) 0 ( 0.0) Other 1490 (12.7) 1489 ( 3.7) 1 ( 0.1) None 1840 (15.6) 1840 (16.9) 0 ( 0.0) CCI 1 4438 (37.8) 4363 (40.2) 75 ( 8.4) 2 3423 (29.1) 3054 (28.1) 369 (41.3) 3 1981 (16.9) 1722 (15.9) 252 (29.0) ≥4 1907 (16.2) 1717 (15.8) 190 (21.3) Insurance Medicare 6186 (52.7) 5628 (51.8) 558 (62.5) Medicaid 690 ( 5.9) 622 ( 5.7) 68 ( 7.6) Self-Pay 562 ( 4.8) 535 ( 4.9) 27 ( 3.0) Other 4311 (36.7) 4071 (37.5) 240 (26.9) Note: CCI=Charlson Comorbidity Index. Reference group: Gender=male, Procedure=none, CCI=1, Region=West, Bed size=large, Location/teaching status=urban teaching, Primary payer=Medicare. a Sample averages for LOS and age within the short-stay and long-stay groups 49 0 20 40 60 80 100 0.8 0.4 0.0 0.4 0.8 K−M Estimate Group=CABG 0.0 K−M Estimate ALL 140 0 20 40 60 80 10 20 30 0.4 0.0 0.8 0.4 0.0 0 0.8 Group=CATH K−M Estimate Hospital Length of Stay (Days) Group=PTCA K−M Estimate Hospital Length of Stay (Days) 40 0 20 40 60 80 20 40 60 80 100 140 0.4 0.0 0.8 0.4 0.0 0 0.8 Group=NONE K−M Estimate Hospital Length of Stay (Days) Group=OTHER K−M Estimate Hospital Length of Stay (Days) 0 Hospital Length of Stay (Days) 5 10 15 20 25 Hospital Length of Stay (Days) Figure 2.5: Goodness-of-fit curves by procedure type. The solid lines (—) are the Empirical estimates of discharge probability, the dash lines (- - -) are the Coxian PH model-based estimates, the dotted lines (...) are the log-normal model-based estimates. 50 2.10 Discussion This chapter demonstrates the application of Coxian PH stochastic regression models to hospital LOS to account for the heavy skewness and heterogeneity in the data. A Bayesian method based on RJMCMC was applied to dynamically select the number of phases. This method avoids arbitrary trimming and transformation of the data. In addition, the approach allows us to obtain estimates of mean LOS, median, other percentiles, and class characteristics. While a complete description of the underlying hidden states of the Markov process would depend on specific applications, we found in this example a meaningful interpretation based on short-stay, and long-stay subgroups for AMI patients. The two groups differed primarily on comorbidity and procedure type. Short-stay patients had the lowest CCI; longstay patients comprised largely of those who underwent CABG surgery. Procedure type and CCI were significant correlates of LOS. The strength of Coxian PH regression models for LOS lies in their flexibility in accommodating extreme values, while revealing hidden features such as short and long stays in hospitals. The management of hospital LOS has become an important issue in cost containment and control. Determination of relevant factors and hidden structures could inform discharge planning and allocation of resources (Lanzarone et al., 2010; McDermott and Stock, 2007; Ramiarina et al., 2008). We believe this work contributes to the development of statistical models for analysis of LOS distributions and other consumption variables in healthcare resource analysis. For example, our proposed Coxian PH model is applicable to other skewed data such as inpatient cost as well as censored duration data. It can be extended to analyze longitudinal data exhibiting unobserved heterogeneity. 51 Chapter 3 Joint Modeling of Hospital Length of Stay and Cost with Shared Random Effects In this chapter, we consider hospital length of stay (LOS) and cost jointly and construct a bivariate model from a common constellation of covariates that might influence their joint distribution. Shared random-effects modeling approach is then developed to conduct the joint analysis which permits simultaneous assessment of the correlates of LOS and in-hospital cost. Our model also provides important information for decisions on resource allocation. 3.1 Introduction Random-effects (RE) models are of interests and widely used in capturing statistical dependence for a variety of data types, and allow for prediction, imputation, and hypothesis testing within a general regression context. The shared random effect approach has a very 52 intuitive appeal to a lot of researchers who generally believe that there might be some latent quantity underlying an individual’s susceptibility to both in-hospital LOS and cost. This latent quantity may represent environmental risk factors or hospital site effects yet to be identified. The latent process also induces dependence between the two explicitly observed processes. In this chapter, we illustrate the use of shared random effects to account for the correlation, build a joint Coxian phase-type /Log-normal (CPH-LN) model and describe the details of the marginal measurement. The idea of introducing random effects to incorporate correlation is not new, but a lot of previous work has focused on joint modeling the multivariate survival times through frailty models, with the same underlying model and β -coefficient for covariates. Here, we develop a flexible joint modeling approach with different distribution types and scales. We propose a modeling framework using a finite-state continuous time Markov process with a single absorbing state (discharge) to describe the hospital LOS, introduced in Chapter 2. Each phase represents an unknown health status and absorbing state is interpreted as the healthcare is terminated and discharge from the hospital or death. In addition, we assume cost is log-normally distributed. The correlations between outcomes are introduced by shared random effects, which can also affect the marginal means, variances and covariances. This approach adjusts for patient level characteristics while allowing for the inherent correlation structure. The rest of the chapter is organized as follows. Section 3.2 introduces the joint modeling approach of LOS and cost via shared random effects. We provide a detailed description for marginal mean, variance and covariance structure. Then Section 3.3 describes the estimation methods using likelihood reformulation (LR). Application of the proposed model is given in Section 3.4. Finally in Section 3.5, we conclude and outline further research direction. 53 3.2 Shared Random-effects Model Let Yi1 (i = 1, 2, . . . , n) denote in-hospital LOS and x1i be a vector of l1 explanatory variables that might have impact on the distribution of Yi1 , and similarly, Yi2 (i = 1, 2, . . . , n) denote cost, x2i are the l2 variables that may affect Yi2 . In practice, covariates x1i , x2i may represent the same covariate constellation, but not necessarily so. The observed data for patient i are (Yi1 , Yi2 , x1i , x2i , 1 ≤ i ≤ n). The underlying relationship between LOS and its risk factors is modeled by a Coxian phase-type regression introduced in Chapter 2, where the mean structure is assumed to be a log-linear function of covariates x1 , i.e. CP log (µCP H ) = β0 H + x1iβ CP H . Similarly, the relationship between cost and risk factors is i LN modeled by a log-normal regression log Yi2 = β0 + x2iβ LN + σ = µi + σ i , where µi is the location parameter linear in x2i , σ is the scale parameter and i ∼ N (0, 1). Corresponding 1 LN mean function of cost is given by log (µLN ) = µi + 1 σ 2 = β0 + x2iβ LN + σ 2 . That is i 2 2 Yi1 ∼ Coxian PH(λ0 , p , β ) CP µCP H = exp (β0 H + x1iβ CP H ) i (3.1) Yi2 ∼ log-normal(µi , σ) LN µi = β0 + x2iβ LN The separate regression models described above are not suitable for dependent data because the two healthcare utilization measures, LOS and cost, are most of the time, correlated. To accommodate the inherent dependence between LOS and cost at the subject (individual, patient) level, shared random effects bi , 1 ≤ i ≤ n are incorporated into the linear prediction 54 (Liu et al., 2007). Then (3.1) can be rewritten as: Yi1 |bi ∼ Coxian PH(λ0 , p , β ) CP µCP H = exp (β0 H + x1iβ CP H + bi ) i Yi2 |bi ∼ log-normal(µi , σ) (3.2) LN µi = β0 + x2iβ LN + γbi bi ∼ i.i.d.F (.) where F (.) is the distribution function for i.i.d. random effects bi (1 ≤ i ≤ n). Condition on bi , Yi1 and Yi2 are independent, i.e. conditional independence. γ measures the different scale for the variance components. In (3.2), bi characterizes the correlation between Yi1 and Yi2 within the same subject. When the correlation is high, fitting separate models might lead to biased covariate effect estimates. Economists are interested in marginal means, variances, covariances or correlations, which can be derived from conditional measurements by the following basic formulas: E[Yij ] = E[E(Yij |bi )], i = 1, 2, . . . n j = 1, 2 Cov(Yij , Ykl ) = E[Cov(Yij , Ykl |bi )] + Cov(E[Yij |bi ], E[Ykl |bi ]), (3.3) (3.4) i, k = 1, 2, . . . , n j, l = 1, 2 In addition, under the assumption of conditional independence, we have Cov(Yi1 , Yi2 |bi ) = 0. Marginal Means The marginal means for LOS and cost, averaging over the distribution of random effects b, 55 are easy to calculate from (3.4): CP E[Yi1 ] = E[exp(β0 H + x1iβ CP H + bi )] = CP exp(β0 H (3.5) + x1iβ CP H )E[exp (bi )] 1 LN E[Yi2 ] = E[exp(β0 + x2iβ LN + γbi + σ 2 )] 2 1 LN = exp(β0 + x2iβ LN + σ 2 )E[exp (γbi )] 2 (3.6) Without loss of generality (WLOG), we can assume E[exp (bi )] = 1, under which the marginal means keep the same in both separate models and joint model for LOS and only a scale changed for cost. For example, assume Wi = exp (bi ), and Wi ∼ Gamma( 1 , θ), i.e. θ ( 1 −1) p(wi ) = wi θ w exp (− θi ) 1 Γ(θ−1 )θ θ Then E[Wi ] = 1, and V ar(Wi ) = θ. Large values of θ signify a closer positive relationship between LOS and cost for the same subject and greater heterogeneity among subjects. Accordingly, bi ∼ log-Gamma( 1 , θ) with density function θ 1 p(bi ) = (exp (bi )) θ exp (− exp (bi ) ) θ 1 Γ(θ−1 )θ θ For some identifiability reasons, we restrict γ=1. CP E[Yi1 ] = exp(β0 H + x1iβ CP H ) 1 LN E[Yi2 ] = exp(β0 + x2iβ LN + σ 2 ) 2 56 which retains the unchaged marginal means. Marginal Variances From (3.4), we have Cov(Yij , Yij ) = E[Cov(Yij , Yij |bi )] + Cov(E[Yij |bi ], E[Yij |bi ]), i = 1, 2, . . . , n j = 1, 2 (3.7) Therefore under the log-Gamma assumption for bi , the marginal variances can be derived: Cov(Yi1 , Yi1 ) = E[V ar(Yi1 |bi )] + Cov(E[Yi1 |bi ], E[Yi1 |bi ]) = E[V ar(Yi1 |bi )] CP CP + Cov(exp(β0 H + x1iβ CP H ) exp (bi ), exp(β0 H + x1iβ CP H ) exp (bi )) CP = E[V ar(Yi1 |bi )] + exp(2(β0 H + x1iβ CP H ))V ar(exp(bi )) CP = E[V ar(Yi1 |bi )] + exp(2(β0 H + x1iβ CP H ))θ (3.8) From Coxian PH properties in Chapter 2, we can obtain 2 V ar(Yi1 |bi ) = E(Yi1 |bi ) − (E(Yi1 |bi ))2 π˜ π˜ = 2(π Q−21 ) − (π Q−11 )2 π π = E[exp (2(x1iβ CP H + bi ))] 2π (Λ0 P)−21 − (π (Λ0 P)−11 )2 = exp(2(x1iβ CP H ))M E[exp(2bi )] 57 π π where M = 2π (Λ0 P)−21 − (π (Λ0 P)−11 )2 , Λ0 , P can be found in Chapter 2, Section 2.2. E[exp(2bi )] = E[Wi2 ] 1 −1 = = = θ 2 wi wi w exp (− θi ) 1 Γ(θ−1 )θ θ dwi 1 +1 w wiθ exp (− θi ) dwi 1 −1 )θ θ Γ(θ −1 θ−1 +2 Γ(θ + 2)θ Γ(θ−1 )θθ −1 =θ+1 With above results, marginal variance of Yi1 in (3.8) can be rewritten as: CP Cov(Yi1 , Yi1 ) = exp(2(x1iβ CP H )M (θ + 1) + exp(2(β0 H + x1iβ CP H ))θ = (3.9) CP exp(2(x1iβ CP H )(M (θ + 1) + exp(2(β0 H ))θ) Cov(Yi2 , Yi2 ) = E[V ar(Yi2 |bi )] + Cov(E[Yi2 |bi ], E[Yi2 |bi ]) = E[(exp(σ 2 ) − 1) exp(2µi + σ 2 )] 1 1 LN LN + Cov(exp(β0 + x2iβ LN + σ 2 )ebi , exp(β0 + x2iβ LN + σ 2 )ebi ) 2 2 (3.10) 2 ) − 1) exp(2(β LN + x β LN ) + σ 2 )E[e2bi ] = (exp(σ 0 2i 1 LN + exp(2(β0 + x2iβ LN + σ 2 ))V ar(ebi ) 2 LN = exp(2(β0 + x2iβ LN ) + σ 2 )[(exp(σ 2 ) − 1)(θ + 1) + θ] 58 Marginal Covariances Under the assumption of conditional independecne, Cov(Yi1 , Yi2 ) = E[Cov(Yi1 , Yi2 |bi )] + Cov(E[Yi1 |bi ], E[Yi2 |bi ]) = 0 + Cov(E[Yi1 |bi ], E[Yi2 |bi ]) 1 LN CP = [exp(β0 H + x1iβ CP H + bi ), exp(β0 + x2iβ LN + σ 2 + bi )] 2 1 2 CP LN = exp(β0 H + x1iβ CP H + β0 + x2iβ LN + σ )V ar(exp(bi )) 2 1 CP LN = exp(β0 H + x1iβ CP H + β0 + x2iβ LN + σ 2 )θ 2 (3.11) Marginal Correlations Corr(Yi1 , Yi2 ) = Cov(Yi1 , Yi2 ) V ar(Yi1 ) V ar(Yi2 ) CP exp(β0 H )θ = CP (M (θ + 1) + exp(2β0 H )θ) (exp(σ 2 ) − 1)(θ + 1) + θ) Obviously, θ → 0, Corr(Yi1 , Yi2 ) → 0 θ → ∞, Corr(Yi1 , Yi2 ) → CP exp(β0 H ) CP M + exp(2β0 H ) 59 exp(σ 2 ) (3.12) 3.3 Estimation Let Oi denote the observed data of a particular subject i, i.e. Oi = (Yi1 , Yi2 , x1i , x2i ) are i.i.d. for subjects, i = 1, 2, . . . , n. The likelihood for Oi is L(Oi ) = = f (yi1 |bi )f (yi2 |bi )p(bi |θ)dbi 1 π [exp(exp(−x1iβ CP H − bi )Λ0 Pyi1 )](− exp(x1iβ CP H − bi )Λ0 P1 ) (3.13) LN (log yi2 − (β0 + x2iβ LN + bi ))2 1 √ )p(bi |θ)dbi exp(− 2σ 2 yi2 2πσ A variety of numerical methods have been used to assess the above integral, for example, Laplace approximation, partial quasilikelihood, Gauss-hermite quadrature, adaptive Gaussian quadrature, and various Monte Carlo techniques. We will use SAS Proc NLMIXED (SAS Institute, Cary, NC, USA) for estimation in our joint modeling with shared random effects, based on the likelihood reformulation (LR) method proposed by Liu and Yu (2008). The basic idea is to reformulate the conditional likelihood on non-normal random effects to standard normal random effects and improve its computational efficiency. Apply LR method to reformulate (3.13) with respect to a log-Gamma density to a standard normal density φ(ai ), which gives, 60 L(Oi ) = = f (yi1 |ai )f (yi2 |ai ) p(ai |θ) φ(ai )dai φ(ai ) 1 π [exp(exp(−x1iβ CP H − ai )Λ0 Pyi1 )](− exp(x1iβ CP H − ai )Λ0 P1 ) LN (log yi2 − (β0 + x2iβ LN + ai ))2 p(ai |θ) 1 √ ) exp(− φ(ai )dai φ(ai ) 2σ 2 yi2 2πσ = Aj where li (3.14) A A C B exp(li 1 + li 2 + li − li ) is the conditional log-likelihood of Yij , j = 1, 2, i.e. A 1 li 1 = log π [exp(exp(−x1iβ CP H − ai )Λ0 Pyi1 )](− exp(x1iβ CP H − ai )Λ0 P1 ) LN (log yi2 − (β0 + x2iβ LN + ai ))2 A li 2 = − log yi2 − log σ − 0.5 log 2π − 2σ 2 B li is log of log-Gamma density function, i.e. a exp(ai ) B li = log p(ai θ) = −θ−1 log θ − log Γ(θ−1 ) + i − θ θ C and li is log of standard normal density function, i.e. 1 C li = log φ(ai ) = − a2 + Constant 2 i After above reformulation, likelihood function in (3.13) can be easily constructed by regular SAS programming statement and maximized by ’general’ option in the ’Model’ statement in Proc NLMIXED with adaptive Gaussian quadrature method. 61 3.4 Application to 2003 NIS AMI Hospitalized Patients In this section, we apply the proposed joint modeling approach to the analysis of LOS and cost for acute myocardial infarction (AMI) patients in the 2003 Nationwide Inpatient Sample (NIS), which is introduced in Chapter 2, Section 2.9. The LOS of AMI patients could vary widely depending on severity of illness, comorbidity and type of procedure. Most patients are discharged after a relatively short period time. However, some patients may remain in hospital over a long time period receiving continuous care. The LOS ranged between 1 and 142 days with mean 5.50 days and SD 5.90 days. Cost had a range from $90 to $962,611, with the median $31,704. Preliminary descriptive statistics is shown in Table 3.1. The mean LOS of female patients (mean=6.01) was higher than that of male patients (mean=5.21). The mean cost of female patients was $45,166, approximately $2,000 lower than that of males. Patients who underwent CABG had the highest LOS, followed by patients who underwent OTHER, CATH, PTCA and NONE. The same conclusion was obtained for CABG patients having highest mean cost of $109,188. Mean LOS and median cost were different according to CCI scores. Patients with CCI≥4 had the highest mean LOS (=7.89) and mean cost (=$52,282), followed by CCI=3, CCI=2 and CCI=1, respectively. However when concerning with median cost, the order could be changed. The plot of log-total charge versus LOS in Figure 3.1 indicates the positive correlation between the two outcomes. 62 Table 3.1: Characteristic of patients for LOS and Cost (N=1,1749) Variable Subgroup N(%) Mean LOS (SD) Mean Cost (SD) Median (Cost) Gender Female 4371 (37.2) 6.01 (6.43) 45,166 (54,755) 29,381 Male 7378 (62.8) 5.21 (5.54) 47,054 (54,394) 33,278 CCI CCI=1 4438 (37.8) 3.87 (3.79) 39,782 (39,497) 30,724 CCI=2 3423 (29.1) 5.61 (6.22) 48,883 (57,602) 33,489 CCI=3 1981 (16.9) 6.71 (6.54) 50,985 (61,542) 32,802 CCI≥4 1907 (16.2) 7.89 (7.33) 52,282 (68,103) 30,561 Procedure CABG 1408 (12.0) 11.60 (8.81) 109,188 (89,822) 81,701 PTCA 4736 (40.3) 4.04 (3.38) 46,555 (33,949) 375,71 CATH 2275 (19.4) 5.08 (5.12) 34,825 (40,781) 23,087 Other 1490 (12.7) 7.06 (8.16) 41,722 (65,561) 21,710 None 1840 (15.7) 3.88 (2.90) 15,744 (13,404) 11,946 Because healthcare utilization outcomes LOS and cost may be correlated, we apply our joint modeling approach proposed in Section 3.2 to fit the 2003 AMI patients data, linking the LOS and cost at the individual level with shared random effects. Potential correlates of LOS and cost included demographic and clinical variables that could be identified at admission, and the use of any cardiac procedures. In many applications, a two-phase Coxian model can provide sufficient flexibility to describe the evolution of the process over time. Therefor for simplicity we assume LOS has a 2-phase Coxian distribution. States typically represent unknown healthcare status. Gardiner et al. (2002) shows it is suitable to assume that in-hospital cost is log-normally distributed. SAS Proc NLMIXED with 10 quadrature points is used for estimation. Results are summarized in Table 3.2. Both LOS and cost are positively associated with age, female gender, CCI and procedure types. As expected older patients had longer LOS, higher cost, and the effect of female gender on LOS was an estimated 0.073 (β-coefficient) corresponding to an expected relative increase in LOS of 1.08, while female gender effect on cost is not significant. Higher comorbidity led to an increased stay by 1.25, 1.46, 1.78, and increased cost by 1.18, 1.29, 1.44 for CCI scores 2, 3, and ≥ 4 63 14 q q q q q 8 10 q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q qqq q q qq qqq qqq qq qqq q qqq q qqq qqq qqq qqq qqq qq qqq q q qqq q qqq qqq qqq qqq qqq qqq q qqq qqq qq qqq qqq qqq qqq qqq qqq qqq qqq qqq qqq qqq qqq qqq qqq qqq q qqq qqq qqq qqq qq qqq qqq qqq qqq q q q qqq qqq q qq qq qq qq q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q qq q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q qq q q qq qq q q q qq qq q qq qq q qq qq q qq qq qq qq q q qq qq q q qq q q q qqqqq q qqq q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q qq qq q q qq q qqq q q qq q q q q qqq qqqq q q qq q qqqq qqqq qqq q q q qq qqqq q q q q q q q q qq qq q q q q q q q qq qq q q q qq q q q q q q q q q q q q q q q q q q q q 6 Log (Cost), $ 12 q q q q q q 0 10 20 30 40 LOS (days) Figure 3.1: Plot of LOS vs. Log (Cost), truncated at 50 days 64 50 relative to a CCI score of 1. The effects of CABG, PTCA, CATH and ’other’ procedures, compared to no procedure was an estimated relative increase by 3.78, 1.42, 1.49, 1.63 for LOS and 8.24, 3.82, 2.33, 2.17 for cost, respectively. These results were consistent with the raw data. In addition, from Table 3.2 we observe that random effects exist among the subject ˆ level. The significant random effect variance component (θ = 0.290, p − value < 0.0001) indicates that a patient with longer length of stay tends to have higher cost. For comparison, we also fit the separate models in (3.1) without random effects assuming 2-phase Coxian for LOS, and log-normal for cost. Results are provided in Table 3.3. We found that estimated coefficient for patients with other procedure types is larger than our model. There also exists some noticeable difference in the scale parameter estimate for σ. Table 3.2: Estimates of coefficients and random effects (γ = 1) Variable CPH-Lognormal LOS (CPH) Cost (LN) Estimate (SE) Estimate (SE) Intercept 1.005 ( 0.026) 9.312 ( 0.021) AGE 0.008 (0.0007) 0.003 (0.0006) Gender Female 0.073 ( 0.018) 0.005 ( 0.010) CCI CCI=2 0.221 ( 0.021) 0.162 ( 0.017) CCI=3 0.378 ( 0.026) 0.257 ( 0.021) CCI≥4 0.577 ( 0.027) 0.362 ( 0.021) Procedure CABG 1.330 ( 0.032) 2.109 ( 0.026) PTCA 0.353 ( 0.027) 1.341 ( 0.021) CATH 0.401 ( 0.029) 0.846 ( 0.023) Other 0.489 ( 0.032) 0.774 ( 0.025) Scale σ 0.450 ( 0.005) Random Effect θ 0.290 ( 0.004) Note:CCI=Charlson Comorbidity Index. Reference group: Gender=male, Procedure=none, CCI=1. 65 Table 3.3: Estimates of coefficients without random effects Variable CPH-Lognormal LOS (CPH) Cost (LN) Estimate (SE) Estimate (SE) Intercept 0.926 ( 0.021) 9.190 ( 0.020) AGE 0.008 (0.0005) 0.003 (0.0005) Gender Female 0.092 ( 0.014) 0.012 ( 0.014) CCI CCI=2 0.261 ( 0.017) 0.155 ( 0.016) CCI=3 0.429 ( 0.021) 0.258 ( 0.019) CCI≥4 0.618 ( 0.021) 0.370 ( 0.020) Procedure CABG 1.228 ( 0.003) 2.084 ( 0.025) PTCA 0.265 ( 0.021) 1.285 ( 0.020) CATH 0.346 ( 0.023) 0.772 ( 0.022) Other 0.518 ( 0.026) 0.663 ( 0.022) Scale σ 0.698 ( 0.004) Note:CCI=Charlson Comorbidity Index. Reference group: Gender=male, Procedure=none, CCI=1. 3.5 Discussion In this chapter we propose a novel joint model for correlated LOS and cost data. Our study demonstrates the application of joint modeling of correlated LOS and total cost for hospitalized patients. The underlying distributions, with conditional independence assumption are Coxian PH for LOS and log-normal for cost. Gaussian quadrature technique implemented in SAS Proc NLMIXED is used for ML estimation. Our model is comprehensive yet easy to fit. These advantages make it valuable in practical data analysis. The strength of Coxian PH regression models for LOS lies in their flexibility in accommodating extreme values, while revealing hidden status possibly due to the presence of other comorbid conditions. We find a Coxian PH model serves well for LOS. Cost easily fitted through a log-normal regression model. The correlation between them can be accommodated by shared random effects. Joint modeling approach is primarily necessary when researchers 66 are more interested in the association of different two outcomes. The management of hospital LOS and cost has become an important issue, since the determination of relevant factors and hidden structures could inform discharge planning and allocation of resources. In the application, we showed that there exists significant association between LOS and cost at the subject level. Moreover, this approach can be extended to analyze longitudinal data or clustered data exhibiting unobserved heterogeneity. If the data are incomplete, for example, when patients who die in hospital, this joint model can be generalized to deal with censored observations by modifying the log-likelihood function (Liu et al., 2007). 67 Chapter 4 Bivariate Copula Random-effects (BCRE) Model for Length of Stay and Cost This chapter is concerned with regression models for correlated outcomes constructed using copula functions and correlated random effects. Our approach entails specifying conditional marginal regression models with random effects for the outcomes and combining them to form a joint model via a specified copula. 4.1 Introduction As we mentioned in Chapter 3, mixed outcomes have attracted more attention in health and medicine, and joint analysis of such outcomes entails specification of models flexible enough to accommodate them. Such joint models are potentially advantageous in several statistical and practical respects. For example, a multivariate model enables analysts to 68 account for relationships between outcomes and assess at the same time the joint influence of predictors/covariates have on them. Multilevel models (also called hierarchical linear models, nested models, mixed models, random-effects models, random-coefficient models, or split-plot designs) are statistical models for addressing variation at more than one level. They can be viewed as generalizations of linear models. The multivariate normal distribution is by far the most commonly used to model multivariate outcomes. However, multivariate normality may not be a proper assumption in many situations. For example, if two outcomes are positive and skewed, the log-normal or log-logistic regression models might be more appropriate. This is especially true for LOS and cost where positive right skewness and correlation are present. In the recent years, some attempts have been made to relate those two outcomes that permit consideration of the correlation between LOS and medical charges. Gardiner et al. (2002) propose a two-equation model for total cost and duration of treatment with the endogeneity of the later accounted for in the model for cost. In their model, correlation is assessed either under the assumption of bivariate normal for measurement error or by the “seemly unrelated regression”. Often different survival distributions give better fit to LOS and cost, for example, a log-logistic LOS and log-normal cost. The new approach that we consider here provides a more general and flexible model for the related variables. Our approach is based on copula dependence modeling. Copula functions are useful tools to model dependence for multivariate outcomes. There is an increasing use of copulas in several scientific fields, such as economics (Cameron et al., 2004), survival analysis (Lambert and Vandenhende, 2002), finance (Bee, 2004; Breymann et al., 2003) and insurance (Klugman 69 and Parsa, 1999). A copula is a function that connects the marginal distributions to restore the joint distribution with a dependence parameter. For example, multivariate Gaussian distribution can be generated by a Gaussian copula applied to the Gaussian marginal distributions. More importantly, a copula can provide many flexible, complicated non-Gaussian joint distributions. The context of our application of copula models is LOS and cost. They are likely to have different marginal distributions being correlated. In addition to the correlation for LOS and cost for each patient, another potential correlation exists at the cluster (hospital) level. For example, unmeasured latent variables (hospital efficiency, provider characteristics, etc.) induce a random effect shared among patients within the same cluster that affects the outcomes and results in a within-cluster correlation. We can integrate out this latent random effect to derive a new marginal distribution for the outcomes, but the forms are typically not closed and complicated. In fact, the random effect (RE) approach is intuitive to researchers who believe that there may be some latent quantity underlying a cluster’s LOS and cost. The random-effects models have been used extensively in clustered and longitudinal data analysis, see Allison (2005); Chen and Dunson (2003); Wooldridge (2002). For example, Chen and Dunson (2003) propose an approach for random-effects model selection and apply it to study the relationship between prenatal exposure to polychlorinated biphenyls and motor development in young children with possible heterogeneity among the 12 centers. In this chapter, we develop a new flexible joint model based on correlated measurement errors modeled by copulas and incorporate a cluster level random effect to account for individual and within-cluster correlations simultaneously. The proposed approach tries to capture the various dependence structures of LOS and cost (symmetric or asymmetric) 70 in the copula function, and takes advantage of the relative ease in specifying the marginal distributions and introduction of within-cluster correlation based on the cluster level random effects. 4.2 Copulas and Dependence: a Brief Overview The study of copula functions was initiated in the 1940s by Hoeffding, and further developed by Fr´chet. Model theories about copulas were introduced by Sklar in 1959 (Sklar, 1959). e Nelsen (1999); Joe (1997); Trivedi and Zimmer (2005) provide a comprehensive discussion of copulas and their applications. Copulas are parametrically specified joint distributions generated from given marginal distributions. Therefore, properties of copulas are analogous to properties of joint distributions. In this section we provide some fundamental properties of copulas. The copula approach is a modeling strategy whereby a joint distribution is induced by specifying marginal distributions and a copula function with a dependence parameter. Sklar’s theorem states that there exists a copula function which acts to represent the joint cumulative distribution functions (CDF) of random variables in terms of its underlying one dimensional margins. An m-copula can be defined as an m-dimensional CDF whose support is contained in [0, 1]m and whose one-dimensional margins are uniform on [0, 1]. Consider a continuous m-variate distribution F (y1 , y2 , . . . , ym ) with univariate marginal distributions −1 −1 −1 F1 , F2 , . . . , Fm and inverse functions F1 , F2 , . . . , Fm . Let u1 , u2 , . . . , um ∈ [0, 1], y1 = 71 −1 −1 −1 F1 (u1 ), y2 = F2 (u2 ), . . . , ym = Fm (um ). Then −1 −1 −1 F (y1 , y2 , . . . , ym ) = F F1 (u1 ), F2 (u2 ), . . . , Fm (um ) = Pr (U1 ≤ u1 , U2 ≤ u2 , . . . , Um ≤ um ) (4.1) = C(u1 , u2 , . . . , um ) where (U1 , U2 , . . . , Um ) are each marginally uniform on [0, 1]. C is the unique copula associated with the distribution function F . Here for simplicity, we discuss the bivariate copulas to illustrate copula modeling approach. Consider two random variables X and Y with continuous distributions F and G, respectively, and joint distribution function H. Theorem 4.1 (Sklar’s theorem). Let H be a joint distribution function with margins F and G, then there exists a 2-dimensional copula function Cθ : [0, 1]2 → [0, 1] such that H(x, y) = Cθ (F (x), G(y)) where θ is the dependence parameter, which measures dependence between the margins. Assume Cθ , F, G are differentiable, we can write down the joint density function of X and Y in the form: h(x, y) = ∂ 2 H(x, y) ∂ 2 Cθ (F (x), G(y)) = = cθ (u1 , u2 )f (x)g(y) ∂x∂y ∂x∂y (4.2) ∂ 2 Cθ (u1 , u2 ) where u1 = F (x), u2 = G(y), cθ (u1 , u2 ) = and f, g are the marginal univariate ∂u1 ∂u2 densities. Three bivariate copulas of importance are 72 1. Independent (Product) Copula : Π(u, v) = uv 2. Maximum Copula: W (u, v) = max(u + v − 1, 0) 3. Minimum Copula: M (u, v) = min(u, v) where (u, v) ∈ [0, 1]2 . W and M are called the Fr´chet lower and upper bounds and have e the property that all bivariate copulas C satisfy W ≤ C ≤ M (Smith, 2003). Family of copulas Cθ are indexed by the association parameter θ which captures dependence between the random variables interested since copula parameters have different ranges and interpretations, they are not comparable to each other. Here, we present three dependence concepts. 1. Pearson correlation. The Pearson correlation coefficient ρ(X, Y ) for two random variables is a measure of linear dependence, given by ρ= Cov(X, Y ) V ar(X)V ar(Y ) The Pearson correlation coefficient is the most popular measure of the linear association. For elliptical copulas (such as Gaussian copula, ρ appears naturally as the . An alternative measure of dependence is rank correlation, including Kendall’s tau and Spearman’s rho, which are measures of concordance. 2. Kendall’s tau and Spearman’s rho. Kendall’s tau and Spearman’s rho are associations between rankings instead of the actual values of the observations. Hence Kendall’s tau 73 is an alternative measure of association for non-elliptical distributions. τ (X, Y ) = Pr[concordance] − Pr[discordance] = Pr[(X1 − X2 )(Y1 − Y2 ) > 0] − Pr[(X1 − X2 )(Y1 − Y2 ) < 0] where (X1 , Y1 ) and (X2 , Y2 ) are two independent pairs of random variables (X, Y ) from H. Spearman’s rho is the linear correlation between F (X) and G(Y ) defined as: ρS (X, Y ) = ρ(F (X), G(Y )) Both τ (X, Y ) and ρS (X, Y ) can be expressed in term of copulas: 1 τ (X, Y ) = 4E [C(U1 , U2 )] − 1 = 4 0 0 1 ρS (X, Y ) = 12E[U1 U2 ] − 3 = 12 0 1 C(u1 , u2 )dC(u1 , u2 ) − 1 1 0 u1 u2 dC(u1 u2 ) − 3 where (U1 , U2 ) ∼ C with uniform margins. For continuous variables the above measures take value [-1,1]. For the independent copula Π, both measures are 0. For the Fr´chet lower bound W , both measures are -1. e For the Fr´chet upper bound M , both measures are 1. From the above expressions for e τ (X, Y ), ρS (X, Y ) both measures depend on the copula of the joint distribution and not on the margins. They are also invariant with respect to strictly increasing transformation of (X, Y ). Recall that the Pearson correlation is not a measure of independence: for example, ρ(X, Y ) = 0 does not imply independence of the two variables. Table ?? below gives the 74 functional form of some selected copulas. For simplicity, we here write down copulas in terms of random variables U1 and U2 that have standard uniform marginal distributions. Table ?? summarizes several bivariate copula functions in terms of U1 and U2 . Farlie-Gumbel-Morgenstern (FGM) copula The FGM copula is defined as Cθ (u1 , u2 ) = u1 u2 (1 + θ(1 − u1 )(1 − u2 )) , θ ∈ [−1, 1]. The joint distribution of (X, Y ) is H(x, y) = F (x)G(y) (1 + θ(1 − F (x))(1 − G(y))) (4.3) X and Y are independent as θ = 0, and the FGM copula collapses to independent copula. The FGM copula does not caputre the Fr´chet lower & upper bounds as special cases. The e joint density of the FGM coula is h(x, y) = ∂ 2 H(x, y) = (1 + θ(1 − 2F (x))(1 − 2G(y))) f (x)g(y) ∂x∂y (4.4) The FGM copula is attractive due to its simplicity but it can only be useful when dependence between the two margins is modest. 75 Gaussian Copula The Gaussian copula is defined by Cθ (u1 , u2 ) = Φ2 Φ−1 (u1 ), Φ−1 (u2 ); θ , θ ∈ [−1, 1]. The joint distribution of (X, Y ) is H(x, y) = Φ2 Φ−1 [F (x)], Φ−1 [G(y)]; θ (4.5) where Φ(.) is the CDF of the standard normal distribution, and Φ2 (.) is the standard bivariate normal CDF with correlation parameter θ. X and Y are independent if θ = 0. The Gaussian copula contains the Fr´chet bounds: for θ = −1 we get W , and for θ = 1 we get M . The e density of H is h(x, y) = g(x) ∂ 2 H(x, y) f (x) = φ2 Φ−1 [F (x)], Φ−1 [G(y)]; θ ∂x∂y φ(Φ−1 [F (x)]) φ(Φ−1 [G(x)]) (4.6) where φ2 is the density of bivariate standard normal with correlation θ, i.e. x2 − 2θxy + y 2 φθ (x, y) = exp − 2(1 − θ2 ) 2π 1 − θ2 √ 1 φ(.) is the standard normal density, f (x) and g(y) are densities of X and Y , respectively. The higher the association parameter θ , the stronger the dependence. We can see the dependence structure is symmetric, see left top in Figure 4.1. 76 Clayton Copula Clayton copula has the form Cθ (u1 , u2 ) = u−θ + u−θ − 1 1 2 −1/θ , θ ∈ (0, ∞). The corresponding joint distribution function and density function are H(x, y) = F (x)−θ + G(y)−θ − 1 h(x, y) = −1/θ (4.7) −1/θ−2 ∂ 2 H(x, y) = (1 + θ)[F (x)G(y)]−θ−1 F (x)−θ + G(y)−θ − 1 f (x)g(y) ∂x∂y (4.8) X and Y are independent as θ → 0. When correlation between two variables is strongest in the left tail of the joint distribution, Clayton copula is an appropriate choice. Frank Copula The Frank copula takes the form: Cθ (u1 , u2 ) = −θ−1 log 1 + (e−θu1 − 1)(e−θu2 − 1) e−θ − 1 The corresponding distribution function is H(x, y) = −θ−1 log 1 + exp(−θF (x)) − 1)(exp(−θG(y)) − 1) exp(−θ) − 1 (4.9) where dependence parameter θ ∈ (−∞, ∞) \ {0}. Values −∞, and ∞ correspond (limit) to the Fr´chet lower bound and upper bound, respectively. We also have H(x, y) = F (x)G(y) e 77 if θ → 0. The left bottom panel of Figure 4.1 shows the strongest dependence is centered in the middle of distribution. Gumbel Copula The Gumbel copula has the form: Cθ (u1 , u2 ) = exp −[(− log u1 )θ + (− log u2 )θ ]1/θ , θ ∈ [1, ∞]. The joint distribution function is H(x, y) = exp −[(− log F (x))θ + (− log G(y))θ ]1/θ (4.10) X and Y are independent as θ = 1. Gumbel copula exhibits strong right tail dependence, while does not allow negative dependence. 78 Clayton copula, θ=2 2 1 0 −1 −2 −2 −1 0 1 2 Gaussian copula, θ=0.7 −2 −1 0 1 2 −2 Frank copula, θ=5.74 −1 0 1 2 2 1 0 −1 −2 −2 −1 0 1 2 Gumbel copula, θ=2 −2 −1 0 1 2 −2 −1 0 1 2 Figure 4.1: Bivariate pdf contour plots induced by copula, N (0, 1) margins (Kendall’s tau=0.5) 79 Table 1: Selected examples of copulas Copula Copula function Cθ (u1 , u2 ) Density cθ (u1 , u2 ) a FGM u1 u2 (1 + θ(1 − u1 )(1 − u2 )) 1 + θ(1 − 2u1 )(1 − 2u2 ) φ2 Φ−1 (u1 ), Φ−1 (u2 ); θ −1 (u ), Φ−1 (u ) Gaussian Φθ Φ 1 2 φ(Φ−1 (u1 ))φ(Φ−1 (u2 )) −1/θ Clayton u−θ + u−θ (1 + θ)[u1 u2 ]−θ−1 (u1 + u2 − 1)−1/θ−2 1 2 −θ−1 log 1+ −θ(e−θ − 1)e−θ(u1 +u2 ) Frank (exp(−θu1 ) − 1)(exp(−θu2 ) − 1) (e−θu1 − 1)(e−θu2 − 1) + (eθ − 1)2 exp(−θ) − 1 (˜1 u2 )θ−1 u ˜ Cθ (u1 , u2 )(u1 u2 )−1 1/θ (˜θ + uθ )2−1/θ u1 ˜ 2 Gumbel exp − (− log u1 )θ + (− log u2 )θ θ + uθ )1/θ + θ − 1 (˜1 ˜2 u where u1 = − log u1 , and u2 = − log u2 ˜ ˜ a FGM denotes Farlie-Gumbel-Morgenstern. b the density function D (z) = kz −k z t (et − 1)−1 dt for k any positive integer. k 0 k 80 θ-domain −1 ≤ θ ≤ 1 Kendall’s τ 2θ 9 −1 ≤ θ ≤ 1 2 π arcsin(θ) θ ∈ (0, ∞) θ θ+2 1 − 4 [1− θ θ ∈ (−∞, ∞) \{0} D1 (θ)]b θ ∈ [1, ∞) 1− 1 θ 4.3 Bivariate Copula Random-effects (BCRE) Model 4.3.1 BCRE Model and Likelihood In this section, we present our model for the hospital length of stay (LOS) and cost at two levels. We define notation as follows. Suppose that we observe LOS Tij and cumulative cost Cij = C(Tij ) for the jth subject in ith hospital, where i = 1, 2, · · · , n, j = 1, 2, · · · , ni . Denote by x1,ij , x2,ij the covariate vectors for the fixed effect, specific to the type of outcome. In practice, they may represent the same common covariate constellation. Let ui and vi be the correlated random effects at the hospital level with joint density φ(ui , vi ), that is     2 ρτ1 τ2   ui   τ1   ∼ N (0, Σ), with Σ =   being a positive definite matrix, ρ ∈ 2 ρτ1 τ2 τ2 vi [−1, 1]. Denote by 1,ij and 2,ij the correlated measurement error terms for the positive values of Tij and Cij . We assume that measurement errors 1,ij and 2,ij are independent of random effects ui and vi , but jointly distributed with some specified marginal distributions (normal, logistic, etc.) from a particular copula with dependence parameter θ, for example, assume 1,ij ∼ N (0, 1) and 2,ij ∼ N (0, 1). Our Bivariate copula random-effects (BCRE) model is defined as log Tij = x1,ij β 1 + ui + σ1 1,ij (4.11) log Cij = x2,ij β 2 + vi + σ2 2,ij where β 1 and β 2 are regression coefficient vectors, respectively, σ1 and σ2 are two scale parameters. The joint distribution of random effects defines the correlation (between, and cross equation) among clustered measures (Ti , Ci ). In one aspect, ui and vi model intra-hospital 81 correlation, which induces the ”within-hospital correlation”, i.e. they generate dependence between those individuals in the same cluster (hospital), whereas conditional on the random effects those individuals are independent. In another aspect, the correlation between ui and vi describes the ”cross-equation correlation” at the hospital level. There might also be ”cross-equation correlation” at the individual level between the two equations, which is modeled by copula dependence parameter θ. For example, patients with longer hospital length of stay tend to incur more cost. Also, patients with higher cost are more likely to be hospitalized. Under the assumptions that {( 2,ij , 2,ij ), 1 ≤ i ≤ n, 1 ≤ j ≤ ni } are correlated and {( m,ij , m,ik ), m = 1, 2, j = k} are uncorrelated, the variances and covariances are easily calculated 2 2 2 V ar(log Tij ) = V ar(ui ) + σ1 V ar( 1,ij ) = τ1 + σ1 2 2 2 V ar(log Cij ) = V ar(vi ) + σ2 V ar( 2,ij ) = τ2 + σ2 2 2 Cov(log Tij , log Tik ) = V ar(ui ) + σ1 Cov( 1,ij , 1,ik ) = τ1 2 2 Cov(log Cij , log Cik ) = V ar(vi ) + σ2 Cov( 2,ij , 2,ik ) = τ2 Cov(log Tij , log Cij ) = Cov(ui , vi ) + σ1 σ2 Cov( 1,ij , 2,ij ) = ρτ1 τ2 + σ1 σ2 Cov( 1,ij , 2,ij ) Cov(log Tij , log Cik ) = Cov(ui , vi ) + σ1 σ2 Cov( 1,ij , 2,ik ) = ρτ1 τ2 where i = 1, 2, . . . , n, j, k = 1, 2, . . . , ni , j = k. Such equations are often necessary by identifying all the variance/covariance components. 82 Likelihood Denote by ti = (ti1 , ti2 , . . . , tini ) and ci = (ci1 , ci2 , . . . , cini ) the vectors of response variables, LOS and cost, for the i-th unit (cluster), i = 1, 2, . . . , n. The likelihood for (ti , ci ) is Cθ Li = j F( log tij − x1,ij β 1 − ui log cij − x2,ij β 2 − vi ), G( ) σ1 σ2 log tij − x1,ij β 1 − ui 1 log cij − x2,ij β 2 − vi 1 f( ) g( )p(ui , vi |τ1 , τ2 , ρ)dui dvi σ1 tij σ1 σ2 cij σ2 (4.12) where F and G are CDF of 1 and 2 , similarly, f and g are PDF of 1 and 2 , separately. The whole estimation process can be conveniently implemented in SAS. 4.3.2 Estimation The EM algorithm (Dempster. et al., 1977) is commonly used for estimation in joint randomeffects models, with random effects treated as missing data. However, for the above models, the conditional expectations of random effects given observed data do not have a closed form. Monte Carlo methods (e.g. the Metropolis−Hastings algorithm) are often needed in the E-step to approximate these terms, making the estimation highly computationally intensive. Furthermore, the implementation (programming) of the Monte Carlo EM method is quite difficult and must be treated case by case. Therefore, joint models are not yet widely adopted in practical data analysis. Likelihood (4.12) involves an integral with respect to random effects. Numerical integration techniques, e.g. Gaussian quadrature, thus can be adopted for estimation. The resulting parametric likelihood can be maximized conveniently by Gaussian quadrature tools in stan83 dard statistical packages such as Proc NLMIXED in SAS. An introduction to the Gaussian quadrature technique and implementation in SAS is given in Appendix A. Generally, we have two estimation methods when using copulas. Full maximum likelihood (FML) approach is the most direct estimation method. To obtain FML estimates, one maximizes the loglikelihood function l(ΩFML ) = n i=1 Li where Li is obtained from Eq (4.12) and Ω denotes all parameters. By standard likelihood theory under regularity conditions, the ML estimates ΩFML is consistent for the true parameter vector Ω0 and retains its asymptotic normality (Trivedi and Zimmer, 2005). Joe and Xu (1996) propose a two-stage estimation method called inference function for margins (IFM). The set of parameters of the model are estimated through a (nonlinear) system of estimating equations, with each estimating equation being a score function (partial derivative of a loglikelihood) from some marginal distribution of the multivariate model. However, the standard error of ΩIFM is not appropriately taken care of, which is typically solved by bootstrapping method. In this chapter, we apply the FML with Gaussian quadrature for the model estimation. By simulations, we found that such method yields satisfactory estimates. 4.4 Simulation In this section, we conduct a simulations study to evaluate the performance of the proposed estimation method. Data are simulated under a regression model with random effects. In each simulated data set, n = 30 clusters and ni = 20 observations are fixed within each cluster, which gives 600 observations per simulated data set. The covariate vectors x1,ij and x2,ij consist of the constant 1 (representing the intercept term) and values randomly generated from the uniform (−0.5, 0.5) distribution, with associated regression coefficients 84 (1, −0.5) for the LOS measure, and (4, 0.5) for the cost measure. The correlation of the random effects ρ is chosen to be 0.5. We assume that measurement errors 1,ij and 2,ij have normal marginal distributions. The true value of the association parameter θ for each copula is chosen so that Kendall’s tau is approximately equal to 0.3, except for the FGM copula, which cannot accommodate dependence of this magnitude. Alternatively in the FGM case, we set θ = 0.5, with corresponding Kendall’s tau value of 0.11. According to this simulation design, 30 realizations of the random cluster effects from a normal distribution are generated for each outcome. The simulation study is designed to evaluate the performance of the estimators. The number of replications is 500 for each setting considered. For each replication, new realizations of 1,ij and 2,ij are randomly drawn from each type of copula, which results in the new observations of LOS and cost. Results of the simulation study are presented in Table 4.2, reporting the average estimate and mean standard errors of the parameter estimates over the 500 replicates. It is evident that the FML estimators of the regression coefficients have negligible biases and relatively small MSE. For variance component parameters, the FML estimator also performs reasonably well in each setting considered. The simulation results thus confirm the applicability of the FML for parameter estimation in the correlated regression model with random effects. 85 Table 2: Simulation results for bivariate copulas with normal margins ρ = 0.5 Gaussian Copula Clayton Copula Frank Copula Gumbel Copula FGM Copula (θ = 0.45) (θ = 0.86) (θ = 5.40) (θ = 1.43) (θ = 0.50) DGP Est SE SEM Est SE SEM Est SE SEM Est SE SEM Est SE SEM β01 1.00 0.999 0.152 0.149 1.001 0.151 0.148 1.000 0.149 0.147 1.001 0.147 0.151 1.000 0.149 0.147 β11 -0.50 -0.503 0.120 0.120 -0.502 0.124 0.121 -0.499 0.118 0.127 -0.498 0.121 0.124 -0.499 0.162 0.157 β02 4.00 3.997 0.247 0.246 3.991 0.227 0.228 3.990 0.239 0.224 3.988 0.235 0.227 3.989 0.231 0.222 β12 0.50 0.500 0.267 0.270 0.500 0.260 0.241 0.502 0.258 0.256 0.502 0.271 0.260 0.501 0.262 0.257 σ1 1.00 1.001 0.030 0.031 0.999 0.031 0.029 0.996 0.030 0.029 0.996 0.031 0.031 1.002 0.030 0.028 σ2 2.00 1.999 0.059 0.059 1.998 0.059 0.057 1.997 0.057 0.058 1.995 0.059 0.059 2.004 0.061 0.058 τ1 0.80 0.775 0.113 0.113 0.778 0.116 0.108 0.777 0.111 0.110 0.778 0.113 0.116 0.776 0.111 0.112 τ2 1.20 1.166 0.180 0.176 1.167 0.175 0.170 1.164 0.180 0.175 1.160 0.176 0.175 1.162 0.183 0.179 ρ 0.50 0.502 0.151 0.147 0.504 0.149 0.147 0.505 0.159 0.154 0.503 0.147 0.149 0.502 0.147 0.149 θ 0.452 0.154 0.156 0.863 0.223 0.231 5.409 0.337 0.340 1.434 0.156 0.162 0.504 0.503 0.041 Est is the mean of the parameter estimates (based on 500 replicates); SE is the sampling standard error of the parameter estimates; SEM is the sampling mean of the standard error estimates. 86 4.5 Application to 2003 NIS AMI Hospitalized Patients We apply our method discussed above to the 2003 NIS for LOS and charges for patients hospitalized for AMI. There is a high correlation between LOS and cost (rank correlation r = 0.576 and Kendall’s tau τ = 0.431, N = 11, 749). The range of the LOS was 1 to 142 days, and cost ranged from $90 and $962,611. Figure 4.2 show a plot of log(LOS) and 14 log(cost). q q q q 10 8 q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q qq q q qq qq qq q q q q q q q qq q qq q qq qq q q q q q qq q q q q q q q q qq q q q q qqqqq q q q qqq q q q q q q q qqqqqq q q q q q q q q q qqqqqqq q q q qq q q q q qqq q q q q q q q q q qqqqqqq q q qq q q q q q q qq qq q q q q qqqqqqq q q q q q q q q q q qqqqqqq q q q q qq q q q q qq q q q q q q qqqqqqq q q q q q q q q q q qq q q qq q q q q q qqqqqqq q q q q q q q qq q q q q q qqq qq q q q q q q q q qqqqqqq q q q q q q q q q q q q q qqqqqqq q q q q q q q qqqqqqq q q q qq q q q q q q q qqqqqqq q q q q q q qqqqq q q q q q q q qq q q q q q q qqqqq q q q qq q q q q qqq qq q q q q q q qqqq q q q q q qqqqq q q q q q q q q qq q q q q qqqqqq q q q q q qq qq q qq q qq q q q qq q q q q qq q q q q qq q q q q q qq q q q q q q q qq q q q q q qq q q qq q q qq q q q q qqqqq qq q q q q qq qqq q qq q qq q qq q qq q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q 6 Log (Cost), $ 12 q q q q 0 1 2 3 Log (LOS), days Figure 4.2: Plot of Log (LOS) vs. Log (Cost), truncated at 50 days 87 4 The bivariate model we use for LOS and cost allow inference about regression parameters simultaneously for these two outcomes. Our method accounts for the existence of both the non-zero intra-hospital and individual level correlations. For comparison, we also fit two reduced models as special cases. A reduced model A is fit without random effects and a reduced model B is fit with independent measurement errors. The covariates of interest include age, gender, comorbidity and procedure type. To correct for skewness to the right of LOS and cost, we take the logarithm of the outcomes. Level 1 is the subject level, Level 2 is the hospital level. We use the adaptive Gaussian quadrature estimation method with five quadrature points in Proc NLMIXED. Different starting values are used in the estimation. We fit with three different copulas (Gaussian, Clayton and Gumbel) and chose the best one based on BIC. We conclude that in our application, the Gumbel copula performs the best. We include the results for covariates with Gumbel copula in Table 4.3. Table 4.3 shows that comorbidities and procedure types are highly significant in both the LOS and cost of the model. CABG patients have significant longer LOS and higher cost. Subjects with high comorbidity (CCI≥4) are more likely to stay longer and incur more cost at a rate of 1.72 and 1.41 respectively, compared to those with low comorbidity CCI=1 (p < 0.0001). Patient age has a significant effect in both equations of the model, but with ˆ ˆ very small effect size (β1,age = 0.0008, β2,age = 0.0003, p < 0.0001). Gender has a significant effect in LOS, but there is no difference between males and females in cost. Bottom of Table 4.3 shows the estimates of the covariance matrix and copula association parameter. We note that both the random effects are present for LOS and cost, as τ1 and τ2 are highly significant. A significant cross-equation correlation is seen (ˆuv = 0.615, p < ρ 0.0001), suggesting that a hospital with higher LOS also has higher cost. At the same time, 88 ˆ there is an association between LOS and cost at the patient level, θ = 2.051(p < 0.0001), resulting in the estimates of Kendall’s tau=0.512. Results for the Gaussian and Clayton copulas are shown in Table 4.6 and Table 4.7. For comparison, we fit two additional models. Reduced model A assumes that there are no random effects at the hospital level. In this model, we only consider association between LOS and cost at the patient level via measurement errors connected with a copula function. Results are shown in Table 4.4. We observe that this model yields the estimates of age, gender, comorbidities and procedure type close to those in our model. However some notable difference are present. We note a larger variance σ 2 for both LOS and cost, which might be due to ignoring the heterogeneity at the hospital level. Reduced model B only involves the random effects, assuming independent measurement errors. Table 4.5 shows the estimates for model B. We see that the estimate ρuv is higher than that in our model ˆ possibly transferring some correlation to the hospital level. 89 Table 4.3: Parameter estimates (SE) for log-normal margins with Gumbel copula LOS Cost Estimate (SE) p-value Estimate (SE) p-value Intercept 0.753 (0.019) < 0.0001 9.131 (0.018) < 0.0001 Age 0.008(0.0005) < 0.0001 0.003(0.0004) < 0.0001 Gender Female 0.071 (0.012) < 0.0001 0.001 (0.011) 0.922 Procedure CABG 1.236 (0.023) < 0.0001 1.928 (0.021) < 0.0001 PTCA 0.270 (0.019) < 0.0001 1.189 (0.018) < 0.0001 CATH 0.314 (0.021) < 0.0001 0.704 (0.019) < 0.0001 Other 0.366 (0.022) < 0.0001 0.672 (0.020) < 0.0001 CCI 2 0.096 (0.018) < 0.0001 0.131 (0.013) < 0.0001 3 0.337 (0.019) < 0.0001 0.222 (0.016) < 0.0001 ≥4 0.542 (0.018) < 0.0001 0.343 (0.016) < 0.0001 Scale (σ) 0.649 (0.004) < 0.0001 0.564 (0.004) < 0.0001 Random Effects (τ ) 0.129 (0.007) < 0.0001 0.424 (0.007) < 0.0001 Correlation (ρ) 0.615 (0.011) < 0.0001 θ 2.051 (0.020) < 0.0001 BIC 310631 Note:CCI=Charlson Comorbidity Index. Reference group: Gender=male, Procedure=none, CCI=1. Table 4.4: Parameter estimates (SE) for log-normal margins with Gumbel copula, reduced model A without random effects LOS Cost Estimate (SE) p-value Estimate (SE) p-value Intercept 0.758 (0.019) < 0.0001 9.203 (0.018) < 0.0001 Age 0.008(0.0005) < 0.0001 0.003(0.0005) < 0.0001 Gender Female 0.078 (0.013) < 0.0001 0.004 (0.011) 0.787 Procedure CABG 1.287 (0.027) < 0.0001 2.052(0.026) < 0.0001 PTCA 0.303 (0.021) < 0.0001 1.283 (0.022) < 0.0001 CATH 0.318 (0.026) < 0.0001 0.774 (0.025) < 0.0001 Other 0.330 (0.033) < 0.0001 0.656 (0.023) < 0.0001 CCI 2 0.201 (0.014) < 0.0001 0.142 (0.017) < 0.0001 3 0.362 (0.016) < 0.0001 0.243 (0.016) < 0.0001 ≥4 0.565 (0.017) < 0.0001 0.362 (0.019) < 0.0001 Scale (σ) 0.662 (0.004) < 0.0001 0.702 (0.005) < 0.0001 θ 1.640 (0.015) < 0.0001 Note:CCI=Charlson Comorbidity Index. Reference group: Gender=male, Procedure=none, CCI=1. 90 Table 4.5: Parameter estimates (SE) for log-normal margins with Gumbel copula, reduced model B with independent errors LOS Cost Estimate (SE) p-value Estimate (SE) p-value Intercept 0.758 (0.019) < 0.0001 9.236 (0.021) < 0.0001 Age 0.008(0.0005) < 0.0001 0.003(0.0004) < 0.0001 Gender Female 0.077 (0.013) < 0.0001 0.008 (0.011) 0.469 Procedure CABG 1.254 (0.024) < 0.0001 1.922 (0.022) < 0.0001 PTCA 0.257 (0.019) < 0.0001 1.181 (0.019) < 0.0001 CATH 0.294 (0.021) < 0.0001 0.682 (0.020) < 0.0001 Other 0.351 (0.023) < 0.0001 0.657 (0.021) < 0.0001 CCI 2 0.197 (0.015) < 0.0001 0.134 (0.013) < 0.0001 3 0.358 (0.018) < 0.0001 0.228 (0.016) < 0.0001 ≥4 0.579 (0.019) < 0.0001 0.348 (0.016) < 0.0001 Scale (σ) 0.649 (0.004) < 0.0001 0.543 (0.004) < 0.0001 Random Effects (τ ) 0.145 (0.008) < 0.0001 0.461 (0.011) < 0.0001 Correlation (ρ) 0.840 (0.008) < 0.0001 Note:CCI=Charlson Comorbidity Index. Reference group: Gender=male, Procedure=none, CCI=1. Table 4.6: Parameter estimates (SE) for log-normal margins with Gaussian copula LOS Cost Estimate (SE) p-value Estimate (SE) p-value Intercept 0.732 (0.019) < 0.0001 9.171 (0.019) < 0.0001 Age 0.008(0.0005) < 0.0001 0.003(0.0004) < 0.0001 Gender Female 0.080 (0.013) < 0.0001 0.014 (0.011) 0.205 Procedure CABG 1.257 (0.024) < 0.0001 1.966 (0.022) < 0.0001 PTCA 0.257 (0.020) < 0.0001 1.224 (0.018) < 0.0001 CATH 0.297 (0.022) < 0.0001 0.715 (0.019) < 0.0001 Other 0.354 (0.023) < 0.0001 0.666 (0.020) < 0.0001 CCI 2 0.202 (0.015) < 0.0001 0.144 (0.013) < 0.0001 3 0.361 (0.019) < 0.0001 0.240 (0.016) < 0.0001 ≥4 0.579 (0.019) < 0.0001 0.365 (0.016) < 0.0001 Scale (σ) 0.651 (0.004) < 0.0001 0.549 (0.004) < 0.0001 Random Effects (τ ) 0.122 (0.007) < 0.0001 0.467 (0.008) < 0.0001 Correlation (ρ) 0.599 (0.010) < 0.0001 θ 0.715 (0.005) < 0.0001 BIC 310979 Note:CCI=Charlson Comorbidity Index. Reference group: Gender=male, Procedure=none, CCI=1. 91 Table 4.7: Parameter estimates (SE) for log-normal margins with Clayton copula LOS Cost Estimate (SE) p-value Estimate (SE) p-value Intercept 0.768 (0.020) < 0.0001 9.098 (0.019) < 0.0001 Age 0.008(0.0005) < 0.0001 0.003(0.0004) < 0.0001 Gender Female 0.076 (0.013) < 0.0001 0.018 (0.011) 0.085 Procedure CABG 1.250 (0.026) < 0.0001 1.978 (0.022) < 0.0001 PTCA 0.180 (0.021) < 0.0001 1.240 (0.018) < 0.0001 CATH 0.305 (0.022) < 0.0001 0.734 (0.018) < 0.0001 Other 0.376 (0.023) < 0.0001 0.682 (0.019) < 0.0001 CCI 2 0.228 (0.016) < 0.0001 0.171 (0.013) < 0.0001 3 0.383 (0.019) < 0.0001 0.270 (0.015) < 0.0001 ≥4 0.628 (0.020) < 0.0001 0.395 (0.016) < 0.0001 Scale (σ) 0.696 (0.005) < 0.0001 0.565 (0.004) < 0.0001 Random Effects (τ ) 0.160 (0.008) < 0.0001 0.459 (0.010) < 0.0001 Correlation (ρ) 0.560 (0.012) < 0.0001 θ 1.387 (0.035) < 0.0001 BIC 314048 Note:CCI=Charlson Comorbidity Index. Reference group: Gender=male, Procedure=none, CCI=1. 92 4.6 Discussion In this chapter, we have introduced a joint bivariate copula random-effects model to describe the interplay of LOS and cost. The basic idea is that we start out with a specific latent response process (conditional on the random effects), connect those two equations through a copula with the association parameter. This approach flexibly models two-level correlations and at the same time account for hospital heterogeneity. • ui and vi , model intra-hospital correlation, which induces the ”within-hospital correlation”, i.e. they generate dependence between those individuals in the same cluster, whereas conditional on the random effects those individuals are independent. Cov(log Tij , log Tik ) = V ar(ui ) Cov(log Cij , log Cik ) = V ar(vi ) i = 1, 2, . . . , n; j, k = 1, 2, . . . , ni , j = k. • Correlation between ui and vi ρ describes the ”cross-equation correlation” at the hospital level. Cov(log Tij , log Cij ) = Cov(ui , vi ) + σ1 σ2 Cov( 1,ij , 2,ij ) Cov(log Tij , log Cik ) = Cov(ui , vi ), j = k • Association parameter in copula model θ describes ”cross-equation correlation” at the individual level due to Cov( 1,ij , 2,ij ). For example, patients with longer hospital 93 length of stay tend to incur more cost. Our proposed BCRE model overcomes some of the problems in existing models for mixed outcomes. Full maximum likelihood method is used to estimate parameters simultaneously, which is easily obtain in SAS Proc NLMIXED. Simulation results indicate that FML performs well. To illustrate our approach, we apply our model to 2003 NIS AMI patient data. Results obtained from our analysis are similar to previous results in Chapter 2 and 3. To accommodate model misspecification, our model can be generalized in several aspects. (a) In this chapter we consider only bivariate normal random effects. It can be easily extended to other bivariate distributions, eg. Gamma. Liu and Yu (2008) proposed a likelihood reformulation method which is much faster and can handle more complicated non-normal random-effects cases. (b) ’s are assumed to have normal margins, with some pre-specified copula. We may also apply logistic distribution for measurement errors, which might be more proper for LOS (Gardiner et al., 2002) and cost (Luo et al., 2007). With the easy implementation and satisfactory estimate results, our method greatly facilitates the application of joint random-effects models. We expect our estimation method to gain more popularity in practical data analysis of multiple measures of healthcare utilization. 94 Chapter 5 Conclusion and Future Research 5.1 Conclusion In this thesis, we consider the two closely related measures of healthcare resource utilization: hospital length of stay (LOS) and cost. The management of hospital LOS has become an important issue in cost containment and control. Determination of relevant factors and hidden structures could inform discharge planning and allocation of resources (Lanzarone et al., 2010; McDermott and Stock, 2007; Ramiarina et al., 2008). In Chapter 2, a Coxian phase-type (PH) stochastic regression is proposed to model LOS and account for the heavy skew and heterogeneity in the data. Coxian PH distributions describe the time to absorption T of an underlying finite-state continuous time Markov process with only forward transitions. Transitions between states are governed by the Markov assumptions. The actual states of the Markov process are not observable, i.e, we do not know the state from which a patient enters the system, or the state from which the patient exits. The Coxian PH model presented in this thesis retains the general form of the common mean structure E(T |x) = β0 + x β . 95 A Bayesian method based on RJMCMC was applied to dynamically select the number of phases. This method avoids arbitrary trimming and transformation of the data. In addition, the approach allows us to obtain the estimates of mean LOS, median and other percentiles, and class memberships. While a complete description of the underlying hidden states of the Markov process would depend on specific applications, it is easy for us to classify patients into different groups, for example, short, medium, and long LOS groups. Partial effects are derived directly from the posterior samples, instead of using any bootstrapping method. Shared random-effects models are introduced to jointly analyze LOS and cost in Chapter 3, which simultaneously assess the correlates of LOS and in-hospital cost, and provide important information for decisions on resource allocation. This model helps us ascertain the degree of correlation between LOS and cost, and to estimate the underlying mechanism for both processes. When there is a complex random-effects structure in the model, the corresponding likelihood may be computationally prohibitive. Liu and Yu (2008) propose a likelihood reformulation method for non-normal random-effects models, which substantially reduces computational time, while yielding similar estimates to the probability integral transformation (PIT) method (Nelson et al., 2006). In this model, we assume that the hospital cost may be correlated with LOS. In the application, we showed that there exists significant association between LOS and cost at the subject level, therefore separate models which disregard the correlations are not appropriate. In Chapter 4, we develop a novel bivariate copula random-effects (BCRE) model for the 96 analysis of LOS and cost with two-level correlations. log Tij = x1,ij β 1 + ui + σ1 1,ij (5.1) log Cij = x2,ij β 2 + vi + σ2 2,ij • ui and vi , 1 ≤ i ≤ n model intra-hospital correlation, which induces the ”withinhospital correlation” among patients for LOS and cost, respectively. • Correlation between ui and vi , 1 ≤ i ≤ n describes the ”cross-equation correlation” at the hospital level. • Association parameter θ in measurement errors ( 1,ij , 2,ij ), 1 ≤ i ≤ n, 1 ≤ j ≤ ni , describes ”cross-equation correlation” at the individual level. Full maximum likelihood (FML) is implemented using SAS Proc NLMIXED to derive the parameter estimates simultaneously. The estimation method yields satisfactory results as demonstrated by the simulation study. Our model is very comprehensive yet easy to fit. These advantages make it valuable in real data analysis. In the application to the 2003 NIS AMI patients, we showed that there are significant associations between LOS and cost at both the hospital level and the individual level. For comparison, we also fit the two measures with two reduced models, assuming that there are no random effects, and that the measurement errors are independent. Those reduced models are not appropriate due to ignoring either the hospital or the individual level correlations. Therefore our more comprehensive model should be preferred in such situations. 97 5.2 Future Research The strength of Coxian PH regression models for LOS lies in their flexibility in accommodating extreme values, while revealing hidden features such as short, long stays in hospitals. A natural extension of the proposed Coxian PH method is the modeling of censored duration data, as well as multiple discharge destinations. Olsson (1996) extends the EM algorithm for estimation of PH distributions for censored data. We further extend the models by incorporating covariates in the mean specification. In Chapter 3, an equivalent joint model can be proposed as follows: Yi1 |ai ∼ Coxian PH(λ0 , p , β ) CP log E(Yi1 |ai ) = β0 H + x1iβ CP H + ai (5.2) log(Yi2 |bi ) = x2iβ LN + bi    ai  where ai and bi are assumed correlated.   ∼ N (0, Σ), with Σ being a positive definite bi matrix. This equivalent form is more flexible in modeling random effects for two outcomes with different scales. In Chapter 4, we assumed that (1) measurement errors ( 1,ij , 2,ij ), 1 ≤ i ≤ n, 1 ≤ j ≤ ni have normal margins with some specified copula. (2) the hospital level random effects (ui , vi ), 1 ≤ i ≤ n are bivariate normally distributed. To accommodate possible model misspecification, our model can be generalized in several aspects. (a) Bivariate normal random effects can be easily extended to other bivariate distributions, eg. Gamma. Liu and Yu (2008) proposed a likelihood reformulation method which 98 is much faster in comparison to PIT method, and can handle more complicated nonnormal random-effects cases. (b) We may also apply logistic distribution for measurement errors, which might be more proper for LOS (Gardiner et al., 2002) and cost (Luo et al., 2007). Another attractive extension of the proposed methods in Chapter 3 and 4 is the joint modeling of LOS and repeated measures via shared random-effects models as wells as copulabased models. This model can be used to model hospital LOS and daily (monthly) cost jointly in heathcare utilization studies. More research on identifiability, estimation and inference of this model is needed. We apply our methods to hospital LOS and cost for patients with acute myocardial infarction (AMI) in the 2003 Nationwide Inpatient Sample (NIS). More applications can be found for other heathcare measures. Generalization to other related areas is of further interest. 99 APPENDICES 100 Appendix A Gaussian Quadrature Gaussian quadrature is often used to approximate the likelihood with no closed form. It approximates the integration by a weighted average of the integrand assessed at Q predetermined quadrature points cq (q = 1, 2, . . . , Q) over the normal random effects ui with density p(ui ). The likelihood ni f (cij , tij |xij , ui )p(ui )dui Li = j=1 can be approximated by Q ni Li ≈ f (cij , tij |xij , cq )p(cq )ωq q=1 j=1 with cq = √ √ 2 2zq and ωq = 2ηq exp(zq ), where ηq and zq can be can be obtained from tables (Abramowitz and Stegun, 2002). In the adaptive Gaussian quadrature, the integral is centered at the empirical Bayes estimate of ui , while it is centered at 0 in the nonadaptive Gaussian quadrature. Thus, the adaptive Gaussian quadrature provides a better 101 approximation for badly behaved integrands. Refer Littell et al. (2006) for details of Gaussian quadrature in SAS. 102 Appendix B Example SAS code for copula estimates The data format for a sample data set is given below. data gaussian; input groupid id x1 x2 los; cards; 1 1 −0.36473 1 2 −0.37871 −0.47626 14.563 573.40 1 3 0.02560 −0.19365 34.126 190.54 2 1 2 2 2 3 3 1 ··· ··· 0.43007 8.833 −0.08726 −0.10627 18.973 0.22927 0.05269 16.582 31.83 7.47 4.40 −0.23027 −0.00650 33.323 140.89 0.26273 −0.48568 ··· 2.724 8.55 ··· ; 103 B.1 SAS Code for Gaussian Copula proc nlmixed data=gaussian gconv=0 qpoints=5; parms b01 0.994 b11 -0.46 sigma1 1.308 b02 3.98 b12 0.35 sigma2 0.938 alpha 0.48 tau1 0.5 tau2 1 rho 0.5; bounds sigma1>0, sigma2>0, -1≤alpha≤1,tau1>0,tau2>0,-1≤rho≤1; mu1=b01+b11*x1+u1; mu2=b02+b12*x2+u2; p1=cdf(’lognormal’,y1,mu1,sigma1); if p1>.999999 then p1=.999999; F1=quantile(’normal’,p1); p2=cdf(’lognormal’,y2,mu2,sigma2); if p2>.999999 then p2=.999999; F2=quantile(’normal’,p2); logGaussian=-.5*log(1-alpha**2)-.5*(F1**2+F2**2-2*alpha*F1*F2)/(1-alpha**2) +logpdf(’lognormal’,y1,mu1,sigma1)+logpdf(’lognormal’,y2,mu2,sigma2)-(-F1**2/2-F2**2/2); model y1∼general(logGaussian); random u1 u2∼normal([0,0],[tau1**2,rho*tau1*tau2,tau2**2]) subject=group; run; 104 B.2 SAS Code for Clayton Copula proc nlmixed data=clayton gconv=0 qpoints=5; parms b01 1.05 b11 -0.48 sigma1 1 b02 4.07 b12 0.6 sigma2 2.1 alpha 0.86 tau1 0.8 tau2 1.15 rho 0.5; bounds sigma1>0, sigma2>0,tau1>0,tau2>0,-1≤rho≤1; mu1=b01+b11*x1+u1; mu2=b02+b12*x2+u2; p1=cdf(’lognormal’,y1,mu1,sigma1); if p1>.999999 then p1=.999999; p2=cdf(’lognormal’,y2,mu2,sigma2); if p2>.999999 then p2=.999999; logClayton=log(1+alpha)-(alpha+1)*(log(p1)+log(p2))-(1/alpha+2)*log(p1**(-alpha) +p2**(-alpha)-1)+logpdf(’lognormal’,y1,mu1,sigma1)+logpdf(’lognormal’,y2,mu2,sigma2); model y1∼general(logClayton); random u1 u2∼normal([0,0],[tau1**2,rho*tau1*tau2,tau2**2]) subject=group; run; B.3 SAS Code for Gumbel Copula proc nlmixed data=gumbel gconv=0 qpoints=5; parms b01 1.1 b11 -0.46 sigma1 1 b02 4.07 b12 0.6 sigma2 2.1 alpha 1.2 tau1 0.8 tau2 1.15 rho 0.5; bounds sigma1>0, sigma2>0,tau1>0,tau2>0,-1≤rho≤1; 105 mu1=b01+b11*x1+u1; mu2=b02+b12*x2+u2; p1=cdf(’lognormal’,y1,mu1,sigma1); if p1>.999999 then p1=.999999; logp1=-log(p1); p2=cdf(’lognormal’,y2,mu2,sigma2); if p2>.999999 then p2=.999999; logp2=-log(p2); logGumbel=-(logp1**alpha+logp2**alpha)**(1/alpha)-log(p1*p2)+(alpha-1)*log(logp1*logp2) -(2-1/alpha)*log(logp1**alpha+logp2**alpha) +log((logp1**alpha+logp2**alpha)**(1/alpha)+alpha-1) +logpdf(’lognormal’,y1,mu1,sigma1)+logpdf(’lognormal’,y2,mu2,sigma2); model y1∼general(logGumbel); random u1 u2∼normal([0,0],[tau1**2,rho*tau1*tau2,tau2**2]) subject=group; run; 106 BIBLIOGRAPHY 107 BIBLIOGRAPHY Aalen, O. O., 2002. Phase type distributions in survival analysis. Scandinavian Journal of Statistics, 22:447–463. Abramowitz, M. and I. A. Stegun, editors, 2002. Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables. Dover Publications. Allison, P. D., 2005. Fixed Effects Regression Methods for Longitudinal Data Using SAS. SAS Publishing. Asmussen, S., O. Nerman, and M. Olsson, 1996. Fitting phase-type distributions via the EM algorithm. Scandinavian Journal of Statistics, 23(4):419–441. Aus´ M. C. and H. F. Lopes, 2007. Bayesian estimation of ruin probabilities with a hetın, erogeneous and heavy-tailed insurance claim-size distribution. Australian & New Zealand Journal of Statistics, 49(4):415–434. Aus´ M. C., M. P. Wiper, and R. E. Lillo, 2008. Bayesian prediction of the transient ın, behaviour and busy period in short- and long-tailed GI/G/1 queueing systems. Comput. Stat. Data Anal., 52(3):1615–1635. Aus´ M. C., M. P. Wiper, and R. E. Lillo, 2009. Bayesian estimation of finite time ruin ın, probabilities. Applied Stochastic Models in Business and Industry, 25(6):787–805. Basu, A. and P. J. Rathouz, 2005. Estimating marginal and incremental effects on health outcomes using flexible link and variance function models. Biostatistics, 6(1):93–109. Bee, M., 2004. Modelling credit default swap spreads by means of normal mixtures and copulas. Applied Mathematical Finance, 11(2):125–146. Bitran, G. R. and S. Dasu, 1994. Analysis of 42:158–174. P Hi /P H/1 queues. Operations Research, Breymann, W., A. Dias, and P. Embrechts, 2003. Dependence structures for multivariate high-frequency data in finance. Quantitative Finance, 3(1):1469–7688. Buckley, J. and I. James, 1979. Linear regression with censored data. Biometrika, 66(3):429– 436. 108 Cameron, A. C., T. Li, P. K. Trivedi, and D. M. Zimmer, 2004. Modeling differences in counted outcomes using bivariate copula models: With application to mismeasured counts. Econometrics Journal, 7(2):566–584. Cap´ra`, P., A. L. Foug`res, and C. Genest, 2000. Bivariate distributions with given extreme e a e value attractor. Journal of Multivariate Analysis, 72(1):30–49. Charlson, M. E., P. Pompei, K. L. Ales, and C. R. MacKenzie, 1987. A new method of classifying prognostic comorbidity in longitudinal studies: Development and validation. Journal of Chronic Diseases, 40(5):373 – 383. Chen, X. and Y. Fan, 2002. Estimation of copula-based semiparametric time series models. Working Papers 0226, Department of Economics, Vanderbilt University. Chen, Z. and D. B. Dunson, 2003. Random effects selection in linear mixed models. Biometrics, 59(4):762–769. Cumani, A., 1982. On the canonical representation of homogeneous markov processes modelling failure-time distributions. Microelectron. Reliab., 22:583–602. Dellaportas, P. and J. J. Forster, 1999. Markov chain monte carlo model determination for hierarchical and graphical log-linear models. Biometrika, 86(3):615–633. Dempster., A. P., N. M. Laird, and D. B. Rubin, 1977. Maximum likelihood from incomplete data via the em algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1–38. Denison, D. G. T., B. K. Mallick, and A. F. M. Smith, 1998. Automatic bayesian curve fitting. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 60(2):333–350. Diebolt, J. and C. P. Robert, 1994. Estimation of finite mixture distributions through bayesian sampling. Journal of the Royal Statistical Society. Series B (Methodological), 56(2):363–375. Erlang, A. K., 1917. Solution of some problems in the theory of probabilities of significance in automatic telephone exchanges. Elektrotkeknikeren, 13:5. Etzioni, R. D., E. J. Feuer, S. D. Sullivan, D. Lin, C. Hu, and S. D. Ramsey, 1999. On the use of survival analysis techniques to estimate medical care costs. Journal of Health Economics, 18(3):365–380. Fackrell, M., 2009. Modelling healthcare systems with phase-type distributions. Health Care Management Science, 12(1):11–26. Faddy, M., N. Graves, and A. Pettitt, 2009. Modeling length of stay in hospital and other right skewed data: Comparison of phase-type, gamma and log-normal distributions. Value in Health, 12(2):309–314. 109 Faddy, M. J., 2002. Penalized maximum likelihood estimation of the parameters in a coxian phase-type distribution. In Latouche, G. and P. Taylor, editors, Matrix-analytic methods: theory and applications, pages 107–114. World scientific. Faddy, M. J. and S. I. McClean, 1999. Analysing data on lengths of stay of hospital patients using phase-type distributions. Applied Stochastic Models in Business and Industry, 15(4):311–317. Faddy, M. J. and S. I. McClean, 2005. Markov chain modelling for geriatric patient care. Methods of Information in Medicine, 44(3):369–373. Gardiner, J. C., Z. Luo, C. J. Bradley, E. Polverejan, M. Holmes-Rovner, and D. Rovner, 2002. Longitudinal assessment of cost in health care interventions. Health Services and Outcomes Research Methodology, 3:149–168. Genest, C. and L.-P. Rivest, 1993. Statistical inference procedures for bivariate archimedean copulas. Journal of the American Statistical Association, 88(423):1034–1043. Golub, G. and C. Van Loan, 1996. Matrix computations. The Johns Hopkins University Press, Baltimore, MD, 3rd edition. Green, P. J., 1995. Reversible jump markov chain monte carlo computation and bayesian model determination. Biometrika, 82(4):711–732. Gruttola, V. D. and X. M. Tu, 1994. Modelling progression of cd4-lymphocyte count and its relationship to survival time. Biometrics, 50(4):1003–1014. Guo, X. and B. P. Carlin, 2004. Separate and joint modeling of longitudinal and event time data using standard computer packages. The American Statistician, 58:16–24. Hallstrom, A. P. and S. D. Sullivan, 1998. On estimating costs for economic evaluation in failure time studies. Medical Care, 36(3):433–436. Henderson, R., P. Diggle, and A. Dobson, 2000. Joint modelling of longitudinal measurements and event time data. Biostatistics, 1(4):465–480. Horv´th, A. and M. Telek, 2000. Approximating heavy tailed behaviour with phase type a distributions. Leuven, Belgium. Hougaard, P., 1999. Fundamentals of survival data. Biometrics, 55(1):13–22. Hougaard, P., 2000. Analysis of multivariate survival data. Springer, New York. Hougaard, P., P. Myglegaard, and K. Borch-Johnsen, 1994. Heterogeneity models of disease susceptibility, with application to diabetic nephropathy. Biometrics, 50(4):1178–1188. Jin, Z., D. Y. Lin, L. J. Wei, and Z. Ying, 2003. Rank-based inference for the accelerated failure time model. Biometrika, 90(2):341–353. 110 Joe, H., 1997. Multivariate Models and Multivariate Dependence Concepts. Chapman and Hall, London, 1 edition. Joe, H. and J. J. Xu, 1996. The estimation method of inference functions for margins for multivariate models. Technical Report 166, Department of Statistics, University of British Columbia. Johnson, M. A., 1993. Selecting parameters of phase distributions: Combining nonlinear programming, heuristics, and erlang distributions. ORSA Journal on Computing, 5(1):69– 83. Johnson, V. E., 2004. A bayesian chi-squared test for goodness of fit. The Annals of Statistics, 32(6):2361–2384. Keiding, N., P. K. Andersen, and J. P. Klein, 1997. The role of frailty models and accelerated failure time models in describing heterogeneity due to omitted covariates. Statistics in Medicine, 16(2):215–224. Klugman, S. A. and R. Parsa, 1999. Fitting bivariate loss distributions with copulas. Insurance: Mathematics and Economics, 24(1-2):139 – 148. Kocherlakota, S. and K. Kocherlakota, 1992. Bivariate discrete distributions. Konetzka, R. T., W. Spector, and M. R. Limcangco, 2008. Reducing hospitalizations from long-term care settings. Medical Care Research and Review, 65(1):40–66. Lambert, P., D. Collett, A. Kimber, and R. Johnson, 2004. Parametric accelerated failure time models with random effects and an application to kidney transplant survival. Statistics in Medicine, 23(20):3177–3192. Lambert, P. and F. Vandenhende, 2002. A copula-based model for multivariate non-normal longitudinal data: analysis of a dose titration safety study on a new antidepressant. Statistics in Medicine, 21(21):3197–3217. Lanzarone, E., A. Matta, and G. Scaccabarozzi, 2010. A patient stochastic model to support human resource planning in home care. Production Planning & Control, 21(1):3–25. Li, J., 1999. An application of lifetime models in estimation of expected length of stay of patients in hospital with complexity and age adjustment. Statistics in Medicine, 18(23):3337– 3344. Lin, D. Y., 2000. Proportional Means Regression for Censored Medical Costs. Biometrics, 56(3):775–778. Lin, D. Y., E. J. Feuer, R. Etzioni, and Y. Wax, 1997. Estimating medical costs from incomplete follow-up data. Biometrics, 53(2):419–434. Littell, R. C., G. A. Milliken, W. W. Stroup, R. D. Wolfinger, and O. Schabenberber, 2006. SAS for Mixed Models. SAS Publishing, 2 edition. 111 Liu, L., 2004. Modeling Recurrent Events and Medical Cost Data in the Presence of a Correlated Terminating Event. PhD thesis, University of Michigan. Liu, L., 2009. Joint modeling longitudinal semi-continuous data and survival, with application to longitudinal medical cost data. Statistics in Medicine, 28(6):972–986. Liu, L. and X. Huang, 2009. Joint analysis of correlated repeated measures and recurrent events processes in the presence of death, with application to a study on acquired immune deficiency syndrome. Journal of the Royal Statistical Society: Series C (Applied Statistics), 58(1):65–81. Liu, L., R. A. Wolfe, and J. D. Kalbfleisch, 2007. A shared random effects model for censored medical costs and mortality. Statistics in Medicine, 26(1):139–155. Liu, L. and Z. Yu, 2008. A likelihood reformulation method in non-normal random effects models. Statistics in Medicine, 27(16):3105–3124. Luo, Z., J. Goddeeris, J. C. Gardiner, and R. C. Smith, 2007. Costs of an intervention for primary care patients with medically unexplained symptoms: A randomized controlled trial. Psychiatr Serv., 58(8):1079–1086. Marie, R., 1980. Calculating equilibrium probabilities for λ(n)/ck /1/n queues. In In Proceedings of the Performance, pages 117–125. Marshall, A. and S. McClean, 2004. Using coxian phase-type distributions to identify patient characteristics for duration of stay in hospital. Health Care Management Science, 7:285– 289. Marshall, A., S. McClean, C. Shapcott, and P. Millard, 2002. Modelling patient duration of stay to facilitate resource management of geriatric hospitals. Health Care Management Science, 5:313–319. Marshall, A., S. McClean, M. Shapcott1, and P. Millard, 2000. Learning dynamic bayesian belief networks using conditional phase-type distributions. In Principles of Data Mining and Knowledge Discovery, volume 1910 of Lecture Notes in Computer Science, pages 1–11. Springer Berlin / Heidelberg. Marshall, A., C. Vasilakis, and E. El-Darzi, 2005. Length of stay-based patient flow models: Recent developments and future directions. Health Care Management Science, 8(3):213– 220. Marshall, A. H. and S. I. McClean, 2003. Conditional phase-type distributions for modelling patient length of stay in hospital. International Transactions in Operational Research, 10(6):565–576. Marshall, A. H., B. Shaw, and S. I. McClean, 2007. Estimating the costs for a group of geriatric patients using the coxian phase-type distribution. Statistics in Medicine, 26(13):2716– 2729. 112 Matsui, K., L. Goldman, P. Johnson, K. Kuntz, E. Cook, and T. Lee, 1996. Comorbidity as a correlate of length of stay for hospitalized patients with acute chest pain. Journal of General Internal Medicine, 11(5):262–268. McDermott, C. and G. N. Stock, 2007. Hospital operations and length of stay performance. International Journal of Operations & Production Management, 27(9):1020–1042. McGrory, C., A. Pettitt, and M. Faddy, 2009. A fully bayesian approach to inference for coxian phase-type distributions with covariate dependent mean. Computational Statistics & Data Analysis, 53(12):4311 – 4321. Miller, D. J. and W. Liu, 2002. On the recovery of joint distributions from limited information. Journal of Econometrics, 107(1-2):259–274. Nelsen, R. B., 1999. An Introduction to Copulas. Springer, New York. Nelson, K. P., S. R. Lipsitz, G. M. Fitzmaurice, J. Ibrahim, M. Parzen, and R. Strawderman, 2006. Use of the probability integral transformation to fit nonlinear mixed-effects modelswith nonnormal random effects. Journal of Computational and Graphical Statistics, 15(1):39–57. Neuts, M. F., 1981. Matrix-geometric solutions in stochastic models: an algorithmic approach. Johns Hopkins University Press, Baltimore, MD. Olsson, M., 1996. Estimation of phase-type distributions from censored data. Scandinavian Journal of Statistics, 23(4):443–460. Osogami, T., 2005. Analysis of Multi-server Systems via Dimensionality Reduction of Markov Chains. PhD thesis, Carnegie Mellon University. Polverejan, E., J. C. Gardiner, C. J. Bradley, M. H. Rovner, and D. Rovner, 2003. Estimating mean hospital cost as a function of length of stay and patient characteristics. Health Economics, 12(1):935–947. Pompei, P., M. E. Charlson, K. Ales, C. R. Mackenzie, and M. Norton, 1991. Relating patient characteristics at the time of admission to outcomes of hospitalization,. Journal of Clinical Epidemiology, 44(10):1063 – 1069. Ramiarina, R., R. Almeida, and W. A. Pereira, 2008. Hospital costs estimation and prediction as a function of patient and admission characteristics. The International Journal of Health Planning and Management, 23(4):345–355. Ratcliffe, S. J., W. Guo, and T. R. T. Have, 2004. Joint modeling of longitudinal and survival data via a common frailty. Biometrics, 60(4):892–899. Richardson, S. and P. J. Green, 1997. On bayesian analysis of mixtures with an unknown number of components. Journal of the Royal Statistical Society: Series B (Methodological), 59(4):731–792. 113 Sauer, C. H. and K. M. Chandy, 1975. Approximate analysis of central server models. IBM Journal of Research and Development, 19:31–3. Sirbu, C. M., 2004. Assessing Medical Costs from a Longitudinal Model. PhD thesis, Michigan State University. Sisko, A., C. Truffer, S. Smith, S. Keehan, J. Cylus, J. A. Poisal, M. K. Clemens, and J. Lizonitz, 2009. Health spending projections through 2018: recession effects add uncertainty to the outlook. Health Affairs (Millwood), 28(2):w346–w357. Sklar, A., 1959. Fonctions de repartition ´ n dimensions et leurs marges. Publ. Inst. Statist. a Univ. Paris, pages 229–231. Skrondal, A. and S. Rabe-Hesketh, 2004. Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models. Chapman & Hall/CRC, Boca Raton, FL. Smith, M. D., 2003. Modelling sample selection using archimedean copulas. Econometrics Journal, 6:99–123. Tian, L. and T. Cai, 2006. On the accelerated failure time model for current status and interval censored data. Biometrika, 93(2):329–342. Trivedi, P. K. and D. M. Zimmer, 2005. Copula modeling: An introduction for practitioners. Foundations and Trends in Econometrics, 1(1):1–111. Tsiatis, A. A., 1990. Estimating regression parameters using linear rank tests for censored data. The Annals of Statistics, 18(1):354–372. Tsiatis, A. A. and M. Davidian, 2004. Joint modeling of longitudinal and time-to-event data: an overview. Statistica Sinica, 14:809–834. Tsiatis, A. A., V. DeGruttola, and M. S. Wulfsohn, 1995. Modeling the relationship of survival to longitudinal data measured with error. applications to survival and cd4 counts in patients with aids. Journal of the American Statistical Association, 90(429):27–37. Vonesh, E. F., tom Greene, and M. D. Schluchter, 2006. Shared parameter models for the joint analysis of longitudinal data and event times. Statistics in Medicine, 25:143–163. Wang, K., K. K. W. Yau, and A. H. Lee, 2002. A hierarchical poisson mixture regression model to analyse maternity length of hospital stay. Statistics in Medicine, 21(23):3639– 3654. Wei, L. J., Z. Ying, and D. Y. Lin, 1990. Linear regression analysis of censored survival data based on rank tests. Biometrika, 77(4):845–851. Wooldridge, J. M., 2002. Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Massachusetts. Wulfsohn, M. S. and A. A. Tsiatis, 1997. A joint model for survival and longitudinal data measured with error. Biometrics, 53(1):330–339. 114 Xiao, J., D. Douglas, A. H. Lee, and S. R. Vemuri, 1997. A delphi evaluation of the factors influencing length of stay in australian hospitals. The International Journal of Health Planning and Management, 12(3):207–218. Xu, J. and S. L. Zeger, 2001. Joint analysis of longitudinal data comprising repeated measures and times to events. Journal of the Royal Statistical Society. Series C (Applied Statistics), 50(3):375–387. 115