HOW A SUPPRESSOR VARIABLE AFFECTS THE ESTIMATION OF CAUSAL EFFECT: EXAMPLES OF CLASSICAL AND RECIPROCAL SUPPRESSIONS

By

Yun-Jia Lo

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Measurement and Quantitative Methods

2012

ABSTRACT

HOW A SUPPRESSOR VARIABLE AFFECTS THE ESTIMATION OF CAUSAL EFFECT: EXAMPLES OF CLASSICAL AND RECIPROCAL SUPPRESSIONS

By

Yun-Jia Lo

In educational research, a randomized controlled trial is the best design for eliminating potential selection bias in a sample and thereby supporting valid causal inferences, but it is not always possible because of financial, ethical, and logistical constraints. One alternative solution is the use of propensity score (PS) methods. However, the bias and variance of the estimated causal effect can depend strongly on which covariates are included in the PS model of assignment to treatment. This study uses two simulated examples to understand how including or excluding a classical or reciprocal suppressor, a variable that improves the R2 of the regression model, affects the estimation of the causal effect under regression, PS as a covariate, PS weighting, and PS matching methods. An additional condition is also tested in all methods: adding covariates, P's, that explain the variance of the outcome to different degrees, so that unconfoundedness is approximated at different levels. Findings indicate that, when no P is controlled, both classical and reciprocal suppressors increase the predictive power of the treatment effect and influence its estimation in the regression and PS methods alike. Although the impact of a suppressor varies across the types of models applied, sufficiently strong covariates P can eliminate that impact in all models. As stronger P's are applied, the estimated standard errors decline only in the regression models; in the PS models they remain quite consistent in the classical suppression example and increase slightly in the reciprocal suppression example.

Copyright by Yun-Jia Lo 2012

ACKNOWLEDGEMENTS

I would like to express my gratitude to all those who gave me the opportunity to complete this dissertation. I am deeply indebted to my advisor, Prof. Kenneth Frank, whose help, wise suggestions, and encouragement helped me shape the course I present in this dissertation. I particularly appreciate his patience and kindness with the many questions I raised during this process. He has always been an excellent model of a great researcher. I also want to express my thanks to my parents, Chien-Sheng Lo and Hsueh-Kuang Liu, and my dearest brother, Dr. Yung-Chung Lo. Their support and unconditional love make me who I am. My colleagues at the College of Education at MSU, Dr. Yisu Zhou, Dr. Min Sun, and Dr. Shu-Chuan Kao, have always been a great source of learning and have been great friends in my life. My friend Dr. Tianran Chen, from the Department of Mathematics at MSU, provided many suggestions about the simulations in this study and helped me check the programming. I very much appreciate his help. I also want to thank Daniel Schleh and Dr. Tiffeny Jimenez, who helped me correct English style and grammar and offered suggestions for improving this dissertation. I especially thank my committee members, Prof. Kimberly Maier, Prof. Spyros Konstantopoulos, and Prof. Geoffrey Booth.
They have shown me great insight from different areas and provided critical feedback to strengthen this dissertation. I cannot list all the people who have shown their support to me, but I give my sincere thanks to every one of them.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER 1 INTRODUCTION

CHAPTER 2 THEORETICAL FRAMEWORK
    Theoretical Approach for Causal Effect
    Randomized Experimental Design and Quasi-Experimental Design
    Estimating Causal Effects Using Propensity Score Methods
        PS as a covariate
        PS weighting
        PS matching
    Definitions and Types of Suppressions
        Classical suppression
            Classical suppressor variable vs. instrumental variable
        Negative suppression
        Reciprocal suppression
            Reciprocal suppressor variable vs. mediator variable

CHAPTER 3 METHODS
    Data Simulation
    Testing the Validity of Simulated Data Sets
    Estimating the Causal Effect by Regression and PS Analyses
        Regression
        PS methods
            PS as a covariate
            PS weighting
            PS matching

CHAPTER 4 EXAMPLE OF CLASSICAL SUPPRESSION
    Data
        Testing validity of simulation data sets
    Regression Models
    PS Methods
        PS as a covariate
        PS weighting
        PS matching
        Impact of a confounding variable

CHAPTER 5 EXAMPLE OF RECIPROCAL SUPPRESSION
    Data
        Testing validity of simulation data sets
    Regression Models
    PS Methods
        PS as a covariate
        PS weighting
        PS matching
        Impact of a confounding variable

CHAPTER 6 CONCLUSION AND DISCUSSION
    Summary of Findings
    Implication
    Limitations

APPENDICES
    Appendix A Simulation Program
    Appendix B A Glossary of Literary Terms

REFERENCES

LIST OF TABLES

Table 1. Classical Suppression Data Results
Table 2. Correlation Table for Simulated Variables – Classical Suppression Example
Table 3. Correlation Table for Simulated Variables and P's – Classical Suppression Example
Table 4. The Estimated Treatment Effects of Regression Models – Classical Suppression Example
Table 5. Correlation Table for Simulated Variables and Propensity Scores – Classical Suppression Example
Table 6. The Estimated Treatment Effects of Propensity Score as a Covariate Models – Classical Suppression Example
Table 7. Coefficients of P's in Regression Models and Coefficients of Propensity Scores in PS as a Covariate Models – Classical Suppression Example
Table 8.1. The Estimated Average Treatment Effects (ATE) of Propensity Score Weighting – Classical Suppression Example
Table 8.2. The Estimated Average Treatment Effects (ATE) of Propensity Score Weighting – Classical Suppression Example
Table 9.1. The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Weighting – Classical Suppression Example
Table 9.2. The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Weighting – Classical Suppression Example
Table 10.1. The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Matching – Classical Suppression Example
Table 10.2. The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Matching – Classical Suppression Example
Table 11. Impact of Suppressor, P's and the Propensity Scores on Treatment Indicator – Classical Suppression Example
Table 12. Reciprocal Suppression Data Results
Table 13. Correlation Table for Simulated Variables – Reciprocal Suppression Example
Table 14. Correlation Table for Simulated Variables and P's – Reciprocal Suppression Example
Table 15. The Estimated Treatment Effects of Regression Models – Reciprocal Suppression Example
Table 16. Correlation Table for Simulated Variables and Propensity Scores – Reciprocal Suppression Example
Table 17. The Estimated Treatment Effects of Propensity Score as a Covariate Models – Reciprocal Suppression Example
Table 18. Coefficients of P's in Regression Models and Coefficients of Propensity Scores in PS as a Covariate Models – Reciprocal Suppression Example
Table 19.1. The Estimated Average Treatment Effects (ATE) of Propensity Score Weighting – Reciprocal Suppression Example
Table 19.2. The Estimated Average Treatment Effects (ATE) of Propensity Score Weighting – Reciprocal Suppression Example
Table 20.1. The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Weighting – Reciprocal Suppression Example
Table 20.2. The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Weighting – Reciprocal Suppression Example
Table 21.1. The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Matching – Reciprocal Suppression Example
Table 21.2. The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Matching – Reciprocal Suppression Example
Table 22. Impact of Suppressor, P's and the Propensity Scores on Treatment Indicator – Reciprocal Suppression Example

LIST OF FIGURES

Figure 1. Classical Suppression
Figure 2. Negative Suppression
Figure 3. Reciprocal Suppression
Figure 4. Graphs of the function $P = R + C \times \sin(R)$ when C equals 1, 4, 12, and 16
Figure 5. Line Graphs of Estimations of Treatment Effect in Example of Classical Suppression
Figure 6. Line Graphs of Estimations of Standard Errors of Treatment Effect in Example of Classical Suppression
Figure 7. Line Graphs of T-ratios of Treatment Effect in Example of Classical Suppression
Figure 8. Line Graphs of Estimations of Treatment Effect in Example of Reciprocal Suppression
Figure 9. Line Graphs of Estimations of Standard Errors of Treatment Effect in Example of Reciprocal Suppression
Figure 10. Line Graphs of T-ratios of Treatment Effect in Example of Reciprocal Suppression

Chapter 1
INTRODUCTION

In educational research, many questions depend on understanding the causal effects of programs or policies, framed here by the Rubin Causal Model (Rubin, 1974). Although a randomized controlled trial (RCT) is the best design for eliminating potential selection bias in the sample and thereby allowing valid causal inferences, an RCT is not always possible in educational research because of financial, ethical, and logistical issues. One alternative method that can approximate randomized assignment and overcome potential selection bias is the propensity score (PS) method. The definitions of causal effect and causal inference are given in Appendix B.

Propensity score methods were introduced by Rosenbaum and Rubin (1983a) and have become one of the standard techniques for controlling confounding in nonexperimental studies. The PS is defined as the probability of receiving a treatment. In empirical studies, however, the true PS's are always difficult or impossible to obtain.
Instead, the PS is typically estimated as the predicted probability of receiving treatment from a logistic regression model that controls for a set of observed variables. PS methods then adjust for the estimated PS's in the models to reduce selection bias and estimate causal effects.

In observational studies, the data may contain a wide range of variables related to the dependent variable, the treatment indicator, or both, all of which are possible covariates for estimating PS's (see Appendix B for the definition of covariate). However, the bias and variance of the estimated causal effect can depend strongly on which covariates are included in the PS model. Therefore, it is important to understand how inclusion or exclusion of certain covariates in the PS model affects the estimation of the causal effect. Although Rosenbaum and Rubin (1983b) suggest selecting covariates that are independent of the treatment indicator in the PS model, so that the covariates cannot lead to biases, more recent research suggests selecting all variables related to the dependent variable, regardless of whether they are independent of the treatment indicator (Brookhart et al., 2006; Rubin & Thomas, 1996). Later work by Rubin (1997) and by Perkins, Tu, Underhill, Zhou, and Murray (2000) demonstrates that including variables that are strongly related to the treatment indicator but unrelated to the dependent variable can decrease the efficiency of an estimated causal effect; however, if such a variable has even a weak effect on the dependent variable, the bias resulting from its exclusion will dominate any loss of efficiency in a reasonably sized study (see Appendix B for the definition of efficiency). What remains unknown is whether this strategy also applies when deciding whether a suppressor variable should be included in a PS model as a covariate, especially because a suppressor may not reduce, but instead promote, the bias in estimating the treatment effect.

A suppressor variable is a predictor that, when included in the regression model, improves the squared multiple correlation (R2) both by directly predicting some of the variance in the dependent variable and by indirectly removing irrelevant variance from one or more of the other predictor variables. Although suppressor variables can appear useless as separate predictors of the dependent variable, and although their pure impacts are sometimes hard to interpret in the regression model, they may in fact change the predictive value of other variables by suppressing them, completely altering research results and improving the prediction of the dependent variable. It can therefore be advantageous to include suppressor variables in regression models (Lancaster, 1999) in order to reach statistically significant results.

For example, Oden and his colleagues (2000) sought to illustrate the long-term benefits associated with the Head Start Program, a foremost federally funded national provider of educational services to young children in poverty since 1965. In earlier studies, few benefits had been found statistically significant. In their study, however, they found that the Head Start group was slightly lower in socioeconomic status (SES) than the non-Head Start comparison group.
After adjusting for SES in the analysis, the direction and pattern of results suggested possible long-term benefits; for example, girls who had attended Head Start were significantly more likely to graduate from high school or earn a GED, and significantly less likely to have been arrested, than those in the non-Head Start comparison group. In this case, SES was suppressing the effect of the Head Start program: adjusting for it improved the prediction of the results and provided the rationale and theoretical explanation for the findings. Based on this finding, including the suppressor variable (SES) yielded more accurate estimates of the treatment effects. However, not every effect of a suppressor variable can be interpreted appropriately with theoretical support. When suppressor variables are included in models without supporting rationale and theory, accurate estimates are less likely, because the selection bias may not be removed, but instead promoted, by the suppressor variables.

Although researchers have demonstrated how suppressor variables affect the estimates in regression models, no research directly addresses how suppressor variables affect the estimation of causal effects when PS methods are used. In this dissertation, examples of classical and reciprocal suppression are constructed to demonstrate, separately, how a classical suppressor and a reciprocal suppressor acting on the treatment indicator affect the estimation of causal effects when included as one of the covariates in the PS models. Classical suppression is the most specialized definition of suppression, and reciprocal suppression is the most general one. I use classical suppression because, by the usual rule for selecting covariates in the PS model, one should exclude from the PS analysis a classical suppressor variable, which is uncorrelated or only slightly correlated with the dependent variable. Yet such a variable does increase the predictive validity in the regression model, and including it in regression models has been recommended in some studies. With the example of classical suppression, the difference in the estimated treatment effect between the regression models and the PS models can be easily detected. Reciprocal suppression cases are provided because they are more likely to occur in practical educational research settings and because they show how one suppressor suppresses the other. This example illustrates, in general terms, how a suppressor affects the estimation of the causal effect under PS methods. More specifically, the example is constructed so that, after the reciprocal suppressor variable is included, the estimated coefficient of the treatment indicator has the opposite sign to its correlation with the dependent variable. For both classical and reciprocal suppressors, the predictive validity of the treatment indicator increases when they are included in the regression models.

To reach these goals, this study provides examples of classical and reciprocal suppression by using an evolutionary algorithm to simulate 10 data sets for each example. The examples have to satisfy the corresponding constraints precisely. The purpose of the simulations is not to generalize to the population of suppressions, but to create specific cases of suppression. As a result, during the simulations I select only the simulated data sets that fit the given constraints precisely as specified in the examples. In each data set, 1,000 subjects are generated.
Moreover, to test whether the unconfoundedness assumption is fulfilled at different levels, and how that affects the estimates of the treatment effect, a set of covariates, P's, is derived from the residuals of a simple linear regression with the treatment indicator as the only predictor for each data set. The different P's remove selection bias with different effectiveness by explaining the variance in the outcome to different degrees: the more highly a P is correlated with the outcome, the more nearly unconfoundedness is fulfilled.

This dissertation addresses whether the predictive validity of a treatment indicator increases when the suppressor variable is included as a covariate under the PS methods, including PS as a covariate, PS weighting, and PS matching models, as it does in the regression models. I am also interested in how the estimates of the treatment effect differ among the PS and regression models, and whether different types of suppression lead to different results. Moreover, this study addresses how the estimates of the treatment effects vary across the models while controlling for the different covariates P, which are assumed to remove the selection bias effectively to different degrees. This dissertation is not only a model comparison, showing how the estimated causal effects differ under PS as a covariate, PS weighting, and PS matching methods and how the inferences differ between regression and PS models with different sets of covariates; more importantly, it attempts to generate a guideline for reaching a more accurate estimate of the causal effect when a suppressor variable is involved in the estimation process.

Chapter 2
THEORETICAL FRAMEWORK

Theoretical Approach for Causal Effect

In causal studies, the main question of interest is what would have happened if an individual exposed to one treatment condition had been exposed to a different one. As Rubin (1974) defined it, a causal effect is the difference between what would have happened to the individual in one treatment group and what would have happened if he or she had instead been exposed to the control group. Although this definition provides a clear theoretical formulation of what a causal effect is, it cannot be tested empirically, because we cannot observe what happens to an individual in the treatment condition and in the control condition at the same time. This is referred to as the fundamental problem of causal inference (Holland, 1986).

Holland (1986) identified two general approaches to solving this problem based on Rubin's model: the scientific solution and the statistical solution. In the scientific solution, two assumptions are made. The first is temporal stability, the assumption that the response stays constant over time. The second is causal transience, which means that the effect of a prior treatment is transient and does not affect what happens to an individual under a later treatment. Under these two assumptions, for example, an individual can be in the control group at time one and in the treatment group at time two, and the causal effect is the difference between his or her outcome at time one and at time two. However, it is difficult to maintain these two assumptions when implementing a scientific solution in educational studies.
For example, when testing the effects of two different curricula on students' achievement, it is hard to assume that the effect of one curriculum at time one does not affect that of the other curriculum at time two. Also, maturation over time may violate the assumption of temporal stability.

In the statistical solution, a weaker assumption, unit homogeneity, is made to resolve these issues. The unit homogeneity assumption implies that if units have identical values in all relevant respects, they can be expected to have identical outcome values as well. As a result, when an individual in the treatment group has the same values in all relevant respects as another individual in the control group, the causal effect can be taken as the difference between their outcomes in the treatment group and in the control group. In empirical studies, however, it is difficult to establish whether the unit homogeneity assumption has been achieved. A still weaker assumption, unconfoundedness, introduced by Rosenbaum and Rubin (1983b), is the most widely used assumption across the various methods for estimating causal effects in observational studies. It requires that, conditional on the observed covariates, there be no unobserved characteristics associated with both the treatment and the outcome. Consequently, when designing studies to test causal effects, researchers need to collect all possible variables in order to make the unconfoundedness assumption more likely to be fulfilled.

Randomized Experimental Design and Quasi-Experimental Design

Randomized experimental design is based on the randomized controlled trial (RCT). An RCT provides the most reliable form of scientific evidence because it reduces sample bias, especially with a large sample size. Under an RCT, units are randomly assigned to either the control or the treatment group, so that group membership is not confounded with subject characteristics; the resulting comparability of the groups corresponds to homogeneity, and the unit homogeneity assumption can be achieved. This kind of research design ensures that the control and treatment groups are statistically equivalent for a population of individuals, especially in a large sample. The estimated treatment effect is the average difference in outcomes between the treatment and control groups. However, randomized controlled trials are not feasible in some educational settings. First, an RCT, like other large-sample designs, is expensive. Second, the ethical and logistical issues are hard to overcome, especially in educational situations. These limitations drive researchers to use quasi-experimental designs.

Quasi-experimental designs are used when random assignment is impossible or impractical. Under a quasi-experimental design, the control and treatment groups may not be statistically equivalent because of selection bias and confounding. As a result, the unit homogeneity assumption is violated, and the estimate of the causal effect is therefore biased and misleading. In order to strengthen the unit homogeneity assumption and to achieve the unconfoundedness assumption, a set of variables, especially those related to the outcome, needs to be collected and controlled as covariates in the statistical methods, so as to estimate causal effects that are theoretically unbiased, that is, no different from the true effects. A number of approaches can be used to strengthen the unit homogeneity assumption under quasi-experimental designs.
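To fix notation, the potential-outcomes formalism behind these assumptions can be written compactly. This is a standard formalization supplied here for reference; the symbols $Y(0)$, $Y(1)$, and $e(X)$ are conventional notation rather than equations taken from the original text:

```latex
% Unit-level and average causal effects under the Rubin Causal Model
\tau_i = Y_i(1) - Y_i(0), \qquad
\mathrm{ATE} = \mathbb{E}\left[ Y(1) - Y(0) \right]

% Unconfoundedness: conditional on the observed covariates X,
% treatment assignment Z is independent of the potential outcomes
\bigl( Y(0),\, Y(1) \bigr) \perp\!\!\!\perp Z \mid X

% Rosenbaum and Rubin (1983a): the same independence holds conditional
% on the propensity score alone
e(X) = \Pr(Z = 1 \mid X), \qquad
\bigl( Y(0),\, Y(1) \bigr) \perp\!\!\!\perp Z \mid e(X)
```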
PS methods have become one of the popular standard techniques for addressing the disadvantages of quasi-experimental designs by adopting the unconfoundedness assumption. Three types of PS methods, PS as a covariate, PS weighting, and PS matching, are explained and applied in this dissertation.

Estimating Causal Effects Using Propensity Score Methods

PS analysis was first introduced by Rosenbaum and Rubin (1983a). The PS is often unobserved but can be estimated as the predicted probability of receiving treatment for an individual conditional on all the observed covariates. The PS for an individual, defined as the conditional probability of being treated given the covariates, has been used to reduce bias in observational studies. In theory, individuals with similar PS's in the treatment group can be compared to those in the control group. The idea is that people with similar PS's are likely to have similar characteristics and motivations regarding the treatment condition, so researchers can assume that they are likely to behave in similar ways under the same conditions.

Generally, the actual PS's are not known in social science studies. When the PS's are unknown, estimated PS's can be computed using logistic regression when there are two treatment conditions (i.e., treatment vs. control). A logistic regression model is fit with a large number of covariates as predictors and the treatment indicator, with values of zero or one, as the dependent variable. The predicted values from the logistic regression model are the PS's for the individuals. An important assumption underlies PS analysis: unconfoundedness, which requires that, after conditioning on the observed covariates, there are no unobserved variables associated with both the treatment assignment and the dependent variable. That is, at the same value of the PS, the covariates are independent of the treatment indicator, so an unbiased estimate of the treatment effect can be obtained. To estimate the treatment effects on the dependent variables, methods incorporating the estimated PS's, such as PS as a covariate, PS weighting, and PS matching, can be applied; they are introduced in the following sections.

PS as a covariate. There are several different methods of using PS's to estimate treatment effects. The first method introduced here, PS as a covariate, directly uses the PS as the only covariate in the regression model together with the treatment indicator (Heckman & Robb, 1986). With a continuous dependent variable, the estimated coefficient of the treatment indicator is simply the average of the differences in the predicted values of the dependent variable for the treatment and control groups; that is the impact of the treatment. When the dependent values of the treatment and control groups are parallel, using PS as a covariate can reduce the bias of the estimated treatment effect (Roseman, 1994). Another way to apply the PS's in a regression model as a covariate is to control not only for the PS but also for a subset of the covariates used to estimate the PS. This approach may allow the diagnostic checks on the fit of the model to be more reliable than using all the covariates in the model. There are some limitations to this method. If the dependent values of the treatment and control groups are nonlinear or nonparallel, the treatment effects can be estimated incorrectly (Rubin, 1979).
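As a concrete illustration, the following minimal sketch estimates the PS's by logistic regression and then enters the estimated PS as the sole covariate next to the treatment indicator. The synthetic data, the variable names (y, z, x1 to x3), and the use of statsmodels are illustrative assumptions; this is not the dissertation's own program (Appendix A):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical data: three covariates, a treatment whose assignment
# depends on x1, and an outcome with a true treatment effect of 2.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame(rng.normal(size=(n, 3)), columns=["x1", "x2", "x3"])
df["z"] = rng.binomial(1, 1 / (1 + np.exp(-df["x1"])))
df["y"] = 2 * df["z"] + df["x1"] + rng.normal(size=n)

# Step 1: logistic regression of Z on the observed covariates; the fitted
# probabilities are the estimated propensity scores.
ps_model = sm.Logit(df["z"], sm.add_constant(df[["x1", "x2", "x3"]])).fit(disp=0)
df["ps"] = ps_model.predict()

# Step 2: regress Y on the treatment indicator with the PS as the only
# covariate; the coefficient of z estimates the treatment effect.
outcome = sm.OLS(df["y"], sm.add_constant(df[["z", "ps"]])).fit()
print(outcome.params["z"], outcome.bse["z"])
```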
Moreover, although this method is simple to use, it is not much more efficient than a multiple regression model adjusting for all observed covariates: the estimated treatment effects from the PS covariance adjustment model and from a multiple regression model should be the same whenever the same sample covariance matrix is used for both the covariance adjustment and the discriminant analysis (Rosenbaum & Rubin, 1983b). Thus, comparing the estimated treatment effects of PS as a covariate models with those of multiple regression models, with and without a suppressor variable, should in turn provide precise information for detecting the influence of a suppressor variable. Because of these limitations, researchers sometimes use this method as an additional adjustment under a randomized experimental design.

Morrow and her colleagues (2010) evaluated the Starting Early Starting Smart (SESS) national initiative to integrate behavioral health services into the pediatric health care setting for families with young children, using longitudinal data collected from five pediatric care sites. In their study, although families were randomly assigned to either the SESS program or a standard care comparison group, 10 of 34 baseline variables were not equivalent between the SESS intervention and comparison groups, including child gender, child race, primary language, household size, family substance use history, family mental health history, family criminal justice history, caregiver psychological distress (BSI total score), total family service utilization, and perceived service barriers. To adjust for group nonequivalence, these 10 variables were used in a logistic regression model to predict each child's PS for being in the SESS program, and the PS was retained as a covariate in the primary outcome analyses. Their results demonstrated the success of the SESS program in coordinating and improving access to behavioral health services for high-risk caregivers within the pediatric health care setting, meeting the behavioral health care needs of families with young children.

PS weighting. The PS weighting method uses the PS's to generate sampling weights, which are then applied in the causal model (Lunceford & Davidian, 2004; Rubin, 1997, 2001). Two different types of weights can be generated from the PS's, depending on whether the average treatment effect (ATE) or the average treatment effect on the treated (ATT) is desired. For estimating the ATE, the weights are defined as $1/\widehat{PS}$ for the treatment group ($Z = 1$) and $1/(1-\widehat{PS})$ for the control group ($Z = 0$), where $\widehat{PS}$ is the estimated PS. For estimating the ATT, the weights are defined as 1 for the treatment group and $\widehat{PS}/(1-\widehat{PS})$ for the control group. After weighting, individuals who are statistically more likely to receive the treatment but are actually in the control group, and those who are statistically more likely to be in the control group but actually receive the treatment, gain more weight in the analysis. Through weighting, the sample becomes more representative of the population of interest. This method may artificially inflate the sample size by creating a pseudo-population, a problem that can be solved by using standardized weights instead. Also, for individuals with PS's close to zero or one, the weights can be very large.
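Continuing the same hypothetical data frame from the previous sketch, the ATE and ATT weights just defined can be computed directly; weighted least squares is one simple way to apply them. This is a minimal sketch, not the dissertation's own analysis code:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Rebuild the hypothetical data frame from the PS-as-a-covariate sketch.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame(rng.normal(size=(n, 3)), columns=["x1", "x2", "x3"])
df["z"] = rng.binomial(1, 1 / (1 + np.exp(-df["x1"])))
df["y"] = 2 * df["z"] + df["x1"] + rng.normal(size=n)
df["ps"] = sm.Logit(df["z"], sm.add_constant(df[["x1", "x2", "x3"]])).fit(disp=0).predict()

def ps_weights(z, ps, estimand="ATE"):
    """Inverse-propensity weights; z is 0/1, ps the estimated PS."""
    z = np.asarray(z, dtype=float)
    ps = np.asarray(ps, dtype=float)
    if estimand == "ATE":
        return z / ps + (1 - z) / (1 - ps)      # 1/PS treated, 1/(1-PS) control
    return z + (1 - z) * ps / (1 - ps)          # ATT: 1 treated, PS/(1-PS) control

w_ate = ps_weights(df["z"], df["ps"], "ATE")
# Weighted regression of Y on Z gives the weighted treatment-effect estimate.
wls = sm.WLS(df["y"], sm.add_constant(df[["z"]]), weights=w_ate).fit()
print(wls.params["z"])
# PS's near 0 or 1 produce very large weights, the instability noted below.
print(w_ate.max())
```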
As a result, the estimated treatment effects are easily influenced by those high-variance individuals (Rubin, 2001; Kang & Schafer, 2007; Schafer & Kang, 2008), which may lead to biased estimates. Two possible solutions to this problem are to improve the specification of the propensity score model and to diminish the values of the extreme weights (Potter, 1993; Scharfstein, Rotnitzky, & Robins, 1999).

Frank and his colleagues (2008) used the PS weighting method to test the effect of National Board for Professional Teaching Standards (NBPTS) certification on the number of colleagues a teacher helps with instructional matters. Data were collected from teachers in 47 elementary schools in two states. In their study, propensity scores for whether the teacher became NBPTS-certified were estimated by a logistic regression model with multiple covariates related to NBPTS certification. Weights for estimating both the ATE and the ATT were then constructed for the individuals. After the weights generated from the PS's were applied to the sample, there were no statistically significant differences between NBPTS-certified and non-NBPTS-certified teachers on the covariates, indicating that the PS weighting method achieved balance between the two groups. Using PS weighting in the outcome analyses, they found that NBPTS-certified teachers helped significantly more colleagues with instructional matters than non-NBPTS-certified teachers, in both the ATE and the ATT models.

PS matching. The PS matching method pairs individuals in the treatment group with individuals in the control group whose PS's are as close as possible. A new sample of pairs with approximately similar probabilities of assignment to the treatment group is thereby created to reduce selection bias. The overall treatment effect is estimated as the average of the differences in outcomes within all pairs. When matching, one must decide what number of matches is acceptable, because not all individuals can be matched on similar PS's; some individuals may be lost, which reduces the sample size and power. Variance can be decreased by a larger sample size; however, matching individuals on distant PS's increases bias. Therefore, balancing bias and variance is an important concern in the PS matching method. Fortunately, different matching schemes have been widely studied in theory and practice (Abadie & Imbens, 2006; Gu & Rosenbaum, 1993; Rosenbaum, 1989, 1995, 2002; Rosenbaum & Rubin, 1985) to address this problem, and different matching methods can be used in different settings to balance bias and variance. Two types of matching methods, greedy matching and optimal matching, are often used. Greedy matching matches the pairs with the closest PS's, and once the matching decisions have been made, they are never changed. In optimal matching, by contrast, the algorithm reconsiders and revises earlier matching decisions during the later matching process to achieve an optimal overall match; an optimal matching algorithm therefore often takes more time than a greedy one. Rosenbaum and Rubin (1985) noted that nearest neighbor available matching on the estimated propensity score, a greedy method, is the easiest technique in terms of computational considerations. In this study, the nearest neighbor matching method is applied.
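The following minimal sketch implements greedy nearest neighbor matching on the estimated PS's, again on the hypothetical data frame from the earlier sketches; the caliper option anticipates the variant discussed next. It is an illustration under those assumptions, not the dissertation's matching software:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Rebuild the hypothetical data frame from the PS-as-a-covariate sketch.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame(rng.normal(size=(n, 3)), columns=["x1", "x2", "x3"])
df["z"] = rng.binomial(1, 1 / (1 + np.exp(-df["x1"])))
df["y"] = 2 * df["z"] + df["x1"] + rng.normal(size=n)
df["ps"] = sm.Logit(df["z"], sm.add_constant(df[["x1", "x2", "x3"]])).fit(disp=0).predict()

def greedy_nn_match(ps, z, caliper=None):
    """Match each treated unit to the closest unmatched control unit.

    Returns a list of (treated_index, control_index) pairs. If a caliper
    is given, pairs whose PS difference exceeds it are discarded.
    """
    treated = np.flatnonzero(z == 1)
    controls = list(np.flatnonzero(z == 0))
    pairs = []
    for t in treated:
        if not controls:
            break
        # Closest remaining control by absolute PS difference.
        c = min(controls, key=lambda j: abs(ps[j] - ps[t]))
        if caliper is None or abs(ps[c] - ps[t]) <= caliper:
            pairs.append((t, c))
            controls.remove(c)   # greedy: the decision is never revisited
    return pairs

pairs = greedy_nn_match(df["ps"].to_numpy(), df["z"].to_numpy(), caliper=0.05)
# ATT estimate: average outcome difference within the matched pairs.
y = df["y"].to_numpy()
att = np.mean([y[t] - y[c] for t, c in pairs])
print(len(pairs), att)
```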
Moreover, nearest neighbor matching within a caliper is also used, to overcome the problem of inaccurate matching, and the resulting bias, when the absolute difference between PS's is too large.

Henry, Gordon, and Rickman (2006) conducted a study comparing the quality and outcomes of two early education policies, federal Head Start programs and state-subsidized prekindergarten programs, using PS matching techniques. They matched 4-year-old participants in the Head Start program in Georgia, by their PS's, to children who were eligible for Head Start but attended the state prekindergarten program in Georgia. The propensity scores were computed from a logit model with multiple covariates, including characteristics of the child (e.g., sex, race, age), the family (e.g., parents' education, marital status), the school (e.g., sex and race composition of the class), and the county of residence (e.g., race and income distributions). After matching, there were no statistical differences between the two groups at the beginning of their preschool year on measures of oral and written language, letter-word skills, and applied problems. By the beginning of kindergarten, however, the children attending the state prekindergarten program posted higher developmental outcomes.

Definitions and Types of Suppressions

Three types of suppression, classical suppression, negative suppression, and reciprocal suppression, are introduced in this study. Classical suppression is the strictest definition of suppression, and reciprocal suppression is the most general one, which subsumes negative suppression.

Classical suppression. The concept of suppression is important but elusive. The phenomenon was first introduced by Horst (1941), who defined a suppressor variable as a predictor that has a zero or near-zero (bivariate) correlation with the dependent variable while paradoxically still contributing predictive validity in the regression model. This implies that a suppressor variable (1) is uncorrelated or only slightly correlated with the dependent variable, (2) is correlated with the other predictors (which it suppresses), and (3) increases $R^2$, the explained variance of the dependent variable. This was labeled "classical suppression," also called "traditional suppression" by Conger (1974). In practice, variables seldom have a zero or near-zero correlation with the dependent variable; therefore, variables with very small correlations with the dependent variable can also be considered classical suppressor variables (Cohen & Cohen, 1975).

Generally, the usefulness of a given predictor can be detected by testing its impact on explaining the variance in the dependent variable. The problem with suppressor variables, however, is that the pure impact of a predictor on the dependent variable is revealed not by its correlation but by its estimated coefficient in a regression model that includes the suppressor variables. If covariates for a regression model are selected only according to whether the variables have some correlation with the dependent variable, classical suppressor variables can easily be disregarded, and without the suppressor variables the overall predictive validity can be underestimated. An example of classical suppression in an empirical study was provided by Martz (2003).
In that study, paid-work experience acted as a suppressor variable in a model with several psychological and demographic independent variables predicting employment among community college students with disabilities. Although the correlation of paid-work experience with employment was not significant, when paid-work experience was included in the model with the other independent variables, the $R^2$ tripled compared with the model excluding it. Paid-work experience acted as a suppressor variable so that more variance was explained by the other independent variables. Figure 1 is a Venn diagram that graphically illustrates the operation of a classical suppression case.

Classical suppressor variable vs. instrumental variable. Although a classical suppressor variable shares characteristics with an instrumental variable, being uncorrelated or only slightly correlated with the outcome variable and correlated with another predictor in the model, the two can still be distinguished by their basic definitions and rationale. The purpose of using an instrumental variable is to solve the problem of an endogenous predictor, one that is correlated with the error in the regression model or with unobserved confounding variables, also known as omitted variables. Including an instrumental variable in the model can remove the bias caused by endogenous variables correlated with the residuals of the model and by omitted variables, the unobserved variables on which the outcome is conditional (see Appendix B). By definition, an instrumental variable should be uncorrelated with the error of the regression model that has the endogenous variable as a predictor. The purpose of including a classical suppressor variable, in contrast, is not defined in terms of removing bias, and a classical suppressor variable need not be uncorrelated with the error in the regression model. The main criterion defining a suppressor is that the predictive validity ($R^2$) increases when the suppressor is included in the model, which need not hold for an instrumental variable. As a result, when a classical suppressor variable removes the bias from endogenous or omitted variables, it can be viewed as a special case of an instrumental variable; when an instrumental variable increases the predictive validity, it can be viewed as a special case of a classical suppressor.

[Figure 1. Classical Suppression: Venn diagram of Y, X, and Z illustrating the conditions $r_{yx} = 0$, $r_{zx} \neq 0$, $r_{yz} > 0$, $\beta_x \neq 0$, and $R^2 > 0$.]

Y is the dependent variable, X is the suppressor variable, and Z is the predictor variable suppressed by X. $r_{yx}$ is the bivariate correlation between Y and X, $r_{zx}$ is the bivariate correlation between Z and X, and $r_{yz}$ is the bivariate correlation between Y and Z. $\beta_x$ is the standardized coefficient of X in the regression model with both X and Z as predictors. $R^2$ is the variance explained in the regression model with both X and Z as predictors and Y as the outcome. For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation.

Negative suppression. The issue of suppression was not widely recognized until a more general definition of a suppressor variable was provided by Lubin (1957) and Darlington (1968).
Darlington defined suppression as occurring when all predictor variables have positive pairwise correlations with each other and with the dependent variable, but the suppressor variable receives a negative estimated coefficient in the regression model. This condition was extended to include situations in which the correlation between some of the predictor variables is negative (Conger, 1974). This kind of suppression was labeled "negative suppression" by Conger (1974) and "net suppression" by Cohen and Cohen (1975): not only does the sign of the suppressor's correlation with the dependent variable differ from that of its estimated coefficient, but the predictive power of the regression model increases through the removal of irrelevant variance in the other predictor variables. A negative suppressor variable is therefore a predictor that (1) has a positive correlation with the dependent variable, (2) is correlated, whether positively or negatively, with the other predictor, (3) has a negative estimated coefficient in the regression model, and (4) still increases $R^2$, the explained variance of the dependent variable. The difficulty with a negative suppressor variable is that its effect can be hard to notice and to interpret because of the inconsistency between the signs of its correlation and its coefficient.

Walker (2003) found that level of education attained acted as a negative suppressor variable when predicting administrators' salaries at both public and private institutions together with other independent variables. In this case, level of education attained had a small but positive correlation with administrators' salaries; however, when it was included in the multiple regression model, its coefficient not only became statistically significant but was also negative. The explained variance of administrators' salaries also increased significantly in the model with level of education attained compared with the model without it. Figure 2 is a Venn diagram that graphically illustrates the operation of a negative suppression case.

[Figure 2. Negative Suppression: Venn diagram of Y, X, and Z illustrating the conditions $r_{yx} > 0$, $r_{zx} > 0$, $r_{yz} > 0$, $\beta_x < 0$, $\beta_z > r_{yz}$, and $R^2 > r_{yz}^2$.]

Y is the dependent variable, X is the suppressor variable, and Z is the predictor variable suppressed by X. $r_{yx}$ is the bivariate correlation between Y and X, $r_{zx}$ is the bivariate correlation between Z and X, and $r_{yz}$ is the bivariate correlation between Y and Z. $\beta_x$ and $\beta_z$ are the standardized coefficients of X and Z, respectively, in the regression model with both X and Z as predictors. $R^2$ is the variance explained in the regression model with both X and Z as predictors and Y as the outcome.

Reciprocal suppression. It is commonly known that the estimated coefficients of predictor variables in a regression model can change when other variables are included. With the addition of a new predictor variable, the estimated coefficients of the original variables may all change, some of them substantially. Therefore, a suppressor variable is not uniquely defined by its own estimated coefficient but rather generically through its impact on the coefficients of all the other predictor variables (Conger, 1974), in this study especially the treatment indicator. By taking this contextual idea into account and subsuming all previous typologies, an even more general definition of a suppressor was formulated: reciprocal suppression (Cohen & Cohen, 1975; Conger, 1974; Lutz, 1983).
Reciprocal suppression occurs when two predictor variables mutually suppress irrelevant variance in each other (Lutz, 1983). Their real effects on the dependent variable are suppressed by each other and can be larger than, or even of opposite sign to, their correlations with the dependent variable. Under this definition, any variable in the regression model can be both a predictor and a suppressor (Lord & Novick, 1974). Reciprocal suppression can be detected when the $R^2$ of the regression model with the two predictor variables is larger than the sum of their squared correlations with the dependent variable (Matthews & Martin, 1992). The correlation of a reciprocal suppressor variable with a suppressed variable may be high and even statistically significant. Although two independent variables may be highly correlated, and the estimated coefficients may change dramatically, under both reciprocal suppression and multicollinearity, the predictive validity of the model may not increase under multicollinearity as it does under reciprocal suppression. Also, theoretically, multicollinearity arises when two predictors measure the same thing, which is not why suppression arises.

Paulhus, Robins, Trzesniewski, and Tracy (2004) found an example of reciprocal suppression. In their study, the variables shame and guilt were used to predict aggression in undergraduate students. Although both shame and guilt had positive correlations with aggression, the effect of shame on aggression increased, while guilt, once included in the regression model, received a negative coefficient. $R^2$ also increased dramatically when guilt was added to the model. Figure 3 is a Venn diagram that graphically illustrates the operation of a reciprocal suppression case.

[Figure 3. Reciprocal Suppression: Venn diagram of Y, X, and Z illustrating the conditions $|r_{yx}| > 0$, $|r_{zx}| > 0$, $|r_{yz}| > 0$, $|\beta_x| > |r_{yx}|$, $|\beta_z| > |r_{yz}|$, and $R^2 > r_{yz}^2 + r_{yx}^2$.]

Y is the dependent variable, X is the suppressor variable, and Z is the predictor variable suppressed by X. $r_{yx}$ is the bivariate correlation between Y and X, $r_{zx}$ is the bivariate correlation between Z and X, and $r_{yz}$ is the bivariate correlation between Y and Z. $\beta_x$ and $\beta_z$ are the standardized coefficients of X and Z, respectively, in the regression model with both X and Z as predictors. $R^2$ is the variance explained in the regression model with both X and Z as predictors and Y as the outcome.

Reciprocal suppressor variable vs. mediator variable. In causal studies, a mediator variable is used to explain the mediational mechanism underlying a causal relationship between an independent variable and a dependent variable. Rather than a direct causal relationship between the independent and dependent variables, the mediator is caused by the independent variable and in turn causes the dependent variable, revealing the underlying mechanism or process. In a mediation model, the estimated causal effect of the independent variable on the dependent variable is dispersed through the pathway of the mediator variable. Much like a mediator variable, a reciprocal suppressor variable may be related to both the independent and the dependent variable, but it need not have causal relationships with them. A reciprocal suppressor is not defined as exploring the underlying relationship between the dependent variable and the independent variable.
Moreover, the inclusion of a mediator variable always yields a smaller estimated causal effect in the mediation model than the direct causal effect between the independent and dependent variables without the mediator. Unlike the mediational effect of a mediator variable, which may decrease the predictive validity of the independent variable for the dependent variable, a reciprocal suppressor increases the predictive validity.

Chapter 3
METHODS

Data Simulation

In this study, two examples, classical suppression and reciprocal suppression, are produced through simulation. The goal of the simulations is to provide specific cases of suppression, not to simulate the population of suppressions for general inference. Accordingly, particular constraints are used to construct the examples. In each example, 10 simulated data sets are retained, selected only when they satisfy the constraints precisely. Because the data sets do not have to represent the population of classical or reciprocal suppressions, but only to exhibit specific cases of them, 10 precise data sets per example are adequate for explaining how the estimates of causal effects are affected by suppression. In each data set, 1,000 subjects are simulated: the outcome Y, the treatment indicator Z, and the suppressor X are generated, with Z = 1 for 500 subjects in the treatment group and Z = 0 for the other 500 subjects in the control group.

The data sets for the example of classical suppression have to satisfy the following conditions, which reflect the research interests:

- The correlation of the outcome Y and the treatment indicator Z, $r_{yz}$, is not statistically significant.
- The correlation of Y and the suppressor X, $r_{yx}$, is close to zero.
- The $R^2$ of the regression model including both Z and X as predictors is larger than the sum of their squared correlations with Y.
- Five covariates are randomly selected.

The value of $R^2$ can be computed as the sum over predictors of the product of each predictor's standardized coefficient and its correlation with Y, so the condition is $R^2 = \beta_z r_{yz} + \beta_x r_{yx} > r_{yz}^2 + r_{yx}^2$. Only when Z and X are independent of each other does $R^2 = r_{yz}^2 + r_{yx}^2$. The specific constraints are set in the simulation program for classical suppression through the correlation matrix of Y, Z, and X:

          Y       Z       X
    Y     --
    Z    -.030    --
    X    -.050   -.600    --

and by the unstandardized regression coefficients:

    $Y = B_0 + B_1 Z + \varepsilon$, where $B_1 = 2$
    $Y = B_0 + B_1 Z + B_2 X + \varepsilon$, where $B_1 = 7$ and $B_2 = 2$

The values of 7 and 2 were chosen for the $B_1$'s such that the simulation results can satisfy all the conditions above and those coefficients are statistically significant in the regression models.

The data sets for the example of reciprocal suppression have to satisfy the following conditions, which reflect the research interests (a numeric check of both sets of targets is sketched after this list):

- The absolute value of the correlation of Y and Z is not zero ($|r_{yz}| > 0$).
- The correlation of Z and X, $r_{zx}$, is statistically significant.
- The correlation of Y and X, $r_{yx}$, is statistically significant.
- The $R^2$ of the regression model including both Z and X as predictors is larger than the sum of their squared correlations with Y ($R^2 > r_{yz}^2 + r_{yx}^2$).
- The beta coefficient of regressing Z on Y is significant, with a sign opposite to its correlation, when X is included in the model.
- Five covariates are randomly selected.
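To verify that the target values actually produce the two suppression patterns, the standardized coefficients and $R^2$ implied by a two-predictor regression can be computed directly from the correlations. The following minimal check uses the classical targets above and the reciprocal targets given in the next paragraph:

```python
# Check the suppression conditions implied by the target correlation
# matrices: classical (r_yz=-.03, r_yx=-.05, r_zx=-.60) from above, and
# reciprocal (r_yz=.20, r_yx=.60, r_zx=.70) from the next paragraph.
def two_predictor_fit(r_yz, r_yx, r_zx):
    # Standardized coefficients of the regression of Y on Z and X.
    beta_z = (r_yz - r_yx * r_zx) / (1 - r_zx**2)
    beta_x = (r_yx - r_yz * r_zx) / (1 - r_zx**2)
    r2 = beta_z * r_yz + beta_x * r_yx
    return beta_z, beta_x, r2

for name, (r_yz, r_yx, r_zx) in {
    "classical":  (-0.03, -0.05, -0.60),
    "reciprocal": ( 0.20,  0.60,  0.70),
}.items():
    bz, bx, r2 = two_predictor_fit(r_yz, r_yx, r_zx)
    # In both cases R^2 exceeds r_yz^2 + r_yx^2, the suppression criterion;
    # in the reciprocal case beta_z also flips sign relative to r_yz.
    print(name, round(bz, 3), round(bx, 3), round(r2, 3),
          r2 > r_yz**2 + r_yx**2)
```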
The specific constraints are set in the simulation program for reciprocal suppression through the correlation matrix of Y, Z, and X:

          Y       Z       X
    Y     --
    Z    .200     --
    X    .600    .700     --

and by the unstandardized regression coefficients:

    $Y = B_0 + B_1 Z + \varepsilon$, where $B_1 = 2$
    $Y = B_0 + B_1 Z + B_2 X + \varepsilon$, where $B_1 = -4$ and $B_2 = 2$

The values of -4 and 2 were chosen for the $B_1$'s such that the simulation results satisfy all the conditions above and both coefficients are statistically significant in the regression models. Under the definition of reciprocal suppression, both Z and X can be suppressor variables; in this case, X and Z suppress each other. Z can be noted as a negative suppressor, in that the correlation between Z and Y is positive but the coefficient of Z on Y is negative after controlling for X in the regression model.

A versatile method, an evolutionary algorithm, was adopted to simulate the data sets. For each data set, Y, Z, and X are generated according to the constraints of the example they belong to, and the simulations aim to minimize the deviations from the desired conditions. The evolutionary algorithm, a relatively new family of algorithms for solving nonlinear optimization problems, can offer approximate solutions very efficiently (Eiben & Smith, 2007). As its name implies, the evolutionary algorithm is inspired by the Darwinian theory of evolution: in nature, a population of organisms within an environment of limited resources competes for those resources, causing natural selection, which in turn raises the fitness of the population. The evolutionary algorithm is designed to mimic this evolutionary process and apply it to optimization problems. Nature's seemingly endless creativity in designing complex life forms to fit virtually every imaginable environment, through nothing but the simple process of evolution, is often cited as evidence of the potential effectiveness of the evolutionary algorithm.

In this particular implementation, the population can be considered a set of 500 triplets of the form (Y, Z, X), where Y and X are vectors of random real numbers from 0 to 100 and Z is a vector of 0's and 1's. Each triplet can be considered a data set with variables X, Z, and Y. The population evolves over discrete generations, each generated from the previous one in two stages: variation and selection. In the variation stage, for each individual, every entry of X and Y is perturbed by adding a randomly selected real number drawn from a Gaussian distribution with mean 0 and a prescribed standard deviation. The entries of Z, however, remain fixed, with exactly 500 entries of 0 and 500 entries of 1. The perturbed individuals together with the existing individuals are called the children of the current generation: 1,000 children are produced at first, and 500 of them are selected as the next generation in the selection stage. In the selection stage, the fitness of each individual (Y, Z, X) is evaluated. The fitness, defined by how well Y, Z, and X satisfy the constraints of the example, is the sum of squared differences between the simulated values and the corresponding constraints; a child with a smaller fitness value has parameters closer to the set constraints. The next generation is selected from the children.
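The variation and selection stages can be condensed into one generation step, as in the following sketch. This is an illustrative re-implementation in modern Python, not the Python 2.7 program of Appendix A; the fitness function (the squared deviation of a child's statistics from the target constraints) is passed in as a callable, and the survival-probability scheme follows the description in the next paragraph:

```python
import numpy as np

rng = np.random.default_rng(0)

def one_generation(population, fitness, sigma=1.0, keep=500, p_survive=0.95):
    """Advance a population of (Y, Z, X) triplets of numpy vectors by one
    generation: variation, then rank-based selection."""
    # Variation: keep each existing individual and add a perturbed copy;
    # Y and X receive Gaussian noise, Z stays fixed (500 zeros, 500 ones).
    children = list(population)
    for y, z, x in population:
        children.append((y + rng.normal(0.0, sigma, y.size), z,
                         x + rng.normal(0.0, sigma, x.size)))
    # Selection: rank the 1,000 children by fitness, best (smallest) first.
    children.sort(key=fitness)
    survivors = []
    for rank, child in enumerate(children):
        # Survival probability .95 ** rank; the rank-0 child always survives.
        if rng.uniform() <= p_survive ** rank:
            survivors.append(child)
        if len(survivors) == keep:
            break
    # Too few survivors: fill the population by cloning the best-fit child.
    while len(survivors) < keep:
        survivors.append(children[0])
    return survivors
```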
Whether a child is selected into the next generation depends on its fitness rank. In the implementation, the children are first ranked by fitness, with the best-fitting child given rank zero. The survival probability for each child is calculated by raising the defined survival probability, .95, to the power of its rank. A real number between zero and one is then drawn from a uniform distribution as an index to decide whether a child survives: when the index is less than or equal to a child's survival probability, that child survives and remains in the population. The selection process starts with the child of rank zero. When the total number of selected children reaches 500, the algorithm stops and the new generation has been created; if fewer than 500 children survive, the missing individuals are created by cloning the rank-zero child. Each iteration represents the progression from one generation to the next. Throughout this process, children with better fitness have a higher chance of being included in the next generation.

Based on my experimentation, a smaller population requires less time per generation than a larger one; however, its fitness shows no substantial improvement after several hundred iterations, and even with an extremely large number of iterations the fitness does not become good enough. Therefore, a larger population is used in the simulation so that fewer iterations are required to produce a data set with a good-enough fitness. For the classical suppression, about 9,000 iterations are run, taking about 10 hours per completed simulation; for the reciprocal suppression, about 5,000 iterations are run, taking about 6 hours. Only the best child, with the smallest fitness value, is selected as one of the produced data sets. The process is repeated until 10 data sets are produced for each example of classical and reciprocal suppression. Although repeating the process takes much more time, it ensures that all the selected data sets fit the constraints precisely enough.

A manual step is used during the process: the population is discarded if the fitness shows no substantial improvement over the first 1,000 iterations, because the initial randomly generated generation may not be appropriate for creating the desired example. Based on my failed experiments, if the fitness did not improve substantially within the first 1,000 iterations, it would never become good enough to represent the example of suppression, no matter how many further iterations were run. The program is written in the Python programming language, version 2.7 (see Appendix A). Five covariates are randomly selected for each example.

Testing the Validity of Simulated Data Sets

To test the validity of the simulated examples of classical and reciprocal suppression, first, the correlations of treatment indicator Z and suppressor X with outcome Y, and the R² from regressing Y on Z and X, are estimated for each simulated data set to check whether it satisfies the condition R² > r_yz² + r_yx². In the example of classical suppression, r_yz should not be statistically significant and r_yx should be close to zero; in the example of reciprocal suppression, the correlations r_yz and r_yx should both be positively significant. Second, to detect whether X validly suppresses Z, two regression models are run.
Model 1 regresses Y on Z only, and Model 2 regresses Y on both Z and X:

Y = β0 + β1 Z + ε (1)
Y = β0 + β1 Z + β2 X + ε (2)

For each simulated data set in the example of classical suppression, the estimated treatment effect β1 should be non-significant in Model 1 but statistically significant in Model 2. In the example of reciprocal suppression, β1 should be positively significant in Model 1 but negatively significant in Model 2. All simulated data sets should satisfy the aforementioned conditions for their respective example.

Estimating the Causal Effect by Regression and PS Analyses

In this study, the causal effect of treatment indicator Z is estimated by regression models and by PS methods, including PS as a covariate, PS weighting, and PS matching, for both the classical and reciprocal suppression examples. Based on the rule for selecting variables in the PS model, all variables that are correlated with the dependent variable should be included in the model to estimate a causal effect. In some causal inference analyses, including a predictor such as a pre-test score or another related test score as a confounding variable when estimating an intervention effect on post-test scores is a key method for eliminating confoundedness. Also, variables whose relationships with the outcome have been established in previous studies need to be included in the models to address the unconfoundedness assumption.

To determine whether the estimates of the treatment effect differ when unconfoundedness is fulfilled at different levels, additional covariates, P's, correlated with the dependent variable to different degrees, are generated. These variables are computed by applying the non-linear function P = R + C × sin(R), where R is the vector of standardized residuals of the simple regression of Y on Z and C is a constant. Here, R is a fixed vector, based on that regression model, which represents the unexplained variance of Y after controlling for Z. I use the residuals from the simple regression with the single predictor Z to derive P because the estimate from this model is defined as the true treatment effect in this study. Based on the function, as C becomes smaller, the correlation between P and R becomes larger. Figure 4 illustrates the relationship between P and R for different values of C. In this study, 10 covariates, P1 to P10, are generated for each data set selected from the simulations.

Including in a model a P that is highly correlated with R implies that a large portion of the variance of Y can be explained by P after controlling for Z. Under this condition, the unconfoundedness assumption is approximately fulfilled, and the estimated coefficient of Z should then be unbiased and close to the true treatment effect defined by the theory. Moreover, how the estimates of the treatment effect differ between multiple regression models and PS models, with and without the suppressor variable, can be tested when the unconfoundedness assumption is fulfilled at different levels. By comparing models with different levels of P's, the impact of violating the unconfoundedness assumption on the estimates of the treatment effects can be addressed.

[Figure 4 appears here: four panels plotting the function for C = 1, 4, 12, and 16.] Figure 4. Graphs of the function P = R + C × sin(R). The vertical axis represents P and the horizontal axis represents R.
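As an illustration of this construction, the following is a minimal sketch, assuming numpy, of how a covariate P can be derived from the standardized residuals of Model 1; the function and constant values are illustrative, not the study's exact choices:

# A minimal sketch (assuming numpy) of deriving P = R + C*sin(R) from
# the standardized residuals R of the simple regression of Y on Z.
import numpy as np

def make_p(y, z, c):
    """Return P = R + c*sin(R), with R the standardized residuals of
    the regression of Y on Z (Model 1)."""
    design = np.column_stack([np.ones(len(y)), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r = (resid - resid.mean()) / resid.std()   # standardized residuals
    return r + c * np.sin(r)

A smaller c yields a P more strongly correlated with R, and hence with the part of Y left unexplained by Z; in the study's terms, P10 corresponds to a small C and P1 to a large one.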
Regression. To estimate the treatment effect by the regression method, Model 3 regresses Y on Z and the covariates Vi:

Y = β0 + β1 Z + β′_(1+i) Vi + ε, i = 1, 2, …, 5 (3)

To estimate the treatment effect conditional on X by the regression method, Model 4 regresses Y on Z, X, and Vi:

Y = β0 + β1 Z + β2 X + β′_(2+i) Vi + ε, i = 1, 2, …, 5 (4)

To test how the covariates P's affect the estimates without suppressor X in regression, Models 3.1 to 3.10 each add one of the covariates P1 to P10 to Model 3. Abbreviating P1 to P10 as Pj, with j = 1, 2, …, 10 (the abbreviated notation is used for the remaining equations as well), these models are:

Y = β0 + β1 Z + β′_(1+i) Vi + β7 Pj + ε, i = 1, 2, …, 5 (3.j)

To test how the covariates P's affect the estimates with suppressor X in regression, the following models are analyzed:

Y = β0 + β1 Z + β2 X + β′_(2+i) Vi + β8 Pj + ε, i = 1, 2, …, 5 (4.j)

PS methods. Because the true PS's are unknown, logistic regression models are used to estimate them, with the binary treatment indicator Z (Z = 1 or 0) as the dependent variable. The PS's are the models' predicted probabilities of receiving the treatment (Z = 1). In this study, different PS's are estimated using different sets of independent variables to see how the estimated treatment effects change under different settings.

To estimate the PS, PS_C, with only the covariates Vi as predictors, Model 5 is used:

log(P(Z=1) / (1 − P(Z=1))) = β0 + β′_i Vi + ε, i = 1, 2, …, 5 (5)

To estimate the PS, PS_CX, with suppressor X and covariates Vi as predictors, Model 6 is used:

log(P(Z=1) / (1 − P(Z=1))) = β0 + β1 X + β′_(1+i) Vi + ε, i = 1, 2, …, 5 (6)

To estimate the PS's, PS_CPj, with covariates Vi and covariate Pj as predictors, Models 5.1 to 5.10 are used:

log(P(Z=1) / (1 − P(Z=1))) = β0 + β′_i Vi + β6 Pj + ε, i = 1, 2, …, 5 (5.j)

To estimate the PS's, PS_CXPj, with suppressor X, covariates Vi, and covariate Pj as predictors, Models 6.1 to 6.10 are used:

log(P(Z=1) / (1 − P(Z=1))) = β0 + β1 X + β′_(1+i) Vi + β7 Pj + ε, i = 1, 2, …, 5 (6.j)

To estimate the treatment effect with the PS methods, PS_C, PS_CX, PS_CPj, and PS_CXPj are used in the PS as a covariate, PS weighting, and PS matching models.

PS as a covariate. To estimate the treatment effect with the covariate adjustment method, the PS is used directly as the only covariate in the regression of outcome Y on treatment indicator Z. Model 7 estimates the treatment effect of Z by controlling for PS_C, which is conditional on the covariates Vi only:

Y = β0 + β1 Z + β2 PS_C + ε (7)

Model 8 estimates the treatment effect of Z by controlling for PS_CX, which is conditional on the covariates Vi and suppressor X.
Y = β0 + β1 Z + β2 PS_CX + ε (8)

Models 7.1 to 7.10 estimate the treatment effect of Z by controlling for PS_CPj, j = 1 to 10, which is conditional on the covariates Vi and covariate Pj:

Y = β0 + β1 Z + β2 PS_CPj + ε (7.j)

Models 8.1 to 8.10 estimate the treatment effect of Z with PS_CXPj, which is conditional on suppressor X, the covariates Vi, and covariate Pj, where j runs from 1 to 10:

Y = β0 + β1 Z + β2 PS_CXPj + ε (8.j)

PS weighting. The PS weighting method uses the PS's to generate sampling weights and applies those weights when estimating the treatment effect by regressing Y on the treatment indicator Z alone. Two different types of weights are used, depending on whether the average treatment effect (ATE) or the average treatment effect for the treated (ATT) is desired. The ATE is the average difference in expected Y between the treatment and control groups. It is defined as ATE = E(Y_1i | Z = 1) − E(Y_0i | Z = 0), where Y_1i | Z = 1 is the value of outcome Y for individual i in the treatment group if the individual was treated, and Y_0i | Z = 0 is the value of outcome Y for individual i in the control group if the individual was not treated. The ATT is defined as ATT = E(Y_1i | Z = 1) − E(Y_0i | Z = 1), where Y_0i | Z = 1 is the value of outcome Y for individual i in the treatment group had the individual not been treated; this term cannot be observed, only estimated.

For estimating the ATE, the weights are defined as 1/P̂S for the treatment group (Z = 1) and 1/(1 − P̂S) for the control group (Z = 0), where P̂S denotes the estimated PS's. For estimating the ATT, the weights are defined as 1 for the treatment group and P̂S/(1 − P̂S) for the control group. In this study, both the ATE and the ATT are estimated; therefore, two types of weights are computed for each generated PS.

To address the problem of extremely large weights, which can easily influence the estimate of the treatment effect, weight trimming is applied. Lee, Lessler, and Stuart (2011) suggested trimming the weights at the 95th percentile to improve the estimates when the PS's are estimated by logistic regression under scenarios of mild non-additivity and non-linearity. In this study, models estimating the treatment effect with weights trimmed at the 95th percentile are used, and models without weight trimming are analyzed as well. Model 9 estimates both the ATE and the ATT, using the different weights generated by each PS, with and without weight trimming:

Y = β0 + β1 Z + ε (9)

PS matching. The PS matching method matches treated to control individuals based on the estimated PS. Given the design of the simulated data sets, the control and treatment groups contain equal numbers of individuals, so all individuals can be matched into 500 pairs. There are various matching methods, of which the most common is greedy matching. Two types of greedy matching are applied here: nearest neighbor matching and nearest neighbor matching within a caliper. The nearest neighbor matching method pairs a treated individual with the control individual for whom the absolute difference of PS's is the smallest among all possible pairs. This method provides one-to-one complete matching but can be inaccurate when the absolute difference of PS's is too large.
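The following is a minimal sketch, assuming numpy, of the greedy nearest-neighbor step and the ATT contrast it feeds (Model 10 below); it is an illustration rather than the study's code, and the optional caliper argument anticipates the caliper variant described next:

# A minimal sketch (assuming numpy; illustrative) of greedy nearest-
# neighbor PS matching and the resulting ATT estimate.  With a caliper
# supplied, pairs whose PS difference exceeds it are dropped.
import numpy as np

def greedy_match_att(ps, z, y, caliper=None):
    treated = np.flatnonzero(z == 1)
    controls = list(np.flatnonzero(z == 0))
    diffs = []
    for t in treated:
        # Greedily take the unused control with the closest PS.
        j = min(range(len(controls)),
                key=lambda k: abs(ps[t] - ps[controls[k]]))
        c = controls.pop(j)
        if caliper is None or abs(ps[t] - ps[c]) <= caliper:
            diffs.append(y[t] - y[c])
    return np.mean(diffs)   # mean treated-minus-control difference

In the study's setting the caliper would be set as ε ≤ .25σ_PS, following Rubin (1985).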
The nearest neighbor matching within a caliper method first matches individuals by the nearest neighbor method and then removes the pairs whose absolute difference of PS's falls outside a set caliper ε. This overcomes the problem of inaccurate matching when the absolute difference of PS's is too large. The size of the caliper is typically set as ε ≤ .25σ_PS, where σ_PS is the standard deviation of the PS (Rubin, 1985); that setting is applied in this study. ATT's are estimated after applying these two matching methods separately for each PS by Model 10:

τ̂_t = (1/n_t) Σ_{k=1}^{n_t} {(Y_k | Z = 1) − (Ŷ_k | Z = 0)} (10)

where τ̂_t is the estimated ATT, n_t is the number of pairs after matching, Y_k | Z = 1 is the outcome value of the individual in the treatment group in pair k, and Ŷ_k | Z = 0 is the estimated outcome value of that same individual had he not been treated.

All the models are applied to all the simulated data sets in both the classical and reciprocal suppression examples. The statistical tests are performed using STATA version 11 (StataCorp, Texas).

Chapter 4 EXAMPLE of CLASSICAL SUPPRESSION

Data

For the example of classical suppression, 10 simulated data sets with outcome Y, treatment indicator Z, suppressor X, and covariates Vi are generated, with 1,000 subjects each. In each data set, the numbers of subjects in the treatment group (Z = 1) and the control group (Z = 0) both equal 500. In this example, the correlation r_yz is small and not statistically significant, and the correlation r_yx is close to zero. The R² of Model 2, which regresses Y on Z and X, is larger than the sum of r_yz² and r_yx².

Testing validity of simulation data sets. The correlations r_yz and r_yx and the R² values from Model 1 and Model 2 are reported in Table 1 for each simulated data set. Based on the results, both r_yz and r_yx are close to zero and non-significant, and the R² of Model 2 is larger than the sum of r_yz² and r_yx² for every simulated data set. The estimated treatment effects are all positive but non-significant in Model 1, where Y is regressed on Z only; in Model 2, with X added, they are still positive but now significant. These results indicate that X increases the predictive validity of Z and that all data sets satisfy the conditions established for classical suppression.

Table 1
Classical Suppression Data Results

                      r_yz    r_yx     Model 1           Model 2
                                       B1       R²       B1          R²
Simulated Data 1      .033    -.058    2.011    .001     6.997***    .011
Simulated Data 2      .032    -.057    2.011    .001     6.997***    .011
Simulated Data 3      .033    -.058    2.011    .001     6.997***    .012
Simulated Data 4      .032    -.056    2.011    .001     6.997***    .011
Simulated Data 5      .034    -.059    2.011    .001     6.997***    .012
Simulated Data 6      .033    -.059    2.011    .001     6.997***    .011
Simulated Data 7      .033    -.058    2.011    .001     6.997***    .011
Simulated Data 8      .034    -.058    2.011    .001     6.997***    .012
Simulated Data 9      .033    -.058    2.011    .001     6.997***    .011
Simulated Data 10     .033    -.059    2.011    .001     6.997***    .012
Note: B1 is the coefficient for treatment indicator Z. Model 1 is Y = β0 + β1 Z + ε and Model 2 is Y = β0 + β1 Z + β2 X + ε. *p < .05. **p < .01. ***p < .001.

Table 2 reports the means and standard deviations (SD) of the correlations among Y, Z, and X across the 10 data sets, where r_yz is .033, r_yx is −.058, and r_zx is .628.
The standard deviations in Table 2 are quite small, all less than or equal to .002, providing evidence that all simulated data sets satisfy the given constraints precisely.

Table 2
Correlation Table for Simulated Variables – Classical Suppression Example

                   Outcome (Y)    Treatment (Z)   Suppressor (X)
                   Mean (SD)      Mean (SD)       Mean (SD)
Outcome (Y)        --
Treatment (Z)      .033 (.001)    --
Suppressor (X)     -.058 (.001)   .628 (.002)     --
Note: The values are calculated from the 10 simulated data sets.

The covariates P's are generated using the non-linear function P = R + C × sin(R), where R is the vector of standardized residuals from Model 1 and C is a constant. Ten P's are generated with different values of C. Table 3 reports the means and standard deviations of the correlations of the 10 P's with the simulated variables Y, Z, and X across the 10 data sets. The results indicate that the correlations between the P's and Y increase monotonically from P1 (.139) to P10 (.973), at an approximate rate of .10 per step. The correlations between the P's and Z are close to zero and decrease slightly in magnitude from P1 (−.008) to P10 (−.002). The correlations between the P's and X are negative and become slightly more negative, at an approximate rate of .005 per step, from P1 (−.022) to P10 (−.079). A covariate P has a stronger effect on Y when its correlation with Y is larger. Since the P's are generated from the unexplained residuals, the unconfoundedness assumption is more likely to be fulfilled with a stronger P. In this case, the correlation of P10 with Y is .973, which is extremely high. As a result, by controlling for P10 in the regression model, the unconfoundedness assumption can be approximately fulfilled.

Table 3
Correlation Table for Simulated Variables and P's – Classical Suppression Example

       Outcome (Y)    Treatment (Z)   Suppressor (X)
       Mean (SD)      Mean (SD)       Mean (SD)
P1     .139 (.043)    -.008 (.025)    -.022 (.026)
P2     .241 (.040)    -.008 (.024)    -.030 (.025)
P3     .340 (.037)    -.008 (.024)    -.037 (.024)
P4     .443 (.033)    -.008 (.022)    -.045 (.023)
P5     .546 (.027)    -.007 (.021)    -.052 (.022)
P6     .648 (.021)    -.007 (.019)    -.060 (.020)
P7     .730 (.016)    -.006 (.017)    -.065 (.018)
P8     .819 (.010)    -.005 (.014)    -.071 (.015)
P9     .906 (.005)    -.004 (.011)    -.076 (.011)
P10    .973 (.001)    -.002 (.006)    -.079 (.006)
Note: The values are calculated from the 10 simulated data sets.

Regression Models

In the regression models, the estimated coefficient of the treatment indicator Z is the estimated treatment effect. Table 4 reports the unstandardized coefficients B, the standard errors SE(B), and the standardized coefficients β of Z. The values in Table 4 are the means and standard deviations of the estimates across the 10 data sets. The standardized treatment effect in Model 3 is .032, and it increases dramatically to .114 after suppressor X is added in Model 4. Also, the treatment effect is non-significant in Model 3 but becomes significant in Model 4. This indicates that the suppressor influences the estimate of the treatment effect and significantly increases the predictive validity of the treatment indicator. Considering Models 3.1 to 3.10, the estimated treatment effects are quite consistent regardless of which level of P is controlled. However, for the models with stronger P's, the estimated standard errors become smaller; as a result, the treatment effects become more significant when stronger P's are controlled. For Model 3.10, with the strongest covariate P10, which explains most of the variance of the outcome after controlling for the treatment indicator, the unconfoundedness assumption can plausibly be fulfilled.
This is because P10 is generated to be highly correlated with the residuals of Model 1 and with the outcome. When the unconfoundedness assumption is fulfilled, the true treatment effect can be estimated. In Model 3.10, the estimated standardized treatment effect of .035 can be regarded as the approximately true treatment effect, and it is not very different from the estimate of .032 in Model 3. The estimated treatment effect of .114 with the suppressor added in Model 4 is much larger than the approximately true treatment effect in Model 3.10.

Comparing the results from Models 4.1 to 4.10, the estimated treatment effects and standard errors become smaller as stronger P's are included in the models. The estimated standardized treatment effect is .112 with estimated standard error 2.456 in Model 4.1, and the corresponding values in Model 4.10 are .038 and 0.563, respectively. Examining Model 4 and Model 4.1, the difference in the estimated treatment effects is only .002, meaning that adding the weakest covariate, P1, does not change the estimate appreciably: the suppressor still has a strong impact on the estimate of the treatment effect when only P1 is controlled. However, the stronger the P included in the model, the smaller the impact of the suppressor. In Model 4.10, the estimated treatment effect decreases to .038, quite close to the value in Model 3.10, which conveys the approximately true treatment effect of .035 without conditioning on the suppressor. This provides evidence that the influence of the suppressor on the treatment indicator becomes smaller when a stronger covariate P is controlled; the effect of the suppressor can be eliminated by controlling a strong-enough covariate P. As shown in Models 3.1 to 3.10, with stronger P's, the estimated standard errors become smaller and the estimated effects become more significant; the same holds for Models 4.1 to 4.10. These findings provide evidence that the stronger covariates P not only eliminate the impact of the suppressor, but also improve the precision of the estimates of the treatment effect.
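The comparison performed here can be sketched as follows, assuming numpy; the helper below is illustrative (the study's analyses were run in STATA) and returns the coefficient and standard error of Z for any of the model specifications above:

# A minimal sketch (assuming numpy; illustrative, not the study's
# STATA runs) of fitting Model 3 versus Model 4.j and reading off the
# coefficient and standard error of the treatment indicator Z.
import numpy as np

def ols_z_effect(y, predictors):
    """OLS of y on [1, predictors]; the first predictor is assumed to
    be the treatment indicator Z.  Returns (b_z, se_z)."""
    X = np.column_stack([np.ones(len(y))] + predictors)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    return coef[1], np.sqrt(cov[1, 1])

# With V the list of five covariates, x the suppressor, p_j one of the
# generated P's:
#   b3, se3 = ols_z_effect(y, [z] + V)              # Model 3
#   b4, se4 = ols_z_effect(y, [z, x] + V + [p_j])   # Model 4.j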
Table 4
The Estimated Treatment Effects of Regression Models – Classical Suppression Example

                        B               SE(B)           β              Significance
                        Mean (SD)       Mean (SD)       Mean (SD)      p < .05
Without suppressor
Model 3                 1.975 (0.097)   1.938 (0.042)   .032 (.002)    0/10
Model 3.1   P1          2.039 (0.177)   1.919 (0.041)   .033 (.003)    0/10
Model 3.2   P2          2.086 (0.312)   1.881 (0.043)   .034 (.006)    0/10
Model 3.3   P3          2.130 (0.441)   1.822 (0.044)   .035 (.008)    0/10
Model 3.4   P4          2.172 (0.561)   1.737 (0.044)   .036 (.010)    0/10
Model 3.5   P5          2.207 (0.656)   1.624 (0.043)   .036 (.011)    0/10
Model 3.6   P6          2.232 (0.718)   1.475 (0.040)   .037 (.011)    2/10
Model 3.7   P7          2.241 (0.731)   1.324 (0.035)   .037 (.012)    3/10
Model 3.8   P8          2.235 (0.694)   1.111 (0.028)   .037 (.011)    6/10
Model 3.9   P9          2.199 (0.569)   0.816 (0.017)   .036 (.010)    7/10
Model 3.10  P10         2.122 (0.329)   0.437 (0.007)   .035 (.006)    10/10
With suppressor
Model 4                 6.953 (0.142)   2.480 (0.050)   .114 (.003)    10/10
Model 4.1   P1          6.856 (0.290)   2.456 (0.051)   .112 (.007)    10/10
Model 4.2   P2          6.679 (0.476)   2.408 (0.053)   .110 (.010)    10/10
Model 4.3   P3          6.408 (0.647)   2.334 (0.055)   .105 (.012)    10/10
Model 4.4   P4          6.024 (0.800)   2.226 (0.056)   .099 (.015)    10/10
Model 4.5   P5          5.539 (0.917)   2.082 (0.054)   .091 (.016)    9/10
Model 4.6   P6          4.947 (0.985)   1.893 (0.050)   .081 (.017)    9/10
Model 4.7   P7          4.396 (0.991)   1.701 (0.044)   .072 (.017)    9/10
Model 4.8   P8          4.396 (0.991)   1.428 (0.035)   .061 (.016)    9/10
Model 4.9   P9          2.954 (0.750)   1.051 (0.022)   .049 (.013)    9/10
Model 4.10  P10         2.300 (0.428)   0.563 (0.008)   .038 (.008)    10/10
Note: The values are calculated from the 10 simulated data sets. The Significance column reports the number of simulated data sets (out of 10) with a treatment effect significant at p < .05. Model 3: Y = β0 + β1 Z + β′_(1+i) Vi + ε. Models 3.1–3.10: Y = β0 + β1 Z + β′_(1+i) Vi + β7 Pj + ε. Model 4: Y = β0 + β1 Z + β2 X + β′_(2+i) Vi + ε. Models 4.1–4.10: Y = β0 + β1 Z + β2 X + β′_(2+i) Vi + β8 Pj + ε.

PS Methods

Before the treatment effects are estimated with the PS methods, the predicted PS's are estimated by including different sets of variables in the logistic regression models. Table 5 reports the correlations between the predicted PS's and the simulated variables Y, Z, and X. The correlations of PS_C with Y, Z, and X are all small (.010, .069, and .041, respectively), where PS_C is estimated from the covariates Vi only. For PS_CX, which is estimated from suppressor X and the Vi, the correlation with Y is still small (−.057), but the correlations with Z and X are quite large (.633 and .988, respectively). The correlations of PS_CP1 to PS_CP10 with Y grow slightly stronger in magnitude from PS_CP1 (.001) to PS_CP7 (−.030) and slightly weaker from PS_CP8 (−.030) to PS_CP10 (−.015); the values are all close to zero. The correlations of the PS_CPj with Z and X are quite consistent across the different levels of P. The correlations of PS_CXP1 to PS_CXP10 with Y grow slightly weaker from PS_CXP1 (−.057) to PS_CXP8 (−.006) and slightly stronger from PS_CXP9 (.006) to PS_CXP10 (.016); these values, too, are all close to zero. The correlations of the PS_CXPj with Z and X are likewise quite consistent across the different levels of P.
Table 5
Correlation Table for Simulated Variables and Propensity Scores – Classical Suppression Example

                          Outcome (Y)    Treatment (Z)   Suppressor (X)
                          Mean (SD)      Mean (SD)       Mean (SD)
Model 5      PS_C         .010 (.021)    .069 (.023)     .041 (.021)
Model 5.1    PS_CP1       .001 (.042)    .073 (.025)     .044 (.024)
Model 5.2    PS_CP2      -.006 (.070)    .073 (.025)     .045 (.025)
Model 5.3    PS_CP3      -.012 (.097)    .073 (.025)     .045 (.027)
Model 5.4    PS_CP4      -.018 (.122)    .073 (.025)     .046 (.028)
Model 5.5    PS_CP5      -.024 (.141)    .072 (.025)     .046 (.029)
Model 5.6    PS_CP6      -.028 (.154)    .072 (.025)     .046 (.030)
Model 5.7    PS_CP7      -.030 (.157)    .071 (.025)     .046 (.030)
Model 5.8    PS_CP8      -.030 (.150)    .071 (.024)     .045 (.029)
Model 5.9    PS_CP9      -.026 (.123)    .070 (.024)     .045 (.027)
Model 5.10   PS_CP10     -.015 (.071)    .070 (.024)     .043 (.024)
Model 6      PS_CX       -.057 (.005)    .633 (.004)     .988 (.002)
Model 6.1    PS_CXP1     -.055 (.007)    .634 (.004)     .987 (.002)
Model 6.2    PS_CXP2     -.053 (.009)    .634 (.004)     .987 (.002)
Model 6.3    PS_CXP3     -.048 (.011)    .634 (.004)     .987 (.002)
Model 6.4    PS_CXP4     -.042 (.013)    .634 (.004)     .987 (.003)
Model 6.5    PS_CXP5     -.035 (.015)    .634 (.004)     .986 (.003)
Model 6.6    PS_CXP6     -.026 (.016)    .635 (.004)     .986 (.003)
Model 6.7    PS_CXP7     -.017 (.016)    .635 (.003)     .986 (.003)
Model 6.8    PS_CXP8     -.006 (.015)    .635 (.003)     .985 (.003)
Model 6.9    PS_CXP9      .006 (.012)    .636 (.003)     .985 (.003)
Model 6.10   PS_CXP10     .016 (.006)    .636 (.004)     .984 (.003)
Note: The values are calculated from the 10 simulated data sets. Model 5: log(P(Z=1)/(1−P(Z=1))) = β0 + β′_i Vi + ε. Models 5.1–5.10: log(P(Z=1)/(1−P(Z=1))) = β0 + β′_i Vi + β6 Pj + ε. Model 6: log(P(Z=1)/(1−P(Z=1))) = β0 + β1 X + β′_(1+i) Vi + ε. Models 6.1–6.10: log(P(Z=1)/(1−P(Z=1))) = β0 + β1 X + β′_(1+i) Vi + β7 Pj + ε.

PS as a covariate. The estimates of the treatment effect using the PS as a covariate method are in Table 6. Comparing Tables 4 and 6, the estimated treatment effects from corresponding models are quite similar whether the regression method or the PS as a covariate method is used; however, the estimated standard errors differ between the two methods. Comparing Models 7 and 8, the standardized treatment effect increases from .032 to .115 when the PS estimated from suppressor X and covariates Vi is controlled. This result indicates that, under the PS as a covariate method as well, the suppressor influences the estimate of the treatment effect and significantly increases the predictive validity of the treatment indicator. Considering Models 7.1 to 7.10, the estimated treatment effects remain consistent with one another. For Model 7.10, with the strongest covariate P10, the estimated standardized treatment effect is still .035, the same as the approximately true treatment effect in Model 3.10. Models 8.1 to 8.10 show a decrease in the estimated treatment effects, again providing evidence that the influence of the suppressor on the treatment indicator becomes smaller as a stronger P is applied. In Model 8.10, the estimated treatment effect is .038, quite close to the approximately true treatment effect in Model 7.10. In Models 7.1 to 7.10 and 8.1 to 8.10, however, when the PS's estimated with stronger P's are controlled, the estimated standard errors do not become smaller as they do when the stronger P's are controlled directly in the regression models. As a result, unlike in the regression models, controlling PS's that involve stronger P's cannot improve the precision of the estimates of the treatment effect.
In the regression model, the standard error of the treatment effect of Z is defined as:

SE_z = sqrt[ (1 − R²) / ((1 − R_z²)(N − K − 1)) ] × (S_y / S_z)

where R² is the R² of the regression of Y on all predictors, R_z² is the R² of the regression of Z on all the other predictors, S_y is the standard deviation of Y, S_z is the standard deviation of Z, N is the total sample size, and K is the number of predictors. As mentioned in Chapter 3, R² is the sum, over all predictors, of the products of the absolute standardized coefficients and the correlations with Y (R² = Σ_{k=1}^{K} |β_k| × r_yC_k, where the C_k are the predictors in the model). With smaller standardized coefficients, the value of R² tends to be smaller, and when R² becomes smaller, the standard errors of the predictors become larger.

Table 7 compares the coefficients of the different levels of P in the regression models with those in the PS as a covariate models. It provides evidence of why the estimated standard errors do not become smaller when stronger P's are involved in the PS as a covariate models, as they do in the regression models. In Table 7, the stronger the covariate P, the larger its absolute standardized coefficient in the regression models, with or without the suppressor. In the PS as a covariate models, however, the absolute standardized coefficients of the PS_CPj are extremely small and merely increase slightly before decreasing again; the absolute standardized coefficients of the PS_CXPj are also small and decrease slightly from PS_CXP1 to PS_CXP10. These changes correspond to the estimated standard errors in Models 7.1 to 7.10 and 8.1 to 8.10: with smaller absolute standardized coefficients of the PS's, the estimated standard errors of the treatment effect are large. These results indicate that using a PS with a stronger P involved as a covariate in the regression model cannot improve the predicted line, and cannot decrease the mean squared error, as directly as adding the stronger P itself as a covariate. Accordingly, the estimated standard errors remain large when stronger P's are involved in the PS as a covariate method.
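A small numeric illustration of this formula, with made-up values rather than values from the simulations, shows how a larger model R² shrinks SE_z while everything else is held fixed:

# A small numeric illustration (values are made up, not from the
# simulations) of the standard-error formula above.
import math

def se_z(r2, r2_z, n, k, s_y, s_z):
    return math.sqrt((1 - r2) / ((1 - r2_z) * (n - k - 1))) * (s_y / s_z)

n, k, s_y, s_z, r2_z = 1000, 6, 30.0, 0.5, 0.40
for r2 in (0.01, 0.50, 0.95):
    print(r2, round(se_z(r2, r2_z, n, k, s_y, s_z), 3))
# R^2 = .01 -> SE ~ 2.45;  R^2 = .50 -> SE ~ 1.74;  R^2 = .95 -> SE ~ 0.55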
Table 6
The Estimated Treatment Effects of Propensity Score as a Covariate Models – Classical Suppression Example

                        B               SE(B)           β              Significance
                        Mean (SD)       Mean (SD)       Mean (SD)      p < .05
Without suppressor
Model 7                 1.976 (0.097)   1.938 (0.042)   .032 (.002)    0/10
Model 7.1   P1          2.039 (0.177)   1.938 (0.042)   .033 (.003)    0/10
Model 7.2   P2          2.087 (0.312)   1.935 (0.042)   .034 (.006)    0/10
Model 7.3   P3          2.131 (0.442)   1.931 (0.042)   .035 (.008)    0/10
Model 7.4   P4          2.173 (0.561)   1.926 (0.041)   .036 (.020)    0/10
Model 7.5   P5          2.207 (0.657)   1.921 (0.041)   .036 (.011)    0/10
Model 7.6   P6          2.232 (0.718)   1.917 (0.042)   .037 (.012)    0/10
Model 7.7   P7          2.242 (0.732)   1.916 (0.042)   .037 (.012)    0/10
Model 7.8   P8          2.235 (0.695)   1.918 (0.042)   .037 (.012)    0/10
Model 7.9   P9          2.199 (0.569)   1.924 (0.042)   .036 (.010)    0/10
Model 7.10  P10         2.123 (0.329)   1.934 (0.042)   .035 (.006)    0/10
With suppressor
Model 8                 7.035 (0.351)   2.486 (0.052)   .115 (.005)    10/10
Model 8.1   P1          6.940 (0.423)   2.487 (0.054)   .114 (.008)    10/10
Model 8.2   P2          6.760 (0.559)   2.488 (0.056)   .111 (.011)    10/10
Model 8.3   P3          6.485 (0.703)   2.490 (0.057)   .106 (.013)    10/10
Model 8.4   P4          6.097 (0.838)   2.492 (0.057)   .100 (.015)    9/10
Model 8.5   P5          5.606 (0.943)   2.494 (0.058)   .092 (.017)    8/10
Model 8.6   P6          5.008 (1.003)   2.497 (0.058)   .082 (.018)    6/10
Model 8.7   P7          4.451 (1.005)   2.499 (0.057)   .073 (.018)    2/10
Model 8.8   P8          3.758 (0.940)   2.501 (0.057)   .062 (.016)    1/10
Model 8.9   P9          2.987 (0.760)   2.503 (0.056)   .049 (.013)    0/10
Model 8.10  P10         2.316 (0.435)   2.505 (0.054)   .038 (.008)    0/10
Note: The values are calculated from the 10 simulated data sets. The Significance column reports the number of simulated data sets (out of 10) with a treatment effect significant at p < .05. Model 7: Y = β0 + β1 Z + β2 PS_C + ε. Models 7.1–7.10: Y = β0 + β1 Z + β2 PS_CPj + ε. Model 8: Y = β0 + β1 Z + β2 PS_CX + ε. Models 8.1–8.10: Y = β0 + β1 Z + β2 PS_CXPj + ε.
Table 7
Coefficients of P's in Regression Models and Coefficients of Propensity Scores in PS as a Covariate Models – Classical Suppression Example

Regression                                    PS as a covariate
Model        β            SE                  Model            β            SE
             Mean (SD)    Mean (SD)                            Mean (SD)    Mean (SD)
Without suppressor
3.1   P1     .139 (.043)  1.349 (0.031)       7.1   PS_CP1    -.001 (.043)  29.725 (11.270)
3.2   P2     .241 (.041)  1.296 (0.035)       7.2   PS_CP2    -.008 (.071)  29.728 (11.272)
3.3   P3     .341 (.038)  1.216 (0.039)       7.3   PS_CP3    -.015 (.098)  29.740 (11.276)
3.4   P4     .444 (.033)  1.105 (0.041)       7.4   PS_CP4    -.021 (.123)  29.768 (11.282)
3.5   P5     .546 (.028)  0.966 (0.040)       7.5   PS_CP5    -.026 (.143)  29.827 (11.292)
3.6   P6     .659 (.021)  0.797 (0.035)       7.6   PS_CP6    -.031 (.156)  29.934 (11.304)
3.7   P7     .730 (.016)  0.642 (0.029)       7.7   PS_CP7    -.033 (.159)  30.069 (11.313)
3.8   P8     .820 (.010)  0.452 (0.019)       7.8   PS_CP8    -.033 (.151)  30.289 (11.321)
3.9   P9     .907 (.005)  0.244 (0.009)       7.9   PS_CP9    -.028 (.124)  30.599 (11.320)
3.10  P10    .974 (.001)  0.070 (0.002)       7.10  PS_CP10   -.017 (.072)  30.921 (11.303)
With suppressor
4.1   P1     .137 (.043)  1.343 (0.032)       8.1   PS_CXP1   -.127 (.012)  3.907 (0.092)
4.2   P2     .238 (.041)  1.291 (0.036)       8.2   PS_CXP2   -.123 (.016)  3.908 (0.093)
4.3   P3     .337 (.037)  1.213 (0.039)       8.3   PS_CXP3   -.116 (.020)  3.910 (0.094)
4.4   P4     .440 (.033)  1.103 (0.041)       8.4   PS_CXP4   -.106 (.023)  3.912 (0.094)
4.5   P5     .542 (.027)  0.965 (0.040)       8.5   PS_CXP5   -.093 (.026)  3.915 (0.094)
4.6   P6     .645 (.021)  0.798 (0.035)       8.6   PS_CXP6   -.078 (.027)  3.917 (0.094)
4.7   P7     .727 (.016)  0.643 (0.028)       8.7   PS_CXP7   -.063 (.027)  3.920 (0.094)
4.8   P8     .817 (.011)  0.453 (0.019)       8.8   PS_CXP8   -.045 (.025)  3.922 (0.093)
4.9   P9     .905 (.005)  0.245 (0.009)       8.9   PS_CXP9   -.025 (.020)  3.923 (0.092)
4.10  P10    .974 (.002)  0.070 (0.002)       8.10  PS_CXP10  -.008 (.011)  3.924 (0.091)
Note: The values are calculated from the 10 simulated data sets. Models 3.1–3.10: Y = β0 + β1 Z + β′_(1+i) Vi + β7 Pj + ε; β7 is reported. Models 4.1–4.10: Y = β0 + β1 Z + β2 X + β′_(2+i) Vi + β8 Pj + ε; β8 is reported. Models 7.1–7.10: Y = β0 + β1 Z + β2 PS_CPj + ε; β2 is reported. Models 8.1–8.10: Y = β0 + β1 Z + β2 PS_CXPj + ε; β2 is reported.

PS weighting. Tables 8.1 and 8.2 report the estimated average treatment effects (ATE) without and with trimming of the weights at the 95th percentile, respectively, using the PS weighting models; the different weights are generated from the corresponding PS's. Tables 9.1 and 9.2 likewise report the estimated average treatment effects on the treated (ATT), without and with trimming at the 95th percentile. Under the PS weighting method, too, the suppressor influences the estimate of the treatment effect and significantly increases the predictive validity of the treatment indicator. The impact of the suppressor is found in the estimates of both the ATE and the ATT models, regardless of whether weight trimming is applied. Without the suppressor in the estimation of the treatment effect, the estimates of the ATE and the ATT, as well as their standard errors, are almost identical whether weight trimming is applied or not, as shown in Tables 8.1, 8.2, 9.1, and 9.2. This implies that the distribution of outcomes for individuals in the control group is similar to that for all individuals. The estimates are also all similar to those of the corresponding PS as a covariate models. With the suppressor in the estimation, the estimates of the ATE tend to be smaller than those of the ATT, especially when weight trimming is applied.
Moreover, with weight trimming applied, the estimates of both the ATE and the ATT are smaller than their untrimmed counterparts when no covariate P, or only a weak one, is included. However, with strong P's such as P8, P9, and P10, the ATE estimates become larger than those without trimming. For example, the standardized ATE with PS_CX is .136 without weight trimming and .094 with it, whereas the value with PS_CXP10 is .026 without trimming and .052 with it. This provides evidence that weight trimming removes more of the suppressor's impact when the unconfoundedness assumption is less likely to be fulfilled. However, when the unconfoundedness assumption is approximately fulfilled by applying strong-enough covariates P, the models with weight trimming do not eliminate the impact of the suppressor any better than the models without it. This is because when unconfoundedness is not achieved, unbiased estimation cannot be assumed, and trimming diminishes the influence of individuals with extreme weights who can easily distort the estimate; once unconfoundedness is fulfilled, removing any individual from the sample means losing information essential for unbiased estimation.

Since the regression and PS as a covariate models also estimate ATEs, the estimates in Tables 8.1 and 8.2 can be compared with those in Tables 4 and 6. When weight trimming is not applied and the unconfoundedness assumption is approximately fulfilled through P10, the estimated ATE with the suppressor involved is .026, slightly smaller than the corresponding value of .038 in the regression and PS as a covariate models. When weight trimming is applied, the estimated ATE in the PS weighting model is .052, slightly higher than the values in the regression and PS as a covariate models. Comparing the standard errors of the PS weighting estimates with those of the regression models, the standard errors do not decrease with stronger covariates P in the PS weighting models as they do in the regression models. This finding is consistent with the results of the PS as a covariate models. Under PS weighting, the only predictor in the regression model is the treatment indicator; unless its absolute standardized coefficient increases, the model cannot improve the predicted line and decrease the mean squared error, so the standard error cannot become smaller in a PS weighting model even when a stronger covariate P is applied. Moreover, for all PS weighting models with the suppressor involved, the standard deviations of the estimated treatment effects across the 10 data sets are larger than those in the regression and PS as a covariate models. This implies that the estimates can vary enough across data sets to support different inferences about the treatment effect.
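The weight construction used in these models can be sketched as follows, assuming numpy; capping at the 95th percentile is one common reading of the trimming procedure, and the variable names are illustrative:

# A minimal sketch (assuming numpy; illustrative) of ATE and ATT
# weights built from an estimated PS, with optional trimming
# implemented here as capping at the 95th percentile.
import numpy as np

def ps_weights(ps, z, estimand="ATE", trim=False):
    if estimand == "ATE":
        w = np.where(z == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    else:  # ATT
        w = np.where(z == 1, 1.0, ps / (1.0 - ps))
    if trim:
        w = np.minimum(w, np.percentile(w, 95))  # cap extreme weights
    return w

The treatment effect of Model 9 is then the weighted regression of Y on Z alone, which amounts to a weighted difference in group means.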
Table 8.1
The Estimated Average Treatment Effects (ATE) of Propensity Score Weighting – Classical Suppression Example: Without Trimming

Model 9                 B               SE(B)           β              Significance
Propensity score used   Mean (SD)       Mean (SD)       Mean (SD)      p < .05
Without suppressor
PS_C                    1.971 (0.165)   2.048 (0.052)   .030 (.003)    0/10
PS_CP1                  2.073 (0.283)   2.048 (0.052)   .032 (.005)    0/10
PS_CP2                  2.145 (0.479)   2.048 (0.052)   .033 (.008)    0/10
PS_CP3                  2.213 (0.672)   2.048 (0.052)   .034 (.011)    0/10
PS_CP4                  2.276 (0.850)   2.048 (0.052)   .035 (.014)    0/10
PS_CP5                  2.329 (0.992)   2.048 (0.052)   .036 (.016)    0/10
PS_CP6                  2.366 (1.083)   2.048 (0.053)   .037 (.017)    0/10
PS_CP7                  2.379 (1.103)   2.048 (0.053)   .037 (.017)    0/10
PS_CP8                  2.369 (1.046)   2.048 (0.053)   .037 (.017)    0/10
PS_CP9                  2.313 (0.856)   2.048 (0.053)   .036 (.014)    0/10
PS_CP10                 2.194 (0.496)   2.048 (0.053)   .034 (.008)    0/10
With suppressor
PS_CX                   8.848 (2.298)   2.046 (0.057)   .136 (.036)    9/10
PS_CXP1                 8.759 (2.438)   2.044 (0.055)   .135 (.038)    9/10
PS_CXP2                 8.500 (2.624)   2.044 (0.055)   .131 (.041)    9/10
PS_CXP3                 8.096 (2.816)   2.045 (0.054)   .125 (.044)    9/10
PS_CXP4                 7.518 (3.006)   2.046 (0.054)   .116 (.047)    9/10
PS_CXP5                 7.781 (3.174)   2.048 (0.053)   .105 (.049)    9/10
PS_CXP6                 5.874 (3.296)   2.049 (0.053)   .091 (.051)    8/10
PS_CXP7                 5.022 (3.348)   2.050 (0.052)   .078 (.051)    7/10
PS_CXP8                 3.953 (3.336)   2.052 (0.052)   .061 (.051)    5/10
PS_CXP9                 2.748 (3.217)   2.054 (0.051)   .043 (.049)    5/10
PS_CXP10                1.680 (2.988)   2.055 (0.051)   .026 (.045)    3/10
Note: The values are calculated from the 10 simulated data sets. The Significance column reports the number of simulated data sets (out of 10) with a treatment effect significant at p < .05. Model 9: Y = β0 + β1 Z + ε.

Table 8.2
The Estimated Average Treatment Effects (ATE) of Propensity Score Weighting – Classical Suppression Example: With Trimming at the 95th Percentile

Model 9                 B               SE(B)           β              Significance
Propensity score used   Mean (SD)       Mean (SD)       Mean (SD)      p < .05
Without suppressor
PS_C                    2.104 (0.439)   2.060 (0.049)   .033 (.007)    0/10
PS_CP1                  2.256 (0.450)   2.058 (0.047)   .036 (.007)    0/10
PS_CP2                  2.274 (0.438)   2.056 (0.048)   .036 (.006)    0/10
PS_CP3                  2.291 (0.485)   2.055 (0.048)   .036 (.007)    0/10
PS_CP4                  2.291 (0.468)   2.055 (0.050)   .036 (.007)    0/10
PS_CP5                  2.365 (0.496)   2.056 (0.049)   .037 (.008)    0/10
PS_CP6                  2.383 (0.540)   2.055 (0.047)   .038 (.008)    0/10
PS_CP7                  2.390 (0.493)   2.054 (0.048)   .038 (.008)    0/10
PS_CP8                  2.340 (0.464)   2.056 (0.048)   .037 (.007)    0/10
PS_CP9                  2.238 (0.575)   2.058 (0.050)   .035 (.009)    0/10
PS_CP10                 2.212 (0.568)   2.059 (0.050)   .035 (.009)    0/10
With suppressor
PS_CX                   6.441 (2.193)   2.187 (0.085)   .094 (.030)    9/10
PS_CXP1                 6.300 (2.298)   2.195 (0.075)   .092 (.032)    8/10
PS_CXP2                 6.187 (2.146)   2.196 (0.075)   .091 (.030)    7/10
PS_CXP3                 5.984 (2.071)   2.199 (0.076)   .088 (.030)    7/10
PS_CXP4                 6.078 (2.354)   2.200 (0.084)   .089 (.034)    7/10
PS_CXP5                 5.898 (2.336)   2.199 (0.085)   .087 (.035)    7/10
PS_CXP6                 5.671 (2.031)   2.201 (0.085)   .083 (.030)    6/10
PS_CXP7                 5.205 (2.076)   2.200 (0.082)   .077 (.031)    5/10
PS_CXP8                 4.625 (1.971)   2.200 (0.081)   .068 (.029)    6/10
PS_CXP9                 3.972 (1.996)   2.202 (0.081)   .058 (.029)    5/10
PS_CXP10                3.557 (1.881)   2.200 (0.079)   .052 (.028)    4/10
Note: The values are calculated from the 10 simulated data sets. The Significance column reports the number of simulated data sets (out of 10) with a treatment effect significant at p < .05. Model 9: Y = β0 + β1 Z + ε.
Table 9.1
The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Weighting – Classical Suppression Example: Without Trimming

Model 9                 B               SE(B)           β              Significance
Propensity score used   Mean (SD)       Mean (SD)       Mean (SD)      p < .05
Without suppressor
PS_C                    1.999 (0.162)   1.934 (0.043)   .033 (.003)    0/10
PS_CP1                  2.075 (0.240)   1.934 (0.043)   .034 (.004)    0/10
PS_CP2                  2.125 (0.353)   1.934 (0.043)   .035 (.006)    0/10
PS_CP3                  2.171 (0.473)   1.933 (0.043)   .036 (.008)    0/10
PS_CP4                  2.214 (0.586)   1.933 (0.043)   .036 (.010)    0/10
PS_CP5                  2.250 (0.678)   1.933 (0.043)   .037 (.012)    0/10
PS_CP6                  2.275 (0.736)   1.933 (0.043)   .037 (.012)    0/10
PS_CP7                  2.283 (0.748)   1.933 (0.044)   .037 (.013)    0/10
PS_CP8                  2.275 (0.709)   1.933 (0.044)   .037 (.012)    0/10
PS_CP9                  2.235 (0.584)   1.934 (0.044)   .037 (.010)    0/10
PS_CP10                 2.153 (0.350)   1.934 (0.043)   .035 (.006)    0/10
With suppressor
PS_CX                   6.727 (2.657)   1.926 (0.063)   .110 (.044)    9/10
PS_CXP1                 6.666 (2.749)   1.924 (0.063)   .110 (.046)    9/10
PS_CXP2                 6.477 (2.872)   1.924 (0.063)   .106 (.048)    9/10
PS_CXP3                 6.192 (2.993)   1.925 (0.063)   .102 (.050)    9/10
PS_CXP4                 5.791 (3.111)   1.926 (0.063)   .095 (.051)    8/10
PS_CXP5                 5.285 (3.211)   1.928 (0.063)   .087 (.053)    7/10
PS_CXP6                 4.667 (3.282)   1.929 (0.062)   .077 (.054)    5/10
PS_CXP7                 4.091 (3.306)   1.931 (0.061)   .068 (.054)    5/10
PS_CXP8                 3.371 (3.286)   1.932 (0.060)   .056 (.053)    4/10
PS_CXP9                 2.567 (3.189)   1.934 (0.059)   .043 (.052)    3/10
PS_CXP10                1.865 (3.004)   1.935 (0.058)   .031 (.048)    2/10
Note: The values are calculated from the 10 simulated data sets. The Significance column reports the number of simulated data sets (out of 10) with a treatment effect significant at p < .05. Model 9: Y = β0 + β1 Z + ε.

Table 9.2
The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Weighting – Classical Suppression Example: With Trimming at the 95th Percentile

Model 9                 B               SE(B)           β              Significance
Propensity score used   Mean (SD)       Mean (SD)       Mean (SD)      p < .05
Without suppressor
PS_C                    1.982 (.340)    1.989 (.048)    .032 (.006)    0/10
PS_CP1                  1.935 (.408)    1.989 (.043)    .032 (.006)    0/10
PS_CP2                  1.959 (.464)    1.990 (.043)    .032 (.007)    0/10
PS_CP3                  1.945 (.513)    1.990 (.043)    .032 (.008)    0/10
PS_CP4                  1.947 (.517)    1.988 (.044)    .032 (.008)    0/10
PS_CP5                  1.945 (.521)    1.988 (.044)    .032 (.008)    0/10
PS_CP6                  1.897 (.621)    1.988 (.044)    .031 (.010)    0/10
PS_CP7                  1.947 (.597)    1.988 (.044)    .032 (.009)    0/10
PS_CP8                  1.996 (.565)    1.988 (.044)    .033 (.009)    0/10
PS_CP9                  1.998 (.642)    1.987 (.045)    .033 (.010)    0/10
PS_CP10                 2.011 (.461)    1.987 (.045)    .033 (.007)    0/10
With suppressor
PS_CX                   3.623 (2.446)   2.199 (.061)    .053 (.036)    5/10
PS_CXP1                 3.546 (2.296)   2.201 (.062)    .052 (.034)    5/10
PS_CXP2                 3.377 (2.185)   2.200 (.063)    .050 (.032)    5/10
PS_CXP3                 3.448 (2.365)   2.202 (.064)    .051 (.035)    4/10
PS_CXP4                 3.241 (2.258)   2.202 (.065)    .048 (.033)    3/10
PS_CXP5                 3.210 (2.382)   2.203 (.066)    .047 (.035)    4/10
PS_CXP6                 2.876 (2.320)   2.204 (.066)    .042 (.034)    2/10
PS_CXP7                 2.877 (2.196)   2.204 (.066)    .042 (.033)    1/10
PS_CXP8                 2.693 (2.191)   2.205 (.067)    .040 (.032)    1/10
PS_CXP9                 2.291 (2.108)   2.205 (.064)    .034 (.031)    1/10
PS_CXP10                2.142 (1.939)   2.204 (.066)    .032 (.029)    1/10
Note: The values are calculated from the 10 simulated data sets. The Significance column reports the number of simulated data sets (out of 10) with a treatment effect significant at p < .05. Model 9: Y = β0 + β1 Z + ε.

PS matching. Two types of matching methods, nearest neighbor matching and nearest neighbor matching within a caliper ε = .25σ_PS, are used, and the estimates are shown in Tables 10.1 and 10.2, respectively.
As with the PS weighting methods, the standard deviations of the estimated treatment effects are large in the PS matching models, indicating that the estimates vary across the simulated data sets; as a result, different inferences about the treatment effect may be drawn. In Table 10.1, the estimated treatment effect increases from 2.185 when matching on PS_C to 6.509 when matching on PS_CX. This implies that the suppressor influences the estimate and increases the predictive validity of the treatment indicator, although the estimates are statistically significant in only 4 of the 10 simulated data sets because of the relatively large standard errors. Unlike the previous methods, where the estimated treatment effects without the suppressor are relatively consistent, the corresponding PS matching estimates vary without a specific pattern across the levels of P: for example, the estimated treatment effect is 1.440 when matching on PS_CP4, increases to 3.238 with PS_CP5, decreases to 1.636 with PS_CP7, and then increases again to 2.443 with PS_CP8. As found with the previous methods, the stronger the covariate P in the PS matching models, the smaller the estimated treatment effect, which shows that PS matching can also reduce the impact of the suppressor to the extent that the other methods did. When the unconfoundedness assumption is approximately fulfilled with covariate P10 applied, the estimated treatment effect is 1.035, smaller than the PS weighting values of 1.865 without weight trimming and 2.142 with it.

Tables 10.1 and 10.2 show that the standard errors of the estimated treatment effects are much larger than all the corresponding estimates from the other methods, especially when the covariates P are involved. Moreover, when individuals are matched using PS's that involve stronger P's, the estimated standard errors do not decrease as they do in the regression models. These findings indicate that, with the PS matching method, the estimated standard errors are not reduced by applying stronger covariates P, and the least precise estimates of the treatment effect are produced by this method. Examining Tables 10.1 and 10.2, the estimates are very similar regardless of whether nearest neighbor matching or nearest neighbor matching within a caliper is used; for the models with the suppressor involved, the estimates are identical. This provides evidence that, for most or all pairs, the difference of the individuals' PS's within a pair is smaller than the defined caliper. The largest number of pairs removed in any analysis is four out of 500. The largest caliper applied is .014 and the smallest is .003, with a mean of .008; even the largest caliper is small enough to imply that the PS's within matched pairs are similar.
Table 10.1
The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Matching – Classical Suppression Example: Nearest Neighbor Matching

Model 10                B               SE(B)          Significance
Propensity score used   Mean (SD)       Mean (SD)      p < .05
Without suppressor
PS_C                    2.185 (2.582)   1.331 (.112)   0/10
PS_CP1                  2.441 (1.361)   2.549 (.058)   0/10
PS_CP2                  2.174 (2.138)   2.585 (.049)   1/10
PS_CP3                  1.600 (1.123)   2.543 (.049)   0/10
PS_CP4                  1.440 (2.239)   2.538 (.087)   0/10
PS_CP5                  3.238 (1.256)   2.562 (.079)   1/10
PS_CP6                  2.028 (1.498)   2.584 (.096)   0/10
PS_CP7                  1.636 (1.568)   2.569 (.088)   0/10
PS_CP8                  2.443 (2.106)   2.546 (.065)   2/10
PS_CP9                  2.614 (1.878)   2.556 (.075)   1/10
PS_CP10                 1.575 (1.595)   2.566 (.057)   0/10
With suppressor
PS_CX                   6.509 (4.554)   3.271 (.274)   4/10
PS_CXP1                 6.453 (3.065)   4.636 (.413)   3/10
PS_CXP2                 6.213 (3.768)   4.670 (.437)   2/10
PS_CXP3                 6.748 (3.011)   4.683 (.391)   1/10
PS_CXP4                 6.541 (3.799)   4.725 (.377)   2/10
PS_CXP5                 5.774 (3.954)   4.765 (.310)   1/10
PS_CXP6                 5.257 (3.719)   4.640 (.336)   2/10
PS_CXP7                 2.800 (6.044)   4.655 (.341)   2/10
PS_CXP8                 2.643 (5.111)   4.638 (.351)   1/10
PS_CXP9                 1.363 (4.796)   4.640 (.370)   0/10
PS_CXP10                1.035 (4.003)   4.622 (.404)   0/10
Note: The values are calculated from the 10 simulated data sets. The Significance column reports the number of simulated data sets (out of 10) with a treatment effect significant at p < .05. Model 10: τ̂_t = (1/n_t) Σ_{k=1}^{n_t} {(Y_k | Z = 1) − (Ŷ_k | Z = 0)}.

Table 10.2
The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Matching – Classical Suppression Example: Nearest Neighbor Matching within a Caliper

Model 10                B               SE(B)          Significance
Propensity score used   Mean (SD)       Mean (SD)      p < .05
Without suppressor
PS_C                    2.186 (2.585)   1.355 (.112)   0/10
PS_CP1                  2.432 (1.353)   2.549 (.058)   0/10
PS_CP2                  2.163 (2.067)   2.586 (.049)   1/10
PS_CP3                  1.600 (1.082)   2.544 (.048)   0/10
PS_CP4                  1.435 (2.188)   2.540 (.087)   0/10
PS_CP5                  3.240 (1.286)   2.562 (.077)   1/10
PS_CP6                  2.044 (1.570)   2.584 (.095)   0/10
PS_CP7                  1.640 (1.589)   2.570 (.087)   0/10
PS_CP8                  2.449 (2.101)   2.548 (.064)   2/10
PS_CP9                  2.611 (1.870)   2.556 (.074)   1/10
PS_CP10                 1.581 (1.604)   2.568 (.057)   0/10
With suppressor
PS_CX                   6.509 (4.554)   3.271 (.274)   4/10
PS_CXP1                 6.453 (3.065)   4.636 (.413)   3/10
PS_CXP2                 6.213 (3.768)   4.670 (.437)   2/10
PS_CXP3                 6.748 (3.011)   4.683 (.391)   1/10
PS_CXP4                 6.541 (3.799)   4.725 (.377)   2/10
PS_CXP5                 5.774 (3.954)   4.765 (.310)   1/10
PS_CXP6                 5.257 (3.719)   4.640 (.336)   2/10
PS_CXP7                 2.800 (6.044)   4.655 (.341)   2/10
PS_CXP8                 2.643 (5.111)   4.638 (.351)   1/10
PS_CXP9                 1.363 (4.796)   4.640 (.370)   0/10
PS_CXP10                1.035 (4.003)   4.622 (.404)   0/10
Note: The values are calculated from the 10 simulated data sets. The Significance column reports the number of simulated data sets (out of 10) with a treatment effect significant at p < .05. Model 10: τ̂_t = (1/n_t) Σ_{k=1}^{n_t} {(Y_k | Z = 1) − (Ŷ_k | Z = 0)}.

Impact of a confounding variable. Frank (2000) derived an index for quantifying the impact of a confounding variable on the inference about a predictor's coefficient in a regression model. The impact is defined as the product of two correlations: the correlation of the confounding variable with the dependent variable, and the correlation of the confounding variable with the predictor. The larger a confounding variable's impact, the more likely the inference about the predictor is to change when that confounding variable is added to the regression model. Here, the index is used to compute the impacts of suppressor X, the covariates Pj, and the PS variables on the treatment indicator.
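This index is a one-line computation; the sketch below, assuming numpy, reproduces the impact of the suppressor reported in Table 11:

# A minimal sketch (assuming numpy) of Frank's (2000) impact index as
# used here: the product of a candidate confounder's correlations with
# the outcome and with the treatment indicator.
import numpy as np

def impact(cv, y, z):
    """Impact of confounding variable cv on the inference about Z."""
    return np.corrcoef(cv, y)[0, 1] * np.corrcoef(cv, z)[0, 1]

# For the suppressor X in Table 11: impact(x, y, z) ~ (-.058)(.628) = -.036.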
Table 11 shows that X and PS_CX have the largest impacts, both −.036. This indicates that adding the suppressor, or the PS generated with the suppressor, to the regression models is the most likely to affect the estimate of the treatment effect. Although these impacts are small, the results of the previous models show that they still influence the estimates for the treatment indicator significantly. Moreover, the magnitudes of the impacts of the PS_CXPj decrease from PS_CXP1 to PS_CXP8 and then increase from PS_CXP8 to PS_CXP10. These changes follow the variation in the correlations between the PS_CXPj and Y: the correlation coefficients increase steadily from negative to positive, from PS_CXP1 (−.055) to PS_CXP10 (.016), although the differences are small. Since the PS is the estimated probability that an individual is in the treatment group, the results indicate that, conditional on the suppressor and the stronger covariates P, an individual with a higher probability of being in the treatment group tends to have a higher outcome value. As a result, when stronger P's are used to estimate the PS's, the fitted lines relating the PS's to the outcome are pulled up from negative to positive slopes. Moreover, the impacts of the PS_CPj are all close to zero. In Table 11, the correlations between the PS_CPj and Y first decrease from PS_CP1 to PS_CP8 and then increase from PS_CP8 to PS_CP10; because these correlation coefficients are extremely small, their values can easily be influenced by individuals with more extreme values of PS_CPj and Y.

Table 11
Impact of Suppressor, P's, and the Propensity Scores on the Treatment Indicator – Classical Suppression Example

            Correlation with Y   Correlation with Z   Impact
X           -.058                .628                 -.036
P1          .139                 -.008                -.001
P2          .241                 -.008                -.002
P3          .340                 -.008                -.003
P4          .443                 -.008                -.004
P5          .546                 -.007                -.004
P6          .648                 -.007                -.005
P7          .730                 -.006                -.004
P8          .819                 -.005                -.004
P9          .906                 -.004                -.004
P10         .973                 -.002                -.002
PS_C        .010                 .069                 .001
PS_CP1      .001                 .073                 < .001
PS_CP2      -.006                .073                 > -.001
PS_CP3      -.012                .073                 -.001
PS_CP4      -.018                .073                 -.001
PS_CP5      -.024                .072                 -.002
PS_CP6      -.028                .072                 -.002
PS_CP7      -.030                .071                 -.002
PS_CP8      -.030                .071                 -.002
PS_CP9      -.026                .070                 -.002
PS_CP10     -.015                .070                 -.001
PS_CX       -.057                .633                 -.036
PS_CXP1     -.055                .634                 -.035
PS_CXP2     -.053                .634                 -.034
PS_CXP3     -.048                .634                 -.030
PS_CXP4     -.042                .634                 -.027
PS_CXP5     -.035                .634                 -.022
PS_CXP6     -.026                .635                 -.017
PS_CXP7     -.017                .635                 -.011
PS_CXP8     -.006                .635                 -.004
PS_CXP9     .006                 .636                 .004
PS_CXP10    .016                 .636                 .010
Note: The values are the means calculated from the 10 simulated data sets. Impact is the product of the correlations with Y and Z.

Chapter 5 EXAMPLE of RECIPROCAL SUPPRESSION

Data

For the example of reciprocal suppression, 10 simulated data sets with outcome Y, treatment indicator Z, suppressor X, and covariates Vi are likewise generated, with 1,000 subjects each. In each data set, the numbers of subjects in the treatment group (Z = 1) and the control group (Z = 0) both equal 500. In this example, the correlation r_yz is not zero, and the correlations r_yx and r_zx are statistically significant. The R² of Model 2, which regresses Y on Z and X, is larger than the sum of r_yz² and r_yx². Moreover, the coefficient of Z in Model 2 is significant, with a sign opposite to r_yz.

Testing validity of simulation data sets.
The correlations r_yz, r_yx, and r_zx and the values of R² from Model 1 and Model 2 are reported in Table 12 for each simulated data set. In every data set, the correlation r_yz is not zero, both r_yz and r_yx are positive and significant, and the value of R² in Model 2 is larger than the sum of r²_yz and r²_yx. The estimated treatment effect is positive and significant in Model 1 and becomes negative and significant once X is added in Model 2, for each data set. These results imply that X increases the predictive validity of Z, and that the sign of the correlation between Z and Y is opposite to that of the estimated coefficient of Z. All data sets satisfy the conditions established for reciprocal suppression.

Table 12
Reciprocal Suppression Data Results

                     r_yz      r_yx      r_zx      Model 1 B1   Model 1 R²   Model 2 B1   Model 2 R²
Simulated Data 1     .215***   .534***   .751***   2.009***     .046         -3.996***    .365
Simulated Data 2     .214***   .534***   .750***   2.009***     .046         -3.996***    .365
Simulated Data 3     .214***   .534***   .750***   2.009***     .046         -3.996***    .365
Simulated Data 4     .215***   .534***   .751***   2.008***     .046         -3.996***    .365
Simulated Data 5     .214***   .534***   .749***   2.009***     .046         -3.996***    .365
Simulated Data 6     .214***   .534***   .750***   2.009***     .046         -3.996***    .365
Simulated Data 7     .214***   .534***   .749***   2.009***     .046         -3.996***    .365
Simulated Data 8     .214***   .534***   .750***   2.009***     .046         -3.996***    .365
Simulated Data 9     .215***   .534***   .750***   2.008***     .046         -3.996***    .365
Simulated Data 10    .214***   .534***   .749***   2.009***     .046         -3.996***    .365
Note: B1 is the coefficient for treatment indicator Z. Model 1 is Y = β0 + β1 Z + ε and Model 2 is Y = β0 + β1 Z + β2 X + ε. *p < .05. **p < .01. ***p < .001.

Table 13 reports the means and standard deviations (SD) of the correlations among Y, Z, and X across the 10 data sets: r_yz is .214, r_yx is .534, and r_zx is .750. The standard deviations are at most .001, showing that all simulated data sets satisfy the given constraints precisely.

Table 13
Correlation Table for Simulated Variables – Reciprocal Suppression Example

                 Outcome (Y) Mean (SD)   Treatment (Z) Mean (SD)   Suppressor (X) Mean (SD)
Outcome (Y)      --
Treatment (Z)    .214 (<.001)            --
Suppressor (X)   .534 (<.001)            .750 (.001)               --
Note: The values are calculated from 10 simulated data sets.

Covariates P's are again generated by using the non-linear function P = R + C × sin(R). Table 14 indicates that the correlations between the P's and Y increase monotonically from P1 (.148) to P10 (.950) at an approximate rate of .10 per step. The correlations between the P's and Z are negative and move closer to zero from P1 (-.027) to P10 (-.006). The correlations between the P's and X are positive and increase slightly from P1 (.034) to P10 (.367), at an approximate rate of .04 per step. A covariate Pj has a stronger effect on Y when its correlation with Y is larger. Since the covariates P's are generated from the unexplained residuals, the stronger the Pj in the model, the more closely the unconfoundedness assumption is fulfilled. In this example, the correlation between P10 and Y is .950, which is extremely high. As a result, by controlling P10 in the regression model, the unconfoundedness assumption can be approximately fulfilled.
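The sketch below illustrates both steps described above, assuming R denotes the residuals from a simple regression of Y on Z and C is a tuning constant controlling the strength of Pj; the study's actual constants are not reproduced here. The second function checks the reciprocal suppression conditions for one data set.

import numpy as np

def make_p(y, z, c):
    # A sketch of the covariate construction: R is the unexplained residual
    # of Y given Z, and c is an assumed tuning constant.
    slope, intercept = np.polyfit(z, y, 1)
    r = y - (intercept + slope * z)
    return r + c * np.sin(r)          # non-linear transform P = R + C*sin(R)

def is_reciprocal_suppression(y, z, x):
    r_yz = np.corrcoef(y, z)[0, 1]
    r_yx = np.corrcoef(y, x)[0, 1]
    # R^2 from regressing Y on both Z and X (Model 2)
    design = np.column_stack([np.ones(len(z)), z, x])
    beta = np.linalg.lstsq(design, y, rcond=None)[0]
    resid = y - design.dot(beta)
    r2_full = 1 - resid.var() / y.var()
    sign_flip = np.sign(beta[1]) != np.sign(r_yz)
    return (r2_full > r_yz**2 + r_yx**2) and sign_flip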
Table 14
Correlation Table for Simulated Variables and P's – Reciprocal Suppression Example

P's    Outcome (Y) Mean (SD)   Treatment (Z) Mean (SD)   Suppressor (X) Mean (SD)
P1     .148 (.032)             -.027 (.046)              .034 (.039)
P2     .244 (.030)             -.027 (.045)              .072 (.038)
P3     .355 (.027)             -.026 (.043)              .117 (.037)
P4     .441 (.024)             -.025 (.041)              .151 (.035)
P5     .535 (.020)             -.023 (.039)              .190 (.033)
P6     .617 (.017)             -.021 (.036)              .223 (.031)
P7     .718 (.012)             -.019 (.031)              .265 (.027)
P8     .830 (.007)             -.015 (.024)              .313 (.021)
P9     .910 (.004)             -.010 (.017)              .348 (.015)
P10    .950 (.002)             -.006 (.010)              .367 (.009)
Note: The values are calculated from 10 simulated data sets.

Regression Models

Table 15 reports the estimates for the treatment indicator from the regression models. In Model 3, the standardized treatment effect is .213; the value drops dramatically to -.426 when the suppressor is added in Model 4. Both estimates are statistically significant. This provides evidence that the suppressor has a strong impact on the estimation of the treatment effect, turning the estimate from positive and significant to negative and significant. It also implies that models with and without the suppressor generate contradictory inferences about the treatment effect: the treatment significantly raises individuals' outcome values in one model but significantly lowers them in the other.

Across Models 3.1 to 3.10, the estimated treatment effects are quite consistent no matter which level of P is controlled. However, in the models controlling the stronger covariates P's, the estimated standard errors become smaller; the same result was found in the example of classical suppression. In Model 3.10, controlling the strongest covariate P10 implies that the unconfoundedness assumption is approximately fulfilled, so its estimated treatment effect of .220 can be treated as the approximately true treatment effect. This value is quite close to the estimate of .213 in Model 3.

The estimated treatment effect with the suppressor added in Model 4 is -.426, which differs markedly from the approximately true treatment effect. Comparing Models 4.1 to 4.10, the estimated treatment effects change from negative to positive and the estimated standard errors become smaller as stronger covariates P's are included. The estimated standardized treatment effect is -.413 with an estimated standard error of 0.356 in Model 4.1; the corresponding estimates are .169 and 0.109 in Model 4.10. The influence of the suppressor on the treatment indicator thus becomes smaller with a stronger covariate Pj, and when the strongest covariate P10 is controlled, the estimated treatment effect approaches the approximately true treatment effect. These findings imply that the impact of the suppressor can be eliminated by controlling a strong-enough covariate Pj. Meanwhile, the estimated standard errors shrink as stronger covariates P's are applied, indicating that controlling stronger covariates P's improves the precision of the estimated treatment effect. These findings are consistent with those in the example of classical suppression.
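A minimal sketch of the Model 3 versus Model 4 comparison is given below, assuming y, z, and x are NumPy arrays and v is an n-by-5 array holding the five random covariates Vi implied by the model notation; the statsmodels fits are stand-ins for the study's actual estimation code.

import numpy as np
import statsmodels.api as sm

def fit_treatment_models(y, z, x, v):
    base = sm.add_constant(np.column_stack([z, v]))
    m3 = sm.OLS(y, base).fit()          # Model 3: without the suppressor
    with_x = sm.add_constant(np.column_stack([z, x, v]))
    m4 = sm.OLS(y, with_x).fit()        # Model 4: with the suppressor
    # Column 1 is Z in both designs; compare its coefficient and SE.
    return (m3.params[1], m3.bse[1]), (m4.params[1], m4.bse[1])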
Table 15
The Estimated Treatment Effects of Regression Models – Reciprocal Suppression Example

Model         Pj     B Mean (SD)       SE(B) Mean (SD)   β Mean (SD)    Significance p < .05
Without Suppressor
Model 3       --     1.999 (0.030)     0.291 (0.001)     .213 (.003)    10/10
Model 3.1     P1     2.040 (0.096)     0.287 (0.002)     .218 (.010)    10/10
Model 3.2     P2     2.063 (0.132)     0.281 (0.003)     .220 (.014)    10/10
Model 3.3     P3     2.087 (0.169)     0.270 (0.003)     .223 (.018)    10/10
Model 3.4     P4     2.103 (0.193)     0.259 (0.004)     .224 (.020)    10/10
Model 3.5     P5     2.117 (0.212)     0.242 (0.004)     .226 (.022)    10/10
Model 3.6     P6     2.125 (0.221)     0.224 (0.004)     .227 (.024)    10/10
Model 3.7     P7     2.128 (0.220)     0.196 (0.004)     .227 (.023)    10/10
Model 3.8     P8     2.117 (0.193)     0.151 (0.003)     .226 (.021)    10/10
Model 3.9     P9     2.091 (0.144)     0.104 (0.002)     .223 (.015)    10/10
Model 3.10    P10    2.063 (0.094)     0.065 (0.001)     .220 (.010)    10/10
With Suppressor
Model 4       --     -3.996 (0.026)    0.358 (0.002)     -.426 (.003)   10/10
Model 4.1     P1     -3.870 (0.087)    0.356 (0.002)     -.413 (.009)   10/10
Model 4.2     P2     -3.691 (0.122)    0.353 (0.003)     -.394 (.013)   10/10
Model 4.3     P3     -3.378 (0.155)    0.346 (0.004)     -.360 (.017)   10/10
Model 4.4     P4     -3.042 (0.176)    0.337 (0.005)     -.324 (.019)   10/10
Model 4.5     P5     -2.572 (0.191)    0.325 (0.005)     -.274 (.020)   10/10
Model 4.6     P6     -2.056 (0.196)    0.310 (0.006)     -.219 (.021)   10/10
Model 4.7     P7     -1.264 (0.189)    0.283 (0.006)     -.135 (.020)   10/10
Model 4.8     P8     -0.095 (0.162)    0.233 (0.005)     -.010 (.017)   0/10
Model 4.9     P9     0.961 (0.124)     0.169 (0.003)     .103 (.013)    10/10
Model 4.10    P10    1.589 (0.085)     0.109 (0.002)     .169 (.009)    10/10
Note: The values are calculated from 10 simulated data sets. The Significance column reports the number of data sets (out of 10) with a significant treatment effect at p < .05. Model 3: Y = β0 + β1 Z + β′_{1+i} V_i + ε. Models 3.1–3.10: Y = β0 + β1 Z + β′_{1+i} V_i + β7 P_j + ε. Model 4: Y = β0 + β1 Z + β2 X + β′_{2+i} V_i + ε. Models 4.1–4.10: Y = β0 + β1 Z + β2 X + β′_{2+i} V_i + β8 P_j + ε.

PS Methods

Before the treatment effects are estimated with the PS methods, the predicted PS's are obtained by including different combinations of variables in logistic regression models. Table 16 reports the correlations between the PS's and the simulated variables Y, Z, and X. The correlations between PS_C, which is estimated from the covariates Vi only, and Y, Z, and X are all small: .032, .069, and .059, respectively. For PS_CX, which is estimated from suppressor X and covariates Vi, the correlation with Y is .509, and the correlations with Z and X are quite large, .762 and .977, respectively.

For the PS's predicted from covariates Vi and the different levels of P's, PS_CP1 to PS_CP10, the correlations with Y are all negative: they strengthen from PS_CP1 (-.016) to PS_CP7 (-.154) and then weaken from PS_CP8 (-.150) to PS_CP10 (-.079). The correlations of PS_CP1 to PS_CP10 with Z are all positive and decrease slightly from PS_CP1 (.086) to PS_CP10 (.070). Their correlations with X vary a little from model to model, but the values are all close to zero, so the differences are negligible.

For the PS's predicted from suppressor X, covariates Vi, and the different levels of P's, PS_CXP1 to PS_CXP10, the correlations with Y are all positive and decrease from PS_CXP1 (.509) to PS_CXP10 (.197). The correlations with Z are all quite large and increase slightly from PS_CXP1 (.762) to PS_CXP10 (.851). The correlations with X are also large but decrease slightly from PS_CXP1 (.977) to PS_CXP10 (.880).

PS as a covariate.
The estimates of the treatment effect obtained by using the PS as a covariate are reported in Table 17. Comparing Table 15 and Table 17, the estimated treatment effects from the PS-as-a-covariate method are quite similar to the corresponding regression estimates; the estimated standard errors, however, differ between the two methods. Comparing Model 7 and Model 8, the standardized treatment effect changes from .213 to -.416 when the PS model includes the suppressor. This provides evidence that the suppressor strongly influences the estimation of the treatment indicator, turning the estimated treatment effect from positive and significant to negative and significant. It also implies that models using PS's estimated with and without the suppressor generate different inferences about the treatment effect.

Across Models 7.1 to 7.10, the estimates of the treatment effect remain quite consistent. For Model 7.10, with the strongest covariate P10 involved, the estimate is .220, the same as the approximately true treatment effect in Model 3.10. From Model 8.1 to Model 8.10, the estimated treatment effects increase from negative (-.402) to positive (.170) as stronger covariates P's are applied. In Model 8.10, the estimate of .170 is close to the approximately true treatment effect of .220 found in Models 3.10 and 7.10. These results imply that, with a strong-enough covariate Pj, the influence of the suppressor on the treatment indicator can also be eliminated under the PS-as-a-covariate method.

As in the example of classical suppression, the estimated standard errors in Models 7.1 to 7.10 and 8.1 to 8.10 do not shrink when PS's estimated with stronger P's are used as covariates, as they do in the regression models. In this example, the regression standard errors decrease when stronger covariates P's are controlled, whereas in Models 7.1 to 7.10 the estimated standard errors stay almost the same regardless of which level of P enters the PS. In Models 8.1 to 8.10, where the PS's are estimated with the suppressor, the standard errors even increase as stronger covariates P's are involved. As explained in Chapter 4, when a regression model has smaller standardized coefficients, its R² tends to be smaller, and a smaller R² yields larger standard errors for the predictors in the model. Table 18 shows that the stronger the covariate Pj, the dramatically larger its absolute standardized coefficient in the regression models, with or without the suppressor. In the PS-as-a-covariate models, by contrast, the absolute standardized coefficients of the PS_CPj are relatively small, first increasing slightly and then decreasing. For the PS's involving the suppressor and the different levels of P's, the absolute standardized coefficients of the PS_CXPj fall from .806 (PS_CXP1) to .052 (PS_CXP10). These changes correspond to the estimated standard errors in Models 7.1 to 7.10 and 8.1 to 8.10: the smaller the absolute standardized coefficient of the PS, the larger the standard error of the estimated treatment effect.
These results provide evidence that using a PS estimated with a stronger Pj as a covariate in the regression model does not improve the fitted line or increase R² the way directly adding the stronger Pj as a covariate does. As a result, the estimated standard errors do not become smaller when a PS with a stronger Pj is used as a covariate.

Table 16
Correlation Table for Simulated Variables and Propensity Scores – Reciprocal Suppression Example

Model        PS          Outcome (Y) Mean (SD)   Treatment (Z) Mean (SD)   Suppressor (X) Mean (SD)
Model 5      PS_C        .032 (.045)             .069 (.023)               .059 (.032)
Model 5.1    PS_CP1      -.016 (.110)            .086 (.024)               .055 (.050)
Model 5.2    PS_CP2      -.045 (.154)            .085 (.024)               .043 (.066)
Model 5.3    PS_CP3      -.078 (.201)            .084 (.024)               .029 (.085)
Model 5.4    PS_CP4      -.101 (.235)            .083 (.024)               .019 (.098)
Model 5.5    PS_CP5      -.124 (.266)            .081 (.023)               .023 (.109)
Model 5.6    PS_CP6      -.141 (.287)            .080 (.023)               .001 (.120)
Model 5.7    PS_CP7      -.154 (.300)            .078 (.023)               -.006 (.126)
Model 5.8    PS_CP8      -.150 (.283)            .074 (.023)               -.008 (.120)
Model 5.9    PS_CP9      -.120 (.225)            .072 (.023)               .001 (.100)
Model 5.10   PS_CP10     -.079 (.153)            .070 (.023)               .016 (.074)
Model 6      PS_CX       .509 (.003)             .762 (.002)               .977 (.002)
Model 6.1    PS_CXP1     .499 (.010)             .765 (.003)               .973 (.004)
Model 6.2    PS_CXP2     .484 (.013)             .768 (.004)               .969 (.005)
Model 6.3    PS_CXP3     .459 (.015)             .774 (.005)               .962 (.006)
Model 6.4    PS_CXP4     .434 (.016)             .780 (.006)               .954 (.006)
Model 6.5    PS_CXP5     .401 (.016)             .788 (.006)               .945 (.007)
Model 6.6    PS_CXP6     .367 (.015)             .798 (.007)               .935 (.006)
Model 6.7    PS_CXP7     .321 (.012)             .811 (.007)               .921 (.006)
Model 6.8    PS_CXP8     .263 (.009)             .829 (.006)               .902 (.005)
Model 6.9    PS_CXP9     .219 (.006)             .843 (.005)               .888 (.004)
Model 6.10   PS_CXP10    .197 (.004)             .851 (.004)               .880 (.003)
Note: The values are calculated from 10 simulated data sets. Model 5: log(P(Z = 1)/(1 − P(Z = 1))) = β0 + β′_i V_i + ε. Models 5.1–5.10: log(P(Z = 1)/(1 − P(Z = 1))) = β0 + β′_i V_i + β6 P_j + ε. Model 6: log(P(Z = 1)/(1 − P(Z = 1))) = β0 + β1 X + β′_{1+i} V_i + ε. Models 6.1–6.10: log(P(Z = 1)/(1 − P(Z = 1))) = β0 + β1 X + β′_{1+i} V_i + β7 P_j + ε.
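The sketch below shows how PS's of this kind can be estimated with a logistic regression, assuming z is the 0/1 treatment indicator, v the random covariates, x the suppressor, and p one of the covariates Pj; the function name and arguments are illustrative, not the study's code.

import numpy as np
import statsmodels.api as sm

def estimate_ps(z, v, x=None, p=None):
    # Build the design matrix from whichever variable sets are supplied,
    # mirroring Models 5, 5.j, 6, and 6.j above.
    cols = [v]
    if x is not None:
        cols.append(x.reshape(-1, 1))
    if p is not None:
        cols.append(p.reshape(-1, 1))
    design = sm.add_constant(np.hstack(cols))
    return sm.Logit(z, design).fit(disp=0).predict(design)  # fitted P(Z = 1)

# ps_c   = estimate_ps(z, v)             # Model 5   -> PS_C
# ps_cx  = estimate_ps(z, v, x=x)        # Model 6   -> PS_CX
# ps_cxp = estimate_ps(z, v, x=x, p=p5)  # Model 6.5 -> PS_CXP5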
Table 17
The Estimated Treatment Effects of Propensity Score as a Covariate Models – Reciprocal Suppression Example

Model        Pj     B Mean (SD)       SE(B) Mean (SD)   β Mean (SD)    Significance p < .05
Without Suppressor
Model 7      --     1.999 (0.030)     0.291 (0.001)     .213 (.003)    10/10
Model 7.1    P1     2.040 (0.096)     0.289 (0.002)     .218 (.010)    10/10
Model 7.2    P2     2.063 (0.132)     0.287 (0.004)     .220 (.014)    10/10
Model 7.3    P3     2.087 (0.168)     0.284 (0.006)     .223 (.018)    10/10
Model 7.4    P4     2.103 (0.192)     0.281 (0.008)     .224 (.020)    10/10
Model 7.5    P5     2.117 (0.212)     0.278 (0.011)     .226 (.022)    10/10
Model 7.6    P6     2.125 (0.221)     0.275 (0.013)     .227 (.024)    10/10
Model 7.7    P7     2.128 (0.220)     0.273 (0.015)     .227 (.023)    10/10
Model 7.8    P8     2.117 (0.194)     0.275 (0.014)     .226 (.021)    10/10
Model 7.9    P9     2.091 (0.144)     0.281 (0.009)     .223 (.015)    10/10
Model 7.10   P10    2.062 (0.094)     0.286 (0.005)     .220 (.010)    10/10
With Suppressor
Model 8      --     -3.896 (0.085)    0.375 (0.002)     -.416 (.009)   10/10
Model 8.1    P1     -3.772 (0.145)    0.381 (0.006)     -.402 (.015)   10/10
Model 8.2    P2     -3.599 (0.171)    0.389 (0.008)     -.384 (.018)   10/10
Model 8.3    P3     -3.298 (0.193)    0.403 (0.009)     -.352 (.021)   10/10
Model 8.4    P4     -2.975 (0.205)    0.417 (0.010)     -.317 (.022)   10/10
Model 8.5    P5     -2.524 (0.209)    0.435 (0.010)     -.269 (.022)   10/10
Model 8.6    P6     -2.024 (0.207)    0.453 (0.011)     -.216 (.022)   10/10
Model 8.7    P7     -1.251 (0.198)    0.479 (0.011)     -.133 (.021)   9/10
Model 8.8    P8     -0.097 (0.179)    0.512 (0.010)     -.010 (.019)   0/10
Model 8.9    P9     0.959 (0.148)     0.538 (0.009)     .102 (.016)    2/10
Model 8.10   P10    1.590 (0.104)     0.552 (0.008)     .170 (.011)    10/10
Note: The values are calculated from 10 simulated data sets. The Significance column reports the number of data sets (out of 10) with a significant treatment effect at p < .05. Model 7: Y = β0 + β1 Z + β2 PS_C + ε. Models 7.1–7.10: Y = β0 + β1 Z + β2 PS_CP_j + ε. Model 8: Y = β0 + β1 Z + β2 PS_CX + ε. Models 8.1–8.10: Y = β0 + β1 Z + β2 PS_CXP_j + ε.
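A minimal sketch of the PS-as-a-covariate estimator follows, assuming y, z, and a fitted PS vector are available; the outcome regression simply includes the PS next to the treatment indicator, as in Models 7 and 8 above.

import numpy as np
import statsmodels.api as sm

def ps_as_covariate(y, z, ps):
    design = sm.add_constant(np.column_stack([z, ps]))
    fit = sm.OLS(y, design).fit()
    return fit.params[1], fit.bse[1]   # coefficient and SE of Z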
Table 18
Coefficients of P's in Regression Models and Coefficients of Propensity Scores in PS as a Covariate Models – Reciprocal Suppression Example

Regression                                    PS as a covariate
Model   Pj    β Mean (SD)   SE Mean (SD)      Model   PS         β Mean (SD)   SE Mean (SD)
Without Suppressor
3.1     P1    .154 (.032)   0.201 (0.003)     7.1     PS_CP1     -.035 (.112)  3.655 (1.139)
3.2     P2    .250 (.030)   0.192 (0.003)     7.2     PS_CP2     -.064 (.155)  3.656 (1.140)
3.3     P3    .360 (.027)   0.178 (0.004)     7.3     PS_CP3     -.096 (.203)  3.661 (1.141)
3.4     P4    .447 (.024)   0.163 (0.004)     7.4     PS_CP4     -.120 (.237)  3.672 (1.144)
3.5     P5    .540 (.020)   0.143 (0.004)     7.5     PS_CP5     -.143 (.268)  3.695 (1.153)
3.6     P6    .622 (.016)   0.122 (0.003)     7.6     PS_CP6     -.159 (.289)  3.734 (1.167)
3.7     P7    .722 (.011)   0.094 (0.003)     7.7     PS_CP7     -.171 (.301)  3.821 (1.202)
3.8     P8    .834 (.006)   0.056 (0.002)     7.8     PS_CP8     -.167 (.283)  4.022 (1.295)
3.9     P9    .912 (.003)   0.026 (0.001)     7.9     PS_CP9     -.136 (.224)  4.288 (1.441)
3.10    P10   .952 (.001)   0.010 (0.001)     7.10    PS_CP10    -.094 (.152)  4.492 (1.567)
With Suppressor
4.1     P1    .108 (.026)   0.165 (0.002)     8.1     PS_CXP1    .806 (.021)   0.495 (0.006)
4.2     P2    .175 (.026)   0.160 (0.003)     8.2     PS_CXP2    .779 (.025)   0.503 (0.008)
4.3     P3    .256 (.024)   0.152 (0.003)     8.3     PS_CXP3    .731 (.029)   0.517 (0.009)
4.4     P4    .323 (.023)   0.142 (0.003)     8.4     PS_CXP4    .682 (.031)   0.531 (0.009)
4.5     P5    .403 (.020)   0.129 (0.004)     8.5     PS_CXP5    .613 (.031)   0.548 (0.010)
4.6     P6    .481 (.018)   0.115 (0.003)     8.6     PS_CXP6    .539 (.031)   0.565 (0.010)
4.7     P7    .588 (.014)   0.093 (0.003)     8.7     PS_CXP7    .429 (.029)   0.588 (0.010)
4.8     P8    .733 (.008)   0.060 (0.002)     8.8     PS_CXP8    .271 (.025)   0.616 (0.010)
4.9     P9    .856 (.004)   0.030 (0.001)     8.9     PS_CXP9    .133 (.019)   0.637 (0.009)
4.10    P10   .927 (.003)   0.012 (0.001)     8.10    PS_CXP10   .052 (.013)   0.648 (0.008)
Note: The values are calculated from 10 simulated data sets. Models 3.1–3.10: Y = β0 + β1 Z + β′_{1+i} V_i + β7 P_j + ε; β7 is reported. Models 4.1–4.10: Y = β0 + β1 Z + β2 X + β′_{2+i} V_i + β8 P_j + ε; β8 is reported. Models 7.1–7.10: Y = β0 + β1 Z + β2 PS_CP_j + ε; β2 is reported. Models 8.1–8.10: Y = β0 + β1 Z + β2 PS_CXP_j + ε; β2 is reported.

PS weighting. Tables 19.1 and 19.2 report the estimated average treatment effects (ATE) without and with trimming the weights at the 95th percentile, respectively, for the different types of weights, which are generated from the corresponding PS's. Tables 20.1 and 20.2 likewise report the estimated average treatment effects on the treated (ATT) without and with trimming at the 95th percentile.

In the PS weighting method, the suppressor again influences the estimation of the treatment indicator and turns the estimated treatment effect from positive to negative. The impact of the suppressor appears in both the ATE and the ATT models, regardless of whether weight trimming is applied, and the estimates are all similar to those of the corresponding PS-as-a-covariate models. As in the example of classical suppression, when the suppressor is not used in estimating the treatment effect, the ATE and ATT estimates are almost identical, as are the estimated standard errors, with or without weight trimming. This provides evidence that the distribution of outcomes for individuals in the control group is similar to that for all individuals. When the suppressor is used in estimating the treatment effect, its impact on the treatment indicator tends to be smaller for ATT than for ATE.
For example, the estimated standardized treatment effect with PS_CX is -.338 for ATT, which is closer to the approximately true treatment effect of .220 than the ATE estimate of -.471. When weight trimming is applied, the ATT estimate of .007 is no longer even negative, whereas the corresponding ATE estimate is -.129. This implies that weight trimming removes more of the suppressor's impact when no strong-enough covariate Pj is involved in the model. As mentioned in Chapter 4, weight trimming removes the individuals with extreme values who cause biased estimation. However, when the unconfoundedness assumption is fulfilled, removing any observation from the sample may discard information that is essential for unbiased estimation. For example, with the weights generated from PS_CXP10, the estimated ATE of .237 without trimming is closer to the approximately true treatment effect of .220 than the trimmed estimate of .135.

As in all the other models, the stronger the covariates P's involved, the more completely the impact of the suppressor is eliminated under PS weighting. For example, the estimated ATE's without weight trimming change from -.454 with PS_CXP1 to .237 with PS_CXP10, and the estimate with the strongest P10 is close to the approximately true treatment effect of .220.

Compared with the regression and PS-as-a-covariate models, the standard deviations of the estimated treatment effects are larger under PS weighting. This implies that with PS weighting the results vary across the simulated data sets, so different inferences about the treatment effect are drawn, especially when the suppressor is involved. In Table 19.1, for the model using the weights generated from PS_CXP8, two of the ten estimated treatment effects are negative and significant while three are positive and significant.

Consistent with the previous findings, the estimated standard errors in the PS weighting models do not decrease when stronger covariates P's are involved, although they do decrease in the regression models. Under PS weighting, the only predictor in the outcome regression is the treatment indicator. Unless the absolute standardized coefficient of the treatment indicator increases, the model cannot improve the fitted line and thereby increase R². As a result, the estimated standard errors cannot shrink when stronger covariates P's are applied in the PS weighting models.
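The sketch below illustrates the two weighting estimators and the trimming rule, assuming ps holds fitted propensity scores. Whether trimming drops or caps the extreme weights is an implementation choice; dropping observations above the 95th percentile is assumed here to match the text's description of removing individuals.

import numpy as np
import statsmodels.api as sm

def ipw_estimate(y, z, ps, estimand="ATE", trim=False):
    if estimand == "ATE":
        w = z / ps + (1 - z) / (1 - ps)        # inverse-probability weights
    else:
        w = z + (1 - z) * ps / (1 - ps)        # ATT: controls reweighted to treated
    keep = np.ones(len(w), dtype=bool)
    if trim:
        # Assumed trimming rule: drop weights above the 95th percentile.
        keep = w <= np.percentile(w, 95)
    design = sm.add_constant(z[keep])
    fit = sm.WLS(y[keep], design, weights=w[keep]).fit()
    return fit.params[1], fit.bse[1]           # coefficient and SE of Z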
Table 19.1
The Estimated Average Treatment Effects (ATE) of Propensity Score Weighting – Reciprocal Suppression Example: Without Trimming

Propensity Scores Used   B Mean (SD)       SE(B) Mean (SD)   β Mean (SD)    Significance p < .05
Without Suppressor
PS_C                     1.992 (0.040)     0.308 (0.003)     .201 (.005)    10/10
PS_CP1                   2.055 (0.142)     0.308 (0.003)     .207 (.014)    10/10
PS_CP2                   2.089 (0.195)     0.307 (0.003)     .210 (.019)    10/10
PS_CP3                   2.124 (0.250)     0.307 (0.003)     .212 (.025)    10/10
PS_CP4                   2.148 (0.285)     0.307 (0.003)     .216 (.028)    10/10
PS_CP5                   2.169 (0.314)     0.307 (0.003)     .218 (.031)    10/10
PS_CP6                   2.181 (0.328)     0.307 (0.003)     .219 (.032)    10/10
PS_CP7                   2.186 (0.326)     0.307 (0.003)     .219 (.032)    10/10
PS_CP8                   2.169 (0.286)     0.308 (0.003)     .218 (.028)    10/10
PS_CP9                   2.130 (0.212)     0.308 (0.003)     .214 (.021)    10/10
PS_CP10                  2.089 (0.138)     0.308 (0.003)     .210 (.014)    10/10
With Suppressor
PS_CX                    -5.764 (0.559)    0.342 (0.022)     -.471 (.038)   10/10
PS_CXP1                  -5.437 (0.598)    0.337 (0.014)     -.454 (.042)   10/10
PS_CXP2                  -5.121 (0.622)    0.346 (0.012)     -.436 (.042)   10/10
PS_CXP3                  -4.617 (0.655)    0.330 (0.013)     -.404 (.044)   10/10
PS_CXP4                  -4.102 (0.681)    0.326 (0.014)     -.368 (.047)   10/10
PS_CXP5                  -3.403 (0.709)    0.322 (0.015)     -.315 (.052)   10/10
PS_CXP6                  -2.649 (0.738)    0.318 (0.017)     -.252 (.059)   10/10
PS_CXP7                  -1.506 (0.787)    0.313 (0.021)     -.146 (.073)   9/10
PS_CXP8                  0.154 (0.861)     0.309 (0.031)     .020 (.093)    2/10(-); 3/10(+)
PS_CXP9                  1.610 (1.047)     0.308 (0.041)     .165 (.108)    9/10
PS_CXP10                 2.397 (1.251)     0.308 (0.044)     .237 (.111)    10/10
Note: The values are calculated from 10 simulated data sets. The Significance column reports the number of data sets (out of 10) with a significant treatment effect at p < .05. Model 9: Y = β0 + β1 Z + ε.

Table 19.2
The Estimated Average Treatment Effects (ATE) of Propensity Score Weighting – Reciprocal Suppression Example: With Trimming at the 95th Percentile

Propensity Scores Used   B Mean (SD)       SE(B) Mean (SD)   β Mean (SD)    Significance p < .05
Without Suppressor
PS_C                     2.003 (0.082)     0.309 (0.003)     .206 (.008)    10/10
PS_CP1                   2.018 (0.101)     0.309 (0.003)     .207 (.009)    10/10
PS_CP2                   2.012 (0.116)     0.308 (0.003)     .207 (.011)    10/10
PS_CP3                   2.019 (0.124)     0.308 (0.003)     .208 (.012)    10/10
PS_CP4                   2.025 (0.129)     0.308 (0.002)     .209 (.012)    10/10
PS_CP5                   2.013 (0.115)     0.307 (0.003)     .208 (.011)    10/10
PS_CP6                   2.014 (0.110)     0.307 (0.003)     .208 (.011)    10/10
PS_CP7                   2.011 (0.094)     0.307 (0.003)     .208 (.010)    10/10
PS_CP8                   2.010 (0.090)     0.307 (0.003)     .208 (.008)    10/10
PS_CP9                   1.992 (0.099)     0.307 (0.003)     .206 (.009)    10/10
PS_CP10                  2.006 (0.089)     0.308 (0.003)     .207 (.008)    10/10
With Suppressor
PS_CX                    -1.368 (0.247)    0.342 (0.011)     -.129 (.023)   10/10
PS_CXP1                  -1.363 (0.370)    0.344 (0.009)     -.128 (.035)   10/10
PS_CXP2                  -1.212 (0.369)    0.350 (0.010)     -.112 (.033)   9/10
PS_CXP3                  -0.950 (0.431)    0.355 (0.010)     -.086 (.039)   8/10
PS_CXP4                  -0.778 (0.343)    0.365 (0.012)     -.069 (.031)   5/10
PS_CXP5                  -0.360 (0.342)    0.370 (0.012)     -.032 (.030)   2/10
PS_CXP6                  0.141 (0.345)     0.378 (0.014)     -.013 (.030)   0/10
PS_CXP7                  0.335 (0.416)     0.389 (0.010)     .028 (.035)    1/10
PS_CXP8                  0.977 (0.353)     0.405 (0.010)     .078 (.028)    7/10
PS_CXP9                  1.557 (0.315)     0.428 (0.015)     .117 (.022)    10/10
PS_CXP10                 1.835 (0.348)     0.438 (0.015)     .135 (.026)    10/10
Note: The values are calculated from 10 simulated data sets. The Significance column reports the number of data sets (out of 10) with a significant treatment effect at p < .05. Model 9: Y = β0 + β1 Z + ε.
Table 20.1
The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Weighting – Reciprocal Suppression Example: Without Trimming

Propensity Scores Used   B Mean (SD)       SE(B) Mean (SD)   β Mean (SD)    Significance p < .05
Without Suppressor
PS_C                     1.997 (0.022)     0.290 (0.001)     .213 (.003)    10/10
PS_CP1                   2.038 (0.094)     0.289 (0.001)     .218 (.010)    10/10
PS_CP2                   2.060 (0.129)     0.289 (0.001)     .220 (.013)    10/10
PS_CP3                   2.084 (0.164)     0.289 (0.001)     .222 (.017)    10/10
PS_CP4                   2.101 (0.187)     0.289 (0.001)     .224 (.019)    10/10
PS_CP5                   2.114 (0.205)     0.289 (0.001)     .225 (.021)    10/10
PS_CP6                   2.123 (0.214)     0.289 (0.001)     .226 (.022)    10/10
PS_CP7                   2.126 (0.212)     0.289 (0.001)     .226 (.022)    10/10
PS_CP8                   2.115 (0.185)     0.289 (0.001)     .225 (.019)    10/10
PS_CP9                   2.089 (0.136)     0.289 (0.001)     .223 (.014)    10/10
PS_CP10                  2.061 (0.087)     0.289 (0.001)     .220 (.009)    10/10
With Suppressor
PS_CX                    -3.248 (0.620)    0.285 (0.013)     -.338 (.052)   10/10
PS_CXP1                  -3.059 (0.640)    0.284 (0.011)     -.321 (.055)   10/10
PS_CXP2                  -2.857 (0.635)    0.284 (0.011)     -.302 (.056)   10/10
PS_CXP3                  -2.529 (0.634)    0.284 (0.010)     -.269 (.058)   10/10
PS_CXP4                  -2.188 (0.641)    0.284 (0.010)     -.234 (.061)   10/10
PS_CXP5                  -1.720 (0.661)    0.285 (0.010)     -.186 (.066)   10/10
PS_CXP6                  -1.205 (0.698)    0.286 (0.010)     -.130 (.073)   8/10
PS_CXP7                  -0.409 (0.775)    0.288 (0.011)     -.043 (.084)   5/10
PS_CXP8                  0.781 (0.866)     0.291 (0.019)     .086 (.097)    3/10
PS_CXP9                  1.864 (0.975)     0.294 (0.032)     .196 (.103)    9/10
PS_CXP10                 2.461 (1.148)     0.295 (0.037)     .252 (.108)    9/10
Note: The values are calculated from 10 simulated data sets. The Significance column reports the number of data sets (out of 10) with a significant treatment effect at p < .05. Model 9: Y = β0 + β1 Z + ε.

Table 20.2
The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Weighting – Reciprocal Suppression Example: With Trimming at the 95th Percentile

Propensity Scores Used   B Mean (SD)       SE(B) Mean (SD)   β Mean (SD)    Significance p < .05
Without Suppressor
PS_C                     2.036 (0.077)     0.298 (0.001)     .217 (.008)    10/10
PS_CP1                   2.002 (0.073)     0.298 (0.002)     .213 (.007)    10/10
PS_CP2                   2.001 (0.095)     0.298 (0.002)     .213 (.010)    10/10
PS_CP3                   1.985 (0.111)     0.298 (0.002)     .212 (.011)    10/10
PS_CP4                   1.989 (0.107)     0.298 (0.002)     .212 (.011)    10/10
PS_CP5                   1.982 (0.126)     0.297 (0.002)     .212 (.012)    10/10
PS_CP6                   1.959 (0.148)     0.296 (0.003)     .210 (.015)    10/10
PS_CP7                   1.954 (0.163)     0.296 (0.003)     .210 (.016)    10/10
PS_CP8                   1.944 (0.177)     0.296 (0.003)     .209 (.018)    10/10
PS_CP9                   1.961 (0.158)     0.297 (0.002)     .210 (.016)    10/10
PS_CP10                  1.999 (0.110)     0.297 (0.002)     .213 (.011)    10/10
With Suppressor
PS_CX                    0.086 (0.257)     0.390 (0.012)     .007 (.022)    0/10
PS_CXP1                  0.188 (0.209)     0.393 (0.012)     .016 (.018)    0/10
PS_CXP2                  0.256 (0.217)     0.397 (0.012)     .021 (.018)    0/10
PS_CXP3                  0.407 (0.152)     0.404 (0.012)     .033 (.012)    0/10
PS_CXP4                  0.539 (0.303)     0.411 (0.013)     .042 (.024)    2/10
PS_CXP5                  0.733 (0.250)     0.423 (0.013)     .056 (.020)    2/10
PS_CXP6                  0.907 (0.336)     0.437 (0.014)     .067 (.025)    6/10
PS_CXP7                  1.077 (0.327)     0.458 (0.013)     .076 (.023)    6/10
PS_CXP8                  1.538 (0.369)     0.493 (0.015)     .101 (.024)    9/10
PS_CXP9                  1.831 (0.512)     0.519 (0.013)     .113 (.031)    9/10
PS_CXP10                 2.159 (0.525)     0.539 (0.016)     .128 (.030)    9/10
Note: The values are calculated from 10 simulated data sets. The Significance column reports the number of data sets (out of 10) with a significant treatment effect at p < .05. Model 9: Y = β0 + β1 Z + ε.

PS matching.
Two matching methods are used: nearest neighbor matching and nearest neighbor matching within a caliper ε = .25σ_PS. The estimates are shown in Tables 21.1 and 21.2, respectively. As with the PS weighting methods, the standard deviations of the estimated treatment effects are larger, implying that the estimates vary across the simulated data sets; different inferences about the treatment effect may therefore be drawn, especially when the suppressor is involved in the models.

In Table 21.1, the estimated treatment effect changes from 2.065 when matching on PS_C to -4.246 when matching on PS_CX. This indicates that the suppressor affects the estimation of the treatment effect and significantly reverses the predicted direction of the treatment indicator's effect on the outcome. Moreover, the impact of the suppressor on the treatment indicator can again be eliminated in the PS matching method when stronger covariates P's are used to estimate the PS's: the estimated ATT changes from -3.349 when matching on PS_CXP1 to 2.746 when matching on PS_CXP10. When the unconfoundedness assumption is approximately fulfilled by matching on the PS estimated with the suppressor and the strongest covariate P10, the ATT estimate of 2.746 is slightly larger than the corresponding PS weighting estimates of 2.461 without trimming and 2.159 with trimming.

Tables 21.1 and 21.2 also provide evidence that, with the suppressor involved, the estimated standard errors are much larger than the corresponding estimates from all other methods, and they grow larger still as stronger Pj are involved. The PS matching method therefore produces the least precise estimates of the treatment effect among all the methods considered.

As in the example of classical suppression, the estimates from nearest neighbor matching and nearest neighbor matching within a caliper are very similar, and they are identical when the suppressor is involved in the PS models. This indicates that the differences between the individuals' PS's within each pair are smaller than the defined calipers. At most nine of the 500 pairs were removed from any analysis. The calipers range from .005 to .016 with a mean of .010, slightly larger than in the example of classical suppression but still small enough to treat the PS's of the matched individuals as similar.
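A minimal sketch of the two matching estimators follows. It matches with replacement for brevity, which may differ from the study's exact pairing algorithm, and uses the caliper ε = .25σ_PS described above.

import numpy as np

def nn_match_att(y, z, ps, use_caliper=False):
    # 1:1 nearest-neighbor matching on the propensity score; treated units
    # without a control inside the caliper are dropped. The ATT is the mean
    # treated-minus-matched-control outcome difference.
    treated = np.where(z == 1)[0]
    controls = np.where(z == 0)[0]
    caliper = 0.25 * ps.std() if use_caliper else np.inf
    diffs = []
    for t in treated:
        j = controls[np.argmin(np.abs(ps[controls] - ps[t]))]
        if abs(ps[j] - ps[t]) <= caliper:
            diffs.append(y[t] - y[j])
    return np.mean(diffs)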
Table 21.1
The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Matching – Reciprocal Suppression Example: Nearest Neighbor Matching

Propensity Scores Used   B Mean (SD)       SE(B) Mean (SD)   Significance p < .05
Without Suppressor
PS_C                     2.065 (0.292)     0.385 (0.010)     10/10
PS_CP1                   2.201 (0.307)     0.388 (0.010)     10/10
PS_CP2                   2.072 (0.319)     0.385 (0.016)     10/10
PS_CP3                   2.102 (0.329)     0.386 (0.015)     10/10
PS_CP4                   2.059 (0.288)     0.389 (0.010)     10/10
PS_CP5                   2.076 (0.374)     0.383 (0.006)     10/10
PS_CP6                   2.186 (0.326)     0.381 (0.009)     10/10
PS_CP7                   2.280 (0.378)     0.384 (0.012)     10/10
PS_CP8                   2.182 (0.278)     0.386 (0.014)     10/10
PS_CP9                   2.049 (0.402)     0.381 (0.010)     10/10
PS_CP10                  2.033 (0.385)     0.390 (0.011)     10/10
With Suppressor
PS_CX                    -4.246 (1.393)    1.257 (0.270)     10/10
PS_CXP1                  -3.349 (1.253)    1.311 (0.215)     7/10
PS_CXP2                  -3.278 (1.242)    1.342 (0.260)     7/10
PS_CXP3                  -3.122 (0.898)    1.419 (0.287)     5/10
PS_CXP4                  -2.894 (1.184)    1.520 (0.321)     3/10
PS_CXP5                  -1.728 (1.578)    1.595 (0.349)     1/10
PS_CXP6                  -1.342 (1.425)    1.740 (0.358)     1/10
PS_CXP7                  -0.582 (1.520)    1.875 (0.373)     0/10
PS_CXP8                  0.587 (1.687)     2.040 (0.390)     0/10
PS_CXP9                  1.422 (1.557)     2.174 (0.413)     1/10
PS_CXP10                 2.746 (1.960)     2.353 (0.471)     2/10
Note: The values are calculated from 10 simulated data sets. The Significance column reports the number of data sets (out of 10) with a significant treatment effect at p < .05. Model 10: τ̂_t = (1/n_t) Σ_{k=1}^{n_t} {(Y_k | Z = 1) − (Ŷ_k | Z = 0)}.

Table 21.2
The Estimated Average Treatment Effects on the Treated (ATT) of Propensity Score Matching – Reciprocal Suppression Example: Nearest Neighbor Matching within a Caliper

Propensity Scores Used   B Mean (SD)       SE(B) Mean (SD)   Significance p < .05
Without Suppressor
PS_C                     2.065 (0.297)     0.385 (0.010)     10/10
PS_CP1                   2.198 (0.308)     0.388 (0.010)     10/10
PS_CP2                   2.072 (0.319)     0.385 (0.015)     10/10
PS_CP3                   2.101 (0.330)     0.386 (0.014)     10/10
PS_CP4                   2.059 (0.288)     0.389 (0.010)     10/10
PS_CP5                   2.075 (0.378)     0.383 (0.006)     10/10
PS_CP6                   2.185 (0.327)     0.381 (0.009)     10/10
PS_CP7                   2.279 (0.378)     0.384 (0.012)     10/10
PS_CP8                   2.180 (0.280)     0.385 (0.014)     10/10
PS_CP9                   2.045 (0.411)     0.381 (0.010)     10/10
PS_CP10                  2.027 (0.382)     0.390 (0.011)     10/10
With Suppressor
PS_CX                    -4.246 (1.393)    1.257 (0.270)     10/10
PS_CXP1                  -3.349 (1.253)    1.311 (0.215)     7/10
PS_CXP2                  -3.278 (1.242)    1.342 (0.260)     7/10
PS_CXP3                  -3.122 (0.898)    1.419 (0.287)     5/10
PS_CXP4                  -2.894 (1.184)    1.520 (0.321)     3/10
PS_CXP5                  -1.728 (1.578)    1.595 (0.349)     1/10
PS_CXP6                  -1.342 (1.425)    1.740 (0.358)     1/10
PS_CXP7                  -0.582 (1.520)    1.875 (0.373)     0/10
PS_CXP8                  0.587 (1.687)     2.040 (0.390)     0/10
PS_CXP9                  1.422 (1.557)     2.174 (0.413)     1/10
PS_CXP10                 2.746 (1.960)     2.353 (0.471)     2/10
Note: The values are calculated from 10 simulated data sets. The Significance column reports the number of data sets (out of 10) with a significant treatment effect at p < .05. Model 10: τ̂_t = (1/n_t) Σ_{k=1}^{n_t} {(Y_k | Z = 1) − (Ŷ_k | Z = 0)}.

Impact of a confounding variable. Table 22 reports the impacts of suppressor X, the covariates Pj, and the PS variables on the estimation of treatment indicator Z. The larger a variable's impact, the more likely the estimation of the treatment indicator is to be influenced by adding that variable to the regression model as a confounding variable. In Table 22, X and PS_CX have the largest impacts on the treatment indicator, .401 and .388, respectively. This implies that the suppressor itself, or a PS estimated with the suppressor involved, can easily distort the estimation of the treatment indicator because of its large impact.
For the PS_CXPj, the impact becomes smaller with a stronger covariate Pj, falling from PS_CXP1 (.382) to PS_CXP10 (.168). This is because the correlations of the PS_CXPj with outcome Y decrease when stronger covariates P's are used to estimate the PS's: conditional on these PS's, an individual with a higher probability of being in the treatment group tends to have a smaller outcome value. As a result, the impacts of the PS_CXPj decrease from PS_CXP1 to PS_CXP10. Although the correlation coefficients between the PS_CPj and Y first decrease from PS_CP1 to PS_CP8 and then increase from PS_CP8 to PS_CP10, the changes are small and the values are close to zero; consequently, these correlation coefficients are easily influenced by individuals with extreme values of PS_CPj and Y. Overall, the results provide evidence that when the suppressor and only weak covariates P's enter the PS models, the resulting PS's have stronger impacts on the estimation of the treatment indicator.

Table 22
Impact of Suppressor, P's, and Propensity Scores on Treatment Indicator – Reciprocal Suppression Example

Variable    Correlation with Y   Correlation with Z   Impact
X           .534                 .750                 .401
P1          .148                 -.027                -.004
P2          .244                 -.027                -.007
P3          .355                 -.026                -.009
P4          .441                 -.025                -.011
P5          .535                 -.023                -.012
P6          .617                 -.021                -.013
P7          .718                 -.019                -.014
P8          .830                 -.015                -.012
P9          .910                 -.010                -.009
P10         .950                 -.006                -.006
PS_C        .032                 .069                 .002
PS_CP1      -.016                .086                 -.001
PS_CP2      -.045                .085                 -.004
PS_CP3      -.078                .084                 -.007
PS_CP4      -.101                .083                 -.008
PS_CP5      -.124                .081                 -.010
PS_CP6      -.141                .080                 -.011
PS_CP7      -.154                .078                 -.012
PS_CP8      -.150                .074                 -.011
PS_CP9      -.120                .072                 -.009
PS_CP10     -.079                .070                 -.006
PS_CX       .509                 .762                 .388
PS_CXP1     .499                 .765                 .382
PS_CXP2     .484                 .768                 .372
PS_CXP3     .459                 .774                 .355
PS_CXP4     .434                 .780                 .339
PS_CXP5     .401                 .788                 .316
PS_CXP6     .367                 .798                 .293
PS_CXP7     .321                 .811                 .260
PS_CXP8     .263                 .829                 .218
PS_CXP9     .219                 .843                 .185
PS_CXP10    .197                 .851                 .168
Note: The values are the means over 10 simulated data sets. Impact is the product of the correlations with Y and Z.

Chapter 6
CONCLUSION AND DISCUSSION

Summary of Findings

The primary goal of this study is to provide a basic understanding of how a suppressor variable acting on the treatment indicator affects the estimation of a causal effect when it is included as a covariate in regression models and in PS methods, including PS as a covariate, PS weighting, and PS matching. Two types of suppression are studied: classical suppression, which has the strictest definition, and reciprocal suppression, which has the most general one. Each is illustrated with 10 simulated data sets. In both examples, the impacts of the suppressors are extremely strong in the simulated data, so the estimates are easily distorted when the suppressors are added to the models. An additional condition, adding a covariate Pj that explains part of the variance of the outcome, is also tested to see how the estimates of the treatment effects vary across models. Ten covariates P's are generated to explain different amounts of the variance of the outcome, from small to large, conditional on the treatment indicator. The simulated data sets satisfying the required constraints were successfully produced with an evolutionary-algorithm technique for both the classical and the reciprocal suppression examples.
Figures 5 and 8 present line graphs of the estimated unstandardized treatment effects for the classical and reciprocal suppression examples, respectively, under different combinations of covariates:
• random covariates (C),
• random covariates and different levels of P's (CP1 – CP10),
• random covariates and the suppressor (CX), and
• random covariates, the suppressor, and different levels of P's (CXP1 – CXP10).
The treatment effects are estimated with eight types of models: regression, PS as a covariate, PS weighting for ATE, PS weighting for ATE with weight trimming, PS weighting for ATT, PS weighting for ATT with weight trimming, PS matching for ATT, and PS matching for ATT within a caliper. Figures 6 and 9 present the corresponding line graphs of the estimated standard errors, and Figures 7 and 10 the corresponding t-ratios of the estimated treatment effects.

Both examples provide evidence that, without controlling any covariate Pj, the suppressors increase the predictive power of the treatment effects and influence the estimation of the treatment effect in both regression and PS methods. However, the impact of the suppressors varies across model types, as can be seen by comparing the estimated treatment effects for the covariate combinations C and CX in Figures 5 and 8: the estimate increases substantially in the classical suppression example and decreases substantially in the reciprocal suppression example. The suppressors shift the estimates most in the PS weighting for ATE models, where the change after adding the suppressor is largest, and least in the PS weighting for ATT with trimming models, in both examples. Accordingly, PS weighting for ATT with trimming is the better model for eliminating the impact of a suppressor, classical or reciprocal, when no covariate P or only weak covariates P's are involved in the models. When strong-enough covariates P's, which substantially eliminate the impacts of the suppressors by themselves, are applied, the estimated treatment effects from PS weighting for ATT with trimming tend not to be closer to the approximately true treatment effect than those from the untrimmed models. Without unconfoundedness, unbiased estimation cannot be assumed, and the individuals with extreme values that bias the estimation can easily be removed by weight trimming; but once the unconfoundedness assumption is fulfilled, removing any individual from the sample may discard information essential for unbiased estimation. The same pattern appears in the PS weighting for ATE with trimming models. The estimates without the suppressor (C and CP1 – CP10) are quite consistent across all model types in the reciprocal suppression example; in the classical suppression example, they are consistent in all models except PS matching.
Previous studies demonstrate that including a variable strongly related to the treatment indicator but unrelated to the outcome (such as a classical suppressor or an instrumental variable) can decrease the efficiency of the estimated causal effect in PS methods (Perkins et al., 2000; Rubin, 1997; Wooldridge, 2005). This study provides evidence that the PS matching method suffers the largest loss of efficiency. Moreover, with stronger covariates P's involved, the impact of the suppressor becomes smaller in all models, as the estimates from CXP1 to CXP10 show for both examples; however, the changes across stronger P's are not as smooth under PS matching as under the other methods.

In this study, I assume the unconfoundedness assumption can be approximately fulfilled when the estimate is conditional on the strongest covariate, P10. The estimates with P10 involved are almost identical across all models in the reciprocal suppression example. The classical suppression example yields a similar finding, except that under unconfoundedness the estimate from PS weighting for ATE with weight trimming is slightly larger and the estimate from the PS matching models is slightly smaller. When the unconfoundedness assumption is fulfilled, the estimates should be close to the true treatment effect.

Steiner, Cook, and Shadish (2011) demonstrated in simulation that unreliability of measurement can degrade the ability of the PS's to reduce bias, and that increasing the reliability of a bias-reducing covariate improves bias reduction. This implies that with unreliable variables in the PS model, the unconfoundedness assumption tends to be violated, so unbiased estimation cannot be achieved. They also found that if a covariate has no effect on reducing bias, including it will not reduce selection bias no matter how reliably it is measured. This study provides evidence that including a covariate with no bias-reducing effect, such as a suppressor, generates a biased estimate in both regression and PS models; a good bias-reducing covariate, such as a pre-test score, can however eliminate the bias produced by a covariate like a suppressor.

In Figures 5 and 8, the slopes become noticeably larger from CXP5 to CXP10 for most models. This implies that the bias caused by the suppressors can largely be removed by including a strong-enough covariate, such as P5 or stronger, in most models, even though the suppressors' impacts on the estimated treatment effects are extremely large in these two examples. Figures 6 and 9 show that the estimated standard errors decline only in the regression models as stronger covariates P's are applied; in all other models they remain quite consistent in the classical suppression example and increase slightly in the reciprocal suppression example. This implies that regression gains estimation efficiency precisely when the unconfoundedness assumption is more nearly fulfilled, because directly adding strong-enough covariates improves the fitted line and shrinks the mean squared error.
As a result, for the corresponding models, the inference of causal effect is more likely to be significant using regression methods than using PS methods. Moreover, the estimated standard errors from the PS matching models are much larger than those from all other models. Figures 7 and 10 show that the largest t-ratios of the treatment effect come from the regression models, where the most significant inferences are generated when the stronger covariates P's are applied, and the smallest t-ratios come from the PS matching models because of their large estimated standard errors. The least significant inferences in both examples are therefore generated by the PS matching models.

[Figure 5. Line Graphs of Estimated Treatment Effects in Example of Classical Suppression.]
[Figure 6. Line Graphs of Estimated Standard Errors in Example of Classical Suppression.]
[Figure 7. Line Graphs of T-ratios of the Treatment Effect in Example of Classical Suppression.]
[Figure 8. Line Graph of Estimated Treatment Effect in Example of Reciprocal Suppression.]
[Figure 9. Line Graph of Estimated Standard Errors in Example of Reciprocal Suppression.]
[Figure 10. Line Graphs of T-ratios of the Treatment Effect in Example of Reciprocal Suppression.]
(Each figure is a set of line graphs comparing the eight models: regression, PS as a covariate, ATE PS weighting with and without trimming, ATT PS weighting with and without trimming, and ATT PS matching with and without a caliper.)

Implication

Although the model comparisons provide adequate knowledge about how the suppressors and the covariates P's affect the estimation of causal effects in regression and PS methods, the central aim of this study is to develop a guideline for approaching a more accurate estimate of the causal effect when a suppressor variable is involved in the estimation process. In this study, the covariates P's, which are unconditional on the suppressor, are used to approximate the unconfoundedness assumption.
A further assumption made before generating the covariates P's is that the true treatment effect is not conditional on any other variable, including the suppressor. Accordingly, the covariates P's are generated from the residuals of a simple linear regression of the outcome on the treatment indicator only, and under this assumption the estimate of the treatment effect is not accurate when the suppressor variable is included in the model. This assumption is made because, if the true treatment effect were conditional on the suppressor, the suppressor would intuitively need to be controlled in the model as a confounding variable to achieve unbiased estimation.

Based on the findings, the PS matching method produces the worst estimates of the treatment effect because of its low efficiency, especially when a classical suppressor is used to estimate the PS's; even a reciprocal suppressor can produce an extremely large standard error for the estimated treatment effect under PS matching. According to the usual strategy for selecting variables for the PS model, any variable correlated with the dependent variable has to be included in the PS model (Augurzky & Schmidt, 2001; Caliendo & Kopeinig, 2008; Heckman et al., 1998; Lechner, 2002; Ravallion, 2001). Under this strategy there is no question about excluding classical suppressors from the PS model, but reciprocal suppressors are another matter: they are significantly correlated with both the outcome and the treatment indicator, so the strategy selects them into the PS model. However, this study provides an example in which selecting the reciprocal suppressor into the PS model biases the estimation when no good-enough covariate, such as a pre-test score, is in the model. As a result, unless the unconfoundedness assumption can be fulfilled with sufficient evidence, following this strategy can also generate a biased estimate of the causal effect. Moreover, although some of the estimated treatment effects are similar across corresponding models, the PS matching method produces less efficient estimates, so it may generate an inference about the causal effect different from the other methods.

This study also found that the impact of the suppressor can be eliminated under unconfoundedness, so the bias produced by the suppressors can be removed. Unconfoundedness is, without doubt, difficult to achieve in empirical studies without a randomized design. One reason PS methods are becoming increasingly popular is that researchers believe the unconfoundedness assumption can be fulfilled by including as many covariates as possible. This study, however, provides examples showing that not all covariates reduce the bias of the estimated treatment effect in PS methods; some, such as suppressor variables, may increase it. The quality of the covariates matters much more than their quantity for removing bias: in some cases a single good covariate, such as a pre-test score, can lead to an accurate estimate of the treatment effect. It should be kept in mind that good PS's indicate that individuals with the same PS behave similarly under the same conditions; good PS's need not produce the best statistical significance.
The classical and reciprocal suppressors in this study illustrate that PS's estimated with suppressors involved can increase the statistical significance of the estimated treatment effects even though those estimates are biased. The strategy for selecting variables into PS models should therefore rest not only on statistical results but also on substantive rationale and relevant theory. In planning a quasi-experimental design, empirical information from previous studies and existing theory should be considered carefully to define which covariates must be measured to obtain an unbiased estimate; most of the time, access to a good covariate is the most crucial requirement.

Moreover, when the unconfoundedness assumption is violated, this study demonstrates that PS weighting for the ATT with weight trimming at the 95th percentile eliminates more of the suppressors' impact on the treatment effect than any other model, in both the classical and the reciprocal suppression examples. When the quality of the PS is uncertain, researchers can therefore apply this method to bring the estimate closer to the true treatment effect by removing the individuals who most easily distort the estimation. When the unconfoundedness assumption is fulfilled, this study finds that a multiple regression model controlling for all covariates turns out to be the best model, with the smallest standard error of the estimated treatment effect. This implies that when a set of good-quality covariates is available, a simple model can perform better than a complex one.

Limitations

This study provides only specific examples of classical and reciprocal suppression; they cannot be generalized to all phenomena of classical and reciprocal suppression. I also define the true treatment effects as those estimated without controlling the suppressor variables. It is possible that some covariates not only cause suppression but also reduce bias in other frameworks, in which case researchers should control those suppressor variables to obtain unbiased estimates. It is likewise possible that some suppressor variables have small impacts on the estimation of the treatment indicator, or suppress other covariates, so that controlling them does not affect the estimation; this study does not illustrate those phenomena. In both examples here, the suppressors have extremely large impacts on the estimation of the treatment indicator, and as the results show, those impacts can be eliminated only by very strong covariates. It remains to be verified, for suppressors producing different degrees of bias, how strong a covariate must be to eliminate their impact on the treatment effect, and whether the various methods perform differently across degrees of suppression.

There are also many other PS matching methods, such as Mahalanobis metric matching and optimal matching. Although nearest neighbor matching and nearest neighbor matching within a caliper do not produce precise estimates in this study, applying different matching methods might change the results. Hansen (2007) describes optimal matching, in which the analyst articulates a distance between desirable and undesirable potential matches and then matches individuals in the treatment and control groups accordingly.
Limitations

In this study, only specific examples of classical and reciprocal suppression are provided; they cannot be generalized to all phenomena of classical and reciprocal suppression. Also, I define the true treatment effects in this study as those estimated without controlling for the suppressor variables. It is possible that some covariates not only cause suppression but also reduce bias within other frameworks; in such cases, the suppressor variables should be controlled to obtain unbiased estimates. It is also possible that some suppressor variables have only small impacts on the estimation of the treatment indicator, or suppress other covariates, so that controlling for them in the models does not affect the estimation; this study does not illustrate those phenomena of suppression. In both examples here, the suppressors have extremely large impacts on the estimation of the treatment indicator, and, as the results show, those impacts can be eliminated only by very strong covariates. It remains to be verified, when suppressors produce different degrees of bias, how strong the covariates must be to eliminate their impacts on the treatment effects. Whether the various methods perform differently under various degrees of suppression can also be tested.

Because there are many distinct PS matching methods, such as Mahalanobis metric matching and optimal matching, the results may differ under other matching methods, even though matching with the nearest neighbor and with the nearest neighbor within a caliper did not produce precise estimates in this study. As Hansen (2007) describes, optimal matching requires the analyst to articulate a distance between desirable and undesirable potential matches and then to match individuals in the treatment and control groups by that distance. This method favors the more desirable pairs, which can substantially improve the power and robustness of causal inference. Whether it can still improve the power and robustness of causal inference when suppressor variables are involved should be tested in further studies. It is also important to use empirical data to verify the findings here and to see how suppressor variables affect the inference in a real study. Moreover, by using data from a longitudinal design, researchers can obtain an empirical example of how a covariate eliminates the impact of the suppressors, especially those suppressors that may lead to biased estimates.

APPENDICES

Appendix A
Simulation Program

# Python 2. An evolutionary search that perturbs X and Y until the sample
# correlations and the regression coefficients of Y on (Z, X) match the
# target constraints. Usage (reconstructed; the extracted listing fused the
# example values into the argv calls): python <script> <dimension> <generations>
import sys
import numpy
from numpy import matrix
from numpy import ones
from numpy import linalg

# "Classical Suppression Code"
# (use one of the two target blocks; as listed, the second overwrites the first)
target_corr = numpy.matrix([[1, 0.6, 0.7], [0.6, 1, 0.2], [0.7, 0.2, 1]])
target_L = numpy.linalg.cholesky(target_corr)
target_d1 = -4
target_d2 = 2
target_d1b = 2

# "Reciprocal Suppression Code"
# (symmetrized here; the extracted listing showed an asymmetric matrix, and
# numpy.linalg.cholesky reads only the lower triangle in any case)
target_corr = numpy.matrix([[1, 0.6, -0.05], [0.6, 1, 0.03], [-0.05, 0.03, 1]])
target_L = numpy.linalg.cholesky(target_corr)
target_d1 = 7
target_d2 = 2
target_d1b = 2

mutate_scale = 5.0
flip_prob = 0.08         # not used in the listing as printed
surv_prob = 0.95
scale_exp = 0.999639589  # not used in the listing as printed
dimension = int(sys.argv[1])    # sample size, e.g., 1000
generation = int(sys.argv[2])   # generations per run, e.g., 90000
population = 500
_mut_X = True
_mut_Y = True
_eval_corr = True
_eval_lsqr = True

class DNA:
    def __init__(self, n):
        self.dim = n
        self.X = numpy.random.rand(n) * 100.0
        self.Y = numpy.random.rand(n) * 100.0
        self.Z = numpy.zeros(n)
        # first half control (Z = 0), second half treatment (Z = 1)
        for i in range(0, n/2):
            self.Z[i] = 0
        for i in range(n/2, n):
            self.Z[i] = 1
        self.fitness = self.evaluate()

    def clone(self):
        copy = DNA(self.dim)
        copy.X = self.X.copy()
        copy.Y = self.Y.copy()
        copy.Z = self.Z.copy()
        copy.fitness = self.fitness
        return copy

    def evaluate(self):
        # fitness: distance between the simulated statistics and the targets
        err = 0.0
        if _eval_corr:
            corr = numpy.corrcoef([self.X, self.Y, self.Z])
            corr_diff = target_corr - corr
            corr_err = numpy.linalg.norm(corr_diff, ord='fro')
            err += pow(corr_err, 2)
        if _eval_lsqr:
            M = matrix([ones(self.dim), self.Z, self.X])   # with suppressor X
            MM = matrix([ones(self.dim), self.Z])          # without X
            (d, res, rank, s) = linalg.lstsq(M.transpose(), self.Y)
            (dd, res, rank, s) = linalg.lstsq(MM.transpose(), self.Y)
            err += pow(abs(d[1] - target_d1), 2)    # Z coefficient, X controlled
            err += pow(abs(dd[1] - target_d1b), 2)  # Z coefficient, X omitted
        return pow(err, 0.5)

    def mutate(self):
        if _mut_X:
            self.X += numpy.random.normal(loc=0.0, scale=mutate_scale, size=self.dim)
            numpy.clip(self.X, 0.0, 100.0, out=self.X)
        if _mut_Y:
            self.Y += numpy.random.normal(loc=0.0, scale=mutate_scale, size=self.dim)
            numpy.clip(self.Y, 0.0, 100.0, out=self.Y)
        self.fitness = self.evaluate()

def evolve(pop, gen):
    pop_size = len(pop)
    last_fitness = None
    for g in range(0, gen):
        # each member produces one mutated clone; keep the fitter candidates
        nextgen = []
        for p in pop:
            nextgen.append(p)
            c = p.clone()
            c.mutate()
            nextgen.append(c)
        nextgen.sort(key=lambda x: x.fitness)
        pop[:] = []
        for k in range(0, len(nextgen)):
            prob = pow(surv_prob, k)
            if len(pop) < pop_size:
                if numpy.random.rand() < prob:
                    pop.append(nextgen[k])
        while len(pop) < pop_size:
            pop.append(nextgen[0])
        global mutate_scale
        if 0 == (g % 8):
            best = pop[0]
            if last_fitness is not None:
                if last_fitness - best.fitness < 0.0005 * last_fitness:
                    # progress has stalled: shrink the mutation step size
                    mutate_scale *= 0.5
                    mutate_scale = max(mutate_scale, 0.001)
                    sys.stderr.write('mutate scale: ' + str(mutate_scale) + '\n')
                    sys.stderr.flush()
            last_fitness = best.fitness
            # progress report (reconstructed; this line was garbled in extraction)
            sys.stderr.write("generation " + str(g) + ": " + str(best.fitness) + "\n")
            sys.stderr.flush()

pop = []
for k in range(0, population):
    pop.append(DNA(dimension))

try:
    while (generation > 0):
        evolve(pop, generation)
        best = pop[0]
print "fitness: ", best.fitness print "correlation coef.: " print numpy.corrcoef([best.X,best.Y,best.Z]) M = matrix([ones(best.dim), best.Z, best.X]) (d, res, rank, s) = linalg.lstsq (M.transpose(), best.Y) print "Linear least square: ", d M = matrix([ones(best.dim), best.Z]) (d, res, rank, s) = linalg.lstsq (M.transpose(), best.Y) print "Linear least square (shorter): ", d cmd = raw_input('Command: ') if (cmd == 'exit'): break else: exec cmd except KeyboardInterrupt: print "interrupted" pass best = pop[0] if False: print "X = ", numpy.array_repr(best.X) print "Y = ", numpy.array_repr(best.Y) print "Z = ", numpy.array_repr(best.Z) print "fitness: ", best.fitness print "correlation coef.: " print numpy.corrcoef([best.X,best.Y,best.Z]) M = matrix([ones(best.dim), best.Z, best.X]) (d, res, rank, s) = linalg.lstsq (M.transpose(), best.Y) print "Linear least square: ", d M = matrix([ones(best.dim), best.Z]) (d, res, rank, s) = linalg.lstsq (M.transpose(), best.Y) print "Linear least square (shorter): ", d else: f = open ('output','w+') for x, y, z in zip(best.X,best.Y,best.Z): f.write ( ', '.join( [str(x), str(y), str(z)] ) + '\n') f.close() 112 Appendix B A Glossary of Literacy Terms Bias. Bias is the systematic deviation of results or inferences from the population parameter of interest. Any trend in the data collection, analysis, interpretation, or review of data can lead to conclusions that are systematically different from the population parameter of interest. Causal Effect. A causal effect is the difference between what did happen from a treatment and what would have happen if the treatment did not exist. A more general definition of a causal effect, the difference between the outcomes in the treatment group and in the control group, is provided when unit homogeneity assumption is achieved. For observational studies, a weaker assumption, unconfoundedness, approximating the unit homogeneity assumption, can be applied to estimate the causal effects by comparing the difference of outcomes between treatment and control groups. Causal Inference. A causal inference is made by using the estimated causal effect from the sample to generate a conclusion for the population through statistical analysis procedure. A causal inference must meet the basic requirements for all causal relationships: that cause preceded effect, that cause was related to the effect, and that there is no plausible alternative explanation for the effect other than the causal. Covariate. In statistics, a covariate is a variable that is possibly predictive of the outcome variable. A covariate may be of direct interest or it may be a confounding or interacting variable. In this study, covariates are the secondary variables that may affect the estimates of the independent variable of primary interest, the treatment indicator, on the outcome variable. 113 Efficiency/Efficient. In statistics, efficiency is defined by Fisher as the minimum possible variance for an estimate divided by its actual variance. An estimate is regarded as more “efficient” than another if it has a smaller variance which is also influenced by the sample size. Essentially, a more efficient estimate needs fewer samples than a less efficient one to achieve statistical significance in the statistical models. When the consistency assumption of ordinary least squares estimation is violated or bias appears, less efficient estimates tend to be conducted. Endogenous Variable. 
Endogenous Variable. A variable is endogenous when it is correlated with the unobservable random error of the regression model, which is assumed to have mean zero and to be uncorrelated with all the independent variables so that ordinary least squares estimates are consistent. Endogeneity usually arises from an omitted variable or measurement error.

Fitness (Simulation). In this study, fitness is an index of how close the simulated parameters are to the defined constraints. The value of fitness is calculated as the square root of the sum of squared differences between the simulated values and their corresponding constraints.

Omitted Variable. An omitted variable occurs in a regression model when a variable that should be controlled is not included, usually because the data are unavailable. When an omitted variable is present, the consistency assumption of ordinary least squares estimation is violated.

Unbiased Estimate. When the expected value of the parameter estimate from the statistical model equals the true value of the parameter, the estimate is unbiased. The true value of the parameter is always unknown; to obtain an unbiased estimate, the statistical assumptions have to hold.

REFERENCES

Abadie, A., & Imbens, G. W. (2006). Large sample properties of matching estimators for average treatment effects. Econometrica, 74(1), 235–267.

Augurzky, B., & Schmidt, C. M. (2001). The propensity score: A means to an end (IZA Discussion Paper Series No. 271). Bonn, Germany: IZA.

Brookhart, M., Schneeweiss, S., Rothman, K., Glynn, R., Avorn, J., & Sturmer, T. (2006). Variable selection for propensity score models. American Journal of Epidemiology, 163(12), 1149–1156.

Caliendo, M., & Kopeinig, S. (2008). Some practical guidance for the implementation of propensity score matching. Journal of Economic Surveys, 22(1), 31–72.

Cohen, J., & Cohen, P. (1975). Applied multiple regression/correlation analysis for the behavioral sciences. New York: Wiley.

Conger, A. J. (1974). A revised definition for suppressor variables: A guide to their identification and interpretation. Educational and Psychological Measurement, 34, 35–46.

Darlington, R. B. (1968). Multiple regression in psychological research and practice. Psychological Bulletin, 69, 161–182.

Eiben, A. E., & Smith, J. E. (2007). Introduction to evolutionary computing (2nd ed.). Berlin: Springer.

Frank, K. A. (2000). Impact of a confounding variable on the inference of a regression coefficient. Sociological Methods and Research, 29(2), 147–194.

Frank, K. A., Sykes, G., Anagnostopoulos, D., Cannata, M., Chard, L., Krause, A., & McCrory, R. (2008). Does NBPTS certification affect the number of colleagues a teacher helps with instructional matters? Educational Evaluation and Policy Analysis, 30(1), 3–30.

Gu, X., & Rosenbaum, P. R. (1993). Comparison of multivariate matching methods: Structures, distances and algorithms. Journal of Computational and Graphical Statistics, 2(4), 405–420.

Hansen, B. B. (2007). Optmatch: Flexible, optimal matching for observational studies. R News, 7, 19–24.

Heckman, J. J., Ichimura, H., Smith, J., & Todd, P. (1998). Characterizing selection bias using experimental data. Econometrica, 66, 1017–1098.

Heckman, J. J., & Robb, R., Jr. (1986). Alternative methods for solving the problem of selection bias in evaluating the impact of treatment on outcome. In H. Wainer (Ed.), Drawing inferences from self-selected samples (pp. 63–107). New York: Springer-Verlag.

Henry, G. T., Gordon, C. S., & Rickman, D. K. (2006). Early education policy alternatives: Comparing quality and outcomes of Head Start and state prekindergarten. Educational Evaluation and Policy Analysis, 28(1), 77–99.

Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81, 945–970.
Horst, P. (1941). The role of predictor variables which are independent of the criterion. Social Science Research Council Bulletin, 48, 431–436.

Imbens, G. W., & Wooldridge, J. M. (2009). Recent developments in the econometrics of program evaluation. Journal of Economic Literature, 47(1), 5–86.

Kang, J., & Schafer, J. (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science, 22, 523–580.

Lancaster, B. P. (1999). Defining and interpreting suppressor effects: Advantages and limitations. Paper presented at the annual meeting of the Southwest Educational Research Association, San Antonio, TX.

Lechner, M. (2002). Some practical issues in the evaluation of heterogeneous labour market programmes by matching methods. Journal of the Royal Statistical Society: Series A, 165, 59–82.

Lee, B. K., Lessler, J., & Stuart, E. A. (2011). Weight trimming and propensity score weighting. PLoS ONE, 6(3), e18174. doi:10.1371/journal.pone.0018174

Lord, F. M., & Novick, M. R. (1974). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

Lubin, A. (1957). Some formulae for use with suppressor variables. Educational and Psychological Measurement, 17, 286–296.

Lunceford, J. K., & Davidian, M. (2004). Stratification and weighting via the propensity score in estimation of causal treatment effects: A comparative study. Statistics in Medicine, 23, 2937–2960.

Lutz, J. G. (1983). A method for constructing data which illustrate three types of suppressor variables. Educational and Psychological Measurement, 43, 373–377.

Martz, E. (2003). Invisibility of disability and work experience as predictors of employment among community college students with disabilities. Journal of Vocational Rehabilitation, 18, 153–161.

Morrow, C. E., Mansoor, E., Hanson, K. L., Vogel, A. L., Rose-Jacobs, R., Genatossio, C. S., Windham, A., & Bandstra, E. S. (2010). The starting early starting smart integrated services model: Improving access to behavioral health services in the pediatric health care setting for at-risk families with young children. Journal of Child and Family Studies, 19(1), 42–56.

Oden, S., Schweinhart, L. J., Weikart, D. P., Marcus, S. M., & Xie, Y. (2000). Into adulthood: A study of the effects of Head Start. Ypsilanti, MI: High/Scope Press.

Paulhus, D. L., Robins, R. W., Trzesniewski, K. H., & Tracy, J. L. (2004). Two replicable suppressor situations in personality research. Multivariate Behavioral Research, 39, 303–328.

Perkins, S. M., Tu, W., Underhill, M. G., Zhou, X.-H., & Murray, M. D. (2000). The use of propensity scores in pharmacoepidemiologic research. Pharmacoepidemiology and Drug Safety, 43, 93–101.

Potter, F. J. (1993). The effect of weight trimming on nonlinear survey estimates. San Francisco: American Statistical Association.

Ravallion, M. (2001). The mystery of the vanishing benefits: An introduction to impact evaluation. The World Bank Economic Review, 15(1), 115–140.

Robins, J. M., Mark, S. D., & Newey, W. K. (1992). Estimating exposure effects by modeling the expectation of exposure conditional on confounders. Biometrics, 48, 479–495.

Roseman, L. (1994). Using regression and subclassification on the propensity score to control bias in observational studies. Unpublished manuscript.
Rosenbaum, P. R. (1989). Optimal matching for observational studies. Journal of the American Statistical Association, 84(408), 1024–1032.

Rosenbaum, P. R. (1995). Observational studies. New York: Springer.

Rosenbaum, P. R. (2002). Covariance adjustment in randomized experiments and observational studies. Statistical Science, 17(3), 286–327.

Rosenbaum, P. R., & Rubin, D. B. (1983a). Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 45(2), 212–218.

Rosenbaum, P. R., & Rubin, D. B. (1983b). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55.

Rosenbaum, P. R., & Rubin, D. B. (1985). Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. The American Statistician, 39(1), 33–38.

Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66, 688–701.

Rubin, D. B. (1997). Estimating causal effects from large data sets using propensity scores. Annals of Internal Medicine, 127, 757–763.

Rubin, D. B. (2001). Using propensity scores to help design observational studies: Application to the tobacco litigation. Health Services and Outcomes Research Methodology, 2, 169–188.

Rubin, D. B., & Thomas, N. (1996). Matching using estimated propensity scores: Relating theory to practice. Biometrics, 52, 249–264.

Schafer, J. L., & Kang, J. (2008). Average causal effects from nonrandomized studies: A practical guide and simulated example. Psychological Methods, 13, 279–313.

Scharfstein, D. O., Rotnitzky, A., & Robins, J. M. (1999). Adjusting for non-ignorable drop-out using semiparametric non-response models. Journal of the American Statistical Association, 94, 1096–1120.

Steiner, P. M., Cook, T. D., & Shadish, W. R. (2011). On the importance of reliable covariate measurement in selection bias adjustments using propensity scores. Journal of Educational and Behavioral Statistics, 36(2), 213–236.

Walker, D. A. (2003). Suppressor variable(s) importance within a regression model: An example of salary compression from career services. Journal of College Student Development, 44, 127–133.