COMPARISON OF THREE MEDIATION ANALYSIS METHODS WITH TWO SEQUENTIAL MEDIATORS

By

Xinchun Zhang

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Biostatistics - Master of Science

2020

ABSTRACT

Mediation analysis is an important tool for understanding causal mechanisms in epidemiology and the social sciences. The estimation of direct and indirect effects with multiple mediators is a challenging problem. This thesis focused on the comparison of three mediation analysis methods with two sequential mediators. Our goal was to assess the robustness of the methods in estimating the natural indirect effect and the partial indirect effect. We simulated multiple scenarios based on a counterfactual framework and employed three weighted marginal structural models to estimate direct and indirect effects (1-3). The bias, root mean squared error and 95% confidence interval coverage probability from the Monte Carlo simulations were the criteria used to compare the three methods. By comparing their performance in the estimation of direct and indirect effects, we concluded that the Lange method was more robust in mediation analysis with two sequential mediators than the methods by Steen and Hong.

Key words: Causal inference, mediation analysis, sequential mediators, marginal structural models, data simulation, causal directed acyclic graph, direct effect, indirect effect

Copyright by XINCHUN ZHANG 2020

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my thesis advisor, Dr. Zhehui Luo, not only for her insightful guidance and endless support, but also for her feedback and encouragement throughout my graduate study. I am grateful to have Dr. Joseph Gardiner and Dr. Honglei Chen on my thesis committee and thank them for helping me improve my thesis with their superb suggestions.
Thanks also go to Aiwen Yang, Liang Wang, Yanjie Li, Hanyue Li, and all my professors and classmates in the Department of Epidemiology and Biostatistics at Michigan State University for their kind support. Lastly, my deepest gratitude goes to my husband, Hansen, and my daughter, Anna, for their love and support.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER 1. INTRODUCTION
CHAPTER 2. COUNTERFACTUAL APPROACH TO MEDIATION ANALYSIS WITH TWO SEQUENTIAL MEDIATORS
2.1 Motivating epidemiology example
2.2 The counterfactual framework
2.3 Decomposition of causal effects
2.3a Counterfactual notation for natural direct and indirect effects
2.3b Decomposition in a mediation model with two sequential mediators
2.3c Computing true effects
2.4 Assumptions that permit identification
CHAPTER 3. THE PROPOSED SIMULATION PROCEDURE
3.1 Super population
3.2 Equation for data generating procedure (DGP)
3.3 True effects
3.4 The proposed estimation procedure
CHAPTER 4. DATA SIMULATION
4.1 Simulation Scenarios
4.2 Performance criteria
CHAPTER 5. SIMULATION RESULTS
5.1 Comparison across scenarios within each method
5.1a Simulation results generated from Lange method
5.1b Simulation results for Steen method
5.1c Simulation results for Hong method
5.2 Cross method comparison
5.2a Comparison of Lange, Steen and Hong methods in estimating total effects
5.2b Comparison of Lange, Steen and Hong methods in estimating natural direct effect
5.2c Comparison of Lange, Steen and Hong methods in estimating natural indirect effect
5.2d Comparison of Lange, Steen and Hong methods in estimating partial indirect effect
CHAPTER 6. CONCLUSION
APPENDICES
APPENDIX A: Validity of the Modified Lange Method
APPENDIX B: Proof of RMPW for estimating counterfactual outcomes of consecutive mediators
REFERENCES

LIST OF TABLES

Table 1. Minimal sufficient adjustment sets (MSAS)
Table 2. Prevalence of Exposure, Mediators, and Outcome in Super-population
Table 3. True causal effects
Table 4. Hong weights for marginal structural models
Table 5. Methods and simulation scenarios
Table 6. Performance in risk difference and risk ratio scales when all parameters were correctly specified in the three methods
Table 7. Summary of Lange method in RD scale
Table 8. Summary of Steen method in RD scale
Table 9. Summary of Hong method in RD scale
Table 10. Summarizing comparison of Lange, Steen and Hong methods in TE estimation in RD scale
Table 11. Summarizing comparison of Lange, Steen and Hong methods in NDE estimation in RD scale
Table 12. Summarizing comparison of Lange, Steen and Hong methods in NIE estimation in RD scale
Table 13. Summarizing comparison of Lange, Steen and Hong methods in PIE estimation

LIST OF FIGURES

Figure 1. Causal directed acyclic graph (DAG) with exposure A, outcome Y, two sequential mediators M1 and M2, and a set of baseline confounders C sufficient for confounding control
Figure 2. Chart of the procedures executed for each mediation analysis method
Figure 3. Performance of Lange method in risk difference (RD) scale and risk ratio (RR) scale
Figure 4. Performance of Steen method in risk difference (RD) scale or risk ratio (RR) scale
Figure 5. Performance of Hong method in risk difference (RD) scale or risk ratio (RR) scale
Figure 6. Comparison of Lange method, Steen method and Hong method in TE estimation
Figure 7. Comparison of Lange method, Steen method and Hong method in NDE estimation
Figure 8. Comparison of Lange method, Steen method and Hong method in NIE estimation
Figure 9. Comparison of Lange method, Steen method and Hong method in PIE estimation

CHAPTER 1. INTRODUCTION

Mediation analysis is used in epidemiology and the social sciences to estimate how an exposure is related to an outcome through a mediator in complex observational settings. There may be a single mediator, or a set of causally related mediators between the exposure and the outcome. In many epidemiological studies, multiple mediators may be of interest. We are interested in assessing the extent to which the effect of an exposure on an outcome is mediated by two sequential mediators.

Two classes of statistical techniques, methods based on regression and methods based on weighting, have been used to estimate direct and indirect effects in mediation analyses with multiple mediators. Regression-based approaches combine results from two models, a model for the outcome and a model for the mediator, to estimate direct and indirect effects (4-6). These approaches work when all mediators are continuous, but cannot accommodate binary or categorical mediators, especially when these mediators interact. An alternative class of strategies for these cases is weighting-based methods. Weighting-based methods apply more generally to settings with continuous, binary, count or time-to-event outcomes (7, 8). These approaches involve specifying exposure and mediator weights. A weighting approach makes it easier to overcome the difficulties in estimating direct and indirect effects with more than one mediator (5, 6, 8).

Following Liu et al.
(9), we considered a mediation analysis with two sequential mediators. We employed weighting approaches because the mediators are sequential. Moreover, there were exposure-mediator and mediator-mediator interactions, which make it difficult to obtain easily generalizable analytic expressions for the direct and indirect effects using regression-based approaches (2, 3, 10, 11).

Identification and estimation of unbiased direct and indirect effects rely on many assumptions, such as no measurement error in the exposure, mediators, or outcome; no unaccounted-for confounding between exposure and mediator, exposure and outcome, or mediator and outcome; and correct specification of the regression models for exposure-mediator-outcome relations (2, 3, 12).

In the case of multiple mediators, mediation analysis methods using a potential outcomes framework have been proposed (12-14). Natural direct and indirect effects (NDE and NIE) have been estimated using linear structural equation modeling, outcome and mediator regression-based methods, and inverse-probability-of-treatment weighting (IPTW) fitting of marginal structural models (MSMs) (2, 3). Several propensity score-based weighting methods for mediation analysis with multiple mediators have been developed. These methods, such as IPTW and ratio-of-mediator-probability weighting (RMPW) (1, 7), apply estimated weights rather than the true weights, which are usually unknown. Causal mediation analysis through IPTW or RMPW is a weighting-based approach to estimating the NDE and the NIE through mediators. In this study, we compare the Lange method (2), the Steen method (3) and the Hong method (1) in causal mediation analysis with two sequential mediators.
The thesis is organized as follows: Chapter 1, introduction; Chapter 2, counterfactual approach to mediation analysis with two sequential mediators; Chapter 3, the proposed simulation procedure; Chapter 4, data simulation; Chapter 5, simulation results; and finally Chapter 6, conclusion.

CHAPTER 2. COUNTERFACTUAL APPROACH TO MEDIATION ANALYSIS WITH TWO SEQUENTIAL MEDIATORS

2.1 Motivating epidemiology example

For illustrative purposes, we revisit previous analyses of a cohort study on the connection between poor olfaction and mortality among older adults (9). As shown in Liu et al. (9), the effect of olfaction impairment on the risk of higher mortality among older adults was mediated by neurodegenerative diseases and weight loss. Previous mediation analysis suggested that neurodegenerative disease and weight loss may partly explain the relationship between poor olfaction and higher mortality (9, 15).

In the thesis, all variables, including the exposure, the outcome, two mediators, and six confounders, are binary. The corresponding natural direct and indirect effects are estimated under the assumption that baseline covariates are sufficient to control for confounding so that the identification assumptions are met (1, 2, 10, 16). The causal diagram in Figure 1 depicts a generalization of the causal relations between the aforementioned variables.

2.2 The counterfactual framework

The potential outcomes framework has been used to address mediation analysis (12, 17). This framework not only gives clear relationships among variables, but also defines mediation effects in causal terms. Within the framework, researchers can state explicitly the assumptions required for causal inference and formulate the confounding control needed for the direct and indirect causal effects of interest (3, 12, 16).

Figure 1.
Causal directed acyclic graph (DAG) with exposure A, outcome Y, two sequential mediators M1 and M2, and a set of baseline confounders C sufficient for confounding control. Note: The DAG was created using http://www.dagitty.net/dags.html.

We consider a cohort study in which Y denotes mortality among older adults (the outcome of interest), A denotes olfaction impairment (the exposure), and M1 and M2 denote neurodegenerative diseases and weight loss (two sequential mediators), respectively. M1 and M2 are two causally ordered mediators, which means M1 can affect M2, but not vice versa. We allow for potential exposure-mediator interactions as well as a mediator-mediator interaction. C denotes a set of pre-exposure confounding covariates sufficient for confounding control. Under the no-omitted-confounder assumption, is the minimal sufficient adjustment set (MSAS) of relationship, is the MSAS of relationship, and are of relationship, and are of relationship, is the MSAS of relationship, is the MSAS of the relationship (Table 1).

Table 1. Minimal sufficient adjustment sets (MSAS)
Exposure | Outcome | MSAS for estimating the total effect of exposure on outcome | MSAS for estimating the direct effect of exposure on outcome
Note: This table was created using http://www.dagitty.net/dags.html.

2.3 Decomposition of causal effects

2.3a Counterfactual notation for natural direct and indirect effects

Let A = 1 if a subject is assigned to the exposed condition, and let A = 0 if the same subject is assigned to the control condition instead (11, 12). Y(a) is the counterfactual or potential outcome when A is set to a. In this thesis, A is binary, so each subject has two potential outcomes: Y(1) and Y(0). When there is only one mediator M, the natural direct effect (NDE) measures how much Y would change if A were set at versus , but for each subject M was kept at the . The natural indirect effect (NIE) estimates how much Y would change if A were controlled at , but the counterfactual were changed from to , in which A was changed from level to level .
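For reference, the single-mediator definitions described above are commonly written as follows. This is a sketch in the usual counterfactual notation, assuming that a denotes the exposed level and a* the reference level; the symbols a and a* are introduced here for illustration and the expectations are taken over the study population.

```latex
\begin{align*}
\mathrm{NDE} &= E\big[Y(a, M(a^{*}))\big] - E\big[Y(a^{*}, M(a^{*}))\big], \\
\mathrm{NIE} &= E\big[Y(a, M(a))\big] - E\big[Y(a, M(a^{*}))\big], \\
\mathrm{TE}  &= E\big[Y(a, M(a))\big] - E\big[Y(a^{*}, M(a^{*}))\big]
              = \mathrm{NDE} + \mathrm{NIE}.
\end{align*}
```

Under this convention the direct effect holds the mediator at its reference-level value, the indirect effect holds the exposure at the exposed level, and the two add up to the total effect on the difference scale.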
We now consider the situation illustrated in Figure 1, which includes two causally sequential mediators M1 and M2. We use a 3-way decomposition approach to identify the causal effects (3, 5). The finest possible decomposition involves four distinct paths from A to Y: A → Y, A → M1 → Y, A → M2 → Y and A → M1 → M2 → Y. M1(a) denotes the potential value of the mediator M1 under A = a, and M2(a, M1(a)) represents the potential value of M2 when A = a and M1 = M1(a). Then, Y(1, M1(1), M2(1, M1(1))) indicates the counterfactual value of Y that would be observed if A was assigned a = 1, M1 was set to the value of M1 that would be observed if A was set to a = 1, and M2 was set to the value of M2 that would be observed if A was set to a = 1 and M1 = M1(1). For the same subject, when assigned to a = 0 instead, the counterfactual outcome would be Y(0, M1(0), M2(0, M1(0))).

2.3b Decomposition in a mediation model with two sequential mediators

The causal mediation effects for a binary outcome can be presented on either the risk difference (RD) scale or the risk ratio (RR) scale (18). The total effect (TE) is defined as how much Y would change overall for a change in A from level 0 to level 1. In this counterfactual-based approach, the total effect decomposes into the natural direct and indirect effects (3). If there are k sequential mediators in a DAG, the number of possible decompositions is (k + 1)!. Without imposing parametric restrictions, the saturated natural effect model for a 3-way decomposition of causal effects with two sequential mediators M1 and M2 (3) is: , where are binary (coded 0 or 1).

Then, the direct effect of A on Y:

The indirect effect mediated by M1 on Y:

The partial indirect effect mediated solely by M2 (bypassing M1) on Y:

In this study, we have six (3! = 6) possible decompositions as shown below:

2.3c Computing true effects

The estimation of true effects is presented as follows based on the Steen and Lange methods (2, 3).
The difference between the following two counterfactual outcomes is the total risk difference (RD) of the treatment effect on the outcome ( ):

The natural direct effect (NDE) of A on Y is a causal effect mediated not through M1 or M2, but by the pathway A → Y. NDE is defined as follows:

The natural indirect effect (NIE) captures all pathways mediated by M1:

A partial indirect effect (PIE) is defined with respect to M2 as the mediator. PIE captures the pathway A → M2 → Y. PIE is the indirect treatment effect on Y mediated by M2 bypassing M1:

For the Hong method, the decompositions of TE and NDE remain the same as above. However, NIE and PIE are different, as discussed in the book (1):

2.4 Assumptions that permit identification

Sufficient assumptions (1-3) for the identification of unbiased causal direct and indirect effects include:

(a) Consistency of on : if

(b) The effect of the exposure A on outcome Y is unconfounded given C. There is no other unmeasured confounding of the relationship. The measured covariates are in the data generating process for the outcome and suffice to control for confounding of the A-Y relationship (Figure 1).

(c) The effect of M1 and M2 on outcome Y is unconfounded conditional on and , where are covariates observed for the data generating process for and Y. are sufficient covariates for identifying the association between and Y. are sufficient covariates for identifying the association between and Y.

(d) The effect of A on both mediators is unconfounded conditional on . There are no unobserved confounders between A and any of , and are sufficient to adjust for confounding of the effects of A on .

CHAPTER 3. THE PROPOSED SIMULATION PROCEDURE

3.1 Super population

A study (9) reported that among 2289 adults aged 71 to 82 years at baseline, 31.76% had poor olfaction (A), 37.3% had neurodegenerative disease (M1), 19.1% had weight loss (M2), and 52.91% had died (Y) by year 13.
We created a super population (Table 2) in which the prevalence of the exposure (A), mediators (M1, M2), and outcome (Y) was similar to that in the paper (9), based on the data generating process in Figure 1.

Table 2. Prevalence of Exposure, Mediators, and Outcome in Super-population (N = 10,000,000)
Variable | Value | Frequency (Percentage)
A | 0 | 6,962,575 (69.6%)
A | 1 | 3,037,425 (30.4%)
M1 | 0 | 6,525,293 (65.3%)
M1 | 1 | 3,474,707 (34.7%)
M2 | 0 | 7,635,725 (76.4%)
M2 | 1 | 2,364,275 (23.6%)
Y | 0 | 4,962,836 (49.6%)
Y | 1 | 5,037,164 (50.4%)

3.2 Equation for data generating procedure (DGP)

In this chapter, we provide details about our simulation study. We assumed all covariates are independent and identically distributed in our study. We generated six independent baseline covariates with identical distribution: .

Next, we generated A, M1, M2 and Y from Bernoulli distributions as described below.

iid: A sequence of independent, identically distributed (IID) random variables

3.3 True effects

After generating the exposure, outcome, and mediator variables as described in section 3.2, we further generated counterfactual outcomes and estimated true effects using the super population (N = 10,000,000) as discussed in section 2.3c. The calculated true effects are listed in Table 3.

Table 3. True causal effects
True effects for | Scale | TE | NDE | NIE | PIE
Lange and Steen methods | Risk Difference (RD) | 0.404 | 0.297 | 0.065 | 0.043
Lange and Steen methods | Risk Ratio (RR) | 2.052 | 1.773 | 1.089 | 1.063
Hong method | Risk Difference (RD) | | | 0.044 | 0.063
Hong method | Risk Ratio (RR) | | | 1.059 | 1.093

These true effects are one decomposition of the natural effects as discussed in section 2.3a. The total effect on the risk difference scale (TE) is the sum of the component effects (NDE, NIE and PIE), and the total effect on the risk ratio scale is the product of the component effects.

The DGP described by the equations in section 3.2 was used for each simulation to generate 2000 observations. The direct, indirect, and total effects of interest were estimated within each scenario using the three statistical approaches discussed below.
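The additive (RD) and multiplicative (RR) relations stated above can be checked against the values in Table 3. The short sketch below does this arithmetic; the assignment of the four values to TE, NDE, NIE and PIE follows the order in which the effects are defined in section 2.3c and is an assumption, as are the helper names `rd_gap` and `rr_gap`.

```python
# True effects as reported in Table 3.
# Assumption: the values are listed in the order TE, NDE, NIE, PIE.
rd = {"TE": 0.404, "NDE": 0.297, "NIE": 0.065, "PIE": 0.043}
rr = {"TE": 2.052, "NDE": 1.773, "NIE": 1.089, "PIE": 1.063}

# Hong method: TE and NDE are unchanged, NIE and PIE differ.
rd_hong = dict(rd, NIE=0.044, PIE=0.063)
rr_hong = dict(rr, NIE=1.059, PIE=1.093)


def rd_gap(effects):
    """Absolute gap between TE and the sum of its components (RD scale)."""
    return abs(effects["NDE"] + effects["NIE"] + effects["PIE"] - effects["TE"])


def rr_gap(effects):
    """Absolute gap between TE and the product of its components (RR scale)."""
    return abs(effects["NDE"] * effects["NIE"] * effects["PIE"] - effects["TE"])


for label, gap in [
    ("Lange/Steen RD", rd_gap(rd)),
    ("Lange/Steen RR", rr_gap(rr)),
    ("Hong RD", rd_gap(rd_hong)),
    ("Hong RR", rr_gap(rr_hong)),
]:
    # Gaps of about 0.001 or less are what three-decimal rounding would produce.
    print(f"{label}: gap = {gap:.4f}")
```

All four gaps are within the error expected from rounding the table entries to three decimals, so the reported components are consistent with the stated decomposition on both scales.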
To assess performance, method- and scale-specific effects from each simulation were then compared to the true effects obtained from the super population (2, 3).

3.4 The proposed estimation procedure

The validity of the newly modified procedure for two causally ordered mediators is shown in Appendices A and B. We performed Monte Carlo simulations similar to the Lange and Steen simulations, and extended them to the Hong method. The Monte Carlo simulations were based on generalizations of the Steen mediation formula, as shown in Figure 2 (3, 11).

The Lange, Steen and Hong approaches share several similarities in their respective procedures (Figure 2). All methods require expansions of the original dataset, need to generate weights for at least one mediator, and use those weights to fit a suitable model to the outcome variable. In addition, each approach uses a counterfactual framework based on a marginal structural model (MSM) to estimate causal mediation effects (2, 3).

As shown in Figure 2, the three methods differ in the process of data simulation. Prior to data expansion and weight generation, the Lange and Hong methods require that both M1 and M2 be modeled, while disregarding any model for Y. The Steen method, on the other hand, needs one mediator (M1 or M2) and Y to be modeled. Additional differences are discussed below and in Figure 2.

Model specifications were different for the three methods. We fitted a logistic regression model for the binary mediators, conditional on A and covariate set C. The Lange and Hong methods use both (1) and (2), whereas the Steen method uses either (1) or (2).

For the Steen method, we also fitted a logistic regression model for the binary outcome Y conditional on A, both M1 and M2, and covariate set C, as shown in equation (3).

The three methods employ different weights in the MSM (1-3). The section below describes how the weights are generated.
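As a rough illustration of the kind of weight involved, the sketch below computes a ratio-of-mediator-probability weight for a single binary mediator from a logistic mediator model. It is not the two-mediator weight used in this thesis; the function names and coefficient values are hypothetical and chosen only to make the arithmetic visible.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def mediator_prob(m, a, c, coef):
    """P(M = m | A = a, C = c) under a logistic model (hypothetical coefficients)."""
    p1 = expit(coef["const"] + coef["a"] * a + coef["c"] * c)
    return p1 if m == 1 else 1.0 - p1


def ratio_weight(m, a_obs, a_star, c, coef):
    """Ratio of the mediator probability under a_star to that under a_obs."""
    return mediator_prob(m, a_star, c, coef) / mediator_prob(m, a_obs, c, coef)


# Hypothetical coefficients, for illustration only.
coef = {"const": -0.5, "a": 1.0, "c": 0.3}

# No shift in exposure: numerator and denominator coincide, so the weight is 1.
w_same = ratio_weight(m=1, a_obs=1, a_star=1, c=1, coef=coef)

# Exposed subject with M = 1, reweighted to the mediator distribution under A = 0.
# The weight is below 1 here because exposure raises P(M = 1) in this toy model.
w_shift = ratio_weight(m=1, a_obs=1, a_star=0, c=1, coef=coef)

print(w_same, w_shift)
```

In practice the coefficients would come from a fitted mediator model rather than being fixed by hand, and with two sequential mediators the weight would involve a model for each mediator.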
The Lange weight was given by ( ). The Steen weight was computed by ( ), where C is a set of confounders. The Hong weights were computed as shown in Table 4.

Table 4. Hong weights for marginal structural models. Weight 0 0 0 0 1.0 1 0 0 0 1 1 0 0 1 0 1 0 1 1 0 1 1.0

Figure 2. Chart of the procedures executed for each mediation analysis method

CHAPTER 4. DATA SIMULATION

We compared three estimation approaches for sequential mediation analysis, including their software implementations in Stata version 16 (StataCorp LP, College Station, Texas). In this section we present the simulation scenarios designed to compare the robustness of the three approaches in estimating causal effects, as well as the criteria used to compare the performance of the three methods.

4.1 Simulation Scenarios

In addressing causal effects, we control for variables that are confounders ( ) of the relationship ( ), of the relationship ( ), of the relationship ( ), relationship ( ) of relationship ( ) and of relationship ( ) (Section 2.2). We selected N = 2000 to represent a relatively large sample size. We designed a range of scenarios that mis-specify the aforementioned relationships based on the causal pathway (Figure 1) so that we can understand the robustness of the three methods even under unusual conditions.

In total we evaluated the correctly specified model (CSM) and 12 scenarios for the Lange and Hong methods, and two CSMs (CSM1 and CSM2) and 22 scenarios for the Steen method (Table 5). The CSM represents the case in which all models were correctly specified.

Scenarios 1-9 are shared by all three methods. The first four scenarios represent cases in which M1 is mis-specified when (scenario 1, omitting confounding variable for relationship ), or (scenario 2, omitting unnecessary covariate ) or (scenario 3, omitting unnecessary covariate ), or (scenario 4) were omitted from the variable generation equation.
Scenarios 5-8 represent cases in which M2 was mis-specified when (scenario 5, omitting unnecessary covariate ), or (scenario 6, omitting confounding variable for relationship ) or (scenario 7, omitting confounding variable for relationship ), or (scenario 8) were omitted from the variable generation equation. Scenario 9 portrays the case in which M2 is mis-specified because the interaction is omitted.

For the Lange and Hong methods, scenarios 10-12 represent cases in which both M1 and M2 were mis-specified due to unmeasured confounding, as shown in Table 5.

For the Steen method, CSM1 and CSM2 are both correctly specified models: CSM1 is the model when and Y are included to generate IPTW, and CSM2 is the model when and are included to generate IPTW (Table 5). Scenarios 10-13 were cases in which Y was mis-specified due to unmeasured confounding (Table 5). Scenarios 14-16 represented mis-specified cases in which interaction (scenario 14), interaction (scenario 15), or interaction (scenario 16) was omitted. Scenarios 17-19 were cases in which both and were mis-specified due to unmeasured confounding; and scenarios 20-22 represented cases in which both and were mis-specified due to unmeasured confounding.

We discussed the confounder for each relation in section 2.2. Under the assumption of no omitted confounders in our DAG (Figure 1), omitting in scenarios 1, 4, 10 or 17 would increase bias for NIE because is the confounder of relation (Table 5). On the other hand, omitting in scenarios 6, 8, 11, or 21 would increase bias for PIE because is the confounder of relation (Table 5). When interaction is omitted in scenario 9, bias for PIE would rise compared to CSM. Omitting in scenarios 11 or 21 would increase bias for PIE because is the confounder of relation .
In addition, when interaction is omitted in scenario 14, bias for NIE would increase; when interaction is omitted in scenario 15, bias for PIE would increase. Lastly, when is omitted in scenarios 13, 17 or 20, bias for NDE would increase.

Section 4.2 describes the performance criteria used to compare the three mediation analysis methods across the simulation scenarios. Each scenario varies one feature of the data-generating process at a time. Changes in the parameter values of the inverse-probability-of-treatment weighting (IPTW) or ratio-of-mediator-probability weighting (RMPW) models lead to changes in the IPTW or RMPW weights and, correspondingly, in the performance of the estimators as measured in section 4.2. In this way, we can evaluate the influence of each data generation feature on the estimation results and assess the stability of performance for each estimation procedure.

Equations used to generate M1, M2 and Y are as follows:

Table 5.
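Methods and simulation scenarios
Method | Scenario | Description | Explanation | Expectation of bias in theory
Lange, Hong, Steen | CSM | Model for ; Model for : | CSM = Correctly-specified model | No
Lange, Hong, Steen | 1 | is omitted from model M1 | M1 is misspecified due to unmeasured confounding | Yes in NIE
Lange, Hong, Steen | 2 | is omitted from model M1 | M1 is misspecified due to unmeasured confounding | No
Lange, Hong, Steen | 3 | is omitted from model M1 | M1 is misspecified due to unmeasured confounding | No
Lange, Hong, Steen | 4 | are omitted from model M1 | M1 is misspecified due to unmeasured confounding | Yes in NIE
Lange, Hong, Steen | 5 | is omitted from model M2 | M2 is misspecified due to unmeasured confounding | No
Lange, Hong, Steen | 6 | is omitted from model M2 | M2 is misspecified due to unmeasured confounding | Yes in PIE
Lange, Hong, Steen | 7 | is omitted from model M2 | M2 is misspecified due to unmeasured confounding | No
Lange, Hong, Steen | 8 | are omitted from model M2 | M2 is misspecified due to unmeasured confounding | Yes in PIE
Lange, Hong, Steen | 9 | is omitted from model M2 | M2 is misspecified due to lack of interaction | Yes in PIE
Lange, Hong | 10a | is omitted from model M1 and is omitted from model M2 | Both M1 and M2 are misspecified due to unmeasured confounding | Yes in NIE
Lange, Hong | 11a | is omitted from model M1 and is omitted from model M2 | Both M1 and M2 are misspecified due to unmeasured confounding | Yes in NIE, PIE
Lange, Hong | 12a | is omitted from model M1 and is omitted from model M2 | Both M1 and M2 are misspecified due to unmeasured confounding | No
Steen | CSM1 | Model for ; Model for : | CSM1 = Correctly-specified model 1 | No
Steen | CSM2 | Model for ; Model for : | CSM2 = Correctly-specified model 2 | No
Steen | 10b | is omitted from model Y | Y is misspecified due to unmeasured confounding | Yes in NDE
Steen | 11b | is omitted from model Y | Y is misspecified due to unmeasured confounding | Yes in PIE
Steen | 12b | is omitted from model Y | Y is misspecified due to unmeasured confounding | Yes in NIE
Steen | 13 | are omitted from model Y | Y is misspecified due to unmeasured confounding | Yes in all
Steen | 14 | is omitted from model Y | Y is misspecified due to lack of interaction | Yes in NIE
Steen | 15 | is omitted from model Y | Y is misspecified due to lack of interaction | Yes in PIE
Steen | 16 | is omitted from model Y | Y is misspecified due to lack of interaction | Yes in NIE
Steen | 17 | is omitted from model and is omitted from model | and Y are misspecified due to unmeasured confounding | Yes in NDE, NIE
Steen | 18 | is omitted from model and is omitted from model | and Y are misspecified due to unmeasured confounding | Yes in PIE
Steen | 19 | is omitted from model and is omitted from model | and Y are misspecified due to unmeasured confounding | No
Steen | 20 | is omitted from model and is omitted from model | and Y are misspecified due to unmeasured confounding | Yes in NDE
Steen | 21 | is omitted from model and is omitted from model | and Y are misspecified due to unmeasured confounding | Yes in PIE
Steen | 22 | is omitted from model and is omitted from model | and Y are misspecified due to unmeasured confounding | No

4.2 Performance criteria

The causal mediation estimation was replicated 2000 times for each scenario.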
We used three criteria to examine the performance of the three estimation methods (3, 19):

(1) Bias: Bias was computed by subtracting the true value of the parameter from the parameter estimate: is the causal mediation effect estimate for the simulation.

(2) Root mean square error (RMSE): RMSE was computed by subtracting the true value of the parameter from the parameter estimate, squaring this value, adding the empirical variance of the parameter estimate, and taking the square root. Low values of RMSE reflect either low bias, high precision, or some combination of the two.

(3) Confidence interval coverage: We calculated the proportion of the 2,000 replications in which the true value fell within the estimated 95% confidence interval (CI). where is lower bound, is upper bound of the corresponding effect for the simulation.

CHAPTER 5. SIMULATION RESULTS

To investigate the robustness of the estimation approaches considered, we performed simulation studies with 2000 runs of data sets with 2000 observations each. Each method estimates the total effect (TE), natural direct effect (NDE), natural indirect effect (NIE) and partial indirect effect (PIE); the methods differ in how the weights are generated and in how they handle confounders and interactions. To make the simulation results more comparable, we report the bias, root mean squared error (RMSE), and coverage probability of the 95% confidence interval (CI) across 2,000 simulations relative to the true causal effects estimated from a large simulated data set (Table 6), as discussed in section 3.3. Low absolute values of bias indicate low bias. Low values of RMSE reflect low bias, high precision, or a combination of the two. Values of coverage probability closer to 95% indicate better performance.

Table 6. Performance in risk difference and risk ratio scales when all parameters were correctly specified in the three methods.
True effect  Metric    Lange CSM        Steen CSM1       Steen CSM2       Hong CSM
TE (RD)      Bias      0.004 (0.017)    0.004 (0.024)    0.004 (0.024)    0.000 (0.024)
             RMSE      0.611 (0.455)    0.847 (0.705)    0.850 (0.705)    0.832 (0.700)
             Coverage  94.3%            80.6%            81.3%            82.3%
NDE (RD)     Bias      0.005 (0.020)    0.005 (0.024)    0.005 (0.024)    0.001 (0.028)
             RMSE      0.740 (0.555)    0.876 (0.656)    0.878 (0.656)    0.984 (0.744)
             Coverage  95.7%            25.3%            26.8%            77.3%
NIE (RD)     Bias      0.003 (0.007)    0.003 (0.011)    0.003 (0.016)    0.022 (0.010)
             RMSE      0.266 (0.210)    0.388 (0.301)    0.575 (0.435)    0.969 (0.446)
             Coverage  88.0%            62.3%            95.2%            7.3%
PIE (RD)     Bias      -0.003 (0.006)   -0.004 (0.017)   -0.004 (0.010)   -0.022 (0.010)
             RMSE      0.253 (0.188)    0.610 (0.501)    0.360 (0.337)    1.005 (0.447)
             Coverage  83.7%            96.0%            66.5%            11.2%
TE (RR)      Bias      0.021 (0.060)    0.020 (0.090)    0.020 (0.090)    0.004 (0.088)
             RMSE      2.273 (1.722)    3.167 (2.616)    3.171 (2.622)    3.040 (2.498)
             Coverage  94.3%            76.8%            77.8%            78.0%
NDE (RR)     Bias      0.020 (0.062)    0.020 (0.079)    0.020 (0.079)    0.005 (0.086)
             RMSE      2.329 (1.778)    2.899 (2.223)    2.907 (2.222)    3.065 (2.354)
             Coverage  95.5%            40.1%            41.5%            77.3%
NIE (RR)     Bias      0.004 (0.011)    0.005 (0.017)    0.005 (0.023)    0.032 (0.016)
             RMSE      0.423 (0.339)    0.614 (0.493)    0.848 (0.656)    1.440 (0.712)
             Coverage  90.5%            65.0%            95.0%            9.5%
PIE (RR)     Bias      -0.005 (0.010)   -0.006 (0.026)   -0.006 (0.016)   -0.033 (0.016)
             RMSE      0.395 (0.291)    0.918 (0.754)    0.562 (0.516)    1.474 (0.696)
             Coverage  85.0%            95.7%            64.3%            15.1%

Note: Values in parentheses are the standard deviations (sd) of the measure of interest from 2,000 simulations.
Abbreviations: NDE = Natural direct effect, NIE = Natural indirect effect, PIE = Partial indirect effect, TE = Total effect, RD = Risk difference, RR = Risk ratio, GLM = Generalized linear model, RMSE = Root mean square error.

5.1 Comparison across scenarios within each method

5.1a Simulation results generated from Lange method

In Table 6, each row represents a specific causal effect in RD or RR scale. The robustness of each method in assessing causal effects was evaluated using bias, RMSE and coverage.
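The three criteria described in section 4.2 can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the thesis: the function name `performance`, the toy true effect, and the fixed standard error are all assumptions made for the example, and the RMSE line follows one reading of the verbal description (squared bias plus empirical variance, then the square root), so its scaling may differ from the values reported in Table 6.

```python
import math
import random
import statistics


def performance(estimates, lowers, uppers, truth):
    """Summarise Monte Carlo replicates of a single causal effect.

    estimates      : point estimate from each replication
    lowers, uppers : 95% CI bounds from each replication
    truth          : true value of the effect
    """
    # Bias: mean of (estimate - truth) over the replications.
    bias = statistics.fmean(estimates) - truth
    # Empirical variance of the estimates across replications.
    empirical_variance = statistics.variance(estimates)
    # RMSE under the assumed reading: sqrt(bias^2 + empirical variance).
    rmse = math.sqrt(bias ** 2 + empirical_variance)
    # Coverage: share of replications whose CI contains the truth.
    covered = [lo <= truth <= hi for lo, hi in zip(lowers, uppers)]
    coverage = sum(covered) / len(covered)
    return {"bias": bias, "rmse": rmse, "coverage": coverage}


# Toy replicates (hypothetical numbers, not thesis results).
random.seed(2020)
TRUE_EFFECT = 0.10
STD_ERROR = 0.02
estimates = [random.gauss(TRUE_EFFECT, STD_ERROR) for _ in range(2000)]
lowers = [e - 1.96 * STD_ERROR for e in estimates]
uppers = [e + 1.96 * STD_ERROR for e in estimates]

summary = performance(estimates, lowers, uppers, TRUE_EFFECT)
print(summary)
```

With an unbiased estimator and a correct standard error, as in this toy setup, the coverage should land near 95%; coverage far below that signals either bias or standard errors that are too small.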
For the Lange approach (Figure 3 and Table 6), when all parameters were correctly specified, the bias and RMSE in RD and RR scales for TE, NDE, NIE and PIE were small. The coverage in RD and RR scales for TE and NDE was close to 95%, which indicates that the confidence intervals performed as intended (Table 6). However, the coverage for NIE and PIE in RD and RR scales was lower than expected (about 90% or less) even when all parameters were correctly specified.

Figure 3 shows the performance in bias, RMSE and coverage of the Lange method in RD and RR scales across all scenarios. Table 7 summarizes the consistency of the Lange method in terms of bias with our expectation as mentioned in section 4.1. Across all scenarios, the coverage for TE and NDE in RD and RR scales was around 95% and resilient to parameter mis-specification, but the coverage for NIE and PIE in RD and RR scales was below 90% (Figure 3).

When was mis-specified due to omission of , bias and RMSE for NIE in RD and RR scales increased, and coverage in RD and RR scales decreased, in scenarios 1, 4 and 10 compared to those in CSM (Figure 3 and Table 7). is the confounder between (Figure 1), and were included in the equation to generate , but are not confounders between ; therefore, the larger bias for NIE occurred when was deleted in scenarios 1, 4 and 10. This result is consistent with our expectation of bias in theory (Table 5).

is the confounder between (Figure 1). When was mis-specified due to the omission of , bias and RMSE for PIE in RD and RR scales increased in scenarios 6, 8 and 11 compared to those in CSM (Figure 3 and Table 7). The bias for NIE and PIE increased in scenario 11 when was omitted from both and models, which is consistent with our expectation of bias in theory (Table 5 and Table 7).

is the confounder between . When was deleted from the equation to generate , bias and RMSE for NIE in RD and RR scales increased and coverage decreased in scenarios 3 and 12.
This result was contrary to our expectation of bias in theory.

In summary, the Lange approach performed as expected, with a few exceptions, across all scenarios.

Figure 3. Performance of Lange method in risk difference (RD) scale and risk ratio (RR) scale. Solid red line represents TE; black long-dashed line represents NDE; blue dot-dashed line represents NIE; green short-dashed line represents PIE. Bias in RD scale (A) and in RR scale (B); RMSE in RD scale (C) and RR scale (D); coverage probability in RD scale (E) and RR scale (F).

Table 7. Summary of Lange method in RD scale

Scenario  Description  Expectation of bias in theory  Bias             RMSE             Coverage
CSM       CSM          No                             No               No               No
1                      Yes in NIE                     Yes in NIE       Yes in NIE       Yes in NIE
2                      No                             No               No               No
3                      No                             Yes in NIE       Yes in NIE       Yes in NIE
4                      Yes in NIE                     Yes in NIE       Yes in NIE       Yes in NIE
5                      No                             No               No               No
6                      Yes in PIE                     No               Yes in PIE       Yes in PIE
7                      No                             No               No               No
8                      Yes in PIE                     Yes in PIE       Yes in PIE       No
9                      Yes in PIE                     Yes in PIE       Yes in PIE       Yes in PIE
10a       &            Yes in NIE                     Yes in NIE       Yes in NIE       Yes in NIE
11a       &            Yes in NIE, PIE                Yes in NIE, PIE  Yes in NIE, PIE  Yes in NIE, PIE
12a       &            No                             Yes in NIE       Yes in NIE       Yes in NIE

Note: Description column represents CSM or mis-specified models, for example, indicates was omitted from equation .

5.1b Simulation results for Steen method

For the Steen approach (Figure 4 and Table 6), when all parameters were correctly specified, the bias and RMSE in RD scale for TE, NDE, NIE and PIE were small as expected (Table 6); however, the coverage in RD and RR scales for TE and NDE was much smaller than 95% (Table 6). The coverage in RD and RR scales for NIE was 95.2% and 95.0% in CSM2; in contrast, the coverage in RD and RR scales for NIE in CSM1 was poor. The coverage in RD and RR scales for PIE was 96.0% and 95.7% in CSM1; in contrast, the coverage in RD and RR scales for PIE in CSM2 was much smaller than 95%.
The Steen method showed different performance in coverage depending on whether or was used for weighting.

Figure 4 shows the performance in bias, RMSE and coverage of the Steen method in RD and RR scales across all scenarios. Table 8 summarizes the consistency of the Steen method in terms of bias with our expectation as mentioned in section 4.1. Notably, across all scenarios, the coverage for TE and NDE in RD and RR scales was poor (<80%); furthermore, the coverage for NIE and PIE was below 90% in most scenarios (Figure 4).

is the confounder between (Figure 1). When was mis-specified due to omission of , the bias and RMSE for NIE in RD and RR scales increased in scenarios 1, 4 and 17 compared to those in CSM1 (Figure 4 and Table 8). Interestingly, the bias and RMSE for PIE in RD and RR scales also increased in scenarios 1 and 4.

is the confounder between (Figure 1). When was mis-specified due to the omission of , bias and RMSE for PIE in RD and RR scales increased in scenarios 6 and 8 compared to those in CSM2 (Figure 4 and Table 8). The bias and RMSE for PIE did not increase as expected. Notably, the bias and RMSE for NIE increased in scenarios 6 and 8.

is the confounder between . When was deleted from the equation to generate , the bias and RMSE for NDE in RD and RR scales increased in scenarios 10, 13, 17 and 20 as expected (Figure 4 and Table 8).

Deletion of interaction from model (scenario 14) led to an increase in bias and RMSE for NIE as expected (Figure 4 and Table 8). In addition, when interaction was omitted from the Y model (scenario 15), the bias and RMSE for PIE increased as expected. When interaction was omitted from the Y model (scenario 16), the bias and RMSE for NIE did not change, contrary to our expectation (Figure 4 and Table 8).

There are two notable points to mention for the Steen method. First, in most scenarios when or was mis-specified, the bias and RMSE for NIE and NDE did not change as expected (Figure 4 and Table 8).
Second, in scenarios 3, 7, 19 and 22, there were unexpected changes in bias and RMSE (Table 8).

In summary, the Steen approach seems less resilient to parameter mis-specification than the Lange method in many scenarios. We compare the methods further in the following sections.

Figure 4. Performance of Steen method in risk difference (RD) scale or risk ratio (RR) scale. Solid red line represents TE; black long-dashed line represents NDE; blue dot-dashed line represents NIE; green short-dashed line represents PIE. Bias in RD scale (A) and in RR scale (B); RMSE in RD scale (C) and RR scale (D); coverage probability in RD scale (E) and RR scale (F).

Table 8. Summary of Steen method in RD scale

Scenario  Description  Expectation of bias in theory  Bias                  RMSE
CSM1      CSM1         No                             No                    No
CSM2      CSM2         No                             No                    No
1                      Yes in NIE                     Yes in NIE, PIE       Yes in NIE, PIE
2                      No                             Yes in NIE, PIE       No
3                      No                             Yes in NIE, PIE       Yes in NIE, PIE
4                      Yes in NIE                     Yes in NIE, PIE       Yes in NIE, PIE
5                      No                             No                    No
6                      Yes in PIE                     Yes in NIE, PIE       Yes in NIE, PIE
7                      No                             Yes in NIE, PIE       Yes in NIE
8                      Yes in PIE                     Yes in NIE, PIE       Yes in NIE
9                      Yes in PIE                     Yes in NIE            Yes in NIE
10b                    Yes in NDE                     Yes in NDE            Yes in NDE
11b                    Yes in PIE                     No                    No
12b                    Yes in NIE                     Yes in NIE            Yes in NIE
13                     Yes in all                     Yes in NDE            Yes in NDE
14                     Yes in NIE                     Yes in NIE            Yes in NIE
15                     Yes in PIE                     Yes in PIE            Yes in PIE
16                     Yes in NIE                     No                    No
17        &            Yes in NDE, NIE                Yes in NDE, NIE, PIE  Yes in NDE, NIE, PIE
18        &            Yes in PIE                     Yes in NIE            Yes in NIE
19        &            No                             Yes in NIE            Yes in NIE
20        &            Yes in NDE                     Yes in NDE, NIE       Yes in NDE, NIE
21        &            Yes in PIE                     Yes in NIE            Yes in NIE
22        &            No                             Yes in PIE            Yes in NIE

Note: Description column represents CSM1, CSM2 or mis-specified models, for example, indicates was omitted from equation.

5.
1c Simulation results for Hong method

As shown in Table 6, for the Hong method, when all parameters were correctly specified, the bias for TE and NDE in both RD and RR scales was smaller than that from the Lange and Steen methods, although the RMSE was not smaller than that from the Lange method. On the other hand, the Hong method had larger bias and RMSE for NIE and PIE in RD and RR scales (Table 6). The coverage from the Hong method was poor; in particular, the coverage for NIE and PIE was extremely low (<20%).

Figure 5 shows the performance in bias, RMSE and coverage of the Hong method in RD and RR scales across all scenarios. Table 9 summarizes the consistency of the Hong method in terms of bias with our expectation as mentioned in section 4.1. Across all scenarios, TE was resilient to parameter mis-specification (Figure 5). The coverage for TE, NDE, NIE and PIE in RD and RR scales was poor (<85%) in all scenarios (Figure 5).

When was mis-specified due to omission of , bias and RMSE for NIE in RD and RR scales increased in scenarios 1, 4 and 10 compared to those in CSM (Figure 5 and Table 9). When was mis-specified due to the omission of , bias and RMSE for PIE in RD and RR scales increased in scenarios 6, 8 and 11 compared to those in CSM (Figure 5 and Table 9). Surprisingly, when was deleted from the equation to generate , bias and RMSE for NIE in RD and RR scales increased and coverage decreased in scenarios 3 and 12.

In summary, the Hong method had higher bias and RMSE for NIE and PIE in RD and RR scales across all scenarios relative to the other two methods (Figure 5). In many scenarios, the Hong approach showed the expected changes in bias and RMSE.

Figure 5. Performance of Hong method in risk difference (RD) scale or risk ratio (RR) scale. Solid red line represents TE; black long-dashed line represents NDE; blue dot-dashed line represents NIE; green short-dashed line represents PIE.
Bias in RD scale (A) and in RR scale (B); RMSE in RD scale (C) and RR scale (D); coverage probability in RD scale (E) and RR scale (F).

Table 9. Summary of Hong method in RD scale

Scenario  Description  Expectation of bias in theory  Bias             RMSE
CSM       CSM          No                             No               No
1                      Yes in NIE                     Yes in NIE       Yes in NIE
2                      No                             No               No
3                      No                             Yes in NIE       Yes in NIE
4                      Yes in NIE                     Yes in NIE       Yes in NIE
5                      No                             No               No
6                      Yes in PIE                     No               Yes in PIE
7                      No                             No               No
8                      Yes in PIE                     Yes in PIE, NIE  Yes in PIE, NIE
9                      Yes in PIE                     No               No
10a       &            Yes in NIE                     Yes in NIE       Yes in NIE
11a       &            Yes in NIE, PIE                Yes in NIE       Yes in NIE, PIE
12a       &            No                             Yes in NIE       Yes in NIE

Note: Description column represents CSM or mis-specified models, for example, indicates was omitted from equation.

5.2 Cross method comparison

Table 6 displays the performance in risk difference and risk ratio scales when confounders, mediators, and interactions were correctly specified. With respect to coverage probability, the Lange method had the best 95% confidence interval (CI) coverage compared with the other two approaches.

5.2a Comparison of Lange, Steen and Hong methods in estimating total effects

Because the three methods had the same mis-specification in scenarios 1-9, we compared the methods using CSM (CSM1 and CSM2 for the Steen method) and scenarios 1-9. The robustness of the three methods in RD and RR scales for TE estimation is shown in Figure 6 and Table 10. Low values (close to 0) of bias and RMSE indicate better performance. Coverage probabilities closer to 95% indicate better performance. In Table 10, a red cell means the method performed best among the three methods on the criterion discussed previously; green, in the middle; and blue, worst. The higher the total number of red cells, the better the overall performance of that method.
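The scoring behind the "Total" rows of Tables 10 to 13 (counting how often each method ranks best) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the thesis code: the function `tally_best` is hypothetical, the numbers are made up for the example, "best" is taken as the smallest absolute bias or the largest coverage as in the table notes, and ties go to the method listed first.

```python
from collections import Counter

METHODS = ("Lange", "Steen", "Hong")


def tally_best(values, key):
    """Count, per method, the scenarios in which it ranks best.

    values : {scenario: {method: metric value}}
    key    : maps a metric value to a score where lower is better
    """
    wins = Counter({method: 0 for method in METHODS})
    for by_method in values.values():
        # min() returns the first method with the lowest score.
        best = min(by_method, key=lambda m: key(by_method[m]))
        wins[best] += 1
    return dict(wins)


# Made-up values for three scenarios (not thesis results).
coverage = {
    "1": {"Lange": 0.94, "Steen": 0.80, "Hong": 0.82},
    "2": {"Lange": 0.95, "Steen": 0.78, "Hong": 0.83},
    "3": {"Lange": 0.93, "Steen": 0.79, "Hong": 0.81},
}
bias = {
    "1": {"Lange": 0.004, "Steen": -0.004, "Hong": 0.001},
    "2": {"Lange": 0.002, "Steen": 0.005, "Hong": -0.003},
    "3": {"Lange": 0.006, "Steen": 0.005, "Hong": 0.000},
}

# Larger coverage is better, so negate it; smaller |bias| is better.
coverage_wins = tally_best(coverage, key=lambda v: -v)
bias_wins = tally_best(bias, key=abs)
print(coverage_wins, bias_wins)
```

If coverage should instead be judged by closeness to the nominal level, the key could be swapped for `lambda v: abs(v - 0.95)` without changing the rest of the sketch.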
The Lange method had better coverage in RD and RR scales for TE than the other two methods, and the Steen method had the poorest coverage in RD and RR scales for TE in most scenarios (Figures 6E and 6F, and Table 10). The Hong method had the smallest bias and RMSE in RD and RR scales for TE. Given that the bias and RMSE for TE were small for all three methods, we compared the performance of the three methods in TE estimation using coverage. The Lange method had a score of 10 in coverage (Table 10), which indicates that the Lange method was more robust in TE estimation than the Steen and Hong methods.

Figure 6. Comparison of Lange method, Steen method and Hong method in TE estimation. Black solid line represents Lange method; red dot-dashed line represents Steen method; green long-dashed line represents Hong method. Bias in RD scale (A) and in RR scale (B); RMSE in RD scale (C) and RR scale (D); coverage probability in RD scale (E) and RR scale (F).

Table 10. Summary comparison of Lange, Steen and Hong methods in TE estimation in RD scale

TE in RD scale
           Bias                   Coverage               RMSE
Scenario   Lange  Steen  Hong     Lange  Steen  Hong     Lange  Steen  Hong
CSM               NA                     NA                     NA
CSM1       NA            NA       NA            NA       NA            NA
CSM2       NA            NA       NA            NA       NA            NA
1
2
3
4
5
6
7
8
9
Total      0      2      8        10     0      0        1      0      9

Note: Bias: smallest, middle, largest (the smaller, the better). Coverage: largest, middle, smallest (the larger, the better). RMSE: smallest, middle, largest (the smaller, the better). Low absolute values of bias indicate low bias. Coverage probabilities closer to 95% indicate better performance. Low values of RMSE reflect low bias, high precision, or some combination of the two.

5.
2b Comparison of Lange, Steen and Hong methods in estimating natural direct effect

Because the bias and RMSE for NDE were relatively small for all three methods, we compared the performance of the three methods in NDE estimation using coverage (Figure 7). The Lange method had a score of 10 in coverage (Table 11), suggesting that the Lange method performed better in NDE estimation than the Steen and Hong methods.

Figure 7. Comparison of Lange method, Steen method and Hong method in NDE estimation. Black solid line represents Lange method; red dot-dashed line represents Steen method; green long-dashed line represents Hong method. Bias in RD scale (A) and in RR scale (B); RMSE in RD scale (C) and RR scale (D); coverage probability in RD scale (E) and RR scale (F).

Table 11. Summary comparison of Lange, Steen and Hong methods in NDE estimation in RD scale

NDE in RD scale
           Bias                   Coverage               RMSE
Scenario   Lange  Steen  Hong     Lange  Steen  Hong     Lange  Steen  Hong
CSM               NA                     NA                     NA
CSM1       NA            NA       NA            NA       NA            NA
CSM2       NA            NA       NA            NA       NA            NA
1
2
3
4
5
6
7
8
9
Total      3      4      3        10     0      0        1      9      0

Note: Bias: smallest, middle, largest (the smaller, the better). Coverage: largest, middle, smallest (the larger, the better). RMSE: smallest, middle, largest (the smaller, the better). Low absolute values of bias indicate low bias. Coverage probabilities closer to 95% indicate better performance. Low values of RMSE reflect low bias, high precision, or some combination of the two.

5.2c Comparison of Lange, Steen and Hong methods in estimating natural indirect effect

As shown in Figure 8 and Table 12, the Lange method had smaller bias and RMSE for NIE relative to the Steen and Hong methods. Moreover, the Lange method had higher coverage for NIE estimation in 9 out of 10 scenarios (Table 12).
The Hong method uses a different rationale for estimating NIE from the Lange and Steen methods, and had the poorest performance in bias, RMSE and coverage in NIE estimation in almost all scenarios (Table 12).

Figure 8. Comparison of Lange method, Steen method and Hong method in NIE estimation. Black solid line represents Lange method; red dot-dashed line represents Steen method; green long-dashed line represents Hong method. Bias in RD scale (A) and in RR scale (B); RMSE in RD scale (C) and RR scale (D); coverage probability in RD scale (E) and RR scale (F).

Table 12. Summary comparison of Lange, Steen and Hong methods in NIE estimation in RD scale

NIE in RD scale
           Bias                   Coverage               RMSE
Scenario   Lange  Steen  Hong     Lange  Steen  Hong     Lange  Steen  Hong
CSM               NA                     NA                     NA
CSM1       NA            NA       NA            NA       NA            NA
CSM2       NA            NA       NA            NA       NA            NA
1
2
3
4
5
6
7
8
9
Total      9      1      0        6      4      0        7      3      0

Note: Bias: smallest, middle, largest (the smaller, the better). Coverage: largest, middle, smallest (the larger, the better). RMSE: smallest, middle, largest (the smaller, the better). Low absolute values of bias indicate low bias. Coverage probabilities closer to 95% indicate better performance. Low values of RMSE reflect low bias, high precision, or some combination of the two.

5.2d Comparison of Lange, Steen and Hong methods in estimating partial indirect effect

As shown in Figure 9 and Table 13, the Lange method had the smallest bias and RMSE among the three methods in most scenarios. The Lange method had the highest coverage for PIE in all scenarios (Table 13). The Steen method ranked second. The Hong method performed poorly in PIE estimation in almost all scenarios.

Figure 9.
Comparison of Lange method, Steen method and Hong method in PIE estimation. Black solid line represents Lange method; red dot-dashed line represents Steen method; green long-dashed line represents Hong method. Bias in RD scale (A) and in RR scale (B); RMSE in RD scale (C) and RR scale (D); coverage probability in RD scale (E) and RR scale (F).

Table 13. Summary comparison of Lange, Steen and Hong methods in PIE estimation

PIE in RD scale
           Bias                   Coverage               RMSE
Scenario   Lange  Steen  Hong     Lange  Steen  Hong     Lange  Steen  Hong
CSM               NA                     NA                     NA
CSM1       NA            NA       NA            NA       NA            NA
CSM2       NA            NA       NA            NA       NA            NA
1
2
3
4
5
6
7
8
9
Total      10     0      0        10     0      0        8      2      0

Note: Bias: smallest, middle, largest (the smaller, the better). Coverage: largest, middle, smallest (the larger, the better). RMSE: smallest, middle, largest (the smaller, the better). Low absolute values of bias indicate low bias. Coverage probabilities closer to 95% indicate better performance. Low values of RMSE reflect low bias, high precision, or some combination of the two.

CHAPTER 6. CONCLUSION

To understand the mechanisms with two sequential mediators, it is necessary to investigate the counterfactual framework (3, 10, 16). In this thesis, motivated by previous research (9), we considered two-sequential-mediator models with binary mediators, a binary outcome and a set of binary confounders. Before this study, Lange et al. (8) had performed the only simulation that we knew of that allowed different sets of confounders for different relationships, but they considered only one mediator and three confounders. We extended the scenarios with mis-specifications by omitting one or multiple confounders, or by deleting an interaction.
By comparing their robustness in bias, RMSE and coverage in various scenarios, we concluded that the Lange method was more resilient to mis-specification and performed better in NIE and PIE estimation in most scenarios compared with the Steen and Hong methods (Figures 8 and 9, Tables 12 and 13). The Hong method had the largest bias and RMSE, and the lowest coverage, for NIE and PIE among the three methods. In addition, the Lange method showed the changes in bias, RMSE, and coverage that we expected in most scenarios when mis-specification occurred.

Coverage for all causal effects estimated using the Steen and Hong methods was much smaller than the expected 95% in most scenarios (Figures 4 and 5). The low coverage was likely because we did not use bootstrapping to obtain the confidence intervals, but employed the default standard errors for the parameter estimates (section 4.2). Bootstrapping would likely provide better coverage because the resulting standard errors and confidence intervals are more accurate (3).

APPENDICES

APPENDIX A: Validity of the Modified Lange Method

Consistency: ; ;
Positivity: ;
No unmeasured confounders:
Identification assumptions (Extended sequential ignorability): and for all

Under the above-mentioned assumptions, we can estimate in the mediation analysis with two consecutive mediators.

(Where W is the weight.)

APPENDIX B: Proof of RMPW for estimating counterfactual outcomes of consecutive mediators

To obtain unbiased estimates of the causal effects when applying the RMPW method, stronger assumptions about sequential ignorability are required, as follows:

(i) Exposure A is independent of all the potential outcomes and the potential mediators given the observed pretreatment covariates C.
(ii) Given A and C, the assignment of is independent of all the potential outcomes and the potential mediators.
(iii*) Given A, , and C, the assignment of is independent of all the potential outcomes.
Individuals are assigned at random to or at the first stage of the experiment; those in the same treatment group a are assigned at random to different values of the first mediator at the second stage for ; those in the same treatment group a and with the same mediator value are assigned at random to different values of the second mediator at the third stage for a = 0,1 and for all possible values of .

Note:
RMPW weight

Note:
RMPW weight

RMPW weight

REFERENCES

1. Hong G. Causality in a social world: moderation, mediation and spill-over. 2015; ISBN: 978-1-118-33256-6.
2. Lange T, Rasmussen M, Thygesen LC. Assessing natural direct and indirect effects through multiple pathways. Am J Epidemiol. 2014;179(4):513-8.
3. Steen J, Loeys T, Moerkerke B, Vansteelandt S. Flexible Mediation Analysis With Multiple Mediators. Am J Epidemiol. 2017;186(2):184-93.
4. Baron RM, Kenny DA. The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol. 1986;51(6):1173-82.
5. Valeri L, Vanderweele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychol Methods. 2013;18(2):137-50.
6. VanderWeele TJ, Vansteelandt S. Conceptual issues concerning mediation, interventions and composition. Stat Interface. 2009;2(4):457-68.
7. Hong G, Deutsch J, Hill HD. Ratio-of-Mediator-Probability Weighting for Causal Mediation Analysis in the Presence of Treatment-by-Mediator Interaction. J Educ Behav Stat. 2015;40(3):307-40.
8. Lange T, Vansteelandt S, Bekaert M. A Simple Unified Approach for Estimating Natural Direct and Indirect Effects. Am J Epidemiol. 2012;176(3):190-5.
9. Liu BJ, Luo ZH, Pinto JM, et al. Relationship Between Poor Olfaction and Mortality Among Community-Dwelling Older Adults A Cohort Study. Ann Intern Med. 2019;170(10):673.
10. Vansteelandt S, Vanderweele TJ. Natural direct and indirect effects on the exposed: effect decomposition under weaker assumptions. Biometrics. 2012;68(4):1019-27.
11. Breese J, American Association for Artificial I, Conference on Uncertainty in Artificial I, Conference on Uncertainty in Artificial I, Uai. Uncertainty in artificial intelligence: proceedings of the seventeenth conference (2001); August 2-5, 2001, University of Washington, Seattle, Washington; [the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI-2001)]. San Francisco, Calif: Morgan Kaufmann; 2001.
12. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3(2):143-55.
13. Bellavia A, Valeri L. Decomposition of the Total Effect in the Presence of Multiple Mediators and Interactions. Am J Epidemiol. 2018;187(6):1311-8.
14. Daniel RM, De Stavola BL, Cousens SN, Vansteelandt S. Causal Mediation Analysis with Multiple Mediators. Biometrics. 2015;71(1):1-14.
15. Ekstrom I, Sjolund S, Nordin S, et al. Smell Loss Predicts Mortality Risk Regardless of Dementia Conversion. J Am Geriatr Soc. 2017;65(6):1238-43.
16. Vanderweele TJ, Vansteelandt S, Robins JM. Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology. 2014;25(2):300-6.
17. Imai K, Keele L, Tingley D. A general approach to causal mediation analysis. Psychol Methods. 2010;15(4):309-34.
18. Vanderweele TJ, Vansteelandt S, Robins JM. Marginal structural models for sufficient cause interactions. Am J Epidemiol. 2010;171(4):506-14.
19. Richiardi L, Bellocco R, Zugna D. Mediation analysis in epidemiology: methods, interpretation and bias. Int J Epidemiol. 2013;42(5):1511-9.