MEDIATION ANALYSIS FOR CASE-CONTROL STUDIES WITH SECONDARY AND
TERTIARY OUTCOMES

By

Zichun Cao

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

Biostatistics – Doctor of Philosophy

2023

ABSTRACT

Mediation analyses have been implemented in observational studies such as cross-sectional, case-control, and both prospective and retrospective cohort studies. However, little has been done for mediation analysis in a case-control study with subsequently collected (secondary and/or tertiary) outcomes. This situation arises when a case-control study carries additional interest in a secondary/tertiary outcome and in the direct or indirect effects of exposure on the secondary (tertiary) outcome through the primary (secondary) outcome. Performing such a mediation analysis while ignoring the complex study design could lead to seriously biased estimates because the primary case-control sample is a biased sample. To address this, we propose to account for the sampling process with the ratio-of-mediator-probability weighting method and establish two new weighting estimators. The proposed estimators work well with marginal structural models in estimating controlled direct effects, natural direct effects and natural indirect effects under complex study designs, and they allow for exposure-mediator interaction in relation to the outcome. We compare the proposed methods with existing methods and extended versions of existing methods and find that the proposed methods significantly reduce the computational resources needed to estimate standard errors. We also illustrate the implementation of the proposed weighting estimators using real-world data. Finally, we explore the minimum detectable effect sizes and the optimal case-to-control ratio for the proposed weighting estimators in case-control studies with secondary (tertiary) outcomes.

Copyright by
ZICHUN CAO
2023

For my family and friends

ACKNOWLEDGMENTS

I would like to express gratitude to my dissertation committee chairs, Dr. Zhehui Luo and

Dr. Honglei Chen, for their guidance and inspiring comments for both scientiﬁc knowledge

and life experience.

I would like to thank my other committee members: Dr. Chenxi Li and Dr. Yuehua Cui

for their kindness, encouragement and the valuable input they give.

I would like to thank the faculty and staﬀ in my department.

I would like to thank all the friends and coworkers I’ve met.

Lastly, I would like to thank my family, for loving and supporting me all the way along

my PhD journey.

TABLE OF CONTENTS

CHAPTER 1 Mediation Analysis
1.1 Introduction
1.2 Causal Directed Acyclic Graphs
1.3 Neyman-Rubin Causal Model
1.4 Fundamental Assumptions
1.5 Causal Mediation Estimands
1.6 Estimation Strategies for Mediation Analysis

CHAPTER 2 Mediation Analyses for Case-Control Studies With Secondary Outcomes
2.1 Background
2.2 Settings
2.3 Existing Methods for the Scenario
2.4 Proposed Weighting Method (Weighting Estimator I)
2.5 Simulation
2.6 Simulation Results
2.7 Application: the Olfaction Sub-study of the Sister Study Cohort
2.8 Conclusion

CHAPTER 3 Mediation Analyses for Case-Control Studies With Secondary and Tertiary Outcomes
3.1 Background
3.2 Settings
3.3 Existing Methods for the Scenario
3.4 Proposed Weighting Method
3.5 Simulation
3.6 Simulation Results
3.7 Application: the Olfaction Sub-study of the Sister Study Cohort
3.8 Conclusion

CHAPTER 4 Monte Carlo Based Statistical Power Analysis for Mediation Analysis Using Case-Control Studies With Secondary and Tertiary Outcomes
4.1 Background
4.2 Settings
4.3 Estimation Method
4.4 Simulation Studies
4.5 Results
4.6 Conclusion

CHAPTER 5 Conclusions

BIBLIOGRAPHY

APPENDIX A PROOFS OF THEOREMS

APPENDIX B DATA GENERATING PROCESS FOR CHAPTER 4

CHAPTER 1 Mediation Analysis

1.1 Introduction

Research on causal mediation analysis, covering both methodology and application, has expanded rapidly in the past several decades (Ten Have & Joffe 2012, Baron & Kenny 1986, VanderWeele 2009, Huber 2014, Hong 2010, Tchetgen Tchetgen 2013). However, limited work has been done on mediation analysis with complex study designs. Most existing statistical methods have been developed assuming that the exposure, mediator, and outcome are fully observed by study design, e.g., in a prospective cohort study with no missing data. Such designs are usually expensive and time-consuming to conduct. Compared with cohort studies, a case-control study design is less costly and has been widely used for studying associations between exposures and rare diseases/outcomes. Furthermore, following case-control sampling, researchers may be interested in secondary analyses of additionally collected outcomes within the case-control sample only, studying the association between the exposure and the additional outcomes (Schifano 2019). In these studies, the outcome used for case-control sampling is usually called the primary outcome and the additionally collected variables are called secondary outcomes. For example, Weedon et al. studied genetic risk factors for adult and childhood height (secondary outcome) with a case-control sample drawn based on type 2 diabetes status (primary outcome) (Weedon et al. 2007). Weuve et al. studied the relationship between lead exposure and cognitive function using data from two case-control samples in which the primary outcomes were hypertension and osteoporosis, respectively (Weuve et al. 2009). While those studies mainly aimed at association rather than causation, several attempts have also been made to perform mediation analyses using case-control studies (VanderWeele & Vansteelandt 2010, VanderWeele & Tchetgen Tchetgen 2016, Kim & Kaplan 2014, Wang et al. 2019, Satten et al. 2022).

Treating the primary outcome (case-control status) as the mediator and the secondary outcome as the outcome of interest, Huber and Solovyeva proposed a weighting method to estimate the mediating effect of an exposure on the secondary outcome through the primary outcome (Huber & Solovyeva 2020, Huber 2014). However, a major drawback of Huber and Solovyeva’s method is that part of the weight involves modeling the treatment as a function of the mediator and covariates, which does not have an immediate substantive interpretation (Hong 2015). In addition, their approach relies on bootstrap resampling to obtain the standard error and 95% confidence interval (CI), which can be computationally expensive. To address these drawbacks, we extend the work by Hong (Hong 2010, Lange et al. 2012) and propose a new weighting method.

Moving one step further, a tertiary outcome can be subsequently collected in addition to the secondary outcome in case-control studies. For example, when the secondary outcome, i.e., the mediator, is relatively expensive to measure and researchers use a cheaper surrogate (the primary outcome) for case-control sampling to improve the efficiency of the study, a tertiary outcome can be collected in extended follow-up visits within the case-control sample, with the study interest being the mediating effect of exposure on the tertiary outcome through the secondary outcome. Currently there is no readily available statistical method for estimating mediating effects in such a scenario. Thus, we extend Huber and Solovyeva’s work (Huber & Solovyeva 2020) by proposing a new weighted estimator, itself an extension of Hong’s work (Hong 2010, Lange et al. 2012), for mediation analyses that account for the complex study design. Finally, we explore the power and the optimal case-control sampling ratio for mediation analyses under the complex study designs specified above.

1.2 Causal Directed Acyclic Graphs

Causal directed acyclic graphs (DAGs) have been introduced to help communicate researchers’ understanding of the potential interplay among different variables in mediation analysis (Pearl 2009, Hernan & Robins 2020, Greenland et al. 1999, Lipsky & Greenland 2022). Briefly, a causal DAG consists of a set of vertices (or nodes) and a set of directed edges (or links) and contains no directed cycles.

Figure 1.1: An Example of a Causal DAG

More speciﬁcally, the causal DAG visually lays out the assumptions of the data gen-

erating process (DGP), which helps researchers to minimize bias in causal inference analysis.

A causal DAG has the following characteristics (Lipsky & Greenland 2022):

a. Since causality implies the time ordering from cause to consequence, cycles (e.g.,

X → Y → M → X) are not allowed.

b. The arrow reﬂects the causal relationship from cause to consequence. For example,

X → Y in Figure 1.1 indicates that there is a direct causal eﬀect of X on Y . However, the

presence of an arrow does not guarantee that the relationship could be detected in the data

since the effect can be small or negligible. The lack of an arrow between any two variables indicates that there is no direct causal relationship between them for any unit.

c. A complete causal DAG should include all common causes for each pair of vari-

ables.

d. In order to reﬂect the study design, a causal DAG may also include a variable to

indicate the (sample) selection process.

Taking Figure 1.1 as an example, if X is the exposure of interest and Y is the

outcome of interest, the arrow from X to Y indicates the assumption of existence of direct

causal eﬀects of X on Y . There are several types of nodes (variables) in a causal DAG:

1) Confounder (common cause): variables that have causal eﬀects on both the ex-

posure and the outcome, i.e., C.

2) Mediator: intermediate variables on the causal path from the exposure to the outcome, i.e., M.

3) Collider (common consequence): the variable with two arrows pointing into it,

i.e., U .

4) Selection indicator: researchers use a binary variable to indicate the sampling se-

lection process, either selected (1) or unselected (0) (Didelez et al. 2010, Lipsky & Greenland

2022), i.e., S in Figure 1.1. Usually, a square outside the selection node suggests that the

analysis is carried out among the selected sample. Practically, this can reﬂect the fact that

in a case-control study we can only analyze data using the case-control sample, where S = 1.

1.3 Neyman-Rubin Causal Model

We discuss causal effects under the potential (counterfactual) outcome framework first described by Neyman (Splawa-Neyman et al. 1923, 1990), later developed by Rubin (Rubin 1974, 1978), and further extended to observational studies. Throughout the dissertation, we use upper case to represent a random variable and lower case for a particular value of that random variable. For example, we use X to denote a binary exposure/treatment of interest, with the value of either 1 (exposed/treated) or 0 (unexposed/untreated). For an individual i, $Y_i^{x=1}$ denotes the potential outcome if he/she is exposed/treated, and $Y_i^{x=0}$ denotes the potential outcome if he/she is unexposed/untreated. For simplicity, we will use the term “potential outcome” for $Y^x$ and “treatment” for X throughout the dissertation. Hence, the individual causal effect of the treatment on the outcome for subject i can be defined as the difference between the two potential outcomes (Rubin 1974, ?):

$$\Delta_i = Y_i^{x=1} - Y_i^{x=0} \tag{1.1}$$

Since we can observe only one of the two potential outcomes above in the real world, individual causal effects cannot be identified, i.e., cannot be expressed as a function of the observed data (Hernan & Robins 2020). We therefore define the average causal/treatment effects across a population. Taking a binary outcome Y (0/1) as an example, the marginal average treatment effects (ATE) are usually defined on three different scales (Hernan & Robins 2020):

Risk difference: $\mathrm{ATE}_{RD} = P(Y^{x=1} = 1) - P(Y^{x=0} = 1)$;

Risk ratio: $\mathrm{ATE}_{RR} = \dfrac{P(Y^{x=1} = 1)}{P(Y^{x=0} = 1)}$;

Odds ratio: $\mathrm{ATE}_{OR} = \dfrac{P(Y^{x=1} = 1)/P(Y^{x=1} = 0)}{P(Y^{x=0} = 1)/P(Y^{x=0} = 0)}$.
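As a minimal numerical sketch of the three scales, the snippet below computes the risk difference, risk ratio and odds ratio from two potential-outcome risks. The function name `ate_scales` and the risks 0.3 and 0.2 are illustrative assumptions only; they are not taken from any study discussed in this dissertation.

```python
def ate_scales(p1: float, p0: float) -> dict:
    """Return the marginal ATE on three scales for a binary outcome.

    p1 = P(Y^{x=1} = 1) and p0 = P(Y^{x=0} = 1); both must lie strictly in (0, 1).
    """
    if not (0 < p1 < 1 and 0 < p0 < 1):
        raise ValueError("p1 and p0 must lie strictly between 0 and 1")
    risk_difference = p1 - p0
    risk_ratio = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    return {"RD": risk_difference, "RR": risk_ratio, "OR": odds_ratio}


# Illustrative (made-up) potential-outcome risks under treatment and control.
scales = ate_scales(p1=0.3, p0=0.2)
print(scales)
```

The same pair of risks gives three different numerical summaries, which is why the scale has to be stated whenever an ATE is reported.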

A more general notation for the marginal average treatment effect on the risk difference scale is $\mathrm{ATE} = E[Y^{x=1}] - E[Y^{x=0}]$ (Hong 2015, Hernan & Robins 2020). Some researchers may be interested in conditional average treatment effects, e.g., $\mathrm{ATE} = E[Y^{x=1} \mid C] - E[Y^{x=0} \mid C]$, where C is the conditioning set (VanderWeele 2015). The choice between marginal and conditional effects depends on the study goal. In this dissertation, we focus on marginal effects for more general purposes. Even though individual causal effects are not identifiable, the average causal/treatment effects in the population can sometimes be identified using observed data. Identifying and estimating the average treatment effects requires several basic assumptions, which we briefly introduce in the next section.

1.4 Fundamental Assumptions

The stable unit treatment value assumption (SUTVA) was ﬁrst proposed and repeatedly

emphasized by Rubin (Rubin 1980, 1986).

SUTVA (Stable Unit Treatment Value Assumption) (Rubin 1980, 1986, Im-

bens & Rubin 2015): The potential outcomes for any unit do not vary with the treatments

assigned to other units (no interference), and, for each unit, there are no diﬀerent forms

or versions of each treatment level, which lead to diﬀerent potential outcomes (no hidden

variations of treatments).

Consistency (Hernan & Robins 2020): For each subject, one of the potential outcomes is actually factual: the one corresponding to the treatment that the individual actually received. Formally, $Y_i = X_i Y_i^{x=1} + (1 - X_i) Y_i^{x=0}$ for a binary treatment X.

(Conditional) Exchangeability (Hernan & Robins 2020): The potential outcome is independent of the exposure (conditioning on covariates), i.e., $Y^x \perp\!\!\!\perp X \mid C$.

Additional assumptions are required to identify target causal eﬀects and will be

discussed in later chapters.

In Chapters 2 and 3, the mediation analysis proceeds through the five steps below:

1. Deﬁne the population potential outcomes and the target causal/mediating eﬀects of

interest.

2. Identify the target causal eﬀects using observable data under a set of assumptions. A

population causal eﬀect is identiﬁable only if the potential outcomes can be equated

with some observable data even if such data were not actually available for all individ-

uals in the population (Hong 2015).

3. Estimate the target causal eﬀects from observed data.

4. Examine the performance of the estimation in simulation.

5. Apply the statistical methods to real-world data.

The ﬁrst three tasks are common for general causal inference analysis (Heckman &

Vytlacil 2007) while the next two are for the evaluation of the estimation of the target causal

eﬀects (Hong 2015).

1.5 Causal Mediation Estimands

Unlike the average treatment effects, mediation analysis studies the effect of an exposure on an outcome that operates through one (or multiple) intermediate process(es), referred to as the indirect (or mediating) effect. At the same time, we may also be interested in the effect of the treatment on an outcome that does not operate through the specific intermediate process(es), referred to as the direct (or unmediated) effect. Formally, let X be the exposure, M be the intermediate process, which can be a single mediator or a vector, and Y be the outcome of interest. Different types of causal contrasts have been proposed in mediation methodology, such as controlled direct effects (CDE) (Pearl 2001), natural direct and indirect effects (NDE/NIE) (Robins & Greenland 1992, Pearl 2001), and interventional (in)direct effects (VanderWeele et al. 2014). In this dissertation, we mainly focus on CDE, NDE and NIE since they are the most commonly used quantities in mediation research. Additionally, NDE/NIE decompose the total causal effect and provide insight into the underlying mechanisms that motivate researchers to conduct mediation analysis.

1.5.1 Controlled Direct Eﬀects (CDE)

Holland was the first to apply the Neyman-Rubin causal model to mediation problems (Holland 1988). He defined two causal effects of interest: $Y_i^{xm} - Y_i^{x^*m}$ and $Y_i^{xm} - Y_i^{xm^*}$. The first quantity measures the effect of a manipulable treatment while the second measures the effect of a manipulable mediator. Several years later, the first quantity was defined formally as the CDE (Pearl 2001, Robins & Greenland 1992). The individual CDE $Y_i^{xm} - Y_i^{x^*m}$ stands for the causal effect of X on Y for subject i when the mediator M is fixed at value m.

1.5.2 Natural Eﬀects (NE)

Instead of fixing the value of the mediator at m, natural effects are the effect measures obtained when one sets the value of the mediator to the random variable $M^x$. Pearl pointed out that the CDE is prescriptive since the mediator value is strictly under the experimenter’s control and is not affected by the treatment assignment. In contrast, the descriptive natural (in)direct effects (NDE/NIE) maintain the natural relationship between the treatment and the mediator (Pearl 2001, Hong 2015, Robins & Greenland 1992). To formally define the NDE/NIE for each subject, we use the following notation (Pearl 2001):

• (Pure) Natural direct effects: $\mathrm{NDE}_i = Y_i^{xM_i^{x^*}} - Y_i^{x^*M_i^{x^*}}$, which represents the effect of X on Y when the treatment value changes from $x^*$ to $x$ while the mediator is set to the natural value it would take if the treatment were set to the baseline value $x^*$.

• (Total) Natural indirect effects: $\mathrm{NIE}_i = Y_i^{xM_i^{x}} - Y_i^{xM_i^{x^*}}$, the difference between the outcome Y of a treated subject i with the mediator set to the value it would take if the subject were treated ($Y_i^{xM_i^{x}}$), and the outcome Y of the treated subject i with the mediator set to the value it would take if the subject were not treated ($Y_i^{xM_i^{x^*}}$).

Similarly, we have another pair of natural effects:

• (Total) Natural direct effects: $\mathrm{NDE}_i = Y_i^{xM_i^{x}} - Y_i^{x^*M_i^{x}}$.

• (Pure) Natural indirect effects: $\mathrm{NIE}_i = Y_i^{x^*M_i^{x}} - Y_i^{x^*M_i^{x^*}}$.

The sum of (pure) NDE and (total) NIE, or the sum of (total) NDE and (pure) NIE, is the total effect of the treatment (X) on the outcome (Y): pure NDE + total NIE = total NDE + pure NIE = $Y^{xM^{x}} - Y^{x^*M^{x^*}} = Y^{x} - Y^{x^*}$. However, in general, the total effect cannot be divided in terms of CDEs and indirect effects. From now on, we refer to (pure) NDE simply as NDE and (total) NIE as NIE since they are more commonly used.
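The two decompositions can be checked numerically for a single hypothetical individual. In the sketch below, the structural functions `mediator` and `outcome` and all of their coefficients are assumptions chosen only for illustration; the interaction term in `outcome` is what makes the pure and total versions of each effect differ.

```python
def mediator(x: int) -> float:
    """Hypothetical potential mediator M_i^x for one individual."""
    return 1.0 + 2.0 * x


def outcome(x: int, m: float) -> float:
    """Hypothetical potential outcome Y_i^{xm} with a treatment-mediator interaction."""
    return 3.0 * x + 2.0 * m + 1.0 * x * m


def nested(x: int, x_med: int) -> float:
    """Nested potential outcome Y_i^{x M_i^{x_med}}."""
    return outcome(x, mediator(x_med))


x, x_star = 1, 0
pure_nde = nested(x, x_star) - nested(x_star, x_star)
total_nie = nested(x, x) - nested(x, x_star)
total_nde = nested(x, x) - nested(x_star, x)
pure_nie = nested(x_star, x) - nested(x_star, x_star)
total_effect = nested(x, x) - nested(x_star, x_star)

print(pure_nde, total_nie, total_nde, pure_nie, total_effect)
```

With these made-up functions, both sums reproduce the same total effect even though pure NDE differs from total NDE and pure NIE differs from total NIE.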

1.6 Estimation Strategies for Mediation Analysis

1.6.1 Path Analysis

The earliest mediation analysis is path analysis (Wright 1920, 1934), which used several paths

to determine the piebald pattern of guinea-pigs. The path analysis approach was widely

applied in social science research for mediation analysis (Alwin & Hauser 1975, Duncan

1966, Finney 1972).

1.6.2 Structural Equation Modeling (SEM)

In 1986, Baron and Kenny proposed the well-known structural equation modeling (SEM)

approach to analyze direct and indirect eﬀects (Baron & Kenny 1986).

Figure 1.2: Causal DAG for SEM Approach

As shown in Figure 1.2, there are two causal paths from X to Y. One is X → Y (the direct effect of X on Y) and the other is X → M → Y (the indirect effect of X on Y through M). This approach uses two linear regression models, one for the mediator (M) and the other for the outcome (Y):

$$M = \gamma_0 + aX + \varepsilon_M \tag{1.2}$$

$$Y = \beta_0 + bM + cX + \varepsilon_Y \tag{1.3}$$

Here, $\gamma_0$ and $\varepsilon_M$ are the intercept and the error term in the mediator model, while $\beta_0$ and $\varepsilon_Y$ are the intercept and the error term in the outcome model. Researchers treat c from regression model 1.3 as the direct effect and the product ab from regression models 1.2 and 1.3 as the indirect effect (Baron & Kenny 1986, Judd & Kenny 1981). Additionally, we can build a third regression model:

$$Y = \beta'_0 + c'X + \varepsilon'_Y \tag{1.4}$$

If we treat $c' = ab + c$ from regression model 1.4 as the total effect of X on Y, we may then calculate the indirect effect as $c' - c$ instead of ab. To determine whether an indirect effect exists, the Sobel test is widely used (Sobel 1982).
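The agreement between the product-of-coefficients and difference-of-coefficients calculations in the linear case can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the sample size, seed, and data-generating coefficients (0.8, 0.6, 0.4) are made up, and the helper names `centered_cross` and `fit_paths` are hypothetical. Ordinary least squares is written out in closed form so that no external library is needed.

```python
import random


def centered_cross(u, v):
    """Sum of cross-products of u and v after centering each at its mean."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def fit_paths(x, m, y):
    """OLS estimates of a (model 1.2), b and c (model 1.3), and c' (model 1.4)."""
    sxx, smm, sxm = centered_cross(x, x), centered_cross(m, m), centered_cross(x, m)
    sxy, smy = centered_cross(x, y), centered_cross(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                       # slope of M on X
    b = (sxx * smy - sxm * sxy) / det   # slope of Y on M, adjusting for X
    c = (smm * sxy - sxm * smy) / det   # slope of Y on X, adjusting for M
    c_total = sxy / sxx                 # slope of Y on X alone
    return a, b, c, c_total


# Simulated data from an assumed linear model without interaction.
random.seed(0)
n = 2000
x = [random.randint(0, 1) for _ in range(n)]
m = [0.5 + 0.8 * xi + random.gauss(0, 1) for xi in x]
y = [1.0 + 0.6 * mi + 0.4 * xi + random.gauss(0, 1) for xi, mi in zip(x, m)]

a, b, c, c_total = fit_paths(x, m, y)
indirect_product = a * b
indirect_difference = c_total - c
print(indirect_product, indirect_difference)
```

In this linear setting without interaction the two numbers coincide up to floating-point error, because the identity c' = ab + c holds exactly for least-squares fits on the same sample. The disadvantages listed next concern settings where this convenience no longer applies.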

However, this approach has several disadvantages:

• The functional forms of the specified regression models have to be correct. For example, if M and Y are not linearly associated, or X and M do not additively influence Y, the direct and indirect effects calculated using the above models could be incorrect.

• The total eﬀect cannot be decomposed using controlled direct and indirect eﬀects.

• Potential interaction between the treatment (X) and the mediator (M) may bias the direct and indirect effects of X on Y. Attempts have been made to use modified regression approaches that incorporate the treatment-mediator interaction in the outcome regression model (Preacher et al. 2007), but the proposed alternative conditional indirect effect does not correspond to the average natural indirect effect defined in section 1.5.2 without additional assumptions. Later, more researchers proposed different approaches to estimate the NDE with treatment-mediator interaction in the outcome regression, which require much more complex and intensive calculations (Petersen et al. 2006, VanderWeele & Vansteelandt 2010, Valeri & Vanderweele 2013).

• When logistic regression models or probit models are used for binary outcomes, the

product of coeﬃcients approach (ab) and the diﬀerence of coeﬃcients approach typi-

cally give diﬀerent estimates of NIE deﬁned in section 1.5.2, and neither is unbiased

(MacKinnon 2007, Mackinnon & Dwyer 1993, MacKinnon et al. 2007).

Therefore, defining NDE/NIE in terms of potential outcomes is more flexible than, and preferable to, using the SEM.

1.6.3 Marginal Structural Models (MSM)

Marginal structural models can be used to estimate both controlled and natural eﬀects, and

allow for treatment-mediator interaction (VanderWeele 2009, Lange et al. 2012, Robins et al.

2000). To estimate CDE, we can use the following MSM (VanderWeele 2009):

$$E[Y^{xm}] = \alpha_0 + \alpha_1 x + \alpha_2 m + \alpha_3 xm \tag{1.5}$$

It can be easily shown that when the treatment value changes from $x^*$ to $x$ with the mediator fixed at $m$, and the MSM is correctly specified, the CDE on the risk difference scale is $E[Y^{xm} - Y^{x^*m}] = \alpha_1(x - x^*) + \alpha_3(x - x^*)m$. To estimate this effect, researchers can use the inverse probability of treatment weighting procedure (VanderWeele 2009, Robins et al. 2000), with the first part of the weight being $\dfrac{P(X = x_i)}{P(X = x_i \mid C = c_i)}$ and the second part being $\dfrac{P(M = m_i \mid X = x_i)}{P(M = m_i \mid X = x_i, C = c_i, W = w_i)}$, where C is the treatment-outcome confounder set and W is the mediator-outcome confounder set, under the exchangeability assumptions. One advantage of this approach over the traditional regression approach is that it allows W to be a consequence of the treatment, whereas the traditional approach does not (VanderWeele 2009).
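To make the role of the two weight components concrete, the following sketch evaluates the weighted cell means exactly, by enumeration, in a small hypothetical population. Everything in it is an assumption made for illustration: a single binary confounder C plays the role of both confounder sets (there is no separate W), X and M are binary, and the probabilities and outcome means are made up. It is not the estimation code used in this dissertation.

```python
# Hypothetical population: binary C, X, M (all numbers are illustrative assumptions).
P_C = {0: 0.6, 1: 0.4}  # P(C = c)


def p_x(x, c):
    """P(X = x | C = c)."""
    p1 = 0.3 + 0.4 * c
    return p1 if x == 1 else 1 - p1


def p_m(m, x, c):
    """P(M = m | X = x, C = c)."""
    p1 = 0.2 + 0.3 * x + 0.2 * c
    return p1 if m == 1 else 1 - p1


def mean_y(x, m, c):
    """E[Y | X = x, M = m, C = c], assumed equal to E[Y^{xm} | C = c]."""
    return 1.0 + 2.0 * x + 1.5 * m + 0.5 * x * m + 1.0 * c


def p_x_marginal(x):
    return sum(P_C[c] * p_x(x, c) for c in P_C)


def p_m_given_x(m, x):
    return sum(P_C[c] * p_x(x, c) * p_m(m, x, c) for c in P_C) / p_x_marginal(x)


def weight(x, m, c):
    """Product of the treatment part and the mediator part of the weight."""
    return (p_x_marginal(x) / p_x(x, c)) * (p_m_given_x(m, x) / p_m(m, x, c))


def cell_mean(x, m, weighted=True):
    """(Weighted) mean of Y among units with X = x and M = m."""
    num = den = 0.0
    for c in P_C:
        mass = P_C[c] * p_x(x, c) * p_m(m, x, c)  # P(C = c, X = x, M = m)
        w = weight(x, m, c) if weighted else 1.0
        num += mass * w * mean_y(x, m, c)
        den += mass * w
    return num / den


cde_weighted = {m: cell_mean(1, m) - cell_mean(0, m) for m in (0, 1)}
cde_naive = {m: cell_mean(1, m, False) - cell_mean(0, m, False) for m in (0, 1)}
print(cde_weighted, cde_naive)
```

In this made-up population the weighted contrasts recover the controlled direct effects implied by `mean_y` (2.0 at m = 0 and 2.5 at m = 1, i.e., α1 = 2.0 and α3 = 0.5 in a saturated version of model 1.5), while the unweighted contrasts are distorted by confounding through C.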

To estimate NDE and NIE, we can use a similar MSM (Lange et al. 2012):

$$E[Y^{xM^{x^*}}] = c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^* \tag{1.6}$$

It can be easily shown that when the treatment value changes from $x^*$ to $x$ and the MSM is correctly specified, the NDE on the risk difference scale is

$$\mathrm{NDE}_{RD} = E[Y^{xM^{x^*}} - Y^{x^*M^{x^*}}] = (c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^*) - (c_0 + c_1 x^* + c_2 x^* + c_3 x^* \cdot x^*) = c_1(x - x^*) + c_3 x^*(x - x^*),$$

and the NIE on the risk difference scale is

$$\mathrm{NIE}_{RD} = E[Y^{xM^{x}} - Y^{xM^{x^*}}] = (c_0 + c_1 x + c_2 x + c_3 x \cdot x) - (c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^*) = c_2(x - x^*) + c_3 x(x - x^*).$$

The major difference between model 1.5 and model 1.6 is how the value of the mediator is handled. The former assumes that the mediator M is manipulable, so it specifies $E[Y^{xm}]$ as a function of x and m. For NDE/NIE, however, the mediator is usually assumed not to be manipulable, and $E[Y^{xM^{x^*}}]$ is specified as a function of the actually received treatment (x) and the counterfactual treatment ($x^*$).
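Given coefficients from an MSM of the form 1.6, the NDE and NIE on the risk difference scale follow by differencing the model at the appropriate arguments. The sketch below does exactly that; the function names and the coefficient values are hypothetical and serve only to check the closed-form expressions.

```python
def msm_mean(coef, x, x_med):
    """E[Y^{x M^{x_med}}] under model 1.6 with coef = (c0, c1, c2, c3)."""
    c0, c1, c2, c3 = coef
    return c0 + c1 * x + c2 * x_med + c3 * x * x_med


def nde_rd(coef, x, x_star):
    """NDE on the risk difference scale: change x, hold the mediator at M^{x*}."""
    return msm_mean(coef, x, x_star) - msm_mean(coef, x_star, x_star)


def nie_rd(coef, x, x_star):
    """NIE on the risk difference scale: hold x, change the mediator from M^{x*} to M^{x}."""
    return msm_mean(coef, x, x) - msm_mean(coef, x, x_star)


coef = (1.0, 0.5, 0.3, 0.2)  # made-up values of (c0, c1, c2, c3)
x, x_star = 1, 0
nde = nde_rd(coef, x, x_star)
nie = nie_rd(coef, x, x_star)
total = msm_mean(coef, x, x) - msm_mean(coef, x_star, x_star)
print(nde, nie, total)
```

For a binary treatment with x = 1 and x* = 0, the NDE reduces to c1 and the NIE to c2 + c3, and their sum equals the total effect.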

1.6.4 Conditional Marginal Structural Models

In addition to the approach described in section 1.6.3, and to address potential treatment-mediator interaction in estimating NDE and NIE, VanderWeele proposed conditional marginal structural models (VanderWeele 2009, 2015). He proposed using two conditional marginal structural models to estimate the conditional NDE and NIE:

$$E[Y^{xm} \mid C = c] = \theta_0 + \theta_1 x + \theta_2 m + \theta_3 xm + \theta_4' c \tag{1.7}$$

$$E[M^{x} \mid C = c] = \gamma_0 + \gamma_1 x + \gamma_2' c \tag{1.8}$$

where C is the conditioning set, e.g., treatment-outcome confounders. With some calculation, the conditional NDE is $E[Y^{xM^{x^*}} - Y^{x^*M^{x^*}} \mid C = c] = (\theta_1 + \theta_3\gamma_0 + \theta_3\gamma_1 x^* + \theta_3\gamma_2' c)(x - x^*)$ and the conditional NIE is $E[Y^{xM^{x}} - Y^{xM^{x^*}} \mid C = c] = \theta_2\gamma_1(x - x^*) + \theta_3\gamma_1 x(x - x^*)$. As pointed out by VanderWeele, the conditional NIE on the risk difference scale estimated using models 1.7 and 1.8 does not depend on the covariate set C, so the average marginal NIE is the same as the conditional NIE (VanderWeele 2009).
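The closed-form expressions can be checked by plugging the conditional mediator mean from model 1.8 into model 1.7, which is legitimate here because model 1.7 is linear in m. The sketch below takes the NIE to be the contrast between $Y^{xM^{x}}$ and $Y^{xM^{x^*}}$, as defined in section 1.5.2. The parameter values, the single covariate, and the function names are all assumptions made for illustration.

```python
def dot(coefs, values):
    return sum(k * v for k, v in zip(coefs, values))


def cond_mediator_mean(gamma, x, c):
    """E[M^x | C = c] under model 1.8, gamma = (gamma0, gamma1, gamma2_vector)."""
    gamma0, gamma1, gamma2 = gamma
    return gamma0 + gamma1 * x + dot(gamma2, c)


def cond_nested_mean(theta, gamma, x, x_med, c):
    """E[Y^{x M^{x_med}} | C = c]: model 1.7 evaluated at the model 1.8 mediator mean."""
    theta0, theta1, theta2, theta3, theta4 = theta
    m_bar = cond_mediator_mean(gamma, x_med, c)
    return theta0 + theta1 * x + (theta2 + theta3 * x) * m_bar + dot(theta4, c)


theta = (0.0, 1.0, 0.5, 0.25, [0.3])  # made-up (theta0, theta1, theta2, theta3, theta4)
gamma = (0.2, 0.4, [0.1])             # made-up (gamma0, gamma1, gamma2)
x, x_star = 1, 0


def cond_nde(c):
    return cond_nested_mean(theta, gamma, x, x_star, c) - cond_nested_mean(theta, gamma, x_star, x_star, c)


def cond_nie(c):
    return cond_nested_mean(theta, gamma, x, x, c) - cond_nested_mean(theta, gamma, x, x_star, c)


c = [2.0]
nde_closed = (theta[1] + theta[3] * cond_mediator_mean(gamma, x_star, c)) * (x - x_star)
nie_closed = (theta[2] * gamma[1] + theta[3] * gamma[1] * x) * (x - x_star)
print(cond_nde(c), nde_closed, cond_nie(c), nie_closed)
```

Evaluating `cond_nie` at a different covariate value returns the same number, which mirrors the remark that the conditional NIE on the risk difference scale does not depend on C, whereas `cond_nde` changes with c through the interaction term.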

1.6.5 Resampling Approach

Researchers have also proposed Monte Carlo simulation algorithms for generating the distribution of potential outcomes $E[Y^{xM^{x^*}}]$ and $E[Y^{x^*M^{x}}]$ to estimate NDE and NIE (Imai, Keele & Tingley 2010, Imai, Keele & Yamamoto 2010). However, these methods are computationally intensive and, from a practical perspective, may not be easy to implement.

1.6.6 Imputation-based Approach

Like the resampling approach, imputation-based methods try to generate the unobservable potential outcomes (Vansteelandt et al. 2012). This approach is sometimes attractive when unstable performance of weighting-based estimators (due to extreme values in the weights) is a concern. However, since weighting-based methods are better established and easy to implement in practice, we will focus on existing weighting-based methods and their extensions.

1.6.7 Other Weighting Methods

Other weighting methods have been proposed based on the rationale of identifying population potential outcomes using observed data (Huber 2014, Huber & Solovyeva 2020, Tchetgen Tchetgen 2013, Hong 2010, Albert 2012, Nguyen et al. 2023). Interestingly, all of these weighting methods are mathematically equivalent or closely related (Hong 2010, Hong et al. 2015, Hong & Nomi 2012, Tchetgen & Shpitser 2012), except Albert’s method, which has advantages if the mediator is non-binary. For example, Hong’s ratio-of-mediator-probability weighting (RMPW) approach is valid under the following assumptions (Hong 2010):

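1) $0 < P(X = x \mid C) < 1$

2) $0 < P(M^{x} = m \mid X, C) < 1$

3) $Y^{xm} \perp\!\!\!\perp M^{x} \mid X = x, C$

4) $M^{x} \perp\!\!\!\perp X \mid C$

5) $Y^{xm} \perp\!\!\!\perp M^{x^*} \mid X = x, C$

6) $(Y^{xM^{x^*}}, Y^{x^*M^{x^*}}, Y^{x^*M^{x}}, Y^{xM^{x}}) \perp\!\!\!\perp X \mid C$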

The population potential outcome $E[Y^{xM^{x^*}}]$ can be estimated using observed data in the absence of post-treatment covariates (Hong 2010):

$$E[Y^{xM^{x^*}}] = E[wY \mid X = x], \qquad w = \frac{q^{(x^*)}(M^{x^*} = m \mid X = x^*, C)}{q^{(x)}(M^{x} = m \mid X = x, C)} \cdot \frac{P(X = x)}{P(X = x \mid C)},$$

where $q^{(\cdot)}(\cdot)$ is the density function for the mediator. This method is quite flexible and can be easily implemented in practice. Lange et al. further extended Hong’s work by combining it with MSMs, making it easier to obtain the robust standard error and 95% CI using standard statistical software (Lange et al. 2012).
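The identity $E[Y^{xM^{x^*}}] = E[wY \mid X = x]$ can be verified exactly in a small hypothetical population by enumeration. The sketch below assumes a full cohort (no case-control sampling), a single binary confounder C, binary X and M, and made-up probabilities and outcome means; the identification assumptions listed above are taken to hold by construction. It illustrates the weighting idea only and is not the estimator proposed in later chapters.

```python
from itertools import product

# Hypothetical population: binary C, X, M (all numbers are illustrative assumptions).
P_C = {0: 0.6, 1: 0.4}  # P(C = c)


def p_x(x, c):
    """P(X = x | C = c)."""
    p1 = 0.3 + 0.4 * c
    return p1 if x == 1 else 1 - p1


def p_m(m, x, c):
    """P(M = m | X = x, C = c), taken to equal P(M^x = m | C = c)."""
    p1 = 0.2 + 0.3 * x + 0.2 * c
    return p1 if m == 1 else 1 - p1


def mean_y(x, m, c):
    """E[Y | X = x, M = m, C = c], taken to equal E[Y^{xm} | C = c]."""
    return 1.0 + 2.0 * x + 1.5 * m + 0.5 * x * m + 1.0 * c


def p_x_marginal(x):
    return sum(P_C[c] * p_x(x, c) for c in P_C)


def rmpw_weight(m, c, x, x_star):
    """Ratio of mediator probabilities times the treatment-weight factor."""
    return (p_m(m, x_star, c) / p_m(m, x, c)) * (p_x_marginal(x) / p_x(x, c))


def weighted_mean(x, x_star):
    """E[wY | X = x], by enumeration over (C, M) among units with X = x."""
    total = 0.0
    for c, m in product(P_C, (0, 1)):
        p_cm_given_x = P_C[c] * p_x(x, c) * p_m(m, x, c) / p_x_marginal(x)
        total += p_cm_given_x * rmpw_weight(m, c, x, x_star) * mean_y(x, m, c)
    return total


def true_mean(x, x_star):
    """E[Y^{x M^{x*}}] computed directly from the structural quantities."""
    return sum(P_C[c] * p_m(m, x_star, c) * mean_y(x, m, c)
               for c, m in product(P_C, (0, 1)))


nde = weighted_mean(1, 0) - weighted_mean(0, 0)
nie = weighted_mean(1, 1) - weighted_mean(1, 0)
print(weighted_mean(1, 0), true_mean(1, 0), nde, nie)
```

When x* equals x, the mediator ratio is 1 and the weight reduces to the treatment-weight factor alone, so the same function also returns $E[Y^{xM^{x}}]$ and $E[Y^{x^*M^{x^*}}]$, which is what the NDE and NIE lines rely on.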

1.6.8 Two-Phase and Multi-Phase Sampling

Unlike randomized controlled trials, observational studies have no randomization process, which makes them susceptible to confounding. Many attempts have been made to perform causal mediation analyses in common observational study types such as cohort and case-control studies (Hong 2010, VanderWeele & Vansteelandt 2010, VanderWeele & Tchetgen Tchetgen 2016). However, only a limited number of publications focus on mediation analyses with more complex study designs, e.g., two-phase sampling. In studies with two-phase sampling (a.k.a. double sampling or two-stage sampling), some variables are measured on all individuals in a sample drawn from a target population, e.g., a cohort. Next, based on the values of one or more measured variables, a subsample is drawn and the values of additional variables (secondary or tertiary outcomes) are obtained only within the subsample (Neyman 1938, White 1982, Gustavo Amorim et al. 2017). Such designs are useful when the secondary and/or tertiary outcomes or covariates are expensive or difficult to measure, e.g., genetic data. With additional steps of subsampling, two-phase designs can be extended to three-phase or, more generally, multi-phase sampling designs (Whittemore & Halpern 1997).

In this dissertation, we are motivated by a real-world research question arising from the National Institute of Environmental Health Sciences (NIEHS) Sister Study cohort. While the NIEHS Sister Study data will be introduced in detail later, the study design we focus on can be considered a special case of multi-phase sampling. Briefly, we consider a case-control sample drawn from a pre-specified sampling pool derived from the Sister Study cohort. Additional variables (secondary and tertiary outcomes) are then collected only within the case-control sample. Therefore, we extend Hong’s and Lange’s work and propose two sets of weighting-based estimators (weighting estimator I and II) for mediation analysis in case-control study samples with secondary (and tertiary) outcome(s).

1.6.9 Aims and Organization

Aim 1: Mediation analyses of the secondary outcome with the primary case-control status

as the mediator in case-control studies.

Speciﬁc Aim 1.1: Deﬁne and identify the causal estimand nonparametrically.

Speciﬁc Aim 1.2: Illustrate the necessity of considering the study design when

performing the mediation analysis.

Speciﬁc Aim 1.3: Propose weighted estimators and compare the performance of

proposed estimators with an existing estimator in simulation analysis.

Speciﬁc Aim 1.4: Implement the existing and proposed methods in real-world

data.

Aim 2: Mediation analyses of the tertiary outcome with the secondary outcome as the

mediator in case-control studies.

Speciﬁc Aim 2.1: Deﬁne and identify the causal estimand nonparametrically.

Speciﬁc Aim 2.2: Illustrate the necessity of considering the study design when

performing the mediation analysis.

Speciﬁc Aim 2.3: Propose weighted estimators and compare the performance of

proposed estimators with a modiﬁed version of an existing estimator in simulation analysis.

Speciﬁc Aim 2.4: Implement the existing and proposed methods in real-world

data.

Aim 3: Monte Carlo simulation-based power analysis and optimal case-control sampling ratio exploration for mediation analyses in case-control studies with secondary and/or tertiary outcomes.

Specific Aim 3.1: Perform power analysis for scenarios assumed in Aim 1 under different sampling strategies with different effect sizes.

Specific Aim 3.2: Perform power analysis for scenarios assumed in Aim 2 under different sampling strategies with different effect sizes.

CHAPTER 2 Mediation Analyses for Case-Control Studies With Secondary Outcomes

2.1 Background

In a recent scoping review (Rijnhart et al. 2021), researchers found that mediation analyses have been implemented in observational studies such as cross-sectional, case-control, and both prospective and retrospective cohort studies. However, little has been done for mediation analysis in a case-control study with subsequently collected (secondary) outcomes. This situation may arise after a case-control study has been conducted, when there is additional interest in a secondary outcome and in the direct or indirect effects of the exposure on the secondary outcome through the primary outcome. For example, Han et al. examined different risk factors in relation to skin cancer using a nested case-control sample from the Nurses' Health Study (Han et al. 2006). Later on, researchers might be further interested in how these risk factors are related to death (a secondary outcome) through skin cancer (the primary outcome) using the same case-control sample. Here, mortality (the secondary outcome) might only be available within the case-control sample instead of the whole Nurses' Health Study cohort. However, performing such a mediation analysis using existing methods could lead to seriously biased estimates, because the case-control sample is a biased sample of the cohort and was not designed for mediation analysis. Thus, appropriate methods that can account for the bias due to sampling should be considered. In this chapter, we first describe the settings and notation, and then apply a marginal structural model method to show the consequence of ignoring the complex study design. We then review an existing method that is suitable for this scenario but has limitations. Next, we propose a new estimator (weighting estimator I) and compare its performance with the existing method in a simulation analysis. Finally, we implement both the existing and the proposed methods using the NIEHS Sister Study data.

2.2 Settings

2.2.1 Notations and Causal DAG

Our study setting is the mediation analysis of a case-control sample with the case-control status as the mediator and a secondary outcome as the outcome of interest. Formally, let X be the exposure (treatment) of interest. Let C1 be the exposure-mediator, exposure-outcome, and mediator-outcome confounder, e.g., age at study baseline. Let C2 be the mediator-outcome confounder, e.g., a comorbid disease. Let Z, i.e., the variable used for selecting the case-control sample (0 = control, 1 = case), be the mediator of interest. We assume that the sample is selected from an existing cohort study, as in many epidemiological studies. Let S be an indicator of case-control sample selection status, where "S = 1" means the unit is selected and "S = 0" means the unit is not selected (Didelez et al. 2010). Let Y be the secondary outcome (the outcome of interest in the mediation analysis). Figure 2.1 shows the causal DAG for the data generating process of (C1, C2, X, Z, S, Y), where the square around S indicates that the analysis is performed using the observed data, i.e., the selected case-control sample.

C1: confounder between exposure, mediator and outcome, e.g. participants’ age at study

baseline

C2: confounder between mediator and outcome, e.g. participant comorbidity

X: exposure/treatment

Z: case-control status (mediator)

S: case-control sample selection node

Y : (secondary) outcome

Figure 2.1: Causal DAG for Mediation Analysis

For simplicity, assume C2, X, Z and Y are all binary variables. In practice, our proposed approach can be extended to non-binary variables as well. If the case-control sample is selected from a cohort (source population), then (C1, C2, X, Z, S) are fully observed for all units, whereas Y is only observed when S = 1, i.e., among units that are selected into the case-control sample (Cao et al. 2022). In our application to the Sister Study data, all cases are selected and controls are selected by simple random sampling (SRS) with a pre-specified sampling rate (Cao et al. 2022). Our proposed approach may also be extended (with modifications) to other sampling methods, such as stratified sampling or probability-proportional-to-size sampling, as long as the sampling process is known by design. The aim here is to estimate the causal estimands (e.g., mediating effects) using the selected case-control sample. With such a study design, X is fully observed in the cohort. We adopt the Neyman-Rubin causal model (Rubin 1978, Splawa-Neyman et al. 1923) and invoke the stable unit treatment value assumption (SUTVA) for simplicity (Rubin 1986). Let $Z^{x}$ be the counterfactual mediator and $Y^{xZ^{x}}$ be the counterfactual outcome when the exposure X is set to the value x. The goal is to study the mediating effects from X to Y through the mediator Z.

2.2.2 Controlled Direct Eﬀects (CDE)

For binary Y and Z, the population (cohort) CDE can be deﬁned in three diﬀerent scales

(Robins & Greenland 1992, Pearl 2001, VanderWeele 2015):

Risk difference: $CDE_{RD} = E[Y^{1z} - Y^{0z}]$, for $z \in \{0, 1\}$.

Relative risk: $CDE_{RR} = \frac{P(Y^{1z}=1)}{P(Y^{0z}=1)}$, for $z \in \{0, 1\}$.

Odds ratio: $CDE_{OR} = \frac{P(Y^{1z}=1)/P(Y^{1z}=0)}{P(Y^{0z}=1)/P(Y^{0z}=0)}$, for $z \in \{0, 1\}$.
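As a small illustration of the three scales, the sketch below contrasts two counterfactual risks at a fixed mediator level z. The function name `effect_scales` and the risks 0.30 and 0.20 are hypothetical and chosen only for illustration; they are not estimates from this dissertation.

```python
def effect_scales(p1, p0):
    """Contrast two counterfactual risks, p1 = P(Y^{1z}=1) and p0 = P(Y^{0z}=1),
    on the risk difference, relative risk and odds ratio scales."""
    odds1 = p1 / (1.0 - p1)
    odds0 = p0 / (1.0 - p0)
    return {"RD": p1 - p0, "RR": p1 / p0, "OR": odds1 / odds0}


# Hypothetical risks under exposure (0.30) and no exposure (0.20) at a fixed z.
cde = effect_scales(0.30, 0.20)
print(cde)
```

The same function applies to the natural effects once the risks are replaced by the corresponding nested counterfactual risks.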

As pointed out by Pearl (Pearl 2001), experimentally controlling the mediator at a fixed value reflects a prescriptive conceptualization that has limited practical meaning, since it is hardly achievable in epidemiological studies. Thus, the CDE should be used and interpreted with care.

2.2.3 Natural Eﬀects (NEs)

Similar to CDE, for binary Y and Z, the population natural direct eﬀects (NDE) and natural

indirect eﬀects (NIE) can also be deﬁned in three diﬀerent scales (Robins & Greenland 1992,

Pearl 2001, VanderWeele 2015):

Risk difference: $NDE_{RD} = E[Y^{1Z^{0}} - Y^{0Z^{0}}]$ and $NIE_{RD} = E[Y^{1Z^{1}} - Y^{1Z^{0}}]$.

Relative risk: $NDE_{RR} = \frac{P(Y^{1Z^{0}}=1)}{P(Y^{0Z^{0}}=1)}$ and $NIE_{RR} = \frac{P(Y^{1Z^{1}}=1)}{P(Y^{1Z^{0}}=1)}$.

Odds ratio: $NDE_{OR} = \frac{P(Y^{1Z^{0}}=1)/P(Y^{1Z^{0}}=0)}{P(Y^{0Z^{0}}=1)/P(Y^{0Z^{0}}=0)}$ and $NIE_{OR} = \frac{P(Y^{1Z^{1}}=1)/P(Y^{1Z^{1}}=0)}{P(Y^{1Z^{0}}=1)/P(Y^{1Z^{0}}=0)}$.

2.3 Existing Methods for the Scenario

2.3.1 Weighting Method Ignoring the Study Design

There are a number of existing weighting methods that can estimate mediating effects, including both the CDE and NEs (Coffman et al. 2023, Hong 2010, Huber 2014, Nguyen et al. 2023, Albert 2012, Tchetgen Tchetgen 2013, Lange et al. 2012, VanderWeele 2009). However, none of these takes the complex study design into consideration, except one recent paper by Huber and Solovyeva (Huber & Solovyeva 2020), which extended Huber's weighting method (Huber 2014). Before looking into their method, I would like to point out the necessity of considering the study design when performing mediation analysis. In this section, I use one of the most well-known weighting methods, proposed by VanderWeele (VanderWeele 2009), to illustrate how ignoring the secondary data collection process leads to biased estimates in mediation analysis. I also discuss a possible remedy for this scenario. The notation is consistent with the definitions in section 2.2.1.

VanderWeele proposed three marginal structural models (MSMs) to estimate the causal mediating effects. For example, in the continuous outcome and continuous mediator setting:

$$E[Y^{xz}] = g(x, z) = \alpha_0 + \alpha_1 x + \alpha_2 z + \alpha_3 xz \tag{2.1}$$

$$E[Z^{x}|C = c] = h(x, c) = \gamma_0 + \gamma_1 x + \gamma_2^{\prime} c \tag{2.2}$$

$$E[Y^{xz}|C = c] = g(x, z, c) = \theta_0 + \theta_1 x + \theta_2 z + \theta_3 xz + \theta_4^{\prime} c \tag{2.3}$$

MSM 2.1 is used to estimate the CDE by assigning each individual a weight of $w_x \cdot w_z$, where $w_x = \frac{P(X=x_i)}{P(X=x_i|C=c_i)}$ and $w_z = \frac{P(Z=z_i|X=x_i)}{P(Z=z_i|X=x_i,C=c_i)}$. MSMs 2.2 and 2.3 are used to estimate the conditional NEs. For 2.2, the weight $w_2 = \frac{1}{P(X=x_i|C=c_i)}$ is assigned to each individual, and for 2.3, the same weight $w_x \cdot w_z$ is used. One difference between 2.3 and 2.1 is that model 2.3 also conditions on the covariate set C, i.e., (C1, C2) in our case. Thus, estimates from MSMs 2.2 and 2.3 are called conditional NEs. To obtain marginal NEs, we need to further take the expectation over the covariate set C.
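To make the weight $w_x \cdot w_z$ concrete, the following sketch computes it for discrete (C, X, Z) using empirical cell frequencies. This nonparametric plug-in is an assumption made here for illustration; in practice the probabilities are usually estimated with parametric models such as logistic regression. The function name `msm_weights` and the toy data are hypothetical.

```python
from collections import Counter


def msm_weights(rows):
    """Compute w_x * w_z for each (c, x, z) tuple in `rows`.

    w_x = P(X=x) / P(X=x | C=c) and w_z = P(Z=z | X=x) / P(Z=z | X=x, C=c),
    with every probability replaced by an empirical frequency.
    """
    n = len(rows)
    n_x = Counter(x for _, x, _ in rows)
    n_c = Counter(c for c, _, _ in rows)
    n_cx = Counter((c, x) for c, x, _ in rows)
    n_xz = Counter((x, z) for _, x, z in rows)
    n_cxz = Counter(rows)
    weights = []
    for c, x, z in rows:
        w_x = (n_x[x] / n) / (n_cx[(c, x)] / n_c[c])
        w_z = (n_xz[(x, z)] / n_x[x]) / (n_cxz[(c, x, z)] / n_cx[(c, x)])
        weights.append(w_x * w_z)
    return weights


# Toy data: (c, x, z) tuples with every cell non-empty.
rows = ([(0, 0, 0)] * 40 + [(0, 0, 1)] * 10 + [(0, 1, 0)] * 20 + [(0, 1, 1)] * 10
        + [(1, 0, 0)] * 15 + [(1, 0, 1)] * 15 + [(1, 1, 0)] * 10 + [(1, 1, 1)] * 30)
weights = msm_weights(rows)
print(sum(weights) / len(weights))
```

With empirical frequencies and no empty cells, these stabilized weights average to one, which is a convenient sanity check before fitting the weighted MSM.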

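VanderWeele further gave the explicit forms of both the CDE and the marginal NEs:

$$CDE_{RD} = E[Y^{1z} - Y^{0z}] = \alpha_1 + \alpha_3 z \tag{2.4}$$

$$\text{pure } NDE_{RD} = E[Y^{1Z^{0}} - Y^{0Z^{0}}] = \theta_1 + \theta_3 \gamma_0 + \theta_3 \gamma_2^{\prime} E[C] \tag{2.5}$$

$$\text{total } NIE_{RD} = E[Y^{1Z^{1}} - Y^{1Z^{0}}] = (\theta_2 + \theta_3)\gamma_1 \tag{2.6}$$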
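Equations 2.4-2.6 can be evaluated directly once the MSM parameters are available. The sketch below does so for a binary exposure contrast (1 versus 0); the function name and all parameter values are hypothetical and serve only to show the arithmetic.

```python
def vanderweele_effects(alpha, theta, gamma0, gamma1, gamma2, mean_c, z):
    """Evaluate equations 2.4-2.6 on the risk difference scale.

    alpha = (a0, a1, a2, a3) from MSM 2.1, theta = (t0, t1, t2, t3) from MSM 2.3,
    gamma0, gamma1 and the vector gamma2 from MSM 2.2, mean_c = E[C].
    """
    gamma2_mean_c = sum(g * m for g, m in zip(gamma2, mean_c))
    cde = alpha[1] + alpha[3] * z
    nde = theta[1] + theta[3] * gamma0 + theta[3] * gamma2_mean_c
    nie = (theta[2] + theta[3]) * gamma1
    return cde, nde, nie


# Hypothetical parameter values, not estimates from the dissertation.
cde, nde, nie = vanderweele_effects(
    alpha=(0.0, 0.10, 0.20, 0.05),
    theta=(0.0, 0.10, 0.30, 0.20),
    gamma0=0.5, gamma1=0.4, gamma2=(0.01, 0.02), mean_c=(10.0, 5.0),
    z=1,
)
print(cde, nde, nie)
```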

Table 2.1: Mediating Effects When Ignoring Study Design

Cohort size: 36491

Effect (RD)   True    Point Estimate   SE (bootstrap)   SE ratio (a)   RMSE     Coverage (%)
CDE0          0.027   0.027            0.0258           0.9941         0.0256   94.0
CDE1          0.208   0.208            0.0193           0.9840         0.0190   95.5
NDE           0.038   0.142            0.0153           0.9733         0.1054   0.0
NIE           0.038   0.081            0.0079           0.9760         0.0434   0.0

(a) This is the ratio between empirical and bootstrap standard error.

From Table 2.1 we can see that even though the CDE can be estimated without bias, there is serious bias in estimating the NEs. One possible reason is that when the exposure X and the mediator Z (case-control status) are fixed, $Y^{xz} \perp\!\!\!\perp S$ holds. Thus, ignoring the sample selection process still gives unbiased estimates of $E[Y^{xz}]$ from MSM 2.1 because of this independence. However, for the NEs, even though we can fix X at x (or x*), $Z^{x}$ (or $Z^{x^*}$) is a random variable. Thus, $Y^{xZ^{x}}$ (or $Y^{xZ^{x^*}}$) is not independent of the sample selection S in such a scenario, which leads to biased estimates of the NEs if the study design is ignored. In this case, either the weight for MSM 2.3 needs to be adjusted to incorporate the sample selection process, or a different approach can be used to estimate the NEs. Below we give a potential remedy.

The NDE can still be estimated without bias if the CDE estimates are unbiased, since the NDE is a weighted average of CDEs in a cohort study (Rudolph et al. 2019, Petersen et al. 2006). In the case-control sample, if the mean of each counterfactual outcome can be estimated without bias, then so can all NEs.

$$\begin{aligned}
E[Y^{xZ^{x}}] &= E_C\left[E[Y^{xZ^{x}}|C]\right] \\
&= E_C\left[\int y \cdot P(Y^{xZ^{x}} = y|C)\,dy\right] \\
&= E_C\left[\int\!\!\int y \cdot P(Y^{xZ^{x}} = y, Z^{x} = z|C)\,dz\,dy\right] \\
&= E_C\left[\int\!\!\int y \cdot P(Y^{xz} = y|Z^{x} = z, C) \cdot P(Z^{x} = z|C)\,dz\,dy\right] \\
&= E_C\left[\sum_z E[Y^{xz}|C] \cdot P(Z^{x} = z|C)\right]
\end{aligned}$$

Similarly, we have $E[Y^{xZ^{x^*}}] = E_C\left[\sum_z E[Y^{xz}|C] \cdot P(Z^{x^*} = z|C)\right]$. The NIE can then be estimated as the difference between the above two quantities. This is a separate topic and not our primary focus; future research is needed to investigate it further. Next, we present an existing method that can give unbiased estimates of the mediating effects in such complex study designs.
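The last line of the derivation above can be sketched numerically for a binary mediator and a discrete covariate. The inputs $E[Y^{xz}|C]$ and $P(Z^{x}=1|C)$ are assumed to be available (for example, from unbiased CDE-type models); the function name `nested_mean` and all numbers below are hypothetical.

```python
def nested_mean(p_c, mean_y, p_z1, x, x_star):
    """E[Y^{x Z^{x*}}] = sum_c P(C=c) * sum_z E[Y^{xz} | C=c] * P(Z^{x*}=z | C=c).

    p_c:    dict c -> P(C=c)
    mean_y: dict (x, z, c) -> E[Y^{xz} | C=c]
    p_z1:   dict (x, c) -> P(Z^{x}=1 | C=c)
    """
    total = 0.0
    for c, pc in p_c.items():
        pz1 = p_z1[(x_star, c)]
        total += pc * (mean_y[(x, 0, c)] * (1.0 - pz1) + mean_y[(x, 1, c)] * pz1)
    return total


# Hypothetical inputs for two covariate strata.
p_c = {0: 0.6, 1: 0.4}
mean_y = {(0, 0, 0): 0.1, (0, 1, 0): 0.3, (1, 0, 0): 0.2, (1, 1, 0): 0.6,
          (0, 0, 1): 0.2, (0, 1, 1): 0.4, (1, 0, 1): 0.3, (1, 1, 1): 0.7}
p_z1 = {(0, 0): 0.2, (1, 0): 0.5, (0, 1): 0.3, (1, 1): 0.6}

nde = nested_mean(p_c, mean_y, p_z1, 1, 0) - nested_mean(p_c, mean_y, p_z1, 0, 0)
nie = nested_mean(p_c, mean_y, p_z1, 1, 1) - nested_mean(p_c, mean_y, p_z1, 1, 0)
total_effect = nested_mean(p_c, mean_y, p_z1, 1, 1) - nested_mean(p_c, mean_y, p_z1, 0, 0)
print(nde, nie, total_effect)
```

On the risk difference scale the two natural effects add up to the total effect, which the last line makes easy to check.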

2.3.2 Existing Methods on Estimation of Causal Eﬀects

Using the notation from section 2.2.1 and data from mediator-dependent sampling, Huber and Solovyeva proposed to first estimate the means of the counterfactual outcomes $Y^{xz}$, $Y^{xZ^{x}}$ and $Y^{xZ^{x^*}}$ (Huber & Solovyeva 2020). Their method and estimator are called the HS method and HS estimator henceforth. Specifically, assuming the causal structure in Figure 2.1, they showed

$$\begin{aligned}
\hat{E}[Y^{xz}] &= \frac{1}{n}\sum_i \frac{Y_i \cdot I\{X_i = x\} \cdot I\{Z_i = z\} \cdot S_i}{P(X_i = x|C_{1i}, C_{2i}) \cdot P(Z_i = z|X_i, C_{1i}, C_{2i}) \cdot P(S_i = 1|X_i, Z_i, C_{1i}, C_{2i})} \\
&= \frac{1}{n}\sum_i \frac{Y_i \cdot I\{X_i = x\} \cdot I\{Z_i = z\} \cdot S_i}{P(X_i = x|C_{1i}) \cdot P(Z_i = z|X_i, C_{1i}, C_{2i}) \cdot P(S_i = 1|Z_i)}
\end{aligned} \tag{2.7}$$

$$\begin{aligned}
\hat{E}[Y^{xZ^{x}}] &= \frac{1}{n}\sum_i \frac{Y_i \cdot I\{X_i = x\} \cdot S_i}{P(X_i = x|C_{1i}, C_{2i}) \cdot P(S_i = 1|X_i, Z_i, C_{1i}, C_{2i})} \\
&= \frac{1}{n}\sum_i \frac{Y_i \cdot I\{X_i = x\} \cdot S_i}{P(X_i = x|C_{1i}) \cdot P(S_i = 1|Z_i)}
\end{aligned} \tag{2.8}$$

$$\begin{aligned}
\hat{E}[Y^{xZ^{x^*}}] &= \frac{1}{n}\sum_i \frac{Y_i \cdot I\{X_i = x\} \cdot S_i}{P(X_i = x|Z_i, C_{1i}, C_{2i}) \cdot P(S_i = 1|X_i, Z_i, C_{1i}, C_{2i})} \cdot \frac{P(X_i = x^*|Z_i, C_{1i}, C_{2i})}{P(X_i = x^*|C_{1i}, C_{2i})} \\
&= \frac{1}{n}\sum_i \frac{Y_i \cdot I\{X_i = x\} \cdot S_i}{P(X_i = x|Z_i, C_{1i}, C_{2i}) \cdot P(S_i = 1|Z_i)} \cdot \frac{P(X_i = x^*|Z_i, C_{1i}, C_{2i})}{P(X_i = x^*|C_{1i})}
\end{aligned} \tag{2.9}$$

The HS estimators of the CDE and NEs follow naturally. Instead of directly plugging in equations 2.7-2.9, they proposed to use normalized versions of the sample analogs (Huber & Solovyeva 2020).
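For intuition, a normalized sample analog of the second line of equation 2.7 can be sketched as below. The normalization shown (dividing by the sum of the weights rather than by n) is an assumption made here for illustration and may differ in detail from the normalization used by Huber and Solovyeva. The probability functions `p_x`, `p_z` and `p_s` are hypothetical stand-ins for fitted or design-based probabilities.

```python
def hs_mean_yxz(units, x, z, p_x, p_z, p_s):
    """Normalized inverse-probability-weighted estimate of E[Y^{xz}].

    units: iterable of dicts with keys y, x, z, s, c1, c2 (y may be None if s == 0).
    p_x(x, c1), p_z(z, x, c1, c2), p_s(z): functions returning probabilities.
    """
    numerator = 0.0
    denominator = 0.0
    for u in units:
        # Only selected units with the requested exposure and mediator contribute.
        if u["s"] == 1 and u["x"] == x and u["z"] == z:
            w = 1.0 / (p_x(x, u["c1"]) * p_z(z, x, u["c1"], u["c2"]) * p_s(z))
            numerator += w * u["y"]
            denominator += w
    return numerator / denominator


# Hypothetical probabilities and a four-unit toy sample.
def p_x(x, c1):
    return 0.5 if c1 == 0 else 0.25


def p_z(z, x, c1, c2):
    return 0.5


def p_s(z):
    return 1.0 if z == 1 else 0.5


units = [
    {"y": 1, "x": 1, "z": 1, "s": 1, "c1": 0, "c2": 0},
    {"y": 0, "x": 1, "z": 1, "s": 1, "c1": 1, "c2": 0},
    {"y": None, "x": 1, "z": 0, "s": 0, "c1": 0, "c2": 1},
    {"y": 1, "x": 0, "z": 1, "s": 1, "c1": 0, "c2": 0},
]
estimate = hs_mean_yxz(units, x=1, z=1, p_x=p_x, p_z=p_z, p_s=p_s)
print(estimate)
```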

However, to estimate the NEs, the quantities $P(X_i = x^*|Z_i, C_{1i}, C_{2i})$ and $P(X_i = x|Z_i, C_{1i}, C_{2i})$ must be estimated from the observed data. While such estimation is usually based on parametric models, modeling the treatment as a function of the subsequently emerging mediator is counterintuitive and lacks an immediate substantive interpretation, since the treatment causally precedes rather than succeeds the mediator (Hong et al. 2015). The HS method first estimates the means of the counterfactual outcomes and then estimates the target causal estimands as functions of those means, which makes SE estimation difficult. It generally relies on the bootstrap resampling method to obtain the standard error (SE) or confidence interval (CI) (Huber & Solovyeva 2020). To overcome these disadvantages, we extend Hong's work on ratio-of-mediator-probability weighting (RMPW) (Hong 2010), VanderWeele's MSM (VanderWeele 2009), and Lange's unified approach (Lange et al. 2012) to propose a new weighting method (weighting estimator I) that estimates both the CDE and NEs for a secondary outcome in a case-control sample drawn from a source population, e.g., a cohort. We compare the performance of the HS estimator and weighting estimator I using simulations, followed by an application using real-world data.

2.4 Proposed Weighting Method (Weighting Estimator I)

2.4.1 Nonparametric Identiﬁcation

The central goal of this chapter is to identify and estimate the population causal mediating effects using the case-control sample. As mentioned in previous sections, the counterfactual outcome $Y^{xz}$ and the nested counterfactual outcomes ($Y^{xZ^{x}}$, $Y^{xZ^{x^*}}$) play key roles in estimating the CDE, NDE and NIE. Thus, to identify the causal mediating effects, we need to connect $E[Y^{xz}]$ with $E[Y|X = x, Z = z, S = 1]$, and $E[Y^{xZ^{x}}]$ and $E[Y^{xZ^{x^*}}]$ with $E[Y|X = x, S = 1]$. Before showing how to nonparametrically identify the population mediation effects using the case-control sample, we lay out the assumptions required for identification, based on the causal structure shown in Figure 2.1.

Assumption 1a. $Y^{xz} \perp\!\!\!\perp (X, Z)|C$ where $C = (C_1, C_2)$, i.e., no unobserved confounders between the treatment and the outcome, and no unobserved confounders between the mediator and the outcome.

Assumption 1b. $Y^{xz} \perp\!\!\!\perp Z|(X, C)$.

Assumption 1c. $Y^{xz} \perp\!\!\!\perp Z^{x^*}|(X, C)$.

Assumption 2. Conditional independence based on the study design and background knowledge.

$$Y \perp\!\!\!\perp S|(Z, X, C) \tag{2.10}$$

$$S \perp\!\!\!\perp X|Z \tag{2.11}$$

$$S \perp\!\!\!\perp (X, C)|Z \tag{2.12}$$

$$X \perp\!\!\!\perp C_2|C_1 \tag{2.13}$$

$$0 < P(Z = z|X, C),\quad 0 < P(X = x|C_1) \quad\text{and}\quad 0 < P(S = 1|Z) \tag{2.14}$$

The conditional independence assumptions 2.10 and 2.13 are primarily design-based. For example, in a randomized controlled trial, instead of $Y \perp\!\!\!\perp S|(Z, X, C)$, we have $Y \perp\!\!\!\perp S$. For observational studies such as general case-control and matched case-control designs, we may have $Y \perp\!\!\!\perp S|Z$ or $Y \perp\!\!\!\perp S|(Z, C)$ by design. This changes the weight slightly but generally does not affect identification. For more details, please refer to the proof of Theorem 1 in Appendix A. Assumption 1a and assumptions 2.10-2.12 are required for the identification of the CDE. To identify the NDE and NIE, in addition to Assumption 1b and Assumption 1c, three additional sets of assumptions are required:

Assumption 3a. $Y^{xZ^{x}} \perp\!\!\!\perp X|(Z^{x}, C)$.

Assumption 3b. $Y^{xZ^{x^*}} \perp\!\!\!\perp X|(Z^{x^*}, C)$.

Assumption 4. $(Z^{x}, Z^{x^*}) \perp\!\!\!\perp X|C$.

Assumption 5. $Y^{xz} \perp\!\!\!\perp Z^{x^*}|(X, C)$.

Next, we introduce Theorem 2.4.1 to show that the means of the (nested) counterfactual outcomes can be written as weighted conditional means of the observed outcome Y.

Theorem 2.4.1. Identification of the means of the (nested) counterfactual outcomes using the case-control sample under assumptions 1 to 5.

$E[Y^{xz}] = E[w_1 Y|X = x, Z = z, S = 1]$, where $w_1 = \frac{P(z|x)P(X=x)}{P(z|x,c_1,c_2)P(x|c_1)}$.

$E[Y^{xZ^{x}}] = E[w_2 Y|X = x, S = 1]$, where $w_2 = \frac{P(S=1|X=x)P(X=x)}{P(S=1|Z=z)P(X=x|c_1)}$.

$E[Y^{xZ^{x^*}}] = E[w_3 Y|X = x, S = 1]$, where $w_3 = \frac{P(Z=z|X=x^*,c_1,c_2)}{P(Z=z|X=x,c_1,c_2)} \cdot \frac{P(S=1|X=x)P(X=x)}{P(S=1|Z=z) \cdot P(X=x|C_1)}$.
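As a hedged illustration of how the three weights might be assembled for one selected unit, the sketch below takes the component probabilities as given numbers. The helper `p_select_given_x` uses assumption 2.11 ($S \perp\!\!\!\perp X|Z$) and the law of total probability to obtain $P(S=1|X=x)$ from the design-based sampling rates; the function names and all numeric inputs are hypothetical.

```python
def p_select_given_x(p_case_given_x, case_rate=1.0, control_rate=0.0356):
    """P(S=1 | X=x) = sum_z P(S=1 | Z=z) * P(Z=z | X=x), using S independent of X given Z."""
    return case_rate * p_case_given_x + control_rate * (1.0 - p_case_given_x)


def estimator_one_weights(p_z_x, p_z_xc, p_z_xstar_c, p_x, p_x_c1, p_s_x, p_s_z):
    """Weights w1, w2, w3 for one unit.

    p_z_x:       P(Z=z | X=x)
    p_z_xc:      P(Z=z | X=x, c1, c2)
    p_z_xstar_c: P(Z=z | X=x*, c1, c2)
    p_x:         P(X=x)
    p_x_c1:      P(X=x | c1)
    p_s_x:       P(S=1 | X=x)
    p_s_z:       P(S=1 | Z=z)
    """
    w1 = (p_z_x * p_x) / (p_z_xc * p_x_c1)
    w2 = (p_s_x * p_x) / (p_s_z * p_x_c1)
    w3 = (p_z_xstar_c / p_z_xc) * w2
    return w1, w2, w3


# Hypothetical values for an exposed case (z = 1), with all cases sampled.
p_s_x = p_select_given_x(p_case_given_x=0.10)
w1, w2, w3 = estimator_one_weights(
    p_z_x=0.10, p_z_xc=0.20, p_z_xstar_c=0.10,
    p_x=0.20, p_x_c1=0.25, p_s_x=p_s_x, p_s_z=1.0,
)
print(p_s_x, w1, w2, w3)
```

Writing $w_3$ as the mediator probability ratio times $w_2$ mirrors the structure of the formulas above and keeps the ratio-of-mediator-probability term explicit.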

2.4.2 Estimation of the Causal Estimand

Compared with Theorem 1 of Huber and Solovyeva (Huber & Solovyeva 2020), Theorem 2.4.1 rewrites $P(X|Z, C)$ in terms of $P(Z|X, C)$, which is easier to interpret because the mediator causally succeeds the exposure (Hong 2015). At this point, we could still estimate the CDE and NEs in the same way as the HS approach, but this is computationally expensive since it relies on the bootstrap resampling method to obtain the SE and 95% CI (Huber & Solovyeva 2020). Therefore, we adopt an MSM-based approach, which assumes particular relationships between the mean of a (nested) counterfactual outcome and the observed exposure and mediator whenever appropriate (Robins et al. 2000, VanderWeele 2009, Lange et al. 2012). The following MSMs are used for estimating the CDE (2.15) and the NEs (2.16):

$$g(E[Y^{xz}]) = c_0 + c_1 x + c_2 z + c_3 x \cdot z \tag{2.15}$$

$$g(E[Y^{xZ^{x^*}}]) = c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^* \tag{2.16}$$

where g(·) is the link function.

Assuming the functional form is correctly specified in 2.15 and 2.16, the causal mediating effects in terms of risk difference, relative risk and odds ratio can be expressed as functions of the parameters estimated from the following MSMs (Robins et al. 2000):

For CDE:

linear model: $P(Y^{xz} = 1) = c_0 + c_1 x + c_2 z + c_3 x \cdot z \tag{2.17}$

log-linear model: $\log[P(Y^{xz} = 1)] = c_0 + c_1 x + c_2 z + c_3 x \cdot z \tag{2.18}$

logistic model: $\text{logit}[P(Y^{xz} = 1)] = c_0 + c_1 x + c_2 z + c_3 x \cdot z \tag{2.19}$

For NE:

linear model: $P(Y^{xZ^{x^*}} = 1) = c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^* \tag{2.20}$

log-linear model: $\log[P(Y^{xZ^{x^*}} = 1)] = c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^* \tag{2.21}$

logistic model: $\text{logit}[P(Y^{xZ^{x^*}} = 1)] = c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^* \tag{2.22}$

where $w_1$ is used for estimating the CDE in 2.17, 2.18 and 2.19, while $w_3$ is used for estimating the NEs in 2.20, 2.21 and 2.22. The interaction terms $c_3 x \cdot z$ and $c_3 x \cdot x^*$ are included in the above models to reflect the potential exposure-mediator interaction in relation to the outcome. If exposure-covariate interactions are of interest, we can further include $x \cdot C$ and $x^* \cdot C$ to capture direct and indirect effects modified by covariates (Lange et al. 2012). In our scenario, MSMs 2.17-2.19 can be applied directly to estimate the CDE. However, since MSMs 2.20-2.22 involve not only the observed exposure x but also the counterfactual value of the exposure, x*, we adopt the approach of Lange et al. (Lange et al. 2012) and construct a new dataset by repeating each observation in the original dataset twice and adding a variable X*, which is equal to the original exposure for the first replication and equal to the opposite of the actual exposure for the second replication. After computing weights for each individual (and the "duplicates"), we fit MSMs 2.20-2.22 to estimate the NEs.
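The data duplication step described above can be sketched as follows for a binary exposure. The field names `x`, `x_star` and `id` are hypothetical; the `id` field is added here only so that the two copies of a unit can later be treated as a cluster when computing robust standard errors, which is an assumption about the downstream software rather than part of the description above.

```python
def expand_for_natural_effects(rows):
    """Duplicate each unit and add the auxiliary exposure x_star.

    The first copy has x_star equal to the observed exposure x, the second copy
    has x_star equal to the opposite value 1 - x (binary exposure assumed).
    """
    expanded = []
    for unit_id, row in enumerate(rows):
        for x_star in (row["x"], 1 - row["x"]):
            new_row = dict(row)
            new_row["id"] = unit_id
            new_row["x_star"] = x_star
            expanded.append(new_row)
    return expanded


# Toy case-control sample with two units.
sample = [{"x": 1, "z": 0, "y": 1}, {"x": 0, "z": 1, "y": 0}]
expanded = expand_for_natural_effects(sample)
for row in expanded:
    print(row)
```

The weight $w_3$ is then computed for every row of the expanded data, using the row's own `x_star`, before MSMs 2.20-2.22 are fitted.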

Similar to Lange et al. (Lange et al. 2012), when assuming an MSM for the nested counterfactual outcomes, our method can be cast as a generalized estimating equations (GEE) method. For example, the estimating equation corresponding to the MSM in 2.16 (with the identity link) is given by:

$$U(c, \alpha) = \sum_{i=1}^{n} \sum_{x^*=0}^{1} d(X_i, x^*)(Y_i - c_0 - c_1 X_i - c_2 x^* - c_3 X_i x^*) \cdot \frac{P(Z=z|X=x^*,c_1,c_2)}{P(Z=z|X=x,c_1,c_2)} \cdot \frac{P(S=1|X=x)P(X=x)}{P(S=1|Z=z) \cdot P(X=x|C_1)}$$

with $d(X_i, x^*) = (1, X_i, x^*, X_i x^*)^T$.

For the estimators to perform well (i.e., to be regular), we further multiply the weight ($w_3$) by $P(X_i)$, analogous to the stabilized weight in inverse probability weighting estimation (Lange et al. 2012). Following Lange et al., let model M1 be the parametric model for $P(Z|X, C)$ (with the unknown parameter denoted α) and let M2 be the model defined by the restrictions of model M1 together with the additional assumption that the generalized linear MSM 2.16 holds. According to Lange et al., the condition under which conservative SEs are obtained is "that α is substituted by the maximum likelihood estimator under model M1 and that this is also an efficient estimator of α under model M2". Although Lange et al. believe that the latter condition will almost always be (nearly) satisfied in practice, they recommend obtaining an alternative SE (e.g., based on the bootstrap resampling method) if researchers are concerned about the validity of this additional condition, i.e., if the generalized linear MSM is so restrictive that the restrictions it imposes on the outcome distribution carry information about the actual value of α.

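Following M-estimation theory, we know that

$$\sqrt{n}(\hat{c}_n - c_t) \xrightarrow{D} N\left(0, \left[E\left\{\frac{\partial U(c_t,\alpha_t)}{\partial c^T}\right\}\right]^{-1} var(U(c_t, \alpha_t)) \left[E\left\{\frac{\partial U(c_t,\alpha_t)}{\partial c^T}\right\}\right]^{-1T}\right),$$

where $c_t$ and $\alpha_t$ stand for the true values of the vectors $(c_0, c_1, c_2, c_3)^T$ and $(\alpha_0, \alpha_1, \alpha_2)^T$. Here $(\alpha_0, \alpha_1, \alpha_2)^T$ comes from the generalized linear model of Z on (X, C) with link function g(·), i.e., $g(E[Z|X, C]) = \alpha_0 + \alpha_1 X + \alpha_2 C$. Since $c_t$ and $\alpha_t$ are unknown, we substitute the GEE estimator $\hat{c}_n$ for $c_t$ and the maximum likelihood estimator $\hat{\alpha}_n$ for $\alpha_t$ to obtain the sandwich estimator of the asymptotic variance specified above, given by

$$\left[\hat{E}\left\{\frac{\partial U(\hat{c}_n,\hat{\alpha}_n)}{\partial c^T}\right\}\right]^{-1} \widehat{var}(U(\hat{c}_n, \hat{\alpha}_n)) \left[\hat{E}\left\{\frac{\partial U(\hat{c}_n,\hat{\alpha}_n)}{\partial c^T}\right\}\right]^{-1T}.$$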

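So far, we have shown the validity of using MSMs to estimate both the CDE and NEs, including the sandwich estimators for the SE. The estimates of the mediating effects from MSMs 2.17-2.22 are as follows, where the RD, RR and OR estimates use the coefficients of the linear, log-linear and logistic models, respectively:

$\widehat{CDE}_{RD,z} = \hat{c}_1(x - x^*) + \hat{c}_3 z(x - x^*)$.

$\widehat{CDE}_{RR,z} = \exp\{\hat{c}_1(x - x^*) + \hat{c}_3 z(x - x^*)\}$.

$\widehat{CDE}_{OR,z} = \exp\{\hat{c}_1(x - x^*) + \hat{c}_3 z(x - x^*)\}$.

$\widehat{NDE}_{RD} = \hat{c}_1(x - x^*) + \hat{c}_3 x^*(x - x^*)$.

$\widehat{NIE}_{RD} = \hat{c}_2(x - x^*) + \hat{c}_3 x(x - x^*)$.

$\widehat{NDE}_{RR} = \exp\{\hat{c}_1(x - x^*) + \hat{c}_3 x^*(x - x^*)\}$.

$\widehat{NIE}_{RR} = \exp\{\hat{c}_2(x - x^*) + \hat{c}_3 x(x - x^*)\}$.

$\widehat{NDE}_{OR} = \exp\{\hat{c}_1(x - x^*) + \hat{c}_3 x^*(x - x^*)\}$.

$\widehat{NIE}_{OR} = \exp\{\hat{c}_2(x - x^*) + \hat{c}_3 x(x - x^*)\}$.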
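The natural effect formulas above translate into a few lines of code. In the sketch below, the function name and the coefficient vector are hypothetical; `scale="RD"` assumes the coefficients come from the linear MSM 2.20, while `"RR"` and `"OR"` assume the log-linear MSM 2.21 and the logistic MSM 2.22, respectively.

```python
import math


def natural_effects(c, x=1, x_star=0, scale="RD"):
    """NDE and NIE from fitted MSM coefficients c = (c0, c1, c2, c3)."""
    _, c1, c2, c3 = c
    nde = c1 * (x - x_star) + c3 * x_star * (x - x_star)
    nie = c2 * (x - x_star) + c3 * x * (x - x_star)
    if scale == "RD":
        return nde, nie
    # Ratio scales (RR or OR): exponentiate the linear contrasts.
    return math.exp(nde), math.exp(nie)


# Hypothetical coefficients, not estimates from the dissertation.
coefs = (-2.0, 0.5, 0.3, 0.1)
nde_rd, nie_rd = natural_effects(coefs, scale="RD")
nde_rr, nie_rr = natural_effects(coefs, scale="RR")
print(nde_rd, nie_rd, nde_rr, nie_rr)
```

On the ratio scales the product of the NDE and NIE equals the total effect $\exp(c_1 + c_2 + c_3)$ for the contrast x = 1 versus x* = 0, which offers a quick consistency check.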

2.5 Simulation

2.5.1 Data Generating Process (DGP)

The parameters used in the data generating process are chosen to achieve marginal distributions similar to those in the Sister Study data introduced in section 2.7.

1) Generate the study confounders:

C1 is a continuous confounder assessed at baseline (T0), generated from the truncated normal distribution N(60.72, 54.05) on the interval [44.1, 76.3]. An example of C1 is the participants' age in the source population (cohort) at study baseline.

C2 is a binary (0/1) confounder assessed at baseline, with $C_2 \sim Bernoulli(P_{C_2})$ and $P_{C_2} \approx 12.71\%$. An example of C2 is race/ethnicity, where 0 denotes White and 1 denotes Black.

2) Generate the exposure:

X is the binary (0/1) exposure of interest, with $X \sim Bernoulli(P_X)$, where

$$P_X = \{1 + \exp[-\log(0.991^{C_1}) - \log(0.44)]\}^{-1}$$

and $P(X = 1) \approx 20.32\%$.

3) Generate the primary outcome (case-control status):

Z is the binary (0/1) mediator, i.e., the case-control status or primary outcome, assessed at follow-up time T1, later than baseline, with $Z \sim Bernoulli(P_Z)$, where

$$P_Z = \{1 + \exp[-\log(1.033^{C_1}) - \log(0.874^{C_2}) - \log(2.7^{X}) - \log(0.00882)]\}^{-1}$$

and $P(Z = 1) \approx 7.73\%$.

4) Generate the sampling selection indicator:

S is the indicator of being selected into the case-control sample (0 = unselected, 1 = selected). All units with Z = 1 are selected, and among units with Z = 0 a simple random sample is drawn with a sampling rate of 3.56%. This is very common in epidemiological studies, where all cases are selected while only a small portion of controls is selected due to a limited budget (Cao et al. 2022).

5) Generate the observed (secondary) outcome:

Y is the binary (0/1) outcome of interest, with $Y \sim Bernoulli(P_Y)$, where

$$P_Y = \{1 + \exp[-k_Y]\}^{-1}$$

$$k_Y = \log(1.085^{C_1}) + \log(1.57^{C_2}) + \log(16.421^{X}) + \log(4.832^{Z}) + \log(0.961^{C_1 X}) + \log(0.650^{C_2 X}) + \log(1.75^{XZ}) + \log(0.000792)$$

and $P(Y = 1) \approx 14.91\%$.

6) Generate the potential outcomes:

For the mediator Z, $Z^{x} \sim Bernoulli(P_{Z^{x}})$, where $P_{Z^{x}}$ has the same form as $P_Z$ with X replaced by the value x. For the outcome Y, $Y^{xZ^{x^*}} \sim Bernoulli(P_{Y^{xZ^{x^*}}})$, where $P_{Y^{xZ^{x^*}}}$ has the same form as $P_Y$ with X and Z replaced by x and the counterfactual mediator $Z^{x^*}$.
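Steps 1) to 5) of the data generating process can be sketched with the Python standard library as follows. Several choices here are assumptions made for illustration: the second parameter of N(60.72, 54.05) is treated as a variance, the truncated normal is drawn by rejection sampling, each coefficient is read as a log odds ratio multiplying its variable, and control selection is approximated by independent Bernoulli draws rather than a fixed-size simple random sample. The function names are hypothetical.

```python
import math
import random


def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def draw_c1(rng, mean=60.72, var=54.05, low=44.1, high=76.3):
    """Truncated normal by rejection sampling (54.05 treated as a variance)."""
    sd = math.sqrt(var)
    while True:
        value = rng.gauss(mean, sd)
        if low <= value <= high:
            return value


def simulate_unit(rng, control_rate=0.0356):
    """Generate (C1, C2, X, Z, S, Y) for one cohort member."""
    c1 = draw_c1(rng)
    c2 = int(rng.random() < 0.1271)
    p_x = expit(c1 * math.log(0.991) + math.log(0.44))
    x = int(rng.random() < p_x)
    p_z = expit(c1 * math.log(1.033) + c2 * math.log(0.874)
                + x * math.log(2.7) + math.log(0.00882))
    z = int(rng.random() < p_z)
    # All cases are selected; controls are selected with probability control_rate.
    s = 1 if z == 1 else int(rng.random() < control_rate)
    k_y = (c1 * math.log(1.085) + c2 * math.log(1.57) + x * math.log(16.421)
           + z * math.log(4.832) + c1 * x * math.log(0.961)
           + c2 * x * math.log(0.650) + x * z * math.log(1.75)
           + math.log(0.000792))
    y = int(rng.random() < expit(k_y))
    return {"c1": c1, "c2": c2, "x": x, "z": z, "s": s, "y": y}


rng = random.Random(2023)
cohort = [simulate_unit(rng) for _ in range(20000)]
prop_x = sum(u["x"] for u in cohort) / len(cohort)
prop_z = sum(u["z"] for u in cohort) / len(cohort)
print(prop_x, prop_z)
```

In an actual analysis Y would be treated as missing whenever S = 0; it is generated for every unit here only so that the full cohort can serve as a benchmark.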

2.5.2 Monte Carlo Simulation

The Monte Carlo (MC) simulation follows these steps:

1. Since exposure-covariate and exposure-mediator interactions are assumed, it would be cumbersome to directly calculate the true CDE and NEs, especially for the log and logit link functions. Thus, the true mediating effects are calculated from a super population (N0 = 2000000) based on the DGP specified in section 2.5.1.

2. To better mimic the sampling process in the real world, a "cohort" of size N is simulated separately for each MC run using the DGP specified in section 2.5.1.

3. A stratified (on Z) case-control sample is drawn from the "cohort" in step 2, with 100% sampling for cases and simple random sampling with a 3.56% sampling rate for controls. These sampling fractions are chosen to mimic the real-world data collection in section 2.7. Different sampling strategies can be applied as long as they are known to researchers.

4. The CDE and NEs are estimated using the case-control sample in step 3, i.e., assuming the outcome Y is only observed within the case-control sample.

5. Steps 2 to 4 are repeated R = 1000 times.

6. The point estimates and bootstrap 95% CIs of the CDE and NEs are summarized across the R MC runs.
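Steps 2 to 5 above can be organized as a small driver loop. In the sketch below, `simulate_cohort` and `estimate` are hypothetical callables supplied by the user (for example, a DGP and a weighted MSM fit); the toy cohort and the toy estimator are placeholders that only demonstrate the control flow, not the estimators studied in this chapter.

```python
import random


def draw_case_control(cohort, rng, control_rate=0.0356):
    """Step 3: keep every case (z == 1) and a simple random sample of controls."""
    cases = [u for u in cohort if u["z"] == 1]
    controls = [u for u in cohort if u["z"] == 0]
    n_controls = round(control_rate * len(controls))
    return cases + rng.sample(controls, n_controls)


def monte_carlo(simulate_cohort, estimate, n_runs, rng):
    """Steps 2-5: simulate a cohort, draw the sample, estimate; repeat n_runs times."""
    estimates = []
    for _ in range(n_runs):
        cohort = simulate_cohort(rng)
        sample = draw_case_control(cohort, rng)
        estimates.append(estimate(sample))
    return estimates


# Toy usage: a fixed cohort of 100 cases and 900 controls, and a placeholder
# "estimator" that returns the case fraction in the sample.
rng = random.Random(1)
toy_cohort = [{"z": 1}] * 100 + [{"z": 0}] * 900
sample = draw_case_control(toy_cohort, rng)
runs = monte_carlo(lambda r: toy_cohort,
                   lambda s: sum(u["z"] for u in s) / len(s),
                   n_runs=5, rng=rng)
print(len(sample), runs)
```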

2.5.3 Bootstrap for Standard Errors

Both VanderWeele (VanderWeele 2009) and Huber and Solovyeva (Huber & Solovyeva 2020) recommend using the bootstrap resampling method to calculate the SE. In VanderWeele's approach, two steps are used to estimate the NEs, which makes it difficult to obtain an SE directly from the MSM. In the HS method, $E[Y^{xz}]$, $E[Y^{xZ^{x}}]$, and $E[Y^{xZ^{x^*}}]$ are estimated separately by weighting, and the CDE and NEs are then estimated as functions of those three quantities. In such a case, direct calculation of the SE can be cumbersome. Although our proposed method can obtain a robust SE directly from the MSM, for comparison purposes the bootstrap SE is used across the three methods. Later, in section 2.6.2, we compare the robust and bootstrap SEs for the proposed method. The two-step bootstrap resampling procedure used to obtain the SE and the 95% CI is as follows:

1. Starting from the case-control sample at the r-th MC simulation described in step 3 of section 2.5.2, we use stratified bootstrap sampling to ensure the same sample size for each category of case-control status (?). Specifically, suppose the r-th case-control sample contains $n_r$ units, with $n_{r0}$ units having Z = 0 and $n_{r1}$ units having Z = 1. We resample separately among subjects with Z = 0 and Z = 1: for each bootstrap sample b, we resample $n_{r0b}$ units from the $n_{r0}$ controls and $n_{r1b}$ units from the $n_{r1}$ cases using unrestricted random sampling with a sampling ratio of 1, i.e., $n_{r0b} = n_{r0}$ and $n_{r1b} = n_{r1}$. Unrestricted random sampling assigns equal probability to each unit in the stratum and samples with replacement.

2. For the r-th MC run, we perform B bootstrap sampling runs (B = 200) and obtain the percentile-based bootstrap CI and the bootstrap SE (Zhang 2014). Specifically, the bootstrap SE is calculated using the following formula (Chernick & LaBudde 2011, Efron 1982):

$$SE_b = \sqrt{\frac{1}{B-1}\sum_b (\hat{\theta}^*_b - \hat{\theta}^*)^2},$$

where $\hat{\theta}^*_b$ is the estimated causal effect, e.g., the CDE, from the b-th bootstrap sample and $\hat{\theta}^* = \frac{\sum_b \hat{\theta}^*_b}{B}$.

2.5.4 Simulation Performance Criteria

To evaluate the performance of estimators for CDE and NEs, we use the following criteria

based on MC simulation (Sahiner et al. 2008, Hafdahl 2010, Morris et al. 2019):

Table 2.2: Simulation Evaluation Criteria

| Criterion | Definition | Estimate | Reference |
|---|---|---|---|
| RMSE | $\sqrt{E[(\hat{\theta} - \theta)^2]}$ | $\sqrt{\frac{1}{R}\sum_r (\hat{\theta}_r - \theta)^2}$ | Sahiner et al. (2008) |
| Coverage Probability | The proportion of times the 95% CI constructed from the estimates covers the true parameter | $\frac{1}{R}\sum_r I\{95\%\ \text{CI covers}\ \theta\}$ | Hafdahl (2010) |
| Empirical SE | Standard deviation of simulated point estimates | $\sqrt{\frac{1}{R-1}\sum_r (\hat{\theta}_r - \bar{\theta})^2}$ | Morris et al. (2019) |
| Bootstrap SE | Average bootstrap SE | $\sqrt{\frac{1}{R}\sum_r \widehat{Var}_r(\theta)}$ | Morris et al. (2019) |
| SE Ratio | Ratio between empirical and bootstrap SE | $\sqrt{\frac{\frac{1}{R-1}\sum_r (\hat{\theta}_r - \bar{\theta})^2}{\frac{1}{R}\sum_r \widehat{Var}_r(\theta)}}$ | Morris et al. (2019) |

RMSE: root mean squared error; SE: standard error; CI: confidence interval; $\theta$: true causal effect.
$\hat{\theta}_r$: point estimate from the r-th Monte Carlo run.
$\bar{\theta} = \frac{1}{R}\sum_r \hat{\theta}_r$: mean across R Monte Carlo simulation runs.
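The criteria in Table 2.2, together with the bootstrap SE and percentile CI from the bootstrap step, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the estimator is a plain sample mean on simulated normal data, the resampling is a simple i.i.d. bootstrap, and the names `bootstrap_summary` and `evaluation_criteria` are hypothetical. None of this reproduces the weighting estimators or the sampling design studied here.

```python
import numpy as np


def bootstrap_summary(data, estimator, n_boot=200, level=0.95, rng=None):
    """Bootstrap SE and percentile CI for one Monte Carlo run (toy i.i.d. bootstrap)."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    # Resample indices with replacement, one row per bootstrap sample.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot = np.array([estimator(data[i]) for i in idx])
    se = boot.std(ddof=1)  # divisor B - 1, matching the SE_b formula
    alpha = 1.0 - level
    lower, upper = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return se, lower, upper


def evaluation_criteria(estimates, boot_ses, lowers, uppers, theta):
    """Compute the Table 2.2 criteria from R Monte Carlo runs."""
    est = np.asarray(estimates, dtype=float)
    ses = np.asarray(boot_ses, dtype=float)
    lo = np.asarray(lowers, dtype=float)
    hi = np.asarray(uppers, dtype=float)
    empirical_se = est.std(ddof=1)             # SD of the point estimates
    bootstrap_se = np.sqrt(np.mean(ses ** 2))  # root of the mean bootstrap variance
    return {
        "rmse": float(np.sqrt(np.mean((est - theta) ** 2))),
        "coverage": float(np.mean((lo <= theta) & (theta <= hi))),
        "empirical_se": float(empirical_se),
        "bootstrap_se": float(bootstrap_se),
        "se_ratio": float(empirical_se / bootstrap_se),
    }


# Toy demonstration (assumed setup): estimate a normal mean, R = 1000, B = 200.
rng = np.random.default_rng(2023)
theta_true, n_obs, n_runs = 0.5, 50, 1000
runs = []
for _ in range(n_runs):
    sample = rng.normal(theta_true, 1.0, size=n_obs)
    se, lo, hi = bootstrap_summary(sample, np.mean, n_boot=200, rng=rng)
    runs.append((sample.mean(), se, lo, hi))
estimates, boot_ses, lowers, uppers = map(np.array, zip(*runs))
criteria = evaluation_criteria(estimates, boot_ses, lowers, uppers, theta_true)
print(criteria)
```

An SE ratio near 1 indicates that the bootstrap SE tracks the run-to-run variability of the point estimates, which is how the SE ratio columns of the result tables are read.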

To explore how the sample size may affect the estimation, the simulation comparing the HS method and the proposed approach is also carried out under three different choices of “cohort” size N, i.e., 18000, 36491, and 72000. The second choice (N = 36491) is chosen specifically to reflect the empirical study sample described in Section 2.7. The sampling selection strategy is consistent across the three scenarios, i.e., 100% sampling among subjects with Z = 1 and 3.56% simple random sampling among subjects with Z = 0. Therefore, the case-control sample sizes are approximately 2000, 4000, and 8000, respectively.
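The relation between cohort size and case-control sample size can be sketched as follows. This is a minimal illustration, not the simulation code used here: the prevalence of Z = 1 (`ASSUMED_PREVALENCE = 0.077`) is a hypothetical value chosen only so that the totals land near the sizes quoted above, and the function names are likewise assumptions.

```python
import numpy as np

CONTROL_FRACTION = 0.0356   # sampling fraction among subjects with Z = 0 (from the text)
ASSUMED_PREVALENCE = 0.077  # hypothetical P(Z = 1); not taken from the text


def expected_case_control_size(cohort_size, prevalence, control_fraction=CONTROL_FRACTION):
    """Expected sample size: all subjects with Z = 1 plus a fraction of those with Z = 0."""
    cases = cohort_size * prevalence
    controls = cohort_size * (1.0 - prevalence) * control_fraction
    return cases + controls


def draw_case_control_sample(z, control_fraction=CONTROL_FRACTION, rng=None):
    """Return sorted cohort indices: every Z = 1 subject and a simple random sample of Z = 0."""
    rng = np.random.default_rng() if rng is None else rng
    z = np.asarray(z)
    case_idx = np.flatnonzero(z == 1)
    control_pool = np.flatnonzero(z == 0)
    n_controls = int(round(control_fraction * control_pool.size))
    control_idx = rng.choice(control_pool, size=n_controls, replace=False)
    return np.sort(np.concatenate([case_idx, control_idx]))


# Expected sample sizes under the three cohort sizes.
sizes = {n: expected_case_control_size(n, ASSUMED_PREVALENCE) for n in (18000, 36491, 72000)}
print({n: round(s) for n, s in sizes.items()})

# One simulated cohort of size 36491 and its case-control sample.
rng = np.random.default_rng(1)
z_cohort = rng.binomial(1, ASSUMED_PREVALENCE, size=36491)
sample_idx = draw_case_control_sample(z_cohort, rng=rng)
print(len(sample_idx))
```

Because every subject with Z = 1 is retained while only 3.56% of those with Z = 0 are kept, the sample size scales linearly with N, which is why doubling the cohort roughly doubles the case-control sample.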


2.6 Simulation Results

2.6.1 Results Comparison Between Two Methods

Using MC simulation with 1000 runs under each presumed cohort size (18000, 36491, or 72000), we obtain the following results:

Table 2.3: Controlled Direct Effect Fixing Z = 0

| Scale | Method | Cohort size | True | Point Estimate | $SE_{\text{bootstrap}}$ | SE ratio^a | RMSE | Coverage (%) |
|---|---|---|---|---|---|---|---|---|
| Risk Difference | Huber and Solovyeva | 18000 | $2.722 \times 10^{-2}$ | $2.876 \times 10^{-2}$ | 0.0358 | 0.9724 | 0.0349 | 94.8 |
| Risk Difference | Huber and Solovyeva | 36491 | $2.722 \times 10^{-2}$ | $2.724 \times 10^{-2}$ | 0.0249 | 0.9887 | 0.0246 | 94.0 |
| Risk Difference | Huber and Solovyeva | 72000 | $2.722 \times 10^{-2}$ | $2.608 \times 10^{-2}$ | 0.0177 | 1.0123 | 0.0180 | 94.9 |
| Risk Difference | Proposed | 18000 | $2.722 \times 10^{-2}$ | $2.821 \times 10^{-2}$ | 0.0371 | 0.9714 | 0.0360 | 94.3 |
| Risk Difference | Proposed | 36491 | $2.722 \times 10^{-2}$ | $2.688 \times 10^{-2}$ | 0.0258 | 0.9941 | 0.0256 | 94.0 |
| Risk Difference | Proposed | 72000 | $2.722 \times 10^{-2}$ | $2.576 \times 10^{-2}$ | 0.0184 | 1.0056 | 0.0186 | 95.0 |
| Relative Risk | Huber and Solovyeva | 18000 | 1.232 | 1.284 | 0.3663 | 0.9368 | 0.3470 | 94.5 |
| Relative Risk | Huber and Solovyeva | 36491 | 1.232 | 1.257 | 0.2434 | 0.9782 | 0.2393 | 94.3 |
| Relative Risk | Huber and Solovyeva | 72000 | 1.232 | 1.242 | 0.1706 | 1.0028 | 0.1713 | 95.0 |
| Relative Risk | Proposed | 18000 | 1.232 | 1.260 | 0.3523 | 0.9351 | 0.3305 | 94.1 |
| Relative Risk | Proposed | 36491 | 1.232 | 1.238 | 0.2354 | 0.9840 | 0.2316 | 94.1 |
| Relative Risk | Proposed | 72000 | 1.232 | 1.224 | 0.1652 | 0.9953 | 0.1645 | 95.5 |
| Odds Ratio | Huber and Solovyeva | 18000 | 1.271 | 1.343 | 0.4548 | 0.9122 | 0.4208 | 94.5 |
| Odds Ratio | Huber and Solovyeva | 36491 | 1.271 | 1.305 | 0.2948 | 0.9658 | 0.2866 | 94.5 |
| Odds Ratio | Huber and Solovyeva | 72000 | 1.271 | 1.284 | 0.2043 | 0.9962 | 0.2039 | 95.0 |
| Odds Ratio | Proposed | 18000 | 1.271 | 1.318 | 0.4412 | 0.9099 | 0.4040 | 94.1 |
| Odds Ratio | Proposed | 36491 | 1.271 | 1.285 | 0.2876 | 0.9711 | 0.2795 | 94.0 |
| Odds Ratio | Proposed | 72000 | 1.271 | 1.265 | 0.1995 | 0.9900 | 0.1975 | 95.5 |

^a This is the ratio between empirical and bootstrap standard error.

From Table 2.3 we can see that our method is slightly better than the HS method for CDE when fixing the mediator Z at 0 on both the RR and OR scales, in terms of smaller bias and bootstrap SE. On the RD scale, the HS method is slightly better, with smaller bootstrap SE and overall RMSE.


Table 2.4: Controlled Direct Effect Fixing Z = 1

| Scale | Method | Cohort size | True | Point Estimate | $SE_{\text{bootstrap}}$ | SE ratio^a | RMSE | Coverage (%) |
|---|---|---|---|---|---|---|---|---|
| Risk Difference | Huber and Solovyeva | 18000 | 0.208 | 0.199 | 0.0275 | 0.9844 | 0.0284 | 93.2 |
| Risk Difference | Huber and Solovyeva | 36491 | 0.208 | 0.201 | 0.0193 | 0.9824 | 0.0203 | 91.8 |
| Risk Difference | Huber and Solovyeva | 72000 | 0.208 | 0.200 | 0.0137 | 0.9844 | 0.0154 | 90.9 |
| Risk Difference | Proposed | 18000 | 0.208 | 0.207 | 0.0275 | 0.9858 | 0.0272 | 94.2 |
| Risk Difference | Proposed | 36491 | 0.208 | 0.208 | 0.0193 | 0.9840 | 0.0190 | 95.5 |
| Risk Difference | Proposed | 72000 | 0.208 | 0.208 | 0.0137 | 0.9809 | 0.0134 | 95.5 |
| Relative Risk | Huber and Solovyeva | 18000 | 1.558 | 1.528 | 0.0893 | 0.9856 | 0.0931 | 93.4 |
| Relative Risk | Huber and Solovyeva | 36491 | 1.558 | 1.530 | 0.0623 | 0.9857 | 0.0677 | 91.1 |
| Relative Risk | Huber and Solovyeva | 72000 | 1.558 | 1.528 | 0.0442 | 0.9896 | 0.0531 | 87.8 |
| Relative Risk | Proposed | 18000 | 1.558 | 1.558 | 0.0923 | 0.9857 | 0.0909 | 94.1 |
| Relative Risk | Proposed | 36491 | 1.558 | 1.560 | 0.0643 | 0.9854 | 0.0634 | 95.3 |
| Relative Risk | Proposed | 72000 | 1.558 | 1.558 | 0.0457 | 0.9847 | 0.0450 | 95.3 |
| Odds Ratio | Huber and Solovyeva | 18000 | 2.329 | 2.264 | 0.2651 | 0.9721 | 0.2657 | 93.2 |
| Odds Ratio | Huber and Solovyeva | 36491 | 2.329 | 2.267 | 0.1836 | 0.9767 | 0.1900 | 91.9 |
| Odds Ratio | Huber and Solovyeva | 72000 | 2.329 | 2.261 | 0.1295 | 0.9796 | 0.1442 | 90.7 |
| Odds Ratio | Proposed | 18000 | 2.329 | 2.337 | 0.2747 | 0.9730 | 0.2673 | 94.0 |
| Odds Ratio | Proposed | 36491 | 2.329 | 2.338 | 0.1902 | 0.9777 | 0.1860 | 95.6 |
| Odds Ratio | Proposed | 72000 | 2.329 | 2.333 | 0.1342 | 0.9761 | 0.1310 | 95.3 |

^a This is the ratio between empirical and bootstrap standard error.

From Table 2.4 we can see that our method is better than the HS method for CDE when fixing the mediator Z at 1. First, our method has smaller bias on the RD, RR, and OR scales. Second, although our method has slightly larger bootstrap SE, its RMSE is still smaller overall than that of the HS estimates. Third, as the sample/cohort size increases, the coverage probability of our method gets closer to 95%, whereas the coverage of the HS estimates actually decreases.
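The second point can be checked against the reported numbers using the standard decomposition of mean squared error into squared bias plus variance. The sketch below takes the RR-scale values for cohort size 72000 from Table 2.4 and rebuilds the RMSE from the point estimate, the bootstrap SE, and the SE ratio. The function name is hypothetical, and the identity is only approximate here because the table entries are rounded and the empirical SE uses the divisor R − 1.

```python
import math


def rmse_from_components(point_estimate, true_value, se_bootstrap, se_ratio):
    """Approximate RMSE as sqrt(bias^2 + empirical SE^2)."""
    bias = point_estimate - true_value
    empirical_se = se_ratio * se_bootstrap  # SE ratio = empirical SE / bootstrap SE
    return math.sqrt(bias ** 2 + empirical_se ** 2)


# Relative risk scale, cohort size 72000, true value 1.558 (values from Table 2.4).
rmse_hs = rmse_from_components(1.528, 1.558, 0.0442, 0.9896)
rmse_proposed = rmse_from_components(1.558, 1.558, 0.0457, 0.9847)

print(round(rmse_hs, 4), round(rmse_proposed, 4))
```

The HS estimate has the smaller SE, but its bias of about −0.03 dominates once the SE has shrunk, so its RMSE ends up larger than that of the proposed estimator.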

Table 2.5: Natural Direct Effects

| Scale | Method | Cohort size | True | Point Estimate | $SE_{\text{bootstrap}}$ | SE ratio^a | RMSE | Coverage (%) |
|---|---|---|---|---|---|---|---|---|
| Risk Difference | Huber and Solovyeva | 18000 | $3.770 \times 10^{-2}$ | $3.864 \times 10^{-2}$ | 0.0349 | 0.9686 | 0.0338 | 94.3 |
| Risk Difference | Huber and Solovyeva | 36491 | $3.770 \times 10^{-2}$ | $3.736 \times 10^{-2}$ | 0.0243 | 0.9897 | 0.0240 | 93.8 |
| Risk Difference | Huber and Solovyeva | 72000 | $3.770 \times 10^{-2}$ | $3.623 \times 10^{-2}$ | 0.0173 | 1.0045 | 0.0174 | 95.1 |
| Risk Difference | Proposed | 18000 | $3.770 \times 10^{-2}$ | $3.868 \times 10^{-2}$ | 0.0348 | 0.9702 | 0.0338 | 94.8 |
| Risk Difference | Proposed | 36491 | $3.770 \times 10^{-2}$ | $3.738 \times 10^{-2}$ | 0.0243 | 0.9926 | 0.0241 | 93.8 |
| Risk Difference | Proposed | 72000 | $3.770 \times 10^{-2}$ | $3.629 \times 10^{-2}$ | 0.0173 | 1.0066 | 0.0175 | 95.1 |
| Relative Risk | Huber and Solovyeva | 18000 | 1.283 | 1.304 | 0.2876 | 0.9488 | 0.2736 | 94.1 |
| Relative Risk | Huber and Solovyeva | 36491 | 1.283 | 1.286 | 0.1947 | 0.9851 | 0.1918 | 94.0 |
| Relative Risk | Huber and Solovyeva | 72000 | 1.283 | 1.274 | 0.1373 | 0.9964 | 0.1370 | 95.4 |
| Relative Risk | Proposed | 18000 | 1.283 | 1.304 | 0.2867 | 0.9486 | 0.2727 | 94.0 |
| Relative Risk | Proposed | 36491 | 1.283 | 1.286 | 0.1945 | 0.9879 | 0.1921 | 93.9 |
| Relative Risk | Proposed | 72000 | 1.283 | 1.275 | 0.1373 | 0.9990 | 0.1373 | 95.0 |
| Odds Ratio | Huber and Solovyeva | 18000 | 1.341 | 1.379 | 0.3740 | 0.9241 | 0.3475 | 94.1 |
| Odds Ratio | Huber and Solovyeva | 36491 | 1.341 | 1.351 | 0.2476 | 0.9725 | 0.2409 | 94.1 |
| Odds Ratio | Huber and Solovyeva | 72000 | 1.341 | 1.334 | 0.1726 | 0.9917 | 0.1713 | 95.2 |
| Odds Ratio | Proposed | 18000 | 1.341 | 1.378 | 0.3730 | 0.9242 | 0.3465 | 94.2 |
| Odds Ratio | Proposed | 36491 | 1.341 | 1.351 | 0.2473 | 0.9754 | 0.2413 | 94.0 |
| Odds Ratio | Proposed | 72000 | 1.341 | 1.334 | 0.1726 | 0.9942 | 0.1717 | 95.1 |

^a This is the ratio between empirical and bootstrap standard error.


For the NDE estimates in Table 2.5, the two methods give almost exactly the same results, with our method having a slightly larger RMSE (at four decimal places).

Table 2.6: Natural Indirect Eﬀects

Scale            Method               Cohort size  True         Point Estimate  SE_bootstrap  SE ratio (a)  RMSE    Coverage (%)
Risk Difference  Huber and Solovyeva  18000        3.794x10^-2  3.779x10^-2     0.0047        0.9345        0.0044  95.6
Risk Difference  Huber and Solovyeva  36491        3.794x10^-2  3.790x10^-2     0.0032        0.9818        0.0032  94.6
Risk Difference  Huber and Solovyeva  72000        3.794x10^-2  3.803x10^-2     0.0023        0.9803        0.0023  94.2
Risk Difference  Proposed             18000        3.794x10^-2  3.794x10^-2     0.0047        1.0740        0.0050  92.6
Risk Difference  Proposed             36491        3.794x10^-2  3.796x10^-2     0.0032        1.1323        0.0037  91.7
Risk Difference  Proposed             72000        3.794x10^-2  3.805x10^-2     0.0023        1.1384        0.0026  89.7
Relative Risk    Huber and Solovyeva  18000        1.222        1.230           0.0752        0.8395        0.0636  95.1
Relative Risk    Huber and Solovyeva  36491        1.222        1.227           0.0476        0.9513        0.0455  94.4
Relative Risk    Huber and Solovyeva  72000        1.222        1.227           0.0325        0.9833        0.0324  93.3
Relative Risk    Proposed             18000        1.222        1.230           0.0753        0.8607        0.0653  94.7
Relative Risk    Proposed             36491        1.222        1.227           0.0477        0.9671        0.0464  94.7
Relative Risk    Proposed             72000        1.222        1.227           0.0326        1.0052        0.0331  93.6
Odds Ratio       Huber and Solovyeva  18000        1.280        1.289           0.0830        0.8462        0.0707  95.0
Odds Ratio       Huber and Solovyeva  36491        1.280        1.286           0.0528        0.9549        0.0507  94.6
Odds Ratio       Huber and Solovyeva  72000        1.280        1.286           0.0362        0.9846        0.0360  93.3
Odds Ratio       Proposed             18000        1.280        1.290           0.0832        0.8753        0.0734  94.5
Odds Ratio       Proposed             36491        1.280        1.286           0.0529        0.9811        0.0522  93.6
Odds Ratio       Proposed             72000        1.280        1.286           0.0362        1.0182        0.0373  93.1

(a) This is the ratio between empirical and bootstrap standard error.

Table 2.6 shows that, overall, the two methods still perform very similarly in the

simulation. On the RD scale, our method has lower coverage than the HS method,

although both methods have very close point estimates and bootstrap SEs.
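
The performance columns reported for Table 2.6 (SE ratio, RMSE, coverage) can be sketched as a small summary function over Monte Carlo replicates. This is a minimal illustration rather than the code used for the simulation study: the function name `mc_summary` and the toy inputs are hypothetical, and the coverage calculation assumes Wald-type intervals (estimate plus or minus 1.96 bootstrap SEs), which the text does not confirm.

```python
import math
import statistics


def mc_summary(estimates, boot_ses, truth, z=1.96):
    """Summarise Monte Carlo replicates of a single estimator.

    estimates: point estimate from each replicate
    boot_ses:  bootstrap SE reported in each replicate
    truth:     true value of the parameter
    """
    empirical_se = statistics.stdev(estimates)   # spread of the point estimates
    mean_boot_se = statistics.fmean(boot_ses)    # average reported bootstrap SE
    rmse = math.sqrt(statistics.fmean([(e - truth) ** 2 for e in estimates]))
    # Assumption: coverage is judged with Wald-type intervals, estimate +/- z * SE.
    hits = sum(1 for e, s in zip(estimates, boot_ses)
               if e - z * s <= truth <= e + z * s)
    return {
        "se_ratio": empirical_se / mean_boot_se,  # empirical SE over bootstrap SE
        "rmse": rmse,
        "coverage_pct": 100.0 * hits / len(estimates),
    }


# Hypothetical toy input, not values from the simulation study.
summary = mc_summary([0.9, 1.0, 1.1], [0.1, 0.1, 0.1], truth=1.0)
```

An SE ratio above 1 means the bootstrap SE understates the empirical variability on average, which tends to go together with coverage below the nominal level.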

2.6.2 Uncertainty of the Estimators

Since researchers are usually interested in the NE parameters, we discuss the SE of

the NIE for our method in this section.

Although we compared the bootstrap SEs of the proposed and HS estimates in

section 2.6.1, it is worth taking another look at the distribution of the NIE estimator

to examine its asymptotic normality. We also compare the bootstrap SE with the

robust SE under our method. This may help researchers understand how much the SE

can differ in practice depending on which of the two approaches is used to obtain it.

1) Distribution of the NIE estimator

Figure 2.2 shows the distribution of the NIE estimates from the proposed method

based on 1000 MC simulations. The top two histograms are on the RD scale for N = 18,000

and N = 72,000, while the bottom two are on the log-OR scale for the same two cohort sizes.

[Figure 2.2 image omitted. Four histogram panels, y-axis: Percent. Top row: NIE Estimates (RD, N=18,000) and NIE Estimates (RD, N=72,000). Bottom row: NIE Estimates (log-OR, N=18,000) and NIE Estimates (log-OR, N=72,000). Legend entry in each panel: Normal Density.]

Figure 2.2: Distribution of NIE Estimates using Proposed Approach

A visual check of the figure shows that the proposed approach gives NIE estimates

whose distribution is close to the normal density (orange curve). As the cohort size

increases, the kernel density (blue curve) gets closer to the normal density, which

supports the asymptotic normality assumption.
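
As a numeric complement to the visual check, one could also look at the sample skewness and excess kurtosis of the Monte Carlo estimates, both of which are zero for a normal distribution. The sketch below is an illustration only and is not part of the original analysis; the function name `normality_moments` and the toy input are hypothetical.

```python
import statistics


def normality_moments(estimates):
    """Return (skewness, excess kurtosis); both are 0 for a normal distribution."""
    centre = statistics.fmean(estimates)
    spread = statistics.pstdev(estimates)          # population SD of the estimates
    z = [(e - centre) / spread for e in estimates]  # standardised estimates
    skewness = statistics.fmean([v ** 3 for v in z])
    excess_kurtosis = statistics.fmean([v ** 4 for v in z]) - 3.0
    return skewness, excess_kurtosis


# Hypothetical toy input; in practice `estimates` would hold the 1000 MC estimates.
skew, kurt = normality_moments([-2.0, -1.0, 0.0, 1.0, 2.0])
```

Values that move toward zero as the cohort size grows would be consistent with the pattern seen in the histograms.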

Figure 2.3 shows the distribution of the HS estimates of the NIE based

on 1000 MC simulations.

[Figure 2.3 image omitted. Four histogram panels, y-axis: Percent. Top row: NIE Estimates (RD, N=18,000) and NIE Estimates (RD, N=72,000). Bottom row: NIE Estimates (log-OR, N=18,000) and NIE Estimates (log-OR, N=72,000). Legend entry in each panel: Normal Density.]

Figure 2.3: Distribution of NIE Estimates using HS Method

A visual check of the figure shows that the HS method gives estimates whose

distribution is also close to normal. Huber and Solovyeva point out that their approach is

√n-consistent and asymptotically normal under standard regularity conditions (Huber &

Solovyeva 2020). When the cohort size increases from 18,000 to 72,000, the kernel density

(blue curve) gets closer to the normal density (orange curve).

With this supporting evidence, we consider that both the proposed and the HS

methods give asymptotically normally distributed estimators for the NIE.

2) Robust and bootstrap SE

Figure 2.4 shows the distributions of the robust and bootstrap SEs for our proposed

method under the simulation settings from sections 2.5.2 and 2.5.3. The top two densities are

on the RD scale for N = 18,000 and N = 72,000, while the bottom two are on the log-OR

scale for the same two cohort sizes.

[Figure 2.4 image omitted. Four density panels, y-axis: Percent. Top row (axis label NIE_RD_SEb): Bootstrap SE (RD, N=18,000) with Robust SE (RD), and Bootstrap SE (RD, N=72,000) with Robust SE (RD). Bottom row (axis label NIE_logOR_SEb): Bootstrap SE (log-OR, N=18,000) with Robust SE (log-OR), and Bootstrap SE (log-OR, N=72,000) with Robust SE (log-OR).]

Figure 2.4: Distribution Comparison Between Robust and Bootstrap SE for Proposed

Method

While the robust SE (orange curve) is slightly larger than the bootstrap SE (blue curve)

for the NIE estimates on the RD scale, their distributions are very close for the NIE estimates

on the log-OR scale. Since the two methods give asymptotically normally distributed estimators, we

believe both the robust and the bootstrap resampling-based methods can be used to obtain the SE

of the NIE estimates.
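
To make the robust-versus-bootstrap comparison concrete, the sketch below computes both kinds of SE for a much simpler estimator, a weighted mean of a binary outcome. It is a toy stand-in and not the NIE estimator of this chapter: the linearisation formula in `robust_se`, the function names, and the simulated sample (200 cases with weight 1 and 200 controls with weight 30) are all assumptions made for illustration.

```python
import math
import random
import statistics


def weighted_mean(y, w):
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)


def robust_se(y, w):
    """Linearisation (sandwich-type) SE of the weighted mean."""
    mu = weighted_mean(y, w)
    return math.sqrt(sum((wi * (yi - mu)) ** 2 for yi, wi in zip(y, w))) / sum(w)


def bootstrap_se(y, w, n_boot=500, seed=0):
    """Nonparametric bootstrap SE: resample subjects with replacement."""
    rng = random.Random(seed)
    n = len(y)
    reps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        reps.append(weighted_mean([y[i] for i in idx], [w[i] for i in idx]))
    return statistics.stdev(reps)


# Hypothetical sample: 200 fully sampled cases (weight 1) and 200 controls
# sampled at 1 in 30 (weight 30), with a simulated binary outcome.
rng = random.Random(1)
w = [1.0] * 200 + [30.0] * 200
y = [1 if rng.random() < 0.3 else 0 for _ in w]
se_robust = robust_se(y, w)
se_boot = bootstrap_se(y, w)
```

The robust SE needs a single pass over the data, whereas the bootstrap SE refits the estimator once per resample, which is where the difference in computational cost comes from.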

2.7 Application: the Olfaction Sub-study of the Sister Study Cohort

In this section, we implement both the HS method and our proposed method using the

NIEHS Sister Study data. The Sister Study is an ongoing nationwide cohort to investigate

environmental and genetic risk factors for breast cancer and other chronic diseases (Sandler

et al. 2017). Details of a case-control study design with a secondary outcome and data

collection have been published elsewhere (Cao et al. 2022). In this example, we use the

same case-control study design as previously published (Cao et al. 2022), with a subsequently

collected secondary outcome as the outcome of interest, to implement the two methods

for estimating both the CDE and the NEs. Briefly, during 2012-2014 (second detailed follow-up,

i.e., DFU-2, considered as baseline in this example), the Sister Study collected data on

whether the participants ever had a serious head injury that resulted in unconsciousness,

coma, or hospitalization. At the third detailed-follow-up (DFU-3, 2014-2016), a stratiﬁed

case-control sample was selected based on self-reported sense of smell (100% sampling for

self-report “poor” and about 3.56% simple random sampling for self-report “normal”) (Cao

et al. 2022). The self-report sense of smell is a surrogate of the olfactory function at DFU-3.

Later in 2018-2019, study investigators mailed out objective olfaction test kits (12-item Brief

Smell Identiﬁcation Test, B-SIT) to those who were selected in the case-control sample only

and collected the secondary outcome, B-SIT-tested olfaction. Previous publications have

suggested the link between traumatic brain injury and olfactory impairment (Howell et al.

2018, Xydakis et al. 2015). We define serious head injury as present for subjects who answered

“YES” to the corresponding question and define the mediator as the self-reported sense of smell

at DFU-3. We further define olfactory impairment in 2018-2019 using a cutoff of ≤9 on

the B-SIT (Cao et al. 2022).

In the illustrative example, we are interested in studying the eﬀects of a serious

head injury (X) on tested poor olfaction (secondary outcome Y ) mediated through the self-

report sense of smell (case-control status or the primary outcome Z) at DFU-3. Using the

notation described in section 2.2.1, we treat the participants’ age at DFU-2 as the

common cause of the exposure, the mediator and the outcome, i.e., C1, and the participants’

race/ethnicity as the common cause of the mediator and the outcome, i.e., C2. Table 2.7

below summarizes the definitions of the variables used in this analysis:

Table 2.7: Variable Deﬁnition for the Sister Study Example

Variables        Definition                      Variable Type        Distribution (N=33,932)  Time of Assessment
Confounder (C1)  Age                             Continuous           60.7 (7.3)               DFU-2 (2012-2014)
Confounder (C2)  Race/Ethnicity                  White (0)/Other (1)  P(C2 = 1) = 12.4%        Enrollment (2003-2009)
Exposure (X)     History of serious head injury  Yes (1)/No (0)       P(X = 1) = 7.4%          DFU-2 (2012-2014)
Mediator (Z)     Self-report olfaction           Poor (1)/Normal (0)  P(Z = 1) = 6.6%          DFU-3 (2014-2016)
Outcome (Y)      B-SIT tested olfaction (≤9)     Poor (1)/Normal (0)  Poor: 970 out of 3,249   2018-2019

This analysis aims to illustrate the implementation of the two methods for mediation

analysis in complex survey data and may not have an epidemiological interpretation, for

several reasons. First, the assessment of the mediator (olfactory status at DFU-3) is

based on self-report, which suffers from potential measurement error. Second, the mediator

and the outcome are both surrogates for the true olfactory status at two time points, so the

clinical meaning of this mediating effect is questionable. Third, only a single question or

test was used to assess olfaction for both the mediator and the outcome. This

increases the possibility of measurement error in the true olfactory status.

This example includes the 33,932 of 36,491 cohort participants who had

non-missing values for the covariate set C, the exposure X, and the mediator (the primary

case-control outcome) Z. Investigators used 100% sampling for cases and ~3.2% simple

random sampling for controls. The complete-case data include 3,249 participants in the

case-control sample with no missing values for the outcome Y. Table 2.8 shows the estimates of the

mediating effects on the RD, RR, and OR scales with 95% CIs.
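
The effect of the sampling design can be illustrated with inverse-probability-of-selection weights alone. The sketch below is a simplified illustration and not the proposed estimator, which combines the sampling process with ratio-of-mediator-probability weighting; the function names and the cohort counts (10,000 participants, 660 cases, 299 sampled controls) are hypothetical and chosen only to mimic a 100% case and roughly 3.2% control sampling scheme.

```python
def sampling_weights(case_status, control_fraction, case_fraction=1.0):
    """Inverse-probability-of-selection weight for each sampled subject."""
    return [1.0 / case_fraction if z == 1 else 1.0 / control_fraction
            for z in case_status]


def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


# Hypothetical cohort of 10,000 with 660 cases: all 660 cases are sampled and
# 299 of the 9,340 controls (about 3.2%) are sampled.
z = [1] * 660 + [0] * 299
w = sampling_weights(z, control_fraction=0.032)
naive_share = sum(z) / len(z)          # case share in the biased sample
weighted_share = weighted_mean(z, w)   # case share after reweighting
```

In this toy example the unweighted case share is far above the cohort prevalence, while the weighted share is close to 660 / 10,000, which is the sense in which the case-control sample is biased unless the sampling process is accounted for.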

Both methods give similar estimates of the NEs and of the CDE when fixing the mediator

(self-report sense of smell) at a value of 1. For the CDE when fixing the mediator at a value

of 0, the two methods provide slightly different point estimates, while their bootstrap 95% CIs

remain similar. Compared with the bootstrap 95% CI, the robust CI is

similar, with a slight efficiency gain (narrower). Again, this is only

an example of how to apply the two methods to real-world data, and no clinical

conclusion should be drawn from this analysis.
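
The three effect scales used in this example (RD, RR and OR) are different contrasts of the same pair of risks. The sketch below shows that relationship, together with a standard Wald interval built on the log scale for a ratio measure. It is an illustration under stated assumptions: the function names and input values are hypothetical, and the text does not say whether the robust CIs in this chapter were constructed on the log scale.

```python
import math


def effect_scales(risk_a, risk_b):
    """Contrast two risks (probabilities strictly between 0 and 1) on three scales."""
    odds_a = risk_a / (1.0 - risk_a)
    odds_b = risk_b / (1.0 - risk_b)
    return {"RD": risk_a - risk_b, "RR": risk_a / risk_b, "OR": odds_a / odds_b}


def log_scale_wald_ci(ratio, se_log, z=1.96):
    """Wald interval built on the log scale and mapped back to the ratio scale."""
    centre = math.log(ratio)
    return math.exp(centre - z * se_log), math.exp(centre + z * se_log)


# Hypothetical risks and SE, not values from the Sister Study analysis.
scales = effect_scales(0.2, 0.1)
lower, upper = log_scale_wald_ci(scales["RR"], se_log=0.25)
```

An interval built this way is symmetric around the point estimate on the log scale, so its lower and upper limits multiply to the square of the estimate.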

Table 2.8: Method Comparison Using the Sister Study

                 Huber & Solovyeva                   Proposed
Effect           Point Estimate  Bootstrap 95% CI    Point Estimate  Bootstrap 95% CI    Robust 95% CI

Risk Difference
CDE0             -0.009          (-0.074, 0.071)     0.008           (-0.072, 0.102)     (-0.076, 0.093)
CDE1             0.034           (-0.034, 0.102)     0.039           (-0.029, 0.107)     (-0.029, 0.107)
NDE              0.009           (-0.065, 0.099)     0.011           (-0.062, 0.098)     (-0.066, 0.089)
NIE              0.007           (0.003, 0.011)      0.008           (0.004, 0.012)      (0.004, 0.012)

Relative Risk
CDE0             0.92            (0.32, 1.73)        1.07            (0.36, 1.99)        (0.52, 2.20)
CDE1             1.09            (0.91, 1.29)        1.11            (0.92, 1.31)        (0.93, 1.32)
NDE              1.07            (0.50, 1.81)        1.09            (0.53, 1.82)        (0.62, 1.92)
NIE              1.05            (1.02, 1.15)        1.06            (1.02, 1.16)        (1.00, 1.11)

Odds Ratio
CDE0             0.91            (0.29, 1.88)        1.08            (0.34, 2.24)        (0.48, 2.45)
CDE1             1.15            (0.86, 1.53)        1.18            (0.88, 1.56)        (0.89, 1.57)
NDE              1.08            (0.46, 2.04)        1.10            (0.49, 2.04)        (0.57, 2.13)
NIE              1.06            (1.02, 1.16)        1.07            (1.02, 1.17)        (1.01, 1.13)

2.8 Conclusion

In Chapter 2, we discussed the estimation of mediating eﬀects (controlled direct eﬀects and

natural eﬀects) using complex sampling data of a case-control sample drawn from the target

population (e.g., a cohort) with the primary outcome as the mediator and a secondary

outcome as the outcome of interest. We ﬁrst showed that ignoring the study design would

lead to serious bias in the natural (in)direct eﬀect estimation. Second, we proposed a set

of weighting estimators and compared them with Huber and Solovyeva's work in a Monte

Carlo simulation study. For controlled direct effects, the proposed estimator has overall

smaller root mean squared error and better coverage probability when the mediator is fixed

at 1. The two methods are comparable for estimating natural (in)direct effects. A

second advantage of the proposed method is that, unlike Huber and Solovyeva's approach,

the proposed estimation of weights does not involve the probability model of the exposure

given the mediator. Third, we showed the validity of using the robust standard error for the

proposed estimator in addition to the bootstrap resampling-based standard error. Fourth,

we applied and compared the two approaches on real-world data for illustration.

Overall, we recommend using more than one method in practice, to give researchers

more information and more comprehensive results for clinical interpretation.


CHAPTER 3 Mediation Analyses for Case-Control Studies With Secondary

and Tertiary Outcomes

3.1 Background

In Chapter 2, we discussed the mediation analysis using case-control studies with the primary

outcome as the mediator and a secondary outcome collected subsequently as the outcome

of interest. In fact, with more follow-ups available to researchers, not only secondary but

also tertiary outcomes can further be collected. This may happen after a case-control study

has been conducted, with additional interest in the direct or indirect effects of exposure on the

tertiary outcome through the secondary outcome. Unlike Chapter 2, here both the secondary

outcome (the mediator) and the tertiary outcome (the outcome), and possibly some covari-

ates, are observed only within the case-control sample. For example, Cao et al. (in press,

Environmental Health Perspectives, DOI: 10.1289/EHP12066) studied exposure to ambient

air pollutants in relation to a secondary outcome, i.e., olfactory impairment deﬁned by a

Brief Smell Identiﬁcation Test (B-SIT), using a case-control sample from the NIEHS Sister

Study. Using similar designs, researchers may be further interested in the mediating eﬀects

of exposure to air pollutants on a tertiary outcome, e.g., cognitive deﬁcit, through the sec-

ondary outcome, i.e., olfactory impairment. However, performing such mediation analysis

using existing methods could lead to seriously biased estimates since the case-control

sample is an outcome-dependent sample. In this chapter, we will first describe the settings and

notations, followed by a marginal structural model to show the consequence of ignoring the

complex study design. Then we will discuss the extension of an existing estimator that is

potentially suitable for this scenario. Next, we propose a new estimator (weighting

estimator II) and compare its performance with the existing estimator. Finally, we implement the two

methods using a real-world example from the NIEHS Sister Study data.


3.2 Settings

3.2.1 Notations and Causal Directed Acyclic Graph (DAG)

In this chapter, our study setting is the mediation analysis of a case-control sample with a

secondary outcome as the mediator and a subsequently collected tertiary outcome as the

outcome of interest. We have (C1, C2, X, Z, S) with the same definitions as in Chapter

2. Additionally, let C3 be the mediator-outcome confounder assessed post-baseline, e.g., a

comorbidity at the time of mediator assessment, prior to the assessment of the outcome.

It is possible that C3 is only assessed within the case-control sample. Let M be the mediator

(secondary outcome), which is assessed after the collection of the case-control sample and is

available only within the sample. We assume that the sample is selected from an existing cohort

study, like in many epidemiological studies. Let Y be the tertiary outcome (the outcome of

interest in the mediation analysis) assessed after the measurement of the mediator. Figure

3.1 shows the causal DAG for the data generating process of (C1, C2, C3, X, Z, M, S, Y ).


C1: confounder between the exposure, the mediator and the outcome

C2: common cause for primary outcome, the mediator and the outcome

C3: confounder between the mediator and the outcome

X: exposure/treatment

Z: case-control status

M : secondary outcome (mediator)

S: case-control sample selection node

Y : (tertiary) outcome

Figure 3.1: Causal DAG for Mediation Analysis

For simplicity, assume C2, C3, X, Z, M and Y are all binary variables. In practice, our

proposed approach can be extended to non-binary variables as well. If the case-control sample

is selected from a cohort (source population), then (C1, C2, X, Z, S) are observed for all

units whereas (C3, M, Y) are only observed when S = 1, i.e., among units who are selected

into the case-control sample. In our application to the Sister Study data, all cases are

selected and controls are selected using simple random sampling (SRS) with a pre-specified

sampling rate (Cao et al. 2022). As in the previous chapter, our proposed approach may also be

extended to other sampling methods, such as stratified sampling or probability-proportional-to-size

sampling, as long as the sampling process is known by design. The aim here is to estimate the causal

parameters (e.g., mediating effects) in the cohort using the selected case-control sample. With such a

study design, X is fully observed in the cohort. However, even if X is only observable within

the case-control sample, using the approach discussed in section 3.3 we can still estimate the

mediating effects in the cohort (source population). We adopt the Neyman-Rubin causal model

(Rubin 1978, Splawa-Neyman et al. 1923) and invoke the stable unit treatment value assumption

(SUTVA) for simplicity (Rubin 1986). Let $M^{x}$ be the counterfactual mediator and $Y^{xM^{x}}$

be the counterfactual outcome when the exposure X is set to the value x. The goal is to study

mediating effects from X to Y through the mediator M.

3.2.2 Controlled Direct Eﬀect (CDE)

For binary Y and M , the population (cohort) CDE can be deﬁned in three diﬀerent scales

(Robins & Greenland 1992, Pearl 2001, VanderWeele 2015):

Risk difference: $\mathrm{CDE}_{RD} = E[Y^{1m} - Y^{0m}]$, for $m \in \{0, 1\}$.

Relative risk: $\mathrm{CDE}_{RR} = \dfrac{P(Y^{1m}=1)}{P(Y^{0m}=1)}$, for $m \in \{0, 1\}$.

Odds ratio: $\mathrm{CDE}_{OR} = \dfrac{P(Y^{1m}=1)/P(Y^{1m}=0)}{P(Y^{0m}=1)/P(Y^{0m}=0)}$, for $m \in \{0, 1\}$.

3.2.3 Natural Eﬀects (NE)

Similar to CDE, for binary Y and M , the population natural (in)direct eﬀects can also be

deﬁned in three diﬀerent scales (Robins & Greenland 1992, Pearl 2001, VanderWeele 2015):

Risk difference: $\mathrm{NDE}_{RD} = E[Y^{1M^{0}} - Y^{0M^{0}}]$ and $\mathrm{NIE}_{RD} = E[Y^{1M^{1}} - Y^{1M^{0}}]$.

Relative risk: $\mathrm{NDE}_{RR} = \dfrac{P(Y^{1M^{0}}=1)}{P(Y^{0M^{0}}=1)}$ and $\mathrm{NIE}_{RR} = \dfrac{P(Y^{1M^{1}}=1)}{P(Y^{1M^{0}}=1)}$.

Odds ratio: $\mathrm{NDE}_{OR} = \dfrac{P(Y^{1M^{0}}=1)/P(Y^{1M^{0}}=0)}{P(Y^{0M^{0}}=1)/P(Y^{0M^{0}}=0)}$ and $\mathrm{NIE}_{OR} = \dfrac{P(Y^{1M^{1}}=1)/P(Y^{1M^{1}}=0)}{P(Y^{1M^{0}}=1)/P(Y^{1M^{0}}=0)}$.

3.3 Existing Methods for the Scenario

3.3.1 Weighting Method Ignoring the Study Design

As in Chapter 2, we point out the necessity of considering the study

design when performing causal mediation analysis. The same model is used as in Chapter 2

(VanderWeele 2009) to illustrate the consequence of ignoring the secondary and tertiary data

collection process, namely biased estimates in mediation analysis. We also discuss a

possible remedy for such a scenario. The notation is consistent with the definitions in section

3.2.1.

To adapt to the notation and settings of this chapter, the following MSMs are used:

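$$E[Y^{xm}] = g(x, m) = \alpha_0 + \alpha_1 x + \alpha_2 m + \alpha_3 xm \tag{3.1}$$

$$E[M^{x} \mid C = c] = h(x, c) = \gamma_0 + \gamma_1 x + \gamma_2' c \tag{3.2}$$

$$E[Y^{xm} \mid C = c] = g(x, m, c) = \theta_0 + \theta_1 x + \theta_2 m + \theta_3 xm + \theta_4' c \tag{3.3}$$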

MSM 3.1 is used to estimate the CDE by assigning each individual a weight of $w_x \cdot w_m$, where

$w_x = \dfrac{P(X=x)}{P(X=x \mid C=c)}$ and $w_m = \dfrac{P(M=m \mid X=x)}{P(M=m \mid X=x, C=c)}$. However, since M

is only observed within the case-control sample, $P(M = m \mid X = x)$ and $P(M = m \mid X = x, C = c)$

are not directly estimable from the observed data. Here, we can take advantage of

knowing the study design and estimate these two quantities indirectly. Specifically, we have

$$P(M = m \mid X = x) = \sum_{z} P(M = m \mid X = x, Z = z) \cdot P(Z = z \mid X = x) \tag{3.4}$$

$$P(M = m \mid X = x, C = c) = \sum_{z} P(M = m \mid X = x, C = c, Z = z) \cdot P(Z = z \mid X = x, C = c) \tag{3.5}$$

MSMs 3.2 and 3.3 are used to estimate the conditional NEs. For 3.2, the weight

$w_2 = \dfrac{1}{P(X=x \mid C=c)}$ is assigned to each individual, and for 3.3, the same weight $w_x \cdot w_m$ is used.

One difference between 3.3 and 3.1 is that model 3.3 also conditions on the covariate set C,

i.e., (C1, C2, C3) in our case. Thus, estimates from MSMs 3.2 and 3.3 are called conditional

NEs. To get marginal ones, we need to further take the expectation over the covariate set

C.

Since the case-control sampling is stratiﬁed by Z, P (M |X, Z = z) and P (M |X, C, Z =

z) can be correctly estimated. This is simply because for units with Z = 1, we have


100% sampling, which indicates that the probability mass function (pmf) P (M |X, Z = 1)

(or P (M |X, C, Z = 1)) estimated from the case-control sample is the “true” pmf from

the sub-population. Similarly, for subjects with Z = 0, the SRS also indicates that the

pmf P (M |X, Z = 0) (or P (M |X, C, Z = 0)) can be correctly estimated using the case-

control sample. And since Z is observed across the whole population (e.g., the cohort),

P (Z = z|X = x) and P (Z = z|X = x, C = c) can be correctly estimated using the whole

population.
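
To make the stratified argument behind equation 3.4 concrete, the following Python sketch simulates a cohort, keeps all cases and a simple random sample of controls, and recovers P(M = 1 | X = x) by combining within-sample estimates of P(M | X, Z) with cohort-level estimates of P(Z | X). The function name `stratified_pm_given_x`, the cohort size, the sampling rate and all coefficients are assumptions chosen for illustration only; they are not the settings used in this dissertation.

```python
import numpy as np

def stratified_pm_given_x(x, z, m, s, x_val):
    """Estimate P(M=1 | X=x_val) as sum_z P(M=1 | X, Z=z, S=1) * P(Z=z | X)."""
    total = 0.0
    in_cohort = (x == x_val)
    for z_val in (0, 1):
        # P(Z=z | X=x) uses the whole cohort, where Z is fully observed.
        p_z = np.mean(z[in_cohort] == z_val)
        # P(M=1 | X=x, Z=z) uses only sampled units, where M is observed.
        in_sample = in_cohort & (z == z_val) & (s == 1)
        p_m = np.mean(m[in_sample] == 1)
        total += p_m * p_z
    return total

# Hypothetical cohort (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
n = 200_000
x = rng.binomial(1, 0.4, n)                          # exposure
z = rng.binomial(1, 0.05 + 0.05 * x)                 # primary outcome (case status)
m = rng.binomial(1, 0.10 + 0.10 * x + 0.30 * z)      # secondary outcome
s = np.where(z == 1, 1, rng.binomial(1, 0.10, n))    # all cases, 10% SRS of controls

for x_val in (0, 1):
    truth = np.mean(m[x == x_val])                   # not available in practice
    naive = np.mean(m[(x == x_val) & (s == 1)])      # ignores the sampling design
    strat = stratified_pm_given_x(x, z, m, s, x_val)
    print(x_val, round(truth, 3), round(naive, 3), round(strat, 3))
```

In this toy setting the naive within-sample proportion over-represents cases and therefore overstates P(M = 1 | X = x), whereas the stratified estimate tracks the cohort value.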

CDE and marginal NEs are estimated using the following:

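$$\mathrm{CDE}_{RD} = E[Y^{1m} - Y^{0m}] = \alpha_1 + \alpha_3 m \tag{3.6}$$

$$\text{pure } \mathrm{NDE}_{RD} = E[Y^{1M^{0}} - Y^{0M^{0}}] = \theta_1 + \theta_3 \gamma_0 + \theta_3 \gamma_2' E[C] \tag{3.7}$$

$$\text{total } \mathrm{NIE}_{RD} = E[Y^{1M^{1}} - Y^{1M^{0}}] = (\theta_2 + \theta_3) \gamma_1 \tag{3.8}$$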

Table 3.1: Mediating Eﬀects When Ignoring Study Design

Effect (RD)  Cohort size  True   Point Estimate  SE_bootstrap  SE ratio a  RMSE    Coverage (%)
CDE0         36491        0.016  0.014           0.0148        0.9976      0.0148  94.6
CDE1         36491        0.162  0.161           0.0269        0.9985      0.0269  94.6
NDE          36491        0.034  0.050           0.0128        1.0032      0.0205  76.5
NIE          36491        0.036  0.056           0.0067        1.0040      0.0209  11.6

a This is the ratio between empirical and bootstrap standard error.

From Table 3.1 we can see that even though the CDE can be estimated without bias,

there is serious bias in estimating the NEs. One possible reason is that when the exposure X

and the mediator M are fixed and the covariates C are controlled for (by weighting),

$Y^{xm} \perp\!\!\!\perp S \mid C$ is achieved. Thus, ignoring the sample selection process would still give

unbiased estimates of $E[Y^{xm}]$ from MSM 3.1 owing to this conditional independence. However,

for NEs, even though we can fix X at x (or $x^{*}$), $M^{x}$ (or $M^{x^{*}}$) is a random

variable. Thus, $Y^{M^{x}}$ (or $Y^{M^{x^{*}}}$) is not independent of sample selection S in this scenario,

which will lead to biased estimates of NEs if the study design is ignored. In such a case, either

the weight for MSM 3.3 needs to be adjusted to incorporate the sample selection process, or

a different approach can be used to estimate the NEs. Below we give a potential remedy for

this.

The NDE can still be estimated without bias if the CDE estimates are unbiased, since the NDE is

a weighted average of CDEs (Rudolph et al. 2019, Petersen et al. 2006). In the case-control

sample, if each counterfactual outcome mean can be estimated without bias, then so can all

NEs.

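$$
\begin{aligned}
E[Y^{xM^{x}}] &= E_C\big[E[Y^{xM^{x}} \mid C]\big] \\
&= E_C\Big[\int y \cdot P(Y^{xM^{x}} = y \mid C)\,dy\Big] \\
&= E_C\Big[\iint y \cdot P(Y^{xM^{x}} = y, M^{x} = m \mid C)\,dm\,dy\Big] \\
&= E_C\Big[\iint y \cdot P(Y^{xm} = y \mid M^{x} = m, C) \cdot P(M^{x} = m \mid C)\,dm\,dy\Big] \\
&= E_C\Big[\sum_{m} E[Y^{xm} \mid C] \cdot P(M^{x} = m \mid C)\Big]
\end{aligned}
$$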

Similarly, we have $E[Y^{xM^{x^{*}}}] = E_C\big[\sum_{m} E[Y^{xm} \mid C] \cdot P(M^{x^{*}} = m \mid C)\big]$. The NIE

can then be calculated as the difference between the above two quantities. The key here is to

estimate $P(M^{x} = m \mid C)$ and $P(M^{x^{*}} = m \mid C)$ without bias. In VanderWeele (2009), MSM

3.2 is used specifically for estimating these two quantities; ignoring the study design

leads to biased estimation at this step, and in turn to biased estimation of the

NEs. A thorough investigation of this is a separate topic and is not our primary focus

here. Next, we present a modified version of an existing method that could potentially

give unbiased estimates of mediating effects under such a complex study design.
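
The last line of the derivation above can be checked numerically. The sketch below evaluates the nested counterfactual mean for binary M and a single binary covariate C, using linear models of the form 3.2 and 3.3. All coefficient values, the array layout and the function name `nested_mean` are assumptions made for illustration; with these inputs the resulting NDE and NIE should agree with the closed forms in equations 3.7 and 3.8.

```python
import numpy as np

def nested_mean(ey_xm_c, pm1_c, pc1, x, x_star):
    """E[Y^{x M^{x*}}] = sum_c P(C=c) * sum_m E[Y^{xm} | C=c] * P(M^{x*}=m | C=c).

    ey_xm_c[x, m, c] : E[Y^{xm} | C=c]
    pm1_c[x, c]      : P(M^{x}=1 | C=c)
    pc1              : P(C=1)
    """
    p_c = np.array([1.0 - pc1, pc1])
    total = 0.0
    for c in (0, 1):
        p_m = np.array([1.0 - pm1_c[x_star, c], pm1_c[x_star, c]])
        total += p_c[c] * np.sum(ey_xm_c[x, :, c] * p_m)
    return total

# Hypothetical linear models (theta and gamma values are illustrative only).
theta = dict(t0=0.05, t1=0.02, t2=0.10, t3=0.05, t4=0.03)
gamma = dict(g0=0.10, g1=0.10, g2=0.05)
pc1 = 0.3

ey = np.zeros((2, 2, 2))
pm1 = np.zeros((2, 2))
for xv in (0, 1):
    for cv in (0, 1):
        pm1[xv, cv] = gamma["g0"] + gamma["g1"] * xv + gamma["g2"] * cv
        for mv in (0, 1):
            ey[xv, mv, cv] = (theta["t0"] + theta["t1"] * xv + theta["t2"] * mv
                              + theta["t3"] * xv * mv + theta["t4"] * cv)

nde = nested_mean(ey, pm1, pc1, 1, 0) - nested_mean(ey, pm1, pc1, 0, 0)
nie = nested_mean(ey, pm1, pc1, 1, 1) - nested_mean(ey, pm1, pc1, 1, 0)

# Closed forms from equations 3.7 and 3.8 for comparison.
nde_closed = theta["t1"] + theta["t3"] * (gamma["g0"] + gamma["g2"] * pc1)
nie_closed = (theta["t2"] + theta["t3"]) * gamma["g1"]
print(nde, nde_closed, nie, nie_closed)
```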

3.3.2 Modiﬁed Huber and Solovyeva’s Method

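With the notation and settings of this chapter, the Huber and Solovyeva (HS) estimator can be written

as follows:

$$\hat{E}[Y^{xm}] = \frac{1}{n} \sum_{i} \frac{Y_i \cdot I\{X_i = x\} \cdot I\{M_i = m\} \cdot S_i}{P(X_i = x \mid C_{1i}) \cdot P(M_i = m \mid X_i, C_i) \cdot P(S_i = 1 \mid X_i, M_i, C_i)} \tag{3.9}$$

$$\hat{E}[Y^{xM^{x}}] = \frac{1}{n} \sum_{i} \frac{Y_i \cdot I\{X_i = x\} \cdot S_i}{P(X_i = x \mid C_{1i}) \cdot P(S_i = 1 \mid X_i, M_i, C_i)} \tag{3.10}$$

$$\hat{E}[Y^{xM^{x^{*}}}] = \frac{1}{n} \sum_{i} \frac{Y_i \cdot I\{X_i = x\} \cdot S_i}{P(X_i = x \mid M_i, C_i) \cdot P(S_i = 1 \mid X_i, M_i, C_i)} \cdot \frac{P(X_i = 1 - x \mid M_i, C_i)}{P(X_i = 1 - x \mid C_i)} \tag{3.11}$$

where $C_i = (C_{1i}, C_{2i}, C_{3i})$.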

However, probabilities such as $P(S_i = 1 \mid X_i, M_i, C_i)$ cannot be directly estimated

from the observed data. Even the stratified approach described in section 3.3.1 fails, since

$P(Z_i = z \mid X_i, M_i, C_i)$ is still not estimable from the observed data alone. Thus, we modified

equations 3.9, 3.10 and 3.11 as follows, using basic probability theory:

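$$\hat{E}[Y^{xm}] = \frac{1}{n} \sum_{i} \frac{Y_i \cdot I\{X_i = x\} \cdot I\{M_i = m\} \cdot S_i}{P(X_i = x \mid C_{1i}) \cdot P(M_i = m \mid X_i, C_i, S_i = 1) \cdot P(S_i = 1 \mid X_i, C_{1i}, C_{2i})} \tag{3.12}$$

$$\hat{E}[Y^{xM^{x}}] = \frac{1}{n} \sum_{i} \frac{Y_i \cdot I\{X_i = x\} \cdot S_i \cdot P(M_i = m \mid X_i, C_i)}{P(X_i = x \mid C_{1i}) \cdot P(M_i = m \mid X_i, C_i, S_i = 1) \cdot P(S_i = 1 \mid X_i, C_{1i}, C_{2i})} \tag{3.13}$$

$$\hat{E}[Y^{xM^{x^{*}}}] = \frac{1}{n} \sum_{i} \frac{Y_i \cdot I\{X_i = x\} \cdot S_i}{P(X_i = x \mid M_i, C_i)} \cdot \frac{P(X_i = 1 - x \mid M_i, C_i)}{P(X_i = 1 - x \mid C_i)} \cdot \frac{P(M_i = m \mid X_i, C_i)}{P(M_i = m \mid X_i, C_i, S_i = 1) \cdot P(S_i = 1 \mid X_i, C_{1i}, C_{2i})} \tag{3.14}$$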

Here $P(X_i = x \mid M_i, C_i) = \dfrac{P(M_i = m \mid X_i, C_i) \cdot P(X_i = x \mid C_i)}{P(M_i = m \mid C_i)}$ by Bayes' rule. $P(M_i = m \mid X_i, C_i)$ and

$P(M_i = m \mid C_i)$ can in turn be estimated using the stratified approach described in section 3.3.1.

This modified method and estimator will be called the modified HS method and estimator

henceforth. The modified HS estimators of the CDE and NEs follow naturally. Instead of

directly plugging in equations 3.12-3.14, Huber and Solovyeva proposed using normalized versions of the sample

analogs (Huber & Solovyeva 2020).
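
As a rough illustration of what normalization means for a weighting estimator, the sketch below contrasts a plain sample analog (divide the weighted sum by n) with a normalized version (divide by the sum of the weights, so that the weights sum to one). This is a generic sketch under that assumption; it is not taken from Huber and Solovyeva's implementation, and the function name `ipw_mean` and the toy inputs are hypothetical.

```python
import numpy as np

def ipw_mean(y, indicator, weight, normalize=True):
    """Weighted mean of y over units flagged by `indicator`.

    `weight` is assumed to be the inverse of the product of estimated
    probabilities in the denominator of the estimator, computed beforehand.
    """
    w = indicator * weight
    if normalize:
        # Normalized version: weights are rescaled to sum to one.
        return np.sum(w * y) / np.sum(w)
    # Plain sample analog: divide by the number of units.
    return np.sum(w * y) / len(y)

# Toy example: a constant outcome should have mean 1 under any sensible weighting.
y = np.ones(4)
indicator = np.array([1.0, 1.0, 0.0, 0.0])
weight = np.array([2.0, 3.0, 1.0, 1.0])
print(ipw_mean(y, indicator, weight, normalize=True))
print(ipw_mean(y, indicator, weight, normalize=False))
```

One reason such normalization is commonly preferred is that the normalized estimate always stays within the range of the observed outcomes, whereas the plain sample analog need not when the estimated weights do not average to one.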

The modiﬁed HS method has a relatively complicated estimation process with many

quantities to estimate, plus there is no easy way to obtain the standard error (SE) or conﬁ-

dence interval (CI) for both the CDE and NEs except using the bootstrap resampling method

(Huber & Solovyeva 2020). To overcome these disadvantages, we extended Hong’s work on

ratio-of-mediator-probability weighting (RMPW) (Hong 2010), VanderWeele’s MSM (Van-

derWeele 2009), and Lange’s uniﬁed approach for estimating NEs (Lange et al. 2012), to

propose a new weighting method (weighting estimator II) to estimate both CDE and NEs

for a tertiary outcome in the case-control sample drawn from a source population, e.g., a

cohort. We will compare the performance of the modified HS method and our proposed

method in a simulation study, followed by an application using real-world data.

3.4 Proposed Weighting Method

3.4.1 Nonparametric Identiﬁcation

As in Chapter 2, to identify the causal mediating effects for the population, we need

to connect $E[Y^{xm}]$ with $E[Y \mid X = x, M = m, S = 1]$, and $E[Y^{xM^{x}}]$ and $E[Y^{xM^{x^{*}}}]$ with

$E[Y \mid X = x, S = 1]$. Before showing how to nonparametrically identify the population

mediating effects using the case-control sample, we lay out the assumptions required for

the identification, based on the causal structure shown in Figure 3.1.

Assumption 1. $Y^{xm} \perp\!\!\!\perp (X, M) \mid C$ where C = (C1, C2, C3), i.e., no unobserved

confounders between the treatment and the outcome, and no unobserved confounders between

the mediator and the outcome.

Assumption 2. Conditional independence based on the study design and background

knowledge.

$$Y \perp\!\!\!\perp S \mid (X, Z, M, C) \tag{3.15}$$

$$S \perp\!\!\!\perp (X, C) \mid Z \tag{3.16}$$

$$X \perp\!\!\!\perp (C_2, C_3) \mid C_1 \tag{3.17}$$

$$Y \perp\!\!\!\perp S \mid (X, M, C) \tag{3.18}$$

$$S \perp\!\!\!\perp C_3 \mid (X, C_1, C_2) \tag{3.19}$$

$$0 < P(M = m \mid X, C),\ 0 < P(M = m \mid X, C, S = 1),\ 0 < P(S = 1 \mid Z),\ 0 < P(S = 1 \mid X, C_1, C_2) \text{ and } 0 < P(X = x \mid C_1) \tag{3.20}$$

The conditional independence assumptions 3.15-3.19 are primarily design-based. For

example, in a randomized controlled trial, instead of Y ⊥⊥ S | (X, Z, M, C), we have Y ⊥⊥ S.

For more details, please refer to the proof of Theorem 2 in Appendix A. Assumption 1 and

3.15-3.17 are required for identiﬁcation of CDE. To identify NDE and NIE, 3.18, 3.19 and

three additional assumptions are further required, as shown below:

Assumption 3. $(Y^{xM^{x}}, Y^{xM^{x^{*}}}) \perp\!\!\!\perp X \mid C$.

Assumption 4. $Y^{xm} \perp\!\!\!\perp (M^{x}, M^{x^{*}}) \mid (X, C)$.

Assumption 5. $M^{x^{*}} \perp\!\!\!\perp X \mid (Z, C)$.

Next, we introduce result 3.4.1, which shows that the mean of the population (nested)

counterfactual outcomes can be written as a weighted conditional mean of the observed outcome Y.

Identification of population (nested) counterfactual outcomes using the case-control sample:

$$E[Y^{xm}] = E[w_4 Y \mid X = x, M = m, S = 1], \quad \text{where } w_4 = \frac{P(M=m \mid x, S=1) \cdot P(S=1 \mid x) \cdot P(X=x)}{P(M=m \mid x, c_1, c_2, c_3)\, P(S=1 \mid z)\, P(X=x \mid c_1)}.$$

$$E[Y^{xM^{x}}] = E[w_5 Y \mid X = x, S = 1], \quad \text{where } w_5 = \frac{P(M=m \mid c_1, c_2, c_3, x)}{P(M=m \mid x, c_1, c_2, c_3, S=1)} \cdot \frac{P(S=1 \mid x)\, P(X=x)}{P(S=1 \mid x, c_1, c_2)\, P(X=x \mid c_1)}.$$

$$E[Y^{xM^{x^{*}}}] = E[w_6 Y \mid X = x, S = 1], \quad \text{where } w_6 = \frac{P(M=m \mid c_1, c_2, c_3, x^{*})}{P(M=m \mid x, c_1, c_2, c_3, S=1)} \cdot \frac{P(S=1 \mid x)\, P(X=x)}{P(S=1 \mid x, c_1, c_2)\, P(X=x \mid c_1)}.$$
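
Once the component probabilities have been estimated, the weights w5 and w6 are simple products and ratios. The following sketch assembles them per unit from arrays of fitted probabilities. The function name `rmpw_weights`, the argument names and the toy numbers are hypothetical; how each probability is estimated (e.g., by logistic regression or by the stratified approach of section 3.3.1) is left outside the sketch.

```python
import numpy as np

def rmpw_weights(p_m_x, p_m_xstar, p_m_x_s1, p_s_x, p_x, p_s_xc, p_x_c1):
    """Per-unit weights w5 and w6 from already-estimated probabilities.

    p_m_x     : P(M=m_i | c_i, x_i)          (population mediator model at x)
    p_m_xstar : P(M=m_i | c_i, x*)           (population mediator model at x*)
    p_m_x_s1  : P(M=m_i | x_i, c_i, S=1)     (mediator model within the sample)
    p_s_x     : P(S=1 | x_i)
    p_x       : P(X=x_i)
    p_s_xc    : P(S=1 | x_i, c1_i, c2_i)
    p_x_c1    : P(X=x_i | c1_i)
    """
    # Factor shared by w5 and w6: corrects for sampling and exposure assignment.
    design = (p_s_x * p_x) / (p_s_xc * p_x_c1)
    w5 = p_m_x / p_m_x_s1 * design
    w6 = p_m_xstar / p_m_x_s1 * design
    return w5, w6

# Toy inputs for a single unit (illustrative numbers only).
w5, w6 = rmpw_weights(
    p_m_x=np.array([0.2]), p_m_xstar=np.array([0.1]), p_m_x_s1=np.array([0.4]),
    p_s_x=np.array([0.3]), p_x=np.array([0.5]),
    p_s_xc=np.array([0.6]), p_x_c1=np.array([0.25]),
)
print(w5, w6)
```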

3.4.2 Estimation of the Causal Estimand

We adopt the same MSM-based approach as described in Chapter 2 (Robins et al.

2000, VanderWeele 2009, Lange et al. 2012). The following MSMs will be used for estimating

the CDE (3.21) and the NEs (3.22):

$$g(E[Y^{xm}]) = c_0 + c_1 x + c_2 m + c_3 x \cdot m \tag{3.21}$$

$$g(E[Y^{xM^{x^{*}}}]) = c_0 + c_1 x + c_2 x^{*} + c_3 x \cdot x^{*} \tag{3.22}$$

where $g(\cdot)$ is the link function.

Assuming the functional form is correctly specified in 3.21 and 3.22, the causal

mediating effects in terms of risk difference, relative risk and odds ratio can be expressed as

functions of the parameters estimated from the above MSMs (Robins et al. 2000).

The CDE can be estimated by replacing z with m in 2.17-2.19. The NEs can be estimated

using the similar model described in Chapter 2 (2.20-2.22). The weight $w_4$ is used for estimating

the CDE, while $w_5$ and $w_6$ are used for estimating the NEs. The interaction term $c_3 x \cdot m$ or $c_3 x \cdot x^{*}$

is added to the above models to reflect the potential exposure-mediator interaction effect in

relation to the final outcome. If exposure-covariate interactions are of interest, we

can further include $x \cdot C$ and $x^{*} \cdot C$ to capture the (in)direct effects modified by covariates

(Lange et al. 2012).
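
To illustrate how natural effects follow from the coefficients of MSM 3.22, the sketch below evaluates the fitted mean at (x, x*) and forms the three effect scales directly from the definitions in section 3.2.3. It is derived only from 3.22 and those definitions; equations 2.20-2.22 are not reproduced here, so agreement with them is presumed rather than checked. The function names and the coefficient values are hypothetical.

```python
import numpy as np

def inv_link(eta, link):
    """Inverse of the link function g for the supported links."""
    if link == "identity":
        return eta
    if link == "log":
        return np.exp(eta)
    if link == "logit":
        return 1.0 / (1.0 + np.exp(-eta))
    raise ValueError(f"unsupported link: {link}")

def natural_effects(c, link="logit"):
    """NDE and NIE on RD, RR and OR scales from MSM 3.22 coefficients c0..c3."""
    c0, c1, c2, c3 = c

    def mu(x, x_star):
        # Fitted E[Y^{x M^{x*}}] under the MSM.
        return inv_link(c0 + c1 * x + c2 * x_star + c3 * x * x_star, link)

    def odds(p):
        return p / (1.0 - p)

    return {
        "NDE_RD": mu(1, 0) - mu(0, 0),
        "NIE_RD": mu(1, 1) - mu(1, 0),
        "NDE_RR": mu(1, 0) / mu(0, 0),
        "NIE_RR": mu(1, 1) / mu(1, 0),
        "NDE_OR": odds(mu(1, 0)) / odds(mu(0, 0)),
        "NIE_OR": odds(mu(1, 1)) / odds(mu(1, 0)),
    }

# Hypothetical coefficients on the logit scale.
effects = natural_effects((-2.0, 0.10, 0.20, 0.05), link="logit")
for name, value in effects.items():
    print(name, round(float(value), 4))
```

With the logit link the odds-ratio effects reduce to exp(c1) for the NDE and exp(c2 + c3) for the NIE, and with the identity link the risk-difference effects reduce to c1 and c2 + c3.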

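Similar to Lange et al. (2012), when assuming an MSM for the nested counterfactual

outcomes, our method can be equated with the generalized estimating equations (GEE)

method. For example, the estimating equation corresponding to the MSM in 3.22 (written here

for the identity link) is given by:

$$U(c, \alpha) = \sum_{i=1}^{n} \sum_{x^{*}=0}^{1} d(X_i, x^{*})\,(Y_i - c_0 - c_1 X_i - c_2 x^{*} - c_3 X_i x^{*}) \cdot \frac{P(M=m \mid c_1, c_2, c_3, x^{*})}{P(M=m \mid x, c_1, c_2, c_3, S=1)} \cdot \frac{P(S=1 \mid x)\, P(X=x)}{P(S=1 \mid x, c_1, c_2)\, P(X=x \mid c_1)}$$

with $d(X_i, x^{*}) = (1, X_i, x^{*}, X_i x^{*})^{T}$. Inside the weight, $c_1, c_2, c_3$ denote the observed values

of the covariates C1, C2, C3, not the MSM coefficients.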
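
A minimal sketch of how this estimating equation can be solved in the identity-link case: each sampled unit is duplicated for x* = 0 and x* = 1, each copy receives its weight, and the stacked data are fit by weighted least squares. The weights are assumed to have been estimated beforehand; the function name `fit_ne_msm` and the simulated inputs (including the placeholder weights) are hypothetical.

```python
import numpy as np

def fit_ne_msm(x, y, w0, w1):
    """Solve U(c) = 0 for the identity-link MSM 3.22 by weighted least squares.

    x, y   : exposure and outcome of the sampled units (S = 1)
    w0, w1 : each unit's weight evaluated at x* = 0 and at x* = 1
    Returns the coefficient vector (c0, c1, c2, c3) and the value of U at it.
    """
    n = len(y)
    x_star = np.concatenate([np.zeros(n), np.ones(n)])   # x* for the two copies
    xx = np.concatenate([x, x])
    yy = np.concatenate([y, y])
    ww = np.concatenate([w0, w1])
    # Design vector d(X_i, x*) = (1, X_i, x*, X_i * x*) for every copy.
    d = np.column_stack([np.ones(2 * n), xx, x_star, xx * x_star])
    a = d.T @ (ww[:, None] * d)
    b = d.T @ (ww * yy)
    c_hat = np.linalg.solve(a, b)
    score = d.T @ (ww * (yy - d @ c_hat))                 # U evaluated at c_hat
    return c_hat, score

# Simulated placeholder data (not the dissertation's data or weights).
rng = np.random.default_rng(2)
n = 500
x = rng.binomial(1, 0.5, n).astype(float)
y = rng.binomial(1, 0.15 + 0.05 * x).astype(float)
w0 = rng.uniform(0.5, 2.0, n)
w1 = rng.uniform(0.5, 2.0, n)
c_hat, score = fit_ne_msm(x, y, w0, w1)
print(c_hat, np.abs(score).max())
```

For non-identity links the same duplicated-data layout applies, but the equation would have to be solved iteratively, for example with a weighted generalized linear model routine.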

As Lange et al. argued, let model M1 be the parametric model for P (M |X, C)

(with the unknown parameter denoted α) and M2 be the model deﬁned by the restrictions

of model M1 and the additional assumption that the generalized linear MSM 3.22 holds.

According to Lange et al., the condition under which a conservative SE will be obtained is

“that α is substituted by the maximum likelihood estimator under model M1 and that this

is also an efficient estimator of α under model M2”. Although Lange et al. believe that

the latter condition will almost always be (nearly) satisfied in practice, they recommend

obtaining an alternative SE (e.g., based on the bootstrap resampling method) if researchers are

concerned about the validity of the additional condition, i.e., if the generalized linear MSM

is so overly restrictive that the restrictions it imposes on the outcome distribution carry

information about the actual value of α.

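It follows from M-estimation theory that

$$\sqrt{n}\,(\hat{c}_n - c_t) \xrightarrow{D} N\left(0,\ \left[E\left\{\frac{\partial U(c_t, \alpha_t)}{\partial c^{T}}\right\}\right]^{-1} \mathrm{var}\big(U(c_t, \alpha_t)\big) \left[E\left\{\frac{\partial U(c_t, \alpha_t)}{\partial c^{T}}\right\}\right]^{-1T}\right)$$

where $c_t$ and $\alpha_t$ stand for the true values of the vectors $(c_0, c_1, c_2, c_3)^{T}$ and $(\alpha_0, \alpha_1, \alpha_2)^{T}$, and

$(\alpha_0, \alpha_1, \alpha_2)^{T}$ comes from the generalized linear model of M on (X, C) with link function $g(\cdot)$, i.e.,

$$g(E[M \mid X, C]) = \alpha_0 + \alpha_1 X + \alpha_2 C.$$

Since $c_t$ and $\alpha_t$ are unknown, we substitute the GEE estimator $\hat{c}_n$ for $c_t$ and the maximum likelihood

estimator $\hat{\alpha}_n$ for $\alpha_t$ to obtain the sandwich estimator of the asymptotic variance

specified above, given by

$$\left[\hat{E}\left\{\frac{\partial U(\hat{c}_n, \hat{\alpha}_n)}{\partial c^{T}}\right\}\right]^{-1} \widehat{\mathrm{var}}\big(U(\hat{c}_n, \hat{\alpha}_n)\big) \left[\hat{E}\left\{\frac{\partial U(\hat{c}_n, \hat{\alpha}_n)}{\partial c^{T}}\right\}\right]^{-1T}.$$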
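
The sandwich form above can be sketched for the identity-link MSM 3.22 as follows. The sketch treats the estimated weights as fixed, i.e., it ignores the uncertainty in the nuisance parameter estimate, and sums the two duplicated copies of each individual before forming the middle term so that the within-individual dependence is respected. The function name `fit_with_sandwich` and the toy data are hypothetical, and this is an illustrative sketch rather than the implementation used for the results in this chapter.

```python
import numpy as np

def fit_with_sandwich(x, y, w0, w1):
    """Identity-link MSM 3.22 fit with sandwich SEs, clustering on the individual.

    w0, w1 : weights for the x* = 0 and x* = 1 copies (assumed known/fixed).
    """
    n = len(y)
    copies = []
    for x_star, w in ((0.0, w0), (1.0, w1)):
        d = np.column_stack([np.ones(n), x, np.full(n, x_star), x * x_star])
        copies.append((d, w))
    # "Bread": minus the derivative of U with respect to c, summed over units.
    bread = sum(d.T @ (w[:, None] * d) for d, w in copies)
    rhs = sum(d.T @ (w * y) for d, w in copies)
    c_hat = np.linalg.solve(bread, rhs)
    # Per-individual estimating function: sum of the two copies' contributions.
    u = sum(d * (w * (y - d @ c_hat))[:, None] for d, w in copies)
    meat = u.T @ u
    bread_inv = np.linalg.inv(bread)
    cov = bread_inv @ meat @ bread_inv.T
    # Guard against tiny negative diagonal entries caused by round-off.
    se = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return c_hat, se

# Toy data (illustrative only).
rng = np.random.default_rng(3)
n = 400
x = rng.binomial(1, 0.5, n).astype(float)
y = rng.binomial(1, 0.2 + 0.1 * x).astype(float)
w0 = rng.uniform(0.5, 1.5, n)
w1 = rng.uniform(0.5, 1.5, n)
c_hat, se = fit_with_sandwich(x, y, w0, w1)
print(c_hat, se)
```

Comparing these SEs with bootstrap SEs, as done in the simulations, is one way to check whether treating the weights as fixed is adequate in a given application.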

So far, we have shown the validity of using MSMs to estimate both the CDE and NEs,

including the sandwich estimators for SE. The estimates for mediating eﬀects are similar to

those described in Chapter 2.

3.5 Simulation

3.5.1 Data Generating Process (DGP)

Some of the parameters used in the DGP are the same as those described in Chapter 2, section

2.5.1. The DGP for the additional variables is described below:

1) Generate the study confounders:

In addition to C1 and C2, C3 is a binary (0/1) confounder assessed at or after baseline,

with $C_3 \sim \mathrm{Bernoulli}(P_{C_3})$ and $P_{C_3} \approx 7.27\%$. An example of C3 could be a comorbidity

at the time of assessment of the secondary outcome (the mediator).

2) Generate the mediator (secondary outcome):

M is the mediator (secondary outcome), assessed at T1, with $M \sim \mathrm{Bernoulli}(P_M)$,

where

$$P_M = \{1 + \exp[-k_M]\}^{-1}$$

$$k_M = \log(1.079)\,C_1 + \log(1.468)\,C_2 + \log(1.439)\,C_3 + \log(2.4)\,X + \log(4)\,Z + \log(0.001)$$

and $P(M = 1) \approx 14.90\%$.

3) Generate the observed (tertiary) outcome:

Y is the binary (0/1) outcome of interest assessed at T2, a follow-up after T1, with

$Y \sim \mathrm{Bernoulli}(P_Y)$, where

$$P_Y = \{1 + \exp[-k_Y]\}^{-1}$$

$$k_Y = \log(1.012)\,C_1 + \log(1.187)\,C_2 + \log(3.271)\,C_3 + \log(1.1)\,X + \log(2.3)\,M + \log(1.5)\,C_3 X + \log(2)\,XM + \log(0.0503)$$

and $P(Y = 1) \approx 13.51\%$.

4) Generate the potential outcomes:

For the mediator M, $M^{x, Z^{x^{*}}} \sim \mathrm{Bernoulli}(P_{M^{x, Z^{x^{*}}}})$, where $P_{M^{x, Z^{x^{*}}}}$ is obtained from

$P_M$ by replacing X and Z with x and the counterfactual outcome $Z^{x^{*}}$. For the outcome Y,

$Y^{xM^{x^{*}}} \sim \mathrm{Bernoulli}(P_{Y^{xM^{x^{*}}}})$, where $P_{Y^{xM^{x^{*}}}}$ is obtained from $P_Y$ by replacing X and M with x

and the counterfactual outcome $M^{x^{*}}$.
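
The mediator and outcome models in steps 2) and 3) can be sketched in Python as follows. The coefficients are those stated above; the distributions used here for C1, C2, X and Z are placeholders only, because their actual definitions are given in section 2.5.1 and are not reproduced in this chapter. The function names are hypothetical.

```python
import numpy as np

def expit(k):
    """Inverse logit: {1 + exp(-k)}^{-1}."""
    return 1.0 / (1.0 + np.exp(-k))

def p_mediator(c1, c2, c3, x, z):
    """P_M from step 2, with each log odds ratio multiplying its variable."""
    k_m = (np.log(1.079) * c1 + np.log(1.468) * c2 + np.log(1.439) * c3
           + np.log(2.4) * x + np.log(4.0) * z + np.log(0.001))
    return expit(k_m)

def p_outcome(c1, c2, c3, x, m):
    """P_Y from step 3, including the C3*X and X*M interaction terms."""
    k_y = (np.log(1.012) * c1 + np.log(1.187) * c2 + np.log(3.271) * c3
           + np.log(1.1) * x + np.log(2.3) * m
           + np.log(1.5) * c3 * x + np.log(2.0) * x * m + np.log(0.0503))
    return expit(k_y)

rng = np.random.default_rng(4)
n = 10_000
# Placeholder inputs: NOT the Chapter 2 distributions, for illustration only.
c1 = rng.normal(55.0, 9.0, n)
c2 = rng.binomial(1, 0.3, n)
x = rng.binomial(1, 0.4, n)
z = rng.binomial(1, 0.05, n)

c3 = rng.binomial(1, 0.0727, n)                      # step 1: P_C3 is about 7.27%
m = rng.binomial(1, p_mediator(c1, c2, c3, x, z))    # step 2: observed mediator
y = rng.binomial(1, p_outcome(c1, c2, c3, x, m))     # step 3: observed outcome

# Step 4 (outcome part): set X to x = 1 and plug in a counterfactual mediator.
# Here M under x* = 0 is drawn with Z held at its observed value as a stand-in
# for Z^{x*}, whose model comes from Chapter 2.
m_x0 = rng.binomial(1, p_mediator(c1, c2, c3, 0, z))
y_1_m0 = rng.binomial(1, p_outcome(c1, c2, c3, 1, m_x0))
print(m.mean(), y.mean(), y_1_m0.mean())
```

Because the inputs above are placeholders, the printed prevalences are not expected to match the reported values of 14.90% and 13.51%.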


3.5.2 Monte Carlo, Bootstrapping and Performance Criteria

The Monte Carlo (MC) simulation, bootstrapping, and performance evaluation criteria

are the same as described in Chapter 2.

3.6 Simulation Results

3.6.1 Results Comparison Between Two Methods

Using 1,000 MC simulations with 200 bootstrap samples per run, under

different presumed cohort sizes (18,000, 36,491, or 72,000), we obtained the following results:

Table 3.2: Controlled Direct Eﬀect Fixing M = 0

Method

Cohort size

True

Point Estimate

SEbootstrap

SE ratio a

RMSE  Coverage (%)

Huber and Solovyeva

Proposed

18000
36491
72000
18000
36491
72000

1.574x10^-2

Risk Difference
1.472x10^-2
1.568x10^-2
1.585x10^-2
1.566x10^-2
1.590x10^-2
1.677x10^-2

1.146

Proposed

Huber and Solovyeva

18000
36491
72000
18000
36491
72000

Relative Risk
1.146
1.150
1.150
1.167
1.157
1.162
Odds Ratio
1.170
18000
1.173
36491
1.172
72000
1.205
18000
1.186
36491
1.188
72000
a This is the ratio between empirical and bootstrap standard error.

Huber and Solovyeva

Proposed

1.167

0.0212
0.0149
0.0106
0.0363
0.0255
0.0181

0.2100
0.1455
0.1024
0.3734
0.2509
0.1764

0.2459
0.1695
0.1189
0.4541
0.2978
0.2069

0.9780
1.0016
1.0085
1.0025
1.0075
0.9958

0.9549
0.9949
1.0030
0.9648
0.9833
0.9864

0.9462
0.9904
1.0003
0.9321
0.9684
0.9794

0.0207
0.0149
0.0107
0.0363
0.0257
0.0181

0.2004
0.1448
0.1027
0.3607
(cid:19)(cid:17)(cid:21)(cid:23)(cid:25)(cid:27)(cid:3)
(cid:19)(cid:17)(cid:20)(cid:26)(cid:23)(cid:26)(cid:3)

(cid:19)(cid:17)(cid:21)(cid:22)(cid:21)(cid:25)(cid:3)
(cid:19)(cid:17)(cid:20)(cid:25)(cid:27)(cid:19)(cid:3)
(cid:19)(cid:17)(cid:20)(cid:20)(cid:28)(cid:19)(cid:3)
(cid:19)(cid:17)(cid:23)(cid:21)(cid:23)(cid:26)(cid:3)
(cid:19)(cid:17)(cid:21)(cid:27)(cid:27)(cid:27)(cid:3)
(cid:19)(cid:17)(cid:21)(cid:19)(cid:22)(cid:26)(cid:3)

(cid:28)(cid:23)(cid:17)(cid:27)(cid:3)
(cid:28)(cid:24)(cid:17)(cid:19)(cid:3)
(cid:28)(cid:23)(cid:17)(cid:21)(cid:3)
(cid:28)(cid:22)(cid:17)(cid:19)(cid:3)
(cid:28)(cid:22)(cid:17)(cid:23)(cid:3)
(cid:28)(cid:24)(cid:17)(cid:22)(cid:3)

(cid:28)(cid:23)(cid:17)(cid:25)(cid:3)
(cid:28)(cid:24)(cid:17)(cid:20)(cid:3)
(cid:28)(cid:23)(cid:17)(cid:20)(cid:3)
(cid:28)(cid:22)(cid:17)(cid:22)(cid:3)
(cid:28)(cid:22)(cid:17)(cid:26)(cid:3)
(cid:28)(cid:23)(cid:17)(cid:28)(cid:3)

(cid:28)(cid:23)(cid:17)(cid:25)(cid:3)
(cid:28)(cid:24)(cid:17)(cid:19)(cid:3)
(cid:28)(cid:23)(cid:17)(cid:21)(cid:3)
(cid:28)(cid:22)(cid:17)(cid:23)(cid:3)
(cid:28)(cid:22)(cid:17)(cid:26)(cid:3)
(cid:28)(cid:24)(cid:17)(cid:19)(cid:3)


Table 3.3: Controlled Direct Eﬀect Fixing M = 1

Method               Cohort size  Point estimate  SE bootstrap  SE ratio a  RMSE    Coverage (%)

Risk Difference (True = 1.624x10^-1)
Huber and Solovyeva  18000        1.628x10^-1     0.0394        1.0118      0.0399  94.7
Huber and Solovyeva  36491        1.627x10^-1     0.0275        0.9988      0.0275  94.6
Huber and Solovyeva  72000        1.626x10^-1     0.0196        1.0296      0.0202  94.1
Proposed             18000        1.599x10^-1     0.0906        1.0505      0.0951  93.0
Proposed             36491        1.650x10^-1     0.0642        1.0179      0.0654  94.2
Proposed             72000        1.626x10^-1     0.0461        0.9794      0.0451  95.1

Relative Risk (True = 1.758)
Huber and Solovyeva  18000        1.783           0.2716        0.9641      0.2629  94.8
Huber and Solovyeva  36491        1.773           0.1823        0.9871      0.1805  94.2
Huber and Solovyeva  72000        1.767           0.1281        1.0081      0.1294  94.1
Proposed             18000        1.870           0.7979        0.8534      0.6899  93.5
Proposed             36491        1.836           0.4669        0.9475      0.4491  94.7
Proposed             72000        1.794           0.3052        0.9565      0.2941  95.2

Odds Ratio (True = 2.215)
Huber and Solovyeva  18000        2.273           0.4867        0.9512      0.4663  94.7
Huber and Solovyeva  36491        2.246           0.3217        0.9819      0.3172  94.0
Huber and Solovyeva  72000        2.233           0.2248        1.0084      0.2273  93.9
Proposed             18000        2.500           1.6764        0.7804      1.3383  94.0
Proposed             36491        2.390           0.9009        0.9033      0.8320  94.4
Proposed             72000        2.292           0.5636        0.9263      0.5274  95.3

a This is the ratio between empirical and bootstrap standard error.

From Tables 3.2 and 3.3, we can see that the modified HS method generally has smaller
bias, bootstrap SE and RMSE for the CDE. However, as the cohort/sample size increases,
the coverage probability of the proposed method moves closer to 95%, whereas that of the
modified HS method does not.
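
The performance measures reported in these tables can be computed from the Monte Carlo replicates along the following lines. This is a minimal sketch rather than the code used for the simulations: the function name summarize_simulation, the use of the mean bootstrap SE across replicates, and the Wald-type interval (estimate ± 1.96 × bootstrap SE) used for coverage are assumptions made for illustration. The SE ratio follows the table footnote (empirical SE divided by bootstrap SE).

```python
import math
import random
from statistics import fmean, stdev


def summarize_simulation(estimates, boot_ses, true_value, z=1.96):
    """Summarize Monte Carlo replicates of one estimator.

    estimates  : point estimate from each replicate
    boot_ses   : bootstrap SE from each replicate
    true_value : true effect used to generate the data
    z          : normal quantile for the (assumed) Wald interval
    """
    mean_estimate = fmean(estimates)
    empirical_se = stdev(estimates)      # SD of estimates across replicates
    mean_boot_se = fmean(boot_ses)       # average bootstrap SE
    # Root mean squared error around the true effect.
    rmse = math.sqrt(fmean((e - true_value) ** 2 for e in estimates))
    # Share of replicates whose interval contains the true effect.
    covered = [
        (e - z * s) <= true_value <= (e + z * s)
        for e, s in zip(estimates, boot_ses)
    ]
    return {
        "point_estimate": mean_estimate,
        "bias": mean_estimate - true_value,
        "se_bootstrap": mean_boot_se,
        "se_ratio": empirical_se / mean_boot_se,
        "rmse": rmse,
        "coverage_pct": 100.0 * sum(covered) / len(covered),
    }


# Hypothetical replicates: normal estimates with a correctly sized SE.
random.seed(0)
true_value = 0.1624
estimates = [random.gauss(true_value, 0.04) for _ in range(1000)]
boot_ses = [0.04] * 1000
summary = summarize_simulation(estimates, boot_ses, true_value)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions, an SE ratio near 1 indicates that the bootstrap SE tracks the empirical variability of the estimator, in which case the coverage of the interval should be close to its nominal level.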

Table 3.4: Natural Direct Eﬀects

Method               Cohort size  Point estimate  SE bootstrap  SE ratio a  RMSE    Coverage (%)

Risk Difference (True = 3.403x10^-2)
Huber and Solovyeva  18000        3.256x10^-2     0.0189        0.9824      0.0186  94.2
Huber and Solovyeva  36491        3.325x10^-2     0.0133        0.9979      0.0133  94.5
Huber and Solovyeva  72000        3.339x10^-2     0.0094        1.0037      0.0095  94.3
Proposed             18000        3.425x10^-2     0.0190        1.0192      0.0193  93.3
Proposed             36491        3.472x10^-2     0.0133        1.0035      0.0134  93.4
Proposed             72000        3.512x10^-2     0.0094        0.9822      0.0093  95.1

Relative Risk (True = 1.281)
Huber and Solovyeva  18000        1.273           0.1691        0.9672      0.1637  94.6
Huber and Solovyeva  36491        1.276           0.1179        0.9929      0.1171  94.4
Huber and Solovyeva  72000        1.275           0.0833        0.9986      0.0833  93.8
Proposed             18000        1.290           0.1722        1.0086      0.1739  93.4
Proposed             36491        1.291           0.1197        1.0009      0.1202  93.3
Proposed             72000        1.292           0.0843        0.9808      0.0835  94.8

Odds Ratio (True = 1.332)
Huber and Solovyeva  18000        1.326           0.2077        0.9600      0.1994  94.5
Huber and Solovyeva  36491        1.328           0.1441        0.9897      0.1426  94.5
Huber and Solovyeva  72000        1.327           0.1016        0.9967      0.1013  94.0
Proposed             18000        1.347           0.2119        0.9994      0.2122  93.2
Proposed             36491        1.346           0.1466        0.9965      0.1467  92.9
Proposed             72000        1.347           0.1029        0.9791      0.1018  94.8

a This is the ratio between empirical and bootstrap standard error.

For the NDE, Table 3.4 shows that the modified HS method and the proposed method
perform similarly overall, with several notable observations. First, the modified HS
method has slightly smaller bootstrap SE and RMSE overall (except on the risk difference
scale at N = 72,000). Second, the modified HS method tends to underestimate the true
effect, while the proposed method tends to overestimate it. Last, as the cohort/sample size
increases, the coverage probability of the proposed method moves closer to 95%, whereas
that of the modified HS method does not.

Table 3.5: Natural Indirect Eﬀects

Method               Cohort size  Point estimate  SE bootstrap  SE ratio a  RMSE    Coverage (%)

Risk Difference (True = 3.638x10^-2)
Huber and Solovyeva  18000        3.711x10^-2     0.0055        1.2419      0.0068  88.4
Huber and Solovyeva  36491        3.709x10^-2     0.0039        1.2719      0.0049  88.5
Huber and Solovyeva  72000        3.694x10^-2     0.0027        1.2485      0.0035  89.0
Proposed             18000        3.689x10^-2     0.0055        1.2413      0.0068  87.7
Proposed             36491        3.699x10^-2     0.0038        1.3006      0.0050  86.1
Proposed             72000        3.679x10^-2     0.0027        1.2971      0.0035  86.1

Relative Risk (True = 1.234)
Huber and Solovyeva  18000        1.244           0.0521        1.0928      0.0577  91.3
Huber and Solovyeva  36491        1.241           0.0357        1.1574      0.0418  90.5
Huber and Solovyeva  72000        1.239           0.0250        1.1329      0.0286  90.2
Proposed             18000        1.242           0.0519        1.1279      0.0590  91.0
Proposed             36491        1.239           0.0354        1.1555      0.0412  90.5
Proposed             72000        1.237           0.0247        1.1686      0.0289  90.6

Odds Ratio (True = 1.290)
Huber and Solovyeva  18000        1.301           0.0616        1.1194      0.0698  90.5
Huber and Solovyeva  36491        1.298           0.0423        1.1818      0.0506  90.3
Huber and Solovyeva  72000        1.295           0.0296        1.1564      0.0347  89.6
Proposed             18000        1.299           0.0613        1.1486      0.0710  90.3
Proposed             36491        1.297           0.0419        1.1843      0.0501  89.7
Proposed             72000        1.293           0.0293        1.1954      0.0351  89.8

a This is the ratio between empirical and bootstrap standard error.

For the NIE, Table 3.5 shows that the two methods give similar results overall, with a few
interesting observations. First, the proposed method consistently has smaller bias than the
modified HS method. Second, the proposed method has slightly smaller bootstrap SE than
the modified HS method. Third, the proposed method has slightly larger RMSE than the
modified HS method in most settings. Finally, both methods have poor but comparable
coverage (roughly 86% to 91%), which falls below the nominal 95%.
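
One way to read the low NIE coverage alongside the SE ratios in Table 3.5 is sketched below. It is a back-of-the-envelope illustration under assumptions that are not stated in the text: the estimator is taken to be unbiased and normally distributed, and the interval is taken to be a Wald interval built from the bootstrap SE. In that case, if the empirical SE is r times the bootstrap SE, the actual coverage of a nominal 95% interval is 2Φ(1.96/r) − 1. The function name is hypothetical.

```python
from statistics import NormalDist


def wald_coverage_given_se_ratio(se_ratio, nominal=0.95):
    """Approximate coverage of a nominal Wald interval when the true
    (empirical) SE equals se_ratio times the SE used to build it.

    Assumes an unbiased, normally distributed estimator.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + nominal / 2.0)  # about 1.96 for 95%
    return 2.0 * std_normal.cdf(z / se_ratio) - 1.0


# SE ratios in the range seen for the NIE (about 1.1 to 1.3).
for ratio in (1.00, 1.10, 1.25, 1.30):
    coverage = 100.0 * wald_coverage_given_se_ratio(ratio)
    print(f"SE ratio {ratio:.2f}: approximate coverage {coverage:.1f}%")
```

Under these assumptions, an SE ratio of about 1.25 corresponds to coverage of roughly 88%, which is in line with the risk difference rows of Table 3.5. This suggests, without proving, that the undercoverage is driven mainly by the bootstrap SE understating the empirical SE rather than by bias.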

3.6.2 Uncertainty of the Estimator

As in Section 2.6.2 of Chapter 2, this section discusses the SE of the NIE for our method.

1) Distribution of the NIE estimates

Figure 3.2 shows the distribution of the NIE estimates from the proposed method
based on 1,000 MC simulations. The top two histograms are on the RD scale for N = 18,000
and N = 72,000, and the bottom two histograms are on the log-OR scale for the same two
cohort sizes.

[Figure: four histogram panels (y-axis: Percent), each overlaid with a normal density. Panels: NIE Estimates (RD, N=18,000); NIE Estimates (RD, N=72,000); NIE Estimates (log-OR, N=18,000); NIE Estimates (log-OR, N=72,000).]

Figure 3.2: Distribution of NIE Estimates using Proposed Approach

From a visual check of the figure, the proposed approach gives NIE estimates (blue curve) whose distribution is close to the normal density (orange curve). As the cohort size increases, the kernel density (blue curve) moves closer to the normal density (orange curve), which supports the asymptotic normality assumption.

Figure 3.3 shows the distribution of NIE estimates using the modified HS method, based on 1,000 MC simulations.

[Figure 3.3 panels: NIE Estimates (RD, N=18,000); NIE Estimates (RD, N=72,000); NIE Estimates (log-OR, N=18,000); NIE Estimates (log-OR, N=72,000). Each panel overlays the kernel density of the estimates on a Normal Density curve; vertical axis: Percent.]

Figure 3.3: Distribution of NIE Estimates using Modiﬁed HS Method

From a visual check of the figure, the modified HS method gives NIE estimates (blue curve) that are also close to the normal density (orange curve). Huber and Solovyeva point out that their original approach is √n-consistent and asymptotically normal under standard regularity conditions (Huber & Solovyeva 2020). When the cohort size increases from 18,000 to 72,000, the kernel density (blue curve) gets closer to the normal density (orange curve), with only slight deviation.

With this supporting evidence, we believe both our proposed method and the modified HS method give asymptotically normally distributed estimators for the NIE.
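The visual check described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code behind Figures 3.2 and 3.3: the vector of estimates is drawn from a placeholder normal distribution with a hypothetical mean and SD, the helper names `kernel_density` and `normality_check` are ours, and the bandwidth uses a normal-reference rule of thumb. It compares a Gaussian kernel density of the estimates with a normal density that has the same mean and SD.

```python
import random
from statistics import NormalDist, mean, stdev

def kernel_density(sample, x, bandwidth):
    """Gaussian kernel density estimate of `sample`, evaluated at the point x."""
    kernel = NormalDist(0.0, bandwidth)
    return sum(kernel.pdf(x - s) for s in sample) / len(sample)

def normality_check(estimates, grid_size=101):
    """Return (mean, sd, relative gap) between the KDE and the fitted normal density."""
    mu, sd = mean(estimates), stdev(estimates)
    # Normal-reference rule-of-thumb bandwidth (an assumption of this sketch)
    bandwidth = 1.06 * sd * len(estimates) ** (-1 / 5)
    fitted = NormalDist(mu, sd)
    lo, hi = mu - 3 * sd, mu + 3 * sd
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    max_gap = max(
        abs(kernel_density(estimates, x, bandwidth) - fitted.pdf(x)) for x in grid
    )
    # Express the largest gap relative to the height of the normal density at its mode
    return mu, sd, max_gap / fitted.pdf(mu)

random.seed(2023)
# Placeholder for 1,000 Monte Carlo NIE estimates; the mean and SD are hypothetical
nie_estimates = [random.gauss(0.05, 0.01) for _ in range(1000)]
mu_hat, sd_hat, rel_gap = normality_check(nie_estimates)
print(f"mean={mu_hat:.4f}, sd={sd_hat:.4f}, max relative density gap={rel_gap:.3f}")
```

A small relative gap is consistent with approximate normality; in practice the placeholder draws would be replaced by the estimates saved from each Monte Carlo replicate.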


2) Robust standard errors and bootstrap standard errors

Figure 3.4 shows the distributions of the robust and bootstrap SEs for our proposed method under the simulation settings from Sections 3.5.2 and 2.5.3. The top two panels are on the RD scale for N = 18,000 and N = 72,000, while the bottom two are on the log-OR scale for the same two cohort sizes.

[Figure 3.4 panels: Bootstrap SE and Robust SE on the RD scale for N=18,000 and N=72,000 (horizontal axis: NIE_RD_SEb); Bootstrap SE and Robust SE on the log-OR scale for N=18,000 and N=72,000 (horizontal axis: NIE_logOR_SEb); vertical axis: Percent.]

Figure 3.4: Distribution Comparison Between Robust and Bootstrap SE for Proposed Method

The distributions of the robust and bootstrap SEs for the NIE estimates are quite close to each other on both the RD and log-OR scales. Since both methods give asymptotically normally distributed estimators, we believe both the robust and the bootstrap resampling based methods can be used to obtain the SE or the CI for the NIE.
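The comparison between the two kinds of SE can be illustrated on a much simpler estimator. The sketch below is a toy example under stated assumptions and is not the proposed NIE estimator: it uses a weighted mean of a binary outcome with hypothetical sampling weights (1 for cases, 40 for controls), a sandwich (linearization) SE, and a nonparametric bootstrap that resamples units with replacement. All data and function names are hypothetical.

```python
import math
import random

def weighted_mean(y, w):
    """Weighted mean of y with weights w."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def robust_se(y, w):
    """Sandwich (linearization) SE of the weighted mean."""
    theta = weighted_mean(y, w)
    meat = sum((wi * (yi - theta)) ** 2 for wi, yi in zip(w, y))
    return math.sqrt(meat) / sum(w)

def bootstrap_se(y, w, n_boot=500, seed=1):
    """Bootstrap SE: resample units with replacement and re-estimate each time."""
    rng = random.Random(seed)
    n = len(y)
    reps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        reps.append(weighted_mean([y[i] for i in idx], [w[i] for i in idx]))
    center = sum(reps) / n_boot
    return math.sqrt(sum((r - center) ** 2 for r in reps) / (n_boot - 1))

# Hypothetical case-control style data: cases get weight 1, controls weight 40
rng = random.Random(0)
y, w = [], []
for _ in range(2000):
    is_case = rng.random() < 0.3
    w.append(1.0 if is_case else 40.0)
    y.append(1 if rng.random() < (0.3 if is_case else 0.1) else 0)

se_robust = robust_se(y, w)
se_boot = bootstrap_se(y, w)
print(f"robust SE={se_robust:.4f}, bootstrap SE={se_boot:.4f}")
```

The robust SE needs a single pass over the data, whereas the bootstrap repeats the estimation `n_boot` times, which is the source of the difference in computational cost.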


3.7 Application: the Olfaction Sub-study of the Sister Study Cohort

In this section, we implement both the modified HS method and our proposed method using the NIEHS Sister Study data. Details of the Sister Study were introduced in Chapter 2, Section 2.7. In this example, we use the same case-control study design as previously published (Cao et al. 2022), with the subsequently collected data as the tertiary outcome, to implement the two methods for estimating both the CDE and the NEs. Briefly, during 2014-2016 (the third detailed follow-up, i.e., DFU-3, considered as baseline in this example), the Sister Study collected data on whether the participants had ever had a serious head injury that resulted in unconsciousness, coma, or hospitalization. At the same visit, a stratified case-control sample was selected based on self-reported sense of smell (100% sampling for self-reported “poor” and ~3.56% simple random sampling for self-reported “normal”) (Cao et al. 2022). The self-reported sense of smell is a surrogate for olfactory function at DFU-3. Later, in 2018-2019, study investigators mailed objective olfaction test kits (12-item Brief Smell Identification Test, B-SIT) only to those selected into the case-control sample and collected the secondary outcome, B-SIT-tested olfaction. At the Sister Study fourth detailed follow-up (DFU-4, 2017-2019), a dementia screening interview was conducted using the AD-8 instrument (Galvin et al. 2006), and the sum score was calculated for each participant. Here, the self-reported history of serious head injury is the exposure of interest. We treat the B-SIT-tested olfactory status (secondary outcome) as the mediator of interest, and we consider the cognitive function (tertiary outcome) collected at DFU-4 as the outcome of interest. Previous publications have suggested links between traumatic brain injury and olfactory impairment (Howell et al. 2018, Xydakis et al. 2015), head injury and dementia (Gu et al. 2022, Fann et al. 2018), and olfaction loss and dementia (Stanciu et al. 2014, Chen et al. 2021). We defined serious head injury as an answer of “YES” to the corresponding question and defined olfactory impairment using a cutoff of ≤9 for the B-SIT test score (Cao et al. 2022). Cognitive impairment/dementia was defined as an AD-8 score ≥4 (Cai et al. 2021).
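The stratified sampling described above implies inverse-probability-of-sampling weights. The following sketch is an illustration under stated assumptions rather than the analysis code: the sampling fractions (100% for cases, 3.56% for controls) are the ones quoted in this paragraph, while the cohort size, the prevalence of Z = 1, and the function name `draw_case_control_sample` are hypothetical. It shows that the unweighted case-control sample misrepresents the cohort, while weighting by the inverse sampling probability approximately recovers the cohort prevalence.

```python
import random

def draw_case_control_sample(z, p_case=1.0, p_control=0.0356, seed=7):
    """Sample cases and controls with different probabilities; return (z, weight) pairs."""
    rng = random.Random(seed)
    sample = []
    for zi in z:
        p = p_case if zi == 1 else p_control
        if rng.random() < p:
            sample.append((zi, 1.0 / p))  # weight = inverse sampling probability
    return sample

rng = random.Random(42)
# Hypothetical cohort: 36,000 members with a 5% prevalence of Z = 1
cohort_z = [1 if rng.random() < 0.05 else 0 for _ in range(36000)]
sample = draw_case_control_sample(cohort_z)

cohort_prev = sum(cohort_z) / len(cohort_z)
naive_prev = sum(z for z, _ in sample) / len(sample)
weighted_prev = sum(z * wt for z, wt in sample) / sum(wt for _, wt in sample)

print(f"cohort prevalence:          {cohort_prev:.3f}")
print(f"unweighted sample estimate: {naive_prev:.3f}")
print(f"weighted sample estimate:   {weighted_prev:.3f}")
```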

In this illustrative example, we are interested in studying the effects of a serious head injury (X) on cognitive impairment/dementia (the tertiary outcome Y) mediated through the B-SIT-tested olfaction (the secondary outcome M), while the primary outcome (case-control status) is the self-reported sense of smell (Z). Using the notation described in Section 3.2.1, we also consider the participants’ age (continuous) at DFU-2 as the common cause of the exposure, the primary outcome, the mediator and the outcome, i.e., C1; the participants’ race/ethnicity as the common cause of the primary outcome, the mediator and the outcome, i.e., C2; and the self-reported overall health status in the past year assessed at DFU-3 as a surrogate for comorbidity (the confounder between the mediator and the outcome), i.e., C3. Table 3.6 summarizes the definitions of the variables used in this analysis.

Table 3.6: Variable Deﬁnition For The Sister Study Example

Variables               | Definition                     | Variable Type          | Distribution (N=33,742) | Time of Assessment
Confounder (C1)         | Age                            | Continuous             | 63.4 (7.3)              | DFU-3 (2014-2016)
Confounder (C2)         | Race/Ethnicity                 | White (0)/Other (1)    | P(C2 = 1) = 11.8%       | Enrollment (2003-2009)
Confounder (C3)         | Self-report health at DFU-3    | Good (0)/Fair-poor (1) | P(C3 = 1) = 6.0%        | DFU-3 (2014-2016)
Exposure (X)            | History of serious head injury | Yes (1)/No (0)         | P(X = 1) = 7.7%         | DFU-3 (2014-2016)
Case-control status (Z) | Self-report olfaction          | Poor (1)/Normal (0)    | P(Z = 1) = 5.1%         | DFU-3 (2014-2016)
Mediator (M)            | B-SIT tested olfaction (≤9)    | Poor (1)/Normal (0)    | Poor: 732 out of 2,502  | 2018-2019
Outcome (Y)             | Cognitive impairment (AD-8 ≥4) | Yes (1)/No (0)         | Yes: 371 out of 2,502   | DFU-4 (2017-2019)

This analysis aims to illustrate the implementation of the two methods for mediation analysis with complex survey data and may not support an epidemiological interpretation, for several reasons. First, the mediator is a surrogate for the true olfactory status but is measured at only a single time point. Second, olfactory status is a multi-dimensional assessment, e.g., discrimination, threshold and identification; therefore, a single smell identification test may not reflect the true olfactory status at one particular time point. Third, the exposure, history of serious head injury, is based on self-report and may suffer from recall bias; it may also suffer from measurement error relative to the definition of traumatic brain injury. Fourth, the cutoff chosen for the AD-8 in the definition of cognitive impairment is higher than the common choice because the Sister Study cohort consists of relatively healthy people. This may lead to measurement error in the outcome. Fifth, there is an overlap in the data collection periods for the mediator (B-SIT-tested olfaction) and the outcome (cognitive function), which might not reflect the true causal relationship.

This example includes 33,742 of the 36,491 cohort participants, namely those who had non-missing values for the covariates C, the exposure X, and the primary case-control outcome Z. Investigators used 100% sampling for cases and ~2.5% sampling for controls. The complete data include 2,502 participants in the case-control sample. Table 3.7 shows the estimates of the mediating effects on the RD, RR, and OR scales with 95% CIs:

Table 3.7: Method Comparison Using the Sister Study

       | Modified Huber & Solovyeva          | Proposed
Effect | Point Estimate | Bootstrap 95% CI   | Point Estimate | Bootstrap 95% CI | Robust 95% CI

Risk Difference
CDE0   | 0.036          | (-0.022, 0.099)    | 0.003          | (-0.065, 0.084)  | (-0.071, 0.077)
CDE1   | 0.065          | (-0.030, 0.167)    | -0.008         | (-0.092, 0.119)  | (-0.101, 0.086)
NDE    | 0.037          | (-0.015, 0.092)    | 0.038          | (-0.014, 0.093)  | (-0.016, 0.092)
NIE    | 0.002          | (-0.001, 0.005)    | 0.002          | (-0.001, 0.005)  | (-0.001, 0.005)

Relative Risk
CDE0   | 1.28           | (0.83, 1.80)       | 1.04           | (0.26, 2.09)     | (0.43, 2.49)
CDE1   | 1.40           | (0.82, 2.12)       | 0.93           | (0.37, 2.64)     | (0.36, 2.39)
NDE    | 1.27           | (0.89, 1.70)       | 1.28           | (0.90, 1.71)     | (0.93, 1.77)
NIE    | 1.01           | (0.99, 1.03)       | 1.01           | (0.99, 1.03)     | (0.99, 1.03)

Odds Ratio
CDE0   | 1.34           | (0.81, 2.03)       | 1.04           | (0.24, 2.30)     | (0.40, 2.71)
CDE1   | 1.52           | (0.79, 2.65)       | 0.92           | (0.34, 3.03)     | (0.32, 2.63)
NDE    | 1.33           | (0.87, 1.90)       | 1.34           | (0.88, 1.92)     | (0.91, 1.98)
NIE    | 1.01           | (0.99, 1.04)       | 1.01           | (0.99, 1.04)     | (0.99, 1.04)

Both methods give very close estimates of the NEs. For the CDE, the two methods provide different point estimates and 95% CIs, regardless of the value at which the mediator M, i.e., the B-SIT-tested olfaction status, is fixed. Again, this is only an example showing how to apply the two methods to real-world data, and no clinical conclusion should be drawn from this analysis.

3.8 Conclusion

In Chapter 3, we discussed the estimation of mediating effects (controlled direct effect [CDE] and natural (in)direct effects [NEs]) using complex sampling data from a case-control sample drawn from the target population (e.g., a cohort), with a secondary outcome as the mediator and a tertiary outcome as the outcome of interest. We first showed that ignoring the study design would lead to serious bias in the estimation of the NEs. Second, we proposed a set of weighting estimators and compared them with a modified version of Huber and Solovyeva’s method in a Monte Carlo simulation study. Overall, the two methods have comparable performance, while the proposed method is computationally less expensive. Third, we showed the validity of using the robust standard error for the proposed estimator in addition to the bootstrap resampling based standard error. Fourth, we implemented and compared the two methods using a real-world dataset for illustration purposes. Overall, we recommend using more than one method in practice, to provide researchers with more information and more comprehensive results for clinical interpretation.


CHAPTER 4 Monte Carlo Based Statistical Power Analysis for Mediation Analysis Using Case-Control Studies With Secondary and Tertiary Outcomes

4.1 Background

Mediation analysis has expanded rapidly over the past two decades, in both methodology and applications, especially for observational studies (Robins et al. 2000, Pearl 2001, VanderWeele 2009, Hong 2010, VanderWeele & Vansteelandt 2010, Lange et al. 2012, Richiardi et al. 2013, Tchetgen Tchetgen 2013, Valeri & Vanderweele 2013, Huber 2014, Hong 2015, VanderWeele 2015, Huber & Solovyeva 2020). However, less attention has been paid to study design and sample size calculation for mediation analyses (VanderWeele 2020), primarily because such analyses usually rely on a crude approximation of the sample size (Rudolph et al. 2019). As a result, researchers may not know the minimum detectable mediating effect for the particular statistical method and study design they choose.

Two approaches have been used for calculating statistical power for the estimation of indirect effects: one based on generalized linear models (GLM) (Vittinghoff et al. 2009) and one based on simulation using the Baron and Kenny product estimator (Baron & Kenny 1986, Judd & Kenny 1981). Both approaches have significant limitations. The GLM method can only detect whether an indirect effect exists but cannot provide information on the power for a given effect size. The Baron and Kenny method generally does not allow for exposure-mediator interaction and requires a linear relationship between the mediator and the outcome. Alternative methods for mediation analysis allow for potential exposure-mediator interactions and for different functional forms between the outcome and the exposure, e.g., a log or logit link (Lange et al. 2012, VanderWeele 2009, Huber & Solovyeva 2020). However, power analysis using such methods is rarely seen or discussed.
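For reference, the product estimator mentioned above is commonly written for two linear models without exposure-mediator interaction. The coefficient names a, b and c' follow the conventional presentation of the Baron and Kenny approach and are not notation taken from this dissertation:

```latex
\begin{align}
M &= \alpha_0 + a\,X + \varepsilon_M, \\
Y &= \beta_0 + c'\,X + b\,M + \varepsilon_Y, \\
\text{indirect effect} &= a \cdot b, \qquad \text{direct effect} = c'.
\end{align}
```

Because the indirect effect is the product of two linear-model coefficients, this formulation has no term for an X-by-M interaction, which is the limitation noted in the text.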

In addition to statistical methods that account for exposure-mediator interaction and different functional forms, the complexity of the study design should also be taken into consideration. Mediation analysis has been implemented in different types of observational studies, including cross-sectional, case-control and cohort studies (Rijnhart et al. 2021). However, little has been done on mediation analysis in a case-control study with subsequently collected (secondary and/or tertiary) outcomes. This situation may arise after a case-control study has been conducted, when there is additional interest in a secondary outcome and in the direct or indirect effects of the exposure on the secondary outcome through the primary outcome. Similarly, subsequent (tertiary) outcomes can be collected further, with the study interest being the mediating effects of the exposure on the tertiary outcome through a secondary outcome.

At the time of writing, no comprehensive publication discusses power analysis and sample size calculation using methods other than structural equation modeling. In fact, while most publications focus on power analysis in mediation analysis using the Baron and Kenny approach (Fritz & MacKinnon 2007), no publication discusses the design of a mediation analysis involving a complex sampling process with secondary (and tertiary) outcomes. Therefore, we aim to fill these two gaps in this chapter by using simulation to explore the relationship between the effect size of the natural indirect effect and the statistical power of a recently proposed weighting method designed for mediation analysis in complex study designs with secondary (and tertiary) outcomes. With such a study design, the exposure X is fully observed in the cohort. However, even if X is only observed within the case-control sample, we can still estimate the mediating effects in the cohort (source population) using the approach discussed in Sections 3.4-3.6.

The first aim is to find the minimum detectable effect size for the natural indirect effect at a fixed target statistical power, i.e., 0.8, using a recently proposed weighting method. The second aim is to find the optimal case-to-control ratio, for a fixed sample size, to detect a given effect size.
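The logic of the first aim can be sketched as a Monte Carlo loop: simulate data at a candidate effect size, test the effect in each replicate, take the rejection proportion as the power, and report the smallest effect on a grid whose power reaches the target. The sketch below is only an illustration under stated assumptions. The estimator is a placeholder (a two-group risk difference with a Wald test), not the weighting estimator for the NIE, and the sample size, baseline risk, grid and function names are all hypothetical.

```python
import math
import random

def one_replicate(effect, n_per_arm, base_risk, rng):
    """Simulate one data set; return True if the 95% Wald CI excludes zero."""
    events0 = sum(rng.random() < base_risk for _ in range(n_per_arm))
    events1 = sum(rng.random() < base_risk + effect for _ in range(n_per_arm))
    p0, p1 = events0 / n_per_arm, events1 / n_per_arm
    se = math.sqrt(p0 * (1 - p0) / n_per_arm + p1 * (1 - p1) / n_per_arm)
    if se == 0:
        return False
    return abs(p1 - p0) / se > 1.96

def mc_power(effect, n_per_arm=500, base_risk=0.2, n_sim=400, seed=11):
    """Monte Carlo power: share of replicates in which the effect is detected."""
    rng = random.Random(seed)
    hits = sum(one_replicate(effect, n_per_arm, base_risk, rng) for _ in range(n_sim))
    return hits / n_sim

def minimum_detectable_effect(grid, target_power=0.8, **kwargs):
    """Smallest effect on an increasing grid with estimated power >= target."""
    for effect in grid:
        if mc_power(effect, **kwargs) >= target_power:
            return effect
    return None  # no effect on the grid reaches the target power

grid = [0.02, 0.04, 0.06, 0.08, 0.10]
mde = minimum_detectable_effect(grid)
print("minimum detectable effect on the grid:", mde)
```

The same loop could address the second aim by holding the total sample size fixed, varying the case-to-control ratio, and comparing the estimated power across ratios.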

In this chapter, we first introduce the study settings, notation, the data structure represented by causal directed acyclic graphs (DAGs), a brief review of natural indirect effects, the data generating process, the Monte Carlo simulation, and the bootstrap resampling procedure. Next, we introduce the estimation method for the target causal parameter, i.e., the natural indirect effect, followed by the simulation results. Lastly, we summarize our findings and draw conclusions.

4.2 Settings

4.2.1 Notation and Causal DAG

In this chapter, we consider mediation analysis using case-control samples drawn from a pre-defined target population, e.g., a cohort, under two different scenarios. Scenario 1 studies the mediating effect of the exposure on a secondary outcome through the variable used for case-control sampling (the primary outcome). Scenario 2 studies the mediating effect of the exposure on a tertiary outcome through the secondary outcome. Next, we introduce the variables involved in the two scenarios.

Formally, let C1 be the exposure-mediator, exposure-outcome, and mediator-outcome confounder, e.g., the participants’ age at the study baseline. Let C2 be the common cause of the primary outcome (case-control status), the mediator and the outcome, e.g., race/ethnicity. Let C3 be the mediator-outcome confounder assessed post-baseline, e.g., a comorbid disease present at the time of mediator assessment, prior to the assessment of the outcome. C1 and C2 are assessed at baseline (T0). It is possible that C3 is only assessed within the case-control sample (T1). Let X be the exposure of interest assessed at baseline (T0). Let Z be the variable used for selecting the case-control sample (0 = control, 1 = case). Let M be the secondary outcome, which is assessed after collection of the case-control sample (T1) and is available only within the case-control sample. We assume that the sample is selected from an existing cohort study. Let S be an indicator of case-control sample selection status, where “S = 1” means the unit is selected and “S = 0” means it is not selected (Didelez et al. 2010). Let Y be the tertiary outcome assessed after the measurement of the mediator (T2). Figures 4.1 and 4.2 show the causal DAGs for the data generating process in the two scenarios, where the square around S indicates that the analysis is performed using observed data, i.e., the selected case-control sample.

C1: confounder between exposure, mediator and outcome, e.g. participants’ age at study

baseline

C2: confounder between mediator and outcome, e.g. participant comorbidity

X: exposure/treatment

Z: case-control status (mediator)

S: case-control sample selection node

Y : (secondary) outcome

Figure 4.1: Causal DAG for Scenario 1

C1: confounder between the exposure, the mediator and the outcome

C2: common cause for primary outcome, the mediator and the outcome

C3: confounder between the mediator and the outcome

X: exposure/treatment assessed at T0

Z: case-control status assessed between T0 and T1

M : secondary outcome (mediator) assessed at T1

S: case-control sample selection node

Y : (tertiary) outcome assessed at T2

Figure 4.2: Causal DAG for Scenario 2

For simplicity, assume C2, C3, X, Z, M and Y are all binary variables. In practice, they can be non-binary as well. If the case-control sample is selected from a cohort (source population) study as we assumed, then (C1, C2, X, Z, S) are observed for all participants whereas (C3, M, Y) are only observed when S = 1, i.e., among subjects who are selected into the case-control sample.

4.2.2 Natural Indirect Eﬀect (NIE)

Since the NIE is generally of greater interest than the other causal effects (controlled direct effects and natural direct effects) in mediation analysis, we focus on the NIE in this chapter. For a binary mediator and outcome, the population NIE can be defined on two different scales, separately for scenarios 1 and 2 (Robins & Greenland 1992, Pearl 2001, VanderWeele 2015):

Scenario 1:

Risk difference: $NIE_{RD} = E[Y^{1Z^{1}} - Y^{1Z^{0}}]$.

Odds ratio: $NIE_{OR} = \frac{P(Y^{1Z^{1}} = 1)/P(Y^{1Z^{1}} = 0)}{P(Y^{1Z^{0}} = 1)/P(Y^{1Z^{0}} = 0)}$.

Scenario 2:

Risk difference: $NIE_{RD} = E[Y^{1M^{1}} - Y^{1M^{0}}]$.

Odds ratio: $NIE_{OR} = \frac{P(Y^{1M^{1}} = 1)/P(Y^{1M^{1}} = 0)}{P(Y^{1M^{0}} = 1)/P(Y^{1M^{0}} = 0)}$.
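As a small illustration of the two scales defined above, the sketch below computes the NIE from a pair of potential-outcome prevalences. The function names and the two input prevalences are hypothetical and chosen only for illustration; they are not values from this study.

```python
def nie_risk_difference(p_y_1m1, p_y_1m0):
    """NIE on the risk-difference scale: P(Y^{1,M^1}=1) - P(Y^{1,M^0}=1)."""
    return p_y_1m1 - p_y_1m0


def nie_odds_ratio(p_y_1m1, p_y_1m0):
    """NIE on the odds-ratio scale: odds of Y^{1,M^1} over odds of Y^{1,M^0}."""
    odds_1 = p_y_1m1 / (1.0 - p_y_1m1)
    odds_0 = p_y_1m0 / (1.0 - p_y_1m0)
    return odds_1 / odds_0


# Hypothetical prevalences of the two potential outcomes (illustration only).
p11, p10 = 0.152, 0.150
print(nie_risk_difference(p11, p10))
print(nie_odds_ratio(p11, p10))
```

The same functions apply to scenario 1 by reading the mediator as Z instead of M. When the two prevalences coincide, the risk difference is 0 and the odds ratio is 1, which corresponds to the no-NIE setting.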

4.2.3 Data Generating Process (DGP)

Since diﬀerent case-control sampling ratios may aﬀect the statistical power when ﬁxing the

sample size, we explore the power-sample size relationship for 10 diﬀerent eﬀect sizes of NIE

with 4 diﬀerent case-to-control sampling ratios (2:1, 1:1, 1:2 and 1:3). To achieve diﬀerent

effect sizes, we adopt different parameters to generate the data, as detailed below and in Appendix B.

Scenario 1:

1) Generate the observed outcomes:

C1 is a continuous confounder assessed at baseline (T0), generated from a normal distribution, N(60.72, 54.05), truncated to the interval [44.1, 76.3]. An example of C1 could be the age at study baseline of participants from the source population (cohort).

C2 is a binary confounder assessed at baseline, with C2 ∼ Bernoulli(PC2) and PC2 ≈ 12.71%. An example of C2 could be race/ethnicity, where 0 denotes White and 1 denotes Black.

X is a binary exposure of interest and $X \sim \mathrm{Bernoulli}(P_X)$, where $P_X = \{1 + \exp(-\beta_{c1x} C_1 - \beta_{0x})\}^{-1}$ and $P(X = 1) \approx 20\%$.

Z is a binary mediator (primary outcome) assessed at T1, later than baseline, with $Z \sim \mathrm{Bernoulli}(P_Z)$, where

$P_Z = \{1 + \exp(-\beta_{c1z} C_1 - \beta_{c2z} C_2 - \beta_{xz} X - \beta_{0z})\}^{-1}$

and $P(Z = 1) \approx 7.73\%$.

S is the indicator of being selected into the case-control sample (0 = unselected, 1 = selected): all subjects with Z = 1 are selected, and subjects with Z = 0 are selected by simple random sampling with a sampling rate of 3.56%. This is very common in epidemiological studies, where all cases are selected while only a small portion of the controls is selected due to a limited budget (Cao et al. 2022).

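Y is the binary (0/1) outcome of interest and $Y \sim \mathrm{Bernoulli}(P_Y)$, where

$P_Y = \{1 + \exp(-k_Y)\}^{-1}$

$k_Y = \beta_{c1y} C_1 + \beta_{c2y} C_2 + \beta_{xy} X + \beta_{zy} Z + \beta_{c1xy} C_1 X + \beta_{c2xy} C_2 X + \beta_{xzy} X Z + \beta_{0y}$

and $P(Y = 1) \approx 14.90\%$.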

2) Generate the potential outcomes:

For the mediator Z, $Z^{x} \sim \mathrm{Bernoulli}(P_{Z^{x}})$, where $P_{Z^{x}}$ is similar to $P_Z$ with X replaced by the actual value x. For the outcome Y, $Y^{xZ^{x^*}} \sim \mathrm{Bernoulli}(P_{Y^{xZ^{x^*}}})$, where $P_{Y^{xZ^{x^*}}}$ is similar to $P_Y$ with X and Z replaced by x and the counterfactual outcome $Z^{x^*}$.
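The scenario 1 data generating process described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: every coefficient in `BETA` is hypothetical (the actual values are tabulated in the Appendix), 54.05 is read as the variance of C1, and the potential outcomes are drawn independently of the observed ones, as the Bernoulli description suggests.

```python
import math
import random


def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def truncated_normal(rng, mean, sd, low, high):
    """Rejection sampler for a normal distribution truncated to [low, high]."""
    while True:
        draw = rng.gauss(mean, sd)
        if low <= draw <= high:
            return draw


# Hypothetical coefficients, for illustration only.
BETA = {
    "x": {"c1": 0.01, "0": -2.0},
    "z": {"c1": 0.02, "c2": 0.3, "x": 0.5, "0": -4.0},
    "y": {"c1": 0.01, "c2": 0.2, "x": 0.3, "z": 0.8,
          "c1x": 0.0, "c2x": 0.1, "xz": 0.2, "0": -2.8},
}


def p_z(c1, c2, x):
    b = BETA["z"]
    return expit(b["c1"] * c1 + b["c2"] * c2 + b["x"] * x + b["0"])


def p_y(c1, c2, x, z):
    b = BETA["y"]
    k_y = (b["c1"] * c1 + b["c2"] * c2 + b["x"] * x + b["z"] * z
           + b["c1x"] * c1 * x + b["c2x"] * c2 * x + b["xz"] * x * z + b["0"])
    return expit(k_y)


def simulate_unit(rng, control_rate=0.0356):
    # Assumption: 54.05 is the variance of C1, so the SD is its square root.
    c1 = truncated_normal(rng, 60.72, math.sqrt(54.05), 44.1, 76.3)
    c2 = int(rng.random() < 0.1271)
    x = int(rng.random() < expit(BETA["x"]["c1"] * c1 + BETA["x"]["0"]))
    z = int(rng.random() < p_z(c1, c2, x))
    y = int(rng.random() < p_y(c1, c2, x, z))
    # All cases are selected; controls are selected by simple random sampling.
    s = 1 if z == 1 else int(rng.random() < control_rate)
    # Potential outcomes: Z^0 and Z^1, then Y^{1 Z^1} and Y^{1 Z^0}.
    z0 = int(rng.random() < p_z(c1, c2, 0))
    z1 = int(rng.random() < p_z(c1, c2, 1))
    y_1z1 = int(rng.random() < p_y(c1, c2, 1, z1))
    y_1z0 = int(rng.random() < p_y(c1, c2, 1, z0))
    return {"C1": c1, "C2": c2, "X": x, "Z": z, "S": s, "Y": y,
            "Y_1Z1": y_1z1, "Y_1Z0": y_1z0}


rng = random.Random(2023)
cohort = [simulate_unit(rng) for _ in range(20000)]
# Population NIE on the risk-difference scale, averaged over the simulated units.
true_nie_rd = sum(u["Y_1Z1"] - u["Y_1Z0"] for u in cohort) / len(cohort)
print(true_nie_rd)
```

In practice the intercepts would be tuned until the simulated prevalences of X, Z and Y match the targets stated above.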

Scenario 2:

1) Generate the observed outcomes:

The DGP of C1, C2, X, Z, and S is exactly the same as in scenario 1. Below is the DGP for the three additional observed variables.

C3 is a binary confounder assessed either at T0, between T0 and T1, or at T1, with C3 ∼ Bernoulli(PC3) and PC3 ≈ 7.27%. An example of C3 could be a comorbid disease present at the time of assessment of the secondary outcome (the mediator).

M is the mediator (secondary outcome), assessed post-baseline (T1), with $M \sim \mathrm{Bernoulli}(P_M)$, where

$P_M = \{1 + \exp(-\beta_{c1m} C_1 - \beta_{c2m} C_2 - \beta_{c3m} C_3 - \beta_{xm} X - \beta_{zm} Z - \beta_{0m})\}^{-1}$

and $P(M = 1) \approx 14.91\%$.

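Y is the binary (0/1) outcome of interest assessed at T2 and $Y \sim \mathrm{Bernoulli}(P_Y)$, where

$P_Y = \{1 + \exp(-k_Y)\}^{-1}$

$k_Y = \beta_{c1y} C_1 + \beta_{c2y} C_2 + \beta_{c3y} C_3 + \beta_{xy} X + \beta_{my} M + \beta_{c3xy} C_3 X + \beta_{xmy} X M - \beta_{0y}$

and $P(Y = 1) \approx 13.51\%$.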

2) Generate the potential outcomes:

For the primary outcome (case-control status) Z, $Z^{x} \sim \mathrm{Bernoulli}(P_{Z^{x}})$, where $P_{Z^{x}}$ is similar to $P_Z$ with X replaced by the actual value x. For the mediator M, $M^{x,Z^{x^*}} \sim \mathrm{Bernoulli}(P_{M^{x,Z^{x^*}}})$, where $P_{M^{x,Z^{x^*}}}$ is similar to $P_M$ with X and Z replaced by x and the counterfactual outcome $Z^{x^*}$. For the outcome Y, $Y^{xM^{x^*}} \sim \mathrm{Bernoulli}(P_{Y^{xM^{x^*}}})$, where $P_{Y^{xM^{x^*}}}$ is similar to $P_Y$ with X and M replaced by x and the counterfactual outcome $M^{x^*}$.

To explore the relationship between different effect sizes for the NIE and statistical power across different case-to-control selection ratios, we primarily manipulate the coefficients for the exposure and mediator in the data generating process for the outcome Y. Meanwhile, we also modify the intercepts to ensure the desired population prevalence for Z, M and Y when applicable. Tables 7.1 and 7.2 in the Appendix show the choice of coefficients used to generate data for scenarios 1 and 2. Briefly, we start with no NIE, i.e., NIERD = 0 and NIEOR = 1, and keep increasing the NIE until the power reaches at least 90%.

4.3 Estimation Method

In Chapters 2 and 3, we proposed two weighting estimators to estimate the NIE for scenarios 1 and 2, and we showed that both bootstrap resampling and the sandwich method are valid for estimating the standard error (SE) of the proposed estimators. Additionally, an existing method (Huber & Solovyeva 2020) can be used to estimate the NEs in scenario 1 (HS method) and, with some modification, in scenario 2 as well (modified HS method). However, to estimate the SE, the (modified) HS method mainly relies on bootstrap resampling, which is computationally expensive. Therefore, we run the simulations primarily with the proposed estimators and robust SEs, and include the (modified) HS method with bootstrap SEs for only one case-to-control ratio in each scenario, for comparison purposes.

We briefly introduce the estimation methods for the two scenarios separately. For proofs and more details, please refer to Chapter 2, Chapter 3, and Appendix A.

1) Scenario 1: primary case-control outcome as the mediator

Proposed method (weighting estimator I):

$E[Y^{xZ^{x}}] = E[w_1 Y \mid X = x, S = 1]$, where $w_1 = \frac{P(S=1 \mid X=x)\,P(X=x)}{P(S=1 \mid Z=z)\,P(X=x \mid c_1)}$.

$E[Y^{xZ^{x^*}}] = E[w_2 Y \mid X = x, S = 1]$, where $w_2 = \frac{P(S=1 \mid X=x)\,P(Z=z \mid X=x^*, c_1, c_2)\,P(X=x)}{P(S=1 \mid Z=z)\,P(Z=z \mid X=x, c_1, c_2)\,P(X=x \mid c_1)}$.
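To make the structure of the two weights concrete, the sketch below assembles them from already-estimated probabilities. The function and argument names are hypothetical, and the numeric inputs are made up for illustration; in an analysis each probability would come from a fitted model or a known sampling rate.

```python
def weight_w1(p_s_given_x, p_x, p_s_given_z, p_x_given_c1):
    """w1 = P(S=1|X=x) P(X=x) / [P(S=1|Z=z) P(X=x|c1)]."""
    return (p_s_given_x * p_x) / (p_s_given_z * p_x_given_c1)


def weight_w2(p_s_given_x, p_x, p_s_given_z, p_x_given_c1,
              p_z_given_xstar, p_z_given_x):
    """w2 = w1 * P(Z=z|X=x*,c1,c2) / P(Z=z|X=x,c1,c2)."""
    mediator_ratio = p_z_given_xstar / p_z_given_x
    return weight_w1(p_s_given_x, p_x, p_s_given_z, p_x_given_c1) * mediator_ratio


# Hypothetical inputs for one sampled control (Z = 0) with X = 1.
w1 = weight_w1(p_s_given_x=0.12, p_x=0.20, p_s_given_z=0.0356, p_x_given_c1=0.22)
w2 = weight_w2(p_s_given_x=0.12, p_x=0.20, p_s_given_z=0.0356, p_x_given_c1=0.22,
               p_z_given_xstar=0.95, p_z_given_x=0.92)
print(w1, w2)
```

Written this way, w2 is w1 multiplied by the ratio of mediator probabilities under x* and x, so the two weights coincide when x* = x.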

To estimate the NIE, we adopt the marginal structural model (MSM) approach. The RD and OR can be expressed in terms of the parameters of MSMs with linear or logit link functions:

linear model: $P(Y^{xZ^{x^*}} = 1) = c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^*$ (4.1)

logistic model: $\mathrm{logit}[P(Y^{xZ^{x^*}} = 1)] = c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^*$ (4.2)

It is not difficult to show that:

$\widehat{NIE}_{RD} = \hat{c}_2 (x - x^*) + \hat{c}_3 x (x - x^*)$, where $(\hat{c}_2, \hat{c}_3)$ are estimated from model 4.1.

$\widehat{NIE}_{OR} = e^{\hat{c}_2 (x - x^*) + \hat{c}_3 x (x - x^*)}$, where $(\hat{c}_2, \hat{c}_3)$ are estimated from model 4.2.
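For completeness, the step behind these two expressions can be written out. The derivation below only evaluates model 4.1 at $x^* = x$ and at a general $x^*$ and takes the difference; the same subtraction on the logit scale of model 4.2 gives the log odds ratio.

```latex
\begin{align*}
NIE_{RD} &= P(Y^{xZ^{x}} = 1) - P(Y^{xZ^{x^*}} = 1) \\
         &= (c_0 + c_1 x + c_2 x + c_3 x \cdot x) - (c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^*) \\
         &= c_2 (x - x^*) + c_3 x (x - x^*), \\[4pt]
\log NIE_{OR} &= \mathrm{logit}[P(Y^{xZ^{x}} = 1)] - \mathrm{logit}[P(Y^{xZ^{x^*}} = 1)] \\
              &= c_2 (x - x^*) + c_3 x (x - x^*).
\end{align*}
```

For a binary exposure with $x = 1$ and $x^* = 0$, these reduce to $NIE_{RD} = c_2 + c_3$ and $NIE_{OR} = e^{c_2 + c_3}$.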

Huber & Solovyeva’s (HS) method:

Taking risk diﬀerence as an illustrative example as shown in the original publication

(Huber & Solovyeva 2020), we have

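$\widehat{NIE}_{RD} = \hat{E}[Y^{1Z^{1}}] - \hat{E}[Y^{1Z^{0}}]$,

where

$\hat{E}[Y^{1Z^{1}}] = \frac{1}{n} \sum_i \frac{Y_i \cdot X_i \cdot S_i}{P(X_i=1 \mid C_{1i}) \cdot P(S_i=1 \mid Z_i)} \Big/ \frac{1}{n} \sum_i \frac{X_i \cdot S_i}{P(X_i=1 \mid C_{1i}) \cdot P(S_i=1 \mid Z_i)}$,

and

$\hat{E}[Y^{1Z^{0}}] = \frac{1}{n} \sum_i \frac{Y_i \cdot (1-X_i) \cdot S_i \cdot P(X_i=0 \mid Z_i, C_{1i}, C_{2i})}{P(X_i=0 \mid C_{1i}) \cdot P(X_i=1 \mid Z_i, C_{1i}, C_{2i}) \cdot P(S_i=1 \mid Z_i)} \Big/ \frac{1}{n} \sum_i \frac{(1-X_i) \cdot S_i \cdot P(X_i=0 \mid Z_i, C_{1i}, C_{2i})}{P(X_i=0 \mid C_{1i}) \cdot P(X_i=1 \mid Z_i, C_{1i}, C_{2i}) \cdot P(S_i=1 \mid Z_i)}$.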
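Each quantity in the HS estimator is a normalized weighted mean: a weighted sum of outcomes divided by the sum of the same weights. The sketch below shows that structure for the first term only, assuming the probabilities have already been estimated; the record field names (`pX1_C1`, `pS1_Z`) and the four toy records are hypothetical.

```python
def normalized_weighted_mean(outcomes, weights):
    """Weighted sum of outcomes divided by the sum of the weights."""
    total_weight = sum(weights)
    return sum(w * y for w, y in zip(weights, outcomes)) / total_weight


def hs_mean_y1z1(records):
    """Normalized weighted mean with weight X*S / [P(X=1|C1) * P(S=1|Z)]."""
    weights = [r["X"] * r["S"] / (r["pX1_C1"] * r["pS1_Z"]) for r in records]
    outcomes = [r["Y"] for r in records]
    return normalized_weighted_mean(outcomes, weights)


# Toy records (illustration only): unexposed or unselected units get weight 0.
toy_records = [
    {"Y": 1, "X": 1, "S": 1, "pX1_C1": 0.20, "pS1_Z": 1.00},
    {"Y": 0, "X": 1, "S": 1, "pX1_C1": 0.25, "pS1_Z": 0.04},
    {"Y": 1, "X": 0, "S": 1, "pX1_C1": 0.20, "pS1_Z": 0.04},
    {"Y": 1, "X": 1, "S": 0, "pX1_C1": 0.20, "pS1_Z": 0.04},
]
print(hs_mean_y1z1(toy_records))
```

Because the common factor 1/n appears in both numerator and denominator, it cancels and is omitted from the code.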

2) Scenario 2: secondary outcome as the mediator

Proposed method (weighting estimator II):

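$E[Y^{xM^{x}}] = E[w_3 Y \mid S = 1]$, where $w_3 = \frac{P(M=m \mid c_1, c_2, c_3, x)}{P(M=m \mid x, c_1, c_2, c_3, S=1)} \cdot \frac{P(S=1 \mid x)\,P(X=x)}{P(S=1 \mid x, c_1, c_2)\,P(X=x \mid c_1)}$.

$E[Y^{xM^{x^*}}] = E[w_4 Y \mid S = 1]$, where $w_4 = \frac{P(M=m \mid c_1, c_2, c_3, x^*)}{P(M=m \mid x, c_1, c_2, c_3, S=1)} \cdot \frac{P(S=1 \mid x)\,P(X=x)}{P(S=1 \mid x, c_1, c_2)\,P(X=x \mid c_1)}$.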

Similarly, we use the following MSMs to estimate the NIE:

linear model: $P(Y^{xM^{x^*}} = 1) = c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^*$ (4.3)

logistic model: $\mathrm{logit}[P(Y^{xM^{x^*}} = 1)] = c_0 + c_1 x + c_2 x^* + c_3 x \cdot x^*$ (4.4)

And we have:

$\widehat{NIE}_{RD} = \hat{c}_2 (x - x^*) + \hat{c}_3 x (x - x^*)$, where $(\hat{c}_2, \hat{c}_3)$ are estimated from model 4.3.

$\widehat{NIE}_{OR} = e^{\hat{c}_2 (x - x^*) + \hat{c}_3 x (x - x^*)}$, where $(\hat{c}_2, \hat{c}_3)$ are estimated from model 4.4.

Modiﬁed Huber & Solovyeva’s (HS) method:

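Taking the risk difference as an example, we have

$\widehat{NIE}_{RD} = \hat{E}[Y^{1M^{1}}] - \hat{E}[Y^{1M^{0}}]$, where

$\hat{E}[Y^{1M^{1}}] = \frac{1}{n} \sum_i \frac{Y_i \cdot X_i \cdot S_i \cdot P(M_i=m \mid X_i, C_{1i}, C_{2i}, C_{3i})}{P(X_i=1 \mid C_{1i}) \cdot P(M_i=m \mid X_i, C_{1i}, C_{2i}, C_{3i}, S_i=1) \cdot P(S_i=1 \mid X_i, C_{1i}, C_{2i})} \Big/ s_1$ and

$s_1 = \frac{1}{n} \sum_i \frac{X_i \cdot S_i \cdot P(M_i=m \mid X_i, C_{1i}, C_{2i}, C_{3i})}{P(X_i=1 \mid C_{1i}) \cdot P(M_i=m \mid X_i, C_{1i}, C_{2i}, C_{3i}, S_i=1) \cdot P(S_i=1 \mid X_i, C_{1i}, C_{2i})}$.

$\hat{E}[Y^{1M^{0}}] = \frac{1}{n} \sum_i \frac{Y_i \cdot (1-X_i) \cdot S_i \cdot P(X_i=0 \mid M_i, C_{1i}, C_{2i}, C_{3i}) \cdot P(M_i=m \mid X_i, C_{1i}, C_{2i}, C_{3i})}{P(X_i=0 \mid C_{1i}, C_{2i}, C_{3i}) \cdot P(X_i=1 \mid M_i, C_{1i}, C_{2i}, C_{3i}) \cdot P(M_i=m \mid X_i, C_{1i}, C_{2i}, C_{3i}, S_i=1) \cdot P(S_i=1 \mid X_i, C_{1i}, C_{2i})} \Big/ s_2$ and

$s_2 = \frac{1}{n} \sum_i \frac{(1-X_i) \cdot S_i \cdot P(X_i=0 \mid M_i, C_{1i}, C_{2i}, C_{3i}) \cdot P(M_i=m \mid X_i, C_{1i}, C_{2i}, C_{3i})}{P(X_i=0 \mid C_{1i}, C_{2i}, C_{3i}) \cdot P(X_i=1 \mid M_i, C_{1i}, C_{2i}, C_{3i}) \cdot P(M_i=m \mid X_i, C_{1i}, C_{2i}, C_{3i}, S_i=1) \cdot P(S_i=1 \mid X_i, C_{1i}, C_{2i})}$.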

4.4 Simulation Studies

4.4.1 Monte Carlo (MC) simulation

The MC simulation follows these steps:

1. Since exposure-covariate and exposure-mediator interactions are assumed, it would be cumbersome to derive the true NIE analytically, especially for the logit link function. Thus, the true NIE is calculated from a super population (N0 = 2,000,000) generated under the DGP specified in Section 4.2.3.

2. Generate a cohort (N = 36,491) and four case-control samples with a fixed sample size (n = 4,000) and different case-to-control ratios: 2:1, 1:1, 1:2 and 1:3. Both cases and controls are drawn from the generated cohort using stratified simple random sampling with different sampling rates within the strata defined by the primary case-control outcome.

3. Estimate the NIE within the selected samples using either the proposed or the (modified) HS method.

4. Test for statistical significance based on either the robust SE from a single MSM (proposed method) or the bootstrap confidence interval (CI; HS method) with α = 0.05. The calculation of the bootstrap CI is described below.

5. Repeat steps 2 to 4 R = 1,000 times (the number of MC runs). If r out of the R runs show statistical significance, the power of the analysis is calculated as 1 − β = r/R.
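The loop in steps 2 to 5 can be sketched as follows. This is a minimal illustration, not the simulation code used for this chapter: `make_toy_run` is a hypothetical stand-in that replaces the sampling, NIE estimation and testing of steps 2 to 4 with a two-sided z-test of a normal mean, so that the example stays self-contained.

```python
import math
import random


def estimate_power(run_once, n_runs=1000, seed=0):
    """Share of Monte Carlo runs in which run_once(rng) reports significance (r / R)."""
    rng = random.Random(seed)
    rejections = sum(1 for _ in range(n_runs) if run_once(rng))
    return rejections / n_runs


def make_toy_run(effect, n=100):
    """Toy stand-in for steps 2-4: draw data, estimate an effect, test at alpha = 0.05.

    A real run would draw a case-control sample from the cohort, estimate the
    NIE with a weighting estimator, and test it with a robust or bootstrap SE.
    """
    def run_once(rng):
        draws = [rng.gauss(effect, 1.0) for _ in range(n)]
        z_stat = (sum(draws) / n) * math.sqrt(n)  # variance is known to be 1
        return abs(z_stat) > 1.96
    return run_once


power_alt = estimate_power(make_toy_run(effect=0.3))
power_null = estimate_power(make_toy_run(effect=0.0))
print(power_alt, power_null)
```

Running the same loop with a zero effect gives the empirical type-I error rate, which is how the no-NIE setting is read in the results.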

4.4.2 Bootstrap Resampling

Previously, we showed the validity of using the robust SE with weighting estimators I and II. For the (modified) HS method, the authors recommend the bootstrap SE, since the NIE is obtained by separately estimating two quantities rather than from a single model. Below is the two-step bootstrap resampling method used to obtain the bootstrap SE and the quantile-based 95% CI:

1. Starting from the case-control sample at the r-th run of the MC simulation described in step 2 of Section 4.4.1, we use stratified bootstrap sampling to preserve the case-to-control ratio. Specifically, assume that the r-th case-control sample contains nr subjects, with nr0 subjects having Z = 0 and nr1 subjects having Z = 1. For each bootstrap sample b, we resample nr0b subjects from the nr0 controls and nr1b subjects from the nr1 cases using unrestricted random sampling such that nr0b + nr1b = nr, the predetermined sample size (4,000) under each case-to-control ratio. Unrestricted random sampling assigns equal probability to each individual in the same stratum and samples with replacement.

2. For each Monte Carlo run r, we draw B = 500 bootstrap samples and obtain the percentile bootstrap confidence interval and the bootstrap SE (Zhang 2014). Specifically, the bootstrap SE is calculated using the following formula (Chernick & LaBudde 2011, Efron 1982): $SE_b = \sqrt{\frac{1}{B-1} \sum_b (\hat{\theta}^*_b - \bar{\theta}^*)^2}$, where $\hat{\theta}^*_b$ is the causal effect, i.e., the NIE, estimated from the b-th bootstrap sample and $\bar{\theta}^* = \sum_b \hat{\theta}^*_b / B$.
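The two bootstrap steps can be sketched as below, assuming each subject is a dictionary with a stratum indicator `Z`. The function name, the toy sample and the simple order-statistic rule for the percentile limits are illustrative assumptions rather than the exact implementation used here; the estimator is passed in as a function.

```python
import math
import random


def stratified_bootstrap(sample, estimator, n_boot=500, seed=0):
    """Bootstrap SE and percentile 95% CI, resampling cases and controls separately."""
    rng = random.Random(seed)
    controls = [u for u in sample if u["Z"] == 0]
    cases = [u for u in sample if u["Z"] == 1]
    estimates = []
    for _ in range(n_boot):
        # Sample with replacement and equal probability within each stratum,
        # keeping the numbers of cases and controls fixed.
        resample = (rng.choices(controls, k=len(controls))
                    + rng.choices(cases, k=len(cases)))
        estimates.append(estimator(resample))
    mean = sum(estimates) / n_boot
    se = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (n_boot - 1))
    ordered = sorted(estimates)
    # Simple order-statistic approximation to the 2.5th and 97.5th percentiles.
    lower = ordered[int(0.025 * n_boot)]
    upper = ordered[int(0.975 * n_boot) - 1]
    return se, (lower, upper)


def prevalence(rows):
    """Toy estimator: share of subjects with Y = 1."""
    return sum(u["Y"] for u in rows) / len(rows)


# Toy sample (illustration only): 100 cases and 200 controls.
toy_rng = random.Random(1)
toy_sample = ([{"Z": 1, "Y": int(toy_rng.random() < 0.3)} for _ in range(100)]
              + [{"Z": 0, "Y": int(toy_rng.random() < 0.1)} for _ in range(200)])
se, ci = stratified_bootstrap(toy_sample, prevalence)
print(se, ci)
```

Because the strata sizes are fixed, any statistic that depends only on the number of cases has zero bootstrap variance, which is a quick check that the case-to-control ratio is preserved.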

4.5 Results

4.5.1 Power of the proposed methods

[Figure: statistical power (vertical axis, 0.00 to 1.00) plotted against the true NIE for four sampling plans (Case:Control 2:1, 1:1, 1:2 and 1:3). Upper panel: Risk Difference, with axis ticks 0, .0005, .0020, .0036 and .0072. Lower panel: Odds Ratio, with axis ticks 1, 1.0042, 1.0181, 1.0275 and 1.0556.]

Table 4.1: Scenario 1 power analysis for natural indirect eﬀects in risk diﬀerence scale (upper

panel) and odds ratio scale (lower panel) using weighting estimator I with marginal structural

models and robust standard errors

[Figure: statistical power (vertical axis, 0.00 to 1.00) plotted against the true NIE for four sampling plans (Case:Control 2:1, 1:1, 1:2 and 1:3). Upper panel: Risk Difference, with axis ticks 0, .002, .006, .010, .012, .016 and .020. Lower panel: Odds Ratio, with axis ticks 1, 1.020, 1.048, 1.080, 1.104, 1.136 and 1.170.]

Table 4.2: Scenario 2 power analysis for natural indirect eﬀects in risk diﬀerence scale (upper

panel) and odds ratio scale (lower panel) using weighting estimator II with marginal struc-

tural models and robust standard errors

For scenario 1 (Table 4.1), there are several interesting observations. First, the statistical power of the analysis increases as the true effect (risk difference and odds ratio) gets larger, which is expected. Second, with the sample size fixed at 4,000, both 2:1 and 1:1 have higher statistical power than the other two case-to-control ratios on the odds ratio scale. On the risk difference scale, 1:1 appears to be the optimal case-to-control ratio in terms of statistical power. Third, with a target statistical power of 0.8, the minimum detectable NIE is about 0.0020 (risk difference) and 1.0181 (odds ratio), with the optimal case-to-control ratio of 2:1 or 1:1.
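One simple way to read a minimum detectable effect off a simulated power curve is linear interpolation between the two grid points that bracket the target power. The sketch below assumes power is non-decreasing over the grid; the effect sizes and power values in the example are hypothetical and are not results from this chapter.

```python
def minimum_detectable_effect(effects, powers, target=0.8):
    """Smallest effect at which the interpolated power curve reaches the target.

    Returns None if the target power is never reached on the grid.
    """
    if powers[0] >= target:
        return effects[0]
    pairs = list(zip(effects, powers))
    for (e0, p0), (e1, p1) in zip(pairs, pairs[1:]):
        if p0 < target <= p1:
            # Linear interpolation between the bracketing grid points.
            return e0 + (target - p0) * (e1 - e0) / (p1 - p0)
    return None


# Hypothetical power curve on the risk-difference scale (illustration only).
toy_effects = [0.000, 0.001, 0.002, 0.003]
toy_powers = [0.05, 0.40, 0.70, 0.90]
mde = minimum_detectable_effect(toy_effects, toy_powers)
print(mde)
```

Applying the function separately to the curve of each case-to-control ratio gives one minimum detectable effect per sampling plan, which can then be compared directly.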

For scenario 2 (Table 4.2), the statistical power of the analysis also increases as the true effect (risk difference and odds ratio) gets larger. However, we found that the case-to-control ratio of 2:1 has the highest power regardless of the effect measure scale. Under this case-to-control ratio and with a target statistical power of 0.8, the minimum detectable NIE is about 0.002 (risk difference) and 1.025 (odds ratio). The minimum detectable effect (OR) for scenario 2 is slightly larger than that for scenario 1, which could be due to the estimation of the mediator model using the case-control sample in scenario 2.

4.5.2 Comparison between the proposed methods and the HS method

We also compared the (modified) HS method using bootstrap resampling with our proposed methods (weighting estimators I and II) using robust SEs under five different effect sizes for each scenario (Table 4.3).

[Figure: statistical power (vertical axis, 0.00 to 1.00) plotted against the true NIE on the Risk Difference scale for two methods: Huber and Solovyeva (bootstrap) and Proposed (robust). Upper panel: axis ticks 0, .0005, .0009, .002 and .0036. Lower panel: axis ticks 0, .002, .006, .010 and .012.]

Table 4.3: Power analysis comparison for natural indirect eﬀects for scenario 1 in which the

primary case-control outcome is the mediator (upper panel) and scenario 2 in which the

secondary outcome is the mediator (lower panel) in risk diﬀerence scale

We found that for scenario 1, the HS method and the proposed weighting estimator I have almost the same power across the five effect sizes (risk difference). Both methods have an inflated type-I error rate (0.057 for the HS method and 0.063 for weighting estimator I). For scenario 2, the modified HS method has slightly higher, but overall similar, power compared with the proposed weighting estimator II (risk difference). Again, both methods have an inflated type-I error rate (0.059 for the modified HS method and 0.062 for weighting estimator II). However, the HS method relies on bootstrap resampling, which is computationally expensive. Our proposed MSM-based approach dramatically reduces the amount of computation, at the cost of only a small difference in the minimum detectable effect. The computational burden may not be an issue when analyzing a single real-world dataset, but it is costly when designing a study, especially when researchers need to explore different scenarios such as the presence of exposure-mediator interaction, sensitivity analyses, etc. Therefore, our proposed method is preferable in such situations.
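When reading the type-I error rates reported above, it can help to keep the Monte Carlo error of a rate estimated from R = 1,000 runs in mind. The sketch below computes the binomial standard error at the nominal 0.05 level and expresses each reported rate as a distance from 0.05 in those units; it is a rough back-of-the-envelope check under a binomial assumption, not a formal test.

```python
import math


def mc_standard_error(rate, n_runs):
    """Binomial standard error of a proportion estimated from n_runs independent runs."""
    return math.sqrt(rate * (1.0 - rate) / n_runs)


nominal, n_runs = 0.05, 1000
se_nominal = mc_standard_error(nominal, n_runs)

# Type-I error rates reported in the text.
reported = {
    "HS, scenario 1": 0.057,
    "Weighting estimator I": 0.063,
    "Modified HS, scenario 2": 0.059,
    "Weighting estimator II": 0.062,
}
distances = {name: (rate - nominal) / se_nominal for name, rate in reported.items()}
for name, distance in distances.items():
    print(name, round(distance, 2))
```

All four rates lie within about two Monte Carlo standard errors of 0.05, so part of the excess over the nominal level may reflect simulation noise.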

4.6 Conclusion

In Chapter 4, we discussed the optimal case-to-control sampling ratio and the minimum detectable effect size for the natural indirect effect in complex survey data from a case-control sample drawn from the target population (e.g., a cohort) in the presence of secondary (and tertiary) outcomes. We first showed results using robust standard errors estimated from the weighted marginal structural models of our proposed methods (weighting estimators I and II) under different presumed effect sizes. Second, we compared results from the bootstrap resampling method of a (modified) existing approach with those of our proposed methods. Overall, using robust standard errors with the proposed methods significantly reduced the simulation time and performed very well when designing studies for mediation analysis.

CHAPTER 5 Conclusions

In this dissertation, we consider the mediating effects, i.e., the CDE and NEs, in case-control studies under two scenarios. In Chapter 2 (scenario 1), we are interested in the mediating effects of an exposure on a secondary outcome through the primary case-control outcome as the mediator, where the exposure and mediator are observed throughout the whole target population (e.g., a cohort) but the secondary outcome is observed only within the case-control sample. In Chapter 3 (scenario 2), we are interested in the mediating effects of an exposure on a tertiary outcome through a secondary outcome (mediator), where both the mediator and the outcome of interest are observed only within the case-control sample. We additionally explore the optimal case-to-control ratio and the minimum detectable effect size on the RD and OR scales for both scenarios.

Using the Neyman-Rubin counterfactual outcome causal framework, we compare two existing methods, i.e., VanderWeele’s MSM (VanderWeele 2009) and Huber and Solovyeva’s weighting approach (Huber & Solovyeva 2020), with our proposed weighting estimators, which extend Hong’s ratio-of-mediator-probability weighting (RMPW) method (Hong 2010, Lange et al. 2012, Hong 2015). When VanderWeele’s MSM approach is applied while ignoring the complex study design, the CDE can be estimated without bias, but the estimates of the NEs are seriously biased. When only the secondary outcome is involved, the HS method suffers from two main disadvantages. First, their estimation of the weight involves modeling the treatment as a function of the mediator and covariates, which does not have an immediate substantive interpretation given that the treatment causally precedes rather than succeeds the mediator (Hong 2015). Second, their method relies on bootstrap resampling to estimate the SEs of the estimator, which is computationally expensive, especially at the study design phase. When both the secondary and tertiary outcomes are involved, the HS method cannot be directly applied since the probability of being selected into the case-control sample given the exposure, mediator and covariates is not estimable under this particular study design. After some modification to fit the scenario, their (modified) approach still suffers from the above two disadvantages.

Our proposed method provides a solution for mediating effect estimation under complex study designs: it can estimate both the CDE and NEs without bias under two particular scenarios that are expected to be common in epidemiological studies. By extending Hong’s RMPW approach with the MSM estimation proposed by Lange et al. (Lange et al. 2012), our proposed method overcomes the two disadvantages mentioned above. The estimation of the weight involves the sampling selection process, the mediator model and the propensity score. For the estimation of the SE, the robust/sandwich estimator from the MSM can be easily obtained using standard statistical software.

This dissertation provides several meaningful insights. First, as far as we know, this is one of the first works to study mediation analysis under complex study designs. Second, our proposed method combines the MSM and RMPW methods, which is easy to implement in practice with standard statistical software and requires relatively fewer identification assumptions compared with existing methods (Hong 2015). Third, compared with the only available weighting-based method (Huber & Solovyeva 2020) that can accommodate the two scenarios, our proposed method is more interpretable in the estimation of the weight and makes the SE easier to estimate with the help of the MSM. Fourth, for the natural indirect effect, the proposed method has smaller bias and is more stable when both the secondary and tertiary outcomes exist. Fifth, the proposed method can be easily used to design a mediation analysis study, even with a relatively small target effect size (risk difference and odds ratio). Sixth, we also provide the optimal case-to-control ratio for performing and designing a mediation analysis using the proposed approach, which could further help researchers reduce the cost.

Although our proposed approach has many advantages, this work could still be
extended in a number of directions. First, our discussion primarily focuses on the estimation
of the CDE and NEs, with limited exploration of the SEs. Although a stratified bootstrap
resampling method and the robust SE can be used, weighting-based methods are often
criticized for lower statistical efficiency. Therefore, different bootstrap resampling approaches
and other methods for estimating SEs could be explored to increase statistical
efficiency. Second, other methods under complex study designs (e.g., semi-parametric
estimators) could be developed to examine issues of efficiency and robustness to violations of
the identification assumptions. Third, we have not considered the missing data issue, which is very
common in complex study designs. It might be worthwhile to study how different missingness
mechanisms affect the estimation of mediation effects and what remedies are available.
Additionally, there are several notable observations in our work that might require
further exploration. For scenario 1, the coverage probability of the proposed method in
estimating the natural indirect effect is relatively low (lower than 95%) on the risk difference
scale, and the coverage of the HS method does not converge to 95% as the sample size increases. For
scenario 2, the coverage probability is generally lower than 95% in estimating the natural
indirect effect for both our proposed method and the HS method.

BIBLIOGRAPHY

Albert, J. M. (2012), ‘Distribution-free mediation analysis for nonlinear models with con-

founding’, Epidemiology 23(6), 879–888.

Alwin, D. F. & Hauser, R. M. (1975), ‘The decomposition of eﬀects in path analysis’, Amer-

ican Sociological Review 40(1), 37–47.

Baron, R. M. & Kenny, D. A. (1986), ‘The moderator-mediator variable distinction in social
psychological research: Conceptual, strategic, and statistical considerations’, Journal of
Personality and Social Psychology 51, 1173–1182.

Cai, Y., Qiu, P., Wan, Y., Meng, S. S., Liu, T., Wang, Y., Rao, S. & Kuang, W. (2021),
‘Establishing cut-oﬀ scores for the self-rating AD8 based on education level’, Geriatric
Nursing 42(5), 1093–1098.

Cao, Z., Yang, A., D’Aloisio, A. A., Suarez, L., Deming-Halverson, S., Li, C., Luo, Z., Pinto,
J. M., Werder, E. J., Sandler, D. P. & Chen, H. (2022), ‘Assessment of self-reported sense
of smell, objective testing, and associated factors in middle-aged and older women’,
JAMA Otolaryngology– Head & Neck Surgery 148(5), 408–417.

Chen, Z., Xie, H., Yao, L. & Wei, Y. (2021), ‘Olfactory impairment and the risk of cognitive
decline and dementia in older adults: A meta-analysis’, Brazilian Journal of Otorhino-
laryngology 87(1), 94–102.

Chernick, M. R. & LaBudde, R. A. (2011), An Introduction to Bootstrap Methods with

Applications to R, 1st edn, Wiley, Hoboken, N.J.

Coﬀman, D. L., Schuler, M. S., Nguyen, T. Q. & McCaﬀrey, D. F. (2023), Weighting esti-
mators for causal mediation, in ‘Handbook of Matching and Weighting Adjustments for
Causal Inference’, CRC.

Didelez, V., Kreiner, S. & Keiding, N. (2010), ‘Graphical models for inference under outcome-

dependent sampling’, Statistical Science 25(3), 368–387.

Duncan, O. D. (1966), ‘Path analysis: Sociological examples’, American Journal of Sociology

72(1), 1–16.

Efron, B. (1982), The bootstrap, in ‘The Jackknife, the Bootstrap and Other Resampling

Plans’, Society for Industrial and Applied Mathematics, pp. 27–36.

Fann, J. R., Ribe, A. R., Pedersen, H. S., Fenger-Grøn, M., Christensen, J., Benros, M. E. &
Vestergaard, M. (2018), ‘Long-term risk of dementia among people with traumatic brain
injury in Denmark: a population-based observational cohort study’, The Lancet Psychiatry
5(5), 424–431.

Finney, J. M. (1972), ‘Indirect eﬀects in path analysis’, Sociological Methods & Research

1(2), 175–186.


Fritz, M. S. & MacKinnon, D. P. (2007), ‘Required sample size to detect the mediated eﬀect’,

Psychological Science 18(3), 233–239.

Galvin, J. E., Roe, C. M., Xiong, C. & Morris, J. C. (2006), ‘Validity and reliability of the

AD8 informant interview in dementia’, Neurology 67(11), 1942–1948.

Greenland, S., Pearl, J. & Robins, J. M. (1999), ‘Causal diagrams for epidemiologic research’,

Epidemiology 10(1), 37–48.

Gu, D., Ou, S. & Liu, G. (2022), ‘Traumatic brain injury and risk of dementia and Alzheimer’s

disease: A systematic review and meta-analysis’, Neuroepidemiology 56(1), 4–16.

Amorim, G., Scott, A. J. & Wild, C. J. (2017), Multi-phase sampling, in ‘Handbook
of Statistical Methods for Case-Control Studies’, Chapman and Hall/CRC.

Hafdahl, A. R. (2010), ‘Random-eﬀects meta-analysis of correlations: Monte carlo eval-
uation of mean estimators’, British Journal of Mathematical and Statistical Psychology
63(1), 227–254.

Han, J., Colditz, G. A. & Hunter, D. J. (2006), ‘Risk factors for skin cancers: A nested case-
control study within the Nurses’ Health Study’, International Journal of Epidemiology
35(6), 1514–1521.

Heckman, J. J. & Vytlacil, E. J. (2007), Econometric evaluation of social programs, part i:
Causal models, structural models and econometric policy evaluation, in J. J. Heckman &
E. E. Leamer, eds, ‘Handbook of Econometrics’, Vol. 6, Elsevier, pp. 4779–4874.

Hernan, M. A. & Robins, J. M. (2020), Causal Inference: What If, CRC, Boca Raton, FL.

Holland, P. W. (1988), ‘Causal inference, path analysis, and recursive structural equations

models’, Sociological Methodology 18, 449–484.

Hong, G. (2010), ‘Ratio of mediator probability weighting for estimating natural direct and

indirect eﬀects’, p. 15.

Hong, G. (2015), Causality in a Social World: Moderation, Mediation and Spill-over, Wiley,

Chichester, England.

Hong, G., Deutsch, J. & Hill, H. D. (2015), ‘Ratio-of-mediator-probability weighting for
causal mediation analysis in the presence of treatment-by-mediator interaction’, Journal
of Educational and Behavioral Statistics 40(3), 307–340.

Hong, G. & Nomi, T. (2012), ‘Weighting methods for assessing policy eﬀects mediated by

peer change’, Journal of Research on Educational Eﬀectiveness 5(3), 261–289.

Howell, J., Costanzo, R. M. & Reiter, E. R. (2018), ‘Head trauma and olfactory function’,

World Journal of Otorhinolaryngology - Head and Neck Surgery 4(1), 39–45.


Huber, M. (2014), ‘Identifying causal mechanisms (primarily) based on inverse probability

weighting’, Journal of Applied Econometrics 29(6), 920–943.

Huber, M. & Solovyeva, A. (2020), ‘Direct and indirect eﬀects under sample selection and

outcome attrition’, Econometrics 8(4), 44.

Imai, K., Keele, L. & Tingley, D. (2010), ‘A general approach to causal mediation analysis’,

Psychological Methods 15(4), 309–334.

Imai, K., Keele, L. & Yamamoto, T. (2010), ‘Identiﬁcation, inference and sensitivity analysis

for causal mediation eﬀects’, Statistical Science 25(1).

Imbens, G. W. & Rubin, D. B. (2015), Causal Inference for Statistics, Social, and Biomedical

Sciences: An Introduction, Cambridge University Press, Cambridge, UK.

Judd, C. M. & Kenny, D. A. (1981), ‘Process analysis: Estimating mediation in treatment

evaluations’, Evaluation Review 5, 602–619.

Kim, R. S. & Kaplan, R. C. (2014), ‘Analysis of secondary outcomes in nested case-control

study designs’, Statistics in Medicine 33(24), 4215–4226.

Lange, T., Vansteelandt, S. & Bekaert, M. (2012), ‘A simple uniﬁed approach for estimating
natural direct and indirect eﬀects’, American Journal of Epidemiology 176(3), 190–195.

Lipsky, A. M. & Greenland, S. (2022), ‘Causal directed acyclic graphs’, JAMA
327(11), 1083–1084.

MacKinnon, D. (2007), Introduction to Statistical Mediation Analysis, Routledge, New York,

NY.

Mackinnon, D. P. & Dwyer, J. H. (1993), ‘Estimating mediated eﬀects in prevention studies’,

Evaluation Review 17(2), 144–158.

MacKinnon, D. P., Lockwood, C. M., Brown, C. H., Wang, W. & Hoﬀman, J. M. (2007), ‘The
intermediate endpoint eﬀect in logistic and probit regression’, Clinical Trials 4(5), 499–
513.

Morris, T. P., White, I. R. & Crowther, M. J. (2019), ‘Using simulation studies to evaluate

statistical methods’, Statistics in Medicine 38(11), 2074–2102.

Neyman, J. (1938), ‘Contribution to the theory of sampling human populations’, Journal of

the American Statistical Association 33(201), 101–116.

Nguyen, T. Q., Ogburn, E. L., Schmid, I., Sarker, E. B., Greifer, N., Koning, I. M. &
Stuart, E. A. (2023), ‘Causal mediation analysis: from simple to more robust strategies
for estimation of marginal natural (in)direct eﬀects’, Statistics Surveys 17.

Pearl, J. (2001), Direct and indirect effects, in M. Kaufmann, ed., ‘Proceedings of the
Seventeenth Conference on Uncertainty in Artificial Intelligence’, San Francisco, CA, p. 10.


Pearl, J. (2009), Causality, Cambridge University Press, Cambridge, UK.

Petersen, M. L., Sinisi, S. E. & van der Laan, M. J. (2006), ‘Estimation of direct causal

eﬀects’, Epidemiology 17(3), 276–284.

Preacher, K. J., Rucker, D. D. & Hayes, A. F. (2007), ‘Addressing moderated mediation
hypotheses: Theory, methods, and prescriptions’, Multivariate Behavioral Research
42(1), 185–227.

Richiardi, L., Bellocco, R. & Zugna, D. (2013), ‘Mediation analysis in epidemiology: meth-
ods, interpretation and bias’, International Journal of Epidemiology 42(5), 1511–1519.

Rijnhart, J. J. M., Lamp, S. J., Valente, M. J., MacKinnon, D. P., Twisk, J. W. R. &
Heymans, M. W. (2021), ‘Mediation analysis methods used in observational research: a
scoping review and recommendations’, BMC Medical Research Methodology 21, 226.

Robins, J. M. & Greenland, S. (1992), ‘Identiﬁability and exchangeability for direct and

indirect eﬀects’, Epidemiology 3(2), 143–155.

Robins, J. M., Hernán, M. Á. & Brumback, B. (2000), ‘Marginal structural models and

causal inference in epidemiology’, Epidemiology 11(5), 550–560.

Rubin, D. B. (1974), ‘Estimating causal eﬀects of treatments in randomized and nonran-

domized studies’, Journal of Educational Psychology 66, 688–701.

Rubin, D. B. (1978), ‘Bayesian inference for causal eﬀects: The role of randomization’, The

Annals of Statistics 6(1), 34–58.

Rubin, D. B. (1980), ‘Randomization analysis of experimental data: The Fisher randomization
test comment’, Journal of the American Statistical Association 75(371), 591–593.

Rubin, D. B. (1986), ‘Statistics and causal inference: comment: which ifs have causal an-

swers’, Journal of the American Statistical Association 81(396), 961–962.

Rudolph, K. E., Goin, D. E., Paksarian, D., Crowder, R., Merikangas, K. R. & Stuart,
E. A. (2019), ‘Causal mediation analysis with observational data: considerations and
illustration examining mechanisms linking neighborhood poverty to adolescent substance
use’, American Journal of Epidemiology 188(3), 598–608.

Sahiner, B., Chan, H.-P. & Hadjiiski, L. (2008), ‘Classifier performance prediction for
computer-aided diagnosis using a limited dataset’, Medical Physics 35(4), 1559–1570.

Sandler, D. P., Hodgson, M. E., Deming-Halverson, S. L., Juras, P. S., D’Aloisio, A. A.,
Suarez, L. M., Kleeberger, C. A., Shore, D. L., DeRoo, L. A., Taylor, J. A., Weinberg,
C. R. & Sister Study Research Team (2017), ‘The sister study cohort: Baseline methods
and participant characteristics’, Environmental Health Perspectives 125(12), 127003.

Satten, G. A., Curtis, S. W., Solis-Lemus, C., Leslie, E. J. & Epstein, M. P. (2022), ‘Eﬃcient
estimation of indirect eﬀects in case-control studies using a uniﬁed likelihood framework’,
Statistics in Medicine 41(15), 2879–2893.


Schifano, E. D. (2019), ‘A review of analysis methods for secondary outcomes in case-control

studies’, Communications for Statistical Applications and Methods 26(2), 103–129.

Sobel, M. E. (1982), ‘Asymptotic conﬁdence intervals for indirect eﬀects in structural equa-

tion models’, Sociological Methodology 13, 290.

Splawa-Neyman, J., Dabrowska, D. M. & Speed, T. P. (1923), ‘On the application of prob-

ability theory to agricultural experiments’, Statistical Science 5(4), 465–472.

Splawa-Neyman, J., Dabrowska, D. M. & Speed, T. P. (1990), ‘On the application of prob-

ability theory to agricultural experiments’, Statistical Science 5(4), 465–472.

Stanciu, I., Larsson, M., Nordin, S., Adolfsson, R., Nilsson, L.-G. & Olofsson, J. K. (2014),
‘Olfactory impairment and subjective olfactory complaints independently predict conver-
sion to dementia: a longitudinal, population-based study’, Journal of the International
Neuropsychological Society 20(2), 209–217.

Tchetgen, E. J. T. & Shpitser, I. (2012), ‘Semiparametric theory for causal mediation analy-
sis: Eﬃciency bounds, multiple robustness and sensitivity analysis’, The Annals of Statis-
tics 40(3), 1816–1845.

Tchetgen Tchetgen, E. J. (2013), ‘Inverse odds ratio-weighted estimation for causal media-

tion analysis’, Statistics in Medicine 32(26), 4567–4580.

Ten Have, T. R. & Joﬀe, M. M. (2012), ‘A review of causal estimation of eﬀects in mediation

analyses’, Statistical Methods in Medical Research 21(1), 77–107.

Valeri, L. & Vanderweele, T. J. (2013), ‘Mediation analysis allowing for exposure-mediator
interactions and causal interpretation: Theoretical assumptions and implementation with
SAS and SPSS macros’, Psychological Methods 18(2), 137–150.

VanderWeele, T. J. (2009), ‘Marginal structural models for the estimation of direct and

indirect eﬀects’, Epidemiology 20(1), 18–26.

VanderWeele, T. J. (2015), Explanation in Causal Inference: Methods for Mediation and

Interaction, Oxford University Press.

VanderWeele, T. J. (2020), ‘Invited commentary: Frontiers of power assessment in mediation

analysis’, American Journal of Epidemiology 189(12), 1568–1570.

VanderWeele, T. J. & Tchetgen Tchetgen, E. J. (2016), ‘Mediation analysis with matched

case-control study designs’, American Journal of Epidemiology 183(9), 869–870.

VanderWeele, T. J. & Vansteelandt, S. (2010), ‘Odds ratios for mediation analysis for a

dichotomous outcome’, American Journal of Epidemiology 172(12), 1339–1348.

VanderWeele, T. J., Vansteelandt, S. & Robins, J. M. (2014), ‘Eﬀect decomposition in the
presence of an exposure-induced mediator-outcome confounder’, Epidemiology 25(2), 300–
306.


Vansteelandt, S., Bekaert, M. & Lange, T. (2012), ‘Imputation strategies for the estimation

of natural direct and indirect eﬀects’, Epidemiologic Methods 1(1), 131–158.

Vittinghoﬀ, E., Sen, S. & McCulloch, C. E. (2009), ‘Sample size calculations for evaluating

mediation’, Statistics in Medicine 28(4), 541–557.

Wang, J., Ning, J. & Shete, S. (2019), ‘Mediation analysis in a case-control study when the

mediator is a censored variable’, Statistics in Medicine 38(7), 1213–1229.

Weedon, M. N., Lettre, G., Freathy, R. M., Lindgren, C. M., Voight, B. F., Perry, J. R. B.,
Elliott, K. S., Hackett, R., Guiducci, C., Shields, B., Zeggini, E., Lango, H., Lyssenko, V.,
Timpson, N. J., Burtt, N. P., Rayner, N. W., Saxena, R., Ardlie, K., Tobias, J. H., Ness,
A. R., Ring, S. M., Palmer, C. N. A., Morris, A. D., Peltonen, L., Salomaa, V., Smith,
G. D., Groop, L. C., Hattersley, A. T., McCarthy, M. I., Hirschhorn, J. N. & Frayling,
T. M. (2007), ‘A common variant of HMGA2 is associated with adult and childhood height
in the general population’, Nature Genetics 39(10), 1245–1250.

Weuve, J., Korrick, S. A., Weisskopf, M. A., Ryan, L. M., Schwartz, J., Nie, H., Grodstein,
F. & Hu, H. (2009), ‘Cumulative exposure to lead in relation to cognitive function in older
women’, Environmental Health Perspectives 117(4), 574–580.

White, J. E. (1982), ‘A two stage design for the study of the relationship between a rare

exposure and a rare disease’, American Journal of Epidemiology 115(1), 119–128.

Whittemore, A. S. & Halpern, J. (1997), ‘Multi-stage sampling in genetic epidemiology’,

Statistics in Medicine 16(2), 153–167.

Wright, S. (1920), ‘The relative importance of heredity and environment in determining the
piebald pattern of guinea-pigs’, Proceedings of the National Academy of Sciences 6(6), 320–
332.

Wright, S. (1934), ‘The method of path coeﬃcients’, The Annals of Mathematical Statistics

5(3), 161–215.

Xydakis, M. S., Mulligan, L. P., Smith, A. B., Olsen, C. H., Lyon, D. M. & Belluscio, L.
(2015), ‘Olfactory impairment and traumatic brain injury in blast-injured combat troops:
a cohort study’, Neurology 84(15), 1559–1567.

Zhang, Z. (2014), ‘Monte carlo based statistical power analysis for mediation models: Meth-

ods and software’, Behavior Research Methods 46(4), 1184–1198.


APPENDIX A PROOFS OF THEOREMS

A.1 Proof of Theorem 1

For the controlled direct effects, we need to recover the mean of the counterfactual outcome
$Y^{xz}$. By recovering we mean using observed data to identify the target estimand.

\begin{align}
E[Y^{xz}]
&= E_{C_1,C_2}\left[E(Y^{xz} \mid C_1, C_2)\right] \tag{A.1} \\
&= E_{C_1,C_2}\left[E(Y^{xz} \mid C_1, C_2, X = x, Z = z)\right] \tag{A.2} \\
&= E_{C_1,C_2}\left[E(Y \mid C_1, C_2, X = x, Z = z)\right] \tag{A.3} \\
&= E_{C_1,C_2}\left[E(Y \mid C_1, C_2, X = x, Z = z, S = 1)\right] \tag{A.4} \\
&= \int_y \int_{c_1,c_2} y\,P(Y = y \mid c_1, c_2, x, z, S = 1)\,f(c_1, c_2)\,dc_1\,dc_2\,dy \tag{A.5} \\
&= \int_y \int_{c_1,c_2} y\,P(Y = y \mid c_1, c_2, x, z, S = 1)\,
   \frac{f(c_1, c_2 \mid x, z, S = 1)}{f(c_1, c_2 \mid x, z, S = 1)}\,f(c_1, c_2)\,dc_1\,dc_2\,dy \tag{A.6} \\
&= \int_y \int_{c_1,c_2} y\,P(Y = y \mid c_1, c_2, x, z, S = 1)\,f(c_1, c_2 \mid x, z, S = 1)\,
   \frac{P(x, z, S = 1)}{P(x, z, S = 1 \mid c_1, c_2)\,f(c_1, c_2)}\,f(c_1, c_2)\,dc_1\,dc_2\,dy \tag{A.7} \\
&= \int_y \int_{c_1,c_2} y\,P(Y = y \mid c_1, c_2, x, z, S = 1)\,f(c_1, c_2 \mid x, z, S = 1)\,
   \frac{P(S = 1 \mid x, z)\,P(x, z)}{P(S = 1 \mid x, z, c_1, c_2)\,P(x, z \mid c_1, c_2)}\,dc_1\,dc_2\,dy \tag{A.8} \\
&= \int_y \int_{c_1,c_2} y\,P(Y = y \mid c_1, c_2, x, z, S = 1)\,f(c_1, c_2 \mid x, z, S = 1)\,
   \frac{P(S = 1 \mid z)\,P(x, z)}{P(S = 1 \mid z)\,P(x, z \mid c_1, c_2)}\,dc_1\,dc_2\,dy \tag{A.9} \\
&= \int_y \int_{c_1,c_2} y\,P(Y = y \mid c_1, c_2, x, z, S = 1)\,f(c_1, c_2 \mid x, z, S = 1)\,
   \frac{P(z \mid x)\,P(x)}{P(z \mid x, c_1, c_2)\,P(x \mid c_1, c_2)}\,dc_1\,dc_2\,dy \tag{A.10} \\
&= \int_y \int_{c_1,c_2} y\,P(Y = y \mid c_1, c_2, x, z, S = 1)\,f(c_1, c_2 \mid x, z, S = 1)\,
   \frac{P(z \mid x)\,P(x)}{P(z \mid x, c_1, c_2)\,P(x \mid c_1)}\,dc_1\,dc_2\,dy \tag{A.11} \\
&= E\left[w_1 Y \mid X = x, Z = z, S = 1\right] \tag{A.12}
\end{align}

where $w_1 = \dfrac{P(z \mid x)\,P(x)}{P(z \mid c_1, c_2, x)\,P(x \mid c_1)}$. For simplicity, the integral symbol is used throughout; it represents
summation when the expectation is taken over non-continuous variables (the same applies to the remaining proofs
in the Appendix). Equation (A.1) follows from the law of iterated expectations. Equation (A.2)
follows from the assumption $Y^{xz} \perp\!\!\!\perp (X, Z) \mid (C_1, C_2)$. Equation (A.3) follows from
consistency. Equation (A.4) follows from the assumption $Y \perp\!\!\!\perp S \mid (X, Z, C_1, C_2)$. Equation
(A.5) follows from the definition of expectation. Equation (A.6) is (A.5) multiplied by unity. Equations (A.7)-(A.8)
follow from the laws of probability. Equation (A.9) follows under the assumptions
$S \perp\!\!\!\perp X \mid Z$ and $S \perp\!\!\!\perp (X, C_1, C_2) \mid Z$. Equations (A.10)-(A.12) follow from the laws of
probability, the assumption $X \perp\!\!\!\perp C_2 \mid C_1$ (used in (A.11)) and the definition of conditional expectation.
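The identification result suggests a plug-in estimate: compute $w_1$ for each selected subject with $X = x$ and $Z = z$, then average $w_1 Y$ over those subjects. The sketch below shows that arithmetic only. The function names and the probability values are hypothetical; in practice each probability would come from a fitted model, which is not shown here.

```python
def cde_weight(p_z_given_x, p_x, p_z_given_x_c, p_x_given_c1):
    """w1 = P(z|x) P(x) / [P(z|c1,c2,x) P(x|c1)] for one subject."""
    return (p_z_given_x * p_x) / (p_z_given_x_c * p_x_given_c1)


def weighted_outcome_mean(y, w):
    """Sample analogue of E[w Y | X=x, Z=z, S=1]: the average of w_i * y_i
    over selected subjects in the stratum X=x, Z=z."""
    return sum(wi * yi for wi, yi in zip(w, y)) / len(y)


# Two hypothetical subjects in the stratum (X=x, Z=z, S=1).
w_a = cde_weight(p_z_given_x=0.3, p_x=0.5, p_z_given_x_c=0.6, p_x_given_c1=0.25)
w_b = cde_weight(p_z_given_x=0.3, p_x=0.5, p_z_given_x_c=0.2, p_x_given_c1=0.5)
estimate = weighted_outcome_mean([1, 0], [w_a, w_b])
print(w_a, w_b, estimate)
```

The marginal terms in the numerator act as stabilizers, so a subject whose covariates carry no information about $X$ or $Z$ receives a weight of one.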

Next, we show the nonparametric identification of $E[Y^{xZ^{x}}]$ and $E[Y^{xZ^{x^*}}]$ so that the
natural (in)direct effects are identified.

\begin{align}
E[Y^{xZ^{x}}]
&= E_{C_1,C_2}\left[E(Y^{xZ^{x}} \mid C_1, C_2)\right] \tag{A.13} \\
&= E_{C_1,C_2}\left[E_{Z^{x} \mid C_1,C_2}\left\{E(Y^{xZ^{x}} \mid C_1, C_2, Z^{x}) \mid C_1, C_2\right\}\right] \tag{A.14} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y^{xZ^{x}} = y \mid c_1, c_2, Z^{x} = z)\,P(Z^{x} = z \mid c_1, c_2)\,f(c_1, c_2)\,dz\,dc_1\,dc_2\,dy \tag{A.15} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y^{xZ^{x}} = y \mid c_1, c_2, Z^{x} = z, X = x)\,P(Z^{x} = z \mid c_1, c_2)\,f(c_1, c_2)\,dz\,dc_1\,dc_2\,dy \tag{A.16} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y^{xZ^{x}} = y \mid c_1, c_2, Z^{x} = z, X = x)\,P(Z^{x} = z \mid X = x, c_1, c_2)\,f(c_1, c_2)\,dz\,dc_1\,dc_2\,dy \tag{A.17} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y = y \mid c_1, c_2, Z = z, X = x)\,f(z \mid x, c_1, c_2)\,f(c_1, c_2)\,dz\,dc_1\,dc_2\,dy \tag{A.18} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y = y \mid c_1, c_2, Z = z, X = x, S = 1)\,f(z \mid x, c_1, c_2)\,f(c_1, c_2)\,dz\,dc_1\,dc_2\,dy \tag{A.19} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y = y \mid c_1, c_2, Z = z, X = x, S = 1)\,f(z, c_1, c_2 \mid x, S = 1)\,
   \frac{f(z \mid x, c_1, c_2)\,f(c_1, c_2)}{f(z, c_1, c_2 \mid x, S = 1)}\,dz\,dc_1\,dc_2\,dy \tag{A.20} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y = y \mid c_1, c_2, Z = z, X = x, S = 1)\,f(z, c_1, c_2 \mid x, S = 1)\,
   \frac{f(z \mid x, c_1, c_2)\,f(c_1, c_2)\,P(S = 1 \mid x)\,P(X = x)}{P(S = 1 \mid z, c_1, c_2, x)\,P(X = x \mid z, c_1, c_2)\,p(z \mid c_1, c_2)\,f(c_1, c_2)}\,dz\,dc_1\,dc_2\,dy \tag{A.21} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y = y \mid c_1, c_2, Z = z, X = x, S = 1)\,f(z, c_1, c_2 \mid x, S = 1)\,
   \frac{P(X = x \mid z, c_1, c_2)\,f(z, c_1, c_2)\,P(S = 1 \mid x)\,P(X = x)}{f(x, c_1, c_2)\,P(S = 1 \mid z)\,P(X = x \mid z, c_1, c_2)\,p(z \mid c_1, c_2)}\,dz\,dc_1\,dc_2\,dy \tag{A.22} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y = y \mid c_1, c_2, Z = z, X = x, S = 1)\,f(z, c_1, c_2 \mid x, S = 1)\,
   \frac{P(S = 1 \mid x)\,P(X = x)}{P(S = 1 \mid z)\,P(X = x \mid c_1, c_2)}\,dz\,dc_1\,dc_2\,dy \tag{A.23} \\
&= \int_y \int_{c_1,c_2} \int_z y\,P(Y = y \mid c_1, c_2, Z = z, X = x, S = 1)\,f(z, c_1, c_2 \mid x, S = 1)\,
   \frac{P(S = 1 \mid x)\,P(X = x)}{P(S = 1 \mid z)\,P(X = x \mid c_1)}\,dz\,dc_1\,dc_2\,dy \tag{A.24} \\
&= E\left[w_2 Y \mid X = x, S = 1\right] \tag{A.25}
\end{align}

where $w_2 = \dfrac{P(S = 1 \mid X = x)\,P(X = x)}{P(S = 1 \mid Z = z)\,P(X = x \mid c_1)}$. Equations (A.13)-(A.14) follow from the law of iterated
expectations. Equation (A.15) follows by definition. Equation (A.16) follows from the
assumption $Y^{xZ^{x}} \perp\!\!\!\perp X \mid (Z^{x}, C_1, C_2)$. Equation (A.17) follows from the assumption
$Z^{x} \perp\!\!\!\perp X \mid (C_1, C_2)$. Equation (A.18) follows from consistency. Equation (A.19) follows from
the assumption $Y \perp\!\!\!\perp S \mid (X, Z, C_1, C_2)$. Equation (A.20) is (A.19) multiplied by unity.
Equation (A.21) follows from the laws of probability. Equation (A.22) follows from
the assumption $S \perp\!\!\!\perp (X, C_1, C_2) \mid Z$. Equation (A.23) follows by cancelling common factors.
Equation (A.24) follows from the assumption $X \perp\!\!\!\perp C_2 \mid C_1$. Equation (A.25) follows from the definition.

The steps for identifying $E[Y^{xZ^{x^*}}]$ are similar.

E[Y xZx∗
(cid:13)

=EC1,C2

(cid:13)

]
E(Y xZx∗

(cid:14)

| C1, C2)
(cid:16)
(cid:15)

=EC1,C2
(cid:2)

(cid:2)

EZx∗ |C1,C2
(cid:2)

E

| C1, C2, Z x∗

)

(cid:17)

(cid:18)(cid:14)

| C1, C2

(A.26)

(A.27)

Y xZx∗

yP (Y xZx∗

yP (Y xZx∗

= y | c1, c2, Z x∗

= z)P (Z x∗

= z | c1, c2)f (c1, c2)dzdc1dc2dy

(A.28)

= y | c1, c2, Z x∗

= z, X = x)P (Z x∗

= z | c1, c2)f (c1, c2)dzdc1dc2dy

yP (Y xZx∗

= y | c1, c2, Z x∗

= z, X = x)P (Z x∗

= z | X = x∗, c1, c2)

f (c1, c2)dzdc1dc2dy
(cid:2)
(cid:2)

(cid:2)

yP (Y xz = y | c1, c2, Z x∗

= z, X = x)P (Z = z | X = x∗, c1, c2)

y

c1,c2

z

f (c1, c2)dzdc1dc2dy
(cid:2)
(cid:2)

(cid:2)

yP (Y xz = y | c1, c2, X = x)P (Z = z | X = x∗, c1, c2)f (c1, c2)dzdc1dc2dy

yP (Y xz = y | c1, c2, X = x, Z = z)P (Z = z | X = x∗, c1, c2)f (c1, c2)dzdc1dc2dy

yP (Y = y | c1, c2, X = x, Z = z)P (Z = z | X = x∗, c1, c2)f (c1, c2)dzdc1dc2dy

(A.33)

=

=

=

=

=

=

=

=

y

(cid:2)

(cid:2)

c1,c2

z

(cid:2)

y

c1,c2

z

(cid:2)

(cid:2)

(cid:2)

y

c1,c2

z

y

c1,c2

z

(cid:2)

(cid:2)

(cid:2)

y

c1,c2

z

(cid:2)

(cid:2)

(cid:2)

y

c1,c2

z

(cid:2)

(cid:2)

(cid:2)

y

c1,c2

z

yP (Y = y | c1, c2, X = x, Z = z, S = 1)P (Z = z | X = x∗, c1, c2)

f (c1, c2)dzdc1dc2dy
(cid:2)
(cid:2)

(cid:2)

=

y

yP (Y = y | c1, c2, X = x, Z = z, S = 1)f (z, c1, c2 | X = x, S = 1)

z

c1,c2
P (Z = z | X = x∗, c1, c2)f (c1, c2)
f (z, c1, c2 | X = x, S = 1)

dzdc1dc2dy

94

(A.29)

(A.30)

(A.31)

(A.32)

(A.34)

(A.35)

(A.36)

(cid:2)

(cid:2)

(cid:2)

=

y

(cid:2)

(cid:2)

=

yP (Y = y | c1, c2, X = x, Z = z, S = 1)P (Z = z | X = x, S = 1)

z

c1,c2
P (Z = z | X = x∗, c1, c2)f (c1, c2)P (X = x, S = 1)
P (Z = z, X = x, S = 1, c1, c2)

(cid:2)

dzdc1dc2dy

(A.37)

yP (Y = y | c1, c2, X = x, Z = z, S = 1)P (Z = z | X = x, S = 1)

y

c1,c2

z

P (Z = z | X = x∗, c1, c2)f (c1, c2)P (S = 1 | X = x)P (X = x)
P (S = 1 | Z = z, c1, c2, x)P (Z = z | X = x, c1, c2)P (X = x | c1, c2)f (c1, c2)

dzdc1dc2dy

(cid:2)

(cid:2)

(cid:2)

y

c1,c2

z

yP (Y = y | c1, c2, X = x, Z = z, S = 1)P (Z = z | X = x, S = 1)

(A.38)

P (Z = z | X = x∗, c1, c2)P (S = 1 | X = x)P (X = x)
P (S = 1 | Z = z)P (Z = z | X = x, c1, c2)P (X = x | c1, c2)

(cid:2)

(cid:2)

(cid:2)

dzdc1dc2dy

(A.39)

yP (Y = y | c1, c2, X = x, Z = z, S = 1)P (Z = z | X = x, S = 1)

y

c1,c2

z

P (Z = z | X = x∗, c1, c2)P (S = 1 | X = x)P (X = x)
P (S = 1 | Z = z)P (Z = z | X = x, c1, c2)P (X = x | c1)

dzdc1dc2dy

=

=

=E [w3Y | X = x, S = 1]

(A.40)

(A.41)

where $w_3 = \dfrac{P(S = 1 \mid X = x)\,P(Z = z \mid X = x^*, c_1, c_2)\,P(X = x)}{P(S = 1 \mid Z = z)\,P(Z = z \mid X = x, c_1, c_2)\,P(X = x \mid c_1)}$. Equations (A.26)-(A.28) follow from the law of
iterated expectations. Equation (A.29) follows from the assumption $Y^{xZ^{x^*}} \perp\!\!\!\perp X \mid (Z^{x^*}, C_1, C_2)$.
Equation (A.30) follows from the assumption $Z^{x^*} \perp\!\!\!\perp X \mid (C_1, C_2)$. Equation (A.31) follows
from consistency. Equation (A.32) follows from the assumption $Y^{xz} \perp\!\!\!\perp Z^{x^*} \mid (X, C_1, C_2)$.
Equation (A.33) follows from the assumption $Y^{xz} \perp\!\!\!\perp Z \mid (X, C_1, C_2)$. Equation (A.34) follows
from consistency. Equation (A.35) follows from the assumption $Y \perp\!\!\!\perp S \mid (X, Z, C_1, C_2)$.
Equation (A.36) is (A.35) multiplied by unity. Equations (A.37)-(A.38) follow from the laws of
probability. Equation (A.39) follows from the assumption $S \perp\!\!\!\perp (X, C_1, C_2) \mid Z$. Equation
(A.40) follows from the assumption $X \perp\!\!\!\perp C_2 \mid C_1$. Equation (A.41) follows from the definition.
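Comparing the two weights, $w_3$ equals $w_2$ multiplied by the ratio of mediator probabilities $P(Z = z \mid X = x^*, c_1, c_2) / P(Z = z \mid X = x, c_1, c_2)$, which is where the RMPW name comes from. The sketch below makes that factorization explicit. The function names and numeric inputs are hypothetical, and the probabilities are assumed to have been estimated already.

```python
def selection_ip_weight(p_s_given_x, p_x, p_s_given_z, p_x_given_c1):
    """w2 = P(S=1|x) P(x) / [P(S=1|z) P(x|c1)]."""
    return (p_s_given_x * p_x) / (p_s_given_z * p_x_given_c1)


def rmpw_weight(p_s_given_x, p_x, p_s_given_z, p_x_given_c1,
                p_z_given_xstar_c, p_z_given_x_c):
    """w3 = w2 * P(z|x*,c1,c2) / P(z|x,c1,c2)."""
    w2 = selection_ip_weight(p_s_given_x, p_x, p_s_given_z, p_x_given_c1)
    return w2 * p_z_given_xstar_c / p_z_given_x_c


# Hypothetical fitted probabilities for one selected subject with X = x.
w2 = selection_ip_weight(0.2, 0.5, 0.4, 0.25)
w3 = rmpw_weight(0.2, 0.5, 0.4, 0.25, p_z_given_xstar_c=0.1, p_z_given_x_c=0.2)
print(w2, w3)
```

When $x^* = x$ the mediator ratio is one and the function returns $w_2$, matching the remark that the two derivations follow similar steps.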


A.2 Proof of Theorem 2

\begin{align}
E[Y^{xm}]
&= \int_y \int_c y\,P(Y^{xm} = y \mid c)\,f(c)\,dc\,dy \tag{A.42} \\
&= \int_y \int_c y\,P(Y^{xm} = y \mid c, X = x, M = m)\,f(c)\,dc\,dy \tag{A.43} \\
&= \int_y \int_c \int_z y\,P(Y = y \mid c, X = x, M = m, Z = z)\,f(z \mid c, x, m)\,f(c)\,dz\,dc\,dy \tag{A.44} \\
&= \int_y \int_c \int_z y\,P(Y = y \mid c, X = x, M = m, Z = z, S = 1)\,f(z \mid c, x, m)\,f(c)\,dz\,dc\,dy \tag{A.45} \\
&= \int_y \int_c \int_z y\,P(Y = y \mid c, X = x, M = m, Z = z, S = 1)\,f(z \mid c, x, m, S = 1)\,
   \frac{f(z \mid c, x, m)\,f(c)}{f(z \mid c, x, m, S = 1)}\,dz\,dc\,dy \tag{A.46} \\
&= \int_y \int_c \int_z y\,P(Y = y \mid c, X = x, M = m, Z = z, S = 1)\,f(z \mid c, x, m, S = 1)\,
   \frac{f(c)\,f(c, x, m, S = 1)}{P(S = 1 \mid z, c, x, m)\,f(x, m \mid c)\,f(c)}\,dz\,dc\,dy \tag{A.47} \\
&= \int_y \int_c \int_z y\,P(Y = y \mid c, X = x, M = m, Z = z, S = 1)\,f(z \mid c, x, m, S = 1)\,f(c \mid x, m, S = 1)\,
   \frac{f(x, m, S = 1)}{P(S = 1 \mid z, c, x, m)\,f(x, m \mid c)}\,dz\,dc\,dy \tag{A.48} \\
&= \int_y \int_c \int_z y\,P(Y = y \mid c, X = x, M = m, Z = z, S = 1)\,f(z \mid c, x, m, S = 1)\,f(c \mid x, m, S = 1)\,
   \frac{P(M = m \mid x, S = 1)\,P(S = 1 \mid x)\,P(X = x)}{P(S = 1 \mid z)\,P(M = m \mid x, c)\,P(X = x \mid c)}\,dz\,dc\,dy \tag{A.49} \\
&= \int_y \int_c \int_z y\,P(Y = y \mid c, X = x, M = m, Z = z, S = 1)\,f(z \mid c, x, m, S = 1)\,f(c \mid x, m, S = 1)\,
   \frac{P(M = m \mid x, S = 1)\,P(S = 1 \mid x)\,P(X = x)}{P(M = m \mid x, c_1, c_2, c_3)\,P(S = 1 \mid z)\,P(X = x \mid c_1)}\,dz\,dc\,dy \tag{A.50} \\
&= E\left[w_4 Y \mid X = x, M = m, S = 1\right] \tag{A.51}
\end{align}

where $w_4 = \dfrac{P(M = m \mid x, S = 1)\,P(S = 1 \mid x)\,P(X = x)}{P(M = m \mid x, c_1, c_2, c_3)\,P(S = 1 \mid z)\,P(X = x \mid c_1)}$. Equation (A.42) follows from the law of iterated
expectations. Equation (A.43) follows from the assumption $Y^{xm} \perp\!\!\!\perp (X, M) \mid (C_1, C_2, C_3)$.
Equation (A.44) follows from consistency and basic probability theory. Equation (A.45)
follows from the assumption $Y \perp\!\!\!\perp S \mid (C_1, C_2, C_3, X, Z, M)$. Equations (A.46)-(A.48) follow
from basic probability theory. Equation (A.49) follows from the assumption
$S \perp\!\!\!\perp (X, C_1, C_2, C_3) \mid Z$ and basic probability theory. Equation (A.50) follows from the assumption
$X \perp\!\!\!\perp (C_2, C_3) \mid C_1$ and basic probability theory. Equation (A.51) follows from basic
probability theory.

Since $M$ is only observed in the selected sample, $P(M = m \mid C_1, C_2, C_3, X)$ is usually not
directly estimable from the case-control sample. However, we can take further advantage of the
sampling scheme and estimate it through (A.52) below:

\begin{align}
P(M = m \mid X, C_1, C_2, C_3)
&= \sum_z P(M = m \mid X, C_1, C_2, C_3, Z = z) \cdot P(Z = z \mid X, C_1, C_2, C_3) \notag \\
&= P(M = m \mid X, C_1, C_2, C_3, Z = 0) \cdot P(Z = 0 \mid X, C_1, C_2) \notag \\
&\quad + P(M = m \mid X, C_1, C_2, C_3, Z = 1) \cdot P(Z = 1 \mid X, C_1, C_2) \notag \\
&= P(M = m \mid X, C_1, C_2, C_3, Z = 0, S = 1) \cdot P(Z = 0 \mid X, C_1, C_2) \notag \\
&\quad + P(M = m \mid X, C_1, C_2, C_3, Z = 1, S = 1) \cdot P(Z = 1 \mid X, C_1, C_2) \tag{A.52}
\end{align}

Specifically, to estimate $P(M = m \mid C_1, C_2, C_3, X)$ from the case-control sample, we can
stratify by the value of $Z$. Equation (A.52) shows the derivation when $Z$ is
a binary variable, but it can be extended to other types of $Z$ as well. In stratified sampling (by
$Z$), researchers usually know the sampling rate by design. Here, for example, we have 100%
sampling for subjects with $Z = 1$ and approximately 3.6% simple random sampling for subjects with $Z = 0$. In
such a case, $P(M = m \mid X, C_1, C_2, C_3, Z = 0, S = 1)$ and $P(M = m \mid X, C_1, C_2, C_3, Z = 1, S = 1)$
can be directly estimated using a parametric model among controls and cases, respectively.
This approach is valid because cases and controls are each “representative” of the target
population with $Z = 1$ and $Z = 0$, respectively.
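The stratified calculation can be sketched numerically for binary $Z$. The example also contrasts it with naively pooling cases and controls, which over-represents $Z = 1$. Only the 100% and 3.6% sampling rates come from the text above; the function name and the remaining probabilities are assumptions made for illustration.

```python
def mediator_prob(p_m_given_z0_sel, p_m_given_z1_sel, p_z1):
    """P(M=m | X, C) for binary Z, mixing the stratum-specific probabilities
    estimated among selected controls (Z=0) and cases (Z=1)."""
    return p_m_given_z0_sel * (1.0 - p_z1) + p_m_given_z1_sel * p_z1


# Hypothetical stratum-specific mediator probabilities and population P(Z=1 | X, C1, C2).
p_m_z0, p_m_z1, p_z1 = 0.2, 0.6, 0.10
corrected = mediator_prob(p_m_z0, p_m_z1, p_z1)

# Share of cases inside the selected sample under 100% / 3.6% sampling.
rate_z1, rate_z0 = 1.0, 0.036
p_z1_selected = p_z1 * rate_z1 / (p_z1 * rate_z1 + (1.0 - p_z1) * rate_z0)
naive = mediator_prob(p_m_z0, p_m_z1, p_z1_selected)
print(corrected, naive)
```

With these inputs the pooled calculation roughly doubles the mediator probability, which shows why the mixing proportions must come from the target population rather than from the selected sample.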

Next, we show the nonparametric identification of $E[Y^{xM^{x}}]$ and $E[Y^{xM^{x^*}}]$ so that
the natural (in)direct effects are identified.

\begin{align}
E[Y^{xM^{x}}]
&= \int_y \int_c y\,P(Y^{xM^{x}} = y \mid c)\,f(c)\,dc\,dy \tag{A.53} \\
&= \int_y \int_c y\,P(Y^{xM^{x}} = y \mid c, X = x)\,f(c)\,dc\,dy \tag{A.54} \\
&= \int_y \int_c \int_m y\,P(Y^{xM^{x}} = y \mid c, X = x, M^{x} = m)\,P(M^{x} = m \mid c, X = x)\,f(c)\,dm\,dc\,dy \tag{A.55} \\
&= \int_y \int_c \int_m y\,P(Y^{xm} = y \mid c, X = x, M = m)\,P(M = m \mid c, X = x)\,f(c)\,dm\,dc\,dy \tag{A.56} \\
&= \int_y \int_c \int_m y\,P(Y = y \mid c, X = x, M = m)\,P(M = m \mid c, X = x)\,f(c)\,dm\,dc\,dy \tag{A.57} \\
&= \int_y \int_c \int_m y\,P(Y = y \mid c, X = x, M = m, S = 1)\,P(M = m \mid c, X = x)\,f(c)\,dm\,dc\,dy \tag{A.58} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y, z \mid c, X = x, M = m, S = 1)\,dz\,P(M = m \mid c, X = x)\,f(c)\,dm\,dc\,dy \tag{A.59} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y \mid z, c, X = x, M = m, S = 1)\,f(z \mid c, X = x, M = m, S = 1)\,dz\,
   P(M = m \mid c, X = x)\,f(c)\,dm\,dc\,dy \tag{A.60} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y \mid c, x, m, z, S = 1)\,f(z \mid c, x, m, S = 1)\,P(M = m \mid c, x, S = 1)\,f(c \mid x, S = 1) \notag \\
&\qquad \times \frac{P(M = m \mid c, x)\,f(c)}{P(M = m \mid c, x, S = 1)\,f(c \mid x, S = 1)}\,dz\,dm\,dc\,dy \tag{A.61} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y \mid c, x, m, z, S = 1)\,f(z \mid c, x, m, S = 1)\,P(M = m \mid c, x, S = 1)\,f(c \mid x, S = 1) \notag \\
&\qquad \times \frac{P(M = m \mid c, x)\,f(c)\,P(S = 1 \mid c, x)\,P(S = 1 \mid x)\,f(x)}{P(S = 1 \mid m, c, x)\,P(M = m \mid c, x)\,P(S = 1 \mid c, x)\,f(x \mid c)\,f(c)}\,dz\,dm\,dc\,dy \tag{A.62} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y \mid c, x, m, z, S = 1)\,f(z \mid c, x, m, S = 1)\,P(M = m \mid c, x, S = 1)\,f(c \mid x, S = 1) \notag \\
&\qquad \times \frac{P(M = m \mid c, x)\,P(S = 1 \mid x)\,f(x)}{P(S = 1 \mid m, c, x)\,P(M = m \mid c, x)\,f(x \mid c)}\,dz\,dm\,dc\,dy \tag{A.63} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y \mid c, x, m, z, S = 1)\,f(z \mid c, x, m, S = 1)\,P(M = m \mid c, x, S = 1)\,f(c \mid x, S = 1) \notag \\
&\qquad \times \frac{P(M = m \mid c_1, c_2, c_3, x)}{P(M = m \mid x, c_1, c_2, c_3, S = 1)} \cdot
   \frac{P(S = 1 \mid x)\,P(X = x)}{P(S = 1 \mid x, c_1, c_2)\,P(X = x \mid c_1)}\,dz\,dm\,dc\,dy \tag{A.64} \\
&= E\left[w_5 Y \mid X = x, S = 1\right] \tag{A.65}
\end{align}

where $w_5 = \dfrac{P(M = m \mid c_1, c_2, c_3, x)}{P(M = m \mid x, c_1, c_2, c_3, S = 1)} \cdot \dfrac{P(S = 1 \mid x)\,P(X = x)}{P(S = 1 \mid x, c_1, c_2)\,P(X = x \mid c_1)}$. Equation (A.53) follows from the
law of iterated expectations. Equation (A.54) follows from the assumption
$Y^{xM^{x}} \perp\!\!\!\perp X \mid (C_1, C_2, C_3)$. Equation (A.55) follows from the law of iterated expectations. Equations
(A.56) and (A.57) follow from consistency. Equation (A.58) follows from the assumption
$Y \perp\!\!\!\perp S \mid (C_1, C_2, C_3, X, M)$. Equations (A.59)-(A.63) follow from basic probability theory.
Equation (A.64) follows from the assumptions $X \perp\!\!\!\perp (C_2, C_3) \mid C_1$ and $S \perp\!\!\!\perp C_3 \mid (X, C_1, C_2)$.
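The two factors of $w_5$ can be read separately: the first corrects the mediator distribution for selection, and the second combines the selection probability with the exposure propensity. The sketch below keeps them apart. It is an illustration only; the function name and all inputs are hypothetical placeholders for model-based estimates.

```python
def w5_weight(p_m_given_x_c, p_m_given_x_c_sel,
              p_s_given_x, p_x, p_s_given_x_c12, p_x_given_c1):
    """w5 = [P(m|c,x) / P(m|x,c,S=1)] * [P(S=1|x) P(x) / (P(S=1|x,c1,c2) P(x|c1))]."""
    mediator_factor = p_m_given_x_c / p_m_given_x_c_sel
    selection_exposure_factor = (p_s_given_x * p_x) / (p_s_given_x_c12 * p_x_given_c1)
    return mediator_factor * selection_exposure_factor


# Hypothetical values for one selected subject.
w5 = w5_weight(p_m_given_x_c=0.24, p_m_given_x_c_sel=0.48,
               p_s_given_x=0.1, p_x=0.5, p_s_given_x_c12=0.2, p_x_given_c1=0.25)

# Without selection (all selection probabilities equal to one, identical mediator
# models) the weight reduces to the stabilized exposure weight P(x) / P(x|c1).
w_no_selection = w5_weight(0.3, 0.3, 1.0, 0.5, 1.0, 0.25)
print(w5, w_no_selection)
```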

\begin{align}
E[Y^{xM^{x^*}}]
&= \int_y \int_c y\,P(Y^{xM^{x^*}} = y \mid c)\,f(c)\,dc\,dy \tag{A.66} \\
&= \int_y \int_c y\,P(Y^{xM^{x^*}} = y \mid c, X = x)\,f(c)\,dc\,dy \tag{A.67} \\
&= \int_y \int_c \int_m y\,P(Y^{xM^{x^*}} = y \mid c, X = x, M^{x^*} = m)\,P(M^{x^*} = m \mid c, X = x)\,f(c)\,dm\,dc\,dy \tag{A.68} \\
&= \int_y \int_c \int_m y\,P(Y^{xM^{x^*}} = y \mid c, X = x, M^{x^*} = m)\,P(M^{x^*} = m \mid c, X = x^*)\,f(c)\,dm\,dc\,dy \tag{A.69} \\
&= \int_y \int_c \int_m y\,P(Y^{xm} = y \mid c, X = x, M^{x^*} = m)\,P(M = m \mid c, X = x^*)\,f(c)\,dm\,dc\,dy \tag{A.70} \\
&= \int_y \int_c \int_m y\,P(Y^{xm} = y \mid c, X = x, M^{x} = m)\,P(M = m \mid c, X = x^*)\,f(c)\,dm\,dc\,dy \tag{A.71} \\
&= \int_y \int_c \int_m y\,P(Y = y \mid c, X = x, M = m)\,P(M = m \mid c, X = x^*)\,f(c)\,dm\,dc\,dy \tag{A.72} \\
&= \int_y \int_c \int_m y\,P(Y = y \mid c, X = x, M = m, S = 1)\,P(M = m \mid c, X = x^*)\,f(c)\,dm\,dc\,dy \tag{A.73} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y, z \mid c, X = x, M = m, S = 1)\,dz\,P(M = m \mid c, X = x^*)\,f(c)\,dm\,dc\,dy \tag{A.74} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y \mid z, c, X = x, M = m, S = 1)\,f(z \mid c, X = x, M = m, S = 1)\,dz\,
   P(M = m \mid c, X = x^*)\,f(c)\,dm\,dc\,dy \tag{A.75} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y \mid c, x, m, z, S = 1)\,f(z \mid c, x, m, S = 1)\,P(M = m \mid c, x, S = 1)\,f(c \mid x, S = 1) \notag \\
&\qquad \times \frac{P(M = m \mid c, x^*)\,f(c)}{P(M = m \mid c, x, S = 1)\,f(c \mid x, S = 1)}\,dz\,dm\,dc\,dy \tag{A.76} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y \mid c, x, m, z, S = 1)\,f(z \mid c, x, m, S = 1)\,P(M = m \mid c, x, S = 1)\,f(c \mid x, S = 1) \notag \\
&\qquad \times \frac{P(M = m \mid c, x^*)\,f(c)\,P(S = 1 \mid c, x)\,P(S = 1 \mid x)\,f(x)}{P(S = 1 \mid m, c, x)\,P(M = m \mid c, x)\,P(S = 1 \mid c, x)\,f(x \mid c)\,f(c)}\,dz\,dm\,dc\,dy \tag{A.77} \\
&= \int_y \int_c \int_m \int_z y\,P(Y = y \mid c, x, m, z, S = 1)\,f(z \mid c, x, m, S = 1)\,P(M = m \mid c, x, S = 1)\,f(c \mid x, S = 1) \notag \\
&\qquad \times \frac{P(M = m \mid c, x^*)\,P(S = 1 \mid x)\,f(x)}{P(S = 1 \mid m, c, x)\,P(M = m \mid c, x)\,f(x \mid c)}\,dz\,dm\,dc\,dy \tag{A.78}
\end{align}

=

yP (Y = y | c, x, m, z, S = 1)f (z | c, x, m, S = 1)P (M = m | c, x, S = 1)

y

c

m

z

f (c | x, S = 1)

P (M = m | c1, c2, c3, x∗)
P (M = m|x, c1, c2, c3, S = 1)

·

P (S = 1 | x)P (X = x)
P (S = 1|x, c1, c2)P (X = x | c1)

dzdmdcdy

= E[w6Y |X = x, S = 1]

(A.79)

(A.80)

where

\[
w_6 = \frac{P(M = m \mid c_1, c_2, c_3, x^*)}{P(M = m \mid x, c_1, c_2, c_3, S = 1)} \cdot \frac{P(S = 1 \mid x)\, P(X = x)}{P(S = 1 \mid x, c_1, c_2)\, P(X = x \mid c_1)}.
\]

Equation (A.66) follows from the law of iterated expectations. Equation (A.67) follows from the assumption Y^{xM^{x*}} ⊥⊥ X | (C1, C2, C3). Equation (A.68) follows from the law of iterated expectations. Equation (A.69) follows from the assumption M^{x*} ⊥⊥ X | (Z, C1, C2, C3). Equation (A.70) follows from consistency. Equation (A.71) follows from the assumption Y^{xm} ⊥⊥ (M^x, M^{x*}) | (X, C1, C2, C3). Equation (A.72) again follows from consistency. Equation (A.73) follows from the assumption Y ⊥⊥ S | (C1, C2, C3, X, M). Equations (A.74)-(A.78) follow from basic probability theory. Equation (A.79) follows from the assumptions X ⊥⊥ (C2, C3) | C1 and S ⊥⊥ C3 | (X, C1, C2).
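The weights w5 and w6 differ only in the numerator of the mediator ratio: P(M = m | c, x) versus P(M = m | c, x*). The sketch below shows one way the two weighted means could be contrasted to estimate E[Y^{xM^x}] − E[Y^{xM^{x*}}], which is a natural indirect effect on the mean-difference scale under the usual definition. The function name, argument names, and toy numbers are hypothetical, and the dissertation may report effects on a different scale.

```python
import numpy as np


def indirect_effect_difference(y, p_m_x, p_m_xstar, p_m_sel, sampling_ratio):
    """Difference of weighted means E[w5 Y | X=x, S=1] - E[w6 Y | X=x, S=1].

    y              : outcomes of sampled subjects with X = x
    p_m_x          : P(M = m | c, x), numerator of w5
    p_m_xstar      : P(M = m | c, x*), numerator of w6
    p_m_sel        : P(M = m | x, c, S = 1), shared denominator
    sampling_ratio : P(S=1|x) P(X=x) / (P(S=1|x,c1,c2) P(X=x|c1)), shared factor
    """
    y = np.asarray(y, dtype=float)
    shared = np.asarray(sampling_ratio, dtype=float) / np.asarray(p_m_sel, dtype=float)
    w5 = np.asarray(p_m_x, dtype=float) * shared
    w6 = np.asarray(p_m_xstar, dtype=float) * shared
    mean_x = float(np.mean(w5 * y))      # estimates E[Y^{x M^x}]
    mean_xstar = float(np.mean(w6 * y))  # estimates E[Y^{x M^{x*}}]
    return mean_x - mean_xstar


# Made-up inputs for four sampled subjects with X = x.
nie = indirect_effect_difference(
    y=[1.0, 1.0, 0.0, 0.0],
    p_m_x=[0.6, 0.6, 0.4, 0.4],
    p_m_xstar=[0.3, 0.3, 0.7, 0.7],
    p_m_sel=[0.5, 0.5, 0.5, 0.5],
    sampling_ratio=[1.0, 1.0, 1.0, 1.0],
)
```

Because the denominator and the sampling ratio are shared, only the population mediator model has to be evaluated twice, once at x and once at x*.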


A.3 Comparison of Assumptions Used by the (Modified) Huber and Solovyeva and Proposed Methods


Table A.1: Assumption Comparison for Chapter 2

               Proposed                              Huber and Solovyeva

Y^{xz}         Y^{xz} ⊥⊥ X | C                       ✓
               /                                     Y^{xz} ⊥⊥ Z | (X, C)
               Y ⊥⊥ S | (Z, X, C)                    ✓

Y^{xZ^x}       Y^{xZ^x} ⊥⊥ X | (Z^x, C)              Y^{xZ^x} ⊥⊥ X | C
               Z^x ⊥⊥ X | C                          /
               Y ⊥⊥ S | (Z, X, C)                    ✓

Y^{xZ^{x*}}    Y^{xZ^{x*}} ⊥⊥ X | (Z^{x*}, C)        /
               Z^{x*} ⊥⊥ X | C                       /
               Y^{xz} ⊥⊥ Z^{x*} | (X, C)             ✓
               Y^{xz} ⊥⊥ Z | (X, C)                  ✓
               /                                     Y^{xz} ⊥⊥ X | C
               /                                     Y^{xz} ⊥⊥ X | (Z^{x*}, C)
               Y ⊥⊥ S | (Z, X, C)                    ✓

Table A.2: Assumption Comparison for Chapter 3

               Proposed                              Huber and Solovyeva

Y^{xm}         Y^{xm} ⊥⊥ (X, M) | C                  ✓
               /                                     Y^{xm} ⊥⊥ Z | (X, M, C)
               Y ⊥⊥ S | (Z, M, X, C)                 ✓

Y^{xM^x}       Y^{xM^x} ⊥⊥ X | C                     ✓
               Y ⊥⊥ S | (M, X, C)                    ✓

Y^{xM^{x*}}    Y^{xM^{x*}} ⊥⊥ X | C                  /
               M^{x*} ⊥⊥ X | (Z, C)                  /
               Y^{xm} ⊥⊥ (M^x, M^{x*}) | (X, C)      Y^{xm} ⊥⊥ X | (M^{x*}, C)
                                                     Y^{xm} ⊥⊥ M^{x*} | (X, C)
                                                     Y^{xm} ⊥⊥ X | C
                                                     Y^{xm} ⊥⊥ M | (X, C)
               Y ⊥⊥ S | (M, X, C)                    ✓


APPENDIX B DATA GENERATING PROCESS FOR CHAPTER 4

The tables below include the parameters used for simulations in Chapter 4.

Table B.1: Data generating process for scenario 1

Columns: Parameter, No effect, Effect 1, Effect 2, Effect 3, Effect 4, Effect 5, Effect 6, Effect 7, Effect 8, Effect 9, Effect 10

Parameters (row order): β_{c1x}, β_{0x}, β_{c1z}, β_{c2z}, β_{xz}, β_{0z}, β_{c1y}, β_{c2y}, β_{xy}, β_{zy}, β_{c1xy}, β_{c2xy}, β_{xzy}, β_{0y}

Cell values (in source order):
1, .01164; 1.03, .01157; 1.096, .01141; 1.162, .01127; 1.228, .01113;
.991, .44, 1.033, .874;
1.39, .0108; 1.415, .01076; 1.43, .01073; 1.57, .01047; 1.9, .00993;
1.36, .01086, 1.085, 1.57, 13.5;
1; 1.03; 1.098; 1.166; 1.234; 1.37; 1.4; 1.425; 1.44; 1.58; 1.91;
.0010305; .0010245; .0010185; .0010125; .0010015; .001; .0009982; .0009968; .0009868; .0009656;
.961, .65; 1.2; 1, .001037

Effects are taken log scale.

Table B.2: Data generating process for scenario 2

Columns: Parameter, No effect, Effect 1, Effect 2, Effect 3, Effect 4, Effect 5, Effect 6, Effect 7, Effect 8, Effect 9, Effect 10

Parameters (row order): β_{c1x}, β_{0x}, β_{c1z}, β_{c2z}, β_{xz}, β_{0z}, β_{c1m}, β_{c2m}, β_{c3m}, β_{xm}, β_{zm}, β_{0m}, β_{c1y}, β_{c2y}, β_{c3y}, β_{xy}, β_{zy}, β_{my}, β_{c3xy}, β_{xmy}, β_{0y}

Cell values (in source order):
0.991, 0.4325, 1.033, 0.874, 2.7, .00888, 1.079, 1.468, 1.439, 1.74, 4, .001095, 1.012, 1.187, 3.271, 1.0002, 0, 1.74, 1.5;
0; 1.07; 1.14; 1.25; 1.5;
.001245; .001227; .00121; .001186; .001136;
0; 1.06; 1.15; 1.32; 1.5; 1.9; 2; 2.1; 2.2; 2.3;
.001069; .001054; .00104; .001026; .001013;
1.9; 2; 2.1; 2.2; 2.3;
0, .06455; 1.3, .06325; 1.42, .0621; 1.3, .06085; .0596; .05775; .05663; 1.2, .05597; .05529; .0547; .05411

Effects are taken log scale.
