Causal Inference with Mendelian Randomization for Longitudinal Data
Mendelian Randomization (MR) uses genetic variants as instrumental variables (IVs) to examine the causal relationship between an exposure and an outcome in observational studies. When confounding factors exist, a correlation between a predictor variable and an outcome variable does not imply causation. IV regression is a popular method for controlling confounding in causal inference. By Mendel’s first and second laws of inheritance, genetic variants can be considered valid IVs. Popular MR methods include the ratio estimator, the inverse-variance weighted estimator, and the two-stage estimator. However, all of these methods are based on cross-sectional data. In practice, data in observational studies can be collected over time, yielding so-called longitudinal data. Longitudinal data make it possible to capture changes within subjects over time and thus offer advantages for causal modeling. However, causal inference methods that can control time-varying confounding are largely lacking in the literature. In this dissertation, we explore MR analysis for longitudinal data by proposing different causal models and assuming different causal mechanisms. The proposed methods are motivated by a real study examining the causal relationship between hormone secretion and emotional eating disorder in teen girls. We start with a concurrent model, which assumes that the current outcome is affected only by the current exposure. The coefficients of both the genetic variants (i.e., IVs) and the exposure are treated as time-varying effects. We apply the quadratic inference function approach in a two-step IV regression framework and focus on statistical testing to infer causality. Through extensive simulation studies, we show that the proposed method controls the type I error well and has reasonable testing power.
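For readers unfamiliar with the cross-sectional estimators named above, the following is a minimal sketch of the ratio estimator and the two-stage estimator with a single genetic instrument. It is not the dissertation's longitudinal method. The simulated data, the effect sizes, and the function names (simulate, ols_slope, ratio_estimate, two_stage_estimate) are all assumptions made for illustration.

```python
import numpy as np


def simulate(n=5000, beta=0.5, seed=0):
    """Simulate one instrument, a confounded exposure, and an outcome.

    All numeric values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded 0/1/2
    u = rng.normal(size=n)                          # unmeasured confounder
    x = 0.8 * g + u + rng.normal(size=n)            # exposure
    y = beta * x + u + rng.normal(size=n)           # outcome, true effect = beta
    return g, x, y


def ols_slope(a, b):
    """Slope from regressing b on a (with an intercept)."""
    a_c = a - a.mean()
    return float(a_c @ (b - b.mean()) / (a_c @ a_c))


def ratio_estimate(g, x, y):
    """Ratio estimator: (effect of G on Y) / (effect of G on X)."""
    return ols_slope(g, y) / ols_slope(g, x)


def two_stage_estimate(g, x, y):
    """Two-stage estimator: regress X on G, then Y on the fitted X."""
    x_hat = x.mean() + ols_slope(g, x) * (g - g.mean())
    return ols_slope(x_hat, y)


g, x, y = simulate()
naive = ols_slope(x, y)           # biased by the confounder u
ratio = ratio_estimate(g, x, y)
tsls = two_stage_estimate(g, x, y)
print(f"naive OLS: {naive:.3f}, ratio: {ratio:.3f}, two-stage: {tsls:.3f}")
```

With a single instrument the ratio and two-stage estimates coincide algebraically, while the naive regression of the outcome on the exposure is pulled away from the true effect by the unmeasured confounder.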
In Chapter 3, we generalize the concurrent model to a more complex case and propose a time lag model to investigate time-delayed causal effects. In the time lag model, we assume that the current outcome at time t is affected by previous exposures measured within a time lag Δt before t, where Δt can be determined from the data by a rigorous model selection procedure. As in the concurrent model, we assume that the effects of genetic variants on the exposure and the effects of the exposure on the outcome are both time-varying. We propose different tests for point-wise and simultaneous testing to assess the causal relationship. In Chapter 4, we further generalize the time lag model to the case where the cumulative effect of previous exposures contributes to the outcome at time t. The causal relationship is examined under a functional principal component regression framework with sparse functional data. Simulation results show that the type I error is well controlled. We apply our models to the emotional eating disorder data to examine whether hormone secretion during the menstrual cycle in teen girls has a causal effect on emotional eating behavior, and we identify interesting results. This thesis represents the first exploration of MR analysis with longitudinal data.
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Qu, Jialin
- Thesis Advisors: Cui, Yuehua
- Committee Members: Huebner, Marianne; Weng, Haolei; Wang, Jianrong; Wang, Honglang
- Date: 2022
- Subjects: Statistics
- Program of Study: Statistics - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: 116 pages
- Permalink: https://doi.org/doi:10.25335/1vey-cf40