ROBUST STATISTICAL METHODS FOR CAUSAL DISCOVERY IN ONE-SAMPLE MENDELIAN RANDOMIZATION STUDIES
Mendelian Randomization (MR) has become a cornerstone approach for inferring causal relationships in epidemiological and genetic studies by leveraging genetic variants as instrumental variables (IVs). Despite its popularity, conventional MR analyses, particularly those based on two-stage least squares (TSLS) and conducted within a single sample, face significant methodological challenges. These include the selection-induced winner's curse and the pervasive problems of weak instruments and invalid IVs, all of which can undermine the reliability and interpretability of causal effect estimates.
To address these limitations, this dissertation develops a unified and robust MR framework through a sequence of methodological innovations. First, we introduce MR-SPLIT, a novel adaptive sample-splitting and cross-fitting procedure that mitigates biases arising from IV selection and weak instruments in one-sample MR settings. MR-SPLIT employs multiple sample splits to further enhance robustness. In extensive simulation studies and real-world data applications, it demonstrates superior bias reduction, type I error control, and statistical power compared to existing approaches. Building on this foundation, we further propose MR-SPLIT+, which integrates best subset selection to accommodate invalid IVs under a relaxed plurality rule. MR-SPLIT+ substantially reduces estimation bias due to invalid instruments while maintaining efficiency and robustness. Simulation results consistently demonstrate that MR-SPLIT+ outperforms contemporary methods, and real-data analyses confirm its practical reliability in complex genetic architectures.
Recognizing that causal relationships are often bidirectional or ambiguous, especially within gene expression networks and complex traits, we extend this framework to BiMR-SPLIT+. This method is specifically designed to disentangle bidirectional causality between pairs of traits, even when the underlying IV assumptions are partially violated. Extensive simulation studies and an application to Drosophila melanogaster data illustrate that BiMR-SPLIT+ not only recapitulates established biological mechanisms but also identifies novel candidate genes with potential regulatory roles. This bidirectional MR framework enables more accurate inference of gene-trait relationships and has broad implications for precision medicine.
Collectively, this dissertation presents a cohesive suite of MR methodologies that systematically address weak and invalid IVs, IV selection bias, and bidirectional causality. The resulting toolkit substantially advances the reliability of causal inference in genetic epidemiology and lays the groundwork for future exploration of complex causal networks as large-scale human datasets continue to grow.
- In Collections: Electronic Theses & Dissertations
- Copyright Status: Attribution 4.0 International
- Material Type: Theses
- Authors: Shi, Ruxin
- Thesis Advisors: Cui, Yuehua
- Committee Members: Xie, Yuying; Huang, Wen; Huebner, Marianne
- Date Published: 2025
- Subjects: Statistics
- Program of Study: Statistics - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: 143 pages
- Permalink: https://doi.org/doi:10.25335/x3y1-g916