Statistical methods for high-dimensional sequencing studies
Background: With advances in next-generation sequencing technology, sequencing studies generate massive amounts of data, offering a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. This volume of data also poses a great challenge for statistical analysis. Association analyses based on a single-locus test suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. A joint association test, on the other hand, has been shown to be more suitable for sequencing studies, as it tests multiple variants jointly, which increases power and reduces dimensionality. However, current methods for joint association tests have limitations, such as dependency on distribution assumptions, poor performance for small sample sizes, inability to adjust for covariates, inability to consider family structure, inability to handle multiple traits, and computational inefficiency.
Method and Result: I have developed a series of statistical methods, based on weighted U statistics, to detect joint associations in sequencing studies.
1) I developed a weighted U statistic, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on the non-parametric U statistic, WU-SEQ makes no assumptions about the underlying disease model or phenotype distribution, and can be applied to various types of phenotypes.
2) The simultaneous analysis of multiple related phenotypes has been of great interest in recent human genetics research. In these studies, multiple phenotype measurements are collected to study pleiotropic or comorbidity effects. I have extended the WU-SEQ method and developed a Gene-Trait Similarity U test (GTSU) to detect sequencing variants associated with multiple phenotypes.
3) Both population-based and family-based designs are commonly used in genetic association studies. However, new statistical methods for family-based sequencing data analysis are lacking. Family-based sequencing studies have many advantages, including the ability to aggregate more disease-susceptibility rare variants within families. I have extended GTSU to FGTSU for family-based sequencing association analysis.
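As a rough illustration of the general idea behind a weighted U statistic for joint association testing, the sketch below averages, over all pairs of subjects, a genetic similarity weight multiplied by a phenotype similarity score, and assesses significance by permutation. It is not the WU-SEQ, GTSU, or FGTSU implementation: the abstract does not specify the kernels, weights, or inference procedure used in the thesis, so the function names (`weighted_u`, `permutation_pvalue`), the exponential genetic similarity, the standardized cross-product phenotype score, the `variant_weights` parameter, and the permutation test are all assumptions made for illustration.

```python
import numpy as np

def weighted_u(genotypes, phenotype, variant_weights=None):
    """Generic weighted U statistic (illustrative only, not WU-SEQ).

    Averages w_ij * h_ij over all ordered pairs i != j, where w_ij is a
    genetic similarity and h_ij is a phenotype similarity.
    """
    G = np.asarray(genotypes, dtype=float)   # n subjects x p variants, coded 0/1/2
    y = np.asarray(phenotype, dtype=float)   # assumed non-constant
    n, p = G.shape
    if variant_weights is None:
        variant_weights = np.ones(p)         # hypothetical per-variant weights
    w = np.asarray(variant_weights, dtype=float)

    # Assumed genetic similarity: exp of minus the weighted mean absolute
    # genotype difference between subjects i and j.
    diff = np.abs(G[:, None, :] - G[None, :, :])        # n x n x p
    W = np.exp(-(diff * w).sum(axis=2) / w.sum())       # n x n

    # Assumed phenotype similarity: cross-product of standardized phenotypes.
    z = (y - y.mean()) / y.std()
    H = np.outer(z, z)

    off_diag = ~np.eye(n, dtype=bool)        # exclude pairs with i == j
    return (W[off_diag] * H[off_diag]).mean()

def permutation_pvalue(genotypes, phenotype, n_perm=200, seed=0):
    """One-sided permutation p-value: shuffle phenotypes across subjects."""
    rng = np.random.default_rng(seed)
    y = np.asarray(phenotype, dtype=float)
    observed = weighted_u(genotypes, y)
    exceed = sum(
        weighted_u(genotypes, rng.permutation(y)) >= observed
        for _ in range(n_perm)
    )
    return (exceed + 1) / (n_perm + 1)       # add-one correction avoids p = 0
```

Under these assumptions, a large positive value indicates that genetically similar subjects also tend to have similar phenotypes, which is the pattern a joint association test looks for across a set of variants.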
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Wei, Changshuai
- Thesis Advisors: Lu, Qing
- Committee Members: Anthony, James C.; Fu, Wenjiang; Cui, Yuehua
- Date Published: 2014
- Program of Study: Epidemiology - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: ix, 104 pages
- ISBN: 9781321158212; 1321158211
- Permalink: https://doi.org/doi:10.25335/fevm-2387