Advanced classification methods for large spatial-temporal data : applications to neuroimaging
Ecological research, geological studies, and image analysis are a few examples of fields that produce high-resolution spatial data, where proximity describes the relationship between data points collected at various locations. Such dependencies play a vital role in modeling the data accurately, improving both predictive capacity and parameter estimation. Rapid technological advancement has brought about an abundance of such information. To better understand it, we need feature selection techniques for spatially dependent data that can tease out the relevant predictors associated with the response of interest. When the response variable at the various sites takes the form of discrete binary or count data, we face an added layer of complexity because a joint parametric distribution cannot be described explicitly. This dissertation explores the benefits of adopting a penalized quasi-likelihood approach to model a fixed number (p) or an expanding dimension (p_n) of predictor variables with regard to a discrete spatial response variable. In the past, this approach has been studied extensively in longitudinal data analysis. By introducing random fields that satisfy certain rho-mixing conditions, we provide some general theoretical results for the estimator obtained by solving the penalized score equation. The oracle properties of the estimator are established, followed by an algorithm to implement the method. Multiple simulation studies showcase the effectiveness of the method under covariance misspecification. We apply this technique to real data obtained from the Michigan Natural Features Inventory.
Variable selection in neuroimaging has a unique formulation that leads to the selection of activated regions of the brain in task-based fMRI. As one of the least invasive ways of studying an active brain, task-based fMRI provides a unique opportunity in neuroscience to study the dynamic aspects of brain function. Crude statistical techniques such as voxel-wise regression analysis have been used in the past with some success to identify active brain regions based on the blood-oxygen-level-dependent (BOLD) signal of the image. Inspired by graphical covariate models proposed for genetic data, we incorporate a similar idea and expand our understanding of penalized weighted least squares regression with a separable space-time covariance model in this setting. As a result, two penalty terms are introduced: one for selection (LASSO) and another for smoothing (ridge-type). We explore the interpretability of the proposed model relative to its Bayesian counterparts, its computational feasibility, and various approaches to selecting an optimal tuning parameter in a single-subject study. The model and its implementation are described, along with a discussion of the theoretical implications. Extensive simulation studies and a real-data example of a human brain subject to two visual stimuli provide evidence of the capability of this method.
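The two-penalty idea mentioned above (a LASSO term for selection plus a ridge-type term for smoothing on top of weighted least squares) can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the function name `lasso_ridge_smooth`, the use of a graph Laplacian `L` as the ridge-type smoothing matrix, the weight matrix `W` standing in for an inverse covariance, and the proximal-gradient solver are all assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_ridge_smooth(X, y, W, L, lam1, lam2, n_iter=500):
    """Hypothetical sketch: minimise

        0.5 * (y - X b)' W (y - X b) + lam1 * ||b||_1 + 0.5 * lam2 * b' L b

    by proximal gradient descent (ISTA).
    W: symmetric weight matrix (assumed here to play the role of an
       inverse covariance); L: symmetric positive semi-definite smoothing
       matrix (assumed here to be a graph Laplacian over neighbouring
       coefficients).
    """
    p = X.shape[1]
    H = X.T @ W @ X + lam2 * L            # Hessian of the smooth part
    step = 1.0 / np.linalg.eigvalsh(H)[-1]  # 1 / largest eigenvalue
    XtWy = X.T @ W @ y
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = H @ b - XtWy               # gradient of the smooth part
        b = soft_threshold(b - step * grad, step * lam1)
    return b


# Toy usage: three coefficients on a chain graph 1 - 2 - 3.
X = np.eye(3)
y = np.array([3.0, 0.1, -2.0])
W = np.eye(3)
L_chain = np.array([[1.0, -1.0, 0.0],
                    [-1.0, 2.0, -1.0],
                    [0.0, -1.0, 1.0]])
b_hat = lasso_ridge_smooth(X, y, W, L_chain, lam1=0.5, lam2=1.0)
```

In this sketch `lam1` controls how many coefficients are set exactly to zero, while `lam2` pulls neighbouring coefficients toward each other; in practice both would be chosen by a tuning procedure such as those the abstract alludes to.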
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Karim, Rejaul
- Thesis Advisors: Maiti, Tapabrata
- Committee Members: Lim, Chae Young; Xiao, Yimin; Ross, Arun
- Date: 2019
- Subjects: Spatial analysis (Statistics); Generalized estimating equations; Epidemiology; Correlation (Statistics)
- Program of Study: Statistics - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: xi, 116 pages
- ISBN: 9781088396636; 1088396631