Variational Bayes deep neural network : theory, methods and applications
Bayesian neural networks (BNNs) have achieved state-of-the-art results in a wide range of tasks, especially in high-dimensional data analysis, including image recognition and biomedical diagnosis. This thesis focuses on high-dimensional data, including simulated data and brain images from Alzheimer's disease (AD) studies. We develop a variational Bayesian deep neural network (VBDNN) and a Bayesian compressed neural network (BCNN), and discuss the related statistical theory and algorithmic implementations for predicting MCI-to-dementia conversion in multi-modal data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). The transition from mild cognitive impairment (MCI) to dementia is of great interest to clinical research on AD and related dementias. It also serves as a valuable data source for quantitative methodological researchers developing new approaches to classification. The development of the VBDNN is motivated by an important biomedical engineering application, namely building predictive tools for the transition from MCI to dementia. The predictors are multi-modal and may involve complex interactions.
In Chapter 2, we numerically compare the classification accuracy of logistic regression (LR) with that of the support vector machine (SVM) in classifying MCI-to-dementia conversion. The results show that although SVM and other machine learning techniques are capable of relatively accurate classification, similar or higher accuracy can often be achieved by LR, diminishing the necessity or added value of SVM for many clinical researchers. Further, when faced with many potential features that could be used for classifying the transition, clinical researchers are often unaware of the relative value of different approaches to variable selection. Besides algorithmic feature selection, manually trimming the list of potential predictor variables can also protect against over-fitting and may offer insight into why the selected features are important to the model. We demonstrate that user-guided, clinically informed pre-selection can achieve performance similar to that of algorithmic feature selection techniques.
Besides LR and SVM, the Bayesian deep neural network (BDNN) has quickly become the most popular machine learning classifier for prediction and classification with ADNI data. However, its Markov chain Monte Carlo (MCMC) based implementation suffers from high computational cost, which limits this powerful technique in large-scale studies. Variational Bayes (VB) has emerged as a competitive alternative that overcomes some of these computational issues. Although VB is popular in machine learning, neither its computational nor its statistical properties are well understood for complex models such as neural networks. In Chapter 3, we formulate the VBDNN estimation methodology and characterize the prior distributions and the variational family required for consistent Bayesian estimation. The thesis compares the consistency and contraction rates of the true posterior for deep neural network-based classification with those of the corresponding variational posterior. As a function of the complexity of the deep neural network (DNN), the thesis assesses the loss in classification accuracy incurred by using VB and provides guidelines on characterizing the prior distributions and the variational family. The difficulty of the optimization associated with the variational Bayes solution is likewise quantified as a function of the complexity of the DNN.
Chapter 4 proposes a BCNN that addresses the large-p, small-n problem by projecting the feature space onto a lower-dimensional space using a random projection matrix. In particular, for dimension reduction we propose a randomly compressed feature space instead of other popular dimension reduction techniques. We adopt a model averaging approach to pool information across multiple projections. As the main contribution, we propose a variational Bayes approach that simultaneously estimates both the model weights and the model-specific parameters. By avoiding standard Markov chain Monte Carlo and parallelizing across multiple compressions, we dramatically reduce both computational cost and storage requirements with minimal loss in prediction accuracy. We provide theoretical and empirical justification for the proposed methodology.
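The random compression and model averaging idea described for Chapter 4 can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the Gaussian projection entries, the logistic regression fitted by gradient descent, and the equal model weights are all assumptions made here for illustration, whereas the thesis estimates model weights and model-specific parameters jointly with variational Bayes on neural networks.

```python
import numpy as np


def sigmoid(t):
    """Numerically stable logistic function, using sigmoid(t) = (1 + tanh(t/2)) / 2."""
    return 0.5 * (1.0 + np.tanh(0.5 * t))


def random_projection(p, m, rng):
    """Draw a p x m Gaussian random projection matrix (illustrative choice)."""
    return rng.normal(0.0, 1.0 / np.sqrt(m), size=(p, m))


def fit_logistic(Z, y, lr=0.1, steps=500):
    """Fit logistic regression weights on compressed features by gradient descent.

    Stand-in for the per-projection model; the thesis uses a neural network
    estimated with variational Bayes instead.
    """
    w = np.zeros(Z.shape[1])
    for _ in range(steps):
        grad = Z.T @ (sigmoid(Z @ w) - y) / len(y)
        w -= lr * grad
    return w


def compressed_ensemble_predict(X_train, y_train, X_test, m=5, n_proj=10, seed=0):
    """Average predicted probabilities over several random compressions.

    Each projection maps the p original features to m compressed features.
    Equal weights are an assumption of this sketch, not the thesis's method.
    """
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_proj):
        R = random_projection(X_train.shape[1], m, rng)
        w = fit_logistic(X_train @ R, y_train)
        probs.append(sigmoid((X_test @ R) @ w))
    return np.mean(probs, axis=0)
```

Each projection is fitted independently of the others, so the loop over projections could be run in parallel, which is the property the abstract relies on to reduce computation.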
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Liu, Zihuan
- Thesis Advisors: Maiti, Taps; Bhattacharya, Shrijita
- Committee Members: Bender, Andrew; Zhu, David; Hong, Hyokyoung
- Date Published: 2021
- Subjects: Statistics; Artificial intelligence; Bioinformatics; Support vector machines; Bayesian statistical decision theory; Neural networks (Computer science)
- Program of Study: Statistics - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: xi, 141 pages
- ISBN: 9798538122851
- Permalink: https://doi.org/doi:10.25335/18hz-bj81