Multimodal learning and its application to modeling Alzheimer's disease
Multimodal learning has gained increasing attention in recent years, as heterogeneous data modalities are collected from diverse domains or extracted by various feature extractors and used for learning. The goal of multimodal learning is to integrate predictive information from different modalities to enhance the performance of the learned models. For example, when modeling Alzheimer's disease, multiple brain imaging modalities are collected from patients, and effectively fusing them has been shown to improve predictive performance. Multimodal learning is associated with many challenges. One outstanding challenge is severe overfitting caused by the high feature dimension that results from concatenating the modalities. For example, the feature dimension of diffusion-weighted MRI modalities, which have been used in Alzheimer's disease diagnosis, is usually much larger than the sample size available for training. To solve this problem, in the first work, I propose a sparse learning method that selects the important features and modalities to alleviate the overfitting problem. Another challenge in multimodal learning is the heterogeneity among the modalities and their potential interactions. My second work explores non-linear interactions among the modalities. The proposed model learns a modality-invariant component, which serves as a compact feature representation of the modalities and has high predictive power. Beyond the modality-invariant information, modalities may also provide supplementary information, and correlating them during learning can be more informative. Thus, in the third work, I propose a multimodal information bottleneck that fuses the supplementary information from different modalities while eliminating the irrelevant information in them. One challenge of utilizing the supplementary information of multiple modalities is that most existing methods can only be applied to data with complete modalities. The missing-modality problem widely exists in multimodal learning tasks, and for such tasks only a small portion of the data can be used to train the model. Thus, to fully use all the precious data, in the fourth work, I propose a knowledge distillation based algorithm that utilizes all the data, including samples with missing modalities, while fusing the supplementary information.
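To make the first work's idea concrete, below is a minimal sketch of group-sparse feature/modality selection via proximal gradient descent on a least-squares loss with a group-lasso penalty, where each group is the weight block of one modality. This is an illustrative stand-in, not the dissertation's actual algorithm; the function names, the synthetic data, and the modality split are all hypothetical.

```python
# Group-lasso sketch: blocks (modalities) whose weights shrink to zero are discarded.
import numpy as np

def prox_group_lasso(w, groups, step, lam):
    """Block soft-thresholding: shrinks each modality's weight block toward zero."""
    w = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        w[g] = 0.0 if norm == 0 else max(0.0, 1.0 - step * lam / norm) * w[g]
    return w

def fit(X, y, groups, lam=0.1, iters=500):
    n, d = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    w = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (X @ w - y) / n     # gradient of (1/2n)||Xw - y||^2
        w = prox_group_lasso(w - step * grad, groups, step, lam)
    return w

# Hypothetical example: 3 modalities with 4, 6, and 5 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 15))
y = X[:, :4] @ rng.normal(size=4) + 0.1 * rng.normal(size=80)  # only modality 1 is informative
groups = [slice(0, 4), slice(4, 10), slice(10, 15)]
w = fit(X, y, groups)
print([np.linalg.norm(w[g]) for g in groups])  # blocks 2 and 3 shrink toward zero
```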
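For the second work, one generic way to realize a modality-invariant component is to give each modality a non-linear encoder into a shared space and penalize disagreement between the embeddings. The PyTorch sketch below assumes this architecture purely for illustration; the class name InvariantFusion, the mean-based fusion, and the squared-error alignment loss are assumptions, not the dissertation's model.

```python
import torch
import torch.nn as nn

class InvariantFusion(nn.Module):
    def __init__(self, dims, hidden=32, n_classes=2):
        super().__init__()
        # one non-linear encoder per modality, all mapping into the same space
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
            for d in dims
        )
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, xs):
        zs = [enc(x) for enc, x in zip(self.encoders, xs)]
        z = torch.stack(zs).mean(0)                       # shared (invariant) component
        align = sum(((zi - z) ** 2).mean() for zi in zs)  # pull the encoders together
        return self.classifier(z), align

model = InvariantFusion(dims=[10, 20])
xs = [torch.randn(8, 10), torch.randn(8, 20)]
y = torch.randint(0, 2, (8,))
logits, align = model(xs)
loss = nn.functional.cross_entropy(logits, y) + 0.1 * align
loss.backward()
```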
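The third work's information-bottleneck fusion can be sketched with the generic deep variational IB recipe: each modality is compressed into a stochastic latent whose KL term discards task-irrelevant information, and the fused latents drive the prediction. This follows the standard variational formulation, not necessarily the exact objective in the dissertation; names and dimensions are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultimodalIB(nn.Module):
    def __init__(self, dims, z_dim=16, n_classes=2):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(d, 2 * z_dim) for d in dims)
        self.classifier = nn.Linear(z_dim * len(dims), n_classes)

    def forward(self, xs):
        zs, kl = [], 0.0
        for head, x in zip(self.heads, xs):
            mu, logvar = head(x).chunk(2, dim=-1)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
            # KL(q(z|x) || N(0, I)) acts as the compression term
            kl = kl + 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(-1).mean()
            zs.append(z)
        return self.classifier(torch.cat(zs, dim=-1)), kl

model = MultimodalIB(dims=[10, 20])
logits, kl = model([torch.randn(8, 10), torch.randn(8, 20)])
loss = F.cross_entropy(logits, torch.randint(0, 2, (8,))) + 1e-3 * kl
loss.backward()
```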
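Finally, for the fourth work, the sketch below shows one simplified way distillation can handle missing modalities: a multimodal teacher, assumed pre-trained on the complete-modality subset, supervises a unimodal student with softened targets, so samples lacking the second modality still contribute through the student's hard-label loss. This is a hypothetical arrangement for illustration, not the dissertation's exact algorithm.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Linear(30, 2)  # consumes both modalities (10 + 20 dims), assumed pre-trained
student = nn.Linear(10, 2)  # consumes only the modality observed for every sample

def distill_step(x1, x2, y, has_x2, T=2.0, alpha=0.5):
    logits_s = student(x1)
    loss = F.cross_entropy(logits_s, y)  # hard labels: all samples contribute
    if has_x2.any():                     # soft targets: complete-modality samples only
        with torch.no_grad():
            logits_t = teacher(torch.cat([x1[has_x2], x2[has_x2]], dim=-1))
        loss = alpha * loss + (1 - alpha) * T * T * F.kl_div(
            F.log_softmax(logits_s[has_x2] / T, dim=-1),
            F.softmax(logits_t / T, dim=-1),
            reduction="batchmean",
        )
    return loss

x1, x2 = torch.randn(8, 10), torch.randn(8, 20)
y = torch.randint(0, 2, (8,))
has_x2 = torch.rand(8) > 0.4  # mask: which samples have the second modality
distill_step(x1, x2, y, has_x2).backward()
```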
- In Collections: Electronic Theses & Dissertations
- Copyright Status: Attribution 4.0 International
- Material Type: Theses
- Thesis Advisors: Zhou, Jiayu
- Committee Members: Tan, Pang-Ning; Tang, Jiliang; Li, Chenxi
- Date Published: 2020
- Program of Study: Computer Science - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: x, 136 pages
- ISBN: 9798664739787
- Permalink: https://doi.org/doi:10.25335/12xt-ak91