Multi-task learning and its application to geospatio-temporal data

Multi-task learning (MTL) is a data mining and machine learning approach for modeling multiple prediction tasks simultaneously by exploiting the relatedness among the tasks. MTL has been successfully applied to various domains, including computer vision, healthcare, genomics, recommender systems, and natural language processing. The goals of this thesis are: (1) to investigate the feasibility of applying MTL to geospatio-temporal prediction problems, particularly those encountered in the climate and environmental science domains and (2) to develop novel MTL frameworks that address the challenges of building effective predictive models from geospatio-temporal data.The first contribution of this thesis is to develop an online temporal MTL framework called ORION for ensemble forecasting problems. Ensemble forecasting uses a numerical method to simulate the evolution of nonlinear dynamic systems, such as climate and hydrological systems. ORION aims to effectively aggregate the forecasts generated by different ensemble members for a future time window, where each forecast is obtained by perturbing the starting condition of the computer model or using a different model representation. ORION considers the prediction for each time point in the forecast window as a distinct prediction task, where the task relatedness is achieved by imposing temporal smoothness and mean regularization constraints. A novel, online update with restart strategy is proposed to handle missing observations in the training data. ORION can also be optimized for different objectives, such as ε -insensitive and quantile loss functions.The second contribution of this thesis is to propose a MTL framework named GSpartan that can perform inferences at multiple locations simultaneously while allowing the local models for different locations to be jointly trained. GSpartan assumes that the local models share a common, low-rank representation and employs a graph Laplacian regularization to enforce constraints due to the inherent spatial autocorrelation of the data. Sparsity and non-negativity constraints are also incorporated into the formulation to ensure interpretability of the models.GSpartan is a MTL framework that considers only the spatial autocorrelation of the data. It is also a batch learning algorithm, which makes it difficult to scale up to global-scale data. To address these limitations, a new framework called WISDOM is proposed, which can incorporate the task relatedness across both space and time. WISDOM encodes the geospatio-temporal data as a tensor and performs supervised tensor decomposition to identify the latent factors that capture the inherent spatial and temporal variabilities of the data as well as the relationship between the predictor and target variables. The framework is unique in that it trains distinct spatial and temporal prediction models from the latent factors of the decomposed tensor and aggregates the outputs of these models to obtain the final prediction. WISDOM also employs an incremental learning algorithm that can systematically update the models when training examples are available for a new time period or for a new location.Finally, the geospatio-temporal data for many scientific applications are often available at varying spatial scales. For example, they can be generated by computer models simulated at different grid resolutions (e.g., the global and regional models used in climate modeling). A simple way to handle the predictor variables generated from the multi-scale data is to concatenate them into a single feature vector and train WISDOM using the concatenated vectors. However, this strategy may not be effective as it ignores the inherent dependencies between variables at different scales. To overcome this limitation, this thesis presents an extension of WISDOM called MUSCAT for handling multi-scale geospatio-temporal data. MUSCAT considers the consistency of the latent factors extracted from the spatio-temporal tensors at different scales while inheriting the benefits of WISDOM. Given the massive size of the multi-scale spatio-temporal tensors, a novel, supervised, incremental multi-tensor decomposition algorithm is develop to efficiently learn the model parameters.

Read