Summary
With the advances in sensing and data acquisition technology, it is now possible to collect datafrom different modalities and sources simultaneously. Most of these data are multi-dimensional in nature and can be represented by multiway arrays known as tensors. For instance, a color image is a third-order tensor defined by two indices for spatial variables and one index for color mode. Some other examples include color video, medical imaging such as EEG and fMRI, spatiotemporal data... Show moreWith the advances in sensing and data acquisition technology, it is now possible to collect datafrom different modalities and sources simultaneously. Most of these data are multi-dimensional in nature and can be represented by multiway arrays known as tensors. For instance, a color image is a third-order tensor defined by two indices for spatial variables and one index for color mode. Some other examples include color video, medical imaging such as EEG and fMRI, spatiotemporal data encountered in urban traffic monitoring, etc.In the past two decades, tensors have become ubiquitous in signal processing, statistics andcomputer science. Traditional unsupervised and supervised learning methods developed for one- dimensional signals do not translate well to higher order data structures as they get computationally prohibitive with increasing dimensionalities. Vectorizing high dimensional inputs creates problems in nearly all machine learning tasks due to exponentially increasing dimensionality, distortion of data structure and the difficulty of obtaining sufficiently large training sample size.In this thesis, we develop tensor-based approaches to various machine learning tasks. Existingtensor based unsupervised and supervised learning algorithms extend many well-known algorithms, e.g. 2-D component analysis, support vector machines and linear discriminant analysis, with better performance and lower computational and memory costs. Most of these methods rely on Tucker decomposition which has exponential storage complexity requirements; CANDECOMP-PARAFAC (CP) based methods which might not have a solution; or Tensor Train (TT) based solutions which suffer from exponentially increasing ranks. Many tensor based methods have quadratic (w.r.t the size of data), or higher computational complexity, and similarly, high memory complexity. Moreover, existing tensor based methods are not always designed with the particular structure of the data in mind. Many of the existing methods use purely algebraic measures as their objective which might not capture the local relations within data. Thus, there is a necessity to develop new models with better computational and memory efficiency, with the particular structure of the data and problem in mind. Finally, as tensors represent the data with more faithfulness to the original structure compared to the vectorization, they also allow coupling of heterogeneous data sources where the underlying physical relationship is known. Still, most of the current work on coupled tensor decompositions does not explore supervised problems.In order to address the issues around computational and storage complexity of tensor basedmachine learning, in Chapter 2, we propose a new tensor train decomposition structure, which is a hybrid between Tucker and Tensor Train decompositions. The proposed structure is used to imple- ment Tensor Train based supervised and unsupervised learning frameworks: linear discriminant analysis (LDA) and graph regularized subspace learning. The algorithm is designed to solve ex- tremal eigenvalue-eigenvector pair computation problems, which can be generalized to many other methods. The supervised framework, Tensor Train Discriminant Analysis (TTDA), is evaluated in a classification task with varying storage complexities with respect to classification accuracy and training time on four different datasets. The unsupervised approach, Graph Regularized TT, is evaluated on a clustering task with respect to clustering quality and training time on various storage complexities. Both frameworks are compared to discriminant analysis algorithms with similar objectives based on Tucker and TT decompositions.In Chapter 3, we present an unsupervised anomaly detection algorithm for spatiotemporaltensor data. The algorithm models the anomaly detection problem as a low-rank plus sparse tensor decomposition problem, where the normal activity is assumed to be low-rank and the anomalies are assumed to be sparse and temporally continuous. We present an extension of this algorithm, where we utilize a graph regularization term in our objective function to preserve the underlying geometry of the original data. Finally, we propose a computationally efficient implementation of this framework by approximating the nuclear norm using graph total variation minimization. The proposed approach is evaluated for both simulated data with varying levels of anomaly strength, length and number of missing entries in the observed tensor as well as urban traffic data. In Chapter 4, we propose a geometric tensor learning framework using product graph structures for tensor completion problem. Instead of purely algebraic measures such as rank, we use graph smoothness constraints that utilize geometric or topological relations within data. We prove the equivalence of a Cartesian graph structure to TT-based graph structure under some conditions. We show empirically, that introducing such relaxations due to the conditions do not deteriorate the recovery performance. We also outline a fully geometric learning method on product graphs for data completion.In Chapter 5, we introduce a supervised learning method for heterogeneous data sources suchas simultaneous EEG and fMRI. The proposed two-stage method first extracts features taking the coupling across modalities into account and then introduces kernelized support tensor machines for classification. We illustrate the advantages of the proposed method on simulated and real classification tasks with small number of training data with high dimensionality. Show less