Tensor learning with structure, geometry and multi-modality

With the advances in sensing and data acquisition technology, it is now possible to collect data from different modalities and sources simultaneously. Most of these data are multi-dimensional in nature and can be represented by multiway arrays known as tensors. For instance, a color image is a third-order tensor defined by two indices for the spatial variables and one index for the color mode. Other examples include color video, medical imaging data such as EEG and fMRI, and spatiotemporal data encountered in urban traffic monitoring.

In the past two decades, tensors have become ubiquitous in signal processing, statistics and computer science. Traditional unsupervised and supervised learning methods developed for one-dimensional signals do not translate well to higher-order data structures, as they become computationally prohibitive with increasing dimensionality. Vectorizing high-dimensional inputs creates problems in nearly all machine learning tasks due to exponentially increasing dimensionality, distortion of the data structure and the difficulty of obtaining a sufficiently large training sample.

In this thesis, we develop tensor-based approaches to various machine learning tasks. Existing tensor-based unsupervised and supervised learning algorithms extend many well-known algorithms, e.g. 2-D component analysis, support vector machines and linear discriminant analysis, with better performance and lower computational and memory costs. Most of these methods rely on Tucker decomposition, which has exponential storage complexity requirements; CANDECOMP-PARAFAC (CP) based methods, which might not have a solution; or Tensor Train (TT) based solutions, which suffer from exponentially increasing ranks. Many tensor-based methods have quadratic (with respect to the size of the data) or higher computational complexity and similarly high memory complexity. Moreover, existing tensor-based methods are not always designed with the particular structure of the data in mind. Many of the existing methods use purely algebraic measures as their objective, which might not capture the local relations within the data. Thus, there is a need to develop new models with better computational and memory efficiency, with the particular structure of the data and problem in mind. Finally, as tensors represent the data more faithfully to the original structure than vectorization does, they also allow coupling of heterogeneous data sources where the underlying physical relationship is known. Still, most of the current work on coupled tensor decompositions does not explore supervised problems.

In order to address the issues around the computational and storage complexity of tensor-based machine learning, in Chapter 2 we propose a new tensor train decomposition structure, which is a hybrid between Tucker and Tensor Train decompositions. The proposed structure is used to implement Tensor Train based supervised and unsupervised learning frameworks: linear discriminant analysis (LDA) and graph regularized subspace learning. The algorithm is designed to solve extremal eigenvalue-eigenvector pair computation problems, which can be generalized to many other methods. The supervised framework, Tensor Train Discriminant Analysis (TTDA), is evaluated on a classification task at varying storage complexities with respect to classification accuracy and training time on four different datasets. The unsupervised approach, Graph Regularized TT, is evaluated on a clustering task with respect to clustering quality and training time at various storage complexities. Both frameworks are compared to discriminant analysis algorithms with similar objectives based on Tucker and TT decompositions.

In Chapter 3, we present an unsupervised anomaly detection algorithm for spatiotemporal tensor data. The algorithm models the anomaly detection problem as a low-rank plus sparse tensor decomposition problem, where the normal activity is assumed to be low-rank and the anomalies are assumed to be sparse and temporally continuous. We present an extension of this algorithm, where we utilize a graph regularization term in our objective function to preserve the underlying geometry of the original data. Finally, we propose a computationally efficient implementation of this framework by approximating the nuclear norm using graph total variation minimization. The proposed approach is evaluated on both simulated data, with varying levels of anomaly strength, anomaly length and number of missing entries in the observed tensor, and urban traffic data.

In Chapter 4, we propose a geometric tensor learning framework using product graph structures for the tensor completion problem. Instead of purely algebraic measures such as rank, we use graph smoothness constraints that utilize geometric or topological relations within the data. We prove the equivalence of a Cartesian graph structure to a TT-based graph structure under some conditions. We show empirically that introducing such relaxations due to the conditions does not deteriorate the recovery performance. We also outline a fully geometric learning method on product graphs for data completion.

In Chapter 5, we introduce a supervised learning method for heterogeneous data sources such as simultaneous EEG and fMRI. The proposed two-stage method first extracts features taking the coupling across modalities into account and then introduces kernelized support tensor machines for classification. We illustrate the advantages of the proposed method on simulated and real classification tasks with a small number of high-dimensional training samples.
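
The storage trade-offs among the Tucker, CP and TT decompositions mentioned earlier can be made concrete with a simple parameter count. The sketch below is only an illustration and is not taken from the thesis: the function names are hypothetical, and it assumes the standard parameter counts for each format with the same rank R on every mode. For TT in particular, the ranks are held fixed here, whereas in practice they may need to grow.

```python
from math import prod


def full_storage(dims):
    """Number of entries in the dense tensor: I_1 * I_2 * ... * I_N."""
    return prod(dims)


def tucker_storage(dims, ranks):
    """Tucker: one core of size R_1 x ... x R_N plus an I_n x R_n factor per mode.

    The core term prod(ranks) grows exponentially with the number of modes.
    """
    return prod(ranks) + sum(i * r for i, r in zip(dims, ranks))


def cp_storage(dims, rank):
    """CP: one I_n x R factor matrix per mode."""
    return rank * sum(dims)


def tt_storage(dims, internal_ranks):
    """Tensor Train: N cores of size R_{n-1} x I_n x R_n, with R_0 = R_N = 1.

    `internal_ranks` holds R_1, ..., R_{N-1} (hypothetical argument name).
    """
    r = [1, *internal_ranks, 1]
    return sum(r[n] * dims[n] * r[n + 1] for n in range(len(dims)))


# Illustrative setting (assumed, not from the thesis): a 6th-order tensor
# with 10 entries per mode and rank 5 everywhere.
dims = [10] * 6
R = 5

counts = {
    "full": full_storage(dims),
    "tucker": tucker_storage(dims, [R] * len(dims)),
    "cp": cp_storage(dims, R),
    "tt": tt_storage(dims, [R] * (len(dims) - 1)),
}

for name, n_params in counts.items():
    print(f"{name:>6}: {n_params} parameters")
```

Under these assumptions the Tucker count is dominated by its core, which scales as R^N, while the CP and TT counts grow only linearly with the number of modes when the ranks are fixed.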
