High-dimensional learning from random projections of data through regularization and diversification

Random signal measurement, in the form of random projections of signal vectors, extends the traditional point-wise and periodic schemes for signal sampling. In particular, the well-known problem of sensing sparse signals from linear measurements, also known as Compressed Sensing (CS), has promoted the utility of random projections. Meanwhile, many signal processing and learning problems that involve parametric estimation do not include sparsity constraints in their original forms. With the increasing popularity of random measurements, it is crucial to study the generic estimation performance under the random measurement model. In this thesis, we consider two specific learning problems (named below) and present the following two generic approaches for improving the estimation accuracy: 1) adding relevant constraints to the parameter vectors and 2) diversifying the random measurements to achieve fast-decaying tail bounds for the empirical risk function.

The first problem we consider is Dictionary Learning (DL). Dictionaries are extensions of vector bases that are specifically tailored for sparse signal representation. DL has become increasingly popular for sparse modeling of natural images as well as sound and biological signals, to name a few. Empirical studies have shown that typical DL algorithms for imaging applications are relatively robust with respect to missing pixels in the training data. However, DL from random projections of data corresponds to an ill-posed problem and is not well studied. Existing efforts are limited to learning structured dictionaries or dictionaries for structured sparse representations to make the problem tractable. The main motivation for considering this problem is to generate an adaptive framework for CS of signals that are not sparse in the signal domain. In fact, this problem has been referred to as 'blind CS' since the optimal basis is subject to estimation during CS recovery. Our initial approach, similar to some of the existing efforts, involves adding structural constraints on the dictionary to incorporate sparse and autoregressive models. More importantly, our results and analysis reveal that DL from random projections of data, in its unconstrained form, can still be accurate given that measurements satisfy the diversity constraints defined later.

The second problem that we consider is high-dimensional signal classification. Prior efforts have shown that projecting high-dimensional and redundant signal vectors onto random low-dimensional subspaces presents an efficient alternative to traditional feature extraction tools such as principal component analysis. Hence, aside from the CS application, random measurements present an efficient sampling method for learning classifiers, eliminating the need to record and process high-dimensional signals when most of the recorded data would be discarded during feature extraction. We work with Support Vector Machine (SVM) classifiers that are learned in the high-dimensional ambient signal space using random projections of the training data. Our results indicate that the classifier accuracy can be significantly improved by diversification of the random measurements.
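As a rough illustration of the random measurement model the abstract refers to, the following minimal sketch projects high-dimensional vectors onto a random low-dimensional subspace with a Gaussian matrix and compares one pairwise distance before and after projection. It is not taken from the thesis: the function name `random_projection`, the Gaussian choice of matrix, and the dimensions (n = 1000, m = 200) are assumptions made purely for illustration, and the sketch does not implement the diversification or dictionary learning methods the thesis studies.

```python
import numpy as np

def random_projection(X, m, rng):
    """Project the rows of X (num_signals x n) onto m random directions.

    Hypothetical helper for illustration only; a Gaussian matrix is assumed.
    """
    n = X.shape[1]
    # Scale by 1/sqrt(m) so squared norms are preserved in expectation.
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)
    return X @ Phi.T, Phi

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 1000))      # 50 synthetic signals in 1000 dimensions
Y, Phi = random_projection(X, m=200, rng=rng)

# Ratio of one pairwise distance after vs. before projection.
ratio = np.linalg.norm(Y[0] - Y[1]) / np.linalg.norm(X[0] - X[1])
print(Y.shape, ratio)
```

With these sizes the distance ratio is typically close to 1, which is the property that makes random projections usable as a cheap substitute for feature extraction; the data here is synthetic and says nothing about the results reported in the thesis.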

In Collections: Electronic Theses & Dissertations

Copyright Status: In Copyright

Material Type: Theses

Authors: Aghagolzadeh, Mohammad

Thesis Advisors: Radha, Hayder

Committee Members: Aviyente, Selin Sara
Deller, John R.
Hall, Jonathan
Radha, Hayder

Date Published: 2015

Subjects: Signal processing--Mathematical models
Random projection method
Mathematical models
Random measures

Program of Study: Electrical Engineering - Doctor of Philosophy

Degree Level: Doctoral

Language: English

Pages: x, 115 pages

ISBN: 9781339234045
1339234041

Permalink: https://doi.org/doi:10.25335/42y3-g557
