Large-scale high dimensional distance metric learning and its application to computer vision
Learning an appropriate distance function (i.e., similarity) is one of the key tasks in machine learning, especially for distance-based algorithms, e.g., the $k$-nearest neighbor classifier, $k$-means clustering, etc. Distance metric learning (DML), the subject of this dissertation, aims to learn a metric that pulls examples from the same class together and pushes examples from different classes away from each other. Although many DML algorithms have been developed in the past decade, most of them can handle only small data sets with hundreds of features, which significantly limits their use in real-world applications that often involve millions of training examples represented by hundreds of thousands of features. Three main challenges arise when learning a metric from such large-scale high-dimensional data: (i) to ensure that the learned metric is a Positive Semi-Definite (PSD) matrix, a projection onto the PSD cone is required at every iteration, whose cost is cubic in the dimensionality, making it unsuitable for high-dimensional data; (ii) the number of variables to be optimized in DML is quadratic in the dimensionality, which results in a slow convergence rate and a high memory requirement; (iii) the number of constraints used by DML is at least quadratic, if not cubic, in the number of examples, depending on whether pairwise constraints or triplet constraints are used. Besides, features can be redundant in high-dimensional representations (e.g., face features), so DML with feature selection is preferred for these applications. The main contribution of this dissertation is to address these challenges both theoretically and empirically.
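The PSD projection named in challenge (i) is typically computed by an eigendecomposition that clips negative eigenvalues to zero, which is where the cubic cost in the dimensionality comes from. A minimal sketch of that standard projection (not the dissertation's own algorithm), assuming numpy:

```python
import numpy as np

def project_psd(M):
    """Project a symmetric matrix onto the PSD cone by clipping
    negative eigenvalues to zero. The eigendecomposition costs
    O(d^3) for a d x d metric, which is the bottleneck at scale."""
    M = (M + M.T) / 2                      # guard against numerical asymmetry
    eigvals, eigvecs = np.linalg.eigh(M)   # full eigendecomposition
    eigvals = np.clip(eigvals, 0.0, None)  # drop the negative part of the spectrum
    return (eigvecs * eigvals) @ eigvecs.T

# Example: eigenvalues of this matrix are 3 and -1; the projection
# keeps only the component along the eigenvalue-3 direction.
P = project_psd(np.array([[1.0, 2.0], [2.0, 1.0]]))
assert np.all(np.linalg.eigvalsh(P) >= -1e-10)
```

Because this projection would run at every gradient update in a naive solver, reducing the number of updates (as the mini-batch and adaptive-sampling strategy below does) directly reduces the number of O(d^3) eigendecompositions.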
First, for the challenge arising from the PSD projection, we exploit a mini-batch strategy and adaptive sampling with a smooth loss function to significantly reduce the number of updates (i.e., projections) while maintaining similar performance. Second, for the challenge arising from high dimensionality, we propose a dual random projection approach, which enjoys the light computation of random projection and, at the same time, significantly improves its effectiveness. Third, for the challenge of large-scale constraints, we develop a novel multi-stage metric learning framework. It divides the original optimization problem into multiple stages and reduces the computation by adaptively sampling a small subset of constraints at each stage. Finally, to handle redundant features with group structure, we develop a greedy algorithm that selects a feature group and learns the corresponding metric simultaneously at each iteration, leading to a further improvement in learning efficiency when combined with the adaptive mini-batch strategy and incremental sampling. Besides the theoretical and empirical investigation of DML on machine learning benchmark datasets, we also apply the proposed methods to several important computer vision applications (i.e., fine-grained visual categorization (FGVC) and face recognition).
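To make the constraint-sampling idea concrete: a triplet constraint asks that an anchor lie closer (under the learned Mahalanobis metric $M$) to a same-class example than to a different-class one, and enumerating all such triplets is O(n^3) in the number of examples. The sketch below, assuming numpy, shows a standard triplet hinge loss and random subsampling of triplets in place of full enumeration; it is an illustration of the generic setup, not the dissertation's multi-stage sampling scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

def triplet_hinge_loss(M, x, x_pos, x_neg, margin=1.0):
    """Hinge loss for one triplet under metric M: the squared
    Mahalanobis distance to the same-class example x_pos should be
    smaller than that to the different-class example x_neg by `margin`."""
    d_pos = (x - x_pos) @ M @ (x - x_pos)
    d_neg = (x - x_neg) @ M @ (x - x_neg)
    return max(0.0, margin + d_pos - d_neg)

def sample_triplets(X, y, k):
    """Randomly draw k (anchor, positive, negative) index triplets
    rather than enumerating all O(n^3) of them."""
    triplets = []
    while len(triplets) < k:
        i, j, l = rng.integers(0, len(X), size=3)
        if i != j and y[i] == y[j] and y[i] != y[l]:
            triplets.append((i, j, l))
    return triplets
```

A solver built on such a subsample touches only a small, adaptively refreshed set of constraints per stage, which is the efficiency lever the multi-stage framework exploits.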
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Qian, Qi
- Thesis Advisors: Jin, Rong
- Committee Members: Aviyente, Sara Selin; Liu, Xiaoming; Tan, Pang-Ning
- Date: 2015
- Program of Study: Computer Science - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: xii, 129 pages
- ISBN: 9781339039794; 1339039796