You are here
Search results
(1  1 of 1)
 Title
 FASTER ALGORITHMS FOR MACHINE LEARNING PROBLEMS IN HIGH DIMENSION
 Creator
 Ye, Mingquan
 Date
 2019
 Collection
 Electronic Theses & Dissertations
 Description

When dealing with datasets with high dimension, the existing machine learning algorithms oftendo not work in practice. Actually, most of the realworld data has the nature of low intrinsicdimension. For example, data often lies on a lowdimensional manifold or has a low doublingdimension. Inspired by this phenomenon, this thesis tries to improve the time complexities of twofundamental problems in machine learning using some techniques in computational geometry.In Chapter two, we propose a bi...
Show moreWhen dealing with datasets with high dimension, the existing machine learning algorithms oftendo not work in practice. Actually, most of the realworld data has the nature of low intrinsicdimension. For example, data often lies on a lowdimensional manifold or has a low doublingdimension. Inspired by this phenomenon, this thesis tries to improve the time complexities of twofundamental problems in machine learning using some techniques in computational geometry.In Chapter two, we propose a bicriteria approximation algorithm for minimum enclosing ballwith outliers and extend it to the outlier recognition problem. By virtue of the “coreset” idea andthe Random Gradient Descent Tree, we propose an efficient algorithm which is linear in the numberof points n and the dimensionality d, and provides a probability bound. In experiments, comparedwith some existing outlier recognition algorithms, our method is proven to be efficient and robustto the outlier ratios.In Chapter three, we adopt the “doubling dimension” to characterize the intrinsic dimension ofa point set. By the property of doubling dimension, we can approximate the geometric alignmentbetween two point sets by executing the existing alignment algorithms on their subsets, whichachieves a much smaller time complexity. More importantly, the proposed approximate methodhas a theoretical upper bound and can serve as the preprocessing step of any alignment algorithm.
Show less