You are here
Search results
(1  1 of 1)
 Title
 Faster algorithms for machine learning problems in high dimension
 Creator
 Ye, Mingquan
 Date
 2019
 Collection
 Electronic Theses & Dissertations
 Description

"When dealing with datasets with high dimension, the existing machine learning algorithms often do not work in practice. Actually, most of the realworld data has the nature of low intrinsic dimension. For example, data often lies on a lowdimensional manifold or has a low doubling dimension. Inspired by this phenomenon, this thesis tries to improve the time complexities of two fundamental problems in machine learning using some techniques in computational geometry. In Chapter two, we propose...
Show more"When dealing with datasets with high dimension, the existing machine learning algorithms often do not work in practice. Actually, most of the realworld data has the nature of low intrinsic dimension. For example, data often lies on a lowdimensional manifold or has a low doubling dimension. Inspired by this phenomenon, this thesis tries to improve the time complexities of two fundamental problems in machine learning using some techniques in computational geometry. In Chapter two, we propose a bicriteria approximation algorithm for minimum enclosing ball with outliers and extend it to the outlier recognition problem. By virtue of the "coreset" idea and the Random Gradient Descent Tree, we propose an efficient algorithm which is linear in the number of points n and the dimensionality d, and provides a probability bound. In experiments, compared with some existing outlier recognition algorithms, our method is proven to be efficient and robust to the outlier ratios. In Chapter three, we adopt the "doubling dimension" to characterize the intrinsic dimension of a point set. By the property of doubling dimension, we can approximate the geometric alignment between two point sets by executing the existing alignment algorithms on their subsets, which achieves a much smaller time complexity. More importantly, the proposed approximate method has a theoretical upper bound and can serve as the preprocessing step of any alignment algorithm."Page ii.
Show less