Faster algorithms for machine learning problems in high dimension
"When dealing with datasets with high dimension, the existing machine learning algorithms often do not work in practice. Actually, most of the real-world data has the nature of low intrinsic dimension. For example, data often lies on a low-dimensional manifold or has a low doubling dimension. Inspired by this phenomenon, this thesis tries to improve the time complexities of two fundamental problems in machine learning using some techniques in computational geometry. In Chapter two, we propose a bi-criteria approximation algorithm for minimum enclosing ball with outliers and extend it to the outlier recognition problem. By virtue of the "core-set" idea and the Random Gradient Descent Tree, we propose an efficient algorithm which is linear in the number of points n and the dimensionality d, and provides a probability bound. In experiments, compared with some existing outlier recognition algorithms, our method is proven to be efficient and robust to the outlier ratios. In Chapter three, we adopt the "doubling dimension" to characterize the intrinsic dimension of a point set. By the property of doubling dimension, we can approximate the geometric alignment between two point sets by executing the existing alignment algorithms on their subsets, which achieves a much smaller time complexity. More importantly, the proposed approximate method has a theoretical upper bound and can serve as the preprocessing step of any alignment algorithm."--Page ii.
Read
- In Collections
-
Electronic Theses & Dissertations
- Copyright Status
- In Copyright
- Material Type
-
Theses
- Authors
-
Ye, Mingquan
- Thesis Advisors
-
Ding, Hu
- Committee Members
-
Torng, Eric
Tong, Yiying
Ding, Hu
- Date
- 2019
- Program of Study
-
Computer Science - Master of Science
- Degree Level
-
Masters
- Language
-
English
- Pages
- viii, 43 pages
- ISBN
-
9781392157640
1392157641
- Permalink
- https://doi.org/doi:10.25335/y842-1318