Active Learning in Genetic Programming
Active learning is an active area of machine learning research that aims to minimize the amount of training data required by selecting the training points that will be maximally informative for model development. Active learning has been successfully applied to many different types of machine learning, but until this dissertation, its application to genetic programming had not been thoroughly examined. Considering that genetic programming is already known to be less data-hungry than many other methods, it seemed natural that we could further reduce training data requirements for genetic programming by applying active learning methods. In this dissertation, I developed the active learning in genetic programming (AL-GP) method and demonstrated that it is flexible and can be applied to a diverse set of population-based machine learning systems across several problem domains to guide training data selection. This results in a reduction in the training data required to arrive at high-quality models. The method is shown to be effective across regression, image segmentation, and image classification problems.

For active learning in regression tasks, I explored the impact of model uncertainty and data diversity, both individually and together. For image analysis tasks, I explored the impact that ensemble diversity has on active learning success. In this work, I developed two new GP systems, StackGP and DT-GP. Additionally, I modified the existing SEE-Segment system to improve the search strategy. The AL-GP approach was shown to work with all three systems, which demonstrates that the approach is general and easy to adapt to any population-based machine learning system.

Although not directly linked to active learning, correlation as a fitness function was key to the success of StackGP for regression tasks and was shown to be more effective than the traditional RMSE fitness function.
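Two ideas from the abstract can be illustrated with a minimal sketch: choosing the next training point where an ensemble of models disagrees most (one common reading of "model uncertainty"), and scoring a model by correlation rather than RMSE. Everything here is an assumption made for illustration and is not taken from the dissertation: the function names, the `1 - r^2` form of the correlation fitness, the use of prediction standard deviation as the uncertainty measure, and the toy ensemble of lambdas standing in for evolved GP models.

```python
import numpy as np

def correlation_fitness(y_true, y_pred):
    """Return 1 - r^2, lower is better (assumed form, not from the source)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if np.std(y_true) == 0 or np.std(y_pred) == 0:
        return 1.0  # a constant output carries no correlation signal
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return 1.0 - r ** 2

def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def select_most_uncertain(ensemble, candidates):
    """Index of the candidate point where ensemble predictions disagree most."""
    preds = np.array([[model(x) for x in candidates] for model in ensemble])
    return int(np.argmax(preds.std(axis=0)))

# Hypothetical ensemble standing in for evolved GP models.
ensemble = [lambda x: x ** 2, lambda x: x ** 2 + 0.1 * x ** 3, lambda x: 2.0 * x]
candidates = np.linspace(0.0, 3.0, 7)
next_index = select_most_uncertain(ensemble, candidates)

# A model that is correct up to scale and offset: perfect correlation, nonzero RMSE.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = 2.0 * y_true + 5.0
scaled_fitness = correlation_fitness(y_true, y_pred)
scaled_rmse = rmse(y_true, y_pred)
```

In this sketch the correlation score ignores a linear scale and offset, which can be fitted afterwards, while RMSE penalizes them. That is one plausible reason a correlation-based fitness could ease the search, though the abstract itself does not give the mechanism.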
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Haut, Nathan
- Thesis Advisors: Punch, Bill; Banzhaf, Wolfgang
- Committee Members: Punch, Bill; Banzhaf, Wolfgang; Colbry, Dirk; Kotanchek, Mark
- Date: 2023
- Subjects: Computer science
- Degree Level: Doctoral
- Language: English
- Pages: 133 pages
- Permalink: https://doi.org/doi:10.25335/2e63-7m31