Application of topological data analysis and machine learning for mutation induced protein property change prediction
Mutagenesis is the process by which the genetic information of an organism is changed, resulting in a mutation. Many diseases are caused by protein mutations, including cystic fibrosis, Alzheimer's disease, and most cancers. To better understand mutation-induced changes in protein properties, accurate and efficient computational models are urgently needed.

For protein-protein binding affinity changes upon mutation ($\Delta\Delta G$), we built a prediction model called TopNetTree. Algebraic topology, a champion in recent worldwide competitions for protein-ligand binding affinity predictions, is a promising approach for simplifying the complexity of biological structures. Here, we introduce element-specific and site-specific persistent homology, a new branch of algebraic topology, to simplify the structural complexity of protein-protein complexes and embed crucial biological information into topological invariants. Additionally, we propose a new deep learning algorithm called NetTree, which takes advantage of convolutional neural networks and gradient boosting trees. A topology-based network tree (TopNetTree) is constructed by integrating the topological representation and NetTree for predicting protein-protein interaction (PPI) $\Delta\Delta G$. Tests on major benchmark datasets indicate that the proposed TopNetTree significantly improves on the current state of the art in $\Delta\Delta G$ prediction.

For mutation-induced protein folding energy change, we propose a machine learning model based on a local topological predictor (LTP). To characterize molecular structure, the Hessian matrix of the local surface is generated from exponential and Lorentz density kernels. The eigenvalues of the Hessian matrix are calculated as local topological predictors, which are then fed into a gradient boosting machine learning model as features. Our LTP model obtained state-of-the-art results on various benchmark datasets of mutation-induced protein folding energy change.
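The LTP feature construction described above can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes the commonly used generalized exponential kernel exp(-(r/eta)^kappa) and Lorentz kernel 1/(1 + (r/eta)^nu), defines the density at a point as the sum of kernel values over atoms, and approximates the Hessian by central finite differences. The function names, the parameters `eta`, `kappa`, `nu`, and the step size `h` are all hypothetical choices made for illustration.

```python
import numpy as np


def exponential_kernel(r, eta=1.0, kappa=2.0):
    # Generalized exponential kernel (assumed form): exp(-(r/eta)^kappa)
    return np.exp(-(r / eta) ** kappa)


def lorentz_kernel(r, eta=1.0, nu=2.0):
    # Generalized Lorentz kernel (assumed form): 1 / (1 + (r/eta)^nu)
    return 1.0 / (1.0 + (r / eta) ** nu)


def density(point, atoms, kernel):
    # Density at `point`: sum of kernel values over all atom positions.
    r = np.linalg.norm(atoms - point, axis=1)
    return kernel(r).sum()


def hessian_eigenvalues(point, atoms, kernel, h=1e-3):
    # Approximate the 3x3 Hessian of the density by central differences,
    # then return its eigenvalues in ascending order.
    point = np.asarray(point, dtype=float)
    atoms = np.asarray(atoms, dtype=float)
    hess = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            ei = np.zeros(3)
            ej = np.zeros(3)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (
                density(point + ei + ej, atoms, kernel)
                - density(point + ei - ej, atoms, kernel)
                - density(point - ei + ej, atoms, kernel)
                + density(point - ei - ej, atoms, kernel)
            ) / (4.0 * h * h)
    hess = 0.5 * (hess + hess.T)  # remove round-off asymmetry
    return np.linalg.eigvalsh(hess)


# Toy coordinates (made up, not from any protein structure).
atoms = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.2, 0.8]])
site = np.array([0.5, 0.4, 0.2])

# Concatenate eigenvalues from both kernels into one feature vector.
features = np.concatenate([
    hessian_eigenvalues(site, atoms, exponential_kernel),
    hessian_eigenvalues(site, atoms, lorentz_kernel),
])
```

In a pipeline of the kind the abstract describes, feature vectors like `features` would be computed for each mutation site and passed to a gradient boosting regressor. Analytic derivatives of the kernels could replace the finite differences where accuracy or speed matters.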
- In Collections: Electronic Theses & Dissertations
- Copyright Status: Attribution-NonCommercial-ShareAlike 4.0 International
- Material Type: Theses
- Authors: Wang, Menglun
- Thesis Advisors: Wei, Guowei
- Committee Members: Tong, Yiying; Tang, Moxun; Yan, Ming
- Date Published: 2021
- Subjects: Mathematics
- Program of Study: Applied Mathematics - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: 133 pages
- ISBN: 9798762102339
- Permalink: https://doi.org/doi:10.25335/5j26-xv52