Search results
(1 - 14 of 14)
- Title
- ALGEBRAIC TOPOLOGY AND GRAPH THEORY BASED APPROACHES FOR PROTEIN FLEXIBILITY ANALYSIS AND B FACTOR PREDICTION
- Creator
- Bramer, David
- Date
- 2019
- Collection
- Electronic Theses & Dissertations
- Description
-
Protein fluctuation, measured by B factors, has been shown to correlate strongly with protein flexibility and function. Several methods have been developed to predict protein B factors, as well as for related applications such as docking pose ranking, domain separation, entropy calculation, hinge detection, hot spot detection, and stability analysis. While many B factor methods exist, reliable B factor prediction remains an ongoing challenge and there is much room for improvement.
This work introduces a paradigm-shifting geometric graph based model called the multi-scale weighted colored graph (MWCG) model. The MWCG model is a new generation of computational algorithms that significantly improves the current landscape of protein structural fluctuation analysis. The MWCG model treats each protein as a colored graph where colored nodes correspond to atomic element types and edges are weighted by a generalized centrality metric. Each graph contains multiple subgraphs based on interaction types between graph nodes, and protein rigidity is represented by generalized centralities of subgraphs. MWCGs predict the B factors of protein residues and accurately analyze the flexibility of all atoms in a protein simultaneously. The MWCG model presented in this work captures element specific interactions across multiple scales and is a novel visual tool for identifying various protein secondary structures. This work also demonstrates MWCG protein hinge detection using a variety of proteins.
Cross protein prediction of protein B factors has previously been an unsolved problem for B factor prediction methods. Since many proteins are difficult to crystallize, and for some it is likely impossible, models that can cross predict protein B factors are absolutely necessary. By integrating machine learning and the advanced graph theory MWCG method, this work provides a robust cross protein B factor prediction solution using a set of known proteins to predict the B factors of a protein previously unseen by the algorithm. The algorithm connects different proteins using global protein features such as the resolution of the X-ray crystallography data. The combination of global and local features results in successful cross protein B factor prediction. To test and validate these results, this work considers several machine learning approaches such as random forest, gradient boosted trees, and deep convolutional neural networks.
Recently, persistent homology has had tremendous success in biomolecular data analysis. It works by examining the topological relationship or connectivity of a group of atoms in a molecule at a variety of scales, then rendering a family of topological representations of the molecule. However, persistent homology is rarely employed for the analysis of atomic properties, such as biomolecular flexibility analysis or B factor prediction. This work introduces atom specific persistent homology (ASPH) to provide a local atomic level representation of a molecule via a global topological tool. This is achieved through the construction of a pair of conjugated sets of atoms and corresponding conjugated simplicial complexes, as well as conjugated topological spaces. The difference between the topological invariants of the pair of conjugated sets is measured by bottleneck and Wasserstein metrics and leads to an atom specific topological representation of individual atomic properties in a molecule. Atom specific topological features are integrated with various machine learning algorithms, including gradient boosting trees and convolutional neural networks, for protein thermal fluctuation analysis and blind cross protein B factor prediction.
Extensive numerical testing indicates that the proposed methods provide novel and powerful graph theory and algebraic topology based tools for analyzing and predicting atom specific, localized protein flexibility information.
- Title
- The flexibility-rigidity index (FRI) : theory and applications
- Creator
- Opron, Kristopher
- Date
- 2016
- Collection
- Electronic Theses & Dissertations
- Description
-
Since the first protein structures were solved in the 1950s, the Protein Data Bank has grown to include over one hundred thousand macromolecular structures ranging in size from small peptides to large viral capsids. These experiments have shown that proteins exhibit a diverse range of structure and function and that these two aspects are closely related. In fact, it is often possible to predict a protein's function from its structure alone. Much of the focus to date has been on the more static regions of proteins for theoretical and practical reasons. However, even well folded proteins experience continual fluctuations due to the constant influence of outside forces, which drive motions that are relevant to function, such as sidechain fluctuations and conformational shifts. The possible movements that can arise from these fluctuations are determined by a protein's structure. This means that flexibility, or the ability to deform from the current conformation under external forces, is an intrinsic property of all proteins and is closely tied to function. In order to better study protein function in ordered or disordered proteins, we require accurate, efficient, multiscale tools for evaluating flexibility.
This work puts forward a multiscale, multiphysics and multidomain model, the flexibility-rigidity index (FRI), to estimate the flexibility and conformational motions of macromolecular structures. The basic assumption of the present FRI theory is that the geometry or structure of a given protein, together with its specific environment, completely determines the biological function and properties, including flexibility and charge. To this end, we utilize monotonically decreasing functions to measure the geometric compactness of a protein and quantify the topological connectivity of atoms or residues in proteins and nucleic acids. We define the total rigidity of a molecule by a summation of atomic rigidities. A practical validation of the proposed FRI for flexibility analysis is provided by the prediction of B-factors, or temperature factors, of proteins measured by X-ray crystallography. We employ a test set of 263 structurally distinct proteins to examine the validity and robustness of the proposed FRI method for B-factor estimation or flexibility prediction. The basic FRI algorithm outperforms GNM on this test set by about 20%. After validation of the basic FRI method, we introduce a multikernel-based multiscale FRI (mFRI) strategy to analyze macromolecular flexibility. The essential idea is to employ two or three kernels, each parameterized with a different scale, to capture the multiple characteristic interaction scales of complex biomolecules. Based on an expanded test set containing 364 proteins, we show that the mFRI method is about 22% more accurate than the GNM method in B-factor prediction. Most importantly, we demonstrate that the present mFRI gives rise to excellent flexibility analysis for many proteins that are difficult cases for GNM and the previously introduced single-scale FRI methods. Finally, for a protein of $N$ residues, we illustrate that the computational complexity of the proposed mFRI scales linearly, $\mathcal{O}(N)$, in contrast to $\mathcal{O}(N^3)$ for GNM.
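The description above says that monotonically decreasing functions of interatomic distance measure compactness and that total rigidity is a sum of atomic rigidities. A minimal sketch of that idea follows. It assumes a generalized exponential kernel, hypothetical parameter names `eta` and `kappa` with arbitrary default values, and flexibility taken as the reciprocal of rigidity; none of these specifics are stated in the record, and the naive double loop is quadratic in the number of atoms, so it does not reproduce the linear scaling reported for mFRI.

```python
import math


def rigidity_index(coords, eta=3.0, kappa=1.0):
    """Per-atom rigidity: mu_i = sum over j != i of exp(-(r_ij / eta) ** kappa).

    The exponential kernel and the defaults for eta and kappa are
    illustrative assumptions, not values taken from the thesis.
    """
    n = len(coords)
    mu = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            r_ij = math.dist(coords[i], coords[j])
            weight = math.exp(-((r_ij / eta) ** kappa))
            mu[i] += weight  # the kernel is symmetric, so both atoms
            mu[j] += weight  # receive the same contribution
    return mu


def flexibility_index(mu):
    """Flexibility taken as the reciprocal of rigidity (an assumed convention)."""
    return [1.0 / m if m > 0.0 else math.inf for m in mu]


# Hypothetical toy "structure": three close atoms and one distant atom.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
mu = rigidity_index(coords)
flex = flexibility_index(mu)
total_rigidity = sum(mu)  # molecule-level rigidity as a sum of atomic rigidities
print(flex)
```

In this toy input the distant atom has the fewest close neighbors, so it receives the lowest rigidity and the highest flexibility. A B-factor prediction would then fit such flexibility values to experimental B-factors, for example by linear regression, which is left out here.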
- Title
- Mathematical modeling and simulation of mechanoelectrical transducers and nanofluidic channels
- Creator
- Park, Jin Kyoung
- Date
- 2014
- Collection
- Electronic Theses & Dissertations
- Description
-
Remarkable advances in nanotechnology and computational approaches enable researchers to investigate physical and biological phenomena at an atomic or molecular scale. Smaller-scale approaches are important for studying the transport of ions and/or molecules through ion channels in living organisms as well as through exquisitely fabricated nanofluidic channels. Both subjects have similar physical properties and hence share common mathematical interests and challenges in modeling and simulating the transport phenomena. In this work, we first propose and validate a molecular level prototype for the mechanoelectrical transducer (MET) channel in mammalian hair cells.
Next, we design three ionic diffusive nanofluidic channels with different types of atomic surface charge distribution, and explore the current properties of each channel. We construct the molecular level prototype, which consists of a charged blocker, a realistic ion channel and its surrounding membrane. The Gramicidin A channel is employed to demonstrate the realistic channel structure, and the blocker is a positively charged atom of radius 1.5 Å, which is placed at the mouth region of the channel. Relocating this blocker along one direction just outside the channel mouth imitates the opening and closing behavior of the MET channel. In our atomic scale design for an ionic diffusive nanofluidic channel, the atomic surface charge distribution is easy to modify by varying the quantities and signs of atomic charges, which are equally placed slightly above the channel surface. Our proposed nanofluidic system consists of a geometrically well-defined cylindrical channel and two reservoirs of KCl solution. For both the mammalian MET channel and the ionic diffusive nanofluidic channel, we employ a well-established ion channel continuum theory, Poisson-Nernst-Planck (PNP) theory, for three dimensional numerical simulations. In particular, for the nano-scaled channel descriptions, the generalized PNP equations are derived by using a variational formulation and by incorporating non-electrostatic interactions. We utilize several useful mathematical algorithms, such as Dirichlet to Neumann mapping and the matched interface and boundary method, in order to validate the proposed models with charge singularities and complex geometry. Moreover, the second-order accuracy of the proposed numerical methods is confirmed with our nanofluidic system affected by a single atomic charge and by eight atomic charges, and we further study the channels with a unipolar charge distribution of negative ions and with a bipolar charge distribution. Finally, we analyze electrostatic potential and ion conductance through each channel model under the influence of diverse physical conditions, including external applied voltage, bulk ion concentration and atomic charge. Our MET channel prototype shows outstanding agreement with experimental observation of rat cochlear outer hair cells in terms of open probability. This result also suggests that the tip link, a connector between adjacent stereocilia, gates the MET channel. Similarly, numerical findings for our proposed ionic diffusive realistic nanochannels, such as ion selectivity, ion depletion and accumulation, and potential wells, are in remarkable accordance with those from experimental measurements and numerical simulations in the literature. In addition, simulation results support the controllability of the current within a nanofluidic channel.
- Title
- Integration of topological fingerprints and machine learning for the prediction of chemical mutagenicity
- Creator
- Cao, Yin (Quantitative analyst)
- Date
- 2017
- Collection
- Electronic Theses & Dissertations
- Description
-
"Toxicity refers to the interaction between chemical molecules that leads to adverse effects in biological systems, and mutagenicity is one of its most important endpoints. Prediction of chemical mutagenicity is essential to ensuring the safety of drugs, foods, etc. In silico modeling of chemical mutagenicity, as a replacement of in-vivo bioassays, is increasingly encouraged, due to its efficiency, effectiveness, lower cost and less reliance on animal tests. The quality of a good molecular representation is usually the key to building an accurate and robust in silico model, in that each representation provides a different way for the machine to look at the molecular structure. While most molecular descriptors were introduced based on the physio-chemical and biological activities of chemical molecules, in this study, we propose a new topological representation for chemical molecules, the combinatorial topological fingerprints (CTFs) based on persistent homology, knowing that persistent homology is a suitable tool to extract global topological information from a discrete sample of points. The combination of the proposed CTFs and machine learning algorithms could give rise to efficient and powerful in silico models for mutagenic toxicity prediction. Experimental results on a developmental toxicity dataset have also shown the predictive power of the proposed CTFs and its competitive advantages of characterizing and representing chemical molecules over existing fingerprints."--Page ii.
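The description notes that persistent homology extracts global topological information from a discrete sample of points. As a minimal sketch of that general idea, and not of the thesis's CTF construction, the code below computes only the 0-dimensional barcode (connected components) of a distance-threshold filtration using a union-find structure. The function name `h0_barcode`, the toy points, and the convention that a bar dies at the pairwise distance where two components merge are all assumptions made for illustration.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return 0-dimensional persistence bars (birth, death) for a point cloud.

    Every point is born at scale 0. When the distance threshold reaches the
    length of an edge joining two separate components, one component dies.
    """
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All pairwise edges, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j      # merge the two components
            bars.append((0.0, length))   # one component dies at this scale
    bars.append((0.0, math.inf))         # the last component never dies
    return bars


# Hypothetical toy sample: a tight cluster of three points and a distant pair.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0), (11.0, 0.0)]
bars = h0_barcode(points)
print(bars)
```

The one long finite bar reflects the gap between the two clusters, which is the kind of multiscale signal a fingerprint can be built from. Higher-dimensional features such as loops and cavities need a full simplicial complex and are outside this sketch.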
- Title
- Multiscale modeling and computation of nano-electronic transistors and transmembrane proton channels
- Creator
- Chen, Duan
- Date
- 2010
- Collection
- Electronic Theses & Dissertations
- Description
-
The miniaturization of nano-scale electronic transistors, such as metal oxide semiconductor field effect transistors (MOSFETs), has given rise to a pressing demand for new theoretical understanding and practical tactics for dealing with quantum mechanical effects in integrated circuits. In biology, proton dynamics and transport across membrane proteins are of paramount importance to the normal function of living cells. Similar physical characteristics lie behind the two subjects, and model simulations share common mathematical interests and challenges. In this thesis work, multiscale and multiphysical models are proposed to study the mechanisms of nano-transistors and of proton transport across membranes at the atomic level.
For nano-electronic transistors, we introduce a unified two-scale energy functional to describe the electrons and the continuum electrostatic potential. This framework enables us to put microscopic and macroscopic descriptions on an equal footing at the nano-scale. Additionally, this model includes the layered structures and random doping effect of nano-transistors.
For transmembrane proton channels, we describe proton dynamics quantum mechanically via a density functional approach while implicitly treating numerous solvent molecules as a dielectric continuum. The densities of all other ions in the solvent are assumed to obey the Boltzmann distribution. The impact of protein molecular structure and its charge polarization on the proton transport is considered in atomic detail. We formulate a total free energy functional to include kinetic and potential energies of protons, as well as electrostatic energy of all other ions, on an equal footing.
For both nano-transistor and proton channel systems, the variational principle is employed to derive nonlinear governing equations. The Poisson-Kohn-Sham equations are derived for nano-transistors, while the generalized Poisson-Boltzmann equation and Kohn-Sham equation are obtained for proton channels. Related numerical challenges in simulations are addressed: the matched interface and boundary (MIB) method, the Dirichlet-to-Neumann mapping (DNM) technique, and the Krylov subspace and preconditioner theory are introduced to improve the computational efficiency of solving the Poisson-type equation. The quantum transport theory is employed to solve the Kohn-Sham equation. The Gummel iteration and relaxation technique are utilized for overall self-consistent iterations.
Finally, applications are considered and the models are validated with realistic nano-transistors and transmembrane proteins. Two distinct device configurations, a double-gate MOSFET and a four-gate MOSFET, are considered in our three dimensional numerical simulations. For these devices, the current fluctuation and voltage threshold lowering effect induced by discrete dopants are explored. For proton transport, a realistic channel protein, Gramicidin A (GA), is used to demonstrate the performance of the proposed proton channel model and validate the efficiency of the proposed mathematical algorithms. The electrostatic characteristics of the GA channel are analyzed with a wide range of model parameters. Proton channel conductances are studied over a number of applied voltages and reference concentrations. Comparisons with experimental data are utilized to verify our model predictions.
- Title
- Application of topological data analysis and machine learning for mutation induced protein property change prediction
- Creator
- Wang, Menglun
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
Mutagenesis is a process by which the genetic information of an organism is changed, resulting in a mutation. Many diseases are caused by protein mutations, including cystic fibrosis, Alzheimer's disease and most cancers. To better understand mutation induced changes in protein properties, accurate and efficient computational models are urgently needed. For protein-protein binding affinity changes upon mutation ($\Delta\Delta G$), we built a prediction model called TopNetTree. Algebraic topology, a champion in recent worldwide competitions for protein-ligand binding affinity predictions, is a promising approach for simplifying the complexity of biological structures. Here, we introduce element-specific and site-specific persistent homology, a new branch of algebraic topology, to simplify the structural complexity of protein-protein complexes and embed crucial biological information into topological invariants. Additionally, we propose a new deep learning algorithm called NetTree to take advantage of convolutional neural networks and gradient boosting trees. A topology-based network tree (TopNetTree) is constructed by integrating the topological representation and NetTree for predicting PPI $\Delta\Delta G$. Tests on major benchmark datasets indicate that the proposed TopNetTree significantly improves the current state of the art in $\Delta\Delta G$ prediction.
For mutation induced protein folding energy change, we propose a local topological predictor (LTP) based machine learning model. To characterize molecular structure, the Hessian matrix of the local surface is generated from exponential and Lorentz density kernels. Eigenvalues of the Hessian matrix are calculated as local topological predictors, which are then fed into a gradient boosting machine learning model as features. Our LTP model obtained state-of-the-art results for various benchmark data sets of mutation induced protein folding energy change.
- Title
- Discrete de Rham-Hodge Theory
- Creator
- Zhao, Rundong
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
-
We present a systematic treatment of 3D shape analysis based on the well-established de Rham-Hodge theory in differential geometry and topology. The computational tools we developed are widely applicable to research areas such as computer graphics, computer vision, and computational biology. We extensively tested them in the context of 3D structure analysis of biological macromolecules to demonstrate the efficacy and efficiency of our method in potential applications. Our contributions are summarized in the following aspects. First, we present a compendium of discrete Hodge decompositions of vector fields, which provides the primary building block of the de Rham-Hodge theory for computations performed on the commonly used tetrahedral meshes embedded in 3D Euclidean space. Second, we present a real-world application of the above computational tool to 3D shape analysis of biological macromolecules. Finally, we extend the above method to an evolutionary de Rham-Hodge method to provide a unified paradigm for the multiscale geometric and topological analysis of evolving manifolds constructed from a filtration, which induces a family of evolutionary de Rham complexes. Our work on the decomposition of vector fields, spectral shape analysis of static shapes, and evolving shapes has already shown its effectiveness in biomolecular applications and will lead to a rich set of features for machine learning-based shape analysis currently under development.
- Title
- Algebraic topology and machine learning for biomolecular modeling
- Creator
- Cang, Zixuan
- Date
- 2018
- Collection
- Electronic Theses & Dissertations
- Description
-
Data is expanding at an unprecedented speed in both quantity and size. Topological data analysis provides excellent tools for analyzing high dimensional and highly complex data. Inspired by the ability of topological data analysis to characterize data robustly and at multiple scales, and motivated by the demand for practical predictive tools in computational biology and biomedical research, this dissertation extends the capability of persistent homology toward quantitative and predictive data analysis tools with an emphasis on biomolecular systems. Although persistent homology is almost parameter free, careful treatment is still needed to obtain practically useful prediction models for realistic systems. This dissertation carefully assesses the representability of persistent homology for biomolecular systems and introduces a collection of characterization tools for both macromolecules and small molecules, focusing on intra- and inter-molecular interactions, chemical complexities, electrostatics, and geometry. The representations are then coupled with deep learning and machine learning methods for several problems in drug design and biophysical research. In real-world applications, data often come with heterogeneous dimensions and components. For example, in addition to location, atoms of biomolecules can also be labeled with chemical types, partial charges, and atomic radii. While persistent homology is powerful in analyzing the geometry of data, it lacks the ability to handle non-geometric information. Based on cohomology, we introduce a method that attaches the non-geometric information to the topological invariants in persistent homology analysis. This method is not only useful for handling biomolecules but can also be applied to general situations where the data carries both geometric and non-geometric information. In addition to describing biomolecular systems as a static frame, we are often interested in the dynamics of the systems. An efficient way to study them is to assign an oscillator to each atom and study the coupled dynamical system induced by atomic interactions. To this end, we propose a persistent homology based method for the analysis of the resulting trajectories from the coupled dynamical system. The methods developed in this dissertation have been applied to several problems, namely, prediction of protein stability change upon mutation, protein-ligand binding affinity prediction, virtual screening, and protein flexibility analysis. The tools have shown top performance in both commonly used validation benchmarks and community-wide blind prediction challenges in drug design.
- Title
- Mathematical modeling and computation of molecular solvation and binding
- Creator
- Wang, Bao
- Date
- 2016
- Collection
- Electronic Theses & Dissertations
- Description
-
This dissertation contains several results on biophysics modeling and computation, ranging from solvated molecular conformation modeling to molecular solvation and binding modeling in the solvent environment.
We study the solvent excluded surface in Eulerian representation and provide the surface area and enclosed volume calculation; the molecular topological analysis is also addressed. We further analyze the electrostatics of solvated molecules with the Eulerian solvent excluded surface. We show that our surface is analytical without any numerical approximation.
We study the coarse grid Poisson-Boltzmann solver. Our software enables extremely accurate numerical solution of the Poisson-Boltzmann equation even at very large grid spacing. As a consequence, our software provides reliable electrostatic calculations for solvation and protein-ligand binding related problems.
We study the blind solvation free energy prediction problem. A hybrid physical and statistical protocol is proposed for highly accurate solvation free energy prediction. Furthermore, to mitigate the influence of force field parametrization on the solvation free energy prediction, we propose a learning-to-rank based solvation free energy prediction paradigm.
We explore protein-ligand binding free energy prediction and docking scoring via the learning-to-rank approach, in which a learning-to-rank based scoring function is proposed for accurate protein-ligand binding scoring.
- Title
- AUTO-PARAMETRIZED KERNEL METHODS FOR BIOMOLECULAR MODELING
- Creator
- Szocinski, Timothy Andrew
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
Being able to predict various physical quantities of biomolecules is of great importance to biologists, chemists, and pharmaceutical companies. By applying machine learning techniques to develop these predictive models, we find much success in our endeavors. Advanced mathematical techniques involving graph theory, algebraic topology, differential geometry, etc. have been very profitable in generating first-rate biomolecular representations that are used to train a variety of machine learning models. Some of these representations depend on a choice of kernel function along with parameters that determine its shape. These kernel-based methods of producing features require careful tuning of the kernel parameters, and the tuning cost increases exponentially as more kernels are involved. This limitation largely restricts us to the use of machine learning models with fewer hyper-parameters, such as random forest (RF) and gradient-boosting trees (GBT), thus precluding the use of neural networks for kernel-based representations. To alleviate these concerns, we have developed the auto-parametrized weighted element-specific graph neural network (AweGNN), which uses kernel-based geometric graph features in which the kernel parameters are automatically updated throughout the training to reach an optimal combination of kernel parameters. The AweGNN models have been shown to be particularly successful in toxicity and solvation predictions, especially when a multi-task approach is taken. Although the AweGNN introduced hundreds of parameters that were automatically tuned, the ability to include multiple kernel types simultaneously was hindered by the computational expense. In response, the GPU-enhanced AweGNN was developed to tackle the issue. Working with GPU architecture, the AweGNN's computation speed was greatly enhanced. To achieve a more comprehensive representation, we suggested a network consisting of fixed topological and spectral auxiliary features to bolster the original AweGNN success. The proposed network was tested on new hydration and solubility datasets, with excellent results. To extend the auto-parametrized kernel technique to include features of a different type, we introduced the theoretical foundation for building an auto-parametrized spectral layer, which uses kernel-based spectral features to represent biomolecular structures. In this dissertation, we explore some underlying notions of mathematics useful in our models, review important topics in machine learning, discuss techniques and models used in molecular biology, detail the AweGNN architecture and results, and test and expand new concepts pertaining to these auto-parametrized kernel methods.
- Title
- Integration of topological data analysis and machine learning for small molecule property predictions
- Creator
- Wu, Kedi
- Date
- 2018
- Collection
- Electronic Theses & Dissertations
- Description
-
Accurate prediction of small molecule properties is of paramount importance to drug design and discovery. A variety of quantitative properties of small molecules has been studied in this thesis. These properties include solvation free energy, partition coefficient, aqueous solubility, and toxicity endpoints. The highlight of this thesis is to introduce an algebraic topology based method, called element specific persistent homology (ESPH), to predict small molecule properties. Essentially, ESPH describes molecular properties in terms of multiscale and multicomponent topological invariants and is different from conventional chemical and physical representations. Based on ESPH and its modified version, element-specific topological descriptors (ESTDs) are constructed. The advantage of ESTDs is that they are systematic, comprehensive, and scalable with respect to molecular size and composition variations, and are readily suitable for machine learning methods, rendering topological learning algorithms. Due to the inherent correlation between different small molecule properties, multi-task frameworks are further employed to simultaneously predict related properties. Deep neural networks, along with ensemble methods such as random forest and gradient boosting trees, are used to develop quantitative predictive models. Physics-based molecular descriptors and auxiliary descriptors are also used in addition to ESTDs. As a result, we obtain state-of-the-art results for various benchmark data sets of small molecule properties. We have also developed two online servers for predicting properties of small molecules, TopP-S and TopTox. TopP-S is software for topological learning predictions of partition coefficient and aqueous solubility, and TopTox is software for computing element-specific topological descriptors (ESTDs) for toxicity endpoint predictions. They are available at http://weilab.math.msu.edu/TopP-S/ and http://weilab.math.msu.edu/TopTox/, respectively.
- Title
- Aspects of Computational Topology and Mathematical Virology
- Creator
- Wang, Rui
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Being able to describe the shape of data is of paramount importance to the fields of biology, physics, chemistry, pharmaceutics, etc. Therefore, in recent years, scientists from the TDA community have been applying advanced mathematical tools to decode the topological structures of data. Methods such as persistent homology, path homology, and de Rham-Hodge theory have become the main workhorses of TDA, which pioneered new branches in algebraic topology and differential geometry. Later, various topological Laplacians such as the graph Laplacian, Hodge Laplacian, sheaf Laplacian, and Dirac Laplacian were proposed to preserve topological invariants and geometric shapes simultaneously. However, such Laplacians fail to extract the topological and geometric deformations when filtration parameters are introduced. Therefore, we proposed new topological Laplacians called persistent Laplacians to fully recover the topological persistence and homotopic shape evolution during filtration. It is worth mentioning that persistent Laplacians are insensitive to asymmetry or directed relations, which limits their power to preserve the directional information of structures in practical applications. Therefore, we proposed persistent path Laplacians to overcome this issue. Similar to the persistent Laplacians, one can also extract the topological persistence and geometric deformations during filtration from the persistent path Laplacians by calculating their harmonic and non-harmonic spectra. In addition, the persistent path Laplacians are constructed on directed graphs or networks, which addresses the importance of directional representation in datasets such as gene regulation datasets in biology.
Versatile mathematical tools have been playing an essential role in various biological applications. Since the first COVID-19 case was reported in December 2019, researchers worldwide have been pursuing scientific endeavors in the SARS-CoV-2 projects. Instead of designing promising vaccines and antibody therapies that required wet lab resources, we proposed a new mathematical-AI model called TopNetmAb to systematically analyze the mutation-induced impacts on the SARS-CoV-2 infectivity, vaccines, and antibody drugs. In this dissertation, topological data analysis (including the persistent Laplacians mentioned above), artificial intelligence, various network models, and genomics analysis are all included in our SARS-CoV-2-related projects to provide comprehensive representations for the understanding of the transmission and evolution of SARS-CoV-2.
- Title
- Variational approaches in molecular electrostatics, surface formation, shock capturing and nano-transistors
- Creator
- Hu, Langhua
- Date
- 2013
- Collection
- Electronic Theses & Dissertations
- Description
-
This dissertation covers several topics in Applied Mathematics, including the nonlinear Poisson equation (NLPE) with application in electrostatics and solvation analysis for biological systems, the partial differential equation (PDE) transform for hyperbolic conservation laws, the high order fractional PDE transform for molecular surface construction, and the Poisson-Kohn-Sham equation for modeling geometric, thermal and tunneling effects on nano-transistors.
Electrostatic interactions are ubiquitous in nature and fundamental for chemical, biological and material sciences. The Poisson equation is a widely accepted model for electrostatic analysis. However, the Poisson equation is derived based on electric polarizations in a linear, isotropic and homogeneous dielectric medium. We introduce a nonlinear Poisson equation to take into consideration hyperpolarization effects due to intensive charges and possible nonlinear, anisotropic and heterogeneous media. A variational principle is utilized to derive the nonlinear Poisson model from an electrostatic energy functional. To apply the proposed nonlinear Poisson equation to solvation analysis, we also construct a nonpolar solvation energy functional based on the nonlinear Poisson equation by using geometric measure theory. The proposed nonlinear Poisson theory is extensively validated by the electrostatic analysis of the Kirkwood model, 17 small molecules and 20 proteins at a fixed temperature as well as 21 compounds at different temperatures. A good agreement between our results and experimental data as well as theoretical results suggests that the proposed nonlinear Poisson model is a potentially useful model for electrostatic analysis involving hyperpolarization effects.
In our next work, we introduce the use of the PDE transform, paired with the Fourier pseudospectral method (FPM), as a new approach for hyperbolic conservation law problems, which remain interesting and challenging due to the diversity of physical origins and complexity of the physical situations. The PDE transform, based on the use of arbitrarily high order evolution PDEs, is a new algorithm for splitting signals, surfaces and data into functional mode functions, such as trend, edge, noise, etc. A fast PDE transform implemented by the fast Fourier transform (FFT) is introduced to avoid the stability constraints of integrating high order evolution PDEs. An adaptive measure of total variations is utilized to automatically switch on and off the PDE transform during the time integration of conservation law equations. A variety of standard benchmark test problems of hyperbolic conservation laws is employed to systematically validate the performance of the present PDE transform based FPM. The impact of two PDE transform parameters, i.e., the highest order and the propagation time, is carefully studied to deliver the best effect of suppressing Gibbs' oscillations. The PDE orders of 2-6 are used for hyperbolic conservation laws of low oscillatory solutions, while the PDE orders of 8-12 are often required for problems involving highly oscillatory solutions, such as shock-entropy wave interactions. The present results are compared with those in the literature. It is found that the present approach not only works well for problems that favor low order shock capturing schemes, but also exhibits superb behavior for problems that require the use of high order shock capturing methods.
Furthermore, we study the high-order fractional PDE transform based on fractional derivatives with application in molecular surface generation. Fractional derivative or fractional calculus plays a significant role in theoretical modeling of scientific and engineering problems. However, only relatively low order fractional derivatives are used at present. Our work introduces arbitrarily high-order PDEs to describe fractional hyper-diffusions. The fractional PDEs are constructed via a fractional variational principle. Furthermore, we construct the fractional PDE transform based on arbitrarily high-order fractional PDEs. We demonstrate that the use of arbitrarily high-order derivatives gives rise to time-frequency localization, the control of the spectral distribution, and the regulation of the spatial resolution in the fractional PDE transform. Consequently, the fractional PDE transform enables the mode decomposition of images, signals, and surfaces. A fast fractional Fourier transform (FFFT) is proposed to numerically integrate the high-order fractional PDEs so as to avoid stringent stability constraints in solving high-order evolution PDEs. The proposed high-order fractional PDE transform is applied to the surface generation of proteins. We first validate the proposed method with a variety of test examples in two and three-dimensional settings. The impact of high-order fractional derivatives on surface analysis is examined. Computational efficiency of the present surface generation method is compared with the MSMS approach in Cartesian representation. We further validate the present method by examining some benchmark indicators of macromolecular surfaces, i.e., surface area, surface enclosed volume, surface electrostatic potential and solvation free energy. Extensive numerical experiments and comparison with an established surface model indicate that the proposed high-order fractional PDEs are robust, stable and efficient for biomolecular surface generation.
The last part of my work is in the field of nano-scale electronic transistors. The miniaturization of nano-scale electronic transistors, such as metal oxide semiconductor field effect transistors (MOSFETs), has given rise to a pressing demand for new theoretical understanding and practical tactics for dealing with quantum mechanical effects in integrated circuits. We study the effects of geometry of semiconductor-insulator interfaces, phonon-electron interactions, and quantum tunneling of nano-transistors. Mathematical models of these factors are based on a unified two-scale energy functional that describes free energy of electrons and their interactions with external environments. Related numerical tools and algorithms are introduced to perform simulations on 3D nano four-gate MOSFETs with different geometries of silicon/silicon dioxide interfaces. Phonon-electron interactions are modeled in the fashion of density functional theory and integrated in the general free energy formulation. Quantum tunneling effects are defined as electron tunneling ratios and calculated for each type of nano-MOSFET. The performance of nano-transistors is explored in terms of current-voltage (I-V) curves and quantized transport energy profiles in a wide range of device parameters.
- Title
- Differential geometry based multiscale modeling of solvation
- Creator
- Chen, Zhan
- Date
- 2011
- Collection
- Electronic Theses & Dissertations
- Description
-
Solvation is an elementary process in nature and is of paramount importance to many sophisticated chemical, biological and biomolecular processes. The understanding of solvation is an essential prerequisite for the quantitative description and analysis of biomolecular systems. Implicit solvent models, particularly those based on the Poisson-Boltzmann (PB) equation for electrostatic analysis, are established approaches for solvation analysis. However, ad hoc solvent-solute interfaces are commonly used in the implicit solvent theory and have some severe limitations.
We have introduced differential geometry based solvation models which allow the solvent-solute interface to be determined by the variation of a total free energy functional. Our models extend the scaled particle theory (SPT) of nonpolar solvation models with a solvent-solute interaction potential. The nonpolar solvation model is completed with a PB theory based polar solvation model. In our Eulerian formulation, the differential geometry theory of hypersurfaces is utilized to define and construct smooth interfaces with good stability and differentiability, for use in characterizing the solvent-solute boundaries and in generating continuous dielectric functions across the computational domain. Some techniques from geometric measure theory are employed to rigorously convert a Lagrangian formulation of the surface energy into an Eulerian formulation, so as to bring all energy terms on an equal footing. In our Lagrangian formulation, the differential geometry theory of surfaces is used to provide a natural description of solvent-solute interfaces. By optimizing the total free energy functional, we derive a coupling of the generalized Poisson-Boltzmann equation (GPBE) and the generalized geometric flow equation (GGFE, also called the Laplace-Beltrami equation) for the electrostatic potential and the construction of realistic solvent-solute boundaries, respectively. The coupled partial differential equations (PDEs) are solved with iterative procedures to reach a steady state, which delivers the desired solvent-solute interface and electrostatic potential for many problems of interest. These quantities are utilized to evaluate the solvation free energies, protein-protein binding affinities, etc.
The above proposed approaches have been extensively validated. Extensive numerical experiments have been designed to validate the present theoretical models, to test the computational methods, and to optimize the numerical algorithms. Solvation analysis of both small compounds and proteins is carried out to further demonstrate the accuracy, stability, efficiency and robustness of the present new models and numerical approaches. Comparison is given to both experimental and theoretical results in the literature.
Moreover, to account for the charge rearrangement during the solvation process, we also propose a differential geometry based multiscale solvation model which makes use of electron densities computed directly from a quantum mechanical approach. We construct a new total energy functional, which consists of not only polar and nonpolar solvation contributions, but also the electronic kinetic and potential energies. We show that the quantum formulation of our solvation model improves the prediction of our earlier models, and outperforms some explicit solvation analysis.