DIFFERENTIAL GEOMETRY BASED MULTISCALE MODELING OF SOLVATION By Zhan Chen A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Applied Mathematics 2011 ABSTRACT DIFFERENTIAL GEOMETRY BASED MULTISCALE MODELING OF SOLVATION By Zhan Chen Solvation is an elementary process in nature and is of paramount importance to many sophisticated chemical, biological and biomolecular processes. The understanding of solvation is an essential prerequisite for the quantitative description and analysis of biomolecular systems. Implicit solvent models, particularly those based on the Poisson-Boltzmann (PB) equation for electrostatic analysis, are established approaches for solvation analysis. However, ad hoc solvent-solute interfaces are commonly used in the implicit solvent theory and have some severe limitations. We have introduced differential geometry based solvation models which allow the solventsolute interface to be determined by the variation of a total free energy functional. Our models extend the scaled particle theory (SPT) of nonpolar solvation models with a solventsolute interaction potential. The nonpolar solvation model is completed with a PB theory based polar solvation model. In our Eulerian formation, the differential geometry theory of hypersurface is utilized to define and construct smooth interfaces with good stability and differentiability, for use in characterizing the solvent-solute boundaries and in generating continuous dielectric functions across the computational domain. Some techniques from the geometric measure theory are employed to rigorously convert a Lagrangian formulation of the surface energy into an Eulerian formulation, so as to bring all energy terms on an equal footing. In our Lagrangian formulation, the differential geometry theory of surfaces is used to provide a natural description of solvent-solute interfaces. By optimizing the total free energy functional, we derive a coupling of the generalized Poisson-Boltzmann equation (GPBE) and the generalized geometric flow equation (GGFE or also called Laplace-Beltrami equation) for the electrostatic potential and the construction of realistic solvent-solute boundaries, respectively. The coupled partial differential equations (PDEs) are solved with iterative procedures to reach a steady state, which delivers the desired solvent-solute interface and electrostatic potential for many problems of interest. These quantities are utilized to evaluate the solvation free energies, protein-protein binding affinities, etc. The above proposed approaches have been extensively validated.Extensive numerical experiments have been designed to validate the present theoretical models, to test the computational methods, and to optimize the numerical algorithms. Solvation analysis of both small compounds and proteins are carried out to further demonstrate the accuracy, stability, efficiency and robustness of the present new models and numerical approaches. Comparison is given to both experimental and theoretical results in the literature. Moreover, to account for the charge rearrangement during the solvation process, we also propose a differential geometry based multiscale solvation model which makes use of electron densities computed directly from a quantum mechanical approach. We construct a new total energy functional, which consists of not only polar and nonpolar solvation contributions, but also the electronic kinetic and potential energies. We show that the quantum formulation of our solvation model improves the prediction of our earlier models, and outperforms some explicit solvation analysis. ACKNOWLEDGMENT First of all, I would like to express my deepest and sincerest gratitude to my dissertation advisor, Dr. Guowei Wei for his invaluable supports, guidance and personal care. It is beyond my words of appreciation that he is a far-sighted person with a broad interdisciplinary perspective, and that he has a unique creative insight which often sheds light on undiscovered potentialities of a subject. More than being a polymath, he cares about his graduate students so much that I have not seen any other advisor like him. Throughout these years, he has been cultivating my passion in science and academic aspirations. He has enormous patience on me, even during the most challenging times in my Ph.D. study. His spirit of pursuing academic dreams and perfection has a life-long influence on me. It has been an enormously enjoyable experience working with him. My gratitude also goes to my defense committee members, Dr. Changyi Wang, Dr. Leslie Kuhn, Dr. Chichia Chiu, and Dr. Moxun Tang for their expertise and precious time. I also would like to thank Dr. Zhengfang Zhou for his generous supports and unceasing cares for my wife and me throughout our Ph.D. program, and Dr. Keith Promislow for his kindness and the inspiring classroom discussions, and Dr. Nathan Baker for his supports and intense interests in our work of sovlation. Moreover, I would like to thank Dr. Shan Zhao for his continuing helps and encouragement in my research. Dr. Zhao is the one who taught me how to write FORTRAN codes after I joined this group. My thanks also go to the rest of current and former members in Dr. Wei’s group: Dr. Yongcheng Zhou, Dr. Yuhui Sun, Dr. Sining Yu, Dr Weihua, Geng, Dr. Duan Chen, Dr. Siyang Yang, Ms. Qiong Zheng, Mr. Langhua Hu, Ms. Jin Kyoung Park, iv Ms. Yajie Yu, Ms. Weijuan Zhou, Mr. Wangheng Liu, Mr. Manfeng Hu, Ms. Shuailing Wang, Mr. Qi Zhao, Mr. Kelin Xia, Mr. Huibing Zhu, for their valuable discussions and helps in my research and daily life. Our group is like a big loving family. I will always treasure our friendships. I am also grateful to Ms. Barbara Miller, Graduate Secretary in the Department of Mathematics, for her generous helps during my graduate study, and to Mr. Nick Boros for proofreading a part of my dissertation. Finally, from my very deep heart I thank Yuting, my wife and loved one, for her constant love, understanding and supports throughout my Ph.D. study. She is always patient to me and always has trusts in me. To her, words can not express my gratitude. v To my family: Baokun, Ziying, Yan, Xiang and Yuting vi TABLE OF CONTENTS List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . List of Figures x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Introduction 1.1 Introduction to solvation models . . . . . . . . . . . . . . . . . . 1.1.1 Biological background . . . . . . . . . . . . . . . . . . . 1.1.2 Polar solvation models . . . . . . . . . . . . . . . . . . . 1.1.3 Poisson-Boltzmann theory . . . . . . . . . . . . . . . . . 1.1.4 Nonpolar solvation models . . . . . . . . . . . . . . . . . 1.2 Molecular interface definitions . . . . . . . . . . . . . . . . . . . 1.3 Quantum mechanical continuum models . . . . . . . . . . . . . 1.4 Limitations of current models . . . . . . . . . . . . . . . . . . . 1.5 Mathematical issues and numerical challenges . . . . . . . . . . 1.5.1 Geometry, PDE and interface . . . . . . . . . . . . . . . 1.5.2 Geometric flow equation . . . . . . . . . . . . . . . . . . 1.5.3 Highly accurate and efficient solver for interface problems 1.5.4 Self-consistent iterative methods . . . . . . . . . . . . . . 1.6 The rest of this thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Eulerian formulation 2.1 Theory and model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1.1 Solute-solvent boundary . . . . . . . . . . . . . . . . . . . . . 2.1.2 Total free energy functional . . . . . . . . . . . . . . . . . . . 2.1.2.1 Polar free energy functional . . . . . . . . . . . . . . 2.1.2.2 Non-polar free energy functional . . . . . . . . . . . 2.1.3 Governing equations . . . . . . . . . . . . . . . . . . . . . . . 2.2 Methods and algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.1 Discretization schemes of the governing equations . . . . . . . 2.2.1.1 The generalized Poisson-Boltzmann equation . . . . 2.2.2 Acceleration procedures . . . . . . . . . . . . . . . . . . . . . 2.2.2.1 Precondition of the PB solver . . . . . . . . . . . . . 2.2.2.2 Initial guess of the generalized PB solution . . . . . . 2.2.2.3 Convergence criteria in the generalized PB solver . . 2.2.3 Dynamical coupling of the generalized Poisson Boltzmann and etry flow equations . . . . . . . . . . . . . . . . . . . . . . . . 2.2.3.1 Approach I . . . . . . . . . . . . . . . . . . . . . . . vii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . geom. . . . . . . . xii 1 1 1 3 6 8 10 12 15 18 19 20 21 24 25 27 29 30 31 31 32 35 38 38 39 40 40 41 42 42 43 2.3 2.4 2.5 2.2.3.2 Approach II . . . . . . . . . . . . . . . . . . . 2.2.4 Evaluation of the solvation free energy . . . . . . . . . Numerical test and validation . . . . . . . . . . . . . . . . . . 2.3.1 The behavior of the coarea formula . . . . . . . . . . . 2.3.2 Accuracy of the generalized PB solver . . . . . . . . . . 2.3.3 Convergence of boundary profile and dielectric function 2.3.4 Consistency of iteration procedures . . . . . . . . . . . 2.3.5 Efficiency of the accelerated iteration procedure . . . . 2.3.6 Impact of potentials in the geometric flow equation . . Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4.1 Set of 17 test molecules . . . . . . . . . . . . . . . . . . 2.4.2 Solvation free energy of proteins . . . . . . . . . . . . . 2.4.3 Twenty two proteins . . . . . . . . . . . . . . . . . . . Chapter conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Quantum formulation 3.1 Theory and model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1.1 Charge density based polar free energy functional . . . . . . . . . . 3.1.2 Quantum mechanical energy functional . . . . . . . . . . . . . . . . 3.1.2.1 Kinetic energy . . . . . . . . . . . . . . . . . . . . . . . . 3.1.2.2 Potential energy . . . . . . . . . . . . . . . . . . . . . . . 3.1.3 Total free energy functional . . . . . . . . . . . . . . . . . . . . . . 3.1.4 Governing equations . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Numerical methods and algorithms . . . . . . . . . . . . . . . . . . . . . . 3.2.1 Solution of the generalized Poisson-Boltzmann equation . . . . . . . 3.2.2 Solution of the generalized Kohn-Sham equation . . . . . . . . . . . 3.2.3 Evaluation of the solvation free energy . . . . . . . . . . . . . . . . 3.2.4 Dynamical coupling of involved PDE equations . . . . . . . . . . . 3.3 Numerical test and validation . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.1 Validation of the cancellation of self-interaction energy . . . . . . . 3.3.2 Validation of data translation and unit conversion . . . . . . . . . . 3.3.3 Accuracy of solvation free energies computed by the present model . 3.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.4.1 Solvation free energies of 24 small molecules . . . . . . . . . . . . . 3.4.2 Solvation free energies of 16 molecules . . . . . . . . . . . . . . . . 3.4.3 Solvation free energies of 3 larger molecules . . . . . . . . . . . . . 3.5 Chapter conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Lagrangian formulation 4.1 Theory and model . . . . . . . . . . . . . . . 4.1.1 Solvation free energy functionals . . . . 4.1.1.1 Polar solvation functional . . 4.1.1.2 Total free energy functional of 4.1.2 Surface variation . . . . . . . . . . . . viii . . . . . . . . . . . . . . . . . . solvation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 45 47 47 49 51 55 58 61 63 63 66 70 73 . . . . . . . . . . . . . . . . . . . . . 75 77 77 79 81 82 83 84 88 88 92 95 98 101 102 104 106 107 108 110 115 119 . . . . . 121 123 123 124 125 125 4.2 4.3 4.4 4.5 4.1.3 Governing equations . . . . . . . . . . . . . . . . . . . . . . . . . . . Methods and algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2.1 Interconversion between the Lagrangian and Eulerian representations 4.2.1.1 Embedding the Lagrangian dynamics into the Eulerian representation . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2.1.2 Transform from the Lagrangian representation to the Eulerian representation . . . . . . . . . . . . . . . . . . . . . . . 4.2.1.3 Transform from the Eulerian representation to the Lagrangian representation . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2.1.4 Numerical surface integral and volume integral in the Eulerian representation . . . . . . . . . . . . . . . . . . . . . . . 4.2.2 Dynamic coupling of the Poisson-Boltzmann and geometric flow equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3.1 Validation of the interface extraction scheme . . . . . . . . . . . . . . 4.3.2 Effect of interaction potentials . . . . . . . . . . . . . . . . . . . . . . 4.3.2.1 Surfaces of a diatom system . . . . . . . . . . . . . . . . . . 4.3.2.2 Surfaces of a four-atom system . . . . . . . . . . . . . . . . 4.3.2.3 Surfaces and electrostatic potentials of a protein . . . . . . . 4.3.3 Isosurface function value and minimal surfaces . . . . . . . . . . . . . 4.3.4 Convergence of surface area, volume and energy . . . . . . . . . . . . Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.1 Free energy calculations . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.1.1 Solvation energies of 17 compounds . . . . . . . . . . . . . . 4.4.1.2 A set of 23 proteins . . . . . . . . . . . . . . . . . . . . . . 4.4.2 Salt effect on protein-protein binding energies . . . . . . . . . . . . . Chapter conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 134 134 135 137 138 140 142 143 144 146 147 148 151 152 154 156 157 157 158 162 167 5 Thesis achievements and future work 170 5.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 5.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 A Solution of the generalized Laplace-Beltrami equation and the ADI scheme178 B PB equation in different forms 185 C Differential geometry theory preliminary 188 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 ix LIST OF TABLES 2.1 Areas computed from the coarea formula for bounded open sets . . . . . . . 48 2.2 Errors and convergence orders for the generalized PB solver (ǫ1 = 1) . . . . 51 2.3 Comparison between two iteration approaches . . . . . . . . . . . . . . . . . 56 2.4 Effect of relaxation factor α on final results . . . . . . . . . . . . . . . . . . . 57 2.5 Effect of the number of intermittency in Approach II . . . . . . . . . . . . . 58 2.6 CPU time analysis from original schemes . . . . . . . . . . . . . . . . . . . . 59 2.7 Speedup from adjustment of initial guess and preconditioner in PB solver . . 59 2.8 Influence of convergence criteria on electrostatic solvation free energy and computational time for the diatomic system . . . . . . . . . . . . . . . . . . 60 Comparison of CPU time (s) in the iteration procedures with and without accelerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 2.10 Effects of potentials on the solvent-solute boundary . . . . . . . . . . . . . . 63 2.11 Comparison of free energies (kcal/mol) for 17 compounds . . . . . . . . . . . 64 2.12 Comparison of electrostatic solvation free energies (kcal/mol) obtained from the MFCC-CPCM, the present model (OSM) and MIBPB. . . . . . . . . . . 68 2.13 Electrostatic solvation free energies for 22 proteins . . . . . . . . . . . . . . . 71 2.9 3.1 Comparison of total electrostatic energy (kcal/mol) and electrostatic solvation energy (kcal/mol) obtained with the partial charge approach. . . . . . . . . . 104 3.2 Comparison of total electrostatic energy (kcal/mol) and electrostatic solvation energy (kcal/mol) obtained with the direct use of charge density. . . . . . . . 105 x 3.3 Comparison of solvation energy components between present results and those of Wang et al [238] for three small molecules. . . . . . . . . . . . . . . . . . . 106 3.4 Solvation free energy (kcal/mol) and its decomposition. . . . . . . . . . . . . 107 3.5 Comparison of solvation free energies (kcal/mol) obtained from the present model and experimental data for 24 small molecules. . . . . . . . . . . . . . 108 3.6 Solvation free energy (kcal/mol) decomposition for a set of 21 molecules. . . 112 3.7 Comparison of free energies (kcal/mol) for 16 compounds. . . . . . . . . . . 114 3.8 Free energies (kcal/mol) for 16 compounds using structures from Pubchem data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 3.9 Solvation free energies (kcal/mol) of 3 large molecules and corresponding CPU time (hour). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 4.1 Comparison of surface areas (˚2 ), volumes (˚3 ) and energies (kcal/mol) for A A two small systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 4.2 ˚ Electrostatic solvation free energies (kcal/mol), surface areas (A2 ) and vol3 ) of protein 451c with different solvent-solute interactions. . . . . . 151 umes (˚ A 4.3 Surface areas (˚2 ) for different surface definitions . . . . . . . . . . . . . . . 153 A 4.4 RMS error with different nonpolar potentials. . . . . . . . . . . . . . . . . . 158 4.5 Predicted and experimental total solvation free energies for 17 small compounds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 4.6 Comparison of electrostatic solvation free energies of 23 proteins. . . . . . . . 162 4.7 Comparison of binding affinities of two proteins complexes from current simulations and those from Bertonati et al’s paper. . . . . . . . . . . . . . . . . 166 xi LIST OF FIGURES 1.1 A solvation free energy cycle adapted from Levy et al. [130]. The total solvation energy (1) is decomposed into several steps: “charging” the solute in solvent (6) and vacuum (2), including attractive dispersive solute-solvent interactions in solvent (5) and vacuum (3), and cavity formation associated with repulsive solute-solvent interactions (4). The energy associated with Step (7) is generally termed a “nonpolar solvation energy” while the difference in energies associated with Steps (1) and (7) is generally considered as “polar solvation energy”. For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation. 4 2.1 The cross line of S and (1 − S) of a diatomic system described in Section 2.3.3 29 2.2 The cross line profile of ǫ(S) of a diatomic system described in Section 2.3.3. Here, we have set ǫs = 80 and ǫm = 1. . . . . . . . . . . . . . . . . . . . . . 36 The evolutionary profiles of the S function at cross section (x, y, 0.05) in a diatomic system plotted from four intermediate states. . . . . . . . . . . . . 53 2.3 2.4 The time evolution histories of the electrostatic solvation free energy, F (Volume) and J(Area), where F (Volume) = volume/5−180 and J(Area) = (surface area)/5− 200. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 2.5 Correlation between experimental data and the present optimized surface model (OSM)(also results from Nicholls’) in electrostatic solvation free energies of 17 compounds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 Correlation between MFCC-CPCM [152] and the present optimized surface model (OSM) in electrostatic solvation free energies of 8 proteins. . . . . . . 69 Correlation between MIBPB-III and the present model (OSM) in electrostatic solvation free energies of 22 proteins . . . . . . . . . . . . . . . . . . . . . . . 70 Differences between electrostatic solvation free energies obtained from the MIBPB and the present model with original radii (Radii0) or enlarged radii (Radii1). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 2.6 2.7 2.8 xii 2.9 Surface potential display of one protein (PDBID: 1frd) at different isosurfaces 72 3.1 Flowchart of the numerical solution of the coupled PDEs. . . . . . . . . . . . 99 3.2 Correlation between experimental data and the present optimized surface model with quantum correction (OSMQ) in solvation free energies of 24 small molecules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 3.3 Illustration of surface electrostatic potentials of four small compounds at their corresponding isosurfaces S = 0.50. (a) Glycerol triacetate; (b) 1,1diethoxyethane; (c) Bis-2-chloroethyl ether; (d) Dimethoxymethane. . . . . 111 3.4 Correlation between experimental data [160] and the present optimized surface model with quantum mechanics (OSMQ) in solvation free energies of 16 compounds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 3.5 Illustration of surface electrostatic potentials of three compounds at their corresponding isosurfaces S = 0.50. (a) Phorbol; (b) Phorbol12,13-dibutyrate; (c) Staurosporine. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 4.1 ˚ Illustration of surface morphology of a diatom system with radii 1.87A and coordinates (x, y, z) = (−2.2, 0, 0), and (2.2, 0, 0) under different solute-solvent interactions. Top Left: The MMS (no potential); Top Right: The surface obrep,LJ tained under a repulsive potential (Vi ); Bottom Left: The surface obtained under a full L-J potential (V LJ ); Bottom Right: The surface obtained under a full L-J potential and an electrostatic potential. It can be seen that the repulsive potential produces a “fat” surface, while an attractive potential or an electrostatic potential leads to a “slim” surface. . . . . . . . . . . . . . 147 4.2 Illustration of surface morphology of a four-atom system with radii 1.87˚ and A coordinates (x, y, z) = (−3.40, 0, 0), (3.40, 0, 0), (0, −2.94, 0) and (0, 2.94, 0) under different solute-solvent interactions. Top Right: The surface obtained rep,LJ under a repulsive potential (Vi ); Bottom Left: The surface obtained under the full L-J potential (V LJ ); Bottom Right: The surface obtained under the full L-J potential and an electrostatic potential. Again, it can be seen that the repulsive potential producess a “fat” surface; while an attractive potential or an electrostatic potential leads to a “slim” surface. . . . . . . . 149 xiii 4.3 The projection of electrostatic surface potentials of protein 451c onto different surfaces obtained under various solvent-solute interactions. Top Left: Attractive surface; Top Right: The MMS; Bottom Left: Repulsive surface; Bottom Right: Polar surface. It is noted that the repulsive surface is a “fat” surface comparing to the MMS; while an attractive surface or a polar surface is a “slim” surface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 4.4 ˚ Difference of surface areas ( A2 ) between MMS and various resulting surfaces generated under different constant pressure effects. . . . . . . . . . . . . . . 155 4.5 Decreasing of surface area (×102 ˚2 ), J(volume)=(volume-2) (×102 ˚3 ) and A A total solvation free energy (kcal/mol) in diethyl propanedioate as the number of iterations increases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 4.6 Correlation of solvation free energy between previous optimized smooth surface (OSS) model and the present optimized molecular surface (OMS) model in the set of 17 compounds listed in Table 4.5. . . . . . . . . . . . . . . . . 159 4.7 Correlation of electrostatic solvation free energy between the present optimized molecular surface (OMS) model, and previous models, such as the optimized smooth surface (OSS), the MIBPB-III and the MMS for 23 proteins listed in Table 4.6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 4.8 Difference of electrostatic solvation free energies between the OMS model and previous OSS and MMS models for 23 proteins listed in Table 4.6. . . . . . . 163 4.9 Protein-protein complexes. Left: Protein complex 1emv; Right: Protein complex 1beb. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 4.10 The salt dependence of the binding affinities of two protein complexes. Left: Protein lemv; Right: Protein 1beb. Here OMS data are computed by our optimized molecular surface (OMS) model. NLPB data are taken from Bertonati et al’s paper [26]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 xiv Chapter 1 Introduction 1.1 1.1.1 Introduction to solvation models Biological background Almost all important biological processes in nature, including signal transduction, DNA recognition, transcription, post-translational modification, translation, protein folding and protein ligand binding, naturally occur in water, which comprises 65-90% of cellular mass. The understanding of solvation is an elementary prerequisite for the quantitative description and analysis of the above-mentioned processes. Solvation involves the energetics of interactions between solute molecules and solvent molecules or ions in the aqueous environment. Solute-solvent interactions are typically described by solvation energies (or closely related quantities): the free energy of transferring the solute from a vacuum to the solvent environment of interest (e.g.,water at a certain ionic strength), as shown in more detail in Figure1.1. Solvation free energy is a physical quantity that can be measured experimentally. Although millions of organic compounds are known now, only several thousands of com1 pounds have experimental data being reported for the solvation free energy. It is mainly due to experimental difficulties associated with the precise measurement, particularly for those compounds with low solubility and/or low volatility [182, 174]. Because of low solubility and/or low volatility, accurate and time-consuming measurement is required with highly sensitive instruments. Unfortunately, many important organic compounds belong to this category. Moreover, attentions need to paid on the chemical stability of solute under investigation. Therefore, the experimental study of solvation free energy still remains expensive, laborious and is sometimes inaccurate. Computational approaches provide an alternative method to obtain the solvation free energy. Solvation free energies can be calculated by a variety of computational methods, ranging from very time-consuming quantum mechanical approaches [111, 183, 148, 118] to simple phenomenological modifications of Coulomb’s law. Solvation models can be roughly divided into two main classes [186, 239, 203, 200]: explicit solvent models that describe the solvent in molecular or atomic detail [179], and implicit solvent models that generally replace the explicit solvent with a dielectric continuum [8, 10, 65, 109, 186, 117]. Explicit solvent models provide the detailed information on molecular constitutions, and generally require extensive sampling to extract meaningful thermodynamic, statistical or kinetic properties of interest. Whereas, implicit solvent models focus on the biomolecules of interest, and take a mean field approximation for solvent properties. Because of their fewer degrees of freedom, implicit solvent methods have become popular for many applications in molecular simulation [7, 82, 9, 69]. To help the calculation of solvation energy, one can conceptually break up the solvation process as follows: #1 in this figure can be decomposed into two basic processes: a 2 “nonpolar” process of inserting the uncharged solute into solvent (#7) and a “polar” process of charging the solute in vacuum (#2) and solvent (#6). The free energy change in #7 is called the nonpolar solvation energy. The difference of energies associated with #6 and #2 is called the “charging” or polar solvation energy and represents the solvent’s effect on the solute charging process. The polar portion of solvation originates from electrostatic interactions, which are ubiquitous for any system of charged or polar molecules, such as biomolecules (proteins, nucleic acids, lipid bilayers, sugars, etc.) in their aqueous environment [240, 65, 69, 105, 200, 239, 203, 86, 204, 7, 82, 9]. The nonpolar portion describes the remaining contributions, including the surface tension, mechanical work, and attractive solvent-solute dispersion interactions. 1.1.2 Polar solvation models Electrostatic interactions are ubiquitous in nature. For biomolecular systems in aqueous environment, the analysis of molecular solvation and electrostatics is of great importance to research in chemistry, biophysics, medicine and nano-technology. Implicit solvent models are widely used in such an analysis which can be classified into two general types: quantitative analysis and qualitative study. One of the primary quantitative application in computational biology and chemistry has been the calculation of thermodynamic properties. Implicit solvent methods “pre-equilibrate” the solvent and mobile ions, thus effectively pre-compute the solvent contribution for a system [186]. Such pre-equilibration is particularly evident in MM/PBSA models [246, 216, 173, 220, 149], which combine implicit solvent approaches with molecular mechanical models to evaluate binding free energies from an ensemble of biomolecular structures [16, 2, 133, 162, 145, 223, 161, 251, 162, 95, 150, 132, 131]. These methods 3 Figure 1.1: A solvation free energy cycle adapted from Levy et al. [130]. The total solvation energy (1) is decomposed into several steps: “charging” the solute in solvent (6) and vacuum (2), including attractive dispersive solute-solvent interactions in solvent (5) and vacuum (3), and cavity formation associated with repulsive solute-solvent interactions (4). The energy associated with Step (7) is generally termed a “nonpolar solvation energy” while the difference in energies associated with Steps (1) and (7) is generally considered as “polar solvation energy”. For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation. 4 have been employed to interpret experimental titration curves, analyze residue contributions in protein-protein and protein-ligand binding energetics, examine structural/functional consequences of RNA nucleotide protonation, etc. Another quantitative application of implicit solvent models is the evaluation of biomolecular kinetics where implicit solvent models are generally taken to compute solvation forces for molecular Langevin dynamics [219, 180, 181, 142, 141], Brownian dynamics [147, 89, 76, 195], or continuum diffusion [49, 50, 255, 210, 211] simulations. A major qualitative study of implicit solvent methods is the visualization and qualitative analysis of electrostatic potentials on and around biomolecular surfaces [241, 177, 11, 7]. Visualization has become a standard procedure in the analysis of biomolecular structures, including ligand-receptor binding, drug design, macromolecular assembly, protein-nucleic acid complexes, protein-protein interactions, enzymatic mechanism study, etc. The polar solvation energy is generally associated with a difference in charging free energies in vacuum and solvent (see Figure 1.1 (#2) and (#6)). Polar solvation process and electrostatic effect are described by a variety of implicit solvent models [240, 186, 239, 200, 65, 92, 120, 41, 193, 233, 198, 10, 9, 221]; however, the most widely-used ones are PoissonBoltzmann (PB) models [105, 69, 7, 125, 86, 200, 65, 117], generalized Born (GB) methods [68, 15, 229, 167, 92, 263, 120, 226, 155, 41, 102] and polarizable continuum models (PCM) [52, 227, 113, 218, 35, 14, 59]. Polarizable continuum models are proposed to model the solvent either as polarizable dielectrics or as conductor-like media, and treat the solute compound by the quantum mechanical means [52, 227, 113, 218, 35, 14, 59]. These approaches have often been used in reactively kinetics where quantum mechanical descriptions are desired. Generalized Born methods are relatively fast, but are not as accurate as the PB 5 methods [9, 166, 68, 166, 63, 230, 226]. They are often employed in high-throughput applications such as molecular dynamics [15, 229, 203, 82, 120, 68, 167, 63, 41]. PB methods can be formally derived from more detailed theories [22, 159, 107] and provide a more accurate, although somewhat slower, approach for evaluating polar solvation properties [63, 166, 15]. Moreover, unlike most generalized Born methods, PB models offer a global description for the electrostatic properties, therefore making them uniquely suited to visualization and other studies [138, 29, 66, 234, 89, 76, 64, 196, 211] where the electrostatic information is required for both inside and outside a biomolecule. 1.1.3 Poisson-Boltzmann theory Mathematically, the PB equation [105, 69, 7, 125, 86, 199, 65] is a nonlinear elliptic partial differential equation (PDE) which is solved for the electrostatic potential. It is a continuum model at equilibrium state, which dictates the solvent with a piecewise dielectric constant and ionic charge density by the Boltzmann distribution. The PB equation can be derived by the Gauss law and the Boltzmann distribution law [108]. Additioally, in the physical point of view, the free energy of the system must be minimized at the equilibrium state. Therefore, a total electrostatic free energy functional may be developed based on the PB theory, then the PB equation can also be obtained by the variational principle [199]. The standard formula of the PB equation is the following: Nc −∇ · (ǫ(r)∇φ) − i=1 Qi n0 e−φQi /kB T = i Nm j qj δ(r − rj ) (1.1) where ǫ is piecewise constant and depends on interface, being ǫs in solvent and ǫm in solute; φ is electrostatic potential. Here qj is the partial charge on an atom located at xj , Qi is the 6 charge of ion species i, Nc is the number of ion species, kB is the Boltzmann constant, T is the temperature, Nm is the total number of solute atoms, and n0 is the bulk concentration i of the ith ionic species. Note that the PB equation can appear in different forms according to purposes as well as unit representation (detailed description can be found in Appendix B) The PB theory is approximate and, as a result, has several well-known limitations which can affect its accuracy [105, 159, 107, 69, 239, 198, 81, 56, 221, 222, 191, 45]. These limitations have been reviewed in the literature and will only be briefly summarized here. First, most continuum models assume linear and local solvent response [239, 198, 81, 22]. However, nonlinear solvent response (usually through dielectric saturation or electrostriction), can be important in regions of strong electric field [239, 198, 81]. Biologically-relevant examples of nonlinear solvent response have been found near highly charged ions, biomolecules, and other interfaces. Nonlocal solvent response generally involves the finite non-zero size of water and its unique hydrogen bonding with solute and other solvent molecules. Such nonlocal response can be important in describing the orientation of water at biomolecular interfaces [38], differing solvation of cations and anions, and the solvation of asymmetric charge distributions. The second major limitation is the mean-field treatment of ions in PB theory [107, 159, 105]. Mean field models assume that each ion experiences only the average influence of the other ions in solution. Such averaging precludes detailed ion-ion interactions involving steric repulsion of ions (or their solvation shells) and Coulombic interaction of ions, including repulsion and attractive pairing. The mean field assumption thereby eliminates correlations and fluctuations which can have important energetic and structural consequences for solutions of divalent and multivalent ions surrounding highly-charged molecules such as nucleic acids [56, 221, 222, 191, 45]. As suggested by the limitations above, PB models also neglect de7 tailed ion-solvent interactions which eliminate differences between ion species in solution and thereby prevent effects analysis of specific ion species – which can be important in biophysical modeling. However, despite these limitations, PB methods are still very important for biomolecular structural analysis, modeling, and simulation. Furthermore, these limitations are currently being addressed through new implicit solvent models [5, 56, 159, 221, 175] and hybrid treatments [232, 13, 128, 165, 156] which extend the applicability of the PB theory while preserving some of its computational efficiency through pre-averaging solvent and ion response. 1.1.4 Nonpolar solvation models Poisson-Boltzmann methods provide polar solvation energies and therefore must be complemented by nonpolar solvation models to provide a complete view of solvent-solute interactions. As illustrated in Figure 1.1, nonpolar solvation is generally associated with the insertion of the uncharged solute into solvent. There are many nonpolar solvation models available. The most commonly used one is solvent-accessible surface area (SASA) models. They states that nonpolar solvent-solute interactions are proportional to the area of the solvent-solute interface. It is worth to note that the proportional constant varies dramatically in the literature because the energies of other processes are also assumed to be proportional to SASA [77]. Roughly speaking, SASA models are based on the scaled particle theory (SPT) [213, 178] which actually includes the energy of surface tension effect and the mechanical work of immersing a particle into the solvent. Moreover, studies indicates that nonpolar distribution should depend on the solvent-accessible volume and surface, with a crossover to SASA when the size of solute is large [235]. Recent work by Levy, Gallic8 chio, and others [90, 92, 130, 91, 235] has demonstrated the importance of nonpolar solvent models which include treatment of attractive solute-solvent dispersion terms (#5 in Figure 1.1) as well as models of solvent-solvent repulsive interactions (#4 in Figure 1.1), which are described by both area and volume contributions [235]. Based on these considerations, in the present work, we use the following model for nonpolar solvation free energies [235] Gnp = γ · Area + p · Vol + Ωs ρs Uss dr, (1.2) where γ is the surface tension, ”Area” is the solvent-excluded area of the solute, p is the hydrodynamic pressure, ”Vol” is the solvent-excluded volume of the solute, ρs is the solvent density, Ωs denotes the solvent accessible region, and Uss is the solvent-solute van der Waals (vdW) interaction potential. The first two terms in Eq. (1.2) are those from the SPT model [213, 178]. The first term is the surface energy. It measures the disruption of intermolecular and/or intramolecular bonds that occurs when a surface is created. The second term is the mechanical work of creating the vacuum of a biomolecular size in the solvent. The third term represents the attractive dispersion effects near the solvent-solute interface which has been shown by Wagoner and Baker [235] to play a crucial role in accurate nonpolar solvation analysis.In general, Uss can be obtained by the sum of the interaction of individual atoms in Ωm with the solvent continuum in Ωs under the assumption that the nonpolar solutesolvent potential is pairwise: Uss = vdW . This model of nonpolar solvation has i Vi been demonstrated to give good agreement with explicit solvent solvation forces on proteins [235] and RNA hairpins [71]. Work by Levy and co-workers has demonstrated the good performance of a similar model [90, 92, 130, 91]. In the present work, we further allow the solvent density ρs to be a function of position in 9 general. In particular, we split the solvent density ρs into the sum of atomic or ionic density distribution functions ρs = i ρs,i . The distribution of an individual solvent component can be computed by integral equations or other approaches, such as Monte Carlo and generalized Langevin equation [231, 88, 23]. This design of solvent density allows us to recover the nonlinear and nonlocal effects of the solvent-solute interactions. 1.2 Molecular interface definitions The separation of discrete and continuum domains in implicit solvent models requires an interface to indicate the separation of solute atoms from the surrounding solvent. Naturally, such an interface can be regarded as the surface or the profile of a molecule. The definition of molecular profiles, or molecular graphics traces back to Corey and Pauling in 1950s [58], who tried to depict the profiles of amino acids, peptides and proteins from X-ray crystallography. In quantum chemistry, molecular graphics are often associated with the shapes of polynomial functions that provide approximation to electron wavefunctions. In fact, since the electron wavefunction changes its distribution under different environments, molecular profiles change accordingly. Commonly used interface definitions in implicit solvent models include the van der Waals surface, the solvent accessible surface [126], and the molecular surface (MS) [185, 57]. In certain sense, these interface definitions determine the performance of implicit solvent models because all of the physical properties of interest, including electrostatic free energies, biomolecular surface areas, molecular cavitation volumes, solvation free energies, and pKa values are very sensitive to these interface definitions [70, 72, 163, 217]. The use of PB model encounters some challenges in molecular dynamics regarding stability and accuracy. For example, the widely used molecular surface based PB model results 10 in forces which are unstable over time, lack analytical expression, are sensitive to grid discretization, and converge poorly [101]. Moreover, a discontinuous dielectric definition leads to numerical instability regardless of the location of boundary [217]. Additionally, more physically realistic surface definitions are desired because of the argument that the macroscopic physical properties must vary continuously. To overcome these difficulties, overlapping atom-centered Gaussian or polynomial functions have been proposed to define the solute surface, with smooth transitions between low and high dielectric values [101, 112]. Although continuous dielectric functions give rise to an improvement of stability and computational efficiency, some recent work demonstrates that most of them are physically incorrect, which will be discussed more in Section 1.4. The recent development of a new class of molecular interfaces that incorporate the fundamental laws of physics starts with the construction of partial differential equation (PDE) based molecular surface by Wei el al. in 2005 [245]. This approach distinguishes itself from many other PDE based surface smoothing methods [249, 256] by utilizing only atomic information, i.e., atomic coordinates and radii, instead of an existing surface. The atomic information is embedded in the Eulerian formulation and a family of hypersurfaces are evolved in time under the PDE operator, which is designed to control the curvature and surface tension. The generalized molecular surface is subsequently extracted from the final hypersurface by a level-set type of approach [245]. This PDE based surface construction procedure generates well defined molecular surfaces for both small molecules and large proteins [245]. To our knowledge, geometric PDE based approach was the first of its kind for molecular surface construction. A further progress in the development of a “physical interface” was the introduction of the minimal molecular surface (MMS) that minimizes a surface free energy 11 functional by the variational principle and leads to the mean curvature flow in 2006 [18, 19]. To our knowledge, MMSs were the first set of biomolecular surfaces that had ever been constructed by means of variational principles. The construction of the MMS was driven by the desire to understand the true physical boundary of a biomolecule in solvent. As a physical concept, the solvent-solute interface should be in general determined by the minimization of the free energy of a macromolecule in the aquatic environment. The MMS is constructed by using essentially the same procedure as that developed in the first PDE based surface generation method [245]. Another desirable property of the MMS is that it is free of geometric singularities. The MMS model was applied to the calculation of electrostatic solvation free energies of 26 proteins [20]. 1.3 Quantum mechanical continuum models In most implicit solvent models, the solute is described as a collection of fixed atomic point charges, which describes molecular polarizations at the atomistic level of resolution. However, it is well-known that charge rearrangement plays an important role in the solvation process of proteins in various cases [99]. Similar arguments were used to incorporate the quantum mechanics (QM) description in the classical implicit solvent theory [227, 228, 61]. The resulting QM version of continuum models, called quantum mechanical continuum models [52, 227, 113, 218, 35, 14, 59], offer the possibility of carrying out accurate quantum calculations in solution and near interfaces. Quantum mechanical continuum models provide a framework to describe the QM effect in solvent analysis [224]. However, this description is often compromised by the use of a pre-determined solvent-solute interface model. To integrate a continuum model with a QM description, reaction field potential, i.e., the electric 12 field induced by the polarized solvent, has been introduced as a unifying concept. It is obtained from the electrostatic computation in the framework of continuum models. It also exists in the Hamiltonian of the solute in the quantum calculation [224, 237, 42]. Therefore, the quantum formulation of the continuum model involves two problems: (1) the classical electrostatic problem of determining the solvent reaction field potential with the quantum mechanically calculated charge density; (2) the quantum mechanical problem of calculating the electron charge density with fixed nucleus charges in the presence of the reaction field potential. These two problems need to be resolved simultaneously. To carry out these computations, a intuitive self-consistent iterative procedure can be constructed to resolve the quantum problem for electron distribution and the classic electrostatic problem for the reaction field potential [224, 99, 227, 237, 42]. After computing the QM charge density, there are still two ways to implement the solvation analysis. The first approach is to use the continuous QM charge density directly in the Poisson-Boltzmann equation. The second approach is to fit the QM charge density into the atomic point charges, and then use the point charges as the source term in the Poisson-Boltzmann equation. Various schemes have been proposed to compute atomic partial charges with certain efficiency and convenience. The simplest way for atomic partial charge assignments is the Mulliken analysis method [157]. In this approach, the charge is distributed according to the orbital occupation. Many other schemes have also been proposed, including the natural bond orbital analysis, the distributed multipole analysis (DMA), the wavefunction mapping ‘Class IV’ model, the electrostatic potential expansion and analysis, etc [184, 214, 202]. Presently, the most widely used method for estimating atomic partial charges is the least-squared electrostatic potential (ESP ) fitting approach. 13 It was first proposed by Momany and has subsequently been implemented in different ways with different choices of grid points where the electrostatic potentials are calculated [154]. Examples of such potential-based methods are CHELP, CHELPG, and the Merz-Kollman scheme [60, 205, 28, 53]. Hu and Yang have recently developed a new object function to improve the quality and especially the numerical stability of the ESP fitting [110]. ESP fitting methods are not only widely used in the simulation with molecular mechanical (MM) force fields, but also extended to the QM/MM simulations as well as the molecular polarization calculation [110]. However, there are some well-known deficiencies in the atomic partial charge approaches [201, 110]. First, atomic partial charge is not observable, i.e., it can not be definitely determined by experimental data or directly obtained from quantum calculations. Therefore, it is a term lacking a rigorous and consistent definition [110]. Additionally, the approximation of quantum mechanical electron-electron interactions by simple Coulombic interactions between atomic partial charges leads to inaccurate calculations. Moreover, results from different methods or definitions may show different numerical dependences upon the QM level of the theory, basis sets used, and the choices of the number and location of grid points. Finally, there is a concern about the transferability of the atomic partial charges in different molecules. To avoid these problems, the direct use of the quantum charge density in the continuum dielectric theory was proposed [227, 237]. The quantum mechanical problem of determining the electron charge density was solved originally limited to the Hartree-Fock level, which is a traditional way to obtain complicated many-electron wavefunctions. Density functional theory (DFT) was proposed in 1960s to provide the ground state of a many-electron system in terms of a single electron wavefunction [106, 122]. In DFT, the Kohn-Sham equation is the Schr¨dinger equation of a fictitious o 14 system. It has been more and more popular for quantum calculations in solid state physics since the 1970s due to its low computational cost when compared with traditional approaches. Moreover, the results of DFT calculations have been considered accurate enough especially from 1990s when approximations used in the theory were greatly refined to better model the exchange and correlation interactions. [127, 21, 33]. DFT is now a leading method for electronic structure calculations in chemistry, physics and nano-technology. Therefore, the incorporation of DFT to continuum solvation methods becomes a routine approach in methods with the QM description of solute and the continuum description of solvent [227]. 1.4 Limitations of current models Current two-scale implicit solvation models have a severe limitation that undermines their performance in practical applications. While traditional surface definitions have found much success in biomolecular modeling and computation [212, 139, 62, 123, 25, 73, 114, 136], they are simply ad hoc divisions of the solute and solvent regions of the problem domain. In reality, the solvation is a physical process and its equilibrium state should be determined by fundamental laws of physics. Moreover, these surface definitions confront many challenges. First of all, as mentioned earlier, from the fundamental physical point of view, macroscopic properties should vary continuously. Any description of the permittivity changing instantaneously from one point to another is incorrect. Secondly, they admit non-smooth interfaces, i.e., cusps and self-intersecting surfaces, that lead to well-known instability in molecular simulations due to extreme sensitivity to atomic positions, radii, etc [188]. Thirdly, each pre-determined surface definition has its own limitations. For example, the van der Waals surface (VDW) is not smooth in space. And there are a lot of un-physical solvent pock15 ets inside the solute which cause the fluctuation of the electrostatic field [141]. the widely used molecular surface (MS) accompanying with the discontinuous dielectric function is not smooth in time for the molecular dynamic due to its definition, while it is much smoother in space than VDW and embodies the ratio between contact surface and reentry surface which maybe an important information of surface roughness. The solvent accessible surface (SAS) is not suitable in terms of solvation energies, which are the strongest validation of the PB theory. These difficulties associated with traditional discontinuous dielectric functions often drive the use of alternative “smoothed” solvent-solute interface definitions [112, 101] by applying overlapping atom-centered Gaussian or polynomial functions. However, smoothed interface definitions increase computational cost [70, 72]. Moreover, interatomic crevices and buried pockets of high dielectric, which are too small for a solvent molecule to occupy, are introduced. Furthermore, they often overestimate the electrostatic solvation free energy [217]. Finally, the wide range of surface definitions has often led to confusion and misuse of parameter (radii) sets developed for implicit solvent calculations with specific surface definitions. It is celebrated that the construction of partial differential equation (PDE) based molecular surface by Wei el al. in 2005 [245] generates well defined molecular surfaces for both small molecules and large proteins [245]. Moreover, the construction of minimal molecular surface (MMS) is determined by the minimization of the free energy of a macromolecule in the aquatic environment, and it is free of geometric singularities [20]. However, the MMS, which incorporates only the minimization of the free energy associated with the surface tension, offers only an approximation to the true physical boundary of a biomolecule in solvent. Therefore, to account for other important effects that determine the solvent-solute inter16 face, new potential driven geometric flows (PDGFs) needs to be proposed that allow the incorporation of many other potential effects in surface formation and evolution. Another criticism of implicit solvent models is the lack of uniqueness in polar and nonpolar decomposition of the solvation process [148] and the neglect of the polar-nonpolar coupling as well as solvent-solute interactions [4, 32, 38, 55, 74, 85, 87]. Dzubiella et al [74, 75] considered this problem by adding a solvent-solute coupling (interaction) term to the total free energy functional discussed by Sharp and Honig [200] and Gilson et al [97]. A feature of this new model is that surface tension energy and mechanical work of immersing a molecule into the solvent were also included in the total free energy functional. However, their initial work does not provide a protocol for the construction of molecular interfaces and a systematical analysis of solvation energy for macromolecules. Recently, Cheng et al. [48] have extracted solvent-solute interfaces from the free energy functional of Dzubiella et al [74, 75] in a setting very similar to our earlier Eulerian geometric PDE approaches of biomolecular surfaces and solvation analysis [245, 18, 19, 20]. In the earlier PDE based molecular surface models [245, 20, 74] the solute is described as a collection of fixed atomic point charges, which, together with the charge described in the continuum approximation of the surrounding medium, give rise to the total charge source for the Poisson-Boltzmann equation. The atomic partial charges describe molecular polarizations at the atomistic level of resolution. This approach is able to evaluate many physical and chemical properties. However, it can not cover the whole range of properties of interest. In particular, the charge rearrangement in the solute molecule during the transfer from the gas phase to solution has not been taken into account in the calculation of solvation free energies. Consequently, the highly accurate analysis of solvent-solute surfaces is discounted 17 by the estimation of charge rearrangement during the solution process. Additionally, those earlier solvation models depend on parameters from the existing molecular mechanical force field parameters [147, 12, 112, 116], which are typically parameterized for certain class of (macro-) molecular systems and may not be appropriate for other class of molecules. Therefore, it is desired to develop a quantum mechanical (QM) description of the solute molecule in PDE based molecular surface models, which gives accurate, self-consistent and force field independent charge arrangement treatment during the solavtion process. 1.5 Mathematical issues and numerical challenges Based on previous discussions, it is clear that there are various mathematical issues involved in the modeling of solvation. In this work, the solvent-solute boundary can be modeled in the framework of the differential geometry theory of surfaces and manifolds, which is employed to result in new coupled geometric and potential flows for the generation of a physical solventsolute boundary and the optimization of solvation energy. Technically, the smoothness of the resulting solute-solvent boundary is ensured by coupled geometric and potential flows of parabolic type. A variational framework is established to couple different parts of the solvation contributions. Governing equations are derived by variational principles. Other than the modeling strategy, implementations of models encounter many numerical challenges, which have attracted great mathematical interests for the past several decades. For instance, the multidomain and multiscale treatment of both systems results in discontinous coefficient based interface problems and a singular source in the partial differential equation. Highly accurate and efficient numerical schemes are desired to handle these singularities in the application of biological systems. Additionally, numerical convergence and efficiency of the 18 self-consistent iteration need to be explored for the derived coupled governing equations. The involved mathematical issues and associated numerical challenges in model implementations are outlined as follows: 1.5.1 Geometry, PDE and interface In this thesis, we consider the solvent-solute boundary as a two-dimensional (2D) differential manifold embedded into a 3D Euclidean space, or a hypersurface, in a Riemannian manifold. The differential geometry theory of surfaces and manifolds is employed. The above-mentioned minimal molecular surface, which incorporates only the minimization of the free energy associated with the surface tension, offers only an approximation to the true physical boundary of a biomolecule in solvent. To account for other important effects that determine the solvent-solute interface, we have recently proposed a framework of potential driven geometric flows (PDGFs) that allow the incorporation of many other potential effects in surface formation and evolution [17]. The PDGFs are inherently multiscale in nature, and enable the incorporation of microscopic interactions, such as van der Waals potentials, into the macroscopic curvature evolution. From a mathematical point of view, the molecular surfaces are constructed from the geometric partial differential equation (GPDE) in this thesis. In general, GPDE is a PDE which controls the motion of curves or surfaces and is merely formulated by the geometric measure theory. It is related to geometric analysis, manifold theory, topology, PDE, calculus of variation, and geometric measure theory. GPDE can be applied to the motion of interfaces problems in the physical or chemistry field, e.g., dissolution, combustion, erosion in the biology field, biomembrane vesicle problem, and the construction of protein surface. GPDE is 19 also applied to computational geometry, computer graphics, image processing edge detection, noise removal,image restoration. Moreover, it can be used to obtain some optimal properties such as surface area minimization, total energy minimization, etc. The surfaces generated by GPDE possess some attractive features such as smoothness and a clear geometric sense. GPDE can be constructed by an energy functional based variational approach. In this method, an energy functional with a physical target is formed. Then, the first order variation of the energy functional gives rise to geometric PDEs. In the framework of variational approach, many famous geometric flow equations are derived such as the mean curvature flow, the Willmore flow, etc. Because it is difficult to gain explicit solutions of the GPDEs , numerical solutions are necessary. Numerical solutions of Geometric PDEs can be obtained by the generalized finite difference method, the finite element methods and the level set like methods. 1.5.2 Geometric flow equation Much of the recent development in implicit solvent models is due to the use of geometric flows [245, 18, 20, 17, 243], particularly mean curvature flows, which have been of considerable interest in applied mathematics for decades [172, 187, 197, 84, 100, 153, 170, 190, 192, 197, 208, 247]. Earlier research work and part of present research are focused on image processing [172, 187], computer vision, materials design [197] and surface smoothing [249, 256]. Computational techniques using the level set theory were devised by Osher and Sethian [172, 187, 197] and have been further developed and applied by many others [37, 54, 206]. An alternative approach for image analysis is to minimize a functional in the framework of the Mumford-Shah variational functional [158], and/or the Euler-Lagrange formulation of 20 variation [30, 36, 134, 171, 187, 189]. Wei introduced some of the first family of high-order geometric flow equations for image analysis [242]. In fact, the nonlinear production term in these high-order operators provides a framework to accommodate the PDGF in our later formation for macromolecular surfaces. Their high-order geometric flow equations have led to many interesting applications [242, 244, 215, 144, 96, 39]. Mathematical analysis of these high order equations in Sobolev space was carried out by Bertozzi and Greer [27, 103, 104], who proved the existence and uniqueness of the solution to a case with H 1 initial data and a regularized operator. A similar analysis was performed by Xu and Zhou [250]. Wei and Jia also introduced a coupled geometric flow equation system for image edge detection [244]. Such an algorithm works extremely well with texture images. Recently, Wei and his collaborators have proposed an evolution operator based single-step method for image denoising and enhancement [215]. Most recently, a family of differential geometry based multiscale models has been developed by Wei for chemical and biomolecular systems, including fuel cells, ion channels, DNA packing, nanofluidic systems, and virus evolution [243]. These models describe not only the structure, but also the dynamics and transport of the above mentioned chemical and biomolecular systems. 1.5.3 Highly accurate and efficient solver for interface problems In general, electrostatic energy is much larger than the non-electrostatic part so that the accuracy of electrostatic potential calculation based on the Poisson-Boltzmann (PB) equation plays a critical role in controlling the accuracy of the total solvation free energy. Therefore, numerical methods that are able to deliver highly accurate solution of the PB equation is desirable. In the present Lagrangian model, there exists a sharp solvent-solute interface and 21 it leads to a discontinuous dielectric constant definition in the PB equation. When the discontinuous dielectric profile is applied across the interface, the standard numerical methods, including the centered finite differences scheme, lose their accuracy and convergent order. This problem is aggravated by complex geometric shapes, possible geometric singularity, and singular charges of biomolecules. In the worst-case scenario, the standard numerical methods do not converge at all for complex irregular solvent-solute interfaces [252, 93]. The solution of elliptic equations with discontinuous coefficients and singular sources is a challenging problem in computational mathematics. In order to achieve high-order numerical accuracy, it is indispensable to develop mathematical interface techniques. Peskin pioneered the immersed boundary method (IBM) [124, 176] to address this class of problems. Recently, many other elegant methods have been proposed, including the ghost fluid method [79, 137], the upwinding embedded boundary method [34], finite-volume-based methods [164], and integral equation methods [151]. A major advance in the field was due to LeVeque and Li [129], who proposed a remarkable second order sharp interface scheme, the immersed interface method (IIM) [129, 135]. Chen and Strain discussed a piecewise-polynomial discretization and Krylov-accelerated multigrid for elliptic interface problems [44]. However, these interface techniques have not been implemented for the Poisson-Boltzmann equation in the context of realistic biomolecules. Wei and his coworkers have recently proposed a highly accurate algorithm, the matched interface and boundary (MIB) method [254, 253, 258, 261, 260] for solving elliptic equations. Many essential ideas of the current MIB method were introduced in earlier interface schemes for solving Maxwell’s equations [258]. The MIB is of arbitrarily high-order accuracy in principle, and sixth-order accurate MIB schemes have been constructed [253, 261]. Wei’s 22 group has developed three generations of MIB based PB solvers, MIBPB-I [259], MIBPB-II [252], and MIBPB-III [93]. The MIBPB-I is the first PB solver that explicitly enforces the flux continuity conditions at the dielectric interface in a biomolecular context; however, it cannot maintain its designed order of accuracy in the presence of molecular surface singularities, such as cusps and self-intersecting surfaces commonly occurred in biomolecular systems [188]. This problem was first addressed in the MIBPB-II by utilizing an advanced MIB technique developed by Yu et al. [253]; however, the MIBPB-II still loses its accuracy when the mesh size is as large as half of the smallest van der Waals radius, because of the interference of the interface and singular charges. To split the singular charge part of the solution [262, 43, 31], a Dirichlet to Neumann mapping approach [51] was designed in the MIBPB-III, which is by far the most accurate and stable PB solver. To our knowledge, the MIBPB method is the only existing method that is able to offer second order accuracy in solving the PoissonBoltzmann equation with discontinuous coefficients, singular sources and arbitrarily complex interfaces. The MIBPB is a few orders of magnitude more accurate at a given mesh size and about three times faster at a given accuracy than some traditional PB solvers [93]. The most important idea in all interface techniques is to take care of interface conditions, which may vary from systems to systems. Complex interface conditions are needed for the Helmholtz equation [257] and Maxwell’s equations [258]. For the Poisson-Boltzmann equation, interface conditions are the following [φ]Γ = φ+ (r) − φ− (r) = 0 [ǫφ]Γ = (ǫs ∇φ+ ) · N − (ǫm ∇φ− ) · N = 0. 23 (1.3) where φ+ and φ− are the electrostatic potential inside and outside the solvent-solute surface, respectively. Different methods may have different strategies in dealing with these conditions. The MIB method has a unique set of procedures in implementing Eq. (1.3). The interested reader is referred to earlier work [254, 253, 258, 261, 260]. In this work, we make use of MIBPB-III scheme for PDE based interface problems. We take dielectric constants ǫm = 1 and ǫs = 80 in our calculations. We use the Dirichlet farfield boundary condition and the electrostatic potential values at the boundary are practically obtained by the sum of potential contributions from individual atomic charges with an exponential decay factor [93]. The MIBPB-III is used to handle discontinuous dielectric constants, complex geometry and charge singularity. Note that although the geometry is complex, there is no geometric singularities, such as cusps and intersecting surfaces, in the biomolecular surfaces generated by our approaches. The extraction of surface information is carried out by the marching cubes algorithm embedded in our codes. 1.5.4 Self-consistent iterative methods It will be seen , through the optimization of the solvation energy, that the resulted generalized Poisson-Boltzmann (PB) equation and the generalized potential driven geometric flow equation are fully coupled. The optimized electrostatic potential is obtained by solving the Poisson-Boltzmann equation in which solvent-solute interface is used to determine the dielectric constant and the domain decomposition. The interface is generated by the solution of the potential driven geometric flow equation, which in turn depends on the electrostatic potential. In other words, the solution of the potential driven geometric flow equation requires the knowledge of electrostatic potential, while the solution of the generalized PB equation 24 requires the input of the interface definition. Therefore, the coupled generalized geometric flow equation and the generalized PB equation need to be solved simultaneously, in the present differential geometry based solvation model. The existence and the uniqueness of their solution, under the biomolecular context, can be an interesting mathematical issue. Numerically, both the convergence and efficiency of the solutions of the coupled system will be given quite a bit of attentions. In practice, this coupled nonlinear system can be solved by an iterative procedure, until a self-consistency is reached. Iterative methods are often the only choice for nonlinear equations. In computational mathematics, an iterative method is a mathematical procedure that leads to approximate solutions for a class of problems. The most common iterative method is Newton’s method. Other well-known examples include the Gummel’s method, the steepest descent method, and the conjugate iterative method. An iterative method is called convergent if the corresponding solution sequence converges for given initial approximations. A mathematically rigorous convergence analysis of an iterative method need to be performed; however, heuristic-based iterative methods are also commonly used. 1.6 The rest of this thesis The rest of this thesis is organized as follows. In Chapter 2, we present the Eulerian formulation of our differential geometry based solvation models. The Eulerian analysis of biomolecular surfaces utilizes the well-known coarea theorem from the geometric measure theory. The resulting operator from surface area minimization can be identified as a generalized Laplace-Beltrami operator in a 3D dimension. Chapter 4 is devoted to the Lagrangian formulation. Lagrangian analysis of biomolecular surfaces makes the direct use of differential 25 geometry theory of surfaces and manifolds. The surface minimization leads to the LaplaceBeltrami operator defined in 2D domain, or the mean curvature operator. In Chapter 4, the connection of two representations is analyzed in the present work. The structure of governing equations, and the accuracy and efficiency of two formulations are compared. The objective of Chapter 3 is to incorporate a quantum mechanical description of charge density into our earlier differential geometry based solvation model, which is described in Chapter 2. This thesis is concluded by discussing the main achievements and future work. 26 Chapter 2 Eulerian formulation This chapter presents a differential geometry based model for the analysis and computation of the equilibrium property of solvation. Differential geometry theory of surfaces is utilized to define and construct smooth interfaces with good stability and differentiability for use in characterizing the solvent-solute boundaries and in generating continuous dielectric functions across the computational domain. A total free energy functional is constructed to couple polar and nonpolar contributions to the solvation process. Geometric measure theory is employed to rigorously convert a Lagrangian formulation of the surface energy into an Eulerian formulation so as to bring all energy terms on an equal footing. By minimizing the total free energy functional, we derive coupled generalized Poisson-Boltzmann equation (GPBE) and generalized geometric flow equation (GGFE) for the electrostatic potential and the construction of realistic solvent-solute boundaries, respectively. By solving the coupled GPBE and GGFE, we obtain the electrostatic potential, the solvent-solute boundary profile, and the smooth dielectric function, and thereby improve the accuracy and stability of implicit solvation calculations. We also design efficient second order numerical schemes for the solution of 27 the GPBE and GGFE. Matrix resulted from the discretization of the GPBE is accelerated with appropriate preconditioners. An alternative direct implicit (ADI) scheme is designed to improve the stability of solving the GGFE. Two iterative approaches are designed to solve the coupled system of nonlinear partial differential equations. Extensive numerical experiments are designed to validate the present theoretical model, test computational methods, and optimize numerical algorithms. Example solvation analysis of both small compounds and proteins are carried out to further demonstrate the accuracy, stability, efficiency and robustness of the present new model and numerical approaches. Comparison is given to both experimental and theoretical results in the literature. This chapter is organized as follows. Section 2.1 is devoted to the theoretical foundation of the present differential geometry based solvation model. A variational framework is established to couple different parts of the solvation contributions. Governing equations are derived by variational principles. The solution of the governing equations leads to physical solvent-solute boundaries and accurate solvation free energies. Numerical methods and algorithms are discussed in Section 2.2. Schemes of the second order numerical accuracy are designed for the construction and evolution of solute characteristic function. Appropriate preconditioners are used for solving the generalized Poisson-Boltzmann equations. The coupled equations are solved by two iterative schemes. Section 2.3 presents validation and analysis of the proposed numerical approaches. The accuracy and convergence of various computational schemes, including the surface area formulation based on the geometric measure theory, are carefully tested to ensure their computational reliability and efficiency. The applications of the proposed theories, methods and algorithms are considered to two sets of compounds: small molecules and proteins in Section 2.4. Finally, this chapter ends with 28 1 s S 1−S 0.5 0 −6 −4 −2 0 x 2 4 6 Figure 2.1: The cross line of S and (1 − S) of a diatomic system described in Section 2.3.3 concluding remarks. 2.1 Theory and model In this section, a differential geometry based model of solvation is briefly described for macromolecules and their aquatic environment that are near equilibrium. More details about the differential geometry based multiscale formalism, particularly dynamics and transport aspects, can be found elsewhere [243]. For a system near equilibrium, the density of charged particles in the solvent can be approximated by the Boltzmann distribution, which considerably reduces the number of degrees of the freedom of the solvation system. Alternatively, the Nernst-Planck equations or the full set of the Navier-Stokes equations might be utilized to describe systems that are far from the equilibrium [243]. 29 2.1.1 Solute-solvent boundary Let us consider a multi-domain setting of a macromolecule and solvent system. The macromolecule is described in discrete atomic detail, while the aqueous solvent is treated as a continuum. Therefore, the domain Ω ∈ R3 is essentially divided into two (types of ) regions, i.e., aqueous solvent domain Ωs and macromolecular domain Ωm . Therefore, one has Ω = Ωs Ωm . However, because electron wavefunctions of the solvent and the solute overlap at the atomic scale, Ωs and Ωm should overlap with each other at the boundary of molecules and solvent, i.e., Ωb = Ωs Ωm = ⊘, where Ωb is the region of solvent-solute boundary. Therefore, we propose a characteristic function S : R3 → R to characterize this overlapping solvent-solute boundary. As such, S(x) is a description function or a characteristic function of the solute domain, i.e., it is one (S = 1) inside the biomolecule and zero (S = 0) in the aquatic solvent. However, S takes a value between zero and one at the solvent-solute boundary region. Therefore, (1 − S) is a description function or a characteristic function for the solvent domain. The profiles of S and (1 − S) are depicted in Figure 2.1 for a simple system. It is seen that there is a transition region at the solvent-solute boundary where the solvent and the solute regions overlap. Obviously, in our model, the evaluation of all the solvent-solute properties depends on S. Physically, S and thus the profile of solvent-solute boundary, must be determined by the energy minimization principle. Therefore, our task is to identify the energy functional that to be optimized. This task is accomplished via the differential geometry of surfaces and manifolds in the present work. 30 2.1.2 Total free energy functional The solvation process of macromolecules involves a number of interactions. As discussed in the Introduction, typically, the free energy of solvation models consists of polar and nonpolar contributions, as well as polar and nonpolar interactions. 2.1.2.1 Polar free energy functional The polar part is standardly represented by electrostatic interactions, which are of special importance because of their long range and influence on polar or charged molecules including water, aqueous ions, and amino or nucleic acids. They are also some of the most important aspects that determine the physical and chemical properties of biomolecules, such as protein folding, protein-DNA binding, gene expression and regulation, etc. The widely used free energy functional of the electrostatic system was given by Sharp and Honig [199] and Gilson et al [97]. However, their formulation is based on a sharp interface that separates the solvent and solute domains. In our formulation, we incorporate the function S into the polar solvation free energy functional Gp = 1 S ρm φ − ǫm |∇φ|2 2 Ω   Nc  1 +(1 − S) − ǫs |∇φ|2 − kB T n0 (e−φQi /kB T − 1) dr i  2 i=1 (2.1) where φ is the electrostatic potential whose domain is the whole computational domain Ω, and ǫs and ǫm are the dielectric constants of the solvent and solute, respectively. Here ρm = j qj δ(r − xj ) is the density of molecular charges, with qj being the partial charge on an atom located at xj , Qi is the charge of ion species i, Nc is the number of ion species, 31 kB is the Boltzmann constant, T is the temperature, and n0 is the bulk concentration of the i ith ionic species. The term associated with S is the electrostatic free energy of the solute and that with (1 − S) is the electrostatic free energy of the solvent. The above electrostatic free energy functional is inherently multidomain in nature and the domain is divided into the solute subdomain and the solvent subdomain as indicated by S and 1−S, respectively. These subdomains do not have to be mutually exclusive. A discrete description of the solute and a continuum description of the solvent are also employed in Eq. (2.1) in the framework of the implicit solvent treatment, in which the charge density of mobile ions follows the Boltzmann distribution. Moreover, it will be demonstrated that the present electrostatic free energy functional is able to reproduce the classical Poisson-Boltzmann equation when a sharp solvent-solute interface is used and S becomes a Heaviside function. Finally, we note that the terms that are quadratic in the potential gradient in Eq. (2.1) have negative signs. Therefore, the free energy will be optimized instead of being minimized. In this work, we have adopted the earlier sign convention in the field [199, 97]. 2.1.2.2 Non-polar free energy functional For the nonpolar contribution, we consider the nonpolar solvation free energy functional discussed in the Introduction. Gnp = γ · Area + p · Vol + ρ0 Ωs U att dr (2.2) To obtain a functional relation for S, it is necessary to rewrite nonpolar free energy 32 formulation in terms of S(r). The enclosed volume of biomolecule can be given by Vol = Ωm dr = Ω S(r)dr. (2.3) Similarly the attractive dispersion term can be rewritten in the form ρ0 Ωs U att dr = ρ0 Ω (1 − S(r))U att dr, (2.4) where we assume that the solvent bulk density ρ0 is a constant in space. Typically, one expresses the area of a unique surface as a surface integration over the biomolecular boundary in the Lagrangian formulation. However, this approach does not work directly in our formulation because no sharp solvent-solute boundary is assumed. In fact, the concept of the surface area cannot be defined in the same manner as in the sharp surface case. For a smooth boundary, there are infinitely many surfaces and the surface area can be defined as a weighted mean of this family of surfaces. Additionally, for practical purpose, we need an appropriate Eulerian formulation for the surface area so that we can put all energy contributions on an equal footing. Therefore, we need to convert the surface integral into a volume one. To this end, we make use of the coarea formula in the geometric measure theory [78]. For a scalar field B in R3 , with C 1 continuity condition, integrating a function f over its isolevel c in a region Ω can be done directly by a volume integral over Ω through the expression R B −1 Ω f dσdc = 33 Ω ∇B f (r)dr, (2.5) where c denotes an isovalue of B, and B −1 represents the c-isosurface, i.e., the set of points {rc } such that B(rc ) = c. Here, the coarea formula prescribes a relationship between the sum of area integrals and a global volume integral. In our case, we introduce the concept of mean surface area of the family of isosurfaces which are subsets of point satisfying {S(r) = y}, where 0 ≤ y ≤ 1. Therefore the mean surface area can be given by a volume integral as Area = = 1 0 S −1 (c) Ω dσdc (2.6) ∇S(r) dr. Ω Note that ∇S = 0 only in the region of the solvent-solute boundary. Numerical test of this formulation will be presented in Section 2.3.1. Finally, the electrostatic free energy functional is complemented by the nonpolar free energy functional to give the total free energy functional of solvation for biomolecules at equilibrium Gtotal = Ω γ ∇S(r) + pS(r) + ρ0 (1 − S(r))U att (2.7) 1 + S(r) ρm φ − ǫm |∇φ|2 2   Nc 1 + (1 − S(r)) − ǫs |∇φ|2 − kB T n0 e−φQi /kB T − 1  dr. i 2 i=1 Note that the polar and nonpolar parts are coupled via the characteristic function S, which is determined by the total energy optimization instead of the surface free energy optimization as done in our earlier work [20]. The above total free energy expression provides a basis for 34 the evaluation of the solvation free energy and a starting point for the derivation of governing equations for the solvation analysis. A similar coupling of polar and nonpolar interactions was described previously by Dzubiella and co-workers [74, 75]; however, the implementation of non-polar interactions and the representation of continuum and discrete domains differ significantly from the present work. 2.1.3 Governing equations The solvation free energy functional is a functional in terms of characteristic function S and potential φ. The integral is taken over the whole space. From the physical point of view, there should exist an optimal function S(r) and an optimal potential φ at the equilibrium state which optimize the total energy. Since S and φ can vary independently in our formulation, to optimize Gtotal , it is necessary that δGtotal ⇒ Sρm + ∇ · ([(1 − S)ǫs + Sǫm ]∇φ) + (1 − S) δφ Nc i=1 n0 Qi e−φQi /kB T = 0 (2.8) i and δGtotal ∇S ⇒ −∇ · γ δS ∇S 1 + p − ρ0 U att + ρm φ − ǫm |∇φ|2 2 Nc 1 2+k T + ǫs |∇φ| n0 (e−φQi /kB T − 1) = 0, B i 2 i=1 where ∇ · γ ∇S ∇S (2.9) is a generalized Laplace-Beltrami operator, which is a generalization of the usual Laplacian operator to a smooth manifold [17, 243]. In general, γ can be a function of the position γ = γ(r) to reflect surface hydrophobicity at different locations. However, 35 Epsilon 80 60 40 20 0 −6 −4 −2 0 x 2 4 6 Figure 2.2: The cross line profile of ǫ(S) of a diatomic system described in Section 2.3.3. Here, we have set ǫs = 80 and ǫm = 1. it is treated as a constant in our present computation. From Eq. (2.8) we result in the generalized Poisson-Boltzmann equation (GPBE) Nc −∇ · (ǫ(S)∇φ) = Sρm + (1 − S) i=1 n0 Qi e−φQi /kB T , i (2.10) where the dielectric function is given by ǫ(S) = (1 − S)ǫs + Sǫm . (2.11) This expression provides a smooth dielectric profile. Figure 2.2 shows the cross line of the dielectric function ǫ(S) of a diatomic system. It is seen that there is a smooth transition region for the dielectric constant to change from ǫs to ǫm . The solution procedure of Eq. (2.10) can differ very much from that of the standard PB equation, due to the smooth dielectric function. Particularly, many mathematical difficulties of solving elliptic equations with discontinuous coefficients [258, 261, 260, 254, 253] can be avoided in the present generalized Poisson-Boltzmann equation. 36 For a weak electrostatic potential, i.e., φ ≪ 1, one can linearize the generalized PB equation −∇ · (ǫ(S)∇φ) + (1 − S)κ2 φ = Sρm , (2.12) u where κ is a modified Debye-H¨ckel screening function describing the ion strength [108]. Furthermore, the solution of Eq. (2.9) leads to a “physical solvent-solute boundary” (S). As discussed in earlier work [20, 17, 243], the solution of this elliptic partial differential equation can be attained via a parabolic partial differential equation ∂S = ∂t ∇S ∇· γ ∇S ∇S +V , (2.13) where the generalized “potential” V is defined as 1 1 V = −p+ρ0 U att −ρm φ+ ǫm |∇φ|2 − ǫs |∇φ|2 −kB T 2 2 Nc i=1 n0 e−φQi /kB T − 1 . (2.14) i Note that Eq. (2.13) has the same differential operator as the mean curvature flow equation [20], except for the extra external source term. Therefore, it is a special case of the potential driven geometric flow equation proposed in our earlier work [17]. In Eq. (2.13), as t → ∞, the initial profile of S evolutes into a steady state solution, which solves the original Eq. (2.9). It is interesting to see that the sharp solvent-solute interface and the standard PB equation, as well as related interface conditions, can be recovered from Eq. (2.10). For a sharp interface, S becomes a Heaviside function, having value 1 for the solute subdomain and 0 for the solvent subdomain. As such, the smooth transition region in the dielectric function disappears and the dielectric function becomes discontinuous. Then, Eq. (2.10) reduces to 37 the classical form of the Poisson-Boltzmann equation [108] −ǫm ∇2 φm = ρm −ǫs ∇2 φs = ∀r ∈ Ωm (−φs Qj /kB T ) ∀r ∈ Ωs j Qj cj e with appropriate interface conditions φs = φm , and ǫm ∇φm · n = ǫs ∇φs · n ∀r ∈ Γ, (2.15) where φm and φs represent the potential in the solute domain Ωm and solvent domain Ωs , respectively, Γ denotes the sharp interface, and n is the normal vector of the solvent-solute sharp interface. Note that the generalized Poisson-Boltzmann Eq. (2.10) and the potential driven geometric flow equation (2.13) are strongly coupled. Therefore, these two equations have to be solved by appropriate iterative procedures. This aspect will be discussed in Section 2.2.3. 2.2 Methods and algorithms This section presents a variety of computational methods and algorithms for the solution of the generalized Poisson-Boltzmann equation and the generalized geometric flow equation. 2.2.1 Discretization schemes of the governing equations We design second order finite difference schemes for governing equation derived from the free energy optimization. 38 2.2.1.1 The generalized Poisson-Boltzmann equation For the solution of the generalized PB equation, the finite difference scheme is utilized in this study. The continuous dielectric definition allows us to obtain accurate solution by using the standard second order center difference scheme. Let the pixel (i, j, k) represent the position (xi , yj , zk ). The discretized form of Eq. (2.12) is + + + + + = 1 ǫ(xi + h, yj , zk )[φ(i + 1, j, k) − φ(i, j, k)] 2 1 ǫ(xi − h, yj , zk )[φ(i − 1, j, k) − φ(i, j, k)] 2 1 ǫ(xi , yj + h, zk )[φ(i, j + 1, k) − φ(i, j, k)] 2 1 ǫ(xi , yj − h, zk )[φ(i, j − 1, k) − φ(i, j, k)] 2 1 ǫ(xi , yj , zk + h)[φ(i, j, k + 1) − φ(i, j, k)] 2 1 ǫ(xi , yj , zk − h)[φ(i, j, k − 1) − φ(i, j, k)] 2 (1 − S(i, j, k))κ2 φ(i, j, k)h2 − S(i, j, k)q(i, j, k)/h (2.16) where h is the grid spacing, and q(i, j, k) is the fractional charge at grid point (xi , yj , zk ), which is resulted from the interpolation of the charge density ρm . The second order interpolation (i.e., the trilinear mapping) is used to distribute charges. Thus, the discretized PB equation can be cast into the standard linear algebraic equation system of the form AX = B, where X is the solution of the interest, A is the discretization matrix and B is the source term associated with the continuum and discrete charges. The boundary condition is built up by the far field condition and practically obtained by the sum of potential contributions from individual atomic charges with a decay factor from the continuum charge strength κ [93]. We have explored the use of the biconjugate gradient method as the linear solver. 39 Matrix acceleration is discussed in a later section. The initial guess of the solution is set to 0 and the convergence tolerance is set as 10−6 . It is shown in the test section that the PB solver is able to deliver the designed second-order accuracy. The solution of geometry flow equation (2.13) is described in detail in Appendix A. 2.2.2 Acceleration procedures The computational efficiency of the solution process is an important issue and can be a bottleneck for further applications of the present model. Particularly, when this model is applied to molecular dynamic simulation, the generalized PB equation and geometry flow equation are to be solved up to millions of times. Therefore, any nontrivial improvement in computational efficiency will make the present model more feasible to many practical applications in chemical and biological systems. 2.2.2.1 Precondition of the PB solver The linear algebraic system of the discretized PB equation can be solved by two major approaches: direct methods and iterative methods. Large memory requirement prohibits direct methods to be used in the matrix resulted from large chemical and biological systems. Widely used iterative methods, including Gauss-Seidel and successive over-relaxation (SOR), work well for the generalized PB equation, but typically converge slowly. Conjugate Gradient method is quite efficient for symmetric and positively definite matrices. However, the sparse matrix A resulted from Eq. (2.16) is seven-fold banded but nonsymmetric because the dielectric distribution function is not a constant and varies in the transition region. The biconjugate gradient (BiCG) method can be a good choice for 40 non-symmetric systems and has been adopted in a variety of our MIB schemes [258, 261, 260, 254, 253], but attentions are still to be paid in regard to the convergence issue. We have studied the application of pre-conditioners in two linear solver libraries, the SLATEC (http://people.sc.fsu.edu/˜burkardt/f src/slatec/slatec.html) and the PETSc (http:// www .mcs.anl.gov/petsc/petsc-as/) to the solution of the PB equation [40]. It turned out that combination of stabilized biconjugate gradient method (BiCG) and the blocked Jacobi preconditioner (BJAC) from the PETSc and the combination of the orthomin method (OM) and the incomplete LU factorization preconditioner (ILU) from the SLATEC performed better compared to other tested solvers, preconditoners and their combinations [40]. In this study, we focus on the combination of the ILU and the OM from the SLATEC, which is easy to incorporate into the present iteration procedure and provides a stand-alone package, while the PETSc needs to be pre-installed before being used. In Section 2.3, we further demonstrate the improvement by the combination of pre-conditioners and the iterative linear solvers. 2.2.2.2 Initial guess of the generalized PB solution A good initial guess is always desired for the speedup of the iterative PB solver. Normally, the initial guess can be simply set to 0 because it is complicated and computationally expensive to find good ones. However, in our iteration procedure, it is found that the electrostatic potential distribution does not change dramatically from the prior calculation due to the small adjustment in dielectric from the prior step. Therefore, it is beneficial to take the prior potential as a good guess for solving the linear system. It turns out that the generalized PB solver converges faster when the electrostatic potential from the previous iteration is used as an input. 41 2.2.2.3 Convergence criteria in the generalized PB solver The convergence criterion directly influences the accuracy and CPU cost of the solution of the generalized PB equation. The smaller convergence criterion, the more accurate the solution of linear system is. However, the smaller convergence criterion requires more iterations and longer CPU time in the iterative solution process. Therefore, it is worthwhile to find a criterion which is a good compromise between the accuracy and the efficiency. Typically, a value of 10−6 is used in many chemical and biological applications. Later on we will numerically investigate the effect of convergence criterion on the electrostatic solvation energy, mean surface area and mean volume which are used to compute the total solvation free energy. With the 10−8 as a standard, we will examine the efficiency and the accuracy for several relaxed convergence criteria, such as 10−6 , 10−4 , 10−3 , 10−2 and 10−1 . 2.2.3 Dynamical coupling of the generalized Poisson Boltzmann and geometry flow equations As described in Section 2.1, the present differential geometry based solvation model prescribes a procedure to set up the total free energy functional of the solvation. By the variational principle, we obtain generalized coupled PB equation (2.10) and potential driven geometric flow equation (2.10). The solution of these coupled nonlinear equations provides a “physical” dielectric profile ǫ(S) and the electrostatic potential φ and thereby enables the calculation of the solvation free energy. The solution of the potential driven geometric flow equation (2.10) requires the knowledge of φ, while the solution of the generalized PB equation (2.10) requires the input of S and ǫ(S). Therefore, in principle, the geometric flow equation needs to be solved simultaneously with the generalized PB equation until a self-consistency is reached. 42 In this study, we explore two self-consistent iteration procedures. 2.2.3.1 Approach I The iteration process can be carried out by breaking up the process into an iterative sequence of two steps as follows: Starting with an initial guess of characteristic function S, one finds out the temporary electrostatic potential φ by solving the generalized PB equation with a given initial S. Once the electrostatic potential is obtained, the electrostatic energy can be calculated. The second step is to solve the time-dependent generalized geometric flow equation for S with a prior calculated potential φ. In this step, the time integration can usually reach a quasi-steady state. With the updated quasi-steady S, one can come back to the first step for the next cycle until the solvation free energy converges to within a pre-determined criteria. However, in practice, simply re-inserting S into the PB solver may diverge. Because the quasi-steady S may vary dramatically during the iteration. Note that all changes in S are concentrated around the solvent-solute boundary, as the final solution of the potential driven geometric flow equation reflects the balance between the intrinsic curvature energy and the external potential terms. A large change in S near the solventsolute boundary in turn leads to much adjustment in the electrostatic potential which differs much in the solute and solvent regions. To avoid this problem, we adopt a simple relaxation algorithm: the characteristic function S used for the PB solver is a linear combination of the previous one Sold and the newly generated one Snew S = αSnew + (1 − α)Sold , 43 0 < α < 1. (2.17) It turns out that the convergence of the generalized PB equation is guaranteed if α is small enough. In this work, α can be taken in the range from 0.1 to 0.7. The choice of α is explored later. Note that this approach may fail sometimes when the generalized geometric flow equation blows up due to a large variation in the temporary electrostatic potential. We therefore utilize a similar procedure for the electrostatic potential used in the evolution of the generalized geometric flow equation φ = α′ φnew + (1 − α′ )φold , 0 < α′ < 1, (2.18) where φold and φnew are previous and newly resolved electrostatic potentials, respectively. This treatment can avoid the blow-up of the generalized geometric flow solution. 2.2.3.2 Approach II To reduce the dramatic changes in S and φ, we can consider a straightforward way in which solving generalized PB equation follows each time-step integration of the generalized geometry flow equation. However, this treatment makes the whole iterative procedure computationally over expensive as many more PB solution processes are required. Additionally, it may not be necessary since the change in the S from one time step to another one is so small that the change in the corresponding potential distribution should be very small. Indeed, it is practical to update electrostatic potential after a number of time steps (i.e., 10 to 100 steps) in the generalized geometry flow equation integration rather than every time step. We call the number of time integration between two φ updatings the number of intermittency Nstep . This approach effectively speeds up the whole process. Additionally, the relaxation algorithm given in Eqs. (2.17) and (2.18) can also be used here to guaran44 tee the convergence. Moreover, in this approach, one better starts the iterative process by solving the S from Eq. (2.13) without the electrostatic potential term. So that the later iteration procedure can focus on the impact of electrostatic potential to the generation of the solvent-solute boundary. This treatment reduces the total iteration number and save the computational time significantly. In fact, there is a relationship between Approach I and Approach II. When the number of the time integrations becomes larger and larger, Approach II returns to Approach I. In Section 2.3, we systematically study the difference between these two approaches. This can be done by comparing the impacts of different approaches on the resulting total solvation free energy, surface area and volume of the solute molecule. It is found that these two approaches lead to the same results. This, to some degree, indicates the reliability and validity of the proposed iteration procedures. 2.2.4 Evaluation of the solvation free energy Once the electrostatic potential φ and the characteristic function S are obtained, the solvation free energy is given by ∆G = G − G0 (2.19) where G0 is the energy calculated from the homogeneous environment with ǫs = ǫm = 1 without the nonpolar energy part. Therefore, we have ∆G = Gp + Gnp − G0 . 45 (2.20) The expressions of Gp and Gnp are taken from Eq. (2.7). Here Gp − G0 can be considered as the electrostatic solvation free energy. In all calculations presented here except for salt effect calculation, mobile ions will be set to zero corresponding to a solution without salt. Therefore we have the following approximation Gp = Ω S(r)ρm φdr − 1 1 ǫ(S(r))|∇φ|2 dr ≈ S(r)ρm φdr. 2 Ω 2 Ω (2.21) Discretizing the integral yields 1 Gp = 2 Nm Q(ri )φ(ri ), (2.22) i=1 where Q(ri ) is the ith partial charge at ri in the biomolecule, and Nm is the total number of partial charges. Now the electrostatic solvation free energy can be calculated as 1 ∆Gp = Gp − G0 = 2 Nm i=1 Q(ri )(φ(ri ) − φ0 (ri )), (2.23) where φ and φ0 are electrostatic potentials in the presence of the solvent and the vacuum, respectively. Here φ is computed from the generalized Poisson equation (2.10) using the continuous dielectric distribution −∇ · (ǫ(S)∇φ(r)) = Sρm (2.24) where ǫ(S) and ρm are the same as the ones in Eq. (2.10). The homogeneous solution φ0 is computed with ǫ(S) = ǫp in the whole domain. The nonpolar part, Gnp , is computed exactly by Eq. (2.2). 46 2.3 Numerical test and validation This section provides systematic validations for the computational algorithms and schemes proposed in the last two sections. We first examine the behavior of the coarea formula, then continue testing through various equation solvers, and finally check the impact of the potential term in our generalized geometric flow equation. 2.3.1 The behavior of the coarea formula As described earlier, the coarea formula plays an important role in describing the mean surface area of an infinite family of smooth solvent-solute boundaries by a volume integral. This Eulerian formulation puts the free energy of the surface area and other free energies in an equal footing. Usually, the isovalue of the surface area in the coarea formula can be any positive real number. But for the present derivation, it is limited to be between 0 and 1 because S is defined as a characteristic function of the solute. Here, we numerically explore the behavior of the coarea formula in a bounded open set. To this end, we design some test cases as follows: Let B be a smooth function with a specific expression according to the geometry in the coarea formula, we set mean surface area of {x|0 < B(x) < 1} = 1 0 B −1 Ω dσdc = Ω ∇B dx (2.25) where Ω is a bounded open set. Therefore, the mean surface area has the same value as the volume of open set Ω {x|0 < B(x) < 1}. Computationally, integrating over the norm of the gradient of B gives rise to the corresponding mean surface area. The volume integral of 47 a density function f is just simply approximated by Ω f (x, y, z)dr ≈ f (xi , yj , zk )h3 (2.26) (i,j,k)∈J where the summation is over J, the set of points inside Ω, and (xi , yj , zk ) is the coordinates of grid points (i, j, k). Table 2.1 lists the numerical results and exact values of the surface areas for the following cases (a) A unit sphere: Ω = {(x, y, z)|x2 + y 2 + z 2 ≤ 1} and B = x2 + y 2 + z 2 (b) A cylinder: Ω = {(x, y, z)|x2 + y 2 < 1, −4 ≤ z ≤ 4} and B = x2 + y 2 (c) A ellipsoid: Ω = {(x, y, z)|(x/a)2 + (y/b)2 + (z/c)2 ≤ 1} and B= (x/a)2 + (y/b)2 + (z/c)2 , where a = 20/7, b = 25/14, and c = 25/14. It is evident that the numerical result converges to the exact value. The errors from the Table 2.1: Areas computed from the coarea formula for bounded open sets grid spacing case 0.5 0.25 0.1 0.05 0.025 exact value Sphere 4.00 4.00 4.15 4.17 4.18 4.189 Cylinder 22.50 23.25 24.49 24.84 25.01 25.133 Ellipsoid 37.75 37.97 38.10 38.17 38.16 38.163 cylinder are slightly larger than those from the sphere and ellipsoid because the cylinder has non-smooth edges. However, the errors are small for all cases. Therefore, we conclude that the mean value of the areas of the family of smooth solvent-solute surfaces indeed converges to the area of the corresponding sharp surface. Thus, the present definition of the mean surface area of an infinite family of smooth surfaces is an important generalization of the classic concept of the area of a sharp surface. 48 2.3.2 Accuracy of the generalized PB solver In this section, we investigate the accuracy of the proposed numerical solvers. The generalized geometric flow equation (A.8) has the same differential operator as the mean curvature flow [20] except for the extra source terms. Previously, we have numerically proved that the explicit Euler algorithm delivers the reliability and convergence of the solution of geometric flow equations, and the finite central different scheme is of second order convergence in space [17]. Here, we focus on the test of the accuracy of the generalized PB solver with the proposed dielectric function ǫ(S). Although the discretization form of the second-order finite difference PB expression has been used for other continuous dielectric definitions [112], the accuracy of this approach has not been examined. Moreover, it is worthwhile to validate the generalized PB solver due to its different settings of dielectric function, i.e., the ǫ(S) profiles generated by the geometric flow equation. For this purpose, we construct a benchmark test of a simple one-ball system. We examine the convergence order and the accuracy of the finite difference scheme in solving the generalized PB equation. In particular, we consider a modified Poisson equation with a designed dielectric definition given by ǫ(r) = ǫ1 Su(r) + ǫ2 · (1 − Su(r)) 49 (2.27) where r = (x, y, z), ǫ1 and ǫ2 are two constants to be determined, and    1         b− Su(r) = −2           0 if 3  2 +y 2 +z 2 b− x  +3 b−a x2 + y 2 + z 2 < a; 2 2 +y 2 +z 2 x  if a ≤ b−a if b < x2 + y 2 + z 2 ≤ b; x2 + y 2 + z 2 ; (2.28) where a < b. Note that through the definition of ǫ and Su(r), this designed case has the same features of the dielectric definition as that in our model. The exact solution is designed to be φ0 (r) = cos(x) cos(y) cos(z). (2.29) Then the modified equation becomes ∇ · (ǫ∇φ) = ǫx ∇x φ0 + ǫy ∇y φ0 + ǫz ∇z φ0 + ǫ∇2 φ0 where ∇2 φ0 (x, y, z) = −3 cos(x) cos(y) cos(z) ∇x φ0 = − sin(x) cos(y) cos(z) ∇y φ0 = − sin(y) cos(x) cos(z) and ∇z φ0 = − sin(z) cos(x) cos(y). 50 (2.30) For ǫ, if a ≤ x2 + y 2 + z 2 ≤ b, we have  ǫi (r) = 6(ǫ2 − ǫ1 )    b− x2 + y 2 + z 2 i b−a  (b − a) x2 + y 2 + z 2   x2 + y 2 + z 2 − a b−a   (2.31)  where i= x, y, z. Otherwise, ǫx = ǫy = ǫz = 0. Table 2.2 lists the computed errors under different mesh sizes with a = 1 and b = 3. The standard absolute norm error measurement L∞ is employed. Here ǫ1 is fixed to be 1 and ǫ2 is taken to be 80 or 10. The second order convergence in space is observed for the scheme. Furthermore, it is found that a large ǫ2 may slightly deteriorate the convergence. Table 2.2: Errors and convergence orders for the generalized PB solver (ǫ1 = 1) ǫ2 = 80 ǫ2 = 10 spacing L∞ order L∞ order 1 0.22 0.13 0.5 8.13*10-2 1.65 3.18*10-2 2.02 0.25 2.06*10-2 1.99 7.97*10-3 2.00 0.125 5.44*10-3 1.94 1.98*10-3 2.01 2.3.3 Convergence of boundary profile and dielectric function In the present model, the characteristic function S defines the solvent-solute boundary. Consequently, it can significantly affect the solution of the generalized Poisson-Boltzmann equation, the surface area and volume, and thus, plays a key role in the solvation free energy calculation. To illustrate the evolution and the convergence of the generalized geometric 51 flow equation and corresponding S function, we solve Eq. (2.13) without the electrostatic potential term in this test. However, the electrostatic solvation free energy at a given time can att still be calculated. The expression of attractive interaction Ui needs to be given explicitly in order to solve the geometric flow equation. Here we consider the 6-12 Lennard-Jones (L-J) att pair potential to model Ui (see the description in Appendix A). All the calculations in this work are carried out by using the WCA decomposition. To illustrate the S profile and evolution, we consider a diatomic system with the van der Waals radius 2.2˚ and coordinates (x, y, z) = (−3, 0, 0) and (1.4, 0, 0). The spacing and A time stepping are chosen as h = 0.25˚ and τ = h2 /4.5, respectively. The solvent probe A radius is set to 2˚, which is used for the initial value setting and constraint construction. A In fact a much small solvent probe radius can ensure the correct surface topology [20]. The computational domain is set to [−8.70, 7.05] × [−5.7, 5.55] × [−5.7, 5.55]. Thus, the size of computational system is 64×46×46. The L-J parameters are set as follows: σi is taken from atomic radius and σs is chosen to be 0.65˚. We set well depth ǫi = 0.039 kcal/mol and bulk A density coefficient ρ0 /γ = 2, where, γ = 1/15kcal/(mol˚2 ). To compute the electrostatic A solvation free energy during the evolution of solvent-solute boundary, 1 unit charge is set on the center of each atom. We choose the dielectric constants ǫm = 1 and ǫs = 80, respectively. We set pressure coefficient p/γ = 0.2. A different γ is used for real systems. The evolution process of diatomic solvent-solute boundary is depicted through a group of cross section profiles of the S function in Figure 2.3 where the values of S from a set of points of (x, y, 0.05) are described. The cross sections start with a relatively fat-shaped interface which reflects the solvent accessible density. Here, S = 1 inside the molecular domain and S = 0 in the solvent domain. Then the solvent-solute boundary is driven inward by 52 (T=0.07) (T=1.0) (T=1.5) (T=5.0) Figure 2.3: The evolutionary profiles of the S function at cross section (x, y, 0.05) in a diatomic system plotted from four intermediate states. 53 intrinsic geometric curvature effect and external potential. At the same time, there appears a transition region between the solvent and the solute. Finally a convergence is reached with a balance among intrinsic geometric curvature effect, different potentials and enforced constraints. To have a clear idea about the distribution feature of the S function, we draw a cross line from the cross section graph at T = 5 along x = −0.75 which are shown in Figure 2.1, where the functions of S and 1 − S are described together. It can be seen that the S function in the transition region is rather smooth. Once the S function is determined, the dielectric function ǫ(S) is calculated according to Eq. (2.11). Here the dielectric function ǫ(S) corresponding to the S function in Fig (2.1) is also exhibited in Figure 2.2. It has a pattern similar to 1 − S but with different values. It is important to note that the dielectric function ǫ(S) is also very smooth at the solute-solvent boundary. That is why the classical finite difference scheme can be applied to solve the generalized PB equation without reducing the accuracy of the solution. Once the solution of the generalized PB equation is computed, the electrostatic solvation free energy can be calculated. Therefore, the time history of the free energy along with the evolution of solvent-solute boundary can be recorded. To illustrate the convergence pattern of the solvation free energy, we compute the electrostatic solvation free energies in intermediate states during the time evolution. The results are shown in Figure 2.4. In order to show evolution histories of the surface area, volume and solvation free energy together in one plot, we plot two linear functions F (Volume) and J(Area) which have the same time dependence as the volume and the surface area, respectively. Here T denotes the time span and N = T τ represents the number of computational steps in the generalized geometric flow solver. It is found that the solvation free energy decreases with respect to the time evolution, which 54 −50 −100 −150 −200 −250 0 10 Energy F(Volume) J(Area) 1 10 N 2 10 3 10 Figure 2.4: The time evolution histories of the electrostatic solvation free energy, F (Volume) and J(Area), where F (Volume) = volume/5 − 180 and J(Area) = (surface area)/5 − 200. is consistent with our theoretical formulation. It is observed that the solution of our model converges to a steady state in terms of volume (˚3 ), area (˚2 ) and electrostatic solvation A A free energy (kcal/mol). Moreover, to obtain the results at the steady state, N = 200 or T = 3.5 is large enough to be taken as the stopping time in our geometry flow solver for this system. Normally, it takes a longer evolution time for a large system to set down to the steady state. The total integration time could be considerably shortened had a small probe radius been used [20]. 2.3.4 Consistency of iteration procedures If the electrostatic potential effect is taken into account during the solvent-solute boundary evolution, the iteration procedure has to be used to update the electrostatic potential repeatedly. As described earlier, there are two possible iterative approaches which can be explored to solve the coupling system, in which the simple relaxation algorithm guarantees 55 the convergence of the algorithm. The question is whether these two approaches lead to the same outcome. Table 2.3: Comparison between two iteration approaches 2 atoms Gly Approach I Approach II Approach I Approach II Energy -231.18 -231.18 -12.44 -12.44 Surface area 128.67 128.67 271.91 272.02 Volume 100.72 100.83 287.85 287.93 To study the consistency between these two approaches, the above mentioned diatomic system is employed as well as a small molecule called glycerol triacetate (Gly) from a set of 17 test compounds whose detailed information is given in Section2.4. The self- consistent iteration is performed until the electrostatic solvation free energy converges to within 0.01 kcal/mol. The electrostatic solvation free energy(kcal/mol), surface area (˚2 ) A and volume(˚3 ) resulted from these two different methods are compared. The results are A shown in Table 2.3. Here we take α = 0.5 for both methods. The electrostatic potential φ is updated in every 15 steps of the generalized geometric flow integrations in Approach II. It is evident that the results from these two approaches are almost the same. Therefore, they can be alternatives for each other at least in small molecular systems. But for large protein systems, as we mentioned, it is better to use the second approach to avoid the possible blowup in the generalized geometry flow caused by unpredictable large changes in the temporary electrostatic potential. Thus, in the following tests and applications, the second method is applied except specified. In Approach II, the relaxation factor α and the number of intermittency Nstep need to be determined. We are interested in knowing whether the relaxation factor α plays a role in the final result. Similarly, it is important to know whether the Nstep makes a difference in 56 the converged result. We address these issues by examining the effects of α and Nstep on the electrostatic solvation free energy, surface area and enclosed volume. The above mentioned diatomic system is used here again. Table 2.4: Effect of relaxation factor α on final results α 0.1 0.2 0.5 0.7 0.8 Energy -231.26 -231.18 -231.18 -231.18 divergence Surface area 100.73 100.94 100.83 100.71 Volume 128.65 128.62 128.67 128.71 It is known that a stable α value is between 0 and 1 but can not be very close to 1. We consider a number of α values in the diatomic system while keeping other settings fixed. Table 2.4 shows the electrostatic solvation free energy, surface area and volume for α = 0.1, 0.2, 0.5 and 0.7. It is found that the procedure diverges when α ≥ 0.8. However, convergence is achieved as long as the relaxation factor α is small enough. Once the convergence is achieved there is no much difference in the final outcome. We therefore take α = 0.5 in the following tests and applications. To study the effect of the number of intermittency, we take Nstep = 5, 10, 15, 40 and 100, while fixing α = 0.5 and other settings. The results are listed in Table 2.5. It is seen that all values obtained from different number of intermittency are very close to each other. However, a numerically too large or too small Nstep is not preferable. If Nstep is too large, Approach II goes back to the first one. If Nstep is too small, the iterative process may stop too early because the perturbation in each iteration is so weak that the convergence criteria is satisfied unexpectedly sometimes. In addition, small step number makes the whole process computationally expensive. 57 Table 2.5: Effect of the number of Nstep 100 40 Energy -231.18 -231.19 Surface area 100.83 100.83 Volume 128.63 128.63 2.3.5 intermittency in Approach II 15 10 5 -231.18 -231.17 -231.11 100.83 100.89 101.07 128.67 128.48 128.63 Efficiency of the accelerated iteration procedure We study the efficiency of the accelerated self-consistent iteration in this section. At the beginning we analyze the CPU time usage based on an original combination of methods: Biconjugate Gradient (BiCG) method for the generalized PB solver and a widely used explicit scheme for the generalized geometric flow (GGF) solver. In addition, as commonly used in the linear system of the PB equation, we take 10−6 as the initial convergence criteria and set the first guess of the electrostatic potential in each generalized PB run to be 0. The above mentioned diatomic and Gly systems will be utilized through this efficiency test. Table 2.6 lists the breakup of time spending in the different parts of the self-consistent iteration procedure. It is seen that for these two systems the major computation cost lies in the routines of the generalized PB solver and the generalized geometric flow solver (more than 90%). Therefore, the total time will be dramatically reduced when efficient accelerations are achieved in the generalized PB solver and generalized geometric flow solver. Note that all of the computations are performed on a SGI Altix 350 workstation with a 1.4 GHz Itanium processor and 4 GB memory. Additionally, we explore the improvement made to the generalized PB solver, the generalized geometric flow solver, and consequently to the total time cost. First, we combine an appropriate preconditioner with the iterative solver of the linear system. Additionally, we make use of the prior electrostatic potential as a first guess for the next PB run. Moreover, we obtain the approximations through the relaxation of the 58 Table 2.6: CPU time analysis from original schemes 2 atoms Gly Time(s) % Time(s) % Total 23.95 58.4 GF 4.95 21 11.31 19 PB 18.25 76 45.24 77 Other 0.75 3 2.03 4 convergence criteria of the linear solver. Finally, we employ the ADI scheme to integrate the generalized geometric flow equation. First of all, we do the following improvement: Take the prior potential solution as the first guess of each run of the linear solver, then replace BiCG scheme with a combination of the preconditioner and the iterative solver (ILU/OM), while keeping other settings unchanged. Table 2.7: Speedup from adjustment of initial guess and h size BiCG BiCG1 1 17× 12× 12 0.0557(252) 0.0322(152) 0.5 32× 23× 23 0.775(419) 0.467(248) 0.25 64× 46× 46 17.676(841) 10.410(490) 0.125 127× 92× 92 525.74(2771) 263.11(1371) preconditioner in PB solver ILU/OM Speedup 0.0371(50) 1.50 0.420(82) 1.85 6.947 (166) 2.54 130.76 (410) 4.02 Table 2.7 gives the total computational costs of the generalized PB solver in the diatomic system as well as the total iteration numbers which are inside the parenthesis. The third column lists the time spending for original schemes, the fourth one makes use of prior potential as a first guess and the fifth one records the time spending from the usage of the preconditioner and the new first guess setting. It is seen that the gain of speedup is related to the size of system: The larger size is the system, the more acceleration is achieved. For a 127× 92× 92 system, combination of the above two implementations can obtain a speedup up to a factor of 4, while a single adjustment does not give much impressive improvement. 59 It is also found that although the total iteration number reduces dramatically by adding the preconditioner, the total computational cost is reduced with a much smaller factor. The reason is that the PB solver with a preconditioner takes more time in each step. Next, we study the impact of the convergence criteria to the electrostatic solvation free energy of the diatomic system. Table 2.8 summarizes the calculated electrostatic solvation free energies and total time cost of the PB solver under different convergence criteria. It indicates that 10−4 is good enough to deliver accurate results. In fact, 10−2 is still fine but 10−1 is clearly unacceptable. In this study, we take 10−4 as the convergence criteria of the linear system in the following calculations except specified. Because it is able to save much time compared to 10−6 while at the same time maintains the accuracy to a satisfied level. In practical application, one might use 10−2 . A further reduction in computational time is possible if one sets the probe radius to rp = 0.25rvdW , where rvdW is the van der Waals radius [20]. Table 2.8: Influence of convergence criteria on electrostatic solvation free energy and computational time for the diatomic system Criteria 10−8 10−6 10−4 10−3 10−2 10−1 Energy (kcal/mol) -231.17 -231.17 -231.19 -231.28 -231.07 -239.80 PB Time (s) 10.44 6.95 4.17 3.09 2.05 0.90 Finally, we implement the ADI scheme in the generalized geometric flow solver. Thus we can use a much larger time increment than that used in an explicit scheme without the stability concern. For example, if grid spacing is h = 0.25, the time step size can be taken as large as 0.2 for the ADI scheme to be a good balance between accuracy and efficiency, while it has to be less than 0.02 in the explicit scheme. The acceleration of the generalized geometric flow solver can be found in Table 2.9, which is obtained by applying all of the 60 speedup strategies we have discussed to the diatomic and Gly systems. This table shows all the time spending for major routines in the iterative process before and after the acceleration. It indicates that speedup in the PB solver can reach a factor of 4 or 5. However, the speedup in the total time is not as impressive as in the PB solver. It is about a factor of 3 in the Gly system and about a factor of 2 in the diatomic system. The reason is that the acceleration in the generalized geometric flow equation through the ADI can not have the same speedup factor as that of the PB solver. The electrostatic solvation free energies are also given in the table for a comparison before and after the speedup. Little difference in energies is observed due to varying schemes and the approximation. Table 2.9: Comparison of CPU time (s) in the iteration procedures with and without accelerations 2 atoms Gly without with Speedup without with Speedup Total 23.95 8.87 2.70 58.40 16.28 3.59 GF 4.95 3.67 1.35 11.13 5.57 2.00 PB 18.25 4.45 4.10 45.24 8.71 5.19 Other 0.75 0.75 2.03 2.00 Energy -231.17 -231.18 -12.44 -12.44 2.3.6 Impact of potentials in the geometric flow equation The potential source terms in the generalized geometric flow equation include pressure, longranged attractive dispersion interaction and electrostatic potential. The solution (S) of the generalized geometric flow reflects a balance between the intrinsic geometric curvature effect and several external potentials at the equilibrium. In this section, we illustrate the impact of involved potentials to the characteristic function S, and consequently to the solvation free energy. Although in our model there is no explicit surface definition to be demonstrated, the 61 impacts of these potentials can be reflected by volume, area and the electrostatic solvation free energy. In particular, if the flow is driven inward by a potential, the volume should become smaller, and an outward driving makes the volume larger. These are true at least during the early stage of the solvent-solute boundary evolution. In fact, they are not true for the system near the equilibrium. The present study is carried out through two proteins (PDB ID 1ajj and 1fca) from protein data bank (PDB). Their detailed coordinates and parameters are given in the application section of 22 proteins. Without any potential term, the mean curvature flow equation is driven purely by intrinsic geometric curvature effect, which leads to the minimal molecular surface (MMS) [20]. With the MMS as a reference, each time we use one additional potential term in Eq. (2.13) to produce a new characteristic function S which will be used in the PB solver to calculate the electrostatic potential. In other words, starting with the MMS, we attain the different characteristic functions S with either pressure, attractive non-polar potential, or electrostatic potential separately. Only when electrostatic potential term is taken into account, is it needed to run the self-consistent iteration process for the solution of the coupled system. Table 2.10 gives the calculated volume and electrostatic solvation free energy under each potential term. We also calculate the solvation free energy when all the potentials are turned on. It is seen that all the potentials involved here drive the flow inward so that there are more solvent components between or around two spherical solutes when just an individual potential is turned on. This is consistent with the experimental observations [236] and our previous studies [17]. 62 Table 2.10: Effects of potentials on the solvent-solute boundary 1ajj 1fca Volume Energy Volume Energy MMS 6601.9 -975.6 9345.7 -1082.2 Pressure 6195.1 -1032.1 8866.6 -1123.2 Attractive 5533.8 -1139.3 8107.2 -1192.8 Electrostatic 6585.8 -1061.4 9329.3 -1112.4 Total potential 5381.6 -1165.6 7886.8 -1211.9 2.4 Applications We consider two types of problems in this section. First, we apply our new approach to a set of 17 small molecules. Then, some protein examples are studied. The Dirichlet boundary condition is used for both the generalized Poisson-Boltzmann equation and the generalized geometric flow equation as in our previous calculations [259, 252, 93, 20] 2.4.1 Set of 17 test molecules We apply our optimized surface model (OSM) of solvation to compute the solvation free energies of a set of 17 small compounds. This test set was studied by Nicholls et al [160] using a number of approaches, including quantum mechanics, PB theory etc. An important aspect about this test set is that experimental data are available. Therefore, solvation free energies predicted from our new model can be compared with both experimental values and other numerical results. Moreover, these compounds are considered as a challenging test set for computational methods because the existence of polyfunctional or interacting polar groups, which lead to strong solvent-solute interactions. In our calculation, we set the dielectric constants ǫm = 1 and ǫs = 80. We use γ as fitting parameter, and its initial value is set to γ = 1/15 kcal/(mol˚2 ) to compute other A 63 Table 2.11: Comparison of free energies (kcal/mol) for 17 compounds Compound glycerol triacetate benzyl bromide benzyl chloride m-bis(trifluoromethyl)benzene N,N-dimethyl-p-methoxybenzamide N,N-4-trimethylbenzamide bis-2-chloroethyl ether 1,1-diacetoxyethane 1,1-diethoxyethane 1,4-dioxane diethyl propanedioate dimethoxymethane ethylene glycol diacetate 1,2-diethoxyethane diethyl sulfide phenyl formate imidazole Gnp 2.27 1.40 1.35 2.22 1.96 1.86 1.45 1.65 1.52 1.01 1.82 1.03 1.59 1.55 1.22 1.37 0.82 ∆Gp ∆G Exptl Error -12.44 -10.16 -8.84 -1.32 -4.89 -3.49 -2.38 -1.11 -5.02 -3.68 -1.93 -1.75 -3.22 -1.00 1.07 -2.07 -9.20 -7.24 -11.01 3.77 -7.67 -5.81 -9.76 3.95 -4.22 -2.77 -4.23 1.46 -8.24 -6.59 -4.97 -1.62 -4.40 -2.88 -3.28 0.40 -5.65 -4.64 -5.05 0.41 -7.85 -6.03 -6.00 -0.03 -4.52 -3.50 -2.93 -0.57 -8.43 -6.84 -6.34 0.50 -4.31 -2.76 -3.54 0.78 -2.39 -1.17 -1.43 0.26 -7.84 -6.48 -4.08 -2.40 -11.27 -10.45 -9.81 -0.64 γ-dependent parameters, see Eq. (2.13). We choose ρ0 /γ = 2 by comparing the bulk density 0.033˚−3 and the possible γ value. For micro-molecular systems, pressure p can A be very small and sometimes is neglected in the calculation [48]. But here we still take it into account and set p/γ to 0.2. Note that in the numerical simulation, all ratio parameters here are treated as dimensionless. For L-J parameters, σs is chosen to be 0.65˚ as a A good fitting solvent radius and σi is the solute atomic radii [235]. Note that due to the continuum representation of solvent in our model, the 6-12 Lennard Jones potential formula (A.2) differs from the standard version — the distance used in our formula is no longer the distance between the centers of solute atoms and the centers of solvent atoms but the distance between a specific position in the solvent area and the centers of solute atoms. This should make the setting of well depth ǫi different from the ones taken from AMBER 64 or OPLS force fields. However, the performance of the L-J potential should be similar, i.e, the value of the L-J potential in the solvent caused by a solute atom only depends on the distance from the center of the atom. It implies that the value of L-J potential caused by a solute atom should be a constant on the van der Waals surface of the atom. In other σi +σs 12 σi +σs 6 words, ǫi −2 = Di if r is on the vdW surface of the atom. Here |r−ri | |r−ri | the constant Di should have different values for various types of atoms. For simplicity we use a uniform constant D to determine the value of ǫi given σs and σi . In the present calculation, we pick 1.0 for D and the WCA expression is chosen as the attractive van der Waals potential. We choose grid spacing h = 0.25˚ and time stepping τ = h2 /4.5. Here, γ A (kcal/(mol˚2 )) serves as a fitting parameter, and its final value is 0.0065 kcal/(mol˚2 ). A A Structure and charge information of the 17 compounds are adopted from those of Nicholls et al [160] and can be obtained from the supporting information of their paper. In particular, charges are taken from the OpenEye-AM1-BCC v1 parameters [115]. Atomic coordinates and radii are based on their new parametrization called ZAP-9 in which certain types of radii are adjusted by them from Bondi radii to improve the agreement with experimental free energy. With these structures and charges parameters, the root mean square error (RMS) obtained in their paper is 1.71± 0.05 kcal/mol via the explicit solvent model. And the smallest RMS error of their single - conformer Poisson-Boltzmann approach is 1.87±0.03 kcal/mol [160]. Such a large RMS error indicates the challenge of this test set. Usually, different surface definitions in implicit solvent models should have their own optimal radii set. In particular, a continuous dielectric definition based model is supposed to have radii set with larger values than those of a discontinuous dielectric model. Otherwise, the calculated free energy does not give a good fitting to experimental data [217]. This also occurs in the 65 present model. Therefore, we multiply the radii from ZAP-9 by a common factor 1.1. It turns out this treatment leads to a good agreement with experimental data in terms of electrostatic solvation free energies and total solvation free energies. The results are summarized in Table 2.11, which gives a comparison between calculated and experimental values of solvation free energies of 17 compounds. RMS error of the present model is 1.76 kcal/mol which is similar to that of Nicholls et. al, i.e., 1.87 kcal/mol. This RMS error is competitive to that of the explicit solvent approach (1.71±0.05 kcal/mol) under the same charge and structure parameters set [160]. This may be credited to the more satisfactory nonpolar terms and the enforcement of the potential driven geometric flow. Here, as expected, major errors are from the calculation of benzamides which are between 3.5 and 4.0 kcal/mol, see Figure 2.5. Without these benzamide compounds, the RMS error drops from 1.76 kcal/mol to 1.24 kcal/mol. This problem with benzamides is likely due to radius adjustment for the carbonyl oxygens and tertiary nitrogens in ZAP 9 under the OpenEye-AM1-BCC v1 charges [160]. In other words, these large errors from benzamides can not be avoided if both OpenEye-AM1BCC v1 charge and corresponding optimized ZAP 9 radii are used in PB approaches. Based on these considerations, one possible approach for improvement is to create a new charge set more appropriate for the PB approach with the same ZAP radii. It may be realized by introducing quantum mechanical corrections to our model to take care of charge density. However, this aspect will be investigated in next chapter 3. 2.4.2 Solvation free energy of proteins Validation by using a set of 17 molecules has shown that proposed differential geometry based solvation model works well for the energy prediction of small compounds. Since small 66 2 OSM/ Nicholls’ 0 Exptl vs OSM Exptl vs Nicholls’ −2 −4 −6 −8 −10 −12 −12 −10 −8 −6 −4 −2 Exptl (kcal/mol) 0 2 Figure 2.5: Correlation between experimental data and the present optimized surface model (OSM)(also results from Nicholls’) in electrostatic solvation free energies of 17 compounds. molecules are accessible to more accurate computational means, such as quantum mechanical calculations, one of the main purposes of developing the present optimized surface model (OSM) is to attack relatively large macromolecules. To this end, we consider a test set of proteins employed by Mei et al [152]. For this set, the total number of residues ranges from 21 to 275. The initial structures of all proteins are taken from the protein data bank (PDB). The hydrogen atoms, which are typically missing from the X-ray data, are added to the structures to obtain full all-atom models with optimized hydrogen bondings. Partial charges at atomic sites and atomic van der Waals radii in angstroms are assigned from the CHARMM27 force field [146]. All of these operations, i.e, the transformation from PDB files to PQR files, can be easily done with a software PDB2PQR. Parameters of the present calculation are set in the same way as those for 17 compounds except for Nstep = 2. Similar to the treatment of the 17-compound set, the radii from the CHARMM force field need to be multiplied by a common factor of 1.1. Our results are summarized in Table 2.12. 67 For a comparison, The results of Mei et al are listed in Table 2.12 as well. Their results obtained from the molecular fractionation with conjugate caps and conductor-like polarizable continuum model (MFCC-CPCM). This is an approximate quantum approach that divides the macromolecule into fragments, such that the quantum calculations at HF/6-31 G level and B3LYP/6-31 G level can be applied. The solvation effect is estimated vis the polarizable continuum method with the classic molecular surface [152]. It is seen from the table that there are relatively large deviations, up to 28%, between results obtained by the present OSM and those of the MFCC-CPCM. These derivations might due to the different methodologies, computational environments and structures. In fact, the results from two different quantum basis sets have up to 10% deviation for protein Amyloid. Another deviation between results of two quantum basis sets is about 5% for the protein BPTI. Table 2.12: Comparison of electrostatic solvation free energies (kcal/mol) obtained from the MFCC-CPCM, the present model (OSM) and MIBPB. Protein RP71955 Amyloid Crambin BPTI Calbindin Ubiquitin Lysozyme Subtilisin PDBID No.of residues 1RPB 21 1AMC 28 1CBN 46 1BPI 58 1CDN 75 1UBQ 76 2BLX 129 1SBT 275 MIBPB-III -184.68 -861.65 -303.80 -1301.9 -2188.96 -1170.61 -1913.40 -1896.5 △Gp(kcal/mol) MFCC-CPCM [152] -267.60 -886.01(-798.72) -361.52 -1332.71(-1263.52) -2259.62 -997.02 -1887.71 -2062.2 OSM -192.23 -852.68 -304.84 -1281.19 -2195.42 -1148.81 -1898.07 -2001.4 As the largest deviation between the results from the proposed OSM and that of the MFCC-CPCM is quite large, we consider another independent approach, the MIBPB [259, 252, 93], to evaluate the present method. A specific MIBPB code, the MIBPB-III which has the treatment of geometric and charge singularities [93], is employed in our calculations. 68 0 OSM(kcal/mol) −500 −1000 −1500 −2000 −2500 −2500 −2000 −1500 −1000 −500 MFCC−CPCM(kcal/mol) 0 Figure 2.6: Correlation between MFCC-CPCM [152] and the present optimized surface model (OSM) in electrostatic solvation free energies of 8 proteins. MIBPB-III has been intensively calibrated in the past and is the only known second accurate method for solving the Poisson-Boltzmann equation with both molecular surfaces and partial charges represented by the Dirac delta functions. To deliver such an accuracy, the MIBPB-III has built in the MIB scheme [261, 254] and Dirichlet to Neumann mapping [93]. Similar to the present approach, the structural data of MIBPB-III is also prepared with the PDB2PQR software. As such, we can eliminate the difference due to the different treatment of initial data. However, the MIPPB utilizes the classic PB equation and the molecular surface, while the present method has a generalized PB equation, and an optimized smooth surface. It is seen from Table 2.12 that solvation energy results from the present OSM and from the MIBPB have an excellent agreement on most proteins except for Subtilisin. For this protein, the difference of energies from two methods is about 5%. 69 0 OSM(kcal/mol) −500 −1000 −1500 −2000 −2500 −3000 −3500 −3500 −3000 −2500 −2000 −1500 −1000 −500 MIBPB−III(kcal/mol) 0 Figure 2.7: Correlation between MIBPB-III and the present model (OSM) in electrostatic solvation free energies of 22 proteins 2.4.3 Twenty two proteins Encouraged by the good consistency of the proposed method and the MIBPB-III, we further compare these approaches by a larger set of protein molecules — twenty two proteins that have been frequently used in previous studies [83, 252, 93, 17]. The implementation of these two methods is the same as that described in the last section. Table 2.13 shows the results from the present continuous dielectric model, denoted as “Radii1” in the table, and those by MIBPB-III. It turns out that electrostatic solvation energies obtained via our minimization process are very close to those based on the MIBPBIII. This can also be seen through Figure 2.7 which shows that the results between them are quite linearly correlated. The correlation coefficient is 0.999. It is still interesting to understand how important it is to use a slightly enlarged radius in smooth surface models [217]. To this end, we carry out the present calculations by using 70 Table 2.13: Electrostatic solvation free energies for 22 proteins △Gp (kcal/mol) PDB-ID No. of atoms MIBPB-III Radii1 Radii0 1ajj 519 -1137.2 -1178.5 -1362.6 1bbl 576 -986.8 -965.94 -1158.7 1bor 832 -853.7 -871.4 -1066.5 1fca 729 -1200.1 -1200.6 -1340.9 1frd 1478 -2852.2 -2844.8 -3173.4 1fxd 824 -3299.8 -3291.9 -3496.9 1hpt 858 -811.6 -808.2 -1039.1 1mbg 903 -1346.1 -1328.2 -1535.4 1neq 1187 -1730.1 -1713.9 -2049.3 1ptq 795 -873.1 -866.2 -1064.5 1r69 997 -1089.5 -1072.7 -1294.0 1sh1 702 -753.3 -771.8 -973.8 1svr 1435 -1711.2 -1704.6 -2073.7 1uxc 809 -1138.7 -1125.7 -1350.9 1vii 596 -901.5 -892.0 -1052.1 2erl 573 -948.8 -935.8 -1067.3 2pde 667 -820.9 -843.0 -1049.3 451c 1216 -1024.6 -1020.6 -1291.8 1a2s 1272 -1913.5 -1900.3 -2155.0 1a7m 2809 -2155.5 -2179.8 -2666.1 1a63 2065 -2373.5 -2380.5 -2912.0 1vjw 828 -1237.9 -1226.6 -1411.4 the original CHARMM22 van der Waals radii, denoted as “Radii0”. This result is also listed in Table 2.13. It is seen that results from the original CHARMM22 van der Waals radii can have over 20% deviations from those of “Radii1”. This helps to come to a conclusion that for continuous dielectric models, it is necessary to enlarge atomic radii obtained from widely used force fields. Otherwise, the results will be inconsistent with those of other analysis. This is in agreement with the observation in the literature [217]. The necessity of using larger radii is also shown clearly in Fig 2.8 by the differences of electrostatic solvation free energies obtained from the MIBPB-III and the present calculations with original radii (Radii0) or 71 600 MIBPB−Radii1 MIBPB−Radii0 Kcal/mol 400 200 0 0 5 10 15 Protein 20 25 Figure 2.8: Differences between electrostatic solvation free energies obtained from the MIBPB and the present model with original radii (Radii0) or enlarged radii (Radii1). enlarged radii (Radii1). Additionally, it is useful to demonstrate that the electrostatic potential function computed in the present OSM can be illustrated at arbitrary isosurface of the characteristic function S. This is done by first computing a sharp surface at a given S value, then projecting the φ value on the isosurface of a given S value. Figure 2.9 shows three plots of the electrostatic potentials at S = 0.25, 0.5 and 0.75. A comparison of these potentials indicates (S=0.25) (S=0.50) (S=0.75) Figure 2.9: Surface potential display of one protein (PDBID: 1frd) at different isosurfaces 72 the fast/slow electrostatic potential changing regions in the solvent-solute boundary. These regions are also interactive regions in the protein-protein or protein-ligand interactions. Finally, it remains an important issue to further improve the computational efficiency, although systematical efforts have been made in this work to reduce CPU cost. Since the coupled generalized PB and geometry flow equations are needed to evolve self-consistently to reach the steady state, it takes more CPU time for the present method to calculate the total free energy than some existing approaches that compute the polar and nonpolar energies separately. 2.5 Chapter conclusions This chapter presents a novel differential geometry based solvation model. A crucial concept in the present model is the characteristic function or the description function of solute molecules which is one inside the solute domain and zero inside the solvent. Near the solvent-solute boundary, the characteristic function gradually changes from one to zero over a region of transition. The exact position and width of the transition region are determined by a variational framework, which is formulated based on the total solvation free energy. As a key ingredient of the present framework, the total energy encompasses coupled polar and nonpolar contributions. The polar solvation free energy functional is described by the electrostatic theory at equilibrium, while the nonpolar solvation free energy functional consists of surface energy, mechanical work and attractive solvent-solute interactions. Both the polar and nonpolar solvation free energies are coupled through the characteristic function S. In the present work, geometric measure theory is utilized to convert the Lagrangian formulation of the surface into appropriate Eulerian formulation. By variation of the total 73 solvation free energy functional with respect to the characteristic function and electrostatic potential, a generalized geometric flow equation for the electrostatic potential and a generalized Poisson-Boltzmann equation for the characteristic function are obtained. Unlike the standard Poisson-Boltzmann equation, the generalized Poisson-Boltzmann admits a smooth dielectric profile governed by the generalized geometric flow equation, which provides a physical description of the true solvent-solute dielectric boundary, according to the variational principle. The generalized geometric flow equation balances the intrinsic geometric curvature effect and external potential due to mechanical work, solvent-solute interactions, and the electrostatic potential. The solution of the generalized geometric flow equation and the generalized Poisson-Boltzmann equation leads to quantities for the direct evaluation of the solvation free energy. 74 Chapter 3 Quantum formulation The objective of this chapter is to incorporate a quantum mechanical description of charge density into our earlier differential geometry based solvation model, which is described in Chapter 2. To this end, we hope to develop a more accurate and self-consistent multiscale approach for the solvation analysis of both small and large molecules. The advantages of the present quantum formulation of the differential geometry based multiscale solvation models are follows. First, compared with our earlier solvation models, the present model is able to provide more accurate descriptions of charge arrangement during the solvation process and leads to more accurate prediction of solvation free energies. Additionally, the present multiscale model reduces the dependence of our earlier solvation models on the existing molecular mechanical force field parameters, which are typically parameterized for certain class of (macro-) molecular systems and may not be appropriate for other class of molecules. Therefore, the present model can be applied to a wider class of molecules. Moreover, compared with other existing QM based solvation models [227, 237, 42], the present model avoids the use of unphysical solvent-solute interfaces. The solvent-solute 75 boundary in the present model is described by the differential geometry theory of surfaces. Finally, a systematical framework is established to incorporate polar energy, nonpolar energy and quantum energy into a total energy functional. The optimization of the total energy functional leads to coupled governing equations for a set of important state functions, such as electrostatic potential, electronic density, and solvent-solute boundary profile. This set of state functions gives rise to theoretical predictions of solvation free energy, electrostatic profile and solvent-solute interface of the solvent-solute complex. This chapter is organized as follows. Section 3.1 is devoted to the theoretical formulation of our differential geometry based quantum model of solvation. We provide a detailed description of various solvation free energy functionals. Three governing equations, i.e., the generalized Poisson-Boltzmann equation, the potential driven geometric flow (i.e., generalized Laplace-Beltrami) equation, and the Kohn-Sham equation are derived from the total energy functional via the Euler-Lagrange variation. Numerical methods and algorithms are presented in Section 3.2. This section offers detailed schemes for the solution of the abovementioned three governing equations. The dynamical coupling of these three equations is achieved by an efficient iterative procedure. A formula for the solvation free energy estimation is also derived from the multiscale total energy functional. The present multiscale model is validated by numerical tests using a number of molecules in Section 3.3. To establish a valid approach, we have examined consistency of the electron density with the Poisson equation. The unit conversion between conventions used in our Poisson solver and that in a DFT software is discussed. The results from the present multiscale mode is compared with those in our previous methods and those in the literature. Applications to three sets of molecules are given in Section 3.4. Some of these sets are computationally challenging. We 76 demonstrate that the present model performs well in the prediction of solvation free energies. This chapter ends with a conclusion. 3.1 Theory and model In this section, we first prescribe the polar and nonpolar free energy functionals based on our differential geometry theory of the solvent-solute interface introduced in section 2.1.1 of Chapter 2. We then give an expression for the quantum mechanical energy of electrons. In the present work, the quantum mechanical energy of electrons is also treated as a part of the multiscale total energy for the solvation system. Governing equations for the solvation process are derived by the Euler-Lagrange variational principle. 3.1.1 Charge density based polar free energy functional The solvation process involves both intermolecular and intramolecular interactions. Solvation analysis has been following certain convention, which may not be precisely consistent with that in other fields. Typically, solvation interactions are classified into polar type and non-polar type. The polar type of interactions is often modeled by the Poisson-Boltzmann (PB) equation with appropriate point charges at atomic central positions. In the original electromagnetic theory, the charge source of the electric potential is to be “free charges”. However, in biophysics, due to the atomistic nature of the description, the point charges are obtained by the fitting of the electron density distribution of either a charged molecule or a charge-neutral molecule into its atomic centers. Such point charge information is often stored in the database of popular software packages, such as CHARMM [146]. Therefore, the polar interactions include both charge and polarization effects inside the molecule. Note 77 that the effect of the rearrangement of electron charges during the solvation process needs to be computed twice, once before and once after the solvation. Polar interactions are also called electrostatic interactions. However, not all electrostatic interactions are described by the PB equation. Strictly, the electrostatic potential solved from the PB equation represents Coulombic type of interactions between charges. However, many other intermolecular interactions, such as London dispersion interactions, Debye (induced dipole) interaction, iondipole interactions and dipole-dipole interactions are also electrostatics in origin, and are not represented by the PB equation [121, 1, 5, 6, 194]. Sharp and Honig [199] and Gilson et al. [97] have given a formulation for the electrostatic free energy functional. However, their formulation is based on a given static sharp solventsolute interface. In the present work, we follow our earlier definition of differential geometry based electrostatic free energy functional in Chapter 2 1 S ρm φ − ǫm |∇φ|2 2 Ω   Nc  0 e−φQi /kB T − 1  dr − 1 ǫs |∇φ|2 − k T + (1 − S) ni B  2 i=1 Gp = (3.1) where Qi is the charge of ith ionic species, Nc is the total number of ionic species, kB is the Boltzmann constant, T is the temperature, and n0 is the bulk concentration of the ith i ionic species. Here, ǫs and ǫm are the permittivity, or dielectric constants of the solvent and solute domains, respectively. The ǫ is unity in vacuum, but assumes different values in different environments. In solvation analysis, ǫ is usually set to 1 or 2 in the solute domain and to 80 in the solvent domain. In Eq. (3.1), ρtotal is the total charge density of the 78 molecule and is given by ρtotal = qn(r) − qnn (r) = qn(r) − q I (3.2) ZI δ(r − RI ), where q is the unit charge of an electron, n(r) is the electron density, nn (r) is the nucleus density, and ZI and RI are the atomic number and the position vector of nucleus I, respectively. In Eq. (3.1), the term associated with S is the electrostatic free energy of the solute and that associated with (1 − S) is the electrostatic free energy of the solvent. In our model, the surface function S will be determined by the total energy optimization. Additionally, non-polar solvent-solute interactions are modeled in the same framework as that in Section 1.1.4. 3.1.2 Quantum mechanical energy functional In the present multiscale model, we need to evaluate the total charge density ρtotal (r) by quantum mechanical principles or ab initio approaches. However, the ab initio calculation of the electronic structure of a macromolecule is intractable at present due to the large number of degrees of freedom. A vast variety of theories and algorithms have been developed in the literature to reduce the dimensionality of this many-body problem. One of the simplest ab initio approaches is the Hartree - Fock (HF) method, which replaces instantaneous Coulombic electron-electron repulsion interactions with a mean-field average. A variational procedure is used to minimize the energy. An alternative of the HF method is the density functional theory (DFT), which is originated from the Thomas-Fermi model. DFT represents the electronic 79 structure (principally the ground state) of a many-body system as a functional of a single electron density. As usual in many-body electronic structure calculations, the nuclei of the molecule of interest are treated by the Born-Oppenheimer approximation (i.e., as fixed) in DFT to generate a static external potential in which the electrons are moving. The selfconsistent iterations are utilized to minimize the total energy of the system. Recently, DFT has become one of the most popular and versatile methods available in computational physics, computational chemistry and computational biology. In the present work, we incorporate the DFT description of the electronic structure of the solute molecule into our differential geometry based solvation model. Despite the improvement in computer hardware and software for the quantum mechanical calculation, computational costs are still a major concern for the QM simulation of large molecules of interest. Therefore, so-called order-N algorithms [98, 169], in which the computer time and memory scale linearly with the simulated system size, become increasingly important. Though the plane wave basis set has advantages over local basis sets in terms of avoiding basis-set superposition error as well as convergence concerns, it is difficult to be used in the implementation of the O(N) method in DFT. As such, a localized basis set is normally taken to develop fully self-consistent O(N) DFT algorithms. Along this line, a software package named SIESTA (Spanish Initiative for Electronic Simulations with Thousands of Atoms) was developed [209, 3]. It is based on a flexible linear combination of atomic orbitals (LCAO) basis set and essentially perfect O(N) scaling. Therefore, it allows very fast simulations using minimal basis sets and very accurate calculations with complete multiplezeta and polarized bases [169, 168]. Moreover, the pseudopotential is used in SIESTA to avoid the calculation of core electrons and to achieve the expansion of a smooth (pseudo80 ) charge density on a uniform spatial grid domain, which further accelerates the speed of quantum calculations. 3.1.2.1 Kinetic energy Combining DFT with our differential geometry based solvation formulation, we define the kinetic energy functional as 2 Gkin [n] = S(r) j 2m |∇ψj (r)|2 dr where m(r) is the position-dependent electron mass, (3.3) h = 2π with h being the Planck con- stant, and ψj (r) are the Kohn-Sham orbitals. Here, the total electron density n is given by n(r) = i |ψi |2 , (3.4) where the summation is over all the Kohn-Sham orbitals. Note that orbitals {ψj } are subject to the orthonormality constraint    1 ∗ Sψi (r)ψj (r)dr =   0 i=j (3.5) i = j. Obviously, Eq. (3.5) is an approximation which is valid as long as the boundary represented by the characteristic function S is sufficiently far away from atomic centers of the solute molecule. This is true in our model. 81 3.1.2.2 Potential energy Without external potentials, the electrostatic potential energy of nuclei and electrons can be represented by the Coulombic interactions among the electrons and nuclei. There are three groups of electrostatic interactions: interactions between nuclei, interactions between electrons and nuclei, and interactions between electrons. Because of the Born-Oppenheimer approximation, interactions between nuclei do not directly have a impact on the structure of electrons in DFT. According to the Coulombic law, the repulsive interaction between electrons can be expressed as the Hartree term Uee [n] = q 2 n(r)n(r′ ) ′ dr , ǫ(r)|r − r′ | 1 2 (3.6) where q is again the unit charge of an electron, ǫ(r) is the position dependent electric permittivity, and r and r′ are positions of two interacting electrons. Equation (3.6) gives rise to a nonlinear function in terms of electron density n. Therefore, the problem of solving the electronic structure has to be resolved by self-consistent iterations. Additionally, the attractive interactions between electrons and nuclei are given by Uen [n] = − I q 2 n(r)ZI . ǫ(r)|r − RI | (3.7) Finally, we write the total potential energy functional as Gpotential = Ω S(r) Uee [n] + Une [n] + EXC [n] dr, 82 (3.8) where the last term, EXC is the exchange-correlation potential, which includes all the manyparticle interactions in the solute molecule. In general, the exact form of the exchangecorrelation potential is not known. There are good approximations in the practical applications, such as the local-density approximation, the local spin-density approximation, and generalized gradient approximations. A detailed elaboration of the exchange-correlation potential is beyond the scope of the present work. 3.1.3 Total free energy functional Intuitively, it may appear that the total free energy functional is the summation of the polar, non-polar, kinetic and potential energy. However, such an approach will lead to some double counting because of the coupling among different energy terms. For example, the electrostatic energy depends on the charge density, which, in turn, depends on the kinetic and potential energies of electrons. Additionally, the electrostatic potential serves as an unknown in the polar energy functional, meanwhile it serves as an input in the potential energy of electrons. To see this connection, we need to solve the Poisson equation in vacuum (ǫ = 1) −∇2 φv (r) = ρv total (r), (3.9) where φv is the electrostatic potential in vacuum and ρv total = nv + nn with nv (r) being the electron density in vacuum. The solution of Eq. (3.9) is φv (r) = qnv (r′ ) ′ dr − |r − r′ | 83 I qZI . |r − RI | (3.10) Note that Eq. (3.10) is the exact total Coulombic potential of electron-electron interactions and electron-nucleus interactions. As such, we do not need to include Uee [n] and Uen [n] terms in the total free energy functional. Finally, we propose a multiscale total free energy functional for biomolecules at equilibrium Gtotal [S, φ, n] = {γ|∇S(r)| + pS(r) + (1 − S(r))ρ0 Uss Ω   Nc 1 n0 e−Qi φ/kB T − 1  +(1 − S(r)) − ǫs |∇φ|2 − kB T i 2 i=1    2 1 2 + S(r)  2 + E [n] dr +S(r) ρm φ − ǫm |∇φ| |∇ψj | XC  2 2m j (3.11) where the first row is the non-polar energy functional, the second row is the electrostatic energy functional and the last row is the electronic energy functional. As discussed above, the term ρtotal φ also contributes to the Coulombic potentials of electron-electron and electronnucleus interactions. This total free energy functional provides a starting point for the derivation of governing equations and a basis for the evaluation of solvation free energies. 3.1.4 Governing equations The total free energy functional (3.11) is a function of characteristic function S, electrostatic potential φ and electron density n. The governing equations of these quantities can be obtained by the first variation of the total free energy functional (3.11). From the mathematical point of view, there should exist optimal functions S(r), φ(r) and a set of orbitals {ψj } at the equilibrium state in which the solvation free energy is optimized. The variational procedure 84 for S(r), φ(r) and {ψj } is described below. First, by the variation of Eq. (3.11) with respect to the electrostatic potential φ, we have δGtotal = 0⇒ δφ Nc Sρm + ∇ · ([(1 − S)ǫs + Sǫm ]∇φ) + (1 − S) i=1 n0 Qi e−Qi φ/kB T = 0 i (3.12) The Euler-Lagrange equation is used in the above variation. Equation (3.12) is the generalized Poisson-Boltzmann (GPB) equation [243, 46] Nc −∇ · (ǫ(S)∇φ) = Sρtotal + (1 − S) i=1 n0 Qi e−Qi φ/kB T , i (3.13) where the dielectric function is given by ǫ(S) = (1 − S)ǫs + Sǫm . (3.14) This is a smooth function. It is clear that the GPB equation utilizes a smooth dielectric profile. There is a smooth transition region for the dielectric function to change from ǫs to ǫm . Therefore, the solution procedure of the present GPB equation (3.13) avoids many numerical difficulties of solving elliptic equations with discontinuous coefficients [258, 261, 260, 254, 253] in the classical PB equation. Furthermore, in a solvent without salt, the GPB equation is simplified to the generalized Poisson equation −∇ · (ǫ(S)∇φ) = Sρtotal . 85 (3.15) Both Eqs. (3.13) and (3.15) are similar to our earlier results in Chapter 2. However, in the present multiscale model, the charge source ρtotal is to be determined by solving the Kohn-Sham equations, rather than by the fixed charges ρfix = j qj δ(r − rj ). Additionally, by the variation of Eq. (3.11) with respect to the surface function S, we have δGtotal =0⇒ δS ∇S 1 −∇ · γ + p − ρ0 Uss − ǫm |∇φ|2 + |∇S| 2 Nc +kB T n0 e−Qi φ/kB T − 1 + ρm φ + i i=1 1 ǫs |∇φ|2 2 2 j 2m |∇ψj |2 + EXC [n] = 0 (3.16) In Eq (3.16), ∇ · γ ∇S |∇S| is a generalized Laplace-Beltrami operator, which is a gener- alization of the usual Laplacian operator to a smooth manifold of macromolecular surface [17, 243]. In general, γ can be a function of the position γ = γ(r) to account for the surface hydrophobicity at different locations of the macromolecule. For simplicity, it is treated as a constant in our present computation. By solving Eq. (3.16), we obtain a “physical solventsolute boundary” (S). As discussed in earlier work [20, 17, 243], the solution of this elliptic partial differential equation (PDE) can be attained via a parabolic PDE ∂S ∇S = |∇S| ∇ · γ ∂t |∇S| 86 +V , (3.17) where the generalized “potential” V is defined as V 1 1 = −p + ρ0 Uss + ǫm |∇φ|2 − ǫs |∇φ|2 − kB T 2 2 2 −ρtotal φ − j 2m Nc i=1 n0 e−Qi φ/kB T − 1 i |∇ψj |2 − EXC [n] (3.18) where the electronic potentials in last row do not contribute much to V at equilibrium. This is due to the fact that they are essentially confined inside the solute molecular domain. Note that Eq. (3.17) has the same structure as the potential driven geometric flow equation defined in Chapter 2. As t → ∞, the initial profile of S evolutes into a steady state solution, which solves the original Eq. (3.16) with an optimal surface function S. Finally, to derive the equation for the electronic wavefunctions, we need to incorporate the constraint as shown in Eq. (3.5) into the total energy functional. This can be easily done with a family of Lagrange multipliers i Ei δij − ∗ Sψi (r)ψj (r)dr . Therefore, by the ∗ variation of Eq. (3.11) with respect to the wavefunction ψj (r) and subject to the constraint, we have δ Gtotal + 2 − 2m i Ei δij − ∗ δψj ∇2 + Ueff ∗ Sψi (r)ψj (r)dr ψj = Ej ψj , =0⇒ (3.19) where the Lagrange multiplier constants Ei have been interpreted as energy expectation values. Equation (3.19) is the Kohn-Sham equation in which the effective Kohn-Sham potential is defined as Ueff (r) = qφ + VXC [n], 87 (3.20) where VXC [n] = dEXC [n] with qφ being the potential contribution from Coulombic interdn actions. It is to be calculated by the GPB equation (3.12) with a given total charge density. Apparently, Eq. (3.19) does not directly depend on the surface function S, so that existing DFT packages can be used in our computations with a minor modification as described in Section 3.2.2. It is seen that the generalized Poisson-Boltzmann equation (3.13), the generalized LaplaceBeltrami equation (3.17) and the Kohn-Sham equation (3.19) are strongly coupled to each other. Therefore, these three equations have to be solved by appropriate iterative procedures. This aspect is discussed in the next section. 3.2 3.2.1 Numerical methods and algorithms Solution of the generalized Poisson-Boltzmann equation The solution of generalized Laplace-Beltrami equations has been studied in appendix A, including the details of some discretization schemes. In solvation analysis, the generalized PB equation (3.15) is subject to the Dirichlet boundary condition [93] Na φ(r) = j qj , ǫs |r − rj | ∀r ∈ ∂Ω, (3.21) where qj is the total fixed charge of the jth solute atom. One option is to use the point charges from a force field model such as CHARMM. However, in the present work, we 88 consider the following Dirichlet boundary condition φ(r) = ρtotal (r′ ) ′ dr , ǫs |r − r′ | ∀r ∈ ∂Ω, (3.22) where the boundary condition is nonlinear — it depends on the electron density n and thus needs to be implemented iteratively. The standard second order center difference scheme is applied in this study to solve Eq. (3.15). An accurate solution can be obtained due to the continuous dielectric definition ǫ(S). Let the pixel (i, j, k) represent the position (xi , yj , zk ). The discretized form of Eq. (3.15) is 1 ǫ xi + hx , yj , zk [φ(i + 1, j, k) − φ(i, j, k)] 2 1 1 + ǫ xi − hx , yj , zk [φ(i − 1, j, k) − φ(i, j, k)] 2 h2 x 1 + ǫ xi , yj + hy , zk [φ(i, j + 1, k) − φ(i, j, k)] 2 1 1 + ǫ xi , yj − hy , zk [φ(i, j − 1, k) − φ(i, j, k)] 2 h2 y 1 + ǫ xi , yj , zk + hz [φ(i, j, k + 1) − φ(i, j, k)] 2 1 + ǫ xi , yj , zk − hz [φ(i, j, k − 1) − φ(i, j, k)] 2 1 h2 z = −S(i, j, k)ρm (i, j, k) (3.23) where hx , hy and hz are the grid spacings at x, y and z directions, respectively. Here, ρtotal (i, j, k) is the charge density at grid point (xi , yj , zk ), which is calculated from the electronic charge density n(r) and nucleus density nn . The implementation of ρtotal will 89 be discussed in the next paragraph. As such, the discretized PB equation can be converted into the standard linear algebraic equation system of the form AX = B, where X is the solution of interest, A is the discretization matrix and B is the source term associated with the charge density. It has been shown previously that the PB solver is able to deliver the designed second-order accuracy in Chapter 2. On the right hand side of Eq (3.23), the charge density at each grid point should be given. As an efficient approach, atomic charges have been widely used to approximate the charge density of electrons and nuclei, especially for large molecules of general interest. Therefore, the point charge approach has gained much popularity in PB solvers as well as PB applications [147, 12, 112, 116]. Nevertheless, charge assignment at atomic centers is a nontrivial issue. The deficiencies of the atomic point charge approach have been discussed in the Introduction. The direct implementation of the quantum mechanical charge density can avoid errors caused by the atomic point charge approximation. Moreover, this approach frees us from the electrostatic potential fitting process. To carry it out in the finite different scheme, the total charge density ρtotal (i, j, k), which consists of the electron density n(r) and nucleus density nn , needs to be prescribed at each grid point of the computational domain. In particular, the nucleus density nn (r), which is considered as stationary and located at the center of atoms, can be distributed to the nearest eight neighboring grid points by the second order interpolation (i.e., the trilinear mapping). The distributed nucleus core point charges are converted into the nucleus charge density by dividing point charges with the volume of the unit grid. Finally, the total charge density at each grid point is obtained by the summation of nucleus density and the electronic charge density which is directly taken from the quantum mechanical calculation. 90 However, a new issue arises from the above treatment of nuclei. Since each nucleus core charge is split into its eight neighboring grid points, it is easy to find out that short range interactions are biased and self-interaction energies are artificially introduced. This is due to the interactions of grid charges within one single atom. It exists even in the absence of solvent and any other charges. Apparently, this is a pure artifact due to the finite difference approach and must be eliminated. Within the partial charge approach the artifact can be canceled out mainly by calculating the PB equation twice, one in vacuum and the other in the solvent. It turns out that this strategy also works well here. Numerical tests regarding this cancellation of self-interaction energies are demonstrated later. It is important to point out that numerically if one implements the quantum mechanical calculation with a non-frozen core method, the remaining error from the self-interaction cancellation is still too large to be neglected. In other words, the above cancellation strategy may fail when one applies a non-frozen core approach. Therefore, frozen core approaches, such as pseudopotential methods, must be applied in our quantum calculations here. Because frozen core approaches dramatically reduce the number of charges in each nucleus and thus implicitly decease implementation errors. The biconjugate gradient method is a good choice in solving the PB equation. However, as we have demonstrated in Chapter 2, the combination of stabilized biconjugate gradient method (BiCG) and the blocked Jacobi preconditioner (BJAC) from PETSc (http://www.mc s.anl.gov/petsc/petsc-as/), as well as the combination of the orthomin method (OM) and the incomplete LU factorization preconditioner (ILU) from SLATEC, speeds up the process of the PB solution. In this study, we apply the combination of ILU and OM from SLATEC. In our iteration procedure, the prior electrostatic potential is taken as a good initial guess 91 for the followed linear system solving procedure. It turns out that this treatment makes the generalized PB solver converge much faster than simply setting the initial guess to be 0 [46]. Additionally, the convergence tolerance is set to be 10−4 as a good compromise between the accuracy and efficiency. 3.2.2 Solution of the generalized Kohn-Sham equation The generalized Kohn-Sham equation (3.19) admits all-electron and all-nucleus potentials. The direct solution of Eq. (3.19) is very expensive for macromolecules. Therefore, further simplifications are necessary. In particular, because classical DFT methods have been developed in the past few decades, the solution of Eq. (3.19) needs to make use of existing DFT methods. Note that the Coulombic potential functionals shown in Eqs. (3.6) and (3.7) involve spatially varying dielectric constants, which reflect the solvation process. The related spatially varying electrostatic potential is built in the generalized Poisson-Boltzmann equation (3.13), whose solution gives rise to the electrostatic potential energy qφ used in the generalized Kohn-Sham equation (3.19). In contrast, the standard Kohn-Sham equation is for a molecular system in vacuum and its Coulombic potentials are of the form of qφv where φv is given by Eq. (3.10) with the total charge density in vacuum described in the next section. The effective potential in the generalized Kohn-Sham equation (3.19) can be written as 0 Ueff [n] = qφ + VXC [n] = qφRF + Ueff (r), (3.24) φRF = φ − φ0 (3.25) where 92 is called the reaction field potential. Here φ0 is the solution of the Poisson equation in the homogeneous medium with the charge source ρtotal (r) −∇ · ǫ0 ∇φ0 (r) = ρtotal (r), (3.26) where ρtotal (r) is obtained from the generalized Kohn-Sham equation (3.19). In Eq. (3.24), 0 Ueff (r) is the Kohn-Sham potential 0 Ueff (r) = qφ0 + VXC [n]. (3.27) 0 In the present work, Ueff (r) is represented by the traditional Kohn-Sham potential. Consequently, a vast variety of computational techniques developed for the traditional Kohn-Sham DFT can be utilized in the present work. What we need to do in solving the generalized Kohn-Sham equation (3.19) is to add a reaction field potential qφRF to an existing KohnSham DFT solver. The most important issues in the solution of the Kohn-Sham equation are the selection of the exchange-correlation potential and the use of the pseudopotential. The pseudopotential approach eliminates the complicated effects of core electrons and allows the expansion of a smooth (pseudo-) charge density on a uniform spatial grid. In this approach, the chemically active valence electrons are dealt with explicitly, while the core electrons are ‘frozen’ and considered together with the nuclei as fixed non-polarizable ion cores. With the pseudopotential approximation, the formalism of the total energy functional needs to be modified, 93 which leads to the following expression of a Kohn-Sham effective potential [209] 0 Ueff (r) = I local (r) + VI I nonlocal + V H (r) + V (r) VI XC (3.28) where V H (r) and VXC (r) are total Hartree and exchange-correction potentials, respectively. local and V nonlocal are the local part and the nonlocal part of the pseudopotential Here, VI I of atom I. For elaborated discussions of the above potentials, we refer the reader to an excellent review [209]. In the present work, SIESTA (Spanish initiative for the electronic structure of thousands of atoms), a quantum mechanical package of high efficiency, is utilized for solving our generalized Kohn-Sham equation (3.19). SIESTA possesses the ability to perform the density functional theory (DFT) simulations of more than a thousand atoms. The details of the package has been extensively described [209]. It develops a self-consistent density functional method using the standard norm-conserving pseudopotential and a flexible numerical linear combination of atomic orbital (LCAO) basis sets with essential perfect O(N) scaling, in which the computer CPU time and memory scale linearly with the simulated system size. The exchange and correlation are treated within the Kohn-Sham DFT. Both the local density approximation and local spin density approximation (LDA/LSDA), as well as the generalized gradient approximation (GGA) are allowed. Moreover, SIESTA permits very fast simulations by using minimal basis sets as very accurate calculations with complete multiple-zeta and polarizable bases. Therefore, it can provide a general scheme to perform quantum calculations with requirements ranging from being very fast to being very accurate. For all of the simulations in the present work, the default double-ζ plus single polarization (DZP) bases are used. The MeshCutoff is set as 125 Rydberg and the LDA is applied. The 94 SolutionMethod is set to be ‘diagon’. 3.2.3 Evaluation of the solvation free energy The solvation free energy is the energy required or released from the transfer of a unit of solute molecules from vacuum to a solvent. By definition, it is calculated by the difference of the total energies in solution and in vacuum △Gtotal = Gtotal [S, φ, n] − Gvacuum [φv , nv ] (3.29) where φv is the electrostatic potential in vacuum and nv is the solute electronic density in vacuum, which is defined in terms of the electronic wavefunctions of the solute in vacuum v ψj (r) v |ψj (r)|2 . nv (r) = j (3.30) In Eq. (3.29), Gtotal [S, φ, n] is given in Eq. (3.11) and Gvacuum [φv , nv ] denotes the total energy functional in vacuum 1 2 ρv total φv − 2 ǫ|∇φv | Gvacuum [φv , nv ] = 2 + j (3.31)  v |∇ψj |2 + EXC [nv ] dr 2m where ρv total = qnv − qnn is the total charge density in vacuum. For simplicity, we have omitted the ionic density kB T Nc n0 (e−Qi φv /kB T − 1) in Eq. (3.29). Note that the i=1 i variation of Gvacuum [φv , nv ] gives rise to the standard Poisson equation (3.9) and the 95 standard Kohn-Sham equation 2 − 2m v ∇2 + Ueff v v v ψj = Ej ψj , (3.32) v v where Ej and ψj are appropriate eigenvalue and eigenfunction of Hamiltonian H v = 2 v − 2m ∇2 + Ueff . However, there is a technical difficulty in the direct evaluation of Gtotal [S, φ, n]. Namely, the integration of the quantum mechanical terms in Eq. (3.11) requires the S function profile, which is not involved in most Kohn-Sham DFT software packages. Therefore, in the present work, we evaluation the solvation free energy by the following approximation △Gtotal = Gnp + ∆Gp + ∆GQM (3.33) where Gnp , ∆Gp and ∆GQM are the non-polar, polar and quantum mechanical contributions, respectively. The non-polar solvation free energy does not exist in vacuum, and its form in solution is given by Gnp [S] = [γ|∇S(r)| + pS(r) + ρ0 (1 − S(r))Uss ] dr. (3.34) By using the Gauss’ divergent theorem and integration by parts, it is easy to show that the polar solvation energy is given by Gp [S, φ, n] = 1 ρ φdr. 2 Ωs total Similarly, the polar solvation energy in vacuum is 1 2 96 (3.35) ρv total φv dr. Therefore, one may com- pute the change of the polar solvation energy by 1 2 ρv total φv dr . How- Ωs ρtotal φdr − ever, this expression leads to a situation that the quantum mechanical contribution ∆GQM cannot be evaluated in SIESTA because of the lack of required potential terms. Additionally, such an expression is inconsistent with the conventional electrostatic solvation free energy of the form ∆Gp = 1 1 ρtotal [φ − φ0 ] dr = ρ φ dr. 2 Ωs 2 Ωs total RF (3.36) Therefore, in the present solvation analysis ∆Gp is calculated by Eq. (3.36), which leads 1 to two remaining electrostatic potential terms 2 Ωs ρtotal φ0 dr − ρv total φv dr . These terms are combined with the rest of the quantum energy functionals to compute the change of the quantum mechanical energy as ∆GQM = j v v < ψj |H 0 |ψj > − < ψj |H v |ψj > . (3.37) 2 0 where H 0 = − 2m ∇2 + Ueff . Note that wavefunctions {ψj } are computed with the full Hamiltonian in the solution. The main advantage of the quantum mechanical energy change given in Eq. (3.37) is that it can be easily computed by using existing DFT software packages as discussed in the last section. The current formula of the solvation free energy is systematically derived from the framework of the differential geometry based solvation model. It consists of the non-polar energy Gnp , the electrostatic solvation free energy ∆Gp , and the change of the solute self-energy ∆GQM due to the redistribution of electrons in the solvation process. It is of interest to see that the formulation of the present solvation analysis is consistent with that in the litera97 ture [224, 237, 238], which is basically computed by using a good chemical intuition. The reliability and accuracy of the current model are further validated by a comparison of the present prediction with experimental data, as well as with that in the literature in Section 3.3. 3.2.4 Dynamical coupling of involved PDE equations As described in Section 3.1.4, the total charge density in the solution is obtained by solving the Kohn-Sham equation in the presence of the reaction field potential φRF = φ − φ0 , which is computed by solving the PB equation and the Poisson equation, i.e., Eqs. (3.13) and (3.26). On the other hand, the solution of the PB equation requires the quantum mechanically calculated charge density, the surface profile S, and the dielectric profile ǫ(S) which are generated by solving the generalized Laplace-Beltrami equation (LBE). Moreover, the potential in the generalized Laplace-Beltrami equation contains the terms associated with the electrostatic potential from the PB equation and the charge density from the Kohn-Sham equations. In principle, the Laplace-Beltrami equation, the generalized PB and Kohn-Sham equations need to be solved simultaneously until the convergence is reached, i.e, the solvation energy of two runs agrees with each other within a prescribed tolerance. This can be achieved via a self-consistent iteration procedure. In practice, we adopt an inner-outer iterative strategy to implement the self-consistent procedure. The inner iterations concern the solution of the coupled generalized PB equation and the Laplace-Beltrami equation. These iterations have been carried out in Chapter 2, except for the different representation of the continuous charge density. In the present work, the inner iterations are combined with the solution of the Kohn-Sham equation during the 98 Initiate ρtotal in SIESTA Coupled PB and LBE for φ Solve for φRF Update ρtotal in SIESTA |∆Gnew − ∆Gpre | δ > 0. In our earlier work [18, 20], we have chosen S0 = 1000 and a small δ. Recently, in our differential geometry based multiscale models [243], we have designated S as characteristic function of the solute and chosen 0 ≤ S ≤ 1. There is no need to specify L in such a formulation. In the present work, we choose L = Smax /2, where Smax is the maximum of S. This choice is computationally stable and delivers correct MMSs, when the potential term is absent. It is to point out that S here is quite different from the S used in our Eulerian formulation [46]. In the present work, S only serves as a hypersurface function for evolving and extracting the solvent-solute interface and can take any real value. Numerically, isosurface extraction can be done by existing software such as MATLAB. However, for further electrostatic analysis, we need a Cartesian representation of the interface locations and the associated norm values. Therefore, we construct a stand-alone algorithm to extract interface information. To this 138 end, the marching cubes method is adopted [140]. For a given grid partition, the marching cubes algorithm simply deals with a local meshing problem by processing each cell or cube independently. Each vertex of a cube can be either greater or less than the threshold value L, giving 256 different scenarios. In considering the symmetry and complementarity, there are only 15 canonical configurations in each cell needed to be addressed for the local meshing [140, 67] A look-up table is a quite efficient local triangulation or Cartesian algorithm. The marching cubes method can be modified in many ways to improve the accuracy, efficiency, robustness, and topological correctness. Auxiliary binary tree structures are typically employed in the range-space approaches, such as kd-tree method and interval tree method, to speed up the marching cubes method. For a structured grid dataset, geometric searching methods exploiting spatial coherence can be simpler and more efficient than the range space approaches. To implement this scheme, first all points inside or outside the surface must be identified according to S value in the Cartesian grid domain. The surface must intersect those cube edges where one vertex is outside and the other is inside the surface. Therefore the surface intersection points and their normal directions can be approximated by linear interpolation. For instant, to compute an intersection point ro between an inside grid point r1 with value S1 and an outside grid point r2 with value S2 , the distance d between ro and r1 is calculated by L = S1 ∗ (1 − d) + S2 ∗ d d = (4.40) S1 − L S1 − S2 where L is the isosurface function value. Obviously, with known positions r1 and r2 , distance 139 d determines the position of the intersection point ro . To calculate the norm vector at ro , we need to compute the normal vectors at r1 and r2 . In general, the normal direction of a grid point (xi , yj , zk ) can be estimated by Si+1,j,k − Si−1,j,k 2∆x Si,j+1,k − Si,j−1,k ny (xi , yj , zk ) = 2∆y Si,j,k+1 − Si,j,k−1 nz (xi , yj , zk ) = 2∆z nx (xi , yj , zk ) = (4.41) n = (nx , ny , nz ) where nx , ny and nz are the x, y and z components of the norm, respectively. Thus, the norm at the intersecting point ro , denoted as no , can be interpolated through n1 and n2 , the norms of r1 and r2 , respectively no = (1 − d)n1 + dn2 . (4.42) Clearly, the choice of L = Smax /2 offers the best computational accuracy and stability. The unit norm No at the intersecting point can be easily computed as No = no . This no algorithm is used in our calculation of unit norms at the intersecting points. Obviously, higher-order algorithms can be easily constructed when they are needed. 4.2.1.4 Numerical surface integral and volume integral in the Eulerian representation Very often, we need to accurately carry out surface integration and volume integration in the Eulerian representation. The surface integral of a density function f can be approximated 140 by [207] Ξ f (x, y, z)dσ = Ω f (x, y, z)δ(d(x, y, z))dr ≈ ˜ f (xi , yj , zk )δi,j,k h3 (4.43) i,j,k where (xi , yj , zk ) is the coordinate of grid point (i, j, k), d(x, y, z) is the distance of a point (x, y, z) defined in Ω from the interface Γ, h is the uniform grid size, and f (x, y, z) is the ˜ surface density function defined on Γ. The delta function δi,j,k is given by ˜ ˜ ˜ ˜ ˜(+x) + δ (−x) + δ (+y) + δ (−y) + δ (+z) + δ (−z) ˜ ˜ δi,j,k = δ i,j,k i,j,k i,j,k i,j,k i,j,k i,j,k (4.44) ˜(±α) , (α = x, y, z) are discrete delta functions [207]. To carry out integration exactly where δ i,j,k on the interface, we use the following discrete surface integration formula [94] Ξ f (x, y, z)dσ ≈ f (xo , yj , zk ) (i,j,k)∈I |ny | |nx | |nz | + f (xi , yo , zk ) + f (xi , yj , zo ) h h h h3 (4.45) where (xo , yj , zk ) is the intersecting point of the interface and the x meshline that passes through (i, j, k), and nx is the x component of the normal vector at (xo , yj , zk ). Similar relations exist between (xi , yo , zk ) and ny , and (xi , yj , zo ) and nz . Since Eq. (4.45) has already taken into account the contribution from irregular grid points outside the interface, the summation is restricted to I, the set of irregular grid points inside or on the interface [94]. The derivation of Eq. (4.45) is lengthy and is omitted here but it can be seen elsewhere [94]. The surface area can be calculated by setting f = 1 in Eq. (4.45). The error of the surface integration depends on the grid resolution and was observed to be approximately second-order convergence [94]. 141 The volume integral of the density function f can be simply approximated by [94] Ωm f (x, y, z)dr ≈ f (xi , yj , zk )h3 (4.46) (i,j,k)∈J where the summation is over J, the set of points inside the interface. 4.2.2 Dynamic coupling of the Poisson-Boltzmann and geometric flow equations The solution of potential driven geometric flow equation, or the generalized Laplace-Beltrami equation, is discussed in Appendix A, including many discretization schemes. In general, electrostatic energy is much larger than the non-electrostatic part so that the accuracy of electrostatic potential calculation based on the Poisson-Boltzmann (PB) equation plays a critical role in controlling the accuracy of the total solvation free energy. Therefore, numerical methods that are able to deliver highly accurate solution of the PB equation is desirable. In this work we apply matched interface and boundary method (MIB) to impose interface conditions for high accuracy. This challenge is addressed in Section 1.5.3. As described earlier, optimized electrostatic potential φ is obtained by solving the PoissonBoltzmann equation (4.31) in which solvent-solute interface Γ is used to determine ǫ and λ values. The interface Γ is generated by the solution of the potential driven geometric flow equation (4.38) which in turn depends on the electrostatic potential. Therefore, the geometric flow equation and the PB equation need to be solved simultaneously in the present differential geometry based solvation model. In practice, this coupled nonlinear system can be solved by an iterative procedure: First solving the PB equation with a fixed interface Γ for 142 φ; Then obtaining the interface Γ from solving the potential driven geometry flow equation with a fixed potential φ. A more detailed algorithm is 1. Start with an initial solvent-solute interface, such as a solvent accessible surface. 2. Solve the Poisson-Boltzmann equation (4.31) for the potential with the initial solventsolute interface. 3. Obtain new solvent-solute interface by solving the potential driven geometric flow equation (4.38) with the updated potential. 4. Calculate the solvation free energy with the resulting φ and Γ. 5. Go back to Step 2 until it converges. The initial solvent-solute interface can be set by the solvent accessible surface with a probe radius of 1.4 ˚. Another way to define the initial geometry is to use the interface obtained A from solving the potential driven geometric flow equation (4.38) without the potential term V . Both approaches lead to the same result. In this study, we take the latter. The iteration will be stopped if the values of total solvation free energy converge to within a designated small criteria value which is 0.01kcal/mol for small molecules and 0.1kcal/mol for proteins in this chapter. Formulism of solvation free energy evaluation is similar to that in Chapter 2. 4.3 Validation This section is devoted to the validation of the proposed differential geometry based solvation model and a number of computational algorithms used in the present work. The 143 overall accuracy of solvation free energy calculation depends on the reliability and accuracy of the solution of the geometric flow equation and the PB solver, surface and volume integrations, and the interface extraction process. The explicit Euler algorithm guarantees the reliability and convergence of the solution of the geometric flow equation. The finite central different scheme is of second order convergence in space and first-order in time although it is computationally expensive [17]. The MIBPB-III has been verified to be of second-order in convergence even with molecular surface singularities of proteins [93]. Therefore, it will be second order accurate for the present application. In fact, the biomolecular surfaces generated with geometric flows are free of geometric singularities, which is computationally easier. We first examine the impact of the interaction potentials to the surface morphology, and surface electrostatic potentials. A few small molecular systems and 23 protein molecules are used in this examination. We then check the behavior of the surface area under different potential interactions. In particular, we verify whether the proposed minimal molecular surface (MMS) [20] provides the extreme surface area for various molecules and proteins. Finally, we investigate the convergence of the proposed iterative procedure for solving the coupled Poisson-Boltzmann and geometric flow equations. 4.3.1 Validation of the interface extraction scheme The numerical algorithm for surface integrations has second-order convergence [207, 94]. However, the MIBPB-III here has been modified for our purpose to admit the present optimized molecular surface (OMS) as the solvent-solute interface definition. This implies that the reliability of the present MIBPB-III solver depends on the interface extracted by the 144 marching cube algorithm. Moreover, the performance of our surface integration and volume integration algorithms is also determined by the resulting interface instead of some pre-determined interface such as the molecular surface [185]. Therefore, it is worthwhile to validate the interface extraction procedure and algorithm in terms of surface area (˚2 ), A surface enclosed volume (˚3 ) and electrostatic solvation free energy (kcal/mol). In general, A there is no analytical result for electrostatic energy except for the one-atom system due to Kirkwood [119]. For the one-atom system without interaction potential, the resulting solvent-solute interface from the geometry flow evolution is a sphere with the same radius as the initial one, for which the PBE admits an analytical solution. The surface area and volume can be calculated analytically. Therefore, we consider a dielectric sphere of radii 2˚ A p with a unit charge at the center. We set γ = 0.5, S0 = 1000 and L = 500. Table 4.1 lists the numerical results under different grid resolutions h, which are compared with the exact solution. The convergence in space is observed and a satisfactory result is attained. Table 4.1: Comparison of surface areas (˚2 ), volumes (˚3 ) and energies (kcal/mol) for two A A small systems. h 0.5 0.25 0.125 Referenced Value One Atom Area Volume Energy 48.86 34.00 -84.92 49.04 33.56 -82.92 50.09 33.52 -82.08 50.265 33.510 -81.98 Diatom Area Volume 95.24 71.00 99.28 72.73 100.5 73.20 100.34 71.18 Energy -238.03 -233.66 -232.37 −233.67 Another test is done with a diatom system. It has been illustrated previously that molecular surface of a diatom system can be reproduced by our PDE based approach at an appropriate constant potential value [17]. In particular, when the van der Waals (VDW) radii p of two atoms are 2˚, the generated surface with γ = −0.222 will be almost identical visually A 145 to the molecular surface with probe radius rm = 1.4˚. In this setting, the corresponding A solvation free energy, surface area and volume calculated by our numerical procedure are compared with those based on molecular surface. To calculate the electrostatic solvation energy, a unit charge is set at the centers of two atoms (-2.3,0,0) and (2.3,0,0). The numerical results are summarized in Table 4.1. For a comparison, the reference molecular surface area and volume of this diatom system are obtained by using the MSMS program [188] with probe radius 1.4 ˚ A and density 100. The electrostatic solvation energy (-233.67 kcal/mol) based on molecular surface is calculated by the original MIBPB-III [93]. A good agreement is also observed from this test. 4.3.2 Effect of interaction potentials In this section, we illustrate the impact of potentials to the generation of solvent-solute interface, consequently to the solvation analysis. Since the effect of the pressure term has already been shown in our previous study, we focus our attention on the study of potential effects which include short-ranged repulsive interaction, long-ranged attractive dispersion interaction and electrostatic potential effect. Here we consider 12-6 Lennard-Jones pair potential to model ViLJ . For the purpose of demonstration, all the surface profiles here are constructed by using 6-12 decomposition and based on the geometric flow Eq. (4.36) in absence of the pressure term and the ionic effect ∂S = ∂t   1 + ∇S 2 ∇ ·  ∇S 1 + ∇S   + ρs V LJ − γ 2 146  1 ǫs 1 ǫm |∇φ|2 − |∇φ|2  . 2γ 2 γ (4.47) Figure 4.1: Illustration of surface morphology of a diatom system with radii 1.87˚ and A coordinates (x, y, z) = (−2.2, 0, 0), and (2.2, 0, 0) under different solute-solvent interactions. Top Left: The MMS (no potential); Top Right: The surface obtained under a repulsive rep,LJ potential (Vi ); Bottom Left: The surface obtained under a full L-J potential (V LJ ); Bottom Right: The surface obtained under a full L-J potential and an electrostatic potential. It can be seen that the repulsive potential produces a “fat” surface, while an attractive potential or an electrostatic potential leads to a “slim” surface. Without any potential term, this geometric flow equation leads to the minimal molecular surface (MMS) [20]. The effects of those three potentials are demonstrated by a diatom system, a four-atom system and finally a protein molecule which is also used to illustrate the potential impacts on surface electrostatic potential analysis. In the present computation, we have treated the solvent density ρs as homogeneous. 4.3.2.1 Surfaces of a diatom system We first consider a diatom system with van der Waals radius 1.87˚ A (x, y, z) = (−2.2, 0, 0) and (2.2, 0, 0). Mesh size h = 0.04˚ A and coordinates is used. The L-J parameters are set as follows: σi is taken from atomic radius and σs is chosen to be 0.65˚. Well depth A ρs ǫi = 0.035 kcal/mol and bulk density coefficient γ = 2. To account for electrostatic ¯ 147 potential effect, a unit charge is set on the center of each atom and we choose ǫs = 80 ∗ ǫs γ m and ǫγ = 80 ∗ ǫm . We use ǫm = 1 and ǫs = 80 for dielectric constants in solute and solvent, respectively. Figure 4.1 illustrates the different potential effects on the surface morphology for the diatom system. We systematically change the potential effects to generate different surfaces. We begin with no potential effect, which leads to the minimal molecular surface (Top Left of Figure 4.1 ), then add the repulsive part of the L-J potential (Top Right of Figure 4.1), then add an attractive interaction (Bottom Left of Figure 4.1) and finally add the electrostatic potential effect (Bottom Right of Figure 4.1). It can be seen that the repulsive potential produces a “fat” surface, while an attractive potential or electrostatic potential leads to a “slim” surface. In other words, with a purely repulsive interaction turning on, there is less bulk area between or around two spherical solutes, while more bulk area with attractive potential or electrostatic potential turning on. This result is consistent with experimental observations [236] 4.3.2.2 Surfaces of a four-atom system The effects of potentials on the surface generation are further demonstrated by a fouratom system in Figure 4.2 with van der Waals radius 1.87 ˚ A and coordinates (x, y, z) = (−3.40, 0, 0), (3.40, 0, 0), (0, −2.94, 0) and (0, 2.94, 0). The needed parameters in Eq. (4.47) are set as the same as the above diatom system except for setting 1/2 charge at the center of each atom. We also systematically change the potential effects by beginning with no potential which leads to the MMS in Figure 4.2(Top Left), then add the repulsive part of L-J potential (Top Right of Figure 4.2), then add the attractive part in Figure 4.2(Bottom Left) and finally add the electrostatic potential effect in Figure 4.2(Bottom Right). The impact of potentials is similar to that in the diatom system. It is found that the size of hole 148 ˚ Figure 4.2: Illustration of surface morphology of a four-atom system with radii 1.87A and coordinates (x, y, z) = (−3.40, 0, 0), (3.40, 0, 0), (0, −2.94, 0) and (0, 2.94, 0) under different solute-solvent interactions. Top Right: The surface obtained under a repulsive potential rep,LJ (Vi ); Bottom Left: The surface obtained under the full L-J potential (V LJ ); Bottom Right: The surface obtained under the full L-J potential and an electrostatic potential. Again, it can be seen that the repulsive potential producess a “fat” surface; while an attractive potential or an electrostatic potential leads to a “slim” surface. 149 Figure 4.3: The projection of electrostatic surface potentials of protein 451c onto different surfaces obtained under various solvent-solute interactions. Top Left: Attractive surface; Top Right: The MMS; Bottom Left: Repulsive surface; Bottom Right: Polar surface. It is noted that the repulsive surface is a “fat” surface comparing to the MMS; while an attractive surface or a polar surface is a “slim” surface. in the four atoms changes dramatically when varying non-bonding potentials. This would imply that the size of pocket or cavity inside a protein can be dramatically changed under different electrostatic potentials and/or solute-solute interaction potentials. Therefore, it may result in a significant difference in physical properties of biological systems, which can dramatically influence the selectivity and gating mechanism of ion channels. 150 Table 4.2: Electrostatic solvation free energies (kcal/mol), surface areas (˚2 ) and volumes A ˚3 ) of protein 451c with different solvent-solute interactions. (A Surface MMS Repulsive Attractive Polar 4.3.2.3 Energy -724.3 -635.2 -897.9 -838.1 Area 3695.0 3805.6 3904.9 3702.9 Volume 12881.9 13458.3 11635.6 12595.7 Surfaces and electrostatic potentials of a protein Having illustrated the effects of various potentials to surface generations for simple artificial systems, we now consider their impacts to the surface morphology and the solvation analysis of proteins. For this purpose, we choose a protein called heme-binding protein, Fe(II) cytochrome C551 from the organism Pseudomonas aeruginosa (PDB ID: 451c). For the structure, extra water molecules that are attached to proteins are excluded and hydrogen atoms are added to obtain a full all-atom model. Partial charges at atomic sites and atomic van der Waals radii in angstroms are taken from the CHARMM22 force field [146]. To show the potential effects, each time we keep one and only one potential term in Eq. (4.47) to produce a new surface which is used in our PB solver to calculate the electrostatic potential. Starting with the MMS, the surface is called a repulsive surface when only a repulsive potential term is added, an attractive surface when only an attractive interaction is added and a polar surface if only the electrostatic potential effect is allowed. The needed parameters for potential expressions are set in the same way as in the 17 compounds which is described in Section 4.4.1.1. Surface electrostatic potentials are plotted on the corresponding surfaces in Figure 4.3. Meanwhile, electrostatic solvation free energies, surface areas and volumes are calculated and listed in Table 4.2. Potential effects similar to the surface generations of 151 the diatom and four-atom systems are observed. Moreover, it is interesting to note that the minimal molecular surface (MMS) has the smallest surface area among them so that it indeed minimizes the surface free energy. This is consistent with the mathematical proof that the mean curvature flow equation leads to the minimal surface area. Yet, the MMS does not possess the minimal volume. Instead, the attractive solvent-solute interaction leads to the minimal volume. The repulsive solvent-solute interaction gives rise to the largest volume. These results might appear to be counterintuitive. However, one has to keep in mind that proteins are partially charged molecules. The electrostatic free energy plays a dominant role in the solvent-solute interactions. There is an obvious correlation between the solute volume and the electrostatic free energy: The larger the solute volume is, the smaller the electrostatic free energy is in absolute value. Therefore, the repulsive potential interaction leads to the smallest electrostatic solvation free energy in absolute value, which is an indication of the hydrophobic nature of the repulsive potential interaction. As expected, the attractive solvent-solute interaction leads to the largest electrostatic solvation free energy, reflecting the hydrophilic nature of the attractive solvent-solute interaction. It is believed that the results in Table 4.2 are very helpful to the understanding of the sophistication of solvation. 4.3.3 Isosurface function value and minimal surfaces The minimal molecular surface (MMS) proposed in our earlier work [20] was based entirely on the differential geometry theory of surfaces. Although the minimal surface theory is mathematically rigorous, the resulting surface might not be exactly the one with the minimal surface area, when the evolution of the Laplace-Beltrami operator is carried out in the Eulerian representation. This problem is due to the surface extraction process. There are 152 Table 4.3: Surface areas (˚2 ) for different surface definitions A Area PDB-ID No.of atoms OMS MMS 1ajj 519 2201.4 2046.7 1bbl 576 2657.6 2434.1 1bor 832 2946.9 2683.9 1bpi 898 3274.9 3017.4 1cbn 648 2401.4 2212.7 1fca 729 2728.7 2474.1 1frd 1478 4467.2 3994.2 1fxd 824 3037.3 2762.5 1hpt 858 3368.3 3013.8 1mbg 903 3163.2 2831.3 1neq 1187 4829.0 4295.5 1ptq 795 2959.4 2685.8 1r69 997 3124.8 2806.3 1sh1 702 2808.4 2515.4 1svr 1435 4796.4 4247.9 1uxc 809 2916.1 2630.6 1vii 596 2571.2 2269.3 2erl 573 2380.4 2162.9 2pde 667 2782.0 2527.9 451c 1216 4184.7 3688.5 1a2s 1272 4507.5 3968.7 1a63 2065 7184.8 6369.7 1a7m 2809 7939.4 6918.9 infinitely many ways to select an isosurface value. Our tests indicate that the MMS can be S0 obtained if we choose L = 2 . Therefore, we set the isosurface function value to 500 in the present solvation free energy calculations rather than a value very close to 1000 which was used in our earlier paper [20]. The results in Table 4.2 were obtained in this manner. In this subsection, we further illustrate that the MMS indeed gives rise to the smallest surface area for a set of 23 proteins. Moreover, we also study the impact of the pressure to the surface area for a couple of given proteins. 153 We consider a set of 23 proteins in the present study. The detail of preparation and treatment of protein data is described in Section 4.4.1.2. Two types of surfaces are generated in the present work. The first type is the MMS constructed by the mean curvature flow [20]. The second type of surfaces is called optimized molecular surface (OMS) generated by using the present differential geometry based solvation model. Results are listed in Table 4.3. As expected, the surface areas from the MMS model are always smaller that those from the OMS model. Essentially, the optimization of total free energy differs much from the minimization of the surface free energy. The possession of the minimal surface area in the MMS can be further demonstrated as follows: we consider situations where only the constant pressure (p) is added into the mean curvature flow equation to cause a perturbation of the MMS. For our purpose, two arbitrarily chosen protein systems (PDB-IDs: 1ajj and 451c) from the set of 23 proteins are explored. In our simulation, we set p=-0.4,-0.3,-0.2,-0.1,0,0.2,0.3,0.4 and 0.5. The minimal molecular surface is obtained when p = 0. Figure 4.4 illustrates the difference of surface areas (˚2 ) A between various resulting surfaces generated under different p values and the MMS for these two protein systems. It is clearly seen that a small perturbation around MMS leads to a larger surface area comparing to that of the MMS. In other words, the MMS indeed has the minimal surface area. 4.3.4 Convergence of surface area, volume and energy In this section, we also illustrate numerically the convergence and decreasing pattern of total solvation free energy during the time integration, which has been shown theoretically in the process of the model derivation. To this end, a small compound named diethyl propanedioate 154 Difference of surface area 250 200 451c 1ajj 150 100 50 0 −50 −0.4 −0.2 0 0.2 Pressure 0.4 0.6 ˚ Figure 4.4: Difference of surface areas ( A2 ) between MMS and various resulting surfaces generated under different constant pressure effects. 6 4 2 0 Energy J(Volume) Area −2 −4 −6 0 10 1 10 N 2 10 3 10 Figure 4.5: Decreasing of surface area (×102 ˚2 ), J(volume)=(volume-2) (×102 ˚3 ) and A A total solvation free energy (kcal/mol) in diethyl propanedioate as the number of iterations increases. 155 has been chosen from a set of 17 compounds described in Section 4.4.1.1. The time history of the total solvation free energies along with the evolution of solvent-solute boundary is recorded. To illustrate the convergence pattern of the solvation free energy, we compute the total solvation free energies in the intermediate states during the time evolution. The results are shown in Figure 4.5. Here T denotes the time span and N = T represents the number τ of computational steps in the generalized geometric flow solver. In order to put surface area, enclosed volume and total solvation energy together in one evolution picture, we illustrate J(volume), which is a linear function of volume and shares the same pattern with volume, rather than volume. It is found that the total solvation free energy decreases with respect to the time evolution, which is consistent with our theoretical formulation. It is observed that the solution of our model converges to a steady state in terms of volume (˚3 ), area (˚2 ) A A and total solvation free energy (kcal/mol). 4.4 Application In this section, we consider the application of the proposed differential geometry model to the calculations of solvation free energies and salt effects on the protein-protein binding affinity. Previously, we have developed an optimized smooth surface (OSS) model in Chapter 2[46] via the Eulerian formulation of the differential geometry based solvation model. It has been demonstrated that OSS model successfully reproduces not only the solvation free energy of small molecules but also the electrostatic solvation free energies of proteins. Although the present optimized molecular surface (OMS) model is derived by using the same framework of free energy functional optimization, the solvent-solute interfaces are entirely different in two models. It is important to verify whether their results are consistent with each other. 156 For a comparison, we choose the same set of 17 compounds used in Chapter 2. Thus the results from the OSS model are taken directly from the earlier work. In addition, we also choose a subset of 23 proteins from 30 proteins studied in Chapter 2. The protein-protein binding affinity is investigated by using two protein systems. 4.4.1 Free energy calculations 4.4.1.1 Solvation energies of 17 compounds This test set of 17 small compounds was studied by Nicholls et al. [160] using a number of approaches, including quantum mechanical methods, the PB theory etc. It is considered as a challenging test set for computational methods because of the existence of polyfunctional or interacting polar groups, which leads to strong solvent-solute interactions. The nonpolar solvent-solute interaction potential in the present model provides a potentially efficient means to deal with strong solvent-solute interactions. In our calculations, we set the initial amplitude S0 = 1000 and isosurface function value S0 L = 2 . Other parameters are set in the same way as that in our previous work. Again, here γ (kcal/(mol˚2 )) serves as a fitting parameter and will have different values for differA ent expressions of the nonpolar potential. Details are listed in Table 4.4. Typically, only attractive solvent-solute interactions contribute to the dispersion effects in the third term of Eq. (1.2). Here, we have three choices for the dispersion effect: V att,WCA , V att,6/12 and V LJ . It turns out that the use of full L-J potential expression can offer the smallest root mean square (RMS) error for the set of 17 compounds. Therefore, it will be chosen from now on for the further study in this chapter except specified. Structure and charge information of compounds are adopted from those in Nicholls’s 157 Table 4.4: RMS error with different nonpolar potentials. Potential γ RMS (kcal/mol) V att,WCA 0.0077 1.77 V att,6/12 0.0094 1.83 V LJ 0.0074 1.75 paper [160] too, which can be obtained from the supplementary information of the paper. The results of the present full L-J potential model are summarized in Table 4.5. It gives a comparison between the predicted and experimental values of solvation free energies of 17 compounds. The RMS error of the present model is 1.75 kcal/mol. This RMS error is competitive with the explicit solvent approach (1.71±0.05 kcal/mol) under the same charge and structure parameters set [160]. Moreover, it is interesting to note that the RMS is almost the same as the one obtained from our earlier optimized smooth surface (OSS) model using similar γ value (i.e., 0.0065 kcal/(mol˚2 ) vs 0.0074 kcal/(mol˚2 )). This consistency A A can also be seen through Figure 4.6 which shows that the results from the OSS and the present OMS are linearly correlated. The correlation coefficient is 0.999. It may reveal at least two facts. First, in the framework of free energy optimization, the calculated results using the Lagrangian representation and the Eulerian representation should be similar to each other. Additionally, a satisfactory nonpolar term and the enforcement of the potential driven geometric flow really play a critical role in the analysis of solvation free energies. 4.4.1.2 A set of 23 proteins The set of 17 compounds has already shown the present approach’s ability to predict the total solvation free energy of small compounds. Tests on proteins are needed to demonstrate the capacity on the large system of interest. Encouraged by the success in the application to 158 0 OSS(kcal/mol) −2 −4 −6 −8 −10 −10 −8 −6 −4 OMS(kcal/mol) −2 0 Figure 4.6: Correlation of solvation free energy between previous optimized smooth surface (OSS) model and the present optimized molecular surface (OMS) model in the set of 17 compounds listed in Table 4.5. 159 Table 4.5: Predicted and experimental total solvation free energies for 17 small compounds. Compound Gnp glycerol triacetate benzyl bromide benzyl chloride m-bis(trifluoromethyl)benzene N,N-dimethyl-p-methoxybenzamide N,N-4-trimethylbenzamide bis-2-chloroethyl ether 1,1-diacetoxyethane 1,1-diethoxyethane 1,4-dioxane diethyl propanedioate dimethoxymethane ethylene glycol diacetate 1,2-diethoxyethane diethyl sulfide phenyl formate imidazole 2.33 1.39 1.36 2.22 1.99 1.91 1.44 1.67 1.55 1.01 1.87 1.02 1.62 1.57 1.22 1.37 0.80 ∆Gp ∆G (kcal/mol) -12.36 -10.03 -4.87 -3.47 -5.06 -3.70 -3.30 -1.07 -9.22 -7.22 -7.84 -5.93 -4.16 -2.71 -8.21 -6.53 -4.63 -3.08 -5.64 -4.62 -7.75 -5.88 -4.64 -3.62 -8.40 -6.78 -4.40 -2.83 -2.40 -1.17 -7.82 -6.45 -11.56 -10.76 Exptl Error -8.84 -2.38 -1.93 1.07 -11.01 -9.76 -4.23 -4.97 -3.28 -5.05 -6.00 -2.93 -6.34 -3.54 -1.43 -4.08 -9.81 -1.19 -1.09 -1.77 -2.14 3.79 3.83 1.52 -1.56 0.20 0.43 0.12 -0.69 0.44 0.71 0.26 -2.37 -0.95 small compounds, we further consider a set of realistic proteins and compare the results with those from previous optimized smooth surface (OSS) model and MIBPB-III [93] with predetermined molecular surfaces (MSs), which is defined as the inner surface smoothly traced by a probe sphere as it rolls over the atomic sphere [185, 57]. Twenty three proteins, a test set used in previous studies [17, 46], are chosen for the present calculations. All structures and partial charges are obtained in the same way as the 451c system which is described before. Table 4.6 shows the results of the present model, and those of the OSS and the MIBPB-III. Results from the OSS and the MIBPB-III have been proved to be very close to each other and they are competitive to those from quantum mechanic approaches [46]. Like in the set of 17 compounds, results from the OSS and the OMS also show quite consistency. The correlation coefficient between them are 0.999. This can also be observed in the Figure 160 0 OSS/MIBPB/MMS(kcal/mol) OMS vs OSS −500 −1000 OMS vs MIBPB OMS vs MMS −1500 −2000 −2500 −3000 −3500 −3500 −3000 −2500 −2000 −1500 −1000 −500 OMS (kcal/mol) 0 Figure 4.7: Correlation of electrostatic solvation free energy between the present optimized molecular surface (OMS) model, and previous models, such as the optimized smooth surface (OSS), the MIBPB-III and the MMS for 23 proteins listed in Table 4.6. 4.7. Therefore, this observation further convinces us that in the framework of differential geometry based free energy optimization, the OSS and the OMS can be alternative to each other in the aspect of solvation analysis. Since both of them share similar energy functional expressions and take into account the key feature of total energy optimization at equilibrium. If we remove the external potential effects in the surface evolution which are derived from the energy optimization, the present OMS model returns to our previous minimal molecular surface (MMS) model and the calculated results of solvation energies deviate dramatically. Table Figure 4.8 demonstrates the difference of electrostatic solvation free energy between the OSS and MMS models, as well as the difference between the OSS and OMS models. This once again indicates the importance of polar-nonpolar coupling and solute-solvent interaction in implicit solvent modeling and solvation analysis. 161 Table 4.6: Comparison of electrostatic solvation free energies of 23 proteins. △Gp (kcal/mol) PDB-ID No. of atoms MIBPB-III OSS OMS 1ajj 519 -1137.2 -1178.5 -1122.3 1bbl 576 -986.8 -965.9 -972.0 1bor 832 -853.7 -853.7 -836.3 1bpi 898 -1301.9 -1281.2 -1295.1 1cbn 648 -303.8 -304.8 -291.0 1fca 729 -1200.1 -1200.6 -1184.1 1frd 1478 -2852.2 -2844.8 -2846.7 1fxd 824 -3299.8 -3291.9 -3306.1 1hpt 858 -811.6 -808.2 -815.6 1mbg 903 -1346.1 -1328.2 -1346.9 1neq 1187 -1730.1 -1713.9 -1742.9 1ptq 795 -873.1 -866.2 -872.9 1r69 997 -1089.5 -1072.7 -1082.7 1sh1 702 -753.3 -771.8 -753.9 1svr 1435 -1711.2 -1704.6 -1716.7 1uxc 809 -1138.7 -1125.7 -1147.9 1vii 596 -901.5 -892.0 -907.0 2erl 573 -948.8 -935.8 -944.4 2pde 667 -820.9 -843.0 -812.3 451c 1216 -1024.6 -1020.6 -1016.8 1a2s 1272 -1913.5 -1900.3 -1902.8 1a63 2065 -2373.5 -2380.5 -2382.6 1a7m 2809 -2155.5 -2179.8 -2152.6 4.4.2 MMS -921.0 -792.3 -665.9 -1060.0 -181.0 -1040.0 -2499.5 -3087.1 -570.0 -1148.7 -1401.6 -660.2 -824.4 -532.1 -1321.3 -919.3 -724.2 -812.2 -591.3 -718.2 -1633.0 -1851.0 -1699.9 Salt effect on protein-protein binding energies Finally, we consider the application of our differential geometry based solvation model to the calculations of salt effect on the protein-protein binding. This is the first time that our new model is applied to the study of the salt effect. The ion concentration plays an important role in the stability and even reactivity of biomolecules. This application can be further extended to the binding affinity analysis of ligands, peptide, proteins, nucleic acids, and membrane proteins. To this end, the potential terms caused by mobile ions need to be 162 600 500 OSS−OMS OMS−MMS Kcal/mol 400 300 200 100 0 −100 0 5 10 15 Protein 20 25 Figure 4.8: Difference of electrostatic solvation free energies between the OMS model and previous OSS and MMS models for 23 proteins listed in Table 4.6. restored in our calculation. The full Poisson-Boltzmann (4.31) is coupled to the geometric flow Eq. (4.38) to obtain the solvation free energy for proteins in the salted solvent. The solution procedure for the nonlinear PB equation was described in our work [40]. The coupling of the nonlinear PB equation and potential driven geometric flow equation is discussed in Section 4.2.2. For low salt concentration and weak electrostatic potential, the linearized Poisson-Boltzmann equation discussed in Section 4.1.3 can be applied. For the binding free energy, only the electrostatic component and particularly, its salt dependence are studied. The total binding free energy which includes many other terms that do not depend on the salt concentration, does not need to be calculated. Then the electrostatic component of the binding energy (∆Gp ) is found as the difference of the electrostatic 163 Figure 4.9: Protein-protein complexes. Left: Protein complex 1emv; Right: Protein complex 1beb. free energies of the complex and those of the free molecules ∆Gp (I) = GAB (I) − GA (I) − GB (I), p p p (4.48) where GAB (I), GA (I) and GB (I) are the electrostatic free energies of the complex AB, p p p and the monomers A and B, respectively, at a given ionic strength, I. The salt dependence of the binding free energy ∆∆Gp (I) is thus the difference in the electrostatic components of the binding energies, Eq. (4.48), attained at some salt concentration I and at zero salt concentration ∆∆Gp (I) = ∆Gp (I) − ∆Gp (I = 0) = {GAB (I) − GAB (I = 0)} p p −{GA (I) − GA (I = 0)} p p −{GB (I) − GB (I = 0)}, p p 164 (4.49) 10 Binding Free Energy(Kcal/mol) Binding Free Energy(Kcal/mol) 15 Calculated CF line Experimental EF line 5 0 −3 −2 Ln(I) −1 0 1 Calculated CF line Experimental EF line −5 −10 −15 −5 −4 −3 −2 Ln(I) −1 0 Figure 4.10: The salt dependence of the binding affinities of two protein complexes. Left: Protein lemv; Right: Protein 1beb. Here OMS data are computed by our optimized molecular surface (OMS) model. NLPB data are taken from Bertonati et al’s paper [26]. where each energy term at different ionic strengths can be calculated via Eq. (2.23). In general, the nonlinear PB equation should be used for the evaluation of salt effects on the protein-protein binding. However, as shown by Bertonati et al. [26], both the linearized PB (LPB) and the nonlinear PB (NLPB) can be applied to calculate salt effects when the ionic strength of the salt is weak and the net charges of the binding complex and individual molecules are relatively small. The results obtained with the LPB were very close to those obtained with the NLPB. This encourages us to use LPB in this section to reduce the computational complexity. To test the utility of our new model in the calculation of salt effects on the protein-protein binding, a hetro-dimeric and a homo-dimeric complex are selected for our study. These cases were considered by Bertonati et al [26]. In the experiment, NaCl is used for the salt with a pH value of 3. As shown in Figure 4.9, each protein complex encompasses two separated pieces. The structures and charges of them are attained in the same way as earlier 23 proteins, so are the needed parameters in the potential driven geometric flow equation. The salt 165 Table 4.7: Comparison of binding affinities of two proteins complexes from current simulations and those from Bertonati et al’s paper. PDB code Complex charge Surface Charge of the ˚2 ) free monomers Area (A 1emv -3 1465 1beb +26 1167 Bertonati [26] LPB NLPB OMS B=+5; A=-8 1.29 1.31 2.40 A=B=+13 -2.48 -1.53 -2.02 dependence of the binding free energy from NLPB simulation by Bertonati et al. as well as our OMS model is shown in Figure 4.10, where the binding free energy ∆∆Gp (I) is plotted as a function of the logarithm of the salt strength I. Additionally, binding affinities are summarized in Table 4.7, in which the first four columns describe the properties of proteins and the last two columns are the binding affinities extracted from the slope of the lines in Figure 4.10. Note that the calculation is performed by assuming that all Arg, Asp, Glu and Lys residues are ionized in both free and bound states. It is seen that our model clearly reproduces the experimental observation, i.e., for the hetero-dimeric complex, the binding free energies increase with the increasing ionic strength; while for the homo-diemric complex, the affinity is negative. Moreover, as shown in the table the quantities of the binding affinity obtained from simulations with the present OMS model are in good agreement with those obtained by LPB and NLPB methods in Bertonati et al’s paper in which NaCl is used for salt with a pH value of 3. Note that in the case of Lactoglobulin dimer, the results obtained with all acidic groups neutral are shown. Note that proteins considered in the present work have fixed conformations before and after their interaction. Ions are treated as structureless. Therefore, the salt effect studied in the present work does not include the effect of ions on biomolecule conformation and complex 166 formation due to different ion species [80, 225]. It takes more sophisticated models to study the impact of the different ion species to molecular reaction rates and conformational change. Such an aspect is beyond the scope of the present work. 4.5 Chapter conclusions The objective of the present work is to explore an alternative formulation, the Lagrangian formulation of differential geometry based multiscale solvation models. The Lagrangian representation of biomolecular surfaces is suitable for the visualization, surface electrostatic potential map and visual perception of biomolecules. It is can be directly employed in the implicit solvent models and existing software packages. Finally, the Lagrangian representation has an advantage that it avoids artificially enlarging van der Waals radii as often required by smooth surface models [235]. In the present approach, the discrete and continuum domains are separated by a sharp solvent-solute interface, which naturally constitutes a smooth and differentiable manifold enclosing the biomolecule of interest. The time evolution of the manifold is governed by the potential driven geometric flow, a mathematical framework introduced in our previous work [242, 243, 17]. The specific potential driven geometric flow equation used in the present work is derived via the first variation of the total free energy functional of solvation in the Lagrangian representation. Such a derivation differs much from our earlier derivation using the Eulerian representation and geometric measure theory [243]. Although there are some similarities in expressions of coupled PB equation and geometric flow equation between our previous optimized surface (OSS) model in Chapter 2 and the present optimized molecular surface (OMS) model, there are important differences to be 167 spelled out. First of all, the solute-solvent interface definitions in two models are fundamentally different. In the OSS model, the solute and solvent region is described by a continuous characteristic function denoted by 0 ≤ S ≤ 1. In contrast, in the present OMS model, solute and solvent regions are strictly separated by a 2D differentiable manifold. The function S in Eq. (4.38) only serves as a hypersurface function for the geometric surface evolution. This difference has a dramatic computational implication. The generalized Poisson-Boltzmann equation with an OSS is much easier to solve than the OMS is. However, a formal comparison of this computational aspect is beyond the scope of the present work. Moreover, in the potential driven geometric flow equation (4.38), ∇φ+ = ∇φ− because of the discontinuity of ∇φ inside and outside the solute-solvent interface. However, ∇φ+ = ∇φ− in the overlap region of the OSS model due to continuous dielectric definition. Further, dielectric constant ǫ(x) in the PB equation is defined in a totally different way: ǫ(S) is a function of S in the OSS model, and there exists a smooth transition region from the low dielectric to the high dielectric. In contrast, ǫ(x) is piecewise constant in the present model. In other words, here ǫ = ǫs in solvent and ǫ = ǫm in solute, respectively. Further, a generalized Poisson-Boltzmann equation was derived in the OSS model. Whereas, we formally end up with the classical Poisson-Boltzmann equation in the present theory, although it is coupled to the potential driven geometric flow equation. Yet, the present OMS brings up a number of mathematical issues, including the singularity formation on the manifold, and Eulerian embedding of Lagrangian dynamics. Finally, there are many computational problems associated with the Lagrangian formulation of our differential geometry based solvation model too. For instant, the current discontinuous definition of ǫ leads to dramatic accuracy reduction in the standard numerical schemes for the elliptic equations with discontinuous coefficients 168 and singular sources [254, 253, 258, 261, 260]. To overcome this difficulty, we have incorporated the highly accurate MIB scheme into our PB solver [259, 252, 93]. In addition, many other computational issues, such as hybrid Lagrangian and Eulerian dynamics, level set methods, isosurface extraction, surface integration, and Dirichlet to Neumann mapping [93], are relevant in the present Lagrangian representation. 169 Chapter 5 Thesis achievements and future work 5.1 Contributions As described in Chapter 1, solvation is an elementary process in nature that has a great impact on many sophisticated physical, chemical, and biological processes. Therefore, the importance of the understanding of solvation cannot be overemphasized. The major contribution of this thesis is in the construction of a series of novel differential geometry based multiscale solvation models for chemical and biomolecular systems. We have extended our earlier variational formulation of the surface free energy to the analysis of total solvation free energy via the differential geometric theory of surfaces. As a key ingredient of the present framework, the total energy functional encompasses coupled polar and non-polar contributions with a self-consistent interface definition. The true physical boundary of a biomolecule in a solvent, as a physical concept, is determined by the optimization of the total free energy for the solvation equilibrium. As such, a natural description of the solvent-solute interface is provided by the differential geometric theory of surfaces and implemented by the generalized 170 Laplace -Beltrami equation, rather than simply ad hoc divisions of the solute and solvent regions. Moreover, rigorous mathematical derivations have been demonstrated to obtain the coupled PDE system in the spirit of the variational principle. Additionally, efficient and robust computational algorithms have been designed for the 3D simulation. Finally, solvation analysis of both small compounds and proteins are carried out to further display the accuracy, stability, efficiency and robustness of the proposed new models and the associated numerical approaches. Comparison is made with both experimental and theoretical results in the literature. Biologically, this thesis provides a self-consistent treatment of the dielectric boundary in all energy terms in the implicit solvent model based on the Poisson-Boltzmann theory.the non-polar free energy terms. It simultaneously optimizes the total solvation free energy. This is considered as a novel and important advance in the continuum treatment of molecular free energies. Moreover, we propose a new theory that allows one to predict the solvation free energy of a biomolecule in a computationally inexpensive way. The relative simplicity of the theory could make this a useful daily tool for researcher to describe general trends in solvation behavior of different biomolecular systems. Furthermore, our method potentially allows one to investigate the ionic effects on the thermodynamics of biomolecules. In our Eulerian formulation, the surface, separating the low dielectric interior from the high dielectric solvent, is for the first time treated through a 3-D function S. S function takes a value of 1 or 0 in the solute and solvent regions, respectively, and smoothly varies between these values in the interface region. Then the van der Waals interaction, mechanical work and the surface free energy terms are introduced as functions of S. The total energy can then be optimized by simultaneous solution of the generalized Poisson-Boltzmann (GPBE) equa171 tion and the generalized geometric flow equation (GGFE) through the first order variation with respect to the potential and S function. As such, realistic solvent-solute boundaries are constructed. By solving the coupled GPBE and the GGFE, we obtain the electrostatic potential, the solvent-solute boundary profile, and the smooth dielectric function, and thereby improve the accuracy and stability of implicit solvation calculations. We also design efficient second order numerical schemes for the solution of the GPBE and GGFE. The matrix, which results from the discretization of the GPBE, is accelerated with appropriate preconditioners. An alternative direct implicit (ADI) scheme is designed to improve the stability of solving the GGFE. Two iterative approaches are constructed to solve the coupled system of nonlinear partial differential equations. Extensive numerical experiments are designed to validate the present theoretical model, to test computational methods, and to optimize numerical algorithms. Generally, partial charges from the existing force fields are parameterized for certain class of molecules and may not be accurate for other molecules. Additionally, the fixed charge pattern does not describe the charge rearrangement during the solvation process. This drawback limits the accuracy and utility of our earlier solvation models. The quantum formulation work in this thesis addresses such a limitation by the incorporation of quantum electron density in our earlier models. To this end, we initially construct a new multiscale total free energy functional, which includes the electron kinetic energies and potential energies. By means of the reaction field potential, we can relate the full Kohn-Sham Hamiltonian to the standard Hamiltonian, so that existing computational software packages can be utilized. We have developed a protocol to make use of the SIESTA (Spanish initiative for the electronic structure of thousands of atoms), an efficient linear scaling Density Functional Theory pack172 age, to obtain the solution of the electron density. Appropriate iteration procedures are developed to dynamically couple three governing equations and ensure the convergence of the solution. In the Lagrangian formulation of our differential geometry based solvation model, the solvent-solute interface is modeled as a 2D manifold embedded into a 3D space. It is suitable for visualization, surface electrostatic potential mapping and visual perception of biomolecules. The Lagrangian formulation is can be directly employed in the implicit solvent model based existing software packages. Moreover, the Lagrangian representation has an advantage in that it avoids artificially enlarging van der Waals radii as is often required by smooth surface models. In this thesis, we also analyze the connections, similarities and differences between the Eulerian and Lagrangian formulations of the solvation models. Such analysis is important to the understanding of our differential geometry based solvation models. Finally, besides the solvation free energies, our Lagrangian formulation model is utilized to evaluate the protein-protein binding affinities. Most of the materials of this thesis are adopted from the following publications: • Zhan Chen and Guo-wei Wei, “Differential geometry based solvation model III: Quantum formulation.”, submitted to Journal of chemical physics • Zhan Chen, Nathan A. Baker and Guo-Wei Wei, “Differential geometry based solvation model II: Lagrangian formulation”, Journal of Mathematical Biology, in press, 2011. • Zhan Chen, Nathan A. Baker and Guo-Wei Wei, “Differential geometry based solvation model I: Eulerian formulation”, Journal of Computational Physics, 229, vol. 22, pp.8231-8258, 2010 173 The following publications are also closely related to the present thesis: • Duan Chen, Zhan Chen and G.W. Wei, ”Quantum dynamics in continuum for proton channel transport II: Variational interface”. in press, 2011 • Guo-Wei Wei and Zhan Chen , “Multiscale models for nano-bio systems”, Proceedings of CMBE: ”2nd international conference on computational and mathematical biomedical engineering, 19-22,2011 • Duan Chen, Zhan Chen, Changjun Chen, Weihua Geng and Guo-Wei Wei, “MIBPB: A software package for electrostatic analysis”, Journal of Computational Chemistry, 32 vol.4,pp.756-770, 2011 , 2010 • P. W. Bates, Zhan Chen, Y.H. Sun, G.W. Wei and Shan Zhao, ”Geometric and potential driving formation and evolution of biomolecular surfaces”, Journal of Mathematical Biology, 59, vol.2, pp.193-231, 2009 5.2 Future work In our current research, implicit solvent models assume local and linear solvent responses. Further, we concentrate on the solvation calculations in the differential geometry based models. As far as our future work is concerned, models, numerical simulations and mathematical proofs can be improved and extended as follows: 1. A simple homogeneous solvent density is employed in the present work. In future work we will consider the implementation of the solvent variation. In particular, such a variation can be computed by the integral equation approaches of solutions. A combination of the integral equation theories with the present differential geometry based models will lead to 174 better predictions. This is because the non-uniform density fluctuations of solvent around the solute-solvent interface can be taken into account. 2. Another important extension of the present work is the implementation of the implicit solvation model based molecular dynamics (MD). Currently, the Poisson-Boltzmann (PB) based molecular dynamics algorithms have not been commonly used in the practical simulation of macromolecules. Major hurdles to this development include limits in accuracy, stability, speed and reliability. The multiscale models proposed in this thesis have given rise to a new promise for the development of the PB based molecular dynamics. The accuracy and stability problems in the previous PB based MD methods will not appear in our new model. This is because they are free of interface singularity. Moreover, in all differential geometry base models, the force expressions differ much from those in the classic Poisson-Boltzmann based MD algorithms. 3. The resulting nonlinear PDE systems in our models pose challenges to mathematicians. Numerically, we have shown the existence and local uniqueness of the solutions, which lead to the optimization of the solvation free energy. Moreover, the convergence of the iterative algorithms has been displayed numerically. However, rigorous mathematical proofs have not been studied yet. In future work, upon the promise of numerical performance, we can study these properties using rigorous mathematical tools. 4. The understanding of solvation is an elementary prerequisite for the quantitative description and analysis of a variety of biological, physical and chemical processes. At an equilibrium state, many models are expected to return to the solvation model. Therefore, our differential geometry based solvation models can provide a classic framework for testing the validity of other proposed implicit solvent based biological models, such as ion channel trans175 port models, proton transport models, protein-protein binding simulations, etc. Moreover, it can be incorporated into other related models for accurate simulations in the future. 176 APPENDICES 177 Appendix A Solution of the generalized Laplace-Beltrami equation and the ADI scheme The solution of the generalized Laplace-Beltrami equations (also called the geometric flow equation) Eq.(2.13) or Eq.(3.17) is studied here. First of all, to solve the geometry flow equation, the expression of solvent-solute interaction potential Uss must be prescribed explicitly. Although Uss includes many unspecified interactions, we consider the following form Uss (r) = Ui (r), i 178 (A.1) where Ui (r) is the potential due to the ith atom in the solute molecule. One possible choice of Ui (r) is the following Lennard-Jones (L-J) 6-12 pair potential LJ Ui (r) = εi σi + σs 12 σi + σs 6 −2 |r − ri | |r − ri | (A.2) where εi is the well-depth parameter, and σi and σs are solute atomic and solvent radii, respectively. Here r is the point of interest and ri is a position vector of an atom in the solute molecule. The L-J potential can be divided into attractive term U att and repulsive term U rep in different ways. It can be a “6-12” decomposition as follows: σi + σs 6 |r − ri | σi + σs 12 . |r − ri | att,6/12 Ui (r) = −2εi rep,6/12 Ui (r) = εi (A.3) Alternatively, it can also be a “WCA” decomposition based on the original WCA model [130]    −ε (r) 0 < |r − ri | < σi + σs i att,WCA Ui (r) =  LJ  U (r) |r − ri | ≥ σi + σs , i   LJ  U (r) + ε (r) 0 < |r − r | < σ + σs i i i rep,WCA i Ui (r) =   0 |r − ri | ≥ σi + σs . (A.4) (A.5) As indicated in Chapter 2, the WCA attractive potential provides good results for solvation. Therefore, all the calculations in the present study are carried out by using the WCA decomposition. Note that due to the fast decay of the potential, only those solute atoms which 179 locate near the solvent-solute boundary are needed to include in the evaluation of Uss . Additionally, a necessary step in solving our generalized Laplace-Beltrami equations is to determine all the physical parameters involved. Because of the choice of the polar and nonpolar separation and the continuum representation of solvent in our model, not all parameters from the literature are suitable. In particular, surface tension γ serves as a fitting parameter in our model due to the ambiguity of its specific value in atomic-scale models [130, 160, 235]. Therefore, we rewrite the generalized potential driven geometric flow equation as ∂S ∂t = |∇S| ∇ · γ = |∇S|γ ∇ · ∇S |∇S| ∇S |∇S| +V (A.6) + Vγ where Vγ = V . Therefore, in addition to the Lennard Jones parameters εi , σs and σi , other ¯ γ parameters including p/γ, ρs /γ, ǫs /γ, and ǫm /γ need to be pre-determined in the solution of the generalized potential driven geometry flow equation. The discretization scheme used here for the solution of the generalized geometry flow equation (A.6) is similar to what we designed previously [20, 17]. It can be rewritten in the form 2 2 2 2 2 2 (Sx + Sy )Szz + (Sx + Sz )Syy + (Sy + Sz )Sxx ∂S = 2 2 2 ∂t′ Sx + Sy + Sz 2Sx Sy Sxy + 2Sx Sz Sxz + 2Sz Sy Syz 2 2 2 − + Sx + Sy + Sz V γ , 2 + S2 + S2 Sx y z (A.7) where t′ = tγ. To obtain the discretized form, we introduce the following notations. We consider a discrete time tn := nτ where n is a non-negative integer and τ is the time stepping 180 n size. We denote Sijk to be the discretized form of S(xi , yj , zk , tn ). An explicit scheme of the generalized geometry flow equation is given by n 2 2 2 n n S n+1 − Sijk := [vx δx + vy δy + vz δz ]Sijk + τ fijk , ijk (A.8) where n fijk = Sx Sy Sxy + Sx Sz Sxz + Sz Sy Syz −2 + 2 2 2 Sx + Sy + Sz vx = τ n 2 2 (Sy + Sz ) , 2 2 2 Sx + Sy + Sz ijk vy = τ n 2 2 (Sx + Sz ) , 2 2 2 Sx + Sy + Sz ijk vz = τ n 2 2 2 Sx + Sy + Sz V n 2 2 (Sx + Sy ) , 2 2 2 Sx + Sy + Sz ijk where 2 n n n n δx Sijk = (S(i−1)jk − 2Sijk + S(i+1)jk )/h2 2 n n δy Sijk = (S n − 2Sijk + S n )/h2 i(j−1)k i(j+1)k 2 n n δz Sijk = (S n − 2Sijk + S n )/h2 ij(k−1) ij(k+1) {Sx }n = (S n − Sn )/2h ijk (i+1)jk (i−1)jk n n {Sy }n = (Si(j+1)k − Si(j−1)k )/2h ijk {Sz }n = (S n − Sn )/2h ijk ij(k+1) ij(k−1) 181 ijk n n n n {Sxy }n = (S(i+1)(j+1)k + S(i−1)(j−1)k − S(i+1)(j−1)k − S(i−1)(j+1)k )/4h2 ijk n {Sxz }Sijk = (S n + Sn − Sn − Sn )/4h2 (i+1)j(k+1) (i−1)j(k−1) (i+1)j(k−1) (i−1)j(k+1) and {Syz }n = (S n + Sn − Sn − Sn )/4h2 . ijk i(j+1)(k+1) i(j−1)(k−1) i(j+1)(k−1) i(j−1)(k+1) Equation (A.6) is solved with the Dirichlet boundary condition S(r, t) = 0, ∀r ∈ ∂Ω. For the initial value of S, we consider S(x, y, z, 0) =    1, (x, y, z) ∈ Dsa (A.9)   0, otherwise where we define the domain enclosed by the solvent accessible surface to be Dsa = Na i=1 {r : |r − Ri| < ri + rp }, with ri and rp being atomic van der Waals radius and the probe radius, respectively. Here, Ri is the atomic center position vector of the ith solute atom and Na denotes the total number of atoms for a given macromolecule. To protect the van der Waals surface and make the computation more efficient, we only update the values of S(x, y, z, t) at the points in between the van der Waals surface and the solvent accessible surface; i.e., (x, y, z) ∈ Dsa /DvdW , where DvdW is the domain enclosed by van der Waals surface Na DvdW = i=1 {r : |r − Ri | < ri }. Numerically, to avoid possible zeros in the denominator of Eq. (A.6) we add a very small number, such as 10−7 , into the denominator expression, which does not affect the result at all. For simplicity, the widely used explicit Euler scheme can be applied to the solution of the generalized Laplace-Beltrami equation for the time integration. The Euler scheme can 182 be combined with the second order central difference scheme for the spatial discretization [20]. Nevertheless, this algorithm is not very efficient because a very small time stepping size is required to guarantee the stability of the time integration. Therefore, an alternative direction implicit (ADI) scheme is desirable. The ADI scheme is second order in both spatial and time discretizations. It builds in a fast O(N) Thomas algorithm to solve the tridiagonal linear system and thus is very efficient. The ADI algorithm is unconditionally stable and allows a much larger time stepping size than does the explicit Euler scheme. An splitting algorithm based ADI scheme reported in our earlier work [17] was the fastest scheme among the tested ones under typical accuracy requirement for the mean curvature flow. Considering the similarity of the current differential operator and the mean curvature flow, we adopt the ADI scheme to speed up our generalized geometric flow solver. To this end, we modify Eq. (A.8) as 1− vx 2 vy 2 vz 2 n+1 vx 2 vy 2 vz 2 n n δx − δy − δz S = 1 + δx + δy + δz Sijk + τ f (Sijk ) (A.10) ijk 2 2 2 2 2 2 It follows that Ax 2 Ax 1+ 2 1− = Ay 2 Ay 1+ 2 1− Az S n+1 ijk 2 Ax Ay Az Az n n 1+ − Sijk + τ f (Sijk ) 2 4 1− (A.11) where 2 2 2 Ax = vx δx , Ay = vy δy , Az = vz δz (A.12) 2 2 2 Here vx , vy , vz , δx , δy and δz are defined in Section 2.2.1. The following multi-step imple- 183 mentation can be carried out. Step 1: Ax 1− 2 n+ 1 3 = S ijk 1+ Ax n n + Ay + Az Sijk + τ f (Sijk ) 2 (A.13) Step 2: Ay 1− 2 2 n+ 3 n+ 1 Ay n 3− S S =S ijk ijk 2 ijk (A.14) Az 2 2 n+ 3 Az n S n+1 = S − S . ijk ijk 2 ijk (A.15) Step 3: 1− 184 Appendix B PB equation in different forms Assuming that there are only two mobile ion spices and all ions are univalent, we can treat them as positive and negative ions with charge +ec and −ec , where ec is the electron charge. Then the nonlinear Poisson-Boltzmann (NLPB) equation (1.1) becomes [93, 108] −∇ · (ǫ(r)∇φ) + κ2 (r) kB T ec sinh ec φ kB T = λm ρm , (B.1) where κ is the modified Debye-H¨ ckel screening function describing ion strength and is u determined by κ2 = 2λs Na e2 cI , s 1000kB T (B.2) where Na the Avogadro’s number, and Is the ion strength in the unit of mole. Numerically, when T = 298K, the value of κ2 can be obtained via κ2 = 0.675365 ˚−2 Is . Note that A Debye-H¨ ckel parameter κ can also be expressed as [97] u Nc 2 = λs κ n0 Q2 . i i kB T i=1 185 (B.3) Equations (B.2) and (B.3) are equivalent to each other via the following formula [108] Nc 1000M 1 n0 Q2 = Is = , Na 2e2 i=1 i i c (B.4) where M is the bulk concentration of ions in the unit of mole per cubic centimeter mol for cm3 both positive and negative ionic charges. Equation (B.1) is subject to the far-field boundary condition φ(∞) = 0. However, the Dirichlet boundary condition is used in practical computations Nm φ(r) = φi = i i=1 qi e−κ|r−ri |/ǫs ǫs |r − ri | ∀r ∈ ∂Ω, (B.5) where φi is the exact solution of a single ion in a homogeneous media. The linear superposition in Eq. (B.5) is very accurate if the macromolecule domain Ωm is sufficiently away from the boundary ∂Ω. Let define a dimensionless potential u through u = ec φ/kB T , one yields another formulation of the nonlinear PB equation in terms of u [40] ec λm ρm . −∇ · (ǫ(r)∇u) + κ2 (r) sinh(u) = kB T (B.6) If the potential is very weak, i.e., u ≪ 1, one can numerically solve the following linearized PB (LPB) equation ec −∇ · (ǫ(r)∇u) + κ2 (r)u(r) = λm ρm . kB T (B.7) Note that in the Poisson-Boltzmann theory, there are two unit conventions in the literature that differs by a factor of 4π. Specifically, the convention used by Sharp and Honig [200], and in some of our earlier work [93, 40] has a factor of 4π in the Poisson-Boltzmann 186 equation. Whereas, the convention used by Gilson et al. [97] and in our recent work [243] as well as the present derivation, the 4π factor does not appear. Therefore, care is needed in the comparison of the electrostatic potentials computed by these two conventions. 187 Appendix C Differential geometry theory preliminary The solvent-solute boundary can be considered as a 2-dimensional (2D) differentiable manifold embedded in a 3D Euclidean space or a hypersurface in a Riemannian manifold. For example, the subsequent free energy optimization can be carried out on the 2D manifold. Consider a C 2 immersion f : U → Rn+1 , where U ⊂ Rn is an open set and U is compact[248]. Here f(u) = (f1 (u), f2 (u), · · · , fn+1 (u)) is a hypersurface element (or a position vector), and u = (u1 , u2 , · · · , un ) ∈ U. Tangent vectors (or directional vectors) of f are Xi = ∂f , i = 1, 2 · · · n. The Jacobi matrix of the mapping f is given by ∂ui Df = (X1 , X2 , · · · , Xn ). The first fundamental form is a symmetric, positive definite metric tensor of f, given by I(Xi , Xj ) := (gij ) = (Df)T · (Df). Its matrix elements can also be expressed as gij = Xi , Xj , where , i, j = 1, 2, · · · , n. 188 is the Euclidean inner product in Rn , Let N(u) be the unit normal vector given by the Gauss map N : U → Rn+1 , N(u1 , u2 , · · · , un ) := X1 × X2 · · · × Xn / X1 × X2 · · · × Xn ∈ ⊥u f, (C.1) where “×′′ denotes the cross product. Here ⊥u f is the normal space of f at point X = f(u), where the position vector X differs much from tangent vectors Xi . The normal vector N is perpendicular to the tangent hyperplane Tu f at X. Note that Tu f ⊕ ⊥u f = Tf(u) Rn , the tangent space at X. By means of the normal vector N and tangent vector Xi , the second fundamental form is given by II(Xi , Xj ) = (hij )i,j=1,2,···n = − ∂N ,X ∂ui j . (C.2) ij 1 The mean curvature can be calculated from H = n hij g ji, where we use the Einstein summation convention, and (g ij ) = (gij )−1 . For n = 2, which fits into our purpose, let us choose f(u) = (u1 , u2 , χ), where χ(u1 , u2 ) is a function of interest. We have the first fundamental form:   2 χ χ  1 + χ1 1 2  (gij ) =  , 2 χ1 χ2 1 + χ2 (C.3) ∂χ , i = 1, 2. The inverse matrix of (gij ) is given by where χi = ∂ui   2 −χ χ 1  1 + χ2 1 2  (g ij ) =  , g 2 −χ1 χ2 1 + χ1 189 (C.4) where g = Det(gij ) = 1 + χ2 + χ2 is the Gram determinant. The normal vector can be 1 2 computed from Eq. (C.1) N= (−χ1 , −χ2 , 1) , √ g The second fundamental form is given by (hij ) = 1 √ χu u g i j (C.5) , i.e., the Hessian matrix of χ. The explicit form for the mean curvature operator can be written as 1 (h g + h22 g11 − 2h12 g12 ) 2g 11 22 χ χ 1 ∂ ∂ = . √1 + √2 2 ∂u1 g ∂u2 g H = (C.6) (C.7) In Section 4.1.3, we show that the mean curvature operator can be expressed in a (3D) formulation. 190 BIBLIOGRAPHY 191 BIBLIOGRAPHY [1] A. Abrashkin, D. Andelman, and H. Orland. Dipolar poisson-boltzmann equation: Ions and dipoles close to charge interfaces. PHYSICAL REVIEW LETTERS, 99:077801, 2007. [2] J. Antosiewicz, J. A. McCammon, and M. K. Gilson. The determinants of pKa s in proteins. Biochemistry, 35(24):7819–7833, 1996. [3] E. Artacho. The siesta method; developments and applicability. J. Phys.: Condens. Matter, 20:064208, 2008. [4] H. S. Ashbaugh. Convergence of molecular and macroscopic continuum descriptions of ion hydration. Journal of Physical Chemistry B, 104(31):7235–7238, 2000. [5] C. Azuara, E. Lindahl, P. Koehl, H. Orland, and M. Delarue. PDB Hydro: incorporating dipolar solvents with variable density in the Poisson-Boltzmann treatment of macromolecule electrostatics. Nucleic Acids Research, 34:W38–W42, 2006. [6] C. Azuara, H. Orland, M. Bon, P. Koehl, and M. Delarue. Incorporating dipolar solvents with variable density in poisson-boltzmann electrostatics. Biophysical Journal, 95:5587C5605, 2008. [7] N. A. Baker. Poisson-Boltzmann methods for biomolecular electrostatics. Methods in Enzymology, 383:94–118, 2004. [8] N. A. Baker. Biomolecular applications of Poisson-Boltzmann methods. In K. B. Lipkowitz, R. Larter, and T. R. Cundari, editors, Reviews in Computational Chemistry, volume 21. John Wiley and Sons, Hoboken, NJ, 2005. 192 [9] N. A. Baker. Improving implicit solvent simulations: a Poisson-centric view. Current Opinion in Structural Biology, 15(2):137–43, 2005. [10] N. A. Baker, D. Bashford, and D. A. Case. Implicit solvent electrostatics in biomolecular simulation. In B. Leimkuhler, C. Chipot, R. Elber, A. Laaksonen, A. Mark, T. Schlick, C. Schutte, and R. Skeel, editors, New Algorithms for Macromolecular Simulation. Springer, 2006. [11] N. A. Baker and J. A. McCammon. Electrostatic interactions. In P. Bourne and H. Weissig, editors, Structural Bioinformatics, pages 427–440. John Wiley & Sons, Inc., New York, 2003. [12] N. A. Baker, D. Sept, S. Joseph, M. J. Holst, and J. A. McCammon. Electrostatics of nanosystems: Application to microtubules and the ribosome. Proceedings of the National Academy of Sciences of the United States of America, 98(18):10037–10041, 2001. [13] N. K. Banavali, W. Im, and B. Roux. Electrostatic free energy calculations using the generalized solvent boundary potential method. Journal of Chemical Physics, 117(15):7381–8, 2002. [14] V. Barone, M. Cossi, and J. Tomasi. A new definition of cavities for the computation of solvation free energies by the polarizable continuum model. Journal of Chemical Physics, 107:3210–3221, 1997. [15] D. Bashford and D. A. Case. Generalized Born models of macromolecular solvation effects. Annual Review of Physical Chemistry, 51:129–152, 2000. [16] D. Bashford and M. Karplus. pKa ’s of ionizable groups in proteins: atomic detail from a continuum electrostatic model. Biochemistry, 29(44):10219–25, 1990. [17] P. W. Bates, Z. Chen, Y. H. Sun, G. W. Wei, and S. Zhao. Geometric and potential driving formation and evolution of biomolecular surfaces. J. Math. Biol., 59:193–231, 2009. [18] P. W. Bates, G. W. Wei, and S. Zhao. The minimal molecular surface. arXiv:qbio/0610038v1, [q-bio.BM], 2006. [19] P. W. Bates, G. W. Wei, and S. Zhao. The minimal molecular surface. Midwest Quantitative Biology Conference, Mission Point Resort, Mackinac Island, MI:September 29 – October 1, 2006. 193 [20] P. W. Bates, G. W. Wei, and S. Zhao. Minimal molecular surfaces and their applications. Journal of Computational Chemistry, 29(3):380–91, 2008. [21] A. D. Becke. Density functional thermochemistry iii. the role of exact exchange. J. Chem. Phys., 98:5648, 1993. [22] D. Beglov and B. Roux. Solvation of complex molecules in a polar liquid: an integral equation theory. Journal of Chemical Physics, 104(21):8678–8689, 1996. [23] D. Beglov and B. Roux. An integral equation to describe the solvation of polar molecules in liquid water. Journal of Physical Chemistry B, 101(39):7821–6, 1997. [24] M. Berger and B. Gostiaux. Differential Geometry: Manifolds, Curves, and Surfaces. Springer-Verlag, 1988. [25] C. A. S. Bergstrom, M. Strafford, L. Lazorova, A. Avdeef, K. Luthman, and P. Artursson. Absorption classification of oral drugs based on molecular surface properties. Journal of Medicinal Chemistry, 46(4):558–570, 2003. [26] C. Bertonati, B. Honig, and E. Alexov. Poisson-Boltzmann calculations of nonspecific salt effects on protein-protein binding free energy. Biophysical Journal, 92:1891–1899, 2007. [27] A. L. Bertozzi and J. B. Greer. Low-curvature image simplifiers: Global regularity of smooth solutions and Laplacian limiting schemes. Communications on Pure and Applied Mathematics, 57(6):764–790, 2004. [28] B. H. Besler, J. Merz, K. M., and P. A. Kollman. Atomic charges derived from semiempirical methods. J. Comput. Chem., 11:431–439, 1990. [29] N. Blomberg, R. R. Gabdoulline, M. Nilges, and R. C. Wade. Classification of protein sequences by homology modeling and quantitative analysis of electrostatic similarity. Proteins, 37(3):379–387, 1999. [30] P. Blomgren and T. Chan. Color TV: total variation methods for restoration of vectorvalued images. Image Processing, IEEE Transactions on, 7(3):304–309, 1998. [31] A. H. Boschitsch and M. O. Fenley. Hybrid boundary element and finite difference method for solving the nonlinear Poisson-Boltzmann equation. Journal of Computational Chemistry, 25(7):935–955, 2004. 194 [32] M. Bostrom, F. W. Tavares, D. Bratko, and B. W. Ninham. Specific ion effects in solutions of globular proteins: Comparison between analytical models and simulation. Journal of Physical Chemistry B, 109(51):24489–94, 2005. [33] K. Burke, J. Werschnik, and E. K. U. Gross. Time-dependent density functional theory: Past, present, and future. J. Chem. Phys., 123:062206, 2005. [34] W. Cai and S. Z. Deng. An upwinding embedded boundary method for Maxwell’s equations in media with material interfaces: 2d case. J. Comput. Phys., 190:159–183, 2003. [35] E. Cances, B. Mennucci, and J. Tomasi. A new integral equation formalism for the polarizable continuum model: Theoretical background and applications to isotropic and anisotropic dielectrics. Journal of Chemical Physics, 107:3032 – 3041, 1997. [36] V. Carstensen, R. Kimmel, and G. Sapiro. Geodesic active contours. International Journal of Computer Vision, 22:61–79, 1997. [37] T. Cecil. A numerical method for computing minimal surfaces in arbitrary dimension. Journal of Computational Physics, 206(2):650–660, 2005. [38] D. S. Cerutti, N. A. Baker, and J. A. McCammon. Solvent reaction field potential inside an uncharged globular protein: A bridge between implicit and explicit solvent models? The Journal of Chemical Physics, 127(15):155101, 2007. [39] Q. Chang, X. Tai, and L. Xing. A compound algorithm of denoising using secondorder and fourth-order partial differential equations. NUMERICAL MATHEMATICSTHEORY METHODS AND APPLICATIONS, 2:353–376, 2010. [40] D. Chen, Z. Chen, C. Chen, W. H. Geng, and G. W. Wei. MIBPB: A software package for electrostatic analysis. J. Comput. Chem., in press, 2010. [41] D. Chen, G. W. Wei, X. Cong, and G. Wang. Computational methods for optical molecular imaging. Communications in Numerical Methods in Engineering, 25:1137– 1161, 2009. [42] J. Chen, L. Noodleman, D. Case, and D. Bashford. Incorporating solvation effects into density functional electronic structure calculations. J. Phys. Chem., 98:11059– 11068, 1994. 195 [43] L. Chen, M. J. Holst, and J. Xu. The finite element approximation of the nonlinear poisson–boltzmann equation. SIAM Journal on Numerical Analysis, 45(6):2298–2320, 2007. [44] T. Chen and J. Strain. Piecewise-polynomial discretization and Krylov-accelerated multigrid for elliptic interface problems. J. Comput. Phys., 16:7503–7542, 2008. [45] Y. G. Chen and J. D. Weeks. Local molecular field theory for effective attractions between like charged objects in systems with strong Coulomb interactions. Proceedings of the National Academy of Sciences of the United States of America, 103(20):7560–5, 2006. [46] Z. Chen, N. A. Baker, and G. W. Wei. Differential geometry based solvation models I: Eulerian formulation. J. Comput. Phys., 229:8231–8258, 2010. [47] Z. Chen, N. A. Baker, and G. W. Wei. Differential geometry based solvation models II: Lagrangian formulation. J. Math. Biol., accepted, 2010. [48] L. T. Cheng, J. Dzubiella, A. J. McCammon, and B. Li. Application of the level-set method to the implicit solvation of nonpolar molecules. Journal of Chemical Physics, 127(8), 2007. [49] Y. Cheng, J. K. Suen, Z. Radi, S. D. Bond, M. J. Holst, and J. A. McCammon. Continuum simulations of acetylcholine diffusion with reaction-determined boundaries in neuromuscular junction models. Biophysical Chemistry, 127(3):129–39, 2007. [50] Y. Cheng, J. K. Suen, D. Zhang, S. D. Bond, Y. Zhang, Y. Song, N. A. Baker, C. L. Bajaj, M. J. Holst, and J. A. McCammon. Finite element analysis of the time-dependent Smoluchowski equation for acetylcholinesterase reaction rate calculations. Biophysical Journal, 92(10):3397–406, 2007. [51] I. L. Chern, J.-G. Liu, and W.-C. Weng. Accurate evaluation of electrostatics for macromolecules in solution. Methods and Applications of Analysis, 10(2):309–28, 2003. [52] M. Chiba, D. G. Fedorov, and K. Kitaura. Polarizable continuum model with the fragment molecular orbital-based time-dependent density functional theory. Journal of Computational Chemistry, 29:2667–2676, 2008. [53] L. E. Chirlian and M. M. Francl. Atomic charges derived from electrostatic potentials: A detailed study. J. Comput. Chem., 8:894–905, 1987. 196 [54] D. L. Chopp. Computing minimal surfaces via level set curvature flow. Journal of Computational Physics, 106(1):77–91, 1993. [55] I. Chorny, K. A. Dill, and M. P. Jacobson. Surfaces affect ion pairing. Journal of Physical Chemistry B, 109(50):24056–60, 2005. [56] V. B. Chu, Y. Bai, J. Lipfert, D. Herschlag, and S. Doniach. Evaluation of ion binding to DNA duplexes using a size-modified Poisson-Boltzmann theory. Biophysical Journal, 93(9):3202–3209, 2007. [57] M. L. Connolly. Analytical molecular surface calculation. Journal of Applied Crystallography, 16(5):548–558, 1983. [58] R. B. Corey and L. Pauling. Molecular models of amino acids, peptides and proteins. Rev. Sci. Instr., 24:621–627, 1953. [59] M. Cossi, V. Barone, R. Cammi, and J. Tomasi. Ab initio study of solvated molecules: A new implementation of the polarizable continuum model. Chemical Physics Letters, 255:327–335, 1996. [60] S. R. Cox and D. E. Williams. Representation of the molecular electrostatic potential by a net atomic charge model. J.Comput. Chem., 2:304–323, 1981. [61] C. J. Cramer and D. G. Truhlar. Implicit solvation models: Equilibria, structure, spectra, and dynamics. Chem. Rev., 99:2161–2200, 1999. [62] P. B. Crowley and A. Golovin. Cation-pi interactions in protein-protein interfaces. Proteins: Structure, Function, and Bioinformatics, 59(2):231–239, 2005. [63] L. David, R. Luo, and M. K. Gilson. Comparison of generalized Born and Poisson models: Energetics and dynamics of HIV protease. Journal of Computational Chemistry, 21(4):295–309, 2000. [64] M. E. Davis, J. D. Madura, J. Sines, B. A. Luty, S. A. Allison, and J. A. McCammon. Diffusion-controlled enzymatic reactions. Methods in Enzymology, 202:473–497, 1991. [65] M. E. Davis and J. A. McCammon. Electrostatics in biomolecular structure and dynamics. Chemical Reviews, 94:509–21, 1990. 197 [66] F. De Rienzo, R. R. Gabdoulline, M. C. Menziani, P. G. De Benedetti, and R. C. Wade. Electrostatic analysis and Brownian dynamics simulation of the association of plastocyanin and cytochrome F. Biophysical Journal, 81(6):3090–3104, 2001. [67] C. Dietrich, C. E. Scheidegger, J. Schreiner, J. L. D. Comba, L. P. Nedel, and C. T. Silva. Edge transformations for improving mesh quality of marching cubes. IEEE Transactions on Visualization and Computer Graphics, 15(1):150–159, 2009. [68] B. N. Dominy and C. L. Brooks, III. Development of a generalized Born model parameterization for proteins and nucleic acids. Journal of Physical Chemistry B, 103(18):3765–3773, 1999. [69] F. Dong, B. Olsen, and N. A. Baker. Computational methods for biomolecular electrostatics. Methods in Cell Biology, 84:843–70, 2008. [70] F. Dong, M. Vijaykumar, and H. X. Zhou. Comparison of calculation and experiment implicates significant electrostatic contributions to the binding stability of barnase and barstar. Biophysical Journal, 85(1):49–60, 2003. [71] F. Dong, J. A. Wagoner, and N. A. Baker. Assessing the performance of implicit solvation models at a nucleic acid surface. Physical Chemistry Chemical Physics, 10:4889– 902, 2008. [72] F. Dong and H. X. Zhou. Electrostatic contribution to the binding stability of proteinprotein complexes. Proteins, 65(1):87–102, 2006. [73] A. I. Dragan, C. M. Read, E. N. Makeyeva, E. I. Milgotina, M. E. Churchill, C. CraneRobinson, and P. L. Privalov. DNA binding and bending by HMG boxes: Energetic determinants of specificity. Journal of Molecular Biology, 343(2):371–393, 2004. [74] J. Dzubiella, J. M. J. Swanson, and J. A. McCammon. Coupling hydrophobicity, dispersion, and electrostatics in continuum solvent models. Physical Review Letters, 96:087802, 2006. [75] J. Dzubiella, J. M. J. Swanson, and J. A. McCammon. Coupling nonpolar and polar solvation free energies in implicit solvent models. Journal of Chemical Physics, 124:084905, 2006. [76] A. H. Elcock, R. R. Gabdoulline, R. C. Wade, and J. A. McCammon. Computer simulation of protein-protein association kinetics: Acetylcholinesterase-fasciculin. Journal of Molecular Biology, 291(1):149–162, 1999. 198 [77] A. H. Elcock, D. Sept, and J. A. McCammon. Computer simulation of protein-protein interactions. J. Phys.Chem. B, 105:1504–1518, 2001. [78] H. Federer. Curvature Measures. Trans. Amer. Math. Soc., 93:418–491, 1959. [79] R. P. Fedkiw, T. Aslam, B. Merriman, and S. Osher. A non-oscillatory Eulerian approach to interfaces in multimaterial flows (the ghost fluid method). J. Comput. Phys., 152:457–492, 1999. [80] M. V. Fedorov, J. M. Goodman, and S. Schumm. To switch or not to switch: The effects of potassium and sodium ions on α-poly-l-glutamate conformations in aqueous solutions. J. Am. Chem. Soc., 131:10854–0856, 2009. [81] M. V. Fedorov and A. A. Kornyshev. Unravelling the solvent response to neutral and charged solutes. Molecular Physics, 105(1):1–16, 2007. [82] M. Feig and C. L. Brooks III. Recent advances in the development and application of implicit solvent models in biomolecule simulations. Curr Opin Struct Biol., 14:217 – 224, 2004. [83] M. Feig, A. Onufriev, M. S. Lee, W. Im, D. A. Case, and I. Brooks, C. L. Performance comparison of generalized Born and Poisson methods in the calculation of electrostatic solvation energies for protein structures. Journal of Computational Chemistry, 25(2):265–284, 2004. [84] X. Feng and A. Prohl. Analysis of a fully discrete finite element method for the phase field model and approximation of its sharp interface limits. Mathematics of Computation, 73:541–567, 2004. [85] M. Fixman. The Poisson-Boltzmann equation and its application to polyelectrolytes. Journal of Chemical Physics, 70(11):4995–5005, 1979. [86] F. Fogolari, A. Brigo, and H. Molinari. The Poisson-Boltzmann equation for biomolecular electrostatics: a tool for structural biology. Journal of Molecular Recognition, 15(6):377–92, 2002. [87] J. Forsman. A simple correlation-corrected Poisson-Boltzmann theory. Journal of Physical Chemistry B, 108(26):9236–45, 2004. 199 [88] P. H. Fries and G. N. Patey. The solution of the hypernetted-chain approximation for fluids of nonspherical particles. a general method with application to dipolar hard spheres. Journal of Chemical Physics, 82:429 – 440, 1985. [89] R. R. Gabdoulline and R. C. Wade. Brownian dynamics simulation of protein-protein diffusional encounter. Methods-a Companion to Methods in Enzymology, 14(3):329– 341, 1998. [90] E. Gallicchio, M. M. Kubo, and R. M. Levy. Enthalpy-entropy and cavity decomposition of alkane hydration free energies: Numerical results and implications for theories of hydrophobic solvation. Journal of Physical Chemistry B, 104(26):6271–85, 2000. [91] E. Gallicchio and R. M. Levy. AGBNP: An analytic implicit solvent model suitable for molecular dynamics simulations and high-resolution modeling. Journal of Computational Chemistry, 25(4):479–499, 2004. [92] E. Gallicchio, L. Y. Zhang, and R. M. Levy. The SGB/NP hydration free energy model based on the surface generalized Born solvent reaction field and novel nonpolar hydration free energy estimators. Journal of Computational Chemistry, 23(5):517–29, 2002. [93] W. Geng, S. Yu, and G. W. Wei. Treatment of charge singularities in implicit solvent models. Journal of Chemical Physics, 127:114106, 2007. [94] W. H. Geng and G. W. Wei. Multiscale molecular dynamics via the matched interface and boundary (mib) method. J. Comput. Phys., in press, 2010. [95] R. E. Georgescu, E. G. Alexov, and M. R. Gunner. Combining conformational flexibility and continuum electrostatics for calculating pKas in proteins. Biophysical Journal, 83(4):1731–1748, 2002. [96] G. Gilboa, N. Sochen, and Y. Y. Zeevi. Image sharpening by flows based on triple well potentials. Journal of Mathematical Imaging and Vision, 20:121–131, 2004. [97] M. K. Gilson, M. E. Davis, B. A. Luty, and J. A. McCammon. Computation of electrostatic forces on solvated molecules using the Poisson-Boltzmann equation. Journal of Physical Chemistry, 97(14):3591–3600, 1993. [98] S. Goedecker. Linear scaling electronic structure methods. Rev. Mod. Phys., 71:1085– 1123, 1999. 200 [99] V. Gogonea and K. M. Merz. Fully quantum mechanical description of proteins in solution. combining linear scaling quantum mechanical methodologies with the poissonboltzmann equation. J. Phys. Chem. A, 103:5171–5188, 1999. [100] J. Gomes and O. D. Faugeras. Using the vector distance functions to evolve manifolds of arbitrary codimension. Lecture Notes in Computer Science, 2106:1–13, 2001. [101] J. A. Grant, B. T. Pickup, and A. Nicholls. A smooth permittivity function for PoissonBoltzmann solvation methods. Journal of Computational Chemistry, 22(6):608–640, 2001. [102] J. A. Grant, B. T. Pickup, M. T. Sykes, C. A. Kitchen, and A. Nicholls. The Gaussian Generalized Born model: application to small molecules. Physical Chemistry Chemical Physics, 9:4913–22, 2007. [103] J. B. Greer and A. L. Bertozzi. H-1 solutions of a class of fourth order nonlinear equations for image processing. Discrete and Continuous Dynamical Systems, 10:349– 366, 2004. [104] J. B. Greer and A. L. Bertozzi. Traveling wave solutions of fourth order pdes for image processingl. SIAM Journal on Mathematics Analysis, 36:38–68, 2004. [105] P. Grochowski and J. Trylska. Continuum molecular electrostatics, salt effects and counterion binding. a review of the Poisson-Boltzmann theory and its modifications. Biopolymers, 89(2):93–113, 2007. [106] P. Hohenberg and W. Kohn. Inhomogeneous electron gas. Phys. Rev., 136:B864–871, 1964. [107] C. Holm, P. Kekicheff, and R. Podgornik. Electrostatic effects in soft matter and biophysics; NATO Science Series. Kluwer Academic Publishers, Boston, 2001. [108] M. J. Holst. Multilevel Methods for the Poisson-Boltzmann Equation. University of Illinois at Urbana-Champaign, Numerical Computing Group, Urbana-Champaign, 1993. [109] B. Honig and A. Nicholls. Classical electrostatics in biology and chemistry. Science, 268(5214):1144–9, 1995. [110] H. Hu, Z. Y. Lu, and W. T. Yang. Fitting molecular electrostatic potentials from quantum mechanical calculations. Journal of Chemical Theory and Computation, 3:1004 – 1013, 2007. 201 [111] B. Husowitz and V. Talanquer. Solvent density inhomogeneities and solvation free energies in supercritical diatomic fluids: A density functional approach. The Journal of Chemical Physics, 126(5):054508, 2007. [112] W. Im, D. Beglov, and B. Roux. Continuum solvation model: electrostatic forces from numerical solutions to the Poisson-Boltzmann equation. Computer Physics Communications, 111(1-3):59–75, 1998. [113] R. Improta, V. Barone, G. Scalmani, and M. J. Frisch. A state-specific polarizable continuum model time dependent density functional theory method for excited state calculations in solution. Journal of Chemical Physics, 125(054103), 2006. [114] R. M. Jackson and M. J. Sternberg. A continuum model for protein-protein interactions: Application to the docking problem. Journal of Molecular Biology, 250(2):258– 275, 1995. [115] A. Jakalian, B. L. Bush, D. B. Jack, and C. I. Bayly. Fast, efficient generation of high-quality atomic charges. am1-bcc model: I. method. Journal of Computational Chemistry, 21(2):132–146, 2000. [116] B. Jayaram, D. Sprous, and D. L. Beveridge. Solvation free energy of biomacromolecules: Parameters for a modified generalized Born model consistent with the AMBER force field. Journal of Physical Chemistry B, 102(47):9571–9576, 1998. [117] R. Jinnouchi and A. B. Anderson. Electronic structure calculations of liquid-solid interfaces: Combination of density functional theory and modified Poisson-Boltzmann theory. PHYSICAL REVIEW B, 77(245417), 2008. [118] S. C. L. Kamerlin, M. Haranczyk, and A. Warshel. Progress in ab initio qm/mm freeenergy simulations of electrostatic energies in proteins: Accelerated qm/mm studies of pk(a), redox reactions and solvation free energies. Journal of Phys. Chem. B, 113:1253– 1272, 2009. [119] J. G. Kirkwood. Theory of solution of molecules containing widely separated charges with special application to zwitterions. J. Comput. Phys., 7:351 – 361, 1934. [120] P. Koehl. Electrostatics calculations: latest methodological advances. Current Opinion in Structural Biology, 16(2):142–51, 2006. 202 [121] P. Koehl, H. Orland, and M. Delarue. Beyond the poisson-boltzmann model: Modeling biomolecule-water and water-water interactions. PHYSICAL REVIEW LETTERS, 102:087801, 2009. [122] W. Kohn and L. J. Sham. Self-consistent equations including exchange and correlation effects. Phys. Rev., 140:A1133– A1138, 1965. [123] L. A. Kuhn, M. A. Siani, M. E. Pique, C. L. Fisher, E. D. Getzoff, and J. A. Tainer. The interdependence of protein surface topography and bound water molecules revealed by surface accessibility and fractal density measures. Journal of Molecular Biology, 228(1):13–22, 1992. [124] M. C. Lai and C. S. Peskin. An immersed boundary method with formal second-order accuracy and reduced numerical viscosity. J. Comput. Phys., 160:705–719, 2000. [125] G. Lamm. The Poisson-Boltzmann equation. In K. B. Lipkowitz, R. Larter, and T. R. Cundari, editors, Reviews in Computational Chemistry, pages 147–366. John Wiley and Sons, Inc., Hoboken, N.J., 2003. [126] B. Lee and F. M. Richards. The interpretation of protein structures: estimation of static accessibility. J Mol Biol, 55(3):379–400, 1971. [127] C. T. Lee, W. T. Yang, and R. G. Parr. Development of the colle-salvetti correlationenergy formula into a functional of the electron-density. Physical Review B, 37:785 – 789, 1988. [128] M. S. Lee, J. Salsbury, F. R., and M. A. Olson. An efficient hybrid explicit/implicit solvent method for biomolecular simulations. Journal of Computational Chemistry, 25(16):1967–78, 2004. [129] R. J. LeVeque and Z. L. Li. The immersed interface method for elliptic equations with discontinuous coefficients and singular sources. SIAM J. Numer. Anal., 31:1019–1044, 1994. [130] R. M. Levy, L. Y. Zhang, E. Gallicchio, and A. K. Felts. On the nonpolar hydration free energy of proteins: surface area and continuum solvent models for the solute-solvent interaction energy. Journal of the American Chemical Society, 125(31):9523–9530, 2003. [131] H. Li, A. D. Robertson, and J. H. Jensen. The determinants of carboxyl pKa values in turkey ovomucoid third domain. Proteins, 55(3):689–704, 2004. 203 [132] H. Li, A. D. Robertson, and J. H. Jensen. Very fast empirical prediction and rationalization of protein pka values. Proteins, 61(4):704–21, 2005. [133] J. Li, C. L. Fisher, J. L. Chen, D. Bashford, and L. Noodleman. Calculation of redox potentials and pKa values of hydrated transition metal cations by a combined density functional and continuum dielectric theory. Inorganic Chemistry, 35(16):4694–702, 1996. [134] Y. Li and F. Santosa. A computational algorithm for minimizing total variation in image restoration. IEEE Transactions on Image Processing, 5(6):987–95, 1996. [135] Z. L. Li and K. Ito. Maximum principle preserving schemes for interface problems with discontinuous coefficients. SIAM J. Sci. Comput., 23:339–361, 2001. [136] V. J. Licata and N. M. Allewell. Functionally linked hydration changes in escherichia coli aspartate transcarbamylase and its catalytic subunit. Biochemistry, 36(33):10161– 10167, 1997. [137] X. D. Liu, R. P. Fedkiw, and M. Kang. A boundary condition capturing method for Poisson’s equation on irregular domains. J. Comput. Phys., 160:151–178, 2000. [138] D. R. Livesay, P. Jambeck, A. Rojnuckarin, and S. Subramaniam. Conservation of electrostatic properties within enzyme families and superfamilies. Biochemistry, 42(12):3464–3473, 2003. [139] J. R. Livingstone, R. S. Spolar, and M. T. Record Jr. Contribution to the thermodynamics of protein folding from the reduction in water-accessible nonpolar surface area. Biochemistry, 30(17):4237–44, 1991. [140] W. E. Lorensen and H. E. Cline. Marching cubes: a high resolution 3d surface reconstruction algorithm. Computer Graphics, 21:163–169, 1987. [141] Q. Lu and R. Luo. A Poisson-Boltzmann dynamics method with nonperiodic boundary condition. Journal of Chemical Physics, 119(21):11035–11047, 2003. [142] R. Luo, L. David, and M. K. Gilson. Accelerated Poisson-Boltzmann calculations for static and dynamic systems. Journal of Computational Chemistry, 23(13):1244–53, 2002. 204 [143] B. A. Luty, M. E. Davis, and J. A. McCammon. Solving the finite-difference nonlinear Poisson-Boltzmann equation. Journal of Computational Chemistry, 13:1114– 1118, 1992. [144] M. Lysaker, A. Lundervold, and X. C. Tai. Noise removal using fourth-order partial differential equation with application to medical magnetic resonance images in space and time. IEEE Trans. Imag. Proc., 12:1579–1590, 2003. [145] C. M. MacDermaid and G. A. Kaminski. Electrostatic polarization is crucial for reproducing pKa shifts of carboxylic residues in turkey ovomucoid third domain. Journal of Physical Chemistry B, 111(30):9036–44, 2007. [146] J. MacKerell, A. D., D. Bashford, M. Bellot, J. Dunbrack, R. L., J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, I. Reiher, W. E., B. Roux, M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus. All-atom empirical potential for molecular modeling and dynamics studies of proteins. Journal of Physical Chemistry B, 102(18):3586–3616, 1998. [147] J. D. Madura, J. M. Briggs, R. C. Wade, M. E. Davis, B. A. Luty, A. Ilin, J. Antosiewicz, M. K. Gilson, B. Bagheri, L. R. Scott, and J. A. McCammon. Electrostatics and diffusion of molecules in solution - simulations with the University of Houston Brownian Dynamics program. Computer Physics Communications, 91(1-3):57–95, 1995. [148] A. V. Marenich, C. J. Cramer, and D. G. Truhlar. Perspective on foundations of solvation modeling: The electrostatic contribution to the free energy of solvation. Journal of Chemical Theory and Computation, 4(6):877–887, 2008. [149] I. Massova and P. A. Kollman. Computational Alanine Scanning To Probe ProteinProtein Interactions: A Novel Approach To Evaluate Binding Free Energies. Journal of the American Chemical Society, 121(36):8133–43, 1999. [150] W. M. Matousek, B. Ciani, C. A. Fitch, B. E. Garcia-Moreno, R. A. Kammerer, and A. T. Alexandrescu. Electrostatic contributions to the stability of the GCN4 leucine zipper structure. Journal of Molecular Biology, 374(1):206–19, 2007. [151] A. Mayo. The fast solution of Poisson’s and the biharmonic equations on irregular regions. SIAM J. Numer. Anal., 21:285–299, 1984. 205 [152] Y. Mei, C. G. Ji, and J. Z. H. Zhang. A new quantum method for electrostatic solvation energy of protein. J. Chem. Phys., 125(094906), 2006. [153] K. Mikula and D. Sevcovic. A direct method for solving an anisotropic mean curvature flow of plane curves with an external force. Mathematical Methods in the Applied Sciences, 27(13):1545–1565, 2004. [154] F. A. Momany. Determination of partial atomic charges from ab initio molecular electrostatic potentials. application to formamide, methanol, and formic acid. J. Phys. Chem., 82:592–601, 1978. [155] J. Mongan, C. Simmerling, J. A. McCammon, D. A. Case, and A. Onufriev. Generalized Born model with a simple, robust molecular volume correction. Journal of Chemical Theory and Computation, 3(1):159–69, 2007. [156] Y. Mu, Y. Yang, and W. Xu. Hybrid hamiltonian replica exchange molecular dynamics simulation method employing the Poisson–Boltzmann model. Journal of Chemical Physics, 127(8), 2007. [157] R. S. Mulliken. Electronic population analysis on lcaomo [linear combination of atomic orbital-molecular orbital] molecular wave functions. I. J. Chem. Phys., 23:1833–1840, 1955. [158] D. Mumford and J. Shah. Optimal approximations by piecewise smooth functions and associated variational problems. Communications on Pure and Applied Mathematics, 42(5):577–685, 1989. [159] R. R. Netz and H. Orland. Beyond Poisson-Boltzmann: Fluctuation effects and correlation functions. European Physical Journal E, 1(2-3):203–14, 2000. [160] A. Nicholls, D. L. Mobley, P. J. Guthrie, J. D. Chodera, and V. S. Pande. Predicting small-molecule solvation free energies: An informal blind test for computational chemistry. Journal of Medicinal Chemistry, 51(4):769–79, 2008. [161] J. E. Nielsen, K. V. Andersen, B. Honig, R. W. W. Hooft, G. Klebe, G. Vriend, and R. C. Wade. Improving macromolecular electrostatics calculations. Protein Engineering, 12(8):657–662, 1999. [162] J. E. Nielsen and G. Vriend. Optimizing the hydrogen-bond network in PoissonBoltzmann equation-based pK(a) calculations. Proteins, 43(4):403–412, 2001. 206 [163] M. Nina, W. Im, and B. Roux. Optimized atomic radii for protein continuum electrostatics solvation forces. Biophysical Chemistry, 78(1-2):89–96, 1999. [164] M. Oevermann and R. Klein. A cartesian grid finite volume method for elliptic equations with variable coefficients and embedded interfaces. Journal of Computational Physics, 219:749–769, 2006. [165] A. Okur, L. Wickstrom, M. Layten, R. Geney, K. Song, V. Hornak, and C. Simmerling. Improved efficiency of replica exchange simulations through use of a hybrid explicit/implicit solvation model. Journal of Chemical Theory and Computation, 2(2):420–433, 2006. [166] A. Onufriev, D. Bashford, and D. A. Case. Modification of the generalized Born model suitable for macromolecules. Journal of Physical Chemistry B, 104(15):3712–3720, 2000. [167] A. Onufriev, D. A. Case, and D. Bashford. Effective Born radii in the generalized Born approximation: the importance of being perfect. Journal of Computational Chemistry, 23(14):1297–304, 2002. [168] P. Ordejon. Linear scaling ab initio calculations in nanoscale materials with siesta. Phys. Stat. Sol., 217:335, 2000. [169] P. Ordejon, E. Artacho, and J. M. Soler. Self-consistent order-n density-functional calculations for very large systems. Phys. Rev. B, 53:10441–10444, 1996. [170] S. Osher and R. P. Fedkiw. Level set methods: An overview and some recent results. Journal of Computational Physics, 169(2):463–502, 2001. [171] S. Osher and L. I. Rudin. Feature-oriented image enhancement using shock filters. SIAM Journal on Numerical Analysis, 27(4):919–940, 1990. [172] S. Osher and J. E. Sethian. Fronts Propagating with Curvature-dependent Speed: Algorithms based on the Hamilton-Jacobi Formulation. Journal of Computational Physics, 79:12–49, 1988. [173] C. S. Page and P. A. Bates. Can MM-PBSA calculations predict the specificities of protein kinase inhibitors? Journal of Computational Chemistry, 27(16):1990–2007, 2006. 207 [174] D. S. Palmer, V. P. Sergiievskyi, F. Jensen, and M. V. Fedorov. Accurate calculations of the hydration free energies of druglike molecules using the reference interaction site model. J. Chem. Phys., 133(044104), 2010. [175] R. Penfold, S. Nordholm, B. Jnsson, and C. E. Woodward. A simple analysis of ion-ion correlation in polyelectrolyte solutions. Journal of Chemical Physics, 92(3):1915–1922, 1990. [176] C. S. Peskin. Numerical analysis of blood flow in the heart. Journal of Computational Physics, 25(3):220–52, 1977. [177] D. Petrey and B. Honig. GRASP2: Visualization, surface properties, and electrostatics of macromolecular structures and sequences. Methods in Enzymology, 374:492–509, 2003. [178] R. A. Pierotti. A scaled particle theory of aqueous and nonaqeous solutions. Chemical Reviews, 76(6):717–726, 1976. [179] J. W. Ponder and D. A. Case. Force fields for protein simulations. Advances in Protein Chemistry, 66:27–85, 2003. [180] N. V. Prabhu, M. Panda, Q. Y. Yang, and K. A. Sharp. Explicit ion, implicit water solvation for molecular dynamics of nucleic acids and highly charged molecules. J. Comput. Chem., 29:1113–1130, 2008. [181] N. V. Prabhu, P. Zhu, and K. A. Sharp. Implementation and testing of stable, fast implicit solvation in molecular dynamics using the smooth-permittivity finite difference Poisson-Boltzmann method. Journal of Computational Chemistry, 25(16):2049–2064, 2004. [182] E. L. Ratkova, G. N. Chuev, V. P. Sergiievskyi, and M. V. Fedorov. An accurate prediction of hydration free energies by combination of molecular integral equations theory with structural descriptors. J. Phys. Chem. B, 114(37):12068–2079, 2010. [183] M. R. Reddy, U. C. Singh, and M. D. Erion. Ab initio quantum mechanics-based free energy perturbation method for calculating relative solvation free energies. Journal of Computational Chemistry, 28(2):491–4, 2007. [184] A. E. Reed, L. A. Curtiss, and F. Weinhold. Intermolecular interactions from a natural bond orbital, donor-acceptor viewpoint. Chem. Rev., 88:899–926, 1988. 208 [185] F. M. Richards. Areas, volumes, packing, and protein structure. Annual Review of Biophysics and Bioengineering, 6(1):151–176, 1977. [186] B. Roux and T. Simonson. Implicit solvent models. Biophysical Chemistry, 78(1-2):1– 20, 1999. [187] L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. In Proceedings of the eleventh annual international conference of the Center for Nonlinear Studies on Experimental mathematics : computational issues in nonlinear science, pages 259–268, Amsterdam, The Netherlands, The Netherlands, 1992. Elsevier North-Holland, Inc. [188] M. F. Sanner, A. J. Olson, and J. C. Spehner. Reduced surface: An efficient way to compute molecular surfaces. Biopolymers, 38:305–320, 1996. [189] G. Sapiro and D. L. Ringach. Anisotropic diffusion of multivalued images with applications to color filtering. Image Processing, IEEE Transactions on, 5(11):1582–1586, 1996. [190] A. Sarti, R. Malladi, and J. A. Sethian. Subjective surfaces: A geometric model for boundary completion. International Journal of Computer Vision, 46(3):201–221, 2002. [191] A. Savelyev and G. A. Papoian. Inter-DNA electrostatics from explicit solvent molecular dynamics simulations. Journal of the American Chemical Society, 129(19):6060–1, 2007. [192] C. Sbert and A. F. Sol´. 3d curves reconstruction based on deformable models. Journal e of Mathematical Imaging and Vision, 18(3):211–223, 2003. [193] M. Schaefer and M. Karplus. A comprehensive analytical treatment of continuum electrostatics. Journal of Physical Chemistry, 100(5):1578–1599, 1996. [194] M. J. Schnieders, N. A. Baker, P. Ren, and J. W. Ponder. Polarizable atomic multipole solutes in a Poisson-Boltzmann continuum. Journal of Chemical Physics, 126:124114, 2007. [195] D. Sept, A. H. Elcock, and J. A. McCammon. Computer simulations of actin polymerization can explain the barbed-pointed end asymmetry. Journal of Molecular Biology, 294(5):1181–1189, 1999. 209 [196] D. Sept and J. A. McCammon. Thermodynamics and kinetics of actin filament nucleation. Biophysical Journal, 81(2):667–674, 2001. [197] J. A. Sethian. Evolution, implementation, and application of level set and fast marching methods for advancing fronts. Journal of Computational Physics, 169(2):503–555, 2001. [198] Y. Y. Sham, I. Muegge, and A. Warshel. The effect of protein relaxation on charge-charge interactions and dielectric constants of proteins. Biophysical Journal, 74(4):1744–1753, 1998. [199] K. A. Sharp and B. Honig. Calculating total electrostatic energies with the nonlinear Poisson-Boltzmann equatlon. Journal of Physical Chemistry, 94:7684–7692, 1990. [200] K. A. Sharp and B. Honig. Electrostatic interactions in macromolecules - theory and applications. Annual Review of Biophysics and Biophysical Chemistry, 19:301–332, 1990. [201] E. Sigfridsson and U. Ryde. Comparison of methods for deriving atomic charges from the electrostatic potential and moments. J. Comput. Chem., 19(4):377–395, 1998. [202] A. C. Simmonett, A. T. B. Gilbert, and P. M. W. Gill. An optimal point-charge model for molecular electrostatic potentials. Mol. Phys., 103:2789–2793, 2005. [203] T. Simonson. Macromolecular electrostatics: continuum models and their growing pains. Current Opinion in Structural Biology, 11(2):243–252, 2001. [204] T. Simonson. Electrostatics and dynamics of proteins. Reports on Progress in Physics, 66(5):737–87, 2003. [205] U. C. Singh and P. A. Kollman. An approach to computing electrostatic charges for molecules. J. Comput. Chem., 5:129–1453, 1984. [206] P. Smereka. Semi-implicit level set methods for curvature and surface diffusion motion. Journal of Scientific Computing, 19(1):439–456, 2003. [207] P. Smereka. The numerical approximation of a delta function with application to level set methods. Journal of Computational Physics, 211(1):77–90, 2006. [208] N. Sochen, R. Kimmel, and R. Malladi. A general framework for low level vision. Image Processing, IEEE Transactions on, 7(3):310–318, 1998. 210 [209] J. M. Soler. The siesta method for ab initio order-n materials simulation. J. Phys.: Condens. Matter, 14:2745– 2779, 2002. [210] Y. Song, Y. Zhang, C. L. Bajaj, and N. A. Baker. Continuum diffusion reaction rate calculations of wild-type and mutant mouse acetylcholinesterase: Adaptive finite element analysis. Biophysical Journal, 87(3):1558–66, 2004. [211] Y. Song, Y. Zhang, T. Shen, C. L. Bajaj, J. A. McCammon, and N. A. Baker. Finite element solution of the steady-state Smoluchowksi equation for rate constant calculations. Biophysical Journal, 86(4):2017–2029, 2004. [212] R. S. Spolar, J. H. Ha, and M. T. Record Jr. Hydrophobic effect in protein folding and other noncovalent processes involving proteins. Proceedings of the National Academy of Sciences of the United States of America, 86(21):8382–8385, 1989. [213] F. H. Stillinger. Structure in aqueous solutions of nonpolar solutes from the standpoint of scaled-particle theory. J. Solution Chem., 2:141 – 158, 1973. [214] A. J. Stone and M. Alderton. Distributed multipole analysis methods and applications. Mol. Phys., 56:1047–1064, 1985. [215] Y. H. Sun, P. R. Wu, G. W. Wei, and G. Wang. Evolution operator based single-step method for image processing. Int. J. Biomed. Imaging, 83847:1–27, 2006. [216] J. M. J. Swanson, R. H. Henchman, and J. A. McCammon. Revisiting free energy calculations: A theoretical connection to MM/PBSA and direct calculation of the association free energy. Biophysical Journal, 86(1):67–74, 2004. [217] J. M. J. Swanson, J. Mongan, and J. A. McCammon. Limitations of atom-centered dielectric functions in implicit solvent models. Journal of Physical Chemistry B, 109(31):14769–72, 2005. [218] Y. Takano and K. N. Houk. Benchmarking the conductor-like polarizable continuum model (cpcm) for aqueous solvation free energies of neutral and ionic organic molecules. Journal of Chemical Theory and Computation, 1(1):70–77, 2005. [219] C. Tan, L. Yang, and R. Luo. How well does Poisson-Boltzmann implicit solvent agree with explicit solvent? A quantitative analysis. Journal of Physical Chemistry B, 110(37):18680–18687, 2006. 211 [220] J. J. Tan, W. Z. Chen, and C. X. Wang. Investigating interactions between HIV-1 gp41 and inhibitors by molecular dynamics simulation and MM-PBSA/GBSA calculations. Journal of Molecular Structure: THEOCHEM, 766(2-3):77–82, 2006. [221] Z. J. Tan and S. J. Chen. Electrostatic correlations and fluctuations for ion binding to a finite length polyelectrolyte. Journal of Chemical Physics, 122:044903, 2005. [222] M. Tanaka and A. Y. Grosberg. Giant charge inversion of a macroion due to multivalent counterions and monovalent coions: Molecular dynamics study. Journal of Chemical Physics, 115(1):567–574, 2001. [223] C. L. Tang, E. Alexov, A. M. Pyle, and B. Honig. Calculation of pKas in RNA: On the structural origins and functional roles of protonated nucleotides. Journal of Molecular Biology, 366(5):1475–96, 2007. [224] D. J. Tannor, B. Marten, R. Murphy, R. A. Friesner, D. Sitkoff, A. Nicholls, M. Ringnalda, W. A. Goddard, and B. Honig. Accurate first principles calcualtion of molecular charge distribution and solvation energies from ab initio quantum mechanics and continuum dielectric theory. J. Am. Chem. Soc., 116:11875 – 11882, 1994. [225] I. Terekhova, A. O. Romanova, R. S. Kumeev, and M. V. Fedorov. Selective na+/k+ effects on the formation of α-cyclodextrin complexes with aromatic carboxylic acids: Competition for the guest. J. Phys. Chem. B, 114(37):12607–2613, 2010. [226] H. Tjong and H. X. Zhou. GBr6NL: A generalized Born method for accurately reproducing solvation energy of the nonlinear Poisson-Boltzmann equation. Journal of Chemical Physics, 126:195102, 2007. [227] J. Tomasi, B. Mennucci, and R. Cammi. Quantum mechanical continuum solvation models. Chem. Rev., 105:2999–3093, 2005. [228] J. Tomasi and M. Persico. Molecular interactions in solution: An overview of methods based on continuous distributions of the solvent. Chem. Rev., 94:2027–2094, 1994. [229] V. Tsui and D. A. Case. Molecular dynamics simulations of nucleic acids with a generalized Born solvation model. Journal of the American Chemical Society, 122(11):2489– 2498, 2000. [230] V. Tsui and D. A. Case. Calculations of the absolute free energies of binding between RNA and metal ions using molecular dynamics simulations and continuum electrostatics. Journal of Physical Chemistry B, 105(45):11314–11325, 2001. 212 [231] D. M. Tully-Smith and H. Reiss. Further development of scaled particle theory of rigid sphere fluids. Journal of Chemical Physics, 53(10):4015–25, 1970. [232] A. Vitalis, N. A. Baker, and J. A. McCammon. ISIM: A program for grand canonical Monte Carlo simulations of the ionic environment of biomolecules. Molecular Simulation, 30(1):45–61, 2004. [233] A. Vitalis and R. V. Pappu. ABSINTH: a new continuum solvation model for simulations of polypeptides in aqueous solutions. Journal of Computational Chemistry, 30(5):673–99, 2009. [234] R. C. Wade, R. R. Gabdoulline, and F. De Rienzo. Protein interaction property similarity analysis. International Journal of Quantum Chemistry, 83(3-4):122–127, 2001. [235] J. A. Wagoner and N. A. Baker. Assessing implicit models for nonpolar mean solvation forces: the importance of dispersion and volume terms. Proceedings of the National Academy of Sciences of the United States of America, 103(22):8331–6, 2006. [236] A. Wallquist and B. J. Berne. Computer-simulation of hydrophobic hydration forces stacked plates at short-range. Journal of Physical Chemistry, 99:2893–2899, 1995. [237] M. L. Wang and C. F. Wong. Calculation of solvation free energy from quantum mechanical charge density and continuum dielectric theory. J. Phys. Chem. A, 110:4873– 4879, 2006. [238] M. L. Wang, C. F. Wong, J. H. Liu, and P. X. Zhang. Efficient quantum mechanical calculation of solvation free energies based on density functional theory, numerical atomic orbitals and poissoncboltzmann equation. Chemical Physics Letters, 442:464– 467, 2007. [239] A. Warshel and A. Papazyan. Electrostatic effects in macromolecules: fundamental concepts and practical modeling. Current Opinion in Structural Biology, 8(2):211–7, 1998. [240] A. Warshel, P. K. Sharma, M. Kato, and W. W. Parson. Modeling electrostatic effects in proteins. Biochimica et Biophysica Acta (BBA) - Proteins & Proteomics, 1764(11):1647–76, 2006. [241] J. Warwicker and H. C. Watson. Calculation of the electric potential in the active site cleft due to alpha-helix dipoles. Journal of Molecular Biology, 157(4):671–9, 1982. 213 [242] G. W. Wei. Generalized Perona-Malik equation for image restoration. IEEE Signal Processing Letters, 6(7):165–7, 1999. [243] G. W. Wei. Differential geometry based multiscale models. Bulletin of Mathematical Biology, 72:1562 – 1622, 2010. [244] G. W. Wei and Y. Q. Jia. Synchronization-based image edge detection. Europhysics Letters, 59(6):814, 2002. [245] G. W. Wei, Y. H. Sun, Y. C. Zhou, and M. Feig. Molecular multiresolution surfaces. arXiv:math-ph/0511001v1, pages 1 – 11, 2005. [246] P. Weinzinger, S. Hannongbua, and P. Wolschann. Molecular mechanics PBSA ligand binding energy and interaction of efavirenz derivatives with HIV-1 reverse transcriptase. Journal of Enzyme Inhibition and Medicinal Chemistry, 20(2):129–134, 2005. [247] T. J. Willmore. Riemannian Geometry. Oxford University Press, USA, 1997. [248] K. Wolfgang. Differential Geometry: Curves-Surface-Manifolds. American Mathematical Society, 2002. [249] G. Xu, Q. Pan, and C. L. Bajaj. Discrete surface modeling using partial differential equations. Computer Aided Geometric Design, 23(2):125–145, 2006. [250] M. Xu and S. L. Zhou. Existence and uniqueness of weak solutions for a fourth-order nonlinear parabolic equation. JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 325:636–654, 2007. [251] A. S. Yang, M. R. Gunner, R. Sampogna, K. Sharp, and B. Honig. On the calculation of pK(a)s in proteins. Proteins-Structure Function and Genetics, 15(3):252–265, 1993. [252] S. Yu, W. Geng, and G. W. Wei. Treatment of geometric singularities in implicit solvent models. Journal of Chemical Physics, 126:244108, 2007. [253] S. Yu and G. W. Wei. Three-dimensional matched interface and boundary (MIB) method for treating geometric singularities. Journal of Computational Physics, 227:602–632, 2007. [254] S. Yu, Y. Zhou, and G. W. Wei. Matched interface and boundary (MIB) method for elliptic problems with sharp-edged interfaces. Journal of Computational Physics, 224(2):729–756, 2007. 214 [255] D. Zhang, J. Suen, Y. Zhang, Z. Radic, P. Taylor, M. Holst, C. Bajaj, N. A. Baker, and J. A. McCammon. Tetrameric mouse acetylcholinesterase: Continuum diffusion rate calculations by solving the steady-state Smoluchowski equation using finite element methods. Biophysical Journal, 88(3):1659–65, 2005. [256] Y. Zhang, G. Xu, and C. Bajaj. Quality meshing of implicit solvation models of biomolecular structures. Computer Aided Geometric Design, 23(6):510–30, 2006. [257] S. Zhao. High order matched interface and boundary methods for the helmholtz equation in media with arbitrarily curved interfaces. J. Comput. Phys., 229:3155–3170, 2010. [258] S. Zhao and G. W. Wei. High-order FDTD methods via derivative matching for Maxwell’s equations with material interfaces. Journal of Computational Physics, 200(1):60–103, 2004. [259] Y. C. Zhou, M. Feig, and G. W. Wei. Highly accurate biomolecular electrostatics in continuum dielectric environments. Journal of Computational Chemistry, 29:87–97, 2008. [260] Y. C. Zhou and G. W. Wei. On the fictitious-domain and interpolation formulations of the matched interface and boundary (MIB) method. Journal of Computational Physics, 219(1):228–246, 2006. [261] Y. C. Zhou, S. Zhao, M. Feig, and G. W. Wei. High order matched interface and boundary method for elliptic equations with discontinuous coefficients and singular sources. Journal of Computational Physics, 213(1):1–30, 2006. [262] Z. Zhou, P. Payne, M. Vasquez, N. Kuhn, and M. Levitt. Finite-difference solution of the Poisson-Boltzmann equation: complete elimination of self-energy. Journal of Computational Chemistry, 17:1344–1351, 1996. [263] J. Zhu, E. Alexov, and B. Honig. Comparative study of generalized Born models: Born radii and peptide folding. Journal of Physical Chemistry B, 109(7):3008–22, 2005. 215