MULTI - SCALE SIMULATION OF ELECTROSTATIC CHANNELING By Yuanchao Liu A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Materials Science and Engineering Doctor of Philosophy 2018 ABSTRACT MULTI - SCALE SIMULATION OF ELECTROSTATIC CHANNELING By Yuanchao Liu One - pot multistep catalysis, also called tandem catalysis, denotes stepwise ch emical reactions over sequential active sites in a single vessel. Such an approach enables efficient synthesis of various product molecules from simple precursors, and complete utilization of the energy stored in each chemical bond. Therefore, such cascade catalysis is of great significance to the manufacture of fine chemicals, the production of pharmaceutical intermediates, and electrocatalytic devices. A key factor limiting these processes is the mass transport of reaction intermediates, which in nature i s found to be largely facilitated by substrate channeling, in which intermediate molecules are transferred directly to a subsequent active site instead of equilibrating to bulk environment. Electrostatically bound diffusion represents one such channeling m echanism, in which charged intermediates are transported along an oppositely charged pathway. Naturally occurring electrostatic channeling has been studied for decades, but the challenge of controlling molecular - level interactions along with catalytic kin etics has hindered its application to artificial cascades. In this work, multi - scale simulations by a combination of molecular dynamics (MD) and kinetic Monte Carlo (KMC) are utilized to quantify the overall kinetics of artificial cascades, aiming to furth er explore the channeling mechanism and reveal potential limitations for future cascade design. Taking advantage of this hierarchical model, the molecular complexity may be fully considered and a wide range of time and length scales can be effective covere d to bridge the gap between microstructures and kinetic events. Specifically, a hopping surface diffusion mode is demonstrated by molecular dynamics simulation, wherein charged intermediate molecules are shuttled along a cationic oligopeptide bridge that i s proposed to covalently conjugate hexokinase (HK) and glucose - 6 - phosphate dehydrogenase (G6PDH). Strong experimental evidence for the occurrence of electrostatic channeling is provided by ionic - strength dependent studies , via both simulations and experime ntal stop - flow lag time analysis. Specifically, simulations suggest that a balance between surface adsorption and diffusion is required for optimal channeling efficiency. To further quantify the energy associated with each elementary step in channeling pro cess, advanced sampling methods are employed to study interactions on the cascade surface. Transition state theory is used to calculate the hopping energy barrier via a temperature - dependen t study of hopping frequency. Desorption energy is calculated by um brella sampling, further revealing ionic strength - independent Stern layer diffusion under the protection of an ionic strength - dependent diffuse layer. Intermediate hopping from the peptide bridge to its binding site on G6PDH is analyzed by transition pathw ay theory, which provides sufficient sampling of the interactive pathways on this 2D flexible surface. The leakage in this process is evaluated by probability analysis. Finally, overall cascade kinetics is quantified by the KMC method , integrat ing the MD and experiment parameters. The KMC model enables direct comparison with stop - flow lag time analysis, by evaluating the product evolution over the entire experimental time scale, including the pre - steady state. The KMC results provide good agreement with experiment in terms of ionic strength dependence. Moreover, KMC reveals several key parameters limiting overall cascade kinetics, including the strength of surface interaction s , the length of channeling pathway , and the energy barrier around the artificial interface of synthetic cascade. Copyright by YUANCHAO LIU 2018 v This dissertation is dedicated to my family . vi ACKNOWLEDGEMENTS Firstly, I would like to express my sincere gratitude to my advisor , Dr. Scott Calabrese Barton , for his continuous support on my Ph.D . study and research, for his patience, motivation, and immense knowledge. He is a good mentor, actively guiding me towards independent thinking and positive communication; He is a good collaborator, proficiently working with me on a great deal of code work on our attempts to new modeling tools; He is also a good friend, always sharing various social/academic activities and much fun stuff with us. Secondly, I gratefully acknowledge the funding support from the Army Research Office MURI (#W911NF1410263) via the University of Utah. My sincere thanks also goes to our project partners and their co - authorship on my peer - reviewed publication s . Specifically, great thanks to Dr. Shelley D. Minteer, Dr. Matthew S. Sigman and Dr. David P. Hickey from University of Utah, for their support of experimental works. Thanks also go to Dr. Plamen Atanassov and Dr. Ivana Matanovi c from University of New Mexico, for their help on our simulation work. Besides my advisor, I would like to thank the rest of my committee members : Dr. Yue Qi, Dr. Wei Lai, Dr. Alex Dickson and Dr. Richard Lunt, for their insightful comments and encouragement, but also for the hard question which incented me to widen my research from various perspectives. Particular thanks to Dr. Alex Dickson for his great help on the advanced sampling technique s in my dissertation. Last but not the least, I would like to thank my parents for raising me up, giving me an optimistic and positive attitude towards life, and shaping me as a strong and independent vii personality . I would also like to thank my girlfriend for her companionship and care during my doctoral care er. . C omputational work in support of this research was performed at Michigan State . T hanks to Dr. Xiaoge Wang and Dr. Chun - min Cheng for their help on software compiling and job management ; I thank my fello w labmates, Dr. Cenk Gumeci , Jacob Anibal , Alex Miraba, Kanchan Chavan and Erica Earl, for the stimulating discussions and for all the fun we have had together in the last four years. Lastly, I would to thank my friend s at MSU, including Yuxi Ma and Mingmin Wang. viii TABLE OF CONTENTS LIST OF TABLES ................................ ................................ ................................ .......................... x LIST OF FIGURES ................................ ................................ ................................ ....................... xi Chapter 1 Introduction ................................ ................................ ................................ ............... 1 1.1 Overview ................................ ................................ ................................ ...................... 1 1.2 Substrate Channeling in Cascade Catalysis ................................ ................................ .. 2 1.3 Experimental Measurement ................................ ................................ .......................... 6 1.4 Intermediate Transport in Electrostatic Channeling ................................ ..................... 8 1.5 Kinetic Quantification ................................ ................................ ................................ .. 9 1.5.1 Analytical methods ................................ ................................ ..................... 10 1.5.2 Continuum modeling ................................ ................................ .................. 13 1.5.3 Molecular simulation ................................ ................................ ................. 14 1.5.4 Kinetic Monte Carl o model ................................ ................................ ........ 17 1.6 Overview of Dissertation ................................ ................................ ............................ 20 Chapter 2 Surface Diffusion by Molecular Dynamics Simulation ................................ .......... 24 2.1 Introduction ................................ ................................ ................................ ................ 24 2.2 Simulation Methods ................................ ................................ ................................ .... 25 2.3 Results and Discussion ................................ ................................ ............................... 30 2.3.1 Surface hopping mechanism ................................ ................................ ...... 30 2.3.2 Impact of intermediate species ................................ ................................ ... 36 2.3.3 Balance between surface adsorption and mobility ................................ ..... 37 2.3.4 Qualitative comparison with experiment by stop - flow lag time analysis .. 40 2.4 Summary ................................ ................................ ................................ ..................... 45 2.5 Acknowledgements ................................ ................................ ................................ .... 46 Chapter 3 Quantification of Thermodynamic Parameters ................................ ....................... 47 3.1 Introduction ................................ ................................ ................................ ................ 47 3.2 Model Description ................................ ................................ ................................ ...... 48 3.2.1 Transition state theory study ................................ ................................ ...... 48 3.2.2 Umbrella sampling ................................ ................................ ..................... 49 3.3 Results and Discussion ................................ ................................ ............................... 53 3.3.1 Energy barrier of s urface hopping ................................ .............................. 53 3.3.2 Desorption energy ................................ ................................ ...................... 55 3.4 Summary ................................ ................................ ................................ ..................... 58 3.5 Acknowledgements ................................ ................................ ................................ .... 58 Chapter 4 Transition Pathway from Bridge to Binding Pocket ................................ ............... 59 4.1 Introduction ................................ ................................ ................................ ................ 59 4.2 Model Description ................................ ................................ ................................ ...... 60 4.2.1 Selection of complex structure ................................ ................................ ... 60 4.2.2 Hybrid topology at enzyme/linker interface ................................ .............. 61 4 .2.3 Intermediate initialization on complex structure ................................ ........ 63 ix 4.2.4 Markov State Model on transition pathway on flexible 2D surface .......... 64 4.3 Results and Discussion ................................ ................................ ............................... 67 4.3.1 Channeling process ................................ ................................ .................... 67 4.3.2 Transition pathway analysis ................................ ................................ ....... 73 4.4 Summary ................................ ................................ ................................ ..................... 79 4.5 Acknowledgements ................................ ................................ ................................ .... 80 Chapter 5 Cascade Kinetics by Kinetic Monte Carlo Method ................................ ................ 81 5.1 Introduction ................................ ................................ ................................ ................ 81 5.2 Model Description ................................ ................................ ................................ ...... 83 5.2.1 Kinetic Monte Carlo model ................................ ................................ ........ 83 5.3 Results and Discussion ................................ ................................ ............................... 88 5.3.1 Cascade Kinetics by KMC model ................................ .............................. 88 5.3.2 Quantitative comparison with experiment via ionic strength dependence . 90 5.3.3 Model extension to multistep cascade ................................ ........................ 93 5.4 Summary ................................ ................................ ................................ ..................... 94 Chapter 6 Conclusion ................................ ................................ ................................ .............. 96 APPENDICES ................................ ................................ ................................ ............................ 100 APPENDIX A. Method of counting hopping fre quency ................................ .............. 100 APPENDIX B. Various intermediates performance on LYS peptide surface .............. 101 APPENDIX C. Vector transformation and dimension reduction ................................ . 102 APPENDIX D. Code repository ................................ ................................ ................... 105 APPENDIX E. Mono enzyme kinetics ................................ ................................ ......... 106 REFERENCES ................................ ................................ ................................ ........................... 108 x LIST OF TABLES Table 1 Kinetic Monte Carlo (KMC) vs. Molecular Dynamics (MD) ................................ ......... 18 Table 2 Energy barriers and corresponding rate constants. ................................ .......................... 57 Table 3 Comparison of experimental G6PDH 33 and available crystal structures. ........................ 62 Table 4 Energy barriers, related rate constants and cascade kinetics. ................................ .......... 78 Table 5 KMC parameters. ................................ ................................ ................................ ............. 86 Table 6 Properties of Oxalate intermediates channeled by Lys - Ala peptides. ........................... 101 xi LIST OF FIGURES Figure 1 - 1 Conceptual cluster of various catalysts anchored on a DNA scaffold. (a) Tartronic acid oxidation to mesoxalic acid at TE MPO catalyst; (b) facilitated transport of mesoxalic acid along DNA; (c) oxidation of mesoxalic acid to oxalic acid via catalysis by PtRu nanoparticle AldDH adduct. ................................ ................................ ................................ ................................ 2 Figure 1 - 2 (a) Three dimensional diffusion field near a point source at steady state ( t = 10 4 ), where k is reaction rate and D is diffusivity. (b) Comparison of experimental turnover frequency of free enzyme coupl e and covalently conjugated enzyme couple. 12,14 ................................ .......... 4 Figure 1 - 3 Mechanisms of substrate channeling in nature. 7,16,18,19 (a) The crys tal structure of traptophan synthase (PDB:1A5S) with - subunit shown in yellow and - subunit shown in green. (b) The crystal structure of malate dehydrogenase citrate synthase with possible channeling pathway marked in dashed yellow line. (c) The crystal s tructure of eukaryotic pyruvate dehydrogenase complex with E1 enzymes in yellow, E2 core structures in green and linker in between marked in blue. ................................ ................................ ................................ ................. 5 Figure 1 - 4 Reaction scheme and example data for transient time ( ) analysis. 7 ............................ 8 Figure 1 - 5 Schematic of two sequential enzyme - catalyzed reactions (diffusion limited) in continuum model. 35 (a) Base model without bridge between two spherical active sites. (b) Model with volumetric bridge. (c) Electrostatic model with positive charged bridge. (d) Adsorption model with surface diffusion on bridge. ................................ ................................ ....................... 14 Figure 1 - 6 Time and length scale of different simulations techniques. 42 ................................ ..... 16 Figure 1 - 7 Rare event in Molecular Dynamics and Kinetic Monte Carlo. 36 ................................ 19 Figure 2 - 1 (a) Intermediate species. (b) Amino acid side chain species. (c - e) electrostatic potential m ap of Arginine, Lysine and Histidine. (f - h) a - helix Lysine peptide with various charge density. ................................ ................................ ................................ ................................ .......... 31 Figure 2 - 2 Radial distribution function (RDF) diagrams of oxalate around the peptide surface and the corresponding diagrams for oxygen atoms and carbon atoms individually (inset). ......... 33 Figure 2 - 3 (a) Short - range Coulombic energy diagrams between oxalate ( - 2) and peptide composed of combinations of alanine and arginine, Lysine, and Histidine. The shaded area in Lys peptide figure (red) represents a frequent surface diffusion process. (b) Representative MD simulations corresponding to the plateau region in the Coulombic energy diagram between ARG side chains (blue) and oxalate (red). ................................ ................................ ............................. 34 Figure 2 - 4 System configuration and coulombic energy diagram for glucose 6 - phosphate (a), xii Adsorption time fraction and hopping times of intermediate molecules on peptides with various Lys fraction over the entire 50 ns simulation. The bar marked by red arrows corresponds to the coulombic diagram in above figures. Numerical data can be found in APPENDIX B ................ 36 Figure 2 - 5 Effect of the Lys fraction (a, d) and thermodynamic adsorption energy, , (b, e) on surface diffusivity, , and adsorption time fraction, . Effect of on transport efficiency, characterized by els simulations for each individual system. ................................ ................................ ........................ 39 Figure 2 - 6 (a) Illustration of the proposed channeling complex using a poly(lysine) bridge as an electrostatic surface between hexokinase (HK; PDB 3VF6) and glucose - 6 - phosphate dehydrogenase (G6PDH; PDB 4LGV). (b) Experimental reaction scheme used to study electrostatic channeling of the charged intermediate (glucose - 6 - phosphate) across a cationic peptide bridge. (c) Sample absorbance plot highlighting the determination of experimental lag e (K5), neutral bridge (G5), or free enzymes. 33 ................................ ................................ ................................ ................................ ..... 41 Figure 2 - 7 Lag time of the free enzymes and 5xLysine (K5), 15xLysine (K15), 5xGlycine (G5), and 15xGlycine (G15) complexes using 1.4 mM citrate with variable ionic strength, where the ionic strength is controlled by the concentration of NaCl. Experiments were performed by adding one standard deviation; (*) 0.05 > p > 0.01; (**) 0.01 > p > 0.001; ( ***) 0.001 > p. [ref] .......... 43 Figure 2 - 8 (a) MD Simulation on ionic strength, based on the system with G6P and Lys peptide. The theoret - calculated from Lys - 1Ala peptide (K6 - - 7, and - 9. ................................ ........................... 44 Figure 3 - 1 Poly - lysine peptide as an electrostatic bridge to transfer reaction intermediate (G6P, glucose 6 - phosphate) from HK (hexokinase) to G6PDH (glucose 6 - phosphate dehydrogenase).22 ................................ ................................ ................................ ................................ ....................... 48 Figure 3 - 2 (a) System configuration of a dual association mode. (b) An assumed desorpti on state by pulling G6P 1.5 nm away from the surface. ................................ ................................ ............ 51 Figure 3 - 3 Gromacs biased potential for umbrella sampling. (a) gmx_distance restraints the pull group at a specific distance away from the reference group. (b) gmx_direction restraints the pull group at a plane perpendicular to given vector and also a specific distance away from reference gmx_direction area above reference group. The x - axis for all figures are at the same scale to y - axis. ................ 52 Figure 3 - 4 Two - dimensional PMF calculated by Grossfield WHAM code. ................................ 53 Figure 3 - 5 (a) Short - range coulom bic diagram showing G6P hopping as indicated by energy fluctuation between single - and dual - association configurations. (c) Temperature dependence of surface hopping rate. (d) Ionic strength dependence on hopping energy barrier and frequency factor. ................................ ................................ ................................ ................................ ............ 55 xiii Figure 3 - 6 (a) Biased probability distribution of G6P molecule in each window. (b) 1 - D potential of mean force by Umbrella Sampling. The points shows the biased energy minimum for each sampling window. ................................ ................................ ................................ ......................... 56 Figure 4 - 1 Proposed channeling pathway from the last dual association site on poly - Lysine ................................ .............................. 60 Figure 4 - 2 Chemical structure of linker molecule capped with CYS residue on each side, CYS - (BM - (PEG) 2 ) - CYS. The whole complex structure was processed by CGenFF to obtain the topology parameters for the interface between standard residue segment shaded by blue color and the non - standard residue segment shaded in red color. In actual MD simulation, the topology for pure standard residue segment was taken from CHARMM, and CGenFF parameter s were applied to the pure non - standard segment and the interface topology. ................................ ......... 63 Figure 4 - 3 The results of molecular docking simulat ions for the binding of G6P to G6PDH (circled positions) in the area between LYS bridge and G6P binding pocket on G6PDH. (a) Search boxes. (b) Several favorable binding sites were seemed as potential kinetic trapping spots in short MD simulation and were a void in initializing the parallel simulations. .......................... 64 Figure 4 - 4 An example of transition matrix, where white color stands for zero e lement and blue color stands for non - zero items. In this case, there are 10,625 non - zero elements out of the total 250,000 items, resulting in a ~ 4.25% occupancy of the matrix. ................................ .................. 66 Figure 4 - 5 Simplified energy profile from bridge to G6P binding site on G6PDH. Point a is the last dual association site on peptide bridge; point c is the transition state at the energy barrier between bridge and G6PDH; point b and d are the two states slightly devi ating from the transition state. ................................ ................................ ................................ .............................. 69 Figure 4 - 6 (a) System configuration when G6P was released around point c in Figure 4 - 5 . (b) System configuration showing the distribution of G6P intermediate (green dots) in 500 parallel simulations. The complex structure is the 0 ns frame and the G6P molecules position are taken from the last frame of each parallel simulation. The coordina tes are rotated to make the G6PDH secondary structure fit that of first frame. ................................ ................................ ..................... 70 Figure 4 - 7 Probability results as a function of number of parallel simulations. The resulting probability depends on the intermediate releasing point. In this set of simulations, G6P molecule was released at a single point around the perfect transition state area, and then parallel simulations was cond ucted to study the convergence of probabilities. ................................ ........ 71 Figure 4 - 8 (a) Ionic strength dependence of leaking probability, showing the probability of to sample the rare leaking event. (b) The ration of desorption and channeling events in Fig.3b. 72 Figure 4 - 9 Transition pathway visualization at 0 mM (a) and 120 mM (b). The node size is based on the stationary populations of each state. The colors indicates the three - basi n committor probabilities to the pre - defined basins of desorption states (red), peptide bridge states (blue) and binding pocket states (green). (c) Triangle color bar. (d) Triangle color bar with grid and probability labels. ................................ ................................ ................................ .......................... 75 xiv Figure 4 - 10 Two - basin committor probabilities for transition pathways visualization at 0 mM (a) and 120 mM (b). As compared to Figure 4 - was not defined when calculating the committor probabilities, but the corresponding states were still recognized as an visualized. (c) Triangle color bar. (d) Tria ngle color bar with grid and probability labels. ......... 76 Figure 4 - 11 Comparison between the leakage calculated by transition state theory (TST) and transition path theory. (a) Leaking probability represented by desorption probability and th e probability to reach the G6PDH binding pocket. (b) Energy difference between transition state to bulk environment, corresponding to in Figure 4 - 5. ................................ ................................ 79 Figure 5 - 1 Schematic diagram of kinetic Monte Carlo model. E1 and E2 are two active sites. Sites 1 - 4 are discrete hopping sites representing the dual association sites on peptide bridge. Bulk are the environment with changing inter mediate concentration. is the Michaelis constant and values are the rate constants for various events, including for turnover frequencies, for desorption from bridge, for hopping on bridge, for hopping from enzyme - 1 to bridge , for hopping from bridge to enyme - 2. ................................ ................................ ........ 84 Figure 5 - 2 (a) A example of simulated stop - flow lag time analysis, showing the prod uct evolution under different channeling conditions. (b) The evolution of bulk intermediate. .......... 89 Figure 5 - 3 Plot of leaking probab ility based on equation 5 - 12. ................................ ................... 91 Figure 5 - 4 (a) Comparison of experimental lag time and KMC lag time. K5 and K15 only consider the leakage on LYS bridge. K5+E2 and K15+E2 involve the leakage from bridge to G6PDH pocket. (b) Time course of surface hopping on bridge. (c) Time course of leakage on bri dge. ................................ ................................ ................................ ................................ ........... 92 Figure 5 - 5 Product and intermediate evolution in three step KMC model (Free standing system). All parameters for bridge - 2 and enzyme - 3 are the same with that of bridge - 1 and enzyme - 2, respectively. ................................ ................................ ................................ ................................ .. 94 1 Chapter 1 Introduction 1.1 Overview One - pot multistep catalysis, also called tandem catalysis, involves a series of catalysts in a single vessel. 1 As compared to conventional stepwise reaction, tandem catalysis eliminates the isolation and purification process es that usu ally requires high process costs, result in yield losses, and generate waste. Additionally, these cooperative reactions can effectively protect unstable intermediates and reduce unnecessary side reactions. As a result, multistep catalysis enables high chem ical conversion efficiency, making complex product molecules from simple and accessible precursors. Therefore, tandem catalysis is of great significance to the manufacture of fine chemicals and the production of pharmaceutical intermediates. 2 Taking advantage of complete energy extraction from various chemical bonds in complex molecule structure s , multistep reactions also have great potential to be applied to electrocataly tic device s , such as biofuel cells. 3 Multi - enzyme biofuel cells demonstrate a higher energy (1.4~3 fold) deliver y due to the deep oxidation by tandem catalysis. 4,5 Figure 1 - 1 sho ws a n example cascade of various catalysts anchored on a DNA scaffold, including metallic, organic, and bio/enzymatic catalyst s . Besides finding proper catalysts and the conditions that fully functionalize each reaction site 1,6 , the mass transport of reaction intermediate s is a key factor impact ing the overall efficiency of this tandem process . 7 D ue to Brownian effect s , the produced intermediate molecules tend to move in a random and non - directional mode. In spite of a clusteri ng of sequential catalysts, intermediate molecules still have a high probability to equilibrate to bulk media, where a side reaction and unfavorable binding 2 can occur. Therefore, the flux control of reaction intermediates is a key challenge to the developm ent of advanced catalytic cascades . Figure 1 - 1 Conceptual cluster of various catalysts anchored on a DNA scaffold. (a) Tartronic acid oxidation to mesoxalic acid at TEMPO catalyst; (b) facilitated transport of mesoxalic acid along DNA; (c) oxidation of mesoxalic acid to oxalic acid via catalysis by PtRu nanoparticle AldDH adduct. 1.2 Sub s trate C hanneling i n C ascade C atalysis Over billions of years of evolution, nature has developed very efficient catalytic pathways to perform chemical reactions in the cell stepwise , utilizing transient super - molecular complexes termed metabolons. 8,9 By these one - pot multistep catalys e s , a wide range of complex biomolecules are synthesized from simple precursors, and energy is gener ated from carbohydrate oxidation through a metabolic pathway. Although the cell has a complex chemical environment, these well - defined cascades are able to maintain high catalytic efficiency and prevent the undesired bonding of reaction intermediates to un productive active sites. The key was found to be substrate channeling, which is defined as the direct transfer of reaction intermediates from one catalytic active site to the next , without equilibrating into bulk media. 7,10,11 Therefore , substrate channeling may have promising applications in chemical manufacturing and electrocataly tic devices. For example, with cascade catalysis, energy devices based on organic fuels may accommodate more fuel categories and more fully utilize the energy stored in high molecular 3 weight fuels. 3 These features enable biofuel cells with high energy density and biosensors with enhanced sensitivity. 4,5 Under most con ditions, intermediate diffusion is fast compared to reaction rate, making proximity alone insufficient to realize effective channeling. To illustrate this, the product concentration around a single active site is calculated by equation 1 - 1 , 12 where c is the concentration at radial distance , r , from a point active site , t is time, D is diffusion coefficient and is the average time between reaction events ( k - 1 , where k is turnover frequency). 1 - 1 Figure 1 - 2 shows resulted concentration profile at steady state ( t = 10 4 ) with different k/D values. In a typical system with the diffusion of small molecule in aqueous solution, the k/D value is 0.01 m 2 . As shown i n Figure 1 - 2 , the spatial distribution of product concentration is quite uniform in most conditions, indicating that proximity alone has very little impact on the local concentration of reaction intermediate in cascade reactions. Brownian dynamics simulations also ind icate that the concentration profile for most small intermediate molecule is raised only to a distance around 1 nm from a neutral catalytic active site. 13 Recent experiment al work on covalently conjugated enzymes also indicate d that proximity alon e did not have impact on the activity improvement, as shown in Figure 1 - 2 b. 14 These results suggest that channeling by proximity alone only works when sequential active sit es are extremely close to each other. However, intermediate diffusion can be enhanced and well - controlled t hrough the incorporation 4 of a functional surface or tunnel. For example, the channeling distance was proposed to be increased 10 - fold in the presen ce of electrostatic guidance. 15 Figure 1 - 2 (a) Three dimensional diffusion field near a point source at steady state ( t = 10 4 ), where k is reaction rate and D is diffusivity. (b) Comparison of experimental turn over frequency of free enzyme couple and covalently conjugated enzyme couple. 12,14 Figure 1 - 3 shows three natural channeling mechanisms, each featuring a functional surface or tunnel with reasonable spatial organization. A well - studied example of an intra - molecular tunnel is tryptopha n synthase, Figure 1 - 3 a, which produces tryptophan through a two - step reaction using indole as the intermediate. 16 The crystal structure reveals a hydrophobic tunnel (2.5 - 3 nm) connecting the two active sites, which physically confines the indole intermediate and transports it to the second active site. 17 As compared to intra - molecular tunneling, the pyruvate dehydrogenase complex ( Figure 1 - 3 b) creates a pathway by covalent bonding, where the acetyl group (intermediate) is transferred by a lipoamide swing arm. Electrostatic guidance refers to interactions between a charged interme diate and an oppositely charged pathway on cascade surface , and provides possibly the most straightforward channeling method for application in synthetic cascades. As shown in Figure 1 - 3 , the crystal structure of MD - CS (malate dehydrogenase citrate synthase) shows a positive surface (purple area) bridging 5 the two active site s , providing a diffusion path for negatively charged oxaloacetate (intermediate) . Figure 1 - 3 Mechanisms of substrate channeling in nature. 7,16,18,19 (a) The c rystal structure of t raptophan s ynthase (PDB:1A5S) with - subunit shown in yellow and - subunit shown in green. (b) The crystal structure of malate dehydrogenase citrate synthase with possible channeling pathway marked in dashed yellow line. (c) The crystal structure of eukaryotic pyruvate dehydrogenase complex with E1 enzymes in yellow, E2 core structures in green and linker in between marked in blue. In past decades, attempts were made to apply substrate channeling in artificial systems. 20 22 For example, multiple enzyme species were anchored on DNA scaffold to increase the effect of cluster and spatial orientation. 20 A good example of ar tificial chemical swing arm is a glucose - 6 - phoshate dehydrogenase (G6pDH) and malate dehydrogenase (MDH) cascade spatially organized on a DNA scaffold where a NAD+ functionalized single - stranded DNA swing arm is used to channel NADH to the next active site . 21 Examples of synthetic electrostatic channeling and intr a - molecular tunnel are quite limited. Local effective ness of reaction intermediates was studied by introducing favorable/unfavorable electrostatic interactions between intermediate and 6 c ascade scaffold. 23,24 A decreased catalytic efficiency was found with either too low or too high binding constants. So far, introducing substrate channeling artificially is complicated by both the precise molecular - level control and the lack of experiment methods t o detect the occurrence of such transient phenomenon (~ ns). 7 Therefore, new methodology (modeling/experiment) is gre atly needed to aid the design of channeling pathways and the quantification of overall catalytic kinetics. Natural channeling mechanisms utilize intramolecular tunnels 16,17,25 , chemical swing arms 19 , spatial organization 12,26,27 and electrostatic guidance 15,18,28 30 to facilitate bound diffusion of reaction intermediates. These mechanisms highlight a significant distinction between substrate channeling, which features bound or restricted diffusion, and active site proximity. 1.3 Experimental Measurement M odern biochemical analys e s of substrate channeling are indirect measurements, tracking the bulk intermediate evolution instead of the actual channeling molecules. Examples are transient time ( ) analys e s , isotope dilution and enrichment studies, cascade resista nce to a competing side reaction, and cascade resistance to a reaction inhibitor. 7 T ransient time analysis , as an exa mple, is defined as the time required to reach steady - state flux of reaction intermediate. If a reactant A is converted to intermediate B and finally converted to C , the prod uct time course can be observed graphically when only A is input to the sy stem , as shown in Figure 1 - 4 a . Transition time, also called lag time, is obtained by extrapolation of the steady state line back to the time axis. There are several explanation s of the physical meaning of lag time ; one o f them is the average time for intermediate to reach the sequential active site. From the statistical point of view, on the other hand, lag time is also the time required for bulk intermediate 7 concentration to reach a constant value , such that the reaction rate of the second site equal s that of the first site, and a steady state is achieved. For perfect channeling , is zero. With insufficient leaky channeling, tends to shift to the value for free standing enzyme couple , (section 1.5.1 ) dep ending on the extent of leakage . However, it is important to know that all these methods are indirect measurements. As a consequence, current experimental measurements work well with perfect channeling, but are less clear for leaky channeling and free diffusion. Also, owing to the pheno mena observed, it is difficult to evaluate the contribution of channeling compound to bulk diffusion. Therefore, substrate channeling is a complex phenomenon that is difficult to model or analyze by single technique. A combination of hierarchical modeling and experimental measurement would be preferred to understand channeling mechanisms, and further design and quantify cascade reactions. From the perspective of simulation, the model should be efficient but have enough complexity to capture molecular - level effect s and evaluate the contribution of individual event. This will be discussed in detail in Chapter 3. 8 Figure 1 - 4 Reaction scheme and example data for transient time ( ) analysis. 7 Another significant technique is the use of competing reaction s , where in bulk intermediates are consumed by a side reaction in bulk . The observed yield loss is thus related to the extent of substrate channeling, as shown in Figure 1 - 4 b. 31 When substrate channeling is high, the effect of competing reaction is minim al . Without channeling, the yield can be significantly reduced. Details of these analysis techniques will be described in the following simulation section. 1.4 Intermediate Transport in Electrostatic Channeling Electrostatic channeling is bounded diffusion of ch arged intermediates along an oppositely charged surface, a non - specific pathway applicable to any charged intermediate. It is a more general mechanism, as compared to structure - specific intramolecular tunnels and chemical swing arms. In past decades, elect rostatic channeling studies mainly focused on the bi - functional enzyme TS - DHFR (thymidylate synthase - dihydrofolate reductase) and the TCA cycle supercomplex MDH - CS (malate dehydrogenase - citrate synthase) , as shown in Figure 1 - 3 b . 9 Specifically, experimental work showed strong evidence of improved kinetics by lag time analysis. 31,32 The occurrence of substrate channeling was also supported by apparent resistance to competing bulk reactions. By structur al characterization ( Figure 1 - 3 ) , both the TS - DHFR and MD H - CS complexes were found to have positively charged surfaces located between the two active sites. 18,28 Moreover, the channeling efficiency could be significantly disabled by either a neutralization of the channeli ng pathway 22,29 or an increased io nic strength (IS) 29,33 . This evidence gives strong qualitative s upport for experimental results. Recently, we have reported the first cascade with artificially introduced electrostatic channeling (details in Chapter 2 and Chapter 3 ) , composed of hexokinase (HK) and glucose - 6 - phosphate dehydrogenase (G6PDH) covalently conjugated by a cationic oligopeptide bridge. 33 In this work, a hopping surface diffusion mode was built by molecular dynamics (MD) simulation, and studies of ionic strength dependence (MD and Exp.) on lag time provided strong evidence of the occurrenc e of electrostatic channeling. 1.5 Kinetic Q uantification K inetic quantification is a key to bridge the gap between microstructure information ( Figure 1 - 3 ) and kinetic macroscale phenomen a ( Figure 1 - 4 ). It not only helps to understand the channeling mechanism, but also accesses time and length scale s (~ ns and ~nm) inaccessible to current experiment technique s . More importantly, it is able to identify potential rate limitation s in elementary steps , thus providing guidelines for future cascade design. 10 1.5.1 Analytical methods Generally, analytical quantification is based on an assumption of steady - state conditions, where the intermediate flux is time independent, requiring an equivalent production and consumption of reaction intermediate. If denotes the reaction rate of fully saturated enzyme 1 (E1) and denotes the actual rate of enzyme 2 (E2) at steady state, their relationship can be expressed by equation 1 - 2 . Usually, ; otherwise the system will never reach steady state. 1 - 2 Michaelis - Menten kinetics is applied to a reversible binding/unbinding of substrate, S, to enzyme, E, followed by a non - reversible reaction to product, P. The whole process is shown by equation 1 - 3 , where , and are the rate constants for substrate desorption, adsorption and reaction, respectively. 1 - 3 In a free standing system, a s shown by equation 1 - 4 and 1 - 5 , product evolution rate on second enzyme, , can be correlated to the fully saturated E2 rate through intermediate concentration , , and Michaelis constant, . 1 - 4 11 1 - 5 By combining equation 1 - 2 and 1 - 4 , steady state intermediate concentration , , can be correlated to three ex perimental accessible parameters as shown by equation: 1 - 6 By definition, lag time for uncoupled enzymes ( Figure 1 - 4 a) , , is the time required for bulk intermediate concentration to reach a value making the reaction on E2 equal to that of E1, as shown by equation: 1 - 7 To introduce the turnover frequency (TOF), , as the rate constant for reaction, equation 1 - 7 can also be expressed by 1 - 8 Through equation 1 - 6 , 1 - 7 and 1 - 8 , the lag time can be derived as follow s : 1 - 9 1 - 10 12 Therefore, the lag time for free standing system ma i nly depends on the concentration of enzyme couple, the TOF difference between two enzyme species and the Michaelis constant of enzyme - 2 . This can be used as an indication of the maximum lag time for the study of a c hanneling system. When introducing substrate channeling, the lag time, , can be expressed as follows : 34 1 - 11 1 - 12 where is the probability for intermediate molecules to reach second enzyme through channeling, and is the reaction probability for readily adsorbed intermediates as compared to unbinding from E2. Since the first item in equation 1 - 11 is usually smal ler than 1 and is usually much smaller than , equation 1 - 11 can be simplified as follows : 1 - 13 As compared to free standing lag time, therefore, the lag time reduction of channeling system is only determined by the mass transport - related channeling efficiency, , and ligand binding/unbinding kinetics - related leakage, . 13 Although analytic methods can give a basic idea on the components of channeling parameters, it is not able to extimate the channeling efficiency, , the key parameter to overall kinetics. We a re therefore motivated to use computational techniques to obtain such estimation. 1.5.2 Continuum modeling Electrostatic channeling has been studied via continuum modeling, where the distribution of substrates/intermediates are represented by a continuous media. 15,35 With pre - determined spatial factors ( e.g. , boundary conditions) and system para meters ( e.g. , diffusivity and reaction rate ), the intermediates migrate under the impact of concentration gradient ( e.g. , external electric force field , as shown in Figure 1 - 5 . Based on this, a steady state flux is used to evaluate the channeling effect by long - range electrostatic interaction . reported the electrostatic confinement model with reflecting boundary conditions. The results of diffusion limited system show that the attractive electrostatic force between intermediate and both enzyme is able to give the maximal channeling efficiency, which is similar to a compartment model. 15 Recently, our group reported a mo re specific two - step cascade model with more detail and complexity , as shown in Figure 1 - 5 . By setting zero boundary conditions, the model is mimicking a competing reaction system ( Figure 1 - 4 b), where the yield percentage is employed to evaluated the channeling efficiency. In addition, by fully incorporating the r ange of diffusion and reaction rate constants, the contribution of electrostatic channeling was also identified at high Damkohler number , which also means a diffusion limited region. 35 14 Figure 1 - 5 Schematic of two sequential enzyme - catalyzed reactions (diffusion limited) in continuum model. 35 (a) Base model without bridge between two spherical active sites. (b) Model with volumetric bridge. (c) Electrostatic model with positive charged bridge. (d) Adsorption model with surface diffusion on bridge. However, continuum modeling relies heavily on parameters estimated from experiment results and other simulations. Moreover, the impact of molecular - level interactions cannot be a ssessed at these length scales. 1.5.3 M olecular simulation Modern molecular simulat ion techniques offer powerful tools to test scientific hypotheses and predict macroscopic properties. In addition, simulation can build understanding of mechanisms behind experimental phenomena and study the time and length scale inaccessible to experiment . In recent years, Quantum Mechanism (QM), Molecular Dynamics (MD) and Kinetic 15 Monte Carlo (KMC) simulations are increasing significant in the area of heterogeneous catalysis 25,36 41 , as shown in Figure 1 - 6 . 42 Based on the Schrödinger equation, QM deals with electron structure, which is able to calculate the activation energy and frequency factor for elementary chemical reactions. However, because of its intrinsic simulation loa d, QM cannot represent systems with large length and time scale. Using classical mechanics, MD represents explicit interactions between atoms with mutual potential energy (bonded and non - bonded). Due to larger time and length scale, MD is capable of repres ent ing the ensemble properties, the interaction between large biomolecules and the diffusion coefficient of small molecules. Therefore, MD simulation is a suitable tool to study the interaction between intermediate molecules and bio - molecular catalysts ( e. g. , enzyme) or scaffold ( e.g. , DNA and poly peptide). As a result, binding and transport behavior of reaction intermediates can be obtained for each catalytic system. As compared to conventional MD, reactive force field (ReacFF) 43 is originally developed for hydrocarbon oxidation in MD and still under development for other systems. 16 Figure 1 - 6 Time and length scale of different simulations technique s . 42 Over past two decades, molecular simulation h as aided the study of electrostatic channeling mechanisms in enzymatic s uper - complexes. This has been demonstrated in the case of naturally occurring enzyme complexes (such as the Krebs cycle and the electron transport chain). First molecular simulation of electrostatic channeling group, using Brown ian simulations to estimate transport efficiency in terms of probability that the explicit intermediate reaches the vicinity (~0.7 nm) of the second active site. 13,29,30,44 These modeling results were further integrated by an analytical approach representing the second active site with Michaelis kinetics 34 . By combining with experimental results, it is demonstrated that restricted diffusion of intermediates caused by electrostatic interactions improved transfer efficiencies up to ~80% (compared to < 10% for that caused by non - electrostatic interaction). Low channeling leakage and high binding affinity at the second active site were highlighted as key factors impacting overall kinetics. 17 These simulations represent a refinement of the continuum mo del , but were still coarse - grained, in that the intermediate was treated as charged sphere migrating in an electric field generated by cascade surface. In addition to long - range electrostatic interaction , charged and polarized intermediates can interact st rongly with channeling surface through hydrogen bond interaction . However, consideration of such local interactions in electrostatic models has been largely absent. Therefore, MD simulation is a good candidate to study such interaction s due to its detailed representation of various energy terms in channeling system. In this context, molecular simulation has emerged as a technique that enables the study of electrostatic interactions that may be inaccessible to experimental detection limits or impractical for computationally expensive techniques, such as density functio nal theory. Additionally, these simulations provide a platform for exploring possible substrate/intermediate combinations by describing their interaction with a computationally - optimized electrostatic surface. 1.5.4 Kinetic Monte Carlo model Despite several adv antage , molecular dynamics has intrinsic limitations which can be complemented by kinetic Monte Carlo. As shown by Table 1 , regular MD is not able to represent reactions, and so it is not by itself able to characterize the overall reaction kinetics. 18 Table 1 Kinetic Monte Carlo (KMC) vs. Molecular Dynamics (MD) Feature MD KMC Reaction No Yes Explicit representation atom event Time representation finite, constant and small time steps discrete time step depending on individual random events Governing factor Potential energy e.g. , L - J, Coulomb Transition State Theory e.g. , , f Rare event s advanced sampling techniques are required yes Moreover, because MD calculates the trajectory of individual molecules , governed by local potential energy, it cannot well represent rare events , as shown in Figure 1 - 7 . 36 That is, the MD system tends to oscillate around an energy minimum and meanwhile get the probability to hop to the neighboring energy minimum exponentially depending on the energy barrier in between. As a result, large a mount simulation resource is wasted when MD system is stuck at such stat e , making MD simulation unable to well represent rare events. Enhanced sampling techniques 45 are usua lly required to accelerate MD system s to overcome the energy barrier s separating the meta - states, in order to acqui re longer time scale and accurate sampling probability. Current enhanced sampling approaches includes R eplica - exchange molecular dynamics 46,47 , Meta d ynamic s 48,49 , Umbrella Sampling 50 , and Transition Path Analysis by Markov State Model 51 57 . 19 Nonetheless, MD simulations are still expensive and not capable to cover the time and length scale of real word experimental system. In comparison, KMC simulation uses given rate constants to randomly determine the occurrence of specific event s and corresponding time evol ution s , instead of performing the actual oscillating process in MD simulations. The probability of executed events depends on the weight of rate constants that are determined by the energy barrier of such events. More detail of KMC method can be found in Chapter 5 . Figure 1 - 7 Rare event in Molecular Dynamics and Kinetic Monte Carlo. 36 As a simplified model represented by explicit eve nts and rate constants, KMC is able to quantify overall kinetics, achieving the time scale of experiment al system s . Although KMC is a coarse - grained method, it allows tunable complexity to mimic experiment al conditions, such as interactive events and chang ing external environment. For example, when oxygen repulsion is considered in CO oxidation on a RuO 2 surface, the KMC result is much superior to the classical microkinetics theory which assumes a homogeneous surface distribution and thus underestimate s the complexity of local conditions. 41 In spite of that KMC per se cannot predict energy barrier s or frequency factor s needed to calculate the rate constant s via transition state theory (TST). These key parameters have to be measured by experiment or calculated by QM and /or MD. Therefore, hierarchical simulation is a proper 20 approach to model a channeling system. By the combination of MD and KMC, a bridge can be bu ilt between the molecular level design and the macroscopic experimental phenomena. 58 From the aspe c t of application s , KMC model s are widely used for catalytic oxidation and diffusion of small molecules on lattice surface s . 41,59 63 For example , DFT parameterized ab initio KMC was employed to study the impact of catalyst composition, surface structure, lateral interactions, and environment on the overall catalytic performance of CO and NO oxidation . 62 Moreover , experimental ly based KMC was used to study the CO oxidation on a RuO 2 surface, 60 and a compa rison between DFT and experimental parameters was made to identify the significant impact of lateral repulsion of adsorbent . 41 Besides qu antum mechanics, a KMC model combined with MD simulation was used to study the epitaxial growth of fcc and hcp islands on a fcc (111) surface, exhibiting great simula tion acceleration. As a result, boundary pinning effect s by the island of adsorbed atom was revealed, which is usually difficult to be elucidated by conventional KMC methods. 64 Another good example of model hybridization is a combination of first - pri nciple MD and KMC simulations, where the impact of rotational coupling of the sid e groups was shown to influenc e the proton conduction in proton exchange membrane fuel cells (PEMFC). 65 1.6 Overview of D issertation The goal of this work is using multi - scale simulations to further understand the mechanism of electrostatic channeling, quantitatively calculate the cascade kinetics and develop design rule s for fu ture artificial cascade s . The model construction particularly focus es on the surface diffusion of reaction intermediates, and the quantification approaches are aiming to fully consider the molecular complexity of channeling p athway s . Enhanced sampling techniques are 21 utilized to cover the ergodicity of the channeling system, and gi ve a rational representation of all possible events with different frequency and probability. Specifically, molecular dynamics (MD) simulations enable the calculation of energy - determined surface equilibrium constants and surface diffusivity, and a kinetic Monte Carlo (KMC) model integrated all rate constants from MD ( e.g. , surface diffusion and desorption rate) and experiments ( e.g. , turnover frequency), to estimate the product evolution on experimental time scale s . The cascade and reaction scheme employed in this work is the conversion of glucose to phospho - 6 - gluconolactone by hexokinase (HK) and glucose - 6 - phosphate dehydrogenase (G6PDH) , covalently conjugated by a cationic oligopeptide bridge. The system is simulated and validated by comparison to stopped - flow lag time analysis . This section is in collaboration with University of Utah. In Chapter 2 , MD simulation is use d to study the surface interaction between cationic oligopeptide (Lysi n e, Arginine and Histine) and various negatively charged intermediate species (oxalate, glyoxylic acid and glucose 6 - phosephate). A hopping surface diffusion mode l is built, and the MD and experiment studies of ionic strength dependence on lag time (HK - G6PDH cascade) provides strong evidence of the occurrence of electrostatic channeling. In addition, the balance between surface adsorption and diffusion is studied to indicate an optim ization of channeling efficiency. (1st paper published on ACS Catalysis 33 ) In Chapter 3 , regular MD simulation and umbrella sampling are used to quantify the energy barrier for surface hopping and desorption. A Stern layer diffusion under the protect ion of diffuse layer is revealed by ionic strength dependence study. As a result, the thermodyn amic 22 parameters of an individual hop is quantified in great detail. This helps to further understand the channeling mechanism and also give support to the rate constants for kinetic model s . In Chapter 4 , the leakage between peptide bridge and enzyme binding pocket is considered. On this complicate d 2D surface with multiple possible pathways, probability analysis and a Markov State m odel are used to ma p the energy profile in this area. Then , this energy landscape is parameterized into KMC rate constants to involve this additional leakage before final reaction. In Chapter 5 , the overall kinetics of HK - G6PDH cascade was quantified by a combination of MD and KMC simulation. Specifically, KMC simulation enable the integration of all rate constants to model the evolution of bulk intermediate, particularly from the pre - steady state. This allows for a direct comparison to experiment results. From the resulting lag time, the leaking probability of each individual hop and the length of channeling bridge are found to be the key factors for overall kinetics. The barrier between bridge and second enzyme not only caused leakage by itself but also exacerbated the leakage on channeling bridge. The KMC lag time agrees much better to experiment as compared to the results only with leakage on peptide bridge. (2 nd paper submitted to ACS Catalysis , in revision ) In summary, the multi - scale simulation in this work enable a detailed quantification approach, fully considering the molecular - level interactions in electrostatic channeli ng process via strong surface diffusion mechanism. This is of great significance to understand the micro - scale mechanism and identify potential limitations to be improved for cascade design. In future, h ighly interactive work between such modeling techniqu e and experiment is greatly needed for 23 the molecular - level design of synthetic cascade with channeling efficiency comparable to or even better than natural cascade. 24 Chapter 2 Surface Diffusion by Molecular Dynamic s Simulation * 2.1 Introduction Th e phenomenon of electrostatic channeling is demonstrated naturally by coupled enzymes , such as dihydrofolate reductase - thymidylate synthase (TS DHFR) and malate dehydrogenase - citrate synthase (MD - CS) cascades. As for structure characterization, XRD diffraction results reveal a charged surface between active sites for both cascades. Theoretically, c harged intermediate molecules interact with cascade surface s through long - range electrostatic forces and local , short - rang e interaction s via Van der Waals forces and hydrogen bond s . Therefore, besides the migration under the electric field generated by charged cascade surface, i ntermediate molecule s also potential ly perform surface diffusion at closed proximity to a channel ing pathway. However, the channeling process occurs at nanometer and nanosecond scale, which is inaccessible to current experimental technique s all of which rely on capturing bulk intermediate evolution at real world time and length scale s . 7 Given the fact that in - situ observation is currently impossible, molecular simulation combined with experimental kinetics is an optimal approach to understanding the channeling mechanism on cascade surface, particularly through a surface diffusion mode. Using the mechanism of electrostatic guidance as an inspiration, we envisioned the use of a charged oligopeptide bridge as a molecular construct to facilitate restricted diffusion of a * The content of this chapter has been published on ACS Catalysis , in collaboration with University of Utah. Experimental work in this chapter is in collaboration with David P. Hickey, Dr. Shelley D. Minteer and Dr. Matthew S. Sigman at University of Utah. Y. L iu, D. P. Hickey, J. - Y. Guo, E. Earl, S. Abdellaoui, R. D. Milton, M. S. Sigman, S. D. Minteer and S. Calabrese Barton, "Substrate Channeling in an Artificial Metabolon: A Molecular Dynamics Blueprint for an Experimental Peptide Bridge", ACS Catalysis , 7, 2486 2493 (2017). doi:10.1021/acscatal.6b03440. 25 charged intermediate between two seq uential catalytic sites . In this chapter , we describe the use of MD simulations to design and optimize a series of theoretical cationic - helix peptides, and quantify their ability to transport charged intermediates across the theoretical surface. These optimized theoretical conditions were applied to design and construct a simple template for preparing artificial enzyme complexes capable of inducing restricte d diffusion of a charged reaction intermediate between two sequential enzymes in a metabolic pathway. Stop - flow lag time analysis was used to experimentally examine the channeling efficiency and validate the MD results. These studies provide the basis for utilizing MD simulations to strategically design synthetic substrate channeling cascades and develop a detailed understanding of the range and limitations of artificial electrostatic substrate channeling. 2.2 Simulation M ethods GROMACS 5.1.1 66 72 was used as the MD simulation package. CHARMM36 73,74 was employed as the force field governing whole simulation and CGenFF (CHARMM General Force Field) 75 was used to generate topology for small intermediate molecules. Periodic boundary conditions (PBC) were applied to all MD simulations. Th e peptide/intermediate couple was first solvated with tip3p water molecules in a dodecahedral box (~ 100 nm 3 ) and then neutralized by Na + or Cl - ions. The system energy was then minimized using a steepest descent algorithm, followed by a 0.1 ns NVT and 1.0 ns NPT equilibration process with position restraints for each molecule. Finally, the MD simulation was performed under NPT ensembles for 50 ns, repeating 10 times for each system. For all MD simulations , the system temperature was coupled at 300 K by velocity - rescale thermostat 76 and the pressure was stabilized at 1 bar by Parrinello - Rahman barostat. 77 Particle - Mesh Ewald (PME) algorithm with cubic interpolation (0.16 ns grid space) was used to calculated long - range electrostatic interaction. 78,79 Cutoff values for both short - range 26 electrostatic and van der Waals interactions were set to 1 nm. A Verlet cutoff scheme 80 is used to calculated non - bonded interactions on a GPU accelerator. 81 Initial - helix peptide structures was generated by Avogadro 1.1.1 82 with torsion angles = - 60° and = - 45°. Neutral terminal s ( - COOH and NH 2 instead of COO - and NH 3 + ) were applied to the helical backbone for MD simulations. From resulting backbone, the alpha - carbon of the N - termin us and alpha - carbon of the C - termin us (two points in total) were pinned to immobilize the peptid e. The fraction of basic residue s was tuned by adding alanine (Ala) between basic residues (Arg, Lys). Accordingly, the peptide chain used in simulations were named based on the number of basic residues (Lys, His, Arg) and the number of interstitial Ala. For example, the nomenclature K4 - A3 indicates a peptide with four Lys residues in total with three Ala between each Lys, resulting in the structure, . Finally, four extra Ala were appended to the terminal Lys at each end, in order to provide pinning sites and to maintain the - helical structur e . Electrostatic potential maps were prepared using APBS and PDB2PQR, 83 85 and VMD 1.9.2 (Visual Molecular Dynamics) 86 was used for visualization. Intermediate molecule structures ( . mol2 file) were downloaded from ZINC database. 87,88 The i onic strength of MD system was tu ned with explicit ions. For these simulations, box size was increased to 325 nm 3 , to access lower ionic strength than the initial case. Counter ions were added after neutralization until the ionic strength reached a target value. For 0 mM ionic strength, a ll counter ions were removed, resulting in a non - neutral system. 27 Surface interaction was evaluated by the fraction of adsorption time, , as shown by following equation, where is the time for intermediate molecule to stay on cascade surface, when the short - range coulombic energy is less that 50 kJ mol - 1 . Similarly, corresponds to the desorption time. 2 - 1 Adsorption energy, , is calculated based on the radial distribution function ( RDF ) 38,89 of the intermediate atoms aroun d peptide surface, which is defined as the probability to find the intermediate molecule at a specific distance away from the reference s urface. Using the , the RDF was calculated by taking the distance r between substrate atoms and the closest peptide atom. Using the GROMACS function gmx_rdf , the probability density was obtained versus r . Then r = 5 pm) and summed using: 2 - 2 where . The energy level of a specific radial position was calculated by equation 2 - 3 , where was taken at 1.0 nm away from charged peptide system and 1.5 away from non - charged peptide. 2 - 3 28 F inally, the adsorption energy was obtained by taking the integral summation of relative energy differences, weighted by normalized probability density: 2 - 4 To quantify the diffusion rate, mean square displacement (MSD) was used to evaluate the mobility of intermediate in bulk or on - helix peptide surface. The relationship between MSD and diffusion constant, , is shown in equation 2 - 5 , where n is 2 for one dimension and 6 for three dimensions. In addition, is the time interval (lag time) corresponding to the displacement. 2 - 5 Two diffusion modes were assumed to calculate the MSD. For bulk diffusion, 3 - D Brownian motion was assumed and the MSD was obtained from MD trajectories by taking the mean square displacement according to equation 2 - 5 , where is lag time and is the position of j . 2 - 6 It should be noted that the terminology lag time for MSD calculation is mathematically independent from the lag time used in the Experiment Section. Unless specified, the average bulk diffusivity, , was calculate d by fitting the vs curve between 200 - 1000 ps. Since the MSD was taken over the entire simulation region, it contains the average of adsorption and desorption components. Therefore, a subscript avg was appended to distinguish it from pure 29 bulk diffusivity, but when the interaction is very weak ( e.g. , on His peptide), can represent the bulk diffusivity. As a result, the can reflect the average mobility of intermediate in peptide system. When intermediates were closely associated with peptide surface, surface diffusivity, , can be calculated using a 1 - D hopping diffusion mode, where the intermediate molecule can only move to one of its two nearest - neighbor sites (1 - D) with equal probability. The MSD for hopping mode can be calcula ted according to equation 2 - 7 , where is the hopping distance to the nearest charged residue and is the hoping rate (s - 1 ). 2 - 7 By combining equation 2 - 5 and 2 - 7 , one dimensional surface diffusivity, , can be correlated to and , so that 2 - 8 As for hop p ing rate, the coulombic energy diagram was first slightly smoothed with convolution of a scaled window (win = 21). From coulombic energy distribution, two to three peaks ( e.g. , - 150, - 280 and - 400 kJ mol - 1 for the oxalate - Lys couple) could be identified, corre sponding to the plateau in coulombic energy diagram. Detail ed figures can be fou n d in APPENDIX A . Number of hops, , was taken as half of the number of energy shifts between energy plateaus, excluding the shift involving 0 kJ mol - 1 (which indicates desorption). After that, average h oping rate ( ) was obtained by dividing by simulation time (50 ns). To exclude 30 the impact of the varying desorption period, (the hoping rate upon adsorption) was calculated by further dividing by . In this way, can be com pared to and to describe the relationship between surface mobility and desorption probability. In order to obtain the hopping distance, , the rise along helical axis was taken from a generally accepted value (0.15 nm per residue), which is then multiplied by the numbers of residues to yield . 2.3 Result s and D i scussion 2.3.1 Surface hopping mechanism The adsorption of small charged molecules on biomolecular surfaces is due to local potential energy minima caused by non - specific electrostatic interactions about the bio - interface. Mechanistically, local potential energy minimization competes with kinetic exchange between adsorbed molecule and water molecules, resulting in a dynamic adsorption and desorption process that allows for bound diffusion across the charged biological surface. Figure 2 - 1 shows the chemical structures of charged intermediate species and the oppositely charged amino acid side chains. As shown in Figure 2 - 1 a, oxalate and glucose 6 - phosphate (G6P) mo lecules both carry - 2 charge , but oxalate has a symmetric charge distribution due to the two identical carboxylate groups , and G6P is pola rize d, with negative charge concentrated on its phosphate group. Glyoxylic acid has similar geometry to oxalate, with only - 1 charg e . As for the amino acid side chains in Figure 2 - 1 b , Arginine and Lysine both have +1 net charge as indicated by their high pKa values. But Arginine has a much stronger polarization degree on its guanidine group, as shown by the electrostatic potential map in Figure 2 - 1 c . 33 31 Therefore, Arginine and Lysine are expected to have the same long range electrostatic impact but a very different local int eractions. 40 In contrast, Histidine has no net charge but a high polarization degree. Figure 2 - 1 f - h show several example of Lysine - Alanine peptide s in an - helix conformation. The charge density is tuned by the neutral Alanine residues ( Figure 2 - 1 b) placed between adjacent LYS side chains. Figure 2 - 1 (a) Intermediate species. (b) Amino acid side chain species. (c - e) electrostatic potential map of Arginine, Lysine and Histidine. (f - h) a - helix Lysine peptide with various charge density. With th e mechanism of electrostatic channeling as a pretext, we use d MD simulations to study the bound diffusion of a common small biological intermediate, oxalate, across a charged peptide surface . The a dsorption energy , , and average coulombic energy, , between the intermediate and peptide were chosen as metr ics indicating to an effective substrate channeling. 32 Specifically, can be calculated for every frame of the simulation, and its time average indicates the degree of electrostatic interaction between charges. is calculated based on the proximal density of oxalate around peptide surface ( radial distribution function , RDF), and describes interactions caused by hydrogen bonding and polarization of the interacting groups . RDF data provides information about the proximity of an electrostatic interac tion based on both discrete hydrogen bonding interactions and cumulative adsorption energy. Here, we use the RDF of oxalate about a polypeptide chain to determine the probability that oxalate will be stabilized at a given distance from the electrostatic pe ptide surface. As shown in Figure 2 - 2 , c orresponding RDF diagrams for oxalate with the charged amino acid residues of interest indicate a high density of interactions occurring at ~0.15 nm for cationic Arg - and Lys - containing pep tides. A minimum distance of 0.145 nm is also observed, which corresponds (~0.15) associated with hydrogen bonding between anionic oxalate and either the - ammonium of Lys or the guanidinium of Arg (NH + ··· - O). 90 These results indicate a fast adsorption process and a possible contacting surface diffusion process. However, peptide surfaces containing His residues primarily interact with oxalate at a distance between 0.5 and 1.5 nm, which extends beyond the short - range cut - off distance for dissociation (1 .2 nm). Therefore, HIS residues do not create coulombic forces and do not promote electrostatic channeling . 33 Figure 2 - 2 Radial distribution function (RDF) diagrams of oxalate around the peptide surface and the corresponding diagrams for oxygen atoms and carbon atoms individually (inset). Where as the imity to peptide surface, surface mobility is illustrated by short - range coulombic diagram, as shown in Figure 2 - 3 a. MD simulations describing coulombic energy during dif fusion of oxalate along a polyarginine peptide chain reveal several discrete energy states that dictate its surface mobility. The corresponding coulombic energy diagram ( Figure 2 - 3 a) suggests that oxalate primarily exists in either a singly - or dually - bound ionic conformation with adjacent Arg residues ( Figure 2 - 3 b) with coulombic energy of - 190 kJ mol - 1 or - 390 kJ mol - 1 , respectively, or in an unbound state of coulombic energy 0 kJ mol - 1 . As oxalate travels across the peptide surface, the coulombic energy diagram exhibits a period of rapid flux in energy between e ach state (highlighted as the shaded region of coulombic diagram, Figure 2 - 3 a). During this process, the randomly oriented kinetic exchange between ox alate and either water molecules or adjacent charged sites results in oxalate ho p Figure 2 - 3 b). This energy - hopping region of coulombic energy diagram serves as a qualitative indication that diffusion is largely confined to the oligopeptide surface. Additionally, the combined RDF minimum distance (0.145 nm) 90 and coulombic energy 34 results suggest a diffusion mechanism that utilizes distinct hydrogen bonding interactions between oxalate and basic amino acid residues , which adds si gnificant detail to previously reported mechanisms of locally restricted diffusion within an electrostatic field. 29 Figure 2 - 3 (a) Short - range Coulombic energy diagrams between oxalate ( - 2 ) and peptide composed of combinations of alanine and arginine, Lysine, and Histidine . The shaded area in Lys peptide figure (red) represents a frequent surface diffusion process . (b) Representative MD simulations corresponding to the plateau region in the Coulombic energy diagram between ARG side chains ( blue) and oxalate (red). One of the prim ary benefits of MD simulations is the opportunity to rapidly explore several theoretical electrostatic peptide bridges that would be impractical to screen experimentally . Using coulombic energy diagrams as described above, we sought to identify optimal c ationic peptide residues to facilitate electrostatic channeling of an anionic intermediate. By tuning the type of charged amino acid (Arg, Lys, His), we studied the electrostatic 35 interactions of various peptide structures with a dianionic oxalate molecule. Despite being highly polarized, neutral His (pK a = 6.04) peptides do not allow for ionic stabilization and therefore do not enable any adsorption of oxalate. Both Arg and Lys have the same +1 net charge, but Lys has a much weaker polarization degree and H - bonding capability as compared to Arg. 40 - m group and thus is not able to sterically block bulk water molecul es as effectively. This results in stronger interactions between oxalate and water molecules, and increased kinetic exchange with the bulk media. Consequently, oxalate molecules displayed a shorter adsorption time fraction ( ) on Lys - containing peptides than their Arg counterparts ( Figure 2 - 3 a) . Despite exhibiting smaller , Lys peptides d isplay a unique transport phenomenon in the exceptionally high number of energy hop s exhibited in coulombic diagram. Highlighted by the shaded area of the energy diagram in Figure 2 - 3 a, the coulombic energy hop s 20 times over the 17 ns timeframe and moves frequently on the peptide surface . In this case, t he Lys residue s are separated by three alanine (Ala) residues, with 0.6 nm between adjacent Lys residues along - helix axis (assuming a normal helical increment of 0.15 nm per residue). T his results in a surface diffusivity of 2.1 × 10 - 6 cm 2 s - 1 , which is comparable to the typical bulk diffusivity of small molecules ( 10 - 20 × 10 - 6 cm 2 s - 1 ). In contrast, Arg peptides interact too strongly with oxalate, anchoring the intermediate instead of allowing transport. These data suggest that Lys appears to be an effective choice of charged amino acid residue for design of an el ectrostatic channel. D etailed discussion on surface diffusion constant can be seen in Section 2.3.3 . 36 2.3.2 Impact of intermediate species With the insigh ts afforded above by coulombic energy diagrams, we sought to identify characteristics that allow various charged intermediates to be efficiently shuttled between conjugated active sites . To this end, MD simulations were performed on three common biological intermediates, glucose - 6 - phosphate (G6P), oxalate, and glyoxylate ( Figure 2 - 4 a - c). As shown in Figure 2 - 1 , b oth oxalate and G6P possess a - 2 charge, however, G6P has a much more localized charge with r espect to its molecular volume. Figure 2 - 4 Sys tem configuration and coulombic energy diagram for glucose 6 - phosphate (a), oxalate (b) and glyoxylate (c). The inserts are simplified coulombic energy diagram , where 0 and 2 on vertical axis stands for the energy level of dual association and desorption . (d) Adsorption time fraction and hopping times of intermediate molecules on peptides with various Lys fraction over the entire 50 ns simulation. The bar marked by red arrows corresponds to the coulombic diagram in above figures. Numerical data can be found in APPENDIX B 37 Coulombic energy diagrams of oxalate and G6P ( Figure 2 - 4 a and b ) indicate that they both readily adsorb ( = 71 ± 7 % and 85 ± 8 %, respectively) and diffuse across the peptide bridge through a double association mechanism. However, surface interactions changed dramatically in the case of the singly charged glyoxylic acid ( Figure 2 - 4 c) . Desp ite having a similar structure to that of oxalate, hardly any dual association (1.2 ± 0.6 % time fraction) is observed in corresponding coulombic diagram and the intermediate remains either singly associated with Lys or desorbed ( = 12 ± 4 %). Signifi cant desorption of glyoxylic acid arises because its single anionic site does not allow for simultaneous coordination between adjacent Lys residues. Therefore, glyoxylate depends on a dissociative jumping mechanism between residues that increases the proba bility of desorption from peptide surface. This suggests that singly - charged reaction intermediates should be avoided as substrate channeling targets due to a high propensity for dissociation with the charged peptide chain. Residues with smaller positive c harge ( e.g. , + 0.5) might be an interesting subject to study the dual association and transportation of such intermediates. However, the natural HIS residue is only 10% charged given its pKa value at 6.04, and it is not supposed to be a good candidate. T he refore, synthetic amino acid residues might be able to give more inspiration to this issue. Also, f urther experiment is needed to validate the channeling behavior of singly charged intermediate molecule, and this has the potential to add more details and evidence to the surface diffusion mechanism and design rules. 2.3.3 Balance between surface adsorption and mobility In an ideal electrostatic channeling system, the intermediate molecule should have a strong adsorption and thus a long adsorption time. At the sam e time, the intermediate molecule should be very mobile on cascade surface to reach the sequential site . However, as discussed 38 above, strong interaction may decrease intermediate mobility on peptide surface. To further study the relationship between surface adsorption and mobility, adsorption energy and diffusivity were quantified for oxalate on Lys peptides. Simulated electrostatic interactions between oxalate and peptide bridge were described in terms of the absorption energy ( , described above ), the fraction of time that intermediate is adsorbed , , and the diffusivity along the surface of peptide chain , . As shown in Figure 2 - 5 , MD simulations of both oxalate and G6P were analyzed in terms of , , and , where was controlled by varying the ratio of charged Lys residues to neutral Ala residues ( e.g. , Figure 2 - 1 f - h ). These simulations demonstrate that for oxalate is enhanced with increasing Lys fraction, from ~0 kJ mol - 1 to 8.91 kJ mol - 1 ( Figure 2 - 5 a ) . As a result, increased gradually from 31%, reaching a maximum of 90% for oxalate, while for G6P ranged from 67% to 97%. This suggests that a higher fraction of Lys is required to pre vent oxalate from desorbing from peptide surface, while G6P is not likely to desorb under any of the conditions studied. Additionally, we found that both oxalate and G6P maintained a high (~1 times per ns) for all Lys - based peptides with the exception of Lys 1 - Ala 6 , wherein the separation of neighboring Lys residues is too great to allow for a d ual association diffusion mechanism. This resulted in a consistently high for most conditions. However, according to equation 2 - 8 , affects exponentially, where decreases substantially for the peptide chains where Lys fraction is too high. This is due to persistent double and tripl e association of the intermediate to proximal Lys residues that dramatically slow diffusion across the peptide chain. 39 Figure 2 - 5 Effect of the Lys fraction (a, d) and t hermodynamic adsorption energy, , (b, e ) on surface diffusivity, , and adsorption time fraction , . Effect of on transport efficiency, characterized by nt the standard deviation of 10 parallel simulations for each individual system. In order to provide an optimization metric to balance adsorption energy with surface mobility, we simply took the product of as a measure of transport efficiency. The resulting plots of vs ( Figure 2 - 5 c, f ) indicates that transport efficiency reaches maximum for peptides where the Lys fraction is slightly less than saturated (i.e., Lys 6 - Ala 1 ). The 40 simulated transport efficiency for a Lys - s aturated peptide (Lys 10 - Ala 0 ) was decreased due to the insufficient surface mobility, while the low transport efficiency of Lys 1 - Ala 6 was due to poor adsorption time. Collectively, MD simulations indicate that an electrostatic channeling surface with high (but less than saturated) charge density and a polyanionic intermediate would strike a balance between surface mobility and adsorption energy to allow for efficient electrostatic substrate channeling. 2.3.4 Qualitative c ompar ison with experiment by stop - flow lag time analysis In this section, MD simulation results are compared with experiments, in terms of the occurrence of channeling and the d ependence of ionic strength . In cooperation with our partners from University of Utah, we selected the reaction of glucos e with HK and G6PDH as a model system to introduce an artificial channeling bridge , as shown in Figure 2 - 6 a . This enzymatic cascade utilizes HK with adenine triphosphate (ATP) to phosphorylate glucose to G6P, which is subsequently oxidized by G6PDH with nicotinamide adenine dinucleotide phosphate (NADP + ) as a terminal oxidant ( Figure 2 - 6 b ). The reaction rate on each active site (~0.01 - 0.1 s - 1 ) is much slower than the diffusion rate of small liga nd molecules (~10 - 5 cm 2 s - 1 ), including cofactor, reactant, intermediate and product. Therefore, it can be assumed that th ere is no delay on bulk concentration change when turnover occurs at any active sites. Therefore, by measuring the absorbance correspo nding to NADPH formation, we were able to indirectly monitor the activity of HK or directly measure the activity of G6PDH. Additionally, this reaction sequence provides G6P as a charged intermediate, which allows for the comparison of our experimental find ings with those suggested by MD modeling. Based on the simulated oligopeptide chains, we synthesized an electrostatic bridge consisting of a poly(Lys) 41 oligopeptide, as shown in Figure 2 - 6 a . Corresponding synthesis detail can be found in our published work . 4,33 Figure 2 - 6 (a) Illustration of the proposed channeling complex using a poly(lysine) bridge as an electrostatic surface between hexokinase (HK; PDB 3VF6) and glucose - 6 - phosphate dehydrogenase (G6PDH; PDB 4LGV). (b) Experimental reaction scheme used to study electrostatic channeling of the charged intermediate (glucose - 6 - phosp hate) across a cationic peptide bridge. (c) Sample absorbance plot highlighting the bridge (K5), neutral bridge (G5), or free enzymes . 33 As discussed in Chapter 1.3, one of the most common methods for studying substrate channeling involves measuring the transition time, also called lag time, required to reach steady - state flux of the reaction intermediate (G6P). 22 The lag time , , was determined experimentally by stopped - flow injection analysis, in which the absorbance of a solution containing both enzymes was measured following an injection of glucose . From this plot, the maximum change in absorbance is extrapolated to the time axis ( ) at which absorbance = 0, which is exemplified in Figure 2 - 6 c sh ow ing the experimental results of three types of enzymatic couple, including free standing enzymes (no channeling), neural bridge (proximity only) and charged bridge 42 (proposed channeling). The result on charged peptide bridge shows much smaller lag time th an its uncoupled counterparts, indicating that the electrostatic surface of the peptide facilitates transport of reaction intermediate (G6P). In addition, the lag time of enzymes coupled by neutral bridge stayed at the same level of free standing enzymes , which roll ed out the contribution of proximity alone. Theoretically, the ions in solution can shield the electrostatic interaction between charges, which should undermine the channeling efficiency in this case. As a result, the degree of transport efficien cy could be adjusted by ions strength, which broadens the range of comparison between simulation prediction and experiment measurement. In order to further demonstrate the impact of electrostatic interaction , is measur ed with variable ionic strength, as shown in Figure 2 - 7 . Specifically, when IS level is increased from 0 mM to 100 mM, the lag time for K5 system almost increases to the same level with non - channeling system, approving the contribution of electros tatic interaction to the intermediate transport at low ionic strength. However, the K15 case has a more concentrated charge but a lower channeling efficiency as compared to its K5 counterpart. The quantitative reason of this will be demonstrated in Chapter 5 . 43 Figure 2 - 7 Lag time of the free enzymes and 5xLysine (K5), 15xLysine (K15), 5xGlycine (G5), and 15xGlycine (G15) complexes using 1.4 mM citrate with variable ionic strength, where the ionic strength is controlled by the concentration of NaCl. Experiments were perfor (complex) at pH 7.0 and 37 °C. Error bars represent one standard deviation; (*) 0.05 > p > 0.01; (**) 0.01 > p > 0.001; (***) 0.001 > p. [ref] The dependence of electrostatic interaction on ionic strength is also studied by MD , where ionic strength were tuned with explicit ions. The box size was controlled at 325 nm 3 , and extra counter ions were added after neutralization, until the ionic strength reached target values. For 0 mM ionic strength, all counter ions were removed, resulting in a non - neutral system. MD simulation on ionic stre ngth shows that adsorption time, , generally experienc es a decreasing trend with increasing ionic concentration, where the fully saturated Lys peptide exhibits a slight decrease on ( = 4.3%) and the peptide with moderate Lys density has a severe decrease ( = 40%) on , as shown in Figure 2 - 8 a . This suggests that electrostatic interaction and thus channeling efficiency tend to decrease with increased ion concentration, particularly for the less saturated Lys bridge. These simulation results qualitatively agree well with experiment results in Figure 2 - 7 . 44 Figure 2 - 8 (a) MD Simulation on ionic strength, based on the system with G6P and Lys peptide. The theoretical 100 % Lys was calculated from Lys - 0Ala peptide (K10), and 50 % Lys was calculated from Lys - 1Ala peptide (K6 - A1). (b) Plot of Free , G5 and K5 in Figure 2 - 7 , and analytical fit on K5 lag time according to equation 2 - 9 . In order to further compare MD and experiment results, the simulated lag time of K5, , is analytically derived as shown in equation 2 - 9 , where is the lag time of free standing experimental lag time, is probability of successfully channeled intermediate, is the percent of coupled enzyme pairs as compared to their non - coupled counterparts, and , is the maximum reaction rate of two enzyme. Given the fact that , equation 2 - 9 is simplified to its final ex pression: 2 - 9 2 - 10 By assuming is equal to in Figure 2 - 8 a, the fitted MD lag time and experiment lag time are shown in Figure 2 - 8 b. Although th e lag times agree well at low ionic strength when 45 assuming a 50% contamination of uncoupled enzyme pair, the IS dependence trends show a significant discrepancy between the simulation and experiment results. This is due to the absent of quantitative parame ters that can be directly correlated the rate constants of reaction on each active site. Nonetheless , by combining equation 2 - 9 and Figure 2 - 7 , the value is around 50% percent at zero IS level. That means the channeling probability is between 50% and 100% at low IS level , which suggested a non - perfect but still very effective electrostatic channeling. 2.4 Summary MD simulations were used to explore electrostatic interactions b - helix peptide surface and negatively charged reaction intermediates. O xalate molecules was found to und ergo a surface diffusion mechanism across cationic poly(Arg) and poly(Lys) peptide , whereby oxalate hops between discrete hydrogen bonding interactions between proximal charged residues. By varying the composition of simulated peptide bridge and the charac teristics of the diffusing intermediate, we were able to define several rules for designing electrostatic substrate channels. Specifically, Lys residues were found to provide a balance of intermediate adsorption and surface diffusivity that allow for effic ient electrostatic channeling while preventing dissociation of the intermediate into the bulk. Add itionally, simulations suggest that a dianionic intermediate is required for the double associative diffusion mechanism that prevents desor ption from the pept ide surface. Using these simulation - derived design principles as a foundational blueprint, we synthesized an enzyme complex by covalently conjugated HK and G6PDH by a poly(Lys) 46 bridge. The synthetically cross - linked enzyme complex was shown to facilitate e lectrostatic substrate channeling by decreasing the lag time required to reach steady state with respect to the intermediate from 102 ± 10 sec for a mixture of coupled enzymes to 56 ± 11 sec for the Lys - bridged enzymes. The study of synthetic channeling co mplexes allowed us to identify low ionic strength as ideal experimental conditions to observe electrostatic substrate channeling. Current MD and experiment results are able to give a strong support of the occurrence of artificial ly introduced electrostatic channeling, and also qualitatively agree well to each other. However, to further quantitatively compare these result s and identify potential limitation s in this channeling process, detailed MD study and advanced sampling technique are needed. 2.5 Acknowledgements We greatly acknowledge the experiment support from Dr. David Hickey in Dr. Shelley Oligonucleotides were synthesized by the DNA/Peptide Facility, part of the Health Sciences Center Cores at the University of Utah. 47 Chapter 3 Quantification of Thermodynamic Parameters * 3.1 Introduction Surface adsorption of small ligand molecule s is driven by a local energy minimum, and surface diffusion can be seen as a stochastic process on a given energy landscape. The charged surface of natural cascade 18,28 promotes surface diffusion along with intermediate migration under a proximal electric field created by the cascade surface . Continuum modeling is able to give a good description of the intermediate migration with long range electrostatic interaction. 15,35 However, there is no representation of the surface interaction. Brownian d ynamics gives a more detailed description of the electric field and also explicit intermediate molecules, 29 but potential local h - bond interactions are underestimated. F or kinetic quantification, rate constants are greatly needed and are the key to evaluat e overall kinetics. 34 Therefore, detailed quantification of thermodynamics parameters, such as hopping energy barrier and desorption energy, is of great significance to further understand the channeling mechanism and quantify overall kinetics. In this chapter, by using MD simulations, the energy barrier of an individual hop is calculated by transition state theory and desorption energy is calculated by umbrella sampling. By combining these two energy terms , as shown in Figure 3 - 1 , the probability for intermediate to traverse the bridge is quantified in detail. * The content of this chapter has been published on ACS Catalysis as a full paper. Y. Liu, I. Matanovic, D. P. Hickey, S. D. Minteer, P. Atanassov and S. Calabrese Barton, "Cascade Kinetics of an Artificial M etabolon by Molecular Dyn amics and Kinetic Monte Carlo" , 8 , 7719 - 7726 (201 8 ) . doi : 10.1021/acscatal.8b01041 48 Figure 3 - 1 Poly - lysine peptide as an electrostatic bridge to transfer reaction intermediate (G6P, glucose 6 - phosphate) from HK (hexokinase) to G6PDH (glucose 6 - phosphate dehydrogenase).22 3.2 Model De scription A description of the basic molecular dynamics model on the peptide bridge is given in Chapter 2 . 3.2.1 Transition state theory study G k hop ) on th e peptide surface was calculated by the short - ran ge coulombic energy change between the levels of dual - ( - 400 kJ/mol) and single - ( - 200 kJ/mol) association configuratio n, as shown in Figure 2 - 3 . Acco rding to t ransition s tate t heory (TST) the rate constant, k , is related to the energy barrier G by an Arrhenius expression: 3 - 1 49 where A is the frequency factor, R is the gas constant and T is the absolute temperature. Therefore, A and G could be calculated by fitting as a function of 1/ T , which will be further discussed in following section . In order to do this, the MD system was set at 10 temperatures (10, 1 5, 20, 25, 30, 37, 40, 45, 50, 55 o C). At each point , 10 parallel 100 - ns simulations were conducted to calculate the average hopping rate, k hop, and its standard deviation. It should be noted that G and A values were calculated by fitting all of the 10 × 10 data points , rather than the average at each temperature. This enables a sufficient evaluation of the uncertainty of hopping rate . For ionic strength (IS) dependence study, IS values were set at 0, 20, 40, 70, and 120 mM explicitly represented by Na + and Cl - ions. The IS value here represents additional ion concentration beyond that required for neutralization of MD system . 3.2.2 Umbrella s ampling From the MD trajectory of TST study, a dual association configuration was extracted for umbrella sampling . 50 After that, the peptide was first restrained at a reference position and the readily adsorbed intermediate molecules (G6P) were pulled away perpendicularly from the peptide surface for 200 ns . In this process, a spring constant of 1000 kJ mol - 1 and a pulling rate of 0.01 nm ps - 1 were used, as shown in Figure 3 - 2 a . The two - ammonium nitrogen atoms were selected as the reference group and the whole G6P molecule was selected as the pull group . As a result, a maximum distance of 2 nm was obtained for th e center - of - mass (COM) of G6P as a reference to its original position. From pulling trajectories, 15 frames with a COM increment of 0.1 nm were attempted to be selected as the initial configuration for each window in umbrella sampling. Such spacing distanc e allow sufficient overlap between the probability distribution within neighboring windows. This will be further discussed in section 3.2.2 50 In orde r to optimize the frame selection for umbrella sampling, two reaction coordinates, and , were defined as equations: 3 - 2 3 - 3 where and - ammonium nitrogen atoms, respectively. is the given vector perpendicular to peptide surface in pulling process . In other words , is the projection of on . Figure 3 - 2 c shows an example on the time course of and . A certain difference can be seen between these values, which is due the pulling process was repeated 50 times to pick the 15 frames with less than 0.01 nm. Usually, 20 parallel pulling enable at least one appropriate frame for each window. 51 Figure 3 - 2 (a) System configuration of a dual association mode. (b) An assumed desorption state by pulling G6P 1. 5 nm away from the surface. For Umbrella Sampling, above mentioned reaction coordinates and were used for GROMACS COM - distance and COM - direction modules , as shown by Figure 3 - 3 and equation: 3 - 4 w here is the biased potential applied to the intermediate molecule as its reaction coordinate , , deviates from the energy minimum position, . is the spring constant at 500 kJ mol - 1 . umbrella sampling. 52 Figure 3 - 3 Gromacs b iased potential for umbrella sampling. (a) gmx_distance restraints the pull group at a specific distance away from the reference group. (b) gmx_direction restraints the pull group at a plane perpendicular to given v ector and also a specific distance away from reference group. (c) gmx_distance + gmx_direction allows the pull group to be restraint at the dome area above reference group. The x - axis for all figures are at the same scale to y - axis. In order to calculate the potential of mean force (PMF) 91 as an indication of sorption energy, the sampling results were combined by weighted histogram analysis method (WHAM) 92,93 using Gro ssfield - WHAM code 94 . As shown in Figure 3 - 4 , a 2 - D PMF with non - independent reaction coordinates, and , was obtained. That is, was abo ve mentioned - - gradient can be seen from the proximity region to desorption region. The ragged bottom of the band is due to the insufficient sampling on the dome edge in Figure 3 - 3 c, where and were too far away from their energy minimum. Therefore, this region was dropped when calculating the 1 - D PMF along . The average on was calculated by only taking th e area above the yellow line in Figure 3 - 4 , where nm. 53 Figure 3 - 4 Two - dimensional PMF ca lculated by Grossfield WHAM code. 3.3 Result s and Discussion 3.3.1 Energy barrier of surface hopping In Chapter 2 , c harged oligopeptides were demonstrated to be an effective channeling bridge between sequential enzymes. MD studies on G6P molecule interacting with a charged polylysine bridge found that th e G6P associates with two lysine side chains by dynamic exchange with surrounding water molecules, and surface diffusion was achieved through hopping between neighboring association sites ( Figure 3 - 1 ). Additionally, surface hopping was indicated by short - range coulombic energy changes between discrete levels ( Figure 2 - 3 a and Figure 3 - 5 a ). By counting these energy level changes, the hopping frequency, , can be calculated . MD studies over a temperature range ( Figure 3 - 5 b ) yields the hopping energy barrier, , via the Arrhenius equation: 54 3 - 5 where A is the frequency factor, is the gas constant and is temperature. The hopping mechanism involves significant intermolecular contact through hydrogen - bond interactions. In order to hop to the next association site, the G6P molecule dissociates from one LYS residue while still bonded to the other LYS as a swing arm. During this process, an energy penalty has to be overcome until G6P touches another LYS and thus reaches a new energy minimum. This is a Stern layer diffusion 95 that is less impacted by the ionic shielding from the bulk environment. As shown in Figure 3 - 5 c , despite ionic strength variation from 0 mM to 120 mM, the energy barrier remains fairly consistent at 12 ± 0.5 kJ/mol. Therefore, a value at 12 kJ/mol was used for all IS conditions in later KMC parameterization. 55 Figure 3 - 5 (a) Short - range coulombic diagram showing G6P hopping as indicated by energy fluctuation between single - and dual - association configurations. (c) Temperature dependence of surface hopping rate. (d) Ionic strength dependence on hopping energy barrier and f requency factor. 3.3.2 Desorption energy Desorption energy was calculated by Umbrella Sampling. 50 Figure 3 - 6 sum marizes the IS dependence study on the 1 - D energy profile near the peptide surface. The probability distributions in Figure 3 - 6 a indicate a sufficient overlap between neighboring windows, allowing an effective combination of each relative energy profile s . Figure 3 - 6 b shows the energy profile Theoretically, a desorbing intermediate molecule has to traverse the double layer near the charged oligopeptid e, comprising the above mentioned Stern layer and a diffuse later created by long - range electrostatic interactions. At low ionic strength, the energy drop can be well separated into Stern layer and diffuse layer components , as shown by the blue and orange arrows in Figure 3 - 6 b . With increasing ionic strength, the long - range electrostatic interaction is gradually screened out and 56 the corresponding energy region becomes a plateau just beyond the Stern layer . In contrast, the Stern layer potential well is less impacted by changes in ionic strength. This agrees well with the ionic strength independen ce study on hopping energy barrier Figure 3 - 5 . Figure 3 - 6 (a) Biased probability distribution of G6P molecule in each window. (b) 1 - D potential of mean force by Umbrella Sampling. The points shows the biased energy minimum for each sampling window. From Figure 3 - 6 b, desorption energy, , was calculated by taking the difference between the energy minimum and the energy average between 1.6 and 2.0 nm. Consequently, hopping of an adsorbed intermediate molecule requires a climb up the IS - independent energy hill (dashed line in Figure 3 - 6 b) while simultaneously having a probability to reach a desorption state 57 that is dependent on bulk ionic strength. Assuming a Boltzmann distribution, the leaking probability of each individual hop depends on the difference between and : 3 - 6 At high IS, the leaking probability is 1/16, which means one out of 16 hops results in desorption. However, the low - IS leaking probability is calculated as 1/189, an order of magnitude lower than the 120 mM case. This means the hopping is much less prone t o desorption at low IS and the intermediate is more likely to traverse the bridge. Table 2 summarizes the energy constants calculated by MD simulation s , which will be used to parameterize the KMC model . Table 2 Energy barriers and corresponding rate constants. IS mM nm kJ/mol kJ/mol kJ/mol 0 9.8 12 25.5 13.5 189 20 2.2 12 24.9 12.9 146 40 1.6 12 23.85 11.9 99 70 1.2 12 21.7 9.7 43 120 0.9 12 19.14 7.1 16 IS: ionic strength : D ebye length , : energy barrier and rate constant for hopping from one dual association site to neighboring site on LYS bridge , : energy barrier and rate constant for desorption from one dual association site = - : energy difference from hopping transition state level to desorption level 58 3.4 Summary The G6P intermediate molecule was found to undergo a surface diffusion mode under the protection of the electric double layer created by the charged peptide surface. Specifically, surface hopping occurs within a Stern layer due to local hydrogen bond interaction that is less impact by the ionic environment of bulk media. The actual value for hopping energy barrier approximate the energy difference between a dual and si ngly association mode. Above Stern layer , the long range electrostatic interaction creates a diffuse layer to further protect the surface diffusion. However, this layer could be largely shielded at high ionic strength. Collectively, t he overall leaking pro bability for each individual hop depends on the energy difference between the hopping transi tion state and desorption state . Th is detailed quantification on thermodynamic parameters not only further explore the surface diffusion mechanism, but also give a reasonable support for the rate constants that can be used to parameterize KMC model. 3.5 Acknowledgements The analysis of umbrella sampling in this chapter is under the help of my committee member, Dr. Alex Dickson. My adviser, Dr. Scott Calabr ese Barton , gre atly contribute to the build - up of the python code for KMC simulation. 5 9 Chapter 4 Transition Pathway from Bridge to Binding Pocket * 4.1 Introduction N atural cascades typically include 2D channeling surfaces between sequential active sites. 18,28 Similarly, t he area between the artificial bridge and its enzyme binding site is a flexible 2D surface with multiple pathways and corresponding leakage. Figure 4 - 1 shows a proposed As discussed in Chapter 2 and Chapter 3 , the surface diffusion is a stochastic process on an energy landscape with discrete potential wells. As compared to the 1D hopping on the peptide bridge , with random motion either forward or backward, 2D surface diffusion includes multiple pathways over a network of hopping site s . As a result, the overall leakage cannot be simply estimated by thermodynamic parameters. Transition pathway analysis using a Markov state model (MSM) is a good approach to analyze this process. 51,53,54,57,96 MSM is widely used in the area of protein folding and lig and unbinding, 53,56,57,9 7,98 where the transition pathway is complicated by spatial configuration with multiple degrees of freedom ( e.g. , inter - atomic distances and dihedral angles ) and highly interconnected states of the system . Using detailed configuration discretization and combin ing states with related physical meanings ( e.g. , desorption or binding states), MSM converts multi - dimensional molecular dynamics transition s to elementary pathways, allows the selection of dominant and long - timescale pathways, and finally human readable pathways. 57,99 As a result, * The content of this chapter has been published on ACS Catalysis as a full paper. The work on molecular docking in this chapter is in collaboration with Dr. Ivana Matanovic and Dr. Plamen Atanassov at University of New Mexico Y. Liu, I. Matanovic, D. P. Hickey, S. D. Minteer, P. Atanassov and S. Calabrese Barton, "Cascade Kinetics of an Artificial Metabolon by Molecular Dyn amics and Kinetic Monte Car lo" , 8 , 7719 - 7726 (201 8 ) . doi : 10.1021/acscatal.8b01041 60 the energy landscape 53,98 and committor probability 57 to minimum - energy basi ns can be calculated. Figure 4 - 1 Proposed channeling pathway from the last dual association site on poly - In this chapter, we further map the channeling pathway between the bridge and intermediate active site of G6PDH using a pr obability analysis based on transition state theory. This result is then validated by Transition Path Theory (TPT) with a Markov state model . Th i s , transition pathways , and energy barriers in channe 4.2 Model Description Basic MD parameters follow the settings as discussed in Chapter 2 . 4.2.1 Selection of complex structure The crystal structure of G6PDH from Saccharomyces cerevisiae (experimental enzyme) is not yet available. 33 In order to obtain a relevant enzyme crystal structure, a comparison was made between th is experimental G6PDH and the four species with available crystal structure ( Table 3 ). Due to charge dissimilarity, Mycobacterium Avium G6PDH ( Ma_G6PDH ) and 61 Leuconostoc Mesenteroides G6PDH ( Lm_G6PDH ) were ruled out. The secondary cofactor binding site of human G6PDH (Homo sapiens G6PDH) structure is very different from yeast G6PDH (Saccharomyces C erevisiae G6PDH) used in experiment . Moreover, Trypanosoma Cruzi G6PDH ( Tc_G6PDH ) has multiple CYS linker sites and available complex structure with both substrate and cofactor readily adsorbed. Therefore, Tc_G6PDH was selected to study the hopping from bridge to downstream enzyme active site. CYS - 528 was selected as the linker site, as it was fully solvent exposed and close to the G6P binding site. 4.2.2 Hybrid topology at enzyme/linker interface I n order to conduct MD simulation of the bridge - G6PDH complex, force field parameters are required for all atoms and bonds . Generally, as discussed in section 2.2 , CHARMM36 all - atom force field 73,74 provides the topology for all standard residues, such as protein a nd DNA component s . In addition to this , CGenFF (CHARMM General Force Field) 75 can be used to calculate parameters for arbitrary , non - standard small molecules, such as ligand s . Here, these two database s are combined to generate topology for the interface between the bridge and its linker 33 and standard peptide/enzyme. As shown in Figure 4 - 2 , the linker molecule was capped with CYS residu es on each side using ChemAxon Marvin Sketch . Then, the topology of whole complex was generated by CGenFF. The interface topology (bonds, angles , dihedrals) was used for the entire complex in MD simulation. The independent standard and non - standard section s were generated by CHARMM and CGenFF, respectively. The capped linker Q&A of CGenFF we b page: https://mackerell.umaryland.edu/~kenno/cgenff/faq.php#hybrid 62 structure in Figure 4 - 2 was applied to the HK - bridge segment and similar approa ch was also used for bridge - G6PDH interface. Table 3 Comparison of e xperimental G6PDH 33 and available crystal structures. Enzyme Sc_G6PDH Hs_G6PDH Tc_G6PDH Ma_G6PDH Lm_G6PD Organism Saccharomyces C erevisiae Homo S apiens Trypanosoma Cruzi Mycobacterium Avium Leuconostoc Mesenteroides Species Eukaryote yeast Eukaryote human Eukaryote parasite Prokaryote bacteria Prokaryote bacteria Similarity a -- 48% 49% 34% 35% Similarity + b -- 64% 64% 53% 54% Total charge - 4.6 - 1.1 1.1 - 18.4 - 20.5 PDB index none 2BH9 5AQ1 4LGV 1E7Y Ligands c -- NADP/G6P NADP&G6P no NADP&G6P CYS number single multi multi single no CYS proximity d good medium very good medium -- a Similarity: residue similarity to Sc_G6PDH; b Similarity +: positive similarity to Sc_G6PDH; c Ligands: substrate and cofactor availability in crystal structure; d CYS proximity: the proximity of CYS residue to G6P binding site. To stabilize the complex, the terminal HK CYS of the bridge was first pulled away from the position - restrained G6PDH, extending the LYS bridge. Subsequently, the HK CYS side was 63 position restrained while the res t of complex was allowed to stabilize over a 100 ns MD simulation. The resulting complex was used to initialize probability analysis. Figure 4 - 2 Chemical structure of linker molecule capped with CYS residue on each side, CYS - (BM - (PEG) 2 ) - CYS. The whole complex structure was processed by CGenFF to obtain the topology parameters for the interface between standard residue segment shaded by blue color and the non - standard residue segment shaded in red color. I n a ctual MD simulation, the topology for pure standard residue segment was taken from CHARMM , and CGenFF parameters were applied to the pure non - standard segment and the interface topology . 4.2.3 Intermediate initialization on complex structure To choose a G6P rele ase point, molecular docking was used to identify favorable binding areas between the LYS bridge and G6P binding pocket that could be ruled out as leading to long - term adsorption, as shown in Figure 4 - 3 . The docking simulations were performed using AutoDock Vina 100,101 by Dr. Ivana Matanovic working at Dr. Plamen Atanassov s group at University of New Mexico. Docking of G6P to G6PDH in the presence of the LYS bridge was performed multiple times using a range of areas ( Figure 4 - 3 a) between LYS bridge and the G6PDH active site. As a result, the rest of the area between the bridge and the active site was considered as a potential transition state area, where the position of G6P can be adjusted to find the points as shown in Figure 4 - 5 . Subsequently, 500 parallel simu lations of 2 ns duration were conducted with velocity regeneration at each ionic strength (0, 20, 40, 70, 120 mM). At 10 ps per frame, these 64 simulations generated 10 5 frames per ionic strength, to be used in analyzing the channeling behavior. The identific ation of starting points will be described in Section 4.3.1 . Figure 4 - 3 The results of molecular docking simulations f or the binding of G6P to G6PDH (circled positions) in the area between LYS bridge a nd G6P binding pocket on G6PDH. (a) Search boxes. (b) Several favorable binding sites were seemed as potential kinetic trapping spots in short MD simulation and were avoid in initializing the parallel simulations. 4.2.4 Markov State Model on transition pathway o n flexible 2D surface P ython package MSMBuilder 102 is used to conduct the MSM analysis in this work. 65 In order to analyze the trajectory of intermediate molecule on a flexible 2D surface, the atom coordinates in each 10 5 - frame MD trajector y were featurized into a vector with representative coordinates , : 4 - 1 where is time step , complex surface and Python package MDAnalysis 103,104 is used for the trajectory featurization. The actual distance to a reference surface was calculated by taking the distance between the COM of intermediate molecule and the closest atoms of reference group . specific issues to be studied, t he dimension of can be increased by adding features of interest such as dihedral angles, distances to charged moieties , etc., to reveal more detail in channeling process. After trajectory featurization, the original vectors, , was grouped into 500 clusters/states based on their conformation al similarity. As a result, the trajectory of vect ors is converted into a trajectory of 500 states , : 4 - 2 Then, a Markov State Model , , is built by counting the transition between each state in : 66 4 - 3 where is the time interval used to count the transition. Specifically, represent the number of counts for the system to go from state to state during time interval, . The the count matrix is converted to a transition matrix according to the following equation: 4 - 4 This makes each row of the tra nsition matrix , , sum to one. Figure 4 - 4 shows an example of 500×500 transition matrix, where 4.25 % of the element s are non - zero. This means that the transition matrix is usually sparse, because mutual connections only exist among directly connected states. Figure 4 - 4 An example of transition matrix, where white color stands for zero element and blue color stands for non - zero items. In this case, there are 10,625 non - zero elements out of the total 250,000 items, resulting in a ~ 4.25% occupancy of the matrix. 67 Finally, three basins are defined to calculate the committor probabilit y with Python package CSNAnalysis . The energy basins were defined as: 1. The bridge basin , including clusters where the G6P phosphate group lies within 0.3 nm of the peptide bridge surface. 2. The pocket basin , of clusters where the G6P phosphate group lies within 0.3 nm of the center of the G6PDH binding site. 3. The desorption basin , in which the G6P phosphate group is more than 2 nm away from the complex surface. 4.3 Result s and Discussion 4.3.1 Channeling p rocess Unlike the LYS bridge with uniform potential wells (and thus precisely defined energy terms), the configuration of the peptide bridge with respect to G6PDH is flexible due to its single bond connection ( Figure 4 - 1 ). Additionally, no single reference group can be defined along the multiple trajectories between the bridge and binding pocket; therefore, the leaking probability in this region cannot be easily assessed by umbrella sampling. Here, a direct probability measurement was conducted by MD to explore the leakage and estimate a corresponding energy barrier. By combining equation s 3 - 5 and 3 - 6 , the rate constant ratio, / , and probability ratio, / , are equal and can be related to the energy difference, , according to : CSNAnalysis : https://github.com/ADicksonLab/CSNAnalysis 68 4 - 5 and refer to the hop across the energy on G6PDH. The most direct approach to measuring / would be to follow the MD trajectory of the G6P molecule, starting from either the LYS bridge or G6PDH binding pocket, and track its probability to arrive at the other site. However, because of the deep potential well at each of those sites, such traversa ls are extremely rare on MD time scales. Alternatively, if the intermediate is initialized at a high - energy transition state ( Figure 4 - 5 point c), it should have equal probability to reach the peptide bridge , and the G6PDH pocket, . There also exists a finite probability, , of desorption to the bulk, depending on the energy difference between transition and desorption state, . More g enerally, the G6P molecule may be released near the transition state ( Figure 4 - 5 points b or d), where and are not equal but comparable. Gi ven point d as an example, indicates the shift from transition points to enzyme pocket. Here, as well as the desorption probability, , can be related to their respective energy barriers, and , by an Arrhenius expression: 4 - 6 69 This expression suggests generally that wherever the release point, the ratio of hopping and le aking probabilities is given by , where is the energy difference between a perfect transition point (point c in Figure 4 - 5 ) and the bulk energy level . Therefore , in order to calculate the value of , the G6P molecule can be released from a nearby region ( Figure 4 - 5 points b or d) , instead of finding a single perfect transition state (point c) . Figure 4 - 5 Simplified energy profile from bridge to G6P binding site on G6PDH. Point a is the last dual association site on peptide bridge; point c is the transition state at the energy barrier between bridge and G6PDH; point b and d are the two states slightly devi ating from the transition state. Probability analysis was conducted using a partial complex, including G6PDH (PDB: 5AQ1, Table 3 ), the LYS bridge and only one interfacial CYS residue of HK, as shown in Figure 4 - 6 a. The HK CYS was position - restrained to mimic the existence of the HK biomolecule. Usin g the configuration of Figure 4 - 6 a as the initial frame, the velocity of all atoms was regenerated and then equilibrated for 1 ns (position restraint o n all G6P and complex atoms). After that, 5 00 parallel MD simulations (2 ns) were conducted with position restraint only on the HK CYS residue. 70 Figure 4 - 6 (a) System configuration when G6P was released aro und point c in Figure 4 - 5 . (b) System configuration showing the distribution of G6P intermediate (green dots) in 500 parallel simulations. The complex structure is the 0 ns frame and the G6P molecules position are tak en from the last frame of each parallel simulation. The coordinates are rotated to make the G6PDH secondary structure fit that of first frame. Probability was calculated for three outcomes of G6P molecule: reaching the peptide bridge ( p br ), reaching the G6 PDH pocket ( p poc ) or desorbing into the bulk ( p des ). Each outcome That is, when phosphate group was within 1.2 nm (short range cut - off) to the LYS bridge surface or the G6P binding pocket, it was assumed to reach the destination. If G6P molecule was 1.2 nm 71 away from the whole complex, it was assumed to desorb. Cases where G6P was located elsewhere on the complex surface after 2 ns (~20% of all simulations) were consid ered incomplete channeling, and were not included in the calculation. Finally, the initial position of G6P was slightly adjusted until p br is comparable to p poc . Figure 4 - 7 shows that the probability became fairly consistent when the simulation was repeated more than 200 times. Therefore, all the probability results in the main text were calculated from 500 parallel simulations. Figure 4 - 7 Probability results as a function of number of parallel simulation s . T he resulting probability depends on the intermediate releasing point. In this set of simulations, G6P molecule was released at a single point around the perfect transition state area, and then parallel simulations was conducted to study the co nvergence of probabilities . Using the release point yielding comparable values of and , a IS dependence study was conducted via 500 parall el simulations. For example, at IS=20 mM ( Figure 4 - 8 ), , and were found to be ~53.5%, 43.4% and 3.0%, respectively, resulting in a lea king probability 7.14%. Using this value in equation 4 - 6 , we calculated the resulting value to be 6.9 kJ/mol. As compared to the bridge, where 13 kJ/mol ( Table 2 ), traversal from the bridge to G6PDH is less protected from leaking into the bulk. 72 Figure 4 - 8 (a) Ionic strength dependence of leaking probability, showing the probability conducted to sample the rare leaking event. (b) The ration of desorption and channeling events in Fig.3b. Specifically, the energy difference between transition state and desorption level was calculated from the resulted probability according to Figure 4 - 5 , Figure 4 - 8 and equation: 4 - 7 Assuming a uniform bulk energy level, can be used to correlate the transition state energy level to and on LYS bridge, as shown in equation below: 4 - 8 4 - 9 Therefore, a complete energy profile can be discerned between the LYS bridge to the transition state to G6PDH binding pocket. Figure 4 - 8 b shows different value as a function of ionic strength, indicating increased leakage under a more concentrated ioni c 73 environment. It should be noted that a high energy barrier tends to keep the G6P molecule on the bridge, increasing the residence time of intermediates on the bridge. Below we show that this energy barrier results in further leakage from the bridge. As for the rate constants, since all hopping and desorption events on LYS bridge can be correlated to the last dual association site on peptide bridge, , , and are assumed to have the same frequency factor. Similarly t o the KMC parameterization on peptide bridge ( Table 2 ), rate constant ratio from bridge to E2, , was calculated from probability ana lysis. 4.3.2 Transition pathway analysis Given the fact that channeling on 2D surface is complicated and there could be multiple pathways, Transition Path Theory is conducted via Markov state model to better understand the channeling process and also validate th e probability analysis in above section. Figure 4 - 6 b shows final position s relative to bridge - G6PDH complex in 500 parallel simulations . In spite of one frame from each 200 - frame trajectory, the area between peptide bridge and binding pocket was already fully covered, even including the desorption case. Therefore, the local ergodicity is good enough to build a Markov state model. Figure 4 - 9 shows the network of MSM states for the 0 mM and 120 mM case in Figure 4 - 8 a. Specifically, the node size corresponds to its stationary populations and the node color stands for its committor probability to one of the pre - defined basins. For example, the states belonging to bridge basin has a 100% co mmittor probability to this basin and thus has a pure blue color. Similarly, the pure cyan nodes have an equivalent 50% probability to pocket and 74 bridge basins. The MSM result s indicate separate energy basins of peptide bridge and G6P binding site. But these two basins are connected via a highly interactive nodes that correspond to the transition area in Figure 4 - 5 and Figure 4 - 6 a. It is obvious that neither of these two basins has a good connection to the bulk, because there is a large kinetic trap due to the charge density and strong local h - cyan ) shows a relative frequent interaction to desorption states. This ag rees well with the leakage when G6P hops from peptide bridge to its binding site on G6PDH. At 0 mM ionic strength ( Figure 4 - 9 are strongly connected by the cyan states in between, and very little network was observed to the state, as indicated by the larger red area and more app arent purple /yellow area s . This give a good qualitative visualization on the impact of ionic strength. In order to visualize the transition probability from bridge states to pocket or bulk states, the bridge was not defined as a basin when calculating the committor probabilities, as shown in Figure 4 - 10 for the ori Figure 4 - 9 a and b is removed in Figure 4 - 10 , and the green and red spread to and paint the original blue states as circled by dashed oval shape. from yellow, indicating that the intermediate on bri dge was more like to go to the binding pocket (green) instead of being lost in to bulk media (red) . However, when comparing the 0 mM and 120 mM ionic strength , more yellow can be observed in original bridge states in Figure 4 - 10 a , demonstrating more leakage occurred with increasing ionic strength. This agree s well with direct probability analysis ( Figure 4 - 8 ) and the three - basin committor probability result ( Figure 4 - 9 ). 75 Figure 4 - 9 Transition pathway visualization at 0 mM (a) and 120 mM (b) . The node size is based on the stationary populations of each state. The colors indicates the three - basin committor probabilities to the pre - defined basins of desorption states ( red ), peptide bridge states ( blue ) and binding pocket states (green). (c) Triangle color bar. (d) Triangle color bar with grid and probability lab els. 76 Figure 4 - 10 Two - basin committor probabilities for t ransition pathways visua lization at 0 mM (a) and 120 mM (b) . As compared to Figure 4 - 9 defined when calculating the committor probabilities , but the corresponding states were still recognized as an energy basin. In this way the probability to bulk/po cket states from (c) Triangle color bar. (d) Triangle color bar with grid and probability labels. In order to quantitatively compare the direct probability analysis and MSM result, the overall leakage was calculated. B y s tudying the states with equal committor probability to in Figure 4 - 9 probability ratio in Figure 4 - 8 b, making a more reasonable as illustrated in Figure 4 - 5 . 77 and ) were extracted, and then only the states representing surface interactions were kept, in Finally the overall desorption probability, , was calculated by the summation of weighted desorption probability at each states: 4 - 10 where is the weight of each selected state calculated from their stationary populations in MSM. Table 4 summarizes all energy values, key rate constant ratios and key probability ratios. With the value from probability analysis and assuming a uniform bulk energy level (equation 4 - 8 ), the hopping energy barrier from bridge to G6PDH can be calculated From Eq. 4 - 9 . Given the ionic strength a t 120 mM as an example, the minimum for these MSM clusters are 30.3% ( Table 4 ), comparable to the 31% in Figure 4 - 8 b. 78 Table 4 Energy barriers, related rate constants and cascade kinetics. TST TPT IS mM nm kJ/mol kJ/mol kJ/mol kJ/mol 0 9.8 27.0 8.5 17.0 16.6 7.24 18 20 2.2 14.4 6.9 18.0 8.7 5.56 19 40 1.6 10.5 6.1 17.7 11.2 6.22 18 70 1.2 7.4 5.2 16.5 6.1 4.66 17 120 0.9 3.3 3.1 16.0 3.3 3.26 16 IS: ionic strength : D ebye length : desorption probability for hopping from last dual association site to G6P binding pocket on G6PDH : energy difference between transition state level and desorption level, when hopping from last dual association site to G6P binding pocket on G6PDH , : probability and energy barrier for hopping from last dual association site to G6P binding pocket on G6PDH Figure 4 - 11 makes a graphical comparison between TST and TPT methods in terms of desorption probability and the energy difference between transition ar ea energy and bulk energy level. From both plots, the TST and TPT results generally agree well, giving a good quantitative impact of ionic environment on the channeling from bridge to G6PDH binding site. Therefore, the probability analysis in above section is reasonable to parameterize KMC model to quantify the cascade kinetics. 79 Figure 4 - 11 Comparison between the leakage calculated by transition state theory (TST) and transition path theory. (a) L eaking probability represented by desorption probability and the probability to reach the G6PDH binding pocket. (b) Energy difference between transition state to bulk environment, corresponding to in Figure 4 - 5 . 4.4 Summary - lysine peptide bridge to its binding site on G6PDH is studied by a direct probability measurement via MD simulation. By releasing the G6P molecules at a transition state area with relative high energy level, three event probabilities ( p br , p poc , p des ) are used to calculate the energy difference between this state and bulk energy level. Then, this energy difference is correlated to the desorption energy on peptide bridge, in order to complete the energy pr on G6PDH. Transition pathway analysis by MSM further elucidate s the details of this process . Discrete potential basins are clarified by the visualization of MSM result s , and th e transitions states in between (cyan color) shows more propensity to desorb to the bulk as compared to arriving at the bridge and pocket basins. The leaking probability by committor probability agrees well with the direct probability analysis in Chapter 3 . These probability results are used to e s timate the kinetic parameters for the KMC model in Chapter 5 . 80 4.5 Acknowledgements The molecular docking in this chapter was conducted by Dr. Ivana Matanovic from Dr. Model was suggested by , Dr. Alex Dickson. Also, my advisor, Dr . Scott Calabrese Barton make great contribution to the python code on topology hybridization. 81 Chapter 5 Cascade Kinetics by Kinetic Monte Carlo Method * 5.1 Introduction In previous chapters, thermodynamic parameters we re quantified in detail by molecular dynamics simulation combined with advanced sampling method s . In order to further reveal the potential limitations of channeling efficiency , quant it ative kinetics is greatly needed. A quantitative model should not only e ffectively cover the time and length scale s from micro - structure to experiment, but also fully consider the molecular complexity of the cascade topology . Reported work on this includes the analytical method with assumed channeling efficiency, continuum mod eling focusing on the charge migration under electric field and simplified molecular simulation on the probability for intermediate molecule to reach the second active site. By solving a set of equations based on mass balance, analytical methods or micro - kinetics are able to correlate the observed cascade kinetics to channeling efficiency , which can not be measured directly by current experimental techniques , and are usually estimated by computational simulations. 34 Continuum modeling represent s the intermediate distribution as a continuous field. W ith predetermined spatial factors and boundary conditions, the intermediate migration is governed by concentration gradient s and the electric field created by the cascade surface. 15,35 As a result, steady state flux can be calculated to determine the yield as a n indication of channeling efficiency. Recent molecular simulations were conducted under Brownian * The content of this chapter has been published on ACS Catalysis as a full paper. Y. Liu, I. Matanovic, D. P. Hickey, S. D. Minteer, P. Atanassov and S. Calabrese Barton, "Cascade Kinetics of an Artificial Metabolon by Molecular Dyn amics and Kinetic Monte Carlo" , 8 , 7719 - 7726 (201 8 ) . doi : 10.1021/acscatal.8b0 1041 82 dynamics, wherein the transport efficiency was estimated by the probability of simplified but explicit intermediate molecules to rea ch the vicinity (0.7 nm) of second active site. 29,30,44 This probability term was integrate d by the above - mentioned analytical approaches to quantify the cascade kinetics. However, these models are n ot able to adequately represent the surface interaction between charged and polarized intermediate species and channeling surface. Additionally, the quantification algorithm relies heavily on the estimation of channeling efficiency and ignore the complexit Because of its focus on sequences of events rather than stepping through time , Kinetics Monte Carlo (KMC) model has been widely used for simulating chemical reaction and diffusion on the lattice surface of inorganic catalysts. 41,59 63 B asic events include surface adsorption , desorption, hopping and reaction, which are common to i ntermediate channeling. Specifically, DFT parameterized KMC, also called first - principle KMC, is used to study the effect of catalyst composition, surface structure, lateral interactions, and operating conditions on the overall catalytic pe rformance of chemical ( e.g. , CO or NO) oxidation/reduction on catalyst lattice surface. 62 In addition, experimental ly based KMC was used to study the CO oxidation on RuO 2 surface, 60 and a comparison between DFT and experimental parameters was m ade to reveal the significant impact of lateral interaction of adsorbed species . 41 Besides quantum mechanics, a combination of MD and KMC simulations was also used to study epitaxial growth of fcc and hcp islands on fcc (111) surface , showing great simulation acceleration and a boundary pinning effect by adsorbed atom islands that is difficult to reveal by conventional KMC methods. 64 Another multi - level example is a hybrid of first - principle MD and KMC simulations, showing the impact 83 of rotational coupli ng of the side groups influencing the proton conduction in proton exchan ge membrane fuel cells (PEMFC). 65 In this chapter, we use the KMC method to quantify the cascade kinetics of the HK - G6PDH cascade . Parameterized by MD and experimental results, the KMC model was built to estimate product evolution on the experimental time scale. This model enables a direct comparison between simulations and experiments, focusing on pre - steady state product evolution. These studies build a detailed quantitative approach, enabling us to further elucidate the range and limitati on s of electrostatic channeling. 5.2 Model De scription 5.2.1 Kinetic Monte Carlo model Figure 5 - 1 shows the tw o - step KMC model for cascade reactions: 5 - 1 5 - 2 As shown in the figure, the two active sites (E 1 and E 2 ) are connected by several discrete hopping sites. On each site, rate constants for all possible events ( e.g. , , , , , ) were assigned explicitly. All sites were allowed to exchange intermediate with the bulk environment. Given the fast diffusion rate of G6P (~10 - 5 cm 2 s - 1 ) as compared to the turnover frequency (TOF) of active site (~ 0.01 - 0.1 s - 1 ), the G6P was assumed to diffuse immediately into a homogeneous bulk media once it left the cascade surface. Therefore, the bulk environment was 84 only represented by a changing intermediate concentration, which will in turn impact the adsorptio n rate on active sites. MD simulation of intermediate in the binding pocket is complex and usually handled by more detailed and advanced sampling techniques such as Markov State model 57 . Therefore , it is out of the scope of this work. To simplify the KMC system, h opping was assumed to be reversible between bridge sites, but was irreversible from E1 to bridge and bridge to E2. Reversible hopping on and off bridge is of potential interest for future work. Figure 5 - 1 Schematic diagram of kinetic Monte Carlo model. E1 and E2 are two active sites. Sites 1 - 4 are discrete hopping sites representing the dual association sites on peptide bridge. Bulk are the environment wit h changing intermediate concentration. is the Michaelis constant and values are the rate constants for various event s , including for turnover frequencies, for desorption from bridge, for hopping on bridge, for hopping from en zyme - 1 to bridge, for hopping from bridge to enyme - 2. A zero value was given if an event was disallowed. Figure 5 - 1 indicates all allowed events on each site. The actual rates , , in each KMC step were calculated by taking the product of rate constant , , and the occupancy of current and neighboring sites , and , represented by either 1 or 0 . As mentioned above, bulk concentration , , was involved in this rate/occupancy product when calculating the adsorption rate . Each specific calculations can be seen as follows: 5 - 3 85 5 - 4 5 - 5 5 - 6 Having assigned all rate values ( r 1 , r 2 r n ) to all available sites, a random number ( ) is generated between 0 and 1, to select a specific event according to : 36,60 5 - 7 where is the summation of all values in current KMC step. The corresponding event was then executed and the time evolution, , was calculated by : 5 - 8 where is another random number. After execution, the occupancy and rate values were updated accordingly. Then, KMC simulation entered a loop until the time reached 1000 ns (steady state product evolution from E2). From the time course of product evolution, lag time, , was calculated by extrapolating the 500 - 1000s segment back to the time axis. Detailed discussion of this is provided in section 5.3 . In each KMC simulation, 100 parallel cascades were employed to enhance the event sampling and reduce the uncertainty of intermediate concentration. Normally, each KMC simulation took less than 10 7 steps, depending the parameters of KMC simulation. Finally, 5 - 10 86 parallel KMC simulations were performed to evaluate the error in lag time estimation . Table 5 shows the parameters for KMC simulation. Table 5 KMC parameters. Constant Value , cascade concentration / mol L - 1 8*10 - 9 , c ompartment volume / L 2.07*10 - 14 , concentration of substrate for E1 / mol L - 1 2 , TOF on E1 / molec s - 1 0.7 , desorption rate on E1, s - 1 0.07 , adsorption rate on E1, s - 1 M - 1 7.7*10 5 , Michaelis Constant of E1, mmol L - 1 10 - 2 , hopping rate from E1 to bridge, s - 1 7*10 15 , TOF on E2 / molec s - 1 6.2 , desorption rate on E2, s - 1 0.62 , adsorption rate on E2, s - 1 M - 1 1.26*10 6 , Michaelis Constant of E2, mmol L - 1 5.4*10 - 3 , hopping rate on bridge, s - 1 , desorption rate on bridge, s - 1 , hopping rate from bridge to E2, s - 1 Specifically, turnover frequencies ( TOFs ) on E1 ( ) and E2 ( ) was calculated by fitting experimental data for steady state product evolution of a fully saturated HK and 87 G6PDH. 33 The Michaelis constant for E2, , was calculated from the experiment lag time of a free - standing system, , according to following equation 105 , where [I] is the intermediate concentration at steady state. 5 - 9 Specifically, the in Figure 5 - 1 is a combination of , and , as show in following equation . In order to minimize the leakage of a readily ch anneled intermediate, substrate desorption rate constant on each active site, and , was taken as 1/10 of the and , respectively. The adsorption rate constants, and , were calculated according to: 5 - 10 As mentioned above , 100 parallel cascades were simulated within a single common bulk environment. The total volume , , for these simulations was 21 fL, based on the exper imental conce ntration of enzymatic cascade, [E1 , E2 ], at 8.9 nM: 5 - 11 As a result, the concentration of leaked and hopping intermediate can be calculated for each event . In this way, the degree of leakage and hop ping could be compared directly with product and intermediate evolution. E1 was assumed to be saturated in glucose substrate. 88 R ate constants for channeling on the bridge and E2 were obtained from MD results. The h opping rate (~ ns - 1 ) on the peptide bridge is orders of magnitude higher than the TOF on each active site (~ s - 1 ), such that enzyme turnover is rate limiting. Therefore, the channeling process was actually governed by equilibrium ratios of transport rates, rather the absolute values of each transport rate constant. In the KMC simulation, was set at only two orders of magnitude higher than , in order to improve simulation efficiency. Meanwhile , values of and were varied until their ratios to were equal to th ose of MD simulation results in Table 2 and Table 4 . This guarantees the leaking probability for each individual KMC event is the same as the dynamic behavior observed in MD. was set to an infinite large value ( ) to make sure the intermediate goes to bridge immediately after it is produced on E1. For the IS - dependent study, was varied until its ratio to was equal to that of MD results, as shown in Table 2 , Table 4 and Table 5 . This Kinetic Monte Carlo model was built in Python 101,106 , and it is available at the repository in APPENDIX D . 5.3 Results and Discussion KMC result for singe enzyme kinetics agree perfectly with Michaelis - Menten kinetics and details are provided in APPENDIX E . 5.3.1 Cascade Kine tics by KMC model KMC simulation was applied to the cascade model as shown in Figure 5 - 1 . As discussed in section 1.3 and section 2.3.4 , stop - flow lag time analysis is a widely - used experiment al method to evaluate channeling efficiency. 7,33 Figure 5 - 2 shows an example of product evolution 89 simulated b y KMC that is comparable to a stop - flow experiment, and allows estimation of the lag time, . In this figure, four types of product evolution are presented, including non - channeling , perfect channeling and two leaking cases (K5+E2 and K15+E2). As indicate d by the curve slop e in Figure 5 - 2 a, the reaction rate reaches a steady state value at the second half of the simulation. Additionally, the time course of bulk intermediate in Figure 5 - 2 b gives a more direct visualization on such time period. That is, the intermediate concentration reach es a plateau level after 400 sec, when this concentration allows the second active site to reach a reaction rate equal to that of its upstream active si te. Therefore, by extrapolating the 500 - 1000 ns curve Figure 5 - 2 a to its time axis, the lag time can be calculated to compare with experiment results. Figure 5 - 2 (a) A example of simulated stop - flow lag time analysis, showing the product evolution under different channeling conditions. (b) The evolution of bulk intermediate. In previous work, an analytic al expression was employed to estimate lag time based on channeling efficiency. 34 In contrast, the KMC model is capable of evaluating product evolution on experimental time scales starting from the pre - steady state, thereby allowing direct comparison with experimental results. Moreover, the contribution of any elementary step can be 90 tracked over the ent ire simulation process, as exemplified by the bulk intermediate in Figure 5 - 2 b. 5.3.2 Quantitative comparison with experiment via ionic strength dependence Figure 5 - 4 a demonstrates a comparison between experimental lag time, , and KMC lag time, , as a function of ionic strength . Here, we assume 1 00 % hopping efficiency from HK to the LYS bridge , and transport from the bridge to E2 is fast with no desorption allowed . Bridge lengths of 5 LYS residues (K5) and 15 LYS residues (K15) are represented both in experiment and simulation. These polypeptides offer 4 and 14 dual - association sites, respectively. A t low IS and short channeling distance (K5), both and are much lower than that of free (uncoupled) enzyme. Based on a random walk model, the expected number of hops required to traverse the bridge is the square of the bridge site number, . The leaking probability, , is the expected probability of des orption during traversal and can be expressed as: 5 - 12 91 Figure 5 - 3 Plot of leaking probability based on equation 5 - 12 . As shown in Figure 5 - 3 , was estimated to be 8.1% for K5 at IS=0 , which means that electrostatic channeling is 92% successful under these conditions. At higher IS, the energy barrier to desorption decreases ( Figure 3 - 6 b), and becomes 62.1%, resulting in an increased lag time. Similarly, intermediate leakage increases for a longer channeling pathway. At the extreme case, for IS= 120 mM and =14, desorption probability, , approaches 100% and both and revert to values for the free - standing system. T he lag time for K15 (~55 - 120 sec) is consistently higher that of K5 (~ 20 - 60 sec). and leakage is plot ted for K5 and K15 in Figure 5 - 4 b and c. With more hopping sites and thus a longer channeling pathway, the K15 system requires more hops that its K5 c ounterparts. As a result, a leakage increase was observed in Figure 5 - 4 c. These agree well with the analytical result in equation 5 - 12 . Therefore, the individual leaking probability and channeling distance must be minimized in an effective cascade. 92 Figure 5 - 4 (a) Comparison of experimental lag time and KMC lag time. K5 and K15 only consider the leakage on LYS bridge. K5+E2 and K15+E2 involve the leakage from bridge to G6PDH pocket. (b) Time course of surface hopping on bridge. (c ) Time course of leakage on bridge. Based on MD simulations on the channeling from bridge to pocket , leakage between the bridge and G6PDH is included, which is depicted as K5+E2 and K15+E2 plots in Figure 5 - 4 a. The result of leakage between the bridge and E2 is a decrease in channeling efficiency and an increase in lag time. With the existence of leakage from bridge and active site, increased to v alues comparable to experimental results for both bridge lengths. As discussed above, the relative high energy barrier between bridge and G6PDH pocket not only increases the downstream leakage at this site, but increases the residence time of G6P on LYS br idge, and therefore increasing the probability of desorption. As shown in Figure 5 - 4 b, bridge hopping times for both K5+E2 and K15+ E2 were higher than that of K5 and K15 over the entire simulation, which resulted in further leakage on the channeling pathway ( Figure 5 - 4 c). The resulting IS dependence of lag time agrees well with experiment results, particularly in terms of the slope. Inclusion of hopping between the bridge and enzyme site in the KMC 93 model generally increases the calculated lag times, bringing them closer to experimental val ues. Additionally, whereas experimental and KMC results for the K15+E2 case compare well, there is some discrepancy between experimental and KMC results for the K5+E2 case. This can be explained by additional leakage from HK to the LYS bridge ( in Fig . 4a), which is not studied in this work. Such leakage is expected to affect lag times for the shorter K5 bridge more, because the longer K15 bridge itself exhibits greater leaking into the bulk. 5.3.3 Model extension to multistep cascade In order to broaden the application of KMC , the present model can be extended to a three - step cascade : 5 - 13 where E1, E2 and E3 are three sequential active sites. Given reaction on E1 as an example, is the substrate for E1 , and is the product of E1 and also the substrate for E2. By assuming a same kinetics for the second site (E2) and third site (E3), the product evolution for the free standing s ystem is shown in Figure 5 - 5 a . KMC result show that the lag time of E3 is almost two times that of E2, because the two bulk intermediate species concentrations must be generated from E1 . This model extension shows a potential of KMC model to study more complicated cascade system s . 94 Figure 5 - 5 Product and intermediate evolution in three step KMC mode l (Free standing system). All parameters for bridge - 2 and enzyme - 3 are the same with that of bridge - 1 and enzyme - 2, respectively. 5.4 Summary Using thermodynamic and kinetic parameters derived from molecular dynamics studies, the KMC model enables direct compa rison with stop - flow lag time analysis, by evaluating the product evolution over the entire experimental time scale, particularly at pre - steady state. Moreover, it reveals several key parameters limiting overall cascade kinetics. Specifically, the lag tim e depends on the overall leakage of intermediate molecules, a result of joint action of the hopping and leaking probability on the bridge . A high IS environment tends to increase the desorption probability of each random hop, and a longer hopping pathway i s found to dramatically decrease the likelihood that intermediates traverse the cascade surface. At given ionic strength, therefore, the length of channeling pathway and the strength of surface interaction should be balanced to achieve an efficient intermediate transport between sequential active sites. Parameterized by these energy terms, a KMC model is utilized to evaluate the impact of the leakage between peptide bridge and active site. By including such leakage, the predicted KMC lag time matches better with experiment results. Detailed track ing of event evolution 95 reveals that the energy barrier in this area not only resulted in leakage on the enzyme , but more likely push es the intermediate back to the peptide bridge. This leads to longer retentio n time and thus more desorption on LYS bridge. Unlike the well - defined energy terms and channeling behavior on the peptide bridge, the configuration of singly bonded peptide and G6PDH need to be further sampled by advanced techniques. In addition, the ener gy barrier between LYS bridge and G6PDH pocket further increases the likelihood of downstream leakage. This energy barrier should be carefully considered, because it not only caused leakage by itself but also exacerbated the leakage on channeling bridge. T he leakage from HK to LYS bridge is also a possible factor to further bridge gap between simulation and experiment results. The present modelling approach is applicable to the design of synthetic catalytic cascades, as well as natural cascades to better un derstand channeling mechanisms. 96 Chapter 6 Conclusion This work builds up a multi - scale model to quantify the kinetics of artificial cascade, enabling a further understand ing on the channeling mechanism and a n indication of potential limitations for future cascade design. In this hierarchical model, molecular dynamics simulation enables a completed consideration of molecular complexity and kinetic Monte Carlo mode l covers a wide range of time and length scale, efficiently bridging the gap between microstructures an d phenomen on kinetics. Specifically, charged intermedia te molecules were found to undergo a surface diffusion mechanism across cationic poly - Arginine and poly - Lysine peptide , which was realized by discrete hops between neighboring amino acid residues throu gh hydrogen bonding interactions. Lys residues were found to provide a balance of intermediate adsorption and surface diffusivity that allow for efficient electrostatic channeling while preventing dissociation of the intermediate into the bulk. Additionall y, simulations suggest that a dianionic intermediate is required for the double associative diffusion mechanism that prevents desorption from the peptide surface. Also, a balance between surface adsorption and mobility is required to achieve an optimal cha nneling efficiency. The comparison with stop - flow lag time analysis g a ve a strong support of the occurrence of artificially intro duced electrostatic channeling. Further molecular dynamics study demonstrates that the surface hopping is actually under the protection of the electric double layer created by the charged peptide surface. The hopping in Stern layer was less impacted by ionic screening, but the diffuse layer protection due to long - range electrostatic interaction could be shielded at hi gh ionic strength. The leaking probability for each hop depends on the energy difference between the hopping transition state and 97 desorption state. In addition, the energy barrier around the artificial interface between LYS bridge and G6PDH pocket further increases the likelihood of downstream leakage , making the channeling even unsafe upon concentrated ionic environment. Using thermodynamic and kinetic parameters derived from molecular dynamics studies, the KMC model enables a direct comparison with experi ment, by evaluating the product evolution over the entire experimental time scale, particularly from the pre - steady state. Moreover, it reveals the key parameters limiting overall cascade kinetics. Specifically, the number of hopping sites and strength of close - range interactions account for the leakage from the channeling bridge. The barrier between bridge and second enzyme should be carefully considered, because it not only caused leakage by itself but also exacerbated the leakage on channeling bridge. Th e present modelling approach is applicable to the design of synthetic catalytic cascades, as well as natural cascades to better understand channeling mechanisms. Furthermore, for natural cascades with more stable dimer interfaces and channeling surfaces, s uch as TS - DHFR and MDH - CS, a complete energy profile may be mapped between active sites to reveal more precise mechanisms of the corresponding biological pathways. From a p ro spective point of view, kinetic quantification will continue to rel y on seamless connection between computational techniques at different time and length scales that fully cover the molecular - level interactions and phenomenon kinetics. Continuum modeling and Kinetic Monte Carlo simulations are two powerful approaches capable of integra ting such tasks . A hybridization of continuum and KMC model s , either sequential or simultaneous , is promising to account for bulk concentration field and discrete surface hopping . If key meta - states can be 98 further coarse - grained into major energy basins, M arkov State model s have a great potential to bridge the energy - discrete cascade surface to an energy - continuous long - range electrostatic region. Molecular simulations are so far the most effective approach to quantify the parameters for KMC and continuum models, and also help to set up the geometry of these coarse - grained mode l s. The key challenges of molecular simulations are the full representation of transition pathways , and dealing with kinetic trap s i n energy landscape s . Markov Stat e model s are still the best way to map the complex transition pathway s from very elementary states to a human readable pattern. As for the kinetic trap s , advanced sampling techniques, such as Umbrella Sampling 50 and Metadynamics 49,107 , are required to assist the MD simulations for MSM purpose. For example, by the development and combination of various computational simulations, samplin g techniques and even experimental crystal structures , it is possible to build a complete MSM that fully cover the transitions from perfect upstream binding states, to unbinding states, to channeling pathways, to downstream unbinding states and finally to perfect binding states at second active site. This promising pattern will allow us to further understand the channeling mechanism s and more precisely quantify cascade kinetics, both of which will enhance the design of synthetic cascades for various applica tions. 99 APPENDICES 100 APPENDICES APPENDIX A. Method of counting hopping frequency ( a) An examp le of o r igi nal coulombic energy diagram (red) and corresponding smoothed curve (blue). (b) Evolution of discrete energy levels for counting the number of hops. Blue curve was normalized from the original coulombic diagram and green curve was from smoothed coulombic diagram. 37 hops were counted from the green curve. 101 APPENDIX B. Various intermediates perfor mance on LYS peptide surface A summary of calculated diffusion parameters for all theoretical peptide chains and intermediates studied is provided in Table 6 Table 6 Properties of Oxalate intermediates channeled by Lys - Ala peptides. Explanation of column headings: : number of Ala residues between two Lys. : fraction of Lys, equal to . : adsorption energy calculated from radial distribution function (RDF). : average coulomb energy over entire simulation. : adsorption time fraction. : average diffusivity calculated by MSD based on MD trajectory. : hoping distance. : average hoping rate ( e.g. , single - double - single association times) over a 50 ns simulation. : hoping rate during adsorption, equal to . : surface diffusivity calculated based on and , assuming a 1D hopping diffusion mode. **special jumping distance due to - helix structure. 102 APPENDIX C. Vector transformation and dimension reduction Then, the dimension of the vector, , can be selectively transformed and reduced by independent component analysis (ICA) 108 . In order to better unde rstand the vector transformation, we first introduce the principal component analysis (PCA). is the time average of as shown by equation: 6 - 1 Then is used to denote the deviat ion of from its time average, : 6 - 2 The covariance matrix, , can be denoted as: 6 - 3 The matrix version is expressed by the cross product of and its matrix transpose, : 6 - 4 After that, the relationship of the eigenvalue, , and eigenvector, , of covariance mat rix can be expressed as follows : 6 - 5 103 Then the matrix version of equation 6 - 5 can be expressed as follows , by assuming matrix as the matrix of all independent eigenvectors, , and matrix as the diagonal matrix of all eigenvalues, . 6 - 6 6 - 7 6 - 8 Finally, the transformation is conducted by projecting the original featurized vectors, , on the eigenvectors. The eigenvectors with higher eigenvalues have a larger variance of data projection. As for the dimension reduction, the eigenvector matrix is ordered according to a descending order of egenvectors, . Then, a sub set of eigenvector matrix, , is used instead as following equation. 6 - 9 PCA is able to find the projecting vectors with largest variance, and time - structure independent components analysis (tICA) 97,109,110 can be used to find the slowest - relaxing degre e of freedom, which is more useful to analyze the system with relative slow surface hopping from bridge to enzyme pocket. Specifically, time - lagged covariance matrix is introduced as follows , where the covariance is taken with a time difference, , be tween the i, j entries of featurized vectors, 104 6 - 10 6 - 11 Obviously, the is equal to when lag time is zero. 6 - 12 Similarly to PCA method, the eigenvector matrix can be calculated by solving following generalized eigenfunction. Finally, the transformation and dimension reduction can be conducted by equation 6 - 9 . 6 - 13 Collectively, when coding the above mentioned process or using MSDbuilder , the dimension of reduced freedom is required to select the in equation 6 - 9 . For tICA, additionally, a lag time, , usually in form of the steps of minimum time interval, is applied at the very beginning of the analysis. 105 APPENDIX D. Code repository All P ython codes for KMC model s and data analysis can be ac cessed from this GitLab repository : https://gitlab.msu.edu/scbgroup/yuanchao - dissertation - code.git 106 APPENDIX E. Mono enzyme kinetics To integrate all possible events and corresponding rate constants, we employed Kinetic Monte Carlo for quantification of cascade kinetics. But firstly, KMC simulation was applied to reproduce Michaelis - Menten kinetics (equation 6 - 14 ), 111 which indicates a linear reaction rate increase within low substrate concentration region ( ) and a constant rate value at hig h concentration region ( ): 6 - 14 where is the TOF of single enzyme molecule, is substrate concentration, is Michaelis constant, is the desorption rate from enzyme and is the adsorption rate onto enzyme. The figure below shows the dependence of reaction rate on substrate concentration, where the rate constants are taken from E1 ( K M =0.1 mM). The KMC results agrees well with conv entional - enzyme kinetics. KMC calculation Michaelis - Menten kinetics on uncoupled enzyme. Error bars represent standard deviation of 10 calculations. 107 REF ERENCES 108 REFERENCE S 1. C. A. Denard, J. F. Hartwig and H. Zhao, "Multistep One - Pot Reactions Combining Biocatalysts and Chemical Catalysts for Asymmetric Synthesis", ACS Catal. , 3 , 2856 2864 (2013). doi:10.1021/cs400633a. 2. Y. Wang, H. Ren and H. Zhao, "Expanding the boundary of biocatalysis: design and optimization of in vitro tandem catalytic reactions for biochemical production", Crit. Rev. Biochem. Mol. Biol. , 53 , 115 129 (2018). doi:10.1080/10409238.2018.1431201. 3. M . J. Moehlenbrock, T. K. Toby, L. N. Pelster and S. D. Minteer, "Metabolon Catalysts: An Efficient Model for Multi - enzyme Cascades at Electrode Surfaces", ChemCatChem , 3 , 561 570 (2011). doi:10.1002/cctc.201000384. 4. M. J. Moehlenbrock, M. T. Meredith a nd S. D. Minteer, "Bioelectrocatalytic Oxidation of Glucose in CNT Impregnated Hydrogels: Advantages of Synthetic Enzymatic Metabolon Formation", ACS Catal. , 2 , 17 25 (2012). doi:10.1021/cs200482v. 5. M. J. Moehlenbrock, T. K. Toby, A. Waheed and S. D. M inteer, "Metabolon Catalyzed Pyruvate/Air Biofuel Cell", J. Am. Chem. Soc. , 132 , 6288 6289 (2010). doi:10.1021/ja101326b. 6. J. M. Sperl and V. Sieber, "Multienzyme Cascade Reactions Status and Recent Advances", ACS Catal. , 8 , 2385 2396 (2018). doi:10.10 21/acscatal.7b03440. 7. I. Wheeldon, S. D. Minteer, S. Banta, S. C. Barton, P. Atanassov and M. Sigman, "Substrate channelling as an approach to cascade reactions", Nat. Chem. , 8 , 299 309 (2016). doi:10.1038/nchem.2459. 8. P. A. Srere, "The metabolon", Trends Biochem. Sci. , 10 , 109 110 (1985). 9. C. V é lot, M. B. Mixon, M. Teige and P. A. Srere, "Model of a quinary structure between krebs TCA cycle enzymes: A model for the metabolon", Biochemistry , 36 , 14271 14276 (1997). doi:10.1021/bi972011j. 10. M . K. Geck and J. F. Kirsch, "A Novel, Definitive Test for Substrate Channeling Illustrated with the Aspartate Aminotransferase/Malate Dehydrogenase System ", Biochemistry , 38 , 8032 8037 (1999). doi:10.1021/bi983029c. 11. C. Singleton, T. P. Howard and N . Smirnoff, "Synthetic metabolons for metabolic engineering", J. Exp. Bot. , 65 , 1947 1954 (2014). doi:10.1093/jxb/eru050. 12. J. Fu, M. Liu, Y. Liu, N. W. Woodbury and H. Yan, "Interenzyme Substrate Diffusion for an Enzyme Cascade Organized on Spatially Addressable DNA Nanostructures", J. Am. Chem. Soc. , 134 , 5516 5519 (2012). doi:10.1021/ja300897h. 109 13. P. Bauler, G. Huber, T. Leyh and J. A. McCammon, "Channeling by Proximity: The Catalytic Advantages of Active Site Colocalization Using Brownian Dynamic s", J. Phys. Chem. Lett. , 1 , 1332 1335 (2010). doi:10.1021/jz1002007. 14. Y. Zhang, S. Tsitkov and H. Hess, "Proximity does not contribute to activity enhancement in the glucose oxidase horseradish peroxidase cascade", Nat. Commun. , 7 , 13982 (2016). doi: 10.1038/ncomms13982. 15. C. Eun, P. M. Kekenes - Huskey, V. T. Metzger and J. A. McCammon, "A model study of sequential enzyme reactions and electrostatic channeling", J. Chem. Phys. , 140 , 105101 (2014). doi:10.1063/1.4867286. 16. T. R. Schneider, E. Ger hardt, M. Lee, P. - H. Liang, K. S. Anderson and I. Schlichting, "Loop Closure and Intersubunit Communication in Tryptophan Synthase", Biochemistry , 37 , 5394 5406 (1998). doi:10.1021/bi9728957. 17. M. F. Dunn, V. Aguilar, P. Brzovi , W. F. Drewe, K. F. Hou ben, C. a Leja and M. Roy, "The tryptophan synthase bienzyme complex transfers indole between the alpha - and beta - sites via a 25 - 30 A long tunnel.", Biochemistry , 29 , 8598 8607 (1990). doi:10.1021/bi00489a015. 18. F. Wu and S. Minteer, "Krebs Cycle Metab olon: Structural Evidence of Substrate Channeling Revealed by Cross - Linking and Mass Spectrometry", Angew. Chemie Int. Ed. , 54 , 1851 1854 (2015). doi:10.1002/anie.201409336. 19. Z. H. Zhou, D. B. McCarthy, C. M. O Connor, L. J. Reed and J. K. Stoops, "Th e remarkable structural and functional organization of the eukaryotic pyruvate dehydrogenase complexes", Proc. Natl. Acad. Sci. , 98 , 14802 14807 (2001). doi:10.1073/pnas.011597698. 20. J. M ü ller and C. M. Niemeyer, "DNA - directed assembly of artificial mu ltienzyme complexes", Biochem. Biophys. Res. Commun. , 377 , 62 67 (2008). doi:10.1016/j.bbrc.2008.09.078. 21. J. Fu, Y. R. Yang, A. Johnson - Buck, M. Liu, Y. Liu, N. G. Walter, N. W. Woodbury and H. Yan, "Multi - enzyme complexes on DNA scaffolds capable of substrate channelling with an artificial swinging arm", Nat. Nanotechnol. , 9 , 531 536 (2014). doi:10.1038/nnano.2014.100. 22. B. Bulutoglu, K. E. Garcia, F. Wu, S. D. Minteer and S. Banta, "Direct Evidence for Metabolon Formation and Substrate Channeling in Recombinant TCA Cycle Enzymes", ACS Chem. Biol. , 11 , 2847 2853 (2016). doi:10.1021/acschembio.6b00523. 23. J. - L. Lin and I. Wheeldon, "Kinetic Enhancements in DNA Enzyme Nanostructures Mimic the Sabatier Principle", ACS Catal. , 3 , 560 564 (2013). doi:10.1021/cs300766d. 24. Y. Gao, C. C. Roberts, J. Zhu, J. - L. Lin, C. A. Chang and I. Wheeldon, "Tuning Enzyme Kinetics through Designed Intermolecular Interactions Far from the Active Site", ACS 110 Catal. , 5 , 2149 2153 (2015). doi:10.1021/acs catal.5b00130. 25. R. Amaro and Z. Luthey - Schulten, "Molecular dynamics simulations of substrate channeling through an barrel protein", Chem. Phys. , 307 , 147 155 (2004). doi:10.1016/j.chemphys.2004.05.019. 26. C. C. Roberts and C. A. Chang, "Modeling of Enhanced Catalysis in Multienzyme Nanostructures: Effect of Molecular Scaffolds, Spatial Organization, and Concentration", J. Chem. Theory Comput. , 11 , 286 292 (2015). doi:10.1021/ct5007482. 27. O. Idan and H. Hess, "Origins of Activity Enhancement in Enzyme Cascades on Scaffolds", ACS Nano , 7 , 8658 8665 (2013). doi:10.1021/nn402823k. 28. D. R. Knighton, C. - C. Kan, E. Howland, C. A. Janson, Z. Hostomska, K. M. Welsh and D. A. Matthews, "Structure of and kinet ic channelling in bifunctional dihydrofolate reductase thymidylate synthase", Nat. Struct. Biol. , 1 , 186 194 (1994). doi:10.1038/nsb0394 - 186. 29. A. H. Elcock, M. J. Potter, D. a Matthews, D. R. Knighton and J. A. McCammon, "Electrostatic Channeling in t he Bifunctional Enzyme Dihydrofolate Reductase - thymidylate Synthase", J. Mol. Biol. , 262 , 370 374 (1996). doi:10.1006/jmbi.1996.0520. 30. N. Wang and J. A. McCammon, "Substrate channeling between the human dihydrofolate reductase and thymidylate synthase ", Protein Sci. , 25 , 79 86 (2016). doi:10.1002/pro.2720. 31. C. Lindbladh, M. Rault, C. Hagglund, W. C. Small, K. Mosbach, L. Buelow, C. Evans and P. A. Srere, "Preparation and kinetic characterization of a fusion protein of yeast mitochondrial citrate s ynthase and malate dehydrogenase", Biochemistry , 33 , 11692 11698 (1994). doi:10.1021/bi00205a004. 32. M. Trujillo, R. G. K. Donald, D. S. Roos, P. J. Greene and D. V. Santi, "Heterologous Expression and Characterization of the Bifunctional Dihydrofolate Reductase Thymidylate Synthase Enzyme of Toxoplasma gondii", Biochemistry , 35 , 6366 6374 (1996). doi:10.1021/bi952923q. 33. Y. Liu, D. P. Hickey, J. - Y. Guo, E. Earl, S. Abdellaoui, R. D. Milton, M. S. Sigman, S. D. Minteer and S. Calabrese Barton, "Subst rate Channeling in an Artificial Metabolon: A Molecular Dynamics Blueprint for an Experimental Peptide Bridge", ACS Catal. , 7 , 2486 2493 (2017). doi:10.1021/acscatal.6b03440. 34. A. H. Elcock, G. A. Huber and J. A. McCammon, "Electrostatic Channeling of Substrates between Enzyme Active Sites: Comparison of Simulation and Experiment", Biochemistry , 36 , 16049 16058 (1997). doi:10.1021/bi971709u. 35. E. Earl and S. Calabrese Barton, "Simulation of intermediate transport in nanoscale scaffolds for multistep catalytic reactions", Phys. Chem. Chem. Phys. , 19 , 15463 15470 (2017). doi:10.1039/C7CP00239D. 111 36. K. Reuter, First - Principles Kinetic Monte Carlo Simulations for Heterogeneous Catalysis: Concepts, Status, and Frontiers, in Model. Simul. Heterog. Catal. React. Wiley - VCH Verlag GmbH & Co. KGaA, Weinheim, Germany (2011), pp. 71 111. 37. M. Karplus and J. A. McCammon, "Molecular dynamics simulations of biomolecules", Nat. Struct. Biol. , 9 , 646 652 (2002). doi:10.1038/nsb0902 - 646. 38. V. P. Raut, M. A. A gashe, S. J. Stuart and R. A. Latour, "Molecular Dynamics Simulations of Peptide Surface Interactions", Langmuir , 21 , 1629 1639 (2005). doi:10.1021/la047807f. 39. H. D. Herce and A. E. Garcia, "Molecular dynamics simulations suggest a mechanism for trans location of the HIV - 1 TAT peptide across lipid membranes", Proc. Natl. Acad. Sci. , 104 , 20805 20810 (2007). doi:10.1073/pnas.0706574105. 40. L. Li, I. Vorobyov and T. W. Allen, "The different interactions of lysine and arginine side chains with lipid mem branes", J. Phys. Chem. B , 117 , 11906 11920 (2013). doi:10.1021/jp405418y. 41. F. Hess and H. Over, "Kinetic Monte Carlo simulations of heterogeneously catalyzed oxidation reactions", Catal. Sci. Technol. , 4 , 583 (2014). doi:10.1039/c3cy00833a. 42. M. Stan, J. C. Ramirez, P. Cristea, S. Y. Hu, C. Deo, B. P. Uberuaga, S. Srivilliputhur, S. P. Rudin and J. M. Wills, "Models and simulations of nuclear fuel materials properties", J. Alloys Compd. , 444 , 415 423 (2007). 43. A. C. T. van Duin, S. Dasgupta, F . Lorant and W. A. Goddard, "ReaxFF: A Reactive Force Field for Hydrocarbons", J. Phys. Chem. A , 105 , 9396 9409 (2001). doi:10.1021/jp004368u. 44. Y. M. Huang, G. A. Huber, N. Wang, S. D. Minteer and J. A. McCammon, "Brownian dynamic study of an enzyme m etabolon in the TCA cycle: Substrate kinetics and channeling", Protein Sci. , 27 , 463 471 (2018). doi:10.1002/pro.3338. 45. R. C. Ber nardi, M. C. R. Melo and K. Schulten, "Enhanced sampling techniques in molecular dynamics simulations of biological systems", Biochim. Biophys. Acta - Gen. Subj. , 1850 , 872 877 (2015). doi:10.1016/J.BBAGEN.2014.10.019. 46. Y. Sugita and Y. Okamoto, "Repl ica - exchange molecular dynamics method for protein folding", Chem. Phys. Lett. , 314 , 141 151 (1999). doi:10.1016/S0009 - 2614(99)01123 - 9. 47. D. J. Earl and M. W. Deem, "Parallel tempering: Theory, applications, and new perspectives", Phys. Chem. Chem. Phy s. , 7 , 3910 (2005). doi:10.1039/b509983h. 48. A. Laio and M. Parrinello, "Escaping free - energy minima.", Proc. Natl. Acad. Sci. U. S. A. , 99 , 12562 6 (2002). doi:10.1073/pnas.202427399. 49. A. Barducci, G. Bussi and M. Parrinello, "Well - Tempered Metady namics: A Smoothly 112 Converging and Tunable Free - Energy Method", Phys. Rev. Lett. , 100 , 020603 (2008). doi:10.1103/PhysRevLett.100.020603. 50. J. K ä stner, "Umbrella sampling", Wiley Interdiscip. Rev. Comput. Mol. Sci. , 1 , 932 942 (2011). doi:10.1002/wcms.6 6. 51. V. S. Pande, K. Beauchamp and G. R. Bowman, "Everything you wanted to know about Markov State Models but were afraid to ask", Methods , 52 , 99 105 (2010). doi:10.1016/j.ymeth.2010.06.002. 52. G. R. Bowman, An Overview and Practical Guide to Build ing Markov State Models, in Springer, Dordrecht (2014), pp. 7 22. 53. A. C. Pan and B. Roux, "Building Markov state models along pathways to determine free energies and rates of transitions", J. Chem. Phys. , 129 , 064107 (2008). doi:10.1063/1.2959573. 54. J. - H. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, J. D. Chodera, C. Sch ü tte and F. No é , "Markov models of molecular kinetics: Generation and validation", J. Chem. Phys. , 134 , 174105 (2011). doi:10.1063/1.3565032. 55. M. Biswas, B. Licker t and G. Stock, "Metadynamics Enhanced Markov Modeling of Protein Dynamics", J. Phys. Chem. B , 122 , 28 (2018). doi:10.1021/acs.jpcb.7b11800. 56. A. Dickson and C. L. Brooks, "Native States of Fast - Folding Proteins Are Kinetic Traps", J. Am. Chem. Soc. , 1 35 , 4729 4734 (2013). doi:10.1021/ja311077u. 57. S. D. Lotz and A. Dickson, "Unbiased Molecular Dynamics of 11 min Timescale Drug Unbinding Reveals Transition State Stabilizing Interactions", J. Am. Chem. Soc. , 140 , 618 628 (2018). doi:10.1021/jacs.7b085 72. 58. D. D. Vvedensky, "Multiscale modelling of nanostructures", J. Phys. Condens. Matter , 16 , R1537 R1576 (2004). doi:10.1088/0953 - 8984/16/50/R01. 59. lkening and J. Wintterlin, "CO oxidation on Pt(111) Scanning tunneling microscopy experiment s and Monte Carlo simulations", J. Chem. Phys. , 114 , 6382 (2001). doi:10.1063/1.1343836. 60. A. Farkas, F. Hess and H. Over, "Experiment - Based Kinetic Monte Carlo Simulations: CO Oxidation over RuO 2 (110)", J. Phys. Chem. C , 116 , 581 591 (2012). doi:10. 1021/jp204703p. 61. A. Chatterjee and D. G. Vlachos, "An overview of spatial microscopic and accelerated kinetic Monte Carlo methods", J. Comput. Mater. Des. , 14 , 253 308 (2007). doi:10.1007/s10820 - 006 - 9042 - 9. 62. M. Stamatakis and D. G. Vlachos, "Unra veling the Complexity of Catalytic Reactions via Kinetic Monte Carlo Simulation: Current Status and Frontiers", ACS Catal. , 2 , 2648 2663 113 (2012). doi:10.1021/cs3005709. 63. D. - J. Liu, A. Garcia, J. Wang, D. M. Ackerman, C. - J. Wang and J. W. Evans, "Kineti c Monte Carlo Simulation of Statistical Mechanical Models and Coarse - Grained Mesoscale Descriptions of Catalytic Reaction Diffusion Processes: 1D Nanoporous and 2D Surface Systems", Chem. Rev. , 115 , 5979 6050 (2015). doi:10.1021/cr500453t. 64. P. Zoontje ns, T. P. Schulze and S. C. Hendy, "A Hybrid Kinetic Monte Carlo - Molecular Dynamics Method for Modeling Epitaxial Growth", 1 6 (2007). 65. G. Kabbe, C. Wehmeyer and D. Sebastiani, "A Coupled Molecular Dynamics/Kinetic Monte Carlo Approach for Protonation Dynamics in Extended Systems", J. Chem. Theory Comput. , 10 , 4221 4228 (2014). doi:10.1021/ct500482k. 66. H. J. C. Berendsen, D. van der Spoel and R. van Drunen, "GROMACS: A message - passing parallel molecular dynamics implementation", Comput. Phys. Commu n. , 91 , 43 56 (1995). doi:10.1016/0010 - 4655(95)00042 - E. 67. E. Lindahl, B. Hess and D. van der Spoel, "GROMACS 3.0: a package for molecular simulation and trajectory analysis", J. Mol. Model. , 7 , 306 317 (2001). doi:10.1007/s008940100045. 68. B. Hess, C. Kutzner, D. van der Spoel and E. Lindahl, "GROMACS 4: Algorithms for Highly Efficient, Load - Balanced, and Scalable Molecular Simulation", J. Chem. Theory Comput. , 4 , 435 447 (2008). doi:10.1021/ct700301q. 69. S. Pronk, S. Pall, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R. Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess and E. Lindahl, "GROMACS 4.5: a high - throughput and highly parallel open source molecular simulation toolkit", Bioinfor matics , 29 , 845 854 (2013). doi:10.1093/bioinformatics/btt055. 70. M. J. Abraham, T. Murtola, R. Schulz, S. P á ll, J. C. Smith, B. Hess and E. Lindahl, "GROMACS: High performance molecular simulations through multi - level parallelism from laptops to superc omputers", SoftwareX , 1 2 , 19 25 (2015). doi:10.1016/j.softx.2015.06.001. 71. D. Van Der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C. Berendsen, "GROMACS: Fast, flexible, and free", J. Comput. Chem. , 26 , 1701 1718 (2005). doi:10.1002/ jcc.20291. 72. S. P á ll, M. J. Abraham, C. Kutzner, B. Hess and E. Lindahl, Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS, in Lect. Notes Comput. Sci. (Including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinfo rmatics) (2015), pp. 3 27. 73. B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan and M. Karplus, "CHARMM: A program for macromolecular energy, minimization, and dynamics calculations", J. Comput. Chem. , 4 , 187 217 (1983). 114 doi:10 .1002/jcc.540040211. 74. R. B. Best, X. Zhu, J. Shim, P. E. M. Lopes, J. Mittal, M. Feig and A. D. MacKerell, "Optimization of the Additive CHARMM All - Atom Protein Force Field Targeting Improved Sampling of the Backbone , and Side - Chain 1 and 2 Di hedral Angles", J. Chem. Theory Comput. , 8 , 3257 3273 (2012). doi:10.1021/ct300400x. 75. K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Darian, O. Guvench, P. Lopes, I. Vorobyov and A. D. Mackerell, "CHARMM general force field : A force field for drug - like molecules compatible with the CHARMM all - atom additive biological force fields", J. Comput. Chem. , 31 , 471 690 (2009). doi:10.1002/jcc.21367. 76. G. Bussi, D. Donadio and M. Parrinello, "Canonical sampling through velocity r escaling", J. Chem. Phys. , 126 , 014101 (2007). doi:10.1063/1.2408420. 77. M. Parrinello and A. Rahman, "Polymorphic transitions in single crystals: A new molecular dynamics method", J. Appl. Phys. , 52 , 7182 7190 (1981). doi:10.1063/1.328693. 78. C. L. Wennberg, T. Murtola, B. Hess and E. Lindahl, "Lennard - Jones Lattice Summation in Bilayer Simulations Has Critical Effects on Surface Tension and Lipid Properties", J. Chem. Theory Comput. , 9 , 3527 3537 (2013). doi:10.1021/ct400140n. 79. C. L. Wennberg, T. Murtola, S. P á ll, M. J. Abraham, B. Hess and E. Lindahl, "Direct - Space Corrections Enable Fast and Accurate Lorentz Berthelot Combination Rule Lennard - Jones Lattice Summation", J. Chem. Theory Comput. , 11 , 5737 5746 (2015). doi:10.1021/acs.jctc.5b00726. 80. S. P á ll and B. Hess, "A flexible algorithm for calculating pair interactions on SIMD architectures", Comput. Phys. Commun. , 184 , 2641 2650 (2013). doi:10.1016/j.cpc.2013.06.003. 81. C. Kutzner, S. P á ll, M. Fechner, A. Esztermann, B. L. de Groot and H. Grubm ü ller, "Best bang for your buck: GPU nodes for GROMACS biomolecular simulations", J. Comput. Chem. , 36 , 1990 2008 (2015). doi:10.1002/jcc.24030. 82. M. D. Hanwell, D. E. Curtis, D. C. Lonie, T. Va ndermeersch, E. Zurek and G. R. Hutchison, "Avogadro: an advanced semantic chemical editor, visualization, and analysis platform", J. Cheminform. , 4 , 17 (2012). doi:10.1186/1758 - 2946 - 4 - 17. 83. T. J. Dolinsky, J. E. Nielsen, J. A. McCammon and N. A. Baker , "PDB2PQR: an automated pipeline for the setup of Poisson - Boltzmann electrostatics calculations", Nucleic Acids Res. , 32 , W665 W667 (2004). doi:10.1093/nar/gkh381. 84. T. J. Dolinsky, P. Czodrowski, H. Li, J. E. Nielsen, J. H. Jensen, G. Klebe and N. A. Baker, "PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations", Nucleic Acids Res. , 35 , W522 W525 (2007). doi:10.1093/nar/gkm276. 115 85. N. A. Baker, D. Sept, S. Joseph, M. J. Holst and J. A. McCammon, "Electrostatics of nanosystems: Application to microtubules and the ribosome", Proc. Natl. Acad. Sci. , 98 , 10037 10041 (2001). doi:10.1073/pnas.181342398. 86. W. Humphrey, A. Dalke and K. Schulten, "VMD: Visual molecular dynamics", J. Mol. Graph. , 14 , 33 38 (1996). doi:10.1016/0263 - 7855(96)00018 - 5. 87. J. J. Irwin, T. Sterling, M. M. Mysinger, E. S. Bolstad and R. G. Coleman, "ZINC: A Free Tool to Discover Chemistry for Biology", J. Chem. Inf. Model. , 52 , 1757 1768 (2012). doi:10.1021/ci3001277. 88. J . J. Irwin and B. K. Shoichet, "ZINC -- a free database of commercially available compounds for virtual screening.", J. Chem. Inf. Model. , 45 , 177 82 (2005). doi:10.1021/ci049714+. 89. R. J. Ferreira, M. - J. U. Ferreira and D. J. V. A. dos Santos, "Do adsor bed drugs onto P - glycoprotein influence its efflux capability?", Phys. Chem. Chem. Phys. , 17 , 22023 22034 (2015). doi:10.1039/C5CP03216D. 90. R. Rowland and R. Taylor, "Intermolecular nonbonded contact distances in organic crystal structures: Comparison with distances expected from van der Waals radii", J. Phys. Chem. , (1996). 91. B. Roux, "The calculation of the potential of mean force using computer simulations", Comput. Phys. Commun. , 91 , 275 282 (1995). doi:10.1016/0010 - 4655(95)00053 - I. 92. M. Souaille and B. Roux, "Extension to the weighted histogram analysis method: combining umbrella sampling with free energy calculations", Comput. Phy s. Commun. , 135 , 40 57 (2001). doi:10.1016/S0010 - 4655(00)00215 - 0. 93. S. Kumar, J. M. Rosenberg, D. Bouzida, R. H. Swendsen and P. A. Kollman, "THE weighted histogram analysis method for free - energy calculations on biomolecules. I. The method", J. Comput . Chem. , 13 , 1011 1021 (1992). doi:10.1002/jcc.540130812. 94. A. Grossfield, WHAM: the weighted histogram analysis method. Version 2.0.9, n.d. . Available at: http://membrane.urmc.rochester.edu/content/wham. [Accessed: 17 - Aug - 2017]. 95. O. Stern, "The theory of the electrolytic double - layer", Z. Elektrochem , 30 , 1014 1020 (1924). 96. G. P é rez - Hern á ndez, F. Paul, T. Giorgino, G. De Fabritiis and F. No é , "Identification of slow molecular order parameters for Markov model construction", J. Chem. Phys. , 1 39 , 015102 (2013). doi:10.1063/1.4811489. 97. C. R. Schwantes and V. S. Pande, "Improvements in Markov State Model Construction Reveal Many Non - Native Interactions in the Folding of NTL9", J. Chem. Theory Comput. , 9 , 2000 2009 (2013). doi:10.1021/ct30087 8a. 116 98. L. J. Lapidus, S. Acharya, C. R. Schwantes, L. Wu, D. Shukla, M. King, S. J. DeCamp and V. S. Pande, "Complex Pathways in Folding of Protein G Explored by Simulation and Experiment", Biophys. J. , 107 , 947 955 (2014). doi:10.1016/J.BPJ.2014.06.037 . 99. A. Dickson and C. L. Brooks, "Quantifying Hub - like Behavior in Protein Folding Networks", J. Chem. Theory Comput. , 8 , 3044 3052 (2012). doi:10.1021/ct300537s. 100. O. Trott and A. J. Olson, "AutoDock Vina: Improving the speed and accuracy of dock ing with a new scoring function, efficient optimization, and multithreading", J. Comput. Chem. , 31 , 671 690 (2009). doi:10.1002/jcc.21334. 101. M. F. Sanner, "Python: a programming language for software integration and development.", J. Mol. Graph. Model . , 17 , 57 61 (1999). 102. K. A. Beauchamp, G. R. Bowman, T. J. Lane, L. Maibaum, I. S. Haque and V. S. Pande, "MSMBuilder2: Modeling Conformational Dynamics on the Picosecond to Millisecond Scale", J. Chem. Theory Comput. , 7 , 3412 3419 (2011). doi:10.102 1/ct200463m. 103. N. Michaud - Agrawal, E. J. Denning, T. B. Woolf and O. Beckstein, "MDAnalysis: A toolkit for the analysis of molecular dynamics simulations", J. Comput. Chem. , 32 , 2319 2327 (2011). doi:10.1002/jcc.21787. 104. R. J. Gowers, M. Linke, J . Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler, J. Doma ski, D. L. Dotson, S. Buchoux, I. M. Kenney and O. Beckstein, "MDAnalysis: A Python Package for the Rapid Analysis of Molecular Dynamics Simulations", 98 105 (2016). 105. Y. Liu, I. Matanovic, D. P. Hickey, S. D. Minteer, P. Atanassov and S. C. Barton, "Cascade Kinetics of an Artificial Metabolon by Molecular Dynamics and Kinetic Monte Carlo", ACS Catal. , 8 , 7719 7726 (2018). doi:10.1021/acscatal.8b01041. 106. F. Perez and B. E. Granger, "IPyt hon: A System for Interactive Scientific Computing", Comput. Sci. Eng. , 9 , 21 29 (2007). doi:10.1109/MCSE.2007.53. 107. O. Valsson, P. Tiwary and M. Parrinello, "Enhancing Important Fluctuations: Rare Events and Met adynamics from a Conceptual Viewpoint", Annu. Rev. Phys. Chem , 67 , 159 84 (2016). doi:10.1146/annurev - physchem - 040215 - 112229. 108. What is Independent Component Analysis?, in Indep. Compon. Anal. John Wiley & Sons, Inc., New York, USA (n.d.), pp. 145 164. 109. L. Molgedey and H. G. Schuster, "Separation of a mixture of independent signals using time delayed correlations", Phys. Rev. Lett. , 72 , 3634 3637 (1994). doi:10.1103/PhysRevLett.72.3634. 110. C. R. Schwantes, D. Shukla and V. S. Pande, "Markov St ate Models and tICA Reveal a Nonnative Folding Nucleus in Simulations of NuG2", Biophys. J. , 110 , 1716 1719 (2016). doi:10.1016/J.BPJ.2016.03.026. 117 111. H. S. Fogler, Essentials of chemical reaction engineering , Prentice Hall (2011).