COMPUTATIONAL STUDY OF SUBSTRATE CHANNELING MECHANISMS IN ENZYMATIC CASCADE REACTIONS By Yan Xie A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Materials Science and Engineering – Doctor of Philosophy 2023 ABSTRACT Cascade reactions have attracted great attention in the fields of chemical synthesis, biofuel cells, and biosensors due to multiple benefits, including reduced waste generation and minimal purification requirements. They involve a sequence of chemical transformations that take place within a single reactor, where the product of one reaction step, described as an intermediate, becomes the product of the following step. The transport of these intermediates between neighboring active sites often faces the challenge of desorption into the bulk solvent, as well as competition with the side reactions. The efficiency of cascade reactions is therefore limited by intermediate transport. Nature has evolved several substrate channeling strategies to enable the direct transfer of intermediates between adjacent active sites. In this work, molecular dynamics simulations were performed to computationally understand two main mechanisms of substrate channeling: electrostatic channeling and molecular tunneling. Firstly, we studied the electrostatic channeling of glucose-6-phosphate (G6P) on a poly- arginine peptide connecting the sequential enzymes of hexokinase (HK) and glucose-6- phosphate dehydrogenase (G6PDH). Metadynamics is used in conjunction with molecular dynamics simulation to quantify the hopping rate of G6P on the bridge. According to lag time calculations observed via a kinetic Monte Carlo model, a poly-arginine bridge is more efficient at channeling G6P compared to the previously studied poly-lysine bridge. A more complex model of electrostatic channeling was then considered, namely the malate dehydrogenase–citrate synthase complex of the citric acid cycle. The negatively charged intermediate oxalacetate (OAA) travels along a positive surface on the enzyme complex. A Markov state model (MSM) identified the dominant pathway and four bottleneck residues. By conducting a hub score analysis and measuring channeling probabilities, we verified that replacing the experimentally determined positive key residue Arg65 with the neutral residue Ala65 led to a 50% reduction in channeling probability, as observed experimentally. The mechanism of molecular tunneling was studied for an ammonia tunnel in the asparagine synthetase system. Combining molecular dynamics with umbrella sampling, energy profiles were constructed for both the original and the mutant structures with an alanine→ leucine replacement. Due to its larger side chain, leucine caused a narrowing of the tunnel when it replaced alanine in the mutant structure, resulting in the blockage of NH3, and thus an increase in the local energy profile. Finally, the enzymatic interaction between hexokinase (HK) and glucose-6-phosphate dehydrogenase(G6PDH) was studied with coarse-grained molecular dynamics (CG MD). CG MD simplified the complex system of HK-bridge-G6PDH and allowed the observation of enzymatic configuration change. The relative rotation of G6PDH shows an electrostatic interaction between the enzymes, which is dependent on ionic strength. Overall, this work computationally examines the mechanisms of substrate channeling at an atomic level and acts as a guide to design efficient artificial cascades with substrate channeling. ACKNOWLEDGEMENTS Pursuing my Ph.D. degree at MSU has been a wonderful experience that has passed by so quickly - five years have flown by. Looking back, I have encountered both successes and challenges in my research, and I am grateful for the help and support of many individuals. I would like to take this opportunity to express my sincere gratitude to my advisor, Dr. Scott Calabrese Barton. Throughout my Ph.D. journey, he has provided me with guidance and encouragement, pushing me to think creatively and explore new ideas. Dr. Barton's patience and positivity have been instrumental in helping me overcome research obstacles, and his high standards have inspired me to strive for excellence. I am also grateful for his availability to discuss research and provide me with assistance, which has helped my research progress smoothly. I am also thankful to my Ph.D. committee members, Dr. Alex Dickson, Dr. David Hickey, and Dr. Hui-Chia Yu, for their invaluable feedback and suggestions. Additionally, I appreciate the funding support from Army research office MURI (#W911NF1410263) via Dr. Shelly Minteer at the University of Utah. My lab mates, Kanchan, Alex, Manali, Christina, Mindy, Yunlu, and Brandon, have all been instrumental in supporting me in various ways. Their company and research discussions have been a source of comfort and inspiration for me throughout my Ph.D. journey. Most importantly, I would like to extend my thanks to my friends at MSU, Shuqi, Sarah, my husband Pin, and many others, whose help and support have been essential to my success. The moments shared with them have become unforgettable memories that I will always cherish. I am especially grateful to my family, particularly my mother and my sister, who have been supportive back home in China. My mother's unwavering optimism and encouragement have iv been essential in helping me overcome self-doubt and fear of failure. Her investment in my education and vision have sent me to the USA and enabled me to achieve my dream of obtaining a Ph.D. degree. She has instilled in me a strong sense of determination in pursuing my goals and perseverance in the face of challenges, which has been invaluable throughout my academic journey. I appreciate their understanding and love throughout my five-year journey. v TABLE OF CONTENTS Chapter 1. Introduction ................................................................................................................... 1 Chapter 2. Infrequent Metadynamics Study of Rare-event Electrostatic Channeling .................. 30 Chapter 3. Markov State Study of Electrostatic Channeling within the Tricarboxylic Acid Cycle Supercomplex ............................................................................................................................... 53 Chapter 4. Molecular Tunneling of Ammonia in Asparagine Synthetase .................................... 75 Chapter 5. Coarse-Grained Molecular Dynamics Simulation of HK-bridge-G6PDH Complex ........................................................................................................................................ 86 Chapter 6. Conclusions and Future Work ..................................................................................... 94 BIBLIOGRAPHY ......................................................................................................................... 97 vi Chapter 1. Introduction Developing green reactions has been a goal of the chemical synthesis industry for decades.1 Toward this goal, it is necessary to investigate new reactions that increase the production of the desired products and decrease the occurrence of unwanted by-products.1 Cascade reactions are one such approach, as they achieve multistep chemical transformation in a single container without the need for separate purification processes (Figure 1.1).2 Thus, cascade reactions reduce the cost of time and labor and minimize wasteful byproduct production.3 Designing cascade reactions that have high efficiency, high yields, and excellent selectivity is always the ultimate goal of catalytic synthesis.4 Enzymes are powerful biocatalysts that accelerate chemical reactions in living organisms. Mimicking catalytical enzymatic cascades is considered a promising approach for chemical synthesis. Motivated by the multi-enzyme structures observed in nature, including the photosynthetic system in plants and the metabolic pathways in cells, significant effort has been invested in creating effective, engineered enzyme cascade reaction systems.5,6 These systems Figure 1.1 A cartoon representation of one-pot cascade reactions. This figure is adapted from ref. 2. 1 play a crucial role in various fields, such as enzyme replacement therapies,7,8 biological detoxification,9 chemical syntheses,10,11 biofuel cells,12 and biosensors.13,14 While the idea of being able to manipulate reactions and achieve desired outcomes is appealing in theory, the intricate nature of interconnected enzyme systems makes practical application difficult. Additionally, the complex organization of these networks can impact the performance of individual enzymes and the movement of intermediates between them, which is not well understood, further adding to the challenges of utilizing these reactions in real-world settings. Enzymatic cascade reactions involve multiple enzymes that work together to convert a substrate into a product through a series of reactions. In these reactions, intermediates are the molecules that are formed during the reaction and are subsequently converted into other intermediates or the final product. These intermediates are very stable, as they have a relatively long half-life and are able to accumulate to a measurable extent. The transport of intermediates is a crucial step in determining overall reaction efficiency. Therefore, it is imperative to regulate the diffusion of intermediates between adjacent enzyme active sites in order to manage a cascade reaction. Nature utilizes strategies such as intramolecular tunneling,15 chemical swing arms,16 spatial organization,17 and electrostatic channeling,18,19 to direct the transport of intermediates. The phenomenon is often referred to as substrate channeling.4 The direct transport of intermediates by substrate channeling protects intermediates from undesired binding or reactions in bulk solution and avoids equilibration at finite bulk concentration. Understanding the mechanisms of substrate channeling is necessary to take advantage of designing artificial catalytic cascade reactions. There are many experimental approaches to detect and evaluate substrate channelings, such as transient time analysis, isotope dilution, and 2 competing reaction effects. However, experimental methods cannot achieve precise control of complex enzyme systems at an atomistic level or directly observe the process of substrate channeling. Therefore, computational models at various scales have been used to quantitatively investigate the mechanisms of substrate channeling, such as Brownian dynamics, continuum models, and molecular dynamics (MD).19,20 In this dissertation, we examine two specific mechanisms of substrate channeling and investigate the thermodynamics and the kinetics properties of intermediate transport by combining several computational techniques. This dissertation aims to shed light on substrate channeling and improve the design of future efficient artificial cascade reactions. 1.1 Muti-enzymatic Cascade Reactions Enzymes, a distinct category of proteins, are composed of amino acid residues with specific sequences and conformations. These proteins act as catalysts and facilitate chemical reactions by lowering the energy barrier required for the reaction to occur. In nature, enzymatic cascade reactions are utilized for metabolic processes as a result of evolution's optimization. Figure 1.2 The condensation process of peptides. 3 1.1.1 Protein Structure There are twenty amino acid residues found in nature. Each residue has a carboxylic acid group and an amino group.21 Two successive amino acids are joined together by a condensation reaction and form a peptide bond, during which a water molecule is produced through the reaction between the hydroxyl group in the carboxyl group of one residue and the hydrogen from the amino group of the other residue (Figure 1.2). Polypeptides are a chain of amino acids linked together by peptide bonds. Proteins are formed by polypeptides, and the amino acid sequence and the polypeptide structures determine the function of proteins. Based on their sidechain, amino acid residues have different structures and, thus, different properties. Some residues are charged; for example, arginine, histidine, and lysine have Figure 1.3 The four levels of protein structure. This figure is from ref. 25. 4 a positive charge, while aspartic acid and glutamic acid have a negative charge. Some residues are polar but uncharged, such as serine, asparagine, and glutamine. Some of the residues have hydrophobic sidechains, such as alanine, leucine, and tryptophan.22 All these characteristics of residues affect the formation of protein structure and have different functions.23 Proteins have four levels of structure, primary, secondary, tertiary, and quaternary (Figure 1.3).24,25 The primary structure is determined by the sequence of amino acids, while the secondary structure is established by the hydrogen bonds formed within the peptide backbone. The common motifs of secondary structure include the alpha helix and beta sheet as shown in Figure 1.3. The tertiary structure refers to the protein's complete three-dimensional configuration and is formed by multiple interactions, such as hydrophobic/hydrophilic interactions and ionic and hydrogen bonds. The quaternary structure describes the arrangement of subunits for proteins with multiple identical subunits that only exist in some proteins.26 Proteins have various functions and, thus, different categories based on their structures. Antibody proteins, for example, as one of the protein categories, binds to specific foreign particles to protect the body.27 Enzymes are proteins responsible for the catalysis of biochemical reactions.28 Messenger proteins carry and exchange information between different bio components.29 Overall, proteins play a crucial role in the functioning of living organisms. 1.1.2 Enzymatic Catalysis Enzymes serve as biological catalysts and facilitate the transformation of various biological substrates into different products during metabolic reactions.28 They are crucial for cell biological processes as enzymes regulate most chemical reactions. Enzymes usually catalyze a specific reaction from a selective substrate to a particular product,as the binding site of 5 Figure 1.4 Free energy change of nonenzymatic and enzymatic reactions. enzymes forms a unique pocket for its substrate to bind, and the catalytic site carries out the chemical transformation.30 From a thermodynamic standpoint, a chemical reaction involves changes in enthalpy, entropy, and free energy.31 Enthalpy change indicates the amount of heat absorbed or released during the reaction, while entropy change refers to the degree of disorder before and after the reaction. Free energy change determines whether the reaction is spontaneous or requires an external force. As the reaction proceeds, the system must break the existing bonds between the reactant molecules or atoms and form new bonds to create the products. This process requires overcoming an energy barrier, which is the minimum energy required for the reaction to occur. The energy barrier is called activation energy. The maximum point on the reaction’ free energy profile is the transition state. For enzymatic reactions, the substrate binds to the enzyme, leading to the formation of a ligand-enzyme complex with a reduced energy barrier compared to non- catalyzed reactions (Figure 1.4). The temporary and dynamic complex that stabilized by non- 6 covalent interactions between the enzyme and the ligand decomposes into products and free enzymes during the reaction. From a kinetics standpoint, the reaction rate is dependent on the concentration of reactants, temperature, presence of a catalyst, and reaction surface area of the reactants and catalysts. The rate law defines the reaction rate dependence on the concentration of reactants. The activation energy also decides the reaction rate. A lower activation energy can expedite the reaction. Enzymes could effectively decrease the activation energy and increase the reaction rate. Therefore, studying the kinetics is crucial for quantifying enzyme efficiency. The transient-state kinetics refer to the stage before the reaction reaches a steady state, in which enzymes and substrate quickly react.32 Steady-state kinetics are achieved when the rate of formation of the intermediate is equal to its rate of decomposition; thus, the concentration of the intermediate stays constant.33 The mutation of the active site residues or the addition of enzyme inhibitors would modify the activity of enzymes and alter the reaction rate.34 1.1.3 Examples of Enzymatic Cascades If an enzyme-catalyzed reaction step generates a stable intermediate, which could be used as the reactant in a following enzyme-catalyzed step, till the ultimate product is formed, this Figure 1.5 Four different types of enzymatic cascades: (a) linear cascade, (b) orthogonal cascade, (c) parallel cascade, and (d) cyclic cascade. This figure is from ref. 36. 7 whole process can be called a multi-enzymatic "cascade" or "domino" reaction.35 There are four main types of multi-enzymatic cascade reactions (Figure 1.5): (a) Linear cascade, where only one intermediate exists, which is produced by one substrate and would be used as reactant on the following enzyme active site to create the final product; (b) Orthogonal cascades, wherein the substrate transforms into the product, and the reaction is coupled with other reactions; (c) Parallel cascades, in which two reactants exist, which can be converted into two products; (d) Cyclic cascade, two substrates transform into an intermediate and product, and the intermediate could be transformed into the original substrates, and after many cycles the product accumulates.35,36 These are not strict definitions, and a natural reaction could use a combination of these cascade reactions. There are many examples of natural cascades, and these natural cascades are also called metabolic pathways. For example, glycolysis is the first metabolic pathway found in nature.37 In glycolysis, glucose decomposes into pyruvate, releasing energy, after going through 10 steps of enzymatic reactions. Another well-known illustration of an enzymatic cascade reaction is the tricarboxylic acid cycle, or the TCA cycle. The TCA cycle plays a crucial role in the natural processes as it serves as the second phase of cellular respiration. It occurs in the mitochondria of many plants, animals, and bacteria.38 The TCA cycle facilitates the process of oxidizing organic fuel molecules, such as glucose and fatty acids, to produce ATP and allows for the capture of energy through this conversion.39 Srere et al. proposed that the TCA cycle occurs in a super- complex composed of eight enzymes and involves an 8-step enzymatic reaction.18 It is also assumed super enzyme complexes (also called metabolons) shorten the transfer path for intermediates, and channels these intermediates from one enzyme to another enzyme.18,40 8 1.2 Substrate Channeling Mechanisms Cascade reactions have many advantages such as minimal waste generation and reduction of cost in time and labor. However, they suffer from low efficiency and low yield of the final product.41 The low efficiency of intermediate transfer is the primary reason for this, as the intermediate typically transfer through and equilibrate with the bulk solvent before reaching the desired reaction site; in the end, only a small portion of the intermediate could successfully continue the reaction. Over thousands of years of evolution, nature has developed many strategies to facilitate cascade reactions and enhance reaction efficiency. One strategy, substrate channeling, refers to the direct transport of intermediates between active sites on consequent enzymes42. Substrate channeling primarily aims to prevent intermediates produced at the previous enzyme from diffusing into the bulk environment instead of going to the following enzyme. With the help of substrate channeling, the intermediates are isolated within active sites, reducing the risk of exposure to the bulk, avoiding contact with other toxic intermediates, and avoiding competing reactions.4 Many different mechanisms for channeling are currently known as explained below. 1.2.1 Molecular Tunneling Molecular tunnels are found in natural enzymes, and function to transfer intermediates between adjacent active sites in enzymatic cascade reactions.43 They are mainly formed by backbone atoms and hydrophobic and nonpolar amino acids and prevent intermediate loss to bulk accumulation and undesired side reactions.44 Intramolecular tunneling can achieve a channeling probability of 100%, which is a great advantage compared to the less efficient channeling of other mechanisms. 9 Intramolecular tunnels occur in many supramolecular complexes. Discovery of intramolecular tunnels was enabled by the development of X-ray crystallography, which made it possible to examine the protein structures on a detailed chemical level. It once was considered that active sites on the enzymes would be arranged close with each other as enzymes folded. However, the structure observation of tryptophan synthase under X-ray by David Davies rejected this assumption in 1988.45 There are two subunits on tryptophan synthase, a-subunit and b- subunit, catalyzing two distinct reactions. On the a-subunit, indole-3-glycerol phosphate (IGP) decomposes into indole as the intermediate and glyceraldehyde-3-phosphate (G3P), while the on b-subunit, indole reacts with L-serine to produce the final product L-tryptophan (L-TRP), as illustrated in Figure 1.6b. It was observed that there existed a predominantly hydrophobic tunnel of about ~25 Å connecting the individual active sites on a- and b-subunits (Figure 1.6). Kinetic studies have shown the concentration of indole intermediate is deficient in the bulk.46 Thus, this confirmed that the tunnel could effectively transfer indole between consequential active sites. (a) (b) Figure 1.6 (a) Ribbon representation of tryptophan synthase. The a- and b-subunits are colored in yellow and blue, respectively. A tunnel connecting the two active sites in the two functional units is indicated by the red chicken wire tube. Indole propanol phosphate and pyridoxal phosphate are abbreviated as IPP and PLP, respectively. (b) The reaction catalyzed by tryptophan synthase. This figure is from ref. 43. 10 After the molecular tunnel was found in tryptophan synthase, a similar NH3 tunnel was found in many enzymes. Examples of these enzymes are Carbamoyl Phosphate Synthetase (CPS), Glutamine Phosphoribosylpyrophosphate Amidotransferase (GPATase), Asparagine Synthetase, and Glutamate Synthase.47–50 All these enzymes use NH3 from the hydrolysis of glutamine as a nitrogen source. 1.2.2 Electrostatic Channeling Electrostatic channeling uses electrostatic interaction between charged intermediate and an oppositely-charged electrostatic surface to control translocation of catalytic intermediates.15 It is the most promising substrate channeling mechanism applied to artificial cascade reactions because it is not limited by enzymes' structural organization, and most intermediates are charged.20 In recent decades, considerable efforts have focused on two well-studied examples of electrostatic channeling in nature. The first example is the bifunctional enzyme dihydrofolate reductase-thymidylate synthase (DHFR-TS), where negatively charged dihydrofolate intermediates are guided by bounded diffusion on the positively charged protein structure between two active sites.15,51 The other often-studied example is the enzyme complex of malate dehydrogenase (MDH) and citrate synthase (CS) in the tricarboxylic acid (TCA) cycle, where OAA, carrying two negative charges, crosses the positively charged surface between MDH and CS via electrostatic interaction.18,19,40 1.2.3 Spatial Organization Spatial organization refers to the arrangement of different enzymes and metabolic pathways within a cell to facilitate the transfer of intermediates between enzymatic active sites and pathways.4 In nature, enzyme reactions are spatially organized through many mechanisms, 11 such as immobilization on scaffolds and encapsulation. The use of polymeric capsules, DNA scaffolds, protein scaffolds, and other similar systems for co-immobilizing enzymes has been demonstrated to be effective in building complex multi-enzyme systems.52–55 It was initially thought that bringing enzymes in close proximity would significantly reduce the chance of leakage into bulk, enhancing the transport efficiency of intermediates.56 However, the effect of spatial organization is more complicated than previously assumed.57 For example, Fu et al. reported that two sequential reacted enzymes, glucose oxidase and horseradish peroxidase, were co-assembled on DNA nanostructure at various distances (Figure 1.7). The activity was highest at a distance of 10 nm between the two enzymes. Thus, they concluded that because the spatial organization of two enzymes prevented the H2O2 intermediate from diffusing into the bulk, the reaction rate was highly enhanced.58 But, the idea has been challenged by Zhang et al. who argued that the local pH decreased with increasing enzyme loading on the DNA scaffold, which changed the reaction mechanism for peroxide’s consumption.59 Thus, the exact mechanism is still a debate and needs to be determined by further study.57 Figure 1.7 DNA scaffold enables regulation of enzyme proximity. The efficiency of a cascade reaction between Glucose Oxidase (GOx) and Horseradish Peroxidase (HRP) depends on the spacing of the enzymes. This figure is from ref. 58. 12 However, there are other successful examples available. For instance, Kim and coworkers used protein-based scaffolds instead of nucleic acid-based scaffolds to immobilize enzymes. They found that the conversion of pyruvate to 2,3-butanediol in yeast was increased up to 37% compared with the scaffold-free system.60 In addition, polymer-based systems can be used to encapsulate enzymes. Van Dongen et al. positioned three sequentially catalytic enzymes in a polymersome by putting one enzyme on the membrane, one within the lumen (i.e., the central cavity of the polymersome), and one on the exterior surface.61 This high incorporation efficiency of the enzymes in the polymersome nanoreactor demonstrates the viability of polymer-based enzyme encapsulation.61 However, spatial organization presents several challenges: (1) stability of active site: the spatial organization may distort the active site and reduce the efficiency; (2) undisered interactions: enzymes may interact with other molecules in the same environment, including inhibitors, which can impact enzyme’s activity; (3) complicated activity: it is important that the active sites of enzymes are stable and correctly positioned for the reaction to occur, and any disruption to their organization can have significant consequences of the activities.62 In most cases, proximity cannot even be achieved, so the application range of spatial organization is limited. 1.3 Experimental Analysis Approaches There are many experimental approaches to detect and evaluate substrate channeling. The choice of a specific method to characterize the performance of substrate channeling largely depends on the enzymes involved in the cascade reaction, kinetics at each active site, and the mechanism of substrate channeling. Thus it is critical to know about commonly available methods and how they may be applied to different enzyme systems. Transient time analysis (t), 13 isotope dilution or enhancement, and competing reaction effects are three standard experimental approaches to detect and characterize substrate channeling.63 1.3.1 Lag Time Analysis Lag time is the time needed for the intermediate to reach steady state flux. If perfect substrate channeling occurs, the transient time should approach 0 s, whereas if leaky substrate channeling occurs, the transient time will increase. If transient time is significant, this indicates that the intermediate may freely diffuse into bulk solution. Thus, lag time analysis is a critical approach to evaluating substrate channeling. For linear cascade reactions expressed in Scheme 1.1, the lag time can be derived as below: k k S ⎯E⎯ 1 → I ⎯E⎯ 2 →P 1 2 Scheme 1.1 A linear cascade reaction. where S is the substrate, I is the intermediate, P is the product, E1 is the first enzyme, and E2 is the sequential enzyme. If we make the following assumptions: (1) the first reaction is an irreversible and zeroth-order reaction, whose rate constant is k1 (M/s); (2) the second reaction is an irreversible and first-order reaction with respect to the concentration of intermediate, whose rate constant is k2, the rate equation for intermediate and product are64: 𝑑𝐶! 1-1 = 𝑘" − 𝑘# ∙ 𝐶! ; 𝐶! (0) = 0 𝑑𝑡 𝑑𝐶$ = 𝑘# ∙ 𝐶! 1-2 𝑑𝑡 By solving Equation 1-1 and Equation 1-2, we obtain 𝑘" 𝐶! (𝑡) = ∙ (1 − exp(−𝑘# ∙ 𝑡)) 1-3 𝑘# 𝑘" 𝐶$ (𝑡) = 𝑘" ∙ 𝑡 − (1 − exp(−𝑘# ∙ 𝑡)) 1-4 𝑘# Assuming, 14 1 𝑘# = 1-5 𝜏 where t is the lag time. At steady state, the equations of 1-3 and 1-4 are65 𝐶$%% = 𝑘" ∙ (𝑡 − 𝜏) 1-6 𝐶!%% = 𝑘" ∙ 𝜏 1-7 By measuring the concentration change of the product, CP evolution over time can be plotted. The linear intercept of CP is the lag time, as shown in Figure 1.8a.65 Figure 1.8 (a) Lag time in coupled enzymatic reactions. Product, P, versus time is showing lag time t = 10 s and steady-state concentration of intermediate, I, is 5 µM. k1 =0.5 µM/s, k2 = 0.1 s-1. (b) Isotope enrichment comparison between free diffusion and channeling. (c) Competing reaction effect of MDH-CS system with a competing enzyme aspartate aminotransferase that consumes the intermediate oxaloacetate. This figure is from ref. 4 and ref. 65. 15 1.3.2 Isotope Dilution or Enhancement In Scheme 1.2, substrate, A, is isotopically labeled. Intermediate, B, and product, C, will be isotopically labeled as the reaction proceeds. But there are many unlabeled intermediate, B, in bulk that would produce unlabeled, C. Therefore, if perfect substrate channeling exists, there should have no accumulation of B* in bulk.4 Based on the ratio of unlabeled product, C, to labeled product, C*, or the concentration of the labeled intermediate, B*, in the bulk, one can quantify the channeling efficiency. A low concentration of B* indicates an efficient channeling. For example, as shown in Figure 1.8b, a freely difffusing monofunctional TS and DHFR results in 55% labeled product and 20% labeled intermediate in the bulk, whereas the labeled product is about 100% and the label intermdiate is only 5% in the bifunctional TS-DHFR cascade.66 While this approach has demonstrated efficacy in different systems, its applicability may be constrained in systems where isotopic labeling is not feasible or where labeling could affect the reaction kinetics.4 Scheme 1.2 A cascade reaction with isotopical labeling. 1.3.3 Competing Reaction Effects. Competing reaction effects can be beneficial in detecting the presence or absence of substrate channeling. As shown in Scheme 1.3, a competing reaction that consumes intermediate, B, can be introduced to evaluate substrate channeling of intermediate B generated from substrate, A, by providing another enzymatic reaction in bulk. If the intermediate, B, diffuses into the bulk phase, then it would be consumed by the competing enzyme to generate product, D, instead of the desired product, C. In the absence of channeling, the cascade activity will significantly decrease as the intermediate is consumed by competing reaction in the bulk.65 As indicating in 16 Figure 1.8c, in the example of MDH-CS cascade, with the incorporation of competing enzyme aspartate aminotransferase in the bulk, the activity decreased from 60% to 25%.4 Scheme 1.3 A cascade reaction with competing effect. 1.4 Computational Techniques Though many advanced experimental approaches are available for investigating substrate channeling as mentioned previously, they are unable to monitor substrate channeling in real- time, or probe the mechanism at an atomic scale. Computational techniques can provide detailed insight to explain experimental results. Different methods have different simulating time scales and length scales. The employment of proper computational techniques is crucial for mimicking the simulation system. 1.4.1 Continuum Model The continuum model is a mathematical representation of a physical system that assumes a system as a continuous medium without taking into account the underlying atomic or molecular structure.67 This approach is based on the idea that the properties for large-scale systems, including temperature, pressure, or concentration, are continuous and can be described as an infinite number of infinitesimal elements. The finite element method is frequently utilized to calculate the properties of a continuum model.68 Continuum models are accurate for simulations at large scales, as it ignores the atomic and molecular interaction with the system. Previously, continuum models have been applied to study substrate channeling in our group.69,70 Continuum models were employed to study molecular tunnels by varying the tunnel's geometry in terms of length, radius, and shape.69 The findings suggest that minimizing the 17 distance between active sites or the diameter of the tunnel can maximize product yield. Continuum models were also used to study two different channeling mechanisms together, including channeling by proximity via surface adsorption and electrostatic channeling.70 The result is summarized as that channeling by proximity via surface adsorption is less efficient than electrostatic channeling in terms of yield.70 1.4.2 Brownian Dynamics Simulation Brownian dynamics is a computational simulation technique used to model the random motion of particles.71 It is based on the Brownian motion theory, which describes the random movement of particles that are subjected to collisions with neighboring molecules. Thermal motion drives all the movement in the simulating system. Many biological processes take place in a mesoscopic regime where thermal motion controls kinetics and diffusion.71 Brownian dynamics simulations are suitable for understanding these systems. McCammon's group employed Brownian dynamics to investigate electrostatic channeling. This approach considers the Brownian motion of all the components including enzymes and ligands, their interactions with the solvent molecules, and the electrostatic force generated by the enzymes' electrostatic field.19,71,72 They focused on the malate dehydrogenase (MDH)-citrate synthase (CS) system.19 Different sources of MDH-CS metabolons were investigated, including native bovine, wild-type porcine, and recombinant sources. They found that the electrostatic channeling was observed in native bovine and recombinant sources, which guided the transfer of OAA and enhanced the transfer efficiency, whereas the long diffusion length in wild-type porcine led to a low channeling probability. The problem with this model is that the intermediate was considered to be a charged sphere, which is inaccurate and cannot describe the interaction between the intermediate and the charged surface as it transfers. 18 1.4.3 Molecular Dynamics Simulation Molecular dynamics (MD) is a technique that solves Newton's laws to estimate atomistic motion in a molecular system with parameters from classical force fields.73 By utilizing parameters defined in force fields, molecular dynamics can anticipate the trajectory of individual atoms within a system. At an atomistic level, MD simulations can analyze the dynamics, interactions, conformations, and characteristics of a molecular system. The driving force comes from the gradient of the potential:74 𝜕𝑈 𝜕𝑈 𝜕𝑈 𝐹& = −∇'! 𝑈(𝑟& , … , 𝑟( ) = −( , , ) 𝜕𝑥& 𝜕𝑦& 𝜕𝑧& 1-8 𝑎& 𝑏& 𝑈(𝑟& , … , 𝑟( ) = < (𝑙& − 𝑙&) )# + < (𝜃 − 𝜃&) )# 2 2 & *+,-% .,/01% 𝑐& + < [1 + cos(𝑛𝜔& − 𝛾& )] 1-9 2 2+'%&+,% 𝜎&3 𝜎&3 𝑞& 𝑞3 + < 4𝜖&3 [( )"# − ( )4 ] + < 𝑘 𝑟&3 𝑟&3 𝑟&3 567 10182'+%2.2&8 In the above potential equation, 𝐹! is the driving force, 𝑈(𝑟! , … , 𝑟" ) is the potential of the system. The first three terms are the bonds, angles, and torsions from the covalent bonding. The final two expressions represent Van der Waals interaction and electrostatic interaction from the nonbonded forces. Molecular dynamics (MD) is another critical approach to studying substrate channeling. Liu et al. used MD to conduct a simulation by building artificially charged polypeptide bridges connecting hexokinase (HK) and glucose-6-phosphate dehydrogenase (G6PDH) using different peptides including (lysine, arginine, and histidine) and different intermediates including (glucose-6-phosphate, oxalate, and glyoxylic acid).20 It was found that the combination of 19 glucose-6-phosphate (charge=-2) as intermediate and poly-lysine (1e- per lysine) as bridge is the most efficient electrostatic channeling system. Molecular dynamics give us more information on the atomic scale with relatively high efficiency, so it is an excellent model for studying substrate channeling. However, MD simulations tend to become trapped within the local energy minima and inadequately sample events of interest within microsecond timescales. Therefore, advanced sampling techniques or data analysis models combined with MD are needed to overcome this limitation. 1.4.4 Umbrella Sampling Umbrella sampling is a method to enhance the sampling of rare events or states that are energetically unfavorable and occur at a higher energy barrier compared to the system's equilibrium state.75 It involves the application of an external harmonic potential to generate multiple biased simulations, i.e., windows, that allow for the calculation of free energy differences and the determination of the system's thermodynamic properties. By selecting appropriate reaction coordinates, umbrella sampling can analyze the energy landscape of a system with significant energy barriers between distinct regions when used in conjunction with molecular dynamics simulations. Steered molecular dynamics (SMD) is commonly employed as a preparatory step for umbrella sampling, wherein it assists in selecting the initial configurations for the individual windows in the umbrella sampling process. During SMD, an external force is typically exerted on one or more parts of the simulating system by stretching or pulling them. For a specific window of umbrella sampling, after a harmonic bias is applied, the system can only sample a narrow range of configuration space. The weighted histogram analysis method (WHAM) algorithm is often used to recover the unbiased distribution of umbrella sampling simulations.76 Umbrella sampling is widely used, for example, to quantify energy profiles within 20 molecular tunnels and to study the desorption energy of intermediates in electrostatic channeling.77,78 1.4.5 Markov State Model Markov State Models (MSM) use short-term dynamics to build a matrix of transition probabilities between states and predict long-term kinetics.79 Paired with molecular dynamics, MSM have been developed as a powerful approach to study the dynamics evolution. MSM enables many property calculations, including mean first passage time which is defined as the expected amount of time it takes for a system to transition from one state to another for the first time,80 identification of dominant pathways combined with the transition path theory, and free energy profiles.81 In particular, the hub score is a metric that quantifies the degree to which a set of states mediates transport between two other groups of states; the hub score can therefore be used to identify intermediate states essential to a given transport process.82 Figure 1.9 Representation of umbrella sampling simulation. The red lines denote the harmonic bias potentials that are employed in the umbrella sampling simulation windows. 21 Recently, MSMs have been proven to be powerful in studying various systems, such as binding, dissociation, allostery, and folding processes.83–86 Previously, MSM was adopted to investigate the G6P’s transport process via electrostatic channeling on the enzyme surface by Liu et al. (Figure 1.10).87 They found that the transport efficiency of G6P from a peptide bridge to G6PDH is consistent with that of transition state theory(TST), indicating the success of MSM. However, the transport pathway from the bridge to the other enzyme, HK, is ~4.1 nm, which is much longer than that of G6PDH being ~ 2nm, so the transport process cannot be simplified and studied by TST. As MSM is useful in predicting long-term kinetics, it was extended to study the transport efficiency of G6P from a peptide bridge to HK. 1.4.6 Metadynamics In a metadynamics simulation, the external history-dependent bias potential V(s, 𝑡 # ) is applied over selected collective variables (CV) at time 𝑡 # , with the following equation:88 ("# (s − 𝑠& (𝑡 : ))# 𝑉(𝑠, 𝑡 : ) = ω(𝑡 : ) ∙ exp (− < ) 1-10 2𝜎&# %! ~4.1 nm ~2 nm Figure 1.10 The application of MSM models in studying the transport process of G6P from the peptide bridge to both enzyme HK and enzyme G6PDH. This figure is from ref. 87. 22 where 𝜔 is the amplitude of each Gaussian-shape bias, and 𝜎! is the width of the bias deposited over the ith CV 𝑠! . In well-tempered metadynamics,89 bias amplitude, ω, is determined by: −β𝑉(𝑠, 𝑡 : ) ω(𝑡 : ) = 𝜔) exp ( ) 1-11 (1 − 𝛾) where 𝜔$ is the initial amplitude, 𝛽 is 1⁄𝑘% 𝑇, and 𝛾 is the bias factor. Through biasing, time- consuming transitions hindered by large energy barriers can be observed in MD simulations. Metadynamics was initially designed to construct the thermodynamics free energy profile (Figure 1.11), but recently, it has been extended to extract a system’s kinetics.90 Tiwary and Parrinello’s infrequent metadynamics (InMetaD) method has successfully recovered several systems' kinetic rate constants.91 With a larger deposition interval, InMetaD adds bias much less Figure 1.11 Representation of metadynamics simulation. The black line represents the free energy profile, while the solid gray line shows the deposition of Gaussian potential, which causes the free-energy landscape (dashed gray line) to rise. As more Gaussian potential is added, the first metastable state becomes filled, and the system moves into the second metastable basin. Once the second metastable basin is filled, the landscape becomes flat, and the deposited bias (solid gray profile) can be used to estimate a rough negative free energy profile. This figure is from ref. 90. 23 frequently than metadynamics, which usually adds bias about every ps. Through applying bias to the system infrequently, InMetaD accelerates system evolution out of the bottom of energy basins. Because InMetaD does not bias high-energy configurations, it keeps the system’s kinetics undisturbed during transition states. Therefore, post-simulation analysis can recover the unbiased transition times, t, from the biased simulated time, t’, with the following equation:92 2%&'() $) 𝑡=T 𝑑𝑡 : 𝑒 ;5(%,2 1-12 ) where 𝑡&'()* is the metadynamics simulation time when the transition occurred, and 𝑉(𝑠, 𝑡 # ) is the bias potential at time 𝑡 # . InMetaD has been successfully applied to ligand binding and unbinding studies of various systems.92–94 For example, InMetaD has successfully estimated the unbinding rate of an inhibitor from the p38 mitogen-activated protein (MAP) kinase to be 0.02 s-1, which is comparable to the experimental result 0.14 s-1.92 InMetaD also revealed the unbinding mechanism of a ligand from 𝛼7 nicotinic acetylcholine receptor with the unbinding rate to be 3.3 ms-1, which is consistent with the experimental value of 0.3 ms-1.93 1.4.7 Kinetic Monte Carlo Kinetic Monte Carlo (KMC) is a stochastic method that uses random numbers to simulate dynamic systems.95 KMC is useful for studying systems with complex dynamics and has been widely applied in surface diffusion, crystal growth, and heterogeneous catalysis.95 It is often used in conjunction with other simulation techniques, such as molecular dynamics, to provide a more complete understanding of complex systems. KMC simplifies a problem as a set of events with specific rate constants corresponding to each event. According to the KMC algorithm, a random number 𝜌+ is generated to choose an event, whose range is from 0 to 1:96 24 &* @" &* < 𝑘& < 𝜌" ∙ Γ2+2.0 ≤ < 𝑘& 1-13 &?" &?" where Γ(,()- is the sum of all rate constants at the current KMC step. Then the event 𝑖$ is executed with the time ∆𝑡:96 ∆𝑡 = −ln (𝜌# )/Γ2+2.0 1-14 where 𝜌. is the second random number between zero and 1. After each event execution, the occupancy and rate values is updated. Previously, our group has applied KMC to calculate the lag time for electrostatic channeling of glucose-6-phosphate (G6P) on a polypeptide bridging hexokinase (HK) to glucose-6- dehydrogenase (G6PDH).78 All the events in the channeling process, including hopping, desorption, adsorption, and reaction, were taken into consideration with specific rate constants parametrized from molecular dynamics and umbrella sampling. In the end, the plot of product evolution as a function of time was obtained and the lag time was extrapolated from the plot. The lag times calculated from KMC models compared well to those measured in experiments, validating the effectiveness of the KMC model. 1.4.8 Coarse-Grained Molecular Dynamics Experiments used for investigating the dynamic behavior of biological systems can only achieve a time resolution ranging from microseconds to milliseconds. However, standard molecular dynamics simulations are restricted to a few hundred nanoseconds. The discrepancy between the timescales of experiments and molecular dynamics simulations can be bridged by utilizing coarse-grained molecular dynamics (CG MD).97 25 Figure 1.12 A comparison representation of all-atom model and coarse-grained model. This figure is from ref. 99. The time scale of coarse-grained simulations is much longer compared to conventional MD due to the reduced representation of the system, which results in fewer degrees of freedom and a faster simulation time.98 Specifically, a group of atoms is represented by a single coarse- grained bead, instead of individual atoms, and multiple water molecules are jointly represented as a single coarse-grained bead (Figure 1.12).99 The biomolecular systems are simplified, and there could be run for a much longer timescale of up to seconds, allowing direct comparison with experimental observations. The coarse-grained force field Martini is widely used in simulating biomolecular systems.100 CG MD has been applied in many fields, including lipids, polymers, and proteins.98,101,102 For instance, Bond et al. used CG MD to explore the interactions between a number of model peptides (e.g., WALP and LS3) and simple proteins (e.g., the bacterial sugar transporter LacY) and the membrane during in the insertion processes of peptides/proteins into a DPPC lipid bilayer.98 They found that CG simulations can correctly predict the experimentally characterized membrane orientations and identify the nature of lipid/protein interactions. 1.5 Overview of Work The work aims to computationally study the mechanism of substrate channeling in cascade reactions, both thermodynamically and kinetically. Electrostatic channeling and 26 molecular tunneling are the two main strategies of substrate channeling considered here. Our research investigates electrostatic channeling in both natural and artificial cascade systems. In addition, we also study the mechanism of molecular tunnels. This thesis is structured as follows: As shown in Figure 1.13, chapter 2 explores the electrostatic channeling mechanism of G6P on arginine peptides as a rare event and computes the hopping activation energy using infrequent metadynamics. Umbrella sampling simulations capture configurational changes in the desorption process and help determine the desorption energy. By considering these energy barriers, KMC simulations were performed to study the transfer efficiency of G6P. Results indicate that, even at high ionic strength (120 mM), a poly-arginine peptide bridge may be capable of electrostatic channeling of G6P with an efficiency much greater than that of an equivalent poly-lysine bridge. Chapter 3 examines the transport process using electrostatic channeling of the intermediate oxaloacetate (OAA) from malate dehydrogenase (MDH) to citrate synthase (CS). The Markov State Model (MSM) identifies the dominant transport pathways of OAA from MDH to CS. Analysis of all pathways using a hub score approach reveals a small set of residues that control OAA transport. This set includes an arginine residue previously identified experimentally. In addition, MSM analysis of a mutated complex, where the identified arginine is replaced by alanine, led to a two-fold decrease in transfer efficiency, consistent with experimental results. Chapter 4 studies the transport mechanism of NH3 in the molecular tunnel in asparagine synthetase (AS). We used molecular dynamics (MD) simulations combined with umbrella sampling (US) and a Markov State Model (MSM) to discover the interaction between NH3 and the tunnel. The energy profile constructed from US indicates that four energy maxima exist. The 27 highest energy barrier separates the first site basin from the mid-tunnel basin. This bottleneck is associated with the narrowest part of the tunnel formed by four hydrophobic residues. Compared to the energy profile for the mutant A388L proposed in experiments, the mutation leads to an even higher energy barrier located at the lower part of the tunnel that prevents NH3 from passing through and significantly reduces the tunneling probability. In Chapter 5, we aim to explore the interaction between hexokinase (HK) and glucose-6- dehydrogenase (G6PDH) with the existence of a free polypeptide bridge. CG MD examined the distance and the interaction energy between the enzymes. The results showed that HK and Figure 1.13 Overview of the work. (a) Electrostatic channeling of G6P on a polyarginine bridge, (b) Electrostatic channeling of OAA on MDH-CS complex, (c) Molecular tunnel of NH3 in asparagine synthetase, (d) Coarse-grained molecular dynamics study of HK- G6PDH. 28 G6PDH had a tendency to be in close proximity, which led to a notable rise in the interaction energy. 29 Chapter 2. Infrequent Metadynamics Study of Rare-event Electrostatic Channeling* 2.1 Introduction Cascade reactions that combine multiple chemical transformations into a single step, without purification steps, have attracted attention recently due to benefits such as minimal complexity, reduced labor requirements, and decreased waste.103 Design and control of cascade reactions requires a deep understanding of the mechanism and a substantial toolbox with which to build cascade architectures. Many examples are found in nature where substrate channeling improves the efficiency of cascade reactions, by enabling intermediate transport between adjacent reaction active sites without equilibrating with a bulk concentration.4 Intermediates can be thereby protected from unfavorable binding or reaction in the bulk, enhancing overall cascade reaction efficiency. Some strategies of substrate channeling observed in nature are intramolecular tunneling,104 chemical swing arms,54,105 spatial organization,17,54,106 and electrostatic channeling.106–108 Electrostatic channeling controls the transport of charged intermediates via electrostatic interaction with the oppositely charged surface between active sites. It is a promising way to channel artificial cascades because it is not limited by structural aspects, and most metabolic intermediates are charged.109 The transport efficiency of electrostatic channeling can be measured by the determination of the transient lag time—the time the system takes to reach steady-state flux. For a perfect * This work is published as Y. Xie and S. C. Barton, "Infrequent metadynamics study of rare-event electrostatic channeling", Physical Chemistry Chemical Physics, 23, 13381–13388 (2021). doi:10.1039/D1CP01304A. 30 channeling mechanism, the lag time is near zero; for an electrostatically channeled system, a lower lag time reflects increased channeling efficiency.110 In recent decades, considerable efforts have focused on two well-studied examples of electrostatic channeling in nature. The first example is the bifunctional enzyme dihydrofolate reductase-thymidylate synthase (DHFR-TS), where bounded diffusion guides negatively charged dihydrofolate intermediates on the positively charged protein structure between two active sites.109 The other often studied example is the super enzyme complex of malate dehydrogenase (MDH) and citrate synthase (CS) within the tricarboxylic acid (TCA) cycle, where oxaloacetate, carrying two negative charges, crosses the positively charged surface between MDH and CS via electrostatic interaction.111 Structural characterization of each of these systems has revealed a charged surface present between active sites, and lag time analyses have demonstrated the influence of this surface on kinetics.40,109 However, experimental approaches provide a limited view of how intermediates interact with these electrostatic surfaces. Thus, advanced computational simulations provide a view into the interaction mechanisms underlying electrostatic channeling. Computational studies have addressed substrate channeling mechanisms, including DNA- scaffolded bienzyme systems,106,112,113 co-immobilization,114 and encapsulation,115–118 in artificial cascades. Our group and collaborators have recently reported studies of electrostatic channeling in an artificial cascade comprising hexokinase (HK) and glucose-6-phosphate dehydrogenase (G6PDH) where the following reactions take place:20,78,87 31 Scheme 2.1 Dual site conversion of glucose to gluconolactone via hexokinase (HK) and glucose- 6-phosphate dehydrogenase (G6PDH). Glucose-6-phosphate (G6P) is a stable intermediate. In those works, an enzyme complex of HK and G6PDH was covalently conjugated by a positively charged poly-lysine peptide, and the negatively charged intermediate glucose-6- phosphate (G6P) was channeled through electrostatic interactions. A hopping transport mode was considered, and diffusion and desorption energy barriers were computed for intermediate transport along the poly-lysine bridge. Computational results along with experimental data were used to parameterize a kinetic Monte Carlo (KMC) model to compute the transport efficiency of G6P in terms of lag time. Computational results matched well with the experiment, and both results demonstrated the effectiveness of the poly-lysine bridge in decreasing lag time by the increasing transport efficiency of the intermediate. Electrostatic channeling efficiency was shown to decrease with increasing ionic strength, and loss of intermediate was found to be significant at each terminal enzyme, particularly for HK, which has a significant net negative charge. As an alternative cationic amino acid to lysine, arginine residues are more polarized and can form multidentate hydrogen bonds with phosphate groups,119 enabling stronger interaction with charged intermediates. Because of the strong interaction, poly-arginine peptides are more electrostatically “sticky” than poly-lysine, increasing both the desorption energy and hopping activation energy of G6P on poly-arginine peptides. High desorption energy lowers desorption probability and prevents G6P’s leakage at high ionic strength. However, such high hopping activation energy traps G6P in energy wells on the poly- arginine peptide, increasing the time scale between observed hopping events to hundreds of 32 nanoseconds. Hopping of G6P in the presence of arginine is thus a rare event in MD simulations. In contrast, the hopping process of G6P remains orders of magnitude faster than the HK enzyme’s turnover frequency of 0.7 s-1.20 Based on these considerations, poly-arginine peptides represent a promising electrostatic bridge candidate, and the study of rare hopping events using poly-arginine bridges would benefit from advanced sampling techniques. Among the methods of advanced sampling of MD simulation, infrequent metadynamics (InMetaD)91 is efficient in calculating kinetic rates and has been successfully applied to ligand binding and unbinding studies of various systems.92–94 For example, InMetaD has successfully estimated the unbinding rate of an inhibitor from the p38 mitogen-activated protein (MAP) kinase to be 0.02 s-1, comparable to the experimental result 0.14 s-1,92 and reveals the unbinding mechanism of a ligand from 𝛼7 nicotinic acetylcholine receptor with the unbinding rate to be 0.0033 s-1, comparable with the experimental value of 0.0003 s-1.93 InMetaD enhances the sampling of a system by adding bias on predefined collective variables (CVs) to release the system from energy minima, enabling observation of system dynamics and kinetics. A key aspect of implementing InMetaD is the selection of CVs. Good CVs properly describe a system in terms of both thermodynamics and kinetics, and common CVs include separation distances between atoms and molecules, bond angles, and combinations of these. Here we use infrequent metadynamics to study the rate and energy barrier of hopping of G6P on a poly-arginine peptide. The desorption pathway and energy barrier were studied using Umbrella sampling.120 Rate constants obtained by infrequent metadynamics and umbrella sampling were used as parameters for KMC studies to estimate lag time due to transport over the 33 polypeptide surface. These studies provide an improved understanding of the influence of scaffolding architecture in electrostatic channeling. 2.2 Methods 2.2.1 MD Simulation Model poly-arginine peptide structures were built using Avogadro software.121 Two alanine residues on each end and eight arginine residues in the middle were used to form a peptide chain (Figure 2.4a). The structure of glucose-6-phosphate (G6P) was obtained from the Zinc database.122 Molecular dynamics (MD) simulations were conducted using GROMACS 2020.1123 with the CHARMM36 force field.124 G6P was parameterized using CgenFF (CHARMM General Force Field).125 The TIP3P model was used to solvate the peptide-ligand complex in a dodecahedral box, and six Cl- ions were added to neutralize the system. Energy minimization was conducted with the steepest descent minimization algorithm. Afterward, the system was equilibrated with an NVT ensemble for 0.1 ns and an NPT ensemble for 1.0 ns. MD simulations were conducted under an NPT ensemble with temperature governed by the velocity- rescale thermostat and pressure governed by the Parrinello-Rahman barostat. Periodic boundary conditions were applied in all directions for all simulations. Simulations were conducted at five discrete temperatures between 280 K and 320 K with the first and last alpha carbons of the peptide restrained to mimic the situation when the bridge is incorporated into the HK-G6PDH cascade. At each temperature, 32 parallel simulations were run with random initial velocities generated according to the Maxwell-Boltzmann distribution at the designated temperature. 34 2.2.2 Infrequent Metadynamics Metadynamics simulation was conducted using GROMACS with the PLUMED 2.2 plugin.126 In this work, the initial height of the potential bias 𝜔$ is 1.0 kJ/mol and the bias factor 𝛾 is 15. The bias potential was modified every 10 ps. We adopted three CVs based on distances between the phosphate group of G6P and arginine residues. The first, labeled 𝑑./0 , represents the sum of the distances, 𝑑! , between the center of mass of the G6P phosphate group and the center of mass of the guanidinium group of the ith arginine residue, for three arginine residues numbered 2, 4, and 6 in Figure 2.4a. Initially, the phosphate group was closely associated with these residues, where a close association is defined as 𝑑! ≤ 0.5 nm, and 𝑑./0 = 𝑑. + 𝑑/ + 𝑑0 = 𝑑./0 ≈ 1.5 nm, which was considered to be a minimum value. The second collective variable, 𝑛1 , represents the number of arginine residues with which the phosphate group is closely associated at each time step. The following function defines 𝑛1 continuously: B 1 𝑑& ≤ 0.5 𝑛𝑚 𝑛A = < ^½ (𝑐𝑜𝑠[10𝜋(𝑑& − 0.5)] + 1) 0.5 𝑛𝑚 < 𝑑& < 0.6 𝑛𝑚 2-1 &?" 0 𝑑& ≥ 0.6 𝑛𝑚 The third and final CV is 𝑛2 , representing the ligand’s solvation number (i.e., the number of water oxygens near the ligand). The value 𝑛2 can be defined continuously as:92 ,+ 1 − (𝑑& /𝑑) )4 𝑛C = < 2-2 1 − (𝑑& /𝑑) )"# &?), where 𝑛3 is the total number of water molecules, 𝑑! is the center of mass distance between the five atoms on the phosphate group of G6P and the oxygen atom of the ith water molecule, and 𝑑$ 35 was 0.3 nm. The Gaussian width 𝜎! was set to 0.1, 0.1, and 1.0 for CVs 𝑑./0 , 𝑛1 , and 𝑛2 , respectively. A committor analysis was implemented in all simulations. In the committor analysis of infrequent metadynamics, we defined 19 possible triple association basins, besides 𝑑./0 , based on the assumption that the nearby index difference of the three indexes of 𝑑!45 should not be greater than 2. For instance, 𝑑+.6 , 𝑑+./ , 𝑑+67 and 𝑑+6/ are possible, but not 𝑑+.7 or 𝑑+/7 . Therefore, these basins below are all assumed to be possible triple association basins: 𝑑+.6 , 𝑑+./ , 𝑑+6/ , 𝑑+67 , 𝑑.6/ , 𝑑.67 , 𝑑./7 , 𝑑./0 , 𝑑6/7 , 𝑑6/0 , 𝑑670 , 𝑑678 , 𝑑/70 , 𝑑/78 , 𝑑/08 , 𝑑/09 , 𝑑708 , 𝑑709 , 𝑑789 , and 𝑑089 . Specifically, 32 simulations were initiated with 𝑑./0 ≈ 1.5nm and terminated once any of the 19 predefined triple association energy basins has been reached within a wall time of 48 hours (simulation time ~24 ns). A committer basin is defined based on the sum of three associated distances, 𝑑!45 where 𝑖, 𝑗, and 𝑘 indicate an arginine residue as labeled in Figure 2.4a. When the sum 𝑑!45 is less than 1.5 nm for a specific basin, while 𝑑!45 for all other basins is greater than 2.0 nm, that basin is considered to be occupied. Once any of the predefined basins is occupied, the simulation was terminated, and the simulated time was converted to unbiased transition times using Equation 2-3. Assuming that the transitions are stochastically independent, they may be treated as Poisson processes, and the reaction times should follow an exponential distribution.127 Thus, the empirical cumulative distribution function (ECDF) was computed from the unbiased transition times, 𝑡, and compared with the theoretical cumulative distribution function (TCDF) of a homogeneous Poisson process:92 −𝑡 𝑇𝐶𝐷𝐹 = 1 − 𝑒𝑥𝑝 ( ) 2-3 𝜏 36 where 𝜏 is the characteristic time, and the reciprocal of 𝜏 is the hopping rate. In this work, the transition times below 1 𝜇s were fitted into the cumulative function. To validate the precision of the recovered transition times, the two-sample Kolmogorov- Smirnov (KS) test was conducted, following the procedure of Salvalaglio et al,127 to determine whether they were exponentially distributed. When the p-value of the KS test exceeds the threshold of 0.05, the dataset is deemed to conform to an exponential distribution. This is an important step in InMetaD analysis as it ensures that recovered times are uncorrelated and reliable. 2.2.3 Umbrella Sampling In the US simulations, the 𝑑./0 triple association configuration represented in Figure 2.4b was used as the initial association state, and a steered MD simulation was conducted by pulling Figure 2.1 Example biased distribution of x1 of all the US windows at IS = 0 mM, temperature 300 K. (b) Example 2D energy plot of G6P desorption energy from the triple association on the poly-arginine peptide. Data above the yellow cutoff line was used for calculation of the PMF. (c) The projection of desorption energy on x1 at IS = 0 mM, temperature 300 K. 37 G6P perpendicular to the peptide axis while restraining the entire poly-arginine backbone. The pull rate was 2 nm ns-1, and the simulation was conducted for 0.8 ns, yielding a maximum pull distance of 1.6 nm, defined as the distance between the center of mass of the alpha carbons of arginine residues 2, 4, and 6, and that of the phosphate group on G6P. From this data, over 20 US windows were chosen, with a higher window density below 1 nm, where the energy gradient was greatest. All US simulations were conducted at 300 K with a spring constant, 𝑘 = 2000 kJ mol-1 nm-2 applied over two coordinates. The first US coordinate was the pull distance 𝑥+ as defined above. The second, 𝑥. , represents the pull vector, defined as the magnitude of (𝐶1. + 𝐶1/ + 𝐶10 )/3 − 𝑃 where 𝐶1! is the position of the alpha carbon of the associated arginine residue 𝑖, and 𝑃 is the position of the phosphorus atom of G6P. Figure 2.2 Schematic of the KMC model. Figure 2.3 Product concentration versus time at IS = 120 mM in the KMC model. 38 MD sampling of each US window was conducted for 20 ns. Grossfield-WHAM128 code was used to calculate the 2D potential of mean force. The average of the area defined by |𝑥+ − 𝑥. | = 0.05 nm was used as the projection of the PMF on the pull distance, 𝑥+ (Figure 2.1). 2.2.4 Kinetic Monte Carlo The ratio of hopping rate to desorption rate for varying ionic strengths was used to parameterize the kinetic Monte Carlo model described by Liu et al.78 Six hopping sites on the peptide bridge and two active sites on HK and G6PDH were considered in the KMC model. The rate ratios for hopping from enzyme to bridge and from the bridge to enzyme were assumed to be the same as those on the bridge. At each site, all possible events were considered (hopping, desorption, adsorption, and reaction) and rate constants were assigned accordingly. Reaction events were only considered at the enzyme active sites, labeled as HK and G6PDH in Figure 2.2. + . The timescale of reactions at these sites was given by the rate constants 𝑘:)( and 𝑘:)( , respectively. The simulation was conducted with 5 parallel runs of 1 µs duration consisting of 50 parallel cascades. The lag time, 𝜏, was computed by extrapolating the steady-state portion of the product accumulation to intercept the time axis (Figure 2.3). All parameters used in the KMC simulations are described in Table 2.1 and Table 2.2. 39 Table 2.1 Energy barriers and rate constants. IS 𝐺D+E Δ𝐺-1% ∆𝐺 𝑘D+E 𝑘.-% × 104 mM kJ/mol kJ/mol 𝑘-1% 𝑘-1% kJ/mol 0 25.1 47.4 22.3 7515 179 20 25.1 45.6 20.5 3652 87 40 25.1 42.6 17.5 1096 26 70 25.1 41.2 16.1 625 15 120 25.1 36.7 11.6 103 2.5 IS: ionic strength 𝐺;,< , 𝑘;,< : energy barrier and rate constant of hopping on the bridge from one triple association to an adjacent triple association Δ𝐺='> , 𝑘='> : energy barrier and rate constant of desorption from triple association ∆𝐺: the energy difference between hopping and desorption ∆𝐺 = 𝐺='> − 𝐺;,< 40 Table 2.2 KMC parameters. Region Parameter Value System 𝑐FG@H4$6F , cascade concentration, mol L-1 8×10-9 Vol, reaction volume, L 2.1×10-14 𝑐%I* , substrate concentration for HK, mol L-1 2 FG HK 𝑘8.2 , TOF of HK, molec s-1 0.7 FG 𝑘-1% , desorption rate on HK, s-1 0.07 FG 7.7×104 𝑘.-% , absorption rate on HK, s-1 "* 𝑘D+E 𝑘D+E , hopping rate from HK to bridge, s-1 FG 10-5 𝐾J," , Michaelis constant of HK, mM 𝑘D+E , hopping rate on bridge, s-1 FG Bridge 𝑘8.2 × 100 = 70 𝑘-1% , desorption rate on bridge, s-1 𝑘D+E 𝑘D+E ⁄𝑘-1% 𝑘.-% , adsorption rate on bridge, s-1 mol-1 L 𝑘-1% × (𝑘D+E ⁄𝑘-1%) G6PDH *# 𝑘D+E 𝑘D+E , hopping rate from bridge to G6PDH, s-1 K4LMN 6.2 𝑘8.2 , TOF of G6PDH, molec s-1 K4LMN 0.62 𝑘-1% , desorption rate on G6PDH, s-1 K4LMN 1.3×106 𝑘.-% , absorption rate on G6PDH, s-1 𝐾J,# , Michaelis constant of G6PDH, mM 5.4×10-6 * The value of 𝑘;,< ⁄𝑘='> is from Table 2.1. 41 2.3 Results and Discussion 2.3.1 Understanding of the Transfer Mechanism by MD Simulations Previously, a poly-lysine bridge was found to electrostatically channel glucose 6- phosphate (G6P) between the consecutive enzymes hexokinase (HK) and glucose-6-phosphate dehydrogenase (G6PDH).78 Here, we studied the molecular dynamics of G6P’s motion in electrolytes of varying ionic strengths in the presence of either a poly-lysine bridge or a poly- arginine bridge. On the poly-lysine bridge, the association distribution in Figure 2.5 indicates that G6P mainly associates with one or two lysine residues, a mode that we describe as a single association and a double association, respectively. A hop is made when transferring from a double association to the next double association with a single association as the intermediate state.78 The hopping rate was estimated to be 0.6 ± 0.04 ns-1 at 310 K.78 (a) (b) Figure 2.4 (a) The scheme of the peptide bridge. (b) A triple association configuration between G6P and the arginine peptide. G6P is associated with the second, fourth and sixth arginines. 42 In comparable simulations for G6P channeling on poly-arginine peptides, three dominant interaction energy were observed from the electrostatic interaction energy plot as shown in Figure 2.7. After visualizing the trajectories, we observed that these three energy states correspond to the three association states of single, double, and triple. In certain time periods, rapid fluctuations in association state are observed, for example near 40 ns and 80 ns in Figure 2.7. Triple association with electrostatic interaction energy of 750 kJ/mol on poly-arginine peptides is much stronger than the double association of 400 kJ/mol on poly-lysine peptides.20 In addition, association with three arginine residues, or triple association, was primarily observed, as shown structurally in Figure 2.4b for the poly-arginine bridge. This is possible because of the longer side chain of arginine, making it more amenable to triple associations. Figure 2.5 Association state probability of G6P on the poly-lysine bridge. 43 Figure 2.7 Electrostatic energy between G6P and the poly-arginine bridge. Due to this strong electrostatic interaction, escaping from the energy well of triple associations is rarely observed within the time scale of MD. Figure 2.6a shows residues with which G6P associates (𝑑! < 0.5 nm) and the number of associated residues over time at 293 K. After an initial period of the double association with residues 6 and 7, and then with residues 3 and 5, G6P was trapped in the triple association with residues 1, 3, and 5 starting at ~26 ns. As temperature increases (Figure 2.6b), the occurrence of a single association remains relatively rare, while that of a double association decreases, and that of a triple association increases. In our observations, triple association dominates over the entire simulated temperature range from 293 K to 318 K. Thus, we define a full hop on the poly-arginine bridge as a transition between Figure 2.6 (a) Associated arginine residue indices (top) and association number n (bottom) evolution over 100ns. (b) Association number probability dependence on temperatures. 44 different triple association states. To increase the frequency of transitions, we seek advanced sampling techniques. 2.3.2 Hopping Energy Barrier Study by InMetaD Among the enhanced sampling methods of molecular dynamics, metadynamics is efficient in forcing MD systems out of local energy minima, enabling the sampling of otherwise inaccessible regions by applying a history-dependent bias.88 The key step of applying metadynamics is the selection of collective variables (CVs). Here we biased three CVs with the following motivations: 1) biasing 𝑑./0 to force the system out of its initial triple association state; 2) biasing 𝑛1 to destabilize other triple-association energy basins and enable the system to move to any other association states such as double or single; 3) biasing 𝑛2 to accelerate water molecule dynamics around G6P without affecting interaction with arginine residues. However, metadynamics is mainly used to quantify thermodynamic properties and is not generally applied to recover kinetic information. To calculate the rate of hopping between triple association states, we adopted infrequent metadynamics (InMetaD).91 The essential difference between metadynamics and InMetaD is that in InMetaD, the rate of bias deposition is much slower than in metadynamics, and is intended to move the system out low-energy basins without disturbing the kinetics in high-energy transition regions.91,127 The theoretical and empirical cumulative distributions of the transition times at IS = 0 mM, 310 K are shown in Figure 2.8a, where the characteristic time, τ, was 126 ± 3 ns. The p- value was 0.95 as calculated by the Kolmogorov-Smirnov (KS) test.129 This value is much greater than the threshold of 0.05, indicating that the transition time conformed to an exponential distribution. This approach was subsequently applied to calculate values of τ for temperatures from 280 K to 320 K, for which all the p-values were above 0.05. 45 The hopping rate, 𝑘;,< , can be defined simply as 𝑘;,< = τ?+ . At 310 K, 𝑘;,< is 7 ± 0.6 𝜇s-1. Compared to the poly-lysine bridge, the hopping rate on the poly-arginine bridge is 85-fold slower at 310 K. Similarly, the hopping rates and their errors were estimated from the characteristic times and their errors across all the studied temperatures and ionic strengths. Overall, 𝑘;,< displays Arrhenius behavior, and the hopping activation energy was computed for a range of ionic strengths from 0 mM to 120 mM (Figure 2.8b). No trend was observed for the dependence of either hopping rate or activation energy with respect to ionic strength. This observation that ionic strength had little impact on hopping kinetics of G6P on poly-arginine peptides is consistent with previous calculations using the poly-lysine peptide.78 In both cases, transport is influenced locally by short-range interactions within the Stern layer, where counterions have relatively little influence. Hopping activation energy of 25 ± 3.6 kJ/mol was obtained by simultaneous fitting of rate data at all ionic strengths and is used in the KMC model. Figure 2.8 Hopping activation energy by InMetaD. (a) Cumulative probability distribution of transition time from independent InMetaD simulations at IS = 0 mM, 310 K. Theoretical cumulative probability distribution (TCDF) is in blue, and the stepwise empirical cumulative probability distribution (ECDF) in black was fitted from the unbiased transition times, t. (b) Arrhenius plot of the hopping rate at varying ionic strength. The black line represents an Arrhenius fit for all ionic strengths. The shaded area in blue is the 95% confidence interval. 46 Notably, this hopping activation energy value of poly-arginine is about double that observed for poly-lysine (13 ± 0.5 kJ/mol).78 2.3.3 Desorption Energy Mobile G6P traversing the electrostatic surface of a charged peptide may desorb into the bulk, which represents the major loss mechanism in electrostatic channeling of cascade reactions. To quantify the desorption energy and the ratio of hopping rate to desorption rate, umbrella sampling (US) simulations were conducted. This desorption energy is a free energy difference that considers changes in entropy. It is therefore not directly comparable to the electrostatic interaction energy, which is a potential energy difference that does not include entropic effects. We computed the desorption energy from the triple association state by considering the center of mass distance between G6P and associated arginine residues as a CV. The desorption energy dependence on ionic strength was studied, as shown in the potential of mean force (PMF) plot in Figure 2.9a. Here, the PMF is referenced to the triply associated state, and we observed the desorption energy at zero IS to be 47 ± 0.2 kJ/mol. This value is double the desorption energy from the poly-lysine peptide,78 consistent with the argument that the triple association of G6P with poly-arginine peptides was much stronger than the double association with poly-lysine peptides. It is worth noting the feature that appears in the PMF traces of Figure 2.9a near 0.72 nm, which varies with ionic strength. By inspection of trajectories, we observed that 0.72 nm is a position beyond which the system transitions from triple to double and single association. Generally speaking, this transition feature becomes less smooth and shifts to lower distances at higher ionic strength, reflecting the effect of counterions to weaken electrostatic interactions with the arginine bridge. As shown in Figure 2.9b, the desorption energy declines significantly with 47 increasing ionic strength. This is explained by the counterion shielding of long-range electrostatic forces between G6P and the arginine bridge. Based on the above InMetaD and US results, we can calculate the rate ratio of hopping to desorption by: 𝑘D+E 𝐴 ∙ 𝑒𝑥𝑝 (−𝐺D+E /𝑅𝑇) 𝐺D+E − 𝐺-1% = = 𝑒𝑥𝑝 r− s 2-4 𝑘-1% 𝐴 ∙ 𝑒𝑥𝑝 (−𝐺-1% /𝑅𝑇) 𝑅𝑇 where 𝐺;,< comes from InMetaD and 𝐺='> from US. The frequency factor, 𝐴, is assumed to be the same for both hopping and desorption in the same system. Due to the large difference between hopping activation energy and desorption energy, the poly-arginine peptide bridge displays a forty-fold larger hopping:desorption rate ratio, as compared to that of the poly-lysine bridge, at zero ionic strength (Figure 2.9b). This ratio decreases to only six-fold at IS = 120 mM due to electrostatic shielding. Figure 2.9 Desorption Energy by Umbrella Sampling (US). (a) Potential of mean force (PMF) profile for varying ionic strength. (b) Dependence of desorption energy (open symbols) and hopping:desorption rate ratio (closed symbols) on ionic strength for arginine peptide (blue triangles) and lysine peptide (orange squares). 48 2.3.4 Transfer Efficiency We conducted Kinetic Monte Carlo (KMC) simulations to study the transfer efficiency and quantify the reaction lag time of the two-step cascade. The KMC model was parameterized by using the rate ratio of hopping and desorption (Figure 2.9b) and experimental turnover frequencies of HK and G6PDH.20 In the KMC model, we set the hopping rate to be 100 times the turnover frequency. This is because the enzyme turnover frequencies (0.7 s-1 for HK, 6.2 s-1 for G6PDH) are much lower than the hopping rate, and we are only interested in the magnitude difference between hopping and reaction. We also neglected the desorption of intermediate from HK and assumed that the hopping to desorption ratio from the bridge to G6PDH is the same as that on the bridge. These assumptions simplify the model to focus on a comparison between poly-arginine bridges and poly-lysine bridges. Figure 2.10 Lag time dependence on ionic strength. The results of free enzyme and lysine bridge are taken from ref. 78. 49 Using these assumptions, the KMC model can predict and compare lag time results for both the HK-G6PDH couple bridged with poly-lysine or poly-arginine and freely standing HK- G6PDH without any bridges. Stop-flow lag time analysis was employed to evaluate the channeling efficiency from all the KMC data. As shown in Figure 2.10, the lag time for the poly- arginine bridge was 1.9 s at zero IS and increases slightly to 6.5 s at 120 mM. In comparison, the lag time for the poly-lysine bridge increases three-fold over the same range of ionic strength. Comparison to calculated results for the free enzyme,78 poly-lysine bridge,78 and poly-arginine bridge provides a clear indication that electrostatic channeling can significantly improve transfer efficiency as compared to a free enzyme system, and that poly-arginine represents a better bridge candidate compared to poly-lysine for this molecular motif. As the reaction system achieves steady state, desorption from the bridge approaches equilibrium, creating the possibility of multiple intermediates adsorbed there. To understand this, we used KMC model results to obtain the probability distribution for the number of the adsorbed intermediates on the poly-lysine and the poly-arginine bridges, each with six hopping sites Figure 2.11 Probability distribution of the number of adsorbed intermediates on the poly-lysine bridge and poly-arginine bridge. 50 (Figure 2.11). Our results show that the bridge was empty for about 80% of the time, the bridge adsorbed one intermediate for about 20% of the time, and the occurrence of more than two adsorbed intermediates was rare for both poly-lysine and poly-arginine bridges. This can be explained by the fact that the turnover frequency of G6PDH is much faster than that of HK and the transit time of the intermediate on either the poly-lysine bridge or the poly-arginine bridge is even faster, so there is relatively little accumulation of G6P intermediates on the bridge. Owing to its higher adsorption energy, the poly-arginine bridge does demonstrate a consistently higher probability for multiple adsorbed intermediates as compared to poly-lysine. This work only accounts for the movement of G6P on the bridge and did not consider the desorption from HK active site to bridge and bridge to G6PDH active site. These phenomena lead to additional loss of intermediate transport efficiency and therefore higher lag times.87 Additionally, the model assumes that the hopping rate between the bridge and enzyme is the same as that on the bridge, which is only a rough approximation. A more realistic prediction of transfer efficiency and lag time would be obtained by studying the transport of intermediate over the enzyme surfaces in the presence of the poly-arginine bridge, as has been demonstrated for the poly-lysine bridge. We are also pursuing experimental validation of these results by synthesis of the HK-polyArg-G6PDH system followed by stop-flow experiments to determine lag time. 2.4 Conclusion The transport of G6P in the presence of poly-arginine peptides was investigated using the combination of infrequent metadynamics (InMetaD) and umbrella sampling (US) in MD simulations. Although arginine and lysine both carry one positive charge, the interaction of G6P with arginine is much stronger than that with lysine. The strong local interaction reduces hopping rates 85-fold at 310 K as determined by InMetaD. The desorption of G6P from the triple 51 association on poly-arginine peptides displays large fluctuation due to the transition from triple association to double and single association. The hopping activation energy is less sensitive to ionic strength change compared to the desorption energy, leading to large hopping:desorption rate ratio that persists at high ionic strength. Kinetic Monte Carlo studies based on this rate ratio predict significantly lower lag times for systems built with the poly-arginine bridge as compared to the poly-lysine bridge, and lag time is more resistant to changes in ionic strength. These results will be ameliorated somewhat by including transport losses in the vicinity of the enzymes, an effect that can be studied computationally and validated experimentally. This demonstrates the use of InMetaD to study transport phenomena at longer time scales, which may be applied to channeling phenomena in other multi-enzyme fusions including TS-DHFR and MDH-CS. This work also aids the design of efficient artificial cascade reactions using electrostatic bridges. 52 Chapter 3. Markov State Study of Electrostatic Channeling within the Tricarboxylic Acid Cycle Supercomplex† 3.1 Introduction Sequential enzymes often form supramolecular complexes in nature, which are also referred to as metabolons.130 The spatial organization of sequential enzymes enables the controlled transport of substrates between consecutive active sites in cascade reactions and prevents the desorption of substrates into bulk solvents.131 This phenomenon is referred to as substrate channeling. Electrostatic channeling is one such channeling mechanism, in which charged intermediates interact with oppositely charged surfaces.4 By sequestering intermediates on a surface, electrostatic channeling avoids the equilibration and side reactions of intermediates with their surroundings, and can thus greatly increase the probability of transport to the target active site. Therefore, even when the concentration of intermediates in the bulk solvent is low, high overall reaction efficiency can be achieved via the high utilization of reaction intermediates.4 Electrostatic channeling is found in many metabolic pathways in nature, including such enzyme supercomplexes as the dihydrofolate reductase-thymidylate synthase complex, which has a hydrofolate intermediate,132 malate dehydrogenase-citrate synthase (MDH-CS) supercomplex, which is part of the tricarboxylic acid (TCA) cycle and involves an oxaloacetate (OAA) intermediate,133 and the sulfate-activating complexes (SACs) with adenosine 5’- phosphosulfate (APS2-) as the intermediate.134 † This work is published as Y. Xie, S. D. Minteer, S. Banta and S. C. Barton, "Markov State Study of Electrostatic Channeling within the Tricarboxylic Acid Cycle Supercomplex", ACS Nanosci. Au, 2, 414– 421 (2022). doi:10.1021/acsnanoscienceau.2c00011. 53 In this work, we explore the electrostatic channeling mechanism in a subset of the TCA cycle. The TCA cycle is a closed cascade consisting of a set of sequential reactions across eight enzymes. In one of the steps, MDH reduces malate to an OAA intermediate while converting NAD+ to NADH (Scheme 3.1). Carrying two negative charges, OAA is channeled from MDH to CS via electrostatic interaction with a positively-charged enzyme surface spanning MDH and CS. CS combines OAA and acetyl-CoA to produce a further intermediate, citrate. The MDH- catalyzed reaction equilibrium favors the formation of malate over OAA, which is the reverse of the TCA cycle reaction.135 Substrate channeling is therefore critical to efficient removal of OAA from the MDH active site so that equilibrium is not achieved. This property of MDH makes the MDH-CS complex an ideal platform for the study of electrostatic channeling. NAD+ NADH Acetyl-CoA CoA malate OAA citrate MDH CS Scheme 3.1 Dual site conversion of malate to citrate via malate dehydrogenase (MDH) and citrate synthase (CS). Oxaloacetate (OAA) is the intermediate. Because enzymes are weakly linked in the form of metabolons and easily dissociate during isolation and purification, obtaining structures of the supramolecular assemblies is challenging. Recently, Minteer’s group reported the structural characterization of a native MDH- CS supercomplex for the first time, using chemical cross-linking and mass spectrometry.111 By using a hybrid docking method, they studied protein-protein interactions and identified three cross-links between MDH and CS. The active site distances between MDH and CS were shortened by the formation of an enzyme complex to about 3.5 nm and a continuously positive surface was formed between MDH and CS by alignment of the stationary charges. Free CS is slightly negatively charged, and MDH is positively charged. After docking MDH to CS, a continuously positively charged channel forms between the two enzymes, linking the active sites 54 and, in turn, assisting the transport of intermediates via electrostatic channeling.111 Altogether, this work provides insight into the structural organization of native MDH-CS. Banta’s lab recently synthesized in vitro three recombinant versions of MDH-CS supercomplex with three different enzyme sources (e.g., native bovine, wild-type porcine, and recombinant porcine) and further examined substrate channeling.40 Based on crystal structures, the shortest active site distances between MDH and CS for the three different sources were reported as 3.5 nm, 7.3 nm, and 4.0 nm, respectively.40 As the crystal structures are mobile and flexible in solution, these values are expected to vary. A key interfacial residue, ARG65 on CS, was identified to assist with the electrostatic channeling of OAA. A mutation of the recombinant porcine complex, replacing ARG65 with ALA, was studied. The lag time, which is the time required for a cascade system to transition from zero reaction rate to steady-state reaction rate, can be used as an indirect measure of the efficiency of electrostatic channeling. A shorter lag time reflects more efficient electrostatic channeling.4 Transient analysis of Acetyl-COA production according to Scheme 3.1 showed that the reaction catalyzed by native bovine MDH-CS displayed a short lag time of 40 ± 5 ms, correlating to the shortest active site distance among the three sources. Upon mutating one arginine residue to alanine, the lag time of the recombinant porcine complex increased nearly thirty-fold from 30 ± 10 ms to 880 ± 60 ms, indicating a decreased efficiency of OAA transfer. Arginine (MW = 174) has a positive charge and alanine (MW = 79) is uncharged at neutral pH. The ARG →ALA mutation may have disrupted the positively- charged channeling surface between MDH and CS, although it is possible for the reduced positive charge and change in the size of the residue to have other structural effects.40 Altogether, these results provide strong evidence for the occurrence of electrostatic channeling. 55 Later, McCammon’s lab used Brownian simulations to computationally study the transfer mechanism of OAA in the MDH-CS supercomplex.19 The transfer efficiency of OAA from MDH to CS was investigated for the three sources under varying ionic strength. By comparing the transfer efficiency of neutral OAA and negatively charged OAA, they concluded that charged OAA has a high transfer efficiency, whereas neutral OAA has a transfer efficiency approaching zero. By comparing these sources’ transfer efficiencies, both native bovine and recombinant porcine were found to have a transfer efficiency of about 90%. However, the transfer efficiency for wild-type porcine is only about 50% because of its comparably longer active site distance.19 Their work modeled OAA as a charged sphere, which did not consider molecular interactions between OAA and the enzyme electrostatic surface, and was thus unable to detect the influence of key residues. Molecular dynamics (MD) can be used to simulate the molecular mechanism of OAA electrostatic channeling, but is limited by the timescales it can access. Specifically, MD simulations of large systems such as enzyme supercomplexes can only span hundreds of nanoseconds, making them poorly suited to reliably study slow biological processes. Extracting long-term kinetics for the transport of OAA from MD simulations requires the full sampling of OAA transport trajectories between active sites and repeated sampling of slower transport processes. Unfortunately, MD simulations cannot achieve this goal with conventional computational resources. Markov State Models (MSM) make use of short-term dynamics to build a matrix of probabilities of transition between states and to predict long-term kinetics.79 We have previously employed MD combined with MSM to investigate the mechanism of electrostatic channeling on an artificial enzyme complex comprised of two enzymes covalently conjugated by a cationic 56 oligopeptide bridge.20,78,136 Combined with a kinetic Monte Carlo model, this approach enabled the estimation of lag time over the two-enzyme cascade. So it can be used to simulate the molecular mechanism of OAA’s electrostatic channeling. MSM enables many property calculations, including mean first passage time, identification of dominant pathways combined with the transition path theory, and free energy profiles.81 In particular, the hub score is a metric that quantifies the degree to which a set of states mediates transport between two other sets of states; the hub score can therefore be used to identify intermediate states that are important to a given transport process.82 Recently, MSMs have been proven to be powerful in studying various systems, such as binding,84 dissociation,85 allosteric,86 and folding processes.83 In this work, we combine MD simulations with MSM and hub score analysis to study the transport mechanism of OAA from MDH to CS for the recombinant porcine complex as well as the complex with ARG65 →ALA65 mutation. By analyzing the MD trajectories with MSM, multiple energy minima were identified, dominant transition pathways were observed, and free energy profiles were constructed. The visualization shows that the most dominant pathway crosses the experimentally mutated residue ARG65, as well as a small set of additional key residues. The transfer efficiency was also estimated for both structures. 3.2 Methods 3.2.1 Structural Model and Molecular Dynamics Simulation The structural model of recombinant porcine MDH-CS supercomplex used for simulation was consistent with the complex synthesized in vitro with recombinantly produced porcine enzymes, and structurally characterized using cross-linking/mass spectrometry analysis by Bulutoglu et al.40 Site mutation was conducted using PyMOL.137 The structure of oxaloacetate 57 (OAA) was obtained from the Zinc database and parameterized using the CHARMM General Force Field (CGenFF).138,139 GROMACS 2020.1 software was used to conduct MD simulations.123 Metadynamics simulations were conducted using GROMACS with the PLUMED 2.2 plugin. To probe the structural characteristics of the recombinant MDH-CS enzyme complex, we used the APBS plugin of PyMOL to generate its electrostatic potential surface.140 The MDH and CS enzymes are both dimers composed of two identical subunits, each with two active sites, thus providing four reaction paths (Figure 3.1a). To examine all the transport paths, we ran four independent MD simulations with OAA inserted in each active site and conducted energy minimization, NVT, and NPT equilibration at 300 K with restraint on the OAA and MDH-CS complex. Afterward, for each active site, we restrained the enzyme complex and conducted 10-ns MD simulations using metadynamics, which forced OAA to sample multiple states surrounding the active site of interest. States visited in these simulations were used as starting points for longer-term MD simulations.141 Metadynamics enables the introduction of bias energy that depends on a small group of “collective variables”. The sole collective variable used for these metadynamics studies was the distance between OAA and the active site of interest. Therefore, out of the 5000 frames generated by each simulation, we selected 50 frames near each active site. MD simulations were then conducted, starting from each of the resulting 200 snapshots, and with initial velocity randomization with a Maxwell-Boltzmann distribution at 300 K.141 To avoid enzyme configurational changes, we restrained the backbone of the enzyme complex and focused on the movement of OAA. Each of the 200 trajectories spanned 10 ns and were saved at intervals of 5 ps, resulting in a total of 4×105 frames. 58 3.2.2 Markov State Model Building and Validation The Python package MDAnalysis was used to extract the features for Markov State Model from MD trajectories.142,143 The Python package PyEmma was used to analyze the resulting datasets.144 Prior to analysis, all water molecules were removed from the trajectories. The enzyme complex and OAA were then aligned to the initial structures via rotation and translation. These aligned frames were used to build the Markov State Model (MSM). Each frame was summarized by five features involving the OAA intermediate: the minimum distances between OAA and the four active sites (𝑑2!" , 𝑑2!# , 𝑑2$" , and 𝑑2$# ) and the minimum distance between OAA and enzyme surface, 𝑑>@AB (Figure 3.1c). The 4×105 frames were then partitioned to a set of 500 clusters (i.e., microstates, which we use interchangeably) using the k-means algorithm. In other words, the state space of OAA transport was discretized into 500 clusters (Figure 3.1d). The Markov state transition matrix, 𝑇!4 (𝜏), enables the estimation of long-term kinetics: 𝑇&3 (𝑛𝜏) = 𝑇&3, (𝜏) 3-1 where 𝑇!4 (𝑛𝜏) is the transition probability from state 𝑖 to state 𝑗 over a time period 𝑛𝜏.83 The lag time, 𝜏, is a minimum discrete time interval for transitions between states. Here, to construct a reliable Markov model with a transition matrix 𝑇(𝜏), it is necessary to choose an optimal lag time, 𝜏, that satisfies the Markovian property of the dynamics: the current state of a stochastic process only depends on the previous step.79 Hence, we computed the 10 largest implied timescales, which are the timescale of the dynamical process identified by the decomposition of the transition matrix at different lag times 𝜏. For each lag time 𝜏, a transition matrix 𝑇(𝜏) was constructed. The timescales, 𝑡! , implied by the eigenvalues 𝜆! of the transition matrix 𝑇(𝜏) are defined as: 59 𝜏 𝑡& = − 3-2 log (𝜆& ) The selection of lag time is based on the theory that for each lag time, 𝜏, the transition matrix and the implied timescales become approximately constant.79 A lag time should be large enough to incur minimal approximation error and small enough to catch all the processes of interest that have longer timescales than the lag time.145 At 𝜏 ≈ 10 ps, these timescales, 𝑡! , become approximately independent of 𝜏, which implies that the dynamics are Markovian at 10 ps or more. Therefore, τ = 10 ps was chosen as the optimal lag time. The transition probability 𝑇!4 given by the transition matrix was used to predict long-term kinetics of OAA transport process using: 𝑠&3 (𝑡 + 𝑛𝜏) = 𝑠&3 (𝑡) ∙ 𝑇&3 (𝑛𝜏) = 𝑠&3 (𝑡) ∙ 𝑇&3, (𝜏) 3-3 where 𝑠!4 (𝑡 + 𝑛𝜏) represents the transition probability from microstate 𝑖 at time 𝑡 to microstate 𝑗 after 𝑛 steps. From the transition matrix, the stationary population, 𝜋, of all microstates is quantified by:79 𝜋 O 𝑇(𝜏) = 𝜋 O 3-4 where 𝜋 is the left eigenvector of the transition matrix with an eigenvalue of one. The stationary population determines the free energy of all microstates by: 1 𝐺 = − ln (𝜋) 3-5 𝛽 where 𝛽 represents the thermodynamic beta, 1⁄𝑘% 𝑇. 3.2.3 Transition Path Theory To identify dominant transport pathways of OAA from source active sites to target active sites, we combined the MSM with transition path theory (TPT).146 TPT divides the state space into three subsets: the source macrostate, 𝐴, the target macrostate, 𝐵, and the remaining 60 intermediate macrostate, 𝐼. We adopted committor analyses in TPT to decompose the MSM network.146 According to the committor analysis, the forward committor probability, 𝑞!C , represents the probability of an intermediate microstate, 𝑖, reaching the target state, B, before reaching the source state, A. Based on this definition, 𝑞!C = 0 for 𝑖 in 𝐴; 𝑞!C = 1 for 𝑖 in 𝐵, and the following system of equations may be solved to determine 𝑞!C for all states in 𝐼:83 −𝑞&P + < 𝑇&Q 𝑞QP = − < 𝑇&Q 3-6 Q∈! Q∈S The effective flux, 𝑓!4 , from state 𝑖 to state 𝑗, can then be defined as: 𝑓&3 = 𝜋& 𝑞&@ 𝑇&3 𝑞3P 3-7 where 𝑞!? = 1 − 𝑞!C represents the backward committor probability of state 𝑖, the probability of reaching A before B. The total flux from 𝐴 to 𝐵 may then be calculated as: 𝐹 = < < 𝜋& 𝑇&3 𝑞3P 3-8 &UA 3∉A The relative probabilities, 𝑃5 , of the total flux over each path were calculated by grouping paths based on their source active site. For the paths starting from the active site SM1, the relative probabilities, 𝑃+ and 𝑃. , are estimated as: 𝐹" 𝐹# 𝑃" = ; 𝑃# = 3-9 𝐹" + 𝐹# 𝐹" + 𝐹# such that 𝑃+ + 𝑃. = 1. Similarly, the relative probabilities, 𝑃6 and 𝑃/ , for paths starting from the active site SM2, are estimated as: 𝐹V 𝐹W 𝑃V = ; 𝑃W = 3-10 𝐹V + 𝐹W 𝐹V + 𝐹W By following the protocol described by Noe et al., we decomposed this Markov network into multiple individual transition pathways.83 The relative probabilities, 𝑝5 , of dominant 61 individual transition pathways along the four reaction paths, having individual flux 𝑓5 , can be calculated from the total flux according to: 𝑓Q 𝑝Q = W 3-11 ∑Q?" 𝐹Q The mean first passage time (MFPT) 𝑚!4 , which is the expected arrival time at state j after starting in state i, can be estimated from the transition matrix 𝑇!4 : Figure 3.1 (a) Enzyme complex of MDH-CS, with active sites indicated that correspond to enzyme subunits M1 and M2 (malate dehydrogenase) and C1 and C2 (citrate synthase) Subunits are indicated by varying colors. (b) Electrostatic potential surface of the enzyme complex. (c) The five features in MSM. (d) Distribution of clusters based on features 𝑑2!" and 𝑑2$" . 62 𝑚&3 = 1 + < 𝑇&Q 𝑚Q3 3-12 QX3 The uncertainty associated with each calculated MFPT was calculated by bootstrapping147 in this work. Specifically, 21 different Markov State networks with transition matrices were obtained by randomly sampling 100 subsets out of all simulation trajectories. The mean MFPT of these 21 data points was calculated 5000 times, and the standard deviation of the 5000 data points was taken to be the uncertainty in the MFPT. Using TPT, we can also estimate the probability of paths from A to B passing through I, which is termed a hub score.82 The total transition probability from A to B is 𝜋& 𝑇AS = < < 𝑇 3-13 ΠA &3 &∈A 3∈S where 𝜋! is the stationary distribution of state i, and Π1 = ∑!∈1 𝜋! is the total population of macrostate A. The transition probability from A to B is then (∑&∉A,S 𝑞&P 𝑇A& ) + 𝑇AS 𝑃A→S = 3-14 1 − 𝑇AA where 𝑇1! = ∑!∈1 𝜋! 𝑇!4 /Π1 . The expression 𝑞!C can be described as 𝑞!1%EC + 𝑞!1%E? , where 𝑞!1%EC is the probability of starting in i and reaching B before A while passing through I, and 𝑞!1%E? is the probability starting in i and reaching B before A without passing through I. Thus, we can express the hub score, h, for A to B passing through I as:82 ∑&∉A,S 𝑞&AS!P 𝑇A& 3-15 ℎ= ∑&∉A,S 𝑞&P 𝑇A& + 𝑇AS In this work, we used the hub score to quantify the transfer efficiency and the significance of individual residues.82 The uncertainty of each desorption hub score, h, was calculated by bootstrapping as described above for MFPT. The transfer efficiency, 𝜂2 , for OAA transport from each of the two source active sites to either target active site was estimated as: 63 𝜂C%, = 𝑃" ∙ (1 − ℎ" ) + 𝑃# ∙ (1 − ℎ# ) 3-16 𝜂C%- = 𝑃V ∙ (1 − ℎV ) + 𝑃W ∙ (1 − ℎW ) 3-17 where ℎ5 is the hub score with the source active site macrostate and the target active site microstate with the desorption macrostate as the intermediate state for the four reaction paths. 3.3 Results and Discussion 3.3.1 Transport Path Analysis from Electrostatic Potential Surface Both MDH and CS have two subunits, named here as M1, M2 for MDH, and C1, C2 for CS. Each subunit contains an active site: SM1, SM2, SC1, and SC2 as shown in Figure 3.1a. SM1 and SM2 are the source active sites on MDH; SC1 and SC2 are the target active sites on CS, which are open and have a highly positive cleft.40 Therefore, four OAA transport paths can be defined using these active sites: (1) Path 1: from SM1 to SC1; (2) Path 2: from SM1 to SC2; (3) Path 3: from SM2 to SC1; and (4) Path 4: from SM2 to SC2. From the electrostatic potential surface of the enzyme complex in Figure 3.1b, some qualitative observations about the active sites and the reaction paths are worth explaining. First, SM1 is much closer to both SC1 and SC2 than to SM2. As such, the two paths originating at SM1, Paths 1 and 2, are more likely to be efficient compared to those starting from SM2. Path 1 is the shortest path (~ 4nm) in this enzyme complex and features a continuously positive surface comprising arginine and lysine residues to enable the direct transport of OAA from SM1 to SC1. Meanwhile, Path 2 has a direct distance of 6 nm. The surface pathway for Path 2 includes the charged residues on subunit C1. 64 Second, SM2 faces the bulk solution and has an unfavorable orientation towards the target active sites. Consequently, OAA intermediates that successfully transfer from SM2 to target active sites are highly likely to diffuse through the bulk solution due to the lack of continuously positive surfaces on Paths 3 and 4. These observations are made based on crystal structures, and enzyme configuration is expected to vary in solvent or during long simulations. To avoid that, we restrained the backbone atoms to focus only on the interaction between the MDH-CS enzyme complex and OAA. 3.3.2 Transfer Efficiency To investigate the transition probabilities for both the recombinant porcine complex and the mutated complex, using the MSM models built with five features and 200 short MD simulations, the 500 microstates were categorized into six macrostates (i.e., basins): four active Figure 3.2 Distribution of macrostates and the dominant pathways in each reaction path. 65 site basins (SM1, SM2, SC1, and SC2), a desorption basin, and an intermediate basin (Figure 3.2). The active site basins consisted of microstates within 1 nm of the corresponding active site. For instance, basin SM1 is defined by all the clusters where 𝑑2!" < 1 nm. The Desorption basin is defined as all clusters where 𝑑>@AB > 1.2 nm. The remaining clusters form the intermediate basin, wherein OAA is positioned on the enzyme surface but not near an active site. Mean first passage time (MFPT) between basins was subsequently calculated according to Equation 3-12. As shown in Table 3.1, the MFPT of Path 1 from SM1 to SC1 is the fastest compared with the other three reaction paths due to direct transfer by electrostatic channeling. Furthermore, SM2 has the fastest MFPT to desorption. This observation is consistent with SM2’s unfavorable orientation. Even for the slowest reaction path from M1 to C2, the MFPT is still much faster than the turnover frequency of MDH which is about 31 s-1 for the forward reaction.40 As shown in Figure 3.3a, the relative probabilities of Path 1 and Path 2, the transport paths starting from SM1, are 80% and 20%, respectively; the relative probabilities of Path 3 and Path 4, starting from SM2, are 56% and 44%, respectively. In other words, Path 1 is much more probable than Path 2, but Paths 3 and 4 have comparable probability. However, Paths 3 and 4 are distinct form Paths 1 and 2 in that they primarily involve desorption. When intermediates are transferred by electrostatic channeling, there exists a nonzero probability of desorption into the Table 3.1 Mean first passage time (MFPT / ns) between macrostates. The vertical axis represents starting basins and the horizontal axis represents target basins, with “Desorbed” representing a terminal desorbed state. Uncertainty was calculated by bootstrapping using subsets of the simulated data. SC1 SC2 Desorbed SM1 100 ± 9 5900 ± 1500 1200 ± 300 SM2 400 ± 60 3500 ± 1000 11 ± 1 66 Figure 3.3 (a) Relative probabilities for the four reaction paths, 𝑃5 , for both the recombinant porcine complex and the mutant complex. (b) Desorption hub scores, ℎ5 , for both the recombinant porcine complex and the mutant complex. Error bars are calculated by bootstrapping using subsets of the simulated data. Table 3.2 Transfer efficiency (%) comparison. Source Recombinant Mutant Active Site This work Huang et al. 𝑆&+ 95.6 ± 0.04 90 47.2 ± 0.03 𝑆J# 0.01 ± 8×10-5 38 0.1 ± 5×10-4 67 bulk solution. OAA molecules that enter the bulk solvent escape the range of electrostatic interaction and are therefore not electrostatically channeled. Indeed, the outcome of any solvated intermediate depends on the concentrations of both intermediate and complex, the latter of which affects the diffusion distance of the intermediate. Thus, we consider transfer through bulk solvent as unchanneled and ignored this pathway when calculating channeling efficiency. Quantifying the desorption probability of electrostatic channeling is therefore an effective way to evaluate the electrostatic channeling efficiency. The hub score is a metric to determine the probability of visiting an intermediate basin on transition paths between two other basins of interest.82 Hub scores can be used to estimate the desorption probability on reaction paths with the desorption basin as the intermediate state. Therefore, we used the desorption basin as the intermediate state to calculate the hub score of passing through the desorption basin for each reaction path. For instance, for Path 1, we used basin SM1 as the starting state, basin SC1 as the ending state, and the desorption basin as the intermediate state and calculated the probability of all pathways from SM1 to SC1 that pass through the desorption basin. In this scenario, the hub score of each path represents the desorption probability for that path. In other words, if a specific pathway passes through the desorption basin, electrostatic channeling is ineffective on that pathway. Accordingly, a high hub score means high desorption probability and thus low electrostatic channeling. As shown in Figure 3.3b, for the recombinant porcine structure, the desorption hub score of Path 1 is 0.000004 %, and that of Path 2 is 22.2 %. But the desorption hub score for Path 3 and 4 are both 100%. This indicates that Path 1 in the recombinant porcine is a nearly-perfect electrostatic channel. The relatively low desorption hub score of Path 2 also indicates that Path 2 is effective at protecting OAA from desorption, given the partially positive surface between SM1 68 (a) (b) SM1 SC2 SC1 SM2 Arg65 Figure 3.4 (a) The dominant pathway from SM1 to SC2 (dotted line) on the free energy surface. (b) Spatial visualization of the dominant pathway. Clusters included in the pathway are indicated by red circles. and SC2. In contrast, the orientation of SM2 is unfavorable, and both Paths 3 and 4 cover long distances over surfaces that are not continuously positive; thus, they both have desorption hub scores of ~100%. The transfer efficiency, 𝜂2!" , of the recombinant porcine complex for OAA starting from SM1 is 95.6 ± 0.04% (Table 3.2), and the transfer efficiency with OAA starting from SM2, 𝜂2!# , is 0%, as calculated using Equation 3-16 and 3-17. The result is consistent with results previously reported in Huang et al.’s Brownian dynamics study (90% at 0.05 mM ionic strength diffused from SM1 and 38% diffused from SM2).19 3.3.3 Dominant Pathways We further refined the set of clusters near each active site to those where the distance between the active sites and OAA is below 0.3 nm. Among these clusters, we identified the dominant pathway for each of the four reaction paths in TPT and calculated their relative probabilities, 𝑝5 , and individual flux, 𝑓5 , using Equation 3-11. It is worth noting that we differentiate pathways from paths: a path refers to the transport of OAA starting from a specific 69 source active site to a specific target site, whereas a pathway represents a collection of specific MSM clusters through which OAA transfers. Therefore a path is comprised of a collection of pathways. The dominant pathway for Path 1 is the most likely pathway for the whole system, with a relative probability of 5.2%. The next three most probably pathways all have a low relative probability (~ 0.3%) and pass through the desorption basin. In other words, this dominant pathway is the most likely specific pathway for OAA transport, with 5% of successfully transported OAA taking this pathway. The dominant pathway is illustrated within the free energy landscape in Figure 3.4a and traverses an energy minimum between SM1 and SC1. As shown in the Table 3.3 Key residues and their hub scores. Residue LYS432 on C2 ARG65 on C1 ARG67 on C1 LYS423 on C2 subunit subunit subunit subunit Hub score /% 11 11 10 8 Hub score /% 11 11 10 8 Figure 3.5 Spatial positions of key residues. 70 visualization in Figure 3.4b, the dominant pathway passes through a key residue, ARG65, identified in the previous experimental study.40 These results confirm that ARG65 is a key residue in the electrostatic channeling of OAA within the MDH-CS complex. 3.3.4 Key Residues Beyond verifying ARG65 as a key residue, it is possible to identify other key residues in the electrostatic channeling process. To accomplish this, MD frames for all non-desorption clusters were analyzed to identify positive residues of which OAA passes within 0.5 nm. For example, the distance between ARG67 and OAA was less than 0.5 nm in clusters 41, 83, 178, 206, 255, 318, and 483. These clusters were then used as the intermediate state, I, to calculate a hub score for ARG67. In addition to ARG65, we found three more key residues with hub scores of around 10%: LYS432, ARG67, and LYS423 (see Table 3.3). In Bulutoglu et al.’s experimental study, ARG67 was another mutated residue crucial to the enzymatic activity of CS whose mutation decreased the activities in 2 or 3 orders of magnitude in comparison to the recombinant wild-type CS.40 As shown in Figure 3.5, these four residues form a bottleneck, which effectively stalls OAA transport from SM1 to SC2. Ultimately, this region of transport has the lowest energy profile within the enzyme complex (i.e., the purple region in Figure 3.4a). Table 3.4 Mean first passage time (MFPT / ns) of the mutant. The vertical axis represents starting basins and the horizontal axis represents target basins, with “Desorbed” representing a terminal desorbed state. SC1 SC2 Desorbed SM1 9±1 256±18 39±3 SM2 41±2 196±17 4±0 Table 4. Mean first passage time (MFPT / ns) between macrostates of the mutant. The vertical axis represents starting basins and71 the horizontal axis represents target basins, with “Desorbed” representing a terminal desorbed state. 3.3.5 Comparison with the Mutant Mean first passage times (MFPT) for the mutant MD-CS complex are shown in Table 3.4. All four pathways show lower MFPT compared to the recombinant structure. This can be explained by weaker interactions with the surface, due to the mutation, which eliminates the trapping of OAA at an energy minimum and thus increases the mobility of OAA. However, this increased mobility includes desorption pathways. As shown in Figure 3.3a, the relative probabilities of Paths 1 and 2 within the mutant complex starting from SM1 are 66% and 37%, respectively; those starting from SM2 (Path 3 and Path 4) are 53% and 47%, respectively. Compared with the recombinant complex, the desorption hub scores of Path 1 and Path 2 for the mutant complex are significantly higher, especially for Path 2 (Figure 3.3b). In contrast, the desorption hub scores for Path 3 and Path 4 remain at ~100%. The transfer efficiency, 𝜂2!" , (Equation 3-16) for transport originating from SM1 decreased after mutation, from 95.6% to 47.2% (Table 3.2). The transfer efficiency, 𝜂2!# , (Equation 3-17) for transport originating from SM2 remained at 0.1%. Collectively, these hub scores and transfer efficiencies demonstrate that the mutation of ARG65 to ALA65 significantly impairs the transport of OAA from SM1 to SC1 or SM1 to SC2. Notably, these results are consistent with previously reported experimental observations. Bulutoglu et al. reported an increased lag time of about 30-fold.40 We likely did not observe such a significant difference in transfer efficiency because the position of the enzyme backbones was fixed in our simulation, and thus, we could not simulate the effect of mutation on the morphology of the enzyme complex—a key reason for the increased lag time in Bulutoglu et al.’s experimental study. Since the side chain of alanine is much smaller than arginine, the mutation from ARG65 to ALA65 also represents a 72 reduction in residue size. Both the reduced charge and bulk may influence the structural properties and stability of the complex, as discussed by Bulutoglo et al.40 Transfer efficiency has a strong effect on experimentally determined lag time: the lower the transfer efficiency of the intermediate, OAA, the longer it takes for a flux of OAA to CS to reach steady-state.87 Therefore, by comparing the transfer efficiency of the mutant of MDH-CS with that of the recombinant form, we would predict a much higher lag time. Quantitative lag time predictions are the subject of ongoing work. 3.4 Conclusion The transport mechanism of OAA using electrostatic channeling is not only a key step of the TCA cycle, but also an excellent example of naturally occurring electrostatic channeling. In this work, we used molecular dynamics combined with MSM analysis to examine the atomic- level interaction between OAA and the positively charged enzyme surface of MDH-CS. By describing intermediate transport between the four active sites using a Markov State Model, we quantified the relative flux probabilities and desorption probabilities for all possible transport paths. Our results show that Path 1 with OAA from SM1 to SC1 is a more favorable reaction path than Path 2 from SM1 to SC2. In particular, we visualized the overall dominant pathway from SM1 to SC1 and found that it passes through the previously experimentally identified key residue ARG65.40 Moreover, we identified three additional key residues: LYS432, ARG67, and LYS423. These four key residues trap OAA along the dominant pathway from SM1 to SC1 and hinder its desorption. Mutation of ARG65 to ALA65 cuts the transfer efficiency diffused from SM1 almost in half. Altogether, our computational results provide a deeper understanding of the electrostatic channeling mechanism in the TCA cycle. This work contributes to designing strategies for efficient catalytic nanostructures utilizing fusion proteins and non-natural protein 73 supercomplexes. Such structures may be implemented in the catalysis of cascade reactions in biofuel cells, biosensors, and chemical synthesis. 74 Chapter 4. Molecular Tunneling of Ammonia in Asparagine Synthetase 4.1 Introduction Though electrostatic channeling can achieve high transfer efficiency, it requires specific properties of the enzymatic reaction: the enzyme surface of the intermediate’s transfer path has to be highly charged and the intermediate needs to carry the opposite charges. The intermediate is likely to be exposed to the bulk while transferring via electrostatic interaction, which leads to a leaky channeling. As an alternative approach, intramolecular tunneling can protect the intermediate from desorption and has the potential to achieve a channeling probability of 100%. Intramolecular tunnels are found in many supramolecular complexes. The most widely studied example is the indole tunnel in tryptophan synthase.46,148,149 The tryptophan synthase has two subunits and involves a two-step biosynthesis of L-tryptophan. Indole, a stable intermediate, transfers through a tunnel connecting the two active sites of length ~25 Å.46 Similarly, an ammonia (NH3) tunnel has been found in many enzymes, including the glutamine-dependent aminotransferase (GATase) group in particular.150 GPATases make use of NH3 from the hydrolysis of glutamine as the nitrogen source, in which glutamine is hydrolyzed into glutamate and NH3 on the first site, and NH3 as the intermediate transfers through an intramolecular tunnel to the second active site for the next step. Tunnels of carbon monoxide intermediate, acetaldehyde intermediate, and N5-formiminotetrahydrofolate intermediate are also found in cetyl-CoA synthase/carbon monoxidedehydrogenase (ACS/CODH), aldolase/dehydrogenase/DmpFG, and formiminotransferase-cyclodeaminas, respectively.151–153 Glutamine-dependent aminotransferases (GATases) are divided into two subfamilies, class I (also known as trpG-type or triad) and class II (also known as purF-type or Ntn).150 The classification is based on sequence similarity. The active site in class I GATase is a triad of 75 conserved Cys-His-Glu, with the catalytic cysteine being essential for amidotransferase activity. In class II GATase, the active site is a cysteine located at the N-terminal end of the enzyme, with Asn and Gly forming an oxyanion hole to stabilize the formed tetrahedral intermediate.150 Class I GATase includes carbamoyl phosphate synthetase, imidazole glycerol phosphate synthase, and anthranilate synthase, while class II GATase includes asparagine synthetase, glutamate synthase, amido phosphoribosyltransferase, and glucosamine--fructose-6-phosphate aminotransferase.150 Asparagine synthetase is responsible for the conversion of aspartate to asparagine with the presence of both ATP as the energy source and nitrogen source.154 There are two types of asparagine synthetase based on the nitrogen source. The first type, asparagine synthetase A (ASNA), only utilizes ammonia as the nitrogen donor, whereas the second type, asparagine synthetase B (ASNB), utilizes both the glutamine and ammonia as the nitrogen source with a preference of glutamine.155 Based on the amino acid sequence analysis, ASNB was proved to be a member of class II GATase and have two nitrogen sources.49 Figure 4.1(a) The structure of Asparagine Synthetase B(ASNB); (b) The visualization of the three distances between NH3 and S1, S2, and the desorption path ends. 76 In this work, we focused on investigating the tunneling mechanism of ASNB. ASNB is a dimeric protein with two domains for each subunit (Figure 4.1a). The first domain is the N- terminal and has the glutaminase site. It hydrolyzes glutamine into glutamate and NH3 (Equation 4-1). The second domain is the C-terminus, which is also the synthetase site that converts aspartate and NH3 with the help of ATP into asparagine, AMP, and PPi (Equation 4-2). Glutamine + H# O ↔ Glutamate + NHV 4-1 Aspartate + ATP + NHV ↔ PPi + Asparagine + AMP 4-2 by the overall reaction, asparagine synthetase forms asparagine: Glutamine + H# O + ATP + Aspartate 4-3 ↔ Glutamate + AMP + PPi + Asparagine Larsen et al. synthesized the crystal structure of C1A mutant of ASNB from Escherichia Coli (PDB: 1CT9) and confirmed the existence of intramolecular tunnel.49 After being generated at the first site, NH3 is expected to be transported through the tunnel to the second site. Li et al. studied the ammonia channeling in ASNB with gHMQC-Based NMR assay.156 Their results show that ASNB makes use of exogenous NH3 as the nitrogen source either by accessing the second site or transport through the tunnel from the first site. Later, they identified the tunnel residues with the crystal structure from Larsen et al. and claimed that Ala388 is located at a bottleneck that significantly affects the transport of NH3. They studied an A388L mutant, which introduced a tunnel block by mutating alanine to a bigger residue leucine and found that the glutamine depend synthetase activity decreased about 8 fold, confirmed the key importance of the residue Ala388.157 However, their experiment failed to investigate the interaction between the tunnel residues and NH3 and was not able to explain the tunneling mechanism at atomic level. 77 Previously, our group used a continuum model to study nanoscale confinement as mentioned in Chapter 1.69 Later, the work was extended to investigate the tunneling mechanism by a single-wall carbon nanotube (SWCNT) decorated with different termini for both charged and uncharged intermediates with molecular dynamics (MD) simulation.158 The results shows that a length:diameter ratio of the tunnel above 5 can provide the intermediate with a high retention time, which is beneficial for high transfer efficiency.158 The water distribution inside the tunnel also plays an important role. In this work, we use the crystal structure of asparagine synthetase B as a model system to understand the tunneling mechanism with molecular dynamics (MD) simulations. MD simulations allow us to compute movement of enzymes and ligands within a timescale of hundreds of nanosecond. Combined with advanced sampling techniques including umbrella sampling and Markov state modeling, we were able to uncover the energy interaction between NH3 and ASNB and kinetic dynamics of NH3 in the tunneling process. 4.2 Methods 4.2.1 Structural Model We use the crystal structure of C1S mutant of asparagine synthetase B (ASNB) synthesized by Larsen et al. in the Protein Data Bank (PDB ID: 1ct9) as the original structure.49 As the crystal structure of 1ct9 has four identical subunits, we used chain D as the model subunit. There are three residues that have missing atoms: 426LYS, 447ARG, 448GLN, which affected the simulation setup, so we used the Mutagenesis tool in PyMOL to complete the missing atoms in these residues, in other words, replacing incomplete residues with complete ones.137 In MD 78 Figure 4.2 (a) Enzyme of ANSB. Active sites S1 and S2 are indicated with blue bead and red bead, respectively. The paths are indicated in different colors. (b) Detailed paths from Caver. Each line represents a specific path from a specific configuration. simulations, hydrogen atoms, counter ions, and TIP3P water molecules were added to the model structure using GROMACS 2021.4.123 The structure of NH3 was obtained from the ZINC database and parameterized with the CHARMM General Force Field (CGenFF).159,160 The enzyme-ligand complex was equilibrated at 300 K with NVT and NPT ensembles. 4.2.2 Initial Examination of Tunnels in ANSB The existence of possible tunnels for NH3 was investigated by Caver3.0.161 We ran 5 parallel simulations for only the ASNB enzyme for 20 ns without any restricted motion, and extracted 200 configurations for each simulation (thus 5*200 configurations) for tunnel analysis in Caver. The starting point of a tunnel was defined as the area within 5 Å of the first active site. The Caver probe radius was 1 Å. 4.2.3 Umbrella Sampling Since the crystal structure of ASNB available from Larsen et al. is a C1A mutant, we consider the position of 1ALA as the first active site, even though 1CYS is the residue of the first 79 active site of ASNB.49 To study the enzyme profile along the tunnel, umbrella sampling was adopted to sample the path from the first active site (1ALA) and the second active site (348GLU). The direct length between the two sites in the crystal structure is about 19 Å. A steered molecular dynamics simulation was conducted to pull NH3 from the first active site to the second active site. Then we selected 50 frames with NH3 positioned along the tunnel as the initial configurations for umbrella sampling. We conducted the same simulations for the A388L mutant in order to investigate the tunneling mechanism from the difference of energy profiles. 4.2.4 Markov State Model In this work, we used similar methodology of the Markov State Model as we described in Chapter 3.162 Briefly, we conducted 50 parallel MD simulations for 20 ns using the same 50 frames in umbrella sampling as the initial configurations and used all the data for MSM analysis. 500 clusters are used to classify all the frames of the MD trajectories based on the five collective variables defined as the distances from NH3 to five states, including the enzyme surface (dsurf), the first active site (d1), the second active site (d2), the first desorption path end (ddes1), and the second desorption path end (ddes2) (Figure 4.1b). Additional elaboration on the desorption path ends can be found in the results section. The MSM model was built with the 500 clusters and then classified them into four macro-states, including desorption, tunnel, S1, and S2. Then the hub score analysis was conducted to study the transfer efficiencies of NH3 transferring from S1 to the bulk solvent via various transition states.82 4.3 Results and Discussion 4.3.1 Identification of the Tunnel Structure in Wild Asparagine Synthetase From the results of Caver tunnel analysis, we were able to identify four paths starting at the first active site. Figure 4.2a shows the structure of asparagine synthetase B with the identified 80 tunnels from Caver. From the detailed paths in Figure 4.2b, we can see that T1 is the path with which NH3 can enter or exit from the first active site. T2 is the tunneling path from the first active site to the second active site. T3 and T4 are the leaky tunneling paths in the middle of the tunnel. T1 shows up in 989 configurations out of 1000, which means among the 1000 configurations, 989 of them contain a tunnel such as T1 with a bottleneck radius greater that the threshold of 1 Å, which is the Caver probe radius. Similarly, T2, T3, and T4 appear in 699, 694, and 355 frames, respectively. The remaining tunnels were considered rare and were ignored. The high occurrence probability of T1 indicates that there is a high probability for NH3 generated on the first site to exit or exogenous NH3 to enter the tunnel. The tunneling path T2 is very likely to transport NH3. NH3 could possibly leak via T3 as its occurrence is comparable to T2. T4 could also leak NH3 with a lower likeness compared to T3. However, all the analysis is based on Figure 4.3 (a) Tunneling path T2 is shown in green. (b) Energy profiles along the NH3 tunnel of the original structure of asparagine synthetase B and its mutant A388L. 81 geometric structure and steric hindrance of the enzyme, which provides some insights but must be verified with the enzyme-ligand complex simulations. 4.3.2 Energy Profile for Ammonia Transport along the Tunnel From the analysis of Caver and visualization in VMD, a representative tunnel of T2 is shown in Figure 4.3a. From the energy profile of the original structure in Figure 4.3b, there is a dominant peak at d1 ~ 0.7 nm. The energy barrier is 8.7 kJ/mol. The tunnel in Figure 4.3a also shows a bottleneck with a small radius at d1 ~ 0.7 nm which is formed by a set of residues (120MET, 143ILE, 389ASN, and 398GLU). The peaks at d1 at 1.1 nm,1.4 nm, and 1.6 nm also come from the narrow areas along the tunnel, but they are much less significant that the peak at 0.7 nm. The energy profile of the mutant structure is discussed below. 4.3.3 Mutation effect As shown in Figure 4.3b, the energy profile of the A388L looks very similar to that of the original structure: it also has peaks at 0.7 nm, 1.1 nm, 1.4 nm, and 1.6 nm. The huge difference is located at 1.1 nm where the mutant has the greatest energy peak with an energy barrier of 9.9 Figure 4.4 Tunneling path of NH3 is showed in yellow. The blue residue represents the first active site, whereas the pink residue represents the second active site. The red residue represents the mutated residue 388Leu. 82 Table 4.1 Hub score analysis. Hub score Original Mutant(A388L) p1: S1 to bulk via T1 0.51 0.43 p2: S1 to bulk via T2 0.45 0.41 p3: S1 to bulk via T3 0.002 0.15 p4: S1 to bulk via T4 0.009 0.012 Total 0.96 0.99 kJ/mol, but the peak of the original structure at 1.1 nm is quite small. From the visualization of the corresponding configurations, we found that the sidechain of the mutated residue 388Leu fluctuates within the tunnel, which prevents the direct passage of NH3. The mutated residue leucine has a much longer sidechain than that of alanine, therefore, the free movement of leucine’s side chain acts like a flexible blockage and decreases the tunneling probability of NH3 (Figure 4.4). The maximum energy barrier of the mutant shifted from d1 ~ 0.7 nm to d1 ~ 1.1 nm and increased from 8.7 kJ/mol to 9.9 kJ/mol. 4.3.4 Dynamics of NH3 Transport Pathway From the Markov State Model paired with the hub score analysis, we were able to obtain the transfer probabilities. In MSM, we defined four basins from the 500 clusters: the bulk basin in which dsurf > 0.5 nm, S1 in which d1 < 0.5 nm, S2 in which d2 < 0.5 nm, and tunnel basin with the rest of clusters. For the analysis of desorption paths T1, T3, and T4, we identified the terminal residues of them and defined their termini basins. For example, for T3, residues 118, 221, and 407 (shown as the three residues in red sticks in Figure 4.2b) are the terminating ones and we defined all the clusters with the distance between NH3 and the center of the three residues smaller than 0.5 nm as the end of T3 basin. Similarly, for T1, residues 54, 57, 80, and 96 are 83 used to define the end of T1 basin; for T4, residues 382, 515, and 173 (shown as the three residues in cyan sticks in Figure 4.2b) are used to define the end of T4 basin; As shown in Table 4.1, we quantified four hub score probabilities (p1, p2, p3, and p4) of the paths from S1 to bulk along the four paths. The values of p1 are comparable between the original structure and the mutant, indicating that this is a leaky tunnel and NH3 is highly likely to desorb at the entrance of the tunnel. The values of p2 are also similar, indicating that the transport efficiency of NH3 from S1 transferring to bulk through S2 via T2 is about 45%. If all of this ammonia were to react at S2, 45% NH3 would react at the second site via T2 instead of continuing to go to the bulk. The other 55% of NH3 does not go through S2 before going to bulk which might follow the path T1 and go to bulk directly. This also means that the mutation does not affect the overall transport efficiency significantly, which is consistent with the small energy barrier increase ~1 kJ/mol. However, the large difference occurs on p3, in which the mutant has a much larger value than the original structure, representing an 75 fold increase. In addition, p4 also increased after the mutation. Considering p3 and p4 together, we can see that after the mutation, the desorption probabilities indeed increased in the specific paths predicted using Caver. T3 is more leaky than T4 because the mutation is located after the divergence of T3 but before that of T4. Therefore, to avoid the bottleneck introduced by the mutation, NH3 is more likely to desorb before the mutation bottleneck through T3. This explains how p3 increased more significantly that p4. 4.3.5 Toward Experimental Validation Experimentally, Li et al. reported that the mutation of A388L decreased the activity of glutamine dependent synthetase by 8 fold, whereas the glutaminase activity and ammonia dependent activity are comparable to that of the original structure.157 This indicates that the 84 mutation does not affect the conversion of glutamine into NH3, but affects the transport of the NH3 produced at the glutaminase site to the synthetase site along the tunnel. Comparing with our computational result, this decrease of activity maybe a result of the local energy increase. In addition, we did see a small decrease of channeling probability along T2, p2, due to mutation, which correlates to the 8-fold decrease in activity but is not proportional to it. This could be attributed to the possibility of taking the side paths along T2 which use not just S2 but also the bulk as the transition states. These paths should be considered as desorption paths and their probabilities should increase after mutation. With an improved metric based on hub score to exclude unintended side paths along T2, we expect to see a significant decrease of p2 after mutation. 4.4 Conclusion In this work, we examined the molecular tunnels in asparagine synthetase B with molecular dynamics simulation together with Caver, umbrella sampling techniques, and Markov State Model. From Caver, several tunnels were identified as potential paths of NH3. Among them, T2 is the desirable path for the successful transport of NH3 from the glutaminase site to the synthetase site in ASNB; T1 is the gating tunnel for both the exit of glutamine-hydrolyzed NH3 and entry of exogenous NH3; T3 and T4 are the leaky paths along the tunnel. With energy profile from umbrella sampling, we found the largest energy barrier occurred at d1~0.7 nm and was formed by several bottleneck residues. Comparing with the energy profile of the mutant structure, the mutation from Ala to Leu changed the dominant peak from d1~0.7 to d1~1.1 nm and increased the energy barrier about 1 kJ/mol. With the hub score analysis in MSM, we found that T3 is very likely to be the leaky path due to the mutation. In summary, these computational findings will contribute to design and application of molecular tunnels in cascade reactions. 85 Chapter 5. Coarse-Grained Molecular Dynamics Simulation of HK- bridge-G6PDH Complex 5.1 Introduction As discussed in Chapter 1, an artificial bridge was incorporated into the enzyme complex of HK and G6PDH to enhance the substrate channeling of G6P.20 Experimentally, to synthesize the HK-bridge-G6PDH complex, the terminus of each component is modified with linkers so that they can react with each other to introduce covalent bonds between them. The bridge is composed of six lysine residues and a cysteine(Cys-Lys6). Computationally, the crystal structures of HK and G6PDH are taken from PDB ID 3VF6 and 4LGV, respectively. The linkers and the peptide bridge along with the two enzyme structures are connected together manually with Avogadro. The conjugated structure is shown in Figure 5.1.20 Figure 5.1 Illustration of the proposed channeling complex using a poly(lysine) bridge as an electrostatic surface between hexokinase (HK; PDB 3VF6) and glucose-6-phosphate dehydrogenase (G6PDH; PDB 4LGV). This figure is from ref. 20. 86 In the previous work, Liu et al. focused on the transport of intermediate on the peptide bridge, from HK to bridge, and from bridge to G6PDH.20,78,163 Due to the large size of HK- bridge-G6PDH, it was not considered in its entirety using the molecular dynamics approach. However, it is very important to understand how two enzymes interact with each other during the transport of G6P in order to completely understand the effectiveness of substrate channeling. The hypothesis is that the enzymes would diffuse away without the linkage of the bridge, as both HK and G6PDH are negatively charged. Coarse-grained molecular dynamics is powerful at investigating long-terms dynamics of biological systems as it simplifies amino acids and water molecules into large beads. The reduction of size dimension allows CG MD to run simulations on microsecond time scales. This approach has been widely used to study long-time processes, such as membrane dynamics, protein folding, protein aggregation, and protein-protein interactions.164–168 Therefore, coarse- grained molecular dynamics is employed in this work to examine the interactions between enzymes and the bridge. In this work, we studied the dynamics of the separated system consisting of non-linked HK, bridge, and G6PDH. 5.2 Methods In our work, we used Gromacs 2021-4 to run the simulation.169 The Martini force field was used to coarse-grain the atomistic structure obtained from Liu et al.20,170 The ElNeDyn elastic network was adopted to conserve the conformation of the crystal structure and avoid the dissociation of beads.171 87 For the system of HK, bridge, and G6PDH, we ran 10 parallel simulations for 10 µs at temperature of 300K after the energy minimization and equilibration steps. In these simulations, 35 Sodium ions were added to neutralize the system. To perform the analysis, the enzymes and the bridge were aligned with the initial structure using rotation and translation. The system was analyzed using four collective variables: the minimum distances between HK and G6PDH, HK and the bridge, and the bridge and G6PDH, and the relative displacement angle between HK and 𝑣1 and the vector of G6PDH, ••••⃗, G6PDH. The displacement angle is defined with the vector of HK, ••••⃗, 𝑣2 as below: 𝑣 ••••⃗" ∗ 𝑣 ••••⃗# 5-1 q = 𝑎𝑟𝑐𝑐𝑜𝑠( ) |𝑣••••⃗||𝑣 " ••••⃗| # Figure 5.2 Coarse grained beads of the conjugated structure of HK-bridge-G6PDH. 88 Figure 5.3 (a) The averaged nearest distance between HK and G6PDH across the ten parallel simulations (blue) and its standard deviation (cyan); (b) the averaged distance between HK and the peptide bridge across the ten parallel simulations (blue) and its standard deviation (cyan); (c) the averaged distance between the peptide bridge and G6PDH across the ten parallel simulations (blue) and its standard deviation (cyan). 5.3 Results and Discussion To gain insight into the movement of the enzymes and bridge in relation to each other, three distances were examined. Analysis of the distance between HK and G6PDH indicated a preference for close proximity, leading to an investigation of the enzymatic interaction energy. The findings demonstrated a significant increase in interaction energy with close proximity. Further analysis of the electrostatic surface revealed an increase in localized electrostatic interaction energy due to G6PDH rotation. 89 Figure 5.4 (a) The short range Coulombic energy between HK and G6PDH; (b) The short range Lennard-Jones energy between HK and G6PDH. 5.3.1 Distances Analysis between HK and G6PDH From the three distances analysis, it was observed that HK and G6PDH approached a distance of less than 0.5 nm and maintained this close proximity after a short simulation time of approximately 100 ns in all 10 simulations as shown in Figure 5.3a. The size of bridge is relatively small compared to the enzymes, so the bridge moved freely around HK and G6PDH. As shown in Figure 5.3 b and c, the bridge is more likely to stay on G6PDH as its averaged distance to G6PDH is below 0.6 nm and that to HK is around 1 nm. 5.3.2 Interaction Energy To investigate this phenomenon of maintaining a close proximity, we analyzed the interaction energy between HK and G6PDH. As shown in Figure 5.4, upon approaching each other, the Coulombic energy between the enzymes decreased rapidly, and the Lennard-Jones energy decreased even more significantly. The total interaction energy, which is the absolute value of the sum of Coulombic energy and Lennard-Jones energy, increased, indicating a strong interaction between HK and G6PDH. It 90 Figure 5.5 (a) The front view of the electrostatic surface of HK; (b) a representative configuration of HK and G6PDH; (c) The front view of the electrostatic surface of G6PDH; (d) The side view of the electrostatic surface of HK; (e) The side view of the electrostatic surface of G6PDH. should be noted that this observation may be altered under higher ionic strength conditions. In experiments, HK and G6PDH have not been observed to be strongly associated or to form a metabolon, suggesting that this interaction may be transient.20 Further experimental validation is needed, such as the measurement of association constants. 5.3.3 Relative Replacement of G6PDH The analysis of the trajectories revealed a significant rotation of G6PDH upon approach to HK. Figure 5.5b illustrates a representative ending configuration of HK and G6PDH, and a comparison with the initial configuration in Figure 5.2 demonstrates the significant rotation of G6PDH for interaction with HK. Electrostatic surface analysis indicated that the contact surface 91 Figure 5.6 (a) The vectors of HK and G6PDH; (b) The averaged displacement angle between HK and G6PDH(blue) and its standard deviation(cyan). involves the negative (red) charge of HK and the positive (blue) charge of G6PDH, as highlighted by yellow circles. This significant rotation of G6PDH occurred throughout all simulations and led to a decrease in the interaction energy. This displacement rotation is further analyzed using the angle defined using two vectors of HK and G6PDH (Equation 5-1). As indicated in Figure 5.6a, the vector of HK is defined by THR133 and THR359, and that of G6PDH is defined by THR46 and PRO337. The starting configuration has an initial angle of 50 degrees, which changes to 103 degrees after the displacement rotation, as shown in Figure 5.6b. This rotation may help to stabilize the relative configuration of HK and G6PDH by reducing the interaction energy. 5.4 Conclusion This chapter utilized coarse-grained molecular dynamics to investigate the interaction between HK and G6PDH. The results demonstrated a preference for close proximity between HK and G6PDH, resulting in a significant increase in the interaction energy. Moreover, the peptide bridge was observed to move freely on either HK or G6PDH or the complex of both. As a future direction, exploring the dependence of enzymatic interaction on ionic strength and investigating the energy profile as a function of HK-G6PDH distance through umbrella 92 sampling simulation could be beneficial. Additionally, conducting simulations of a linked complex of HK-G6PDH with the bridge could provide valuable insights. The bridge may limit the rotation of G6PDH, which could be compared to the results obtained in the absence of the bridge. Overall, this work provides a deep understanding of the enzymatic interaction between HK and G6PDH, which can aid in the screening of artificial cascade reactions to incorporate substrate channeling and improve efficiency. 93 Chapter 6. Conclusions and Future Work The work in this thesis is inspired by the need to design efficient cascade reactions with a good understanding and implementation of substrate channeling. In this thesis, two mechanisms of substrate channeling, namely, electrostatic channeling and molecular tunnel, are examined through computational techniques, in particular, molecular dynamics. Chapter 2 explores the electrostatic channeling mechanism of a charged intermediate on an oppositely charged polypeptide bridge in an artificial cascade reaction. Specifically, we combine infrequent metadynamics (InMetaD), umbrella sampling (US), and kinetic Monte Carlo (KMC) models to computationally study the transfer mechanism of glucose-6-phosphate (G6P) on a poly-arginine peptide bridging hexokinase (HK) and glucose-6-dehydrogenase (G6PDH). Transport of G6P by hopping in the presence of poly-arginine peptides is shown to be a rare event, and InMetaD is used to compute the hopping activation energy. US simulations capture the configurational change in the desorption process and enable the determination of the desorption energy. Parameterized by these results, a KMC model is used to estimate transport efficiency for the bridged enzyme complex. Results are compared to a similar complex using a poly-lysine bridge, using kinetic lag time as a metric. Even at a high ionic strength of 120 mM, poly-arginine peptides may be capable of more efficient transport as compared to poly-lysine, with a predicted lag time of 6 seconds for poly-arginine, compared to a previously reported lag time of 59 seconds for poly-lysine. This work indicates that poly-arginine peptides may be an improved bridge structure for electrostatic channeling of anionic intermediates. Chapter 3 furthers the understanding of electrostatic channeling by investigating a natural enzymatic cascade where electrostatic channeling occurs. Using a combination of molecular dynamics (MD) simulations and a Markov state model (MSM), we examined the transport 94 process of the intermediate oxaloacetate (OAA) from MDH to CS. The MSM enables the identification of the dominant transport pathways of OAA from MDH to CS. Analysis of all pathways using a hub score approach reveals a small set of residues that control OAA transport. This set includes an arginine residue previously identified experimentally. MSM analysis of a mutated complex, where the identified arginine is replaced by alanine, led to a 2-fold decrease in transfer efficiency, also consistent with experimental results. This work provides a molecular- level understanding of the electrostatic channeling Overall, chapter 2 and 3 together provide a comprehensive understanding of electrostatic channeling at atomic level. Chapter 5 studies the interactions and configurational change of enzymes involved in a cascade reaction with electrostatic channeling. In particular, we studied the system of HK-bridge-G6PDH complex where enzymes and the bridge are not linked. From the results of coarse grained molecular dynamics simulation, we identified that HK and G6PDH are preferred to remain a close proximity after a significant rotation of G6PDH, which results in a significant increase in the interaction energy. Chapter 4 looks into the mechanism of molecular tunnel in asparagine synthetase. We studied the transport mechanism of NH3 in the molecular tunnel in asparagine synthetase (AS). The energy profile constructed from umbrella sampling indicates that four energy maxima exist. The highest energy barrier separates the first site basin from the mid-tunnel basin. This bottleneck is associated with the narrowest part of the tunnel formed by four residues. Compared to the energy profile for the mutant A388L proposed in experiments, the mutation leads to an even higher energy barrier located at the lower part of the tunnel that prevents NH3 passing through and significantly reduced the tunneling probability. This work helps in the rational design of efficient artificial cascade reactions utilizing molecular tunnels. 95 Chapter 5 examines the enzymatic interaction between HK and G6PDH which benefited from electrostatic channeling at transferring G6P when incorporating a positive peptide bridge. HK and G6PDH preferred to stay close with a minimum distance below 0.5 nm. The shortened distance increased the interaction energy of HK and G6PDH. The work helps in understanding the enzymatic interactions for artificial cascade reactions. In this work, we studied the intermediate transport mechanism in enzymatic cascade reactions to learn the structural characteristics for efficient channeling. In future, it is possible to construct artificial channeling surfaces that will facilitate the efficient channeling of intermediates in cascade reactions. These surfaces can be developed based on the principles identified in this study, which include the incorporation of robust electrostatic interactions, the construction of oppositely charged continuous surfaces, and the avoidance of bottleneck residues. This thesis serves as a guide for future studies on substrate channeling, which promises to be important in cascade reactions with the potential applications in biofuel cells, biosensors, and chemical synthesis. 96 BIBLIOGRAPHY 1. C. J. Li and B. M. Trost, "Green chemistry for chemical synthesis", Proceedings of the National Academy of Sciences of the United States of America, (2008). doi:10.1073/pnas.0804348105 2. J. Muschiol, C. Peters, N. Oberleitner, M. D. Mihovilovic, U. T. Bornscheuer and F. Rudroff, "Cascade catalysis – strategies and challenges en route to preparative synthetic biology", Chem. Commun., 51, 5798–5811 (2015). doi:10.1039/C4CC08752F 3. K. C. Nicolaou, D. J. Edmonds and P. G. Bulger, "Cascade Reactions in Total Synthesis", Angewandte Chemie International Edition, 45, 7134–7186 (2006). doi:10.1002/anie.200601872 4. I. Wheeldon, S. D. Minteer, S. Banta, S. C. Barton, P. Atanassov and M. Sigman, "Substrate channelling as an approach to cascade reactions", Nature Chem, 8, 299–309 (2016). doi:10.1038/nchem.2459 5. J. Guo, L. Yang, C. Zhao, Z. Gao, Y.-Y. Song and P. Schmuki, "Constructing a photo- enzymatic cascade reaction and its in situ monitoring: enzymes hierarchically trapped in titania meso-porous MOFs as a new photosynthesis platform", J. Mater. Chem. A, 9, 14911– 14919 (2021). doi:10.1039/D1TA04009J 6. G. Hrazdina and R. A. Jensen, "Spatial Organization of Enzymes in Plant Metabolic Pathways", Annual Review of Plant Physiology and Plant Molecular Biology, 43, 241–267 (1992). doi:10.1146/annurev.pp.43.060192.001325 7. Y. Liu, J. Du, M. Yan, M. Y. Lau, J. Hu, H. Han, O. O. Yang, S. Liang, W. Wei, H. Wang, J. Li, X. Zhu, L. Shi, W. Chen, C. Ji and Y. Lu, "Biomimetic enzyme nanocomplexes and their use as antidotes and preventive measures for alcohol intoxication", Nature Nanotechnology, (2013). doi:10.1038/nnano.2012.264 8. A. Zakharchenko, N. Guz, A. M. Laradji, E. Katz and S. Minko, "Magnetic field remotely controlled selective biocatalysis", Nature Catalysis, (2018). doi:10.1038/s41929-017-0003-3 9. Y. Zhang, M. Baekgaard-Laursen and B. Städler, "Small Subcompartmentalized Microreactors as Support for Hepatocytes", Advanced Healthcare Materials, (2017). doi:10.1002/adhm.201601141 10. S. K. Kuk, R. K. Singh, D. H. Nam, R. Singh, J. K. Lee and C. B. Park, "Photoelectrochemical Reduction of Carbon Dioxide to Methanol through a Highly Efficient Enzyme Cascade", Angewandte Chemie - International Edition, (2017). doi:10.1002/anie.201611379 11. Y. Zhang, Q. Wang and H. Hess, "Increasing Enzyme Cascade Throughput by pH- Engineering the Microenvironment of Individual Enzymes", ACS Catalysis, (2017). doi:10.1021/acscatal.6b03431 97 12. F. C. Macazo and S. D. Minteer, "Enzyme cascades in biofuel cells", Current Opinion in Electrochemistry, 5, 114–120 (2017). doi:10.1016/j.coelec.2017.07.010 13. Y. S. Li, W. P. Liu, X. F. Gao, D. D. Chen and W. G. Li, "Immobilized enzymatic fluorescence capillary biosensor for determination of sulfated bile acid in urine", Biosensors and Bioelectronics, (2008). doi:10.1016/j.bios.2008.05.001 14. Y. S. Li, Y. D. Du, T. M. Chen and X. F. Gao, "A novel immobilization multienzyme glucose fluorescence capillary biosensor", Biosensors and Bioelectronics, (2010). doi:10.1016/j.bios.2009.10.035 15. A. H. Elcock, G. A. Huber and J. Andrew McCammon, "Electrostatic channeling of substrates between enzyme active sites: Comparison of simulation and experiment", Biochemistry, 36, 16049–16058 (1997). doi:10.1021/bi971709u 16. J. Wang, N. S. Nemeria, K. Chandrasekhar, S. Kumaran, P. Arjunan, S. Reynolds, G. Calero, R. Brukh, L. Kakalis, W. Furey and F. Jordan, "Structure and function of the catalytic domain of the dihydrolipoyl acetyltransferase component in Escherichia coli pyruvate dehydrogenase complex", Journal of Biological Chemistry, (2014). doi:10.1074/jbc.M113.544080 17. O. Idan and H. Hess, "Origins of activity enhancement in enzyme cascades on scaffolds", ACS Nano, 7, 8658–8665 (2013). doi:10.1021/nn402823k 18. F. Wu and S. Minteer, "Krebs cycle metabolon: Structural evidence of substrate channeling revealed by cross-linking and mass spectrometry", Angewandte Chemie - International Edition, 54, 1851–1854 (2015). doi:10.1002/anie.201409336 19. Y. M. Huang, G. A. Huber, N. Wang, S. D. Minteer and J. A. McCammon, "Brownian dynamic study of an enzyme metabolon in the TCA cycle: Substrate kinetics and channeling", Protein Science, 27, 463–471 (2018). doi:10.1002/pro.3338 20. Y. Liu, D. P. Hickey, J.-Y. Guo, E. Earl, S. Abdellaoui, R. D. Milton, M. S. Sigman, S. D. Minteer and S. Calabrese Barton, "Substrate Channeling in an Artificial Metabolon: A Molecular Dynamics Blueprint for an Experimental Peptide Bridge", ACS Catal., 7, 2486– 2493 (2017). doi:10.1021/acscatal.6b03440 21. S. Hovmöller, T. Zhou and T. Ohlson, "Conformations of amino acids in proteins", Acta Cryst D, 58, 768–776 (2002). doi:10.1107/S0907444902003359 22. J. M. Zimmerman, N. Eliezer and R. Simha, "The characterization of amino acid sequences in proteins by statistical methods", Journal of Theoretical Biology, 21, 170–201 (1968). doi:10.1016/0022-5193(68)90069-6 23. H. Xiong, B. L. Buckwalter, H. M. Shieh and M. H. Hecht, "Periodicity of polar and nonpolar amino acids is the major determinant of secondary structure in self-assembling oligomeric peptides.", Proceedings of the National Academy of Sciences, 92, 6349–6353 (1995). doi:10.1073/pnas.92.14.6349 98 24. S. Damodaran, "Food Proteins: An Overview", in Food Proteins and Their Applications (CRC Press, 1997). 25. "File:Main protein structure levels en.svg - Wikipedia", (2008). 26. F. Eisenhaber, B. Persson and P. Argos, "Protein Structure Prediction: Recognition of Primary, Secondary, and Tertiary Structural Features from Amino Acid Sequence", Critical Reviews in Biochemistry and Molecular Biology, 30, 1–94 (1995). doi:10.3109/10409239509085139 27. D. J. Cahill, "Protein and antibody arrays and their medical applications", Journal of Immunological Methods, 250, 81–91 (2001). doi:10.1016/S0022-1759(01)00325-8 28. P. K. Agarwal, "Enzymes: An integrated view of structure, dynamics and function", Microb Cell Fact, 5, 2 (2006). doi:10.1186/1475-2859-5-2 29. A. Prochiantz, "Messenger proteins: homeoproteins, TAT and others", Current Opinion in Cell Biology, 12, 400–406 (2000). doi:10.1016/S0955-0674(00)00108-3 30. L. Hedstrom, "Enzyme Specificity and Selectivity", in ELS (John Wiley & Sons, Ltd, 2001). doi:10.1038/npg.els.0000716 31. E. B. Smith, Basic Chemical Thermodynamics (Imperial College Press, 2004). 32. K. A. Johnson, "Advances in transient-state kinetics", Current Opinion in Biotechnology, 9, 87–89 (1998). doi:10.1016/S0958-1669(98)80089-X 33. J. R. Lorsch, "Chapter One - Practical Steady-State Enzyme Kinetics", in Methods in Enzymology, ed. J. Lorsch (Academic Press, 2014), pp. 3–15. doi:10.1016/B978-0-12- 420070-8.00001-5 34. J.-Q. Fan, "A counterintuitive approach to treat enzyme deficiencies: use of enzyme inhibitors for restoring mutant enzyme activity", 389, 1–11 (2008). doi:10.1515/BC.2008.009 35. E. Ricca, B. Brucher and J. H. Schrittwieser, "Multi-Enzymatic Cascade Reactions: Overview and Perspectives", Advanced Synthesis & Catalysis, 353, 2239–2262 (2011). doi:10.1002/adsc.201100256 36. E. T. Hwang and S. Lee, "Multienzymatic Cascade Reactions via Enzyme Complex by Immobilization", ACS Catal., 9, 4402–4425 (2019). doi:10.1021/acscatal.8b04921 37. N. S. Chandel, "Glycolysis", Cold Spring Harb Perspect Biol, 13, a040535 (2021). doi:10.1101/cshperspect.a040535 38. C. Schnarrenberger and W. Martin, "Evolution of the enzymes of the citric acid cycle and the glyoxylate cycle of higher plants", European Journal of Biochemistry, 269, 868–883 (2002). doi:10.1046/j.0014-2956.2001.02722.x 99 39. N. Raimundo, B. E. Baysal and G. S. Shadel, "Revisiting the TCA cycle: signaling to tumor formation", Trends in Molecular Medicine, 17, 641–649 (2011). doi:10.1016/j.molmed.2011.06.001 40. B. Bulutoglu, K. E. Garcia, F. Wu, S. D. Minteer and S. Banta, "Direct Evidence for Metabolon Formation and Substrate Channeling in Recombinant TCA Cycle Enzymes", ACS Chem. Biol., 11, 2847–2853 (2016). doi:10.1021/acschembio.6b00523 41. W. Guo, J. Sheng and X. Feng, "Mini-review: In vitro Metabolic Engineering for Biomanufacturing of High-value Products", Computational and Structural Biotechnology Journal, 15, 161–167 (2017). doi:10.1016/j.csbj.2017.01.006 42. E. W. Miles, S. Rhee and D. R. Davies, "The Molecular Basis of Substrate Channeling", Journal of Biological Chemistry , 274, 12193–12196 (1999). doi:10.1074/jbc.274.18.12193 43. F. M. Raushel, J. B. Thoden and H. M. Holden, "Enzymes with molecular tunnels", Accounts of Chemical Research, 36, 539–548 (2003). doi:10.1021/ar020047k 44. S. Singh and R. Anand, "Tunnel Architectures in Enzyme Systems that Transport Gaseous Substrates", ACS Omega, 6, 33274–33283 (2021). doi:10.1021/acsomega.1c05430 45. C. C. Hyde, S. A. Ahmed, E. A. Padlan, E. W. Milesl and D. R. Davies, "Three-dimensional Structure of the Tryptophan Synthase", Journal of Biological Chemistry, 263, 17857–17871 (1988). 46. K. S. Anderson, E. W. Miles and K. A. Johnson, "Serine modulates substrate channeling in tryptophan synthase. A novel intersubunit triggering mechanism", Journal of Biological Chemistry, 266, 8020–8033 (1991). 47. Y. Fan, L. Lund, L. Yang, F. M. Raushel and Y.-Q. Gao, "Mechanism for the Transport of Ammonia within Carbamoyl Phosphate Synthetase Determined by Molecular Dynamics Simulations", Biochemistry, 47, 2935–2944 (2008). doi:10/bc5gpn 48. X. S. Wang, A. E. Roitberg and N. G. J. Richards, "Computational Studies of Ammonia Channel Function in Glutamine 5′-Phosphoribosylpyrophosphate Amidotransferase", Biochemistry, 48, 12272–12282 (2009). doi:10/cp7v7c 49. T. M. Larsen, S. K. Boehlein, S. M. Schuster, N. G. J. Richards, J. B. Thoden, H. M. Holden and I. Rayment, "Three-Dimensional Structure of Escherichia coli Asparagine Synthetase B: A Short Journey from Substrate to Product † , ‡", Biochemistry, 38, 16146–16157 (1999). doi:10/fmb5sr 50. C. Binda, R. T. Bossi, S. Wakatsuki, S. Arzt, A. Coda, B. Curti, M. A. Vanoni and A. Mattevi, "Cross-Talk and Ammonia Channeling between Active Centers in the Unexpected Domain Arrangement of Glutamate Synthase", Structure, 8, 1299–1308 (2000). doi:10.1016/S0969-2126(00)00540-2 100 51. M. Trujillo, R. G. K. Donald, D. S. Roos, P. J. Greene and D. V. Santi, "Heterologous expression and characterization of the bifunctional dihydrofolate reductase-thymidylate synthase enzyme of Toxoplasma gondii", Biochemistry, 35, 6366–6374 (1996). doi:10.1021/bi952923q 52. B. Brena, P. González-Pombo and F. Batista-Viera, "Immobilization of Enzymes: A Literature Survey", in Immobilization of Enzymes and Cells: Third Edition, ed. J. M. Guisan (Humana Press, Totowa, NJ, 2013), pp. 15–31. doi:10.1007/978-1-62703-550-7_2 53. Synthesis of polymeric microcapsule arrays and their use for enzyme immobilization | Nature, n.d., Available at: https://www.nature.com/articles/369298a0, Accessed: 08-Feb- 2023. 54. J. Fu, Y. R. Yang, A. Johnson-Buck, M. Liu, Y. Liu, N. G. Walter, N. W. Woodbury and H. Yan, "Multi-enzyme complexes on DNA scaffolds capable of substrate channelling with an artificial swinging arm", Nature Nanotechnology, (2014). doi:10.1038/nnano.2014.100 55. Z. Liu, S. Cao, M. Liu, W. Kang and J. Xia, "Self-Assembled Multienzyme Nanostructures on Synthetic Protein Scaffolds", ACS Nano, 13, 11343–11352 (2019). doi:10.1021/acsnano.9b04554 56. P. Bauler, G. Huber, T. Leyh and J. A. McCammon, "Channeling by Proximity: The Catalytic Advantages of Active Site Colocalization Using Brownian Dynamics", J. Phys. Chem. Lett., 1, 1332–1335 (2010). doi:10.1021/jz1002007 57. M. B. Quin, K. K. Wallin, G. Zhang and C. Schmidt-Dannert, "Spatial organization of multi- enzyme biocatalytic cascades", Organic and Biomolecular Chemistry, 15, 4260–4271 (2017). doi:10.1039/c7ob00391a 58. J. Fu, M. Liu, Y. Liu, N. W. Woodbury and H. Yan, "Interenzyme Substrate Diffusion for an Enzyme Cascade Organized on Spatially Addressable DNA Nanostructures", J. Am. Chem. Soc., 134, 5516–5519 (2012). doi:10.1021/ja300897h 59. Y. Zhang, S. Tsitkov and H. Hess, "Proximity does not contribute to activity enhancement in the glucose oxidase–horseradish peroxidase cascade", Nature Communications, 7, 13982 (2016). 60. S. Kim and J.-S. Hahn, "Synthetic scaffold based on a cohesin–dockerin interaction for improved production of 2,3-butanediol in Saccharomyces cerevisiae", Journal of Biotechnology, 192, 192–196 (2014). doi:10.1016/J.JBIOTEC.2014.10.015 61. S. F. M. van Dongen, M. Nallani, J. J. L. M. Cornelissen, R. J. M. Nolte and J. C. M. van Hest, "A Three-Enzyme Cascade Reaction through Positional Assembly of Enzymes in a Polymersome Nanoreactor", Chemistry – A European Journal, 15, 1107–1114 (2009). doi:10.1002/chem.200802114 101 62. Q. Ji, B. Wang, J. Tan, L. Zhu and L. Li, "Immobilized multienzymatic systems for catalysis of cascade reactions", Process Biochemistry, 51, 1193–1203 (2016). doi:10.1016/j.procbio.2016.06.004 63. H. O. Spivey and J. Ovádi, "Substrate Channeling", Methods, 19, 306–321 (1999). doi:10.1006/meth.1999.0858 64. W. R. McClure, "A Kinetic Analysis of Coupled Enzyme Assays", Biochemistry, 8, 2782– 2786 (1969). doi:10.1021/bi00835a014 65. H. O. Spivey and J. Ovádi, "Substrate Channeling", Methods, 19, 306–321 (1999). doi:10.1006/METH.1999.0858 66. H. Sharma, M. J. Landau, M. A. Vargo, K. A. Spasov and K. S. Anderson, "First Three- Dimensional Structure of Toxoplasma gondii Thymidylate Synthase–Dihydrofolate Reductase: Insights for Catalysis, Interdomain Interactions, and Substrate Channeling", Biochemistry, 52, 7305–7317 (2013). doi:10.1021/bi400576t 67. R. Sharpley, "Continuum model", in Encyclopedia of Tourism, eds. J. Jafari and H. Xiao (Springer International Publishing, Cham, 2016), pp. 190–191. doi:10.1007/978-3-319- 01384-8_645 68. B. Liu, Y. Huang, H. Jiang, S. Qu and K. C. Hwang, "The atomic-scale finite element method", Computer Methods in Applied Mechanics and Engineering, 193, 1849–1864 (2004). doi:10.1016/j.cma.2003.12.037 69. K. S. Chavan and S. Calabrese Barton, "Simulation of Intermediate Channeling by Nanoscale Confinement", J. Phys. Chem. C, 122, 14474–14480 (2018). doi:10.1021/acs.jpcc.8b01922 70. E. Earl and S. C. Barton, "Simulation of intermediate transport in nanoscale scaffolds for multistep catalytic reactions", Phys. Chem. Chem. Phys., 19, 15463–15470 (2017). doi:10.1039/C7CP00239D 71. G. A. Huber and J. A. McCammon, "Brownian Dynamics Simulations of Biological Molecules", Trends in Chemistry, 1, 727–738 (2019). doi:10.1016/j.trechm.2019.07.008 72. D. L. Ermak and J. A. McCammon, "Brownian dynamics with hydrodynamic interactions", J. Chem. Phys., 69, 1352–1360 (1978). doi:10.1063/1.436761 73. T. Hansson, C. Oostenbrink and W. van Gunsteren, "Molecular dynamics simulations", Current Opinion in Structural Biology, 12, 190–196 (2002). doi:10.1016/S0959- 440X(02)00308-1 74. M. Karplus and G. A. Petsko, "Molecular dynamics simulations in biology", Nature, 347, 631–639 (1990). doi:10.1038/347631a0 102 75. J. Kästner, "Umbrella sampling", WIREs Computational Molecular Science, 1, 932–942 (2011). doi:10.1002/wcms.66 76. and and, "Calculation of Collective Variable-based PMF by Combining WHAM with Umbrella Sampling", Chinese Phys. Lett., 29, 068702 (2012). doi:10.1088/0256- 307X/29/6/068702 77. Y. Lin, Z. Cao and Y. Mo, "Molecular Dynamics Simulations on the Escherichia coli Ammonia Channel Protein AmtB: Mechanism of Ammonia/Ammonium Transport", J. Am. Chem. Soc., 128, 10876–10884 (2006). doi:10.1021/ja0631549 78. Y. Liu, I. Matanovic, D. P. Hickey, S. D. Minteer, P. Atanassov and S. C. Barton, "Cascade Kinetics of an Artificial Metabolon by Molecular Dynamics and Kinetic Monte Carlo", ACS Catal., 8, 7719–7726 (2018). doi:10.1021/acscatal.8b01041 79. B. E. Husic and V. S. Pande, "Markov State Models: From an Art to a Science", Journal of the American Chemical Society, 140, 2386–2396 (2018). doi:10.1021/jacs.7b12191 80. N. F. Polizzi, M. J. Therien and D. N. Beratan, "Mean First-Passage Times in Biology", Israel Journal of Chemistry, 56, 816–824 (2016). doi:10.1002/ijch.201600040 81. P. Metzner, C. Schütte and E. Vanden-Eijnden, "Transition Path Theory for Markov Jump Processes", Multiscale Modeling & Simulation, 7, 1192–1219 (2009). doi:10.1137/070699500 82. A. Dickson and C. L. Brooks, "Quantifying Hub-like Behavior in Protein Folding Networks", J. Chem. Theory Comput., 8, 3044–3052 (2012). doi:10.1021/ct300537s 83. F. Noé, C. Schütte, E. Vanden-Eijnden, L. Reich and T. R. Weikl, "Constructing the equilibrium ensemble of folding pathways from short off-equilibrium simulations", Proceedings of the National Academy of Sciences of the United States of America, 106, 19011–19016 (2009). doi:10.1073/pnas.0905466106 84. G. Zhou, G. A. Pantelopulos, S. Mukherjee and V. A. Voelz, "Bridging Microscopic and Macroscopic Mechanisms of p53-MDM2 Binding with Kinetic Network Models", Biophysical Journal, 113, 785–793 (2017). doi:10.1016/j.bpj.2017.07.009 85. A. Nunes-Alves, D. M. Zuckerman and G. M. Arantes, "Escape of a Small Molecule from Inside T4 Lysozyme by Multiple Pathways", Biophysical Journal, 114, 1058–1066 (2018). doi:10.1016/j.bpj.2018.01.014 86. H. Tian, F. Trozzi, B. D. Zoltowski and P. Tao, "Deciphering the Allosteric Process of the Phaeodactylum tricornutum Aureochrome 1a LOV Domain", Journal of Physical Chemistry B, 124, 8960–8972 (2020). doi:10.1021/acs.jpcb.0c05842 87. Y. Liu, D. P. Hickey, S. D. Minteer, A. Dickson and S. Calabrese Barton, "Markov-State Transition Path Analysis of Electrostatic Channeling", J. Phys. Chem. C, 123, 15284–15292 (2019). doi:10/ghfprh 103 88. A. Barducci, M. Bonomi and M. Parrinello, "Metadynamics", WIREs Computational Molecular Science, 1, 826–843 (2011). doi:10.1002/wcms.31 89. A. Barducci, G. Bussi and M. Parrinello, "Well-Tempered Metadynamics: A Smoothly Converging and Tunable Free-Energy Method", Physical Review Letters, 100, 20603 (2008). doi:10.1103/PhysRevLett.100.020603 90. A. L. Parrill and K. B. Lipkowitz, Reviews in Computational Chemistry, Volume 28 (John Wiley & Sons, 2015). 91. P. Tiwary and M. Parrinello, "From metadynamics to dynamics", Physical Review Letters, 111, 1–5 (2013). doi:10.1103/PhysRevLett.111.230602 92. R. Casasnovas, V. Limongelli, P. Tiwary, P. Carloni and M. Parrinello, "Unbinding Kinetics of a p38 MAP Kinase Type II Inhibitor from Metadynamics Simulations", Journal of the American Chemical Society, 139, 4780–4788 (2017). doi:10.1021/jacs.6b12950 93. Y. Zhou, R. Zou, G. Kuang, B. Långström, C. Halldin, H. Ågren and Y. Tu, "Enhanced Sampling Simulations of Ligand Unbinding Kinetics Controlled by Protein Conformational Changes", Journal of Chemical Information and Modeling, 59, 3910–3918 (2019). doi:10.1021/acs.jcim.9b00523 94. Z. F. Brotzakis, V. Limongelli and M. Parrinello, "Accelerating the Calculation of Protein- Ligand Binding Free Energy and Residence Times Using Dynamically Optimized Collective Variables", Journal of Chemical Theory and Computation, 15, 743–750 (2019). doi:10.1021/acs.jctc.8b00934 95. A. P. J. Jansen, An Introduction to Kinetic Monte Carlo Simulations of Surface Reactions (Springer, 2012). 96. A. F. Voter, "INTRODUCTION TO THE KINETIC MONTE CARLO METHOD", in Radiation Effects in Solids, eds. K. E. Sickafus, E. A. Kotomin and B. P. Uberuaga (Springer Netherlands, Dordrecht, 2007), pp. 1–23. doi:10.1007/978-1-4020-5295-8_1 97. S. O. Nielsen, C. F. Lopez, G. Srinivas and M. L. Klein, "Coarse grain models and the computer simulation of soft materials", J. Phys.: Condens. Matter, 16, R481 (2004). doi:10.1088/0953-8984/16/15/R03 98. P. J. Bond, J. Holyoake, A. Ivetac, S. Khalid and M. S. P. Sansom, "Coarse-grained molecular dynamics simulations of membrane proteins and peptides", Journal of Structural Biology, 157, 593–605 (2007). doi:10.1016/j.jsb.2006.10.004 99. S. Kmiecik, D. Gront, M. Kolinski, L. Wieteska, A. E. Dawid and A. Kolinski, "Coarse- Grained Protein Models and Their Applications", Chem. Rev., 116, 7898–7936 (2016). doi:10.1021/acs.chemrev.6b00163 104 100.The MARTINI Coarse-Grained Force Field: Extension to Proteins | Journal of Chemical Theory and Computation, n.d., Available at: https://pubs.acs.org/doi/full/10.1021/ct700324x, Accessed: 08-Feb-2023. 101.J. T. Padding and W. J. Briels, "Time and length scales of polymer melts studied by coarse- grained molecular dynamics simulations", J. Chem. Phys., 117, 925–943 (2002). doi:10.1063/1.1481859 102.H. J. Risselada and S. J. Marrink, "Curvature effects on lipid packing and dynamics in liposomes revealed by coarse grained molecular dynamics simulations", Phys. Chem. Chem. Phys., 11, 2056 (2009). doi:10.1039/b818782g 103.E. Ricca, B. Brucher and J. H. Schrittwieser, "Multi-enzymatic cascade reactions: Overview and perspectives", Advanced Synthesis and Catalysis, 353, 2239–2262 (2011). doi:10.1002/adsc.201100256 104.T. R. Schneider, E. Gerhardt, M. Lee, P.-H. Liang, K. S. Anderson and I. Schlichting, "Loop Closure and Intersubunit Communication in Tryptophan Synthase,", Biochemistry, 37, 5394– 5406 (1998). doi:10.1021/bi9728957 105.S. Vijayakrishnan, S. M. Kelly, R. J. C. Gilbert, P. Callow, D. Bhella, T. Forsyth, J. G. Lindsay and O. Byron, "Solution structure and characterisation of the human pyruvate dehydrogenase complex core assembly", Journal of Molecular Biology, 399, 71–93 (2010). doi:10.1016/j.jmb.2010.03.043 106.J. Fu, M. Liu, Y. Liu, N. W. Woodbury and H. Yan, "Interenzyme substrate diffusion for an enzyme cascade organized on spatially addressable DNA nanostructures", Journal of the American Chemical Society, 134, 5516–5519 (2012). doi:10.1021/ja300897h 107.A. H. Elcock, M. J. Potter, D. A. Matthews, D. R. Knighton and J. A. McCammon, "Electrostatic Channeling in the Bifunctional Enzyme Dihydrofolate Reductase-thymidylate Synthase", Journal of Molecular Biology, 262, 370–374 (1996). doi:https://doi.org/10.1006/jmbi.1996.0520 108.C. Eun, P. M. Kekenes-Huskey, V. T. Metzger and J. A. McCammon, "A model study of sequential enzyme reactions and electrostatic channeling", The Journal of Chemical Physics, 140, 105101 (2014). doi:10.1063/1.4867286 109.D. R. Knighton, C.-C. Kan, E. Howland, C. A. Janson, Z. Hostomska, K. M. Welsh and D. A. Matthews, "Structure of and kinetic channelling in bifunctional dihydrofolate reductase– thymidylate synthase", Nature Structural Biology, 1, 186–194 (1994). doi:10.1038/nsb0394- 186 110.A. H. Elcock, G. A. Huber, J. Andrew McCammon and J. A. McCammon, "Electrostatic Channeling of Substrates between Enzyme Active Sites: Comparison of Simulation and Experiment", Biochemistry, 36, 16049–16058 (1997). doi:10.1021/bi971709u 105 111.F. Wu and S. Minteer, "Krebs cycle metabolon: Structural evidence of substrate channeling revealed by cross-linking and mass spectrometry", Angewandte Chemie - International Edition, 54, 1851–1854 (2015). doi:10.1002/anie.201409336 112.O. I. Wilner, Y. Weizmann, R. Gill, O. Lioubashevski, R. Freeman and I. Willner, "Enzyme cascades activated on topologically programmed DNA scaffolds.", Nature Nanotechnology, 4, 249–254 (2009). doi:10.1038/nnano.2009.50 113.V. Linko, M. Eerikäinen and M. A. Kostiainen, "A modular DNA origami-based enzyme cascade nanoreactor", Chemical Communications, 51, 5351–5354 (2015). doi:10.1039/c4cc08472a 114.F. Jia, B. Narasimhan and S. K. Mallapragada, "Biomimetic Multienzyme Complexes Based on Nanoscale Platforms", AIChE Journal, 59, 355–360 (2013). doi:10.1002/aic.13992 115.J. L. Lin, J. Zhu and I. Wheeldon, "Synthetic Protein Scaffolds for Biosynthetic Pathway Colocalization on Lipid Droplet Membranes", ACS Synthetic Biology, 6, 1534–1544 (2017). doi:10.1021/acssynbio.7b00041 116.Y. Elani, R. V. Law and O. Ces, "Vesicle-based artificial cells as chemical microreactors with spatially segregated reaction pathways", Nature Communications, 5, 1–5 (2014). doi:10.1038/ncomms6305 117.H. Yuan, H. Bai, L. Liu, F. Lv and S. Wang, "A glucose-powered antimicrobial system using organic–inorganic assembled network materials", Chemical Communications, 51, 722– 724 (2015). doi:10.1039/C4CC07533A 118.C. G. Palivan, R. Goers, A. Najer, X. Zhang, A. Car and W. Meier, "Bioinspired polymer vesicles and membranes for biological and medical applications", Chemical Society Reviews, 45, 377–411 (2016). doi:10.1039/C5CS00569H 119.D. Frigyes, F. Alber, S. Pongor and P. Carloni, "Arginine–phosphate salt bridges in protein– DNA complexes: a Car–Parrinello study", Journal of Molecular Structure: THEOCHEM, 574, 39–45 (2001). doi:https://doi.org/10.1016/S0166-1280(01)00368-2 120.J. A. Lemkul and D. R. Bevan, "Assessing the Stability of Alzheimer’s Amyloid Protofibrils Using Molecular Dynamics", The Journal of Physical Chemistry B, 114, 1652–1660 (2010). doi:10.1021/jp9110794 121.M. D. Hanwell, D. E. Curtis, D. C. Lonie, T. Vandermeersch, E. Zurek and G. R. Hutchison, "Avogadro: an advanced semantic chemical editor, visualization, and analysis platform", Journal of Cheminformatics, 4, 17 (2012). doi:10.1186/1758-2946-4-17 122.J. J. Irwin and B. K. Shoichet, "ZINC - A free database of commercially available compounds for virtual screening", Journal of Chemical Information and Modeling, 45, 177– 182 (2005). doi:10.1021/ci049714+ 106 123.M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess and E. Lindah, "Gromacs: High performance molecular simulations through multi-level parallelism from laptops to supercomputers", SoftwareX, 1–2, 19–25 (2015). doi:10.1016/j.softx.2015.06.001 124.J. Huang and A. D. MacKerell Jr, "CHARMM36 all-atom additive protein force field: Validation based on comparison to NMR data", Journal of Computational Chemistry, 34, 2135–2145 (2013). doi:10.1002/jcc.23354 125.K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Darian, O. Guvench, P. Lopes, I. Vorobyov and A. D. Mackerell, "CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields", Journal of Computational Chemistry, 31, 671–690 (2010). doi:10.1002/jcc.21367 126.G. A. Tribello, M. Bonomi, D. Branduardi, C. Camilloni and G. Bussi, "PLUMED 2: New feathers for an old bird", Computer Physics Communications, 185, 604–613 (2014). doi:https://doi.org/10.1016/j.cpc.2013.09.018 127.M. Salvalaglio, P. Tiwary and M. Parrinello, "Assessing the Reliability of the Dynamics Reconstructed from Metadynamics", J. Chem. Theory Comput, 10, 47 (2014). doi:10.1021/ct500040r 128.A. Grossfield, "An implementation of WHAM: the weighted histogram analysis method", Analysis, 1–13 (2004). 129.F. J. Massey, "The Kolmogorov-Smirnov Test for Goodness of Fit", Journal of the American Statistical Association, 46, 68–78 (1951). doi:10.1080/01621459.1951.10500769 130.P. A. Srere, "The metabolon", Trends in Biochemical Sciences, 10, 109–110 (1985). doi:10.1016/0968-0004(85)90266-X 131.J. Ovádi, "Physiological significance of metabolic channelling", Journal of Theoretical Biology, 152, 1–22 (1991). doi:10.1016/S0022-5193(05)80500-4 132.A. H. Elcock, M. J. Potter, D. A. Matthews, D. R. Knighton and J. A. A. McCammon, "Electrostatic Channeling in the Bifunctional Enzyme Dihydrofolate Reductase-thymidylate Synthase", Journal of Molecular Biology, 262, 370–374 (1996). doi:https://doi.org/10.1006/jmbi.1996.0520 133.K. Shatalin, S. Lebreton, M. Rault-Leonardon, C. Vélot and P. A. Srere, "Electrostatic channeling of oxaloacetate in a fusion protein of porcine citrate synthase and porcine mitochondrial malate dehydrogenase", Biochemistry, 38, 881–889 (1999). doi:10.1021/bi982195h 134.Y. Cheng, C. E. A. Chang, Z. Yu, Y. Zhang, M. Sun, T. S. Leyh, M. J. Holst and J. A. McCammon, "Diffusional channeling in the sulfate-activating complex: Combined continuum modeling and coarse-grained Brownian dynamics studies", Biophysical Journal, 95, 4659–4667 (2008). doi:10.1529/biophysj.108.140038 107 135.R. W. Guynn, H. J. Gelberg and R. L. Veech, "Equilibrium Constants of the Malate Dehydrogenase, Citrate Synthase, Citrate Lyase, and Acetyl Coenzyme A Hydrolysis Reactions under Physiological Conditions", J. Biol. Chem., 248, 6957–6965 (1973). 136.Y. Xie and S. C. Barton, "Infrequent metadynamics study of rare-event electrostatic channeling", Physical Chemistry Chemical Physics, 23, 13381–13388 (2021). doi:10.1039/D1CP01304A 137.Schrödinger, LLC, The {PyMOL} Molecular Graphics System, Version~1.8 (2015). 138.J. J. Irwin and B. K. Shoichet, "ZINC - A free database of commercially available compounds for virtual screening", Journal of Chemical Information and Modeling, 45, 177– 182 (2005). doi:10.1021/ci049714+ 139.K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Darian, O. Guvench, P. Lopes, I. Vorobyov and A. D. Mackerell, "CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields", Journal of Computational Chemistry, 31, 671–690 (2010). doi:10.1002/jcc.21367 140.E. Jurrus, D. Engel, K. Star, K. Monson, J. Brandi, L. E. Felberg, D. H. Brookes, L. Wilson, J. Chen, K. Liles, M. Chun, P. Li, D. W. Gohara, T. Dolinsky, R. Konecny, D. R. Koes, J. E. Nielsen, T. Head-Gordon, W. Geng, R. Krasny, G. W. Wei, M. J. Holst, J. A. McCammon and N. A. Baker, "Improvements to the APBS biomolecular solvation software suite", Protein Science, 27, 112–128 (2018). doi:10.1002/pro.3280 141.V. S. Pande, K. Beauchamp and G. R. Bowman, "Everything you wanted to know about Markov State Models but were afraid to ask", Methods, 52, 99–105 (2010). doi:10.1016/j.ymeth.2010.06.002 142.N. Michaud-Agrawal, E. J. Denning, T. B. Woolf and O. Beckstein, "MDAnalysis: A toolkit for the analysis of molecular dynamics simulations", Journal of Computational Chemistry, 32, 2319–2327 (2011). doi:10.1002/JCC.21787 143.R. J. Gowers, M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler, J. Domański, D. L. Dotson, S. Buchoux, I. M. Kenney and O. Beckstein, "MDAnalysis: A Python Package for the Rapid Analysis of Molecular Dynamics Simulations", Proceedings of the 15th Python in Science Conference, 98–105 (2016). doi:10.25080/MAJORA-629E541A-00E 144.M. K. Scherer, B. Trendelkamp-Schroer, F. Paul, G. Pérez-Hernández, M. Hoffmann, N. Plattner, C. Wehmeyer, J. H. Prinz and F. Noé, "PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models", Journal of Chemical Theory and Computation, 11, 5525–5542 (2015). doi:10.1021/acs.jctc.5b00743 145.E. Suárez, R. P. Wiewiora, C. Wehmeyer, F. Noé, J. D. Chodera and D. M. Zuckerman, "What Markov state models can and cannot do: Correlation versus path-based observables in protein folding models", 2020.11.09.374496 (2020). doi:10.1101/2020.11.09.374496 108 146.P. Metzner, C. Schütte and E. Vanden-Eijnden, "Transition Path Theory for Markov Jump Processes", Multiscale Modeling & Simulation, 7, 1192–1219 (2009). doi:10.1137/070699500 147.R. STINE, "An Introduction to Bootstrap Methods: Examples and Ideas", Sociological Methods & Research, 18, 243–291 (1989). doi:10.1177/0049124189018002003 148.K. S. Anderson, A. Y. Kim, J. M. Quillen, E. Sayers, X.-J. Yang and E. W. Miles, "Kinetic Characterization of Channel Impaired Mutants of Tryptophan Synthase", Journal of Biological Chemistry, 270, 29936–29944 (1995). doi:10.1074/jbc.270.50.29936 149.M. F. Dunn, D. Niks, H. Ngo, T. R. M. Barends and I. Schlichting, "Tryptophan synthase: the workings of a channeling nanomachine", Trends in Biochemical Sciences, 33, 254–264 (2008). doi:10.1016/j.tibs.2008.04.008 150.S. Mouilleron and B. Golinelli-Pimpaneau, "Conformational changes in ammonia- channeling glutamine amidotransferases", Current Opinion in Structural Biology, 17, 653– 664 (2007). doi:10.1016/j.sbi.2007.09.003 151.X. Tan, A. Volbeda, J. C. Fontecilla-Camps and P. A. Lindahl, "Function of the tunnel in acetylcoenzyme A synthase/carbon monoxide dehydrogenase", J Biol Inorg Chem, 11, 371– 378 (2006). doi:10.1007/s00775-006-0086-9 152.B. A. Manjasetty, J. Powlowski and A. Vrielink, "Crystal structure of a bifunctional aldolase–dehydrogenase: Sequestering a reactive and volatile intermediate", Proceedings of the National Academy of Sciences, 100, 6992–6997 (2003). doi:10.1073/pnas.1236794100 153.D. Kohls, T. Sulea, E. O. Purisima, R. E. MacKenzie and A. Vrielink, "The crystal structure of the formiminotransferase domain of formiminotransferase-cyclodeaminase: implications for substrate channeling in a bifunctional enzyme", Structure, 8, 35–46 (2000). doi:10.1016/S0969-2126(00)00078-2 154.C. L. Lomelino, J. T. Andring, R. McKenna and M. S. Kilberg, "Asparagine synthetase: Function, structure, and role in disease", Journal of Biological Chemistry, 292, 19952–19958 (2017). doi:10.1074/jbc.R117.819060 155.L. Gaufichon, M. Reisdorf-Cren, S. J. Rothstein, F. Chardon and A. Suzuki, "Biological functions of asparagine synthetase in plants", Plant Science, 179, 141–153 (2010). doi:10.1016/j.plantsci.2010.04.010 156.K. K. Li, Beeson, I. Ghiviriga and N. G. J. Richards, "A Convenient gHMQC-Based NMR Assay for Investigating Ammonia Channeling in Glutamine-Dependent Amidotransferases: Studies of Escherichia coli Asparagine Synthetase B", Biochemistry, 46, 4840–4849 (2007). doi:10.1021/bi700145t 157.K. Li, Intramolecular tunnel and regulatory mechanisms of asparagine synthetase (ASNS) (Ph.D., University of Florida, n.d.). 109 158.K. S. Chavan and S. C. Barton, "Confinement and Diffusion of Small Molecules in a Molecular-Scale Tunnel", J. Electrochem. Soc., 167, 023505 (2020). doi:10.1149/1945- 7111/ab6dd2 159.J. J. Irwin and B. K. Shoichet, "ZINC – A Free Database of Commercially Available Compounds for Virtual Screening", J Chem Inf Model, 45, 177–182 (2005). doi:10.1021/ci049714 160.K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Darian, O. Guvench, P. Lopes, I. Vorobyov and A. D. MacKerell, "CHARMM General Force Field (CGenFF): A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields", J Comput Chem, 31, 671–690 (2010). doi:10.1002/jcc.21367 161.E. Chovancova, A. Pavelka, P. Benes, O. Strnad, J. Brezovsky, B. Kozlikova, A. Gora, V. Sustr, M. Klvana, P. Medek, L. Biedermannova, J. Sochor and J. Damborsky, "CAVER 3.0: A Tool for the Analysis of Transport Pathways in Dynamic Protein Structures", PLOS Computational Biology, 8, e1002708 (2012). doi:10.1371/journal.pcbi.1002708 162.Y. Xie, S. D. Minteer, S. Banta and S. C. Barton, "Markov State Study of Electrostatic Channeling within the Tricarboxylic Acid Cycle Supercomplex", ACS Nanosci. Au, 2, 414– 421 (2022). doi:10.1021/acsnanoscienceau.2c00011 163.W. Liu, Y. Zhu, Y. Wu, C. Chen, Y. Hong, Y. Yue, J. Zhang and B. Hou, "Molecular Dynamics and Machine Learning in Catalysts", Catalysts, 11, 1129 (2021). doi:10.3390/catal11091129 164.A. Y. Shih, P. L. Freddolino, A. Arkhipov and K. Schulten, "Assembly of lipoprotein particles revealed by coarse-grained molecular dynamics simulations", Journal of Structural Biology, 157, 579–592 (2007). doi:10.1016/j.jsb.2006.08.006 165.S. Hezaveh, S. Samanta, A. De Nicola, G. Milano and D. Roccatano, "Understanding the Interaction of Block Copolymers with DMPC Lipid Bilayer Using Coarse-Grained Molecular Dynamics Simulations", J. Phys. Chem. B, 116, 14333–14345 (2012). doi:10.1021/jp306565e 166.G. G. Maisuradze, P. Senet, C. Czaplewski, A. Liwo and H. A. Scheraga, "Investigation of Protein Folding by Coarse-Grained Molecular Dynamics with the UNRES Force Field", J. Phys. Chem. A, 114, 4471–4485 (2010). doi:10.1021/jp9117776 167.C. Wu and J.-E. Shea, "Coarse-grained models for protein aggregation", Current Opinion in Structural Biology, 21, 209–220 (2011). doi:10.1016/j.sbi.2011.02.002 168.K. M. Ravikumar, W. Huang and S. Yang, "Coarse-Grained Simulations of Protein-Protein Association: An Energy Landscape Perspective", Biophysical Journal, 103, 837–845 (2012). doi:10.1016/j.bpj.2012.07.013 110 169.M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess and E. Lindahl, "GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers", SoftwareX, 1–2, 19–25 (2015). doi:10.1016/j.softx.2015.06.001 170.S. J. Marrink, H. J. Risselada, S. Yefimov, D. P. Tieleman and A. H. de Vries, "The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations", J. Phys. Chem. B, 111, 7812–7824 (2007). doi:10.1021/jp071097f 171.X. Periole, M. Cavalli, S.-J. Marrink and M. A. Ceruso, "Combining an Elastic Network With a Coarse-Grained Molecular Force Field: Structure, Dynamics, and Intermolecular Recognition", J. Chem. Theory Comput., 5, 2531–2543 (2009). doi:10.1021/ct9002114 111