Michigan State University

This is to certify that the dissertation entitled THE DYNAMICS OF ENZYMATIC REACTION: COMPUTATIONAL AND EXPERIMENTAL STUDIES presented by LISHAN YAO has been accepted towards fulfillment of the requirements for the PhD degree in Chemistry.

Major Professor's Signature
Date: 4/2/04
THE DYNAMICS OF ENZYMATIC REACTIONS: COMPUTATIONAL AND EXPERIMENTAL STUDIES

By
Lishan Yao

A DISSERTATION
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY
Department of Chemistry and Department of Biochemistry and Molecular Biology
2006

ABSTRACT

THE DYNAMICS OF ENZYMATIC REACTIONS: COMPUTATIONAL AND EXPERIMENTAL STUDIES

By
Lishan Yao

The roles of protein dynamics in enzymatic catalysis have been investigated by a combination of experimental and computational approaches. The enzyme systems under study include yeast cytosine deaminase (yCD), Bacillus subtilis guanine deaminase (bGD), and Staphylococcus aureus dihydroneopterin aldolase (SaDHNA).

yCD catalyzes the hydrolytic deamination of cytosine to uracil. The enzyme is of great biomedical interest because it can also catalyze the deamination of the prodrug 5-fluorocytosine to form the anticancer drug 5-fluorouracil. A multidisciplinary approach has been employed to elucidate the catalytic mechanism of yCD. Computational studies, including quantum chemical calculations and molecular dynamics (MD) simulations, suggest a complete reaction path for the deamination reaction at atomic resolution. The reaction path involves the formation of two tetrahedral intermediates. Abstraction and addition of protons are critical for facilitating the reaction, and the carboxyl group of glutamate-64 at the catalytic center plays a critical role in shuttling the protons. One product release path was identified that involves the loop containing phenylalanine 114 (F114) and the C-terminal helix of the protein. Nuclear magnetic resonance (NMR) spectroscopy has been used to study the catalytic mechanism and dynamics of the enzyme. A series of NMR studies has been carried out to study the binding of the anticancer drug 5-fluorouracil to the enzyme.
Together with transient kinetic analysis, the NMR studies show that product release is rate-limiting in the activation of the prodrug 5-fluorocytosine by yCD. NMR studies also indicate that the binding of an intermediate state analog rigidifies the F114 loop and C-terminal helix, while the binding of product maintains their flexibility on the second-to-minute timescale. Both NMR relaxation and MD simulation studies reveal flexibility changes of the F114 loop on the picosecond-to-nanosecond timescale.

Guanine deaminase catalyzes the hydrolytic deamination of guanine and plays an important role in the regulation of the guanine nucleotide pool. A complete reaction path of the bGD-catalyzed reaction has been proposed by using a combination of a multilayer quantum mechanical method and an MD simulation method. Two residues, glutamate 55 and aspartate 114, play important roles in the proton shuttling in the bGD-catalyzed reaction. The dynamical properties of the active site have been elucidated by the MD simulations.

Dihydroneopterin aldolase catalyzes the conversion of dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin (HP) in the biosynthesis of folate. The dynamic properties of SaDHNA and its complex with the reaction product HP have been investigated by MD simulations. Flexible regions of the apo protein surrounding the active site have been identified, and the simulations show that the active site is rigidified upon binding HP. Specific residues important to product binding and the catalytic mechanism of DHNA are associated with these flexible regions, and their interactions with HP account for most of the binding energy. An exit path for HP has been found and the barrier to its release estimated.

Dedicated to itself

ACKNOWLEDGMENTS

Firstly, I would like to thank my advisors Dr. Honggao Yan and Dr. Robert I. Cukier for providing me the opportunity to study enzyme systems computationally and experimentally.
Their guidance has benefited me, and their enthusiasm for research will continue to influence me in the future. Secondly, I would like to thank the members of my guidance committee, Dr. Jim Geiger, Dr. James Harrison and Dr. Michael Garavito, for their advice and critical review of my thesis. Thirdly, I would like to thank Dr. Aizhuo Liu and Guangyu Li for their help with NMR experiments, and Dr. Stepan Sklenak and Dr. Yuxiang Bu for their help with quantum mechanical calculations. Their kindness and patience have helped me learn faster and more easily. Fourthly, I would like to thank Dr. Yue Li and Yan Wu for teaching me molecular biology and protein purification techniques. I would also like to express my gratitude to the members of Dr. Cukier's lab, Dr. Steve Seibold, Hongfeng Lou and Li Su, and the members of Dr. Yan's lab, Yi Wang, Dr. Jaroslaw Blaszczyk, Jifeng Wang and Zhengwei Lu. It has been a great pleasure to work with them.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
Chapter 1
  Introduction
  Part I. Theoretical Background
    Section 1. NMR Relaxation
    Section 2. Molecular Dynamics Simulation
    Section 3. ONIOM: A Multilayered Integrated Quantum Mechanics Method
  Part II. Studied Proteins
    Section 1. Yeast Cytosine Deaminase
    Section 2. Dihydroneopterin Aldolase
    Section 3. Guanine Deaminase
  Reference
  Appendices
Chapter 2
  Product Release Is Rate-Limiting in the Activation of the Prodrug 5-Fluorocytosine by Yeast Cytosine Deaminase
  Experimental procedures
  Results
  Discussion
  Appendices
Chapter 3
  Yeast Cytosine Deaminase Dynamics Study by NMR and Molecular Dynamics Simulation
  Introduction
  Materials and Methods
  Results
  Discussion
  References
  Appendices
Chapter 4
  The Dynamics Changes of yCD along Reaction Cycle Revealed by H/D Exchange
  Introduction
  Materials and Methods
  Results
  Summary
  Reference
  Appendices
Chapter 5
  A Molecular Dynamics Exploration of the Catalytic Mechanism of Yeast Cytosine Deaminase
  Introduction
  Methods
  Results and Discussion
  Concluding remarks
  Reference
  Appendices
Chapter 6
  Product Release Study of Yeast Cytosine Deaminase by a Combination Method of ONIOM and Molecular Dynamics Simulation
  Introduction
  Method
  Results and discussion
  Conclusion
  Reference
  Appendices
Chapter 7
  A Molecular Dynamics Study of The Ligand Release Path in Yeast Cytosine Deaminase
  Introduction
  Methods
  Results and Discussion
  Conclusions
  Reference
  Appendices
Chapter 8
  Reaction Mechanism of Guanine Deaminase: An ONIOM and Molecular Dynamics Simulation Study
  Introduction
  Methods
  Results
  Discussion
  Conclusion
  References
  Appendices
Chapter 9
  Mechanism of Dihydroneopterin Aldolase: A Molecular Dynamics Study of the Apo Enzyme and Its Product Complex
  Introduction
  Methods
  Results and Discussion
  Conclusions
  References
  Appendices

LIST OF FIGURES

Figure 1-1 Periodic boundary condition box. The box in the center is box P (primary).
Figure 1-2 Aldolase reaction catalyzed by DHNA. A represents acid, B represents base.
Figure 2-1 Stopped-flow analysis of the activation of the prodrug 5FC by yCD. (A) The time course of the reaction as monitored by OD296. The reaction mixture contained 10 μM yCD and 250 μM 5FC.
The solid line was obtained by nonlinear least square fit to an equation with an exponential and a linear term. (B) The linear dependence of the rate constant of the exponential term on the concentration of 5FC.
Figure 2-2 Quench-flow analysis of the activation of the prodrug 5FC by yCD. The solid lines were obtained by global fitting using the numerical analysis program DYNAFIT.
Figure 2-3 1H-15N HSQC spectra of yCD. Sequential assignments are indicated with residue numbers. (A) 1.5 mM yCD; (B) 1.5 mM yCD + 12 mM 5FU.
Figure 2-4 19F NMR spectra of free (unbound) 5FU and the yCD-bound 5FU. The smaller peaks near the strong peak of free 5FU in the upper spectrum are 13C satellite peaks.
Figure 2-5 19F NMR spectra of yCD labeled with 5-fluorotryptophan. (A) 1.5 mM 5-fluorotryptophan-labeled yCD; (B) 1.5 mM 5-fluorotryptophan-labeled yCD + 24.7 mM 5FU.
Figure 2-6 NMR saturation transfer analysis of the binding of 5FU to the 5-fluorotryptophan-labeled yCD. The NMR sample contained 1.5 mM 5-fluorotryptophan-labeled yCD and 24.7 mM 5FU. The data in panels A and B were obtained by irradiating the 19F NMR nuclei of the bound and free yCD, respectively. The solid lines were obtained by nonlinear least square fit of the data as described in the "Experimental Procedures" section.
Figure 3-1 15N relaxation data of yCD in the (a) apo, (b) inhibitor 5FPy bound form measured at 600 MHz and plotted vs residue.
Figure 3-2 Model-free dynamics results for the (a) apo form, (b) inhibitor 5FPy bound form of yCD from NMR 15N relaxation data.
Figure 3-3 RMSF and order parameter of the (a) apo form, (b) inhibitor 2Py bound form, (c) product bound form. The x-ray B-factors of the apo and 2Py bound forms were converted and plotted in red.
Figure 3-4 X-ray structure of yCD complexed with 2Py. Three regions, 111-117, 150-158 and 72*-81* (from the adjacent protomer), are highlighted in gold. Three residues, F114, W152 and I156, which block the active site, are also shown.
Figure 3-5 τe value calculated from MD simulation for the apo form, inhibitor 2Py complex and product complex.
Figure 3-6 Two sidechain dihedral angles defined in MD data analysis for (a) F114, (b) W152, (c) I156.
Figure 3-7 F114 sidechain dihedral angles ω1 and ω2 of the (a) apo form, (b) inhibitor 2Py bound form, (c) product uracil bound form in 10 ns MD simulation.
Figure 3-8 The sidechain dihedral angles ω1 and ω2 of W156 in apo form.
Figure 3-9 I156 sidechain dihedral angles ω1 and ω2 of the (a) apo form, (b) inhibitor 2Py bound form, (c) product uracil bound form in 10 ns MD simulation.
Figure 3-10 The sidechain conformations of F114, W152 and I156.
Figure 3-11 RMSF vs S2 plot of the (a) α helix and β sheet residues, (b) other region residues.
Figure 4-1 The deamination of cytosine/5FC to uracil/5FU through a tetrahedral intermediate
Figure 4-2 The chemical shift difference (a) in the apo and 5FPy bound form, (b) in the apo and 5FU bound form.
Figure 4-3 The colored tube plot of backbone CAs based on chemical shift difference (a) in apo and 5FPy bound form, (b) in apo and 5FU bound form. The difference increases with the color changing from blue to red. The residues with no assignment were painted white.
Figure 4-4 Exchange time for (a) apo, (b) 5FPy complex, (c) 5FU complex. A represents no exchanges (exchange time longer than ~2000 minutes (apo) or ~4000 minutes (5FPy and 5FU complex)), A represents fast exchanges (exchange time shorter than ~5 minutes)
Figure 4-5 Number of water molecules around NH within 3.5 Å from N atom in MD simulation. A represents no exchanges residues (exchange time longer than ~2000 minutes)
Figure 4-6 (a) Percentages of H-bond formation between NH and other residues. (b) Percentages of H-bond formation between NH and solvent water during 1.5 ns MD simulation
Figure 4-7 X-ray structure of yCD complexed with 2Py. Regions with significant increase of HD exchange time were highlighted in gold.
Figure 4-8 Summary of HD exchange data of yCD in (a) apo, (b) 5FPy complex and (c) 5FU complex forms at pH 7. Residues are colored according to the logarithm of protection factor: blue, residues without exchanges during experimental time (log P ≥ 7.0); red, residues with fast exchanges (log P ≤ 2.0); orange to iceblue, residues with slow exchanges (7.0 > log P > 2.0). The crystal structure of apo form subunit one was used in the picture.
Figure 4-9 The ratio between the exchange time of (a) 5FPy and apo yCD, (b) 5FU and apo yCD. A represents the binding of ligand changes the residue exchange time from no exchange to slow exchange or from slow exchange to fast exchange or from no exchange to fast exchange. A represents the reverse change.
Figure 4-10 The exchange time ratio in apo form at pH 7.16 and 8.05.
Figure 4-11 The exchange time ratio in 5FPy bound form at (a) pH 7.10 and 7.62, (b) pH 7.62 and 8.50.
Figure 4-12 The exchange time ratio in 5FU bound form at pH 7.10 and 7.50.
Figure 5-1 (a) Ribbon representation of the crystal structure of yCD drawn according to the coordinate of the 1.14 Å crystal structure.4 The figure was prepared with Molscript26 and Raster3D.27,28 (b) Schematic drawing of the coordination sphere of the catalytic zinc ion and polar interactions between yCD and a reaction intermediate analog as revealed by X-ray crystallography. The distances (Å) between the zinc ion and its ligands and between the heavy atoms involved in hydrogen bonds are obtained from one subunit (A) of the 1.14 Å crystal structure of yCD with the reaction intermediate analog complex.
Figure 5-2 Schematic representation of the path of the yCD-catalyzed reaction as revealed by our quantum chemical calculations8 and molecular dynamic simulations (this work): 1. water/cytosine active site. 2. hydroxide/Glu64H cytosine complex. 3. hydroxide/Glu64 cytosineH complex. 4. intermediate I complex. 5. intermediate II complex. 6. uracil/ammonia complex. 7. uracil/water complex.
Figure 5-3 RMSDs of CA (residues 15-158) for subunit one and subunit two compared with the corresponding crystal structures. The first 500 ps were treated as an equilibration stage, while the last 1500 ps were used to do the data analysis. Left panel: free form, right panel: intermediate analog complex.
Figure 5-4 (a) RMSDs from the crystal structure of the C-terminal helix (left panel) and (b) the Phe114 loop (right panel) in the free yCD simulation, after equilibration
Figure 5-5 RMSFs of CAs in the yCD simulations compared with the crystal structure B-factors. (a) Left panel: free yCD simulation. (b) Right panel: yCD intermediate analog complex.
Figure 5-6 Schematic representation and MD snapshots of the yCD active site. Top panel: free form. Bottom panel: intermediate analog complex form
Figure 5-7 Changes in the CB-CG-CD-OE1 dihedral angle of Glu64. (a) Free yCD MD simulation. (b) yCD water/cytosine model (see 1 in Figure 2). (c) yCD intermediate I complex (see 4 in Figure 2).
Figure 5-8 Potential of mean force for rotation around the CB-CG-CD-OE2 dihedral angle of Glu64H.
Figure 6-1 The flow chart of MD ONIOM combination used in Zn-O bond cleavage process
Figure 6-2 The ONIOM energy changes along the Zn-O4 distance scan
Figure 6-3 The rearrangement of the active site during MD simulation after the Zn-O4 bond cleavage
Figure 6-4 The average forces between Zn and O4. The bars in the plot represent the errors.
Figure 6-5 The potential of mean force along the Zn-O4 bond cleavage.
Figure 6-6 The three states defined based on PMF during Zn-O4 bond cleavage.
Figure 6-7 The distance change between Zn-OE1 during the cleavage of Zn-O4.
Figure 7-1 (a) 5-fluorouracil (5FU) (b) 5-fluorocytosine (5FC)
Figure 7-2 CA RMSFs of (a) protomer 1, (b) protomer 2 at 300 K (black) and 320 K (green). The crystal B-factors were converted to RMSFs (red). The difference (blue) is the RMSF difference of the 300 K and 320 K data.
Figure 7-3 CA RMSF plots of the first 5 PCA modes at 300 K and 320 K and their difference: (a) protomer 1, (b) protomer 2.
Figure 7-4 (a) Schematic graph of 50 trajectory snapshots of CA atoms projected out of the first 5 PCA motion modes at 300 K and 320 K. The F114 loop and C-terminal helix were labeled to give a better view.
Water molecules can diffuse into the active site at 320 K through the path in between F114 loop and C-terminal helix, but not at 300 K. (b) One snapshot of apo MD simulation at 320 K. Water molecules diffuse into the active site through the triangle mouth defined by C91, F114 and I156. (c) 3D plots of distances d1 (C91 CA-F114 CZ), d2 (C91 CA-I156 CD) and d3 (F114 CZ-I156) at 300 K (red) and 320 K (black).
Figure 7-5 (a) Total CA RMSD versus time obtained by comparing the MD snapshots with the crystal structure. (b) The average restraint force between cytosine and the protein in each window. (c) Hydrogen bond lifetimes between cytosine and the protein during the cytosine pushing along path 1.
Figure 7-6 (a) CA RMSFs in 2 ns regular MD simulation. (b) RMSF difference by comparing the first stage of cytosine pushing with the regular simulation along path 1. (c) RMSF difference by comparing the second stage with the regular simulation
Figure 7-7 Schematic graph of 50 trajectory snapshots of CA atoms projected out of the first 5 PCA motion modes: (a) Regular simulation. (b) Pushing simulation stage 1. (c) Pushing simulation stage 2
Figure 7-8 The cytosine release paths from the pushing simulation.
Figure 7-9 Mass weighted RMSD plot of F114 and I156 along the pushing path of cytosine (path 1).
Figure 7-10 (a) RMSD of all CA atoms compared with the crystal structure. (b) The average force calculated for the cytosine pushing along path 2. (c)
The CA fluctuation differences between the regular simulation and the cytosine pushing
Figure 8-1 The deamination reaction catalyzed by GD.
Figure 8-2 Active site interaction of the guanine bound complex from the ONIOM calculation.
Figure 8-3 Reaction mechanism proposed for deamination of guanine to xanthine catalyzed by GD. All the species are labeled with numbers (see text).
Figure 8-4 Zn-O bond cleavage.
Figure 8-5 The proton transfer from Zn bound water to Asp114 through a water bridge
Figure 8-6 Alternative mechanism with protonated Asp114.
Figure 8-7 The tautomerization of guanine
Figure 9-1 CA atom RMSDs from the simulations and corresponding crystal structure B-factors. (a) Apo form. (b) HP bound form.
Figure 9-2 Comparison of the RMSFs derived from the average protomer RMSF and the octamer RMSF (see text for the difference in these RMSFs). (a) Apo form. (b) HP bound form.
Figure 9-3 A representative snapshot of two adjacent subunits of apo DHNA from the MD simulation. HP is placed for the identification of the active site. One subunit is colored in green, and the other in red. The regions corresponding to large RMSF differences are highlighted in gold color.
For clarity, only one set of the regions is highlighted.
Figure 9-4 (a) RMSF differences between the apo DHNA and HP complex simulations. The differences based on the average protomer and octamer RMSF methods are displayed with solid and dashed lines, respectively. (b) RMSF differences between the apo DHNA and HP complex simulations excluding protomer 6, based on the octamer RMSF method.
Figure 9-5 Time evolution of the ψ dihedral angle of Asp50 and the φ of Thr51 in the eight subunits displayed sequentially as one trajectory.
Figure 9-6 Principal Component Analysis eigenvalues of the first 20 modes in the apo and HP complex simulations. The total fluctuation over all the modes is 96.6 Å² for the apo and 67.2 Å² for the complex simulation.
Figure 9-7 The CA atom RMSF projections onto the first principal component eigenvector for the apo and HP bound forms. The enhanced fluctuation regions of apo versus HP complex are evident and similar to those found as displayed in Figure 4.
Figure 9-8 The CA distances d1 (Tyr*54 to Ile105), d2 (Tyr*54 to Ala69) and d3 (Ala69 to Ile105) parametric on the trajectory for the apo (blue) and HP bound (red) forms. There is a major and a minor component for the apo form. The HP bound form distance fluctuations are smaller and fairly well confined to a part of the apo major state volume.
Figure 9-9 Active site interactions based on the MD simulations of apo enzyme (a) and HP complex (b).
Residues in one protomer are distinguished from the other protomer by "*"s.
Figure 9-10 (a) Electrostatic and (b) van der Waals interaction energies between HP and all residues within 5 Å of HP. The energies are averages over the 2 ns simulation and the protomers.
Figure 9-11 CA RMSF comparison between the push and regular simulations, indicating the small disturbance in the protein fluctuations from the exit of HP for the chosen force constant and pulling rate.
Figure 9-12 CA RMSD of the 50 ps averaged structure in each window relative to the average structure in the HP complex regular simulation.
Figure 9-13 Total CA fluctuation of DHNA along the exit trajectory of HP showing a range of protein fluctuations that span the bound to free form results.
Figure 9-14 CA RMSD of the flexible regions of the 50 ps averaged structure in each window relative to the average fragments in the HP complex regular simulation. (a) Residues 15-25. (b) Residues 45*-55*. (c) Residues 68-74. (d) Residues 100-110.
Figure 9-15 CA fluctuations of flexible regions along the exit path of HP. (a) Residues 15-25. (b) Residues 45*-55*. (c) Residues 68-74. (d) Residues 100-110. (e) Sum of (a), (b), (c), and (d).
Figure 9-16 The interaction energies between HP and its surroundings over the exit pathway. (a) Electrostatic interactions between HP, and Glu22, Glu74 and Lys100.
(The solvation energy of the starting point (-14.5 kcal/mol) was set to zero to give a better view in this figure.) (b) Overall interactions between HP and residues 15-25, 45*-55*, 68-74, 100-110, solvent and the other parts of the protein.
Figure 9-17 The intramolecular energetic and entropic contributions of HP, and its interaction energy with the environment and solvent, along the exit pathway.

LIST OF TABLES

Table 2-1 Kinetic Constants for the Activation of the Prodrug 5FC by yCD (a)
Table 3-1 The RMSF of center of mass of various helixes in the apo form
Table 4-1 The statistics of HD exchange times for the apo, 5FPy and 5FU complex in various pHs.
Table 5-1 Force constants for the Zn complex.
Table 5-2 Charges (e units) of Zn complex used in the yCD simulations
Table 5-3 RMSD for free and intermediate analog complex forms
Table 5-4 Hydrogen bonds in yCD free and intermediate analog complex forms
Table 5-5 Hydrogen bonds in yCD water/cytosine simulation
Table 5-6 Hydrogen bonds in yCD hydroxide/Glu64H cytosine simulation
Table 5-7 Hydrogen bonds in yCD hydroxide/Glu64 cytosineH simulation
Table 5-8 Hydrogen bonds in yCD intermediate I/II simulations
Table 5-9 Hydrogen bonds in yCD uracil/ammonia and uracil/water simulations.
Table 5-10 Chemical mutation free energies between water and ammonia.
Table 7-1 Total CA fluctuations and contributions from first 5 modes at 300 K and 320 K
Table 8-1 Comparison of ONIOM optimized structure with crystal structure of the imidazole inhibitor complex
Table 8-2 Hydrogen bonds in GD substrate bound form
Table 8-3 Hydrogen bonds in complex 8 bound form
Table 8-4 Hydrogen bonds in complex 9 bound form
Table 9-1 Active Site Interactions in Apo DHNA
Table 9-2 Hydrogen Bonds between HP and Its Surroundings in HP complex (a)
Table 9-3 Active Site Interactions in HP Complex
Table 9-4 Hydrogen bonds between Asp*46, Asn71, and Leu72 in HP Complex
Table 9-5 Hydrogen Bonds with Glu24 and Ile25 in HP Complex
Table 9-6 RMSF of Arg118 Backbone N Atom in the Regular and Pushing MD Simulations

Chapter 1
Introduction

It has long been recognized that the static structure of an enzyme alone is insufficient to explain precisely how it works. Proteins are dynamic molecules that often undergo conformational changes while performing their specific functions, such as an enzymatic reaction or ligand binding. The dynamic properties intrinsic to a protein structure may provide information on the location and the energetics of the conformational change process, and are thus the focus of many biophysical studies.
The focus of my thesis work is to study the roles of protein dynamics in enzymatic catalysis using both nuclear magnetic resonance (NMR) spectroscopy and molecular dynamics (MD) simulation methods. NMR is a powerful technique for determining biomolecular structures and for characterizing the dynamics and thermodynamics of biological macromolecules. MD, on the other hand, provides details of individual atomic motions as a function of time, from which thermodynamic parameters can be calculated by applying statistical mechanics. MD and NMR can confirm each other and are complementary. By combining these two techniques, one can obtain richer information about protein dynamics. One drawback of MD simulation with commonly used force fields is that electrons are not treated explicitly and chemical bonds are restrained by harmonic potentials, so chemical bond cleavage and formation cannot be studied directly by conventional MD methods. A quantum mechanics (QM) method must be used to study the chemical steps of an enzymatic reaction, but in practice the size of biological systems prohibits the use of high-level QM methods on the whole system. A multilayered QM method has been developed to address this problem. By combining NMR with these computational methods, one can study enzymatic reaction mechanisms, dynamics, and their relationship.

Part I. Theoretical Background

Section 1. NMR Relaxation

NMR can be used to monitor the dynamic behavior of a protein at a multitude of specific sites.
Moreover, protein movements on a broad range of timescales can be monitored using various types of NMR experiments: nuclear spin relaxation rate measurements report the internal motions on fast (sub-nanosecond) and slow (microsecond to millisecond) timescales as well as the overall rotational diffusion of the molecule (5-50 nanoseconds), whereas rates of magnetization transfer among protons with different chemical shifts and proton exchange report movements of protein domains on very slow timescales (milliseconds to days). These features make NMR a unique and powerful tool for studying protein dynamics related to protein function (Kay 1998; Ishima and Torchia 2000; Palmer 2001; Palmer et al. 2001; Stone 2001; Akke 2002). The relaxation rate constants depend on the spectral density functions that quantify the frequency dependence of the stochastic motions modulating the dipolar, chemical shift anisotropy, or quadrupolar interactions. The relaxation data are interpreted in terms of overall rotational diffusion of the molecule and intramolecular dynamics at specific atomic sites. The relaxation mechanisms in macromolecules are briefly outlined below.

1). ps-ns timescale relaxation

In the semiclassical theory of spin relaxation, the Hamiltonian for the spin system is written as the sum of a deterministic quantum-mechanical Hamiltonian that acts only on the spin system, $H_0$, and a stochastic Hamiltonian, $H_1(t)$, that couples the spin system to the lattice (Abragam 1961; Cavanagh et al. 1996):

\[ H(t) = H_0 + H_1(t) \tag{1} \]

The Hamiltonian $H_1(t)$ is regarded as a time-dependent perturbation acting on the main time-independent Hamiltonian, $H_0$.
The Liouville equation of the density operator is:

\[ \frac{d\sigma(t)}{dt} = -i\,[H(t), \sigma(t)] \tag{2} \]

In the interaction representation, with $\sigma^{*} = e^{iH_0 t}\,\sigma\,e^{-iH_0 t}$ and $H_1^{*}(t) = e^{iH_0 t}\,H_1(t)\,e^{-iH_0 t}$, equation (2) becomes

\[ \frac{d\sigma^{*}(t)}{dt} = -i\,[H_1^{*}(t), \sigma^{*}(t)] \]

which can be solved by time-dependent perturbation theory to second order:

\[ \frac{d\sigma^{*}(t)}{dt} = -i\,[H_1^{*}(t), \sigma^{*}(0)] - \int_0^{t} d\tau\,[H_1^{*}(t), [H_1^{*}(t-\tau), \sigma^{*}(0)]] \tag{3} \]

Since $H_1^{*}(t)$ varies randomly in time, we average over the lattice ensemble. It will be assumed further that (Abragam 1961):

(a). It is permissible to neglect the correlation between $H_1^{*}(t)$ and $\sigma^{*}(0)$ in the average of (3) and to average them separately. The first term of equation (3) is then zero because the average of $H_1^{*}(t)$ is zero (random perturbation). The justification of this assumption is as follows: the density operator $\sigma^{*}(0)$ depends on the behavior of $H_1^{*}(t')$ for $t' \le 0$ (equation (3)), and since there is a correlation between values of $H_1^{*}(t)$ taken at different times, in principle there is a correlation between $H_1^{*}(t)$ ($t > 0$) and $\sigma^{*}(0)$. But if $t \gg \tau_c$ ($\tau_c$ is the correlation time of $H_1^{*}(t)$), this correlation can be neglected and $H_1^{*}(t)$ and $\sigma^{*}(0)$ can be averaged separately.

(b). It is permissible to replace $\sigma^{*}(0)$ by $\sigma^{*}(t)$ on the right-hand side of equation (3). Assuming that at $t = 0$ the system is in equilibrium, $\sigma^{*}(0)$ can be expressed as

\[ \sigma^{*}(0) = \sigma(0) = \frac{\exp(-\hbar H_0 / k_B T)}{\mathrm{Tr}\{\exp(-\hbar H_0 / k_B T)\}} \]

At room temperature, $\exp(-\hbar H_0 / k_B T)$ can be approximated by $E - \hbar H_0 / k_B T$, where $E$ is the identity operator. For example, for a $^{15}$N nucleus (spin 1/2) at 300 K in a 14.09 T magnetic field (600 MHz proton resonance frequency), the ratio $\hbar H_0 / k_B T$ is far smaller than 1.

(c). It is permissible to extend the upper limit of the integral in (3) to $+\infty$: since $t \gg \tau_c$, the integral from $t$ to $+\infty$ is negligible.

(d). It is permissible to neglect all higher-order terms on the right-hand side of (3).
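The high-temperature approximation used in assumption (b) can be checked numerically. The sketch below evaluates the ratio of the Zeeman energy to the thermal energy for the 15N example in the text (14.09 T, 300 K). The gyromagnetic ratio is a literature value supplied here as an assumption, and the function name is illustrative only.

```python
# Ratio of Zeeman splitting to thermal energy, hbar*omega / (k_B*T).
# Constants are CODATA values; GAMMA_15N is a literature value and is
# an assumption of this sketch, not a number taken from the thesis.
HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K
GAMMA_15N = -2.7116e7    # rad s^-1 T^-1

def zeeman_to_thermal_ratio(gamma, b0_tesla, temperature_k):
    """Return hbar*|gamma|*B0 / (k_B*T), a dimensionless number."""
    omega = abs(gamma) * b0_tesla          # Larmor frequency in rad/s
    return HBAR * omega / (K_B * temperature_k)

ratio = zeeman_to_thermal_ratio(GAMMA_15N, 14.09, 300.0)
print(f"hbar*omega/(k_B*T) = {ratio:.2e}")   # on the order of 1e-5
```

Because the ratio is about five orders of magnitude below unity, truncating the exponential after the linear term is well justified under these conditions.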
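With these assumptions, and omitting the bar on $\sigma^{*}(t)$ that denotes the average density matrix, equation (3) becomes

\[ \frac{d\sigma^{*}(t)}{dt} = -\int_0^{\infty} d\tau\,[H_1^{*}(t), [H_1^{*}(t-\tau), \sigma^{*}(t)]] \tag{4} \]

The stationary solution of (4) is proportional to $E$, which means that all states have the same probability, corresponding to the infinite temperature situation. One more term, the equilibrium density operator $\sigma_0$, needs to be introduced into (4) to ensure that the spin system relaxes to equilibrium, which modifies (4) to

\[ \frac{d\sigma^{*}(t)}{dt} = -\int_0^{\infty} d\tau\,[H_1^{*}(t), [H_1^{*}(t-\tau), \sigma^{*}(t) - \sigma_0]] \tag{5} \]

This equation can be obtained more rigorously by treating the lattice quantum mechanically (Abragam 1961). Transforming $\sigma^{*}$ back to $\sigma$ gives

\[ \frac{d\sigma(t)}{dt} = -i\,[H_0, \sigma] - \int_0^{\infty} d\tau\,[H_1(t), [e^{-iH_0\tau} H_1(t-\tau)\, e^{iH_0\tau}, \sigma(t) - \sigma_0]] \]

For any physical observable $Q$, the time derivative of the expectation value is

\[ \frac{d\langle Q\rangle}{dt} = \frac{d}{dt}\,\mathrm{Tr}\{Q\,\sigma(t)\} = \mathrm{Tr}\left\{Q\,\frac{d\sigma(t)}{dt}\right\} \tag{6} \]

The longitudinal and transverse relaxation times, $T_1$ and $T_2$, can be calculated by replacing $Q$ by $I_z$ and $I_+$ (or $I_-$). In order to proceed, the nature of the perturbation Hamiltonian $H_1$ must be considered. In a real system, a very large number of physical interactions give rise to stochastic Hamiltonians capable of mediating spin relaxation. For spin 1/2 nuclei in diamagnetic biological macromolecules (such as the backbone amide N), the dominant relaxation mechanisms are the magnetic dipolar coupling with the amide proton and the chemical shift anisotropy (CSA) (Cavanagh et al. 1996).

a). Dipolar relaxation (Fischer et al. 1998)

Considering a two-spin system I and S:

\[ \frac{d}{dt}\begin{pmatrix} \Delta\langle I_z\rangle(t) \\ \Delta\langle S_z\rangle(t) \end{pmatrix} = -\begin{pmatrix} \rho_I & \sigma_{IS} \\ \sigma_{IS} & \rho_S \end{pmatrix} \begin{pmatrix} \Delta\langle I_z\rangle(t) \\ \Delta\langle S_z\rangle(t) \end{pmatrix} \tag{7} \]

with $\Delta\langle I_z\rangle(t) = \langle I_z\rangle(t) - \langle I_z\rangle_{eq}$, $\Delta\langle S_z\rangle(t) = \langle S_z\rangle(t) - \langle S_z\rangle_{eq}$, and

\[ \rho_I = d^2\{6j(\omega_I) + 2j(\omega_I - \omega_S) + 12j(\omega_I + \omega_S)\} \]
\[ \rho_S = d^2\{6j(\omega_S) + 2j(\omega_I - \omega_S) + 12j(\omega_I + \omega_S)\} \]
\[ \sigma_{IS} = d^2\{-2j(\omega_I - \omega_S) + 12j(\omega_I + \omega_S)\} \]
\[ d^2 = \frac{1}{8}\left(\frac{\mu_0}{4\pi}\right)^2 \left(\frac{\hbar\,\gamma_I \gamma_S}{r^3}\right)^2 \]

where $\mu_0$ is the permeability of free space, $\gamma$ is the gyromagnetic ratio, $r$ is the distance between the two dipoles, and $j(\omega)$ is the spectral density evaluated at frequency $\omega$.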
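The longitudinal relaxation time $T_1$ is the reciprocal of the longitudinal relaxation rate, $\rho$. The cross-relaxation term $\sigma_{IS}$ gives rise to the nuclear Overhauser effect (NOE), which allows for transfer of information between two nuclei. A similar expression can be obtained for $I_+$ and $S_+$:

\[ \frac{d}{dt}\begin{pmatrix} \langle I_+\rangle(t) \\ \langle S_+\rangle(t) \end{pmatrix} = -\begin{pmatrix} i\omega_I + 1/T_2^{I} & 0 \\ 0 & i\omega_S + 1/T_2^{S} \end{pmatrix} \begin{pmatrix} \langle I_+\rangle(t) \\ \langle S_+\rangle(t) \end{pmatrix} \tag{8} \]

where

\[ \frac{1}{T_2^{I}} = d^2\{4j(0) + 3j(\omega_I) + j(\omega_I - \omega_S) + 6j(\omega_I + \omega_S) + 6j(\omega_S)\} \]
\[ \frac{1}{T_2^{S}} = d^2\{4j(0) + 3j(\omega_S) + j(\omega_I - \omega_S) + 6j(\omega_I + \omega_S) + 6j(\omega_I)\} \]

$T_2$ is the transverse relaxation time.

b). Chemical shift anisotropic relaxation

\[ \frac{1}{T_1} = c^2\{6j(\omega_I)\}, \qquad \frac{1}{T_2} = c^2\{4j(0) + 3j(\omega_I)\} \]

where

\[ c^2 = \frac{\omega_I^2\,\Delta\sigma^2}{18}\left(1 + \frac{\Delta\eta^2}{3}\right), \qquad \Delta\sigma = \sigma_{11} - \frac{\sigma_{22} + \sigma_{33}}{2}, \qquad \Delta\eta = \frac{\sigma_{22} - \sigma_{33}}{\sigma_{11} - \sigma_{iso}}, \qquad \sigma_{iso} = \frac{\sigma_{11} + \sigma_{22} + \sigma_{33}}{3} \]

with $\sigma_{11} \ge \sigma_{22} \ge \sigma_{33}$; the $\sigma_{ii}$ are the diagonal components of the chemical shift tensor.

By combining the dipolar interaction mechanism and the chemical shift anisotropy mechanism, $T_1$, $T_2$ and NOE for the I spin are given by:

\[ \frac{1}{T_1} = d^2\{6j(\omega_I) + 2j(\omega_I - \omega_S) + 12j(\omega_I + \omega_S)\} + c^2\{6j(\omega_I)\} \]
\[ \frac{1}{T_2} = d^2\{4j(0) + 3j(\omega_I) + j(\omega_I - \omega_S) + 6j(\omega_I + \omega_S) + 6j(\omega_S)\} + c^2\{4j(0) + 3j(\omega_I)\} + R_{ex} \]
\[ \mathrm{NOE} = 1 + \frac{\gamma_S}{\gamma_I}\, d^2\{-2j(\omega_I - \omega_S) + 12j(\omega_I + \omega_S)\}\, T_1 \tag{9} \]

Consider a two-site exchange process

\[ A \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} B \]

where A and B are two magnetically distinct environments. The chemical exchange rate constant, $k_{ex}$, is given by

\[ k_{ex} = k_1 + k_{-1} = k_1 / p_B = k_{-1} / p_A \]

where

\[ p_A = \frac{[A]}{[A] + [B]}, \qquad p_B = 1 - p_A \]

Two situations will be discussed: (a) only one signal is observed, which means that $k_{ex}$ is larger than the chemical shift difference between A and B; (b) A and B are observable as separate signals. The first case is referred to as fast exchange and the second as slow exchange.

a). Fast exchanges

Since only one signal is observed in fast exchange, an effective Hamiltonian can be introduced as (Wennerström 1972):

\[ H(t) = f_A(t)\,H_A(t) + f_B(t)\,H_B(t) \]

with $f_A = 1$, $f_B = 0$ when the spin stays in state A, and $f_A = 0$, $f_B = 1$ when the spin stays in state B. No transition state is assumed.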
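By using the same procedure described in the first section, the $T_1$, $T_2$ and NOE contributions from the exchange can be derived:

\[ \frac{1}{T_1} = \frac{p_A}{T_{1A}} + \frac{p_B}{T_{1B}} \]
\[ \frac{1}{T_2} = \frac{p_A}{T_{2A}} + \frac{p_B}{T_{2B}} + \frac{p_A p_B (\omega_A - \omega_B)^2}{k_{ex}} \tag{14} \]
\[ \mathrm{NOE} = p_A\,\mathrm{NOE}_A + p_B\,\mathrm{NOE}_B \]

$\omega_A$ and $\omega_B$ are the resonance frequencies of A and B. $T_{1A}$, $T_{1B}$ ($T_{2A}$, $T_{2B}$) are the longitudinal (transverse) relaxation times of A and B. The assumptions $k_{ex} \gg 1/T_{1A}, 1/T_{1B}, 1/T_{2A}, 1/T_{2B}$ and $k_{ex} \ll 1/\tau_{mA}, 1/\tau_{mB}$ are used for the derivation, which means that the conformational exchange rate has to be much faster than the relaxation rates but much slower than the molecular global tumbling rate. $k_{ex} \gg \omega_A - \omega_B$ is also implicitly used to guarantee that only one signal is observed. Eq. (14) shows that for residues with multiple conformations, $T_1$, $T_2$ and NOE are more complicated. But if A and B have quite similar relaxation rates and the NOE of one form is dominant, we can simply add a term $R_{ex} = p_A p_B (\omega_A - \omega_B)^2 / k_{ex}$ to the $T_2$ expression in eq. (9).

b). Slow exchanges

In slow exchange, with the only condition $k_{ex} \ll 1/\tau_{mA}, 1/\tau_{mB}$, a more general equation for longitudinal relaxation can be derived from the modified Bloch equations (McConnell 1958; Palmer et al. 2001):

\[ \frac{d}{dt}\begin{pmatrix} \Delta\langle M_{Az}\rangle(t) \\ \Delta\langle M_{Bz}\rangle(t) \end{pmatrix} = -\begin{pmatrix} 1/T_{1A} + p_B k_{ex} & -p_A k_{ex} \\ -p_B k_{ex} & 1/T_{1B} + p_A k_{ex} \end{pmatrix} \begin{pmatrix} \Delta\langle M_{Az}\rangle(t) \\ \Delta\langle M_{Bz}\rangle(t) \end{pmatrix} \tag{15} \]

in which $\Delta\langle M_{Az}\rangle(t) = \langle M_{Az}\rangle(t) - \langle M_{Az}\rangle_0$ and $\Delta\langle M_{Bz}\rangle(t) = \langle M_{Bz}\rangle(t) - \langle M_{Bz}\rangle_0$, where $\langle M_{Az}\rangle_0$ and $\langle M_{Bz}\rangle_0$ denote the equilibrium expectation values of $M_{Az}$ and $M_{Bz}$. The solution of this equation is:

\[ \begin{pmatrix} \Delta\langle M_{Az}\rangle(t) \\ \Delta\langle M_{Bz}\rangle(t) \end{pmatrix} = \begin{pmatrix} a_{AA}(t) & a_{AB}(t) \\ a_{BA}(t) & a_{BB}(t) \end{pmatrix} \begin{pmatrix} \Delta\langle M_{Az}\rangle(0) \\ \Delta\langle M_{Bz}\rangle(0) \end{pmatrix} \tag{16} \]

where, writing $X = 1/T_{1A} - 1/T_{1B} + k_{ex}(p_B - p_A)$,

\[ a_{AA}(t) = \frac{1}{2}\left[\left(1 - \frac{X}{\lambda_+ - \lambda_-}\right)\exp(-\lambda_- t) + \left(1 + \frac{X}{\lambda_+ - \lambda_-}\right)\exp(-\lambda_+ t)\right] \]
\[ a_{BB}(t) = \frac{1}{2}\left[\left(1 + \frac{X}{\lambda_+ - \lambda_-}\right)\exp(-\lambda_- t) + \left(1 - \frac{X}{\lambda_+ - \lambda_-}\right)\exp(-\lambda_+ t)\right] \]
\[ a_{AB}(t) = \frac{p_A k_{ex}}{\lambda_+ - \lambda_-}\left[\exp(-\lambda_- t) - \exp(-\lambda_+ t)\right], \qquad a_{BA}(t) = \frac{p_B k_{ex}}{\lambda_+ - \lambda_-}\left[\exp(-\lambda_- t) - \exp(-\lambda_+ t)\right] \]
\[ \lambda_{\pm} = \frac{1}{2}\left\{\frac{1}{T_{1A}} + \frac{1}{T_{1B}} + k_{ex} \pm \left[X^2 + 4 p_A p_B k_{ex}^2\right]^{1/2}\right\} \]

If $k_{ex} \gg \omega_A - \omega_B$, only one signal is observed, with intensity $\langle M_{Az}\rangle(t) + \langle M_{Bz}\rangle(t)$. If $k_{ex} \gg 1/T_{1A}, 1/T_{1B}$, it can be shown that the exponential term with $\lambda_+$ disappears, while the rate $\lambda_-$ can be approximated by (Taylor expansion):

\[ \lambda_- = \frac{p_A}{T_{1A}} + \frac{p_B}{T_{1B}} - \frac{p_A p_B}{k_{ex}}\left[\frac{1}{T_{1A}} - \frac{1}{T_{1B}}\right]^2 \approx \frac{p_A}{T_{1A}} + \frac{p_B}{T_{1B}} \tag{17} \]

A typical protein molecule has $T_1$ on the order of seconds for amide nitrogen; for example, yCD (35 kDa) has $T_1 \approx 1.3$ s. If $k_{ex}$ is about 100 s$^{-1}$ or more, it is safe to omit the quadratic term in $\lambda_-$, which gives the same result as shown in (14).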
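The claim that the quadratic term in (17) is negligible once the exchange rate reaches about 100 s^-1 can be checked with a minimal numerical sketch. It evaluates the two eigenvalues of the longitudinal exchange matrix and compares the slow one with the population-weighted average. The value T1A = 1.3 s is the yCD figure quoted in the text; T1B = 1.0 s and p_A = 0.7 are hypothetical values chosen only for illustration.

```python
import math

def longitudinal_rates(t1a, t1b, p_a, k_ex):
    """Return (lambda_plus, lambda_minus) of the two-site longitudinal
    exchange matrix, using the closed-form 2x2 eigenvalues."""
    p_b = 1.0 - p_a
    r1a, r1b = 1.0 / t1a, 1.0 / t1b
    x = r1a - r1b + k_ex * (p_b - p_a)
    root = math.sqrt(x * x + 4.0 * p_a * p_b * k_ex ** 2)
    total = r1a + r1b + k_ex
    return 0.5 * (total + root), 0.5 * (total - root)

# T1A from the text; T1B and p_A are hypothetical illustration values.
T1A, T1B, P_A, K_EX = 1.3, 1.0, 0.7, 100.0
lam_plus, lam_minus = longitudinal_rates(T1A, T1B, P_A, K_EX)

weighted = P_A / T1A + (1.0 - P_A) / T1B                    # leading term
quadratic = P_A * (1.0 - P_A) * (1.0 / T1A - 1.0 / T1B) ** 2 / K_EX

print(f"lambda_minus      = {lam_minus:.6f} s^-1")
print(f"weighted average  = {weighted:.6f} s^-1")
print(f"quadratic term    = {quadratic:.2e} s^-1")
```

For these inputs the quadratic correction is roughly four orders of magnitude smaller than the weighted average, so dropping it changes the predicted rate far less than typical experimental uncertainty.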
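Modified Bloch equations can also be used for the transverse relaxation rate calculation:

\[ \frac{d}{dt}\begin{pmatrix} \langle M_A^{+}\rangle(t) \\ \langle M_B^{+}\rangle(t) \end{pmatrix} = -\begin{pmatrix} -i\omega_A + 1/T_{2A} + p_B k_{ex} & -p_A k_{ex} \\ -p_B k_{ex} & -i\omega_B + 1/T_{2B} + p_A k_{ex} \end{pmatrix} \begin{pmatrix} \langle M_A^{+}\rangle(t) \\ \langle M_B^{+}\rangle(t) \end{pmatrix} \tag{18} \]

The solution of these equations is:

\[ \begin{pmatrix} \langle M_A^{+}\rangle(t) \\ \langle M_B^{+}\rangle(t) \end{pmatrix} = \begin{pmatrix} a_{AA}(t) & a_{AB}(t) \\ a_{BA}(t) & a_{BB}(t) \end{pmatrix} \begin{pmatrix} \langle M_A^{+}\rangle(0) \\ \langle M_B^{+}\rangle(0) \end{pmatrix} \tag{19} \]

in which, writing $Y = -i\Delta\omega + 1/T_{2A} - 1/T_{2B} + k_{ex}(p_B - p_A)$,

\[ a_{AA}(t) = \frac{1}{2}\left[\left(1 - \frac{Y}{\lambda_+ - \lambda_-}\right)\exp(-\lambda_- t) + \left(1 + \frac{Y}{\lambda_+ - \lambda_-}\right)\exp(-\lambda_+ t)\right] \]
\[ a_{BB}(t) = \frac{1}{2}\left[\left(1 + \frac{Y}{\lambda_+ - \lambda_-}\right)\exp(-\lambda_- t) + \left(1 - \frac{Y}{\lambda_+ - \lambda_-}\right)\exp(-\lambda_+ t)\right] \]
\[ a_{AB}(t) = \frac{p_A k_{ex}}{\lambda_+ - \lambda_-}\left[\exp(-\lambda_- t) - \exp(-\lambda_+ t)\right], \qquad a_{BA}(t) = \frac{p_B k_{ex}}{\lambda_+ - \lambda_-}\left[\exp(-\lambda_- t) - \exp(-\lambda_+ t)\right] \]

and

\[ \lambda_{\pm} = \frac{1}{2}\left\{-i\omega_A - i\omega_B + \frac{1}{T_{2A}} + \frac{1}{T_{2B}} + k_{ex} \pm \left[Y^2 + 4 p_A p_B k_{ex}^2\right]^{1/2}\right\}, \qquad \Delta\omega = |\omega_A - \omega_B| \]

Just as for longitudinal relaxation, the transverse relaxation appears to include two exponential decays. It can be reduced to a single exponential decay in the fast exchange limit, with the same procedure used to simplify $\lambda_{\pm}$ for longitudinal relaxation. With the conditions $k_{ex} \gg 1/T_{2A}, 1/T_{2B}$ and $k_{ex} \gg \omega_A - \omega_B$, the coefficient of $\exp(-\lambda_+ t)$ approaches 0 and $\lambda_-$ can be approximated by:

\[ \lambda_- \approx -i(p_A\omega_A + p_B\omega_B) + \frac{p_A}{T_{2A}} + \frac{p_B}{T_{2B}} + \frac{p_A p_B}{k_{ex}}\left[(\omega_A - \omega_B)^2 - \left(\frac{1}{T_{2A}} - \frac{1}{T_{2B}}\right)^2 - 2i(\omega_A - \omega_B)\left(\frac{1}{T_{2A}} - \frac{1}{T_{2B}}\right)\right] \]
\[ \approx -i(p_A\omega_A + p_B\omega_B) + \frac{p_A}{T_{2A}} + \frac{p_B}{T_{2B}} + \frac{p_A p_B (\omega_A - \omega_B)^2}{k_{ex}} \tag{20} \]

where $-i(p_A\omega_A + p_B\omega_B)$ is the Larmor precession term, which is zero when the relaxation is measured on resonance, or can be removed by using Carr-Purcell-Meiboom-Gill (CPMG) pulses (Luz and Meiboom 1963). Details will be discussed in the next section. One condition, $|\omega_A - \omega_B| \gg |1/T_{2A} - 1/T_{2B}|$, was used for the second step of the approximation in (20). This assumption can be justified as follows: as shown in (9) and (12), the largest term contributing to $1/T_2$ is $j(0)$, which is proportional only to $\tau_m$, the global tumbling time (assuming the internal motion has a shorter timescale), suggesting that $1/T_2$ is not sensitive to local environmental changes.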
In yCD, $1/T_2$ of the amide nitrogen is about 25 s$^{-1}$, so the difference in $1/T_2$ between different conformations should be smaller than this value. On the other hand, for an exchange rate $k_{ex}$ of about 1000 s$^{-1}$, assuming that conformations A and B have the same populations and $T_2$, $\omega_A - \omega_B$ has to be larger than 71 rad/s to contribute 5% of $1/T_2$, which is the typical detection limit in experiments (experimental uncertainty). So conformational changes of the spin system with $|\omega_A - \omega_B| < |1/T_{2A} - 1/T_{2B}|$ can hardly be detected. Based on the conditions used to derive (20), we can refine the valid range of (20) for conformational exchange studies in a protein such as yCD. The lower limit of $k_{ex}$ is the transverse relaxation rate, ~25 s$^{-1}$, and the upper limit is the global tumbling rate, ~55×10$^{6}$ s$^{-1}$. So the valid $k_{ex}$ should lie between ~250 s$^{-1}$ and ~5.5×10$^{6}$ s$^{-1}$, primarily on a millisecond to microsecond timescale. Many biological processes occur with time constants within this range. If we are only interested in motion on this timescale, CPMG and spin-lock ($T_{1\rho}$) techniques can be used to extract $k_{ex}$ based on equations derived similarly to (20).

c). Transverse relaxation during CPMG sequences

A single spin-echo period, $\tau/2 - 180° - \tau/2$, can be described as a 180° pulse (in the x or y direction) between two $\tau/2$ delays. The 180° pulse changes $M_+$ to $M_-$ (and $M_-$ to $M_+$), while during the $\tau/2$ delays $M_+$ (A and B) relaxes with the transition matrix $C$, consisting of the elements $a_{ij}(t)$ from equation (19) evaluated at $t = \tau/2$, and $M_-$ relaxes with $C^{*}$ (the complex conjugate of $C$). The relaxation during the 180° pulse is neglected because the pulse length is usually much shorter than $\tau/2$. After $2n$ spin-echo periods, $M_+(2n\tau)$ is given by:

\[ M_+(2n\tau) = (C C^{*} C C^{*})^{n}\, M_+(0) \tag{21} \]

The general solution of equation (21), a combination of two exponentials, is rather complicated (Carver and Richards 1972; Jen 1978; Davis et al. 1994).
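The 71 rad/s detection threshold quoted earlier in this section for yCD follows from the fast-exchange term in eq. (14), R_ex = p_A p_B (omega_A - omega_B)^2 / k_ex. The short sketch below inverts that expression; the function name and argument layout are illustrative assumptions, and the inputs are the values stated in the text (1/T2 of 25 s^-1, a 5% detection limit, k_ex of 1000 s^-1, equal populations).

```python
import math

def min_delta_omega(r2, detect_fraction, k_ex, p_a):
    """Smallest |omega_A - omega_B| (rad/s) whose fast-exchange
    contribution R_ex = p_A*p_B*dw^2/k_ex reaches detect_fraction*r2."""
    p_b = 1.0 - p_a
    r_ex_needed = detect_fraction * r2
    return math.sqrt(r_ex_needed * k_ex / (p_a * p_b))

threshold = min_delta_omega(r2=25.0, detect_fraction=0.05, k_ex=1000.0, p_a=0.5)
print(f"{threshold:.1f} rad/s")   # 70.7 rad/s, i.e. the ~71 rad/s in the text
```

Skewed populations raise the threshold, since p_A p_B is largest at equal populations. The same exchange term is the quantity that a CPMG pulse train suppresses as the pulse spacing is shortened.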
If $p_A \gg p_B$, the solution can be approximated by a single exponential decay, with the relaxation rate constant of A given by (Ishima and Torchia 1999):

\[ \frac{1}{T_2(\tau)} = \frac{1}{T_2(\tau \to 0)} + \frac{p_A p_B\,\Delta\omega^2\,k_{ex}}{k_{ex}^2 + \left(p_A^2\,\Delta\omega^4 + 144/\tau^4\right)^{1/2}} \]

In the fast exchange limit, the solution is approximated by a single exponential with the decay rate (Luz and Meiboom 1963):

\[ \frac{1}{T_2(\tau)} = \frac{1}{T_2(\tau \to 0)} + \frac{p_A p_B\,\Delta\omega^2}{k_{ex}}\left[1 - \frac{2\tanh(k_{ex}\tau/2)}{k_{ex}\tau}\right] \]

where $1/T_2(\tau \to 0) = p_A/T_{2A} + p_B/T_{2B}$. In practice, two $\tau$ values, $\tau_1$ and $\tau_2$, can be selected to measure the transverse relaxation; then

\[ \frac{1}{T_2(\tau_2)} - \frac{1}{T_2(\tau_1)} = \frac{2\,p_A p_B\,\Delta\omega^2}{k_{ex}}\left\{\frac{\tanh(k_{ex}\tau_1/2)}{k_{ex}\tau_1} - \frac{\tanh(k_{ex}\tau_2/2)}{k_{ex}\tau_2}\right\} \]

By comparing the rates at these two delays, fast exchange can be identified. In most cases of practical interest, CPMG experiments in proteins are applicable to chemical exchange processes with $k_{ex} < 1\times10^{4}$ s$^{-1}$ as a result of experimental constraints on the minimum value of $\tau$ (Palmer et al. 2001). Sample heating at high pulsing rates is a major limitation. In addition, the delay between two 180° pulses has to be much longer than the pulse length; otherwise the assumption in equation (21) may not be valid. (For example, the 180° pulse length of $^{15}$N is ~80 μs on the Chemistry department Varian 600 MHz machine, so the safe minimum $\tau$ should be ~0.5 ms.) The upper limit of $\tau$ should be less than half of $T_2$ (~40 ms for yCD) in order to give reasonable sensitivity and enough data points for exponential fitting before the signal decays to zero. The function $\tanh(x)/x$ changes substantially over this range of $x$; for example, as $x$ changes from 0.5 to 5, $\tanh(x)/x$ changes from 0.92 to 0.2. Therefore, in practice, CPMG experiments are sensitive to $k_{ex} \sim 10^{3}$ s$^{-1}$, provided that $\Delta\omega$ is reasonably large.

In summary, protein dynamics on the ps-ns and μs-ms timescales, as characterized by heteronuclear spin relaxation NMR spectroscopy, has been discussed above. The relaxation rate was derived by using the time-dependent perturbation method.
For the backbone NH spin system, this perturbation mainly comes from the NH dipole-dipole interaction and the nitrogen chemical shift anisotropy. The protein motion that makes the perturbation time dependent is separated into the global tumbling motion and the local fluctuations of individual spin sites. The model-free method was described, with the assumption that the local time correlation function of the spin vector has an exponential form. The order parameter, which describes the flexibility of an individual spin vector, was introduced. Unlike the ps-ns protein dynamics, motion on the μs-ms scale is primarily caused by protein conformational exchange. The relaxation rate for motion on this timescale can be derived from the modified Bloch equations. In fast exchange, the μs-ms motion changes only $T_2$ but not $T_1$. The reason can be illustrated simply as follows. The exchange switches the Hamiltonian between two conformations; the Hamiltonian consists of the major time-independent term $H_0$ and the minor time-dependent term $H_1(t)$ in equation (1). Because $H_0$ commutes with $I_z$ but not with $I_+$ or $I_-$, the major term contributes to $T_2$ but not to $T_1$ relaxation. Therefore the influence of exchange on $T_1$ relaxation is rather small.

Section 2. Molecular Dynamics Simulation

MD simulation can provide great detail concerning individual particle motions as a function of time (Karplus and McCammon 2002). Thus, it can be used to address specific questions about the properties of a model system, often more easily than experiments on the actual system. For many aspects of biomolecular function, it is these details that are of interest. Of course, experiments play an essential role in validating the simulation methodology. Another advantage of simulations is that users have complete control of the potential function. Users can therefore transmute the potential from one representing one system to one representing another during a simulation, as is done in the calculation of free energy differences. MD simulates matter at the atomic level.
Given an intermolecular potential and an initial configuration for the molecules, it will provide molecular configurations for all time by solving Newton's equations of motion (Lanczos 1966):

\[ m_i \mathbf{a}_i = \mathbf{f}_i, \qquad \mathbf{a}_i = \ddot{\mathbf{r}}_i \ \ (\text{dot = time derivative}), \qquad \mathbf{f}_i = -\nabla_i V \]

$\nabla_i$ means the gradient with respect to $\mathbf{r}_i$. $V$ is the intermolecular potential. We will assume that it is pairwise additive:

\[ V = \frac{1}{2}\sum_{i}\sum_{j \neq i} v(r_{ij}) \]

The above equations, second order in time, may be written as $6N$ differential equations that are first order in time:

\[ \dot{\mathbf{r}}_i = \mathbf{p}_i / m_i, \qquad \dot{\mathbf{p}}_i = \mathbf{f}_i \]

A lot of force-field development deals with the intramolecular part of the potential. To develop it, one fits quantum chemistry data on vibrational frequencies, dipole moments, derivatives of the energy, and so on. Typical intramolecular force fields have terms such as (Pearlman et al. 1995):

\[ E = \sum_{bonds} K_r (r - r_{eq})^2 + \sum_{angles} K_\theta (\theta - \theta_{eq})^2 + \sum_{dihedrals} \frac{V_n}{2}\left[1 + \cos(n\phi - \gamma)\right] + \sum_{i>j}^{atoms} \left[\frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}\right] + \sum_{i>j}^{atoms} \frac{q_i q_j}{\varepsilon\, r_{ij}} \]

Nature deals with roughly Avogadro's number of particles. On a conventional computer, ~10,000-50,000 is the current practical limit. For parallel machines with special codes, this is changing toward very large numbers of molecules (1,000,000). For many purposes, this is not necessary. Many simulations have been run with 256 molecules when the properties of a small-molecule solvent are wanted. A small protein, say cytochrome c with about 100 residues, needs about 6,000 H2O molecules to be properly solvated. The surface of the box becomes very important; e.g., for 1000 = 10³ molecules in a cubic box, 488 molecules appear on the cube faces! This drove the introduction of Periodic Boundary Conditions (PBC, Figure 1-1). One only needs to store the particle coordinates in box P (not the other boxes). When a particle moves out of box P, it is replaced in the list of coordinates by the one that moves into the box. Simulating a periodically replicated system is better than a box in vacuum but introduces problems with long-range interactions.
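The surface count and the bookkeeping behind periodic boundary conditions can be illustrated with a minimal sketch. It assumes molecules sitting on a simple n x n x n lattice for the count, and a cubic box of edge length `box` for the wrapping; the helper names are hypothetical and are not taken from any MD package.

```python
def surface_count(n):
    """Sites on the faces of an n x n x n cubic lattice: all sites
    minus the (n-2)^3 interior sites."""
    interior = max(n - 2, 0) ** 3
    return n ** 3 - interior

def wrap(coordinate, box):
    """Map a coordinate back into the primary box [0, box)."""
    return coordinate % box

def minimum_image(delta, box):
    """Shortest separation along one axis between a particle and the
    nearest periodic image of its partner."""
    return delta - box * round(delta / box)

print(surface_count(10))          # 488, the number quoted in the text
print(wrap(-0.5, 10.0))           # 9.5: the particle re-enters from the other side
print(minimum_image(9.0, 10.0))   # -1.0: the nearest image is one unit away
```

Only the coordinates inside the primary box are stored; `wrap` reproduces the replacement of a leaving particle by its entering image, and `minimum_image` is what a short-range pair interaction would use. Neither helper addresses the long-range interaction problem of periodic systems noted above.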
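Modern MD codes such as AMBER (Pearlman et al. 1995) use special Ewald-based methods (Essmann et al. 1995) to deal with these issues.

For one atom, the jth, the vector position at time $t$ is $\mathbf{r}_j(t)$. Over a time interval $T$, the time average is defined as

\[ \langle \mathbf{r}_j(t) \rangle_T = \frac{1}{T}\int_0^{T} \mathbf{r}_j(t)\,dt \]

Consider the fluctuations in position with respect to this time average. The natural definition is to construct the matrix

\[ \begin{pmatrix} \langle dx_j(t)\,dx_j(t)\rangle_T & \langle dx_j(t)\,dy_j(t)\rangle_T & \langle dx_j(t)\,dz_j(t)\rangle_T \\ \langle dy_j(t)\,dx_j(t)\rangle_T & \langle dy_j(t)\,dy_j(t)\rangle_T & \langle dy_j(t)\,dz_j(t)\rangle_T \\ \langle dz_j(t)\,dx_j(t)\rangle_T & \langle dz_j(t)\,dy_j(t)\rangle_T & \langle dz_j(t)\,dz_j(t)\rangle_T \end{pmatrix} \]

where $dx_j(t) = x_j(t) - \langle x_j(t)\rangle_T$, etc., so that

\[ \langle dx_j(t)\,dx_j(t)\rangle_T = \langle x_j^2(t)\rangle_T - \langle x_j(t)\rangle_T \langle x_j(t)\rangle_T \]

This matrix is a lot of information to store and analyze, so it is more usual to take its trace (sum the diagonal elements) to obtain

\[ \mathrm{rms}_j = \langle d\mathbf{r}_j(t) \cdot d\mathbf{r}_j(t)\rangle_T = \langle \mathbf{r}_j(t) \cdot \mathbf{r}_j(t)\rangle_T - \langle \mathbf{r}_j(t)\rangle_T \cdot \langle \mathbf{r}_j(t)\rangle_T \]

This gives the time-averaged fluctuation of the jth atom's position relative to its time-averaged position. To do the same on, e.g., a residue basis, define

\[ \mathrm{rms}_{res_k} = \frac{\sum_{j \in res_k} \langle d\mathbf{r}_j(t) \cdot d\mathbf{r}_j(t)\rangle_T}{\sum_{j \in res_k} 1} \]

Similarly, the deviation from the X-ray structure is defined as:

\[ \mathrm{dev}_{res_k} = \frac{\sum_{j \in res_k} \left(\langle \mathbf{r}_j(t)\rangle_T - \mathbf{r}_j(0)\right) \cdot \left(\langle \mathbf{r}_j(t)\rangle_T - \mathbf{r}_j(0)\right)}{\sum_{j \in res_k} 1} \]

where $\mathbf{r}_j(0) = \mathbf{r}_j(\text{X-ray})$. Adding up all the deviations from the X-ray structure, time averaged over some time interval, and plotting them as a function of time from the initial time gives some indication of how long it takes to equilibrate the protein/solvent system from the protein point of view.

Sometimes we want a thermostated system, e.g. the NVT canonical ensemble. To fix the temperature, we may scale the velocities. The average kinetic energy is

\[ \langle E_k \rangle = \frac{3}{2} N k_B T \]

Equipartition says $\langle \tfrac{1}{2} m v_{ix}^2 \rangle = k_B T / 2$, where $\langle\ \rangle$ means the average over the Maxwellian $P(v_{ix})$ for each particle $i$. So, evaluate the "instantaneous temperature"

\[ k_B T(t) = \frac{\sum_i \tfrac{1}{2} m_i v_i^2}{\tfrac{3}{2} N} \]

If $T$ is not $T_{ref}$, the desired temperature, multiply all the $v_i$'s by the same factor to make $T = T_{ref}$. If this is done every MD step, the target temperature $T_{ref}$ is maintained. More sophisticated methods of temperature control can be used (Berendsen et al. 1984).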
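The simple velocity-scaling scheme described above can be sketched in a few lines. This is an illustration of the plain rescaling step only, not of the Berendsen coupling scheme or of any particular MD code; the function names and the three-particle test system are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J / K

def instantaneous_temperature(masses, velocities):
    """T(t) from the kinetic energy: sum(1/2 m v^2) = (3/2) N k_B T."""
    kinetic = sum(0.5 * m * (vx * vx + vy * vy + vz * vz)
                  for m, (vx, vy, vz) in zip(masses, velocities))
    return kinetic / (1.5 * len(masses) * K_B)

def rescale_velocities(masses, velocities, t_ref):
    """Multiply every velocity by the single factor sqrt(T_ref / T)."""
    factor = math.sqrt(t_ref / instantaneous_temperature(masses, velocities))
    return [(vx * factor, vy * factor, vz * factor) for vx, vy, vz in velocities]

# Hypothetical three-particle system (argon-like masses in kg, speeds in m/s).
masses = [6.63e-26] * 3
velocities = [(300.0, -120.0, 50.0), (-210.0, 400.0, 10.0), (90.0, 30.0, -350.0)]

scaled = rescale_velocities(masses, velocities, t_ref=300.0)
print(instantaneous_temperature(masses, velocities))
print(instantaneous_temperature(masses, scaled))   # equals T_ref up to rounding
```

Because the kinetic energy is quadratic in the velocities, the scaling factor is the square root of the temperature ratio. Applying it at every step pins the kinetic temperature exactly, which suppresses the natural temperature fluctuations of a canonical ensemble; that is one motivation for the gentler coupling schemes cited above.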
Thermodynamic Perturbation Methods.

The connection between the Helmholtz free energy $A(N,V,T)$ (one can do the same for the Gibbs free energy $G(N,P,T)$) and the potential energy function $U(R^N)$ used in, e.g., MD is

\[ A(N,V,T) = -k_B T \ln\left[\frac{1}{N!\,\Lambda^{3N}} \int dR^N\, e^{-U(R^N)/k_B T}\right] \]

$N$ is the number of particles in the system, $V$ is the volume, $T$ is the temperature, $R^N$ denotes the coordinates of each atom, $k_B$ is the Boltzmann constant, and $\Lambda = h\,(2\pi m k_B T)^{-1/2}$. Consider another system with potential energy $U^{*}(R^N)$. Then

\[ A^{*}(N,V,T) - A(N,V,T) = -k_B T \ln \frac{\int dR^N\, e^{-(U^{*}(R^N) - U(R^N))/k_B T}\, e^{-U(R^N)/k_B T}}{\int dR^N\, e^{-U(R^N)/k_B T}} \equiv -k_B T \ln \left\langle e^{-(U^{*}(R^N) - U(R^N))/k_B T} \right\rangle_U \]

The last expression defines an ensemble average, evaluated for example over a long MD trajectory in which the configurations are generated with the $U(R^N)$ potential. If we want to find the binding free energy of a ligand L with a protein P, we can define $A(N,V,T)$ as the free energy of the bound state and $A^{*}(N,V,T)$ as the free energy of the dissociated state. The free energy difference is then the binding free energy between ligand L and protein P. But in MD we cannot simply simulate these two states, since MD samples only a small region of the conformational space of a protein in a realistic simulation time, which means that the average obtained from the above equation may not be the true binding free energy. Instead, we define intermediate states,

\[ U_\lambda(R^N) = f\left(U(R^N), U^{*}(R^N), \lambda\right), \quad \lambda \in (0,1), \qquad U_{\lambda=0} = U(R^N), \quad U_{\lambda=1} = U^{*}(R^N) \]

Then

\[ A^{*}(N,V,T) - A(N,V,T) = \int_0^1 d\lambda \left\langle \frac{\partial U_\lambda}{\partial \lambda} \right\rangle_\lambda \]

This equation can be integrated to give the binding free energy, as long as the path chosen is reversible.

Section 3. ONIOM: A Multilayered Integrated Quantum Mechanics Method

Standard high-level ab initio quantum mechanics (QM) methods are well known to be computationally expensive, which makes it extremely difficult to study enzymatic reactions with them. Here we introduce the ONIOM method developed by Keiji Morokuma and his colleagues (Svensson et al. 1996; Dapprich et al. 1999; Vreven and Morokuma 2000; Vreven et al. 2001; Torrent et al.
2002; Morokuma 2003; Vreven et al. 2003), an onion-like multilevel method that combines different levels of quantum chemical methods as well as molecular mechanics methods. The concept of the ONIOM method is rather simple. We illustrate it with a two-layer ONIOM; the extension to more layers is straightforward. Assume a protein with a well-defined active site (for example, from an X-ray structure). To study the reaction mechanism, we would have to treat the whole system with a reasonably good QM method (such as DFT, density functional theory), which is impossible with current computer resources. Instead, we can use the ONIOM concept to separate the system (named the real system) into two parts: 1. the active site, in which we are primarily interested (named the model system), treated with a high-level QM method (DFT etc.); 2. the rest of the protein, which influences the reaction mainly through long-range electrostatics, treated with a low-level QM or MM method (AM1, a force field, etc.). Three calculations are carried out: the high-level calculation of the model system, the low-level calculation of the real system, and the low-level calculation of the model system. The total energy of the system is

E(ONIOM, real) = E(high, model) + E(low, real) - E(low, model)

This treatment inevitably introduces an error if the high-level calculation of the whole system is taken as the "correct" method. But if the error is the same for two different structures, their relative energy difference will be evaluated correctly by the ONIOM method. Thus, if a change in the ligand (e.g. from the reactant to an intermediate state) does not cause a drastic change beyond the active site, the ONIOM method should give a reasonably accurate result. The two most important questions when using ONIOM are how to choose the layers and how to select the methods used to describe them (Morokuma 2003).
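The two-layer energy expression above is simple enough to check by hand. The following minimal Python sketch applies it to two structures and takes their difference; the function name is hypothetical and all energies are invented illustration values in arbitrary units, not results from any calculation in this work.

```python
def oniom_energy(e_high_model, e_low_real, e_low_model):
    """Two-layer ONIOM extrapolation:
    E(ONIOM, real) = E(high, model) + E(low, real) - E(low, model)."""
    return e_high_model + e_low_real - e_low_model


# Hypothetical component energies (arbitrary units) for two structures
# along a reaction path. These numbers are made up for illustration only.
e_reactant = oniom_energy(e_high_model=-1000.0, e_low_real=-5000.0, e_low_model=-900.0)
e_intermediate = oniom_energy(e_high_model=-990.0, e_low_real=-4995.0, e_low_model=-893.0)

# Only the relative energy matters: any error that is identical for both
# structures cancels in this difference.
relative_energy = e_intermediate - e_reactant
```

The subtraction of E(low, model) removes the double counting of the active site, which would otherwise appear once at the high level and once inside the low-level calculation of the real system.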
The answer to the first question is to determine which parts of the system are more important and which are less important. In an enzyme-catalyzed reaction, the reactant and the residues that are directly involved in bond cleavage and formation are the most important, and the residues away from the reactant may be less important. The former can be selected as the high-level system and the latter as the low-level system. The high-level QM method used in ONIOM depends on the system and the problem one wants to study, and in practice also on the size of the system. Once the high level is selected, two structures along the reaction path can be used to calculate the energy difference with the real system treated at the high level. The low-level methods are then judged by how well they reproduce this energy difference.

The implementation of ONIOM is straightforward if there is no covalent bond between the low-level layer and the high-level layer. The gradient used to drive the structure to the minimum is just (Dapprich et al. 1999):

$$\nabla E(\mathrm{ONIOM, real}) = \nabla E(\mathrm{high, model}) + \nabla E(\mathrm{low, real}) - \nabla E(\mathrm{low, model})$$

The Hessian matrix is the second derivative of the ONIOM energy, which provides the information about normal vibrational frequencies:

$$\nabla^2 E(\mathrm{ONIOM, real}) = \nabla^2 E(\mathrm{high, model}) + \nabla^2 E(\mathrm{low, real}) - \nabla^2 E(\mathrm{low, model})$$

If there are covalent bonds between the low-level layer and the high-level layer, link atoms are used to saturate the model layer. The positions of the link atoms are, however, constrained by the atom pairs that form the covalent bonds. In this case, a Jacobian matrix J is introduced to project the forces on the link atoms, which exist only in the model system, onto the corresponding atom pairs of the real system (Dapprich et al. 1999):

$$\nabla E(\mathrm{ONIOM, real}) = \nabla E(\mathrm{high, model}) \times J + \nabla E(\mathrm{low, real}) - \nabla E(\mathrm{low, model}) \times J$$

and the second derivative is

$$\nabla^2 E(\mathrm{ONIOM, real}) = J^T \times \nabla^2 E(\mathrm{high, model}) \times J + \nabla^2 E(\mathrm{low, real}) - J^T \times \nabla^2 E(\mathrm{low, model}) \times J$$

where $J^T$ is the transpose of J.

Part II Studied Proteins

Section 1.
Yeast Cytosine Deaminase

Yeast cytosine deaminase (yCD), a zinc metalloenzyme, catalyzes the hydrolytic deamination of cytosine to uracil. yCD is of great biomedical interest because it also catalyzes the deamination of the prodrug 5-fluorocytosine (5FC), which is one of the most widely used prodrugs for gene-directed enzyme prodrug therapy (GDEPT) for the treatment of cancer (Greco and Dachs 2001). The challenge in cancer therapy is to kill tumor cells without damaging normal cells. GDEPT meets the challenge by activating a prodrug in the tumor, thereby minimizing damage to normal tissues (Aghi et al. 1998; Greco and Dachs 2001). The structure of yCD has recently been determined at high resolution with (Ireton et al. 2003; Ko et al. 2003) and without (Ireton et al. 2003) the inhibitor 2-pyrimidinone, a transition state analogue. yCD is a homodimeric enzyme, and the structure of each monomer consists of a central β-sheet flanked by two α-helices on one side and four α-helices on the other. The homodimeric protein contains two active centers, and each active center is composed of residues within a single subunit and a catalytic zinc ion coordinated with a histidine (His62), two cysteines (Cys91 and Cys94), and a water molecule in the substrate-free enzyme. The Zn-bound water serves as a nucleophile in the deamination reaction. The bound inhibitor is completely buried, being covered by a lid composed of Phe114 from the loop between β4 and αD, and Trp152 and Ile156, both from the C-terminal helix. The catalytic apparatus of yCD, including the catalytic zinc, its coordinated residues, and the proton shuttle, is very similar to that of E. coli cytidine deaminase (Ireton et al. 2003; Ko et al. 2003), which has been extensively studied (Schramm and Bagdassarian 1999; Snider et al. 2002; Snider et al. 2002) and is a paradigm for understanding enzymatic catalysis.
Crystal structures have been determined for the complexes of the enzyme with 3-deazacytidine (Xiang et al. 1996), 3,4-dihydrozebularine (Xiang et al. 1995), 3,4-hydrated 2-pyrimidinone riboside (Xiang et al. 1995), and the product uridine (Xiang et al. 1997), all at 2.30 Å resolution except the structure of the 3,4-dihydrozebularine complex, which is at 2.20 Å resolution. These crystal structures represent different states of the catalytic cycle and have provided important insights into the catalytic mechanism of the enzyme. On the basis of the similarity of the yCD and CDA active site structures and early studies of the mechanism of CDA catalysis, an analogous reaction mechanism was proposed for yCD (Ireton et al. 2003). Our recent quantum chemical study (Sklenak et al. 2004) using the ONIOM (B3LYP:PM3) method has revealed a complete path for the deamination reaction catalyzed by yCD.

Section 2. Dihydroneopterin Aldolase

DHNA catalyzes the conversion of 7,8-dihydroneopterin (DHNP) to 6-hydroxymethyl-7,8-dihydropterin (HP) in the de novo biosynthesis of folic acid from guanosine triphosphate, a pathway that is present in microbes but not in humans. DHNA is therefore a good target for antibacterial chemotherapy, just like dihydropteroate synthase and dihydrofolate reductase (Dale et al. 1997). S. aureus DHNA structures have been published for the apo and HP-bound forms (Hennig et al. 1998). DHNA is an octamer in nature with 121 amino acids in each monomer (Hennig et al. 1998). The octamer is a hollow cylinder formed by a 'head to head' assembly of two tetrameric rings. The active site sits between two adjacent subunits of the tetramer. The bottom of the active site has a glutamate (Glu74) that serves as an "anchor" for binding, and residues lining the active site stabilize HP. A general acid and a general base are needed for the aldolase reaction (Figure 1-2).

Section 3.
Guanine Deaminase

Guanine deaminase (GD), a Zn metalloenzyme, catalyzes the hydrolytic deamination of guanine into xanthine and therefore plays an important role in nucleotide metabolism. The crystal structure of GD from Bacillus subtilis (bGD) complexed with imidazole was solved by Liaw and his colleagues (Liaw et al. 2004) at a resolution of 1.17 Å. bGD forms a homodimer with 156 residues in each monomer. The overall structure of each monomer consists of a central five-stranded β-sheet sandwiched by six helices (Liaw et al. 2004). The homodimeric GD contains two active sites, each with one Zn atom. The Zn atom is coordinated with His53, Cys83, Cys86 and one water molecule. Interestingly, the active site is buried by a swapped C-terminal tail from the adjacent subunit, the first domain-swapped structure in the cytidine deaminase superfamily (Liaw et al. 2004). It was proposed that this tail not only seals the active site but also is used to recognize specific substrates (Liaw et al. 2004). The bGD-catalyzed reaction is believed to proceed through a tetrahedral transition state, with Glu55 acting as the proton shuttle, but the detailed mechanism is still unknown.

The thesis is arranged as follows:
1. Experimental studies of yCD are included in Chapters 2-4.
2. Computational studies of yCD are presented in Chapters 5-7.
3. A computational study of GD is given in Chapter 8.
4. A computational study of DHNA is included in Chapter 9.

Chapter 2 describes kinetic studies of the activation of 5FC and an NMR study of the product release process. Chapter 3 describes the study of yCD dynamics on the ps-ns time scale by NMR relaxation measurements and MD simulations. Chapter 4 describes the study of yCD dynamics on the sec-min time scale by the NMR H/D exchange method. Chapter 5 investigates the yCD-catalyzed reaction mechanism by MD simulation. Chapter 6 describes the ONIOM and MD study of 5FU release from yCD.
Chapter 7 studies the ligand release path from yCD by MD simulation. Chapter 8 describes the GD catalytic mechanism studied by the MD and ONIOM methods. Chapter 9 describes the dynamical properties of DHNA in the apo form and the product complex studied by MD simulations.

References

Abragam, A. (1961). The principles of nuclear magnetism.
Aghi, M., C. M. Kramm, et al. (1998). "Synergistic anticancer effects of ganciclovir/thymidine kinase and 5-fluorocytosine/cytosine deaminase gene therapies." J Natl Cancer Inst 90(5): 370-80.
Akke, M. (2002). "NMR methods for characterizing microsecond to millisecond dynamics in recognition and catalysis." Current Opinion in Structural Biology 12(5): 642-647.
Berendsen, H. J. C., J. P. M. Postma, et al. (1984). "Molecular-Dynamics with Coupling to an External Bath." Journal of Chemical Physics 81(8): 3684-3690.
Carver, J. P. and R. E. Richards (1972). "General 2-Site Solution for Chemical Exchange Produced Dependence of T2 Upon Carr-Purcell Pulse Separation." Journal of Magnetic Resonance 6(1): 89.
Cavanagh, J., W. J. Fairbrother, A. G. Palmer, and N. J. Skelton (1996). Protein NMR Spectroscopy.
Chen, J. H., C. L. Brooks, et al. (2004). "Model-free analysis of protein dynamics: assessment of accuracy and model selection protocols based on molecular dynamics simulation." Journal of Biomolecular NMR 29(3): 243-257.
Clore, G. M., A. Szabo, et al. (1990). "Deviations from the Simple 2-Parameter Model-Free Approach to the Interpretation of N-15 Nuclear Magnetic-Relaxation of Proteins." Journal of the American Chemical Society 112(12): 4989-4991.
Dale, G. E., C. Broger, et al. (1997). "A single amino acid substitution in Staphylococcus aureus dihydrofolate reductase determines trimethoprim resistance." Journal of Molecular Biology 266(1): 23-30.
Dapprich, S., I. Komaromi, et al. (1999). "A new ONIOM implementation in Gaussian98. Part I.
The calculation of energies, gradients, vibrational frequencies and electric field derivatives." Journal of Molecular Structure-Theochem 462: 1-21.
d'Auvergne, E. J. and P. R. Gooley (2003). "The use of model selection in the model-free analysis of protein dynamics." Journal of Biomolecular NMR 25(1): 25-39.
Davis, D. G., M. E. Perlman, et al. (1994). "Direct Measurements of the Dissociation-Rate Constant for Inhibitor-Enzyme Complexes Via the T-1-Rho and T-2 (CPMG) Methods." Journal of Magnetic Resonance Series B 104(3): 266-275.
Ding, Z. R., G. Lee, et al. (2005). "PhosphoThr peptide binding globally rigidifies much of the FHA domain from Arabidopsis receptor kinase-associated protein phosphatase." Biochemistry 44(30): 10119-10134.
Essmann, U., L. Perera, et al. (1995). "A Smooth Particle Mesh Ewald Method." Journal of Chemical Physics 103(19): 8577-8593.
Fischer, M. W. F., A. Majumdar, et al. (1998). "Protein NMR relaxation: theory, applications and outlook." Progress in Nuclear Magnetic Resonance Spectroscopy 33(4): 207-272.
Greco, O. and G. U. Dachs (2001). "Gene directed enzyme/prodrug therapy of cancer: Historical appraisal and future prospectives." J. Cell. Physiol. 187(1): 22-36.
Hennig, M., A. D'Arcy, et al. (1998). "Crystal structure and reaction mechanism of 7,8-dihydroneopterin aldolase from Staphylococcus aureus." Nature Structural Biology 5(5): 357-362.
Ireton, G. C., M. E. Black, et al. (2003). "The 1.14 Å crystal structure of yeast cytosine deaminase: evolution of nucleotide salvage enzymes and implications for genetic chemotherapy." Structure 11: 961-972.
Ishima, R. and D. A. Torchia (1999). "Estimating the time scale of chemical exchange of proteins from measurements of transverse relaxation rates in solution." Journal of Biomolecular NMR 14(4): 369-372.
Ishima, R. and D. A. Torchia (2000). "Protein dynamics from NMR." Nature Structural Biology 7(9): 740-743.
Jen, J. (1978). "Chemical Exchange and NMR T2 Relaxation - Multisite Case."
Journal of Magnetic Resonance 30(1): 111-128.
Karplus, M. and J. A. McCammon (2002). "Molecular dynamics simulations of biomolecules." Nature Structural Biology 9(9): 646-652.
Kay, L. E. (1998). "Protein dynamics from NMR." Nature Structural Biology 5: 513-517.
Ko, T.-P., J.-J. Lin, et al. (2003). "Crystal structure of yeast cytosine deaminase. Insights into enzyme mechanism and evolution." J. Biol. Chem. 278: 19111-19117.
Lanczos, M. (1966). "The variational principle of mechanics."
Liaw, S. H., Y. J. Chang, et al. (2004). "Crystal structure of Bacillus subtilis guanine deaminase - The first domain-swapped structure in the cytidine deaminase superfamily." Journal of Biological Chemistry 279(34): 35479-35485.
Lipari, G. and A. Szabo (1982). "Model-Free Approach to the Interpretation of Nuclear Magnetic-Resonance Relaxation in Macromolecules. 1. Theory and Range of Validity." Journal of the American Chemical Society 104(17): 4546-4559.
Lipari, G. and A. Szabo (1982). "Model-Free Approach to the Interpretation of Nuclear Magnetic-Resonance Relaxation in Macromolecules. 2. Analysis of Experimental Results." Journal of the American Chemical Society 104(17): 4559-4570.
Luz, Z. and S. Meiboom (1963). "Nuclear Magnetic Resonance Study of Protolysis of Trimethylammonium Ion in Aqueous Solution - Order of Reaction with Respect to Solvent." Journal of Chemical Physics 39(2): 366.
Mandel, A. M., M. Akke, et al. (1995). "Backbone Dynamics of Escherichia-Coli Ribonuclease HI - Correlations with Structure and Function in an Active Enzyme." Journal of Molecular Biology 246(1): 144-163.
McConnell, H. M. (1958). "Reaction Rates by Nuclear Magnetic Resonance." Journal of Chemical Physics 28(3): 430-431.
Morokuma, K. (2003). "ONIOM and its applications to material chemistry and catalyses." Bulletin of the Korean Chemical Society 24(6): 797-801.
Palmer, A. G. (2001). "NMR probes of molecular dynamics: Overview and comparison with other techniques."
Annual Review of Biophysics and Biomolecular Structure 30: 129-155.
Palmer, A. G., C. D. Kroenke, et al. (2001). Nuclear magnetic resonance methods for quantifying microsecond-to-millisecond motions in biological macromolecules. Nuclear Magnetic Resonance of Biological Macromolecules, Pt B. 339: 204-238.
Pearlman, D. A., D. A. Case, et al. (1995). "Amber, a Package of Computer-Programs for Applying Molecular Mechanics, Normal-Mode Analysis, Molecular-Dynamics and Free-Energy Calculations to Simulate the Structural and Energetic Properties of Molecules." Computer Physics Communications 91(1-3): 1-41.
Peng, J. W. and G. Wagner (1992). "Mapping of Spectral Density-Functions Using Heteronuclear NMR Relaxation Measurements." Journal of Magnetic Resonance 98(2): 308-332.
Schramm, V. L. and C. K. Bagdassarian (1999). Deamination of nucleosides and nucleotides and related reactions. Enzymes, Enzyme Mechanisms, Proteins, and Aspects of NO Chemistry. C. D. Poulter. New York, Amsterdam. 5: 71-100.
Sklenak, S., L. S. Yao, et al. (2004). "Catalytic mechanism of yeast cytosine deaminase: An ONIOM computational study." Journal of the American Chemical Society 126(45): 14879-14889.
Snider, M. J., D. Lazarevic, et al. (2002). "Catalysis by entropic effects: The action of cytidine deaminase on 5,6-dihydrocytidine." Biochemistry 41(12): 3925-3930.
Snider, M. J., L. Reinhardt, et al. (2002). "N-15 kinetic isotope effects on uncatalyzed and enzymatic deamination of cytidine." Biochemistry 41(1): 415-421.
Stone, M. J. (2001). "NMR relaxation studies of the role of conformational entropy in protein stability and ligand binding." Accounts of Chemical Research 34(5): 379-388.
Svensson, M., S. Humbel, et al. (1996). "ONIOM: A multilayered integrated MO+MM method for geometry optimizations and single point energy predictions. A test for Diels-Alder reactions and Pt(P(t-Bu)(3))(2)+H-2 oxidative addition." Journal of Physical Chemistry 100(50): 19357-19363.
Torrent, M., T.
Vreven, et al. (2002). "Effects of the protein environment on the structure and energetics of active sites of metalloenzymes. ONIOM study of methane monooxygenase and ribonucleotide reductase." Journal of the American Chemical Society 124(2): 192-193.
Vreven, T., B. Mennucci, et al. (2001). "The ONIOM-PCM method: Combining the hybrid molecular orbital method and the polarizable continuum model for solvation. Application to the geometry and properties of a merocyanine in solution." Journal of Chemical Physics 115(1): 62-72.
Vreven, T. and K. Morokuma (2000). "On the application of the IMOMO (integrated molecular orbital plus molecular orbital) method." Journal of Computational Chemistry 21(16): 1419-1432.
Vreven, T., K. Morokuma, et al. (2003). "Geometry optimization with QM/MM, ONIOM, and other combined methods. I. Microiterations and constraints." Journal of Computational Chemistry 24(6): 760-769.
Wennerström, H. (1972). "Nuclear Magnetic-Relaxation Induced by Chemical Exchange." Molecular Physics 24(1): 69.
Xiang, S., S. A. Short, et al. (1995). "Transition-state selectivity for a single hydroxyl group during catalysis by cytidine deaminase." Biochemistry 34(14): 4516-23.
Xiang, S., S. A. Short, et al. (1996). "Cytidine deaminase complexed to 3-deazacytidine: a "valence buffer" in zinc enzyme catalysis." Biochemistry 35(5): 1335-41.
Xiang, S., S. A. Short, et al. (1997). "The structure of the cytidine deaminase-product complex provides evidence for efficient proton transfer and ground-state destabilization." Biochemistry 36(16): 4768-74.

Appendices

Figure 1-2 Aldolase reaction catalyzed by DHNA (DHNP is converted to HP and glycoaldehyde). A represents acid, B represents base.
Chapter 2

Product Release Is Rate-Limiting in the Activation of the Prodrug 5-Fluorocytosine by Yeast Cytosine Deaminase

Yeast cytosine deaminase (yCD), a zinc metalloenzyme, catalyzes the hydrolytic deamination of cytosine to uracil (Scheme 1A). yCD is of great biomedical interest because it also catalyzes the deamination of the prodrug 5-fluorocytosine (5FC, Scheme 1B), which is one of the most widely used prodrugs for gene-directed enzyme prodrug therapy (GDEPT) for the treatment of cancer (1, 2). The challenge in cancer therapy is to kill tumor cells without damaging normal cells. GDEPT meets the challenge by activating a prodrug in the tumor, thereby minimizing damage to normal tissues (1, 2). In cytosine deaminase-based GDEPT, the prodrug 5FC is converted to 5-fluorouracil (5FU) by the enzyme. 5FU is an anticancer drug used to treat breast, colon, rectal, stomach and pancreatic cancers, and is the drug of choice for treating colorectal carcinoma. However, the drug has high gastrointestinal and hematological toxicities. In contrast, the prodrug 5FC is fairly nontoxic to humans because human cells lack CD activity. By producing 5FU in the tumor, the CD/5FC system minimizes the undesired toxic effects of 5FU. The structure of yCD has recently been determined at high resolution both in the apo form (3) and in complex with the inhibitor 2-pyrimidinone, a reaction intermediate analog (3, 4). The enzyme is homodimeric, and the structure of each monomer consists of a central β-sheet flanked by two α-helices on one side and three α-helices on the other. The homodimeric enzyme contains two active centers, and each active center is composed of residues within a single subunit and a catalytic zinc ion coordinated with a histidine (His62), two cysteines (Cys91 and Cys94), and a water molecule in the substrate-free enzyme, the latter serving as a nucleophile in the deamination reaction.
The inhibitor is bound in a hydrated adduct (4-[R]-hydroxyl-3,4-dihydropyrimidine), and the 4-hydroxyl group is coordinated with the catalytic zinc. The bound inhibitor is completely buried, being covered by a lid composed of Phe114 from the loop between β4 and αD, and Trp152 and Ile156, both from the C-terminal helix. It has been suggested that the C-terminal helix may serve as a "gate" controlling the access to the active center and therefore both substrate binding and product release (4). Surprisingly, the structure of apo yCD is essentially the same as that of the inhibitor complex, with an RMSD of 0.23 Å for the backbone atoms between the two structures. The active center in the apo enzyme is also covered by the same cluster of hydrophobic residues and appears inaccessible to the substrate (3). The crystal structure of the apo enzyme also contains a second zinc ion in the substrate binding pocket. However, the non-catalytic zinc is coordinated with water molecules only and has no direct interactions with the protein. The effects of the non-catalytic zinc on the protein structure are not clear. The yCD-catalyzed reaction is believed to proceed via a tetrahedral intermediate with a conserved glutamate (Glu64) serving as a proton shuttle (3-5). The catalytic apparatus of yCD, including the catalytic zinc, its coordinated residues, and the proton shuttle, is very similar to that of E. coli cytidine deaminase (3, 4), which has been extensively studied (6-8) and is a paradigm for understanding enzymatic catalysis. yCD has emerged as another excellent model system for studying this class of enzymatic reactions and the role of conformational dynamics in enzymatic catalysis, because high resolution structures have been determined (1.14 Å and 1.60 Å for the two structures of the reaction intermediate analog complex (3, 4) and 1.43 Å for the structure without the reaction intermediate analog (3)). Furthermore, yCD is only about half the size of E.
coli cytidine deaminase and is amenable to high resolution NMR analysis, which is currently being carried out in our laboratory. We are interested in elucidating how yCD catalyzes the activation of the prodrug 5FC to the anticancer drug 5FU and the role of conformational dynamics in the catalysis. In this paper, we show by a combination of transient kinetic and NMR studies that product release is rate limiting in the activation of the prodrug 5FC by yCD and may involve multiple steps.

Experimental procedures

Materials. All chemicals and biochemicals were from commercial sources. Cytosine, 5-fluorocytosine, and 5-fluorouracil were purchased from Sigma. [6-3H]-5-fluorocytosine was purchased from Moravek. [15N]-NH4Cl was purchased from Isotec. Restriction enzymes and T4 ligase were purchased from New England Biolabs. Pfu DNA polymerase and the pET-17b vector were purchased from Stratagene and Novagen, respectively.

Cloning. The yCD gene was amplified by PCR from yeast genomic DNA. The primers for PCR were 5'-GGG ATC CAT ATG GCA AGC AAG TGG GAT CAG-3' (forward primer with an Nde I site) and 5'-GGA ATT CTA CTC ACC AAT ATC TTC AAA CC-3' (reverse primer with an EcoR I site). The PCR product was digested with Nde I and EcoR I restriction enzymes and ligated with the vector pET-17b digested with the same restriction enzymes. The ligation mixture was transformed into the E. coli strain DH5α. The correct coding sequence of the cloned yCD gene was verified by DNA sequencing. The DNA of the expression construct (pET17b-yCD) was then transformed into the E. coli strain BL21(DE3)pLysS.

Site-Directed Mutagenesis. The mutant W10H was made by PCR-based site-directed mutagenesis. The forward and reverse primers for the mutagenesis were 5'-CATATGGCAAGCAAGCACGATCAGAAGGGTATGGACATTGCC-3' and 5'-GGCAATGTCCATACCCTTCTGATCGTGCTTGCTTGCCATATG-3'. The mutation was verified by DNA sequencing. The entire gene was sequenced to confirm that there was no unintended mutation.
Expression and Purification. Five ml of LB medium containing 100 µg/ml of ampicillin and 20 µg/ml chloramphenicol was inoculated with a fresh colony of the expression strain BL21(DE3)pLysS containing pET17b-yCD. The culture was grown overnight at 33 °C with vigorous shaking (~200 rpm). The overnight culture was first checked for the expression of yCD by SDS-PAGE and used for seeding a large scale growth. The expression of yCD was induced by the addition of IPTG to a final concentration of 0.5 mM when the OD600 of the culture reached 1.2. The culture was then cooled down to room temperature and grown for six more hours. The E. coli cells were harvested by centrifugation, washed once with buffer A (50 mM potassium phosphate, pH 7.9), and kept at -20 °C until use. The frozen bacterial paste was thawed at room temperature and suspended in 100 ml of pre-cooled buffer A. The cells were disrupted by a French press. The resulting lysate was centrifuged for 20 min at ~27,000 g and 4 °C. Polyethyleneimine was added to the pooled supernatant to a final concentration of 0.1%. The solution was centrifuged immediately after mixing at 15,000 g for 30 min. Ammonium sulfate was added in small portions to the supernatant under constant stirring to 40% saturation. After another hour of stirring, the solution was centrifuged at ~27,000 g for 20 min. The supernatant was loaded onto a phenyl Sepharose column equilibrated with 40% saturation ammonium sulfate in buffer B (50 mM potassium phosphate, pH 7.0). The column was washed with the equilibration solution until the OD230 of the effluent was <0.05 and eluted with a linear ammonium sulfate gradient (40-0% saturation) in buffer B. Fractions containing yCD were identified by SDS-PAGE and concentrated to ~15 ml by an Amicon concentrator using a YM10 membrane. The protein solution was then applied to a Sephadex G-75 column equilibrated with 20 mM Tris-HCl, pH 7.5. The column was developed with the same buffer.
Fractions from the gel filtration column were monitored by OD230 and SDS-PAGE. Pure yCD fractions were pooled and concentrated to 10-20 ml. The concentrated yCD was dialyzed against 2 mM potassium phosphate buffer (pH 7.0) and lyophilized.

15N-Labeling. For 15N-labeling of yCD, the expression strain BL21(DE3)pLysS containing pET17b-yCD was grown in M9 medium with 15NH4Cl as the sole nitrogen source. The cells were grown first at 33 °C and were induced by 0.5 mM IPTG when the culture reached an OD600 of 1.2. The culture was further incubated at room temperature for 10 h. The 15N-labeled yCD was purified by the same procedure as for the unlabeled protein.

5-Fluorotryptophan Labeling. To produce 5-fluorotryptophan labeled yCD, 1 L of M9 medium containing 100 µg/ml of ampicillin and 20 µg/ml chloramphenicol was inoculated with a 5 ml culture of the expression strain BL21(DE3)pLysS containing pET17b-yCD and grown overnight at 33 °C with vigorous shaking. The E. coli cells were centrifuged down, re-suspended in 3 L of M9 medium containing 100 µg/ml of ampicillin and 20 µg/ml chloramphenicol, and grown until the OD600 reached 1.0. 5-Fluorotryptophan was then added to a final concentration of 50 mg/L. The culture was incubated further at 33 °C until the OD600 reached 1.4 and then induced with 0.5 mM IPTG (final concentration). The cells were harvested by centrifugation after 4 h of incubation at room temperature. The labeled protein was purified, and the percentage of 5-fluorotryptophan labeling was determined by mass spectrometry.

pH Indicator Assay. For measuring the steady-state kinetic parameters of 5FC, the reaction mixture contained 1 mM 5FC, 0.09 mM cresol red, and 8.1-12 µg/ml yCD in 100 mM bicine, 100 mM NaCl, pH 7.5. The reaction was initiated by addition of the enzyme at 25 °C and followed by monitoring the absorption of the pH indicator cresol red at 572 nm.
The values of the steady-state kinetic parameters were estimated by numerical analysis of the time course of the reaction using the program DYNAFIT (9). The steady-state kinetic parameters of cytosine were determined by initial velocity analysis. The reaction mixture contained 0.09 mM cresol red, ~0.6 µg/ml yCD, and 0.2, 0.5, 0.8, 1.2, 2.4, 4.8, or 9.6 mM cytosine. The initial rates were analyzed according to the standard Michaelis-Menten equation.

Stopped-Flow Analysis. Stopped-flow experiments were performed in an Applied Photophysics SX.18MV-R stopped-flow spectrofluorometer at 25 °C. One syringe contained yCD, and the other contained the substrate 5FC. Both yCD and 5FC were dissolved in 20 mM sodium phosphate, 150 mM NaCl, pH 7.3. The absorption at 296 nm was monitored. The data were analyzed by nonlinear least squares fitting to an exponential equation using the program Origin (OriginLab).

Quench-Flow Analysis. Quench-flow experiments were carried out with a KinTek RQF-3 rapid quench-flow instrument at 25 °C. All reaction components were dissolved in 20 mM phosphate, 150 mM NaCl, pH 7.3. One syringe contained yCD, and the other contained the substrate 5FC. A trace amount of [6-3H]-5FC was used to follow the reactions. The reactions were quenched with 0.5 M acetic acid, 10 mM 5FC, and 10 mM 5FU. The substrate 5FC and the product 5FU were separated by TLC on an RP-18 F254 aluminium sheet (Merck) developed with acetonitrile and acetic acid at a 50:1 ratio. After the TLC plate was air dried for 10 min, 5FC and 5FU were located under UV light. The fluorescent spots were cut out, soaked in 1 ml of the developing solution in scintillation vials, and shaken for 30 min at room temperature. Radioactivities were measured with a Beckman LS6500 scintillation counter after the addition of 7.5 ml of scintillation fluid to each scintillation vial. In pre-steady-state experiments, the reaction mixture contained 20 µM yCD and various concentrations of 5FC.
In single-turnover experiments, the reaction mixture contained 300 µM yCD and 15 µM 5FU. All concentrations were those after mixing. The pre-steady-state and single-turnover data were first analyzed by nonlinear least squares fitting to appropriate single exponential equations as previously described (10). The amplitudes and rate constants were then used to set the initial values for fitting the data to the complete mechanism by numerical analysis using the program DYNAFIT (9).

1H-15N HSQC NMR Spectroscopy. NMR samples were prepared by dissolving 15N-labeled yCD in 100 mM potassium phosphate, 100 µM NaN3, pH 7.0, made in H2O/2H2O (13/1). DSS (20 µM) was used as an internal standard for chemical shift calibration. The initial protein concentration was 1.5 mM in protomers. The NMR samples containing 12 and 20 mM 5FU were made by adding aliquots of a concentrated 5FU solution. The NMR sample containing 75 mM 5FU was made by adding 5FU powder. The sensitivity-enhanced 1H-15N HSQC spectra were acquired at 25 °C on a Varian INOVA 600 spectrometer. The spectral widths were 9476 and 2400 Hz for the 1H and 15N dimensions, respectively. 160 (t1, 15N dimension) x 1946 (t2, 1H dimension) complex data points were recorded for each spectrum. The number of transients was 32 for each FID with a 1.5 s delay between transients. The NMR data were processed with the program NMRPipe (11).

19F NMR Spectroscopy. 19F NMR experiments were performed on a Varian INOVA 300 or 600 MHz NMR spectrometer. The NMR samples were made with the same buffer as that for the 1H-15N HSQC NMR experiments. The NMR spectra were acquired with a spectral width of 5000 Hz, 5000 data points, and 1024 transients for each spectrum with a 2 s delay between transients. The NMR data were processed with the program VNMR (Varian Associates). The 19F chemical shifts were referenced to CF3C6H5. Saturation transfer experiments were carried out as described by Lian and Roberts (12).
The NMR sample for the saturation transfer experiments contained 1.5 mM yCD labeled with 5-fluorotryptophan and 24.7 mM 5FU. Two complementary saturation transfer experiments were performed, one saturating a 19F NMR signal of free yCD and the other saturating a 19F NMR signal of 5FU-bound yCD. The signals were saturated with a low-power gated decoupling pulse. The pulse power was set to -2 dB on the basis of the results of preliminary tests. Eight spectra were acquired for each saturation transfer experiment, with saturation times of 0, 0.04, 0.06, 0.08, 0.1, 0.12, 0.2, and 0.4 s. The control experiment was performed with the saturation frequency set at one end of the spectrum. The intensity of the monitored peak is described by the following equation:

$I/I_0 = (k/\lambda)\,e^{-\lambda t} + R/\lambda$

where $I$ is the peak intensity of the monitored species when the other species is saturated, $I_0$ is the peak intensity of the monitored species in the control experiment, $k$ is the rate constant for the conversion of the monitored species to the saturated species, $R$ is the longitudinal relaxation rate constant for the monitored species, and $\lambda = k + R$. The data were fitted by nonlinear least-squares regression to an exponential equation using the program Origin:

$I_{rel} = A\,e^{-\lambda t} + C$

where $I_{rel}$ is the peak intensity of the monitored species relative to that of the control experiment. Comparison of the two equations gives $A = k/\lambda$ and $C = R/\lambda$. When the NMR signal of the bound yCD is saturated, the product of $A$ and $\lambda$ provides the value for the dissociation rate constant. When the NMR signal of the free yCD is saturated, the product of $A$ and $\lambda$ is the rate constant for the conversion of the free yCD to the bound yCD. The concentrations of the free enzyme, the bound enzyme, and the free ligand can be calculated from the standard equilibrium relationship.
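As a minimal sketch of the analysis described above, the following Python functions evaluate the saturation transfer equation, recover k and R from the fitted amplitude and decay constant, and derive the free ligand concentration, the association rate constant, and Kd. The function names are ours, and computing the bound enzyme from the ratio of the two exchange rates is our assumption about how the standard equilibrium relationship is applied. The rate constants used in the example (13 and 14 s⁻¹) are those reported in the Results.

```python
import math


def saturation_transfer_intensity(t, k, R):
    """Relative intensity I/I0 after saturating the exchange partner for time t (s)."""
    lam = k + R  # lambda = k + R
    return (k / lam) * math.exp(-lam * t) + R / lam


def rates_from_fit(A, lam):
    """Recover k and R from the fitted amplitude A and decay constant lam (k = A * lam)."""
    k = A * lam
    return k, lam - k


def binding_constants(k_off, k_form, enzyme_total, ligand_total):
    """Return free ligand (mM), association rate constant (mM^-1 s^-1), and Kd (mM).

    k_off  : dissociation rate constant (s^-1), from saturating the bound yCD signal
    k_form : rate of complex formation, k_on * [free ligand] (s^-1)
    Assumption: at equilibrium, k_form * [E free] = k_off * [E bound].
    """
    bound = enzyme_total * k_form / (k_form + k_off)
    free_ligand = ligand_total - bound
    k_on = k_form / free_ligand
    return free_ligand, k_on, k_off / k_on


if __name__ == "__main__":
    free_l, k_on, kd = binding_constants(13.0, 14.0, 1.5, 24.7)
    print(f"free 5FU = {free_l:.1f} mM, k_on = {k_on:.2f} mM^-1 s^-1, Kd = {kd:.0f} mM")
```

With the sample composition used here (1.5 mM yCD and 24.7 mM 5FU), this sketch gives about 24 mM free 5FU, an association rate constant of about 0.6 mM⁻¹s⁻¹, and a Kd of about 22 mM.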
The association rate constant can then be obtained, because the rate constant for the conversion of the free yCD to the bound yCD is the product of the association rate constant and the concentration of free 5FU at equilibrium.

Results

Steady State Kinetic Analysis. CD is traditionally assayed by measuring UV absorbance changes (13). Although the method is convenient for routine measurement of CD activity, it is not accurate for measuring kinetic constants, because both the substrate cytosine and the product uracil have high molar extinction coefficients and similar maximum absorbance wavelengths. For example, for an accurate measurement of the Km value of cytosine (1.1 mM), the substrate concentration must go up to at least 5 mM (~5 times Km), preferably 10 mM (~10 times Km). The absorbance for 5 mM cytosine is 30.5 at the maximum absorbance wavelength (267 nm) and 6.1 at the wavelength (286 nm) normally used for measuring CD activity. To overcome this problem, we developed a pH indicator assay. Although the use of pH indicators to monitor the progress of enzymatic reactions has a long history, the technique has not been used for measuring the kinetics of the CD-catalyzed reaction. The pH indicator assay takes advantage of the high pKa (9.25) of ammonia, so that essentially all ammonia generated by the CD-catalyzed reaction is converted to ammonium at the assay pH (7.5). The resultant pH change causes a change in the absorbance of the pH indicator at 572 nm. The absorbance change is proportional to the ammonia generated by the CD-catalyzed reaction if the pKa of the pH indicator is equal to that of the buffer. Cresol red was used as the pH indicator because its pKa (8.3) matches that of bicine (8.35), which was used for making the assay buffer. Because kinetic parameters change with pH, the pH changes were kept within 0.05 unit.
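To make the numbers above concrete, the short Python sketch below computes the fraction of ammonia that is protonated at the assay pH from the Henderson-Hasselbalch relationship, and the absorbance expected for cytosine at a saturating concentration. The function names and the 1 cm path length are our assumptions; the pKa, pH, and absorbance values are those quoted in the text.

```python
def protonated_fraction(pH, pKa):
    """Fraction of a base present in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def extinction_coefficient(absorbance, conc_molar, path_cm=1.0):
    """Molar extinction coefficient (M^-1 cm^-1) from the Beer-Lambert law."""
    return absorbance / (conc_molar * path_cm)


if __name__ == "__main__":
    # Ammonia (pKa 9.25) at the assay pH of 7.5
    print(f"NH4+ fraction: {protonated_fraction(7.5, 9.25):.3f}")
    # 5 mM cytosine absorbs 30.5 at 267 nm and 6.1 at 286 nm (assumed 1 cm path)
    for wavelength, a in ((267, 30.5), (286, 6.1)):
        eps = extinction_coefficient(a, 5e-3)
        print(f"{wavelength} nm: eps = {eps:.0f} M^-1 cm^-1, A(10 mM) = {eps * 10e-3:.1f}")
```

About 98% of the ammonia is protonated at pH 7.5. Under the assumed path length, the absorbance of 10 mM cytosine would exceed 12 even at 286 nm, which is why direct UV measurement is impractical at saturating substrate concentrations.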
CO2 absorption effects could be safely ignored, because the reaction rates with the amount of enzyme used in the assay were at least an order of magnitude higher than the basal rates obtained without the enzyme (control). We also tested the phenol red (pKa = 7.81) and POPSO (pKa = 7.82) combination, and the kinetic constants obtained by the two systems were the same. We chose the higher pKa system because it gave a higher signal with the same small magnitude of pH change. The buffering capacity was compensated by increasing the concentration of the buffering component. The results of the kinetic measurement by the pH indicator assay are summarized in Table 1. The results showed that yCD has a slightly higher catalytic efficiency (kcat/Km) for the prodrug 5FC than for the natural substrate cytosine. The catalytic efficiency of yCD for 5FC is ~10-fold higher than that of the E. coli enzyme (14), in agreement with the report that yCD is a better enzyme for CD/5FC-based GDEPT (15).

Transient Kinetic Analysis. To determine the rate constants for the individual steps of the prodrug activation, we performed a transient kinetic analysis of the reaction. We first performed a stopped-flow spectrophotometric analysis of the reaction. As shown in Figure 2-1A, there was a burst increase followed by a steady decrease in absorbance. A similar phenomenon was observed for E. coli cytosine deaminase (14). The burst increase in absorbance was attributed to the change in the environment of the substrate upon binding to the enzyme, and the decrease to the conversion of the substrate to the products. The absorbance changes could be phenomenologically described by an exponential and a linear term. The rate constant of the exponential term is linearly dependent on the concentration of the substrate (Figure 2-1B).
The apparent association and dissociation rate constants were estimated to be 0.21 µM⁻¹s⁻¹ and 33 s⁻¹, respectively, which were probably lower limits because the substrate complex was rapidly converted to the product complex, as shown by quench-flow analysis. Because the extinction coefficient of the bound 5FC is slightly higher than that of the free 5FC, it was difficult to estimate how much 5FU was formed in the initial phase of the reaction. We then performed several stopped-flow experiments with a pH indicator, but the data obtained in the first 50 ms were very noisy and unusable. Quench-flow experiments were used instead to obtain definitive kinetic information for the prodrug activation. The results are shown in Figure 2-2. The burst experiments (the top three lines in Figure 2-2) clearly indicated that product release is the rate-limiting step in the activation of the prodrug. A single-turnover experiment, in which the enzyme concentration was larger than that of the substrate, was also performed (the bottom line in Figure 2-2). The data from both burst and single-turnover experiments were analyzed by global fitting according to the following minimal kinetic mechanism (Scheme 2) using the numerical analysis program DYNAFIT (9, 10). The initial values for the global numerical analysis were obtained by nonlinear least-squares fit of the data to appropriate exponential equations. The rate constants obtained by the global numerical analysis are summarized in Table 1. The rate constant for the forward reaction is eight times that of product release and more than ten times kcat. The Km and kcat values calculated from the individual rate constants were very similar to those determined by steady-state kinetic measurements, indicating that the individual rate constants determined by the transient kinetic analysis are consistent with the steady-state kinetic parameters.
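The calculation of kcat and Km from the individual rate constants can be sketched as follows. The expressions are the standard steady-state solution for the three-step mechanism of Scheme 2 (E + S ⇌ ES ⇌ EP → E + P); the function name is ours, and the rate constants are the transient kinetic values as we read them from Table 2-1.

```python
def steady_state_from_rates(k1, k_1, k2, k_2, k3):
    """kcat and Km for E + S <-> ES <-> EP -> E + P.

    k1 is second order (here in uM^-1 s^-1); the other constants are in s^-1,
    so Km is returned in uM.
    """
    denom = k2 + k_2 + k3
    kcat = k2 * k3 / denom
    Km = (k_1 * k_2 + k_1 * k3 + k2 * k3) / (k1 * denom)
    return kcat, Km


if __name__ == "__main__":
    # Transient kinetic constants for 5FC (Table 2-1)
    kcat, Km = steady_state_from_rates(k1=0.48, k_1=93.0, k2=250.0, k_2=91.0, k3=31.0)
    print(f"kcat = {kcat:.0f} s^-1, Km = {Km / 1000.0:.2f} mM")
```

This gives kcat of about 21 s⁻¹ and Km of about 0.11 mM, which match the calculated values listed in Table 2-1 and are close to the steady-state values of 17 s⁻¹ and 0.16 mM.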
The slow product release is probably due to the slow dissociation of 5FU and was confirmed by the NMR analysis of the binding of 5FU to yCD.

NMR Analysis. To confirm that product release is rate-limiting in the prodrug activation, several NMR experiments were performed. First, we acquired 1H-15N HSQC spectra of 15N-labeled yCD in the absence of 5FU and in the presence of 12, 20, and 80 mM 5FU. As shown in Figure 2-3B, many residues had two sets of cross-peaks under unsaturated conditions, indicating that 5FU was in slow exchange with the complex of yCD and 5FU on the NMR time scale and that yCD was not saturated with 5FU. Only one set of HSQC cross-peaks was obtained when 5FU reached ~80 mM (the maximum solubility), indicating that yCD was predominantly in the bound form at this concentration of 5FU (data not shown). The sequential resonance assignment of yCD in complex with the reaction intermediate analog 5-fluoro-2-pyrimidinone has been achieved by 3D double and triple resonance NMR experiments, including 15N-edited 3D NOESY, HNCA, HN(CO)CA, HN(CA)CB, HN(COCA)CB, HNCO, HN(CA)CO, HCCH-TOCSY, C(CO)NH-TOCSY, and H(CCO)NH-TOCSY data (16), using single, double, and triple labeled yCD samples (Yao et al., unpublished). Because most of the cross-peaks in the HSQC spectra are well resolved, many cross-peaks of unliganded yCD (Figure 2-3A) and its complex with 5FU (Figure 2-3B) could be assigned by comparison with the assigned 1H-15N HSQC spectrum of yCD in complex with the reaction intermediate analog. Most of the cross-peaks that moved upon the binding of 5FU belong to residues in the vicinity of the active site. The Kd of the binary product complex was estimated to be ~20 mM, because about equal amounts of yCD were in the free form and the bound form at 20 mM 5FU. Second, we acquired 19F NMR spectra of 5FU free and in complex with yCD (Figure 2-4). The formation of the yCD complex shifted the 19F NMR signal of 5FU by ~4.5 ppm.
The results clearly indicated that the free 5FU is in slow exchange with the yCD and 5FU complex on the 19F NMR timescale. Third, we acquired 19F NMR spectra of yCD labeled with 5-fluorotryptophan in the absence and presence of 5FU (Figure 2-5). The degree of 5-fluorotryptophan labeling was estimated to be ~90% on the basis of the 1H-15N HSQC spectra of the single labeled (15N) and double labeled (15N and 19F) yCD proteins. The kcat and Km values of the 19F-labeled yCD were 15 ± 0.4 s⁻¹ and 80 ± 6 µM, respectively, indicating that the 5-fluorotryptophan labeling has no significant effects on the kinetic properties of yCD. The 19F-labeled yCD showed two 19F NMR peaks, one of which was much sharper than the other. The binding of 5FU shifted the broader NMR peak by ~2 ppm but caused no significant change in the position of the narrower peak (Figure 2-5). The wild-type yCD has two tryptophan residues, Trp10 at the N-terminus (away from the active site) and Trp152 at the active site. The two 19F NMR peaks were assigned on the basis of a comparison with the 19F NMR spectrum of the yCD mutant W10H, which showed only one peak, from Trp152 (data not shown). Surprisingly, the shifted peak belonged to Trp10 at the N-terminus rather than Trp152 at the active site. The result again indicated that 5FU is in slow exchange with its complex with the enzyme on the NMR time scale. Furthermore, the intensities of the free yCD and the bound yCD were about equal in the presence of 24.7 mM 5FU, again indicating that the Kd value for the binary complex is ~20 mM. Fourth, we determined the exchange rate constants by NMR saturation transfer experiments. The results are shown in Figure 2-6. The results were analyzed using a simple one-step binding model. The dissociation rate constant, determined by saturating the 19F NMR signal of the bound yCD, was 13 s⁻¹.
The rate for the formation of the complex, which is the product of the association rate constant and the concentration of free 5FU at equilibrium, was determined by saturating the 19F NMR signal of the free yCD and was 14 s⁻¹. The concentration of the free 5FU was calculated to be 24 mM using the standard equilibrium relationship. The association rate constant was therefore 0.6 mM⁻¹s⁻¹, and the Kd was 22 mM. The results indicated that 5FU is in slow exchange with its yCD complex but has a rather high Kd.

Discussion

yCD is of great biomedical interest because it catalyzes the deamination of the prodrug 5FC to form the anticancer drug 5FU. By a combination of transient kinetic and NMR studies, we have clearly shown that product release is a rate-limiting step in the activation of the prodrug 5FC by yCD. The transient kinetic studies showed that the rate constant of the chemical step for the forward reaction (250 s⁻¹) is ~8 times that of product release (31 s⁻¹) and ~15 times kcat (17 s⁻¹). The transient kinetic results are consistent with those of the steady-state kinetic analysis in the sense that the kcat and Km values calculated from the rate constants determined by the transient kinetic analysis are in close agreement with those measured by the steady-state kinetic analysis. The steady-state kinetic parameters, however, are insufficient for the description of the kinetics of the prodrug activation. The yCD-catalyzed activation of the prodrug 5FC generates two products, 5FU and ammonia. The four NMR experiments described in the Results section clearly indicated that the release of 5FU is rate-limiting in the activation of the prodrug 5FC by yCD. There are two possible causes for the slow release of 5FU by yCD. One involves the breaking of the coordination between 5FU and the catalytic zinc, and the other the opening of the lid that covers the active site.
When 5FC is converted to 5FU at the active site of yCD, the oxygen at position 4 of 5FU is coordinated with the catalytic zinc. Using the ONIOM methodology (17-24), our recent computational study of the deamination of cytosine showed that, energetically, the cleavage of the O4-Zn bond is rather difficult either in the presence or in the absence of ammonia (5). The computational study suggests that uracil is liberated from the zinc by an oxygen exchange mechanism that involves the formation of a gem-diol intermediate from the Zn-bound uracil and a water molecule, the C4-OZn cleavage, and the regeneration of the Zn-coordinated water. The rate-determining step in the oxygen exchange is the formation of the gem-diol intermediate, which is also the rate-determining step for the overall yCD-catalyzed deamination reaction. 5FU is also likely to be completely buried at the active site of yCD. The structure of yCD in complex with the reaction intermediate analog 2-pyrimidinone has recently been determined at high resolution (3, 4). The bound reaction intermediate analog is completely buried, being covered by a lid composed of Phe114, Trp152, and Ile156, the latter two of which are from the C-terminal helix. It is possible that the opening of the lid may also determine the rate of product release in yCD. Furthermore, the structure of apo yCD is essentially the same as that of the reaction intermediate complex (3). The active center is also covered by the same cluster of hydrophobic residues in the apo enzyme and appears inaccessible to the substrate (3). It appears that the opening of the lid is required not only for product release but also for substrate binding. Thus, the release or binding of 5FU may involve multiple steps, as illustrated in Scheme 3.
Here, Eo and Ec represent the open and closed conformations of the unliganded enzyme, respectively; Ec-5FU is a complex in which 5FU is coordinated with the catalytic zinc and yCD is in a closed conformation; Ec.5FU is a complex in which 5FU is not coordinated with the zinc and the enzyme is still in a closed conformation; and Eo.5FU is a complex in which 5FU is not coordinated with the zinc but the enzyme is in an open conformation. The NMR results are interesting, because they not only identified the release of 5FU as a slow step in the deamination of 5FC but also showed that 5FU has a low affinity for yCD (high Kd). In general, slow exchange means tight binding and a low Kd. One explanation for the high Kd is that yCD exists in two conformations, a closed conformation with an inaccessible active center and an open conformation with an accessible active center, as illustrated in Scheme 3. 5FU binds to the open conformation only. The equilibrium is in favor of the closed conformation, which is the most populated and was crystallized (3). The apparent Kd is the reciprocal of the overall equilibrium constant, which is the product of the equilibrium constant for the conformational transition of the unliganded yCD and the equilibrium constant for the binding of 5FU to the open conformation. An equilibrium constant in favor of the closed conformation will raise the apparent Kd by a factor equal to the reciprocal of that equilibrium constant. The unusually low apparent association rate constant may also be attributed to the multiple-step nature of the binding of 5FU to yCD.

yCD belongs to the CDA family of purine/pyrimidine deaminases (3, 4). In addition to yCD and other fungal cytosine deaminases, members of the family include cytidine deaminases, guanine deaminases, and riboflavin biosynthesis enzymes. These enzymes have similar structures with a zinc-containing catalytic apparatus (25).
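The two-conformation explanation for the high apparent Kd given above can be illustrated with a small sketch. It assumes a simplified version of Scheme 3 in which the unliganded enzyme interconverts between closed and open forms and 5FU binds only to the open form; the closed complexes are ignored. The function name and both input values are hypothetical and chosen only for illustration; neither the conformational equilibrium constant nor the intrinsic Kd of the open form was measured in this work.

```python
def apparent_kd(kd_open, k_conf):
    """Apparent Kd when the ligand binds only the open conformation.

    kd_open : intrinsic Kd of the open conformation
    k_conf  : [E open] / [E closed] for the unliganded enzyme
    Exact for this simplified model; it approaches kd_open / k_conf when the
    closed form dominates (k_conf << 1), as stated in the text.
    """
    return kd_open * (1.0 + 1.0 / k_conf)


if __name__ == "__main__":
    # Hypothetical values: 1% open enzyme and an intrinsic Kd of 0.22 mM
    print(f"apparent Kd = {apparent_kd(0.22, 0.01):.1f} mM")
```

In this illustration, a conformational equilibrium that favors the closed form 100 to 1 raises the apparent Kd roughly 100-fold, so a ligand that binds the open form tightly can still show a millimolar apparent Kd.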
The structures of the family of enzymes reported to date are all in a closed conformation. Because these enzymes all have a closed conformation and most likely follow a similar chemical mechanism, product release may also be rate-limiting in the reactions catalyzed by other members of this family of enzymes. However, the lack of a viscosity dependence of the kcat of E. coli cytidine deaminase suggests that product release may not be rate-limiting in the reaction catalyzed by that enzyme (26). On the other hand, full 15N kinetic isotope effects are manifested in the reaction catalyzed by mutants with significantly reduced catalytic efficiencies but not by the wild-type cytidine deaminase (7), indicating that there is another step influencing the rate besides the slow C4-N4 bond cleavage in the deamination of cytidine.

In conclusion, the transient kinetic and NMR data together clearly showed that the release of 5FU is rate-limiting in the activation of the prodrug 5FC by yCD and may involve multiple steps: the interconversion of yCD between a closed and an open conformation, and the cleavage or formation of the coordination between 5FU and the catalytic zinc of yCD.

References

(1) Aghi, M., Hochberg, F., and Breakefield, X. O. (2000) Prodrug activation enzymes in cancer gene therapy. J. Gene Med. 2, 148-164.
(2) Greco, O., and Dachs, G. U. (2001) Gene directed enzyme/prodrug therapy of cancer: Historical appraisal and future prospectives. J. Cell. Physiol. 187, 22-36.
(3) Ireton, G. C., Black, M. E., and Stoddard, B. L. (2003) The 1.14 Å crystal structure of yeast cytosine deaminase: evolution of nucleotide salvage enzymes and implications for genetic chemotherapy. Structure 11, 961-972.
(4) Ko, T.-P., Lin, J.-J., Hu, C.-Y., Hsu, Y.-H., Wang, A. H.-J., and Liaw, S.-H. (2003) Crystal structure of yeast cytosine deaminase. Insights into enzyme mechanism and evolution. J. Biol. Chem. 278, 19111-19117.
(5) Sklenak, S., Yao, L. S., Cukier, R. I., and Yan, H. G. (2004) Catalytic mechanism of yeast cytosine deaminase: An ONIOM computational study. J. Am. Chem. Soc. 126, 14879-14889.
(6) Schramm, V. L., and Bagdassarian, C. K. (1999) Deamination of nucleosides and nucleotides and related reactions, in Enzymes, Enzyme Mechanisms, Proteins, and Aspects of NO Chemistry (Poulter, C. D., Ed.) pp 71-100, Amsterdam, New York.
(7) Snider, M. J., Reinhardt, L., Wolfenden, R., and Cleland, W. W. (2002) 15N kinetic isotope effects on uncatalyzed and enzymatic deamination of cytidine. Biochemistry 41, 415-421.
(8) Snider, M. J., Lazarevic, D., and Wolfenden, R. (2002) Catalysis by entropic effects: The action of cytidine deaminase on 5,6-dihydrocytidine. Biochemistry 41, 3925-3930.
(9) Kuzmic, P. (1996) Program DYNAFIT for the analysis of enzyme kinetic data: Application to HIV proteinase. Anal. Biochem. 237, 260-273.
(10) Li, Y., Gong, Y., Shi, G., Blaszczyk, J., Ji, X., and Yan, H. (2002) Chemical transformation is not rate-limiting in the reaction catalyzed by Escherichia coli 6-hydroxymethyl-7,8-dihydropterin pyrophosphokinase. Biochemistry 41, 8777-8783.
(11) Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277-93.
(12) Lian, L. Y., and Roberts, G. C. K. (1993) Effects of chemical exchange on NMR spectra, in NMR of Macromolecules (Roberts, G. C. K., Ed.), IRL Press, New York.
(13) Ipata, P. L., Marmocchi, F., Magni, G., and Felicioli, R. (1971) Baker's yeast cytosine deaminase. Some properties and allosteric inhibition by nucleosides and nucleotides. Biochemistry 10, 4270-4276.
(14) Porter, D. J. T. (2000) Escherichia coli cytosine deaminase: the kinetics and thermodynamics for binding of cytosine to the apoenzyme and the Zn2+ holoenzyme are similar. Biochim. Biophys. Acta 1476, 239-252.
(15) Kievit, E., Bershad, E., Ng, E., Sethna, P., Dev, I., Lawrence, T. S., and Rehemtulla, A. (1999) Superiority of yeast over bacterial cytosine deaminase for enzyme/prodrug gene therapy in colon cancer xenografts. Cancer Res. 59, 1417-1421.
(16) Gardner, K. H., and Kay, L. E. (1998) The use of 2H, 13C, 15N multidimensional NMR to study the structure and dynamics of proteins. Annu. Rev. Biophys. Biomol. Struct. 27, 357-406.
(17) Dapprich, S., Komaromi, I., Byun, K. S., Morokuma, K., and Frisch, M. J. (1999) A new ONIOM implementation in Gaussian98. Part I. The calculation of energies, gradients, vibrational frequencies and electric field derivatives. J. Mol. Struct. 462, 1-21.
(18) Humbel, S., Sieber, S., and Morokuma, K. (1996) The IMOMO method: Integration of different levels of molecular orbital approximations for geometry optimization of large systems: Test for n-butane conformation and S(N)2 reaction: RCl+Cl. J. Chem. Phys. 105, 1959-1967.
(19) Kuno, M., Hannongbua, S., and Morokuma, K. (2003) Theoretical investigation on nevirapine and HIV-1 reverse transcriptase binding site interaction, based on ONIOM method. Chem. Phys. Lett. 380, 456-463.
(20) Svensson, M., Humbel, S., Froese, R. D. J., Matsubara, T., Sieber, S., and Morokuma, K. (1996) ONIOM: A multilayered integrated MO+MM method for geometry optimizations and single point energy predictions. A test for Diels-Alder reactions and Pt(P(t-Bu)(3))(2)+H-2 oxidative addition. J. Phys. Chem. 100, 19357-19363.
(21) Svensson, M., Humbel, S., and Morokuma, K. (1996) Energetics using the single point IMOMO (integrated molecular orbital plus molecular orbital) calculations: Choices of computational levels and model system. J. Chem. Phys. 105, 3654-3661.
(22) Vreven, T., Mennucci, B., da Silva, C. O., Morokuma, K., and Tomasi, J. (2001) The ONIOM-PCM method: Combining the hybrid molecular orbital method and the polarizable continuum model for solvation. Application to the geometry and properties of a merocyanine in solution. J. Chem. Phys. 115, 62-72.
(23) Vreven, T., and Morokuma, K. (2000) On the application of the IMOMO (integrated molecular orbital plus molecular orbital) method. J. Comput. Chem. 21, 1419-1432.
(24) Vreven, T., Morokuma, K., Farkas, O., Schlegel, H. B., and Frisch, M. J. (2003) Geometry optimization with QM/MM, ONIOM, and other combined methods. I. Microiterations and constraints. J. Comput. Chem. 24, 760-769.
(25) Liaw, S. H., Chang, Y. J., Lai, C. T., Chang, H. C., and Chang, G. G. (2004) Crystal structure of Bacillus subtilis guanine deaminase - The first domain-swapped structure in the cytidine deaminase superfamily. J. Biol. Chem. 279, 35479-35485.
(26) Snider, M. J., Gaunitz, S., Ridgway, C., Short, S. A., and Wolfenden, R. (2000) Temperature effects on the catalytic efficiency, rate enhancement, and transition state affinity of cytidine deaminase, and the thermodynamic consequences for catalysis of removing a substrate "anchor". Biochemistry 39, 9746-9753.

Appendices

Table 2-1 Kinetic Constants for the Activation of the Prodrug 5FC by yCD (a)

Steady State Kinetics
Substrate            Km (mM)        kcat (s⁻¹)    kcat/Km (M⁻¹s⁻¹)
5-Fluorocytosine     0.16 ± 0.01    17 ± 0.4      1.1 × 10⁵
Cytosine             1.1 ± 0.08     91 ± 4        8.2 × 10⁴

Transient Kinetics
k1 (µM⁻¹s⁻¹)    k-1 (s⁻¹)    k2 (s⁻¹)    k-2 (s⁻¹)    k3 (s⁻¹)    Km (mM)     kcat (s⁻¹)
0.48 ± 0.04     93 ± 20      250 ± 30    91 ± 20      31 ± 2      0.11 (b)    21 (b)

(a) The steady-state kinetic parameters for cytosine are presented for comparison. (b) The Km and kcat were calculated from the individual rate constants.

Scheme 1 (A) Cytosine + H2O → Uracil + NH3. (B) 5-Fluorocytosine + H2O → 5-Fluorouracil + NH3.

Scheme 2 E + S ⇌ ES ⇌ EP → E + P (k1 and k-1 for the first step, k2 and k-2 for the second step, and k3 for the third step)

Scheme 3 Ec-5FU ⇌ Ec.5FU ⇌ Eo.5FU ⇌ Eo ⇌ Ec

Figure 2-1 Stopped-flow analysis of the activation of the prodrug 5FC by yCD. (A) The time course of the reaction as monitored by OD296. The reaction mixture contained 10 µM yCD and 250 µM 5FC.
The solid line was obtained by nonlinear least-squares fit to an equation with an exponential and a linear term. (B) The linear dependence of the rate constant of the exponential term on the concentration of 5FC.

[Figure 2-2 plot. Legend: yCD = 20 µM, 5-FC = 200 µM; yCD = 20 µM, 5-FC = 300 µM; yCD = 20 µM, 5-FC = 500 µM; yCD = 300 µM, 5-FC = 15 µM. Horizontal axis: Time (s).]

Figure 2-2 Quench-flow analysis of the activation of the prodrug 5FC by yCD. The solid lines were obtained by global fitting using the numerical analysis program DYNAFIT.

[Figure 2-3 plots, panels A and B. Axes: 1H (ppm) and 15N (ppm).]

Figure 2-3 1H-15N HSQC spectra of yCD. Sequential assignments are indicated with residue numbers. (A) 1.5 mM yCD; (B) 1.5 mM yCD + 12 mM 5FU.

[Figure 2-4 plot. Labels: free 5FU; 5 mM 5FU + 0 mM yCD. Horizontal axis: -166.5 to -170.5 ppm.]

Figure 2-4 19F NMR spectra of free (unbound) 5FU and the yCD-bound 5FU. The smaller peaks near the strong peak of free 5FU in the upper spectrum are 13C satellite peaks.

[Figure 2-5 plot. Labels: W152; W10; W10 (bound); 1.5 mM yCD + 0 mM 5FU; 1.5 mM yCD + 24.7 mM 5FU. Horizontal axis: -122.5 to -126.5 ppm.]

Figure 2-5 19F NMR spectra of yCD labeled with 5-fluorotryptophan. (A) 1.5 mM 5-fluorotryptophan-labeled yCD; (B) 1.5 mM 5-fluorotryptophan-labeled yCD + 24.7 mM 5FU.
[Figure 2-6 plots, panels A and B. Axes: Relative Intensity versus Time (s).]

Figure 2-6 NMR saturation transfer analysis of the binding of 5FU to the 5-fluorotryptophan-labeled yCD. The NMR sample contained 1.5 mM 5-fluorotryptophan-labeled yCD and 24.7 mM 5FU. The data in panels A and B were obtained by irradiating the 19F NMR nuclei of the bound and free yCD, respectively. The solid lines were obtained by nonlinear least-squares fit of the data as described in the "Experimental Procedures" section.

Chapter 3

Yeast Cytosine Deaminase Dynamics Study by NMR and Molecular Dynamics Simulation

Introduction

Yeast cytosine deaminase (yCD) catalyzes the deamination of cytosine to uracil. It can also catalyze the deamination of the prodrug 5-fluorocytosine (5-FC) to the anticancer drug 5-fluorouracil (5-FU). Thus, a potential system for gene-directed enzyme prodrug therapy combines yCD and the prodrug 5-FC. X-ray structures of yCD in the apo form (Ireton et al. 2003) and in complex with the potent inhibitor 2-pyrimidinone (2Py) (Ireton et al. 2003; Ko et al. 2003) are available. yCD is a homodimer with one active site in each protomer. The yCD protomer contains a central β sheet (β1~β5, parallel to each other) with two α helices (α1 and α5) on one side and four α helices (α2, α3, α4 and α6) on the other side. Each active center contains a single catalytic zinc ion that is tetrahedrally coordinated by a histidine (His62), two cysteines (Cys91 and Cys94), and a water molecule in the substrate-free enzyme or the inhibitor in the complex. Surprisingly, the structure of apo yCD is essentially the same as that of the transition state analog complex, with the active site completely buried and an average root mean square deviation (RMSD) of 0.23 Å for all backbone atoms. The immediate question is whether the dynamics are also the same, given that the two structures are almost identical.
In this paper, we explore the dynamic properties of the apo and inhibitor-bound forms by using an NMR backbone relaxation method (Abragam 1961; Fischer et al. 1998; Ishima and Torchia 2000; Palmer 2001), which primarily describes motion on the picosecond (ps) to nanosecond (ns) timescale. The inhibitor we use is 5-fluoro-2-pyrimidinone (5FPy), which is a transition state analog for the reaction from 5FC to 5FU. Model-free analysis was used to extract the order parameter S² (Lipari and Szabo 1982; Lipari and Szabo 1982; Mandel et al. 1995) describing the motion of each backbone NH vector. MD simulation can provide great detail concerning individual atom motions as a function of time (Karplus and McCammon 2002). Previous MD simulations of the apo form and of the reactant cytosine, various intermediate, and product uracil complexes (Yao et al. 2005) all showed a buried active site during the ~2.0 ns simulations, with the overall protein quite rigid. In this study, we performed three 10 ns MD simulations, for the apo form, the inhibitor (2Py) bound form, and the product (uracil) bound form, to look for possible differences in dynamics. Since these MD simulations also describe motion on a ps to ns timescale, the computational results can provide information about protein dynamics on this timescale that is complementary to the NMR experimental data. The order parameter and internal motion correlation time were calculated for each amide NH vector. Though direct comparison of order parameters from MD and model-free analysis is not straightforward because of the limitations of both methods (Korzhnev et al. 2001; Case 2002), the order parameter changes of yCD due to inhibitor binding should be comparable. As is well known, the order parameter is defined by (Lipari and Szabo 1982)

$S^2 = \sum_{q=-2}^{2} \left| \langle Y_{2q}(\theta,\phi) \rangle \right|^2$ (1)

in which $Y_{2q}(\theta,\phi)$ is a modified second-order spherical harmonic function, defined by

$Y_{20}(\theta,\phi) = (3\cos^2\theta - 1)/2$, $Y_{21}(\theta,\phi) = \sqrt{3/2}\,\sin\theta\cos\theta\,e^{i\phi}$, $Y_{22}(\theta,\phi) = \sqrt{3/8}\,\sin^2\theta\,e^{2i\phi}$, $Y_{2,-q}(\theta,\phi) = Y_{2q}^{*}(\theta,\phi)$ (2)

Essentially, the order parameter S² describes the fluctuation of the angular coordinates of the NH vector. A small order parameter indicates a large angular fluctuation, from a more flexible NH vector, and usually arises from a larger translational motion of the nitrogen atom. The translational fluctuations of the nitrogen can be characterized by the root mean square fluctuation (RMSF) of N. But, since the lower bound of the order parameter is 0 and the upper bound of the RMSF could be several angstroms on the ps-ns timescale, the question is whether there is a quantitative correlation between RMSF and order parameter over this wide range. Also, one may question whether the correlation is different in different secondary structure regions. In addition, how sensitive is the order parameter to protein motion? More importantly, does it follow that a residue with a smaller order parameter is really the residue with larger flexibility? These questions will be addressed in this study.

One advantage of an MD simulation is that the motion of each atom in the simulation system is determined. In the apo and 2Py bound form X-ray structures (Ireton et al. 2003; Ko et al. 2003), three residues, F114, W152 and I156, block the active site. It has been proposed that one possible product release path lies between the F114 loop (111-117) and the C-terminal helix (150-158), where two residues, F114 and I156, were identified that must move away in order to let the ligand out (Yao et al. 2006). The side-chain dynamics of F114, W152 and I156 should be interesting in this regard, and will be discussed in the Results section.

Materials and Methods

Construction, expression and purification of 15N-labeled yCD were as described previously (Yao et al. 2005). The NMR sample of the apo (5FPy bound) form contained 1.5 mM 15N-labeled yCD (with 20 mM 5FPy) in 100 mM phosphate buffer made in 93% H2O/7% D2O.
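For the MD side of the comparison, the order parameter defined in the Introduction can be evaluated directly from the sampled orientations of an NH vector. The sketch below uses the Cartesian form of S², which is algebraically equivalent to the sum over the second-order spherical harmonics; the function name is ours, and the code assumes that overall rotation of the protein has already been removed from the trajectory (for example by superposition on a reference structure), which is our assumption rather than a description of the procedure used here.

```python
import math


def order_parameter(vectors):
    """S^2 of a bond vector from its sampled orientations.

    vectors: iterable of (x, y, z) NH bond vectors in a molecule-fixed frame,
    i.e. after overall rotation has been removed from the trajectory.
    """
    sums = [0.0] * 6  # running sums of xx, yy, zz, xy, xz, yz
    count = 0
    for x, y, z in vectors:
        norm = math.sqrt(x * x + y * y + z * z)
        x, y, z = x / norm, y / norm, z / norm  # use unit vectors
        for i, value in enumerate((x * x, y * y, z * z, x * y, x * z, y * z)):
            sums[i] += value
        count += 1
    xx, yy, zz, xy, xz, yz = (s / count for s in sums)
    return 1.5 * (xx**2 + yy**2 + zz**2 + 2.0 * (xy**2 + xz**2 + yz**2)) - 0.5


if __name__ == "__main__":
    rigid = [(0.0, 0.0, 1.0)] * 100  # a vector that never moves
    print(order_parameter(rigid))
```

A completely rigid vector gives S² = 1, and a vector that samples all orientations uniformly gives S² = 0, which are the two limits discussed above.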
NMR spectroscopy was performed on a 600 MHz Varian Inova spectrometer at 25 °C. The pulse sequences for the R1, R2 and NOE measurements are based on the sensitivity-enhanced gradient-based HSQC experiment (Farrow et al. 1994). For the R1 and R2 measurements, a recycle delay of 2.0 s was used between transients. For the NOE measurements, a recycle delay of 8 s was used for the non-NOE reference, and 4 s was used for the NOE measurement. A train of 120° hard pulses at 5 ms intervals with a sinc soft pulse inserted was used for 4 s of proton saturation, with minimal perturbation of the water magnetization found during saturation. The R1 (R2) experiments were performed using 16 (32) transients; 64 transients were used for the NOE measurements. For the R1 measurements, 9 spectra were recorded using delays of 11.1, 55.5, 133.2, 233.1, 377.4, 555.0, 888.0, 1332.0 and 1998.0 milliseconds (ms). For the R2 measurements, delays of 11.77, 23.53, 35.30, 47.07, 58.84, 70.60, 82.37, 109.8 and 137.3 ms were used. The spectral widths were 9476 and 2040 Hz for the 1H and 15N dimensions, respectively; 128 (t1, 15N dimension) and 1946 (t2, 1H dimension) complex data points were recorded for each spectrum.

The NMR data were processed with NMRPipe (Delaglio et al. 1995). The peak heights were measured using NMRView (Johnson and Blevins 1994). The R1 and R2 relaxation rates were obtained by fitting the peak intensities to a two-parameter exponential decay function using Curvefit (A. G. Palmer III, Columbia University), and uncertainties in the fitted parameters were estimated using jackknife simulations (Eisenmann et al. 2004). The average error for R1 and R2 is 2.6% and 3.6% for the 5FPy bound form and 4.5% and 4.7% for the apo form. It has been proposed that fitting peak volumes gives more accurate results than fitting peak heights, because the routine that searches for the maximum within a peak overestimates the intensity, especially for weak peaks.
Weak peaks usually occur at the longer delay times, so this bias could lead to underestimated R1 and R2 rates in the exponential fitting (Viles et al. 2001). The disadvantage of the peak volume method, however, is that it is difficult to quantify the volume of overlapped peaks. For well-separated peaks in the 5FPy bound form, peak volumes were also used for the R1 and R2 determinations. The average difference for R1 and R2 is 1.6% and 3.2%, respectively, which is within the error of R1 and R2 from peak height fitting. Peak height fitting should therefore give sufficiently accurate rates, and it was used for the data analysis. The NOE enhancement was calculated as the ratio between the cross-peak intensities with and without proton saturation. The errors in the NOE values were propagated from the uncertainty of the peak height based on the RMS noise in the spectra (Nicholson et al. 1992). The average percentage error of the NOE is 5.0% for the 5FPy bound form and 5.0% for the apo form.

Model-free calculations were performed using the ModelFree program provided by Prof. Arthur G. Palmer (Palmer et al. 1991). The global tumbling time and the anisotropy of the rotational diffusion tensor were obtained from R2/R1 ratios using the program TENSOR2 (Dosset et al. 2000). Residues with NOEs smaller than 0.6 or with

$$\frac{\langle T_2 \rangle - T_{2,n}}{\langle T_2 \rangle} - \frac{\langle T_1 \rangle - T_{1,n}}{\langle T_1 \rangle} > 1.5\,SD \qquad (3)$$

were filtered out. Then residues with

$$\left| \langle R_2/R_1 \rangle - R_{2,n}/R_{1,n} \right| > 1.5\,SD \qquad (4)$$

were also filtered out. The subscript n denotes residue n and SD stands for standard deviation. The calculated principal components of the diffusion tensor, Dz : Dy : Dx, are 1.14 : 1.07 : 1.00 for the 5FPy bound form and 1.09 : 1.05 : 1.00 for the apo form, based on the relaxation data. The motional anisotropy of yCD is thus very small and can be neglected (Korzhnev et al. 2001). Therefore, isotropic global motion was adopted in the dynamics analysis. The global tumbling time is 18.8 ns and 19.5 ns for the 5FPy bound and apo forms, respectively.
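The residue filter built from the NOE cutoff and equations (3) and (4) can be sketched in a few lines. This is only an illustration under stated assumptions, not the TENSOR2 procedure itself: the function name and argument layout are hypothetical, the averages and the SD in equation (3) are assumed to be taken over all input residues, those in equation (4) over the residues that survive the first step, and SD is assumed to be the population standard deviation of the quantity being tested.

```python
from statistics import mean, pstdev


def select_residues_for_tensor_fit(t1, t2, noe, noe_min=0.6, n_sd=1.5):
    """Return residue numbers that pass the NOE cutoff and equations (3), (4).

    t1, t2, noe: dicts mapping residue number -> T1 (s), T2 (s), NOE.
    (Hypothetical helper; the averaging sets are assumptions, see lead-in.)
    """
    residues = sorted(t1)

    # Equation (3): a relative shortening of T2 that is not matched by T1
    # points to conformational exchange.
    mean_t1 = mean(t1[r] for r in residues)
    mean_t2 = mean(t2[r] for r in residues)
    d3 = {r: (mean_t2 - t2[r]) / mean_t2 - (mean_t1 - t1[r]) / mean_t1
          for r in residues}
    sd3 = pstdev(list(d3.values()))
    kept = [r for r in residues if noe[r] >= noe_min and d3[r] <= n_sd * sd3]

    # Equation (4): R2/R1 far from the mean ratio (R2/R1 equals T1/T2).
    ratio = {r: t1[r] / t2[r] for r in kept}
    mean_ratio = mean(ratio.values())
    sd4 = pstdev(list(ratio.values()))
    return [r for r in kept if abs(mean_ratio - ratio[r]) <= n_sd * sd4]
```

Only the residues returned by such a filter would then enter the R2/R1 based estimate of the diffusion tensor.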
Five models with various combinations of model-free parameters were used in the model-free analysis (Mandel et al. 1995): model 1 {S²}, model 2 {S², τe}, model 3 {S², Rex}, model 4 {S², τe, Rex} and model 5 {Sf², S², τe}. Akaike's information criterion (AIC) was used as the model selection method (Akaike 1973; d'Auvergne and Gooley 2003). AIC is computed as χ² + 2k, where χ² is the fitting error and k is the number of parameters in the selected model (d'Auvergne and Gooley 2003; Chen et al. 2004). All five models can be compared simultaneously using AIC (Chen et al. 2004; Ding et al. 2005), and the model with the lowest AIC is selected. Since AIC does not provide information on the quality of the fit, 500 Monte Carlo simulations were used to determine whether the selected model is sufficient to describe the data for models 1, 2 and 3. If χ² < χ²(0.1), where 0.1 is the pre-chosen critical value, the model was considered to be sufficient. For models 4 and 5, only χ² equal to 0 was considered to be sufficient. In the model-free analysis, τe was allowed to range up to 5 ns in models 2, 4 and 5.

A reliable estimate of the overall rotational correlation time is very important for the analysis of internal motion. Equations (3) and (4) are intended to filter out the residues with conformational exchange on the micro- to millisecond timescale and/or large τe, which gives a good initial guess of the global tumbling time. The global tumbling time was then optimized during the model-free analysis of local dynamics using an iterative scheme. The tumbling time was scanned from -0.5 ns to +0.5 ns around the initial value in steps of 0.1 ns, with the model-free analysis performed at each step. The sum of AIC over all spin systems at each step was compared and the lowest one was selected, which gives the best global tumbling time.
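The selection logic described above (lowest AIC per spin system, then lowest summed AIC over the tumbling-time grid) can be sketched as follows. This is a minimal illustration, not the ModelFree program: `fit_all_models` is a hypothetical callback standing in for the actual model-free fit, and the parameter counts per model simply follow the five parameter sets listed in the text.

```python
# Number of fitted parameters k for models 1-5 as listed in the text.
N_PARAMS = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3}


def aic(chi2, k):
    """Akaike's information criterion as used in the text: chi^2 + 2k."""
    return chi2 + 2 * k


def select_model(chi2_by_model):
    """Return the model number with the lowest AIC for one spin system."""
    return min(chi2_by_model, key=lambda m: aic(chi2_by_model[m], N_PARAMS[m]))


def scan_tumbling_time(fit_all_models, tau_init, half_width=0.5, step=0.1):
    """Scan the global tumbling time (ns) around tau_init.

    fit_all_models(tau_m) is a hypothetical callback that returns, for each
    spin system, a dict {model number: chi^2} fitted at that tumbling time.
    The tumbling time with the lowest AIC summed over spin systems is returned.
    """
    n_steps = round(half_width / step)
    best_tau, best_total = None, float("inf")
    for i in range(-n_steps, n_steps + 1):
        tau_m = tau_init + i * step
        total = sum(
            min(aic(chi2, N_PARAMS[m]) for m, chi2 in per_residue.items())
            for per_residue in fit_all_models(tau_m)
        )
        if total < best_total:
            best_tau, best_total = tau_m, total
    return best_tau
```

Because the penalty 2k is added per spin system, a model with an extra parameter is chosen only when it lowers χ² by more than 2.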
The final global tumbling time is 18.6 ns and 19.6 ns for the 5FPy bound and apo forms, respectively, quite similar to the initial values, indicating that equations (3) and (4) are effective in removing residues with large τe and/or Rex for yCD. Three residues, 77, 154 and 156, cannot be fitted in the apo form; two residues, 38 and 85, cannot be fitted in the 5FPy bound form.

MD simulation

To explore the dynamic properties of yCD in the apo and bound forms, 10 ns MD simulations were performed for the apo form, the inhibitor (2Py) complex and the product (uracil) complex. The inhibitor used in the simulation was 2Py, slightly different from the 5FPy used in the experiments. The only difference is that the hydrogen at the 5 position of the pyrimidine ring is substituted by fluorine in 5FPy. Since there is no hydrogen bond donor close to the fluorine, replacing the fluorine with hydrogen in the simulation should have little effect on the dynamical properties of the protein. The Zn complex and ligand charge parameterization have been described previously (Yao et al. 2005), along with the details of the simulation protocol. All three simulations were performed at a constant temperature of 300 K and at constant volume. The coordinates were saved every 2 ps. The first 2 ns of simulation time were treated as the equilibration period and were not included in the data analysis.

According to the original model-free formalism (Lipari and Szabo 1982), the internal dynamics of an amide NH vector can be characterized with the correlation function

$$C(t) = \left\langle P_2\!\left( \boldsymbol{\mu}(0) \cdot \boldsymbol{\mu}(t) \right) \right\rangle = S^2 + (1 - S^2)\, e^{-t/\tau_e} \qquad (5)$$

where $\boldsymbol{\mu}$ is the unit NH vector and $P_2$ is the second Legendre polynomial. The correlation function is parameterized with the order parameter $S^2 = C(\infty)$ and the internal motion time constant $\tau_e$, defined by

$$\tau_e = \frac{1}{C(0) - C(\infty)} \int_0^{\infty} \left( C(t) - C(\infty) \right) dt \qquad (6)$$

The correlation function was calculated only up to 4 ns to ensure adequate statistics. S² values were obtained by averaging C(t) over the last 500 ps (Chen et al. 2004); τe values were obtained based on equation (6).
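A minimal sketch of how equations (5) and (6) can be evaluated from a trajectory of unit NH vectors is given below. It is an illustration, not the analysis code used for this work: the function names are hypothetical, the vectors are assumed to be already normalized and expressed after removal of overall rotation, the integral in equation (6) is approximated by the trapezoid rule over the computed lag range, and C(∞) is approximated by the tail average described in the text.

```python
import math


def p2(x):
    """Second Legendre polynomial."""
    return 1.5 * x * x - 0.5


def correlation_function(unit_vectors, max_lag):
    """C(t) of equation (5) for lags 0..max_lag, averaged over time origins."""
    n = len(unit_vectors)
    c = []
    for lag in range(max_lag + 1):
        terms = []
        for i in range(n - lag):
            a, b = unit_vectors[i], unit_vectors[i + lag]
            terms.append(p2(a[0] * b[0] + a[1] * b[1] + a[2] * b[2]))
        c.append(math.fsum(terms) / (n - lag))
    return c


def order_parameter_and_tau(c, dt, tail):
    """S^2 as the mean of the last `tail` points of C(t); tau_e from equation (6).

    dt is the frame spacing (for example 2 ps); tau_e is returned in the same unit.
    """
    s2 = math.fsum(c[-tail:]) / tail
    amplitude = c[0] - s2
    if amplitude < 1e-12:  # rigid vector: no internal decay to integrate
        return s2, 0.0
    decay = [value - s2 for value in c]
    # Trapezoid rule for the integral of C(t) - C(infinity).
    integral = dt * (math.fsum(decay) - 0.5 * (decay[0] + decay[-1]))
    return s2, integral / amplitude
```

With 2 ps frames, a 4 ns correlation function has 2001 points and the last 500 ps correspond to `tail=250`.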
Since yCD is a homodimer, an estimate of the error for S² and τe can be computed from the differences between the two protomers.

Results

15N relaxation in the apo and 5FPy bound forms

15N NMR relaxation parameters of the yCD apo and 5FPy bound forms were determined at 298 K at 600 MHz (Figures 3-1, 3-2). 123 residue peaks in the apo form and 136 residue peaks in the bound form were resolved well enough for quantitative analysis. Backbone NOEs are sensitive to motion in the picosecond to nanosecond time range, with lower NOEs indicating greater motions on this timescale. The average NOE for the apo form is 0.79 ± 0.06, slightly lower than that for the 5FPy bound form, 0.81 ± 0.05. The theoretical NOE value at 600 MHz is 0.83 for a completely rigid isotropic protein with a global tumbling time of 19 ns. The high NOE values indicate that the protein is quite rigid in both the apo and 5FPy bound forms. Though the average NOEs are similar, some regions show differences between the apo and 5FPy bound forms. The NOEs of K115 and S116 from the F114 loop, which covers the active site, are both 0.66 in the apo form, significantly smaller than the 0.84 and 0.73 found in the bound form. The average NOE of the C-terminal helix (150~158), another region covering the active site, is 0.70 ± 0.09 in the apo form, smaller than the 0.82 ± 0.04 found in the 5FPy bound form. This suggests that these two regions undergo motions on the ps to ns timescale in the apo form, but not in the 5FPy bound form.

Model-free dynamics from 15N relaxation

The average S² is 0.93 ± 0.05 in the apo form, comparable to 0.91 ± 0.05 in the 5FPy bound form, indicating that the overall protein is quite rigid in both forms. Most of the residues in both forms have S² values larger than 0.8 (Figure 3-2). In some regions, the apo form S² values appear to be smaller than those of the bound form, but the difference is rather small. For example, the average S² of the F114 loop (111-117) is 0.87 ± 0.06 in the apo form and 0.90 ± 0.04 in the bound form.
The C-terminal helix shows no apparent differences in S², but larger τe values were observed in the apo form, consistent with the NOE data (Figure 3-2). In the apo form there are 41 residues with τe larger than 0, among which 18 residues have τe larger than 100 ps, compared with 37 residues with τe larger than 0, of which 6 residues have τe larger than 100 ps, in the 5FPy bound form. Therefore more residues in the apo form undergo motion on the picosecond timescale. It appears that more residues in the 5FPy bound form have conformational exchange on the μs-ms timescale than in the apo form. 17 residues have Rex, with an average of 3.96 s⁻¹, in the apo form, compared to 49 residues with an average Rex of 2.84 s⁻¹ in the 5FPy bound form. Though more residues in the 5FPy bound form require an Rex term in the model-free analysis, only 6 residues have Rex greater than 4.0 s⁻¹, comparable to 8 residues in the apo form.

Protein dynamics from MD simulation

yCD appears to be quite stable in the 10 ns simulations, with the CA root mean square deviation (RMSD) from the crystal structure less than 1.5 Å in the apo, inhibitor 2Py bound and product uracil bound forms (data not shown). The root mean square fluctuations (RMSF) of the backbone CAs were calculated for the apo form, as shown in Figure 3-3a. Most of the flexible residues are from regions between an α helix and a β sheet. It is interesting that α4, α5 and α6 are more flexible than the other α helices and the β sheet, which will be discussed later. The pattern of RMSFs is similar to the X-ray B-factors. But residues 72~81 and 111~117 are significantly more flexible in the MD simulation than indicated by the corresponding X-ray B-factors (Figure 3-3a). The flexibility of these two regions is consistent with the experimental S² values in the apo form (Figure 3-2a). The former region, a small helix between α2 and β3, is in contact with the C-terminal helix (150-158) of the adjacent protomer, and R73 forms a salt bridge with E154 (Figure 3-4).
The latter region, the F114 loop between β4 and α4 covering the active site, is quite rigid based on the X-ray B-factors, but rather flexible in the MD simulation. The upper panel of Figure 3-3a shows the order parameters calculated from the apo form simulation, which correlate well, qualitatively, with the RMSFs. For most of the residues S² is well defined, though for some residues with low S² values the error is quite large. Seven residues, 59, 74, 75, 81, 82, 116 and 117, have S² errors larger than 0.1; with the exception of residue 82, they are from flexible regions. All these residues also have quite large τe values (~1000 ps) with large errors, as shown in Figure 3-5. Simulations of 10 ns may not be long enough to determine the order parameters of these residues, which are likely to have longer timescale motions. The average τe is 1240 ± 224 ps for residues 72~81 and 1139 ± 539 ps for residues 111~117, significantly larger than the average for the whole protein of 396 ± 439 ps. So, the increased flexibility of residues 72~81 and 111~117 in the MD simulation, compared with the X-ray B-factors, comes from ns timescale motion.

The binding of 2Py does not change the overall flexibility of the protein, as shown in Figure 3-3b. It appears that residues 72~81 and 111~117 are slightly more rigid than in the apo form, except for K115, which seems to be rigid in one protomer but flexible in the other. The average RMSF for the former is 0.72 Å in the 2Py bound form, slightly smaller than the 0.85 Å in the apo form. The average RMSF for the latter is 0.92 Å in the bound form compared with 1.21 Å in the apo form. The rigidity increase in these two regions is confirmed by the calculated order parameters S² and τe (Figures 3-3b, 3-5). The average order parameter of residues 72~81 (111~117) is 0.78 (0.64) in the 2Py bound form, which is larger than the 0.65 (0.49) found in the apo form.
The average τe values of residues 72~81 and 111~117 are 449 ps and 561 ps in the 2Py bound form, about 2 times smaller than in the apo form. The experimental order parameters also imply a flexibility decrease of the F114 loop (111~117), but the difference in residues 72~81 is rather small (except for residue 74: S² is 0.79 ± 0.03 in the apo and 0.83 ± 0.03 in the bound form) when comparing the bound and apo forms (Figure 3-2).

The RMSF of the C-terminal helix displays a sharp decrease at D155 due to the hydrogen bond interaction between the carboxyl group of D155 and 2Py, while in the apo form no such pattern was observed (Figure 3-3). The order parameter of D155 is 0.86 ± 0.01 in the 5FPy bound form, slightly larger than the 0.80 ± 0.03 in the apo form, while τe is ~260 ± 160 ps in the 2Py bound form, about 3 times smaller than that in the apo form (~750 ± 537 ps). In both forms, I156, G157 and E158 are quite flexible.

The RMSFs of the uracil bound form are shown in Figure 3-3c and are similar to those of the apo and 2Py bound forms. Residues 72~81 and the F114 loop (111~117) are also slightly rigidified compared to the apo form. The former has an average RMSF of 0.69 Å, S² of 0.78 and τe of 641 ps, while the latter has an average RMSF of 0.80 Å, S² of 0.64 and τe of 813 ps.

The side-chain dynamics of F114, W152 and I156

It has been observed that the yCD active site is blocked by the side-chains of F114, W152 and I156 in two crystal structures (Ireton et al. 2003; Ko et al. 2003). The dynamics of these three side-chains should be interesting because they may contribute to ligand binding or release. Two side-chain dihedral angles, ω1 and ω2, were defined to study the side-chain motion, as shown in Figure 3-6. ω1 describes the rotation of the side-chain around the Cα-Cβ axis, and ω2 describes the rotation of the side-chain around the Cβ-Cγ axis. Figure 3-7a shows the ω1, ω2 plot along the MD trajectory of the apo form.
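For reference, a dihedral angle such as ω1 or ω2 can be computed from the coordinates of the four atoms that define it. The sketch below is an assumption-laden illustration rather than the analysis script used here: the function names are hypothetical, the atom choice for each angle follows Figure 3-6 and is not encoded in the code, and the result is mapped onto 0° to 360° to match the range in which the angles are quoted in the text.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle (degrees, in [0, 360)) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x)) % 360.0


def symmetric_ring_angle(omega2_deg):
    """Fold omega2 of a twofold-symmetric ring (e.g. Phe) onto [0, 180)."""
    return omega2_deg % 180.0
```

Folding with `symmetric_ring_angle` makes the equivalence of ring-flipped states explicit: values of 250° and 70° map to the same angle.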
It appears that there are two conformations of the F114 side-chain, one major conformation with ω1 ~190° and ω2 ~250°, and one minor conformation with ω1 ~210° and ω2 ~70°. The 180° difference between the two ω2 values corresponds to a flip of the phenyl ring, which gives the same structure since the phenyl ring is axially symmetric. The ω1 average is 187° ± 15.6°, consistent with the corresponding X-ray structure angle of 174° ± 1.4° (average over the two protomers). Even though the fluctuation appears to be small, ω1 can span approximately 80°, from 150° to 230° (Figure 3-7a). The ω2 value of F114 in the apo yCD X-ray structure is 82° ± 2.9°, falling into the sampling region of the MD simulation. After the binding of the inhibitor, the flip of the phenyl ring is more frequent, because the two groups of dots (ω1 and ω2 snapshot values) are more evenly distributed than in the apo form (Figure 3-7b). The ω1 average is 184° ± 13.3°, close to that of the apo form with slightly smaller fluctuation. This value is also consistent with the angle of 174° ± 1.4° from the X-ray structure of the inhibitor bound form. While the binding of the inhibitor seems not to change the conformational states, but only their distribution, the binding of the product introduces a new state with ω1 ~290° and ω2 ~275° (Figure 3-7c). The new state is visited with a probability similar to that of the other two states found in the MD simulation. So, the product bound form appears to explore the largest dihedral space of the F114 side-chain.

Figure 3-8 shows the W152 ω1, ω2 dihedral angles of the apo form of yCD. Surprisingly, there are also two states, one with ω1 ~280°, ω2 ~70°, and another with ω1 ~250°, ω2 ~175°. It is interesting that the X-ray structure W152 values, ω1 ~165°, ω2 ~235°, are not included in these two states, but further investigation suggests that the X-ray conformation is quite similar to the second one.
The binding of the inhibitor or the product does not change the shape of the two regions in Figure 3-8 (data not shown), suggesting that the accessible dihedral space of W152 is similar in all three forms.

The side-chain ω1, ω2 plot for I156 is shown in Figure 3-9, with two major conformations captured in the apo form MD simulation. The first conformation has ω1 ~60°, ω2 ~75°; the second one has ω1 ~60°, ω2 ~175°. The corresponding X-ray structure dihedral angles are ω1 58.4°, ω2 175.1°, which fall into the second conformation. The difference between the two conformations mainly comes from ω2, which is the rotation of Cδ about the Cβ-Cγ1 axis (there are two Cγ atoms in I156). The binding of the inhibitor introduces a third state with ω1 ~60°, ω2 ~280°, a new rotamer around the Cβ-Cγ1 axis. The binding of the product shows a side-chain dihedral plot similar to that of the inhibitor bound form, with the third state slightly more sparsely populated.

In summary, it appears that the side-chains of F114, W152 and I156 are quite flexible in the apo, inhibitor bound and product bound forms. The flip of the F114 phenyl ring is observed in all three forms, and the binding of the product introduces a new state with ω1 ~290° and ω2 ~275° (Figure 3-7c). There are two conformations of the W152 side-chain in all three forms, two conformations of I156 in the apo form, and three conformations of I156 in the inhibitor and product bound forms. The representative side-chain conformations of F114, W152 and I156 are plotted in Figure 3-10, where the backbone yCD coordinates are taken from the X-ray structure of the inhibitor bound form.

The correlation between MD S² and RMSF

1). The α helix and β sheet regions

As shown in Figure 3-3, the order parameter correlates well, qualitatively, with the RMSF in the apo, 2Py bound and uracil bound forms. Residues with larger RMSF usually have smaller S² values.
But, since the minimum of S² is zero while the maximum of an RMSF can be quite large, this correlation may not be linear for residues with small S². To examine the relationship between S² and RMSF, they were plotted against each other, as shown in Figure 3-11. All α helix and β sheet residues are plotted in Figure 3-11a, while the rest of the protein residues are plotted in Figure 3-11b (residues 156, 157 and 158 were included in Figure 3-11b since they do not form a strict α helix in the X-ray structure). All three forms, apo, 2Py complex and uracil complex, were included in Figure 3-11, and order parameters with errors larger than 0.1 were removed. As shown in Figure 3-11a, high S² usually corresponds to small RMSF, but residues with low S² have quite scattered RMSFs. It is also interesting that α helix residues have more flexibility than β sheet residues for the same order parameter. For example, for a residue with S² of 0.85, the RMSF of an α helix residue can be 0.5 Å to 1.0 Å, while that of a β sheet residue only varies from 0.3 Å to 0.5 Å. Since the β sheet forms the core of yCD, it is not a surprise that the β sheet might be more rigid than the α helices.

Some α helices, such as α4, α5 and α6, seem to be more flexible than the other α helices and the β sheet regions, based on RMSFs (Figure 3-3). The average RMSF of α4, α5 and α6 (not including residues 156, 157 and 158) is 0.72 Å, compared with 0.41 Å for the other secondary structure regions. But the order parameters are quite similar, with an average of 0.85 for the former and 0.87 for the latter. As is well known, an order parameter describes the angular fluctuation of the backbone NH vector, while the backbone nitrogen RMSF describes the translational motion of the N atom, which is essentially the same as the backbone CA atom RMSF in yCD (data not shown). If a group of residues undergoes a translational motion without rotating the NH vector, the CA RMSF will be large but the order parameter will be 1.0.
In this case, the order parameter underestimates the motion of the residues, which seems to be the case in α4, α5 and α6 for the apo, 2Py bound and uracil bound forms. To test whether this is true, the overall translational motion of the helices was investigated for the apo form. The center of mass of the backbone heavy atoms (CA, CO, N, O) of each helix was calculated, and the fluctuation of the center of mass is shown in Table 3-1. The RMSF of α1, α2 and α3 is ~0.3 Å, while the RMSF of α4, α5 and α6 is two times larger, ~0.6 Å. This shows that the fluctuation difference between α1, α2, α3 and α4, α5, α6 comes from the overall motion of the helices rather than from local motion of individual residues within them.

Two approximate boundaries can be drawn for the α helix and for the β sheet residues in the RMSF-S² plot. The absolute values of the slopes of the two boundaries are 0.91 Å and 3.2 Å for the β sheet and 1.7 Å and 9.2 Å for the α helix. The smaller slopes in the β sheet region suggest that, generally, the order parameter is more sensitive to motion in the β sheet and less sensitive to motion in an α helix.

2). The other regions

Figure 3-11b shows the RMSF-S² plot for regions other than the α helices and β sheet. A cone-like graph, with rigid residues on the lower right side and flexible residues on the upper left side, can be seen, similar to the α helix and β sheet regions. No one-to-one relationship between S² and RMSF is evident. For example, a residue with S² of 0.80 can have an RMSF ranging from 0.4 Å to 1.4 Å. On the other hand, a residue with an RMSF of 0.8 Å can have an order parameter spanning from 0.5 to 0.85. As discussed above, S² and RMSF describe different kinds of motion. For a system composed of only one free NH vector in space, the translational motion of N is completely independent of the rotation of the vector, which means that the RMSF and S² are independent.
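The independence of RMSF and S² for a free NH vector can be checked numerically with two limiting cases. The sketch below is only an illustration with hypothetical function names; it uses the Cartesian form of the order parameter, S² = (3/2)[⟨x²⟩² + ⟨y²⟩² + ⟨z²⟩² + 2⟨xy⟩² + 2⟨xz⟩² + 2⟨yz⟩²] − 1/2, which is equivalent to the spherical harmonic definition for unit vectors, and it assumes equally weighted snapshots.

```python
import math


def rmsf(positions):
    """Root mean square fluctuation of an atom about its mean position."""
    n = len(positions)
    center = [math.fsum(p[i] for p in positions) / n for i in range(3)]
    msd = math.fsum(
        sum((p[i] - center[i]) ** 2 for i in range(3)) for p in positions
    ) / n
    return math.sqrt(msd)


def order_parameter(unit_vectors):
    """S^2 of an ensemble of unit vectors from the Cartesian second moments."""
    n = len(unit_vectors)

    def avg(i, j):
        return math.fsum(v[i] * v[j] for v in unit_vectors) / n

    diagonal = avg(0, 0) ** 2 + avg(1, 1) ** 2 + avg(2, 2) ** 2
    off_diagonal = avg(0, 1) ** 2 + avg(0, 2) ** 2 + avg(1, 2) ** 2
    return 1.5 * (diagonal + 2.0 * off_diagonal) - 0.5


# Case 1: the N atom translates by 1 angstrom either way along x while the
# NH vector keeps pointing along z: RMSF is large, S^2 stays at 1.
n_translating = [(-1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
nh_fixed = [(0.0, 0.0, 1.0)] * 3

# Case 2: the N atom does not move while the NH vector jumps between z and x
# with equal populations: RMSF is 0, S^2 drops to 0.25.
n_fixed = [(0.0, 0.0, 0.0)] * 2
nh_jumping = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
```

The first case mirrors the rigid translation of a whole helix discussed above, where a large CA or N RMSF leaves the order parameter untouched.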
But in a protein, due to the sp2 hybridization of the backbone amide N, CA-CO-N-H has to maintain a trigonal planar geometry, which means that the angular coordinates of the NH vector depend on the orientation of the peptide plane (for simplicity of the argument, the bending of NH away from the trigonal planar geometry is not considered). The orientation of a peptide plane is usually restrained by its position in the structure. The relationship between orientation and position might be very complicated, and dependent on the local environment as well as on the whole protein structure. Larger position fluctuations allow a wider range of orientation fluctuations, which is consistent with the cone-like plot of order parameter versus RMSF for yCD. The absolute values of the slopes of the two boundaries are 1.3 Å and 5.7 Å, respectively. The large range between these two boundaries complicates the interpretation of an order parameter. For example, a residue with S² of 0.8 could be more flexible than a residue with S² of 0.5, as shown in Figure 3-11b. On the other hand, as discussed above, residues 72~81 and 111~117 show an increase of rigidity based on both RMSF and S² values, due to the binding of 2Py. So, for the same region in different forms (apo, bound, etc.), the flexibility change can be accurately detected by the order parameter.

Discussion

The crystal structures of yCD in the apo and inhibitor 2Py bound forms are essentially the same, with the active site completely buried. Our NMR relaxation study shows that both forms are quite rigid. The F114 loop, which covers the active site, appears to be more flexible in the apo form, based on order parameters. The C-terminal helix undergoes more ps-ns timescale motion in the apo form than in the bound form, based on the NOE experiments.
The flexibility change is consistent with our H/D exchange results, which showed that the F114 loop and C-terminal helix are more flexible in the apo form, even though these two kinds of experiments study motions on completely different timescales (manuscript in preparation). The relaxation measurement mainly studies motion on the ps-ns timescale, while an H/D exchange measurement studies motion on a second to minute timescale.

The backbone flexibility change in the F114 loop was also confirmed by the MD simulation study. The average RMSF over the CA atoms in this region is about 0.3 Å greater in the apo form than in the inhibitor bound form. The analysis of the order parameter and correlation time τe from the simulation provides a consistent conclusion. On inhibitor binding, the average S² of the backbone NH vectors in the F114 loop increases by ~0.15 while τe decreases about 2-fold. Another region, residues 72~81 (a loop), which interacts with the C-terminal helix of the adjacent protomer, also shows a decrease of flexibility due to the binding of the inhibitor. The binding of the product uracil produces similar backbone flexibility changes in these two regions.

It has been proposed that the motion of the F114 loop and C-terminal helix is important for ligand release. In the X-ray structure of yCD, the side-chains of three residues, F114, W152 and I156, block the active site. It appears that these side-chains are quite flexible in the apo, 2Py bound and uracil bound forms. The flip of the phenyl ring of F114 was observed in all three forms. A new conformation appears after the binding of uracil but not 2Py. The large span of ~80° for ω1 in the apo form can also rotate the F114 side-chain away from the crystal conformation. There are two conformations of the W152 side-chain in all three forms, and two (three) conformations of I156 in the apo (bound) form. Together with the motion of the backbone of the F114 loop and the C-terminal helix, the ligand release path between these two regions seems to be feasible.
The backbone order parameter S² has been widely used in NMR experiments to describe the flexibility of residues. But, since S² is related to the angular flexibility of the NH vector rather than being a direct reflection of the RMSF of the backbone N (or CA), how S² and RMSF are correlated is of great interest. It appears that, qualitatively, they correlate quite well, with smaller S² usually corresponding to larger RMSF. But there is no quantitative correlation between them. For example, the average RMSF of α4, α5 and α6 is about double that of the other secondary structure regions, but the average order parameter is nearly the same. Further investigation suggests that α4, α5 and α6 undergo center of mass translational motion two times larger than that of the other helices. The order parameter fails to describe the translational motion of the whole helix, since an individual NH vector does not change its orientation in this process. It is also interesting that the order parameter seems to be more sensitive to motion in the β sheet than in an α helix. In the regions other than the α helices and β sheet, the plot of RMSF versus S² has a cone shape, with the tip corresponding to large order parameter and small RMSF. A residue with small S² can have quite different RMSF values, while a residue with large RMSF can have quite different S² values. On the other hand, in the different forms of yCD that we studied, an increase of RMSF corresponds to a decrease of order parameter, such as for the F114 loop. So, the order parameter appears to be more accurate in describing the backbone flexibility difference of the same region in different protein forms, but less accurate in describing the flexibility of different regions in the same protein form.

References

Abragam, A. (1961). The Principles of Nuclear Magnetism.

Akaike, H. (1973). "Information Theory and an Extension of the Maximum Likelihood Principle." Proceedings of the 2nd International Symposium on Information Theory: 267-281.

Case, D. A. (2002). "Molecular dynamics and NMR spin relaxation in proteins." Accounts of Chemical Research 35(6): 325-331.

Chen, J. H., C. L. Brooks, et al. (2004). "Model-free analysis of protein dynamics: assessment of accuracy and model selection protocols based on molecular dynamics simulation." Journal of Biomolecular NMR 29(3): 243-257.

d'Auvergne, E. J. and P. R. Gooley (2003). "The use of model selection in the model-free analysis of protein dynamics." Journal of Biomolecular NMR 25(1): 25-39.

Delaglio, F., S. Grzesiek, et al. (1995). "NMRPipe - a Multidimensional Spectral Processing System Based on Unix Pipes." Journal of Biomolecular NMR 6(3): 277-293.

Ding, Z. F., G. Lee, et al. (2005). "PhosphoThr peptide binding globally rigidifies much of the FHA domain from Arabidopsis receptor kinase-associated protein phosphatase." Biochemistry 44(30): 10119-10134.

Dosset, P., J. C. Hus, et al. (2000). "Efficient analysis of macromolecular rotational diffusion from heteronuclear relaxation data." Journal of Biomolecular NMR 16(1): 23-28.

Eisenmann, A., P. Neudecker, et al. (2004). "Treatment of peak intensity uncertainties in NMR relaxation data analysis can lead to severe artifacts." Monatshefte fur Chemie 135(9): 1089-1099.

Farrow, N. A., R. Muhandiram, et al. (1994). "Backbone Dynamics of a Free and a Phosphopeptide-Complexed Src Homology-2 Domain Studied by N-15 NMR Relaxation." Biochemistry 33(19): 5984-6003.

Fischer, M. W. F., A. Majumdar, et al. (1998). "Protein NMR relaxation: theory, applications and outlook." Progress in Nuclear Magnetic Resonance Spectroscopy 33(4): 207-272.

Ireton, G. C., M. E. Black, et al. (2003). "The 1.14 angstrom crystal structure of yeast cytosine deaminase: Evolution of nucleotide salvage enzymes and implications for genetic chemotherapy." Structure 11(8): 961-972.

Ishima, R. and D. A. Torchia (2000). "Protein dynamics from NMR." Nature Structural Biology 7(9): 740-743.

Johnson, B. A. and R. A. Blevins (1994). "NMR View - a Computer-Program for the Visualization and Analysis of NMR Data." Journal of Biomolecular NMR 4(5): 603-614.

Karplus, M. and J. A. McCammon (2002). "Molecular dynamics simulations of biomolecules." Nature Structural Biology 9(9): 646-652.

Ko, T. P., J. J. Lin, et al. (2003). "Crystal structure of yeast cytosine deaminase - Insights into enzyme mechanism and evolution." Journal of Biological Chemistry 278(21): 19111-19117.

Korzhnev, D. M., M. Billeter, et al. (2001). "NMR studies of Brownian tumbling and internal motions in proteins." Progress in Nuclear Magnetic Resonance Spectroscopy 38(3): 197-266.

Lipari, G. and A. Szabo (1982). "Model-Free Approach to the Interpretation of Nuclear Magnetic-Resonance Relaxation in Macromolecules .1. Theory and Range of Validity." Journal of the American Chemical Society 104(17): 4546-4559.

Lipari, G. and A. Szabo (1982). "Model-Free Approach to the Interpretation of Nuclear Magnetic-Resonance Relaxation in Macromolecules .2. Analysis of Experimental Results." Journal of the American Chemical Society 104(17): 4559-4570.

Mandel, A. M., M. Akke, et al. (1995). "Backbone Dynamics of Escherichia-Coli Ribonuclease HI - Correlations with Structure and Function in an Active Enzyme." Journal of Molecular Biology 246(1): 144-163.

Nicholson, L. K., L. E. Kay, et al. (1992). "Dynamics of Methyl-Groups in Proteins as Studied by Proton-Detected C-13 NMR-Spectroscopy - Application to the Leucine Residues of Staphylococcal Nuclease." Biochemistry 31(23): 5253-5263.

Palmer, A. G. (2001). "NMR probes of molecular dynamics: Overview and comparison with other techniques."
AnnuaL Review of Biophysics and Biomolecular Structure 30: 129-155. Palmer, A. G., M. Rance, et al. (1991). "Intramolecular Motions of a Zinc Finger DNA- Binding Domain from Xfin Characterized by Proton-Detected Natural Abundance C-12 Heteronuclear Nrnr-Spectroscopy." Journal of the American Chemicg Society 113(12): 4371-4380. 88 Viles, J. H., B. M. Duggan, et al. (2001). "Potential bias in NMR relaxation data introduced by peak intensity analysis and curve fitting methods." Journal of Biomolecular Nmr 21(1): 1-9. Yao, L. S., Y. Li, et al. (2005). "Product release is rate-limiting in the activation of the prodrug S-fluorocytosine by yeast cytosine deaminase." Biochemistry 44(15): 5940- 5947. Yao, L. S., S. Sklenak, et al. (2005). "A molecular dynamics exploration of the catalytic mechanism of yeast cytosine deaminase." Journal of Physical Chemistry B 109(15): 7500-7510. Yao, L. S., H. G. Yan, et al. (2006). "A molecular dynamics study of the ligand release path in yeast Cytosine Deaminase (submitted)." 89 Appendices Table 3-1 The RMSF of center of mass of various helixes in the apo form a1 a2 a3 a4 05 a6 RMSF(A) 0.281001 0.301001 0.27:0 0.641002 0.571005 0.591003 9O 1.0 .............................. 222112132 at . 1,1111. 121.112.11.11. , 2 80:71” WHM’ ’ 33”” i,”’i i z 06-: § 1% 0.5- .1 351:0fi20fabj4o ‘5To'5'0W1’0’9Tof10011012013014ofi1T50160 30: $2 32519133 "iii“! {fifi’i’h’ E33 512,151 131?: :95?! "3’35””: $20-13 15’ .............................. ’ 10 2943? 40 . sjofi 6'0. 70 80 90 .100 110 120 130140150160 0.8-1% }& T i 1 i 1 i ii“, £51501}? , ’i’iifi (113,232.21, l’i’égifl :06 fi 2 (r ‘ _ _ _ _ _ _ _ §__4 04 a1 [31 ()2 a2 :33 a4 [35 a5 ’1'0'2'0 . 3’0 ’10 ' 50 Y 610’ 70 ' 8'0 . 9b ’1oofi11071207150'1io'150ai160 residue (a) 1.0 , . . . Q}. 08 oggggfég 51% £18" figégfig gé§é3§§’fiféé%§: gig-£3336 iégig o 07 0.6 0.5 Q 351:0'f2:o . 30 . 4:0 . so . ego r70 . 8:0.00f100‘1'10fi120'130r1'4or1‘50'160 ” a gzsiéal’giiiafiggji has 1.. fair.- 3 “an in. .1 in. 1510:2? 
Figure 3-1 15N relaxation data of yCD in the (a) apo form and (b) inhibitor 5FPy bound form, measured at 600 MHz and plotted vs residue.

Figure 3-2 Model-free dynamics results for the (a) apo form and (b) inhibitor 5FPy bound form of yCD from NMR 15N relaxation data.
Figure 3-3 RMSF and order parameter of the (a) apo form, (b) inhibitor 2Py bound form, and (c) product bound form, plotted vs residue. The X-ray B-factors of the apo and 2Py bound forms were converted and plotted in red.

Figure 3-4 X-ray structure of yCD complexed with 2Py. Three regions, (1) 111-117, (2) 150-158 and (3) 72*-81* (from the adjacent protomer), are highlighted in gold. Three residues that block the active site, F114, W152 and I156, are also shown.

Figure 3-5 τe values calculated from the MD simulations for the apo form, the inhibitor 2Py complex and the product complex, plotted vs residue.

Figure 3-6 The two sidechain dihedral angles (ω1 and ω2) defined in the MD data analysis for (a) F114, (b) W152, (c) I156.
Figure 3-7 F114 sidechain dihedral angles ω1 and ω2 of the (a) apo form, (b) inhibitor 2Py bound form, (c) product uracil bound form in the 10 ns MD simulation.

Figure 3-8 The sidechain dihedral angles ω1 and ω2 of W152 in the apo form.

Figure 3-9 I156 sidechain dihedral angles ω1 and ω2 of the (a) apo form, (b) inhibitor 2Py bound form, (c) product uracil bound form in the 10 ns MD simulation.

Figure 3-11 RMSF vs S2 plot of (a) the α helix and β sheet residues, (b) residues in other regions.

Chapter 4

The Dynamics Changes of yCD along the Reaction Cycle Revealed by HD Exchange

Introduction

Yeast cytosine deaminase (yCD) catalyzes the deamination of cytosine to uracil, and it can also catalyze the deamination of the prodrug 5-fluorocytosine (5-FC) to the anticancer drug 5-fluorouracil (5-FU). A reaction mechanism was proposed (Sklenak et al. 2004) in which cytosine (or 5-FC) is deaminated to uracil (or 5-FU) through a tetrahedral intermediate (Figure 4-1). High-resolution structures of yCD in the apo form (Ireton et al. 2003) and in complex with the inhibitor 2-pyrimidinone (2Py) (Ireton et al. 2003; Ko et al. 2003), a potent intermediate analog, have been published. Surprisingly, the structure of apo yCD is essentially the same as that of the 2Py complex, with the active site buried.
A series of molecular dynamics (MD) simulations were performed along the catalytic cycle, and they also show a buried active site (Yao et al. 2005). In order to bind the reactant or release the product, certain residues have to move. One possible ligand release path was found by using the steered MD method (manuscript in preparation). The ligand is released along a path between the C-terminal helix (150-158) and the F114 loop (111-117). Certain residues in these two regions, including F114 and I156, are crucial for ligand release. Recent measurements have shown that product release is the rate-limiting step during the activation of 5-FC (Yao et al. 2005), and that 5-FU is in slow exchange with its bound form, with an off-rate of ~13 s⁻¹. Interestingly, the binding of 5-FU is quite weak, with a Kd of only about 25 mM. On the basis of ONIOM calculations, it was proposed that the slow release is caused by a high energy barrier for cleavage of the bond between O4 of 5-FU and the catalytic Zn (Sklenak et al. 2004). But our experimental results suggest that the proposed mechanism occurs at a much slower rate than product release (unpublished data). An alternative mechanism of Zn-O4 bond cleavage was later demonstrated by combining MD and ONIOM, which shows that the barrier is rather small, ~3 kcal/mol. So it is more likely that the barrier comes from the rearrangement of residues in the active site and along the product release path, since the active site is completely buried. Understanding the product release or reactant binding process can help us elucidate the protein dynamics and its function and improve the enzyme's catalytic efficiency, which is the essential goal of studying this enzyme.

Exchange of backbone NH hydrogens with solvent hydrogens monitors the motion of the protein at each of the numerous NH sites along the protein backbone. A general two-process model (Loh et al.
1996; Qian and Chan 1999) was proposed to describe hydrogen exchange. It includes "local opening", a small-scale conformational fluctuation in the native state, and "global unfolding", a large-scale conformational change from the native state to the unfolded state. The most slowly exchanging NH groups in native proteins appear to exchange through global unfolding, while the more rapidly exchanging NH groups exchange through local opening (Arrington and Robertson 2000). In this paper we focus on NHs whose exchange appears to be dominated by local openings; the protein β sheet core shows no hydrogen exchange on the experimental time scale (see results below).

Proton exchange from protein amides to solvent is generally described by a two-state mechanism:

$$\mathrm{NH(closed)} \underset{k_{cl}}{\overset{k_{op}}{\rightleftharpoons}} \mathrm{NH(open)} \xrightarrow{k_{rc}} \mathrm{exchanged}$$

The apparent rate of exchange is

$$k_{ex} = \frac{k_{op} k_{rc}}{k_{cl} + k_{rc} + k_{op}} \qquad (1)$$

Furthermore, $k_{op}$ is generally much smaller than $k_{cl}$, because the protein is in the native state. Equation 1 can be simplified to

$$k_{ex} = \frac{k_{op} k_{rc}}{k_{cl} + k_{rc}} \qquad (2)$$

Equation 2 can be further simplified in two limiting cases. The first is called EX1 exchange, where $k_{cl} \ll k_{rc}$; equation 2 then reduces to

$$k_{ex} = k_{op} \qquad (3)$$

which permits direct determination of the rate of opening. The second is EX2 exchange, where $k_{cl} \gg k_{rc}$; equation 2 can then be written as

$$k_{ex} = \frac{k_{op} k_{rc}}{k_{cl}} = k_{rc} K_{op} \qquad (4)$$

where $K_{op} = k_{op}/k_{cl}$ is the equilibrium constant for the opening reaction. Therefore, an equilibrium constant can be measured from EX2. $k_{rc}$ can be calculated by the method of Bai et al. (Bai et al. 1993).
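The limiting behavior of the two-state scheme can be checked numerically. The following is a minimal Python sketch, not part of the original analysis; the rate constants are hypothetical values chosen only to place the system clearly in the EX1 or the EX2 regime.

```python
def k_ex_full(k_op, k_cl, k_rc):
    """Apparent exchange rate of the two-state scheme (equation 1)."""
    return k_op * k_rc / (k_cl + k_rc + k_op)


def k_ex_native(k_op, k_cl, k_rc):
    """Equation 2: equation 1 with k_op neglected in the denominator."""
    return k_op * k_rc / (k_cl + k_rc)


def k_ex_ex1(k_op):
    """EX1 limit (k_cl << k_rc): exchange reports the opening rate (equation 3)."""
    return k_op


def k_ex_ex2(k_op, k_cl, k_rc):
    """EX2 limit (k_cl >> k_rc): k_ex = K_op * k_rc (equation 4)."""
    return (k_op / k_cl) * k_rc


# Hypothetical rate constants in s^-1, for illustration only.
cases = {
    "EX1-like": dict(k_op=1e-3, k_cl=1.0, k_rc=1e3),
    "EX2-like": dict(k_op=1e-3, k_cl=1e3, k_rc=1.0),
}

for label, r in cases.items():
    print(label,
          "eq1:", k_ex_full(**r),
          "eq2:", k_ex_native(**r),
          "EX1:", k_ex_ex1(r["k_op"]),
          "EX2:", k_ex_ex2(**r))
```

In the first case equation 2 agrees with the EX1 expression to within about 0.1%, and in the second case it agrees with the EX2 expression to the same degree, which is the sense in which equations 3 and 4 are limits of equation 2.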
In this work we performed an HD exchange study on the apo enzyme, the 5FPy (5-fluoro-2-pyrimidinone, intermediate analog) complex and the 5FU (5-fluorouracil, product) complex at pH ≥ 7.0, where exchange is mainly catalyzed by base (Arrington and Robertson 2000). Under EX2 conditions the observed exchange rates are strongly pH dependent, whereas under EX1 conditions the observed exchange rate is independent of pH (assuming $k_{op}$ and $k_{cl}$ are pH independent). Thus, by varying pH, EX1 and EX2 can be distinguished.

Materials and Methods

Construction, expression and purification of yCD and 15N labeled yCD have been described (Yao et al. 2005). For the 2H, 15N, 13C labeled protein, the expression strain BL21(DE3)pLysS containing pET17b-yCD was grown in M9 medium (70% D2O) with 15NH4Cl and 13C-glucose as the sole nitrogen and carbon sources, respectively. The procedure for expression and purification is the same as that for the unlabeled protein.

NMR assignments of the 5FPy bound form were made using the spectra of 1.8 mM 2H, 13C and 15N triple labeled yCD or 15N labeled yCD in 100 mM potassium phosphate, 100 µM NaN3, 20 µM DSS, pH 7.0, made with 93% H2O/7% D2O. Sequential assignments of the 5FPy bound form were based on the NOESY-HSQC spectrum and confirmed by the HNCACB experiment. Sequential assignments of the apo and 5FU bound forms were based on NOESY-HSQC spectra. A mixing time of 70 ms was used for all three NOESY-HSQC spectra (15N single labeled samples).

HD exchange samples were prepared by dissolving 15N labeled yCD, with or without ligand, in 100 mM potassium phosphate buffer of pre-adjusted pH. The concentrations of 5FPy and 5FU were 80 mM and 20 mM, respectively. The protein solutions were then lyophilized, and the lyophilized samples were dissolved in D2O. The final concentration of yCD was about 1.5 mM. Sample pH was measured after the experiments, and the values reported in this paper are not corrected for the isotope effect.
HD exchange rates were measured for individual NH groups by sensitivity-enhanced 1H-15N HSQC experiments performed on a 600 MHz Varian Inova spectrometer at 25 °C. The experimental dead time (the time between the start of dissolving yCD in D2O and the start of HSQC acquisition) was ~5 min. For the first 2 hours, HSQC spectra were acquired continuously. For the next 3 hours, the interval between two consecutive HSQC spectra was lengthened to 30 min. Thereafter, the interval was set to 1 hour and then 2 hours. The experimental times for the apo, 5FU and 5FPy bound forms were ~1 day, 3 days and 3 days, respectively. Longer experimental times did not appear to increase the number of exchanged residues. The exchange experiments were performed for (1) the apo form at pH 7.16 and 8.05; (2) the 5FPy bound form at pH 7.10, 7.62 and 8.50; (3) the 5FU bound form at pH 7.10 and 7.50. The spectral widths were 9476 and 1900 Hz for the 1H and 15N dimensions, respectively; 32 (t1, 15N dimension) and 1946 (t2, 1H dimension) complex data points were recorded for each spectrum. The number of transients was 4 for each FID, with a 1.0 s delay between transients. Each spectrum takes ~5 min. The NMR data were processed with NMRPipe (Delaglio et al. 1995). The peak heights were measured by using NMRView (Johnson and Blevins 1994). A three-parameter exponential equation was used for data fitting:

$$I = a \exp(-t/T_{ex}) + c \qquad (6)$$

where $I$ is the peak height, $T_{ex} = 1/k_{ex}$ is the apparent exchange time constant, and $a$ and $c$ are the other two fitting parameters. Origin was used for data fitting, and the error of $T_{ex}$ was estimated from the quality of the fit.

A two-nanosecond molecular dynamics (MD) simulation of apo yCD was performed with explicit solvent water molecules. The details of the parameterization of the Zn-bound complex and of the MD simulation are described elsewhere (Yao et al. 2005). The first 500 ps were discarded as the equilibration period, and the remaining 1.5 ns were used for data analysis in this paper.
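As an illustration of the three-parameter exponential fit of peak height against time described above, the sketch below uses only the Python standard library. It is not the Origin procedure used in this work: it swaps in a simple separable least-squares search, in which a and c are solved linearly for each trial value of the exchange time and the exchange time itself is found on a successively refined logarithmic grid. The time points and peak heights are synthetic, noise-free values that loosely follow the acquisition schedule; the function names are hypothetical.

```python
import math


def solve_amplitudes(times, heights, t_ex):
    """For a fixed T_ex, find a and c by ordinary linear least squares."""
    basis = [math.exp(-t / t_ex) for t in times]
    n = len(times)
    mean_b = sum(basis) / n
    mean_h = sum(heights) / n
    s_bb = sum((b - mean_b) ** 2 for b in basis)
    if s_bb == 0.0:
        a = 0.0  # the decay is invisible at this T_ex; fall back to a constant
    else:
        a = sum((b - mean_b) * (h - mean_h)
                for b, h in zip(basis, heights)) / s_bb
    c = mean_h - a * mean_b
    sse = sum((a * b + c - h) ** 2 for b, h in zip(basis, heights))
    return a, c, sse


def fit_exchange_decay(times, heights, t_min=1.0, t_max=1.0e5,
                       steps=200, rounds=5):
    """Fit I = a*exp(-t/T_ex) + c by scanning T_ex on a refined log grid."""
    log_lo, log_hi = math.log(t_min), math.log(t_max)
    for _ in range(rounds):
        step = (log_hi - log_lo) / steps
        grid = [math.exp(log_lo + k * step) for k in range(steps + 1)]
        t_best = min(grid,
                     key=lambda t: solve_amplitudes(times, heights, t)[2])
        # Narrow the search window to one grid step either side of the best T_ex.
        log_lo, log_hi = math.log(t_best) - step, math.log(t_best) + step
    a, c, sse = solve_amplitudes(times, heights, t_best)
    return a, t_best, c, sse


# Synthetic time points in minutes: every 5 min for 2 h, every 30 min for
# the next 3 h, then hourly up to 1 day (illustrative only).
times = (list(range(5, 125, 5)) + list(range(150, 301, 30))
         + list(range(360, 1441, 60)))
# Synthetic noise-free peak heights with a = 100, T_ex = 270 min, c = 10.
heights = [100.0 * math.exp(-t / 270.0) + 10.0 for t in times]

a_fit, t_fit, c_fit, sse = fit_exchange_decay(times, heights)
print(a_fit, t_fit, c_fit, sse)
```

Because the model is linear in a and c, only the exchange time needs a nonlinear search, which keeps the sketch short. With noisy data, an uncertainty estimate for the exchange time would still be needed and is not provided here.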
The solvent accessibility analysis and hydrogen bond analysis were performed by using the PTRAJ module of the Amber 7.0 program (Pearlman et al. 1995). We assign a hydrogen bond when the distance between the two heavy atoms (O or N) is less than 4.0 Å and the angle (heavy atom-hydrogen-heavy atom) is greater than 120°.

Results

Chemical shift perturbation

The chemical shift difference was calculated for each assigned NH in the apo and 5FPy bound forms by the equation

$$\delta_{tot} = |\delta_H| + |\delta_N| / 4.77$$

where 4.77 is the spectral width ratio between the nitrogen and proton dimensions. 145 backbone NHs were assigned (all non-proline residues except K9) for the 5FPy bound form. K9 is the N-terminal residue in the protein. The large flexibility of the N-terminus might cause the disappearance of the K9 NH signal from the backbone HSQC spectrum (Yao et al. 2005). 142 residues were assigned for the apo form. The unassigned residues were K9, N51, H62 and C91. N51 is from a loop. H62 and C91 are from the active site, and both are coordinated to Zn. The chemical shift differences due to the binding of 5FPy are plotted in Figure 4-2a. The average difference is 0.13±0.16. Most residues show little perturbation. Significant differences (more than 2 standard deviations above the average value) are seen for G35, E64, L88, M93, Y101, K115 and I156, as shown in Figure 4-3a. Most of them are from regions around the active site.

139 residues were assigned for the 5FU bound form. The unassigned residues were K9, V31, H62, S66, T67, and C91. The main reason for the unassigned residues is probably the lack of NOEs in the NOESY-HSQC spectrum. H62, S66, T67 and C91 are from the active site, which might be more flexible in the apo and 5FU bound forms than in the 5FPy bound form. The chemical shift changes between the apo and 5FU complex forms are shown in Figures 4-2b and 4-3b. The average change is 0.14±0.29, similar to that for 5FPy binding but with larger fluctuations. The major changes are from regions around the active site (Figure 4-3b).
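The selection of significantly perturbed residues can be sketched as follows. This is an illustrative Python example, not the script used in this work: the shift differences are invented values keyed by arbitrary residue indices, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

N_TO_H_SCALE = 4.77  # scaling between the nitrogen and proton dimensions


def combined_shift(d_h, d_n, scale=N_TO_H_SCALE):
    """Combined amide shift difference: |dH| + |dN| / 4.77 (in ppm)."""
    return abs(d_h) + abs(d_n) / scale


def significant_residues(shifts, n_sd=2.0):
    """Return the cutoff (mean + n_sd * SD) and the residues above it."""
    values = list(shifts.values())
    cutoff = mean(values) + n_sd * stdev(values)  # sample SD: an assumption
    flagged = sorted(res for res, v in shifts.items() if v > cutoff)
    return cutoff, flagged


# Hypothetical (dH, dN) differences in ppm; not measured data.
raw = {
    1: (0.02, 0.10), 2: (0.01, 0.05), 3: (0.03, 0.20), 4: (-0.02, 0.15),
    5: (0.05, -0.30), 6: (0.01, 0.08), 7: (0.04, 0.25), 8: (0.02, -0.12),
    9: (0.03, 0.18), 10: (0.40, 1.50),
}
shifts = {res: combined_shift(d_h, d_n) for res, (d_h, d_n) in raw.items()}

cutoff, flagged = significant_residues(shifts)
print("cutoff:", cutoff, "flagged:", flagged)
```

In this toy set only residue 10 exceeds the cutoff. Because a few large perturbations inflate the standard deviation, the 2 SD rule is conservative when the distribution is strongly skewed.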
The residues with significant chemical shift differences include G63, E64, T95 and K115.

HD exchange of the apo form

Non-exchange residues and water accessibility

The yCD monomer contains a central β sheet (β1~β5, parallel to each other) with two α helices (α1 and α5) on one side and four α helices (α2~α4 and α6) on the other side. In the apo form, the regions without exchange in the experimental period mainly include (Figure 4-4a): (1) residues 18~25, which are part of α helix 1 (11~27); (2) residues 33~38 from β strand 1 (34~39); (3) residues 46~49 from β strand 2 (45~50); (4) residues 82~89 from β strand 3 (82~88); (5) residues 94~97, which are part of α helix 3 (92~101); (6) residues 101~110, which cover part of α helix 3 and the whole of β strand 4 (105~110); (7) residues 141~146, which are part of α helix 5 (135~147). In total, 42 residues show no significant HD exchange (Table 4-1) during the experimental time. The strong hydrogen bond network among the β strands makes HD exchange hard to occur in these regions. Helix 3 sits at the interface between two monomers, which might prohibit HD exchange because of the hydrophobicity of the interface. Figure 4-5 shows the average number of water molecules within 3.5 Å of each amide N in the 1.5 ns MD simulation of the apo form. It is clear that most non-exchange residues have little water accessibility. This might explain why some parts of α helices 1, 3, and 5 have no exchange while other parts, and other helices, do. Interestingly, β2 (45~50) and β5 (128~131) both have water molecules nearby (on average 0.42 water molecules for the former and 0.95 for the latter), but the former shows no HD exchange while the latter shows relatively fast exchange. Apparently water accessibility alone is not sufficient for HD exchange. On the other hand, residues 51~54 (a loop) and 60~69 (α2) have little water accessibility, but HD exchange occurs in both regions.
These two regions have to go through relatively large conformational changes to exchange with water, or water has to penetrate into these two regions on a time scale longer than the ns MD simulation.

Fast exchange residues

There are 38 (Table 4-1) fast exchange residues, spread throughout the protein (Figure 4-1a). Most of them come from loop or turn regions, including 41~43, 57~59, 62~63, 73~78, 113~116, and 131~134. Some of them come from helical regions, including 10~14 (α1), 117~125 (α4), 135~136 (α5) and 150~158 (α6). One group comes from a β strand, 129~131 (β5). Figure 4-6a shows the percentage of hydrogen bond formation between each amide (as donor) and other residues (as acceptors) in the protein. The average percentage is 104.8±49%, indicating that many of the amides (93 out of 145) form hydrogen bonds with more than one acceptor and that many of the amides are well protected. However, there are 31 residues with less than 80% hydrogen bond formation, among which 19 residues have fast exchange. Apparently, amides with weak hydrogen bond formation exchange more easily with solvent water. On the other hand, another 19 fast exchange residues have hydrogen bond formation larger than 80%. Though these residues seem to be well protected, competition from solvent might make them exchange quickly.

Figure 4-6b shows the percentage of H-bond formation between each backbone amide NH (as donor) and solvent water molecules (as acceptors) in the MD simulation. Only 60 residues form such a hydrogen bond, with an average percentage of 29.4±[…], much smaller than that for hydrogen bonds formed between residues. For base-catalyzed HD exchange, the H-bond between amide and water can assist the exchange by transferring the amide proton to water (or OH⁻). It is not a surprise that 29 out of 38 fast exchange residues have H-bond interactions with solvent (Figure 4-6b). But not all the amides hydrogen bonded to solvent have fast exchange.
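An average hydrogen bond percentage above 100% arises when occupancies are summed over all acceptors of a given amide. The sketch below illustrates this with the geometric criterion used here (heavy-atom distance below 4.0 Å and donor-hydrogen-acceptor angle above 120°). It is a plain Python illustration on invented coordinates, not the PTRAJ implementation, and the function names are hypothetical; it requires Python 3.8 or later for `math.dist` and multi-argument `math.hypot`.

```python
import math


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(c, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_angle = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def is_hbond(donor, hydrogen, acceptor, max_dist=4.0, min_angle=120.0):
    """Geometric test: heavy-atom distance and donor-H-acceptor angle."""
    return (math.dist(donor, acceptor) < max_dist
            and angle_deg(donor, hydrogen, acceptor) > min_angle)


def summed_occupancy(frames, max_dist=4.0, min_angle=120.0):
    """Percentage of frames with an H-bond, summed over all acceptors."""
    hits = 0
    for donor, hydrogen, acceptors in frames:
        for acceptor in acceptors:
            if is_hbond(donor, hydrogen, acceptor, max_dist, min_angle):
                hits += 1
    return 100.0 * hits / len(frames)


# Invented coordinates in angstroms for one amide over four frames.
N = (0.0, 0.0, 0.0)
H = (1.0, 0.0, 0.0)
A1 = (2.9, 0.0, 0.0)        # always in H-bond geometry (180 degrees)
A2_NEAR = (2.5, 2.0, 0.0)   # second acceptor, in geometry (about 127 degrees)
A2_AWAY = (0.5, 3.0, 0.0)   # close enough, but the angle is too small

frames = [(N, H, [A1, A2_NEAR])] + [(N, H, [A1, A2_AWAY])] * 3

occupancy = summed_occupancy(frames)
print(occupancy)
```

Here the first acceptor satisfies the criterion in every frame and the second in one frame out of four, so the summed occupancy is 125%, which is how a bifurcated amide contributes more than 100% to the average.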
On the other hand, apart from W10, the other 8 fast exchange residues are not solvent accessible; they are K13, N39, F54, G63, E64, I98, W152 and I156. Among these, G63 and E64 are from the active site, and W152 and I156 are from the C-terminal helix. Conformational fluctuations might be responsible for the fast hydrogen exchange of these residues.

Slow exchange residues

Exchange rates could be measured for 40 slowly exchanging residues, with an average lifetime of 270 min (Table 4-1), and most of these residues are water accessible. However, residues 51~54 (close to a small α helix, 53~55) as well as 63~69 (part of α helix 2, 63~71) have no (or little) water accessibility during the simulation, yet exchange was observed in both regions. The latter region is close to the active site, with H62 coordinated to Zn and with G63 and E64 directly hydrogen bonded to 5FPy in the complex form. The exchange data show that G63 and E64 have fast exchange and that L68 has no exchange. The other 4 residues have an average exchange time of 386 min. The exchange in this region indicates that the active site is flexible in the apo form.

The yCD crystal structure shows that the active site is covered by three regions (Ireton et al. 2003): 53~61 (loop 1, between β2 and α2), 111~117 (the F114 loop) and 150~158 (α6) (Figure 4-7). The average exchange time is 36.9 min for region 53~61 (including Q55, K56, S58 and T60; residues F54, G57 and A59 have fast exchange, while F53 has slow exchange and L61 is overlapped). The average exchange time is 256 min for 111~117 (including V112, N113, and F114; residues K115, S116 and K117 have fast exchange; N111 is overlapped). The average exchange time for W152 and E154 is 70 min (D155 has no exchange), while N150, D151, F153, I156, G157 and E158 have fast exchange. The exchange data indicate that these three regions are flexible. The flexibility of these regions might be important for ligand binding and release.
Figure 4-8a shows the natural logarithm of the protection factor versus residue, which gives a direct view of the flexibility of the different regions. Blue represents residues with no hydrogen exchange, mainly from the β sheet core. Red represents residues with fast exchange (exchange time less than 5 min), from various regions, mainly loops, turns and parts of some helices. The colors in between represent residues with slow exchange (from ice blue to orange as the protection factor decreases).

The binding of 5FPy

After the binding of the intermediate analog 5FPy, HD exchange becomes much slower in various regions (Figure 4-4b). First, 70 residues have no exchange, compared with 42 in the apo form. All the non-exchange regions of the apo form are maintained and/or extended; they include residues 17~25, 30~39, 46~55, 82~91, 93~110 (except 106) and 141~146. Besides these regions, residues 60~62, 65~71 and part of 150~158 have no exchange. The average exchange time for all measurable residues is 662 min, compared with 270 min in the apo form. Figure 4-8a shows the exchange time changes due to the binding of 5FPy; "A" marks residues whose exchange changes from fast to slow, from slow to non-exchange, or even from fast to non-exchange. HD exchange rates can be measured for 28 residues in both the apo and 5FPy bound forms, with the average exchange time in the 5FPy bound form 8.7 fold larger than that in the apo form. On the other hand, this increase is not uniform: 14 of these residues show a less than 2 fold increase in exchange time in the 5FPy bound form, suggesting that only some regions are rigidified significantly. The largest measurable exchange difference comes from residues 72~80, with an increase in exchange time of about 44 fold (Figure 4-9a; including G72 (50 fold), Y79 (41 fold) and K80 (42 fold); L74 and G76 have no exchange in the 5FPy bound form, compared with fast exchange in the apo form).
From the structure, we find that this region is in close contact with loop 1 and the C-terminal helix of the adjacent subunit. Another region with a significant change in exchange time is residues 63~69, which form part of the active site. This region is quite flexible in the apo form but is significantly rigidified by the binding of 5FPy (Figure 4-9a). The binding of 5FPy slows down the conformational fluctuation of the active site and stabilizes it.

The binding of 5FPy also changes the residue flexibility around the active site. First, exchange in 53~61 is 4.8 times slower than in the apo form for Q55 and K56, while S58 and T60 change from slow exchange (43.2 and 20.2 min, respectively) to non-exchange; F54 and A59 change from fast exchange to non-exchange and to slow exchange (870 min), respectively; G57 is in fast exchange, the same as in the apo form, and L61 is overlapped. If we simply assume that fast exchange has an exchange time shorter than 5 min (the starting measurement time of the first HSQC) and that non-exchange has an exchange time longer than 4000 min (the last HSQC collected for the 5FPy bound form), the exchange time changes are quite significant (~180 fold) in this region and also quite diverse for different residues. Second, the exchange time for the loop 111~117 is about 23 times longer than in the apo form (including N113 and F114). Residues K115, S116 and K117 have fast exchange, the same as in the apo form, while V112 has no exchange, compared with slow exchange in the apo form (671 min). Third, the C-terminal helix 150~158 is also rigidified by the binding of 5FPy. F153, I156 and E158 have no exchange, compared with fast exchange in the apo form; W152 has no exchange but has slow exchange in the apo form (38.4 min); D151 and G157 have slow exchange (27 and 2041 min), compared with fast exchange in the apo form; the exchange of E154 is about 12.5 times slower in the 5FPy bound form.
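When one of the two exchange times is only known to be shorter than 5 min (fast) or longer than 4000 min (no exchange), the fold change can only be bounded. The Python sketch below shows one way to keep track of this under the two limits assumed above; the function name and the "fast"/"none" labels are hypothetical conventions introduced for illustration, while the exchange times in the example are the values quoted in the text.

```python
FAST_LIMIT_MIN = 5.0            # fast exchange: T_ex shorter than this
NONEXCHANGE_LIMIT_MIN = 4000.0  # no exchange: T_ex longer than this


def _limit(value):
    """Map a measurement or a censored label to (time, kind of knowledge)."""
    if value == "fast":
        return FAST_LIMIT_MIN, "upper"        # true T_ex <= 5 min
    if value == "none":
        return NONEXCHANGE_LIMIT_MIN, "lower"  # true T_ex >= 4000 min
    return float(value), "exact"


def fold_change_bound(apo, bound):
    """Ratio T_ex(bound) / T_ex(apo) and whether it is exact or a bound."""
    t_apo, kind_apo = _limit(apo)
    t_bound, kind_bound = _limit(bound)
    ratio = t_bound / t_apo
    if kind_apo == "exact" and kind_bound == "exact":
        return ratio, "exact"
    # Lower bound: the numerator can only be larger, the denominator smaller.
    if kind_bound in ("exact", "lower") and kind_apo in ("exact", "upper"):
        return ratio, "lower bound"
    # Upper bound: the numerator can only be smaller, the denominator larger.
    if kind_bound in ("exact", "upper") and kind_apo in ("exact", "lower"):
        return ratio, "upper bound"
    return None, "undetermined"


# Apo vs 5FPy bound exchange times (min) for C-terminal helix residues.
examples = {
    "D151": ("fast", 27.0),
    "W152": (38.4, "none"),
    "F153": ("fast", "none"),
    "G157": ("fast", 2041.0),
}
for residue, (apo, bound) in examples.items():
    print(residue, fold_change_bound(apo, bound))
```

A residue that goes from fast exchange to no exchange is slowed by at least 800 fold under these limits, whereas a residue that is fast in both forms carries no information about the change.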
On average, the exchange time increases by at least ~300 fold for this region. As in regions 53~61 and 111~117, the exchange time changes in the C-terminal helix are also quite significant and varied.

Therefore, the binding of 5FPy changes the HD exchange properties of yCD significantly. It stabilizes various regions of the protein, especially the active site and the residues around it. The binding of 5FPy significantly increases the exchange time of loop 1 (53~61), residues 63~69, residues 72~80, the F114 loop (111~117) and the C-terminal helix (150~158). Figure 4-8b shows the protection factors of the 5FPy complex. We can clearly see the exchange time changes in these five regions. Because 5FPy mimics a reaction intermediate, the rigidification of the active site and the surrounding regions may be favorable for the catalytic reaction, which is consistent with MD simulations (Yao et al. 2005) and QM calculations (Sklenak et al. 2004).

The binding of 5FU

HD exchange was also measured for the 5FU bound form. Unlike 5FPy, the binding of 5FU does not dramatically change the HD exchange rates of yCD, as shown in Figure 4-4c and Figure 4-9b. There are 34 residues in fast exchange, 46 in slow exchange and 39 with no exchange, quite similar to the apo protein (Table 4-1). The average exchange time is 383 min, comparable to 270 min in the apo form. The binding of 5FU slows down HD exchange for certain residues but accelerates it for others. Residues H50, M52, L74, S89, I145, F153 and E158 change either from fast exchange to slow exchange or from slow exchange to no exchange, and residues F54, E119, Y121 and V129 change in the reverse direction (Figure 4-9b). HD exchange rates can be measured for 32 residues in both the apo and 5FU bound forms, with the average exchange time in the 5FU bound form 1.78 fold larger than that in the apo form, compared with the 8.7 fold increase in the 5FPy bound form.
Residues 72~80, which show a dramatic increase in exchange time (44 fold) on binding of 5FPy, show a 2.5 fold increase in exchange time (including G72, 2.4 fold; T79, 0.7 fold; and K80, 2.7 fold), while R73, G76 and K77 have fast exchange, the same as in the apo form (L74 has an exchange time of 12 min, compared with fast exchange in the apo form). Figure 4-8b shows the residues colored according to the logarithm of the protection factor. As we can see, the overall pattern of the 5FU complex form is quite similar to that of the apo form, though differences exist in specific regions.

The comparison between the 5FU bound form and the apo form in the regions around the active site gives us information about the flexibility changes due to the binding of 5FU. The exchange time for the F114 loop (111~117) is about 2.3 times longer than in the apo form (including V112, N113, and F114; residues K115, S116 and K117 have fast exchange). The binding of 5FU thus slows down the exchange in this region only slightly, compared with the 23 fold increase due to 5FPy binding. The exchange time for the C-terminal helix (150~158) is similar to that in the apo form: the average time is 1.4 times longer than in the apo form for W152 and E154; F153 and E158 change from fast exchange to slow exchange, with exchange times of 18.3 min and 3 min, respectively; N150, D151, I156 and G157 have fast exchange and D155 has no exchange, the same as in the apo form. Exchange in 53~61 is 2.8 times slower than in the apo form, including Q55, K56, S58 and T60, while residues F54, G57 and A59 have fast exchange, the same as in the apo form (F53 and L61 are overlapped). Therefore the residues around the active site are slightly rigidified by the binding of 5FU. On the other hand, the exchange rates of residues 64~67 cannot be measured in the 5FU bound form, mainly because of weak peak intensity or lack of assignment.
The 15N NOESY-HSQC experiment shows that the NOEs between amide protons in this region and other protons are very weak, suggesting that the active site is quite flexible in the 5FU bound form. It appears that the binding of 5FU changes the HD exchange properties only slightly compared with 5FPy binding. The flexibility of loop 1, the F114 loop and the C-terminal helix, which cover the active site, is maintained (Figure 4-7c), while it is decreased dramatically in the 5FPy bound form. The results imply that these three regions might be important for reactant binding and product release. In the apo (product complex) form, the protein is ready to bind (release) the ligand.

Effect of pH

Apo form

HD exchange experiments were performed in both pH ~7.0 and pH ~8.0 phosphate buffer for the apo form. The exact pH values were measured after the experiments. The uncorrected pH readings are 7.16 and 8.05, respectively. According to the Linderstrøm-Lang model (Linderstrøm-Lang 1955), under the EX1 limit the exchange rate is a measure of the conformational opening rate, which is independent of pH, whereas under the EX2 limit the exchange rate is limited by the intrinsic chemical HD exchange of the open conformation, which depends on pH because it is catalyzed by acid or base. In systems with pH higher than 3~4, the exchange is catalyzed by base. Therefore, in our system, a one unit increase in pH should ideally increase the exchange rate 10 fold under EX2 and leave it unchanged under EX1. Since the pH was increased by 0.89 in the experiments we performed, the corresponding increase in exchange rate should be 7.8 fold.

Figure 4-10 shows the exchange time changes at pH 8.05 for apo yCD compared with pH 7.16. 35 residues can be measured at both pHs. On average, the exchange time decreases 4.8 fold at pH 8.05. Interestingly, the pH change has quite different effects on different residues.
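The expected pH dependence can be made explicit with a short calculation. The Python sketch below assumes purely base-catalyzed intrinsic exchange, so that the EX2 rate scales as ten to the power of the pH difference; the `ph_slope` helper is a hypothetical diagnostic added for illustration, equal to the change in log10 of the exchange rate per pH unit (about 0 in the EX1 limit and about 1 in the EX2 limit).

```python
import math


def expected_ex2_fold(ph_low, ph_high):
    """Fold increase in k_ex expected under EX2 with base catalysis."""
    return 10.0 ** (ph_high - ph_low)


def ph_slope(observed_fold, ph_low, ph_high):
    """Change in log10(k_ex) per pH unit: ~0 for EX1, ~1 for EX2."""
    return math.log10(observed_fold) / (ph_high - ph_low)


# Apo form: uncorrected pH readings of 7.16 and 8.05.
expected = expected_ex2_fold(7.16, 8.05)
# Average observed change for the apo form: 4.8 fold faster at pH 8.05.
slope = ph_slope(4.8, 7.16, 8.05)

print(round(expected, 1), round(slope, 2))
```

For the apo form the expected EX2 change is 7.8 fold, and the average observed change of 4.8 fold corresponds to a slope of about 0.77, that is, between the two limits.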
G29, R48, H50, G72 and D133 change from the slow exchange region to the fast exchange region, while D155 and L68 change from no exchange to slow exchange. On the other hand, F54 and I98 change from fast exchange to slow exchange and no exchange, respectively. For the 35 measurable residues, the change in exchange time is also widely spread. In some regions the exchange time does not change much, such as residues 65~70, where the decrease is only about 1.5 fold (I65: 0.91, S66: 0.97, T67: 0.89, E69: 1.8, N70: 2.9). Apparently, the HD exchange rate in this region is determined by conformational changes (EX1). The average exchange time for this region is 346 min (including I65, S66, T67, E69 and N70), which is equal to the conformational opening time. This is a fairly slow process. Besides this region, some other individual residues also show EX1 exchange, with an exchange rate change of less than 2 fold (Figure 4-6a). But in another region, 137~140 (part of α5), the exchange time decreases ~7.6 fold (C137: 7.2 fold, K138: 9.2 fold, K139: 7.4 fold, I140: 6.6 fold). The HD exchange rate in this region is determined more by the pH value (EX2). Therefore, different regions (residues) may have different exchange mechanisms, and many of them follow neither the EX1 nor the EX2 mechanism but fall between the two. On the other hand, k_op and k_cl may be influenced by the pH change for some residues in the apo form. For example, the exchange rates of T60 and K80 increase 10.1 fold and 11.3 fold, respectively (Figure 4-10), which is larger than the theoretical increase of 7.8 fold under the EX2 extreme. In addition, the exchanges of F54, I98 and C106 are slowed down at the higher pH.

5FPy complex form

Exchange rates were measured at three pHs: 7.10, 7.62 and 8.50. Under the EX2 extreme, the pH change from 7.10 to 7.62 corresponds to a 2.6-fold increase of the exchange rate. The exchange rates of 37 residues can be measured at these two pHs (Table 4-1), with an average decrease in exchange time of 2.7±0.9 fold.
It seems that the protein as a whole exchanges under the EX2 mechanism. But the exchange time ratio spans from ~0.8 fold to 5 fold (Figure 4-11a), suggesting that no uniform pattern can be determined from the data. One more exchange rate was measured at pH 8.50 for the 5FPy bound form to provide a larger pH difference, which gives a better illustration of the differences between EX1 and EX2. Usually, the more basic the solution, the more likely EX2 exchange is to occur, because of the strong pH dependence of the intrinsic chemical exchange rate. Thirty residues have measurable exchange rates at both pH 7.62 and 8.50, with an average decrease in exchange time of 9.8±5.6 fold, slightly larger than the theoretical decrease of 7.6 fold under the EX2 mechanism (Figure 4-11b). The large fluctuation of the ratio reflects the discrepancy between the experimental results and the model. Among these 30 residues, 12 have a ratio larger than 10 fold, which indicates that, at least for these residues, k_op and/or k_cl might differ between these two pHs. So, just as in the apo form, different residues show quite different changes in exchange time in response to pH changes. It is very hard to interpret the data simply in terms of the EX1 or EX2 mechanism, and the assumption that k_op and/or k_cl are independent of pH can hardly be justified for the 5FPy bound form.

5FU complex form

Exchange rates of the 5FU yCD complex were measured at pH 7.10 and 7.50 (Figure 4-12). Compared with pH 7.10, the exchange time at pH 7.50 decreases by 1.9±0.8 fold on average for the 38 residues measurable at both pHs. The exchange time ratio spans from ~0.5 to ~3.9 (Figure 4-12). pH has quite different influences on the exchange of different residues.

Summary

The backbone amide NH chemical shifts have been assigned for the apo, 5FPy bound and 5FU bound forms. The binding of 5FPy and 5FU perturbs mainly the active site, and it appears that 5FU perturbs the active site more than 5FPy.
HD exchange experiments were performed for the apo form, the 5FPy intermediate analog complex and the 5FU product complex at different pHs. Several regions show different exchange properties in the different forms. In the apo form, four central β sheets as well as parts of three α helices show no exchange. Most of the non-exchanging residues have little or no water accessibility, while most of the fast-exchanging residues form hydrogen bonds with solvent water. But residues 63~69 (from the active site) do exchange even though they are not water accessible. Apparently, conformational changes are needed for the exchange to occur. After the binding of 5FPy, the exchanges in this region are significantly slower. The binding of 5FU makes this region rather flexible: the amides of H62, S66 and T67 cannot be assigned because of the lack of NOEs, and the exchange rates of E64 and I65 cannot be measured because of weak signal intensity. The second interesting region is loop 1 (53~61), a loop covering the active site. The binding of 5FPy increases the exchange time by more than ~180 fold, while the binding of 5FU increases the exchange time only ~2.8 fold. The third interesting region is residues 72~80, which show a dramatic increase of exchange time in the 5FPy bound form but not in the 5FU bound form. The fourth interesting region is the F114 loop (111~117), another loop covering the active site. The binding of 5FPy increases the exchange time for N113 and F114 about 20 fold, whereas the binding of 5FU increases it about 2.3 fold. The fifth interesting region is the C-terminal helix α6 (150~158), which has relatively fast exchanges in the apo form. The binding of 5FU maintains the exchange rate in this region, but the binding of 5FPy makes the exchange in this region more than ~300 times slower. Loop 1, the F114 loop and the C-terminal helix, which cover the active site, might be important for ligand binding and product release.
Compared with the apo form, on average the 5FU complex increases the exchange time 1.8 fold and the 5FPy complex increases the exchange time 8.7 fold for measurable residues. The pH effect on the exchange time is quite different for individual residues. For example, in the apo form residues 65~70 display the EX1 exchange mechanism, while residues 137~140 display the EX2 exchange mechanism. But the majority of residues show neither the EX1 nor the EX2 mechanism in the apo, 5FPy bound and 5FU bound forms.

References

Arrington, C. B. and A. D. Robertson (2000). Kinetics and thermodynamics of conformational equilibria in native proteins by hydrogen exchange. Energetics of Biological Macromolecules, Pt C. 323: 104-124.
Arrington, C. B. and A. D. Robertson (2000). "Microsecond to minute dynamics revealed by EX1-type hydrogen exchange at nearly every backbone hydrogen bond in a native protein." Journal of Molecular Biology 296(5): 1307-1317.
Bai, Y. W., J. S. Milne, et al. (1993). "Primary Structure Effects on Peptide Group Hydrogen-Exchange." Proteins-Structure Function and Genetics 17(1): 75-86.
Delaglio, F., S. Grzesiek, et al. (1995). "Nmrpipe - a Multidimensional Spectral Processing System Based on Unix Pipes." Journal of Biomolecular Nmr 6(3): 277-293.
Ireton, G. C., M. E. Black, et al. (2003). "The 1.14 angstrom crystal structure of yeast cytosine deaminase: Evolution of nucleotide salvage enzymes and implications for genetic chemotherapy." Structure 11(8): 961-972.
Johnson, B. A. and R. A. Blevins (1994). "Nmr View - a Computer-Program for the Visualization and Analysis of Nmr Data." Journal of Biomolecular Nmr 4(5): 603-614.
Linderstrom-Lang, K. (1955). "Deuterium exchange between peptides and water." Spec. Publ. Chem. Soc. 2: 1-20.
Loh, S. N., C. A. Rohl, et al. (1996). "A general two-process model describes the hydrogen exchange behavior of RNase A in unfolding conditions." Proceedings of the National Academy of Sciences of the United States of America 93(5): 1982-1987.
Pearlman, D. A., D. A. Case, et al. (1995). "Amber, a Package of Computer-Programs for Applying Molecular Mechanics, Normal-Mode Analysis, Molecular-Dynamics and Free-Energy Calculations to Simulate the Structural and Energetic Properties of Molecules." Computer Physics Communications 91(1-3): 1-41.
Qian, H. and S. I. Chan (1999). "Hydrogen exchange kinetics of proteins in denaturants: A generalized two-process model." Journal of Molecular Biology 286(2): 607-616.
Sklenak, S., L. S. Yao, et al. (2004). "Catalytic mechanism of yeast cytosine deaminase: An ONIOM computational study." Journal of the American Chemical Society 126(45): 14879-14889.
Yao, L. S., Y. Li, et al. (2005). "Product release is rate-limiting in the activation of the prodrug 5-fluorocytosine by yeast cytosine deaminase." Biochemistry 44(15): 5940-5947.
Yao, L. S., S. Sklenak, et al. (2005). "A molecular dynamics exploration of the catalytic mechanism of yeast cytosine deaminase." Journal of Physical Chemistry B 109(15): 7500-7510.

Appendices

Table 4-1 The statistics of HD exchange times for the apo, 5FPy and 5FU complexes at various pHs.

                      Number of residues                 Average time (min)
                fast    slow    no    undetermined
pH 7.0 (a)
  apo            38      40     42        25                   270
  5FU            34      46     39        26                   383
  5FPy           19      43     70        13                   662
pH 8.0 (b)
  apo            42      39     41        23                   159
  5FU            39      38     39        29                   306
  5FPy           24      43     63        15                   459
pH 9.0 (c)
  5FPy           28      36     61        20                   140

(a) The measured pHs are 7.16, 7.10 and 7.10 for the apo, 5FPy bound and 5FU bound forms.
(b) The measured pHs are 8.05, 7.62 and 7.50 for the apo, 5FPy bound and 5FU bound forms.
(c) The measured pH is 8.50 for the 5FPy bound form.

Figure 4-1 The deamination of cytosine/5FC to uracil/5FU through a tetrahedral intermediate.
Figure 4-2 The chemical shift difference (a) between the apo and 5FPy bound forms, (b) between the apo and 5FU bound forms.

Figure 4-3 The colored tube plot of backbone CAs based on the chemical shift difference (a) between the apo and 5FPy bound forms, (b) between the apo and 5FU bound forms. The difference increases with the color changing from blue to red. The residues with no assignment were painted white.

Figure 4-4 Exchange time for (a) apo, (b) 5FPy complex, (c) 5FU complex. A represents no exchanges (exchange time longer than ~2000 minutes (apo) or ~4000 minutes (5FPy and 5FU complex)), A represents fast exchanges (exchange time shorter than ~5 minutes).

Figure 4-5 Number of water molecules around NH within 3.5 Å from the N atom in the MD simulation. A represents no-exchange residues (exchange time longer than ~2000 minutes).

Figure 4-6 (a).
Percentages of H-bond formation between NH and other residues. (b). Percentages of H-bond formation between NH and solvent water during 1.5 ns MD simulation.

Figure 4-7 X-ray structure of yCD complexed with 2Py. Regions with significant increase of HD exchange time were highlighted in gold. Labeled regions: (1) 53-61, (2) 63-69, (3) 72-80, (4) 111-117, (5) 150-158.

Figure 4-8 Summary of HD exchange data of yCD in (a) apo, (b) 5FPy complex and (c) 5FU complex forms at pH 7. Residues are colored according to the logarithm of the protection factor: blue, residues without exchanges during the experimental time (log P ≥ 7.0); red, residues with fast exchanges (log P ≤ 2.0); orange to iceblue, residues with slow exchanges (7.0 > log P > 2.0). The crystal structure of apo form subunit one was used in the picture.

Figure 4-9 The ratio between the exchange time of (a) 5FPy and apo yCD, (b) 5FU and apo yCD. A represents that the binding of ligand changes the residue exchange time from no exchange to slow exchange, from slow exchange to fast exchange, or from no exchange to fast exchange. A represents the reverse change.

Figure 4-10 The exchange time ratio in the apo form at pH 7.16 and 8.05.
Figure 4-11 The exchange time ratio in the 5FPy bound form at (a) pH 7.10 and 7.62, (b) pH 7.62 and 8.50.

Figure 4-12 The exchange time ratio in the 5FU bound form at pH 7.10 and 7.50.

Chapter 5

A Molecular Dynamics Exploration of the Catalytic Mechanism of Yeast Cytosine Deaminase

Introduction

Yeast cytosine deaminase (yCD) catalyzes the deamination of cytosine to uracil by nucleophilic attack of a Zn-bound hydroxide on the substrate, forming a tetrahedral intermediate that decomposes through the elimination of ammonia. Cytosine deaminase is present in bacteria and fungi, as part of the pyrimidine salvage pathway, but is absent in mammals.¹ yCD can also catalyze the deamination of the prodrug 5-fluorocytosine (5-FC) to the anticancer drug 5-fluorouracil (5-FU). Therefore, a potential system for gene-directed enzyme prodrug therapy combines yCD and the prodrug 5-FC, since the product, 5-FU, inhibits DNA synthesis and is thus a potent chemotherapeutic agent.²,³

High resolution structures of yCD in the apo form⁴ and in complex with the inhibitor 2-pyrimidinone, a mimic of a reaction intermediate,⁴,⁵ are available. yCD is a homodimer, with each 158-residue subunit consisting of a central β-sheet flanked by two α-helices on one side and three α-helices on the other (Figure 5-1a). A single active site is present in each subunit. The active sites are separated by 14 Å, both are adjacent to the dimer interface, and no cooperativity has been reported for the enzyme.⁴ Each active center contains a single catalytic zinc ion that is tetrahedrally coordinated by a histidine (His62), two cysteines (Cys91 and Cys94), and a water molecule in the substrate-free enzyme or the inhibitor in the complex (Figure 5-1b).
The bound inhibitor is completely buried by a lid composed of Phe114, from the loop between β4 and α4, and Trp152 and Ile156, both from the C-terminal helix. Surprisingly, the structure of apo yCD is essentially the same as that of the intermediate analog complex, with a root-mean-squared deviation (RMSD) between the two structures of 0.23 Å for the backbone atoms.

The active site architectures of yCD and cytidine deaminase (CDA) from E. coli share a striking similarity.⁶,⁷ Superposition of their active sites reveals a very similar interaction network involving the ligated water molecule, the Zn ion ligated by two cysteine residues and one histidine, the Zn-bound ligand and a glutamic acid important to the catalytic mechanism.⁵ Based on the similarity of these active site structures and early studies of the mechanism of CDA catalysis, an analogous reaction mechanism was proposed for yCD.⁵ Our recent quantum chemical study⁸ using the ONIOM(B3LYP:PM3) method has revealed a complete path for the deamination reaction catalyzed by yCD. The cytosine deamination proceeds via a sequential mechanism involving the protonation of N3, the nucleophilic attack on C4 by the Zn-coordinated hydroxide, and the cleavage of the C4-N4 bond. Uracil is liberated from the zinc by an oxygen exchange mechanism that involves the formation of a gem-diol intermediate from the Zn-bound uracil and a water molecule, C4-O_Zn bond cleavage, and regeneration of the Zn-coordinated water.

In this article, we explore aspects of the mechanism of the cytosine deamination reaction by running a series of MD trajectories (~2 ns) for the fully solvated enzyme in its free form, reactant complex form, product complex form, and four possible intermediate complex forms, with active site structures summarized in Figure 5-2. In particular, we study the side-chain motion of Glu64 along the proposed⁸ reaction path and the role of several other residues in both ligand binding and catalysis.
For each simulation, quantum chemical calculations are first carried out in order to obtain suitable charge parameters for the Zn ion, ligated residues, catalytic water, substrate and important neighboring residues. The protonation states of the Zn-ligated residues are critical for an accurate simulation, as is an account of the substantial charge transfer to the Zn (formally 2+) ion. The availability of the yCD crystal structures with the intermediate analog complex⁴,⁵ aided us in validating the charge parameters of the Zn-bound complex by comparing the results of an MD simulation with these crystal structures. The flexibility and average position of the loop containing Phe114, and of the N- and C-terminal helices, are monitored in the MD simulations of both the free and intermediate analog complex forms and compared with the corresponding crystal structures. That permits suggestions for possible ways the reaction products can exit the active site, which is of interest in view of the very similar crystal structures found for the apo and inhibitor-bound forms. The product complex simulation also suggests a binding position for ammonia and the exchanged water prior to release from the active site. In these ways, further insights into the catalytic and binding events taking place in the active site of yCD can be obtained.

Methods

Determination of the protonation state of the Zn bound complex

The protonation states of the Zn ligands, cysteine and histidine, are dependent on their surroundings.⁹⁻¹¹
Quantum chemical studies on model compounds show that the Zn-ligand distances are sensitive to the ligand protonation states and, by comparison with crystallographic data, the latter can be reliably assigned.⁹,¹⁰ We carried out B3LYP/6-31G** optimizations⁸ in the gas phase for the Zn complex starting from the crystal structure, modeling His62 by imidazole, Cys91 and Cys94 by -SCH3, and Glu64 by CH3CH2COO⁻, and included the Zn ion, a water and ⁻OOCCH2CH2C(=O)CH3, the last of which is a model for Asp155, which is hydrogen bonded to His62 (2.7 Å, a second-shell interaction with the Zn) and, in the intermediate analog complex, also to the intermediate analog (Figure 5-1b). These calculations showed that, to match the coordination geometry of the zinc ion in the crystal structure, the sulfur atoms of Cys91 and Cys94 must be deprotonated and His62 must be singly protonated.

Parameterization of the Zn bound complex

In metal-ligated simulations a choice between bonded and non-bonded force fields must be made.¹² For the purposes of this work, where the crystal structures show strong four-coordinate ligation to the Zn ion, a bonded (covalent) force field is indicated. Thus, explicit bonds between the Zn cation and its ligands were used.¹³ The force constants for bonds and bond angles listed in Table 5-1 serve to preserve the experimentally observed tetrahedral Zn. They are consistent with force constants used for a Zn complex in a recent simulation by Suarez and Merz.¹⁴ The equilibrium bond distances and angles are taken from the apo yCD crystal structure.⁴ All the torsional parameters associated with the Zn-ligand interactions were set to zero, as in Hoops et al.¹³

A single point QM calculation (B3LYP/6-31+G*) of the Zn complex was carried out based on the crystal structure of the yCD free form. Zn, its three ligated residues (His62, Cys91 and Cys94), as well as Asp155 (all atoms), and one water molecule (coordinated with the Zn) are included in the calculation.
Hydrogen atoms were added by using the Insight II program. One hydrogen atom was added to each C-terminus and N-terminus of the four amino acids to saturate the heavy atoms. Atom-centered partial charges were derived by using the AMBER antechamber program (RESP methodology).¹⁵ Though HF/6-31G* is the method used for parameterization of the General AMBER Force Field (GAFF),¹⁶ in the case of the Zn-ligand bonds, which have a character intermediate between a covalent and an ionic bond,¹⁷ B3LYP is a better method for calculating the RESP charges. For our test job of cytosine, the RESP charges derived from HF/6-31G* and B3LYP/6-31+G* produced a root-sum-of-square-deviation (RSSD) of only 0.07 e for all the charges. In the MD simulation, only the RESP charges of the substrate, intermediates and products, as well as those of the Zn-bound water, Zn and the side-chains of His62, Cys91 and Cys94, are re-evaluated by our quantum chemical calculations relative to their values in the AMBER database. The charges listed in Table 5-2 are slightly modified to preserve the charge neutrality of the Zn complex. The charges of Asp155 and the backbone charges of His62, Cys91 and Cys94 are the same as those from the AMBER database. Since no new bond, bond angle or dihedral angle parameterization was performed for this system in our calculations, this procedure was used to minimally perturb the parameters from the AMBER database.

MD simulation of the free form of yCD

Starting coordinates for the protein atoms were taken from the 1.43 Å resolution crystal structure of the free form of yCD.⁴ There is a second Zn in the active site of the crystal structure, which was deleted for the simulations. All the crystal water molecules were removed except the one bound to the catalytic Zn. The protonation states of the ionizable residues were set to their normal values at pH 7. The protein was surrounded by a periodic box of 12.5 Å (~17,000) TIP3P water molecules.
Na+ ions were placed by the Leap program¹⁵ to neutralize the −6 charge of the model system. The parm94 version of the all-atom AMBER force field¹⁸ was used for all the simulations. MD simulations were carried out using the SANDER module in AMBER 7.0.¹⁶ The SHAKE algorithm¹⁹ was used to constrain all bond lengths involving hydrogen atoms, permitting a 2 fs time step. A nonbonded pair list cutoff of 8.0 Å was used, and the nonbonded pair list was updated every 25 steps. The Particle-Mesh-Ewald method²⁰ was used to evaluate the contributions of the long-range electrostatic interactions. The pressure (1 atm) and the temperature (300 K) of the system were controlled during the MD simulations by Berendsen's method.²¹ The typical simulation time was 2 ns, with a 500 ps equilibration period. Coordinates were saved every 2 ps. All of the MD results were analyzed with the PTRAJ module of AMBER 7.0. In these analyses, we assign a hydrogen bond when the distance between two heavy atoms (O or N) is less than 3.2 Å and the angle (heavy atom-hydrogen-heavy atom) is greater than 120°.

MD simulation of the yCD intermediate analog complex

The starting structure for the yCD intermediate analog complex simulation was taken from the crystal structure.⁴ Subunit one has 156 residues (the first two are missing); subunit two has 161 residues because three extra residues are attached to the N-terminus. One more residue, Glu64, is included in the charge parameterization because it forms a strong hydrogen bond with the ligand in the crystal structure, and the inclusion of this residue may influence the RESP charges of the Zn-bound complex. The Glu64 charges used in the MD are the AMBER ones, since they are very close to those from the quantum chemical calculation. The MD simulation protocols for the yCD intermediate analog were identical to those for the unbound model, except that the modeled system was maintained at constant volume and temperature, as was the case for the other simulations described below.
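The geometric hydrogen-bond criterion stated in the analysis protocol (heavy-atom distance below 3.2 Å, heavy atom-hydrogen-heavy atom angle above 120°) can be written out as a small function. This is a minimal standalone sketch, not the PTRAJ implementation; the function and argument names are hypothetical, and coordinates are assumed to be given in Å.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def is_hydrogen_bond(donor, hydrogen, acceptor, max_dist=3.2, min_angle=120.0):
    """Apply the distance and angle cutoffs quoted in the text.

    donor, acceptor: coordinates of the two heavy atoms (O or N), in angstroms.
    hydrogen: coordinates of the hydrogen attached to the donor.
    """
    heavy_dist = math.dist(donor, acceptor)
    return heavy_dist < max_dist and angle_deg(donor, hydrogen, acceptor) > min_angle


# Linear N-H...O arrangement with a 2.9 angstrom heavy-atom separation
print(is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)))  # True
```

Applying such a test to every saved frame and dividing the number of frames that pass by the total number of frames gives an occupancy percentage for each donor-acceptor pair.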
MD simulation of the yCD water/cytosine complex (1 in Figure 5-2)

Cytosine was docked into the active site of the yCD free enzyme crystal structure in the same position found for the intermediate analog. The Zn-bound water molecule was included in the simulation. All other crystal water molecules and the second Zn were removed, as in the free form simulation. The force field parameters of the yCD protein are the same as in the free yCD simulation. The RESP charges of cytosine are from an HF/6-31G* calculation.

MD simulation of the yCD hydroxide/Glu64H cytosine complex (2 in Figure 5-2)

The instantaneous structure at 500 ps of the yCD water/cytosine system was adopted as the starting structure of the yCD hydroxide/cytosine system. The deprotonated Glu64 was mutated into a protonated glutamic acid, while the Zn-bound water was mutated into a hydroxide group in this system. The new RESP charges of the Zn complex (Zn, His62, Cys91, Cys94, and the hydroxide ligated to Zn) were calculated based on a free form ONIOM(B3LYP:PM3) minimized structure with the proton restrained to Glu64.⁸

MD simulation of the yCD hydroxide/Glu64 cytosineH complex (3 in Figure 5-2)

The snapshot structure at 1.0 ns of the yCD hydroxide/Glu64H cytosine system was used as the starting structure of the yCD hydroxide/Glu64 cytosineH system. The proton of Glu64 was transferred to the N3 of cytosine in this system. The RESP charges of the Zn complex (Zn, His62, Cys91, Cys94, hydroxide) are the same as in the yCD hydroxide/Glu64H cytosine complex simulation, and the RESP charges of protonated cytosine were calculated based on an HF/6-31G* QM calculation.

MD simulation of the yCD intermediate complexes (4 and 5 in Figure 5-2)

The starting structure for the yCD intermediate complex simulation was taken from the crystal structure of the yCD intermediate analog complex. The intermediate was docked to the same position that the intermediate analog occupies.
Since the intermediate could be either protonated, intermediate I (with deprotonated Glu64, 4 in Figure 5-2), or deprotonated, intermediate II (with protonated Glu64, 5 in Figure 5-2), the two situations were implemented in the two different active sites. The RESP charges (see Table 5-2) of these two Zn complexes were calculated based on the yCD intermediate analog crystal structure.

MD simulation of the yCD product complexes (6 and 7 in Figure 5-2)

The starting structure for the yCD product complex simulation was taken from the yCD intermediate analog complex crystal structure. Intermediate II was converted to uracil/ammonia. Since our ONIOM(B3LYP:PM3)⁸ calculations showed that the Zn complex structure in this system was similar to that in the crystal structure of the yCD intermediate analog complex, uracil was bound to the Zn atom for the simulation with a bond to the oxygen. This geometrical arrangement is similar to that in the cytidine deaminase-uridine complex crystal structure.⁷ The new RESP charges of the Zn complex were calculated. In order to study the release of ammonia and its possible exchange with water in the active site, ammonia was substituted by water in one active site while it was kept in the other.

Potential of mean force (PMF) calculation of Glu64 COOH rotation in the yCD hydroxide/Glu64H cytosine complex

The simulations that we carry out show that the Glu64 (GluH64) carboxylate (carboxylic acid) moiety fluctuates around three orientations, which can be described by the dihedral angle $\varphi$ defined by the CB-CG-CD-OE2 atoms of Glu64. The free energy of rotation was obtained from the potential of mean force (PMF) $W(\varphi)$, which is related to the probability $P(\varphi)\,d\varphi$ of finding the dihedral between $\varphi$ and $\varphi + d\varphi$ according to $W(\varphi) = -k_B T \ln P(\varphi)$.
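The relation W(φ) = −k_B T ln P(φ) can be illustrated with a direct histogram estimate from a set of dihedral angles. The sketch below assumes an unbiased trajectory; it does not implement umbrella sampling or the weighted histogram analysis method, and the function name, bin width and default temperature are illustrative choices, not values from the text.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def pmf_from_dihedrals(angles_deg, temperature=300.0, bin_width=10.0):
    """Histogram estimate of W(phi) = -kB*T*ln P(phi), in kcal/mol.

    angles_deg: dihedral angles in degrees (any range; wrapped to [-180, 180)).
    bin_width: histogram bin width in degrees (assumed to divide 360 evenly).
    Returns {bin center in degrees: W relative to the lowest minimum}.
    Bins that were never visited are omitted (their W is formally infinite).
    """
    n_bins = int(round(360.0 / bin_width))
    counts = [0] * n_bins
    for angle in angles_deg:
        idx = int(((angle + 180.0) % 360.0) // bin_width)
        counts[min(idx, n_bins - 1)] += 1
    total = sum(counts)
    kT = K_B * temperature
    pmf = {}
    for i, count in enumerate(counts):
        if count > 0:
            center = -180.0 + (i + 0.5) * bin_width
            pmf[center] = -kT * math.log(count / total)
    lowest = min(pmf.values())
    return {phi: w - lowest for phi, w in pmf.items()}


# Toy data: 90% of frames near +60 degrees, 10% near -60 degrees
toy = [60.0] * 90 + [-60.0] * 10
for phi, w in sorted(pmf_from_dihedrals(toy).items()):
    print(phi, round(w, 2))
```

Because the estimate is shifted so that the lowest bin is zero, the constant that converts bin populations into a probability density drops out. Regions that an unbiased trajectory rarely visits are poorly determined by this estimate, which is the motivation for adding restraint windows.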
The PMF was obtained by a free energy umbrella sampling method²²,²³ in which a series of harmonic restraint terms (windows) are added to the system's Hamiltonian in order to force the system to sample the desired regions of $\varphi$ space, in this manner overcoming the poor sampling of high-energy conformations that would occur in an unbiased MD trajectory. The data from the windows were combined using the weighted histogram analysis method (WHAM)²⁴ to produce the PMF over the desired range of $\varphi$. The force constant was set to 20 kcal/(mol·rad²), which provides an rms restraint potential window width of approximately 10°, and the window size was set to 3.5° to ensure good overlap of data in neighboring windows. The PMF between states 1 and 2 was determined by running the simulations forward and backward with 5 ps of equilibration and 5 ps of data collection for each window (40 windows in total). The parameters we used were validated by testing the accuracy of the method on the dihedral PMF for ethane solvated in water. The results (not shown) indicate that this set of parameters can reproduce the PMF of the ethane dihedral rotation quite well. By using this set of parameters, our forward and backward simulations of the Glu64 CB-CG-CD-OE2 dihedral rotation gave consistent results (see Results and Discussion). The PMF between states 2 and 3 was determined by starting the simulation from state 2.

Chemical alchemy simulation of NH3 and H2O exchange

Our ONIOM(B3LYP:PM3)⁸ calculations indicate that uracil is released by an oxygen exchange mechanism that uses a water molecule at the same location as the product ammonia in the active site. As a consequence, before the release of uracil, ammonia should exchange with the water molecule. Therefore, we evaluated the difference in binding free energy between water and ammonia by mutating water to ammonia (and ammonia to water) in both the protein active site and water solution.
The RESP charges of ammonia were calculated based on an HF/6-31G* calculation. After ammonia was solvated in water (box side 12.5 Å) and equilibrated, it was mutated to a water molecule. The nitrogen and two of the ammonia hydrogen atoms were mutated to an oxygen atom and two water hydrogens, while the third hydrogen was mutated to a dummy atom that has zero charge and zero van der Waals radius but maintains its internal interaction parameters. Twelve windows were used, with 30 ps per window (15 ps for equilibration and 15 ps for data collection). SHAKE was not used, considering that mutation of hydrogen atoms is involved, necessitating a time step reduction to 1 fs. Gaussian quadrature formulas were used to pick values for the ranges of these windows. The same procedure was used to mutate water (with a dummy atom attached) to ammonia. Thermodynamic integration was used for the free energy calculation. The contribution to the free energy from the dummy atom was evaluated analytically.²⁵ A standard state of 1 M (1661 Å³) was used for the free dummy atom. The results for the forward and backward mutations show excellent consistency (see Results and Discussion).

In the yCD product simulation, the water and ammonia were put into different active sites, and they were mutated to each other, respectively, by the procedure outlined above. The difference between the free energies in water and in the protein is the binding free energy. Assuming a small effect from the different active sites, the difference in binding free energies between water and ammonia is then available.

Results and Discussion

yCD free and intermediate analog complex compared with the crystal structures

yCD is a homodimer with 158 residues per subunit. Each subunit has one active site that is covered by Phe114, located in a loop consisting of residues 109-116, and by Trp152 and Ile156, contained in the C-terminal helix, residues 149-158.
The data analysis is based on individual subunits in order to obtain better statistics, because the crystal structure shows that the configurations of the two subunits are the same, except for the N-terminus (~10 residues). We superimposed the MD snapshots on the crystal structure without using the N-terminal 14 residues. The time evolution of the CA atom RMSD of the instantaneous structures (not including the first 14 residues) from the initial crystal structure for the free yCD and the intermediate analog complex indicates that the two systems are in equilibrium with respect to the RMS deviations during the analyzed trajectory (Figure 5-3). The mass-weighted RMSD of all atoms relative to the crystal structure is, on average, 1.29 Å for subunit one and 1.60 Å for subunit two in the free form simulation (Table 5-3), indicating that the structural changes in the protein were not large during the course of the simulation. The different RMSDs do indicate that some part of the protein in subunit two moved further away from the crystal structure than in subunit one. By superimposing the average structures of subunit one and subunit two (data not shown) we see that the largest differences are in the N-terminal, C-terminal and Phe114 loop regions. The average RMSD data also suggest distinctions among these three regions (Table 5-3). The time evolution of the RMSD of the C-terminal helix shows that two sub-states were captured in the free yCD MD simulation (Figure 5-4a). Subunit one maintains the crystal structure of the C-terminal helix, while in subunit two the whole helix shifted slightly away, and this movement lets the Trp152 side chain occupy the active site.
This movement is not surprising since, in the crystal structure, one additional Zn atom, which is bridged by a water molecule to the active site Zn, coordinates with three more water molecules.4 This Zn complex fills the active site; in the MD simulation the extra Zn with its coordinated waters was removed, providing space for some rearrangement around the active site. Interestingly, the apo and intermediate complex forms have essentially the same crystal structure, with a closed active site. Ligands cannot move in or out without moving the covering residues Phe114, Trp152 and Ile156. The time evolution of the Phe114 loop RMSD in the free form simulation highlights the difference in this region between subunit one and subunit two (Figure 5-4b). The Phe114 loop region is flexible, though a 2 ns simulation may not be long enough to describe a motion that could permit substrate entrance.

To gain insight into the fluctuations of the yCD enzyme, the root mean square fluctuation (RMSF) was calculated by comparing the instantaneous protein structure with the average one for the free form (Figure 5-5a). Although the RMSD from the crystal structure differs considerably between subunit one and subunit two, the RMSFs of the CA atoms are almost the same, and consistent with the crystal B-factors. The residue RMSF values are quite small (most of them are around 0.5 Å), which indicates that the yCD protein is quite rigid as a whole in the free form on the MD time scale.

As in the free form enzyme, the N-terminal residues are far more flexible than the rest of the protein in the intermediate analog complex (Table 5-3). The total RMSD (not including the N-terminal 14 residues) is 1.46 Å for subunit one and 1.24 Å for subunit two, quite similar to the free form values. The small RMSF values for the CA atoms indicate the rigidity of the intermediate analog complex on the MD time scale as well (Figure 5-5b).
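The RMSD and RMSF measures used in this analysis can be sketched as follows. This is a minimal illustration, not the analysis code used for the thesis: it assumes the trajectory has already been superimposed on the reference (here, without the N-terminal 14 residues) and stored as a NumPy array, it is unweighted (the all-atom RMSD quoted in the text is mass weighted), and the function names are hypothetical.

```python
import numpy as np


def rmsf_per_atom(traj):
    """RMSF of each atom about its time-averaged position.

    traj: array-like of shape (n_frames, n_atoms, 3), frames already
    superimposed on a common reference (assumption of this sketch).
    """
    traj = np.asarray(traj, dtype=float)
    mean_structure = traj.mean(axis=0)           # average structure
    disp = traj - mean_structure                 # instantaneous deviation
    return np.sqrt((disp ** 2).sum(axis=2).mean(axis=0))


def rmsd_to_reference(traj, ref):
    """Unweighted RMSD of every frame from a fixed reference structure."""
    traj = np.asarray(traj, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return np.sqrt(((traj - ref) ** 2).sum(axis=2).mean(axis=1))
```

RMSD measures how far a frame has drifted from the crystal structure, whereas RMSF measures how much each atom moves about its own average. Two subunits can therefore differ in RMSD while having nearly identical RMSF, which is the situation described above.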
The differing RMSFs between subunits one and two (Figure 5-5b) in the intermediate analog complex are evidence that two different configurations are being accessed, at least on the 2 ns timescale. The results indicate that the Phe114 loop and the C-terminal helix are flexible, especially in subunit one, relative to the crystal structure.

Structure of the active site

A schematic representation and a typical snapshot of the active site of the apo and intermediate analog complex are displayed in Figure 5-6. In Table 5-4, the most significant H-bonds between the ligand (or the Zn-bound water in the free form) and the enzyme residues are characterized in terms of the distances between the two heavy atoms and their percentages of occurrence. In the free form simulation, the Zn-bound water forms hydrogen bonds with OE1 and OE2 of Glu64. These are not present in the crystal structure, because the second Zn in the crystal structure blocks the formation of these hydrogen bonds. In the MD simulation, this second Zn is not present. There is ~70% hydrogen bond occurrence for OE2 of Glu64 and ~30% for OE1 in subunit two, which indicates that the carboxyl group of Glu64 can rotate in the free form simulation (Figure 5-7a). In the complex, the ligand forms hydrogen bonds with Asn51, Gly63, Glu64, Cys91 and Asp155. The simulation results are in excellent agreement with the hydrogen bond network found in the crystal structure (Table 5-4), which supports the Zn complex parameters and MD protocols developed for this simulation.

Orientational changes of cytosine along the reaction path

Snapshots of the active site of the yCD water/cytosine reactant complex show, surprisingly, that cytosine rotates around the axis perpendicular to its pyrimidine ring to move its NH2 group closer to the Glu64 carboxyl group (1 in Figure 5-2), relative to its starting position, which is the same as the orientation of the intermediate analog in the crystal structure.
This rotation helps the NH2 group of cytosine form a hydrogen bond network with Glu64 (Table 5-5). The upper part of the pyrimidine ring moves away from the Zn-bound water, due to the van der Waals repulsion between them, while the bottom part is restrained by three hydrogen bonds between cytosine and Gly63, Asn51 and Asp155. The distance between the C4 of cytosine and the O of the Zn-bound water is 3.26 ± 0.17 Å in subunit one and 3.19 ± 0.14 Å in subunit two. Cytosine orients somewhat differently in the two subunits (Table 5-5). The NH2 group of cytosine in subunit one points directly at the carboxyl group of Glu64, and forms a hydrogen bond network using the two carboxylate oxygens and the two amino hydrogens. In subunit two, however, NH2 can also form a hydrogen bond with the O of Ser89, which is oriented toward the upper right side of cytosine.

After protonation of Glu64 by proton transfer from the Zn-bound water (yCD hydroxide/Glu64H cytosine model, 2 in Figure 5-2), cytosine rotates back in the MD simulation to the orientation of the intermediate analog found in the crystal structure.4 In this system, the protonated Glu64 forms one hydrogen bond with the amino group of cytosine instead of the hydrogen bond network found in the previous carboxylate system (Table 5-6). The starting structure for this simulation was taken from one snapshot of the yCD water/cytosine simulation, with a slightly different geometry in subunits one and two. Correspondingly, these produce slightly different active site interactions. Notably, the distance between the C4 of cytosine and the O of the Zn-bound hydroxide is 3.02 ± 0.13 Å in subunit one, but 3.33 ± 0.20 Å in subunit two. By superimposing the average structures of subunits one and two, we found that the cytosine in subunit two was pushed down by ~0.5 Å relative to subunit one. This is consistent with the formation of a hydrogen bond between the carbonyl O of Ser89 and the amino group of cytosine in subunit one, but not in subunit two (Table 5-6).
The orientation of protonated Glu64 is also more favorable for proton transfer to the N3 of cytosine in subunit one than in subunit two (see discussion below).

After Glu64 transfers a proton to the N3 of cytosine (yCD hydroxide/Glu64 cytosineH), the distance between the C4 of cytosine and the O of the Zn-bound hydroxide becomes 2.84 Å in subunit one and 2.80 Å in subunit two; these distances are about 0.4 Å shorter than in the initial yCD cytosine complex. Thus, the C4 atom of cytosine becomes well positioned for nucleophilic attack by the hydroxide. Therefore, subsequent to proton transfer from the Zn-bound water to the N3 atom, the reorientation of cytosine makes nucleophilic attack of the Zn-bound hydroxide group on C4 easier.

The proton shuttle with Glu64

Our quantum chemical calculations indicate that Glu64 in yCD acts as a proton shuttle with the Zn-ligated water.8 In the free form crystal structure, there is a second Zn that pushes Glu64 away from this water molecule, which makes proton transfer from it to Glu64 improbable. In the free form MD simulation, where this second Zn atom was removed, the OE1 or OE2 of Glu64 forms a strong hydrogen bond with the Zn-bound water, with a distance of ~2.60 Å in both subunit one and subunit two (Table 5-4). Interestingly, the side-chain carboxyl group of Glu64 can rotate in subunit two despite the strong hydrogen bond (Figure 5-7a). Our calculations8 indicate that, in the apo enzyme, the proton stays covalently bound to the water rather than transferring to protonate Glu64. After cytosine binds to the protein (yCD water/cytosine model, 1 in Figure 5-2), Glu64 still forms a strong hydrogen bond with the Zn-bound water (with distance OE1...O 2.59 Å, or OE2...O 2.60 Å, in subunit one, and OE2...O 2.60 Å in subunit two; Table 5-5). The difference between them comes from the rotation of the Glu64 carboxyl group in subunit one but not in subunit two (Figure 5-7b).
As discussed above, after the proton transfers from the Zn-bound water to the carboxyl O of Glu64, the simulation (yCD hydroxide/Glu64H cytosine model, 2 in Figure 5-2) shows a slightly different active site conformation in subunit one and subunit two. The apparent difference is attributable to the dihedral angle φ formed by the Glu64H CB-CG-CD-OE2 atoms, with φ ~ 0° (all four atoms in the same plane with the terminal atoms cis) for subunit one, which we shall refer to as state 1, and φ ~ 120° for subunit two (state 2). In state 1, the C-OE1-OE2-H plane is close to parallel with the cytosine pyrimidine ring plane and OE2-H is hydrogen bonded to the Zn-bound hydroxide group as a donor. In state 2, the angle between these two planes is about 60°, with the OE2-H hydrogen bonded to the same hydroxide group. In both cases, this hydrogen bond is quite strong, with distance OE2...O 2.60 Å in subunit one and 2.62 Å in subunit two. However, the ONIOM calculations indicate that these two protonated Glu64 conformations are not stable states unless the OE2-H points to N3 of cytosine and forms a hydrogen bond with it (2' in Figure 5-2), instead of pointing to the O atom of the Zn-bound hydroxide. This could be done simply by rotating the dihedral angle CB-CG-CD-OE2 of Glu64 to φ ~ -50°. This geometry defines state 3, where OE2-H points to N3 and the C-OE1-OE2-H plane is again parallel with the cytosine pyrimidine ring.

From the MD perspective, where chemical species are fixed during a simulation, state 1 and state 2 can be stable in the sense of corresponding to minima on a free energy surface. Therefore, it is of interest to determine the potential of mean force (PMF) W(φ) along the Glu64H carboxyl dihedral coordinate, φ, formed by the atoms CB-CG-CD-OE2. The umbrella sampling method that we use to obtain the PMF is described in the Methods Section. Figure 5-8 displays the PMF and shows that state 3 is only 2 kcal/mol higher than state 1.
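As a rough consistency check on the umbrella sampling parameters quoted in the Methods Section (force constant 20 kcal/(mol·rad²), 3.5° windows, 40 windows), the sketch below estimates the thermal rms width of a single harmonic restraint and lays out the window centers. Several things here are assumptions rather than statements from the text: the restraint is written as U = ½k(φ − φ0)², the underlying surface is taken as locally flat, the temperature is taken as 300 K, the "window size" is read as the spacing between neighboring window centers, and the function names are hypothetical.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def restraint_rms_width_deg(k_kcal_per_rad2, temperature_k=300.0):
    """RMS width (degrees) of exp(-U/kT) for U = 0.5*k*(phi - phi0)**2.

    Assumes the 1/2 convention for the harmonic restraint; some MD
    packages omit the 1/2, which would shrink the width by sqrt(2).
    """
    kt = KB_KCAL * temperature_k
    return math.degrees(math.sqrt(kt / k_kcal_per_rad2))


def window_centers(start_deg, spacing_deg, n_windows):
    """Equally spaced restraint centers along the dihedral coordinate."""
    return [start_deg + i * spacing_deg for i in range(n_windows)]


width = restraint_rms_width_deg(20.0)      # close to 10 degrees
centers = window_centers(0.0, 3.5, 40)     # spans about 0 to 136.5 degrees
```

With these assumptions the rms width is roughly three times the window spacing, which is why neighboring windows overlap well. Forty windows of 3.5° span about 140°, enough to cover the path from state 1 (φ ~ 0°) to state 2 (φ ~ 120°).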
Because state 3 is not a minimum of W(φ), configurations in which the Glu64 carboxyl OH points toward the cytosine N3 atom are transient in the simulation trajectory. State 2 has the lowest free energy, but the barrier to rotate from state 2 to state 1 is about 6 kcal/mol, which makes the dihedral angle rotation somewhat difficult. The rotation frequency estimated from transition state theory is around 0.5 ns⁻¹, and this frequency is consistent with the observation of one or no transition in Figure 5-7. Thus, it is more likely that a proton transfers from the Zn-bound water to form the Glu64 carboxyl OH, followed by a rotation of the CB-CG-CD-OE2 dihedral angle by ~50° to form a stable state with the Glu64 OE2H pointing to the N3 of cytosine. That the MD force field does not produce a minimum for state 3 may be a consequence of a deficiency of the force field, or may reflect the short time scale of the simulation that is used to construct the PMF.

After the proton transfers from Glu64 to the N3 of cytosine (Figure 5-1b), a 2 ns simulation of the yCD hydroxide/Glu64 cytosineH complex (3 in Figure 5-2), as described in the Methods Section, was carried out. A hydrogen bond analysis of the trajectory shows that the Glu64 carboxyl group forms a stable hydrogen bond with the amino group of cytosine in both subunit one and subunit two (Table 5-7). Importantly, the geometry enforced by this new hydrogen bond places C4 of cytosine in a favorable position for nucleophilic attack by the Zn-bound hydroxide (the O-C4 distance is ~2.8 Å in both active sites).

Following the nucleophilic attack of the Zn-bound hydroxide group on C4 of cytosine, the proton of the hydroxide group (now C4OH) could either stay on the tetrahedral intermediate (intermediate I, 4 in Figure 5-2) or transfer to Glu64 (intermediate II, 5 in Figure 5-2).
The intermediate I (intermediate II) state was modeled in the active site of subunit one (two) (see Methods Section) and a new MD simulation (yCD intermediate model) was performed. In the intermediate I state (Table 5-8), both OE1 and OE2 of Glu64 can form hydrogen bonds with the N3 and NH2 of intermediate I, which indicates that the carboxyl group of Glu64 can once again rotate during the simulation (Figure 5-7c). The Glu64 carboxylate forms a strong hydrogen bond with C4OH (distance 2.65 Å), suggesting that it is easier for the C4OH proton to transfer to the carboxyl group of Glu64 than to some other residue. After this proton transfer, which forms the intermediate II state, the Glu64 OE2H forms hydrogen bonds with both N4 and O4, with a slight preference for N4 (occurrence 100%, distance 2.72 Å) over O4 (occurrence 84%, distance 2.83 Å), as shown in Table 5-8. Therefore, in a manner similar to the first proton transfer, after Glu64 receives the proton, the Glu64 CB-CG-CD-OE2 dihedral rotates to point OE2H to N4.

In summary, along the reaction path, the rotation of the Glu64 dihedral CB-CG-CD-OE2 is critical for enabling the various proton transfers required to form intermediate II. The free energy simulations show that such rotations can happen on a ns timescale, and suggest that Glu64 acts as a proton shuttle multiple times.

The hydrogen bond network

In addition to the critical interactions between Glu64, the Zn-coordinated water or hydroxide, and the substrate, intermediates or products, the enzyme maintains a network of hydrogen bonds for substrate binding and catalysis. The hydrogen bonds involving the side-chain amide of Asn51, the NH groups of Gly63 and Cys91, and the carboxylate of Asp155 are maintained throughout the reaction path (Figure 5-2 and Tables 5-5 to 5-9). These hydrogen bonds help to stabilize all the complexes in the reaction path and therefore are important for both substrate binding and catalysis.
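The hydrogen bond tables referred to here report, for each donor-acceptor pair, an occurrence percentage and a mean heavy-atom distance with its spread. A minimal sketch of how such statistics could be computed from a per-frame distance series is given below. The geometric criterion is not stated in this part of the text, so the 3.2 Å cutoff, the absence of an angle criterion, the choice to average only over frames that satisfy the cutoff, and the function name are all assumptions made for illustration.

```python
import numpy as np


def hbond_stats(distances, cutoff=3.2):
    """Occurrence (%) and mean/std of donor-acceptor distances (Å).

    distances: heavy-atom distances, one per trajectory frame.
    cutoff:    hypothetical distance criterion; not taken from the source.
    Mean and std are taken over the frames that satisfy the cutoff.
    """
    d = np.asarray(distances, dtype=float)
    bonded = d <= cutoff
    occurrence = 100.0 * bonded.mean()
    if not bonded.any():
        return occurrence, float("nan"), float("nan")
    return occurrence, float(d[bonded].mean()), float(d[bonded].std())


# Toy series: two of four frames fall inside the cutoff.
occ, mean_d, std_d = hbond_stats([2.6, 2.8, 3.5, 4.0])
```

A pair that is always hydrogen bonded gives an occurrence near 100% with a small spread, while a rotating carboxylate splits its occurrence between its two oxygens, as in the roughly 70%/30% pattern discussed for Glu64.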
In particular, the hydrogen bond between the NH group of Cys91 and the Zn-bound water (or hydroxide) may acidify the latter group and facilitate the proton transfers to Glu64. The hydrogen bond between the carbonyl O of Ser89 and the amino group of cytosine evolves during the reaction. The hydrogen bond is strongest, both in terms of geometry and frequency of occurrence, in the yCD hydroxide/Glu64 cytosineH complex (3 in Figure 5-2) and helps position cytosine for the nucleophilic attack by the Zn-bound hydroxide. As the reaction progresses further, the hydrogen bond weakens. The occurrence frequency of the hydrogen bond between the carbonyl O of Ser89 and ammonia is only 15%. The weakening of the hydrogen bond helps the release of the newly formed ammonia.

Binding mode and free energy difference of ammonia and uracil

After ammonia and uracil are formed, our ONIOM calculations8 indicate that uracil has a binding mode similar to that of the intermediate analog in the crystal structure. The product complex MD simulations (6 and 7 in Figure 5-2) show that uracil forms a hydrogen bond network that is similar to that of the other ligand complexes (Table 5-9). Ammonia stays on top of uracil, with a distance between N and C4 of uracil of about 3.0 Å, and forms a hydrogen bond with the carboxyl group of Glu64. When ammonia is exchanged with water, water also forms a similar hydrogen bond with Glu64.

As mentioned above, our recent ONIOM calculation indicates that uracil is released with the help of a water molecule inside the active site.8 To explore the possibility of ammonia-water exchange, the difference in binding free energy between water and ammonia was evaluated (see Table 5-10). The forward and backward mutations between water and ammonia in water solution show consistent results, with ~2.8 kcal/mol from water to ammonia and ~-2.7 kcal/mol from ammonia to water. The mutation between water and ammonia in the protein active site was performed with the same scheme.
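The scheme referred to here (thermodynamic integration over twelve windows chosen by Gaussian quadrature, as described in the Methods Section) can be sketched as follows. The sketch assumes Gauss-Legendre nodes mapped from [-1, 1] onto the coupling parameter λ in [0, 1]; which quadrature family was actually used is not stated in the text, and the function names and the toy integrand are hypothetical.

```python
import numpy as np


def gauss_legendre_lambdas(n_windows=12):
    """Lambda values and weights for Gauss-Legendre quadrature on [0, 1]."""
    nodes, weights = np.polynomial.legendre.leggauss(n_windows)
    lambdas = 0.5 * (nodes + 1.0)   # map nodes from [-1, 1] to [0, 1]
    return lambdas, 0.5 * weights   # rescale weights for the new interval


def ti_free_energy(mean_dv_dlambda, weights):
    """Delta G as the weighted sum of <dV/dlambda> over the windows."""
    return float(np.dot(weights, np.asarray(mean_dv_dlambda, dtype=float)))


lambdas, weights = gauss_legendre_lambdas(12)
# Toy stand-in for the per-window ensemble averages <dV/dlambda>;
# in a real calculation each value comes from one MD window.
toy_means = 3.0 * lambdas ** 2
delta_g = ti_free_energy(toy_means, weights)
```

Quadrature nodes never fall exactly on λ = 0 or λ = 1, which avoids running windows at the end points where an atom appears or disappears. Comparing the forward and backward mutations, as done in the text, is a practical check on convergence.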
The forward mutation (water to ammonia), which was carried out in the active site of subunit two, gave ~6.4 kcal/mol, while the backward mutation, which was performed in the active site of subunit one, gave ~-4.3 kcal/mol. This indicates that the free energy of chemical mutation is quite sensitive to the environment. The removal of the dummy atom (see Methods) costs the same free energy as in water solution, which indicates that this contribution is not sensitive to the environment. The (average) binding free energy difference between water and ammonia is therefore ~-2.6 kcal/mol, which shows that water is a better ligand than ammonia. Moreover, because the reaction takes place in water, the water concentration is much higher than that of the product, ammonia, which makes water movement into the active site easier. Thus, we may conclude that after the formation of ammonia, it is highly probable that ammonia exchanges with water in the active site.

Concluding remarks

The MD simulations of free form yCD and of its complexes with the reactant (cytosine), the product (uracil), several reaction intermediates, and an intermediate analog performed in this work provide insights into the reaction mechanism and structural requirements of the enzyme. Quantum chemical calculations were carried out to obtain appropriate charges for the simulations around the enzyme active site, since the presence of the Zn ion and its ligands leads to large electronic structure effects that are reflected in the MD force field parameterization. For the free enzyme and the intermediate analog complex, the agreement between the simulation results and the X-ray structures is excellent, showing the importance of proper force field parameterization when metals are ligated with residues whose protonation states depend on the metal and on the nature and number of ligands.
The simulations of the free enzyme and its intermediate analog complex show that the protein N-terminus is quite flexible, while all other parts of the enzyme are quite rigid on the MD time scale. The crystallographic data for the free and intermediate analog complex forms are very similar, raising the issue of how substrate enters and product leaves the active site. We do find that the Phe114 loop region and the C-terminal helix sample different configurations in the two subunits, even though the RMSFs for the CA atoms are almost the same and only around 1 Å. This may correspond to a more-or-less rigid body motion of these regions that could permit entrance of a substrate. The 2 ns trajectories may not be long enough to provide enough sampling time to reveal the larger motions that may be required for substrate entry.

Rotation of the carboxyl group of Glu64 is critical to position it relative to the cytosine Zn(HOH/OH) complex and its intermediates to facilitate the various proton transfer steps along the reaction pathway. The MD potential of mean force calculation for Glu64 carboxylate rotation suggests that such motion is feasible on a ns time scale. The MD simulations of the reactant and a series of intermediates also reveal that cytosine adjusts its orientation and position to assist in the nucleophilic attack by the Zn(OH).

With reference to Figure 5-2, the steps along the reaction pathway can be summarized as follows. First, cytosine and yCD form an initial complex (1 in Figure 5-2) with cytosine having an orientation rather different from that of the intermediate analog found in the crystal structures. Second, the binding of cytosine facilitates proton transfer from Zn(H2O) to form Glu64H and Zn(OH) (2 in Figure 5-2). The carboxylic acid group of Glu64H then rotates so that its OH group forms a hydrogen bond with N3 of cytosine (2' in Figure 5-2).
Third, Glu64H then transfers its proton to N3 of cytosine to form the yCD hydroxide/Glu64 cytosineH complex (3 in Figure 5-2). The hydrogen bond between the carbonyl oxygen of Ser89 and the amino group of cytosine is strongest at this stage and helps position C4 of cytosine for the subsequent nucleophilic attack by Zn(OH). Fourth, after nucleophilic attack to produce intermediate I (4 in Figure 5-2), the MD shows that Glu64 hydrogen bonds to N3 (and NH2) and can again rotate to be re-protonated by C4OH to form intermediate II (5 in Figure 5-2). Fifth, the MD trajectory of intermediate II leads to Glu64H hydrogen bonding to N4 and C4O, which implies that Glu64H has rotated again to point to N4, facilitating decomposition of the tetrahedral complex to the product complex (6 in Figure 5-2).

In the MD simulation of the product complex, ammonia forms a stable hydrogen bond with Glu64. On the ns timescale, it does not leave the active site. In view of our previous study8 that proposed a mechanism for the release of uracil that relies on a water molecule replacing ammonia in the active site, a water molecule was exchanged for ammonia (7 in Figure 5-2) and, in the subsequent MD simulation, this water forms a similar hydrogen bond with Glu64. Our evaluation of the binding free energy difference between water and ammonia shows that the difference is sufficiently small that, when coupled with the much greater concentration of water than of ammonia in the vicinity of the active site, ammonia is likely to exchange readily with water.

References

(1) Nishiyama, T.; Kawamura, Y.; Kawamoto, K.; Matsumura, H.; Yamamoto, N.; Ito, T.; Ohyama, A.; Katsuragi, T.; Sakai, T. Cancer Res. 1985, 45, 1753.
(2) Dipiro, J. T.; Talbert, R. L.; Yee, G. C.; Matzke, G. R.; Wells, B. G. Pharmacology: A Pathophysiologic Approach, 3rd ed.; Appleton and Lange: Stamford, CT, 1997.
(3) Morris, S. M. Mutat. Res. 1993, 297, 39.
(4) Ireton, G. C.; Black, M. E.; Stoddard, B. L. Structure 2003, 11, 961.
(5) Ko, T. P.; Lin, J. J.; Hu, C. Y.; Hsu, Y. H.; Wang, A. H. J.; Liaw, S. H. J. Biol. Chem. 2003, 278, 19111.
(6) Xiang, S. B.; Short, S. A.; Wolfenden, R.; Carter, C. W. Biochemistry 1995, 34, 4516.
(7) Xiang, S. B.; Short, S. A.; Wolfenden, R.; Carter, C. W. Biochemistry 1997, 36, 4768.
(8) Sklenak, S.; Yao, L.; Cukier, R. I.; Yan, H. J. Am. Chem. Soc. 2004, 126, 14879.
(9) Dudev, T.; Lim, C. J. Phys. Chem. B 2001, 105, 10709.
(10) Dudev, T.; Lim, C. J. Am. Chem. Soc. 2002, 124, 6759.
(11) Simonson, T.; Calimet, N. Proteins 2002, 49, 37.
(12) Banci, L. Curr. Opin. Chem. Biol. 2003, 7, 143.
(13) Hoops, S. C.; Anderson, K. W.; Merz, K. M. J. Am. Chem. Soc. 1991, 113, 8262.
(14) Suarez, D.; Merz, K. M. J. Am. Chem. Soc. 2001, 123, 3759.
(15) Pearlman, D. A.; Case, D. A.; Caldwell, J. W.; Ross, W. S.; Cheatham, T. E.; Debolt, S.; Ferguson, D.; Seibel, G.; Kollman, P. Comput. Phys. Commun. 1995, 91, 1.
(16) Case, D. A.; Pearlman, D. A.; Caldwell, J. W.; Cheatham, T. E., III; Wang, J.; Ross, W. S.; Simmerling, C. L.; Darden, T. A.; Merz, K. M.; Stanton, R. V.; Cheng, A. L.; Vincent, J. J.; Crowley, M.; Tsue, V.; Gohlke, H.; Radmer, R.; Duan, Y.; Pitera, J.; Massova, I.; Seibel, G. L.; Singh, G.; Weiner, P.; Kollman, P. A. AMBER7; University of California: San Francisco, 2002.
(17) Diaz, N.; Suarez, D.; Merz, K. M. M. J. Am. Chem. Soc. 2000, 122, 4197.
(18) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1996, 118, 2309.
(19) Ryckaert, J. P.; Ciccotti, G.; Berendsen, H. J. C. J. Comput. Phys. 1977, 23, 327.
(20) Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, G. L. J. Chem. Phys. 1995, 103, 8577.
(21) Berendsen, H. J. C.; Postma, J. P. M.; Gunsteren, W. F.; DiNola, A.; Haak, J. R. J. Chem. Phys. 1984, 81, 3684.
(22) Torrie, G. M.; Valleau, J. P. Chem. Phys. Lett. 1974, 28, 578.
(23) Frenkel, D.; Smit, B.
Understanding Molecular Simulation: From Algorithms to Applications; Academic Press: San Diego, 1996.
(24) Souaille, M.; Roux, B. Comput. Phys. Commun. 2001, 135, 40.
(25) Boresch, S.; Tettinger, F.; Leitgeb, M.; Karplus, M. J. Phys. Chem. B 2003, 107, 9535.
(26) Kraulis, P. J. J. Appl. Crystallogr. 1991, 24, 946.
(27) Bacon, D. J.; Anderson, W. F. J. Mol. Graphics 1988, 6, 219.
(28) Merritt, E. A.; Bacon, D. J. Methods Enzymol. 1997, 277, 505.

Appendices

Table 5-1 Force constants for the Zn complex.

Kr (kcal/mol·Å²)
  Zn-S      Zn-O      Zn-N
  300       300       300

Kθ (kcal/mol·rad²)
  S-Zn-S    S-Zn-O    S-Zn-N    O-Zn-N
  75        75        75        75
  Zn-O-C    Zn-S-C    Zn-O-H    Zn-N-C
  35        35        35        35

Table 5-2 Charges (e units) of the Zn complex used in the yCD simulations

Residue       Atom   1        2        3        4        5        6
His62         CB     -0.168   -0.420   -0.708    0.047    0.129   -0.278
              HB2     0.082    0.105    0.266    0.063    0.045    0.150
              HB3     0.082    0.105    0.266    0.063    0.045    0.150
              CG      0.258    0.241    0.284    0.005   -0.051    0.074
              ND1    -0.283   -0.111   -0.303   -0.168   -0.159   -0.143
              CE1    -0.061   -0.195    0.021   -0.100   -0.082   -0.019
              HE1     0.120    0.175    0.159    0.168    0.169    0.133
              NE2     0.071    0.002   -0.336   -0.086   -0.119   -0.097
              HE2     0.233    0.276    0.416    0.326    0.335    0.281
              CD2    -0.427   -0.401   -0.168   -0.323   -0.291   -0.342
              HD2     0.313    0.268    0.195    0.218    0.220    0.254
Cys91/Cys94   CB      0.221   -0.006    0.215    0.08     0.059    0.027
              HB2    -0.035    0.016   -0.056    0.019    0.026   -0.002
              HB3    -0.035    0.016   -0.056    0.019    0.026   -0.002
              SG     -0.644   -0.568   -0.729   -0.703   -0.723   -0.673
Zn            Zn      0.762    0.609    1.027    0.876    0.887    0.668

(1) Data used in free form and water/cytosine active site simulations.
(2) Data used in intermediate analog complex simulation.
(3) Data used in hydroxide/Glu64H cytosine complex and hydroxide/Glu64 cytosineH complex simulations.
(4) Data used in intermediate I complex simulation.
(5) Data used in intermediate II complex simulation.
(6) Data used in product uracil/ammonia (and uracil/water) complex simulations.

Table 5-3 RMSD for free and intermediate analog complex forms

RMSD (Å)      free form                    intermediate analog complex
              Subunit 1     Subunit 2      Subunit 1     Subunit 2
CA            0.72 (0.06)   1.01 (0.07)    0.90 (0.14)   0.77 (0.05)
Total (a)     1.29 (0.06)   1.60 (0.07)    1.46 (0.11)   1.24 (0.05)
N-terminal    2.28 (0.24)   3.43 (0.78)    2.65 (0.32)   3.66 (0.96)
C-terminal    0.98 (0.11)   1.50 (0.11)    1.40 (0.37)   1.32 (0.23)
Loop          1.41 (0.19)   1.93 (0.22)    1.97 (0.32)   1.20 (0.09)

(a) Not including residues 1-14.

Table 5-4 Hydrogen bonds in yCD free and intermediate analog complex forms

Hydrogen bond           Occurrence (%)   Distance (Å)   Crystal structure (Å)

Subunit 1
Intermediate complex
Asn51-NH...O2            99.47           2.82 (0.10)    3.14
Asp155-OD...H-N1        100.00           2.70 (0.08)    2.69
Gly63-NH...O2            92.68           2.94 (0.12)    2.89
Glu64-OE1...H-O4         37.42           3.06 (0.11)    3.34
Glu64-OE2...H-O4         99.73           2.50 (0.07)    2.49
Cys91-NH...O4            22.90           3.10 (0.07)    3.02
Glu64-OE1...N3           99.87           2.79 (0.09)    2.80
Free form
Glu64-OE2...H-WAT       100.00           2.62 (0.10)

Subunit 2
Intermediate complex
Asn51-NH...O2            98.80           2.85 (0.11)    2.90
Asp155-OD...N1          100.00           2.69 (0.07)    2.69
Gly63-NH...O2            89.21           2.97 (0.12)    2.87
Glu64-OE1...H-O4         39.41           2.95 (0.24)    3.37
Glu64-OE2...H-O4         95.61           2.53 (0.11)    2.52
Cys91-NH...O4            21.04           3.12 (0.06)    3.03
Glu64-OE1...N3           91.48           2.80 (0.09)    2.78
Free form
Glu64-OE2...H-WAT        73.60           2.57 (0.10)
Glu64-OE1...H-WAT        28.93           2.60 (0.16)

Table 5-5 Hydrogen bonds in yCD water/cytosine simulation

                        Subunit 1                       Subunit 2
Hydrogen bond           Occurrence (%)   Distance (Å)   Occurrence (%)   Distance (Å)
Asn51-NH...O2            99.73           2.82 (0.10)     99.60           2.84 (0.11)
Asp155-OD...H-N1         99.07           2.84 (0.11)    100.00           2.79 (0.10)
Gly63-NH...O2            75.87           2.99 (0.12)     84.80           2.97 (0.12)
Glu64-OE1...H1-N4         8.27           2.94 (0.12)     88.00           2.88 (0.13)
Glu64-OE1...H2-N4        32.93           2.87 (0.14)
Glu64-OE2...H1-N4        23.20           2.86 (0.13)      9.33           3.04 (0.10)
Glu64-OE2...H2-N4        50.67           2.84 (0.13)     31.07           2.88 (0.13)
Glu64-OE1...H-WAT        70.80           2.59 (0.09)
Glu64-OE2...H-WAT        29.33           2.60 (0.09)    100.00           2.60 (0.10)
Cys91-NH...O-WAT         98.93           2.92 (0.10)     98.40           2.91 (0.09)
Ser89-O...H2-N4                                          49.20           2.91 (0.13)

Table 5-6 Hydrogen bonds in yCD hydroxide/Glu64H cytosine simulation

                        Subunit 1                       Subunit 2
Hydrogen bond           Occurrence (%)   Distance (Å)   Occurrence (%)   Distance (Å)
Asn51-NH...O2            92.67           2.91 (0.12)     98.67           2.84 (0.11)
Asp155-OD...H-N1         99.33           2.78 (0.10)     36.27           2.79 (0.11)
Gly63-NH...O2            89.33           2.96 (0.12)     63.47           3.01 (0.11)
Glu64-OE1...H1-N4        79.73           2.94 (0.13)     79.33           2.90 (0.13)
Glu64-OE2H...OH          97.73           2.60 (0.10)     99.87           2.62 (0.10)
Cys91-NH...OH            99.07           2.91 (0.10)     90.40           2.99 (0.11)
Ser89-O...H2-N4          79.33           2.96 (0.12)

Table 5-7 Hydrogen bonds in yCD hydroxide/Glu64 cytosineH simulation

                        Subunit 1                       Subunit 2
Hydrogen bond           Occurrence (%)   Distance (Å)   Occurrence (%)   Distance (Å)
Asn51-NH...O2            63.73           3.03 (0.11)     74.53           2.99 (0.11)
Asp155-OD...H-N1        100.00           2.70 (0.08)    100.00           2.71 (0.08)
Gly63-NH...O2            54.27           3.04 (0.10)     82.00           2.99 (0.11)
Glu64-OE1...H1-N4       100.00           2.73 (0.08)     99.87           2.72 (0.07)
Glu64-OE1...H-N3         44.67           3.07 (0.10)     38.40           3.07 (0.10)
Glu64-OE2...H-N3         99.87           2.80 (0.09)    100.00           2.79 (0.09)
Cys91-NH...OH           100.00           2.82 (0.08)    100.00           2.83 (0.08)
Ser89-O...H2-N4          80.00           2.87 (0.13)     91.60           2.84 (0.12)

Table 5-8 Hydrogen bonds in yCD intermediate I/II simulations

                        Intermediate I                  Intermediate II
Hydrogen bond           Occurrence (%)   Distance (Å)   Occurrence (%)   Distance (Å)
Asn51-NH...O2            99.07           2.86 (0.11)    100.00           2.84 (0.10)
Asp155-OD...H-N1        100.00           2.70 (0.08)    100.00           2.76 (0.09)
Gly63-NH...O2            88.00           2.97 (0.12)     89.07           2.96 (0.12)
Glu64-OE1...H1-N4        22.93           3.02 (0.10)
Glu64-OE2...H1-N4        49.07           3.02 (0.11)
Glu64-OE1...H-O4         34.80           2.65 (0.19)
Glu64-OE2-H...O4                                         83.60           2.83 (0.13)
Glu64-OE2-H...N4                                        100.00           2.72 (0.11)
Glu64-OE1...H-N3         30.80           2.86 (0.10)    100.00           2.83 (0.10)
Glu64-OE2...H-N3         69.47           2.84 (0.10)
Cys91-NH...O4            96.13           2.97 (0.10)     97.47           2.96 (0.11)
Ser89-O...H2-N4          40.27           3.03 (0.11)     36.67           3.06 (0.10)

Table 5-9 Hydrogen bonds in yCD uracil/ammonia and uracil/water simulations.
                               uracil/ammonia                  uracil/water
Hydrogen bond                  Occurrence (%)   Distance (Å)   Occurrence (%)   Distance (Å)
Asn51-NH...O2                   97.87           2.87 (0.12)     96.93           2.88 (0.12)
Asp155-OD...H-N1               100.00           2.69 (0.08)    100.00           2.70 (0.07)
Gly63-NH...O2                   79.20           2.99 (0.11)     87.60           2.98 (0.12)
Glu64-OE2...H-N3                99.87           2.83 (0.09)     99.07           2.86 (0.11)
Cys91-NH...O4                   60.27           3.07 (0.09)     45.20           3.08 (0.08)
Glu64-OE1...H1-NH3/(H1-WAT)     94.67           2.84 (0.11)    100.00           2.65 (0.11)
Ser89-O...H2-NH3/(H2-WAT)       14.53           3.05 (0.10)     23.47           2.81 (0.17)

Table 5-10 Chemical mutation free energies between water and ammonia.

Direction of mutation     In water                     In protein
                          Alchemy    Restraint (a)     Alchemy      Restraint (a)
NH3 to H2O (kcal/mol)     -2.74      -8.36             -4.32 (b)    -8.36
H2O to NH3 (kcal/mol)      2.82       8.36              6.41 (c)     8.36

(a) A 1 M standard state was used as a reference state for the gas phase of the dummy atom.
(b) Result from subunit 1.
(c) Result from subunit 2.

Figure 5-1. (a) Ribbon representation of the crystal structure of yCD drawn according to the coordinates of the 1.14 Å crystal structure. The figure was prepared with Molscript26 and Raster3D.27,28 (b) Schematic drawing of the coordination sphere of the catalytic zinc ion and polar interactions between yCD and a reaction intermediate analog as revealed by X-ray crystallography. The distances (Å) between the zinc ion and its ligands and between the heavy atoms involved in hydrogen bonds are obtained from one subunit (A) of the 1.14 Å crystal structure of yCD with the reaction intermediate analog complex.

[Figure: schematic drawings of active-site structures (Zn, Glu64, Cys91, Asn51, Asp155); image not recoverable]
$$\frac{\partial\,\mathrm{PMF}(R)}{\partial R} = \left\langle \frac{\partial H}{\partial r} \right\rangle_{r=R} \tag{2}$$

in which $\langle \cdots \rangle_{r=R}$ is a conditional ensemble average with the condition r = R. Since r is a function only of the coordinates of certain atoms in the high-level QM part, equation (2) can be further simplified to

$$\frac{\partial\,\mathrm{PMF}(R)}{\partial R} = \left\langle \frac{\partial \left(H_{h}^{qm} + H_{hm}^{qm} + H_{hl}^{qm}\right)}{\partial r} \right\rangle_{r=R} \approx \left\langle \frac{\partial \left(H_{h}^{qm} + H_{hm}^{qm}\right)}{\partial r} \right\rangle_{r=R} \tag{3}$$

provided that the interaction between the core layer and the outer layer is very small. The outer layer influences the mean force through the ensemble average. To estimate this average, a certain number of MD snapshots need to be generated with the constraint r = R. Then the mean force can be evaluated by equation (3). By integrating equation (3), the PMF becomes

$$\mathrm{PMF}(r_1) - \mathrm{PMF}(r_0) = \int_{r_0}^{r_1} \left\langle \frac{\partial H}{\partial r'} \right\rangle_{r'=r} dr \tag{4}$$

But, as discussed above, this QM/MM simulation is impractical with current computer resources because of the expensive QM calculation. Certain approximations have to be introduced to evaluate the mean force more efficiently. We propose a two-step method to approximate this QM/MM simulation.
1. One MM simulation, in which all the atoms are treated classically, is performed with the restraint r = R.
2. A certain number of snapshots are chosen for a two-layer ONIOM optimization with the high- and low-level QM methods, the same as those used in the "ideal" QM/MM simulation. In the ONIOM optimization, the low-level layer is fixed and the high-level layer is fully optimized with the constraint r = R. Then the force between the two atoms can be calculated and averaged over the snapshots by using equation (3).
By repeating steps 1 and 2 with different constraint distances, the PMF can be calculated by using equation (4). Since the chemical bond cleavage process cannot be fully described by the MM method, the core layer has to be optimized with the high-level QM method before the force calculation, which acts as a refinement of the core structure. The main errors introduced by this approximation come from the treatment of the core and medium layers in the simulation. For the medium layer, the use of MM will certainly introduce an error.
But since no chemical reaction occurs in the medium layer, MM should be a reasonable approximation to a low-level QM method. If the MM description of the medium layer responds properly to the constraint changes in the core layer and generates the corresponding conformational changes, the ONIOM optimization in step 2 will produce an accurate core-region conformation, which will give a good evaluation of the force. The second error comes from the MM treatment of the core layer in the simulation. The inadequate treatment of the core in the simulation may not generate good conformations for the force calculation; thus the ONIOM optimization has to be performed in step 2. The third error is a higher-order error compared with the first two. The erroneous core conformations generated in the simulation might cause an erroneous response of the medium layer, which in turn will influence the ONIOM optimization in the second step. To guarantee that this error is small, two conditions have to be fulfilled.
1. MM is a good method to describe the initial state, which is likely true with the current force field because the initial state is a well-defined state.
2. During the reaction process, the modification of the core is small, so that the resulting changes in quantum effects on the medium layer are insignificant.
In conclusion, in this two-step protocol the PMF quality depends primarily on the quality of the MM treatment of the core system and on the complexity of the reaction process.

MD simulation and ONIOM calculation of the yCD uracil complex

A 2-ns MD simulation was previously performed for the yCD uracil complex at constant temperature (300 K) and constant volume (Yao et al. 2005). The active-site Zn and uracil O4 were covalently linked by a harmonic potential with a force constant of 300 kcal/(mol·Å²). In order to see what happens after the Zn-O4 bond breaks, one 1-ns MD simulation was carried out, starting from one snapshot with the harmonic potential removed.
It appears that the Zn-O4 distance increased to ~3.6 Å. One MD snapshot with a Zn-O4 distance of 3.86 Å was selected and trimmed for a two-layer ONIOM (B3LYP/PM3) optimization in Gaussian 03 (Frisch), with the same setup as in the previous study (Sklenak et al. 2004). The outer layer, which is fixed and treated by PM3, includes Ile33, Asn51, Thr60, Leu61, Gly63, Ile65, Leu88, Ser89, Pro90, Asp92, Met93, Thr95, Phe114, Trp152, Phe153, Glu154, Asp155, and Ile156. The inner layer, which is fully optimized with B3LYP (Becke 1993), includes uracil, Glu64, Zn and the Zn-bound residues His62, Cys91 and Cys94. The ONIOM-optimized Zn-O4 distance is similar to that from MD. This distance was then scanned from 3.8 Å to 2.0 Å to see whether the scan can reproduce the relative energy profile from the crystal structure calculation. Another 1-ns MD simulation was performed with the harmonic potential between Zn and O4 maintained but with the equilibrium distance changed by 0.1 Å every 50 ps from 2.0 Å to 3.8 Å, starting from one MD snapshot of the first simulation. The coordinates were saved every 2 ps, and the first 20 ps was used as the equilibration period. For each Zn-O4 distance, 5 snapshots were chosen evenly and optimized by the ONIOM method as described above; then the force between Zn and O4 was calculated based on a Hessian matrix and averaged over the 5 snapshots. The two-step process is illustrated in Figure 6-1. The error of the force average is rather small based on the standard deviation (see Results), except at a distance of 2.3 Å, where 10 snapshots were used in the ONIOM calculation.

Results and discussion

Scan with the fixed outer layer

After Glu64 transfers two protons from the Zn-bound water molecule to cytosine, NH3 is formed and released, but O4 of uracil is still covalently bound to the Zn atom (Sklenak et al. 2004). In our previous calculation, direct Zn-O4 bond cleavage appeared to be extremely difficult (Sklenak et al. 2004).
But one potential problem of that calculation is that the outer layer of the protein was fixed according to the crystal structure. If some of these residues need to rearrange during Zn-O4 bond breaking, fixing the outer layer will make the barrier artificially high because of steric clashes. ONIOM optimizations were carried out to scan the Zn-O4 distance from 2.0 Å to 3.8 Å, based on the Zn-O4 bound product complex structure generated from the yCD inhibitor crystal structure (Ireton et al. 2003; Ko et al. 2003; Sklenak et al. 2004). One MD simulation was performed with the bond restraint between Zn and O4 removed, which effectively gives a relaxed system with the Zn-O4 bond broken. Then the same ONIOM optimizations were performed based on the MD snapshot. By comparing these two energy profiles, we can see whether constraining the outer layer introduces any significant artifact. As shown in Figure 6-2, the ONIOM energy for the crystal structure increases monotonically from 0 kcal/mol to 15 kcal/mol as the Zn-O4 distance changes from 2.0 Å to 3.8 Å, while the ONIOM energy for the MD snapshot decreases with increasing distance. This suggests that the active site rearranged itself in the MD simulation when the Zn-O4 bond was cleaved. Further investigation shows that OE1 of Glu64 moves close to the Zn atom after the cleavage of the Zn-O4 bond, because of electrostatic attraction (Figure 6-3). The partial charge of Zn is +0.67 e and the charge of OE1 of Glu64 is -0.82 e. The average distance between Zn and OE1 is 1.99±0.09 Å, while the distance between Zn and O4 is 3.59±0.27 Å in the MD simulation. So OE1 acts as the fourth atom coordinated to Zn after the loss of the Zn-O4 interaction. The water molecule that forms hydrogen bonds with the Ser89 backbone carbonyl and the Glu64 carboxyl in the starting structure was pushed away during the MD simulation (Figure 6-3).
This process was not observed during the ONIOM distance scan of the crystal structure, because the backbone of Glu64 was treated as part of the outer layer and fixed, which prohibits OE1 from moving close to Zn, and the water molecule is contained by the outer-layer residues. On the other hand, during the Zn-O4 distance scan of the MD snapshot, the outer layer was also fixed but had already relaxed to accommodate the cleaved Zn-O4 bond. That is probably why the system seems to be more stable with the Zn-O4 bond cleaved in the ONIOM calculations of the MD snapshot. Therefore, the ONIOM calculations based on the crystal structure and on the MD snapshot are two extremes: the crystal structure represents the covalently bound form of the Zn-O4 complex, and the MD snapshot represents the unbound form. One has to scan the Zn-O4 distance and rearrange the outer-layer residues at the same time to generate the energy profile correctly.

Combining ONIOM with MD

As discussed above, the ONIOM method with a static outer layer cannot describe the Zn-O4 bond cleavage process. MD has to be combined with ONIOM to account for the changes. First, MD was used to generate snapshots with the Zn-O4 distance restrained at different values, and then ONIOM was used to optimize these snapshots and calculate the forces. The average force at each Zn-O4 distance converges quickly, as shown in Figure 6-4, so only 5 snapshots were used for each distance, except at 2.3 Å, where 10 snapshots were used. The average force decreases dramatically from 9.2 kcal/(mol·Å) to -12.4 kcal/(mol·Å) as the distance increases from 2.0 Å to 2.2 Å. The force then increases to -2.0 kcal/(mol·Å) at 2.4 Å; after that it increases slowly to ~2 kcal/(mol·Å) at 2.7 Å and fluctuates around that value until 3.5 Å, where it drops to ~0 kcal/(mol·Å).
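The averaging and integration steps of the two-step protocol can be sketched as follows. This is a minimal illustration under stated assumptions, not the scripts used in this work: the function names are hypothetical, the "snapshot forces" are made-up values on a toy harmonic well, and the sign convention (force = -dPMF/dr, so a positive force pushes the two atoms apart) is an assumption chosen to match the way the force curve is discussed in the text.

```python
from math import sqrt
from statistics import mean, stdev


def mean_force(snapshot_forces):
    """Average the per-snapshot forces at one restrained distance.

    Returns (mean, standard error). The standard error is only a rough
    convergence check and assumes the snapshots are independent.
    """
    avg = mean(snapshot_forces)
    n = len(snapshot_forces)
    sem = stdev(snapshot_forces) / sqrt(n) if n > 1 else 0.0
    return avg, sem


def integrate_pmf(distances, mean_forces):
    """Trapezoid-rule PMF relative to the first distance.

    Assumed convention: mean_forces holds F = -dPMF/dr, so the PMF is
    minus the running integral of F over the distance grid.
    """
    pmf = [0.0]
    for k in range(1, len(distances)):
        dr = distances[k] - distances[k - 1]
        pmf.append(pmf[-1] - 0.5 * (mean_forces[k] + mean_forces[k - 1]) * dr)
    return pmf


# Toy check on a harmonic well W(r) = (r - 2.0)**2 with made-up noise.
distances = [2.0 + 0.1 * i for i in range(11)]
noise = (-0.2, -0.1, 0.0, 0.1, 0.2)  # hypothetical snapshot scatter
toy_snapshots = [[-2.0 * (r - 2.0) + e for e in noise] for r in distances]
forces = [mean_force(s)[0] for s in toy_snapshots]
pmf = integrate_pmf(distances, forces)
print(round(pmf[-1], 6))
```

With real data, `toy_snapshots` would be replaced by the ONIOM forces collected at each restrained Zn-O4 distance, and the zero crossings of `forces` would mark the stationary points of the resulting curve.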
The force curve passes through zero three times: the first crossing, between 2.0 Å and 2.1 Å, indicates the first minimum in the potential of mean force curve; the second, between 2.6 Å and 2.7 Å, a maximum; and the third, between 3.4 Å and 3.5 Å, another minimum. The potential of mean force was calculated by discrete summation of the average force, as shown in Figure 6-5. It describes a typical chemical reaction process with one transition state, which has a Zn-O4 distance of approximately 2.6-2.7 Å. The barrier for the reaction is ~2.9 kcal/mol. The corresponding three states are shown in Figure 6-6. In the transition state Zn is penta-coordinated by His62, Cys91, Cys94, Glu64 and uracil, because OE1 of Glu64 moves close and coordinates to Zn. In fact, the bond between Zn and OE1 is formed at the beginning of the Zn-O4 bond cleavage, as shown in Figure 6-7. The Zn-OE1 distance decreases to 2.1 Å when the Zn-O4 distance increases to 2.4 Å. In the transition state Zn has only a weak interaction with O4 of uracil, since the distance is already ~2.6-2.7 Å. After the reaction, the backbone amide of Cys91 forms a hydrogen bond with OE1, similar to what occurs in the MD simulation (Figure 6-3). Several hydrogen bond interactions between uracil and active-site residues were maintained during the reaction, including Glu64 OE2 with N3H, Gly63 NH with O2, Asn51 NH2 with O2, and Asp155 OD1 with N1H, consistent with the MD simulation (Figure 6-3).

Conclusion

In this study we revisit the Zn-O4 bond cleavage mechanism during uracil release. The mechanism proposed previously, based on a two-layer ONIOM quantum mechanical method, is an oxygen exchange mechanism (Sklenak et al. 2004), and this exchange was observed in experiments (unpublished data). But the rate of the exchange is very slow, and it is therefore unlikely to be the Zn-O4 cleavage mechanism. The previous ONIOM calculation showed that direct Zn-O4 bond breaking needs to overcome a huge energy barrier.
In this study, we demonstrate that this barrier comes mainly from the constraint on the outer layer, which prohibits the proper response of active-site residues, especially the Glu64 carboxyl group. Increasing the Zn-O4 distance starting from the crystal structure results in a monotonic increase of energy, whereas increasing the Zn-O4 distance starting from the MD snapshot shows a steady decrease in energy. The reason is that the ONIOM optimization from the crystal structure is a good approximation of the Zn-O4 covalently linked state, while the ONIOM optimization from the MD snapshot is a proper approximation of the state with the Zn-O4 bond broken. With the outer layer constrained, the intermediates in between cannot be described correctly, which makes the ONIOM calculation always favor the starting point. One strategy to solve this problem is to use a QM/MM simulation method. But the presence of Zn in the system requires a high-level QM method, which makes the simulation impractical. Instead, we proposed a two-step process to mimic this target simulation by combining an MM simulation with a two-layer ONIOM QM calculation. We demonstrated that this process can describe the response of the outer layer properly. Zn-O4 bond cleavage occurs through a penta-coordinated Zn complex, and the barrier for this process is rather low, only about 2.9 kcal/mol.

References

Becke, A. D. (1993). "Density-Functional Thermochemistry. 3. The Role of Exact Exchange." Journal of Chemical Physics 98(7): 5648-5652.
Dudev, T. and C. Lim (2001). "Modeling Zn2+-cysteinate complexes in proteins." Journal of Physical Chemistry B 105(43): 10709-10714.
Frisch, M. J. "Gaussian 03."
Ireton, G. C., M. E. Black, et al. (2003). "The 1.14 angstrom crystal structure of yeast cytosine deaminase: Evolution of nucleotide salvage enzymes and implications for genetic chemotherapy." Structure 11(8): 961-972.
Ko, T. P., J. J. Lin, et al. (2003). "Crystal structure of yeast cytosine deaminase - Insights into enzyme mechanism and evolution."
Journal of Biological Chemistry 278(21): 19111-19117.
Maseras, F. and K. Morokuma (1995). "IMOMM - a New Integrated Ab-Initio Plus Molecular Mechanics Geometry Optimization Scheme of Equilibrium Structures and Transition-States." Journal of Computational Chemistry 16(9): 1170-1179.
Sklenak, S., L. S. Yao, et al. (2004). "Catalytic mechanism of yeast cytosine deaminase: An ONIOM computational study." Journal of the American Chemical Society 126(45): 14879-14889.
Svensson, M., S. Humbel, et al. (1996). "ONIOM: A multilayered integrated MO+MM method for geometry optimizations and single point energy predictions. A test for Diels-Alder reactions and Pt(P(t-Bu)(3))(2)+H-2 oxidative addition." Journal of Physical Chemistry 100(50): 19357-19363.
Vreven, T., B. Mennucci, et al. (2001). "The ONIOM-PCM method: Combining the hybrid molecular orbital method and the polarizable continuum model for solvation. Application to the geometry and properties of a merocyanine in solution." Journal of Chemical Physics 115(1): 62-72.
Vreven, T., K. Morokuma, et al. (2003). "Geometry optimization with QM/MM, ONIOM, and other combined methods. I. Microiterations and constraints." Journal of Computational Chemistry 24(6): 760-769.
Yao, L. S., Y. Li, et al. (2005). "Product release is rate-limiting in the activation of the prodrug 5-fluorocytosine by yeast cytosine deaminase." Biochemistry 44(15): 5940-5947.
Yao, L. S., S. Sklenak, et al. (2005). "A molecular dynamics exploration of the catalytic mechanism of yeast cytosine deaminase." Journal of Physical Chemistry B 109(15): 7500-7510.

Appendices

Figure 6-1 The flow chart of the MD/ONIOM combination used for the Zn-O bond cleavage process.

Figure 6-3 The rearrangement of the active site during MD simulation after the Zn-O4 bond cleavage.
Figure 6-4 The average forces between Zn and O4, plotted against the Zn-O4 distance (Å). The bars in the plot represent the errors.

Figure 6-5 The potential of mean force along the Zn-O4 bond cleavage, plotted against the Zn-O4 distance (Å). States (1), (2) and (3) are marked on the curve.

Figure 6-6 The three states, (1), (2) and (3), defined based on the PMF during Zn-O4 bond cleavage.

Figure 6-7 The change of the Zn-OE1 distance (Å) during the cleavage of Zn-O4, plotted against the Zn-O4 distance (Å).

Chapter 7 A Molecular Dynamics Study of the Ligand Release Path in Yeast Cytosine Deaminase

Introduction

Yeast cytosine deaminase (yCD) catalyzes the deamination of cytosine to uracil. Cytosine deaminase is present in bacteria and fungi as part of the pyrimidine salvage pathway but is absent in mammals (Nishiyama et al. 1985). yCD can also catalyze the deamination of the prodrug 5-fluorocytosine (5-FC) to the anticancer drug 5-fluorouracil (5-FU). Thus, a potential system for gene-directed enzyme prodrug therapy combines yCD and the prodrug 5-FC, since the product, 5-FU, inhibits DNA synthesis and is thus a potent chemotherapeutic agent (Morris 1993). X-ray structures of yCD in the apo form (Ireton et al. 2003) and in complex with the potent inhibitor 2-pyrimidinone (Ireton et al. 2003; Ko et al. 2003) are available. yCD is a homodimer with one active site in each protomer.
The yCD protomer contains a central β-sheet (β1-β5, parallel to each other) with two α-helices (α1 and α5) on one side and four α-helices (α2, α3, α4 and α6) on the other side. Each active center contains a single catalytic zinc ion that is tetrahedrally coordinated by a histidine (His62), two cysteines (Cys91 and Cys94), and a water molecule in the substrate-free enzyme or the inhibitor in the complex. Surprisingly, the structure of apo yCD is essentially the same as that of the transition state analog complex, with the active site completely buried. Previous MD simulations of the apo form and of the reactant, various intermediate, and product complexes (Yao et al. 2005) all showed a buried active site during the ~2.0 ns simulations. The overall protein is quite rigid in those simulations, consistent with NMR backbone NOE data and order parameters, which describe motion on the ps-ns time scale (chapter 3). The question that we address in this work is how the reactant binds and the product is released in the catalytic cycle, given the buried active site. Quench-flow experiments performed to study the reaction mechanism indicate that product release is the rate-limiting step in the activation of 5-FC (Yao et al. 2005). A slow product release process was also demonstrated by a 19F NMR study, with the product release rate determined to be ~13.1 s⁻¹. But it is interesting that the Kd of 5-FU is ~25 mM, suggesting that it is not a tightly binding ligand. Sklenak et al. (2004) proposed a two-step process for the release of uracil as well as 5-FU. The first step is cleavage of the bond between the O4 atom of 5-FU (Figure 7-1) and the catalytic Zn atom through an oxygen exchange mechanism in which 5-FU exchanges its O4 with a water molecule. In the second step, non-covalently bound 5-FU moves out of the active site. The first step has a rather high activation barrier (Sklenak et al. 2004), which was thought to be the cause of the slow release of 5-FU.
However, our recent experiments (unpublished data) show that, although this oxygen exchange occurs, it is too slow and is unlikely to be the mechanism of Zn-O4 cleavage. A new mechanism was proposed in which the Zn-O4 bond is broken through a five-coordinated Zn transition state, and the barrier for Zn-O4 bond cleavage was found to be quite low, only ~3 kcal/mol (chapter 6). Therefore, it may be that, rather than being limited by a chemical step, product release is limited by the loss of non-covalent interactions (such as hydrogen bonds) with the protein and by the steric hindrance due to the closed nature of the active site. If product release is rate limiting, knowledge of how the product moves out of, or the reactant moves into, the active site would certainly help us understand this process and lead to insights that would permit protein engineering to make this process faster and consequently improve the efficiency of the enzyme. In this work, we explore several possible ligand release paths by pushing cytosine out of the active site with a restraint method. The methodology is quite similar to steered molecular dynamics (SMD), which has been used to study protein-ligand binding and unbinding processes in various systems (Isralewitz et al. 2001; Shen et al. 2003; Xu et al. 2003). But before the ligand is pushed out of the active site, potential paths must be chosen. Typical ns simulations at 300 K can hardly give us valid information about the ligand release path and the residues involved. So the strategy we adopted is to perform one regular simulation of the apo enzyme at 300 K and another at 320 K. By comparing these two trajectories, we might be able to find the "soft spots" of the protein. As is well known, a higher temperature gives larger thermal fluctuations and therefore a better chance for the protein to overcome certain barriers and reach less-populated conformations; in this case, the opening of the active site.
The simulation results show that the flexibility of certain residues does increase significantly more than that of others. In particular, the F114 loop (111-117) and the C-terminal helix (150-158), which cover the active site, fluctuate more at 320 K than at 300 K. The motions of these two regions at 320 K open the active site and permit water molecules to diffuse into and out of the active site through two paths. Based on the identification of these paths, cytosine was pushed out of the active site along them, and the residues that respond to this release were monitored. In this manner, the motions of yCD regions required to release the ligand could be suggested.

Methods

MD simulation of the yCD apo form

The parameterization of the force field for the Zn complexes was described in detail in a previous article (Yao et al. 2005). The protonation states of the ionizable residues were set to their normal ionization states at pH 7. The protein was placed in a periodic box of approximate dimensions 90 × 88 × 70 Å, leading to solvation by ~17,000 TIP3P water molecules. Six Na+ ions were placed by the Leap program (Pearlman, Case et al. 1995) to neutralize the -6 charge of the model system. The parm94 version of the all-atom AMBER force field was used to model this system. MD simulations were carried out using SANDER in AMBER 7.0 (Pearlman, Case et al. 1995) at constant pressure and temperature. The time step was 2 fs, and the SHAKE algorithm was used to constrain all bonds involving hydrogen atoms. A nonbonded pair list cutoff of 8.0 Å was used, and the nonbonded pair list was updated every 25 steps. The pressure and the temperature (300 K and 320 K) of the system were controlled during the MD simulations by Berendsen's method. To include the contributions of long-range electrostatic interactions, the Particle-Mesh-Ewald method was used. A 2 ns simulation was performed at 300 K, with 500 ps of equilibration time.
The 320 K simulation protocol is slightly different from that at 300 K. After a 1 ns simulation, the run was continued for another 3 ns. Meanwhile, four other independent 1 ns simulations were started from the last snapshot of the first 1 ns simulation by reassigning the atom velocities according to the Maxwell-Boltzmann distribution at 320 K. So, effectively, 8 ns of simulation was done at 320 K. The first 500 ps of the regular simulation and the first 200 ps of the velocity-reassigned simulations were treated as equilibration periods and were therefore not included in the data analysis. The coordinates were saved every 2 ps and were analyzed with the Moil-view program and the PTRAJ module of AMBER 7.0. Principal component analysis (PCA) was performed with the g_covar and g_anaeig modules in GROMACS to filter out the fast motions (Lindahl et al. 2001).

MD simulation of the yCD cytosine complex

A 2-ns MD simulation was previously performed for the yCD cytosine complex at constant temperature (300 K) and constant volume (Yao et al. 2005). The simulation shows that the active sites in both protomers are buried and that the overall protein is quite rigid, similar to the 2 ns simulation of the apo form at 300 K. In order to explore how cytosine gets out of the active site, a restraint was introduced to push the cytosine out through two paths identified from the 320 K apo form MD simulation. The first path lies between the F114 loop (111-117) and the C-terminal helix (150-158), and the second path lies between loop 1 (53-61) and the C-terminal helix. A harmonic potential was added between the center of mass (COM) of the heavy atoms of the cytosine ring and the COM of the backbone heavy atoms of His50 in path 1 (Ser89 in path 2). The reasons for choosing His50 are that (1) it is from the core region (β2) and is quite rigid and (2) it lies along the direction of ligand release in path 1. Similar reasoning was used in choosing Ser89 in path 2.
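The center-of-mass restraint described above can be sketched in a few lines. This is a minimal illustration, not the setup used in the simulations: the function names, the toy coordinates and masses, and the energy convention E = k(d - d0)^2 (without a factor of 1/2) are all assumptions made for the example.

```python
from math import dist


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * xyz[i] for m, xyz in zip(masses, coords)) / total
        for i in range(3)
    )


def com_restraint(group_a, masses_a, group_b, masses_b, d0, k=20.0):
    """Harmonic restraint on the distance between two centers of mass.

    Assumed convention: E = k * (d - d0)**2, with k in kcal/(mol*A^2).
    Returns (d, energy, force_along_d). A positive force pushes the two
    groups apart, which happens when d0 exceeds the current distance d.
    """
    d = dist(center_of_mass(group_a, masses_a),
             center_of_mass(group_b, masses_b))
    energy = k * (d - d0) ** 2
    force = -2.0 * k * (d - d0)
    return d, energy, force


# Hypothetical toy coordinates (A) and masses (amu), for illustration only.
ring = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]   # stands in for the ligand ring
anchor = [(4.0, 0.0, 0.0)]                   # stands in for the anchor backbone
d, energy, force = com_restraint(ring, [12.0, 12.0], anchor, [14.0], d0=3.5)
print(d, energy, force)
```

Setting `d0` slightly larger than the current COM distance gives a positive force along the distance, which is the sense in which a growing equilibrium distance pushes the ligand away from the anchor residue.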
The equilibrium distance was increased gradually to give an effective force that pushes cytosine out. The starting structure was taken from one snapshot of the cytosine complex MD simulation. The cytosines were pushed out by 13 Å, along path 1 in protomer 2 and along path 2 in protomer 1, in a total simulation time of 15.6 ns. A total of 130 windows were used, with 0.1 Å per window and a force constant of 20 kcal/(mol·Å²). The force constant corresponds to a thermal fluctuation of the distance of ~0.25 Å. This pushing rate gives a continuous sampling of the cytosine release path. Test simulations showed that this force constant is large enough to push cytosine out efficiently, similar to the pushing simulations performed on another system we studied (DHNA ref), and consistent with what other researchers used in ligand unbinding studies (Gerini et al. 2003; Shen et al. 2003; Xu et al. 2003). The restraint force was calculated and averaged over each window. The first 20 ps of simulation were treated as the equilibration period and were therefore not included in the force calculation. But in the calculation of the RMSD between the instantaneous MD trajectory and the crystal structure, all the snapshots were used, since no apparent structural difference can be seen between the equilibration and data collection periods. The pushing velocity is another important parameter in the simulation. Considering that the experimental rate of product release is 13 s⁻¹, which is impossible to simulate with current computer resources, we use a pushing speed as slow as is practical to mimic the unbinding process. Our test simulations with a faster pushing velocity (0.1 Å per 14 ps), a typical speed used for protein-ligand unbinding in the literature, starting from different initial structures, showed no sign of distortion of the overall protein structure. Thus, from the structural point of view, the velocity we adopted is a reasonable compromise between the computational cost and the genuine properties of the system.
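The window bookkeeping implied by these numbers can be laid out as a short sketch. It rests on assumptions that the text does not state explicitly: that all windows have equal length, that the first 20 ps of each window is the discarded equilibration period, and that the first window already sits 0.1 Å beyond the starting equilibrium distance. The function and field names are hypothetical.

```python
def pushing_schedule(n_windows=130, step=0.1, total_ns=15.6, equil_ps=20.0):
    """Return one record per window of a stepwise pushing protocol.

    Assumptions: equal-length windows, and the first `equil_ps` of each
    window is skipped before the restraint force is averaged.
    """
    window_ps = total_ns * 1000.0 / n_windows
    schedule = []
    for i in range(n_windows):
        schedule.append({
            "window": i + 1,
            # distance (A) added to the starting equilibrium distance
            "offset": round((i + 1) * step, 3),
            "start_ps": i * window_ps,
            "collect_from_ps": i * window_ps + equil_ps,
            "end_ps": (i + 1) * window_ps,
        })
    return schedule


schedule = pushing_schedule()
window_ps = schedule[0]["end_ps"] - schedule[0]["start_ps"]
print(len(schedule), schedule[-1]["offset"], round(window_ps, 1))
```

Under these assumptions each window lasts 120 ps, so the pushing speed is 0.1 Å per 120 ps, between eight and nine times slower than the 0.1 Å per 14 ps used in the test simulations.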
Moreover, the aim of this study is to identify the potential ligand release paths and the associated protein dynamics rather than to evaluate the potential of mean force for ligand release. Since the two ligands were pushed out simultaneously along two different paths, this might cause some artificial synchronized effect. To make sure this is not the case, a simulation in which a single ligand was pushed out along path 1 was carried out. No changes in the dynamics of the active site of the adjacent protomer were found, probably because the two active sites are ~18 Å apart and the pushing simulation moves the ligand further away from the adjacent active site.

Results and Discussion

Apo form simulation at 300 K and 320 K

yCD is a homodimer with 158 residues in each subunit. The data analysis was performed on each subunit separately to improve the statistics, because the crystal structure shows that the two subunits are the same except for several N-terminal residues. Because previous MD simulations showed that the N-terminus is far more flexible than the rest of the protein (Yao et al. 2005) and is also far away from the active site, the first 14 residues were not included in the fitting for the calculation of the root mean square deviation (RMSD) from the crystal structure or the root mean square fluctuation (RMSF) from the average structure. The time evolutions of the CA-based RMSD of the MD snapshots from the crystal structure stabilize after 500 ps, indicating that the two systems are in equilibrium during the analyzed trajectory (data not shown). Figure 7-2 shows the RMSF plots at 300 K and 320 K for the two subunits, as well as the RMSF calculated from the crystal B-factors. As can be seen, the simulated RMSFs are consistent with the crystal B-factors. The overall protein is quite rigid, with average RMSFs of ~0.41 Å and ~0.44 Å for protomers 1 and 2, respectively, at 300 K. The increase of temperature from 300 K to 320 K causes a dramatic increase of
protein flexibility (Figure 7-2). The average RMSFs are ~0.62 Å for protomer 1 and ~0.67 Å for protomer 2 at 320 K, approximately 50% larger than those at 300 K. This flexibility increase comes mainly from the regions that are already relatively flexible at 300 K, among which the C-terminal helix shows the largest RMSF increase in both protomers (Figure 7-2). Since the C-terminal helix covers the active site, as shown in the crystal structure (Ireton et al. 2003), the dramatic increase of RMSF in this region might be important for opening up the active site and letting the reactant bind. To explore in more detail the changes in protein fluctuations and the potential changes in motional modes caused by the increase of temperature, principal component analysis (PCA) was used to analyze the MD trajectories. PCA decomposes the overall atomic fluctuations into a small set of modes that account for a large percentage of the overall RMSF², and a much larger set of modes that contribute a relatively small amount. In this way, we can focus on the small set of modes that represent slow, large-scale motions, because a large portion of the flexibility increase usually comes from these modes. By comparing these significant modes at the two temperatures, we can obtain instructive information about large-scale motions that might be responsible for ligand binding or product release. As shown in Table 7-1, the total variance of the CA atoms (residues 15-158) is ~25 Å² and ~29 Å² for protomers 1 and 2, respectively, at 300 K. The 320 K simulation gives much larger variances, ~65 Å² for protomer 1 and ~76 Å² for protomer 2. It is surprising that the total fluctuations are more than doubled even though the temperature increases by only ~7%. Assuming harmonic vibrational motion for the CA atoms, the increase of total variance should scale with the temperature increase. The large deviation from this expectation might come from two main reasons: 1.
The increase of temperature gives the MD simulation a better chance to overcome certain barriers and therefore to explore phase space more effectively. 2. The longer simulation time at 320 K (8 ns) can also cover more phase space than the 2 ns simulation at 300 K. Nevertheless, it is surprising that a small increase of temperature can trigger such a dramatic change in fluctuations. New, slow motional modes might be captured in the 320 K simulation. To see whether this is true, PCA was used to filter out the fast motions and keep the slow ones, in which we are primarily interested. The first 5 modes contribute 24% and 29% of the total motion at 300 K, compared with 48% and 47% at 320 K, for protomer 1 and protomer 2, respectively. The fluctuation difference in the first 5 modes between these two temperatures is ~25 Å² (~27 Å²) for protomer 1 (2), contributing 63% (73%) of the total fluctuation difference. Thus the major motional differences between 300 K and 320 K were captured in the first 5 modes. The RMSF plots of individual residues at these two temperatures show which regions have large flexibility changes. Figure 7-3 shows the CA RMSF of the first 5 modes; the main difference between 300 K and 320 K comes from the flexible regions. Although the trajectories of the two protomers show some differences in the flexibility increase in certain regions, such as residues 120-125 (part of helix 4) and 144-149 (part of helix 5), probably because of incomplete sampling of conformational space, both protomers show consistently large RMSF differences in the C-terminal helix (150-158) and the F114 loop (111-117). To get a better view of the flexibility changes, a schematic graph was generated by superimposing 50 MD CA snapshots projected onto the first 5 modes for the individual protomers at the two simulation temperatures (Figure 7-3a). Clearly, the C-terminal helix and the F114 loop show much larger fluctuations at 320 K than at 300 K.
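The variance bookkeeping behind these percentages can be made explicit with a short sketch. It uses only the rounded protomer 1 values quoted above (total CA variances of ~25 Å² and ~65 Å², and first-5-mode fractions of 24% and 48%); the function names are hypothetical, and the linear scaling of variance with temperature is the harmonic assumption stated in the text.

```python
def harmonic_expectation(variance_low, t_low, t_high):
    """Variance expected at t_high if fluctuations scale linearly with T."""
    return variance_low * t_high / t_low


def slow_mode_share(total_low, frac_low, total_high, frac_high):
    """Variance gained in the leading modes and its share of the total gain.

    total_* are total CA variances (A^2); frac_* are the fractions of each
    total that are captured by the leading modes.
    """
    slow_gain = frac_high * total_high - frac_low * total_low
    return slow_gain, slow_gain / (total_high - total_low)


# Rounded protomer 1 values quoted in the text.
expected = harmonic_expectation(25.0, 300.0, 320.0)
gain, share = slow_mode_share(25.0, 0.24, 65.0, 0.48)
print(round(expected, 1), round(gain, 1), round(share, 2))
```

For protomer 1, a purely harmonic response would raise the total variance from ~25 Å² to only about 27 Å², far below the ~65 Å² observed, and the first 5 modes account for about 25 Å², or roughly 63%, of the 40 Å² increase.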
Furthermore, the schematic graph reveals that the motion of these two regions can expose the active site to the solvent, so that water molecules can move freely in and out of the active site, unlike in the 300 K simulation, in which the active site is completely buried (Figure 7-4a). We identified a total of 27 water molecules that diffuse into the active site and 11 water molecules that diffuse out through the path between the C-terminal helix and the F114 loop in both protomers (Figure 7-4b). The 300 K MD simulation shows that the side-chains of F114, W152 and I156 cover the active site, while F114 and I156 block this path. The increased flexibility of the C-terminal helix and F114 loop at 320 K moves these two side-chains slightly away from their original positions; as a consequence, the active site is opened up.

Three residues, C91, F114 and I156, were used to define qualitatively the entrance of this water path (Figure 7-4b). We label the C91 CA-F114 CZ distance as d1, the C91 CA-I156 CD distance as d2, and the F114 CZ-I156 CD distance as d3. Plotting these distances parametrically in 3D for the 300 K and 320 K simulations reveals that at 300 K the distances are confined to the lower left corner, while at 320 K they spread from the lower left to the upper right (Figure 7-4c). In protomer 1, the distances d1, d2 and d3 are 5.25±0.72 Å, 8.04±1.04 Å and 4.65±0.64 Å at 300 K, compared with 7.73±1.90 Å, 1020:1223 A and 6.12±1.20 Å at 320 K. All the distances and their fluctuations are larger at 320 K, indicating that the mouth of the active site is opened up and that the breathing motion is larger at this temperature. The 320 K data, when projected onto the three 2D planes formed from these distances, show that the two-state behavior in Figure 7-4c arises mainly from the C91 CA-F114 CZ distance coordinate. Thus, there are barriers that are overcome during the simulation at the higher temperature.
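The distance statistics quoted above (mean ± standard deviation of d1, d2 and d3 over a trajectory) can be obtained along the following lines. This is only a sketch: the atom coordinates are synthetic placeholders, and the helper names `pair_distance` and `mean_and_std` are hypothetical rather than part of any analysis package used in this work.

```python
import numpy as np

def pair_distance(a, b):
    """Per-frame distance between two atoms; a and b have shape (n_frames, 3)."""
    return np.linalg.norm(np.asarray(a) - np.asarray(b), axis=1)

def mean_and_std(d):
    """Mean and standard deviation of a distance time series."""
    return float(np.mean(d)), float(np.std(d))

# Synthetic stand-ins for three atom trajectories (hypothetical data, in Å).
rng = np.random.default_rng(1)
n_frames = 1000
c91_ca = np.array([0.0, 0.0, 0.0]) + 0.3 * rng.standard_normal((n_frames, 3))
f114_cz = np.array([5.0, 0.0, 0.0]) + 0.3 * rng.standard_normal((n_frames, 3))
i156_cd = np.array([5.0, 4.0, 0.0]) + 0.3 * rng.standard_normal((n_frames, 3))

d1 = pair_distance(c91_ca, f114_cz)   # C91 CA - F114 CZ
d2 = pair_distance(c91_ca, i156_cd)   # C91 CA - I156 CD
d3 = pair_distance(f114_cz, i156_cd)  # F114 CZ - I156 CD
for name, d in (("d1", d1), ("d2", d2), ("d3", d3)):
    m, s = mean_and_std(d)
    print(f"{name} = {m:.2f} ± {s:.2f} Å")
```

Plotting the three series against one another, frame by frame, gives the parametric 3D view described for Figure 7-4c.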
It is interesting that one water molecule moves out of the active site in protomer 2 through the path between the C-terminal helix and the loop between α2 and β2 (residues 51-61; we call it loop 1) (Figure 7-4b). As shown in Figure 7-3, loop 1 is also relatively more flexible at 320 K. Thus, two water diffusion paths were identified. The path between the F114 loop and the C-terminal helix (path 1) seems more favorable than the path between the C-terminal helix and loop 1 (path 2), particularly considering that either the natural reactant cytosine or the product uracil is much larger than a water molecule. These two paths were investigated further by pushing the reactant cytosine out of the active site along each of them, as described below.

To summarize the apo simulation results, the 320 K MD simulation increases the flexibility of the protein, and most of the increase comes from the large-scale (slow) motion modes that have relatively large barriers to overcome. The use of a higher temperature and a longer simulation time gives us a greater chance of overcoming these barriers and gives a better description of the protein dynamics. Two pathways were identified for water diffusion into the active site, which are the potential paths for ligand binding; they arise from the increased motion of loop 1, the F114 loop and the C-terminal helix. Our NMR experiments using H/D exchange of backbone amides (Chapter 4) confirm the motions of these three regions in the apo form, but on a much slower time scale, primarily seconds, which is comparable to the product release rate. T1, T2 relaxation, NOE and T2 relaxation dispersion experiments show no apparent sign of large motions of these regions on the ps-ns and µs-ms time scales (unpublished data). So, it appears that the opening of the active site is quite slow, and probably limits the product release.
By pushing cytosine out of the active site, we can discover details of the motions of individual residues that are important for product release or ligand binding.

Pushing cytosine out of the active site

To determine whether the two water paths identified in the regular simulation are sensible paths for product release, the reactant cytosine was pushed 13 Å away from the active site along each of these two paths in a 15.6 ns MD simulation. The changes in structure and flexibility of the overall protein and of individual regions during the pushing simulation were investigated, and the forces used for pushing cytosine were calculated, to better understand these two paths and their influence on the overall protein and its various regions. Several residues were identified and studied that might limit the slow product release process.

Path 1

In path 1, one harmonic restraint was applied between the center of mass (COM) of the cytosine heavy atoms and the COM of the backbone heavy atoms of His 50. This residue belongs to the β2 region, which is quite rigid, and is aligned along the path, so it was selected as the anchor for the restraint force. Figure 7-5a shows the RMSD of all the CAs versus time. Over the first 7 ns, the RMSD is stable at ~0.9 Å and then increases slightly to ~1.1 Å. The whole protein is quite rigid, and the perturbation of the overall structure during ligand release is rather small. Figure 7-5b shows the average pushing force in each window. In the beginning, the force increases to ~7 kcal/(mol·Å) at the 3rd ns, then fluctuates and drops back to ~0 kcal/(mol·Å) at about the 7th ns. After that, the force fluctuates and rises to ~8 kcal/(mol·Å), then rapidly falls to ~0 kcal/(mol·Å) at the 12th ns. Thereafter cytosine is out of the active site and on the surface of the protein, so the force fluctuates around 0 kcal/(mol·Å).
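The pushing restraint and the per-window average force described above can be illustrated with a short sketch. It is not the protocol code used here: the force constant, the window targets and the coordinates are hypothetical, and the restraint energy is assumed to have the form E = ½k(d - d0)², where d is the COM-COM distance (some MD packages omit the factor ½, which changes the force by a factor of two).

```python
import numpy as np

def center_of_mass(positions, masses):
    """Mass-weighted center of a group; positions (n, 3), masses (n,)."""
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (masses[:, None] * positions).sum(axis=0) / masses.sum()

def restraint_force(com_ligand, com_anchor, d0, k):
    """Force along the COM-COM distance for E = 0.5 * k * (d - d0)**2.

    Returns -dE/dd; a positive value pushes the ligand away from the anchor.
    """
    d = np.linalg.norm(np.asarray(com_ligand) - np.asarray(com_anchor))
    return -k * (d - d0)

def window_average_forces(distances, targets, k):
    """Average restraint force in each window.

    distances: list of 1-D arrays of sampled COM-COM distances per window (Å)
    targets:   restraint target distance d0 for each window (Å)
    """
    return [float(np.mean(-k * (np.asarray(d) - d0)))
            for d, d0 in zip(distances, targets)]

# Hypothetical example: the ligand lags 0.5 Å behind a 6.0 Å target distance.
k = 10.0  # kcal/(mol·Å²), assumed value, not taken from the text
ligand = center_of_mass([[5.0, 0.0, 0.0], [6.0, 0.0, 0.0]], [12.0, 12.0])
anchor = center_of_mass([[0.0, 0.0, 0.0]], [14.0])
print(restraint_force(ligand, anchor, d0=6.0, k=k))
print(window_average_forces([[5.0, 5.5]], [6.0], k))
```

A large positive average force in a window means the ligand resists being moved to that window's target distance, which is how the peaks in a force profile such as Figure 7-5b can be read.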
From the RMSD plot and the force plot (Figure 7-5a and b), it appears that there are two periods in the product release: the first from the beginning to ~7 ns, and the second from ~7 ns to the end. In the first period the overall structural changes are quite small. The main barrier comes from the loss of the direct hydrogen bonding interactions between cytosine and the residues in the active site, which is confirmed by a hydrogen bond analysis. Figure 7-5c shows the lifetimes of the hydrogen bonds during the pushing simulation. There are 5 hydrogen bonds between cytosine and the protein, involving N51, G63, E64 and D155, when cytosine is well bound in the active site. All these hydrogen bonds are lost during the first period of the pushing simulation. The hydrogen bond with G63 is lost first, at ~3 ns, primarily because G63 sits at the bottom of the active site and the hydrogen bond donor, the backbone amide, is quite rigid. Then E64 and N51 lose their three hydrogen bond interactions with cytosine at ~5 ns. Lastly, the hydrogen bond between D155 and cytosine is broken at ~7 ns. Apparently, the loss of the five hydrogen bonds requires the external force to do work to compensate for the energy. During the second period, since all the hydrogen bonds with cytosine have been broken, the force needed to push the ligand out probably comes mainly from steric hindrance and conformational rearrangements of the residues along the path.

To understand how the cytosine-pushing procedure affects the dynamics of individual regions or residues, CA RMSFs were plotted in Figure 7-6. Figure 7-6a shows the CA RMSFs of the regular MD simulation, which are quite similar to those of the regular simulation of the apo enzyme at 300 K, with a rigid backbone. During the pushing of cytosine, the overall protein becomes more flexible, as shown in Figures 7-6b and 7-6c, and the RMSF increases when the first and second stages of the pushing are compared with the regular simulation.
The total fluctuations of the CA atoms are 35.8 Å², 50.6 Å² and 59.3 Å² for the regular, stage 1 and stage 2 simulations, respectively. Since the active site is buried during the regular simulation, the protein has to move certain regions or residues to open up the active site and let cytosine out, which causes the fluctuation increases. Figures 7-6b and 7-6c show quite dramatic RMSF increases in the C-terminal helix and F114 loop in the pushing simulation, consistent with the 320 K apo form simulation results (Figure 7-2). The increases in the RMSFs of these two regions suggest that:

1. The motions of these two parts are important for ligand release.
2. The protein motion required for ligand release is local, and the structural perturbation is small.

It appears that the motion of the C-terminal helix is larger in stage 2 than in stage 1. The MD trajectory shows that at the end of stage 2 cytosine moves out of the active site completely. The tip of the C-terminus moves close to the active site, partially occupying the empty space and consequently blocking the active site, which effectively imparts a relatively large motion to the C-terminal helix.

The unbinding process of cytosine can be better illustrated by the schematic graph of the CA fluctuations. As in the analysis of the apo form simulations, PCA was used to filter out the high-frequency motions, and only the first 5 modes were used in the schematic plot, which is composed of the superposition of 50 MD snapshots. As shown in Figure 7-7a, the whole protein, including the F114 loop and C-terminal helix, is rather rigid in the regular 2 ns MD simulation of the cytosine complex. Figure 7-7b shows the increase of the fluctuation in these two regions in the first stage of pushing, while Figure 7-7c shows a similar increase in the second stage, but with some of the C-terminal helix snapshots closer to the active site.
A movie of the trajectory reveals that, in stage 2, the C-terminal helix and F114 loop undergo a concerted flapping motion, which effectively opens up the active site and then closes it after the release of cytosine (see discussion below). Besides the changes in these two regions, a small helix (residues ~75-80) also shows perturbed fluctuations during the pushing simulation (Figure 7-6b and c). Further inspection shows that this region is in close contact with the C-terminal helix of the adjacent protomer, which suggests that this fluctuation increase might be caused by the motion of that C-terminal helix, where cytosine was simultaneously pushed out along path 2.

The MD trajectory shows that the cytosine release path is slightly different from the water diffusion path in the apo form simulation. Unlike water, which diffuses through the triangular mouth defined by C91, F114 and I156, cytosine follows a path that drifts slightly away and lies between F114 and I156 (Figure 7-8). The mass-weighted RMSDs of F114 and I156 are plotted in Figure 7-9. The RMSD of F114 changes dramatically during the pushing simulation. Over the first 6 ns, it fluctuates around ~1.5-2.0 Å, then increases to ~6.0-7.0 Å and stabilizes at ~9-13 ns, after which it drops to ~2.5 Å at ~14 ns. Compared with F114, the motion of I156 is smaller (Figure 7-9), with a pattern similar to that of the whole C-terminal helix, and at the end it also goes back to ~1 Å, indicating that this residue returns to its original conformation. Apparently, F114 undergoes a large flapping motion during stage 2, which opens up the active site and assists the release of cytosine, while the C-terminal helix moves slightly away to make the entrance larger; both then come back to cover the active site after the release of cytosine. It appears that the C-terminal helix moves as a whole block, while F114 has additional motion on top of the loop movement, because there is no secondary structure restraint in the loop region.
As a result, the motion scale of F114 is much larger than that of I156.

Path 2

Cytosine in protomer 1 was pushed out along path 2, which is defined as lying between the C-terminal helix (150-158) and loop 1 (53-61) (Figure 7-8). According to the 320 K simulation of the apo enzyme, water diffuses through this path less frequently than through path 1. Figure 7-10a shows the RMSD of all CA atoms after superposition on the crystal structure, which increases from ~0.8 Å to ~1.4 Å. This change is larger than that in path 1, where the RMSD increases from 0.9 Å to 1.1 Å, suggesting that the overall protein has to move more when cytosine is released through this path. This could make the release harder. The total CA fluctuation is 92.5 Å², much larger than that in the regular simulation (25 Å² (29 Å²) for protomer 1 (2)), with the difference (for protomer 2) presented in Figure 7-10c. Many regions in the protein have larger fluctuations, especially the flexible regions identified in the regular 320 K apo form simulation (Figure 7-2). Therefore, it seems that the protein has to rearrange many regions in a concerted way when cytosine is released through this path, unlike in path 1, where only the F114 loop and C-terminal helix regions need to move away and the perturbation is rather local and small.

An analysis of the MD trajectory shows that along this path cytosine first pushes aside V31, N51 and D155 and breaks the hydrogen bonds with the latter two residues, and then moves away from D151 (salt bridged with R148). After that, it rearranges the side-chain of T54 to gain enough space to exit. It appears rather hard for a ligand to move through this path, which is more likely suited to water than to cytosine. The force used to push cytosine out is presented in Figure 7-10b. The average force is ~5.6 kcal/(mol·Å), about two times larger than that in path 1 (~3.2 kcal/(mol·Å)), implying that more work has to be done in path 2.
The root mean square force fluctuation is also larger in path 2 (4.5 kcal/(mol·Å)) than in path 1 (2.8 kcal/(mol·Å)), so the energy surface seems rougher along path 2. Thus, from both energetic and structural points of view, path 2 is less favorable than path 1 in the current simulation, although path 2 cannot be excluded on the basis of the simulations alone. To release the ligand along path 2, many regions have to move, including the C-terminal helix and F114 loop, which are the only parts required to move significantly when cytosine is released along path 1. In fact, the F114 loop has the largest fluctuation changes in the path 2 simulation, because cytosine pushes the C-terminal helix close to this loop and moves it away.

It appears that in both pathways the C-terminal helix is very important, acting like a lid covering the active site. During the pushing of cytosine along both paths, the C-terminal helix moves as a whole block. The X-ray structure shows that this helix is negatively charged (-4, including D151, E154, D155 and E158), and that there are four salt bridges with residues from other regions, D151-R148, E154-R53, E154-R73* (from the adjacent protomer) and D155-R53, which restrain this helix. During the pushing simulation, all these salt bridges are well maintained. It is reasonable to predict that cytosine pushing would be easier if some of these salt bridges were weakened or eliminated. On the other hand, mutation of F114 and/or I156 to smaller residues, without changing the dynamic properties of the C-terminal helix, may also improve the product release rate. Experiments are underway with mutants of some of these residues to explore these suggestions.

Conclusions

In this study, we used molecular dynamics to explore possible ligand release paths from yCD. The 300 K apo enzyme simulation shows a closed active site, consistent with the crystal structure.
Overall the protein is quite rigid, a feature that has been confirmed by NMR backbone relaxation experiments (unpublished results). An 8 ns simulation at 320 K was performed, which shows a strikingly large increase of flexibility in some regions of the protein. The increases of flexibility in the C-terminal helix and F114 loop regions, which open up the closed active site, are the most significant. Water molecules diffuse into the active site through two paths. One path lies between the F114 loop and the C-terminal helix, and the other lies between loop 1 and the C-terminal helix. The former path appears to be used more frequently, with the mouth of the entrance defined by a triangular area formed by the residues C91, F114 and I156. This area and its fluctuations are much larger at 320 K than at 300 K (Figure 7-4c), which allows water molecules to move frequently in and out of the active site.

Cytosine was pushed out of the active site along these two paths by 13 Å in a 15.6 ns simulation. The results show that in path 1 the required motion of the protein is quite local, involving only the C-terminal helix and F114 loop. Two residues, F114 and I156, are identified that have to be moved away in order to let cytosine out. In path 2, by contrast, the protein has to rearrange itself quite globally, and the changes are also much larger than in the path 1 simulation. The average force along the path and its fluctuation are larger in path 2 than in path 1, suggesting that the barrier is higher and the energy surface along the release path is rougher in path 2. Several residues have to be moved away: first V31, N51 and D155, then D151, and finally T54. This path appears rather difficult; it seems to be used much less frequently even by water molecules, and is unlikely to serve cytosine because of its much larger size compared with water. However, path 2 cannot be excluded on the basis of the simulations alone.
Nevertheless, in both paths the C-terminal helix is critical for ligand release. Note that the C-terminal helix is well restrained by four salt bridges and several hydrogen bonds with residues in other regions. By weakening these salt bridge interactions, we could, in principle, increase the flexibility of the C-terminal helix, which could make cytosine release faster. Alternatively, making F114 or I156 (residues along the path) smaller could also assist the release of the ligand. As discussed above, the active site opening limits the ligand escape process. From the experimental point of view, by modifying the protein as described above, we should be able to improve the product release rate and therefore enhance the protein’s activity. Experiments are under way to test these hypotheses, and as more data become available, we will be able to correlate our experimental results with our simulation data.

References

Gerini, M. F., D. Roccatano, et al. (2003). "Molecular dynamics simulations of lignin peroxidase in solution." Biophysical Journal 84(6): 3883-3893.
Ireton, G. C., M. E. Black, et al. (2003). "The 1.14 angstrom crystal structure of yeast cytosine deaminase: Evolution of nucleotide salvage enzymes and implications for genetic chemotherapy." Structure 11(8): 961-972.
Isralewitz, B., M. Gao, et al. (2001). "Steered molecular dynamics and mechanical functions of proteins." Current Opinion in Structural Biology 11(2): 224-230.
Ko, T. P., J. J. Lin, et al. (2003). "Crystal structure of yeast cytosine deaminase - Insights into enzyme mechanism and evolution." Journal of Biological Chemistry 278(21): 19111-19117.
Lindahl, E., B. Hess, et al. (2001). "GROMACS 3.0: a package for molecular simulation and trajectory analysis." Journal of Molecular Modeling 7(8): 306-317.
Morris, S. M. (1993). "The Genetic Toxicology of 5-Fluoropyrimidines and 5-Chlorouracil." Mutation Research 297(1): 39-51.
Nishiyama, T., Y. Kawamura, et al. (1985). "Antineoplastic Effects in Rats of 5-Fluorocytosine in Combination with Cytosine Deaminase Capsules." Cancer Research 45(4): 1753-1761.
Shen, L. L., J. H. Shen, et al. (2003). "Steered molecular dynamics simulation on the binding of NNRTI to HIV-1 RT." Biophysical Journal 84(6): 3547-3563.
Sklenak, S., L. S. Yao, et al. (2004). "Catalytic mechanism of yeast cytosine deaminase: An ONIOM computational study." Journal of the American Chemical Society 126(45): 14879-14889.
Xu, Y. C., J. H. Shen, et al. (2003). "How does Huperzine A enter and leave the binding gorge of acetylcholinesterase? Steered molecular dynamics simulations." Journal of the American Chemical Society 125(37): 11340-11349.
Yao, L. S., Y. Li, et al. (2005). "Product release is rate-limiting in the activation of the prodrug 5-fluorocytosine by yeast cytosine deaminase." Biochemistry 44(15): 5940-5947.
Yao, L. S., S. Sklenak, et al. (2005). "A molecular dynamics exploration of the catalytic mechanism of yeast cytosine deaminase." Journal of Physical Chemistry B 109(15): 7500-7510.

Appendices

Table 7-1 Total CA fluctuations and contributions from the first 5 modes at 300 K and 320 K.

                        300 K               320 K
                        prot 1   prot 2     prot 1   prot 2
CA fluctuation (Å²)     25.4     29.4       65.1     76.1
First 5 modes           24%      29%        48%      47%

Figure 7-1 (a) 5-Fluorouracil (5FU). (b) 5-Fluorocytosine (5FC).

Figure 7-2 CA RMSFs of (a) protomer 1 and (b) protomer 2 at 300 K (black) and 320 K (green). The crystal B-factors were converted to RMSFs (red). The difference (blue) is the RMSF difference between the 300 K and 320 K data.

Figure 7-3 CA RMSF plots of the first 5 PCA modes at 300 K and 320 K and their difference: (a) protomer 1; (b) protomer 2.

Figure 7-4 (a) Schematic graph of 50 trajectory snapshots of CA atoms projected onto the first 5 PCA motion modes at 300 K and 320 K. The F114 loop and C-terminal helix are labeled to give a better view. Water molecules can diffuse into the active site at 320 K through the path between the F114 loop and the C-terminal helix, but not at 300 K. (b) One snapshot of the apo MD simulation at 320 K. Water molecules diffuse into the active site through the triangular mouth defined by C91, F114 and I156. (c) 3D plots of distances d1 (C91 CA-F114 CZ), d2 (C91 CA-I156 CD) and d3 (F114 CZ-I156 CD) at 300 K (red) and 320 K (black).

Figure 7-5 (a) Total CA RMSD versus time, obtained by comparing the MD snapshots with the crystal structure. (b) The average restraint force between cytosine and the protein in each window. (c) Hydrogen bond lifetimes between cytosine and the protein during the cytosine pushing along path 1.

Figure 7-6 (a) CA RMSFs in the 2 ns regular MD simulation. (b) RMSF difference between the first stage of cytosine pushing along path 1 and the regular simulation. (c) RMSF difference between the second stage and the regular simulation.

Figure 7-7 Schematic graph of 50 trajectory snapshots of CA atoms projected onto the first 5 PCA motion modes: (a) regular simulation; (b) pushing simulation, stage 1; (c) pushing simulation, stage 2.

Figure 7-8 The cytosine release paths from the pushing simulation.

Figure 7-9 Mass-weighted RMSD plot of F114 and I156 along the pushing path of cytosine (path 1).

Figure 7-10 (a) RMSD of all CA atoms compared with the crystal structure. (b) The average force calculated for the cytosine pushing along path 2. (c) The CA fluctuation differences between the regular simulation and the cytosine pushing.
Chapter 8

Reaction Mechanism of Guanine Deaminase: An ONIOM and Molecular Dynamics Simulation Study

Introduction

Guanine deaminase (GD), a Zn metalloenzyme, catalyzes the hydrolytic deamination of guanine to xanthine (Figure 8-1) and therefore plays an important role in nucleotide metabolism. The crystal structure of GD from Bacillus subtilis (bGD) complexed with imidazole was solved by Liaw and coworkers at 1.17 Å resolution (Liaw et al. 2004). bGD forms a homodimer with 156 residues in each monomer. The overall structure of each monomer consists of a central five-stranded β-sheet sandwiched by six helices (Liaw et al. 2004). The homodimeric GD contains two active sites, with one Zn atom in each. The Zn atom is coordinated by His53, Cys83, Cys86 and one water molecule. It is interesting that the active site is buried by a swapped C-terminal tail from the other subunit. It was proposed that this tail not only seals the active site but also serves to recognize the specific substrate (Liaw et al. 2004). The bGD-catalyzed reaction is believed to proceed through a tetrahedral transition state, with Glu55 (Figure 8-2) acting as the proton shuttle, but the detailed mechanism is still unknown.

In this work, we performed a series of quantum calculations and molecular dynamics simulations to study the reaction mechanism. It is very challenging to carry out an accurate and reliable quantum mechanical study of an enzyme-catalyzed reaction because of the size of the system. Here we adopted the two-layered ONIOM method (Humbel et al. 1996; Svensson et al. 1996; Dapprich et al. 1999; Vreven and Morokuma 2000) implemented in Gaussian 03 (Frisch) for the quantum calculations. The ONIOM method is a hybrid computational method that allows different levels of theory to be applied to different parts of a molecular system.
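For reference, the standard two-layer ONIOM extrapolation combines three separate energy calculations. The notation below is the conventional one and is not taken from this text: "model" denotes the inner layer capped with link atoms, and "real" denotes the full system.

```latex
E^{\mathrm{ONIOM}} = E_{\mathrm{high}}(\mathrm{model}) + E_{\mathrm{low}}(\mathrm{real}) - E_{\mathrm{low}}(\mathrm{model})
```

The subtraction removes the low-level description of the inner layer, so that this region is counted once at the high level, while the environment and its coupling to the inner layer enter only at the low level. In this chapter the high level is B3LYP/6-31G* and the low level is AM1, as stated in the Methods.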
In the two-layered ONIOM method, the molecular system is divided into an inner layer (the core region, usually treated at a high level of theory) and an outer layer (the surrounding environment, usually treated at a low level of theory). The ONIOM method allows one to perform a high-level calculation on just a small but critical part of the molecular system while incorporating the effects of the environment at a low level of theory, which makes the study of large molecular systems affordable with good accuracy. Our ONIOM study of the reaction mechanism catalyzed by yeast cytosine deaminase (yCD) (Sklenak et al. 2004) correlates quite well with experimental results (Yao et al. 2005) (Yan's lab data). One drawback of ONIOM calculations, however, is that only a static structure is used, because of the high cost in computer time. On the other hand, molecular dynamics simulation based on a classical force field can efficiently provide important dynamic properties of the protein, which can be used to complement the ONIOM calculation (Yao et al. 2005). In addition, MD simulation can be performed for the whole protein solvated with explicit water, while ONIOM calculation is typically used only for a selected part of the system (generally the active site and the immediately surrounding residues). MD simulation can be used to evaluate the environmental effect of the rest of the system and to provide important information about the size of the selected part that should be used in ONIOM.

Methods

ONIOM calculation

The computational models studied in this work were based on the 1.17 Å imidazole-bound GD crystal structure (Liaw et al. 2004). The active site model was generated from subunit one of the crystal structure. To place guanine in the active site, the imidazole ring of guanine was superimposed on the bound imidazole ligand. The missing hydrogen atoms were built using InsightII (InsightII). The protonation state of guanine was determined from the pKa values of its individual atoms.
At neutral pH in aqueous solution, N1 and N9 are protonated while N7 is deprotonated, based on the pKa values of these three atoms (N1: 12.3-12.4, N9: 9.2-9.6, N7: 3.2-3.3) (Jang et al. 2003). This is consistent with the hydrogen bond analysis of the active site (see Results). N1 donates a hydrogen bond to the carboxyl group of Glu55, N9 donates a hydrogen bond to the carboxyl group of Asp114, and N7 accepts one hydrogen bond from the phenol group of Tyr156. O6 accepts two hydrogen bonds, from the backbone amide of Ala54 and the side-chain amide of Asn42, so O6 should be in the keto form rather than the enol form (Figure 8-2).

The enzyme was modeled as the 26 residues within 6 Å of the bound guanine: Gly24, Pro25, Phe26, Gly27, Ala28, Glu41, Asn42, Asn43, Val44, Ala52, His53, Ala54, Glu55, Val56, Thr78, Cys80, Glu81, Pro82, Cys83, Cys86, Ala107, Phe112, Asp113, Asp114, Trp*92 and Tyr*156 (* denotes residues from the adjacent subunit, here and below). MD simulation shows that these residues contribute most of the interaction energy between the ligand and its surroundings (see Results). The systems were divided into two layers. The inner layer, which was treated at a high level of theory, was composed of the substrate, intermediate or products, Zn, an imidazole ring (model for His53), CH3CH2COO⁻ (model for Glu55), SCH3 (model for Cys83 and Cys86), CH3COO⁻ (model for Asp114), and the water coordinated to Zn. The rest of the system was treated at a lower level.

Quantum chemical studies on model compounds show that Zn-ligand distances are sensitive to the protonation states of the ligands (Dudev and Lim 2001; Dudev and Lim 2002). The Zn complex in bGD appears to be essentially the same as that in the yCD active site, with very similar distances (Ireton et al. 2003). The four distances in bGD (yCD) are Zn-O 2.03 (2.07) Å, Zn-ND 2.06 (1.99) Å, Zn-SG 2.34 (2.30) Å and Zn-SG 2.28 (2.28) Å.
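As a quick check on the guanine protonation states assigned earlier in this section, the Henderson-Hasselbalch relation gives the fraction of each site that is protonated at pH 7. The sketch below is only illustrative: it treats each site as an independent acid in water, uses the lower bound of each quoted pKa range, and the function name `fraction_protonated` is hypothetical.

```python
def fraction_protonated(pka, ph=7.0):
    """Henderson-Hasselbalch fraction of the protonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Aqueous pKa values quoted in the text (lower bound of each range).
pka_values = {"N1": 12.3, "N9": 9.2, "N7": 3.2}
fractions = {site: fraction_protonated(pka) for site, pka in pka_values.items()}
for site, f in fractions.items():
    print(f"{site}: {f:.4f} protonated at pH 7")
```

Under these assumptions N1 and N9 are almost fully protonated and N7 is almost fully deprotonated, in line with the assignment used for the model; the enzyme environment could of course shift these values.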
Therefore, the protonation states of the Zn-bound ligands were set to be the same as in the yCD complex, with SG of Cys83 and Cys86 and ND of His53 deprotonated. The active site structure of the imidazole-bound form was reproduced by the ONIOM method (with imidazole treated at the high level) and shows good agreement with the crystal structure, which confirms the protonation state assignment. B3LYP (Lee et al. 1988; Becke 1993) with the 6-31G* basis set was used as the high level, while AM1 was used as the low level (Dewar et al. 1985). Energy barriers were estimated by scanning the reaction coordinates; the barriers are therefore upper bounds of the real reaction barriers. The geometry of the inner layer was optimized for all the species, while the atoms of the outer layer were fixed at their crystal structure positions (Sklenak et al. 2004). MD simulation shows that the protein, and especially the active site, is quite rigid in both the reactant (guanine) and product (xanthine) bound forms, which suggests that freezing the outer layer is a reasonable approximation (see Results).

MD simulation of apo and guanine bound form

In the MD simulation, the RESP charges of Zn, the Zn-bound water and the side-chains of His53, Cys83 and Cys86, as well as the bond, angle and dihedral force constants of the Zn complex, were taken from the yCD apo form simulation, given that the Zn complex has the same architecture in the two crystal structures. Atom-centered partial charges of guanine were derived using the AMBER antechamber program (RESP methodology) (Pearlman et al. 1995) based on an HF/6-31G* quantum calculation. Starting coordinates for the protein atoms were taken from the crystal structure. Imidazole was removed from the active site of subunit one, while it was mutated to guanine in the active site of subunit two. So, subunit one is the apo form while subunit two is the guanine complex. All the crystal water molecules were retained. The protonation states of the ionizable residues were set to their normal values at pH 7.
The protein was surrounded by a periodic box of 12.5 Å with ~18,000 TIP3P water molecules. Na+ ions were placed by the LEaP program (Pearlman et al. 1995) to neutralize the -12 charge of the model system. The parm94 version of the all-atom AMBER force field (Cornell et al. 1996) was used for all the simulations. MD simulations were carried out using the SANDER module in AMBER 7.0 (Pearlman et al. 1995). The SHAKE algorithm was used to constrain the lengths of all bonds involving hydrogen atoms, permitting a 2 fs time step (Ryckaert et al. 1977). A nonbonded pair list cutoff of 8.0 Å was used, and the nonbonded pair list was updated every 25 steps. The volume and the temperature (300 K) of the system were controlled during the MD simulations by Berendsen's method (Berendsen et al. 1984). The Particle-Mesh-Ewald method was used to include the contributions of long-range electrostatic interactions (Essmann et al. 1995). The simulation time was 2 ns, with a 500 ps equilibration period. Coordinates were saved every 2 ps. All of the MD results were analyzed using the PTRAJ module of AMBER 7.0. In these analyses, a hydrogen bond was assigned when the distance between the two heavy atoms (O or N) was less than 3.5 Å and the angle (heavy atom-hydrogen-heavy atom) was greater than 120°.

MD simulation of xanthine bound form

To understand whether water assists the Zn-O2 bond cleavage during xanthine release, an MD simulation of the xanthine bound form was performed. One xanthine was docked into each active site of the crystal structure based on the ONIOM minimized active site structure. In subunit one, the distance between Zn and O2 was restrained at 3.10 Å with a force constant of 100 kcal/(mol·Å²), mimicking the transition state of Zn-O2 bond cleavage. In subunit two, the distance was restrained at 3.80 Å with the same force constant, mimicking the xanthine bound form with the Zn-O2 bond broken. Both distances are from the corresponding ONIOM calculations.
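To give a feel for how stiff the Zn-O2 distance restraints described above are, the following minimal sketch evaluates a harmonic penalty for small excursions from the two target distances. It assumes the penalty has the form E = k(r - r0)² with no factor of 1/2; the text does not state which convention was used, so the absolute numbers are illustrative only. The function name is hypothetical.

```python
def harmonic_restraint_energy(r, r0, k=100.0):
    """Penalty (kcal/mol) for a distance r (Å) restrained to the target r0 (Å).

    Assumption: E = k * (r - r0)**2 without a factor of 1/2.
    k is the force constant in kcal/(mol·Å^2); 100 is the value quoted in the text.
    """
    return k * (r - r0) ** 2


# Target Zn-O2 distances quoted in the text for the two subunits.
targets = {
    "subunit one (transition-state mimic)": 3.10,
    "subunit two (Zn-O2 bond broken)": 3.80,
}

for label, r0 in targets.items():
    # Energy cost of a 0.1 Å excursion away from the restrained distance.
    penalty = harmonic_restraint_energy(r0 + 0.1, r0)
    print(f"{label}: {penalty:.2f} kcal/mol for a 0.1 Å deviation")
```

Under this assumed convention, a 0.1 Å deviation costs about 1 kcal/mol, so the restrained distance stays close to its target while the rest of the active site remains free to move.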
The same force constants for the Zn complex were used as for the apo and guanine bound forms. RESP charges were fitted with the same program as for guanine, but based on B3LYP/6-31+G* calculations of the ONIOM optimized active site structures, which were trimmed to include only His53, Glu55, Cys83, Cys86, Asp114, Zn and xanthine (Yao et al. 2005). Only the charges of the side chains of His53, Cys83 and Cys86, as well as the charges of Zn and xanthine, were modified. The product ammonia was replaced by one water molecule in each active site, on the assumption that ammonia is released more easily than xanthine. A 2-ns MD simulation was performed using the same protocol as described above, and the data were analyzed with the PTRAJ program.

Results

Reproduction of the crystal structure active site

To validate the ONIOM method used in this work, the crystal structure of the active site was reproduced. The calculated distances and angles for the Zn complex are compared with those in the crystal structure in Table 8-1. They are in good agreement, indicating that the ONIOM method can reproduce the architecture of the Zn site; it is therefore reasonable to assume that it also gives fairly good energies for the various species along the reaction path. To elucidate the reaction mechanism catalyzed by bGD, a series of ONIOM calculations were performed for the species described below.

Substrate Binding

A schematic representation of the active site in the substrate bound form is given in Figure 8-2. The bound substrate is stabilized by a hydrogen bond network and a π-stacking interaction with the imidazole ring of His53. The hydrogen bonds between the substrate and the surrounding residues are listed in Table 8-2, with the distances between the two heavy atoms. The amino group of guanine forms hydrogen bonds with the carboxyl group of Glu55 and the carbonyl group of Glu81.
These two hydrogen bonds can assist the positioning of C2 for nucleophilic attack by the Zn-bound water in the following steps. The carboxyl group of Glu55 also forms one hydrogen bond with N1-H and another with the Zn-bound water molecule, while the latter also forms one hydrogen bond with N3 of guanine. In the ONIOM optimized structure, guanine forms an additional four hydrogen bonds with Asn42, Ala54, Asp114 and Tyr156* (Figure 8-2, Table 8-2). To form an active enzyme-substrate complex, a proton must be transferred from the Zn-bound water to N3 of guanine. The hydrogen bond between the OH of water and N3 would seem to favor direct proton transfer. However, the angle between the OH vector and the purine plane of guanine is ~90°, which makes direct proton transfer extremely difficult. Our ONIOM calculation shows that the barrier for direct transfer is ~42 kcal/mol. Thus, this transfer is unlikely to occur. Alternatively, the proton transfer may be assisted by the carboxyl group of Asp114. However, the distance between the O of water and OD1 of Asp114 is quite long, ~4.2 Å when guanine is bound, which makes proton transfer from water to Asp114 quite difficult. Nevertheless, with the substrate bound, the state with protonated Asp114 and Zn-bound hydroxide is only 0.7 kcal/mol higher in energy than the state with deprotonated Asp114 and Zn-bound water (complex 1 → complex 2, Figure 8-3). So, from a thermodynamic point of view, the Zn-bound water can transfer its proton to OD1 of Asp114. This process is discussed later.

The protonation of guanine

After the protonation, OD1 of Asp114 forms a new hydrogen bond with N3 of guanine (complex 2 in Figure 8-3), with a heavy atom distance of 2.81 Å, while the other hydrogen bonds between the ligand and the enzyme are maintained. This new hydrogen bond stabilizes the protonated Asp114. The distance between C2 and the O of the Zn-bound hydroxide is shortened to 2.46 Å, compared with 2.78 Å before the proton transfer from the Zn-bound water to Asp114.
Apparently this transfer favors the nucleophilic attack by hydroxide. Asp114 then transfers its proton to guanine (complex 3 in Figure 8-3) with a barrier of 4.5 kcal/mol at an OD1-H distance of 1.25 Å. The barrier was calculated as follows. The distance between OD1 and the proton was scanned in steps of 0.1 Å so that the energy maximum could be located approximately. A step of 0.05 Å was then used for the scan around the maximum. Complex 3 is 0.7 kcal/mol more stable than complex 2. The proton transfer from Asp114 to guanine makes the distance between C2 and the O of the Zn-bound hydroxide even shorter (~2.29 Å) because of the positive charge on the ligand.

The formation of tetrahedral intermediates

After the proton transfer from the Zn-bound water to N3 through Asp114, the hydroxide is well positioned for a nucleophilic attack. The shortened distance between C2 and the Zn-bound hydroxide that results from the proton transfer makes the nucleophilic attack facile. The barrier for this process is 1.5 kcal/mol, at a distance of 1.85 Å between C2 and the O of the Zn-bound hydroxide. The barrier was calculated by the same method as described above. Such a low barrier suggests that this process can occur fairly easily. The new species (complex 4 in Figure 8-3) is 0.1 kcal/mol more stable than complex 3. This process also shortens the distance between the O of the Zn-bound hydroxide and OE2 of Glu55 from 2.88 Å to 2.63 Å, and a hydrogen bond forms between them. This shortening assists the next step of the reaction, the proton transfer from the Zn-bound hydroxide to the carboxyl group of Glu55. The proton transfer from the Zn-bound hydroxyl to Glu55 is composed of two sub-steps: (1) rotation of the amine group of the tetrahedral intermediate to point N2-H1 away, so that OE2 of Glu55 can abstract the proton from the hydroxyl; (2) proton transfer to OE2 of Glu55. The new state (complex 5 in Figure 8-3) is 3.3 kcal/mol higher in energy than complex 4.
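The two-stage scan used for the barriers above (0.1 Å steps to bracket the maximum, then 0.05 Å steps around it) can be sketched as follows. This is a minimal illustration, not the ONIOM workflow itself: `toy_energy` is a hypothetical one-dimensional profile standing in for a constrained geometry optimization at each scanned distance, and its peak position and height are chosen only to resemble the 1.25 Å / 4.5 kcal/mol values quoted in the text.

```python
import math


def scan(energy, start, stop, step):
    """Evaluate energy(d) on a regular grid from start to stop (inclusive)."""
    n_steps = int(round((stop - start) / step))
    points = []
    for i in range(n_steps + 1):
        d = start + i * step
        points.append((round(d, 4), energy(d)))
    return points


def locate_maximum(energy, start, stop, coarse=0.1, fine=0.05):
    """Coarse scan to bracket the maximum, then a finer scan around it."""
    coarse_points = scan(energy, start, stop, coarse)
    d_coarse, _ = max(coarse_points, key=lambda p: p[1])
    # Rescan one coarse step on either side of the coarse maximum.
    lo = max(start, d_coarse - coarse)
    hi = min(stop, d_coarse + coarse)
    fine_points = scan(energy, lo, hi, fine)
    return max(fine_points, key=lambda p: p[1])


def toy_energy(d):
    """Hypothetical profile (kcal/mol, relative to the reactant) peaking at 1.25 Å."""
    return 4.5 * math.exp(-((d - 1.25) / 0.2) ** 2)


d_max, e_max = locate_maximum(toy_energy, 1.0, 1.8)
print(f"maximum near {d_max:.2f} Å, {e_max:.1f} kcal/mol")
```

Because the maximum is taken on a grid along a single chosen coordinate, it can only be resolved to within the fine step, and the energy found this way estimates the barrier from above.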
Since the reaction coordinate for this process is rather complicated, the barrier was estimated from the stepwise process as ~4.2 kcal/mol, at an OE2-H distance of 1.20 Å in sub-step 2. Considering that the energy difference between complex 5 and complex 4 is 3.3 kcal/mol, the real barrier should lie between 3.3 kcal/mol and 4.2 kcal/mol. After the proton transfer from the Zn-bound hydroxyl to Glu55, the carboxyl group of Glu55 rotates ~30° to point OE2-H toward the amine group and then transfers this proton to the amine group. The barrier for this step is 2.6 kcal/mol, at a distance of 2.05 Å between N2 and the proton. This new complex (complex 6 in Figure 8-3) is 4.1 kcal/mol more stable than complex 5. The N2-C2 bond length was monitored along the reaction up to this point. Interestingly, this bond is elongated considerably after the nucleophilic attack. The bond length is 1.37 Å (complex 1), 1.36 Å (complex 2), 1.35 Å (complex 3), 1.44 Å (complex 4), 1.46 Å (complex 5) and 1.56 Å (complex 6). The first large change (from complex 3 to complex 4) may be caused by the hybridization change of C2 from sp2 to sp3, which weakens the conjugation between the N2 π electrons and the purine ring π electrons, while the second large change (from complex 5 to complex 6) may be caused by the change of the N2 hybridization from sp2 to sp3. Overall, this 0.2 Å elongation of the C2-N2 bond should make its cleavage easier in the next step.

The formation of products

Since the amino group has been formed by the proton transfer from Glu55, the C2-N2 bond can be cleaved to form Zn-bound xanthine and ammonia. The distance between C2 and N2 was scanned, and the maximum was found at 2.10 Å with a barrier of 8.8 kcal/mol. The product complex (complex 7 in Figure 8-3) is 5.9 kcal/mol less stable than complex 6.
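The stepwise energy differences reported above can be accumulated into a profile relative to complex 1, which makes it easy to see where each intermediate sits and which step carries the highest barrier. The sketch below uses only the values quoted in the text; the variable and function names are illustrative, and no barrier is listed for the 1 → 2 step because none is given for the guanine bound pathway at this point.

```python
# (step label, energy change in kcal/mol, forward barrier in kcal/mol or None)
steps = [
    ("1 -> 2", +0.7, None),
    ("2 -> 3", -0.7, 4.5),
    ("3 -> 4", -0.1, 1.5),
    ("4 -> 5", +3.3, 4.2),
    ("5 -> 6", -4.1, 2.6),
    ("6 -> 7", +5.9, 8.8),
]


def relative_energies(steps):
    """Energy of each complex relative to complex 1 (kcal/mol)."""
    energies = [0.0]
    for _, delta, _ in steps:
        energies.append(round(energies[-1] + delta, 1))
    return energies


profile = relative_energies(steps)
for number, energy in enumerate(profile, start=1):
    print(f"complex {number}: {energy:+.1f} kcal/mol")

# Step with the largest reported forward barrier.
limiting = max((s for s in steps if s[2] is not None), key=lambda s: s[2])
print("highest barrier:", limiting[0], limiting[2], "kcal/mol")
```

On these numbers, complex 6 lies 0.9 kcal/mol below complex 1, complex 7 lies 5.0 kcal/mol above it, and the 6 → 7 step has the largest barrier.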
A stationary point was found for ammonia at a distance of 2.79 Å, where it forms two hydrogen bonds, with OE2 of Glu55 and the carbonyl of Glu81. The two hydrogen bond distances are 2.97 Å (between N of ammonia and OE2 of Glu55) and 3.17 Å (between N of ammonia and O of Glu81), indicating that ammonia is loosely bound in the active site, which should favor complex 7 over complex 6 from an entropic point of view. Once ammonia is released, the reaction is no longer reversible. Therefore, from complex 1 to complex 7, the overall reaction is endothermic by ~5.0 kcal/mol, and the rate-limiting step is the C2-N2 bond cleavage with a barrier of 8.8 kcal/mol (complex 6 to complex 7).

Release of xanthine

Zn and O2 of xanthine are still covalently bonded in complex 7; this bond has to be broken in order to release xanthine. The Zn-O bond was scanned, and the maximum energy occurs at a distance of 3.10 Å with a barrier of 8.4 kcal/mol, slightly smaller than the barrier for C2-N2 bond cleavage in the previous step. Xanthine becomes stable when this distance increases to 3.79 Å, with an energy 7.9 kcal/mol higher than the covalently bonded form (Figure 8-3). After the Zn-O bond breaks, xanthine is free to leave, and water then comes in and coordinates to Zn to complete the catalytic cycle. Alternatively, water might approach Zn and assist the bond breaking, which would make the cleavage of the Zn-O bond and the binding of water a concerted step. To determine whether this is the case, and to identify the possible water path for this process, an MD simulation of Zn-bound xanthine was performed with the Zn-O distance restrained at 3.10 Å in subunit 1 to mimic the transition state and at 3.80 Å in subunit 2 to mimic the free xanthine complex. During the 2-ns simulation, no water was observed to approach within 3.5 Å of Zn in either active site.
Although three water molecules are present in both active sites, none can reach Zn because the path is blocked by xanthine (data not shown). The distance between Zn and O2 of xanthine was then reduced to 2.0 Å in subunit 1 (to mimic complex 8) and increased to 4.8 Å in subunit 2; again, no water was observed close to Zn in either active site during a 1-ns simulation. Together, these results suggest that water more likely binds to Zn after the release of xanthine, or at least after the relocation of xanthine in the active site.

The protonation of Asp114

As discussed above, the first step of the mechanism is the proton transfer from the Zn-bound water to the carboxyl group of Asp114. This happens either before or after guanine binds to the active site. To elucidate this process, the crystal structure was optimized by ONIOM without the bound imidazole, but keeping all the crystal water molecules and residues within 6 Å of the active site. Interestingly, one water molecule bridges the Zn-bound water and the carboxyl OD1 of Asp114 by accepting a hydrogen bond from the former and donating one to the latter (Figure 8-5). The bridging water also forms a hydrogen bond with the carbonyl of Glu81. The heavy atom distances are 2.64, 2.60, and 3.23 Å for these three hydrogen bonds, which indicates that the bridging water should be rather stable. The Zn-bound water is also hydrogen-bonded to Glu55 (Figure 8-5). The MD simulation of apo bGD also shows the bridging water molecule with similar interactions, but it is more dynamic. During the first 150 ps of the MD simulation, one (the first) water molecule bridges the Zn-bound water and Asp114, as seen in the ONIOM calculation, but during the next 175 ps another (the second) water comes in and forms a two-water bridge. In the following 680 ps the one-water bridge is restored, and the second water molecule moves away to form a hydrogen bond with OD2 of Asp114.
This one-water bridge is broken during the next 370 ps, and the first water moves close to the carboxyl group of Glu55 and forms a hydrogen bond with it. Thereafter the second water moves back to OD1 and restores the one-water bridge, while a new water molecule migrates from the solvent into the active site. Overall, the one-water (two-water) bridge exists during 70% (10%) of the 1.5 ns simulation. Since a similar water bridge was observed in both the ONIOM and MD calculations, this water bridge should be stable and might assist the proton transfer from the Zn-bound water to Asp114. The distance between the Asp114 carboxyl OD1 and the proton hydrogen-bonded to it was scanned in an ONIOM calculation; the maximum energy occurs at a distance of 1.20 Å with a barrier of 7.0 kcal/mol (Figure 8-5). In the end, the bridging water transfers its proton to OD1 and abstracts one proton from the Zn-bound water. The energy of protonated Asp114 with deprotonated Zn-bound water is 5.20 kcal/mol higher than that of deprotonated Asp114 with Zn-bound water. So it is more likely that the proton stays on the Zn-bound water rather than on Asp114 in the apo enzyme. However, from the previous calculations, we know that after guanine binds, the energy difference between these two states is only 0.70 kcal/mol in favor of the Zn-bound water. So the binding of guanine shifts the equilibrium toward the right side of Figure 8-5. It probably stabilizes the protonated Asp114 through two hydrogen bonds, between N3 and OD1-H and between N9-H and OD2. The question then arises whether the proton transfer from the Zn-bound water to Asp114 through the water bridge can occur after guanine binds. To answer this question, we first tried to insert a water molecule between Asp114 and the Zn-bound water after guanine binding in the ONIOM calculation.
After energy minimization, the bridging water forms one hydrogen bond with the Zn-bound water as an acceptor and another hydrogen bond with N3 instead of OD1 of Asp114. Different orientations of the inserted water were tried, but the same minimized structure was obtained. Thus, it is more likely that the proton is transferred through the water directly to N3 rather than to Asp114 in this case. The proton of the Zn-bound water was transferred to N3 of guanine by shortening the distance between N3 and the proton of the bridging water. The proton transfer from the Zn-bound water to the bridging water occurs at a distance of 1.20 Å with a barrier of 14.7 kcal/mol, and the end state is 8.2 kcal/mol higher than the starting state. This large barrier makes the proton transfer through the water bridge unlikely compared with the previously proposed process. Furthermore, the inserted water molecule makes the region so crowded that the distance between the O of the bridging water and the carbonyl O of Glu81 is only 2.31 Å, which makes the van der Waals interaction very unfavorable and the bridging water less stable. Because the outer layer residues are fixed during an ONIOM calculation, the space around guanine is tight; motion of these residues, especially Glu81, might be able to accommodate one water molecule between the Zn-bound water and Asp114. However, the MD simulation shows that the water bridge does not exist when guanine is bound in the active site. Although two water molecules are seen around the carboxyl group of Asp114, neither forms a hydrogen bond with the Zn-bound water, because the presence of guanine leaves too little space for water molecules. Therefore, all these results suggest that it is unlikely that the proton is transferred to Asp114 (or to guanine) after guanine binds.
Instead, it is more likely that the proton transfer to Asp114 occurs through the water bridge seen in ONIOM, right before guanine is positioned in the binding pocket; guanine then pushes away the bridging water and stabilizes the protonated Asp114 by forming two hydrogen bonds with it. After this, the subsequent reactions occur as described above.

Comparison of MD and ONIOM

The outer layer was frozen in the ONIOM calculation, but the active site structure and interactions might change if it were allowed to move. One good way to investigate this is to compare the active site from the MD simulation with the corresponding structure from ONIOM. The hydrogen bond network of the guanine bound enzyme complex (complex 1) was investigated in the MD simulation and appears to be the same as that from the ONIOM calculation (Table 8-2). In the MD simulations of the Zn-xanthine product complexes (complex 8 and complex 9), the same hydrogen bond interactions were observed as in the ONIOM calculations (Tables 8-3 and 8-4). In general, the distances between heavy atoms in MD are quite consistent with the quantum results, and most distances from ONIOM fall within the fluctuations of the corresponding MD values (Tables 8-2, 8-3 and 8-4). This indicates that fixing the outer layer in ONIOM does not alter the active site interactions significantly in the reactant and product bound forms. Another way to justify the constraint on the outer layer is to check its rigidity in the MD simulations. The root mean square fluctuation (RMSF) calculations show that the all-atom RMSFs of the 26 active site residues (those included in ONIOM) are quite small: only 0.45 Å for complex 1, 0.41 Å for complex 8 and 0.44 Å for complex 9. Thus, it is reasonable to fix the outer layer in these ONIOM calculations. The outer layer of ONIOM has to be large enough to cover most of the environmental effect, but as small as possible to minimize the computational cost.
To determine whether 26 residues are sufficient in ONIOM, the interaction energies between guanine and its surroundings were calculated from snapshots of the 1.5-ns MD simulation. The average interaction energy is -103.9 ± 4.42 kcal/mol between guanine and those 26 residues, compared with 3.54 ± 3.62 kcal/mol between guanine and the rest of the system (including solvent). So the effect of the rest of the system is rather small, only ~3%. Thus, it is sufficient to include those 26 residues in the ONIOM calculations.

Alternative mechanism

As described above, the first step in the reaction, the proton transfer from the Zn-bound water to Asp114, is important for the following steps. However, this step cannot be studied directly by ONIOM, because it appears to be quite dynamic. On the other hand, it is possible that Asp114 is already protonated in the apo form, with the proton coming from somewhere other than the Zn-bound water. Considering that this residue is buried in the active site, which is sealed by the C-terminal helix (Liaw et al. 2004), the pKa of Asp114 might differ from that in model peptides (~4.0). A mechanism similar to the one above can be proposed based on protonated Asp114, as shown in Figure 8-6. Only the thermodynamic energies were evaluated. First, Asp114 transfers its proton to guanine, which increases the energy by ~9.6 kcal/mol (from complex 1' to complex 2', Figure 8-6). Second, Glu55 abstracts one proton from the Zn-bound water and forms a hydrogen bond with the guanine amine group. At the same time, the Zn-bound hydroxide attacks C2 to form a tetrahedral intermediate, which increases the energy by ~7.4 kcal/mol (from complex 2' to complex 3', Figure 8-6).
Alternatively, Glu55 first abstracts one proton from the Zn-bound water, which increases the energy by ~13.6 kcal/mol (complex 1' to complex 2'', Figure 8-6); Asp114 then transfers its proton to N3 and the Zn-bound hydroxide attacks C2, which gives the same intermediate (complex 3', Figure 8-6) as above and increases the energy by a further 3.4 kcal/mol. Third, Glu55 transfers its proton to the guanine amine, and the energy goes up by another 1.2 kcal/mol (complex 4', Figure 8-6). Fourth, Glu55 abstracts another proton from the Zn-bound hydroxide and at the same time the C2-N2 bond breaks spontaneously, which lowers the energy by ~14.0 kcal/mol (complex 5', Figure 8-6). Finally, we forced the proton to bind to ammonia using a constraint. However, once this constraint is removed, the proton jumps back to Glu55, suggesting that the proton prefers to stay on Glu55. So the final products are xanthine, ammonia, and protonated Glu55. The overall energy change for this mechanism is ~4.2 kcal/mol (endothermic), which is comparable with the ~5.0 kcal/mol of the previous mechanism. However, all the intermediates have much higher energies. Complex 4' is 18.2 kcal/mol less stable than complex 1' (Figure 8-6), whereas its analog complex 6 is 0.9 kcal/mol more stable than complex 1 (Figure 8-3) in the previous mechanism. This 18.2 kcal/mol difference would make the reaction far less efficient. Therefore, this new mechanism based on protonated Asp114 is less likely to occur than the mechanism based on deprotonated Asp114. The mechanism proposed with deprotonated Asp114 as the starting point (Figure 8-3) is reminiscent of the mechanism proposed for the deamination reaction catalyzed by yeast cytosine deaminase (Sklenak et al. 2004). In the yCD system, two protons are transferred by Glu64 (like Glu55 in bGD) from the Zn-bound water to cytosine (like guanine in bGD). Here, Asp114 transfers the first proton and Glu55 transfers the second proton to guanine.
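The energies quoted above for the alternative mechanism can be checked for internal consistency: both routes from complex 1' to complex 3' should arrive at the same energy, and the later steps should reproduce the quoted 18.2 and ~4.2 kcal/mol values. The sketch below uses only numbers from the text; the variable names are illustrative.

```python
import math

# Energy changes in kcal/mol, as quoted for the alternative mechanism.
route_via_2_prime = [9.6, 7.4]          # 1' -> 2' -> 3'
route_via_2_double_prime = [13.6, 3.4]  # 1' -> 2'' -> 3'
step_3_to_4 = 1.2                       # 3' -> 4'
step_4_to_5 = -14.0                     # 4' -> 5'

e3_first = sum(route_via_2_prime)
e3_second = sum(route_via_2_double_prime)
e4 = e3_first + step_3_to_4
e5 = e4 + step_4_to_5

print(f"complex 3' via 2':  {e3_first:.1f} kcal/mol")
print(f"complex 3' via 2'': {e3_second:.1f} kcal/mol")
print(f"complex 4': {e4:.1f} kcal/mol, complex 5': {e5:.1f} kcal/mol")

# Both routes must agree, since the energy of 3' cannot depend on the path.
routes_agree = math.isclose(e3_first, e3_second, abs_tol=1e-9)
print("routes agree:", routes_agree)
```

Both routes place complex 3' at 17.0 kcal/mol above complex 1', so the two orderings of the first proton transfers differ only in which intermediate (2' or 2'') is visited.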
In fact, if N3 is protonated (complex b in Figure 8-7) instead of N1 (complex a in Figure 8-7), the same mechanism can be proposed for the bGD catalyzed reaction up to the release of xanthine. Here we address whether this is a possible mechanism. It has been predicted that complex a is 5.2 kcal/mol more stable than complex b in aqueous solution (Jang et al. 2003). The relative stability of these two species in the enzyme was also evaluated. As in aqueous solution, the ONIOM calculation shows that complex a is more stable than complex b, by 5.8 kcal/mol, in the GD active site. Since complex b is a minor form in solution and the enzyme also prefers to bind complex a rather than complex b, the probability of binding complex b should be much lower, which makes the yCD-like mechanism unlikely to occur.

Discussion

This ONIOM study provides a detailed mechanism of the guanine deamination reaction, with the energetics of all the stable species as well as the barriers along the reaction path. First, one proton is transferred from the Zn-bound water to Asp114. This process might be quite complicated and rather dynamic. In the apo form, proton transfer is possible through the water bridge but not favorable (the energy increases by 5.2 kcal/mol), while in the guanine bound form it is less unfavorable (the energy increases by only 0.7 kcal/mol) but difficult, since no water bridge is present. Water entry at the location observed in ONIOM and MD might happen just before the positioning of guanine. The positioning of guanine breaks the water bridge and stabilizes protonated Asp114 by forming two hydrogen bonds with it. Second, Asp114 transfers its proton to N3. The first two steps shorten the distance between C2 and the Zn-bound hydroxide and favor the nucleophilic reaction. Third, the Zn-bound hydroxide attacks C2 to form a tetrahedral intermediate with a rather low barrier, which also moves the Zn-bound hydroxide closer to OE2.
Fourth, Glu55 transfers one proton from the Zn-bound hydroxide to N2 by rotating its dihedral CB-CG-OE1-H by ~30°. By this point, the C2-N2 bond has been elongated by 0.2 Å, which makes the C2-N2 bond cleavage more facile. Fifth, the C2-N2 bond is broken and ammonia forms, which gives the highest barrier so far, ~8.8 kcal/mol. Sixth, ammonia leaves the active site and xanthine is freed by the cleavage of the Zn-O2 bond with a barrier of ~8.4 kcal/mol. Therefore, along the reaction path the highest barrier comes from the C2-N2 bond cleavage, while the barrier for the cleavage of the Zn-O2 bond is slightly smaller. Then xanthine leaves the active site, and water moves in and binds to Zn to finish the whole enzymatic reaction cycle. Along this path, Glu55 and Asp114 play an important role by acting as proton shuttles.

One alternative mechanism was proposed that involves protonated Asp114. Even though the overall reaction energy difference between product and reactant is comparable to that with deprotonated Asp114, the energies of the various intermediates along the reaction path are much higher. This makes the mechanism less likely to operate. One tautomer of guanine, with deprotonated N1 and protonated N3, was also discussed; if present in the active site, it would give a mechanism similar to that proposed for the yCD catalyzed reaction (Sklenak et al. 2004). However, this tautomer is 5.8 kcal/mol less stable than guanine in the active site, similar to what was found in an aqueous environment (5.2 kcal/mol less stable). The probability of binding this tautomer is much lower than that of binding guanine, which makes the yCD-like mechanism unlikely.

In the ONIOM calculation, only 26 residues of the enzyme around the active site were included. Our MD simulation of guanine bound GD shows that these 26 residues contribute 97% of the interaction energy between the ligand and its surroundings.
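The 97% figure can be reproduced from the two average interaction energies reported in the Results (-103.9 kcal/mol for the 26 residues and 3.54 kcal/mol for the rest of the system). The sketch below compares magnitudes, which is an assumption: the text does not state exactly how the percentage was computed, and the two contributions have opposite signs. The function name is hypothetical.

```python
def interaction_shares(e_site, e_rest):
    """Fractional share of each contribution, compared by magnitude.

    Assumption: shares are taken as |E| / (|E_site| + |E_rest|).
    """
    total = abs(e_site) + abs(e_rest)
    return abs(e_site) / total, abs(e_rest) / total


# Average guanine interaction energies from the MD snapshots (kcal/mol).
site_share, rest_share = interaction_shares(-103.9, 3.54)
print(f"26 residues: {site_share:.1%}, rest of system: {rest_share:.1%}")
```

Under this reading, the 26 residues account for roughly 97% of the interaction and the rest of the system, solvent included, for roughly 3%.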
The 97% contribution suggests that it is sufficient to include just this part of the enzyme in the ONIOM calculation. However, since only part of the protein is used, energy minimization of the outer layer without constraints might give an unrealistic expansion of the protein. So the outer layer was fixed in the quantum calculations. Our MD simulations of the protein with explicit water molecules lead to the same hydrogen bond network in the active site as in the ONIOM calculations, for both the reactant and product complexes, with consistent distances between heavy atoms. Therefore, the interaction between the ligand and its surroundings is well defined and quite stable. The constraint on the outer layer residues did not introduce artifacts in the hydrogen bond analysis of the active site. The flexibility of the active site was also investigated in the MD simulations. The all-atom RMSFs of those 26 residues are only ~0.4 Å in both the reactant and product complexes. The rigidity of the active site in MD explains why MD gives active site interactions similar to those from ONIOM. This also suggests that freezing the outer layer in the ONIOM calculations is a reasonable approximation.

Conclusion

One reaction path for the deamination of guanine to xanthine catalyzed by GD was proposed based on a two-layer ONIOM and MD study. The highest barrier comes from C2-N2 bond cleavage. The Zn-O2 bond can be broken without the assistance of water during the release of xanthine. Glu55 and Asp114 were identified as assisting two proton transfers from the Zn-bound water to the ligand.

References

Becke, A. D. (1993). "Density-Functional Thermochemistry. 3. The Role of Exact Exchange." Journal of Chemical Physics 98(7): 5648-5652.
Berendsen, H. J. C., J. P. M. Postma, et al. (1984). "Molecular-Dynamics with Coupling to an External Bath." Journal of Chemical Physics 81(8): 3684-3690.
Cornell, W. D., P. Cieplak, et al. (1996). "A second generation force field for the simulation of proteins, nucleic acids, and organic molecules (vol 117, pg 5179, 1995)." Journal of the American Chemical Society 118(9): 2309-2309.
Dapprich, S., I. Komaromi, et al. (1999). "A new ONIOM implementation in Gaussian98. Part I. The calculation of energies, gradients, vibrational frequencies and electric field derivatives." Journal of Molecular Structure-Theochem 462: 1-21.
Dewar, M. J. S., E. G. Zoebisch, et al. (1985). "The Development and Use of Quantum-Mechanical Molecular-Models. 76. AM1: a New General-Purpose Quantum-Mechanical Molecular-Model." Journal of the American Chemical Society 107(13): 3902-3909.
Dudev, T. and C. Lim (2001). "Modeling Zn2+-cysteinate complexes in proteins." Journal of Physical Chemistry B 105(43): 10709-10714.
Dudev, T. and C. Lim (2002). "Factors governing the protonation state of cysteines in proteins: An ab initio/CDM study." Journal of the American Chemical Society 124(23): 6759-6766.
Essmann, U., L. Perera, et al. (1995). "A Smooth Particle Mesh Ewald Method." Journal of Chemical Physics 103(19): 8577-8593.
Frisch, M. J. "Gaussian 03."
Humbel, S., S. Sieber, et al. (1996). "The IMOMO method: Integration of different levels of molecular orbital approximations for geometry optimization of large systems: Test for n-butane conformation and S(N)2 reaction: RCl+Cl-." Journal of Chemical Physics 105(5): 1959-1967.
InsightII "InsightII."
Ireton, G. C., M. E. Black, et al. (2003). "The 1.14 Å crystal structure of yeast cytosine deaminase: evolution of nucleotide salvage enzymes and implications for genetic chemotherapy." Structure 11: 961-972.
Jang, Y. H., W. A. Goddard, et al. (2003). "pK(a) values of guanine in water: Density functional theory calculations combined with Poisson-Boltzmann continuum-solvation model." Journal of Physical Chemistry B 107(1): 344-357.
Lee, C. T., W. T. Yang, et al. (1988). "Development of the Colle-Salvetti Correlation-Energy Formula into a Functional of the Electron-Density." Physical Review B 37(2): 785-789.
Liaw, S. H., Y. J. Chang, et al. (2004). "Crystal structure of Bacillus subtilis guanine deaminase - The first domain-swapped structure in the cytidine deaminase superfamily." Journal of Biological Chemistry 279(34): 35479-35485.
Pearlman, D. A., D. A. Case, et al. (1995). "Amber, a Package of Computer-Programs for Applying Molecular Mechanics, Normal-Mode Analysis, Molecular-Dynamics and Free-Energy Calculations to Simulate the Structural and Energetic Properties of Molecules." Computer Physics Communications 91(1-3): 1-41.
Ryckaert, J. P., G. Ciccotti, et al. (1977). "Numerical-Integration of Cartesian Equations of Motion of a System with Constraints - Molecular-Dynamics of N-Alkanes." Journal of Computational Physics 23(3): 327-341.
Sklenak, S., L. S. Yao, et al. (2004). "Catalytic mechanism of yeast cytosine deaminase: An ONIOM computational study." Journal of the American Chemical Society 126(45): 14879-14889.
Svensson, M., S. Humbel, et al. (1996). "ONIOM: A multilayered integrated MO+MM method for geometry optimizations and single point energy predictions. A test for Diels-Alder reactions and Pt(P(t-Bu)(3))(2)+H-2 oxidative addition." Journal of Physical Chemistry 100(50): 19357-19363.
Vreven, T. and K. Morokuma (2000). "On the application of the IMOMO (integrated molecular orbital plus molecular orbital) method." Journal of Computational Chemistry 21(16): 1419-1432.
Yao, L. S., Y. Li, et al. (2005). "Product release is rate-limiting in the activation of the prodrug 5-fluorocytosine by yeast cytosine deaminase." Biochemistry 44(15): 5940-5947.
Yao, L. S., S. Sklenak, et al. (2005). "A molecular dynamics exploration of the catalytic mechanism of yeast cytosine deaminase." Journal of Physical Chemistry B 109(15): 7500-7510.
Appendices

Table 8-1 Comparison of ONIOM optimized structure with crystal structure of the imidazole inhibitor complex

Internal Coordinate        ONIOM     X-ray Structure
Zn-O (wat)                 1.96      2.03
Zn-ND (His53)              2.03      2.06
Zn-SG (Cys83)              2.41      2.34
Zn-SG (Cys86)              2.37      2.28
SG(Cys83)-Zn-SG(Cys86)     115.6°    130.7°
SG(Cys83)-Zn-ND(His53)     101.9°    105.0°
SG(Cys86)-Zn-ND(His53)     105.9°    112.7°
SG(Cys83)-Zn-O(wat)        1 105°    106.8°

Table 8-2 Hydrogen bonds in GD substrate bound form

                         ONIOM          MD             MD
Hydrogen bond            Distance (Å)   Distance (Å)   Occurrence (%)
Gln42-NDH...O6           3.09           3.00(0.18)     97.3
Ala54-NH...O6            3.01           3.03(0.16)     99.7
Glu55-OE1...H-N1         2.84           2.88(0.16)     80.1*
Glu55-OE2...H1-N2        3.05           3.11(0.22)     86.5*
Glu81-O...H2-N2          3.46           3.16(0.19)     74.1
Zn-WAT-OH...N3           3.05           3.07(0.16)     98.3
Asp114-OD1...H-N9        2.73           2.96(0.16)     98.0a
Tyr156-OH...N7           3.32           3.03(0.19)     92.3

* Due to the rotation of the carboxyl group, OE2 (OE1) also forms a hydrogen bond with H-N1 (H1-N2).
a In MD, OD2 also forms a hydrogen bond with H-N9, with distance 2.96(0.18) and occurrence 96.8%.

Table 8-3 Hydrogen bonds in complex 8 bound form

                         ONIOM          MD             MD
Hydrogen bond            Distance (Å)   Distance (Å)   Occurrence (%)
Gln42-NDH...O6           3.06           2.90(0.14)     97.8
Ala54-NH...O6            3.15           3.15(0.16)     95.2
Glu55-OE1...H-N1         2.73           2.79(0.12)     83.2
Asp114-OD1...H-N3        2.68           2.96(0.18)     78.8
Asp114-OD2...H-N9        2.69           2.78(0.10)     100
Tyr156-OH...N7           2.82           2.94(0.17)     97.8

Table 8-4 Hydrogen bonds in complex 9 bound form

                         ONIOM          MD             MD
Hydrogen bond            Distance (Å)   Distance (Å)   Occurrence (%)
Gln42-NDH...O6           3.20           2.92(0.14)     99.7
Ala54-NH...O6            3.11           3.20(0.15)     91.9
Glu55-OE1...H-N1         2.77           2.79(0.09)     100
Asp114-OD1...H-N3        2.64           2.79(0.09)     100
Asp114-OD2...H-N9        2.73           2.81(0.11)     100
Tyr156-OH...N7           2.99           2.89(0.16)     99.2

Figure 8-1 The deamination reaction catalyzed by GD.
Figure 8-6 Alternative mechanism with protonated Asp114.

[...] and

\[
\delta_i r_t^i = r_t^i - \langle r_t^i \rangle_C , \qquad
\delta r_t^i = r_t^i - \langle\langle r_t^i \rangle_C \rangle_P ,
\]

with $\delta_i$ distinguishing which protomer average is used in the former definition. The first RMSF we define is

\[
\mathrm{RMSF1} = \left\langle \left\langle (\delta_i r_t^i)^2 \right\rangle_C^{1/2} \right\rangle_P \equiv \langle \mathrm{RMSF}_i \rangle_P . \tag{3}
\]

RMSF1 is the average over protomers of the ith protomer $\mathrm{RMSF}_i$ and will be referred to as the average protomer RMSF. The second RMSF we define is

\[
\mathrm{RMSF2} = \left\langle \left\langle (\delta r_t^i)^2 \right\rangle_C \right\rangle_P^{1/2} . \tag{4}
\]

RMSF2 is an RMSF obtained by treating the trajectory as one record over the entire $N_p \times N_C$ data, and will be referred to as the octamer RMSF. RMSF2 does not distinguish atom r among the different protomers. To the extent that the averages in these deviations are different, there is no rigorous connection between the two RMSFs. If, however, the protomer averages are independent of protomer, $\langle r_t^i \rangle_C = \langle\langle r_t^i \rangle_C \rangle_P$ ($i = 1, \ldots, N_p$), then they can be related as follows. Using the protomer $\mathrm{RMSF}_i$ definition in Eq. (3), the difference between RMSF2 and RMSF1 can be expressed as

\[
\Delta\mathrm{RMSF}^2 \equiv \mathrm{RMSF2}^2 - \mathrm{RMSF1}^2 = \langle \mathrm{RMSF}_i^2 \rangle_P - \langle \mathrm{RMSF}_i \rangle_P^2 \ge 0 . \tag{5}
\]

Thus, $\Delta\mathrm{RMSF}$ is the RMS fluctuation over different protomers of the individual protomer $\mathrm{RMSF}_i$ values.
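The relation between the two RMSF definitions can be checked numerically. The following is a minimal NumPy sketch, not the analysis code used in this work: the function name `rmsf_two_ways`, the array layout, and the synthetic coordinates are all assumptions made for illustration. The synthetic protomers are given identical mean positions so that the premise of Eq. (5) holds exactly.

```python
import numpy as np

def rmsf_two_ways(coords):
    """coords has shape (n_protomers, n_frames, 3): the position of one
    equivalent atom in every protomer, in a common reference frame."""
    # Deviation from each protomer's own time average (delta_i r).
    dev_own = coords - coords.mean(axis=1, keepdims=True)
    rmsf_i = np.sqrt((dev_own ** 2).sum(axis=2).mean(axis=1))  # one per protomer
    rmsf1 = rmsf_i.mean()  # average protomer RMSF, Eq. (3)

    # Deviation from the average over all protomers and frames (delta r).
    dev_all = coords - coords.mean(axis=(0, 1), keepdims=True)
    rmsf2 = np.sqrt((dev_all ** 2).sum(axis=2).mean())  # octamer RMSF, Eq. (4)
    return rmsf_i, rmsf1, rmsf2

rng = np.random.default_rng(0)
# Hypothetical data: 8 protomers, 2000 frames, different mobility per protomer.
sigmas = np.linspace(0.2, 0.9, 8)
coords = rng.normal(0.0, 1.0, size=(8, 2000, 3)) * sigmas[:, None, None]
# Remove each protomer's mean so that all protomer averages coincide.
coords -= coords.mean(axis=1, keepdims=True)

rmsf_i, rmsf1, rmsf2 = rmsf_two_ways(coords)
delta_sq = rmsf2 ** 2 - rmsf1 ** 2  # left-hand side of Eq. (5)
print(rmsf1, rmsf2, delta_sq, rmsf_i.var())
```

With equal protomer means, `delta_sq` equals the variance of the per-protomer RMSF values, so RMSF2 cannot be smaller than RMSF1. If the protomer means are allowed to differ, this equality no longer holds.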
Equation (5) indicates that, to the extent that the average values of the vector positions are independent of protomer, as they will be approximately (when referenced to the same coordinate origin), the RMSF2 values will be larger than the RMSF1 values. This difference should be magnified for the atoms that are intrinsically more mobile, and this feature will be useful for distinguishing fluctuations of given atoms in the apo versus bound forms of DHNA. When the protomer averages are dependent on protomer, the inequality in Eq. (5) may be violated, and RMSF1 could be slightly larger than RMSF2.

The principal component analysis was carried out with the g_covar and g_anaeig modules of GROMACS 3.0 [20]. The PCA method diagonalizes the covariance matrix $\sigma_{ij} = \langle \delta x_i \, \delta x_j \rangle$ of the CA atom fluctuations from their trajectory-averaged $\langle \ldots \rangle$ values, where $\delta x_i = x_i - \langle x_i \rangle$ and $x_i$ denotes the Cartesian components of the ith CA atom [10]. For this analysis, $\langle \ldots \rangle = \langle\langle \ldots \rangle_C \rangle_P$; the entire trajectory is used and corresponds to the octamer-based RMSF approach. The overall translational/rotational motion of all the snapshots is removed relative to a reference structure before matrix diagonalization. The sum of the eigenvalues of the covariance matrix corresponds to the total variance of the motion over the trajectory. Ordering the eigenvalues from large to small can, in favorable cases, provide a small set of modes that captures most of the protein's fluctuations.

Another application of PCA is to the evaluation of entropy [21]. Assuming that the covariance matrix $\sigma_{ij}$ describes the protein dynamics permits the entropy change in DHNA for binding of HP to be estimated as

\[
\Delta S = \frac{k_B}{2} \ln\left[ \frac{\det(\sigma_b)}{\det(\sigma_f)} \right] , \tag{1}
\]

where $k_B$ is Boltzmann's constant, and $\sigma_b$ and $\sigma_f$ are the above covariance matrices in the bound and free forms, respectively. After diagonalization of these two matrices, the entropy change can be obtained since the determinants in Eq. (1) are expressible as the products of their PCA eigenvalues.
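The PCA and entropy steps can be illustrated with a short NumPy sketch. This is an assumption-laden illustration, not the GROMACS workflow used here: the function names, the synthetic trajectories, and the use of the molar gas constant (so that the result comes out in cal/mol-deg) are choices made for this example. The input coordinates are assumed to have been superposed on a reference structure already.

```python
import numpy as np

R_GAS = 1.987204  # cal/(mol K); Boltzmann's constant expressed per mole

def pca_eigenvalues(traj):
    """traj has shape (n_frames, n_coords). Returns the eigenvalues of the
    coordinate covariance matrix, ordered from large to small."""
    fluct = traj - traj.mean(axis=0)          # delta x_i = x_i - <x_i>
    cov = fluct.T @ fluct / traj.shape[0]     # sigma_ij = <delta x_i delta x_j>
    return np.linalg.eigvalsh(cov)[::-1]

def entropy_change(evals_bound, evals_free, n_modes=None):
    """Eq. (1): (k_B / 2) ln[det(sigma_b) / det(sigma_f)], with each determinant
    written as a product of eigenvalues; n_modes truncates to the largest modes."""
    log_b = np.log(evals_bound[:n_modes]).sum()
    log_f = np.log(evals_free[:n_modes]).sum()
    return 0.5 * R_GAS * (log_b - log_f)

rng = np.random.default_rng(1)
# Hypothetical trajectories: the "free" form fluctuates twice as much.
free = rng.normal(0.0, 2.0, size=(5000, 6))
bound = rng.normal(0.0, 1.0, size=(5000, 6))

ev_free, ev_bound = pca_eigenvalues(free), pca_eigenvalues(bound)
ds_all = entropy_change(ev_bound, ev_free)
ds_top2 = entropy_change(ev_bound, ev_free, n_modes=2)
print(ev_free.sum(), ev_bound.sum(), ds_all, ds_top2)
```

Summing logarithms of eigenvalues avoids overflow or underflow in the determinants. In a real trajectory the modes associated with the removed rigid-body motion have eigenvalues near zero and must be left out of the sums, which is one use of the `n_modes` argument.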
The entropy of HP was estimated by using the quasih module of AMBER 7.0 to obtain the normal modes that can be used in standard formulas from statistical thermodynamics.

Results and Discussion

1. Apo DHNA and its HP complex

A. Flexibility analysis of DHNA and its HP complex based on the octamer and its protomers. DHNA is an octamer, with its eight active sites that bind HP formed by interfaces between pairs of adjacent protomers. Kinetic studies show no cooperativity among the active sites; therefore, they can be treated as statistically independent units, indicating that more conformational space can be explored than in a usual MD simulation of a given duration. The RMSDs of the CA atoms of the instantaneous MD structures from the initial X-ray structure for the apo DHNA and its HP complex are ~1.1-1.5 Å between 0.5 and 2.0 ns (data not shown). For this comparison, the superposition of structures for the RMSD evaluation was based on the octamer, providing an effective simulation time of 12 ns (8 x 1.5 ns of time after 0.5 ns for equilibration).

The RMS fluctuations (RMSF) of the CA atoms for each residue (Figure 9-1a and 9-1b) were first calculated by the average protomer RMSF method, as discussed in the Methods. This approach evaluates each protomer RMSF and averages them over the eight protomers to obtain the average protomer RMSF. Overall, the protein is quite rigid, with an average RMSF of ~0.6 Å. The binding of HP apparently does not change the RMSF of the protein on the MD time scale. The RMSF variations along the amino acid sequence obtained from the MD simulations and experimental crystal B-factors are quite similar, with correlation coefficient 0.79 for the apo form and 0.77 for the HP complex.

The MD trajectory of the DHNA octamer was then partitioned into the coordinate sets for the individual protomers and combined to obtain a data record of length $N_p \times N_C$, with averages and fluctuations from averages based on all the data for a given CA.
This octamer-based RMSF is the second approach discussed in the Methods and, as noted there, will tend to emphasize the fluctuations. These octamer RMSF values, compared with the previous average protomer RMSF, are plotted in Figure 9-2. The new RMSFs display a similar pattern to the previous ones, but with larger variations in some regions, especially for the apo enzyme (Figure 9-2a). Strikingly, all these regions are around the active sites, at the interface of two subunits (Figure 9-3), including residues 15-25, 45*-55*, 68-74 and 100-110. (We will henceforth denote residues in the adjacent subunit that together form the active site by a "*", as in the notation 45*-55*.) This difference reflects the limited 2 ns sampling space around the initial subunit structures. The "extra" flexibility around the active site might assist in ligand binding and release, as discussed below. The increase in RMSF on an octamer versus average protomer basis occurs throughout the apo enzyme simulation. In the HP complex simulation, the analogous increase is smaller and even negative (decreased relative to the average protomer-based method) in the rigid regions of the complex (Figure 9-2b), which indicates that HP binding rigidifies the enzyme not only in the flexible regions but also in the rigid regions. This feature can be seen clearly in Figure 9-4a, where the apo and HP-bound RMSF differences are compared for the octamer and average protomer methods. Thus, while RMSF differences based on the average protomer RMSF would not support the hypothesis that the binding of HP rigidifies the protein, the differences based on the octamer RMSF show clearly that the binding of HP rigidifies DHNA, and does so not only in the flexible regions, but also in the rigid regions.

While most parts of DHNA are rigidified by binding HP, residues 49-51 seem more flexible after HP binding (Figure 9-4a, dotted line). Further investigation indicates that the ψ
dihedral angle of Asp*50 and the φ dihedral angle of Thr*51 can change cooperatively to give a different conformation for Asp*50 and Thr*51. Figure 9-5 shows the time evolution of these two dihedral angles in the different subunits displayed sequentially as one trajectory. In subunits 1, 2, 3, 4, 7 and 8, ψ and φ remain at ~0° and -140°, respectively; but in subunits 5 and 6 (especially 6), ψ changes from 0° to -40° at the same time that φ changes from -140° to -70°. These simultaneous changes of dihedral angles may be caused by the new interactions between the Thr*51 side chain and its surroundings (see discussion below). After excluding subunit 6 from the trajectory, the new RMSF differences between the apo enzyme and the HP-bound enzyme show no positive spike around residue 50 (Figure 9-4b).

B. Principal component analysis. The rigidification of DHNA upon HP binding can also be investigated by the PCA method, which attempts to decompose the total variance of the atom RMS fluctuations over the MD trajectory into a contribution from a small set of modes that have a relatively large contribution to this variance and a large set of modes with a small summed contribution [10]. Because the above analysis does show that there are flexible regions of DHNA that become rigidified upon HP binding, a comparison by PCA of the two forms should be instructive. The trajectory of subunit 6 was excluded from the PCA, because Asp*50 and Thr*51 in subunit 6 have backbone conformations quite different from those in the other subunits. In order to highlight the flexible regions in the PCA, only CAs of rigid regions (with RMSF less than 0.7 Å in the free form) were fitted to remove the overall translational/rotational motion. The new fitting gives a total variance of 96.6 Å² for all CAs in the apo form and 67.2 Å² in the bound form. The first four modes contribute 56% of the variance in the apo form and 39% in the bound form.
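As a quick check of how these percentages translate into absolute variances, the arithmetic can be written out using only the totals and fractions quoted above. Because the percentages are rounded, the results are approximate.

```latex
\begin{align*}
\text{apo, modes 1--4:}   \quad & 0.56 \times 96.6\ \text{\AA}^2 \approx 54.1\ \text{\AA}^2,
  & \text{other modes:} \quad & 96.6 - 54.1 = 42.5\ \text{\AA}^2, \\
\text{bound, modes 1--4:} \quad & 0.39 \times 67.2\ \text{\AA}^2 \approx 26.2\ \text{\AA}^2,
  & \text{other modes:} \quad & 67.2 - 26.2 = 41.0\ \text{\AA}^2, \\
\text{apo} - \text{bound:} \quad & 54.1 - 26.2 = 27.9\ \text{\AA}^2 \ \text{(modes 1--4)},
  & & 96.6 - 67.2 = 29.4\ \text{\AA}^2 \ \text{(all modes)}.
\end{align*}
```

On these numbers, roughly 95% of the 29.4 Å² drop in total variance on binding is carried by the first four modes.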
The remaining modes contribute ~43 Å² in the apo form and ~41 Å² in the bound form. Thus, as indicated in Figure 9-6, the first four modes incorporate most of the flexibility difference between the apo and bound forms. That a few modes capture most of the difference in RMS fluctuations between the apo and bound forms reinforces the data obtained above that a limited number of residues are responsible for the changes in DHNA flexibility upon HP binding. The first mode has variance ~25 Å² in the free form but only ~9 Å² in the HP-bound form. The RMS fluctuations of each CA corresponding to the trajectory based on mode 1 for the apo and HP-bound forms are displayed in Figure 9-7. They show that the enhanced flexibility of the apo relative to the bound form is concentrated in the three regions, 45*-55*, 68-74 and 100-110, which were identified in Section A. These data, along with the data for modes 2 and 3, suggest that examination of three distances between three residues, one in each of these regions, should provide a refined perspective on the active site fluctuations. Labeling the Tyr*54-Ile105 distance as d1, the Tyr*54-Ala69 distance as d2, and the Ala69-Ile105 distance as d3, and parametrically plotting the data for all eight active sites in 3D, reveals that the most interesting differences are in the d2 and d3 directions, as displayed in Figure 9-8. The fluctuations in distance along the d1 direction are quite similar for the apo and HP-bound forms, and their scale is about half of the total d2 and d3 apo fluctuation scale. The apo form has a major and a minor state, while the HP-bound form has only one state, which is confined within the major apo state. The apo motion is correlated in the sense that on average an increase in d2 is accompanied by an increase in d3, indicating a kind of breathing motion (both distances have a common residue) that is, approximately, an equally weighted combination of the two distances.
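The correlated d2/d3 motion described above can be quantified with a correlation coefficient and the principal axis of the (d2, d3) scatter. The sketch below is illustrative only: the helper names and the synthetic coordinates (a shared "breathing" displacement plus small independent noise) are assumptions, and no simulation data from this work are used.

```python
import numpy as np

def distance(traj_a, traj_b):
    """Per-frame distance between two atoms; each input has shape (n_frames, 3)."""
    return np.linalg.norm(traj_a - traj_b, axis=1)

def breathing_summary(d2, d3):
    """Pearson correlation of the two distances and the unit vector along the
    direction of largest variance in the (d2, d3) plane."""
    r = np.corrcoef(d2, d3)[0, 1]
    evals, evecs = np.linalg.eigh(np.cov(np.vstack([d2, d3])))
    main_axis = evecs[:, -1]                      # largest-eigenvalue direction
    main_axis = main_axis * np.sign(main_axis[0])  # fix the arbitrary sign
    return r, main_axis

rng = np.random.default_rng(3)
n = 4000
breath = rng.normal(0.0, 0.5, n)  # hypothetical shared breathing coordinate
# Hypothetical geometry: Ala69 CA at the origin, the other two CAs on the axes.
ala69 = np.zeros((n, 3))
tyr54 = np.column_stack([10.0 + breath + rng.normal(0, 0.1, n), np.zeros(n), np.zeros(n)])
ile105 = np.column_stack([np.zeros(n), 12.0 + breath + rng.normal(0, 0.1, n), np.zeros(n)])

d2 = distance(tyr54, ala69)    # Tyr*54-Ala69
d3 = distance(ala69, ile105)   # Ala69-Ile105
r, main_axis = breathing_summary(d2, d3)
print(r, main_axis)
```

A main axis with nearly equal components corresponds to the equally weighted combination of the two distances mentioned in the text.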
The estimated entropy loss of DHNA due to HP binding for CA atoms is about 4.9 cal/mol-deg based on the first four modes and about 13.1 cal/mol-deg including all the 357 PCA-determined modes, when evaluated from Eq. (1). It is quite interesting that even though the first several modes contribute most of the flexibility changes in the binding, ~60% of the entropic loss comes from the small eigenvalue (high frequency) contribution. This is consistent with the fact that the binding of HP rigidifies not only the flexible regions but also the rigid regions.

C. Active site analysis. Since the most significant flexibility changes are localized in the active site region, it is useful to analyze the interactions among the active site residues and between the active site residues and bound HP. In the active site of apo DHNA (Figure 9-9a), the carboxylate of Glu22 forms a salt bridge with Lys100 and hydrogen bonds with the backbone amides of Ala18 and Leu19 and with the side chain amide of Gln27. The carboxylate of Glu74 forms hydrogen bonds with the hydroxyl of Thr*51 and the phenoxyl of Tyr*54 from the adjacent subunit and the backbone amides of Leu73 and Glu74 from the same subunit. The strong hydrogen bond networks formed by the carboxylates of Glu22 and Glu74 with their surroundings indicate that these two residues are restrained in the apo form. By comparing the hydrogen bond interactions in different active sites (Table 9-1), the hydrogen bond strengths are shown to be more diverse around Glu22 than Glu74, which may be caused by the difference in flexibility between these two regions. The active sites in the different subunits show quite consistent hydrogen bond interactions (Table 9-1), which suggests that the active sites are well defined in the apo form MD simulation.
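Hydrogen bond statistics of the kind compared across active sites can be computed from per-frame donor-acceptor distances. The following sketch is hypothetical: the 3.5 Å distance cutoff, the function name `hbond_stats`, and the example distances are assumptions for illustration and are not taken from the analysis in this work.

```python
import numpy as np

def hbond_stats(distances, cutoff=3.5):
    """Mean distance, standard deviation, and occurrence (%) of a hydrogen bond,
    given per-frame donor-acceptor distances in Å. The cutoff is an assumed
    criterion; a frame counts as bonded when the distance is within it."""
    d = np.asarray(distances, dtype=float)
    occurrence = 100.0 * (d <= cutoff).mean()
    return d.mean(), d.std(), occurrence

# Hypothetical distances for the same hydrogen bond in two active sites.
site_distances = {
    "site 1": [2.8, 3.0, 3.2, 3.6],
    "site 2": [2.7, 2.8, 2.9, 3.0],
}
per_site = {name: hbond_stats(d) for name, d in site_distances.items()}
for name, (mean, std, occ) in per_site.items():
    print(f"{name}: {mean:.2f}({std:.2f}) Å, {occ:.1f}%")

# Spread of the mean distance across sites: a simple measure of how
# consistent a given hydrogen bond is between active sites.
spread = np.std([stats[0] for stats in per_site.values()])
```

A larger spread for the bonds around one residue than around another would correspond to the "more diverse" hydrogen bond strengths noted in the text.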
All the hydrogen bonds in Table 9-1 are similar to those found in the crystal structure, with the one exception that the Tyr*54 to Glu74 hydrogen bond, which spans two subunits, is longer in the crystal structure.

The hydrogen bond interactions after binding HP are listed in Tables 9-2 and 9-3, and a typical snapshot is displayed in Figure 9-9b. The results are in good agreement with those from the crystal structure. Compared with the apo form, the strong hydrogen bond between the amide and carboxyl of Glu74 is maintained, as well as the hydrogen bond between the side chain hydroxyl of Thr*51 and the Glu74 carboxyl group. The Glu74 carboxyl forms two new strong and stable hydrogen bonds with N2H1 and N3H of HP. This suggests that Glu74, which sits at the bottom of the active site and is well restrained by the hydrogen bonds with surrounding residues, acts as an anchor to fix the orientation of HP. The side chains of His*53 and Tyr*54 sandwich the HP pterin ring, forming a π-π stacking interaction in some active sites, while in others the His*53 imidazole ring is angled away. In the crystal structure, the Tyr*54 phenol is stacked with the pterin ring, but the His*53 imidazole is not. HP also forms hydrogen bonds with the carbonyl of Val*52 and the amide of Tyr*54 from the adjacent subunit.

The O4 atom of HP forms a stable hydrogen bond with a water molecule that is hydrogen bonded to the carbonyl of Asn71 and the side chain amine of Lys100 (Table 9-2, Figure 9-9b), which is similar to the bound water interaction found in the HP crystal structure. Since all crystallographic water molecules were removed before the start of the MD simulation, the appearance and consistent presence of water in this position in all the active sites suggests that it is a stable point for water, and that the simulation time was sufficient to organize proper active sites. All of the above-noted interactions that do not exist in the free form enzyme contribute to making the complex more stable and rigid.
It is sensible that HP binding rigidifies the region around residue 70 (Figure 9-3), considering that HP forms two hydrogen bonds with Glu74, one with Leu73, and one hydrogen bond network with Asn71 through the trapped water. Since this region is quite close to residues around 45 from the adjacent subunit (Figure 9-3), these interactions may also contribute to rigidifying the region around residue 45*. Further investigation indicates that the amides of Asn71 and Leu72 form a hydrogen bond network with the carboxyl of Asp*46 from the adjacent subunit (Table 9-4), which might be the reason that residues around 45* are also rigidified.

The strong hydrogen bonds between HP and Glu22 certainly stabilize the region around residue 22. Interestingly, the largest rigidity change due to HP binding comes from residue 25, which does not interact with HP directly. Ile25 shows great flexibility in the free form simulation, with an RMSF of 1.7 Å, which might occur because Ile25 is at the end of helix 1 (residues 20-24) and close to the beginning of beta sheet 3 (residues 27-37). In the bound form, the Ile25 RMSF is only about 0.9 Å. Further inspection shows that the carbonyl of Ile25 forms one hydrogen bond with the amide of Ala21 and one weak hydrogen bond with the amide of Glu22, while Glu24 forms two hydrogen bonds with Ser20 and Ala21 (Table 9-5). Clearly, the rigidity increase around Glu22 also stabilizes Glu24 and Ile25.

D. Product binding energy decomposition. The nature of the binding pocket can be defined by examining which residues contribute the most to stabilizing HP. As shown in Figure 9-9b, the direct interactions with HP typically include hydrogen bonds with the carboxyl groups of Glu22 and Glu74, the carbonyl oxygen of Val*52, and the amide nitrogens of Tyr*54 and Leu73.
In order to assess how the individual residues contribute to the binding of HP, the interaction energies between HP and residues within 5 Å of HP, averaged over the 2 ns trajectory and the eight active sites, are shown in Figure 9-10. These residues contribute about -80 kcal/mol of electrostatic energy (97% of the total over all residues) and about -23 kcal/mol of van der Waals energy (95% of the total over all residues). The highly conserved residues Glu22, Glu74 and Lys100 contribute about -68 kcal/mol of the electrostatic energy (82% of the total electrostatic energy over all residues) and clearly are essential to fix the orientation of HP within the active site. The electrostatic interaction (-35 kcal/mol) between Glu74 and HP is the largest among all individual residues, due in part to the two strong hydrogen bonds between the carboxyl group and HP. Glu22 contributes -11 kcal/mol of electrostatic stabilization in a similar way. Though Lys100 does not form a direct hydrogen bond with HP, the interaction being water-mediated, the positively charged amine group of Lys100 does have a strong interaction with the partial negative charges of O4 and N5 of HP, which contributes the second largest electrostatic energy of -23 kcal/mol. His*53 and Tyr*54 contribute -9.3 kcal/mol (37%) of the van der Waals interaction, because the HP pterin ring is sandwiched by the side chains of these two residues in four of the active sites. In the other four active sites, the pterin ring is tilted slightly and moves closer to Lys100 and Glu22. Tyr*54 is maintained in a similar position but His*53 moves away. This might be caused by the strong interaction between HP and Lys100 and Glu22. The solvent-HP interaction provides about -14 kcal/mol of electrostatic and -1 kcal/mol of van der Waals stabilization energy. A major contributor (6-7 kcal/mol) to this energy is the hydrogen bond between the HP O4 atom and the water bound to Lys100.

2. Exit path of HP from DHNA

A.
Pushing HP out of the active site: Fluctuation analysis. In order to investigate how the HP release process changes the structure and flexibility of DHNA, HP was pushed out by 8 Å in all the active sites. The pushing force was imposed between the backbone N of Arg118 and C10 of HP. The trajectory average RMSD of 1.56 Å is quite stable and slightly larger than the 1.42 Å of the regular HP complex simulation, indicating that no significant changes of the overall backbone structure occurred during the push. The average RMSD of the N of Arg118 is 1.29 Å during this pushing simulation, compared with 1.10 Å in the regular HP simulation. As shown in Table 9-6, the RMSF of this N atom in the eight different subunits increases slightly overall in the pushing MD simulation, with an average increase of ~0.07 Å. Both RMSD and RMSF measures show that the backbone N of Arg118 does not move much during the pushing simulation, which confirms that the restraint between this N atom and C10 of HP and the time scale used led to a sensible product release trajectory.

Figure 9-11 shows the per-residue RMSF comparison (based on superposition of individual subunits) between the pushing and regular MD simulations. The product release does not change the backbone protein structure and flexibility significantly, though slight changes occur around the flexible regions, residues 15-25, 45*-55*, 68-74 and 100-110, which may reflect the disturbance that the exit of HP creates in these regions. The total fluctuation of the CA atoms is 84.9 Å², larger than the 65.6 Å² of the regular HP complex simulation, but comparable with the 86.5 Å² of the apo enzyme simulation. This is consistent with the previous simulation result that product binding rigidifies the protein; therefore, the release of HP should increase the enzyme flexibility.

In order to understand how protein structure and flexibility change along the HP release trajectory, the data were also analyzed based on individual time windows.
The average protomer structure in each window (an average over all subunits over the simulation time window of 50 ps) was compared with the average protomer structure (average over all subunits over the 1.5 ns simulation time) in the regular HP complex simulation (Figure 9-12). The first window corresponds to the first 50 ps of the regular simulation. The CA RMSD changes from 0.2 Å to 0.4 Å during the first 2 Å of pushing, and then fluctuates around 0.4 Å. The initial window difference of 0.2 Å is a consequence of the 50 ps versus 1.5 ns averaging time. The 0.2 Å subsequent RMSD increase again illustrates the minimal overall protein disturbance, considering that the CA RMSD between the average structure of the regular HP complex and that of the apo form simulation is only 0.44 Å.

The CA fluctuation in each window is plotted along the exit path of HP in Figure 9-13. Unlike the average structure plot (Figure 9-12), the fluctuation increases from ~65 Å² to ~92 Å² along the HP exit path from 1 to 6 Å, and then stabilizes after 6 Å. Interestingly, the average fluctuation between 6 Å and 8 Å is slightly larger than the ~87 Å² of the apo form. This extra flexibility may be caused by the weak DHNA-HP interaction perturbing the energy surface of the protein and assisting it to access more conformation space. Note that the difference in fluctuation magnitude is probably not a sampling deficiency, though it is hard to get converged fluctuations in short time simulations, because the fluctuation in the first window of the HP push is 67.0 Å², which is consistent with the 65.6 Å² in the regular HP complex enzyme simulation.

The window RMSD calculations were also carried out on the CAs of the four fragments that correspond to the flexible regions: 15-25, 45*-55*, 68-74 and 100-110, as displayed in Figure 9-14.
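The window-based analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce the figures: it assumes that the "fluctuation" quoted in Å² is the sum over CA atoms of the mean-square deviation from the window average, that the coordinates are already superposed, and that the trajectory divides evenly into windows. The function name and the synthetic trajectory are hypothetical.

```python
import numpy as np

def window_analysis(traj, reference, frames_per_window):
    """traj: (n_frames, n_atoms, 3) superposed coordinates; reference: (n_atoms, 3).
    Returns, per window, the RMSD of the window-averaged structure from the
    reference and the total fluctuation (summed mean-square deviation)."""
    n_windows = traj.shape[0] // frames_per_window
    rmsd, fluct = [], []
    for w in range(n_windows):
        block = traj[w * frames_per_window:(w + 1) * frames_per_window]
        mean_structure = block.mean(axis=0)
        diff = mean_structure - reference
        rmsd.append(np.sqrt((diff ** 2).sum(axis=1).mean()))
        dev = block - mean_structure
        fluct.append((dev ** 2).sum(axis=2).mean(axis=0).sum())
    return np.array(rmsd), np.array(fluct)

rng = np.random.default_rng(2)
n_atoms, frames_per_window = 10, 500
reference = rng.normal(0.0, 5.0, size=(n_atoms, 3))
# Hypothetical trajectory: noise amplitude grows from window to window,
# mimicking a protein that becomes more flexible along the exit path.
sigmas = [0.3, 0.4, 0.5, 0.6]
traj = np.concatenate([
    reference + rng.normal(0.0, s, size=(frames_per_window, n_atoms, 3))
    for s in sigmas
])

rmsd, fluct = window_analysis(traj, reference, frames_per_window)
print(rmsd, fluct)
```

In this toy case the window-averaged structures stay close to the reference while the fluctuation rises, which is the same qualitative contrast drawn between Figures 9-12 and 9-13.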
The CA RMSD of residues 15-25 increases from 0.3 Å to 0.7 Å (Figure 9-14a) over the first 2 Å push, but then drops back to 0.3 Å around the distance 3.5 Å and thereafter fluctuates around 0.5 Å. This region may undergo a conformational change during the exit of HP associated with its strong interaction with Glu22. The CA RMSD of residues 45*-55* (Figure 9-14b) increases from 0.2 Å to 0.5 Å, and then fluctuates before increasing to 0.8 Å around distance 7-8 Å. This residue range includes Tyr*54, which is π-stacked with HP. The CA RMSD of residues 68-74 (Figure 9-14c) increases from 0.2 Å to 0.5 Å, and then goes back to 0.3 Å at the distance 6 Å, thereafter fluctuating around 0.4 Å. The RMSD of the CAs of residues 100-110 (Figure 9-14d) goes up to 0.5 Å, and then fluctuates.

While the RMSD data provide information about the drift of protein structure during the HP exit, the fluctuation data indicate how flexibility changes in these regions. Residues 15-25 show the largest CA fluctuation changes, from ~10 to ~22 Å² (Figure 9-15a), while residues 45*-55*, 68-74 and 100-110 show slight increases of flexibility along the path (Figure 9-15b, c, d). By summing the fluctuations in these four regions, as shown in Figure 9-15e, the overall fluctuation increases from ~30 Å² to ~50 Å², which contributes ~75% of the increase in protein flexibility. Therefore, the major increase in flexibility comes from these more mobile regions around the active site.

B. Pushing HP out of the active site: Energetic analysis. The local nature of the HP-DHNA interactions suggests that an exit path should show a relatively fast transition between bound and weakly bound behavior. Figure 9-16a plots the electrostatic interactions between HP and Glu22, Glu74 and Lys100. The local nature of the electrostatic interactions between HP and the protein noted in Section 1D is consistent with the fairly well defined transition behavior in this plot.
The magnitude of the electrostatic interaction decreases mainly due to the breaking of hydrogen bonds between Glu22, Glu74 and HP and the increase of distance between Lys100 and HP. Figure 9-16a also shows that the hydrogen bonds between HP and Glu22 and Glu74 break when HP is 2-3 Å away from its original bound position. Interestingly, the interaction between Glu22 and HP changes from -10 to 6 kcal/mol during the first 4 Å of HP exit, indicating that the motion of HP pushes this residue away. Then, this interaction energy decreases to -4 kcal/mol, which suggests that Glu22 favorably interacts with HP afterward.

Figure 9-16b displays the interaction energies of the flexible regions with HP. The interaction between HP and residues 15-25 shows a similar pattern to that between HP and Glu22, with a lower maximum (-2 kcal/mol) at a distance of 4 Å, and a strong interaction of -10 kcal/mol after 6 Å. This feature suggests that the interaction between HP and residues 15-25 can assist in breaking the hydrogen bond between HP and Glu22 and also that this region undergoes a conformational change to accommodate the release of HP. The trend in energy for residues 15-25 as a function of distance is consistent with the RMSD trend summarized by Figure 9-14a. The interaction energy between HP and residues 45*-55* fluctuates around -20 kcal/mol until the distance increases to 6 Å. The strong stabilizing van der Waals interaction as well as electrostatics between this region and HP (Figure 9-10) was maintained in the first 4 Å of pushing, with the residues moving with HP. Then, the van der Waals energy increases and the electrostatic energy drops to maintain the overall energy until about 6 Å. The residue regions 68-74 and 100-110 show similar interaction energy changes to those of Glu74 and Lys100, respectively.
Along the exit path, the interaction between HP and DHNA weakens from -108 to -33 kcal/mol while the HP-solvent interaction strengthens from -16 to -61 kcal/mol (mainly electrostatic energy), the latter compensating for the loss of interactions between HP and the enzyme (Figure 9-16b). The total interaction energy between HP and its environment along the product release path is plotted in Figure 9-17, which would imply a barrier to HP release of ~32 kcal/mol. That is a rather large barrier to product release; however, it does not account for the internal energy of HP. If the internal energy of HP along the path is also included (averaging the HP internal energy over the 50 ps windows), the ligand release barrier drops by ~7 kcal/mol to ~25 kcal/mol. The decrease of internal energy of HP helps to compensate for the loss of interaction between HP and the enzyme relative to its solvation energy. Note that the reaction coordinate is defined by the N of Arg118 to C10 of HP distance, so that a pushing distance of 0 Å corresponds to a separation of 17.5 Å. When HP is pushed out by restraining this distance, the Glu74 to HP distance, for example, does not necessarily respond immediately. Furthermore, various orientations are being sampled at each distance. That the energetic change does not occur immediately (between 0 and 2 Å) can be attributed to such effects. Most of the decrease in internal energy occurs between 2 and 4 Å and is a combination of the loss of interaction energy of HP with the protein and the release of strain energy of the bound product.

C. Pushing HP out of the active site: Free energy analysis. A binding constant is of course related to the free energy difference between bound and solvated states, and a kinetic constant for dissociation is related to the free energy barrier separating bound and unbound states. Above, we focused on the energy of pushing HP out of the binding pocket.
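The bookkeeping behind such a barrier estimate can be sketched in a few lines. The function name and all numbers below are invented placeholders chosen only to show the procedure; they are not the window-averaged energies from the simulations, and the sketch assumes the barrier is taken as the maximum of the summed energy relative to the bound (first) window.

```python
import numpy as np

def release_barrier(e_protein, e_solvent, e_internal=None):
    """Sum the window-averaged energy components along the exit coordinate and
    return (barrier, index of the maximum), relative to the first window."""
    total = np.asarray(e_protein, dtype=float) + np.asarray(e_solvent, dtype=float)
    if e_internal is not None:
        total = total + np.asarray(e_internal, dtype=float)
    relative = total - total[0]
    i_max = int(np.argmax(relative))
    return relative[i_max], i_max

# Placeholder profiles (kcal/mol) at pushing distances of 0, 2, 4, 6 and 8 Å.
distances = [0, 2, 4, 6, 8]
e_protein = [-50, -45, -20, -10, -8]    # ligand-protein interaction weakens
e_solvent = [-5, -6, -15, -30, -34]     # ligand-solvent interaction strengthens
e_internal = [0, -1, -4, -4, -4]        # ligand internal energy relaxes

barrier, i = release_barrier(e_protein, e_solvent)
barrier_int, j = release_barrier(e_protein, e_solvent, e_internal)
print(barrier, distances[i], barrier_int, distances[j])
```

Including the internal-energy term lowers the apparent barrier in this toy profile, which mirrors the direction of the correction described above without reproducing its magnitude.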
In principle, the exit pathway strategy pursued here could be used to obtain directly the free energy along the chosen coordinate. However, that would require an averaging period for each of the 40 windows that cannot be practically reached in current MD simulations. An approximate free energy profile may be obtained by including the entropy associated with HP along the exit path and the corresponding protein entropy. The HP entropy may be estimated by a quasi-harmonic analysis since, with the exception of the ring NH2 and the CH2OH tail, it is a quite rigid structure. The line in Figure 9-17 labeled as -TS is this vibrational contribution to the HP entropy. The ΔS of ~13 cal/mol-deg obtained in Section 1B for DHNA between bound and apo forms provides a less than 4 kcal/mol decrease in free energy for unbinding (for example, TΔS ≈ 3.9 kcal/mol at an assumed temperature of 300 K). Including these two entropic contributions leads to a barrier to HP release of ~15 kcal/mol around 4 Å.

Conclusions

The eight catalytic active sites of DHNA are formed by the noncovalent association of protomers. The presence of eight active sites permits a more efficient investigation by molecular dynamics of the binding site, and of how HP is released from DHNA. The reliability of the simulations was assessed based on RMSD from the crystal structures, the active site hydrogen bonding patterns, and the conformations of the active sites compared with those of the crystal structures. The RMSD and RMSF data confirm that the simulations maintain the basic DHNA conformation. The overall DHNA structure is quite rigid, and the binding of HP does not cause significant changes in the structure. By analyzing the trajectory data with the two RMSF methods denoted as average protomer RMSF and octamer RMSF, the more flexible regions of DHNA were identified. In particular, the four regions spanning residues 15-25, 45*-55*, 68-74 and 100-110 are the more flexible regions of apo DHNA. These regions include residues that are part of the active site.
The binding of HP rigidifies DHNA, not only in the above flexible regions but also in the more rigid regions that do not directly participate in HP binding. These conclusions rely on comparing the changes in residue RMSFs between the apo and HP-bound forms obtained with both the octamer and the average protomer methods; they would not have been evident from the average protomer method alone. The network of hydrogen bonds formed between HP and the protein, and also induced within the protein by HP, is shown schematically in Figure 9-9b and specified, for each dimer, in Tables 9-2 to 9-5. These interactions between HP and active site residues might explain the increase in rigidity of the enzyme when HP is bound.

The PCA results for apo and HP-bound DHNA complement those of the RMSF analyses. The rigidification, on average, of DHNA upon HP binding is evident in the total PCA variance of 86.5 Å² for the apo and 65.6 Å² for the bound form. The data in Figure 9-6 show that the first four modes not only capture a significant fraction of the total fluctuation in both the apo and bound forms, but also capture most of the effect of rigidification upon HP binding, suggesting that a limited number of residues are responsible for the change in DHNA flexibility upon HP binding. That picture is reinforced by the projections of the atom motions onto the first few principal components, as shown for the first mode in Figure 9-7, and leads to the identification of residues in the three regions 45*-55*, 68-74 and 100-110, which provide a contrast between the fluctuations in the apo and HP-bound forms, as displayed in Figure 9-8.

An energetic analysis of the residues that contribute to binding HP reveals that the binding is localized, as summarized by the data in Figure 9-10. It is remarkable that the HP binding energy is so local, even though it is dominated by the nominally long-range electrostatic interaction energy contribution.
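How local an electrostatic interaction is can be quantified by decomposing the ligand-protein Coulomb energy by residue and asking what fraction comes from residues within a distance cutoff (5 Å is the cutoff used in Figure 9-10). The sketch below assumes a bare Coulomb sum with no dielectric screening, cutoff scheme, or Ewald treatment, so it is not the protocol used in the simulations; the charges, coordinates, and names are hypothetical toy values.

```python
import math

# Conversion factor for charges in e and distances in Å, giving kcal/mol.
COULOMB_KCAL = 332.06


def coulomb_energy(atoms_a, atoms_b):
    """Sum of pairwise Coulomb energies (kcal/mol) between two atom groups.

    Each atom is (charge_in_e, (x, y, z)) with coordinates in Å.
    """
    total = 0.0
    for qa, ra in atoms_a:
        for qb, rb in atoms_b:
            total += COULOMB_KCAL * qa * qb / math.dist(ra, rb)
    return total


def min_distance(atoms_a, atoms_b):
    """Shortest interatomic distance (Å) between two atom groups."""
    return min(math.dist(ra, rb) for _, ra in atoms_a for _, rb in atoms_b)


def local_fraction(ligand, residues, cutoff=5.0):
    """Per-residue energies and the fraction of the total that comes from
    residues with at least one atom within `cutoff` Å of the ligand."""
    energies = {name: coulomb_energy(ligand, atoms)
                for name, atoms in residues.items()}
    total = sum(energies.values())
    near = sum(e for name, e in energies.items()
               if min_distance(ligand, residues[name]) <= cutoff)
    return energies, near / total


if __name__ == "__main__":
    # Hypothetical one-atom "ligand" and two one-atom "residues".
    ligand = [(1.0, (0.0, 0.0, 0.0))]
    residues = {"near": [(-1.0, (3.0, 0.0, 0.0))],
                "far": [(-1.0, (30.0, 0.0, 0.0))]}
    energies, fraction = local_fraction(ligand, residues)
    print(energies, round(fraction, 3))
```

In a real protein, contributions from distant residues of opposite sign tend to cancel, which is one way a nominally long-range interaction can still sum to a local binding energy.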
Residues within 5 Å of HP contribute about 97% of the total electrostatic and 95% of the total van der Waals energy. Just three residues, Glu22, Glu74 and Lys100, contribute most (82%) of the electrostatic energy, which emphasizes the importance of these residues for binding HP and for creating a catalytically competent structure. Glu74, which has the largest interaction with HP (-35 kcal/mol), sits at the bottom of the active site and may act as an electrostatic attractor for HP. Both Glu22 and Lys100 are implicated in the catalytic mechanism of DHNA. Note that Lys100 is not directly hydrogen bonded to HP; instead, one water molecule bridges Lys100 and O4 of HP and is also hydrogen bonded to Asn71. This water is found in the crystal structure. In the simulation, which is carried out in the absence of crystallographic waters, a water molecule from the solvent migrates to this position and persists in all eight active sites (Table 9-2). Its interaction with HP is about half of the total HP-water interaction energy, which indicates the importance of this water for binding HP. The hydrogen bond between Lys100 and the trapped water may also serve to decrease the pKa of the water molecule as part of the catalytic mechanism of DHNA. The electrostatic interaction energy between Lys100 and HP of -23 kcal/mol may be important for binding the pterin ring rather than for modifying the N5 pKa, as noted in a Raman spectroscopic study that found N5 to be deprotonated even at pH 6.5 (15).

The flexibility of DHNA is 87 Å² in the apo and 65 Å² in the HP-bound form, providing a measure of how HP binding increases the rigidity of DHNA, which complements the view obtained from the specific interactions detailed in this work. It would be possible, of course, for HP to rigidify DHNA locally but still lead to an overall increase in DHNA flexibility.
The increase in rigidity that is found is qualitatively related to an entropic loss upon binding that needs to be compensated by a sufficient increase in DHNA-HP interaction energy. It is interesting that, during the course of the HP exit path simulation, the flexibility of DHNA increases to 92 Å², which is even larger than the free form value. This “extra” flexibility may be due to the surface-bound HP interacting with the protein and helping it sample more conformational space. Much (75%) of the flexibility increase comes from the four flexible regions around the active site that have been identified in the simulation. Among them, residues 15-25 contribute the most, about 40%, to the increase. Since the binding of HP is so local, it is sensible that the flexibility decrease upon binding should also be concentrated in this active site region.

The direction and speed of pushing that we chose for the HP exit path did not significantly disturb the overall DHNA structure, as summarized in Figures 9-11 and 9-12. As noted above, HP is bound strongly by Glu74 at the bottom of the binding site, and examination of HP relative to its surroundings along the exit pathway shows that it moves out with the pterin ring first parallel to its bound position, followed by a gradual upward rotation as it exits. That the active site is formed by the noncovalent association of two protomers indicates that it is mainly HP-residue hydrogen bonds that are broken in the release process. The HP-DHNA interaction energies, as displayed in Figures 9-16 and 9-17, show a fairly abrupt transition in the 2-4 Å range, which, again, is consistent with the local binding energetics (Figure 9-10) mainly involving Glu22, Glu74 and Lys100. It appears (Figure 9-16a) that Lys100 must be pushed aside as HP exits the active site. The local binding energetics of HP may be important to the release mechanism because it lets HP leave easily once those local interactions are broken.
The large decrease in HP-DHNA stabilization energy during the exit of HP must be compensated if the barrier to HP release is not to become too large. Part of the compensation comes from the increase in the HP-solvent interaction, as shown in Figure 9-16b. At the same time, the internal energy of HP also decreases (Figure 9-17), which suggests that HP relaxes a conformationally based strain during the release. Thus, binding destabilizes HP itself, which may be viewed as a trade-off associated with positioning HP in the active site to permit catalytic activity.

There are, of course, entropic terms that contribute to the free energy barrier for HP release. The contribution associated with the internal fluctuations of HP along the exit path, as obtained from a quasi-harmonic analysis, is shown in Figure 9-17; it clearly lowers the barrier to HP release. In view of the enhanced freedom of HP when it is in the solvent, relative to its bound state, an entropic penalty of binding is expected. The change in protein entropy obtained from the PCA eigenvalues (based on the CA atoms) provides less than 4 kcal/mol of free energy stabilization in favor of the apo protein. This change in protein entropy is most likely an upper bound, because the initial state has HP bound in all sites and the final state has all sites free. If fewer than eight ligands were released, the change in protein entropy should be smaller. The experimental data (22) indicate that the rate constant for product release is ~10 s⁻¹, which is consistent with the barrier to HP release obtained here.

References

(1) Illarionova, V.; Eisenreich, W.; Fischer, M.; Haussmann, C.; Romisch, W.; Richter, G.; Bacher, A. J. Biol. Chem. 2002, 277, 28841.
(2) Henderson, G. B.; Huennekens, F. M. Methods Enzymol. 1986, 122, 260.
(3) Bermingham, A.; Derrick, J. P. Bioessays 2002, 24, 637.
(4) Sanders, W. J.; Nienaber, V. L.; Lerner, C. G.; McCall, J. O.; Merrick, S. M.; Swanson, S.
J.; Harlan, J. E.; Stoll, V. S.; Stamper, G. F.; Betz, S. F.; Condroski, K. R.; Meadows, R. P.; Severin, J. M.; Walter, K. A.; Magdalinos, P.; Jakob, C. G.; Wagner, R.; Beutel, B. A. J. Med. Chem. 2004, 47, 1709.
(5) Hennig, M.; Dale, G. E.; D'Arcy, A.; Danel, F.; Fischer, S.; Gray, C. P.; Jolidon, S.; Muller, F.; Page, M. G. P.; Pattison, P.; Oefner, C. J. Mol. Biol. 1999, 287, 211.
(6) Bauer, S.; Schott, A. K.; Illarionova, V.; Bacher, A.; Huber, R.; Fischer, M. J. Mol. Biol. 2004, 339, 967.
(7) Salzmann, M.; Pervushin, K.; Wider, G.; Senn, H.; Wuthrich, K. J. Am. Chem. Soc. 2000, 122, 7543.
(8) Lopez, P.; Lacks, S. A. J. Bacteriol. 1993, 175, 2214.
(9) Cox, T. F.; Cox, M. A. A. Multidimensional scaling, 2nd ed.; Chapman & Hall: Boca Raton, 2001.
(10) Amadei, A.; Linssen, A. B. M.; Berendsen, H. J. C. Proteins: Structure, Function, and Genetics 1993, 17, 412.
(11) Kollman, P. Chem. Rev. 1993, 93, 2395.
(12) Gilson, M. K.; Given, J. A.; Bush, B. L.; McCammon, J. A. Biophys. J. 1997, 72, 1047.
(13) Boresch, S.; Archontis, G.; Karplus, M. Proteins 1994, 20, 25.
(14) Boresch, S.; Tettinger, F.; Leitgeb, M.; Karplus, M. J. Phys. Chem. B 2003, 107, 9535.
(15) Deng, H.; Callender, R.; Dale, G. E. J. Biol. Chem. 2000, 275, 30139.
(16) Pearlman, D. A.; Case, D. A.; Caldwell, J. W.; Ross, W. S.; Cheatham, T. E.; Debolt, S.; Ferguson, D.; Seibel, G.; Kollman, P. Comput. Phys. Commun. 1995, 91, 1.
(17) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A.; Haak, J. R. J. Chem. Phys. 1984, 81, 3684.
(18) Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. J. Chem. Phys. 1995, 103, 8577.
(19) Ryckaert, J. P.; Ciccotti, G.; Berendsen, H. J. C. J. Comput. Phys. 1977, 23, 327.
(20) Lindahl, E.; Hess, B.; van der Spoel, D. Journal of Molecular Modeling 2001, 7, 306.
(21) Levy, R. M.; Karplus, M.; Kushick, J.; Perahia, D. Macromolecules 1984, 17, 1370.
(22) Yan, H. (unpublished results).
Appendices

Table 9-1 Active Site Interactions in Apo DHNA

Subunit: 1 2 3 4 5 6 7 8

Glu74-OE1...Thr*51-OGH
  2.71 2.70 2.68 2.76 2.68 2.71 2.68 2.71
  (0.13) (0.12) (0.13) (0.16) (0.13) (0.13) (0.12) (0.15)
  100% 100% 100% 100% 100% 100% 100% 100%
Glu74-OE2...Tyr*54-OH
  2.67 2.63 2.93 2.62 2.66 2.65 2.62
  (0.12) (0.10) (0.25) (0.10) (0.12) (0.12) (0.10)
  100% 87% 35% 100% 100% 100% 100%
Glu74-OE2...Leu73-NH
  3.23 3.21 3.24 3.28 3.27 3.24 3.17 3.25
  (0.16) (0.16) (0.16) (0.15) (0.16) (0.15) (0.16) (0.15)
  72% 82% 61% 41% 37% 75% 76% 63%
Glu74-OE2...Glu74-NH
  2.78 2.76 2.76 2.78 2.77 2.78 2.77 2.76
  (0.09) (0.08) (0.09) (0.10) (0.09) (0.09) (0.09) (0.09)
  100% 100% 100% 100% 100% 100% 100% 100%
Glu22-OE1...Lys100-NZ (a)
  2.77 2.80 2.74 2.90 2.93 2.75 2.74 2.86
  (0.11) (0.12) (0.12) (0.18) (0.22) (0.10) (0.09) (0.16)
  100% 98% 100% 100% 77% 100% 100% 99%
Glu22-OE2...Ala18-NH
  3.12 3.05 2.96 2.88 3.06 3.12 2.96 3.01
  (0.20) (0.21) (0.18) (0.15) (0.22) (0.19) (0.17) (0.20)
  51% 60% 81% 97% 100% 100% 91% 78%
Glu22-OE2...Leu19-NH
  3.26 3.04 3.13 2.99 2.98 2.98 3.15
  (0.16) (0.18) (0.19) (0.16) (0.16) (0.15) (0.18)
  34% 91% 36% 93% 99% 99% 73%
Glu22-OE1...Gln27-NEH
  2.94 2.88 2.90 2.82 2.81 2.80 2.90
  (0.17) (0.15) (0.17) (0.12) (0.11) (0.11) (0.15)
  96% 97% 53% 99% 100% 100% 97%

(a) Only distance criterion was used for the salt bridge analysis.

Table 9-2 Hydrogen Bonds between HP and Its Surroundings in HP Complex (a)

Subunit: 1 2 3 4 5 6 7 8

N1...Tyr*54-NH
  3.15 3.25 3.21 3.25 3.23
  (0.16) (0.15) (0.15) (0.15) (0.15)
  94% 31% 84% 25% 26%
N2H1...Glu74-OE1
  2.80 2.80 2.80 2.75 2.81 2.78 2.80 2.80
  (0.10) (0.10) (0.10) (0.09) (0.10) (0.10) (0.11) (0.10)
  100% 100% 100% 100% 100% 100% 100% 100%
N2H2...Val*52-CO
  3.06 3.15 3.07 2.97 3.04 3.13 3.04 3.08
  (0.19) (0.20) (0.20) (0.15) (0.18) (0.19) (0.18) (0.20)
  87% 47% 61% 98% 95% 59% 95% 52%
N3H...Glu74-OE2
  2.79 2.80 2.80 2.99 2.91 2.81 2.83 2.79
  (0.08) (0.09) (0.09) (0.17) (0.14) (0.10) (0.13) (0.09)
  100% 100% 100% 98% 100% 100% 99% 100%
O4...Leu73-NH
  2.96 2.99 2.98 3.17 2.97 3.03 3.03 2.90
  (0.15) (0.16) (0.16) (0.18) (0.16) (0.16) (0.17) (0.13)
  99% 99% 97% 81% 97% 98% 95% 100%
1'OH...Glu22-OE1
  2.79 2.62 2.80 2.64 2.67 2.69 2.75 2.63
  (0.23) (0.12) (0.26) (0.11) (0.17) (0.22) (0.23) (0.10)
  91% 100% 82% 95% 88% 67% 73% 100%
Glu22-OE2...1'OH
  2.94 3.17 2.93 3.14 3.07 2.77 2.97 3.24
  (0.27) (0.19) (0.30) (0.35) (0.34) (0.31) (0.26) (0.16)
  78% 36% 69% 18.4% 57% 45% 54% 54%
O4...H2O
  2.85 2.80 2.88 2.80 2.85 2.79 2.77 2.86
  (0.22) (0.18) (0.20) (0.20) (0.22) (0.17) (0.16) (0.19)
  103% 100% 60% 106% 106% 94% 94% 91%
Asn71-O...H2O
  2.85 2.84 3.05 2.74 2.84 2.88 2.79 2.85
  (0.20) (0.19) (0.25) (0.13) (0.19) (0.18) (0.17) (0.19)
  90% 99% 11% 100% 86% 77% 97% 96%
Lys100-NZ...H2O
  2.86 2.92 2.95 2.86 2.88 2.88 2.86 2.87
  (0.14) (0.15) (0.19) (0.11) (0.14) (0.13) (0.13) (0.12)
  98% 98% 61% 100% 85% 100% 100% 100%
1'OH...H2O
  2.85 2.91 2.90 2.85 2.91 3.21 2.83 2.80
  (0.22) (0.22) (0.22) (0.18) (0.21) (0.22) (0.19) (0.17)
  101% 68% 136% 55% 35% 8% 69% 113%

(a) Some percentages are greater than 100% since more than one atom can hydrogen bond at a given time.

Table 9-3 Active Site Interactions in HP Complex

Subunit: 1 2 3 4 5 6 7 8

Glu74-OE1...Thr*51-OGH
  2.64 2.65 2.68 2.98 2.83 2.64 2.68 2.65
  (0.10) (0.11) (0.11) (0.26) (0.21) (0.10) (0.14) (0.11)
  100% 100% 100% 63% 95% 100% 100% 100%
Glu74-OE2...Tyr*54-OH
  2.90 3.11 2.96
  (0.23) (0.25) (0.24)
  75% 74% 11%
Glu74-OE2...Leu73-NH
  3.29 3.32 3.30 3.29 3.27 3.31 3.29 3.32
  (0.13) (0.12) (0.13) (0.14) (0.14) (0.13) (0.16) (0.13)
  56% 43% 51% 55% 59% 46% 54% 40%
Glu74-OE2...Glu74-NH
  2.74 2.76 2.78 2.86 2.80 2.76 2.78 2.75
  (0.08) (0.08) (0.09) (0.11) (0.10) (0.08) (0.09) (0.08)
  100% 100% 100% 100% 100% 100% 100% 100%
Glu22-OE1...Lys100-NZ
  2.84 2.84 2.90 2.88 2.88 2.86 2.80 2.82
  (0.18) (0.15) (0.22) (0.16) (0.20) (0.18) (0.12) (0.13)
  67% 99% 22% 91% 100% 100% 100% 100%
Glu22-OE2...Ala18-NH
  2.94 2.98 2.92 2.97 2.96 2.90
  (0.17) (0.19) (0.16) (0.18) (0.18) (0.14)
  32% 62% 91% 86% 83% 98%
Glu22-OE2...Leu19-NH
  3.09 3.12 3.06 3.16 3.03 3.01
  (0.22) (0.19) (0.19) (0.19) (0.18) (0.17)
  21% 65% 88% 60% 88% 96%
Glu22-OE1...Gln27-NEH
  2.91 3.02 2.87 2.84 2.84 2.86 2.89
  (0.17) (0.20) (0.14) (0.14) (0.12) (0.14) (0.15)
  97% 39% 87% 95% 93% 99% 98%

Table 9-4 Hydrogen Bonds between Asp*46, Asn71, and Leu72 in HP Complex

Subunit: 1 2 3 4 5 6 7 8

Asp*46-OD1...Asn71-ND2H1
  2.85 2.85 2.91 2.87 2.84 2.86 2.87 2.86
  (0.14) (0.14) (0.16) (0.15) (0.14) (0.14) (0.15) (0.14)
  96% 88% 43% 81% 99% 89% 96% 94%
Asp*46-OD1...Asn71-NH
  3.17 3.13 3.06 3.12 3.12 3.09 3.09 3.08
  (0.18) (0.18) (0.18) (0.18) (0.18) (0.18) (0.18) (0.18)
  85% 91% 96% 92% 88% 93% 95% 95%
Asp*46-OD2...Asn71-NH
  2.92 2.98 2.94 2.91 2.94 2.97 2.93 2.98
  (0.14) (0.17) (0.15) (0.13) (0.15) (0.16) (0.14) (0.16)
  100% 99% 100% 100% 100% 100% 99% 99%
Asp*46-OD2...Leu72-NH
  2.95 2.95 2.95 2.93 2.98 2.95 2.99 2.96
  (0.15) (0.19) (0.15) (0.14) (0.16) (0.14) (0.16) (0.15)
  95% 99% 99% 100% 99% 99% 99% 99%

Table 9-5 Hydrogen Bonds with Glu24 and Ile25 in HP Complex

Subunit: 1 3 4 5 6 7 8

Ser20-O...Glu24-NH
  3.05 3.00 3.10 3.00 3.17 3.06 3.19
  (0.19) (0.19) (0.19) (0.17) (0.19) (0.19) (0.19)
  88% 86% 77% 96% 50% 84% 61%
Ala21-O...Glu24-NH
  3.17 3.20 3.13 3.22 3.12 3.11 3.04
  (0.18) (0.18) (0.18) (0.18) (0.18) (0.18) (0.17)
  65% 60% 75% 56% 83% 85% 94%
Ala21-O...Ile25-NH
  3.17 3.00 3.07 3.24 3.18 3.07 3.08
  (0.18) (0.17) (0.18) (0.16) (0.18) (0.19) (0.18)
  57% 83% 79% 46% 58% 84% 90%
Glu22-O...Ile25-NH
  3.22 3.25 3.32 3.30
  (0.17) (0.19) (0.14) (0.15)
  53% 11% 23% 15%

Table 9-6 RMSF of Arg118 Backbone N Atom in the Regular and Pushing MD Simulations

Subunit: 1 2 3 4 5 6 7 8

Regular MD (Å)
  0.47 0.49 0.48 0.48 0.51 0.45 0.52 0.54
Pushing MD (Å)
  0.50 0.61 0.51 0.64 0.56 0.61 0.56 0.50

[Plot legend: HP complex; apo form; derived from B-factor. Panels (a) and (b).]
Figure 9-1 CA atom RMSDs from the simulations and corresponding crystal structure B-factors. (a) Apo form. (b) HP bound form.

[Plot legend: octamer; monomer; difference. x-axis: Residue. Panels (a) and (b).]
Figure 9-2 Comparison of the RMSFs derived from the average protomer RMSF and the octamer RMSF (see text for the difference in these RMSFs). (a) Apo form. (b) HP bound form.

[Structure image; labeled regions: (1) 15-25, (2) 68-74, (3) 100-110, (4) 45*-55*.]
Figure 9-3 A representative snapshot of two adjacent subunits of apo DHNA from the MD simulation. HP is placed for the identification of the active site. One subunit is colored in green, and the other in red. The regions corresponding to large RMSF differences are highlighted in gold color. For clarity, only one set of the regions is highlighted.

[Plot legend: octamer; monomer. x-axis: Residue. Panels (a) and (b).]
Figure 9-4 (a) RMSF differences between the apo DHNA and HP complex simulations. The differences based on the average protomer and octamer RMSF methods are displayed with solid and dashed lines, respectively. (b) RMSF differences between the apo DHNA and HP complex simulations excluding protomer 6, based on the octamer RMSF method.

[Plot axes: Dihedral Angle (degree) versus Time (ns); two panels.]
Figure 9-5 Time evolution of the ψ dihedral angle of Asp50 and the φ of Thr51 in the eight subunits displayed sequentially as one trajectory.
[Plot legend: apo enzyme; HP complex. Axes: Fluctuation (Å²) versus Mode Number.]
Figure 9-6 Principal Component Analysis eigenvalues of the first 20 modes in the apo and HP complex simulations. The total fluctuation over all the modes is 96.6 Å² for the apo and 67.2 Å² for the complex simulation.

[Plot legend: apo form; HP complex. Axes: RMSF (Å) versus Residue Number.]
Figure 9-7 The CA atom RMSF projections onto the first principal component eigenvector for the apo and HP bound forms. The enhanced fluctuation regions of apo versus HP complex are evident and similar to those displayed in Figure 9-4.

Figure 9-8 The CA distances d1 (Tyr*54 to Ile105), d2 (Tyr*54 to Ala69) and d3 (Ala69 to Ile105), parametric on the trajectory, for the apo (blue) and HP bound (red) forms. There is a major and a minor component for the apo form. The HP bound form distance fluctuations are smaller and fairly well confined to a part of the apo major state volume.

[Schematic drawings of the active site hydrogen bond networks; panels (a) and (b).]
Figure 9-9 Active site interactions based on the MD simulations of apo enzyme (a) and HP complex (b). Residues in one protomer are distinguished from the other protomer by "*".

[Plot axes: Electrostatic (kcal/mol) in panel (a); van der Waals (kcal/mol) in panel (b).]
Figure 9-10 (a) Electrostatic and (b) van der Waals interaction energies between HP and all residues within 5 Å of HP. The energies are averages over the 2 ns simulation and the protomers.

[Plot legend: regular MD; push MD; difference. x-axis: Residue Number.]
Figure 9-11 CA RMSF comparison between the push and regular simulations, indicating the small disturbance in the protein fluctuations from the exit of HP for the chosen force constant and pulling rate.

[Plot axes: RMSD (Å) versus Distance (Å).]
Figure 9-12 CA RMSD of the 50 ps averaged structure in each window relative to the average structure in the HP complex regular simulation.

[Plot x-axis: Distance (Å).]
Figure 9-13 Total CA fluctuation of DHNA along the exit trajectory of HP, showing a range of protein fluctuations that span the bound to free form results.

[Plot x-axis: Distance (Å); panels (a) to (d).]
Figure 9-14 CA RMSD of the flexible regions of the 50 ps averaged structure in each window relative to the average fragments in the HP complex regular simulation. (a) Residues 15-25. (b) Residues 45*-55*. (c) Residues 68-74. (d) Residues 100-110.

[Plot x-axis: Distance (Å); panels (a) to (e).]
Figure 9-15 CA fluctuations of flexible regions along the exit path of HP. (a) Residues 15-25. (b) Residues 45*-55*. (c) Residues 68-74. (d) Residues 100-110. (e) Sum of (a), (b), (c), and (d).
[Plot x-axis: Distances (Å); panels (a) and (b).]
Figure 9-16 The interaction energies between HP and its surroundings over the exit pathway. (a) Electrostatic interactions between HP and Glu22, Glu74 and Lys100. (The solvation energy of the starting point (-14.5 kcal/mol) was set to zero to give a better view for this figure.) (b) Overall interactions between HP and residues 15-25, 45*-55*, 68-74, 100-110, solvent and the other parts of the protein.

[Plot legend: HP interaction energy; HP internal energy; sum of energies - TS; -TS. Axes: Energy (kcal/mol) versus Distances (Å).]
Figure 9-17 The intramolecular energetic and entropic contributions of HP, and its interaction energy with the environment and solvent, along the exit pathway.

Summary

The mechanism of the reaction catalyzed by yCD and the dynamics of the enzyme were studied with experimental methods (NMR and quench-flow) and computational methods (MD and ONIOM). A complete reaction mechanism was proposed. Cytosine deamination proceeds via a sequential mechanism involving the protonation of N3, the nucleophilic attack on C4 by the Zn-coordinated hydroxide, and the cleavage of the C4-N4 bond. Two products, ammonia and uracil, are formed while the O4 of uracil is covalently linked to Zn. Thereafter, ammonia exchanges with bulk water, and then the Zn-O4 bond is cleaved through a five-coordinate Zn complex transition state (the fifth coordinating atom is the carboxyl oxygen of Glu64). Along the reaction path, Glu64 is essential: it acts as a proton shuttle, transferring protons from the Zn-bound water to cytosine during catalysis, and it substitutes for uracil in coordinating Zn during uracil release. The protein is quite rigid along the whole reaction path, and the active site stays completely buried, based on the MD simulation studies.
The rigidity of the protein is confirmed by an NMR relaxation study (which describes motions on the ps-ns time scale) of the backbone amide groups in the apo form and in the complex with the transition state analog 5FPy. The quench-flow kinetics study shows that product release is the rate-limiting step in the activation of 5FC. The slow release of the product was also studied by NMR, which gives a release rate of 13 s⁻¹, though the dissociation constant Kd is only 22 mM. The opening of the active site apparently limits the release of the product. Two ligand release paths were studied by using a steered MD method. Path 1 lies between the C-terminal helix and the F114 loop, which are required to move during the ligand release. Along this release path, the disturbance of the overall protein is rather small. Path 2 lies between the C-terminal helix and loop 1. When the ligand is released along this path, the rearrangement of the overall protein is much larger than that in the path 1 release. It appears that path 1 is more favorable than path 2, based on the MD simulation study.

The dynamics of yCD were also studied in the apo form and in the 5FPy and 5FU complexes by using an NMR HD exchange method, which describes motion on the sec-min time scale. The results show that in the apo form and the 5FU complex the C-terminal helix, the F114 loop and loop 1 are more flexible than in the 5FPy complex. This suggests that yCD is ready to bind the substrate in the apo form and release the product in the product complex form, but is quite rigid during the chemical reaction process. It is striking that even though the apo form and the transition state analog complex have identical crystal structures, the dynamics of these two are quite different. The HD exchange study also confirms the product release path proposed on the basis of the MD simulation.

The catalytic mechanism of GD was studied by using the MD and ONIOM methods. The proposed mechanism is similar to but not the same as that of yCD.
Two residues, Glu55 and Asp114, act as the proton shuttle to transfer protons from a Zn-bound water to guanine.

The dynamical properties of DHNA were studied in the apo form and in the form bound to the product, HP, by using the MD method. The binding of HP rigidifies the protein's active site. The HP exit pathway was studied by using steered MD. The chosen pathway leads to minimal structural disturbance of the protein. The analysis of the various components that contribute to the product release free energy provides an insight into the energy-entropy compensation during this process.

Future work

The product release path from yeast cytosine deaminase has been studied by molecular dynamics. Future work on this system will focus on experimental mutagenesis studies of residues that the MD simulations suggested contribute to the product release process. The experimental studies would not only complement the computational results but could also improve the enzyme activity, since product release is the rate-limiting step in the activation of 5FC.

In the guanine deaminase ONIOM and MD study, two residues, Glu55 and Asp114, were proposed to be important for catalysis. Experimental mutagenesis studies of these two residues would be very interesting and could possibly confirm the mechanism proposed on the basis of the ONIOM calculation.

A two-step method combining MD and ONIOM was introduced to study the Zn-O4 bond cleavage step in the uracil-yCD complex. While certainly approximate for obtaining a potential of mean force along a bond-breaking reaction coordinate, it does permit the use of a high-level ONIOM method with an ensemble of structures obtained from MD simulations at a realistic computational cost. The method could be extended to study other reactions where a bond breaking/making reaction coordinate can be identified.