PART I: CONFORMATIONAL ENERGY MINIMIZATIONS OF γ-CHYMOTRYPSIN.
PART II: THE REFINEMENT AND STRUCTURE OF α-CHYMOTRYPSIN AT 1.67 Å RESOLUTION.

By
Richard Alan Blevins

A DISSERTATION
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY
Department of Chemistry
1984

ABSTRACT

The three-dimensional structure of the proteolytic enzyme γ-chymotrypsin has been studied with conformational energy minimization techniques. The studies addressed questions of structural stability, protein-solvent interactions, and protein-protein aggregation. Upon energy minimization, the largest perturbation of the crystallographically observed structure occurs at the surface of the protein, specifically at the dimer interface residues found in the alpha form of the enzyme. The larger changes in the interface residues are significant because of their implications for protein-protein and protein-solvent interactions. The structure of γ-chymotrypsin surrounded by solvent has also been refined. The protein-solvent system was modeled using the Ferro and Hol "mobile solvation layer plus ice lattice model" of bulk solvent. A mechanical, non-thermodynamic surface tension was calculated for the γ-chymotrypsin monomer. Results indicate that the region of the monomeric γ-chymotrypsin protein corresponding to the interface of dimeric α-chymotrypsin possesses a mechanical surface tension approximately twice that for the global protein exterior.
These results suggest the possibility of predicting protein-protein interface sites when component structures are known but aggregates are not.

The structures of the two independent molecules of the α-chymotrypsin dimer have been refined using Hendrickson's PROLSQ refinement program. The refinement was initiated using an exact two-fold structure, the coordinates of which were obtained from a model fitted to a 2-fold averaged electron density map. The trial structure calculated well at 3.0 Å resolution, the conventional R-factor being 0.364. Manual interventions were also performed using FRODO on an Evans and Sutherland PS300 interactive computer graphics system. A total of 247 probable water molecules were located and 97 cycles of least squares refinement were performed, giving a final R-factor of 0.179. The final structure of the α-chymotrypsin dimer has a root mean square asymmetry of 0.24 Å for the main chain atoms and 0.64 Å for the side chains. The total r.m.s. shift from the trial structure is 0.50 Å and 1.02 Å for main and side chain atoms, averaged between the two molecules. Most of the asymmetry resides in the configurations and conformations of the side chain atoms.

To my entire family for their support and encouragement throughout my many years of education.

ACKNOWLEDGMENTS

For their guidance, support and friendship during this study, I wish to extend my sincere gratitude to Dr. Alexander Tulinsky and Dr. Paul Hunt. The author wishes to thank Dr. B.A. Karcher, Dr. C.E. Buck, Dr. M. Ragazzi and J.R. Zimmermann for stimulating discussions and contributions during all aspects of this work.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

PART I: CONFORMATIONAL ENERGY MINIMIZATIONS OF γ-CHYMOTRYPSIN

CHAPTER I. INTRODUCTION
CHAPTER II. CONFORMATIONAL ENERGY CALCULATIONS
    A. Refinement Strategies
    B. γ-Chymotrypsin Refinements
    C. The Extended Atom Implementation
    D. Parameter Choices
    E. Preparatory Steps Before Refinement
CHAPTER III. THE MOBILE SOLVATION LAYER PLUS ICE LATTICE MODEL
    A. The Model
    B. Generation of Trial Structures
CHAPTER IV. CHYMOTRYPSIN REFINEMENT RESULTS
    A. Series I-a, I-b, II-a
    B. Crystallographic Analysis
CHAPTER V. PROTEIN-PROTEIN ASSOCIATION AND MECHANICAL SURFACE TENSION
    A. Mechanical Surface Tension Calculations
    B. Surface Tension Results

PART II: THE REFINEMENT AND STRUCTURE OF α-CHYMOTRYPSIN AT 1.67 Å RESOLUTION

CHAPTER VI. INTRODUCTION
    A. Refinement Methods Based on X-Ray Data
    B. PROLSQ - Restrained Least Squares Refinement
        1. Structure Factors
        2. Bond Distances
        3. Planar Groups
        4. Chiral Centers
        5. Non-bonded Contacts
        6. Torsion Angles
    C. Graphics Intervention and FRODO
CHAPTER VII. REFINEMENT OF THE α-CHYMOTRYPSIN DIMER
    A. Experimental
    B. Refinement Summary
CHAPTER VIII. RESULTS OF THE LEAST SQUARES REFINEMENT
    A. The Independent Molecules
    B. Solvent Structure
    C. Dimer Asymmetry
    D. The Active Site
    E. The ILE-16, ASP-194 Ion Pair
    F. The Specificity Site
    G. The TRP Cluster
    H. Side Chain Asymmetry
CHAPTER IX. ENERGETIC ANALYSIS
CHAPTER X. COMPARISON OF THE INDEPENDENT MOLECULES OF α-CHT WITH γ-CHT
    A. Active Sites
    B. Specificity Sites
    C. The TRP Clusters
    D. Hydrogen Bonding
    E. Solvent Structure
    F. Concluding Remarks

LIST OF REFERENCES
APPENDIX A: Extended Atoms and their Non-Bonded Parameters
APPENDIX B: Residues found in Specific CHT Regions
APPENDIX C: Variable Dihedral Angles in the α-CHT Dimer
APPENDIX D: Hydrogen Bonds in the α-CHT Dimer
APPENDIX E: Solvent Molecule Positions in the α-CHT Dimer
APPENDIX F: Protein-Solvent Hydrogen Bonds in the α-CHT Dimer
APPENDIX G: Polar Protein Atoms - Solvent Interactions in the α-CHT Dimer

LIST OF TABLES

1. Fractional Charges in Active Site Region.
2. Sample Weighting Scheme Used in Energetic Refinements.
3. Final Energies of the Refined γ-CHT Structures (kcal/mole).
4. R.M.S. Deviations from the Observed γ-CHT Structure for the Energetic Refinements.
5. Tau-Angle Distributions for the Energetic Refinements of γ-CHT.
6. Omega-Angle Distributions (absolute value) for the Energetic Refinements of γ-CHT.
7. R.M.S. Deviations Between Energetically Refined γ-CHT Structures.
8. R-Factors for the γ-CHT Energy Refinements.
9. Surface Area and Mechanical Surface Tension Results.
10. Summary of Least Squares Parameters and Deviations.
11. Dihedral Angles of Disulfide Bridges of the Independent Molecules of α-CHT.
12. Asymmetry of Hydrogen Bonding in α-CHT.
13. Average Thermal Parameters for the α-CHT Dimer (Å²).
14. R.M.S. Asymmetry for the α-CHT Dimer.
15. Dimer Interface Interactions in α-CHT.
16. Hydrogen Bonds in the Catalytic Sites a.) Involving Protein Atoms, b.) Involving Solvent Atoms.
17. Two-Fold Symmetric Water Molecules in the α-CHT Dimer.
18. Asymmetry Classified by Residue Type in the α-CHT Dimer.
19. Calculated Energies and Surface Areas for the α-CHT Dimer.
20. Transformations Relating γ-CHT to α-CHT.
21. R.M.S. Differences between the Independent Molecules of α-CHT and γ-CHT.
22. Main Chain Hydrogen Bond Differences with Respect to γ-CHT.
23. Equivalent Water Molecules (within 1.0 Å) in both α-CHT and γ-CHT.

LIST OF FIGURES

1. Extended Atom Hydrogen Bond Potential Energy.
2. Summary of Ice Lattice Generation and Final Results.
3. Slices through x-y Plane for the Cubic and Face Centered Cubic Protein Plus Ice Lattice Systems. Movable solvent shaded; cubic lattice, top; face centered cubic lattice, bottom.
4. Stereoview of CA Atoms of the γ-CHT Monomer. Dimer interface residues shaded.
5. Ramachandran Plot for the Final Structure Resulting from Refinement Series I-a. Glycines represented as circles.
6. Examples of FRODO Graphical Displays. (a) stick diagram, (b) stick diagram plus electron density, (c) van der Waals surface.
7. Progress of R-Factor During Refinement. The resolution stages and FRODO interventions are indicated.
8. Progress of Asymmetry Development and Shifts During Refinement. Diamonds and squares, main and side chain asymmetry; triangles and circles, main and side chain shifts with respect to the trial structure, respectively.
9. Variation of R-Factor with Scattering Angle. Triangles 3.0 Å, squares 2.5 Å, diamonds 2.0 Å, and inverted triangles 1.67 Å resolution; broken lines are theoretical curves for 0.15, 0.18 and 0.20 Å coordinate error.
10. Omega-Angle Distribution. Molecule 1, top and molecule 2, bottom.
11. Ramachandran Plots of α-CHT. Molecule 1, top; molecule 2, bottom; GLY not included.
12. Histograms of α-CHT Hydrogen Bond Distances and Donor-Hydrogen-Acceptor Angles. Molecule 1, left; molecule 2, right.
13. Distribution of Some Side Chain Conformational Angles. (a) SER, (b) THR, (c) VAL, (d) ILE and LEU.
14. R.M.S. Thermal Parameters of α-CHT. Main chain, solid; side chain, broken; molecule 1, top; molecule 2, bottom.
15. Distribution of Occupancies (top) and Thermal Parameters (bottom). Occupancies greater than one were set to one during refinement.
16. Distribution of Solvent-Protein (top) and Solvent-Solvent (bottom). All solvent-solvent minimum distances >5.0 Å grouped together.
17. R.M.S. Asymmetry Between Individual Molecules of α-CHT. Only atoms with B <23.0 Å² are included; main chain, solid; side chain, broken; a - dimer interface regions, b - dyad B regions near noncrystallographic 2-fold axis between dimers, c - external turns.
18. Stereo CA Plots of the α-CHT Dimer. Top - view down XO (local 2-fold axis), bottom - view down YO (2-fold axis can be seen).
19. Stereoview of Representative Surface Asymmetry. Residues 172-179; molecule 2, bold.
20. Stereoview of Overall Asymmetry of α-CHT. Viewed down local 2-fold axis designated by asterisk; side chains shown only if r.m.s. asymmetry >0.5 Å and <23.0 Å; main chain atoms corresponding to these residues are also shown.
21. Stereoview of Typical Dimer Interface Asymmetry. Residues 35-41; molecule 2, bold.
22. Stereoview of Catalytic Residues of Independent Molecules of α-CHT. HIS-57, ASP-102, SER-195; molecule 2, bold.
23. Stereoview of Active Site Regions of α-CHT. Included are solvent and TYR-146 of the other molecule. Molecule 1, top; molecule 2, bottom. Solvent common to both shaded.
24. Stereoview of Specificity Site Regions of α-CHT. Included are solvent; molecule 1, top; molecule 2, bottom. Symmetric waters shaded.
25. Progress of Dimerization Energy during Refinement. Resolution stages are indicated.
26. R.M.S. Differences between α-CHT and γ-CHT. Only atoms with B <23.0 Å² (α-CHT) and <15.0 Å² (γ-CHT) included; molecule 1, top; molecule 2, bottom; main chain, solid; side chain, broken; intermolecular contacts in α- and γ-CHT.
27. Stereoviews of Superpositions of the Catalytic Site Regions of α-CHT and γ-CHT. Molecule 1 - γ-CHT, top; molecule 2 - γ-CHT, bottom. γ-CHT, bold in each case.

PART I: CONFORMATIONAL ENERGY MINIMIZATIONS OF γ-CHYMOTRYPSIN
CHAPTER I

INTRODUCTION

The application of energy minimization techniques to the study of protein conformations was pioneered by Scheraga and coworkers¹ and Lifson and coworkers.² Energy minimization techniques have been used in theoretical studies of protein folding,³ intramolecular motion in proteins and nucleic acids⁴ and the energetics of activated processes in proteins.⁵,⁶

Atomic coordinates derived from x-ray diffraction studies of protein crystals are subject to uncertainties which can be of the order of several tenths of an Angstrom. If the x-ray structure is to be used to study enzyme-substrate or enzyme-enzyme interactions, the protein must be in a low energy conformation so that these types of interactions may be examined. Conformational energy calculations are proving useful in elucidating how interatomic interactions dictate stable conformations of polypeptides and proteins, along with their intermolecular complexes.

Rigorous quantum mechanical methods, ab initio and semiempirical, have increased in power over the last few decades. The size of problems of biological interest, however, requires the use of the most elementary model empirical energy functions. The basic assumption of elementary energy calculations is that one may replace the Born-Oppenheimer energy surface by a computationally convenient sum of analytical functions. In most cases, the potential energy function is chosen as a sum of approximate strain energies and non-bonded interactions.

A basic framework has emerged from this early work. First, a suitable energy function that will describe the potential energy of the molecule accurately is found. Second, using the above energy function, the conformation of the molecule is adjusted to achieve a stable equilibrium structure. An assumption inherent in the above approach is that the biologically active protein conformation does possess a low potential energy.
Several effects are neglected in empirical energy functions that are routinely used. The first is the anharmonic form of the potentials far from the potential minima. Second, the non-bonded interactions employed neglect charge-induced dipole terms and three-body polarization effects.¹⁰ Most of the algorithms employed today reflect a balance between computational efficiency and accuracy.

Various types of molecular force fields have been proposed for use in the study of biological macromolecules. Most agree in general respects; however, there are differences in detail and the numerical values of the parameters may show large variations. Although energy functions applied to smaller molecules have usually included hydrogen atoms explicitly, those for polypeptides sometimes exclude them in order to reduce the number of free variables. Some or all of the hydrogen atoms are merged with their attached atoms to form "extended atoms". Loss in precision is sure to result from this type of approximation, but a quantitative assessment of the "extended atom" method in energetic refinements has not yet been performed.

Various criteria may be used to estimate the agreement between calculated and observed equilibrium structures. The r.m.s. deviation of carbon alpha atoms measures the agreement of the backbone conformations. The deviations of side chain atoms can also be used, but in this case, these atoms are affected by solvent and other molecules in the crystal; care must be taken if these atoms are to be considered. Also, differences in the main chain hydrogen bonding can be used to measure whether certain interactions are important in both structures.

A question that is only recently being studied is the importance of solvent in energy calculations, and how its effect can be modeled in a simple and accurate fashion.
Protein crystals contain anywhere from 20 to 80% solvent, often containing a high molarity of salt or organic precipitating agent.⁷ The majority of solvent molecules cannot be located as discrete maxima in electron density maps from x-ray studies. Since most of the solvent appears to be very mobile and to possess a fluctuating structure, at present only a statistical description is possible. It is clearly desirable to understand and describe the structure of water near the protein surface in terms of protein-water and water-water interactions.

Chymotrypsin is an enzyme of considerable potential importance as a generalizable model for the study of protein-protein and protein-solvent interactions. Although there is general interest in the microscopic energetics and dynamics of protein molecules, attention is now centering on the aggregation of biomolecules, through dimer- or oligomerization or through the formation of heteromolecular complexes. Conformational energy calculations on the structure of γ-chymotrypsin may prove useful in addressing some basic questions concerning not only its structure, function and specificity but also protein-protein and protein-solvent interactions.

The structure of γ-chymotrypsin was chosen for study because it has been refined to high resolution.¹¹ Additional and more extensive work may be done on the dimeric form of the enzyme now that it has been refined crystallographically.¹² γ-Chymotrypsin, hereafter denoted γ-CHT, is composed of 241 amino acid residues, arranged in three polypeptide chains of 13, 131 and 97 residues. γ-CHT exists as a monomer at neutral pH, whereas the alpha form (α-CHT) crystallizes as a dimer at pH 3.5.
The chymotrypsin enzyme is of particular interest as a model for protein aggregation since the alpha form is asymmetric¹²,¹³ and alpha crystals exposed to substrate analogs and irreversible inhibitors show asymmetric binding in the catalytic sites of the dimer.¹⁴,¹⁵

The availability of a high resolution, crystallographically refined γ-chymotrypsin structure affords an excellent opportunity to qualitatively and quantitatively examine different types of energy minimization methods as applied to protein molecules. The reliability of the "extended atom" method may be shown in terms of agreement of energetically refined structures with the crystallographically observed structure. The inclusion of the crystallographic water molecules found in the γ-chymotrypsin molecule may be studied in terms of their effect on the structure (global energy and atomic forces) in conformational energy refinement.

A study of the γ-chymotrypsin molecule, including bulk solvent, may further the understanding of protein-solvent interactions at the protein surface. Also, protein-protein and protein-solvent interactions at the corresponding dimer interface residues of the monomeric γ-chymotrypsin enzyme may explain the stability gained in the dimerization process by the dimeric α-chymotrypsin. Preliminary results will show that in the presence of bulk solvent, the dimer interface residues of the monomeric γ-chymotrypsin possess a local, non-thermodynamic molecular surface tension approximately twice that of the exterior residues globally. As a consequence, this region would tend to internalize preferentially upon dimerization. The application of local molecular surface tension techniques may prove to be a valuable tool in the study of oligomeric systems where component structures are known but the aggregates are not.

CHAPTER II

CONFORMATIONAL ENERGY CALCULATIONS

A. Refinement Strategies

In the previous chapter, fundamental questions concerning protein conformational energy minimizations were outlined. Additional questions that have arisen during the development of energy minimization techniques concern the problem of generating low-energy conformations of proteins which are acceptable by crystallographic standards, and the dilemma of whether energy minimization and crystallographic refinement should be used together in the refinement of protein structures.

A test of this type of hybrid procedure has recently been performed on the bovine pancreatic trypsin inhibitor (BPTI) by Fitzwater and Scheraga.⁸ There, a potential energy-constrained real space refinement method was developed for use with diffraction data of medium to low resolution. In real space refinements of protein molecules, the model is adjusted to minimize the following function:

$$\int (\rho_o - \rho_m)^2 \, dv \qquad (1)$$

where $\rho_o$ is the observed electron density and $\rho_m$ is the density associated with the model. An objection to this type of method could be that by choosing real space, the new electron density map is biased toward the phasing model used to obtain it. As an alternative, a potential energy-constrained reciprocal space method may be employed.⁹ A comparison of the results of similar refinements on BPTI has been performed. The reciprocal space refinement resulted in a final structure with a lower R-factor, but the real space refinement method displayed a lower r.m.s. (root mean square) shift from the crystallographic structure.

B. γ-Chymotrypsin Refinements

In all energy refinements discussed below, the 1.9 Å resolution crystallographic structure of the globular serine protease γ-chymotrypsin reported by Cohen, Silverton and Davies¹¹ served as input in the conformational energy calculations.
A wide variety of conformational energy refinements were performed on the structure of γ-CHT, some including the crystallographically observed water molecules and two including bulk solvent.¹⁶ The refinements may be grouped into three series.

In series I, with the exception of half-electron charges on the carboxylic group of the side chain of ASP-194 and the side chain of ILE-16, zero net charge was assigned to all ionizable groups. These half-electron charges were used to represent a salt bridge, which has been proposed to be an integral factor in the enzyme's functionality.¹⁷

The basic criticism of the use of zero net charge is that γ-CHT crystals are grown at or near pH 5. At this pH, many of the basic amino acid residues, the carboxylic terminal residues, and a large fraction of the glutamate and aspartate residues are ionized. Therefore, the series II refinements were performed using fractional charges for all ionizable side chains.¹⁸ Unlike ASP-194 and ILE-16, neither HIS-57 nor SER-195 carried a net charge in either series I or II, although the imidazole ring of HIS-57 is strongly polarized in the series I refinements. The side chain fractional charges employed for the active site region and the proposed salt bridge are listed in Table 1.

The series III refinements attempted to model the effects of bulk solvent on the structure and energetics of γ-CHT. Also, series III results were used in the calculation of approximate local mechanical surface tensions, in an attempt to predict protein-protein interface sites. Two different refinements were performed in series III. In the first, a realistic diamond ice lattice surrounded γ-CHT. In the second, a simple cubic ice lattice was employed. Side chain fractional charges were not employed in either of the series III refinements. The generation of these bulk solvent structures will be discussed below.

Table 1: Fractional Charges in Active Site Region. [Table body illegible in the scanned source.]

C. The Extended Atom Implementation

In all of the chymotrypsin energy refinements, a locally modified "extended atom" method of representing a protein was employed using a standard dictionary of ideal bond lengths, bond angles, dihedral angles and force constants. The modification to the standard extended atom method⁵ provided an enhanced representation of hydrogen bonding; polar hydrogens were incorporated according to the following procedure. All polar hydrogens, including those of solvent, were added to the crystallographic structure in geometrically idealized positions. During the course of all the energy minimizations, the positions of the hydrogens were constrained to remain approximately ideal by assignment of very high force constants to the polar hydrogen to donor (or donor chain) bonds. The polar hydrogens were excluded from the set of calculated non-bonded interactions. Since no donor-hydrogen and hydrogen-acceptor parameters exist in the extended atom dictionary adopted, geometrical parameters from a refinement method incorporating all hydrogens¹⁹ were used for both polar hydrogens and for donor atoms.

The extended atom approximation has been shown to provide a satisfactory representation of the internal vibrations and bulk properties of small molecules and simple peptides.¹⁰ There are advantages and disadvantages to the extended atom approximations.
Some of the advantages are: a) their use significantly reduces the computational size of the problem, in most cases by a factor of two, b) fewer non-bonded interactions and internal degrees of freedom result and c) in most cases, hydrogen atom coordinates are unobserved and must be inferred from the non-hydrogenic coordinates obtained from the x-ray crystallographic study. Some possible disadvantages include: a) unless polar hydrogens are used, it is very difficult to represent hydrogen bonding, b) there is a loss of steric effects arising from hydrogens, as an extended atom is always spherical, c) hydrogen atom coordinates are necessary for some forms of analysis (e.g., proton and ¹³C NMR phenomena). A list of extended atoms employed, along with their corresponding non-bonded and hydrogen bond parameters, is given in Appendix A.

D. Parameter Choices

Preparations and parameter choices for the chymotrypsin conformational energy refinements will now be described. All energetic refinements were performed using the program REFINE,²⁰ locally modified from a Univac version to run on a VAX 11-750.⁸,⁹ Unlike some energy minimization methods now in use, REFINE does not include electron density constraints (real space or reciprocal space). The energy is expressed as a sum of non-bonded and bonded contributions:

$$E = \tfrac{1}{2} w_l \sum_i K_{l_i} (l_i - l_{0i})^2 + \tfrac{1}{2} w_\theta \sum_j K_{\theta_j} (\theta_j - \theta_{0j})^2 + \tfrac{1}{2} w_\rho \sum_k K_{\rho_k} (\rho_k - \rho_{0k})^2 + w_{NB} E_{NB} + \tfrac{1}{2} w_\phi \sum_p V_{\phi_p} \left[ 1 - \cos\left( n (\phi_p - \phi_{0p}) \right) \right] \qquad (2)$$

where $K$ is a force constant, $V$ is a potential minimum, and $n$ represents the number of dihedral rotational minima. In Equation 2, the summation indices i, j, k, p run over all bond lengths ($l_i$), bond angles ($\theta_j$), "frozen" dihedrals ($\rho_k$), and free dihedrals ($\phi_p$), respectively. In each case, the subscript zero denotes an ideal geometric value.
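As a rough illustration of how the bonded terms of Equation 2 combine, the following Python sketch evaluates the harmonic and torsional contributions for small lists of internal coordinates. It is not the REFINE implementation; the function names, the argument layout and the weight dictionary keys are hypothetical and chosen only for this example, and all angles are assumed to be in radians.

```python
import math


def harmonic_term(values, ideals, force_constants, weight=1.0):
    """0.5 * w * sum_i K_i * (x_i - x0_i)**2.

    This is the form shared by the bond-length, bond-angle and
    frozen-dihedral terms of Equation 2.
    """
    return 0.5 * weight * sum(
        k * (x - x0) ** 2 for x, x0, k in zip(values, ideals, force_constants)
    )


def torsion_term(phis, phi0s, barriers, n_minima, weight=1.0):
    """0.5 * w * sum_p V_p * [1 - cos(n_p * (phi_p - phi0_p))] for free dihedrals."""
    return 0.5 * weight * sum(
        v * (1.0 - math.cos(n * (phi - phi0)))
        for phi, phi0, v, n in zip(phis, phi0s, barriers, n_minima)
    )


def bonded_energy(bonds, angles, frozen, free, weights):
    """Sum the four bonded contributions of Equation 2.

    bonds, angles, frozen: (values, ideals, force_constants) tuples.
    free: (phis, phi0s, barriers, n_minima) tuple.
    weights: dict with keys 'l', 'theta', 'rho', 'phi' (hypothetical layout).
    """
    return (
        harmonic_term(*bonds, weight=weights["l"])
        + harmonic_term(*angles, weight=weights["theta"])
        + harmonic_term(*frozen, weight=weights["rho"])
        + torsion_term(*free, weight=weights["phi"])
    )
```

Keeping the weights as explicit arguments mirrors the way the text varies them from cycle to cycle; the non-bonded contribution would be added separately with its own weight.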
The non-bonded contribution (van der Waals and electrostatic), $E_{NB}$, represents the sum over all pairs of non-bonded atoms at less than 6 Å separation, chosen as the cut-off distance. For the m-th pair of atoms between which hydrogen bonding is impossible, the energy is computed as:

$$E_m^{NB} = A_m r^{-12} - C_m r^{-6} + w_{NB}^{-1} Q_m D^{-1} r^{-1} \qquad (3)$$

For the interaction between possibly hydrogen bonded (non-hydrogenic) atoms, the angle $\eta$ formed by them with the H atom at the vertex and the hydrogen to acceptor distance ($d_{HA}$) were computed. If $\eta > 90°$ and $d_{HA} < 3.5$ Å, the following expression was evaluated:

$$E_m^{HB} = E_m^{NB} \sin^2 \eta + \left( A_m^{HB} r^{-12} - C_m^{HB} r^{-10} \right) \cos^2 \eta \qquad (4)$$

Thus Equation 4 provides a smooth transition from hydrogen bonding to a simple non-bonded interaction as $\eta$ is decreased from 180°. A plot of a typical extended atom hydrogen bond potential energy curve is presented in Figure 1.

In the electrostatic term, $Q_m$ is the charge product of the m-th interacting pair. The computationally convenient assumption $D = r$ was made, which is consistent with other work.¹⁰,¹⁸ The use of a distance dependent dielectric term introduces an approximate screening effect. Several additional methods of calculating the electrostatic energy are now being tested. These include the use of a constant dielectric, a shifted dielectric and "electrostatics by groups".¹⁰

In a conformational energy minimization, the function actually minimized is

$$f = E + \tfrac{1}{2} w_T \sum_t \left| x_t - x_{0t} \right|^2 \qquad (5)$$

In Equation 5, $x_t$ represents the trial coordinate vector and $x_{0t}$ the corresponding initial vector of the t-th atom. In Equations 2, 3 and 5, the $w$ factors are weights that are varied during the energy refinement to accelerate convergence.¹⁹ A sample weighting scheme used in a typical refinement is given in Table 2. Strong geometrical similarity constraints (i.e. high $w_T$ values) are usually imposed during the early stages of a refinement, thus maintaining structural ideality. As the refinement proceeds, $w_{NB}$ is increased while the other weights are reduced. At the conclusion of the refinement, all weights are set to unity except $w_T$, which is zero. The effect of the $w_{NB}^{-1}$ term in Equation 3 is to give the electrostatic interactions unit weight during all cycles of the refinement. Similarly, $w_\rho = w_\phi = 1$ in all cycles.

Figure 1. Extended Atom Hydrogen Bond Potential Energy Curve.

Table 2: Sample Weighting Scheme used in Energetic Refinements.ᵃ
Columns: Number of Cycles; $w_T$; $w_l$; $w_\theta$; $w_{NB}$
1-20: Initial 1000., .05; Final 1., .10, 0.5
21-40: Initial 1., .10; Final 0., .30, 0.8
41-60: Initial 0., .30; Final 0., 1.0, 1.0
61-80: Initial; Final 0.
ᵃ Weights for torsional and electrostatic energy are unity throughout.

E. Preparatory Steps Before Refinement

Prior to the actual energy refinement, other features of the REFINE program were used to optimize the γ-CHT protein structure. The ND1 and OD1 pair and the OE1 and NE2 pair were rotated by 90° about the CG and CD atoms of the ASN and GLN residues, respectively. The final conformation of the side chain was dictated by the lower conformational energy. This procedure is a way of correcting for the crystallographic indistinguishability of nitrogen and oxygen atoms.

During an energetic refinement, most of the effort at the beginning is concentrated on geometrical idealization. Considerable CPU time can be saved by the process of model building before the actual energy minimization is begun. Here, the time consuming process of calculating a list of non-bonded interactions is not needed. At the end of a model building stage (usually about 40-50 cycles), all bond lengths are within 0.05 Å, bond angles within 5° and dihedral angles within 20° of the ideal values set in the extended atom dictionary. This procedure almost always creates a few short non-bonded contacts.
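To make the effect of a short contact concrete, the pair energy of Equation 3 can be evaluated at two separations. The sketch below is only an illustration under stated assumptions: the coefficients `A_DEMO` and `C_DEMO` are invented, the function name is hypothetical, and the charge product is assumed to already carry any unit-conversion factor. The D = r dielectric and the 6 Å cut-off follow the text.

```python
def pair_energy(r, a, c, charge_product, w_nb=1.0, cutoff=6.0):
    """Non-bonded energy of one atom pair, following the form of Equation 3.

    r: separation in angstroms; a, c: repulsion and attraction coefficients;
    charge_product: Q_m, assumed to include any unit-conversion factor.
    """
    if r >= cutoff:
        return 0.0  # pairs at or beyond the cut-off are not counted
    dielectric = r  # the D = r assumption made in the text
    van_der_waals = a * r ** -12 - c * r ** -6
    electrostatic = charge_product / (w_nb * dielectric * r)
    return van_der_waals + electrostatic


# Hypothetical coefficients for a neutral pair, for illustration only.
A_DEMO, C_DEMO = 1.0e6, 1.0e3
short_contact = pair_energy(2.0, A_DEMO, C_DEMO, 0.0)
relaxed_contact = pair_energy(4.0, A_DEMO, C_DEMO, 0.0)
```

With these made-up coefficients the pair at 2 Å is strongly repulsive while the pair at 4 Å is weakly attractive, since the r⁻¹² term grows by a factor of 2¹² = 4096 when the separation is halved.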
Because such short contacts produce anomalously large interaction energies, they are removed by selective local energy minimizations prior to the global energy refinement.

Depending on the size of the protein structure being studied, REFINE output is assumed to be effectively converged when the global energy change per cycle falls below 1.5-2.0 kcal/mole. This energy cut-off for convergence was chosen because of the steepest descent algorithm employed in REFINE, whose rate of convergence is much slower than that of a conjugate gradient method.21 Additionally, the shifts of the atoms may be monitored, and convergence may be decided by an overall root mean square shift per cycle, as is done in other types of refinements.22-25

CHAPTER III
THE MOBILE SOLVATION LAYER PLUS ICE LATTICE MODEL

Crystallography has shown that a significant portion of the first shell of waters surrounding a protein is highly ordered.26 Although a complete description of a protein-water system can only be achieved through statistical or dynamical methods, energy minimization may be useful in locating stable solvent. Low energy starting configurations may be generated and used as input for additional, more definitive work.

A. The Model

Ferro and Hol have developed a model for the study of protein-solvent interactions. Their model is particularly well suited for the REFINE system of programs and involves a mobile solvation layer plus ice lattice.10 An isolated protein may be described as surrounded by two layers of water molecules. The inner layer contains all water molecules that interact significantly with protein atoms, and is thick enough for protein and solvent atom rearrangement. All water molecules lying within a chosen distance R_s from a non-hydrogen protein atom are taken to belong to the inner layer. They are free to move in the energy refinement. The bulk solvent surrounding the protein is represented by the outer layer.
In order to bound the system, only those molecules within a chosen distance R_c from a non-hydrogen protein atom are included, and their main interactions are with other waters rather than the protein.

B. Generation of Trial Structures

Two types of ice lattices were generated in this work, a simple cubic ice lattice as suggested by Ferro and Hol and a more realistic face-centered cubic ice lattice. In both cases, the following steps were taken to generate the trial structure.

First, a cubic box was generated such that when the protein was placed in the middle of the box no protein atom was closer than R[HOH] + R[VDW] to any lattice water, where R[HOH] is a chosen van der Waals radius of water (1.4 Å) and R[VDW] is the van der Waals radius of the protein atom under consideration. No hydrogen atoms were considered here and the dimensions of the box were chosen such that every protein atom was at least 10 Å from the edge of the box. Many different trial protein-solvent systems were generated by translating and rotating the lattice system with respect to the protein system and by thermally randomizing the positions of the lattice sites. The rotation was accomplished by generating a rotation matrix from a given set of Euler angles.27 In thermally randomizing the coordinates of the lattice sites, random deviates from the surface of a unit sphere74 were generated to fix the orientation of each displacement, and an appropriate random number chosen as a function of the unit cell edge of the lattice fixed its magnitude.

Second, lattice sites that were within the cut-off distance R_s were classified as movable. Hydrogen atoms of all water molecules were added in geometrically ideal positions.

Third, a low-energy model of the solvent was created by minimizing the energy of the solvation shell (inner layer). Once this was accomplished, the entire protein-solvent system was refined. A graphical representation of the mobile solvation plus ice lattice model is presented in Figure 2.
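The lattice generation and classification steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not the REFINE implementation: the function names `cubic_lattice` and `classify_waters`, the use of NumPy, and the brute-force distance matrix are all hypothetical choices; only the water radius (1.4 Å) and the movable-layer cut-off R_s come from the text.

```python
import numpy as np

def cubic_lattice(lo, hi, edge):
    """Simple cubic lattice sites filling the box [lo, hi] (hypothetical helper)."""
    axes = [np.arange(l, h + 1e-9, edge) for l, h in zip(lo, hi)]
    grid = np.meshgrid(*axes, indexing="ij")
    return np.stack(grid, axis=-1).reshape(-1, 3)

def classify_waters(sites, protein_xyz, protein_radii, r_water=1.4, r_s=8.0):
    """Drop sites overlapping the protein, then flag the movable inner layer.

    A site is removed if it lies closer than R_i + R_w to any protein atom i.
    A surviving site is movable if it lies within r_s of some protein atom.
    """
    # Distance from every lattice site to every (non-hydrogen) protein atom.
    d = np.linalg.norm(sites[:, None, :] - protein_xyz[None, :, :], axis=-1)
    keep = np.all(d >= protein_radii[None, :] + r_water, axis=1)
    movable = d[keep].min(axis=1) <= r_s
    return sites[keep], movable

# Toy usage: one carbon-like atom (radius 1.8 Å, an assumed value) at the origin.
sites = cubic_lattice((-12, -12, -12), (12, 12, 12), 3.0)
kept, movable = classify_waters(sites, np.zeros((1, 3)), np.array([1.8]))
```

For a real protein the full distance matrix would be replaced by a neighbor search, and the thermal randomization and rigid-body repositioning of the lattice described above would be applied before classification.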
The results of the model calculations were used as input for the series III refinements. Figure 2 indicates the number of free and fixed water molecules, the van der Waals radii employed and the final densities of the protein-bulk solvent systems. Figure 3 displays a section through each of the two types of ice lattices generated, indicating clearly the simple cubic and face centered cubic ice lattice structures.

Since no potentials exist for the protein-water and water-water interactions in the extended atom dictionary adopted, non-bonded and geometrical parameters for water oxygens were used from other work19 (see Appendix A). A complete extended atom description of protein-solvent interactions is needed. Hermans and co-workers have developed a new solvent model for use in the extended atom approximation.7,28,29 Their "simple point charge model" is a first step at addressing this question.

Figure 2. Summary of Ice Lattice Generation and Final Results.

  Cubic ice lattice:                 cell edge 3.1034 Å, density 1.000 g/cc
  Face centered cubic ice lattice:   cell edge 6.38 Å, density 0.93 g/cc

  All water molecules were removed if the oxygen was closer than (R_i + R_w) to any protein atom, where R_i is the van der Waals radius of protein atom i and R_w is the effective van der Waals radius of water. The movable layer of waters extends to R_s.

  Results:
  Cubic ice lattice:     R_s = 8 Å, density = 0.98 g/cc, movable waters 1812, total waters 14126
  Diamond ice lattice:   R_s = 8 Å, density = 0.94 g/cc, movable waters 1484, total waters 12634

Figure 3. Slices through x-y Plane for the Cubic and Face Centered Cubic Protein Plus Ice Lattice Systems. Movable solvent shaded; cubic lattice, top; face centered cubic lattice, bottom.

CHAPTER IV
CHYMOTRYPSIN REFINEMENT RESULTS

A. Series I-a, I-b, II-a

To assist in the analysis of the refinements, the chymotrypsin protein was sub-divided into interior, interface and exterior regions. A complete list of the amino acid residues assigned to each region in CHT is given in Appendix B. A stereoview of the CA atoms of γ-CHT is displayed in Figure 4. The amino acid residues that constitute the dimer interface in the alpha form of CHT may easily be seen.

In order to compare the final structures of γ-CHT meaningfully, a least squares procedure was used to rotate and translate one structure onto another. The method was adopted from the REFINE system. In all cases, hydrogen atoms were removed from the coordinate lists before the rotation-translation was performed.
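The least squares rotation-translation fit can be illustrated with the Kabsch algorithm. The text does not describe the algorithm used in REFINE, so the choice of Kabsch (via singular value decomposition) is a swapped-in standard technique, and the function name `superpose` is hypothetical. All atoms are weighted equally, as in the procedure described above.

```python
import numpy as np

def superpose(mobile, target):
    """Rotate and translate `mobile` onto `target` (both N x 3), unit weights.

    Returns the fitted coordinates and the r.m.s. deviation after the fit.
    """
    mc, tc = mobile.mean(axis=0), target.mean(axis=0)
    P, Q = mobile - mc, target - tc           # centered coordinate sets
    U, _, Vt = np.linalg.svd(P.T @ Q)         # SVD of the 3 x 3 covariance matrix
    d = np.sign(np.linalg.det(U @ Vt))        # guard against an improper rotation
    R = U @ np.diag([1.0, 1.0, d]) @ Vt       # rotation for row vectors: P @ R
    fitted = P @ R + tc
    rmsd = np.sqrt(((fitted - target) ** 2).sum(axis=1).mean())
    return fitted, rmsd

# Toy usage: recover a known rotation about z plus a translation.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
c, s = np.cos(0.7), np.sin(0.7)
Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
Y = X @ Rz.T + np.array([1.0, -2.0, 3.0])
fitted, rmsd = superpose(X, Y)
```

Per-class deviations of the kind reported for main chain or side chain atoms would follow by fitting once on all non-hydrogen atoms and then averaging the squared residuals over each subset.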
In summary, in refinement I-a, the 150 crystallographically observed water molecules were subjected to energy minimization, but charged ionizable side chains were not employed. In refinement I-b, the effects of neglecting solvent molecules in an energetic refinement were investigated. The effects of neglecting the charges on ionizable side chain residues were investigated by comparing refinements I-a and II-a. The effect of bulk solvent is shown in refinements III-a and III-b.

Figure 4. Stereoview of CA Atoms of the γ-CHT Monomer. Dimer interface residues shaded.

Tables 3 and 4 indicate the final energies and root mean square deviations, respectively, of the different series of refinements from the observed crystallographic structure. The greatest energetic improvement in every case is seen to arise from the non-bonded term; however, all terms show marked improvement, especially the bond lengths and bond angles.

In an energetic refinement, it is important to analyze the final geometry in terms of standard geometrical properties of amino acids. Table 5 shows that the refinements of γ-CHT narrowed the tau angle (N-CA-C) distribution around the ideal dictionary value of 110°. On the other hand, a degradation of the omega angle distribution is noted in Table 6. The worst of the post-refinement omega angles are seen to be near the carboxy terminal residues, especially at the end of the C chain. The Ramachandran plot for the I-a refined structure is shown in Figure 5. Very few of the residues show deviations from the allowed non-bonded contact zones.

Table 4 displays the r.m.s. deviations from the observed structure for the various classes of atoms shown in Appendix B. The r.m.s. movements of the main chain atoms and the CYS sulfur atoms are quite small compared with the overall r.m.s. deviations. The largest movements of the main chain were found near the carboxy terminal residues, especially near ALA 149, which is also consistent with the omega angle distribution results.

Table 3: Final Energies of the Refined γ-CHT Structures (kcal/mole).a

                     Observed     I-a     I-b    II-a   III-a   III-b
Bond Length               901      15      19     110    1074    1276
Bond Angle                997     182     171     201     197     222
Dihedral                  162      51      44      65      56      53
Non-Bonded               6398   -2837   -2264   -2763   -6759   -5977
Torsional                 325     202     211     212     201     200
Electrostatic            -126    -163    -134    -294    -169    -162
Global                   8658   -2490   -1953   -2498   -5600   -4388
Average Energy
  per Water                 -   -2.79       -   -1.56   -1.44   -1.03

a For the series III refinements, results are for protein plus movable ice lattice waters.

Table 4: R.M.S. Deviations (Å) from the Observed γ-CHT Structure for the Energetic Refinements.

                              Refinement Series
                        I-a    I-b   II-a  III-a  III-b
Main Chain             0.47   0.38   0.51   0.50   0.44
Side Chain             0.90   0.73   0.95   0.92   0.76
Carbonyl Oxygens       0.98   0.94   1.16   0.99   0.86
Sulfurs                0.60   0.47   0.65   0.56   0.50
Catalytic Site         0.76   0.74   1.01   0.78   0.70
TRP Cluster            0.46   0.32   0.49   0.47   0.31
Interior
  Main Chain           0.45   0.34   0.48   0.47   0.39
  Side Chain           0.81   0.67   0.87   0.73   0.66
Exterior
  Main Chain           0.48   0.39   0.52   0.52   0.45
  Side Chain           0.96   0.82   1.04   1.00   0.82
Interface
  Main Chain           0.48   0.43   0.53   0.54   0.50
  Side Chain           1.05   0.96   1.05   1.14   0.91
Domain 1
  Main Chain           0.46   0.30   0.49   0.47   0.43
  Side Chain           0.88   0.70   0.94   0.89   0.76
Domain 2
  Main Chain           0.49   0.44   0.54   0.54   0.45
  Side Chain           0.91   0.75   0.95   0.95   0.75

Table 5: Tau Angle Distributions for the Energetic Refinements of γ-CHT.

Region (deg)     I-a    I-b   II-a  III-a  III-b  Observed
97.5-102.5                                               7
102.5-107.5        7      5      7      9      5        62
107.5-112.5      139    139    133    137    145       120
112.5-117.5       94     97    100     94     89        44
117.5-122.5        1             1      1      2         7
122.5-127.5                                              1

Average (deg): 111.9  111.9  112.0  109.7

Table 6: Omega Angle Distributions (Absolute Value) for the Energetic Refinements of γ-CHT.

Region (deg)     I-a    I-b   II-a  III-a  III-b  Observed
180-175           95     97     77     95     99       176
174-170           77     89     73     64     70        32
169-165           35     25     34     43     31         3
164-160            7      5     22     11     14         1
159-155            1      1      1      3      3
154-150            1      1
149-145            1
144-140

Figure 5. Ramachandran Plot for the Final Structure Resulting from Refinement Series I-a. Glycines represented as circles.

The TRP cluster is a hemispherical cavity about 7 Å in diameter and 7 Å deep, bordered by TRP 27, PRO 28, TRP 29 and TRP 207, with PRO 4 and PRO 8 located approximately 4 Å above the opening. The TRP cluster found in the family of chymotrypsin enzymes is of interest because it has been suggested that it, along with the three other aromatic clusters found in chymotrypsin, lends stability to the folding of the molecule. Additionally, it can serve as a secondary binding site for aromatic substrate-like molecules14 and it is approximately symmetric with the active site across the protein center of mass. The center of mass of the TRP cluster residues is defined by a vector from the protein center of mass of length 11.0 Å. The active site center of mass (i.e., that of the catalytic triad) is defined by a similar vector of length 9.22 Å; the angle between the two vectors is 172.5°. Quite spectacularly, the TRP cluster shows an r.m.s. deviation that is very small compared with the overall r.m.s. deviation of the protein. The magnitude of the PRO 4 and PRO 8 contribution to the r.m.s. movement of the TRP cluster may be reduced due to their smaller size. Nevertheless, the 0.46 Å r.m.s. change in the TRP cluster during refinement I-a is small relative to the 0.556 Å r.m.s. change for the other six tryptophans and the 0.693 Å r.m.s. change of the other seven prolines in γ-CHT (not listed).

Table 4 indicates that the side chain atoms of the residues which constitute the dimer interface region of the α-CHT dimer exhibit an r.m.s. movement larger than that displayed by the side chain atoms of the exterior residues, or by any other class of atoms.
With the exception of the number of cycles performed, all other aspects of refinements I-a and I-b were similar. These refinements indicate the effects of omitting the crystallographically observed water molecules from energy minimization. Table 3 shows that the final energies are not changed, apart from the contributions arising from solvent-protein non-bonded interactions. Geometrical analysis shows equivalent distributions of tau and omega angles. However, the r.m.s. movements in refinement I-b are smaller than those reported for refinement I-a in Table 4, even in the interior of the protein. In the absence of solvent, the r.m.s. movement of the catalytic site residues is larger than the overall r.m.s. motion and is virtually the same size as in refinement I-a. Furthermore, in refinement I-b, the r.m.s. movements of the two domains of chymotrypsin are quite different, especially for the main chain atoms. This difference may be rationalized by the approximately 10% difference in the number of water molecules found in the two domains. Once again, the TRP cluster and the interface atoms show small and large r.m.s. movements, respectively.

The effects of including fractional charges on the side chain atoms of polar residues may be seen by comparing refinements I-a and II-a. Here, fractional charges are seen to reduce the electrostatic contribution but increase the non-bonded contribution to the total global energy of γ-CHT. The improvement in the electrostatic energy in refinement II-a suggests that charge localization such as that employed in refinement I-a may present difficulties in conformational energy minimization techniques. A consequence of this effect may be seen more easily by examining the proposed salt bridge between ILE 16 and ASP 194. As was mentioned previously, half-electron charges were used in refinements I-a and I-b to simulate this proposed salt bridge.
Initially, it was found that only one hydrogen bond existed between N of ILE 16 and OD1 of ASP 194. After refinement without side chain fractional charges, this hydrogen bond was lost. In refinement II-a, this initial hydrogen bond was not only preserved, but also improved, and an additional hydrogen bond was formed between ILE 16 and a solvent water molecule.

The use of charged side chain atoms generally increased the r.m.s. movement during the energy refinements. Some deviations from this pattern were seen: e.g., the carboxy terminals were found to be positioned closer to the observed positions than in refinement I-a.

Another comparison of the r.m.s. deviations between the final energetically refined structures shows that the structures I-a and II-a are more similar to each other than either is to the observed crystallographic structure. These results can be seen in Table 7. Similar results are obtained for refinements I-a and I-b. However, when refinements I-b and II-a are compared with each other and with the observed structure, the refined structures are seen to differ more from each other than either does from the input structure. Some important exceptions include the catalytic site and the TRP cluster.

The least squares procedure used to fit the final γ-CHT structures to the observed crystallographic structure and to one another may in itself be contributing an artificial effect. The translating and rotating algorithm employed treats all atoms (except hydrogens, which are not included) equally. Consequently, a sulfur atom, for example, is given the same weight as a carbon atom, although the electron density will establish the position of the sulfur atom with much greater accuracy than that of the carbon atom. Also, the thermal factors are not examined before the least squares fit. Atoms with larger thermal factors are treated exactly as those with smaller thermal parameters.
In this way, a long side chain (probably possessing large thermal factors) such as LYS will have the same weight as the side chain of an ALA, which has only CB as its side chain. Additional constraints should be added to the least squares algorithm to include the effects of the two points mentioned above. Mass weighting and thermal factor cut-off criteria may be the answer. However, in the chymotrypsin energy refinements, the thermal factors of the final energetically refined structures were not available. If the above constraints are included, the results presented for the r.m.s. comparisons will likely show lower asymmetry for the main chain and greater asymmetry for the side chains, generally.

Table 7: R.M.S. Deviations (Å) Between Energetically Refined γ-CHT Structures.

                      Ia-IIa   Ia-Ib   Ib-IIa
Main Chain              0.40    0.33     0.41
Side Chain              0.67    0.83     0.99
Carbonyl Oxygens        0.38    0.83     0.72
Sulfurs                 0.32    0.40     0.48
Catalytic Site          0.42    0.54     0.47
TRP Cluster             0.32    0.41     0.47
Interior
  Main Chain            0.38    0.34     0.41
  Side Chain            0.67    0.64     0.74
Exterior
  Main Chain            0.41    0.33     0.42
  Side Chain            0.73    0.73     0.81
Interface
  Main Chain            0.56    0.36     0.57
  Side Chain            0.87    0.71     0.91
Domain 1
  Main Chain            0.36    0.32     0.39
  Side Chain            0.62    0.64     0.69
Domain 2
  Main Chain            0.44    0.34     0.43
  Side Chain            0.71    0.62     0.74

In both refinements I-a and II-a, no charges were placed on the solvent molecules. The average energy of a solvent molecule is -2.79 kcal/mole in refinement I-a and -1.56 kcal/mole in refinement II-a. The extended atom dictionary maxima for hydrogen bonds range from -2.5 to -3.5 kcal/mole. The average solvent molecule energetic contribution in refinement I-a is consistent with approximately the energy of one hydrogen bond, while that of refinement II-a seems to be artificially small. The above results are also consistent with the difference in the mean solvent-protein closest approach distances of 3.01 Å in I-a and 3.15 Å in II-a. The observed structure showed a mean protein-solvent closest approach distance of 3.05 Å.
Overall, the r.m.s. magnitude of the force acting on the atoms in refinement I-b is 2.36 kcal/mole/Å. This value compares very well with the r.m.s. force reported for the structure of the bovine pancreatic trypsin inhibitor (with 4 internal water molecules) used as input in a biomolecular dynamics simulation.18,28,29 In the presence of solvent, but not of charged side chains, an r.m.s. force of 6.19 kcal/mole/Å was calculated for refinement I-a. In refinement II-a, the r.m.s. force was 4.78 kcal/mole/Å. A simple rationalization of large r.m.s. movements during energy minimization being accompanied by small final r.m.s. forces fails; the effects of side chain charges and solvent must be taken into account.

The refinements performed in series III were identical to those in series I in that no side chain fractional charges were included in the refinements; however, bulk solvent was included. Also, the crystallographically observed waters were omitted. The final energies reported in Table 3 show similarity with the other series. Meaningful comparisons are made with refinement I-a. The striking feature is that the average energy per water molecule is drastically reduced in both of the series III refinements compared to that of series I. The bulk solvent present in the former has the effect of distributing the energy and forces throughout the solvent system much better than in I-a. This may be a consequence of the fact that interactions with internal protein atoms are minimal for the movable layer of water molecules in the bulk solvent model adopted, and of the larger number of movable waters in the series III refinements. The unit cell edge of the ice lattice in each case is over 3.0 Å; water-water interactions do not contribute a significant amount to the global energy. The r.m.s.
deviations of the series III refinements (Table 4) clearly show that when a solvent model is employed in protein energetic refinements, the final structures obtained possess a reasonable energy; at the same time, atomic positions do not deviate from the observed structure by an unreasonable amount.

B. Crystallographic Analysis

As was stated earlier, an important factor in determining the reliability or accuracy of the results of purely energetic refinements on protein molecules is the degree to which the procedure preserves agreement with the crystallographic observations. Some workers have found it advantageous to incorporate crystallographic restraints in their refinement programs. However, changes have also been made in crystallographic refinement routines to include potential energy terms.30 It can be argued that including potential energy restraints in crystallographic refinement programs may destroy some of the information that results from high resolution refinements of protein molecules. A typical example of this effect can be seen in the refinement of the alpha form of chymotrypsin, which follows in Part II. α-CHT crystallizes as a dimer at pH 3.5. A close examination of the final structure of the dimer revealed four or five close contacts in the dimer interface region. The difference electron density maps indicated that, within the accuracy of the method employed, the dimer interface had refined to the correct structure. No positive or negative peaks in the difference electron density maps were noted. The energy of the two monomers and that of the dimer of the final structure of α-CHT were calculated, and about 10 to 15 kcal/mole of energy existed in these close contacts in the dimer interface region. If potential energy restraints had been included in this refinement, these close contacts would have been lost, and the difference electron density maps would probably have shown errors in this region.
A more detailed examination of the dimer interface region in α-CHT will follow in Part II.

For the purposes of this work, the reliability and accuracy of the final refined structures can be checked by calculating the crystallographic R-factors,31 defined by the following equation:

$$R = \frac{\sum \left|\, |F_o| - |F_c| \,\right|}{\sum |F_o|} \tag{6}$$

In Equation 6, |F_o| and |F_c| represent the amplitudes of the observed and calculated structure factors, respectively. The final structure of γ-CHT refined to an R-factor of .180, at a resolution of 7.0 to 1.90 Å. The results of these calculations are given in Table 8.

Table 8: R-Factors for the γ-CHT Energy Refinements.

Structure                 Resolution (Å)   R-Factor
γ-CHT (obs)                    7.0           0.191
  R = .180                     3.0           0.174
                               2.5           0.202
                               2.0           0.191
                               1.9           0.212
γ-CHT (obs, no solvent)        7.0           0.284
  R = .231                     3.0           0.229
                               2.5           0.252
                               2.0           0.220
                               1.9           0.235
I-a                            7.0           0.297
  R = .317                     3.0           0.313
                               2.5           0.354
                               2.0           0.321
                               1.9           0.347
I-b                            7.0           0.287
  R = .345                     3.0           0.349
                               2.5           0.377
                               2.0           0.360
                               1.9           0.359
II-a                           7.0           0.297
  R = .352                     3.0           0.357
                               2.5           0.381
                               2.0           0.352
                               1.9           0.363
III-a                          7.0           0.320
  R = .345                     3.0           0.357
                               2.5           0.362
                               2.0           0.356
                               1.9           0.364
III-b                          7.0           0.311
  R = .332                     3.0           0.342
                               2.5           0.358
                               2.0           0.326
                               1.9           0.351

In all cases, the energetically refined structures were translated and rotated to fit the structure of γ-CHT. This placed the energetically refined structure in the correct coordinate reference frame for the structure factor calculations, removing any drift that may have occurred in the coordinates without affecting the final energy. Since thermal factors and occupancies are not refined in potential energy minimization procedures, the B-factors and occupancies from the final crystallographic structure of γ-CHT were used for the protein and solvent atoms, respectively.
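The conventional R-factor of Equation 6 is a one-line computation once the structure factor amplitudes are in hand. The sketch below is illustrative only: the function names are hypothetical, the calculated amplitudes are assumed to be already on the scale of the observed ones, and the selection by d-spacing is just one plausible way to restrict the sum to a resolution range.

```python
import numpy as np

def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|); no scale factor is fitted."""
    f_obs = np.abs(np.asarray(f_obs, dtype=float))
    f_calc = np.abs(np.asarray(f_calc, dtype=float))
    return np.abs(f_obs - f_calc).sum() / f_obs.sum()

def r_factor_in_range(d_spacing, f_obs, f_calc, d_min, d_max):
    """R-factor over reflections with d_min <= d <= d_max (d in Å)."""
    d = np.asarray(d_spacing, dtype=float)
    sel = (d >= d_min) & (d <= d_max)
    return r_factor(np.asarray(f_obs)[sel], np.asarray(f_calc)[sel])

# Toy amplitudes, invented purely for illustration.
r_all = r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0])
r_high = r_factor_in_range([7.0, 3.0, 1.9], [10.0, 20.0, 30.0],
                           [9.0, 22.0, 30.0], 1.9, 3.0)
```

Because the sums run over amplitudes only, small coordinate shifts that change the calculated phases and amplitudes can move R appreciably even when the energy is essentially unchanged.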
Additionally, since the residues 10-13 and 149-150 were not seen in the observed electron density maps, these residues were not included in the structure factor calculations. Besides testing the validity of the refinement results obtained, this type of comparison will attest to the accuracy of the extended atom plus polar hydrogen method in general.

Table 8 displays the R-factors of the final, energetically refined γ-CHT protein structures, along with that of the observed structure. Since some of the energetically refined structures had no solvent included, a structure factor calculation omitting the solvent in the observed γ-CHT protein was also performed. The R-factor versus scattering angle is also shown in Table 8 at 7.0, 3.0, 2.5, 2.0 and 1.9 Å.

Interestingly, the solvent makes a strong contribution to the diffraction in the observed γ-CHT structure, about 5.0%. As expected, the greatest contribution of the solvent occurs in the low-angle data, where at 7.0 Å resolution the contribution is over 9.0%.

Examination of the energetically refined γ-CHT structures reveals that, generally, there was an increase in the R-factor of about 10-15%. The largest increase in R-factor is found in the high angle data in all of the final structures. The structure refined with no solvent and no side chain fractional charges showed the best agreement with the observed crystallographic structure of γ-CHT. The R-factor from 7.0-1.90 Å increased about 13%, the agreement in the low angle data being about 10%. Examination of the series III refinements shows that the use of a more realistic face centered cubic ice lattice results in a final protein structure whose R-factor is 1.3% lower than that obtained with a simple cubic ice lattice.

The results of the above structure factor calculations indicate the advantages and disadvantages of the use of the extended atom plus polar hydrogen method.
However, it is known that the use of all atoms (including hydrogens) in an energetic refinement results in final structures that agree with the crystallographic structure to within 5-6%.19 Also, if a refinement is carried out with only extended atoms and no polar hydrogens, the increase in R-factor is well above 15%.32 The use of the extended atom method then seems to be a compromise between computational speed and crystallographic accuracy. Better potentials may help to produce agreement with the observed structure, but it should be realized that the coordinate uncertainty in most crystallographic studies approaches 0.2-0.3 Å, and may be even more for atoms with larger thermal parameters. In many cases, the r.m.s. movements of certain groups of atoms are very similar to the uncertainty in the coordinates. This fact shows that the R-factor can be drastically affected by slight movements in atomic positions. Better crystallographic agreement for the energetically refined γ-chymotrypsin structures could be obtained if two or three cycles of least squares refinement were performed, refining only thermal factors and occupancies of the solvent.

CHAPTER V
PROTEIN-PROTEIN ASSOCIATION AND MECHANICAL SURFACE TENSION

Electrostatic, hydrogen bonding and van der Waals interactions have all been shown to be important in the folding of a polypeptide chain into a three dimensional protein structure. The biological importance of protein-protein association, including dimerization and oligomerization, is widely recognized. Recently, hydrophobicity has been suggested to be a major force in the stabilization of protein-protein association.33 Hydrophobicity can be assessed using the concept of accessible surface area.34 For a protein atom, this is the area of the surface over which the center of a water molecule can be placed while it is in contact with the atom and not penetrating any other protein atom.
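The definition of accessible surface area just given can be made concrete with a small numerical sketch. Note the substitution: the calculations in this work used Connolly's implementation of the Lee and Richards algorithm, whereas the code below uses the simpler Shrake-Rupley point-sampling approach as a stand-in. The function names, the number of sample points and the 1.4 Å probe default are assumptions made for illustration.

```python
import numpy as np

def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden spiral)."""
    k = np.arange(n) + 0.5
    polar = np.arccos(1.0 - 2.0 * k / n)
    azim = np.pi * (1.0 + 5.0 ** 0.5) * k
    return np.stack([np.cos(azim) * np.sin(polar),
                     np.sin(azim) * np.sin(polar),
                     np.cos(polar)], axis=1)

def accessible_area(xyz, radii, probe=1.4, n_points=500):
    """Per-atom accessible surface area (Å^2) by point sampling."""
    unit = sphere_points(n_points)
    expanded = np.asarray(radii, dtype=float) + probe  # locus of the water center
    areas = np.zeros(len(xyz))
    for i in range(len(xyz)):
        pts = xyz[i] + expanded[i] * unit
        free = np.ones(n_points, dtype=bool)
        for j in range(len(xyz)):
            if j != i:
                # A sample point is buried if the probe would penetrate atom j.
                free &= np.linalg.norm(pts - xyz[j], axis=1) >= expanded[j]
        areas[i] = 4.0 * np.pi * expanded[i] ** 2 * free.mean()
    return areas

# Toy usage: an isolated atom, then two atoms 3 Å apart (radii 1.8 Å).
single = accessible_area(np.zeros((1, 3)), np.array([1.8]))
pair = accessible_area(np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]]),
                       np.array([1.8, 1.8]))
```

The area buried on association is then the sum of the monomer areas minus the area of the complex, which is the quantity that enters the hydrophobic free energy estimate.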
Each square Angstrom of surface area buried upon association gives a hydrophobic free energy of about 25 cal/mole.33 Additional mechanisms have been proposed as rationalizations and origins of the free energy of association in protein-protein association. Architectural complementarity or "lock and key" descriptions have been proposed.35 Kauzmann has suggested that hydrophobic energies arising from surface patches of non-polar side chains may play a role.36 Chothia and Janin33,37 have also stressed the effects upon solvent entropy (and hence upon the free energy of association) of excluding protein surface area from interaction with the solvent.

Since it is desirable to be able to predict patterns of association when component structures are known but the aggregates are not, tests and improvements of the above mentioned association models are needed. The conformational energy minimizations of γ-CHT in the presence of solvent discussed above may be of use in predicting protein dimerization and complex formation.

During the analysis of the γ-CHT refinements, it was found that the interface region of the γ-CHT monomer displayed large movements. This may be seen in Table 4, where it is shown that the r.m.s. movements of the sets of all, main chain and side chain atoms in the interface region were uniformly larger than the corresponding supersets in the protein exterior. This observation motivated the calculation of r.m.s. forces, which led naturally to the suggestion that this region may possess a large local surface tension. Upon association, this region would then be expected to be internalized.

The REFINE system enables the numerical manipulation of the final microscopic forces on the structure to obtain a mechanical, non-thermodynamic surface tension, derived solely from a protein-solvent potential. This work is conceptually related to that reported by Lee;38 there, however, a thermodynamic surface tension was calculated.
This type of investigation may prove to be a natural complement of the "excluded solvent-accessible area" model of Chothia and Janin.33

A. Mechanical Surface Tension Calculations

In this section, the methods employed to approximate the effective molecular surface tension will be described. In the first method, the chymotrypsin center of mass was determined for use as the origin of a spherical polar coordinate system. Each atom's contribution, e_i(r, θ, φ), to the system energy was calculated. After all radial coordinates were multiplied by 1.001, the atomic energy contributions were recalculated. The microscopic surface tension was approximated with the numerical difference

$$\gamma^{mech} \approx (\Delta A)^{-1} \sum_i \left[ e_i(1.001\,r, \theta, \phi) - e_i(r, \theta, \phi) \right] \tag{7}$$

where the summation was taken over the set of all exterior residue atoms to find γ^mech_ext and the set of all interface residue atoms to find γ^mech_inter. In Equation 7, ΔA represents the difference in van der Waals surface areas associated with the summation set before and after the radial coordinate scaling. The surface area calculations were performed using the atomic van der Waals radii r(O) = 1.5 Å, r(N) = 1.6 Å, r(C) = 1.8 Å, and r(S) = 1.9 Å. No hydrogen atoms were included in the calculations. All results were obtained using Connolly's implementation39 of Lee and Richards' molecular surface area algorithm.34

As a check on the effective surface tension calculations, a second method has been devised, in order to give a greater statistical sample of atomic energies and to remove the problem of incommensurate changes, such as those of aromatic rings located parallel and perpendicular to the surface of the protein. Implementation of method two has recently been started. In method two, several different structures will be generated for the final γ-CHT structure including the 150 crystallographically observed waters. The generation of the protein structures used in method two is as follows.
The harmonic potential force constants for each atom are calculated. This need be done only once for the model. Once the force constants are calculated, trial structures are generated using the following method. An energetic contribution is assigned to each atom according to a Boltzmann distribution. A reasonable energetic cut-off is chosen at 5.0 kcal/mole. The energy is distributed over x, y, z randomly, subject to the constraint that $E_x + E_y + E_z = E_{tot}$. Using the force constants determined previously, a shift is calculated for each atom. For each component, a random direction is chosen on the classical phase space ellipse and the shift computed as $\mathrm{shift} = \sqrt{2E/K}\,\cos\theta$, where $\theta$ is the random phase angle for each (x,y,z) direction. The shifts are applied and the energy of the resultant system is calculated. The energetic components (exterior and interface) and the molecular surface area are used to obtain the local macroscopic surface tension. These results are then averaged for the trial structures generated to find the final value of the local molecular surface tension for each region (exterior and interface).

B. Surface Tension Results

For each of the three refinements (I-b, III-a, III-b), the total protein van der Waals surface area, the ratios of the exterior to interface surface areas and regional energies, and the exterior and interface $\gamma^{mech}$ values are collected in Table 9. The total surface area values reveal that the presence of solvent in energy minimizations II and III tends to reduce the accessible surface area relative to the isolated enzyme calculation I. Lee38 suggests that the average potential energy of a molecule can be represented as a linear function of its surface area. Comparison of the ratios of the exterior to interface surface areas and energies in the protein-solvent refinements II and III is consistent with this assumption.
In the presence of solvent, the $\gamma^{mech}_{inter}$ values are twice the size of the corresponding values for the molecular exterior including the interface. Although the percentage changes in the exterior and interface regional surface areas were similar upon radial scaling, the percentage change in the interface energy is twice that of the exterior regional energy. The inconsistent results obtained for I-a are explicable on the grounds that the mechanical surface tensions represented protein-vacuum interfaces; the central role of a real solvent is artificially absent.

Table 9: Surface Area and Mechanical Surface Tension Results.

Refinement Series                      I-a        III-a      III-b
Total Surface Area (Å²)                23226.2    22863.9    22922.1
Area(ext)/Area(inter)                  4.540      4.580      4.560
Energy(ext)/Energy(inter)              4.410      4.720      4.320
γ(mech, ext) (kcal/mole-Å²)            0.145      0.351      0.131
γ(mech, inter) (kcal/mole-Å²)          0.063      0.838      0.250

The values of the mechanical surface tension calculated for chymotrypsin in the preceding section differ in two important aspects from results reported by Lee38 for a variety of protein molecules. The values above were derived from a mechanical potential function describing the protein-solvent system and hence do not reflect the contribution of entropy to a thermodynamic surface tension based upon free energy. In addition, the results are intended to exploit the microscopic detail of conformational energy minimizations by representing, at least to some level of approximation, variations in the mechanical surface tension between certain regions of the protein exterior instead of representing a single, global value of the surface tension. The first distinction largely precludes the comparison of the mechanical surface tensions in Table 9 with those obtained by Lee,38 which are approximately an order of magnitude smaller (~35 cal/mole/Å²).
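The radial-scaling estimate of Equation 7 can be sketched in a few lines. This is only an illustration under stated assumptions, not the REFINE implementation: the per-atom energy callback `atom_energy` and the area change `delta_area` are hypothetical inputs (in the text they come from the conformational potential and from the surface area program, respectively), and all function names are invented for the example.

```python
import math

def center_of_mass(coords, masses):
    """Mass-weighted mean position, used as the origin for radial scaling."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)
    )

def scale_radially(coords, origin, factor=1.001):
    """Multiply each atom's distance from `origin` by `factor`; angles are unchanged."""
    return [tuple(o + factor * (x - o) for x, o in zip(c, origin)) for c in coords]

def mechanical_surface_tension(coords, masses, atom_energy, subset, delta_area,
                               factor=1.001):
    """Numerical difference of Equation 7 over the atoms listed in `subset`.

    atom_energy(i, coords) is a hypothetical callback returning atom i's
    contribution to the system energy; delta_area is the change in van der
    Waals surface area of the subset caused by the scaling (supplied by the
    caller, since no surface algorithm is implemented here).
    """
    origin = center_of_mass(coords, masses)
    scaled = scale_radially(coords, origin, factor)
    delta_e = sum(atom_energy(i, scaled) - atom_energy(i, coords) for i in subset)
    return -delta_e / delta_area

def toy_energy(i, coords):
    """Toy potential (assumption): energy falls linearly with distance from the origin."""
    return -2.0 * math.dist(coords[i], (0.0, 0.0, 0.0))
```

Calling `mechanical_surface_tension` once with the exterior atom indices and once with the interface atom indices, each with its own area change, gives the two regional values compared in Table 9.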
The prospect of identifying possible protein-protein association sites on the basis of local variations in mechanical surface tension is enticing, particularly since the conformational energy minimizations which would be utilized in such attempts are an increasingly routine part of protein structure refinement. However, few such conformational energy minimizations reported to date have attempted to model the interaction of proteins with bulk solvent. It is possible that the high mechanical surface tension found for the interface region of chymotrypsin is fortuitous, and examination of other associating molecules with well-characterized structures is clearly necessary; among the possible targets for study are hemoglobins, insulin, and the trypsin-trypsin inhibitor complex.

Besides requiring analysis of several associating molecules, the validation of mechanical surface tension as a guide to possible interface sites will require study of several model-specific factors. It has been argued that solvent entropy gains upon protein surface area reduction drive the association process,33,40-42 yet the "mobile solvation layer inside an ice lattice" model used in this work entails a well-ordered bulk solvent. Ideally, mechanical surface tension calculations would be attempted with several solvent models. It could prove that mechanical surface tension is tangential to the important aspects of protein aggregation, particularly since the application of macroscopic concepts of surface chemistry to single molecules is recognizably difficult.43 Yet it is worth noting that calculations utilizing mechanical potentials include contributions from van der Waals interactions and hydrogen bonds. Ross and Subramanian44 have criticized the absence of such contributions from excluded volume theories,33,37 and the possibility of obtaining complementary information from detailed conformational energy calculations is thus attractive.
Approximations which may influence mechanical surface tension calculations include the use of the modified "extended atom" representation of protein hydrogens, the particular potential dictionary used for interacting non-hydrogenic atoms, the representation of charged ionizable side chains employed, and the D = r representation of the dielectric. The effects of the above approximations need further investigation. The generalizability to other associating proteins of the high local mechanical surface tension found in the dimer interface region of the isolated chymotrypsin monomer also needs further study.

PART II: THE REFINEMENT AND STRUCTURE OF α-CHYMOTRYPSIN AT 1.67 Å RESOLUTION.

CHAPTER VI

INTRODUCTION

A. Refinement Methods Based on X-ray Data

In any structure analysis based on x-ray diffraction data, two main components exist. The first is to deduce a model or a set of phases which correspond to most, if not all, of the atomic positions in the molecule. In protein structural studies, the phases are usually determined from several heavy atom derivatives whose crystals are isomorphous with those of the native protein;45 density modification procedures may then be employed as a method to extend the resolution without bias of a model.13,46 Second, the initial model can be adjusted so that the calculated structure factor amplitudes match the observed values as closely as possible. This process is termed refinement.

Early refinements of protein structures were performed almost exclusively using either the real space method47 or difference Fourier methods followed by a conventional block diagonal least squares procedure.48 Watenpaugh used the second procedure in refining the structure of rubredoxin. The results of this work indicated that much more structural information can be obtained, in particular, solvent structure, than by using the least squares method or the real space method alone.
With the latter, the model is not refined in the usual crystallographic sense since the model is fit to the electron density based on phases which do not change during the refinement. Deisenhofer and Steigemann have used a combination of real space and difference Fourier methods with a great deal of success.49

In the unconstrained least squares refinement of atomic parameters, the function minimized takes the form:

$$\Phi = \sum_{hkl} W(hkl) \left[ |F_o(hkl)| - |F_c(hkl)| \right]^2 \qquad (8)$$

where $W = 1/\sigma^2(hkl)$ is the weighting function. The atomic parameters are corrected using the matrix equation:

$$\Delta U = -H^{-1} G \qquad (9)$$

Here H is the normal matrix and G is the gradient vector.50,51 One of the problems with this type of procedure is the immensity of the computational problem. The size of the normal matrix is M × M, where M is the number of parameters. The length of the gradient vector is also M.

Agarwal52 has developed a much faster least squares procedure for refining atomic parameters. His method is based on the fast Fourier transform method (FFT). For very large structures, the amount of computation is proportional to the size of the structure, making it extremely attractive for the refinement of biological macromolecules.

In recent years, new algorithms have been developed for the crystallographic refinement of biological molecules. Hoard and Nordman53 as well as Sussman et al.54 have introduced the concept of group constraints into reciprocal space refinements. The basic concept here is that certain peptide fragments, for example side chains, may possess geometries which are well established and should be preserved. In the case of full matrix refinements, the reduction in the number of parameters can substantially reduce computing time while at the same time providing accurate refinement results.50,53

Hoard and Nordman have applied this rigid-group restraint method in developing a crystallographic refinement program based on the Gauss-Seidel least squares procedure.
Here, the normal equations for each structural unit (rigid group) are solved and the new estimates for the group parameters are used to update the calculated structure factors. The procedure is basically block diagonal and considerable computation time is saved by calculating the contributions from one atom to all reflections at a time.55 Some problems are associated with this method, one being that it is difficult to simultaneously impose restraints on chirality at asymmetric centers while at the same time restraining the planarity of certain groups of atoms.56

The approach that is most commonly employed today in the refinement of biological macromolecules is the least squares refinement of structure factors coupled with simultaneous optimization of the stereochemistry. Two approaches to this problem have been developed. First, Hendrickson and Konnert57 introduce the stereochemical data as additional observations in the least squares refinement. The second method, discussed in Part I, was developed by Jack and Levitt,9 where a potential energy function is included. Since the energy function used by Jack and Levitt is quadratic, the two methods are essentially the same.

B. PROLSQ - Restrained Least Squares Refinement

There are two major obstacles that need to be overcome in the refinement of large macromolecules. The first has already been mentioned, being the large computing time involved, even with block diagonal least squares programs. The second is the limited amount of diffraction data available from large molecules such as proteins. It is very rare to find a protein crystal that will scatter x-rays beyond the 2.0 Å limit. Generally, the diffraction data are reduced by the sheer size of the protein molecule and the disorder associated with it.50 In any least squares procedure, the reliability of the results is decreased as the ratio of the number of observations to the number of parameters is reduced.
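The idea of treating stereochemical knowledge as additional weighted observations can be illustrated with a one-parameter toy problem. The sketch below is an assumption-laden illustration, not PROLSQ: a single bond length is estimated from one poorly determined "experimental" value and one tightly weighted "ideal" value, each weighted by the inverse of its variance, and the numbers used are hypothetical.

```python
def restrained_estimate(observations):
    """Return the value x that minimizes sum((value - x)**2 / sigma**2).

    observations is a list of (value, sigma) pairs. Experimental values and
    ideal-geometry restraints are treated alike; only their sigmas differ.
    For a single parameter the minimum is the inverse-variance weighted mean.
    """
    weights = [1.0 / (sigma * sigma) for _, sigma in observations]
    weighted_sum = sum(w * value for w, (value, _) in zip(weights, observations))
    return weighted_sum / sum(weights)

# Hypothetical numbers: a bond length of 1.60 (sigma 0.10) suggested by the data,
# restrained toward an ideal value of 1.53 (sigma 0.02).
estimate = restrained_estimate([(1.60, 0.10), (1.53, 0.02)])
```

Because the restraint carries a much smaller sigma, the estimate is pulled close to the ideal value without being fixed to it, which is the practical difference between a restraint and a constraint.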
Two ways of overcoming the obstacle of limited data are either to decrease the number of parameters or to add additional observations. One method of reducing the number of parameters is to use rigid group constraints.53,54 Alternatively, the number of observations may be effectively increased by including information in the form of constraints or restraints on the known geometry of the molecule. These might include information about bond lengths, bond angles and torsion angles. From crystal structure analyses of amino acids, spectroscopic and chemical analyses, and theoretical studies, a great deal of information has been gathered concerning the geometry and stereochemistry of the components of proteins and nucleic acids. The program PROLSQ was developed with this information in mind.

PROLSQ, a least squares, reciprocal space refinement program, employs restraints on the known geometry of proteins to "increase" the number of observations or, more exactly, to effectively reduce the number of free variables. It is important to note that unlike constraints, restraints restrict the features of the model to a range of realistic values. PROLSQ is a least squares procedure, where the best set of final parameters minimizes the weighted sum of the squared residuals. In PROLSQ, the weights chosen are always inversely proportional to the variances. There are many classes of "observations" employed in PROLSQ. Each class is treated separately in the sum, the total function for minimization being the sum of all observational classes. Some of the various classes of "observations" that are treated in PROLSQ are outlined below.

1. Structure Factors

The observational function for structure factors takes the form:

$$\phi = \sum^{reflections} \frac{1}{\sigma_F^2} \left[ |F_o| - |F_c| \right]^2 \qquad (10)$$

The calculated structure factors are determined from the equation:

$$F_c = K \sum_j f_j(hkl) \exp(-B_j S^2(hkl)) \exp[2\pi i (h x_j + k y_j + l z_j)] \qquad (11)$$

where (h,k,l) are the reflection indices, K is a scale factor, $f_j$ is the atomic scattering factor, B is the isotropic temperature factor, x, y, z are the atomic coordinates in fractions of the unit cell, S is $\sin\theta/\lambda$ and the summation is over all atoms in the asymmetric unit. It is also possible to include variable occupancy factors and six anisotropic temperature factors.

2. Bond Distances

Interatomic distances are restrained using the following "observational" function:

$$\phi = \sum_j^{distances} \frac{1}{\sigma_D^2(j)} \left[ r_j^{ideal} - r_j^{model} \right]^2 \qquad (12)$$

r being the distance between the atoms. By also restricting next nearest neighbor and 1-4 distances, bond angles and dihedral angles may be restrained.

3. Planar Groups

Certain groups of atoms are restricted within the least squares plane of the group of atoms. The "observational" equation takes the form:

$$\phi = \sum_k^{planes} \sum_i^{coplanar\ atoms} \frac{1}{\sigma_P^2(i,k)} \left[ \mathbf{m}_k \cdot \mathbf{r}_{i,k} - d_k \right]^2 \qquad (13)$$

where $\mathbf{m}_k$ and $d_k$ are the parameters defining the least squares plane.

4. Chiral Centers

One of the best features of the PROLSQ program is its ability to restrain the stereochemistry at asymmetric centers, using the chiral volume as the "observational" equation, which takes the form:

$$\phi = \sum_l^{chiral\ centers} \frac{1}{\sigma_C^2(l)} \left[ V_l^{ideal} - V_l^{model} \right]^2 \qquad (14)$$

5. Non-bonded Contacts

Instead of employing a potential energy function in the least squares procedure, PROLSQ uses only the repulsive part of the standard Lennard-Jones potential in the "observational" function:

$$\phi = \sum_m^{non\text{-}bonded\ contacts} \frac{1}{\sigma^4(m)} \left[ d_m^{min} - d_m^{model} \right]^4 \qquad (15)$$

The summation is taken only over repulsive contacts, $d^{model}$ [...]

[...] and S was chosen such that the weighted squared discrepancies remained approximately constant over the scattering range (Table 10). A final FRODO intervention was carried out after the 82nd cycle of refinement.
A total of 97 cycles of restrained least squares refinement were carried out on the dimer of α-CHT. The ranges of restraints applied during the course of the refinement are listed in Table 10 along with the restraints applied on the final structure (first value listed) and the r.m.s. deviations from ideal geometry at cycle 97. The refinement of the α-CHT dimer corresponds to 3472 protein atoms, 25534 structure factor amplitudes, 570 chiral centers, 2198 torsion angles and 35598 possible van der Waals contacts. Close examination of Table 10 indicates that the final structure conforms superbly with the ideal geometry and van der Waals contacts.

Table 10: Summary of Least Squares Parameters and Deviations.

                                   Target Sigma     R.M.S. Deviation
Distances (Å)
  Bond Lengths                     0.02 - 0.04      0.021
  Bond Angles                      0.04 - 0.06      0.057
  Planar 1-4                       0.05 - 0.08      0.061
  Disulfides                       0.02 - 0.04      0.030
Planar Groups (Å)
  Deviation from Plane             0.02 - 0.04      0.018
Chiral Centers (Å³)
  Chiral Volume                    0.15             0.210
Non-Bonded Contacts (Å)
  Single Torsion                   0.50             0.210
  Multiple Torsion                 0.50             0.315
  Possible (x,y) H-bond            0.50             0.350
Torsion Angles (deg)
  Planar                           5.00             8.900
  Staggered                        15.00            22.000
  Orthonormal                      20.00            25.100

The average ||Fo| - |Fc|| discrepancy is 32.22. Sigmas for F(obs) = (19.0) + (-70.0)(S - 1/6).

Isotropic Thermal Factor Restraints

Type    Number
1       1964
2       2498
3       1586
4       2370

Sigma, R.M.S. Delta from Ideal: 0.77, 1.25, 0.77, 1.19

Type 1 = main chain bond, 2 = main chain angle, 3 = side chain bond, 4 = side chain angle.

The final R-factor is 0.179, the weighted R-factor being 0.198. If the 247 solvent molecules are removed from the structure, the R-factor increases to 0.218, indicating the strong contribution the solvent makes to the observed diffraction. When the final dimeric structure of α-CHT has hydrogen atoms added in ideal geometrical positions, the R-factor remains essentially constant.

Examination of the R-factor of the final structure of the dimer versus scattering angle (Figure 9) can be used to estimate the mean coordinate error.62 The value indicated is ~0.18 to 0.20 Å. These average values assume that all discrepancies between observed and calculated structure factors are due to positional errors. This is clearly not the case, so that some atoms are better positioned than 0.20 Å while atoms with large thermal parameters may have a value considerably larger than 0.20 Å. Furthermore, the choice of weighting scheme applied to the structure factor amplitudes can have considerable effects on the R-factor, particularly the Bragg angle dependence. The mean error values indicated here are similar to those of other comparable refinements.11,63,64

Figure 9. Variation of R-Factor with Scattering Angle. Triangles 3.0 Å, squares 2.5 Å, diamonds 2.0 Å, and inverted triangles 1.67 Å resolution; broken lines are theoretical curves for 0.15, 0.18 and 0.20 Å coordinate error. [Plot: R-factor versus sin(θ)/λ, with resolution (Å) on the upper axis.]

CHAPTER VIII

RESULTS OF THE LEAST SQUARES REFINEMENT

A. The Independent Molecules

The coordinates, thermal factors and occupancies of the solvent of the final dimeric structure of α-CHT have been deposited in the Protein Data Bank.65 The r.m.s. deviations from ideal values listed in Table 10 correspond very closely in the independent molecules. The beauty of the program PROLSQ is that it is able to restrain geometrical and structural parameters. For instance, it is imperative that the omega angles, which describe the planarity of the peptide bond, be close to 180°. A histogram of the omega angle distribution in both molecules of the α-CHT dimer is presented in Figure 10. Taken as a whole, planarity of the peptide units shows an r.m.s. deviation of 0.04 Å (±1.5°). Generally the angles are within ±5° of 180°.
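For readers who wish to reproduce an omega-angle survey of this kind, the dihedral angle can be computed from four atomic positions as sketched below. This is a generic illustration, not code from PROLSQ; the function names are assumptions, and omega is taken in the usual sense as the CA(i)-C(i)-N(i+1)-CA(i+1) dihedral.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def omega_deviation(ca_i, c_i, n_next, ca_next):
    """Departure of the peptide omega angle from the planar trans value of 180 degrees."""
    return 180.0 - abs(dihedral_deg(ca_i, c_i, n_next, ca_next))
```

Collecting `omega_deviation` over all peptide units of each molecule and taking the root mean square gives a planarity summary of the sort quoted above.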
Also, the carbonyl carbons of the peptide units should be planar. Despite the fact that this restraint is not explicitly included in PROLSQ, the sum of the angles around the carbonyl carbon averages 359.9° (±0.16°). The tau angle (N-CA-C) should be close to 110°. Analysis shows that 90% of the residues in the dimer are within 7.5° of 110.0°. The Ramachandran plots of the individual molecules are presented in Figure 11. These figures clearly indicate that the non-bonded PHI-PSI contacts of the two molecules conform to the allowed regions.66

Figure 10. Omega-Angle Distribution. Molecule 1, top and molecule 2, bottom. [Histograms: Number of Residues versus Omega Angle (deg).]

The dihedral angles of the five disulfide bridges found in each monomer of α-CHT are listed in Table 11. Generally, the conformations of these disulfides are similar, at least within the experimental error of the coordinates. A list of all the torsion angles in the final structure may be found in Appendix C.

A complete list of the hydrogen bonds found in both molecules of α-CHT is given in Appendix D. In preparing this list of hydrogen bonds, polar hydrogens were added to the final structure of each monomer; hydrogen bonds were removed from the list if the hydrogen to acceptor distance was greater than 2.45 Å or if the donor to acceptor distance was greater than 3.30 Å and if the angle formed by the donor-hydrogen-acceptor was less than 120.0°. This is a very conservative approach to identifying hydrogen bonds, as was the calculation which examined both distances and angles by the extended atom plus polar hydrogen method. A total of 134 and 141 hydrogen bonds were found in molecules 1 and 2, respectively.
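The geometric screening described above can be sketched as a simple filter. This is one possible reading, stated as an assumption: a candidate is rejected if any one of the three tests fails (hydrogen to acceptor distance, donor to acceptor distance, donor-hydrogen-acceptor angle). The function names and the example coordinates are hypothetical.

```python
import math

def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    v1 = [x - y for x, y in zip(a, vertex)]
    v2 = [x - y for x, y in zip(b, vertex)]
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (
        math.dist(a, vertex) * math.dist(b, vertex))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def is_hydrogen_bond(donor, hydrogen, acceptor,
                     max_h_a=2.45, max_d_a=3.30, min_angle=120.0):
    """Apply the distance and angle cutoffs quoted in the text (assumed to be
    independent rejection tests)."""
    if math.dist(hydrogen, acceptor) > max_h_a:
        return False
    if math.dist(donor, acceptor) > max_d_a:
        return False
    return angle_deg(donor, hydrogen, acceptor) >= min_angle

# Hypothetical linear arrangement: D...H at 1.0 A, H...A at 1.9 A.
accepted = is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0))
```

Running such a filter over every polar hydrogen and potential acceptor, and then classifying each accepted pair by main chain or side chain membership, yields counts of the kind reported for the two molecules.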
In molecule 1, of the 134 total, 105 involve main chain donors and acceptors exclusively, 27 involve just one main chain donor and acceptor and 2 involve side chain donors and acceptors. In molecule 2, the respective numbers from the 141 total are 110, 28 and 3. Histograms of the distribution of donor-acceptor and theta angles are shown in Figure 12 for molecules 1 and 2. These results indicate that α-CHT possesses a strong hydrogen bonding pattern, even though stringent criteria were used with the extended atom plus polar hydrogen method to locate the hydrogen bonds.

Figure 11. Ramachandran Plots of α-CHT. Molecule 1, top; molecule 2, bottom; GLY not included.

Table 11: Dihedral Angles of Disulfide Bridges of Independent Molecules of α-CHT.

Bridge (molecule)     χ1      χ2      χ3      χ2'     χ1'
1-122    (1)          64      78      97      -53     -68
         (2)          67      72      108     -70     -51
42-58    (1)          -106    -140    -86     -92     -69
         (2)          -96     -149    -91     -91     -62
136-201  (1)          -55     -137    99      -89     -43
         (2)          -54     -126    107     -95     -43
168-182  (1)          -164    167     -80     -166    -51
         (2)          -165    175     -84     -172    -53
191-220  (1)          -155    41      98      -168    -60
         (2)          -155    44      89      -175    -51
The average donor-acceptor distance is 2.91 Å for both molecules 1 and 2; the average angle between the donor, hydrogen and acceptor is 155.7° and 154.8° for molecule 1 and molecule 2, respectively. These average parameters of the possible hydrogen bonds in α-CHT are very reasonable, at least compared to other similar refinements.11,63,64 Evidence of asymmetry between the two molecules of the α-CHT dimer is shown in Table 12, where the hydrogen bonds found in one molecule but not the other are listed. In every case, the hydrogen bonds are found near the surface of the protein or in the dimer interface, reflecting the adaptability of surface residues. Interestingly, there is an additional hydrogen bond found in the catalytic site of molecule 1 that is not found in molecule 2 (56 N-102 O).

The overall distribution of χ1 angles (N-CA-CB-CG) of the side chains in α-CHT agrees very well with the trimodal prediction of theoretical calculations and corresponds well to the observed distribution among g-, t and g+ positions of a large number of proteins.67 Despite the fact that many of the side chain dihedral angles were restrained during the refinement, the observed distribution generally reflects the starting angular conformations. Figure 13 presents a more detailed comparison of the χ1 angles. There is no preferred conformation of the χ1 angles in SER residues while THR residues prefer the g- and g+ positions. This probably results from the greater steric hindrance of the methyl group of THR. The residues VAL and ILE/LEU show a marked preference for the g+ conformation. Here, one CG is in g+ and the other in the t position. The behavior of other classes of residues is in general agreement with the observations of larger comparisons.

Figure 12. Histograms of α-CHT Hydrogen Bond Distances and Donor-Hydrogen-Acceptor Angles. Molecule 1, left; molecule 2, right.

Table 12: Asymmetry of Hydrogen Bonding in α-CHT.

a.) Hydrogen Bonds Found Only in Molecule 1.

Donor               Acceptor     H-A(a)    D-A(b)    Theta (deg)(c)
157 NE2  HNE2       21 O         2.50      3.40      149.7
39 N     HN         35 OD2       2.28      3.21      154.9
38 N     HN         35 OD2       1.53      2.52      171.8
40 NE2   HE2        193 O        1.51      2.48      161.6
56 N     HN         102 O        2.66      3.54      146.3
154 NH1  HH12       72 OD1       2.59      3.56      163.9
93 N     HN         91 OD1       2.18      2.86      123.2
107 NZ   HZ2        103 O        2.51      3.41      149.9
118 N    HN         115 OG       2.43      3.26      139.3
203 NZ   HZ2        128 OD2      2.35      3.35      177.6
139 N    HN         198 O        2.73      3.60      145.4
145 NH2  HH22       150 OD1      1.96      2.95      172.7
230 NH1  HH12       165 OD1      2.49      3.41      158.4
175 NZ   HZ3        172 O        2.58      3.50      153.3

b.) Hydrogen Bonds Found Only in Molecule 2.

Donor               Acceptor     H-A(a)    D-A(b)    Theta (deg)(c)
2 N      HN         120 O        1.93      2.93      173.4
18 ND2   HND2       187 O        2.08      2.97      146.5
157 NE2  HNE2       20 OE2       2.28      3.24      159.4
39 N     HN         35 OD1       2.53      3.48      158.8
37 N     HN         35 OD1       1.93      2.85      151.4
75 N     HN         72 O         2.49      3.48      170.6
98 N     HN         95 OD1       2.04      2.99      158.8
118 N    HN         115 O        2.41      3.25      141.3
125 N    HN         128 OD2      1.58      2.40      135.6
127 N    HN         125 OG       2.54      3.47      153.4
167 N    HN         164 OG       2.23      3.14      149.8
224 N    HN         221 O        2.61      3.57      163.0

(a) Hydrogen to acceptor distance (Å). (b) Donor to acceptor distance (Å). (c) Donor-hydrogen-acceptor angle.

The behavior of the thermal parameters of the independent molecules is summarized in Table 13 and shown graphically in Figure 14.
Restrained individual thermal parameters were introduced during the latter stages of the 5.0-2.5 Å resolution refinement, and refined thereafter. Examination of Table 13 and Figure 14 indicates that the thermal parameters of the independent molecules are fairly similar, this being especially true near residues 39, 110, 130, 160-180, 205 and 215-225. Since a symmetry restraint on the thermal parameters was not included in the refinement, the agreement between the two molecules is a reassuring result. The region from 70-80 is noticeable in both molecules. In molecule 1, this region was disordered and was not included in the refinement (occupancies were assigned a value of 0.01). However, this [...]

Figure 13. Distribution of Some Side Chain Conformational Angles. (a) SER, (b) THR, (c) VAL, (d) ILE and LEU. [Histograms: Number of Residues versus conformational angle.]

[...] grouped together. [Histograms plotted against Minimum Distance (Å).]

[...] consists of a 1-2 atom thickness shell around the protein which is necessarily less dense than liquid water. Weaker peaks in the difference electron density maps were not pursued beyond this layer. Thus, there is no clear indication of liquid water structure from the solvent-solvent distances. A complete list of protein-solvent hydrogen bonds is presented in Appendix F.
In molecule 1, 42 water molecules hydrogen bond to a protein atom and in molecule 2, there are 41 protein-water hydrogen bonds. In all cases, the donor-acceptor distances and the hydrogen bond angles possess highly acceptable average values. All hydrogens were added to the water molecule oxygens in idealized geometrical positions; however, the orientation of the hydrogen atom is completely random in space. Many possible protein-water interactions were not included in this list of hydrogen bonds since it was unrealistic to use the water molecule as the hydrogen bond donor. A list of polar protein atom-water molecule interactions was therefore generated to include interactions where the water oxygen may have been the hydrogen bond donor. This list is presented in Appendix C.

C. Dimer Asymmetry

During the course of the refinement of the α-CHT dimer, deviations from the non-crystallographic 2-fold symmetry were investigated by calculating the rotation matrix and translation vector that minimized the squares of the differences in the coordinates between the independent molecules. Although all atoms were used in these calculations, removal of large discrepancies did not alter the results for practical purposes, indicating that the asymmetry is not systematic but basically random. The matrix-vector relating Cartesian Angstrom coordinates of molecule 2 to molecule 1 is:

     .9138    -.0066     .4059     -9.94
    -.0017    -.9999    -.0126     40.60
     .0406     .0108    -.9138     47.60

The development of main-side chain asymmetry was noted after the first few cycles of refinement at 5.0-3.0 Å resolution. There are discontinuities in the asymmetry at cycles 18, 47 and 82 which are related to the manual interactive graphics interventions using FRODO. These discontinuities decrease with extent of refinement, indicating that the Fourier and least squares results finally converge to the same structure.
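Once a rotation matrix and translation vector relating the two molecules are in hand, the r.m.s. asymmetry follows directly, as the sketch below illustrates. It is not the program used in this work; the convention that the transformed molecule 2 coordinates are R·x + t, the function names, and the toy coordinates are all assumptions made for the example.

```python
import math

def apply_transform(rotation, translation, point):
    """Return rotation * point + translation for a 3x3 matrix and 3-vectors."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )

def rms_asymmetry(mol1, mol2, rotation, translation):
    """R.m.s. distance between molecule 1 atoms and the transformed,
    equivalent atoms of molecule 2 (lists must be in matching order)."""
    total = 0.0
    for p1, p2 in zip(mol1, mol2):
        q = apply_transform(rotation, translation, p2)
        total += sum((a - b) ** 2 for a, b in zip(p1, q))
    return math.sqrt(total / len(mol1))

# Toy example: an exact 2-fold about z, with one atom displaced by 0.3 A.
two_fold_z = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
mol2 = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
mol1 = [(-1.0, 0.0, 0.0), (0.0, -1.0, 0.3)]
asym = rms_asymmetry(mol1, mol2, two_fold_z, (0.0, 0.0, 0.0))
```

Restricting the atom lists to main chain or side chain atoms before the call gives the separate main chain and side chain asymmetries discussed in this section.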
Closer examination of the average thermal parameters of the a-CHT dimer along with visual inspection of difference electron density maps revealed that generally, atoms with thermal factors greater than 23 A2 did not usually appear reliably so that their positions are somewhat uncertain. Therefore, in all the analyses of the asymmetry in the -115- dimer, atoms whose B-factors which were greater than 23 A2 were removed from consideration. Table 14 summarizes the results of the asymmetry present in the a-CHT dimer. The overall asymmetry for the main chain atoms is 0.24 A while that for the side chains is 0.64 A. The interior displays the most symmetry while the exterior and dimer interface residues (which are internal in the dimer) are nearly equal in asymmetry. The catalytic site and the TRP cluster also show good symmetry. One would expect the atoms nearer the surface of a protein to be less well defined and this may be seen in Table 14. About a quarter of the atoms are removed from consideration of the surface and about a sixth are removed from the dimer interface. The error in the coordinates has been determined to be between 0.18 and 0.20 A (Figure 9). It is clear then that the main chain possesses a high degree of fidelity between the two molecules and the folding is essentially 2-fold like within experimental error. Only a few regions of the main chain approach 0.5 A in asymmetry, two of these being terminal residues (PRO 8, TYR 146). The same does not apply to the side chains where there are highly significant deviations from 2-fold symmetry. Figure 17 shows the average asymmetry per residue, separated into main and side chain components, for atoms whose thermal factors are less than 23 A2. The summary of Table 14 shows that 10-15% of the dimeric structure is asymmetric with almost all of it residing in the side chains (~25%). Table 14: -ll6— R.M.S. Asymmetry for a-CHT Dimer. 
                          Asymmetry (Å)    # Atoms Removed+
Protein Atoms                 0.47
Main Chain                    0.24
Carbonyl Oxygens              0.35
Side Chains                   0.64
Sulfurs                       0.18
Interior
  Main Chain                  0.18             (  0)
  Side Chain                  0.49             (  0)
Exterior
  Main Chain                  0.27             ( 90)
  Side Chain                  0.68             (206)
Dimer Interface
  Main Chain                  0.29             ( 14)
  Side Chain                  0.59             ( 29)
Catalytic Site
  Main Chain                  0.12             (  0)
  Side Chain                  0.29             (  0)
TRP Cluster
  Main Chain                  0.19             (  0)
  Side Chain                  0.35             (  0)
Domain 1 (1-122)
  Main Chain                  0.27             ( 60)
  Side Chain                  0.67             (127)
Domain 2 (123-245)
  Main Chain                  0.22             ( 30)
  Side Chain                  0.60             ( 79)

+ Cutoff, B >23.0 Å². Removed 157 atoms Mol. 1 and 139 atoms Mol. 2.

                  Main Chain   Carbonyl Oxygens   Side Chain   Overall
0.0  - 0.25 Å        485            107              229         794
0.25 - 0.50 Å        142             87              262         491
0.50 - 0.75 Å         13             11               50          74
0.75 - 1.00 Å          3                              26          30
1.00 - 1.50 Å                                         20          22
>1.50 Å                                               28          29

Figure 17. R.M.S. Asymmetry Between Individual Molecules of α-CHT. Only atoms with B <23.0 Å² are included; main chain, solid; side chain, broken; a, dimer interface regions; b, dyad B regions near noncrystallographic 2-fold axis between dimers; c, external turns.

[Figure 17 plot: horizontal axis, Residue Number.]

Previous studies of 2.8 Å difference electron density maps between the two molecules have shown that ~16% of the density had differences that were greater than 0.7 e Å⁻³ or 3σ(Δρ) = 3(√2 σ(ρo)) (ref. 14). Some of the surface asymmetry may be attributed to inter-dimer contacts (LYS 203 and ASN 204 and ASN 236-VAL 233) (ref. 15) and external loops (60-65 and 95-99) in one of the two similar antiparallel β-sheet barrel domains. This accounts for only about 1/3 of the observed asymmetry. The remainder of the observed asymmetry must simply reflect the high degree of adaptability associated with tertiary surface structure. A view of the final dimeric structure of α-CHT (CA atoms only) is shown in Figure 18, viewed down the crystallographic x and y axes.
Although difficult to discern, some differences between the two molecules may be seen, especially in turns near the surface of the protein. A representative view of the surface asymmetry is shown in Figure 19 (residues 172-179), from which side chain asymmetry can easily be seen. An overall stereoview of the asymmetry is shown in Figure 20. Here, side chains possessing an average asymmetry greater than 0.5 Å were drawn. The lack of asymmetry in the interior, as well as the total lack of aromatic residues among the side chains drawn, is noticeable. The dimer interface interactions are listed in Table 15. Included in this list are potential hydrogen bonds and ion pairs. It is clear from the list that half of the interface interactions display good symmetry relations.

Figure 18. Stereo CA Plots of the α-CHT Dimer. Top, view down XO (local 2-fold axis); bottom, view down YO (2-fold axis can be seen).

Figure 19. Stereoview of Representative Surface Asymmetry. Residues 172-179; molecule 2 bold.

Figure 20. Stereoview of Overall Asymmetry of α-CHT. Viewed down local 2-fold axis designated by asterisk; side chains shown only if r.m.s. asymmetry >0.5 Å and B <23.0 Å²; main chain atoms corresponding to these residues are also shown.
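The selection rule stated in the Figure 20 caption (a thermal factor cutoff followed by an asymmetry threshold) can be sketched as below. This is only an illustration under stated assumptions: the residue labels, deviations and B values are invented, the function name is hypothetical, and an atom pair is assumed to be dropped when the B-factor in either molecule exceeds the cutoff.

```python
import math
from collections import defaultdict

B_CUTOFF = 23.0     # Å², thermal factor cutoff quoted in the text
ASYM_CUTOFF = 0.5   # Å, threshold used for drawing side chains in Figure 20

def residue_rms(atom_pairs, b_cutoff=B_CUTOFF):
    """Per-residue r.m.s. deviation over atom pairs that pass the B cutoff.

    Each entry is (residue, deviation_in_A, B_molecule_1, B_molecule_2).
    Assumption: a pair is skipped if either B-factor is above the cutoff.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for residue, deviation, b1, b2 in atom_pairs:
        if b1 > b_cutoff or b2 > b_cutoff:
            continue
        sums[residue][0] += deviation ** 2
        sums[residue][1] += 1
    return {res: math.sqrt(total / n) for res, (total, n) in sums.items()}

# Invented side chain atom pairs; these are not values from Table 14.
pairs = [
    ("SER 174", 0.6, 12.0, 14.0),
    ("SER 174", 0.8, 15.0, 16.0),
    ("TRP 172", 0.2, 10.0, 11.0),
    ("LYS 175", 1.5, 30.0, 28.0),   # excluded: both B-factors exceed 23 Å²
]

per_residue = residue_rms(pairs)
drawn = sorted(res for res, rms in per_residue.items() if rms > ASYM_CUTOFF)
print(per_residue)
print(drawn)
```

Filtering on the thermal factor before averaging means that a poorly defined atom with a large apparent deviation, such as the invented LYS 175 entry, never contributes to the asymmetry statistics.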
Table 15: Dimer Interface Interactions in α-CHT.

[Table 15 entries are illegible in the source scan.]

Even in the interactions where symmetry is noted, some asymmetry is present in the van der Waals distances as well as the angles (not listed). Very prominent is the possible ion pair between the terminal carboxyl group of the B-chain (TYR 146) of one molecule and the protonated imidazole of the catalytic site of the other molecule. Both of these groups may be charged at pH 3.5, although a hydrogen bonding protonated TYR 146 carboxyl group is likely from previous pH-change studies (ref. 60). Another possible ion pair occurs between ASP 64 and the amino terminal of the C-chain, ALA 149. Asymmetry is evident in the two molecules; a water molecule is present in one molecule but not the other, complicating this interaction. There are about ten additional hydrogen bonds in this region as well as some additional close contacts, especially near the local 2-fold axis (GLY 216-SER 218). During the refinement, the position of OG of SER 218 was monitored, and near the end it was moved ~40° about χ1 away from GLY 216 of the other molecule. Examination of the difference electron density maps showed clearly that the OG atom of SER 218 of the second molecule was positioned correctly.
Other close contacts were not corrected due to a lack of electron density or difference density indications. A typical example of the asymmetry in the dimer interface region is shown in Figure 21, where residues 35-41 are superimposed. As in the example of surface asymmetry, most of the asymmetry is found at the side chain atoms; some symmetry is preserved even in the side chains.

Figure 21. Stereoview of Typical Dimer Interface Asymmetry. Residues 35-41; molecule 2, bold.

D. The Active Site

The catalytic residues of the independent molecules (HIS-57, ASP-102, SER-195) display excellent 2-fold symmetry (Table 14), well within the estimated coordinate error (Figure 9), despite the fact that they are located in the dimer interface region of the molecule. A stereoview of the residues of the active site superimposed upon one another is presented in Figure 22, where it may be seen that the differences between the two molecules are 1) a slight displacement of the imidazole ring of HIS-57 and 2) the positions of CB and OG of SER-195.

Figure 22. Stereoview of Catalytic Residues of Independent Molecules of α-CHT. HIS-57, ASP-102, SER-195; molecule 2, bold.

The χ1 angles of SER-195 are -84° and -104° in molecule 1 and molecule 2, respectively. These angles are very similar to the angle of SER-195 found in the high resolution refinement of γ-CHT (ref. 11) and in other serine proteases. The OG of SER-195 is nearly coplanar with the imidazole of HIS-57, the out-of-plane deviations being -0.26 and +0.27 Å, respectively. This might suggest hydrogen bonding between SER-195 OG and HIS-57 NE2, but when the hydrogen bond geometry is examined, the hydrogen-to-acceptor distances are 2.3 and 2.1 Å with angles of 119° and 102°. Thus, a hydrogen bond is very unlikely here, in agreement with the results found for γ-CHT but for a different reason (ref. 68). In γ-CHT, a hydrogen bond between SER-195 OG and HIS-57 NE2
was not feasible because OG was over 0.7 Å out of the plane of the imidazole ring, giving a distance of 3.8 Å. If the donor-acceptor roles are reversed in α-CHT, which is possible at pH 3.5 where the imidazole is protonated, the hydrogen bond angles are still unacceptable, being ~120°. A complete list of hydrogen bonds in the active site is given in Table 16, including interactions with solvent molecules. Most of the hydrogen bonds listed in Table 16 occur in γ-CHT, an exception being the lack of a hydrogen bond between 214 OG and 102 OD1 in α-CHT due to a close inter-dimer contact near SER 214. A short contact also occurs between 56 N and 102 OD2 in α-CHT (~2.95 Å); however, the hydrogen bond angles are small (108° and 116°, respectively).

Examination of the list of water molecules which show 2-fold symmetry in α-CHT (Table 17) and in the active site (Table 16) reveals that the solvent molecule hydrogen bonding pattern in the active site residues is indeed asymmetric. Five hydrogen bonds are found in molecule 1 and six in molecule 2, but of these hydrogen bonds, only three are in both molecules. One of these involves a hydrogen bond to ASP 102 N, but the other two are hydrogen bonded to the TYR residue from the other molecule and interact with the imidazole ring of HIS 57. The average thermal factors for the solvent molecules in the active site regions are 21.1 and 22.2 Å² for molecules 1 and 2, respectively. The corresponding occupancies are 0.81 and 0.83.
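The geometric test applied to the SER-195/HIS-57 contact above, a donor-acceptor distance combined with the angle at the hydrogen, can be sketched as follows. The default cutoffs (3.5 Å and 120°) are illustrative parameters rather than a statement of the exact acceptance rule used at this point in the analysis, and the function names and coordinates are hypothetical.

```python
import math

def angle_deg(donor, hydrogen, acceptor):
    """Donor-hydrogen-acceptor angle in degrees, measured at the hydrogen."""
    v1 = [d - h for d, h in zip(donor, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    cos_theta = max(-1.0, min(1.0, cos_theta))   # guard against rounding
    return math.degrees(math.acos(cos_theta))

def is_hydrogen_bond(donor, hydrogen, acceptor, max_distance=3.5, min_angle=120.0):
    """True if the donor-acceptor distance and the D-H...A angle both pass.

    Both cutoffs are assumptions supplied as parameters.
    """
    close_enough = math.dist(donor, acceptor) < max_distance
    return close_enough and angle_deg(donor, hydrogen, acceptor) > min_angle

# Invented coordinates (Å): a linear arrangement and a sharply bent one.
linear = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0))
bent = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0))

print(is_hydrogen_bond(*linear), angle_deg(*linear))
print(is_hydrogen_bond(*bent), angle_deg(*bent))
```

The bent example has an acceptable donor-acceptor distance but fails on the angle alone, which is the same situation described for the 119° and 102° angles in the text.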
Averaging only those water molecules that are found in both active sites, the average thermal parameters are 19.0 and 21.7 Å² while the average occupancies change to 0.89 and 0.83, for molecules 1 and 2, respectively. In all cases, the average thermal factors are lower and the average occupancies are at least as great as the average parameters for the entire solvent. Stereoviews of the active site residues are presented in Figure 23, displaying the solvent molecules in the active site.

Table 16: Hydrogen Bonds in the Catalytic Sites. a.) Involving Protein Atoms. b.) Involving Solvent Atoms.

[Table 16 entries are illegible in the source scan.]

Table 17: Two-fold Symmetric Water Molecules in the α-CHT Dimer.

[Table 17 entries are illegible in the source scan.]

E.
The ILE-16, ASP-194 Ion Pair

As in γ-CHT (ref. 11) and in other serine proteases (ref. 68), there are five water molecules hydrogen bonding in the region of the salt bridge between ILE-16 and ASP-194. Four of the five water molecules (505, 516, 555, 634 and 619 in molecule 1 and 515, 529, 518, 661 and 662 in molecule 2) display symmetry between the two molecules of the α-CHT dimer. Only waters 619 and 662 do not show symmetry within 1.0 Å; however, both appear to interact strongly with ASP-194 N. Additionally, three of these four symmetric water molecules (505, 516, 555 in molecule 1 and 515, 529, 518 in molecule 2) are also found in γ-CHT. It may be that the water molecules help to dissipate the charge of the ILE-16, ASP-194 ion pair. The geometry of the region shows similarity between the two molecules, most of the differences occurring in the side chains. Both ion pairs indicate a hydrogen bond between the N of ILE-16 and OD1 of ASP-194, although the donor-acceptor distances seem to be quite small (2.6 and 2.3 Å in molecules 1 and 2, respectively).

Figure 23. Stereoview of Active Site Regions of α-CHT. Included are solvent and TYR-146 of the other molecule. Molecule 1, top; molecule 2, bottom. Solvent common to both shaded.

F. The Specificity Site

The specificity site is defined by residues 189-195, 214-220 and 225-228. The catalytic triad is also included, located near one end of the site. The specificity site displays good symmetry (r.m.s. delta = 0.24 Å), and also contains several water molecules displaying 2-fold symmetry. Figure 24 presents stereoviews of both specificity sites of the α-CHT dimer including water molecules.
One of the symmetric water molecules makes a close contact with TYR-228 OH (2.9 and 3.1 Å) while the other is close to TRP 215 O (2.9 Å) and the main chain of VAL 227. With the transition state analog phenylethane boronic acid bound in the active site, specificity site water molecules are displaced upon binding (ref. 69). A least squares refinement of this structure (ref. 70) shows that 2-3 of the specificity site water molecules are displaced while the others remain localized at the closed far end of the specificity site, most distant from the catalytic residues HIS-57, ASP-102 and SER-195. The specificity site does not possess any obvious characteristics that may lead to reasons for aromatic specificity. The size of the site is large enough to accommodate large side chains such as LYS and ARG (as in trypsin). As a consequence, the aromatic specificity of α-CHT may be due in part to the fact that the buried water molecules of the site are not displaced upon substrate binding and in this way aid in the positioning of the substrate for catalysis.

Figure 24. Stereoview of Specificity Site Regions of α-CHT. Included are solvent molecules. Molecule 1, top; molecule 2, bottom. Symmetric waters shaded.

G. The TRP Cluster

The TRP cluster is a cavity 7.0 Å in diameter, containing the residues TRP-27, PRO-28, TRP-29 and TRP-207, with PRO-4 and PRO-8 being slightly above the cavity. In Part I, it was shown that the TRP cluster of γ-CHT remained essentially positionally stationary during conformational energy calculations, in isolation, with the crystallographically observed water molecules, and in the presence of bulk solvent.
Other reasons for interest in this region are a) there are three other aromatic clusters which have been suggested to lend stability to the protein, b) within the experimental coordinate error, this region displays excellent 2-fold symmetry, comparable to the catalytic site, c) it may serve as a secondary bonding site for aromatic substrate-like molecules (refs. 14, 70), and d) the electron density seems to define the positions of the residues better than any other region in the α-CHT dimer. There are also 14 and 12 water molecules that surround or are within the TRP clusters of molecules 1 and 2, respectively. Again, 8 of these water molecules show symmetry within 1.0 Å between the two monomers. The others are all within hydrogen bonding distance of some atom of the cluster. The presence of such a large number of solvent molecules may lend some stability to the region; however, the full significance and importance of this region still remains unclear.

H. Side Chain Asymmetry

Some of the surface asymmetry clearly arises from close inter-dimer contacts in the crystal structure as well as from asymmetrical stabilizing interactions in the dimer interface region. Most of the asymmetry probably arises from the flexibility and adaptability of side chains to distribute among equally probable configurations. To examine the question of side chain asymmetry, the side chains in the α-CHT dimer were examined as a function of residue type. The results are summarized in Table 18. In this case, however, atoms with thermal factors greater than 20.0 Å² were not included in the comparison. By reducing the thermal factor cut-off, a greater sample of each type of amino acid side chain was used, this being especially true for ASN residues in α-CHT. The methyl groups of ALA residues show evidence of asymmetry. The only possible cause of such an observation is a main chain difference associated with these residues. The asymmetry indicated for LYS and ARG
The asymmetry indicated for LYS and ARG ~144- Table 18: Asymmetry Clgsgified by Residue Type in the a-CHT Dimer. ' Code Total Exterior Interior Interface GLY 0.000(22) 0.000(15) 0.000( 8) 0.000( 3) ALA 0.327(22) 0.335(16) 0.305( 6) 0.000( 0) ARG 0.263( 3) 0.263( 3) 0.000( 0) 0.000( 0) LYS 0.222(14) 0.222(14) 0.000( 1) 0.000( 1) ASP 0.126( 9) 0.120( 7) 0.146( 2) 0.152( 1) GLU 0.148( 5) 0.148( 5) 0.000( 0) 0.000( 0) ASN 0.390(13) 0.390(13) . 0.000( 0) 0-444( 2) GLN 0.407(10) 0.418( 9) 0.297( 1) 0.000( 0) SER 0.527(26) 0.569(21) 0.293( 5) 0.591( 8) THR 0.430(22) 0.448(18) 0.304( 5) 0.560( 7) ILE 0.648(10) 0.733( 6) 0.495( 4) 0.231( 1) LEU 0.671(17) 0.301( 6) 0.804(11) 0.175( 2) VAL 0.381(22) 0.445( 9) 0.330(13) 0.158( 1) PHE 0.180( 6) 0.157( 5) 0.267( 1) 0.189( 2) TRP 0.266( 8) 0.253( 4) 0.278( 4) 0.185( 1) TYR 0.194( 4) 0.213( 3) 0.118( 1) 0.261( 2) CYS 0.165( 8) 0.166( 6) 0.160( 2) 0.191( 3) HIS 0.207( 2) 0.207( 2) 0.000( 0) 0.207( 2) MET 0.483( 2) 0.651( 1) 0.209( 1) 0.651( 1) PRO 0.352( 9) 0.385( 6) 0.276( 3) 0.000( 0) “B's <20.00 A2; Number of atoms removed is 238, Molecule 1 and 254, Molecule 2. Asymmetry expressed in A. number of residues. Numbers in parentheses indicate -145- is misleading, since in these cases, only 2-3 atoms were used in the comparison and in the case of ARG, there are only two residues in each monomer. Interestingly, there is considerably better symmetry displayed by the acid groups ASP and GLU. This was also noted in previous work.14 The better symmetry displayed by the carboxylic acids is probably due to stabilizing interactions such as ion pair formation with other cations or solvent and hydrogen bond formation. The smaller polar side chains of SER and THR display much better symmetry in the interior of the molecule, possibly due to the higher electron density in the interior. However, these residues show much greater asymmetry in the dimer interface, suggesting that these indications might well be real. 
The non-polar residues ILE, VAL and LEU show an unexpectedly large asymmetry, despite the fact that most are in the interior of the molecule. The main chain folding in the interior has been shown to possess the greatest symmetry, so the asymmetry displayed by the non-polar residues must be due to the size of the side chains and the free rotations they can effect.

CHAPTER IX

ENERGETIC ANALYSIS

Throughout the least squares refinement of the α-CHT dimer, the REFINE system was used to calculate the global energy of both monomers, as well as that of the dimer (ref. 20). Many of the other features included in REFINE were also used to analyze geometry and stereochemistry. Because REFINE and PROLSQ employ slightly different representations and dictionaries of ideal geometrical parameters for protein molecules, it was impossible to examine the geometrical energy terms quantitatively, since in most cases large energetic contributions resulted. Therefore, only the non-bonded and the electrostatic contributions to the global energy were compared. This seems to be a good approximation since PROLSQ indicated from the very beginning that the α-CHT dimer possessed good geometry, at least with respect to the dictionary of ideal values. The r.m.s. delta values from ideal are very small and are listed in Table 10. As in Part I, the extended atom plus polar hydrogen atom approximation was used in calculating the energetic contribution. All atoms (except residues 9-13) were included.

The dimerization energy of α-CHT was monitored throughout the refinement. This energy is calculated as

$$\Delta E = E_{\mathrm{Dim}} - (E_1 + E_2) \qquad (17)$$

where $E_{\mathrm{Dim}}$ is the energy of the dimer and $E_1$ and $E_2$ are the energies of molecules 1 and 2. The change in dimerization energy with refinement is shown graphically in Figure 25.
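Equation 17 and its sign convention can be illustrated with a minimal Python sketch. The energies below are hypothetical numbers chosen only to show the arithmetic; they are not values from Table 19, and the function names are not part of REFINE.

```python
def dimerization_energy(e_dimer, e_mol1, e_mol2):
    """Equation 17: energy of the dimer minus the sum of the monomer energies."""
    return e_dimer - (e_mol1 + e_mol2)

def dimerization_energy_by_term(dimer, mol1, mol2):
    """Apply equation 17 separately to each energy term (e.g. non-bonded)."""
    return {
        term: dimerization_energy(dimer[term], mol1[term], mol2[term])
        for term in dimer
    }

# Hypothetical energies in kcal/mole, for illustration only.
dimer = {"non_bonded": -100.0, "electrostatic": -20.0}
mol1 = {"non_bonded": -70.0, "electrostatic": -12.0}
mol2 = {"non_bonded": -75.0, "electrostatic": -13.0}

by_term = dimerization_energy_by_term(dimer, mol1, mol2)
total = sum(by_term.values())
print(by_term, total)
```

With this convention a positive value means the dimer calculates as less favorable than the two separated monomers, and a negative value means the reverse.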
Many large changes occurred in ΔE throughout the refinement. Most were due to the graphics changes made to the structure using FRODO, and a large change in ΔE also usually occurred when the resolution was extended. Using only non-bonded contributions (which include hydrogen bonding), the dimerization energy of the final structure is +44.9 kcal/mole. When electrostatic interactions are included, ΔE increases to +50 kcal/mole. These results can be compared with the observed enthalpy of dimerization of 1-4 kcal at pH 4.1 between 15-20 °C (refs. 71, 72). The separate energies of the monomers calculate extremely well considering the fact that only crystallographic coordinates were used and an energetic term was not included in the least squares refinement. The final energetic terms are listed in Table 19. The final global energy of the α-CHT dimer may be significantly reduced by energy refinement, but with an increase in R-factor.

Figure 25. Progress of Dimerization Energy during Refinement. Resolution states are indicated.

[Figure 25 plot not reproduced.]

The surface area buried in the native state of a folded protein has been suggested to be proportional to
the gain in free energy of dehydration and hydrogen bond formation due to folding (ref. 73). Recently, analytical calculations of buried surface areas have been employed in locating stable domains in proteins as well as predicting folding pathways (ref. 33). The buried surface area of the α-CHT dimer has been calculated and the contribution of hydrophobicity to the stability of the dimer ascertained. The buried surface area is calculated as the sum of the surface areas of the monomers minus the surface area of the dimer. The results of the surface area calculations for α-CHT are listed in Table 19. From studies on hydrocarbons and amino acids (ref. 42), it has been suggested that 1.0 Å² of surface area corresponds to about 25 cal/mole of hydrophobic free energy. For the α-CHT dimer, the gain in free energy corresponds to about 4.6 kcal/mole, this value being significantly less (about a factor of 2) than similar results for the insulin dimer, the trypsin-PTI complex and the hemoglobin α-β dimer (refs. 33, 37). Thus, in the case of the α-CHT dimer, other contributions to the free energy of association must have a stronger effect than those present in these other complexes. Possibilities may include van der Waals interactions and hydrogen bonds, complementarity and the loss of translational and rotational entropy (refs. 33, 37).

Table 19: Calculated Energies and Surface Areas for the α-CHT Dimer.

[Table 19 entries are illegible in the source scan.]

CHAPTER X

COMPARISON OF THE INDEPENDENT MOLECULES OF α-CHT WITH γ-CHT

The structures of the independent molecules of the dimer of α-CHT at 1.67 Å resolution have been compared with the refined structure of γ-CHT at 1.9 Å resolution. Each molecule of the α-CHT dimer was translated and rotated in the same manner in which the independent molecules of the alpha structure were compared with each other.
The final rotation matrices and translation vectors which relate the Cartesian coordinates of the monomers of α- and γ-chymotrypsin are listed in Table 20. As in the comparisons discussed previously, atoms whose thermal factors were greater than a chosen cut-off in one or the other structure were excluded from the calculations. The thermal factor cut-offs were 23 Å² for α-CHT and 15 Å² for the γ-CHT structure, the average thermal factor for γ-CHT being less than that of α-CHT. A summary of the r.m.s. differences between the structures is given in Table 21, which also indicates that the atoms on the surface and in the dimer interface show the largest thermal factors. In addition, certain residues were not included in the r.m.s. calculations due to the fact that

Table 20: Transformations Relating γ-CHT to α-CHT.

              Matrix Elements              Vector
γ-α1     -.5041   -.3321   -.7972           72.3
          .8585   -.0916   -.5046           16.3
          .0946   -.9388    .3313           11.5
γ-α2     -.4249   -.6935   -.5936           60.5
         -.8533    .0845    .5137           23.4
         -.3010    .7250   -.6195           66.5

Table 21: R.M.S. Differences Between the Independent Molecules of α-CHT and γ-CHT.

                      Molecule 1   # Atoms Removed   Molecule 2   # Atoms Removed
All Atoms                0.58                           0.60
Main Chain               0.37                           0.39
Carbonyl Oxygens         0.49                           0.44
Side Chain               0.77                           0.80
Sulfurs                  0.47                           0.35
Interior
  Main Chain             0.25          (  0)            0.26          (  0)
  Side Chain             0.53          (  0)            0.47          (  0)
Exterior
  Main Chain             0.41          ( 51)            0.44          ( 73)
  Side Chain             0.86          (190)            0.93          (202)
Interface
  Main Chain             0.53          ( 15)            0.52          ( 17)
  Side Chain             0.88          ( 38)            0.89          ( 39)
Catalytic Site
  Main Chain             0.29          (  0)            0.28          (  0)
  Side Chain             0.31          (  0)            0.52          (  0)
TRP Cluster
  Main Chain             0.20          (  0)            0.21          (  0)
  Side Chain             0.25          (  0)            0.29          (  0)
Domain 1
  Main Chain             0.34          ( 22)            0.38          ( 31)
  Side Chain             0.78          ( 81)            0.89          ( 84)
Domain 2
  Main Chain             0.40          ( 29)            0.40          ( 42)
  Side Chain             0.76          (109)            0.70          (118)

Summary Table of Deviations by Number of Atoms (a)

                 Main Chain    Carbonyl
                               Oxygens     Side Chain    Overall
0.00 - 0.25 Å    277,268       66,71       159,131       502,470
0.25 - 0.50 Å    262,219       87,78       233,230       582,528
0.50 - 0.75 Å    68,92         40,36       101,114       209,242
0.75 - 1.00 Å    6,14          8,13        22,26         36,53
1.00 - 1.50 Å    2,1           1,0         22,26         25,27
>1.50 Å          3,2           3,1         38,40         43,43

(a) First number, molecule 1 with γ-CHT; second number, molecule 2 with γ-CHT.

they were disordered in one or the other structure (9-13, 70-80 and 149-150). Examination of Tables 14 and 21 shows that the two monomers of α-CHT are more similar to each other than either is to the structure of γ-CHT. Figure 26 displays the r.m.s. deviations of the individual residues (main and side chains). The main chain is seen to possess good agreement, although there are some departures in the region of residues 80-100. The largest differences between the structures of α-CHT and γ-CHT occur in the surface residues and in the dimer interface. This is not unexpected since these regions display asymmetry in the dimer molecule. The differences in these regions may be due in part to the differences in intermolecular contacts in the crystal forms (shown in Figure 17), but again, most is probably due to surface side chain adaptability among different positions.

Figure 26. R.M.S. Differences Between α-CHT and γ-CHT. Only atoms with B <23.0 Å² (α-CHT) and <15.0 Å² (γ-CHT) are included; molecule 1, top; molecule 2, bottom; main chain, solid; side chain, broken; * intermolecular contacts in α- and γ-CHT.

A. The Active Sites

There are regions which show excellent agreement between the two molecules of α-CHT and γ-CHT. The active site residues are very similar, especially ASP-102. The r.m.s. deviations of the catalytic residues are: molecule 1, HIS-57 = 0.31 Å, ASP-102 = 0.17 Å, SER-195 = 0.63 Å and molecule 2, 0.52 Å, 0.20 Å and 0.98 Å, respectively. Superposition of the catalytic triad residues is shown
in Figure 27. The imidazole groups of HIS-57 are slightly different between the two molecules; however, the major difference in conformation between the alpha and gamma structures seems to be in the orientation of the OG in SER-195. The position of OG differs by 0.67 Å in molecule 1 and 0.95 Å in molecule 2. The χ1 angles differ by almost 50° in the alpha and gamma structures and there seems to be no indication of a hydrogen bond between the SER-195 OG and HIS-57 NE2 in either α-CHT or γ-CHT. The difference in conformation between the two alpha catalytic sites and that of γ-CHT might be the result of the difference in pH of the two crystal forms and the possibility that the imidazole might not be protonated in γ-CHT.

In the alpha structures, molecules 1 and 2 contain 5 and 6 water molecules that interact strongly with the active site residues. In the case of α-CHT, TYR-146 of the other molecule is in close proximity to the catalytic residues. However, the active site, while showing a strong similarity in the protein positions involved, shows larger deviations with respect to the solvent structure. Only three water molecules are within hydrogen bonding distance of the catalytic triad (381, 390, and 464) in γ-CHT.
Of these three, water 381 in γ-CHT is found in both α-CHT monomers (interacting with ASP-102) but water 464 is found only in molecule 2 in α-CHT. This interaction involves a hydrogen bond between the water and both TYR-146 OT and HIS-57 NE2. The list of close contacts in the dimer interface region (Figure 16) indicates that there could be an ion pair between these residues in both molecules of α-CHT. The asymmetry and the differences between the two alpha structures and that of γ-CHT are shown most clearly here in differences in solvent structure.

Figure 27. Stereoview of Superpositions of the Catalytic Site Regions of α-CHT and γ-CHT. Molecule 1 with γ-CHT, top; molecule 2 with γ-CHT, bottom. γ-CHT, bold in each case.

B. The Specificity Sites

The specificity site of γ-CHT is also similar to those of α-CHT (r.m.s. deviation = 0.51 Å, averaged over both molecules of α-CHT). One exception is residues 216-218, which are in the dimer interface region of α-CHT, located very near the local 2-fold axis. Structural changes must occur in this region upon dimerization to remove the very close contacts that would result. Conformational energy calculations reveal an extremely high non-bonded contribution to the global energy of the initial symmetrical dimer which was used as input for the least squares refinement. The final conformation of the region shows large deviations between the two alpha monomers and also with respect to the γ-CHT structure (r.m.s. deviations of 1.1-1.2 Å). The solvent structure also shows that only two of the waters found in the specificity site of γ-CHT are present in the α-CHT monomers.

C. The TRP Clusters

Interestingly, the positions of the atoms found in the TRP cluster in γ-CHT are almost identical to those found in both molecules 1 and 2 of α-CHT. Their positions are most certainly within the error of the two independent structure determinations and high resolution refinements.
The solvent structure about the TRP clusters in α-CHT and γ-CHT is similar also. Of the eight water molecules that show symmetry between the two alpha structures, four are also found in the γ-CHT TRP cluster. The stability given by solvent interactions may be a partial explanation of the similarity between the two molecules in this region. The large rigid groups possess a smaller number of energetically favorable positions which may be assumed in the structures.

D. Hydrogen Bonding

Differences in the main chain hydrogen bonding pattern between the monomers of α-CHT and the γ-CHT protein are listed in Table 22. A complete list of the main chain hydrogen bonds of γ-CHT may be found elsewhere (ref. 11) and the complete list for the α-CHT structures may be found in Appendix D. The hydrogen bonds in γ-CHT were chosen according to the following criteria: stereochemically reasonable and the donor-acceptor distance being less than 3.5 Å. In the case of α-CHT, the angle (donor-hydrogen-acceptor) was also examined. The donor-acceptor distance was kept less than 3.5 Å and the theta angle greater than 120°. Table 22 indicates a hydrogen bond involving ASN-150 in both molecules of α-CHT. This hydrogen bond is not present in the γ-CHT structure since ASN-150 is disordered. Also, an additional hydrogen bond is found in the terminal alpha-helix (245N-242O) in both α-CHT monomers.

Table 22: Main Chain Hydrogen Bond Differences with Respect to γ-CHT.

Molecule 1
Hydrogen Bonds in α-CHT Only (a)    Hydrogen Bonds in γ-CHT Only
56N-102O                            2N-120O
144N-150O (b)                       16N-143O
175N-172O                           42N-33O
245N-242O                           100N-95O
                                    119N-28O
                                    169N-164O
                                    184N-161O

Molecule 2
Hydrogen Bonds in α-CHT Only (a)    Hydrogen Bonds in γ-CHT Only
59N-56O                             16N-143O
144N-150O (b)                       42N-33O
175N-172O                           60N-56O
245N-242O                           121N-46O

(a) Distance <3.5 Å, angle >150°.
(b) ASN 150 is disordered in γ-CHT.

E.
Solvent Structure Since both the structures of a-CHT and y-CHT have now been refined at a high resolution, the opportunity presents itself to actually compare the solvent structures of the two and more specifically, to examine the differences in the dimer interface region. The amino acid residues that are part of the dimer interface region in a-CHT have already been presented in Appendix B. These residues were also used as the corresponding residues for the interface region comparisons in y-CHT. In actually choosing which waters are part or near the dimer interface region in both d-CHT and y-CHT, a combination of two methods were used. First, FRODO was used to examine the structures and the water molecules that seemed to be near the interface region were selected. Second, a nearest-neighbor calculation was performed on the interface residues with all the solvent. In this manner, solvent molecules selected -l65- could be removed if they were initially incorrect and new solvent added if they were missed in the FRODO examination. After all solvent was selected, the rotation matrices and translation vectors used to best fit the two alpha structures with that of y-CHT (Table 20) were also used to rotate the solvent structure of y-CHT to the solvent structure surrounding a-CHT. Table 23 presents a summary of the water molecules in a-CHT molecule 1 and molecule 2 that are within 1.0 A of the water positions in y-CHT. Once again, asymmetry in the final a-CHT monomers is apparent. There are 38 solvent molecules in a-CHT molecule 1 that are also found in y-CHT. The number for the other monomer is 34. These results are enhanced by the fact that in the alpha structures, the average thermal factor of these waters is lower and the average occupancy is larger than the average for the solvent globally (22.3 A2 and 0.765). 
The same can be said for the waters of γ-CHT, although it is very difficult to compare the final thermal factors and occupancies of γ-CHT with those of α-CHT since there seems to be a difference in scaling between the two molecules. The average thermal factor in γ-CHT for waters occurring in α-CHT molecule 1 is about 7.4 Å², and 5.6 Å² for molecule 2. The corresponding average occupancies are 0.819 and 0.845. These thermal factors are much lower than the average thermal factor for the γ-CHT solvent (9.9 Å²). In addition, 25 pairs of water molecules of the 38 and 34 waters in the α-CHT monomers found in γ-CHT show 2-fold symmetry. This emphasizes that these may well be the best determined solvent molecules in the α-CHT structure.

Table 23: Equivalent Water Molecules (within 1.0 Å) in both α-CHT and γ-CHT.

a.) Molecule 1 of α-CHT

        α-CHT                        γ-CHT
                   Thermal                        Thermal   Deviation
Number  Occupancy  Factor     Number  Occupancy   Factor    (Å)
498 *     0.60      21.7       324     1.00  DI     3.0      0.624
499       0.65      26.1       383     0.83        14.0      0.707
503       0.53      26.1       254     1.00  DI    26.1      0.941
504 *     1.00      21.9       430     0.82         7.2      0.376
505 *     0.81      16.1       314     0.90         2.0      0.348
508 *     1.00       9.5       311     1.00  DI     3.5      0.573
509 *     1.00      11.0       323     0.84         8.1      0.250
510 *     1.00      12.1       313     0.99         2.0      0.493
512 *     1.00      13.2       309     1.00  DI     2.0      0.244
514 *     0.99      12.6       381     1.00        10.9      0.662
516 *     1.00      12.4       304     1.00  DI     2.0      0.272
520 *     1.00      17.5       302     0.94         7.2      0.063
531 *     0.83      14.3       307     1.00         2.0      0.156
542 *     0.85      14.8       322     0.98  DI     2.0      0.149
544 *     1.00      16.5       337     0.95        13.7      0.275
546 *     0.89      21.3       358     0.66         4.1      0.293
551 *     0.77      20.6       312     1.00         2.5      0.583
553 *     0.98      16.4       365     1.00        11.4      0.532
555 *     1.00      20.7       328     1.00  DI     7.9      0.394
562       0.86      19.5       445     0.50         5.0      0.473
580       0.65      21.9       466     0.30         7.3      0.790
584       0.67      27.0       363     0.47         2.0      0.883
587       0.81      21.3       353     0.76         9.6      0.441
592 *     0.87      26.3       377     0.86        15.8      0.811
593       0.70      17.3       336     0.89        14.7      0.421
604       0.70      16.4       315     0.91         7.9      0.632
608       0.96      29.9       316     1.00        11.6      0.489
609 *     0.80      27.0       366     0.64  DI     2.5      0.732
612 *     0.72      20.5       446     0.45         6.3      0.557
623       0.90      19.9       471     0.53         6.9      0.784
625 *     0.72      23.2       332     0.90         7.0      0.241
632 *     0.76      23.1       335     1.00  DI     5.6      0.646
634 *     0.82      20.4       413     0.80         7.2      0.202
644       0.81      22.4       360     0.85         5.5      0.614
658 *     0.71      25.5       463     0.63         8.0      0.553
674       0.54      20.2       474     0.44        10.2      0.861
694 *     0.74      24.8       419     0.71         2.0      0.810
739       0.61      26.0       425     0.58  DI    15.6      0.821
Average   0.82      19.9               0.82         7.4      0.518

b.) Molecule 2 of α-CHT

        α-CHT                        γ-CHT
                   Thermal                        Thermal   Deviation
Number  Occupancy  Factor     Number  Occupancy   Factor    (Å)
497 *     0.48      17.6       311     1.00  DI     3.5      0.608
511 *     1.00      16.9       307     1.00         2.0      0.166
515 *     0.96      11.4       314     0.90         2.0      0.351
517 *     1.00      14.9       306     1.00         2.0      0.303
518 *     1.00      14.2       328     1.00  DI     7.8      0.536
519 *     1.00      12.7       323     0.84         8.1      0.231
521 *     0.92      17.8       365     1.00        11.4      0.822
524 *     0.91      15.2       302     0.94         7.2      0.243
526 *     1.00      23.2       377     0.86        15.8      0.538
529 *     0.83      13.6       304     1.00         2.0      0.385
530 *     0.97      14.1       309     1.00  DI     2.0      0.078
532       0.87      22.1       301     1.00         5.1      0.562
533 *     1.00      17.5       313     0.99         2.0      0.465
534 *     1.00      14.3       419     0.71         2.0      0.380
536 *     1.00      15.9       312     1.00         2.5      0.610
537 *     0.84      17.6       381     1.00        11.0      0.562
541       0.80      22.1       464     0.28  DI     2.0      0.923
556 *     1.00      24.5       324     1.00  DI     3.0      0.424
557       1.00      17.7       317     0.88         3.0      0.256
558 *     0.68      13.8       332     0.90         7.0      0.350
559 *     1.00      18.4       366     0.64  DI     2.5      0.891
577 *     0.88      16.9       301     1.00         5.0      0.590
579 *     0.64      14.4       337     0.95        13.7      0.366
583       1.00      16.1       371     0.74         3.4      0.220
588 *     0.77      18.2       358     0.66         4.1      0.730
600       0.84      24.7       424     1.00         9.7      0.712
618       0.82      23.9       475     0.46         3.0      0.612
636 *     0.79      25.0       322     0.98  DI     2.0      0.509
637       0.70      23.5       452     0.51         2.8      0.885
661 *     0.63      18.2       413     0.80         7.2      0.164
665       0.86      26.1       315     0.91         7.9      0.510
700       0.74      25.3       329     0.72  DI    15.5      0.877
736 *     0.69      26.0       446     0.45         6.3      0.546
738 *     0.64      26.0       463     0.63         8.0      0.749
Average   0.86      18.8               0.85         5.7      0.504

*  - Symmetric Water Molecules in α-CHT.
DI - Dimer Interface Waters.
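The pairing that underlies Table 23 can be illustrated with a short sketch. The Python below is a minimal example, not the program used in this work. It assumes a 3x3 rotation matrix and a translation vector of the kind referred to in Table 20, and it assumes a simple closest-neighbor pairing with the 1.0 Å criterion. The function names and the dictionary layout are hypothetical.

```python
import math


def transform(point, rotation, translation):
    """Apply x' = R x + t to one (x, y, z) position."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def match_waters(alpha_waters, gamma_waters, rotation, translation, cutoff=1.0):
    """Pair each gamma-CHT water with the closest alpha-CHT water within cutoff.

    alpha_waters, gamma_waters: dicts mapping water number -> (x, y, z) in angstroms.
    rotation, translation: rigid-body fit that carries gamma-CHT onto alpha-CHT.
    Returns a sorted list of (alpha_number, gamma_number, distance) tuples.
    """
    pairs = []
    for g_num, g_xyz in gamma_waters.items():
        moved = transform(g_xyz, rotation, translation)
        # Closest alpha water to the transformed gamma water, if any exist.
        best = min(
            ((math.dist(moved, a_xyz), a_num) for a_num, a_xyz in alpha_waters.items()),
            default=None,
        )
        if best is not None and best[0] <= cutoff:
            pairs.append((best[1], g_num, best[0]))
    return sorted(pairs)
```

In this sketch a γ-CHT water with no α-CHT water inside the cutoff is simply left unpaired, so the length of the returned list corresponds to a count of equivalent waters for one monomer.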
A nearest-neighbor calculation performed on the final structure of γ-CHT, along with an examination using FRODO, showed that there were 33 solvent molecules in γ-CHT located near the residues corresponding to the dimer interface region of α-CHT. Using these waters, examination of the solvent occurring in both α-CHT and γ-CHT revealed which water molecules are excluded in dimerization, which are simply displaced by a certain distance and which do not change position during the process of dimerization. Of the 33 solvents given above, the change in position of the corresponding α-CHT waters was anywhere from 0.1 Å to over 4.0 Å. It is therefore very difficult to decide from this type of distribution of distances which solvent molecules may be displaced. It was decided that any solvent molecules within 1.0 Å of each other in both the α-CHT and γ-CHT structures would be considered the same. With this assumption, the following results emerge. Of the 33 waters in the γ-CHT interface, 12 (36%) are also found in the final structure of α-CHT. Of the remaining 21 water molecules, 11 (34%) are simply displaced by less than 2.8 Å (r.m.s. = 2.04 Å) and the remaining 10 (30%) are lost during the process of dimerization. In addition to the asymmetrical structural changes in the dimer interface side chain atoms that must occur upon dimerization, the above results indicate that changes in the solvent structure in this region must also be important in stabilizing the dimerization process.

F. Concluding Remarks

The least squares refinement of α-CHT has focused on two molecules per asymmetric unit. The basic results that have emerged from this work may be applied to structures containing more than two molecules per asymmetric unit. The folding of the main chain is the same within experimental error but this does not apply generally to the side chain stereochemistry.
The deviations in the side chains may be due in part to inter-dimer contacts in the crystal, but most of the differences are probably a result of the high adaptability associated with rotational degrees of freedom in the tertiary surface structure. The results of the refinement of the α-CHT dimer clearly show that the folding of a protein molecule is basically independent of most of the detailed stereochemistry of the side chain atoms. Since a large number of protein structures are being studied by averaging about an appropriate symmetry element, the results of this work indicate that care must be exercised in analyzing the side chain configurations. Ideally, the structure should be unaveraged for correct interpretation, especially near surface and inter-subunit regions.

LIST OF REFERENCES

1. K.D. Gibson and H.A. Scheraga, Proc. Nat'l. Acad. Sci. USA, 58, 420-428 (1967).
2. M. Levitt and S. Lifson, J. Mol. Biol., 46, 269-279 (1969).
3. M. Levitt, J. Mol. Biol., 104, 59-107 (1976).
4. J.A. McCammon, B.R. Gelin, M. Karplus and P.G. Wolynes, Nature, 262, 325-326 (1976).
5. B.R. Gelin and M. Karplus, Biochemistry, 18, 1256-1268 (1979).
6. S. Harvey and J.A. McCammon, Comp. and Chem., 6, 173-179 (1982).
7. J. Hermans and M. Vacatello, Water in Polymers, American Chemical Society, pp. 199-214 (1980).
8. S. Fitzwater and H.A. Scheraga, Proc. Nat'l. Acad. Sci. USA, 19, 2133-2137 (1982).
9. A. Jack and M. Levitt, Acta Cryst., A34, 931-935 (1978).
10. B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan and M. Karplus, J. Comp. Chem., 4, 187-217 (1983).
11. G.H. Cohen, E.W. Silverton and D.R. Davies, J. Mol. Biol., 148, 449-479 (1981).
12. R.A. Blevins and A. Tulinsky, J. Biol. Chem., in press.
13. N.V. Raghavan and A. Tulinsky, Acta Cryst., B35, 1776-1785 (1979).
14. A. Tulinsky, R.L. Vandlen, C.N. Morimoto, N.V. Mani and L.H. Wright, Biochemistry, 12, 4185-4192 (1973).
15. L.D. Weber, A. Tulinsky, J.D. Johnson and M.A. El-Bayoumi, Biochemistry, 18, 1297-1303 (1979).
16. D.R. Ferro and W.C.J. Hol, Energy Minimizations of Proteins. An Attempt to Explicitly Account for Water-Protein Interactions, CECAM Workshop Project (1980).
17. A. Tulinsky, N.V. Mani, C.N. Morimoto and R.L. Vandlen, Acta Cryst., B33, 1309-1322 (1973).
18. J.A. McCammon, P.G. Wolynes and M. Karplus, Biochemistry, 33, 927-942 (1979).
19. D.R. Ferro, J.E. McQueen Jr., J.T. McCown and J. Hermans, J. Mol. Biol., 136, 1-18 (1980).
20. J. Hermans and J.E. McQueen Jr., Acta Cryst., A39, 730-739 (1974).
21. U. Burkert and N.L. Allinger, Molecular Mechanics, American Chemical Society Monograph 177, pp. 59-71 (1982).
22. M. Levitt, J. Mol. Biol., 82, 393-420 (1974).
23. M. Levitt, Protein Folding, ed. R. Jaenicke, pp. 17-39, Elsevier/North-Holland Biomedical Press (1980).
24. P.K. Warme and H.A. Scheraga, Biochemistry, 33, 757-767 (1974).
25. P.K. Weiner and P.A. Kollman, J. Comp. Chem., 3, 287-303 (1981).
26. G. Nemethy, W.J. Peer and H.A. Scheraga, Ann. Rev. Biophys. Bioeng., 39, 459-497 (1981).
27. E.B. Wilson, J.C. Decius and P.C. Cross, Molecular Vibrations, Dover, NY, p. 286 (1980).
28. W.F. van Gunsteren, H.J.C. Berendsen, J. Hermans, W.C.J. Hol and J.P.M. Postma, Proc. Nat'l. Acad. Sci. USA, 32, 4315-4319 (1983).
29. W.F. van Gunsteren and M. Karplus, J. Comp. Chem., 3, 266-274 (1980).
30. M. Carson (private communication).
31. J.P. Glusker and K.N. Trueblood, Crystal Structure Analysis: A Primer, Oxford University Press, New York (1972).
32. R.A. Blevins, M. Ragazzi and P.M. Hunt (unpublished results).
33. C. Chothia and J. Janin, Nature, 256, 705-708 (1975).
34. B. Lee and F.M. Richards, J. Mol. Biol., 33, 379-400 (1971).
35. L. Pauling and D. Pressman, J. Am. Chem. Soc., 31, 1003-1012 (1945).
36. W. Kauzmann, Adv. Protein Chem., 32, 1-63 (1959).
37. J. Janin and C. Chothia, Biochemistry, 31, 2943-2948 (1978).
38. B. Lee, Proc. Nat'l. Acad. Sci. USA, 39, 622-626 (1983).
39. M. Connolly, QCPE Bull., 1, 75 (1981).
40. C. Chothia, Ann. Rev. Biochem., 23, 537-572 (1984).
41. S.J. Wodak and J. Janin, Biochemistry, 39, 6544-6552 (1981).
42. C. Chothia, J. Mol. Biol., 105, 1-14 (1976).
43. C. Tanford, Proc. Nat'l. Acad. Sci. USA, 3, 4175-4176 (1979).
44. P.D. Ross and S. Subramanian, Biochemistry, 33, 3096-3102 (1981).
45. L.H. Jensen, Ann. Rev. Biophys. Bioeng., 3, 81-92 (1974).
46. A. Tulinsky, Methods in Enzymology. Diffraction Methods in Molecular Biology, in press.
47. R. Diamond, Acta Cryst., A31, 436-452 (1971).
48. K.D. Watenpaugh, L.C. Sieker, J.R. Herriot and L.H. Jensen, Acta Cryst., B32, 943-956 (1973).
49. J. Deisenhofer and W. Steigemann, Acta Cryst., B33, 238-250 (1975).
50. N. Isaacs, Structural Studies of Molecules of Biological Interest, eds. G. Dodson, J.P. Glusker and D. Sayre, Clarendon Press, Oxford, 1981, pp. 274-287.
51. C.J. De Ranter, X-ray Crystallography and Drug Action, eds. A.S. Horn and C.J. De Ranter, Oxford Science Publications, 1984, pp. 1-22.
52. R.C. Agarwal, Acta Cryst., A33, 791-809 (1978).
53. L.G. Hoard and C.E. Nordman, Acta Cryst., A33, 1010-1015 (1979).
54. J.L. Sussman, S.R. Holbrook, G.M. Church and S.H. Kim, Acta Cryst., A33, 800-804 (1977).
55. R.M. Burnett and C.E. Nordman, J. Appl. Cryst., 1, 625-627 (1974).
56. R.A. Blevins, C.E. Buck and A. Tulinsky (unpublished results).
57. W.A. Hendrickson and J.H. Konnert, Computing in Crystallography, eds. R. Diamond, S. Ramaseshan, K. Venkatesan, Indian Acad. Sci., Bangalore, pp. 13.01-13.26 (1980).
58. T.A. Jones, Computational Crystallography, ed. D. Sayre, Clarendon Press, Oxford, pp. 303-317 (1982).
59. B. Bush, Comp. and Chem., 3, 1-11 (1984).
60. A. Mavridis, A. Tulinsky and M.N. Liebman, Biochemistry, 33, 3661-3666 (1974).
61. J.J. Birktoft and D.M. Blow, J. Mol. Biol., 33, 187-240 (1972).
62. V. Luzzati, Acta Cryst., 3, 802-810 (1952).
63. A.R. Sielecki, W.A. Hendrickson, C.G. Broughton, L.T.J. Delbaere, G.D. Brayer and M.N.G. James, J. Mol. Biol., 134, 781-804 (1979).
64. M.N.G. James and A.R. Sielecki, J. Mol. Biol., 163, 299-361 (1983).
65. Protein Data Bank (1984), Brookhaven National Laboratory, Upton, NY.
66. G.N. Ramachandran, C. Ramakrishnan and V. Sasisekharan, J. Mol. Biol., 1, 95-99 (1963).
67. J. Janin, S. Wodak, M. Levitt and B. Maigret, J. Mol. Biol., 125, 357-396 (1978).
68. D.A. Matthews, R.A. Alden, J.J. Birktoft, S.T. Freer and J. Kraut, J. Biol. Chem., 252, 8875-8883 (1977).
69. A. Tulinsky, I. Mavridis and R.F. Mann, J. Biol. Chem., 253, 1074-1078 (1971).
70. B.A. Karcher and A. Tulinsky (unpublished results).
71. K.C. Aune and S.N. Timasheff, Biochemistry, 33, 1609-1617 (1971).
72. K.C. Aune, L.C. Goldsmith and S.N. Timasheff, Biochemistry, 33, 1617-1622 (1971).
73. A.R. Rashin, Biopolymers, 33, 1605-1620 (1984).
74. IMSL, International Mathematical and Statistical Library, Inc., Subroutine CGSPH, Edition 9, 1982.

APPENDICES

APPENDIX A

Appendix A: Extended Atoms and their Non-Bonded Parameters.

Atom    α      N_eff   r      Groups Represented
O       0.84    6      1.60   Carbonyl Oxygen, Water Oxygen
OH      1.20    7      1.70   Alcoholic Hydroxyl
OM      2.14    6      1.60   Carboxyl Oxygen
NH      1.40    7      1.65   Peptidic Nitrogen
N(2)    1.70    8      1.70   -NH2 Terminals
N(3)    2.13    9      1.75   -NH3+ Terminals
CH      1.35    6      1.85   Aliphatic-CH
C(2)    1.77    7      1.90   Aliphatic-CH2
C(3)    2.17    8      1.95   Methyl Terminal
C       1.65    5      1.80   Aromatic/Carbonyl Carbon
CR      2.07    6      1.90   Aromatic-CH
S       0.34   16      1.90   Sulfur (Cys, Met)

Hydrogen Bond Potentials

HB Bond Type   Emin (kcal/mole)   Rmin (Å)
OH-O              -3.5             2.80
OH-OH             -3.5             2.75
OH-OM             -3.5             2.85
NH-O              -3.0             2.95
NH-OH             -3.0             3.08
NH-OM             -2.5             3.10
N(2)-O            -2.5             2.87
N(2)-OH           -2.5             2.87
N(2)-OM           -2.5             2.87
N(3)-OH           -2.5             3.00

$$A^{NB} = \tfrac{1}{2}\, C^{NB} (r_i + r_j)^6 \qquad E_{MIN}^{HB} = -0.067\, (C^{HB})^6 / (A^{HB})^5$$

$$C^{NB} = \frac{(3 e \hbar / 2\sqrt{m})\, \alpha_i \alpha_j}{(\alpha_i / N_i)^{1/2} + (\alpha_j / N_j)^{1/2}} \qquad R_{MIN}^{HB} = (1.2\, A^{HB} / C^{HB})^{1/2}$$

See Reference 10 and Equations 2 and 3.
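The two hydrogen bond relations at the end of Appendix A are what one obtains for a potential of the 12-10 form, E(r) = A/r^12 - C/r^10: the factor 1.2 is 12/10, and 0.067 is approximately 1/(6 x 1.2^5). The Python sketch below assumes that form, which is inferred from the relations and is not stated explicitly in this appendix, and inverts the relations to obtain A and C from a tabulated Emin and Rmin. The function names are illustrative.

```python
def hb_coefficients(e_min, r_min):
    """Return (A, C) for E(r) = A/r**12 - C/r**10 with well depth e_min at r_min.

    e_min in kcal/mole (negative), r_min in angstroms.
    """
    a = -5.0 * e_min * r_min ** 12
    c = -6.0 * e_min * r_min ** 10
    return a, c


def hb_energy(r, a, c):
    """Evaluate the assumed 12-10 hydrogen bond potential at distance r."""
    return a / r ** 12 - c / r ** 10


# OH-O entry of the Hydrogen Bond Potentials table in Appendix A.
a_oh_o, c_oh_o = hb_coefficients(-3.5, 2.80)
```

With these coefficients, (1.2 A/C)^(1/2) returns Rmin exactly and -0.067 C^6/A^5 returns Emin to within the rounding of the constant 0.067.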
APPENDIX B

[Appendix B: table of residues by region, including the dimer interface residues cited in the text; the entries are illegible in this copy.]

APPENDIX C

Appendix C: Variable Dihedrals in the α-CHT Dimer

A.)
Molecule 1 Residue Phi Psi Omega x-l x~2 x~3 x~4 x~5 C 1 -7 ~177 139 ~64 172 -54 G 2 ~170 38 -178 v 3 ~123 94 180 ~178 P 4 ~61 140 172 30 ~36 25 A 5 ~67 ~36 174 I 6 ~102 103 ~177 -66 143 Q 7 ~67 128 178 ~47 134 71 P 8 ~56 ~ 150 ~31 42 ~37 I 16 134 177 -54 169 v 17 ~92 129 177 ~174 N 18 74 24 ~180 ~112 ~103 G 19 ~96 ~167 ~180 E 20 ~149 163 ~178 24 89 168 E 21 ~65 135 177 ~164 176 91 A 22 ~81 163 173 v 23 ~75 127 ~178 ~177 P 24 ~60 143 ~180 ~2 3 ~3 G 25 84 ~15 ~180 S 26 ~72 ~13 178 81 w 27 -133 72 ~171 ~64 110 P 28 -71 ~11 177 30 ~38 30 w 29 ~89 ~14 ~178 33 95 Q 30 ~66 127 177 165 91 62 v 31 ~121 160 177 ~68 S 32 ~118 133 176 158 L 33 ~100 128 179 ~58 169 Q 34 ~119 136 ~177 ~70 ~177 2 D 35 ~81 ~172 180 67 ~158 K 36 -72 ~42 179 ~116 -51 ~178 ~123 T 37 ~45 ~44 ~179 117 G 38 117 36 174 P 39 ~141 160 ~173 ~170 64 H 40 ~46 120 176 179 86 F 41 ~125 ~11 ~180 64 100 C 42 ~168 166 176 ~106 ~140 ~86 ~92 G 43 ~106 ~176 176 G 44 ~166 179 179 S 45 ~136 142 178 ~53 L 46 ~81 131 176 ~82 157 I 47 ~108 ~19 ~179 62 167 N 48 -153 174 175 ~77 165 E 49 ~75 ~5 ~179 ~89 ~15? 
~20 N 50 ~123 -7 ~171 ~92 -77 w 51 ~143 145 180 ~65 81 Residue << Harmzmxmzx'nnmrm0uxwmwmooommo><<>e 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 Phi '127 '101 '148 '83 '63 '59 '63 60 '62 '127 '69 '97 '70 '124 '99 '112 '116 87 '83 '111 '103 '96 '112 '65 '55 177 '136 '92 '140 '129 '94 '98 ~76 '97 '94 '155 '107 '121 '82 '67 '60 ~78 '49 '119 ~70 '97 '98 60 Psi 135 139 '177 145 '43 '17 '15 36 137 ' 174 '8 '10 161 147 132 119 148 133 '59 141 134 '26 135 169 98 '171 111 100 131 152 127 '42 149 138 95 148 126 '20 '14 132 111 '5 '33 '21 19 Omega 173 174 179 175 '178 '179 '177 179 '178 '179 180 177 '179 177 174 178 179 '178 178 '178 '179 '179 '180 179 179 180 '179 180 179 177 '178 178 179 179 '180 '179 180 '177 178 179 179 '178 '176 '179 178 '174 '179 '178 -179- x-l 169 73 '172 83 '69 '165 76 '88 '43 '63 73 '168 x~2 '100 '92 136 80 52 144 '167 '178 169 '67 '170 129 159 '135 168 160 83 '89 '28 59 '15 139 '54 '114 '71 '137 '168 45 140 179 156 142 103 '140 174 169 136 '165 '154 179 Residue N100 N101 0102 1103 T104 L105 L106 K107 L108 S109 T110 A111 A112 5113 F114 5115 0116 T117 V118 5119 A120 V121 C122 L123 9124 5125 A126 S127 0128 0129 F130 A131 A132 6133 T134 T135 C136 V137 T138 T139 6140 W141 6142 L143 T144 R145 Y146 A149 Phi ~79 64 '80 '138 '149 '105 '122 '100 '75 '93 '155 '53 ~74 '116 '54 '173 ~76 '104 '125 '164 '100 '87 '85 '103 '82 '99 '92 '73 '109 '130 ~70 '49 109 '61 '89 '115 '138 '132 '137 170 '110 -71 '52 '91 '158 (Psi 150 47 64 155 139 140 118 115 146 108 152 162 155 '161 '19 '16 120 '180 149 158 154 147 179 159 '25 150 134 102 141 129 '18 140 109 '172 146 147 156 178 21 172 144 ~7 142 '42 162 -180- Omega 179 '175 179 179 175 '177 179 '179 179 '178 '178 177 '179 179 '174 '178 '179 179 '179 '177 175 177 178 175 173 176 '179 '176 180 178 '177 175 179 '179 179 '180 '173 176 '179 176 179 178 '179 179 179 179 '179 x~1 '160 '52 '167 '172 '80 '69 '59 160 '60 105 '52 '158 '76 
72 '8 61 '179 70 66 '70 '130 35 '69 54 5 178 ~73 '39 '63 '55 157 '173 69 '75 '43 '49 179 X'Z 26 '45 '157 '47 180 '173 100 '172 18 36 '54 15 '32 17 66 109 '82 174 135 60 '152 89 '1 172 '64 16 99 '89 112 '125 173 -18l- Residue Phi Psi Omega x~1 x~2 x~3 x~4 N150 '104 14 175 ~76 '23 T151 '99 ~138 '179 73 P152 ~81 135 178 39 ~44 31 0153 ~85 ~29 179 ~70 152 R154 '95 140 179 '74 158 '114 '34 L155 ~64 129 179 '170 74 0156 '105 148 ~179 :51 ~71 ~49 0157 ~139 '165 174 65 '165 ~28 A158 ~167 144 173 $159 ~93 144 ~180 ~69 L160 '158 147 179 75 167 P161 '89 145 178 43 '37 20 L162 ~77 154 179 '66 ~171 L163 '122 158 172 ~78 72 $164 '81 154 179 '26 N165 '60 '29 179 '91 '154 T166 ~55 ~54 ~179 ~90 N167 ~66 ~42 ~179 ~51 81 C168 ~65 ~31 180 '164 167 ~80 ~166 K169 ~70 ~26 176 ~53 ~153 171 96 K170 '51 '39 '180 '84 '174 112 '133 2171 ~93 '57 ~180 ~47 ~101 W172 ~88 '8 ~180 ~63 113 6173 55 ~146 ~176 T174 ~56 ' ~16 '180 53 K175 -71 ~15 ~177 ~60 ~146 ~66 ~48 1176 ~91 116 179 ~54 ~69 K177 '102 165 '179 ?81 150 174 57 0178 '56 '31 ~178 '108 50 A179 '94 26 179 M180 ~126 151 179 ~59 ~179 ~180 I181 ~139 121 173 139 ~135 C182 ~97 150 ~180 ~51 ~166' ~80 167 A183 '148 150 179 6184 102 '133 '178 A185 63 18 ~174 $186 '110 8 ~178 53 6187 110 15 ~180 V188 '149 162 '176 '62 $189 ~166 142 177 151 $190 ~72 156 180 '65 C191 '147 173 177 '155 41 98 '168 M192 '39 129 180 '80 105 '157 6193 101 ~15 180 0194 ~86 ~11 177 ~69 140 $195 '47 138 179 '84 6196 82 ~15 ~177 6197 '75 177 '180 '182- Residue Phi Psi Omega x~1 x~2 x~3 x~4 P198 ~88 157 170 33 '35 23 L199 ~129 ' 106 180 170 68 V200 ~115 152 ~179 ~64 C201 ~140 143 ~176 -43 ~89 99 ~137 K202 ~85 122 179 ~132 179 160 ~159 K203 '135 128 178 '172 '159 '134 ~167 N204 58 33 178 ~65 ~16 6205 73 15 178 A206 ~129 153 177' w207 ~83 130 ~180 ~64 93 T208 ~119 138 ~179 ~48 L209 ~76 117 ~177 ~173 70 V210 ~107 ~29 ~176 166 6211 ~141 158 171 I212 '122 124 180 ~52 156 V213 ~53 122 ~173 -174 - $214 ~119 ~62 178 . 
~169 w215 ~157 -179 ~180 53 ~89 ‘ 6216 173 '160 '176 5217 ~50 126 ~176 162 $218 ~65 '8 ~178 36 T219 ~125 4 175 ~86 C220 63 31 180 ' ~60 ~168 98 41 $221 ~57 141 -179 168 T222 ~86 ~3 178 64 5223 '94 '8 '171 '92 T224 '122 140 179 ~69 P225 ~80 145 174 19 ~13 1 6226 ~80 148 ~178 V227 ~114 132 ~177 ~180 Y228 ~126 152 178 '54 78 A229 ~74 131 ~177 R230 ~79 105 ~177 ~165 179 ~76 ~90 V231 '59 '40 177 174 T232 ~54 '33 179 ~114 A233 '84 ~12 ~178 L234 '113 '6 '173 '78 13 V235 '68 '40 178 85 N236 ~51 ~40 ~179 ~71 ~21 w237 ~65 ~45 ~180 173 87 V238 ~55 ~62 ~178 170 0239 ~51 ~41 ~179 ~60 '55 ~9 0240 ~69 ~39 176 ~70 ~171 ~101 T241 '63 '47 '179 '37 L242 '57 ~46 '180 '84 179 A243 ~58 ~36 ~178 A244 ~96 ~5 ~176 N245 '125 157 ~77 '69 8.) Molecule 2 Residue <MMQZWIHWF7=OH74NMMMQOUWNO><<>H Phi '144 '93 '65 '63 '60 53 '64 '131 '64 ’ ~98 '89 '124 '94 '112 '118 92 '90 '121 '105 '82 120 '46 -'54 173 '152 '86 '87 118 103 113 ~79 110 ~75 '152 106 '119 '86 '68 '56 '98 '57 '106 '67 '95 '78 75 '82 64 Psi '171 145 '34 '24 '5 36 143 158 '4 '6 162 137 122 117 148 10 142 '59 126 117 '26 145 '179 56 162 124 111 134 153 118 '48 146 133 118 120 '10 '4 129 105 '3 '53 '28 19 152 51 Omega 177 180 179 177 ~179 179 174 177 177 17s ~179 177 ~180 ~179 -1so ~178 173 -179 -179 17a ~180 180 ~180 ~180 ~180 ~178 177 177 -177 179 -177 ~180 ~177 178 178 ~177 ~178 -179 180 -17a ~176 ~177 179 179 -175 179 179 ~174 ~184- X'l '167 84 '62 '169 86 57 102 '55 142 167 176 '86 165 '169 '83 59 63 '178 ~70 '130 59 ~72 '43 ~43 ~74 '59 '159 176 '66 42 163 4 '69 171 '166 80 ~74 '58 '42 '171 '58 X'Z '110 '91 139 '115 74 164 166 '169 78 131 165 88 152 59 179 148 88 171 167 '154 '63 177 '39 23 '47 111 38 163 67 '172 '143 145 '139 155 '160 '109 158 139 '146 '164 Residue 0102 1103 T104 L105 L106 K107 L108 8109 T110 A111 A112 S113 8114 $115 0116 T117 V118 5119 A120 V121 C122 L123 P124 5125 A126 5127 0128 0129 8130 A131 A132 6133 T134 T135 C136 V137 T138 T139 6140 W141 6142 L143 T144 R145 Y146 A149 N150 T151 Phi '83 150 140 114 109 '94 ~70 
'47 129 '59 '80 107 '64 161 '58 102 128 173 '97 '92 '90 105 ~78 107 '55 '92 ~74 127 103 '80 '59 108 '49 '85 114 132 124 134 158 110 ~79 '45 '94 155 '54 112 '89 Psi 79 145 149 139 118 117 122 ~48 125 157 150 111 146 '171 '34 '11 130 171 155 158 154 135 180 156 '34 155 110 115 152 131 '25 133 112 175 138 152 163 176 170 143 '8 135 131 170 19 117 -185- Omega 178 178 177 '180 176 '178 '180 179 '179 177 177 '180 '172 '178 '178 179 '177 '178 179 176 177 177 175 178 178 179 '178 179 '178 176 '180 '179 179 '180 '177 177 179 174 '180 179 '179 174 '179 '180 '179 178 179 X'l '168 '176 '85 '68 '45 153 '49 '141 32 '36 '74 131 ~73 74 179 '66 57 '51 ~75 33 153 81 '174 '57 '64 ~71 '54 176 '173 50 '55 67 '150 178 '82 76 X'2 '167 162 162 '171 '178 '176 80 ~70 '168 '37 157 99 89 '126 ~74 '158 142 60 '23 X'3 108 28 107 '132 72 '95 '37 Residue P152 0153 R154 L155 0156 0157 A158 5159 L160 P161 L162 L163 5164 N165 T166 N167 6168 K169 K170 Y171 W172 6173 T174 K175 1176 K177 0178 A179 M180 1181 C182 A183 6184 A185 5186 6187 V188 $189 5190 C191 M192 6193 0194 $195 6196 6197 9198 L199 Phi '70 '92 '95 '57 '114 141 155 '93 166 '85 '82 121 ~74 '63 ~70 ~70 ~72 ~71 '59 ~72 '116 51 '59 '64 '84 '104 '36 '85 '133 '137 '116 '161 105 63 '120 113 '133 '151 ~74 '140 ~41 105 ~78 '31 87 '80 '85 '124 Psi 148 '33 141 136 154 159 140 140 154 151 156 157 159 '34 '37 '31 '27 '17 '54 '45 '2 '127 '22 '26 111 170 '61 37 151 141 157 160 '138 19 14 '2 160 137 152 174 125 '21 '26 127 '14 174 152 111 Omega 178 180 176 -175 -179 176 175 -177 179 176 175 177 ~180 ~180 175 ~178 -17a 178 -179 '175 ~178 -1ao ~179 ~180 ~179 175 180 179 177 -179 180 176 -17a ~l72 179 173 179 173 ~179 ~179 -177 -179 174 177 -179 ~177 168 -177 '186- X'l 26 '81 '55 178 '54 52 '91 78 12 '63 '96 23 104 90 '69 165 '89 '92 '65 '58 '69 '95 '51 '39 '43 '54 158 '53 59 '60 179 '80 155 '69 ~79 104 33 173 X'Z '38 94 '164 '80 172 157 171 51 '144 114 175 166 168 ~78 102 '161 '59 '38 117 '179 '168 '172 44 101 150 '37 81 X'3 35 '112 '48 55 '19 '84 '162 '158 
'120 '152 '178 '84 89 80 154 '169 '172 '134 139 '59 157 175 '175 Residue V200 €201 K202 K203 N204 6205 A206 W207 T208 L209 V210 6211 1212 V213 $214 W215 6216 S217 5218 T219 C220 5221 T222 5223 T224 P225 6226 V227 Y228 A229 R230 V231 T232 A233 L234 V235 N236 W237 V238 0239 0240 T241 L242 A243 A244 N245 Phi '118 '131 '86 '132 61 69 '122 '96 '121 ~72 '97 '143 '118 '56 '115 '162 171 '41 '56 110 56 '59 '88 '122 121 '65 '86 122 124 '64 '84 '56 '54 '69 120 '68 '66 '61 '55 '52 '60 '61 '60 '58 '67 '132 Psi 153 147 120 117 48 8 161 129 143 100 '32 160 119 123 '62 178 '166 127 '30 10 31 140 13 130 148 161 127 149 133 106 '34 '43 '23 '8 '23 '34 '45 '65 '49 '44 '49 '44 '51 '20 '175 ~187- Omega 176 '179 180 178 '177 '176 '177 179 180 '176 179 175 '179 '175 '176 '178 '176 180 '180 '177 180 '177 179 180 '177 174 ~176 '178 175 '179 ~173 '180 180 '174 '177 179 '179 177 177 '180 176 '180 '180 '178 180 X'1 '78 '43 '136 176 ~71 ~73 '46 '171 173 '67 176 '169 56 175 59 '95 '51 165 76 ~71 '47 20 '174 '59 178 167 '105 '46 72 ~75 164 165 '57 '51 '50 '72 '46 '95 '175 176 '37 97 65 '175 '29 78 '172 177 '25 83 '58 171 '161 ~78 107 169 '167 89 27 '84 10 '173 ~126 ~177 '65 44 '67 167 APPENDIX D Appendix D: Hydrogen Bonds in the a-CHT Dimer A.) Molecule 1 DODOI‘ 207 16 17 20 157 22 26 30 119 46 30 31 44 32 67 33 mi N .9. 
O zzzzzzgzzzzzzzzzzzzzzgzzzzzzzzzzzzzzzzzzzzzz H N HN HN3 HN HN HN HN HN HN HN HN HNE2 HN HN HN HN HN HN HN HN HN HN HN HE2 HN HN HN HN HN HN HN HN HN HN HN HN HN HN H01 HN HN HN HN HN HN Acceptor 2 194 189 157 20 155 23 27 28 29 31 44 31 67 32 42 33 65 O 001 OD OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO NM ' ~188- H'A 2.03 1.71 1.79 1.88 1.68 1.71 2.29 2.20 2.33 1.93 2.01 2.04 2.26 2.21 1.78 2.01 1.91 2.00 1.88 2.26 2.28 1.53 1.51 1.59 1.88 1.94 2.06 2.03 1.75 1.81 1.83 2.09 1.80 1.92 1.96 2.04 1.86 1.71 1.89 2.12 2.16 1.86 1.85 1.86 D'A 3.02 2.62 2.74 2.82 2.63 2.69 3.28 3.14 3.27 2.86 3.00 3.03 3.04 3.16 2.76 '3 01 2.85 2.98 2.82 3.21 3.21 2.52 2.48 2.56 2.79 2.87 3.02 3.02 2.66 2.81 2.77 2.97 2.77 2.61 2.79 3.04 2.85 2.68 2.83 3.08 3.08 2.85 2.75 2.85 Theta 170.63 148.79 155.49 155.36 155.66 164.95 170.07 155.14 156.23 153.20 171.49 168.20 133.95 158.46 163.89 177.13 154.94 166.78 154.31 157.12 154.92 171.78 161.63 164.48 150.45 153.59 162.43 171.10 149.21 173.27 155.27 146.40 163.61 123.02 139.16 173.76 167.78 161.07 155.86 161.84 150.87 173.74 148.20 173.34 128 134 133 162 136 160 137 200 138 158 140 142 143 144 145 156 184 163 168 169 167 171 ZZZZZZZZZZZZZZZ2222222ZZZZZZZZZZZZZZZZZZZZZZZZZ'Z HN HN HN HN HN HNEl HN HN HN HN HN HN HN HN HN HNDZ HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HH22 HNE2 HN HN HN HN HN HN Acceptor 68 70 72 153 81 113 84 107 107 87 105 89 103 91 91 93 100 95 101 118 115 115 116 207 122 128 125 131 162 134 160 136 200 137 158 138 156 194 192 150 150 154 161 182 164 165 H 01 co O 081 002 O O [‘3 H OOOOOOOOOOOOOOO OG OOOOOOOOOOOOOOOOOOOOO O U l-' -189~ H'A 1.87 2.34 1.99 2.13 2.27 1.93 2.01 1.86 2.33 2.04 1.63 1.96 2.36 2.18 2.04 1.98 2.23 2.19 2.12 '1.94 2.34 2.43 2.02 2.07 1.87 1.32 2.27 1.90 1.97 1.88 1.62 2.23 1.91 1.87 1.96 2.37 2.20 1.84 1.92 1.78 1.96 2.01 1.89 2.07 2.03 2.28 1.98 2.21 D'A 2.80 3.21 2.95 2.94 3.15 2.80 3.00 2.84 3.29 3.04 2.56 2.96 3.30 2.86 3.03 2.97 3.11 2.91 2.93 2.81 3.21 3.26 2.96 3.06 
2.84 2.20 3.24 2.84 2.92 2.81 2.58 3.12 2.80 2.83 2.92 3.12 3.16 2.74 2.81 2.73 2.95 3.00 2.85 3.00 2.96 3.14 2.65 3.05 Theta 153.00 144.55 160.87 137.28 145.92 143.37 169.32 164.43 161.90 176.71 152.92 173.38 156.69 123.21 169.48 169.65 145.28 127.04 136.68 143.95 145.58 139.33 154.37 171.21 160.30 143.16 163.85 154.68 156.06 154.49 160.50 147.20 148.01 158.71 158.98 131.53 159.46 148.21 145.70 157.11 172.71 169.52 157.88 153.69 155.20 143.09 121.60 139.91 -190- Donor Acceptor H-A D-A Theta 172 N HN 168 O 1.87 2.85 166.90 173 N RN 169 O 1.63 2.52 146.41 191 N HN 194 ODl 1.79 2.76 163.45 194 N EN 191 O 1.99 2.91 150.56 197 N HN 194 O 2.21 3.18 165.29 196 N RN 213 O 1.93 2.91 167.14 213 N RN 197 O 2.00 2.97 161.62 199 N HN 211 O 2.06 2.88 138.02 210 N RN 199 O 1.71 2.69 165.16 211 N RN 199 O 1.96 2.92 157.88 201 N RN 208 O 2.04 3.01 160.61 208 N HN 201 O 1.84 2.80 162.41 203 N HN 206 O 1.79 2.77 166.06 206 N EN 203 O 1.92 2.88 161.72 231 N HN 210 O 2.08 3.06 165.31 212 N HN 229 O 2.37 3.26 147.58 229 N HN 212 O 1.88 2.84 159.19 214 N HN 227 O 2.24 3.08 141.19 215 N RN 227 O 2.02 3.00 165.17 227 N RN 215 O '1.88 2.87 169.55 221 N RN 217 06 1.97 2.95 167.36 219 N RN 217 06 2.12 3.09 161.79 220 N RN 217 O 1.82 2.75 153.28 233 N RN 230 O 2.23 3.05 138.27 234 N RN 231 O 2.05 2.96 150.91 238 N HN 234 O 2.34 3.25 150.36 239 N RN 235 O 1.96 2.91 155.87 240 N HN 236 O 2.11 3.06 157.52 241 N RN 237 O 2.05 3.01 160.59 242 N HN 238 O 1.76 2.73 161.11 243 N RN 239 O 1.82 2.76 154.58 244 N RN 240 O 2.30 3.16 143.60 H-A Hydrogen to Acceptor Distance D-A Donor to Acceptor Distance Theta Donor'Hydrogen-Acceptor Angle (Degrees) 8.) 
Molecule 2 Donor Acceptor 2 N RN 120 O 207 N HN 2 O 16 N HN3 194 OD1 17 N RN 189 O 18 ND2 HNDZ 187 O 20 N HN 157 O 157 N82 HNEZ 20 0E2 157 N HN 20 O 154 NH2 HH22 21 081 22 N HN 155 O 26 N HN 23 O 30 N HN 27 O 46 N HN 29 O 30 NE2 HNEZ 31 O 31 N HN 44 O 32 N HN 67 O 67 N HN 32 O 33 N HN 42 O 41 N HN 33 O 34 N HN 65 O 65 N HN 34 O 35 N HN 39 O 35 N HN 35 0D2 37 N HN 35 001 38 N HN 35 O 43 N HN 195 O 45 N HN 53 O 53 N HN 45 O 121 N HN 46 O 47 N HN 51 O 48 ND2 HND2 47 O 112 N HN 49 O 108 N HN 50 O 52 N RN 106 O 106 N HN 52 O 54 N HN 104 O 55 N HN 54 061 104 N HN 54 O 58 N HN 55 O 59 N HN 56 O 57 N HN 102 OD2 57 NDl H01 102 OD1 61 N HN 64 OD1 64 N HN 61 O 66 N HN 83 O 83 N HN 66 O -191- H'A 1.93 1.99 1.43 1.87 2.08 2.06 2.28 2.00 1.70 1.93 2.19 2.06 2.00 1.82 1.98 1.95 1.87 1.77 2.13 1.80 2.03 2.28 1.52 1.93 2.08 1.61 1.97 1.68 2.18 2.04 1.60 1.65 1.80 2.04 2.02 1.98 2.01 2.08 2.35 2.35 1.88 1.59 2.23 1.67 1.98 1.77 D'A 2.93 2.92 2.33 2.83 2.97 2.94 3.24 2.91 2.45 2.92 3.16 2.99 2.92 2.78 2.97 2.92 2.85 2.77 3.06 2.80 3.00 3.18 2.23 2.85 2.98 2.60 2.92 2.65 3.10 3.02 2.49 2.63 2.78 3.00 2.97 2.95 2.69 3.02 3.22 3.30 2.79 2.58 3.20 2.64 2.96 2.76 Theta 173.42 154.49 145.90 160.48 146.46 144.86 159.44 150.26 127.79 175.50 164.31 154.52 151.16 158.83 170.93 163.07 166.39 173.84 154.92 173.24 164.55 148.91 123.42 151.43 148.25 168.03 157.69 163.34 152.71 165.53 145.08 164.08 165.11 158.07 157.40 165.14 122.89 156.17 145.56 157.99 150.37 169.02 161.96 164.04 165.95 168.92 143 144 184 163 167 168 2222ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ HN HN HN HN HN HN HN HNEl HN HN HN HN HN HN HNDZ HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN HN (h (1) 0000000 m U l—' N H O 01 000000000 00 C10 FJN 131 162 134 160 136 200 137 158 138 156 140 194 192 150 161 182 164 164 O OOOOOOOOOOOOOOOOOO ~192- H'A 1.85 1.66 2.17 1.93 1.80 2.15 2.05 2.41 2.10 1.99 1.86 1.67 1.87 1.95 2.13 2.22 2.04 1.88 1.75 12.25 2.26 1.87 2.04 2.41 2.31 2.24 1.77 
1.58 2.05 2.03 1.90 1.91 2.21 1.77 1.87 2.05 1.82 1.88 2.12 2.08 2.18 1.89 2.16 2.07 2.04 2.10 2.23 2.10 3.14 3.04
Theta 163.47 161.79 140.24 145.08 159.86 138.24 174.59 147.73 151.68 147.63 170.45 164.35 154.05 169.35 149.12 152.78 158.75 154.71 147.70 120.32 129.71 158.47 160.28 141.28 145.85 169.02 163.16 135.59 154.25 166.56 161.49 149.60 146.22 159.63 152.01 154.77 172.76 151.83 141.32 149.83 162.33 145.25 155.21 156.25 148.52 159.47 149.75 154.45

Donor          Acceptor  H-A    D-A    Theta
169 N   HN     165 O     2.13   2.89   131.05
170 N   HN     167 O     2.24   3.05   136.76
172 N   HN     168 O     1.95   2.89   155.53
173 N   HN     169 O     1.84   2.69   140.21
175 N   HN     172 O     2.27   3.22   156.37
180 N   HN     177 O     2.36   3.09   128.93
178 N   HN     178 OD1   2.13   3.12   169.13
180 N   HN     178 O     2.34   3.10   131.39
230 N   HN     179 O     2.04   3.03   167.74
230 NE  HE     180 O     2.11   2.90   134.05
181 N   HN     228 O     2.11   3.05   155.60
228 N   HN     181 O     1.66   2.61   158.89
183 N   HN     226 O     2.29   3.21   151.74
226 N   HN     183 O     2.03   2.94   150.58
187 N   HN     222 O     2.04   2.93   146.31
191 N   HN     194 OD1   1.91   2.88   160.66
194 N   HN     191 O     1.96   2.90   155.46
197 N   HN     194 O     2.25   3.13   146.37
196 N   HN     213 O     1.93   2.92   168.15
213 N   HN     197 O     1.78   2.77   168.26
199 N   HN     211 O     2.12   3.01   146.50
211 N   HN     199 O     2.14   3.12   165.60
210 N   HN     199 O     1.88   2.81   153.26
201 N   HN     208 O     1.81   2.79   166.87
208 N   HN     201 O     1.80   2.77   161.84
203 N   HN     206 O     1.58   2.52   153.35
206 N   HN     203 O     1.94   2.90   161.29
231 N   HN     210 O     1.94   2.92   166.59
212 N   HN     229 O     2.28   3.17   146.71
229 N   HN     212 O     1.98   2.83   140.85
214 N   HN     227 O     2.15   3.01   144.34
215 N   HN     227 O     2.01   2.99   164.42
227 N   HN     215 O     1.88   2.84   160.24
221 N   HN     217 OG    2.03   2.96   153.27
220 N   HN     217 O     2.07   3.06   170.12
223 N   HN     221 OG    2.33   3.30   162.96
233 N   HN     230 O     2.29   3.07   133.54
234 N   HN     231 O     2.04   2.96   153.51
235 N   HN     231 O     1.97   2.68   125.30
238 N   HN     234 O     2.34   3.28   156.51
239 N   HN     235 O     1.75   2.75   170.19
240 N   HN     236 O     1.94   2.93   176.93
241 N   HN     237 O     1.87   2.85   163.00
242 N   HN     238 O     1.84   2.81   162.11
243 N   HN     239 O     1.71   2.68   160.97
244 N   HN     240 O     1.99   2.95   161.47
245 ND2 HND2   241 O     1.94   2.70   131.21

Appendix E: Solvent Molecule Positions in the α-CHT Dimer

Exterior
498,499,500,501,502,503,504,505,506,507,508,510,511,514,515
520,521,522,524,526,527,528,531,532,533,534,535,536,537,538
539,542,543,544,546,547,548,551,552,553,554,557,558,559,560
561,562,563,564,565,566,567,569,571,572,573,575,576,577,578
579,580,581,582,583,584,587,588,589,590,591,592,593,594,595
596,597,598,599,600,601,603,605,607,608,609,610,611,612,613
615,616,617,618,620,621,623,624,625,626,627,628,629,631,632
633,634,635,636,637,638,639,640,641,642,643,644,645,646,647
648,649,650,651,652,653,654,656,657,658,659,660,661,663,664
665,666,667,669,670,671,673,674,675,676,677,678,679,680,681
682,683,685,686,687,688,689,691,692,693,694,695,696,697,698
700,701,702,704,705,706,707,708,709,710,711,712,713,714,715
716,717,718,719,720,721,722,723,724,725,726,727,728,729,730
731,732,733,734,735,736,737,738,739,740,741,742

Interior
509,512,513,516,517,518,519,529,530,555,565,603,606,614,703
668,684,690,699,732

Interface
496,497,523,525,540,541,545,549,550,568,570,574,585,586,602
619,622,630,655,662,672,734

Appendix F: Protein-Solvent Hydrogen Bonds in the α-CHT Dimer

A.) Molecule 1

Donor 239 Q ZZZZZZOOZZZZZZOZZZZZZ QC) 8 NE2
Hydrogen HN HN HNZ HN HN HN HO HN HN HN HND1 HN HN HO HO HN HN HN HN HNZ HN HND1 HNE1 HN HN HO HN HN HN HN H23 HN HN HO HN HO HO HO HN HN HN HNE2
Acceptor (WAT) 564 544 505 663 674 531 531 632 573 587 593 514 514 740 584 546 561 560 512 611 568 523 531 596 539 712 572 504 508 623 543 581 705 542 527 701 739 498 520 510 652 565
H-A 1.92 1.87 2.10 1.92 1.93 2.00 1.79 1.77 2.29 2.07 2.02 2.23 2.10 1.97 2.35 2.30 1.77 2.26 1.86 1.58 1.95 2.07 1.83 2.00 2.07 1.98 2.33 2.06 1.72 1.92 2.01 1.86 2.26 1.82 2.26 1.39 2.31 2.25 1.75 2.06 2.31 1.78
D-A 2.90 2.86 2.98 2.92 2.89 3.00 2.62 2.69 3.26 3.06 3.01 2.88 3.09 2.63 3.14 3.29 2.74 3.24 2.85 2.50 2.90 3.05 2.79 2.92 3.02 2.86 3.25 3.04 2.72 2.91 2.97 2.84 3.23 2.81 3.22 2.26 3.28 3.10 2.72 3.05 3.21 2.61
Theta 166.55 172.56 146.18 171.77 159.52 174.13 138.12 151.44 163.62 170.91 171.83 121.12 171.64 121.34 135.14 170.00 161.31 165.21 174.49 149.36 157.06 168.01 160.86 151.81 159.28 145.77 152.30 165.77 173.11 169.37 159.59 165.84 164.72 170.11 161.57 140.17 165.74 140.99 161.74 169.12 149.31 138.34

B.) Molecule 2

Donor 204 214 222 228 230 232 236 239 61 02 ZZZZZZOZZZZZZZOZZZZZZZZ 020 32 C) NHl NEZ
Hydrogen HN HN HNZ HN HN HN HO HN HN HNE2 HE2 HN HN HN HN HN HN HN HN HO HN HN HN HN HN HN HN HO HN HND2 HN HN HN HN HO HN HO HH11 HN HN HNE2
Acceptor (WAT) 579 579 515 563 661 511 511 703 583 594 530 577 737 536 557 696 528 697 537 532 675 588 578 552 530 525 631 618 576 522 522 497 662 626 636 660 556 659 524 533 638
H-A 1.71 2.44
1.67 2.20 2.10 2.12 2.09 2.24 1.87 2.19 2.40 2.14 1.96 2.09 2.02 2.20 2.20 1.70 1.52 1.75 1.86 1.57 2.16 1.97 2.10 1.82 2.05 1.99 2.02 2.14 1.49
D-A 2.89 2.85 3.19 2.78 3.25 3.13 2.96 2.70 2.99 2.69 3.20 2.63 3.06 3.16 3.05 3.11 3.05 3.23 2.86 3.11 3.22 3.08 2.95 3.06 3.01 3.04 3.20 2.57 2.49 2.42 2.83 2.56 2.98 2.92 3.06 2.79 2.94 2.86 3.00 3.13 2.42
Theta 124.40 171.64 168.29 162.63 140.07 172.23 147.67 133.42 152.28 166.27 170.43 158.69 158.88 161.25 158.89 174.67 160.63 169.16 169.42 152.81 139.24 157.68 169.80 165.65 168.80 139.82 177.04 142.97 161.59 120.71 162.76 169.40 137.75 158.39 159.44 162.89 147.41 143.85 164.73 168.93 152.80

H-A: Hydrogen to Acceptor Distance
D-A: Donor to Acceptor Distance
Theta: Donor-Hydrogen-Acceptor Angle (Degrees)

Appendix G: Polar Protein Atoms - Solvent Interactions in the α-CHT Dimer

Water Number 497 504 505 506 508 509 510 511 512 514 515 516 519 520 522 524 525 528 529 532 533 534 536 537 538 544 547 549 551 553 554 557 559 560 563 565 566
Protein VAL THR LEU GLY GLY GLN VAL THR GLY THR GLN GLY ALA GLY GLY THR GLY VAL LEU VAL CYS THR ASP ASP ALA THR SER PRO GLN ALA PRO GLY GLN ASP PHE PRO GLN ALA GLU VAL VAL ASN THR PRO ASN
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Hr-nnaK3NP-hu-kuakaHr-mnuniwromnvhambohnuunanawrQRHAFJNP-FHAFJNF-FHAKJN
Atom 1880 2240 1630 1400 1420 34081 1880 2240 1960 2320 157081 1930 1790 1400 1420 1390 1960 2100 1630 2100 580 620 640 1940 560 2320 1590 280 300 1790 80T 250 1160 128002 410 280 300 1200 20081 670 600 236001 1440 1240 204001
Distance (Å) 2.53 2.86 2.62 2.76 2.65 2.81 2.55 2.83 2.76 2.93 2.73 2.70 2.95 2.83 2.68 2.62 2.94 2.84 2.74 2.77 2.93 2.59 2.46 2.74 2.60 2.79 2.99 2.88 2.65 2.88 2.67 2.62 2.61 2.23 2.73 2.90 2.54 2.70 2.94 2.75 2.39 2.48 2.81 2.50 1.95

Water Number   Protein    Atom      Distance (Å)
567            ALA (2)    233 O     2.87
571            THR (1)     62 O     2.57
573            THR (1)     62 O     2.21
574            PHE (2)     41 O     2.80
576            ASN (2)    167 OD1   2.28
579            GLY (2)     25 O     2.78
               GLN (2)    116 O     2.68
580            LYS (1)    202 O     2.69
583            TRP (2)     27 O     2.88
584            GLY (1)     69 O     2.72
586            MET (1)    192 O     3.00
589            LEU (2)    155 O     1.92
590            PRO (1)      8 OT    2.23
602            SER (2)     96 O     2.90
603            TRP (1)    215 O     2.95
               VAL (1)    227 O     2.90
604            LYS (1)    175 O     2.71
606            TRP (2)    215 O     2.83
611            ALA (1)    149 O     2.73
612            VAL (1)    121 O     2.96
613            ALA (2)     55 O     2.09
614            GLN (2)     30 O     2.90
               GLU (2)     70 O     2.87
615            GLY (2)    133 O     2.84
619            PHE (1)     41 O     2.79
620            ASP (1)    129 OD1   2.38
622            LEU (1)     97 O     2.93
624            ASP (1)     35 O     2.81
628            ASP (1)    129 OD1   2.78
630            SER (1)    218 O     2.31
631            GLU (2)     20 OE1   2.32
634            GLN (1)    156 OE1   2.52
638            PRO (2)    124 O     2.76
640            LYS (1)     79 O     2.61
641            GLY (1)     69 O     2.72
647            PRO (2)      4 O     2.81
650            THR (1)    117 O     2.81
658            ASP (1)    128 O     2.48
661            ILE (2)     16 O     2.94
               GLN (1)    156 OE1   2.54
662            CYS (2)    191 O     2.97
663            THR (1)    144 O     2.32
665            LYS (2)    175 O     2.32
667            SER (1)     77 O     2.90
672            GLY (2)     59 O     2.94
675            GLU (2)     49 OE2   2.21
679            ILE (2)     47 O     2.56
680            LYS (1)    169 O     2.30

Water Number 681 682 684 685 690 691 692 694 695 707 714 715 716 720 721 722 724 730 732 733 734 735 738 740
Protein THR ASN LEU LYS ASP ASN ASP SER GLN SER GLN CYS THR ALA PRO PRO LYS ASN ASN GLN SER SER SER GLN ALA ASP SER
Atom 2320 48001 970 900 153001 101001 178001 1590 70 1150 1160 10 1100 220 610 80 1700 100001 101001 5001 4180 1150 960 2400 1260 1280 1130
Distance (Å) 2.87 2.93 2.87 2.36 1.94 2.99 2.56 2.59 2.86 2.44 2.03 2.65 2.78 2.14 2.08 2.82 2.07 2.78 2.55 2.11 2.31 2.80 2.93 2.90 2.56 2.65 2.95
R.M.S. Deviation = AAAAAAAA'AAAAAAAAAAAAAAAAAAA enemimrokuvhounvkaknvnakuuhdkuAPAFHAPJNromna vvvvvvvvvvvvvvvvvvvvvvvvvvv N 03 \l
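
The H-A, D-A, and Theta columns of the hydrogen-bond tables in the preceding appendices are geometrically linked: once a donor-hydrogen bond length is fixed, the donor-acceptor distance follows from the law of cosines. The sketch below is a consistency check only, not the procedure used in this work. It assumes a donor-hydrogen bond length of 1.00 Å, which is not stated in this section but appears to reproduce the tabulated D-A values to within rounding for the rows tried. The function name `donor_acceptor_distance` is hypothetical, and the three sample triples pair the first or last entries of the H-A, D-A, and Theta columns of the intramolecular hydrogen-bond tables.

```python
import math


def donor_acceptor_distance(h_a, theta_deg, d_h=1.00):
    """Donor-acceptor distance from the law of cosines.

    h_a       : hydrogen-to-acceptor distance (angstroms)
    theta_deg : donor-hydrogen-acceptor angle (degrees)
    d_h       : donor-hydrogen bond length (angstroms); 1.00 is an
                assumed value, not one given in the text
    """
    theta = math.radians(theta_deg)
    return math.sqrt(d_h ** 2 + h_a ** 2 - 2.0 * d_h * h_a * math.cos(theta))


# (H-A, D-A, Theta) triples read from the hydrogen-bond tables
rows = [
    (2.13, 2.89, 131.05),
    (1.94, 2.70, 131.21),
    (1.43, 2.33, 145.90),
]

for h_a, d_a, theta in rows:
    calc = donor_acceptor_distance(h_a, theta)
    print(f"H-A={h_a:.2f}  Theta={theta:6.2f}  "
          f"D-A(table)={d_a:.2f}  D-A(calc)={calc:.2f}")
```

A row whose calculated D-A differs from the tabulated value by more than a few hundredths of an angstrom would suggest a transcription error in one of the three columns, or a donor-hydrogen bond length other than the assumed one.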