This is to certify that the dissertation entitled

THE STRUCTURE OF A COMPLEX OF RECOMBINANT HIRUDIN AND HUMAN α-THROMBIN AT 2.3 Å RESOLUTION

presented by Timothy John Rydel has been accepted towards fulfillment of the requirements for Ph.D. degree in Chemistry

Major professor

MSU is an Affirmative Action/Equal Opportunity Institution
THE STRUCTURE OF A COMPLEX OF RECOMBINANT HIRUDIN AND HUMAN α-THROMBIN AT 2.3 Å RESOLUTION

By

Timothy John Rydel

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Chemistry

1991

ABSTRACT

THE STRUCTURE OF A COMPLEX OF RECOMBINANT HIRUDIN AND HUMAN α-THROMBIN AT 2.3 Å RESOLUTION

By

Timothy John Rydel

Crystals of a complex of the recombinant hirudin, rHV2-K47, and human α-thrombin were obtained via vapor diffusion from hanging drops. The hirudin-thrombin complex crystallizes in the tetragonal crystal system, space group P4₃2₁2, with a = b = 90.54 Å, c = 132.04 Å and one molecule in the asymmetric unit. The structure was solved by Patterson search techniques using PPACK-human α-thrombin as a model and a 2.8 Å resolution diffractometer intensity data set. The rotational search was conducted in Patterson space using 8.0 to 3.0 Å resolution data and the SEARCH routine in the PROTEIN package of macromolecular crystallographic programs. The translational search also used 8.0 to 3.0 Å resolution data, and was conducted with programs written by Lattman and modified by Deisenhofer and Huber. The R-value of the initial solution was 0.39, but it fell to 0.34 when the rotational and translational parameters were refined with the rigid body refinement program TRAREF. The initial structure was improved using 2.3 Å resolution FAST area detector data, the energy restraint-crystallographic refinement program EREF, and graphics interventions. After six iterative stages of model-building and refinement, the hirudin-thrombin complex (including 207 water molecules) had an R-value of 0.193 for 21,960 reflections between 7.0 and 2.3 Å resolution. The complex was further refined using PROFFT, the fast Fourier
transform version of the stereochemically-restrained least squares refinement program PROLSQ. After five iterative stages of model-building and refinement, the complex (including 265 water molecules) had excellent overall stereochemistry and an R-value of 0.173 for 21,056 reflections between 7.0 and 2.3 Å resolution. Hirudin consists of an N-terminal globular domain with a three-cystine core and an extended C-terminal tail. The N-terminal tripeptide of hirudin interacts in a novel manner in the active site region of thrombin. The C-terminal tail of hirudin makes many electrostatic as well as hydrophobic interactions in the anion-binding exosite of thrombin. In all, hirudin engages in 216 contacts of less than 4.0 Å with thrombin. This abundance of interactions is likely the source of the high affinity and specificity of hirudin. The overall folding of the thrombin structure in the complex is very similar to that of PPACK-thrombin; however, many significant main chain and side chain differences exist between the two. Most of the differences can be accounted for in terms of four categories of factors or influences.

To my parents, Arlene and Joseph Rydel, and to my wife, Odette, for their encouragement, love, and support.

ACKNOWLEDGEMENTS

To Professor Alexander Tulinsky I extend my sincere gratitude for his constant enthusiasm, guidance, interest, and support throughout all aspects of this work. I could not have asked for a better scientific mentor, and it has been a wonderful learning experience to be a part of his laboratory. I am also very grateful to him for allowing me to return to Michigan State to continue work toward a Ph.D. degree in his group.

To Dr. K.G. Ravichandran, a former post-doc of the laboratory, I extend my sincere thanks for his important contributions to the structure solution of the hirudin-thrombin complex.
In particular, I am grateful to him for his assistance with the model-building and refinement work on the complex conducted at the Max-Planck-Institut für Biochemie in Martinsried, Germany, and for his friendship during two exciting and exhausting months spent overseas.

I have also benefitted a great deal from being a part of a very friendly, instructive and supportive research group. I would especially like to thank Drs. Vasili Carperos, Anne Mulichak, Pappan Padmanabhan, Eva Skrzypczak-Jankun, and Manuel Soriano-Garcia for their assistance and helpful discussions.

It is also a pleasure for me to acknowledge Dr. Carolyn Roitsch of Transgene, S.A., in Strasbourg, France for supplying us with ample amounts of recombinant hirudin, and for her genuine enthusiasm and interest in the project. The generous financial support of Transgene, S.A., is also gratefully acknowledged. Thanks are also extended to Dr. John W. Fenton II of the New York State Department of Health in Albany, NY for supplying us with large quantities of human α-thrombin.

It is an honor and a privilege for me to thank Professor Robert Huber of the Max-Planck-Institut für Biochemie in Martinsried, Germany for his role in this project. The invitation to come to his laboratory to collaborate on the structure solution of the complex, and his advice and guidance during my visit are greatly appreciated. I would also like to thank Drs. Wolfram Bode, Albrecht Messerschmidt, and Milton Stubbs, and Ms. Monika Schneider, all members of Professor Huber's laboratory, for their valuable advice and assistance during my stay.

Prior to returning to Michigan State to work on this project, I was very fortunate to have worked in two industrial protein crystallography laboratories where I was given the opportunity to learn and develop new skills in both biochemistry and crystallography. To Drs. Cele Abad-Zapatero and John Erickson of Abbott Laboratories, and Drs.
Howard Einspahr and Keith Watenpaugh of The Upjohn Company, I am thankful for these opportunities and for their encouragement of my returning to Michigan State to continue work toward a Ph.D. degree.

I am fortunate to have had so many supportive family members and friends behind me in this effort. I am especially indebted to my wonderful wife Odette, and to my parents, my brother and sisters, and my Grandma for their continual love and encouragement.

Support by the National Institutes of Health is gratefully acknowledged.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
I  INTRODUCTION
   A. Thrombin
   B. Hirudin
II  EXPERIMENTAL
   A. Sample Preparation
   B. Sample Molecular Weight Estimation
   C. Crystallization
   D. Crystal Characterization
   E. 2.8 Å Resolution Diffractometer Intensity Data Collection
   F. 2.3 Å Resolution Area Detector Data Collection
III  DATA REDUCTION
   A. Converting the 2.8 Å Resolution Diffractometer Intensity Data to Structure Amplitudes
   B. Post-Data Collection Processing of the 2.3 Å Resolution Area Detector Structure Factors
IV  MOLECULAR REPLACEMENT CALCULATIONS
   A. Introduction
   B. The Rotation Function Calculations
   C. The Translation Function Calculations
   D. The Initial Hirudin-Thrombin Electron Density Map Calculation
V  MODEL-BUILDING AND EREF REFINEMENT
VI  MODEL-BUILDING AND PROLSQ REFINEMENT
VII  RESULTS AND DISCUSSION
   A. The Hirudin Structure
   B. The Hirudin-Thrombin Interaction
   C. Crystal Packing
VIII  A COMPARISON OF THE THROMBIN STRUCTURE IN THE HIRUDIN-THROMBIN COMPLEX AND IN PPACK-THROMBIN
LIST OF REFERENCES

LIST OF TABLES

1  The Primary Sequence of Human α-Thrombin.
The amino acid type is indicated using the standard single letter code. The numbering of residues is based on topological similarities with chymotrypsinogen. A number followed by an alphabetic character indicates an insertion. CP represents a cis-proline. (Numbering taken from [6].)
2  Summary of Crystal Data of rHV2-K47-Human α-Thrombin
3  The Average Cell Parameters of the HTN2 (3.6-2.8 Å) and of the HTN3 (2.8 Å) Crystals. Cell lengths in Å units, angles in degrees
4  Reflection Statistics on the Hirudin-Thrombin FAST Data Set Evaluated by MADNES and Corrected using ABSCOR
5  Results of the Translation Search
6  Course of Model-Building and EREF Refinement
7  Final Model Parameters of the EREF-refined rHV2-K47-Human α-Thrombin Structure
8  Summary of Model-Building and PROLSQ Refinement
9  Summary of Hirudin-Thrombin PROLSQ Refinement Restraint Parameters
10  Final Model Parameters of the PROLSQ-refined Hirudin-Thrombin Complex
11  The R-factor vs. Resolution for the Final Hirudin-Thrombin Model Parameters
12  Average Individual B-factor and Average Occupancy Factor for the Hirudin-Thrombin Model Parameters
13  Secondary Structural Elements of Hirudin
14  Sulfur-Sulfur Distances (Å) in Hirudin. A '*' indicates a disulfide bond
15  Intramolecular Hydrogen Bonds in Hirudin. Hydrogen atoms are assigned geometrically idealized positions. Donor atoms are denoted 'D' and acceptor atoms 'A'
16  Dihedral Angles and Dihedral Energies of the Disulfide Bridges in Hirudin. A '*' indicates the energy value was excluded from the calculation of the mean
17  Intermolecular Hydrogen Bonds in the Hirudin-Thrombin Complex. Hydrogen atoms are assigned geometrically idealized positions.
Donor atoms are denoted 'D' and acceptor atoms 'A'. A protonated amino terminus is designated 'N+' and an unprotonated amino terminus is designated 'N' for Ile1'. A '*' indicates the mean and standard deviation were calculated using values for an unprotonated Ile1' amino terminus
18  Hirudin-Water Molecule Hydrogen Bonds in the Complex. Hydrogen atoms are assigned geometrically idealized positions. Donor atoms are denoted 'D', acceptor atoms 'A'
19  Solvent-Bridged Hirudin-Thrombin Hydrogen Bond Interactions. The protein atoms involved in the interaction are represented as 'A' and 'B', and the mediating water 'W'
20  Solvent-Bridged Intramolecular Hydrogen Bond Interactions in Hirudin. The protein atoms involved in the interaction are represented as 'A' and 'B', and the mediating water 'W'
21  Hirudin-α-Thrombin Interactions. Numbers are intermolecular contacts < 4.0 Å; parentheses indicate ion pairs; a dash followed by a number is a hydrogen bond; hydrophobic or neutral contacts are underlined in bold
22  Solvent Accessible Surface Area of Thrombin Residues Alone and in Complex with Hirudin. Abbreviations used: TASA = total atom surface accessibility; MASA = main chain atom surface accessibility; SASA = side chain atom surface accessibility
23  Solvent Accessible Surface Area of Hirudin Residues Alone and in Complex with Thrombin. Abbreviations used are as defined in Table 22
24  Percent Loss of Hirudin Solvent Surface Accessibility Due to Complexation with Thrombin. Results presented with regard to hirudin residue ranges. Abbreviations used are as defined in Table 22
25  Residue Surface Accessibility of Free Hirudin. Conformation of hirudin as it appears in the complex.
Abbreviations used: SASAt = theoretical side chain atom surface accessibility. All other abbreviations used are as defined in Table 22. All surface accessibility results reported in Å². SASA/SASAt ratios indicate how the actual side chain accessibility compares to the maximum possible. TASA/SASAt ratios greater than one indicate that there is at least some main chain accessibility. These ratios are not given for glycines as glycine has no non-hydrogen side chain atoms. A '*' preceding a line indicates that the values given are susceptible to error due to disorder in or near these residues. A '+' preceding a line indicates a residue which has at least one direct contact of less than 4.0 Å with thrombin
26  Hydrogen Bonds Between Symmetry-Related Molecules. Hydrogen atoms are assigned geometrically idealized positions. Donor atoms are denoted 'D' and acceptor atoms 'A'. The symmetry molecule number refers to the symmetry equivalent position of the symmetry-related molecule in the interaction, as given below. The symmetry-related molecule residue number is preceded by a '*'
27  Results of an Optimal Superposition of the Thrombin Structures of the Complex and PPACK-Thrombin [6] using the Program MOLFIT. Fitting based on N, CA, C Atoms. Results expressed in Å units as root mean square deviations (r.m.s.d.). Numbers in the '{}' brackets represent the number of atom pairs contributing to the r.m.s.d.
28  Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] due to Residues Poorly Defined in Electron Density or Tentatively Placed in Either Structure, or due to Alternate Interpretations of the Density Maps. Optimal superpositioning of thrombin molecules as in Table 27. Abbreviations: HT = hirudin-thrombin complex; PT = PPACK-Thrombin; MCA r.m.s.d. = main chain atom average root mean square deviation; SCA r.m.s.d.
= side chain atom average r.m.s.d.; S = surface; AS = active site; conf. = conformation; rot. = rotation. Deviations > 1σ on main chain (0.45 Å) and/or side chain (1.43 Å) atoms are listed
29  Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] due to Crystal Packing Interactions in the Complex. Optimal superpositioning of thrombin molecules as in Table 27. Abbreviations: as = aromatic stacking; hb = hydrogen bond; hip = hydrogen-bonding ion pair; hc = hydrophobic contact; pc = polar contact; sympos = symmetry molecule positions (see Table 26); wmi = water mediated interaction. All other abbreviations are defined in Table 28. Residues in brackets [] are not involved in crystal contacts but have significant conformational differences which could be due to packing interactions of nearby residues. A '*' denotes the average MCA and SCA r.m.s.d. for residues Asn143-Gln151. Table 32 gives a complete listing of the MCA and SCA r.m.s.d. for the residues. Deviations > 1σ on main chain (0.45 Å) and/or side chain (1.43 Å) atoms listed
30  Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] due to the Interaction with Hirudin. Optimal superpositioning of the thrombin molecules as in Table 27. All abbreviations are defined in Tables 28 and 29. Residues in brackets [] do not interact with hirudin, but have significant conformational differences which could be due to hirudin interactions of nearby residues. Deviations > 1σ on main chain (0.45 Å) and/or side chain (1.43 Å) atoms listed
31  Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] due to Surface Residue Conformational Variability. Optimal superpositioning of the thrombin molecules as in Table 27. Abbreviations: fs = free sidechain; ident. = identical. All other abbreviations defined in Tables 28 and 29.
Deviations > 1σ on main chain (0.45 Å) and/or side chain (1.43 Å) atoms listed
32  Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] in the Vicinity of the 149A-149E Insertion Loop. Optimal superpositioning of the thrombin molecules as in Table 27. Abbreviations: r.m.s.d. = root mean square deviation; MCA r.m.s.d. = average main chain atom r.m.s.d.; SCA r.m.s.d. = average side chain atom r.m.s.d. All residues listed have deviations > 1σ on main chain (0.45 Å) and/or side chain atoms (1.43 Å)

LIST OF FIGURES

1  The Prothrombinase Complex. Prothrombin (PT) is composed of fragment 1 (F1), fragment 2 (F2), and prethrombin 2 (Pre-2); Factor Va = Va; factor Xa = Xa; Ca²⁺ ions are represented by filled circles binding to γ-carboxyglutamic acid residues
2  The Products of PT Activation. CHO represents a carbohydrate side chain; "a", "b", and "c" represent the sites of cleavage of bovine and human PT; an additional cleavage, "b'", occurs in human PT (taken from [3])
3  The Polysaccharide of Human α-Thrombin [8] and its Components: N-Acetylneuraminic Acid (NeuNAc), N-Acetylglucosamine (GlcNAc), Mannose (Man), Galactose (Gal), and Fucose (Fuc)
4  Schematic Drawing of the Fibrinogen Molecule [9]. The symmetric molecule is composed of a dimeric central domain containing the amino-termini of all six chains (α, β, γ), two connecting coiled coils, two terminal domains, and two Aα polar appendages. Fibrinopeptides A and B (FPA and FPB) are at the amino-terminal ends of the α and β chains. Four carbohydrate clusters (CHO) occur, and are located on each γ chain near the central domain and on the β chains of each terminal domain. Primary cross-linking sites (XL) can be found near the carboxy-termini of the γ chain and in the Aα polar appendages
5  The Central Bioregulatory Functions of α-Thrombin in Hemostasis.
Coagulation factor zymogens are identified by roman numerals; an "a" following a numeral indicates an activated enzyme. Primes and double primes are used to indicate active and inactive forms of factors V and VIII. Abbreviations: F1.2 = prethrombin fragment 1.2; FpA = fibrinopeptide A; FpB = fibrinopeptide B; PL = phospholipid; and TPN = thromboplastin (taken from [18])
6  Amino Acid Sequences of the Three Natural Hirudin Variants: HV1, HV2, and HV3. The boxes delineate regions of homology and the stars (*) refer to variant residues. HV3 contains an additional residue at position 63. Y' indicates a sulfated tyrosine (taken from [66])
7  The Disulfide Bond Connectivity Patterns of Hirudin and Epidermal Growth Factor [67]
8  The Sequence of Recombinant Hirudin Variant 2-Lysine 47 (rHV2-K47)
9  Stereo Drawing of Nine Optimally Superimposed DISMAN Distance Geometry N, CA, C Structures from the 2D NMR Constraint Data of Haruyama and Wüthrich [80]. Disorder exists in the residue ranges 1-3, 31-36 and 49-65 (taken from [80])
10  Human α-Thrombin and its Proteolytic Derivatives. The numbering of residues is taken from Table 1. The A-chain and B-chain of thrombin are designated by "A" and "B", respectively. CHO represents the polysaccharide linked to nitrogen ND2 of Asn(N)60G. The location of the catalytic triad residues is noted (H57, D102, and S195). Excised proteolytic fragments are denoted by broken lines
11  100 Log(Rf x 100) vs. %T Gel Concentration. The slope for each line gives the retardation coefficient (Kr) for the corresponding protein oligomer.
Abbreviations: L (lysozyme), CA (carbonic anhydrase), CEA (chicken egg albumin), BSA-M (bovine serum albumin-monomer), BSA-D (bovine serum albumin-dimer), HT (hirudin-thrombin)
12  Log(-Kr) vs. Log(Molecular Weight). Linear regression best fit line to the standard protein retardation coefficient data. Abbreviations used have the same meaning as in Figure 11. The extrapolated molecular weight estimate for the hirudin-thrombin complex is 38.5 kD
13  Photograph of an rHV2-K47-Human α-Thrombin Crystal. Crystal size is approximately 1.0 x 0.4 x 0.4 mm. Note the trace of the seed in the center of the crystal
14  hk0 Precession Photograph of a Hirudin-Thrombin Crystal. The resolution limit at the edge of the photo is 3.7 Å (μ angle = 12.0°)
15  Morphology of an Idealized Hirudin-Thrombin Crystal
16  A Hirudin-Thrombin Crystal Mounted in a Capillary for X-ray Crystallographic Analysis
17  The Goniostat of a Four-Circle Diffractometer [111]
18  Axial Intensity Distributions of the Recombinant Hirudin-Human α-Thrombin Crystals
19  The ω-Profile of (6,16,17) Taken Before the 3-D HTN2 Data Collection. Peak width at half-height approximately 0.18°
20  Intensity Absorption Plot of (0,0,8) Taken Before the HTN3 3.56 Å Resolution Data Collection. I(φ) in total counts/measurement. The 45° φ range employed in the data collection indicated by "D.C."
21  Drawing of an Enraf-Nonius Diffractometer with a Conventional Detector [116]
22  A Flow Chart of Operations in a MADNES Area Detector Data Collection
23  Dependence of X-ray Decay Correction with Scattering Angle for the HTN2 Hirudin-Thrombin (3.6-2.8) Å Resolution Data Collection
24  The ⟨|Fobs|²⟩ versus ⟨2θ⟩ Distribution for the 2.8 Å Resolution Hirudin-Thrombin Diffractometer Data Set
25  Plot of Sim Weighting Factor W versus X, where W = I1(X)/I0(X) and X = 2|Fobs(h)||Fcal(h)|/Iu (taken from [137]). I1(X) and I0(X) are first- and zero-order modified Bessel functions, respectively. Iu is an estimate of the contribution from the unknown structure, or that which is not accounted for in the model
26  Stereoview of the First Unit of Thrombin Carbohydrate Modeled into Map Density. Displayed in bold is N-acetylglucosamine (NAG1) linked to the side chain atom ND2 of Asn60G. Map density is from the final 2Fo-Fc map used for model-building; contours at 0.8σ. Atomic positions are from the final coordinates
27  Histogram of ω Angles in the Hirudin-Thrombin Complex
28  The Ramachandran Plot of Hirudin-Thrombin. Only non-glycine residue angles are displayed
29  Plot of R-factor vs. Resolution for Hirudin-Thrombin. Large filled circles represent the data points of the complex. The small filled circles represent structure factor amplitude discrepancy data due to specified average coordinate errors (Δr) according to Luzzati [148]
30  Plot of the Average Individual B-factor of Main and Side Chain Atoms for the Protein Atoms in the Complex. The ⟨Bi⟩ for main and side chain atoms is represented by '+' and 'X' symbols, respectively. Main chain atom values connected by a line. The break in ⟨Bi⟩ in hirudin corresponds to the completely undefined segment Ser32'-Lys35'. Bold arrows point to main and side chain ⟨Bi⟩ values of Cys residues engaged in disulfide linkages
31  Histogram of the Individual B-factors of the Water Molecules in the Hirudin-Thrombin Structure
32  Histogram of the Occupancy Factors of the Water Molecules in the Hirudin-Thrombin Structure.
All water molecules in the last occupancy shell, 1.0 - 1.1, have an occupancy of 1.0
33  Stereoview of the Folding of Hirudin in the Complex. N, CA, C atoms only; hirudin disulfides in bold; disordered or poorly defined regions indicated by dashed lines
34  The Sequence of Recombinant Hirudin Variant 2-Lysine 47 (rHV2-K47). Isolated loop designated 'A' and loop segments comprising the two interconnected loops designated 'B', 'C', and 'D'
35  Stereoview of the Cystine Disulfide Core of Hirudin. Disulfide linkages in bold
36  Stereoview of the Folding of NH2-terminal Domain of Hirudin in the Complex. N, CA, C atoms only; designations 'A', 'B', 'C', and 'D' correspond to those of Figure 34; disorder indicated by dashed lines
37  Stereoview of the β1 Antiparallel β-Strand in Hirudin. Hydrogen bonds indicated by dashed lines
38  Stereoview of the β2 Antiparallel β-Strand in Hirudin. Hydrogen bonds indicated by dashed lines
39  Stereoview of the Intramolecular Lys47' Hydrogen Bonds in the Hirudin N-terminal Domain. CA structure of residues 1'-48' including disulfides (disulfide linkages in bold), and hydrogen bonding residues (bold). The disorder in residues 32'-35' as well as the hydrogen bonds are indicated by dashed lines
40  Stereoview of the Polyproline II Helix in Hirudin. N, CA, C atoms indicated in bold
41  Stereoview of the Disulfides in the Crystal Complex and in the NMR Hirudin Structure [79]. NMR hirudin disulfide side chains indicated by dashed lines; disulfide linkages in bold
42  Stereoview of the T4 Type III Helical Reverse Turn in Hirudin.
Hydrogen bonds indicated by dashed lines; main chain atoms in bold; Gln65' also included
43  Stereoview of the CA Structure of the Hirudin-Thrombin Complex. Hirudin CA and disulfides of hirudin and thrombin in bold. N- and C-terminals of thrombin designated
44  Stereoview of PPACK in the Active Site Region of Thrombin. PPACK, from the PPACK-thrombin structure [6], optimally superimposed into thrombin active site based on N, CA, C fitting of PPACK-thrombin and hirudin-thrombin complex thrombin structures; PPACK in bold
45  Stereoview of the Interaction of the Hirudin N-terminal Tripeptide in the Active Site of Thrombin. Hirudin N-terminal tripeptide in bold; hydrogen bonds of the N-terminal nitrogen indicated by dashed lines
46  Stereoview of Hydrophobic Contacts of the N-terminal Hirudin Domain with Thrombin. Hirudin residues in bold. Residues in figure: Ile174, Pro60C, Leu13', Val21', and Pro46'
47  Stereoview of Electrostatic Interactions of the N-terminal Hirudin Domain with Thrombin. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, and ion pairs by +/-
48  Stereoview of a Hirudin-Thrombin Water Mediated Interaction. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, ion pairs by +/-, the water molecule by *
49  Stereoview of Hirudin Pro46'-Lys47'-Pro48' and the Hirudin N-Terminus-Thrombin Active Site Interaction. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, ion pairs by +/-, and water molecules by *
50  Charged Residues of the Anion-Binding Exosite Region of Thrombin. Thrombin CA structure only with thrombin disulfides and charged anion-binding exosite residues in bold
51  Basic Residues of the Postulated Heparin Binding Site of Thrombin. Thrombin CA structure only with thrombin disulfides and basic residues of heparin binding site in bold
52  Hirudin(Glu49'-Ser50'-His51')-Thrombin Interactions. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, ion pairs by +/-, and water molecules by *
53  Electrostatic Hirudin(Asp55'-Gln65')-Thrombin Interactions. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, ion pairs by +/-, and water molecules by *
54  Hirudin Phe56' in a Hydrophobic Thrombin Cavity. Phe56' in bold
55  Hirudin-Thrombin Interactions near the Type III Helical Reverse Turn of Hirudin. Hirudin residues in bold, with N, CA, C of hirudin bolder. Ion pairs indicated by +/-
56  Space-filling Drawing of the rHV2-K47-Human α-Thrombin Complex. Hirudin in sky blue
57  The Sequence of Human α-Thrombin. Insertion residues with respect to the sequence of chymotrypsinogen [6] are graphically indicated as protrusions from the linear sequence chain. Disulfide linkage connections are indicated, and insertion residues possessing contacts of < 4.0 Å with hirudin in the hirudin-thrombin complex are circled
58  Stereoview of a Hydrophobic Crystal Packing Interaction at a Two-Fold Symmetry Axis. Top: Molecules involved are at symmetry positions 1 and 5 (relation 42, text). Disulfides and hirudin in bold. Two-fold axis indicated. The hydrophobic interactions occur in the circled region and are detailed below. Bottom: Nonpolar/aromatic residues which comprise the hydrophobic cluster.
Groups involved: Leu14G, Tyr14J, Ile14K, Leu129C, Tyr134, Pro204, and Phe204A (all bold) and the corresponding symmetry-equivalent groups
59  Stereoview of an Electrostatic Crystal Packing Interaction at a Two-Fold Symmetry Axis. Top: Molecules involved are at symmetry positions 1 and 6 (relation 42, text). Hirudin and all disulfides are in bold. Two-fold axis indicated. The electrostatic crystal packing interactions are within the circled region and are detailed below. Bottom: The double hydrogen-bonding ion pairs involved in the crystal contacts. Dashed lines indicate hydrogen bonds; hirudin glutamate residues are in bold. Two-fold symmetry axis offset approximately 30 degrees from the perpendicular to the plane of the paper
60  Stereoview of a Crystal Packing Interaction Involving the Thrombin Lys145-Gly150 Undecapeptide Loop. The crystal contact involves molecules at symmetry positions 1 and 7 (relation 42, text). Residues involved: Lys145, Trp148, Thr149, Asn149B, Val149C, Gly#1D, Ala#1B, Leu#123, Lys#235, Gln#244, and Phe#245. N, CA, C atoms and residues of the Lys145-Gly150 loop engaged in hydrogen bond interactions in bold; hydrogen bonds indicated by dashed lines
61  Stereoview of the Optimally Superimposed Thrombin CA Structures of the Complex and PPACK-Thrombin. Superpositioning corresponds to the MOLFIT N, CA, C fitting results of Table 27. Residues of the complex in bold
62  Average R.M.S. Deviation of Main Chain Atoms versus Residue Number for the Superimposed Thrombin Molecules of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. The horizontal lines represent 1σ and 2σ deviations. Most of the residues with deviations > 1σ are identified
63  Average R.M.S. Deviation of Side Chain Atoms versus Residue Number for the Superimposed Thrombin Molecules of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. The horizontal lines represent 1σ and 2σ deviations.
Many of the residues with deviations > Is are identified Stereoview of the Superimposed Residues LysZ36-Phe245 of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. Residues of the complex in bold. Symmetry molecule interactions of xix PAGE . 179 . 181 . 184 . 187 . 189 . 190 —' P." . l god" 4'. (l) I 1'» ‘LJ ‘rx FIGURE 65 66 67 68 69 70 71 the complex displayed; symmetry molecules indicated by a '#' preceding the residue number. Hydrogen bonds indicated by dashed lines . . . . Stereoview of the Superimposed Residues Glu146-Gly150 of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. Residues of the complex in bold . . . . . . . . . . . . Stereoview of the Superimposed Residues Glu146-Gly150 of PPACK-Thrombin and the Complex, including the Hirudin N-Terminal Pentapeptide. Superpositioning as in Figure 61. Residues of PPACK-thrombin in bold. Thrombin catalytic triad of the complex displayed . . . . . . . . . . . . . Stereoview of the Superimposed Residues LysllO-Prolll of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. Residues of the complex in bold. Peptide bond preceding Prolll is cis- in PPACK—thrombin, trans- in the complex . Stereoview of the Superimposed Residues LysZOZ-Arg206 of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. Residues of the complex in bold . . . . . . . . . . . . . . Stereoview of Superimposed Active Site Residues of PPACK-Thrombin and the Complex, including the Hirudin N-Terminal Tripeptide. Residues of the complex in bold. Superpositioning as in Figure 61. Hydrogen bonds indicated by dashed lines . . . . . Stereoview of the Superimposed Anion-Binding Exosite Residues Arg75-Lys81 of the Complex and PPACK-Thrombin. Additional residues of the complex (thrombin Ala113, hirudin G1u57'-Glu58', and the water molecule V467 ) included. Residues of the complex in bold. Superpositioning as in Figure 61. Hydrogen bonds indicated by dashed lines . . . . . . 
71 Stereoview of Residues Leu41 and Phe60H-Leu64 of the Complex and PPACK-Thrombin. Residues of the complex in bold. Superpositioning as in Figure 61. Hydrogen bonds indicated by dashed lines

I INTRODUCTION

A. Thrombin

α-Thrombin is a glycoprotein serine proteinase [1] which is generated in the penultimate step of the blood coagulation cascade. In this step, prothrombin (PT), a zymogen of molecular weight 74,000, is 'activated' by limited proteolysis in the prothrombinase complex [2] to form thrombin. The complex, schematically represented in Figure 1, is composed of PT substrate, the proteolytic enzyme Factor Xa, the membrane-bound cofactor Factor Va, the phospholipid surface, and Ca2+ ions. A representation of the prothrombin activation products in bovine and human systems is shown in Figure 2 [3]. During bovine PT activation, Factor Xa cleaves PT at Arg-Thr and Arg-Ile bonds, sites "b" and "c" respectively in Figure 2, to produce PT fragment 1-2 and α-thrombin. In the human system, α-thrombin additionally cleaves at sites "a" and "b'" to produce PT fragment 1, PT fragment 2, and α-thrombin. The result is that the A chain of human thrombin is 13 residues shorter than the A chain of bovine thrombin.

Human α-thrombin, the thrombin form used in this study, is a two chain molecule which possesses a molecular weight of approximately 36,600 [4,5]. The sequence of human α-thrombin is given in Table 1 [6]. The A chain contains 36 residues and is covalently linked to the B chain through a disulfide (1-122). The B chain is 259 residues long and possesses 3 intrachain disulfide linkages (42-58, 168-182, 191-220).
Thrombin is a serine proteinase by virtue of the B chain, which is highly homologous to other coagulation/fibrinolytic serine proteinases (Factor IXa, Factor Xa, protein C, urokinase, tissue plasminogen activator, and plasmin) and is approximately 50% homologous with the digestive serine proteinases trypsin, chymotrypsin, and elastase [7]. The B chain of thrombin contains the three invariant catalytic site residues - His57, Asp102, and Ser195 (see Table 1) - commonly called the catalytic triad.

Figure 1. The Prothrombinase Complex. Prothrombin (PT) is composed of fragment 1 (F1), fragment 2 (F2), and prethrombin 2 (Pre-2); Factor Va = Va; Factor Xa = Xa; Ca2+ ions are represented by filled circles binding to γ-carboxyglutamic acid residues.

Figure 2. The Products of PT Activation. CHO represents a carbohydrate side chain; "a", "b", and "c" represent the sites of cleavage of bovine and human PT; an additional cleavage, "b'", occurs in human PT (taken from [3]).

Table 1. The Primary Sequence of Human α-Thrombin. The amino acid type is indicated using the standard single letter code. The numbering of the residues is based on topological similarities with chymotrypsinogen. A number followed by an alphabetic character indicates an insertion. CP represents cis-proline. (Numbering taken from [6].)
[Table 1: residue-by-residue listing of the A chain (T1H-R15) and the B chain (I16-E247); entries not recoverable.]
Finally, the thrombin B chain also contains a single biantennary chain carbohydrate which is linked to Asn60G; the structure of the carbohydrate moiety is given in Figure 3 [8].

Figure 3. The Polysaccharide of Human α-Thrombin [8] and its Component Hexoses. Components shown: galactose, N-acetylneuraminic acid, N-acetylglucosamine, mannose, and L-fucose.

The ultimate step of blood coagulation is also the primary function of thrombin: the conversion of fibrinogen to fibrin. Fibrin forms the network which encapsulates the hemostatic plug or blood clot during wound healing. Fibrinogen is a large, dimeric six chain molecule with a molecular weight of approximately 340,000 and an overall length of about 450 Å [9]. A schematic drawing of fibrinogen is shown in Figure 4 [9].

Figure 4. Schematic Drawing of the Fibrinogen Molecule [9]. The symmetric molecule is composed of a dimeric central domain containing the amino-termini of all six chains (α, β, γ), two connecting coiled coils, two terminal domains, and two Aα polar appendages. Fibrinopeptides A and B (FPA and FPB) are at the amino-terminal ends of the α and β chains. Four carbohydrate clusters (CHO) occur, and are located on each γ chain near the central domain and on the β chains of each terminal domain. Primary cross-linking sites (XL) can be found near the carboxy-termini of the γ chain and in the Aα polar appendages.

Fibrin assembly proceeds in two stages [10]. In the first stage, fibrin monomers polymerize end-to-end to form fibrin I polymers or protofibrils. In the second stage, protofibrils associate laterally to form fibers.

The first stage of fibrin assembly is initiated by the cleavage of two Aα chains of fibrinogen at 16Arg-17Gly with the subsequent release of 2 molecules of fibrinopeptide A or FPA [11]. FPA release leads to an unfolding of a polymerization domain in the NH2-terminal portion of fibrinogen and exposes a complementary portion of the γ chain, which exists in intact fibrinogen [12]. This FPA-deficient fibrinogen molecule is known as the fibrin monomer. Fibrin monomers polymerize via the available polymerization sites to produce fibrin I polymers or protofibrils. The resulting fibrin I polymers consist of fibrin monomers laid end-to-end with an overlap equal to about 225 Å or one-half the monomer length [13].

The second stage of fibrin assembly involves the lateral association of protofibrils to form fibers. A fiber may contain over 100 protofibrils [14] and it is these fibers which join irregularly to produce the gel-like mass characteristic of in vitro clotted fibrin [15]. The lateral association of fibrin I polymers is accompanied by thrombin cleavage of the Bβ chains at 14Arg-15Gly in these polymers and the subsequent release of fibrinopeptide B or FPB. The FPB-deficient fibrin I polymers are known as fibrin II polymers. Release of FPB unfolds another unique polymerization site [11] and results in an interaction of polymeric species and in polymers and fibers of increased strand width. It has been demonstrated that although FPB release is not necessary for fiber formation or stage II fibrin assembly, the release of FPB results in more rapid fiber formation [10]. Once fibrin assembly is complete, the network is covalently stabilized by the action of Factor XIIIa (fibrin stabilizing factor), which introduces γ-glutamyl ε-lysyl cross-links in fibrin [16].

The role of thrombin in hemostasis is by no means limited to the activation of fibrinogen to clottable fibrin.
Thrombin has been called the central bioregulatory enzyme in hemostasis [17]. Thrombin regulates hemostasis on three different levels - the fluid or plasma level, the blood cell or cellular level, and the blood vessel or vascular level [18]. Moreover, thrombin accomplishes these tasks by acting as a positive feedback activator of coagulation [7] as well as a negative regulator [19]. The central bioregulatory functions of thrombin in hemostasis are indicated in Figure 5 [18].

Figure 5. The Central Bioregulatory Functions of α-Thrombin in Hemostasis. Coagulation factor zymogens are identified by roman numerals; an "a" following a numeral indicates an activated enzyme. Primes and double primes are used to indicate active and inactive forms of Factors V and VIII. Abbreviations: F1.2 = prothrombin fragment 1.2; FpA = fibrinopeptide A; FpB = fibrinopeptide B; PL = phospholipid; and TPN = thromboplastin (taken from [18]).

On the fluid or plasma level, thrombin proteolytically cleaves several zymogens to produce active enzymes or stimulate activation [17]. Thrombin activates Factor VIII (antihemophilic factor) [20] and Factor V (accelerator globulin) [21] via limited proteolysis which results in positive feedback to the coagulation pathway. Thrombin also activates Factor XIV (protein C) by proteolytic cleavage; Factor XIVa is believed to be responsible for the inactivation of Factors Va and VIIIa [22,23]; in this capacity thrombin is no longer a procoagulant but rather an anticoagulant. Thrombin cleavage of Factor XIII (fibrin stabilizing factor) produces Factor XIIIa'b which becomes an active enzyme in the presence of Ca2+ ions [24]. The conversion of fibrinogen to clottable fibrin also occurs at this level. The endogenous proteinase inhibitors antithrombin III, heparin cofactor II, α1-proteinase inhibitor, and α2-macroglobulin all inhibit thrombin on the plasma level [25]. Antithrombin III is the primary inhibitor of thrombin in plasma. Antithrombin III and heparin cofactor II are unique in that they display accelerated thrombin inhibition in the presence of heparin and other glycosaminoglycans [26,27]. The most potent naturally occurring plasma-level inhibitor of thrombin is hirudin, a 65-66 amino acid peptide isolated from the salivary glands of the European medicinal leech Hirudo medicinalis [28].

On the cellular level, thrombin is an agonist for a variety of cellular activities in many tissues and cell types [29]. Thrombin-cellular interactions stimulate aggregation (platelets), chemotaxis (monocytes), contractility (smooth muscle), proliferation or cell growth (fibroblasts, macrophages, splenocytes), prostaglandin synthesis (platelets, endothelial cells, fibroblasts, neuronal cells), and secretion (platelets, endothelial cells). The induction of platelet aggregation and secretion is a very important procoagulant cellular function of thrombin. On the other hand, a particularly important anticoagulant thrombin-cellular interaction involves the thrombomodulin receptor on endothelial cells. Thrombomodulin is a 74,000 dalton protein which forms a stoichiometric 1:1 noncovalent complex of high
affinity (Kd = 0.48 nM) with thrombin. Moreover, thrombin is inhibited by thrombomodulin and this interaction also stimulates activation of protein C which can in turn inactivate Factors Va and VIIIa. Thus, the thrombin-thrombomodulin interaction is yet another means by which thrombin negatively regulates coagulation.

Functions of thrombin on the vascular level also involve thrombin-cellular interactions [17]. Thrombin interacts with the endothelial cells of the blood vessel wall. This interaction results in the synthesis of prostacyclin [30], prostaglandin [30] and platelet-activating factor [31], as well as the release of von Willebrand factor [32], and tissue plasminogen activator (tPA) and its inhibitor [33].

The mechanism by which serine proteinases cleave peptide bonds can be described as a general base-catalyzed nucleophilic attack on the carbonyl carbon of the substrate by the hydroxyl oxygen of Ser195 [34,35]. The x-ray crystal structures of the digestive serine proteinases trypsin [36], α-chymotrypsin [37], and elastase [38] indicate that prior to peptide bond cleavage, the sidechain of an amino acid on the carbonyl-side of the bond to be cleaved fits into a 'specificity pocket' which brings the carbonyl carbon near to Ser195. Moreover, the nature of the binding pocket reflects the specificity of the enzyme [39]. The specificity pocket of α-chymotrypsin is large and hydrophobic; thus, α-chymotrypsin is well-suited to cleave peptide bonds after tryptophan, phenylalanine, tyrosine, and leucine sidechains. In trypsin, the specificity pocket is similar in size but it has an aspartate (Asp189) at the base of the specificity pocket as opposed to a serine. This aspartate deprotonates above about pH 5.0 and thus makes the cavity ideal for the binding of arginyl and lysyl sidechains. The
fact that thrombin is 'trypsin-like' in its ability to cleave Arg-X and Lys-X bonds and that it contains an aspartate at position 189 suggests that thrombin and trypsin share a similar specificity pocket [1].

The diverse functions and regulation thrombin provides in hemostasis, however, clearly indicate that it is a unique trypsin-like serine proteinase which requires more than a specificity pocket to govern its selectivity. While trypsin is omnivorous in its choice of protein substrates and in the number of peptide bonds it will attack [1], thrombin is highly selective in its choice of macromolecular substrates and interactions. Thrombin achieves such high selectivity by using active site region subsites, and exosites which are distinct from the active site [17,40-46]. Moreover, it is these exosites in addition to the catalytic site region which allow thrombin to function in hemostasis by means of enzymatic (proteolytic), nonenzymatic (hormonal), and coupled mechanisms [47]. Coupled mechanisms require the involvement of the thrombin active site and one or more exosites.

The high degree of specificity of α-thrombin interactions with macromolecular substrates is believed to result from binding at three distinct regions on the thrombin surface: the primary binding or specificity pocket [1], an apolar binding site adjacent to the catalytic site [40,41], and an anion-binding exosite which is responsible for fibrinogen recognition [42,44]. The anion-binding exosite appears to contain or be in the vicinity of the β-cleavage loop (residues Arg67-Arg77A [48]) of α-thrombin [44]. The integrity of this loop is important to the interaction of thrombin with fibrinogen, thrombomodulin, and protein C [49-52]. Additional secondary binding sites have been identified for heparin-binding [45] and growth factor activity [46].
In an effort to truly understand the structure-function relationships underlying thrombin specificity, this enzyme has been the focus of x-ray crystallographic studies for some time [6,52-55]. Although crystals of bovine [52] and human thrombin [53] had been reported for well over ten years, no protein structure had resulted from these crystals. In the absence of definitive crystallographic structures, three-dimensional models of the thrombin B chain were proposed based on the sequence homology with bovine chymotrypsin and trypsin [56,57]. Recently however, nearly identical crystals of human α-thrombin inhibited by D-Phe-Pro-Arg-chloromethylketone (hereafter referred to as PPACK) were obtained by us [55] and independently at the Max-Planck-Institut für Biochemie in Martinsried, Germany [6]. The German research group led by Dr. Wolfram Bode and Professor Robert Huber eventually solved the structure of PPACK-inhibited thrombin at 1.9 Å resolution [6].

B. Hirudin

Hirudin is a small protein which occurs in the salivary glands of the European medicinal leech, Hirudo medicinalis, and is the most potent thrombin inhibitor known [28]. The complex of hirudin with α-thrombin is high-affinity, noncovalent, stoichiometric (1:1) and has a dissociation constant in the picomolar to femtomolar range [59,60]. Crude hirudin from the leech is a mixture of polypeptides containing 65 or 66 amino acids which possess a molecular weight of approximately 6900 daltons [61]. The first three hirudin sequences to be determined were classified as hirudin variant 1 [62], hirudin variant 2 [63], and hirudin variant 3 [64] based on their chronological appearance in the literature [65]; these sequences are often referred to
The first three hirudin sequences to be determined were classified as hirudin variant 1 [62], hirudin variant 2 [63], and hirudin variant 3 [64] based on their chronological appearance in the literature [65]; these sequences are often referred to 14 .A_oo_ souu :oxmuv enamouau moonwasm m moumofivca .» .mo cowuwmoa um osvwmeu Hmcowuwvvm cm mcwmucoo m>m .mesvwmeu acmwum> 0» Home» AIV madam egg was smoHoeo: uo mnemmou oueocfiaoo moxon one .m>= use .~>= .~>= ”mucmwmm> sauna“: amusumz ooush may «0 moucosaom ofioc ocma< .o ousmfim a 6 6C zIzImIo xImIeIoImIoIaI>IUIOIz.qu no: zImIqu z-AIRIoImIoIaISIUIOIzioIa «on znzImIo aImIanImIoIaI>IUI0Iz-aIa up: on me am mm a a 4‘. IOImIoIsIHIoIaIzIo onIorsuzImIquIoIqIoIqIzIOIoImImIaIUIoIaIsIaIH no: IzImIoIoIHIOIzIzIoIzIoIoI>IzImIquIoIqIoroIzIOIoIquIaIUIoIaIquIu a»: IoImIoIQIHIUIzIquIo oIUI>IzImIoImIUIuIUIaIzIoroImImIaIUIoIaIsI>I> up: mm 0m .1w c~ mu on I m A rhy Fl. I CL r a 9 .. . D. .N . «Au EU. I l . h...“ a: \_ 'I 15 as HV1, HV2, and HV3, respectively, and are given in Figure 6 [66]. The hirudin sequences are approximately 80% homologous and contain only 13 variable positions [67]. The amino acid compositions also reveal a high content of asparagine (N), glutamine (Q), aspartate (D) and glutamate (E), as well as the absence of arginine (R), methionine (M), and tryptophan (V) [67]. The COOH-terminal region of these hirudin sequences is rich in acidic residues; nearly half of the last 11 homologous amino acid positions are occupied by an aspartate or glutamate residue. A unique acidic residue found in all three hirudin variants is the sulfated tyrosine (Y’) in the antepenultimate position [62,68,69]. This tyrosine acyl sulfate enhances the ability of hirudin to inhibit thrombin; desulfato-hirudins display three-to ten-fold decreased affinity for thrombin with respect to the sulfated forms [59,60,70,71]. 
Presently, twenty naturally-occurring hirudin sequences have been determined, and they display an overall sequence homology of approximately 66% [61].

Hirudin variants also share a common disulfide connectivity pattern. All hirudin molecules contain six conserved cysteine (C) residues in the NH2-terminal region which are involved in 3 characteristic cystine linkages: Cys6-Cys14, Cys16-Cys28, and Cys22-Cys39 [72]. Interestingly, all known serine proteinase inhibitors can be grouped into one of ten structural families based on sequence homology, and the topological relationship among disulfide bonds and the inhibitor reactive site [73]. However, hirudin displays no topological or sequence homology to any known serine proteinase inhibitor and thus could represent a heretofore unknown family of inhibitors [72]. Hirudin does share some global features with epidermal growth factor (EGF), which is a structural motif found in numerous proteins [74]. Alignment of the hirudin and EGF disulfide linkage patterns in opposite directions reveals a similar connectivity, as is seen in Figure 7 [67]. It is of note, however, that hirudin and EGF share no functional or sequence homology.

Figure 7. The Disulfide Bond Connectivity Patterns of Hirudin and Epidermal Growth Factor.

Because leeches contain mixtures of naturally occurring hirudin variants which are difficult to purify and isolate in large quantities, recombinant DNA technology has been employed to obtain homogeneous preparations in ample amounts. To date, five different microbial expression systems have been made to produce recombinant hirudin variant 1 (rHV1) [67,75-77] and two systems have been developed to produce recombinant hirudin variant 2 (rHV2) [63,65]. Recombinant hirudins differ from their natural counterparts in that they lack the sulfated tyrosine.
The source of hirudin used in this study is the recombinant hirudin variant 2-Lys47 mutant (rHV2-K47) resulting from a yeast expression system [65]. The sequence and covalent structure of rHV2-K47 are given in Figure 8.

Figure 8. The Sequence of Recombinant Hirudin Variant 2-Lysine 47 (rHV2-K47).

Secondary and tertiary structural information on hirudin in solution has been obtained independently by two research groups employing 2D NMR techniques to study recombinant hirudin variant 1 [78-80]. Qualitatively, the results of both groups are in general agreement. These studies revealed that hirudin is a two domain protein characterized by a compact, well-defined NH2-terminal core and a flexible, disordered COOH-terminal tail. The core domain (residues 3-30 and 37-48) contains the cystine linkages, several turns and two antiparallel β-strands. Residues 31-36 within the NH2-terminal domain and residues 49-65 of the COOH-terminal tail exhibited no preferred conformation in solution. A stereo drawing displaying nine optimally superimposed DISMAN distance geometry solutions resulting from the 2D NMR constraint data of Haruyama and Wüthrich is shown in Figure 9 [80].

Figure 9. Stereo Drawing of Nine Optimally Superimposed DISMAN Distance Geometry N, CA, C Structures from the 2D NMR Constraint Data of Haruyama and Wüthrich [80]. Disorder exists in the residue ranges 1-3, 31-36 and 49-65 (taken from [80]).

Hirudin reacts rapidly with human α-thrombin (kon > 1 x 10^9 M^-1 s^-1) [59] and forms an amazingly high-affinity, noncovalent complex with a dissociation constant estimated variously at 80 pM [28] and 20 fM [59]. The apolar binding site and the anion-binding exosite or fibrinogen recognition site of thrombin appear to be important thrombin determinants in complex formation.
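To give a feel for what the dissociation constants quoted in this section mean energetically, the sketch below converts a Kd into a standard binding free energy using ΔG° = RT ln(Kd / 1 M). This is only an illustration and is not part of the original analysis: the temperature of 298.15 K is an assumption (the text does not state one), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15      # ASSUMED temperature (25 C); not stated in the text


def binding_free_energy(kd_molar, temperature=T_KELVIN):
    """Standard free energy of binding, dG = RT ln(Kd / 1 M), in kcal/mol."""
    return R_KCAL * temperature * math.log(kd_molar)


def free_energy_penalty(fold_increase, temperature=T_KELVIN):
    """Free energy cost (kcal/mol) of a given fold increase in Kd."""
    return R_KCAL * temperature * math.log(fold_increase)


if __name__ == "__main__":
    # The two Kd estimates quoted for the hirudin-thrombin complex.
    for label, kd in (("80 pM [28]", 80e-12), ("20 fM [59]", 20e-15)):
        print(f"Kd = {label}: dG = {binding_free_energy(kd):.1f} kcal/mol")
    # Fold changes of the kind reported for modified hirudins and peptides.
    for fold in (3, 10, 1e6):
        print(f"{fold:g}-fold higher Kd: ddG = +{free_energy_penalty(fold):.1f} kcal/mol")
```

Under these assumptions the two Kd estimates correspond to roughly -13.8 and -18.7 kcal/mol, a three- to ten-fold loss of affinity costs on the order of 1 kcal/mol, and a 10^6-fold loss costs about 8 kcal/mol.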
Hirudin binding to human α-thrombin displaces proflavin from the apolar binding site [41]. In addition, the dissociation constant of hirudin and PPACK-thrombin, in which PPACK alkylates the active site histidine and appears to bind in the apolar binding site [40], is 1 x 10^6-fold higher than with α-thrombin [81]. Both of these observations implicate the apolar binding site in the hirudin-thrombin interaction.

Evidence supporting the importance of the anion-binding exosite of thrombin to its interaction with hirudin has come from binding studies involving the proteolytic derivatives of α-thrombin: β-, γ-, and ε-thrombin. Schematic drawings of human α-thrombin and these proteolytic derivatives, with the cleavages indicated, are given in Figure 10. β-Thrombin and γ-thrombin are produced by autolysis of native α-thrombin or by limited proteolysis with trypsin [82]. β-Thrombin generated by autolytic degradation is characterized by a cleavage at Arg77A followed by Arg67, and by the concomitant loss of the Ile68-Arg77A undecapeptide [48] (Figure 10 and Table 1). The β-thrombin derivative produced by trypsinolysis, designated βT-thrombin, contains a single cleavage at the Arg77A-Asn78 peptide bond [52]. The exact covalent structure of γ-thrombin appears to depend on the method of preparation. Boissel et al. [48] report γ-thrombin to possess cleavages at Arg67 and Arg77A, as in β-thrombin, and additional breaks at Arg126 and Lys149E, followed by the loss of the resulting 11- and 31-amino acid peptides (Figure 10 and Table 1). Fenton et al. [82], on the other hand, have reported a γ-thrombin with only two major cleavage sites - Arg77A and Lys149E. ε-Thrombin results from limited proteolysis with elastase and contains a single cleavage at the Ala149A-Asn149B bond (Figure 10 and Table 1) [52].

Figure 10. Human α-Thrombin and its Proteolytic Derivatives. The numbering of residues is taken from Table 1. The A-chain and B-chain of thrombin are designated by "A" and "B", respectively. CHO represents the polysaccharide linked to nitrogen ND2 of Asn(N)60G. The location of the catalytic triad residues is noted (H57, D102, and S195). Excised proteolytic fragments are denoted by broken lines.

Hirudin complexes with β-thrombin and γ-thrombin, proteolytic derivatives of α-thrombin which possess little or no fibrinogen activity (0.05%) due to a disrupted anion-binding exosite [52], have significantly higher dissociation constants with respect to the α-thrombin-hirudin complex [81]. ε-Thrombin, which displays only about 40% diminished fibrinogen activity with respect to native thrombin [52], in complex with hirudin has only a slightly increased dissociation constant [81]. Hirudin complexes with native human α-thrombin as well as with some of its chemical and proteolytic derivatives are not retained on nonpolymerized fibrin-agarose resin in contrast to free thrombin [44]. The importance of the hirudin interaction with thrombin at the anion-binding exosite is further emphasized by the observation that an antibody raised against a peptide corresponding to the β-cleavage site undecapeptide, Ile68-Arg77A, has been shown to inhibit hirudin binding to α-thrombin [83].

Hirudin is unique among macromolecular substrates for thrombin, and serine proteinase inhibitor proteins in general [84], in that interaction with the primary specificity pocket of thrombin does not appear to be crucial for inhibition. A hirudin lysyl residue was often thought to interact with the thrombin specificity pocket, especially Lys47; residues (39-48) of hirudin display 50% homology with residues (148-157) of PT, which contains an α-thrombin cleavage site utilized in human prothrombin activation [85] (cleavage site "a" in Figure 2).
When the basic residues on hirudin were altered by site-directed mutagenesis and the binding of the resulting mutants was tested with bovine and human thrombin, binding was found to be sensitive only to mutation of Lys47 [86,87]. However, the increase in the dissociation constant was only minor: four- to ten-fold. Moreover, the fact that Lys47 is not conserved in natural hirudin sequences further suggests that this interaction is not necessary [61].

The NH2- and COOH-terminal regions of hirudin both contain residues important to the hirudin-thrombin interaction. The electrostatic and acidic nature of the carboxy-terminal tail appears to be vital. Removal of the last 7 COOH-terminal amino acids of hirudin - which includes two Glu residues and a sulfated Tyr in natural hirudin - results in a loss of about 90% of inhibitory activity, and removal of the last 22 amino acids nearly abolishes inhibition [70]. Single and multiple mutations of the carboxy-terminal glutamates (57, 58, 60 and 61) to glutamine caused increases in the dissociation constant of the complex which grew with the number of mutations [87], and each negatively charged Glu appears to contribute equally favorably to the binding energy [88]. Desulfato Tyr63 hirudin-thrombin complexes possess a higher dissociation constant compared to those from sulfated forms, albeit only three- to ten-fold higher [59,60,71]. The hydrophobic residues in the anionic tail also appear to be involved in the interaction. In inhibition studies with a hirudin COOH-terminal peptide mimetic corresponding to residues 55-64, which is the minimum size required to abolish fibrinogen recognition, it was shown that Phe56, Ile59, Pro60, and Leu64 as well as Glu57 were all sensitive to modification [89].
In the NH2-terminal region of hirudin, it appears that the hydrophobic nature of the first two amino acids, a positively charged amino terminus, and the integrity of the three disulfides are crucial [68,90].

In its ability to interact with thrombin at both the apolar binding site and the anion-binding exosite, hirudin appears to be a bivalent or 'double-headed' inhibitor. Kinetic analysis of complex formation has revealed that the first step is ionic strength-dependent and unaffected by binding at the active site while the second step is influenced by binding at the active site [59]. These results in light of further studies suggest that the first step involves the interaction of the negatively-charged carboxy-terminal tail of hirudin with the anion-binding exosite, while the second step involves amino-terminal domain interaction in the active site region [81,88]. In support of this notion, it has been found in thrombin binding studies that the COOH-terminal hirudin fragments (49-65, 53-65) protect against tryptic cleavage, while the NH2-terminal fragment (1-52) is a competitive inhibitor of the thrombin substrate D-Phe-pipecolyl-Arg-p-nitroanilide [91,92]. Peptide synthesis of the hirudin segment (45-65) first confirmed that the COOH-terminal region of hirudin alone in complex with thrombin was sufficient to inhibit fibrin formation and the release of FPA without affecting amidase activity [93]. Since then it has been reported that amino acid residues (56-64) and (53-64) are minimally required for these results [89,94]. In addition, the finding that a (56-65) hirudin peptide produces a conformational change in thrombin as judged by circular dichroism which does not occur upon PPACK binding suggests that the COOH-terminal tail of hirudin may trigger a conformational change prior to binding of the NH2-terminal region [95,96].
It is important to note that these carboxy-terminal hirudin peptide mimetics in complex with thrombin have dissociation constants which are about 6 orders of magnitude higher and anticoagulant activities 3-4 orders of magnitude lower than that of intact hirudin [93].

The ability of hirudin in complex with thrombin to completely abolish the conversion of fibrinogen to fibrin has led to a great deal of study and interest in the use of hirudin as an antithrombotic medical agent. Many factors make hirudin an attractive alternative to conventional anticoagulant medical therapies. Currently, pharmacological control of thrombin activity in vivo is attained by means of heparins and benzamidine-type thrombin inhibitors, and both of these treatments have drawbacks [97]. Heparins, which are sulfated glycosaminoglycans, inhibit thrombin indirectly by enhancing the inhibitory power of antithrombin III and heparin cofactor II [26,27,97]. Thus, heparin efficacy is dependent on the plasma levels of these proteins. In addition, heparins can be neutralized by natural antiheparins in the blood such as platelet factor 4, can interact with other blood components such as platelets, fibrinolytic constituents, and lipoprotein lipases, and can lead to hemorrhagic side effects [97]. Oral benzamidine-based anticoagulants interfere with the biosynthesis of clotting factor proteins [97]. Hirudin, in contrast, is a selective, direct thrombin inhibitor which is not dependent on or affected by other plasma proteins, and which does not interfere with the biosynthesis or action of these components [28,98]. In addition, in animal studies modeling five pathogenic thrombosis conditions, hirudin exhibited antithrombotic efficacy at the lowest concentrations in all five examples compared to heparin and the best synthetic inhibitor [97].
Hirudin also displays low antigenicity, is rapidly cleared from the bloodstream, and can be produced by recombinant DNA methods in amounts ample enough to accommodate high medical demand [68,99]. Two main problems with using hirudin as a therapeutic anticoagulant are the unparalleled affinity of hirudin for thrombin (Kd in the picomolar to femtomolar range), which would be a challenge to counteract if necessary, and the fact that hirudin blocks amidolytic as well as fibrinolytic activities in thrombin [59]. Such concerns are not a factor in the promotion and development of hirudin COOH-terminal peptide mimetics for anticoagulant use. These molecules can effectively inhibit the thrombin-catalyzed fibrinogen to fibrin conversion without disrupting thrombin amidolytic activity, have significantly higher dissociation constants in complex with thrombin compared to hirudin, and can easily be synthesized by non-recombinant means [89,93-95].

II EXPERIMENTAL

A. Sample Preparation

Human α-thrombin was generously supplied to us by Dr. John W. Fenton II of the New York State Department of Health in Albany, New York. The thrombin samples were prepared by the method of Fenton et al. [100], and came as frozen vials which were 3.1 mg/mL in 0.75 M NaCl. The recombinant hirudin sample, rHV2-K47, was kindly supplied to us by Dr. Carolyn Roitsch of Transgene, S.A. in Strasbourg, France. The rHV2-K47 was produced by the method of Loison et al. [65], and was purified by the method of Bischoff et al. [101]; the samples were a lyophilized powder of better than 95% purity. A frozen solution of thrombin was allowed to thaw on ice. A 10-20% molar excess of rHV2-K47 was dissolved in 50 mM sodium phosphate buffer, 1 mM NaN3, pH 7.3, and was added to the thrombin solution.
The inability of several 5 µL hirudin-thrombin aliquots to produce clot formation in a 0.5 mL human plasma sample verified that thrombin inactivation had occurred. The hirudin-thrombin sample was then transferred to a dialysis bag made of Spectra/Por 4, 12-14 kD cutoff dialysis membrane, and dialyzed against 25 mM sodium phosphate buffer, 0.375 M NaCl, 1 mM NaN3, pH 7.3. Dialysis was carried out at 4 °C for several days with several buffer changes to effect a 10^9 to 10^11 dilution of the excess hirudin. The protein solution was then concentrated in a refrigerated centrifuge using an Amicon Centricon 10 microconcentrator to approximately 6.0 mg/mL.

B. Sample Molecular Weight Estimation

Discontinuous polyacrylamide gel electrophoresis [102,103] was used to estimate the molecular weight of the hirudin-thrombin complex. Slab gels were prepared using the BIORAD Mini-PROTEAN II dual slab cell unit, and the power source for electrophoresis was a BIORAD Model 200 constant voltage power supply. The gels were prepared, stained, and destained using Hoefer Scientific Instruments protocols [104]. The procedure used to estimate the molecular weight of electrophoresed proteins is a Sigma Chemical Company protocol [105], and is based on the methods of Bryan [106] and Davis [103]. Four different slab gels of varying separation gel pore size - 7, 8, 9, and 10 %T - were prepared, where %T is defined as:

\[ \%T = \frac{\text{mass acrylamide monomer} + \text{mass crosslinker}}{\text{total volume}} \times 100 \qquad (1) \]

The pore size and the %T of a gel are inversely correlated; a large value of %T indicates a small pore size. Three-microgram samples of the hirudin-thrombin complex, pre- and post-dialysis, as well as of the following calibration standard proteins were applied to each gel: bovine milk α-lactalbumin (MW 14,200), bovine erythrocyte carbonic anhydrase (MW 29,000), chicken egg albumin (MW 45,000), and bovine serum albumin (MW-monomer 66,000; MW-dimer 132,000).
Bromophenol blue tracking dye was added to all of the samples. The gels were run at a constant voltage of 200 V. It took approximately one hour for the dye front to move the length of the gel, at which point the electrophoretic separation was complete. The Rf, or electrophoretic mobility, of each protein oligomer was calculated for each of the four gels, and 100 Log(Rf x 100) versus percent gel concentration (Figure 11) was plotted for each protein, where Rf is defined as:

\[ R_f = \frac{\text{distance of protein migration}}{\text{distance of tracking dye migration}} \qquad (2) \]

Figure 11. 100 Log(Rf x 100) vs. %T Gel Concentration. The slope for each line gives the retardation coefficient (Kr) for the corresponding protein oligomer. Abbreviations: L (lysozyme), CA (carbonic anhydrase), CEA (chicken egg albumin), BSA-M (bovine serum albumin-monomer), BSA-D (bovine serum albumin-dimer), HT (hirudin-thrombin).

The slope for each protein oligomer in such a plot is known as the Retardation Coefficient (Kr). The logarithm of -Kr was then plotted versus the logarithm of the molecular weight for the standard proteins, producing a linear plot (Figure 12). Using the experimentally determined Kr value for the complex (-5.4), extrapolation from this plot gave a hirudin-thrombin complex molecular weight of 38,500. This estimate is only 1900 daltons larger than the molecular weight of native human α-thrombin and is about 5000 daltons less than that expected for the complex; the discrepancy could be attributed to the difficulty of accurately estimating protein molecular weights from gel electrophoresis.

C.
Crystallization

Crystals of the rHV2-K47-human α-thrombin complex were initially obtained by the hanging drop, vapor diffusion method [107] in an incomplete crystallization parameter factorial search [108] using 30% (w/v) polyethylene glycol (PEG) 4000, 0.2 M MgCl2, 0.1 M sodium acetate buffer, pH 4.5, 1 mM NaN3, and a protein concentration of about 6 mg/mL. Crystals appeared in 5-7 days. Refinement of these conditions showed that the largest crystals resulted from precipitant solutions which contained 27-28% PEG 4000, 0.2 M MgCl2, 1 mM NaN3, and any of the following buffers at a concentration of 0.1 M: sodium acetate buffer, pH 4.7 or 5.0; sodium citrate buffer, pH 5.5; and ADA (N-[2-acetamido]iminodiacetic acid) buffer, pH 6.0. Larger crystals (maximum dimensions 1.0 x 0.4 x 0.4 mm) suitable for x-ray diffraction studies were obtained by macroseeding protein-precipitant solutions [109]. Precipitant solutions used for macroseeding experiments were identical to those mentioned above, but were 26% in PEG 4000. Crystals selected as seeds for macroseeding in protein-precipitant droplets were rinsed prior to transfer in a solution which was identical to the precipitant solution of the crystallization, but which was 23.5-24.0% in PEG 4000. A photograph of a data collection-quality hirudin-thrombin crystal is shown in Figure 13; the trace of the seed crystal is visible in the center of the large crystal.

Figure 12. Log(-Kr) vs. Log(Molecular Weight). Linear regression best fit line to the standard protein retardation coefficient data. Molecular weight (M.W.) in kilodaltons. Abbreviations used have the same meaning as in Figure 11. The extrapolated molecular weight estimate for the hirudin-thrombin complex is 38.5 kD.

D.
Crystal Characterization

Crystals of the complex were characterized using Ni-filtered Cu Kα x-rays. Still and precession photographs were taken using a Rigaku RU-200B Standard Rotating Anode Source with a 0.5 x 3.0 mm filament operating at 8.8 kW power (55 kV, 160 mA), and a Charles Supper precession camera. We thank Drs. Cele Abad-Zapatero, John Erickson, and Chang Park of the Protein Crystallography Lab of Abbott Laboratories for the use of their facility to carry out these experiments. Frame data were also collected using a Xentronics Area Detector on a Nicolet P3/F diffractometer operating at 1.75 kW (50 kV, 35 mA) with a fine focus x-ray tube. The area detector data were processed using XENGEN data reduction software. We would like to thank Mr. Steven Muchmore of the Protein Crystallography Lab at The Upjohn Company for carrying out the area detector work for us.

The unit cell parameters and relevant crystal data are summarized in Table 2. Still x-ray pictures displayed diffraction to better than 2.5 Å resolution. From (0kl) and (hk0) precession photographs (Figure 14) and/or the reduced area detector data, the rHV2-K47-human α-thrombin crystals were shown to be tetragonal, space group P41212 or P43212. The unit cell dimensions measured with a Nicolet P3/F diffractometer using 10 high order reflections are: a = b = 90.39(2), c = 131.97(8) Å.

Figure 13. Photograph of an rHV2-K47-Human α-Thrombin Crystal. Crystal size is approximately 1.0 x 0.4 x 0.4 mm. Note the trace of the seed in the center of the crystal.

Table 2. Summary of Crystal Data of rHV2-K47-Human α-Thrombin.

    Crystal system                             Tetragonal
    Space group                                P41212 or P43212
    Number of molecules per asymmetric unit    1
    a (Å)                                      90.39(2)
    c (Å)                                      131.97(8)
    Solvent fraction                           61%
    Matthews number (Vm; Å3/dalton)            3.10

Figure 14. hk0 Precession Photograph of a Hirudin-Thrombin Crystal. The resolution limit at the edge of the photo is 3.7 Å. (μ angle = 12.0°.)
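The crystal volume per dalton and the solvent fraction listed in Table 2 can be checked directly from the cell dimensions. The following is only a minimal sketch: it assumes eight complexes per unit cell (one per asymmetric unit in P41212 or P43212), a complex molecular weight of 43,500 daltons, and a protein specific volume of 0.74 cm3/g, and the function names are illustrative. With these inputs the Matthews number comes out at about 3.10 Å3/dalton and the solvent fraction at about 0.60, close to the tabulated 61%; the small difference presumably reflects rounding or slightly different constants.

```python
# Sketch of the Matthews-number and solvent-fraction arithmetic.
# Inputs below are assumptions for illustration: Z = 8 complexes per cell,
# Mr = 43,500 daltons, specific volume 0.74 cm^3/g.

AVOGADRO = 6.02214e23
A3_PER_CM3 = 1.0e24  # cubic angstroms per cubic centimetre


def tetragonal_cell_volume(a, c):
    """Volume of a tetragonal unit cell (a = b, all angles 90 degrees), in A^3."""
    return a * a * c


def matthews_number(cell_volume, z, mr):
    """Crystal volume per unit of protein molecular weight, in A^3/dalton."""
    return cell_volume / (z * mr)


def solvent_fraction(vm, specific_volume=0.74):
    """Fraction of the crystal volume not occupied by protein."""
    # Convert cm^3/g to A^3/dalton (factor of about 1.66).
    protein_volume_per_dalton = specific_volume * A3_PER_CM3 / AVOGADRO
    return 1.0 - protein_volume_per_dalton / vm


volume = tetragonal_cell_volume(90.39, 131.97)
vm = matthews_number(volume, z=8, mr=43_500.0)
fs = solvent_fraction(vm)
print(f"cell volume = {volume:.0f} A^3, Vm = {vm:.2f} A^3/dalton, fs = {fs:.2f}")
```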
Assuming eight hirudin-thrombin complexes (Mr = 43,500 D) per unit cell and a protein specific volume of 0.74 cm3/g, the solvent fraction (fs) of the crystals is 61%; protein crystals generally have a solvent fraction ranging from about 27 to 65%, with 43% being the most common value [110]. Given the cell parameters and assuming one complex molecule per asymmetric unit, the Matthews number or Vm [110], which is the crystal volume per unit of protein molecular weight, is calculated to be 3.10 Å3/dalton; the most commonly observed Vm value is 2.15 Å3/dalton [110]. The high solvent fraction and high Matthews number of the hirudin-thrombin crystals are, however, nearly identical to the values obtained from crystals of the asymmetric molecules human Ig immunoglobulin (fs = 0.61, Vm = 3.14 Å3/dalton) [111] and yeast phenylalanine tRNA (fs = 0.60, Vm = 3.20 Å3/dalton) [112], which suggests that the hirudin-thrombin complex could be asymmetrical in shape.

A storing solution of 30% PEG 4000, 0.375 M NaCl, 0.2 M MgCl2, 0.1 M sodium acetate buffer, pH 4.6, 1 mM NaN3 was found to stabilize the crystals and maintain diffraction quality. The rHV2-K47-thrombin crystals used for x-ray diffraction experiments were gradually equilibrated in this storage solution prior to use by slowly exchanging the crystallizing solution with the storage solution.

E. 2.8 Å Resolution Diffractometer Intensity Data Collection

A 2.8 Å resolution set of diffractometer intensity data was collected on the hirudin-thrombin crystals using monochromatic Cu Kα radiation (1.5418 Å) and a Nicolet P3/F four-circle diffractometer controlled by a Data General Nova/4 computer. X-rays were generated using an AEG sealed x-ray tube operating at 2 kW (50 kV, 40 mA), and were monochromated by means of a graphite monochromator crystal. A helium-evacuated collimator and beam tunnel were used to minimize the air absorption of the incoming beam and the diffracted x-rays, respectively.
The complete data set was collected in two parts, each part using a different crystal. The first data collection contained reflection measurements from 3.63 to 2.80 Å resolution (reflections had 2θ values ranging from 24.5° to 32.0°), and will be referred to as HTN2. The second set contained intensity data to 3.56 Å resolution (reflection measurements from 2.0° to 25.0° in 2θ), and will be referred to as HTN3. (The first actual 3.56 Å resolution intensity data set collected, HTN1, was used to generate a file of observed reflections.) The crystals used for the HTN2 and HTN3 data collections had dimensions of approximately 0.95 x 0.40 x 0.35 mm. These crystals were mounted in siliconized glass capillaries of diameter 1.0 mm. The morphology of an idealized hirudin-thrombin crystal is shown in Figure 15. A crystal of the complex was mounted such that the c* axis (along the longest dimension of the crystal) was coincident with the capillary axis. A drawing of a hirudin-thrombin crystal mounted in a capillary is shown in Figure 16. The crystal was positioned in the capillary bathed in the storing solution and, just prior to sealing the capillary, the crystal was partially 'dried' with a filter paper strip so that only a small amount of liquid remained under the crystal to affix it to the capillary wall. A small band of storage solution was left at one end of the capillary before sealing it off with wax; as protein crystals contain a significant solvent fraction, the presence of this band of storage solution in the sealed capillary ensured, by vapor equilibration, that the crystal would not dry out. The capillary containing the crystal was then inserted into a plasticene mount on a goniometer head.

Figure 15. Morphology of an Idealized Hirudin-Thrombin Crystal.

Figure 16. A Hirudin-Thrombin Crystal Mounted in a Capillary for X-ray Crystallographic Analysis. (Labels: crystal, mother liquor.)
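The correspondence between the 2θ limits and the resolution limits quoted for the HTN2 and HTN3 data sets follows from Bragg's law, nλ = 2d sin θ, with λ = 1.5418 Å. The short sketch below is an illustration only (the function names are not from any diffractometer software); it converts between 2θ and interplanar spacing for first-order reflections.

```python
import math

CU_K_ALPHA = 1.5418  # wavelength in angstroms, as quoted in the text


def d_spacing(two_theta_deg, wavelength=CU_K_ALPHA):
    """Interplanar spacing d (angstroms) for a first-order reflection at 2-theta."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))


def two_theta(d, wavelength=CU_K_ALPHA):
    """Scattering angle 2-theta (degrees) for a given spacing d (angstroms)."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d)))


# Limits of the HTN2 shell (24.5 to 32.0 degrees) and the HTN3 upper limit (25.0 degrees).
for angle in (24.5, 32.0, 25.0):
    print(f"2-theta = {angle:5.1f} deg  ->  d = {d_spacing(angle):.2f} A")
```

Under these assumptions the three angles reproduce the 3.63, 2.80, and 3.56 Å limits given above to within rounding.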
The protein crystal on a goniometer head was positioned onto the goniostat of a four-circle diffractometer. A schematic drawing of this goniostat is shown in Figure 17 [113]. The goniostat orients the crystal and detector with respect to a stationary x-ray beam, and brings Bragg planes into reflecting position. W.L. Bragg developed the analogy of the diffraction of x-rays by crystals to ordinary reflection, and he deduced a simple equation - nλ = 2d sin θ - by treating diffraction as "reflection" from planes of atoms in a lattice [114]. In this equation, d represents the interplanar spacing of the Miller planes and λ is the wavelength of the x-ray radiation. Thus, diffracted x-rays are often referred to as reflections. A reflection is identified by the Miller indices h, k, and l, where the reflection plane (hkl) makes intercepts a/h, b/k, and c/l with the unit-cell axes a, b, and c of the crystal. The four circles of the goniostat refer to the three circles swept out by rotation of the crystal about the ω, φ, and χ axes (Figure 17), and the circle swept out by the arm of the detector through rotation about the 2θ axis (which is coincident with the ω axis). A goniometer head with the mounted protein crystal is placed on a base in the χ circle which is coincident with the φ axis. Each diffracted x-ray beam collected with a diffractometer is therefore defined by a 2θ, ω, φ, and χ angle as well as by an (hkl) index.

Figure 17. The Goniostat of a Four-Circle Diffractometer [113].

The 2θ, ω, φ, and χ angles to which the computer-controlled diffractometer drives in the course of making an intensity measurement result from the unit cell parameters and the orientation of the crystal. The unit cell constants and crystallographic orientation of the crystal are stored in the computer in the form of an orientation matrix A, which expresses the reciprocal lattice vectors a*, b*, and c* in terms of their components in the coordinate system of the instrument.
\[ A = \begin{pmatrix} a^*_x & b^*_x & c^*_x \\ a^*_y & b^*_y & c^*_y \\ a^*_z & b^*_z & c^*_z \end{pmatrix} \qquad (3) \]

In the HTN2 and HTN3 data collections, the orientation matrix was initially determined by manually locating three strong and approximately mutually orthogonal reflections, (0,0,32), (1,16,0), and (16,1,0). These reflections are stored in the reflection array, and were used to calculate the unit cell constants and the initial orientation matrix. As the hirudin-thrombin crystals were mounted with the c* or (001) direction of the crystal approximately coincident with the capillary axis, it was possible to align the c* axis to coincide with the φ axis of the diffractometer. This alignment facilitated the location of (1,16,0) and (16,1,0), which could be found at χ = 0, since the crystal lattice is an orthogonal system. Axial intensity distributions of the (00l) as well as the (h00) and (hh0) directions were then recorded. These measurements, which are displayed in Figure 18, allow one to assess the quality and integrity of the diffraction pattern from crystal to crystal. After this initial orientation matrix was obtained, a more accurate matrix was found by including nine additional high order reflections in the reflection array and by performing a least squares calculation between the calculated and observed angles of the reflections in the array. These additional reflections were all of relatively high 2θ (15° < 2θ < 22°), possessed sufficiently strong intensities so that the reflection centering routine would be able to center them after typical decay (5-15%), and collectively were distributed uniformly in the region of reciprocal space used for intensity measurements - most notably in χ.

Figure 18. Axial Intensity Distributions of the Recombinant Hirudin-Human α-Thrombin Crystals.
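The way an orientation matrix turns an (hkl) index into a diffractometer setting can be illustrated with an idealized case. The sketch below is not the Nicolet control software: it assumes a perfectly aligned tetragonal crystal whose a*, b*, and c* lie exactly along the instrument x, y, and z axes, so that the matrix is diagonal, and it uses the characterization cell (a = 90.39 Å, c = 131.97 Å). The function names are hypothetical.

```python
import math

CU_K_ALPHA = 1.5418  # angstroms


def ideal_orientation_matrix(a, c):
    """Orientation matrix for an axis-aligned tetragonal cell.

    Columns hold the instrument-frame components of a*, b*, and c*
    (units of reciprocal angstroms). Real matrices are generally not diagonal.
    """
    return [
        [1.0 / a, 0.0, 0.0],
        [0.0, 1.0 / a, 0.0],
        [0.0, 0.0, 1.0 / c],
    ]


def reciprocal_vector(matrix, hkl):
    """Instrument-frame scattering vector x = A h for the index triple hkl."""
    return [sum(matrix[i][j] * hkl[j] for j in range(3)) for i in range(3)]


def two_theta_deg(matrix, hkl, wavelength=CU_K_ALPHA):
    """2-theta from Bragg's law, using |x| = 1/d."""
    x = reciprocal_vector(matrix, hkl)
    inv_d = math.sqrt(sum(v * v for v in x))
    return 2.0 * math.degrees(math.asin(wavelength * inv_d / 2.0))


A = ideal_orientation_matrix(90.39, 131.97)
for hkl in [(0, 0, 32), (1, 16, 0), (16, 1, 0)]:
    print(hkl, f"2-theta = {two_theta_deg(A, hkl):.2f} deg")
```

In practice the matrix is refined by least squares against the observed setting angles of the reflections in the reflection array, as described above; the diagonal form is only a convenient starting picture.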
The cell dimensions of the crystal were updated each time a least squares calculation was performed on the reflections in the reflection array. The average cell parameters obtained by least squares in the HTN2 and in the HTN3 data collections are shown in Table 3.

The method used to measure the intensity data was first proposed by Wyckoff et al. [115], and is known as the Wyckoff ω-step procedure. The Wyckoff procedure is a stationary crystal-stationary detector peak sampling method where the top of the reflection peak is sampled by making a limited number of intensity measurements at small intervals in ω, as opposed to the full profile scan from background to background. It is often used in cases where reflections are closely spaced or where quick, accurate measurements are sought to minimize x-ray-induced crystal decay - situations frequently encountered in macromolecular data collections. Using this method in the HTN2 and HTN3 data collections, the peak was sampled 0.09° on either side of the calculated value for a given reflection by taking intensity measurements at seven 0.03° step increments, and it took 18 seconds to complete the scan. The four largest intensities measured were then summed, and this sum was taken to be proportional to the integrated intensity. Background measurements were made on either side of the peak by offsetting the ω value by 0.40°, and it took approximately 7 seconds to complete the left and right background measurements. Thus, it took approximately 25 seconds to measure each reflection intensity.

Table 3. The Average Cell Parameters of the HTN2 (3.6-2.8 Å) and of the HTN3 (2.8 Å) Crystals. Cell lengths in Å, angles in degrees.

              A        B        C        α        β        γ
    HTN2    91.07    90.99   132.60    90.05    89.99    90.03
    σ        0.02     0.01     0.04     0.01     0.01     0.01
    HTN3    90.95    90.95   132.56    89.99    90.04    90.03
    σ        0.02     0.02     0.03     0.04     0.03     0.02
This procedure can also compensate for minor crystal misalignments by allowing the diffractometer to take additional steps if the peak maximum is too close to an edge of the scan range; six additional steps were allowed in the HTN2 and HTN3 data collections.

Although the Wyckoff ω-step scan was used to collect both sets of hirudin-thrombin intensity data, the manner of data collection was different in the two sets. In the HTN2 (3.63-2.80 Å) data collection, the intensity data were collected in three 2θ shells of (24.5-27.5)°, (27.5-30.0)°, and (30.0-32.0)°, in the order of lowest to highest resolution. As only 1/16th of the sphere of reflection needs to be measured to collect a unique set of intensity data, only reflections with the k index greater than or equal to the h index were measured. In each shell, data were collected by ascending h,k,l - with l varying fastest, h slowest. A total of 7914 reflections were measured in the HTN2 data collection, and it took approximately 55 hours to obtain these data.

In the HTN3 3.56 Å data collection, a reflection file was used based on the observed reflections of the HTN1 3.56 Å data collection; 81% of the possible HTN1 reflections were judged to be observed and were included in this file. The file was created by dividing the unique, observed data into 19 2θ ranges of approximately 305 reflections per range. Each range was sorted in order of descending 2θ, with h varying slowest, l varying fastest, and these individual files were merged in the order of descending average 2θ, so that the data collection file would allow reflections to be measured from high to low resolution. Since high order data are generally weaker, and are more strongly affected by x-ray-induced decay than low order data, this manner of data collection is preferred to that used in the HTN2 data collection. In the HTN3 data collection, 5983 reflections were measured over the course of about 42 hours.
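The Wyckoff ω-step measurement described above can be sketched in a few lines. This is a hedged illustration rather than the diffractometer's actual routine: the text states only that the four largest of the seven step counts are summed and that left and right backgrounds are measured, so the background scaling factor (bg_scale, nominally the ratio of peak to background counting time) and the edge criterion (margin) are assumptions introduced here, as are the function names.

```python
def step_offsets(half_width=0.09, step=0.03):
    """Omega offsets (degrees) of the step positions about the calculated peak."""
    n_steps = int(round(2 * half_width / step)) + 1
    return [round(-half_width + i * step, 4) for i in range(n_steps)]


def wyckoff_intensity(step_counts, bg_left, bg_right, bg_scale, n_top=4):
    """Net intensity estimate from a step scan.

    The n_top largest step counts are summed; the background term and its
    scaling are assumptions, since the text does not give the exact formula.
    """
    if len(step_counts) < n_top:
        raise ValueError("not enough steps in the scan")
    peak_sum = sum(sorted(step_counts, reverse=True)[:n_top])
    return peak_sum - bg_scale * (bg_left + bg_right)


def peak_near_edge(step_counts, margin=1):
    """True if the largest count lies within `margin` steps of either scan edge,
    the situation in which extra steps would be requested (margin is assumed)."""
    i_max = step_counts.index(max(step_counts))
    return i_max < margin or i_max >= len(step_counts) - margin


counts = [10, 50, 120, 200, 150, 60, 12]  # made-up counts for seven steps
print(step_offsets())
print(wyckoff_intensity(counts, bg_left=5, bg_right=7, bg_scale=1.0))
print(peak_near_edge(counts))
```

Summing only the top of the peak trades a complete profile for speed, which is the point of the method when crystal decay limits the total exposure.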
During the course of each data collection, the alignment of the crystal was monitored by measuring the intensity of three nearly orthogonal reflections every 100 reflections; reflections (1,16,0), (16,1,0), and (0,0,32) were chosen for this purpose. If the intensity of these monitor reflections diminished by more than 15%, the crystal was considered misaligned, and this triggered the re-measurement of the 12 reflections in the reflection array, followed by the calculation of a new orientation matrix and cell parameters by a least squares calculation. Slopes resulting from a least squares fit line of intensity versus exposure hour data for each monitor reflection were used in data processing to obtain decay corrections to the data of 2θ > 15.0°.

A number of different measurements preceded and followed the actual intensity data collection; these measurements included the ω profile of the reflection (6,16,17), the intensity absorption of the (0,0,8) reflection versus φ angle, and the (hk0) zonal reflections to 5.9 Å resolution. The ω profile of a reflection helps determine the parameters in the Wyckoff ω-step scan, and provides an indication of the suitability of the crystal for intensity data collection. A split profile, for example, may imply that the crystal is cracked, and that two closely oriented but distinct crystals are contributing to the profile. In the HTN2 and HTN3 data collections, the ω profile of the reflection (6,16,17) was determined by making 10 second intensity measurements at ω values of +/- 0.50° from the calculated value in 0.05° increments of ω; the ω profile obtained for reflection (6,16,17) prior to beginning the HTN2 data collection is shown in Figure 19.

Figure 19. The ω-Profile of (6,16,17) Taken Before the 3-D HTN2 Data Collection. Peak width at half-height approximately 0.18°. (Vertical axis: total counts per 10 seconds; horizontal axis: ω.)

The width of the profile
peak at half the intensity maximum defines the region of the peak which should be scanned in the Wyckoff ω-step scan; since this width is approximately 0.18° in the profile (Figure 19), a scan range of 0.18° was chosen. From this profile one can also determine a suitable ω offset value at which left and right background measurements can be made; an offset value of 0.40° was chosen from the Figure 19 profile.

The intensity absorption versus φ angle for the reflection (0,0,8) was measured to determine the optimum region of φ in which to carry out the data collection. Ideally, one would like to collect data in the region with the lowest overall x-ray absorption or the highest overall intensity. The (0,0,8) reflection was chosen because it has a large intensity (Figure 18) and, more importantly, because each crystal was aligned on the diffractometer so that the c* axis was coincident with the φ axis. This alignment ensured that the (00l) Bragg planes were always in reflecting position for all values of φ at χ = 90°. Intensity measurements were made every 20° in φ over the entire 360° range, and then every 10° over a 150° range containing the 45° data collection φ region. A plot of I(φ) versus φ taken before the HTN3 3.56 Å data collection is shown in Figure 20; the 45° range indicated in the plot was the φ range employed for the data collection. This region was chosen for the data collection because it was of high intensity or low x-ray absorption. The curve obtained from averaging the absorption curves determined before and after each data collection was used to correct the intensity data for absorption in the data processing.

Intensity data for (hk0) zonal reflections of 2θ (2-15)° were measured before and after the three-dimensional data collections. These data were used in the three-dimensional data processing to assess decay due to x-ray exposure for reflections of the corresponding 2θ range.

Figure 20.
Intensity Absorption Plot of (0,0,8) Taken Before the HTN3 3.56 Å Resolution Data Collection. I(φ) in total counts/measurement. The 45° φ range employed in the data collection is indicated by "D.C.".

Moreover, the 5.9 Å resolution zonal intensity data measured before data collection for the HTN2 and HTN3 data sets were used to obtain an initial scale factor for scaling the two data sets.

F. 2.3 Å Resolution Area Detector Data Collection

A 2.3 Å resolution set of area detector intensity data was collected from a recombinant hirudin-thrombin crystal in the laboratory of Professor Robert Huber at the Max-Planck-Institut für Biochemie in Martinsried, Germany. The Enraf-Nonius data collection system consisted of a CAD-4 four-axis goniostat and a FAST television area detector. Cu Kα x-rays were produced by a Rigaku RU-200B rotating anode x-ray generator fitted with a 0.3 x 3.0 mm fine focus filament and operating at 5.2 kW (45 kV, 120 mA). The x-rays were monochromated by means of a Ni-filter which removed 99.95% of the Cu Kβ radiation. A hirudin-thrombin crystal of approximate dimensions 0.85 x 0.50 x 0.50 mm, mounted in the previously described manner with the c* axis coincident with the capillary axis, was used to collect the intensity data; all measurements were made at 6 °C.

The Enraf-Nonius CAD-4 goniostat is a 4-axis diffractometer which makes use of the same 2θ, ω, and φ angles as the Siemens P3/F system, but which uses a κ mechanism, as opposed to the χ circle, to position the goniometer head. A drawing of a 'κ-geometry' goniostat is shown in Figure 21 [116]. The goniometer head is held on an arm which can rotate about an axis 50° from the vertical. The base of the arm rotates about the ω axis, and the motion is identical to that of ω on a conventional goniostat. The upper portion of the arm is held with the φ axis inclined 50° to the κ swivel axis.
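With the κ axis inclined 50° to the ω axis, the range of equivalent χ angles reachable on such a goniostat can be estimated from the commonly used kappa-to-Eulerian relation sin(χ/2) = sin α sin(κ/2), where α is the inclination of the κ axis. The sketch below is an illustration under that assumption; sign conventions differ between instruments, and the function name is hypothetical.

```python
import math

KAPPA_AXIS_INCLINATION = 50.0  # degrees, as described for the CAD-4 goniostat


def equivalent_chi(kappa_deg, alpha_deg=KAPPA_AXIS_INCLINATION):
    """Eulerian chi (degrees) equivalent to a given kappa rotation (degrees).

    Uses sin(chi/2) = sin(alpha) * sin(kappa/2); the sign of chi follows
    the sign of kappa in this sketch.
    """
    alpha = math.radians(alpha_deg)
    half_kappa = math.radians(kappa_deg / 2.0)
    return 2.0 * math.degrees(math.asin(math.sin(alpha) * math.sin(half_kappa)))


for kappa in (-180.0, -90.0, 0.0, 90.0, 180.0):
    print(f"kappa = {kappa:7.1f} deg  ->  chi = {equivalent_chi(kappa):7.2f} deg")
```

Under this relation the extreme κ settings of plus and minus 180° correspond to χ values of plus and minus twice the inclination angle.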
Combined motions of ω and κ can produce crystal orientations analogous to -100° < χ < 100° on a conventional diffractometer [116].

Figure 21. Drawing of an Enraf-Nonius Diffractometer with a Conventional Detector [116].

The Munich Area Detector (NE) System, or MADNES, software package is an inclusive system of area detector x-ray diffraction data collection programs which was used to carry out all steps involved in the FAST experiment. MADNES is a detector-independent system which allows one to perform the following diverse tasks [117]: approximately align crystals; find reflections; autoindex found reflections and use them to determine unit cell parameters and an orientation matrix; refine crystal and detector orientation; predict reflection positions; collect reflections; evaluate reflections as they are collected; monitor and update the crystal orientation matrix and detector alignment throughout data collection; and plot predicted and/or observed reflections.

The data collection geometry used in a MADNES-controlled area detector x-ray diffraction system is similar to that used in rotation or oscillation x-ray photography: the crystal is rotated about an axis perpendicular to the incident x-ray beam. In order to measure a nearly complete set of data from a hirudin-thrombin crystal, two crystal orientations and two data collection φ ranges were used. (The φ angle in MADNES terminology is identical to the ω angle of the CAD-4 goniostat.) Initially, data were collected by rotating nearly 90° in φ (-60° to 24°) about the c* or capillary axis of the mounted crystal. In addition, data were collected by rotating 100° in φ (-10° to 90°) about an axis 30° oblique to the c* axis.

Unlike conventional scintillation counter diffractometers, which measure reflection intensities individually, area detectors have the capability of measuring several hundred reflection intensities at a time.
The FAST area detector is a phosphor screen coupled to a television camera, and it is composed of picture elements called pixels. The pixels are arranged on the screen in a 512 x 512 grid. FAST area detector x-ray intensity data are collected by rotating a crystal through a small angle while x-ray quanta are collected on the phosphor screen. The x-ray quanta collected at each pixel location are integrated, and a two-dimensional array is generated whose elements correspond to the amount of x-rays detected at each detector location; this array is called an image or frame, and it is the basic informational entity used by MADNES to perform its numerous aforementioned tasks. In the hirudin-thrombin FAST experiment, frames of data were generated by rotating 0.1° in φ at a rate of 90 seconds per frame. The crystal-to-detector distance was 49 mm, and it took approximately 56 hours to complete the frame data measurements at the two previously mentioned orientations.

The sequence of operations in a MADNES intensity data collection is indicated in Figure 22 [118]. The initial step in the data collection process is a pre-alignment, and it is very similar to the preliminary alignment which must be conducted prior to x-ray precession photography. The purpose of the pre-alignment is to find two major zones of the crystal, and to reposition the crystal so that the center of the zones approximately coincides with the center of the x-ray beam [117,118]. In this experiment, the (0kl) and (h0l) zones were aligned. A least squares fit to the spots which defined the diffraction cones gave the circle center, and the new goniostat angles were calculated to bring the zone and beam centers into alignment. After this step, the missetting angles were approximately zero and rough unit cell parameters could be obtained.

The second step in the FAST experiment utilizes the FIND command and involves determining the centroids of a few hundred reflections at
FIND Yms, st, Phi for several hundred reflections at two phi positions 3. ALIGNMENT 1 A. AUTO-INDEX found reflections and obtain 3 approximate orientation matrix i 3. REFINE ! cell parameters, orientation matrix, & ; detector-related experimental parameters_J W 4. PREDICT a batch of reflections v S. COLLECT & EVALUATE , intensity data get and process next image yes !<:fiis-alignment detect3§2:> <:Reed more predictionsz>>———————-yes no < yes END Figure 22. A Flow Chart of Operations in a MADNES Area Detector Data Collection. 56 the pre-aligned ¢ positions. Three coordinates define a FAST reflection position: Yms, st, and ¢. The coordinates Yms and st (’ms’ stands for ’mass store') correspond to the vertical and horizontal directions of the detector, and the w coordinate refers to the crystal rotation angle (the ¢ angle is identical to the w goniostat angle, as was mentioned earlier). Finding a hundred or so reflections at each of the (hOl) and (Okl) zonal positions was a 4-stage process [118]. First, an initial image was produced by rotating 0.4° in ¢ about each zonal ¢ position. Second, each resulting image was searched for reflections and the Yms, st values were extracted. Third, ten images spanning 1.0° in v (O.1° rotation per frame), and centered on the zonal o value of step 1, were measured. Finally, a list of Yms, st, and ¢ coordinates for the reflections contained in these frames was produced. The purpose of the third step was to ensure that any partial reflections measured in step 1 would be completely defined. The third step of the area detector data collection is an alignment stage which includes an auto-indexing of the found reflections, an orientation matrix calculation, and a refinement of various crystal and experimental parameters. In the rHVZ-human a-thrombin data collection, the AUTJ (auto-indexing of reflections by the James Pflugrath method) command was used to auto—index the found reflections and obtain an approximate orientation matrix. 
The REFINE command was then used to refine the crystal orientation matrix, the unit cell parameters, the missetting angles, the crystal-to-detector distance, the detector tilt and twist angles, and the center of the primary beam [117]. The unit cell parameters obtained for hirudin-thrombin as a result of this alignment step were: a = b = 90.53 Å; c = 132.05 Å; α = β = γ = 90.00°, and the error estimates were approximately 0.05 Å and 0.05°, respectively. Having completed the pre-alignment, reflection-finding, and alignment steps, one is ready to collect and evaluate the needed frame data. Obtaining a three-dimensional set of reflection data using MADNES basically involves the iterative use of two commands: PREDICT and COLLECT. PREDICT uses the output from the previous alignment step to calculate, for each reflection, the (hkl) index, central φ value, φ range, starting φ value, Lorentz-polarization correction factor, oblique incidence correction factor, and the mass store coordinates Yms, Zms [117]. Reflections are predicted in small φ increments called batches. The major restriction on the batch size is that it contain fewer than 3000 reflections; in the hirudin-thrombin data collection a batch size of 2.5° was used. Once a batch of reflections has been predicted, it is sorted on starting φ value and checked for overlaps. The COLLECT command carries out frame data collection and data processing, both controlled by the appropriate parameter input. The data collection proceeds by the exposure of a series of images as the crystal rotates, and involves the extraction of information around each predicted reflection occurring on the images. MADNES stores all the reflection information in a doubly linked list [117]. Each time a reflection begins to approach the Ewald sphere, the storage size of data for the reflection is calculated and a contiguous amount of space in this list is allocated.
Information for this reflection from a series of adjacent images is transferred to this allocated location as it is obtained. Once the reflection has finished passing through the Ewald sphere, the information for that reflection in the doubly linked list is complete and that reflection is processed; the data for the reflection can be written to disk if desired, but most importantly the data space for this reflection is cleared for use by a new reflection. As an image is being generated by the area detector, MADNES can process the previous image. Collected (hkl) reflection intensity data were processed to observed structure amplitudes (uncorrected for scaling, absorption, or decay), denoted |Fobs|, during the course of data collection by application of the following relationship:

$$|F_{obs}|^2 = \mathrm{LORPOL} \times I(hkl) \qquad (4)$$

where I(hkl) is the background-corrected intensity of a reflection (hkl) and LORPOL is the Lorentz-polarization correction factor, which is a geometric-polarization correction term. The size and shape of reflections were determined by means of the dynamical masking method of Sjolin and Wlodawer [119]. This method generates a mask which is used to evaluate the pixel data. Mask generation proceeds via five stages [118]. In stage 1, the 3-D data are treated with a smoothing algorithm, and the average and standard deviation of the smoothed array are determined. In stage 2, a mask is produced from the smoothed data using the formula:

$$\mathrm{MASK} = (\mathrm{SMOOTH} - \mathrm{Average}) / \mathrm{Std.\ Dev.} \qquad (5)$$

In stage 3, a new average background and standard deviation are calculated from the smoothed array by leaving out pixels which deviate too much from the previous average, and stage 2 is repeated. In stage 4, a contiguity criterion is applied to sufficiently positive mask values: the contiguous region is considered to be the peak, and non-contiguous regions are attributed to noise or interfering reflections.
In stage 5, the mask is used to assign the original data to peak, background, and neither peak nor background pixels so that peak integration and intensity calculation can be achieved. Masks are not produced for low-intensity reflections; these reflections are processed using a mask made from the masks of medium and strong reflections from the same region of the detector. (The detector surface is divided into 16 regions.) As a result of the COLLECT command, an observed structure factor amplitude (|Fobs|) file is generated and continually added to during the course of active data collection. The file contains the following information for an individual reflection record: (hkl), |Fobs|, σ(Fobs), the average background, σ(background), error codes, crystal identifier codes, the calculated and observed Yms, Zms, and φ coordinates and the width of the spot center (observed values are only listed if found), and the exposure time in hours at the spot center [117]. Each reflection is given a simplified character error code as well as a detailed numeric error code. Reflections can be given one of four simplified error codes: good, weak, edge, and bad. 'Good' reflections are measurements which were obtained under no obvious error conditions and will be used in further processing (to be described in a later chapter) to generate |F(hkl)| values. Reflections which are labelled with the other three simplified error codes are ignored in later processing, and have detailed numerical error codes which reveal the exact source of the error. 'Weak' reflections were not observed in the data collection. 'Edge' reflections were measured on the edge of a data array in one or more of the Yms, Zms, or φ coordinates. A 'bad' reflection was labelled as such if one or more of the following problems occurred with regard to a peak: the peak appeared too far from one or more of the predicted detector coordinate values; the peak was too wide in a particular coordinate;
bad background or σ(background); bad pixel non-uniformity in the data array; or a problem with pixel overflow, i.e., too dark [117]. In the hirudin-thrombin data collection, 183,791 reflection measurements were recorded in the output |Fobs| file, of which 164,250 (89.4% of the recorded reflection data) were labelled as 'good' and 19,541 were labelled with the bad, edge, or weak error codes. MADNES also has a means of continually monitoring whether a mis-alignment has occurred and a re-centering is needed [118]. During the course of data collection, the average difference between the predicted and observed Yms, Zms, and φ coordinates is calculated for each of the 16 regions of the detector. Moreover, the evaluation status of the 500 most recently processed reflections is stored. If the average differences between the predicted and observed coordinates become too large, or if the fraction of 'good' reflections among the 500 most recently evaluated reflections becomes too low, the data collection automatically returns to the REFINE command for a re-alignment. Up to 250 reflections from the current list with the highest |Fobs| and suitable error codes are selected for refinement of the missetting angles and detector position. (The unit cell parameters are not refined in a re-alignment, as the φ range and the sampling of reciprocal space are too small to yield an accurate orientation matrix calculation.) After the refinement has been conducted, two quantities are calculated to assess whether the re-centering has been successful [117]. The root mean square (r.m.s.) difference between the actual and the predicted spot positions on the detector in millimeters (rmscfd), and the r.m.s. difference in angles between the reciprocal lattice vector and the predicted one (rmsdeg), are calculated for the reflections selected to perform the refinement and checked against maximum acceptable values stored in
In the hirudin-thrombin data collection, the acceptable values for these quantities were 0.2500 and 0.2000, respectively. If the calculated values were below the tolerance limits, the data collection would proceed; otherwise, the data collection would halt and prompt for user intervention.

III DATA REDUCTION

A. Converting the 2.8 Å Resolution Diffractometer Intensity Data to Structure Amplitudes

The conversion of each reflection intensity, I(hkl), to its corresponding structure factor modulus, |F(hkl)|, is the initial step in the processing of x-ray intensity data. The generation of a 2.8 Å resolution set of structure factor amplitudes from the diffractometer x-ray data of the hirudin-thrombin crystals proceeded in two stages. Initially, the 3.56 Å resolution HTN3 and the (3.63-2.80) Å resolution HTN2 data sets were reduced to structure factor amplitudes. Afterward, the data sets were scaled and averaged to produce a unique set of amplitudes to 2.8 Å resolution. The intensity data were reduced using the program P-DATA, written by C.D. Buck, formerly of this laboratory. The program accomplishes this task, using input data from either a Nicolet P3/F or a Picker FACS-I four circle diffractometer, by means of the following equation:

$$|F(hkl)|^2 = \mathrm{CONST} \times \mathrm{ABS} \times \mathrm{DEC} \times \mathrm{LORPOL} \times I(hkl) \qquad (6)$$

where CONST is a scaling factor, ABS is an absorption correction factor, DEC is an x-ray induced decay correction factor, LORPOL is the Lorentz-polarization correction factor, and I(hkl) is the background-corrected intensity of the reflection (hkl). Background-corrected reflection intensities were obtained by averaging the background data associated with the reflections in shells of 2θ.
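As a minimal sketch of equation (6), the Python function below converts one background-corrected intensity into a structure factor amplitude. The function name, the argument names, and the numerical values are illustrative assumptions and are not taken from P-DATA; in particular, reporting a non-positive corrected intensity as |F| = 0 is only one possible convention.

```python
import math


def structure_amplitude(intensity, const=1.0, abs_corr=1.0, dec=1.0, lorpol=1.0):
    """Return |F(hkl)| from equation (6): |F|^2 = CONST x ABS x DEC x LORPOL x I."""
    f_squared = const * abs_corr * dec * lorpol * intensity
    # Assumption: a non-positive corrected intensity is reported as |F| = 0.
    return math.sqrt(f_squared) if f_squared > 0.0 else 0.0


# Hypothetical reflection with illustrative correction factors.
amplitude = structure_amplitude(400.0, const=1.04, abs_corr=1.10, dec=1.05, lorpol=0.50)
print(round(amplitude, 2))
```

Keeping each correction as a separate multiplicative argument mirrors the form of equation (6), so any single factor can be changed without touching the others.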
The integrated intensity output for each reflection during the course of data collection is computed from the following equation [120]:

$$I = \left[\text{total scan count} - \frac{\text{sum of background counts}}{\text{background to scan time ratio}}\right] \times \text{scan rate} \qquad (7)$$

where measurements of the left and right background as well as of the total scan count for the peak are used in the calculation. Background-averaging of intensity data results in more accurate measurements for two reasons. First, each left and right background was measured only one-fifth as long as the peak in these intensity data collections. Second, since these background count values are generally low, statistical fluctuations can significantly affect the intensities calculated for weak reflections. Averaging the background of the intensity data for the HTN2 and the HTN3 data collections revealed that there was a definite 2θ dependence to the background measurements. The background data were also averaged in shells of φ as well as 2θ; however, since no φ dependence could be discerned, the intensity data resulting from the 2θ shell background averaging were used in the P-DATA processing. The formulation used to correct intensity measurements for absorption in the P-DATA program was based on the method proposed by North, Phillips, and Mathews [121]. The program uses absorption tables of Imax/I(φ) versus φ to apply the absorption correction, and such a table was obtained from the absorption curve depicted in Figure 20. The decay correction factor, DEC, corrects for the intensity deterioration of a reflection due to x-ray exposure of the crystal, and it is generally a function of both exposure time and 2θ angle. The DEC factor has the form 1/(1-St) in the following equation:

$$I'(t) = I(t)\left[\frac{1}{1 - St}\right] \qquad (8)$$

In equation (8), I(t) is an intensity measurement recorded at time t in the data collection, and I'(t) is the intensity measurement corrected for decay as if it were recorded at time zero.
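Equations (7) and (8) can be sketched in a few lines of Python. The function names, argument names, and all numbers below are hypothetical and chosen only to make the arithmetic easy to follow; the decay constant S is simply passed in as a rate in hours⁻¹, and the guard against a non-positive denominator is an added assumption.

```python
def integrated_intensity(total_count, left_bkg, right_bkg, bkg_to_scan_ratio, scan_rate):
    """Equation (7): subtract the time-scaled background sum, then multiply by the scan rate."""
    return (total_count - (left_bkg + right_bkg) / bkg_to_scan_ratio) * scan_rate


def decay_corrected(intensity_t, s, t_hours):
    """Equation (8): I'(t) = I(t) / (1 - S t), the intensity referred back to time zero."""
    remaining = 1.0 - s * t_hours
    if remaining <= 0.0:
        # Assumption: the linear decay model is not usable once it predicts no intensity.
        raise ValueError("linear decay model predicts no remaining intensity")
    return intensity_t / remaining


# Illustrative values only: 1000 total counts, backgrounds of 40 and 60 counts.
raw = integrated_intensity(1000.0, 40.0, 60.0, bkg_to_scan_ratio=0.4, scan_rate=2.0)
corrected = decay_corrected(800.0, s=0.004, t_hours=50.0)
print(raw, corrected)
```

With these numbers a reflection measured 50 hours into the experiment is scaled up by 1/(1 - 0.2), that is, by 25%.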
Furthermore, S = -s/I'(0), where s is either the slope from an intensity versus exposure time plot for a single, periodically-measured monitor reflection, or the slope from an average intensity versus exposure time plot for a periodically-measured group of reflections, and I'(0) is the corresponding extrapolated intensity at zero time. The decay corrections for the HTN2 (3.6-2.8) Å data set were obtained from the linear plot of S versus average 2θ (Figure 23). Four points were used to generate the line; three points came from intensity decay plots of the three monitor reflections [(1,16,0), (16,1,0), and (0,0,32)], and the fourth point was the slope calculated by comparing peak heights from 5.9 Å resolution (hk0) Patterson projection maps computed using intensity data collected before and after the 3-D data collection. The three decay- and alignment-monitoring reflections decayed approximately (12-16)% over the course of the 57 hour data collection, while the 5.9 Å resolution (hk0) zonal reflections decayed 8% on average. The HTN2 intensity data were collected from low to high resolution in shells of (24.0-27.5)°, (27.5-30.0)°, and (30.0-32.0)° in 2θ, respectively. From the S versus average 2θ plot, S values were extrapolated for the 2θ values 27.5° (S = 0.00373 hours⁻¹) and 30.0° (S = 0.00404 hours⁻¹). The (24.0-27.5)° intensity data were corrected for decay using S = 0.00373 hours⁻¹, the (30.0-32.0)° intensity data were corrected for decay using S = 0.00404 hours⁻¹, and the (27.5-30.0)° data were corrected for decay by interpolating between the two S values.

[Figure 23. Dependence of X-ray Decay Correction with Scattering Angle for the HTN2 Hirudin-Thrombin (3.6-2.8) Å Resolution Data Collection. Plot of S versus <2θ>.]

In the 3.56 Å resolution HTN3 data set, which was collected from high to low resolution using an observed reflection file, the DEC correction factors had a slightly different form than in equation (8), and are contained in the expressions listed below:

$$I'(t) = I(t)\left[\frac{1}{1 - 2\theta\,k\,t}\right], \quad 2\theta > 12.0^\circ \qquad (9)$$

$$I'(t) = I(t)\left[\frac{1}{1 - 12\,k\,t}\right], \quad 2\theta \le 12.0^\circ \qquad (10)$$

To obtain DEC correction factors for this data set, an S versus average 2θ plot was made, like the one shown in Figure 23, but now the slope of the S versus average 2θ plot, k, as well as the actual 2θ value of the reflection were used to obtain the appropriate decay correction. In the HTN3 data collection the three alignment- and decay-monitoring reflections decayed (5-11)% over the course of the 44 hour data collection, and the 5.9 Å resolution (hk0) zonal reflections decayed 4% on average. The k value determined from the S versus average 2θ curve for the HTN3 data collection was 9.04 x 10⁻⁵ hours⁻¹. The factor CONST was used to approximately scale the HTN2 intensity data to the HTN3 intensity data. The CONST value was obtained by finding the mean ratio of corresponding Patterson peak heights from the HTN2 and HTN3 5.9 Å resolution (hk0) Patterson projection maps calculated using intensity data obtained before the start of each 3-D data collection. The value of CONST determined to put the HTN2 intensity data roughly on the same scale as the HTN3 intensity data was 1.04. Moreover, the fact that the CONST value is so close to unity indicates that the x-ray scattering of the two crystals was comparable. Once the P-DATA reflection processing was completed, it was possible to determine the percentage of collected data which were observed in the HTN2 and HTN3 intensity data collections. The HTN2 (3.6-2.8) Å resolution data were processed to 3483 unique structure factors out of 7597 possible observations; thus this data set had 46% of the unique data observed.
In the HTN3 3.56 Å resolution data collection, 5624 unique measurements were observable out of the 5769 reflections present in the reflection file. This number was unusually high because the data file contained only reflections observed to 3.56 Å resolution from a previous data collection, HTN1. However, if one considers that this observed reflection file represented 81% of the unique data to 3.56 Å resolution, the HTN3 data set had 79% of the unique data observed. The two sets of reduced data contained 190 common structure factors, and these were used to scale the HTN2 structure amplitudes more accurately to those of HTN3. Before carrying out any further scaling, an R-factor was calculated for the overlapping reflections using the equation:

$$R\text{-factor} = \frac{\sum \left|\, |F_{obs}|_{HTN3} - |F_{obs}|_{HTN2} \,\right|}{\sum |F_{obs}|_{HTN3}} \qquad (11)$$

and the value was found to be 0.098. In an effort to reduce the above R-factor and scale the overlapping structure factors (and ultimately the two sets of structure factors) more closely, the following ratio was calculated:

$$\sum |F_{obs}|_{HTN3} \Big/ \sum |F_{obs}|_{HTN2} \qquad (12)$$

which was found to be 0.96. Applying a scale factor of 0.96 to the HTN2 |F(hkl)| values and recalculating the R-factor for the overlapping amplitudes yielded a value of 0.091, indicating that the re-scaling brought about an improvement. To create a unique set of structure amplitudes to 2.8 Å resolution, all the HTN2 structure factors were multiplied by the scale factor of 0.96 and combined with the HTN3 reduced data, and the overlap amplitudes were averaged; the result was a set of 8917 unique |F(hkl)| values. The <|F(hkl)|²> versus <2θ> distribution for these hirudin-thrombin structure factors is shown in Figure 24.
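The scaling of two overlapping data sets described above can be sketched as follows. This is a toy illustration, not the procedure's actual code: the three amplitude pairs are invented, the function names are hypothetical, and the R-factor is assumed to be normalized by the sum of the reference (HTN3) amplitudes.

```python
def r_factor(f_ref, f_other):
    """Sum of absolute amplitude differences divided by the sum of reference amplitudes."""
    return sum(abs(a - b) for a, b in zip(f_ref, f_other)) / sum(f_ref)


def scale_factor(f_ref, f_other):
    """Ratio of summed amplitudes, as in equation (12)."""
    return sum(f_ref) / sum(f_other)


def average_overlaps(f_ref, f_other, scale):
    """Average each reference amplitude with its scaled counterpart."""
    return [(a + scale * b) / 2.0 for a, b in zip(f_ref, f_other)]


# Invented amplitudes for three reflections common to both data sets.
f_htn3 = [100.0, 200.0, 300.0]
f_htn2 = [110.0, 205.0, 310.0]

scale = scale_factor(f_htn3, f_htn2)
r_before = r_factor(f_htn3, f_htn2)
r_after = r_factor(f_htn3, [scale * f for f in f_htn2])
merged = average_overlaps(f_htn3, f_htn2, scale)
print(scale, r_before, r_after)
```

In this toy case the single overall scale factor lowers the R-factor, which is the same qualitative behaviour reported for the 190 common reflections.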
The completeness of this 2.8 Å resolution set of |F(hkl)| values was determined by loading these data into the PROTEIN [122] package of macromolecular crystallographic programs, and inserting a CHECK COMPLETENESS subcommand into the LOAD command file; the set of structure factors was found to correspond to 62% (8917/14362) of the possible unique reduced data.

[Figure 24. The <|Fobs|²> versus <2θ> Distribution for the 2.8 Å Resolution Hirudin-Thrombin Diffractometer Data Set.]

B. Post-Data Collection Processing of the 2.3 Å Resolution Area Detector Structure Factors

Although the MADNES-controlled FAST area detector diffractometer generated |Fobs(hkl)| values for each reflection measured in the hirudin-thrombin data set, these structure factors still had to be corrected for absorption, decay, and scaling before they could be used for x-ray crystal structure analysis. However, the previously described means of carrying out absorption and decay corrections cannot generally be used with an area detector system. Azimuthal angle ψ scan data cannot be collected for an empirical absorption correction, nor can specific reflections be measured periodically to carry out a conventional decay correction. Nonetheless, one can make use of the real power of the area detector, its ability to measure large numbers of symmetry-equivalent reflections in a data set, to correct for absorption and decay, as well as for scaling. By means of the program ABSCOR [123], the hirudin-thrombin FAST area detector diffractometer data were corrected for scaling, absorption, and decay as well as for the non-uniformity of response of the detector. The intensity data output file from the MADNES program (the so-called D 13 file) serves as the input file for ABSCOR.
In a generalized macromolecular crystallographic data collection, it is possible that the crystal will not be fully bathed in the primary x-ray beam, and that variable scattering volumes of the crystal will be irradiated during the course of the data collection. (In the hirudin-thrombin data collections discussed in this dissertation, the crystals should have been bathed uniformly in the x-ray beam throughout the course of the experiments.) In this situation, it is reasonable to assume that reflections measured in a batch, or in a block consisting of one or more contiguous batches, should have similar scale and temperature factors, which correct for changes in irradiated volume and for crystal decay due to x-ray exposure. One can represent a structure factor amplitude corrected for absorption, decay, scaling, and non-uniformity of response of the detector by the following formula [123]:

$$|F|_{corr,h(i)} = |F|_{obs,h(i)} \, S_i \, \exp\!\left[-B_i \sin^2\theta / \lambda^2\right] \times \left[A'_{0,h(i)}(T_{11},\ldots,T_{33}) \, A'_{s,h(i)}(T_{11},\ldots,T_{33})\right]^{-1/2} \times \left(K_{nonuniform}\right)^{1/2} \qquad (13)$$

where |F|obs-h(i) is the observed structure factor amplitude of reflection h of the ith batch; h is the triple index (hkl); i is the index of the reflection batch; Si is the individual scale factor of the ith reflection batch; Bi is the temperature factor of the ith reflection batch; A'0-h(i) is the transmission factor of the primary beam of reflection h(i); A's-h(i) is the transmission factor of the secondary beam of reflection h(i); T11,...,T33 are the components of the symmetric tensor T which describes the transmission properties of the crystal; and Knonuniform is the post-correction factor for the non-uniformity of response of the FAST area detector.
The transmission factors A'0-h(i) and A's-h(i) are obtained from the tensor ellipsoids:

$$A'^{-1/2}_{0,h(i)} = T_{11}u_0^2 + T_{22}v_0^2 + T_{33}w_0^2 + 2T_{12}u_0v_0 + 2T_{13}u_0w_0 + 2T_{23}v_0w_0 \qquad (14)$$

$$A'^{-1/2}_{s,h(i)} = T_{11}u_s^2 + T_{22}v_s^2 + T_{33}w_s^2 + 2T_{12}u_sv_s + 2T_{13}u_sw_s + 2T_{23}v_sw_s \qquad (15)$$

where these factors are generated by the following operations [123]:

$$A'^{-1/2}_{0,h(i)} = [\mathbf{v}_{0,h(i)}]^{T} \, T \, \mathbf{v}_{0,h(i)} \qquad (16)$$

$$A'^{-1/2}_{s,h(i)} = [\mathbf{v}_{s,h(i)}]^{T} \, T \, \mathbf{v}_{s,h(i)} \qquad (17)$$

with

$$\mathbf{v}_{0,h(i)} = \begin{pmatrix} u_0 \\ v_0 \\ w_0 \end{pmatrix} \quad \text{and} \quad \mathbf{v}_{s,h(i)} = \begin{pmatrix} u_s \\ v_s \\ w_s \end{pmatrix} \qquad (18)$$

being the primary and secondary direction vectors of the normalized beam in the goniometer head coordinate system for reflection h(i). The scale (Si) and temperature (Bi) factors for each block of reflections, and the symmetric transmission tensor matrix elements (T11, T22, T33, T12, T13, T23) for the crystal, are determined by sorting the reflection structure factors into symmetry-equivalent sets (including the Friedel mates, where |F(hkl)| and |F(-h-k-l)| constitute a Friedel pair) and minimizing by least squares methods the following sum of squared differences of pairs of symmetry-equivalent reduced data:

$$\sum_{h(i)} \sum_{j=1}^{k[h(i)]} \sum_{l=j+1}^{k[h(i)]} \left( \ln|F|_{corr,h(i),j} - \ln|F|_{corr,h(i),l} \right)^2 = \min \qquad (19)$$

where k[h(i)] is the number of reflections in a set of symmetry-equivalent reflections of index h of the ith reflection block. Natural logarithms are used so that the difference is linear with respect to ln Si and Bi, and the scale and temperature factors of the first block are fixed at 1.0 and 0.0, respectively. It is quite common for area detectors to display a non-uniformity in detector response; this problem is generally corrected by means of a pixel-to-pixel correction table obtained by an appropriate calibration method prior to the initiation of the data collection. The aforementioned least squares minimization allows one to introduce a non-uniformity of response post-correction.
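Equations (14) to (17) express each transmission term as a quadratic form of a beam direction vector in the symmetric tensor T. The short Python sketch below checks that the expanded form and the matrix form agree and combines the two terms as they enter equation (13). All function names and tensor values are hypothetical; this is not ABSCOR code.

```python
def symmetric_tensor(t11, t22, t33, t12, t13, t23):
    """Build the 3 x 3 symmetric tensor T from its six independent components."""
    return [[t11, t12, t13],
            [t12, t22, t23],
            [t13, t23, t33]]


def quadratic_form(t, v):
    """Matrix form of equations (16) and (17): v^T T v."""
    return sum(v[a] * t[a][b] * v[b] for a in range(3) for b in range(3))


def expanded_form(t11, t22, t33, t12, t13, t23, u, v, w):
    """Expanded form of equations (14) and (15)."""
    return (t11 * u * u + t22 * v * v + t33 * w * w
            + 2 * t12 * u * v + 2 * t13 * u * w + 2 * t23 * v * w)


def absorption_term(t, v0, vs):
    """[A'0 A's]^(-1/2) of equation (13): the product of the two quadratic forms."""
    return quadratic_form(t, v0) * quadratic_form(t, vs)


# Hypothetical tensor components and unit beam directions.
components = (1.05, 0.98, 1.02, 0.01, -0.02, 0.03)
tensor = symmetric_tensor(*components)
primary = (1.0, 0.0, 0.0)
secondary = (0.6, 0.8, 0.0)
print(absorption_term(tensor, primary, secondary))
```

For an isotropic crystal (T equal to the identity) and unit direction vectors, the term reduces to 1 and no absorption correction is applied.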
After the minimization of the scaling and absorption correction parameters has been completed, corrected structure factor amplitudes of the symmetry-equivalent reflections are compared. A deviation from the average value is expressed as a correction factor for particular pixel values. The table which results from this is composed of 64 8x8 pixel boxes, and it can be smoothed by various methods. The values from this table are the basis of the correction factor Knonuniform. In the ABSCOR processing of the hirudin-thrombin FAST output data, 183,791 reflections were input, of which 164,250 were flagged as 'good' and used for further processing and 19,541 with error flags were ignored. The block size used to restrict scale and temperature factors was 2 sequentially measured reflection batches, and since the batch size used in the data collection was 2.5° in φ, the block size was 5°. Previous work has shown that block sizes of 3° to 6° in φ give reliable and reproducible results [123]. The end result was that the 164,250 FAST |Fobs| were sorted into 31 blocks, and that after a complete least squares minimization of equation (19), the six matrix elements of the symmetric transmission tensor, as well as the 31 scale factors and 31 temperature factors, were determined. The complete ABSCOR processing of the hirudin-thrombin FAST reflection output data proceeded via 3 stages, and the results are summarized in Table 4. The progress at each stage was assessed by the calculation of the R-factor presented below:

$$R_{merge} = \frac{\sum_h \sum_j \left| I(h)_j - \langle I(h) \rangle \right|}{\sum_h \sum_j I(h)_j} \qquad (20)$$

where h represents the triple index (hkl), j runs over the equivalent reflections for a given I(h), and <I(h)> is their mean. Prior to any minimization of equation (19), the ABSCOR output revealed that the 164,250 reflection intensities corresponded to 23,521 unique reflections, and that the initial Rmerge between individual reflections was 0.168.
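Equation (20) can be illustrated with a short Python function. The sketch assumes the measurements have already been grouped by unique index (the symmetry reduction itself is not shown), and it skips reflections measured only once, which is an added assumption; the intensities are invented.

```python
def r_merge(groups):
    """Rmerge of equation (20) for a mapping {unique hkl: [equivalent intensities]}."""
    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        if len(intensities) < 2:
            # Assumption: a reflection measured once carries no agreement information.
            continue
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(value - mean) for value in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Invented measurements for three unique reflections.
measurements = {
    (1, 0, 0): [100.0, 110.0],
    (0, 1, 0): [50.0, 50.0, 56.0],
    (0, 0, 1): [75.0],
}
print(round(r_merge(measurements), 4))
```

A lower value means the symmetry-equivalent measurements agree better, which is why the statistic is recomputed after each correction stage.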
In the first stage of the ABSCOR processing, scale and temperature factors were refined and applied to the input |Fobs|, and the resulting Rmerge after scaling was 0.120 between individual reflections. In the second stage of the ABSCOR processing, the scale factors, temperature factors, and the six components of the symmetric transmission tensor T were refined and applied to the |Fobs|, which resulted in an Rmerge of 0.111 between individual reflections. Finally, in the third stage of the ABSCOR processing, the Knonuniform detector non-uniformity of response correction factors were applied and the resultant Rmerge was 0.108. The resultant output file from the ABSCOR processing contained data for 41,062 reflections, in which there were 23,521 unique data and 17,541 Friedel mate data. The unique and Friedel mate data were averaged by means of the PROTEIN program package to generate a set of 23,521 unique structure factors. The Rmerge calculated in PROTEIN between averaged Friedel pairs was 0.042 and the data set was found to be 91% complete to 2.3 Å resolution.

Table 4. Reflection Statistics on the Hirudin-Thrombin FAST Data Set Evaluated by MADNES and Corrected using ABSCOR

Total number of reflections                                      164,250
Number of unique reflections                                      23,521
Rmerge* - uncorrected                                              0.168
Rmerge* - scaling applied                                          0.120
Rmerge* - scaling & absorption corrections applied                 0.111
Rmerge* - scaling, absorption, & post-correction of
          detector non-uniformity of response applied              0.108
Rmerge**                                                           0.042
Completeness of data to 2.3 Å resolution                           0.91

Rmerge = Σ|I - <I>| / Σ I
*  = between individual reflections
** = between averaged Friedel pairs

IV MOLECULAR REPLACEMENT CALCULATIONS

A. Introduction

In any macromolecular crystallographic investigation, once a requisite set of structure factor amplitudes has been obtained, the next major task is to work toward generating an electron density map for structure interpretation and model-building.
The electron density at a point (x,y,z) in the crystal can be determined by computing the following Fourier summation:

$$\rho(x,y,z) = \frac{1}{V} \sum_h \sum_k \sum_l |F(hkl)| \, \exp\{i\alpha(hkl)\} \, \exp\{-2\pi i (hx + ky + lz)\} \qquad (21)$$

where |F(hkl)| is the amplitude of the reflection (hkl), α(hkl) is the phase of the reflection, V is the volume of the unit cell, and the summations extend over all of the observable reflections. However, calculating such an electron density map is non-trivial, as it requires a knowledge of the phase angles, which cannot be measured experimentally. In this investigation, the molecular replacement method [124] or Patterson search method [125] was used to overcome this "phase problem". The molecular replacement method has two main types of applications [126]. In crystals which possess more than one molecule per asymmetric unit, the technique can be employed to determine the orientation of non-crystallographic symmetry elements within an asymmetric unit of the unit cell. The method can also be used to generate a set of trial phases for an unknown structure by orienting and positioning a similar molecule of known tertiary structure in the unit cell of the unknown structure. This latter application was employed to solve the structure of the rHV2-K47-human α-thrombin complex using PPACK-human α-thrombin as a model [127]. The structure solution of the hirudin-thrombin complex was a problem ideally suited for the molecular replacement method. First of all, both the model and the crystal contained the same protein, human α-thrombin. Second, thrombin accounts for roughly 84% of the total molecular weight of the complex. Thus, it seemed reasonable to assume that one could generate a good initial Fourier map based on a Patterson search using the PPACK-thrombin structure as a model.
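The role of the phases in equation (21) can be seen in a one-dimensional toy calculation. The Python sketch below is an illustration under stated assumptions, not part of the original analysis: a single point atom is placed at fractional coordinate 0.3, its structure factors are generated analytically, and the density is summed once with the true phases and once with all phases set to zero. The function names and the choice of eleven terms are hypothetical.

```python
import cmath
import math


def point_atom_structure_factors(x0, h_max):
    """Toy F(h) for a single point atom at fractional coordinate x0."""
    return {h: cmath.exp(2j * math.pi * h * x0) for h in range(-h_max, h_max + 1)}


def density(x, amplitudes, phases, volume=1.0):
    """One-dimensional analogue of equation (21)."""
    total = 0j
    for h, amplitude in amplitudes.items():
        total += amplitude * cmath.exp(1j * phases[h]) * cmath.exp(-2j * math.pi * h * x)
    # The imaginary parts cancel because the terms for h and -h are complex conjugates.
    return (total / volume).real


factors = point_atom_structure_factors(x0=0.3, h_max=5)
amplitudes = {h: abs(f) for h, f in factors.items()}
true_phases = {h: cmath.phase(f) for h, f in factors.items()}
zero_phases = {h: 0.0 for h in factors}

grid = [i / 20 for i in range(20)]
peak_with_phases = max(grid, key=lambda x: density(x, amplitudes, true_phases))
peak_without_phases = max(grid, key=lambda x: density(x, amplitudes, zero_phases))
print(peak_with_phases, peak_without_phases)
```

With the true phases the density peaks at the atomic position, whereas the same amplitudes with zero phases place the peak at the origin, so the amplitudes alone do not locate the atom.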
The molecular replacement method is often referred to as a Patterson search method since calculations are performed using the Patterson functions of the known model and the unknown structure of interest. A Patterson function is a Fourier synthesis which uses the square of the structure factor amplitudes, has the form:

$$P(uvw) = \frac{2}{V} \sum_{0}^{+h} \sum_{0}^{+k} \sum_{0}^{+l} |F(hkl)|^2 \cos\{2\pi(hu + kv + lw)\} \qquad (22)$$

and produces peaks corresponding to all of the interatomic vectors [128]. A peak at the point (u,v,w) implies that there exist atoms at (x1,y1,z1) and (x2,y2,z2) such that u = x1 - x2, v = y1 - y2, and w = z1 - z2. The height of a peak in the Patterson function is approximately proportional to the product of the atomic numbers of the atoms which define the ends of the vector, (Z1 x Z2). Moreover, if the unit cell of a molecule contains N atoms there will be N² peaks in the Patterson map, of which there will be N origin or self-vector peaks and (N² - N) non-origin peaks (half of which are centrosymmetric). Thus, the Patterson function is a unique Fourier synthesis which contains relative structural information, but which requires no knowledge of the phases and can be calculated using experimentally obtainable quantities, the structure factor amplitudes. The Patterson search conducted with PPACK-human α-thrombin as a model to generate a trial structure of the hirudin-thrombin complex proceeded in three stages common to molecular replacement calculations [124]: (1.) The relative orientation or rotational relationship of the thrombin model with respect to the unknown hirudin-thrombin structure was determined by "rotation function" [126] calculations involving the intramolecular Patterson vectors of the model and the crystal Patterson synthesis. (2.)
The correct translation of the properly oriented thrombin model in the unit cell of the hirudin-thrombin crystal was determined by "translation function" [129,130] calculations involving a modified crystal Patterson map and intermolecular model Patterson vectors. Once the first two stages were successfully achieved, the equivalence between a point x' in the unknown hirudin-thrombin structure and the corresponding point x in the model thrombin structure could be expressed by the following relationship:

$$\mathbf{x}' = [C]\,\mathbf{x} + \mathbf{d} \qquad (23)$$

where [C] represents the rotation matrix determined from the first stage of the Patterson search, and d represents the translation vector determined from the second stage. (3.) The properly oriented and translated thrombin model in the hirudin-thrombin unit cell was used to calculate a set of phase angles which were combined with 8.0 to 3.0 Å resolution observed structure factors to produce a trial electron density map for model-building. The three stages in the Patterson search calculations will now be discussed in more detail.

B. The Rotation Function Calculations

A model Patterson vector set was produced from the coordinates of PPACK-thrombin [6]. The thrombin model (284 residues) contained all but 11 of the protein residues in the PPACK-thrombin crystal structure; the first five residues (1H-1D) and the last four residues (14K-15) of the thrombin A chain, as well as the last 2 residues of the thrombin B chain (246-247), were deleted as they had poor or no density in the PPACK-thrombin electron density maps (Table 1). The search model also contained 24 internal water oxygen atoms, but it lacked the covalently-bound peptidyl inhibitor PPACK. Triclinic P1 structure factors were obtained by Fourier inversion of a model electron density map using the program FFTSF, which contains two main sub-programs, RHOGEN and PISF.
RHOGEN calculated an electron density map of the model positioned in a triclinic P1 orthogonal cell of 120 x 120 x 120 Å³ using 8.0 to 3.0 Å resolution data and an overall temperature factor for the atoms of 20 Å². PISF then calculated structure factors by fast Fourier inversion of the electron density. Finally, these calculated structure factors were used to generate an 8.0 to 3.0 Å resolution Patterson function of the thrombin model in a 120 x 120 x 120 Å³ orthogonal cell.

However, not all of the peaks in this model Patterson map were used for the rotational search with the hirudin-thrombin crystal Patterson function. In the rotation function calculation one is only interested in intramolecular vectors; thus only Patterson vectors of length greater than 3.0 Å and less than 20.0 Å were selected. The lower limit of 3.0 Å eliminates the huge origin peak from the calculation. The upper limit of 20.0 Å is meant to prevent intermolecular vectors from entering the rotation function calculation. Moreover, as PPACK-thrombin is a nearly spherical molecule of dimensions 45 x 45 x 50 Å³ [5], an upper limit of 20 Å for intramolecular vectors appeared to be a good estimate. 'Weak' or small magnitude Patterson peaks were excluded from the calculation by using an appropriate cutoff limit. Finally, as only half of the complete triclinic model Patterson map is unique, the peak search was restricted to 60 Å along the w axis (although any of the three axes could have been chosen). The end result was that 7565 model Patterson peaks were selected for the rotation function calculation.

The hirudin-thrombin crystal Patterson function was calculated using observed diffractometer structure factor amplitudes over the same resolution range. The rotational search was conducted with the SEARCH routine in the PROTEIN program package.
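The vector-selection criteria described above can be illustrated with a short sketch. It is only an analogy under stated assumptions: the thesis selected peaks from a calculated model Patterson map, whereas the hypothetical function `select_intramolecular_vectors` below works directly on interatomic difference vectors of a toy coordinate list, and it ignores the peak-height cutoff.

```python
import math
from itertools import permutations

def select_intramolecular_vectors(coords, r_min=3.0, r_max=20.0):
    """Keep difference vectors r_i - r_j with r_min < |v| < r_max and w >= 0.

    r_min removes the origin region, r_max removes long vectors that would
    be intermolecular in a crystal, and w >= 0 keeps one half of the
    centrosymmetric vector set (vectors lying exactly in the w = 0 plane
    are kept together with their mates, a simplification of this sketch).
    """
    selected = []
    for (xi, yi, zi), (xj, yj, zj) in permutations(coords, 2):
        u, v, w = xi - xj, yi - yj, zi - zj
        length = math.sqrt(u * u + v * v + w * w)
        if r_min < length < r_max and w >= 0.0:
            selected.append((u, v, w))
    return selected

# Toy coordinates in Å (illustrative values only, not from the thrombin model).
toy_atoms = [(0.0, 0.0, 0.0), (4.0, 0.0, 1.0), (0.0, 0.0, 25.0), (1.0, 1.0, 2.0)]
toy_vectors = select_intramolecular_vectors(toy_atoms)
```

In the toy set, the 25 Å vector is rejected as too long and the 2.4 Å vector as too short, which mirrors the roles of the 20.0 Å and 3.0 Å limits in the text.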
The model thrombin Patterson vector set was rotated and interpolated to the nearest grid points in the hirudin-thrombin crystal Patterson function, and a product function was calculated. This process was repeated until all angles over a specified angular range and increment size were examined. The set of angles yielding the highest product function corresponded to the transformation necessary to bring the model and crystal Patterson vectors into the same orientation. The three Huber ("ROH") angles [122] were used to conduct the rotational Patterson search. A rotational operation in this convention corresponds to a rotation of the Cartesian axes by the first angle about the z axis, a rotation by the second angle about the new x axis, and a rotation by the third angle about the new y axis. The Huber angle rotation range for the calculation was 0-100° for the first angle, 0-180° for the second, and 0-180° for the third. All angles were incremented by 5° in the calculations. The highest peak in the search occurred at the Huber angles (5.0°, 175.0°, 80.0°), and it was 7.7σ above the mean. To refine this solution, a finer rotational search was carried out over the range 0-10° for the first angle, 170-180° for the second, and 75-85° for the third, using an angular increment of 1.0°. The refinement improved the peak height to 8.1σ above the mean, and the refined angles were (3.5°, 176.0°, 81.5°).

C. The Translation Function Calculations

Once the orientation of the model molecule is found in the unit cell of the target crystal, the approximate position is found by conducting a translation search. However, in achieving this translation result one has actually positioned not just one molecule, but the entire set of symmetry-related molecules in the crystal unit cell. This is due to the fact that the relative position between symmetry-related molecules is fixed. For example, the hirudin-thrombin complex crystallizes in space group P4₃2₁2 with one molecule per asymmetric unit and 8 molecules in the unit cell.
The relative positions of the 8 symmetry-equivalent molecules in the unit cell are given in the following relations [131]:

Molecule 1:   x       y       z          Molecule 5:   y       x       -z
Molecule 2:   -x      -y      1/2+z      Molecule 6:   -y      -x      1/2-z
Molecule 3:   1/2-y   1/2+x   3/4+z      Molecule 7:   1/2-x   1/2+y   3/4-z
Molecule 4:   1/2+y   1/2-x   1/4+z      Molecule 8:   1/2+x   1/2-y   1/4-z      ...(24)

Moreover, the relative positions of the 7 intermolecular vectors are fixed as well; the complete intermolecular vector set is listed below:

1-2:   2x             2y             -1/2
1-3:   -1/2+(x+y)     -1/2-(x-y)     -3/4
1-4:   -1/2+(x-y)     -1/2+(x+y)     -1/4
1-5:   (x-y)          (y-x)          2z
1-6:   (x+y)          (y+x)          -1/2+2z
1-7:   -1/2+2x        -1/2           -3/4+2z
1-8:   -1/2           -1/2+2y        -1/4+2z      ...(25)

The positioning of the thrombin model in the hirudin complex unit cell was achieved by means of a set of programs written by E.E. Lattman (and modified by J. Deisenhofer, R. Huber, and M. Schneider) which calculates a translation function as described by Crowther and Blow [130]. The Crowther and Blow translation function has the form:

$$T_s(\mathbf{t}) = \sum_{\mathbf{h}} \left( |F_{obs}(\mathbf{h})|^2 - \sum_{i=0}^{n-1} |F_m(\mathbf{h}A_i)|^2 \right) F_m(\mathbf{h})\, F_m^{*}(\mathbf{h}A)\, \exp(-2\pi i\, \mathbf{h}\cdot\mathbf{t}) \tag{26}$$

where h represents the triple index (hkl); Fobs(h) and Fm(h) are the observed and model structure factors at triple index h and are on an absolute scale; Fm*(hA) is the complex conjugate of the model structure factor of a symmetry-mate molecule related by the symmetry operator A; t is the translation vector; and the term |Fm(hAi)|² subtracts out the intramolecular vectors from the observed Patterson function at each of the n-1 symmetry related positions using the matrix Ai, A0 being the identity matrix of the model. The above translation function Ts(t) is specific for the intermolecular vector at the translation t between a model molecule and its symmetry-mate related by the symmetry operator A; there will be n-1 such functions for the n-1 unique intermolecular vectors relating a given molecule to its n-1 symmetry-mate molecules in the unit cell of a crystal.
Thus, the Crowther and Blow translation function makes use of the intermolecular vector equations defined by the space group of the crystal (relation 25 above) to determine the correct position of the properly-oriented model in the crystal cell. Basically, for each unique intermolecular vector translation function, all the intermolecular vectors are calculated for the model and its appropriate symmetry mate. Each resulting set of intermolecular vectors is then translated throughout the crystal Patterson synthesis, which has been modified to remove the intramolecular vectors. A product function is calculated between the modified crystal Patterson map and the model intermolecular vector set at each point in the translation search. At the correct position of the model in the crystal, the product function will have a maximum value. In many cases, as in the hirudin-thrombin search, if all the translation functions for the set of unique intermolecular vector equations are calculated, the correct position of the model in the crystal cell will be overdetermined. In other words, the correct position of the model should account for a high (if not the highest) product function peak in each of the intermolecular translation function searches.

The translational searches were conducted using 8.0 to 3.0 Å resolution crystal and thrombin model structure factors, and a translation function calculation was conducted for each unique intermolecular vector in the P4₃2₁2 space group (relation 25 above). The results of the search are listed in Table 5. The highest peak in all of the calculations occurred in Harker sections at consistent locations which positioned the molecule at x = 67.42 Å, y = 3.00 Å, and z = 52.91 Å, and verified the correct enantiomorphic space group to be P4₃2₁2 as opposed to P4₁2₁2.
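The consistency of this solution can be checked with a small calculation. The sketch below applies the equivalent positions of relation (24) to the reported molecular position, forms the seven intermolecular vectors of relation (25), and compares them with the peak positions listed in Table 5. The cell dimensions, molecular position and peak positions are taken from the text; the names `SYMOPS`, `TABLE5_PEAKS` and `intermolecular_vector` are introduced here purely for illustration, and the 1 Å tolerance is an assumption based on the map grid spacing of roughly 0.9 Å.

```python
CELL = (90.54, 90.54, 132.04)      # a, b, c in Å
POSITION = (67.42, 3.00, 52.91)    # molecular position in Å

# Equivalent positions as listed in relation (24), in fractional coordinates.
SYMOPS = {
    2: lambda x, y, z: (-x, -y, 0.5 + z),
    3: lambda x, y, z: (0.5 - y, 0.5 + x, 0.75 + z),
    4: lambda x, y, z: (0.5 + y, 0.5 - x, 0.25 + z),
    5: lambda x, y, z: (y, x, -z),
    6: lambda x, y, z: (-y, -x, 0.5 - z),
    7: lambda x, y, z: (0.5 - x, 0.5 + y, 0.75 - z),
    8: lambda x, y, z: (0.5 + x, 0.5 - y, 0.25 - z),
}

# Highest-peak positions (U, V, W in Å) for vectors 1-2 ... 1-8 from Table 5.
TABLE5_PEAKS = {
    2: (43.94, 6.00, 66.30), 3: (24.87, 72.01, 33.21), 4: (18.98, 24.87, 99.38),
    5: (64.52, 26.46, 105.75), 6: (70.28, 70.41, 39.36), 7: (89.26, 45.50, 6.36),
    8: (45.50, 51.72, 72.81),
}

def intermolecular_vector(n):
    """Vector from molecule n to molecule 1, reduced into the unit cell (Å)."""
    frac = [p / c for p, c in zip(POSITION, CELL)]
    mate = SYMOPS[n](*frac)
    return tuple(((f - m) % 1.0) * c for f, m, c in zip(frac, mate, CELL))

def max_deviation(n):
    """Largest per-axis difference (Å) from the Table 5 peak, allowing for lattice wrap."""
    calc = intermolecular_vector(n)
    return max(min(abs(a - b), c - abs(a - b))
               for a, b, c in zip(calc, TABLE5_PEAKS[n], CELL))

deviations = {n: max_deviation(n) for n in SYMOPS}
```

Under these assumptions each calculated vector falls within about one grid spacing of the corresponding tabulated peak, which is the sense in which the position is overdetermined by the seven searches.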
The correct position of the molecule also gave an outstanding peak in each Harker section; the highest peak in each section was on average 13.3σ above the mean, and 7.5σ higher than the next highest peak. The R-value for this solution, where the R-value is defined as

$$R = \frac{\sum_{\mathbf{h}} \big|\, |F_{obs}(\mathbf{h})| - |F_{cal}(\mathbf{h})| \,\big|}{\sum_{\mathbf{h}} |F_{obs}(\mathbf{h})|}, \tag{27}$$

was found to be 0.39.

Table 5. Results of the Translation Search

          Harker      Highest peak position (Å)      Highest peak    Second highest
Vector    section       U         V         W        height (σ)      peak height (σ)
1-2       W=1/2       43.94      6.00     66.30        13.8             6.7
1-3       W=3/4       24.87     72.01     33.21        13.9             4.7
1-4       W=1/4       18.98     24.87     99.38        13.9             4.8
1-5       -           64.52     26.46    105.75        12.5             6.7
1-6       -           70.28     70.41     39.36        13.0             6.6
1-7       V=1/2       89.26     45.50      6.36        13.6             6.2
1-8       U=1/2       45.50     51.72     72.81        12.3             5.2

Position of the molecule (Å): x = 67.42, y = 3.00, z = 52.91.

Peak height is given in standard deviations above the mean value of the function. The vectors are as follows:

1-2:   2x,            2y,            -1/2
1-3:   -1/2+(x+y),    -1/2-(x-y),    -3/4
1-4:   -1/2+(x-y),    -1/2+(x+y),    -1/4
1-5:   (x-y),         (y-x),         2z
1-6:   (x+y),         (y+x),         -1/2+2z
1-7:   -1/2+2x,       -1/2,          -3/4+2z
1-8:   -1/2,          -1/2+2y,       -1/4+2z

The rotational and translational parameters were further refined with the rigid body refinement-Fourier transform fitting program TRAREF [132]. The program TRAREF refined the orientational and translational parameters of the thrombin model by fitting its molecular Fourier transform, or search model structure factors, to the observed hirudin-thrombin structure factor amplitudes. Model structure factors were calculated based on the position of the model in the crystal cell determined from the translation search (Table 5). The fitting was done by calculating derivatives of the molecular Fourier transform with respect to the three orientational angles, the three positional parameters, a scaling factor, and an overall temperature factor. The system of equations was then reduced to normal equations and improved parameters were determined. The TRAREF refinement shifted the center of gravity position for the thrombin model in the crystal to x = 67.44 Å, y = 2.87 Å, and z = 52.87 Å, and the R-value for this new position was 0.34.

D. The Initial Hirudin-Thrombin Electron Density Map Calculation

An initial hirudin-thrombin Fourier synthesis at 8.0 to 3.0 Å resolution was calculated using coefficients of the form

$$W\,(2|F_{obs}(\mathbf{h})| - |F_{cal}(\mathbf{h})|)\,\exp(i\,\alpha_c(\mathbf{h})), \tag{28}$$

where h represents the triple index (hkl); |Fobs(h)| is the observed structure factor amplitude at triple index h; |Fcal(h)| and αc(h) represent the model structure factor amplitude and phase at triple index h calculated from the TRAREF-refined position of the thrombin model in the crystal; and W represents the Sim weight [133,134] for the model phase angle αc(h). The map size was 100 x 100 x 140 grids, which corresponded to 0.91 Å/grid along the a and b axes, and 0.94 Å/grid along the c axis.

Such so-called "2Fo-Fc" maps are commonly used in macromolecular crystallography when one is working with an incomplete structure, as in this situation where the phasing model lacked hirudin and solvent structure. The map is basically a point-by-point summation of an Fo and an (Fo-Fc) or delta F synthesis, and is useful for several reasons. First of all, like an Fo synthesis, it displays the input model. In addition, like a delta F synthesis, the map also indicates how additional structure can be accommodated, or whether existing structure can be reoriented or removed from the present model [135].

The Sim weighting scheme for the electron density map calculation was designed to weight each αc(h) by the probability of it being correct, and is based on the notion that the best weights for structure amplitudes minimize the mean square error in the electron density map due to phase angle error.
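As a numerical illustration of the Sim weight, the sketch below evaluates the ratio of first- to zero-order modified Bessel functions, I1(X)/I0(X), which is the closed form that the derivation following this passage arrives at. It is a minimal sketch and not the routine used in the thesis: the function name `sim_weight` and the choice of a simple periodic quadrature of the integral representation of the Bessel functions are assumptions made here to keep the example free of external libraries.

```python
import math

def sim_weight(x, n=2048):
    """Approximate I1(x)/I0(x) by averaging over one period of the phase error.

    I0(x) and I1(x) are proportional to the integrals of exp(x cos(xi)) and
    cos(xi) exp(x cos(xi)) over 0..2*pi. The factor exp(-x) is applied to
    both so that large x does not overflow; it cancels in the ratio.
    """
    num = 0.0
    den = 0.0
    for j in range(n):
        xi = 2.0 * math.pi * j / n
        p = math.exp(x * (math.cos(xi) - 1.0))
        num += math.cos(xi) * p
        den += p
    return num / den

# Weights for a poorly determined, an intermediate and a well determined phase.
weights = [sim_weight(x) for x in (0.0, 1.0, 50.0)]
```

The weight rises monotonically from zero toward one as X increases, so reflections whose amplitudes are large relative to the unaccounted-for scattering contribute most strongly to the map.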
Based on previous work by Blow and Crick [136], Sim showed that such weights are defined as

$$W = \int_{0}^{2\pi} \cos\xi \; p(\xi)\, d\xi, \tag{29}$$

where

$$\xi = \alpha(\mathbf{h}) - \alpha_c(\mathbf{h}) \tag{30}$$

and the probability that the phase difference lies between ξ and ξ + dξ is given by the probability function p(ξ). This probability function has the form

$$p(\xi) = \frac{\exp(X\cos\xi)}{2\pi\, I_0(X)}, \tag{31}$$

where I0 is a zero-order modified Bessel function, and X is given by

$$X = \frac{2\,|F_{obs}(\mathbf{h})|\,|F_{cal}(\mathbf{h})|}{I_u} \tag{32}$$

in which Iu represents the contribution from the "unknown structure", or the atoms in the structure not included in the model. In the Sim weighting scheme used to produce the hirudin-thrombin trial electron density map, Iu was approximated by the quantity

$$I_u \approx \left\langle \big|\, |F_{obs}|^2 - |F_{cal}|^2 \,\big| \right\rangle = \frac{\sum_{\mathbf{h}} \big|\, |F_{obs}(\mathbf{h})|^2 - |F_{cal}(\mathbf{h})|^2 \,\big|}{N}, \tag{33}$$

where N is the number of observed structure factors included in the calculation, and Iu was calculated in 10 shells of equal thickness in reciprocal space. The Sim weight definition given above in equation (29) can be further simplified to the form

$$W = \frac{I_1(X)}{I_0(X)}, \tag{34}$$

where I1 is a first-order modified Bessel function. The Sim weight is zero when X = 0 and approaches 1.0 as X tends to infinity; a plot of the Sim weighting factor W against X is given in Figure 25 [137].

The first Fourier map was displayed on an Evans and Sutherland PS390 interactive graphics display system and it was encouraging, as it clearly had density to accommodate hirudin as well as thrombin. The initial model-building which was conducted using this density map, as well as the further graphics interventions and EREF energy refinement which led to a mildly-refined 2.3 Å resolution hirudin-thrombin structure [127], will be summarized next.

Figure 25. Plot of Sim weighting factor W versus X, where W = I1(X)/I0(X) and X = 2|Fobs(h)||Fcal(h)|/Iu (taken from [137]). I1(X) and I0(X) are first- and zero-order modified Bessel functions, respectively.
Iu is an estimate of the contribution from the unknown structure, or that which is not accounted for in the model.

V MODEL-BUILDING AND EREF REFINEMENT

The hirudin-thrombin complex was initially refined to an R-value of 0.193 with 7.0 to 2.3 Å resolution data by an iterative process of model-building and EREF refinement [127]. The model-building work was conducted interactively using the PSFRODO version [138] of the molecular graphics program FRODO [139] by fitting protein and solvent atoms in Fourier maps viewed on an Evans and Sutherland PS390 graphics display system. The model-building was initiated on the 3.0 Å (2Fo-Fc) Sim-weighted Fourier map resulting from the Patterson search calculations with the thrombin model. The map clearly displayed density for thrombin, which most of the model thrombin structure could easily accommodate, and had a significant fraction of additional density for the recombinant hirudin. At this point, the coordinates of the hirudin structure determined using two-dimensional NMR techniques by Folkers et al. [79] proved to be very useful in tracing residues 3' to 48' (where a prime after a number designates a hirudin residue) of the NH2-terminal domain of hirudin. The most exciting aspect of these initial model-building studies was to see continuous density for the last ten COOH-terminal residues of the hirudin molecule, especially for the sidechains of Phe 56' and Tyr 63'.

This initial hirudin-thrombin model was refined using the energy-restraint crystallographic refinement program EREF [140], and the resulting EREF structure was inspected and modified at the interactive graphics display. Sim-weighted density maps, namely positive (2Fo-Fc) maps and both negative and positive (Fo-Fc) Fourier maps, were used in the course of model-building. The cyclic model-building and refinement procedure was repeated six times with gradual increases in resolution to 2.3 Å; the pertinent details of this work are summarized in Table 6.
In the second and later stages of this process, only the FAST area detector structure factors were used. Water molecule oxygen atoms were inserted in the third and later stages. If a strong peak (generally > 3.5σ(Δρ)) occurred in both the 7 to 2.3 Å resolution (2Fo-Fc) and (Fo-Fc) Fourier maps, and was in a stereochemically reasonable position for a water molecule, a water oxygen atom was inserted. Individual temperature factors were also refined. After the six rounds of model-building and refinement, the structure of the complex had an R-value of 0.193 for 21,960 reflections between 7.0 and 2.3 Å resolution. The structure included 207 water molecules, had a mean individual temperature factor of 35.7 Å², and the r.m.s. deviations for bond lengths and bond angles were 0.013 Å and 2.4°, respectively. The final model parameters of the EREF-refined hirudin-thrombin complex are summarized in Table 7.

Table 6. Course of Model-Building and EREF Refinement

                     Resolution    No. of     Energy
Stage    R-value        (Å)        atoms      (kcal/mole)
0        0.349          3.0        -          -
1        0.245          2.3        2495       -1458
2        0.291          2.5        2495       -1721
3        0.261          2.3        2677       -1763
4        0.221          2.3        2787       -2175
5        0.205          2.3        2848       -2128
6        0.193          2.3        2915       -2221

Stage 0: Rotation, translation structure; Nicolet P3/F data set used.
Stage 1: Model rebuilt on display (1); 167 protein atoms added; 8 cycles of EREF including resolution extension.
Stage 2: Model rebuilt on display (2); 9 cycles of EREF including resolution extension and 2 cycles of B-factor refinement; FAST data set used.
Stage 3: Model rebuilt on display (3); 182 protein atoms added; 14 cycles of EREF including resolution extension and 4 cycles of B-factor refinement.
Stage 4: Model rebuilt on display (4); 17 protein atoms and 93 water molecules added; 15 cycles of EREF and 4 cycles of B-factor refinement.
Stage 5: Model rebuilt on display (5); 10 protein atoms deleted and 71 water molecules added; 19 cycles of EREF refinement and 4 cycles of B-factor refinement.
Stage 6: Model rebuilt on display (6); 24 protein atoms and 43 water molecules added; 13 cycles of EREF and 4 cycles of B-factor refinement.

Table 7. Final Model Parameters of the EREF-refined rHV2-K47-Human α-Thrombin Structure.

Number of protein atoms
   a) Thrombin                                         2296
   b) Hirudin                                          411
Number of solvent atoms                                207
r.m.s. deviation from target values
   Bond lengths (Å)                                    0.013
   Bond angles (deg.)                                  2.36
Total internal energy (kcal/mole)                      -2221
Resolution range (Å)                                   7.0 - 2.3
Number of unique reflections used for refinement       21960
R-value                                                0.193
Mean B value (Å²)                                      35.7

The EREF macromolecular refinement program simultaneously minimizes a realistic potential energy function and a crystallographic residual. The complete potential energy function includes terms for bond stretching, bond angle bending, torsion potentials, and non-bonded and electrostatic interactions, and has the following form [141]:

$$E = \sum_{\text{bonds}} \tfrac{1}{2} K_b (b_i - b_0)^2 + \sum_{\text{bond angles}} \tfrac{1}{2} K_\tau (\tau_i - \tau_0)^2 + \sum_{\text{torsion angles}} K_\theta \{1 + \cos(m\theta_i + \delta)\} + \sum_{\text{non-bonded interactions}} \left( \frac{A}{r_i^{12}} + \frac{B}{r_i^{6}} \right) + \sum \frac{q_i q_j}{r} \tag{35}$$

where Kb is the bond stretching force constant; b0 is the equilibrium bond length; Kτ is the bond angle bending force constant; τ0 is the equilibrium bond angle; Kθ is the torsion barrier; m is the periodicity of the torsion barrier; δ is the phase of the barrier; and A and B are the coefficients of the Lennard-Jones potential for non-bonded interactions. The Lennard-Jones potential parameters used were those specified by Levitt [141], and the values of the force constants were derived from the vibrational spectra of small molecules. Moreover, the bond angle and torsion angle force constants were corrected for the missing hydrogen atoms in the structure [142]. The crystallographic residual has the form

$$X = \sum_{\mathbf{h}} \left( |F_{obs}(\mathbf{h})| - |F_{cal}(\mathbf{h})| \right)^2. \tag{36}$$

A cycle of EREF refinement proceeds in three steps [142].
In the first step, the system of normal equations for the diagonal-matrix least-squares refinement of atomic parameters is developed from an (Fo-Fc) difference Fourier map. In step two, the potential energy of the model is minimized along with the crystallographic residual to refine atomic positions. If atomic temperature factors are refined, this minimization is bypassed. In the third step, new structure factors are calculated from the model resulting from step two.

The system of normal equations generated in step 1 of a refinement cycle uses the (Fo-Fc) difference Fourier map to minimize the crystallographic residual in equation (36). These equations can be compactly represented by the expression

$$A \cdot dp = b \tag{37}$$

where dp represents the vector of desired parameter shifts (positions, occupancies, B-values), and the matrix A is approximated by a diagonal matrix in which the diagonal elements are estimated for each atom type. The components of the vector b, which represent the shifted parameters of vector dp, are obtained by convoluting the Fourier transform of the atomic form factor (i.e. the density about a given atom) with the gradient of the (Fo-Fc) map around the atomic site in the case of a positional refinement, or with the difference map itself in the case of occupancy or B-factor refinement. The convolution is generally calculated in a box of 3 x 3 x 3 grid points around the atomic positions, and the grid spacing used is approximately one-third of the maximum resolution in angstrom units.

The function which is minimized in step 2 of an EREF cycle is

$$R = E + kX \tag{38}$$

where E represents the conformational energy of equation (35), X represents the crystallographic residual of equation (36), and k is a weighting factor which can be used to emphasize or de-emphasize the diffraction pattern in the minimization process. For example, when k = 0 a pure energy refinement is conducted, and when k is large, an essentially pure crystallographic refinement is performed.
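The role of the weighting factor k can be seen in a one-parameter toy problem. The sketch below is not EREF: it assumes, purely for illustration, a harmonic energy term E = (K/2)(p - p_energy)² and a residual X = (p - p_xray)², for which the minimum of E + kX can be written in closed form. The function name `combined_minimum` and all numerical values are hypothetical.

```python
def combined_minimum(p_energy, p_xray, k, force_constant=1.0):
    """Minimiser of E + k*X for E = K/2 (p - p_energy)^2 and X = (p - p_xray)^2.

    Setting the derivative K (p - p_energy) + 2 k (p - p_xray) to zero gives
    a weighted mean of the two preferred parameter values.
    """
    return (force_constant * p_energy + 2.0 * k * p_xray) / (force_constant + 2.0 * k)

# A bond length preferred at 1.53 Å by the energy term and 1.60 Å by the data.
pure_energy = combined_minimum(1.53, 1.60, k=0.0)
balanced = combined_minimum(1.53, 1.60, k=0.5)
data_dominated = combined_minimum(1.53, 1.60, k=1.0e6)
```

With k = 0 the parameter sits at the energy minimum, and as k grows it moves toward the value favoured by the diffraction data, which is the behaviour the text describes for pure energy and pure crystallographic refinement.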
Generally, during the initial cycles of a refinement run, the k used was 5 x 10", and it was gradually increased in the following cycles from 3 x 10⁻⁵ up to 7 x 10⁻⁵ in the last cycles.

The final step of the EREF cycle involved generating a new set of calculated structure factors based on the energy minimization results or B-factor refinement conducted in the cycle. The FFT (Fast Fourier Transform) program of Ten Eyck [143,144] was used for this purpose. The program initially produces a model electron density map from which it calculates the structure factors by a Fourier inversion.

VI MODEL-BUILDING AND PROLSQ REFINEMENT

Using the final coordinates of the EREF-refined hirudin-thrombin structure as a basis (Table 7), the iterative model-building and refinement procedure was continued using the restrained least squares refinement program of Hendrickson and Konnert [145]. This program, PROLSQ (an acronym for PROtein Least SQuares), incorporates stereochemical information into the refinement process by treating this knowledge as additional observations. Basically, PROLSQ minimizes a grand function Φ = Σ Φi, where Φ contains Φi functions for: a) structure factors, b) distances, c) planar groups, d) chiral centers, e) non-bonded contacts, f) torsion angles, g) temperature factors, and h) non-crystallographic symmetry. Each piece of information is treated as an observational equation,

$$g^{obs} = g^{calc}(x) + \varepsilon \tag{39}$$

where g-obs is the observation, g-calc(x) is the theoretical value computed from the parameters x of the model, and ε is the discrepancy between the two [146]. Moreover, as the refinement is based on the principle of least squares, the 'best' set of parameters for equation (39) is that which minimizes the weighted sum of squared residuals taken over all the observations,

$$\Phi(x) = \sum_{h} w_h \left[ g_h^{obs} - g_h^{calc}(x) \right]^2, \tag{40}$$

in which the appropriate weights wh are the inverses of the variances of the observations.
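The weighted sum of equation (40) is simple to evaluate for a single class of observations. The sketch below is a minimal illustration rather than any part of PROLSQ: the function name `restraint_sum` is hypothetical, the ideal and model bond lengths are made-up values, and the target σ of 0.020 Å is the medium bond-length value quoted in Table 9, with weights taken as 1/σ².

```python
def restraint_sum(observed, calculated, sigma):
    """Weighted sum of squared residuals, with weights w = 1/sigma**2."""
    weight = 1.0 / (sigma * sigma)
    return sum(weight * (obs - calc) ** 2 for obs, calc in zip(observed, calculated))

# Ideal ("observed") and model bond lengths in Å; illustrative values only.
ideal_lengths = [1.53, 1.33]
model_lengths = [1.55, 1.32]
phi_bonds = restraint_sum(ideal_lengths, model_lengths, sigma=0.020)
```

A deviation equal to the target σ contributes one unit to the sum, so tightening σ increases the penalty for the same geometric error, which is how the choice of target values steers the refinement.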
The decision was made to further refine the hirudin-thrombin coordinates, and to use the PROLSQ program in particular, for several reasons. First of all, PROLSQ has been shown over the years to be effective in producing stereochemically-sound structures of biological macromolecules. Moreover, PROLSQ refinement circumvents two serious shortcomings in the EREF refinement. Temperature factors are not restrained during B-factor refinement in EREF, and in some instances the difference in B-factor between connected atoms in the complex is unreasonably large, as much as 30 Å² or more. Also, the final coordinates of the complex contain 40 ω angles which differ by more than 10° from 180° in either direction (one angle being as small as 148°), since only moderate energy penalties are placed on such deviations from planarity. As the function minimized by PROLSQ has terms for thermal parameters and planar groups, the aforementioned problems can be averted by making an appropriate choice of target σ values, and therefore weighting factors, for equation (40). In addition, upper and lower temperature factor limits were set in the program so that B-factors were not allowed to drop below 15 Å² or rise above 60 Å².

Another important reason for choosing to refine the complex further by means of PROLSQ is the inherent flexibility of this program compared to EREF. By virtue of the different (a-h) functions which comprise the grand function minimized by PROLSQ, as well as the ability to choose the weighting factors for these functions, a refinement protocol can be tailor-made. Through a judicious choice of weighting factors, one can tightly, moderately, or loosely restrain any or all of the (a-h) observational functions; one can also refine atomic occupancies (generally only performed on solvent molecules) in addition to positions and temperature factors, as well as introduce and adjust the extent of shift damping to be applied to parameters during the refinement.
The EREF program, on the other hand, affords very little refinement flexibility; one can simply adjust the crystallographic residual weighting factor k in equation (38), choose positional or B-factor refinement, and control the extent of damping applied to the recommended parameter shifts.

Two new steps were also taken in this second phase of model-building and refinement. First, the first unit of the carbohydrate moiety in thrombin (Figure 3), N-acetylglucosamine (NAG1), was modeled into a sufficiently large region of 2Fo-Fc electron density near the side chain atom ND2 of Asn60G and included in the refinement (Figure 26). Second, the manner in which water molecules were located and refined in the crystal structure was changed. In addition to the criteria used in the EREF refinement and model-building for choosing possible solvent atom positions [good 2Fo-Fc density and significant (generally > 3.5σ) Fo-Fc density over the resolution range being studied (7 Å to the maximum resolution, 2.5 or 2.3 Å), as well as proximity to a hydrogen-bonding donor or acceptor atom], it was also required that the position have significant (generally > 3.5σ) Fo-Fc density over the resolution range 8 Å to the maximum resolution. Refinement of the solvent positions also included alternating block cycles of B-factor and occupancy refinement along with positional refinement.

The interactive model-building, as in the previous work, was carried out on an Evans and Sutherland PS390 graphics display using the PSFRODO version of the graphics program FRODO; however, the 2Fo-Fc and Fo-Fc maps used were not generated using Sim-weighted phases.
In addition, the restrained least squares refinement was almost exclusively conducted using a modified version of PROLSQ known as PROFFT, which incorporates fast Fourier transform-based algorithms to speed up the computation of structure factors and least squares matrix elements, and which can restrain intermolecular contacts [147].

Figure 26. Stereoview of the First Unit of Thrombin Carbohydrate Modeled into Map Density. Displayed in bold is N-acetylglucosamine (NAG1) linked to the side chain atom ND2 of Asn60G. Map density is from the final 2Fo-Fc map used for model-building; contours at 0.8σ. Atomic positions are from the final coordinates.

A summary of the model-building and PROLSQ refinement is presented in Table 8. Refinement was initiated using the final coordinates of the EREF-refined hirudin-thrombin structure (including protein atoms and water molecules) in which all the atoms were given an individual B-factor of 35.7 Å², the average individual B-factor of the final EREF structure (Stage -3, Table 8). After two initial rounds of refinement (Stages -2 to -1, Table 8) involving 27 cycles [19 cycles employing tight (T) restraints, 8 employing loose (L) restraints; target σ values used to produce the T and L restraint weights in the refinement are given in Table 9] and a model-building session, however, a new direction had to be taken as the refinement was ill-conditioned. Large coordinate and solvent occupancy shifts were occurring which became progressively larger in size and number, leading to a severe degradation in the stereochemistry and divergence in the refinement.

In an effort to avoid these refinement problems, two steps were taken. After difficulties arose in the first round of refinement (Stage -2, Table 8), 857 weak reflections which gave rise to extremely poor R-factors were removed from the reflection file; however, problems persisted in the next stage of refinement (Stage -1, Table 8).
As a second attempt to resolve these problems, the refinement was re-started without any of the solvent atoms of the final EREF structure; only the protein and NAG1 carbohydrate atom coordinates from the first graphics rebuilding session (Stage 0, Table 8) were used. Removal of all the previous EREF-based water molecule oxygen atoms terminated the occurrence of large positional shifts in the protein atomic positions. In the first stage of refinement after this re-start, following only 10 cycles of refinement (8 T and 2 L restraint cycles, Stage 1, Table 8), the R-factor dropped 0.042 from 0.274 to 0.232, and there was an overall improvement in the stereochemical conformity of the structure.

These results clearly imply that at least some of the solvent molecules in the final EREF structure coordinates were the source of the previous ill-conditioned refinement behavior. Moreover, a possible reason why some of the EREF structure water molecules could be causing problems in the PROLSQ refinement is likely linked to the dramatic difference in the functions being minimized by the two programs. The EREF program minimizes a potential energy function along with a crystallographic residual, and the energetics of solvation likely play a strong role in the refined water positions. The program PROLSQ, on the other hand, has no direct energy restraints, with refinement being guided primarily by stereochemical restraints. In addition, when one also considers that these crystallographic water molecules have a scattering power of about ten electrons, are generally not fully occupied, and have on average higher thermal parameters than the protein, it seems reasonable that the EREF and PROLSQ refinement programs would treat such minor contributors to the overall scattering power of the protein crystal differently.

Table 8. Summary of Model-Building and PROLSQ Refinement. Abbreviations: T, M, L = tight, medium, and loose refinement restraints, respectively (see Table 9); Q = occupancy; NAG1 = N-acetylglucosamine; Bi = individual B (temperature) factor; <> = average; d = d-spacing. Final model parameters given in Table 10.

                    Maximum           Total
Stage    R-value    resolution (Å)    atoms/waters    Comments

-3       0.282      2.5               3056/228
         EREF structure, all Bi = 35.7 Å².
-2       0.237      2.5               3002/174
         17 refinement cycles conducted (12 T, 5 L), 4 used (all T). <Bi> = 36.5 Å². Refinement ill-conditioned: cannot recover from L cycles; large positional shifts on protein and solvent atoms; low occupancy (Q < 0.4) solvents deleted.
-1       0.224      2.5               3022/228
         Model rebuilt on display (1): 14 atom NAG1 sugar unit positioned near Asn60G; 54 new waters added; 48 protein atoms removed. 857 reflections (weak or yielding poor R-factors) removed. 10 refinement cycles conducted (7 T, 3 L). <Bi> = 36.0 Å². Refinement ill-conditioned: large positional and occupancy shifts. REMOVE ALL SOLVENTS AND RESTART.
0        0.274      2.5               2795/0
         Protein and NAG1 coordinates from model rebuild (1). <Bi> = 35.9 Å².
1        0.232      2.5               2795/0
         10 cycles of refinement (8 T, 2 L) conducted; <Bi> = 35.5 Å².
2        0.202      2.5               2855/76
         Model rebuilt on display (2): 76 new waters located; 16 protein atoms removed. 30 refinement cycles (22 T, 8 L) conducted; 16 (14 T, 2 L) used; variable structure factor weighting applied hereafter. <Bi> = 34.5 Å².
3        0.199      2.3               2924/132
         Model rebuilt on display (3): 56 new waters located; 13 protein atoms added. 35 refinement cycles (29 T, 6 L) conducted; 32 (27 T, 5 L) used. Resolution extension to 2.3 ...
4        ...        ...               ...
         ... d > 2.30 Å. 12 refinement cycles (12 T) conducted; 9 used. <Bi> = 35.6 Å².
5        ...        ...               ...
         Model rebuilt on display (5): 63 new waters located; 2 protein atoms removed. 32 cycles of refinement (11 T, 21 M) conducted; 17 used (4 T, 13 M). <Bi> = 35.3 Å².

Table 9. Summary of Hirudin-Thrombin PROLSQ Refinement Restraint Parameters. Target σ values listed for loose, medium, and tight refinement cycles. Weights used in refinement correspond to 1/σ². Two values separated by a '/' indicate that the first listed σ value was used in early refinement situations while the second listed value was used later.

                                               Target σ Values
                                         Loose          Medium    Tight
Distances (Å):
   bond lengths                          0.030/0.025    0.020     0.015
   bond angles                           0.050/0.040    0.035     0.030
   planar 1-4                            0.060/0.050    0.045     0.040
Planes (Å):
   peptides                              0.040          0.020     0.015
   aromatic groups                       0.040          0.020     0.015
Chiral volumes (Å³):                     0.200          0.150     0.150
Non-bonded contacts (Å):
   single torsion                        0.55           0.55      0.50
   multiple torsion                      0.55           0.55      0.50
   possible (X..Y) hydrogen bond         0.55           0.55      0.50
Isotropic thermal parameters (Å²):
   main chain bond                       0.5            0.5       0.5
   main chain angle                      1.0            1.0       1.0
   side chain bond                       0.5/1.0        1.0       0.5/1.0
   side chain angle                      1.0/1.5        1.5       1.0/1.5

Table 10. Final Model Parameters of the PROLSQ-refined Hirudin-Thrombin Complex. The directional arrow and value enclosed in brackets indicate the change with respect to the final hirudin-thrombin EREF structure value.

Protein atoms:
   Thrombin:                     2322 (↑ 26)
   Hirudin:                      447 (↑ 36)
Carbohydrate (NAG1):             14 (↑ 14)
Solvent atoms:                   265 (↑ 58)
Resolution range (Å):            7.0-2.3
Unique reflections:              21056 (↓ 904)
R-value:                         0.173 (↓ 0.020)
Mean B-value (Å²):               35.3 (↓ 0.4)

Geometrical Conformity
r.m.s. deviation in bond angles (deg.):   2.8° (↑ 0.4)
r.m.s. deviations from target values:
                                         target    model
Distances (Å):
   bond lengths                          0.020     0.021
   bond angles                           0.035     0.051
   planar 1-4                            0.045     0.052
Planes (Å):
   peptides                              0.020     0.017
   aromatic groups                       0.020     0.015
Chiral volumes (Å³):                     0.150     0.198
Non-bonded contacts (Å):
   single torsion                        0.55      0.22
   multiple torsion                      0.55      0.27
   possible (X..Y) hydrogen bond         0.55      0.27
Isotropic thermal parameters (Å²):
   main chain bond                       0.5       0.6
   main chain angle                      1.0       1.1
   side chain bond                       1.0       1.2
   side chain angle                      1.5       1.9

The model-building and refinement progressed very well over the next four stages (Stages 2-5, Table 8). Water molecule oxygen atoms were added and refined in each stage until a total of 265 were included.
Also of note is the fact that the resolution of the refinement was extended to 2.3 Å for the last three stages of the work, and 47 weak reflections (many yielding poor R-factors) of resolution 2.56 Å > d > 2.30 Å were removed from the reflection file. Finally, a set of refinement restraints intermediate between loose (L) and tight (T), the so-called 'medium' (M) restraints (Stage 5, Table 8; Table 9), was used in the last stage of refinement to arrive at the final hirudin-thrombin coordinate set. Overall, in the course of these last four rounds of model-building and refinement (54 T, 13 M, and 7 L restraint cycles used), the R-factor decreased by 0.059 from 0.232 to 0.173, 12 protein atoms were removed, 265 water molecules were added, and the average individual B-factor decreased slightly from 35.5 to 35.3 Å² (Table 8).

An examination of the final model parameters of the PROLSQ-refined hirudin-thrombin complex, along with changes in these values with respect to the final EREF-based complex coordinates (Table 10), reveals that the second phase of model-building and refinement work was clearly worthwhile. The final PROLSQ-based complex coordinates contain 3048 total atoms: 2769 protein atoms (2322 in thrombin, 447 in hirudin), 14 carbohydrate atoms, and 265 water molecules. This atom total represents an increase of 134 over the number of atoms present in the final EREF-based coordinates; the increase is distributed as follows: 62 more protein atoms (26 in thrombin, 36 in hirudin), 58 more water molecules, and 14 N-acetylglucosamine atoms. Of particular importance is the fact that the R-factor was lowered significantly and the mean B-value decreased as well; the R-factor improved 0.020 from 0.193 to 0.173 and the mean B-factor dropped from 35.7 to 35.3 Å².
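The atom bookkeeping in the preceding paragraphs can be cross-checked with a few lines of arithmetic. The sketch below only re-adds numbers quoted in the text and in Tables 10 and 12; the variable names are illustrative and are not part of any refinement program.

```python
# Final-model atom counts as quoted in the text (Table 10).
final_counts = {"thrombin": 2322, "hirudin": 447, "NAG1": 14, "water": 265}

# Increases relative to the final EREF coordinates, as quoted in the text.
increases = {"thrombin": 26, "hirudin": 36, "NAG1": 14, "water": 58}

# Waters added in Stages 2-5 (Table 12).
waters_per_stage = [76, 56, 70, 63]

total_atoms = sum(final_counts.values())
protein_atoms = final_counts["thrombin"] + final_counts["hirudin"]
total_increase = sum(increases.values())
protein_increase = increases["thrombin"] + increases["hirudin"]

# R-factor changes quoted in the text, rounded to three decimals.
r_drop_last_four_stages = round(0.232 - 0.173, 3)
r_drop_vs_eref = round(0.193 - 0.173, 3)

print(total_atoms, protein_atoms, total_increase, protein_increase)
print(sum(waters_per_stage), r_drop_last_four_stages, r_drop_vs_eref)
```

All of the totals quoted above (3048, 2769, 134, 62, 265, 0.059, and 0.020) follow from the individual entries.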
Also, the final PROLSQ refinement reflection file had 904 fewer reflections than that used in the EREF refinement work; these reflections were removed because they were of low intensity and generally yielded high R-factors.

The geometrical conformity of the PROLSQ-refined hirudin-thrombin structure is also quite good (Table 10). The r.m.s. deviation of bond angles is 2.8°, 0.4° larger than the value for the final EREF complex structure. The r.m.s. deviations on many restrained parameters are also generally in good agreement with the medium restraint target values used in the final stages of the refinement (Tables 9, 10). The deviations on bond, angle, and planar 1-4 distances are close to, though generally larger than, the target values, with the deviation on bond angle distances displaying the greatest discrepancy (Table 10). The peptide and aromatic group planarity is excellent, and the r.m.s. deviations on these planar units are below the target values (Table 10). The histogram of ω angles (twist along the C-N peptide bond) in Figure 27 attests to the high degree of peptide bond planarity, as better than 91% of the bonds have ω angles of 180° +/- 5°. In addition, the Ramachandran plot of the φ (twist about the CA-N axis) and ψ (twist about the CA-C axis) angles for non-glycine residues in these coordinates, shown in Figure 28, indicates that the vast majority of the residues have φ, ψ angles within conformationally allowed regions. Only three residues in the complex, all of which reside in the thrombin A-chain (Ser1E, Phe7, and Asp14L), deviate significantly from the Ramachandran allowed regions. The stereoconfiguration at chiral centers is satisfactory; the r.m.s. deviation on chiral volumes is 0.198 Å³, which is reasonably close to, but noticeably larger than, the target value of 0.150 Å³ (Table 10). On the other hand, the r.m.s. deviations on all non-bonded contacts are outstanding, as all three contact types have deviations which are less than half the target value of 0.55 Å (Table 10). The low r.m.s. deviations on non-bonded contacts could in part be due to the fact that an EREF structure was used to initiate the PROLSQ refinement, and that the EREF program, which minimizes a potential energy function containing terms for non-bonded contacts, did well in idealizing these non-bonded interactions. Finally, the deviations on isotropic thermal parameters among main chain and side chain atoms involved in bonds and angles are all fairly close to the target values, though they are slightly larger (Table 10).

Figure 27. Histogram of ω Angles in the Hirudin-Thrombin Complex. (Axes: ω angle; number of groups.)

Figure 28. The Ramachandran Plot of Hirudin-Thrombin. Only non-glycine residue angles are displayed.

Data pertaining to the R-factor versus resolution shell (seven shells were used, containing approximately 3000 reflections per shell) for the final model parameters of the complex are presented in Table 11 and plotted in Figure 29.
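For readers who want to see how shell and sphere R-factors of the kind listed in Table 11 are formed, the following is a minimal sketch. It assumes the conventional definition R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs| and an input list of (d-spacing, Fobs, Fcalc) tuples; the function names and the toy reflections are hypothetical and are not taken from the refinement programs used in this work.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor over a set of reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_by_shell(reflections, d_min_edges, d_max=7.0):
    """Return (d_min, n_reflections, shell_R, sphere_R) for each shell.

    reflections : list of (d_spacing, f_obs, f_calc) tuples
    d_min_edges : high-resolution limit (d_min) of each shell
    d_max       : low-resolution cutoff of the data
    A shell holds reflections with d_min <= d < previous d_min; the
    sphere accumulates every reflection from d_max down to d_min.
    """
    usable = [r for r in reflections if r[0] <= d_max]
    rows, sphere, upper = [], [], float("inf")
    for d_min in sorted(d_min_edges, reverse=True):
        shell = [(fo, fc) for d, fo, fc in usable if d_min <= d < upper]
        sphere.extend(shell)
        shell_r = r_factor(*zip(*shell)) if shell else None
        sphere_r = r_factor(*zip(*sphere)) if sphere else None
        rows.append((d_min, len(shell), shell_r, sphere_r))
        upper = d_min
    return rows


# Toy data only: four made-up reflections split over two shells.
toy = [(5.0, 100.0, 90.0), (4.5, 50.0, 55.0), (3.5, 80.0, 60.0), (3.2, 20.0, 30.0)]
rows = r_by_shell(toy, d_min_edges=[4.13, 3.00])
for row in rows:
    print(row)
```

The sphere value in the last row is the overall R-factor for the whole resolution range, which is why the final sphere entry of Table 11 equals the overall value.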
The nearly linear, direct correlation between the shell R-factor and resolution is typical of well-refined structures. Moreover, the fact that the R-factor versus resolution curve of the complex falls between the lines corresponding to the Luzzati coordinate error estimates [148] of 0.20 Å and 0.25 Å suggests that the average error in the atomic positions for the final PROLSQ-refined coordinates is likely close to these estimates (Figure 29). Luzzati plots are based on the assumption that the discrepancies between observed and calculated structure factor amplitudes are due solely to the coordinate error, Δr; other possible sources of error, such as inaccuracies in the measurement of Fobs, are neglected [148]. Thus, the actual error in the coordinates of the complex is probably somewhat larger than 0.25 Å.

Table 11. The R-factor vs. Resolution for the Final Hirudin-Thrombin Model Parameters. The dmin vs. shell R-factor is plotted in Figure 29. The directional arrow and number enclosed in brackets next to shell R-factor values for the unhydrated structure indicate the change with respect to the corresponding values for the hydrated structure.

dmin      reflection   [illegible]   sphere     shell      shell R-factor
(Å)       number                     R-factor   R-factor   no waters
4.13      3043         89.19         0.150      0.150      0.239 (↑ 0.089)
3.39      3380         63.62         0.143      0.136      0.198 (↑ 0.062)
3.00      3335         55.61         0.151      0.176      0.243 (↑ 0.067)
2.75      3160         46.67         0.157      0.190      0.261 (↑ 0.071)
2.56      3112         42.60         0.163      0.212      0.267 (↑ 0.055)
2.42      2826         39.19         0.168      0.235      0.285 (↑ 0.050)
2.30      2200         38.25         0.173      0.265      0.304 (↑ 0.039)
overall:  21056        54.47         0.173                 0.242 (↑ 0.069)

Figure 29. Plot of R-factor vs. Resolution for Hirudin-Thrombin. Filled circles represent the data points of the complex; the remaining lines correspond to average coordinate errors (Δr) according to Luzzati [148].

While the average individual isotropic temperature factor, <Bi>, for all the atoms in the hirudin-thrombin complex is 35.3 Å² (Table 10), it is much more informative to consider the <Bi> of each component of the crystal and how the <Bi> varies within each of these components. Values of <Bi> for the various constituents of the hirudin-thrombin crystals are listed in Table 12. Not surprisingly, the <Bi> for thrombin (32.1 Å²) is significantly lower than that for hirudin (48.3 Å²), and the thrombin B-chain atoms have a lower <Bi> than those of the A-chain (31.5 Å² for the B-chain versus 37.3 Å² for the A-chain). Since hirudin is a small inhibitor protein and thrombin is a large serine proteinase with a molecular architecture like chymotrypsin, one would expect atomic vibrations of the atoms in the serine proteinase to be lower than those of the inhibitor. Moreover, since the thrombin A-chain is much smaller than the thrombin B-chain, one might expect the B-factors of the covalently linked peptide to be higher, on average. However, what is startling is the fact that the <Bi> for the hirudin atoms is even higher than the average for all the crystallographically located water molecules (39.8 Å²) by 8.5 Å².

An examination of the <Bi> versus residue number for all of the protein components in the crystal is even more informative.

Table 12. Average Individual B-factors, <Bi>, and Average Occupancies, <Q>, for the Constituents of the Hirudin-Thrombin Crystals.

                         Number   <Bi> (Å²)   <Q>
All atoms                3048     35.3        ----
All protein atoms        2769     34.7        1.00
  Thrombin               2322     32.1        1.00
    Thrombin A-Chain      253     37.3        1.00
    Thrombin B-Chain     2069     31.5        1.00
  Hirudin                 447     48.3        1.00
All water molecules       265     39.8        0.78
  Stage 2 waters           76     31.1        0.91
  Stage 3 waters           56     37.8        0.81
  Stage 4 waters           70     44.1        0.76
  Stage 5 waters           63     47.2        0.63
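The group averages in Table 12 are mutually consistent, which can be verified by forming atom-count-weighted means of the sub-groups. The short sketch below does this using only the counts and averages listed in Table 12 (with the Table 12 value of 48.3 Å² for hirudin); the helper name `weighted_mean` is illustrative.

```python
def weighted_mean(groups):
    """Atom-count-weighted mean of (n_atoms, mean_value) pairs."""
    total = sum(n for n, _ in groups)
    return sum(n * value for n, value in groups) / total


# <Bi> of thrombin from its A- and B-chains.
thrombin_b = weighted_mean([(253, 37.3), (2069, 31.5)])

# <Bi> of all protein atoms from thrombin and hirudin.
protein_b = weighted_mean([(2322, 32.1), (447, 48.3)])

# <Bi> and <Q> of all waters from the four hydration stages.
water_b = weighted_mean([(76, 31.1), (56, 37.8), (70, 44.1), (63, 47.2)])
water_q = weighted_mean([(76, 0.91), (56, 0.81), (70, 0.76), (63, 0.63)])

print(round(thrombin_b, 1), round(protein_b, 1), round(water_b, 1), round(water_q, 2))
```

The weighted means reproduce the tabulated 32.1, 34.7, and 39.8 Å² and the average water occupancy of 0.78.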
As can be seen in Figure 30, the <Bi> varies significantly among the main chain and side chain atoms within each protein unit. The smooth variation in <Bi> for main and side chain atoms is likely a result of the isotropic thermal parameter restraints used in the refinement (Table 9). However, the magnitude and variation of <Bi> likely reflect the conformational flexibility, crystal disorder, and stabilizing interactions in the protein. The <Bi> for the side chain atoms is in general larger than for main chain atoms because side chain atoms, by and large, have more conformational flexibility than the main chain atoms. Moreover, in several of the residues where the side chain <Bi> is lower than that of the main chain, the side chain atom is a cysteine sulfur involved in a disulfide linkage, which could certainly explain the diminished thermal motion compared to that of the main chain atoms (Figure 30). Many of the 'high peaks' in the Figure 30 plot correspond to residues (most of which are in loop segments) which are poorly defined in electron density, or are near disordered regions. In the thrombin molecule, the N- and C-termini of the A-chain and the C-terminus of the B-chain are all missing residues due to disorder in the crystals, and the peak at Glu97A is indicative of the poor density for this residue (weak main chain density and essentially no side chain density). In the hirudin molecule, the residues possessing the largest main and side chain <Bi> either adjoin the completely undefined loop residues Ser32'-Lys35' or are near the poorly defined tripeptide segment Asn52'-Asn53'-Gly54'. Finally, stabilizing interactions in the protein molecules of the complex are probably responsible for some of the regions of diminished thermal motion in the Figure 30 plot. The stabilizing or damping effect which the disulfide linkages have on the thermal motion of residues near them is clearly visible, as the cystine linkages generally occur at or near the <Bi> minima. Also, the low main chain and side chain <Bi> values for residues Ile1'-Tyr3' and the trough for residues Asp55'-Gln65' in Figure 30 are probably due to the numerous hirudin-thrombin interactions involving these regions of the hirudin molecule, as will be elaborated on in the following chapter.

Figure 30. Plot of the Average Individual B-factor of Main and Side Chain Atoms for the Protein Atoms in the Complex. The <Bi> for main and side chain atoms is represented by '+' and 'X' symbols, respectively. Main chain atom values connected by a line. The break in <Bi> in hirudin corresponds to the completely undefined segment Ser32'-Lys35'. Bold arrows point to main and side chain <Bi> values of Cys residues engaged in disulfide linkages. (Panels: thrombin A-chain, thrombin B-chain, hirudin; horizontal axis: residue number.)

All the water molecules in the final PROLSQ-refined coordinates of the complex were located and refined over the course of four stages (Table 8), as was mentioned earlier. Upon examination of the average individual B-factor and average occupancy (<Q>) for the water molecules added in each of these four steps, an interesting trend was discerned (Table 12). The magnitude of <Bi> increases and the magnitude of <Q> decreases with each successive round of water molecules added, with values <Bi> = 31.1 Å² and <Q> = 0.91 for the water molecules added in the first stage of hydration (Stage 2, Table 8) and <Bi> = 47.2 Å² and <Q> = 0.63 for water molecules added in the last stage (Stage 5, Table 8).
This trend seems reasonable if one considers that the difference density peaks selected to be water molecules in the first round of water addition were large in magnitude and significance; thus, to account satisfactorily for these peaks, a water molecule would generally refine to a low B-factor and a high occupancy factor. However, in the last stage of hydration, the majority of bound solvent atoms have already been located, and many of the significant difference density peaks are likely due to errors. In addition, the largest difference densities are lower in magnitude than they were in the first stage of hydration, and many of the suitable difference density peaks will correspond to second hydration sphere water molecules (more interaction with other water molecules than with protein atoms); thus, the latter water molecules frequently refine to significantly higher B-factors and lower occupancies. The histogram of water individual B-factors in the complex (Figure 31) indicates that although the range in individual temperature factors for water molecules is quite broad, with 15 Å² < Bi < 55 Å², nearly half of the crystallographic waters have 40 Å² < Bi < 55 Å². The histogram of water occupancies in the complex, Figure 32, shows that the majority of the located water molecules are of fairly high occupancy, with nearly 40 water molecules per occupancy shell in each of the five shells between 0.5 and 1.0, and the largest number of waters (55) having the highest possible occupancy, 1.0.

An examination of the R-factor versus resolution data in Table 11 calculated without the water molecules included clearly indicates that the water structure makes a significant contribution to the overall R-factor, and to the R-factor of each resolution shell. The overall R-factor for the unhydrated complex is 0.242, 0.069 larger than that of the hydrated structure.
In addition, while the effect of the water structure is most significant in the lowest resolution shell (0.089 for the 7.00 to 4.13 Å resolution data), where ordered solvent typically makes its largest contribution to X-ray scattering, the water contribution is surprisingly large in all the resolution shells (Table 11). Even in the highest resolution shell (2.42 to 2.30 Å), removal of the water molecules causes the R-factor to increase by 0.039 for the shell. The considerable impact of the water structure at all resolution levels is probably a result of the protein structure re-adjusting in the refinement to accommodate the water molecules optimally.

Figure 31. Histogram of the Individual B-factors of the Water Molecules in the Hirudin-Thrombin Structure. (Axes: temperature factor; number of waters.)

Figure 32. Histogram of the Occupancy Factors of the Water Molecules in the Hirudin-Thrombin Structure. All water molecules in the last occupancy shell, 1.0 - 1.1, have an occupancy of 1.0. (Axes: occupancy; number of waters.)

VII RESULTS AND DISCUSSION

A. The Hirudin Structure

Hirudin appears to possess two distinct domains in the complex (Figure 33). (The hirudin sequence is given in Figure 34.) The NH2-terminal domain is compact, and contains residues Ile1' to Pro48'. The COOH-terminal domain adopts an extended conformation involving two stretches of polypeptide, residues Glu49'-Gly54' and Asp55'-Pro60', and terminates with a near 3₁₀ type III helical reverse turn (Glu61'-Gln65') [149]. The secondary structural elements present in the hirudin molecule are summarized in Table 13.

The globular conformation of the NH2-terminal domain of hirudin (Figure 33) is intimately related to the presence of the three-disulfide core (Figure 35).
The disulfides Cys6'-Cys14' and Cys16'-Cys28' are oriented nearly perpendicular to one another, with the distance between the midpoints of the disulfides being 4.97 Å. In addition, Cys16'-Cys28' and Cys22'-Cys39' are nearly parallel, and the distance between the disulfide bond midpoints is 5.35 Å. Similar close disulfide core interactions are also observed in the kringle structures of prothrombin fragment 1 [150] and plasminogen K4 [151], C3a anaphylatoxin [152], and squash seed trypsin inhibitor [153]. A complete list of the sulfur-sulfur distances in the hirudin molecule is presented in Table 14. It is of particular note that, excluding the actual disulfide bonds themselves, the smallest inter-sulfur distance is 3.74 Å (Cys6' SG-Cys28' SG), the largest distance is only 8.32 Å (Cys14' SG-Cys39' SG), and the mean distance is 5.91 Å +/- 1.30 Å (σ). The result of these close cystine interactions is that the loop segments B, C, and D, which comprise a double loop in hirudin (Figure 34), fold into three unique loops of protein folding. This, combined with the ordinary Cys6'-Cys14' loop, produces four three-dimensional loops (Figure 36).

The hirudin NH2-terminal domain appears to be stabilized by antiparallel β-structure. A short segment (β1, Table 13) results from the interaction of strands Cys14'-Cys16' with Asn20'-Cys22' (Figure 37). The β1 structure is maintained by the hydrogen bonds Cys22' N-Cys14' O (2.83 Å) and Cys16' N-Asn20' O (3.17 Å). A short hydrogen bond interaction (2.60 Å) between Leu13' N and Cys22' O immediately precedes these two strands. The polypeptide strands of this β-structure are connected by a type II' reverse turn (T1, Table 13) [149]. A type II reverse turn (T2, Table 13) [149] is also present in loop segment C, involving residues Gly23'-Asn26'. A longer antiparallel β-finger (β2, Table 13) occurs in loop segment D, and involves residues Lys27'-Gly31' and Gly36'-Val40' (Figure 38). Four hydrogen bonds are apparent in this structural element: Lys27' N-Val40' O (2.39 Å); Val40' N-Lys27' O (2.42 Å); Ile29' N-Gln38' O (2.53 Å); and Gln38' N-Ile29' O (2.72 Å). As with β1, a reverse turn (T3, Table 13) joins these strands; however, as in solution [78-80], it is completely disordered in the crystal structure of the complex (Figure 33). The structure of the β2 strand does give some indication that its associated turn will likely be disordered. The strands of peptide appear to diverge just prior to the turn (Figure 38), and this is further evidenced by the fact that what should be a hydrogen bond preceding the turn, Gly31' N-Gly36' O, is a 3.25 Å long contact, too long to be a viable hydrogen bond. The flexibility of this turn could be due to the fact that it is hinged to glycines 31' and 36'.

Figure 33. Stereoview of the Folding of Hirudin in the Complex. N, CA, C atoms only; hirudin disulfides in bold; disordered or poorly defined regions indicated by dashed lines.

Figure 34. The Sequence of Recombinant Hirudin Variant 2-Lysine 47 (rHV2-K47). Isolated loop designated 'A' and loop segments comprising the two interconnected loops designated 'B', 'C', and 'D'.

Table 13. Secondary Structural Elements of Hirudin.

Element          Type             Residues
β-Structure
  β1             Antiparallel     Cys14'-Cys16'
                                  Asn20'-Cys22'
  β2             Antiparallel     Lys27'-Gly31'
                                  Gly36'-Val40'
Helix
  H1             Polyproline II   Pro46'-His51'
Reverse Turns
  T1             Type II'         Glu17'-Gly18'-Ser19'-Asn20'
  T2             Type II          Gly23'-Lys24'-Gly25'-Asn26'
  T3             ?                Ser32'-Asn33'-Gly34'-Lys35'
  T4             Type III         Glu61'-Glu62'-Tyr63'-Leu64'

Figure 35. Stereoview of the Cystine Disulfide Core of Hirudin. Disulfide linkages in bold.

Table 14. Sulfur-Sulfur Distances (Å) in Hirudin. A '*' indicates a disulfide bond.

Sulfur Atom   6'     14'     16'    22'    28'     39'
6'            0.0    2.01*   4.72   6.11   3.74    7.65
14'                  0.0     6.43   6.56   5.43    8.32
16'                          0.0    4.70   2.06*   5.22
22'                                 0.0    5.48    2.05*
28'                                        0.0     6.58
39'                                                0.0
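The sulfur-sulfur statistics quoted for Table 14 (smallest and largest non-bonded distances, and the mean of 5.91 Å +/- 1.30 Å) can be reproduced directly from the tabulated values. The sketch below transcribes the upper triangle of Table 14 and assumes that the quoted σ is the sample standard deviation (n - 1 in the denominator), which is the convention that matches the quoted value; the variable names are illustrative.

```python
from statistics import mean, stdev

# Upper triangle of Table 14: (residue i, residue j) -> SG..SG distance in Å.
distances = {
    (6, 14): 2.01, (6, 16): 4.72, (6, 22): 6.11, (6, 28): 3.74, (6, 39): 7.65,
    (14, 16): 6.43, (14, 22): 6.56, (14, 28): 5.43, (14, 39): 8.32,
    (16, 22): 4.70, (16, 28): 2.06, (16, 39): 5.22,
    (22, 28): 5.48, (22, 39): 2.05,
    (28, 39): 6.58,
}

# The three disulfide bonds, marked '*' in Table 14.
disulfides = {(6, 14), (16, 28), (22, 39)}

# Keep only the twelve non-bonded sulfur pairs.
nonbonded = {pair: d for pair, d in distances.items() if pair not in disulfides}
values = list(nonbonded.values())

closest = min(nonbonded, key=nonbonded.get)
farthest = max(nonbonded, key=nonbonded.get)
mean_distance = mean(values)
sigma = stdev(values)  # sample standard deviation

print(closest, nonbonded[closest])
print(farthest, nonbonded[farthest])
print(round(mean_distance, 2), round(sigma, 2))
```

Using the population standard deviation instead would give a slightly smaller spread (about 1.24 Å), so the quoted 1.30 Å is consistent with the sample form.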
In addition to the disulfide core, many intramolecular hydrogen bonds also appear to stabilize the N-terminal domain of hirudin in the complex (Table 15). In all, 25 intramolecular hydrogen bonds are found in the domain, of which 18 involve main chain atoms only (eight reside in turns or β-structure), and seven involve side chain atoms. In particular, while an examination of Figure 36 might lead one to wonder what maintains the conformation of the Val40'-Pro48' polypeptide, nine hydrogen bonds along this nine-residue stretch link the chain to the rest of the N-terminal domain. Hydrogen bonds involving Lys47' also appear to be important in extending the domain to residue Pro48' (Figure 39). The ε-amino group of Lys47' is involved in two hydrogen bonds with residues of the amino terminal pentapeptide of hirudin: Thr4' OG1 (2.64 Å) and Asp5' O (2.90 Å) (Figure 39). In addition, Lys47' N hydrogen bonds to Asn12' O (2.70 Å) (Figure 39). Such interactions bring the N- and C-termini of the domain into relatively close proximity, increasing the size but maintaining the compactness of the domain. The rigidity of the lysine side chain might well be related to the two prolines which flank it, Pro46' and Pro48'. These three residues also initiate the polyproline II helix (H1, Table 13) [154] which extends from Pro46' to His51' (Figure 40); the six residues in this stretch exhibit Ramachandran φ, ψ angles similar to those in a collagen helix [155].

Figure 36. Stereoview of the Folding of the NH2-terminal Domain of Hirudin in the Complex. N, CA, C atoms only; designations 'A', 'B', 'C', and 'D' correspond to those of Figure 34; disorder indicated by dashed lines.

Figure 37. Stereoview of the β1 Antiparallel β-Strand in Hirudin. Hydrogen bonds indicated by dashed lines.

Figure 38. Stereoview of the β2 Antiparallel β-Strand in Hirudin. Hydrogen bonds indicated by dashed lines.

Table 15. Intramolecular Hydrogen Bonds in Hirudin. Hydrogen atoms are assigned geometrically idealized positions. Donor atoms are denoted 'D' and acceptor atoms 'A'. Entries shown as '...' are not legible in the source table.

A. Hydrogen Bonds involving Main Chain Atoms.

                                 Distances (Å)     Angles (deg)
Donor        Acceptor            H..A    D..A      DHA   CAH   CAD
Cys6' N      Leu15' O            1.79    2.75      164   145   142
Gly10' N     Cys28' O            1.44    2.38      159   137   145
Asn12' N     Thr45' O            1.38    2.24      141   117   130
Leu13' N     Cys22' O            1.72    2.60      148   166   171
Leu15' N     Thr4' O             1.85    ...       161   119   125
Cys16' N     Asn20' O            2.17    3.17      162   157   154
Asn20' N     Glu17' O            2.63    ...       130   109   122
Cys22' N     Cys14' O            1.89    2.83      149   152   155
Asn26' N     Gly23' O            ...     ...       137   130   130
Lys27' N     Val40' O            ...     2.39      128   128   146
Cys28' N     Gln11' O            ...     ...       130   125   138
Ile29' N     Gln38' O            ...     2.53      167   156   160
Gln38' N     Ile29' O            ...     2.72      158   140   148
Val40' N     Lys27' O            ...     2.42      147   168   167
Gly42' N     Gly25' O            ...     ...       134   128   120
Gly44' N     Asn26' O            ...     ...       123   138   132
Thr45' N     [illegible] O       ...     ...       145   137   148
Lys47' N     Asn12' O            ...     2.70      144   174   170
Tyr63' N     Pro60' O            ...     2.90      159   124   128
Leu64' N     Glu61' O            ...     2.82      134    92   106
Mean:                            1.93    2.79      146
Standard Deviation:              ...     ...        14

B. Hydrogen Bonds involving Side Chain Atoms.

                                 Distances (Å)     Angles (deg)
Donor        Acceptor            H..A    D..A      DHA   CAH   CAD
Thr7' N      Gln11' OE1          1.56    2.48      144   133   137
Glu8' N      Gln11' OE1          2.06    3.01      160    83    87
Gln11' NE2   Glu8' OE1           ...     2.63      158   116   114
Leu30' N     Ser9' OG            ...     2.60      150   120   130
Cys39' N     Glu17' OE1          ...     2.57      169   133   129
Lys47' NZ    Thr4' OG1           ...     2.64      149   130   121
Lys47' NZ    Asp5' O             ...     2.90      152   145   150
Mean:                            1.75    2.69      155
Standard Deviation:              0.18    0.19      ...

A hydrogen bond is a two-bond system, D-H..A, consisting of a donor (D-H) and an acceptor (H..A) bond, where D and A represent the electronegative donor and acceptor atoms, respectively.
As such, the hydrogen bond contains a large positive (H atom) and a large negative (acceptor) partial charge at close distance. Moreover, since a positive charge assumes its lowest potential energy between two negative charges when all three are aligned, hydrogen bonds are ideally linear. Experimentally, however, it has been shown that the angle decreases from 180° for strong hydrogen bonds to around 130° for weak ones [156,157].

Figure 39. Stereoview of the Intramolecular Lys47' Hydrogen Bonds in the Hirudin N-terminal Domain. CA structure of residues 1'-48' including disulfides (disulfide linkages in bold), and hydrogen bonding residues (bold). The disorder in residues 32'-35' as well as the hydrogen bonds are indicated by dashed lines.

Figure 40. Stereoview of the Polyproline II Helix in Hirudin. N, CA, C atoms indicated in bold.

Hydrogen bond interactions are commonly described in terms of two distances, r(D..A) and r(H..A), and the DHA angle. (Additional angles can also be used to describe the interaction more fully; the CAH and CAD angles can be reported, where 'C' represents the carbon atom to which the acceptor atom A is bonded, but these are not as critical as the DHA angle.) The mean values of the hydrogen bond descriptors r(H..A), r(D..A), and of the angle DHA were determined for the intramolecular hydrogen bonds in hirudin (25 of the 27 bonds are in the hirudin NH2-terminal domain), and in general they agree fairly well with mean values determined from organic crystal structures (Table 15). The mean values of the distances H..O and N..O, and of the angle NHO for the 20 intramolecular main chain atom (N, CA, C, O) hydrogen bonds in hirudin were 1.93 Å, 2.79 Å, and 146°, respectively (Table 15).
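The hydrogen bond descriptors defined above are simple functions of the atomic coordinates. As an illustration, the sketch below computes r(D..A), r(H..A), and the DHA, CAH, and CAD angles from Cartesian positions; the function names and the example coordinates are hypothetical, and the hydrogen position is assumed to have been placed geometrically beforehand, as was done for Table 15.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def distance(a, b):
    """Distance between two points (Å)."""
    return _norm(_sub(a, b))


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1, v2 = _sub(a, b), _sub(c, b)
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (_norm(v1) * _norm(v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def hbond_descriptors(donor, hydrogen, acceptor, acceptor_carbon=None):
    """Return r(D..A), r(H..A) and the DHA angle; CAH and CAD if C is given."""
    out = {
        "r_DA": distance(donor, acceptor),
        "r_HA": distance(hydrogen, acceptor),
        "DHA": angle_deg(donor, hydrogen, acceptor),
    }
    if acceptor_carbon is not None:
        out["CAH"] = angle_deg(acceptor_carbon, acceptor, hydrogen)
        out["CAD"] = angle_deg(acceptor_carbon, acceptor, donor)
    return out


# Made-up, perfectly linear N-H..O=C arrangement along the x axis.
example = hbond_descriptors(
    donor=(0.0, 0.0, 0.0),
    hydrogen=(1.0, 0.0, 0.0),
    acceptor=(2.9, 0.0, 0.0),
    acceptor_carbon=(2.9, 1.23, 0.0),
)
print(example)
```

In this idealized example the DHA angle is 180°, the limiting value expected for a strong, linear hydrogen bond.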
These values are in good agreement with those determined for intramolecular main chain atom hydrogen bonds {r(H..O) = 1.988 Å; r(N..O) = 2.755 Å; NHO angle = 132.5°} and all (intra- and intermolecular) main chain atom hydrogen bonds in general {r(H..O) = 1.921 Å; r(N..O) = 2.878 Å; NHO angle = 158.3°} from the literature [153]. Similarly, the corresponding mean values for the intramolecular hirudin hydrogen bonds involving amino acid side chains {r(H..A) = 1.75 Å; r(D..A) = 2.69 Å; DHA angle = 155°} (Table 15) are in good agreement with those determined from organic crystal structures involving nitrogen donor and oxygen acceptor atoms {r(H..A) = 1.93 Å; r(D..A) = 2.858 Å; DHA angle = 160.4°} [159]. Finally, while the ideal bond length for general hydrogen bonds involving oxygen-oxygen and nitrogen-oxygen atoms is considered to be 2.80 Å and 2.85 Å [145], respectively, the 1-4 hydrogen bonds in turns are often accepted to be as long as 3.2 Å [149]. The fact that so many of the intramolecular hirudin hydrogen bonds in Table 15 deviate significantly from the ideal values suggests that the diffraction data are not of high enough resolution or quality to resolve them accurately.

The folding of the NH2-terminal domain of hirudin in the crystalline complex is very similar to that which has been observed for hirudin in solution by 2D NMR studies [78-80]. An optimal superposition of 114 hirudin N, CA, C atoms from the complex (residues 4'-30' and 38'-48') with the average coordinates of 32 NMR structures [79] gave an r.m.s. difference of 0.86 Å, with an r.m.s. difference of 1.95 Å for side chains. The r.m.s. difference of 0.86 Å agrees well with the average atomic r.m.s. difference between the 32 individual NMR structures and the mean NMR structure, which was approximately 0.7 Å for backbone atoms [79]. The NMR hirudin structures also had disorder in the vicinity of the 32'-35' reverse turn [78-80].
Based on these comparisons, the close correspondence between the NMR and the crystal structures indicates that the hirudin N-terminal domain possesses a similar structure in the free and thrombin-bound states.

Surprisingly, while the main chain atom agreement can be considered to be quite good between the NMR and crystalline complex hirudin structures, the corresponding r.m.s. difference between cystine sulfurs is quite large, 2.43 Å. An optimal superposition of the two disulfide cores based on the N, CA, C fitting mentioned previously is displayed in Figure 41; not surprisingly, the orientation of the disulfides is fairly different between the two structures. The dihedral angles and dihedral energies of these disulfide bridges are presented in Table 16. The Cys6-Cys14 and the Cys22-Cys39 disulfides are displaced approximately a half-bond length and a full bond length, respectively, in the NMR structure from where they are positioned in the crystal. Moreover, the orientation of the Cys16-Cys28 disulfide is different in the two structures: the χ1, χ1' angles and the χ2, χ2' angles differ by approximately 120° and 247°, respectively (Table 16).

Figure 41. Stereoview of the Disulfides in the Crystal Complex and in the NMR Hirudin Structure [79]. NMR hirudin disulfide side chains indicated by dashed lines; disulfide linkages in bold.

Table 16. Dihedral Angles and Dihedral Energies of the Disulfide Bridges in Hirudin. A '*' indicates the energy value was excluded from the calculation of the mean.

              CA--CB--SG1--SG2--CB--CA
                χ1   χ2   χ3   χ1'  χ2'

Disulfide   Source    χ1     χ2     χ3    χ1'    χ2'    CA-CA (Å)      E (kcal/mole)
6-14        NMR       -161   -62    87    108    65     4.97           3.32
16-28       NMR       -60    -169   75    -168   -40    6.21           1.92
22-39       NMR       -128   110    165   -83    -121   6.49           15.05*
Mean Values:                                            5.89           2.62
                                                        (+/-0.81)      (+/-0.99)

6'-14'      Crystal   -51    -66    97    86     54     4.99           2.07
16'-28'     Crystal   -165   81     85    76     -175   5.94           1.70
22'-39'     Crystal   -175   70     80    65     174    5.55           2.23
Mean Values:                                            5.49           2.00
                                                        (+/-0.48)      (+/-0.27)
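The dihedral energies listed in Table 16 come from the torsional expression given as equation (41). As a sketch of how such a value is evaluated, the function below implements that expression term by term. It assumes that the five angle columns of Table 16 follow the atom order along the bridge (χ1, χ2, χ3, χ2', χ1'), which is an interpretation rather than something the table states; with that reading the two Cys6-Cys14 rows are reproduced to within the rounding of the tabulated angles, while other rows may not reproduce as closely from the angles as printed. The function name is hypothetical.

```python
import math


def disulfide_dihedral_energy(chi1, chi2, chi3, chi2p, chi1p):
    """Dihedral energy (kcal/mole) of a disulfide bridge per equation (41).

    Angles are in degrees. chi1/chi1p are the CA-CB torsions, chi2/chi2p
    the CB-SG torsions, and chi3 the SG-SG torsion.
    """
    def term(k, n, angle_deg):
        return k * (1.0 + math.cos(math.radians(n * angle_deg)))

    return (
        term(2.0, 3, chi1) + term(2.0, 3, chi1p)
        + term(1.0, 3, chi2) + term(1.0, 3, chi2p)
        + term(3.5, 2, chi3) + term(0.6, 3, chi3)
    )


# Cys6-Cys14 rows of Table 16, read in bridge order (assumption, see above).
e_nmr_6_14 = disulfide_dihedral_energy(-161, -62, 87, 108, 65)
e_crystal_6_14 = disulfide_dihedral_energy(-51, -66, 97, 86, 54)

print(round(e_nmr_6_14, 2), round(e_crystal_6_14, 2))
```

Because every term has the form k(1 + cos nχ), the energy vanishes for staggered χ1 and χ2 values (+/-60°, 180°) and is smallest near χ3 = +/-90°, which is why an SG-SG torsion far from 90° is penalized heavily.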
Available protein structure coordinates indicate that two general disulfide families can be defined based on the chirality of the dihedral χ3 angle about the sulfur-sulfur bond: right-handed (χ3 approximately 90°) and left-handed (χ3 approximately -90°) [160]. Five of the six disulfides presented in Table 16 are right-handed: all of the disulfides from the crystal structure and two of the disulfides from the NMR structure. The Cys22-Cys39 disulfide of the NMR structure has an unusually large χ3 value (χ3 = 165°), which has not been observed in protein structures [160], most likely due to its high dihedral energy. The dihedral energies for the disulfides presented in Table 16 were calculated using a formula from the program AMBER [161]:

\[
E\,(\text{kcal/mole}) = 2.0\,[1 + \cos(3\chi_1)] + 2.0\,[1 + \cos(3\chi_1')] + 1.0\,[1 + \cos(3\chi_2)] + 1.0\,[1 + \cos(3\chi_2')] + 3.5\,[1 + \cos(2\chi_3)] + 0.6\,[1 + \cos(3\chi_3)] \tag{41}
\]

The dihedral energy for the Cys22-Cys39 disulfide in the NMR structure is 15.05 kcal/mole, drastically larger than the mean value of 3.19 (+/-1.43) calculated for right-handed disulfides using available protein crystal structure data [160]. Excluding this outlier value, the mean dihedral energy of 2.62 (+/-0.99) kcal/mole for the remaining two disulfides in the NMR structure and of 2.00 (+/-0.27) kcal/mole for the disulfides in the crystal structure agree well with the mean value calculated by Katz and Kossiakoff [160]. The mean CA-CA distance of 5.49 (+/-0.48) Å (Table 16) observed in the complex also agrees well with the value of 5.07 (+/-0.73) Å for right-handed disulfides [160]. It is of note that the mean CA-CA value of 5.89 (+/-0.81) Å for the disulfides in the NMR structure is on the high side because two of these distances were long: 6.29 Å (Cys16 CA-Cys28 CA) and 6.49 Å (Cys22 CA-Cys39 CA).

The COOH-terminal domain of hirudin exists as two extended stretches of polypeptide with a bend at Asp55' (Figure 33).
The first segment of this domain is approximately 18 Å long and contains residues Glu49' to Gly54'. The first three residues of this segment, Glu49'-Ser50'-His51', complete the polyproline helix displayed in Figure 40. The last three residues of this stretch, Asn52'-Asn53'-Gly54', have very poorly defined electron density in the 2Fo-Fc maps. The second segment of the COOH-terminal domain is approximately 19 Å long and consists of 15 Å in extended conformation (Asp55'-Pro60') followed by a type III 3₁₀ helical reverse turn [149]. This reverse turn (T4, Table 13) involves residues Glu61'-Leu64', contains a hydrogen bond between Leu64' N and Glu61' O (2.82 Å) (Table 15), and is shown in Figure 42. A hydrogen bond between Tyr63' N and Pro60' O (2.90 Å) (Table 15) precedes this turn. A general helical structure was also deduced for residues Glu61'-Leu64' by NMR with a synthetic peptide fragment of the hirudin tail [162]. The elongated nature of the C-terminal tail, with no stabilizing interactions back to the N-terminal domain, probably explains why this portion of the hirudin molecule is disordered in solution.

Figure 42. Stereoview of the T4 Type III Helical Reverse Turn in Hirudin. Hydrogen bonds indicated by dashed lines; main chain atoms in bold; Gln65' also included.

B. The Hirudin-Thrombin Interaction

Figure 43. Stereoview of the CA Structure of the Hirudin-Thrombin Complex. Hirudin CA and disulfides of hirudin and thrombin in bold. N- and C-terminals of thrombin designated.

The CA structure of the rHV2-K47-thrombin complex is shown in Figure 43, from which it can be seen that the NH2-terminal tail of hirudin binds in the active site region, while the 39 Å long COOH-terminal tail extends across the surface and finishes almost diametrically opposed to the hirudin N-terminus.
A unique and unexpected interaction occurs between the NH2-terminal tripeptide of hirudin and the active site region of thrombin from the standpoint of classical serine proteinase inhibition by macromolecular inhibitors. All SERPIN (SERine Proteinase INhibitor)-serine proteinase complexes characterized to date [84,163] display a common mode of interaction in the active site which is also observed in the PPACK-thrombin structure [6] (Figure 44). The polypeptide chain of PPACK aligns antiparallel to the 214-217 strand in thrombin, and the arginyl side chain, which corresponds to the P1 residue of a thrombin substrate, sits in the specificity pocket in a substrate-like manner, making an ion pair with residue Asp189 at the base of the pocket. The hirudin NH2-terminal tripeptide, however, forms a parallel β-strand interaction with the 214-217 thrombin segment and makes no use of the specificity pocket (Figure 45). The hirudin tripeptide forms three hydrogen bonds with the thrombin 214-217 strand (Table 17). Residue Ile1' engages in two hydrogen bonds (Ile1' N-Ser214 O, 3.02 Å; Ile1' O-Gly216 N, 3.21 Å) and Tyr3' participates in one (Tyr3' N-Gly216 O, 3.01 Å) (Table 17). In addition, the carbonyl oxygen of Tyr3' makes a hydrogen bond with the amide nitrogen of Gly219 (2.84 Å) (Table 17). Most notable, however, is the fact that the amide nitrogen of Ile1' also interacts through a hydrogen bond with Ser195 OG (3.06 Å) (Table 17).

Figure 44. Stereoview of PPACK in the Active Site Region of Thrombin. PPACK, from the PPACK-thrombin structure [6], optimally superimposed into the thrombin active site based on N, CA, C fitting of the PPACK-thrombin and hirudin-thrombin complex thrombin structures; PPACK in bold.

Figure 45. Stereoview of the Interaction of the Hirudin N-terminal Tripeptide in the Active Site of Thrombin. Hirudin N-terminal tripeptide in bold; hydrogen bonds of the N-terminal nitrogen indicated by dashed lines.

Table 17. Intermolecular Hydrogen Bonds in the Hirudin-Thrombin Complex. Hydrogen atoms are assigned geometrically idealized positions. Donor atoms are denoted 'D' and acceptor atoms 'A'. A protonated amino terminus is designated 'N+' and an unprotonated amino terminus is designated 'N' for Ile1'. A '*' indicates the mean and standard deviation were calculated using values for an unprotonated Ile1' amino terminus.

A. Hydrogen Bonds involving Main Chain Atoms.

Donor       Acceptor      H..A (Å)  D..A (Å)  DHA (°)  CAH (°)  CAD (°)
Ile1' N+    Ser214 O      2.05      3.01      154      144      152
Ile1' N     Ser214 O      2.04      3.01      155      149      152
Tyr3' N     Gly216 O      2.04      3.01      165      161      159
Glu57' N    Thr74 O       1.84      2.82      173      165      164
Gly216 N    Ile1' O       2.32      3.21      149      153      158
Gly219 N    Tyr3' O       2.17      2.84      123      138      155
Lys224 NZ   Ser19' O      2.17      3.10      149      150      150
Mean*:                    2.10      3.00      152
Standard Deviation*:      0.16      0.15       17

B. Hydrogen Bonds involving Side Chain Atoms.

Donor       Acceptor      H..A (Å)  D..A (Å)  DHA (°)  CAH (°)  CAD (°)
Ile1' N+    Ser195 OG     2.45      3.06      117      126      132
Ile1' N     Ser195 OG     2.26      3.06      132      121      132
Val21' N    Glu217 OE1    2.10      3.05      163      133      128
Lys60F NZ   Glu49' OE1    2.16      3.20      176       94       94
Arg73 NH2   Asp55' OD1    1.99      2.93      142      101      110
Arg73 NH1   Asp55' OD2    2.32      3.22      153      123      116
Tyr76 N     Glu57' OE2    1.89      2.89      169      130      129
Mean*:                    2.12                156
Standard Deviation*:      0.16                 17

A consideration of the relative spatial positioning of the hirudin N-terminal tripeptide and PPACK in the thrombin active site reveals both similarities and differences. The side chain atoms of Ile1' and Tyr3' occupy basically the same spatial position in the active site region as do Pro and D-Phe of PPACK, respectively. The isoleucyl side chain sits in the S2 subsite of thrombin and makes numerous hydrophobic contacts with the thrombin residues His57, Tyr60A, Trp60D, Leu99, and Trp215. The aromatic moiety of Tyr3' is in the hydrophobic cleft defined by Leu99, Ile174 and Trp215.
This hydrophobic region is likely the apolar binding site for indole reported by Berliner and Shen [40], elaborated upon by others [41,47,58], and delineated by Bode et al. [6] in the PPACK-thrombin structure. The nonpolar contacts of the Tyr3' side chain with thrombin are also complemented by those involving nearby residues in the hirudin molecule: in all, this tyrosyl group is involved in eight van der Waals contacts of less than 4.0 Å with the side chains of Ile1', Leu15', and Val21'. The side chain of Thr2' penetrates only the edge of the specificity pocket and lies well below the point of entry of the arginine of PPACK.

The fact that hirudin interacts in such a novel fashion in the thrombin active site, coupled with the fact that it displays no close sequence or topological similarity to the ten existing classes of serine proteinase inhibitors [73], suggests that hirudin truly represents a heretofore unknown family of inhibitors [72]. It is likely that all naturally occurring hirudins interact in the active site as does rHV2-K47, since they all possess Ile-Thr-Tyr or Val-Val-Tyr NH2-terminal sequences [61]. Furthermore, the importance of maintaining nonpolar character in the first two positions has been demonstrated experimentally in binding studies with specific 1-2 position site-directed mutants [90]; these experiments showed that replacing the N-terminal Val-Val residues with polar amino acids resulted in an increase in the inhibition constant (Ki), while replacement of these residues with aromatic or hydrophobic amino acids had little effect on the value of Ki. This is consistent with the nonpolar character of the S2 subsite. More difficult to reconcile with the crystal structure is experimental evidence obtained by Stone, Braun, and Hofsteenge which supports the importance of having a positive charge on the hirudin amino-terminus [81].
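The H..A distances, D..A distances and DHA angles tabulated in Table 17 are simple functions of the donor, hydrogen and acceptor positions. The sketch below shows one way such values could be computed once a hydrogen position has been assigned; the placement of the idealized hydrogen is not reproduced here, and the function name and argument layout are assumptions of this example.

```python
import math

def hbond_geometry(donor, hydrogen, acceptor):
    """Return (H..A distance, D..A distance, D-H..A angle in degrees)
    for three (x, y, z) positions given in Å."""
    h_to_d = tuple(d - h for d, h in zip(donor, hydrogen))
    h_to_a = tuple(a - h for a, h in zip(acceptor, hydrogen))
    len_hd = math.sqrt(sum(c * c for c in h_to_d))
    len_ha = math.sqrt(sum(c * c for c in h_to_a))
    dot = sum(p * q for p, q in zip(h_to_d, h_to_a))
    # Clamp to guard against rounding slightly outside [-1, 1]
    cos_angle = max(-1.0, min(1.0, dot / (len_hd * len_ha)))
    d_to_a = math.dist(donor, acceptor)
    return len_ha, d_to_a, math.degrees(math.acos(cos_angle))
```

A linear arrangement gives a DHA angle of 180°; moving the hydrogen off the donor-acceptor line changes the H..A distance and the DHA angle while the D..A distance stays fixed, which is why two hydrogen placements on the same donor can be compared by H..A distance and DHA angle alone.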
In Table 17, appropriate hydrogen bond distances and angles are presented for the two amino-terminal Ile1' N hydrogen bonds to Ser214 O and Ser195 OG, assuming both a protonated and an unprotonated hirudin N-terminus. The significantly improved H..A distance (2.26 Å vs. 2.46 Å) and DHA angle (132° vs. 117°) for the Ile1' N-Ser195 OG hydrogen bond with an unprotonated amino-terminal nitrogen support the notion that Ile1' N is uncharged. Since His57 of the catalytic site is also protonated at pH 4.7, a neutral hirudin amino-terminus could alleviate a positive charge build-up in the active site region. Moreover, a protonated Ile1' nitrogen would be only 3.23 Å from a protonated His57 NE2 in the crystal structure.

Although most of the NH2-terminal domain of hirudin is not in contact with the thrombin surface (Figure 43), many hydrophobic and polar interactions exist at the interface of the two. For instance, three hirudin residues participate in nonpolar interactions at this interface (Figure 46). Residues Leu13' and Pro46' form hydrophobic contacts of 3.34 Å and 3.73 Å, respectively, with Pro60C of the nine-residue 60A-60I insertion loop, which narrows the active site cleft [6]. In addition, Val21' has two van der Waals contacts of less than 4.0 Å with Ile174. Two ion pair and two hydrogen bond interactions are also found at the interface (Figure 47). The salt bridges are Asp5'-Arg221A (3.6 Å) and Glu17'-Arg173 (5.0 Å). Hydrogen bonds occur between Lys224 NZ and Ser19' OG (3.10 Å) and between Val21' N and Glu217 OE1 (3.05 Å) (Table 17). In all, of the 48 amino acids in the hirudin N-terminal domain, 15 residues are directly involved in contacts of less than 4.0 Å with thrombin.

Figure 46. Stereoview of Hydrophobic Contacts of the N-terminal Hirudin Domain with Thrombin. Hirudin residues in bold. Residues in figure: Ile174, Pro60C, Leu13', Val21', and Pro46'.

Figure 47. Stereoview of Electrostatic Interactions of the N-terminal Hirudin Domain with Thrombin. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, and ion pairs by +/-.

A number of water-mediated hirudin-thrombin and hirudin-hirudin hydrogen bond interactions involve residues in the N-terminal hirudin domain (Tables 18, 19, and 20). The carbonyl oxygen of Gly18', the side chain oxygens of Asp5' and Asn20', and the ring hydroxyl group of Tyr3' all interact with thrombin residues via water molecules (Tables 18 and 19). In particular, while Asp5' and Arg221A appear to form a strong salt bridge directly, a nearly pyramidal-coordinate water molecule in the vicinity, W622, appears to bridge this contact as well as bond to Gly18' O (Figure 48). The result is that the water molecule makes three hydrogen bonds, to Asp5' OD2 (2.56 Å), Gly18' O (2.87 Å), and Arg221A NH2 (2.65 Å). The remaining hydrogen-bonding interactions with thrombin through water molecules are Tyr3'-Tyr60A and Asn20'-Arg173. There are also several long range (d > 3.2 Å) water-mediated hirudin N-terminal domain-thrombin interactions which involve the following residues: Leu13'-Trp60D; Asn20' with Thr172-Arg173, Glu217, and Lys224; and Val21'-Arg173. Lastly, there are four solvent-bridged hydrogen bond interactions between residues within the domain: Thr4'-Leu13'; Asp5'-Gly18'; Leu13'-Lys47'; and Glu17'-Asn20' (Table 20).

The tripeptide segment Pro46'-Lys47'-Pro48' appears to be important in facilitating the binding of the hirudin N-terminus in the active site region of thrombin (Figure 49). However, it is not because this stretch of residues resembles a thrombin substrate-like sequence [164] or because the Cys39'-Pro48' segment bears 50% sequence homology to residues Arg148-Ser157 of the thrombin cleavage site on prothrombin [86].
The hydrogen bonds made by the ε-amino group of Lys47' to Thr4' OG1 and Asp5' O, as well as the solvent-bridged interaction with Arg221A, could help stabilize the active site binding by "clamping down" on the 4'-5' dipeptide. Moreover, the van der Waals contact between Pro46' and Pro60C, and the polar contacts between Lys47' and Trp60D, could help anchor the N-terminal domain in the proper position for the amino-terminal tripeptide to associate intimately with active site region residues. It is of note that Lys47' has often been viewed as the P1 residue of substrate in the hirudin-thrombin complex [67,70,87]. Though the lysyl side chain of 47' does not occupy the specificity pocket of thrombin and is located approximately 11 Å away (Figure 49), it appears to be a factor in the hirudin-active site region binding.

Table 18. Hirudin-Water Molecule Hydrogen Bonds in the Complex. Hydrogen atoms are assigned geometrically idealized positions. Donor atoms are denoted 'D', acceptor atoms 'A'.

Donor    Acceptor      H..O (Å)  D..A (Å)  DHA (°)  COH (°)
Thr2'    W498          2.13      3.05      173
Thr2'    W410          1.94      2.85      163
Tyr3'    W606          2.15      3.00      152
Thr4'    W469          1.82      2.72      147
Thr4'    W612          2.13      3.10      173
Asp5'    W534          2.07      3.05      168
Glu17'   W502          2.04      2.99      168
Lys24'   W673          2.14      3.08      162
Gly25'   W493          1.79      2.70      163
Glu58'   W722          1.95      2.99      160
Glu62'   W472          2.00      2.86      146
Tyr63'   W657          1.83      2.71      154
W401     Glu57' O                2.36               133
W467     Glu58' OE2              2.75               121
W469     Leu13' O                2.78               133
W472     Glu62' OE2              2.86               101
W483     Thr2' OG1               2.40               121
W502     Asn20' O                2.85               126
W508     Glu49' OE1              3.17               104
W541     Leu15' O                3.08               135
W543     Asp55' O                2.56               150
W573     Asn20' OD1              2.33               130
W622     Asp5' OD2               2.56               121
W622     Gly18' O                2.88               111
W623     His51' O                3.03                89
W672     Leu13' O                3.07               119
W672     Lys47' O                2.56               139
W676     Glu58' OE1              2.30               144
W726     Cys6' O                 2.75               157

Table 19. Solvent-Bridged Hirudin-Thrombin Hydrogen Bond Interactions. The protein atoms involved in the interaction are represented as 'A' and 'B', and the mediating water 'W'.

A            Water Molecule  B              A..W (Å)  W..B (Å)
Tyr3' OH     W606            Tyr60A OH      3.00      2.38
Asp5' OD2    W622            Arg221A NH2    2.56      2.66
Gly18' O     W622            Arg221A NH2    2.88      2.66
Asn20' OD1   W573            Arg173         2.33      2.84
Glu49' OE1   W508            Arg35          3.17      2.93
Glu57' O     W401            Arg67 NH1      2.36      3.00
Glu57' O     W401            Arg67 NH2      2.36      2.94
Glu58' OE2   W467            Arg77A NE      2.75      2.94

Table 20. Solvent-Bridged Intramolecular Hydrogen Bond Interactions in Hirudin. The protein atoms involved in the interaction are represented as 'A' and 'B', and the mediating water 'W'.

A            Water Molecule  B              A..W (Å)  W..B (Å)
Thr4' N      W469            Leu13' O       2.72      2.78
Asp5' OD2    W622            Gly18' O       2.56      2.88
Leu13' O     W672            Lys47' O       3.07      2.56
Glu17' N     W502            Asn20' O       2.99      2.85
Glu62' N     W472            Glu62' OE2     2.86      2.86

Figure 48. Stereoview of a Hirudin-Thrombin Water Mediated Interaction. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, ion pairs by +/-, the water molecule by *.

Figure 49. Stereoview of Hirudin Pro46'-Lys47'-Pro48' and the Hirudin N-Terminus-Thrombin Active Site Interaction. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, ion pairs by +/-, and water molecules by *.

The COOH-terminal tail in hirudin, in contrast to the NH2-terminal head, adopts an unusual, extended conformation in the complex (Figure 43). By means of this extended conformation, residues of the tail can interact intimately with a wealth of residues on the thrombin surface; in all, the side chains of 12 of 17 hirudin C-terminal domain residues are involved in electrostatic or nonpolar contacts with thrombin. The region of thrombin where these residues bind has become known as the anion-binding exosite [47]. This region, which is defined by the loop segments Phe34-Ser41 and Lys70-Glu80 [6], is particularly rich in charged side chains, especially positively charged arginyl and lysyl side chains [165,166] (Figure 50).
Another extensive and highly electropositive exosite appears to exist on the opposite side of the thrombin molecule and consists of Lys87, Arg93, Arg97, Lys169, Arg175, His230, Arg233, Lys236, and Lys240 (Figure 51); this might be the site of heparin binding, since heparin binds at an exosite different from that of fibrinogen [45].

Figure 50. Charged Residues of the Anion-Binding Exosite Region of Thrombin. Thrombin CA structure only with thrombin disulfides and charged anion-binding exosite residues in bold.

Figure 51. Basic Residues of the Postulated Heparin Binding Site of Thrombin. Thrombin CA structure only with thrombin disulfides and basic residues of heparin binding site in bold.

In the first segment of the COOH-terminal hirudin tail, the initial three residues, Glu49'-Ser50'-His51', are involved primarily in electrostatic contacts with thrombin (Figure 52). Residue Glu49' forms a hydrogen-bonding ion pair with Lys60F (3.20 Å) (Table 17) and interacts with Arg35 via the water molecule W508 (Tables 18, 19). The OG of Ser50' appears to be involved in close, bifurcated polar contacts of 3.36 Å and 3.68 Å with the side chain oxygen atoms of Glu192. Also, the imidazolium ion of His51' is involved in an ion pair with Glu39 of thrombin (His51' ND1-Glu39 OE2, 3.0 Å). The final three residues of this C-terminal tail segment, Asn52'-Asn53'-Gly54', are poorly defined in the electron density maps.

The second segment of the hirudin COOH-terminal tail, Asp55'-Gln65', can be viewed as being composed of a 15 Å extended stretch of polypeptide from Asp55'-Pro60' followed by a type III 3₁₀ helical reverse turn from Glu61'-Leu64'. Impressively, nine of the 11 residues in this hirudin tail region interact with residues on the thrombin surface. Four of the last eleven residues in the hirudin C-terminal tail participate in electrostatic contacts with thrombin (Figure 53).
The carboxylate of Asp55' is involved in a double hydrogen-bonding salt bridge with Arg73 (Arg73 NH2-Asp55' OD1, 2.93 Å; Arg73 NH1-Asp55' OD2, 3.22 Å) (Table 17) and forms an ion pair of 4.9 Å with Lys149E. The Asp55'-Lys149E salt bridge is in agreement with a successful lysine cross-linking experiment carried out using a synthetic dinitrofluorobenzyl-Gly54-Leu64 peptide [167]. In addition, Lys149E is the position of the γ-cleavage site in thrombin [48,82] (Figure 10).

Figure 52. Hirudin(Glu49'-Ser50'-His51')-Thrombin Interactions. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, ion pairs by +/-, and water molecules by *.

Figure 53. Electrostatic Hirudin(Asp55'-Gln65')-Thrombin Interactions. Hirudin residues in bold. Hydrogen bonds indicated by dashed lines, ion pairs by +/-, and water molecules by *.

The residue Glu57' is engaged in hydrogen bonds with main chain atoms of residues Thr74 (Glu57' N-Thr74 O, 2.93 Å) and Tyr76 (Tyr76 N-Glu57' OE2, 2.89 Å) (Table 17), and has 10 contacts of less than 4.0 Å with main chain and side chain carbon atoms of Arg75. The carbonyl oxygen of Glu57' also interacts with the Arg67 side chain atoms NH1 and NH2 through the water molecule W401 (Table 19). However, the most interesting aspect of this residue is that it participates in a double hydrogen-bonding ion pair with Arg#75 of a symmetry-related molecule in the crystal (Arg#75 NE-Glu57' OE1, 2.81 Å; Arg#75 NH2-Glu57' OE2, 2.93 Å); the proximity of Arg75 to Glu57' suggests that this salt bridge could likely occur within the complex in solution. The Glu57'-Glu58' residues are also involved in an electrostatic interaction with Arg77A, which is the position of the β-cleavage site in thrombin [48,52].
The side chain of Glu58' forms an ion pair with Arg77A directly, while Glu57' can interact with the arginyl group through the mediating water W467, although the distance is long for a hydrogen bond (Glu57' OE2-W467, 3.72 Å). Finally, the C-terminal carboxylate of Gln65' forms an ion pair of 3.33 Å with the ε-amino nitrogen of Lys36. Glutamate residues 61' and 62' have no interactions with thrombin; this was somewhat surprising, as site-directed mutagenesis studies have shown that single and multiple mutations of the carboxy-terminal glutamate residues 57', 58', 61', and 62' to glutamine caused increases in the hirudin-thrombin dissociation constant [87]. The water molecule W472 appears to bridge an intramolecular and intraresidue hirudin hydrogen bond interaction between Glu62' OE2 and Glu62' N (Table 20).

Figure 54. Hirudin Phe56' in a Hydrophobic Thrombin Cavity. Phe56' in bold.

An unexpected number of van der Waals contacts also occur between the latter half of the extended C-terminal tail and the anion-binding exosite. The residue Phe56' penetrates into a depression on the thrombin surface and makes 11 close contacts with thrombin residues: one with the sulfur of Met32, three with Phe34, one with Leu40, and seven with Thr74 (Figure 54). Most notably, as the planes of Phe56' and Phe34 are nearly perpendicular to one another, edge-on aromatic stacking occurs between the two phenyl groups (Figure 54). There is also a substantial concentration of nonpolar hirudin-thrombin interactions in the vicinity of the type III helical reverse turn (Figure 55). Hirudin residue Ile59' has three close intramolecular contacts with the side chain of Leu64', as well as one contact each with the side chain atoms of Leu65 and Ile82. Residue Pro60' is involved in five intermolecular contacts of less than 4.0 Å with Tyr76 and has one close contact with Ile82.
The residue Tyr63', which possesses an aryl sulfate in natural hirudin, participates in a total of five close hydrophobic approaches with Leu65 and Ile82. As a result of the type III 3₁₀ helical reverse turn, Leu64' makes two close contacts each with Ile59' and main chain atoms of Glu61', as well as one close contact with the side chain of Leu65 of thrombin. The interactions in the vicinity of the turn terminate with the Gln65'-Lys36 ion pair.

Figure 55. Hirudin-Thrombin Interactions near the Type III Helical Reverse Turn of Hirudin. Hirudin residues in bold, with N, CA, C of hirudin bolder. Ion pairs indicated by +/-.

As the lack of a sulfated tyrosine at position 63' in hirudin has been shown to lead to a six-fold increase in the dissociation constant of the hirudin-thrombin complex with respect to natural Tyr63'-sulfated hirudin [86], it is of some importance to consider residues in the vicinity of a sulfated Tyr63' which could conceivably be involved in a stabilizing interaction. Initially, it seemed reasonable that the nearby thrombin residues Lys81 and Lys109-Lys110 could form ion pairs with a sulfated residue via free bond rotation. In the crystal complex, the ε-amino nitrogen of Lys81 forms a hydrogen bond with the carbonyl oxygen of Ala113. The terminal amino groups on the side chains of Lys109 and Lys110, on the other hand, are not engaged in stabilizing interactions, making these two lysyl side chains more likely participants in an ion pair with a sulfated Tyr63'. However, the structure of the hirugen-thrombin complex at 2.2 Å resolution [168] (hirugen is Asn54'-Leu64' with a sulfated Tyr63') reveals that the oxygen atoms of the Tyr63' sulfate make hydrogen bonds to Tyr76 OH and Ile82 N, and that no definitive ion pairs occur.
Moreover, the spatial similarity of Tyr63' and Tyr76 in the hirudin-thrombin structure to the corresponding tyrosine rings in the hirugen-thrombin structure suggests that this interaction is likely responsible for the additional binding affinity observed in a sulfated hirudin-thrombin complex.

The aforementioned electrostatic interactions certainly support experimental evidence that acidic residues of the hirudin C-terminal tail [70] and basic residues of thrombin [18,165-166] are important to hirudin binding. Recently, Chang has identified inaccessible lysyl residues in hirudin-blocked thrombin (lysines 36, 60F, 70, 109, 110, 149E) and additionally implied the participation of the arginine residues 67, 73, 75, and 77A in the interaction [166]. The involvement of the lysine residues 36, 60F, 149E, and the arginine residues 67, 73, 75, and 77A in electrostatic interactions with hirudin has already been mentioned. However, the side chains of Lys109 and Lys110 are completely exposed to solvent and have no interactions with hirudin. Moreover, while Lys70 has no contacts with hirudin in the crystal structure of the complex, the Lys70 ε-amino nitrogen (NZ) is involved in two internal ion pair interactions, with Glu77 (3.5 Å) and with Glu80 (3.4 Å). Similar lysyl inaccessibility studies have been carried out by reductive methylation which imply hirudin protection for the lysines 36, 60F, 70, 81, 145, and 224 [169]. Of these newly implied lysine residues, Lys81 forms an internal hydrogen bond with Ala113 (Lys81 NZ-Ala113 O, 2.73 Å), Lys145 interacts with Glu18 through a solvent-bridged hydrogen bond interaction (Lys145 NZ-W705, 2.87 Å; W705-Glu18 OE2, 2.80 Å), and Lys224 NZ forms a hydrogen bond with Ser19' O (3.10 Å) of hirudin (Table 17), as well as internal hydrogen bonds with Ser171 O (2.76 Å) and with Glu217 OE2 (2.60 Å).
The COOH-terminal tail of hirudin appears to be firmly anchored to thrombin at the end of the anion-binding exosite through hydrophobic interactions of the helical turn and the terminal Gln65'-Lys36 salt bridge. Although removal of Gln65' has little effect on the inhibition or dissociation constant Ki (it increases by a factor of 1.3), deletion of Tyr63'-Leu64' raises the Ki by a factor of 40 [170]. The bulky 3₁₀ helical turn may very well be important for hirudin-thrombin recognition, but whether this region of thrombin is part of the fibrinogen binding site remains uncertain.

An aspect of the hirudin COOH-terminal tail-thrombin interaction in the anion-binding exosite region to emerge in importance as a result of the structure solution of the complex is the extent of the hydrophobic contribution. In the latter half of the C-terminal hirudin tail (Asp55'-Gln65'), nearly half of the residues (five of 11) are hydrophobic or aromatic in nature, and all five are engaged in nonpolar hirudin-thrombin interactions. The importance of these interactions to the anticoagulant activity of hirudin C-terminal tail analogues (like hirugen) has already become evident through thrombin binding studies which indicate that the minimal peptide necessary for detection of anticoagulant activity is Phe56'-Gln65' and, furthermore, that the activity is sensitive to modification of residues Phe56', Ile59', Pro60', Leu64', as well as Glu57' [89,95]. Based on the information in the crystal structure of the rHV2-K47-thrombin complex, it is clear that the hydrophobic contacts are at least as important as the polar interactions in the latter half of the hirudin COOH-terminal tail.
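Throughout this section a "close contact" means an interatomic distance of less than 4.0 Å between a hirudin atom and a thrombin atom. A minimal sketch of how such contacts could be enumerated from two atom lists is given below; the data layout (a label paired with a coordinate tuple) and the function name are assumptions of this example, not a description of the program actually used.

```python
import math

def close_contacts(atoms_a, atoms_b, cutoff=4.0):
    """List every (label_a, label_b, distance) pair whose interatomic
    distance is below the cutoff (Å).

    Each atom is a (label, (x, y, z)) tuple; the labels are free text."""
    contacts = []
    for label_a, xyz_a in atoms_a:
        for label_b, xyz_b in atoms_b:
            distance = math.dist(xyz_a, xyz_b)
            if distance < cutoff:
                contacts.append((label_a, label_b, distance))
    return contacts
```

Grouping the returned pairs by hirudin residue gives per-residue and per-segment contact totals of the kind discussed in the text. The all-pairs loop is quadratic in the number of atoms, which is adequate for a single complex of this size.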
It is likely that hirudin binds with such high affinity to thrombin, forming tight, noncovalent complexes with dissociation constants in the pico- to femtomolar range [59,71], because as it meanders across the thrombin surface it is able to interact with a wealth of residues in a very intimate manner (Figure 56). All in all, 26 of the 65 residues of hirudin and 33 of the 259 residues in the thrombin B-chain are involved in 216 contacts of < 4.0 Å, of which eight are ion pairs and 12 are hydrogen bonds. A summary of the hirudin-thrombin interactions is presented in Table 21. The hirudin N-terminal tripeptide is engaged in a remarkable 41 close contacts which are largely hydrophobic or neutral (30 out of 41 or 73%), as well as five hydrogen bonds. The remainder of the N-terminal hirudin domain, residues Thr4'-Pro48', has 62 close contacts with thrombin in which there are two hydrogen bonds, two ion pairs, and four nonpolar or neutral contacts. Thus, the N-terminal domain is responsible for 103 or 47% of the close contacts with thrombin; it is, however, important to also recognize that 41 or 40% of these 103 contacts are concentrated in the first three residues of hirudin. As the C-terminal tail begins, there are 32 close contacts within the first three residues, Glu49'-His51', including two ion pair interactions and one hydrogen bond, as well as 19 close contacts along the length of the Glu49' and Trp60D side chains. The last three residues of this first segment of the C-terminal tail, Asn52'-Gly54', have poorly defined density. As the latter half of the C-terminal tail begins, there is again a large number of interactions within a small number of residues; 60 close contacts occur between the Asp55'-Glu58' stretch and thrombin. Among these contacts, the residues Asp55', Glu57', and Glu58' are largely involved in electrostatic interactions -

Figure 56. Space-filling Drawing of the rHV2-K47-Human α-Thrombin Complex. Hirudin in sky blue.
[Table 21. Hirudin-Thrombin Interactions (contacts < 4.0 Å); table body illegible in the source scan.]

[...] hirudin-thrombin interaction, one can examine results obtained from solvent accessible surface calculations; these calculations were performed using a probe radius of 1.4 Å [171] on hirudin, thrombin, and the hirudin-thrombin complex, and the information gathered from this work is summarized in Tables 22-24. In its complex with thrombin, hirudin blocks 11.9% of the thrombin solvent accessible surface, masking 9.9% of the main chain atom accessibility and 12.4% of the side chain atom accessibility (Table 22). This result is not surprising, for, as Figure 56 clearly indicates, hirudin covers only a small fraction of the large, nearly spherical thrombin surface. More impressive are the results presented in Tables 23 and 24. The information in Table 23 indicates that in the complex, thrombin masks 37.7% of the total solvent accessibility of hirudin, in which there is a 23.6% loss in main chain atom accessibility and a 42.4% loss of side chain atom accessibility.
In addition, Table 24 reveals that the bulk of this loss is concentrated in the same small stretches of hirudin polypeptide which had a large number of close thrombin contacts in Table 21. The hirudin N-terminal tripeptide, which displays the highest concentration of close thrombin contacts, also shows the greatest loss in solvent accessibility due to complexation: a 90.7% loss in total accessibility, with losses of 85.4% and 92.6% in main chain atom and side chain atom accessibility, respectively. As most of the remainder of the hirudin N-terminal domain, Thr4'-Pro48', is located away from the thrombin surface, this region shows the lowest loss of accessibility due to complex formation; among these residues there is a loss of only 20.2% in total atom accessibility (7.2% for main chain atoms and 25.0% for side chain atoms). The two short C-terminal hirudin polypeptide stretches which display a high concentration of thrombin interactions in Table 21 (Glu49'-His51' and Asp55'-Glu58') also show large losses in accessibility due to complexation. In the Glu49'-His51' and Asp55'-Glu58' segments there are 46.8% and 56.4% losses in total surface accessibility, respectively. Most notable, however, is the high loss in side chain atom accessibility in these regions, which is 51.0% (Glu49'-His51') and 62.3% (Asp55'-Glu58'), and is consistent with the wealth of interactions involving the side chains of these residues. Finally, in the vicinity of the type III helical reverse turn of hirudin, one sees a modest 35.7% total loss in surface accessibility (34.5% main chain; 36.0% side chain) resulting from complexation. This result is likely due to the fact that intramolecular van der Waals contacts between side chain atoms both in and near the turn restrict access to residues in this region.

Table 22. Solvent Accessible Surface Area of Thrombin Residues Alone and in Complex with Hirudin. Abbreviations used: TASA: total atom surface accessibility; MASA: main chain atom surface accessibility; SASA: side chain atom surface accessibility.

        Thrombin Alone   Thrombin in the             Loss of Thrombin Residue Accessibility
        (Å²)             Hirudin-Thrombin Complex    Due to Complexation with Hirudin
                         (Å²)                        (Å²) (%)
TASA    13,669.3         12,044.8                    1,624.5 (11.9%)
MASA     2,650.0          2,388.9                      261.1 (9.9%)
SASA    11,019.3          9,655.9                    1,363.4 (12.4%)

Table 23. Solvent Accessible Surface Area of Hirudin Residues Alone and in Complex with Thrombin. Abbreviations used are as defined in Table 22.

        Hirudin Alone    Hirudin in the              Loss of Hirudin Residue Accessibility
        (Å²)             Hirudin-Thrombin Complex    Due to Complexation with Thrombin
                         (Å²)                        (Å²) (%)
TASA    4,957.3          3,090.6                     1,866.7 (37.7%)
MASA    1,243.0            949.7                       293.3 (23.6%)
SASA    3,714.3          2,140.9                     1,573.4 (42.4%)

Table 24. Percent Loss of Hirudin Solvent Surface Accessibility Due to Complexation with Thrombin. Results presented with regard to hirudin residue ranges. Abbreviations used are as defined in Table 22.

Hirudin Residue Range    %Loss TASA    %Loss MASA    %Loss SASA
1-3                      90.7          85.4          92.6
4-48                     20.2           7.2          25.0
49-51                    46.8          30.0          51.0
55-58                    56.4          31.9          62.3
59-65                    35.7          34.5          36.0

A listing of the residue surface accessibility of free hirudin is presented in Table 25. The table includes the total, main chain, and side chain atom surface accessibilities for each residue present in the crystal structure, as well as two ratios: the total atom accessibility (TASA) divided by the theoretical side chain atom accessibility (SASAt), and the side chain atom accessibility (SASA) divided by the theoretical side chain atom accessibility. A TASA/SASAt ratio greater than one implies that there exists at least some main chain atom accessibility, while the SASA/SASAt ratio indicates the fraction of actual side chain atom accessibility with regard to the maximum possible. In general, residues with high TASA/SASAt and SASA/SASAt ratios are solvent accessible, and many of these participate in contacts in the complex.
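The percentages in Tables 22 and 23 and the two ratios described above are simple bookkeeping on surface areas. The sketch below recomputes the losses from the tabulated areas and shows how the ratios are formed; the function names are illustrative, and the theoretical side chain accessibility passed to `accessibility_ratios` is a placeholder argument, since SASAt values are not listed in this section.

```python
def percent_loss(free_area, complexed_area):
    """Return (loss in square angstroms, loss as a percentage of the free area)."""
    loss = free_area - complexed_area
    return loss, 100.0 * loss / free_area

def accessibility_ratios(masa, sasa, sasa_t):
    """Return (TASA/SASAt, SASA/SASAt), taking TASA as MASA + SASA."""
    tasa = masa + sasa
    return tasa / sasa_t, sasa / sasa_t

# (area alone, area in the complex), copied from Tables 22 and 23.
TABLE_22_THROMBIN = {"TASA": (13669.3, 12044.8),
                     "MASA": (2650.0, 2388.9),
                     "SASA": (11019.3, 9655.9)}
TABLE_23_HIRUDIN = {"TASA": (4957.3, 3090.6),
                    "MASA": (1243.0, 949.7),
                    "SASA": (3714.3, 2140.9)}

if __name__ == "__main__":
    for label, table in (("thrombin", TABLE_22_THROMBIN), ("hirudin", TABLE_23_HIRUDIN)):
        for kind, (alone, in_complex) in table.items():
            loss, pct = percent_loss(alone, in_complex)
            print(f"{label} {kind}: loss {loss:.1f} A^2 ({pct:.1f}%)")
```

Because the main chain contribution is included in TASA but not in SASAt, the first ratio can exceed one, which is the property the text uses to detect main chain exposure.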
It is of note that while the mean values of the TASA/SASAt and SASA/SASAt ratios for most of the hirudin residues in the complex were 0.57 (+/- 0.33) and 0.47 (+/- 0.27), respectively, (residues poorly defined in electron [...]

Table 25. Residue Surface Accessibility of Free Hirudin. Conformation of hirudin as it appears in the complex. Abbreviations used: SASAt = theoretical side chain atom surface accessibility. All other abbreviations used are as defined in Table 22. All surface accessibility results reported in Å². SASA/SASAt ratios indicate how the actual side chain accessibility compares to the maximum possible. TASA/SASAt ratios greater than one indicate that there is at least some main chain accessibility. These ratios are not given for glycines, as glycine has no non-hydrogen side chain atoms. A '*' preceding a line indicates that the values given are susceptible to error due to disorder in or near these residues. A '+' preceding a line indicates a residue which has at least one direct contact of less than 4.0 Å with thrombin.

                                    TASA/   SASA/
  Res.       MASA    SASA    TASA   SASAt   SASAt
+ Ile1'      72.7   125.2   197.9    1.14    0.72
+ Thr2'      29.0    94.9   123.9    0.98    0.75
+ Tyr3'       8.9    84.2    93.1    0.43    0.39
+ Thr4'       3.1    61.6    64.7    0.51    0.49
+ Asp5'       1.0    90.5    91.5    0.73    0.72
  Cys6'      11.3     0.2    11.5    0.10    0.00
  Thr7'      23.7    66.9    90.6    0.72    0.53
  Glu8'       3.8    95.4    99.2    0.63    0.60
  Ser9'      25.2    42.5    67.7    0.71    0.45
  Gly10'     19.6     0.0    19.6     ***     ***
  Gln11'      0.0     2.4     2.4    0.01    0.01
  Asn12'      0.0     4.6     4.6    0.03    0.03
+ Leu13'      7.6    39.3    46.9    0.28    0.24
  Cys14'      0.0     0.0     0.0    0.00    0.00
+ Leu15'      3.8    34.2    38.0    0.23    0.20
  Cys16'     10.1     0.0    10.1    0.09    0.00
+ Glu17'      7.6    70.5    78.1    0.49    0.45
+ Gly18'     62.9     0.0    62.9     ***     ***
+ Ser19'     23.8    77.3   101.1    1.06    0.81
+ Asn20'      2.9    98.8   101.7    0.72    0.70
+ Val21'     14.8    47.5    62.3    0.44    0.33
  Cys22'      0.2     0.0     0.2    0.00    0.00
  Gly23'     30.2     0.0    30.2     ***     ***
+ Lys24'     21.5   141.2   162.7    0.83    0.72
  Gly25'     44.9     0.0    44.9     ***     ***
  Asn26'      0.0    44.1    44.1    0.31    0.31
  Lys27'      0.0    81.1    81.1    0.41    0.41
  Cys28'      0.0     0.0     0.0    0.00    0.00
  Ile29'      0.8    45.6    46.4    0.27    0.26
  Leu30'     19.2    42.4    61.6    0.37    0.25
* Gly31'     77.3     0.0    77.3     ***     ***
* Gly36'    102.4     0.0   102.4     ***     ***
  Asn37'     15.7    37.0    52.7    0.38    0.26
  Gln38'      0.0   104.7   104.7    0.64    0.64

Table 25 (cont'd.)

                                    TASA/
  Res.       MASA    SASA    TASA   SASAt
  Cys39'     16.9    20.4    37.3    0.33
  Val40'      4.4    28.8    33.2    0.23
  Thr41'      7.3   116.6   123.9    0.98
  Gly42'     24.7     0.0    24.7     ***
  Glu43'     19.4    70.3    89.7    0.57
  Gly44'     13.1     0.0    13.1     ***
  Thr45'      0.1    94.7    94.8    0.75
  Pro46'      2.9    53.4    56.3    0.41
  Lys47'     11.1    50.8    61.9    0.32
  Pro48'     17.7    85.0   102.7    n.l.
  Glu49'     27.8   110.8   138.6    0.88
  Ser50'     15.0    72.2    87.2    0.92
  His51'     33.5   121.2   154.7    n.l.
  Asn52'     28.8    93.5   122.3    n.l.
  Asn53'     44.0    59.8   103.8    0.74
  Gly54'     62.2     0.0    62.2     ***
  Asp55'     38.0    95.3   133.3    1.07
  Phe56'     30.1   127.8   157.9    0.79
  Glu57'     18.4   127.1   145.5    0.92
  Glu58'     28.3   125.2   153.5    0.97
  Ile59'      9.5    82.6    92.1    0.53
  Pro60'     12.1    64.6    76.7    0.56
  Glu61'      4.2   122.2   126.4    0.80
  Glu62'     26.8   103.7   130.5    n.l.
  Tyr63'     25.7   122.0   147.7    0.68
  Leu64'     30.2    80.3   110.5    0.66
  Gln65'     80.0   122.4   202.4    1.24

[For the continuation of Table 25, the '+' markers, the SASA/SASAt column, the entries shown as n.l., and the mean and standard deviation rows (given for all residues and for all residues which have contacts of < 4.0 Å with thrombin) are not legible in the scan. Digits lost from the MASA and TASA columns are restored from the relation TASA = MASA + SASA, and lost leading digits of the TASA/SASAt entries from residues of the same type elsewhere in the table.]

[...] (> 0.45 Å for all main chain atoms, > 1.43 Å for all side chain atoms) were visually examined using computer graphics in the optimally superimposed thrombin structures with the aid of an Evans and Sutherland PS390 stereographics display and the PSFRODO version of the molecular graphics program FRODO. Following this extensive comparison, most of the significant structural differences can be accounted for in terms of the following factors or influences: (a) tentatively placed residues, poorly defined residues in the electron density of either structure, or alternate interpretations of density maps; (b) effect of crystal packing interactions; (c) interaction with hirudin; and (d) variability of surface residue conformation. Listings, with pertinent comments, of the residues possessing significant main chain and/or side chain r.m.s. deviations (> 1σ) believed to result from each of the four categories of factors or influences are presented in Tables 28 to 31. Inspection of Figures 62 and 63, as well as a more careful examination of Figure 61, reveals that the most outstanding main chain and side chain differences in the two structures occur at the N- and

Figure 62. Average R.M.S.
Deviation of Main Chain Atoms versus Residue Number for the Superimposed Thrombin Molecules of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. The horizontal lines represent 1σ and 2σ deviations. Most of the residues with deviations > 1σ are identified.

Figure 63. Average R.M.S. Deviation of Side Chain Atoms versus Residue Number for the Superimposed Thrombin Molecules of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. The horizontal lines represent 1σ and 2σ deviations. Many of the residues with deviations > 1σ are identified.

Table 28. Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] due to Residues Poorly Defined in Electron Density or Tentatively Placed in Either Structure, or due to Alternate Interpretations of the Density Maps. Optimal superpositioning of thrombin molecules as in Table 27. Abbreviations: HT = hirudin-thrombin complex; PT = PPACK-thrombin; MCA r.m.s.d. = main chain atom average root mean square deviation; SCA r.m.s.d. = side chain atom average r.m.s.d.; S = surface; AS = active site; conf. = conformation; rot. = rotation. Deviations > 1σ on main chain (0.45 Å) and/or side chain (1.43 Å) atoms are listed.

Residue(s)      Location  MCA       SCA       Comments
                          r.m.s.d.  r.m.s.d.
                          (Å)       (Å)
Gly1F           S         12.34     ----      --Gly1F-Gly1D undefined in PT & tentatively placed along B-chain [6].
Ser1E           S         13.16     19.94
Gly1D           S         9.71      ----
Ile14K          S         1.62      2.66      --Ile14K-Gly14M undefined in PT & arranged with regard to crystal packing [6].
Asp14L          S         5.28      3.63
Gly14M          S         11.13     ----
Asp60E          S         1.01      1.67      --no side chain density beyond atom CB in HT
Glu97A          S         1.29      ----      --no side chain density in HT
                                              new conf. for side chain
Leu99           AS        0.24      1.67      --approximate 180° rot. CB-CG bond; could be due to density map interpretation.
Lys110          S         1.50      5.00      --different main chain & side chain conf. due to cis-Pro111 in PT, trans-Pro111 in HT; density map interpretation difference.

Table 29. Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] due to Crystal Packing Interactions in the Complex. Optimal superpositioning of thrombin molecules as in Table 27. Abbreviations: as = aromatic stacking; hb = hydrogen bond; hip = hydrogen-bonding ion pair; hc = hydrophobic contact; pc = polar contact; sympos = symmetry molecule positions (see Table 26); wmi = water mediated interaction. All other abbreviations are defined in Table 28. Residues in brackets [] are not involved in crystal contacts but have significant conformational differences which could be due to packing interactions of nearby residues. A '*' denotes the average MCA and SCA r.m.s.d. for residues Asn143-Gln151. Table 32 gives a complete listing of the MCA and SCA r.m.s.d.'s for the residues. Deviations > 1σ on main chain (0.45 Å) and/or side chain (1.43 Å) atoms listed.

Residue(s)      Location  MCA       SCA       Packing Interactions/Comments
                          r.m.s.d.  r.m.s.d.
                          (Å)       (Å)
Gly1F           S         12.34     ----      --sympos-1,7: wmi O-W534-Glu#146 OE2.
Ser1E           S         13.16     19.94     --sympos-1,7: OG-Lys#47' NZ (pc) & OG-Thr#4' OG (pc)
Gly1D           S         9.71      ----      --sympos-1,7: Trp#148(hb)
[Glu1C]         S         6.79      7.92
Ala1B           S         1.91      0.55      --sympos-1,7: Trp#148(hb)
[Asp1A]         S         0.55      0.53
[Ser14I]        S         0.57      0.20
Tyr14J          S         0.74      0.97      --sympos-1,5: Tyr#134(as), Phe#204A(as).
Ile14K          S         1.62      2.66      --sympos-1,5: Leu#14G(hc), Ile#14K(hc).
Asp14L          S         5.28      3.63      --sympos-1,5: O-Arg#14D NH1 (pc)
[Gly14M]        S         11.13     ----
Arg75           S         0.61      1.96      --sympos-1,6: Glu#57'(hip)
Asn143-Gln151   S         3.99*     3.71*     --sympos-1,7: 8 hydrogen bonds involving these residues (see Table 26).
[Ser203]        S         0.65      1.03
Pro204          S         1.65      0.62      --sympos-1,5: Ile#14K(hc), Pro#204A(hc)
Phe204A         S         1.15      3.16      --sympos-1,5: Tyr#14J(hc), Leu#129C(hc), Tyr#134(hc), Pro#204(hc), Phe#204A(hc)
[Asn204B]       S         1.23      0.56
[Asn205]        S         0.54      1.04
[Ile242]        S         0.67      0.72
[Asp243]        S         1.19      5.35
Asn244          S         5.83      11.31     --sympos-1,7: Lys#145(hb), Thr#149(hb), Val#149C(hb).
Phe245          S         6.57      13.52     --sympos-1,7: Trp#148(as)

Table 30. Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] due to the Interaction with Hirudin. Optimal superpositioning of the thrombin molecules as in Table 27. All abbreviations are defined in Tables 28 and 29. Residues in brackets [] do not interact with hirudin, but have significant conformational differences which could be due to hirudin interactions of nearby residues. Deviations > 1σ on main chain (0.45 Å) and/or side chain (1.43 Å) atoms listed.

Residue(s)      Location  MCA       SCA       Interacting Hirudin Residues/Comments
                          r.m.s.d.  r.m.s.d.
                          (Å)       (Å)
Phe34           S         0.46      0.87      Phe56'(hc)
Arg35           S         0.68      1.05      Glu49' via W508
Lys36           S         0.68      0.82      Gln65'(ip)
Pro60C          S         0.85      0.85      Leu13'(hc), Lys24'(pc), Pro46'(hc)
Trp60D          S         1.15      0.87      Ile1'(hc), Lys47'(pc), Glu49'(mostly hc)
Lys60F          S         0.71      1.28      Glu49'(hb)
Arg75           S         0.61      1.96      Glu#57'(hip, sympos-1,6)
[Tyr76]         S         0.64      1.08
Arg77A          S         0.78      5.79      Glu58'(ip), Glu57' via W467
[Asn78]         S         0.61      1.01
Asn143-Gln151   S         3.99*     3.71*     Lg. shift possibly explains CD change occurring upon hirudin C-terminal peptide binding but not with PPACK.
Arg173          S         0.60      0.96      Glu17'(ip), Asn20'(pc)
Ile174          S         0.82      0.42      Val21'(pc)
[Arg175]        S         0.64      0.71
Cys191          AS        0.84      0.34      Thr2'(pc)
Glu192          AS        0.88      1.70      Thr2'(pc), Thr4'(pc), Ser50'(pc)
[Gly193]        AS        0.70      ----
[Asp194]        AS        0.60      0.75
Ser195          AS        0.50      0.77      Ile1'(hb)
[Gly196]        AS        0.49      ----
Gly219          AS        0.53      ----      Thr2'(pc), Tyr3'(hb), Leu15'(pc)

Table 31. Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] due to Surface Residue Conformational Variability.
Optimal superpositioning of the thrombin molecules as in Table 27. Abbreviations: fs = free side chain; ident. = identical. All other abbreviations defined in Tables 28 and 29. Deviations > 1σ on main chain (0.45 Å) and/or side chain (1.43 Å) atoms listed.

Residue(s)      Location  MCA       SCA
                          r.m.s.d.  r.m.s.d.
                          (Å)       (Å)
Pro5            S         0.54      0.75
Ser11           S         0.69      1.34
Leu12           S         0.67      1.11
Lys14A          S         0.34      1.38
Arg14D          S         0.59      2.43
Glu14H          S         0.30      1.40
Glu14E          S         0.61      0.78
Gly19           S         0.64      ----
Ser20           S         0.53      0.50
Asp21           S         0.48      0.55
Gln38           S         0.31      1.98
Leu41           S         0.20      1.45
Arg50           S         0.23      2.63
Asn60G          S         0.52      1.03
Phe60H          S         0.58      0.99
Thr60I          S         0.74      1.59
Glu61           S         1.69      1.53
Asn62           S         1.61      4.62
His71           S         0.51      0.51
Lys81           S         0.45      3.30
Met84           S         0.23      1.89
Lys87           S         0.40      1.67
Trp96           S         0.67      1.39

Comments for the residues above, in table order (the row alignment of this column is lost in the scan):
--CA, C, O offset; fs, side chain nearly ident.
--peptide atoms offset; fs, rot. approx. 90° CA-CB bond
--new side chain conf.; new pc: NZ-Asp21 O, 3.53 Å.
--new ip w/ Glu14H, 3.8 Å
--peptide atoms offset; side chain conf. nearly ident.
  new ip w/ Glu14E, 3.8 Å.
--main chain atoms of surface tripeptide Gly19-Asp21 offset
--approx. 90° rot. CA-CB bond
--approx. 90° rot. CA-CB bond; new hc's w/ Phe60H
--new conf. for side chain
--Thr60I-Asp63 turn shifted; new φ, ψ angles for Asn62 classify turn as type I; enhanced hc's between Phe60H, Leu64, Leu41 in new conf.
--main chain atoms offset; approx. 20° twist to histidine ring
--new hb: NZ-Ala113 O, 2.73 Å
--approx. 30° rot. CA-CB bond
--fs; approx. 30° rot. CB-CG bond
--main chain atoms offset; side chain conf. nearly ident.

Table 31 (cont'd.)

Residue(s)      Location  MCA       SCA       Comments
                          r.m.s.d.  r.m.s.d.
                          (Å)       (Å)
Arg97           S         1.14      2.32      --main chain atoms offset; new conf. for side chain; new pc: NH1-Asn95 OD1, 3.35 Å
Asn98           S         0.56      1.35      --main chain atoms offset; new conf. for side chain
Arg126          S         0.26      2.08      --new side chain conf.
Glu127          S         0.34      1.42      --new side chain conf.
Leu160          S         0.21      1.59      --approx. 45° rot. CA-CB bond
Val167          S         0.51      0.66      --peptide atoms offset; side chain conf. nearly ident.
Lys169          S         0.32      1.45      --fs; new side chain conf.
Thr177          S         0.41      2.27      --approx. 180° rot. CA-CB bond
Asp186A         S         0.46      1.46      --fs; new side chain conf.
Glu186B         S         0.86      0.57      --peptide atom conformation significantly different for Glu186B-Lys186D
Gly186C         S         0.92      ----
Lys186D         S         0.60      1.22      --new ip w/ Glu14E, 3.8 Å
Leu234          S         0.37      1.56      --approx. 180° rot. CB-CG bond.
Trp237          S         0.47      0.44      --shifting of type III turn Lys235-Ile238
Ile238          S         0.56      0.52
Gln239          S         0.62      3.69      --new side chain conf.
Lys240          S         0.68      3.56      --fs; new side chain conf.

Table 32. Structural Differences Between the Thrombin Structures of the Complex and PPACK-Thrombin [6] in the Vicinity of the 149A-149E Insertion Loop. Optimal superpositioning of the thrombin molecules as in Table 27. Abbreviations: r.m.s.d. = root mean square deviation; MCA r.m.s.d. = average main chain atom r.m.s.d.; SCA r.m.s.d. = average side chain atom r.m.s.d. All residues listed have deviations > 1σ on main chain (0.45 Å) and/or side chain atoms (1.43 Å).

Residue         MCA            SCA
                r.m.s.d. (Å)   r.m.s.d. (Å)
Asn143          0.65           0.98
Leu144          0.63           0.62
Lys145          0.85           3.24
Glu146          1.52           1.11
Thr147          4.48           5.25
Trp148          7.22           8.07
Thr149          7.54           9.52
Ala149A         7.59           1.43
Asn149B         8.84           5.88
Val149C         6.62           2.33
Gly149D         5.19           ----
Lys149E         2.82           4.56
Gly150          0.97           ----
Gln151          0.38           1.51

C-termini of the thrombin A-chain, the C-terminus of the thrombin B-chain, and in the loop containing the 149A-149E insertion. Moreover, it appears that the aforementioned factors (a), (b) and (c) are at the core of these large deviations. The drastic differences in conformation of the tripeptide segments Gly1F-Ser1E-Gly1D and Ile14K-Asp14L-Gly14M at the N- and C-termini of the thrombin A-chain in the two structures (Figure 61), and the large average r.m.s.
deviations for main chain (Figure 62, Table 27) and side chain atoms (Figure 63, Table 27) for these polypeptide segments result from the fact that these residues were totally undefined in the PPACK-thrombin structure, and were arranged in reasonable orientations: along the thrombin B-chain for residues 1F-1D, and in a conformation allowed by crystal packing for residues 14K-14M [6]. The probability that the conformation of these positioned residues would be similar to that observed in the hirudin-thrombin crystals is low.

Packing interactions in the crystals of the complex appear to play a role in the conformation of residues at the N- and C-termini of the thrombin A-chain, the C-terminus of the thrombin B-chain, and in the Ala149A-Lys149E insertion loop (Tables 26 and 29). As was seen in Figure 60 and mentioned in the previous chapter, at a crystal contact between molecules at symmetry positions 1 and 7 (Table 26), A-chain residues Gly#1D and Ala#1B engage in two hydrogen bonds, and B-chain residues Gln#244 and Phe#245 engage in four hydrogen bonds and an aromatic stacking interaction, with residues in this insertion loop (Lys145, Trp148, Thr149, Asn149B, and Val149C). Such strong intermolecular forces will clearly have an impact on the conformation of the short polypeptide stretches Gly#1D-Ala#1B and Gln#244-Phe#245, as well as on nearby residues in these regions. In particular, Figure 64 reveals how interactions at an analogous crystal contact, involving Asn244 and Phe245 with residues in the 140-150 thrombin loop of a symmetry-related molecule, are most likely responsible for the conformation of the C-terminal residues Asp243-Phe245 in the complex; the C-terminal B-chain folding of residues Lys236-Ile242 agrees quite well in the two structures, but at Asp243 the termini diverge. In addition, the conformation of the C-terminus of the thrombin A-chain is likely influenced by the hydrophobic and aromatic stacking interactions involving Tyr14J and Ile14K (Figure 58) and by a close polar contact between the carbonyl of Asp14L and Arg#14D NH1 (3.57 Å) at a crystal contact involving symmetry-related molecules at crystal positions 1 and 5 (Table 26).

Figure 64. Stereoview of the Superimposed Residues Lys236-Phe245 of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. Residues of the complex in bold. Symmetry molecule interactions of the complex displayed; symmetry molecules indicated by a '#' preceding the residue number. Hydrogen bonds indicated by dashed lines.

The wealth of hydrogen bonds (8) along the protruding thrombin insertion loop from residues Lys145 to Val149C (Table 26, Figure 60) and the aromatic stacking interaction between Trp148 and Phe#245 (Figure 60) at the crystal contact between molecules at symmetry positions 1 and 7 (Table 26) suggest that the conformation of this loop is well stabilized by, if not dependent on, these interactions. The drastically different conformations this loop adopts in the two structures are shown in Figure 65. The most striking feature of the two loop conformations is the relative position of the indole ring of Trp148. In the complex, the indole is at the center of the loop and makes many contacts, while in PPACK-thrombin it is directed outside of the loop and toward the solvent. In addition, 14 residues in this loop have r.m.s. differences > 1σ on main and/or side chain atoms between the two structures; these values are presented in Table 32.

Figure 65. Stereoview of the Superimposed Residues Glu146-Gly150 of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. Residues of the complex in bold.

Crystal contacts may not be the only factor involved in influencing or provoking the dramatic shift in the conformation of this thrombin loop in the complex. The orientation of this loop in the complex cannot be the same as in the PPACK-thrombin structure, as it would result in steric problems and would prevent binding of the hirudin N-terminus in the active site region of thrombin; this point is graphically made in Figure 66. If the insertion loop adopted the same conformation in the complex as in PPACK-thrombin, the indole of Trp148 would practically collide with the Thr4'-Asp5' dipeptide of hirudin, and would obstruct the hirudin N-terminal tripeptide from penetrating into the thrombin active site. Thus, a change in the loop conformation between the two structures is necessary. Moreover, since the presence of the hirudin N-terminus in the thrombin active site region is important to the potency with which hirudin binds and inhibits thrombin, it is not surprising that the position of the loop and the indole of Trp148 in the complex are compatible with this important hirudin-thrombin interaction; the thrombin 140-150 loop residues are well away from the five hirudin N-terminal residues and the region where they interact with thrombin (Figure 66).

Figure 66. Stereoview of the Superimposed Residues Glu146-Gly150 of PPACK-Thrombin and the Complex, including the Hirudin N-Terminal Pentapeptide. Superpositioning as in Figure 61. Residues of PPACK-thrombin in bold. Thrombin catalytic triad of the complex displayed.

As hirudin is a 'double-headed' thrombin inhibitor, it is worth considering whether the binding of the N-terminal head or the C-terminal tail is responsible for triggering this necessary loop shift. On the basis of steric restraints and of the fact that the hirudin NH2-terminus must enter the thrombin active site region for optimal binding, one could argue that the hirudin N-terminal head is responsible for the shift in the 149A-149E insertion loop. However, two different types of experimental evidence point to the likelihood of hirudin C-terminal tail binding to thrombin as the cause of the loop shift. Circular dichroism (CD) studies indicate that the binding of hirudin, or of a dodecapeptide analogue of the hirudin COOH-terminal tail, to thrombin produces a change in CD which does not occur upon PPACK inhibition [95,96]. While the exact structural transitions responsible for these CD changes are not known, it is possible that the CD spectral results reflect the altered conformation of the indole of Trp148 and its associated loop. Moreover, the lack of change in CD upon PPACK binding to thrombin, and the presence of one upon whole hirudin or C-terminal analogue binding, suggest that hirudin binding in the vicinity of the anion-binding exosite, and not the thrombin active site, is responsible for the spectral change. In addition, studies of the kinetics of hirudin inhibition of thrombin indicate that C-terminal tail binding precedes N-terminal head binding, as the rate of the first step in complex formation is decreased by increasing the ionic strength and is unaffected by binding at the active site of thrombin [59]. Taken together, the experimental data suggest that initial C-terminal tail binding of hirudin to the anion-binding exosite region of thrombin is responsible for inducing a CD change, which could be due to a conformational change involving Trp148 and its associated loop. Thus, all in all, it appears that the hirudin-thrombin interaction as well as crystal contacts are influential in determining the structure of the 149A-149E thrombin insertion loop.

While the largest differences in the thrombin structures have been examined, the majority of the significant differences have yet to be considered in terms of the four categories of influences likely involved in these changes.
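The per-residue deviations that underlie Tables 28 to 32 can be illustrated with a short sketch. It assumes the two structures are already optimally superimposed and that atoms are matched one to one; the function names and the dictionary layout are hypothetical, and the threshold is passed in by the caller (for example 0.45 Å for main chain atoms, the 1σ value quoted in the table captions) rather than derived here.

```python
import math

def residue_rmsd(coords_a, coords_b):
    """R.m.s. deviation in angstroms between matched atoms of one residue."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("atom lists must be non-empty and of equal length")
    squared = 0.0
    for atom_a, atom_b in zip(coords_a, coords_b):
        squared += sum((a - b) ** 2 for a, b in zip(atom_a, atom_b))
    return math.sqrt(squared / len(coords_a))

def flag_residues(structure_a, structure_b, threshold):
    """Return {residue label: r.m.s.d.} for residues deviating by more than threshold.

    structure_a, structure_b: dicts mapping a residue label to a list of
    (x, y, z) coordinates, with atoms listed in the same order in both.
    Residues missing from structure_b are skipped.
    """
    flagged = {}
    for label, atoms_a in structure_a.items():
        atoms_b = structure_b.get(label)
        if atoms_b is None:
            continue
        deviation = residue_rmsd(atoms_a, atoms_b)
        if deviation > threshold:
            flagged[label] = deviation
    return flagged
```

Running `flag_residues` once on main chain atoms and once on side chain atoms, each with its own threshold, would give the "and/or" selection used in the tables.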
In category (a), two differences appear to result from poorly defined residues in the complex and two appear to be due to alternate interpretations of electron density maps (Table 28). The residues Asp60E and Glu97A of the complex show large side chain and/or main chain deviations with respect to the PPACK-thrombin structure, most likely because there is little or no side chain density for these residues in hirudin-thrombin (Table 28). The structural deviations associated with Leu99 and Lys110 appear to stem from map interpretation (Table 28). Residue Leu99, in the active site region, has nearly the same main chain conformation in the two structures but differs by 1.67 Å in its side chain. Though this deviation is large in magnitude, the side chains in the two structures actually occupy the same space and simply differ by a near 180° rotation about the CB-CG bond; thus, either could be suitable for modeling in the density map. A much more dramatic difference resulting from alternate density map interpretations involves residues Lys110-Pro111 (Figure 67). The peptide bond preceding residue Pro111 was interpreted to be cis, making this a cis-proline (CP), in PPACK-thrombin, while in the complex a more typical trans peptide bond was modeled. Moreover, while a cis versus a trans peptide bond does affect the configuration of Pro111, it places much more severe restraints on the conformation available to the preceding lysine residue, which is why the main and side chain deviations are so large for Lys110 (Figure 67). Though many of the significant structural differences linked to crystal contact interactions have already been mentioned, there are a few more worth pointing out.
The double-hydrogen bonding ion pair between Arg#75 and Glu57', which takes place at the two-fold axis between molecules at symmetry positions 1 and 6 (Table 26, Figure 59), is likely the source of the large main and side chain deviations between the two thrombin structures at Arg75 (Table 29). Also, the hydrophobic cluster created at another two-fold axis by residues from the thrombin A- and B-chains of molecules at symmetry positions 1 and 5 (Table 26, Figure 58) is probably the cause of the significant main and side chain deviations with respect to PPACK-thrombin for residues Pro204 and Phe204A, and those which adjoin them (Ser203, Asn204B, and Asn205). The heptapeptide segments Lys202-Arg206 taken from the main chain fitted thrombin structures are displayed in Figure 68, which reveals the noticeably different positioning of Phe204A in each molecule.

Figure 67. Stereoview of the Superimposed Residues Lys110-Pro111 of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. Residues of the complex in bold. Peptide bond preceding Pro111 is cis in PPACK-thrombin, trans in the complex.

An examination of the entries in Tables 28 to 31 reveals that the majority of the conformational differences in the thrombin structures appear to arise from the hirudin-thrombin interaction and the variability of surface residues. The notion that the hirudin-thrombin interaction is responsible for producing numerous significant structural perturbations in the thrombin molecule with respect to the PPACK-thrombin structure is certainly reasonable when one considers the potency of this interaction, and the fact that only 26 of the 65 hirudin residues are responsible for producing a remarkable 216 contacts < 4.0 Å with thrombin.
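Contact counts such as the 216 contacts of < 4.0 Å quoted above come from a distance search between the atoms of the two molecules. The following is a minimal sketch of such a search, not the procedure actually used for Table 21; the tuple layout, the function names, and the brute-force pair loop are assumptions made for illustration.

```python
import math
from collections import Counter

def close_contacts(atoms_a, atoms_b, cutoff=4.0):
    """List every atom pair between two molecules closer than cutoff (angstroms).

    atoms_a, atoms_b: lists of (residue_label, atom_name, (x, y, z)).
    Returns tuples (residue_a, atom_a, residue_b, atom_b, distance).
    """
    contacts = []
    for res_a, name_a, pos_a in atoms_a:
        for res_b, name_b, pos_b in atoms_b:
            distance = math.dist(pos_a, pos_b)
            if distance < cutoff:
                contacts.append((res_a, name_a, res_b, name_b, distance))
    return contacts

def contacts_per_residue(contacts):
    """Count contacts made by each residue of the first molecule."""
    return Counter(contact[0] for contact in contacts)
```

Dividing the counts for a stretch of residues by the number of residues in it gives a contacts-per-residue density, a convenient way to compare how heavily different segments are engaged.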
Also not surprising is the fact that the bulk of the affected thrombin residues are associated with or adjoin important hirudin (and, in general, macromolecular) binding sites on thrombin, namely the active site cavity and the anion-binding exosite region (Table 30). Moreover, these hirudin residues are largely found in short peptide stretches displaying high densities of interactions per residue: Ile1'-Tyr3' (41 contacts/3 residues or 13.7 contacts/residue), Glu49'-His51' (32 contacts/3 residues or 10.7 contacts/residue), and Asp55'-Glu58' (60 contacts/4 residues or 15.0 contacts/residue) (Table 21).

Figure 68. Stereoview of the Superimposed Residues Lys202-Arg206 of the Complex and PPACK-Thrombin. Superpositioning as in Figure 61. Residues of the complex in bold.

In an effort to more carefully examine the conformational differences between the thrombin structures in the active site binding cavity, consider Figure 69, which displays the superimposed active sites from the hirudin-thrombin (bold) and PPACK-thrombin structures, and includes the hirudin N-terminal tripeptide (bold).

Figure 69. Stereoview of Superimposed Active Site Residues of PPACK-Thrombin and the Complex, including the Hirudin N-Terminal Tripeptide. Residues of the complex in bold. Superpositioning as in Figure 61. Hydrogen bonds indicated by dashed lines.

It is interesting to note that while the residues involved in the β-strand interaction with the hirudin N-terminal tripeptide, Ser214-Gly219, are in nearly the same conformation in the two structures, residues Cys191-Ser195 and those in the Tyr60A-Trp60D loop have quite different conformations between the two.
The structural similarity of the Ser214-Cys220 polypeptide segments is likely due to the fact that both hirudin and PPACK are involved in strand interactions with this rigid thrombin stretch and, even though the strand interaction is antiparallel in PPACK and parallel in the complex, most likely the inhibitor tripeptides conform to the thrombin strand rather than vice versa. However, the significant structural perturbations at positions Cys191-Ser195 and Tyr60A-Trp60D most likely arise from hirudin-thrombin interactions which have no analogous match in the interaction of PPACK with thrombin. The large deviation in the Cys191-Ser195 strand probably results from the need for Glu192 to adopt a different conformation in the complex, as well as from a unique interaction between Ser195 and the hirudin NH2-terminus. The Glu192 side chain cannot have the same position in the complex as in PPACK-thrombin, for its carboxylate would come too close to the Thr2' side chain of hirudin. This problem, coupled with the fact that the side chain can form a stabilizing close polar contact with Ser50', is sufficient to induce the Glu192 main chain and side chain atoms to shift. Similarly, the structural perturbation at Ser195 is a result of the unique hydrogen bond interaction between Ile1' N and Ser195 OG. The residues which adjoin Ser195 and Glu192 (Cys191, Gly193, and Asp194) undergo significant main chain structural changes due to the hirudin-thrombin interactions of Glu192 and Ser195. Moreover, while the proline residue of PPACK and the Ile1' residue of hirudin both have hydrophobic contacts with Tyr60A and Trp60D of the thrombin 60A-60I insertion loop, the significant structural differences in the conformations of Pro60C, Trp60D, and Lys60F also result from unique hirudin-thrombin interactions which are unmatched in PPACK-thrombin; these include the van der Waals interaction between Pro46' and Pro60C, the close polar contacts of Lys47' and Glu49' with Trp60D, and the ion pair between Glu49' and Lys60F (Table 30).

As PPACK has no interactions with the thrombin anion-binding exosite while hirudin has numerous hydrophobic and electrostatic contacts here, there are, as expected, many significant deviations between the two thrombin structures in main and side chain conformation at this binding region (Table 30). Within the Phe34-Ser41 loop of residues which comprise the anion-binding exosite, Phe34, Arg35, and Lys36 display significantly altered main chain and somewhat altered side chain configurations in the complex. These deviations are probably the result of the following contacts: the hydrophobic stacking interaction between Phe34 and Phe56', the water-mediated contact between Arg35 and Glu49', and the ion pair involving the C-terminal carboxylate of Gln65' and Lys36. A number of such structural changes can be found in the Lys70-Lys81 loop of the anion-binding exosite; these differences can be appreciated from Figure 70, which displays Arg75-Lys81 of both thrombin structures superimposed, as well as some pertinent hirudin residues. The difference in the conformation of Arg75 is probably due to the double hydrogen-bonding ion pair between this residue and Glu57' of a symmetry molecule. The major main and side chain deviations at Arg77A stem from the direct ion pair formed between this arginyl side chain and Glu58' of hirudin, as well as from the longer range interaction with Glu57' via the water molecule W467.
Lastly, the differences at Tyr76 and Asn78 are likely a by-product of the hirudin-thrombin interactions of the nearby residues.

Figure 70. Stereoview of the Superimposed Anion-Binding Exosite Residues Arg75-Lys81 of the Complex and PPACK-Thrombin. Additional residues of the complex (thrombin Ala113, hirudin Glu57'-Glu58', and the water molecule W467) included. Residues of the complex in bold. Superpositioning as in Figure 61. Hydrogen bonds indicated by dashed lines.

Also present in Figure 70 is another type of conformational difference: the side chain NZ of Lys81 participates in a hydrogen bond with the carbonyl oxygen of Ile82 in PPACK-thrombin (2.84 Å), while in the complex it forms a hydrogen bond with the carbonyl oxygen of Ala113 (2.73 Å). This difference is a clear example of the conformational variability possible with surface residues [172-174], and represents the most common type of structural alteration found in the two thrombin molecules; overall, 40 thrombin residues have significant conformational differences in the two structures which appear to be due to this (Table 31). Another interesting example of this variability involves the residues Leu41 and Phe60G-Leu64 (Table 31), and is displayed in Figure 71. The Thr60I-Asn63 surface turn shifts in the complex with respect to that in PPACK-thrombin. Moreover, while the conformational φ, ψ angles of Asn62 in PPACK-thrombin do not conform with any of the three major turn classifications [149], these angles are quite different in the complex, and identify this turn as type I. The shifting of the turn is the main cause of the large side chain deviations for residues Thr60I-Glu61, while the drastic change in the φ, ψ conformational angles at Asn62, as well as the turn shift, are responsible for the large main and side chain differences for this residue. In addition, several of the new side chain conformations in this loop of the hirudin-thrombin complex appear to stabilize the shift in turn.
In particular, the large side chain conformational changes at Leu41 and Leu64, in addition to the significant main chain shift for Phe60H in the thrombin structure of the complex, result in six close nonpolar contacts of < 4.0 Å between Phe60H and Leu41, Leu64 (three with each residue); in the PPACK-thrombin structure there is only one close contact < 4.0 Å between Phe60H and Leu64, and none involving Leu41. The shift in the side chain orientation at Thr60I in the complex also results in a new close polar contact with Asp63 (Thr60I OG1-Asp63 OD1, 3.27 Å) which is not present in the PPACK-thrombin structure.

Figure 71. Stereoview of Residues Leu41 and Phe60H-Leu64 of the Complex and PPACK-Thrombin. Residues of the complex in bold. Superpositioning as in Figure 61. Hydrogen bonds indicated by dashed lines.

A careful examination of Table 31 reveals that, all in all, not only is the number of residues with main/side chain structural changes likely due to surface residue conformational variability large (40), but so is the extent to which this variability manifests itself. Ten residues are engaged in new contacts: four are involved in ion pairs (Arg14D, Glu14H, Glu14E, and Lys186D), three form close polar contacts (Lys14A, Arg97, and Thr60I), two engage in close van der Waals contacts (Leu41, Phe60H), and one is involved in a hydrogen bond interaction (Lys81). The conformation of eight residues is significantly affected by a shift in the position of two surface turns: five (Phe60G-Asn62) are affected by the shifting of a type I turn involving residues Thr60I-Asn62, and three (Trp237-Gln239) are affected by the shifting of a type III turn involving residues Lys235-Ile238.
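The close-contact tallies above rest on a simple distance criterion: an atom pair counts as a close contact when its interatomic distance is below 4.0 Å. A minimal sketch of such a count is given here for illustration only. It assumes each residue is available as a mapping from atom name to Cartesian coordinates in Å; the function name `close_contacts` and the coordinates are hypothetical and are not taken from the refined structures or from the programs used in this work.

```python
import math
from itertools import product


def close_contacts(atoms_a, atoms_b, cutoff=4.0):
    """Return (atom_a, atom_b, distance) for every atom pair closer than cutoff (in Å)."""
    contacts = []
    for (name_a, xyz_a), (name_b, xyz_b) in product(atoms_a.items(), atoms_b.items()):
        d = math.dist(xyz_a, xyz_b)  # Euclidean distance between the two atoms
        if d < cutoff:
            contacts.append((name_a, name_b, round(d, 2)))
    # Shortest contacts first
    return sorted(contacts, key=lambda c: c[2])


# Made-up coordinates (Å) for two side chain fragments; illustration only.
phe = {"CZ": (0.0, 0.0, 0.0), "CE1": (1.4, 0.0, 0.0)}
leu = {"CD1": (0.0, 3.8, 0.0), "CD2": (5.0, 5.0, 5.0)}

contacts = close_contacts(phe, leu)
print(contacts)
```

Applying the same function to the equivalent residue pair in two superimposed structures and comparing the lengths of the returned lists gives the kind of tally quoted in the text; the 4.0 Å cutoff is the only parameter.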
In addition, all but 11 residues in Table 31 have significantly altered side chain atomic positions, with six residues displaying dramatically new side chain conformations because they are unstabilized and directed into the solvent (Ser11, Leu12, Lys87, Lys169, Asp189, and Lys240) and six more displaying new side chain conformations simply by reason of free rotation about a single bond (Gln38, Leu41, Lys87, Leu160, Thr177, and Leu234).

It has been possible to associate most of the major main and side chain structural differences between the thrombin molecules in the hirudin-thrombin complex and in PPACK-thrombin with one or another of the (a)-(d) influences. However, all of these differences have involved solvent-accessible residues located on the thrombin surface, and have been classified in Tables 28 to 31 as belonging to the general thrombin surface (S) or the specialized surface of the active site binding cavity (AS). If one excludes the problematic differences associated with category (a), which probably all appear on the surface only fortuitously, some clear implications can be drawn from this comparison. First of all, only surface residues display significantly altered main/side chain structure in the two thrombin molecules. Second, the deviations associated with these surface residues appear to result not only from binding or crystal contact interactions but also from an inherent conformational flexibility which is likely due to their being on a solvent-accessible surface [172-174]. Finally, the main chain and side chain folding of 'internal' or non-surface residues agreed very well, displaying no major (> 1σ) deviations. As thrombin is a large, globular, and well-defined serine proteinase, the conclusions drawn cannot be expected to be of general applicability to all structural comparisons involving the same protein or enzyme in free or complexed states.
However, it is of particular note that the observations made in this thrombin comparison, in general, agree well with conclusions drawn from a very careful, comprehensive, and detailed structural comparison made on a close relative of thrombin, α-chymotrypsin [174]. In this study, Blevins and Tulinsky examined the structural differences in the two independent molecules of the α-chymotrypsin dimer crystal structure at 1.67 Å resolution, and further compared these independent molecules with monomeric γ-chymotrypsin. Among the conclusions they drew from this study were that the main chain folding was basically identical between the two independent molecules in the dimer, and that there was marked asymmetry in the side chain structure of the surface residues, some of which could be explained by interfacial interactions but much of which had to be attributed to general flexibility in surface side chain structure. Moreover, when the independent molecules of the chymotrypsin dimer were compared to γ-chymotrypsin, although the two molecules of the dimer were more like one another than like γ-chymotrypsin, the same conclusions could be drawn as from the comparison of the independent molecules in the dimer. The chymotrypsin comparison, in particular, elegantly displays that even in a situation where the molecules being compared are exposed to the same macroscopic chemical and pH environment, there is still significant surface side chain variability which cannot be explained by interface interactions, emphasizing that protein surface structure is variable as well as adaptable.

One important difference in the conclusions drawn from the thrombin and chymotrypsin structural comparisons is that there were no significant main chain deviations involving surface residues among the chymotrypsin molecules while there were between the two thrombin structures. This discrepancy is probably due to two factors.
In the chymotrypsin (CHT) comparison, basically the same free enzyme molecule is examined in all the situations. In the comparison of unique molecules within the dimer of α-CHT, differences simply due to dimerization were examined, and in the α-CHT versus γ-CHT comparisons, differences due to pH (α-CHT, pH 3.5; γ-CHT, pH 5.5) and aggregation state (α-CHT, dimer; γ-CHT, monomer) were assessed. In the thrombin comparison, on the other hand, thrombin structures were contrasted which not only differed in pH and chemical environment but, more importantly, were differently inhibited molecules: the thrombin inhibitors differed significantly in their extent of interaction with the thrombin surface. In particular, it appears that unique hirudin-thrombin interactions are likely the source of many of the significant main (as well as side) chain deviations between the two structures. Also, as the thrombin molecule can be viewed as possessing 12 insertions with respect to chymotrypsinogen (Figure 57) [6], the significant number of main chain differences between the thrombin structures could be due to the fact that the thrombin surface is inherently more adaptable than that of chymotrypsin as a result of these inserted sequences.

LIST OF REFERENCES

1. Magnusson, S. (1971) in The Enzymes (Boyer, P.D., Ed.) pp 277-321, Academic Press, New York.
2. Nesheim, M.E., Hibbard, L.S., Tracy, P.B., Bloom, J.W., Myrmel, K.H., & Mann, K.G. (1980) in The Regulation of Coagulation (Mann, K.G., & Taylor, P.B., Eds.) pp 145-149, Elsevier-North Holland, New York.
3. Mann, K.G. (1976) Methods Enzymol. 45, 123-156.
4. Butkowski, R.J., Elion, J., Downing, M.R., & Mann, K.G. (1977) J. Biol. Chem. 222, 4942-4957.
5. Thompson, A.R., Enfield, D.L., Ericsson, L.H., Legaz, M.E., & Fenton II, J.W. (1977) Arch. Biochem. Biophys. 222, 356-367.
6. Bode, W., Mayr, I., Baumann, U., Huber, R., Stone, S.R., & Hofsteenge, J. (1989) EMBO J. 8, 3467-3475.
7. Jackson, C.M.
& Nemerson, Y. (1980) Annu. Rev. Biochem. 32, 765-811.
8. Nilsson, B., Horne, M.K. III, & Gralnick, H.R. (1983) Arch. Biochem. Biophys. 222, 127-133.
9. Doolittle, R.F. (1984) Annu. Rev. Biochem. 22, 195-229.
10. Hantgan, R.R., & Hermans, J. (1979) J. Biol. Chem. 222, 11272-11281.
11. Blomback, B., Hessel, B., Hogg, D., & Therkildsen, L. (1978) Nature 222, 501-505.
12. Blomback, B. (1986) Ann. N.Y. Acad. Sci. 485, 120-123.
13. Casassa, E.F. (1955) J. Chem. Phys. 22, 596-597.
14. Carr, M.E., Shen, L.L., & Hermans, J. (1977) Biopolymers 16, 1-15.
15. Ferry, J.D., & Morrison, P.R. (1947) J. Am. Chem. Soc. 69, 388-400.
16. Lorand, L., Downey, J., Gotoh, T., Jacobsen, A., & Tokura, S. (1968) Biochem. Biophys. Res. Commun. 22, 222-230.
17. Fenton II, J.W. (1981) Ann. N.Y. Acad. Sci. 220, 468-495.
18. Fenton II, J.W. (1986) Ann. N.Y. Acad. Sci. 222, 5-15.
19. Esmon, C.T., Esmon, N.L., Kurosawa, S., & Johnson, A.E. (1986) Ann. N.Y. Acad. Sci. 522, 215-220.
20. Rick, M.E. & Hoyer, L.W. (1978) Brit. J. Haematol. 22, 107-119.
21. Nesheim, M.E. & Mann, K.G. (1979) J. Biol. Chem. 222, 1326-1334.
22. Seegers, W.H., Novoa, E., Henry, R.L., & Hassouna, H.I. (1976) Thromb. Res. 2, 543-552.
23. Kisiel, W., Canfield, W.M., Ericsson, L.H., & Davie, E.W. (1977) Biochemistry 22, 5824-5831.
24. Lorand, L. (1975) in Proteases and Biological Control (Reich, E., Rifkin, D.B., & Shaw, E., Eds.) pp 79-84, Cold Spring Harbor Laboratory Press, Cold Spring Harbor.
25. Travis, J. & Salvesen, G.S. (1983) Annu. Rev. Biochem. 22, 655-709.
26. Bjork, I. & Lindahl, U. (1982) Mol. Cell. Biochem. 22, 161-182.
27. Pratt, C.W., Whinna, H.C., Meade, J.B., Treanor, R.E., & Church, F.C. (1989) Ann. N.Y. Acad. Sci. 222, 104-115.
28. Markwardt, F. (1970) Methods Enzymol. 22, 924-932.
29. Shuman, M.A. (1986) Ann. N.Y. Acad. Sci. 222, 228-239.
30. Weksler, B.B., Ley, C.W., & Jaffe, E.A. (1978) J. Clin. Invest. 22, 923-930.
31. Prescott, S.M., Zimmerman, G.A., & McIntyre, T.M. (1984) Proc. Natl. Acad. Sci.
USA 22, 3534-3538.
32. De Groot, P.G., Gonsalves, M.D., Loesberg, C., van Buul-Wortelboer, M.F., van Aken, W.G., & van Mourik, J.A. (1984) J. Biol. Chem. 222, 13329-13333.
33. Gelehrter, T.D., & Sznycer-Laszyk, R. (1986) J. Clin. Invest. 22, 165-169.
34. Bender, M.L., & Kezdy, F.J. (1964) J. Am. Chem. Soc. 22, 3704-3714.
35. Inward, P.W., & Jencks, W.P. (1965) J. Biol. Chem. 222, 1986-1996.
36. Stroud, R.M., Kay, L.M., & Dickerson, R.E. (1974) J. Mol. Biol. 22, 185-208.
37. Birktoft, J.J., & Blow, D.M. (1972) J. Mol. Biol. 22, 187-240.
38. Shotton, D.M., & Watson, H.C. (1970) Nature 222, 811-816.
39. Sawyer, L., Shotton, D.M., Campbell, J.W., Wendell, P.L., Muirhead, B., Watson, H.C., Diamond, R., & Ladner, R.C. (1978) J. Mol. Biol. 118, 137-208.
40. Berliner, L.J. & Shen, Y.Y.L. (1977) Biochemistry 22, 4622-4626.
41. Sonder, S.A., & Fenton II, J.W. (1984) Biochemistry 22, 1818-1823.
42. Berliner, L.J., Sugawara, Y., & Fenton II, J.W. (1985) Biochemistry 22, 7005-7009.
43. Olson, T.A., Sonder, S.A., Wilner, G.D., & Fenton II, J.W. (1986) Ann. N.Y. Acad. Sci. 222, 96-103.
44. Fenton II, J.W., Olson, T.A., Zabinski, M.P., & Wilner, G.D. (1988) Biochemistry 22, 7106-7112.
45. Church, F.C., Pratt, C.W., Noyes, C.M., Kalayanamit, T., Sherrill, G.B., Tobin, R.B., & Meade, J.B. (1989) J. Biol. Chem. 222, 18419-18425.
46. Bar-Shavit, R., Kahn, A.J., Mann, K.G., & Wilner, G.D. (1986)
47. Fenton II, J.W. (1988) Semin. Thromb. Hemost. 22, 234-240.
48. Boissel, J-P., Le Bonniec, B., Rabiet, M-J., Labie, D., & Elion, J. (1984) J. Biol. Chem. 222, 5691-5697.
49. Lewis, S.D., Lorand, L., Fenton II, J.W., & Shafer, J.A. (1987) Biochemistry 22, 7597-7603.
50. Bezeaud, A. & Guillin, M-C. (1988) J. Biol. Chem. 222, 3576-3581.
51. Braun, P.J., Hofsteenge, J., Chang, J-Y. & Stone, S.R. (1988) Thromb. Res. 22, 273-283.
52. Hofsteenge, J., Braun, P.J. & Stone, S.R. (1988) Biochemistry 22, 2144-2151.
53. Tsernoglou, D., Walz, D.A., McCoy, L.E., & Seegers, W.H. (1974) J. Biol. Chem. 222, 999.
54. McKay, D.B., Kay, L.M., & Stroud, R.M. (1977) in Chemistry and Biology of Thrombin (Fenton II, J.W., Lundblad, R.L., & Mann, K.G., Eds.) pp 113-121, Ann Arbor Science, Ann Arbor.
55. Edwards, B.F.P., Kumar, V., Bedford, B.A., Martin, P.D., & Kunjummen, R.D. (1986) Ann. N.Y. Acad. Sci. 222, 411-413.
56. Skrzypczak-Jankun, E., Rydel, T.J., Tulinsky, A., Fenton II, J.W., & Mann, K.G. (1989) J. Mol. Biol. 222, 755-757.
57. Magnusson, S., Peterson, T.E., Sottrup-Jensen, L., & Claeys, H. (1975) in Proteases and Biological Control (Reich, E., Rifkin, D.B., & Shaw, E., Eds.) pp 123-150, Cold Spring Harbor Laboratory Press, Cold Spring Harbor.
58. Furie, B., Bing, D.B., Feldmann, R.J., Robison, D.J., Burnier, J.P., & Furie, B.C. (1982) J. Biol. Chem. 222, 3875-3882.
59. Stone, S.R. & Hofsteenge, J. (1986) Biochemistry 22, 4622-4628.
60. Walsmann, P., & Markwardt, F. (1981) Pharmazie 22, 653-660.
61. Scharf, M., Engels, J., & Tripier, D. (1989) FEBS Lett. 222, 105-110.
62. Dodt, J., Muller, H.P., Seemuller, U., & Chang, J.Y. (1984) FEBS Lett. 222, 180-183.
63. Harvey, R.F., Degryse, E., Stefani, L., Schamber, F., Cazenave, J.P., Courtney, M., Tolstoshev, P., & Lecocq, J.P. (1986) Proc. Natl. Acad. Sci. USA 22, 1084-1088.
64. Dodt, J., Machleidt, W., Seemuller, U., Maschler, R., & Fritz, H. (1986) Biol. Chem. Hoppe-Seyler 222, 803-811.
65. Loison, C., Findeli, A., Bernard, S., Nguyen-Juilleret, M., Marquet, M., Reihl-Bellon, N., Carvallo, D., Guerra-Santos, L., Brown, S.H., Courtney, M., Roitsch, C., & Lemoine, Y. (1988) Bio/Technology 2, 72-77.
66. Courtney, M., Loisson, C., Lemoine, Y., Riehl-Bellon, N., DeGryse, E., Brown, S.U., Cazanave, J.P., Defreyn, C., Delebassee, D., Bernat, A., Maffrand, J.P., & Roitsch, C. (1989) Semin. Thromb. Hemost. 22, 288-292.
67. Johnson, P.H., Sze, P., Winant, R., Payne, P.W., & Lazar, J.B. (1989) Semin. Thromb. Hemost. 22, 302.
68. Bagdy, D., Barabas, E., Graf, L., Peterson, T.E., & Magnusson, S. (1976) Methods Enzymol.
22, 669-678.
69. Fink, E. (1989) Semin. Thromb. Hemost. 22, 283-287.
70. Chang, J.Y. (1983) FEBS Lett. 222, 307-313.
71. Seemuller, U., Dodt, J., Fink, E., & Fritz, H. (1986) in Proteinase Inhibitors (Barrett, A., & Salvesen, G., Eds.) pp 337-359, Elsevier, New York.
72. Dodt, J., Seemuller, U., Maschler, R., & Fritz, H. (1985) Biol. Chem. Hoppe-Seyler 222, 379-385.
73. Laskowski, M. Jr., & Kato, I. (1980) Annu. Rev. Biochem. 22, 593-626.
74. Apella, E., Weber, I.T., & Blasi, F. (1988) FEBS Lett. 222, 1-4.
75. Dodt, J., Schmitz, T., Schafer, T., & Bergmann, C. (1986) FEBS Lett. 222, 373-377.
76. Fortkamp, E., Rieger, M., Heisterberg-Moutses, G., Schweitzer, S., & Sommer, R. (1986) DNA 2, 511-517.
77. Bergman, C., Dodt, J., Kohler, S., Fink, E., & Gassen, H.G. (1986) Biol. Chem. Hoppe-Seyler 222, 731-740.
78. Sukumaran, D.K., Clore, G.M., Preuss, A., Zarbock, J., & Gronenborn, A.M. (1987) Biochemistry 22, 333-338.
79. Folkers, P.J.M., Clore, G.M., Driscoll, D.C., Dodt, J., Kohler, S., & Gronenborn, A.M. (1989) Biochemistry 22, 2601-2617.
80. Haruyama, H., & Wuthrich, K. (1989) Biochemistry 22, 4301-4312.
81. Stone, S.R., Braun, P.J., & Hofsteenge, J. (1987) Biochemistry 22, 4617-4624.
82. Fenton II, J.W., Landis, B.H., Walz, D.A., & Finlayson, J.S. (1977) in Chemistry and Biology of Thrombin (Fenton II, J.W., Lundblad, R.L., & Mann, K.G., Eds.) pp 43-70, Ann Arbor Science, Ann Arbor.
83. Noe, G., Hofsteenge, J., Rovelli, G., & Stone, S.R. (1988) J. Biol. Chem. 222, 11729-11735.
84. Read, R.J., & James, M.N.G. (1986) in Proteinase Inhibitors (Barrett, A. & Salvesen, G., Eds.) pp 301-333, Elsevier, New York.
85. Peterson, T.E., Roberts, H.R., Sottrup-Jensen, L. & Magnusson, S. (1976) in Protides of the Biological Fluids (Peeters, H., Ed.) pp 145-149, Pergamon Press, Oxford.
86. Dodt, J., Kohler, S., & Baici, A. (1988) FEBS Lett. 222, 87-90.
87. Braun, P.J., Dennis, S., Hofsteenge, J., & Stone, S.R. (1988) Biochemistry 22, 6517-6522.
88. Stone, S.R., Dennis, S., & Hofsteenge, J. (1988) Biochemistry 22, 6857-6863.
89. Krstenansky, J.L., Owen, T.J., Yates, M.T. & Mao, S.J.T. (1987) J. Med. Chem. 22, 1688-1691.
90. Wallace, A., Dennis, S., Hofsteenge, J., & Stone, S.R. (1989) Biochemistry 22, 10079-10084.
91. Dodt, J., Kohler, S., Schmitz, T. & Wilhelm, B. (1990) J. Biol. Chem. 222, 713-718.
92. Dennis, S., Wallace, A., Hofsteenge, J. & Stone, S.R. (1990) Eur. J. Biochem. 222, 61-66.
93. Krstenansky, J.L. & Mao, S.J.T. (1987) FEBS Lett. 222, 10-16.
94. Maraganore, J.M., Chao, B., Joseph, M.L., Jablonski, J. & Ramachandran, K.L. (1989) J. Biol. Chem. 222, 8692-8698.
95. Mao, S.J.T., Yates, M.T., Owen, T.J. & Krstenansky, J.L. (1988) Biochemistry 22, 8170-8173.
96. Konno, S., Fenton II, J.W., & Villanueva, G.B. (1988) Arch. Biochem. Biophys. 222, 158-166.
97. Markwardt, F. (1986) Ann. N.Y. Acad. Sci. 222, 204-214.
98. Markwardt, F. (1985) Biomed. Biochim. Acta 22, 1007-1013.
99. Markwardt, F., Hauptmann, J., Nowak, C., Klessen, C., & Walsmann, P. (1982) Thromb. Haemostasis 22, 226-229.
100. Fenton II, J.W., Fasco, M.J., Stackrow, A.B., Aronson, D.L., Young, A.H., & Finlayson, J.S. (1977) J. Biol. Chem. 222, 3587-3598.
101. Bischoff, R., Clesse, D., Whitechurch, O., Lepage, P., & Roitsch, C. (1989) J. Chromatogr. 476, 245-255.
102. Ornstein, L. (1964) Ann. N.Y. Acad. Sci. 222, 321-349.
103. Davis, B.J. (1964) Ann. N.Y. Acad. Sci. 222, 404-427.
104. Hoefer Scientific Instruments catalogue (1986), pp 110-116, Hoefer Scientific Instruments, San Francisco.
105. Technical Bulletin No. MKR-137 (1986), pp 1-8, Sigma Chemical Company, St. Louis.
106. Bryan, J.K. (1977) Anal. Biochem. 22, 513-519.
107. Reid, B.R., Koch, G.L.E., Boulanger, Y., Hartley, B.S., & Blow, D.M. (1973) J. Mol. Biol. 80, 199-201.
108. Carter, C.W. Jr. & Carter, C.W. (1979) J. Biol. Chem. 222, 12219-12223.
109. Thaller, C., Weaver, L.H., Eichele, C., Wilson, E., Karlsson, R., & Jansonius, J.N. (1981) J. Mol. Biol. 222, 465-469.
110. Matthews, B.W. (1968) J. Mol. Biol. 22, 491-497.
111. Edmundson, A.B., Wood, M.K., Schiffer, M., Hardman, K.D., Ainsworth, C.F., Ely, K.R., & Deutsch, H.F. (1970) J. Biol. Chem. 222, 2763-2764.
112. Ichikawa, T. & Sundaralingam, M. (1972) Nature New Biol. 222, 174-176.
113. McPherson, A. (1982) in Preparation and Analysis of Protein Crystals, p. 241, John Wiley & Sons, New York.
114. Stout, G.H. & Jensen, L.H. (1989) in X-ray Structure Determination - A Practical Guide, pp. 23-25, John Wiley & Sons, New York.
115-116. Wyckoff, H.W., Doscher, M., Tsernoglou, D., Inagami, T., Johnson, L.N., Hardman, K.D., Allewell, N.M., Kelly, D.M., & Wyckoff, H. (1985) Methods Enzymol. 222, 330-386.
117. Pflugrath, J.W. & Messerschmidt, A. (1989) Munich Area Detector NE System User's Guide, Enraf-Nonius, The Netherlands.
118. Pflugrath, J.W. & Messerschmidt, A. "Structured Software for Data Collection with Area Detectors", from Crystallography in Molecular Biology Conference, Bischenberg, France, September 1985.
119. Sjolin, L. & Wlodawer, A. (1981) Acta Crystallogr. 222, 594-604.
120. Data Collection Operation Manual (1982), Section D.9 pg 5, Nicolet XRD Corp., Madison.
121. North, A.C.T., Phillips, D.C., & Mathews, F.S. (1968) Acta Crystallogr. 222, 351-359.
122. Steigemann, W. (1974) Ph.D. Thesis, Technische Universitat Munchen.
123. Messerschmidt, A., Schneider, M., & Huber, R. (1990) J. Appl. Cryst. 22, 436-439.
124. Rossmann, M.G. (1972) in The Molecular Replacement Method (Rossmann, M.G., Ed.) pp 1-42, Gordon and Breach, New York.
125. Huber, R. (1965) Acta Crystallogr. 22, 353-356.
126. Rossmann, M.G. & Blow, D.M. (1962) Acta Crystallogr. 22, 24-31.
127. Rydel, T.J., Ravichandran, K.G., Tulinsky, A., Bode, W., Huber, R., Roitsch, C., & Fenton II, J.W. (1990) Science 222, 277-280.
128. Patterson, A.L. (1935) Z. Krist. 229, 517-542.
129. Rossmann, M.G., Blow, D.M., Harding, M.M., & Coller, E. (1964) Acta Crystallogr. 22, 338-342.
130. Crowther, R.A. & Blow, D.M.
(1967) Acta Crystallogr. 22, 544-548.
131. International Tables for X-ray Crystallography (1952) (Henry, N.F. & Lonsdale, K., Eds.) Vol. I, p 186, The Kynoch Press, Birmingham.
132. Huber, R. & Schneider, M. (1985) J. Appl. Cryst. 22, 165-169.
133. Sim, G.A. (1959) Acta Crystallogr. 22, 813-815.
134. Sim, G.A. (1960) Acta Crystallogr. 22, 511-512.
135. Stout, G.H. & Jensen, L.H. (1989) in X-ray Structure Determination - A Practical Guide, p 337, John Wiley & Sons, New York.
136. Blow, D.M. & Crick, F.H.C. (1959) Acta Crystallogr. 22, 794-802.
137. Blundell, T.L. & Johnson, L.N. (1976) in Protein Crystallography, p 419, Academic Press, New York.
138. Pflugrath, J.W., Saper, M.A., & Quiocho, F.A. (1984) in Methods and Applications in Crystallographic Computing (Hall, S. & Ashida, T., Eds.) pp 404-407, Oxford Univ. Press, Oxford.
139. Jones, T.A. (1978) J. Appl. Cryst. 22, 268-272.
140. Jack, A. & Levitt, M. (1978) Acta Crystallogr. 222, 931-935.
141. Levitt, M. (1974) J. Mol. Biol. 22, 393-420.
142. Deisenhofer, J., Remington, S.J., & Steigemann, W. (1985) Methods Enzymol. 222, 303-323.
143. Ten Eyck, L.F. (1973) Acta Crystallogr. 222, 183-191.
144. Ten Eyck, L.F. (1977) Acta Crystallogr. 222, 486-492.
145. Hendrickson, W.A. & Konnert, J.H. (1980) in Computing in Crystallography (Diamond, R., Ramaseshan, S. & Venkatesan, K., Eds.) pp 13.01-13.26, The Indian Academy of Sciences, Bangalore, India.
146. Hendrickson, W.A. (1985) Methods Enzymol. 222, 252-270.
147. Finzel, B.C. (1987) J. Appl. Cryst. 22, 53-55.
148. Luzzati, P.V. (1952) Acta Crystallogr. 2, 802-810.
149. Crawford, J.L., Lipscomb, W.N., & Schellman, C.G. (1973) Proc. Natl. Acad. Sci. USA 22, 538-542.
150. Tulinsky, A., Park, C.H., & Skrzypczak-Jankun, E. (1988) J. Mol. Biol. 222, 885-901.
151. Mulichak, A.M. & Tulinsky, A. Blood Coagulation & Fibrinolysis, manuscript in press.
152. Huber, R., Scholze, H., Paques, E., & Deisenhofer, J. (1980) Hoppe-Seyler's Z. Physiol. Chem. 222, 1389-1399.
153. Bode, W., Greyling, H.J., Huber, R., Otlewski, J., & Wilusz, T. (1989) FEBS Lett. 222, 285-292.
154. Arnott, S. & Dover, S.D. (1968) Acta Crystallogr. 222, 599-601.
155. Yonath, A. & Traub, W. (1969) J. Mol. Biol. 22, 461-477.
156. Pedersen, B. (1974) Acta Crystallogr. 222, 289-291.
157. Brown, I.D. (1976) Acta Crystallogr. 222, 24-31.
158. Taylor, R., Kennard, O., & Versichel, W. (1984) Acta Crystallogr. 222, 280-288.
159. Gorbitz, G.H. (1989) Acta Crystallogr. 222, 390-395.
160. Katz, B.A. & Kossiakoff, A. (1986) J. Biol. Chem. 222, 15480-15485.
161. Weiner, S.J., Kollman, P.A., Case, D., Singh, U.C., Ghio, C., Alagona, G., Profeta, S. Jr., & Weiner, P. (1984) J. Am. Chem. Soc. 222, 765-784.
162. Ni, F., Konishi, Y., & Scheraga, H.A. (1990) Biochemistry 22, 4479-4489.
163. Huber, R. & Bode, W. (1978) Acc. Chem. Res. 22, 114-122.
164. Chang, J.-Y. (1985) Eur. J. Biochem. 222, 217-224.
165. Chang, J.-Y. (1986) Biochem. J. 222, 797-802.
166. Chang, J.-Y. (1989) J. Biol. Chem. 222, 7141-7146.
167. Bourdon, P., Fenton II, J.W., & Maraganore, J.M. (1990) Biochemistry 22, 6379-6384.
168. Skrzypczak-Jankun, E. & Tulinsky, A., unpublished results of this laboratory.
169. Zuck, V. & Owen, W., unpublished observations.
170. Degryse, E., unpublished observations.
171. Lee, B. & Richards, F.M. (1971) J. Mol. Biol. 22, 379-400.
172. Tulinsky, A., Vandlen, R.L., Morimoto, C.N., Mani, N.K., & Wright, L.H. (1973) Biochemistry 22, 4185-4192.
173. Tulinsky, A. (1980) in Biomolecular Structure, Conformation, Function, and Evolution (Srinivasan, R., Ed.) Vol. 1, pp. 183-199, Pergamon Press, Oxford.
174. Blevins, R.A. & Tulinsky, A. (1985) J. Biol. Chem. 222, 4264-4275.