PART I: THE CRYSTAL AND MOLECULAR STRUCTURE OF TETRA-N-PROPYLPORPHINE
PART II: N-FORMYL-L-PHENYLALANINE: α-CHYMOTRYPSIN COMPETITIVE INHIBITOR BINDING

Thesis for the Degree of Ph.D.
MICHIGAN STATE UNIVERSITY
PENELOPE WIXSON CODDING
1971

This is to certify that the thesis entitled

Part I. The Crystal and Molecular Structure of Tetra-n-propylporphine
Part II. N-Formyl-L-Phenylalanine: α-Chymotrypsin Competitive Inhibitor Binding

presented by Penelope Wixson Codding has been accepted towards fulfillment of the requirements for the Ph.D. degree.

Major professor
Date: September 3, 1971

ABSTRACT

PART I: THE CRYSTAL AND MOLECULAR STRUCTURE OF TETRA-N-PROPYLPORPHINE
PART II: N-FORMYL-L-PHENYLALANINE:α-CHYMOTRYPSIN COMPLEX, THE EFFECT OF BINDING OF A COMPETITIVE INHIBITOR SUBSTRATE-LIKE MOLECULE

By Penelope Wixson Codding

The structure of α,β,γ,δ-tetra-n-propylporphine was determined by three-dimensional X-ray crystallographic techniques. The molecule crystallizes in space group P2₁/c with two molecules in the unit cell and cell dimensions of a = 5.078, b = 11.59, c = 22.39 Å, and β = 99.50°. The three-dimensional structure to 2.8 Å resolution of an N-formyl-L-phenylalanine:α-chymotrypsin complex was studied. The complex is isomorphous with α-chymotrypsin and crystallizes in space group P2₁ with four molecules per unit cell and cell dimensions of a = 49.27, b = 67.10, c = 65.84 Å, and β = 101.75°.

The structure of tetrapropylporphine was solved using Sayre's equation. The molecule is centrosymmetric with two independent pyrrole rings. A common structure for the free base macrocycle was obtained from this determination and the structures of porphine and tetraphenylporphine. The free base structure consists of opposite pyrrole rings possessing imino hydrogen atoms and pyrrole rings differing in like pairs.
Small differences which are localized at the bridge carbon atoms were found for the substituted compounds. In addition, the aliphatic substituent seems to decrease some of the differences between the two pyrrole rings. The nucleus of the tetrapropylporphine ring is essentially planar, with some small deviations in planarity due to the close intermolecular distances between the planes stacked along a.

The 2.8 Å resolution difference electron density of the inhibitor:enzyme complex was calculated with the phases of α-chymotrypsin. The N-formyl-L-phenylalanine molecule binds in the active site region by displacing a localized water molecule which was hydrogen bonded in the active site region. The inhibitor molecule has close contacts with the main chain peptide and side chain of methionine 192 and with the sequence tryptophan 215-serine 214-valine 213.

The substitution of N-formyl-L-phenylalanine produces several changes in the enzyme structure. The most important change is a movement of histidine 40 by about 0.5 Å to a position which is more favorable for hydrogen bonding to glycine 43. The two-fold equivalent residue, histidine' 40, is hydrogen bonded to glycine' 43 in the native enzyme structure. This change may be important for maintaining the conformation of the active site region. Another large change is observed in the region of favored uranyl binding. The change results from a movement of 0.4 Å of the side chain of arginine' 154 of one enzyme molecule in the asymmetric unit toward the side chain of glutamic acid 21 of the other molecule.
PART I: THE CRYSTAL AND MOLECULAR STRUCTURE OF TETRA-N-PROPYLPORPHINE
PART II: N-FORMYL-L-PHENYLALANINE:α-CHYMOTRYPSIN COMPLEX, THE EFFECT OF BINDING OF A COMPETITIVE INHIBITOR SUBSTRATE-LIKE MOLECULE

By Penelope Wixson Codding

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Department of Chemistry
1971

To my parents, Mr. and Mrs. Robert M. Wixson, for their constant encouragement throughout my education and to Ed for his unlimited patience and understanding.

Acknowledgments

The author wishes to thank Dr. Alexander Tulinsky for his patience and assistance during this study and for his excellence as a teacher.

The author is also indebted to Mr. Richard L. Vandlen for helpful discussions and many computer programs. In addition, thanks are due to all those who contributed to the structure determination of α-chymotrypsin which made the second half of this thesis possible.

The author is grateful for an NDEA Fellowship which supported her throughout this study. Support from the Molecular Biology Section of the National Science Foundation is also gratefully acknowledged.

The author would like to thank Dr. Alan D. Adler of the New England Institute for supplying a sample of tetrapropylporphine and for helpful discussions.

TABLE OF CONTENTS

PART I: THE CRYSTAL AND MOLECULAR STRUCTURE OF TETRAPROPYLPORPHINE

I. Introduction
   1. General
   2. Influence of Porphyrin Side Chains
   3. Hydrogen Bonding
   4. The Central Hydrogen Atoms
II. Experimental
   1. Crystals
   2. Space Group Determination
   3. Intensity Data Collection
III. Solution of the Phase Problem
   1. The Phase Problem
   2. The Patterson Function
   3. Direct Methods
IV. Structure Analysis
   1. The Method of Least Squares
   2. The Weighting Scheme
   3. Refinement
   4. Location of the Hydrogen Atoms
   5. Secondary Extinction Correction
   6. Evaluation of the Refinement
   7. The Gaussian Ellipsoid Approximation
V. Results
VI. Discussion

PART II: N-FORMYL-L-PHENYLALANINE:α-CHYMOTRYPSIN COMPLEX, THE EFFECT OF BINDING OF A COMPETITIVE INHIBITOR SUBSTRATE-LIKE MOLECULE

VII. Introduction
   1. Chymotrypsin
   2. N-Formyl-L-Phenylalanine (NFP)
IX. Experimental
   1. Crystal and Derivative Preparation
   2. Intensity Data Collection
   3. Crystal Motion
   4. Lack of Balance and Twin Corrections
   5. Absorption Correction
   6. Decay Correction
   7. Data Processing
X. Discussion of the Results
   1. Substitution in the Active Site Region
   2. Histidine 40
   3. Other Similarities with pH Change
   4. Final Comments

REFERENCES

APPENDICES
   APPENDIX I. The Reciprocal Lattice
   APPENDIX II. The Observed and Calculated Structure Factors of Tetrapropylporphine
   APPENDIX III. The Amino Acids and the Sequence of α-Chymotrypsin

LIST OF TABLES

I. The Positions of 1̄ in P2₁/c
II. Statistical Distribution of |E|'s
III. Weighting Scheme for TPrP Least Squares Refinement
IV. Reflections Corrected for Extinction
V. Final Atomic Parameters, Carbon and Nitrogen
VI. The Mean Square Displacements (in Å²) for the Carbon and Nitrogen Atoms
VII. Final Hydrogen Atom Parameters
VIII. Atomic Deviations from the Least Squares Plane of Individual Pyrrole Rings
IX. Data Ranges for the Enzyme
X. The Cell Parameters for NFP:α-Chymotrypsin
XI. Record of Crystal Motion
XII. Ratios of the Intensities of Real and Twin Lattices
XIII. Decay Correction Data
XIV. Rr Factors for the Redundant Reflections
XV. Positions of Structural Changes of α-CHT with pH from Vandlen and Tulinsky (60)
XVI. The Major Peaks in the NFP Difference Electron Density

LIST OF FIGURES

1. Porphine (a), chlorophyll a (b), and the tetrapyrrole portion of vitamin B12 (c)
2. The visible spectra of tetraphenylporphine, tetrapropylporphine and porphine
3. Three models for hydrogen bonding in porphine
4. The path of the ring current (a) and two major resonance forms of porphine (b,c)
5. Schematic of a four-circle diffractometer: (a) at χ = 90° and (b) at χ = 0°
6. Flow diagram of automatic data collection from Vandlen and Tulinsky (19)
7. The percent error curve for porphine from Chen (15)
8. Plots of the observed and calculated weighting schemes
9. Labeling scheme for TPrP and the deviations from the NLS plane (in Å)
10. Bond distances (in Å) and angles (in degrees)
11. Computer plot of the TPrP molecule
12. The composite electron density
13. The packing along a*
14. Dominant resonance structures of the porphine macrocycle (a,b). Average observed distances and angles of the independent pyrroles of TPrP (c)
15. Average structure for free base porphine
16. Scale representation of the two types of pyrrole rings
17. Average structure for porphine and TPP with the differences (Δ) between the pyrrole rings
18. Axial diffraction patterns for Native, NFP (2 weeks), and NFP (6 weeks)
19. Histogram of the number of reflections in each figure of merit range for α-CHT, from Tulinsky et al. (53)
20. Plot of the intensity of the (0,18,0) vs. exposure time
21. Plot of the relative absorption observed at χ = 90° for crystals 1, 3, and 5
22. Plot of the relative absorption observed at χ = 90° for crystals 2 and 4
23. The radial distribution of Native (solid line) and NFP:Native (dashed line)
24. (a) Electron density in the active site region viewed perpendicular to the local two-fold axis
    (b) Distances (in Å) from NFP in the electron density of Figure 24(a)
25. (a) Electron density in the active site region parallel to the two-fold axis
    (b) Distances (in Å) from NFP in the electron density shown in Figure 25(a)
26. Distances (in Å) from the local two-fold axis (heavy line)
27. The region near Histidine 40
28. The uranyl substitution position

PART I. THE CRYSTAL AND MOLECULAR STRUCTURE OF TETRA-N-PROPYLPORPHINE

I. Introduction

1. General

The porphyrins are tetrapyrrolic macrocycles with the ability to complex readily with many different metals.
As metal chelates, porphyrin-like materials are important in a number of biological processes including photosynthesis, cellular respiration, and electron transport. Excellent reviews of the chemistry and function of porphyrins and their related compounds are available (1,2).

Most of the porphyrins of animal origin are substituted at the pyrrole positions 1-8 (Figure 1a); however, the chlorophylls (Figure 1b) and vitamin B12 (Figure 1c) have side chains at the bridge carbon positions (α,β,γ,δ). The porphyrins can have a wide range of functions through variations in the stereochemistry of the macrocyclic structure and the side chains. The correlation of the function of the porphyrin moiety with its stereochemistry makes these substances particularly interesting from a structural viewpoint.

Figure 1. Porphine (a), chlorophyll a (b), and the tetrapyrrole portion of vitamin B12 (c).

The porphyrins are highly conjugated and intensely colored. The resonance stabilization for porphine was determined by Longo et al. (3) to be 419 kcal/mole. This large stabilization energy was attributed to delocalization of π-electrons available from the several identical units of 2-vinylpyrrole linked in conjugation. The strong resonance stability of porphyrins is also indicated in their mass spectra. Adler, Green, and Mautner (4) found that the formation of a stable composition π-electron system with an effective number of π-electrons equal to 4S + 2 (where S is an integer) was the major factor in the fragmentation of the porphyrin ion. Hence, the charge on the macrocycle is positioned so as to maintain the aromaticity of the basic molecule. Such strong aromatic character should temper the electronic effects of any substituent on the ring.

2. Influence of Porphyrin Side Chains

The highly conjugated nature of the porphyrins makes their electronic absorption spectra particularly useful. The porphyrins have a characteristic intense absorption band at 400 mμ called the Soret band; in addition, there are two to four bands in the visible region whose wavelengths and relative intensities depend on the specific compound. Lemberg and Falk (5) found a correlation between the nature and number of the substituents on the porphyrin molecule and its visible spectrum. For the naturally occurring porphyrins there are four types of visible spectra, which are related to the positions of the substituents and their electronegativities.

This correlation of substituent effects with changes in the visible spectra is explained by the theory of porphyrin spectra developed by Gouterman (6). For a square planar porphyrin the visible bands are a result of transitions from filled A2u orbitals to vacant Eg orbitals. In the free base the symmetry is lowered to D2h and the degeneracy is lifted, producing four bands instead of two, which arise from the forbidden (0-0) and the allowed (0-1) vibrational transitions. The transitions producing these bands are associated with electronic displacement to the periphery of the ring, thus accounting for the sensitivity of these absorptions to changes on the exterior of the ring.

The sensitivity of the visible bands to changes in side chains is demonstrated in the visible spectra of porphine, tetraphenylporphine (TPP), and tetrapropylporphine (Figure 2), where the α,β,γ,δ positions are substituted with hydrogen atoms, phenyl groups, and n-propyl groups, respectively. The bands vary in relative intensity and in wavelength for the three compounds.
The effect of substitution on the porphyrins is also observed in their nuclear magnetic resonance (NMR) spectra (7). The ring currents, arising from the delocalized π-electrons, in porphyrins are large and cause the peaks for protons on the periphery of the ring to be shifted to low field. The peak from the methine (bridge) hydrogen atom was found to be very sensitive to the symmetry of the substitutions on the pyrroles. Abraham et al. (8) found that substitution of methyl groups on the bridge positions decreased the ring current or aromaticity of the porphyrin by approximately 10%.

Figure 2. The visible spectra of tetraphenylporphine, tetrapropylporphine and porphine. Solvent for all three was benzene.

3. Hydrogen Bonding

In the free base porphyrins the close proximity of the central hydrogen atoms to all four nitrogen atoms suggests the possibility of N-H···N hydrogen bonding. Badger et al. (9) found the N-H stretching frequency to be 3320 cm⁻¹ for a number of free base porphyrins, whereas the N-H stretching frequency in pyrrole, indole, and carbazole is 3500 cm⁻¹. This decrease in stretching frequency indicates hydrogen bonding; furthermore, the N-H stretching frequency does not change significantly for solids, suggesting that the hydrogen bonding is intramolecular.

Three possible models for the hydrogen bonds were suggested by Mason (10); these are given in Figure 3. The symmetrical model (Figure 3c) was rejected since the N-N distance of 2.65 Å (from phthalocyanine) would result in the N-H bond being too long; since the interior of porphine rings tends to be larger than for phthalocyanines, this model becomes even less likely.

Figure 3. Three models for hydrogen bonding in porphine.
Mason identified three N-H vibrations in the infrared spectrum of porphine which correspond to the D2h symmetry of the opposite tautomer (Figure 3b); the adjacent tautomer with C2v symmetry would have five bands. The conclusion from the infrared studies was that porphyrins have the imino hydrogen atoms bonded to opposite pyrrole rings and that intramolecular hydrogen bonding is present.

4. The Central Hydrogen Atoms

With the acceptance of the opposite arrangement of the imino hydrogen atoms, attention was directed to the possibility of rapid tautomerism of the N-H protons. Becker, Bradley, and Watson (11) studied the NMR spectrum of coproporphyrin-I, which has methyl groups substituted on the pyrrole rings in positions 1,3,5,7 (Figure 1a). In Figure 4 the path of the ring current for the opposite arrangement of imino hydrogen atoms is given along with the two major resonance structures which contribute to the hybrid. Methyl groups in positions 1 and 5 are further from the ring current than the other two methyl groups; hence, two methyl hydrogen NMR lines were expected. Becker et al. (11) observed one single peak for the methyl hydrogen atoms; this result was explained by rapid tautomerism of the imino protons, which would make all four pyrrole rings magnetically equivalent. At -63°C no broadening of the CH3 line was observed, which indicated that the tautomerism rate was greater than 200 exchanges per second.

Figure 4. The path of the ring current (a) and two major resonance forms (b,c) of porphine.

In the crystal structure of porphine reported by Webb and Fleischer (12) the molecule is observed to have D4h symmetry with half-hydrogens bonded to each of the four nitrogen atoms. This result was explained as due to the tautomerism of the N-H protons.
In the crystal structure of triclinic tetraphenylporphine (TPP) (13) the imino hydrogen atoms are bonded to opposite nitrogen atoms and there are slight differences in the two types of pyrrole rings, resulting in a macrocycle with D2h symmetry and no evidence of tautomerism. The tetragonal form of TPP (14) is required to have D4h symmetry because of the space group and cannot therefore provide any information about tautomerism.

The structural investigation of α,β,γ,δ-tetra-n-propylporphine (TPrP) was undertaken for two reasons: (1) to obtain an accurate structure of the free base macrocycle and (2) to determine the effect, if any, of the aliphatic substituents on the aromaticity of the porphyrin moiety. The TPP structural determination has indicated a macrocyclic structure which corresponds to a hybrid of the two major resonance structures (see Figure 4). Since the time the TPrP structure determination was begun, Chen (15) has redetermined the structure of porphine and found it to be similar to TPP. Thus there were two examples of the hybrid structure of the macrocycle, both the parent compound and one with aromatic side chains. The aliphatically substituted molecule (TPrP) could provide more evidence for the common macrocycle if its structure is similar. Also, the structural analysis of TPrP could determine if the electronic effects which produce differences in the visible spectra of TPP, TPrP, and porphine (Figure 2) also produce differences in their structures.

II. Experimental

1. Crystals

The crystals were obtained by slow evaporation of a benzene solution of tetrapropylporphine in the dark; the crystals are purple hexagonal platelets. The crystal which was used for all X-ray measurements was approximately 0.15 mm in each dimension. This crystal was smaller than the X-ray beam and of sufficient size and quality to produce a strong diffraction pattern.
The density of a TPrP crystal, measured by flotation in an aqueous silver nitrate solution, was determined to be 1.22(2) g cm⁻³.

2. Space Group Determination

The geometry of the diffraction pattern must be determined before the intensity measurements can be made. Photographic methods are a convenient way to determine the size and symmetry of the direct and reciprocal lattices (see Appendix I for a discussion of the reciprocal lattice). The oscillation, Weissenberg and precession methods were used in this study. Detailed treatments of photographic methods and their uses have been presented by Stout and Jensen (16) and Henry, Lipson, and Wooster (17).

In the oscillation method the crystal is mounted such that it can be rotated about a direct lattice line and oriented so that this lattice line is perpendicular to the incident X-ray beam. The reciprocal lattice planes which are perpendicular to the lattice line can then be brought into reflecting position through oscillation of the crystal. The diffraction pattern is recorded on photographic film which is placed in a cylindrical camera around the oscillation axis. The pattern appears as a series of separated layer lines, each made up of individual spots. As well as being useful for aligning the crystal, the oscillation photograph can provide information regarding the symmetry of the lattice line and the spacing of the lattice along the oscillation line. A 10° oscillation photograph of the TPrP crystal showed no symmetry, which indicated that the oscillation axis was not a symmetry axis; in addition, the separation of the layer lines indicated that the lattice spacing was 5-6 Å.

In the Weissenberg film method, the cylindrical camera is used to record only one layer line, which is selected by means of a slotted screen. As the crystal is rotated the film is synchronously translated past the slot, giving a two-dimensional picture of the reciprocal lattice plane.
Zero and first layer Weissenberg photographs were taken of the TPrP crystal. These photographs showed two axes 90° apart, one of which was not perpendicular to the rotation axis. However, neither the identification of the crystal class nor the indexing of the Weissenberg photographs could be accomplished until the oscillation axis had been identified.

The precession method provides a means for observing the reciprocal lattice planes parallel to the oscillation axis. The precession motion is achieved by two oscillations: one about the oscillation axis of the crystal and the other about a line 90° from the oscillation axis and parallel to the X-ray beam. The flat photographic film is tilted to be parallel to the reciprocal lattice planes being photographed and follows the same precession motion as the crystal. A screen with an annular opening is used to select the desired reciprocal lattice plane to be photographed. Precession photographs for TPrP were taken of the planes defined by the oscillation axis and the two axes previously identified in the Weissenberg photographs. These photographs indicated that the oscillation axis was not a cell axis, and subsequent indexing revealed that this axis was the (h0h) lattice line. The crystal class was determined to be monoclinic, and the unique axis, b, and the c* axis were the two axes observed in the Weissenberg photographs.

When indexed, the Weissenberg photographs revealed systematic absences for the (0k0) reflections for odd values of k. These absences indicate the presence of a two-fold screw axis along the b axis (a two-fold rotation coupled with a translation of b/2 along the rotation axis). The precession photographs of the (h0l) and (h1l) zones showed another systematic absence for reflections of the type (h0l) whenever l was odd. This type of extinction is characteristic of a c-axis glide plane, which corresponds to reflection through a plane parallel to the c axis and a translation by c/2 along the c axis.
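The two extinction rules just described can be written as a small predicate. The following is a minimal sketch in Python; the function name `is_systematically_absent` is hypothetical and is not part of the original work. It encodes only the two conditions stated in the text: (0k0) absent for odd k, and (h0l) absent for odd l.

```python
def is_systematically_absent(h, k, l):
    """Return True if (hkl) is extinguished by the conditions quoted in the text."""
    # Two-fold screw axis along b: (0k0) reflections are absent when k is odd.
    if h == 0 and l == 0 and k % 2 != 0:
        return True
    # c-glide plane: (h0l) reflections are absent when l is odd.
    if k == 0 and l % 2 != 0:
        return True
    return False


if __name__ == "__main__":
    for hkl in [(0, 6, 0), (0, 5, 0), (2, 0, 2), (2, 0, 5), (1, 1, 1)]:
        status = "absent" if is_systematically_absent(*hkl) else "allowed"
        print(hkl, status)
```

A list of indexed reflections can be filtered with this predicate before checking whether any observed intensity violates the proposed space group.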
The fact that TPrP was monoclinic, along with the systematic extinctions observed in the Weissenberg and precession photographs, uniquely fixed the space group as centrosymmetric P2₁/c, space group no. 14, subgroup number 2, as tabulated in the International Tables of X-ray Crystallography, Volume I (18).

3. Intensity Data Collection

The crystal was mounted on a four-circle diffractometer with the (h0h) lattice line coincident with the φ axis of the diffractometer (Figure 5 gives the geometry of the four-circle diffractometer). The b axis was thus located in the equatorial plane (χ = 0°).

Figure 5. Schematic of a four-circle diffractometer: (a) at χ = 90° and (b) at χ = 0°.

The unit cell dimensions were determined by measuring the angular positions of twelve centered, moderately intense reflections which were well distributed in reciprocal space. The unit cell parameters and the orientation parameters of the crystal were obtained from the angular coordinates by the method of least squares. The unit cell parameters are a = 5.078(5), b = 11.59(2), c = 22.39(3) Å and β = 99.50(5)°. The calculated density of the crystal on the basis of two molecules per unit cell is 1.223 g cm⁻³, which compares favorably with the observed density of 1.22 ± 0.02 g cm⁻³. Since there are four equivalent positions in each unit cell for space group P2₁/c, the fact that there are only two molecules per unit cell fixes each molecule of TPrP to be situated on a center of symmetry.

The quality of the crystal and the quadrant for data collection can be determined by examination of the mosaic spreads of the intensities of the reflections. The spread arises because the small mosaic blocks which comprise the crystal are not perfectly aligned, so each one diffracts from a slightly different angle.
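Returning briefly to the cell parameters reported above, the calculated density can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the molecular formula C32H38N4 for TPrP (porphine, C20H14N4, with the four bridge hydrogen atoms replaced by n-propyl groups) and standard atomic weights; neither the formula nor the function names appear in the original text. The result should fall close to the quoted 1.223 g cm⁻³.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def monoclinic_volume(a, b, c, beta_deg):
    """Unit cell volume in cubic angstroms for a monoclinic cell (alpha = gamma = 90 degrees)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def crystal_density(z, molar_mass, volume_A3):
    """Density in g/cm^3 for z molecules of the given molar mass (g/mol) per cell."""
    # 1 cubic angstrom = 1e-24 cm^3
    return z * molar_mass / (AVOGADRO * volume_A3 * 1e-24)


# Assumption: C32H38N4 with standard atomic weights (not stated in the text).
MOLAR_MASS_TPRP = 32 * 12.011 + 38 * 1.008 + 4 * 14.007

volume = monoclinic_volume(5.078, 11.59, 22.39, 99.50)
density = crystal_density(2, MOLAR_MASS_TPRP, volume)

if __name__ == "__main__":
    print(f"V = {volume:.1f} A^3, calculated density = {density:.3f} g/cm^3")
```

Repeating the calculation with four molecules per cell doubles the density, which is why the measured value of 1.22 g cm⁻³ fixes the cell content at two molecules.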
If the intensity data are to be obtained with a non-scanning technique, it is necessary that the mosaic spreads be narrow enough that all of the diffracted beam can be measured at one time, thus obtaining the integrated intensity of the peak. The mosaic spreads were obtained by centering the peak with the angles φ, χ, and 2θ and then measuring the intensity of the peak at steps of 0.02° in ω across the peak. The intensities for the (060), (06̄0), (00,12), (00,1̄2̄), (100), and (205) reflections were measured. The mosaic spreads were symmetrical, single-peaked, and had widths less than 0.3° from background to background. Therefore, the quadrant choice was based on convenience and was defined by the b* and c* axes, including the +b* axis and the a* axis. To avoid Kα1, Kα2 splitting, the intensity data were collected to sin θ/λ ≤ 0.55, corresponding to a minimum spacing of 0.94 Å.

The intensity data were collected using CuKα radiation (λ = 1.5418 Å) and a Picker Four Circle Automatic X-ray Diffractometer controlled by a Digital Equipment Corp. (DEC) 4K PDP-8 computer (FACS-I System) coupled to a DEC 32K Disc File. Balanced Ni and Co filters were used to ensure that the incident radiation would be monochromatic within a fixed window width of 0.12 Å. Monochromatic radiation improves the peak to background ratio of the diffraction pattern. These filters consist of a metal foil of an element which has an absorption edge at the wavelength of the unwanted X-rays. An absorption edge occurs when the X-radiation is energetic enough to excite an electron out of an atomic orbital. For a filter of an element with atomic number Z-1, the wavelength of the absorption edge lies between the Kα and Kβ lines of the target element with atomic number Z. A system of balanced filters provides a means of obtaining X-radiation having a very narrow range of wavelengths which may be used to measure the peak intensity of a particular reflection.
In this manner the peak is first measured using a Kβ filter (Ni in the case of CuKα radiation); then the background is measured with a balanced Kα filter (Co), which transmits the same amount of Kβ radiation as the Kβ filter while removing some of the Kα radiation (about 10% transmission of Kα). The final intensity is calculated from the difference between the two measurements.

The intensity measurements were made under the control of a computer program using an ω step scan procedure (19). This is essentially a stationary crystal-stationary counter technique which measures the intensity of six steps in ω, in 0.03° increments, through the calculated position of the peak. At each step the intensity is measured for four seconds, using a balanced Ni filter, and the four largest measurements are summed to give the integrated intensity (count-six-drop-two; Wyckoff et al. (20)). At the ω value of the step with the highest intensity, the background is measured for four seconds, using the Co balanced filter, and the intensity is multiplied by four to give the total background. If the crystal has undergone some misalignment, the calculated value of ω will not correspond to the observed peak. To compensate for these small changes in ω, the program checks the ω value of the step corresponding to the highest intensity and the value of the intensity; if this step is not at the center of the range and the intensity is greater than some preset value, then one or two additional steps are made depending on the position of the highest intensity. This "wandering" ω step scan procedure allows the computer to continue to collect useful data even when some small misalignment has occurred.

Throughout the intensity data collection, the intensities of three reflections were measured every 100 reflections (about 1 hour) as a check on the alignment of the crystal.
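The arithmetic of the count-six-drop-two measurement described above can be sketched in a few lines of Python. This is an illustration only, not the diffractometer control program: the function names and the example counts are hypothetical, and treating the two middle steps (indices 2 and 3) as "the center of the range" is an assumption, since the text does not define the center precisely.

```python
def step_scan_intensity(peak_counts, background_count):
    """Net intensity: sum of the four largest of six step counts minus 4x background."""
    if len(peak_counts) != 6:
        raise ValueError("expected six omega-step measurements")
    # Count six, drop two: keep the four largest four-second counts.
    integrated = sum(sorted(peak_counts, reverse=True)[:4])
    # One four-second background count is scaled to the four retained steps.
    total_background = 4 * background_count
    return integrated - total_background


def peak_is_off_center(peak_counts, threshold):
    """True if the strongest step exceeds the threshold and is not a central step.

    Assumption: the two middle steps (indices 2 and 3) count as centered.
    """
    strongest = max(range(len(peak_counts)), key=lambda i: peak_counts[i])
    return peak_counts[strongest] > threshold and strongest not in (2, 3)


if __name__ == "__main__":
    counts = [120, 480, 950, 1010, 520, 130]  # hypothetical four-second counts
    print(step_scan_intensity(counts, background_count=40))
    print(peak_is_off_center(counts, threshold=500))
```

When `peak_is_off_center` returns True, the procedure in the text would take one or two extra steps on the side of the strongest count before summing.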
The computer program (19) automatically detects any misalignment from a decrease in the intensities of the monitor reflections and realigns the crystal by remeasuring the angular settings of the orienting reflections and obtaining a new orientation matrix through least squares analysis. The monitor reflections were chosen to give a description of the orientation of the reciprocal lattice; they were the (060) reflection at χ = 0° and the (202) reflection at χ = 90° measured at two φ values 90° apart. A flow diagram of the entire automated process of data collection is given in Figure 6.

Before the intensity data can be converted to structure factor amplitudes and used to determine the crystal structure, it may be necessary to apply several corrections to the data. These corrections are absorption, lack of balance, decay, and Lorentz-polarization.

The amount of absorption of X-rays by a crystal depends on the content of the unit cell and the path lengths through the crystal of the incident and diffracted beams. The absorption by one unit cell is determined by the mass absorption coefficient at the appropriate X-ray wavelength (18) of each element comprising the unit cell. The linear absorption coefficient of 9.11 cm⁻¹ for the TPrP unit cell was relatively small. Calculation of the path length of the X-ray beam for all the reflections used in a structure analysis would be difficult, especially for a crystal of general shape. Furnas (21) has suggested that the variation of the absorption in the crystal can be measured by the variation in the intensity of a reflection with φ at χ = 90°. In this orientation the normal to the reflecting planes is coincident with the φ axis, so that a rotation about φ does not move the planes from the reflecting position and the intensity should remain constant.
Thus, any variation in the intensity of the reflection would be due to a difference in the X-ray absorption or path length. This variation would be a measure of the relative absorption by the crystal. For TPrP the variation of the intensity of the (202) reflection with φ was measured for the φ values used in the data collection. Due to the small value of the absorption coefficient and the nearly isotropic shape of the crystal, a reliable variation of absorption could not be measured; therefore, no absorption correction was applied to the intensity data.

Figure 6. Flow diagram of automatic data collection from Vandlen and Tulinsky (19).

The Ni/Co filter pair which was used for data collection was not balanced exactly. This lack of balance was determined as a function of the scattering angle (2θ) by measuring the general background with both the Ni and Co filters. A plot of the differences between these background measurements was used to correct the background intensities for lack of balance. This correction was small for the TPrP data crystal and decreased to zero at about 2θ = 40°.

The intensities of the monitor reflections were compared to determine if the crystal had sustained any damage due to X-ray exposure. The monitor reflections were measured 22 times over a period of approximately 20 hours and the percent standard deviations from the average intensity were 0.5, 0.9, and 1.0% for the (060) and the two (202) reflections, respectively. Thus there was no decrease in intensity due to X-ray exposure and no correction for decay was applied.
In all, the intensities of 1744 independent reflections were measured. In addition, a redundant set of (0kl) reflections was measured and used to calculate $R_{OBS}$, which is

$$R_{OBS} = \frac{\sum |I_{0kl} - I_{0k\bar{l}}|}{\sum (I_{0kl} + I_{0k\bar{l}})}.$$

The resulting $R_{OBS}$ value of 0.011 for 382 redundant reflections indicates that the data are highly reproducible. The average value of the measured intensities for the (h0l) systematically absent reflections was used to fix an observable limit of 15 counts/16 seconds. This limit gave 331 accidentally absent reflections, and 1413 (81%) reflections were taken to be observable.

Since the absorption and decay corrections were negligible, the data were converted to structure amplitudes by simply applying Lorentz-polarization (Lp) and lack of balance corrections.

III. Solution of the Phase Problem

1. The Phase Problem

Due to the periodicity of the molecular arrangement in a crystal the electron density, $\rho(x,y,z)$, may be represented as a Fourier series

$$\rho(x,y,z) = \frac{1}{V}\sum_{h}\sum_{k}\sum_{l=-\infty}^{+\infty} F(hkl)\exp\{-2\pi i(hx+ky+lz)\}. \tag{1}$$

The Fourier coefficients, known as structure factors, are complex quantities related to the observed intensity data. Unfortunately, the phase relations among the diffracted waves cannot be observed, so only the intensities of the waves are obtained. Thus, the structure amplitudes, $|F(hkl)|$, can be obtained from the data but the phases must be deduced. This is the phase problem in crystallography and the utility of the method depends upon its solution.

2. The Patterson Function

The Patterson function, $P(u,v,w)$, is defined as the integral over the unit cell of the product of two electron density functions:

$$P(u,v,w) = V\int_0^1\!\!\int_0^1\!\!\int_0^1 \rho(x,y,z)\,\rho(x+u,y+v,z+w)\,dx\,dy\,dz. \tag{2}$$

The integrand will be nonzero only when the electron density at both of the points (x,y,z) and (x+u,y+v,z+w) is nonzero; hence, the Patterson function will have positive regions at the positions representing the vectors between atoms.
This function is useful in structure analysis because it is a phaseless Fourier series which can be calculated directly from the observations. The Patterson function in this form is

$$P(u,v,w) = \frac{1}{V}\sum_{h}\sum_{k}\sum_{l=-\infty}^{+\infty} |F(hkl)|^2 \cos 2\pi(hu+kv+lw), \tag{3}$$

which is also a centrosymmetric function.

The three dimensional Patterson function for TPrP was computed using the squares of the observed structure amplitudes as coefficients. The positive regions of this function were concentrated in planes parallel to the (114) plane. If the vectors between atoms lie close to this plane, the atoms of the molecule must also lie near the plane. As further support for the conclusion that the (114) planes are parallel to the approximate molecular plane, the three largest intensities are those of the (114), (104), and (112) reflections. The tilt of the molecule was thus defined by the requirement that the molecular plane be parallel to the (114) planes, and the translation by the molecule being situated on a center of symmetry. The angular orientation of the molecule could be determined by systematically rotating a model structure in the plane until the best fit of the observed diffraction pattern was obtained.

The angular orientation was not determined in the manner described above because the molecule was originally positioned at the wrong origin. In space group $P2_1/c$ there are four independent sets of inversion centers (these are given in Table I) and the rotational fit was attempted only for the set (0,0,0); (0,1/2,1/2). The extra symmetry of the Patterson function and the presence of cross-vectors between the two molecules in the unit cell obscured the real centers of symmetry of (0,0,1/2); (0,1/2,0), which were subsequently determined by direct methods.

Table I. The Positions of $\bar{1}$ in $P2_1/c$

1. (0,0,0); (0,1/2,1/2)
2. (1/2,0,0); (1/2,1/2,1/2)
3. (0,0,1/2); (0,1/2,0)
4. (1/2,0,1/2); (1/2,1/2,0)

3.
Direct Methods

Since the Patterson function can be used to deduce the phases indirectly (they are calculated using the atomic coordinates obtained from the vectors) employing only the structure amplitudes, information on the structure must be contained in the intensities or their relative distribution. Direct methods are objective mathematical procedures for obtaining the phases from the intensities and are independent of the structure. The method makes use of one or more mathematical expressions which predict the phase of a reflection from the phases and magnitudes of combinations of other reflections.

In these methods the intensity data are reduced to a form which describes point atoms with all temperature effects removed. The structure amplitudes are corrected for vibrational motion and placed on an absolute scale, then converted to normalized structure factors, |E|'s. The conversion to |E|'s is necessary because the probability function for the phase of E is independent of the complexity of the structure; thus, the mathematical expressions are valid regardless of the number of atoms in the unit cell. E is defined by

$$|E(hkl)|^2 = \frac{|F(hkl)|^2}{\varepsilon \sum_{i=1}^{N} f_i^2(hkl)} \tag{4}$$

where N is the number of atoms in the unit cell, $f_i(hkl)$ is the scattering factor of atom i at the θ value associated with the (hkl) reflection, and ε is a symmetry factor which accounts for the multiplicity of $|F(hkl)|$ in symmetry operations in reciprocal space. The theoretical distribution of |E|'s for both centrosymmetric and noncentrosymmetric crystals may be obtained from the probability function for E (22).

The TPrP data were converted to normalized structure factors with the absolute scale and average temperature factor (3.5 Å²) determined by Wilson's method (23). The |E|'s were scaled so that $\langle |E|^2 \rangle = 1.0$. The distribution of |E|'s, given in Table II, is as expected: it is similar to that of a centrosymmetric crystal.
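The normalization and the statistical test described above can be illustrated with a toy calculation. The sketch below is not the procedure used for TPrP: it builds a hypothetical centrosymmetric arrangement of equal point atoms (random positions, unit scattering factors, ε = 1, no temperature factor) and checks that the resulting |E| statistics approach the centric values of about 0.80, 0.97 and 32%.

```python
import math
import random


def normalized_E(F, f_atoms, epsilon=1.0):
    """|E| from |F|: |E|^2 = |F|^2 / (epsilon * sum of f_i^2)."""
    return abs(F) / math.sqrt(epsilon * sum(f * f for f in f_atoms))


def e_statistics(E_values):
    """Rescale so <|E|^2> = 1; return <|E|>, <|E^2 - 1|>, and % with |E| > 1."""
    n = len(E_values)
    scale = math.sqrt(sum(e * e for e in E_values) / n)
    E = [e / scale for e in E_values]
    mean_E = sum(E) / n
    mean_E2m1 = sum(abs(e * e - 1.0) for e in E) / n
    pct_gt1 = 100.0 * sum(e > 1.0 for e in E) / n
    return mean_E, mean_E2m1, pct_gt1


# Toy centrosymmetric "crystal": 20 equal point atoms plus their inverses.
random.seed(0)
atoms = [(random.random(), random.random(), random.random()) for _ in range(20)]
f_atoms = [1.0] * 40          # hypothetical unit scattering factor per atom

E_values = []
for h in range(-8, 9):
    for k in range(-8, 9):
        for l in range(1, 9):  # one half of reciprocal space
            # the centre of symmetry reduces F to a cosine sum
            F = 2.0 * sum(math.cos(2 * math.pi * (h * x + k * y + l * z))
                          for x, y, z in atoms)
            E_values.append(normalized_E(F, f_atoms))

mean_E, mean_E2m1, pct_gt1 = e_statistics(E_values)
```

For real data the scattering factors depend on θ and the amplitudes must first be placed on an absolute scale, as described in the text.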
In a centrosymmetric space group, the center of inversion causes the positions x,y,z and -x,-y,-z to be equivalent. This condition reduces the geometrical part of the structure factor to a cosine term; since the sine term is zero, the phase angles must be 0 or π and the corresponding cosine terms must be ±1. Since the space group for TPrP is centrosymmetric, the phase problem is reduced to the determination of the signs of the F(hkl)'s.

The mathematical phase relationship used for TPrP is Sayre's equation (24), which is derived from the properties of squared Fourier series with the assumption that the electron density is non-negative and contains resolved atoms. This equation is similar to the $\Sigma_2$ relationship developed by Hauptman and Karle (25) from probability considerations and may be expressed with normalized structure factors as

$$S(E_{\mathbf{A}}) = S\Big(\sum_{\mathbf{A}=\mathbf{B}+\mathbf{C}} E_{\mathbf{B}} E_{\mathbf{C}}\Big) \tag{5}$$

where S( ) means "the sign of"; $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ represent vectors of Miller indices (hkl); and the sum is over all combinations for which $\mathbf{A} = \mathbf{B} + \mathbf{C}$.

Table II. Statistical Distribution of |E|'s

                              Theoretical
                 TPrP      Centric    Acentric
<|E|>            0.83       0.80       0.89
<|E|^2>          1.00       1.00       1.00
<|E^2 - 1|>      0.90       0.97       0.74
% > 1.0         31.08      32.00      37.00
% > 2.0        [?].66       5.00       1.80
% > 3.0        [?].50       0.30       0.01

The sign of the sum in Equation (5) can be determined from a small number of terms if these terms are significantly larger than the rest. Thus, if the set of |E|'s is restricted to those with large values, i.e., |E| > 1.5 (26), only a partial sum is necessary. The probability function, $P_+(E_{\mathbf{A}})$, which gives the probability that the sign of $E_{\mathbf{A}}$ is positive, is defined by Cochran and Woolfson (27) to be

$$P_+(E_{\mathbf{A}}) = \frac{1}{2} + \frac{1}{2}\tanh\Big(\frac{\sigma_3}{\sigma_2^{3/2}}\,|E_{\mathbf{A}}| \sum_{\mathbf{A}=\mathbf{B}+\mathbf{C}} E_{\mathbf{B}} E_{\mathbf{C}}\Big) \tag{6}$$

where $\sigma_m = \sum_{i=1}^{N} Z_i^m$, $Z_i$ is the atomic number of atom i, and N is the total number of atoms in the unit cell.

Sayre's equation requires knowledge of the signs of some reflections before it can be used to predict new signs.
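As a numerical illustration of the sign relationship and its probability, the following sketch evaluates the sign of a Sayre sum and the Cochran and Woolfson probability for a list of already-formed products $E_{\mathbf{B}}E_{\mathbf{C}}$. The function names and the example numbers are hypothetical; for N equal atoms the factor $\sigma_3/\sigma_2^{3/2}$ reduces to $N^{-1/2}$, which the example uses.

```python
import math


def sayre_sign(triple_products):
    """Sign (+1, -1, or 0) of the sum of products E_B * E_C with A = B + C."""
    total = sum(triple_products)
    return (total > 0) - (total < 0)


def sign_probability(abs_E_A, triple_products, atomic_numbers):
    """Probability that E_A is positive (Cochran and Woolfson form).

    triple_products: signed products E_B * E_C for all pairs with A = B + C.
    atomic_numbers:  Z_i for every atom in the unit cell.
    """
    sigma2 = sum(z ** 2 for z in atomic_numbers)
    sigma3 = sum(z ** 3 for z in atomic_numbers)
    arg = sigma3 / sigma2 ** 1.5 * abs_E_A * sum(triple_products)
    return 0.5 + 0.5 * math.tanh(arg)


# Hypothetical case: 100 equal atoms, |E_A| = 2, two contributing products
p_plus = sign_probability(2.0, [2.5, 2.5], [1] * 100)
```

With these numbers the argument of tanh is exactly 1, so the probability of a positive sign is about 0.88; an empty or cancelling sum gives 0.5, i.e. no information.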
A set of starting signs can be obtained by assigning the origin, which is accomplished by choosing the phases of certain reflections. In a centrosymmetric monoclinic crystal there are eight possible origins (Table I) and the choice of any origin $(\varepsilon_1, \varepsilon_2, \varepsilon_3)$ affects the structure factors in the following way:

$$F'(hkl) = F(hkl) \times (-1)^{2(h\varepsilon_1 + k\varepsilon_2 + l\varepsilon_3)}. \tag{7}$$

The signs of structure factors for reflections of the type (hkl) = (even, even, even) are independent of the origin and are determined only by the structure. Reflections of other parities can be arbitrarily assigned phases which will determine the origin of the unit cell. For a monoclinic crystal three reflections are necessary to define the origin.

The origin-defining reflections should be chosen from the largest values of |E| and should have a large number of interactions in the phase determining equation. In addition, these reflections must also be independent, i.e., the parity of (hkl) must be different for each reflection. Each reflection must form a large number of useful terms in Sayre's equation; thus, the reflection must have a large number of interactions to form products $E_{\mathbf{B}}E_{\mathbf{C}}$. The three reflections which were used for the TPrP sign determinations were the $(1\,1\,2)$, $(2\,3\,\overline{17})$ and $(1\,6\,\overline{10})$ reflections with |E|'s of 2.55, 3.14, and 2.57, respectively. The numbers of interactions in Sayre's equation were 104, 59, and 55, respectively. By assigning all three of these reflections positive signs the origin was defined as (0,0,0) (25).

Additional signs can be generated through the phase relationships of the space group. For $P2_1/c$ these relationships are:

$$\begin{aligned} E(hkl) &= E(\bar{h}\bar{k}\bar{l}) \\ E(hkl) &= (-1)^{k+l}E(h\bar{k}l) \\ E(\bar{h}k\bar{l}) &= (-1)^{k+l}E(hkl) \end{aligned} \tag{8}$$

Signs may also be obtained from the symbolic addition method, in which reflections with large |E|'s and many interactions are assigned symbols and then phases based on these n symbols are generated. Since each symbol may be + or - there will be $2^n$ possible solutions to the structure.
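The parity argument for origin definition can be checked mechanically. The sketch below is illustrative only: the function names are invented here, the eight origins are taken as all combinations of 0 and 1/2, and the overbars on the indices of the second and third reflections are read from a damaged copy (only the parities matter for this test).

```python
from itertools import product

# The eight inversion centres of a centrosymmetric monoclinic cell
ORIGINS = list(product((0.0, 0.5), repeat=3))


def origin_shift_sign(hkl, origin):
    """Sign factor (-1)^(2(h*e1 + k*e2 + l*e3)) for an origin shift."""
    n = round(2 * sum(i * e for i, e in zip(hkl, origin)))
    return -1 if n % 2 else 1


def defines_origin(reflections):
    """True if three reflections have linearly independent parities (mod 2)."""
    (a, b, c), (d, e, f), (g, h, i) = [[x % 2 for x in r] for r in reflections]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return det % 2 == 1


tprp_set = [(1, 1, 2), (2, 3, -17), (1, 6, -10)]

# Each origin gives a different sign pattern for an origin-defining set,
# so fixing the three signs selects exactly one origin.
patterns = {tuple(origin_shift_sign(r, o) for r in tprp_set) for o in ORIGINS}
```

A reflection with all-even indices is unaffected by any of the eight shifts, which is why such reflections cannot help to fix the origin.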
The computer program (28) which was used to calculate the signs for the TPrP structure generates all the possible sets of phases by applying Sayre's equation to the $2^n$ starting sets. The starting set is used to determine additional signs, which are then used in an iterative application of Sayre's equation that is terminated when no new additions or changes are made to the list of signs.

Once all the possible sets of phases are calculated, the one true solution must be identified. One method is to calculate all $2^n$ electron density maps. The true solution will produce a map which gives the chemically correct structure. A more efficient method is to use a consistency index as a test for the correct structure. The consistency index, C, is defined as

$$C = \frac{\Big\langle \Big| E_{\mathbf{A}} \sum_{\mathbf{A}=\mathbf{B}+\mathbf{C}} E_{\mathbf{B}} E_{\mathbf{C}} \Big| \Big\rangle}{\Big\langle |E_{\mathbf{A}}| \sum_{\mathbf{A}=\mathbf{B}+\mathbf{C}} |E_{\mathbf{B}}|\,|E_{\mathbf{C}}| \Big\rangle} \tag{9}$$

where the average is taken over all values of $\mathbf{A}$. If C is equal to one, then for each reflection there are no sign contradictions in the Sayre sum. The true solution is usually the most consistent one. There are exceptions, however, when the phase relationships for a space group never change the sign of E. For these cases, of which $P\bar{1}$ is an example, one false solution will have C = 1.0. The computer program used for TPrP has two other useful tests: the true solution always requires the minimum number of cycles, and for the true solution the signs of the starting set are always predicted to be their starting values.

In the TPrP structure determination the 202 |E|'s with values greater than 1.3 (giving a ratio of [number of signs/atom] ≈ 10 (26)) were used in the sign determination program. A starting set of seven signs was used for TPrP; these were the three origin-defining reflections and four general reflections: the (114), (229), (116) and (145). In each of the sixteen possible solutions all 202 signs were determined and in four of these the solutions were obtained in three cycles.
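The consistency test can be sketched as follows. This is a minimal illustration and not the program of reference (28): it assumes the consistency index is the ratio of the average of |E_A Σ E_B E_C| to the average of |E_A| Σ |E_B||E_C|, and it uses a hypothetical one-index data set. It also enumerates the sign combinations for four symbolic reflections, which gives the sixteen trial sets mentioned above.

```python
from itertools import product


def consistency_index(E):
    """Consistency index for a set of signed normalized structure factors.

    E maps Miller-index tuples to signed E values; every pair (B, C) with
    A = B + C and both B and C present in E contributes to the sum for A.
    """
    num = den = 0.0
    for a, Ea in E.items():
        signed = unsigned = 0.0
        for b, Eb in E.items():
            c = tuple(ai - bi for ai, bi in zip(a, b))
            if c in E:
                signed += Eb * E[c]
                unsigned += abs(Eb) * abs(E[c])
        num += abs(Ea * signed)
        den += abs(Ea) * unsigned
    return num / den if den else float("nan")


# Four symbols, each + or -, give 2**4 = 16 starting sets
starting_sets = list(product((+1, -1), repeat=4))

# Hypothetical one-dimensional examples with Friedel-related signs
all_plus = {(h,): 1.0 for h in (-3, -2, -1, 1, 2, 3)}
mixed = {(h,): (-1.0 if abs(h) == 3 else 1.0) for h in (-3, -2, -1, 1, 2, 3)}
```

A set with no contradictions gives C = 1; the mixed-sign example contains sums whose terms partly cancel, and its index drops to 1/3.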
Of these four, the solution with the highest consistency index, C = 0.828, also had no contradictions in the determinations of the starting signs and was taken as the true solution.

The phases of this set along with the |E|'s were used to calculate an E-map, which is a synthesis of the electron density with E(hkl)'s as the Fourier coefficients. Since the |E|'s describe point atoms with no vibrational motion, the resulting map is characterized by sharp peaks and a low background. The E-map for TPrP revealed the positions of all eighteen non-hydrogen atoms in the asymmetric unit. The atomic positions had an average peak height of 237 with a fluctuating background of ±70. The trial structure obtained from the map, with an average thermal parameter for all atoms, was then used to calculate structure factors. The R-factor for this calculation was 0.31. R is defined as

$$R = \frac{\sum \big|\,|F_o| - |F_c|\,\big|}{\sum |F_o|} \tag{10}$$

where $|F_o|$ and $|F_c|$ are the observed and calculated structure amplitudes, respectively, and the sum is over all of the observed reflections.

The choice of this structure as the true one was subsequently confirmed by the successful refinement of the structure. At the completion of the refinement, the phases of the final structure (which are given in Appendix II) were compared to those obtained from direct methods, and all of the 202 signs which were originally determined with Sayre's equation were correct.

IV. Structure Analysis

1. The Method of Least Squares

After a trial structure consisting of all the non-hydrogen atoms has been obtained, the preliminary parameters can be improved by refinement. The method of least squares was used to refine the trial structure of TPrP. The principle of least squares is to find the parameters which best describe the observations by minimizing the weighted sum of the squares of the differences between observed and calculated values.
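For reference, the residual R of Equation (10) is a one-line calculation. The sketch below uses three hypothetical amplitude pairs, not TPrP data.

```python
def r_factor(F_obs, F_calc):
    """Conventional residual: sum of | |Fo| - |Fc| | divided by sum of |Fo|."""
    num = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(F_obs, F_calc))
    den = sum(abs(fo) for fo in F_obs)
    return num / den


# Hypothetical observed and calculated amplitudes
R = r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 25.0])
```

Because only magnitudes enter, R measures agreement of amplitudes and says nothing directly about whether the calculated signs are right.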
The structure factor, which is related to the observations in crystal structure analysis, can be represented as a function of the atomic parameters by

$$F(hkl) = \sum_{i=1}^{N} f_i(hkl)\,T_i(hkl)\exp\{2\pi \mathrm{i}(hx_i + ky_i + lz_i)\} \tag{11}$$

where $f_i(hkl)$ is the atomic scattering factor, $T_i(hkl)$ is the thermal parameter, $x_i$, $y_i$, $z_i$ are the coordinates of atom i, and N is the total number of atoms in the unit cell. The structure factor is not a linear function of the atomic parameters, so the relationship is linearized by expanding Equation (11) in a Taylor series and truncating the series after the first-order terms. The trial parameters have to be close to the true values for this approximation to be successful. The function which is minimized is

$$D = \sum_{i=1}^{n} w_i\,(|F_o|_i - |F_c|_i)^2 \tag{12}$$

where the sum is over all n of the observed reflections and $w_i$ is the individual weight of the observed structure factor. The weight of the reflection is related to the standard deviation ($\sigma_i$) of $|F_o|$ by $w_i = 1/\sigma_i^2$. The method will usually converge even though the equation for $F_c$ is approximate and the resulting adjustments are not exact. A system of iterative cycles of refinement is used until the changes in the parameters are insignificant with respect to the standard deviations of the parameters.

In the least squares refinement, the overall scale constant, the three coordinates of each atom, and the individual temperature factors are the parameters which are varied. The individual temperature factor is used to describe the thermal motion of the atoms and is related to the mean-square amplitude ($\overline{u^2}$) of atomic vibration. Each atom may be assumed to have an isotropic temperature correction, $T(hkl) = \exp(-B\sin^2\theta/\lambda^2)$, with one parameter, $B = 8\pi^2\overline{u^2}$, to be determined, or to have an anisotropic thermal correction of the form

$$T(hkl) = \exp[-(\beta_{11}h^2 + \beta_{22}k^2 + \beta_{33}l^2 + 2\beta_{12}hk + 2\beta_{13}hl + 2\beta_{23}kl)]. \tag{13}$$

These two forms of the thermal parameter are interrelated by $\beta_{11} = Ba^{*2}/4$, $\beta_{12} = Ba^*b^*\cos\gamma^*/4$, etc.
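The relation between the isotropic and anisotropic forms can be verified numerically. The sketch below assumes a monoclinic cell (so that α* = γ* = 90° and β* = 180° - β), uses the TPrP cell dimensions and the Wilson-plot B of 3.5 Å² quoted in the text, and checks that the β's derived from B reproduce the isotropic factor. The function names are illustrative.

```python
import math

# a, b, c in angstroms and beta in degrees for TPrP
CELL = (5.078, 11.59, 22.39, 99.50)


def reciprocal_monoclinic(a, b, c, beta_deg):
    """Return a*, b*, c* and cos(beta*) for a monoclinic cell (b unique)."""
    beta = math.radians(beta_deg)
    return (1 / (a * math.sin(beta)), 1 / b,
            1 / (c * math.sin(beta)), -math.cos(beta))


def betas_from_B(B, cell=CELL):
    """beta_ij equivalent to an isotropic B (beta_11 = B a*^2 / 4, etc.)."""
    a_s, b_s, c_s, cos_bs = reciprocal_monoclinic(*cell)
    return {"11": B * a_s ** 2 / 4, "22": B * b_s ** 2 / 4,
            "33": B * c_s ** 2 / 4, "12": 0.0,
            "13": B * a_s * c_s * cos_bs / 4, "23": 0.0}


def T_aniso(hkl, bt):
    """Anisotropic temperature factor in the beta_ij form."""
    h, k, l = hkl
    return math.exp(-(bt["11"] * h * h + bt["22"] * k * k + bt["33"] * l * l
                      + 2 * bt["12"] * h * k + 2 * bt["13"] * h * l
                      + 2 * bt["23"] * k * l))


def T_iso(hkl, B, cell=CELL):
    """Isotropic factor exp(-B sin^2(theta)/lambda^2), using 1/(4 d^2)."""
    h, k, l = hkl
    a_s, b_s, c_s, cos_bs = reciprocal_monoclinic(*cell)
    inv_d2 = ((h * a_s) ** 2 + (k * b_s) ** 2 + (l * c_s) ** 2
              + 2 * h * l * a_s * c_s * cos_bs)
    return math.exp(-B * inv_d2 / 4)
```

This conversion is what allows a refinement begun with isotropic B's to be continued with anisotropic parameters without changing the calculated structure factors at the moment of conversion.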
The scattering factors for the carbon and nitrogen atoms were taken from Cromer and Waber (29). A spherical scattering factor (30) was used for the hydrogen atoms.

2. The Weighting Scheme

A weighting scheme for least squares refinement is based on the estimated errors of the intensity measurements. The weighting scheme, which was chosen to be similar to that of Hughes (31), is given in Table III. The percent error curve which was used for the small intensities was constructed to be similar to that used in porphine (Figure 7 (15)) with limits of 100% error for I = 15 counts/16 seconds and 4% error for I = 500 counts/16 seconds. For the intermediate range of intensities the σ(hkl) was assumed to be a constant percentage of the magnitude of the structure amplitude. For the other ranges the σ(hkl) values reflected the increased error in the measurement of very small and very large reflections.

3. Refinement

A set of 1100 signs was obtained from a structure factor calculation based on the E-map coordinates and was used with the observed structure amplitudes to calculate an electron density. Estimates of individual thermal parameters and improved atomic coordinates were obtained from this map and were used as input for the least squares refinement. Full matrix least squares refinement with unit weights and isotropic thermal parameters was initiated using the computer program ORFLS (32). Three cycles of refinement varying the coordinates of the ten atoms of the two pyrrole rings, the coordinates of the methine bridge and propyl side chain atoms, and all of the isotropic thermal parameters reduced R to 0.150. At this stage, the isotropic thermal parameters were converted to the anisotropic form and the weighting scheme discussed above was included. The anisotropic thermal parameters were varied for six atoms per cycle.
The first two cycles were with respect to the atoms of one pyrrole ring and one methine carbon atom while the last cycle varied the parameters of the six propyl carbon atoms. Refinement of the thermal parameters, coordinates of all the atoms, and the overall scale constant one time reduced R to 0.135. The phases obtained from this refinement were used to locate the hydrogen atoms.

Table III. Weighting Scheme for TPrP Least Squares Refinement. Intensities in counts/16 seconds.

I > 20,000            w = ([illegible]/I) × 1/(0.04|F_o|)^2
500 < I ≤ 20,000      w = 1/(0.04|F_o|)^2
15 < I ≤ 500          w = 1/(% error × |F_o|)^2

Figure 7. The percent error curve for porphine from Chen (15). [Percent error plotted against intensity.]

4. Location of the Hydrogen Atoms

The difference Fourier synthesis was used to determine the positions of the hydrogen atoms. This difference electron density function for a centrosymmetric crystal is defined as

$$\Delta\rho(x,y,z) = \frac{1}{V}\sum_{h}\sum_{k}\sum_{l=-\infty}^{+\infty} (|F_o| - |F_c|)\,S_c \cos 2\pi(hx+ky+lz) \tag{14}$$

where $S_c$ is the sign of the calculated structure factor. If the signs, $S_c$, are nearly correct, this function provides a measure of the differences between the model structure used to calculate $F_c$ and the true structure. The $\Delta\rho(x,y,z)$ synthesis was used to locate the hydrogen atom positions, which should be positive peaks since they were not included in the model. Three difference Fourier syntheses and structure factor calculations (12 hydrogen atoms were located in the first map and R was reduced to 0.127; the six missing hydrogen atoms of one propyl group were located in the next two calculations, R = 0.122 and 0.117) revealed all of the hydrogen atoms' positions except that of the imino hydrogen atom. The position of the central hydrogen atom was obscured by the presence of residual density in the positions of the nitrogen atoms and by the presence of a relatively large peak at the center of symmetry of the molecule.
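The way a difference synthesis reveals atoms omitted from the model can be shown with a one-dimensional toy structure. Everything in this sketch is hypothetical: the cell length is taken as 1, atoms are unit-width point scatterers with arbitrary weights, and a heavy atom is placed on the centre of symmetry so that all calculated signs are certain to be correct.

```python
import math

N_H = 20                                  # highest order h in the series
known = [(0.0, 20.0), (0.12, 6.0)]        # (x, weight) of atoms in the model
missing = (0.22, 1.0)                     # light atom left out of the model


def structure_factor(atoms, h):
    """Centrosymmetric 1-D structure factor.

    An atom at x = 0 lies on the centre and is counted once; every other
    atom is accompanied by its partner at -x.
    """
    return sum(w * (1.0 if x == 0.0 else 2.0) * math.cos(2 * math.pi * h * x)
               for x, w in atoms)


def difference_density(x):
    """One-dimensional analogue of the difference synthesis."""
    total = 0.0
    for h in range(-N_H, N_H + 1):
        Fo = structure_factor(known + [missing], h)   # "observed"
        Fc = structure_factor(known, h)               # from the model
        Sc = 1.0 if Fc >= 0 else -1.0
        total += (abs(Fo) - abs(Fc)) * Sc * math.cos(2 * math.pi * h * x)
    return total


# Search half of the cell for the largest positive difference density
grid = [i / 1000 for i in range(0, 501)]
x_peak = max(grid, key=difference_density)
```

The largest positive peak falls at the position of the omitted atom. In a real structure some weak reflections will have wrong signs, which adds noise to the map but usually leaves the missing atoms recognizable.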
This residual peak is in a special position, (0,0,1/2); also, only reflections of the type k + l = 2n contribute to the density at this position. The electron density function for this position is of the form

$$\rho(0,0,1/2) = \frac{1}{V}\sum_{\substack{hkl \\ k+l=2n}} F(hkl)\cos \pi l. \tag{15}$$

Since the average value of the squared geometric part of the electron density, $\cos^2\theta$, is not 1/2 for this special position, the standard error of $\rho(0,0,1/2)$ is larger than the σ(ρ) for a general position by a factor of $\sqrt{2}$ (33).

The hydrogen atoms were assigned isotropic thermal parameters which were 25% larger than the isotropic B's of the atom to which they were bonded. The positions of the hydrogen atoms were taken from the difference Fourier calculations and were included in the refinement. The refinement was carried out by alternating the carbon and nitrogen atom refinement with cycles refining the coordinates of the hydrogen atoms. After several cycles, R was reduced to 0.089. An examination of the values of $|F_o|$ and $|F_c|$ indicated that several low order reflections were affected by extinction.

5. Secondary Extinction Correction

The secondary extinction effect is a decrease in the observed intensity of a reflection which is due to the incident beam being partially reflected by outer mosaic blocks before the beam penetrates to the deeper blocks. These deeper blocks receive less power and thus reflect less intensity than they would have otherwise. The secondary extinction effect is largest for high intensity reflections at low values of 2θ. The secondary extinction coefficient, g, is defined as $(I_c - I_o)/2I_cI_o$ and the values of the corrected structure amplitudes are obtained by

$$F_{corr}^2 = F_o^2\,(1 + 2g(Lp)F_c^2) \tag{16}$$

where Lp is the Lorentz-polarization factor and $(Lp)F_c^2$ is taken as $I_c$ (34).

Thirteen low order reflections showed marked discrepancies between $|F_o|$ and $|F_c|$ and were corrected for extinction. The value of g was determined to be $4.58 \times 10^{-7}\,e^{-2}$ from the slope of a plot of $F_c^2/F_o^2$ vs. $(Lp)F_c^2$.
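The extinction correction and the graphical determination of g can be sketched as a round trip on synthetic data. The amplitudes and Lp values below are hypothetical; only the value of g is taken from the text. Since $F_c^2/F_o^2 = 1 + 2g(Lp)F_c^2$, the slope of the plot described above is read here as 2g.

```python
def extinction_corrected_F(F_obs, F_calc, lp, g):
    """Corrected amplitude: F_corr^2 = F_o^2 * (1 + 2 g (Lp) F_c^2)."""
    return abs(F_obs) * (1.0 + 2.0 * g * lp * F_calc ** 2) ** 0.5


def fit_g(F_obs, F_calc, lp):
    """Estimate g as half the least squares slope of Fc^2/Fo^2 vs (Lp)Fc^2."""
    xs = [p * fc ** 2 for p, fc in zip(lp, F_calc)]
    ys = [(fc / fo) ** 2 for fo, fc in zip(F_obs, F_calc)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope / 2.0


G_TRUE = 4.58e-7                       # value reported in the text
F_calc = [200.0, 330.0, 450.0, 600.0]  # hypothetical calculated amplitudes
lp = [3.0, 2.5, 2.6, 2.4]              # hypothetical Lp factors

# Simulate extinction-affected observations, then recover g and correct them
F_obs = [fc / (1.0 + 2.0 * G_TRUE * p * fc ** 2) ** 0.5
         for fc, p in zip(F_calc, lp)]
g_fit = fit_g(F_obs, F_calc, lp)
F_corr = [extinction_corrected_F(fo, fc, p, g_fit)
          for fo, fc, p in zip(F_obs, F_calc, lp)]
```

With real data the points scatter about the line, and the correction is applied only to the strong low-order reflections where the effect is significant.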
The corrected values of the extinction affected reflections along with the final calculated values are given in Table IV. A structure factor calculation including the corrected values for the thirteen extinction affected reflections produced an R-factor of 0.080. A difference Fourier synthesis based on this calculation revealed a single position for the imino hydrogen atom, which was included in the refinement in a manner similar to the other hydrogen atoms. The structure was refined as previously described for several cycles and the resulting R-factor was 0.065.

Table IV. Reflections Corrected for Extinction

hkl       |F_o|    |F_corr|    |F_c|*
020        284       334        333
011        192       218        215
002        273       382        377
022        180       206        202
013        281       318        330
11(-1)     196       209        220
102        215       234        246
112        449       609        599
113        292       323        326
104        423       527        527
114        372       434        437
114        189       198        202
116        312       329        338

*|F_c| taken from final structure factor calculation (at R = 0.054). Overbars on some indices are not legible in the source; the two entries printed as 114 are distinct reflections.

6. Evaluation of the Refinement

The calculated bond distances based on the atomic coordinates of the R = 0.065 structure showed some anomalies: the C-C distances in the propyl groups ranged from 1.492 to 1.526 Å, considerably shorter than the expected single bond distance of 1.54 Å; the bond distances of the pyrrole rings were within 0.01-0.02 Å of the expected values but the exact nature of the geometry of the rings was not evident. The apparent shortening of the propyl group distances suggested that the introduction of the hydrogen atoms early in the refinement procedure and the subsequent refinement of the hydrogen parameters might have prejudiced the least squares analysis.

Such an effect has been observed in the structural analysis of α,β,γ,δ-tetra(4-pyridyl)porphinatomonopyridinezinc(II) (PyZnTPyP) (35). Two schemes of least squares refinement were used for PyZnTPyP; Refinement I excluded hydrogen atoms from the structure, and Refinement II included hydrogen atoms in the structure first in fixed positions and then allowed the positions to be refined.
A comparison of the two structures obtained by these schemes showed that Refinement II had produced systematic shifts in the positions of the carbon atoms bonded to the hydrogen atoms, with the cumulative effect being a "shrinkage" of the rings containing these carbon atoms. To check the possibility of a similar effect in the TPrP structure, a refinement similar to Refinement I of the PyZnTPyP analysis was carried out on the R = 0.065 structure excluding the propyl hydrogen atoms. After one full cycle of refinement on all of the carbon and nitrogen atom parameters, R decreased from 0.108 to 0.105. However, the propyl C-C distances increased somewhat and there were slight changes in the pyrrole distances.

Another aspect of the final structure that seemed a little inconsistent was the final R-factor of 0.065, which seemed large in terms of the reproducibility of the data determined from the redundant reflections ($R_{OBS}$ = 0.011). An examination of the weighted differences between $|F_o|$ and $|F_c|$ indicated that for the smaller structure amplitudes the estimated values of σ(|F_o|) were too large. A comparison of the weighting scheme with the observed differences, $\Delta F = \big|\,|F_c| - |F_o|\,\big|$, was obtained from plots of $\langle\Delta F\rangle/\langle|F_o|\rangle$ vs. $\langle|F_o|\rangle$ and σ(|F_o|)/$\langle|F_o|\rangle$ vs. $\langle|F_o|\rangle$ (Figure 8). These plots indicated that the weighting scheme was too severe for the small intensities; therefore, unit weights were used for all reflections in all further refinement.

Figure 8. Plots of the observed and calculated weighting schemes.

7. The Gaussian Ellipsoid Approximation

The observed electron density based on phases calculated from the structure at R = 0.105 was used to obtain a new set of atomic positions. At this level of refinement the phases should be correct so that an
electron density based on the observed structure amplitudes and these phases should be free of previous complications encountered in the refinement.

The atomic positions for the carbon and nitrogen atoms were obtained by a Gaussian analytical method first employed by Shoemaker, et al. (36). In this method the 27 points of the electron density in a 3x3x3 grid surrounding the peak maximum representing an atomic position are fitted by least squares to the three dimensional function

$$\rho = \exp\Big(p - \frac{r}{2}x^2 - \frac{s}{2}y^2 - \frac{t}{2}z^2 + ux + vy + wz + lyz + mxz + nxy\Big). \tag{17}$$

The values of the parameters p, r, s, t, etc. for each peak are determined by least squares from the calculated values of the electron density, $\rho_i$, at the 27 grid points. These parameters are determined by equations of the form:

$$p = \frac{1}{k_p}\sum_{i=1}^{27} c_{i,p}\ln\rho_i, \qquad r = \frac{1}{k_r}\sum_{i=1}^{27} c_{i,r}\ln\rho_i, \quad \text{etc.,} \tag{18}$$

where each parameter is determined by a set of constants and the values of $\rho_i$. These constants have been tabulated by Dawson (37) for the case of equally weighted grid points. The location of the peak center is obtained by setting the derivatives of ρ with respect to x, y, and z equal to zero, which results in the three simultaneous equations,

$$\begin{aligned} rx - ny - mz &= u \\ -nx + sy - lz &= v \\ -mx - ly + tz &= w, \end{aligned} \tag{19}$$

for the coordinates x, y, z of the maximum.

The Gaussian coordinates obtained from the observed electron density were corrected for non-convergence of the Fourier series, which arises because the Fourier sum is not infinite. The effect of series termination errors is to enlarge the volume of density due to one atom, causing the atomic densities to overlap, which slightly displaces the point of the maximum. The back shift correction for this effect is obtained from the difference between the atomic coordinates used for a structure factor calculation and the coordinates obtained from the resulting calculated electron density.
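The Gaussian peak fit can be sketched as an ordinary linear least squares problem in ln ρ. The code below does not use Dawson's tabulated constants; it solves the normal equations directly, which is an equivalent route for equally weighted grid points. It assumes the cross terms are labelled l (yz), m (xz) and n (xy), that coordinates are in grid units relative to the central grid point, and that all 27 densities are positive. The test density is synthetic.

```python
import math
from itertools import product


def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting for small systems."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(n):
            if i != col:
                factor = M[i][col] / M[col][col]
                M[i] = [vi - factor * vc for vi, vc in zip(M[i], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def gaussian_peak_centre(rho):
    """Fit ln(rho) on a 3x3x3 grid to a quadratic and return its maximum.

    rho maps (i, j, k), each in (-1, 0, 1), to a positive density value.
    """
    rows, rhs = [], []
    for x, y, z in product((-1, 0, 1), repeat=3):
        # columns multiply p, r, s, t, u, v, w, l, m, n respectively
        rows.append([1.0, -x * x / 2, -y * y / 2, -z * z / 2,
                     x, y, z, y * z, x * z, x * y])
        rhs.append(math.log(rho[(x, y, z)]))
    # normal equations for equally weighted grid points
    ATA = [[sum(row[i] * row[j] for row in rows) for j in range(10)]
           for i in range(10)]
    ATb = [sum(row[i] * val for row, val in zip(rows, rhs)) for i in range(10)]
    p, r, s, t, u, v, w, l, m, n = solve(ATA, ATb)
    # zero-gradient condition of the fitted exponent
    return solve([[r, -n, -m], [-n, s, -l], [-m, -l, t]], [u, v, w])


def synthetic_density(x, y, z, centre=(0.2, -0.1, 0.3)):
    """Hypothetical ellipsoidal Gaussian peak centred off the grid point."""
    dx, dy, dz = x - centre[0], y - centre[1], z - centre[2]
    return math.exp(5.0 - 0.6 * dx * dx - 0.45 * dy * dy - 0.75 * dz * dz
                    + 0.1 * dx * dy)


rho = {g: synthetic_density(*g) for g in product((-1, 0, 1), repeat=3)}
centre = gaussian_peak_centre(rho)
```

For an exactly Gaussian peak the fitted centre is recovered to rounding error; for a real map the result is then subject to the back shift correction described in the text.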
This difference between the input coordinates and the calculated map coordinates is subtracted from the coordinates obtained from the observed electron density.

The Gaussian method was used to obtain the atomic coordinates from the calculated electron density for the R = 0.105 structure of TPrP. These coordinates were then used to make the back shift corrections. The coordinates of the hydrogen atoms were obtained by the Gaussian method from the corresponding difference Fourier synthesis based on a structure factor calculation with all hydrogen atoms excluded.

The structure obtained from the electron densities was refined for three cycles: one on all the parameters of the carbon and nitrogen atoms, a second on the coordinates of the hydrogen atoms, and a third on the positional parameters of the carbon and nitrogen atoms. This refinement reduced the R-factor from 0.062 to 0.054. The change in R-factor between the last two cycles was only 0.001 and the shifts in the parameters were insignificant with respect to the standard deviations of the coordinates; therefore, the refinement was terminated at this point.

V. Results

The final coordinates, anisotropic temperature factors and peak heights for all the carbon and nitrogen atoms are listed in Table V, with the atoms labeled according to the scheme shown in Figure 9. The estimated standard deviations of the atomic parameters are given in the last row of Table V. The mean square displacements along the principal axes (ORTEP (38)) are given in Table VI for the carbon and nitrogen atoms. The final coordinates, isotropic thermal parameters, and peak heights for the hydrogen atoms are listed in Table VII. The hydrogen atoms have the label of the carbon or nitrogen atoms to which they are bonded.

The atomic positions of the nucleus of the TPrP molecule were fitted to a least squares plane according to the method described by Shoemaker, et al. (39).
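A least squares plane of the kind used here can be sketched as follows. This is not the procedure of reference (39): it is a generic fit that minimizes the sum of squared perpendicular distances for equally weighted atoms, it assumes the coordinates have already been converted to a Cartesian system in ångströms, and it finds the plane normal by power iteration, a technique chosen here only to keep the example free of external libraries.

```python
import math


def least_squares_plane(points, iterations=500):
    """Return (unit normal m, distance d) of the best plane m . x = d."""
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    centred = [[p[i] - centroid[i] for i in range(3)] for p in points]
    # 3x3 scatter matrix of the centred coordinates
    C = [[sum(q[i] * q[j] for q in centred) for j in range(3)]
         for i in range(3)]
    trace = C[0][0] + C[1][1] + C[2][2]
    # The dominant eigenvector of (trace*I - C) is the eigenvector of C with
    # the smallest eigenvalue, which is the normal to the best plane.
    B = [[(trace if i == j else 0.0) - C[i][j] for j in range(3)]
         for i in range(3)]
    m = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        m = [sum(B[i][j] * m[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(v * v for v in m))
        m = [v / norm for v in m]
    d = sum(mi * ci for mi, ci in zip(m, centroid))
    return m, d


def deviations(points, m, d):
    """Signed perpendicular distance of each point from the plane."""
    return [sum(mi * pi for mi, pi in zip(m, p)) - d for p in points]


# Hypothetical exactly planar set of points: z = 0.1 x + 0.2 y + 3
pts = [(x, y, 0.1 * x + 0.2 * y + 3.0)
       for x in range(-2, 3) for y in range(-2, 3)]
normal, dist = least_squares_plane(pts)
dev = deviations(pts, normal, dist)
```

For a real molecule the deviations are not zero, and their root mean square value gives the standard deviation of the plane.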
The plane is described by the equation $m_1x + m_2y + m_3z = d$, where d is the origin-to-plane distance and the best values of $m_i$ are obtained by minimizing the sum of the squares of the residuals, $(m_1x + m_2y + m_3z - d)^2$. The equation obtained for the least squares plane (NLS) of the 24 atoms of the TPrP nucleus is 3.37x + 2.02y + 13.03z = 9.89 Å and the deviations of the atoms from this plane are given in Figure 9. The standard deviation of this plane is 0.015 Å. The individual pyrrole rings were also fitted to least squares planes and the atomic deviations from these planes are given in Table VIII.

The interatomic distances and bond angles for TPrP are shown in Figure 10. The standard deviations in bond lengths vary from [...]

Table V. Anisotropic temperature factor = exp[-(β11h² + β22k² + β33l² + 2β12hk + 2β13hl + 2β23kl)]. [Tabulated values not recoverable.]

Figure 9. Labeling scheme for TPrP and the deviations from the NLS plane (in Å).

[...]

$$[\ldots] = \frac{[\ldots]}{V}\sum\big\{\Delta F^2(1-m^2) + \Delta F^2 + \sigma^2\big\} \tag{22}$$

and $\Delta F^2 = (|F_{P+I}| - |F_P|)^2$. The term $\Delta F^2(1-m^2)$ is due to the experimental errors in the phases, the $\Delta F^2$ term is due to the error in the use of ΔF as the scattering factor of the inhibitor, and the $\sigma^2$ term is due to the experimental errors in the ΔF values.

The difference Fourier technique is a valuable method of determining the structures of isomorphous chemical modifications of the native enzyme. The method allows many similar structures to be investigated readily once the phases are determined for the native enzyme structure.

2. Structure Determination of α-Chymotrypsin

Since the investigation of the structure of NFP:α-CHT depends upon the determination of the phases and thus the structure of the native enzyme, a brief summary of this work (53) will be presented here.

The crystals of α-CHT are monoclinic, space group $P2_1$, with four molecules in the unit cell and thus two molecules per asymmetric unit.
The unit cell dimensions, from an average of the parameters of seven crystals, are a = 49.24(7), b = 67.20(10), c = 65.94(9) Å, and β = 101.79(8)°. The standard deviations of the last digit from the average are in parentheses. The two molecules in the asymmetric unit are related by a non-crystallographic local two-fold axis (57). The equation for this axis, as determined from the MSU 3.5 Å native enzyme electron density map, is

$$z = 0.158\,x + 0.367, \qquad y = 0.708 \qquad (23)$$

where (x, y, z) are fractional coordinates. The phases were determined by multiple isomorphous replacement with six heavy-atom derivatives to 2.8 Å resolution. The phase analysis produced an average figure of merit, ⟨m⟩, of 0.76 with two-thirds of the data having a figure of merit greater than the average (see Figure 19). This gives a root mean square error in the best native electron density at 2.8 Å resolution of 0.20 e/Å³.

[Figure 19. Histogram of the number of reflections in each figure of merit range for α-CHT, from Tulinsky et al. (53). Total number of reflections = 9044.]

IX. Experimental

1. Crystal and Derivative Preparation

Proteins crystallize with a considerable quantity of the solvent in a fairly mobile state within each unit cell. The presence of the mother liquor is necessary to keep the crystals well ordered, but the composition of the solvent can be changed. Fairly large molecules can diffuse into protein crystals and enter each unit cell without disturbing the structure of the protein molecules. This is the basis of the preparation of heavy atom derivatives as well as the means for studying inhibitor:enzyme complexes. α-CHT crystals contain approximately 40% mother liquor so that diffusion of inhibitor molecules into the crystals is readily accomplished.
α-CHT crystals are grown from about half-saturated ammonium sulphate solutions at pH ~ 4.0 and are then stored over 75% saturated ammonium sulphate at the same pH. NFP, obtained from International Chemical and Nuclear Corp., Irvine, California, and used without further purification, was dissolved in a few drops of acetonitrile and the solution was diluted to a volume of 1-2 ml with a 75% saturated ammonium sulphate solution. Then approximately one-half of the volume of storing solution in a tube of α-CHT crystals was replaced with the inhibitor solution, producing an approximate molar ratio of NFP:α-CHT of 25:1 (58). After two weeks of soaking, a crystal was mounted and its diffraction pattern showed differences from that of the native enzyme (see Figure 18). The changes in the diffraction pattern became stationary after about six weeks of soaking; the diffraction data of the NFP:α-CHT complex were obtained at this time.

2. Intensity Data Collection

The NFP:α-CHT data were obtained to 2.8 Å resolution, which includes ~11,000 reflections to 2θ = 32.0°. The data were collected from five crystals because the crystals suffer radiation damage and usually have to be abandoned after 35-40 hours of X-ray exposure. The data were collected in concentric shells of 2θ with small regions of overlap (see Table IX). The reflections in the overlap regions are used to verify the scale of the data from the different crystals. The NFP:α-CHT data were collected with the same instrument, counting technique, and automatic realignment program as were used for TPrP (see Section II). The data for the five crystals were collected and converted to structure amplitudes separately and then merged into one data set. The total data collection included 11,885 intensity measurements over a 16 day period. The data treatment was much the same as that used for the TPrP data. The differences in these procedures caused by the special properties of the protein crystals are discussed below.
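The relation between the 2θ limit and the quoted resolution can be checked with Bragg's law, d = λ/(2 sin θ). The following is a minimal sketch rather than part of the original data treatment: the wavelength is an assumption (Cu Kα, λ ≈ 1.5418 Å), since the radiation is not named in this section, and the function name `d_spacing` is purely illustrative. With that assumption the upper 2θ limits of the five shells in Table IX give the tabulated d_min values to one decimal place.

```python
import math

# Assumed wavelength (Cu K-alpha, in angstroms); not stated in this section.
CU_K_ALPHA = 1.5418


def d_spacing(two_theta_deg, wavelength=CU_K_ALPHA):
    """Bragg's law: d = lambda / (2 sin(theta)), with theta = 2theta / 2."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))


# Upper 2-theta limits (degrees) of the five data shells listed in Table IX.
for two_theta in (20.0, 25.0, 28.0, 30.15, 32.0):
    print(f"2theta = {two_theta:5.2f} deg -> d_min = {d_spacing(two_theta):.2f} A")
```

The agreement with Table IX is what suggests the wavelength assumption is reasonable; a different radiation would shift every d_min value by the same factor.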
The crystals were always mounted with the unique axis, b, parallel to the φ axis of the four circle diffractometer (Figure 5). The quadrant for data collection was defined by the a* and c* axes [signs illegible] and included +b*.

Table IX. Data Ranges for the Enzyme Crystal

Crystal   2θ Range (degrees)   d_min (Å)   Number of Reflections
1         2.5-20.0             4.4         2800
2         19.7-25.0            3.6         2650
3         24.75-28.0           3.2         2200
4         27.80-30.15          3.0         1900
5         29.90-32.0           2.8         1900

Before data collection, 2θ scans of the axial data were made to ensure crystal quality and reproducibility of the derivative diffraction pattern. These axial runouts were repeated following the data collection to check for systematic changes in the diffraction which might arise from radiation damage or loss of humidity. In addition to this check, two sets of centric (h0l) projection data to 6 Å resolution (2θ < 15°) were collected, one before and one after data collection. These (h0l) data sets served several purposes. A comparison of the before and after data provided a check on the stability of the substitution during data collection and provided some measure of the decay of the intensities. The (h0l) data sets were used to calculate electron density projections from which an approximate scale factor to the native data could be obtained. Finally, a comparison of the difference Fourier projections provided a further check on the reproducibility of the amount and position of substitution in the several crystals.

The absorption, lack of balance, and twin size of the crystals were examined before data collection. The absorption measurements were made as described for the TPrP crystal on four b axis reflections at χ = 90°; these were the (020), (040), (060), and (0,18,0) with 2θ = 2.6, 5.3, 7.9, and 23.8°, respectively. Thus, any 2θ dependence of the relative absorption could be detected. Lack of balance measurements were made as previously for the TPrP analysis. Crystals of α-CHT are twinned along the a* direction (53).
A twinned crystal appears to be a single crystal but is in fact two different interpenetrating orientations of a lattice. In the α-CHT crystals the two lattices practically superimpose for the (0kl) reflections. The size of the twin varied among the crystals, and by careful choice of the crystals for data collection the effect of the twin lattice can be minimized. Before data collection the intensities of the (600) reflection for the real and the twin lattice are compared to determine the magnitude of the twin.

The cell dimensions for each inhibitor:enzyme complex crystal and the orientation of the lattice were determined from the angular coordinates of 12 reflections distributed throughout reciprocal space. The cell constants for the five data crystals are given in Table X. The average cell parameters for the NFP:α-chymotrypsin complex are a = 49.27(2), b = 67.10(6), c = 65.84(7) Å, and β = 101.75(2)°. A comparison of these values with the average cell dimensions of the native enzyme crystals indicates that the two kinds of crystal are the same within the error of the determination.

Table X. The Cell Parameters for NFP:α-Chymotrypsin

Crystal    a (Å)      b (Å)       c (Å)      β (degrees)
1          49.27      67.04       65.91      101.75
2          49.24      67.16       65.79      101.77
3          49.28      67.11       65.77      101.74
4          49.26      67.16       65.81      101.74
5          49.28      67.02       65.92      101.75
Average    49.27(2)   67.10(6)    65.84(7)   101.75(2)
α-CHT      49.24(7)   67.20(10)   65.94(9)   101.79(8)

The intensities of three reflections were monitored every 100 reflections (~1 hour) throughout the data collection as a check on the crystal alignment and on radiation damage. These reflections were the (0,18,0) at χ = 90° over both the +a* and +c* axes and the general reflection (4,14,5) at a χ value of approximately 59°.

3. Crystal Motion

The enzyme crystals are mounted in glass capillaries and are held to the wall of the capillary by the surface tension of the mother liquor. Crystals mounted in this manner usually move continuously to new orientations during data collection. The computer program, discussed in Section II, compensates for the reorientation of the crystal. A plot of the intensity of the (0,18,0) monitor reflection at +a* from the crystal 2 data is shown in Figure 20. This plot indicates the effective compensation for reorientation accomplished by the program. Table XI contains a record of the realignments necessary for each crystal along with the total hours of exposure for each crystal.

[Figure 20. Plot of the intensity of the (0,18,0) vs. exposure time. I is in counts/16 seconds and XM means the crystal was considered to have moved.]

After the data for each crystal were collected the following corrections were applied: lack of balance, twin, absorption, decay, and Lorentz-polarization.

4. Lack of Balance and Twin Corrections

The data were corrected for lack of balance by adding the measured correction, at the approximate 2θ angle, to the background intensity of the reflection. This correction was relatively large only for 2θ < 10° and thus was important only for crystal 1. The twin correction was applied to the (0kl) reflections by using a twin ratio calculated from the intensity of the (600) reflection for the real and the twin lattice. The twin ratios, $k_t = I_{real}/I_{twin}$, for the five data crystals are tabulated in Table XII. The observed intensities for the (0kl) reflections result from both the real and the twin lattice and therefore have to be reduced to only the real contribution. The correction is of the form

$$I_{real}(0kl) = \frac{k_t}{k_t + 1}\, I_{OBS}(0kl).$$

5. Absorption Correction

The absorption measurements were corrected for lack of balance and then plotted as $I_{MAX}/I(\phi)$ vs. φ, where $I_{MAX}$ is the maximum observed intensity of the reflection, to show the relative absorption of the crystal.
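As a brief aside on the twin correction of Section 4 above, the reduction of an observed (0kl) intensity to its real-lattice contribution can be sketched in a few lines. This is an illustrative sketch only: the function name `real_intensity` is hypothetical, and the twin ratios are the values listed in Table XII. It simply evaluates I_real = I_OBS · k_t/(k_t + 1), which follows from I_OBS = I_real + I_twin with k_t = I_real/I_twin.

```python
def real_intensity(i_obs, k_t):
    """Real-lattice part of an observed (0kl) intensity.

    k_t = I_real / I_twin, so I_obs = I_real + I_twin = I_real * (1 + 1/k_t)
    and therefore I_real = I_obs * k_t / (k_t + 1).
    """
    if k_t <= 0:
        raise ValueError("the twin ratio must be positive")
    return i_obs * k_t / (k_t + 1.0)


# Twin ratios I_real/I_twin for the five data crystals (Table XII).
TWIN_RATIOS = {1: 8.42, 2: 3.56, 3: 3.37, 4: 2.01, 5: 6.36}

for crystal, k_t in TWIN_RATIOS.items():
    fraction = real_intensity(1.0, k_t)
    print(f"crystal {crystal}: {100.0 * fraction:.1f}% of I_obs(0kl) is retained")
```

On this reading, the crystal with the smallest twin ratio (crystal 4) keeps only about two-thirds of each observed (0kl) intensity, which is why the choice of crystals with a small twin matters.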
These plots were compared for each of the four (0k0) reflections and were averaged to give one absorption correction if no major differences in the curves were observed. These plots are shown in Figures 21 and 22. For crystals 1 and 3 all four curves were the same and only one absorption correction was used in each case. For crystal 2 the (020) reflection showed a different absorption so two corrections were applied: the (020) curve for k ≤ 2 and an average curve for k > 2. The (0,18,0) reflection showed a different absorption in crystals 4 and 5, and an average correction was applied for k ≤ 6 and the (0,18,0) curve for k > 6. These differences probably arise from the difference in the path length of the X-ray beam through the capillary, mother liquor, and crystal at the different 2θ values of the (0k0) reflections. Crystals 2 and 4 (Figure 22) showed the maximum relative absorption with $(Abs.)_{MAX}$ = 1.99 and 2.15, respectively, where $(Abs.)_{MAX} = [I_{MAX}/I(\phi)]_{MAX}$.

Table XI. Record of Crystal Motion

Crystal   Number of Realignments   Total Exposure (hours)
1         4                        38
2         4                        36
3         3                        30
4         2                        27
5         1*                       26

*This crystal really showed no motion; it was made to be realigned after 20 hrs. as a precautionary measure.

Table XII. Ratios of the Intensities of Real and Twin Lattices

Crystal   I_real/I_twin
1         8.42
2         3.56
3         3.37
4         2.01
5         6.36

The absorption correction was made using the semi-empirical method of North, Phillips, and Mathews (59). In this method the transmission of a reflection by a crystal, T(hkl), is defined as the average of the transmission in the direction of the incident and reflected beams, thus

$$T(hkl) = [T(\phi_{inc}) + T(\phi_{refl})]/2 \qquad (24)$$

where Abs(hkl) = 1/T(hkl). This correction accounts for the change in the path length of the incident and reflected beams as χ is changed from 90° to 0°. The values of $\phi_{inc}$ and $\phi_{refl}$ are calculated from $\phi_{hkl}$ and for the diffractometer geometry are

$$\phi_{inc} = \phi_{hkl} - \sin^{-1}(\sin\theta\cos\chi), \qquad \phi_{refl} = \phi_{hkl} + \sin^{-1}(\sin\theta\cos\chi). \qquad (25)$$

[Figure 21. Plot of the relative absorption observed at χ = 90° for crystals 1, 3, and 5. The average curves are solid lines and the individual curves are dashed lines.]

[Figure 22. Plot of the relative absorption observed at χ = 90° for crystals 2 and 4. The average curves are solid lines and the individual curves are dashed lines.]

With these φ values the absorption corrections are obtained from the curves described above and used to calculate Abs(hkl) from

$$Abs(hkl) = \frac{2\,Abs(\phi_{inc})\,Abs(\phi_{refl})}{Abs(\phi_{inc}) + Abs(\phi_{refl})} \qquad (26)$$

which is a rearrangement of Equation 24. This method gives a good estimate of the relative absorption correction.

6. Decay Correction

Another systematic error present in the data is the loss of intensity with exposure to X-rays. This effect is observed in the decrease in the intensities of the monitor reflection with increased exposure (see Figure 20) and is the major reason that several crystals have to be used for a data set. The decay correction is obtained from the monitor reflections by fitting a plot of I(1)/I(n) vs. n with a least squares line, where n is the number of the reflection or the time at which it was measured. The reflections are numbered consecutively as they are measured and the reflection number, n, is used as an estimate of elapsed time. The slope of the least squares line (intercept = 1.0) is used to obtain the decay correction for any reflection from the equation

$$Decay(n) = 1.0 + (slope) \times n.$$

Most of the reflections collected for this data set have 2θ values between 20° and 32°; thus the decay of the monitor reflections at 2θ = 21° and 23° is a good estimate for all reflections.
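The two corrections just described reduce to very small formulas, and a short sketch may make them concrete. This is a hedged illustration, not the program used for the data treatment: the function names and the two sample curve readings (1.2 and 1.8) are hypothetical, while the slopes and reflection counts are those listed in Table XIII. The first part checks that Equation 26 is the reciprocal of the mean transmission of Equation 24; the second evaluates Decay(n) = 1.0 + slope × n at the last reflection of each crystal.

```python
def combined_absorption(abs_inc, abs_refl):
    """Equation 26: combine the curve readings at phi_inc and phi_refl."""
    return 2.0 * abs_inc * abs_refl / (abs_inc + abs_refl)


def decay_correction(slope, n):
    """Decay(n) = 1.0 + slope * n, with the intercept fixed at 1.0."""
    return 1.0 + slope * n


# Hypothetical readings from an absorption curve, for illustration only.
a_inc, a_refl = 1.2, 1.8
# Equation 24 with T = 1/Abs: mean of the two transmissions.
mean_transmission = 0.5 * (1.0 / a_inc + 1.0 / a_refl)
print(combined_absorption(a_inc, a_refl), 1.0 / mean_transmission)

# (slope x 10^5, number of reflections) as listed in Table XIII.
TABLE_XIII = {
    "crystal 1, 0 < 2theta < 15": (1.72, 2904),
    "crystal 1, 15 < 2theta < 20": (5.08, 2904),
    "crystal 2": (7.85, 2713),
    "crystal 3": (5.33, 2301),
    "crystal 4": (4.37, 1990),
    "crystal 5": (4.91, 1977),
}

for label, (slope_e5, n_refl) in TABLE_XIII.items():
    correction = decay_correction(slope_e5 * 1e-5, n_refl)
    print(f"{label}: maximum decay correction = {correction:.2f}")
```

Evaluated this way, the slopes and reflection counts give the maximum corrections tabulated in Table XIII when rounded to two decimal places, which is a useful consistency check on the reconstructed table.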
However, the data set for crystal 1 contains the reflections with 2θ < 20° and for this range an additional decay correction may be obtained from the before and after (h0l) data. The decay correction for low 2θ (< 15°) is obtained from the scale factor between the (h0l) data for the NFP complex before and after data collection. The maximum decay for the data with 2θ < 15° is defined as the ratio between the scale factor for the after data and the scale factor for the before data. Two decay corrections were applied to the crystal 1 data: for reflections with 2θ < 15° the decay was obtained from the (h0l) scale factors and for reflections with 2θ > 15° the correction was obtained from the slope of the decay curve of the (4,14,5) reflection (2θ = 21°). The rest of the data, from crystals 2-5, were corrected for decay with the slope of the average decay curve of all three monitor reflections since these curves were approximately the same. Table XIII gives the decay correction data for all five crystals. The largest decrease in intensity was 21% for crystal 2.

Table XIII. Decay Correction Data

Crystal   2θ Ranges      Number of Reflections   Slope (×10⁵)   Maximum Correction
1         0 < 2θ < 15    2904                    1.72           1.05
          15 < 2θ < 20                           5.08           1.15
2         all            2713                    7.85           1.21
3         all            2301                    5.33           1.12
4         all            1990                    4.37           1.09
5         all            1977                    4.91           1.10

With the above corrections the five data sets were converted to structure amplitudes which were then merged and averaged into one set of independent reflections.

7. Data Processing

The initial scale factor for each data set was obtained from a comparison of the peaks in the before (h0l) projections, to 6 Å resolution, of the electron density of the native enzyme and the NFP complex. The scale factor for the NFP complex structure amplitudes was defined as the scale factor necessary for the peak heights in the two projections to be equal (excluding the positions where the NFP substitution produced changes).

The data sets were merged with the scale factor between each two sets of data being additionally verified by a comparison of the redundant reflections arising from the overlap of the data ranges. The value of |F| for the reflections in the overlap region was taken as the average of the values of |F| for the redundant reflections. The quality of each merge was assessed by evaluating the quantity $R_r$,

$$R_r = \frac{\sum \big|\,|F_i| - \langle|F|\rangle\,\big|}{\sum \langle|F|\rangle},$$

where $|F_i|$ and $\langle|F|\rangle$ are the redundant structure amplitudes and their average value, respectively, and the sum includes all redundant reflections. The values of $R_r$ for the four merges of the NFP complex data are given in Table XIV.

Table XIV. R_r Factors for the Redundant Reflections

Crystals Merged    Redundant Reflections   R_r     Total Observed Reflections
1 and 2            540                     0.027   4932
1, 2 and 3         381                     0.026   6864
1, 2, 3 and 4      371                     0.025   8321
1, 2, 3, 4 and 5   413                     0.027   9810

Another check on the correct scaling of the merged data was provided by the radial distributions of $|F_P|^2$ and $|F_{P+I}|^2$, which are plots of $\langle|F|^2\rangle$ vs. ⟨2θ⟩. The relative distribution of the magnitudes of $\langle|F|^2\rangle$ over the range of the data should not change appreciably with the substitution of an inhibitor molecule. Therefore, the radial distribution curve of NFP:α-CHT should have nearly the same shape as the native curve, and since the contribution of the inhibitor is small the magnitudes of the curves should also be nearly equal. The radial distribution curves for the native enzyme and the NFP complex to 2.8 Å resolution are shown in Figure 23. The shapes of these curves are similar and a final scale factor of 0.98 on the NFP:α-CHT data caused the magnitudes of the two curves to be approximately equal. The final set of observations for the NFP complex contained 10473 independent reflections of which 9810 were taken as observed reflections, with an average reproducibility of about 2.6%.

[Figure 23. The radial distribution, ⟨|F|²⟩ vs. ⟨2θ⟩ (degrees), of Native (solid line) and NFP:Native (dashed line).]

X. Discussion of the Results

1. Substitution in the Active Site Region

The active site regions of the two molecules of α-CHT in the asymmetric unit are spatially located close to each other across a local two-fold rotation axis. The Tyr 146 residue of one molecule of α-CHT penetrates the molecular boundary of the two-fold related molecule and is located in close proximity to the active site region of this molecule. The substitution of NFP is found to occur in a space in the active site region such that the inhibitor molecule is surrounded by portions of the peptide chains of α-CHT. The composite 2.8 Å resolution electron density of the active site region along with the difference electron density of NFP is shown viewed parallel to the two-fold axis in Figure 24(a). The distances of the active site residues from the NFP substitution positions are given in Figure 24(b). The distances are taken from the midpoints of the contours except for the Ser residues, where the distance was taken from the estimated position of the hydroxyl side chain. The NFP peaks in the unprimed molecule are attributed to the phenyl ring (P) with a peak height of 0.18 e Å⁻³ and to either the carboxyl or acylamido group (PX) with a peak height of 0.13 e Å⁻³. The peaks P and PX are separated by 3.4 Å, which agrees with the approximate geometry of the NFP molecule.
The phenyl position partially overlaps the position of a localized water molecule which is hydrogen bonded to Ser 190 in the native enzyme structure and is displaced upon binding of the inhibitor molecule. The substitution position of the NFP molecule is such that the peak PX and the sulfur side chain of Met 192 have a close contact of about 2.7 Å. This interaction might be a hydrogen bond to the sulfur. The PX position is near the main chain of Met 192, which is above the substitution positions in the electron density presented in Figure 24(a). The interaction with the Met 192 residue and its side chain may help to position the substrate in the proper orientation for hydrolysis. In the primed molecule only one peak (P') is observed (at a peak height of 0.14 e Å⁻³) and is related by the local two-fold axis to the phenyl ring peak in the unprimed molecule. The smaller peak height for the NFP substitution in the primed molecule of α-CHT suggests that this active site region has less affinity for the inhibitor molecule. The primed molecule of α-CHT does not have a water molecule hydrogen bonded to Ser' 190; however, another water molecule (W1') which is associated with the Tyr 146 residue occupies part of the active site region and may interfere with the substitution.

[Figure 24(a). Electron density in the active site region viewed parallel to the local two-fold axis.]

[Figure 24(b). Distances (in Å) of the active site residues from the NFP substitution positions in the electron density of Figure 24(a).]
Even though some of the residues have close contacts with the substitution positions (Tyr 146, Ser 190, see Figure 24(b)), there are no major changes in the orientation of these residues in the active site region with the substitution of NFP. Some small shifts in the primed molecule are evident when the other residues in the active site region are examined. These residues and shifts are shown in Figure 25(a), where the electron density of the active site region is viewed perpendicular to the local two-fold axis.

[Figure 25(a). Electron density in the active site region viewed perpendicular to the local two-fold axis.]

The shifts are in the positions of the main chain peptides of Thr' 222 and Ser' 223 by 0.25 Å and 0.35 Å, respectively, resulting in the movement of this part of the peptide chain away from the active site region. These shifts may be due to the more compact active site region in the primed molecule which is suggested by the distances in Figure 25(b). A comparison of these distances indicates that the two-fold symmetry of the active site region of the substituted molecules is not exact. The largest differences are those of the Thr 224 and Met 192 positions, and these residues also show the largest deviations from two-fold symmetry in the native enzyme (see Figure 26). Thr 224 and Thr' 224 differ in position across the two-fold axis by about 3.8 Å and the Met 192 positions differ by approximately 2.0 Å. The resulting close Thr'-P' distance is probably the cause of the shift of the main chain peptides Thr' 222 and Ser' 223. The deviation of Met 192 from two-fold symmetry is particularly important because of the interaction of this residue with the inhibitor molecule. The short Met' 192-P' distance may be the cause of the decreased binding of NFP in the primed molecule.
The proposed PX position is beneath Cys 191 in the electron density shown in Figure 25(a). The other residues with close contacts with the phenyl position are the main chain peptides of Trp 215, Ser 214, and Val 213. Thus, the active site forms a cage around the NFP molecule with the residues in Figure 25(a) forming the walls and Tyr' 146 and Ser 190 forming the top and bottom of the cage. The NFP molecule is available to the solvent through an opening underneath Ser 221. In solution Tyr' 146 of one molecule does not interact with the active site region of the adjacent molecule, so the NFP molecule would also be available to the solvent through the top of the cage.

[Figure 25(b). Distances (in Å) from NFP in the electron density shown in Figure 25(a).]

[Figure 26. Distances (in Å) from the local two-fold axis in the native enzyme.]

The relatively minor changes in the enzyme structure in the active site region are not typical of the effect of the NFP substitution on the rest of the molecule.

2. Histidine 40

One of the largest changes observed in the NFP:α-CHT complex occurs in the region of His 40 in both molecules and is shown in Figure 27. In the native enzyme structure the side chain of His' 40 appears to be hydrogen bonded to Gly' 43 while a similar interaction does not occur in the unprimed molecule. In addition, the positions of the main chain peptides of His 40 and His' 40 differ across the two-fold by 1.5-2.0 Å.
Upon formation of the NFP:α-CHT complex the imidazole of His 40 moves by about 0.5 Å in the direction of Gly 43, possibly forming a hydrogen bond. The effect of NFP substitution on the primed molecule is more complicated and not fully understood because the shift of the His' 40 side chain is accompanied by a shift of 0.2 Å of the main chain peptide of Trp' 141 away from Gly' 43 and a shift of 0.2 Å by Cys' 42 toward the His' 40 side chain. The shift of Cys' 42 is not shown in Figure 27 since the residue lies under the Trp' 141 density. A similar change in the region of His 40 and its two-fold related residue is observed when the pH of the native enzyme crystal is changed from 3.9 to 6.7 (60). The fact that this change occurs both when a pseudo-substrate binds in the active site and when the pH is increased to a value near the pH of maximum activity suggests that this change is related to the activity of the enzyme. His 40 is found to be common in the sequences of chymotrypsinogen A, chymotrypsinogen B, trypsinogen, and porcine elastase (61), as is His 57 and the sequence Gly-Asp-Ser-Gly-Gly-Pro near Ser 195. These homologies in the sequences of the serine proteases, along with the proximity of His 40 to the active site region, as shown in Figure 27, suggest that this residue may play a role in maintaining the conformation of the active site.

[Figure 27. The region near histidine 40 in both molecules.]

3. Other Similarities with pH Change

An intermolecular feature which is common to both pH change and NFP substitution occurs in the region of favored uranyl binding and is shown in Figure 28. This interaction involves the Arg' 154 side chain, which shifts about 0.40 Å toward the side chain of Glu 21.
This change is accompanied by a shift of the Val 23 side chain of 0.32 Å away from Glu 21 and a shift of the main chain peptide of Val' 23 and Ala' 22 of about 0.21 Å. The two-fold equivalent position does not exist since this region is formed by the molecular boundaries of three α-CHT molecules.

[Figure 28. The uranyl substitution position. The Arg' 154* residue is from the primed molecule related by the screw axis along b. The difference electron density is cross-hatched and the contours are drawn as in Figure 24(a).]

Table XV gives the positions of changes in α-CHT which result from the increase in pH. Those residues in boxes are also observed to change upon binding of NFP to the enzyme. The regions of similarity include: (1) those already discussed; (2) the change at Ser 159, which has a peak height of 0.16 e Å⁻³; and (3) the change near Asp' 102, with a height of 0.15 e Å⁻³. The change in the region near Trp' 29-Trp' 207 is weak in the NFP density. The lack of movement near Tyr' 146-His 57 is significant since these residues have been implicated in the dimerization of α-CHT (62). The dimerization is a maximum at pH = 4.0 and the dimer dissociates at higher pH values. The movement of these residues in the pH study (60) was probably related to the dissociation and not particularly involved with the activity of α-CHT. The substitution of NFP produces substrate-induced changes in the enzyme which have also been observed when the pH is changed to a value which increases the activity of the enzyme. The NFP substitution does not induce the same changes as increased pH when these changes are related to dimerization.

Table XV. Positions of Structural Changes of α-CHT with pH (from Vandlen and Tulinsky (60)). Boxed entries are shown in square brackets.

MAJOR CHANGES
[Glu 21 - Arg' 154]
Tyr' 146 - His 57
[Trp' 29 - Trp' 207]
-Phe 71 - Asp 72 - Glu 73-
-Leu 155 - Arg 154 - Asp 153-

LESSER CHANGES
Ser 119
Ser 159
Ser 214
[Trp' 141]
[Asp' 102 - Asn' 100]

4. Final comments

Several other changes with peak heights greater than 0.15 e Å⁻³ are observed in the NFP difference electron density. The standard error in the difference electron density calculated using only the first term in Equation 22 is 0.02 e Å⁻³. These peaks, along with those already discussed, are identified in Table XVI. Substitution in an auxiliary binding site (51) was not observed for NFP and very few of the changes listed showed two-fold symmetry.

The results of this study are similar to those of the MRC group (52) in that the amino acid sequences near the aromatic residue are nearly the same; however, there are two major differences. The MRC study was confined to the active site region of the average map of the electron density of the two molecules. The MRC results are reported in detail for the N-formyl-L-tryptophan complex at pH = 5.7 (52). They report difficulty in the interpretation of the active site region due to a movement in the Tyr' 146 residue and the hydroxyl group of Ser 195. In contrast to these results, the NFP:α-CHT difference map showed no movement in either Tyr' 146 or Ser 195. Since the MRC work was done at a higher pH it is possible that the movements are related to the pH dependent dissociation of the dimer and are similar to the shifts found in the pH study of Vandlen and Tulinsky (60). It is also possible that the shifts are due to the inclusion of a molecule of dioxane in the active site region during crystallization. Neither of these complications, higher pH or dioxane, is present in the NFP study. The other major difference in the two studies is that the NFP study showed a lack of two-fold symmetry in the active site region (which would have been obscured in the average map of the MRC group) and in the other major changes in the enzyme structure, which were not discussed in the MRC report.

Table XVI. The Major Peaks in the NFP Difference Electron Density. x, y, z are fractional coordinates.

Peak                                    Pk. ht. (e Å⁻³)   x        y        z
P, phenyl ring of NFP                    0.18             0.6119   0.2700   0.3780
PX, portion of NFP molecule              0.13             0.6269   0.2800   0.4268
P', two-fold of phenyl ring NFP          0.14             0.6269   0.3300   0.5488
Uranyl position, Arg' 154 - Glu 21       0.30             0.0448   0.5000   0.5976   (pH)
movement of Val' 23 main chain           0.17             0.1493   0.0500   0.2927
movement of Val 23 side chain            0.15             0.0149   0.0800   0.3049
shift of His 40                          0.21             0.8806   0.2700   0.4024   (pH)
                                        -0.15             0.8358   0.3700   0.6220   (pH)
movement of His' 40                      0.16             0.8507   0.3300   0.6098   (pH)
Trp' 141 movement, main chain            0.15             0.7910   0.4000   0.6098   (pH)
Cys' 42 movement                        -0.18             0.8060   0.2400   0.6220
shift of Lys 170                         0.22             0.3284   0.4300   0.5732
                                        -0.16             0.3284   0.4500   0.5366
shift of Ser' 223 main chain            -0.16             0.4925   0.3900   0.5732
                                         0.13             0.4478   0.3800   0.5732
movement of Thr' 222                     0.16             0.4179   0.4200   0.5122
shift near Ser 217                       0.16             0.4925   0.2600   0.4024
                                        -0.13             0.4328   0.2350   0.4329
movement of Ser 159                      0.16             0.5075   0.1000   0.2317   (pH)
shift of Thr 224 main chain by 0.2 Å    -0.16             0.4478   0.1600   0.2927
movement of Asn' 150                     0.16             0.8358   0.4400   0.5000
shift of Leu 162 side chain by 0.34 Å   -0.15             0.2657   0.2100   0.1463
                                         0.15             0.2239   0.2300   0.1585
movement near Cys' 182 - Ile' 181       -0.15             0.3134   0.3000   0.5732
movement of Leu 46 side chain            0.15             0.8955   0.3000   0.1829
movement of Asp' 102 - Asn' 100          0.15             0.6119   0.1900   0.7073   (pH)

This study has determined the position of the substitution of NFP in the active site region of α-CHT and has shown that the substitution produces substrate-induced changes in the interactions of other residues, changes which are also observed when the pH is increased to a value near the pH of maximum activity of the enzyme.
The specific chemical interactions which produce these changes may be understood more fully when the molecular details of the present structure of α-CHT have been determined.
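As an illustration of how the fractional coordinates in the peak table relate to distances in the crystal, the following sketch converts two tabulated peaks to an orthogonal frame and computes their separation. It assumes the monoclinic cell of the complex quoted in the abstract (a = 49.27, b = 67.10, c = 65.84 Å, β = 101.75°) and a conventional orthogonalization with b as the unique axis. The function names are illustrative and are not part of the original work.

```python
import math

# Cell of the NFP:alpha-chymotrypsin complex, as quoted in the abstract.
A, B, C = 49.27, 67.10, 65.84      # cell edges in angstroms
BETA = math.radians(101.75)        # monoclinic angle

def frac_to_cart(x, y, z):
    """Convert fractional coordinates to an orthogonal frame (angstroms).

    Assumed convention: X along a, Y along b (unique axis), Z along c*.
    """
    return (A * x + C * z * math.cos(BETA),
            B * y,
            C * z * math.sin(BETA))

def separation(p, q):
    """Distance in angstroms between two points given in fractional coordinates."""
    return math.dist(frac_to_cart(*p), frac_to_cart(*q))

# Negative and positive difference peaks listed for the Leu 162 side chain.
leu162_negative = (0.2657, 0.2100, 0.1463)
leu162_positive = (0.2239, 0.2300, 0.1585)

peak_separation = separation(leu162_negative, leu162_positive)
print(f"Leu 162 peak separation: {peak_separation:.2f} A")
```

The separation of a negative and a positive difference peak is generally larger than the underlying atomic shift, so this number should not be read as the 0.34 Å movement quoted in the table.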
APPENDICES

APPENDIX I

The Reciprocal Lattice

The reciprocal lattice is a three dimensional array of points around the crystal in which each point represents in orientation and spacing a direct lattice plane (hkl). The vector from the origin to the reciprocal lattice point represents the normal to the direct lattice plane and the length of that vector is equal to 1/d(hkl), where d(hkl) is the spacing of the lattice planes. d(hkl) is related to θ, the angle of the incident and diffracted beams to the direct lattice plane (hkl), by Bragg's law,

$$2\,d(hkl)\,\sin\theta = n\lambda,$$

where λ is the wavelength of the radiation and n is the order of the reflection. The reciprocal axes a*, b*, c* are related to the direct axes a, b, c by

$$\mathbf{a}^*\cdot\mathbf{b} = \mathbf{a}^*\cdot\mathbf{c} = \mathbf{b}^*\cdot\mathbf{c} = \mathbf{b}^*\cdot\mathbf{a} = \mathbf{c}^*\cdot\mathbf{a} = \mathbf{c}^*\cdot\mathbf{b} = 0$$

and

$$\mathbf{a}^*\cdot\mathbf{a} = \mathbf{b}^*\cdot\mathbf{b} = \mathbf{c}^*\cdot\mathbf{c} = 1.$$

APPENDIX II

The Observed and Calculated Structure Factors of Tetrapropylporphine

[Table of observed and calculated structure factors; the tabulated values are illegible in the scanned original.]
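The reciprocal lattice relations of Appendix I can be sketched numerically. The example below builds the reciprocal axes of the tetrapropylporphine cell quoted in the abstract (a = 5.078, b = 11.59, c = 22.39 Å, β = 99.50°), checks that each reciprocal axis is normal to the other two direct axes, and evaluates d(hkl) and the Bragg angle for one reflection. The Cu Kα wavelength of 1.5418 Å is an assumption made only for illustration, since the radiation is not stated in this section, and all function names are hypothetical.

```python
import math

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def direct_axes(a, b, c, beta_deg):
    """Cartesian axis vectors of a monoclinic cell (b unique), in angstroms."""
    beta = math.radians(beta_deg)
    return ((a, 0.0, 0.0),
            (0.0, b, 0.0),
            (c * math.cos(beta), 0.0, c * math.sin(beta)))

def reciprocal_axes(axes):
    """Reciprocal axes a*, b*, c* with the convention a*.a = 1 (no 2*pi)."""
    a_vec, b_vec, c_vec = axes
    volume = dot(a_vec, cross(b_vec, c_vec))
    pairs = ((b_vec, c_vec), (c_vec, a_vec), (a_vec, b_vec))
    return tuple(tuple(x / volume for x in cross(u, v)) for u, v in pairs)

def d_spacing(hkl, recip):
    """Plane spacing d(hkl) = 1 / |h a* + k b* + l c*|."""
    r = [sum(index * axis[i] for index, axis in zip(hkl, recip))
         for i in range(3)]
    return 1.0 / math.sqrt(dot(r, r))

def bragg_angle_deg(d, wavelength, order=1):
    """Bragg angle theta in degrees from 2 d sin(theta) = n lambda."""
    return math.degrees(math.asin(order * wavelength / (2.0 * d)))

axes = direct_axes(5.078, 11.59, 22.39, 99.50)
recip = reciprocal_axes(axes)

# Products of each reciprocal axis with each direct axis: 1 on the
# diagonal (a*.a, b*.b, c*.c) and 0 for every mixed pair.
products = [[dot(recip[i], axes[j]) for j in range(3)] for i in range(3)]

WAVELENGTH = 1.5418  # angstroms; assumed Cu K-alpha, for illustration only
d_002 = d_spacing((0, 0, 2), recip)
theta_002 = bragg_angle_deg(d_002, WAVELENGTH)
print(f"d(002) = {d_002:.3f} A, theta = {theta_002:.2f} deg")
```

Building the reciprocal axes from vector products, rather than from the monoclinic closed forms, keeps the sketch valid for any cell once `direct_axes` is replaced by a general triclinic construction.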
APPENDIX III

The Amino Acids and the Sequence of α-Chymotrypsin

Aliphatic Amino Acids
  Glycine (Gly)          H2NCH2COOH
  Alanine (Ala)          H2NCH(CH3)COOH
  Valine (Val)           H2NCH[CH(CH3)2]COOH
  Leucine (Leu)          H2NCH[CH2CH(CH3)2]COOH
  Isoleucine (Ile)       H2NCH[CH(CH3)C2H5]COOH

Aromatic Amino Acids
  Phenylalanine (Phe)    H2NCH(CH2C6H5)COOH
  Tyrosine (Tyr)         H2NCH(CH2C6H4OH)COOH
  Tryptophan (Trp)       H2NCH(CH2C8H6N)COOH      (C8H6N = 3-indolyl)

Hydroxyamino Acids
  Serine (Ser)           H2NCH(CH2OH)COOH
  Threonine (Thr)        H2NCH[CH(OH)CH3]COOH

Acidic Amino Acids
  Aspartic acid (Asp)    H2NCH(CH2COOH)COOH
  Glutamic acid (Glu)    H2NCH(CH2CH2COOH)COOH

Acid Amide Amino Acids
  Asparagine (Asn)       H2NCH(CH2CONH2)COOH
  Glutamine (Gln)        H2NCH(CH2CH2CONH2)COOH

Basic Amino Acids
  Lysine (Lys)           H2NCH[(CH2)4NH2]COOH
  Arginine (Arg)         H2NCH[(CH2)3NHC(=NH)NH2]COOH
  Histidine (His)        H2NCH(CH2C3H3N2)COOH     (C3H3N2 = 4-imidazolyl)

Sulfur-Containing Amino Acids
  Methionine (Met)       H2NCH[(CH2)2SCH3]COOH
  Cysteine (Cys)         H2NCH(CH2SH)COOH

Secondary Amino Acids
  Proline (Pro)
      CH2---CH2
      |       |
      CH2     CHCOOH
        \    /
         N
         H

[Amino acid sequence of α-chymotrypsin; the sequence chart is illegible in the scanned original.]