This is to certify that the dissertation entitled THE CLONING OF CDNA'S CODING FOR TYPES I AND III RAT HEXOKINASES AND SEQUENCE COMPARISONS TO OTHER HEXOKINASES presented by David A.
Schwab has been accepted towards fulfillment of the requirements for the Ph.D. degree in Biochemistry.

MSU is an Affirmative Action/Equal Opportunity Institution

THE CLONING OF CDNA'S CODING FOR TYPES I AND III RAT HEXOKINASES AND SEQUENCE COMPARISONS TO OTHER HEXOKINASES

BY

David A. Schwab

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Biochemistry

1994

ABSTRACT

The cDNA's coding for types I and III mammalian hexokinases were cloned from rat brain and rat liver cDNA libraries, respectively. After sequencing, the respective amino acid sequences were deduced. Comparisons between the type I amino acid sequence and the deduced amino acid sequences of the yeast hexokinase isozymes demonstrated a sufficient degree of similarity that the crystallographic structure of the yeast isozymes was used to construct a model for the mammalian hexokinases. The model was shown to be consistent with a variety of experimental data derived directly from the type I enzyme. The amino acid sequences of hexokinases and glucokinases (deduced from the respective cloned sequences) from various organisms were aligned to determine which residues or regions were conserved. The alignment and the yeast crystallographic model were used to determine, at least to a first approximation, where these regions are located. Thus, the residues involved in the binding of glucose (previously determined from crystallographic studies of the yeast hexokinase isozymes) were determined to be conserved among the aligned sequences.
Furthermore, regions utilized in the binding of ATP were proposed, based, in one case, on conservation in the aligned sequences of previously determined sequences utilized in the binding of ATP, and in the other case, on proteins (HSC70, actin, and glycerol kinase) that have been shown to have structurally similar ATP binding sites. Preliminary experiments were reported for the expression of type I hexokinase in E. coli.

DEDICATION

To my father, Don F. Schwab (who taught me everything I know, not everything he knows, but everything I know) and my mother, Edna J. Schwab (for always being there).

ACKNOWLEDGEMENTS

I would like to acknowledge the past and present members of the Wilson lab and my committee members: Dr. Jerry Dodgson, Dr. Shelagh Ferguson-Miller, Dr. Thomas B. Friedman, and Dr. Jon Kaguni. Additionally, I would like to acknowledge the patience, guidance, and support of Dr. John E. Wilson throughout this project. On a more personal note, I would like to acknowledge the unrelenting faith and support of my wife, Deborah J. Schwab (who, at times, it seemed contributed more to the success of this degree than I did), and last (and also least) Amanda and Ashley (our latest cloning projects), who have already given me more pleasure than I thought was possible.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER I  Literature Review
    Introduction
    Cloned Hexokinase and Glucokinase Sequences
    Mammalian Hexokinases
    Regulation of Activity
    Tissue Distribution
    Subcellular Association
    Ontogenetic Studies
    Yeast Hexokinases
    Yeast Glucokinase
    Evolution of Hexokinases

CHAPTER II  Materials and Methods
    Materials
    Methods
    Preparation of Anti-hexokinase Antibodies
    Affinity Purification of Antibodies to Rat Brain Hexokinase
    Preparation of Affigel-10 Hexokinase Column
    Purification of Anti-Rat Brain Hexokinase Antibodies
    Immunological Screening of λgt11 cDNA Library
    cDNA Synthesis and Construction of λgt10 Libraries
    Screening of λgt10 cDNA Libraries
    Sequencing of cDNA Clones
    Northern Blot
    Construction of Plasmids for Expression of Rat Brain Hexokinase in E. coli
    pHB4 and pM1-7
    pXN1 and pNB6
    Expression of Rat Brain Hexokinase in E. coli
    SDS-gel Electrophoresis and Immunoblotting
    Alignment of Amino Acid Sequences
    Generation of Stereo Images

CHAPTER III  Cloning of cDNA's Coding for Type I Rat Hexokinase; Comparison to Yeast Hexokinases; Proposed Model for Type I Hexokinase
    Cloning of the C-terminal Half of Rat Brain Hexokinase
    Verification of cDNA Clone HKI 12.4-4 as Coding for the C-terminal Half of Type I Hexokinase
    Cloning of Full Length Rat Brain Hexokinase cDNA
    Authenticity of Full Length Clone HKI 1.4-7
    Comparison of Hexokinase Type I Halves and Yeast Isozymes
    Proposed Structure for Mammalian Hexokinase Type I

CHAPTER IV  Cloning of cDNA's Coding for Type III Hexokinase from Rat Liver and Quantitative Comparisons of Sequence Similarities Between Hexokinases
    Cloning of cDNA's Coding for Type III Hexokinase
    Authenticity of Type III Hexokinase cDNA Clones
    Comparisons of Deduced Amino Acid Sequences of Hexokinases

CHAPTER V  Glucose and ATP Binding Sites
    The Glucose Binding Site
    The ATP Binding Site
    Prediction of the ATP Binding Site Based on Sequence
    ATP Binding Site Based on Structurally Similar Proteins

CHAPTER VI  Heterologous Expression of Type I Hexokinase
    Background
    Plasmid Constructs
    Expression Results

CHAPTER VII  Future Research

APPENDICES
    APPENDIX A  RESTRICTION SITES FOR HEXOKINASE TYPE I cDNA
    APPENDIX B  RESTRICTION SITES FOR TYPE III HEXOKINASE cDNA

REFERENCES

LIST OF TABLES

Table 1.  Kinetic Parameters of Mammalian Hexokinases
Table 2.  Quantitative Comparison of Hexokinase and Glucokinase Sequences
Table 3.  Structurally Equivalent Residues in HSC70, Yeast Hexokinase, Actin, and Glycerol Kinase
Table 4.  Expression of Type I Hexokinase
Table 5.  Expression of N- and C-terminal Halves of Type I Hexokinase

LIST OF FIGURES

Figure 1.   Crystallographic Structure of Yeast Hexokinase ("Open" vs. "Closed" Conformation)
Figure 2.   Internal Gene Duplication in Yeast Hexokinase
Figure 3.   Tryptic Sites in Type I Hexokinase
Figure 4.   Proposed Evolution of Hexokinases
Figure 5.   Construction of Plasmids pHB4 and pM1-7 Used for Expression
Figure 6.   Sequencing Strategy for cDNA Clone HKI 12.4-4
Figure 7.   Nucleotide and Deduced Amino Acid Sequence of cDNA Clone HKI 12.4-4
Figure 8.   Sequencing Strategy for Type I cDNA Clones and Relevant Restriction Sites
Figure 9.   Northern Blot for Type I Hexokinase mRNA
Figure 10.  Composite Nucleotide Sequence Obtained from cDNA Clones HKI 1.4-7 and HKI 1.1
Figure 11.  Aligned Amino Acid Sequences of Rat Brain and Yeast Hexokinases
Figure 12.  Stereo Images of Yeast Hexokinase Highlighting Secondary Structural Features
Figure 13.  Stereo Images Highlighting Conserved Residues of Type I Hexokinase
Figure 14.  Stereo Images Showing the Locations of Peptides I, II, and III
Figure 15.  Stereo Images Depicting Structural Differences Between Yeast and Type I Hexokinases
Figure 16.  Stereo Images Showing the Proposed Model of Type I Hexokinase / Yeast Hexokinase Dimer
Figure 17.  Sequencing Strategy for cDNA Clones Coding for Type III Hexokinase and Relevant Restriction Sites
Figure 18.  Composite Nucleotide Sequence and Deduced Amino Acid Sequence of Rat Type III Hexokinase
Figure 19.  Alignment of Known Hexokinase and Glucokinase Sequences
Figure 20.  Stereo Image Showing Insertion in Yeast Glucokinase
Figure 21.  Stereo Images of Residues Involved in the Binding of Glucose in the "Open" Conformation of Yeast Hexokinase
Figure 22.  Stereo Images of Residues Involved in the Binding of Glucose in the "Closed" Conformation of Yeast Hexokinase
Figure 23.  ATP Site Based on the Sequence Gly-X-Gly-X-X-(Gly/Ala)
Figure 24.  Residues Proposed to be Used in Orienting ATP into the Active Site
Figure 25.  Location of the Additional Gly-X-Gly-X-X-(Gly/Ala) Sequence Purported to be Utilized in the Binding of ATP
Figure 26.  Location of Lys-111 Suggested, by Tamura et al., to be Involved in the Binding of ATP
Figure 27.  Sequences of Structurally Similar Regions in Yeast Hexokinase, Actin, and Glycerol Kinase
Figure 28.  Stereo Images Showing Structurally Similar Regions in ATP Binding Proteins
Figure 29.  Stereo Images Showing Close Up Views of PHOSPHATE 1 and PHOSPHATE 2
Figure 30.  Stereo Images Highlighting Structurally Similar Region (ADENOSINE) Utilized in Binding the Adenine Base of ATP
Figure 31.  Stereo Images Showing Adenine Base Binding Regions of Actin and Glycerol Kinase that are Structurally Similar to Yeast Hexokinase
Figure 32.  Crevices in β-sheets in Yeast Hexokinase that Contribute Active Site Residues
Figure 33.  Stereo Images Depicting the Interdomain Hinge
Figure 34.  Time Course for Heterologous Expression of Type I Hexokinase
Figure 35.  Conserved Residues in Hexokinases
Figure 36.  Stereo Images Highlighting Conserved Residues in Comparisons of Groups of Hexokinases

LIST OF ABBREVIATIONS

bp        basepair
CAT       chloramphenicol acetyltransferase
DEAE      diethylaminoethyl
Glc       glucose
IPTG      isopropylthiogalactoside
kb        kilobase
kDa       kilodalton
NAD       nicotinamide adenine dinucleotide
OTG       O-toluoylglucosamine
PLP-AMP   pyridoxyl 5'-diphospho-5'-adenosine monophosphate
R.T.      room temperature
SDS       sodium dodecyl sulfate
SSC       saline sodium citrate
Tris      tris[hydroxymethyl]aminomethane

CHAPTER I

Literature Review

Introduction

Hexokinase (ATP:D-hexose 6-phosphotransferase, EC 2.7.1.1) catalyzes the phosphorylation of glucose using Mg²⁺ATP as phosphoryl donor. There are four isozymes in mammalian tissues, designated as types I, II, III, and IV (type IV is commonly referred to as glucokinase). All four mammalian isozymes have been cloned, as have other hexokinases and glucokinases (see below). In this chapter the mammalian isozymes will be reviewed in terms of regulation of activity, tissue distribution, and subcellular associations. This is followed by discussion of the yeast hexokinase isozymes and yeast glucokinase. The chapter concludes with the current hypothesis for the evolution of the mammalian hexokinases.

Cloned Hexokinase and Glucokinase Sequences

The cDNA's coding for all four types of the mammalian isozymes from rat have been cloned and the respective amino acid sequences have been deduced. Additionally, cDNA's coding for hexokinases and glucokinases from different organisms have also been cloned. Specifically, cDNA's for the type I isozyme have been cloned from rat (1,2 and this thesis), bovine (3), mouse (4), and human (5). The cDNA's for the types II (6) and III (7 and this thesis) isozymes have been cloned from rat, and for the type IV isozyme from rat (8,9) and human (10). The genes from yeast coding for hexokinase isozymes A and B (11-13) and glucokinase (14), as well as a hexokinase from Schistosoma mansoni (15), have also been cloned. An alignment of the above mentioned hexokinase and glucokinase deduced amino acid sequences will be shown in this thesis.
Mammalian Hexokinases

The four hexokinase isozymes present in mammalian tissues can be distinguished by their different electrophoretic mobilities towards the anode during starch gel electrophoresis, with mobility increasing with the designated number of the isozyme (16). Alternatively, the four isozymes have also been designated as types A through D as determined by their order of elution from a DEAE-cellulose column, with types A through D corresponding to types I through IV, respectively (17).

Regulation of Activity

Hexokinase catalyzes the conversion of glucose and Mg²⁺ATP to glucose-6-phosphate and Mg²⁺ADP (18). Three of the four hexokinases, types I, II and III, have low Km values for glucose, in the submillimolar range, and are therefore often referred to as the "low Km" isozymes (Table 1). The other isozyme, type IV or glucokinase, requires a much higher concentration of glucose to reach half saturation. One of the reaction products, glucose-6-phosphate, is a potent inhibitor of the reaction for all three "low Km" isozymes, but does not inhibit the type IV isozyme at physiologically relevant levels (19). The "low Km" isozymes are all similar in their specificity for ATP as substrate, with ITP achieving less than 10% of the activity obtained with ATP, while the other nucleoside triphosphates are even poorer substrates (18). All four types are composed of a single polypeptide chain, with the "low Km" isozymes all having a molecular weight of approximately 100 kDa while the type IV isozyme is only 50 kDa. Thus, the "low Km" isozymes are easily distinguished from the type IV isozyme by size, inhibition by glucose-6-phosphate, and affinity for glucose.

Table 1. Kinetic Parameters of Mammalian Hexokinases

Parameter (mM)           Hexokinase
                         I        II       III      IV
Km glucose               0.04     0.13     0.02     4.50
Km ATP                   0.42     0.70     1.29     0.49
Ki Glc-6-P vs ATP        0.026    0.021    0.074    15.0

This table was adapted from Ureta (19), and the references therein.
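The glucose Km values in Table 1 can be put in perspective with a short calculation. The sketch below assumes simple hyperbolic (Michaelis-Menten) saturation, v/Vmax = [S]/(Km + [S]), and a glucose concentration of 5 mM chosen purely for illustration. The function and variable names are hypothetical. The calculation ignores product inhibition by glucose-6-phosphate and substrate inhibition of type III, and because type IV shows cooperativity its entry is only a rough approximation.

```python
# Km values for glucose (mM), taken from Table 1.
KM_GLUCOSE_MM = {"I": 0.04, "II": 0.13, "III": 0.02, "IV": 4.50}


def fractional_saturation(substrate_mm: float, km_mm: float) -> float:
    """Return v/Vmax for a hyperbolic (Michaelis-Menten) enzyme."""
    if substrate_mm < 0 or km_mm <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return substrate_mm / (km_mm + substrate_mm)


def saturation_by_isozyme(glucose_mm: float) -> dict[str, float]:
    """Fractional saturation of each isozyme at one glucose concentration."""
    return {
        isozyme: fractional_saturation(glucose_mm, km)
        for isozyme, km in KM_GLUCOSE_MM.items()
    }


if __name__ == "__main__":
    # 5 mM glucose is an illustrative assumption, not a value from the text.
    for isozyme, fraction in saturation_by_isozyme(5.0).items():
        print(f"type {isozyme}: {fraction:.2f} of Vmax")
```

Under these assumptions the three "low Km" isozymes are more than 97% saturated at 5 mM glucose, whereas type IV is only about half saturated, which is the contrast the table is meant to convey.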
Most of the type I hexokinase in rat brain is bound reversibly to mitochondria (18). This binding is modulated by the inhibitory product glucose-6-phosphate, with increasing levels causing solubilization of the enzyme. Solubilization by glucose-6-phosphate is antagonized by inorganic phosphate, while Mg²⁺ enhances binding. Inorganic phosphate alone has no effect on this isozyme. Felgner et al. purified a protein from mitochondria which was shown to be necessary for the reversible binding of the type I isozyme (20). This protein was later determined to be the pore-forming protein porin (21,22), through which ATP and ADP enter and exit the mitochondria. Consequently, it has been suggested that the enzyme has preferential access to one of its substrates, ATP, because it is bound to the pores through which ATP exits the mitochondria (23,24). The binding of the enzyme to mitochondria causes the Km for ATP to decrease, while the Ki for the inhibitor glucose-6-phosphate increases (25-27). Therefore, the bound enzyme represents a more active form that is not as easily inhibited as the soluble form.

Studies on substrate specificity have led to the conclusion that the type I isozyme can tolerate quite a large variation in structure at the carbon 2 position of the glucose molecule (28). Accordingly, compounds such as mannose (C-2 epimer of glucose), 2-deoxyglucose, glucosamine, and N-acetylglucosamine are substrates for, or competitively inhibit, the reaction catalyzed by hexokinase.

In contrast to the type I isozyme, where inhibition by glucose-6-phosphate is instantaneous, the type II isozyme exhibits a pronounced delay in inhibition (29,30), with this delay becoming even more pronounced for the bound form (e.g. the half time for the response to glucose-6-phosphate inhibition is 12 seconds for the soluble form and 130 seconds for the mitochondrially bound form).
Although this inhibition can be relieved by inorganic phosphate in the type I isozyme (18), the type II isozyme shows no such effect (30). Inorganic phosphate is actually an inhibitor of the type II isozyme.

The Ki for inhibition of type III hexokinase by glucose-6-phosphate is much higher than for the type I and II isozymes. This isozyme is also inhibited by physiologically relevant levels of the substrate glucose (17). It is interesting to note that type III hexokinase from rat liver attains maximum substrate inhibition at approximately the same glucose concentration at which glucokinase reaches half saturation (31). It has also been reported that the type III isozyme is affected by inorganic phosphate in much the same way as the type II isozyme, i.e. inorganic phosphate does not reverse the glucose-6-phosphate-induced inhibition of the type III isozyme of pig erythrocyte (32) or bovine liver (33), while inorganic phosphate alone has been shown to inhibit the type III isozyme isolated from bovine liver (33).

As previously stated, the type IV isozyme has a half-saturation concentration (Km)¹ for glucose that is much higher than that of the other isozymes. Due to the lack of inhibition by glucose-6-phosphate, and a half saturation constant for glucose approximating normal blood glucose concentrations, this isozyme is well suited for its role in the homeostatic control of blood glucose levels (34).

¹ Actually, since the type IV isozyme exhibits cooperativity, Km (which, strictly speaking, applies only to enzymes that adhere to Michaelis-Menten kinetics) is not really correct.

Tissue Distribution

Type I hexokinase has the distinction of being present in all tissues examined to date (18). In most tissues, except for muscle, this isozyme is present at relatively high levels. Due to its prevalence in such a wide diversity of tissues, it has been referred to as the "basic" hexokinase and suggested to be involved in a function basic to all these tissues: glycolysis.
In fact, in those tissues with a substantial reliance on blood-borne glucose, the type I isozyme is the predominant form. Brain, being totally dependent on blood-borne glucose, contains virtually exclusively the type I isozyme (hence the designation of type I hexokinase as "brain hexokinase"), as is also the case with erythrocytes. Since in both cases high levels of metabolism are occurring through the glycolytic pathway, it is certainly reasonable to expect that this isozyme's physiological role is primarily glycolytic in nature (18).

Type II is the predominant form in insulin-sensitive tissues such as muscle², adipose tissue, and mammary gland (reviewed in 18). The predominance of the type II enzyme has been correlated with the degree to which a tissue is sensitive to insulin. For example, as the insulin sensitivity of rat mammary gland changed during lactation, the activity of type II hexokinase changed in parallel (35). Conversely, in skeletal muscle, which is highly insulin-sensitive, the type II isozyme predominates, but as the proportion of type II decreases, relative to type I, the insulin sensitivity decreases (36). Definite decreases of type II hexokinase have been noted in the insulin-sensitive tissues of diabetic animals (36,37); therefore the availability of insulin seems to be critical for the maintenance of type II hexokinase levels in insulin-sensitive tissues.

² Actually, and surprisingly, human muscle reportedly has type I levels that are much higher than type II levels.

The predominance of the type II isozyme in insulin-sensitive tissues, with episodic glucose availability, seems to suggest an anabolic role for this isozyme, such as would be required for glycogen synthesis in skeletal muscle (18). In support of this contention is the effect of inorganic phosphate on this isozyme. Glucose-6-phosphate inhibits the enzyme, and inorganic phosphate is not able to reverse this inhibition.
Muscle contraction is characteristically associated with increased levels of inorganic phosphate (due to increased hydrolysis of high energy phosphate compounds, ATP and creatine phosphate) and increased glycogenolysis leading to elevated levels of glucose-6-phosphate. Under these conditions, glucose-6-phosphate inhibition of hexokinase would not be relieved by inorganic phosphate, and hence, as the scenario goes (18), the type II isozyme would only be active during the anabolic phase of glycogen production. Additionally, since levels of inorganic phosphate increase and, unlike the case with the type I isozyme, do not relieve the inhibition by glucose-6-phosphate, they may actually contribute to inhibition. Therefore, it appears the type II isozyme would be inhibited during the catabolism of glycogen and active during the anabolic phase, when glucose-6-phosphate levels return to much lower levels.

Type III, the least studied of the hexokinases, has not been found to be the predominant form in any tissue (18). This certainly does not preclude the possibility that it may still represent the dominant isozyme in a subpopulation of cells within a tissue (38). The tissues which show the highest amount of activity attributable to type III hexokinase are liver, spleen, and lung (18). This isozyme has also been detected in rat kidney and brain (38). Additionally, Preller and Wilson (38) have demonstrated a staining (using a monoclonal antibody) for type III hexokinase which locates the enzyme at the nuclear periphery in specific cell types in each of these tissues. They point out the prominence of transport functions in many of the cell types in which the nuclear staining for type III hexokinase occurred, although the possible relationship between transport activity and nuclear localization of type III hexokinase is unclear.

Type IV, or glucokinase, is known to be present in the β-cells of the pancreas and in the liver (34 and ref. therein).
Diet and fasting, insulin (39), and glucagon (40) all influence the levels of this isozyme in liver, although these factors do not seem to affect the glucokinase levels in the pancreatic islet β-cells. Even though insulin does not affect the levels of glucokinase in the β-cells, the levels of blood glucose do seem to affect the levels of this isozyme (reviewed in 34).

In the scenario proposed by Magnuson (34), pancreatic β-cells are stimulated to secrete insulin due to an elevation of activity of endogenous glucokinase brought on by elevated blood glucose levels. By increasing the rate of glycolysis, elevated glucokinase activity is thought to increase the ATP/ADP ratio and hence the ATP levels. This in turn inhibits the opening of ATP-sensitive K⁺ channels, causing depolarization of the plasma membrane (41), which then triggers the voltage-sensitive Ca²⁺ channels, thus leading to an increase in cytoplasmic Ca²⁺ levels. The release of insulin then occurs due to the increase in Ca²⁺ concentration (42). This insulin, in turn, increases glucokinase levels in the liver, where glucose is taken up from the blood, thereby decreasing blood glucose levels (glucose is also taken up by other insulin-sensitive tissues, e.g. muscle).

The glucokinase gene appears to be under differential regulation due to dual transcription control regions (reviewed in 34). In the liver and β-cells, different transcription units give rise to tissue-specific mRNAs that differ only in their 5' regions, with the resultant proteins differing solely in the first 15 amino acids. On the other hand, a cDNA has been isolated from an insulinoma library which has a deletion resulting in a β-cell glucokinase which is missing 17 amino acids near the glucose binding region (43). This deletion seems certain to have an impact on this isozyme, though exactly how it manifests itself is unknown.

Subcellular Association

The reversible binding of type I hexokinase to mitochondria is well documented (18).
The binding is believed to have both hydrophobic and electrostatic components. The electrostatic component is due, in part, to divalent cations such as Mg²⁺, presumably via the bridging of negative charges on both the enzyme and the mitochondrial membrane (44). Other electrostatic interactions may arise from the attraction of opposite charges contributed by the enzyme and those located on the mitochondrial membrane. Due to the variation in pIs of the isozymes, with these dissimilarities presumably a reflection of differences in surface charges between the isozymes, it is reasonable to expect that the electrostatic component of binding will be important in influencing the relative degrees to which the isozymes bind (18).

On the other hand, the hydrophobic interaction of hexokinase with mitochondria has also been determined to be extremely important. Cleavage of a small hydrophobic peptide (9 residues) from the N-terminus of the type I isozyme with chymotrypsin was shown, by Polakis and Wilson (45), to prevent mitochondrial binding of the enzyme. Xie and Wilson (46) determined later that this essential N-terminal hydrophobic region of the intact enzyme is inserted into the outer mitochondrial membrane. In crosslinking studies, hexokinase bound to liver mitochondria was found to exist as a monomer or a tetramer with, curiously, no evidence found for intermediate dimers or trimers (47).

In another approach, Gelb et al. (48) generated a chimeric reporter construct which consisted of the first 15 amino acid residues of type I hexokinase coupled to chloramphenicol acetyltransferase (CAT). They demonstrated that these first 15 residues conferred on CAT the ability to bind to mitochondria, which otherwise does not occur. Furthermore, the native hexokinase isozyme was shown to compete with the chimeric CAT construct for binding to mitochondria.
Additionally, N,N'-dicyclohexylcarbodiimide, which prevents hexokinase from binding mitochondria by covalently modifying porin, also prevented the chimeric CAT construct from binding. This certainly complements the work of Felgner et al. (49), who had previously shown that the protein (porin) they had purified from mitochondria was able to confer on lipid vesicles the ability to bind hexokinase and, most importantly, that this binding was sensitive to glucose-6-phosphate.

Type II hexokinase has been shown to bind mitochondria in a competitive manner with the type I isozyme (50). The cDNA for the rat isozyme has been cloned and the amino acid sequence deduced (6). As with the type I isozyme, the type II isozyme has an N-terminal region which is hydrophobic, although less hydrophobic than that of the type I isozyme due to the presence of serine and histidine residues (6). Indeed, it is these hydrophilic residues which have been implicated (6), at least in part, in the decrease in avidity with which the type II isozyme binds mitochondria (50) (relative to type I hexokinase).

As previously stated, Preller and Wilson (38) have demonstrated that type III hexokinase has a weak association with the external surface of nuclei. In contrast to earlier findings labeling this isozyme as "soluble" and hence cytoplasmic in location, they were able to demonstrate this association via confocal microscopy after staining the isozyme through the use of a monoclonal antibody. The cDNA for the rat type III enzyme has been cloned from liver (7) as part of the work described in this thesis.

Ontogenetic Studies

Ureta carried out a rather extensive study on the levels of each of the "low Km" isozymes (types I, II, and III) in rat liver as a function of time (51). The isozymes were isolated and separated (on DEAE-cellulose columns) starting five days before birth and terminating around 17 days after birth, at which time the isozyme levels reach their adult levels.
Type I isozyme levels at 5 days before birth are approximately twice the adult level, with a maximum level of 4.5 times the adult level attained at birth. Levels of this isozyme then fall to 2.5 times the adult level during the first week, with the adult level being attained by the end of the second week after birth. The type II levels are very low at 5 days before birth, with a maximum level of approximately 3.5 times that of the adult level being attained within the first few days after birth. The levels then decrease, reaching adult levels midway into the second week after birth. The type III isozyme remains at low levels before and just after birth, reaching a maximum level of approximately 2.5 times the adult level by the end of the first week after birth. Thereafter, the type III isozyme undergoes a rather precipitous decline to adult levels by midway through the second week after birth. Although the data were not presented, type IV hexokinase was noted to be present at birth, albeit at very low levels. The levels of this isozyme begin to rise at the end of the second week after birth, reaching adult levels at the end of four weeks.

Yeast Hexokinases

Yeast contains two isozymes of hexokinase designated as A and B, or P-I and P-II, respectively (reviewed in ref. 52 and 53). Both isozymes have a molecular weight of approximately 50 kDa and are composed of a single polypeptide chain. Two separate groups have cloned both isozymes (11-13). The isozymes have 378 identical residues out of a total of 485, with the differences being scattered throughout the enzymes. The yeast isozymes can be separated by chromatography on DEAE-cellulose (52) using a pH gradient, which results in the A isozyme eluting first. Alternatively, isozyme A migrates more anodically during electrophoresis using Tris buffer at pH 9.

During the isolation of the yeast hexokinase isozymes, due to endogenous protease action, alternative enzymatically active forms of these isozymes (i.e.
S-I and S-II) (52) are detected in which the first 12 residues have been removed. Native forms of the yeast isozymes form dimers under conditions of high protein concentration and low pH. The first 24 amino acids of both isozymes are identical, and while removal of the first 12 residues has no effect on activity, these residues appear to be essential for the formation of the dimer. It is interesting to note that the first few residues of the N-terminal sequences of both the yeast isozymes and the mammalian type I isozyme play such an important role in binding.

Measurements of the dissociation constant for the binding of glucose to the dimer (52) indicate that the dimer binds glucose poorly at low glucose concentrations (dissociation constant ca. 10^[illegible]) but shows positive cooperativity, with binding improving at higher glucose concentrations (dissociation constant ca. 10^[illegible]). In experiments with the proteolytically modified forms, S-I and S-II, which are unable to form dimers, the binding of glucose is much better (dissociation constants of 3 × 10⁻⁵ and 3 × 10^[illegible], respectively). This led to the proposal that in the dimer the active site for glucose is largely buried, whereas in the dissociated monomer the active sites are readily accessible (52). The parallel between this behavior and the masking of sites in the intact type I hexokinase of mammals (see below) is intriguing.

The two yeast isozymes differ in their specific activities (52), with the B isozyme's specific activity being four times greater than that of the A isozyme. Additionally, the isozymes differ in their abilities to use fructose and glucose as substrates. The ratio of the maximum activity with fructose to the maximum activity with glucose is 3.0 for the A isozyme, while it is only 1.0 for the B isozyme (52).

The isozymes have been crystallized and the three dimensional structures have been determined. The B isozyme's structure has been determined after its crystallization as a dimer (54-56) without any substrates.
Crystallization has also been carried out with the A isozyme and glucose (57,58) and the B isozyme with the glucose analog O-toluoylglucosamine (OTG) (59). Comparisons between these structures have demonstrated that binding of the sugar causes extensive alterations in the structure of the enzyme (Figure 1) (60). That this conformational change is brought about by the binding of glucose, and is not due to a difference between the isozymes, has been experimentally verified (61). The most convincing evidence is the fact that the change in the radius of gyration of the B isozyme in solution upon binding glucose is the same as that calculated from the crystallographic coordinates.

Figure 1. Crystallographic Structure of Yeast Hexokinase ("Open" vs. "Closed" Conformation). A: Yeast Hexokinase B in the "open" conformation with glucose (derived from OTG) in the active site. B: Yeast Hexokinase A in the "closed" conformation. C and D: Both conformations superimposed. The "open" conformation is drawn with solid lines, the "closed" conformation with dotted lines.

Yeast Glucokinase

In the yeast Saccharomyces cerevisiae there are three enzymes known to phosphorylate glucose: hexokinase isozymes A and B and yeast glucokinase (62). While the hexokinases are also able to utilize fructose, yeast glucokinase essentially does not. This was illustrated in a study carried out by Lobo and Maitra (62), in which they measured the doubling time of yeast grown on glucose or fructose. The strains of yeast were altered such that each strain produced only one of the three enzymes that phosphorylate glucose. When grown on glucose, the strains containing only one of the three enzymes (hexokinase isozyme A, B or glucokinase) grew at a rate comparable to the wild type strain (which contains all three enzymes), doubling in under three hours as opposed to under two hours in the wild type strain.
On the other hand, the strains containing either hexokinase A or B, when grown on fructose, still doubled in under three hours (wild type still under two hours), while the strain containing only glucokinase took 16 hours to double when grown on fructose. Indeed, the enzyme is so specific for glucose that "trace quantities of glucose in fructose may be analyzed conveniently by using glucokinase" (63) (Km glucose = 0.03 mM, Km fructose = 31 mM). Yeast glucokinase has a molecular weight of 51 kDa and can be isolated from a hexokinase deficient mutant principally using ammonium sulfate precipitation and DEAE-cellulose chromatography (63). The amino acid sequence has been deduced from the cloned gene (14) and will be used in this thesis in comparisons with other hexokinases.

Evolution of Hexokinases

Rossman et al. (64) originally noted the similarity between regions of the two lobes of yeast hexokinase which border the substrate binding cleft. McLachlan (65) further pointed out that in comparisons of the two lobes, each of which possesses a structural feature comprised of a five stranded β-sheet and three α-helices (Figure 2), superposition of common regions resulted in 57 common pairs of α-carbons (32 from the β-sheet and 25 from the α-helices). This led McLachlan (65) to propose that the yeast isozymes may have evolved, in part, by duplication and fusion of a smaller gene encoding the similar structural feature. Harrison (66), however, points out that the central three strands in the β-sheet in the large lobe are shorter than their counterparts in the small lobe, which he concludes casts doubt on the theory of gene duplication in the evolution of the 50 kDa yeast isozymes.

Many researchers (19,50,67-70) have speculated that the 100 kDa mammalian hexokinases have evolved by duplication and fusion of an ancestral 50 kDa hexokinase not unlike the yeast isozymes.
It was proposed that one of the catalytic sites was conserved while the other evolved to take on a regulatory role. This scenario has undergone some modifications as more information has become available, as will be discussed below.

Figure 2. Internal Gene Duplication in Yeast Hexokinase. Stereo images of yeast hexokinase B highlighting the regions in each of the two lobes purported to have arisen through gene duplication. A: β-sheet and α-helices of the small lobe. B: β-sheet and α-helices of the large lobe. C and D: β-sheets and α-helices of the small and large lobes, respectively, oriented to demonstrate similarity.

Polakis and Wilson (71) have shown that digestion of the native type I isozyme with trypsin results in the generation of three principal fragments. The smallest fragment, 10 kDa in size, represents the extreme N-terminal portion of the molecule. The other two fragments generated were of molecular weights 50 and 40 kDa, with the 50 kDa fragment being the center fragment located between the N-terminal 10 kDa fragment and the C-terminal 40 kDa fragment (Figure 3). The 40 kDa fragment was subsequently shown by labeling experiments to contain binding sites for both substrates: ATP (72) and glucose (73). Thus the C-terminal portion of the molecule would be expected to contain the catalytic site.

White and Wilson (74), using a different approach, were able to derive a different pattern of digestion using trypsin. Incubating the enzyme in low concentrations of the denaturant guanidinium hydrochloride resulted in more extensive proteolysis, with fragments of 52 and 48 kDa appearing as intermediate species (Figure 3). They determined that the enzyme is, in essence, comprised of two major domains of approximately the same size: a 52 kDa N-terminal portion and a 48 kDa C-terminal domain.
By adding the inhibitor glucose-6-phosphate they were able to selectively protect the N-terminal portion from denaturation in guanidine hydrochloride, and upon addition of trypsin the C-terminal portion was proteolytically removed (74). In the converse experiment, this time using a glucose analog, N-acetylglucosamine, they were able to selectively protect the C-terminal half of the enzyme (75). Thus, they were able to conclude that the binding site for the allosteric effector glucose-6-phosphate resides in the N-terminal half of the intact enzyme and is separate from the catalytic site. Using a similar approach they were able to isolate the C-terminal portion of the enzyme and demonstrate that it does in fact possess catalytic activity (74). Further work demonstrated, surprisingly, that the isolated C-terminal portion of the enzyme was inhibited by glucose-6-phosphate and that both halves of the enzyme did, in fact, possess binding sites for the inhibitor glucose-6-phosphate as well as the substrates ATP and glucose (also inorganic phosphate) (74). This information led to a modification of the gene duplication and fusion theory such that the ancestral 50 kDa hexokinase would have had both the glucose binding site and the glucose-6-phosphate regulatory site (Figure 4). That this was a reasonable postulation was further supported by the fact that starfish hexokinase has a molecular weight of 50 kDa and is, in fact, inhibited by glucose-6-phosphate.

    Native:     N | 10 kDa | 50 kDa | 40 kDa | C
    Denatured:  N | 52 kDa | 48 kDa | C

Figure 3. Tryptic Sites in Type I Hexokinase. The predominant sites at which trypsin cleaves the native enzyme and the enzyme under partially denatured conditions are shown with the resultant fragment sizes indicated.

Direct measurements of ligand binding on the intact enzyme have shown only one physiologically relevant binding site for glucose (76,77) and one for glucose-6-phosphate (76,78).
Therefore, it was postulated that the glucose site is masked in the N-terminal portion of the intact enzyme, with the glucose-6-phosphate site being masked in the C-terminal half.

In order to gauge the reactivity of sulfhydryl groups in the intact enzyme, Hutny and Wilson (79) used the sulfhydryl specific reagent 2-bromoacetamido-4-nitrophenol. Upon binding of glucose-6-phosphate to the high affinity site in the N-terminal portion of the molecule, some of the previously reactive sulfhydryls in the N-terminal portion were protected, as was expected. The fact that sulfhydryls present in the C-terminal portion were also partially protected supports the contention that the structure of the N-terminal half of the intact enzyme (and hence conformational changes occurring therein) impinges on the structure of the C-terminal half.

Therefore, evolution of the 100 kDa mammalian hexokinases by gene duplication and fusion from an ancestral hexokinase similar to the 50 kDa starfish hexokinase (which contains sites for catalysis as well as inhibition), with the final 100 kDa enzyme having some of those sites altered or masked, is a reasonable postulation.

Figure 4. Proposed Evolution of Hexokinases. According to this scheme, a 50 kDa ancestral hexokinase evolved in two separate directions: one giving rise to present day yeast isozymes, the other giving rise to a Glc-6-P inhibited enzyme not unlike the starfish 50 kDa enzyme. The present day 100 kDa enzymes evolved from the duplication and fusion of a Glc-6-P inhibited form, except that in one half the Glc-6-P regulatory site is masked (catalytic half) and in the other half the catalytic site is masked (regulatory half). Diagram labels: Ancestral Hexokinase; Yeast Hexokinase; Starfish Hexokinase; Mammalian Hexokinase (masked sites indicated).

CHAPTER II

Materials and Methods

Materials

Enzymes used in the restriction or modification of DNA were obtained from a variety of sources, although most were purchased from New England Biolabs (Beverly, MA), Boehringer Mannheim Biochemicals (Indianapolis, IN), or BRL (Gaithersburg, MD). Other DNA modifying enzymes were purchased from Pharmacia (Piscataway, NJ), U.S. Biochemicals (Cleveland, OH), Life Sciences (St. Petersburg, FL), or Stratagene (La Jolla, CA). Radioisotopes were purchased from either NEN Dupont (Boston, MA) or Amersham (Arlington Heights, IL). Other reagents and materials were obtained from a variety of standard commercial suppliers. A rat brain cDNA library constructed in λgt11, using mRNA from adult rat brains, was generously provided by Dr. Ronald L. Davis. Rat brain hexokinase (Type I) was prepared according to Chou and Wilson (80).

Methods

Preparation of Anti-hexokinase Antibodies

Preparation of anti-hexokinase antibodies was carried out as previously described (81).

Affinity Purification of Antibodies to Rat Brain Hexokinase

1) 2 mg of HK were incubated overnight with 1 ml of Affi-Gel 10 (Bio-Rad) at 4°C in 50 mM (Na)HXL (pH 7.0). The Affi-Gel 10 was then washed in 0.1 M ethanolamine at room temperature (R.T.) after loading it into a 3 cc syringe which had been plugged with silanized glass wool. The column was washed sequentially with five column volumes of the following buffers.

TBS-NP40:
    120 mM NaCl
    50 mM Tris pH 7.5
    0.5% Nonidet P-40
    1 M LiCl
Glycine buffer:
    50 mM Glycine pH 2.5
    150 mM NaCl
PBS:
    0.1 M NaCl
    0.01 M Na phosphate pH 7.5

The column was stored at 4°C in PBS + 0.1% NaN3.

Purification of Anti-Rat Brain Hexokinase Antibodies

1.) 2 ml of antiserum were recycled over the column at R.T. a minimum of five times with a flow rate not exceeding 5 ml/hr.
2.) The column was washed sequentially with five column volumes each of PBS, TBS-NP40, and PBS.
3.)
The affinity purified Ab's were eluted using the glycine buffer. The eluate (usually 10 ml) was collected and neutralized with 1 M Tris-HCl pH 9.0.
4.) The column was equilibrated with PBS, and NaN3 was added to 0.1% for storage at 4°C.

Immunological Screening of λgt11 cDNA Library

1.) Grow an overnight culture of the bacterial strain Y1090 in L broth + ampicillin @ 50 µg/ml.
2.) 100 µl or less of the appropriate dilution of the λgt11 cDNA library is mixed with 100 µl of the bacterial strain Y1090 and incubated at 37°C for 20 min to allow for infection of the bacteria by the phage.
3.) The mixture is then plated on a warm (50°C) 100 mm diameter agar (L broth) plate using 3 ml of top agar (L broth) previously melted and kept at 45-50°C.
4.) Wet nitrocellulose filters with 10 mM IPTG (isopropylthiogalactoside) and air dry (20 min.).
5.) After the top agar has hardened (5 min. in cold room at 4°C or 15 min. at R.T.) the IPTG treated nitrocellulose filter is placed on the top agar while avoiding trapping any air bubbles between the filter and the top agar.
6.) Incubate plates at 42°C for a minimum of 15 min. (time necessary for the entire plate to reach 42°C) followed by a minimum 3 hour incubation at 37°C. (It is common at this point to leave plates overnight @ 37°C.)
7.) The orientation of the nitrocellulose filter on the plate is then clearly marked by injecting an extremely small amount of black india ink into the agar plate after piercing the nitrocellulose filter and agar plate (three injections at the periphery of the filter in an unambiguous manner).
8.) 100 ng of purified rat brain hexokinase is spotted directly onto the top side of the filter (the side not in contact with the agar) as a positive control for the immunological screening. The filter is gently peeled off the agar plate using forceps while only touching the extreme edges (point indentations may "light up" as positives). NOTE: At this point the filter may be numbered using a pencil.
The filter is placed face up (side in contact with agar up) in a petri dish containing 20 ml TBS (10 mM Tris pH 7.5, 0.15 M NaCl). The agar plate is stored at 4°C.
10.) Any agar sticking to the filter can be removed by swirling the petri dish. The filter is then incubated for a minimum of 5 min. with TBS containing 3% calf serum (or gelatin) to block any protein binding sites.
11.) Incubate filter with 20 ml of a 1/1000 dilution (using TBS) of the affinity purified rabbit anti-HK Ab's containing 0.5% Nonidet P-40 at R.T. with shaking for 1 hr (1/1000 dilution with respect to the antiserum). NOTE: Some Ab preparations give higher backgrounds than others. The background may be reduced by incubating the Ab soln. sequentially with two or three filters using non-recombinant λgt11 as phage (a preadsorption step). If this is done, a 1/10 dilution of the Ab's can be used in the preadsorption step followed by dilution to 1/1000 for the screening. The resulting solution can be used numerous times.
12.) The Ab solution is poured off and kept at 4°C after addition of NaN3 to 0.1%. The filter is washed 3 times with TBS for 5 min. (while shaking).
13.) Incubate filter with 20 ml TBS containing 3% goat serum for 5 min.
14.) Incubate filter with 20 ml of a 1/1000 dilution of affinity purified horseradish peroxidase conjugated goat anti-rabbit Ab's in TBS containing 0.5% Nonidet P-40 for 1 hr.
15.) The goat Ab's are poured off and saved at 4°C for future use (reuse up to three times without any difficulty in detection). Wash filter three times with TBS for 5 min. each (while shaking).
16.) The filter is then incubated with 10 ml of developing soln. for 5 to 15 min. in the dark. Developing soln.: 60 mg 4-chloro-naphthol in 20 ml cold CH3OH added to 100 ml TBS containing 60 µl of 30% H2O2.
17.) After developing, the filters are rinsed with dH2O and stored in the dark to prevent yellowing.
18.)
The filter is aligned with the agar plate using the black india ink spots and a plug of agar is cored from the region containing the suspected positive (the large end of a disposable pipette is ideal for this procedure).
19.) The plug is placed in a 1.5 ml Eppendorf tube containing 1 ml of SM (plus 50 µl CHCl3 to prevent bacterial growth) and left overnight at 4°C to allow the phage to elute from the plug.

SM (per liter):
    5.8 g NaCl
    2 g MgSO4·7H2O
    50 ml 1 M Tris-HCl pH 7.5
    5 ml 2% gelatin

NOTE: To expedite matters the plug can be broken up with a toothpick and the Eppendorf tube incubated at R.T. on a rocker for 1 hr. to elute the phage.
20.) The phage solns. from each positive are plated as above at a lower density and rescreened. A well resolved positive plaque is picked and used for subsequent phage DNA isolation.

cDNA Synthesis and Construction of λgt10 Libraries

For the rat brain cDNA library, total RNA was isolated from adult rat brains. Total RNA to be used in the construction of the rat liver cDNA library was isolated from the livers of 6-day old rats. (This is the time point at which the activity of the type III isozyme is at a maximum, and therefore the mRNA levels for this isozyme were presumed to be at a maximum.) Both libraries were constructed using the procedure of DeWitt and Smith (82) starting with 5 µg of mRNA which had been isolated from total RNA as described in (83).

Screening of λgt10 cDNA Libraries

Plaque hybridization of the rat brain λgt10 cDNA library using the immunologically isolated clone HKI 12.4-4 (1) was carried out by procedures described in Maniatis (84). The rat liver λgt10 cDNA library was screened via plaque hybridization as in Maniatis (84), using the full length rat brain cDNA clone HKI 1.4-7 (2) as probe. The only procedural difference was that, after hybridization, the filters were washed only in 2 x SSC at 50°C. Labeling of the cDNA clones used in the plaque hybridizations was carried out via random priming (85,86).
Sequencing of cDNA Clones

The dideoxy method (87) was used to sequence the cDNA clones after generating non-random deletions via the method of Henikoff (88,89). Non-random deletions were generated by digesting successively larger regions of DNA from one end of the pertinent cDNA clones. This was carried out (separately) on both ends of the cDNA clones such that the sequence of both strands could be determined.

Northern Blot

Preparation and hybridization of the northern blot was carried out as in Maniatis using 10 µg of rat brain mRNA and type I hexokinase cDNA clone HKI 12.4-4 as probe, with the only difference being that after hybridization the blot was washed only in 2 x SSC at 48°C.

Construction of Plasmids for Expression of Rat Brain Hexokinase in E. coli

pHB4 and pM1-7

Full length clone HKI 1.4-7 in pUC18 was digested with Bam H1 and religated. This resulted in the removal of the 3' untranslated region (Bam H1 cuts a few bases downstream of the stop codon and once in the multiple cloning site). The modified clone was designated pHKI 1.4-7-B and is the starting clone in Figure 5. The next step was removal of the 5' untranslated sequence, which was carried out because this clone was initially going to be used in a different expression vector.

Figure 5. Construction of Plasmids pHB4 and pM1-7 Used for Expression. See text for details.

Step 1: Digestion of pHKI 1.4-7-B with EcoR1 and Sma I to isolate the 256 bp fragment corresponding to the 5' untranslated region and the first 165 bps of the coding region.
Step 2: Digestion of the isolated 256 bp EcoR1 - Sma I fragment with Nla III while removing aliquots throughout the digestion in order to isolate the fragment which is cleaved at only one of the Nla III sites (partial digestion), the site located at the starting Met codon.
Step 3: The starting clone, pHKI 1.4-7-B, is digested with Sph I and Sma I in order to ligate it (step 4) to the partial digestion fragment of step 2. The Sph I and Nla III sites are compatible, although the Sph I site will be lost upon ligation (denoted by the small "a" next to the Nla III site arrived at in step 2).

This clone was designated pNH2-1 and was digested with Hind III and Bam H1 (step 5) and ligated into the expression vector pIN-III ompA2 which had been cut similarly (step 6). The resulting clone, pHB4, was in frame with the ompA signal peptide and was used in the initial expression experiments aimed at determining if the expressed rat brain hexokinase was catalytically active. If the signal peptide was cleaved correctly, the expressed protein would still have 6 amino acid residues, corresponding to the cloning site, tacked onto the N-terminus. Clone pM1-7 was constructed using the deletion mutagenesis procedure (step 7) outlined by Takahara et al. (90) and the 24 base oligomer designated J.E.W.4 (GTAGCGCAGGCCATGATCGCCGCG).
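As a quick consistency check on the oligomer quoted above, the following sketch translates J.E.W.4 with the standard genetic code. Reading the oligomer in its first frame is an assumption, and the names `CODON_TABLE` and `translate` are illustrative only; they are not part of the procedure of Takahara et al. (90).

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 (assumed frame), one letter per codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# The 24 base oligomer quoted in the text.
JEW4 = "GTAGCGCAGGCCATGATCGCCGCG"

if __name__ == "__main__":
    print(len(JEW4), translate(JEW4))
```

In this frame the oligomer reads Val-Ala-Gln-Ala followed by Met-Ile-Ala-Ala. That is what would be expected if the deletion joins the end of the ompA signal peptide directly to the initiating Met of the hexokinase coding sequence, although the sketch itself only checks the translation.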
The rat brain enzyme expressed from this clone, if correctly processed, should begin with the starting Met.

pXN1 and pNB6

Originally, the type III hexokinase cDNA clone was also to be used in experiments aimed at bacterial expression, and clones were constructed using the same procedure as above (90). Unfortunately, the oligonucleotide used to delete the 5' untranslated sequence contained an extra nucleotide which was inserted downstream from the start codon and hence prevented expression of this isozyme due to a frame shift error. Nevertheless, one of the constructs, designated pIII-1, still proved useful in that the start codon was conveniently located in an Nco I site. This site was utilized in the construction of the plasmids used to express the two "halves" of rat brain hexokinase (pXN1 and pNB6, N-terminal and C-terminal halves, respectively) described below.

Clone pHKI 1.4-7 contains a unique Nco I site approximately midway through the coding region. This clone was digested with Nco I and Bam H1 and the 1403 bp fragment was cloned into pIII-1 which had similarly been cut. The resulting clone, pNB6, was constructed to express the C-terminal half of rat brain hexokinase.

Clone pM1-7, constructed above, was digested with Xba I and Nco I. The 1552 bp fragment corresponds to the coding region of the N-terminal half of rat brain hexokinase with an additional 100 bps on the 5' end (up to the Xba I site) coming from the pIN-III ompA2 vector. This fragment was cloned into pIII-1 which had also been cut with Xba I and Nco I. The resulting clone, pXN1, should express the N-terminal half of rat brain hexokinase.

Expression of Rat Brain Hexokinase in E. coli

1.) Grow a 1.5 ml culture (strain JA221 harboring the appropriate plasmid) to be used for expression, overnight at 37°C in L broth with ampicillin @ 75 µg/ml.
2.)
Add 100 µl to 10 ml of media (TB broth + ampicillin @ 75 µg/ml) in a 50 ml screw cap tube (with appropriate amount of IPTG) and let grow on shaker for 16 hrs.
3.) Transfer 1 ml of culture to a 1.5 ml Eppendorf tube and spin down for 2 min., discarding supernatant.
4.) Resuspend pellet in 500 µl of 20% sucrose, 10 mM Glc, 10 mM thioglycerol, 10 mM Tris, pH 7.5, and store on ice for 10 min.
5.) Spin down for 2 min. in cold room. Resuspend pellet in 200 µl of ice cold 10 mM Glc, 10 mM thioglycerol. Leave on ice for 15 min.
6.) Spin down 5 min. in cold room. Supernatant contains expressed enzyme.

Hexokinase activity was measured spectrophotometrically as in (91).

SDS-gel Electrophoresis and Immunoblotting

Procedures for SDS-gel electrophoresis and immunoblotting were the same as those described in (71).

Alignment of Amino Acid Sequences

The alignments of amino acid sequences of hexokinase isozymes were determined by first matching regions with a high degree of similarity before aligning the remaining sequence while keeping gaps to a minimum. Amino acid residue changes that occurred within one of the following six categories were considered to be conservative changes: a) Val, Met, Ile, Leu. b) Gln, Asn. c) His, Lys, Arg. d) Ala, Thr, Ser. e) Glu, Asp. f) Phe, Tyr, Trp.

Generation of Stereo Images

Stereo images were generated using the Brookhaven Protein Data Bank coordinates for the "open" conformation of yeast hexokinase B complexed with OTG (filename PDB2YHX.ENT), the "closed" conformation of yeast hexokinase A complexed with Glc (filename PDB1HKG.ENT), actin (filename PDB1ATC.ENT), and glycerol kinase (filename PDB1GLB.ENT). The program was written in Pascal on an IBM PC and designed for the generation of stereo images using the HP-GL/2 language of Hewlett-Packard LaserJet Printers (III or IV).
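A stereo pair of the kind described above can be sketched as two orthographic projections of the same coordinates, rotated by a small angle in opposite directions about the vertical axis. The following is not the author's Pascal program; the 3 degree half-angle, the function names and the sample points are assumptions chosen for illustration.

```python
import math


def rotate_y(point, angle_deg):
    """Rotate an (x, y, z) point about the vertical (y) axis."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) + z * math.sin(a),
            y,
            -x * math.sin(a) + z * math.cos(a))


def stereo_pair(coords, half_angle_deg=3.0):
    """Return (left, right) lists of 2-D points for a stereo pair.

    Each view is an orthographic projection (z is dropped) after rotating
    the model by -half_angle_deg or +half_angle_deg. The 3 degree default
    is an assumed value, and which rotation belongs to which eye depends
    on the viewing convention (wall-eyed or cross-eyed).
    """
    left = [rotate_y(p, -half_angle_deg)[:2] for p in coords]
    right = [rotate_y(p, +half_angle_deg)[:2] for p in coords]
    return left, right


if __name__ == "__main__":
    # Hypothetical alpha-carbon positions in angstroms, not from a PDB entry.
    atoms = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.0, 2.0)]
    left, right = stereo_pair(atoms)
    for l, r in zip(left, right):
        print(f"left=({l[0]:.2f}, {l[1]:.2f})  right=({r[0]:.2f}, {r[1]:.2f})")
```

Points that lie on the rotation axis project to the same place in both views, while points in front of or behind it shift horizontally in opposite directions, which is what produces the impression of depth when the two images are viewed together.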
Secondary structural features of yeast hexokinase were determined at the computational chemistry facility of Upjohn (Kalamazoo, MI) by the algorithm intrinsic to the software package MOSAIC using x-ray crystallographic coordinates from the Brookhaven Protein Data Bank.

CHAPTER III

Cloning of cDNA's Coding for Type I Rat Hexokinase; Comparison to Yeast Hexokinases; Proposed Model for Type I Hexokinase

This chapter begins with a description of the type I hexokinase cDNA clones, isolated from rat brain cDNA libraries, followed by verification of the authenticity of the clones as coding for this enzyme. Comparisons between the deduced amino acid sequences of the N- and C-terminal halves of type I hexokinase and the yeast isozymes indicate that type I hexokinase appears to have evolved via gene duplication and fusion of a 50 kDa ancestral hexokinase. Separate comparisons of the N- and C-terminal halves of rat brain hexokinase with yeast hexokinase isozymes A and B reveal that the yeast crystallographic structures provide a reasonable model (at least to a first approximation) for both halves of type I hexokinase. The chapter concludes with a proposed model of the mammalian enzyme constructed using the yeast crystal structures and relevant experimental data pertaining to type I hexokinase.

Initially, clones coding for the C-terminal half of rat brain hexokinase were isolated and sequenced. Subsequently, clones were isolated which contained the entire coding region. Therefore, discussion of the determination of the amino acid sequence for the C-terminal half of rat brain hexokinase occurs before that for the N-terminal half.

Cloning of the C-terminal Half of Rat Brain Hexokinase

A rat brain λgt11 cDNA library was screened immunologically for clones coding for rat brain hexokinase. The largest clone isolated was designated HKI 12.4-4 (1) and contained a 2.1 kb insert. Both strands were sequenced after generating non-random deletions using the strategy depicted in Figure 6.
Figure 8. Sequencing Strategy for Type I cDNA Clones and Relevant Restriction Sites. Regions of cDNA clone HKI 1.4-7 not present in HKI 12.4-4 were sequenced after the generation of subclones (represented by arrows) via nonrandom deletions. The 3' end of HKI 1.1, not present in HKI 1.4-7, was also sequenced. Diagram labels: restriction sites Eco R1, Pst I, Bam H1, Eco R1; clones HKI 12.4-4, HKI 1.1, HKI 1.4-7.

The cDNA clone HKI 1.4-7 contains a 3.7 kb insert which starts 91 bases upstream from the translation initiation codon. This clone extends 32 bases past the 3' end of the previously isolated and sequenced clone, HKI 12.4-4, and since clone HKI 12.4-4 contains the stop codon (and extensive 3' untranslated sequence, > 700 bps), cDNA clone HKI 1.4-7 includes all of the coding region. A second cDNA clone, designated HKI 1.1 (2), contains a 2 kb insert and extends an additional 13 bases beyond the 3' end of HKI 1.4-7, concluding with 27 adenine residues, the beginning of a presumptive poly(A+) tail. Sequencing of the 3' end of clone HKI 1.1 provided the 40 bases not included in HKI 1.4-7 (Figure 8). (The sequence of greater than 500 bases upstream from the 40 bases at the 3' end of HKI 1.1 was determined to be identical to the 3' end of clone HKI 1.4-7.) In conclusion, these clones were determined to represent 3.7 of the 4.3 kb present in the mRNA detected in a Northern blot of rat brain mRNA (Figure 9).

Authenticity of Full Length Clone HKI 1.4-7

The composite nucleotide sequence determined from HKI 1.4-7 and HKI 1.1, and not present in clone HKI 12.4-4, is shown in Figure 10, under which the deduced amino acid sequence is given. Regions of the deduced amino acid sequence matching those previously determined directly from the enzyme are underlined and are as follows.
(a) Starting at base 92, and establishing clone HKI 1.4-7 as containing the initiating Met (and hence the entire coding region), is a 9 residue amino acid sequence which agrees well with the N-terminal sequence of the enzyme determined by Polakis and Wilson (45). It should be pointed out that the deduced amino acid sequence Ala-Ala-Gln (residues 3 to 5), which could not be unambiguously identified (for technical reasons) by Polakis and Wilson (45), was reported as (Ala,Gln)-Ala. (b) Residues 102-121 match a 20 amino acid sequence determined directly from the N-terminus of the 90 kDa fragment produced 48 Origin _____ 28s rRNA = 4700 bases .0 ' _____ M82¢ RNA = 3636 bases _____ 23s rRNA = 2904 bases _____ 185 rRNA = 1900 bases _____ 16s rRNA = 1541 bases Figure 9. Northern Blot for Type I Hexokinase mRNA. Positions and sizes of control RNAs are as shown. 10 ugs of rat brain mRNA was probed with cDNA clone HKI 12.4-4 (see Methods). 49 COCC OAT CTO CCO CTO OAO OAC CAC TOC TCA CCA OOO CTA CTO AOO AOC CAC TOO CCC CAC ACC TOC TTT TCC OCA TCC CCC ACC OTC AOC ATO tATC :52 :5: gig ETA CTO I eu C O C TAT TAC C ACC O O CTO AAO OAT OAC CfAfl mAAO A OAC AAO TAT CTO Ai;_gz; Tyr 3:. 
Thr O u Leu Lys Asp Asp O Lys Lys I Asp Lys Tyr Leu TAC O C ATO COO CTC TCT OAT 0901 A CTO A A OAT A C CTO ACA COA Tic AAO AAA Tyr A a Met Arg Leu Ser Asp O Leu I e Asp I e Leu Thrk Mg P e Lys Lys O O ATO AAO AAT OOC CTC TCC COO OAT TAT AAT CCA ACA O C TCC OTC AAO ATO CTO O u Met Lys Aen Oly Leu Ser Arg Asp Tyr Aen Pro Thr A Ser Val Lye Met Leu CCC ACC TTO CTC COO TCC A CCO OAC OOC TCA OAA AAO OOO OAT TEC AIT OCC CTO Pro Thr Leu Leu Arg Ser I Pro Asp Oly Ser Olu Lys Oly AspP e I Ale Leu OAT CTC O TCT TCC TTT COA A C CTO COO OTO O OTO AAC O AAO AAC Asp Leu egO Oy Ser Ser Phe Arg I e Leu Arg Val cIn Val Aen His OIu Lys As; O AAC GTE“ AOC ATO O TCT “TC TAC OAC ACC CCA O O AAC A C OT CAT C s Vh O Ser As Pro O Aen I eVa O AOT A ACC O CTT C OAT CAT OTC OCT OAC TOC CTO OOA OAC C ATO OtO AAA Ser O y ThrO n Leu P e Asp His Val Ala Asp Cys Leu Oly Asp P e Met O u AAO AAO A C AAO OAC AAO AAO TTA CCC O OOA TTC ACA TTT TCC TTC CCC TOC COA Lys Lys I e Lys Asp Lys Lys Leu Pro Va Oly Phe Thr Phe Ser Phe Pro Cys Arg CfATC OA A OAT O O OCT OT CTO A C ACO TOO ACA AAO COO TTC AAA OEC AOT O Ser LysI e Asp O u Ala Va Leu I e Thr Trp Thr Lys Arg Phe Lys A a OOC OT? GIAO A O O OAT O OT AAO TTO CTO AAT AAA OEC ATT AAO AAO COA y V O Oy A a AspV Lys Leu Leu Aen Lys A I e Lys Lys Arg O y OAC TAT OAT AAC A OT GICVT AAT OAC ACA OT OOO ACC ATO ATO ACC Asp Tyr Asp Aa Aen I Va Va Aen Asp Thr Va Oly Thr Met Met Thr TOC T TAT OAT OAC C O TOT Ou O C C A ACA OOC ACC AAT Cys O y Tyr Asp Asp O n O n Cys O OIym Leu IIe IIT eg Thr Oly Thr Aen EST TOC TAC ATO O O O CTO COA CAC ATC OAC CTO OTO OtA OOC OAC O O OOO AOO a Cys Tyr Met O u O u Leu Arg Hie I Asp Leu val O y Asp O u Oly Arg ATO TOT A AAC ACO O TOO OOA C TTT OOO OAT OAT TCC CTO O OAC ATC Met Cys I e Aen Thr Ou uTrp Oly A a Phe Oly Asp Asp O y Ser Leu O u Asp I e COA ACC O O uTET OAC AOA O O TTA OAC COT OOA TCT CTC AAC CCT OOO AAO GAO CTO Arg Thr O e Asp Arg O u Leu Asp Arg Oly Ser Leu Aen Pro Oly Lye O HUI OUI 00 OH H. 
Nd U0 DU UIUI OOO NISUI ”NH U UIUI U10 HU ‘00 Q” UIG DH NUI 0'0 0. GO 0” NO 4.0 H 0H 00‘ ‘00 OH Hb Us! U0 0U UIUI 00 Mt OATO OTO AOC OOC ATO TAC ATO OOO O CTO OT COO CTA A C CTO OT? e O u Lys Met Val Ser Oly Met Tyr Met Oly O u Leu Va Arg Leu I e Leu Vh AAO ATO OfCfl “O OQA OOC CTC TTA TTC OAA OOO COC A C ACT CCA OAO CTO CTC ACO Lys Met A a Lys O u Oly Leu Leu Phe O1u O y Arg I e Thr Pro Olu Leu Leu Thr AOO O A AAO TEC AAC ACT AOT OAC OTO TCC O C A OAA AAO OAT AAO O OOC ATT Arg O y Lys P e Aen Thr Ser Asp Val Ser A a I e Olu Lye Asp Lye O u Oly I e CEA AAT OfC AAO O ATC TTA ACC COC TTO OOA GTIO O O CCO TCT OAT O OAC TOT O Asn A Lys O u I e Leu Thr Arg Leu Oly Va u Pro Ser AspV Asp Cys OT? TCO OT CAO CAC A C TOC ACO A C OT TCC C COA TCA OCC AAC CTO OTO OEC Ser va n His I eCys Thr I e Va Ser P e Arg Ser Ala Aen Leu Val A a O C ACO CTC O T yOIfC ATC TTO AAC COC CTO COO OAC AAC AAO OOC ACA CCC AOC CTO A a Thr LeuO Leu Aen Arg Leu Arg Asp Aen LysO y Thr Pro Ser Leu COO ACC ACO OTTO O C OTO OAC OOT TCT CTC TAC AAO ATO CAC CCA CAO TAC TCC COO Arg Thr Thr V Oy Val Asp Oly Ser Leu Tyr Lye Met His Pro Oln Tyr Ser Arg COO TTC CAC AAO ACC CTO AOO COO OTO OT? CCT OAC TCC OAC OT COT C CTC CTC Arg Phe His Lys Thr Leu Arg Arg Val Va Pro Asp Ser Asp Va Arg P Leu Leu TCA O O AOT OOC ACO C AAO OEC OEC ATO OT ACO O OT OCC TAC CTO Ser O u Ser Oly ThrO Oy Lys O y A a A a Met Va Thr A Ala Tyr Arg Leu O 3 O O O CAC /1493-3597/ TTTAOTOAOCCATTOTTOTACOTCTAOTAAACTTTOTACTOATTCAA AAAAAAAAAAAAAAAAAAAAAAAAA 3669 H H 5.: O. 00 0U 0U UN UH UH U0 U0 N0 N0 N0 ”‘1 ”Q H0 H0. HUI HUI H.» HU P r- H r- H MO 0” N01 00 GUI 01% .U U0 H” we: no am mm mm a» U o: 0 . Figure 10. Composite Nucleotide Sequence Obtained from cDNA Clones HKI 1.4-7 and HKI 1.1. The last 40 bps are from HKI 1.1. Nucleotides 1490-3597 (not shown) correspond to cDNA clone HKI 12.4-4 (Figure 7). The deduced amino acid sequence corresponds essentially to the N-terminal half of the enzyme. 
Sequences derived directly from the enzyme and shown to be in the deduced sequence are underlined. Only part of the N-terminal amino acids corresponding to the 48 kDa fragment are shown (beginning at residue 463). The segment encoding the presumed polyadenylation signal (93) is also underlined; the consensus signal is AAUAAA and, in the original notation, a subscript on each base gives the percentage of 134 vertebrate mRNAs examined (93) that contained the designated base at the indicated position.

by tryptic digestion under the conditions of Polakis and Wilson (71). Summation of the molecular weights of the 817 deduced amino acid residues corresponding to the 90 kDa fragment gives a value of 90,719 Da, in agreement with the experimentally determined size. (c) A 9-residue amino acid sequence, corresponding to residues 463-471 in the deduced amino acid sequence, completely matches that determined by White and Wilson (74) directly from the N-terminus of the 48 kDa fragment (produced by tryptic cleavage at T3 under partially denaturing conditions). Summation of the molecular weights of the 456 deduced amino acids corresponding to this fragment gives a value of 50,749 Da, which is similar to the experimentally determined size of 48 kDa. Additionally, the residue immediately upstream from the N-terminus of each of the fragments discussed above is an Arg or Lys, as expected for fragments generated by tryptic cleavage.

In summary, the sequence identities between the deduced amino acid sequence and the sequences derived directly from the enzyme, both those discussed above and those discussed previously for clone HKI 12.4-4, are located throughout the deduced primary sequence of the enzyme, which begins with the initiating Met, spans an open reading frame coding for 918 amino acids, and concludes with the terminal Ala. Therefore, there is little doubt that cDNA clone HKI 1.4-7 contains the entire coding region of rat brain type I hexokinase.
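The fragment sizes quoted above were obtained by summing residue molecular weights over a stretch of the deduced sequence. A minimal sketch of that kind of summation is given below. It assumes standard average residue masses and an unmodified chain; the mass table actually used for the values in the text is not stated, and the function name `polypeptide_mass` and the example peptide are hypothetical, chosen only for illustration.

```python
# Average residue masses in Da (amino acid minus one water), standard values.
# Assumption: the masses used for the values quoted in the text may differ slightly.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def polypeptide_mass(sequence: str) -> float:
    """Return the average molecular mass (Da) of an unmodified polypeptide."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    total = WATER
    for residue in sequence.upper():
        if residue not in RESIDUE_MASS:
            raise ValueError(f"unknown residue code: {residue!r}")
        total += RESIDUE_MASS[residue]
    return total


if __name__ == "__main__":
    # Hypothetical peptide, not taken from the hexokinase sequence.
    example = "ACDEFGHIK"
    print(f"{len(example)} residues, {polypeptide_mass(example):.1f} Da")
```

Applied to residues 102-918 or 463-918 of a deduced sequence, a function of this kind would give the sort of fragment mass that is compared with the sizes estimated experimentally.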
Comparison of Hexokinase Type I Halves and Yeast Isozymes

The deduced amino acid sequences of the N- and C-terminal halves of rat brain hexokinase and of yeast hexokinase isozymes A and B are aligned in Figure 11. The similarity between the N- and C-terminal halves of the brain enzyme, and between these halves and the yeast hexokinase isozymes, is extensive. When the N- and C-terminal halves are compared quantitatively, 47% of the amino acid residues are identical and an additional 17% represent conservative substitutions. This high degree of similarity, along with the similarity to the yeast isozymes, supports the proposal (19,50,67-70) that this mammalian hexokinase evolved by duplication and fusion of a gene encoding an ancestral hexokinase of ~50 kDa.

Comparison of the N-terminal half of rat brain hexokinase with the A isozyme of yeast hexokinase reveals that 27% of the residues are identical, with an additional 15% being the result of conservative substitutions. Similarly, comparison of the C-terminal half of rat brain hexokinase with yeast hexokinase isozyme A shows that 28% of the residues are identical, with an additional 15% classified as conservative substitutions. Furthermore, the alignment in Figure 11 shows that in comparisons of either the N- or C-terminal half of rat brain hexokinase with the yeast isozymes, the similar residues (identical + conservative substitutions) are located throughout the amino acid sequence of the yeast isozymes.
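Percentages of identical residues and conservative substitutions, such as those quoted for the comparisons of the two halves with each other and with yeast isozyme A, can be computed from a pair of aligned sequences roughly as sketched below. The grouping of residues treated as conservative and the choice to count only positions where neither sequence has a gap are assumptions made for illustration; the criteria behind the values in the text are not specified here. The name `compare_aligned` and the example strings are hypothetical.

```python
# Assumed conservative-substitution groups (illustrative only).
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KR"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def compare_aligned(seq_a: str, seq_b: str, gap: str = "-") -> tuple[float, float]:
    """Return (% identical, % conservative) over ungapped aligned positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = conservative = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # positions with a gap in either sequence are skipped
        compared += 1
        if a == b:
            identical += 1
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            conservative += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared, 100.0 * conservative / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from Figure 11.
    ident, cons = compare_aligned("ACDE-K", "ACEEGR")
    print(f"identical: {ident:.0f}%  conservative: {cons:.0f}%")
```

A different choice of denominator (for example, the length of the shorter sequence) or of conservative groups would shift the percentages somewhat, so values from such a sketch are comparable only when the same conventions are used throughout.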
Figure 11. Aligned Amino Acid Sequences of Rat Brain and Yeast Hexokinases. The N- and C-terminal halves of rat brain hexokinase (NI and CI, respectively) are aligned above the yeast A (Yst) and B isozymes (only those residues in the B isozyme which differ from the A isozyme are shown). Blackened regions correspond to identical residues and stippled regions correspond to conserved residues. Secondary structural features are designated beneath the aligned sequences, with α denoting α-helices, β denoting β-strands, and T denoting β-turns.

Figure 12. Stereo Images of Yeast Hexokinase Highlighting Secondary Structural Features. Alternate views of the yeast hexokinase crystal structure containing bound glucose are shown, with the darkened regions corresponding to either α-helices or β-sheets.

Secondary structural features are designated below the sequences in Figure 11. These features were determined from the crystal structure of the "open" conformation of yeast hexokinase (see Methods, chapter II) (54,55) and are highlighted in the stereo images in Figure 12. Using the alignment in Figure 11, the residues conserved in both the N-terminal half of rat brain hexokinase and yeast hexokinase A were mapped to the yeast hexokinase crystal structure. These residues are highlighted in the stereo images in Figure 13, parts A and B.
The same mapping was carried out for the residues conserved between the C-terminal half of rat brain hexokinase and the A isozyme of yeast hexokinase, and is shown in parts C and D of Figure 13. The stereo images demonstrate that, although the conserved residues are located throughout the respective structures, there is a high degree of conservation in regions that comprise the cleft and secondary structural features. Conversely, the least conserved regions map to the surface of the enzyme structure, as expected. Therefore, extensive similarity in the secondary and tertiary structures of these enzymes seems to be a reasonable expectation (94-96). Consequently, the yeast crystal structures provide a reasonable model which can be used to establish the location within the tertiary structure (at least to a first approximation) of conserved residues present in either half of rat brain hexokinase.

Figure 13. Stereo Images Highlighting Conserved Residues of Type I Hexokinase. Alternate views of yeast hexokinase with darkened residues being conserved between yeast hexokinase A and the N-terminal half of type I hexokinase (A and B) or the C-terminal half of type I hexokinase (C and D).

Harrison (66), working in Steitz's laboratory, refined the crystal coordinates well enough to identify residues that hydrogen bond to the hydroxyls of the bound glucose molecule. These residues include Ser-158, Asn-210, Asp-211, Gly-235, Asn-237, Glu-269, and Glu-302. If the yeast crystal structures are reasonable approximations to the two halves of rat brain hexokinase, conservation of residues playing as crucial a role as the binding of the substrate glucose would be a fair expectation (certainly in the C-terminal half of rat brain hexokinase, which has been shown to be catalytically active (74)). Conservation of Ser-158 had previously been demonstrated by its presence in the sequence of Peptide III isolated by Schirch and Wilson (ref. 92, discussed below).
Conservation of the other residues could not be confirmed with the limited sequence information that existed before the cloning of the cDNA for rat brain hexokinase. The alignment in Figure 11 now demonstrates that each of these residues has been conserved in the C-terminal half of rat brain hexokinase. Surprisingly, these residues are also conserved in the N-terminal half of the molecule. Conservation of all these residues in the N-terminal half was not expected, since this half appears to no longer possess catalytic activity.

Schirch and Wilson (92) have studied the glucose binding site of hexokinase. During the course of their work, three key peptides were isolated, designated Peptides I, II, and III (mentioned above). Peptides I and III were identified as being located at the glucose binding site of brain hexokinase, based on their reactivity with a glucose analog and protection by competing ligands, and are highly similar to sequences found at the glucose binding region of the yeast enzymes. Peptide II was also labeled with the reactive glucose analog but, unlike Peptides I and III, was not protected from labeling by competitive ligands. (The reason Peptide II was labeled is presently unclear.) Because there is no significant homology between this peptide and the sequences of the yeast isozymes, Schirch and Wilson (92) were unable to locate this peptide within the yeast structure. Now, with the determination of the entire amino acid sequence of rat brain hexokinase and the ability to use the yeast crystal structures to map the location of Peptide II, the inability of competitive ligands (vs. glucose) to prevent the labeling of this peptide is readily explained. Figure 14 shows that this peptide is located in the large lobe, far from the cleft containing the active site.
Peptides I and III are also highlighted, and their proximity to the active site is easily seen, in accord with the proposal of Schirch and Wilson (92). Additionally, examination of the sequence for the N-terminal portion of brain hexokinase (Figure 11) shows that these peptides are sufficiently unique in sequence that their location within the overall sequence could be established. Accordingly, all three peptides were derived from the C-terminal half of the enzyme, which supports the conclusion of Schirch and Wilson (92) that the C-terminal half possesses the catalytic site.

Figure 14. Stereo Images Showing the Locations of Peptides I, II, and III. A and B: Alternate views depicting the location of Peptides I (180-191), II (364-378), and III (152-171), which were labeled with a glucose analog by Schirch and Wilson (92).

Figure 11 reveals that several insertions and deletions have occurred during the evolution of yeast and mammalian hexokinases. Most of these differences map to surface regions in the yeast crystal structure or are located near the ends of secondary structural features. Frequently, such differences seem unlikely to result in radical changes to the overall structure (94,97). However, some insertions and deletions in the rat brain enzyme seem likely, because of their magnitude, to produce a structure that differs significantly from the yeast crystal structure. They are: the two deletions in the mammalian enzyme corresponding to residues 255-261 and 437-443 of the yeast hexokinases; and a 5-residue segment in both the N- and C-terminal halves (residues 405-409 and 853-857, respectively) of the mammalian enzyme, which would be inserted between residues 413 and 414 in the yeast hexokinases (Figure 15). All of these changes occur in the hinge region which links the small and large lobes in the yeast hexokinase structure.
Although exactly how these differences manifest themselves is unknown, they are located in a region where they seem certain to affect the structure.

Figure 15. Stereo Images Depicting Structural Differences Between Yeast and Type I Hexokinases. A and B: Alternate views of the insertion (darkened region) and deletions (dotted region) that have occurred in the evolution of type I hexokinase.

Proposed Structure for Mammalian Hexokinase Type I

Using the yeast hexokinase crystal structures as reasonable approximations to the structures of the two halves of rat brain hexokinase, a model for the entire rat brain hexokinase enzyme was constructed (2). The alignment in Figure 11 indicates that the C-terminal half of rat brain hexokinase lacks the region extending from the N-terminus up through and including the first α-helix of yeast hexokinase. After deleting this region from one of the yeast structures, the resulting N-terminus of this molecule was fused to the C-terminus of a complete yeast crystal structure. The C-terminal half was then rotated, keeping the N-terminal half stationary, in order to eliminate any steric conflicts. The final structure (Figure 16, parts A and B) was one in which the two "halves" approached closely enough that noncovalent interactions between them were possible. Although the model was constructed in a rather subjective manner, it seems far from arbitrary because the possible structural alignments were limited.

It should be pointed out that the structure in Figure 16 is missing those amino acids that comprise the extreme N-terminal sequence up to the beginning of the first α-helix. Steitz and colleagues were unable to determine a structure for this region due to localized disorder (55,56,58,66); therefore, what is not shown is, presumably, a flexible peptide attached to the N-terminus of the first α-helix.
Many structural features of this model agree with previously determined experimental results, as will be discussed below.

It has been well established that rat brain hexokinase binds to the outer membrane of mitochondria. One of the crucial features of this binding is the presence of the hydrophobic N-terminal amino acid segment (45), which is inserted into the membrane (46). Protrusion of this segment from the enzyme's surface would be consistent with its role in tethering the enzyme to the mitochondrial membrane as well as its noted susceptibility to proteolysis (45). This segment corresponds to the flexible peptide, referred to above, which is attached to the first α-helix in the model.

Digestion of native rat brain hexokinase with trypsin results in cleavage at two very susceptible sites, T1 and T2 (71), which correspond to Lys-101 (Asn-102 in yeast hexokinase) in the N-terminal half and Arg-551 (Thr-104 in yeast hexokinase) in the C-terminal half of the rat brain enzyme. These tryptic sites both map to virtually the same structural region of the yeast crystal structure: the end of one of the β-strands that comprise the β-sheet of the small lobe. This region is at the surface of the yeast structure, as would be expected from its marked susceptibility to trypsin. The occurrence of both of these sites, one in each of the two halves, is consistent with an enzyme structure that is composed of two conformationally similar halves. This is precisely the case for the constructed model.

Cleavage of the native rat brain enzyme with trypsin has revealed that the N- and C-terminal halves of the enzyme interact strongly by noncovalent forces. In fact, the interactions are so strong that the proteolyzed enzyme, under native conditions, behaves in many respects as the intact enzyme (71).
Alternatively, if tryptic digestion is carried out in 0.6 M guanidine hydrochloride, the interactions between the two halves of the enzyme are weakened and a new cleavage site is revealed (74). This tryptic cleavage site, designated T3, has been determined by direct sequencing of the protein to be located at Arg-462, only a few residues from the site at which the two halves of the model were fused together. In the proposed model (Figure 16, parts A and B), this site would be inaccessible to trypsin under native conditions due to the juxtaposition of the strongly interacting halves. However, if the interactions between the two halves were weakened by denaturant, this site would become susceptible to proteolysis. Therefore, the location of T3 in the model is consistent with the behavior of this site in the enzyme.

Yeast hexokinase B has been crystallized as a dimer and its structure determined (ref. 98 and Figure 16, parts C and D). Given the similarities between the two halves of rat brain hexokinase and the yeast enzymes, the possibility should be considered that the yeast dimer provides a model for the mammalian isozyme with respect to the relative disposition of the two monomers. This will now be discussed, with the "yeast dimer" model referring to a model based on the dimer structure and the "mammalian" model referring to the model proposed above.

The absence of the region leading up to and including the first α-helix in the C-terminal half of the mammalian enzyme seems contradictory to a "yeast dimer" model (Figure 16, parts C and D). If this region is removed from either of the yeast monomers, the newly generated N-terminal end (FP, for fusion point, in Figure 16) of the C-terminal half of rat brain hexokinase would be located quite a distance from the C-terminal end (T3 in Figure 16) of the other monomer (the N-terminal half of rat brain hexokinase).
These two ends would have to be fused to create the single polypeptide of rat brain hexokinase. Additionally, the behavior of T3 does not support this model, in that this region is totally exposed in both of the monomers and hence would be susceptible to proteolysis in the native structure.

Although the behavior of T3 and the problems with the fusion of the two halves indicate that the yeast dimer is unsatisfactory as a model for rat brain hexokinase, more recent experimental evidence totally eliminates this structure from consideration and, moreover, is consistent with the "mammalian" model proposed above. This evidence comes from the work of Smith and Wilson (99,100), in which they defined the epitopic regions recognized by monoclonal antibodies raised to native rat brain hexokinase, and will be discussed below.

Although the yeast dimer is composed of two identical subunits (with respect to primary sequence), Steitz et al. (98) concluded that, due to heterologous interactions between the two monomers, they are not structurally equivalent. Hence the designation of one of the monomers as the "up subunit" and the other as the "down subunit". Therefore, the "yeast dimer" model presents two possibilities in terms of modeling rat brain hexokinase. The first possibility, in which the "down subunit" (monomer on the right in Figure 16, part C) corresponds to the N-terminal half of rat brain hexokinase (with the "up subunit" corresponding to the C-terminal half), can be eliminated because monoclonal antibody 3A2 (99) binds to residues in the N-terminal half of rat brain hexokinase which correspond to yeast hexokinase residues 36-60 (highlighted in Figure 16, part C). This region is occupied by the other monomer in the yeast dimer, which would preclude the binding of this antibody.
The second possibility is that the "up subunit" (monomer on the left in Figure 16, part D) corresponds to the N-terminal half of rat brain hexokinase (with the "down subunit" now corresponding to the C-terminal half). This possibility also does not seem reasonable, again because of the epitope of monoclonal antibody 3A2, which appears to be somewhat occluded by the other monomer. Furthermore, Smith and Wilson (100) were able to successfully represent the epitopic regions recognized by a battery of monoclonal antibodies using the "mammalian" model presented above. In mapping these epitopes, they accounted for the entire surface area of the N-terminal half of rat brain hexokinase using the "mammalian" model, exclusive of the region that would be in contact with the C-terminal half. Not only does this potentially eliminate any variation of the "yeast dimer" model, it also strongly supports the relative disposition of the two halves of rat brain hexokinase in the "mammalian" model.

The structure proposed in Figure 16 (parts A and B) is certainly not meant to represent rat brain hexokinase in detail. The insertions and deletions mentioned previously have not been taken into account in the construction of this model, nor is such an undertaking feasible at this time. More refined coordinates (66) are not available through the Brookhaven data base, although major structural changes are not expected given the resolution to which the present coordinates have been refined (55,56,58). In conclusion, the proposed model has been shown to agree with a variety of experimental data and, despite its limitations, should prove useful in the future design and interpretation of experiments aimed at elucidating structure-function relationships in rat brain hexokinase.

Figure 16. Stereo Images Showing the Proposed Model of Type I Hexokinase / Yeast Hexokinase Dimer.
A and B: Alternate views of a model of type I hexokinase constructed from two yeast hexokinase structures, with the darkened half corresponding to the C-terminal half of type I hexokinase. T1, T2, and T3 are tryptic cleavage sites (see text for details). C and D: Yeast hexokinase dimer. The darkened region is the segment corresponding to the location of the epitope for monoclonal antibody 3A2 in the type I hexokinase sequence. T3 is located at the carboxy end of the N-terminal half, and this is the point at which the N-terminal half is fused to the beginning of the C-terminal half (FP, for fusion point).

CHAPTER IV

Cloning of cDNA's Coding for Type III Hexokinase from Rat Liver and Quantitative Comparisons of Sequence Similarities Between Hexokinases

This chapter covers the cloning of cDNA's coding for type III hexokinase, after which the amino acid sequences of hexokinases from different organisms are aligned. Subsequent quantitative comparisons, using this alignment, support the duplication and fusion proposal for the evolution of the "low Km" mammalian hexokinases and provide further insight into the evolution of glucokinase.

Cloning of cDNA's Coding for Type III Hexokinase

Type III hexokinase cDNA clones (7) were isolated from a rat liver cDNA library using the type I hexokinase cDNA clone HKI 1.4-7 (2). Three of the positive clones were determined to overlap, and their combined length was sufficient to provide the entire coding sequence for the 100 kDa type III isozyme (Figure 17). A 2.5 kb clone, designated L4.1-h, contained approximately 85% of the coding region and 180 bases of 3' untranslated sequence. A second clone, designated L7.1-1, included L4.1-h and additional 3' noncoding sequence, which contained a presumptive polyadenylation signal and concluded with 19 adenine residues.
The third clone, L7.1-2, overlapped with L4.1-h and extended in the 5' direction, giving the remaining 15% of the coding region and 80 bp of 5' untranslated sequence. Both strands of L4.1-h were completely sequenced, as were the unique regions of L7.1-1 and L7.1-2. The sequencing of L7.1-1 and L7.1-2 was extended such that at least 200 bp of overlapping sequence with the corresponding region of L4.1-h was obtained. Restriction sites relevant to sequencing, and the sequencing strategy, are depicted in Figure 17.

Figure 17. Sequencing Strategy for cDNA Clones Coding for Type III Hexokinase and Relevant Restriction Sites. The regions contained within clones L4.1-h, L7.1-1, and L7.1-2 are shown beneath the composite sequence. Direction and extent of sequencing of subclones (generated via nonrandom deletions) is indicated by the arrows.

Authenticity of Type III Hexokinase cDNA Clones

Figure 18 contains the nucleotide sequence determined from the overlapping clones coding for type III hexokinase, under which the deduced amino acid sequence is given. Marcus and Ureta (101) have previously isolated tryptic peptides, designated Peptides 1 through 7, from the type III isozyme. The amino acid sequences determined from these peptides are underlined in Figure 18 (five of which are distinct from the type I isozyme), and the presence of these sequences throughout the deduced sequence confirms that these cDNA clones code for the type III isozyme of hexokinase.
Although the overlap between clones L7.1-2 and L4.1-h indicates that L7.1-2 codes for type III hexokinase, further verification was provided by the presence of Peptide 7 (unique to type III) in the deduced amino acid sequence of the 5' region of L7.1-2 not contained in L4.1-h. In the deduced amino acid sequence, the residue immediately preceding the N-terminus of each peptide is a Lys or Arg, which is consistent with the generation of these peptides by trypsin. There is one discrepancy: Cys-171 was reported by Marcus and Ureta (101) to be a Ser.

Figure 18. Composite Nucleotide Sequence and Deduced Amino Acid Sequence of Rat Type III Hexokinase. Amino acid sequences identical to the sequence of the tryptic peptides determined by Marcus and Ureta (101) are underlined. (NOTE: within the peptide comprised of residues 163-178, Cys-171 has not been underlined due to a discrepancy with the sequence reported by Marcus and Ureta (101), which had a Ser at this position.) Underlined in the 3' untranslated region is the presumed polyadenylation signal (93).

Comparisons of Deduced Amino Acid Sequences of Hexokinases

The cloning of hexokinases and glucokinases from different organisms has been carried out by various researchers (see chapter I, page 2). The deduced amino acid sequences of these clones are aligned in Figure 19. (Note: The sequence of Z. mobilis glucokinase (102) was not included in Figure 19. The degree of similarity was very low, and regions that were highly conserved in all of the other sequences were not conserved in this glucokinase. Upon translating the nucleotide sequence, some of the highly conserved regions were found to exist in the alternate reading frames; this sequence was therefore excluded, since it seems likely to contain sequencing errors.)

Using the alignment in Figure 19, quantitative comparisons between the N- and C-terminal halves of the deduced amino acid sequences of the 100 kDa enzymes (Table 2) reveal that within each isozyme the C-terminal half is quite similar to the respective N-terminal half. This supports the proposal (19,50,67-70) that the 100 kDa hexokinases arose by duplication and fusion of a gene coding for a 50 kDa enzyme. Comparisons between the C-terminal halves of the 100 kDa isozymes show that all three "low Km" isozymes have very similar C-terminal halves (over 60% of the residues are identical). The N-terminal halves of types I, II, and III are also similar, although this similarity is not as pronounced with the type III isozyme. Nevertheless, the N-terminal half of type III is more similar to the N-terminal halves of types I or II than to the corresponding C-terminal halves. Therefore, the similarity among the N-terminal halves, along with the similarity among the C-terminal halves, supports the concept that the original 100 kDa "fused" protein was subsequently (at least) triplicated, resulting in the three "low Km" isozymes.

An indication of the evolutionary relationship of the type IV isozyme to the other hexokinases is given by the quantitative comparisons in Table 2, coupled with the evolutionary scheme described by Ureta (19). Two distinct possibilities were presented by Ureta (19) for the evolution of the mammalian isozymes. In one of the possibilities, the type IV isozyme and the other present-day 50 kDa isozymes
In one of the possibilities, the type IV isozyme and the other present-day 50 kDa isozymes (not inhibited by glucose-6-phosphate) diverged from the ancestral 50 kDa enzyme (also not inhibited by glucose-6-phosphate) before the initial gene duplication and fusion event giving rise to the 100 kDa isozymes (types I-III, which are inhibited by glucose-6-phosphate). Therefore, the type IV isozyme would be expected to be more similar to the other 50 kDa isozymes than to the mammalian types I-III. The other possibility proposed by Ureta (19) was that the type IV isozyme arose after the duplication and fusion event which gave rise to the 100 kDa hexokinases. The type IV isozyme would then be a product of the subsequent resplitting of one of the genes to restore a 50 kDa form. In this case, the type IV isozyme would be expected to be more similar to the types I-III isozymes than to the other 50 kDa hexokinases. The results in Table 2 indicate the latter to be the case. Indeed, comparisons of type IV with the 100 kDa mammalian isozymes result in similarities where approximately 50% of the residues are identical, as opposed to the 50 kDa yeast isozymes, where only 27% are identical.

The alignment in Figure 19 demonstrates that the insertions and deletions that have previously been noted between the type I isozyme and yeast hexokinase, which are likely to impact on structure (chapter III, page 59), are also present in the other mammalian enzymes.

Figure 19. Alignment of Known Hexokinase and Glucokinase Sequences. The aligned sequences from top to bottom are the N-terminal halves of hexokinases from: bovine type I (bI), human type I (hI), mouse type I (mI), rat type I (I), rat type II (II), rat type III (III), followed by the respective C-terminal halves. Next are the sequences of human liver glucokinase (hIV), rat liver glucokinase (IV), Schistosoma mansoni hexokinase (sm), yeast glucokinase (Ygk), yeast hexokinase A (Yst), yeast hexokinase B (only residues that differ from isozyme A are shown), and secondary structural features determined from the yeast crystal structures (α = α-helices, β = β-sheet, T = β-turns). Blackened regions correspond to identical residues and stippled regions correspond to conserved residues.

Table 2. Quantitative Comparison of Hexokinase and Glucokinase Sequences.
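The percentages compared in Table 2 (identical residues and conservative changes between two aligned sequences) can be illustrated with a minimal sketch. The conservative-substitution groups, the treatment of gapped positions (skipped, and excluded from the denominator), and the function name `compare_aligned` are all assumptions made for illustration; the text does not state which conventions were used to build Table 2.

```python
# Hypothetical conservative-substitution groups (an assumption; the groupings
# actually used for Table 2 are not given in the text).
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def compare_aligned(seq_a, seq_b, gap="-"):
    """Return (% identical, % conservative) for two pre-aligned sequences.

    Positions where either sequence has a gap are skipped and do not count
    towards the denominator (an assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    identical = conservative = compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            conservative += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared, 100.0 * conservative / compared


# Toy example with made-up sequences (not taken from Figure 19).
pct_identical, pct_conservative = compare_aligned("ACDEFGHIK-", "ACEEYGHLKA")
print(f"{pct_identical:.1f}% identical, {pct_conservative:.1f}% conservative")
```

Under these assumptions, a type IV sequence scoring near 50% identity against the mammalian 100 kDa halves but near 27% against the yeast enzymes would favor the second of Ureta's two schemes, as argued in the text.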
Extensive conservation of sequence among the enzymes of Figure 19 makes it reasonable to expect an overall conservation of structure in these enzymes (94,103-105). The secondary structural features of yeast hexokinase are also indicated in Figure 19 below the sequences of the yeast hexokinases. As previously shown for the type I isozyme (Figure 11), most of the insertions or deletions evident in the sequence alignment are located near the ends of secondary structural features such as α-helices and β-strands. This would be expected in the case of homologous proteins (94). Yeast glucokinase (14), however, appears to contain a region that is an exception. An insertion of 11 residues is present in yeast glucokinase which would occur between yeast hexokinase residues Thr-226 and Lys-227. This insertion is not present in any of the other enzymes. It appears this insertion would increase the length of a β-strand located in the "hinge" region joining the two lobes of yeast hexokinase, which seems certain to have a major impact on this region of the molecule (Figure 20).

Figure 20. Stereo Image Showing Insertion in Yeast Glucokinase. Residues 226 and 227 are shown by thickened regions of the backbone. This is the location of an apparent 11 residue insertion in yeast glucokinase.

CHAPTER V

Glucose and ATP Binding Sites

In this chapter, a closer look is taken at the residues involved in the binding of glucose and the conservation of these residues in the known hexokinase sequences, using the sequence alignment in the previous chapter (Figure 19). The chapter concludes with discussion of the region (and the residues therein) proposed to be involved in the binding of the other substrate, MgATP. (Note: This chapter contains stereo images of the yeast hexokinase isozymes, actin, and glycerol kinase. In the cases of actin and glycerol kinase, amino acid residues determined in the crystal structures agree with those deduced from the respective cDNA sequences.
However, the crystal structures for the yeast hexokinase isozymes were determined prior to the availability of the amino acid sequences. As a result, many of the side chains were misidentified. Therefore, in the stereo images of the yeast hexokinase isozymes, if the side chains do not match the amino acid label, the amino acid label is correct.)

The Glucose Binding Site

Figure 19 shows many regions where the sequences of the enzymes are well conserved. Not surprisingly, some of these regions comprise the glucose binding site. Residues, determined by Harrison (66), which appear to hydrogen bond with the hydroxyls of glucose in the open conformation of yeast hexokinase include the side chains of Asn-210, Asn-237, Glu-269, and Glu-302 as well as the carbonyl oxygens from the peptide bonds of residues Gly-235 and Val-236 (Figure 21). In the closed conformation (Figure 22), yeast

Figure 21. Stereo Images of Residues Involved in the Binding of Glucose in the "Open" Conformation of Yeast Hexokinase. The side chains as well as carbonyl oxygen bonds of residues involved in the binding of glucose to yeast hexokinase are darkened. A: "Open" conformation of hexokinase with bound glucose. B and C: Alternate close-up views. Residues utilized to bind glucose in the "open" conformation are Asn-210, Gly-235, Val-236, Asn-237, Glu-269, and Glu-302.

Figure 32. Crevices in β-sheets in Yeast Hexokinase that Contribute Active Site Residues. Thin lines denote the β-sheets and thickened lines denote the region of the crevice where active site residues are located (see text). A: β-sheet of small lobe, B: β-sheet of large lobe.

HSC70 proteins, Flaherty et al. (123) noted that the structure of HSC70 was sufficiently refined that some of the water molecules could be located. One such water molecule, referred to as Wat 546, is oriented such that its oxygen atom is situated 3.5 Å from the terminal phosphate of ATP. The bond between O3' and the γ-phosphate of the bound ATP is aligned with a line from the oxygen of Wat 546 to the γ-phosphate. (The O3' oxygen is the oxygen in the phosphodiester bond "linking" the β and γ phosphorus atoms.) They suggest, therefore, that this water molecule is a good candidate for an in-line attack on the γ-phosphate of ATP.
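The geometric criterion behind an "in-line" attack can be sketched numerically: the leaving-group oxygen, the γ-phosphorus, and the attacking oxygen should be close to collinear, that is, the angle at phosphorus should approach 180°. The sketch below is only illustrative. The coordinates are invented (they are not taken from the HSC70 or hexokinase structures), and the 30° tolerance and the function names are arbitrary assumptions.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, measured at vertex b (3-D points)."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("points must be distinct from the vertex")
    cos_theta = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def is_in_line(leaving_o, p_gamma, nucleophile_o, tolerance_deg=30.0):
    """True if the nucleophile lies roughly opposite the leaving-group oxygen.

    tolerance_deg is an arbitrary cut-off chosen for illustration.
    """
    return 180.0 - angle_deg(leaving_o, p_gamma, nucleophile_o) <= tolerance_deg


# Invented coordinates (in angstroms): bridging oxygen, gamma-phosphorus,
# and a candidate nucleophile oxygen 3.5 angstroms from phosphorus.
bridging_o = (-1.6, 0.0, 0.0)
p_gamma = (0.0, 0.0, 0.0)
candidate = (3.5, 0.0, 0.0)
print(angle_deg(bridging_o, p_gamma, candidate), is_in_line(bridging_o, p_gamma, candidate))
```

With real coordinates, the same check could be applied to a water oxygen or to the 6-hydroxyl oxygen of a bound hexose; a nucleophile approaching from the side (angle near 90°) would fail the test.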
Additionally, they noted that in the yeast hexokinase crystal structure containing N-o-toluoylglucosamine, the position of the 6-hydroxyl of the bound glucose molecule approximates that of Wat 546. An in-line attack of the 6-hydroxyl of glucose on the γ-phosphate of ATP would be consistent with stereochemical studies of yeast hexokinase which show the reaction proceeds with inversion of configuration at phosphorus (128,129).

Figure 33. Stereo Images Depicting the Interdomain Hinge. Thick lines correspond to CONNECT 1 and CONNECT 2, which are helices proposed by Bork et al. (114) to form an interdomain hinge. Thin lines are the rest of the regions that are structurally similar. A: Yeast hexokinase, B: Actin, C: Glycerol kinase.

The previous two sections have discussed the ATP binding site, based in the first section on predictions derived from proteins of similar sequence, and based in the second section on comparisons to proteins that utilize structurally equivalent regions to bind the nucleotide substrate. Both sections have pointed out the importance of conserved glycine residues in the ATP binding site that are necessary for the binding (close approach) of the phosphate side chain. On the other hand, the region used in binding the adenine moiety may not be as clear. In the first section, the adenine moiety is predicted to be bound by the hydrophobic β-sheet of the small lobe. This was supported by studies showing that the binding of ATP to a peptide containing a portion of this β-sheet, and equivalent to yeast hexokinase residues 78-127 (Figure 23, part C), was independent of the chelation status (by Mg²⁺) of the phosphate side chain of ATP. Therefore, the region was surmised to be binding the adenine moiety of ATP. In the second section, the importance of the 3₁₀-helix (yeast hexokinase residues 419-424) in the binding of the adenine moiety was discussed.
Additionally, based on the actin structure, analogous residues that appear to interact with the divalent cation in yeast hexokinase were determined to be Asp-86 and Asp-211. Only one of these residues is present in the peptide containing yeast hexokinase residues 78-127. Both residues may be necessary to bind the divalent cation. This may explain why binding of ATP by this peptide was independent of the chelation status of ATP. Nevertheless, this does not eliminate the β-sheet of the small lobe from involvement in the binding of the adenine moiety of ATP. The adenine moiety could be initially bound by the hydrophobic surface of the β-sheet, after which extensive movement of this region would occur, with the result that the adenine moiety would be "clamped" into place between the β-sheet and the 3₁₀-helix upon closure of the cleft. Therefore, the hydrophobic β-sheet of the small lobe would be important in the initial binding of ATP, while the 3₁₀-helix would become important upon closure of the cleft. This is similar to the binding of glucose in yeast hexokinase, where the majority of the residues interacting with glucose in the "open" conformation are from the large lobe, whereas residues from both lobes interact with glucose in the "closed" conformation.

This chapter has described the glucose binding site and residues proposed to be involved in the binding of MgATP. It is readily apparent from Figure 19 that the residues involved in binding glucose, as well as most of those proposed to be involved in binding ATP, are well conserved in both halves of the "low Km" mammalian isozymes. The functional significance of these residues in the presumably noncatalytic N-terminal halves of the mammalian hexokinases remains unknown. A possible reason for the conservation of these residues might be that the N-terminal half is catalytically active only under specific conditions, such as when the enzyme is bound to mitochondria (type I hexokinase).
Through the use of site-directed mutagenesis, these residues, and the role they play, may be further investigated, an undertaking already occurring in this laboratory.

CHAPTER VI

Heterologous Expression of Type I Hexokinase

This chapter covers preliminary results on the bacterial expression of type I hexokinase using the plasmid pIN-III ompA (130,131). Parameters such as media, temperature, and concentration of inducer were varied in order to achieve the highest yield (in terms of activity).

Background

In pIN-III ompA, expression is under control of the lac promoter, which can be induced by the gratuitous inducer IPTG (isopropylthiogalactoside). Additionally, this vector is designed such that the type I hexokinase protein produced will contain the signal peptide of the OmpA protein, which targets the protein for secretion into the periplasmic space. Upon translocation across the cytoplasmic membrane the signal peptide is cleaved. The expressed protein accumulates in the periplasmic space, where protease activity is greatly reduced (130,132) compared to intracellular expression. Additionally, the periplasmic space provides an oxidizing environment (133) (as opposed to the intracellular reducing environment (134)) which may help prevent the formation of insoluble aggregates of the expressed protein (reviewed in 135). After expression into the periplasmic space, the expressed protein is isolated via osmotic shock (136). Expression into the periplasmic space should provide for a simplified isolation of the expressed protein.

Plasmid Constructs

Construction of the plasmids that were used to express type I hexokinase is given in the methods section. The four plasmids designated pHB4, pM1-7, pXN1, and pNB6 are described below. Plasmid pHB4 contains the entire coding region of type I hexokinase with an additional sequence corresponding to the multiple cloning site of pIN-III ompA located at the 5' end of the cDNA insert.
Upon induction, the expressed hexokinase protein will have additional amino acids "tacked" on the N-terminus even after the signal peptide has been cleaved (during translocation into the periplasmic space). pHB4 was used in the initial experiments to determine whether or not the expressed type I hexokinase would be catalytically active. The next step in the constructions was deletion of the region corresponding to the multiple cloning site located between the 5' end of the type I hexokinase cDNA and the region coding for the signal sequence. The resulting plasmid, pM1-7, codes for a protein that, after translocation into the periplasmic space and cleavage of the signal peptide, should be full-length type I hexokinase beginning with the native N-terminal starting Met. Sequence comparisons shown earlier have clearly established the mammalian hexokinases as being comprised of two similar halves. Plasmids pXN1 and pNB6 were constructed for expression of the N-terminal and the C-terminal halves, respectively, of the type I isozyme.

Expression Results

Replica nitrocellulose filters containing E. coli (strain JA221) harboring either pIN-III ompA (negative control) or pHB4 (should produce rat brain hexokinase) were induced with IPTG (2 mM) and grown overnight at room temperature. Colonies on the filters were osmotically shocked (to release the contents of the periplasmic space) and the filters were then screened with affinity-purified polyclonal antibodies (rabbit) raised against type I hexokinase. Colonies harboring pHB4 were immunoreactive to the anti-hexokinase antibodies while those harboring only the vector pIN-III ompA were not. Additionally, replica filters were incubated (after osmotic shock) in hexokinase activity stain. The filter from colonies harboring pHB4 developed positive signals much faster, and of greater intensity, than the filter derived from colonies harboring only pIN-III ompA.
Therefore, polyclonal antibodies indicated hexokinase was being produced in cells harboring pHB4, and activity staining indicated the enzyme was catalytically active. Initial experiments aimed at expressing type I hexokinase in culture (strain JA221), using pM1-7, did not result in the detection of any significant hexokinase activity (after osmotic shock) relative to the negative control, pIN-III ompA. These experiments were carried out at 37°C with induction by 2 mM IPTG. The growth temperature was then reduced to room temperature (137,138), and the levels of IPTG used to induce expression, as well as the media (L broth vs. TB broth) (138), were varied. The results are shown in Table 4.

Table 4. Expression of Type I Hexokinase

Clone            Inducer conc.    mU/ml culture (L broth, TB broth)
pM1-7            .0 IPTG          7
pM1-7            .20 IPTG         7
pM1-7            .020 IPTG        41
pM1-7            .002 IPTG        76
pM1-7            IPTG
pIN-III ompA     0.002 IPTG
pIN-III ompA     0 IPTG

Values are expressed as milliunits/ml of culture. 1 unit is equivalent to the reduction of 1 µmol NADP+/min.

The results in Table 4 indicate that the highest levels of activity were achieved using TB broth as media with no added IPTG. Yeast extract, a component of both L broth (yeast extract @ 5 g/l) and TB broth (yeast extract @ 24 g/l), apparently contains an activator of the lac operon and is therefore able to induce expression (135). The time course of expression (Figure 34) of type I hexokinase in E. coli (strain JA221) was determined using pM1-7 in TB broth with no added inducer (IPTG). It is apparent that maximum levels of expression are reached shortly after the culture reaches saturation. The two "halves" of rat brain hexokinase were expressed at room temperature using TB broth and varying the level of inducer (IPTG). The results are listed in Table 5. Surprisingly, the N-terminal half (pXN1) appears to possess some activity, although this is not certain since the protein was not purified.
As expected, the C-terminal half (pNB6) appears to be catalytically active, although maximum activity levels appear to be at an IPTG concentration of 2 mM, as opposed to the full-length expressed enzyme (pM1-7), where maximum activity levels occurred with no addition of IPTG.

This chapter has described preliminary results of expression of rat brain hexokinase in E. coli. The expressed enzyme appears to be catalytically active and, based on a specific activity of 60 U/mg for the enzyme isolated from rat brains, the amount of active enzyme (pM1-7) at maximum expression is calculated to be 1.8 mg/l (assuming all the activity measured is ascribable to the expressed enzyme). Unfortunately, it appears that the expressed enzyme is unable to bind mitochondria (data not shown). Whether this is because the signal peptide was not cleaved from the enzyme, or because cleavage of this peptide results in a charged N-terminus (which cannot insert into the mitochondrial membrane as is required for binding), has not been determined. The expression levels of type I hexokinase (and the N- and C-terminal halves) were maintained for well over a year. Unfortunately, in recent experiments the expression levels of all clones (as determined by activity measurements) have dropped to negligible levels for reasons which, regrettably, remain undetermined.

Figure 34. Time Course for Heterologous Expression of Type I Hexokinase. 500 ml of TB broth were inoculated with 5 ml of culture (in TB broth, previously grown to saturation at 37°C). The resultant culture was grown at room temperature with aliquots taken at 1 hour intervals for analysis.
Total glucose phosphorylating activity (0) isolated from the periplasmic space and optical density of the culture at 550 nm (o) were determined for each time point.

Table 5. Expression of N- and C-terminal Halves of Type I Hexokinase

Clone            Inducer conc.     mU/ml culture
pXN1             2.0 mM IPTG       3
pXN1             0.20 mM IPTG      1
pXN1             0.020 mM IPTG     13
pXN1             0.002 mM IPTG     16
pXN1             0 mM IPTG         26
pNB6             2.0 mM IPTG       206
pNB6             0.20 mM IPTG      202
pNB6             0.020 mM IPTG     61
pNB6             0.002 mM IPTG     25
pNB6             0 mM IPTG         17
pIN-III ompA     0.002 mM IPTG     1
pIN-III ompA     0 mM IPTG

CHAPTER VII

Future Research

The cDNA's coding for types I and III hexokinases make possible mutagenesis experiments aimed at investigating the structure-function relationships in the hexokinases. The importance of residues involved in binding glucose, as well as those proposed to be involved in binding ATP, discussed in chapter V, can be explored with mutagenesis of the regions of interest. Specifically, the importance of the hydrophobic β-sheet of the small lobe and the 3₁₀-helix (yeast hexokinase residues 419-424) in the binding of the adenine moiety of ATP could be investigated with site-directed mutagenesis. Mutagenesis of hydrophobic portions of the β-sheet should prevent binding of ATP and catalysis, whereas mutagenesis of the 3₁₀-helix should not prevent binding of ATP, since the adenine moiety, and hence ATP, can still initially bind. The enzyme should nevertheless be catalytically inactive, since the conformational changes necessary for catalysis cannot occur when the adenine moiety cannot be properly "clamped" between the β-sheet and the mutated 3₁₀-helix.

In the sequence alignment in Figure 19, there are 50 residues that are identical and an additional 43 residues that are conserved in all of the sequences (including both halves of the 100 kDa isozymes). These residues are highlighted in Figure 35. It is not surprising that these residues are located almost exclusively either in the cleft or buried in the enzyme.
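Counts such as "50 identical and 43 additional conserved" positions come from scanning the columns of a multiple alignment. A minimal sketch of that bookkeeping follows. It assumes that a column counts as "conserved" when every residue falls within one substitution group and that columns containing a gap are ignored. Both conventions, the groups themselves, and the name `classify_columns` are hypothetical and are not taken from the text.

```python
# Hypothetical substitution groups used to call a column "conserved".
GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def classify_columns(aligned_seqs, gap="-"):
    """Return (identical_columns, conserved_columns) as lists of 0-based indices.

    A column is 'identical' if all sequences share one residue, and
    'conserved' if the residues differ but all belong to a single group.
    Columns containing a gap are skipped (an assumed convention).
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must have the same length")
    identical, conserved = [], []
    for index, column in enumerate(zip(*aligned_seqs)):
        residues = set(column)
        if gap in residues:
            continue
        if len(residues) == 1:
            identical.append(index)
        elif any(residues <= group for group in GROUPS):
            conserved.append(index)
    return identical, conserved


# Toy alignment with made-up sequences (not taken from Figure 19).
identical_cols, conserved_cols = classify_columns(["GKDLE", "GRDIE", "GKDVE"])
print(len(identical_cols), "identical;", len(conserved_cols), "conserved")
```

Mapping the returned column indices back onto yeast hexokinase residue numbers would then indicate where the invariant positions sit in the structure, which is the kind of comparison Figure 35 presents.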
In the close-up views of Figure 35 (parts C and D), two strictly conserved residues (in addition to those previously suggested to be important in the binding of substrates) appear to be reasonable candidates for site-directed mutagenesis. They are Thr-212, which is located deep in the cleft with its side chain oriented towards glucose, and Lys-176, which is located in the small lobe at the "lip" of the cleft. Figure 35 shows that Thr-212 is in a position to affect the interactions between the conserved residues Asn-210, Asp-211, Asn-237 and glucose. Similarly, the side chain of Lys-176 is oriented into the cleft directly above the bound glucose and seems certain to affect the bound glucose.

Figure 35. Conserved Residues in Hexokinases. Darkened regions are (A:) identical, or (B:) identical + conserved residues in all the sequences of Figure 19. Close-up views of the region containing Lys-176 and Thr-212 using the (C:) "open" or (D:) "closed" conformation of yeast hexokinase.

As discussed in chapter IV, comparisons between the "low Km" isozymes demonstrate that the C-terminal halves are similar, as are the N-terminal halves. Figure 36 highlights the residues that are identical in comparisons of the amino acid sequences of either all the N-terminal halves of the "low Km" isozymes (part A), or all the C-terminal halves of the "low Km" isozymes (part B), or the catalytic "halves" (part C) (the C-terminal halves of the "low Km" isozymes + all the 50 kDa enzymes of Figure 19) of all the sequences of Figure 19. As expected, in all three cases, the majority of the residues that are identical are located either in the cleft or are buried in the enzyme. There are, however, two helices located on the surface (Figure 36, part B, yeast residues 346-352 and 359-369) that are comprised of residues that are strictly conserved only in the C-terminal halves of the "low Km" isozymes (which are the only "enzymes" inhibited by physiologically relevant levels of glucose-6-phosphate). Glucose-6-phosphate is a competitive inhibitor with respect to ATP (139), and these conserved helices are close to the 3₁₀-helix implicated, in chapter V, in the binding of ATP. This leads to speculation as to whether this region is involved in the glucose-6-phosphate inhibitory site. Mutagenesis of these residues may reveal the reason for the strict conservation of so many of the residues in this region of the C-terminal halves of the "low Km" isozymes.

Figure 36. Stereo Images Highlighting Conserved Residues in Comparisons of Groups of Hexokinases. Darkened residues correspond to identical residues in the sequences of A: N-terminal halves or B: C-terminal halves of the "low Km" isozymes, or C: catalytic "halves" of all the sequences in Figure 19.

In chapter V, the residues that are utilized in binding the substrates glucose and ATP were shown to be conserved in both halves of the mammalian hexokinases, raising the question of whether or not the N-terminal halves possess catalytic activity. In the case of type I hexokinase, one possibility is that the N-terminal half is only active when the enzyme (type I) is bound to mitochondria. This could be investigated by introducing a mutation in the C-terminal half so that the soluble (unbound) enzyme is no longer active, and then binding this mutant to mitochondria and assaying for activity. Any detected activity would be an indication that the N-terminal half possesses catalytic activity.

The N-terminal sequence is critical for binding type I hexokinase to mitochondria (18). This sequence also appears to be sufficient to effect the binding of other proteins to mitochondria (48). Type III hexokinase has been shown to be associated (weakly bound) with the nuclear envelope (38). By manipulating the cDNA's of types I and III, the type I N-terminal sequence could be changed to the type III sequence.
It would be interesting to see whether the expressed enzyme is then associated with the nuclear envelope. If this were the case, the kinetic properties of the type III isozyme would, in effect, have been exchanged for those of the type I isozyme. A major difference is that the type III isozyme is inhibited by the substrate glucose (17) whereas type I is not, so the intracellular effects of eliminating this inhibition could be investigated.

APPENDICES

APPENDIX A

RESTRICTION SITES FOR HEXOKINASE TYPE I CDNA

                         #  SITES  FRAGMENTS    FRAGMENT ENDS
AAT 1 (AGGCCT)           2   1039  1599 (43.5)  1039 2638
                             2638  1039 (28.3)     1 1039
                                   1035 (28.2)  2638 3673
AAT 2 (GACGTC)           1   1406  2267 (61.7)  1406 3673
                                   1406 (38.3)     1 1406
ACC 1 (GTVWAC)           1   3151  3151 (85.8)     1 3151
                                    522 (14.2)  3151 3673
ACC 2 (CGCG)             2    100  2930 (79.8)   100 3030
                             3030   643 (17.5)  3030 3673
                                    100 ( 2.7)     1  100
ACC 3 (TCCGGA)           2    304  2035 (55.4)  1638 3673
                             1638  1334 (36.3)   304 1638
                                    304 ( 8.3)     1  304
ACY 1 (GPCGQC)           2   1406  1454 (39.6)  2219 3673
                             2219  1406 (38.3)     1 1406
                                    813 (22.1)  1406 2219
AFL 3 (ACPQGT)           2   1104  1421 (38.7)  1104 2525
                             2525  1148 (31.3)  2525 3673
                                   1104 (30.1)     1 1104
AHA 2 (GPCGQC)           2   1406  1454 (39.6)  2219 3673
                             2219  1406 (38.3)     1 1406
                                    813 (22.1)  1406 2219
ALU 1 (AGCT)            11    129   699 (19.0)  1809 2508
                              465   538 (14.6)  1071 1609
                              963   505 (13.7)  3168 3673
                             1002   498 (13.6)   465  963
                             1071   362 ( 9.9)  2806 3168
                             1609   336 ( 9.1)   129  465
                             1809   220 ( 6.0)  2586 2806
                             2508   200 ( 5.4)  1609 1809
                             2586   129 ( 3.5)     1  129
                             2806    78 ( 2.1)  2508 2586
                             3168    69 ( 1.9)  1002 1071
                                     39 ( 1.1)   963 1002
APA 1 (GGGCCC)           1   3204  3204 (87.2)     1 3204
                                    469 (12.8)  3204 3673
ASU 2 (TTCGAA)           2    360  2624 (71.4)  1049 3673
                             1049   689 (18.8)   360 1049
                                    360 ( 9.8)     1  360
AVA 1 (CQCGPG)           2    246  2178 (59.3)   246 2424
                             2424  1249 (34.0)  2424 3673
                                    246 ( 6.7)     1  246
AVA 2 (GGRCC)            9     19  1678 (45.7)  1311 2989
                              297   613 (16.7)  3060 3673
                              726   429 (11.7)   297  726
                              894   278 ( 7.6)    19  297
                             1006   195 ( 5.3)  1006 1201
                             1201   168 ( 4.6)   726  894
                             1311   112 ( 3.0)   894 1006
                             2989   110 ( 3.0)  1201 1311
                             3060    71 ( 1.9)  2989 3060
                                     19 ( 0.5)     1   19
AVA 3 (ATGCAT)           1   3112  3112 (84.7)     1 3112
                                    561 (15.3)  3112 3673
BAL 1 (TGGCCA)           2   1029  2078 (56.6)  1029 3107
                             3107  1029 (28.0)     1 1029
                                    566 (15.4)  3107 3673
BAM H1 (GGATCC)          1   2855  2855 (77.7)     1 2855
                                    818 (22.3)  2855 3673
BAN 1 (GGQPCC)           9    788   788 (21.5)     1  788
                             1262   544 (14.8)  2594 3138
                             1393   535 (14.6)  3138 3673
                             1796   474 (12.9)   788 1262
                             2069   403 (11.0)  1393 1796
                             2132   375 (10.2)  2219 2594
                             2219   273 ( 7.4)  1796 2069
                             2594   131 ( 3.6)  1262 1393
                             3138    87 ( 2.4)  2132 2219
                                     63 ( 1.7)  2069 2132
BAN 2 (GPGCQC)           1   3204  3204 (87.2)     1 3204
                                    469 (12.8)  3204 3673
BBV 1 (GCTGC)           10    280   855 (23.3)  1624 2479
                              961   681 (18.5)   280  961
                             1072   414 (11.3)  1072 1486
                             1486   402 (10.9)  2971 3373
                             1624   300 ( 8.2)  3373 3673
                             2479   280 ( 7.6)     1  280
                             2506   272 ( 7.4)  2506 2778
                             2778   193 ( 5.3)  2778 2971
                             2971   138 ( 3.8)  1486 1624
                             3373   111 ( 3.0)   961 1072
                                     27 ( 0.7)  2479 2506
BCL 1 (TGATCA)           3    594  1548 (42.1)   774 2322
                              774  1351 (36.8)  2322 3673
                             2322   594 (16.2)     1  594
                                    180 ( 4.9)   594  774
BGL 1 (GCCNNNNNGG        1   2597  2597 (70.7)     1 2597
                                   1076 (29.3)  2597 3673
BGL 2 (AGATCT)           3    420  1344 (36.6)   420 1764
                             1764  1270 (34.6)  2403 3673
                             2403   639 (17.4)  1764 2403
                                    420 (11.4)     1  420
BIN 1 (GGATC)            7    340   995 (27.1)  1860 2855
                              941   817 (22.2)  2856 3673
                             1684   743 (20.2)   941 1684
                             1723   601 (16.4)   340  941
                             1860   340 ( 9.3)     1  340
                             2855   137 ( 3.7)  1723 1860
                             2856    39 ( 1.1)  1684 1723
                                      1 ( 0.0)  2855 2856
BSM 1 (GAATGC)           1   1134  2539 (69.1)  1134 3673
                                   1134 (30.9)     1 1134
BSP 1286 (G2GC3C)        6   1294  1294 (35.2)     1 1294
                             1795   872 (23.7)  2131 3003
                             2068   501 (13.6)  1294 1795
                             2131   469 (12.8)  3204 3673
                             3003   273 ( 7.4)  1795 2068
                             3204   201 ( 5.5)  3003 3204
                                     63 ( 1.7)  2068 2131
BSP M1 (ACCTGC)          3     62  2936 (79.9)   737 3673
                              376   361 ( 9.8)   376  737
                              737   314 ( 8.5)    62  376
                                     62 ( 1.7)     1   62
BSP M2 (TCCGGA)          2    304  2035 (55.4)  1638 3673
                             1638  1334 (36.3)   304 1638
                                    304 ( 8.3)     1  304
BST N1 (CCRGG)          23     32   380 (10.3)  3105 3485
                              337   339 ( 9.2)   490  829
490 310 ( 8.4) 2542 2852 829 305 ( 8.3) 32 337 898 267 ( 7.3) 1921 2188 953 234 ( 6.4) 1243 1477 1018 225 ( 6.1) 1018 1243 1243 188 ( 5.1) 3485 3673 1477 165 ( 4.5) 1510 1675 1510 153 ( 4.2) 337 490 1675 150 ( 4.1) 2188 2338 1780 147 ( 4.0) 2338 2485 1846 117 ( 3.2) 2940 3057 1921 105 ( 2.9) 1675 1780 2188 88 ( 2.4) 2852 2940 2338 75 ( 2.0) 1846 1921 2485 69 ( 1.9) 829 898 2542 66 ( 1.8) 1780 1846 2852 65 ( 1.8) 953 1018 2940 57 ( 1.6) 2485 2542 3057 55 ( 1.5) 898 953 3105 48 ( 1.3) 3057 3105 3485 33 ( 0.9) 1477 1510 32 ( 0.9) 1 32 BST X1 (CCANNNNNN 3 621 1900 (51.7) 1773 3673 1239 621 (16.9) 1 621 1773 618 (16.8) 621 1239 # CFR 1 (QGGCCP) 4 CLA 1 (ATCGAT) l DDE 1 (CTNAG) l3 EAE 1 (QGGCCP) 4 ECO 0109 (PGGNCCQ 3 ECO R5 (GATATC) l FNU 4H1 (GCNGC) 24 SITES 1029 1248 2604 3107 3398 41 313 417 1382 1423 1482 1526 1586 2000 2454 2583 2822 3518 1029 1248 2604 3107 893 1867 3204 203 10 178 280 961 984 133 FRAGMENTS 534 (14.5) 1356 (36.9) 1029 (28.0) 566 (15.4) 503 (13.7) 219 ( 6.0) 3398 (92.5) 275 ( 7.5) 965 (26.3) 696 (18.9) 454 (12.4) 414 (11.3) 272 ( 7.4) 239 ( 6.5) 155 ( 4.2) 129 ( 3.5) 104 ( 2.8) 60 ( 1.6) 59 ( 1.6) 44 ( 1.2) 41 ( 1.1) 41 ( 1.1) 1356 (36.9) 1029 (28 0) 566 (15 4) 503 (13.7) 219 ( 6.0) 1337 (36.4) 974 (26.5) 893 (24.3) 469 (12.8) 3470 (94.5) 203 ( 5.5) 855 (23 3) 681 (18.5) 300 ( 8.2) 234 ( 6.4) 198 ( 5.4) 178 ( 4.8) 1239 1248 3107 2604 1029 3398 417 2822 2000 1586 2583 3518 2454 313 1526 1423 1482 1382 1248 3107 2604 1029 1867 893 3204 203 1624 280 3373 3136 1250 1072 FRAGMENT ENDS 1773 2604 1029 3673 3107 1248 3398 3673 1382 3518 2454 2000 313 2822 3673 2583 417 1586 1482 1526 1423 41 2604 1029 3673 3107 1248 3204 1867 893 3673 3673 203 2479 961 3673 3370 1448 1250 134 # SITES FRAGMENTS FRAGMENT ENDS 1072 172 ( 4.7) 2606 2778 1250 165 ( 4.5) 2971 3136 1448 150 ( 4.1) 2818 2968 1486 138 ( 3.8) 1486 1624 1624 102 ( 2.8) 178 280 2479 88 ( 2.4) 984 1072 2506 88 ( 2.4) 10 98 2579 80 ( 2.2) 98 178 2606 73 ( 2.0) 2506 2579 2778 38 ( 1.0) 1448 1486 2781 27 ( 
0.7) 2579 2606 2792 27 ( 0.7) 2479 2506 2818 26 ( 0.7) 2792 2818 2968 23 ( 0.6) 961 984 2971 11 ( 0.3) 2781 2792 3136 10 ( 0.3) 1 10 3370 3 ( 0.1) 3370 3373 3373 3 ( 0.1) 2968 2971 3 ( 0.1) 2778 2781 FNU 02 (CGCG) 2 100 2930 (79.8) 100 3030 3030 643 (17.5) 3030 3673 100 ( 2.7) 1 100 FOX 1 (GGATG) 21 75 738 (20.1) 907 1645 136 526 (14.3) 2693 3219 640 504 (13.7) 136 640 852 281 ( 7.7) 3392 3673 886 212 ( 5.8) 640 852 907 207 ( 5.6) 2038 2245 1645 174 ( 4.7) 2497 2671 1651 163 ( 4.4) 3219 3382 1804 153 ( 4.2) 1651 1804 1875 138 ( 3.8) 2359 2497 2005 130 ( 3.5) 1875 2005 2038 86 ( 2.3) 2245 2331 2245 75 ( 2.0) 1 75 2331 71 ( 1.9) 1804 1875 2359 61 ( 1.7) 75 136 2497 34 ( 0.9) 852 886 2671 33 ( 0.9) 2005 2038 2693 28 ( 0.8) 2331 2359 3219 22 ( 0.6) 2671 2693 3382 21 ( 0.6) 886 907 3392 10 ( 0.3) 3382 3392 6 ( 0.2) 1645 1651 GDI 2 (QGGCCG) 2 1248 1356 (36.9) 1248 2604 2604 1248 (34.0) 1 1248 1069 (29.1) 2604 3673 HAE 1 (RGGCCR) 8 111 961 (26.2) 1677 2638 238 791 (21.5) 238 1029 135 # SITES FRAGMENTS FRAGMENT ENDS 1029 638 (17.4) 1039 1677 1039 469 (12.8) 2638 3107 1677 365 ( 9.9) 3308 3673 2638 201 ( 5.5) 3107 3308 3107 127 ( 3.5) 111 238 3308 111 ( 3.0) 1 111 10 ( 0.3) 1029 1039 HAE 2 (PGCGCQ) 3 2219 2219 (60.4) 1 2219 2476 717 (19.5) 2956 3673 2956 480 (13.1) 2476 2956 257 ( 7.0) 2219 2476 HAE 3 (GGCC) 21 54 531 (14.5) 239 770 112 364 ( 9.9) 3309 3673 239 317 ( 8.6) 2791 3108 770 302 ( 8.2) 2192 2494 1030 260 ( 7.1) 770 1030 1040 213 ( 5.8) 1979 2192 1249 209 ( 5.7) 1040 1249 1447 198 ( 5.4) 1249 1447 1513 190 ( 5.2) 1678 1868 1678 165 ( 4.5) 1513 1678 1868 152 ( 4.1) 2639 2791 1979 127 ( 3.5) 112 239 2192 111 ( 3.0) 2494 2605 2494 111 ( 3.0) 1868 1979 2605 97 ( 2.6) 3108 3205 2639 77 ( 2.1) 3232 3309 2791 66 ( 1.8) 1447 1513 3108 58 ( 1.6) 54 112 3205 54 ( 1.5) 1 54 3232 34 ( 0.9) 2605 2639 3309 27 ( 0.7) 3205 3232 10 ( 0.3) 1030 1040 HGA 1 (GACGC) 3 1537 1537 (41.8) 1 1537 2677 1140 (31.0) 1537 2677 3149 524 (14.3) 3149 3673 472 (12.9) 2677 3149 HGI A1 (GRGCRC) 1 
3003 3003 (81.8) 1 3003 670 (18.2) 3003 3673 HGI C1 (GGQPCC) 9 788 788 (21.5) 1 788 1262 544 (14.8) 2594 3138 1393 535 (14.6) 3138 3673 1796 474 (12.9) 788 1262 2069 403 (11.0) 1393 1796 2132 375 (10.2) 2219 2594 # HGI J2 (GPGCQC) 1 HHA 1 (GCGC) 7 HINC 2 (GTQPAC) 2 HINF 1 (GANTC) 14 HPA 2 (CCGG) 16 SITES 2219 2594 3138 3204 101 1057 2220 2477 2957 3029 3515 1187 2054 193 216 363 413 543 1400 1932 2115 2709 3092 3178 3227 3288 3642 247 295 305 1009 1363 1492 1639 1649 1705 1800 1873 2395 2470 2598 136 FRAGMENTS 273 ( 7 131 ( 3 87 ( 2 63 ( 1 3204 (87 469 (12 1163 (31 956 (26 486 (13 480 (13 257 ( 7 158 ( 4 101 ( 2 72 ( 2 1619 (44. 1187 (32 867 (23 857 (23. 594 (16. 532 (14. 383 (10. 354 ( 9 193 ( 5 183 ( 5 147 ( 4 130 ( 3 86 ( 2 61 ( 1 50 ( 1 49 ( 1 31 ( 0 23 ( 0 704 (19. 546 (14. 522 (14. 388 (10. 354 ( 9. 247 ( 6 147 ( 4 141 ( 3 129 ( 3 128 ( 3 95 ( 2 75 ( 2 73 ( 2 56 ( 1 .4) .6) .4) .7) .2) .8) .7) .0) .2) .1) .0) .3) .0) 1) .3) .6) 1796 1262 2132 2069 3204 1057 101 3029 2477 2220 3515 2957 2054 1187 543 2115 1400 2709 3288 1932 216 413 3092 3227 363 3178 3642 193 305 2598 1873 3285 1009 1492 3144 1363 2470 1705 2395 1800 1649 FRAGMENT ENDS 2069 1393 2219 2132 3204 3673 2220 1057 3515 2957 2477 3673 101 3029 3673 1187 2054 1400 2709 1932 3092 3642 193 2115 363 543 3178 3288 413 3227 3673 216 1009 3144 2395 3673 1363 247 1639 3285 1492 2598 1800 2470 1873 1705 137 # SITES FRAGMENTS FRAGMENT ENDS 3144 48 ( 1.3) 247 295 3285 10 ( 0.3) 1639 1649 10 ( 0.3) 295 305 HPH 1 (GGTGA) 16 29 531 (14.5) 2738 3269 123 435 (11.8) 1021 1456 379 404 (11.0) 3269 3673 709 365 ( 9.9) 2373 2738 979 330 ( 9.0) 379 709 1021 270 ( 7.4) 709 979 1456 256 ( 7.0) 123 379 1549 223 ( 6.1) 1667 1890 1667 217 ( 5.9) 2011 2228 1890 121 ( 3.3) 1890 2011 2011 118 ( 3.2) 1549 1667 2228 114 ( 3.1) 2228 2342 2342 94 ( 2.6) 29 123 2373 93 ( 2.5) 1456 1549 2738 42 ( 1.1) 979 1021 3269 31 ( 0.8) 2342 2373 29 ( 0.8) 1 29 M80 2 (GAAGA) 15 232 528 (14.4) 1024 1552 353 475 (12 9) 3198 3673 391 459 (12.5) 1555 
2014 514 427 (11.6) 2771 3198 902 388 (10.6) 514 902 973 380 (10.3) 2391 2771 1024 232 ( 6.3) 1 232 1552 230 ( 6.3) 2161 2391 1555 123 ( 3.3) 391 514 2014 121 ( 3.3) 232 353 2093 79 ( 2.2) 2014 2093 2161 71 ( 1.9) 902 973 2391 68 ( 1.9) 2093 2161 2771 51 ( 1.4) 973 1024 3198 38 ( 1.0) 353 391 3 ( 0.1) 1552 1555 MNL 1 (CCTC) 44 17 317 ( 8.6) 267 584 43 303 ( 8.2) 1081 1384 241 209 ( 5.7) 2180 2389 267 203 ( 5.5) 2981 3184 584 198 ( 5.4) 43 241 678 192 ( 5.2) 850 1042 809 188 ( 5.1) 1692 1880 845 172 ( 4.7) 3501 3673 850 148 ( 4.0) 2427 2575 1042 144 ( 3.9) 3357 3501 1081 131 ( 3.6) 678 809 1384 129 ( 3.5) 2024 2153 1417 124 ( 3.4) 2637 2761 1420 113 ( 3.1) 3244 3357 1502 104 ( 2.8) 1588 1692 1525 96 ( 2.6) 1880 1976 MST 2 (CCTNAGG) NAE 1 (GCCGGC) NAR 1 (GGCGCC) NCI l (CCSGG) SITES 1547 1558 1588 1692 1880 1976 1992 2002 2017 2024 2153 2180 2389 2427 2575 2637 2761 2826 2859 2894 2977 2981 3184 3234 3239 3244 3357 3501 1381 2597 2219 246 247 1362 1648 1705 1800 1872 2395 3143 138 FRAGMENTS 94 ( 2.6) 83 ( 2.3) 82 ( 2.2) 65 ( 1.8) 62 ( 1.7) 50 ( 1.4) 39 ( 1.1) 38 ( 1.0) 36 ( 1.0) 35 ( 1.0) 33 ( 0.9) 33 ( 0.9) 30 ( 0.8) 27 ( 0.7) 26 ( 0.7) 26 ( 0.7) 23 ( 0.6) 22 ( 0.6) 17 ( 0.5) 16 ( 0.4) 15 ( 0.4) 11 ( 0.3) 10 ( 0.3) 7 ( 0.2) 5 ( 0.1) 5 ( 0.1) 5 ( 0.1) 4 ( 0.1) 3 ( 0.1) 2292 (62.4) 1381 (37.6) 2597 (70 7) 1076 (29 3) 2219 (60 4) 1454 (39.6) 1115 (30.4) 748 (20.4) 530 (14.4) 523 (14.2) 286 ( 7.8) 246 ( 6.7) 95 ( 2.6) 72 ( 2.0) 57 ( 1.6) 1 ( 0.0) 584 2894 1420 2761 2575 3184 1042 2389 809 2859 2826 1384 1558 2153 241 17 1502 1525 1 1976 2002 1547 1992 2017 3239 3234 845 2977 1417 1381 2597 2219 247 2395 3143 1872 1362 1705 1800 1648 246 FRAGMENT ENDS 678 2977 1502 2826 2637 3234 1081 2427 845 2894 2859 1417 1588 2180 267 43 1525 1547 17 1992 2017 1558 2002 2024 3244 3239 850 2981 1420 3673 1381 2597 3673 2219 3673 1362 3143 3673 2395 1648 246 1800 1872 1705 247 NCO 1 NLA 3 NLA 4 (CCATGG) ( CATG) (GGNNCC) 24 28 SITES 1452 91 175 409 449 476 502 730 805 988 994 1453 1789 
1855 1944 1982 2074 2149 2209 2602 2713 3089 3111 3122 3330 45 54 458 725 788 875 893 894 1174 1262 1368 1393 1445 1796 1868 2069 2132 2184 2219 2594 2789 2855 2988 3060 3138 139 FRAGMENTS 2221 1452 459 393 376 343 336 234 228 208 183 111 91 89 75 75 66 6O 38 27 26 22 (60. (39. (12 FJH OCD OOOOOHHHHNMNNNNwmmmO‘mKO Hra AAAAAAAAAAAAAAAAAAAAAAAAA HHHHHHHHNNNNNNWWWWUWQQQWOH .5) .7) .2) .3) .1) .4) .2) .7) .0) .0) .5) .4) .3) .0) .0) .8) .6) .0) .7) .7) .6) .3) .2) 1452 994 2209 2713 3330 1453 175 502 3122 805 2602 1982 1855 2074 730 1789 2149 409 1944 449 476 3089 3111 988 54 2219 1445 894 458 3204 1868 2594 2855 3436 3557 1262 1174 788 3060 2988 1796 3138 2789 2069 725 2132 1393 2184 FRAGMENT ENDS 3673 1452 1453 2602 3089 3673 1789 409 730 3330 988 2713 2074 91 1944 175 2149 805 1855 2209 449 1982 476 502 3111 3122 994 458 2594 1796 1174 725 3436 2069 2789 2988 3557 3673 1368 1262 875 3138 3060 1868 3204 2855 2132 788 2184 1445 2219 NSI NSP NSP PPU PST PVU PVU RRU RSA 1 (ATGCAT) BZ (CVGCWG) C1 (PCATGQ) M1 (PGGRCCQ) 1 (CTGCAG) 1 (CGATCG) 2 (CAGCTG) 1 (AGTACT) 1 (GTAC) SITES 3204 3436 3557 3112 11 962 2507 2585 2779 2805 987 893 3156 1218 2838 962 2507 2585 2805 1356 169 590 991 1357 2335 140 FRAGMENTS 25 ( 0.7) 18 ( 0.5) 9 ( 0.2) 1 ( 0.0) 3112 (84.7) 561 (15.3) 1545 (42.1) 951 (25.9) 868 (23.6) 194 ( 5.3) 78 ( 2.1) 26 ( 0.7) 11 ( 0.3) 2686 (73.1) 987 (26.9) 2780 (75.7) 893 (24 3) 3156 (85.9) 517 (14.1) 1620 (44.1) 1218 (33.2) 835 (22.7) 1545 (42.1) 962 (26.2) 868 (23.6) 220 ( 6.0) 78 ( 2.1) 2317 (63.1) 1356 (36.9) 978 (26.6) 871 (23.7) 421 (11.5) 413 (11.2) 401 (10.9) 1368 875 45 893 3112 962 2805 2585 2507 2779 987 893 3156 1218 2838 962 2805 2585 2507 1356 1357 2748 169 2335 590 FRAGMENT ENDS 1393 893 54 894 3112 3673 2507 962 3673 2779 2585 2805 11 3673 987 3673 893 3156 3673 2838 1218 3673 2507 962 3673 2805 2585 3673 1356 2335 3619 590 2748 991 141 # SITES FRAGMENTS FRAGMENT ENDS 2748 366 (10.0) 991 1357 3619 169 ( 4.6) 1 169 3637 36 ( 1.0) 3637 3673 18 ( 0.5) 
3619 3637 SAU 1 (CCTNAGG) 1 1381 2292 (62.4) 1381 3673 1381 (37.6) 1 1381 SAU 3A (GATC) 24 5 625 (17.0) 3048 3673 94 452 (12.3) 1233 1685 341 384 (10.5) 1939 2323 421 277 ( 7.5) 942 1219 473 247 ( 6.7) 94 341 517 219 ( 6.0) 2404 2623 595 216 ( 5.9) 2623 2839 775 187 ( 5.1) 2856 3043 942 180 ( 4.9) 595 775 1219 167 ( 4.5) 775 942 1233 96 ( 2.6) 1765 1861 1685 89 ( 2.4) 5 94 1723 80 ( 2.2) 341 421 1765 78 ( 2.1) 1861 1939 1861 78 ( 2.1) 517 595 1939 57 ( 1.6) 2347 2404 2323 52 ( 1.4) 421 473 2347 44 ( 1.2) 473 517 2404 42 ( 1.1) 1723 1765 2623 38 ( 1.0) 1685 1723 2839 24 ( 0.7) 2323 2347 2856 17 ( 0.5) 2839 2856 3043 14 ( 0.4) 1219 1233 3048 5 ( 0.1) 3043 3048 5 ( 0.1) 1 5 SAU 96 (GGNCC) 19 19 468 (12.7) 3205 3673 54 429 (11.7) 297 726 297 355 ( 9.7) 1513 1868 726 302 ( 8.2) 2191 2493 894 297 ( 8.1) 2493 2790 1006 243 ( 6.6) 54 297 1201 213 ( 5.8) 1978 2191 1311 199 ( 5.4) 2790 2989 1446 195 ( 5.3) 1006 1201 1513 168 ( 4.6) 726 894 1868 144 ( 3.9) 3060 3204 1978 135 ( 3.7) 1311 1446 2191 112 ( 3.0) 894 1006 2493 110 ( 3.0) 1868 1978 2790 110 ( 3.0) 1201 1311 2989 71 ( 1.9) 2989 3060 3060 67 ( 1.8) 1446 1513 3204 35 ( 1.0) 19 54 SCA 1 SCR F1 SDU 1 SFA N1 (AGTACT) (CCNGG) (G2GC3C) (GATGC ) 32 12 SITES 3205 1356 32 246 247 337 490 829 898 953 1018 1243 1362 1477 1510 1648 1675 1705 1780 1800 1846 1872 1921 2188 2338 2395 2485 2542 2852 2940 3057 3105 3143 3485 1294 1795 2068 2131 3003 3204 74 277 689 1059 1345 142 FRAGMENTS 2317 1356 342 339 1294 872 501 469 273 201 63 971 412 370 299 286 (63. (36. AAA/NAAAAAAAAAAAAAAAAAAAAAAAAAAAAA OOOOOOOI—‘I—‘HHHHHHHNMNNWWWWQDWU’IQQCDQQ (35 (23 (13 (12 ( 7 ( 5 ( 1 (26 ( 7 .2) .7) .6) .8) .4) .5) .7) .4) (11. (10. ( 8. 
.8) 1) 1) 1 3204 1356 3143 490 2542 1921 1018 3485 337 2188 1510 1243 2940 1362 2395 247 2852 1705 829 953 2485 2338 898 1872 3057 1800 3105 1477 1675 1648 1846 1780 246 2131 1294 3204 1795 3003 2068 2430 277 689 1345 1059 FRAGMENT ENDS 19 3205 3673 1356 3485 829 2852 2188 1243 246 3673 490 2338 1648 1362 3057 1477 2485 337 2940 1780 898 1018 2542 2395 953 1921 3105 1846 3143 1510 1705 1675 1872 1800 247 1294 3003 1795 3673 2068 3204 2131 3401 689 1059 1644 1345 SMA (CCCGGG) SPE (ACTAGT) SSP (AATATT) STU (AGGCCT) STY (CCRRGG) TAQ (TCGA) TTHlll 1 (GACNNNG TTHlll 2 (CCAPCA) 1 9 SITES 1644 1833 1876 2006 2202 2430 3401 246 1097 2283 1039 2638 1032 1146 1452 1680 361 471 825 969 1050 2837 3101 3399 140 260 143 FRAGMENTS 272 ( 7 228 ( 6 203 ( 5 196 ( 5 189 ( 5 130 ( 3 74 ( 2 43 ( 1 3427 (93 246 ( 6 2576 (70. 1097 (29. 2283 (62 1390 (37 1599 (43 1039 (28 1035 (28 1993 (54. 1032 (28. 306 ( 8 228 ( 6 114 ( 3 1787 (48 361 ( 9 354 ( 9 298 ( 8 274 ( 7 264 ( 7 144 ( 3 110 ( 3 81 ( 2 3533 (96 140 ( 3 1173 (31. 
.4) .5) .3) .1) .5) .0) .2) .3) .7) 1) 9) .2) .8) .5) .3) .2) 3) .3) .2) .1) .7) .8) .6) .1) .5) .9) .0) .2) .2) .8) 9) 3401 2202 74 2006 1644 1876 1 1833 246 1097 2283 1039 2638 1680 1146 1452 1032 1050 471 3101 3399 2837 825 361 969 140 1713 FRAGMENT ENDS 3673 2430 277 2202 1833 2006 74 1876 3673 246 3673 1097 2283 3673 2638 1039 3673 3673 1032 1452 1680 1146 2837 361 825 3399 3673 3101 969 471 1050 3673 140 2886 144 # SITES FRAGMENTS FRAGMENT ENDS 754 494 (13.4) 260 754 1204 450 (12.3) 754 1204 1599 395 (10.8) 1204 1599 1713 341 ( 9.3) 2931 3272 2886 312 ( 8.5) 3361 3673 2931 260 ( 7.1) 1 260 3272 114 ( 3.1) 1599 1713 3361 89 ( 2.4) 3272 3361 45 ( 1.2) 2886 2931 XBA 1 (TCTAGA) l 2898 2898 (78.9) 1 2898 775 (21.1) 2898 3673 XHO 2 (PGATCQ) 8 340 818 (22.3) 2855 3673 420 743 (20.2) 941 1684 941 639 (17.4) 1764 2403 1684 521 (14.2) 420 941 1722 452 (12.3) 2403 2855 1764 340 ( 9.3) 1 340 2403 80 ( 2.2) 340 420 2855 42 ( 1.1) 1722 1764 38 ( 1.0) 1684 1722 XMN 1 (GAANNNNTTC 2 1130 1293 (35.2) 2380 3673 2380 1250 (34.0) 1130 2380 1130 (30.8) 1 1130 The following do not appear: AFL 2 AHA 3 A08 1 APA L1 ASP718 1 AVR 2 BSPH 1 888 H2 BST E2 DRA 3 ECO R1 HIND 3 HPA 1 KPN 1 MLU 1 MST 1 NDE 1 NHE 1 NOT 1 NRU 1 PFL M1 RSR 2 SAC 1 SAC 2 SAL 1 SFI 1 SNA 1 SNA Bl SPH 1 XHO 1 XMA 3 APPENDIX B RESTRICTION SITES FOR TYPE III HEXOKINASE CDNA # SITES FRAGMENTS FRAGMENT ENDS AAT 1 (AGGCCT) 2 1210 1824 (49.4) 1868 3692 1868 1210 (32.8) 1 1210 658 (17.8) 1210 1868 AAT 2 (GACGTC) 1 2534 2534 (68.6) 1 2534 1158 (31.4) 2534 3692 ACC 1 (GTVWAC) 2 807 1916 (51.9) 1776 3692 1776 969 (26.2) 807 1776 807 (21.9) 1 807 ACC 2 (CGCG) 4 873 1952 (52.9) 1740 3692 1236 873 (23.6) 1 873 1258 482 (13.1) 1258 1740 1740 363 ( 9.8) 873 1236 22 ( 0.6) 1236 1258 ACC 3 (TCCGGA) 2 295 1875 (50.8) 295 2170 2170 1522 (41.2) 2170 3692 295 ( 8.0) 1 295 ACY 1 (GPCGQC) 2 2534 2534 (68.6) 1 2534 3453 919 (24.9) 2534 3453 239 ( 6.5) 3453 3692 AFL 3 (ACPQGT) 4 389 1486 (40.2) 2206 3692 1130 741 (20.1) 389 1130 1739 609 
(16.5) 1130 1739 2206 467 (12.6) 1739 2206 389 (10.5) 1 389 AHA 2 (GPCGQC) 2 2534 2534 (68.6) 1 2534 3453 919 (24.9) 2534 3453 239 ( 6.5) 3453 3692 ALU 1 (AGCT) 26 16 684 (18.5) 2743 3427 129 615 (16.7) 2044 2659 165 282 ( 7.6) 1270 1552 217 207 ( 5.6) 565 772 145 146 # SITES FRAGMENTS FRAGMENT ENDS 364 195 ( 5.3) 1030 1225 493 193 ( 5.2) 1626 1819 565 173 ( 4.7) 857 1030 772 147 ( 4.0) 217 364 857 137 ( 3.7) 3555 3692 1030 129 ( 3.5) 364 493 1225 120 ( 3.3) 1924 2044 1262 113 ( 3.1) 16 129 1270 105 ( 2.8) 1819 1924 1552 85 ( 2.3) 772 857 1557 72 ( 2.0) 493 565 1567 69 ( 1.9) 3427 3496 1626 59 ( 1.6) 3496 3555 1819 59 ( 1.6) 1567 1626 1924 52 ( 1.4) 165 217 2044 45 ( 1.2) 2698 2743 2659 39 ( 1.1) 2659 2698 2698 37 ( 1.0) 1225 1262 2743 36 ( 1.0) 129 165 3427 16 ( 0.4) 1 16 3496 10 ( 0.3) 1557 1567 3555 8 ( 0.2) 1262 1270 5 ( 0.1) 1552 1557 APA 1 (GGGCCC) 1 2923 2923 (79.2) 1 2923 769 (20.8) 2923 3692 APA L1 (GTGCAC) 2 1253 1503 (40.7) 1253 2756 2756 1253 (33.9) 1 1253 936 (25.4) 2756 3692 ASP718 1 (GGTACC) 3 753 1550 (42.0) 2142 3692 816 1326 (35.9) 816 2142 2142 753 (20.4) 1 753 63 ( 1.7) 753 816 ASU 2 (TTCGAA) 1 1368 2324 (62.9) 1368 3692 1368 (37.1) 1 1368 AVA 1 (CQCGPG) 3 110 2080 (56.3) 110 2190 2190 1048 (28.4) 2190 3238 3238 454 (12.3) 3238 3692 110 ( 3.0) 1 110 AVA 2 (GGRCC) 18 319 555 (15.0) 1148 1703 674 484 (13.1) 2911 3395 709 356 ( 9.6) 2555 2911 869 355 ( 9.6) 319 674 962 319 ( 8.6) 1 319 AVA 3 (ATGCAT) 2 AVR 2 (CCTAGG) 3 BAL 1 (TGGCCA) 4 BAM H1 (GGATCC) 1 BAN 1 (GGQPCC) 12 BAN 2 (GPGCQC) 7 SITES 1049 1070 1148 1703 1934 2080 2288 2498 2516 2555 2911 3395 3619 535 1972 476 926 1706 485 1135 1348 3107 3358 333 414 753 777 816 1375 1541 1713 2142 2229 2265 2962 902 1202 FRAGMENTS 231 ( 6.3) 224 ( 6.1) 210 ( 5.7) 208 ( 5.6) 160 ( 4.3) 146 ( 4.0) 93 ( 2.5) 87 ( 2.4) 78 ( 2.1) 73 ( 2.0) 39 ( 1.1) 35 ( 0.9) 21 ( 0.6) 18 ( 0.5) 1720 (46.6) 1437 (38.9) 535 (14.5) 1986 (53.8) 780 (21.1) 476 (12.9) 450 (12.2) 1759 (47.6) 650 (17.6) 585 (15.8) 485 (13.1) 213 ( 
5.8) 3358 (91.0) 334 ( 9.0) 730 (19.8) 697 (18.9) 559 (15.1) 429 (11.6) 339 ( 9.2) 333 ( 9.0) 172 ( 4.7) 166 ( 4.5) 87 ( 2.4) 81 ( 2.2) 39 ( 1.1) 36 ( 1.0) 24 ( 0.7) 1118 (30.3) 902 (24.4) 1703 3395 2288 2080 709 1934 869 962 1070 3619 2516 674 1049 2498 1972 535 1706 926 476 1348 485 3107 1135 3358 2962 2265 816 1713 414 1541 1375 2142 333 777 2229 753 1805 1 FRAGMENT ENDS 1934 3619 2498 2288 869 2080 962 1049 1148 3692 2555 709 1070 2516 3692 1972 535 3692 1706 476 926 3107 1135 3692 485 1348 3358 3692 3692 2962 1375 2142 753 333 1713 1541 2229 414 816 2265 777 2923 902 BBV 1 (GCTGC) 24 BCL 1 (TGATCA) 3 BGL 1 (GCCNNNNNGG 1 BIN 1 (GGATC) 8 SITES 1224 1625 1805 2923 3259 130 218 221 491 507 514 855 1084 1232 1263 1290 1325 1497 1565 1643 1675 1922 1981 2616 2802 3299 3327 3402 3553 1603 1826 2332 1500 227 458 473 2106 2879 3358 3359 3532 FRAGMENTS 433 401 336 300 180 22 1603 1360 506 223 2192 1500 1633 773 479 231 227 173 160 15 H04 AAAAAA connect—- H14 AAAAA"AAAAAAAAAAAAAAAAAAA oooooooor—u—smewwehpma‘mqqu (43 (13 (59. .6) (40 (44 AAA/NA oobhmox .7) .1) .1) .9) .6) .4) (36. .7) ( 6. 0) 4) .2) (20. (13. 
.3) 0) .7) .3) .4) .0) 3259 1224 2923 902 1625 1202 1981 2802 514 221 1675 855 2616 1325 3402 1084 3553 130 1565 3327 1497 1922 1290 1643 1232 3299 1263 491 507 218 2332 1826 1603 1500 473 2106 2879 227 1 3359 3532 458 3358 FRAGMENT ENDS 3692 1625 3259 1202 1805 1224 2616 3299 855 491 1922 1084 2802 1497 3553 1232 3692 130 218 1643 3402 1565 1981 1325 1675 1263 3327 1290 507 514 221 1603 3692 2332 1826 3692 1500 2106 2879 3358 458 227 3532 3692 473 3359 # BSM 1 (GAATGC) 1 BSP 1286 (G2GC3C) 16 BSPH 1 (TCATGA) 1 BSP M1 (ACCTGC) 4 BSP M2 (TCCGGA) 2 BST E2 (GGTNACC) 1 BST N1 (CCRGG) 39 SITES 145 413 614 776 902 985 1202 1224 1253 1625 1712 1805 2187 2756 2923 2963 3259 3463 2268 2607 3053 3114 295 2170 203 54 137 353 439 446 530 628 149 FRAGMENTS 3547 (96.1) 145 ( 3.9) 569 (15.4) 433 (11.7) 413 (11.2) 382 (10.3) 372 (10.1) 296 ( 8.0) 217 ( 5.9) 201 ( 5.4) 167 ( 4.5) 162 ( 4.4) 126 ( 3.4) 93 ( 2.5) 87 ( 2.4) 83 ( 2.2) 40 ( 1.1) 29 ( 0.8) 22 ( 0.6) 3463 (93.8) 229 ( 6.2) 2268 (61.4) 578 (15.7) 446 (12.1) 339 ( 9.2) 61 ( 1.7) 1875 (50.8) 1522 (41.2) 295 ( 8.0) 3489 (94.5) 203 ( 5.5) 354 ( 9.6) 228 ( 6.2) 216 ( 5.9) 209 ( 5.7) 208 ( 5.6) 178 ( 4.8) 170 ( 4.6) 145 2187 3259 1805 1253 2963 985 413 2756 614 776 1712 1625 902 2923 1224 1202 3463 3114 2607 2268 3053 295 2170 203 1937 3095 137 2886 3484 781 2716 FRAGMENT ENDS 3692 145 2756 3692 413 2187 1625 3259 1202 614 2923 776 902 1805 1712 985 2963 1253 1224 3463 3692 2268 3692 3053 2607 3114 2170 3692 295 3692 203 2291 3323 353 3095 3692 959 2886 150 # SITES FRAGMENTS FRAGMENT ENDS 665 162 ( 4.4) 1124 1286 781 156 ( 4.2) 1538 1694 959 146 ( 4.0) 2570 2716 974 143 ( 3.9) 981 1124 981 124 ( 3.4) 1381 1505 1124 116 ( 3.1) 3332 3448 1286 116 ( 3.1) 665 781 1381 98 ( 2.7) 530 628 1505 95 ( 2.6) 1286 1381 1526 90 ( 2.4) 1802 1892 1538 86 ( 2.3) 2348 2434 1694 86 ( 2.3) 353 439 1772 84 ( 2.3) 446 530 1802 83 ( 2.2) 54 137 1892 78 ( 2.1) 1694 1772 1931 54 ( 1.5) 1 54 1937 51 ( 1.4) 2519 2570 2291 49 ( 1.3) 2434 2483 2307 41 ( 1.1) 
2307 2348 2348 39 ( 1.1) 1892 1931 2434 37 ( 1.0) 628 665 2483 36 ( 1.0) 3448 3484 2510 30 ( 0.8) 1772 1802 2519 27 ( 0.7) 2483 2510 2570 21 ( 0.6) 1505 1526 2716 16 ( 0.4) 2291 2307 2886 15 ( 0.4) 959 974 3095 12 ( 0.3) 1526 1538 3323 9 ( 0.2) 3323 3332 3332 9 ( 0.2) 2510 2519 3448 7 ( 0.2) 974 981 3484 7 ( 0.2) 439 446 6 ( 0.2) 1931 1937 BST x1 (CCANNNNNN 1 1802 1890 (51.2) 1802 3692 1802 (48.8) 1 1802 CFR 1 (QGGCCP) 11 82 1327 (35.9) 1744 3071 485 585 (15.8) 3107 3692 875 403 (10.9) 82 485 1135 390 (10.6) 485 875 1342 278 ( 7.5) 1466 1744 1348 260 ( 7.0) 875 1135 1466 207 ( 5.6) 1135 1342 1744 118 ( 3.2) 1348 1466 3071 82 ( 2.2) 1 82 3097 26 ( 0.7) 3071 3097 3107 10 ( 0.3) 3097 3107 6 ( 0.2) 1342 1348 DDE 1 (CTNAG) 20 18 566 (15.3) 3000 3566 158 517 (14.0) 167 684 167 499 (13.5) 2253 2752 151 # SITES FRAGMENTS FRAGMENT ENDS 684 332 ( 9.0) 1222 1554 988 304 ( 8.2) 684 988 1104 238 ( 6.4) 1960 2198 1200 235 ( 6.4) 1554 1789 1207 140 ( 3.8) 2752 2892 1214 140 ( 3.8) 18 158 1222 126 ( 3.4) 3566 3692 1554 116 ( 3.1) 988 1104 1789 108 ( 2.9) 2892 3000 1872 96 ( 2.6) 1104 1200 1960 88 ( 2.4) 1872 1960 2198 83 ( 2.2) 1789 1872 2253 55 ( 1.5) 2198 2253 2752 18 ( 0.5) 1 18 2892 9 ( 0.2) 158 167 3000 8 ( 0.2) 1214 1222 3566 7 ( 0.2) 1207 1214 7 ( 0.2) 1200 1207 DRA 3 (CACNNNGTG) 2 385 3266 (88.5) 426 3692 426 385 (10.4) 1 385 41 ( 1.1) 385 426 EAE 1 (QGGCCP) 11 82 1327 (35.9) 1744 3071 485 585 (15.8) 3107 3692 875 403 (10.9) 82 485 1135 390 (10.6) 485 875 1342 278 ( 7.5) 1466 1744 1348 260 ( 7.0) 875 1135 1466 207 ( 5.6) 1135 1342 1744 118 ( 3.2) 1348 1466 3071 82 ( 2.2) 1 82 3097 26 ( 0.7) 3071 3097 3107 10 ( 0.3) 3097 3107 6 ( 0.2) 1342 1348 ECO 0109 (PGGNCCQ 9 153 1368 (37.1) 1147 2515 708 595 (16.1) 2647 3242 922 555 (15.0) 153 708 1069 376 (10.2) 3242 3618 1147 214 ( 5.8) 708 922 2515 153 ( 4.1) 1 153 2647 147 ( 4.0) 922 1069 3242 132 ( 3.6) 2515 2647 3618 78 ( 2.1) 1069 1147 74 ( 2.0) 3618 3692 ECO R1 (GAATTC) 2 525 2504 (67.8) 525 3029 3029 663 (18.0) 3029 3692 525 
(14.2) 1 525 FNU 4H1 FNU D2 FOK 1 (GCNGC) (CGCG) (GGATG) # 32 21 SITES 84 130 218 221 491 507 514 855 874 1084 1232 1263 1290 1325 1497 1565 1643 1675 1922 1981 2589 2616 2802 2827 3091 3268 3299 3302 3327 3402 3419 3553 873 1236 1258 1740 298 533 668 689 863 1118 1394 1445 1457 1523 1943 2240 2296 2507 152 FRAGMENTS 608 1952 873 482 363 420 304 298 297 285 276 255 235 211 174 174 152 135 133 H AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA OOOOOOOOOOOCOOI—‘HHNNNNWWthnhU'IU'ImQQkDOA (52. (23 (13 ( 9 ( 0 H wwhpbmmmqqoomoot-I 9) .6) .1) .8) .6) .4) .2) .0) .7) .5) .4) .7) .7) .1) .7) .6) 1981 514 221 2827 1675 874 2616 3091 1325 1084 3553 3419 130 1565 3327 1497 1922 84 1290 1643 3268 1232 2589 1263 3302 2802 855 3402 491 507 3299 218 1740 1258 873 1236 1523 3388 1943 2783 1118 863 298 2296 2507 689 3068 533 3220 FRAGMENT ENDS 2589 855 491 3091 1922 1084 2802 3268 1497 1232 3692 3553 218 1643 3402 1565 1981 130 1325 1675 3299 1263 2616 1290 3327 2827 874 3419 507 514 3302 221 3692 873 1740 1236 1258 1943 3692 298 2240 3068 1394 1118 533 2507 2681 863 3220 668 3353 GDI 2 HAE 1 HAE 2 HAE 3 (QGGCCG) (RGGCCR) (PGCGCQ) (GGCC) 14 37 SITES 2681 2703 2783 3068 3220 3353 3388 82 875 1342 1466 1744 3071 3097 485 662 783 1013 1135 1210 1348 1868 1928 2201 2842 3107 3203 3320 260 83 154 371 486 663 784 876 923 931 1014 1136 1211 1343 FRAGMENTS 80 66 56 51 35 22 21 12 1327 793 595 467 278 124 82 26 641 520 485 372 273 265 230 177 138 122 121 117 96 75 60 3432 260 371 278 217 207 194 177 148 135 132 124 122 121 115 AAAAAAA" (35. (21. (16. (7: AAA ONO») (17. (14. 
AA F‘H ocu Hmwwwwwpmqq AAAAAAAAAA AAAAAAAAAAAAA wwwwwwhbmmmqo OOOOI—‘Hl—‘N .2) .8) .4) .9) .6) .3) 2703 1457 2240 1394 3353 2681 668 1445 1744 3097 875 1466 1342 3071 2201 1348 3320 1928 2842 783 485 1210 1013 662 3203 3107 1135 1868 260 3321 1467 154 2202 2649 2924 1989 1211 1745 1014 663 371 FRAGMENT ENDS 2783 1523 2296 1445 3388 2703 689 1457 3071 875 3692 1342 1744 1466 3097 2842 1868 485 3692 2201 3107 1013 662 1348 1135 783 3320 3203 1210 1928 3692 260 3692 1745 371 2409 2843 3072 2124 1343 1869 1136 784 486 154 # SITES FRAGMENTS FRAGMENT ENDS 1349 102 ( 2.8) 2486 2588 1421 92 ( 2.5) 784 876 1467 83 ( 2.2) 931 1014 1745 83 ( 2.2) 1 83 1869 81 ( 2.2) 2843 2924 1929 78 ( 2.1) 3243 3321 1989 78 ( 2.1) 2124 2202 2124 77 ( 2.1) 3127 3204 2202 77 ( 2.1) 2409 2486 2409 75 ( 2.0) 1136 1211 2486 72 ( 2.0) 1349 1421 2588 71 ( 1.9) 83 154 2649 61 ( 1.7) 2588 2649 2843 60 ( 1.6) 1929 1989 2924 60 ( 1.6) 1869 1929 3072 47 ( 1.3) 876 923 3078 46 ( 1.2) 1421 1467 3090 39 ( 1.1) 3204 3243 3098 19 ( 0.5) 3108 3127 3108 12 ( 0.3) 3078 3090 3127 10 ( 0.3) 3098 3108 3204 8 ( 0.2) 3090 3098 3243 8 ( 0.2) 923 931 3321 6 ( 0.2) 3072 3078 6 ( 0.2) 1343 1349 HGA 1 (GACGC) 4 2417 2417 (65.5) 1 2417 2492 767 (20.8) 2687 3454 2687 238 ( 6.4) 3454 3692 3454 195 ( 5.3) 2492 2687 75 ( 2.0) 2417 2492 HGI A1 (GRGCRC) 6 614 1131 (30.6) 1625 2756 985 936 (25 4) 2756 3692 1224 614 (16.6) 1 614 1253 372 (10.1) 1253 1625 1625 371 (10.0) 614 985 2756 239 ( 6.5) 985 1224 29 ( 0.8) 1224 1253 HGI C1 (GGQPCC) 12 333 730 (19.8) 2962 3692 414 697 (18.9) 2265 2962 753 559 (15.1) 816 1375 777 429 (11.6) 1713 2142 816 339 ( 9.2) 414 753 1375 333 ( 9.0) 1 333 1541 172 ( 4.7) 1541 1713 1713 166 ( 4.5) 1375 1541 2142 87 ( 2.4) 2142 2229 2229 81 ( 2.2) 333 414 2265 39 ( 1.1) 777 816 2962 36 ( 1.0) 2229 2265 24 ( 0.7) 753 777 155 # SITES FRAGMENTS FRAGMENT ENDS HGI J2 (GPGCQC) 7 902 1118 (30.3) 1805 2923 1202 902 (24.4) 1 902 1224 433 (11.7) 3259 3692 1625 401 (10.9) 1224 1625 1805 336 ( 9.1) 2923 3259 2923 300 
( 8.1)   902 1202
3259   180 ( 4.9)  1625 1805
        22 ( 0.6)  1202 1224

HHA 1 (GCGC) 6
 261  1254 (34.0)  2027 3281
1087   826 (22.4)   261 1087
1235   507 (13.7)  1520 2027
1520   411 (11.1)  3281 3692
2027   285 ( 7.7)  1235 1520
3281   261 ( 7.1)     1  261
       148 ( 4.0)  1087 1235

HIND 3 (AAGCTT) 1
 564  3128 (84.7)   564 3692
       564 (15.3)     1  564

HINF 1 (GANTC) 9
  46  1090 (29.5)  1106 2196
 123   739 (20.0)  2897 3636
 551   428 (11.6)   123  551
 969   418 (11.3)   551  969
1106   369 (10.0)  2528 2897
2196   332 ( 9.0)  2196 2528
2528   137 ( 3.7)   969 1106
2897    77 ( 2.1)    46  123
3636    56 ( 1.5)  3636 3692
        46 ( 1.2)     1   46

HPA 2 (CCGG) 10
 111  1173 (31.8)   296 1469
 296   671 (18.2)  1469 2140
1469   467 (12.6)  3074 3541
2140   429 (11.6)  2645 3074
2171   240 ( 6.5)  2405 2645
2191   214 ( 5.8)  2191 2405
2405   185 ( 5.0)   111  296
2645   151 ( 4.1)  3541 3692
3074   111 ( 3.0)     1  111
3541    31 ( 0.8)  2140 2171
        20 ( 0.5)  2171 2191

HPH 1 (GGTGA) 11
  33   928 (25.1)  2764 3692
 203   864 (23.4)  1900 2764
 470   458 (12.4)  1026 1484
 593   289 ( 7.8)   737 1026
 737   267 ( 7.2)   203  470
1026   202 ( 5.5)  1484 1686
1484   170 ( 4.6)    33  203
1686   144 ( 3.9)   593  737
1765   135 ( 3.7)  1765 1900
1900   123 ( 3.3)   470  593
2764    79 ( 2.1)  1686 1765
        33 ( 0.9)     1   33

KPN 1 (GGTACC) 3
 753  1550 (42.0)  2142 3692
 816  1326 (35.9)   816 2142
2142   753 (20.4)     1  753
        63 ( 1.7)   753  816

MBO 2 (GAAGA) 13
 239   679 (18.4)   239  918
 918   624 (16.9)  2414 3038
1001   529 (14.3)  1001 1530
1530   461 (12.5)  3083 3544
1859   329 ( 8.9)  1530 1859
2163   304 ( 8.2)  1859 2163
2401   239 ( 6.5)     1  239
2414   238 ( 6.4)  2163 2401
3038    97 ( 2.6)  3595 3692
3083    83 ( 2.2)   918 1001
3544    48 ( 1.3)  3547 3595
3547    45 ( 1.2)  3038 3083
3595    13 ( 0.4)  2401 2414
         3 ( 0.1)  3544 3547

MLU 1 (ACGCGT) 1
1739  1953 (52.9)  1739 3692
      1739 (47.1)     1 1739

MNL 1 (CCTC) 58
  27   192 ( 5.2)  2126 2318
 141   189 ( 5.1)  2562 2751
 157   180 ( 4.9)  1450 1630
 317   160 ( 4.3)   157  317
 382   142 ( 3.8)  2372 2514
 462   140 ( 3.8)  1986 2126
 468   136 ( 3.7)   701  837
 518   114 ( 3.1)    27  141
 620   113 ( 3.1)  1749 1862
 701   106 ( 2.9)  3048 3154
 837   102 ( 2.8)   518  620
 845    87 ( 2.4)  3154 3241
 867    84 ( 2.3)  1862 1946
 921    81 ( 2.2)  2967 3048
 992    81 ( 2.2)   992 1073
1073    81 ( 2.2)   620  701
1090    80 ( 2.2)   382  462
1121    73 ( 2.0)  2781 2854
1146    71 ( 1.9)   921  992
1206    66 ( 1.8)  1213 1279
1213    65 ( 1.8)  1684 1749
1279    65 ( 1.8)   317  382
1307    62 ( 1.7)  1388 1450
1357    61 ( 1.7)  2854 2915
1388    60 ( 1.6)  3407 3467
1450    60 ( 1.6)  1146 1206
1630    56 ( 1.5)  3506 3562
1634    55 ( 1.5)  3562 3617
1684    54 ( 1.5)  2318 2372
1749    54 ( 1.5)   867  921
1862    52 ( 1.4)  2915 2967
1946    51 ( 1.4)  3241 3292
1949    50 ( 1.4)  1634 1684
1986    50 ( 1.4)  1307 1357
2126    50 ( 1.4)   468  518
2318    48 ( 1.3)  3622 3670
2372    48 ( 1.3)  2514 2562
2514    44 ( 1.2)  3363 3407
2562    39 ( 1.1)  3467 3506
2751    37 ( 1.0)  1949 1986
2781    31 ( 0.8)  1357 1388
2854    31 ( 0.8)  1090 1121
2915    30 ( 0.8)  2751 2781
2967    28 ( 0.8)  1279 1307
3048    27 ( 0.7)  3292 3319
3154    27 ( 0.7)     1   27
3241    25 ( 0.7)  1121 1146
3292    22 ( 0.6)  3670 3692
3319    22 ( 0.6)  3341 3363
3341    22 ( 0.6)  3319 3341
3363    22 ( 0.6)   845  867
3407    17 ( 0.5)  1073 1090
3467    16 ( 0.4)   141  157
3506     8 ( 0.2)   837  845
3562     7 ( 0.2)  1206 1213
3617     6 ( 0.2)   462  468
3622     5 ( 0.1)  3617 3622
3670     4 ( 0.1)  1630 1634
         3 ( 0.1)  1946 1949

MST 2 (CCTNAGG) 2
 157  2486 (67.3)  1206 3692
1206  1049 (28.4)   157 1206
       157 ( 4.3)     1  157

NCI 1 (CCSGG) 8
 110  1358 (36.8)   111 1469
 111   721 (19.5)  1469 2190
1469   618 (16.7)  3074 3692
2190   429 (11.6)  2645 3074
2191   240 ( 6.5)  2405 2645
2405   214 ( 5.8)  2191 2405
2645   110 ( 3.0)     1  110
3074     1 ( 0.0)  2190 2191
         1 ( 0.0)   110  111

NCO 1 (CCATGG) 2
 250  2959 (80.1)   733 3692
 733   483 (13.1)   250  733
       250 ( 6.8)     1  250

NHE 1 (GCTAGC) 2
2280  2280 (61.8)     1 2280
2744   948 (25.7)  2744 3692
       464 (12.6)  2280 2744

NLA 3 (CATG) 18
  80  1038 (28.1)  2258 3296
 209   444 (12.0)  1640 2084
 251   404 (10.9)   330  734
 330   357 ( 9.7)  1244 1601
 734   278 ( 7.5)   787 1065
 758   228 ( 6.2)  3464 3692
 787   168 ( 4.6)  3296 3464
1065   129 ( 3.5)    80  209
1131   123 ( 3.3)  2084 2207
1244   113 ( 3.1)  1131 1244
1601    80 ( 2.2)     1   80
1640    79 ( 2.1)   251  330
2084    66 ( 1.8)  1065 1131
2207    42 ( 1.1)   209  251
2219    39 ( 1.1)  2219 2258
2258    39 ( 1.1)  1601 1640
3296    29 ( 0.8)   758  787
3464    24 ( 0.7)   734  758
        12 ( 0.3)  2207 2219

NLA 4 (GGNNCC) 32
 333   366 ( 9.9)  1713 2079
 369   333 ( 9.0)     1  333
 378   297 ( 8.0)  3395 3692
 414   266 ( 7.2)   442  708
 442   264 ( 7.2)  2647 2911
 708   240 ( 6.5)  2407 2647
 753   227 ( 6.1)  1148 1375
 777   172 ( 4.7)  1541 1713
 816   167 ( 4.5)  3076 3243
 903   142 ( 3.8)  2265 2407
 977   120 ( 3.3)  1421 1541
1069   115 ( 3.1)  3243 3358
1148   114 ( 3.1)  2962 3076
1375    92 ( 2.5)   977 1069
1384    87 ( 2.4)  2142 2229
1421    87 ( 2.4)   816  903
1541    79 ( 2.1)  1069 1148
1713    74 ( 2.0)   903  977
2079    57 ( 1.5)  2079 2136
2136    45 ( 1.2)   708  753
2142    39 ( 1.1)  2923 2962
2229    39 ( 1.1)   777  816
2265    37 ( 1.0)  3358 3395
2407    37 ( 1.0)  1384 1421
2647    36 ( 1.0)  2229 2265
2911    36 ( 1.0)   378  414
2923    36 ( 1.0)   333  369
2962    28 ( 0.8)   414  442
3076    24 ( 0.7)   753  777
3243    12 ( 0.3)  2911 2923
3358     9 ( 0.2)  1375 1384
3395     9 ( 0.2)   369  378
         6 ( 0.2)  2136 2142

NSI 1 (ATGCAT) 2
 535  1720 (46.6)  1972 3692
1972  1437 (38.9)   535 1972
       535 (14.5)     1  535

NSP B2 (CVGCWG) 7
 216  1356 (36.7)  2336 3692
 512   763 (20.7)  1573 2336
 872   679 (18.4)   872 1551
1551   360 ( 9.8)   512  872
1566   296 ( 8.0)   216  512
1573   216 ( 5.9)     1  216
2336    15 ( 0.4)  1551 1566
         7 ( 0.2)  1566 1573

NSP C1 (PCATGQ) 5
1130  1130 (30.6)     1 1130
1639  1038 (28.1)  2257 3295
2206   567 (15.4)  1639 2206
2257   509 (13.8)  1130 1639
3295   397 (10.8)  3295 3692
        51 ( 1.4)  2206 2257

PFL M1 (CCANNNNNT 2
 665  2415 (65.4)   665 3080
3080   665 (18.0)     1  665
       612 (16.6)  3080 3692

PPU M1 (PGGRCCQ) 5
 708  1368 (37.1)  1147 2515
1069  1103 (29.9)  2515 3618
1147   708 (19.2)     1  708
2515   361 ( 9.8)   708 1069
3618    78 ( 2.1)  1069 1147
        74 ( 2.0)  3618 3692

PST 1 (CTGCAG) 5
 219  1326 (35.9)  1291 2617
1185  1040 (28.2)  2652 3692
1291   966 (26.2)   219 1185
2617   219 ( 5.9)     1  219
2652   106 ( 2.9)  1185 1291
        35 ( 0.9)  2617 2652

PVU 2 (CAGCTG) 3
 216  2126 (57.6)  1566 3692
1551  1335 (36.2)   216 1551
1566   216 ( 5.9)     1  216
        15 ( 0.4)  1551 1566

RRU 1 (AGTACT) 1
 935  2757 (74.7)   935 3692
       935 (25.3)     1  935

RSA 1 (GTAC) 10
 754   801 (21.7)   936 1737
 817   773 (20.9)  2345 3118
 880   754 (20.4)     1  754
 936   406 (11.0)  1737 2143
1737   336 ( 9.1)  3145 3481
2143   211 ( 5.7)  3481 3692
2345   202 ( 5.5)  2143 2345
3118    63 ( 1.7)   817  880
3145    63 ( 1.7)   754  817
3481    56 ( 1.5)   880  936
        27 ( 0.7)  3118 3145

SAC 1 (GAGCTC) 2
1224  2067 (56.0)  1625 3692
1625  1224 (33.2)     1 1224
       401 (10.9)  1224 1625

SAC 2 (CCGCGG) 1
 872  2820 (76.4)   872 3692
       872 (23.6)     1  872

SAU 1 (CCTNAGG) 2
 157  2486 (67.3)  1206 3692
1206  1049 (28.4)   157 1206
       157 ( 4.3)     1  157

SAU 3A (GATC) 12
 227  1131 (30.6)   473 1604
 458   523 (14.2)  2357 2880
 473   479 (13.0)  2880 3359
1604   279 ( 7.6)  1827 2106
1763   231 ( 6.3)   227  458
1827   227 ( 6.1)  2106 2333
2106   227 ( 6.1)     1  227
2333   173 ( 4.7)  3359 3532
2357   160 ( 4.3)  3532 3692
2880   159 ( 4.3)  1604 1763
3359    64 ( 1.7)  1763 1827
3532    24 ( 0.7)  2333 2357
        15 ( 0.4)   458  473

SAU 96 (GGNCC) 35
 154   304 ( 8.2)   370  674
 319   282 ( 7.6)  1421 1703
 370   273 ( 7.4)  1148 1421
 674   263 ( 7.1)  2648 2911
 709   231 ( 6.3)  1703 1934
 869   224 ( 6.1)  3395 3619
 923   165 ( 4.5)  2123 2288
 930   165 ( 4.5)   154  319
 962   160 ( 4.3)   709  869
1049   154 ( 4.2)     1  154
1070   153 ( 4.1)  2924 3077
1148   152 ( 4.1)  3243 3395
1421   120 ( 3.3)  2288 2408
1703   117 ( 3.2)  3126 3243
1934    92 ( 2.5)  1988 2080
1988    87 ( 2.4)   962 1049
2080    78 ( 2.1)  2408 2486
2123    78 ( 2.1)  1070 1148
2288    73 ( 2.0)  3619 3692
2408    61 ( 1.7)  2587 2648
2486    54 ( 1.5)  1934 1988
2498    54 ( 1.5)   869  923
2516    51 ( 1.4)   319  370
2555    43 ( 1.2)  2080 2123
2587    39 ( 1.1)  2516 2555
2648    37 ( 1.0)  3089 3126
2911    35 ( 0.9)   674  709
2923    32 ( 0.9)  2555 2587
2924    32 ( 0.9)   930  962
3077    21 ( 0.6)  1049 1070
3089    18 ( 0.5)  2498 2516
3126    12 ( 0.3)  3077 3089
3243    12 ( 0.3)  2911 2923
3395    12 ( 0.3)  2486 2498
3619     7 ( 0.2)   923  930
         1 ( 0.0)  2923 2924

SCA 1 (AGTACT) 1
 935  2757 (74.7)   935 3692
       935 (25.3)     1  935

SCR F1 (CCNGG) 47
  54   253 ( 6.9)  1937 2190
 110   228 ( 6.2)  3095 3323
 111   216 ( 5.9)   137  353
 137   208 ( 5.6)  3484 3692
 353   188 ( 5.1)  2886 3074
 439   178 ( 4.8)   781  959
 446   170 ( 4.6)  2716 2886
 530   162 ( 4.4)  1124 1286
 628   156 ( 4.2)  1538 1694
 665   143 ( 3.9)   981 1124
 781   116 ( 3.1)  3332 3448
 959   116 ( 3.1)   665  781
 974   100 ( 2.7)  2191 2291
 981    98 ( 2.7)   530  628
1124    95 ( 2.6)  1286 1381
1286    90 ( 2.4)  1802 1892
1381    88 ( 2.4)  1381 1469
1469    86 ( 2.3)   353  439
1505    84 ( 2.3)   446  530
1526    78 ( 2.1)  1694 1772
1538    75 ( 2.0)  2570 2645
1694    71 ( 1.9)  2645 2716
1772    57 ( 1.5)  2348 2405
1802    56 ( 1.5)    54  110
1892    54 ( 1.5)     1   54
1931    51 ( 1.4)  2519 2570
1937    49 ( 1.3)  2434 2483
2190    41 ( 1.1)  2307 2348
2191    39 ( 1.1)  1892 1931
2291    37 ( 1.0)   628  665
2307    36 ( 1.0)  3448 3484
2348    36 ( 1.0)  1469 1505
2405    30 ( 0.8)  1772 1802
2434    29 ( 0.8)  2405 2434
2483    27 ( 0.7)  2483 2510
2510    26 ( 0.7)   111  137
2519    21 ( 0.6)  3074 3095
2570    21 ( 0.6)  1505 1526
2645    16 ( 0.4)  2291 2307
2716    15 ( 0.4)   959  974
2886    12 ( 0.3)  1526 1538
3074     9 ( 0.2)  3323 3332
3095     9 ( 0.2)  2510 2519
3323     7 ( 0.2)   974  981
3332     7 ( 0.2)   439  446
3448     6 ( 0.2)  1931 1937
3484     1 ( 0.0)  2190 2191
         1 ( 0.0)   110  111

SDU 1 (G2GC3C) 16
 413   569 (15.4)  2187 2756
 614   433 (11.7)  3259 3692
 776   413 (11.2)     1  413
 902   382 (10.3)  1805 2187
 985   372 (10.1)  1253 1625
1202   296 ( 8.0)  2963 3259
1224   217 ( 5.9)   985 1202
1253   201 ( 5.4)   413  614
1625   167 ( 4.5)  2756 2923
1712   162 ( 4.4)   614  776
1805   126 ( 3.4)   776  902
2187    93 ( 2.5)  1712 1805
2756    87 ( 2.4)  1625 1712
2923    83 ( 2.2)   902  985
2963    40 ( 1.1)  2923 2963
3259    29 ( 0.8)  1224 1253
        22 ( 0.6)  1202 1224

SFA N1 (GATGC) 15
 534   614 (16.6)  2544 3158
 690   534 (14.5)  3158 3692
 892   534 (14.5)     1  534
 954   321 ( 8.7)  1522 1843
1117   303 ( 8.2)  1974 2277
1218   249 ( 6.7)  2295 2544
1393   202 ( 5.5)   690  892
1522   175 ( 4.7)  1218 1393
1843   163 ( 4.4)   954 1117
1942   156 ( 4.2)   534  690
1974   129 ( 3.5)  1393 1522
2277   101 ( 2.7)  1117 1218
2295    99 ( 2.7)  1843 1942
2544    62 ( 1.7)   892  954
3158    32 ( 0.9)  1942 1974
        18 ( 0.5)  2277 2295

SMA 1 (CCCGGG) 2
 110  2080 (56.3)   110 2190
2190  1502 (40.7)  2190 3692
       110 ( 3.0)     1  110

SPH 1 (GCATGC) 3
1639  1639 (44.4)     1 1639
2257  1038 (28.1)  2257 3295
3295   618 (16.7)  1639 2257
       397 (10.8)  3295 3692

SSP 1 (AATATT) 1
 717  2975 (80.6)   717 3692
       717 (19.4)     1  717

STU 1 (AGGCCT) 2
1210  1824 (49.4)  1868 3692
1868  1210 (32.8)     1 1210
       658 (17.8)  1210 1868

STY 1 (CCRRGG) 7
 150  1986 (53.8)  1706 3692
 250   780 (21.1)   926 1706
 418   257 ( 7.0)   476  733
 476   193 ( 5.2)   733  926
 733   168 ( 4.6)   250  418
 926   150 ( 4.1)     1  150
1706   100 ( 2.7)   150  250
        58 ( 1.6)   418  476

TAQ 1 (TCGA) 4
 895  1870 (50.7)  1369 3239
 949   895 (24.2)     1  895
1369   453 (12.3)  3239 3692
3239   420 (11.4)   949 1369
        54 ( 1.5)   895  949

TTH111 1 (GACNNNG 1
2663  2663 (72.1)     1 2663
      1029 (27.9)  2663 3692

TTH111 2 (CCAPCA) 9
 358  1390 (37.6)  1732 3122
 482   557 (15.1)   488 1045
 488   422 (11.4)  1310 1732
1045   358 ( 9.7)     1  358
1061   343 ( 9.3)  3349 3692
1310   249 ( 6.7)  1061 1310
1732   227 ( 6.1)  3122 3349
3122   124 ( 3.4)   358  482
3349    16 ( 0.4)  1045 1061
         6 ( 0.2)   482  488

XHO 1 (CTCGAG) 1
3238  3238 (87.7)     1 3238
       454 (12.3)  3238 3692

XHO 2 (PGATCQ) 4
 226  2653 (71.9)   226 2879
2879   479 (13.0)  2879 3358
3358   226 ( 6.1)     1  226
3531   173 ( 4.7)  3358 3531
       161 ( 4.4)  3531 3692

XMA 3 (CGGCCG) 1
 875  2817 (76.3)   875 3692
       875 (23.7)     1  875

The following do not appear:
AFL 2 AHA 3 AOS 1 BGL 2 BSS H2 CLA 1 ECO R5 HINC 2 HPA 1
MST 1 NAE 1 NAR 1 NDE 1 NOT 1 NRU 1 PVU 1 RSR 2 SAL 1 SFI 1 SNA 1 SNA B1 SPE 1 XBA 1 XMN 1
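The fragment columns in the restriction-site tables of this appendix follow arithmetically from the listed site positions and the 3692-bp sequence length. The following is a minimal sketch of that arithmetic, not the original mapping program: the function name `digest_fragments` is hypothetical, a linear (non-circular) sequence is assumed, and the ordering of equal-sized fragments (higher start position first) is an assumption inferred from the tables themselves.

```python
def digest_fragments(sites, length):
    """Return (size, percent, start, end) rows for a linear digest, largest first.

    `sites` are cut positions as listed in the SITES column; `length` is the
    sequence length in base pairs. Hypothetical helper for illustration only.
    """
    cuts = [0] + sorted(sites) + [length]
    rows = []
    for left, right in zip(cuts, cuts[1:]):
        size = right - left
        percent = round(100 * size / length, 1)
        # The tables print the first fragment as starting at position 1.
        rows.append((size, percent, max(left, 1), right))
    # Largest fragment first; ties broken by higher start (assumed convention).
    rows.sort(key=lambda row: (-row[0], -row[2]))
    return rows


# Example: the single HIND 3 site at position 564 in the 3692-bp sequence.
for size, percent, start, end in digest_fragments([564], 3692):
    print(f"{size:5d} ({percent:4.1f}) {start:5d} {end:5d}")
```

Run on the TAQ 1 sites (895, 949, 1369, 3239), the same function gives the five fragments listed for that enzyme, which is a quick way to cross-check any row whose columns are hard to read.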