This is to certify that the dissertation entitled

TYPE II HEXOKINASE: MOLECULAR CLONING, SEQUENCE AND PROMOTER ANALYSIS

presented by Annette P. Thelen has been accepted towards fulfillment of the requirements for the Ph.D. degree in Biochemistry.

Major professor: John E. Wilson
Date: April 7, 1992
Professor and Chair, Dept.
of Biochemistry

MSU is an Affirmative Action/Equal Opportunity Institution

TYPE II HEXOKINASE: MOLECULAR CLONING, SEQUENCE AND PROMOTER ANALYSIS

By

Annette P. Thelen

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Biochemistry

1992

ABSTRACT

TYPE II HEXOKINASE: MOLECULAR CLONING, SEQUENCE AND PROMOTER ANALYSIS

By

Annette P. Thelen

The 917-residue amino acid sequence of Type II hexokinase has been deduced from the overlapping nucleotide sequences of two clones isolated from rat skeletal muscle cDNA libraries. The sequences of 197 nucleotides of the 5' untranslated region and 687 nucleotides of the 3' untranslated region have also been determined from cloned cDNA. There is extensive similarity between the sequences of the N- and C-terminal halves of the Type II isozyme, as previously seen with the Type I isozyme; this is consistent with the view that these enzymes evolved by a process of gene duplication and fusion. The region of overlap between the two discrete cDNA clones was confirmed by isolation and sequencing of a genomic DNA clone that spanned the region. Within this region, the 634-nucleotide coding sequence was divided into three exons, each of 150-250 nucleotides. In addition, the predominant transcription start site was located in a region 465 nucleotides upstream from the translation initiation codon. The genomic sequence upstream from the transcription start site was found to contain several potential promoter elements. These include a TATA-like sequence, two CCAAT sequences, and three Sp1 binding sites.
Alignment of genomic clones for Type II hexokinase predicts a gene length of at least 35 kb. These results suggest that the gene encoding Type II hexokinase is likely to be quite complex. A cDNA encoding the entire C-terminal half of a hexokinase from Novikoff ascites tumor cells was also isolated and found to be identical to a cDNA encoding the corresponding region of the Type II isozyme of skeletal muscle. Northern analysis indicated that a single mRNA, approximately 5200 nucleotides in length, encoded both the skeletal muscle and the tumor enzymes. These results do not support previous speculation that the hexokinase isozymes of normal tissue are distinct from those of tumor, and suggest the possibility that post-translational modifications of a single protein species might account for apparent differences between the isozymes of normal and tumor tissues.

Copyright by
ANNETTE PHYLLIS THELEN
1992

To my family

ACKNOWLEDGEMENTS

I want to thank my family, especially my husband, Bill, without whose emotional support the completion of this degree would not have been possible. My children, Greg and Sarah, have been very patient and supportive through the easy and hard times.

I want to thank John Wilson for his guidance and intellectual support throughout this project. He has taught me much by his example, allowing me to develop and mature as a scientist. When things looked the darkest, he kept me looking for the dawn.

I would like to acknowledge my committee, whose advice has been most appreciated. They are: Susan Conrad, Steve Triezenberg, Clarence Suelter, and Jerry Dodgson.

I want to give a special thanks to Patty Voss, Joe Leykam, Marion Healy, Linda Sherwood, Marcia Kieliszewski, and Al Smith for all their help and support.

Lastly, my deepest appreciation to Dr. Dave McConnell, whose encouragement and assistance led to the fulfillment of a lifelong goal.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER 1
Introduction
    Prologue
    Background
    Tissue Distribution
    Influences on Hexokinase Levels
    Regulation of Expression - the Glucokinase Gene
    Intracellular Locations of Hexokinase
    Regulation of Hexokinase Activity
    Tumor Hexokinase
    Evolution of the Mammalian Hexokinases
    Structure-Function Relationship in Hexokinase

CHAPTER 2
Materials and Methods
    Materials
    Methods
        Purification and N-Terminal Sequencing of Type II Hexokinase from Rat Skeletal Muscle
        cDNA Library Synthesis and Screenings
        Isolation of Genomic Clones Containing Type II Hexokinase Gene
        Site-directed Mutagenesis of Type II Hexokinase: Creation of NcoI Site
        Expression of Type II Hexokinase in COS-1 Cells
        RNA Isolation and Northern Analysis
        Primer Extension
        S1 Nuclease Protection Assay
        Cell-Free in vitro Transcription Assay

CHAPTER 3
Results
    Isolation of Clones Containing cDNA for Rat Type II Hexokinase
    Isolation of Genomic Clones for Rat Type II Hexokinase
    Isolation of a Partial cDNA Clone for Novikoff Tumor Hexokinase
    Northern Blot Analysis of mRNA from Rat Skeletal Muscle and Novikoff Ascites Tumor Cells
    Identification of the Transcription Initiation Sites
    Type II Hexokinase Promoter
    Expression of Type II Hexokinase in COS-1 Cells

CHAPTER 4
Discussion
    Amino Acid Sequence of Type II Hexokinase
    Structure of Type II Hexokinase mRNA
    Type II Hexokinase Gene

CHAPTER 5
Future Work

CHAPTER 6
References

APPENDIX
    Appendix A

LIST OF TABLES

I. Comparison of several parameters for the mammalian hexokinase isozymes
II. Comparison of Amino Acid Sequences of N- and C-Terminal Halves of Rat Type I and Rat Type III Hexokinases, and Glucokinase
III. Summary of cDNA library synthesis/screening strategies and results
IV. Summary of Transfection Results
V. Comparison of Deduced Amino Acid Sequences of the N- and C-Terminal Halves of Rat Type II Hexokinase with Sequence of the Type IV Isozyme and Sequences of the N- and C-Terminal Halves of the Type I and III Isozymes

LIST OF FIGURES

1. Postulated evolution of mammalian hexokinases
2. Alignment of clones containing genomic DNA for rat Type II hexokinase
3. HaeIII restriction patterns of cDNA clones
4. Relevant restriction sites and sequencing strategy for clones containing either cDNA or genomic DNA for rat Type II hexokinase
5. Nucleotide and deduced amino acid sequences for rat Type II hexokinase
6. HaeIII restriction patterns of cDNA clones for Type II hexokinase (C-terminus)
7.
Southern blot analyses of ten clones containing genomic DNA for Type II hexokinase
8. Southern analysis of clones containing 5' and 3' genomic sequences for Type II hexokinase
9. Alignment of relevant clones containing genomic DNA for rat Type II hexokinase
10. Southern analysis of genomic clone 3G3A
11. Northern blot analyses of rat Type II hexokinase mRNA
12. S1 nuclease protection assay results
13. Primer-extension results
14. Identification of the transcription initiation region of the Type II hexokinase gene
15. Identification of the transcription initiation site of the Type II hexokinase gene
16. Southern analysis of genomic clone 5G3A
17. Sequence of the Type II hexokinase promoter region
18. Cell-free in vitro transcription assay results
19. Western blot analysis of Type II hexokinase expressed in COS-1 cells
20. Comparison of aligned amino acid sequences of N- and C-terminal halves of rat Types I-III hexokinases and rat glucokinase (Type IV)

LIST OF ABBREVIATIONS

BCA         bicinchoninic acid
bp          basepair
BSA         bovine serum albumin
DEAE        diethylaminoethyl
DMEM        Dulbecco's modified Eagle's medium
dNTP        deoxynucleoside triphosphate
DTT         dithiothreitol
EDTA        disodium ethylenediamine tetraacetate
EtBr        ethidium bromide
Glc-6-P     glucose 6-phosphate
Glc         glucose
            guanidine hydrochloride
HEPES       N-2-hydroxyethylpiperazine-N'-2-ethanesulfonic acid
kb          kilobase
kDa         kilodalton
PIPES       piperazine-N,N'-bis[2-ethanesulfonic acid]
SDS         sodium dodecyl sulfate
SSC         saline sodium citrate
            monothioglycerol
Tris        tris[hydroxymethyl]aminomethane

CHAPTER 1

Introduction

Prologue

The phosphorylation of glucose by hexokinase yields glucose-6-phosphate.
Because glucose-6-phosphate is a substrate in many metabolic pathways, hexokinase is one of the major points of regulation in the metabolism of glucose in mammals. The mammalian hexokinase family consists of several distinct isozymes, each with different biochemical properties. The major aim of this work was to determine the amino acid sequence of Type II hexokinase from rat muscle, permitting comparison with other rat isozymes and thereby contributing to our understanding of the structural differences between these isozymes that bring about their regulatory diversity. This chapter will provide background information on the localization, regulatory properties, expression, structure, and evolutionary aspects of the hexokinase isozyme family.

MAMMALIAN HEXOKINASES

The conversion of glucose to glucose-6-phosphate in mammalian tissues is catalyzed by hexokinase (ATP:D-hexose 6-phosphotransferase, EC 2.7.1.1) using ATP as the phosphoryl donor. Since glucose-6-phosphate is the initial substrate in many metabolic pathways such as glycogen biosynthesis and glycolysis, it is not surprising that hexokinase is under complex regulation via product inhibition and mitochondrial binding. The study of the function, structure, and regulation of hexokinase is made more complex by the fact that there are at least four isozymes in mammalian tissues.

Background

The multiple isozymes, first observed in liver in 1963 by Viñuela (1) and Walker (2), can be separated from one another by either chromatographic (3) or electrophoretic (4) techniques. The isozymes are named A-D, or alternatively I-IV, based on their order of elution from a DEAE-cellulose column (3), or their order of increasing mobility during starch gel electrophoresis (5), respectively. Although each of these isozymes has the same catalytic function, they differ significantly from one another in size, tissue distribution, and details of regulation. Several of these characteristics, discussed below, are summarized in Table I.
Isozymes I-III are monomers with approximate molecular weights of 100 kDa and have low Km values for glucose (0.02-0.13 mM). Isozyme IV (glucokinase, EC 2.7.1.2) is a monomer of 50 kDa and has a much higher Km for glucose (4.5 mM) (6-9). The apparent molecular weight of isozyme IV is similar to that of Types L1 and L2 hexokinase from wheat germ (10) and Types A and B hexokinase from yeast (23).

Table I. Comparison of several parameters for the mammalian hexokinase isozymes (a).

                         HEXOKINASE ISOZYME
                      I              II            III       IV
Mr, approx.           98,000         96,000        98,000    49,000
Source (c)            brain          muscle        spleen    liver
                      erythrocytes   adipocytes    lung      pancreas
Km Glc (b)            0.04           0.13          0.02      4.5
Km ATP                0.42           0.07          1.29      0.49
Ki Glc-6-P vs ATP     0.026          0.021         0.074     15

a - Table adapted from Ureta (57), and references therein.
b - All apparent kinetic constant values are expressed in mM.
c - Some examples are shown; this is not an all-inclusive list.

Tissue Distribution

A large variation in the hexokinase isozyme levels within various tissues has been observed (11). Generally, more than one hexokinase isozyme is present in most tissues. Type I hexokinase is found in virtually all tissues and can be considered the "primary" hexokinase. Since Type I hexokinase is the most prevalent member of this family, much of the biochemical information available about the hexokinase reaction has been obtained from the study of this isozyme. This isoform is the predominant hexokinase in tissues with heavy reliance on blood-borne glucose for energy, as in brain and erythrocytes. In contrast, Type II hexokinase is the major isozyme in insulin-sensitive tissues such as muscle, adipose and mammary gland. In these tissues, Glc-6-P can be directed into energy storage forms such as lipids (in adipose or mammary tissue) or glycogen (in muscle tissue). The predominance of Type II hexokinase in these tissues leads one to speculate that this isozyme plays an integral part in such energy storage pathways.
The Type III isoform can be found in several tissues (e.g., spleen, kidney, and lung) but in much lower amounts than the other three isozymes. For this reason, less is known about Type III hexokinase than the other isozymes.

Early work indicated that Type IV hexokinase, or glucokinase, was present only in the liver (5). However, glucokinase has since been found in the islets of Langerhans in the pancreas (12). More recently, a glucokinase was detected in anterior pituitary cells and pituitary cell lines (13). Both Northern analysis and Western blot analysis, the latter using antibodies against the pancreatic glucokinase, detected glucokinase in the pituitary cell line AtT20. No glucokinase mRNA or enzyme activity was detectable in the pituitary tissue, even though the antibodies against the pancreatic enzyme did react with a protein band of the appropriate size in the Western analysis. Hughes et al. (13) postulated that an undetectably low amount of mRNA might be sufficient for expression of glucokinase since the protein half-life is as much as 30 h. However, Liang et al. (14) reported detecting glucokinase mRNA, but no enzyme activity, in both the pituitary cell line AtT20 and pituitary tissue. These investigators suggested that alteration of the open reading frame for glucokinase may explain the lack of enzyme activity in the pituitary.

Influences on Hexokinase Levels

While the relative amount of each isozyme is influenced by age, diet, and hormones as well as physical activity, isozymes II and IV appear to be most affected. Bernstein and Kipnis investigated the effects of age and diet on Types I and II hexokinase activity in rat skeletal muscle and adipose tissues (15). They found that the hexokinase activity in both tissues decreased with age. Similar decreases in hexokinase activity were found in muscle and adipose tissues of young rats after a 3-day fast.
These investigators determined that, since the levels of Type I hexokinase in both tissues were unaffected by age and diet, changes in the Type II isozyme were responsible for the alterations in hexokinase activity seen in muscle and adipose tissue under these conditions. These results were similar to those seen by Katzen and Schimke (5).

Under diabetic conditions, the level of Type II hexokinase decreases in fat pad, heart, and skeletal muscle as well as adipose tissue and mammary gland, and is returned to normal levels upon insulin treatment (16-18). Frank and Fromm used streptozotocin to induce diabetes in rats in order to investigate the effect of insulin on the synthesis (19) and degradation (20) of Type II hexokinase. These investigators monitored the incorporation of [3H]leucine into Type II hexokinase in the skeletal muscle of diabetic and insulin-treated diabetic rats. They found that the rate of synthesis of skeletal muscle Type II hexokinase in normal rats was approximately 1.9 times greater than in diabetic animals. Insulin treatment of the diabetic animals brought the reduced synthesis rate of the Type II isozyme back to near normal levels (19). In a similar set of experiments (20), Frank and Fromm found that the degradation rate constant for Type II hexokinase was approximately 3 times greater in diabetic animals than in normal animals. The half-life of this protein in diabetic muscle was 9 hr versus 28 hr in normal tissue. Insulin treatment of diabetic skeletal muscle restored both the rate of degradation and the half-life of the Type II isozyme to levels approximating those found in normal tissues. In these experiments, the investigators did not determine whether the changes in Type II hexokinase levels were related to the rate of transcription of its gene or the stability of its message.

The level of glucokinase in liver is influenced by several factors including insulin, diet, fasting (21) and glucagon (22).
This is not the case for the glucokinase found in the β-cells of the pancreas, which appears to be influenced by changes in blood glucose concentration, and not insulin (reviewed in ref 24). Because insulin secretion from β-cells is controlled through the glycolytic rate, the pancreatic glucokinase has been implicated in the control of insulin levels. These seemingly opposite controls can provide an effective regulation cycle for plasma glucose levels. Elevated glucose increases pancreatic glucokinase activity, which in turn stimulates insulin release. This elevated insulin level stimulates hepatic glucokinase, which will decrease blood glucose levels.

For a number of years, it has been known that exercise (25,26) and chronic stimulation (27,28) cause an increase in hexokinase activity in muscle. Weber and Pette, using [35S]methionine and immunoprecipitation, monitored protein production in chronically stimulated rat skeletal muscle. They found that the increase in hexokinase activity in stimulated tissue was the result of increased Type II hexokinase synthesis (29). These observations were confirmed, and expanded upon, when Weber and Pette demonstrated that hexokinase activity reached a maximum after 14 days of chronic stimulation (30). When stimulation was withdrawn, a decreased rate of synthesis and possibly an increased rate of degradation of the Type II isozyme (31) caused an immediate decline in both hexokinase activity and the levels of the Type II isozyme. These researchers also found that nearly 50% more total hexokinase activity was mitochondrially bound in stimulated muscle than in unstimulated tissue (30); the potential regulatory significance of the binding of hexokinase to mitochondria will be discussed below. The increases in both the Type II hexokinase protein levels and the amount of mitochondrially bound hexokinase represent intracellular changes in response to the increased energy demand of chronic stimulation.
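The degradation figures quoted earlier in this section for Type II hexokinase (a half-life of 9 hr in diabetic muscle versus 28 hr in normal tissue, and a degradation rate constant roughly 3 times greater in diabetic animals) can be cross-checked against each other. The sketch below assumes simple first-order decay, k = ln 2 / t1/2, which is an assumption made here for illustration and is not stated in the cited work; the function and variable names are likewise hypothetical.

```python
import math


def rate_constant_from_half_life(half_life_hr: float) -> float:
    """Return a first-order degradation rate constant (per hour).

    Assumes exponential (first-order) decay, so k = ln(2) / t_half.
    """
    if half_life_hr <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_hr


# Half-lives quoted in the text for Type II hexokinase in skeletal muscle.
k_diabetic = rate_constant_from_half_life(9.0)   # diabetic muscle
k_normal = rate_constant_from_half_life(28.0)    # normal muscle

# Ratio of rate constants; under first-order decay this equals 28 / 9.
fold_increase = k_diabetic / k_normal
print(f"{fold_increase:.2f}")  # prints 3.11
```

Under this assumption the two half-lives imply a roughly 3.1-fold difference in rate constant, in line with the "approximately 3 times greater" value reported.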
Regulation of Expression - the Glucokinase Gene

As previously discussed, several factors influence both the activity of glucokinase and the level of the protein. Hormones, diet, age, and activity all appear to alter the amount of glucokinase, to differing degrees in different tissues (or cell types). Investigation into the regulation of glucokinase transcription became possible when the Granner laboratory isolated and characterized both the cDNA (32) and gene (33) for liver glucokinase. The gene encoding glucokinase is the only hexokinase gene studied in depth to date.

The hepatic glucokinase gene was found to be 15.5 kb in length, and contained 10 exons ranging in size from 96 to 977 bp. Transcription initiation was localized over a 4-base range, with the strongest band 127 nucleotides upstream from the translation initiation codon. This translation initiation codon is located in exon 1. The message for hepatic glucokinase is 2.4 kb in length, encoding a protein of 465 residues. Analysis of 5' flanking sequences located several promoter elements, including a TATA box and an Sp1 binding site. Also present were several elements found in other liver-specific genes. Using run-on transcription analysis, these investigators demonstrated that with insulin treatment, the rate of transcription of the hepatic glucokinase gene in diabetic rats increased approximately 20-fold in 2 hr, with a significant increase within 1 hr. Similar increases in the transcription of glucokinase mRNA in rat liver and hepatocytes have been reported by Sibrowski and Seitz (34), and Iynedjian and coworkers (22), respectively. In both cases the half-life for glucokinase mRNA was determined to be 40-45 min.

The laboratory of Tanaka and coworkers (35) investigated the genomic region upstream from the transcription initiation site for liver glucokinase in an attempt to locate genomic regions important in the insulin regulation of this gene.
Using the chloramphenicol acetyltransferase (CAT) assay system connected to deletion constructs of the glucokinase gene (-5.5 kb to -48), these investigators determined that the genomic region from immediately upstream of the transcription start site (+1) to nucleotide -87 was sufficient for promoter activity in rat hepatocytes. However, using the same experimental strategy, insulin treatment of the transformed hepatocytes resulted in no change in CAT activity of the cells. They concluded that the 5.5 kb sequence 5' of the transcription initiation site for glucokinase did not contain insulin-responsive elements. It is possible that such sequences may be further upstream, or may be downstream, contained within an intron in the glucokinase gene.

Magnuson and Shelton (36), in studies on the expression of glucokinase in pancreatic β-cells, isolated and characterized the transcription unit of this isozyme from an insulinoma cell line. These investigators reported the size of the pancreatic glucokinase mRNA to be approximately 2600 nucleotides in length, about 250 nucleotides longer than the message for liver glucokinase. Iynedjian and coworkers (37) reported a difference in glucokinase message size of 400 nucleotides, with the pancreatic message being longer (approx 2.8 kb). Similar message sizes have also been seen in both pituitary tissue and the AtT20 cell line (13).

Magnuson and Shelton (36) found that the promoter for this pancreatic enzyme was at least 12 kb upstream from the hepatic glucokinase promoter, making the transcription unit at least 27.5 kb. The 5' ends of the cDNAs for pancreatic and hepatic glucokinases were completely different, resulting in 15 different amino acids at their N-termini. The point at which the hepatic and pancreatic sequences diverged corresponded to a splice site between exons 1 and 2 in the hepatic transcript unit. However, all residues 3' of this splice site were identical between the hepatic and pancreatic glucokinases.
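The size figures quoted above for the glucokinase gene and its messages can be tallied in a few lines. This is only an arithmetic sketch: the constant and function names are illustrative, and the values are those cited in the text (a 15.5 kb hepatic gene, a 2.4 kb hepatic message, and a pancreatic promoter at least 12 kb further upstream).

```python
# Values as quoted in the text; the names are illustrative only.
HEPATIC_GENE_KB = 15.5              # length of the hepatic glucokinase gene
PANCREATIC_PROMOTER_GAP_KB = 12.0   # pancreatic promoter lies at least this far upstream
HEPATIC_MRNA_NT = 2400              # hepatic glucokinase message (2.4 kb)


def min_transcription_unit_kb(gene_kb: float, upstream_gap_kb: float) -> float:
    """Lower bound on the transcription unit when an alternative promoter
    lies at least `upstream_gap_kb` upstream of the known gene."""
    return gene_kb + upstream_gap_kb


min_unit_kb = min_transcription_unit_kb(HEPATIC_GENE_KB, PANCREATIC_PROMOTER_GAP_KB)

# Pancreatic message sizes implied by the two reported length differences.
pancreatic_mrna_nt = {
    "Magnuson and Shelton (36)": HEPATIC_MRNA_NT + 250,
    "Iynedjian and coworkers (37)": HEPATIC_MRNA_NT + 400,
}

print(min_unit_kb)
for source, size_nt in pancreatic_mrna_nt.items():
    print(source, size_nt)
```

The sum reproduces the 27.5 kb lower bound given for the transcription unit, and the two implied pancreatic message sizes (2650 and 2800 nucleotides) sit close to the approximate 2600-nucleotide and 2.8 kb values reported.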
Several laboratories have detected a number of glucokinase variants in liver, pancreas, and pituitary cells and tissue. Magnuson and Shelton (36) found in pancreas a glucokinase cDNA with a 51 bp deletion which generated an in-frame deletion of 17 amino acids (14). Using PCR technology, Magnuson et al. (14) detected mRNAs, in both pancreas and liver, that contained the aforementioned dissimilar 5' regions and the 51 bp deletion. They also reported locating additional alternate splicing products in both pituitary tissue and the pituitary cell line, AtT20.

In recent work, Newgard and colleagues (13,38) have identified several additional variant rat glucokinase transcripts in liver, pancreas, AtT20 cells and pituitary tissues. Expression of the pancreatic and liver variants containing a 52 bp deletion, at the 3' end of exon 2 of the glucokinase gene, resulted in no glucokinase activity (38). However, they could not detect the transcript containing the 51 bp deletion that Magnuson et al. had analyzed.

It is evident that there are at least two distinct promoters for the glucokinase gene that control tissue-specific expression of unique rat glucokinases in the liver, pancreas, and pituitary. The significance of additional unique transcripts, resulting from several alternate splicing events, remains to be determined. As research into Types I-III hexokinase continues, similar complexity will undoubtedly be found in their gene structures. Preliminary characterization of the Type II gene, presented herein, supports this expectation.

Intracellular Locations of Hexokinase

The association of hexokinase with the particulate fractions of tissue homogenates has been observed by many laboratories using a number of tissues (11). The actual subcellular structure that bound the hexokinase(s), however, was not always determined. Several researchers have reported that the particulate hexokinase activity was associated with mitochondria, for most tissues tested.
Significant portions of Type I and Type II hexokinase have been found associated with the mitochondria in tissues such as brain (11), as well as heart, diaphragm, skeletal muscle and mammary gland (39). Generally, Type III hexokinase has been found only in the soluble fraction of tissue homogenates such as lung (40) and hepatoma (41). One report of the association of Type III hexokinase with the mitochondria (42) may have incorrectly identified the Type II isozyme as Type III hexokinase (41). However, based on localization studies by Preller and Wilson (43), the Type III isozyme appears to have a weak association with nuclei of several different rat tissues. Many of the cell types exhibiting nuclear association of Type III were endothelial or epithelial cells. One laboratory has reported the presence of glucokinase in the nuclei, as well as the cytoplasm, of parenchymal cells of rat liver (45).

The outer mitochondrial membrane protein to which Type I hexokinase binds was isolated by Felgner et al. (44). Subsequently, it was shown that this hexokinase binding protein was the pore-forming protein, porin (46,47). Hexokinase binding to porin, through which molecules such as ADP and ATP flow, creates a direct link between glycolysis in the cytosol and ATP production in the mitochondria. This association between hexokinase and porin gives the enzyme preferential access to mitochondrially generated ATP (48,49). Inui and Ishibashi (50) reported that the efficient utilization of mitochondrial ATP was dependent upon hexokinase being bound to the mitochondrial membrane. The bound form of the enzyme has a slightly greater affinity for ATP and is considerably less sensitive to Glc-6-P inhibition (40,51,52). These kinetic differences, coupled with preferential access to intramitochondrially generated ATP, have led to the suggestion that mitochondrial binding represents a mechanism for activation (relative to the unbound enzyme) of hexokinase.
Divalent cations and a hydrophobic "tail" at the N-terminus of the protein facilitate the association of hexokinase with the mitochondrial membrane. Divalent cations are known to enhance this protein-membrane association in brain (52), tumor (53), and skeletal muscle (49). It is likely that this occurs by bridging the negatively charged groups on both the protein and the membrane. Critical to the association of Type I hexokinase with the mitochondria is the N-terminus of the enzyme. Polakis and Wilson (54) used limited chymotryptic digestion to demonstrate that the loss of the 9 hydrophobic residues at the N-terminus of Type I hexokinase resulted in the loss of binding to the mitochondrial membrane. Similar results were also reported by Rose and Warms (53). Further studies into the mitochondrial binding of Type I hexokinase, by Xie and Wilson (55), demonstrated that this hydrophobic N-terminus was actually inserted into a hydrophobic core in the mitochondrial membrane. Inspection of the N-terminal residues of the deduced amino acid sequence for Type III hexokinase (56) reveals that this region of Type III is much less hydrophobic than the N-terminus of Type I. The lack of such a hydrophobic region in Type III hexokinase is one factor that makes mitochondrial binding less likely, if not impossible.

Regulation of Hexokinase Activity

As with the regulation of the level of the hexokinases, the enzymatic activity of each hexokinase isozyme is regulated differently. Mammalian hexokinase Types I-III are sensitive to allosteric inhibition by Glc-6-P (11), whereas glucokinase shows sensitivity to this ligand only at concentrations higher than normal physiological levels (57). The inhibitory effect of Glc-6-P on the Type I isozyme is immediate, in contrast to the delayed inhibition of Type II hexokinase by this ligand (58,59). The t1/2 of the response of the Type II isozyme to Glc-6-P varied from 12 sec to 130 sec, for soluble or mitochondrially bound enzyme, respectively.
Other ligands affecting hexokinase activity are inorganic phosphate (Pi), Glc-1,6-P2, and glucose. Inorganic phosphate reverses the inhibition of the Type I isoform by Glc-6-P (11) but has little if any effect on the Type II isozyme (59). Glucose 1,6-bisphosphate is also an inhibitor of Type I hexokinase and has been shown to have an even greater affinity for the Type II isozyme (60,61). The substrate, glucose, is itself an inhibitor of Type III hexokinase at concentrations above 0.2 mM (3), suggesting that the isozyme is likely to be active only when intracellular concentrations of glucose are low.

Tumor Hexokinase

For more than 60 years, it has been known that tumor cells exhibit increased glycolytic rates (62). Several changes in cellular metabolism in tumors are directly associated with increased glycolysis. Included in these changes are the increased levels of key enzymes such as hexokinase, and a change in the degree to which hexokinase is associated with the mitochondria. Bustamante and Pedersen showed that the transformation of liver cells into tumor cells increases the hexokinase activity by more than 20-fold (63). Using the hepatoma cell line H-91, these investigators found that at least half the total hexokinase activity was associated with the mitochondria, and that mitochondrially generated ATP was preferentially used to phosphorylate glucose. In a similar set of experiments utilizing the Ehrlich ascites tumor line, Bustamante et al. (64) found that approximately two-thirds of the hexokinase activity was mitochondrially associated. These authors also reported that fast-growing tumor cells (those reaching maximum size within 1 month) have the highest levels of hexokinase activity, again with 70% of the activity associated with the mitochondria (64).

To gain further insight into the nature of tumor hexokinase, Nakashima and co-workers (65) purified and characterized a mitochondrially bound hexokinase from the AS-30D rat hepatoma cell line.
They demonstrated that this cell line, like those mentioned above, contained increased hexokinase relative to normal liver, with a significant portion bound to the mitochondria. Anion exchange chromatography of liver and AS-30D hexokinases showed that two tumor isozymes coeluted with Types I and II from normal liver. These authors reported that the purified AS-30D hexokinase, which coeluted with the liver Type II hexokinase, had kinetic and chromatographic properties similar to those of the Type II isoform from normal tissues. A comparison of their amino acid compositions indicated that the tumor and liver isozymes were not identical. However, these researchers did note that the amino acid compositions were obtained from different laboratories using different techniques. Nakashima et al. concluded that, in the transformation from normal liver to hepatoma, there was a significant change in hexokinase content to an isozyme form with very different properties.

The complete amino acid sequence for a tumor hexokinase was first reported by Arora and coworkers (66). They isolated cDNA for a hexokinase from the c37 mouse hepatoma cell line, using a partial cDNA for rat brain Type I hexokinase (67) as the probe. The deduced sequence for this mouse tumor hexokinase was the same length (918 residues) as the rat Type I isozyme (68), and the enzymes differed at only 32 positions in their amino acid sequences. The 12-amino acid hydrophobic stretches at the N-terminus of both hexokinases were identical. The presence of this hydrophobic "tail" in the tumor hexokinase could account, at least in part, for the high level of mitochondrial binding seen with tumor hexokinases. The authors felt that their results could not conclusively identify the mouse hepatoma hexokinase as Type I, II or III.
However, it seems likely that this hepatoma hexokinase is Type I from mouse, with the 32-amino acid difference (only 3.5% of the total 918 residues) due to species variation (mouse vs rat) and not a reflection of the transformation process.

Evolution of the Mammalian Hexokinases

The fact that Types I-III hexokinases are twice the size of both glucokinase and the yeast isozymes has led several investigators to postulate that the mammalian hexokinases evolved through a process of gene duplication and fusion from an ancestral hexokinase similar to the yeast enzymes (23,57,69-72). The similarities in amino acid composition (57,73) and antigenic cross reactivities (41,74) among the various isozymes give support to this theory. This is not an unprecedented postulation since gene duplication and fusion are also suggested for the evolution of rabbit phosphofructokinase (75), glycogen phosphorylase (76), and the β-crystallin protein family (77). However, in contrast to glucokinase and the yeast hexokinases (23), several organisms have 50 kDa hexokinases that are inhibited by Glc-6-P. These organisms include silkworm (78), locust (79) and starfish (80). Recent work (84) supports the current postulation that the starfish hexokinase, not the yeast monomer, may be a direct descendant of an ancestral hexokinase that contained both glucose and Glc-6-P binding sites (Fig. 1). It is possible that duplication and fusion of such an ancestral gene gave rise to the genes encoding the current mammalian hexokinases. The extensive sequence similarities (Table II) of the N- and C-terminal halves of Types I and III hexokinase, to one another and to glucokinase, support this gene duplication and fusion theory. The degree of similarity among the deduced amino acid sequences of Type I hexokinase from human kidney (81), Type I isozyme from rat brain (68), Type III hexokinase (56) and glucokinase from rat liver (32), and a hexokinase from a mouse hepatoma cell line (66) lends additional support to this theory.

Figure 1. Postulated evolution of mammalian hexokinases. The catalytic site is represented by a circle, and the Glc-6-P regulatory site by a square. In this scheme, a 50 kDa ancestral hexokinase, which was not inhibited by Glc-6-P, evolved in two directions, one leading to the present-day yeast hexokinase (not inhibited by Glc-6-P). In the second pathway, a Glc-6-P site evolved on the 50 kDa protein, giving a protein with properties similar to those of the present-day starfish hexokinase. Gene duplication and fusion of the ancestral gene encoding the 50 kDa Glc-6-P sensitive hexokinase would give rise to the present-day 100 kDa mammalian hexokinases (84).

Since the work by Crane and Sols (82), which led to the view of a distinct Glc-6-P regulatory site in the hexokinase molecule, much research has been conducted to define and locate the catalytic and regulatory sites of hexokinase. The first substantive information about the location of the substrate binding sites was provided by Nemat-Gorgani and Wilson (83). Using a photoactivatable ATP analog (8-azido-ATP) and limited tryptic proteolysis, these researchers demonstrated that the substrate nucleotide binding site was in the C-terminal portion of Type I hexokinase. The location of the glucose binding site was also placed in the C-terminal domain of isozyme I. Schirch and Wilson (85) used a radiolabelled glucose analog, N-(bromoacetyl)-D-glucosamine (GlcNBrAc), to modify sulfhydryl groups at the glucose binding site. Analysis of tryptically digested hexokinase, treated with [14C]GlcNBrAc, showed that the C-terminal portion of the molecule had been labelled.
A subsequent set of experiments (93) further defined the location of the regions that had been labelled by the glucose analog. Three tryptic fragments, radiolabelled by GlcNBrAc treatment, were isolated and subjected to amino acid sequence analysis. Two of the fragments were quite similar to yeast sequences (86,87) that are located near the glucose binding site (88,89). The third peptide showed no significant similarity to other published hexokinase sequences. By comparing the two rat Type I peptides to the yeast sequences, Schirch and Wilson further localized at least a portion of the glucose binding site within a 5 kDa segment of the C-terminal half of the enzyme.

Table II. Comparison of Amino Acid Sequences of N- and C-Terminal Halves of Rat Type I and Rat Type III Hexokinases, and Glucokinaseᵃ.

          NIIIᵇ       NI          CIII        CI
NI        39(16)ᶜ
CIII      40(14)      45(15)
CI        38(14)      46(17)      62(11)
IV        38(15)      46(18)      49(15)      49(15)

a-Table adapted from Schwab and Wilson (56).
b-Abbreviations used: NIII, N-terminal half of rat Type III isozyme; NI, N-terminal half of rat Type I isozyme; CIII, C-terminal half of rat Type III isozyme; CI, C-terminal half of rat Type I isozyme; IV, rat Type IV hexokinase (glucokinase).
c-Percent identical residues is shown without parentheses; percent conservative substitutions is shown in parentheses.

Under nondenaturing conditions, limited tryptic digestion of Type I hexokinase yields 3 specific fragments of Mr 10, 40 and 50 kDa (90). However, White and Wilson (91) found that the susceptibility of the Type I isozyme to trypsinization is increased when the enzyme is denatured with 0.6 M GuHCl. Furthermore, the ligand binding domains (N- and C-terminal halves) are selectively stabilized when the ligands, or their analogs, are present during denaturation and proteolysis.
Using this selective protection from proteolysis as an indication of binding of the ligand, White and Wilson (84,91) demonstrated that the allosteric binding site for Glc-6-P was in the N-terminal half of Type I hexokinase. The enzyme was denatured with GuHCl in the presence of analogs of either substrate or inhibitor. When the Glc-6-P analog, 1,5-anhydroglucitol-6-P, was used, the N-terminal half of Type I hexokinase was protected from digestion. Catalytic activity was lost under these conditions. In contrast, the substrate analog, N-acetylglucosamine, protected the C-terminal half from proteolysis, and catalytic function was retained. These results confirmed the previous findings that the catalytic domain was in the C-terminal portion of the molecule, and provided direct evidence for the location of the allosteric regulatory site in the N-terminal half of the Type I isozyme.

Isolation and characterization of cDNA for the Type I isozyme from rat brain, by Schwab and Wilson (67,68), provided the first complete deduced amino acid sequence (918 residues in length) for a rat hexokinase. The sequences for yeast hexokinases A and B had previously been determined by two separate laboratories (86,87). Comparison of the rat Type I protein sequence with those of the yeast isozymes revealed extensive sequence similarities between these hexokinases, and internally between the N- and C-terminal halves of the rat isozyme. Schwab and Wilson (68) proposed a 3-D model for rat Type I hexokinase based on the yeast hexokinase x-ray structure of Steitz and coworkers (88,89,92). Each of the halves of the 100 kDa Type I isozyme was considered to be structurally similar to the yeast enzyme. This model for mammalian hexokinase consists of two large domains, each composed of 2 lobes with a cleft between each lobe.
The cleft in the C-terminal domain, shown to be responsible for catalytic function (84), contains the binding sites for the substrates glucose (85,93) and ATP-Mg2+ (the magnesium chelate of ATP) (83). Regulation of Type I hexokinase has been assigned to the N-terminal domain of the molecule, which is not catalytically active (84) and contains the binding site for the allosteric effector, Glc-6-P (91,94). Considering the high degree of similarity between the deduced amino acid sequences for Types I and III hexokinase (56), it is likely that this same structure-function arrangement is present in the Type III isozyme and, one may anticipate, in Type II hexokinase also.

The isolation and characterization of the Type II cDNA, as part of this project, is another step toward understanding how this protein's structure is related to its catalytic and regulatory properties. Comparison of the deduced amino acid sequences for the Types I, II and III isozymes demonstrates a high degree of similarity. Such sequence conservation clearly suggests that a structure-function arrangement similar to that found in the Type I enzyme is present in all of these isozymes. With the availability of the cDNAs for the hexokinase isozymes, it becomes possible to determine how the variation in the structures of these proteins leads to their distinctive catalytic/regulatory properties.

CHAPTER 2

Materials and Methods

MATERIALS

DNA modifying enzymes were obtained from Boehringer Mannheim Biochemicals (Indianapolis, IN), BRL (Gaithersburg, MD) or New England Biolabs (Beverly, MA). Radioisotopes were purchased from either NEN DuPont (Boston, MA) or Amersham (Arlington Heights, IL). Nitrocellulose from Schleicher and Schuell (Keene, NH) was used for library screenings and blot analyses. The primers used in cDNA synthesis and subsequent manipulations were either obtained from Boehringer Mannheim Biochemicals, Pharmacia (Piscataway, NJ), New England Biolabs, or U.S.
Biochemicals (Cleveland, OH) or synthesized at the Macromolecular Structure, Sequencing, and Synthesis Facility (Michigan State University). DE-52 DEAE-cellulose marketed by Whatman (Clifton, NJ) and Affi-Gel Blue affinity chromatography gel from Bio-Rad Laboratories (Richmond, CA) were used in protein purification procedures. The BCA reagent for protein determination was obtained from Pierce Chemical Co. (Rockford, IL). Oligo (dT) cellulose and the Sequenase sequencing kit were purchased from Boehringer Mannheim Biochemicals and U.S. Biochemicals, respectively. AMV Reverse Transcriptase used in primer extension analysis was purchased from Life Sciences (St. Petersburg, FL). The Uni-ZAP cDNA synthesis kit was obtained from Stratagene (La Jolla, CA). The columns used to purify plasmid DNA for transfection were purchased from Qiagen, Inc. (Chatsworth, CA). U.S. Biochemicals was the source for T1 RNase. All other reagents were obtained from standard commercial sources.

An amplified cDNA library, constructed in λgt10 using mRNA from adult rat soleus muscle, was generously provided by Dr. F. H. Schachat, Duke University Medical Center. Two cDNA libraries, constructed in pUC8 and pUC9 from rat skeletal muscle mRNA, were kindly provided by Dr. D. M. Helfman, Cold Spring Harbor Laboratory. Total RNA isolated from electrically stimulated rat skeletal muscle, which contains elevated levels of Type II hexokinase (29,31), was provided by Dr. D. Pette, University of Konstanz. A rat genomic library, constructed in λCharon 4A, was kindly made available by Dr. Tom Sargent, National Institutes of Health. The Novikoff ascites tumor cell line was obtained from the Fels Research Institute (Philadelphia, PA) with the assistance of Dr. Sidney Weinhouse, and propagated in female Sprague-Dawley rats obtained from Holtzman (Madison, WI).
METHODS

Standard procedures were used for library screening, DNA labeling, restriction enzyme mapping, subcloning, isolation of DNA, and Northern and Southern analyses. Unless otherwise noted, in vitro DNA manipulations were performed as previously described (95).

Hexokinase activity was assayed spectrophotometrically, coupling the hexokinase reaction to NADPH formation using Glc-6-P dehydrogenase (96). Hind limb skeletal muscle was obtained from adult Sprague-Dawley rats of either sex. The tissue was homogenized for 2 min in a Waring blender with 50 mM Tris, 1 mM EDTA, 50 mM Glc, 20 mM TG, pH 7.0 (2 ml buffer per g tissue). After centrifugation for 30 min at 25,000 x g, the pH of the supernatant was adjusted to 7.0 with 0.1 M KOH, and the extracted enzyme was adsorbed batchwise onto DEAE-cellulose equilibrated with homogenization buffer; approx. one ml settled volume of DEAE-cellulose per unit of enzyme was required for complete adsorption of activity. After exhaustive washing, the DEAE-cellulose was poured into a column. The column was washed with 2 column volumes of homogenization buffer, and hexokinase was eluted with a linear gradient of 0-0.4 M KCl. Fractions containing hexokinase activity eluting at 0.15 M-0.25 M KCl were combined, concentrated and dialyzed against the homogenization buffer. The enzyme was then loaded onto a 3.6 cm x 5.5 cm column of Affi-Gel Blue, equilibrated in the homogenization buffer. The column was sequentially washed with homogenization buffer of increasing pH, first 7.5, then 8.0 and finally pH 8.5; the pH 8.5 wash buffer included 20% (v/v) glycerol. In each case, washing was continued until the absorbance at 280 nm declined to a negligible value. Hexokinase was then eluted with 1.5 mM Glc-6-P in the pH 8.5 buffer, essentially as previously described for the Type I isozyme (96).
At this point, the specific activity of the enzyme was about 10 units per mg protein, an approximately 50-fold increase over that in the initial extract. SDS-gel electrophoresis on 6.5%-15% linear acrylamide gradient gels was performed as described previously (90,97). Type II hexokinase, migrating with an apparent mol. wt. of approx. 107 kDa (98), was well resolved from other components, only one of which, with an apparent mol. wt. of 66 kDa, was major. The Type II hexokinase band was excised from the gel, the enzyme electroeluted using a CBS Scientific (Del Mar, CA) Model ELU-40 device, and prepared for sequencing as described by Hunkapiller et al. (99). Sequencing by automated Edman degradation was done by the Macromolecular Structure, Sequencing, and Synthesis Facility (Michigan State University).

Two cDNA libraries were synthesized using the UniZAP II cDNA kit from Stratagene, following the manufacturer's directions. The sources of mRNA for these libraries were rat skeletal muscle and rat epididymal fat pad. Two cDNA libraries with λgt10 as the cloning vector were synthesized using mRNA from either normal or electrically stimulated rat skeletal muscle (29,31). The cDNAs for these libraries (one from normal tissue, the other from stimulated tissue) were synthesized using random hexanucleotide primers at a ratio of 0.5 µg primers per µg mRNA. Five micrograms of mRNA from normal tissue and 1 µg mRNA from stimulated tissue were used. The subsequent synthesis procedures were as described by DeWitt and Smith (100). The prehybridization and hybridization solutions for all library screenings consisted of deionized formamide (37% for medium stringency and 50% for high stringency requirements), 5x SSC, 5x Denhardt's reagent, 0.1% SDS, and 0.1 mg fragmented and denatured salmon sperm DNA per ml. Prehybridization was carried out at 42°C for at least 4 hours, and hybridization was performed at 42°C for at least 8 hours, and usually overnight.
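The composition of the prehybridization and hybridization solutions given above can be turned into pipetting volumes with a short calculation. The following Python sketch is illustrative only: the final concentrations are those stated in the text, whereas every stock concentration (100% formamide, 20x SSC, 50x Denhardt's reagent, 10% SDS, 10 mg/ml salmon sperm DNA) and the function name are assumptions introduced here, not details of the original procedure.

```python
# Final concentrations follow the text; every stock concentration below is an
# assumption chosen for illustration, not a value stated in the Methods.
FORMAMIDE_PERCENT = {"medium": 37.0, "high": 50.0}  # percent (v/v), from the text

# component -> (final concentration, assumed stock concentration), same units
COMPONENTS = {
    "SSC": (5.0, 20.0),                 # 5x final from an assumed 20x stock
    "Denhardt's reagent": (5.0, 50.0),  # 5x final from an assumed 50x stock
    "SDS": (0.1, 10.0),                 # percent; assumed 10% stock
    "salmon sperm DNA": (0.1, 10.0),    # mg/ml; assumed 10 mg/ml stock
}


def hybridization_volumes(total_ml, stringency="high"):
    """Return the volume (ml) of each stock needed for total_ml of solution."""
    # Formamide is assumed to be added undiluted (100% stock).
    volumes = {"formamide": total_ml * FORMAMIDE_PERCENT[stringency] / 100.0}
    for name, (final, stock) in COMPONENTS.items():
        volumes[name] = total_ml * final / stock
    # Water makes up the remaining volume.
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, ml in hybridization_volumes(10.0, "high").items():
        print(f"{name:20s} {ml:5.2f} ml")
```

Under these assumed stocks, the only difference between the medium and high stringency solutions is the exchange of formamide for water; all other components are unchanged.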
The hybridization solutions contained 10‘ cpm of radiolabelled probe per ml. The probes were radiolabelled using random hexanucleotide primers (101), to a specific activity greater than 10‘ cpm/µg. The hybridized filters were washed at room temperature in 2x SSC/0.1% SDS, for two 15 minute periods. The filters were then washed either at 40°C (medium stringency) or 50°C (high stringency) in 0.1x SSC/0.1% SDS for 30 minutes, until no additional counts were removed from the filters. Positive recombinant clones were purified by multiple rescreening under the appropriate conditions.

The λCharon 4A library from Dr. T. Sargent was screened with several different probes from the cDNA for the Type II isozyme, under high stringency conditions. The procedure for these screenings was the same as mentioned in the above section on cDNA library screening. Southern blot analyses of genomic clones digested with EcoRI were used to determine relative placement of these clones with respect to cDNA sequences for Type II hexokinase. Five genomic clones, containing either pertinent sequence information or the largest inserts, were aligned as shown in Fig. 2. The radiolabelled cDNA probes used to isolate and characterize these genomic clones were obtained from the following regions of the Type II cDNAs (given by restriction sites and nucleotide position in the appropriate cDNA): EcoRI-XhoI (5' end of 12-1.3C - 152), EcoRI-BglII (5' end of 12-1.3C - 525), BglII-StyI (525-1137), StyI-EcoRI (1156 - 3' end of 12-1.3C), SphI-EcoRI (307 - 3' end of RG2B), HaeIII-HaeIII (1312-1519) and HaeIII-EcoRI (1698 - 3' end of RG2B). The first 4 cDNA fragments were isolated from cDNA clone 12-1.3C. The remaining 3 restriction fragments were isolated from cDNA clone RG2B. The EcoRI sites used in the above probe isolations were from the linker regions of the cDNA cloning vectors and not part of the cDNA for Type II hexokinase.

Figure 2. Alignment of clones containing genomic DNA for rat Type II hexokinase. In the center is the composite figure which contains the coding region for Type II hexokinase, and the 5' and 3' untranslated regions of the cDNA. Shown above the composite figure are the overlapping cDNA clones, 12-1.3C and RG2B/NK3B. Below the composite drawing are the genomic clones isolated using portions of clones 12-1.3C and RG2B as the probes. The alignment of the genomic clones relative to the composite figure is the result of Southern blots probed with restriction fragments of cDNA clones 12-1.3C and RG2B. The restriction sites used to generate the probes for these blots are designated by number below the cDNA clones. The restriction site positions in the Type II cDNAs are: 1, EcoRI at 5' end of 12-1.3C; 2, XhoI, 152; 3, BglII, 525; 4, StyI, 1137/1156; 5, EcoRI at 3' end of 12-1.3C; 6, SphI, 307; 7-9, HaeIII, 1312, 1519, and 1698, respectively; and 10, EcoRI at 3' end of RG2B/NK3B. Sites 1-5 are in clone 12-1.3C; sites 6-10 are located in clones RG2B/NK3B. Slash marks (//) indicate that these clones are longer than represented, and are not drawn to scale with the composite figure for the enzyme. The dashed line at the 5' end of clone 5G3A indicates sequence upstream of the cDNA clone 12-1.3C, portions of which were used to determine the transcription initiation sites and promoter sequences.

Site-Directed Mutagenesis of Type II Hexokinase: Creation of NcoI Site. The mutagenesis procedure was that of Kunkel (102,103). The 306 bp EcoRI-SphI fragment of clone RG2B (nucleotides 1-306) was directionally subcloned into M13mp19. The recombinant (designated M13N5) was grown in the ung, dut double mutant E. coli strain CJ236.
This strain allows for the incorporation of some uracil residues in thymine positions. Single stranded phage DNA, containing uracil residues, was purified using PEG (20% polyethylene glycol 8000, 2 M NaCl) precipitation (95). The mutation primer, a 17-mer (5'-GGCTGCCATGGTGACGG-3'), was identical to nucleotides 154-170 of clone M13N5 except at the mutated position. This corresponds to nucleotides 1552-1568 of the composite cDNA for Type II (Fig. 5, Results Chapter), with a change from T to C at nucleotide 160, the seventh base of the primer (nucleotide 1558, Fig. 5), generating an NcoI site. This primer was annealed to the template DNA at a ratio of 5 pmol primer to 200 ng DNA in a 10 µl reaction that contained 20 mM Tris-HCl (pH 7.4), 2 mM MgCl2, and 50 mM NaCl. The mutated strand was synthesized by the addition of 0.4 mM each dNTP, 0.75 mM ATP, 2 mM DTT, 1 unit T4 DNA ligase, and 1 unit T4 DNA polymerase. The reaction was incubated at 37°C for 1.5 hr. Approximately 30% of the reaction was used to transform E. coli strain MV1190. This strain has a functional uracil N-glycosylase which inactivates the parental strand containing uracil residues. The non-uracil containing mutated strand remains intact. This becomes an efficient selection process as most of the recombinants will contain the desired mutation. Several recombinants were analyzed for the desired mutation by sequencing the 306 bp fragment in M13mp19. Eighty percent of the recombinants chosen contained the NcoI site.

A full length cDNA for Type II hexokinase, containing the NcoI site, was constructed using restriction endonucleases AccI and ApaI. AccI restricts the N-terminal cDNA clone 12-1.3C at nucleotide 1442; the site for ApaI is at nucleotide 200 of the C-terminal Type II cDNA clone RG2B (nucleotide 1599, Fig. 5). The cDNA inserts were isolated from EcoRI digested λgt10 clones. The isolated cDNA fragments were subsequently restricted with the appropriate endonuclease (AccI in clone 12-1.3C, ApaI in clone RG2B).
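As an illustrative check on the mutagenesis design described above, the following Python sketch confirms that a single T to C change at nucleotide 160 is sufficient to create the NcoI recognition sequence (CCATGG) within the region covered by the 17-mer. The primer is read here as 5'-GGCTGCCATGGTGACGG-3', which is an assumption about the printed sequence; it is consistent with the stated length, with the position of the change, and with the NcoI site. The function and variable names are hypothetical and are not part of the original procedure.

```python
NCOI_SITE = "CCATGG"  # NcoI recognition sequence

# Mutagenic 17-mer as read from the text (an assumption about the printed sequence).
MUTANT_PRIMER = "GGCTGCCATGGTGACGG"
FIRST_NT = 154    # the primer spans nucleotides 154-170 of clone M13N5
MUTATED_NT = 160  # T in the unmutated clone, C in the primer


def substitute(seq, nt_position, base, first_nt=FIRST_NT):
    """Return seq with the base at nt_position (clone numbering) replaced."""
    i = nt_position - first_nt
    return seq[:i] + base + seq[i + 1:]


def site_positions(seq, site, first_nt=FIRST_NT):
    """Return the clone-numbered start positions of every occurrence of site."""
    return [
        first_nt + i
        for i in range(len(seq) - len(site) + 1)
        if seq[i:i + len(site)] == site
    ]


# Sequence of the same region before mutagenesis (T restored at nucleotide 160).
WILD_TYPE = substitute(MUTANT_PRIMER, MUTATED_NT, "T")

if __name__ == "__main__":
    print("wild type:", WILD_TYPE, site_positions(WILD_TYPE, NCOI_SITE))
    print("mutant:   ", MUTANT_PRIMER, site_positions(MUTANT_PRIMER, NCOI_SITE))
```

In this reading the unmutated region contains CTATGG, which is not cleaved by NcoI, and the mutated region contains a single CCATGG beginning at nucleotide 159 of clone M13N5.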
Digestion of the mutated M13 clone (M13N5) with these two restriction enzymes generated a small (157 bp) fragment containing the NcoI site. These 3 fragments (shown below, schematically) were isolated and ligated together. The resulting 3.6 kb fragment, with EcoRI ends and the internal NcoI site, was subcloned into the EcoRI site of pUC18. The resultant recombinant was designated pIIN-RI.

[Schematic: EcoRI-AccI fragment of clone 12-1.3C, AccI-ApaI fragment of M13N5 containing the NcoI site, and ApaI-EcoRI fragment of clone RG2B]

Expression of Type II Hexokinase in COS-1 Cells. The expression vector, pSVT7, and COS-1 cells were provided by Dr. W. L. Smith of this department, with the permission of Dr. J. Sambrook (104). A 3 kb fragment of Type II hexokinase cDNA (from clone pIIN-RI, containing an NcoI site), from the EcoRI site at the 5' end to the KpnI site at nucleotide 3082 at the 3' end, was directionally subcloned into the EcoRI-KpnI sites of pUC18. This 3 kb fragment included the entire coding region for the Type II isozyme, along with 196 bases upstream of the start codon, and 146 nucleotides downstream of the end of the coding region. The cDNA was excised from the pUC18 recombinant using restriction enzymes EcoRI and PstI, sites within the polylinker of pUC18. This fragment was directionally subcloned into pSVT7, previously digested with the same enzymes. This aligned the coding region for Type II hexokinase properly with respect to the SV40 origin of replication and early promoter in pSVT7. The recombinant plasmid (designated pSVT7-HKII), containing Type II hexokinase as determined by restriction analysis, was grown in DH5α cells and purified over QIAGEN columns according to the manufacturer's directions. Quantities of the plasmid, pSVT7, were also purified in the same manner.

COS-1 cells were grown in Dulbecco's modified Eagle's medium (high glucose) supplemented with 8% bovine calf serum, 2% fetal calf serum, and 2 mM glutamine. Cell cultures used for transfection were 80-90% confluent.
The COS-1 cells were transfected, as previously described (105), using 36 µg DNA and 750 µg DEAE-dextran per plate of cells. Sham (no DNA added) and pSVT7 (vector only) transfected cultures served as controls. Cells, harvested 42 hrs after transfection, were resuspended in PBS (0.045 M potassium phosphate, 0.15 M NaCl, pH 7.3) containing 10 mM glucose and 10 mM TG. The resuspended cells were sonicated and centrifuged at 15,000 x g for 15 min. Hexokinase activity in the supernatants was determined spectrophotometrically, as previously described (96). Protein concentration was determined with the BCA reagent using BSA as the standard.

Aliquots of the homogenates were electrophoresed on 6.5-20% SDS-polyacrylamide gels. Purified Type I hexokinase was used as a positive control for the anti-Type I antibody reactions. Also included on the gel was a sample from a crude homogenate of rat skeletal muscle. The crude sample was prepared by initially grinding 2 gm of tissue under liquid N2, followed by homogenization in 50 mM sodium phosphate (pH 8.0), 1 mM Glc, 10 mM TG, and 1% Triton X-100 (4 ml per gm of tissue). The crude homogenate was centrifuged at 30,000 rpm for 1 hr. The clear supernatant was stored at -20°C after the protein concentration was determined as above. The separated proteins were transferred to nitrocellulose in a carbonate buffer system (106). The protein blots were reacted with either pre-immune sera or anti-hexokinase I polyclonal antibodies, at a dilution of 1:500 in TBS (50 mM Tris-HCl, 154 mM NaCl, 0.05% (v/v) Tween 20, pH 7.5), as previously described (90), using 5% nonfat dry milk as the blocking reagent. The blots were developed with the tetrazolium method of Taketa et al. (107).

Skeletal muscle RNA was prepared from hind limb muscles of 4-7 week old Sprague-Dawley rats using the method of Chomczynski and Sacchi (108) except that, prior to homogenization, the tissue was ground to a fine powder under liquid N2.
Novikoff ascites tumor RNA was prepared using the method of Chirgwin et al. (109). RNA preparations were redissolved in water to give approx. 1 mg/ml. Polyadenylated mRNA was isolated from total RNA by chromatography on oligo (dT) cellulose as described by Maniatis et al. (95). Northern blotting was done by standard procedures, essentially as described by Maniatis et al. (95). Duplicate blots containing 3 µg of Novikoff mRNA and 20 µg of rat skeletal muscle mRNA were air dried, and vacuum baked for 1 hr. After prehybridization in 5x SSC, 0.1% SDS, 5x Denhardt's, and 50% formamide at 42°C for at least 1 hr, the blots were hybridized overnight using cDNA probes radiolabelled with [α-32P]dCTP by random primer synthesis (101) to a specific activity greater than 10‘ cpm/µg. Two non-overlapping cDNA probes from Type II hexokinase were used, representing N-terminal (nucleotides 1-1562, Fig. 5) and C-terminal (nucleotides 1705-3082, Fig. 5) regions of the enzyme. After hybridization the blots were washed sequentially in: 2x SSC, 0.1% SDS at room temperature for 15 minutes; 2x SSC, 0.1% SDS at 60°C for 15 minutes (twice); and 2x SSC, at room temperature, 15 minutes. Hybridized bands were visualized autoradiographically. The size of the mRNA bands was estimated by extrapolation between the 28S (4.72 kb) and 18S (1.87 kb) ribosomal RNA bands.

A 24-base oligonucleotide (5'-GCTTAACCACGATGGCTCACCAGC-3'), complementary to nucleotides 34-58 of Type II hexokinase mRNA, was synthesized and 5'-end-labelled with [γ-32P]ATP (6000 Ci/mmol) and T4 polynucleotide kinase. The primer (1x10° cpm) was annealed in a total of 15 µl to either 3 µg Novikoff tumor poly(A)+ RNA or 20 µg rat skeletal muscle poly(A)+ RNA. The annealing buffer contained 0.15 M KCl, 10 mM Tris-HCl (pH 8.3), and 1 mM EDTA. After hybridization for 1.5 hr at 62°C, the elongation reagents were added to the reaction mixtures containing the RNA and annealed primer.
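The size estimate for mRNA bands on the Northern blots described above, based on the 28S and 18S ribosomal RNA bands, can be sketched as a two-point calibration. The Python example below is a minimal sketch that assumes a linear relationship between log10(size) and migration distance; the text states only that the rRNA bands were used as references, so this calibration model, the function name, and the migration distances in the usage example are all hypothetical.

```python
import math

# Marker sizes (kb) taken from the text.
SIZE_28S_KB = 4.72
SIZE_18S_KB = 1.87


def estimate_size_kb(distance, distance_28s, distance_18s):
    """Estimate RNA size (kb) from migration distance.

    Assumes log10(size) varies linearly with migration distance and fits the
    line through the 28S and 18S rRNA markers. Distances may be in any unit,
    provided the same unit is used for all three arguments.
    """
    log_28s = math.log10(SIZE_28S_KB)
    log_18s = math.log10(SIZE_18S_KB)
    slope = (log_18s - log_28s) / (distance_18s - distance_28s)
    return 10 ** (log_28s + slope * (distance - distance_28s))


if __name__ == "__main__":
    # Hypothetical migration distances (mm) measured from the well.
    d_28s, d_18s = 30.0, 60.0
    for band in (25.0, 45.0):
        size = estimate_size_kb(band, d_28s, d_18s)
        print(f"band at {band:.0f} mm: approx. {size:.2f} kb")
```

A band that migrates a shorter distance than the 28S marker falls outside the calibrated range, so its size is an extrapolation and carries correspondingly more uncertainty than a band lying between the two markers.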
The 40 µl elongation reaction contained 20 mM Tris-HCl (pH 8.3), 10 mM MgCl2, 6 mM DTT, 0.3 mM each dNTP, actinomycin D (150 µg/ml) and 5 units of AMV reverse transcriptase. The reaction was incubated at 42°C for 1 hr and then treated with RNase A (15 µg/ml) for 15 min. The products were fractionated on 5% or 9% polyacrylamide denaturing gels and visualized by autoradiography.

Single-stranded DNA, generated from a 630 bp (SmaI-SmaI) fragment of genomic clone 5G3A that had been subcloned into M13mp19, was annealed to the same 24-base primer used in the primer-extension analysis, which had been 5'-end-labelled with [γ-32P]ATP (6000 Ci/mmol) to a specific activity of approximately 10’ cpm per µg. The primer was extended in the presence of 0.4 mM each dNTP and the Klenow fragment of DNA polymerase I. The double-stranded DNA was digested with SalI, and the strands were separated on a 3.5% polyacrylamide/7 M urea gel. The isolated 649-nucleotide fragment (5 x 10’ cpm) was hybridized to either 3 µg Novikoff tumor poly(A)+ RNA or 20 µg rat muscle poly(A)+ RNA in a 10 µl reaction containing 0.4 M NaCl and 10 mM Pipes (pH 6.4). The reaction was heated at 70°C for 5 min and incubated at 65°C for at least 6 hrs. After hybridization, 300 µl of a solution containing 0.2 M NaCl, 2 mM ZnSO4, 20 mM sodium acetate (pH 5.0), and 200 units of S1 nuclease were added and the mixture was incubated at 37°C for 1 hr. The digestion by S1 nuclease was terminated with the addition of 80 µl of a solution containing 4 M ammonium acetate, 50 mM EDTA. Following phenol/chloroform extraction and ethanol precipitation, the resultant products were fractionated on 5% or 9% polyacrylamide/7 M urea gels and visualized by autoradiography.

Cell-Free in vitro Transcription Assay. The "G-free cassette" plasmids, p(C2AT)19 (110) and pML-GFC2 (111), and the rat liver nuclear proteins (prepared according to Gorski et al. (112)) were generously provided by Dr. D. Jump (Physiology Dept., Michigan State University). The plasmids were used with the permission of Dr. R. G. Roeder (Rockefeller University) (110). These plasmids contain a synthetic DNA fragment, subcloned into the SacI-SmaI sites of pUC13, that generates a discrete G-free RNA product when transcribed in the absence of GTP. The length of the synthetic fragment in p(C2AT)19 is approx. 380 nucleotides. The plasmid, pML-GFC2, contains the same synthetic fragment shortened to approx. 280 nucleotides, which is under the control of the adenovirus-2 major late promoter. Inclusion of 3'-O-methyl-GTP hinders transcription of any promoter-like sequences in the vector and T1 RNase degrades spurious transcripts containing G residues.

The Type II hexokinase promoter is numbered relative to a transcription start site of +1. The plasmid p(C2AT)-HKII contains the Type II hexokinase promoter from positions -260 to +12 upstream of the "G-free cassette" of p(C2AT)19. This section of the promoter region was isolated using endonucleases SmaI and DdeI (5'-3') and blunt ends were generated using T4 polymerase. After the addition of EcoRI linkers, the promoter fragment was subcloned into the EcoRI site at the 5' end of the "G-free cassette" of p(C2AT)19 and grown in E. coli strain JM105. Restriction analysis was used to determine orientation of the promoter fragment in several recombinants. Recombinants with this promoter fragment in both orientations were purified using QIAGEN columns. These purified templates were used in the in vitro transcription assay system, essentially as described by MacDougald and Jump (111).

Using the p(C2AT)-HKII constructs, the ability of a portion of the Type II hexokinase promoter to direct transcription was determined. The transcription reactions (40 µl) contained 25 mM Hepes (pH 7.5), 5% glycerol, 50 mM KCl, 6 mM MgCl2, 0.6 mM ATP and CTP, 0.03 mM UTP, 15 µCi [α-32P]UTP, 0.1 mM 3'-O-methyl GTP, 15 units of T1 RNase, and 60 µg of rat liver nuclear proteins.
Two micrograms of each p(C2AT)-HKII construct was included in a reaction along with 50 ng of pML-GFC2. As a negative control, one transcription assay included 2 µg of the original plasmid, p(C2AT)19, containing no promoter, along with 50 ng of pML-GFC2. The plasmid pML-GFC2 was included in each assay as an internal positive control. The transcription reactions were incubated at 30°C for 90 min. The reactions were stopped by the addition of 380 µl of 50 mM Tris-Cl (pH 7.5), 1% SDS and 5 mM EDTA. To aid in the precipitation of the transcription products, 40 µg of yeast tRNA were added. The reactions were extracted 3 times with phenol/CHCl3 (equilibrated with 10 mM NaOAc (pH 5.0), 100 mM NaCl, 1 mM EDTA). After ethanol precipitation, the transcription products were fractionated on a 6% denaturing gel and visualized by autoradiography.

CHAPTER 3

Results

Isolation of Clones Containing cDNA for Rat Type II Hexokinase. Approximately 250,000 recombinants of an amplified λgt10 rat soleus muscle cDNA library (Library 1, Table III) were screened under medium stringency conditions with a partial cDNA for rat Type I hexokinase, previously described and designated HKI 12.4-4 (67), as the radiolabelled probe. Two positive recombinants, termed 12-1.3C and 15-1, were isolated. Both contained 1.6 kb inserts, and partial sequence analysis and HaeIII digestion patterns (Fig. 3, lanes 3 and 4) indicated that these inserts were identical. Thus, further analysis was restricted to clone 12-1.3C. The restriction map and sequencing strategy for 12-1.3C are shown in Fig. 4. The nucleotide sequence of 12-1.3C represents nucleotides 1-1562 of the sequence shown in Fig. 5. There is an open reading frame that extends from nucleotide 197 to the 3' end of the clone at nucleotide 1562. The deduced amino acid sequence shows extensive similarity to that of the rat Type I (68), Type III (56), and Type IV (32) isozymes, yet is clearly distinct from these.
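The coordinates just given can be checked with a few lines of arithmetic. The Python sketch below counts how many complete codons of the open reading frame fall within clone 12-1.3C. The positions 197 and 1562 come from the text and the 917-residue total from the abstract; the function name `orf_coverage` is only illustrative, and the calculation assumes that nucleotide 197 is the first base of the initiation codon.

```python
def orf_coverage(orf_start, clone_end, total_residues=None):
    """Summarize how much of an open reading frame a partial clone covers.

    Coordinates are 1-based and inclusive, as in the composite sequence.
    Assumption (not stated explicitly in the text): orf_start is the first
    base of the initiation codon.
    """
    n_nt = clone_end - orf_start + 1          # coding nucleotides in the clone
    codons, leftover = divmod(n_nt, 3)        # complete codons and spare bases
    fraction = codons / total_residues if total_residues else None
    return n_nt, codons, leftover, fraction


n_nt, codons, leftover, fraction = orf_coverage(197, 1562, total_residues=917)
print(f"{n_nt} nt of coding sequence = {codons} complete codons + {leftover} nt")
print(f"fraction of the 917-residue enzyme covered: {fraction:.2f}")
```

Under these assumptions the clone carries 455 complete codons, roughly half of the full-length enzyme, which fits the conclusion that 12-1.3C encodes only the N-terminal region.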
An 8-residue segment of the deduced sequence (underlined in Fig. 5) is identical to the sequence determined by direct N-terminal sequencing of the Type II isozyme, prepared as described in Methods: Phe-Thr-Glu-Leu-Asn-Gln-Asn-Gln. The N-terminal sequence for Type II hexokinase, purified from rat skeletal muscle by a different procedure, is Glu-Leu-Asn-Gln-Asn-Gln-Val-Gln-Lys-Val-Asp-Gln-Phe-Leu-Tyr-X-Met-Arg-Val (P. Fischer, F. E. Weber, K. Beyreuther, and D. Pette, personal communication). It is evident that residues 3-8 in the sequence determined from

Table III. Summary of cDNA library synthesis/screening strategies and results.

Library  Vector    Primer            Tissue     Probe (stringency)a  Positive clones
1        λgt10                       muscle     HKI-C (1)            2 clones, 1.6 kb (12-1.3, 15)
                                                12-1.3C (2)          39 ≡ 12-1.3C
                                                NK3B (2)             6 ≡ 12-1.3C
2        λgt10     dT                liver      12-1.3C (2)          6 ≡ 12-1.3C
3        pUC8/9    dT, directional   muscle     12-1.3C (2)          3 clones, all within 12-1.3C
4        UniZAPII  dT-XhoI           muscle     12-1.3C (2)          no positives
5        UniZAPII  dT-XhoI           fat pad    12-1.3C (2)          no positives
6        UniZAPII  dT-XhoI           kidney     12-1.3C (2)          11 clones, 2.3 kb (e); 6 clones, 0.4 kb (d, e)
7        UniZAPII  dT-XhoI           spleen     12-1.3C (2)          4 clones, 2.3 kb (e); 1 clone, 0.4 kb (d, e)
8        λgt10     random            muscle (c) 12-1.3C (3') (2)     3 clones, largest RG2B (2.3 kb); other two contained within RG2B
9        λgt10     random            muscle     12-1.3C (3') (2)     1 ≡ RG2B
10       λgt10     dT                tumor      12-1.3C (2)          2 clones: largest 2.3 kb, NK3B ≡ RG2B

a. Stringency determined by temperature and percent formamide: (1) medium, 42°C and 37%; (2) high, 42°C and 50%.
b. ≡, identical as determined by HaeIII restriction digest patterns.
c. RNA from stimulated muscle tissue, a generous gift of Dr. Pette.
d. Identical to sequence at 5' end of 12-1.3C.
e. Each clone also contained an identical 3.4 kb EcoRI-XhoI fragment.

Figure 3. HaeIII restriction patterns of cDNA clones. Shown is an EtBr-stained 5% polyacrylamide gel containing cDNA restricted with the endonuclease HaeIII. Lane 1 contains pBR digested with HaeIII, which was used as size markers. The sizes (in bp) of several bands in lane 1 are marked at the left.
Lane 2 shows the HaeIII restriction pattern of the insert of cDNA clone HKI-12.4-4. Lanes 3 and 4 contain cDNA from clones 12-1.3C and 15-1 (respectively), restricted with HaeIII. Clone HKI-12.4-4 was the cDNA probe used to isolate clones 12-1.3C and 15-1.

FIGURE 3 (gel image; marker sizes indicated at left: 587, 434, 267, 184, 124 and 64 bp)

Figure 4. Relevant restriction sites and sequencing strategy for clones containing either cDNA or genomic DNA for rat Type II hexokinase. The composite sequence, which includes the coding region and 5' and 3' untranslated regions, is shown near the top of the figure. Depicted above it are the portions of genomic clone 3G3A that were sequenced for verification of the overlapping cDNA sequences. The solid lines in the genomic clone represent exons and the dashed lines are introns. The dotted lines from the genomic clone to the composite figure show the positions of the three exons in the composite sequence. Shown below the composite figure are the cDNA clones 12-1.3C, RG2B and NK3B (see text). Restriction sites used in sequencing and the direction of sequencing are shown for each clone. Restriction site abbreviations are: A, AvaII; B, BglII; C, HincII; D, HindIII; E, HaeIII; H, thl; M, SmaI; P, PstI; R, EcoRI; S, SphI; V, PvuII.

FIGURE 4

Figure 5. Nucleotide and deduced amino acid sequences for rat Type II hexokinase. The underlined region in the amino acid sequence corresponds to the sequence obtained by direct N-terminal sequencing of the enzyme (see text for further comments in this regard). The dashed line above the nucleotide sequence indicates the region of overlap between the cDNA clone 12-1.3C, coding for the N-terminal half of the enzyme, and clone RG2B, which encodes the C-terminal half.
[Figure 5 sequence listing: nucleotide sequence with the deduced amino acid sequence beneath; the listing is not legible in this copy.]
FIGURE 5

enzyme purified in our laboratory (underlined in Fig. 5) correspond to the initial portion of the latter sequence. Furthermore, the sequence found by Fischer and colleagues matches the deduced sequence (Fig. 5) exactly, with the exception that the C-terminal Val in the sequence of Fischer et al. is a Leu in the deduced sequence. The obvious heterogeneity at the N-terminus of the enzyme purified in these two different laboratories, together with the finding of Fischer et al. (personal communication) that approx. 80% of the enzyme in their preparation lacked the first two residues (Glu-Leu), undoubtedly reflects artifactual proteolytic modification during purification of the enzyme, previously shown to occur with the Type I isozyme (54,113,114).

The above observations made it evident that clone 12-1.3C represented only the N-terminal region of the rat Type II isozyme. Exhaustive rescreening of the cDNA library (Library 1, Table III) from which 12-1.3C had been derived failed to provide additional clones containing sequence for the C-terminal half of the enzyme. Several other libraries, No. 2-9 of Table III, were obtained from other investigators or were synthesized using RNA from tissues known to contain the Type II enzyme. Each library was screened with the Type II hexokinase cDNA clone 12-1.3C in order to identify clones containing sequence for the remaining portion of the Type II isozyme. Extensive screening of a rat liver cDNA library (No. 2, Table III) generated six clones identical to clone 12-1.3C as determined by size, HaeIII digestion pattern, and partial sequence analysis. The plasmid libraries (No.
3, Table III) produced only 3 clones whose inserts (500-800 bases in length) were completely contained within the sequence of 12-1.3C. The muscle and epididymal fat pad libraries (No. 4-5, Table III), using UniZAPII as the cloning vector, produced no positive recombinants, which placed the quality of these libraries in question. Screening of the other two libraries constructed in UniZAPII (No. 6 and 7, Table III) yielded clones containing inserts of either 3.8 kb or 5.7 kb in length. Both sets of clones contained a 3.4 kb fragment upon double digestion with restriction enzymes EcoRI and XhoI, and also either a 0.4 kb or a 2.3 kb fragment. The 0.4 kb fragment was identical to the 400 bases at the 5' end of 12-1.3C, as determined by sequence analysis. The 2.3 kb fragment was distinct (by size and HaeIII digestion pattern) from any other clones investigated to date. However, Southern blot and partial sequence analyses indicated no homology between the 3.4 kb fragment and any known rat hexokinase cDNA. Since the recombinants from both the kidney and spleen libraries appeared to contain at least some cloning artifact(s), i.e. the unidentified 3.4 kb fragment, the search for the cDNA encoding the C-terminal portion of Type II hexokinase focused on other libraries.

Therefore, two random-primed cDNA libraries were constructed in λgt10, following the procedure of DeWitt and Smith (100) and using mRNA isolated from either chronically-stimulated rat skeletal muscle, in which the level of Type II hexokinase is elevated more than 10-fold (29,31), or normal muscle tissue. If the level of the Type II isozyme in stimulated muscle tissue reflected transcriptional regulation, elevated levels of Type II hexokinase mRNA would also be expected. These two unamplified libraries were screened under high stringency conditions using a 407 bp fragment from the 3' end of clone 12-1.3C as probe.
One recombinant (2.3 kb in length and designated RM5) was isolated from the library synthesized using mRNA from normal tissue (No. 8, Table III). Three recombinants, designated RG2A, RG2B and RG2C, were isolated from the library constructed using mRNA from stimulated tissue (No. 9, Table III). Restriction mapping indicated that clones RG2A and RG2C were contained entirely within clone RG2B. HaeIII digestion patterns (Fig. 6, lanes 2 and 3) and partial sequence analysis indicated that clones RM5 and RG2B were identical. Therefore, further studies concentrated on clone RG2B.

Clone RG2B contained an insert of 2236 bp. The restriction map and sequencing strategy are indicated in Fig. 4; the sequence of RG2B corresponds to nucleotides 1399 to 3634 in Fig. 5. The first 164 nucleotides at the 5' end of RG2B were identical in sequence to the 164 residues at the 3' end of clone 12-1.3C. A single open reading frame extended for 1548 nucleotides, followed by a TAG termination codon and a 3' untranslated region of 687 bases. Overlapping of the sequences from clones 12-1.3C and RG2B produced the composite nucleotide sequence shown in Fig. 5, from which the complete amino acid sequence (917 residues) of rat Type II hexokinase was deduced.

In order to (1) confirm the overlap of the cDNA clones, (2) identify the 5' end of the message for Type II hexokinase, and (3) characterize its gene, a λCharon 4A rat genomic library was probed with three unique restriction fragments of the cDNA for this isozyme. A

Figure 6. HaeIII restriction patterns of cDNA clones for Type II hexokinase (C-terminus). Shown is an EtBr-stained 5% polyacrylamide gel containing cDNA restricted with the endonuclease HaeIII. Lane 1 contains pBR, digested with HaeIII, which was used as size markers. The lengths (in bp) of several fragments in lane 1 are shown to the left. Lanes 2 and 3 contain HaeIII-restricted cDNA from clones RM5 and RG2B.
Lane 4 shows the HaeIII restriction pattern for the cDNA from clone 12-1.3C, which was used as the probe in the isolation of clones RM5 and RG2B.

FIGURE 6 (gel image, lanes 1-4; marker sizes indicated at left: 587, 434, 267, 184, 124 and 64 bp)

diagram of Type II cDNA clones 12-1.3C and RG2B, with relevant restriction sites indicated, is shown below. The restriction sites are: 1, 5, 10, EcoRI; 2, XhoI; 3, BglII; 4, StyI; 6, SphI; 7-9, HaeIII. In the following results, these restriction sites are referenced in parentheses.

[Schematic: clones 12-1.3C (sites 1-5) and RG2B (sites 5-10), drawn 5' to 3' with the numbered restriction sites marked.]

When the genomic library was probed with the 407 bp 3' region StyI-EcoRI fragment (4-5) of clone 12-1.3C, two recombinants designated 3G3A and 3G5A were isolated. Further analysis was performed only on clone 3G3A, since the two clones appeared to be identical by restriction analysis. Eight recombinants, whose insert sizes ranged from 9 kb to 20 kb, were isolated when this same genomic library was probed with a 526 bp EcoRI-BglII (1-3) fragment from the 5' end of clone 12-1.3C. These clones were designated 5G2A, 5G3A, 5G4B, 5G4D, 5G5A, 5G5B, 5G6A, and 5G6C. Screening of the genomic library with the 3' 1928 bp SphI-EcoRI fragment (6-10) of cDNA clone RG2B as the probe resulted in one additional recombinant, designated TG3A.

Southern blots of all 10 recombinants isolated with the three cDNA probes were used to align the genomic clones within the cDNA for Type II hexokinase. The probes used in the Southern analyses were restriction fragments of cDNA clones 12-1.3C and RG2B. Autoradiographs of these Southern blots are shown in Fig. 7 and 8. Each blot in Fig. 7 contained DNA from the 10 genomic clones restricted with EcoRI, fractionated on 0.8% agarose gels and transferred to nitrocellulose. Panel A, Fig. 7, was hybridized to the 5' 526 bp EcoRI-BglII (1-3) fragment of cDNA clone 12-1.3C. Eight of the ten genomic clones reacted with this probe to varying degrees. Only clone TG3A, in Fig.
7B, did not react with the BglII-StyI (3-4) fragment of clone 12-1.3C (nucleotides 525-1137, Fig. 5). Clones 3G3A, 5G4B and 5G4D (Fig. 7C) reacted most intensely with the 3' 407 bp StyI-EcoRI fragment (4-5) of clone 12-1.3C, while clones 5G2A and 5G5A demonstrated some homology to the probe. As shown in Fig. 7D, clones TG3A and 5G4D gave the strongest signal when hybridized to the 3' 1928 bp fragment (6-10) of cDNA clone RG2B. Genomic clone 3G3A also showed some crossreactivity to this probe.

The results of additional Southern blots used to further define the results of Fig. 7 are shown in Fig. 8. In panel A (Fig. 8), which was reacted with the 5' 153 bp EcoRI-XhoI portion of clone 12-1.3C, genomic clone 5G3A (lane 3) hybridized strongly to the probe; clone 5G4B (lane 2) gave a slight signal, indicating little sequence in common with the probe. Genomic clone 5G6C (lane 1) gave no signal. Panel B (Fig. 8), containing clones TG3A, 3G3A, and 5G4D (lanes 4-6, respectively), was reacted with a 207-nucleotide HaeIII-HaeIII fragment (7-8) of clone RG2B (near the 3' end of the open reading frame in the Type II cDNA). Clone TG3A gave the strongest signal, with clone 5G4D showing some crossreactivity and clone 3G3A showing no sequence similarity. Only clone TG3A (Fig. 8, Panel C, lane 9) gave a signal when probed with the 3' 543 bp HaeIII-EcoRI fragment (9-10)

Figure 7. Southern blot analyses of ten clones containing genomic DNA for Type II hexokinase. Blots A-D are identical, each containing EcoRI-restricted DNA from 10 genomic clones isolated using Type II cDNA as the probe. The genomic clones, with their respective lane designations, are: 3G3A (1), TG3A (2), 5G2A (3), 5G3A (4), 5G4B (5), 5G4D (6), 5G5A (7), 5G5B (8), 5G6A (9), and 5G6C (10).
The radiolabelled probes used for each high stringency hybridization were from the following regions of two cDNA clones: Blot A, 5' 526 bp EcoRI-BglII (1-3) fragment of clone 12-1.3C; Blot B, 612 bp BglII-StyI (3-4) fragment of clone 12-1.3C; Blot C, 3' 407 bp StyI-EcoRI (4-5) fragment of clone 12-1.3C; Blot D, 3' 1928 bp SphI-EcoRI (6-10) fragment of clone RG2B. The numbers in parentheses refer to restriction sites in the schematic diagram of clones 12-1.3C and RG2B shown in the text. Shown to the left of Blots A and C are the sizes (in kb) of several fragments of λ bacteriophage DNA digested with HindIII, used as size markers.

FIGURE 7 (four autoradiographs, lanes 1-10 each; legible size markers at left: 23, 6.6 and 2.0 kb)

Figure 8. Southern analysis of clones containing 5' and 3' genomic sequences for Type II hexokinase. Blots A-C contain DNA from genomic clones restricted with EcoRI, electrophoresed in 0.8% agarose gels, and transferred to nitrocellulose. Each blot was probed under high stringency with radiolabelled cDNA fragments from clones 12-1.3C and RG2B. Blot A, containing DNA from clones 5G6C, 5G4B and 5G3A in lanes 1-3, respectively, was probed with the 5' 153 bp EcoRI-XhoI (1-2) fragment of clone 12-1.3C. The restricted DNA from clones TG3A, 3G3A, and 5G4D (lanes 4-6 of Blot B, respectively) was hybridized with a 207 bp HaeIII-HaeIII (7-8) fragment of clone RG2B. Restricted DNA from clones 3G3A, 5G4D, and TG3A (Blot C, lanes 7-9, respectively) was probed with the 3' 543 bp HaeIII-EcoRI (9-10) fragment of clone RG2B. The numbers in parentheses refer to restriction sites in the schematic diagram of clones 12-1.3C and RG2B shown in the text. Indicated to the left of Blot A are the sizes (in kb) of several fragments of λ bacteriophage DNA digested with HindIII.

FIGURE 8 (autoradiographs; legible size markers at left: 23 and 9.4 kb)

of cDNA clone RG2B.
Genomic clones 3G3A and 5G4D (Panel C, lanes 7 and 8, respectively) demonstrated no sequence similarity with the probe. The signals seen in the 23 kb region of each blot in Figs. 7 and 8 are likely due to incomplete digestion of the genomic DNA, as the cloning vector Charon 4A accepts inserts totalling 20 kb or less.

The results from the aforementioned Southern analyses were used to align the five clones containing pertinent sequence information or the largest unique inserts, as shown in Fig. 9. Each clone spans those portions of cDNA that gave positive results on the Southern blots. The 5' and 3' endpoints of these clones have been positioned within the regions of cDNA that hybridized to these clones. These positions are only approximate and have not been further defined by sequence analysis. The approximate insert sizes for the clones represented in Fig. 9 are: 5G3A, 15 kb; 5G4B, 16 kb; 5G4D, 20 kb; 3G3A, 15 kb; TG3A, 10 kb. From inspection of this alignment, it is likely that the gene encoding Type II hexokinase is at least 35 kb in length.

Of the 10 genomic clones isolated, only clone 5G3A reacted to a significant degree on Southern blots with the 5' EcoRI-XhoI 153 bp fragment (1-2) of cDNA clone 12-1.3C (Fig. 8A, lane 3). Thus, this clone, containing a 15 kb insert, was further analyzed for additional 5' message and promoter sequences for Type II hexokinase (discussed below). Sequence and restriction map analyses indicate that clone 5G3A contains approx. 8 kb of nucleotide information upstream from the 5' end of the cDNA clone 12-1.3C. A partial restriction map of clone 5G3A is shown in the

Figure 9. Alignment of relevant clones containing genomic DNA for rat Type II hexokinase. Shown is the composite figure which contains the coding region for Type II hexokinase, and the 5' and 3' untranslated regions of the cDNA. Below this composite drawing are five genomic clones isolated and aligned using portions of clones 12-1.3C and RG2B as the probes (see Fig.
2, Material and Methods Chapter). The approximate size (in kb) is indicated above each genomic clone. The slash marks (//) indicate that the genomic clones are longer than represented, and are not drawn to scale with the composite figure for Type II hexokinase. The dashed line at the 5' end of clone 5G3A indicates sequence upstream from the 5' end of cDNA clone 12-1.3C. Portions of this region of clone 5G3A were analyzed for the transcription initiation sites and promoter sequences for Type II hexokinase.

FIGURE 9

schematic below. The restriction site abbreviations are: B, BamHI; E, EcoRI; H, HindIII; S, SmaI. The arrow above the SmaI fragment delineates the portion of this clone that was analyzed for promoter sequences. The arrow points in the direction upstream from the 5' end of the cDNA for Type II hexokinase. The dotted line indicates cDNA sequences. The dashed lines represent λCharon 4A vector arms.

[Schematic: partial restriction map of genomic clone 5G3A.]

Confirmation of the overlap of the cDNA clones was obtained by isolation of a genomic DNA clone (3G3A) containing segments that span the region of overlap. After restriction with various endonucleases, Southern blots of genomic clone 3G3A were probed under high stringency conditions (as above) to identify fragments including sequences found in the 3' region of clone 12-1.3C. The results of these Southern blots are shown in Fig. 10. Panel A contains EcoRI-digested DNA from clone 3G3A. Hybridization with the 407 bp fragment from the 3' end of cDNA clone 12-1.3C resulted in signals from the 1.6 kb and 6.2 kb fragments of the clone. The 6.2 kb fragment was isolated and restricted with several endonucleases. A Southern blot of these digests, probed with the 3' end of clone 12-1.3C, gave the results shown in Fig. 10B. The 1.6 kb EcoRI fragment of clone 3G3A and the 2 kb SphI fragment
Figure 10. Southern analysis of genomic clone 3G3A. Genomic clone 3G3A (or portions thereof) was restricted with endonucleases, electrophoresed in a 0.8% agarose gel, and transferred to nitrocellulose. Blot A, containing clone 3G3A digested with EcoRI, was probed with the 3' 407 bp StyI-EcoRI fragment of cDNA clone 12-1.3C. Lanes 1-10 of Blot B contain DNA from the 6.6 kb fragment of genomic clone 3G3A, restricted with various endonucleases. The restriction enzymes used to digest DNA from the 6.6 kb fragment in each lane were: 1, SacII; 2, SacI; 3, SmaI; 4, HindIII; 5, BamHI; 6, BglII; 7, HincII; 8, SphI; 9, PvuII; and 10, PstI. Blot B was also hybridized to the 3' 407 bp fragment of 12-1.3C, mentioned above. Indicated to the left of both blots are the sizes (in kb) of several fragments of λ bacteriophage DNA digested with HindIII.

FIGURE 10

from the 6.2 kb EcoRI fragment of clone 3G3A (lane 8, Fig. 10B) were subcloned into M13 (mp18 and mp19). Initial sequence analysis located 3 regions (one at the 3' end of the 1.6 kb fragment, and the 5' and 3' ends of the 2 kb SphI fragment) containing sequences identical to clone 12-1.3C. Those 3 regions were sequenced as indicated in Fig. 4. These included coding regions spanning the section from nucleotides 1072 to 1705 (Fig. 5). Two introns divided this sequence into segments comprising nucleotides 1072-1227, 1228-1461, and 1462-1705.

Isolation of a Partial cDNA Clone for Novikoff Tumor Hexokinase. An unamplified cDNA library (No. 10, Table III) was prepared in λgt10, using oligo-dT as primer and mRNA isolated from Novikoff ascites tumor cells. Screening of this library under high stringency conditions, using the 407 bp fragment representing the 3' region of clone 12-1.3C as probe, produced three positive recombinants. These were designated NK3B, having an insert of approx. 2300 bp, and NK3A and NK3C, each with inserts of approx. 500 bp.
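The coordinates reported above for the two cDNA clones and for the three coding segments of the genomic sequence lend themselves to a simple consistency check. The Python sketch below is offered only as an illustration: the coordinates are those quoted in the text (1-based, inclusive, numbered as in Fig. 5), while the helper names `span_length` and `overlap` are hypothetical.

```python
def span_length(start, end):
    """Length of a 1-based, inclusive coordinate span."""
    return end - start + 1


def overlap(a, b):
    """Return the shared (start, end) of two inclusive spans, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


# Coordinates quoted in the text (numbering of the composite sequence).
CLONES = {"12-1.3C": (1, 1562), "RG2B": (1399, 3634)}
SEGMENTS = [(1072, 1227), (1228, 1461), (1462, 1705)]  # split by two introns

shared = overlap(CLONES["12-1.3C"], CLONES["RG2B"])
segment_lengths = [span_length(s, e) for s, e in SEGMENTS]

# Adjacent segments should abut with no gap once the introns are removed.
contiguous = all(SEGMENTS[i][1] + 1 == SEGMENTS[i + 1][0]
                 for i in range(len(SEGMENTS) - 1))
# The sequenced genomic region should cover the whole cDNA overlap.
covers_overlap = SEGMENTS[0][0] <= shared[0] and shared[1] <= SEGMENTS[-1][1]

print("RG2B insert length:", span_length(*CLONES["RG2B"]))
print("cDNA overlap:", shared, "=", span_length(*shared), "nt")
print("segment lengths:", segment_lengths, "total", sum(segment_lengths))
print("segments contiguous:", contiguous, "| cover overlap:", covers_overlap)
```

The numbers agree with the text: a 2236 bp insert for RG2B, a 164-nucleotide overlap between the two cDNA clones, and three segments of 156, 234 and 244 nucleotides that together make up the 634 nucleotides from position 1072 to 1705 and fully contain the overlap.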
HaeIII restriction analysis and partial sequencing indicated that NK3A and NK3C were contained entirely within NK3B, which was further characterized and found to be identical to the previously isolated RG2B (Fig. 5).

Northern Blot Analysis of mRNA from Skeletal Muscle and Novikoff Ascites Tumor Cells. Polyadenylated mRNA from rat skeletal muscle and Novikoff ascites tumor cells was examined by Northern blotting (Fig. 11). Duplicate blots were probed with cDNA representing either the N-terminal or C-terminal half of Type II hexokinase; these were the insert from clone 12-1.3C (nucleotides 1-1562 in

Figure 11. Northern blot analyses of rat Type II hexokinase mRNA. Blots A and B are duplicates, and each contains approx. 3 µg of Novikoff rat hepatoma mRNA and 20 µg of rat skeletal muscle mRNA in lanes 1 and 2, respectively. Blot A was probed with radiolabelled cDNA corresponding to nucleotides 1705-3082, while the probe for blot B was cDNA corresponding to nucleotides 1-1562 (nucleotide sequences in Fig. 5). The relative positions of the 18S and 28S bands of ribosomal RNA are indicated at the left.

FIGURE 11

Fig. 5) and a SphI/KpnI fragment from clone RG2B (nucleotides 1705-3082, Fig. 5), respectively. Both probes hybridized to a 5.2 kb message, found in both skeletal muscle and tumor mRNA.

Identification of the Transcription Initiation Sites. The S1 nuclease protection assay was used to locate the transcription initiation site of the Type II hexokinase gene. The S1 probe was synthesized from a 630 bp SmaI fragment of the genomic clone 5G3A that had hybridized to the 5' 153 bp of the cDNA for Type II hexokinase in Southern blots. After hybridization of the S1 probe to poly(A)+ RNA from either tumor or normal muscle tissue, the products were treated with S1 nuclease and the protected fragments separated on a 9% polyacrylamide 7M urea gel. A portion of an autoradiograph from one such experiment is shown in Fig. 12.
Lanes 1-5 show pBR size markers and dideoxynucleotide sequencing reactions from clone 5G3A used to define the region protected from S1 digestion. The sequencing reactions were generated from the same SmaI fragment and primer used to produce the S1 nuclease probe. Lanes 6 and 7 contain the products from tumor and skeletal muscle mRNA, respectively. There were two protected fragments in each lane, the predominant fragments having an approximate size of 315 nucleotides and the minor fragments slightly longer.

Figure 12. S1 nuclease protection assay results. Shown is a portion of an autoradiograph from an S1 nuclease protection assay, whose products were separated in a 9% denaturing polyacrylamide gel. Lane 1 contains pBR digested with MspI as size markers, with fragment lengths (bp) indicated to the left. The S1 nuclease protection experiment was performed using approx 3 μg of poly(A)+ RNA from Novikoff rat hepatoma cells (lane 6) or 20 μg of poly(A)+ RNA from rat skeletal muscle (lane 7). Lanes 2-5 show A, C, G, and T dideoxynucleotide sequencing reactions, respectively, used to define the relative position of the protected products.

FIGURE 12

A primer-extension assay was also used to identify the transcription initiation site. A portion of an autoradiograph, from a 9% denaturing gel, obtained from one of these experiments is shown in Fig. 13. Lane 1 shows pBR size markers with the fragment lengths indicated at the left. Lanes 2-5 are dideoxynucleotide sequence reactions synthesized from the 630 bp SmaI fragment of genomic clone 5G3A and the same primer used in the extension assays. Lanes 6 and 7 show the DNA products of an extension reaction using poly(A)+ RNA from Novikoff tumor cells and normal rat skeletal muscle tissue, respectively. Again, there were two synthesized fragments in each extension lane, and the predominant bands were approximately 315 bases in length.

Figure 13. Primer-extension results. Shown is a portion of an autoradiograph of primer-extension products electrophoresed in a 9% denaturing gel. Lane 1 contains pBR (digested with MspI) as markers, with the fragment lengths (bp) indicated to the left. Lanes 2-5 show A, C, G, and T dideoxynucleotide sequencing reactions (respectively) that define the relative nucleotide positions of the extension products. Lanes 6 and 7 show the extension products from approx 3 μg of Novikoff hepatoma mRNA and 20 μg of rat skeletal muscle mRNA, respectively. Lanes 6 and 7 were exposed approx twice as long as the other lanes.

FIGURE 13

Shown in Figs. 14 and 15 are the results of two primer-extension experiments, similar to the aforementioned work. The products of these extension experiments were electrophoresed in 5% denaturing polyacrylamide gels. In each figure, lane 1 contains pBR size markers with the 309 bp fragment indicated at the left. Lanes 2-5 are the dideoxynucleotide sequences of the SmaI fragment of genomic clone 5G3A. Lanes 6 and 7 of Fig. 14 contain extension products from muscle and tumor mRNA, respectively. Lane 6 of Fig. 15 contains the extension products from approx 1 μg Novikoff hepatoma mRNA; indicated to the right of this lane is the nucleotide sequence surrounding the transcription initiation site, determined from lanes 2-5, Fig. 13. It is clear from Fig. 14 (lanes 6 and 7) that, in both muscle and tumor, extension products were 343-348 nucleotides long; however, the majority of the extension products were in the range of 320-328 nucleotides long. In this region, the strongest bands (as seen in Fig. 15) were 324 and 325 nucleotides in length, including the primer, and correspond to initiation at either an adenine or guanine residue.
The adenine, the first intense base near the center of the primary region, has been designated as position +1 for numbering the nucleotides in this portion of the gene for Type II hexokinase.

Figure 14. Identification of the transcription initiation region of the Type II hexokinase gene. Shown is a portion of an autoradiograph from a primer extension experiment. The products were electrophoresed in a 5% denaturing gel for 15 hrs. Lane 1 is MspI-restricted pBR used as the size marker, with the 309 bp fragment indicated at the left. Lanes 2-5 show A, C, G, and T dideoxynucleotide sequencing reactions (respectively) used to define the nucleotide positions of the extension products. Lanes 6 and 7 contain the extension products from approx 20 μg of rat skeletal muscle mRNA and 3 μg of Novikoff hepatoma mRNA, respectively.

FIGURE 14

Figure 15. Identification of the transcription initiation site of the Type II hexokinase gene. A portion of an autoradiograph from a primer extension experiment is shown. The extension products were separated on a 5% denaturing gel, electrophoresed for 15 hrs. Lane 1 is MspI-restricted pBR used as the size marker, with the 309 bp fragment indicated at the left. Lanes 2-5 show A, C, G, and T dideoxynucleotide sequencing reactions (respectively) used to define the nucleotide positions of the extension products. Lane 6 shows the primer extension products from approx 1 μg Novikoff hepatoma mRNA. The nucleotides surrounding the transcription initiation site are shown to the right. An asterisk (*) marks the two nucleotides giving the strongest signal, with +1 designating the adenine residue as the first predominant site.

FIGURE 15

Type II Hexokinase Promoter. Genomic clone 5G3A was the only recombinant to demonstrate a high degree of cross-reactivity with the extreme 5’ portion of the cDNA for Type II hexokinase.
After restriction with several endonucleases, Southern blots of this clone were probed to locate fragments that contained the 5’ sequence of the Type II isozyme. Figure 16 shows the results of this hybridization. Fragments hybridizing to the 5’ 153 bp of Type II cDNA varied in length from approx 600 bp to 20 kb. A SmaI fragment of genomic clone 5G3A (near the bottom of lane 8, Fig. 16), 630 bp in length, was isolated and subcloned into M13 (mp18 and mp19) for complete sequence analysis. This sequence is shown in Fig. 17. Examination of this sequence, along with the 5’ sequence from the previously isolated cDNA (Fig. 5), places the predominant transcription start site 465 nucleotides upstream from the ATG initiation codon. There is a TATA-like box centered 25 nucleotides upstream from the adenine at position +1; a CCAAT sequence starts at position -79, with an inverted CCAAT sequence starting at -135. There are also 3 Sp1 sequences starting at nucleotides -55, -123 and -212, and a potential cAMP response element at -64. The 260 nucleotides upstream from the transcription start site are GC rich (70% G+C), with a ratio of CpG to GpC of 0.63. The 3’ end of this genomic fragment corresponds identically to the 101 nucleotides at the 5’ end of Type II hexokinase cDNA clone 12-1.3C.

Figure 16. Southern analysis of genomic clone 5G3A. Shown is an autoradiograph of genomic DNA from clone 5G3A probed with the 5’ 153 bp EcoRI-XhoI fragment of cDNA clone 12-1.3C. The genomic DNA from clone 5G3A was restricted with various endonucleases, electrophoresed in a 0.8% agarose gel and transferred to nitrocellulose. The restriction enzymes used were: 1, EcoRI; 2, EcoRI + BamHI; 3, BamHI; 4, EcoRI + HindIII; 5, HindIII; 6, EcoRI + XhoI; 7, XhoI; 8, SmaI. The blot in lane 8 was exposed approximately twice as long as the other blot. The sizes (in kb) of the λ bacteriophage/HindIII fragments used as markers are indicated to the left of the blot.

FIGURE 16

Figure 17. Sequence of the Type II hexokinase promoter region. This figure shows the sequence of the 630 nucleotide SmaI fragment of genomic clone 5G3A. It contains the 260 bases of DNA that are on the 5’ side of the adenine residue designated as +1 (primary transcription initiation site) and the 369 bases to the 3’ side of this initiation site. The overlined nucleotides are the 2 CCAAT sequences; a TATA-like box is marked in bold print; three potential Sp1 binding sites are underlined. A potential cAMP response element is marked by a dashed overline; the A and G residues at the transcription initiation site are marked with an asterisk (*). The primer used for the S1 nuclease protection and primer-extension assays was synthesized complementary to nucleotides +303 to +326, which are marked by a dashed underline. The arrow delineates the 5’ end of the cDNA for Type II hexokinase (Fig 3). The DdeI site, which is the 3’ end of the promoter fragment in p(C2AT)19, is marked by a closed circle (●) under the nucleotide.

-260  GGGCTCTAGCACGGAACACA
-240  CGchCAACTCTGGCGCcococrooooooTAGccrcccocCGGTCTCT
-192  CCCGCCGCCTGCTTGGGTGCTGGAGCAGCCGCGCCCGCGGGCTCTGGG
-144  CGCTGXETEECTGTGGACTGcooooooccnoCCGGAGAGCGCACACAC
 -96  cCTCTTCCCGCAGEEEEEGAGCGCGCCCKEEEEKCTGTCTTooooooc
 -48  CCAAAGAGCCGGCAGCCCCTCAAIIAGCCACATTGTTGCACCAACTCC
  +1  AGTGCTAGAGTCTCAGGACACCACAGGCTACACGGAGTTATCCCGCTT
 +49  AGGAGACCCGAAGGCAGGAGCATCACTCCAGTGACTCTGATAAGGTGC
 +97  GATCGCCCGAGAGGAACAGAACTGTCATTTTTGCGAAGTTGAGCCTTA
+145  CGGATCCCGTGGGCGAAGTTAGCGACGGGACGCTGAGCAACTAGACCG
+193  GTCGGCAGGAGTGAGACTTAGGTGCCTTCTAGTAGTTGTGACTTAAAA
+241  AAAAAAAAAAAAAGGAAAAGAAAAAAGGAGGAAAACCTGTTTCTGGAA
+289  ACGCGAGGCCCTCAGCTGGTGAGCCATCGTGGTTAAGCTTCTTTGTGT
+337  GGCTCCTGGAGTCTCCGATCCCAGCCGGACACCC

To verify that this genomic fragment did indeed possess promoter capability, a portion of this fragment (272 base SmaI-DdeI fragment, -260 to +12 of Fig. 17) was subcloned into the vector p(C2AT)19. Recombinants containing the Type II promoter region were used as the template in a cell-free in vitro assay system. The results of the transcription assays are shown in Fig. 18. Lanes 1-5 are the results of transcription assays which included pML-GFC2 as an internal positive control. Two transcripts, approximately 280 and 380 bases in length, are seen in lanes 1-3, which contain the constructs with the Type II hexokinase promoter in the correct orientation (designated p(C2AT)19-HKII+7, +11, and +17, respectively). The designations for the recombinants used in the assay include a (+) denoting correct promoter orientation or a (-) for the inverse orientation; the numbers are for clone identification and are not related to sequence location. There are few, if any, transcription products 380 bases in length in lane 4 (with the Type II promoter in the wrong orientation, designated p(C2AT)19-HKII-6) and lane 5 (containing p(C2AT)19 with no promoter). These results indicate that the region of genomic DNA lying immediately upstream of the putative transcription start site for Type II hexokinase is capable of directing transcription, at least in a cell-free system.

The plasmid pSVT7-HKII was used to transfect COS-1 cells.
As controls, COS-1 cells were also transfected with either the plasmid pSVT7 (with no insert) or with no DNA (Sham). Shown in Table IV are the results from two sets of transfections. The level of hexokinase activity in the COS-1 cells transfected with pSVT7-HKII was approximately 14 times greater than that found in cells transfected with pSVT7 or in the Sham transfection.

Figure 18. Cell-free in vitro transcription assay results. Shown is a section of an autoradiograph of in vitro transcription products from Type II hexokinase promoter constructs and pML-GFC2, using rat liver nuclear proteins. The transcription products were electrophoresed in a 6% denaturing polyacrylamide gel. In lanes 1-5, 50 ng of pML-GFC2 and 2 μg of p(C2AT)19 constructs were used as the template DNA. The constructs p(C2AT)19-HKII+7, +11, and +17 were the template DNA in lanes 1-3, respectively. The construct p(C2AT)19-HKII-6 was included in lane 4. The original plasmid, p(C2AT)19 with no promoter sequence included, was used as a negative control in lane 5. Lane 6 contains pBR digested with MspI, with the size (bp) of the fragments indicated to the right.

FIGURE 18

Table IV. Summary of Transfection Results

Transfection    Specific Activity (U/mg)
pSVT7-HKII      [values illegible]
Sham            0.04., 0.03.
pSVT7           0.047, 0.035

The results of the Western blot analysis, using anti-Type I hexokinase polyclonal antibodies, are shown in Fig. 19. Lane 1 contained purified Type I hexokinase. Lanes 2 and 3 contained protein from the pSVT7-HKII and Sham transfected cells, respectively. It is readily apparent that, in the cells transfected with pSVT7-HKII (lane 2), there is a definite increase in protein that reacts with the anti-Type I hexokinase polyclonal antibodies. The band in lane 2 that reacted most intensely with these antibodies migrated slightly faster than purified Type I hexokinase.
This was as expected since the reported Mr for Type II hexokinase (98) is somewhat smaller than that of the Type I isozyme. This same mobility pattern has been observed when purified Types I and II hexokinases are electrophoresed in SDS polyacrylamide gels and stained for protein. Bands corresponding to hexokinase isozymes I and II were also seen at the appropriate positions in the crude muscle extract (data not shown). There is also evidence of cross-reactivity between the endogenous hexokinase (lane 3) and the Type I polyclonal antibodies, which was not unexpected. The bands below the Type II band in lane 2 are also present in the Sham extract (lane 3). It has not been determined whether the peptides reacting with the anti-Type I hexokinase antibodies are proteolytic fragments of the endogenous hexokinase or unrelated proteins. The lower band seen in lane 1 is most likely a proteolytic fragment (approx 90 kDa) of Type I hexokinase (54,96).

Figure 19. Western blot analysis of Type II hexokinase expressed in COS-1 cells. Proteins isolated from transfected COS-1 cells were electrophoresed, blotted onto nitrocellulose and reacted with antibodies, as described in Methods. Lane 1 contains 0.5 μg of purified Type I hexokinase. Lanes 2 and 3 contain 45 μg of protein from COS-1 cells transfected with pSVT7-HKII and no DNA (Sham), respectively. The blot was reacted with anti-hexokinase Type I polyclonal antibodies.

FIGURE 19

CHAPTER 4

Discussion

Amino Acid Sequence of Type II Hexokinase

The deduced amino acid sequence of rat Type II hexokinase is compared with previously determined sequences of the Type I (68), Type III (56), and Type IV (32) isozymes in Fig. 20. The striking similarity between sequences in the N- and C-terminal halves of the Type II hexokinase has been previously seen for the Type I (68) and III (56) isozymes.
The similarities between these sequences and that of yeast hexokinase (86) confirm the view that all of the 100 kDa enzymes evolved by a mechanism of duplication and fusion of a gene encoding an ancestral 50 kDa hexokinase related to the yeast enzyme. A quantitative comparison of the sequence similarities among the rat isozymes is presented in Table V.

The 9 amino acid residues at the N-terminus of Type II hexokinase (Met-Ile-Ala-Ser-His-Met-Asn-Ile-Ala-Cys) are nearly as hydrophobic as the corresponding sequence in the Type I isozyme (Met-Ile-Ala-Ala-Gln-Leu-Leu-Ala-Tyr). Polakis and Wilson (54) have shown the necessity of this hydrophobic "tail" for binding of the Type I isozyme to mitochondria. Thus, these results are consistent with reports that the Type II isozyme also binds to mitochondria (39,41), although apparently with somewhat less affinity than does the Type I isozyme (116). The less hydrophobic nature of the N-terminus of Type III hexokinase may, at least partially, explain the lack of binding of this isozyme to mitochondria (41).

Figure 20. Comparison of aligned amino acid sequences of N- and C-terminal halves of rat Types I-III hexokinases and rat glucokinase (Type IV). Three or more aligned residues identical to the corresponding amino acid in either the N- or C-terminal region of Type II hexokinase are considered identities (blackened residues). Shaded residues are conservative substitutions when compared to identical aligned amino acids at a given position. In alignments of two sets of 3 identical amino acid residues, all residues are viewed as conservative substitutions.

FIGURE 20

Table V. Comparison of Deduced Amino Acid Sequences of the N- and C-Terminal Halves of Rat Type II Hexokinase with Sequence of the Type IV Isozyme and Sequences of the N- and C-Terminal Halves of the Type I and III Isozymes.a

          N-IIb     C-II
N-II      100       55,14
N-I       68,14     49,17
C-I       54,14     76,11
N-III     44,14     41,13
C-III     48,15     66,9
IV        52,15     53,14

a Based on the alignments shown in Fig. 20. The first number is the percent identity, and the second number is the percent of conservative substitutions. Thus the sum of the two numbers reflects the overall similarity of the compared sequences.
b The abbreviations used are: N-II, C-II: N-terminal half (residues 1-475) and C-terminal half (residues 476-917), respectively, of the Type II isozyme; N-I, C-I: N-terminal half (residues 1-475) and C-terminal half (residues 476-918), respectively, of the Type I isozyme (68); N-III, C-III: N-terminal half (residues 1-488) and C-terminal half (residues 489-924), respectively, of the Type III isozyme (56); IV, Type IV isozyme (glucokinase), sequenced by Andreone et al. (32).

The kinetic differences that presumably underlie differences in physiological function of the various isozymes, e.g., the ability of Pi to antagonize inhibition of the Type I isozyme, but not the Type II isozyme, by Glc-6-P (11), clearly must depend on structural differences resulting from changes in amino acid sequence. Since the conservation of sequence is so extensive, the sequence comparisons in Fig. 20 focus attention on relatively limited regions which might be responsible for the observed alterations in functional properties. This is facilitated by the demonstration that catalytic activity is associated with the C-terminal half of the Type I isozyme (84,85,93) while regulatory function is associated with the N-terminal half (91,94). Presumably this same functional organization is found in the other 100 kDa enzymes.
Although differences in the N-terminal halves of the molecules can obviously not be excluded, the identity (at both the nucleotide and amino acid level) between the sequences of the C-terminal halves of Type II hexokinase and the enzyme from Novikoff ascites tumor cells suggests that there is a single Type II isozyme in both normal tissue and tumors. Further support for this view is provided by the observation that these enzymes are encoded by polyadenylated mRNAs which are indistinguishable in size. This also appears to be the case with the rat Type III isozyme, since the deduced amino acid sequence of this enzyme (56) includes several segments that are identical to those of tryptic peptides (115) derived from the Type III isozyme isolated from Novikoff tumor cells (41). These observations do not support speculation (65) that the tumor isozymes are distinct in amino acid composition, and hence sequence, from the isozymes of normal tissues. However, the Type II hexokinase in the Novikoff cell line used in the present work was distinct from the Type II enzyme of normal skeletal muscle when compared by ion exchange chromatography, nondenaturing gel electrophoresis, or isoelectric focusing (T. Ureta and J.E. Wilson, unpublished results). It seems appropriate to consider possible posttranslational modifications that might account for these differences, and which might lead to the apparent altered function of hexokinase in highly glycolytic tumor cell lines (63,64).

A message size of approx 5.2 kb was seen for both tumor and skeletal muscle Type II hexokinase (Fig. 11). An open reading frame of 2751 nucleotides was contained in two cDNA clones. Included in these clones were 197 and 687 nucleotides in the 5’ and 3’ untranslated regions, respectively. The identification of the transcription start sites, through primer extension and S1 mapping, indicates that the 5’ untranslated region for Type II hexokinase is approximately 467 nucleotides.
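The lengths quoted here can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are invented, and the 5.2 kb Northern estimate is taken as 5200 nucleotides, which is an approximation.

```python
# Lengths in nucleotides, as quoted in the text.
RESIDUES = 917            # deduced amino acid residues
ORF_NT = 2751             # open reading frame
UTR5_CLONED = 197         # 5' untranslated region present in the cDNA clones
UTR3_CLONED = 687         # 3' untranslated region present in the cDNA clones
UTR5_MAPPED = 467         # 5' untranslated region from primer extension / S1 mapping
MESSAGE_NORTHERN = 5200   # approx. 5.2 kb message (assumed rounding)


def message_accounting():
    """Return how much of the message is accounted for by each estimate."""
    cloned = UTR5_CLONED + ORF_NT + UTR3_CLONED
    missing_5_prime = UTR5_MAPPED - UTR5_CLONED
    characterized = UTR5_MAPPED + ORF_NT + UTR3_CLONED
    unaccounted = MESSAGE_NORTHERN - characterized
    return {
        "cloned_cdna": cloned,
        "leader_missing_from_cdna": missing_5_prime,
        "characterized_message": characterized,
        "unaccounted_for": unaccounted,
    }


# The open reading frame should be three nucleotides per residue.
assert RESIDUES * 3 == ORF_NT

for name, value in message_accounting().items():
    print(f"{name}: {value}")
```

Under these assumptions the cloned cDNA covers 3635 nucleotides, about 270 nucleotides of 5’ leader are absent from the cDNA, and roughly 1300 nucleotides of the 5.2 kb message remain unaccounted for.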
Transcription initiation spans a 30 base region. Diffuse transcription initiation has also been seen from both the hepatic and pancreatic glucokinase promoters (33,36). The lack of a single discrete transcription initiation site may be the result of a "weak" TATA sequence (discussed below). While the function of such long 5’ leader sequences is not known, they have been identified in other proteins. For instance, the 5’ untranslated region for HMG CoA reductase encompasses as many as 670 nucleotides (117). The 5’ leader sequences for the ATPases (Na+K+) from rat and sheep kidney contain 460 and 528 nucleotides, respectively (118,119).

The 33 nucleotides immediately upstream from the 5’ end of the Type II cDNA (Fig. 17) contain 28 A residues. This region is an excellent candidate for hybridization to the oligo-dT primer used in the synthesis of several of the libraries that were screened for Type II cDNA. This observation would explain why the cDNA terminates approximately 270 nucleotides downstream from the actual 5’ end of the message.

The designation of the first ATG in the cDNA as the translation start site is confirmed by alignment of the N-terminal amino acid sequence of Type II with the corresponding sequence of Type I. Three nucleotides upstream from the starting "ATG" (GCAGGATGATC) is a purine (in this case an A), which is highly conserved at that position. In 97% of 699 mRNAs analyzed by M. Kozak (120), there was a purine, usually an A, at this -3 position. However, other features of the translation initiation consensus sequence (CCA/GCCATGG) (120) are not conserved.

Both tumor and skeletal muscle mRNAs were used to determine the transcription start region. The primer extension and S1 digestion results obtained from both mRNAs were identical. These results clearly indicate that the message for Type II hexokinase is the same, at least for the 5’ untranslated region and the C-terminal half of the protein, in both normal and tumor tissues.
These facts, in conjunction with an identical message size of 5.2 kb, make it highly unlikely that the mRNA encoding the N-terminus of this tumor hexokinase is different from the Type II message found in skeletal muscle. However, this complete message identity has yet to be verified.

The length of the message for Type II hexokinase, characterized thus far, is approximately 1300 nucleotides shorter than suggested by Northern analysis. Since poly(A)+ RNA was used for both Northern analysis and library syntheses, it seems likely that the polyadenylation signal and the poly(A)+ region will be found in the 1300 nucleotides not yet isolated.

Although the role of the unusual stretch of 'Ts' near the 3’ end of the cDNA (Fig. 5) is not known, it may function in several ways. It could serve as a signal for termination of transcription, as proposed by Resnekov et al. (121). However, if this is the purpose, the termination signal is approximately 1300 nucleotides upstream from the end of the transcript. It has also been suggested that such a stretch of 'Ts' may facilitate the release of the transcript unit from the genomic DNA template (122), due to the instability of (dA:rU) regions. Alternatively, another proposed function for such a sequence involves message stability. Wilson and Treisman (123) found that the shortening of the poly(A)+ region of c-fos mRNA was much slower when the 3’ AU-rich sequences were deleted. These researchers suggested that such 3’ AU-rich sequences may destabilize mRNA by causing rapid removal of the poly(A)+ region. However, until further research is conducted the function of this T-rich region remains unknown.

Inspection of the nucleotide sequences surrounding the overlapping regions of cDNA clones 12-1.3C and RG2B has given no clues as to why full length Type II cDNAs have yet to be isolated. As seen in Table III, a total of 10 libraries were prepared and screened in the search for the cDNA for the Type II isozyme.
These libraries were synthesized by various methods, using mRNA isolated from several rat tissues known to contain the Type II isozyme. The libraries were screened with portions of cDNA clones 12-1.3C or RG2B/NK3B. The only clones isolated from any of these libraries contained inserts identical to those described above for clones 12-1.3C and RG2B. Each library gave only the "12-1.3C-type" insert or the "RG2B-type" insert, or portions thereof; no library gave both, nor did any library yield a full length cDNA. Also curious is the fact that clones RG2B and NK3B were identical, with both lacking a 3’ polyadenylated region, even though the Novikoff cDNA library yielding clone NK3B was synthesized using oligo-dT as primer while RG2B was isolated from a random primed library. There are no EcoRI restriction sites in this area that may have affected the cloning of full length cDNA, nor are there any multiple adenine residues that could serve as an additional binding site for the oligo-dT primer used in library synthesis. Clearly there is something most unusual occurring, possibly resulting from some exceptional secondary structural features in these mRNAs.

Alignment of 5 genomic clones (Fig. 9) containing sequences complementary to Type II hexokinase cDNA indicates that the gene for this enzyme is at least 35 kb in length. Sequence analysis of portions of genomic clone 3G3A, used to verify the overlap of the two cDNA clones, generated 3 exons of length 150-250 nucleotides. Similar exon sizes have been seen in the hepatic glucokinase gene, in which the nine exons have an average length of 153 nucleotides (33).

Several potential regulatory sequences have been found in the promoter region, upstream from the transcription start region (Fig. 17). The sequence AATAA, located at positions -27 to -23, may be a "TATA box," although it is not a good consensus sequence (124).
Conservation of the TATA box sequence could play a critical role in the precise initiation of transcription (124,125). The lack of complete homology between the AATAA sequence in Fig. 17 and the consensus TATA sequence may explain the existence of two regions, rather than a unique site, of transcription initiation for this gene. Two CCAAT sequences, at nucleotides -79 and -135, and 3 potential Sp1 binding sites (starting at nucleotides -55, -123, and -212) have also been identified. The positions of the AATAA and CCAAT sequences are in agreement with the TATA (-25 to -30) and CCAAT box (-70 to -80) locations found in many eucaryotic genes (124). However, if the AATAA sequence is not a TATA element, transcription may be directed by an initiator (Inr) element located at or near the transcription initiation site. Even though a number of such Inr elements have been defined and investigated (reviewed in 126), no consensus sequences for these elements have been proposed. It remains to be seen whether the gene for Type II hexokinase is under the control of such Inr elements.

The 5’ untranslated region (725 nucleotides) may be, at least in part, a CpG island as defined by Gardiner-Garden and Frommer (127). The G+C content of this genomic region is 60%, with a CpG/GpC ratio of 0.77. Such CpG islands may serve an important function in either transcriptional or post-transcriptional regulation of the Type II hexokinase gene. Post-transcriptional regulation has been found for a number of genes, including c-fos (128) and c-myc (129).

The nucleotides from -63 through -70 (CCACGTCA) are 75% identical to the cAMP response element (TGACGT/ CC/ AA/ A) (130). Whether Type II hexokinase is influenced by cAMP has yet to be determined. However, it is known that transcription of glucokinase is negatively affected by cAMP (acting through glucagon (22)); the Type II isozyme may be regulated in a similar fashion.
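The sequence statistics quoted in this section (G+C content, CpG/GpC ratio, and percent identity to a consensus element) are simple counts. The following Python sketch shows one way they could be computed. The function names and the toy sequence are hypothetical, and the octamer TGACGTCA is used as an assumed stand-in for the cAMP response element, since the consensus cited in the text contains variable positions.

```python
def gc_percent(seq):
    """Percentage of G and C residues in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def count_dinucleotide(seq, dinucleotide):
    """Count occurrences of a dinucleotide, reading 5' to 3' on one strand."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinucleotide)


def cpg_to_gpc_ratio(seq):
    """Ratio of CpG to GpC dinucleotides, the measure quoted in the text."""
    gpc = count_dinucleotide(seq, "GC")
    if gpc == 0:
        raise ValueError("sequence contains no GpC dinucleotides")
    return count_dinucleotide(seq, "CG") / gpc


def percent_identity(a, b):
    """Percent of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Toy sequence for illustration only; it is not taken from the promoter.
toy = "CGCGGCAT"
print(gc_percent(toy), cpg_to_gpc_ratio(toy))

# Octamer at -70 to -63 compared with the assumed octamer TGACGTCA.
print(percent_identity("CCACGTCA", "TGACGTCA"))
```

Six of the eight positions match in the last comparison, which reproduces the 75% figure given above. Applying `gc_percent` and `cpg_to_gpc_ratio` to the actual promoter would require an error-free copy of the sequence in Fig. 17.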
Even though the importance of each of the aforementioned promoter elements has yet to be determined, the 260 bp genomic fragment containing these elements has the ability to direct transcription, as demonstrated in a cell-free in vitro system. The low activity level of this promoter, relative to that of the adenovirus major late promoter (40 fold more Type II promoter was used in the transcription assays seen in Fig. 18), may indicate that elements important to promoter activity were not present in the genomic fragment used in the transcription assay. Alternatively, a negative regulatory element may be present within this 260 bp fragment. Such negative elements have been found in the S14 promoter from rat liver (111) and in the human c-fos promoter (131).

There also exists the possibility that nuclear factors in rat liver, important for the efficient transcription of the Type II isozyme, are present at higher concentrations immediately following birth and decline shortly thereafter. Ureta et al. (132) have shown that the level of Type II hexokinase in rat liver peaks during the first week after birth. The liver nuclear extracts used for the in vitro transcription assay were prepared from 6-7 week old rats. The nuclear factors necessary for efficient transcription may be deficient, or in low concentrations, in the liver of older rats. Nuclear extracts from tissues in which Type II hexokinase is the predominant isozyme (e.g., muscle and adipose) may contain increased levels of factors necessary for efficient transcription of the isozyme.

Recently Alexander et al. proposed a 25 nucleotide sequence (AACTTTCCCGCCTCTCAGCCGAAAG) as a minimum core of a putative insulin response element (IRE) (133). No significant homology is seen between this possible insulin response element and the promoter region of the Type II gene (Fig. 17). However, the IREs located by Alexander et al. were in a region from 270 to 490 residues upstream from the transcription start site.
Since the promoter region for the Type II isozyme, characterized thus far, extends only 260 nucleotides upstream from the transcription start site, it is possible that insulin responsive elements for this gene are in upstream regions not yet isolated. However, Tanaka et al. (35) have found no enhancer sequences in the 5' flanking region of the glucokinase gene that appear to be responsible for insulin regulation. They investigated 5.5 kb upstream from the transcription start site. Further studies are needed to determine the existence and location of any insulin response sequences, and to confirm the role of other promoter/enhancer elements, in the Type II hexokinase gene.

CHAPTER 5

Future Work

With the availability of the cDNA (134) and promoter region for Type II hexokinase, several directions for future research have become feasible. Determination of factors controlling the transcription of Type II hexokinase in normal and diabetic rats, as well as tumor cells, can now be undertaken. Also, with cDNAs for all the hexokinase isozymes now available, future work can focus on elucidating the structural differences that underlie the diversity of their catalytic and regulatory properties.

Several potential regulatory elements have been located upstream from the transcription initiation site for Type II hexokinase, as defined in this project. Ordered deletions of this region can be tested, in in vitro transcription assays with the "G-free cassette" vector p(C2AT)19, to identify segments important in the regulation of this gene. Nuclear extracts, prepared from various tissues (e.g., liver, muscle and mammary) from normal, diabetic and insulin-treated diabetic rats, should be used in these transcription assays. The use of such nuclear extracts, with the deleted "G-free" constructs, will define promoter regions that either enhance or suppress the rate of initiation of this gene.
Through this approach, it may be possible to locate insulin response elements (IREs) in this gene. However, considering the findings of Tanaka et al. (35), it may be necessary to include regions of the Type II hexokinase gene further downstream (than investigated herein) in the search for such IREs. Using a similar approach with fast growing tumor cell lines (e.g., Novikoff ascites hepatoma) or stimulated muscle tissue, it may be possible to identify cis-elements, and trans-acting factors, responsible for increased levels of Type II hexokinase seen in both tumor cells (64) and stimulated muscle (29,31). Techniques such as "gel shift" assays (135,136) and DNase I footprint analysis (95,133) can further define the location of important regulatory elements.

The availability of the Type II cDNA makes site-directed mutagenesis a very useful tool to study the importance of specific amino acid residues implicated in catalytic or regulatory functions of hexokinase. Residues in the Type I isozyme thought to be critical in the binding of glucose are Ser 603, Asp 657, Glu 708, and Glu 742 (68,92,93); Thr 661 and the sequence Gly-Ser-Gly-Lys-Gly-Ala (896-901, Fig. 20, Discussion Chapter) may be involved in the binding of ATP in Type I hexokinase (56,68). These residues are completely conserved between the Types I, II and III isozymes (Fig. 20), which may imply conservation of function as well. Curiously, these residues are also well conserved at the corresponding positions in the N-terminal halves of the three hexokinase isozymes, even though the N-terminal domain of Type I is not catalytically active (84,91,94). Site-directed mutagenesis of these residues, in both halves of Type II hexokinase, will help determine their function.

The 100 kDa hexokinases differ in both kinetic and regulatory properties (reviewed in ref. 11).
However, because of the extensive similarities of their amino acid sequences, it is reasonable to assume that structure/function relationships found in the Type I isozyme (84) exist in the other hexokinases as well. Construction and expression of chimeric hexokinase molecules may be a very useful method to substantiate this supposition, and to investigate the interactions between the N- and C-terminal domains of each isozyme. The exchange of the N-terminal half of Type II hexokinase for that of the Type I isozyme will test the proposition that structural regions affecting regulatory properties (such as reversal of Glc-6-P inhibition by Pi) reside in the N-terminal domains of these isozymes. In order to facilitate this exchange, an NcoI site was created in the Type II cDNA at nucleotide 1558 (Fig. 5, Results Chapter). This new site coincides with an NcoI site at the same position in the Type I cDNA. This mutation does not alter the amino acid sequence, and expression of this cDNA in COS-1 cells produced active protein. Hence, the stage is now set for construction of chimeric hexokinases in which the N- and C-terminal domains of the Type I and Type II isozymes can be interchanged.

CHAPTER 6

References

1. Viñuela, E., Salas, M., and Sols, A. (1963) J. Biol. Chem. 238, PC1175-PC1177.
2. Walker, D.G. (1963) Biochim. Biophys. Acta 77, 209-226.
3. Gonzalez, C., Ureta, T., Babul, J., Rabajille, E., and Niemeyer, H. (1967) Biochemistry 6, 460-468.
4. Katzen, H.M., Soderman, D.D., and Nitowsky, H. (1965) Biochem. Biophys. Res. Commun. 19, 377-382.
5. Katzen, H.M. and Schimke, R.T. (1965) Proc. Natl. Acad. Sci. USA 54, 1218-1225.
6. Grossbard, L. and Schimke, R.T. (1966) J. Biol. Chem. 241, 3546-3560.
7. Easterby, J.S. (1971) FEBS Lett. 18, 23-26.
8. Chou, A.C. and Wilson, J.E. (1972) Arch. Biochem. Biophys. 151, 48-55.
9. Neumann, S., Falkenburg, F., and Pfleiderer, G. (1974) Biochim. Biophys. Acta 334, 328-342.
10. Meunier, J.C., Buc, J., and Richard, J. (1971) FEBS Lett. 14, 25-28.
11. Wilson, J.E. (1985) in Regulation of Carbohydrate Metabolism (Beitner, R., Ed.), Vol. I, pp. 45-85, CRC Press, Inc., Boca Raton, FL.
12. Meglasson, M.D., Burch, P.T., Berner, D.K., Najafi, H., Vogin, A.P., and Matschinsky, F.M. (1983) Proc. Natl. Acad. Sci. USA 80, 85-89.
13. Hughes, S.D., Quaade, D., Milburn, J.L., Cassidy, L., and Newgard, C.B. (1991) J. Biol. Chem. 266, 4521-4530.
14. Liang, Y., Jetton, T.L., Zimmerman, E.C., Najafi, H., Matschinsky, F.M., and Magnuson, M.A. (1991) J. Biol. Chem. 266, 6999-7007.
15. Bernstein, R.S. and Kipnis, D.M. (1973) Diabetes 22, 913-922.
16. Katzen, H.M. (1967) in Advances in Enzyme Regulation (Weber, G., Ed.), Vol. 5, pp. 335-356, Pergamon Press, New York.
17. McLean, P., Brown, J., Walters, E., and Greenslade, K. (1967) Biochem. J. 105, 1301-1305.
18. Walters, E. and McLean, P. (1968) Biochem. J. 109, 737-741.
19. Frank, S.K. and Fromm, H.J. (1986) Arch. Biochem. Biophys. 249, 61-69.
20. Frank, S.K. and Fromm, H.J. (1986) Biochem. Biophys. Res. Commun. 138, 374-380.
21. Sharma, D., Manjeshwar, R., and Weinhouse, S. (1963) J. Biol. Chem. 238, 3840-3845.
22. Iynedjian, P.B., Jotterand, D., Nouspikel, T., Asfari, M., and Pilot, P.-R. (1989) J. Biol. Chem. 264, 21824-21829.
23. Colowick, S.P. (1973) in The Enzymes (Boyer, P.D., Ed.), 3rd Ed., Vol. 9, pp. 1-48, Academic Press, New York.
24. Magnuson, M.A. (1990) Diabetes 39, 523-527.
25. Peter, J.B., Jeffres, R.N., and Lamb, D.R. (1968) Science 160, 200-201.
26. Green, H.J., Reichmann, H., and Pette, D. (1983) Pflügers Arch. 399, 216-222.
27. Pette, D., Smith, M.E., Staudte, H.W., and Vrbova, G. (1973) Pflügers Arch. 338, 257-272.
28. Simoneau, J.-A. and Pette, D. (1988) Pflügers Arch. 412, 86-92.
29. Weber, F.E. and Pette, D. (1988) FEBS Lett. 238, 71-73.
30. Weber, F.E. and Pette, D. (1990) Eur. J. Biochem. 191, 85-90.
31. Weber, F.E. and Pette, D. (1990) FEBS Lett. 261, 291-293.
32. Andreone, T.L., Printz, R.L., Pilkis, S.J., Magnuson, M.A., and Granner, D.K. (1989) J. Biol. Chem. 264, 363-369.
33. Magnuson, M.A., Andreone, T.L., Printz, R.L., Koch, S., and Granner, D.K. (1989) Proc. Natl. Acad. Sci. USA 86, 4838-4842.
34. Sibrowski, W. and Seitz, H.J. (1984) J. Biol. Chem. 259, 343-346.
35. Noguchi, T., Takenaka, M., Yamada, K., Matsuda, T., Hashimoto, M., and Tanaka, T. (1989) Biochem. Biophys. Res. Commun. 164, 1247-1252.
36. Magnuson, M.A. and Shelton, K.D. (1989) J. Biol. Chem. 264, 15936-15942.
37. Iynedjian, P.B., Pilot, P.-R., Nouspikel, T., Milburn, J.L., Quaade, C., Hughes, S., Ucla, C., and Newgard, C.B. (1989) Proc. Natl. Acad. Sci. USA 86, 7838-7842.
38. Quaade, D., Hughes, S.D., Coats, W.S., Sestak, A.L., Iynedjian, P.B., and Newgard, C.B. (1991) FEBS Lett. 280, 47-52.
39. Bartley, J.C., Barber, S., and Abraham, S. (1975) Cancer Res. 35, 1649-1653.
40. Salotra, P.T. and Singh, V.N. (1982) Arch. Biochem. Biophys. 216, 758-764.
41. Radojković, J. and Ureta, T. (1987) Biochem. J. 242, 895-903.
42. Parry, D.M. and Pedersen, P.L. (1983) J. Biol. Chem. 258, 10904-10912.
43. Preller, A. and Wilson, J. (1992) Arch. Biochem. Biophys. (in press).
44. Felgner, P.L., Messer, J.L., and Wilson, J.E. (1979) J. Biol. Chem. 254, 4946-4949.
45. Miwa, I., Mitsuyama, S., Toyoda, Y., Nonogaki, T., Aoki, S., and Okuda, J. (1990) Biochem. Int. 22, 759-767.
46. Linden, M., Gellerfors, P., and Nelson, B.D. (1982) FEBS Lett. 141, 189-192.
47. Fiek, C., Benz, R., Roos, N., and Brdiczka, D. (1982) Biochim. Biophys. Acta 688, 429-440.
48. BeltrandelRio, H. and Wilson, J.E. (1991) Arch. Biochem. Biophys. 286, 183-194.
49. Viitanen, P.V., Geiger, P.J., Erickson-Viitanen, S., and Bessman, S.P. (1984) J. Biol. Chem. 259, 9679-9686.
50. Inui, M. and Ishibashi, S. (1979) J. Biochem. 85, 1151-1156.
51. Kosow, D.P. and Rose, I.A. (1968) J. Biol. Chem. 243, 3623-3630.
52. Wilson, J.E. (1968) J. Biol. Chem. 243, 3640-3647.
53. Rose, I.A. and Warms, J.V.B. (1967) J. Biol. Chem. 242, 1635-1645.
54. Polakis, P.G. and Wilson, J.E. (1985) Arch. Biochem. Biophys. 236, 328-337.
55. Xie, G. and Wilson, J. (1988) Arch. Biochem. Biophys. 267, 803-810.
56. Schwab, D.A. and Wilson, J.E. (1991) Arch. Biochem. Biophys. 285, 365-370.
57. Ureta, T. (1982) Comp. Biochem. Physiol. 71B, 549-555.
58. Kosow, D.P. and Rose, I.A. (1972) Biochem. Biophys. Res. Commun. 48, 376-383.
59. Kosow, D.P., Oski, F.A., Warms, J.V.B., and Rose, I.A. (1973) Arch. Biochem. Biophys. 157, 114-124.
60. Beitner, R., Haberman, S., and Livni, L. (1975) Biochim. Biophys. Acta 397, 355-369.
61. Rose, I.A. and Warms, J.V.B. (1975) Arch. Biochem. Biophys. 171, 678-681.
62. Warburg, O., Posener, K., and Negelein, F. (1924) Biochem. Z. 152, 309-344.
63. Bustamante, E. and Pedersen, P.L. (1977) Proc. Natl. Acad. Sci. USA 74, 3735-3739.
64. Bustamante, E., Morris, H.P., and Pedersen, P.L. (1981) J. Biol. Chem. 256, 8699-8704.
65. Nakashima, R.A., Paggi, M.G., Scott, L.J., and Pedersen, P.L. (1988) Cancer Res. 48, 913-919.
66. Arora, K.K., Fanciulli, M., and Pedersen, P.L. (1990) J. Biol. Chem. 265, 6481-6488.
67. Schwab, D.A. and Wilson, J.E. (1988) J. Biol. Chem. 263, 3220-3224.
68. Schwab, D.A. and Wilson, J.E. (1989) Proc. Natl. Acad. Sci. USA 86, 2563-2567.
69. Easterby, J.S. and O'Brien, M.J. (1973) Eur. J. Biochem. 38, 201-211.
70. Rose, I.A., Warms, J.V.B., and Kosow, D.P. (1974) Arch. Biochem. Biophys. 164, 729-735.
71. Holroyde, M.J., Trayer, I.P., and Cornish-Bowden, A. (1976) FEBS Lett. 62, 213-219.
72. Gregoriou, J., Trayer, I.P., and Cornish-Bowden, A. (1983) Eur. J. Biochem. 134, 283-288.
73. Trayer, I.P. and Darby, M.K. (1981) Biochem. Soc. Trans. 9, 62.
74. Lawrence, G.M. and Trayer, I.P. (1984) Comp. Biochem. Physiol. 79B, 233-238.
75. Poorman, R.A., Randolph, A., Kemp, R.G., and Heinrikson, R.L. (1984) Nature 309, 467-469.
76. Palm, D., Goerl, R., and Burger, K.J. (1985) Nature 313, 500-502.
77. Wistow, G. (1990) J. Mol. Evol. 30, 140-145.
78. Yanagawa, H.A. (1978) Insect Biochem. 8, 293-305.
79. Storey, K.B. (1980) Insect Biochem. 10, 637-645.
80. Mochizuki, Y. and Hori, S.H. (1977) J. Biochem. 81, 1849-1855.
81. Nishi, S., Seino, S., and Bell, G.I. (1988) Biochem. Biophys. Res. Commun. 157, 937-943.
82. Crane, R.K. and Sols, A. (1954) J. Biol. Chem. 210, 597-606.
83. Nemat-Gorgani, M. and Wilson, J.E. (1986) Arch. Biochem. Biophys. 251, 97-103.
84. White, T.K. and Wilson, J.E. (1989) Arch. Biochem. Biophys. 274, 375-393.
85. Schirch, D.M. and Wilson, J.E. (1987) Arch. Biochem. Biophys. 254, 385-396.
86. Stachelek, C., Stachelek, J., Swan, J., Botstein, D., and Konigsberg, S. (1986) Nuc. Acids Res. 14, 945-963.
87. Kopetzki, E., Entian, K.-D., and Mecke, D. (1985) Gene (Amst.) 39, 95-102.
88. Anderson, C.M., Stenkamp, R.E., McDonald, R.C., and Steitz, T.A. (1978) J. Mol. Biol. 123, 15-33.
89. Anderson, C.M., Stenkamp, R.E., McDonald, R.C., and Steitz, T.A. (1978) J. Mol. Biol. 123, 207-219.
90. Polakis, P.G. and Wilson, J.E. (1984) Arch. Biochem. Biophys. 234, 341-352.
91. White, T.K. and Wilson, J.E. (1987) Arch. Biochem. Biophys. 259, 402-411.
92. Bennett, W.S., Jr. and Steitz, T.A. (1980) J. Mol. Biol. 140, 211-230.
93. Schirch, D.M. and Wilson, J.E. (1987) Arch. Biochem. Biophys. 257, 1-12.
94. White, T.K. and Wilson, J.E. (1990) Arch. Biochem. Biophys. 277, 26-34.
95. Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989) Molecular Cloning. A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
96. Wilson, J.E. (1989) Prep. Biochem. 19, 13-21.
97. Laemmli, U.K. (1970) Nature 227, 680-685.
98. Qadri, S.S. and Easterby, J.S. (1980) Anal. Biochem. 105, 299-303.
99. Hunkapillar, M.W., Lujan, E., Ostrander, F., and Hood, L.E. (1983) in Methods in Enzymology (Hirs, C.H.W. and Timasheff, S.N., Eds.), Vol. 91, pp. 227-236, Academic Press, New York.
100. DeWitt, D.L. and Smith, W.L. (1988) Proc. Natl. Acad. Sci. USA 85, 1412-1416.
101. Feinberg, A.P. and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13.
102. Kunkel, T.A. (1985) Proc. Natl. Acad. Sci. USA 82, 488-492.
103. Kunkel, T.A., Roberts, J.D., and Zakour, R.A. in Methods in Enzymology (Wu, R. and Grossman, L., Eds.), Vol. 154, pp. 367-382, Academic Press, New York.
104. Bird, P., Gething, M.-J., and Sambrook, J. (1987) J. Cell Biol. 105, 2905-2914.
105. Gluzman, Y. (1981) Cell 23, 175-182.
106. Dunn, S.D. (1986) Anal. Biochem. 157, 144-153.
107. Taketa, K., Ichikawa, E., and Hanada, T. (1986) J. Immunol. Meth. 95, 71-77.
108. Chomczynski, P. and Sacchi, N. (1987) Anal. Biochem. 162, 156-159.
109. Chirgwin, J.M., Przybyla, A.E., MacDonald, R.J., and Rutter, W.J. (1979) Biochem. 18, 5294-5299.
110. Sawadogo, M. and Roeder, R.G. (1985) Proc. Natl. Acad. Sci. USA 82, 4394-4398.
111. MacDougald, O.A. and Jump, D.B. (1991) Biochem. J. 280, 761-767.
112. Gorski, K., Carneiro, M., and Schibler, U. (1986) Cell 47, 767-776.
113. Felgner, P.L. and Wilson, J.E. (1976) Biochem. Biophys. Res. Commun. 68, 592-597.
114. Polakis, P.G. and Wilson, J.E. (1982) Biochem. Biophys. Res. Commun. 107, 937-943.
115. Marcus, F. and Ureta, T. (1986) Biochem. Biophys. Res. Commun. 139, 714-719.
116. Imai, J., Akimoto, H., Oda, M., Okazaki, H., Ishibashi, S., and Kurokawa, M. (1988) Mol. Cell. Biochem. 81, 37-41.
117. Reynolds, G.A., Basu, S.K., Osborne, T.F., Chin, D.J., Gil, G., Brown, M.S., Goldstein, J.L., and Luskey, K.L. (1984) Cell 38, 275-285.
118. Young, R.M., Shull, G.E., and Lingrel, J.B. (1987) J. Biol. Chem. 262, 4905-4910.
119. Shull, G.E., Lane, L.K., and Lingrel, J.B. (1986) Nature 321, 429-431.
120. Kozak, M. (1987) Nuc. Acids Res. 15, 8125-8148.
121. Resnekov, L., Ben-Asher, E., Bengal, E., Choder, M., Hay, N., Kessler, M., Ragimov, N., Seiberg, M., Skolnik-David, H., and Aloni, Y. (1988) Gene (Amst.) 72, 91-104.
122. Martin, F.H. and Tinoco, I. (1980) Nuc. Acids Res. 8, 2295-2299.
123. Wilson, T. and Treisman, R. (1988) Nature 336, 396-399.
124. Breathnach, R. and Chambon, P. (1981) in Annual Review of Biochemistry (Snell, E.E., Boyer, P.D., Meister, A., and Richardson, C.C., Eds.), Vol. 50, pp. 349-383, Annual Reviews, Inc., Palo Alto, CA.
125. Myers, R.M., Tilly, K., and Maniatis, T. (1986) Science 232, 613-618.
126. O'Shea-Greenfield, A. and Smale, S.T. (1992) J. Biol. Chem. 267, 1391-1402.
127. Gardiner-Garden, M. and Frommer, M. (1987) J. Mol. Biol. 196, 261-282.
128. Treisman, R. (1985) Cell 42, 889-902.
129. Knight, E., Jr., Anton, B.D., Fahey, D., Friedland, B.K., and Jonak, G.J. (1985) Proc. Natl. Acad. Sci. USA 82, 151-154.
130. Faisst, S. and Meyer, S. (1992) Nuc. Acids Res. 20, 3-26.
131. Hipskind, R.A. and Nordheim, A. (1991) J. Biol. Chem. 266, 19583-19592.
132. Ureta, T., Radojković, J., Lagos, R., Guixé, V., and Nuñez, L. (1979) Arch. Biol. Med. Exper. 12, 587-604.
133. Nasrin, N., Ercolani, L., Denaro, M., Kong, X.F., Kang, I., and Alexander, M. (1990) Proc. Natl. Acad. Sci. USA 87, 5273-5277.
134. Thelen, A.P. and Wilson, J.E. (1991) Arch. Biochem. Biophys. 286, 645-651.
135. Garner, M.M. and Revzin, A.R. (1981) Nuc. Acids Res. 5, 3157-3170.
136. Ceglarek, J.A. and Revzin, A.R. (1989) Electrophoresis 10, 360-365.
APPENDIX

LIST OF RESTRICTION SITES IN HEXOKINASE TYPE II cDNA
(Composite, includes engineered NcoI site at 1557)

ENZYME (SEQUENCE)     #   SITES   FRAGMENTS      FRAGMENT ENDS

AAT 2 (GACGTC)        2    3330   3330 (91.6)       1  3330
                           3445    189 ( 5.2)    3445  3634
                                   115 ( 3.2)    3330  3445

ACC 1 (GTVWAC)        1    1442   2192 (60.3)    1442  3634
                                  1442 (39.7)       1  1442

ACY 1 (GPCGQC)        2    3330   3330 (91.6)       1  3330
                           3445    189 ( 5.2)    3445  3634
                                   115 ( 3.2)    3330  3445

AFL 3 (ACPQGT)        2    1740   1740 (47.9)       1  1740
                           2856   1116 (30.7)    1740  2856
                                   778 (21.4)    2856  3634

AHA 2 (GPCGQC)        2    3330   3330 (91.6)       1  3330
                           3445    189 ( 5.2)    3445  3634
                                   115 ( 3.2)    3330  3445

AHA 3 (TTTAAA)        2    2997   2997 (82.5)       1  2997
                           3601    604 (16.6)    2997  3601
                                    33 ( 0.9)    3601  3634

ALU 1 (AGCT)         24      33    483 (13.3)    1146  1629
                             55    432 (11.9)     636  1068
                            234    363 (10.0)    1914  2277
                            352    303 ( 8.3)    2388  2691
                            373    275 ( 7.6)    2952  3227
                            570    270 ( 7.4)    1644  1914
                            615    201 ( 5.5)    3234  3435
                            636    197 ( 5.4)     373   570
                           1068    179 ( 4.9)      55   234
                           1107    132 ( 3.6)    3502  3634
                           1146    118 ( 3.2)     234   352
                           1629    111 ( 3.1)    2277  2388
                           1644    106 ( 2.9)    2793  2899
                           1914    102 ( 2.8)    2691  2793
                           2277     67 ( 1.8)    3435  3502
                           2388     53 ( 1.5)    2899  2952
                           2691     45 ( 1.2)     570   615
                           2793     39 ( 1.1)    1107  1146
                           2899     39 ( 1.1)    1068  1107
                           2952     33 ( 0.9)       1    33
                           3227     22 ( 0.6)      33    55
                           3234     21 ( 0.6)     615   636
                           3435     21 ( 0.6)     352   373
                           3502     15 ( 0.4)    1629  1644
                                     7 ( 0.2)    3227  3234

APA 1 (GGGCCC)        1    1599   2035 (56.0)    1599  3634
                                  1599 (44.0)       1  1599

APA L1 (GTGCAC)       1    2662   2662 (73.3)       1  2662
                                   972 (26.7)    2662  3634

AVA 1 (CQCGPG)        2      99   3482 (95.8)     152  3634
                            152     99 ( 2.7)       1    99
                                    53 ( 1.5)      99   152

AVA 2 (GGRCC)         9     754    982 (27.0)     831  1813
                            831    754 (20.7)       1   754
                           1813    598 (16.5)    2361  2959
                           2361    548 (15.1)    1813  2361
                           2959    452 (12.4)    3158  3610
                           3015    126 ( 3.5)    3032  3158
                           3032     77 ( 2.1)     754   831
                           3158     56 ( 1.5)    2959  3015
                           3610     24 ( 0.7)    3610  3634
                                    17 ( 0.5)    3015  3032

AVA 3 (ATGCAT)        3    1859   1859 (51.2)       1  1859
                           1895    924 (25.4)    1895  2819
                           2819    815 (22.4)    2819  3634
                                    36 ( 1.0)    1859  1895

AVR 2 (CCTAGG)        1    2614   2614 (71.9)       1  2614
                                  1020 (28.1)    2614  3634

BAL 1
(TGGCCA) 3 597 2332 (64.2) 1302 3634 1134 597 (16.4) 1 597 1302 537 (14.8) 597 1134 168 ( 4.6) 1134 1302 BAN 1 (GGQPCC) 3 1498 1498 (41.2) 1 1498 2231 851 (23.4) 2231 3082 3082 733 (20.2) 1498 2231 552 (15.2) 3082 3634 BAN 2 (GPGCQC) 4 233 1721 (47.4) 1913 3634 1045 812 (22.3) 233 1045 1599 554 (15.2) 1045 1599 1913 314 ( 8.6) 1599 1913 233 ( 6.4) 1 233 BBV 1 (GCTGC) 14 189 737 (20.3) 2897 3634 371 523 (14.4) 371 894 894 347 ( 9.5) 1729 2076 1066 289 ( 8.0) 1066 1355 1355 267 ( 7.3) 2076 2343 1539 213 ( 5.9) 2684 2897 1553 193 ( 5.3) 2491 2684 1642 189 ( 5.2) 1 189 1729 184 ( 5.1) 1355 1539 2076 182 ( 5.0) 189 371 2343 172 ( 4.7) 894 1066 2491 148 ( 4.1) 2343 2491 2684 89 ( 2.4) 1553 1642 2897 87 ( 2.4) 1642 1729 14 ( 0.4) 1539 1553 BCL 1 (TGATCA) 2 856 1571 (43.2) 856 2427 2427 1207 (33.2) 2427 3634 856 (23.6) 1 856 BGL 1 (GCCNNNNNGGC) 1 1358 2276 (62.6) 1358 3634 1358 (37.4) 1 1358 BGL 2 (AGATCT) 3 525 1344 (37.0) 525 1869 1869 967 (26.6) 1869 2836 2836 798 (22.0) 2836 3634 525 (14.4) 1 525 BIN 1 (GGATC) BSM 1 (GAATGC) BSP 1286 (G2GC3C) BSP M1 BSP M2 833 H2 BST E2 BST N1 (ACCTGC) (TCCGGA) (GCGCGC) (GGTNACC) (CCRGG) 16 123 SITES 84 445 760 1789 590 233 470 886 1045 1499 1599 1913 2626 2662 3607 324 409 762 2680 2095 3142 72 105 436 595 1058 1123 FRAGMENTS 1845 (50.8) 1029 (28.3) 361 ( 9.9) 315 ( 8.7) 84 ( 2.3) 3044 (83.8) 590 (16.2) 972 (26.7) 713 (19.6) 454 (12.5) 416 (11.4) 314 ( 8.6) 237 ( 6.5) 233 ( 6.4) 159 ( 4.4) 100 ( 2.8) 36 ( 1.0) 3607 (99.3) 27 ( 0.7) 2872 (79.0) 353 ( 9.7) 324 ( 8.9) 85 ( 2.3) 2680 (73.7) 954 (26.3) 2095 (57.6) 1047 (28.8) 492 (13.5) 464 (12.8) 463 (12.7) 336 ( 9.2) 331 ( 9.1) 284 ( 7.8) 260 ( 7.2) FRAGMENT ENDS 1789 760 445 590 2662 1913 1045 470 1599 233 886 1499 2626 3607 762 409 324 2680 2095 3142 2402 595 1615 105 2866 3150 3634 1789 445 760 3634 590 3634 2626 1499 886 1913 470 233 1045 1599 2662 3607 3634 3634 762 324 409 2680 3634 2095 3142 3634 2866 1058 1951 436 3150 3410 124 # SITES FRAGMENTS FRAGMENT ENDS 1270 252 ( 6.9) 1363 
1615 1363 224 ( 6.2) 3410 3634 1615 207 ( 5.7) 2140 2347 1951 189 ( 5.2) 1951 2140 2140 159 ( 4.4) 436 595 2347 147 ( 4.0) 1123 1270 2402 93 ( 2.6) 1270 1363 2866 72 ( 2.0) 1 72 3150 65 ( 1.8) 1058 1123 3410 55 ( 1.5) 2347 2402 33 ( 0.9) 72 105 BST X1 (CCANNNNNNTGG) 2 600 2908 (80.0) 726 3634 726 600 (16.5) 1 600 126 ( 3.5) 600 726 CFR 1 (QGGCCP) 7 597 1344 (37.0) 1365 2709 807 925 (25.5) 2709 3634 1134 597 (16.4) 1 597 1302 327 ( 9.0) 807 1134 1341 210 ( 5.8) 597 807 1365 168 ( 4.6) 1134 1302 2709 39 ( 1.1) 1302 1341 24 ( 0.7) 1341 1365 CLA 1 (ATCGAT) 1 2471 2471 (68.0) 1 2471 1163 (32.0) 2471 3634 DDE 1 (CTNAG) 14 30 821 (22.6) 1691 2512 289 544 (15.0) 2559 3103 540 455 (12.5) 711 1166 711 372 (10.2) 3224 3596 1166 364 (10.0) 1166 1530 1530 259 ( 7.1) 30 289 1631 251 ( 6.9) 289 540 1691 171 ( 4.7) 540 711 2512 121 ( 3.3) 3103 3224 2547 101 ( 2.8) 1530 1631 2559 60 ( 1.7) 1631 1691 3103 38 ( 1.0) 3596 3634 3224 35 ( 1.0) 2512 2547 3596 30 ( 0.8) 1 30 12 ( 0.3) 2547 2559 EAE 1 (QGGCCP) 7 597 1344 (37.0) 1365 2709 807 925 (25.5) 2709 3634 1134 597 (16.4) 1 597 1302 327 ( 9.0) 807 1134 1341 210 ( 5.8) 597 807 # ECO 0109 (PGGNCCQ) 4 END 4H1 (GCNGC) 21 FNU D2 (CGCG) 4 FOX 1 (GGATG) 23 125 SITES 1365 2709 25 1812 3031 3609 186 189 371 894 1066 1355 1367 1539 1542 1553 1642 1729 2076 2343 2433 2491 2505 2684 2711 2897 2922 21 1335 1369 2681 195 412 535 685 1012 1092 1120 1234 1454 1534 1879 2038 FRAGMENTS 168 ( 4.6) 39 ( 1.1) 24 ( 0.7) 1787 (49.2) 1219 (33.5) 578 (15.9) 25 ( 0.7) 25 ( 0.7) 712 (19.6) 523 (14.4) 347 ( 9.5) 289 ( 8.0) 267 ( 7.3) 186 ( 5.1) 186 ( 5.1) 182 ( 5.0) 179 ( 4.9) 172 ( 4.7) 172 ( 4.7) 90 ( 2.5) 89 ( 2.4) 87 ( 2.4) 58, ( 1.6) 27 ( 0.7) 25 ( 0.7) 14 ( 0.4) 12 ( 0.3) 11 ( 0.3) 3 ( 0.1) 3 ( 0.1) 1314 (36.2) 1312 (36.1) 953 (26.2) 34 ( 0.9) 21 ( 0.6) 345 ( 9.5) 327 ( 9.0) 300 ( 8.3) 228 ( 6.3) 220 ( 6.1) 219 ( 6.0) 217 ( 6.0) 196 ( 5.4) 195 ( 5.4) 159 ( 4.4) 158 ( 4.3) 150 ( 4.1) 1134 1302 1341 25 1812 3031 3609 2922 371 1729 1066 2076 2711 189 2505 
1367 894 2343 1553 1642 2433 2684 2897 2491 1355 1542 1539 186 21 1369 2681 1335 1534 685 3194 2926 1234 2383 195 2602 1879 2143 535 FRAGMENT ENDS 1302 1341 1365 1812 3031 3609 3634 25 3634 894 2076 1355 2343 2897 186 371 2684 1539 1066 2433 1642 1729 2491 2711 2922 2505 1367 1553 1542 189 1335 2681 3634 1369 21 1879 1012 3494 3154 1454 2602 412 2798 195 2038 2301 685 # GDI 2 (QGGCCG) 4 HAE 1 (RGGCCR) 9 HAE 2 (PGCGCQ) 2 HAE 3 (GGCC) 17 126 SITES 2089 2143 2301 2350 2383 2602 2798 2926 3154 3194 3494 807 1341 1365 2709 496 597 1134 1248 1302 1782 2218 2916 3095 148 2517 26 103 497 598 808 1135 1249 1303 1342 1366 1600 1783 2219 2504 2710 FRAGMENTS 14o ( 3 128 ( 3 123 ( 3 114 ( 3 80 ( 2 80 ( 2 54 ( 1 51 ( 1 49 ( 1 4o ( 1 33 ( 0 28 ( 0 1344 (37. 925 (25. 807 (22. 534 (14. 24 ( o. 698 (19 539 (14 537 (14 496 (13 480 (13 436 (12. 179 ( 4. 114 ( 3. 101 ( 2 54 ( 1 2369 (65. 1117 (30. 148 ( 4. 538 (14. 436 (12 394 (10. 327 ( 9. 285 ( 7 234 ( 6 210 ( 5 207 ( 5 206 ( 5 183 ( 5 179 ( 4 114 ( 3 101 ( 2 77 ( 2 54 ( 1 3494 2798 412 1120 1454 1012 2089 2038 2301 3154 2350 1092 1365 2709 807 1341 2218 3095 597 1302 1782 2916 1134 496 1248 148 2517 3096 1783 103 808 2219 1366 598 2710 2504 1600 2917 1135 497 1249 FRAGMENT ENDS 3634 2926 535 1234 1534 1092 2143 2089 2350 3194 2383 1120 2709 3634 807 1341 1365 2916 3634 1134 496 1782 2218 3095 1248 597 1302 2517 3634 148 3634 2219 497 1135 2504 1600 808 2917 2710 1783 3096 1249 598 103 1303 127 # SITES FRAGMENTS FRAGMBNT ENDS 2917 39 ( 1.1) 1303 1342 3096 26 ( 0.7) 1 26 24 ( 0.7) 1342 1366 HGA 1 (GACGC) 4 283 1932 (53.2) 1702 3634 922 639 (17.6) 283 922 1471 549 (15.1) 922 1471 1702 283 ( 7.8) 1 283 231 ( 6.4) 1471 1702 HGI.A1 (GRGCRC) 5 233 1443 (39.7) 470 1913 470 972 (26.7) 2662 3634 1913 713 (19.6) 1913 2626 2626 237 ( 6.5) 233 470 2662 233 ( 6.4) 1 233 36 ( 1.0) 2626 2662 HGI c1 (GGQPCC) 3 1498 1498 (41.2) 1 1498 2231 851 (23.4) 2231 3082 3082 733 (20.2) 1498 2231 552 (15.2) 3082 3634 HGI J2 (GPGCQC) 4 233 1721 (47.4) 1913 3634 
1045 812 (22.3) 233 1045 1599 554 (15.2) 1045 1599 1913 314 ( 8.6) 1599 1913 233 ( 6.4) 1 233 HHA 1 (GCGC) 12 149 1105 (30.4) 1353 2458 778 934 (25.7) 2700 3634 955 629 (17.3) 149 778 1267 312 ( 8.6) 955 1267 1336 177 ( 4.9) 778 955 1353 149 ( 4.1) 1 149 2458 90 ( 2.5) 2518 2608 2518 72 ( 2.0) 2608 2680 2608 69 ( 1.9) 1267 1336 2680 60 ( 1.7) 2458 2518 2682 18 ( 0.5) 2682 2700 2700 17 ( 0.5) 1336 1353 2 ( 0.1) 2680 2682 128 # SITES FRAGMENTS FRAGMENT ENDS HINC 2 (GTQPAC) 3 260 1906 (52.4) 403 2309 403 1325 (36.5) 2309 3634 2309 260 ( 7.2) 1 260 143 ( 3.9) 260 403 HIND 3 (AAGCTT) 3 54 2738 (75.3) 54 2792 2792 683 (18.8) 2951 3634 2951 159 ( 4.4) 2792 2951 54 ( 1.5) 1 54 HINF 1 (GANTC) 14 76 1058 (29.1) 181 1239 181 442 (12.2) 1619 2061 1239 419 (11.5) 2116 2535 1279 315 ( 8.7) 3107 3422 1314 247 ( 6.8) 2535 2782 1383 237 ( 6.5) 2870 3107 1619 236 ( 6.5) 1383 1619 2061 212 ( 5.8) 3422 3634 2116 105 ( 2.9) 76 181 2535 88 ( 2.4) 2782 2870 2782 76 ( 2.1) 1 76 2870 69 ( 1.9) 1314 1383 3107 55 ( 1.5) 2061 2116 3422 40 ( 1.1) 1239 1279 35 ( 1.0) 1279 1314 HPA 2 (CCGG) 13 92 834 (22.9) 763 1597 100 459 (12.6) 2929 3388 143 442 (12.2) 2235 2677 325 353 ( 9.7) 410 763 410 312 ( 8.6) 1810 2122 763 252 ( 6.9) 2677 2929 1597 246 ( 6.8) 3388 3634 1810 213 ( 5.9) 1597 1810 2122 182 ( 5.0) 143 325 2235 113 ( 3.1) 2122 2235 2677 92 ( 2.5) 1 92 2929 85 ( 2.3) 325 410 3388 43 ( 1.2) 100 143 8 ( 0.2) 92 100 HPH 1 (GGTGA) 13 37 672 (18.5) 2158 2830 484 523 (14.4) 1561 2084 651 447 (12.3) 37 484 814 435 (12.0) 1126 1561 1126 314 ( 8.6) 2830 3144 1561 312 ( 8.6) 814 1126 2084 254 ( 7.0) 3380 3634 2097 236 ( 6.5) 3144 3380 129 # SITES FRAGMENTS FRAGMENT ENDS 2119 167 ( 4.6) 484 651 2158 163 ( 4.5) 651 814 2830 39 ( 1.1) 2119 2158 3144 37 ( 1.0) 1 37 3380 22 ( 0.6) 2097 2119 13 ( 0.4) 2084 2097 KPN 1 (GGTACC) 1 3082 3082 (84.8) 1 3082 552 (15.2) 3082 3634 M30 2 (GAAGA) 17 631 631 (17.4) 1 631 1078 513 (14.1) 2496 3009 1129 447 (12.3) 631 1078 1223 386 (10.6) 1223 1609 1609 349 ( 9.6) 3009 
3358 1662 281 ( 7.7) 1917 2198 1723 187 ( 5.1) 1723 1910 1910 182 ( 5.0) 3452 3634 1917 164 ( 4.5) 2258 2422 2198 94 ( 2.6) 1129 1223 2258 91 ( 2.5) 3358 3449 2422 74 ( 2.0) 2422 2496 2496 61 ( 1.7) 1662 1723 3009 60 ( 1.7) 2198 2258 3358 53 ( 1.5) 1609 1662 3449 51 ( 1.4) 1078 1129 3452 7 ( 0.2) 1910 1917 3 ( 0.1) 3449 3452 MNL 1 (CCTC) 54 24 276 ( 7.6) 950 1226 29 196 ( 5.4) 204 400 158 152 ( 4.2) 1889 2041 204 145 ( 4.0) 1652 1797 400 142 ( 3.9) 641 783 453 139 ( 3.8) 3474 3613 499 136 ( 3.7) 2521 2657 542 131 ( 3.6) 783 914 641 129 ( 3.5) 29 158 783 120 ( 3.3) 1532 1652 914 119 ( 3.3) 2294 2413 941 117 ( 3.2) 3030 3147 950 111 ( 3.1) 1292 1403 1226 99 ( 2.7) 542 641 1292 98 ( 2.7) 2657 2755 1403 89 ( 2.4) 2413 2502 1489 86 ( 2.4) 1403 1489 1492 75 ( 2.1) 2801 2876 1522 75 ( 2.1) 2129 2204 1532 70 ( 1.9) 3223 3293 1652 70 ( 1.9) 2221 2291 1797 66 ( 1.8) 1226 1292 1846 63 ( 1.7) 3356 3419 1889 62 ( 1.7) 3161 3223 MST 1 (TGCGCA) MST 2 (CCTNAGG) NCI 1 (CCSGG) (NCO 1, CCATGG) 130 # SITES 2041 2044 2081 2087 2129 2204 2221 2291 2294 2413 2502 2521 2657 2755 2801 2876 2933 2983 3030 3147 3161 3223 3293 3311 3315 3318 3356 3419 3474 3613 1352 2457 539 99 100 1597 1810 2676 2929 1 1557 E a 1352 1177 1105 3095 539 1497 866 705 253 213 99 2077 1557 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA OOOOOOOOOOOOOOOOOHHHHHHHHHHHHHH (37 (32. (30. (85. 
FRAGMENT ENDS
2876 3419 400 2933 1797 2983 2755 453 158 1846
2087 3318 2044 1492 914 3613 2502 3293 2204 3147
1522 941 2081 3311 3315 2291 2041 1489 2457 1352
539 100 1810 2929 2676 1597 99 1558 2933 3474
453 2983 1846 3030 2801 499 204 1889 542 2129
3356 2081 1522 941 3634 2521 3311 2221 3161 1532
950 2087 3315 3318 2294 2044 1492 1352 3634 2457
3634 539 1597 2676 3634 2929 1810 100 3634 1557

                          #  SITES   FRAGMENTS  FRAGMENT ENDS

NDE 1 (CATATG)            1    209   3425 (94.2)    209  3634
                                      209 ( 5.8)      1   209

NLA 3 (CATG)             22    280    657 (18.1)   2822  3479
                               425    607 (16.7)   1099  1706
                               550    288 ( 7.9)   1966  2254
                               607    280 ( 7.7)      1   280
                               835    270 ( 7.4)   2437  2707
                               910    228 ( 6.3)    607   835
                               934    192 ( 5.3)   1706  1898
                               958    155 ( 4.3)   3479  3634
                               970    145 ( 4.0)    280   425
                              1042    125 ( 3.4)    425   550
                              1099    123 ( 3.4)   2314  2437
                              1706    111 ( 3.1)   2707  2818
                              1898     75 ( 2.1)    835   910
                              1960     72 ( 2.0)    970  1042
                              1966     62 ( 1.7)   1898  1960
                              2254     60 ( 1.7)   2254  2314
                              2314     57 ( 1.6)   1042  1099
                              2437     57 ( 1.6)    550   607
                              2707     24 ( 0.7)    934   958
                              2818     24 ( 0.7)    910   934
                              2822     12 ( 0.3)    958   970
                              3479      6 ( 0.2)   1960  1966
                                        4 ( 0.1)   2818  2822

NLA 4 (GGNNCC)           17     68    611 (16.8)   2231  2842
                               321    552 (15.2)   3082  3634
                               455    419 (11.5)   1812  2231
                               563    267 ( 7.3)    563   830
                               830    253 ( 7.0)     68   321
                               980    246 ( 6.8)   1190  1436
                              1190    213 ( 5.9)   1599  1812
                              1436    210 ( 5.8)    980  1190
                              1498    150 ( 4.1)    830   980
                              1599    134 ( 3.7)    321   455
                              1812    116 ( 3.2)   2842  2958
                              2231    108 ( 3.0)    455   563
                              2842    101 ( 2.8)   1498  1599
                              2958     68 ( 1.9)      1    68
                              3015     62 ( 1.7)   1436  1498
                              3032     57 ( 1.6)   2958  3015
                              3082     50 ( 1.4)   3032  3082
                                       17 ( 0.5)   3015  3032

NSI 1 (ATGCAT)            3   1859   1859 (51.2)      1  1859
                              1895    924 (25.4)   1895  2819
                              2819    815 (22.4)   2819  3634
                                       36 ( 1.0)   1859  1895

NSP B2 (CVGCWG)           7     32   1203 (33.1)   2431  3634
                               372    891 (24.5)   1540  2431
                               569    498 (13.7)    569  1067
                              1067    340 ( 9.4)     32   372
                              1368    301 ( 8.3)   1067  1368
                              1540    197 ( 5.4)    372   569
                              2431    172 ( 4.7)   1368  1540
                                       32 ( 0.9)      1    32

NSP C1 (PCATGQ)           4    279   1198 (33.0)   2436  3634
                               957    748 (20.6)    957  1705
                              1705    731 (20.1)   1705  2436
                              2436    678 (18.7)    279   957
                                      279 ( 7.7)      1   279

PFL M1 (CCANNNNNTGG)      2    502   2585 (71.1)    502  3087
                              3087    547 (15.1)   3087  3634
                                      502 (13.8)      1   502

PPU M1 (PGGRCCQ)          3   1812   1812 (49.9)      1  1812
                              3031   1219 (33.5)   1812  3031
                              3609    578 (15.9)   3031  3609
                                       25 ( 0.7)   3609  3634

PVU 2 (CAGCTG)            4     32   2567 (70.6)   1067  3634
                               372    498 (13.7)    569  1067
                               569    340 ( 9.4)     32   372
                              1067    197 ( 5.4)    372   569
                                       32 ( 0.9)      1    32

RSA 1 (GTAC)              6    999    999 (27.5)      1   999
                              1096    861 (23.7)   1096  1957
                              1957    643 (17.7)   2440  3083
                              2440    483 (13.3)   1957  2440
                              3083    453 (12.5)   3181  3634
                              3181     98 ( 2.7)   3083  3181
                                       97 ( 2.7)    999  1096

RSR 2 (CGGRCCG)           1   2360   2360 (64.9)      1  2360
                                     1274 (35.1)   2360  3634

SAC 1 (GAGCTC)            2    233   1721 (47.4)   1913  3634
                              1913   1680 (46.2)    233  1913
                                      233 ( 6.4)      1   233

SAC 2 (CCGCGG)            1   1368   2266 (62.4)   1368  3634
                                     1368 (37.6)      1  1368

SAU 1 (CCTNAGG)           1    539   3095 (85.2)    539  3634
                                      539 (14.8)      1   539

SAU 3A (GATC)            12     84    797 (21.9)   2837  3634
                               199    754 (20.7)   1036  1790
                               214    558 (15.4)   1870  2428
                               446    409 (11.3)   2428  2837
                               526    234 ( 6.4)    526   760
                               760    232 ( 6.4)    214   446
                               857    179 ( 4.9)    857  1036
                              1036    115 ( 3.2)     84   199
                              1790     97 ( 2.7)    760   857
                              1870     84 ( 2.3)      1    84
                              2428     80 ( 2.2)   1790  1870
                              2837     80 ( 2.2)    446   526
                                       15 ( 0.4)    199   214

SAU 96 (GGNCC)           13     26    768 (21.1)    831  1599
                               102    652 (17.9)    102   754
                               754    598 (16.5)   2361  2959
                               831    548 (15.1)   1813  2361
                              1599    452 (12.4)   3158  3610
                              1600    213 ( 5.9)   1600  1813
                              1813    126 ( 3.5)   3032  3158
                              2361     77 ( 2.1)    754   831
                              2959     76 ( 2.1)     26   102
                              3015     56 ( 1.5)   2959  3015
                              3032     26 ( 0.7)      1    26
                              3158     24 ( 0.7)   3610  3634
                              3610     17 ( 0.5)   3015  3032
                                        1 ( 0.0)   1599  1600

SCR F1 (CCNGG)           22     72    463 (12.7)    595  1058
                                99    331 ( 9.1)    105   436
                               100    274 ( 7.5)   2402  2676
                               105    260 ( 7.2)   3150  3410
                               436    234 ( 6.4)   1363  1597
                               595    224 ( 6.2)   3410  3634
                              1058    221 ( 6.1)   2929  3150
                              1123    207 ( 5.7)   2140  2347
                              1270    195 ( 5.4)   1615  1810
                              1363    190 ( 5.2)   2676  2866
                              1597    189 ( 5.2)   1951  2140
                              1615    159 ( 4.4)    436   595
                              1810    147 ( 4.0)   1123  1270
                              1951    141 ( 3.9)   1810  1951
                              2140     93 ( 2.6)   1270  1363
                              2347     72 ( 2.0)      1    72
                              2402     65 ( 1.8)   1058  1123
                              2676     63 ( 1.7)   2866  2929
                              2866     55 ( 1.5)   2347  2402
                              2929     27 ( 0.7)     72    99
                              3150     18 ( 0.5)   1597  1615
                              3410      5 ( 0.1)    100   105
                                        1 ( 0.0)     99   100

SDU 1 (G2GC3C)            9    233    972 (26.7)   2662  3634
                               470    713 (19.6)   1913  2626
                               886    454 (12.5)   1045  1499
                              1045    416 (11.4)    470   886
                              1499    314 ( 8.6)   1599  1913
                              1599    237 ( 6.5)    233   470
                              1913    233 ( 6.4)      1   233
                              2626    159 ( 4.4)    886  1045
                              2662    100 ( 2.8)   1499  1599
                                       36 ( 1.0)   2626  2662

SFA N1 (GATGC)           12    963    963 (26.5)      1   963
                              1264    462 (12.7)   1264  1726
                              1726    352 ( 9.7)   2925  3277
                              1858    301 ( 8.3)    963  1264
                              2037    282 ( 7.8)   2643  2925
                              2069    257 ( 7.1)   3377  3634
                              2263    245 ( 6.7)   2263  2508
                              2508    194 ( 5.3)   2069  2263
                              2643    179 ( 4.9)   1858  2037
                              2925    135 ( 3.7)   2508  2643
                              3277    132 ( 3.6)   1726  1858
                              3377    100 ( 2.8)   3277  3377
                                       32 ( 0.9)   2037  2069

SMA 1 (CCCGGG)            1     99   3535 (97.3)     99  3634
                                       99 ( 2.7)      1    99

SPH 1 (GCATGC)            1   1705   1929 (53.1)   1705  3634
                                     1705 (46.9)      1  1705

STU 1 (AGGCCT)            1   1248   2386 (65.7)   1248  3634
                                     1248 (34.3)      1  1248

STY 1 (CCRRGG)            8   1137   1137 (31.3)      1  1137
                              1156    829 (22.8)   1785  2614
                              1785    629 (17.3)   1156  1785
                              2614    419 (11.5)   3092  3511
                              2811    209 ( 5.8)   2811  3020
                              3020    197 ( 5.4)   2614  2811
                              3092    123 ( 3.4)   3511  3634
                              3511     72 ( 2.0)   3020  3092
                                       19 ( 0.5)   1137  1156

TAQ 1 (TCGA)             11    121    865 (23.8)    173  1038
                               153    803 (22.1)   2472  3275
                               173    498 (13.7)   1920  2418
                              1038    489 (13.5)   1431  1920
                              1197    359 ( 9.9)   3275  3634
                              1242    189 ( 5.2)   1242  1431
                              1431    159 ( 4.4)   1038  1197
                              1920    121 ( 3.3)      1   121
                              2418     54 ( 1.5)   2418  2472
                              2472     45 ( 1.2)   1197  1242
                              3275     32 ( 0.9)    121   153
                                       20 ( 0.6)    153   173

TTH111 2 (CCAPCA)         6   1591   1591 (43.8)      1  1591
                              1818    911 (25.1)   2227  3138
                              2014    333 ( 9.2)   3301  3634
                              2227    227 ( 6.2)   1591  1818
                              3138    213 ( 5.9)   2014  2227
                              3301    196 ( 5.4)   1818  2014
                                      163 ( 4.5)   3138  3301

XHO 1 (CTCGAG)            1    152   3482 (95.8)    152  3634
                                      152 ( 4.2)      1   152

XHO 2 (PGATCQ)            5    445   1264 (34.8)    525  1789
                               525    967 (26.6)   1869  2836
                              1789    798 (22.0)   2836  3634
                              1869    445 (12.2)      1   445
                              2836     80 ( 2.2)   1789  1869
                                       80 ( 2.2)    445   525

XMN 1 (GAANNNNTTC)        1   2111   2111 (58.1)      1  2111
                                     1523 (41.9)   2111  3634

The following do not appear:

AFL 2     ASU 2     ECO R1    ECO R5    NAE 1     NAR 1     NOT 1
NRU 1     RRU 1     SAL 1     SNA 1     SNA B1    TTH111 1  XBA 1
HPA       NCO       PST       SCA       SPE       H1        DRA
PVU       SFI       SSP
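
The FRAGMENTS and FRAGMENT ENDS columns in this listing can be derived from the SITES column alone. The Python sketch below illustrates that arithmetic. It is not the program that produced the listing. It assumes a linear sequence of 3634 positions (the largest coordinate in the listing), percentages computed as fragment size over total length, and fragments ordered largest first with ties broken by the higher start coordinate, which is the order observed in the entries above. The function name `digest_fragments` is hypothetical.

```python
from typing import List, Tuple

SEQ_LENGTH = 3634  # assumed total length: the largest coordinate in the listing


def digest_fragments(
    sites: List[int], length: int = SEQ_LENGTH
) -> List[Tuple[int, float, int, int]]:
    """Return (size, percent, start, end) rows for a linear digest, largest first."""
    # Cut positions plus the two ends of the sequence bound every fragment.
    bounds = [0] + sorted(sites) + [length]
    rows = []
    for left, right in zip(bounds, bounds[1:]):
        size = right - left
        percent = round(100.0 * size / length, 1)
        # The listing prints the first fragment as starting at position 1.
        start = left if left > 0 else 1
        rows.append((size, percent, start, right))
    # Largest fragment first; equal sizes ordered by higher start coordinate
    # (an assumption made to match the order seen in the listing).
    rows.sort(key=lambda r: (-r[0], -r[2]))
    return rows


if __name__ == "__main__":
    # Sites listed for the ATGCAT entry in the listing.
    for size, pct, start, end in digest_fragments([1859, 1895, 2819]):
        print(f"{size:>5} ({pct:4.1f}) {start:>5} {end:>5}")
```

Because every fragment is a difference between adjacent boundaries, the sizes for any entry should sum to the assumed sequence length, which gives a quick consistency check on a transcribed entry.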