IDENTIFICATION AND CHARACTERIZATION OF PROMOTER REGIONS OF THE GENE FOR RAT TYPE I HEXOKINASE

By

Wenjing Liu

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Biochemistry

1997

ABSTRACT

IDENTIFICATION AND CHARACTERIZATION OF PROMOTER REGIONS OF THE GENE FOR RAT TYPE I HEXOKINASE

By

Wenjing Liu

The 5’-flanking region of the gene for rat type I hexokinase has previously been isolated from genomic clones and sequenced. 5’-RACE, RT-PCR and RNase protection assays indicated that there are multiple transcriptional start sites clustered in three regions at positions approximately -460, -300, and -100 relative to the translational start codon. These regions lack classical TATA sequences and are located in a GC-rich segment, a “CpG island”. These characteristics are frequently associated with “housekeeping genes”. The goal of this dissertation is to identify the promoter regions of the type I hexokinase gene and to characterize important cis-elements and corresponding trans-factors regulating the promoter activity. PC12 cells and H9c2 cells were transfected with luciferase reporter constructs containing genomic sequence between positions -3366 and -171. A marked (85%) decrease in promoter activity was associated with deletion of sequence between -742 and -516. In DNase I footprinting experiments, two regions, called the P1 (-552 to -529) and P2 (-480 to -458) boxes, were protected by proteins present in nuclear extracts from PC12 cells. The P2 box overlaps with the most upstream cluster of transcriptional start sites, which is about 80 bp downstream from the P1 box. Mutations or deletions in the P2 box had no effect on promoter activity.
In contrast, mutations or deletions in the P1 box had markedly detrimental effects on promoter activity. A second Sp1 site (-570), just upstream from the P1 box, was also shown to be functionally important although not protected in footprinting experiments. Furthermore, the P1 box could be functionally replaced by the -570 Sp1 site. Two DNA-protein complexes were observed in gel-shift experiments with P1 box sequence and PC12 nuclear extract. Maintenance of a consensus Sp1 binding site centrally located in the P1 box was critical for the formation of both complexes. Supershift experiments demonstrated the involvement of Sp1, Sp3, and Sp4 in formation of these complexes, and implicate these transcription factors in regulating promoter activity associated with this region. Another series of reporter constructs, including sequence between -171 and -1, permitted detection of an additional promoter activity downstream from -364. While not yet extensively characterized, it is already evident that the cis-elements influencing the downstream promoter activity are distinct from the Sp factors determined to be important in expression from the upstream promoter region.

To my family

ACKNOWLEDGMENTS

I am grateful to my mentor, Dr. John E. Wilson, for his guidance and encouragement over the years. I learned so much from his discipline in science and generosity in social life. I would like to express my deep appreciation to my committee members, Dr. Paul Coussens, Dr. Don Jump, Dr. Steve Triezenberg and Dr. John Wang, for their criticisms and advice on my work. I also thank Dr. Zach Burton for his discussion and attendance at my dissertation defense. I would like to thank my colleagues, past and present members of the Wilson lab. They were always friendly and willing to help. Whether attending a scientific meeting or just a lunch out together, I enjoyed their company and friendship. Special thanks go to Dr.
Joe White, who taught me molecular biology techniques and collaborated with me at an early stage of my research project. I feel extremely lucky to have made friends with some Chinese women here in the Department of Biochemistry at MSU. They are Yin Tang, Jie Qian, Yue Li and Tong Hao. Because of our common cultural and academic background, we shared many things personal and professional. I thank them for always being available when I needed comfort and support in difficult times. I want to thank my parents, sister and brother back in China. Their unconditional love and firm faith in me pushed me to go this far. Last, but not least, I thank my husband, Yijun Guo, for everything.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER I INTRODUCTION
1. Isozymes of mammalian hexokinase
   Background and kinetic properties
   cDNA sequences and evolution
   Tissue distribution
   Subcellular association
2. Regulation of hexokinase gene expression and promoters of hexokinase genes
   Type I hexokinase
   Type II hexokinase
   Type IV hexokinase (Glucokinase, GK)
   Type III hexokinase
3. Sp1 family of transcription factors
   Sp1
   Sp3 and Sp4
   Roles of Sp1 in transcriptional regulation
   Roles of multiple Sp factors in transcriptional regulation
4. Thesis overview

CHAPTER II MATERIALS AND METHODS
1. General methods
2. Generation of promoter test constructs
3. Transfection of PC12 and H9c2 cells with reporter constructs
4. Reporter assay
5. Non-radioactive labeling of RNA probe and RNase protection assay
6. Preparation of nuclear extracts from PC12 cells
7. Electrophoretic mobility shift ("gel shift") and "supershift" experiments
8. DNase I footprinting
9. Mutation of promoter reporter constructs
10. Hexokinase assay and western blot

CHAPTER III RESULTS
1. Promoter activity associated with Sac I fragment in HKI upstream region
2. Sequence between -742 and -516 is important for promoter activity
3. DNase I footprinting reveals two protected regions
4. The P1 box and Sp1 site at -570 are functionally important cis-elements for promoter activity but the P2 box is not
5. Gel shift and supershift experiments demonstrate binding of Sp family members to the P1 box
6. Effect of overexpression of Sp transcription factors on HKI expression in PC12 cells
7. Reporter constructs including sequence between positions -171 and +1 reveal another region with promoter activity
8. Effect of downstream sequence (+1 to +77) on promoter activity

CHAPTER IV DISCUSSION
1. The upstream promoter: P1 box and -570 Sp1 site
2. Multiple members of Sp family of transcription factors regulate the activity of upstream promoter
3. The P2 box and transcription initiation
4. The downstream promoter
5. Effect of +1 to +77 on the promoter activity

CHAPTER V FUTURE WORK
1. To investigate roles of Sp factors in regulation of the upstream promoter - Transfection into Drosophila SL2 cells
2. To test downstream promoter activity in H9c2 cells
3. To identify cis-elements in the downstream promoter
4. To study transcriptional regulation of hexokinase I expression
5. To define transcriptional start site and promoter usage in different tissues and cell lines

LIST OF REFERENCES

APPENDICES
APPENDIX A DNA SEQUENCE AROUND HKI PROMOTER REGION
APPENDIX B DNA SEQUENCE OF HKI UPSTREAM REGION
APPENDIX C SCHEMATIC REPRESENTATION OF RESTRICTION SITES IN HKI UPSTREAM REGION
APPENDIX D LIST OF RESTRICTION SITES IN HKI UPSTREAM REGION

LIST OF TABLES

Table 1. Molecular weights and kinetic parameters of mammalian hexokinases
Table 2. Comparison of amino acid sequences of N- and C-terminal halves of rat Type I, II and III hexokinases, glucokinase, and yeast hexokinase A
Table 3. Expression of luciferase in PC12 cells transfected with reporter constructs
Table 4. Potential cis-elements in P1 box and P2 box sequences
Table 5. Primers used for generation of substitution or deletion mutants
Table 6. P2 box mutation data
Table 7. DNA fragments used in gel shift experiments
Table 8. Hexokinase activity in PC12 cells overexpressing Sp transcription factors
Table 9. Effect of downstream sequence on promoter activity

LIST OF FIGURES

Figure 1. Partial restriction map of HKI upstream region
Figure 2. Transfection of PC12 and H9c2 cells with promoter-reporter constructs
Figure 3. RNase protection assay
Figure 4. DNase I footprinting analysis with a nuclear extract from PC12 cells
Figure 5. Effect of mutations in the P1 box and upstream (-570) Sp1 site on promoter activity
Figure 6. Gel shift experiment showing complexes formed by proteins in nuclear extracts from PC12 cells and an oligonucleotide including the P1 box sequence
Figure 7. Gel shift experiment showing the effect of mutations in the Sp1 binding site within the P1 box sequence
Figure 8. Supershift experiment showing involvement of Sp1, Sp3, and Sp4 in complexes formed with nuclear extracts from PC12 cells
Figure 9. Summary of gel-shift and super-shift experiments
Figure 10. Overexpression of Sp1, Sp3 and Sp4 in PC12 cells
Figure 11. Transfection of PC12 cells with promoter-reporter constructs including sequence between -171 and -1

LIST OF ABBREVIATIONS

Glc-6-P    glucose-6-phosphate
kDa        kilodalton
Mr         molecular weight
HK         hexokinase
GK         glucokinase
RACE       rapid amplification of cDNA ends
PCR        polymerase chain reaction
RT-PCR     reverse transcription followed by PCR
bp         base pair(s)
kb         kilobase pair(s)
nt         nucleotide(s)
h          hour(s)
min        minute(s)
Sp         specificity protein
SV40       simian virus 40
CMV        cytomegalovirus
µg         microgram(s)
µl         microlitre(s)
cAMP       cyclic AMP
UTR        untranslated region
Pi         orthophosphate
dig        digoxigenin
Hepes      N-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
PBS        phosphate buffered saline
EDTA       (ethylenedinitrilo) tetraacetic acid
PMSF       phenylmethylsulfonyl fluoride
DTT        dithiothreitol
cpm        count(s) per minute
V          volt(s)
SDS        sodium dodecyl sulfate
Fig.       figure
SD         standard deviation
mU         milliunit(s)
mg         milligram(s)
β-gal      β-galactosidase
luc        luciferase

CHAPTER I

INTRODUCTION

1. Isozymes of Mammalian Hexokinase

Background and kinetic properties

Hexokinase (ATP:D-hexose 6-phosphotransferase, EC 2.7.1.1) catalyzes the phosphorylation of glucose, using Mg²⁺·ATP as phosphoryl donor.
The product, glucose-6-phosphate, is a common substrate for phosphohexoisomerase, phosphoglucomutase, and Glc-6-P dehydrogenase, which introduce Glc-6-P into glycolysis, glycogen synthesis and the pentose monophosphate pathway, respectively. Four distinct isozymes of hexokinase exist in mammalian tissues. They are designated as type I, II, III and IV isozymes, based on their order of electrophoretic mobility on starch gels (1). Alternatively, they can be separated by chromatographic techniques, and named isozymes A-D according to their order of elution from DEAE-cellulose columns (2). Although catalyzing the same reaction, the four isozymes can be distinguished from one another by their different molecular weights and kinetic properties. As summarized in Table 1, the type I-III isozymes have molecular weights of about 100 kDa and low Kms for glucose, in the submillimolar range, and are therefore often referred to as the “low Km” isozymes. On the other hand, the type IV isozyme (EC 2.7.1.2, commonly called glucokinase) has a molecular weight of about 50 kDa and a much higher Km for glucose (4.5 mM). All four isozymes have similar Kms for ATP. The “low Km” isozymes are sensitive to inhibition by their reaction product Glc-6-P at physiologically relevant concentrations, but type IV is not (3).

Table 1. Molecular weights and kinetic parameters of mammalian hexokinases

                                   HEXOKINASE ISOZYME
PARAMETER                      I        II       III      IV
Mr (kDa)                       102.3    102.6    100.3    49
Km Glc (mM)                    0.04     0.13     0.02     4.5
Km ATP (mM)                    0.42     0.70     1.29     0.49
Ki Glc-6-P vs. ATP (mM)        0.026    0.021    0.074    15

This table was adapted from Ureta (3), and the references therein. Indicated molecular weights are based on amino acid sequences deduced from cloned cDNAs (4-9).

cDNA sequences and evolution

In recent years, the cDNAs coding for all four isozymes have been cloned from rat (4-9), and the respective amino acid sequences have been deduced.
Additionally, cDNAs coding for hexokinases and glucokinase from many other organisms have also been cloned. Comparison of the amino acid sequences deduced from the cDNAs shows that rat hexokinases I, II and III are very similar. Table 2 shows the extensive sequence similarities of the N- and C-terminal halves of type I, II and III hexokinases to one another, and to glucokinase (9) and yeast hexokinase A (10). Based on the fact that the molecular weight of mammalian hexokinases I-III is twice that of yeast hexokinase and glucokinase, it has been proposed by many researchers (3, 11-15) that the mammalian 100 kDa hexokinases evolved by gene duplication and fusion of an ancestral 50 kDa hexokinase similar to present-day yeast hexokinase and glucokinase. It has been suggested that one of the duplicated catalytic sites retained the catalytic function, while the other evolved to acquire a regulatory role. This theory is strongly supported by the internal repetition of amino acid sequence between the N- and C-terminal halves of the 100 kDa hexokinases, as well as the sequence similarities among the N- and C-terminal halves of hexokinases I-III, glucokinase and yeast hexokinase.

Table 2. Comparison of amino acid sequences of N- and C-terminal halves of rat Type I, II and III hexokinases, glucokinase, and yeast hexokinase A

         NI        NII       NIII      CI          CII         CIII
         (1-475)   (1-475)   (1-488)   (476-918)   (476-917)   (489-924)
NII      68 (14)
NIII     39 (16)   44 (14)
CI       46 (17)   54 (14)   38 (14)
CII      49 (17)   55 (14)   41 (13)   76 (11)
CIII     45 (15)   48 (15)   40 (14)   62 (11)     66 (9)
IV       46 (18)   52 (15)   38 (15)   49 (15)     53 (14)     49 (15)
YHKA     27 (14)   22 (13)   27 (15)   27 (13)

a- This table was adapted from Schwab and Wilson (6), and Thelen and Wilson (7).
b- Abbreviations used: NI, N-terminal half of rat Type I isozyme; NII, N-terminal half of rat Type II isozyme; NIII, N-terminal half of rat Type III isozyme; CI, C-terminal half of rat Type I isozyme; CII, C-terminal half of rat Type II isozyme; CIII, C-terminal half of rat Type III isozyme; IV, rat Type IV hexokinase (glucokinase); YHKA, yeast hexokinase A.
c- Percentage of identical residues is shown without parentheses; percentage of conservative substitutions is shown in parentheses.

This gene duplication and fusion theory is further supported by the same intron-exon structure among hexokinases. The splicing sites and the exon sizes of the type II (16) and type I (17) hexokinase genes repeat directly between the N- and C-terminal halves. The same splicing pattern is also observed in the gene for the 50 kDa glucokinase (18), suggesting that an ancestral 50 kDa hexokinase gene similar to glucokinase underwent gene duplication and fusion to form the hexokinase I and II genes. The gene duplication and fusion theory underwent some modification as more information became available. White and Wilson (19) were able to digest hexokinase I into a 52 kDa N-terminal fragment and a 48 kDa C-terminal domain. The N-terminal domain was selectively protected by Glc-6-P from denaturation in guanidine hydrochloride (19); the C-terminal half was protected by a glucose analog, N-acetylglucosamine (20). The isolated C-terminal half of the enzyme retained catalytic activity (20). Thus, they concluded that the binding site for Glc-6-P resides in the N-terminal half of the intact enzyme and is separate from the catalytic site, which is associated with the C-terminal half. Further study demonstrated, surprisingly, that the isolated C-terminal half of the enzyme is inhibited by Glc-6-P and that both halves of the enzyme possess binding sites for the inhibitor Glc-6-P as well as the substrates glucose and ATP (19).
This led to a modification of the gene duplication and fusion theory, in which the ancestral 50 kDa hexokinase would have had both the glucose binding site and the Glc-6-P regulatory site before gene duplication and fusion occurred. Direct measurement of ligand binding on the intact hexokinase I enzyme showed only one binding site for glucose (21, 22) and one for Glc-6-P (21, 23). Therefore, it was suggested that the glucose site in the N-terminal half and the Glc-6-P site in the C-terminal half are masked in the intact enzyme molecule. Studies using chimeric hexokinases consisting of the N-terminal half of one isozyme and the C-terminal half of another confirmed that catalytic activity is associated with the C-terminal halves and regulatory function is coupled to the N-terminal halves of the type I and type III isozymes (24, 25). However, in the type II isozyme, both halves possess catalytic activity and both halves are sensitive to Glc-6-P inhibition (26). Thus, the type II hexokinase gene is suggested to be the immediate product of gene duplication and fusion, which subsequently evolved into the type I and III genes, in which the N- and C-terminal halves are functionally differentiated.

Tissue distribution

Each of the four isozymes has its distinct tissue distribution (reviewed in references 27, 28). Generally, more than one isozyme is found in most tissues. The type I isozyme is present in virtually all tissues examined to date, at relatively high levels in most tissues except for liver. In fact, in those tissues with a heavy reliance on blood-borne glucose, the type I isozyme is the predominant form. Brain is totally dependent on blood-borne glucose, metabolized through glycolysis, to provide energy, and it contains exclusively the type I isozyme. The same is true of erythrocytes. Therefore, the type I isozyme has been referred to as the “basic” hexokinase and is suggested to play an important role in introducing glucose into glycolysis (27, 28).
The type II isozyme, on the other hand, is the major form in insulin-sensitive tissues such as skeletal muscle, diaphragm, adipose tissue, and mammary gland (27, 28). There is an apparent relationship between the predominance of the type II hexokinase and the insulin sensitivity of the tissue. It has been speculated that the type II isozyme is responsible for directing glucose into energy storage forms such as lipids (in adipose or mammary tissue) or glycogen (in skeletal muscle and diaphragm) (27, 28). Type III, the least studied isozyme, has not been found to be the predominant form in any tissue. The tissues which show the highest amount of activity attributable to type III hexokinase are liver, spleen, and lung (27, 29). Type IV, or glucokinase, is known to be present in the β-cells of the pancreas and in liver (30-32). Because its Km for glucose is in the range of normal blood glucose levels, its activity responds strongly to fluctuations in blood glucose. This in turn makes glucokinase a key enzyme for converting excess blood glucose into glycogen in liver and for serving as a “glucose sensor” governing the release of insulin from the pancreatic β-cells (31).

Subcellular association

The intracellular distribution of hexokinases is not homogeneous. The association of hexokinases with particulate fractions of tissue homogenates has been well documented in a number of tissues (27, 28). A large portion of hexokinase I and II has been found to be associated with mitochondria in tissues such as brain, heart, diaphragm, skeletal muscle and mammary gland (27, 33, 34). Type I hexokinase is believed to be bound to the outer mitochondrial membrane via both hydrophobic and electrostatic interactions between the N-terminal half of the enzyme and moieties on or in the membrane. The electrostatic interaction between negative charges on the hexokinase surface and, presumably, mitochondrial membrane phospholipids is bridged by divalent cations such as Mg²⁺ (35).
The hydrophobic interaction depends on a small hydrophobic N-terminal segment of the type I isozyme which targets hexokinase I to mitochondria. The importance of this hydrophobic segment has been confirmed by several studies. First, cleavage of a 9-residue peptide from the N-terminus of type I hexokinase with chymotrypsin prevented the enzyme from binding to mitochondria (36). Second, this essential N-terminal hydrophobic region of the intact enzyme is inserted into the core of the lipid bilayer when hexokinase is bound to mitochondria (37). Recently, another approach was employed to show that a chimeric construct of the first 15 amino acid residues of hexokinase I coupled to a reporter protein, chloramphenicol acetyltransferase (CAT), was able to bind to liver and hepatoma mitochondria, while native CAT was not (38). A similar study with green fluorescent protein (GFP) as the reporter protein showed that an N-terminal fragment of type I or II hexokinase was able to target the reporter protein to mitochondria (39). The outer mitochondrial membrane protein to which type I hexokinase binds was originally isolated as the “hexokinase binding protein” (HBP) (40), but is now known to be identical with the pore-forming protein (porin) (41, 42), through which molecules such as ADP and ATP flow. This association between hexokinase and porin gives the enzyme preferential access to mitochondrially generated ATP (43, 44). Together with the fact that mitochondrially bound hexokinase has slightly greater affinity for ATP and considerably less sensitivity to Glc-6-P inhibition (45-47), the binding of hexokinase to mitochondria represents a mechanism for activation of the enzyme. Hexokinase III was thought to be “soluble” and hence cytoplasmic in location (30), but recently it was found to be weakly associated with the nuclear periphery. This was demonstrated by confocal microscopy after staining the isozyme with a monoclonal antibody (29).
It is generally accepted that hepatic type IV hexokinase is located in the cytosol (48). However, Miwa et al. showed both nuclear and cytoplasmic localization of glucokinase in liver (49) and translocation of glucokinase during fasting-refeeding (50) and postnatal development (51). Both the type III and type IV isozymes lack the hydrophobic N-terminal segment critical for mitochondrial binding.

2. Regulation of Hexokinase Gene Expression and Promoters of Hexokinase Genes

Mammalian hexokinase is the key enzyme committing glucose to cellular metabolic pathways, and it is not surprising that this enzyme is under complex regulation. In the short term, hexokinase activity can be regulated in response to altered metabolic status via product (Glc-6-P) inhibition (for types I-III), antagonism (for type I) or supplementation (for types II and III) of this inhibition by Pi, substrate (glucose) inhibition (for type III) and altered interaction with mitochondria (reviewed in 28). In the long term, levels of hexokinase protein change during development, in response to changes in hormonal and nutritional status, or under chronic alteration in metabolic status. In principle, these long-term changes in the level of the enzyme could result from transcriptional regulation, posttranscriptional regulation (e.g., mRNA stability or translational rate) or posttranslational modification (e.g., phosphorylation or glycosylation) (28).

Type I hexokinase

The type I isozyme is ubiquitously expressed in mammalian tissues and appears to play a general role in mammalian glucose metabolism. Thus, type I hexokinase may be considered a “housekeeping enzyme”. The type I isozyme is found at particularly high levels in brain (52), consistent with the importance of glycolytic metabolism of glucose for sustaining a highly active energy metabolism in this tissue (53).
However, the distribution of hexokinase activity shows marked variations in different brain regions (54) and in different layers of retina (55) and cerebellum (56). Subcellular fractionation suggested that most hexokinase activity is located in the nerve endings (57). The variations in hexokinase activity in different neural structural elements, measured histochemically, are correlated with the amount of the enzyme protein itself, measured by an immunofluorescent procedure (58). This is consistent with the fact that no posttranslational modification has been found to affect the specific activity of hexokinase I (28). In situ hybridization with a hexokinase I-specific oligonucleotide probe also demonstrated extensive neuronal distribution of hexokinase I mRNA, with regional differences in the expression pattern (59). Changes in hexokinase levels during development have been particularly well studied in brain. Rat brain hexokinase levels are relatively low prenatally, and increase several fold within the first three weeks postnatally to attain the adult levels (54, 60). Developmental increases in hexokinase activity in neural tissues are correlated with the amount of the protein found in those tissues, measured by immunofluorescent staining (61, 62). Griffin et al. examined the levels of mRNA for type I hexokinase (relative to that for phosphoglycerate kinase, PGK) in developing brain and other tissues of rat (63). They found that mRNA levels for type I hexokinase in brain were relatively high during all developmental stages, compared to those in other tissues. The relative level of hexokinase I mRNA increased to a maximum at about one week postnatally before declining to adult levels by about 3-4 weeks postnatally. However, close examination of the data revealed that the mRNA levels of PGK were not constant during development, so the absolute levels of HKI mRNA actually peaked at 2 to 4 weeks postnatally.
Nevertheless, based on the lack of correlation between relative levels of mRNA and activity of hexokinase in brain and other tissues during development, it was proposed that both transcriptional and posttranscriptional regulatory processes are involved in developmental stage- and tissue-specific regulation of hexokinase I (63). Yokomori et al. reported that levels of mRNA for type I hexokinase were increased 2.5 fold in response to treatment of cultured rat thyroid FRTL5 cells with thyroid stimulating hormone (TSH) (64). This effect was due to an increase in the rate of gene transcription, as shown in nuclear run-on transcription assays, and not to a change in RNA stability. (Bu)2cAMP and forskolin had a similar effect, suggesting that TSH stimulates hexokinase gene expression via the cAMP-dependent pathway (64). Thyroid hormone is known to have a major influence on development of brain (65). Hypothyroidism delays the normally observed postnatal increase in hexokinase activity, whereas hyperthyroidism accelerates the increase (62). Using quantitative immunofluorescence techniques, the levels of type I hexokinase (66, 67) in various regions of rat brain have been correlated with previously reported basal rates of glucose utilization in these regions (68, 69), except for several regions (referred to as group II) in which hexokinase content exceeded that expected from basal glucose utilization. It is suggested that group II regions may be adapted to sustain a large range of changes in glucose utilization rates. Hexokinase activity in various brain regions has been reported to be influenced by several physiological perturbations which cause persistently altered metabolic activity, including water deprivation (70), hypertension (71, 72), streptozotocin-induced diabetes (73), and surgically-induced heart failure (74). The 5’-flanking region of the rat type I hexokinase gene has been studied in our laboratory (75, 76 and this thesis).
My early work in cooperation with Dr. White was focused on isolation of the promoter region and identification of transcriptional start sites. A genomic clone containing sequence identical to the 5' region of the cDNA for rat type I hexokinase was isolated. A 5.4-kb EcoR I fragment from this clone, containing the matching sequence, was sequenced in its entirety (75). 5'-RACE, RT-PCR and RNase protection assays indicated that there are multiple transcriptional start sites clustered in three regions at positions approximately -460, -300, and -100 relative to the translational start codon (75). These regions lack classical TATA sequences and are located in a GC-rich segment, a "CpG island" (77, 78), approximately 1 kb in length. These characteristics are frequently associated with "housekeeping genes". In this thesis, the promoter region for type I hexokinase is analyzed; important cis-elements and corresponding trans-factors regulating promoter activity are identified.

Type II hexokinase

Type II hexokinase is distributed in insulin-sensitive tissues such as skeletal muscle, diaphragm, adipose tissue, and mammary gland (27). It has long been known that the levels of type II hexokinase are regulated by insulin, with decreased type II hexokinase observed in insulin-sensitive tissues of diabetic animals (33, 79). Frank and Fromm reported that the rate of hexokinase II degradation increases by a factor of 3 in the skeletal muscle of diabetic rats as compared with that of normal animals (80). Furthermore, the relative rate of synthesis of hexokinase II is approximately 1.9 times higher in the normal than in the diabetic rat (81). Insulin treatment of diabetic animals restores the degradation and synthesis of hexokinase II to normal levels (80, 81). Printz et al. (16) reported that hexokinase II mRNA was decreased in adipose tissue from diabetic rats, but was restored by insulin treatment. Insulin also induced hexokinase II mRNA in adipose and skeletal muscle cell lines.
In one of the skeletal muscle cell lines, the increase in mRNA is accounted for by a corresponding increase in gene transcription (16).

The activity of the type II hexokinase in muscle is regulated by contractile activity. Increases in HK activity have been demonstrated in exercising muscle (82) and in muscles subjected to chronic, low-frequency stimulation (83, 84). Up to a 14-fold increase in total hexokinase activity and the hexokinase II isoform was observed in rat fast-twitch muscle after 2 weeks of chronic, low-frequency stimulation. This increase in enzyme protein content was related to an approximately 30 fold increase in protein synthesis rate (85, 86). Cessation of stimulation resulted in a normalization of hexokinase activity associated with a decreased rate of synthesis of type II hexokinase (86). The same stimulation also evoked an immediate increase in the ratio between structure (mitochondria)-bound and free hexokinase (87), which presumably represents an early response to increased energy demand. The observed transient increase in hexokinase II content represents an additional increase in glucose phosphorylation capacity under these stimulation conditions. After prolonged stimulation (3 weeks), hexokinase II activity declined, consistent with the previously observed switch from a carbohydrate-based to a fatty-acid-based energy metabolism (87). It was observed that, under the same conditions, hexokinase II mRNA was elevated significantly after one hour of stimulation and 30-fold after 12 hours, and the rate of HKII protein synthesis increased 20-fold after 24 hours (88). Another group also reported an increase in the levels of mRNA for type II hexokinase after a single brief period of exercise (89). Those experiments suggest that the increase in hexokinase II activity in response to contractile activity is at least partly at the transcriptional level.
The 5'-flanking region of hexokinase II has been isolated from a rat liver genomic library and analyzed in the rat skeletal muscle cell line, L6. The rate of hexokinase II gene transcription in L6 cells is increased by insulin, catecholamine, and cAMP, resulting in increased hexokinase II mRNA, protein synthesis, and glucose phosphorylation (16, 90, 91). The 5' untranslated region of hexokinase II mRNA is 462 bp long. The basal promoter consists of about 160 base pairs of 5'-flanking sequence that includes a classical TATA box, an inverted CCAAT box (referred to as a Y box), a CCAAT box, and a cAMP response element (CRE). The CCAAT box and the CRE are both involved in cAMP responsiveness. The Y box contributes to basal promoter activity. Several known transcription factors bind to these sequences, notably CREB and ATF-1 to the CRE and NF-Y to both the Y and the CCAAT boxes (91).

Tumor cells exhibit increased glycolytic rates, and increased levels of key enzymes like hexokinase. The fast growing tumor cells have increased hexokinase activity (92) and a higher percentage of hexokinase associated with mitochondria (93). It has been shown that various tumor cell lines have relatively high levels of mRNAs, particularly for the type II isozyme (7, 94, 95). The 4.3 kb proximal promoter region of the hexokinase II gene has been isolated and characterized in a rapidly growing hepatoma cell line, AS-30D (96). The DNA sequence of this promoter region is the same as that of the hexokinase II promoter in normal rat liver cells (91). However, in the AS-30D cells transfected with promoter-reporter construct, the promoter activity was enhanced by glucose, phorbol 12-myristate 13-acetate (a phorbol ester), insulin, cAMP, and glucagon, whereas these same agents produced little or no effect on promoter activity in transfected hepatocytes (96).
The differences in the transcriptional regulation of HKII between normal cells and tumor cells suggested that transcription of the type II tumor gene may occur independent of metabolic state. The HKII promoter also contains two functional p53 response elements (96a). The highly abundant mutant form of p53, which lacks the ability to suppress or control cell cycle progression as wild type p53 does, activated the HKII promoter, thus providing a linkage between the loss of cell cycle control and the high glycolytic rate in fast growing cancer cells (96a).

Type IV hexokinase (Glucokinase, GK)

Type IV hexokinase, or glucokinase, is expressed in the β-cells of the pancreas and in liver, and is important in glucose metabolism and homeostasis (30-32). In liver, glucokinase is involved in the utilization of excess circulatory glucose. The levels of glucokinase activity in rat liver vary with the nutritional status of the animal. Hepatic glucokinase activity falls during fasting and is restored by glucose refeeding (97, 98). The glucokinase mRNA is undetectable in liver from rats fasted for 24-72 h. Oral glucose administration causes a rapid, massive and transient accumulation of the hepatic glucokinase mRNA (9). The hepatic glucokinase is also up-regulated by insulin. Both enzyme protein and enzyme mRNA are absent from the livers of streptozotocin-induced diabetic rats. Insulin treatment causes a prompt transient build-up of mRNA and enzyme, resulting from a burst in the transcriptional activity of the glucokinase gene, as evidenced by run-on assays with isolated nuclei from liver (99) as well as from primary culture of rat hepatocytes (100). On the other hand, cyclic AMP exerts dominant negative control over glucokinase gene expression. Glucagon or derivatives of cAMP have suppressor effects on induction by insulin; this effect is primarily at the transcriptional level (100).
In contrast, the major function of glucokinase in β-cells is to "sense" the circulating glucose level and to allow flux through glycolysis which controls the synthesis and secretion of insulin. Levels of islet glucokinase mRNA and protein are relatively constant during the fasting-refeeding cycle (101). Hormones like insulin do not regulate glucokinase mRNA or protein in islet cells (92). The major determinant of glucokinase expression in islets is glucose (102). Transfection experiments suggested that glucose phosphorylation by glucokinase is the rate limiting step of glucose catabolism in β-cells and the key step for activation of the insulin promoter by glucose. The glycolytic intermediates between fructose 1,6-diphosphate and phosphoenolpyruvate are essential for β-cell glucose sensing (103).

Recent reports on the differences in glucokinase gene products in liver and pancreatic β-cells provide a mechanism for tissue-specific regulation of this enzyme (101, 104; reviewed in 48, 105). Although glucokinase proteins in liver and pancreatic β-cells display similar kinetic properties, they actually arise from alternative splicing of a single gene. The first exon encoding the 5' end of the hepatic mRNA is contiguous to the body of the structural gene. But the first exon for the 5' end of the insulinoma mRNA is more than 12 kb further upstream. The tissue specific splicing specifies not only the difference in the 5' untranslated region of the islet and liver mRNA, but also their initial 15 amino acids (104). Also, the usage of alternative promoters presumably allows different regulation of glucokinase expression in the two tissues, e.g., insulin regulation in liver and glucose regulation in β-cells.

The hepatic glucokinase promoter, or downstream promoter, has been studied in primary culture of rat hepatocytes (106).
The sequence between -123 and -34 (relative to the transcription start site) is the minimal promoter driving reporter expression in hepatocytes, as well as in insulinoma and in hepatoma cells, which do not express the endogenous glucokinase gene. The fragment between -1003 and -707, however, is a hepatocyte-specific enhancer, which stimulates reporter expression in hepatocytes when linked to the SV40 promoter or the glucokinase promoter regardless of orientation or position. The same sequence is a silencer in hepatoma and is neutral in insulinoma cells (106).

The β-cell glucokinase promoter, or upstream promoter, has no TATA box. Thus, transcription initiates over a region of 62 bases (104). Multiple cis-elements in the region between -280 and -1 (with the most proximal initiation sites designated as +1) contribute to transcription in insulinoma cells (107). The first element comprises three binding sites with a consensus sequence of CAT(T/C)A(C/G), designated as upstream promoter elements (UPEs). The factor binding to these sites is expressed preferentially in pancreatic islet β-cells and is 50 kDa in size. The same factor also binds to similar elements, termed CT boxes, in the insulin promoter, suggesting a common control mechanism for pancreatic islet β-cell-specific gene expression (107). The second element comprises two copies of a pair of perfect palindromic repeats separated by a single base, TGGTCACCA, that have been termed Pal1 and Pal2 (107). A factor specific to neuroendocrine (NE) cell types, including the pancreatic β-cell and pituitary corticotrope, binds to Pal elements, in addition to many other factors. The presence of the NE-specific factor in certain NE cell lines correlates with transcription of GK promoter-reporter constructs, suggesting a key role of this factor in determining NE-specific expression of GK (108).

Type III hexokinase

The regulation of expression of the gene for the type III isozyme has not been studied.

3. Sp1 Family of Transcription Factors

Sp1

Sp1 (Specificity protein 1) was first identified as a factor required for the efficient transcription of the SV40 early promoter (109, reviewed in 110). In the SV40 promoter, Sp1 binds to proximal promoter elements, GC boxes, and contributes to the basal promoter activity. Alternatively, Sp1 can bind to a distal site (an enhancer) to activate gene expression. Moreover, when combined, the distal and proximal GC boxes act synergistically to give a strong Sp1 response through cooperative protein-protein interactions between Sp1 proteins, a phenomenon called superactivation (111). However, Sp1 molecules bound to adjacent sites are apparently unable to make such favorable protein-protein contacts. Sp1 is unable to bind simultaneously to two adjacent sites if the center-to-center distance between the two sites is less than 10 bp (110).

Sp1 contains three Cys-2His-2 zinc finger motifs at the carboxy-terminal end, serving as DNA binding domains. The activation domain is comprised of alternating serine/threonine-rich and glutamine-rich regions that constitute much of the amino-terminal two-thirds of the protein (112, 113). The two domains can be functionally uncoupled. The binding domain binds to DNA even when the activation domain is deleted (114). The activation domain, when attached to the transcriptionally inactive GAL-4 DNA binding domain, activates promoters containing GAL-4 binding sites. Even the fingerless Sp1, which lacks the DNA binding domain, could act with native Sp1 to reach superactivation, suggesting that superactivation is a result of protein-protein interaction between Sp1 factors (111). Sp1 also interacts and synergizes with other transcription factors, such as C/EBP beta, to activate gene expression (115).

Sp1 activates transcription by contacting components of the basal transcription machinery.
For example, a glutamine-rich hydrophobic patch in Sp1 contacts dTAFII110, a component of the Drosophila TFIID complex, and mediates transcription activation (116). A study with synthetic promoters containing TATA and/or Inr (initiator) showed that the Sp1 activation domain stimulates Inr-containing and TATA-containing core promoters equally well, while the VP16 activation domain activates the TATA-containing core promoter only (117). The lack of preference for TATA-containing core promoters might explain the frequent involvement of Sp1 in activation of TATA-less housekeeping genes.

Two posttranslational modifications, glycosylation and phosphorylation, regulate activity of Sp1 factor. A recent report showed that in cell culture under glucose deprivation, Sp1 protein becomes hypoglycosylated and more susceptible to proteasome degradation. This process could potentially reduce general transcription under conditions of inadequate nutrients (118).

Sp1 is a preferred substrate for a double-stranded DNA-dependent protein kinase. Infection of cells with SV40 virus results in a significant increase in the extent of Sp1 phosphorylation (119). A recent study showed the importance of Sp1 phosphorylation in the activation of HIV transcription, induced by okadaic acid (OKA), a selective inhibitor of the serine-threonine phosphatase (120). Another study showed that glucose-induced Sp1 dephosphorylation resulted in enhanced binding of Sp1 to promoter II of the acetyl-CoA carboxylase gene and transcriptional activation of this gene (121). The regulation of Sp1 by phosphorylation makes Sp1 a potential linkage between signal transduction pathways and transcriptional regulation of gene expression (110).

Sp3 and Sp4

Other members of the Sp1 family of transcription factors have been recently discovered. Sp2 and Sp3 have been cloned by screening a human HUT 78 (0113 T cells) cDNA library using the Sp1 zinc finger domain as a probe (122).
Sp4 and Sp3 (designated as SPR-1 and SPR-2 originally) have been cloned by screening an Ishikawa (a human endometrial cell line) cDNA expression library for proteins that bind to an Sp1 recognition site (123). Sp1, Sp3 and Sp4 are closely related members of a gene family encoding proteins with very similar structural features, including a zinc finger-containing DNA binding domain as well as glutamine- and serine/threonine-rich stretches. They bind to GC boxes and GT boxes with comparable specificity and affinity (123).

Sp1 is ubiquitously expressed, but expression levels vary considerably; how expression of Sp1 is regulated has not been determined (110). Sp3 is also ubiquitously expressed in various cell lines and organs, with the relative mRNA amount varying moderately. Sp4 transcripts, on the other hand, are abundant in brain and barely detectable in other organs (123).

In contrast to the structural similarities, Sp3 and Sp4 are functionally quite different from Sp1. Sp3 generally plays a repressive role in transcriptional regulation. It is suggested that Sp3 inhibits transcription by competing with Sp1 for binding sites in various viral and cellular promoters (124-126). Sp3 also acts as an activator on some promoters in some cell lines, depending on the presence of other factors in the same cells (127). A recent study showed that the glutamine-rich domain of Sp3 alone activates transcription; in intact Sp3 protein, an inhibitory domain can silence the glutamine-rich activation domain and completely suppress transcriptional activation (128). Sp4 is an activator like Sp1 (125), but is unable to act synergistically through multiple binding sites. However, Sp4-mediated activation can be enhanced in the presence of fingerless Sp1, suggesting a direct interaction between Sp1 and Sp4 (129).

Recently, other transcription factors have also been identified as GT/GC box binding proteins.
BTEB was isolated from rat liver by binding to a GC box found in the P-4501A1 gene promoter and was capable of activating other GC box-containing promoters (130). BTEB2 was isolated from human placenta by a similar procedure and had a similar activation function (131). These factors bind to the same DNA elements as the Sp family of transcription factors, and potentially make the regulation through these cis-elements more complex.

Roles of Sp1 in transcriptional regulation

It is now known that Sp1 binds to a 9-bp consensus recognition sequence, (G/T)(G/A)GGC(G/T)(G/A)(G/A)(G/T), in promoters of many viral and cellular genes; many of them are TATA-less housekeeping genes (110). The roles of Sp1 in transcriptional regulation are diverse.

Sp1 activates transcription of many viral genes, such as the SV40 early gene (109). It mediates the activation of the hepatitis B virus pregenomic promoter by the retinoblastoma susceptibility gene product (Rb) (132). In addition, Sp1 activates expression of many cellular genes by interacting with other transcription factors. For example, a distance-dependent cooperative interaction between transcription factors Sp1 and Oct-1 is critical for full activity of the human U2 snRNA gene promoter (133).

Sp1 has been shown to be important in maintaining the basal, constitutive expression of many housekeeping genes, such as the endothelial prostaglandin H synthase-1 gene (134). On the other hand, it is an essential element in directing tissue-specific expression of human CD14 in monocytes (135), as well as human insulin-like growth factor II in adult liver (136). It is also critical in start site selection for the TATA-less human Ha-ras promoter (137). Another recent study showed that a developmental activation of an episomic hsp70 gene promoter in two-cell mouse embryos is mediated by Sp1 (138).
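As an illustration of how the degenerate 9-bp recognition sequence quoted above can be used to scan a DNA sequence, the following is a minimal sketch. The function names are hypothetical and introduced only for this example; the pattern is simply the consensus as written in the text translated into a regular expression, and a match indicates only sequence similarity, not demonstrated protein binding.

```python
import re

# Consensus as written in the text: (G/T)(G/A)GGC(G/T)(G/A)(G/A)(G/T).
# A lookahead with a capture group is used so that overlapping sites are all reported.
SP1_SITE = re.compile(r"(?=([GT][GA]GGC[GT][GA][GA][GT]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sp1_sites(seq):
    """Scan both strands; return (strand, 0-based start on that strand, 9-mer)."""
    hits = []
    for strand, text in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in SP1_SITE.finditer(text):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Example input: the Sp1 consensus oligonucleotide listed in Materials and Methods.
print(find_sp1_sites("ATTCGATCGGGGCGGGGCGAGC"))
```

Both strands are scanned because a GC box may be read from either strand; start positions for "-" hits are given on the reverse-complemented sequence.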
Roles of multiple Sp factors in transcriptional regulation

Since the discovery of other members of the Sp family of transcription factors, regulation of many promoters has been shown to involve multiple members bound to Sp1 binding sites. One example is the U5 repressive element of the long terminal repeat of human T cell leukemia virus type I, with an Sp1 binding core CACCC motif (139). Another report showed that expression of the SIS/PDGF-B gene in human osteosarcoma cells, U2-OS, requires both Sp1 and Sp3. Cotransfection of U2-OS cells with Sp expression plasmids and PDGF-B promoter/reporter constructs demonstrated that Sp1 and Sp3 can independently and additively activate the PDGF-B promoter (140). In a third study, transcription from the uteroglobin promoter was shown to be controlled by Sp1 and Sp3 through a non-classical Sp binding site. Gene transfer experiments into Drosophila SL2 cells that do not contain endogenous Sp factors revealed that expressed Sp1 activates the uteroglobin promoter, while Sp3 suppresses the activation by Sp1 (141). Finally, the pyruvate kinase M gene promoter is activated by expressed Sp1 in Drosophila SL2 cells. Sp3 has a synergistic effect on this Sp1 activation (142).

4. Thesis Overview

The goal of this thesis work was to identify the cis-elements and trans-factors governing the transcriptional regulation of the hexokinase I gene. This is the first step toward the understanding of transcriptional regulation of hexokinase I in different tissues, in response to hormone changes, such as that of cAMP, and ultimately, in response to changes in energy demand. Identification and characterization of cis-elements and trans-factors is the initial step that provides valuable information about how the promoter activity is maintained and regulated.

In this thesis, the PC12 cell line is used to study the promoter of the hexokinase I gene.
These cells are frequently used as a neuronal model (143) and, as with brain itself (144, 52), express a high level of the type I isozyme (145). Various techniques were employed, including transfection of PC12 cells with nested deletions of the 5'-flanking region of the hexokinase I gene linked to a luciferase reporter gene, footprinting and gel-shift experiments, and mutagenesis of important cis-element sequences. Most of the work in this thesis has been published in Archives of Biochemistry and Biophysics (75, 76).

CHAPTER II
MATERIALS AND METHODS

1. General methods

Standard methods (146) were used for routine procedures of molecular cloning. DNA sequencing was done by the method of Sanger (147) using Sequenase v.2.0 kits from U.S. Biochemicals. PCR methods were as described by Innis et al. (148). Oligonucleotides for PCR and electrophoretic mobility shift (gel shift) experiments were synthesized in the Macromolecular Structure Facility, Michigan State University. Sp1 and Ap2 consensus oligonucleotides were from Promega, with sequences of ATTCGATCGGGGCGGGGCGAGC and GATCGAACTGACCGCCCGCGGCCCGT, respectively. Protein was determined with the BCA Protein Assay reagent and bovine serum albumin standard (Pierce Chemical Co.). Human Sp1 expression vector was kindly provided by Dr. R. Tjian from Howard Hughes Medical Institute, Department of Molecular and Cell Biology, University of California, Berkeley; rat Sp3 and Sp4 expression vectors were provided by Dr. G. Suske from Institut für Molekularbiologie und Tumorforschung, Philipps-Universität, Marburg, Germany.

2. Generation of promoter test constructs

Promoter test constructs were prepared in the vector pGL2-basic (Promega), which carries the coding sequence for firefly luciferase.
Various fragments from the previously described (75) genomic clone, pBS39.5, were inserted into pGL2-basic, as described below:

The constructs pGL2 SS+ (inserted in correct orientation) and pGL2 SS- (inserted in reverse orientation) were obtained by subcloning a 572-bp Sac I fragment (positions -742 to -171, relative to translational start codon, ATG, where A = +1) into the Sac I site of pGL2-basic. pGL2 SM379 was constructed by subcloning SM fragment (position -742 to -516) into pGL2-basic. Another luciferase reporter construct, designated pGL2 HKII H26, contained 2.6 kb of sequence upstream from the transcriptional start site and included the promoter region of rat type II hexokinase; this was prepared by cloning a 2.9 kb Hind III fragment from the previously described genomic clone 5G3A (149) for type II hexokinase into similarly digested pGL2-basic.

The first set of luciferase reporter constructs was generated as follows. An Eco RI-Sac I fragment (-3366 to -171) from pBS39.5 was inserted into pGL2-basic to give the reporter construct designated pGL2 ES. A series of 5' deletions were then generated from pGL2 ES by conventional restriction digestion and religation. These constructs contained only the upstream transcriptional start sites (approximately -460 and -300) identified in the previous study (75).

A second set of constructs, with 3' end at -1 and thus including the downstream transcriptional start sites (75) at approximately -100, were generated from the set of constructs described above. A PCR fragment corresponding to sequence from an Mlu I site at -364 to position -1, with a Hind III site included in the downstream primer, was used to replace the Mlu I to Sac I region in the parent constructs.

All plasmids were purified using Qiagen plasmid kits, and checked for purity by agarose gel electrophoresis.

3. Transfection of PC12 and H9c2 cells with reporter constructs

Rat pheochromocytoma PC12 cells were cultured, as described by Tischler and Green (143), in RPMI medium (HyClone Laboratories) containing 10% horse serum (HyClone Laboratories), 5% Cosmic calf serum (HyClone Laboratories) and antibiotics (Sigma). Rat myoblast H9c2 cells (150) were grown in Dulbecco's modified Eagle's medium (DMEM, HyClone Laboratories) containing 10% Cosmic calf serum and antibiotics. Nearly confluent cultures of PC12 cells and H9c2 cells were replated at 1:3 dilution onto collagen coated and regular 6-well tissue culture plates, respectively. Twenty four hours later, at which time cells had attained 50-60% confluency, transfection was performed.

Initially, PC12 cells were transfected by incubation for 18 hours with 1.5 µg of test construct DNA plus 5 µg Lipofectin (Gibco-BRL), using the protocol described by the manufacturer. The medium was then exchanged for fresh medium and extracts were prepared after a further 48 h incubation. Cells were washed twice with phosphate-buffered saline (0.15 M NaCl, 0.015 M sodium phosphate, pH 7.4) and lysed in 200 µl Lysis Buffer (Promega). The plates were placed at -80°C for 30 min, then at 23°C for 15 min before collecting the lysate. Lysates were centrifuged at 15,000 x g for 5 min at 4°C, and supernatants were either assayed immediately or frozen (-80°C) for later assay. In these experiments, luciferase activities were expressed on a per milligram protein basis.

In later experiments, cells were co-transfected with a control vector, pCMV β-gal (Clontech), from which β-galactosidase expression is driven by the CMV promoter and enhancer. Transfection conditions were as described above except that PC12 cells were transfected with 8 µg Lipofectin, 1.5 µg test construct DNA, and 0.3 µg pCMV β-gal DNA, while H9c2 cells were transfected with 10 µg Lipofectin, 2 µg test construct DNA, and 0.4 µg pCMV β-gal DNA.
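Co-transfecting a constitutively expressed β-galactosidase plasmid provides an internal control for transfection efficiency: dividing the luciferase reading by the β-galactosidase reading from the same extract corrects for well-to-well differences in DNA uptake. The sketch below illustrates this arithmetic only; the function names and all readings are hypothetical and are not values from this work.

```python
def normalized_activity(luciferase, beta_gal):
    """Luciferase activity per unit of co-transfected beta-galactosidase activity."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luciferase / beta_gal


def percent_of_reference(sample, reference):
    """Express a normalized activity as a percentage of a reference construct."""
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * sample / reference


# Hypothetical readings (arbitrary units) from the same extracts.
test_construct = normalized_activity(luciferase=1200.0, beta_gal=0.5)
reference_construct = normalized_activity(luciferase=9600.0, beta_gal=2.0)
print(percent_of_reference(test_construct, reference_construct))
```

The same second step applies whether the reference is a control vector or the wild type construct against which a mutant is compared.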
Luciferase activities were then expressed relative to β-galactosidase activities.

4. Reporter assay

Luciferase activities (arbitrary light units) in cell extract were assayed with the reagent kit and protocol from Promega, using a Turner Model 20 Luminometer. β-galactosidase activities in the same extract were determined using the assay kit from Promega and following the manufacturer's instructions for the microtiter plate format. Luciferase activities were then normalized to β-galactosidase activities, and were expressed as a percentage of the normalized luciferase activity seen with the pGL3-control vector (Promega), in which luciferase expression is driven by the SV40 promoter. For mutant constructs, the activities were expressed as a percentage of that of the corresponding wild type construct.

5. Non-radioactive labeling of RNA probe and RNase protection assay

To synthesize an RNA probe corresponding to the 3' UTR region of HKI mRNA, a BamH I/Pst I fragment (2855 to 3160) of HKI cDNA was subcloned into pBluescript II SK+ vector digested with the same enzymes. After digestion with BamH I, a 369 nt long anti-sense RNA probe was synthesized with the MAXIscript In Vitro Transcription Kit (from Ambion), using T7 RNA polymerase. This probe was labeled with dig-UTP (from BMB) in a reaction containing dig-UTP and normal UTP at the ratio of 1256. Full length probe was excised and eluted after electrophoresis in a 6% denaturing polyacrylamide gel.

RNase protection assays were done with RPA-II kits (from Ambion), following the manufacturer's instructions, except that 10 units of RNase One (from Promega) was used to digest RNA at 37°C for 60 min. Protected probes (306 nt) were then separated in a 6% denaturing gel and transferred onto Amersham's Hybond-N membrane, using a Bio-Rad Trans-blot SD Semi-Dry Transfer Cell. Signals on the membrane were then detected with the Genius System for Filter Hybridization (from BMB), following the Chemiluminescent Detection protocol.
The membrane was then exposed to X-ray film.

6. Preparation of nuclear extracts from PC12 cells

Preparation of nuclear extracts followed the procedure of Ausubel et al. (151). Specifically, PC12 cells from confluent cultures were collected by centrifugation, washed with PBS, and suspended in hypotonic buffer containing 10 mM Hepes, 1.5 mM MgCl2, 20 mM KCl, 0.2 mM EDTA, 0.2 mM PMSF, and 0.5 mM DTT, pH 7.9. After incubation for 10 min on ice, the swollen cells were homogenized with a glass Dounce homogenizer and nuclei pelleted by centrifugation at 3300 x g for 15 min. Nuclei were resuspended in low salt buffer (20 mM Hepes, 25% (v/v) glycerol, 1.5 mM MgCl2, 20 mM KCl, 0.2 mM EDTA, 0.2 mM PMSF, and 0.5 mM DTT, pH 7.9). Addition of an equal volume of "high salt" buffer, which had the same composition as above except with 1.2 M KCl instead of 20 mM KCl, resulted in release of soluble protein from nuclei. After centrifugation at 25,000 x g for 30 min, the supernatant was dialyzed against 20 mM Hepes, 20% (v/v) glycerol, 100 mM KCl, 0.2 mM EDTA, 0.2 mM PMSF, and 0.5 mM DTT, pH 7.9. The nuclear extract was divided into aliquots and stored in liquid nitrogen.

7. Electrophoretic mobility shift ("gel shift") and "supershift" experiments

Nuclear extract (20 µg protein) was incubated at room temperature for 30 min with 10-50 x 10^3 cpm of the 32P-labeled DNA fragment, and 1.5 µg poly(dI-dC) as nonspecific competitor, in a buffer containing 10 mM Tris-HCl (pH 7.5), 50 mM NaCl, 2.5 mM MgCl2, 0.5 mM DTT, 4% (v/v) glycerol and 0.05% (v/v) NP-40; the total volume was 15 µl. Samples were then applied to a 5% native acrylamide gel and electrophoresis run at 150 V. For supershift experiments, 2 µl anti-Sp1, anti-Sp3, or anti-Sp4 antibody (Santa Cruz Biotechnology) was added after the initial incubation, and incubation continued for another 30 min before loading onto the gel. Each anti-Sp antibody is specific and does not cross-react with other members in the Sp family.
Gels were dried and exposed to Kodak Biomax MR film or analyzed on a phosphorimager (Molecular Dynamics Model 4008).

8. DNase I footprinting

DNase I footprinting was done with the SureTrack Footprinting Kit (Pharmacia), following the protocol provided by the manufacturer. A Sac I-Mlu I fragment (-742 to -364), labeled with Klenow fragment (146) to a specific activity of approximately 200 cpm/bp, was incubated at room temperature with indicated amounts of PC12 extract and poly(dI-dC) in the same buffer used for electrophoretic mobility shift assays (see above). After 30 min incubation, DNase I was added; after a further one min incubation, digestion was stopped by addition of the "DNase stop solution". Proteinase K (2 µg) was added, followed by incubation at 42°C for 30 min. Samples were then extracted with phenol-chloroform, DNA collected by precipitation with ethanol, and samples loaded onto a 5% sequencing gel. The same DNA fragment, treated with formic acid and piperidine to cleave at G and A residues, was loaded on this same gel to provide a sequencing ladder.

9. Mutation of promoter reporter constructs

Deletion or substitution mutations were made by the two step PCR method of Higuchi (148), using a Sac I fragment (-742 to -171) cloned in the pGL2-basic vector as template. Description of the exact nature of these modified constructs will become relevant only after presentation of results below. However, primers used for their generation are given in Table 1. Outside primer JW86 (equivalent to the GLprimer1 sequencing primer from Promega) is located 10 nt upstream from the multiple cloning site in the pGL2-basic vector, while outside primer JW93 (equivalent to the GLprimer2 sequencing primer from Promega) corresponds to sequence just downstream from the multiple cloning site. Outside primer JW63 corresponds to hexokinase sequence at positions -361 to -332.
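Because positions are numbered relative to the A of the ATG start codon (A = +1), fragment and primer lengths follow directly from their endpoint coordinates. The sketch below shows this arithmetic; the function name is hypothetical, and it assumes the usual convention that there is no position 0 (the base immediately upstream of +1 is -1), which matters only for spans that cross the start codon.

```python
def fragment_length(start, end):
    """Inclusive length of a span given ATG-relative coordinates (A = +1).

    Assumes there is no position 0: the base preceding +1 is numbered -1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not be downstream of end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # correct for the missing position 0
    return length


print(fragment_length(-742, -171))  # Sac I fragment
print(fragment_length(-742, -364))  # Sac I-Mlu I fragment used for footprinting
print(fragment_length(-361, -332))  # outside primer JW63
```

The first value agrees with the 572-bp size stated for the Sac I fragment, which is a convenient check that coordinates are being read as inclusive.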
Underlined regions in the sequences of the inside primers correspond to hexokinase sequence located at positions appropriate for generating the specific mutations or deletions described below. The position of deleted sequence generated with primers JW115 and JW116 is indicated by double slashes within the primer sequences. Mutated sequences are shown in italics. All mutations and deletions were confirmed by direct sequencing of the entire region corresponding to the PCR fragment.

10. Hexokinase assay and western blot

PC12 cells were transfected with expression plasmids for human Sp1, rat Sp3, or rat Sp4, in which the cDNA of the Sp transcription factor is driven by the CMV promoter. Transfection procedures were the same as described for luciferase reporter constructs, except that 2 µg of expression plasmid was used in each transfection. Cells were lysed 48 hours after transfection in 200 µl (per well) of ice-cold BTGE buffer (50 mM Bicine, 10 mM thioglycerol, 10 mM Glc, 0.5 mM EDTA, 0.1% (v/v) Triton X-100, pH 8.2) and subjected to two freeze-thaw cycles. The lysate was then centrifuged at 800 x g for 5 min at 4°C. Hexokinase activity in the supernatant was determined immediately, and the supernatant was then stored at -80°C for future use. Hexokinase activity was determined spectrophotometrically as described previously (152). Protein was determined with the BCA Protein Assay system. Samples were pretreated with excess iodoacetamide (153) to avoid interference by the thioglycerol present in the BTGE buffer.

For western blot experiments, 100 µg of extract was loaded onto an 8% SDS acrylamide gel, then transferred onto a nitrocellulose membrane. The Sp transcription factors were detected with specific anti-Sp antibodies (at 1:3000 dilution, Santa Cruz Biotechnology), which do not cross-react with other members of the Sp family. The secondary antibody was goat anti-rabbit IgG conjugated to horseradish peroxidase (from Bio-Rad), and was detected with SuperSignal Chemiluminescent Substrate (from Pierce).
The membrane was then either exposed to X-ray film or quantitatively analyzed on a Molecular Imager System GS-505 (Bio-Rad). Protein contents of the samples were assessed by densitometric analysis of Coomassie Blue stained gels and used to adjust the expression levels calculated from western blots.

CHAPTER III

RESULTS

1. Promoter activity associated with Sac I fragment in HKI upstream region

Figure 1 shows a schematic representation of a partial restriction map of the hexokinase I upstream region (-3366 to +77) relevant to this work. As previously identified, transcriptional start sites are clustered in three regions located at approximately -460, -300 and -100, with the A of the ATG start codon designated as +1 (75). The complete DNA sequence and restriction maps of the same region are shown in Appendices A-D. A 572-bp Sac I fragment (nt -742 to -171) was inserted into the multiple cloning site upstream from the luciferase coding region of the reporter vector pGL2-basic; plasmids with the insert in either the correct (pSS+) or reverse (pSS-) orientation were prepared. The pSS+ plasmid contains the transcriptional start sites located near the -460 and -300 positions. Also constructed was pSM, containing only the upstream transcriptional start site (nt -737 to -359). The various reporter constructs, or the unmodified pGL2-basic vector, were used to transfect PC12 cells. These cells express a high level of the type I isozyme (145). Thus we anticipated that the promoter for type I hexokinase would be quite effective in PC12 cells and indeed, luciferase expression driven by the pSM and pSS+ constructs was well above that seen with the promoter-less pGL2-basic vector (Table 3). Reversing the orientation of the inserted promoter region (pSS-) resulted in a marked reduction in expression of luciferase activity.
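The "fold increase" values reported in Table 3 appear to be each construct's mean luciferase activity divided by the mean for the promoter-less pGL2-basic vector, quoted to two significant figures. A minimal Python sketch of that arithmetic follows. The function and variable names are illustrative only, and the two-significant-figure rounding is an assumption inferred from the tabulated values, not a procedure stated in the text.

```python
import math

# Mean luciferase activities (units/mg protein) as listed in Table 3.
mean_activity = {
    "pGL2-basic": 4.3,
    "pGL2 SS+": 1100.0,
    "pGL2 SS-": 56.0,
    "pGL2 SM": 300.0,
    "pGL2 HKII H2.6": 50.0,
}


def round_sig(value, digits=2):
    """Round a positive number to the given number of significant figures."""
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


def fold_increase(activities, basal="pGL2-basic"):
    """Express each mean activity relative to the basal (promoter-less) vector."""
    basal_activity = activities[basal]
    return {name: round_sig(a / basal_activity) for name, a in activities.items()}


folds = fold_increase(mean_activity)
for construct, fold in folds.items():
    print(f"{construct}: {fold:g}-fold")
```

Under these assumptions the sketch reproduces the tabulated fold increases (1, 260, 13, 70 and 12), which suggests the column is a simple ratio of means with no further normalization.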
The previous experiment (75) indicated that PC12 cells favored the upstream (-460) start sites; hence, deletion of the downstream (-300) transcriptional start site, in construct pSM, might have been expected to have only a marginal effect on expression. Thus, the substantial decrease (relative to pSS+) in expression from pSM was somewhat surprising. Nonetheless, expression of luciferase activity remained 70-fold above the basal level.

Figure 1. Partial restriction map and schematic organization of HKI promoter region. Genomic DNA of the hexokinase I gene from the Eco RI site (-3366) to the Nar I site (+77) is shown as a horizontal line. The positions of the restriction sites related to this work are represented by the vertical lines. The three vertical bars represent transcriptional start points (TSP) clustered at -460, -300 and -100. The horizontal bar represents the first exon (+1 to +66). Refer to Appendices B, C and D for the complete DNA sequence and restriction map.

[Fig. 1: restriction map. Overview (in kb): R -3.4, Pv -2.5, P -2.1, K -1.8, P -1.0, S -0.7. Detail (in bp): Sac I -742, Bst XI -520, Mlu I -359, Pst I -248, Sac I -171, Nar I +77; TSP at -460, -300 and -100. R = Eco RI, Pv = Pvu II, P = Pst I, K = Kpn I, S = Sac I, B = Bst XI, M = Mlu I, N = Nar I.]

Table 3. Expression of luciferase in PC12 cells transfected with reporter constructs

Construct         Sequence in HKI upstream region   Luciferase activity (a)   Fold increase
pGL2-basic        ---                               4.3 ± 1.2                 1
pGL2 SS+          -742 to -171                      1100 ± 96                 260
pGL2 SS-          -742 to -171 (inverse)            56 ± 23                   13
pGL2 SM           -737 to -359                      300 ± 150                 70
pGL2 HKII H2.6    ---                               50 ± 16                   12

(a) Luciferase activity expressed as units/mg protein (mean ± SD for four transfections).

The type II isozyme of hexokinase is expressed at a low level in PC12 cells (J.E. Wilson, unpublished observation).
In agreement with this, a low level of luciferase was expressed in PC12 cells transfected with a reporter construct (pHKII H2.6) containing the promoter and transcriptional start site for type II hexokinase.

2. Sequence between -742 and -516 is important for promoter activity

Results of experiments with reporter constructs containing Type I hexokinase genomic sequence between -3366 and -171 are shown in Fig. 2. Deletion of upstream sequence between -3366 and the Sac I site at -742 had no significant effect on expression of luciferase activity in PC12 cells (Fig. 2B). However, deletion of sequence between -742 and the Bst XI site at -516 resulted in an 85% loss of promoter strength. Another significant decrease occurred after deletion of sequence between the Mlu I site at -364 and the Pst I site at -245. Very similar results were seen with these reporter constructs transfected into H9c2 cells (Fig. 2C). The one qualitative difference between results with PC12 cells and those with H9c2 cells (Fig. 2B, C) was in expression driven by the PvS construct. In H9c2 cells, there was a modest but significant decrease in expression relative to that seen with the longer and shorter constructs, ES and KS, respectively. This suggests the possible existence of positive and negative regulatory elements in sequence from -3366 to -2511 and from -2511 to -1833, respectively. If these indeed exist, they do not appear to be generally functional, since their effect is not seen with PC12 cells. This observation has not been pursued further at this time.

Figure 2. Transfection of PC12 and H9c2 cells with promoter-reporter constructs. A, schematic representation of reporter constructs. All constructs included only sequence upstream from position -171 in the 5' flanking region of the gene encoding rat Type I hexokinase (4); see text for more detailed comments.
B and C, normalized (to β-galactosidase activity) luciferase activities after transfection of PC12 (B) or H9c2 (C) cells with the indicated construct or the basal control vector, pGL2-basic; results are the mean ± SD for at least eight independent transfections, and are given as a percentage of the normalized luciferase activity expressed from the pGL3-control plasmid. An asterisk indicates that the activity of a construct is significantly different, by statistical analysis, from that of the next longer construct.

As noted previously (75), the overall expression was much lower in H9c2 cells than in PC12 cells. Thus, expression of luciferase from the longer constructs in PC12 cells was 10-15% of the pGL3-control, while in H9c2 cells these same constructs were expressed at 0.4-0.5% of the pGL3-control, a level more than 20-fold lower than that in PC12 cells. The promoter activities are consistent with the levels of HKI mRNA observed in RNase protection assays (Fig. 3). The level of HKI mRNA seen with 10 µg of PC12 total RNA (lane 5) was comparable to that seen with 10 µg of rat brain total RNA (lane 4). The level of HKI mRNA in 100 µg of H9c2 total RNA (lane 7), on the other hand, was a little lower than that in 10 µg of total RNA from either PC12 cells or rat brain, indicating that the level of HKI mRNA in H9c2 cells is at least 10-fold lower than that in PC12 cells and rat brain.

The major point emerging from these transfection experiments was the importance of sequence from -742 to -516 for promoter activity. Thus, attention was focused on this segment.

3. DNase I footprinting reveals two protected regions

DNase I footprinting analysis was done using a Sac I-Mlu I fragment containing sequence between -742 and -359.
Using nuclear extracts from PC12 cells, two protected regions were observed (Fig. 4). These were designated as the P1 box (24 nt in length, positions -552 to -529) and the P2 box (23 nt in length, positions -480 to -458). Densitometric analysis revealed that, with the maximum amount of nuclear extract used, densities in the P1 and P2 regions were reduced to approximately 10% and 25%, respectively, of that in the unprotected control (no nuclear extract).

Figure 3. RNase Protection Assay. The same amount of the dig-labeled anti-sense RNA probe, corresponding to a BamH I/Pst I fragment (2855 to 3160) in the 3'-untranslated region of HKI cDNA, was incubated with total RNA from yeast (lanes 2, 3) or from various tissues or cell lines (lanes 4-7), then subjected to RNase I digestion (lanes 3-7). Lane 1 is the RNA marker. In lane 2, one tenth of the sample was loaded onto the gel to give a signal comparable to that in lanes 3 to 7, where the entire sample was loaded in each lane.

Figure 4. DNase I footprinting analysis with a nuclear extract from PC12 cells. Protected regions in a 32P-labeled Sac I-Mlu I fragment (positions -742 to -359), designated as the P1 and P2 boxes, are noted. The location of an additional Sp1 site upstream (position -570) from the P1 box (see text) is also indicated, although no protection was noted in this region. Lane 1 is a G+A chemical sequencing reaction of the same labeled fragment to provide a sequence ladder. The sequence of the P1 box is CTTTTTCCACGCCCACTTGCGTGC, and that of the P2 box is GAGGAAGGGGTGTGGCCCCGTTC.

[Fig. 4: gel image; lanes are annotated with the amounts of nuclear extract (µg), DNase I (u) and poly(dI-dC) (µg); the positions of the Sp1 site (-570), the P1 box and the P2 box are marked.]

The P2 box is more weakly protected, includes the upstream transcriptional start sites identified in previous work (75), and does not appear to be important for promoter activity (Fig. 2).
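The box coordinates quoted above can be checked against the upstream sequence. The Python sketch below is illustrative only. It assumes the clone numbering of the appended sequence listing, in which the A of the ATG start codon (+1) falls at nucleotide 3367 (an inference from the position labels printed beside that listing), and it stores only nucleotides 2801-2910 of the listing. The constant and function names are hypothetical.

```python
# Nucleotides 2801-2910 of the HKI upstream sequence (clone numbering).
SEGMENT_START = 2801
SEGMENT = (
    "TTTCGGCTTC" "CGCTCTTTTT" "CCACGCCCAC" "TTGCGTGCTT" "CTCCAACAGT"  # 2801-2850
    "GTGGATGGGA" "GGGGTGGGGG" "ACGAGCCCTA" "ATCTCCGAGG" "AAGGGGTGTG"  # 2851-2900
    "GCCCCGTTCG"                                                      # 2901-2910
)

# Assumption: +1 (A of ATG) is nucleotide 3367, so a negative position p
# corresponds to nucleotide 3367 + p in the clone numbering.
PLUS_ONE = 3367


def extract(start, end):
    """Return the sequence between two negative positions, inclusive."""
    first = PLUS_ONE + start - SEGMENT_START  # 0-based index of 'start'
    last = PLUS_ONE + end - SEGMENT_START     # 0-based index of 'end'
    if first < 0 or last >= len(SEGMENT) or first > last:
        raise ValueError("positions fall outside the stored segment")
    return SEGMENT[first:last + 1]


p1_box = extract(-552, -529)
p2_box = extract(-480, -458)
print(len(p1_box), p1_box)
print(len(p2_box), p2_box)
```

With these assumptions, the extracted segments are 24 and 23 nt long and agree with the P1 and P2 box sequences given in the legend to Figure 4.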
In contrast, the P1 box lies within the region identified as being of major importance for promoter activity. The sequence of the P1 box, CTTTTTCCACGCCCACTTGCGTGC, includes a consensus Sp1 binding site (154) in its central region. We also noticed another potential Sp1 binding site, CCGCCCA, just upstream from the P1 box (positions -574 to -568). The possible significance of these elements for promoter activity was further examined using a series of mutated or deletion constructs (results below). Although these experiments demonstrated the functional importance of the upstream (-570) Sp1 site, we were unable to detect significant protection of this region in DNase I footprinting experiments. A search of the GCG database for consensus binding sites of known transcription factors showed that the P1 and P2 boxes contain several potential cis-elements (Table 4), notably GC-rich Sp1 and AP2 binding sites in the P1 box.

4. The P1 box and the Sp1 site at -570 are functionally important cis-elements for promoter activity but the P2 box is not

The locations of the P1 and P2 boxes, as well as the Sp1 site at -570, within the sequence of the parental pGL2SS are shown in Fig. 5. The construct pGL2SS- was identical to pGL2SS except that the insert was in the reverse orientation (the reporter constructs pGL2SS and pGL2SS- were referred to as pS572+ and pS572-, respectively, in Table 3). Several

Table 4.
Potential cis-elements in P1 box and P2 box sequences

P1 box: CTTTTTCCACGCCCACTTGCGTGC

cis-Element   Consensus sequence                   Matching sequence in P1 box (a)   Degree of matching
XRE           TTGCGTG                              CTTTTTCCACGCCCACTTGCGTGC          7/7
Sp1           (C/A)(C/T)(C/T)(C/A)GCC(C/T)(C/A)    CTTTTTCCACGCCCACTTGCGTGC          8/9
Myc           CACGTG                               CTTTTTCCACGCCCACTTGCGTGC          4/6, 5/6
AP2           CCC(A/C)N(G/C)(G/C)(G/C)             CTTTTTCCACGCCCACTTGCGTGC          7/8

P2 box: GAGGAAGGGGTGTGGCCCCGTTC

cis-Element   Consensus sequence                   Matching sequence in P2 box (a)   Degree of matching
β-globin      GGGTGTGGC                            GAGGAAGGGGTGTGGCCCCGTTC           9/9
PEA3-RS       AGGAAG                               GAGGAAGGGGTGTGGCCCCGTTC           6/6
TEF-2-GT-I    GGGTGTGG                             GAGGAAGGGGTGTGGCCCCGTTC           8/8
Pu box        GAGGAA                               GAGGAAGGGGTGTGGCCCCGTTC           6/6

(a) Bold letters in underlined sequence represent nucleotides identical to consensus sequence of known cis-element; italic letters represent mismatching nucleotides.

APPENDIX B

2651 AGGTAGTTTG GGTTTAGCAA TGTGAACTCT GACAATTTGG GATGTAGAGC -667
2701 TGGTGGGCCA TCGTGGGACG CCAAGCATCA TCCTTAGAGT TTGGATCCTT -617
2751 TAGGGCAGGC AGGCACAGGG ACCCAGTGCG AGATCAGTGA AGCCGCCCAG -567
2801 TTTCGGCTTC CGCTCTTTTT CCACGCCCAC TTGCGTGCTT CTCCAACAGT -517
2851 GTGGATGGGA GGGGTGGGGG ACGAGCCCTA ATCTCCGAGG AAGGGGTGTG -467
=>
2901 GCCCCGTTCG TGTTCTCCAG TTTGTGGCGT CCTGGATCTG TCCTCTGGTC -417
2951 CCCTCCAGAT CGTGTCCCAC ACCCACCCGT TCAGGCATGG CACTGTGCCG -367
3001 CCACGCGTGA CCGTGCGCTC CTTACGTGGG GGACGTGCAG GGTGCTGCCT -317
=>
3051 CCTTTCCGGT GCGGGAGGGA GCGGCCGTCT TTCTCCTGCT CTGGCTGGGA -267
3101 AGCCCCAGCC ATTGCGCTGC AGAGGAGACT TGCAGCCAAT GGGGACTGAG -217
=>
3151 GAAGTGGGCC GGCTGGCGGT TGTCACCCTC CCGGGGACCG GAGCTCCGAG -167
=>
3201 GTCTGGAGAG CGCAGGCAGA CGCCCGCCCC GCCCGGGGAC TGAGGGGGAG -117
3251 GAGCGAAGGG AGGAGGAGGT GGAGTCTCCG ATCTGCCGCT GGAGGACCAC -67
3301 TGCTCACCAG GGCTACTGAG GAGCCACTGG CCCCACACCT GCTTTTCCGC -17
                      +1
3351 ATCCCCCACC GTCAGCATGA TCGCCGCGCA ACTACTGGCC TATTACTTCA +34
     M I A A Q L L A Y Y F T
3401 CCGAGCTGAA GGATGACCAA GTCAAAAAGG TGAGCCCCGC CGGCGCCGCC +84
     E L K D D Q V K K

=> Represent 5' end points of HKI upstream sequence in promoter-reporter constructs.

APPENDIX C

SCHEMATIC REPRESENTATION OF RESTRICTION SITES IN HKI UPSTREAM REGION pBS39.5 (1 TO 3366)

APPENDIX D

LIST OF RESTRICTION SITES IN HKI UPSTREAM REGION pBS39.5 (1 TO 3366)

AceIII CAGCTCnnnnnnn'nnnn_ Cuts at: 0 1003 1351 1585 1746 1803 2089 2685 3366 Size: 1003 348 234 161 57 286 596 681 AciI C'CG_C Cuts at: 0 832 1163 1286 1315 2454 2523 2793 2810 Size: 832 331 123 29 1139 69 270 17 Cuts at: 2810 2998 3061 3071 3166 3224 3229 3286 3347 Size: 188 63 10 95 58 5 57 61 Cuts at: 3347 3366 Size: 19 AflIII A'CryG_T Cuts at: 0 238 314 2485 3003 3366 Size: 238 76 2171 518 363 AluI AG'CT Cuts at: 0 426 545 855 993 1341 1368 1555 1564 Size: 426 119 310 138 348 27 187 9 Cuts at: 1564 1575 1760 1793 2079 2379 2627 2699 3193 Size: 11 185 33 286 300 248 72 494 Cuts at: 3193 3366 Size: 173 AlwI GGATCnnnn'n_ Cuts at: 0 2738 2751 2942 3366 Size: 2738 13 191 424 AlwNI CAG_nnn'CTG Cuts at: 0 1148 2412 3366 Size: 1148 1264 954 ApaI G_GGCC'C Cuts at: 0 1226 2169 3366 Size: 1226 943 1197 ApoI r'AATT_y Cuts at: 0 1 814 2605 3366 Size: 1 813 1791 761 AvaI C'yCGr_G Cuts at: 0 994 1988 2613 3180 3232 3366 Size: 994 994 625
567 52 134 AvaII G'GwC_C Cutsat: 0 31 135 199 1558 1651 1961 2431 2769 Size: 31 104 64 1359 93 310 470 338 Cuts at: 2769 2947 3185 32.94 3366 Size: 178 238 109 72 AvrII C'CI‘AG_G Cuts at: 0 609 3366 Size: 609 2757 Bad GrTACnnnnGTnnnnnnnnnn_nnnnn‘ Cutsat: 0 1498 1531 2022 2055 3366 Size: 1498 33 491 33 1311 BamHI G'GATC_C Cuts at: 0 2743 3366 Size: 2743 623 Banl G'Ger_C Cutsat: 0 1462 1536 2185 3366 Size: 1462 74 649 1181 BanII G_rGCy'C Cutsat: 0 890 1226 1662 2169 2235 2629 2877 3195 Size: 890 336 436 507 66 394 248 318 Cutsat: 3195 3366 Size: 171 T‘- I‘ 111 Bbvl GCAGCnnnnnnnn'nnnn_ Cutsat: 0 636 842 1305 1802 2418 3031 3103 3144 Size: 636 206 463 497 616 613 72 41 Cuts at: 3144 3366 Size: 222 Bed CCATC Cutsat: 0 672 776 845 1313 1631 1732 2016 2083 Size: 672 104 69 468 318 101 284 67 Cuts at: 2083 2211 2709 2856 3366 Size: 128 498 147 510 Bee83I CIT GAGnnnnnnnnnnnnnn_nn' Cuts at: 0 209 1774 2643 3366 Size: 209 1565 869 723 Bcefl ACGGCnnnnnnnnnnnn‘n_ Cutsat: 0 654 837 2330 3060 3366 Size: 654 183 1493 730 306 BciVI GGATACnnnnn_n' Cutsat: 0 1593 2184 3366 Size: 1593 591 1182 BfaI C‘TA_G Cutsat: 0 127 610 1129 1365 1369 2223 2360 3366 Size: 127 483 519 236 4 854 137 1006 B111 ACTGGGnnnn'n_ Cutsat: 0 741 1393 1676 2230 2766 2790 3366 Size: 741 652 283 554 536 24 576 BglI GCCn_nnn'nGGC Cutsat: 0 957 3366 Size: 957 2409 BglII A'GATC_T Cutsat: 0 491 3366 Size: 491 2875 BmgI GkGCCC "‘2' , -l v, -I-. 
112 Cutsat: 0 932 1224 2167 2186 3366 Size: 932 292 943 19 1180 Bpll GAGnnnnnCT C Cuts at: 0 990 2008 3366 Size: 990 1018 1358 Bme CT GGAGnnnnnnnnnnnnnn_nn' Cutsat: 0 171 639 2900 2938 3224 3310 3366 Size: 171 468 2261 38 286 86 56 BpulOI CC'TnA_GC Cutsat: 0 974 1590 3366 Size: 974 616 1776 Bpu11021 GC'TnA_GC Cutsat: 0 791 2380 3366 Size: 791 1589 986 BsaI GGTCTCn'nnnn_ Cutsat: 0 899 2308 3366 Size: 899 1409 1058 BsaAI yAC'GTr Cutsat: 0 250 1876 3025 3366 Size: 250 1626 1149 341 353111 Gr'CG_yC Cutsat: 0 2718 2927 3220 3366 Size: 2718 209 293 146 85311 C'CnnG_G Cutsat: 0 194 385 402 609 656 889 952 961 Size: 194 191 17 207 47 233 63 9 Cutsat: 961 1458 1466 1502 1928 1929 2168 2169 2189 Size: 497 8 36 426 1 239 1 20 Cutsat: 2189 2190 2227 2243 2401 2885 3180 3181 3196 Size: 1 37 16 158 484 295 1 15 Cutsat: 3196 3232 3233 3307 3366 Size: 36 1 74 59 113 BsaWI w'CCGG_w Cutsat: 0 1175 3055 3187 3366 Size: 1175 1880 132 179 BsaXI ACnnnnnCl' CC Cutsat: 0 1397 1410 1923 2536 3073 3191 3270 3366 Size: 1397 13 513 613 537 118 79 96 35131 CAACAC Cutsat: 0 636 1401 1450 3366 Size: 636 765 49 1916 85061 CCCGT Cuts at: 0 2645 2904 2977 3366 Size: 2645 259 73 389 BseRI GAGGAGnnnnnnnn_nn' Cutsat: 0 212 3137 3263 3275 3278 3333 3366 Size: 212 2925 126 12 3 55 33 BsgI GTGCAGnnnnnnnnnnnnnn_nn' Cutsat: 0 2116 2332 3056 3366 Size: 2116 216 724 310 851131 CG_ry'CG Cutsat: 0 3075 3366 Size: 3075 291_ BsiHKAI G_wGCw'C Cutsat: 0 919 2629 3195 3366 Size: 919 1710 566 171 leI CCnn_nnn'nnGG Cutsat: 0 194 471 662 712 713 834 939 1022 Size: 194 277 191 50 1 121 105 83 Cutsat: 1022 1182 1577 1667 1668 1930 1934 2088 2228 Size: 160 395 90 1 262 4 154 140 Cutsat: 2228 2249 2250 2554 2577 2616 2802 2891 2923 Size: 21 1 304 23 39 186 89 32 114 Cutsat: 2923 3026 3165 3182 3187 3202 3233 3234 3366 Size: 103 139 17 5 15 31 1 132 BsmAI GTCTCn'nnnn_ Cutsat: 0 172 707 899 1271 2308 3119 3279 3366 Size: 172 535 192 372 1037 811 160 87 8511131 CGT CT Cn'nnnn_ Cuts at: 0 707 3366 Size: 707 2659 BsmFI 
GGGACnnnnnnnnnn'nnnn_ Cutsat: 0 44 212 788 1074 1135 1246 1539 1893 Size: 44 168 576 286 61 111 293 354 Cutsat: 1893 1947 2225 2417 2606 2729 2782 2882 2933 Size: 54 278 192 189 123 53 100 51 Cutsat: 2933 2949 3044 3156 3198 3250 3366 Size: 16 95 112 42 52 116 Bsp241 CCAnnnnnnGTCnnnnnnnn_nnnnn' Cuts at: 0 2947 2979 3366 Size: 2947 32 387 Bsp12861 G_dGCh'C Cutsat: 0 890 919 934 1226 1662 2169 2188 2235 Size: 890 29 15 292 436 507 19 47 Cuts at: 2235 2629 2877 3195 3366 Size: 394 248 318 171 BspGI CTGGAC Cutsat: 0 1041 1561 3366 Size: 1041 520 1805 BspLUl 11 A'CATG_T Cutsat: 0 238 314 2485 3366 Size: 238 76 2171 881 BspMI ACCT GCnnnn'nnnn_ Cutsat: 0 836 1552 2640 3346 3366 Size: 836 716 1088 706 20 115 8er ACT G_Gn' Cutsat: 0 166 188 748 1188 1389 1672 2034 2237 Size: 166 22 560 440 201 283 362 203 Cutsat: 2237 2773 2797 2917 3331 3366 Size: 536 24 120 414 35 BsrBI CCG'CT C Cutsat: 0 2454 2812 3071 3366 Size: 2454 358 259 295 13er1 GCAATG_nn' Cutsat: 0 2674 3109 3366 Size: 2674 435 257 Ber1 r'CCGG_y Cutsat: 0 2367 3158 3366 Size: 2367 791 208 BerI T'GTAC_A Cuts at: 0 2496 3366 Size: 2496 870 BstAPI GCAn_nnn'nTGC Cutsat: 0 2412 3366 Size: 2412 954 BstEII G'GTnAC_C Cuts at: 0 2363 3366 Size: 2363 1003 B500 CCAn_nnnn'nTGG Cuts at: 0 905 2850 3366 Size: 905 1945 516 BstYI r'GATC_y Cutsat: 0 491 2743 2934 3366 Size: 491 2252 191 432 Bsu36I CC’TnA_GG 116 Cuts at: 0 757 2279 3366 Size: 757 1522 1087 Cac81 GCn'nGC Cutsat: 0 409 512 687 788 991 1291 1339 1749 Size: 409 103 175 101 203 300 48 410 Cutsat: 1749 1762 2319 2757 2761 2835 3160 3164 3214 Size: 13 557 438 4 74 325 4 50 Cutsat: 3214 3224 3366 Size: 10 142 Cjel CCAnnnnnnGTnnnnnnnnn_nnnnnn' Cuts at: 0 357 391 952 986 1078 1079 1112 1113 Size: 357 34 561 34 92 1 33 1 Cutsat: 1113 1563 1597 1657 1691 2946 2980 3366 Size: 450 34 60 34 1255 34 386 CjePI CCAnnnnnnnTCnnnnnnnn_nnnnnn' Cutsat: 0 227 260 874 907 1330 1363 1368 1401 Size: 227 33 614 33 423 33 5 33 Cutsat: 1401 1487 1520 2153 2186 2200 2233 2382 2415 Size: 86 33 
633 33 14 33 149 33 Cutsat: 2415 2713 2746 2881 2914 2947 2980 3366 Size: 298 33 135 33 33 33 386 CviJIrG'Cy Cutsat: 0 14 86 391 426 463 510 514 545 Size: 14 72 305 35 37 47 4 31 Cuts at: 545 608 615 641 666 685 746 790 795 Size: 63 7 26 25 19 61 44 5 Cuts at: 795 809 847 855 888 951 960 973 989 Size: 14 38 8 33 63 9 13 16 Cutsat: 989 993 1086 1148 1187 1205 1224 1289 1306 Size: 4 93 62 39 18 19 65 17 Cutsat: 1306 1337 1341 1368 1457 1555 1564 1575 1589 Size: 31 4 27 89 98 9 11 14 Cutsat: 1589 1633 1660 1689 1751 1760 1764 1793 1923 Size: Cuts at: Size: Cuts at: Size: Cuts at: Size: Cuts at: Size: Cuts at: Size: CviRI TG'CA Cuts at: Size: Cuts at: Size: DdeI C'TnA_G Cuts at: Size: Cuts at: Size: Cuts at: Size: Cuts at: Size: DpnI GA'TC Cuts at: Size: DraI 'I'I'I‘AAA Cuts at: Size: 1923 2901 2294 2321 913 44 27 11 145 7 52 21 7 173 20 119 11 0 79 1219 79 1140 74 27 28 0 100 139 100 39 924 974 1 l 50 DraIII CAC_nnn'GTG Cuts at: 0 1095 3366 1934 2079 2106 2139 2158 2167 2221 283 194 117 29 62 9 4 29 130 2226 27 33 19 9 54 5 2226 2233 2285 2317 2330 2359 2379 2570 2599 32 13 29 20 191 29 2599 2620 2627 2699 2707 2792 2806 2875 2901 72 8 85 14 69 26 3074 3094 3102 3108 3135 3158 3162 3193 8 6 27 23 4 31 3193 3312 3323 3330 3366 7 36 1790 2133 2294 343 161 1437 1778 341 12 1293 144 2349 2415 2469 3037 3119 3132 3366 66 54 568 82 13 234 797 913 116 422 730 757 791 308 27 34 6 1590 1710 120 1168 1424 1453 1494 256 29 41 96 1710 1794 2108 2119 2279 2305 2380 2475 2733 84 314 11 160 26 75 95 258 2733 3146 3240 3316 3366 413 94 76 50 0 381 493 2745 2783 2936 2959 3281 3366 381 112 2252 38 153 23 322 85 0 1198 3366 1198 2168 118 Size: 1095 2271 DrdI GACnn_nn'nnGTC Cutsat: 0 2271 3366 Size: 2271 1095 DrdII GAACCA Cuts at: 0 1803 3366 Size: 1803 1563 Bad y'GGCC_r Cutsat: 0 639 1762 3072 3366 Size: 639 1123 1310 294 EagI C'GGCC_G Cutsat: 0 3072 3366 Size: 3072 294 Earl CT C'I‘T Cn'nnn_ Cutsat: 0 85 217 878 2333 3366 Size: 85 132 661 1455 1033 150047111 AGC'GCT Cutsat: 0 412 2394 
3366 Size: 412 1982 972 E00571 CT GAAGnnnnnnnnnnnnnn_nn' Cutsat: 0 109 1655 2411 3366 Size: 109 1546 756 955 56601091 rG'GnC_Cy Cutsat: 0 135 199 949 1222 1922 1961 2165 2166 Size: 135 64 750 273 700 39 204 1 Cutsat: 2166 2283 2431 2598 2618 2769 3366 Size: 117 148 167 20 151 597 15le G'AA'IT_C Cutsat: 0 1 3366 Size: 1 3365 EcoRII 'CCwGG_ 119 Cutsat: 0 14 193 365 402 514 530 655 889 Size: 14 179 172 37 112 16 125 234 Cutsat: 889 951 960 1037 1211 1465 1928 2168 2189 Size: 62 9 77 174 254 463 240 21 Cuts at: 2189 2226 ‘2243 2298 2400 2930 3306 3366 Size: 37 17 55 102 530 376 60 Paul CCCGCnnnn'nn_ Cutsat: 0 1308 3054 3231 3236 3366 Size: 1308 1746 177 5 130 Fnu4HI GC'n_GC Cutsat: 0 650 832 856 1287 1294 1791 2407 2793 Size: 650 182 24 431 7 497 616 386 Cutsat: 2793 2998 3045 3072 3117 3133 3286 3366 Size: 205 47 27 45 16 153 80 FokI GGATGnnnnnnnnn'nnnn_ Cutsat: 0 786 1111 1228 1323 1718 2225 2570 2703 Size: 786 325 117 95 395 507 345 133 Cutsat: 2703 2715 2866 3336 3366 Size: 12 151 470 30 FspI TGC'GCA Cuts at: 0 859 3366 Size: 859 2507 GdiII C'GGCC_1’ Cuts at: 0 639 1762 3072 3366 Size: 639 1123 1310 294 HaeI wGG'CCw Cutsat: 0 510 666 685 960 989 3366 Size: 510 156 19 275 29 2377 HaeII r_GCGC‘y Cutsat: 0 414 2396 3366 Size: 414 1982 970 HaeIII GG'CC 120 Cutsat: 0 391 510 641 666 685 746 847 951 Size: 391 119 131 25 19 61 101 104 Cutsat: 951 960 989 1187 1224 1289 1689 1764 1923 Size: 9 29 198 37 65 400 75 159 Cuts at: 1923 2167 2221 2285 2317 2599 2620 2707 2901 Size: 244 54 64 32 282 21 87 194 Cuts at: 2901 3074 3158 3330 3366 Size: 173 84 172 36 Hgal GACGCnnnnn'nnnnn_ Cutsat: 0 1826 2726 2916 3228 3366 Size: 1826 900 190 312 138 HgiEII ACCnnnnnnGG'I‘ Cuts at: 0 1042 3366 Size: 1042 2324 HhaIG__CG'C Cutsat: o 398 413 860 1167 2395 3017 3116 3212 Size: 398 15 447 307 1228 622 99 96 Cutsat: 3212 3366 Size: 154 Hin4I GAbnnnnanC Cutsat: 0 890 990 2008 2456 2511 2741 2877 2939 Size: 890 100 1018 448 55 230 136 62 Cutsat: 2939 3073 3190 3270 3299 3366 Size: 134 117 80 29 
67 I-Iinfl G'AnT_C Cutsat: 0 98 728 801 909 1055 1159 1172 1416 Size: 98 630 73 108 146 104 13 244 Cutsat: 1416 1496 1781 1881 1991 2117 2265 2446 3272 Size: 80 285 100 110 126 148 181 826 Cutsat: 3272 3366 Size: 94 thl GGTGAnnnnnnn_n' 121 Cuts at: 0 374 977 1037 1491 1955 1966 2126 2267 Size: 374 603 60 454 464 1 1 160 141 Cutsat: 2267 2357 2429 2512 3165 3296 3366 Size: 90 72 83 653 131 70 Kpnl G_GTAC'C Cutsat: 0 1540 3366 Size: 1540 1826 MaeIII 'GTnAC_ Cutsat: 0 1178 1497 1943 1954 2114 2273 2363 2639 Size: 1178 319 446 11 160 159 90 276 Cutsat: 2639 3006 3171 3366 Size: 367 165 195 MboII GAAGAnnnnnnn_n' Cutsat: 0 102 204 320 639 895 1810 2350 2417 Size: 102 102 116 319 256 915 540 67 Cuts at: 2417' 3366 Size: 949 Mlul A'CGCG_T Cuts at: 0 3003 3366 Size: 3003 363 Mmel TCCrACnnnnnnnnnnnnnnnnnn_nn' Cutsat: 0 1012 1654 2242 2565 2867 3366 Size: 1012 642 588 323 302 499 MnlI CCI‘Cnnnnnn_n' Cutsat: 0 21 126 230 233 469 475 595 663 Size: 21 105 104 3 236 6 120 68 Cuts at: 663 710 752 806 933 950 969 979 996 Size: 47 42 54 127 17 19 10 17 Cutsat: 996 1021 1072 1386 1468 1511 1527 1640 1717 Size: 25 51 314 82 43 16 1 13 77 Cutsat: 1717 1738 1935 2021 2105 2320 2445 2474 2524 Size: 21 197 86 84 215 125 29 50 Cutsat: 2524 2548 2852 2880 2952 2962 3058 3115 3141 122 Size: 24 304 28 72 10 96 57 26 Cutsat: 3141 3187 3191 3235 3241 3253 3256 3259 3285 Size: 46 4 44 6 12 3 3 26 Cuts at: 3285 3311 3366 Size: 26 55 Msel T'TA_A Cutsat: 0 182 354 459 617 1110 1197 1259 1899 Size: 182 172 105 158 493 87 62 640 Cutsat: 1899 1974 2332 2374 3366 Size: 75 358 42 992 MslI CAynn'nm’l‘G Cutsat: 0 903 1129 1214 1952 2832 3366 Size: 903 226 85 738 880 534 MspI C'CG__G Cutsat: 0 692 806 1176 2368 2527 2555 3056 3159 Size: 692 114 370 1192 159 28 501 103 Cutsat: 3159 3181 3188 3233 3366 Size: 22 7 45 133 MspAlI CmG'CkG Cutsat: 0 834 855 3288 3366 Size: 834 21 2433 78 Mon GCnn_nnn'nnGC Cutsat: 0 404 743 752 792 957 1312 1561 1757 Size: 404 339 9 40 165 355 249 196 Cutsat: 1757 2164 2291 2327 
2390 2412 2460 2567 3228 Size: 407 127 36 63 22 48 107 661 Cuts at: 3228 3366 Size: 138 Neil CC's_GG Cutsat: 0 806 2527 2555 3181 3182 3233 3234 3366 Size: 806 1721 28 626 1 51 1 132 NgoAIV G'CCGG_C 123 Cutsat: 0 3158 3366 Size: 3158 208 NlaIII _CATG' Cutsat: o 79 83 242 318 602 908 1658 2473 Size: 79 4 159 76 284 306 750 815 Cuts at: 2473 2489 2637 2989 3366 Size: 16 148 352 377 NlaIV GGn'nCC Cutsat: 0 32 200 950 1036 1085 1224 1464 1538 Size: 32 168 750 86 49 139 240 74 Cutsat: 1538 1652 1963 1986 2167 2187 2234 2433 2600 Size: 114 311 23 181 20 47 199 167 Cutsat: 2600 2619 2745 2770 2771 2902 2949 3186 3322 Size: 19 126 25 1 131 47 237 136 Cuts at: 3322 3331 3366 Size: 9 35 NsiI A_TGCA'T Cutsat: 0 81 1439 3366 Size: 81 1358 1927 NSpI r__CATG'y Cuts at: 0 242 318 2473 2489 3366 Size: 242 76 2155 16 877 PM CCAn_nnn'nTGG Cutsat: 0 471 662 2088 2923 3366 Size: 471 191 1426 835 443 PleI GAGTCnnnn'n_ Cutsat: 0 92 722 809 903 1180 1424 1504 1875 Size: 92 630 87 94 277 244 80 371 Cutsat: 1875 1999 2111 2259 3280 3366 Size: 124 112 148 1021 86 PspSII rG'GwC_Cy Cutsat: 0 135 199 1961 2431 2769 3366 Size: 135 64 1762 470 338 597 124 PstI C_TGCA'G Cutsat: 0 1295 2323 3121 3366 Size: 1295 1028 798 245 PvuII CAG'CT G Cuts at: 0 855 3366 Size: 855 2511 Real T‘CATG_A Cuts at: 0 598 3366 Size: 598 2768 RleAI CCCACAnnnnnnnnn_nnn' Cutsat: 0 1093 1631 2983 3349 3366 Size: 1093 538 1352 366 17 RsaIGT'AC Cutsat: 0 22 248 252 1508 1538 1771 2009 2498 Size: 22 226 4 1256 30 233 238 489 Cutsat: 2498 2536 3366 Size: 38 830 SacI G_AGCT‘C Cuts at: 0 2629 3195 3366 Size: 2629 566 171 SanDI GG'GwC_CC Cuts at: 0 2769 3366 Size: 2769 597 Sap] GCTCI'I’Cn'nnn_ Cuts at: 0 878 3366 Size: 878 2488 Sau961 G'GnC_C Cutsat: 0 31 135 199 389 745 846 949 1185 Size: 31 104 64 190 356 101 103 236 Cutsat: 1185 1222 1223 1558 1651 1688 1922 1961 2165 Size: 37 1 335 93 37 234 39 204 125 Cutsat: 2165 2166 2219 2283 2431 2598 2618 2705 2769 Size: 1 53 64 148 167 20 87 64 Cutsat: 2769 2900 2947 3156 3185 3294 
3329 3366
Size: 131 47 209 29 109 35 37
Sau3AI 'GATC_
Cuts at: 0 379 491 2743 2781 2934 2957 3279 3366
Size: 379 112 2252 38 153 23 322 87
ScrFI CC'n_GG
Cuts at: 0 16 195 367 404 516 532 657 806
Size: 16 179 172 37 112 16 125 149
Cuts at: 806 891 953 962 1039 1213 1467 1930 2170
Size: 85 62 9 77 174 254 463 240
Cuts at: 2170 2191 2228 2245 2300 2402 2527 2555 2932
Size: 21 37 17 55 102 125 28 377
Cuts at: 2932 3181 3182 3233 3234 3308 3366
Size: 249 1 51 1 74 58
SfaNI GCATCnnnnn'nnnn_
Cuts at: 0 238 1206 2082 2203 2548 2734 3358 3366
Size: 238 968 876 121 345 186 624 8
SfcI C'TryA_G
Cuts at: 0 1291 1700 1979 2269 2319 2441 3117 3366
Size: 1291 409 279 290 50 122 676 249
SfiI GGCCn_nnn'nGGCC
Cuts at: 0 957 3366
Size: 957 2409
SgrAI Cr'CCGG_yG
Cuts at: 0 2367 3366
Size: 2367 999
SimI GG'GTC_
Cuts at: 0 556 1651 2551 2769 3366
Size: 556 1095 900 218 597
SmaI CCC'GGG
Cuts at: 0 3182 3234 3366
Size: 3182 52 132
SmlI C'TyrA_G
Cuts at: 0 224 1753 2622 3366
Size: 224 1529 869 744
SnaBI TAC'GTA
Cuts at: 0 250 3366
Size: 250 3116
SpeI A'CTAG_T
Cuts at: 0 1128 3366
Size: 1128 2238
Sse8387I CC_TGCA'GG
Cuts at: 0 2323 3366
Size: 2323 1043
Sse8647I AG'GwC_CT
Cuts at: 0 135 3366
Size: 135 3231
SspI AAT'ATT
Cuts at: 0 1832 3366
Size: 1832 1534
StuI AGG'CCT
Cuts at: 0 685 3366
Size: 685 2681
StyI C'CwwG_G
Cuts at: 0 385 609 1502 3366
Size: 385 224 893 1864
TaiI _ACGT'
Cuts at: 0 252 1256 1878 3027 3036 3366
Size: 252 1004 622 1149 9 330
TaqI T'CG_A
Cuts at: 0 811 1077 3366
Size: 811 266 2289
TaqII GACCGAnnnnnnnnn_nn'
Cuts at: 0 172 1128 1458 2986 3366
Size: 172 956 330 1528 380
TatI wGTACw
Cuts at: 0 22 1771 2498 3366
Size: 22 1749 727 868
TauI GCsGC
Cuts at: 0 832 1287 2793 2998 3072 3286 3366
Size: 832 455 1506 205 74 214 80
TfiI G'AwT_C
Cuts at: 0 1055 1159 1781 2446 3366
Size: 1055 104 622 665 920
ThaI CG'CG
Cuts at: 0 1165 1286 3005 3366
Size: 1165 121 1719 361
TseI G'CwG_C
Cuts at: 0 649 855 1293 1790 2406 3044 3116 3132
Size: 649 206 438 497 616 638 72 16
Cuts at: 3132 3366
Size:
234
Tsp45I 'GTsAC_
Cuts at: 0 1497 1943 1954 2114 2273 2363 2639 3006
Size: 1497 446 11 160 159 90 276 367
Cuts at: 3006 3171 3366
Size: 165 195
Tsp4CI AC_n'GT
Cuts at: 0 35 552 701 921 1118 1282 2534 2848
Size: 35 517 149 220 197 164 1252 314
Cuts at: 2848 2994 3012 3360 3366
Size: 146 18 348 6
Tsp509I 'AATT_
Cuts at: 0 1 146 351 814 1199 1256 2057 2605
Size: 1 145 205 463 385 57 801 548
Cuts at: 2605 2683 3366
Size: 78 683
TspRI CAsTGnn'
Cuts at: 0 108 166 426 924 1123 1195 1279 1418
Size: 108 58 260 498 199 72 84 139
Cuts at: 1418 1672 2041 2780 2791 2853 2997 3304 3331
Size: 254 369 739 11 62 144 307 27
Cuts at: 3331 3366
Size: 35
Tth111I GACn'n_nGTC
Cuts at: 0 35 1820 3366
Size: 35 1785 1546
Tth111II CAArCAnnnnnnnnn_nn'
Cuts at: 0 189 242 863 2516 2738 3366
Size: 189 53 621 1653 222 628
UbaDI GAACnnnnnnTCC
Cuts at: 0 205 2535 3366
Size: 205 2330 831
UbaEI CACCTGC
Cuts at: 0 3338 3366
Size: 3338 28

Enzymes that do cut:
AceIII AciI AflIII AluI AlwI AlwNI ApaI ApoI AvaI AvaII AvrII BaeI BamHI BanI BanII BbvI BccI Bce83I BcefI BciVI BfaI BfiI BglI BglII BmgI BplI BpmI Bpu10I Bpu1102I BsaI BsaAI BsaHI BsaJI BsaWI BsaXI BsbI BscGI BseRI BsgI BsiEI BsiHKAI BslI BsmAI BsmBI BsmFI Bsp24I Bsp1286I BspGI BspLU11I BspMI BsrI BsrBI BsrDI BsrFI BsrGI BstAPI BstEII BstXI BstYI Bsu36I Cac8I CjeI CjePI CviJI CviRI DdeI DpnI DraI DraIII DrdI DrdII EaeI EagI EarI Eco47III Eco57I EcoO109I EcoRI EcoRII FauI Fnu4HI FokI FspI GdiII HaeI HaeII HaeIII HgaI HgiEII HhaI Hin4I HinfI HphI KpnI MaeIII MboII MluI MmeI MnlI MseI MslI MspI MspA1I MwoI NciI NgoAIV NlaIII NlaIV NsiI NspI PflMI PleI Psp5II PstI PvuII RcaI RleAI RsaI SacI SanDI SapI Sau96I Sau3AI ScrFI SfaNI SfcI SfiI SgrAI SimI SmaI SmlI SnaBI SpeI Sse8387I Sse8647I SspI StuI StyI TaiI TaqI TaqII TatI TauI TfiI ThaI TseI Tsp45I Tsp4CI Tsp509I TspRI Tth111I Tth111II UbaDI UbaEI

Enzymes that do not cut:
AatII AccI AflII AhdI ApaLI AscI BbsI BcgI BclI BsaBI BsmI BspEI BssHII BssSI BstDSI BstZ17I ClaI EciI EcoNI EcoRV FseI HincII
HindIII HpaI MscI MunI NarI NcoI NdeI NheI NotI NruI NspV PacI Pfl1108I PinAI PmeI PmlI PshAI Psp1406I PvuI RsrII SacII SalI ScaI SexAI SgfI SmiI SphI SrfI SunI VspI XbaI XcmI XhoI XmnI
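
In the restriction map listing, each "Size:" row appears to be the set of differences between consecutive positions in the matching "Cuts at:" row, with 0 and 3366 marking the two ends of the sequence. The short Python sketch below illustrates that arithmetic. It is not part of the original mapping program output; the function name `fragment_sizes` is a hypothetical helper introduced here, and the cut positions are copied from the KpnI and SmaI entries.

```python
def fragment_sizes(cuts):
    """Return fragment lengths from an ascending list of cut positions.

    Assumes the list includes both sequence ends (0 and the total
    length), as the "Cuts at:" rows in the listing do.
    """
    return [b - a for a, b in zip(cuts, cuts[1:])]


# Cut positions copied from the KpnI and SmaI entries of the listing.
kpni_cuts = [0, 1540, 3366]
smai_cuts = [0, 3182, 3234, 3366]

print(fragment_sizes(kpni_cuts))  # [1540, 1826]
print(fragment_sizes(smai_cuts))  # [3182, 52, 132]
```

The same check can flag transcription errors in a listing of this kind: the computed sizes for an entry should match its "Size:" rows and sum to the sequence length.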