This is to certify that the dissertation entitled

CHICKEN CHROMOSOMAL PROTEIN GENES

presented by David Lawrence Browne has been accepted towards fulfillment of the requirements for the Ph.D. degree in Biochemistry.

Date: [month illegible] 20, 1990

MSU is an Affirmative Action/Equal Opportunity Institution


CHICKEN CHROMOSOMAL PROTEIN GENES

By

David Lawrence Browne

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Biochemistry

1990


ABSTRACT

CHICKEN CHROMOSOMAL PROTEIN GENES

by

David Lawrence Browne

Two types of chicken chromosomal protein genes have been studied: the histone H1 genes and the HMG-14 genes. A known chicken H1 gene was used as a hybridization probe to isolate 4 other members of the gene family. The sequence of one of these genes was completely determined. Comparison of the two H1 sequences showed that the two genes encode unique proteins. The newly characterized gene contains common sequence elements in the 5' and 3' regions which flank the coding sequence. Partial sequence analysis showed that the other members of this gene family encode unique proteins and contain common eukaryotic promoter elements.

Chicken HMG-14 and HMG-17 cDNA clones were used to isolate the chicken genes for these proteins. Sequence analysis showed that the HMG-17 gene consists of 6 exons and 5 introns. The HMG-17 promoter contains the common [...] which is present in some, but not all, of the HMG-14 mRNA. The multiple splicing seen at the 5' end of the HMG-14 gene may result from multiple initiation sites, since the HMG-14 promoter contains no TATAA or CCAAT elements. Multiple processing also occurs at the 3' end of the HMG-14 transcript, in contrast to the HMG-17 transcript.


To Linda, whose Spartan support of science has been remarkable.


TABLE OF CONTENTS

LIST OF FIGURES

CHAPTER 1. LITERATURE REVIEW: HISTONES AND HISTONE GENES
    Histone structure
    Histone variants
    Proposal
    References

CHAPTER 2. MATERIALS AND METHODS
    Materials
    Methods
    Isolation of HMG clones
    Sequence determination
    RNA isolation
    S1 protection analysis
    Primer extension
    References

CHAPTER 3. SEQUENCE ANALYSIS OF CHICKEN H1 HISTONE GENES
    H1.10a
    H1.2e
    H1.1c
    H1 sequences
    H1.10a sequence
    Other H1 gene sequences
    References

CHAPTER 4. LITERATURE REVIEW: HMG CHROMOSOMAL PROTEINS
    HMG-1 and HMG-2
    Microheterogeneity
    Possible HMG-1,2 functions
    HMG-14 and HMG-17

CHAPTER 5. ISOLATION AND CHARACTERIZATION OF CHICKEN HMG-14a AND HMG-17 GENES
    The chicken HMG-17 gene
    The HMG-17 promoter
    S1 protection analysis
    Other results
    Conclusion
    The chicken HMG-14a gene
    Exon 0
    The HMG-14a promoter
    The HMG-14a 3' end
    The HMG-14b gene
    Comparison of human and chicken HMG genes
    References

APPENDIX


LIST OF FIGURES

1. Nucleosome structure
2. The role of H1 in nucleosome ordering
3. The genomic organization of histone genes
4. Recombinant λCharon 4A chicken genomic clones which contain chicken H1 genes, and plasmid subclones derived from them
5. Comparison of p2c2.2?1 and p10a5.0BR
6. Restriction maps of pCH3dR8 and p2c6.8RH
7. Restriction maps of p2e3.5RR and p5e5.2RB
8. Restriction map of ACch
9. Fine structure restriction map of p10a5.0BR
10. Comparison of the sequences of H1.1a and H1.10a
11. Comparison of 5' coding sequences of chicken H1 genes
12. H1 promoter sequences
13. The sequence of a rat HMG-1 cDNA
14. Restriction maps of recombinant λCharon 4A chicken genomic clones which hybridize to the HMG-17 cDNA pLG1a
17. HMG-17 exon-intron boundaries
18. The 3' end of the HMG-17 gene
19. The HMG-17 promoter region
20. S1 protection analysis of HMG-17 mRNA
21. Results of S1 protection analysis of HMG-17 mRNA from 15 day old chicken embryos
22. Restriction maps of recombinant λCharon 4A chicken genomic clones which hybridize to the HMG-14a cDNA pLM3a
23. Restriction maps of plasmid subclones which contain HMG-14a exons
24. The exon-intron structure of HMG-14a
25. Conserved exons of HMG-14a and HMG-17
26. HMG-14a exon-intron boundaries
27. The HMG-14a promoter region
28. Restriction map of pLM2a
29. The HMG-14a promoter region
30. Multiple splicing patterns of HMG-14a mRNA as represented by two cDNA clones
31. S1 protection analysis of HMG-14a mRNA
32. Results of S1 protection analysis of HMG-14a mRNA
33. Primer extension analysis of HMG-14a mRNA
34. The 3' end of the HMG-14a gene


CHAPTER 1

Literature Review: Histones and Histone Genes

In 1884 Albrecht Kossel reported the isolation of a fraction of basic proteins from the nuclei of goose erythrocytes (1). He called this fraction "histone" and discussed the possibility of its interaction with nucleic acids. Since his observations, histones have been found to be essentially universal in eukaryotic nuclei, and much has been learned about the structures of histone proteins and the nature of their interactions with DNA. As our knowledge increases, though, so does the sophistication of our ignorance, and the biochemistry of histones remains an active field of study today.

Chromatin is the complex of proteins and nucleic acids found in the nuclei of all eukaryotic cells. It contains the genetic information of a cell and is intimately involved in both the expression of that information (transcription) and the maintenance of that information from one generation of cell to the next (replication). Chromatin undergoes dynamic structural changes as it performs each of these functions (2-4).

Eukaryotic chromosomes are composed of approximately equal amounts of DNA, histones, and other proteins. Histones are the main protein components of chromatin. They [...] histone is at least 10 times as abundant as any nonhistone chromatin protein (5).

Chromosomes have long been studied by microscopists, and biochemists have studied the components of chromatin. Around 1975 both types of studies revealed the basic repetitive nature of chromatin. Improved electron micrographs revealed the "beads on a string" structure of chromatin (6-8). This structure was correlated with the products of nuclease digestion of chromatin, which appeared to be monomers and multimers of a basic structure. This basic structure is the nucleosome.

When chromatin is digested with a nuclease like DNase I, nucleosomes are released. These particles contain about 200 base pairs (bp) of DNA and a standard complement of histones: one molecule of H1 and 2 molecules of each of the other classes of histones (9). Further digestion of nucleosomes releases H1 and reduces the DNA to 165 bp (10). The resulting nucleosome core is relatively resistant to further nuclease digestion.

The stability of the nucleosome cores is reflected by the ability to reconstitute them. Isolated core histones will reaggregate when mixed in solution, and when DNA is [...] 165 bp of DNA are wrapped twice around the histone core. In undigested chromatin, cores are joined by a species-specific length of "linker" DNA (averaging about 30 bp) which is associated with H1 histone. One conception of this structure is shown in Figure 1.

The location of H1 external to the nucleosome core is inferred from its release by nucleases and from other studies of whole chromatin. H1 is more readily extracted from chromatin than are other histones; that is, lower concentrations of salt disrupt its interactions with DNA and other histones (17). Isolated H1 will reassociate with H1-depleted chromatin (18).
The lower stability of H1 in chromatin has made the stoichiometry of H1 somewhat problematic. Measurements show that there are about 0.8-1.0 molecules of H1 per core octamer (22). The absence of H1 from some nucleosomes may be artifactual or may reflect the in vivo situation.

The orientation and interactions of H1 in native chromatin are not known. It can be shown that H1 interacts with both ends of the 165 bp of core DNA (19), and crosslinking studies of reconstructed chromatin show that one H1 molecule can link two core particles (20,21).

Figure 1. Nucleosome structure. From (88).

Electron microscopy of H1-depleted chromatin and reconstructed chromatin shows that H1 orders nucleosomes into a regular, zig-zag structure (23). Figure 2 illustrates this role of H1.

Figure 2. The role of H1 in nucleosome ordering. From (89).

Histone structure

Because of their abundance, amino acid sequences of histones from many species have been determined (24). When the structures of histones from many sources are compared, some general features are seen. Histones are small globular proteins with molecular weights between 15 kilodaltons (kd) and 21 kd. They are rich in basic amino acids, with lysine and arginine typically comprising about 25% of the residues. These and other polar residues are concentrated in the amino-terminal regions of the molecule; the carboxy-terminal two-thirds is quite hydrophobic.

H1 differs from the other histones by having another basic domain at the C-terminus. The amino-terminal basic domain and the hydrophobic central portion of H1 are approximately the same sizes as the equivalent domains of the core histones (25).

Sequence comparisons show that the primary structures of the histones have been highly conserved through time. The evolutionary stability of H4 is legendary. The sequences of H4 of calf, mouse, and frog are identical
[...] sequence (29), relative to that of vertebrates. H3 is also highly conserved. A calf H3 differs from a chicken H3 by four substitutions (30) and from pea H3 by only 5 substitutions (31).

H2A and H2B sequences have diverged more than H3 and H4 sequences. While mouse and calf H2B differ by only one residue, H2B from wheat has only 78% homology to the mammalian protein (32). Almost all of the variability in the H2B family is found in the charged amino-terminal domain, and includes deletions and insertions of residues as well as substitutions (33,34). The hydrophobic carboxy-terminal domain of H2B has been highly conserved. The H2A family has variability throughout the molecule. H2A proteins from chicken and calf are 78% homologous, and the proteins from plants (35) and yeast (36) are quite different from the vertebrate proteins.

The H1 class is the most divergent of the 5 classes of histones. In fact, H1 has never been found in yeast (37). Both ends of the molecule are variable and very basic; the central hydrophobic region is more conserved (38). H1 molecules vary much more in size than other histones: H1 proteins of 189 and 22[?] amino acids have been reported (39,40).

Histone variants

The existence of histone variants in individual organisms has long been known, and it originally led to a plethora of nomenclatures (41). There are two sources of histone variability within a species. First, non-allelic variants with different primary structures may exist. Second, all the histones undergo a variety of post-translational modifications.

The intraspecies evolutionary variability of different histone classes is similar to the variability seen among species. Thus, organisms generally have only one type of H4 histone protein. Three H3 histone proteins with different primary structures are known in calf, and they are identical to the 3 chicken types. H2A and H2B variants have diverged to the point that homologies between species can be difficult to assign (42,43).
The highly diverse H1 family includes some of the best studied variants. The most abundant of these are H1° and H5. H1° is a mammalian H1 variant which is found in high amounts in non-dividing tissues (44). H1° itself can be separated into two forms, but the differences between them are not known. The other major H1 variant is H5, which is found only in the nucleated erythrocytes of birds, reptiles, amphibians, and fish (45). Erythrocytes are non-dividing cells and, interestingly, H5 proteins are more [...] number and amounts of H1 variants can vary from one tissue to another within a single species (48). In most cases primary structures of the variants have not been determined, so heterogeneity in this class of histones could result from post-translational modifications in addition to differences in primary structures.

All 5 classes of histones undergo a variety of post-translational modifications. The most common modifications are acetylation, methylation, and phosphorylation, but other modifications have also been found.

Two types of amino-acetylation are seen in histones: N-terminal and internal. The N-terminus of H1, H2A, and H4 can be acetylated (49-51). Serine is the usual acetylated N-terminal residue, but others are possible (52,53). Most histone acetylation occurs at the side chains of internal lysine residues of all the histones except H1. Internal acetylation is reversible. The general level of histone acetylation is controlled by the action of a variety of nuclear acetylases (54) and deacetylases (55,56). Since multiple side chains may be acetylated, and this modification is reversible, internal acetylation can generate many specific forms of histones (57).

Internal lysine side chains can also be [...] methylation of arginine side chains has also been found (58).

Phosphorylation of internal residues is a common modification that is seen in all 5 histone classes.
Serine and threonine are the usual sites of phosphorylation, but 2 histidine residues of H4 can also be phosphorylated (59).

Many people have attempted to correlate histone variants and modifications with functional states of chromatin. Particularly popular are studies of cells undergoing differentiation, neoplastic transformation, senescence, and DNA repair. Despite hundreds of studies, the involvement of histone variants in chromatin activity is still poorly characterized and often controversial.

Studies of the roles of histone variants and modifications in modulating chromatin structure and function are difficult for two related, fundamental reasons. First, isolation of chromatin or fractions of chromatin must involve some destruction of structure. Second, specific variant structures are only a small fraction of total chromatin, and procedures to enrich chromatin preparations for specific variants are disruptive and have not been very [...] changes that are part of this process. The best documented is the coupling of chromosome condensation and phosphorylation of specific residues of H1 (60). A mitosis-associated kinase, active during late G2 in the cell cycle, is responsible for phosphorylation of hydroxyl groups in the sequence lys-ser/thr-pro (61,62) in both the N-terminal and C-terminal domains of H1. The activity of this kinase is itself regulated by phosphorylation. It participates in a cascade of phosphorylation which leads to modifications of H1 histones, ribosomal proteins, and other nonchromosomal proteins (63).

Other histone modifications occur at this time, including H3 phosphorylation (62) and an increase of poly-ADP-ribosylation of undetermined chromosomal proteins (64). These changes have been reported in other stages of the cell cycle, though (65-67), so their importance in mitosis is unclear.
The best biochemical studies of chromatin have allowed only vague correlations of structure and function. It seems that new approaches to the study of chromatin are needed. One that should prove fruitful is the increasingly refined determination by x-ray crystallography of the physical structure of nucleosomes of defined composition, including H1 and modified core histones.

[...] a sea urchin ovum initiates a series of rapid cell divisions with no actual growth of the embryo. During this morula stage the need for histones and other nuclear proteins is great, and at least 60% of the embryo's translational activity is devoted to the production of these proteins (68,69). Large numbers of synchronized early embryos are easy to obtain and are a good source of polysomes which are enriched with histone mRNA.

Measurements of the hybridization kinetics of sea urchin histone mRNA and genomic DNA showed that the histone genes occur several hundred times in the sea urchin genome (70). When the DNA was fractionated by equilibrium density gradient centrifugation, most of these highly reiterated genes (if not all) were found in a satellite fraction (70,71). This suggested that the histone genes might be linked, but proof of this was only obtained after the development of recombinant DNA techniques, when cloned genomic DNA became available.

When histone mRNAs were used to probe cloned DNA and restricted genomic DNA, it was learned that genes of the 5 classes of histones are grouped into nearly identical clusters, each about 6 kilobase pairs (kb) long, which are tandemly repeated 300-600 times. Figure 3a [...] strand of DNA. The order of the genes (...H1-H4-H2B-H3-H2A...) has been conserved. The sequences between the coding regions are A+T-rich and contain regulatory elements (72) but not other genes.

The availability of cloned histone genes allowed study of histone gene organization in other organisms.
When the organization of Drosophila histone genes was investigated, it was found that these genes, too, occur as tandemly reiterated quintets. The Drosophila repeats differ from sea urchin repeats in several ways, though. The order of the histone genes in these flies is different (...H1-H3-H4-H2A-H2B...) and the genes are not all transcribed from the same strand of DNA. Also, 2 types of repeats are found. They differ by the presence or absence of a specific 240 bp intergenic sequence (73). This insert does not seem to be important for histone production, since flies have been bred which lack either type of repeat with no apparent deleterious effects (74,75). Figure 3b shows the structure of the Drosophila histone repeats.

The arrangement of histone genes is less ordered in the vertebrates. Reiterated quintets of histone genes occur in amphibians, but a good deal of heterogeneity has evolved [...] clusters of histone genes (77). The number of copies of each class of histone gene ranges from about 50 to 1500 in amphibians, probably because of the large range of haploid DNA content among these species (78).

Birds and mammals have fewer histone genes than the animals discussed above, about 10-40 copies of each type in a haploid genome. These genes are not arranged in repeated blocks at all. Many of the genes are clustered, though. Figure 3c shows restriction maps of a number of clones of chicken genomic DNA which contain histone genes, as judged by Southern blotting. It is clear that, though these clones contain clusters of histone genes, each is unique and not tandemly repeated.

The collection of genomic clones shown in Figure 3c contains most, but probably not all, of the chicken histone genes. Analysis of genomic Southern blots suggests that the H1 gene is present in about 6 copies, while the other histone genes are present about 10-15 times each (79).
These estimates agree with early studies of solution hybridization kinetics (80) which were used to infer histone gene copy number. The sequence of at least one gene of each class of chicken histone has been determined (81). Also, various sea [...]

Figure 3. The genomic organization of histone genes.
a.) The major repeat of sea urchin L. pictus. From (87).
b.) The Drosophila melanogaster repeat. From (73).
c.) Chicken histone genes. Location of genes, as determined by Southern blot analysis, is indicated by boxes over the restriction maps. Solid black = H1, all white = H2A, vertically hatched = H2B, diagonally hatched = H3, and stippled = H4. Sequenced genes are named and their direction of transcription is indicated with arrows. From (81).

The regions 5' of the coding sequences of genes contain sequences which are important for gene expression. The best characterized of these promoter elements are the TATAA element (82) and the CCAAT element (83).

The TATAA element has been found upstream of almost every histone gene analyzed, as well as many other genes. This sequence directs the start of transcription to a point about 30 bp downstream of it through its interaction with RNA polymerase II and other proteins (83). Some genes lack the TATAA element, and these genes often initiate transcription at several sites.
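
The TATAA rule described above (initiation about 30 bp downstream of the element) can be illustrated with a short sketch. The Python below is not part of the original work: the function names, the toy sequence, and the fixed 30 bp offset measured from the first base of the element are all assumptions made for illustration. It only performs exact matching on one strand, so a real promoter analysis would need degenerate matching and a search of both strands.

```python
"""Minimal sketch: locate TATAA and CCAAT elements in a DNA string.

Assumptions (not from the dissertation): the function names, the toy
sequence, and the fixed 30 bp offset used to estimate a start site.
"""


def find_motif(sequence, motif):
    """Return the 0-based start position of every exact match of motif."""
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Step forward one base so overlapping matches are also reported.
        start = sequence.find(motif, start + 1)
    return positions


def predicted_start_sites(sequence, offset=30):
    """Estimate initiation points `offset` bp downstream of each TATAA."""
    sites = []
    for pos in find_motif(sequence, "TATAA"):
        site = pos + offset
        # Ignore predictions that fall beyond the end of the sequence.
        if site < len(sequence):
            sites.append(site)
    return sites


# Hypothetical upstream region, for illustration only.
toy_promoter = "GG" + "CCAAT" + "G" * 40 + "TATAA" + "C" * 40

ccaat_sites = find_motif(toy_promoter, "CCAAT")
tataa_sites = find_motif(toy_promoter, "TATAA")
start_sites = predicted_start_sites(toy_promoter)
print(ccaat_sites, tataa_sites, start_sites)
```

A sequence with no TATAA match returns an empty list of predicted sites, which loosely mirrors the observation that genes lacking the element have no single preferred initiation point.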

The CCAAT element is also found in many genes, including most histone genes examined to date. Its location, 30-60 bp upstream of the TATAA element, is variable, and some genes contain more than one copy. The CCAAT element, like the TATAA element, interacts with proteins to modulate transcription.

Other upstream sequences may be promoter elements which are specific to one gene or a family of genes. The functions of these elements are often unknown, but they are probably binding sites for less ubiquitous activators of transcription.

Comparisons of vertebrate histone genes have revealed several sequences that are evolutionarily conserved and thus are candidate promoter elements. Two of these will [...] representatives of a given class in a given species and often in the same class of histone genes in a wide variety of species.

Some histone H2B genes are preceded by the octamer sequence ATTTGCAT. This element may be involved in cell cycle regulation of the H2B genes by interaction with one or more regulatory proteins (84). The octamer sequence has been shown to bind a ubiquitous transcription factor, Oct1. This sequence is also present in lymphoid-specific enhancers and can bind a lymphoid-specific factor, Oct2.

A number of histone H1 genes have an A+C-rich element located upstream of the CCAAT element (85). The function of this element (usually AAACACA) is not known, but its evolutionary conservation suggests it is important in the regulation of these genes.

DNA sequencing of histone coding regions has revealed that almost all histone genes consist of a single exon. Two of the few exceptions occur in chickens (86). The intron-containing chicken histone genes are also unusual in that they are unlinked to other histone genes, and their expression does not vary through the cell cycle.

Histone mRNA differs from most other mRNA because it is generally not polyadenylated. Histone mRNA usually ends [...] stability on histone mRNA.
This feature of histone mRNA has been highly conserved and is found in animals from sea urchins to mammals.

Proposal

Determination of the structure of genes is useful for two main reasons. First, comparisons of gene structures have allowed identification of sequences important for the functions of both the genes and their products. Second, knowledge of a gene's structure allows very sensitive measurements of its activity.

Characterization of the H1 family is the only definitive way to determine variability in this important class of proteins, and to differentiate variability in primary sequence from post-translational variability. Only then can the participation of such variability in different states of chromatin structure and, perhaps, in chromatin function (transcription/replication) be assessed.

H1 histone gene structure analysis will also facilitate further studies of H1 function by enabling the measurement of the production of specific H1 variants at the transcriptional level. The high level of similarity in the coding regions of the 6 chicken H1 histone genes requires that individual DNA sequences be determined in order to develop gene-specific [...]

We propose to:

1. Determine the size of the H1 family in chickens by isolating all of the genes.
2. Determine the sequences of the genes to deduce primary sequence variability and sequences important for gene activity.
3. Use knowledge of H1 gene structure to measure the activity of the genes in chicken cells.

References

1. Doenecke, D., and P. Karlson. (1984). Trends in Biochemical Sciences 2:404-406.
2. Weintraub, H., and M. Groudine. (1976). Science 21:848.
3. Weisbrod, S. (1982). Nature (London) 221:289.
4. Felsenfeld, G. (1978). Nature (London) 211:115.
5. Johns, E.W., (ed.). (1983). "The HMG Chromosomal Proteins." Academic Press, New York.
6. Woodcock, C.L.F. (1973). J. Cell Biol. 52:368.
7. Olins, A.L., and D.E. Olins. (1973). J. Cell Biol. 52:252.
8. Olins, A.L., and D.E. Olins. (1974). Science 181:330.
9. Kornberg, R.D. (1974). Science 185:868.
10. Noll, M., and R.D. Kornberg. (1977). J. Mol. Biol. 102:393.
11. Thomas, J.O., and R.D. Kornberg. (1975). Proc. Nat. Acad. Sci. USA 12:2626.
12. Stein, A., M. Bina-Stein, and R.T. Simpson. (1977). Proc. Nat. Acad. Sci. USA 15:2780.
13. Thomas, J.O., and R.D. Kornberg. (1975). FEBS Lett. 58:353.
14. Thomas, J.O., and P.J.G. Butler. (1977). J. Mol. [...]
16. Thomas, J.O., and P.J.G. Butler. (1978). Cold Spring Harbor Symp. Quant. Biol. 52:119.
17. Steinmetz, M., R.E. Streeck, and H.G. Zachau. (1978). Eur. J. Biochem. 81:615.
18. Caron, F., and J.O. Thomas. (1981). J. Mol. Biol. 116:513.
19. Finch, J.T., L.C. Lutter, D. Rhodes, T. Richmond, B. Rushton, M. Levitt, and A. Klug. (1977). Nature (London) 253:29.
20. Itkes, A.V., B.O. Glotov, L.C. Nikolaiv, S.R. Preem, and E.S. Severin. (1980). Nuc. Acids Res. 8:503.
21. Thomas, J.O., and A.J.A. Khabaza. (1980). Eur. J. Biochem. 112:501.
22. Bates, D.L., and J.O. Thomas. (1981). Nuc. Acids Res. 2:5883.
23. Thoma, F., T. Koller, and A. Klug. (1979). J. Cell Biol. 31:403.
24. Isenberg, I. (1979). Ann. Rev. Biochem. 58:159.
25. Cole, R.D. (1984). Anal. Biochem. 116:24.
26. Seiler-Tuyns, A., and M.L. Birnstiel. (1981). J. Mol. Biol. 151:607.
27. Moorman, A.F.M., P.A.J. de Boer, R.J.H. deLaaf, H.H.A.H. Van Dongen, and O.H.J. Destree. (1981). FEBS [...]
29. DeLange, R.J., D.M. Fambrough, E.L. Smith, and J. Bonner. (1969). J. Biol. Chem. 215:5669.
30. Brandt, W.F., and C. Von Holt. (1974). Eur. J. Biochem. 16:407.
31. Patthy, L., E.L. Smith, and J. Johnson. (1973). J. Biol. Chem. 258:6834.
32. Von Holt, C., W.N. Strickland, W.F. Brandt, and H.S. Strickland. (1979). FEBS Lett. 109:201.
33. Franklin, S.G., and A. Zweidler. (1977). Nature (London) 266:273.
34. Hunt, L.T., and M.O. Dayhoff. (1982). in "Macromolecular Sequences in Systematic and Evolutionary Biology", Goodman, M., (ed.), 193. Plenum Press, New York.
35. Rodrigues, J. de A., W.F. Brandt, and C. Von Holt. (1979). Biochim. Biophys. Acta 518:196.
36. Choe, J., P. Kolodrubetz, and M. Grunstein. (1982). Proc. Nat. Acad. Sci. USA 12:1484.
37. Brandt, W.F., K. Patterson, and C. Von Holt. (1980). Eur. J. Biochem. 119:67.
38. Smith, B.J., H.R. Harris, C.H. Sigournay, E.L.V. Hayes, and M. Bustin. (1984). Eur. J. Biochem. 118:309.
39. Smith, B.J., J.H. Walker, and E.W. Johns. (1980). [...]
41. Bradbury, E.M. (1975). in "The Structure and Function of Chromatin", Ciba Foundation Symposium 28. Ass. Scientific Publ., Amsterdam.
42. West, M.H.P., and W.M. Bonner. (1983). Comp. Biochem. Physiol. 188:455.
43. Zweidler, A. (1984). in "Histone Genes: Structure, Organization, and Regulation", Stein, G.S., Stein, J.L., and Marzluff, W.F., (eds.), p. 339. Wiley/Interscience, New York.
44. Smith, B.J., and E.W. Johns. (1980). FEBS Lett. 119:25.
45. Isenberg, I. (1979). Ann. Rev. Biochem. 18:159.
46. Smith, B.J., J.H. Walker, and E.W. Johns. (1980). FEBS Lett. 112:42.
47. Smith, B.J., J.H. Walker, C.H. Sigournay, E.L.V. Hayes, and M. Bustin. (1984). Eur. J. Biochem. 118:309.
48. Cole, R.D. (1984). Anal. Biochem. 118:24.
49. Rall, S.C., and R.D. Cole. (1971). J. Biol. Chem. 288:7175.
50. Phillips, D.M.P. (1963). Biochem. J. 81:258.
51. Ogawa, Y., G. Quagliarotti, J. Jordan, C.W. Taylor, W.C. Starbuck, and H. Busch. (1969). J. Biol. Chem. 211:4387.
53. Harvey, R.P., J.A. Whiting, L.S. Coles, P.A. Krieg, and J.R.E. Wells. (1983). Proc. Nat. Acad. Sci. USA 89:2819.
54. Saxholm, H.J.K., A. Pestana, L. O'Connor, D.A. Sattler, and H.C. Pitot. (1982). Mol. Cell. Biochem. 18:129.
55. Kelner, D.N., and K.S. McCarthy, Sr. (1984). J. Biol. Chem. 282:3413.
56. Yukioka, M., S. Sasaki, S. Le Qui, and A. Inoue. (1984). J. Biol. Chem. 288:8372.
57. Liao, L.W., and R.D. Cole. (1981). J. Biol. Chem. 288:3024.
58. Gupta, A., D. Jensen, S. Kim, and W.H. Paik. (1972). J. Biol. Chem. 281:9677.
59. Matthews, H.R., and V.D. Heubner. (1984). Mol. Cell. Biochem. 82:81.
60. Bradbury, E.M., R.J. Inglis, H.R. Matthews, and H. Sarner. (1973). Eur. J. Biochem. 18:131.
61. Quirin-Stricker, C. (1984). Eur. J. Biochem. 112:317.
62. Gurley, L.R., R.A. Walters, and R.A. Tobey. (1974). J. Cell Biol. 88:356.
63. Maller, J.L. (1990). Biochemistry 22:3157.
64. Tanuma, S., and Y. Kanai. (1982). J. Biol. Chem. [...]
66. Whitlock, J.P., D. Goleazzi, and H. Schulman. (1983). J. Biol. Chem. 288:1299.
67. Kidwell, H.R., and H.C. Mage. (1976). Biochemistry 15:1213.
68. Kedes, L., and P. Gross. (1969). J. Mol. Biol. 12:559.
69. Moav, B., and M. Nemer. (1971). Biochemistry 18:881.
70. Kedes, L., and M.L. Birnstiel. (1971). Nature New Biology 288:165.
71. Birnstiel, M.L., J. Telford, E.S. Weinberg, and D. Strafford. (1974). Proc. Nat. Acad. Sci. USA 11:2900.
72. Hentschel, C., and M.L. Birnstiel. (1981). Cell 28:301.
73. Lifton, R.P., M.L. Goldberg, R.W. Karp, and D.S. Hogness. (1977). Cold Spring Harbor Symp. Quant. Biol. 12:1047.
74. Moore, G., J. Proncunier, D. Cross, and A. Grigliatti. (1979). Nature (London) 282:312.
75. Wright, T., R. Hodgetts, and A. Sherald. (1976). Genetics 85:267.
76. Stephenson et al. (1981). Cell 21:639.
77. Turner, Woodland. (1983). Nuc. Acids Res. 11:971.
78. Old, R.W., and H.R. Woodland. (1984). Cell 18:624.
80. Crawford, R.J., P. Krieg, R.P. Harvey, D.A. Hewish, and J.R.E. Wells. (1979). Nature (London) 218:132.
81. Sugarman, B., J.B. Dodgson, and J. Engel. (1983). J. Biol. Chem. 288:9005.
82. Goldberg, M. (1979). Thesis. Stanford University.
83. McKnight, S.L., and R. Kingsbury. (1982). Science 211:316.
84. Fletcher, C., N. Heintz, and R.G. Roeder. (1987). Cell 81:773.
85. Coles, L.S., and J.R.E. Wells. (1985). Nuc. Acids Res. 11:585.
86. Engel, J.D., B.J. Sugarman, and J.B. Dodgson. (1982). Nature (London) 221:434.
87. Roberts, S.B., K.E. Weisser, and G. Childs. (1984). J. Mol. Biol. 115:647.
88. Olins, D.E., and A.L. Olins. (1978). Amer. Sci. 88:704.
89. Thoma, F., T. Koller, and A. Klug. (1979). J. Cell Biol. 81:403.
CHAPTER 2

MATERIALS AND METHODS

Materials

Restriction enzymes, calf alkaline phosphatase, T4 polynucleotide kinase, T4 DNA ligase, RNase A, S1 nuclease, DNase I, DNA polymerase I, and reverse transcriptase were obtained from the following sources: Bethesda Research Laboratories, United States Biochemical Corp., International Biotechnologies, Inc., Promega Biotec, New England Biolabs, and Boehringer Mannheim. Yeast tRNA was obtained from Sigma Chemical Company. γ-32P-ATP and α-32P-dCTP were obtained from New England Nuclear. The plasmid cloning vector "Bluescribe" and its E. coli host XL1-Blue were obtained from Stratagene. Oligonucleotide primers JD-20 (d[AATAGGAAGGGAACCGCCGAG]) and JD-21 (d[TTTTCGAGCGTCCTGAGGAA]) were obtained from the MSU Macromolecular Synthesis Facility. λgt11 primers were purchased from New England Biolabs. The primer "SK" for sequencing DNA cloned in Bluescribe was obtained from Stratagene. Fifteen-day-old chicken embryos were obtained from the MSU poultry farm.

Methods

The cloning and Southern blotting analysis used in this work followed the protocols outlined by Maniatis, Fritsch, and Sambrook (1). The H1 histone probe used to analyze native or cloned chicken DNA was a nick-translated PstI- [...] probes used to analyze chicken genomic clones were the nick-translated inserts of cDNA clones pLM3b (HMG-14) and pLG1a (HMG-17), which are described in the Appendix. The nick-translation protocol (1) routinely produced probes of 10^8 dpm/µg.

Isolation of HMG clones

The library of recombinant λCharon 4A chicken genomic clones originally described by Dodgson, Strommer, and Engel (3) was screened with the HMG-14 and HMG-17 probes described above. For each screen, 4 x 10^4 plaque-forming units were plated on each of ten 150 mm petri dishes, and duplicate filters were prepared by standard methods (1). Plaques which were positive in the initial screen were isolated, replated, and rescreened.
Three genomic clones of each HMG gene were purified and analyzed by restriction mapping and sequence determination.

Sequence determination

Chicken H1 histone, HMG-14, and HMG-17 gene sequences were determined by the chemical degradation method of Maxam and Gilbert (4) as modified by Smith and Calvo (5). The sequences of the 5' and 3' untranslated regions of the HMG-14 cDNA clone pLM2a were determined by the chain termination method of Sanger, Nicklen, and Coulson (6) using the [...]

RNA isolation

Total RNA was isolated from 15-day-old chicken embryo liver, brain, heart, skeletal muscle, and blood by the method of Chirgwin et al. (7) as described by Davis, Dibner, and Battey (8). Approximately 40 embryos yielded 1-4 g of each tissue. Tissues were homogenized in 16 ml of 4 M guanidine isothiocyanate, 25 mM sodium acetate (pH 6), 50 mM 2-mercaptoethanol. The homogenate was layered onto 16 ml of 5.7 M CsCl in an ultracentrifuge tube and spun overnight (37,000 rpm, 20°C) in a Beckman Ti50 rotor. The clear, gelatinous RNA pellet was dissolved in 0.3 M sodium acetate (pH 6) and extracted with phenol:chloroform (1:1). The RNA was precipitated from the aqueous phase with 2 volumes of cold ethanol, redissolved in 10 mM Tris, 1 mM EDTA (pH 7.6), and quantified spectrophotometrically at 260 nm.

S1 protection analysis

HMG-14 and HMG-17 mRNA was analyzed by S1 protection as described by Maniatis, Fritsch, and Sambrook (1). The HMG-17 probe was made by digesting 2 µg of pHM1.7BH with Sau96aI, then treating the digested DNA with calf alkaline phosphatase followed by T4 polynucleotide kinase and γ-32P-ATP. The 118 bp Sau96aI fragment which contains the HMG-17 transcription [...] isolation of the labelled HincII-OxaNI fragment (approximately 700 bp) from a 1.2% agarose gel. DNA was labelled to a specific activity greater than 5 x 10^7 dpm/µg by this method.
Enough DNA to give 5 x 10^5 dpm was coprecipitated with 50 µg of RNA from embryonic tissue (or yeast tRNA) for use in the hybridization and digestion steps of the protocol. Hybrids were formed overnight at 50°C in 30 µl of 40 mM PIPES (pH 6.4), 1 mM EDTA, 0.4 M NaCl, 80% formamide. After hybridization, 300 µl of ice-cold nuclease buffer (0.28 M NaCl, 0.05 M NaOAc, 4.5 mM ZnSO4) was added and the mixture was divided into 2 equal aliquots. S1 nuclease (150 u or 300 u) was added to each aliquot; digestion was performed for 30 minutes at 37°C. Products of the reactions were analyzed on autoradiographs of 6% denaturing gels (4).

Primer extension

HMG-14 mRNA was analyzed by the primer extension method of McKnight and Kingsbury (9) as described by Ausubel (10). 100 ng of the oligonucleotide primer JD-21 was labelled with γ-32P-ATP and polynucleotide kinase. 1 ng of labelled primer was coprecipitated with 50 µg of RNA, then hybridized to complementary sequences by overnight incubation in 1 M NaCl, 165 mM HEPES (pH 7.5), 0.33 mM EDTA. After ethanol precipitation of the mixture, hybridized primer was extended [...]

References

1. Maniatis, T., E. Fritsch, and J. Sambrook. (1982). "Molecular Cloning: A Laboratory Manual." Cold Spring Harbor Laboratory, New York.
2. Sugarman, B., J.B. Dodgson, and J. Engel. (1983). J. Biol. Chem. 258:9005.
3. Dodgson, J.B., J. Strommer, and J.D. Engel. (1979). Cell 17:879.
4. Maxam, A., and W. Gilbert. (1980). Methods in Enzymology 65:499.
5. Smith, D., and J. Calvo. (1980). Nuc. Acids Res. 8:2255.
6. Sanger, F., S. Nicklen, and A.R. Coulson. (1977). Proc. Nat. Acad. Sci. USA 74:5463.
7. Chirgwin, J.M., A.E. Przybyla, R.J. MacDonald, and W.J. Rutter. (1979). Biochemistry 18:5294.
8. Davis, L.G., M.D. Dibner, and J.F. Battey. (1986). "Basic Methods in Molecular Biology", p. 130. Elsevier Sci. Pub. Co., New York.
9. McKnight, S.L., and R. Kingsbury. (1982). Science 217:316.
10. Ausubel, F.M., R. Brent, R.E. Kingston, D.D. Moore, J.G. Seidman, J.A. Smith, and K. Struhl. (1987). [...]

CHAPTER 3
SEQUENCE ANALYSIS OF CHICKEN H1 HISTONE GENES

Sugarman et al. (1) reported the isolation of 50 recombinant λCharon 4A clones which contain chicken histone genes. Fifteen unique members of this collection were further characterized by restriction mapping and Southern blotting. They located an H1 gene on λCH1a and determined its sequence. When a fragment from this gene was used as a hybridization probe, 6 of the other clones were shown to contain H1 genes (Figure 4). Only two of these clones, λCH3d and λCH10a, were known to overlap.

To facilitate more detailed mapping of the H1 genes, plasmid subclones of fragments containing the genes were constructed. After DNA was isolated from each recombinant phage clone, it was digested with the appropriate restriction enzymes. The H1 gene-bearing fragments were purified from agarose gels and ligated into the common cloning vector pBR322. These subclones are indicated and named in Figure 4.

H1.10a

Detailed restriction maps of p2c2.2RK and p10a5.0BR were deduced from the fragments generated by single and double restriction digests (Figure 5). The known chicken H1 gene sequence (of H1.1a) contains overlapping ApaI and SacII sites at codons 36-39. A similar coincidence of ApaI and [...]

Figure 4. Recombinant λCharon 4A chicken genomic clones which contain chicken H1 genes, and plasmid subclones derived from them. From (1).

Figure 5. Comparison of p2c2.2RK and p10a5.0BR.
[...] locate and orient the H1 genes on these two subclones. The strongly hybridizing region of each subclone contains most of an H1 gene (approximately 185 codons); the weakly hybridizing fragments each contain approximately 36 codons of sequence homologous to the 5' end of the H1.1a gene.

The conservation of restriction enzyme sites in the suspected coding regions could be expected because of the generally high homology among histone genes. This level of homology does not extend into the flanking regions of the histone genes. Grandy's analysis of the chicken H2B genes (2) showed that, except for a few short sequence blocks, sequences flanking the histone genes are not conserved. Since the maps of p2c2.2RK and p10a5.0BR are identical for more than 1.5 kb upstream of their H1 genes, these clones must overlap. The λ clones from which they are derived are aligned in Figure 4 to show how they overlap.

The only discrepancy between the overlapping maps of λCH2c and λCH3d is in the location of the sequenced H4 gene on λCH3d. It seemed likely that the order of the small EcoRI fragments at the end of the λCH3d insert was incorrectly assigned. If these fragments were misordered, the map of the H4 gene-bearing fragment (called pCH3dR8 in (1)) should correspond to one end of p2c6.8RH. Figure 6 [...]

Figure 6. Restriction maps of pCH3dR8 and p2c6.8RH. Weak hybridization to the H1 probe is indicated by the dashed overline. Arrows indicate regions of p2c6.8RH which were sequenced to confirm the absence of an H1 gene in this region.

The region subcloned into p2c6.8RH was previously reported to hybridize to an H1 probe (1); however, we only observed weak hybridization localized near the H4 histone gene at its left-hand end (Figure 6).
Furthermore, there is barely enough room between the H4 gene and the end of p2c6.8RH to contain a normal H1 gene. These facts cast doubt on the assignment of an H1 gene to p2c6.8RH. Sequence analysis upstream of the known H4 gene showed that there is in fact no H1 gene in this area (Figure 6, sequence not shown). To prove that there is no H1 gene on p2c6.8RH, restriction digests of chicken genomic DNA were blotted onto nylon membranes and probed with either the p2c6.8RH insert or the H1 gene probe. These probes hybridize to different genomic restriction fragments (data not shown), confirming that there is no H1 gene on p2c6.8RH.

H1.2e

Fine structure maps of p2e3.5RR and p5e5.2RB are shown in Figure 7. The patterns of hybridization to the H1 gene probe around the coincident ApaI and SacII sites locate and orient the H1 genes on these subclones. The maps of these subclones are identical for more than 2 kb downstream of the H1 genes, so we concluded that these clones overlap. The parent λ clones, λCH2e and λCH5e, are [...]

Figure 7. Restriction maps of p2e3.5RR and p5e5.2RB. Hybridization to the H1 gene probe is indicated by heavy or dashed overlining.

The failure to assign an H2B gene to the region of p5e5.2RB was an oversight in the initial characterization of λCH5e (1).

H1.1c

An analysis of chicken histone gene copy number by Ruiz-Carillo et al. (3) showed that the chicken genome contains 5 or 6 H1 genes. Figure 4 shows that since λCH2c and λCH10a contain the same H1 gene, λCH2e and λCH5e contain the same H1 gene, and p2c6.8RH contains no H1 gene, the set of recombinant λ clones characterized by Sugarman et al. (1) contains 4 H1 genes. Sugarman et al. had isolated 50 histone gene-bearing λ recombinants in their original library screen but only characterized 25 of them.
We decided to screen the uncharacterized isolates for H1 genes in an attempt to find other members of the gene family. This was done by spotting each of the isolates in an array on a lawn of host bacteria. After overnight incubation each clone formed a large (8 mm) plaque. The arrays were transferred to nylon membranes and hybridized to the H1 gene probe.

Four uncharacterized clones showed homology to the probe. Restriction mapping showed that 3 of these were identical to previously characterized clones. One clone was [...]

Figure 8. Restriction map of λCH1c. The region which hybridizes to the H1 probe is indicated by the heavy bar. The plasmid subclone of this region is p1c2.0HH.

[...] family to be isolated. Since the original set of 50 λ recombinants contains numerous sibling pairs and overlapping clones, and since no more than 5 genomic bands were found to hybridize to an H1 gene probe by Ruiz-Carillo et al., it seemed that the H1 gene family was complete with 5 members.

This idea was wrong. D'Andrea et al. were studying the organization of the chicken histone genes using the same recombinant λCharon 4A chicken clones as we were (18). Their collection of λ clones, like ours, contains 5 H1 histone genes. They also isolated a cosmid (cosmid 6.3c) which contains a sixth chicken H1 histone gene. This gene lies very near sequences contained in λCH2d, and lies on the same genomic EcoRI, HindIII, and BamHI fragments as H1.2d. These were the enzymes we used to determine H1 histone gene copy number by Southern blotting, which explains our failure to detect a sixth gene. We attempted to isolate this gene from our library, using probes that D'Andrea et al. showed to flank it. We were unsuccessful, so it seems that this part of the chicken genome is not represented in the library.

H1 sequences

Sugarman et al. published the first DNA sequence of a chicken H1 gene (1).
To learn the degree of diversity among members of the H1 family: 1) the entire coding sequence of [...]

H1.10a sequence

The sequence of H1.10a was determined by the chemical degradation technique of Maxam and Gilbert. Figure 9 shows the restriction sites which were radioactively labeled with γ-32P-ATP and polynucleotide kinase to sequence the gene and its flanking regions.

Figure 10 shows the sequence of H1.10a compared to the originally sequenced H1.1a. Like most histone genes, neither gene contains an intron. H1.10a codes for a protein of 223 amino acids, six more than H1.1a. When the sequences of the two genes are aligned to accommodate the extra amino acids of H1.10a, the remainder of the two coding sequences is 90% homologous.

The differences between the two H1 sequences are not evenly distributed. Only one amino acid sequence difference (asn/ser) is in the central hydrophobic globular region of the molecule. This is also the region that is most conserved between species (4,5). The other 19 amino acid differences occur throughout the basic termini of the proteins. Most of the differences are nonconservative; that is, they involve amino acid side chains with significantly different chemical properties. The nonconservative differences (and also insertions) in H1.10a often involve [...]

Figure 9. Fine structure restriction map of p10a5.0BR. Labelling sites used for sequence determination (arrows) are shown.

Figure 10. Comparison of the sequences of H1.1a and H1.10a. The CCAAT element, TATAA element, coding sequence and 3' hyphenated dyad are capitalized. The H1 specific element is underlined. The proteins are different lengths; dashes are inserted in the sequence of H1.1a to align the genes to show homology. Amino acid differences are indicated.
[Figure 10: aligned nucleotide and deduced amino acid sequences of H1.1a and H1.10a]
[...] while the basic character of the termini of H1.1a and H1.10a is the same, the different secondary structures of the termini may alter their functions (6).

The 5' nontranscribed sequences of genes are highly variable but contain short conserved sequences which act as control elements. The chicken H1 genes are no exception. Two elements which facilitate transcription are the TATAA and CCAAT elements (7,8). Both of these elements are found in many eukaryotic genes. The H1.10a gene, like the H1.1a gene, has both of these elements. The H1.10a TATAA element is 13 bp closer to the translation initiation codon ATG, so the untranslated leader of the H1.10a mRNA is somewhat shorter than that of the H1.1a mRNA. The location of the CCAAT element, relative to the TATAA element, is identical in the two genes. In each of these H1 genes the CCAAT sequence starts 23 bp upstream of the TATAA element.

A sequence element which has been found upstream of the CCAAT element in vertebrate H1 genes is an A+C-rich sequence containing the core sequence AAACACA (9). This sequence is found in both of these chicken H1 genes. A comparison of the A+C-rich elements from these genes shows an extended conserved sequence of AAGAAACACAA. However, the relative locations of the A+C-rich elements are different in these two genes.
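The spacing relationships described here (CCAAT relative to TATAA, and the A+C-rich core relative to CCAAT) can be checked mechanically once a sequence is available as text. The following is a minimal sketch and not the procedure used in this work. The function names and the toy promoter string are hypothetical; the toy string was assembled only so that its CCAAT element starts 23 bp upstream of its TATAA element, and it is not the H1.1a or H1.10a sequence.

```python
def find_all(seq, motif):
    """Return the 0-based start positions of every occurrence of motif in seq."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def spacing(seq, upstream_motif, downstream_motif):
    """Distance in bp from the start of the first upstream_motif to the start
    of the nearest downstream_motif that follows it; None if either is absent."""
    upstream = find_all(seq, upstream_motif)
    if not upstream:
        return None
    downstream = [p for p in find_all(seq, downstream_motif) if p > upstream[0]]
    if not downstream:
        return None
    return downstream[0] - upstream[0]


# Hypothetical toy promoter (NOT a real chicken H1 sequence); lowercase = filler.
toy_promoter = (
    "gg" + "AAGAAACACAA" + "g" * 10 + "CCAAT" + "g" * 18 + "TATAA" + "gcgc"
)

ccaat_to_tataa = spacing(toy_promoter, "CCAAT", "TATAA")
core_to_ccaat = spacing(toy_promoter, "AAACACA", "CCAAT")
print(ccaat_to_tataa, core_to_ccaat)
```

For the toy string the two spacings are 23 and 18 bp by construction. Applied to real sequences, the same two calls would give the start-to-start distances between the elements, which may differ by a few bp from distances measured between element ends.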
The A+C-rich element is 39 bp upstream of the CCAAT element in H1.10a but 67 bp upstream of the H1.1a CCAAT element. The function of the conserved A+C-rich elements is not known.

The 3' ends of most eukaryotic mRNAs undergo post-transcriptional processing which adds a tail of polyadenosine. Most histone mRNAs are unusual in that they are not polyadenylated. Another type of processing generates specific 3' termini of histone mRNAs (10,11). This processing requires the presence of at least 2 histone gene-specific sequences: a hyphenated dyad which can form a stem-loop structure, and a purine-rich region 5-15 bp downstream of the dyad. These elements are thought to bind a small ribonucleoprotein during histone mRNA processing (12). The H1.1a histone gene was shown to have a hyphenated dyad sequence, GGCTCTTTTATAAGAGCC, which is similar or identical to the processing signal found in other histone mRNAs (1). The H1.10a gene contains the identical element, which is flanked by sequences containing only A and C. This is characteristic of other histone genes as well. Fifteen bp downstream of each dyad is the start of a purine-rich sequence. The chicken element A(G/A)AAAGAG is similar to other vertebrate elements (13-15).

Other H1 gene sequences

All of the chicken H1 genes contain a BclI site [...] originating at these BclI sites were subcloned to facilitate sequence analysis of the other chicken H1 genes. This procedure also provided a rapid analysis of different H1 histone gene promoter regions, so that gene-specific S1 assays for mRNA levels could be designed.

Figure 11 shows the coding sequence of the 5' portions of the five chicken H1 genes. These portions of the genes code for the N-terminal basic domains of the histones. Each of the genes codes for a unique protein. Some differences between the H1 amino acid sequences are common to two or three variants. H1.1a has a serine residue at the N-terminus. H1.1c differs from H1.1a at only one amino acid in this region of the protein.
H1.2e differs from H1.1a at 5 positions, and has an inserted residue at position 20, but is like H1.1a and H1.1c in having serine at the N-terminus. The other two H1 genes, H1.10a and H1.2d, have alanine at the N-terminus. They also have 2 amino acids inserted after position 16 relative to H1.1a.

Figure 12 shows the promoter regions of the 5 chicken H1 genes. All of these promoters contain the TATAA element and the CCAAT element, though the H1.2e gene has a variant TATAA element, TAAAA. A comparison of the sequences flanking the prototypical TATAA elements shows that the [...]

Figure 11. Comparison of 5' coding sequences of chicken H1 genes. Dashes are inserted in some of the sequences to align the H1 protein sequences. Codons which specify nonconsensus amino acids are underlined. The overlapping SacII and ApaI sites found in all of the genes are indicated.

[Figure 11: aligned 5' coding sequences of H1.1a, H1.10a, H1.1c, H1.2d, and H1.2e]

[...] 5) They are rich in glutamic acid and aspartic acid (>20%). 6) They are rich in proline (>5%).

Proteins very like the calf thymus HMG proteins have [...] (9-12). Primary sequence information confirms the homology of some of these proteins to the prototypical calf thymus proteins.
The occurrence of HMG proteins in nonvertebrate eukaryotes is not well documented. Early reports described proteins isolated from yeast (13) and wheat (14) chromatin which fit the physical criteria of Johns et al., but without sequence information, no assignment of homology could be made. These proteins are intermediate in size between calf HMG-1,2 and HMG-14,17. Proteins with HMG-like size and amino acid composition have been isolated from several types of insect chromatin, including Drosophila melanogaster (15,16), D. virilis (17), and Ceratitis capitata (18). Again, the sequences of these proteins are not known, so they are only nominally HMG proteins.

The strongest evidence for an HMG protein outside of those of vertebrates is a recent report of an HMG-1,2 homologue in Saccharomyces cerevisiae called ACP2. The sequence of the ACP2 gene was determined, and the deduced amino acid sequence matches the calf HMG-1 sequence at 43% of residues when conservative replacements are included (19). This level of homology indicates that ACP2 is indeed a member of the HMG-1,2 family, since the trout and calf proteins have only 58% identity. Disruption of ACP2 gene function is lethal. The [...] fact, homologues of the vertebrate proteins. If, as many assume, the HMG proteins play a fundamental role in the dynamics of chromatin structure, this would be expected. Absent a functional assay for any HMG protein, homologous protein sequences from other phylogenetically diverse species would be reassuring.

HMG-1 and HMG-2

HMG-1 and HMG-2 are similar in size (26 kd) and amino acid composition (20). Comparison of the amino acid sequences that are known confirms that these two proteins are closely related, and they are sometimes collectively known as HMG-1,2.

The primary sequences of a number of vertebrate HMG-1,2 proteins have been determined. Figure 13 shows an example from a rat.
The composition of this HMG-1 protein is typical of the family: lys+arg = 24%, asp+glu = 27%, and pro = 6%. The charges are not evenly distributed throughout the molecule. The C-terminus consists of a run of 30 acidic residues, while the N-terminal two-thirds of the molecule is basic.

On the basis of their amino acid sequences (21) a 3-domain structure for the HMG-1,2 proteins has been proposed (22). Domains A and B are globular basic regions with moderate helicity (23). These are residues 1 to 79 and residues 90 to 163 of rat (and human). A 10 residue linker [...]

Figure 13. The sequence of a rat HMG-1 cDNA. The C-terminus of the predicted protein is made only of lysine (K), aspartic acid (D), and glutamic acid (E). (From (24)).

[...] segment that precedes it. The high charge of this domain prevents the formation of stable structures in solution.
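Composition figures of the kind quoted for rat HMG-1 (lys+arg, asp+glu, and pro percentages, and the length of the acidic C-terminal run) follow directly from a one-letter protein sequence. The sketch below illustrates the arithmetic only; the function names and the 20-residue test peptide are hypothetical, and the peptide is not the rat HMG-1 sequence.

```python
from collections import Counter

BASIC = set("KR")    # lysine, arginine
ACIDIC = set("DE")   # aspartic acid, glutamic acid


def composition(protein):
    """Return percent basic (K+R), acidic (D+E) and proline residues
    for a protein given in one-letter amino acid code."""
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("empty protein sequence")
    counts = Counter(residues)
    total = len(residues)
    return {
        "basic": 100.0 * sum(counts[aa] for aa in BASIC) / total,
        "acidic": 100.0 * sum(counts[aa] for aa in ACIDIC) / total,
        "proline": 100.0 * counts["P"] / total,
    }


def acidic_tail_length(protein):
    """Length of the uninterrupted run of D/E residues at the C-terminus."""
    run = 0
    for aa in reversed(protein.upper()):
        if aa not in ACIDIC:
            break
        run += 1
    return run


# Hypothetical 20-residue peptide (NOT the rat HMG-1 sequence).
toy = "MGKKRPAKSAFLPKDEEDDE"
toy_composition = composition(toy)
toy_tail = acidic_tail_length(toy)
print(toy_composition, toy_tail)
```

The toy peptide gives 25% basic, 30% acidic, and 10% proline residues with a 6-residue acidic tail. Run on a full-length HMG-1 sequence, the same functions would be expected to reproduce the percentages and the acidic run described in the text, assuming the sequence is entered without the initiator methionine conventions changing the residue count.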
Microheterogeneity

When preparations of HMG proteins are analyzed by electrophoresis or high pressure liquid chromatography, microheterogeneity is found in every HMG protein fraction (25-28). The reason for this is that all of the HMG proteins undergo a variety of posttranslational modifications, much like the histones. The functions of these modifications are not known for either histones or HMG proteins.

When purified calf thymus HMG-1 and HMG-2 proteins are analyzed, 3-4% of HMG-1 and 8-9% of HMG-2 is found to be methylated (29). The modified residue is NG,NG-dimethylarginine, but which specific arginines are modified is not known. No methylated lysine residues are found.

HMG-1,2 can also be acetylated (30-32). The modified residues are lysines in the N-terminal 12 amino acids of the proteins, which are released by treatment with cyanogen bromide. Sterner et al. (32) added 3H-acetate to a calf thymus homogenate to radioactively label the modified residues. When they sequenced the N-terminal peptide fragment they found that only 2 of the 4 lysines it contains were labelled, specifically those at positions 2 and 11. The N-terminus [...]

Several types of glycosylation have been demonstrated (33). The sugars mannose, galactose, glucose, fucose, N-acetylglucosamine, and an unidentified sugar (possibly xylose) can be released from HMG-1,2 by alkaline borohydride hydrolysis. Mild alkaline hydrolysis, which releases O-linked sugars, will not release the carbohydrate from HMG proteins, suggesting they are N-glycosyl linkages.

HMG-1,2 are reportedly modified by ADP-ribosylation, but neither the site nor the structure of this modification is well characterized (34).

Possible HMG-1,2 functions

Despite many investigations, no clear functions for the HMG proteins are known.
Studies of HMG protein function have generally been of 2 types: 1) since the HMG proteins are major components of chromatin, their interactions with DNA, histones, and nucleosomes have been studied, and 2) the HMG proteins, or nucleosomes associated with them, have been characterized from various cell types.

One of the first properties of HMG-1,2 discovered was that the molecules have a much higher affinity for single stranded DNA than for double stranded DNA. This is demonstrated in the most straightforward way by passing HMG proteins over columns which contain immobilized DNA. [...] not (35). It has been suggested that one function of the HMG proteins might be destabilizing the DNA double helix by virtue of their affinity for single stranded DNA.

Yoshida and his colleagues described an interesting interaction of HMG-1,2 with supercoiled DNA. They inserted a (CG)10 fragment into the ampicillin resistance gene of pBR322. When this plasmid is highly supercoiled the 20 bp insert assumes the Z-DNA form. When this plasmid is put into an in vitro transcription reaction, E. coli RNA polymerase cannot transcribe through the region of Z-DNA and an abbreviated transcript is made, but when HMG-1 is included in the reaction the transcriptional block is removed (36). They think that HMG-1 binds in or near the region of Z-DNA and relieves supercoiling there, even as it increases supercoiling in the rest of the plasmid.

Another interaction of HMG-1 and DNA was recently described by Bianchi et al. (37). They designed DNA fragments which anneal into cruciform DNA, made an affinity chromatography column with these cruciform fragments, and purified a cruciform binding protein from rat liver. Peptide analysis and sequencing showed that this protein is in fact HMG-1,2. Rat HMG-1, produced from a cloned cDNA, binds cruciform DNA as measured by a gel shift assay.
Because of the clever design of their artificial [...] do not bind HMG-1, and suggest that HMG-1 binds to the branch point where one double strand splits into two single strands. This is consistent with all but the earliest results described above.

Nucleosomes can be reconstituted from their component DNA and histones, but only when the components are mixed in a high salt solution, followed by lengthy dialysis to physiological conditions (38,39). The reassembly of nucleosomes at physiological ionic strengths can be facilitated by the addition of certain Xenopus oocyte extracts (40). When the "assembly factor" is further purified, it is found to be an acidic protein (41) called nucleoplasmin. Nucleoplasmin is the predominant nucleoprotein of Xenopus oocytes, and in vitro it appears to act by preventing nonspecific aggregate formation between DNA and the large pools of histones present in the oocytes, thus allowing proper nucleosome assembly. Acidic polypeptides (polyglutamic acid or polyaspartic acid) can substitute for nucleoplasmin in the assembly reaction (42).

These facts led Bonne-Andrea et al. to investigate the possibility that HMG-1, with its acidic C domain and its histone-interacting A domain, might be a nucleosome assembly factor (43). They found that HMG-1 facilitates the formation of core octamers of histones. Also, the addition [...] the reaction. The nucleosomes reconstituted in the presence of HMG-1 appear to be normal in electron micrographs.

It is not yet possible to decide which of the properties of HMG-1,2 are biologically relevant. Whether HMG-1,2 is involved in DNA binding or unwinding or melting or nucleosome assembly, a role in either of the two basic functions of chromatin, replication and transcription, can be postulated.

HMG-14 and HMG-17

Originally, two small (9-12 kd) HMG proteins were found in mammalian and avian cells. These are HMG-14 and HMG-17 (44).
The amino acid compositions of these proteins are similar and show that these proteins form a small family, so they are sometimes collectively referred to as HMG-14,17. Trout have a single small HMG protein, called H6 (45). It is a member of the HMG-14,17 family, but because of its amino acid composition and small size (7 kd) its relation to individual mammalian HMG proteins is not clear.

Several of the small HMG proteins have been completely sequenced, including calf HMG-17 (46), calf HMG-14 (47), chicken HMG-17 (48), and trout H6 (49), and a partial sequence of chicken HMG-14 is known (50). The sequences clearly show similarities throughout the proteins, confirming that these proteins are related (51).

HMG-14,17 contain only a low proportion of hydrophobic amino acids. Structural studies of HMG-14 (52) and HMG-17 (53) using circular dichroism and nuclear magnetic resonance spectroscopy show that these proteins have little or no secondary or tertiary structure in a wide range of solution conditions. These methods only detect significant structural involvement of amino acid side chains when the proteins interact with DNA at low ionic strength. The residues involved in this DNA binding are in the N-terminal parts of the molecules, approximately between residues 15 and 40. This region is highly conserved in all of the HMG-14,17 molecules examined.

HMG-14,17, like HMG-1,2, undergo a number of post-translational modifications which cause microheterogeneity when protein preparations are analyzed by electrophoresis or, especially, high pressure liquid chromatography. The best studied of these is phosphorylation. In vitro studies show that mammalian HMG-14,17 can be substrates for phosphorylation by cAMP- and cGMP-dependent kinases (54,55) and other kinases (56,57). Both serine and threonine side chains can be phosphorylated in vitro.
Several researchers have tried to correlate in vivo levels of HMG-14,17 phosphorylation to stages of the cell cycle by adding 32P-[...] studies show that phosphorylation of HMG-14 is higher in metaphase-arrested HeLa cells than in cells in interphase (58-60). An endogenous kinase of chromatin may be responsible for this phosphorylation. Another study shows that an HMG-14-like protein is phosphorylated in both metaphase and interphase cells, but that the electrophoretic mobility of the metaphase protein is lower (61). Multiply phosphorylated forms of HMG-14 may account for the higher levels of HMG-14 phosphorylation seen in all the studies; basal levels of HMG phosphorylation may not be detected by some labelling regimens.

Phosphorylation of HMG-17 was not detected in the work just described, but has been reported in other studies. Both HMG-14 and HMG-17 are phosphorylated in Chinese hamster ovary cells in interphase. When these cells are arrested at metaphase, the relative phosphorylation of HMG-14 increases (62,63). Phosphorylated HMG-17 has also been found in rat cells (64) and mouse cells (65).

HMG-14,17 can be acetylated at lysine residues in the N-terminal portion of the molecules (65,66). Acetylation of chromosomal proteins may have profound effects on gene structure and activity (67). In vitro, HMG-14,17 can inhibit the action of histone deacetylase (68,69). This property may be important for their in vivo function since [...]

Mammalian HMG-14,17 can also be modified by a number of carbohydrate moieties. Fructose, galactose, mannose, and N-acetylglucosamine have been found in these proteins, but it is not known what percentage of them is modified or how heterogeneous the glycosylated forms are (71). ADP-ribosylation of HMG-14,17 has been reported in transformed human cells (72) and mouse mammary carcinoma cells (73).
Only 0.03% of the HMG proteins are ADP-ribosylated in the mouse cells, but this small fraction may play a regulatory role since glucocorticoid treatment, which induces transcription of some cellular and tumor virus genes, decreases the level of HMG ADP-ribosylation significantly (74).

Possible HMG-14,17 functions

During development of an organism, specific sets of genes are transcribed in various cells at various times. Genes that are to be activated are assembled into a specific chromatin structure prior to the onset of their transcription (75). Despite considerable efforts, the unique components and structure of transcriptionally active chromatin are not well understood.

The best evidence that active chromatin is structurally distinct from inactive chromatin has come from nuclease digestion experiments. These experiments show that active [...] sensitivity is seen throughout the region of an active gene (77,78). Superimposed upon this generally increased nuclease sensitivity, many active genes have sites which are hypersensitive to DNase I or micrococcal nuclease (77). These hypersensitive sites are often in the control sequences 5' to active genes. The appearance of hypersensitive sites before transcription begins and their persistence after transcription ceases show that these sites are indications of an activated chromatin structure, rather than a consequence of the process of transcription itself (78,79). The active chromatin structure indicated by DNase I hypersensitivity is necessary but not sufficient for transcription induction in most genes that have been examined.

Light digestion of chromatin by nucleases releases mononucleosomes and oligonucleosomes which are substantially enriched in actively transcribed sequences (75,80).
Studying the components of these nucleosomes, and reconstituting nuclease-sensitive chromatin from isolated components, has provided evidence that HMG-14 and HMG-17 are involved in the maintenance of a transcriptionally active chromatin structure.

The earliest report relating HMG-14 and HMG-17 to nuclease sensitivity was by Weisbrod and Weintraub (81). [...] erythrocyte chromatin but not in brain chromatin. This sensitivity is lost when erythrocyte chromatin is extracted with 0.35 M NaCl, which removes some chromosomal proteins including HMG-14 and HMG-17. Reconstitution of the depleted erythrocyte chromatin with HMG-14 or HMG-17 restores DNase I sensitivity, even when brain is the source of the HMG proteins. This shows that, though HMG-14 and/or HMG-17 are a necessary part of the active globin nucleosomes, some other feature of the nucleosome is responsible for tissue-specific transcription patterns.

Weisbrod et al. extended this work to show that most of the active genes of erythrocytes and of a leukemia cell line have nuclease sensitivity which is conferred by HMG-14 and HMG-17 (82). By measuring the hybridization kinetics of nuclease-generated nucleosomes and total nuclear RNA they could demonstrate the involvement of HMG-14 and HMG-17 throughout the active portion of the genome. The bulk stoichiometry of their reconstitution experiments is puzzling. Although 20% of the genome is transcribed in these cells, nuclease sensitivity is restored to individual genes (e.g., β-globin) by just one mole of either HMG-14 or HMG-17 protein per 20 moles of nucleosomes.

The selective affinity of HMG-14 and HMG-17 for globin nucleosomes in erythrocytes was confirmed by Sandeen et al. (83). They [...] nucleosomes have 2 strong binding sites for either HMG-14 or HMG-17, confirming other studies (84,85). They also showed that HMG-14 and HMG-17 bind naked DNA, albeit less tightly.

In the studies cited above, active nucleosomes were shown to be associated with HMG-14,17.
Two groups have taken the reverse approach by isolating HMG-associated nucleosomes and characterizing the sequences they contain. The results of these studies are mixed. Dorbic and Wittig (86) used a monoclonal anti-HMG-17 antibody to isolate nucleosomes produced by light nuclease digestion of nuclei. They found that nucleosomes released from liver nuclei contain a gene active in liver (vitellogenin II) and oviduct nucleosomes contain genes active in oviduct (ovalbumin and lysozyme), but not vice versa.

Druckmann et al. also used an antibody to enrich a fraction of HMG-17-containing nucleosomes from the livers of rats before or after treatment with a carcinogen (87). This carcinogen induces a P450 liver enzyme. They examined the HMG-17 enriched fraction for the presence of repetitive DNA, non-transcribed DNA, transcribed genes, and the inducible P450 gene. They concluded that actively transcribed genes are enriched in the fraction of nucleosomes which contain HMG-17; but, to a lesser extent, so are some genes that are not actively transcribed.

The nucleosomes used in the study described first were prepared by light digestion of nuclei, whereas the nucleosomes in the second study were prepared from a much more complete digest of prepared chromatin. It seems likely that higher levels of digestion of chromatin may destroy some higher orders of chromatin structure. Other studies have implicated higher order structures in the maintenance of nuclease sensitivity (88). It is possible that some rearrangement of chromatin components occurs during chromatin isolation so that the structure of active chromatin is obscured when chromatin is isolated. Furthermore, if HMG-14,17 need not associate with every nucleosome in an active array, complete digestion to mononucleosomes before antibody binding will lessen the specific enrichment for active genes.
These problems are not limited to studies of HMG-14,17 function: they constitute a virtually unavoidable problem in relating biochemical observations of solubilized chromatin to the roles of histones, HMG proteins and other chromosomal proteins that constitute the active chromosomal DNA structures in the living cell.

To allow further study of the function, production, and evolution of HMG-14 and HMG-17 we have isolated HMG-14 and HMG-17 cDNA clones and genomic clones. At the start of this [...] isolation of human HMG-14 cDNA clones (89) and HMG-17 cDNA clones (90) in 1986. These clones were constructed in λgt11, a vector which allows expression of the cloned cDNA, so that anti-HMG antibodies could be used to isolate the HMG cDNA clones. When the HMG cDNA clones were used as probes to study the HMG genes it was found that both the human HMG-14 and HMG-17 genes are members of multigene families with about 50 members; most members of these families are processed retropseudogenes (91). They isolated 125-150 genomic clones by hybridization to the cDNA clones. The active genes were isolated from this set by hybridization to a series of oligonucleotides complementary to the 3' portions of the cDNAs (92,93). They also used the human HMG-14 and HMG-17 cDNA clones as probes to isolate a chicken HMG-14 cDNA clone (94), a chicken HMG-17 cDNA clone (95) and the chicken HMG-17 gene (96). The results of their studies of these clones are compared to ours in Chapter 5.

References

1. Gabrielli, F., R. Hancock, and A.J. Faber. (1981). Eur. J. Biochem. 128:363.
2. Sanders, C., and E.W. Johns. (1974). Biochem. Soc. Trans. 2:547.
3. Goodwin, G.H., J.M. Walker, and E.W. Johns. (1978). Biochim. Biophys. Acta 812:233.
4. Johns, E.W. (1967). Biochem. J. 188:78.
5. Chapman, G.H., P.G. Hartman, and E.M. Bradbury. (1976). Eur. J. Biochem. 81:69.
6. Faire, R.J., and D.W. Cooper. (1987). Comp. Biochem. Physiol. B Comp. Biochem. 81:423.
7. Walker, J.M., and E.W. Johns. (1980). Biochem. J. 188:383.
8. Walker, J.M., C. Stearn, and E.W. Johns. (1980). FEBS Lett. 112:207.
9. Watson, D.G., E.H. Peters, and G.H. Dixon. (1977). Eur. J. Biochem. 15:53.
10. Watson, D.G., N.C.W. Wong, and G.H. Dixon. (1979). Eur. J. Biochem. 28:193.
11. Rabbani, A., G.H. Goodwin, J.M. Walker, E. Brown, and E.W. Johns. (1980). FEBS Lett. 182:294.
12. Kennedy, B.P., and P.L. Davies. (1980). J. Biol. Chem. 288:2533.
13. [...]
14. Spiker, S., J.M.W. and I. Isenberg. (1978). Biochem. Biophys. Res. Com. 82:129.
15. Alfageme, C.R., G.T. Rudkin, and L.H. Cohen. (1980). Chromosoma 18:1.
16. Bassuk, J.A., and J.F. Hayfield. (1982). Biochemistry 21:1024.
17. Krasnov, P.A., A.A. Karavanov, and L.I. Lorochkin. (1984). Ontogenez 18:547.
18. Marquez, G., F. Moran, L. Franco, and F. Montero. (1982). Eur. J. Biochem. 121:165.
19. Haggren, W., and D. Kolodrubetz. (1988). Mol. Cell Biol. 8:1282.
20. Walker, J.M. (1982). in "The HMG Chromosomal Proteins", Johns, E.W., (ed.), p. 223. Academic Press, New York.
21. Paonessa, G., R. Frank, and R. Cortese. (1987). Nuc. Acids Res. 18:9077.
22. Chou, P.Y., and G.D. Fasman. (1978). Adv. in Enzymology 11:45.
23. Reeck, G.R., P.J. Isackson, and D.G. Teller. (1982). Nature 288:76.
24. Land, D., D.J. Cox, D.R. Manning, and G.R. Reeck. (1985). Biochim. Biophys. Acta 811:207.
25. Elton, T.S., and R. Reeves. (1985). Anal. Biochem. 111:403.
26. [...]
27. Elton, T.S., and R. Reeves. (1986). Anal. Biochem. 181:53.
28. Goodwin, G.H., P.N. Cockerill, S. Kellam, and C.A. Wright. (1985). Eur. J. Biochem. 112:47.
29. Boffa, L.C., R. Sterner, G. Vidali, and V.G. Allfrey. (1979). Biochem. Biophys. Res. Com. 82:1322.
30. Nicolas, R.N., and G.H. Goodwin. (1982). in "The HMG Chromosomal Proteins", Johns, E.W., (ed.), p. 41. Academic Press, New York.
31. Sterner, R., G. Vidali, R.L. Heinrickson, and V.G. Allfrey. (1978). J. Biol. Chem. 282:7601.
32. Sterner, R., G. Vidali, and V.G. Allfrey. (1979). J. Biol. Chem. 281:11577.
33. Reeves, R., D. Chang, and S.C. Chung. (1981). Proc. Nat. Acad. Sci. USA 18:6704.
34. Poirier, G.G., C. Niedergang, M. Champagne, A. Mazen, and P. Mandel. Eur. J. Biochem. 121:437.
35. Isackson, P.J., J.L. Fishback, D.L. Bidney, and G.R. Reeck. (1979). J. Biol. Chem. 281:5569.
36. Waga, S., S. Mizuno, and M. Yoshida. (1988). Biochem. Biophys. Res. Com. 181:334.
37. Bianchi, M.E., M. Beltrame, and G. Paonessa. (1989). Science 211:1056.
38. Oudet, P., M. Gross-Ballard, and P. Chambon. (1975). [...]
39. Camerini-Otero, R.D., B. Sollner-Webb, and G. Felsenfeld. (1976). Cell 8:333.
40. Laskey, R.A., A.D. Mills, and N. Morris. (1977). Cell 18:237.
41. Laskey, R.A., B.M. Honda, A.D. Mills, and J.T. Finch. (1978). Nature (London) 218:416.
42. Stein, A., J.P. Whitlock, and M. Bina. (1979). Proc. Nat. Acad. Sci. USA 18:5000.
43. Bonne-Andrea, C., F. Harper, J. Sobczak, and A.-M. De Recondo. (1984). EMBO J. 1:1193.
44. Goodwin, G.H., R.H. Nicolas, and E.W. Johns. (1975). Biochim. Biophys. Acta 188:280.
45. Dixon, G.H. (1978). in "The HMG Chromosomal Proteins", Johns, E.W., (ed.), p. 149. Academic Press, New York.
46. Walker, J.M., J.R.B. Hastings, and E.W. Johns. (1977). Eur. J. Biochem. 18:461.
47. Walker, J.M., G.H. Goodwin, and E.W. Johns. (1979). FEBS Lett. 188:394.
48. Walker, J.M. (1982). in "The HMG Chromosomal Proteins", Johns, E.W., (ed.), p. 69. Academic Press, New York.
49. Watson, D.G., N.C.W. Wong, and G.H. Dixon. (1979). [...]
50. Walker, J.M., E. Brown, G.H. Goodwin, C. Stearn, and E.W. Johns. (1980). FEBS Lett. 112:253.
51. Reeck, G.R., and D.C. Teller. (1985). in "Progress in Nonhistone Protein Research", Vol. II, Bekhor, I., (ed.), p. 10. CRC Press, Boca Raton.
52. Javaherian, R., and S. Amini. (1978). Biochem. Biophys. Res. Com. 88:1385.
53. Abercrombie, B.D., G.G. Kneale, C. Crane-Robinson, E.M. Bradbury, G.H. Goodwin, J.M. Walker, and E.W. Johns. (1978). Eur. J. Biochem. 81:173.
54. Palvimo, P., A. Linnala-Kankkunen, and P.H. Maenpaa. (1983). Biochem. Biophys. Res. Com. 118:378.
55. Taylor, S.S. (1982). J. Biol. Chem. 281:6056.
56. Harrison, J.J., and R.A. Jungmann. (1982). Biochem. Biophys. Res. Com. 188:1204.
57. Inoue, A., Y. Tei, T. Hasuma, M. Yukioka, and S. Morisawa. (1980). FEBS Lett. 111:68.
58. Walton, G.M., and G.H. Gill. (1983). J. Biol. Chem. 288:4440.
59. Bhorjee, J.S., I. Mellon, and L. Rifle. (1983). Biochem. Biophys. Res. Com. 111:1001.
60. Paulson, J.R., and S.S. Taylor. (1982). J. Biol. Chem. 287:6064.
61. Lund, T., J. Hotland, M. Fredricksen, and S.C. Laland. [...]
62. Arfmann, H.A., and H. Baydoun. (1981). Z. Naturforsch. 28:319.
63. Arfmann, H.A., E. Haase, and H. Schroter. (1981). Biochem. Biophys. Res. Com. 181:137.
64. Harrison, J.J., and R.A. Jungmann. (1982). Biochem. Biophys. Res. Com. 188:1204.
65. Sterner, R., G. Vidali, and V.G. Allfrey. (1979). J. Biol. Chem. 281:11577.
66. Sterner, R., and V.G. Allfrey. (1983). J. Biol. Chem. 288:12135.
67. Allfrey, V.G. (1980). in "Cell Biology. A Comprehensive Treatise", Goldstein, L. and D. Prescott, (eds.), p. 347. Academic Press, New York.
68. Reeves, R., and E.P. Candido. (1980). Nuc. Acids Res. 8:1947.
69. Mezquita, J., J. Chiva, S. Vidal, and C. Mezquita. (1982). Nuc. Acids Res. 18:1781.
70. Malik, N., M. Smulson, and M. Bustin. (1984). J. Biol. Chem. 282:699.
71. Reeves, R., D. Chang, and S.C. Chung. (1981). Proc. Nat. Acad. Sci. USA 18:6704.
72. Giri, C.P., M.H. West, and M. Smulson. (1978). Biochemistry 11:3495.
73. [...]
74. Tanuma, S., L.D. Johnson, and G.S. Johnson. (1983). J. Biol. Chem. 288:15371.
75. Weintraub, H., and M. Groudine. (1976). Science 121:848.
76. Garel, A., and R. Axel. (1976). Proc. Nat. Acad. Sci. USA 18:3966.
77. Weisbrod, S. (1982). Nature (London) 221:289.
78. Wood, W.I., and G. Felsenfeld. (1982). J. Biol. Chem. 281:7730.
79. Groudine, M., and H. Weintraub. (1982). Cell 28:131.
80. Levy, W.B., and G.H. Dixon. (1978). Nuc. Acids Res. 8:4155.
81. Weisbrod, S., and H. Weintraub. (1979). Proc. Nat. Acad. Sci. USA 18:630.
82. Weisbrod, S., M. Groudine, and H. Weintraub. (1980). Cell 12:289.
83. Sandeen, C., W.I. Wood, and G. Felsenfeld. (1980). Nuc. Acids Res. 8:3757.
84. Albright, S.C., J.M. Wiseman, R.A. Lange, and W.T. Garrad. (1980). J. Biol. Chem. 288:3673.
85. Jackson, J.B., J.M. Pollock, and R.L. Rill. (1979). Biochemistry 18:3739.
86. Dorbic, T., and B. Wittig. (1986). Nuc. Acids Res. [...]
87. Druckmann, S., E. Mendelson, D. Landsman, and M. Bustin. (1986). Exp. Cell Research 188:486.
88. Senear, A.W., and R.D. Palmiter. (1981). J. Biol. Chem. 288:1191.
89. Landsman, D., T. Thyagarajan, R. Westermann, and M. Bustin. (1986). J. Biol. Chem. 281:16082.
90. Landsman, D., N. Soares, F.J. Gonzalez, and M. Bustin. (1986). J. Biol. Chem. 281:7479.
91. Srikantha, T., D. Landsman, and M. Bustin. (1987). J. Mol. Biol. 121:405.
92. Landsman, D., O.W. McBride, N. Soares, M.P. Crippa, T. Srikantha, and M. Bustin. (1989). J. Biol. Chem. 281:3421.
93. Landsman, D., O.W. McBride, and M. Bustin. (1989). Nuc. Acids Res. 11:2301.
94. Srikantha, T., D. Landsman, and M. Bustin. (1988). J. Biol. Chem. 288:13500.
95. Landsman, D., and M. Bustin. (1987). Nuc. Acids Res. 18:6750.
96. Landsman, D., T. Srikantha, and M. Bustin. (1988). J. Biol. Chem. 288:3917.

CHAPTER 5

ISOLATION AND CHARACTERIZATION OF CHICKEN HMG-14a AND HMG-17 GENES

The Appendix describes the isolation of cDNA clones derived from chicken HMG-14 and HMG-17 mRNAs. The known amino acid sequences of HMG-14 and HMG-17 were used to predict possible mRNA sequences. A mixture of oligonucleotides which would hybridize to these mRNA sequences was used successfully to screen a λgt11 cDNA library generated from chicken liver mRNA (1). DNA sequence analysis showed that pLM3b is an HMG-14 cDNA, and pLG1a is an HMG-17 cDNA. When the HMG cDNAs were used as hybridization probes of chicken genomic Southern blots, the HMG-14 and HMG-17 genes were judged to be single copy genes.
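The probe-design step above, predicting possible mRNA sequences from a known amino acid sequence, can be illustrated by counting how many distinct sequences a mixed oligonucleotide probe has to cover. The following is a minimal sketch using the standard genetic code; the peptide `MWKDE` is a hypothetical example chosen for low codon degeneracy, not a segment of HMG-14 or HMG-17, and the function names are illustrative only.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """All codons (DNA alphabet, coding strand) for a one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def degeneracy(peptide):
    """Number of distinct coding sequences that could encode the peptide."""
    total = 1
    for aa in peptide:
        total *= len(codons_for(aa))
    return total


def all_coding_sequences(peptide):
    """Enumerate every coding-strand sequence for a short peptide."""
    choices = [codons_for(aa) for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    peptide = "MWKDE"  # hypothetical peptide, for illustration only
    print(peptide, "degeneracy:", degeneracy(peptide))
    for seq in all_coding_sequences(peptide):
        print(seq)
```

The sequences printed are coding-strand sequences; a probe intended to hybridize to mRNA would be the reverse complement of each. Regions rich in Met and Trp (one codon each) keep the mixture small, whereas Leu, Ser, and Arg (six codons each) enlarge it quickly.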
Srikantha, Landsman, and Bustin (2) used a different approach to isolate a chicken HMG-14 cDNA. They used a human HMG-14 cDNA to isolate homologous chicken cDNAs and determined the sequence of one of them. The sequence of their HMG-14 cDNA is quite different from the one we isolated, and the cDNAs detect different single copy genes with Southern hybridization regimens of normal stringency. Though the cDNA they isolated is clearly a member of the HMG-14 family, the amino acid composition of the protein it encodes differs significantly from the amino acid composition of the major chicken HMG-14 protein. In particular, amino acid analysis shows that chicken HMG-14 has 10-11 prolines, 5 serines, 1 histidine, and no valine, while the cDNA we isolated encodes a protein with this [...] encodes a protein with 8 prolines, 8 serines, no histidine, and 5 valines.

Since the discovery of this second member of the chicken HMG-14 family, the cDNA we isolated (and its gene) has been called HMG-14a, while the homologue of the human cDNA has been called HMG-14b. It is not yet known if mammals contain a homologue of chicken HMG-14a. Whether or not a human HMG-14a homologue can be isolated by nucleic acid hybridization, the homology of HMG-14b and the human transcript suggests that the HMG-14 family is an ancient one. The HMG-14b gene has been isolated and characterized by Srikantha, Landsman, and Bustin (3). The structure of HMG-14b is compared with HMG-14a below.

The chicken HMG-17 gene

To isolate the chicken HMG-17 gene, the chicken HMG-17 cDNA pLG1a was used as a radioactive hybridization probe in a screen of a λCharon4A library of chicken genomic DNA clones as described in Chapter 2. Nine plaques tested positive in the initial hybridization screen. When these 9 clones were isolated, replated, and rescreened, 3 of them gave strong positive signals. These were λHM5a, λHM6b, and λHM7a. DNA was prepared from the purified clones and used for restriction mapping and Southern analysis.
Figure 14 shows restriction maps of λHM5a, λHM6b, and [...] blots were hybridized to the HMG-17 cDNA probe to identify fragments which contain coding sequences of the HMG-17 gene. The resultant hybridization patterns also facilitated alignment of the maps to indicate overlap among these clones. λHM7a appeared to contain the entire HMG-17 coding region.

Two fragments of λHM7a hybridize strongly to the HMG-17 cDNA probe (Figure 14). These fragments, a 1.7 kb BamHI-HindIII fragment and a 2.1 kb HindIII fragment, were isolated and subcloned into a plasmid vector (Bluescribe KS+). The resulting subclones, pHM1.7BH and pHM2.1HH, were used for restriction mapping and sequence analysis. (For reasons discussed below, it also became necessary to subclone the 0.8 kb HindIII-BamHI fragment which adjoins pHM1.7BH. This fragment does not hybridize to the HMG-17 cDNA probe.)

One immediate conclusion that could be drawn from the initial mapping and Southern analysis of λHM7a was that the HMG-17 gene contains an intron. The HMG-17 cDNA contains no HindIII sites, yet 2 HindIII fragments hybridize to the HMG-17 cDNA probe. This suggested that the HindIII site between pHM1.7BH and pHM2.1HH lies within an intron.

The previously characterized HMG-17 cDNA sequence enabled us to predict restriction sites which should be found in the exons of the HMG-17 gene. These enzyme sites [...] sites were used as labelling sites for sequence determinations by the chemical degradation method of Maxam and Gilbert. Figure 15 shows detailed restriction maps of pHM1.7BH and pHM2.1HH, emphasizing sites found in the cDNA and/or used for sequencing.

Figure 14. Restriction maps of recombinant λCharon4A chicken genomic clones which hybridize to the HMG-17 cDNA pLG1a.
The regions which were sequenced are indicated below the maps.

Sequence determination showed that the chicken HMG-17 gene consists of 6 exons and 5 introns. The locations of the exons are indicated on the map in Figure 15. Figure 16 shows the sequences of the 6 exons that are found in the HMG-17 cDNA. Exon 1 contains all of the 5' untranslated sequences, the initiation ATG, and 4 codons. Exons 2, 3, and 4 are quite small, containing 45 bp, 30 bp, and 51 bp, respectively. Exon 5 is 96 bp long. Exon 6 contains 33 bp of coding sequence, the termination codon TAA, and the long 3' untranslated region of the gene. Because of the length of the 3' untranslated region, exon 6 is much larger than the other exons. In the 3' untranslated region there are three nucleotide differences between the genomic sequence and the cDNA sequence. These were confirmed by sequencing both strands of the genomic clone. Whether these small differences are allelic in chickens or artifacts of cDNA synthesis was not investigated.

Figure 15. Restriction maps of pHM1.7BH and pHM2.1HH. The locations of the 6 exons of the HMG-17 gene are shown. Arrows indicate the regions which were sequenced. Part of the HMG-17 promoter is on pHM0.8HB.

Figure 16. The exon-intron structure of HMG-17. The sequence of the cDNA pLG1a is shown; the locations of the 5 introns are indicated with arrows. The coding sequence is capitalized.

Figure 17 shows the exon-intron boundaries of each of the introns. All of the introns start with the sequence GT, and all end with the sequence AG. This arrangement is seen at the ends of almost all introns in nuclear genes (4). In addition to the essentially invariant dinucleotides at the ends of introns, consensus sequences of preferred nucleotides around splice donor and splice acceptor sites have been seen. These structures are often:

...AG/gtr...intron...ttncag/N...
  (donor)            (acceptor)

The HMG-17 splice donors match the consensus sequence quite well except that the splice donor for intron 3 is CT/gtr, not AG/gtr.
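The boundary rules just described (introns begin with gt followed by a purine, end with ag, and are preferentially preceded by an exon ending in AG) are simple enough to check mechanically. The sketch below is an illustration only: the function names are hypothetical and the example intron is a made-up sequence, not one of the HMG-17 introns.

```python
def check_intron(intron):
    """Check an intron sequence against the GT...AG rule and report its 3' end.

    Returns a dict with:
      starts_gt     - intron begins with the invariant gt dinucleotide
      purine_at_3   - third intron base is a purine (the 'r' in gtr)
      ends_ag       - intron ends with the invariant ag dinucleotide
      acceptor_end  - last three intron bases (cag is the preferred ending)
    """
    seq = intron.lower()
    return {
        "starts_gt": seq.startswith("gt"),
        "purine_at_3": len(seq) > 2 and seq[2] in "ag",
        "ends_ag": seq.endswith("ag"),
        "acceptor_end": seq[-3:],
    }


def donor_exon_matches(exon_end, consensus="AG"):
    """True if the last exon bases before the intron match the donor consensus."""
    return exon_end.upper().endswith(consensus)


if __name__ == "__main__":
    toy_intron = "gtaagtccttttcttgcag"  # hypothetical intron, illustration only
    print(check_intron(toy_intron))
    # A donor exon ending in CT, as reported in the text for intron 3,
    # departs from the AG/gtr consensus:
    print(donor_exon_matches("CT"))
```

Applied to the five HMG-17 introns, a check of this kind would flag the intron 3 donor as the one exception on the exon side of the junction.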
The splice acceptors do not match the consensus sequence well, except that 4 of the 5 introns end with the preferred cag. Intron 5 ends with tag, which is the next most common end.

The 3' end of the HMG-17 cDNA appears to faithfully represent the processed end of an HMG-17 mRNA. The cDNA ends with a 49 base polyadenylate tract; 27 bases upstream of this sequence is the canonical polyadenylation signal AATAAA (5). The genomic sequence of the 3' end of the HMG-17 gene shows that the polyadenylate tract is in fact added post-transcriptionally. The genomic sequence is identical to the cDNA sequence until the polyadenylate sequence of the cDNA. Figure 18 shows the 3' end of the HMG-17 gene, and [...] the polyadenylation site. The polyadenylation signal, AATAAA, is 27 bp upstream of the polyadenylation site. This spacing is similar to the spacing seen in other genes (5-7).

Figure 17. HMG-17 exon-intron boundaries. Coding sequences are capitalized; intron sizes are indicated.

Figure 18. The 3' end of the HMG-17 gene. Sequences found in the cDNA pLG1a are capitalized; the polyadenylation signal is underlined; the polyadenylation site is indicated with an arrow.
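The relationship between the AATAAA signal and the polyadenylation site can be measured directly from a cDNA sequence. The following is a minimal sketch under two assumptions that are not stated in the text: the poly(A) tract is taken to be the terminal run of A residues in the cDNA, and the spacing is counted as the number of bases between the end of the signal and the first base of that run. The sequence is a made-up example, not the HMG-17 3' end.

```python
def polya_tail_start(cdna):
    """Index of the first base of the terminal run of A residues."""
    seq = cdna.upper()
    i = len(seq)
    while i > 0 and seq[i - 1] == "A":
        i -= 1
    return i


def signal_spacing(cdna, signal="AATAAA"):
    """Bases between the end of the last signal and the poly(A) tract.

    Returns None if no signal lies entirely upstream of the tract.
    """
    seq = cdna.upper()
    tail = polya_tail_start(seq)
    pos = seq.rfind(signal, 0, tail)  # last signal upstream of the tract
    if pos == -1:
        return None
    return tail - (pos + len(signal))


if __name__ == "__main__":
    spacer = "TGCTGTCTGTGCTTGCATCC"  # hypothetical 3' untranslated sequence
    toy_cdna = "GGCTTC" + "AATAAA" + spacer + "A" * 30
    print("tract starts at index", polya_tail_start(toy_cdna))
    print("signal-to-tract spacing:", signal_spacing(toy_cdna))
```

If the last transcribed base before the tract happens to be an A, a cDNA alone cannot distinguish it from the tract; that ambiguity is what comparison with the genomic sequence resolves.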
The HMG-17 promoter

The regions 5' of eukaryotic protein genes have been found to be important in the control of gene expression. These promoter regions often contain small sequence elements which have been well conserved (see Chapter 1). The most common of these is the TATAA element or Goldberg-Hogness box (8) which is similar to the prokaryotic TATA promoter and has a similar function (9). The TATAA element is recognized by part of the RNA polymerase II transcription complex, leading to accurate initiation of transcription at just one or a few nucleotides about 30 bp downstream of TATAA (10). In contrast to prokaryotic genes, not all eukaryotic protein genes have TATAA elements in their promoters. When the TATAA element is absent, initiation of transcription usually can occur at a number of sites (11).

Another sequence element often found in promoters is CCAAT. This sequence is usually located 30 to 60 bp upstream of the TATAA element, but it may be found closer. Some promoters contain degenerate CCAAT elements, and some contain none at all (12).

The 5' regions of eukaryotic genes are often G+C-rich and contain G+C-rich elements which facilitate high levels [...] these elements is variable, but their effect seems to be additive and proportional to their proximity to the TATAA element (when one is present) (14). It is thought that these elements, like CCAAT and TATAA elements, are binding sites for proteins which participate in transcription. The first and best characterized of these proteins is the mammalian protein SP1 (15). This protein increases transcription by binding to a variety of G+C-rich elements which share the core consensus sequence GGCGGG. An avian homologue of SP1 has not been demonstrated, but G+C-rich sequences are common in the upstream regions of avian genes.

Figure 19 shows the promoter region of the chicken HMG-17 gene and exon 1 (in capitals). All of the common promoter elements are found in the HMG-17 promoter.
Part of the HMG-17 promoter lies on pHM0.8HB, upstream of the subclones which contain the coding sequences.

The HMG-17 promoter contains two TATAA elements. They are located 31 bp and 41 bp upstream of the first base represented in the cDNA clone pLG1a. This suggests that pLG1a is a nearly complete cDNA copy of an HMG-17 mRNA, perhaps missing a few transcribed nucleotides at the 5' end. Either TATAA element, or both, might function in this promoter. The site(s) of the start of HMG-17 mRNA transcription must be determined directly (see below), not by an examination of one cDNA clone.

The chicken HMG-17 promoter also contains a CCAAT element 61 bp upstream of the 5' TATAA element. This spacing is similar to that seen in many eukaryotic genes.

The HMG-17 promoter has a G+C content of about 75%. Within this G+C-rich region are 4 SP1 binding sites.

Figure 19. The HMG-17 promoter region. Exon 1 sequences are capitalized; common promoter elements are underlined; and mRNA initiation sites determined by S1 protection analysis are indicated with arrows.
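The promoter features discussed here (TATAA and CCAAT elements, the GGCGGG core of SP1 sites on either strand, and overall G+C content) can all be located by simple string searches. The sketch below illustrates the idea on a short made-up sequence; it is not the HMG-17 promoter, and the function names are hypothetical. A GGCGGG core on the non-coding strand appears as CCCGCC when the coding strand is read.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_content(seq):
    """Fraction of G and C residues in the sequence."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def find_all(seq, motif):
    """0-based start positions of every (possibly overlapping) motif match."""
    s, m = seq.upper(), motif.upper()
    return [i for i in range(len(s) - len(m) + 1) if s[i:i + len(m)] == m]


def scan_promoter(seq):
    """Locate common promoter elements on the coding strand of seq."""
    sp1_core = "GGCGGG"
    return {
        "TATAA": find_all(seq, "TATAA"),
        "CCAAT": find_all(seq, "CCAAT"),
        "SP1_coding_strand": find_all(seq, sp1_core),
        "SP1_noncoding_strand": find_all(seq, reverse_complement(sp1_core)),
        "gc_fraction": gc_content(seq),
    }


if __name__ == "__main__":
    # Hypothetical toy promoter, for illustration only.
    toy = "CCCGCC" + "GCGCCAATGC" + "GGCGGG" + "CGTATAAGCG"
    for name, value in scan_promoter(toy).items():
        print(name, value)
```

Distances such as the CCAAT-to-TATAA spacing follow by subtracting the reported positions.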
The 2 sites farthest upstream are on the non-coding strand of DNA, while the downstream elements, including one between the CCAAT and TATAA elements, are on the coding strand of the gene. This seemingly haphazard arrangement of SP1 binding sites is typical of this element.

S1 Analysis

One method of determining where transcription of a gene starts involves S1 protection analysis. In this method, a radioactive probe is made which overlaps the end of the mRNA of a gene. This probe is hybridized to mRNA, then the hybrids are treated with S1 nuclease, which degrades single-stranded nucleic acids but not double-stranded hybrid material. Measuring the portion of the probe which is protected from digestion by hybridization to mRNA can give an accurate indication of where transcription begins. Furthermore, when, as is usual, the hybridization probe is in excess of the corresponding mRNA, the intensity of the protected band is a good measure of the gene's mRNA level.

This method, as applied to the chicken HMG-17 gene, is [...] and +59 (Figure 19). This 118 bp Sau96aI fragment overlaps the suspected transcriptional start site. After treatment with calf alkaline phosphatase, the fragment was labelled with [γ-32P]ATP and polynucleotide kinase. After the labelled DNA was denatured, it was hybridized to mRNA isolated from several tissues. The resultant RNA:DNA hybrids were then treated with S1 nuclease and analyzed on a denaturing polyacrylamide gel.

Figure 21 shows the results of this analysis. A small amount of untreated probe was run in lane P. The two labelled strands of DNA, each 118 nucleotides long, have slightly different mobilities because of their different base compositions. Some full length probe remains in each of the treated sample lanes. This material probably reannealed during the hybridization step and so became resistant to digestion by S1 nuclease.
The amount of probe added to the RNA samples was 1000-fold more than the amount loaded in lane P, so the reannealed probe in lanes 1-12 is a small fraction of the original input. When the 118 nucleotide probe was hybridized to total RNA isolated from embryonic brain, heart, skeletal muscle, or blood, a range of fragments centered on 70 to 71 nucleotides in length was protected from S1 nuclease digestion. This means that these tissues contain mRNAs homologous to the probe, and that these mRNAs start 70-71 bp upstream of the Sau96aI site in exon 1.

Figure 20. S1 protection analysis of HMG-17 mRNA.

[Figure 20: schematic of the procedure. The Sau96aI fragment is treated with phosphatase and labelled with kinase and 32P-ATP, then denatured and hybridized to mRNA; S1 nuclease digestion leaves the protected fragment.]

Figure 21. Results of S1 protection analysis of HMG-17 mRNA from 15 day old chicken embryos. Lane P = probe only; M = pBR322 HinfI size marker. Experimental lanes: 1,7 = liver RNA; 2,8 = brain RNA; 3,9 = cardiac muscle RNA; 4,10 = skeletal muscle RNA; 5,11 = blood RNA; 6,12 = yeast RNA. Samples in lanes 7-12 were treated with 2 times as much S1 nuclease as samples in lanes 1-6. Signal sizes (in nucleotides) are indicated.

[Figure 21: autoradiograph with lanes P, M, and 1-12.]

(It is formally [...] site, but its correspondence with the cDNA clone start site, the absence of consensus splice acceptor sequences, and the presence of consensus promoter sequences point to it being the mRNA initiation site. It is also possible that mRNA transcription began upstream of this region and was processed to mature mRNA, but consensus promoter sequences and the rarity of such 5' processing in RNA polymerase II mRNAs make this doubtful.) The nucleotides corresponding to possible mRNA initiation sites are shown in Figure 19. Because S1 cleavage is imprecise to within 2 bp, it cannot be determined if the HMG-17 mRNA begins with one or the other or both purine nucleotides shown.
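The arithmetic behind this kind of S1 mapping can be written out as a short Python sketch. It is an illustration under stated assumptions, not a record of how the data were analyzed: the function names are hypothetical, positions are placed on a simple integer axis that increases downstream and includes zero (unlike the conventional -1/+1 promoter numbering), and the labelled 5' nucleotide of the probe strand is assigned position 0.

```python
def start_sites_from_s1(label_pos, protected_lengths):
    """Map protected fragment lengths to candidate mRNA 5' ends.

    label_pos is the position of the labelled 5' nucleotide of the probe
    strand complementary to the mRNA. A protected fragment of n nucleotides
    extends from label_pos upstream to label_pos - n + 1, which is taken
    as the candidate initiation site.
    """
    return [label_pos - n + 1 for n in protected_lengths]


def with_uncertainty(site, slop=2):
    """All positions within +/- slop of a site, reflecting the roughly
    2 bp imprecision of S1 cleavage noted in the text."""
    return list(range(site - slop, site + slop + 1))


# Protected fragments of 70 and 71 nucleotides, label at position 0.
sites = start_sites_from_s1(0, [70, 71])
print(sites)
print(with_uncertainty(sites[0]))
```

Because the windows around the two candidate sites overlap, this calculation alone cannot distinguish between adjacent initiation nucleotides, which is the limitation noted in the text.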
These nucleotides are 9 and 10 bp upstream of the start of the cDNA clone, indicating that the cDNA is slightly truncated. The start sites indicated by S1 protection analysis are appropriately located with respect to the upstream TATAA element (30-31 bp from the central T), suggesting that it controls the major initiation site for HMG-17 transcription. (A weak signal at about 63 nucleotides in the gel could result from initiation regulated by the downstream TATAA or be due to an internal S1-sensitive region in the RNA:DNA hybrids.)

The 118 bp probe was hybridized to a sample of yeast tRNA (commercially prepared) as a negative control. Since this RNA contains no chicken HMG-17 mRNA, none of the probe was protected from S1 nuclease digestion and no signal was generated (Figure 21, [...]).

S1 analysis also shows that the HMG-17 gene is not active in embryonic liver. Figure 21, lanes 1 and 7, shows that liver RNA prepared from 15 day old embryos does not generate the 70 and 71 nucleotide signals. The few faint larger bands seen in lane 1 are products of incomplete S1 digestion: when more S1 nuclease is used to digest the hybridization mixture these bands do not appear (lane 7). A+T-rich regions of double-stranded nucleic acids are somewhat susceptible to S1 digestion because of transient local melting of these regions. We suspect that the 2 TATAA elements on the reannealed 118 bp probe are slightly digested by high levels of S1 nuclease, generating a faint signal of about 90 nucleotides in some lanes.

Other Results

Landsman, Srikantha, and Bustin (17) reported the isolation and characterization of the chicken HMG-17 gene. They used a human HMG-17 cDNA clone to isolate genomic clones which contain the chicken gene. Their results agree with ours in every respect except one. They used the primer extension method to determine the mRNA initiation site.
Their results predict that HMG-17 mRNA starts at the first A in the sequence TTCAAATTAGTGGGG, while we predict a start at the fourth A or its neighbor G (Figure 19). While their published results are not completely convincing (not shown), the tendency of S1 nuclease to digest A+T-rich hybrids may [...] S1 protocol, since they predict an mRNA with 6 As and Ts at the 5' end. The initiation site they predict is 24 bp 3' of the central T in the upstream TATAA. They have evidence, as do we, of minor utilization of the downstream TATAA. Their evidence predicts a minor mRNA species which starts in the [...] sequence 8 bp downstream of the major initiation site. We have not attempted to resolve this minor discrepancy between our results.

Conclusion

Analysis of the chicken HMG-17 gene has shown that:

1. The gene is a single copy gene.
2. The gene contains 5 introns; the exon-intron boundaries are unremarkable.
3. The mRNA is polyadenylated 22 nucleotides downstream of the polyadenylation signal AAUAAA.
4. The HMG-17 promoter contains TATAA, CCAAT, and SP1 binding elements in a normal arrangement.
5. The gene is active in embryonic brain, heart, skeletal muscle, and blood, but not in embryonic liver.

The Chicken HMG-14a Gene

To isolate the chicken HMG-14a gene, the chicken HMG-14a cDNA pLM3a was used as a radioactive hybridization probe in a screen of a λCharon4A library of the chicken genome as described in Chapter 2. Fourteen plaques tested positive in the initial hybridization screen of the library. When the 14 clones were isolated, replated, and rescreened, 3 of them gave strong positive signals. These were named λYN1, λYN2, and λYN3. DNA was prepared from the purified clones and used for restriction mapping and Southern analysis. Figure 22 shows restriction maps of λYN1, λYN2, and λYN3.
Southern blotting of the mapping gels and hybridization of the HMG-14a cDNA probe to the blots allowed identification of the fragments which contain coding sequences of the gene, and facilitated alignment of the maps to show how these clones overlap. Fragments of λYN1 and λYN3 which hybridize to the HMG-14a cDNA probe were subcloned into a plasmid vector (Bluescribe KS+). These subclones, pHM1.9HB, pHM1.BBH, pHM1.8HB, pHM1.0RH, and pHM1.5HP, were used for restriction mapping and sequence analysis. Identical pHM1.8HB subclones were isolated independently from both λYN1 and λYN3, confirming that these λ clones contain overlapping sequences of the chicken genome.

Figure 22. Restriction maps of recombinant λCharon4A chicken genomic clones which hybridize to the HMG-14a cDNA pLM3a. The clones are aligned to show how they overlap.

[Figure 22: aligned restriction maps of the three genomic clones.]

The region which hybridizes to the HMG-14a cDNA probe is about 10 kb. This is much larger than the chicken HMG-17 gene, which is less than 4 kb. Southern analysis of genomic DNA had shown that the chicken HMG-14a gene is a single copy gene, but might have weak homology to a few other sequences in the genome. It seemed possible that λYN1 and λYN3 might contain 2 homologous genes. Alternatively, the chicken HMG-14a gene might contain more intron sequences than the HMG-17 gene. Restriction mapping and sequence analysis eventually showed that there is only one large HMG-14a gene in this region.

Figure 23 shows detailed restriction maps of the subclones of λYN1 and λYN3. As in the analysis of the HMG-17 gene, restriction sites present in the cDNA clone were suggestive of the locations of exons. In some cases these sites were used as labelling sites for sequence determination by the chemical degradation method. The regions which were sequenced are indicated below the maps in Figure 23.
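Restriction mapping of the kind described above can be mimicked in silico once a sequence is known. The following Python sketch is offered only as an illustration and was not part of this work. It locates the standard recognition sequences of EcoRI, BamHI, and HindIII in a linear molecule and lists the fragment lengths expected from a complete digest. The function names and the toy sequence are hypothetical, and each enzyme is assumed to cut at the first base of its site, whereas the real enzymes cut at a fixed offset within the site.

```python
# Standard recognition sequences for three enzymes named in this chapter.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def site_positions(seq, site):
    """0-based start positions of every copy of a recognition site."""
    seq = seq.upper()
    positions = []
    index = seq.find(site)
    while index != -1:
        positions.append(index)
        index = seq.find(site, index + 1)
    return positions


def fragment_lengths(seq, enzymes):
    """Fragment lengths, in map order, from a complete digest of a linear
    molecule with one or more enzymes.

    Simplification (assumption): the cut is placed at the first base of
    each recognition site.
    """
    cuts = sorted({p for name in enzymes for p in site_positions(seq, SITES[name])})
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


# Hypothetical toy sequence with one EcoRI site and one BamHI site.
toy = "AAAAGAATTCCCCCCCGGATCCTT"
print(fragment_lengths(toy, ["EcoRI"]))
print(fragment_lengths(toy, ["BamHI"]))
print(fragment_lengths(toy, ["EcoRI", "BamHI"]))
```

Comparing single and double digests in this way is the same logic used to order sites on a map: a fragment that disappears in the double digest must contain a site for the second enzyme.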
Sequence analysis showed that the chicken HMG-14a gene consists of 7 exons and 6 introns. For reasons discussed below the exons are numbered exon 0 through exon 6. Figure 24 shows the sequences of the 7 exons that are found in the HMG-14a cDNA pLM3a. Exon 0 is very small, consisting of [...] untranslated region, the initiation ATG, and 4 codons make up exon 1. This is similar to exon 1 of HMG-17. Exons 2, 3, and 4 of HMG-14a are quite small, as they are in HMG-17. In HMG-14a they consist of 30 bp, 30 bp, and 51 bp, respectively. Exon 5 is somewhat larger at 144 bp. Since exon 6 contains the long 3' untranslated region of the gene, it is much larger than any of the other exons.

The relative sizes of the exons of HMG-14a and HMG-17 are similar. In fact, 3 of the exons contain coding regions for virtually identical regions of the homologous proteins. Figure 25 shows a comparison of the codons present in exon 1, exon 3, and exon 4 of the two chicken HMG genes. The regions of the protein specified by exon 1 of each gene have identical amino acid sequences. The same is true of exon 4 of each gene. The portions of the two proteins specified by each exon 3 differ by just one amino acid. Conservation of exon structure and intron location is seen in many other gene families, for example, between the α- and β-globin genes (4) and even where the predicted evolutionary separation of the homologues is as ancient as the 2 characterized chicken histone H3.3 genes (17).

Figure 23. Restriction maps of plasmid subclones which contain HMG-14a exons. The exons are indicated with heavy bars. Arrows indicate regions which were sequenced to locate the exons.

[Figure 23: restriction maps of the five subclones, with exon positions and a key to the restriction enzyme symbols.]

Figure 24.
The exon-intron structure of HMG-14a. The sequence of the cDNA pLM3a is shown; the locations of the 6 introns are indicated with arrows. The coding sequence is capitalized. Sequences which are complementary to the synthetic oligonucleotides JD-20 and JD-21 are underlined.

[Figure 24: nucleotide sequence of the pLM3a insert, with the positions of introns 0 through 5 and of oligonucleotides JD-20 and JD-21 marked.]

Figure 25. Conserved exons of HMG-14a and HMG-17. These portions of the proteins differ at only one amino acid.

[Figure 25: aligned codons of exons 1, 3, and 4 of the two genes.]

[Dot-matrix plots, panels A and B; X-axis: HMG-14 sequence.] The chicken HMG-14 amino acid sequence (X-axis) is compared to that of human HMG-14 (Y-axis, panel A) and chicken HMG-17 (Y-axis, panel B). The human HMG-14 sequence is from Landsman et al. (1986b). The window of comparison is 10 aa residues with 50% identity required for a positive result to be recorded. [...]

[...] about 50 nt 3' to the stop codon in the chicken HMG-14 sequence (40 nt 3' in the human). The function of this region is unknown, but the high level of sequence similarity in this area strongly suggests an important role in the HMG-14 mRNA structure. This sequence may be important in both the HMG-14 and HMG-17 mRNAs, since a comparison of the chick HMG-17 cDNA to that of the chick HMG-14 (Fig. 4C) also shows similarity between the two regions at nt positions 500-550. The level of sequence similarity between these two cDNAs in this region is not as great as exists between the chicken and human HMG-14 cDNAs, but it is of the same magnitude as the similarity between the chick HMG-14 and HMG-17 coding regions which, as described above, leads to 44% identity in their amino acid sequences. Fig. 4C also identifies a region very rich in A residues (nt 594-624 in Fig. 1) in the chick HMG-14 3'-untranslated region. This shows up in Fig. 4C as a series of horizontal lines which mark the similarity of this tract to several smaller A-rich regions in HMG-17. (The A blocks corresponding to the poly(A) tails have been deleted in Fig. 4.) Both chicken cDNAs share the characteristics noted for the human HMG-14 and HMG-17 cDNAs (Landsman et al., 1986a,b) of being G + C-rich in their 5'-untranslated regions and A + T-rich in their 3'-untranslated regions. The functions, if any, of these nucleotide biases and the long A tract are unknown.

(c) Gene copy number

Fig. 5 shows blots of chicken genomic DNA cut with either EcoRI or BamHI restriction enzymes and hybridized with the HMG-14 or HMG-17 cDNA inserts. In each case, only one strongly hybridizing band is seen, suggesting that both cDNA sequences are single copy in the chicken genome.
This has been confirmed by isolation of cloned genomic DNAs (D.L.B., unpublished results) whose restriction maps demonstrate that the strongly hybridizing bands do not result from HMG genes duplicated in tandem or several HMG genes closely linked on a single restriction fragment. For both HMG genes, approximately one positive clone was isolated per 50000 λ recombinants (15-20 kb inserts) screened, in agreement with each gene being single copy in the chicken genome. However, in the case of HMG-14 there is one other band which hybridizes at 20-30% the strength of the major band (at 7.8 kb in lane 2 and 6.9 kb in lane 4 of Fig. 5) and one or two other still weaker bands. Preliminary results (D.L.B.) from our genomic clones suggest that these minor bands are not due to small portions of the HMG-14 gene existing as separate exons on different restriction fragments. The weaker bands are likely to result from partial sequence similarity of the HMG-14 probe to other sequences in the chicken genome, perhaps to other HMG genes. However, despite the significant similarity of chicken HMG-14 and HMG-17 coding sequences, there is no observable cross-hybridization of the two probes under the conditions of this experiment (Fig. 5, lanes 1 and 2).

Fig. 5. Chromosomal blots of chicken genomic DNA hybridized to HMG cDNAs. Each lane contained 11 µg of chicken genomic DNA digested with EcoRI (lanes 1, 2) or BamHI (lanes 3, 4). Blots were hybridized in 50% formamide hybridization solution at 42°C as described (Grandy and Dodgson, 1987) to nick-translated (10^[...] cpm/µg) [32P]cDNA inserts from the HMG-17 (lanes 1, 3) or the HMG-14 (lanes 2, 4) clone. Blots were washed at 65°C in 0.1 M NaCl-0.01 M Tris·HCl, pH 7.5-1 mM EDTA. Arrows denote the positions and sizes in kb of internal EcoRI-digested λ DNA markers (lanes 1, 2) or external HindIII-digested λ DNA markers (lanes 3, 4).

Since it appears that the genes for HMG-14 and HMG-17 are single copy in the chicken genome, it is possible that most of the multigene family members observed for these two cDNAs in man and mammals (Landsman and Bustin, 1986) are pseudogenes. For unknown reasons pseudogenes seem to be rather rare in the chicken genome. For example, we have yet to identify a single pseudogene in either of the two chicken globin gene clusters (three and four genes each; Dodgson et al., 1981; Dolan et al., 1981) or the two replication variant histone gene clusters (19 and 21 genes each; Grandy and Dodgson, 1987).

(d) HMG-14 mRNA levels

Preliminary measurements have been made of chicken HMG-14 mRNA levels by RNA blotting as shown in Fig. 6. As expected from our cloning results (four positives from about 100000 phage, see MATERIALS AND METHODS, section a), there is a low but clearly measurable level of HMG-14 mRNA in chicken liver total RNA (Fig. 6, lane 3). A single band was observed of a size (approx. 950 nt) similar to that of the cloned cDNA insert. However, we were unable to detect HMG-14 mRNA in either reticulocyte RNA from anemic birds or in RNA from HD3 cells, an erythroid precursor cell line (transformed with ts avian erythroblastosis virus; Beug et al., 1982) grown in culture. HMG-14 and HMG-17 are clearly present in chicken erythrocyte chromatin (Mayes, 1982), but these proteins may have been synthesized early in erythroid differentiation and/or were translated from relatively low mRNA levels, thus accounting for our inability to detect the message in reticulocyte mRNA. The absence of detectable message in HD3 cells is surprising in view of the results of Bustin et al. (1987), which showed much higher HMG-17 mRNA levels in cultured cells than in liver, but these authors also found considerably higher levels of HMG-17 mRNA than HMG-14 mRNA in HeLa cells. More sensitive measurements will be required to delineate the overall regulation of chicken HMG-14 mRNA levels.

Fig. 6. Chicken HMG-14 mRNA levels. Total RNA was prepared as described (Yoshihara et al., 1987) from: lane 1, HD3 chicken cells (Beug et al., 1982); lane 2, anemic hen reticulocytes; and lane 3, adult chicken liver. RNA samples (100 µg/lane) were run on a 2.2 M formaldehyde-[...] agarose gel and the gel was blotted as described (Maniatis et al., 1982). Hybridization with nick-translated HMG-14 cDNA and washing were as described in the legend to Fig. 5. The arrows at 600 and 1600 nt designate the positions of an internal RNA (α-globin mRNA) and an external single-stranded DNA standard, respectively. RNA and DNA standards were shown to run equivalently on this gel.

Fig. 4. Dot-matrix comparison of HMG cDNA sequences. (A) Comparison of human HMG-17 cDNA (X-axis) to chicken HMG-17 cDNA (Y-axis). (B) Comparison of human HMG-14 cDNA (X-axis) to chicken HMG-14 cDNA (Y-axis). (C) Comparison of chicken HMG-14 cDNA (X-axis) to chicken HMG-17 cDNA (Y-axis). Human cDNA sequences are from Landsman et al. (1986a,b). A window of 10-nt residues was used with 50% identity required for a positive result in all cases. Linker and 3' poly(A) regions have been removed for this analysis.

(e) Conclusions

Nucleotide sequence analysis of chicken HMG-14 and HMG-17 cDNAs demonstrates considerable sequence similarity between avian and mammalian HMG-17 sequences but much less similarity between the analogous HMG-14 sequences.
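The similarity comparisons summarized here rest on the dot-matrix method described in the legend to Fig. 4 (a window of 10 residues with 50% identity required for a positive result). A minimal Python sketch of that windowed comparison follows. It is an illustration of the general technique, not the graphic matrix program actually used (Maizel and Lenk, 1981), and the function name and test strings are hypothetical.

```python
def dot_matrix(x, y, window=10, min_identity=0.5):
    """Return the set of (i, j) positions that score as dots.

    A dot is recorded at (i, j) when the window of x starting at i and the
    window of y starting at j are identical at a fraction of at least
    min_identity of their positions. Indices are 0-based.
    """
    needed = window * min_identity
    dots = set()
    for i in range(len(x) - window + 1):
        for j in range(len(y) - window + 1):
            matches = sum(a == b for a, b in zip(x[i:i + window], y[j:j + window]))
            if matches >= needed:
                dots.add((i, j))
    return dots


# Hypothetical strings: 5 of 10 positions match, so one dot is recorded.
print(dot_matrix("AAAAAAAAAA", "AAAAACCCCC"))
# Only 4 of 10 positions match, so nothing is recorded.
print(dot_matrix("AAAAAAAAAA", "AAAACCCCCC"))
```

Plotting the returned positions with x on one axis and y on the other gives the familiar picture: related regions appear as diagonal runs of dots, and a homopolymer tract compared to several shorter tracts appears as a series of horizontal or vertical lines.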
Comparison of several HMG-14 and HMG-17 cDNA sequences nevertheless suggests a potential conserved regulatory region in the 3'-untranslated portion of these mRNAs. In contrast to the mammalian HMG-14 and HMG-17 gene families, these sequences appear to exist in one to two copies per haploid chicken genome. Low levels of the HMG-14 mRNA, 950 nt in length, were detected in total chicken liver RNA but not in RNA isolated from anemic chicken reticulocytes or from a chicken erythroblast cell line grown in culture.

ACKNOWLEDGEMENTS

We thank Drs. Ed Fritsch (Genetics Institute, Cambridge, MA) and Ron Davis (Michigan State University) for discussions regarding oligodeoxynucleotide screening techniques. This work was supported by an All-University Research Initiation Grant from Michigan State University and by an NIH Grant (GM 28837) to J.B.D.; J.B.D. is the recipient of a Research Career Development Award from NIH. This is journal article No. 12428 of the Michigan Agricultural Experiment Station.

REFERENCES

Beug, H., Palmieri, S., Freudenstein, C., Zentgraf, H. and Graf, T.: Hormone-dependent terminal differentiation in vitro of chicken erythroleukemia cells transformed by ts mutants of avian erythroblastosis virus. Cell 28 (1982) 907-919.
Bustin, M., Soares, N., Landsman, D., Srikantha, T. and Collins, J.M.: Cell cycle regulated synthesis of an abundant transcript for human chromosomal protein HMG-17. Nucl. Acids Res. 15 (1987) 3549-3561.
Dodgson, J.B., McKune, K.C., Rusling, D.J., Krust, A. and Engel, J.D.: Adult chicken α-globin genes, αA and αD: no anemic shock α-globin exists in domestic chickens. Proc. Natl. Acad. Sci. USA 78 (1981) 5998-6002.
Dodgson, J.B., Yamamoto, M. and Engel, J.D.: Chicken histone H3.3B cDNA sequence confirms unusual 3' UTR structure. Nucl. Acids Res. 15 (1987) 6294.
Dolan, M., Sugarman, B.J., Dodgson, J.B. and Engel, J.D.: Chromosomal arrangement of the chicken β-type globin genes. Cell 24 (1981) 669-677.
Grandy, D.R. and Dodgson,
J.B.: Structure and organization of the chicken H2B histone gene family. Nucl. Acids Res. 15 (1987) 1063-1080.
Jacobs, K., Shoemaker, C., Rudersdorf, R., Neill, S.D., Kaufman, R.J., Mufson, A., Seehra, J., Jones, S.S., Hewick, R., Fritsch, E.F., Kawakita, M., Shimizu, T. and Miyake, T.: Isolation and characterization of genomic and cDNA clones of human erythropoietin. Nature 313 (1985) 806-810.
Johns, E.W. (Ed.): The HMG Chromosomal Proteins. Academic Press, New York, 1982.
Landsman, D. and Bustin, M.: Chromosomal proteins HMG-14 and HMG-17. J. Biol. Chem. 261 (1986) 16087-16091.
Landsman, D. and Bustin, M.: Chicken non-histone chromosomal protein HMG-17 cDNA sequence. Nucl. Acids Res. 15 (1987) 6750.
Landsman, D., Soares, N., Gonzalez, F.J. and Bustin, M.: Chromosomal protein HMG-17. J. Biol. Chem. 261 (1986a) 7479-7484.
Landsman, D., Srikantha, T., Westermann, R. and Bustin, M.: Chromosomal protein HMG-14. J. Biol. Chem. 261 (1986b) 16082-16086.
Maizel Jr., J.V. and Lenk, R.P.: Enhanced graphic matrix analysis of nucleic acid and protein sequences. Proc. Natl. Acad. Sci. USA 78 (1981) 7665-7669.
Maniatis, T., Fritsch, E.F. and Sambrook, J.: Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.
Maxam, A.M. and Gilbert, W.: Sequencing end-labeled DNA with base-specific chemical cleavage. Methods Enzymol. 65 (1980) 499-560.
Mayes, E.L.V.: Species and tissue specificity. In Johns, E.W. (Ed.), The HMG Chromosomal Proteins. Academic Press, New York, 1982, pp. 9-40.
Ohtsuka, E., Matsuki, S., Ikehara, M., Takahashi, Y. and Matsubara, K.: An alternative approach to deoxyoligonucleotides as hybridization probes by insertion of deoxyinosine at ambiguous codon positions. J. Biol. Chem. 260 (1985) 2605-2608.
Walker, J.M.: Primary structures. In Johns, E.W. (Ed.), The HMG Chromosomal Proteins. Academic Press, New York, 1982, pp. 69-87.
Walker, J.M. and Johns, E.W.: The isolation, characterization and partial sequences of the chicken erythrocyte non-histone chromosomal proteins HMG14 and HMG17. Biochem. J. 185 (1980) 383-386.
Walker, J.M., Goodwin, G.H. and Johns, E.W.: The primary structure of the nucleosome-associated chromosomal protein HMG14. FEBS Lett. 100 (1979) 394-398.
Walker, J.M., Stearn, C. and Johns, E.W.: The primary structure of non-histone chromosomal protein HMG17 from chicken erythrocyte nuclei. FEBS Lett. 112 (1980) 207-210.
Woo, S.L.C.: A sensitive and rapid method for recombinant phage screening. Methods Enzymol. 68 (1979) 389-395.
Yoshihara, C.M., Lee, J.-D. and Dodgson, J.B.: The chicken carbonic anhydrase II gene: evidence for a recent shift in intron position. Nucl. Acids Res. 15 (1987) 753-770.

Communicated by J.L. Slightom.