This is to certify that the thesis entitled

CHARACTERIZATION OF A CHICKEN H3.3 REPLACEMENT VARIANT HISTONE GENE

presented by

DAVID CHRISTOPHER BRUSH

has been accepted towards fulfillment of the requirements for the M.S. degree in Biochemistry.

CHARACTERIZATION OF A CHICKEN H3.3 REPLACEMENT VARIANT HISTONE GENE

By

David C. Brush

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Department of Biochemistry

1985

ABSTRACT

The subclone pBH6b-2.6 was restriction mapped and subsequently sequenced by the method of Maxam and Gilbert.
This subclone contains a 2.3 kb fragment from the λCharon 4A recombinant clone, λCH6b, which hybridized to a probe consisting of sea urchin H3 histone DNA sequences. Nucleotide sequence data reveal that pBH6b-2.6 contains an H3.3 replacement variant histone gene interrupted by three introns. Two of the intervening sequences occur in the coding portion of the gene, while the third (revealed by S1 mapping experiments) occurs in the 5'-nontranslated region of the gene. The H3.3 gene described herein is the second example of an H3.3 replacement variant histone gene characterized in detail. It codes for a protein sequence identical to that of the first such gene (H3.3-1); thus it has been designated H3.3-2. Although the two genes code for the same histone variant, comparison of their non-coding base pairs and intron sizes suggests that the two genes are only distantly related evolutionarily. In addition, it was shown that H3.3-2 messenger RNA is post-transcriptionally polyadenylated whereas mRNA of the replication variant, H3.2, is not.

The expression of the H3.3-2 gene in different tissues is also described. It is demonstrated that the H3.3-2 gene is expressed at low (basal) levels in dividing and nondividing tissues. The structure and expression of H3.3-2, a replacement variant histone, are compared to those of H3.2, the corresponding replication variant histone.

To My Wife, LeAnne

ACKNOWLEDGEMENTS

I would like to thank Dr. Jerry B. Dodgson for his guidance and support throughout this research project. I would also like to thank Theresa Fillwock for typing this manuscript.

TABLE OF CONTENTS

List of Figures
I. Introduction
II. Materials and Methods
    A. Methods
        1. Subcloning of DNA
        2. Transformation of HB101 with Subclone pBH6b-2.6
        3. Large Scale Isolation and Purification of Plasmid DNA
        4. Restriction Endonuclease Mapping of pBH6b-2.6
        5. Purification of Gel Fractionated DNA Fragments
        6. Labeling of Double-Stranded DNA
        7. Maxam and Gilbert DNA Sequencing of Plasmid pBH6b-2.6
        8. Preparation and Isolation of Chicken RNA
        9. S1 Nuclease Mapping
        10. Primer Extension Analysis
    B. Materials
III. Results
IV. Discussion
V. References

LIST OF FIGURES

Figure 1. Arrangement of the tandem repeat units of sea urchin histone genes and Drosophila histone genes.
Figure 2. Organization of the λCharon 4A recombinant clone λCH6b.
Figure 3. Restriction endonuclease map, Maxam and Gilbert sequencing strategy and relative position of H3.3 histone gene for subclone pBH6b-2.6.
Figure 4. Complete nucleotide sequence of the 2.3 kb insert of subclone pBH6b-2.6.
Figure 5. Comparison of the protein and nucleotide sequences of histone H3.3-2 to histones H3.3-1 and H3.2.
Figure 6. Identification of the leader exon of the H3.3-2 gene.
Figure 7. Comparison of the major chicken H3 histone (H3.2) versus variant chicken H3 histone (H3.3-2) consensus sequences thought to be essential for the proper initiation of transcription.
Figure 8. Bar graph representation of the variation in intron size among the histones known to contain intervening sequences.
Figure 9. Comparison of the intron/exon junctions within the histone genes known to contain intervening sequences.
Figure 10. Expression of variant H3.3-2 and total H3.2 histone genes in different RNA samples.
INTRODUCTION

Our fundamental knowledge of eukaryotic gene expression has been enhanced over the years by the ever-increasing body of information relating to the histone genes. Histones comprise a group of highly conserved, small, basic proteins which are present in all eukaryotes. In addition, histones complex with DNA to form the basic subunit of chromatin (1). This subunit, the nucleosome, can complex with other nuclear proteins to form even higher orders of chromatin structure. Two sources of variation are known to exist within the histones: sequence differences and post-translational modifications (acetylation, phosphorylation, methylation) (2). These sources of variation are known to affect the ways in which histones interact with DNA, as well as with each other. Although the role of these variants is not firmly established, they may indirectly influence gene expression. Nucleosomes with different variant composition could display differences in DNA binding, thereby accounting for alterations in chromatin structure. Chromatin structure has been linked to transcriptional activity through the use of DNase I (3). Regions upstream of actively transcribed genes are more susceptible to nuclease attack than are similar sequences upstream of inactive genes. Histones are therefore of interest, since they may exert a widespread influence on gene expression through their fundamental association with chromatin.

A second consideration involves the mode of histone gene expression itself. Histone gene expression is known to be both cell-cycle and developmentally regulated. Most histone protein synthesis appears to be confined to S phase, with some evidence that transcription is closely coordinated to DNA replication (4).
Since control at the transcriptional level is the most frequently utilized mechanism for control of eukaryotic gene expression (5), this "linkage" of histones to DNA replication might be explained by a transcriptional regulatory mechanism. Hereford et al. have shown that in yeast things are not that simple, and that two levels of control are implicated (6). The first is an activation of histone transcription as the cell cycle passes through late G1. The second involves stabilization of histone mRNA by the process of DNA replication itself. Such a control mechanism could be used to regulate expression of any gene whose product is required in late G1 or S phase.

A somewhat separate but related area of interest is the developmental regulation of histone genes. Various groups of distinct histone genes are turned on during development; good examples are the early and late histone genes of sea urchin (7). Both sets of genes are under tight transcriptional control relative to the requirement for certain proteins at defined stages of development. Many other genes in the cell also must find a way to meet similar requirements. At this point, it seems reasonable that both the cell cycle and the developmental regulation of histone gene expression are related to similar phenomena that exist for a variety of other eukaryotic gene families.

Four distinct sets of histones are recognized (8). These include the histones unique to spermatogenesis and oogenesis, as well as the replication and replacement histones. The oocyte-specific histones are present during maturation and in embryos, while the spermatocyte-specific histones are found during meiotic prophase of spermatocytes. The replication histones comprise a set of embryonic histones which are found in rapidly dividing somatic tissue. The replacement histones can be seen in non-dividing tissues in increasing amounts as these tissues age.
The histone proteins fall into five classes originally based on the composition of their basic amino acids (9). These include the arginine-rich histones H3 and H4, the slightly lysine-rich histones H2A and H2B, and the very lysine-rich H1 histones. From the evolutionary standpoint, histones H3 and H4 are among the most highly conserved proteins known. Histones H2A and H2B exhibit a greater degree of evolutionary variability, with histone H1 being the most variable of the five classes. Most of the variability is confined to the N-terminal portion of the protein, which is the region involved in histone-DNA binding. The C-terminal portion is very highly conserved and it participates in the histone:histone interactions crucial to nucleosome formation. As previously stated, the histones are small, ranging in size from 11,000 daltons for H4 to 24,000 daltons for H1.

The nucleosome is composed of a core particle consisting of two copies each of histones H2A, H2B, H3 and H4 to form an oligomer (1). Histone H3 binds to histone H4 to form an (H3)2(H4)2 tetramer, which determines the diameter of the core (9). DNA is wrapped around the core particle twice, thereby binding 146 base pairs of DNA. In between the nucleosomes are the linker regions, which consist of 20-80 base pairs of DNA connecting the nucleosomes. At this level the chromatin has the characteristic appearance of beads on a string. Histone H1 has been shown to bind to the DNA as well as to the other histones. A monomer of histone H1 is thought to seal the DNA in the nucleosome by binding at the point where it enters and leaves. Whether or not H1 is present determines which of two characteristic appearances chromatin takes on. At low ionic strength without H1, a 10 nm fiber is visible under the electron microscope. This is essentially a continuous string of nucleosomes. At greater ionic strength with H1 present, a 30 nm fiber is visible.
The 30 nm fiber can be seen to have an underlying coiled structure that contains approximately six nucleosomes per turn. The 30 nm fiber can also be involved in even more complex types of chromatin structuring. Thus, we can see that histones are central to the most basic level of chromatin structure, and they are essential for the highly complex organization of DNA into the eukaryotic chromosome.

Early attempts to characterize the histones revolved around the search for a unifying theme. The essential use of histones in all eukaryotes, coupled with their high degree of sequence conservation, encouraged the somewhat naive belief that many structural and functional properties would be similar in most organisms. This unified view was further encouraged when the first histone genes were isolated in various sea urchins. Different species of sea urchin which were a hundred million years apart evolutionarily had almost identical topological gene organizations (7). However, as more organisms have been characterized with respect to their histone genes, much greater diversity has been revealed. Thus, it can no longer be said that there exists a "typical" arrangement of histone genes.

As stated earlier, analysis of histone gene organization was first attempted in sea urchin (7). Consequently, the largest body of information pertaining to histone gene organization and regulation exists for the sea urchin. A logical starting point for any discussion of histone genes therefore involves a review of sea urchin histone gene organization. The most common sea urchin histone genes are organized into a series of highly reiterated tandem repeats. Each of the five histone genes is present in the repeat unit once, with the repetition frequency anywhere from one hundred to several hundred copies (7). Coding portions are GC-rich and all five genes are transcribed off the same strand. The gene order relative to the transcribed strand is 3' to 5' H1, H4, H2B, H3, H2A (10).
Although all the genes are transcribed off the same strand, there is little or no evidence for polycistronic messages, suggesting that each gene has its own separate promoter. Hentschel and Birnstiel (10, Table 2) reviewed sequence data for all of the five major classes of histones from P. miliaris and S. purpuratus. Consensus promoter sequences (TATA box) could be demonstrated upstream of the coding portions for each of the sea urchin histone genes. The fact that these sequences have been highly conserved suggests that they must be functional and that each gene is transcribed individually.

Spacer regions are composed of AT-rich nontranscribed DNA interspersed between coding regions. An examination of spacer organization reveals several notable structural characteristics. The primary sequences of various spacer regions diverge greatly between sea urchin species but overall size and location are fairly constant. Spacers also contain no detectable highly repetitive sequences, but some clustering of AT-rich sequences is observed. Additionally, spacers have been found to contain homocopolymer stretches such as (CT/GA)27 which may function in the maintenance of repeat homogeneity. Although some microheterogeneity exists, overall the several hundred-fold duplication of tandem repeats results in a fair degree of identity among the repeats.

[Figure 1. Arrangement of the tandem repeat units of sea urchin histone genes and Drosophila histone genes.]

At this point, it seems appropriate to relate new ideas to classical thinking and thereby put into perspective the ways in which our understanding of histone gene organization and regulation is changing. Initially, it was a commonly held belief that each repeat in sea urchins represented one of several hundred identical copies (7).
Sea urchins require massive amounts of histone mRNA during a relatively short timespan of early development. This type of gene arrangement could very easily account for the large number of transcripts sea urchins require at this time. Recently, however, increased sensitivity in resolving histone variants has begun to paint a somewhat different picture. The use of nonionic detergents such as Triton X-100 in gels by Zweidler and his co-workers (11) has shown that several distinct variants of histones exist. Restriction endonuclease mapping and hybridization studies also bear out the existence of variants. Variation occurs in both primary sequence and peptide length. It now appears that the tandem repeats actually represent distinct gene batteries which contain variants to be expressed at certain times during development. Thus the commonly studied sea urchin tandem repeats would presumably be only that battery of replication histone genes used specifically during rapid cell division early in embryonic development.

Variants, or isohistones, can be shown for a variety of sea urchin species in both mature sperm and early embryos (12, Von Holt et al.). P. angulosus mature sperm cells contain three H2B isohistones (H2B1, H2B2, H2B3) which are found in widely varying amounts from cell to cell. In early embryos of S. purpuratus and P. angulosus a variety of histone mRNAs are detected. These represent distinct mRNAs required at different stages of development and not simply post-translational modifications. Evidence for this comes from placing mRNAs in a heterologous translation system (in which no modifications take place), which yields proteins with electrophoretic mobilities identical to those of the corresponding histone protein variants isolated from the growing embryo. Additionally, stage-specific histones have been isolated in sufficient amounts to allow a partial structural determination.
These stage-specific histone variants demonstrate distinct primary structural differences (see below).

Several isohistones display unique sequence variations which could play a role in determining gene expression. Upon sequence comparison, various H2B isohistones reveal N-terminal regions that are highly variable both in sequence and in length (12, Von Holt et al.). At the crux of this variability is a characteristic pentapeptide repeat unit. An example of the pentapeptide repeat for isohistone H2B1 of P. angulosus sperm cells is Pro-Thr-Lys-Arg-Ser. In a typical H2B isohistone from an embryonic cell, the pentapeptide repeat can be absent or present in a single copy. However, in a haploid sperm cell where transcription and replication are completely repressed, up to three or four intact pentapeptide repeats are present with an additional one or two mutated repeats. A similar phenomenon is also seen with isohistones of H1. N-terminal variability is even more pronounced in isohistones of H1 than H2B. In this case though, a tetrapeptide (Ser-Pro-Arg-Lys) is absent in embryo H1 isohistones and four distinct repeats are reiterated three or four times in H1 isohistones from mature sperm. It has been suggested that both of these repeats bind to DNA due to a highly basic composition. The greater the number of repeats, the stronger the interaction with the DNA and therefore the more tightly the isohistones complex with the DNA. If this binding difference does occur, it is interesting that the strongest interaction would occur in transcriptionally inactive tissue (sperm) and the weakest binding would occur in the most transcriptionally active tissues (embryos). Nucleosomes comprised of different, developmentally regulated isohistones suggest structural features which could give rise to the major functional forms of the genome. These include actively transcribed, temporarily repressed but inducible, and permanently repressed regions of DNA.
Our understanding of histone gene regulation in sea urchins is changing almost as rapidly as have our structural conceptions: different distinct sets of histones are expressed at different stages during development (10). Early in development, until the fifth or sixth cleavage, the cleavage stage histones are expressed. There is also a set of early histone genes and a set of late histone genes which can be identified during sea urchin development. Until recently, transcriptional control appeared to be the most obvious method of regulating these events. However, recent advances in the powerful technique of in situ hybridization indicate that a different mechanism may be involved (12, Angerer et al.). Initial experiments using a nick-translated early histone repeat (S. purpuratus) demonstrated a high signal in the pronuclei of embryos. Further experimentation involving a series of much more specific probes confirmed that this hybridization was due to early histone mRNA localized in the pronuclei. Calculations by two separate groups (12, Angerer et al.; 13, Showman et al.) using different methods placed the fraction of early histone mRNA localized in the pronucleus at 95-100%. This high content persists until the first cleavage, whereupon the nuclear membrane breaks down. Showman et al. (13) have demonstrated that other abundant maternal mRNAs (tubulin, actin) do not accumulate in the pronucleus.

From these studies a striking observation in the developmental regulation of histone genes emerges (12, Angerer et al.). Cleavage stage histone genes are transcribed and mRNAs are transported to the cytoplasm before maturation of the oocyte. However, early variant mRNA transcription is initiated after maturation and the resulting transcripts are sequestered in the pronuclei until the period of development where they are required.
Thus, both transcriptional and transport control are necessary to provide for the appearance of the proper histone proteins at their respective times during development. This serves to reinforce two earlier points: histones vary widely in their approaches to solving regulatory problems, and what we learn from histones could be widely applicable to the control of gene expression in general.

The second species in which histone genes were extensively characterized was Drosophila melanogaster. Comparisons of Drosophila with sea urchin, and subsequent comparisons with histone genes of other organisms, only serve to reinforce the variation that exists in the regulation of histone genes. Drosophila histone gene organization was initially examined by Karp and Hogness when they screened plasmids containing Drosophila DNA with probes made from labelled sea urchin histone mRNA (14). The histone genes are present as highly reiterated clustered tandem repeats with a repetition frequency of about 110 copies per haploid genome (10). Each repeat consists of one copy of each of the five major types of histone genes. Interspersed between the five coding portions of each repeat are spacer regions. Two major types of repeats are found in Drosophila: a 4.8 kilobase and a 5.0 kilobase repeat. The only detectable difference between the two is an insert of 240 base pairs in the spacer region between the coding portions for histones H1 and H3. The larger of the two repeating units is present in excess of the smaller repeat by a ratio of 3:1.

In many ways the Drosophila histone genes appear to be very similar overall to the major sea urchin histone genes. However, several important differences between the two are known to exist. For instance, the order of the histone genes and the mechanism of transcription are both distinct. The Drosophila histone gene order is H3, H4, H2A, H2B and H1.
Histones H3, H2A and H1 are transcribed off one strand while histones H4 and H2B are transcribed off the opposite strand (10). The divergent transcription utilized by Drosophila is similar to that of yeast, but different from that seen for sea urchin. Divergent transcription would also require that at least two sites for the initiation of transcription be present. Initially it was postulated that all five genes in sea urchin could be transcribed from a single promoter and would give rise to repeat-length RNA. Although this now appears not to be the case, Drosophila represented the first system where the requirement for multiple sites of initiation could be demonstrated in histone gene expression.

We next wish to consider the multiple levels of gene regulation that exist during Drosophila development. It is important to note that, unlike sea urchins, no stage- or tissue-specific variant histone mRNAs have been shown in Drosophila (12, Anderson et al.). Therefore, elucidation of regulatory pathways does not necessarily have to take into account the expression of a variety of histones required at different stages of development. The three major levels of control of Drosophila histone gene expression are translational efficiency, rates of transcription and rates of mRNA turnover (12, Anderson et al.). All of these regulatory mechanisms combine to produce the appropriate level of histone protein to complex with DNA at various developmental stages. Since very little histone protein is stored in the mature egg, it is the histone mRNA in the embryo which is crucial.

Each of the three regulatory mechanisms makes a contribution to the control of Drosophila histone protein synthesis. However, each contributes to a different degree and one must look at all three to sort out the overall picture. In examining these mechanisms, it is helpful to compare synthesis and turnover to a cellular standard. In this case total cytoplasmic poly (A)+ mRNA serves as a useful reference point.
Take, for instance, translational efficiency. The fraction of total poly (A)+ mRNA associated with polysomes during early embryogenesis increases slightly from 55% at one hour to 70% at four hours. This contrasts with recruitment of histone mRNA into polysomes, which goes from 25% immediately after oviposition to 90% four hours later. Transcription also shows increased activity that has been quantitated. In the first six hours of embryogenesis, the rates of synthesis per nucleus of total poly (A)+ mRNA and histone mRNA are roughly parallel. However, between 6 and 13 hours the rate of synthesis per nucleus of total poly (A)+ mRNA remains constant while the rate of histone mRNA synthesis per nucleus drops 20-fold. This decrease almost exactly parallels the rate of DNA replication. Finally, the rate of turnover of total poly (A)+ mRNA remains constant throughout embryogenesis while histone mRNA stability drops at least 15-fold.

Upon careful consideration of each of these mechanisms of regulation, we gain a better understanding of the multiple levels of gene regulation that exist in Drosophila. Translational control represents a relative fine tuning of the system since the fraction of histone messages associated with the polysomes changes only about three-fold. By comparison, histone mRNA stability changes 15-fold and the rate of histone mRNA synthesis per nucleus decreases 60-fold in the first thirteen hours of embryogenesis relative to total poly (A)+ mRNA. Histone mRNA turnover and synthesis therefore must account for the major levels of control of histone gene expression in Drosophila.

Having given a general account of histone genes in sea urchin and Drosophila, it is now possible to examine the histone genes of chicken. The initial characterization of chicken histone genes and the regulation of their expression were among the first attempts to expand our knowledge of histone gene organization to vertebrates.
As stated earlier, it was initially thought that the high copy number of histone genes in the sea urchin would allow for the large quantity of histone protein required in embryogenesis. DNA replication in vertebrate development is not nearly as intense as in sea urchin; therefore the demand for extremely rapid histone protein synthesis is also absent.

Histone gene organization in the chicken has indeed turned out to be quite different from that of Drosophila and sea urchin. Crawford et al. (15) initially observed that each of the chicken histone genes was represented approximately ten times, and that the genes were present in a tandemly duplicated array. The latter observation, however, is now clearly in error. The chicken histone genes are often (but not always) present in clusters but no tandem repeats have been observed. Engel and Dodgson (16) established this by direct isolation and characterization of a variety of genomic chicken histone clones. Harvey et al. (17) also established at about the same time that the chicken histone genes were non-tandemly arranged. They did this by isolating two genomic clones with chicken histone cDNA which were then mapped with several gene-specific probes. Additionally, Harvey et al. showed that two chicken H3 histone genes present in one clone were divergently transcribed.

Additional clues to the gross overall organization of the chicken histone genes have come from two sources. Sugarman et al. (18) characterized in detail 15 lambda Charon 4A recombinant bacteriophage containing chicken histone genes. Initially 50 lambda recombinants had been isolated (16) by hybridization to sea urchin H2A and H3 histone genes, and 15 unique recombinants were selected for further experimentation. Sugarman et al. extensively mapped all 15 lambda recombinants with probes constructed from each of the five chicken histone genes H1, H2A, H2B, H3 and H4. J.R.E. Wells and colleagues took a somewhat different approach to look at the overall topology.
Wells and colleagues (12, Engel) were able to walk down the chromosome and look at the organization of 36 kilobases of contiguous chromosomal DNA containing chicken histone genes. Both of these studies reveal a clustering of the chicken histone genes, but once again there are no tandem repeats.

The existence of histone variants in chicken is well documented. A good example of tissue-specific variation is histone H5, which replaces histone H1 in the condensed nuclei of adult avian erythrocytes (12, Harvey and Wells). The chicken histone H5 gene has recently been isolated by two groups working independently as well as in our own laboratory. Krieg et al. (19) utilized the known protein sequence available for H5 to choose a region in the gene from which to construct a unique 11-base deoxynucleotide. This sequence was then used to prime cDNA synthesis and thereby generate a probe. The extended primer was used to screen a cDNA library made from reticulocyte RNA. The final step in isolating the gene involved using the H5 cDNA to isolate the H5 gene from a lambda Charon 4A recombinant library. Ruiz-Vasquez and Ruiz-Carillo (20) chose a different route. They used a specific antibody to identify unique cDNA clones containing H5 sequences. As a result of these efforts, the chicken H5 histone gene has been sequenced and characterized.

Several interesting observations regarding histone H5 can be made from these data. Briefly, H5 is present as a single-copy gene, it codes for the expected protein sequence (it is not a pseudogene) and it is not linked to any of the other histone genes (13, Harvey, Wells). Probably the most surprising finding was that only one copy of H5 is present per haploid genome, even though it replaces histone H1, which is present approximately 10 times per haploid genome. Furthermore, the histone H5 gene was found to contain no intervening sequences and, as was previously demonstrated, the H5 mRNA is polyadenylated (despite the lack of an AAUAAA sequence).
So far, histone H5 is the only proven example of a tissue-specific histone in chicken.

In the course of characterizing the chicken histone gene family, other variant chicken histones have come to our attention. Engel, Sugarman, and Dodgson (21) isolated a chicken histone variant during analysis of their initial 50 lambda Charon 4A histone gene clones. While attempting to locate chicken H5 histone genomic sequences, they characterized a clone which contained a gene coding for a known protein variant H3 histone (22). This gene, designated H3.3, shows only four amino acid differences (135 total) when compared to a normal H3 histone (H3.2) gene; however, there is a 19% primary sequence difference. The most distinctive feature of the H3.3 variant is that the coding portion of the gene is interrupted by two intervening sequences. For a long time histones have been recognized as exceptions to the rule that most eukaryotic genes contain intervening sequences (23). This was the first example of any histone gene in any organism which contained intervening sequences. Since this report, Woudt et al. (24) have reported intervening sequences in the genes coding for histones H3 and H4 of Neurospora crassa.

Wells and his collaborators have also isolated a chicken histone variant while screening a cDNA recombinant library with core histone probes to identify clones with large inserts that showed weak hybridization (12, Harvey, Wells). Several H2A and H2B clones were isolated and a cDNA that weakly hybridized to the H2A probe was characterized. Sequencing data revealed an extremely variant H2A-like protein. Therefore, Wells has labeled this gene H2A.F. H2A.F contains a unique nonapeptide sequence highly conserved in all H2A histones sequenced so far, and yet it shows a 40% divergence from the amino acid sequence of the most abundant H2A histone in chicken erythrocytes. To put this in perspective, calf and chicken H2A histones differ by only 4%.
Additionally, H2A.F shows no hybridization to mouse, human or sea urchin DNAs, and considerable variation in its level of expression is apparent between different tissues.

From a structural examination of these variants, we can see that differences do exist even among the relatively small number of vertebrate histone genes. This leads to the main thrust of this investigation. Do the variants which have been characterized so far represent isolated mutational events, or the existence of a yet to be discovered class of histone genes? Engel (12, Engel) concluded that all previous histone studies have relied on embryonic histone genes. Childs et al. (25) have isolated and characterized late-stage histone H3 and H4 genes from the sea urchin Lytechinus pictus. The late-stage histone genes differ from the early-stage histone genes, which are tandemly repeated and reiterated several hundred times. Late-stage histone genes are present in greatly reduced numbers, so that isolation of these genes lagged until the development of a positive cloning procedure that specifically excluded tandemly repeated early histone genes from recombinant DNA libraries. Comparison of early-stage histone H3 to late-stage histone H3 reveals identical proteins but a 19% primary sequence difference.

The question has arisen whether or not all of the histone genes isolated so far simply represent genes coding for embryonic histone proteins. If so, then where are the other histones which must replace the embryonic histones in adult tissue? A clue may come from the report of Wu and Bonner (26), who noted that some types of variant histone biosynthesis are not exclusively restricted to S phase. Several variants were shown to be synthesized at a "basal" level throughout the cell cycle in Chinese hamster ovary cells. Included among the variants was the histone H3.3.
One of the most pressing questions at this time regarding histone gene expression is whether or not separate classes of histones exist which have been overlooked in the past. Given the sequence differences of the two variants isolated so far, it is possible that classical methods of isolating histone genes with heterologous sea urchin probes may have selected for only one class. The majority of genes isolated so far would fall under the heading of replication histones, since they are present in rapidly dividing tissue. The class yet to be explored (variants like H3.3 and H2A.F) would be that present in non-dividing tissue at low levels throughout the cell cycle, as demonstrated by Wu and Bonner. This group probably constitutes the replacement histones briefly mentioned earlier. If these two classes exist, it should be possible to demonstrate the existence of other variant chicken histone genes. If their existence cannot be demonstrated, variants such as H3.3 might be considered artifacts or peculiarities rather than representatives of a separate, distinct histone class. Taking into account the role which has been postulated for variation in nucleosome structure with regard to gene expression, these variant classes of histone genes may play a fundamental role in the regulation of overall gene expression.

The chicken H3 histone genes appeared to be an attractive system in which to further this investigation. One of the few variants characterized so far was an H3 histone gene. As it turned out, this variant is also the only known vertebrate histone gene to contain intervening sequences. Whether or not this will turn out to be characteristic of replacement histones is unknown. Only the continuing structural elucidation of variants will give us the answers.

Since the characterization of the chicken histone genes was first initiated, several other vertebrates have been examined for the organization and regulation of their histone genes.
In Xenopus, each of the core histone genes is reiterated 45- to 50-fold, with the exact number of H1 histone genes yet to be determined (12, Van Dongen et al.). The Xenopus histone genes are clustered; approximately 30 clusters show nearly identical restriction maps while 20 others are unique. The 30 identical clusters show a tandem arrangement. More than one gene order has been found for the Xenopus histone genes, each one associated with a different variant H1 histone gene (10). Expression of the Xenopus histone genes has been shown to be differential in development, and transcription takes place off both strands. Regulation of histone gene expression in Xenopus is different from that in any of the systems examined so far. Large amounts of maternal histone protein and histone mRNA are stored in the oocyte for use in early embryogenesis. During this time, histone protein synthesis is not coordinated with DNA synthesis. Use of the stored histone protein and mRNAs ceases in the early gastrula stage, normal zygotic transcription takes over and transcription becomes coordinated with DNA replication. This mechanism allows Xenopus to meet the requirement for large amounts of histone protein during early embryogenesis, despite possessing only a moderate number of histone genes.

The five major types of histone genes appear to be present in mouse at a gene copy number of 10-20 copies (10). No simple repeating structure of the genes coding for particular histones is obvious (12, Marzluff and Graves). Several variants of histone H1 have been characterized, and extensive primary sequence variation is evident. Analysis of a clone containing mouse histone DNA sequences has revealed that transcription occurs off opposite strands of the DNA. Regulation of mouse histone protein levels is accomplished by regulating mRNA transcription and the rate of mRNA turnover.

The human histone genes are moderately reiterated, with a gene copy number of approximately 40 (12, Stein et al.).
The human histone genes appear to be clustered, but no simple tandem repeat is apparent. Anywhere from 4 to 7 characteristic arrangements of the histone genes can be observed by restriction mapping. Interspersed between the histone genes are several members of the Alu family of highly repetitive DNA sequences. Variants can be detected for each of the five major types of histone genes. An abundance of evidence suggests that human histone gene expression is temporally coupled to DNA replication in the cell.

MATERIALS AND METHODS

A. METHODS

Subcloning of DNA. The λCH6b phage was originally isolated by Engel and Dodgson (16) and further characterized in detail by Sugarman et al. (18). The DNA fragment of λCH6b which strongly hybridized to a histone H3 probe (see Results) was flanked by a Hind III and a Bam HI site. This fragment was isolated by electroelution (see below) and subcloned using standard techniques (27) into the plasmid vector pBR322.

Transformation of HB101 with Subclone pBH6b-2.6. Plasmid pBR322 containing the insert subcloned from λCH6b phage (see above) was used to transform the E. coli strain HB101 by the RbCl transformation technique (D. Hanahan, 28). A tube containing 200 µl of HB101 at 3.0 A600 units/ml in 40 mM KOAc, 15% sucrose, 60 mM CaCl2, 45 mM MnCl2 and 100 mM RbCl, pH 5.9 (stored as stock at -70° until needed) was quick-thawed and put on ice for 30 minutes. Next, 7 µl of dimethylsulfoxide was added, and the tube was slightly agitated and put on ice for 10 minutes. The competent HB101 bacteria were added directly to the suspended plasmid (ligated DNA in 10 µl TE) and the mixture was placed on ice for 10 minutes. At this point, the bacteria were quick-frozen by immersion in a CO2/ethanol bath. The bacteria were allowed to remain in the CO2/ethanol bath for 2 minutes, whereupon they were quick-thawed. When the thaw was complete, the tube was immediately placed on ice for 30 minutes.
The bacteria were then allowed to stand for 2 minutes in a 37° water bath, after which 0.2-0.8 ml of LB broth medium was added (no antibiotic) and the bacteria were incubated at 37° for 30 minutes without shaking. Transformed bacteria were spread on LB agar plates containing the antibiotic ampicillin.

The presence of the appropriate insert in the subclone was confirmed in transformants by growing up plasmid mini-preps. The mini-prep procedure used was the alkaline-lysis protocol outlined in reference 27. A 10 µl aliquot of the mini-prep was used in a Hind III/Bam HI double digest. The DNA from each digestion was run out on a 0.8% agarose gel, stained with ethidium bromide (0.2 µg/ml) and visualized under an ultraviolet lamp. From this it was determined which of the plasmids from the ampicillin-resistant colonies contained the appropriate 2.6 kb Bam HI/Hind III insert, and one of these strains was selected for large-scale plasmid DNA isolation. The subclone consisting of the histone H3 hybridizing region of the λCH6b phage and pBR322 vector sequences was designated pBH6b-2.6.

Large Scale Isolation and Purification of Plasmid DNA. The protocol used to isolate and purify large amounts of pBH6b-2.6 plasmid DNA is a slightly modified version of the alkaline lysis procedure found in reference 27. An overnight culture consisting of 7 ml LB broth medium and ampicillin (50 µg/ml) was inoculated with the pBH6b-2.6-containing HB101 bacteria (see above) and incubated at 37° in a shaker overnight. Large scale growth was initiated by inoculating 500 ml of M-9 enriched medium with 3 ml of the overnight culture and shaking vigorously at 37° until an OD600 of 0.4-0.5 was reached. Once the proper density was achieved, 2.5 ml of chloramphenicol (34 mg/ml in ethanol) was added and incubation was continued at 37° with vigorous shaking for 12-16 hours. Bacterial cells were harvested by centrifugation, and the supernatant was discarded.
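The final chloramphenicol concentration during amplification is not stated above, but it follows from the stock concentration and the volumes given. The short Python sketch below works through that arithmetic. The function name is illustrative only, and the calculation assumes that the 3 ml inoculum and any evaporation can be neglected.

```python
def final_concentration_ug_per_ml(stock_mg_per_ml, stock_volume_ml, culture_volume_ml):
    """Concentration (ug/ml) after adding a stock solution to a culture.

    Assumes simple additive volumes; the inoculum volume is ignored.
    """
    total_ug = stock_mg_per_ml * stock_volume_ml * 1000.0  # mg -> ug
    return total_ug / (culture_volume_ml + stock_volume_ml)


# Values from the protocol: 2.5 ml of 34 mg/ml stock into a 500 ml culture.
chloramphenicol = final_concentration_ug_per_ml(34.0, 2.5, 500.0)
print(f"{chloramphenicol:.0f} ug/ml")
```

Under these assumptions the culture contains roughly 170 µg/ml chloramphenicol during the 12-16 hour amplification.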
The pelleted bacteria from a 500 ml culture were resuspended in 6 ml of 50 mM glucose, 25 mM Tris-Cl, pH 8.0, 10 mM EDTA, 5 mg/ml lysozyme, transferred to 50 ml SS34 plastic tubes and let stand for 5 minutes at room temperature. Next, 12 ml of 0.2 N NaOH, 1% SDS was added and the contents of the tube were mixed by gently inverting the tube. This mixture was let stand 10 minutes on ice, whereupon 9 ml of ice cold 5 M KOAc (pH 4.8) was added, the contents were vigorously mixed, and the tube was put back on ice for 10 minutes. Cellular DNA and bacterial debris were pelleted by centrifugation at 12K for 10 minutes. The supernatant was transferred to a 50 ml SS34 plastic tube, 0.6 volumes of isopropyl alcohol was added, and the plasmid DNA was allowed to precipitate at room temperature for 15 minutes. Plasmid DNA was pelleted by centrifugation at 10-12K for 15 minutes. Pelleted plasmid DNA was resuspended in 5 ml of 0.2 M NaCl, 0.01 M Tris-Cl, 1 mM EDTA, extracted once with phenol:chloroform (1:1) and precipitated with 5-6 ml of isopropyl alcohol at -20° for 1 hour.

Once the plasmid DNA had been isolated, it had to be purified away from chromosomal DNA sequences. This was accomplished by centrifugation through a two-step cesium chloride-ethidium bromide gradient (27). Plasmid DNA was spun down at 10K for 15 minutes at 4°, and the pellet was dried. Pelleted plasmid DNA was resuspended in 2.4 ml of 0.01 M Tris-Cl, 1 mM EDTA (TE); 4.3 grams of CsCl was dissolved in the DNA solution and 0.3 ml of ethidium bromide (10 mg/ml) was added (in the dark if possible). This solution was mixed thoroughly and immediately layered beneath 5 ml of 1.47 density CsCl in TE in a Ti70 centrifuge tube. Plasmid DNA was banded by centrifugation at 40K overnight in a Beckman Ti70 rotor, and the plasmid DNA was visualized under ultraviolet light. Two bands were apparent, with the lower band representing the purified plasmid DNA.
This band was removed with a Pasteur pipette, and ethidium bromide was removed from the plasmid DNA by a series of 3-4 extractions with CsCl-saturated isobutanol. The final step in the purification involved extensive dialysis of the plasmid DNA against at least three changes of TE buffer. Concentrations of plasmid DNA were determined using a Beckman spectrophotometer, with 1 A260 = 50 µg/ml DNA.

Restriction Endonuclease Mapping of pBH6b-2.6. All reactions utilizing restriction endonucleases were carried out under the manufacturer's recommended assay conditions. In double digests where the two restriction enzymes had similar assay conditions (ionic strength, temperature), both enzymes were added to the reaction at the same time. In double digests where the two restriction enzymes required different assay conditions, one enzyme was added and the reaction was allowed to proceed for several hours. At the end of this time, the conditions were optimized (e.g., salts added, temperature lowered) for the second enzyme and the digestion was continued for several more hours. Routinely, approximately 1 µg of plasmid DNA was incubated with 1-2 units of enzyme for 6-8 hours. Restriction endonuclease reactions were terminated with one-tenth volume of 1 M NaCl, 0.25 M EDTA. DNA was precipitated by the addition of 2.5 volumes of ethanol at -20° overnight (or -70° for 1 hr). The precipitated DNA was recovered by centrifugation in a microfuge for 15 minutes at 4°. Pelleted DNA was drained, dried, and resuspended in 25 µl of 100 mM Tris-borate, 2 mM EDTA, 5% Ficoll, 0.05% bromophenol blue and 0.05% xylene cyanol in preparation for polyacrylamide gel electrophoresis.

DNA fragments which had been subjected to restriction endonuclease digestion were electrophoresed vertically on 5% polyacrylamide gels (2 mm or 4 mm thick) for approximately 300 volt-hours in 100 mM Tris-borate, 2 mM EDTA, pH 8.3.
Molecular weight size standards consisting of Hinf I, Hind III or Taq I digested pBR322 were run in lanes next to the restricted DNA fragments. Electrophoresed DNA fragments were stained by soaking the gel in 0.2 µg/ml ethidium bromide in water for approximately 30 minutes at room temperature. Once stained, the DNA fragments were visualized by exposing the gel to an ultraviolet light source. Each gel was subsequently photographed using Polaroid 667 film and a red filter. From the photograph, one could measure the distance traveled by the molecular weight size standards and construct a standard curve for each gel. Once the standard curve had been generated, the sizes of the various restricted DNA fragments could be deduced. By examining the sizes of the DNA fragments generated by single and double digests, one could construct a physical map of the restriction endonuclease sites within plasmid pBH6b-2.6.

Several ambiguities in the restriction map remained after the above procedure; therefore it was also necessary to use the Smith-Birnstiel procedure (27) to map restriction endonuclease sites. Plasmid DNA was linearized using either Hind III or Bam HI, the protruding 5' ends were labeled with [γ-32P]ATP (see below), and the plasmid was digested a second time with whichever of the two enzymes had not been used to linearize the plasmid initially. The labeled Hind III/Bam HI insert fragment was then isolated by electroelution (see below) for Smith-Birnstiel mapping. Approximately 10^4 cpm of end-labeled insert fragment, 1 µg salmon sperm carrier DNA, 1 µl 10X restriction enzyme buffer, water up to 10 µl and 1-2 units of restriction endonuclease were placed in a tube. The reaction was incubated at the appropriate temperature and 1.8 µl aliquots were withdrawn at time intervals of 2, 5, 10, 15 and 30 minutes into corresponding aliquots of 1 µl of 0.5 M EDTA.
All samples were combined, 2 µl of gel-loading dye (5% Ficoll, 0.05% bromophenol blue, 0.05% xylene cyanol) was added and the reaction was run out on a 5% polyacrylamide gel with 10^4 cpm of end-labeled molecular weight size standards. The gel was dried and exposed to Kodak X-Omat AR film for 8-12 hours without intensifying screens. A ladder of digested DNA fragments was visible on the film. The length of each fragment indicated a restriction site located that distance from the labeled end of the insert. Using a combination of these two techniques, a reasonably accurate restriction endonuclease map was generated for plasmid pBH6b-2.6 (see Figure 3).

Purification of Gel Fractionated DNA Fragments. DNA fragments digested with restriction endonucleases that were purified from gels fell into two categories: fragments which were to be labeled for DNA sequencing, and fragments which had been labeled but were cut with a second enzyme to generate a singly end-labeled fragment. The procedure outlined below (Girvitz, 29) was used for both types of fragment, isolated from either polyacrylamide or agarose gels.

DNA fragments were run out on a 5% polyacrylamide gel (see above), stained with ethidium bromide and visualized under an ultraviolet light source. The portion of the polyacrylamide gel containing the desired DNA fragment was excised as a block with a scalpel. The block of polyacrylamide gel was then placed in an empty minigel pouring form, and molten 0.7% agarose (in 40 mM Tris-acetate, 2 mM EDTA, pH 7.5) was poured around the block. Once the agarose had hardened, a slice was made lengthwise with a scalpel at the boundary between the polyacrylamide gel section and the hardened agarose. A piece of Whatman 3 MM paper backed by dialysis membrane was cut to the approximate size of the polyacrylamide section and inserted into the incision. The 3 MM paper was between the dialysis membrane and the DNA fragment of interest.
Current was then applied for 15-30 minutes so that the DNA fragment migrated out of the polyacrylamide section and into the 3 MM paper. DNA was recovered from the 3 MM paper by placing the 3 MM paper and the dialysis membrane in separate Eppendorf tubes which had been punctured through the bottom with a hot 25 ga. syringe needle. These tubes were placed in 12 x 75 mm plastic test tubes and both the 3 MM paper and the dialysis membrane were washed with 200 µl of 0.2 M NaCl, 50 mM Tris-Cl, pH 7.5, 1 mM EDTA. The wash was collected in the bottom of the tubes by centrifugation at medium speed for 20-40 seconds in a table top centrifuge. Each sample received 3-4 washes, with the membrane wash used to wash the paper in the subsequent wash, and the contents of the collecting tubes were precipitated by the addition of two volumes of isopropyl alcohol overnight at -20°.

This procedure also works very well for agarose gels; however, it is not necessary to excise the desired DNA fragment as a block with a scalpel. An incision is simply made in front of the DNA fragment in the horizontal preparative agarose gel and the 3 MM paper and dialysis membrane are inserted as above. Electroelution is then performed as described.

Labeling of Double-Stranded DNA. DNA fragments which had been generated by restriction endonuclease digestion and purified by electroelution were labeled according to the type of ends generated by enzymatic cleavage. A slightly modified procedure was used for DNA fragments with blunt ends versus fragments with 5'-protruding ends.

Approximately 5-10 µg of DNA with 5'-protruding ends in 50 µl of distilled water was combined with 50 µl of 100 mM Tris-Cl, pH 9.0 and 1 µl of calf alkaline phosphatase (ca. 100 U/ml). This reaction was incubated for 1 hour at 37°, followed by the addition of 10 µl of 1 M NaCl, 0.25 M EDTA.
The DNA was extracted twice with one volume of phenol:chloroform (1:1) and twice with one volume of ether, and the contents were precipitated by the addition of 55 µl 7.5 M NH4OAc and 500 µl ethanol for 15 minutes in a CO2/ethanol bath. The DNA was pelleted by centrifugation in a microfuge for 15 minutes at 4°, drained, washed with 1 ml ice cold ethanol, respun for 5 minutes, drained and dried. Pelleted DNA was resuspended in distilled water, 5 µl 10X protruding kinase salts (0.5 M Tris-Cl, pH 7.6, 0.1 M MgCl2, 50 mM DTT, 1 mM spermidine, 1 mM EDTA), 0.2-0.5 mCi of [γ-32P]ATP (3000 Ci/mmole; ICN) and 3-5 units of T4 polynucleotide kinase (50 µl total volume). The kinase reaction was incubated at 37° for 60 minutes, whereupon the reaction was terminated with 50 µl of 0.3 M NaOAc. Labeled DNA was precipitated with 250 µl ethanol, incubated in a CO2/ethanol bath for 5 minutes, spun in a microfuge at 4° for 10 minutes, in some cases reprecipitated, drained and dried. Labeled plasmid DNA was then taken up in water and used for Maxam and Gilbert DNA sequencing, Smith-Birnstiel mapping or S1 nuclease mapping.

DNA fragments with blunt ends required a slight modification of the above protocol. Approximately 10-20 µg of plasmid DNA with blunt ends in 50 µl of 50 mM Tris-Cl, pH 9.0 was incubated for 2 hours at 60° with 2 µl calf alkaline phosphatase. Two additions of the phosphatase were made: 1 µl was added at the start of the reaction and a second 1 µl was added after 1 hour of incubation. After the two hours had elapsed, 200 µl of 0.3 M NaOAc was added, and the phosphatased DNA was extracted twice with one volume of phenol:chloroform (1:1), extracted twice with one volume of ether and precipitated with 2.5 volumes of ethanol for 15 minutes in a CO2/ethanol bath. DNA was spun in a microfuge for 15 minutes at 4°, drained, washed twice with 500 µl ethanol and dried.
The pellet was taken up in 11.5 µl distilled water, 2.5 µl 50 mM EGTA, 4.0 µl 50 mM spermidine and incubated for 3 minutes at 90° in an oil heating block. After 3 minutes, the DNA was immediately placed on ice for 1 minute, and then 1.0 µl 5 mg/ml BSA, 2.5 µl 10X kinase buffer (500 mM glycine-NaOH, pH 9.5, 100 mM MgCl2, 50 mM DTT, 1 mM EDTA, 30% glycerol), 0.2-0.5 mCi [γ-32P]ATP (3000 Ci/mmole; ICN) and 5-8 U of T4 polynucleotide kinase were added (25 µl total volume). The kinase reaction was incubated for 4 hours at 37°, whereupon the reaction was terminated and precipitated as above.

An additional method was used to label double-stranded DNA with 5'-protruding ends, so that the DNA could be read 3' → 5' in Maxam and Gilbert sequencing. This procedure allowed one to read the strand complementary to a kinased 5'-protruding end-labeled fragment. Basically, 5-10 pmoles of restricted DNA with 5'-protruding ends was combined with 30 µl [α-32P]dNTP (NEN, 800 Ci/mmole, 13 µM), 5 µl 10X cDNA salts (0.5 M Tris-Cl, pH 8.3, 0.6 M NaCl, 0.06 M MgCl2, RNase-free), 2.5 µl 0.4 M DTT, 2 µl AMV reverse transcriptase (14 U/µl) and 10.5 µl distilled water. The reaction was mixed and incubated at 37° for 1 hour, at which time the reaction was processed identically to the kinase reaction.

Maxam and Gilbert DNA Sequencing of Plasmid pBH6b-2.6. DNA sequencing by chemical modification was performed as outlined by Maxam and Gilbert (30), using at some points the modifications of Smith and Calvo (31). Maxam and Gilbert DNA sequencing requires a singly end-labeled substrate; therefore labeled DNA was cut with a restriction endonuclease, run out on a gel, electroeluted and precipitated. DNA to be sequenced was resuspended in 50 µl distilled water and 5 µl aliquots were placed in each of four 1.5 ml silanized Eppendorf tubes. Tubes were designated C, CT, AG and G to reflect the nucleotides which would be susceptible to chemical modification.
Each tube received 1 µl of salmon sperm DNA (1 µg/µl) to act as carrier and the contents were mixed. Tube C received 15 µl 5 M NaCl and 30 µl hydrazine, was incubated for 10 minutes on ice and the reaction was terminated with 200 µl of hydrazine stop buffer (0.3 M NaOAc, 0.1 mM EDTA, 25 µg/ml tRNA). The CT reaction was identical to the C reaction except that 15 µl of distilled water was substituted for the 15 µl of 5 M NaCl. Tube AG received 15 µl distilled water and 50 µl of formic acid, was incubated at 23° for 2 minutes and the reaction was stopped with 180 µl of hydrazine stop buffer. Reaction G consisted of 200 µl of DMS buffer (50 mM sodium cacodylate, pH 8.0, 1 mM EDTA) and 0.5 µl dimethyl sulfate; it was incubated for 10 minutes on ice and terminated with 50 µl DMS stop buffer (1.5 M NaOAc, pH 7.0, 1.0 M mercaptoethanol, 100 µg/ml tRNA).

From this point, all reactions were treated the same. Modified DNA was precipitated by the addition of 600 µl ethanol and 5 minutes incubation in a CO2/ethanol bath. The DNA was pelleted by centrifugation for 5 minutes in a microfuge at 4°, drained and taken up in 200 µl 0.15 M NaOAc followed by 500 µl ethanol. DNA was precipitated a second time as above, spun 5 minutes, drained, washed with 500 µl ice cold ethanol, briefly respun, drained and dried in a desiccator. Each pellet was resuspended in 50 µl 1 M piperidine (fresh) and incubated for 30 minutes at 90° in an oil heating block. Tubes were briefly placed on ice and given a quick spin to sediment condensation on the sides of the tubes, and the contents of each tube were placed in a fresh 1.5 ml Eppendorf tube. DNA was precipitated with 50 µl 0.3 M NaOAc and 250 µl ethanol as before, spun 5 minutes and drained. Each pellet was washed with 250 µl 70% ethanol, spun 5 minutes, drained, dried in a desiccator and taken up in 3 µl of FA loading buffer (90% formamide, 0.25 mM EDTA, 0.02% bromophenol blue, 0.02% xylene cyanol).
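The way a sequence is read from the four base-specific reactions described above can be sketched in a few lines of Python. This is only an illustration of the standard Maxam and Gilbert reading logic, not a procedure taken from this work: the function names and the boolean band representation are hypothetical, and real autoradiograms require judgment about band intensity that the sketch ignores.

```python
def call_base(g, ag, ct, c):
    """Deduce the base at one ladder position from the lanes showing a band.

    Arguments are booleans for the G, AG, CT and C lanes (illustrative
    representation). Returns 'N' when the pattern is ambiguous.
    """
    pattern = (g, ag, ct, c)
    if pattern == (True, True, False, False):
        return "G"  # cleaved in both the G and the A+G reactions
    if pattern == (False, True, False, False):
        return "A"  # cleaved only in the A+G reaction
    if pattern == (False, False, True, True):
        return "C"  # cleaved in both the C+T and the C reactions
    if pattern == (False, False, True, False):
        return "T"  # cleaved only in the C+T reaction
    return "N"


def read_ladder(rows):
    """Read a sequence from band patterns ordered from the labeled end."""
    return "".join(call_base(*row) for row in rows)


# Four hypothetical ladder positions, shortest fragment first.
example = [
    (True, True, False, False),
    (False, True, False, False),
    (False, False, True, True),
    (False, False, True, False),
]
print(read_ladder(example))
```

Positions that return 'N' correspond to the kind of C versus CT ambiguity that an additional thymidine-specific reaction can help resolve.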
Several other modifications of the Maxam and Gilbert DNA sequencing procedure were used. At one point, chemical modification in the C and CT reactions was proceeding too far (excess degradation). Both reactions were modified so that the incubation was carried out for 4 minutes at 23°; the reactions were terminated and the DNA precipitated as before. When it came time to add 200 µl 0.15 M NaOAc, an additional 20 µl of acetylacetone was added to completely neutralize the reactivity of any lingering hydrazine (32). This mixture was let stand 5 minutes at room temperature, 500 µl ethanol was added and the reaction was processed as before.

Ambiguity between the C and CT reactions was resolved by the incorporation of a fifth chemical modification reaction into the procedure. This reaction involved photoinduced cleavage of DNA to determine thymidine residues and was designated T>G (33). Briefly, 5 µl of singly end-labeled fragment was mixed with 5 µl of 2 M cyclohexylamine and exposed to an ultraviolet light source for 1 1/2 minutes (time can vary due to the intensity and distance of the light source). Then 100 µl 0.3 M NaOAc, pH 5.2, 1 µl salmon sperm DNA (1 mg/ml) as carrier and 500 µl ethanol were added to the DNA. DNA was precipitated as before, spun 10 minutes, drained, washed with ice cold ethanol and dried in a desiccator. Pelleted DNA was resuspended in 50 µl 1 M piperidine (fresh) and processed as above.

Prior to loading the gel, DNA samples were heated at 90° for 1 1/2 minutes in an oil heating block, cooled on ice for 1 minute and spun briefly to concentrate the sample. Three aliquots of 1 µl of each sample were loaded onto either a 6%, 8%, or 12% 84 x 18 cm, 0.4 mm thick polyacrylamide sequencing gel (for an 8% gel: 8% acrylamide, 0.4% Bis, 7 M urea, 90 mM TBE, filtered and degassed) in the order G, AG, CT, C (T>G optional).
After the initial 1 µl aliquot of each sample was loaded, the gel was electrophoresed 12-14 hours at 30-40 constant watts until the xylene cyanol had just run into the lower buffer well. At this time, a second set of each of the samples was loaded onto the gel and electrophoresed approximately 8 hours or until the xylene cyanol had migrated four-fifths of the way down the gel. A third set of samples was then loaded on the gel and electrophoresed approximately four hours or until the bromophenol blue had migrated two-thirds of the length of the gel. The buffer used for the sequencing set-up was 90 mM TBE. Gels were removed from between the glass plates, overlaid onto Whatman 3 MM paper and dried for 20 minutes at 80° on a BioRad Model 1125B gel drier. Dried gels were exposed to Kodak X-Omat AR5 film with or without intensifying screens, depending on the relative amounts of radioactivity.

Preparation and Isolation of Chicken RNA. Chickens were made anemic by injections of phenylhydrazine (2.5% w/v) over the course of 6 consecutive days. Anemic red blood cells were prepared and fractionated into cytoplasm and nuclei according to Longacre and Rutter (34). Cytoplasmic RNA was prepared by extensive phenol/chloroform extraction of red cell cytoplasm and ethanol precipitation. Poly(A)+ red cell RNA was also prepared according to Longacre and Rutter (34). All other RNAs were prepared by the method of Chirgwin et al. (35).

S1 Nuclease Mapping. A singly end-labeled fragment was normally used for S1 mapping; this was generated as above and taken up in 100 µl 0.3 M NaOAc, pH 7.0. The fragment was made RNase free by phenol:chloroform and ether extraction. Typically 50 µl of singly end-labeled fragment was placed in a 1.5 ml Eppendorf tube along with, for example, 40 µl total cytoplasmic red cell RNA (3 mg/ml), 10 µl RNase free 5 M NaCl and 300 µl ethanol.
A control was prepared with 50 µl singly end-labeled fragment, 10 µl RNase free 5 M NaCl, 40 µl RNase free distilled water and 300 µl ethanol in a 1.5 ml Eppendorf tube and run under conditions identical to those of the RNA-containing reaction. In some cases 100 µg of yeast tRNA was added to the control. Singly end-labeled DNA and cytoplasmic RNA were precipitated overnight at -20°, spun down for 15 minutes at 4° in a microfuge and drained. The pellet was washed once with 500 µl ice cold ethanol, spun 5 minutes in a microfuge at 4°, drained and dried. The pellet was resuspended in 20 µl FAHB (80% formamide, 0.4 M NaCl, 0.04 M PIPES, pH 6.4, 1 mM EDTA, RNase free), incubated for 2 minutes at 90°, then at 55° for 2 hours, followed by incubation at 50° for a final 3 hours. The S1 nuclease reaction was quenched with 300 µl 1X S1 salts (25% glycerol, 0.15 M NaOAc, pH 4.5, 5 mM ZnSO4, 250 mM NaCl, 50 µg/ml denatured salmon sperm DNA) and the reaction volume was divided equally into three tubes. Tube 1 typically received 50 units of S1 nuclease, tube 2 received 150 units and tube 3 received 400 units. These reactions were incubated at 37° for 15 minutes, after which 10 µl of 1 M NaCl, 0.25 M EDTA was added to each reaction. Each reaction was extracted once with one volume of phenol:chloroform (1:1), precipitated by the addition of 2.5 volumes of ethanol and incubation for 15 minutes in a CO2/ethanol bath, and spun 15 minutes in a microfuge at 4°. Pelleted DNA was drained, washed with 250 µl ice cold ethanol, spun 5 minutes, drained and dried. Each reaction was resuspended in 3 µl FA loading buffer (see above). As stated earlier, the no-RNA control was run under exactly the same reaction conditions. Exact RNA and S1 levels are described for specific experiments in the figure legends. S1 nuclease reactions were loaded on a 36 x 18 cm, 0.4 mm thick 8% sequencing gel and run for 2-3 hours at 25 constant watts (until the bromophenol blue had migrated two-thirds of the length of the gel).
Gels were removed from the glass plates, overlaid onto Whatman 3 MM paper and dried at 80° for 20 minutes on a BioRad gel drier. Dried gels were exposed to Kodak X-Omat AR5 film with intensifying screens for 2 days to 1 week, depending on the intensity of the radioactive signal.

Primer Extension Analysis. The labeled DNA fragment used for primer extension was prepared and hybridized to RNA as described above for S1 mapping. The RNA:DNA hybrid preparation was precipitated (above) and taken up in a 100 µl reaction identical to the one described for 3' end labeling of DNA fragments, except that all dNTPs were unlabeled and at 100 µM, and AMV reverse transcriptase was used at 560 U/ml. The reaction was incubated at 42° for 2 hr and phenol:CHCl3-extracted, and 5 µl of 5 N NaOH was added to the aqueous phase, followed by incubation at 90° for 5 min. The reaction was neutralized with 5 N HCl; 100 µl of 0.3 M NaOAc was added followed by 600 µl ethanol, and the DNA was precipitated and prepared for gel electrophoresis as described for the S1 analyses.

B. MATERIALS

Restriction endonucleases were purchased from New England Biolabs, Bethesda Research Labs, Amersham, Biotec and Collaborative Research, Inc. AMV reverse transcriptase was obtained from Life Sciences, Inc. through the Office of Program Resources and Logistics, Viral Cancer Program, National Institutes of Health. Ribonuclease A, lysozyme and ampicillin were purchased from Sigma. T4 polynucleotide kinase was purchased from Amersham and Worthington. T4 DNA ligase was purchased from Worthington. Calf alkaline phosphatase was purchased from Boehringer Mannheim and further purified by column chromatography by D. Grandy (our lab). [α-32P]Deoxynucleotide triphosphates were purchased from both Amersham and New England Nuclear. Hydrazine, DMS and x-ray film were from Eastman Kodak. S1 nuclease was purchased from PL Biochemicals.
Dimethylsulfoxide was purchased from Aldrich, and formic acid, cyclohexylamine and piperidine were purchased from Fisher Scientific. Glass plates, combs and sequencing stands were all purchased from Bethesda Research Laboratories.

RESULTS

Outline of Protocol. As stated earlier, the first example of a histone gene which contained intervening sequences was isolated and characterized by Engel, Sugarman and Dodgson (21). This gene was isolated from a set of λ Charon 4A recombinant clones known to contain chicken histone DNA sequences. Initially, Engel and Dodgson (16) had screened a chicken DNA-containing λ Charon 4A recombinant library with heterologous sea urchin histone H2A and H3 probes. Sugarman et al. (18) subsequently mapped and characterized 15 of these λ recombinants extensively, and they went on to demonstrate the existence of H1, H2A, H2B, H3 and H4 chicken histone DNA sequences within the clones. These workers also tested whether any of the λ chicken histone clones contained chicken histone H5 DNA sequences. Since it was not known at the time whether histone H5 was linked to other histone genes, the 50 original λ recombinants were screened for sequences typical of H5. At that time, histone H5 was the only vertebrate histone known to have its message polyadenylated (8), and it was also known to be expressed specifically in red blood cells. Thus, a cDNA probe complementary to red cell poly(A)+ mRNA was hybridized to the set of λ histone clones. It has since been shown that H5 is not linked to any of the other histone genes. However, several of the 50 histone clones did in fact hybridize to the cDNA made to red cell poly(A)+ mRNA. The first of these clones to be characterized (λCH4b) turned out not to contain the H5 histone gene; instead it contained a histone H3 gene (H3.3-1) which, surprisingly, contained two intervening sequences that split the coding regions of the gene.
This gene has been shown to code for a variant H3 histone protein designated H3.3. Although the H3.3-1 gene hybridized to mRNA in the poly(A)+ fraction, the more common histone H3 gene (H3.2) was shown not to produce polyadenylated mRNA. Originally Engel, Sugarman and Dodgson proposed that the H3.3-1 clone may have hybridized to the oligo-dT primed cDNA to poly(A)+ mRNA because of a d(A11GA9) sequence in the mRNA antisense strand 255 base pairs 3' to the termination codon (21). If transcribed, this sequence could account for the selection of the gene's RNA on oligo-dT cellulose. However, it now appears that the message encoded by this variant gene may indeed be post-transcriptionally polyadenylated (J.D. Engel, personal communication). This H3.3-1 gene was the first histone gene from any organism shown to contain intervening sequences.

Since the initial characterization of the H3.3-1 variant histone gene (1982), one other report has appeared in the literature pertaining to the isolation of histone genes with intervening sequences. Woudt et al. (24) have demonstrated the existence of one intron in a histone H3 gene and two introns in a histone H4 gene from Neurospora crassa. It also appears that a variant chicken H2A histone gene (Showman et al.; J.R.E. Wells, personal communication) and a human H3.3 histone gene (L. Kedes, personal communication) contain introns.

This report describes efforts to characterize a second clone isolated from the chicken histone λ recombinants (λCH6b) (16), which hybridized both to the heterologous sea urchin histone H3 probe and to the probe made from cDNA prepared against adult red cell poly(A)+ mRNA (21). To determine whether this clone also contained a histone H3 gene possessing intervening sequences, it was necessary to determine its DNA sequence.
This would tell us whether the presence of introns was a common feature of several chicken histone genes, and thus whether there may be a whole family of such genes which also show similarities in their pattern of expression (e.g. replacement histones).

The chicken histone H3-hybridizing region which was subcloned and characterized in this report was contained in the λ Charon 4A recombinant clone designated λCH6b (Figure 2). This λ recombinant was initially restriction mapped by Sugarman et al. (18), and the 2.3 kilobase pair (kb) Bam HI/Hind III fragment was shown to hybridize both to the heterologous H3 sea urchin (and Drosophila) probe and to cDNA prepared against adult red cell poly(A)+ mRNA (J. Dodgson, unpublished results). The 2.3 kb H3-hybridizing fragment was excised with a Bam HI/Hind III double digest, isolated and subcloned into the plasmid vector pBR322. The plasmid containing the H3-hybridizing region from λCH6b ligated to Bam HI/Hind III-cut pBR322 was designated pBH6b-2.6.

A fine structure restriction endonuclease map was generated for the 2.3 kb insert fragment of pBH6b-2.6 (Figure 3). This map was important for two reasons. Of major importance was the fact that knowledge of the restriction endonuclease sites determined my sequencing strategy. This strategy is illustrated in Figure 3 above the restriction map, each arrow indicating the site end-labelled and the direction and approximate distance sequenced. Of secondary importance was the localization of unique restriction endonuclease sites which were commonly placed in the coding portions of H3.3-1 and the 2.3 kb insert of pBH6b-2.6. For example, the Pvu II site present in the first exon of H3.3-1 is the only Pvu II site present in the entire gene. There is also only one such site in the 2.3 kb insert (see Figure 3). Therefore, this was chosen as a good region in which to look for sequence homology to the coding portion of the H3.3-1 gene.

Figure 2. The λ Charon 4A recombinant λCH6b. Restriction endonuclease sites within the recombinant are represented by symbols; the 2.3 kb H3-hybridizing region is designated by a solid block. [Remainder of legend illegible.]

Figure 3. Restriction endonuclease map, Maxam and Gilbert sequencing strategy and relative position of the H3.3 histone gene for subclone pBH6b-2.6. Restriction endonuclease sites are indicated with symbols. Arrows show the site end-labeled, the direction and the distance sequenced. [Remainder of legend illegible.]

The complete DNA sequence of the 2.3 kb insert of pBH6b-2.6 was determined by the method of Maxam and Gilbert (Figure 4). Initial comparison of the DNA sequence around the Pvu II site of the 2.3 kb insert fragment with the DNA sequence around the Pvu II site in the first exon of H3.3-1 revealed extensive homology. Twenty-seven nucleotides around the Pvu II site were identical between the two. Additionally, a unique EcoRV site is present in the third exon of H3.3-1. Therefore, I searched for additional homology in the DNA sequence around the two EcoRV sites present in the 2.3 kb insert of pBH6b-2.6 (see Figure 3). The existence of such homology would orient the gene and suggest an area upon which to concentrate further DNA sequencing efforts. DNA sequence homology with the third exon of H3.3-1 (24 of 27 nts) was demonstrated around the right-hand EcoRV site (see Figure 3, position 1415). This last finding turned out to be rather surprising. Initially, it was thought that if this clone represented a variant histone gene similar to H3.3-1, then the intron sizes would be similar.
We based this assumption on the finding that the intervening sequences in, for example, globin genes diverge heavily in sequence between different species, but their positions and approximate sizes are highly conserved. The fact that the distance between these [...]

Figure 4. Complete nucleotide sequence of the 2.3 kb insert of subclone pBH6b-2.6. Upper case letters designate sequences present within the spliced transcript. Numbering of the sequence starts at the site where transcription is initiated, the cap site (+1); sequences upstream are designated with negative numbers. [Remainder of legend and the sequence itself illegible.]

Figure 5. Comparison of the protein and nucleotide sequences of histone H3.3-2 to those of histones H3.3-1 and H3.2. Line A is the nucleotide sequence of H3.3-2; above it is the amino acid sequence of the H3.3 variant polypeptide coded for by H3.3-2. Lines B and C represent differences in DNA sequence between H3.3-2 and H3.3-1 (B) or H3.2 (C). [Remainder of legend and the aligned sequences illegible.]

[...] that H3.3-2 is almost as divergent from H3.3-1 as from H3.2, even though the two H3.3 genes code for identical protein sequences. The implications of this finding will be considered further in the Discussion.

The location of the three exons of H3.3-2 relative to the PvuII and EcoRV sites can be seen in Figure 3. The total DNA sequence analysis of H3.3-2 clearly shows that the gene contains two small intervening sequences that interrupt the coding portions of the gene (see Figure 4). These two intervening sequences occur at exactly the same sites within the coding sequence of H3.3-2 as do the introns in H3.3-1 (5' and 3' relative to transcription). However, these intervening sequences are 80 base pairs and 88 base pairs, respectively, versus 766 and about 2,800 base pairs for the introns of H3.3-1 (see Figure 8). As I stated earlier, we might have anticipated that both the locations and the sizes of the introns would be conserved. Only one of these assumptions turned out to be true: the locations are conserved, but the sizes are not.

Several other important pieces of information could be deduced from the DNA sequence of pBH6b-2.6 (see Figure 4).
Examination of the intron-exon junctions in H3.3-2 reveals the proper consensus donor and acceptor sites (36). No termination codons are present within the protein coding portions of H3.3-2. Additionally, the proper start codon (ATG) and stop codon (TAA) are present flanking the H3.3-2 gene. In fact, it appears that this gene contains the necessary information required to code for a full-length, functional H3.3 variant histone polypeptide. This suggested, but did not prove, that the H3.3-2 gene was actually expressed.

However, analysis of the DNA sequence some 200 bp upstream of the ATG start site for the H3.3-2 protein did not reveal any consensus sequences for the initiation of transcription by RNA polymerase II (Figure 4). Neither a TATA sequence nor a CCAAT box was identifiable, and this suggested that one of three possibilities must hold for H3.3-2. It was conceivable that this gene might not be transcribed, despite the encouraging data previously mentioned. It was also possible that this gene did not contain the "standard" consensus promoter sequences typical of most other RNA polymerase II-transcribed genes. Since consensus promoter sequences were also not seen directly upstream of the first coding exon of H3.3-1, novel sequences involved in the initiation of transcription might exist for both genes. A third alternative was that H3.3-2 might contain another intervening sequence (or several) in the 5'-nontranslated portion of the gene. The existence of intervening sequences in the 5'-nontranslated portion of a gene has been shown for both ovalbumin and insulin genes (23).

To determine whether a novel putative promoter sequence existed or whether another intervening sequence in the 5'-nontranslated portion had gone undetected, S1 nuclease mapping was undertaken. S1 mapping utilizes a specific single-stranded, singly end-labelled DNA probe to determine where a message begins or where an intron-exon junction occurs.
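The search for promoter consensus elements described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the motif patterns are simplified forms of the TATA and CCAAT consensus sequences, only the given strand is scanned, and the test sequence is a made-up toy string, not the pBH6b-2.6 sequence. The function and variable names are hypothetical.

```python
import re

# Simplified consensus patterns (an assumption for illustration; real promoter
# elements tolerate more variation than these patterns allow).
MOTIFS = {
    "CCAAT box": "CCAAT",
    "TATA box": "TATA[AT]A",
}


def find_motifs(seq):
    """Return 1-based start positions of each motif on the given strand."""
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # The lookahead lets overlapping matches be reported as well.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", seq)]
    return hits


# Hypothetical toy sequence, NOT taken from the thesis.
toy_upstream = "GGCCAATCGGCGCTATAAAGGCGCGCACTCAGT"

for name, positions in find_motifs(toy_upstream).items():
    print(name, positions if positions else "not found")
```

A region with no hits for either pattern, as reported for the 200 bp upstream of the H3.3-2 ATG, would simply return empty lists for both motifs.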
The requirements for the probe in this case were that the DNA be singly end-labeled inside the first exon and that the probe extend 5' to the ATG protein start site. Initially, the singly end-labelled DNA probe is denatured and hybridized to the RNA at a temperature chosen to favor the formation of RNA:DNA hybrids over reannealing of the two DNA strands of the probe (see Materials and Methods). Once the hybrids are formed, they are treated with S1 nuclease, which will attack single-stranded DNA but not RNA:DNA hybrids. The RNA:DNA hybrid formed between the first coding exon and the message will protect labelled DNA of a length equal to the distance between the label and the beginning of the exon. The shortened DNA fragment is denatured and run out on a polyacrylamide gel with size markers. The size of the fragment tells where transcription is initiated or where the nearest intron-exon junction is, since the label and the exon-length fragment are protected but the intron does not hybridize with the mRNA and is degraded (intron sequences are not present in mature cellular mRNA). This method would therefore shed some light on which of the possible explanations outlined above was correct: no H3.3-2 transcription occurred, the sequences surrounding the transcription initiation site were novel, or an undetected intron existed in the 5'-nontranslated region of H3.3-2.

The results of such an S1 mapping experiment are shown in Figure 6A. The source of RNA was total cytoplasmic RNA from chicken anemic reticulocytes. Since this gene was isolated by hybridization to a probe made of cDNA prepared against adult anemic red cell poly(A)+ mRNA, it was known that the H3.3-2 gene was represented in the above RNA. The probe used in this case consisted of a 932 base pair PvuII/BamHI fragment isolated from pBH6b-2.6 (see Figure 3). This fragment was singly end-labelled at the PvuII site (644), placing the label 60 base pairs from the A of the initiator ATG.
The probe contained 875 base pairs upstream of this ATG sequence. As we can see in Figure 6A (lanes 2 and 4), a strong signal is present at approximately 72-75 base pairs. This would mean that only about 15 bases upstream of codon one were protected. Examination of the DNA sequence data in this area (Figure 4) reveals no consensus promoter sequences. However, there is a consensus intron splice acceptor site (3' end of an intervening sequence, Figure 9) in this region, around +570 in Figure 4. These data therefore suggested that at least one further intron existed in the 5'-nontranslated region of H3.3-2.

Figure 6. Identification of the leader exon of the H3.3-2 gene. (A) S1 nuclease analysis of the splice acceptor site 5' to the ATG initiation codon. A DNA fragment singly end-labeled at the PvuII site (+644, Figure 4) and extending 931 nucleotides to the Bam HI site of pBH6b-2.6 (see Figure 4) was prepared as outlined in Materials and Methods. Approximately 0.5 µg of the singly end-labeled DNA fragment was hybridized either in the absence of added RNA (Lanes 1 and 3) or to 400 µg of total anemic red cell cytoplasmic RNA (Lanes 2 and 4). After hybridization, reactions were quenched into S1 reaction buffer, split into three aliquots, digested and processed as outlined in the Materials and Methods. S1 nuclease levels were 1500 U/ml (Lanes 1,2) and 4000 U/ml (Lanes 3,4). (Equivalent results were obtained with 500 U/ml, not shown.) (B) Primer extension analysis of H3.3-2. Approximately 1 µg of the singly end-labeled DNA fragment described in A was digested with the restriction endonuclease Hae III and the resulting 56 bp singly end-labeled PvuII/Hae III fragment isolated. This fragment was hybridized to either 500 µg of yeast tRNA (Lane 1) or 300 µg of total red cell cytoplasmic RNA (Lane 2). Hybridization and primer extension with AMV reverse transcriptase are described in Materials and Methods. (C) Localization of the 5' end of the H3.3-2 mRNA by S1 nuclease analysis. A DNA fragment singly end-labeled at the EcoRV site (+69, Figure 4) and extending 357 nucleotides to the Bam HI site of pBH6b-2.6 (see Figure 4) was prepared and hybridized either in the absence of added RNA (Lanes 1,3) or to 120 µg of total anemic red cell cytoplasmic RNA. Digestion reactions were carried out as in A, with S1 nuclease levels at 500 U/ml (Lanes 1,2) and 1500 U/ml (Lanes 3,4). All reactions contained in this figure were run out on 8%, 0.4 mm sequencing gels (Maxam and Gilbert, 30; Materials and Methods), dried and exposed to one intensifying screen for 3-7 days. Arrows indicate the position of labeled marker DNA fragments (Hinf I digest of plasmid pBR322) run in separate lanes; the vertical axis gives DNA fragment size (nt). The figure below the autoradiogram represents the location of the intervening sequence in the 5' nontranslated leader sequence and the location of the two singly end-labeled fragments used in the above S1 nuclease reactions.

Note also that this S1 protection experiment shows that the H3.3-2 gene (or one of identical sequence) is specifically expressed in red cell RNA. The H3.3-1 and the H3.2 genes, for example, diverge completely from H3.3-2 upstream of the ATG initiation codon. Thus, at best, transcripts from these genes could protect only 60 bases of the probe (the PvuII site to ATG distance). In fact, weak bands at about 60 bases are seen in the gel, possibly due to cross-hybridization to transcripts from these other genes. However, the bands at 72-75 bases are almost certainly due specifically to transcription of the H3.3-2 gene. Indeed, since probe DNA is in considerable sequence excess over the H3.3-2 transcript in the RNA, the intensity of the bands at 72-75 bases is a specific measure of the level of H3.3-2 transcription in a given RNA sample (as discussed later).
This was confirmed by using varying levels of RNA in the S1 experiment (results not shown).

Primer extension was next used to attempt to verify the existence of the intron in the 5'-nontranslated portion of H3.3-2 and to estimate the amount of exon sequence 5' to the first coding exon. In this experiment, a singly end-labeled primer is hybridized to the message and the primer is extended to the end of the mRNA using AMV reverse transcriptase. In this case, the probe used in the previous S1 mapping experiment (Pvu II*/Bam HI) was cut with Hae III, and a 56 bp primer fragment contained entirely in the first coding exon of H3.3-2 was generated (see Figure 4). This was hybridized to the total red cell cytoplasmic RNA and the primer extended as indicated (Materials and Methods). The labeled DNA was denatured and run out on a polyacrylamide gel (Figure 6B). Because primer extension can be inefficient and terminate prematurely, several bands are seen above the unextended primer band in Figure 6B. However, the largest significant band probably results from complete extension of the primer to the 5' end of the mRNA. In this case, that band is about 180 bases in length. This distance should equal the length of the primer (56 bases) plus the distance between the primer and the splice acceptor site (another 18 bases) plus the length of all further 5' exons (one or more). This latter distance is therefore approximately 106 bases (180 - 56 - 18). From this, it became clear that, as expected from the analysis of the sequence data, the site identified by the initial S1 mapping was an intron-exon junction and not a transcriptional start site. If the latter were the case, the primer should not have been extended past this point (74 bases). However, in this case the primer was extended approximately 100 additional bases, so there exists another exon of about 100 base pairs 5' to the coding portion of the gene (or more than one exon whose total length is about 100 base pairs).
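The fragment-length reasoning used in the S1 and primer extension experiments can be laid out as a short worked calculation. The sketch below uses only numbers quoted in the text (label at +644, splice acceptor near +570, 56-base primer, extension product of about 180 bases); the function names are hypothetical, and the positions are approximate, so the results should be read as estimates rather than exact coordinates.

```python
# Worked arithmetic for the S1 protection and primer extension results.
# All positions come from the text and are approximate.

PVUII_LABEL_POS = 644        # labeled PvuII site (numbering of Figure 4)
SPLICE_ACCEPTOR_POS = 570    # approximate position of the splice acceptor
PRIMER_LENGTH = 56           # PvuII/Hae III primer fragment
EXTENSION_PRODUCT = 180      # largest significant primer extension band


def s1_protected_length(label_pos, exon_start_pos):
    """Length of labeled DNA protected between the label and the exon start."""
    return label_pos - exon_start_pos


def upstream_exon_length(extension_product, primer_length, primer_to_acceptor):
    """Length of exon sequence lying 5' to the splice acceptor site."""
    return extension_product - primer_length - primer_to_acceptor


protected = s1_protected_length(PVUII_LABEL_POS, SPLICE_ACCEPTOR_POS)
primer_to_acceptor = protected - PRIMER_LENGTH
upstream = upstream_exon_length(EXTENSION_PRODUCT, PRIMER_LENGTH, primer_to_acceptor)

print(f"Expected S1-protected fragment: about {protected} bases")
print(f"Primer to splice acceptor:      about {primer_to_acceptor} bases")
print(f"Exon sequence further 5':       about {upstream} bases")
```

The expected protected length of 74 bases falls within the observed 72-75 base signal, and the remaining 106 bases correspond to the roughly 100 base pair upstream exon (or exons) inferred above.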
Since an intron in the 5'-nontranslated region of H3.3-2 was indicated, the next step was to find out exactly where transcription was initiated. For this, a second probe was constructed. By examining the DNA sequence upstream of the consensus acceptor site, we were able to locate a region which resembled a consensus intron donor site (the 5' end of an intron, Figure 9). This region is apparent at position 101 in Figure 4. (Furthermore, about 130 base pairs upstream of this donor are sequences which resemble consensus promoter sequences; see below.) In constructing a DNA probe for the second round of S1 mapping, it was important that the label be upstream of this region. The label must be in a sequence that is present in the mRNA if it is to be protected from S1 nuclease digestion. The probe also had to be long enough to extend past the promoter, or past another splice site if additional introns were implicated. The fragment chosen in this case was a 357 base pair EcoRV*/BamHI fragment isolated from pBH6b-2.6. The EcoRV site was end-labeled with [γ-32P]ATP (see Materials and Methods), and the fragment was hybridized to RNA and subjected to S1 digestion (Figure 6C) as before. A strong signal was apparent at around 69 bases, indicating the location of the putative transcriptional start site. A single exon extending from this site (cap site, +1 in Figure 4) to the splice donor sequence discussed above (+102 in Figure 4) would account for the length of upstream mRNA sequence which was copied in the primer extension experiment. As mentioned above, consensus promoter sequences are visible in the appropriate positions upstream of the predicted transcription initiation or cap site (see Figure 4). In sum, these data are consistent with the exon organization of the 5' end of the H3.3-2 gene shown at the bottom of Figure 6.

Results of experiments described later suggested that the H3.3-2 mRNA, like that of H5 and possibly H3.3-1, is polyadenylated.
The appropriate signal sequence for polyadenylation, AATAAA (in DNA), was seen at position 1492 (see Figure 4). From this, we predicted the actual poly(A) addition site by comparison to the 3' ends of other polyadenylated messages, as shown in Figure 4. Due to the low level of mature mRNA made from the H3.3-2 gene, it has not been possible to date to map the actual polyadenylation site definitively. Further attempts at this mapping are in progress.

At this point it is interesting to compare the H3.3-2 gene in detail to other analogous genes. Figure 7 is a comparison of the promoter region sequence between the variant H3.3-2 gene and the major chicken erythrocyte H3 histone, H3.2. Two major observations arise from examination of this figure. The first is that the spacing between the promoter elements seems to be fairly well conserved. Whether these distances are important with respect to orientation of the promoter during transcription initiation is unknown, but it appears that these distances are conserved in a wide variety of genes transcribed by RNA polymerase II (although not all). The second concerns the conservation of the "CCAAT" box, "TATA" box and "cap" box (or RNA initiation site) relative to the consensus sequence. The "TATA" box appears to be the most highly conserved of the promoter sequences, with the "CCAAT" box next, followed by the "cap" box. In addition, the "cap" box of the H3.3-2 gene looks very different from those of most genes, since it is not nearly as pyrimidine-rich. The H3.3 variants may contain weak promoters corresponding to their low level of expression, and this unusual "cap" box may be related to the H3.3-2 promoter strength.

The next features of the H3.3-2 gene to be compared are the introns.
Figure 8 shows a bar graph representation of the size and location of the introns contained in H3.3-2 compared to those in [...]

Figure 7. Comparison of major chicken H3 histone (H3.2) versus variant chicken H3 histone (H3.3-2) consensus sequences thought to be essential for the proper initiation of transcription. [Aligned "CCAAT", "TATA" and "cap" box sequences illegible.]

Figure 8. Bar graph representation of the variation in intron size among the histone genes known to contain intervening sequences. Open regions represent introns. [Remainder of legend illegible.]

Figure 9. Comparison of the intron/exon junctions within the histone genes known to contain intervening sequences.

[...] kb) or that the exact number of introns within the 5' nontranslated region of H3.3-1 has not been determined yet (J.D. Engel, personal communication). It appears that more than one intron may be present in this region of the H3.3-1 gene.

From such a comparison, what can we now conclude? There are at least two histone genes (maybe more) which constitute the H3.3 replacement variant histone genes. Whether or not this family is larger is unknown at this time. Furthermore, the two H3.3 genes we have studied are the only true replacement histone genes characterized to date. The red cell-specific H5 histone has occasionally been called a replacement histone, but it more probably represents a different histone class, the tissue-specific histones.
As other examples of replacement genes surface, as I feel they will, our understanding of these variant histone genes will grow. Already the two H3.3 genes exhibit similarities to the tissue-specific H5 histone gene. H5 mRNA is polyadenylated and, like the H3.3 genes, the H5 gene is not linked to other histone genes; however, it does not contain intervening sequences. It is possible that polyadenylated messenger RNA and intervening sequences may be two characteristics of all true replacement histone genes. Whether this is true is not possible to say at this time, but these data and those of Engel, Sugarman and Dodgson do suggest areas in which to concentrate research efforts aimed at further characterizing replacement histone genes and their expression.

The final topic to be considered in this report is the expression of H3.3-2. Throughout the discussion of the structure of the H3.3-2 gene, I have noted that the sequence data suggested that H3.3-2 was expressed. Indeed, the success of the S1 nuclease analysis used to pinpoint splice sites and the start of transcription (see Figures 6 and 10) confirms that H3.3-2 must be expressed. This statement rests on information presented earlier in the Results section. There I detailed how a singly end-labeled DNA fragment was hybridized to RNA and then subjected to S1 nuclease digestion. Only regions where RNA:DNA hybrids form are protected, since S1 nuclease specifically degrades single-stranded DNA. (Conditions of the reaction are optimized to prevent as much reannealing of the probe DNA (DNA:DNA hybrids) as possible, to decrease spurious results.) The source of RNA was total cytoplasmic red cell RNA, and the single-stranded, singly end-labeled DNA probe would escape complete degradation only if a complementary messenger RNA existed in the mRNA pool to hybridize to the DNA probe (RNA:DNA hybrid). The fact that a 75 bp
signal appeared in Figure 6 of the Results was evidence that a H3.3-2 transcript must exist in the total cytoplasmic red cell RNA pool. Additionally, the argument was made in the Results, based on the intensities of the H3.3-2 and H3.2 probe signals, that the 75 bp protected fragment was specific for the H3.3-2 gene itself and no other (see below). Thus, it is possible to conclude from the S1 data that H3.3-2 is expressed.

In addition, the S1 analysis revealed several other characteristics of H3.3-2 expression. If we look at the S1 analysis (see Figure 10), we see further evidence of the sequence divergence that must exist among the H3.3 replacement variants. A strong signal is apparent at 75 bp, with very little signal at 60 bp, for H3.3-2. However, just the opposite is true for H3.2: a strong signal is present at 60 bp and a weaker signal at 100 bp. As explained in the Results, a strong 60 bp signal is representative of expression of all H3.2 genes, whereas the weak 100 bp band indicates expression of one particular H3.2 gene. Likewise, the strong 75 bp band indicates the specific expression of H3.3-2, and the weak band at 60 bp presumably results from weak cross-hybridization with other members of the family of H3.3 genes. H3.3-2 thus shows very little hybridization to other members of its family, which might be expected if all the putative H3.3 genes are as different in primary sequence as are H3.3-1 and H3.3-2. Thus the sequence data for H3.3-2 and H3.3-1, along with the S1 data, indicate that the H3.3 genes probably exist as a much more divergent population than that of the H3.2 subfamily. This is confirmed by sequence comparisons of two H3.2 genes analogous to our comparison of H3.3-1 and H3.3-2 (Doug Engel, personal communication). An offshoot of this is that the signals in lanes A through F (see Figure 10) represent very specific indications of the level of H3.3-2 expression in various tissues.
Note that H3.3-2 appears to be expressed in both dividing and nondividing tissues at a relatively low (basal) level. This trait is characteristic of replacement variant histones as described by Wu and Bonner (26). In contrast, H3.2 is expressed to a greater extent in dividing tissue, which is characteristic of replication variant histones. From the S1 analysis and other data not presented, we can ascertain that H3.3-2 is expressed at approximately 5% of the level of total H3.2 expression in nondividing or slowly dividing adult tissues such as reticulocytes and liver. Despite this, H3.3 has been shown to account for up to 50% of histone H3 levels in adult (nondividing) tissue (43). From the S1 data it is possible to see that H3.3-2 shows no large increase in expression between dividing and nondividing tissues. This poses the puzzle of how the increase in H3.3 histone content can be accounted for. It is doubtful that the increase can be attributed solely to an increase in the level of transcription, since the S1 data presented here do not bear this out. More than likely other factors are also involved, such as differences in turnover rate and translational efficiency. It might even be possible that as H3.2 levels decrease, H3.3 may be able to compete more effectively for binding to DNA and thus increase its relative protein stability. Whatever the mechanism, additional work must be done before we completely understand the functional differences between replication and replacement variant histones and the regulation of their expression.

REFERENCES

1. McGhee, J.D., and Felsenfeld, G. (1980) Ann. Rev. Biochem. 52:1115-1153.
2. Isenberg, I. (1979) Ann. Rev. Biochem. 48:159-191.
3. Weisbrod, S., Groudine, M., and Weintraub, H. (1980) Cell 12:289-301.
4. Lewin, B. (1980) Gene Expression 2, Wiley Interscience, New York, pp. 915-918.
5. Darnell, J.E., Jr. (1982) Nature 221:365-371.
6. Nurse, P. (1983) Nature 302:378.
7. Kedes, L. (1979) Ann. Rev. Biochem. 48:837-870.
8. Zweidler, A. (1980) in Gene Families of Collagen and Other Proteins (Prockop and Champe, eds.), Elsevier North Holland, Inc., pp. 47-56.
9. Lewin, B. (1983) Genes, John Wiley and Sons, Inc., New York, pp. 456-499.
10. Hentschel, C., and Birnstiel, M. (1981) Cell 25:301-313.
11. Zweidler, A. (1977) Methods Cell Biol. 17:223-233.
12. Stein, G., Stein, J., and Marzluff, W. (1984) Histone Genes, John Wiley and Sons, New York.
13. Showman, R.M., Wells, D.E., Anstrom, J., Hursh, D.A., and Raff, R.A. (1982) Proc. Natl. Acad. Sci. USA 79:5944-5947.
14. Karp, R. (1979) Ph.D. Thesis, Stanford University, Palo Alto, California.
15. Crawford, R.J., Krieg, P., Harvey, R.P., Hewlish, D.A., and Wells, J.R.E. (1979) Nature 279:132-136.
16. Engel, J.D., and Dodgson, J.B. (1981) Proc. Natl. Acad. Sci. USA 78:2856-2860.
17. Harvey, R.P., Krieg, P.A., Robins, A.J., Coles, L.S., and Wells, J.R.E. (1981) Nature 294:49-53.
18. Sugarman, B.J., Dodgson, J.B., and Engel, J.D. (1983) J. Biol. Chem. 258:9005-9016.
19. Krieg, P.A., Robins, A.J., Gait, M.J., Titmas, R.C., and Wells, J.R.E. (1982) Nucleic Acids Res. 10:1495-1502.
20. Ruiz-Vasquez, R., and Ruiz-Carillo, A. (1982) Nucleic Acids Res. 10:2093-2108.
21. Engel, J.D., Sugarman, B.J., and Dodgson, J.B. (1982) Nature 297:434-436.
22. Urban, M.K., Franklin, S.G., and Zweidler, A. (1979) Biochemistry 18:3952-3960.
23. Breathnach, R., and Chambon, P. (1981) Ann. Rev. Biochem. 50:349-383.
24. Woudt, L.P., Pastink, A., Kempers-Veenstra, A.E., Jansen, A.E.M., Mager, W.H., and Planta, R.J. (1983) Nucleic Acids Res. 11:5347-5360.
25. Childs, G., Nocente-McGrath, C., Lieber, T., Holt, C., and Knowles, J.A. (1982) Cell 31:383-393.
26. Wu, R.S., Tsai, S., and Bonner, W.M. (1982) Cell 31:367-374.
27. Maniatis, T., Fritsch, E.F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Press, Cold Spring Harbor, New York.
28. Hanahan, D. (1983) J. Mol. Biol. 166:557-580.
29. Girvitz, S.C., Baccheti, S., Rainbow, A.J., and Graham, F.L. (1980) Anal. Biochem. 106:492-496.
30. Maxam, A., and Gilbert, W. (1980) Meth. in Enz. 65:499-560.
31. Smith, D.R., and Calvo, J.M. (1980) Nucleic Acids Res. 8:2255-2274.
32. Jay, E., Seth, A.K., Romens, J., Sood, A., and Jay, G. (1982) Nucleic Acids Res. 10:6319-6329.
33. Simoncsits, A., and Torok, I. (1982) Nucleic Acids Res. 10:7959-7964.
34. Longacre, S., and Rutter, W.J. (1977) J. Biol. Chem. 252:2742-2752.
35. Chirgwin, J., Przybyla, A., MacDonald, R.J., and Rutter, W.J. (1979) Biochemistry 18:5294-5299.
36. Mount, S. (1982) Nucleic Acids Res. 10:459-471.
37. Grandy, D.K., Engel, J.D., and Dodgson, J.B. (1983) in Gene Expression, Alan R. Liss, Inc., New York, pp. 445-455.
38. Proudfoot, N.J., and Brownlee, G. (1976) Nature 263:211-216.
39. Gill, A., and Proudfoot, N.J. (1984) Nature 312:473-474.
40. Wieringa, B., Hofer, E., and Weissman, C. (1984) Cell 37:915-925.
41. Keller, E.B., and Noon, W.A. (1984) Proc. Natl. Acad. Sci. USA 81:7417-7420.
42. Noon, W.A. (1984) Cell 39:423-425.
43. Urban, M., and Zweidler, A. (1983) Dev. Biol. 95:421-428.