SEQUENCE ANALYSIS OF BOVINE PROLACTIN MESSENGER RNA

By

Nancy Louise Sasavage

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Biochemistry

1981

ABSTRACT

The mechanisms for control of eukaryotic gene expression are currently a major area of research in molecular biology. A primary target of these studies is the synthesis and processing of mRNA molecules for specific proteins. We have chosen to study the expression of the bovine prolactin (bPRL) and bovine growth hormone (bGH) genes from the anterior pituitary as one model of gene expression. An understanding of the molecular basis for the regulation of these genes requires a knowledge of the primary structure of the specific mRNAs. In this thesis research, I have developed a method utilizing bGH mRNA to directly determine the 3'-noncoding sequence of enriched mRNA molecules, and I have analyzed the complete sequence of bPRL mRNA.

Pituitary mRNA was fractionated on sucrose density gradients and identified as bPRL or bGH mRNA by in vitro translation. The two mRNA templates were tested for specific initiation of reverse transcriptase-directed cDNA synthesis with 12 primers of the general sequence d(pT8-N-N'). A single primer sequence was complementary to the 3'-terminus of bGH mRNA; however, five sequences initiated bPRL cDNA synthesis. The specific primers were used to determine the 3'-noncoding regions of these mRNAs by standard methods of DNA sequence analysis and to suggest the nature of processing events at the 3'-termini of mRNAs.

The complete primary structure of bPRL mRNA was determined from analysis of cloned sequences.
Hybrid molecules containing DNA sequences complementary to bPRL mRNA were inserted into pBR322 and amplified in bacteria. One clone, pBPRL72, contained a 982 base pair insert that included 67 nucleotides of the 5'-untranslated region, the complete coding region of preprolactin, and the entire 3'-untranslated region of the mRNA. The sequence analysis of this clone predicted the amino acid sequence of the signal peptide and confirmed the protein sequence of bPRL with one minor exception. Standard DNA sequence analysis of three bPRL cDNA clones revealed nucleotide substitutions that do not alter the protein sequence. These results suggest variations at the nucleotide level in the bovine gene pool which originate from multiple alleles or duplicated loci.

This thesis is dedicated to all my good memories and especially to the memory of my father.

ACKNOWLEDGMENTS

I would like to express my appreciation to all the people in the Department of Biochemistry who have helped me during my graduate career. In particular, I would like to thank the members of my guidance committee, Drs. Edward Convey (Department of Dairy Science), Hsing-Jien Kung, Ronald Patterson (Department of Microbiology), John Wang, and William Wells. I would also like to thank my research advisor, Dr. Fritz Rottman, for his support and assistance during my graduate research. I particularly wish to express my appreciation for the special opportunities that I was given to visit other laboratories.

I would further like to acknowledge the many people who have generously provided their assistance throughout the course of this research. I was fortunate to spend a month in the laboratory of Dr. Michael Smith at the University of British Columbia. I want to thank Mike for allowing me the opportunity to learn the DNA sequencing techniques in his laboratory. During this visit, I also worked with Dr. Shirley Gillam. Shirley synthesized all the oligodeoxynucleotides used in these studies.
I wish to thank Shirley for her incredible generosity and kindness. My collaboration with Mike and Shirley has been very enriching, and I hope to continue our friendship.

I also thank my co-workers in Dr. Rottman's laboratory, especially Karen Friderici and Rick Woychik. Karen was a constant and bright source of help and friendship throughout everything. Rick provided me with the final burst of energy and enthusiasm to complete this work. I could not have finished without their help.

Several other people have contributed greatly to this work. I am indebted to Dr. Rich Roberts of Cold Spring Harbor Laboratory, in whose laboratory some of the sequencing analysis was performed; Dr. Bob Blumenthal, who arranged my visit to Cold Spring Harbor and devoted much of his time to helping me; Bob Swift for setting up the nucleic acid computer programs in our laboratory; and Sue Uselton for her cooperation and excellent preparation of this thesis.

My deepest gratitude is extended to Dr. John Clark, Jr. of the University of Illinois. Dr. Clark has fostered the development of my career in science from the very beginning. His advice in all things has been a unique source of guidance for me throughout my college career. I hope Dr. Clark will continue to serve as my mentor.

Finally, I wish to thank my dear mother for making my college years more livable.

Thank you all very much.

Part II of this thesis is reprinted with permission from Biochemistry (1980) 19, 1737. Copyright 1980 American Chemical Society.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS

PART I
LITERATURE SURVEY
  AN HISTORICAL PERSPECTIVE OF mRNA ISOLATION
  mRNA ISOLATION
    General Techniques
    Specific Techniques
      Isolation of Poly(A)-Containing mRNA
      Size Fractionation
      Immunoprecipitation of Polysomes
      Molecular Hybridization
      Sequence Specific Isolation
  FEATURES AND FUNCTIONS OF mRNA STRUCTURE
    5'-Terminal Cap Structure
    5'-Untranslated Sequence
    Internal Methylation
    3'-Untranslated Sequence
    Polyadenylation
  RNA SPLICING AND GENE EXPRESSION
  THE BOVINE ANTERIOR PITUITARY AS A MODEL FOR GENE EXPRESSION
  LIST OF REFERENCES

PART II
USE OF OLIGODEOXYNUCLEOTIDE PRIMERS TO DETERMINE POLY(ADENYLIC ACID) ADJACENT SEQUENCES IN MESSENGER RIBONUCLEIC ACID. 3'-TERMINAL NONCODING SEQUENCE OF BOVINE GROWTH HORMONE MESSENGER RIBONUCLEIC ACID
  ABSTRACT
  INTRODUCTION
  MATERIALS AND METHODS
    Materials
    Purification of Bovine Growth Hormone mRNA
    Synthesis of Oligodeoxynucleotide Primers
    Screening of Oligodeoxynucleotide Primers for Specific Initiation of cDNA Synthesis
    Dideoxy Sequencing Conditions
    Chemical Sequencing Conditions
  RESULTS
    Screening of d(pT8-N-N') Primers for Specific Initiation of cDNA Synthesis
    Sequencing Analysis by the Dideoxy Method
    Sequence Analysis by the Chemical Degradation Method
  DISCUSSION
  LIST OF REFERENCES

PART III
NUCLEOTIDE SEQUENCE OF BOVINE PROLACTIN MESSENGER RNA. EVIDENCE FOR SEQUENCE POLYMORPHISM
  ABSTRACT
  INTRODUCTION
  MATERIALS AND METHODS
    Materials
    Construction of cDNA Clones
    Screening of Recombinant Plasmids
    Size Determination of Prolactin Inserts
    DNA Sequence Analysis
  RESULTS
    Detection of Clones with Large Prolactin Inserts
    Sequence Analysis of Prolactin Inserts
    Comparison of Cloned Prolactin Sequences
  DISCUSSION
  LIST OF REFERENCES

PART IV
VARIATION IN THE POLYADENYLATION SITE OF BOVINE PROLACTIN MESSENGER RNA
  ABSTRACT
  INTRODUCTION
  MATERIALS AND METHODS
    Materials
    Purification of Bovine Prolactin mRNA
    Synthesis and Use of Oligodeoxynucleotide Primers
    Screening and Sequencing of Prolactin cDNA Clones
  RESULTS
    Screening of d(pT8-N-N') Primers for Specific Initiation of Prolactin cDNA Synthesis
    Sequence Analysis with d(pT8-N-N') Primers
    Identification of Bovine Prolactin cDNA Clones with Multiple Poly(A) Adjacent Sequences
    Poly(A) Adjacent Nucleotides of Bovine Prolactin mRNA from a Single Animal
  DISCUSSION
  LIST OF REFERENCES

LIST OF FIGURES

PART II
  Method for determination of the d(pT8-N-N') oligodeoxynucleotide primer complementary to poly(A)-adjacent nucleotides in mRNA
  Autoradiograph of oligodeoxynucleotide primer screening for specific initiation of bovine GH cDNA synthesis
  Autoradiograph of sequencing gels for bovine GH mRNA
  Nucleotide sequence of the 3'-terminal noncoding region of bovine GH mRNA

PART III
  Sequencing strategy
  Nucleotide sequence of pBPRL72 insert
  Comparison of sequencing gel autoradiographs from pBPRL72 and pBPRL4
  Comparison of the nucleotide and amino acid sequences for bPRL and rPRL mRNAs

PART IV
  Autoradiograph of d(pT8-N-N') primer screening
  Autoradiograph of bPRL cDNA sequencing gel
  The poly(A) adjacent sequence of the major bovine prolactin mRNA species
  Sequencing gel autoradiographs of poly(A) adjacent sequences in bovine prolactin cDNA clones
  Examination of bovine prolactin mRNA from a single animal

LIST OF TABLES

  I   Summary of Sequence Polymorphisms in Bovine Prolactin mRNA
  II  Summary of Sequence Homologies
ABBREVIATIONS

  A          adenosine
  b          bovine
  C          cytidine
  cDNA       complementary DNA
  CS         chorionic somatomammotropin
  ddNTP      2',3'-dideoxynucleoside triphosphate
  DEAE       diethylaminoethyl
  ds-cDNA    double-stranded complementary DNA
  DTT        dithiothreitol
  EDTA       ethylenediaminetetraacetic acid
  G          guanosine
  GH         growth hormone
  h          human
  HART       hybrid-arrested translation
  m6A        N6-methyladenosine
  m7G        7-methylguanosine
  mRNA       messenger RNA
  N          nucleoside
  Nm         2'-O-methylnucleoside
  NTP        nucleoside triphosphate
  oligo(dT)  oligo(deoxythymidine)
  poly(A)    poly(adenylic acid)
  PRL        prolactin
  Pu         purine
  r          rat
  rRNA       ribosomal RNA
  SDS        sodium dodecyl sulfate
  T          thymidine
  Tris       tris(hydroxymethyl)aminomethane
  tRNA       transfer RNA
  U          uridine

PART I

LITERATURE SURVEY

Messenger RNA molecules are the vehicle for expression of the genetic material. Specialized eukaryotic cells produce specific proteins by controlled differential expression of a portion of their genome. Therefore, the biosynthesis of mRNA molecules offers the cell a potential control point for expression of genetic information. The events that constitute the synthesis, maturation, and expression of mRNAs are currently a major focus of molecular biology. Basically, the mRNA transcripts carry information from the cell nucleus, where they are transcribed, to the cytoplasm of the cell, where the sequence is translated into protein. A number of steps in this pathway are now envisioned as potential control points in eukaryotic gene expression.

Much attention has also been focused on the mRNA molecule itself. The recent ability to isolate, purify, and clone individual mRNA sequences greatly facilitates the study of such regulatory mechanisms. An understanding of the molecular basis of control requires extensive knowledge of the structure and organization of specific genes and their sequences. This review focuses on the importance of the mRNA molecule in the study of eukaryotic gene expression.
An extensive treatment of gene expression has recently been published by Lewin (1980).

AN HISTORICAL PERSPECTIVE OF mRNA ISOLATION

Isolation of eukaryotic mRNA molecules has only become possible in the last decade. The chemical and physical similarity of mRNAs presented a major obstacle in obtaining these molecules for study. Initially, isolation and characterization of mRNA was hindered by contamination with rRNA and tRNA species. With the introduction of several isolation and purification techniques exploiting the unique properties of cells producing large quantities of a single protein, advances in the study of mRNA were possible. The discovery by Aviv and Leder (1972) of polyadenylic sequences at the 3'-terminus of eukaryotic mRNAs led to a simple isolation technique for these molecules utilizing the hybridization properties of this region with oligo(dT)-cellulose. (It should be noted here that later studies indicated that not all mRNAs contain polyadenylic sequences.) Additionally, the development of in vitro protein synthesis systems capable of translating exogenous mRNAs was crucial for the identification of specific mRNA species.

Globin mRNA from rabbit reticulocytes was the first eukaryotic mRNA to be successfully isolated. The characterization of this mRNA led to many important developments in the demonstration of discrete mRNA species in eukaryotes. Marbaix and Burny (1964) originally reported the existence of a 9S RNA species from rabbit reticulocytes that was distinct from rRNA (18S and 28S) and tRNA (4S) species. This 9S fraction represented about 2% of the total RNA mass on sucrose gradients, reflecting the preponderance of hemoglobin protein synthesis in the reticulocyte. The subsequent demonstration that the 9S RNA species could be dissociated from polysomes and used to direct globin synthesis in a cell-free translation system positively identified the RNA peak as globin mRNA (Lockard and Lingrel, 1969).
It was this RNA fraction that Aviv and Leder (1972) subsequently isolated by affinity chromatography on oligo(dT)-cellulose.

The reticulocyte 9S RNA peak was later shown to contain the messengers for both α and β globin (Laycock and Hunt, 1979). Separation of the two globin mRNA species was a long-standing problem. Resolution was ultimately achieved by electrophoresis in denaturing polyacrylamide gels containing 98% formamide (Morrison et al., 1974; Orkin et al., 1975). The preparatively separated mRNAs were later shown to be translationally active in cell-free protein synthesis assays (Nudel et al., 1977). Furthermore, the highly enriched α and β globin mRNAs were capable of directing synthesis of cDNAs with reverse transcriptase, an important technology for later studies of gene expression (Nudel et al., 1977; Ross et al., 1972; Verma et al., 1972; Kacian et al., 1972).

The properties of globin mRNA described above were instrumental in defining the general features of eukaryotic messengers. Size estimates for globin mRNA confirmed the notion that eukaryotic cytoplasmic mRNAs contain more nucleotides than are required to code for the specific protein. In 1977 the complete nucleotide sequence of rabbit β globin mRNA was determined from a cloned cDNA sequence (Efstratiadis et al., 1977). This analysis was the first complete sequence determination of a eukaryotic mRNA and established several important features of eukaryotic mRNA structure. Namely, eukaryotic mRNAs contain untranslated regions at both the 5' and 3' termini in addition to the coding sequence. The data also showed that the predicted coding sequence agreed with the amino acid analysis of the protein.

The study of globin mRNA represents many pioneering steps in the analysis of eukaryotic mRNAs. The ability to isolate individual messengers, as outlined here for globin mRNA, is essential for the study of eukaryotic regulatory mechanisms for gene expression.
mRNA ISOLATION

The isolation of intact and biologically active mRNA molecules has been plagued by major technical problems in the past. The control of ribonuclease activity present during RNA isolation procedures has been of central importance in obtaining specific mRNA species from various sources. Ribonucleases are present in all tissues and can easily be introduced as contaminants (even from the researcher's own hands!) during purification procedures. The technologies now available to inhibit ribonuclease activity are standard protocols; however, I feel that they deserve mention for their contribution to the study of mRNAs.

General Techniques

The traditional method used for isolation of cellular RNA is deproteinization with phenol (Kirby, 1964). Basically, the tissue homogenate is extracted with water-saturated phenol. The nucleic acid remains in the aqueous phase while the proteins partition into the organic phase. The RNA can then be precipitated in the presence of dilute salt and ethanol. Although this procedure is widely applicable, it may not be adequate to recover quantities of intact and translationally active mRNAs. The refinements of this basic methodology have permitted the recent advances in the study of specific mRNAs. Several of these techniques are summarized below.

Detergents also act as strong protein denaturants. The inclusion of SDS during tissue homogenization provides protection against cellular ribonucleases as they are released and improves RNA yields (Noll and Stutz, 1968). In combination with SDS, proteinase K, a protease from a fungus, is now commonly included for control of ribonuclease activity (Wiegers and Hilz, 1971). Incubation of cellular extracts with proteinase K and 0.5% SDS destroys ribonucleases and facilitates later phenol extraction. Proteinase K is particularly advantageous because of its ability to work in the presence of detergents and its auto-digestive properties.
Diethyl pyrocarbonate is also a commonly used reagent in RNA isolation procedures. This compound reacts with the imidazole nitrogens of histidine residues in proteins and is therefore an effective inhibitor of ribonucleases (Ehrenberg et al., 1976). Of particular advantage is the ability to treat reagents and equipment that cannot be rendered "ribonuclease-free" by standard methods of heat sterilization. Although diethyl pyrocarbonate is also known to attack single-stranded nucleic acids, the reaction proceeds more slowly than that with proteins (Leonard et al., 1970). Some RNA isolation procedures have included this reagent during the tissue homogenization. This approach has been found to result in some loss of biological activity and is not commonly used (Rhoads et al., 1973; Wiegers and Hilz, 1971).

Another potent ribonuclease inhibitor is the sulfonated polysaccharide, heparin. This compound can be included during homogenization procedures and has been found to improve yields of intact polysomes and RNA (Palmiter et al., 1970). The inhibitory activity of heparin apparently results from interaction of the highly charged polymer with ribonucleases, similar to their interaction with RNA. The use of heparin in RNA isolation procedures has replaced the relatively inefficient method of direct ribonuclease adsorption by bentonite (Payne and Loening, 1970).

A recent RNA isolation procedure involves the use of ribonucleoside-vanadyl complexes. Berger and Birkenmeier (1979) included these complexes throughout the purification of RNA from human lymphocytes. Previous attempts to isolate RNA from these cells by other methods had been unsuccessful. The complexes between vanadyl sulfate and the four ribonucleosides are easily prepared in the laboratory and inhibit ribonucleases by acting as transition-state analogues (Lienhard et al., 1971).

Alternate methods that avoid the use of phenol have also been developed for mRNA isolation from eukaryotic cells. One such procedure relies on the rapid denaturation of ribonucleases in 6 M guanidine·HCl (Cox, 1968). The high salt concentration dissociates ribonucleoprotein complexes and allows the purification of total cellular RNA. A modification of this procedure described by Deeley et al. (1977) is now extensively used with many systems. Basically, the tissue is homogenized in 8 M guanidine·HCl, and total RNA is precipitated with ethanol after removal of cellular debris by centrifugation. Another rapid RNA isolation method utilizes sedimentation through CsCl gradients (Glisin et al., 1974). This method is especially advantageous for isolation of total RNA from small quantities of tissue. The RNA is pelleted in this procedure while DNA and protein remain in the CsCl gradient.

Clearly, the ability to isolate quantities of RNA from eukaryotic cells utilizing the techniques described above has made possible the study of mRNAs from virtually all sources. Previously, many interesting mRNAs from tissues such as the pancreas were difficult to isolate for study due to the presence of ribonucleases (Harding et al., 1977). The variety of methods available today for the isolation of RNA demonstrates an important point: methods that are successful for isolation of mRNA from some cell types are totally inadequate for other cell types.

Specific Techniques

Several techniques have served as the basis for purification of specific cellular mRNA sequences. These methods generally exploit the abundance of a particular mRNA in a tissue which produces large quantities of a single protein product. As discussed in the previous section, the preparation of undegraded mRNA from eukaryotic cells is dependent upon inactivation of cellular nucleases. The procedures outlined below are extensions of those basic technologies.
Isolation of Poly(A)-Containing mRNA

A major technical advance in the purification of mRNA molecules is the separation of poly(A)-containing messenger sequences from the two other cytoplasmic RNA species, tRNA and rRNA. Most cellular mRNAs contain a heterogeneous poly(A) segment at the 3'-end of the molecule, ranging in length from 20 to 250 bases. Affinity chromatography on oligo(dT)-cellulose is the most commonly used method to obtain an initial enrichment of mRNA sequences from total cytoplasmic RNA. The poly(A)-containing RNA is selectively bound to the cellulose matrix, which carries chains of 12-18 dT residues, while poly(A)-minus RNA is not adsorbed. Although this procedure is not specific for an individual mRNA sequence, it serves to enrich the cellular RNA for message sequences, which represent only 1-2% of the total.

Size Fractionation

Once the poly(A)-containing mRNA fraction is obtained as described above, the unique size properties of a specific mRNA are commonly exploited to purify the molecule on the basis of length. Many variations of the size fractionation technique have been described for abundant mRNA species, the prototype being the isolation of globin mRNA. Basically, enrichment of a specific mRNA sequence is achieved by repeated size fractionation on sucrose gradients. The principles of the technique are best illustrated with several examples.

Silk fibroin mRNA from the silkworm Bombyx mori comprises a large percentage of the mRNA in the posterior gland and was first isolated by Suzuki and Brown (1972). This mRNA is unusually large due to the high molecular weight of the protein. Purification was achieved by two rounds of sucrose density gradient centrifugation. The mRNA was prone to aggregation and therefore required denaturation in sucrose gradients containing 70% formamide to obtain the purified 32S fibroin messenger.

Vitellogenin is another example of an abundant high molecular weight protein.
It is produced in the liver of oviparous animals upon estrogen stimulation and can represent as much as 70% of liver cell protein synthesis (Shapiro et al., 1976). The 29S mRNA for this egg yolk protein has been isolated from the liver of stimulated Xenopus laevis (Shapiro and Baker, 1977) and chickens (Deeley et al., 1977) by size fractionation on sucrose gradients. The recovery of intact mRNA in both animals was dependent on stringent conditions for the inhibition of endogenous ribonucleases.

On the other end of the spectrum, very small mRNAs are also amenable to size fractionation purification techniques. Protamine mRNA from trout testes has been isolated by phenol extraction of the tissue and subsequent sedimentation in sucrose gradients (Gedamu and Dixon, 1976). This 6S mRNA is smaller than the majority of cellular mRNAs and therefore is easily enriched by this method.

The ovalbumin mRNA sequence is similar in size to other mRNAs of the hen oviduct, but purification has nevertheless been achieved by size fractionation due to its abundance. The production of this egg white protein is greatly stimulated by estrogen and progesterone (Schimke et al., 1975). A significant enrichment of ovalbumin mRNA was obtained by employing isokinetic sucrose gradients for size fractionation of total oviduct poly(A)-containing RNA (Buell et al., 1978). Repeated gradients of this type can yield preparations of the mRNA that are 90% homogeneous.

Immunoprecipitation of Polysomes

As stated above, size fractionation purification schemes generally require that the specific mRNA sequence comprise a large percentage of the total mRNA population. The majority of cytoplasmic mRNAs each constitute less than 1% of the poly(A)-containing RNA, and therefore cannot be purified by this method. One technique designed to overcome this problem is immunoprecipitation of polysomes.
Antibodies are prepared against the protein of interest and used for precipitation of polysomes synthesizing the specific protein. The successful isolation of the specific mRNA sequence depends on the recognition of the nascent peptide chain by this antibody. Generally, the application of this technique has been hindered by problems associated with nonspecific adsorption and trapping of other polysomes. The method as first described by Schimke and his associates (Palacios et al., 1972; Palmiter et al., 1972) has been modified to be more applicable to systems in which the mRNA of interest can be as little as 1% of the total mRNA population (Shapiro et al., 1974). The indirect immunoprecipitation of polysomes employs a second antibody, prepared against the first protein-specific antibody, in order to facilitate the precipitation of the immune complex. The final immunoprecipitate is collected by sedimentation through a discontinuous sucrose gradient and disrupted by detergent, and the mRNA is purified by oligo(dT)-cellulose chromatography.

This technique for the purification of specific mRNA sequences has been applied to many different systems. For example, the mRNAs for the immunoglobulin light chains from mouse myelomas have been isolated by Schechter (1973, 1974) by indirect immunoprecipitation, as has Xenopus vitellogenin mRNA (Jost and Pehling, 1976). Although the specific mRNA constitutes a relatively large portion of the total mRNA in these systems, indirect immunoprecipitation has also been successfully applied to systems in which the desired mRNA sequence is present in relatively low concentrations. The mRNAs for rat liver fatty acid synthetase (Flick et al., 1978) and chicken embryo collagen (Pawlowski et al., 1975) have been purified by these methods.

More recently, a modification of the indirect immunoprecipitation technique was developed utilizing the protein A component of Staphylococcus aureus (Mueller-Lantzch and Fay, 1976).
This cell wall protein binds to antibodies and can be used directly from inactivated bacterial cells to precipitate the specific antibody-polysome complex. Leukemia virus-specific polysomes from infected cells have been efficiently isolated by this technique (Mueller-Lantzch and Fay, 1976).

Molecular Hybridization

Another technique for enrichment of mRNA sequences is molecular hybridization. This method has been employed to further enrich partially purified mRNAs obtained by size fractionation or polysome immunoprecipitation. Complementary DNA is first transcribed from the partially purified mRNA population. The predominant mRNA species is then enriched by molecular hybridization to a limited R0t value (the product of RNA concentration and incubation time). The hybridization kinetics of the reaction are monitored to empirically determine the appropriate R0t value. The specific mRNA can be isolated from the molecular hybrids and purified. Strair et al. (1977) first described this approach for the purification of albumin mRNA from rat liver; however, the procedure is applicable to any mRNA sequence, provided it is the most abundant in the population.

Sequence Specific Isolation

Recent molecular cloning technology has made it possible to isolate virtually any mRNA sequence by specific hybridization to cloned sequences. The mRNA sequence of interest is first converted into a duplex DNA copy by reverse transcription and inserted into a bacterial plasmid. The sequences are then amplified in a bacterial host. Although the details of this methodology are beyond the scope of this discussion, the ability to segregate a particular mRNA sequence from a total cellular population is ultimately dependent on the identification of the plasmid harboring the desired sequence. Generally, some prior purification of the mRNA is required for preparation of a cDNA hybridization probe.
Labeled cDNA is synthesized from this mRNA and used to selectively screen bacterial colonies containing recombinant plasmids (Grunstein and Hogness, 1975). More recently, a more widely applicable method has been developed in which hybrids formed between an mRNA and its cloned sequence are used to preclude the in vitro synthesis of the specific protein product. This technique, known as hybrid-arrested translation (HART), employs the total population of mRNA for the hybridization and therefore can be used to select a specific mRNA sequence that is present in low concentration (Paterson et al., 1977; Gordon et al., 1978). Identification of a specific translation product is required; however, the identity of the product need not be known.

The cloned DNA sequence can subsequently be employed for the preparative isolation of the specific mRNA. One method that has been used for these purposes is adsorption to solid DNA affinity supports. The cDNA sequences are chemically coupled to cellulose and used to selectively bind mRNA molecules from a population of RNA (Noyes and Stark, 1975). Childs et al. (1979) have further developed a method to link DNA to m-aminobenzyloxymethyl-cellulose by diazotization. The nonadsorbed RNA sequences can be removed by a series of washes, and the specific sequences can subsequently be obtained by thermal elution with formamide. Sequences isolated in this manner are intact and capable of directing in vitro protein synthesis.

FEATURES AND FUNCTIONS OF mRNA STRUCTURE

Several general features of mRNA structure have emerged from the isolation and purification of mRNA sequences. Each molecule normally possesses a cap structure at the 5'-terminus and a stretch of adenylic acid residues at the 3'-terminus. The former is an unusual methylated nucleotide structure with the basic formula m7G5'ppp5'N'(m)N"(m), initially described by Rottman et al. (1974), and the latter is known as the poly(A) tail discussed earlier.
Both these modifications are added post-transcriptionally in the nucleus (Shatkin, 1976; Brawerman, 1974). Eukaryotic mRNAs contain sequence information for only a single polypeptide, in contrast to the polycistronic nature of bacterial mRNAs. Size estimates and sequence analysis of specific eukaryotic mRNAs led to the finding of untranslated regions at the 5' and 3' ends of these molecules. In addition to the methylated cap structure, some mRNAs contain other methylnucleotides located in the internal sequence (Desrosiers et al., 1974). In this section, I will further describe these basic features of mRNA structure determined from studies of cellular populations as well as individual mRNA sequences. The presumed biological roles of each modification or region will also be discussed.

5'-Terminal Cap Structure

The cap structure of eukaryotic mRNA contains a terminal 7-methylguanylic acid residue joined to the first nucleotide of the mRNA sequence via a 5'-5' triphosphate group. This structure is unusual because the normal linkage of phosphodiester bonds in RNA is 3'-5'. Furthermore, the first and second nucleotides of the mRNA sequence contained in the cap structure may be methylated at the 2'-O position of the sugar moiety. The general structure of this 5'-terminal structure is written m7G5'ppp5'N'(m)N"(m) (Rottman et al., 1974). Cap structures have been found in a wide variety of organisms as well as viral mRNAs, and therefore appear to be universal in eukaryotic systems (Rottman, 1978).

The cap structure has been shown to enhance translation of mRNA. Both et al. (1975) first demonstrated a decrease in translational efficiency in vitro upon removal of the cap structure from viral mRNAs. Since then, additional evidence has accumulated supporting this observation. Translation of capped mRNAs in vitro is inhibited by cap analogs, apparently at initiation of protein synthesis (Hickey et al., 1975; Sasavage et al., 1979).
The results of such translational studies suggest that the cap structure is not an absolute requirement for protein synthesis, but rather a facilitator of it.

Other biological roles of this 5'-terminal mRNA structure have been suggested. One proposal involves mRNA stability. The methylated cap structure may protect the mRNA population from degradative enzymes and therefore lengthen the half-life of the molecules in vivo (Rottman, 1978). An alternate role has also been proposed that involves processing of nuclear mRNA precursors (Rottman et al., 1974). Schibler and Perry (1976) have suggested that some cap structures result from internal cleavages of primary RNA transcripts. Further examination of these roles for the cap structure is necessary to better understand these processes.

5'-Untranslated Sequence

The distance from the 5'-terminal cap structure to the AUG initiation codon for translation varies widely among eukaryotic mRNAs. The average length of this 5'-untranslated region is about 50 nucleotides; however, the range is from 10 to over 200 nucleotides (Kozak, 1978). For some messengers, the cap and AUG codon are sufficiently close together that both may be included in the ribosome initiation complex. This observation has led to the proposal that not only the cap but also the 5'-noncoding sequence facilitates initiation of protein synthesis (Shine and Dalgarno, 1974). However, inspection of the 5'-untranslated sequences from a number of mRNAs has not revealed any conserved regions complementary to the 18S ribosomal RNA, as was proposed. The features of this region of mRNA structure must also explain the differential efficiency of translation that has been observed for some eukaryotic systems (Shih and Kaesberg, 1973). The available nucleotide sequences have not provided an explanation of these differences. The current model for initiation of translation involves a scanning mechanism for selection of the methionine initiation codon (Kozak, 1978).
This model is based on mRNA sequence data from many 5'-noncoding regions, which indicate that the first AUG sequence is generally chosen as the site for initiation of protein synthesis. The 40S ribosomal subunit is thought to bind at the 5'-terminal cap structure and proceed downstream until it reaches the AUG codon, where it stops for initiation of translation.

Internal Methylation

Labeling studies of cells in culture originally revealed the presence of methylnucleotides in the internal sequence of mRNAs (Desrosiers et al., 1974). This modified nucleotide was identified as N6-methyladenylic acid (m6A). Quantitation of the m6A levels in cytoplasmic mRNA from Novikoff cells suggested that approximately three such modified nucleotides were present in the average mRNA molecule. Further studies of this nature have shown that this methylated nucleotide is present in a number of mRNAs, with the exception of globin mRNA (Rottman, 1978). A general trend relating the number of m6A residues to the length of the mRNA has been noted (Rottman et al., 1976). It is also interesting that the m6A residue occurs in the specific sequence Pu-m6A-C in the mRNA from three different systems (Wei and Moss, 1977; Dimock and Stoltzfus, 1977); however, the location relative to the complete mRNA structure has not been defined.

Presently, the biological role of internal methylation of mRNA sequences is not known. The investigation of its importance will ultimately require the exact location of the m6A residues in a specific mRNA sequence. Such studies are currently in progress in several laboratories. Rottman (1978) has proposed the involvement of m6A residues in mRNA processing events. Such processing signals could potentially play a crucial role in the removal of intervening sequences from primary mRNA transcripts (see section on RNA Splicing and Gene Expression).
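
Once the primary structure of an mRNA is known, the positions that could carry an internal methyl group can at least be enumerated from the Pu-m6A-C sequence context noted above. The following is a minimal sketch of that bookkeeping, not a method used in this work. The function name and the example sequence are hypothetical, and a sequence match identifies only a candidate site; it says nothing about whether that adenosine is actually methylated.

```python
import re
from typing import List, Tuple


def candidate_m6a_sites(mrna: str) -> List[Tuple[int, str]]:
    """List every purine-A-C trinucleotide in an mRNA sequence.

    Returns (position, trinucleotide) pairs, where position is the
    1-based index of the central A, the residue that would carry the
    methyl group in a Pu-m6A-C context. Hypothetical helper.
    """
    seq = mrna.upper().replace("T", "U")
    # A zero-width lookahead is used so that overlapping matches are all found.
    pattern = re.compile(r"(?=([AG]AC))")
    # m.start() is the 0-based index of the purine; the central A is one
    # residue further on, so its 1-based position is m.start() + 2.
    return [(m.start() + 2, m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    example = "AUGGACUAACC"  # made-up sequence, for illustration only
    for position, context in candidate_m6a_sites(example):
        print(position, context)
```

Mapping the residues that are in fact methylated would still require direct analysis of the labeled mRNA; the listing only narrows where to look.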
3'-Untranslated Sequence

The 3'-untranslated sequence is defined as the region between the termination codon and the start of the poly(A) tail of the mRNA. The nucleotide sequence in this region has been determined for more than 30 eukaryotic mRNAs. Although the length of the 3'-noncoding region is highly variable, one highly conserved sequence has been observed. A simple hexanucleotide sequence, AAUAAA, occurs 11-30 nucleotides from the start of the poly(A) tail and has been postulated to play a role in the addition of the poly(A) sequence to the 3'-terminus (Proudfoot and Brownlee, 1974). Recently, Fitzgerald and Shenk (1981) have conclusively shown that the AAUAAA sequence is indeed required for polyadenylation and that the location of the hexanucleotide sequence influences the selection of the poly(A) site. No other conserved sequences or structures have been noted in the 3'-untranslated sequence of mRNAs, nor have other biological roles in the synthesis or translation of mRNA sequences been suggested.

Polyadenylation

Most eukaryotic mRNAs possess a series of polyadenylic acid residues at their 3'-end. The most notable exception to this rule is the general class of histone mRNAs (Brawerman, 1974). The poly(A) tail is approximately 150-200 nucleotides in length and has been shown to be added post-transcriptionally. Inhibition of polyadenylation can prevent the appearance of mRNA sequences in the cytoplasm. This observation was initially used to suggest that the poly(A) segment functions in nuclear transport; however, the later observation that mitochondrial mRNAs also contain poly(A) sequences jeopardized this hypothesis (Hirsch and Penman, 1973).

A biological role for the poly(A) tail in the stability of mRNAs has also been postulated. The length of this region decreases with the age of the mRNA, but this shortening does not appear to be related to translation (Wilson et al., 1978).
Any translational role for the poly(A) tail could not be obligatory, since some cellular messengers lack such sequences. It therefore appears that, although the possible function of this region has been approached from several angles, no clear role has been defined.

RNA SPLICING AND GENE EXPRESSION

RNA processing is generally defined as the enzymatic events which transform a primary RNA transcript into the final mRNA product that functions in translation. Several results of processing events have already been described in the previous section, including the cap structure, internal methylation, and polyadenylation. Other endonucleolytic and exonucleolytic events have been proposed from the observed alterations in the size of nuclear transcripts. The discovery that eukaryotic gene sequences are not colinear with the final mRNA sequence has reemphasized the crucial role of RNA processing in the correct functioning of gene expression. Such intervening sequences have been shown to be transcribed into the primary RNA transcript, and therefore must be removed in a processing event known as splicing. Currently, the exact order and mechanism of these events are unclear. One approach to an understanding of these processes is to elucidate the structure of both the initial transcription product and the final mRNA molecule.

The advent of recombinant DNA technology and the development of rapid DNA sequencing techniques have led to a wealth of information on the structure and sequence of eukaryotic genes. However, information concerning the mechanisms of mRNA splicing has not accumulated as rapidly. In this section, I will briefly summarize the current understanding of this RNA processing event in relation to gene expression. Sharp (1981) has recently written a more extensive review of this subject.

An RNA splicing event joins the 5' site of one splice boundary to the 3' site of the other splice boundary. Breathnach et al. (1978) first proposed that the removal of intervening sequences in many RNA transcripts occurs within a limited consensus sequence at the borders of the segments to be joined. The exact splice site in the primary transcript sequence could not be determined, however, owing to the duplication of the splice sequence in the joined segments. The possibility that the RNA splicing process was directed by the formation of secondary structures in these regions appeared unlikely. More recently, it has been proposed that small nuclear RNA species play a role in the splicing mechanism (Lerner et al., 1979). The 5'-terminal sequence of U1 RNA, which is a highly abundant nuclear species, exhibits good complementarity to the consensus splice sequence. This RNA is present in small ribonucleoprotein particles in the nucleus and may act as a template for directing RNA splicing.

Several other observations have been made concerning RNA splicing. First, Stein et al. (1980) have shown the existence of a polymorphism in the ovomucoid protein which apparently results from a splicing event. Two functional splice sites exist at this boundary, and both possible protein products have been observed. Furthermore, recent evidence suggests that the removal of several intervening sequences from a single precursor mRNA proceeds in a preferential order (Nordstrom et al., 1979; Ryffel et al., 1980). In this respect, it also appears that a single intervening sequence of β-globin mRNA is removed in a stepwise fashion, rather than in a single event (Kinniburgh and Ross, 1979).

It is clear that the RNA processing events involved in splicing play an important role in the correct expression of eukaryotic genes. Further studies will be necessary to elucidate the mechanisms of these events and their possible involvement in gene regulation.
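
The remark above that the exact splice site cannot be fixed when the splice sequence is duplicated at the boundaries can be made concrete with a short sketch. It compares a primary transcript with its spliced product and lists every excision that would give the same mature sequence. The function name and both sequences are hypothetical and chosen only for illustration; the sketch assumes a single intervening sequence and makes no claim about how the splicing machinery selects a site.

```python
from typing import List, Tuple


def consistent_splice_sites(primary: str, mature: str) -> List[Tuple[int, int]]:
    """Return every (start, end) such that deleting primary[start:end]
    yields the mature sequence (0-based, end-exclusive indices).

    More than one answer means that sequence comparison alone cannot
    locate the exact splice points. Hypothetical helper.
    """
    removed = len(primary) - len(mature)
    if removed <= 0:
        return []
    sites = []
    for start in range(len(mature) + 1):
        end = start + removed
        if primary[:start] + primary[end:] == mature:
            sites.append((start, end))
    return sites


if __name__ == "__main__":
    # Made-up transcript: the upstream segment ends in CAG, as does the
    # intervening sequence, and the intervening sequence begins with GU,
    # as does the downstream segment.
    upstream, intervening, downstream = "AUGCAG", "GUAAGUUUCAG", "GUCUAA"
    primary = upstream + intervening + downstream
    mature = upstream + downstream
    for start, end in consistent_splice_sites(primary, mature):
        print(start, end, primary[start:end])
```

In this made-up case six different excisions reproduce the same mature sequence, including the one actually used to build the transcript, which is the ambiguity described in the text.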
THE BOVINE ANTERIOR PITUITARY AS A MODEL FOR GENE EXPRESSION

As stated in the introduction, the control of eukaryotic gene expression is a major area of research in molecular biology. Several systems have been extensively studied, such as the production of the major egg white proteins in the hen oviduct and the globin gene family, and have served as models of eukaryotic gene regulation at the transcriptional level. We have initiated a study of two polypeptide hormones from the bovine anterior pituitary as another, and hopefully alternate, model of gene regulation. Prolactin (PRL) and growth hormone (GH) are the most abundant protein products in the bovine anterior pituitary, and their biological and structural properties have been extensively studied (Schreiber, 1974; Tucker, 1979).

We have previously shown that the mRNAs for these hormones are present in high concentrations in the adult bovine pituitary gland (Nilson et al., 1978). Enrichment of these species from pituitary polysomal mRNA is readily achieved by fractionation on sucrose density gradients (Nilson et al., 1980). Furthermore, we have established primary cultures of bovine pituitary cells in collaboration with Dr. Edward Convey to examine gene expression under controlled conditions in vitro.

The working hypothesis in our studies is that methylation of mRNA may influence expression of genes at the post-transcriptional level. To test this hypothesis we plan to examine the genes for bovine PRL and GH as a model system. The properties cited above make the bovine anterior pituitary gland an attractive system for these studies. This thesis defines the primary structure of the cytoplasmic mRNA for bovine PRL (Parts III and IV; Sasavage et al., 1981a; Sasavage et al., 1981b). This information is of central importance in ultimately defining the exact location and possible involvement of mRNA methylation events in eukaryotic gene expression.
In the course of this investigation, I have also sequenced a portion of the mRNA for bovine GH. This analysis is reproduced in Part II of this thesis (Sasavage et al., 1980).

LIST OF REFERENCES

Aviv, H. and Leder, P. (1972) Proc. Natl. Acad. Sci. USA 69, 1408.
Berger, S.L. and Birkenmeir, C.S. (1979) Biochemistry 18, 5143.
Both, G.W., Banerjee, A.K. and Shatkin, A.J. (1975) Proc. Natl. Acad. Sci. USA 72, 1189.
Brawerman, G. (1974) Ann. Rev. Biochem. 43, 621.
Breathnach, R., Benoist, C., O'Hare, K., Gannon, F. and Chambon, P. (1978) Proc. Natl. Acad. Sci. USA 75, 4853.
Buell, G.N., Wickens, M.P., Payvar, F. and Schimke, R.T. (1978) J. Biol. Chem. 253, 2471.
Childs, G., Levy, S. and Kedes, L.H. (1979) Biochemistry 18, 208.
Cox, R.A. (1968) Meth. Enzymol. 12B, 120.
Deeley, R.G., Gordon, J.I., Burns, A.T.H., Mullinix, K.P., Bina-Stein, M. and Goldberger, R.F. (1977) J. Biol. Chem. 252, 8310.
Desrosiers, R., Friderici, K. and Rottman, F. (1974) Proc. Natl. Acad. Sci. USA 71, 3971.
Dimock, K. and Stoltzfus, C.M. (1977) Biochemistry 16, 471.
Efstratiadis, A., Kafatos, F.C. and Maniatis, T. (1977) Cell 10, 571.
Ehrenberg, L., Fedorcsak, I. and Solymosy, F. (1976) Prog. Nucl. Acids Res. Mol. Biol. 16, 189.
Fitzgerald, M. and Shenk, T. (1981) Cell 24, 251.
Flick, P.K., Chen, J., Alberts, A.W. and Vagelos, P.R. (1978) Proc. Natl. Acad. Sci. USA 75, 730.
Gedamu, L. and Dixon, G.H. (1976) J. Biol. Chem. 251, 1455.
Glisin, V., Crkvenjakov, R. and Byus, C. (1974) Biochemistry 13, 2633.
Gordon, J.I., Burns, A.T.H., Christmann, J.L. and Deeley, R.G. (1978) J. Biol. Chem. 253, 8629.
Grunstein, M. and Hogness, D.S. (1975) Proc. Natl. Acad. Sci. USA 72, 3961.
Harding, J.D., MacDonald, R.J., Przybyla, A.E., Chirgwin, J.M., Pictet, R.L. and Rutter, W.J. (1977) J. Biol. Chem. 252, 7391.
Hickey, E.D., Weber, L.A. and Baglioni, C. (1975) Proc. Natl. Acad. Sci. USA 73, 19.
Hirsch, M. and Penman, S. (1973) J. Mol. Biol. 80, 379.
Jost, J.P. and Pehling, G. (1976) Eur. J. Biochem. 66, 339.
Kacian, D.L., Spiegelman, S., Bank, A., Terada, M., Metafora, S., Dow, L. and Marks, P.A. (1972) Nature New Biol. 235, 167.
Kirby, K.S. (1964) Prog. Nucl. Acids Res. Mol. Biol. 3, 1.
Kinniburgh, A.J. and Ross, J. (1979) Cell 17, 915.
Kozak, M. (1978) Cell 15, 1109.
Laycock, D.G. and Hunt, J.A. (1969) Nature 221, 1118.
Lienhard, G.E., Secemski, I.I., Kaehler, K.A. and Lindquist, R.N. (1971) Cold Spring Harbor Symp. Quant. Biol. 36, 45.
Lerner, M.R., Boyle, J.A., Mount, S.M., Wolin, S.L. and Steitz, J.A. (1979) Nature 283, 220.
Leonard, N.J., McDonald, J.J. and Reichman, M.E. (1970) Proc. Natl. Acad. Sci. USA 67, 93.
Lewin, B.M. Gene Expression 2, Eukaryotic Chromosomes Vol. 2, John Wiley and Sons, New York.
Lockard, R.E. and Lingrel, J.B. (1969) Biochem. Biophys. Res. Commun. 37, 204.
Marbaix, G. and Burny, A. (1964) Biochem. Biophys. Res. Commun. 4, 522.
Morrison, M.R., Brinkley, S.A., Gorski, J. and Lingrel, J.B. (1974) J. Biol. Chem. 249, 5290.
Mueller-Lantzch, N. and Fay, H. (1976) Cell 9, 579.
Nilson, J.H., Barringer, K.J., Convey, E.M., Friderici, K. and Rottman, F.M. (1980) J. Biol. Chem. 255, 5871.
Nilson, J.H., Convey, E.M. and Rottman, F.M. (1978) J. Biol. Chem. 254, 1516.
Noll, H. and Stutz, E. (1968) Meth. Enzymol. 12B, 129.
Nordstrom, J.L., Roop, D.R., Tsai, M.-J. and O'Malley, B.W. (1979) Nature 278, 328.
Noyes, B.E. and Stark, G.R. (1975) Cell 5, 301.
Nudel, V., Ramierez, F., Marks, P.A. and Bank, A. (1977) J. Biol. Chem. 252, 2182.
Orkin, S.A., Swan, D. and Leder, P. (1975) J. Biol. Chem. 250, 8753.
Palacios, R., Palmiter, R.D. and Schimke, R.T. (1972) J. Biol. Chem. 247, 2316.
Palmiter, R.D., Christensen, A.K. and Schimke, R.T. (1970) J. Biol. Chem. 245, 833.
Palmiter, R.D., Palacios, R. and Schimke, R.T. (1972) J. Biol. Chem. 247, 3296.
Paterson, B.M., Roberts, B.F. and Kuff, E.L. (1977) Proc. Natl. Acad. Sci. USA 74, 4370.
Pawlowski, P.J., Gillette, M.T., Martinelle, J., Lukeus, L.N. and Furthmayr, H. (1975) J. Biol. Chem. 250, 2135.
Payne, P.I. and Loening, V.E. (1970) Biochim. Biophys. Acta 224, 128.
Proudfoot, N.J. and Brownlee, G.G. (1974) Nature 252, 359.
Rhoads, R.E., McKnight, G.S. and Schimke, R.T. (1973) J. Biol. Chem. 248, 2031.
Rottman, F.M. (1978) International Review of Biochemistry, Biochemistry of Nucleic Acids II, 51, 45.
Rottman, F., Desrosiers, R. and Friderici, K. (1976) Prog. Nucleic Acid Res. Mol. Biol. 19, 21.
Rottman, F., Shatkin, A.J. and Perry, R.P. (1974) Cell 3, 197.
Ross, J., Aviv, H., Scolnick, E. and Leder, P. (1972) Proc. Natl. Acad. Sci. USA 69, 264.
Ryffel, G.U., Wyler, T., Muellener, D.B. and Weber, R. (1980) Cell 19,
Sasavage, N.L., Friderici, K. and Rottman, F. (1979) Nucleic Acids Res. 6, 3613.
Sasavage, N.L., Nilson, J.H., Horowitz, S. and Rottman, F.M. (1981a) Submitted for publication.
Sasavage, N.L., Smith, M., Gillam, S., Astell, C., Nilson, J.H. and Rottman, F. (1980) Biochemistry 19, 1737.
Sasavage, N.L., Smith, M., Gillam, S., Woychik, R.P. and Rottman, F.M. (1981b) Submitted for publication.
Schibler, U. and Perry, R.P. (1976) Cell 9, 121.
Schechter, I. (1973) Proc. Natl. Acad. Sci. USA 70, 2256.
Schechter, I. (1974) Biochemistry 13, 1875.
Schimke, R.T., McKnight, G.S., Shapiro, D.J., Sullivan, D. and Palacios, R. (1975) Recent Prog. Horm. Res. 31, 175.
Schreiber, V. (1974) in MTP International Review of Science: Biochemistry of Hormones (Rickenberg, H.V., ed.) Vol. 5, pp. 61-100. University Park Press, Baltimore.
Sharp, P.A. (1981) Cell 23, 643.
Shapiro, D.J. and Baker, H.J. (1977) J. Biol. Chem. 252, 5244.
Shapiro, D.J., Baker, H.J. and Stitt, D.T. (1976) J. Biol. Chem. 251, 3105.
Shapiro, D.J., Taylor, J.H., McKnight, G.S., Palacios, R., Gonzalez, C., Kiely, M.L. and Schimke, R.T. (1974) J. Biol. Chem. 249, 3665.
Shatkin, A.J. (1976) Cell 9, 645.
Shih, D.S. and Kaesberg, P. (1973) Proc. Natl. Acad. Sci. USA 70, 1799.
Shine, J. and Dalgarno, L. (1974) Biochem. J. 141, 609.
Stein, J.P., Caterall, J.F., Kristo, P., Means, A.R. and O'Malley, B.W. (1980) Cell 21, 681.
Strair, R.K., Yap, S.H. and Shafritz, D.A. (1977) Proc. Natl. Acad. Sci. USA 74, 4346.
Suzuki, Y. and Brown, D.D. (1972) J. Mol. Biol. 63, 409.
Tucker, H.A. (1979) Seminars in Perinatology 5, 149.
Verma, I.M., Temple, G.F., Fan, H. and Baltimore, D. (1972) Nature New Biol. 235, 163.
Wei, C.-M. and Moss, B. (1977) Biochemistry 16, 1672.
Wiegers, U. and Hilz, H. (1971) Biochem. Biophys. Res. Commun. 44, 513.
Wilson, M.C., Sawicki, S.G., White, P.A. and Darnell, J.E. (1978) J. Mol. Biol. 126, 23.

PART II

USE OF OLIGODEOXYNUCLEOTIDE PRIMERS TO DETERMINE POLY(ADENYLIC ACID) ADJACENT SEQUENCES IN MESSENGER RIBONUCLEIC ACID. 3'-TERMINAL NONCODING SEQUENCE OF BOVINE GROWTH HORMONE MESSENGER RIBONUCLEIC ACID

ABSTRACT

Twelve synthetic oligodeoxynucleotide primers of the general sequence d(pT8-N-N') were tested in a reverse transcriptase reaction for specific initiation of complementary deoxyribonucleic acid (cDNA) synthesis at the poly(adenylic acid) junction of a messenger ribonucleic acid (mRNA) template. Only the sequence d(pT8-G-C) functioned as a specific primer of cDNA synthesis with an enriched fraction of bovine growth hormone mRNA from the anterior pituitary gland and produced unique fragments in a dideoxy sequencing reaction. The nucleotide sequence obtained by this method extended into the protein coding region of bovine growth hormone mRNA and was confirmed by chemical sequencing of the cDNA initiated with [5'-32P]d(pT8-G-C). The 3'-untranslated region of bovine growth hormone mRNA is 104 nucleotides in length and contains regions of significant homology with both rat and human growth hormone mRNAs, including the region surrounding the common AAUAAA hexanucleotide. The method presented here for selection of the d(pT8-N-N') primer complementary to the poly(A) junction of mRNA is of general applicability for nucleotide sequence analysis of partially purified mRNAs.
INTRODUCTION

An understanding of the biosynthesis and function of a specific mRNA requires a knowledge of its primary structure. Advances in both DNA and RNA nucleotide sequence analysis in the past several years have greatly stimulated studies of specific genes and gene transcripts at the molecular level. The development of two rapid DNA sequencing methods, involving termination of growing DNA chains (Sanger et al., 1977) and chemical cleavage of terminally labeled DNA (Maxam and Gilbert, 1977), has contributed most significantly to DNA structural analysis of cloned gene sequences. Occasionally, however, it is possible to isolate significant quantities of specific mRNA molecules from which sequence information may be obtained directly without prior cloning of the mRNA sequences. Indeed, a number of investigators have now adapted the rapid DNA sequencing technology to determine the primary structure of mRNA molecules directly.

The dideoxy chain termination method for DNA sequencing (Sanger et al., 1977) has recently been modified for use with reverse transcriptase-directed cDNA synthesis. Zimmern and Kaesberg (1978) specifically primed cDNA synthesis at the poly(A) junction of encephalomyocarditis virus mRNA using the oligonucleotide d(pT7)rC. Their analysis confirmed and extended the previously known 3'-terminal 26 nucleotides of this viral mRNA. Similarly, McGeoch and Turnbull (1978) determined a sequence of 205 nucleotides adjacent to the poly(A) tail of vesicular stomatitis virus N protein mRNA using d(pT8-C) as a primer for cDNA synthesis in the presence of dideoxy chain terminators. The complementary d(pT8-N) primer sequence for cDNA synthesis was determined by using d(pT10) to initiate transcription of the template in the absence of TTP (Schwartz et al., 1977). Only those oligo(dT) primers that hybridized at the poly(A) junction of the viral mRNA template were able to initiate reverse transcription under these conditions.
In this manner, the G residue adjacent to the poly(A) sequence was determined.

Other partial sequences of mRNAs have been determined directly by adapting the dideoxy chain termination method for use with specific synthetic oligonucleotides complementary to internal regions of the mRNA (Hamlyn et al., 1978) and small restriction fragments (Bina-Stein et al., 1979) as primers for cDNA synthesis. In yet another adaptation of the rapid DNA sequencing technology, Noyes et al. (1979) utilized a 5'-32P-labeled oligodeoxynucleotide probe complementary to a unique amino acid codon sequence of hog gastrin mRNA to prime cDNA synthesis. The resultant gastrin-specific cDNA was subsequently analyzed by the chemical degradation method of DNA sequencing developed by Maxam and Gilbert (1977).

In the above analyses, previous sequence information was available for portions of the mRNA prior to selection of a specific oligonucleotide primer for cDNA synthesis. We present here a method for obtaining nucleotide sequence information for any partially purified poly(A)-containing mRNA without previous sequence analysis by some alternate method. The technique involves phased priming at the poly(A) junction of the mRNA with oligodeoxynucleotide primers of the general sequence d(pT8-N-N'). The addition of a second nucleotide to the three possible d(pT8-N) sequences greatly enhances the selectivity of the cDNA priming reaction, thus facilitating the analysis of partially purified mRNAs. Growth hormone (GH) mRNA, enriched from bovine anterior pituitary poly(A) RNA, was tested in a dideoxy chain termination reaction with the twelve possible d(pT8-N-N') sequences to determine the specific primer complementary to the poly(A)-adjacent nucleotides. Of the twelve oligonucleotide sequences, only d(pT8-G-C) functioned as a specific primer for initiation of bovine GH cDNA synthesis and produced unique fragments in the dideoxy sequencing reaction.
This method of phased priming at the poly(A) junction of the mRNA template has allowed us to determine the complete 3'-noncoding sequence of bovine GH mRNA by both the chain termination and chemical cleavage methods of DNA sequencing.

MATERIALS AND METHODS

Materials

Reverse transcriptase from avian myeloblastosis virus was generously provided by Dr. J.W. Beard (Life Sciences, Inc., St. Petersburg, FL). T4 polynucleotide kinase was purchased from New England BioLabs, and Escherichia coli alkaline phosphatase was obtained from Worthington. The [α-32P]dCTP (400 Ci/mmole) was from Amersham, and [γ-32P]ATP (1000-3000 Ci/mmole) was from New England Nuclear. The 2',3'-dideoxynucleoside triphosphates and the deoxynucleoside triphosphates were purchased from P-L Biochemicals. The dideoxy compounds were used as supplied, and the deoxynucleoside triphosphates were purified by DEAE-cellulose chromatography as described by Brown and Smith (1977).

Purification of Bovine Growth Hormone mRNA

Polysomal RNA from fresh or liquid nitrogen-frozen bovine anterior pituitary glands was obtained by magnesium precipitation essentially as described by Palmiter (1974). The RNA was subsequently chromatographed on oligo(dT)-cellulose to obtain the poly(A) mRNA fraction as previously described (Nilson et al., 1979). GH mRNA was enriched by a modification of the approach described by Nilson et al. (1979). The details of this procedure will be described elsewhere (Nilson et al., 1980). Briefly, GH mRNA was obtained from the poly(A) RNA by sucrose density gradient centrifugation on 5-20% linear sucrose gradients. The GH mRNA fractions were identified by in vitro translation, pooled, ethanol precipitated, and centrifuged on identical sucrose gradients one or two additional times. The final bovine GH mRNA preparation was estimated to be approximately 70-80% pure by in vitro translation, with the major contaminant being prolactin mRNA.
Synthesis of Oligodeoxynucleotide Primers

The twelve oligodeoxynucleotides of the general sequence d(pT8-N-N') and the three d(pT8-N) sequences were enzymatically synthesized using E. coli polynucleotide phosphorylase. The complete conditions for the synthetic reactions and the oligonucleotide characterizations are described by Gillam and Smith (1980).

Screening of Oligodeoxynucleotide Primers for Specific Initiation of cDNA Synthesis

The 12 synthetic oligodeoxynucleotide primers with the general sequence d(pT8-N-N') were screened by a chain termination reaction for specific initiation of cDNA synthesis with partially purified bovine GH mRNA. The reverse transcriptase-directed cDNA synthesis using GH mRNA as a template was conducted in the presence of the chain terminator ddTTP. Each reaction contained 0.03 µg of enriched bovine GH mRNA and was incubated 10 min at 37°C in a 5-µL volume with 60 mM Tris (pH 8.3), 75 mM NaCl, 7.5 mM MgCl2, 25 mM DTT, 2.5 µM [α-32P]dCTP (330 Ci/mmole), 50 µM each dATP and dGTP, 2.5 µM each TTP and ddTTP, 17 µg/mL d(pT8-N-N'), and 270 units/mL reverse transcriptase. The reactions were performed in capillary tubes and incubated by placing the capillaries in a 6 x 50 mm siliconized test tube in a water bath. At the end of the incubation period, 0.25 µL of a mixture containing 2.5 µM each of the four dNTPs was added to the reaction mixtures in the capillaries, and synthesis was continued for 5 min at 37°C. The contents of the capillaries were then emptied into the siliconized test tubes, and the reactions were stopped by the addition of 5 µL of 90% formamide, 25 mM Na2EDTA, 0.02% bromophenol blue, and 0.02% xylene cyanol. Each sample was heated in the test tubes to 100°C for 3 min and quickly chilled on ice.
In these screening assays, electrophoresis of the samples (5 µL) was performed on a 12% polyacrylamide-7 M urea slab gel (0.5 x 200 x 400 mm) in 50 mM Tris-borate (pH 8.3), 1 mM Na2EDTA buffer for 3 hrs at 25 watts (Sanger and Coulson, 1978). The gel was fixed in 10% acetic acid for approximately 10 min, rinsed in water, and covered with Saran plastic wrap. Autoradiography of the gel was conducted at room temperature overnight with Kodak NS-5T film.

Dideoxy Sequencing Conditions

The oligodeoxynucleotide complementary to the poly(A) junction of bovine GH mRNA, as determined in the reaction described above, was d(pT8-G-C). This primer was used to initiate cDNA synthesis on the GH mRNA template in the presence of each of the four dideoxy chain terminators. The individual 5-µL reactions in capillaries contained 0.08 µg of partially purified bovine GH mRNA, 60 mM Tris (pH 8.3), 75 mM NaCl, 7.5 mM MgCl2, 25 mM DTT, 2.5 µM [α-32P]dCTP (330 Ci/mmole), 2.5 µM each of the appropriate ddNTP and its counterpart dNTP, 50 µM of the other two dNTPs, 17 µg/mL d(pT8-G-C), and 270 units/mL reverse transcriptase. The capillaries containing the reactions were incubated 10 min at 37°C before the addition of 1 µL of a "cold chase" mix containing 2.5 mM of each dNTP. Incubation was continued for 5 min at 37°C, and the synthesis was terminated by the addition of the dye mixture as described above. The reactions were then heated at 100°C for 3 min and quickly chilled on ice. The samples were divided into 5-µL aliquots and electrophoresed on 8, 12 and 20% polyacrylamide-7 M urea slab gels. The sequencing gels were treated and autoradiographed as described above.

Chemical Sequencing Conditions

The oligodeoxynucleotide d(pT8-G-C) was dephosphorylated and labeled with 32P at the 5' end for use in the chemical sequencing reactions of Maxam and Gilbert (1977).
The dephosphorylation reaction with E. coli alkaline phosphatase was performed in 45 mM ammonium formate for 3 hrs at 37°C (Montgomery et al., 1979). The reaction was stopped by the addition of EDTA. The sample was boiled for 5 min and then applied to Whatman No. 40 paper. Descending paper chromatography was conducted for approximately 35 hrs in n-propanol/NH4OH/H2O (55:10:35). The oligodeoxynucleotide was located by UV absorption and eluted from the paper with water adjusted to pH 10 with NH4OH. Dephosphorylated d(pT8-G-C) was then labeled with 32P at the 5'-end using T4 polynucleotide kinase and [γ-32P]ATP. Approximately 120 pmol of d(pT8-G-C) was incubated with an equimolar amount of [γ-32P]ATP (1000-3000 Ci/mmole) according to van de Sande et al. (1973). The reaction was stopped with 0.25 M EDTA, and the sample was applied directly to a Sephadex G-25 column (0.7 x 26 cm). The 32P-labeled primer was separated from the unreacted [γ-32P]ATP at a flow rate of approximately 0.25 mL/min with 50 mM NH4HCO3. The fractions containing the labeled primer were pooled and lyophilized to dryness.

The 32P-labeled d(pT8-G-C) (approximately 5 x 10^5 cpm/pmole) was used to specifically initiate cDNA synthesis with bovine GH mRNA. A 175-µL reaction mixture contained 8 µg/mL RNA, 200 µM of each dNTP, 50 mM Tris (pH 8.3), 60 mM NaCl, 6 mM MgCl2, 20 mM DTT, 0.7 pmoles/µL 32P-d(pT8-G-C) primer, and 270 units/mL reverse transcriptase. Incubation was for 1 h at 37°C. The synthesis was stopped with EDTA, and the reaction mixture was applied to a Sephadex G-200 column (0.7 x 26 cm). The excluded peak of 32P-cDNA was separated from free 32P-primer with 50 mM NH4HCO3. The pooled fractions of 32P-cDNA were lyophilized to dryness and analyzed according to the chemical cleavage method of Maxam and Gilbert (1977).

RESULTS

Screening of d(pT8-N-N') Primers for Specific Initiation of cDNA Synthesis
Phased priming of reverse transcriptase-directed cDNA synthesis at the poly(A) junction of mRNA results in a cDNA with a unique 5' end required for sequence analysis. This approach has been utilized in limited instances where the base or bases adjacent to the poly(A) tract of the mRNA had been previously determined in some other manner (Zimmern and Kaesberg, 1978; McGeoch and Turnbull, 1978). In this study, we have designed a method to screen the twelve oligodeoxynucleotides of the general sequence d(pT8-N-N') for specific initiation of cDNA synthesis on a mRNA template when the sequence at the poly(A)-junction is unknown. The technique involves a modification of the dideoxy chain termination method for DNA sequencing (Sanger et al., 1977).

The principle of our primer screening method is illustrated in Figure 1. Specific initiation of cDNA synthesis is obtained by base pairing of the primer to the poly(A) junction of the mRNA template. A reverse transcriptase reaction is performed in the presence of one dideoxy chain terminator, such as ddTTP. Specific priming of cDNA synthesis is then analyzed on a sequencing type gel (Sanger and Coulson, 1978). The detection of unique bands by autoradiography identifies the oligodeoxynucleotide complementary to the 3' terminus of the mRNA.

We have used this method to analyze the 3' terminus of GH mRNA. GH is a major anterior pituitary protein, and its mRNA constitutes a large fraction of the total mRNA obtained from this tissue (Nilson et al., 1979). An enriched fraction of bovine GH mRNA isolated by sucrose density gradient centrifugation was screened with the twelve d(pT8-N-N') primers as described above. No prior sequence information was available to predict the correct oligonucleotide primer for specific initiation of cDNA synthesis on this mRNA template.
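The composition of the primer set follows from the pairing rules described above and can be sketched in a few lines of Python. The sketch assumes that N, which pairs with the last mRNA nucleotide before the poly(A) tract, is never T (that nucleotide cannot be A), which leaves 3 x 4 = 12 sequences. The function names and the example 3' end used below are hypothetical and serve only as illustration.

```python
from itertools import product

OLIGO_DT = "T" * 8  # the pT8 stretch that pairs with the poly(A) tract

# Base-pairing partners for an mRNA base (RNA or DNA letters accepted).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def junction_primers():
    """Return the twelve d(pT8-N-N') primers, written 5'->3'.

    N pairs with the mRNA base immediately 5' of the poly(A) tract.
    That base cannot be A, so N is never T (assumption stated in the text above).
    """
    return [OLIGO_DT + n + n_prime for n, n_prime in product("ACG", "ACGT")]


def matching_primer(mrna_3prime_end):
    """Return the primer complementary to an mRNA 3' end given 5'->3'."""
    body = mrna_3prime_end.upper().rstrip("A")  # remove the poly(A) tract
    if len(body) < 2:
        raise ValueError("need at least two nucleotides before the poly(A) tract")
    # The primer is antiparallel to the mRNA, so N-N' is the reverse
    # complement of the two nucleotides preceding the poly(A) tract.
    n_n_prime = "".join(COMPLEMENT[base] for base in reversed(body[-2:]))
    return OLIGO_DT + n_n_prime
```

With a hypothetical 3' end such as `"CCUGCAAAAAAAA"`, `matching_primer` returns `"TTTTTTTTGC"`, which is the d(pT8-G-C) form: a primer ending in G-C pairs with an mRNA whose last two nucleotides before the poly(A) tract are GC.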
Only the sequence d(pT8-G-C) functioned as a specific primer of GH cDNA synthesis, as evidenced by the unique and most intense fragments produced in the ddTTP chain termination reaction (Figure 2). This result identifies the 3'-terminal nucleotides adjacent to the poly(A) segment of the mRNA as GC. An identical, but very weak, banding pattern occurred with the primer d(pT8-G-T) (Figure 2). A faint ladder of T residues imposed on the banding pattern obtained from the primer d(pT8-G-T) may indicate some nonspecific initiation on ...

Figure 1. Method for determination of the d(pT8-N-N') oligodeoxynucleotide primer complementary to poly(A)-adjacent nucleotides in mRNA. The above figure illustrates the principle for selection of the specific d(pT8-N-N') sequence complementary to the two 3'-terminal nucleotides of a mRNA sequence. The reverse transcriptase-directed cDNA synthesis is carried out in the presence of one dideoxynucleoside triphosphate such as ddTTP. Chain termination occurs at specific nucleotides in the sequence when cDNA synthesis is initiated by the complementary primer. Gel electrophoresis and autoradiography of the labeled fragments on a sequencing type gel produces a unique and specific pattern of bands.

... 2000 Ci/mmole) was from Amersham. All restriction enzymes, polynucleotide kinase, and exonuclease III were purchased from Bethesda Research Laboratories and used as specified. T7 exonuclease was prepared and generously supplied by Dr. R.J. Roberts of Cold Spring Harbor Laboratory. Reverse transcriptase was a gift from Dr. J. Beard (Life Sciences, Inc., St. Petersburg, FL). Sequencing reactions utilized ddNTPs from P-L Biochemicals and dNTPs from Sigma. Other enzymes used in this study were the Klenow fragment of DNA polymerase I from Boehringer-Mannheim, S1-nuclease from Miles Laboratories, and terminal transferase from Enzo Biochemicals.
Construction of cDNA Clones

Poly(A)-containing RNA from bovine anterior pituitary glands was used as a template for the synthesis of ds-cDNA with reverse transcriptase essentially as described previously (Nilson et al., 1980b). The ds-cDNA product (0.6 µg) was treated with 180 units of S1-nuclease in the presence of 3 µg of carrier 18S rRNA in 0.3 M NaCl, 0.05 M KOAc pH 4.5, and 1 mM ZnSO4 for 30 min at 25°. Molecules greater than 600 bp were then size-selected on 5-20% linear sucrose gradients containing 100 mM NaCl, 1 mM EDTA, and 10 mM Tris pH 7.5. Terminal transferase was used for homopolymeric addition of deoxycytidine residues. The conditions used for the terminal transferase reaction were essentially those described by Chang et al. (1978), except that the temperature of the incubation was 15° instead of 37°. The plasmid pBR322 was restricted with Pst I and similarly tailed with deoxyguanosine residues, but the reaction was at 37° for 5 min. Hybridization of the insert DNA and the plasmid was performed at a molar ratio of 2 to 1 as described by Goeddel et al. (1979). The recombinant molecules were used to transform E. coli HB101 under P2 conditions as specified by the NIH Guidelines. Transformants were detected by selective screening on plates containing tetracycline. The transforming efficiency was 12 transformants/ng cDNA.

Screening of Recombinant Plasmids

[32P]bPRL-specific cDNA was synthesized from purified bPRL mRNA with reverse transcriptase and used as a hybridization probe for the detection of recombinant plasmids containing PRL sequences (Nilson et al., 1980b) by the method of Grunstein and Hogness (1975).

Size Determination of Prolactin Inserts

Plasmid DNA was prepared from selected positive colonies by the rapid mini-screen procedure developed by Birnboim and Doly (1979). The PRL insert size was determined by Pst I digestion of the plasmids and electrophoresis of the DNA in 1.5% neutral agarose gels.
DNA Sequence Analysis

The chain terminator DNA sequencing method (Sanger et al., 1977) using single-stranded DNA as described by Smith (1979) was employed. The template DNA for the sequencing reactions was prepared from the plasmids pBPRL4 and pBPRL72 which had been previously linearized by Eco RI digestion. (There were no Eco RI sites in the cloned PRL sequences.) Linear plasmids were digested at a concentration of 0.2 mg/ml with either exonuclease III or T7 exonuclease at 1 unit/µg DNA for 30 min at 37°. The reaction conditions for the exonuclease III reaction were as recommended by the supplier. The T7 exonuclease digestion conditions have been described by Zain and Roberts (1979).

Primers for the sequencing reactions were prepared by digestion of the pBPRL72 insert with the enzymes Alu I, Dde I, Hae III, Hinf I, or Hpa II (see Figure 1). The fragments were separated on preparative polyacrylamide gels and stained with ethidium bromide. The regions of the gel containing the desired DNA bands were cut out, crushed with a glass rod, and eluted overnight with agitation in 1.5 ml of 0.1 M NaCl, 10 mM Tris pH 8.0, and 0.1 mM EDTA. The gel pieces were separated from the eluted DNA fragments by filtering through a small Quick-Sep column (IsoLabs). The solution was then phenol extracted, ether extracted, and precipitated with three volumes of ethanol at -80° for 1 hr or overnight at -20°. The DNA precipitate was collected by centrifugation in a SW50.1 rotor (Beckman) for 1 hr at 45,000 rpm and 4°.

Hybridization of the primer restriction fragment to the single-stranded DNA template was performed as follows. Five µl of the DNA template from the exonuclease digestion mixture (the equivalent of one µg of the initial linearized DNA), approximately 1.5 pmoles of the primer in 5 µl of H2O, and 2 µl of 10X hybridization buffer (10X contains 66 mM each of NaCl, Tris pH 7.5, MgCl2, and DTT) were mixed and sealed in a capillary.
The mixture was denatured in a boiling water bath for 3 min before hybridization at 68° for 30 min.

Each set of sequencing reactions used 20 µl of an equal proportion mixture of the four [α-32P]dNTPs (840 Ci/mmole). The hybridization mixture was subsequently used to recover the dried radioactive nucleotides. The final concentration in the sequencing reactions of each [α-32P]dNTP was 2 µM. An individual sequencing reaction contained 3 µl of the hybridization mixture with the labeled nucleotides, 1 µl of either 35 µM ddCTP, 135 µM ddTTP, 165 µM ddATP, or 35 µM ddGTP, and 1 µl (0.05 units) of the Klenow fragment of DNA polymerase I. The reactions were incubated 15 min at room temperature before adding 1 µl of a mixture that was 2.5 mM in each dNTP. Incubation was then continued for 15 min at room temperature. The sequencing reactions were stopped with 10 µl of 90% formamide and denatured at 100° (Sanger et al., 1977). Two-µl aliquots of each reaction were electrophoresed as described by Sanger and Coulson (1978). Autoradiographs of the sequencing gels were made with Kodak XR-5 or XAR-5 film.

The chemical DNA sequencing method of Maxam and Gilbert (1980) was used to analyze the extreme 5' and 3' portions of the pBPRL72 insert after labeling with [γ-32P]ATP at the Pst I sites and secondary cleavage with Hae III.

The total sequence of the insert from pBPRL72 (Figure 2) was obtained by the overlap of data from both DNA sequencing methods (Figure 1). Sequence data were confirmed by the agreement of complementary strands and restriction enzyme analysis (data not shown) where possible. The sequence data were analyzed by computer programs written by Drs. R.J. Roberts, R. Blumenthal, and T. Gingeras and Mr. R. Swift.

RESULTS

Detection of Clones with Large Prolactin Inserts

We have recently reported the cloning and sequence analysis of a 225 base pair insert coding for amino acids 119-192 of the bPRL protein (199 amino acids total) (Nilson et al., 1980b).
In the present study, mild S1-treatment and subsequent sizing of ds-cDNA synthesized from bovine pituitary poly(A)-containing RNA was employed in order to obtain full-length PRL sequences. A total of 165 PRL-positive colonies were identified from approximately 600 transformants. Recombinant plasmids from 24 independent colonies were prepared by the mini-screen technique of Birnboim and Doly (1979), digested with Pst I, and electrophoresed in agarose gels to determine the size of the PRL inserts. The majority of the clones contained inserts of approximately 600 bp (data not shown). Two recombinant plasmids containing large inserts of approximately 750 and 950 bp were selected for sequence analysis. These plasmids were designated pBPRL4 and pBPRL72 (plasmid bovine prolactin, 4th and 72nd colonies), respectively.

Sequence Analysis of Prolactin Inserts

The insert from pBPRL72 was excised with Pst I and separated on a preparative polyacrylamide gel. The Pst I digestion resulted in a single insert fragment, indicating the absence of a Pst I site in the bPRL sequence. The recovered insert was analyzed with a series of restriction enzymes. Digestion with the enzymes Alu I, Dde I, Hae III, Hinf I, and Hpa II resulted in fragments (data not shown) of appropriate size (20-150 bp) for use as primers in the dideoxy DNA sequencing method (Smith, 1979; Sanger et al., 1977). Several fragments were selected to determine the DNA sequence of the inserts from pBPRL4 and pBPRL72. The chemical method for DNA sequencing (Maxam and Gilbert, 1980) was used to determine the length of the homopolymer tails resulting from the dG·dC tailing technique. Figure 1 shows a summary of the sequencing strategy.

Figure 1. Sequencing Strategy. Restriction sites for the enzymes Dde I, Alu I, Hinf I, Hae III and Hpa II are shown within the Pst I insert of pBPRL72. The open boxes at the Pst I sites represent the dG·dC homopolymer tracts introduced during cloning.
The solid boxes below the map represent restriction fragments used as primers in the dideoxy sequencing reactions. The horizontal arrows indicate the extent that each sequence was determined from a specific primer. Arrows pointing to the left were determined from single-stranded templates prepared with T7 exonuclease and arrows to the right were determined from exonuclease III templates. The two arrows originating from solid circles represent the sequence determined by end-labeling at the Pst I site and chemical sequencing.

The complete sequence of the larger clone, pBPRL72, was determined using both methods for DNA sequencing (Figure 2). Most areas of the insert were sequenced two times, and for 70% of the insert length both complementary strands were analyzed (Figure 1). The cloned insert contained a total of 907 nucleotides corresponding to the bPRL mRNA sequence as well as 21 A residues from the poly(A) portion of the message. The homopolymer tails consisted of 26 and 28 bases at the 5' and 3' ends of the insert, respectively. The amino acid sequence predicted from the reading frame beginning at base number 158 (Figure 2) agrees with that of the bovine prolactin protein (Dayhoff, 1978) with one minor exception. The DNA sequence predicts an aspartic acid instead of an asparagine at amino acid position 31. Assignments from the DNA sequence at the indeterminate positions (glx and asx) of the amino acid sequence show glutamic acid for amino acids 70, 78, 118, 120, 121, 122, 143 and 145, glutamine for 71, 73, and 74, asparagine for 10 and 92, and aspartic acid for 93.

The clone pBPRL72 also contained the entire sequence corresponding to the signal peptide of the bPRL protein. Lingappa et al. (1977) reported the length of this signal sequence to be 30 amino acids, but the complete amino acid sequence was not determined.
The nucleotide sequence of pBPRL72 also predicts a signal peptide of this length. Only a single ATG codon (nucleotides 68-70, Figure 2) is found in the sequence preceding the codon for the N-terminal threonine of the authentic bovine prolactin protein. In addition, the locations of eight leucine codons in the hydrophobic center of the signal sequence reported by Jackson and Blobel (1980) are predicted correctly from the DNA sequence of the pBPRL72 insert. The amino acid sequence deduced from the reading frame following ... preprolactin sequence is the longest reported signal sequence of a secreted protein (Jackson and Blobel, 1980).

Figure 2. Nucleotide sequence of pBPRL72 insert. The 907 bp sequence of the cDNA insert from clone pBPRL72 is presented with the restriction enzyme sites indicated in Figure 1. The bases are also numbered as in Figure 1. The length corresponding to the coding portion of bovine preprolactin is 690 base pairs, starting from the ATG codon labeled met (-30) and terminating with the ochre TAA codon, labeled term (+200). The beginning of processed PRL protein is labeled thr (+1). The total length of this insert including the dC·dG tails and poly(A) segment is 982 base pairs. The bases marked by * are nucleotide positions that differ in the insert sequence of clone pBPRL4 (see Figure 3). The DNA sequence of pBPRL4 contains C, G, A, and A at nucleotide positions 442, 463, 568, and 898, respectively.

The start of the signal peptide sequence in pBPRL72 is preceded by 67 nucleotides corresponding to the 5'-untranslated portion of bPRL mRNA. By electrophoresis in denaturing gels we have previously estimated the size of bPRL cytoplasmic mRNA to be 1000 nucleotides, including the 3'-poly(A) segment (Nilson et al., 1979). If the poly(A) region of bPRL mRNA is approximately 100-150 nucleotides long, the 907 bp insert of pBPRL72 may represent a nearly complete copy of the mRNA. Furthermore, electron microscopic analysis of cytoplasmic bPRL mRNA and pBPRL72 duplexes does not show an unhybridized 5'-terminal extension of the mRNA. This observation indicates that any additional sequence for this region of the mRNA is likely to be less than 50 nucleotides. Together these data indicate that the 5'-noncoding region of bPRL mRNA represented in the cloned pBPRL72 insert is nearly complete.

The complete 3'-noncoding portion of bPRL mRNA is also included in clone pBPRL72.
The appearance of a poly(A) sequence in this clone establishes the exact length of the 3'-untranslated region to be 150 nucleotides. The common AAUAAA sequence found in the 3'-noncoding regions of eucaryotic mRNAs is located 28 bases upstream of the poly(A) addition site. This sequence has been postulated to play a role in the addition of poly(A) segments to mRNAs, and generally occurs approximately 20 bases prior to the poly(A) junction (Proudfoot and Brownlee, 1976).

In the current study, we examined the nucleotide sequence of two independent bPRL cDNA clones obtained in the same cloning experiment. Comparison of the insert DNA sequences from pBPRL72 and pBPRL4 indicates four positions where a change in the identity of the base is evident (Figure 2). All the nucleotide substitutions are silent with respect to the amino acid sequence of bPRL. Three of the changes occur in the third position of codons, while the fourth difference occurs in the 3'-noncoding region, ten nucleotides from the poly(A) junction of the mRNA (Figure 2). Similar sequence heterogeneities have been observed by Cooke et al. (1980) in two cDNA clones of rat PRL.

Earlier sequence data from our laboratory (Nilson et al., 1980) and another laboratory (Miller et al., 1980) indicated differences in the third base of codons for four amino acids in the bPRL sequence. Comparison with the present cloned bPRL mRNA sequences further documents the extent of these heterogeneities. Including the earlier differences, we have now identified a total of seven amino acids that display alternate bases in the third position of their codons. Both transitions and transversions of the nucleotides have been observed. These differences are summarized in Table I. In each instance the sequence polymorphism does not alter the protein sequence. Moreover, we have noted that the differences all occur in the codons of hydrophobic amino acids. The significance of this trend is unknown at this time.
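The two properties discussed here, whether a base change is silent and whether it is a transition or a transversion, can be checked mechanically against the standard genetic code. The following Python sketch is offered only as an illustration under stated assumptions: the helper name `compare_codons` is hypothetical, and the example codons in the note below are chosen for demonstration rather than read from Table I.

```python
# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

PURINES = {"A", "G"}  # C and U are the pyrimidines


def compare_codons(codon_a, codon_b):
    """Describe a single-base difference between two RNA codons."""
    differing = [i for i in range(3) if codon_a[i] != codon_b[i]]
    if len(differing) != 1:
        raise ValueError("expected exactly one differing base")
    position = differing[0]
    old, new = codon_a[position], codon_b[position]
    # A transition keeps the base class (purine<->purine or
    # pyrimidine<->pyrimidine); a transversion switches it.
    same_class = (old in PURINES) == (new in PURINES)
    return {
        "position": position + 1,  # 1-based position within the codon
        "kind": "transition" if same_class else "transversion",
        "silent": CODON_TABLE[codon_a] == CODON_TABLE[codon_b],
    }
```

For example, `compare_codons("GUA", "GUG")` reports a silent third-position transition, and `compare_codons("GUA", "GUU")` a silent third-position transversion, since all four GUN codons specify valine. A first-position change such as AAU to GAU (asparagine to aspartic acid) is reported as not silent.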
Although it is conceivable that such nucleotide differences may arise from errors in reading the sequencing gels, two clear examples of nucleotide changes are shown from our results in Figure 3. The codon for valine 103 reads as GUA in pBPRL72, but as GUG in pBPRL4. Miller et al. (1980) report a GU[illegible] sequence for the codon of this amino acid. One other nonexpressed difference is evident between the 3'-noncoding sequence of bPRL mRNA reported by Miller et al. (1980) and our data. We find the nucleotides CA at bases 768 and 769 (Figure 2) where they have reported a single G residue. The effect of this deletion/substitution on the bPRL mRNA structure is unknown, but it is nevertheless silent with respect to the protein sequence. Assuming both sequences are correct, this difference represents yet another type of polymorphism in the bPRL gene.

TABLE I

Summary of Sequence Polymorphisms in Bovine Prolactin mRNA

Amino Acid   Position Number   pBP6          pBPRL72       pBPRL4   pBP261
leucine      95                ---           [illegible]   CUC      ---
valine       103               ---           GUA           GUG      [illegible]
valine       137               [illegible]   [illegible]   GUA      GUU
serine       151               [illegible]   [illegible]   n.d.     UCA
glycine      152               [illegible]   [illegible]   n.d.     GGA
leucine      156               [illegible]   [illegible]   n.d.     [illegible]
serine       166               [illegible]   [illegible]   n.d.     [illegible]

The location and identity of the amino acids from bPRL with differences in the third base of their codons are listed. The total number of amino acids in the protein is 199. The bPRL cDNA clones contain the nucleotide sequence corresponding to the following amino acids: pBP6, 119 to 192 (10); pBPRL72, -30 to 199; pBPRL4, 26 to 199; and pBP261, 99 to 199 (Miller et al., 26). The codons that have not been sequenced are indicated as n.d.

The origin of the sequence heterogeneities in these bPRL cDNA clones may result from several possible sources. Although errors in the initial copying of the mRNA template by reverse transcriptase have not proven to be a problem in cDNA cloning, the possibility of introducing random base changes is feasible.
Furthermore, growth and amplification of the recombinant plasmids in E. coli may introduce mutations in the cloned sequences. However, since we only detect heterogeneities in the third positions of codons, it seems unlikely that the differences result from random errors by reverse transcriptase or random mutations of the hybrid plasmids in E. coli. A third possibility is that the sequence heterogeneities result from the existence of multiple bPRL mRNA sequences. Such multiple sequences may indicate a number of alleles for PRL in the gene pool of cattle or multiple loci within each animal.

As noted above, the mRNA used to generate the cDNA clones described in this study was obtained from several animals. We have therefore initiated sequencing studies of bPRL mRNA from single animals by extension of DNA restriction fragment primers hybridized to the mRNA template. Preliminary results suggest the presence of two nucleotides at the same position on the sequencing gel corresponding to the third nucleotide of some amino acid codons (data not shown). Animals with this type of sequence polymorphism in the cytoplasmic PRL mRNA would presumably be heterozygous at the PRL gene locus or have duplicated genes. Confirmation of these results will require cDNA cloning from one anterior pituitary gland.

The existence of multiple alleles for bPRL may be further examined at the genomic level. Jeffreys (1979) has recently described sequence polymorphism in the human globin genes of normal individuals detected by Southern blot hybridization. All three variant restriction enzyme cleavage sites occurred in the intervening sequence of the genes, similar to the results of Lai et al. (1979) with the chicken ovalbumin gene.
No variants were detected in the coding regions of the globin genes examined in this study; however, restriction enzyme analysis of this type is of course limited to sequence changes that occur within specific enzyme recognition sites. Based on Jeffreys' analysis, an average of at least 1 in 100 bp may be expected to vary polymorphically throughout the human genome. Comparison of cloned DNA sequences is more extensive and has allowed us to detect polymorphism in the coding region of bPRL. We may be able to investigate polymorphism in the coding sequence of bPRL at the genomic level due to the occurrence of restriction enzyme sites containing the variant sequences described here, as well as polymorphism of the intervening sequences.

Although it appears that the base substitutions we have identified are clustered in a small area of the bPRL protein (Table I), the total bPRL mRNA sequence is not equally represented in all the clones. A similar cluster of silent nucleotide heterogeneities is also evident in the sequence of rPRL mRNA as reported by two laboratories (Gubbins et al., 1980; Cooke et al., 1980). The two rPRL mRNA sequences also display one instance of a base change that leads to a conservative amino acid change, as well as a single base change in the 3'-noncoding region (see legend to Figure 4). Together, these comparisons of PRL mRNA sequences within two species suggest that polymorphic variations at the nucleotide level occur frequently in the gene pool.

The primary structure of bPRL mRNA is also of interest for comparison of evolutionary relationships. PRL is structurally similar to chorionic somatomammotropin (placental lactogen) and growth hormone. It has been proposed that the genes for this set of polypeptide hormones originated by duplication of a common ancestral gene (Niall et al., 1971).
Sequences have been reported for human CS (Shine et al., 1977) and GH (Martial et al., 1979), rat GH (Seeburg et al., 1977) and PRL (Gubbins et al., 1980; Cooke et al., 1980), as well as bovine GH (Miller et al., 1980). Our complete analysis of bPRL mRNA makes possible a comparison of PRL sequences from the rat and cow. Following the preparation of this manuscript, the sequence of human PRL (Cooke et al., 1981) also became available.

Substantial stretches of homology are apparent between the rat and bovine PRL sequences at both the amino acid and nucleotide level (Figure 4). The overall homology for the protein sequences including the signal peptide is 57%, while the total base homology including the untranslated portions is 68% (Table II). Similarly, the protein sequences of bPRL and hPRL contain 73% identical amino acids, while the nucleotide sequences are 80% homologous (data not shown). The bPRL mRNA sequence reported here is more extensive than the rPRL mRNA or hPRL sequences. A total of 51 and 94 nucleotides have been identified from the 5' and 3' untranslated sequences of rPRL mRNA, respectively. The size of rPRL mRNA (Gubbins et al., 1980) is similar to bPRL mRNA, in which we have identified 67 and 150 nucleotides from these respective regions. Significant homologous stretches are apparent in these regions by introducing gaps for ...

Table II. The amino acid and nucleotide homology between bovine PRL and rat PRL and between bovine PRL and bovine GH is summarized. The calculations in the coding regions of the mRNAs are based on the alignment of the amino acid sequences according to Dayhoff (1978). Alignment of nucleotide sequences in the noncoding portions of the mRNAs was maximized by introducing appropriate gaps. In each case, the ratio used to calculate the percent homology represents the number of matches per total number of positions compared.
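The ratio described in the legend (matches per total positions compared) can be sketched as a small Python function operating on two pre-aligned sequences. How gap positions were counted is not stated in the text; the sketch assumes that a column with a gap in only one sequence counts as a compared position that does not match. The function name and the example alignment are hypothetical.

```python
def percent_homology(seq_a, seq_b, gap="-"):
    """Return (matches, positions compared, percent) for two aligned sequences.

    Assumption: a column holding a gap in exactly one sequence is counted as a
    compared position without a match; columns that are gaps in both sequences
    are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no positions to compare")
    return matches, compared, 100.0 * matches / compared
```

For the hypothetical alignment `"ACGU-ACG"` against `"ACGUUACA"`, six of eight columns match, so the function returns `(6, 8, 75.0)`. The same function applies to aligned amino acid sequences written in one-letter code.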
[Table II. Rows: identical amino acids; identical codons; silent substitutions (1 nucleotide); silent substitutions (2 nucleotides); expressed substitutions (1 nucleotide); expressed substitutions (2 nucleotides); nonidentical codons. Values not legible.]

... >2000 Ci/mmole) was from New England Nuclear. T7 exonuclease was a generous gift of Dr. P. Sadowski (University of Toronto) and reverse transcriptase from avian myeloblastosis virus was provided by Dr. J. Beard (Life Sciences, Inc., St. Petersburg, FL). Exonuclease III and restriction enzymes were purchased from Bethesda Research Laboratories and used as specified. All other enzymes and materials were obtained or prepared as previously described (Sasavage et al., 1980).

Purification of Bovine Prolactin mRNA

Polysomal RNA was prepared from fresh bovine anterior pituitary glands (Nilson, 1980a). PRL-enriched mRNA was obtained by sucrose density gradient sedimentation of poly(A)-containing RNA (Nilson, 1980a). The final PRL mRNA preparation used in this study was estimated to be 80-90% homogeneous by in vitro translation (data not shown).

For the purification of PRL mRNA from a single animal, total RNA was isolated from one pituitary gland by the method of Glisin et al. (1974). PRL mRNA sequences were enriched by adsorption to PRL-specific DNA cellulose (Nilson et al., 1980b). The bound mRNA fraction was eluted and used as a template for cDNA synthesis as described below.
Synthesis and Use of Oligodeoxynucleotide Primers

The conditions for the enzymatic synthesis of the oligodeoxynucleotide primers, d(pT8-N) and d(pT8-N-N'), have been described in detail by Gillam and Smith (1980). Screening of the oligodeoxynucleotide sequences for specific initiation of cDNA synthesis on the bPRL mRNA template was performed as described previously (Sasavage et al., 1980). Reactions employing the primers for determining nucleotide sequences by the chain termination and chemical cleavage methods of DNA sequencing were as reported for bGH mRNA (Sasavage et al., 1980).

Screening and Sequencing of Prolactin cDNA Clones

A library of bPRL-positive cDNA clones prepared by Sasavage et al. (1981) was screened (Grunstein and Hogness, 1975) with a nick-translated restriction fragment from a previously sequenced bPRL insert. The Alu I fragment (108 base pairs) of clone pBPRL72 corresponded to the 3'-noncoding portion of the mRNA sequence (Sasavage et al., 1981). Several positive colonies were selected and the plasmids prepared for sequence analysis as described (Sasavage et al., 1981; Smith, 1979; Zain and Roberts, 1979). Briefly, plasmids linearized with EcoRI were digested with exonuclease III or T7 exonuclease to form single-stranded templates. The same Alu I restriction fragment was used as a primer for the DNA sequencing reactions by the dideoxy method (Smith, 1979; Sanger et al., 1977).

RESULTS

Screening of d(pT8-N-N') Primers for Specific Initiation of Prolactin cDNA Synthesis

We have synthesized 12 oligodeoxynucleotide primers of the sequence d(pT8-N-N') (Gillam and Smith, 1980) to specifically hybridize at the poly(A)-junction of mRNA templates. Such specific phasing of the primer on the template allows initiation of reverse transcriptase-directed cDNA synthesis to occur with a unique 5'-terminus required for sequence analysis.
We have previously shown that the single d(pT8-N-N') sequence complementary to the poly(A) junction of an mRNA template may be determined utilizing a single dideoxy chain terminator (Sasavage et al., 1980). Detection of a specific and unique pattern of bands on a sequencing gel identifies the complementary primer sequence. Since the chain termination fragments are characteristic of a particular mRNA sequence, the pattern of bands may be used to identify different mRNA species.

In this study, enriched bPRL mRNA was screened with the 12 d(pT8-N-N') primers to determine the two nucleotides adjacent to the poly(A) tail of this mRNA. PRL mRNA sequences constitute the major portion of mRNA from the bovine anterior pituitary (Nilson et al., 1979) and are readily enriched by sucrose density gradient centrifugation of pituitary polysomal mRNA (Nilson et al., 1980a). Each primer was tested separately for specific initiation of PRL cDNA synthesis in the presence of ddTTP. The resulting labeled fragments were separated on a sequencing gel to identify the d(pT8-N-N') sequence which specifically hybridized to the poly(A) junction and initiated cDNA synthesis (Figure 1).

Figure 1. Autoradiograph of d(pT8-N-N') primer screening. The 12 synthetic oligodeoxynucleotide primers with the sequence d(pT8-N-N') were tested in a chain termination reaction for specific initiation of cDNA synthesis with enriched bPRL mRNA. The production of specific sequence bands was used as the criterion for determining which primer was complementary to the poly(A) junction of the mRNA template. The reactions were performed with the inhibitor ddTTP and the resulting radioactively labeled fragments were electrophoresed in a 12% polyacrylamide-7 M urea gel. The gel lanes are labeled with the nucleotides of the primer sequence. The autoradiograph also shows the results of the three d(pT8-N) sequences and a control reaction (-) which was performed in the absence of an oligonucleotide primer. The letters XC mark the migration of the xylene cyanol dye. Specific initiation of bPRL cDNA synthesis was observed with the sequences CC, CG, AT, AG, GA, C, G and A.

Surprisingly, more than one primer sequence produced specific bands in the chain termination reaction with the bPRL mRNA template. cDNA synthesis primed by the sequences d(pT8-C-G), d(pT8-A-G), and d(pT8-G-A) resulted in intense chain termination fragments on the sequencing gel shown in Figure 1. The identical pattern of the chain termination fragments indicated that all three primers initiated cDNA synthesis on the same mRNA template. Comparison of the lanes on the gel revealed a particular set of characteristic fragments that was easily identifiable with each of the three primer sequences. This pattern of five T residues, a space, and three T residues (which we will refer to as T5-T3) was shifted in position between lanes on the sequencing gel. Since denaturing sequencing gels separate oligonucleotides differing in length by a single nucleotide, it is possible to estimate the relative size difference of a particular sequence band by counting the number of positions the band has shifted. For example, we estimated that the first residue of the T5-T3 pattern produced with d(pT8-A-G) was shifted seven nucleotides upward relative to its position in the lane for d(pT8-G-A) and, similarly, 12 nucleotides upward relative to its position in the lane for d(pT8-C-G). Furthermore, the distance a particular fragment migrates in the gel reflects its nucleotide length. By this analysis, we concluded that the mRNA template primed by d(pT8-A-G) was the longest, followed by d(pT8-G-A) and d(pT8-C-G), respectively.
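The band-shift reasoning above amounts to simple arithmetic, sketched here in Python. The two shift values are those estimated from the gel in the text; the function and variable names are hypothetical, and placing the shortest species at position 0 is only a convenient convention for the illustration.

```python
# Upward shift (in nucleotides) of the T5-T3 pattern in the d(pT8-A-G) lane
# relative to each of the other two lanes, as estimated from the gel.
SHIFT_OF_AG_OVER = {"GA": 7, "CG": 12}


def relative_poly_a_sites(shifts, longest="AG"):
    """Place each poly(A) site on a common axis, shortest species at 0.

    A larger shift relative to the longest species means a shorter template,
    so each site lies (max_shift - shift) nucleotides past the shortest one.
    """
    max_shift = max(shifts.values())
    sites = {longest: max_shift}
    for primer, shift in shifts.items():
        sites[primer] = max_shift - shift
    return sites


sites = relative_poly_a_sites(SHIFT_OF_AG_OVER)
print(sites)  # {'AG': 12, 'GA': 5, 'CG': 0}
```

On this axis the three putative poly(A) sites fall within 12 nucleotides of one another, with five nucleotides separating the CG and GA species and seven separating the GA and AG species.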
Closer inspection of the primer screening gel (Figure 1) revealed two other lanes with less intense bands of identical character to those produced with the AG-, GA-, and CG-containing oligodeoxynucleotides. The sequences d(pT8-C-C) and d(pT8-A-T) also display the characteristic T5-T3 pattern. In the case of d(pT8-G-C), a rather strong but nonidentical set of chain termination fragments was observed (Figure 1). We have previously determined that this sequence is complementary to the poly(A) junction of bGH mRNA (Sasavage et al., 1980). The bPRL mRNA utilized in the primer screening assay was enriched from pituitary polysomal mRNA by fractionation in sucrose density gradients; however, the major contaminant of this preparation, as determined by in vitro translation, is GH mRNA (data not shown). It is therefore likely that the bands observed with the oligodeoxynucleotide d(pT8-G-C) result from the presence of GH mRNA molecules in the enriched PRL mRNA preparation.

We have also tested the three oligo(dT) primers containing a single nucleotide, d(pT8-N), for comparison. These oligodeoxynucleotides also initiate cDNA synthesis at the poly(A) junction of mRNA and produce specific chain termination fragments (Figure 1). The pattern and location of bands produced by the d(pT8-N) primers parallel those produced with the major primer sequences d(pT8-A-G), d(pT8-G-A), and d(pT8-C-G).

Sequence Analysis with d(pT8-N-N') Primers

We chose to further analyze the 3'-noncoding region of bPRL mRNA with the complementary d(pT8-N-N') sequences containing the dinucleotides AG, GA, and CG. The primers were used to specifically initiate reverse transcriptase-directed cDNA synthesis in a complete set of dideoxy sequencing reactions. Each oligodeoxynucleotide primer resulted in an identical cDNA sequence for approximately 150 nucleotides of the 3'-noncoding region of bPRL mRNA.
A representative sequencing gel from analysis of bPRL mRNA with the primer d(pT8-C-G) is shown in Figure 2. The sequence obtained by this method was positively established as that of bPRL mRNA by identification of codons for several carboxy-terminal amino acids of the bPRL protein (Dayhoff, 1978) and an ochre termination codon (data not shown).

Figure 2. Autoradiograph of bPRL cDNA sequencing gel. One of the d(pT8-N-N') sequences, d(pT8-C-G), determined to be complementary to the poly(A) junction of bPRL mRNA (Figure 1) was used to initiate cDNA synthesis for sequencing by the dideoxy chain termination method. A portion of the cDNA as determined on a 20% sequencing gel is shown in the autoradiograph. The letters XC correspond to the migration of xylene cyanol. The lanes are marked with the appropriate chain terminator.

Further analysis of the 3'-terminus of bPRL mRNA by the chemical method for DNA sequencing (Maxam and Gilbert, 1980) was necessary to complete a small portion of the sequence. We utilized [32P]d(pT8-A-G), the sequence complementary to the longest mRNA template, to initiate cDNA synthesis for subsequent sequence analysis of the product by the method of Maxam and Gilbert (1980). The sequence data obtained by this method (data not shown) confirmed the previous dideoxy sequence analysis and completed the sequence of nucleotides immediately adjacent to the primer that had not been obtained by the chain termination method.

Examination of the nucleotide sequence from this area immediately preceding the poly(A) junction of the mRNA suggested an explanation for the initiation of cDNA synthesis by multiple d(pT8-N-N') sequences (Figure 3). Complementary sites were present in the bPRL mRNA template for each of the dinucleotide sequences AG, GA and CG. Furthermore, the number of nucleotides between the putative poly(A) sites was as previously estimated on the primer screening gel.
Together these results predicted the existence of multiple species of bPRL mRNA that differ only in the site of polyadenylation within a span of 12 nucleotides.

Identification of Bovine Prolactin cDNA Clones with Multiple Poly(A) Adjacent Sequences

Figure 3. The poly(A)-adjacent sequence of the major bovine prolactin mRNA species. A portion of the 3'-terminal sequence of bPRL mRNA determined by initiation of cDNA synthesis at the poly(A) junction by the three major complementary d(pT8-N-N') oligonucleotides is shown. The sequence information represents data obtained by both the dideoxy and chemical methods for DNA sequence analysis. The primer sequences at the poly(A) junction are marked. The hexanucleotide sequence AAUAAA is underlined to serve as a reference point for comparisons. It is the cDNA sequence from the hexanucleotide area that produces the characteristic T5-T3 pattern on the primer screening gel. As predicted from that analysis, the difference in length of the three bPRL mRNA species is five nucleotides between the species complementary to the d(pT8-C-G) and d(pT8-G-A) primers, and seven nucleotides between the species complementary to the d(pT8-G-A) and d(pT8-A-G) primers. Identical sequences were determined with each primer for the entire 150 nucleotides of the 3'-noncoding region.

To further document the existence of multiple poly(A) sites in bPRL mRNA molecules we examined a library of bPRL cDNA clones for the heterogeneous 3'-termini predicted by the primer analysis. We have previously