This is to certify that the dissertation entitled

IDENTIFICATION AND CHARACTERIZATION OF CARBOHYDRATE BINDING PROTEIN 35 GENE STRUCTURE

presented by SHIZHE JIA has been accepted towards fulfillment of the requirements for the DOCTOR OF PHILOSOPHY degree in BIOCHEMISTRY.

IDENTIFICATION AND CHARACTERIZATION OF CARBOHYDRATE BINDING PROTEIN 35 GENE STRUCTURE

BY
SHIZHE JIA

A DISSERTATION
SUBMITTED TO MICHIGAN STATE UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

DEPARTMENT OF BIOCHEMISTRY
1990

ABSTRACT

IDENTIFICATION AND CHARACTERIZATION OF CARBOHYDRATE BINDING PROTEIN 35 GENE STRUCTURE

By Shizhe Jia

Carbohydrate Binding Protein 35 (CBP35) is a galactose-specific lectin identified in the cytoplasm and nucleus of mouse 3T3 fibroblasts, as well as a number of other tissues and cell types. Affinity purified antibodies directed against CBP35 were used to screen a lambda gt11 expression library derived from mRNA of mouse 3T3 fibroblasts. One positive clone containing cDNA for CBP35 was characterized by expression of a fusion protein containing beta-galactosidase and CBP35 sequences. Limited proteolysis of bacterial lysates containing the fusion protein, followed by SDS polyacrylamide gel electrophoresis and immunoblotting with anti-CBP35, yielded a peptide mapping pattern comparable to that obtained from parallel treatment of authentic CBP35.
Such limited proteolysis, followed by affinity chromatography on a column of Sepharose derivatized with galactose, also yielded a 30-kDa polypeptide that exhibited carbohydrate-binding activity. This polypeptide can be immunoblotted with anti-CBP35, but not with antibodies directed against beta-galactosidase. These results indicate that the positive clone is an authentic CBP35 cDNA clone.

The complete nucleotide sequence of the cDNA clone for CBP35 has been determined. The deduced amino acid sequence showed that the protein consists of two domains: (a) an amino terminal portion that contains an internal sequence homology characterized by 8 repeats of a 9 amino acid motif, which is rich in proline and glycine residues and which shows structural similarity to certain regions of proteins of the heterogeneous nuclear ribonucleoprotein complex (hnRNP); and (b) a carboxyl terminal portion that is homologous to beta-D-galactoside specific lectins isolated from a number of animal tissues.

A cloned genomic DNA segment for the CBP35 gene was isolated. Nucleotide sequence analysis of the cloned gene, as well as the results of genomic Southern blot hybridization, revealed that the gene is unique and spans 9 kilobases of genomic DNA. The mRNA for the lectin is encoded by five exons separated by four introns. Intron I must necessarily be in the 5' transcribed but untranslated region of the primary transcript. Examination of the nucleotide sequence 5' to the transcription initiation site revealed characteristic TATA and CCAAT boxes. A putative serum-responsive element has also been identified about 200 nucleotides upstream from the site for initiation of transcription. This may serve as a binding site for a class of transcription factors responsible for the serum-stimulated expression of the CBP35 gene.

DEDICATED TO MY PARENTS

ACKNOWLEDGEMENTS

I would like to express my sincere appreciation to Dr. John Wang for all his encouragement and support.
He has provided me with opportunities to become more mature during my graduate career. I would like to thank my committee, Drs. Lee McIntosh, Arnold Revzin, Leonard Robbins and John Wilson, for their assistance during my education. I thank Drs. Richard Anderson, Melvin Schindler, John Ho and Ioannis Moutsatsos for their advice and discussion. I am grateful to my laboratory mates for their cooperation and criticisms: Patty Voss, Kristen Yang, Sung-Yuan Wang, Neera Agrwal, Kim Hamann, Jamie Laing and Dr. Liz Cowles.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER I  LITERATURE REVIEW
  I    INTRODUCTION
  II   ANIMAL LECTINS: STRUCTURAL MOTIFS
       A: CLASSIFICATION
       B: C-TYPE LECTIN
       C: S-TYPE LECTIN
  III  CARBOHYDRATE BINDING PROTEIN 35
       A: ISOLATION AND CHARACTERIZATION
       B: DISTRIBUTION
       C: NUCLEAR LOCALIZATION OF CBP35
       D: REGULATION OF EXPRESSION
  IV   IDENTITY BETWEEN CBP35 AND PROTEINS STUDIED UNDER OTHER NAMES
       A: L-34
       B: Mac-2
       C: LAMININ BINDING PROTEIN (LBP35)
       D: IgE BINDING PROTEIN (eBP)
  V    STRUCTURE OF 14 KD S-TYPE LECTINS
  VI   GOAL OF THIS THESIS
  REFERENCES

CHAPTER II  CARBOHYDRATE BINDING PROTEIN 35: MOLECULAR CLONING AND EXPRESSION OF A RECOMBINANT POLYPEPTIDE WITH LECTIN ACTIVITY IN ESCHERICHIA COLI
  SUMMARY
  INTRODUCTION
  MATERIALS AND METHODS
       (a) AFFINITY PURIFICATION OF ANTIBODIES AGAINST CBP35
       (b) SCREENING OF THE LAMBDA gt11 EXPRESSION LIBRARY
       (c) ISOLATION OF RECOMBINANT POLYPEPTIDE WITH LECTIN ACTIVITY FROM FUSION PROTEIN OF CLONE 1
       (d) NORTHERN BLOTTING ANALYSIS
  RESULTS AND DISCUSSION
       (a) IDENTIFICATION OF A cDNA CLONE FOR CBP35
       (b) EXPRESSION OF A FUSION PROTEIN ENCODED BY LAMBDA gt11 CLONE 1 DNA
       (c) PEPTIDE MAPS DERIVED FROM FUSION PROTEIN AND CBP35
       (d) ISOLATION OF RECOMBINANT CBP35 FROM FUSION PROTEIN
       (e) NORTHERN BLOT ANALYSIS OF mRNA ENCODING CBP35
  ACKNOWLEDGMENTS
  REFERENCES

CHAPTER III  CARBOHYDRATE BINDING PROTEIN 35: COMPLEMENTARY DNA SEQUENCE REVEALS HOMOLOGY WITH PROTEINS OF ...
       A: DNA CLONING
       B: DNA SEQUENCING
  RESULTS AND DISCUSSION
  REFERENCES

CHAPTER IV  NUCLEOTIDE SEQUENCE OF THE MURINE GENE FOR ...
  FOOTNOTES
  SUMMARY
  INTRODUCTION
  EXPERIMENTAL PROCEDURES
       A: SCREENING OF THE GENOMIC LIBRARIES
       B: CLONING AND SEQUENCING
       C: DIRECTIONAL DELETION OF A 2.7 KB FRAGMENT FOR DNA SEQUENCING
       D: PRIMER EXTENSION ASSAY
       E: SOUTHERN ANALYSIS OF GENOMIC DNA
  RESULTS
       A: ISOLATION AND CHARACTERIZATION OF GENOMIC CLONES
       B: SEQUENCE ANALYSIS OF THE GENOMIC CLONES
       C: STRUCTURAL FEATURES OF THE CBP35 GENE
       D: COMPARISONS OF THE SEQUENCES OF CBP35 WITH L-34, Mac-2, AND eBP
  DISCUSSION
  REFERENCES

LIST OF TABLES

CHAPTER I
  1. EXAMPLES OF MEMBRANE BOUND AND SOLUBLE LECTINS
  2. COMPARISON OF C-TYPE AND S-TYPE ANIMAL LECTINS
CHAPTER IV
  1. DIFFERENCES IN THE AMINO ACID SEQUENCES OF CBP35 VERSUS L-34

LIST OF FIGURES

CHAPTER I
  1. SUMMARY OF STRUCTURAL FEATURES OF C-TYPE ANIMAL LECTINS
  2. SUMMARY OF STRUCTURAL FEATURES OF S-TYPE ANIMAL LECTINS
CHAPTER II
  1. ISOLATION OF CLONE 1 AND cDNA INSERT OF CLONE 1 FOR CBP35 cDNA
  2. DETECTION OF EXPRESSION OF FUSION PROTEIN BY IMMUNOBLOTTING ANALYSIS
  3. COMPARISON OF THE PEPTIDE MAPS DERIVED FROM THE FUSION PROTEIN AND CBP35
  4. IMMUNOBLOT ANALYSIS OF RECOMBINANT POLYPEPTIDE WITH LECTIN ACTIVITY USING AFFINITY PURIFIED ANTI-CBP35
  5. NORTHERN BLOT ANALYSIS OF THE POLYA+ FRACTION OF ...
CHAPTER III
  1. RESTRICTION MAP OF THE cDNA INSERT OF CBP35 CLONE 1 AND STRATEGY FOR SEQUENCING
  2. NUCLEOTIDE AND DEDUCED AMINO ACID SEQUENCE FOR CBP35
  3. COMPARISON OF THE AMINO ACID SEQUENCE OF CBP35 WITH THE AMINO ACID SEQUENCES OF SIX OTHER BETA-D-GALACTOSIDE BINDING PROTEINS
  4. ALIGNMENT OF THE AMINO ACID RESIDUES SHOWING INTERNAL SEQUENCE HOMOLOGY
  5. COMPARISON OF THE DEDUCED AMINO ACID SEQUENCE OF CBP35 WITH THE AMINO ACID SEQUENCES OF hnRNP PROTEINS
CHAPTER IV
  1. RESTRICTION MAP AND SEQUENCING STRATEGY FOR MURINE CBP35 GENE
  2. SOUTHERN BLOT HYBRIDIZATION OF MURINE GENOMIC DNA WITH cDNA FOR CBP35
  3. NUCLEOTIDE SEQUENCE OF THE MURINE CBP35 GENE
  4. IDENTIFICATION OF THE TRANSCRIPTION INITIATION SITE OF THE MURINE CBP35 GENE BY PRIMER EXTENSION ASSAY
  5. COMPARISON OF THE NUCLEOTIDE SEQUENCE OF THE CODING REGION OF THE CBP35 GENE AND CORRESPONDING SEQUENCE REPORTED FOR L-34
  6. COMPARISON OF THE NUCLEOTIDE SEQUENCE OF THE 5' UNTRANSLATED REGION OF L-34, Mac-2, eBP AND CBP35
LIST OF ABBREVIATIONS

ASGP: asialoglycoprotein
beta-galase: beta-galactosidase
bp: base pair(s)
CBP: carbohydrate binding protein
CLL-I: chicken lactose lectin I
CLL-II: chicken lactose lectin II
CRD: carbohydrate recognition domain
eBP: IgE binding protein
EDTA: (ethylenedinitrilo)-tetraacetic acid
Gal: galactose
hnRNP: heterogeneous nuclear ribonucleoprotein complex
HRP: horseradish peroxidase
IPTG: isopropyl-beta-D-thio-galactopyranoside
IVS: intervening sequence
kb: kilobase(s)
kD: kilodalton(s)
LBP35: laminin binding protein 35
L-34: mouse lectin 34
M-6-P: mannose 6-phosphate
Mac-2: mouse macrophage surface antigen
PAGE: polyacrylamide gel electrophoresis
PMSF: phenylmethyl sulfonyl fluoride
SDS: sodium dodecyl sulfate
SRE: serum response element
TE: 125 mM Tris-HCl, 1 mM EDTA, pH 6.8
Tris: tris(hydroxymethyl)aminomethane

CHAPTER I

LITERATURE REVIEW

I INTRODUCTION

Carbohydrate binding proteins bind to saccharide-containing regions of glycoproteins and glycolipids. Lectins are carbohydrate binding proteins that were originally identified in plant extracts as agglutinins of erythrocytes (1). In the more recent literature, lectin and carbohydrate binding protein have become interchangeable terms, and they will be used interchangeably in this thesis. Although lectins were originally identified in plants, they have also been found in many other organisms including bacteria, slime molds and vertebrates (2). In this thesis, the gene structure of an animal lectin, designated Carbohydrate Binding Protein 35 (CBP35), will be described. The literature review will therefore focus on certain structural themes of animal lectins.

II ANIMAL LECTINS: STRUCTURAL MOTIFS

A: CLASSIFICATION

There are two main classification systems for animal lectins. The first system classifies the proteins according to solubility (Table I): a) integral membrane lectins, which require detergents for their solubilization; and b) lectins soluble in aqueous buffer (17).
Key examples of the membrane bound lectin group include the asialoglycoprotein (ASGP) receptors (3, 4), which may function to clear serum glycoproteins, and the mannose 6-phosphate (M6P) receptors, which facilitate the transport of lysosomal enzymes from the Golgi to their target organelles (5, 6). The soluble lectins in turn fall into two major families, one with molecular weights of 14,000-16,000 (14 kD-16 kD) and a second with molecular weights of 29,000-35,000 (29 kD-35 kD).

Another classification of animal lectins is based on the dependence of the carbohydrate binding activity on cations (C-type lectins) or on thiols (S-type lectins) (18). The carbohydrate binding activity of C-type lectins is Ca2+-dependent, while the S-type lectins are thiol-dependent for their activity. Table II summarizes the distinguishing features of the two types of lectins (18). From numerous structural studies, it is now clear that both C-type and S-type lectins consist of two different domains. The first domain, if one exists, is termed the special effector domain; this confers unique properties on the particular lectin. The second domain is the carbohydrate recognition domain (CRD).

Table I Examples of Membrane Bound and Soluble Lectins

CLASSIFICATION   LECTIN                   SUBUNIT MW   LIGAND    SOURCE    REFERENCE
Membrane bound   ASGP receptor            26 K         GlcNAc    Chicken   3
                                          52 K         D-Gal     Rabbit    4
                 M-6-P receptor           46 K         Man-6-P   Bovine    5
                                          225 K        Man-6-P   Human     6
Soluble          CLL-I/CLL-II             14-16 K      D-Gal     Chicken   7-9
                 L-34/CBP35/Mac-2/LBP35   29-35 K      D-Gal     Mouse     14-16
                 eBP                      31 K         D-Gal     Rat       36

Table II Comparison of C-type and S-type Animal Lectins

Property                   C-type lectins   S-type lectins
Ca2+ requirement           Yes              No
State of cysteines         Disulfides       Free thiols
Solubility                 Variable         Aqueous solutions
Location                   Extracellular    Intracellular and extracellular
Carbohydrate specificity   Various          Mostly β-galactosides

B: C-TYPE LECTINS

Figure 1 is a summary of the structural features of C-type lectins (18). An important feature of the structure is that all these lectins have a CRD of 130 amino acid residues, in which 18 amino acid residues are conserved (20). Of particular interest are the four Cys residues that are involved in disulfide bonds. It should be noted that the carbohydrate binding specificities of different C-type lectins differ: some are specific for galactose (e.g. ASGP receptor) while others are specific for mannose. Thus, the conservation of amino acids in the CRD is for carbohydrate binding, not necessarily for a specific saccharide.

Figure 1. Summary of the structural features of C-type animal lectins. The invariant residues found in the common carbohydrate-recognition domain of the C-type lectins are shown, flanked by schematic diagrams of the special effector domains (if any) found in individual members of the family. EGF, epidermal growth factor; GAG, glycosaminoglycan (adapted from ref. 18).

A second feature of the structure is that different lectins in the C-type category have distinct effector domains. For example, the ASGP receptor described earlier is a membrane bound lectin; it therefore has a membrane anchor as its effector domain (Fig. 1). Similarly, the cartilage proteoglycan core protein has a CRD that is linked to an effector domain to which many glycosaminoglycans are attached.
Finally, certain C-type lectins, such as the fly lectin and the sea urchin lectin, contain only the CRD; they lack the effector domain. Thus, although the exact functions of all of the C-type lectins are still not entirely understood, the structural information on this class of lectins is relatively clear. In fact, some of the structural information on the effector domain provides clear-cut notions of their overall function. This theme is also shared by the domain delineation in the S-type lectins.

C: S-TYPE LECTINS

In S-type lectins, there are two groups of proteins: the 14-16 kD and the 29-35 kD lectins. The 14-16 kD lectin group exhibits only one domain, the CRD. In addition to the CRD, the 29-35 kD lectin group exhibits a second domain. These features are schematically illustrated in Figure 2. Like the C-type lectins, the CRD of S-type lectins exhibits ...

Figure 2. Summary of the structural features of S-type animal lectins. Conserved residues found in all of the family members so far sequenced are shown. In addition, the extra domain found in CBP35 is shown schematically.

... 95%) of the CBP35 is found inside the cell. Although CBP35 could be detected at the cell surface (e.g. by anti-CBP35 antibodies), the interpretation of such results has not been clear-cut. Much of this extracellular CBP35 could be due to cell lysis and leakage of intracellular lectin, which then becomes bound to the cell surface. Therefore, most of the attention has been focused on intracellular CBP35.

The results from immunofluorescence studies with anti-CBP35 antibodies on formaldehyde-fixed and detergent-permeabilized 3T3 cells have been particularly interesting. First, the lectin was predominantly localized in the nucleus of proliferating 3T3 cells, while in quiescent 3T3 cultures the majority of the CBP35 was found in the cytoplasm.
Second, CBP35 had a distinct punctate staining in the 3T3 cell nucleus (21). This indicated that CBP35 may be associated with some subnuclear structure.

C: NUCLEAR LOCALIZATION OF CBP35

Immunochemical studies have shown that CBP35 may not be present in a free form in 3T3 cells but rather may be associated with the heterogeneous nuclear ribonucleoprotein complex (hnRNP) in the nucleus (27) and with ribonucleoprotein complexes (RNPs) in the cytoplasm (Laing, J., and Wang, J. L., unpublished data). When unfixed and detergent-permeabilized 3T3 cells were digested with ribonuclease A (RNase A) and deoxyribonuclease I (DNase I), only RNase A could release CBP35 from the nucleus. This indicated that the nuclear CBP35 may be associated with the ribonucleoprotein fraction in the nuclei. When subnuclear fractions were isolated by sucrose or cesium sulfate gradient centrifugation, CBP35 was found in the same fractions as hnRNP. Moreover, when nucleoplasm was subjected to affinity chromatography on a galactose-Sepharose column, the bound fraction eluted with lactose contained CBP35 as well as polypeptides corresponding to those identified in hnRNP (27). This was interpreted to indicate that CBP35 in the complex bound to the column, which in turn co-purified the hnRNP components as a complex.

Since CBP35 was identified in the hnRNP complex, it is helpful to summarize some literature ...

[Chapter III, Figure 2. Nucleotide and deduced amino acid sequence for CBP35.]

... This was quite distinct from the carboxyl terminal portion (residues 126-263), which contained both hydrophobic and hydrophilic regions that are characteristic of many globular proteins.

Each of these two domains is homologous to distinct groups of proteins previously identified on the basis of their function/activity. A region containing 76 amino acids (residues 138-214) was found to be homologous to a number of β-D-galactoside specific lectins (7-11) (Fig. 3). For bovine heart lectin (10), chicken skin lectin (7,8), and electric eel lectin (11), the complete amino acid sequences over this 76-residue stretch were available. The extents of homology between CBP35 and these lectins were: (a) 34% with lectin from electric eel; (b) 36% with bovine heart lectin; and (c) 38% with chicken skin lectin. Several peptide sequences are highly conserved in all the lectins sequenced so far. These include the sequence His-Phe-Asn-Pro-Arg-Phe-Asn (residues 171-177) and the sequence Trp-Gly-Lys-Glu-Glu-Arg-Gln-Ser-Ala-Phe-Pro-Phe (residues 194-205). The latter sequence, containing a tryptophan and two glutamic acid residues, corresponds to the peptide hypothesized to be present in the β-D-galactoside binding site of the eel lectin (11). Previous searches of data bases with the bovine heart lectin sequence (10) and with the sequences of two hepatoma clones (9) failed to reveal any significant homologies with other proteins.
As a result of these analyses, it was proposed that the soluble vertebrate lectins with β-D-galactoside binding activity represent a new protein family. CBP35 now joins this protein family on the basis of its sequence homology with these lectins. Moreover, the present sequence results also provide the structural basis for the observation that the clone 1 fusion protein, upon V-8 protease digestion, yielded a polypeptide with carbohydrate-binding activity (5).

Figure 3: Comparison of the amino acid sequence of CBP35 clone 1 (residues 127 through 214 in the numbering system of Figure 2) with the amino acid sequences of six other β-D-galactoside binding lectins. BHL, bovine heart lectin (10); CSL, chicken skin lectin (7,8); HLL, peptides from human lung lectin (9); HEP 1 and HEP 2, sequences deduced from two cDNA clones derived from a human hepatoma library (9); and EEL, lectin from electric eel (11). Dashes indicate gaps introduced for optimal alignment; complete gaps are positions for which no sequence information is available; * denotes uncertain residue assignments. Sequence identities between CBP35 clone 1 and the other lectins are highlighted by boxes. The sequences were compared using the FASTP program (16).

While the amino acid sequence toward the carboxyl terminal half of CBP35 is homologous to the sequences of other lectins, the sequence at the amino terminal end also showed interesting features.
First, the sequence between residues 40-112 showed eight internal sequence homologies (Fig. 4): (I) residues 40-48; (II) residues 49-57; (III) residues 58-66; (IV) residues 67-75; (V) residues 76-84; (VI) residues 85-93; (VII) residues 94-102; and (VIII) residues 104-112. Each of these homologous regions consists of a 9-residue repeat, with a consensus sequence of Pro-Gly-Ala-Tyr-Pro-Gly, followed by three additional amino acids. As a result, this stretch of the sequence is characterized by a high proportion of Pro (27%) and Gly (24%).

In addition, the sequence at the amino terminal portion also showed homology with the amino acid sequences of several proteins identified as polypeptides of the heterogeneous nuclear ribonucleoprotein complex (hnRNP) (Fig. 5). These include: (a) 25 identities over 108 residues with a glycine-rich protein (GRP33) of the hnRNP of the brine shrimp Artemia salina (12); (b) 18 identities over 71 residues with the deduced amino acid sequence of clone DL-4, identified from a human hepatoma cDNA library on the basis of its expression of a fusion protein reactive with chicken antibodies directed against bovine hnRNP proteins (13); and (c) 11 identities over 42 residues with human hnRNP protein C1 (14). There was no apparent homology between the sequences of CBP35 and the rat hnRNP protein A1 (15) when subjected to the same analysis with the FASTP program (16).

None of the hnRNP proteins contained any copies of the striking 9-residue repeat sequence (Fig. 4) found in CBP35. However, the extent of homology between CBP35 and the hnRNP proteins shown in Fig. 5 was 25%, comparable to the level of homology between the hnRNP proteins themselves (12-15). Moreover, based on sequence-scrambling comparison of 20 random sequences, the FASTP program of Lipman and Pearson (16) yielded
the following statistical significance for the observed homologies: (a) the aligned score between CBP35 and GRP33 was 1.52 standard deviations above the mean score; (b) the aligned score between CBP35 and hnRNP protein C1 was 0.96 standard deviations above the mean score; and (c) the aligned score between CBP35 and DL-4 sequences was 0.03 standard deviations above the mean score. For comparison, the aligned score between hnRNP proteins C1 and A1 was 1.04 standard deviations above the mean score.

Finally, CBP35 shares with a number of hnRNP proteins what appears to be one of their typical features, i.e. the presence of distinct domains with non-uniform distribution of Gly and Pro residues in the polypeptide chains. In the case of CBP35, 86% of the 36 Gly residues and 87% of the 39 Pro residues are in the amino terminal domain of the molecule. For both GRP33, the hnRNP protein from brine shrimp (12), and rat hnRNP protein A1 (15), approximately 77% of the total Gly residues are located within the carboxyl terminal 124 amino acid residues. An unequal distribution of Pro has also been observed for human hnRNP protein C1; all of the Pro residues were found in the amino terminal half of the polypeptide chain (14). Therefore, the homology of CBP35 to the sequences of hnRNP proteins and the conservation of certain structural features in distinct domains suggest that the lectin might be one of the hnRNP proteins.

Figure 4: Alignment of the amino acid residues showing internal sequence homology. The amino terminal domain of CBP35 clone 1 shows a repetitive sequence, each repeat consisting of nine amino acids. Numbers at the left and right indicate the amino acid residue in the numbering system of Figure 2.
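The sequence-scrambling comparison reported above expresses an observed alignment score as a number of standard deviations above the mean score obtained when one sequence is randomly shuffled. The following is a minimal sketch of that idea only. It assumes an ungapped count of identical positions as the score, which is a simplification and not the scoring scheme of the FASTP program, and the two sequences are hypothetical placeholders, not CBP35 or hnRNP sequences.

```python
import random
import statistics


def identity_score(a: str, b: str) -> int:
    """Count positions at which two ungapped sequences carry the same residue."""
    return sum(x == y for x, y in zip(a, b))


def scrambled_z_score(query: str, target: str,
                      n_scrambles: int = 20, seed: int = 0) -> float:
    """Return (observed - mean) / sd, where mean and sd come from scores
    against randomly shuffled copies of the target sequence."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = identity_score(query, target)
    scrambled = []
    for _ in range(n_scrambles):
        residues = list(target)
        rng.shuffle(residues)  # same composition, random order
        scrambled.append(identity_score(query, "".join(residues)))
    mean = statistics.mean(scrambled)
    sd = statistics.stdev(scrambled)
    if sd == 0:
        raise ValueError("scrambled scores have zero variance")
    return (observed - mean) / sd


# Hypothetical toy sequences, for illustration only.
seq_a = "MKTAYIAKQRQISFVKSHFSRQ"
seq_b = "MKTAYLAKQRNISFVRSHFTRQ"
print(round(scrambled_z_score(seq_a, seq_b), 2))
```

Shuffling preserves amino acid composition, so a score well above the scrambled mean reflects residue order rather than a shared bias toward residues such as Gly or Pro.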
Figure 5: Comparison of the deduced amino acid sequence of CBP35 clone 1 (residues 14 through 120 in the numbering system of Figure 2) with the amino acid sequences of hnRNP proteins. GRP33, glycine-rich protein of the hnRNP of Artemia salina (12); DL-4, sequence deduced from a cDNA clone derived from a human hepatoma library (13); and human hnRNP C1 protein (14). Dashes indicate gaps introduced for optimal alignment. Sequence identities between CBP35 clone 1 and the hnRNP proteins are highlighted by boxes. The sequences were compared using the FASTP program (16).

...

[Chapter IV, Figure 1. Restriction map and sequencing strategy for murine CBP35 gene.]

The 6.4 kb EcoRI-EcoRI fragment and the 6.7 kb EcoRI-SalI fragment from the λGH clone (Fig. 1) were cloned into the pUC-18 plasmid. After HindIII digestion, these fragments were further subcloned into M13mp18 and M13mp19 vectors (23). Both single-stranded and double-stranded DNA sequencing analyses were performed with the Sequenase Kit. Eight oligonucleotide primers, synthesized on the basis of the cDNA sequence (6), were used in carrying out the sequencing reactions. The oligonucleotides were synthesized on an Applied Biosystems DNA synthesizer (Model 380B) in the Macromolecular Structure Facility at Michigan State University. The areas and directions of the sequencing analysis are shown by the arrows in Figure 1.

Directional Deletion of a 2.7 kb Fragment for DNA Sequencing

The method used followed that described in Sambrook et al. (25). The 2.7 kb EcoRI-HindIII fragment of the λGH clone (Fig. 1) was digested by Nuclease S1 (Boehringer Mannheim) to blunt-end the DNA fragment. This was then cloned into the SmaI site of the M13mp18 vector. The orientation of the insert was determined by restriction analysis.
A clone with the original HindIII end toward the SalI site of the M13mp18 multiple cloning site was selected. Approximately 10 µg of M13 replicative form DNA was isolated from this clone and digested with SalI and PstI. Nuclease S1 (40 U; Boehringer Mannheim) and Exonuclease III (200 U; Boehringer Mannheim) were then added to the double-digested DNA at 37°C. Samples were withdrawn from this digestion mixture at 90, 135, 180, and 225 seconds. The samples were placed at 30°C for 30 minutes. Then, the Klenow fragment of DNA polymerase I (10 U; Boehringer Mannheim) was added to each sample, along with the dNTPs (0.5 mM each), and the samples were incubated at 25°C for 30 minutes to create blunt ends. The DNA samples were finally ligated and transformed into E. coli. Twenty-five clones were selected from each transformation and sequenced using universal primer and the Sequenase Kit. A total of about 96 clones were analyzed.

Primer Extension Assay

Quiescent, serum-starved cultures of 3T3 fibroblasts were stimulated by the addition of serum (5). After 18 hours, cytoplasmic RNA was isolated following the protocol of Henikoff (26). The cells were isolated and resuspended on ice for 5 minutes in 10 mM Tris buffer (pH 8.6) containing 0.14 M NaCl, 1.5 mM MgCl2, 0.5% Triton X-100, and 1000 U/ml of RNasin. The nuclei were pelleted by centrifugation at 500 x g for 5 minutes. The supernatant was then made 2% in sodium dodecyl sulfate and 50 µg/ml in proteinase K. After 1 hour at 37°C, the supernatant was extracted once with phenol:chloroform (1:1, v/v) and the RNA was collected by ethanol precipitation. The DNA in the sample was removed by digestion with 2 µg/ml deoxyribonuclease I at 37°C for 1 hour. The general method for the primer extension assay has been described (25). A 22-nucleotide primer, 5'-CGAAAAGCTGTCTGCCATTTTC-3', was synthesized and 32PO4-labeled at the 5' end with T4 polynucleotide kinase (2 U/µl) at 37°C for 30 minutes.
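Because a primer extension primer anneals to the mRNA, the sense-strand sequence it targets is the reverse complement of the primer. The short sketch below derives that sequence from the 22-nucleotide primer quoted above; the function name is illustrative. The result contains an ATG triplet, which would be consistent with the primer annealing near the start of the coding region, although that interpretation is an assumption and is not stated in this passage.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


primer = "CGAAAAGCTGTCTGCCATTTTC"      # primer sequence as given in the text
target = reverse_complement(primer)    # sense-strand sequence the primer pairs with

print(target)              # GAAAATGGCAGACAGCTTTTCG
print(target.find("ATG"))  # 4 (0-based index of the first ATG)
```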
The labeled primer (10^7 cpm/µg) was isolated by ethanol precipitation and was then allowed to hybridize for 12 hours at 30°C to 30 µg of cytoplasmic RNA isolated from 3T3 cells. Reverse transcriptase (60 U; Boehringer Mannheim) was added and the reaction was incubated at 37°C for two hours. The radioactive product of the extension reaction was electrophoresed on a 10% polyacrylamide sequencing gel, followed by autoradiography. The length of the extended cDNA was estimated by comparison to a sequencing reaction using a 20-nucleotide primer.

Southern Analysis of Genomic DNA

Mouse liver nuclei were isolated from a homogenate of 15 grams of liver tissue by centrifugation at 500 x g for 5 minutes. The nuclei were suspended in 10 mM Tris (pH 8.0) containing 0.14 M NaCl, 1.5 mM MgCl2, 0.5% sodium dodecyl sulfate and 100 µg/ml proteinase K at 37°C for 16 hours. DNA was isolated by ethanol precipitation and purified by CsCl gradient centrifugation. The DNA was then digested with EcoRI, HindIII and BamHI. The initial digestion used 20 U of enzyme per µg of DNA and was carried out for 20 hours. An additional 10 U of enzyme per µg of DNA was then added and the digestion was continued for 4 more hours. The digested DNA was separated by agarose gel electrophoresis (0.8%) and subjected to Southern blotting (27) using 32P-labeled CBP35 cDNA (10^8 cpm/µg) as a hybridization probe.

RESULTS

Isolation and Characterization of Genomic Clones

A λ EMBL4 library derived from mouse liver chromosomal DNA was screened using the cDNA for CBP35 (6) as a hybridization probe. Of the 10^6 plaques screened, three gave reproducible positive signals. One of these, designated as λG3, was further cloned by lower density rescreening and plaque purification. Digestion of λG3 with PstI and sequence analysis in both directions (Fig. 1) revealed the single PstI site in the nucleotide sequence of the cDNA (6). Comparison of the genomic sequence with the cDNA sequence defined exon V, corresponding to the 3' end of the cDNA clone.
The 1.7 kb fragment, derived from the "left" end of λG3 by PstI digestion, was sequenced in its entirety, without finding any additional sequence matching the cDNA. This suggested that the 1.7 kb fragment was inside an intron of the genomic DNA. Thus, this 1.7 kb fragment of λG3 (highlighted by a hatched rectangle in Fig. 1) was used as a hybridization probe to search for other genomic clones from the λ FIXII library, also derived from mouse liver chromosomal DNA. Clone λGH was selected from such a screening. Analysis of restriction enzyme cleavages and of the partial nucleotide sequence showed that the λGH and λG3 clones overlapped over a region of approximately 6 kb, corresponding to the "right" end of λGH and the "left" end of λG3 (Fig. 1). When the cDNA for CBP35 (6) was digested with PstI, a fragment corresponding to the 3' end (nucleotides 696-883) of the cDNA was isolated. This fragment hybridized to both λGH and λG3 clones. In contrast, the other fragment of the cDNA generated by PstI (nucleotides 1-695) hybridized to the λGH clone, but not the λG3 clone. Therefore, it appears that the overlapping region between λGH and λG3 contained the 3' end of the cDNA, while the 5' end of the cDNA was found in the remainder of the λGH clone (Fig. 1).

Figure 2: Southern blot hybridization of mouse genomic DNA with the cDNA clone for CBP35. Chromosomal DNA was isolated from mouse liver nuclei. The DNA was digested with restriction enzymes, separated on agarose gels (0.8%) and subjected to Southern blot analysis with 32P-labeled CBP35 cDNA (10^8 cpm/µg). Panel A: lane 1, BamHI digest; lane 2, HindIII digest; lane 3, EcoRI digest; and lane 4, undigested DNA. In panel B, the same EcoRI digest sample that was used for lane 3 of panel A was electrophoresed for a longer time to better resolve the bands of a doublet.

[Figure 2: autoradiograph with size markers; image not reproduced.]
Nucleotide sequence and primer extension analyses (see below) indicate that the gene for CBP35, including several consensus sequences for 5' regulatory elements, can be entirely mapped onto the λGH clone.

DNA isolated from nuclei of mouse liver was digested with several restriction enzymes, separated by gel electrophoresis and then subjected to Southern blot analysis with the cDNA for CBP35. A single fragment (19-20 kb) was observed after BamHI digestion (Fig. 2A, lane 1). HindIII digestion yielded fragments of 4.2 kb, 3.2 kb, and 1.3 kb (Fig. 2A, lane 2). Although the EcoRI-digested material yielded a broad band in the experiment shown in Figure 2A (lane 3), it actually contained two fragments, which could be resolved into a doublet upon longer electrophoresis of the same material (Fig. 2B). The positions of migration of the bands corresponded to DNA molecules of ~6.4 kb and ~7.0 kb. The lengths of each of these genomic DNA fragments that hybridized with the cDNA probe were in good agreement with those derived from the genomic clones isolated above (Fig. 1). No other bands could be observed. These results indicate that the CBP35 gene is a single gene in the normal mouse genome.

Sequence Analysis of the Genomic Clones

Using a strategy similar to that described for the analysis of the PstI fragment of clone λG3 (highlighted by a hatched rectangle in Fig. 1), nucleotide sequences were determined for parts of the λGH clone (see Fig. 1). Figure 3 reports the nucleotide sequences of the 5' and 3' flanking regions, the exons, and the exon/intron boundaries of the CBP35 gene. The beginning of exon I was identified by carrying out directional deletion on the 2.7 kb EcoRI-HindIII fragment of λGH, cloning each of the resulting fragments, and sequencing approximately the first 100 nucleotides of each of the 90 or so clones obtained (highlighted by dotted line in Fig. 1).
This revealed a sequence CTTCCG, fitting the consensus for an mRNA cap site (rectangle highlighted by a wavy underline in Fig. 3). On the basis of this landmark, the initiation site of transcription was deduced to be an adenine residue (circled and labeled +1 in Fig. 3). The partial sequences of the 2.7 kb EcoRI-HindIII fragment (highlighted by dotted line in Fig. 1) also revealed a 2.3 kb intron (IVS1 in Fig. 3). On the 3' side of this IVS1, exon II starts with the sequence GAAAATGG. This ATG triplet (solid triangle in Fig. 3) falls within the consensus translation initiation sequence A/GNN-ATG-G (28). On the basis of comparing the nucleotide sequence obtained in this study with that of the cDNA (6), the present assignment of methionine to this initiation codon revealed: (a) the cDNA clone coded for the entire CBP35 polypeptide chain except for the NH2-terminal Met residue; and (b) the second amino acid of the polypeptide should be Ala rather than the Arg that was reported previously (6). These conclusions are consistent with the results of the nucleotide and amino acid sequences reported for Mac-2 (16), L-34 (15), and εBP (18).

Figure 3: Nucleotide sequence of the murine CBP35 gene. The numbering system, shown on the left, is based on the transcription initiation site; this is the A residue highlighted by a circle and whose position is labeled +1. The nucleotide sequences of the introns, denoted IVS 1-4, are not shown except for the 5' and 3' ends to highlight the conservation of the donor and acceptor splice site consensus sequences. The sequence CCAATTAAGG, highlighted by a rectangle with a double underline, represents a putative Serum Response Element. The sequence CCAAT, highlighted by a rectangle with a single solid underline, and the sequence AATATATAT, highlighted by a rectangle, represent sequences that fit the consensus sequences and locations of CCAAT and TATA boxes, respectively. The sequence CTTCCG, highlighted by a rectangle with a wavy underline, represents a putative mRNA cap site. The translation initiation codon, ATG, is highlighted by a triangle. The open reading frame extends to a termination codon, TAA, highlighted by an inverted triangle. The sequence AATAAA, highlighted by a rectangle with a dotted underline, represents the polyadenylation signal.

[Figure 3: sequence rows as recovered from the scan. The position numbering and the highlighting of the original figure are not reproduced, and characters that could not be read are left as they appear.]

CATCTCATGAGATGCTGATCTCGTAGCTGAAGTCTGATCTAGATAGATGTGTGTTACAAC
GTGTGCI-TACAACTTACTCGGGTCCAATGACTGTTGTAACCTCCGTTTC
GCCGAATTCCTGTGGATCTGTAGGGGTCTCGCCAGAGGGACAGGAéhCCAGAGGAGAAAT
ACTTCAACCACCAT§§§§§ACGACAGAGGGTTTTcncccccTAGAGAACGACTGTAGACG
+1 TAAGAc-ACCTCTTCAACGAGGTCACCCAGccyrTTGAc-GGAAC
IVS1 GTACCCATACTCTAGGGTCCTCAGGGATGGGGTA[gtaaa ... (2.3 kb) ... cctag]GAAAAC§§CAGACAGCTTTTCG
IVS2 [gtaaa ... (0.5 kb) ... cttag]CTTAACGATGCCTTAGCTGGCTC
TGGAAACCCAAACCCTCAAGGATATCCGGGTGCATGGGGGAACCAGCCTGGGGCAGGGGG
CTACCCAGGGGCTGCCTATCCTGGGGCCTATCCAGGACAGGCTCCTCCAGGGGCCTACCC
AGGACAGGCTCCTCCAGGGGCCTATCCAGGACAGGCTCCTCCTAGTGéCTACCCéGGCCC
AACTGCCCCTGGAGCTTATCCTGGCCCAACTGCCCCTGGAGCTTATCCTGGTCAACCTGC
CCCTGGAGCCTTCCCAGGGCAACCTGGGGCACCTGGGGCCTACCCCCAGTGCTCTGGAGG
CTATCCTGCTGCTGGCCCTTATGGTGTQCCCGCTGGACCACTG éYziig ... (3.1 kb) ... cttag ACGGTGCCCTA‘I‘GACCT
GCCCTTGCCTGGAGGAGTCATGCCCCGCATGCTGATCACAATCATGGGCACAGTCAAACC
CAACGCAAACAGGATTGTTcTAGATTTCAGGAGAGGGAATGAIGTTGCCTTCCACTTTAA
CCCCCGCTTCAATGAGAACAACAGAAGAGTCATTGTGTGTAACACGAAGCAGGACAATAA
CTGGGGAAAGGAAGAAAGACAGTCAGCCTTCCCCTTTGAGAGTGGCAAACCATTCAAA élfgat; ... (1.8 kb) ... cctag AT
ACAAGTCCTGGTTGAACCTGACCACTTCAAGGTTGCGGTCAACGATGCTCACCTACTGCA
GTACAACCATCGGATGAAGAACCTCCGGGAAATCAGCCAACTGGGGATCAGTGGTGACAT
AACCCTCACCAGCGCTAACCACGCCATGATC§§§CCCAGAAGGGGCGGCACCGAAACGCC
CTGTGTGCCTTAGGACTGGGAAACTTGGCATTTCTCTCTCCTTATCCTTCTTGTAAGACA
TCCCCCATTflEZ:E§ZBTCTCATGGGAGAGAGAGCCATGTTTTGGGGGTTTTTATGATAT
GGGTTCAAATTCTTTAGGAC

Given these assignments of the start of exon I and the translation initiation codon in exon II, there must necessarily be an intron in the 5' transcribed but untranslated region of the primary transcript, a somewhat unusual occurrence. This conclusion is supported by preliminary experiments that indicate an oligonucleotide probe synthesized on the basis of a nucleotide sequence in IVS1 failed to hybridize to the 1.3 kb mRNA, previously identified with our cDNA clone (29,30). The sequence results also predict that the 5' untranslated region of the CBP35 mRNA will be 57 nucleotides long. A primer extension experiment was performed to ascertain this length. A 22-mer oligonucleotide complementary to the entire exon II (Fig. 3) was synthesized, labeled with 32P, and was used as the primer for reverse transcription of the mRNA (Fig. 4A). The results showed that the product of the extension reaction contained 72 nucleotides (Fig. 4B), consistent with the notion that the transcription initiation site was located ~54 bases upstream from the initiation ATG codon in the mRNA.

Structural Features of the CBP35 Gene

The CBP35 gene spans ~9 kb of genomic DNA and contains five exons (Fig. 1): (a) exon I, 53 bp; (b) exon II, 22 bp; (c) exon III, 366 bp; (d) exon IV, 255 bp; and (e) exon V, ~323 bp (the exact size of last exon is not known; see below). These exons are interrupted by four introns: (a) IVS1, ~2.3 kb; (b) IVS2, ~0.5 kb; (c) IVS3, ~3.1 kb; and (d) IVS4, ~1.8 kb. The RNA donor and acceptor splice sites are characterized by GTAAA/G ---- CC/TTAG sequences (Fig. 3), conforming to the GT-AG rule (31).
Figure 4: Identification of the transcription initiation site of the murine CBP35 gene by primer extension assay. A) The primer was synthesized to be complementary to the sequence of exon II (Figure 3), including four nucleotides of the 5' untranslated region, the AUG initiation codon, and 15 additional nucleotides of the coding region. B) The product of the primer extension reaction was separated on a 10% polyacrylamide gel, followed by autoradiography. Lane 1, product of the extension reaction using primer synthesized as shown in panel A. Lane 2, product of the extension reaction using tRNA as a control. The four lanes marked G, C, T, and A on the right are size markers from a sequencing reaction. The numbers on the right indicate the length of the nucleotide that would migrate to the position on the gel. The number on the left highlights the length of the nucleotide determined for the primer extended product.

[Figure 4. Panel A: schematic of the mRNA (5' untranslated region, coding region, 3' untranslated region) showing the exon II sequence 5'-GAAAAUGGCAGACAGCUUUUCG-3' and the annealed primer 3'-CTTTTACCGTCTGTCGAAAAGC-5'. Panel B: autoradiograph with lanes 1, 2, G, C, T, A and size positions 72, 70 and 60; image not reproduced.]

The Met initiation codon is located in exon II (residues 58-60, solid triangle, Fig. 3), characterized by the consensus initiation sequence A/GNN-ATG-G. From this initiation codon through the termination codon at positions 851-853 (inverted triangle, Fig. 3), there is a translation open reading frame coding for a polypeptide of 264 amino acids. Thus, the coding sequence accounts for 795 nucleotides. As discussed above, the 5' untranslated region of the mRNA contains 57 nucleotides. In the 3' untranslated region, there is an AATAAA sequence (rectangle highlighted by dotted underline in Fig. 3), characteristic of the consensus polyadenylation signal (32), located 103 nucleotides downstream from the translation termination codon.
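The lengths quoted above can be cross-checked with simple arithmetic. The Python sketch below uses only numbers stated in the text and in the legend to Figure 4 (exon I of 53 bp, four untranslated nucleotides at the start of exon II, a 264-residue open reading frame, a 22-nucleotide primer, and a 72-nucleotide extension product). The variable names are illustrative.

```python
# Cross-check of lengths quoted in the text (variable names are illustrative).

EXON_I_BP = 53             # exon I
UTR_NT_IN_EXON_II = 4      # GAAA preceding the ATG in exon II
ORF_RESIDUES = 264         # amino acids encoded by the open reading frame
PRIMER_NT = 22             # primer length
PRIMER_CODING_NT = 3 + 15  # ATG plus 15 coding nucleotides covered by the primer
PRODUCT_NT = 72            # primer extension product

# 5' untranslated region predicted from the genomic sequence.
utr5_from_sequence = EXON_I_BP + UTR_NT_IN_EXON_II

# First nucleotide of the ATG, counting the transcription start as +1.
atg_first_position = utr5_from_sequence + 1

# Coding sequence length, counting the termination codon.
coding_nt = (ORF_RESIDUES + 1) * 3

# Distance from the 5' end of the extension product to the ATG.
utr5_from_extension = PRODUCT_NT - PRIMER_CODING_NT

if __name__ == "__main__":
    print("primer accounted for:", UTR_NT_IN_EXON_II + PRIMER_CODING_NT == PRIMER_NT)
    print("5' UTR from sequence:", utr5_from_sequence)
    print("ATG begins at position:", atg_first_position)
    print("coding nucleotides:", coding_nt)
    print("5' UTR from primer extension:", utr5_from_extension)
```

On these numbers the genomic sequence gives a 57-nucleotide 5' untranslated region, while the 72-nucleotide extension product corresponds to about 54 nucleotides upstream of the ATG.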
Although we have an additional 64 nucleotides of sequence information, in terms of the genomic sequence, we have not determined the precise site of polyadenylation. Assuming the typical distance between the polyadenylation signal and the site of poly A addition to be ~30 nucleotides, the size of exon V is estimated to be 290-323 bp.

As indicated previously, the 5' end of the CBP35 gene was an adenine residue (circled and labeled +1 in Fig. 3), deduced to be the initiation of transcription site on the basis of a consensus sequence for mRNA capping, CTTCCG, located six residues downstream (rectangle highlighted by a wavy underline in Fig. 3). At -34 through -26, a TATA box-like sequence, AATATATAT (rectangle in Fig. 3), was found. This agreed well with the customary location of such a promoter sequence (31,33). Approximately 80 nucleotides upstream from the adenine residue described above, the sequence CCAAT (-86 to -82) was found (rectangle highlighted by a single solid underline in Fig. 3).

In previous studies, we had demonstrated that the expression of the gene for CBP35 was stimulated upon addition of serum to serum-starved, quiescent cultures of 3T3 fibroblasts (29,30). This stimulation resulted in an increase in the nuclear transcription of the CBP35 gene and in the accumulation of its mRNA, early in the activation process. Moreover, both of these increases were not dependent on de novo protein synthesis, inasmuch as they occurred even in the presence of cycloheximide. From studies on the genomic sequences of several serum-stimulated genes, a regulatory element designated as Serum Responsive Element (SRE) had been identified (34,35). The SRE is characterized by the consensus sequence, C-C-A-A/T-A-T-A/T-A/T-G-G, to which certain transcription factors bind.
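A consensus of this kind can be compared with a candidate sequence position by position. The Python sketch below encodes the consensus as written above and scores the putative element CCAATTAAGG named in the legend to Figure 3. The helper `match_positions` and the scoring scheme (a simple count of agreeing positions) are assumptions made for illustration, not the search procedure used in this study.

```python
# Illustrative position-by-position comparison of a candidate sequence with
# the SRE consensus as written in the text. Each consensus position lists the
# bases it allows; "AT" stands for A/T.

SRE_CONSENSUS = ["C", "C", "A", "AT", "A", "T", "AT", "AT", "G", "G"]


def match_positions(consensus, candidate):
    """Return one boolean per position: True where the candidate base is allowed."""
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus must have the same length")
    return [base in allowed for allowed, base in zip(consensus, candidate.upper())]


CANDIDATE = "CCAATTAAGG"  # putative SRE from the legend to Figure 3

if __name__ == "__main__":
    flags = match_positions(SRE_CONSENSUS, CANDIDATE)
    for position, (base, ok) in enumerate(zip(CANDIDATE, flags), start=1):
        print(position, base, "match" if ok else "mismatch")
    print(sum(flags), "of", len(flags), "positions agree")
```

Representing each position as a string of allowed bases keeps degenerate positions such as A/T explicit without introducing a pattern-matching library.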
A search for a possible SRE in the genomic sequence of CBP35 derived from the present study resulted in a candidate; this sequence, CCAATTAAGG, is located at positions -213 through -204 (rectangle highlighted by a double underline in Fig. 3).

Comparisons of the Sequences of CBP35 with L-34, Mac-2, and εBP

We had previously shown (19) that the amino acid sequences of CBP35 and εBP were identical in 223 out of the 262 positions compared (85% identity). Analyses of antibody cross-reactivity and demonstration of carbohydrate-binding activity in εBP established that CBP35 and εBP were the mouse and rat homologues, respectively. Since that report, the sequences of two other mouse proteins have been published, L-34 (15) and Mac-2 (16). The nucleotide sequences of CBP35 and L-34 are compared in Figure 5, starting at the 5'