This is to certify that the thesis entitled NMR Studies of Human Annexin I and Yeast Guanylate Kinase presented by Jinhai Gao has been accepted towards fulfillment of the requirements for the M.S. degree in Biochemistry.

NMR STUDIES OF HUMAN ANNEXIN I AND YEAST GUANYLATE KINASE

By

Jinhai Gao

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Department of Biochemistry

1999

ABSTRACT

Annexins are excellent models for studying the folding mechanisms of multidomain proteins because they have 4-8 domains with high similarity in fold but low identity in sequence. The solution structure of an isolated domain 1 of human annexin I has been determined by NMR spectroscopy. The root-mean-square deviation of the ensemble of 20 refined conformers was 0.57 ± 0.14 Å for the backbone atoms. The NMR structure of domain 1 could be superimposed with an RMSD of 1.36 Å for all backbone atoms on the corresponding part of the crystal structure of a truncated human annexin I containing all four domains. The result suggests that isolated domain 1 constitutes an autonomous folding unit and that interdomain interactions may play critical roles in the folding of annexin I.
A sequential working model was proposed for the folding of annexin I.

Guanylate kinase (GK) is a suitable model enzyme for NMR studies of the structural and dynamic properties of nucleoside monophosphate kinases. A series of 2D and 3D NMR data have been collected for the free and GMP-bound forms of GK. Sequential backbone resonance assignments for the GK complex with GMP have been made. The results obtained in this work provide the basis for NMR studies of the structure-function relationships of GK. Proposals for further efforts towards elucidating the dynamic and structural changes that control kinase catalysis are also discussed.

To my parents

ACKNOWLEDGEMENTS

First of all, I would like to express my appreciation to my thesis advisor, Dr. Honggao Yan, for his guidance and support for my graduate study at MSU. Thanks are also due to my guidance committee members, Dr. Zachary Burton and Dr. Michael Garavito. I would like to thank Dr. Yue Li for establishing the expression system for human annexin I and yeast guanylate kinase. Thanks also go to other lab members: Dr. Yanling Zhang for her helpful discussion at the beginning of this project, Dr. Lincong Wang for his critical but insightful suggestions during my research, and Dr. Genbin Shi for preparing the inhibitor GP5A. I thank Mr. Kermit Johnson, Dr. George Gray, and Dr. Ouwen Zhang for their help in NMR experiments. I thank Drs. Charles Cottrell, Clemens Anklin and George Gray for assistance in acquiring the NMR data for annexin I. I thank Drs. Xiangwei Weng and Sung-Hou Kim for providing us with the unpublished coordinates of the refined crystal structure of annexin I. I thank Dr. Zhiheng Huang for collecting mass spectrometry data for isotope-labeled GK samples. Finally, I am grateful to my many friends and classmates whose friendship helped me through my graduate life at MSU.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
INTRODUCTION
CHAPTER 1 NMR SOLUTION STRUCTURE OF DOMAIN 1 OF HUMAN ANNEXIN I
1.1 Introduction
1.2 Materials and Methods
1.3 Results
1.4 Discussion
References
CHAPTER 2 SEQUENTIAL BACKBONE RESONANCE ASSIGNMENTS OF YEAST GUANYLATE KINASE IN COMPLEX WITH GMP
2.1 Introduction
2.2 Materials and Methods
2.3 Results
2.4 Discussion
2.5 Summary and Perspective
References
APPENDICES

LIST OF TABLES

CHAPTER 1
Table 1.1 15N, 13C and 1H resonance assignments for domain 1 of human annexin I
Table 1.2 Statistics of NMR solution structures of domain 1 of human annexin I
CHAPTER 2
Table 2.1 15N, 13C and 1H backbone resonance assignments for yeast guanylate kinase in complex with GMP

LIST OF FIGURES

CHAPTER 1
Fig. 1.1 (a) Ribbon diagram of the X-ray structure of a truncated human annexin I that lacks the N-terminal 31 residues. (b) Superposition of the four domains of annexin I. (c) Sequence alignment of the four domains.
Fig. 1.2 Strip plot of 3D HNCACB and 3D CBCA(CO)NH spectra of domain 1 of annexin I.
Fig. 1.3 Strip plot of 15N-edited NOESY-HSQC spectrum of domain 1 of annexin I.
Fig. 1.4 1H-15N HSQC spectrum of domain 1 of human annexin I.
Fig. 1.5 Summary of sequential and short-range NOEs and chemical shift indices for Hα and Cα observed for domain 1 of annexin I.
Fig. 1.6 (a) Superposition of the final 20 calculated NMR structures of domain 1 of annexin I. (b) Superposition of the minimized average NMR structure and the X-ray structure of domain 1.
Fig. 1.7 Distributions of the average backbone RMSDs of the ensemble of the NMR structures from its mean coordinate and from the X-ray crystal structure.
Fig. 1.8 The hydrophobic core structure of domain 2 and the interface between domain 2 and domain 4 of annexin I.
Fig. 1.9 A working model for the folding process of annexin I.
CHAPTER 2
Fig.
2.1 The crystal structure of yeast guanylate kinase in complex with GMP.
Fig. 2.2 Domain movements correlated with substrate binding to adenylate kinase.
Fig. 2.3 Strip plot of 3D HNCACB (A) and 3D CBCA(CO)NH (B) spectra of yeast GK in complex with GMP.
Fig. 2.4 Strip plot of 15N-edited NOESY-HSQC spectrum of yeast GK in complex with GMP.
Fig. 2.5 1H-15N HSQC spectrum of yeast GK in complex with GMP.

LIST OF ABBREVIATIONS

2D: two-dimensional
3D: three-dimensional
DQF-COSY: double quantum filtered correlation spectroscopy
HSQC: heteronuclear single quantum coherence
IPTG: isopropyl-1-thio-β-D-galactopyranoside
NMR: nuclear magnetic resonance
NOESY: nuclear Overhauser effect spectroscopy
PAGE: polyacrylamide gel electrophoresis
RMSD: root-mean-square deviation
SDS: sodium dodecyl sulfate
TOCSY: total correlation spectroscopy
HCCH-TOCSY: proton-carbon-proton correlation using carbon total correlated spectrum
HNCA: amide proton to nitrogen to α-carbon correlation
HNCO: amide proton to nitrogen to carbonyl carbon correlation
HNCACB: amide proton to nitrogen to α/β-carbon correlation
CBCA(CO)NH: α/β proton to α/β carbon (via carbonyl carbon) to nitrogen to amide proton correlation
(HB)CBCACO(CA)HA: α/β carbons to α/β carbonyl carbon (via α-carbon) to α proton correlation
GK: guanylate kinase
AK: adenylate kinase
UK: uridylate kinase
GP5A: P1-(5'-adenosyl) P5-(5'-guanosyl) pentaphosphate
AP5A: P1,P5-(5'-diadenosyl) pentaphosphate
UP5A: P1-(5'-adenosyl) P5-(5'-uridyl) pentaphosphate

INTRODUCTION

The development of modern molecular biology and multidimensional nuclear magnetic resonance (NMR) spectroscopy has greatly increased the use of NMR spectroscopy for studying the structure-function relationships of biological molecules. NMR spectroscopy and X-ray crystallography are complementary methods for studying biomolecular structure and dynamics.
While X-ray crystallography is more productive and can be applied to very large biomolecules, NMR data can be interpreted in terms of dynamic models in solution. The most important dynamics for biological function are those with time constants on the order of nanoseconds to seconds, and it is in this time range that NMR relaxation measurement is most powerful. Although it is still not easy to interpret relaxation data with full confidence, NMR relaxation experiments have been applied successfully in a number of dynamic studies of proteins (1).

In this thesis, NMR spectroscopy has been applied to study two proteins, human annexin I and yeast guanylate kinase. With their well-defined domains and symmetric structure, annexins are excellent models for studying the folding mechanisms of multidomain proteins (2). Our approach to dissecting the folding mechanism of annexin I is to compare the folding properties of the intact protein and the four isolated domains. The results showed that domain 1 of human annexin I constitutes an autonomous folding unit and that interdomain interactions may play critical roles in the folding of annexins. The results allowed us to propose a possible scenario for the folding process of annexin I.

Guanylate kinase (GK) belongs to the family of nucleoside monophosphate (NMP) kinases. GK is required for the metabolic activation of the anti-herpes drugs acyclovir and ganciclovir and the anti-HIV agent carbovir (3-5). Thus it is of biomedical significance to study the catalytic mechanism of this enzyme. It has been suggested that substrate-induced domain closure and dynamic relocation are important for the catalysis of NMP kinases (6,7). Guanylate kinase is an excellent model enzyme for studying these conformational and dynamic changes by NMR spectroscopy. This thesis presents the multinuclear multidimensional NMR experiments and sequential backbone assignments of the GK complex with GMP.
The results provide the basis for further NMR studies of the structure-function relationship of this enzyme.

References

1. Wagner, G. (1997) Nature Struct. Biol. (NMR supplement) 4, 841-844.
2. Liemann, S., and Huber, R. (1997) Cell. Mol. Life Sci. 53, 516-521.
3. Miller, W. H., and Miller, R. L. (1980) J. Biol. Chem. 255, 7204-7207.
4. Boehme, R. E. (1984) J. Biol. Chem. 259, 12346-12349.
5. Miller, M. H., Daluge, S. M., Garvey, E. P., Hopkins, S., Reardon, J. E., Boyd, F. L., and Miller, R. L. (1992) J. Biol. Chem. 267, 21220-21224.
6. Vonrhein, C., Schlauderer, G. L., and Schulz, G. E. (1995) Structure 3, 483-490.
7. Müller, C., Schlauderer, G. L., Reinstein, J., and Schulz, G. E. (1996) Structure 4, 147-156.

CHAPTER 1

NMR SOLUTION STRUCTURE OF DOMAIN 1 OF HUMAN ANNEXIN I

1.1 Introduction

Most proteins in nature are large multidomain proteins (1). While a great deal of knowledge on the folding properties of small single-domain proteins has been acquired (2), our understanding of the folding of multidomain proteins is still poor. To date, the folding mechanisms of only a few multidomain proteins have been studied. It has been suggested that the domains of large proteins fold independently and subsequently assemble to form the native structures (3-5).

Annexins are a large family of ubiquitous proteins that bind to phospholipids in the presence of calcium ions (6,7). Although their physiological functions are not clear, these proteins are implicated in many important cellular processes (8) such as exocytosis (9,10) and ion channeling (11). All annexins contain four homologous repeats of ~70 residues (Fig. 1.1a and 1.1c) and a variable N-terminus, with the exception of annexin VI, which has four additional repeats. The crystal structures of annexins I, II, III, IV, V, VI, VII and XII have been determined (12). As revealed by X-ray crystallography, each repeat forms a compact domain consisting of five helix segments, named A to E, organized in a typical super-helix topology.
All the domains are highly similar in structure, as illustrated in Fig. 1.1b with the four domains of annexin I. The four domains of each annexin are arranged in a planar-cyclic manner with domain 4 in contact with domain 1, as depicted in Fig. 1.1a. Domains 1 and 4 as well as domains 2 and 3 have many tight hydrophobic contacts, mainly involving helices B and E, constituting two two-domain modules. The interactions between these two modules are mostly hydrophilic via helices A and B of domains 2 and 4, forming a central hydrophilic channel.

Figure 1.1 (a) Ribbon diagram of the X-ray structure of a truncated human annexin I that lacks the N-terminal 31 residues (18). The four homologous domains are indicated in different colors: domain 1, green; domain 2, yellow; domain 3, cyan; and domain 4, magenta. Except for domain 1, only the helices involved in the interdomain interactions are labeled. (b) Superposition of the four domains of annexin I: domain 1 (17-86), domain 2 (87-158), domain 3 (169-246) and domain 4 (247-319). Domains 1 to 4 are colored as in (a). Only the helices of each domain were used for the structural alignment. (c) Sequence alignment of the four domains. The numbering is according to the crystal structure of the truncated annexin I (18). The hydrophobic core residues are shown in yellow, and other conserved residues in blue. Fig. 1.1a and 1.1b were generated using the program MOLMOL (43).

With their well-defined domains and simple and elegant structure, annexins are excellent models for studying the folding mechanisms of multidomain proteins. They are composed of four domains with almost identical topologies but only limited sequence homology of approximately 30%. Using synthetic peptides and, more recently, recombinant peptides, Samson and collaborators have systematically studied the folding properties of domain 2 of human annexin I (13-16).
They have clearly shown, with CD and NMR, that isolated domain 2 of annexin I is largely unfolded in aqueous solution (15). A preliminary study on the folding properties of domain 1 has also been reported (17).

Our approach to dissecting the folding mechanism of annexin I is to compare the folding properties of the intact protein and the four isolated domains. We have expressed the entire annexin I and the four individual domains in Escherichia coli. Expression of the separate domains 3 and 4 in Escherichia coli resulted in inclusion bodies. Domain 2 was found to be largely unfolded in solution, although it contains a significant amount of secondary structure. Using multidimensional NMR techniques, we have determined the solution structure of domain 1 (residues 14-86, according to the numbering of the crystal structure of an N-terminally truncated human annexin I (18)). The NMR structure of the isolated domain 1 is highly similar to the corresponding part of the crystal structure of a truncated human annexin I containing all four domains (18). The result shows that, in contrast to isolated domain 2, isolated domain 1 constitutes an autonomous folding unit. Comparative structural analysis suggests that interdomain interactions may play critical roles in the folding of annexin I.

1.2 Materials and Methods

Materials. The Escherichia coli clone containing the cDNA encoding human annexin I was purchased from ATCC (ATCC number 65114, deposited by Joel Ernst). The expression vector pET-17b was purchased from Novagen. The DNA sequencing kit was obtained from United States Biochemical. Enzymes for recombinant DNA experiments were purchased from Gibco BRL or New England Biolabs. 15NH4Cl and [13C6] D-glucose were purchased from ISOTEC. Other chemicals were analytical or reagent grade from commercial sources.

Cloning. The amino acid sequence of domain 1 of human annexin I is shown in Fig. 1.1c.
The portion of human annexin I cDNA that encodes domain 1 was cloned into the expression vector pET-17b by PCR and other standard recombinant DNA techniques. The primers used for the PCR cloning were 5'-GGAATTCCATATGACCTTCAATCCATCCTCG-3' (forward) and 5'-CCGGATCCTTATTTTAGCAGAGCTAAAACAAC-3' (reverse). The correct amino acid sequence was verified by double-stranded DNA sequencing of the DNA insert in the expression construct pET-17b-ANX1D1.

Expression and purification. Unlabeled protein was produced by growing the Escherichia coli strain BL21(DE3) containing the expression construct pET-17b-ANX1D1 in LB media in the presence of 100 µg/ml ampicillin at 37 °C without IPTG induction. Uniformly 15N-labeled protein was produced by growing the same expression strain in M9 media with 15NH4Cl as the sole nitrogen source, and uniformly 15N/13C-labeled protein in M9 media with 15NH4Cl and [13C6] D-glucose as the sole nitrogen and carbon sources. Protein production in the M9 media was induced by addition of IPTG to a final concentration of 0.4 mM when the cultures reached an OD600 of ~1.0. The culture was incubated for four more hours after addition of IPTG. The bacterial cells were harvested by centrifugation and suspended in buffer A (40 mM acetate, pH 5.3). The bacterial suspension was sonicated on ice and centrifuged (27,000 g) at 4 °C for 30 min. The supernatant was applied to a CM-cellulose column equilibrated with buffer A. The column was washed with buffer A until the OD280 of the eluent was less than 0.05. The column was eluted with a linear NaCl gradient (0-500 mM in buffer A), and the elution was monitored by OD280 and 15% SDS-PAGE. The fractions containing domain 1 of annexin I were pooled and concentrated in an Amicon ultrafiltration cell using a YM 3 membrane. The protein preparations were >95% pure as judged by SDS-PAGE. Isotopically labeled proteins were further purified on a Sephadex G-50 column.
The protein solutions were dialyzed against double-distilled water, lyophilized and stored at -80 °C.

NMR spectroscopy. NMR samples were prepared by dissolving the lyophilized protein in 20 mM acetate-d3, pH 5.2 (pH meter reading without correction for isotope effects), in H2O/2H2O (9/1) or 2H2O. The protein concentrations of the NMR samples were 2-5 mM. NMR spectra were acquired at 25 °C on a Bruker DMX 600 spectrometer at The Ohio State University, a Bruker DRX 600 spectrometer at Bruker USA, or a Varian INOVA 600 spectrometer at Varian Application Laboratories. Homonuclear 2D spectra recorded were DQF-COSY (D2O) (19,20), TOCSY (D2O) (21-23), and NOESY (H2O) (24,25). Heteronuclear double and triple resonance spectra acquired included 2D 1H-15N HSQC (26,27), 3D 1H-15N TOCSY-HSQC (28), 3D 1H-15N NOESY-HSQC (28,29), HNCACB (30,31), CBCA(CO)NH (31,32), and HCCH-TOCSY (33,34). The acquisition sweep widths and numbers of complex points for these experiments were as follows: 2D DQF-COSY, TOCSY and NOESY, 1H(F1) 7183 Hz, 512, 1H(F2) 7183 Hz, 512; 2D 1H-15N HSQC, 15N(F1) 2500 Hz, 256, 1H(F2) 7000 Hz, 102; 1H-13C HSQC, 13C(F1) 27163 Hz, 512, 1H(F2) 7000 Hz, 962; 15N-edited NOESY-HSQC with a 150 ms mixing time and 15N-edited TOCSY-HSQC with a 47.3 ms mixing time, 1H(F1) 7183 Hz, 256, 15N(F2) 2074 Hz, 64, 1H(F3) 7183 Hz, 1024; 3D HNCACB, 15N(F1) 2200 Hz, 48, 13C(F2) 9000 Hz, 256, 1H(F3) 8000 Hz, 1024; 3D CBCA(CO)NH, 15N(F1) 2310 Hz, 62, 13C(F2) 8000 Hz, 94, 1H(F3) 8000 Hz, 1024; 3D HCCH-TOCSY, 1H(F1) 6238 Hz, 128, 13C(F2) 10000 Hz, 128, 1H(F3) 8000 Hz, 1024. The spectra were processed with the program NMRPipe (35) and analyzed with the program PIPP (36). Briefly, solvent suppression was improved by convolution of the time domain data (37). The data size in each indirectly detected dimension of the 3D data was extended by backward-forward linear prediction (38).
A 45°-shifted sine bell and single zero-filling were generally applied before Fourier transformation in each dimension.

Derivation of structural restraints. Approximate interproton distance restraints were derived from sequentially assigned NOEs. NOE cross peaks between aliphatic protons were picked from the homonuclear 2D NOESY spectrum, and those involving amide protons from the 3D 1H-15N NOESY-HSQC spectrum. The NOE intensities obtained with the program PIPP were converted into approximate interproton distances by normalizing them against the calibrated intensities of NOE peaks between backbone amide protons (dNN) within the identified α-helices. The upper limits of the interproton distances were calibrated according to the equation $V_a = V_b (r_b/r_a)^6$, where $V_a$ and $V_b$ are the NOE intensities and $r_a$ and $r_b$ the corresponding distances. The distance bounds were then set to 1.8-2.7 Å (1.8-2.9 Å for NOE cross peaks involving amide protons), 1.8-3.3 Å (1.8-3.5 Å for NOE cross peaks involving amide protons) and 1.8-5.0 Å, corresponding to strong, medium and weak NOEs, respectively. Pseudoatom corrections were made for non-stereospecifically assigned methylene and methyl resonances (39). An additional 0.5 Å was added to the upper bounds for methyl protons.

Structure calculation. NMR structures were calculated with a hybrid distance geometry-simulated annealing protocol (40) using the program X-PLOR (version 3.1) (41) on an SGI Indigo II workstation. A square-well potential function with a force constant of 50 kcal mol⁻¹ Å⁻² was applied for the distance restraints. The X-PLOR F_repel function was used to simulate van der Waals interactions, with atomic radii set to 0.80 times their CHARMM values (42) and a force constant of 4.0 kcal mol⁻¹ Å⁻⁴. A total of fifty structures were generated using this protocol. The structures were inspected with the programs MOLMOL (43) and QUANTA96 (Molecular Simulations) and analyzed with PROCHECK-NMR (version 3.4.4) (44,45).
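The intensity calibration, the assignment of distance bounds and the square-well restraint term described above can be illustrated with a short sketch. It is not the PIPP or X-PLOR code used in this work. The reference distance of 2.8 Å for dNN(i,i+1) in an α-helix, the rule that a calibrated distance is placed in the tightest class whose upper limit it does not exceed, the quadratic form of the penalty outside the well, and all function names are assumptions made for illustration. Pseudoatom corrections are not modeled.

```python
# Illustrative sketch only: function names, the 2.8 Å reference distance and
# the class-assignment rule are assumptions, not taken from PIPP or X-PLOR.

LOWER_BOUND = 1.8  # Å, lower bound used for all NOE restraints in the text


def calibrated_distance(v_a, v_b, r_b=2.8):
    """Solve V_a = V_b * (r_b / r_a)**6 for r_a (in Å).

    v_a: intensity of the NOE to be calibrated
    v_b: intensity of the reference NOE (e.g. a helical dNN peak)
    r_b: assumed distance of the reference proton pair
    """
    return r_b * (v_b / v_a) ** (1.0 / 6.0)


def distance_bounds(r_a, involves_amide=False, methyl=False):
    """Map a calibrated distance to (lower, upper) bounds in Å."""
    if involves_amide:
        strong, medium, weak = 2.9, 3.5, 5.0
    else:
        strong, medium, weak = 2.7, 3.3, 5.0
    if r_a <= strong:
        upper = strong
    elif r_a <= medium:
        upper = medium
    else:
        upper = weak
    if methyl:
        upper += 0.5  # extra allowance for methyl protons
    return LOWER_BOUND, upper


def square_well_energy(r, lower, upper, k=50.0):
    """Square-well penalty in kcal/mol: zero inside the bounds,
    quadratic in the violation outside them (k in kcal mol^-1 Å^-2)."""
    if r > upper:
        return k * (r - upper) ** 2
    if r < lower:
        return k * (lower - r) ** 2
    return 0.0
```

Because the intensity falls off with the sixth power of the distance, a peak 64 times weaker than the reference corresponds to only twice the reference distance, which is why three broad classes are sufficient in practice.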
An iterative strategy was used for the structure refinement. In each round of structure refinement, newly computed NMR structures were employed to assign more NOE restraints, to correct wrong assignments, and to loosen the NOE distance bounds if spectral overlap was deduced. Then another round of structure refinement was carried out with the modified NMR restraints. All structures converged after several rounds of such refinement. An ensemble of 20 structures was selected according to their best fit to the experimental NMR restraints and the low values of their total energies.

1.3 Results

Sequential backbone resonance assignments. Complete sequential resonance assignments of the isolated domain 1 were achieved by the combined analysis of 2D and 3D NMR data, including 3D HNCACB, CBCA(CO)NH and HCCH-TOCSY. The combination of HNCACB and CBCA(CO)NH provided most of the sequential linkage of domain 1. Figure 1.2 shows the sequential connectivities from His12 to Asp21. In some cases, such as Thr30, Val67, Val68, and Leu71, the triple-resonance spectra were incomplete because Cα and Cβ chemical shifts were missing owing to the low sensitivity of the HNCACB experiment. The sequential connectivities for these residues could be made through sequential NOE analysis of the 15N-edited NOESY-HSQC spectrum. Almost all the HN-Hα correlations could be obtained from the 3D 15N-edited TOCSY-HSQC experiment. The sequential assignments were then made from the Hα-HN and HN-HN NOE connectivities in the 3D 15N-edited NOESY-HSQC experiment. Examples of Hα(i)-HN(i+1) and HN(i)-HN(i+1) connectivities from Arg37 to Thr48 of domain 1 are shown in Figure 1.3. The assignments obtained from triple-resonance experiments are in good agreement with the sequential NOE analysis. The sequential assignments of the backbone and side-chain amide resonances are shown in a 15N-1H HSQC spectrum in Fig. 1.4.

Side-chain resonance assignments.
Most Hα, as well as some Hβ and Hγ, resonances were assigned in the 15N-edited TOCSY-HSQC spectrum. Extensions of assignments further along the side chain were made by the use of a 3D HCCH-TOCSY experiment.

Fig. 1.2 Strip plot of 3D HNCACB (A) and 3D CBCA(CO)NH (B) spectra of domain 1 of annexin I, showing the sequential J connectivities of Cα and Cβ for the residues His-12 to Asp-21. Distinction between Cα and Cβ resonances is aided by their opposite phases in HNCACB strips.

Fig. 1.3 Strip plot of 15N-edited NOESY-HSQC spectrum of domain 1 of human annexin I, showing characteristic sequential dNN and dαN connectivities of the residues Arg-37 to Thr-48.

Fig. 1.4 15N-1H HSQC spectrum of domain 1 of human annexin I. Sequential assignments are indicated with one-letter amino acid codes and residue numbers. Pairs of cross peaks resulting from the side-chain NH2 groups of asparagine and glutamine residues are connected by horizontal lines. The amino acid numbering is according to the isolated domain 1 with residue 1 corresponding to residue 14 in the crystal structure numbering.

The 3D HCCH-TOCSY experiment yielded sequence-specific assignments of side chain proton resonances and their attached 13C resonances for nearly all the aliphatic
residues except Leu71, which was assigned by 3D 15N-edited TOCSY-HSQC and 2D TOCSY experiments. Phenylalanine and tyrosine spin systems were assigned using 2D TOCSY and 2D DQF-COSY in D2O. Aromatic side chain protons were then matched with the sequentially assigned Phe3 and Tyr42 residues by the observation of NOEs between the ring protons and Hβ protons. The two His residues, His12 and His63, were assigned using a combination of 2D TOCSY and NOESY spectra. Side chain amide resonances from three asparagine and four glutamine residues were assigned from the 15N-edited NOESY-HSQC experiment, where NOEs from the amide to side chain protons were found. Stereospecific assignments were made for about 70% of the β-methylene protons and the methyl groups of valine and leucine residues based on qualitative estimations of 3Jαβ coupling constants from the DQF-COSY spectrum in conjunction with the NOE data (46). The complete 1H, 15N and 13C assignments for domain 1 are listed in Table 1.1.

Secondary structure determination. The secondary structures were deduced from the characteristic NOE patterns and chemical shift indices. Figure 1.5 summarizes the sequential and medium-range NOEs and the Hα and Cα secondary shifts for domain 1. As expected, many residues in domain 1 are found to possess features that are characteristic of an α-helix: positive Cα secondary shifts, negative Hα secondary shifts and strong dNN(i,i+1), dαN(i,i+3) and dαβ(i,i+3) NOE connectivities. Five helices were identified in domain 1: helix A (residues 5-15), helix B (22-30), helix C (34-47), helix D (52-58) and helix E (63-70).

Table 1.1 15N, 13C and 1H resonance assignments for domain 1 of human annexin I.
Residue "N(HN) ”C“ (Ha) 130(H'1) Others T1 119.40.11) S9.9(4.33) 67.6(4.07) 019.50.19) F2 127.5(8.66) 55.3(4.63) 37.7(3.22,2.88) 0130.9(7.25);0130.4035); 0 127.80.36) N3 128.5(8.50) 46.9(4.79) 37.6(2.85,2.52) NH27.53,6.96 P4 61.20.95) 30.10.88) 0 24.2(2.91,1.92); 0 48.3(3.74) 85 115.80.91) 60.0(4.00) 60.8(3.80) S6 120.10.96) 59.4(421) 60.9(3.88) D7 125.10.75) 55.7(4.47) 38.6(2.76,2.37) V8 121.6(8.44) 65.9(3.55) 29.40.18) 02300.10); 19.30.79) A9 122.5(7.75) 52.90.13) 15.80.47) A10 122.3(7.87) 52.90.12) 16.8(1.45) L11 121.1(8.81) 55.90.83) 40.3(2.21,l.16) 024.60.95);022.80.75); 24.40.70) H12 118.3(9.02) 57.1(450) 26.9(3.29,3.19) 0135.1(8.29);0117.50.17) K13 118.7(7.84) 57.40.83) 30.40.90) 022.40.49,1.34); 0 27.00.66); 03980.91) A14 121.40.76) 52.7(4.33) 16.90.62) 115 117.50.89) 61.60.74) 36.80.87) 02760.78); 0M°15.90.74); 012.10.61) M16 ll7.8(7.30) 52.9(4.40) 30.50.04,1.91) 029.50.35) V17 122.90.28) 60.8(3.86) 30.40.02) 01990.06); 18.8(0.89) K18 129.3(8.47) 50.5(400) 29.6(1.76,1.65) 022.60.43); 02680.64); 039.80.082.95) (319 ll7.9(8.67) 43.9(4.13,3.65) V20 11650.07) 63.6(3.56) 28.7(2.75) 020.00.01); 19.70.70) D21 127.7(8.37) 49.5(487) 36.6(2.93,2.60) E22 125.40.92) 57.40.57) 30.00.92.152) 033.50.202.00) A23 121.4(8.31) 53.40.57) 16.30.47) T24 117.8(7.30) 64.50.80) 65.40.68) 019.90.10) 125 120.1(6.81) 63.50.25) 36.30.73) 02650.71); 0Me 14.90.64); 0 12.30.54) 126 117.70.95) 61.40.38) 36.50.63) 026.40.16);0”‘15.60.75); 0 10.50.54) D27 124.9(8.20) 56.1(4.ll) 39.8(2.7S,2.62) 128 116.2(7.66) 62.90.53) 36.60.64) 027.00.83.1.01);0M°13.90.66); 012.00.75) L29 10.20.77) 55.90.74) 40.00.721.12) 02430.94);01950.62); 24.60.62) T30 105.3095) 60.2(4.04) 66.6(4.22) 020.10.15) K31 122.9033) 53.4(4.36) 30.5(1.91,1.81) 023.40.46.134); 0 27.10.57); 039.7085) R32 120.8(6.96) 50.5(4.60) 28.9(1.66,l.48) 022.70.32);04130.95) N33 119.8(8.09) 49.1(4.62) 36.3(3.23,2.75) NH27.30,6.73 N34 ll7.5(8.52) 56.1(4.13) 36.8(2.87,2.73) NH27.66,7.15 A35 123.8(8.18) 53.40.01) 15.70.37) 21 Table 1.1 
(Continued) Residue 151901“) ”C“ (H“) '30 (HB) Others 036 119.60.44) 56.20.80) 26.90.03) 030.6(2.29,095);NH2 6.57,6.56 R37 120.40.91) 59.10.76) 28.50.27) 022.80.47); 041.60.522.91) Q38 11880.16) 56.5(4.06) 25.00.22) 030.50.672.47); NH2 7.29,6.83 Q39 121.50.79) 57.30.24) 27.30.572.20) 032.80.652.48); NH2 739,683 140 123.40.78) 64.10.56) 36.00.94) 028.10.21,1.13); 0“” 14.90.73); 01230.92) K41 12090.42) 58.9(3.77) 30.50.021.97) 023.60.65);028.10.55); 038.20.98) A42 123.8(7.71) 53.10.20) 16.00.52) A43 123.00.24) 52.40.22) 16.50.43) Y44 03.00.20) 60.50.80) 37.6(3.08) 0117.40.53);0131.90.04) L45 123.3(7.48) 55.30.24) 39.80.921.69) 0244.70.39);023.40.89); 20.10.56) Q46 12010.66) 56.60.87) 26.50.11,197) 031.2(2.36);NH27.42,6.75 E47 11680.28) 56.50.06) 27.9(1.87,1.78) 033.10.222.08) T48 107.40.90) 60.00.26) 68.80.86) 01530.19) G49 112.6091) 43.30.203.75) K50 12280.15) 49.9(4.87) 32.6(1.70,l.60) 022.3031);024.6039); 039.90.092.94) 1>51 59.40.67) 30.50.50) 0 26.3(2.11,2.02); 0 48.4(3.82,3.53) L52 09.10.93) 55.20.48) 39.20.521.04) 024.00.14);024.10.51); 18.2(-0.11) D53 118.6088) 54.90.73) 35.90.572.51) E54 121.20.96) 56.80.83) 28.10.001.85) 033.40.302.18) T55 116.00.55) 64.70.85) 66.40.21) 01940.25) L56 123.00.24) 55.80.93) 38.9(1.75,1.09) 02430.64); 020.9061); 24.3(0.48) K57 12040.02) 57.00.01) 30.60.81) 022.40.33);02740.54); 039.70.74) K58 115.0(692) 54.50.28) 30.8(1.91,1.81) 022.80.521.42);026.80.65); 039.80.95) A59 121.60.52) 51.20.33) 18.8(1.38) L60 11690.68) 51.00.68) 41.6(l.56,l.23) 024.50.70);020.20.79); 23.40.70) T61 110.90.28) 57.9(4.64) 69.90.12) 01840.08) G62 11000.49) 43.60.003.81) H63 122.1086) 57.50.54) 26.6(3.24,3.06) 0134.80.52);0118.40.35) L64 12070.62) 55.60.93) 39.30.71) 024.80.53);022.30.85); 22.20.80) E65 119.4(6.99) 57.8(3.43) 24.50.33) 030.30.412.01) E66 118.0(7.39) 57.30.74) 27.40.101.97) 033.30.372.22) V67 117.3091) 63.60.63) 29.30.94) 022.80.76); 19.90.87) V68 12180.14) 64.70.36) 28.70.87) 022.20.79); 20.10.60) L69 11980.09) 55.30.83) 37.2(1.78,1.39) 
02570.78);022.70.80); 20.1(0.73)
Table 1.1 (Continued)
Residue 15N(HN) 13Cα (Hα) 13Cβ (Hβ) Others
A70 122.30.36) 52.70.11) 15.90.46)
L71 12300.66) 51.90.07) 39.50.50) 02490.19)
L72 117.00.24) 52.20.19) 40.6(1.74,1.56) 024.00.72);02050.62); 24.00.59)
K73 12780.00) 57.00.90) 31.00.75) 022.60.43); C526.9(1.64,l.50); 039.90.082.91)

Fig. 1.5 Summary of sequential and short-range NOEs and chemical shift index for Hα and Cα observed for domain 1 of annexin I. The derived helices are shown at the bottom.

Solution structure calculation. A total of 1099 structurally useful distance restraints were obtained from the analyses of the homonuclear 2D NOESY (D2O) and 3D 1H-15N NOESY-HSQC spectra (Table 1.2), 707 of which were medium- and long-range NOEs. On average, each residue had ~15 NOE restraints. A superposition of 20 calculated structures with no NOE restraint violations above 0.5 Å is shown in Fig. 1.6a. The structural statistics are summarized in Table 1.2. The precision of the structures (RMSD of the ensemble of the 20 NMR structures from its mean coordinate) was 0.57 Å for the backbone atoms (N, Cα, C', O) and 1.11 Å for all heavy atoms. The distribution of the average backbone RMSDs is shown in Fig. 1.7a.
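The precision quoted above is the average RMSD of the individual conformers from the mean coordinates of the ensemble. A minimal sketch of that calculation follows, assuming the conformers have already been superimposed on a common frame (the superposition itself is not shown) and that the spread is reported as a sample standard deviation. The function names are hypothetical and the sketch is not the analysis software used in this work.

```python
import math

# Illustrative sketch only. Each model is a list of (x, y, z) tuples in Å for
# the same atoms in the same order, already superimposed (an assumption).


def mean_structure(ensemble):
    """Average the coordinates atom by atom over all models."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(model[i][k] for model in ensemble) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    squared = sum(
        (a[k] - b[k]) ** 2 for a, b in zip(coords_a, coords_b) for k in range(3)
    )
    return math.sqrt(squared / len(coords_a))


def ensemble_precision(ensemble):
    """Return (mean, sample standard deviation) of the RMSDs of each model
    from the mean structure."""
    mean = mean_structure(ensemble)
    values = [rmsd(model, mean) for model in ensemble]
    average = sum(values) / len(values)
    variance = sum((v - average) ** 2 for v in values) / (len(values) - 1)
    return average, math.sqrt(variance)
```

Restricting the atom lists to N, Cα, C' and O gives a backbone value, while passing all non-hydrogen atoms gives a heavy-atom value. The same `rmsd` function applied to one model and a reference structure gives a pairwise comparison once the two have been superimposed.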
The structure of domain 1 consists of five helices: helix A, residues 5-15; helix B, residues 22-30; helix C, residues 34-47; helix D, residues 52-58; and helix E, residues 63-70 (numbering according to the isolated domain 1). Helices A, B, D and E are assembled in a bundle with two nearly parallel helix-loop-helix motifs. Helix C lies approximately perpendicular to the helical bundle, with one end close to the N-terminus and the other close to the C-terminus of domain 1. The ensemble of NMR structures and the constraints have been deposited at the Protein Data Bank (http://www.pdb.bnl.gov) under PDB code 1bo9.

Table 1.2 Statistics of NMR solution structures of domain 1 of human annexin I.

Restraints for structure calculations
  Total NOE restraints                      1099
  Intraresidue                              392
  Medium range (1 ≤ |i-j| ≤ 4)              549
  Long range (|i-j| > 4)                    158

Statistics for structure calculations 1     {SA}               <SA>r
  NOE violations (>0.5 Å)                   0                  0
  R.m.s.d. from distance restraints (Å)     0.026 ± 0.001      0.027
  R.m.s.d. from idealized geometry
    Bonds (Å)                               0.0034 ± 0.0001    0.0033
    Angles (°)                              0.56 ± 0.01        0.54
    Impropers (°)                           0.39 ± 0.02        0.36
  X-PLOR potential energies (kcal/mol) 2
    Etotal                                  220.3 ± 11.9       213.5
    Em                                      38.8 ± 3.4         41.3
    Em.                                     53.7 ± 4.7         52.1
    Eimpr                                   12.8 ± 1.6         11.3
  Ramachandran plot statistics 3
    Residues in most favored regions        78%                81.8%
    Residues in additionally allowed regions 17.4%             15.2%
    Residues in generously allowed regions  3.9%               3%
    Residues in disallowed regions          0.7%               0%

R.m.s.d. of atomic coordinates (Å)          backbone           heavy atoms
  {SA} vs. <SA>
    All residues                            0.57 ± 0.14        1.11 ± 0.19
    Helices only                            0.47 ± 0.18        1.02 ± 0.23
  {SA} vs. X-ray
    All residues                            1.36 ± 0.11        2.12 ± 0.13
    Helices only                            1.01 ± 0.13        1.82 ± 0.16

1 {SA} is the ensemble of 20 NMR solution structures of domain 1. <SA> is the mean atomic structure obtained by averaging the individual structures following a superimposition of the backbone heavy atoms. <SA>r is the energy-minimized average structure.
2 The distance constraints were used with a square-well potential (F_NOE = 50 kcal mol⁻¹ Å⁻²).
The F_repel function was used to simulate van der Waals interactions with a force constant of 4.0 kcal mol⁻¹ Å⁻⁴ and atomic radii set to 0.8 times their CHARMM values.
3 The Ramachandran plot statistics were obtained from the PROCHECK-NMR analysis.

Fig. 1.6 (a) Superposition of the final 20 calculated NMR structures of domain 1 of annexin I. Only the backbone atoms (N, Cα and C′) are superimposed and colored according to the secondary structure: helices A (5-15) in red, B (22-30) in green, C (34-47) in cyan, D (52-58) in magenta and E (63-70) in yellow, and the loops in gray. The amino acid numbering is according to the isolated domain 1, with residue 1 corresponding to residue 14 in the crystal structure numbering. (b) Superposition of the minimized average NMR structure (red) and the X-ray structure (cyan) of domain 1.

Fig. 1.7 Distributions of the average backbone RMSDs of the ensemble of the NMR structures from its mean coordinates (a, top) and from the X-ray crystal structure (b, bottom). The amino acid numbering is according to the isolated domain 1, with residue 1 corresponding to residue 14 in the crystal structure numbering.

[Fig. 1.7 panels: (a) NMR ensemble and (b) NMR vs. X-ray, each plotted against residue number.]

1.4 Discussion

Comparison with the crystal structure of human annexin I. The structure of a truncated human annexin I has been determined by X-ray crystallography in the presence of 10 mM CaCl2 (18). The truncated annexin I lacks the N-terminal 32 residues but has all four domains intact (Fig. 1.1a). Six calcium ions are bound to the truncated annexin I, two each in domains 1 and 4 and one each in domains 2 and 3. The solution structure of the isolated domain 1 is highly similar to the corresponding part of the crystal structure of the truncated annexin I containing all four domains.
Thus, the minimized average NMR structure of the isolated domain 1 can be superimposed very well on the corresponding X-ray structure, as shown in Fig. 1.6b. The lengths of some helices differ by 1-2 residues, but the five helices are assembled in the same way. The distribution of the average backbone RMSDs of the ensemble of the 20 NMR structures from the corresponding X-ray structure is shown in Fig. 1.7b. The largest differences are found at the N-terminus and in the AB loop. It should be noted that the NMR structure of the isolated domain 1 was determined in the absence of Ca2+. The difference in the conformation of the AB loop could be due to the binding of Ca2+, because the carbonyls of Gly-32 and Val-33 in the AB loop, along with the carboxylate of Glu-35 at the N-terminus of helix B, form a calcium-binding site. However, binding of Ca2+ to the second calcium-binding site apparently does not cause any significant conformational change, because the conformation of the DE loop that constitutes the second site is essentially the same as that found in the crystal structure, probably because the second site has a lower affinity for Ca2+ than the first site.

Implications for protein folding. As described earlier, the four domains of annexin I are highly homologous in structure when folded together (Fig. 1.1a and b). The hydrophobic cores are highly conserved among all annexin domains. Surprisingly, isolated domain 2 is largely unfolded in aqueous solution and thus is not an independent folding unit (15). Its helical content is less than 25%, compared to ~80% when the domain is folded together with the rest of the protein. In contrast to domain 2, the work presented here clearly demonstrates that the isolated domain 1 is fully folded in solution with little change in structure from that in the native state, and thus constitutes an autonomous folding unit.
The results raise the interesting question of why domains with high sequence and structural homology exhibit totally different folding behaviors. The failure of the isolated domain 2 to form its native structure is likely due to the removal of the interdomain interactions that exist in the whole protein. As mentioned earlier, according to the crystal structure of annexin I (18), domains 2 and 3 form a modular structure with many hydrophobic interactions, and so do domains 1 and 4. Since the isolated domain 1 folds without its hydrophobic contacts with domain 4, it is unlikely that the removal of the hydrophobic contacts with domain 3 is the cause of the folding failure of the isolated domain 2. By default, then, the removal of the interactions with domain 4 may be the cause of the failure of the isolated domain 2 to fold to its native structure. Indeed, there are many interactions between domain 2 and domain 4, as shown in Fig. 1.8. This explanation is supported by the NMR studies of the isolated domain 2 and its component helices A and B (14,15). It has been shown by NMR that a stable nonnative N-terminal cap, with the sequence F91D92A93D94E95L96 (numbering according to the crystal structure of the truncated annexin I), is formed in helix A in a peptide fragment containing helices A and B of domain 2 (14).

Fig. 1.8 The hydrophobic core structure of domain 2 and the interface between domain 2 and domain 4. The drawing is based on the X-ray structure of the truncated human annexin I containing four domains (18). The main chains of domain 2 and domain 4 (partial) are represented by blue and cyan ribbons, respectively. The residues involved in the nonnative cap and the cluster of acidic residues, as well as Arg-117 in domain 2, are shown in magenta. The residues within 5 Å of Leu-96 are shown in yellow, and other core residues in gray. The residues of domain 4 are in green. Hydrogen bonds are indicated by dotted lines. The amino acid numbering is according to the crystal structure of the truncated annexin I.
With the carboxyl groups of Asp-92 and Glu-95 hydrogen-bonded to their reciprocal backbone amides and many hydrophobic contacts between Phe-91 and Leu-96, it is a canonical N-terminal cap (47,48). Furthermore, the nonnative cap persists in isolated domain 2 (15,16). It has been suggested that the nonnative N-terminal cap serves as a very potent initiation site for folding (14). However, although its role in the folding of the entire annexin I is not known, it may be more likely that the formation of the nonnative N-terminal cap prevents the isolated domain 2 from reaching the native state, for two reasons. First, it disrupts a pair of hydrogen bonds between the carboxyl group of Asp-92 and the guanidinium group of Arg117 that helps to lock helices A and B in place (18) (Fig. 1.8). The breakage of these hydrogen bonds also makes it possible for Arg117 to form nonnative salt bridges, as found in the isolated domain 2 (16). Second, as shown in Fig. 1.8, in the native structure Leu96 is roughly at the center of the hydrophobic core. It is surrounded by as many as seven core residues: Met-100 from helix A, Leu110, Ile113 and Ile114 from helix B, Ile125 and Tyr129 from helix C, and Leu137 from helix D. On the other hand, the side chains of Phe91 and Leu96 are >10 Å apart. Thus, the nonnative hydrophobic interactions between Phe91 and Leu96 in the isolated domain may not only take the side chain of Leu96 out of the hydrophobic core but also disrupt the packing of the other hydrophobic core residues. The nonnative conformation of the isolated domain 2, however, may not necessarily have a lower energy than the native conformation. The nonnative N-terminal cap may act as a kinetic trap that keeps the isolated domain 2 from reaching the native structure. Why does the nonnative N-terminal cap form in the isolated domain 2?
The separation of domain 2 from the rest of the protein has two structural consequences that may bear on the formation of the nonnative N-terminal cap, as shown in Fig. 1.8. First, it breaks four hydrogen bonds between domains 2 and 4, namely Glu95/Lys267, Asp108/Lys254 and Glu112/Arg271 (two hydrogen bonds). The salt bridge between Glu107 of domain 2 and Lys235 of domain 3 is also broken. This leaves a cluster of negatively charged residues without positively charged partners, including Glu95, Asp106, Glu107, Asp108, and Glu112. The carboxyl group of Glu95 is ~6.7 Å away from that of Asp106 and ~7.1 Å away from that of Glu-112. It is likely that the negative potential generated by the cluster of acidic residues pushes away the carboxyl group of Glu95 so that it forms a hydrogen bond to the backbone amide of Asp92. Second, Phe91 is almost completely buried in the whole protein, but its side chain becomes mostly exposed to solvent in the isolated domain 2. Thus, Phe91 in the isolated domain 2 seeks hydrophobic partners and finds Leu96. It is noted that Phe91 and Glu95 are replaced by a serine and an alanine, respectively, in domain 1 (Fig. 1.1c). Therefore, the nonnative N-terminal cap is unlikely to form in the folding process of the isolated domain 1. The hypothesis may be tested by replacing Phe91 and Glu95 of domain 2 with the corresponding amino acids of domain 1 by site-directed mutagenesis. Refolding at a higher salt concentration may also help the isolated domain 2 to reach the native conformation by reducing the effects of the negative charges of the cluster of acidic residues and strengthening the hydrophobic interactions that drive formation of the hydrophobic core.

A sequential working model for annexin folding. For multidomain proteins, the formation of a native structure requires not only the correct folding of each domain but also the appropriate assembly of the domains via interdomain interactions.
However, little is known about the roles of interdomain interactions during the folding process. As discussed above, interdomain interactions may play a critical role in the folding of domain 2 of annexin I. It is interesting to note that among the four domains of annexin I, only domain 1 is folded and soluble when expressed in Escherichia coli. Domain 2 is soluble but largely unfolded. Expression of separated domains 3 and 4 in Escherichia coli results in inclusion bodies (data not shown). It has been reported that domain 3 is easily degraded but domain 4 forms inclusion bodies when expressed as fusion proteins of glutathione transferase (17). It appears that only domain 1 is an autonomous folding unit, although it is not known at present whether domains 3 and 4 can be solubilized and refolded to their native structures. As described earlier, annexin I is composed of two modules. One module consists of domains 1 and 4, and the other of domains 2 and 3. Each module has a hydrophobic interface between its constituents. The two modules are assembled with mostly hydrophilic interactions between domains 2 and 4. Several possible scenarios can be proposed for the folding process of this multidomain protein, such as a general model proposed by Fink (49), in which the D2-D3 module constitutes an autonomous folding unit that brings domains 1 and 4 together. Our experimental data do not agree with this model. We therefore propose another model in which the folding of annexin I follows a sequential process with domain 1 as an autonomous initial folding unit. The sequence of events in our proposed working model is depicted in Fig. 1.9. (1) First, domain 1 folds independently, while domains 2 and 3 are kept partly unfolded by local nonnative interactions. As discussed above, the interdomain interactions between domains 2 and 4 are critical for the complete folding of domain 2. We may

Fig. 1.9 A working model for the folding process of annexin I.
In this model the protein folds sequentially in three principal steps. (A) Domain 1 folds first as an autonomous unit. Domains 2 and 3 remain partly unfolded owing to local nonnative interactions, which facilitates the docking of domain 4 to domain 1. (B) In a second step, domain 4 is docked to domain 1 by the hydrophobic interactions (gray bars) between these two domains, which also facilitates the complete folding of domain 4. (C) Finally, the hydrogen bonds and hydrophobic interactions (dashed lines) between domains 4 and 2 help domain 2 to get rid of the nonnative cap and reach the native structure. Domain 2, in turn, assists the folding of domain 3 through many hydrophobic interdomain interactions (gray bars).

reasonably assume that domain 4 must dock to domain 1 in order to establish the hydrophilic interface between domains 2 and 4. The flexibility of the unfolded domains 2 and 3 allows domain 4 to search for domain 1. (2) In a second step, domains 1 and 4 are docked together and the folded structure of domain 1 facilitates the folding of domain 4 through the hydrophobic interface. (3) Finally, the hydrogen bonds and hydrophobic interactions between domains 4 and 2 help domain 2 to get rid of the nonnative cap and reach the native structure. The hydrophilic core is formed between domains 2 and 4. Domain 2, in turn, assists the folding of domain 3 through many hydrophobic interdomain interactions. Our model emphasizes the interdomain interactions in the folding of annexins. This proposal can be tested by systematic studies of the folding properties of the entire protein and of the separated domains of annexin I.

References

1. Srere, P. A. (1984) Trends Biochem. Sci. 9, 387-390
2. Creighton, T. E. (ed) (1992) Protein Folding, W. H. Freeman and Co., New York
3. Jaenicke, R. (1987) Prog. Biophys. Mol. Biol. 49, 117-237
4. Jaenicke, R. (1991) Biochemistry 30, 3147-3161
5. Jaenicke, R. (1996) Curr. Top. Cell. Reg. 34, 209-314
6. Barton, G.
J., Newman, R. H., Freemont, P. S., and Crumpton, M. J. (1991) Eur. J. Biochem. 198, 749-760
7. Morgan, R. R., and Fernandez, M.-P. (1995) Mol. Biol. Evol. 12, 967-979
8. Raynal, P., and Pollard, H. B. (1994) Biochim. Biophys. Acta 1197, 63-93
9. Creutz, C. E. (1992) Science 258, 924-931
10. Donnelly, S. R., and Moss, S. E. (1997) Cell. Mol. Life Sci. 53, 533-538
11. Voges, D., Berendes, R., Demange, P., Benz, J., Gottig, P., Liemann, S., Huber, R., and Burger, A. (1995) Adv. Enzymol. Relat. Areas Mol. Biol. 71, 209-39
12. Liemann, S., and Huber, R. (1997) Cell. Mol. Life Sci. 53, 516-521
13. Macquaire, F., Baleux, F., Huynh Dinh, T., Rouge, D., Neumann, J. M., and Sanson, A. (1993) Biochemistry 32, 7244-54
14. Odaert, B., Baleux, F., Huynh-Dinh, T., Neumann, J. M., and Sanson, A. (1995) Biochemistry 34, 12820-9
15. Cordier-Ochsenbein, F., Guerois, R., Baleux, F., Huynh Dinh, T., Chaffotte, A., Neumann, J. M., and Sanson, A. (1996) Biochemistry 35, 10347-57
16. Cordier-Ochsenbein, F., Guerois, R., Baleux, F., Huynh-Dinh, T., Lirsac, P.-N., Russo-Marie, F., Neumann, J.-M., and Sanson, A. (1998) J. Mol. Biol. 279, 1163-1175
17. Cordier-Ochsenbein, F., Guerois, R., Russo-Marie, F., Neumann, J.-M., and Sanson, A. (1998) J. Mol. Biol. 279, 1177-1185
18. Weng, X., Luecke, H., Song, I. S., Kang, D. S., Kim, S. H., and Huber, R. (1993) Protein Sci. 2, 448-58
19. Piantini, U., Sørensen, O. W., and Ernst, R. R. (1982) J. Am. Chem. Soc. 104, 6800-6801
20. Rance, M., Sørensen, O. W., Bodenhausen, G., Wagner, G., Ernst, R. R., and Wüthrich, K. (1983) Biochem. Biophys. Res. Commun. 117, 479-85
21. Braunschweiler, L., and Ernst, R. R. (1983) J. Magn. Reson. 53, 521-528
22. Bax, A., and Davis, D. G. (1985) J. Magn. Reson. 65, 355-360
23. Griesinger, C., Otting, G., Wüthrich, K., and Ernst, R. R. (1988) J. Am. Chem. Soc. 110, 7870-7872
24. Jeener, J., Meier, B. H., Bachmann, P., and Ernst, R. R. (1979) J. Chem. Phys. 71, 4546-4553
25. Macura, S., and Ernst, R. R. (1980) Mol. Phys. 41, 95-117
26.
Bodenhausen, G., and Ruben, D. J. (1980) Chem. Phys. Lett. 69, 185-188
27. Kay, L. E., Keifer, P., and Saarinen, T. (1992) J. Am. Chem. Soc. 114, 10663-10665
28. Marion, D., Driscoll, P. C., Kay, L. E., Wingfield, P. T., Bax, A., Gronenborn, A. M., and Clore, G. M. (1989) Biochemistry 28, 6150-6
29. Fesik, S. W., and Zuiderweg, E. R. P. (1988) J. Magn. Reson. 78, 588-593
30. Wittekind, M., and Mueller, L. (1993) J. Magn. Reson. Ser. B 101, 201-205
31. Muhandiram, D. R., and Kay, L. E. (1994) J. Magn. Reson. Ser. B 103, 203-216
32. Grzesiek, S., and Bax, A. (1992) J. Am. Chem. Soc. 114, 6291-6293
33. Bax, A., Clore, G. M., and Gronenborn, A. M. (1990) J. Magn. Reson. 88, 425-431
34. Kay, L. E., Xu, G.-Y., Singer, A. U., Muhandiram, D. R., and Forman-Kay, J. D. (1993) J. Magn. Reson. Ser. B 101, 333-337
35. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) J. Biomol. NMR 6, 277-93
36. Garrett, D. S., Powers, R., Gronenborn, A. M., and Clore, G. M. (1991) J. Magn. Reson. 95, 214-220
37. Marion, D., Ikura, M., and Bax, A. (1989) J. Magn. Reson. 84, 425-430
38. Zhu, G., and Bax, A. (1992) J. Magn. Reson. 100, 202-207
39. Wüthrich, K., Billeter, M., and Braun, W. (1983) J. Mol. Biol. 169, 949-61
40. Nilges, M., Gronenborn, A. M., and Clore, G. M. (1988) FEBS Lett. 229, 317-324
41. Brünger, A. T. (1992) X-PLOR Version 3.1: A System for Crystallography and NMR, Yale University Press, New Haven, CT
42. Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., and Karplus, M. (1983) J. Comput. Chem. 4, 187-217
43. Koradi, R., Billeter, M., and Wüthrich, K. (1996) J. Mol. Graph. 14, 51-5, 29-32
44. Laskowski, R. A., MacArthur, M. W., Moss, D. S., and Thornton, J. M. (1993) J. Appl. Crystallogr. 26, 283-291
45. Laskowski, R. A., Rullmann, J. A., MacArthur, M. W., Kaptein, R., and Thornton, J. M. (1997) J. Biomol. NMR 8, 477-86
46. Basus, V. J. (1989) Methods Enzymol. 177, 132-149
47. Harper, E. T., and Rose, G. D.
(1993) Biochemistry 32, 7605-9
48. Seale, J. W., Srinivasan, R., and Rose, G. D. (1994) Protein Sci. 3, 1741-5
49. Fink, A. L. (1995) Annu. Rev. Biophys. Biomol. Struct. 24, 495-522

CHAPTER 2
SEQUENTIAL BACKBONE RESONANCE ASSIGNMENTS OF YEAST GUANYLATE KINASE

2.1 Introduction

Guanylate kinase (GK) belongs to the family of nucleoside monophosphate (NMP) kinases, which also includes adenylate kinase (AK), uridylate kinase (UK), cytidylate kinase (CK), and thymidylate kinase (TK). All NMP kinases catalyze the phosphoryl transfer from ATP to an NMP to form nucleoside diphosphates, which are then converted by nucleoside diphosphate kinase to nucleoside triphosphates, the building blocks for DNA and RNA synthesis. Guanylate kinase catalyzes the following reversible reaction: MgATP + GMP ⇌ MgADP + GDP. It plays an essential role in the cGMP cycle and may be involved in guanine nucleotide-mediated signal transduction pathways by regulating the ratio of GTP to GDP (1,2). It is also required for the metabolic activation of the anti-herpes drugs acyclovir and gancyclovir and the anti-HIV agent carbovir (3-5). Thus it is of biomedical significance to study the catalytic mechanism and the nucleotide specificity of this enzyme. The functional significance of GK is also highlighted by the discovery of membrane-associated GK homologues (MAGUK), including the Drosophila discs-large tumor suppressor protein (dlg-A), the protein encoded by the C. elegans vulvaless gene lin-2, the mammalian zonula occludens or tight junction proteins ZO-1 and ZO-2, the erythrocyte membrane protein p55 and several synapse-associated proteins (PSD-95/SAP90, SAP97 and SAP102). However, these GK homologues are unlikely to be enzymatically active. It has been suggested that the GK domains in these proteins may be involved in protein-protein interactions (6). GK activity was first reported by Klenow & Lichtler in 1957 (7).
GK has been purified from several sources, but detailed characterization has been hampered by its low abundance. It was not until 1989 that yeast GK was purified to homogeneity and its amino acid sequence was determined. The yeast GK gene was cloned by Konradi in 1992 (9), followed by cloning of the E. coli GK gene and the bovine GK gene (10). Yeast GK shares 45% identity with E. coli GK and 55% with bovine GK. The crystal structure of a yeast GK complex with GMP was first reported in 1990 and refined at 2 Å resolution in 1992 (11). Human GK has recently been cloned and shares ~50% amino acid identity with yeast GK. The human enzyme is inactive when it is expressed in E. coli or produced by cell-free translation (12). We have been interested in the catalytic mechanism of yeast GK and have completed extensive kinetic and mutagenesis studies in our lab (13-15).

Fig. 2.1 The crystal structure of yeast guanylate kinase in complex with GMP (11).

To date, about 26 crystal structures have been determined for NMP kinases (16). All structures are highly similar, containing three domains termed CORE, LID and NMPbind (Fig. 2.1). A typical five-stranded parallel β-sheet with helices on both sides constitutes the rigid CORE domain. The CORE domain contains a "glycine-rich loop" (P-loop), which forms a giant anion hole and binds ATP. The NMPbind domain forms the NMP binding site. The LID domain, which covers the phosphates at the active site, carries many of the catalytically important residues. Among these NMP kinases, AK has been extensively studied by X-ray crystallography, NMR and site-directed mutagenesis (17). By comparison of different forms of homologous AKs (apo-form AK1, the AMP complex of AK3 and the AP5A (P1,P5-(5′-diadenosyl)-pentaphosphate) complex of AKe), the substrate-induced conformational changes have been established in a gradual manner (Fig. 2.2) (18,19): Binding of AMP induces movement of the AMP binding domain, while binding of ATP causes closure of the LID domain.
Binding of the second substrate causes further closure of both domains. The formation of a "ternary" complex with AP5A results in the closure of both the LID and NMPbind domains. These domain movements have been summarized in a movie that represents an interpolation of different structures of NMP kinases (20). It has been suggested that these substrate-induced domain movements are important for suppressing hydrolytic activity of the enzyme and for stabilizing the transition state. However, no detailed description has been possible because no NMP kinase structure has been determined in all forms.

Fig. 2.2 Domain movements correlated with substrate binding to adenylate kinase (18,19). (a) Model of AK1 without bound substrates. (b) Model of AK3 with bound AMP. (c) Model of AKy mutant (D89V, R165I) with an ATP analogue (AMPPCF2P). (d) Model of AKe with bound AP5A. In all depicted models, the CORE, LID and NMPbind domains are shown in cyan, green and yellow, respectively. All the substrates (AMP, AMPPCF2P and AP5A) are shown in red.

It has been widely accepted that the dynamics of enzymes play an important role in catalysis. However, a direct correlation between dynamics and catalysis has not been clearly demonstrated. From studies of several different enzymes, it has been found that the dynamic properties of binding sites are important for substrate binding. In the case of AK, it has been suggested that the flexibility of the P-loop, which mainly binds the phosphate chain of ATP, is required for efficient substrate binding by allowing different isomers of ATP to convert to a productive isomer (21,22). Assuming that B-factors reflect the relative mobility, it has been found that the LID and NMPbind domains in AK are mobile in the free form and the rest of the enzyme is relatively well fixed. Upon binding of AP5A, these two domains become immobilized and the two loops between α4-β3 and α5-β4 in the CORE domain become mobilized. The mobility of these two loops is proposed to be
an "energetic counterweight" that keeps the ternary complex from dropping into an energy well (23). This is an interesting and important model because it correlates the dynamics directly with the catalytic mechanism. However, the observed B-factor distributions of other NMP kinases show significant differences (24,25).

The catalytic mechanism and the structural basis of the nucleotide specificity of GK are still largely unknown. Work in our lab has shown that GK catalyzes the phosphoryl transfer via a sequential mechanism and that the chemical step is the major rate-limiting step (13). GK has the highest specificity at the NMP binding site among the NMP kinases. The CORE domain and the putative ATP binding domain of GK are similar to those of AK (Fig. 2.1). However, the GMP binding domain of GK and the AMP binding domain of AK are quite different. While the GMP binding domain consists of a mixed β-sheet and a short helix, the AMP binding domain is completely α-helical. GK has not been cocrystallized with ATP, because the ATP binding site is partly covered by a crystal contact. The ATP binding site was tentatively assigned on the basis of the structural homology to AK and GTP-binding proteins (EF-Tu and H-ras-p21). However, there are two problems with the proposed ATP binding model: (1) The distance (6 Å) between the γ-phosphate of ATP and the nearest oxygen of GMP is too long for a nucleophilic attack. (2) The γ-phosphate of ATP is so exposed to the solvent that hydrolysis cannot be avoided.

Because of its high specificity at both the ATP and GMP binding sites and its favorable properties for NMR study (soluble, stable and monomeric at ~20 kDa), yeast GK is an excellent model enzyme for studying substrate specificity and catalytic mechanism.
Besides the development of multidimensional multinuclear NMR spectroscopy, which has greatly facilitated the structural studies of proteins, NMR relaxation experiments have been applied successfully to elucidate the dynamic properties of proteins in solution (26). We aim to study the structure and dynamics of GK in solution by NMR spectroscopy and to evaluate how the conformational and dynamic changes are correlated with the catalytic mechanism. This thesis presents the multinuclear multidimensional NMR experiments and the sequential backbone assignments of the GK complex with GMP. The results provide the basis for further NMR studies of the structure-function relationship of this enzyme.

2.2 Materials and Methods

Protein expression and purification. The GK gene from a yeast genomic library was amplified by PCR and then cloned into the expression vector pET17b; the resulting construct is designated pET-YGK (13). The unlabeled protein samples were expressed in the BL21 (DE3) E. coli strain containing pET-YGK in LB medium without IPTG induction. For samples labeled uniformly with 15N or 15N/13C, 15NH4Cl and 13C-glucose were substituted for their unlabeled counterparts in a variation of M9 minimal medium with IPTG induction. Proteins selectively labeled with a specific 15N amino acid were expressed from the E. coli strain DL49PS in a medium supplemented with the appropriate unlabeled amino acids (27). 15N-labeled amino acids were substituted for their unlabeled counterparts and the expression was induced with IPTG. Proteins were expressed and purified according to the protocols in the Appendices.

NMR sample preparation and experiments. Protein samples of approximately 0.6 ml for NMR experiments were prepared in 20 mM perdeuterated Tris-HCl buffer and 100 mM KCl in 90% H2O/10% D2O solution at pH 7.5. The final protein concentration was ~2.0 mM for all samples, except for the selectively labeled samples, which were ~1.0 mM. The GMP complex samples contained a 5-fold excess of GMP.
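The sample composition above translates into absolute amounts by simple arithmetic, sketched below. The sketch assumes that "5-fold excess" means a 5:1 molar ratio of GMP to protein and uses the approximate molecular mass of ~20 kDa quoted in the Introduction; the function and variable names are illustrative only.

```python
def micromoles(concentration_mM, volume_ml):
    """Amount of solute in micromoles (mM x ml = umol)."""
    return concentration_mM * volume_ml

SAMPLE_VOLUME_ML = 0.6    # sample volume given in the text
PROTEIN_CONC_MM = 2.0     # uniformly labeled samples
GMP_MOLAR_RATIO = 5       # assumed reading of "5-fold excess"
PROTEIN_MASS_KDA = 20.0   # approximate mass; 1 umol of a 20 kDa protein weighs 20 mg

protein_umol = micromoles(PROTEIN_CONC_MM, SAMPLE_VOLUME_ML)
protein_mg = protein_umol * PROTEIN_MASS_KDA
gmp_conc_mM = GMP_MOLAR_RATIO * PROTEIN_CONC_MM
gmp_umol = micromoles(gmp_conc_mM, SAMPLE_VOLUME_ML)

print(f"protein: {protein_umol:.1f} umol (about {protein_mg:.0f} mg)")
print(f"GMP: {gmp_conc_mM:.0f} mM, {gmp_umol:.1f} umol")
```

Under these assumptions a single uniformly labeled sample requires roughly 24 mg of protein, while the selectively labeled samples at ~1.0 mM require about half of that.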
GMP titration monitored by 2D HSQC experiments demonstrated that a 5-fold excess of GMP is sufficient to saturate the enzyme. NMR experiments were conducted at 22 °C on a Varian Inova 600 MHz spectrometer. All the NMR data were acquired in the States-TPPI mode. Heteronuclear double and triple resonance spectra acquired for both free and GMP-bound forms of GK included 2D 1H-15N HSQC (28,29), 3D 1H-15N TOCSY-HSQC (30), 3D 1H-15N NOESY-HSQC (30,31), HNCO (32), CBCA(CO)NH (33,34), and HCCH-TOCSY (35,36). The 2D HSQC spectra for specifically labeled proteins (including 15N-Gly, 15N-Leu, 15N-Ile, 15N-Lys, 15N-Phe and 15N-Val) were collected for both free and GMP-bound forms of GK. A 3D HNCA (37) spectrum was collected for free GK. 3D HNCACB (33,38) and (HB)CBCACO(CA)HA (39) spectra were collected for the GK complex with GMP. The acquisition sweep widths and numbers of complex points for these experiments were as follows: 2D 1H-15N HSQC, 15N(F1) 3600 Hz, 256, 1H(F2) 8000 Hz, 102; 15N-edited NOESY-HSQC (150 ms mixing time) and 15N-edited TOCSY-HSQC (48.6 ms mixing time), 1H(F1) 7200 Hz, 128, 15N(F2) 2200 Hz, 32, 1H(F3) 8000 Hz, 1024; 3D HNCACB, 15N(F1) 2200 Hz, 32, 13C(F2) 11000 Hz, 96, 1H(F3) 8000 Hz, 1024; 3D CBCA(CO)NH, 15N(F1) 2200 Hz, 32, 13C(F2) 11000 Hz, 64, 1H(F3) 8000 Hz, 1024; 3D HCCH-TOCSY (23.4 ms mixing time), 1H(F1) 7200 Hz, 128, 13C(F2) 12070 Hz, 64, 1H(F3) 8000 Hz, 1024; 3D HNCA, 15N(F1) 2200 Hz, 32, 13C(F2) 4900 Hz, 64, 1H(F3) 8000 Hz, 1024; 3D HNCO, 15N(F1) 2200 Hz, 32, 13C(F2) 2200 Hz, 32, 1H(F3) 8000 Hz, 1024; 3D (HB)CBCACO(CA)HA, 13C(F1) 12001 Hz, 86, 13CO(F2) 3000 Hz, 45, 1H(F3) 8000 Hz, 1024. The spectra were processed with the program NMRPipe (40) and analyzed with the program NMRView (41). Briefly, solvent suppression was improved by convolution of the time domain data (42). The data size in each indirectly detected dimension of the 3D data was extended by backward-forward linear prediction (43).
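The sweep widths and point counts listed above determine the acquisition time and the raw digital resolution in each dimension, which helps explain why the indirectly detected dimensions were extended by linear prediction. The following is a minimal sketch using the values quoted for the 3D HNCACB experiment; it assumes simultaneous complex sampling with a dwell time of 1/(sweep width), and the function names are illustrative rather than part of any processing package.

```python
# (dimension label, sweep width in Hz, number of complex points) for 3D HNCACB
HNCACB_DIMENSIONS = [
    ("15N (F1)", 2200.0, 32),
    ("13C (F2)", 11000.0, 96),
    ("1H (F3)", 8000.0, 1024),
]

def acquisition_time_ms(sweep_width_hz, complex_points):
    """Approximate acquisition time: number of points times the dwell time."""
    return 1000.0 * complex_points / sweep_width_hz

def digital_resolution_hz(sweep_width_hz, complex_points):
    """Hz per point before any linear prediction or zero-filling."""
    return sweep_width_hz / complex_points

for label, sweep_width, points in HNCACB_DIMENSIONS:
    t_max = acquisition_time_ms(sweep_width, points)
    resolution = digital_resolution_hz(sweep_width, points)
    print(f"{label}: t_max = {t_max:.1f} ms, {resolution:.1f} Hz/point")
```

On these assumptions the indirect 15N and 13C dimensions are sampled for only about 15 ms and 9 ms, compared with 128 ms in the directly detected 1H dimension, so the indirect interferograms are strongly truncated unless they are extended.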
A 45°-shifted sine bell and single zero-filling were generally applied before Fourier transformation in each dimension.

2.3 Results

The general strategy for the sequential assignments of the GK-GMP complex was to link all the spin systems sequentially by HNCACB/CBCA(CO)NH experiments via Cα and Cβ chemical shifts, HNCO/(HB)CBCACO(CA)HA experiments via CO chemical shifts, and 3D 15N-edited NOESY-HSQC via sequential NOEs. The assignment procedure was carried out using the program NMRView (41). According to the 1H-15N cross peaks in the HSQC spectrum, strips along the carbon dimension were extracted from all 15N-edited 3D NMR spectra. Spin systems for some residues with typical chemical shifts, such as Ala, Ser, Thr and Gly, could be identified. Six GK samples, each labeled with one type of 15N-amino acid (Gly, Ile, Leu, Lys, Phe or Val), were used to aid spin system identification (44). These spin systems were used as the starting points in the subsequent sequential assignments. The HNCACB and CBCA(CO)NH spectra were first analyzed to obtain sequential connectivities. Most of the assignments were made from these two spectra. Figure 2.3 shows the sequential connectivities from Phe181 to Lys186. The HNCO and (HB)CBCACO(CA)HA spectra provided additional independent links through C′ resonances, which could be assigned directly from the 3D HNCO spectrum. The relatively higher sensitivity of these two experiments made it possible not only to confirm all linkages established from the HNCACB/CBCA(CO)NH analysis, but also to obtain additional linkages.

Fig. 2.3 Strip plots of the 3D HNCACB (A) and 3D CBCA(CO)NH (B) spectra of yeast GK in complex with GMP, showing the sequential J connectivities of 13C nuclei for residues F181 to K186.

When the triple-resonance spectra were incomplete because of a lack of Cα and Cβ chemical shifts due to the low sensitivity of the HNCACB experiment, the sequential